The Wartberg culture paper (Immel et al. 2019) - Y-chromosomes



artemv
12-24-2019, 12:20 AM
Neolithic genomes reveal a distinct ancient HLA allele pool and population transformation in Europe

URL to preprint:
https://www.biorxiv.org/content/10.1101/851188v1.full

URL to main thread:
https://anthrogenica.com/showthread.php?19256-Neolithic-genomes-reveal-a-distinct-ancient-HLA-allele-pool-(Immel-et-al-2019)

I guess it is more convenient to post the Y-chromosome results here and discuss them in the main thread. The preprint is out without any supplementary information, but the *.bam files are already published. Some of them are very large (more than 20 GB!), so I started the Y-chromosome check with the small files, because processing the large ones really takes time (and since we have no supplementary data at all, processing some files will only show that the sample was a woman).

As a reminder, all the samples are from the same collective grave, which was used by the same group of people for about 100 years.






file_name            my_result
KH150208_KH150210    woman
KH150287             pre-I-S9234
KH150289             I-M423*
KH150290             woman
KH150417             (pre?)I-S9234
KH150418             pre-I-S9234
KH150419             woman
KH150422             I-M423*
KH150618             pre-I-S9234
KH150621             (pre?)I-S9234



KH150287
He is definitely I-Y5334:
FGC18615+ (1 read)
S22470+ (1 read)
FGC18575+ (1 read)

He has one of the I-S9234 synonym mutations derived; the others are ancestral:
Y1049.2- (1 read)
F313- (1 read)
FGC18575+ (1 read)

KH150289
He is positive for all the M423 synonyms that have reads, and negative for all known downstream branches:
AM0129+ (1 read)
Z2594+ (1 read)
FGC8118+ (1 read)
AM01260+ (1 read)

Negative downstreams:
L161.1- (1 read)
S10302- (1 read) -> this is a synonym of L621
Y80035- (1 read) -> one more M423 branch (Y24694 on YFull), not yet present in ISOGG

KH150417
FGC18549+ (1 read)
L1251.1+ (1 read)

KH150418
FGC18544+ (2 reads)
FGC18648- (1 read)

KH150422
Likely one more I-M423*. This one has no calls for the L161.1 synonyms (and ancestral results for the other known M423 branches), but I guess this is the same branch as KH150289.

M423 synonyms:
AM01285+ (1 read)
AM01293+ (1 read)
AM01260+ (2 reads)

L621 synonyms:
FGC30451- (1 read)
F3145- (1 read)
S10980- (1 read)
S10302- (1 read)
S13737- (2 reads)

Y24694 synonyms:
Y81096- (1 read)
Y129078- (1 read)

KH150618
Of the S9234 synonyms, he has just 1 ancestral SNP and 6 derived.
Interesting result for FGC18624.
FGC18584+ (1 read)
FGC18616+ (1 read)
L1251.1+ (2 reads)
FGC18554+ (1 read)
S22054+ (2 reads)
FGC18558+ (3 reads)
FGC18608- (1 read)

FGC18624 G (2 reads) - this is a T->C SNP, but here we get G

KH150621
All 3 of the S9234 synonyms with calls are derived. But we know the S9234 branch is too young for a 5000-year-old sample, so this must be some pre-S9234 branch.
FGC18584+ (2 reads)
S26491+ (2 reads)
S25233+ (2 reads)

FGC18624 G (2 reads) - this is a T->C SNP, but here we get G again.
That's interesting: do we have the same error in both the KH150621 and KH150618 samples, or is that a new SNP?
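
The FGC18624 puzzle above comes down to comparing the base seen in the reads with the listed SNP definition. A minimal sketch in Python; the function name classify_call is hypothetical and is not part of any tool used in this thread:

```python
def classify_call(ancestral: str, derived: str, observed: str) -> str:
    """Label an observed base relative to a SNP definition.

    Returns "derived", "ancestral", or "other" (a third base that
    matches neither side of the listed mutation).
    """
    observed = observed.upper()
    if observed == derived.upper():
        return "derived"
    if observed == ancestral.upper():
        return "ancestral"
    return "other"


# FGC18624 as listed (T->C) versus the G seen in the reads
print(classify_call("T", "C", "G"))
```

A result of "other" in two related samples is a hint that either the listed definition is wrong or the samples share a different mutation at that position.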

Romilius
12-24-2019, 08:25 AM
Sample KH150204_KH150634 has many calls at the I level. It seems to be I2.

artemv
12-24-2019, 08:52 AM
KH150610 is a woman.

Romilius
12-24-2019, 07:36 PM
Sample KH150641 seems to be I of some sort. Probably I2c (it has some calls at the I2c level).

artemv
12-24-2019, 08:43 PM
file_name                     my_result
KH150610                      woman
KH150625                      woman
KH150626                      woman
KH150628                      pre-I-S9234
KH150639                      pre-I-S9234
KH150191                      I-S6635 (pre-I-S9234?)
KH180044                      woman
KH150193_KH150286             woman
KH150195_KH150196_KH150616    woman
KH150197                      (pre?)-I-S9234
KH150198                      (pre?)-I-S9234
KH150200                      woman
KH150203                      woman
KH150640                      woman



KH150628
FGC18616+ (1 read)
FGC18544+ (2 reads)
FGC18555- (2 reads)
FGC18648- (1 read)
Y7859- (1 read)

KH150639
FGC18584+ (2 reads)
FGC18650+ (2 reads)
FGC18606+ (1 read)
FGC18588- (1 read)
S25233+ (2 reads)

KH150191
This is a low-coverage sample, just 17Mb.
PF6915+ (1 read)
Although he is positive for only one of the S6635 synonyms, there are no downstream negatives, so he is likely pre-I-S9234, like most of the other men in this grave.

KH150197
Upstream positives:
PF3893+
L596+
S6729+
PF3892+

S9234 synonyms:
L1251.1+ (1 read) -> This is a C->T SNP
We only get one S9234 synonym, and that synonym is a C->T SNP (the kind of change that deamination in ancient DNA can mimic). But given the age of S9234, this cannot be full S9234; it must be some pre-I-S9234.

KH150198
One more low coverage sample.
Upstream positives:
L596+ (1 read)
FGC18548+ (2 reads)

S9234 synonyms:
FGC18609+ (1 read) -> This is a C->T SNP
Again just 1 synonym, and again a C->T SNP. But as we have no negatives, we can assume this is likely a pre-I-S9234 sample.

Now I only have the "heavyweight" files left; it will take some time to process them.

artemv
12-25-2019, 07:10 PM
2 more results for high-quality samples today:



file_name    my_result
KH150641     pre-I-S9234
KH150633     pre-I-S9234




Sample KH150641 seems to be I of some sort. Probably I2c (it has some calls at the I2c level).
Yes, I guess this haplogroup was called I2c in some earlier version of the ISOGG tree; in the current ISOGG tree it is called I2a2.

Checking these samples, I found a mistake in the ISOGG data.
FGC18600(Y7172) is a C->G mutation
FGC18624(Y7182) is a T->G mutation


KH150641
Here we have 21 derived SNPs, 10 ancestral SNPs


Derived:
FGC18584+ 1 read
FGC18616+ 3
S16199+ 1
FGC18650+ 2
S22470+ 1
FGC18544+ 1
FGC18549+ 1
FGC18601+ 2
FGC18651+ 1
S17221+ 1
L1251.1+ 4
FGC18609+ 2
S22054+ 3
S9234+ 1
S22170+ 4
FGC18607+ 1
FGC18558+ 5
BY2776+ 7
S25233+ 1
FGC18600+ 2
FGC18624+ 1


Ancestral:
FGC18653- 2 reads
S18042- 3
S18299- 3
FGC18551- 4
FGC18648- 1
S11571- 4
FGC18588- 2
FGC18608- 1
FGC18559- 2 (2 reads for the ancestral G and one read for a third base, T; that single T is at the edge of a read, so I consider the sample FGC18559-)
Y7859- 3



KH150633
Here we have 19 derived SNPs and 15 ancestral SNPs.

Derived:
BY435+ 1 read
FGC18584+ 3
FGC18616+ 1
FGC18544+ 2
FGC18601+ 2
FGC18603+ 3
S10141+ 2
L1251.1+ 3
FGC18554+ 2
FGC18575+ 1
FGC18609+ 2
S21750+ 1
S22054+ 2
FGC18618+ 1
FGC18607+ 1
FGC18558+ 1
FGC18547+ 1
FGC18600+ 5


Ancestral:
FGC18653- 1 read
S18042- 1
S18299- 1
FGC18555-
FGC18551- 1
FGC18553- 3
FGC18643- 1
Y7878- 1
FGC18585- 1
BY1802- 1
S16446- 2
FGC18588- 1
FGC18608- 1
FGC18559- 2
Y7859- 1

artemv
12-26-2019, 05:01 PM
file_name                     my_result
KH150189_KH150632_KH150636    pre-I-S9234
KH150613_KH180043             pre-I-S9234



No surprises here. I have 12 heavy files left, all of them 10 GB+, and little hope of finding anything new.

KH150189_KH150632_KH150636
Total 10 ancestral and 29 derived.

Derived:
BY435+ 3
FGC18578+ 1
FGC18584+ 4
FGC18616+ 3 (this one has 3 C reads and one T read; the T is likely due to deamination, so I consider it C)
FGC18624+ 4
S16199+ 1
S26491+ 1
FGC18650+ 2
S22470+ 4
FGC18544+ 1
FGC18549+ 3
FGC18601+ 4
FGC18603+ 1
FGC18651+ 1
S10141+ 2
S17221+ 1
L1251.1+ 2
FGC18554+ 2
FGC18609+ 2
S21750+ 2
S22054+ 1
FGC18610+ 2
FGC18618+ 2
S9234+ 1
S22170+ 2
FGC18607+ 4
FGC18558+ 4
BY2776+
FGC18547+

Ancestral:
FGC18653- 8 reads
S18299- 8
FGC18555- 2
FGC18551- 2
FGC18553- 1
S11571- 5
Y7878- 4
FGC18585- 1
S16446- 2
FGC18588- 3
FGC18608- 3
FGC18559- 3
Y7859- 5



KH150613_KH180043
Total 15 ancestral and 23 derived.

Derived:
BY435+ 1
FGC18584+ 2
FGC18616+ 1
S16199+ 1
S22470+ 2
FGC18544+ 4
FGC18549+ 3
FGC18601+ 1
L1251.1+ 4
FGC18554+ 3
FGC18600+ 4
FGC18606+ 2
S22054+ 3
BY1802+? 1 (this is a problematic one: it is an A->T SNP, and we have 1 read for T and 1 for C)
FGC18610+ 1
FGC18618+ 2
S9234+ 2
FGC18607+ 2
FGC18558+ 10
BY2776+ 3 (another case of contradicting reads: 3 reads for A and 1 read for T, so I consider this to be A)
Y11278+ 1
S25233+ 2
FGC18628+ 1

Ancestral:
Y7839- 1 read
FGC18653- 3
S18042- 4
S18299- 5
FGC18555- 3
FGC18551- 4
FGC18648- 4
S11571- 3
Y7878- 2
FGC18585- 1
S16446- 1
FGC18588- 11
FGC18608- 3
FGC18559-
Y7859-
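
The contradicting-read cases in this post (3 C against 1 T, 3 A against 1 T, and 1 T against 1 C) were all settled by eye. A minimal sketch of the same majority rule in Python; the function name majority_base is hypothetical, and a real caller would also weigh base quality and read-end position, which this sketch ignores:

```python
from collections import Counter
from typing import Optional


def majority_base(reads: str) -> Optional[str]:
    """Return the most frequent base among the reads, or None on a tie.

    `reads` is a string with one letter per read covering the position,
    e.g. "AAAT" for three A reads and one T read.
    """
    counts = Counter(reads.upper()).most_common()
    if not counts:
        return None  # no reads at this position
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # contradicting reads with no majority
    return counts[0][0]


print(majority_base("AAAT"))  # BY2776 in KH150613_KH180043
print(majority_base("TC"))    # BY1802: one read each, no majority
```

Returning None on a tie keeps an ambiguous position such as BY1802 out of both the derived and the ancestral totals instead of forcing a call.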

artemv
12-28-2019, 07:01 PM
file_name    my_result
KH150629     pre-I-S9234
KH150637     pre-I-S9234



KH150629
Total 30 derived and 15 ancestral.

Derived:
BY435+ 4 reads
FGC18578+ 4
FGC18584+ 4
FGC18616+ 3
FGC18624+ 6
S16199+ 2
FGC18650+ 2
S22470+ 6
FGC18544+ 2
FGC18549+ 2
FGC18601+ 5
FGC18603+ 2
S10141+ 2
S17221+ 4
L1251.1+ 2
FGC18554+ 3
FGC18575+ 3
FGC18600+ 8
FGC18609+ 2
S21750+ 1
S22054+ 2
FGC18618+ 4
S9234+ 3 (this SNP has 3 reads for the derived G and 1 read for the ancestral A, so I consider it derived)
S22170+ 5
FGC18607+ 2
FGC18558+ 9
BY2776+ 3
Y11278+ 2
S25233+ 2
FGC18628+ 4

Ancestral:
Y7839- 1
FGC18653- 3
S18299- 1
FGC18555- 3
FGC18551- 4
FGC18553- 2
FGC18643- 1
FGC18648- 1
Y7878- 1
FGC18585- 2
S16446- 2
FGC18588- 2
FGC18608- 4
FGC18559- 3
Y7859- 4



KH150637
Total 27 derived and 15 ancestral

Derived:
BY435+ 1 read
FGC18578+ 1
FGC18584+ 3
FGC18616+ 3
S16199+ 1
S26491+ 3
FGC18650+ 1
FGC18544+ 2
FGC18549+ 2
FGC18601+ 2
FGC18651+ 1
S10141+ 1
S17221+ 1
L1251.1+ 5
FGC18554+ 2
FGC18600+ 3
FGC18609+ 3
S21750+ 3
S22054+ 2
BY1802+ 1
FGC18610+ 3
S9234+ 1
S22170+ 3
FGC18558+ 4
BY2776+ 4
S25233+ 1
FGC18628+ 1

Ancestral:
FGC18653- 1
S18042- 4
S18299- 3
FGC18555- 2
FGC18551- 3
FGC18553- 2
FGC18643- 1
FGC18648- 1
S11571- 3
Y7878- 3
FGC18585- 2
S16446- 4
FGC18588- 5
FGC18608- 2
Y7859- 5

Romilius
01-15-2020, 04:39 PM
I'm trying to download and process the remaining BAM files.

There is a problem: KH150612 can't be downloaded... so I'm starting with KH150614_KH150615... I think it will take a long time!

So far there has been no answer from the authors to my e-mail about the supplementary tables.

artemv
01-15-2020, 05:43 PM
I've downloaded KH150612; it worked fine. I will post the results tomorrow, although I do not expect anything new.

Romilius
01-16-2020, 02:10 PM
KH180045 seems to be a woman.

artemv
01-16-2020, 02:32 PM
Ok, there is something new, I mean a new branch of I2.
KH150612 is I-S23680* (under I-M223).
https://www.yfull.com/tree/I-S23680/
S12377+ (6 reads)
S23467+ (3 reads)
F1696+ (2 reads)

And he has important downstream negatives, as he is negative for the I-PF692 synonyms:
L1228-
S20108-
S20581-
YP4753-
Y18439-
S19522-
S19682-
Y31668-
M9771-
S10631-
S17535-
Y18948-
YP4522-
S24785-

On YFull, PF692 is currently the only haplogroup under S23680, apart from a single I-S23680 person (from California, US) who is not PF692.
So this Wartberg individual represents either some very rare branch or even a 2600-year-old dead-end branch.
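
The I-S23680* call above follows a simple rule: positive for the branch, negative for every known downstream branch that has reads. A minimal sketch of that rule in Python; the function name label_sample and the dictionary layout (SNP name mapped to True for "+" and False for "-") are assumptions made for illustration:

```python
from typing import Dict


def label_sample(branch: str,
                 branch_calls: Dict[str, bool],
                 downstream_calls: Dict[str, bool]) -> str:
    """Apply the 'positive here, negative downstream' rule.

    True means a derived (+) call, False an ancestral (-) call.
    SNPs without reads are simply left out of the dictionaries.
    """
    if not any(branch_calls.values()):
        return "no support for " + branch
    if any(downstream_calls.values()):
        return branch + " (downstream positive, check subclades)"
    if downstream_calls:
        return branch + "*"  # every downstream SNP with reads is ancestral
    return branch  # no downstream reads, so a star cannot be claimed


# A few of the KH150612 calls listed above
print(label_sample(
    "I-S23680",
    {"S12377": True, "S23467": True, "F1696": True},
    {"L1228": False, "S20108": False, "S20581": False},
))
```

The last branch of the function matters for low-coverage files: without any downstream reads, the star would be a guess.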

Romilius
01-16-2020, 09:02 PM
KH150635 is strange: a very big BAM file and very few calls on the Y chromosome. I think it's a woman, but I would like your opinion.

Here are the SNPs:
Z15137-, PF3529-, PF6026-, PF6811-, Y1192-, Z2907+, S27554+, PF2640-, S3052-, S3053-, ZS69-, S3054-, S7065-, Z5004-, F3763-, S12897-, Z734-, Z12857-, Z3238-, L558-, PF7408-, PR1234+, PR1235-, PR1236-, PR1237-, PR1238+, PR1239-, L540-, PR1240-, PR1241-, PR1242-, L150.2!-, PF6274.2!-, S351.2!-, L150.1+, PF6274.1+, S351.1+, PR1243-, PR1244-, PR1251+, PR1252-, PR1253+, PR1254-, PR1255-, PF420-, F4159-, Z132-, S12909+, S12910+, S27714-, PF421-, PF1381-, PF1382-, PF1383-, PF422+, Z2927+, S12911-, PF6813+, PF423+, S12913-, Z11984-, S12914-, Z11985-, PF424+, Z16021-, Z5342-, Z5343-, PF425-, PF426+, PF427-, PF428+, PF429+, PF430+, S12920-, PF431+, PF1385-, PF432-, PF433+, S12922-, S12924-, S12925-, S12926-, S12931-, S12945-, Z5344-, S12957-, S12958+, S12959-, S12960-, K160-, PF6729+, PF6730-, PF6731+, PF6740+, S12964-, S12965-, Z15138-, PF6814+, PF6815-, S12966-, PF6816-, PF6817-, PF434+, S12977-, S12978-, S12979-, S12980-, S12982-, S12983+, S12984+, S12985-, S12986-, PF436+, PF437+, PF438+, PF439+, K161-, PF3332-, YSC0000100+, YSC0000101-, S13018-, Z11986-, S13028-, S13030-, S13031-, S13039+, S13045+, S13046-, S13047-, Z5345-, S13052-, S13053-, S13054-, PF440-, PF441-, S27117-, S13058-, S13062+, S13063+, S13065-, S13075+, S13076-, S13077-, S13078-, S13079+, S13088-, S13105-, S13106-, S13108-, S13109+, S13122-, S13123-, S13124-, S13128-, PF7407-, FGC3271-, S13155-, Z7815-, Z5346-, Z15140-, PF442-, PF443+, S3055-, PF7485-, S7158-, PF1386-, PF444+, Z15480-, Z2467-, PF445-, K162-, PF3200+, PF3201+, S13186-, PF1387-, S13187-, Z17708-, Z15817-, PF1388-, Y1193.2+, S27600-, S13195+, S13201-, Z15206-, Y1194-, PF4679-, PF4680+, PF1389-, PF5483-, PF1390-, Z2166-, PF6819+, S3590-, S5956-, Z7237-, Z12858-, PF448+, Z14831-, PF449+, PF450+, PF4525-, PF6734-, PF6741+, PF2120-, Z15141-, S13334+, Z2302+, S13338-, Z5347-, PF7486-, PF1391-, PF451+, FGC599-, Z6655-, Z16963-, Z15089-, PF452+, PF453+, PF454+, Z14326-, Z5348-, PF459-, Z14979-, Z15912-, ZS70-, Z14033-, S13508+, PF5611-, S13521-, 
PF463-, S13527-, Z19220-, S13540-, S13574-, Z17036-, Y1196-, S13599-, S13613-, S13614-, S13617-, Y198-, Z13373-, PF465-, YSC0000107+, PF4527-, Z160-, Z14034-, S13732-, K165-, S8157-, ZS144-, Z15295-, PF471-, PF472+, PF475+, PF476+, PF478+, PF479+, PF480+, PF481+, YSC0000623-, PF482+, PF484+, PF485+, ZS208-, ZS315-, PF486+, Z1909-, PF488+, Z28836-, PF1599-, PF489-, Z4007-, PF490-, PF491-, ZS253+, PF2261-, PF493-, S5489-, PF494+, Z1918-, YSC0000111+, ZS162-, PF3370-, PF495-, PF3989-, PF496-, PF501-, PF503-, S13761-, Z137-, PR1256+, PR1260-, PR1261-, YSC0000247-, PR1262-, Z2745-, Z13901-, Z2279-, PF1603-, Z4008-, Z2000-, Z13058-, S7369-, Y621-, PF509+, Z14038-, Z15668-, Z5357-, Z14409-, PF515+, Z3246-, PF516-, Z13708-, S26889-, PF517+, Y753-, Z6323-, PF519+, Z15144-, Y1197-, PF6737+, PF6738+, PF524-, S13782-, Z6040-, Z14521-, Z13374-, Z5360-, PF525-, PF526+, PF528+, PF529-, S3169-, S534-, Z7817-, Z14523-, PF3262-, Z14578-, S26890-, S26891-, PF539+, Z6324-, Z16984-, PF541+, PF1392-, PF3682+, ZS316-, PF1393-, PF546-, PF547+, Z2304-, YSC0000122-, YSC0000123-, S13785-, YSC0000006-, PF548+, ZS211+, S13801-, S13803-, S13809-, S13824-, S13825-, S13830-, S13831-, Z6310-, Z5363-, S13870-, S13875-, S13876-, Z3248-, YSC0000124-, S13887-, S13888-, S13910-, S13911-, S13912-, Z12446-, S26892-, S26893+, Y1198-, S13947-, S13950-, S13952-, S13956-, S13957-, S13965-, S13966-, S13968-, S13969-, Z27268-, S13970-, S13971-, F1662-, S26895-, S13972-, S13973-, Z15670-, S7809-, Z12714-, Z5364-, S13999+, S14003-, S14009-, S14010-, S14045-, S14053-, S14062-, S14064-, S14065-, S14067-, S14068-, S14086-, PF551-, BY27-, PF552+, PF6820-, YSC0000125-, YSC0000126+, YSC0000127-, PF553-, YSC0000128-, YSC0128-, YSC0000129-, YSC0000130-, S14095-, Z11997+, S14124-, S14125-, S14130-, S26896-, Z14731-, S14142-, S14145-, S14147-, S14152-, S14163-, S14167-, S14169-, PF1394-, PF1395-, Z2568-, FGC6793-, Z15145-, YSC0000131-, Z7818-, Z14732-, PF554+, S14170-, PF6821+, S14199-, PF555-, S14201-, Z15297-, PF557-, 
PF558-, PF559-, PF1397-, Z14043-, PF562+, Z15146-, PF4281+, PF565+, K169-, PF3684-, K171-, PF1948-, PF571+, PF572+, PF1398-, Z16964-, Z14921-, S26898-, PF576-, Z12153-, PF582+, PF583+, PF587+, PF589+, PF590+, Z16051-, PF7622-, Z12535-, PF593+, PF594+, PF596+, PF4356-, PF597-, PF4903-, PF598+, PF599+, PF7487+, Y1201-, Y1270-, FGC8092-, Z5378-, PF3203-, Z12360-, Z14582-, Z14048+, S14328-, PF602-, YSC0000132+, YSC0000133-, Z17714+, S14338-, Z13520-, FGC7545-, Z6561-, S14346-, Z14526-, L777-, YSC0000248-, S1435-, Z3258-, PF4403-, PF2334-, Z12869-, Z3258-, S14380-, S14381-, S14382-, S14385-, Z12001+, Z15492-, Z15671-, S14393-, Z5941-, S14406-, Z5381-, Z28839-, Z3260-, BY62-, PF1616-, Z4212-, Z4892-, Z14583-, Z1258-, Z4893-, Z12773-, S14445-, Z12871-, PF4685-, Z2354-, Z15672-, Z2416-, PF3296-, PF606-, PF3692-, S5092-, PF615-, Z2006-, Z12003-, S3511-, Z5383-, YSC0000137-, YSC0000138-, YSC0000139-, YSC0000140-, Z6294-, Z5385-, Z4568-, PF620+, S14524-, S14526-, Z16985-, Y754-, S14534-, Z4569-, Z6325-, PF4686-, Z15494-, PF2919-, K177-, K178-, YSC0000141-, PF621-, Z8018-, Z16828-, Z17432-, S14590-, PF3693-, Z2644-, PF5734-, Y779-, S14609-, Z1639-, Z4570-, Z3268-, S14642-, YSC0000142-, S3410-, Z2121-, Z14332-, S14669+, Z12361-, PF2920-, Z15673-, Z2557-, Z4894-, S6245-, Z28842-, S14739-, S27773-, YSC0000143-, YSC0000145-, YSC0000147-, YSC0000148-, YSC0000149-, YSC0000150-, YSC0000151-, Z14924-, PF624-, K179-, Z14586-, Z15495-, S14779-, S14780-, Z12006+, K526-, Z28843-, S14801-, Z15674-, Z14872-, GG181-, PF627+, PF6299-, S14811+, Z3277-, Z3278-, S14821-, Z4895-, S14827-, Z12716-, S14846-, S14847-, S14848-, S14849-, S14850-, S14851-, PF628+, Z5387-, Z15916-, S14887-, Z2913-, S14897-, S14898-, S14901-, Z13526-, Z4896-, S14923+, S14924-, S14927-, PF6653-, Z1762-, S14938-, S14940-, S14941+, S14953-, S14959-, S14963-, PF2648+, S14988-, Z13527-, YSC0000152-, PF3204-, PF3153-, PF7639-, Z737-, PF6724+, Z14274-, Z15833-, Z3285-, PF631-, S15044-, S15045-, Z14049-, YSC0000153-, 
YSC0000154+, YSC0000155-, YSC0000156-, YSC0000157-, YSC0000158-, Z2044-, PF6445-, YSC0000159-, PF2335-, S15077-, S15078-, S15079-, S15097-, Z13381-, Z4571-, Z2305+, PF7095-, Z2307+, YSC0000160-, Z5390-, YSC0000161-, YSC0000162-, YSC0000163-, Z4897-, Z16102-, PF1401+, Z5391-, PF633-, Z4572-, S15158-, S15162-, S15164-, PF6822-, S15183-, S15184-, S15185-, Z12875-, M9067-, S15216-, CTS1552-, M1203+, PF5893+, S4891+, CTS1556-, M9074+, S7374-, CTS1712-, CTS1713+, PF5542-, CTS1886-, M1334-, CTS1887-, Z14744-, M2458-, CTS2055-, PF3997-, S5334-, Z2106-, M2800-, CTS2294-, M9727-, CTS2316+, S613+, Z3625+, CTS2398-, CTS2612-, M849-, PR1405-, Z6057-, PR5081-, S16042+, CTS2773+, CTS2774+, CTS2775+, CTS2776+, CTS2777+, CTS2784+, CTS2785+, L1325-, CTS2786+, CTS2836+, CTS2904+, CTS2905+, CTS3032+, M7281-, M11267-, CTS3524+, CTS3525+, CTS3527-, M676-, F1901+, S5523-, M1358-, Z6249-, CTS3681-, Z2509-, Z5978-, CTS3694+, CTS3695+, Z6392-, CTS3894+, CTS3895+, CTS3983-, CTS4352+, Z15228-, CTS4498-, M2687-, CTS4753-, CTS4754-, M4539-, CTS4892-, PF1976-, Z1082-, Z167-, CTS4925-, F4194-, PF6979-, CTS5231-, CTS5273-, PF6190-, V161.1+, V161.2+, CTS5500-, CTS5810-, CTS6170-, CTS6259-, M874-, F2283-, M5481-, PF1709-, CTS6555-, CTS6557+, CTS6592-, M4083-, Z16469-, M6700-, Z6245-, M4595-, M3842-, M9201-, CTS6926-, M8807-, CTS7127-, PF4022-, M9996-, M4843-, M4446-, M4844-, M4448-, CTS7206+, CTS7408+, CTS7409+, CTS7410+, CTS7537-, S5286-, S709-, CTS7570-, M2062-, M2929-, Z4271-, CTS7816-, M5238-, PF1731-, Y981-, S20279-, CTS8278+, M1022-, PF3242-, Z2091-, CTS8661-, PR2750-, CTS8745-, CTS8762-, Z881-, CTS8858-, Z15920-, K8-, S21111-, CTS9141-, CTS9254+, CTS9255+, S27351-, CTS9531+, CTS9693-, Z3407-, CTS9789-, M10131-, PF5563-, S5757-, Z378-, S21919-, CTS10150+, M6262+, Z13147-, PF2388-, CTS10367-, CTS10368+, M6766-, M10969-, CTS10404-, M4885-, CTS10608-, S27119-, YSC0000215+, YSC0000216+, S4272+, S4334+, S4779+, S4947+, S3821+, S4138+, S4635+, S4964+, YSC0000217+, S3815+, S4135+, S3175-, S3771-, 
S4126+, S4611+, S4067-, S4183-, S27116-, S4585+, S4636+, S4799+, S4806+, S4983+, S4805+, S22641-, S22642+, Z6105-, PF6501+, Z15374-, PR3258-, PR3259-, PR3260-, PF7241-, Y994-, Z14780-, M11608-, PF7081-, M3402-, Z6110-, M1163-, ZS157-, S27278-, M6016-, M4166-, M10590-, S24646-, M11837+, M10813-, CTS10842-, CTS10865-, CTS10866+, CTS10867+, CTS10932-, CTS11105+, CTS11123-, M6319-, Z14175-, PF3107-, CTS11465+, PR7281-, CTS11616+, CTS11617-, CTS11618+, PF6845-, M11695-, M9383-, M9384-, M9385-, CTS11841-, M7164-, CTS11883-, M8916-, M3809-, M8919-, S3106-, M8490-, M4415-, M11841+, M8518-, M1496-, M6335-, F3521-, M5365-, PF2412-, M809-, Z3541-, M5575-, M4581-, CTS69-, Z4485-, CTS12113+, M3849-, CTS12252+, CTS12253+, CTS12306-, CTS12346-, CTS12363+, S587+, CTS12370-, CTS12371+, CTS12379-, CTS12383+, PF1319+, CTS12384-, CTS12387+, CTS12388+, PF1320+, CTS12405-, CTS12406+, CTS12407+, CTS12413+, CTS12414+, CTS12415+, PF1322+, PF1402-, CTS12427+, PF1324+, CTS12428+, CTS12429+, CTS12435-, CTS12436-, CTS12440+, CTS12441+, S3833+, CTS12442+, PF2778+, CTS12443+, CTS12444+, CTS12447+, CTS12448+, CTS12449+, S4803+, CTS12450+, S4062+, CTS12451-, CTS12452-, CTS12453+, CTS12454+, CTS12455+, CTS12456+, CTS12457+, CTS12458+, CTS12459-, CTS12460-, CTS12461+, CTS12462+, CTS12463-, PF1326+, PF1327+, CTS12518-, CTS12540+, CTS12541+, CTS12542+, PF1328+, PF6345-, CTS12561+, CTS12562+, M3959-, Z12977-, Z3578-, CTS12763+, CTS12764+, CTS12770-, CTS12771-, CTS12791+, PF3283-, CTS12806-, CTS12816+, CTS12916-, CTS12938-, CTS12953+, P397-, CTS12954+, CTS12957+, CTS12987+, CTS12988+, CTS12998+, CTS12999+, PF1336+, CTS13000+, CTS13001-, CTS13018+, S26515-, S26532-, S26535-, Z13957-, Y43-, Z13203-, S5665-, Z12536-, PF1339-, S3117-, S626-, S700-, Z4160-, S26568-, S26569-, Z15270-, S26576-, PF1352-, YSC0000243+, L1164-, YSC0000244+, S27707+, Y2234-, Z16951-, PF5847-, PF5995-, Y883-, Z875-, ZS36+, Z1913+, K344-, Y176-, Y1993-, Z15085-, FGC576-, Z6537-, ZS249+, Z6120-, Z15774-, PF2207-, Z2957-, YSC0001348-, 
PF3596-, M520-, PF6410-, S27565+, PF2812-, PF6776-, Z14463-, PR4363-, PR4364-, PR4365-, PF119-, PF1484-, FGC7649-, Y2365-, PF6993-, Z12825-, PF148-, Z1989+, Y1165+, Y193-, Y194-, K450-, S3135-, Z1912-, S3136-, S7418-, S26581-, S26582+, S26583+, S26584+, S3137-, S26590+, S3138-, S3139-, S3140-, PF1494-, Z4824-, S7810-, Z2032-, M5055-, PF2481-, Z1159-, M9456-, CTS453+, CTS455-, S5062-, CTS533-, CTS537+, CTS614-, M4723-, CTS732-, CTS989+, M8158+, PF3320-, S1822-, M1793-, CTS1085-, PF4950-, F927-, YSC0001381-, S10311-, M7925-, PF3643-, M4117-, K465-, K472-, YSC0000499-, YSC0499-, M9021+, PF288+, F1314-, Z300-, F1315-, M4118-, Z4860-, Z14415-, M3273-, PF2895-, M6113-, M4768-, M2021-, Z4873-, M4771-, M11804+, M4961-, K494-, Z12770+, FGC5673-, PF355+, PF6721-, Y1058-, Z15733-, S12810-, S12811-, S12812-, PF356-, Z5336-, PF357+, Z6306-, PF1375-, PF360+, Z11981-, Z5338-, PF361+, PF7224-, PF362+, PF363+, PF364-, S27187-, PF2905-, PF366+, PF369+, PF370+, Z5339-, PF372-, Z13702-, PF373+, PF2906+, PF374-, PF1376-, PF375+, Z3233+, PF376+, PF378+, PF6723+, PF6726-, PF379+, PF380+, PF1377-, PF381+, PF382+, PF383+, PF384+, PF385+, K158-, PF386+, PF387+, PF388+, S12856+, Z15399-, PF389+, PF390+, S3040+, S3041+, PF391+, PF392+, PF7484-, PF393+, PF394+, PF395+, PF398+, PF399+, F4158-, Z15275-, PF400+, PF4678-, Z2316-, PF6809-, PF401+, K159-, PF402+, Z14273-, PF403+, PF404+, S12869-, Z4208-, Z6307-, PF1594-, S3046+, PF3152+, PF405+, PF406+, PF407+, Z14795-, Z5340-, S3049+, Z2293-, Z12176-, Y755-, PF409-, PF6810-, PF410+, S8132-, PF411+, PF412-, PF6728+, YSC0000084+, Z15345-, YSC0000085+, PF413+, PF1378-, S27190+, Z28834-, FGC1119-, K496-, L1115-, PR1213-, L1213-, PR1214-, PR1215-, PR1216-, PR1217+, L293-, PR1218-, PR1219-, PR1220-, PF5286-, PR1221+, F3910-, YSC0000086-, YSC0086-, YSC0000087-, YSC0000088-, YSC0088-, YSC0000089+, YSC0089+, YSC0000090-, YSC0090-, YSC0000091-, YSC0000093+, YSC0000094-, YSC0000095+, YSC0000096+, YSC0000098+, YSC0098+, YSC0000099+, YSC0099+, L102-, Z5982-, 
S12879-, Z15734-, Z17706-, PR4936-, PR4937+, PR4938-, PR4939-, PR1222+, PR4940-, PR1223+, PR4942-, PR4943-, PR1224+, PR4944-, PR1225+, PR4945-, PR4946+, PR1226+, PR4947-, P231-, PR1228-, PR4948-, PR4949-, PR4950-, PR4951-, PR1229+, PR1230+, PR4952+, PR4953-, PR4954-, PR1231-, PR1232-, PR4955-, PR1233-, PR4956-, PR4957-, PF2260-, Z12355-, Z14830+, PF415+, S3050-, S12889+, PF416+, PF1379-, PF1380-, Z16134-, Z3237-, FGC12138-, PF6025-, S3051+

And here are the haplogroups assigned (as per ISOGG):
CTS4803 = G2a2b2a1b1a1b1
CTS11916/M5562/PF1901 = E
L150.1/PF6274.1/S351.1 = R1b1a2
M5562/CTS11916/PF1901 = E
M9021/PF288 = BT
PF288/M9021 = BT
PF509 = BT
PF620 = BT
PF627 = BT
PF3728/CTS4272 = I
PF6274.1/L150.1/S351.1 = R1b1a2
S351.1/L150.1/PF6274.1 = R1b1a2
V161.1 = A0a1a
V161.2 = A1
Y1193.2 = E1a
Z11997 = BT
Z12001 = BT
Z12006 = BT
Z12770 = H3a2
Z14048 = H1b1
Z14830 = D1b1d1a
Z17714 = CT

artemv
01-17-2020, 01:20 AM
KH150635 is strange: a very big BAM file and very few calls on the Y chromosome. I think it's a woman, but I would like your opinion.


A very big file with very few calls on the Y chromosome means a woman.
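
The rule of thumb above can also be checked from read counts instead of Y-SNP calls. The sketch below is a different technique from what was done in this thread and rests on stated assumptions: it takes the number of reads mapped to X and to Y (assumed to come from your own tooling) and computes the share of Y reads. The function names and the two cut-offs are placeholders that would need tuning to the data:

```python
def y_ratio(n_x: int, n_y: int) -> float:
    """Share of Y reads among reads mapped to X or Y."""
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads on X or Y")
    return n_y / total


def guess_sex(n_x: int, n_y: int,
              female_max: float = 0.016,
              male_min: float = 0.075) -> str:
    """Rough sex guess; both cut-offs are assumptions, not thread values."""
    ratio = y_ratio(n_x, n_y)
    if ratio <= female_max:
        return "female"
    if ratio >= male_min:
        return "male"
    return "undetermined"


# Made-up counts for illustration only
print(guess_sex(100_000, 300))
```

A ratio that falls between the two cut-offs is reported as undetermined, which is the honest answer for small or contaminated files.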

Romilius
01-18-2020, 07:22 AM
KH150630 belongs to haplogroup I of some sort (many calls on I level, few on its subclades).

Romilius
01-18-2020, 10:57 AM
KH150627 seems to be a woman.

Only 6 samples left!

Romilius
01-18-2020, 11:21 AM
KH150190 is a very small file, but it seems to belong to a male. The haplogroup is CT... but I can't say whether it is I or something similar. It also has calls at the R1b1a2 level (some are R1b/investigation... whatever that means in the BAM Analysis Kit) and at the E level. I would appreciate your help in working out the haplogroup.

Now I'm downloading KH150623... it's huge (23.7 GB)...

Romilius
01-18-2020, 11:28 AM
For KH150190, here is the Y haplogroup as per ISOGG:
CTS543/M5584/PF206 = CT
CTS1181/M5595 = CT
CTS2711 = CT
CTS4088/PF3868 = I
CTS5318/M5690 = CT
CTS5746/M5697 = CT
CTS5903/M9174/PF795/S1572 = BT
CTS6265/PF3871 = I
CTS6397 = I1b1
CTS6911/M5220/PF2372 = E1b1b1b1
CTS7257/M5718 = CT
CTS7517/M5722 = CT
CTS7831/PF3766 = I
CTS8243/M5736/PF891 = CT
CTS9948 = CT
CTS10512/M5772 = CT
CTS11468/FGC49 = R1b1a2
CTS12633/M5830/PF1329 = CT
F929/M578/PF3494/S6397 = HIJK
F2076/L524/PF3136 = G
FGC49/CTS11468 = R1b1a2
FGC756 = G2a1a1a1a1a1b1
FGC2045/Z12203 = GHIJK
FGC2412/Y1909 = I
L16/M522/PF3493/S138 = IJK
L141.1 = G2a2b2
L141.2 = R1b/(Private)
L149.1 = G2a
L149.2 = N1c1a1a1a1a1
L150.2!/PF6274.2!/S351.2! = R1b/(Investigation)
L292 = R1b/(Investigation)
L413/PF1409/V31 = BT
L418 = BT
L524/F2076/PF3136 = G
L681 = O/(Investigation)
L957 = CT
L971 = CT
L977 = CT
L990 = A0
L1004 = A1
L1009 = A1b
L1060/PF1021 = BT
L1061/PF1101 = BT
L1105 = A0-T
L1118 = A0-T
L1123 = A0-T
L1130 = A0-T
L1143 = A0-T
L1145 = A0-T
L1220/M9212 = BT
L1228 = I2a2a2a
L1251 = I2c1a
L1492/Y1569 = CT
M168/PF1416 = CT
M294 = CT
M522/L16/PF3493/S138 = IJK
M578/F929/PF3494/S6397 = HIJK
M5220/CTS6911/PF2372 = E1b1b1b1
M5584/CTS543/PF206 = CT
M5585/PF1414/V9 = CT
M5588/PF210 = CT
M5595/CTS1181 = CT
M5599/PF234 = CT
M5606/PF256 = CT
M5607 = CT
M5609 = CT
M5632 = CT
M5633 = CT
M5640/PF318 = CT
M5642 = CT
M5656 = CT
M5675/PF719 = CT
M5684 = CT
M5686 = CT
M5688 = CT
M5689 = CT
M5690/CTS5318 = CT
M5692 = CT
M5697/CTS5746 = CT
M5706/PF815 = CT
M5712 = CT
M5716/PF840 = CT
M5718/CTS7257 = CT
M5722/CTS7517 = CT
M5723/PF862 = CT
M5736/CTS8243/PF891 = CT
M5737/PF892 = CT
M5751/PF937 = CT
M5756/PF951 = CT
M5769/PF996 = CT
M5772/CTS10512 = CT
M5778/PF1031 = CT
M5809/PF1237 = CT
M5812 = CT
M5821/PF1269 = CT
M5823/PF1272 = CT
M5830/CTS12633/PF1329 = CT
M8951 = BT
M8957 = BT
M8969/PF1407/V21 = BT
M8972 = BT
M8980/PF229 = BT
M8983/PF230 = BT
M8986 = BT
M8994 = BT
M9003 = BT
M9010 = BT
M9017/PF282 = BT
M9031 = BT
M9032/PF304 = BT
M9042 = BT
M9046/PF324 = BT
M9070 = BT
M9075 = BT
M9080 = BT
M9081 = BT
M9083 = BT
M9097/PF672 = BT
M9098 = BT
M9099/PF674 = BT
M9105 = BT
M9124/PF701 = BT
M9125 = BT
M9129/PF707 = BT
M9135 = BT
M9136/PF724 = BT
M9174/CTS5903/PF795/S1572 = BT
M9178 = BT
M9187 = BT
M9188 = BT
M9204 = BT
M9210 = BT
M9212/L1220 = BT
M9213 = BT
M9218/PF860 = BT
M9226/PF869 = BT
M9227 = BT
M9228 = BT
M9230/PF870 = BT
M9231/PF876 = BT
M9240/PF896 = BT
M9242/PF899 = BT
M9246 = BT
M9262/PF932 = BT
M9263 = BT
M9269 = BT
M9285 = BT
M9293/PF997 = BT
M9300 = BT
M9303 = BT
M9312 = BT
M9315/PF1033 = BT
M9317 = BT
M9319 = BT
M9321/PF1045 = BT
M9334 = BT
M9335/PF1060 = BT
M9336/Page26 = BT
M9338/PF1064 = BT
M9343 = BT
M9357/PF1209 = BT
M9365/PF1218 = BT
M9370 = BT
M9378 = BT
M9379/PF1253 = BT
M9394/PF1271 = BT
M9397 = BT
M9405 = BT
M9420 = BT
M9421 = BT
M11756/PF301 = BT
M11773/PF809 = BT
M11779/PF1027 = BT
P80 = H1b1a
P97 = BT
P126 = IJ
P135/PF2741 = F
P141/PF2602 = F
P142 = F
P143/PF2587 = CF
P158/PF2706 = F
P159/PF2717 = F
P160/PF2618 = F
P187/PF2632 = F
P198 = O3
PF206/CTS543/M5584 = CT
P266 = H1b1b
P316 = F
Page26/M9336 = BT
PF38 = CT
PF210/M5588 = CT
PF228 = CT
PF229/M8980 = BT
PF230/M8983 = BT
PF234/M5599 = CT
PF256/M5606 = CT
PF282/M9017 = BT
PF301/M11756 = BT
PF304/M9032 = BT
PF318/M5640 = CT
PF324/M9046 = BT
PF342 = CT
PF420 = BT
PF672/M9097 = BT
PF674/M9099 = BT
PF701/M9124 = BT
PF707/M9129 = BT
PF719/M5675 = CT
PF724/M9136 = BT
PF795/CTS5903/M9174/S1572 = BT
PF809/M11773 = BT
PF815/M5706 = CT
PF840/M5716 = CT
PF860/M9218 = BT
PF862/M5723 = CT
PF869/M9226 = BT
PF870/M9230 = BT
PF876/M9231 = BT
PF891/CTS8243/M5736 = CT
PF892/M5737 = CT
PF896/M9240 = BT
PF899/M9242 = BT
PF932/M9262 = BT
PF937/M5751 = CT
PF951/M5756 = CT
PF970 = CT
PF996/M5769 = CT
PF997/M9293 = BT
PF1021/L1060 = BT
PF1027/M11779 = BT
PF1031/M5778 = CT
PF1033/M9315 = BT
PF1045/M9321 = BT
PF1060/M9335 = BT
PF1064/M9338 = BT
PF1101/L1061 = BT
PF1109 = CT
PF1184 = BT
PF1218/M9365 = BT
PF1237/M5809 = CT
PF1249 = BT
PF1253/M9379 = BT
PF1269/M5821 = CT
PF1271/M9394 = BT
PF1272/M5823 = CT
PF1329/CTS12633/M5830 = CT
PF1407/M8969/V21 = BT
PF1409/L413/V31 = BT
PF1410/V235 = BT
PF1414/M5585/V9 = CT
PF1416/M168 = CT
PF2372/CTS6911/M5220 = E1b1b1b1
PF2587/P143 = CF
PF2602/P141 = F
PF2618/P160 = F
PF2632/P187 = F
PF2706/P158 = F
PF2717/P159 = F
PF2741/P135 = F
PF2794 = G2
PF3493/L16/M522/S138 = IJK
PF3494/F929/M578/S6397 = HIJK
PF3766/CTS7831 = I
PF3868/CTS4088 = I
PF3871/CTS6265 = I
PF6274.2!/L150.2!/S351.2! = R1b/(Investigation)
S138/L16/M522/PF3493 = IJK
S351.2!/L150.2!/PF6274.2! = R1b/(Investigation)
S1572/CTS5903/M9174/PF795 = BT
S6397/F929/M578/PF3494 = HIJK
V9/M5585/PF1414 = CT
V21/M8969/PF1407 = BT
V31/L413/PF1409 = BT
V161.1 = A0a1a
V161.2 = A1
V234 = B/(Investigation)
V235/PF1410 = BT
Y1450 = CT
Y1493 = CT
Y1525 = CT
Y1567 = CT
Y1569/L1492 = CT
Y1791 = CT
Y1909/FGC2412 = I
Z1518 = D1b2
Z6106 = G
Z11941 = BT
Z11983 = BT
Z11989 = BT
Z11993 = BT
Z11999 = BT
Z12001 = BT
Z12007 = BT
Z12031 = BT
Z12077 = BT
Z12139 = BT
Z12203/FGC2045 = GHIJK
Z13364 = H3a
Z14050 = H1b1
Z17339 = BT
Z17371 = BT
Z17390 = BT
Z17710 = CT
Z17714 = CT
Z17718 = CT

artemv
01-18-2020, 02:42 PM
I've started a check on KH150614_KH150615. You wrote you are going to check it, but posted no results.
About KH150190: this is a small file, and I thought I had already checked all the small files, but it looks like I missed this one. I will do it quickly after KH150614_KH150615 finishes.
You should probably hide all the long SNP lists under a spoiler.

About your results on KH150190: it looks like I* (quite a surprising result), but let me check the file myself. And always look not just at the positive SNPs, but also at the negative ones. Does he have any calls at the I2 or I1 level?

Romilius
01-18-2020, 04:43 PM
I've started a check on KH150614_KH150615. You wrote you are going to check it, but posted no results.
About KH150190: this is a small file, and I thought I had already checked all the small files, but it looks like I missed this one. I will do it quickly after KH150614_KH150615 finishes.
You should probably hide all the long SNP lists under a spoiler.

About your results on KH150190: it looks like I* (quite a surprising result), but let me check the file myself. And always look not just at the positive SNPs, but also at the negative ones. Does he have any calls at the I2 or I1 level?

Yes, sorry. I started with KH150614_KH150615, but when you said you had downloaded the sample just before it, I decided to start from the last one in the BAM file list and work backwards, so as not to process the same sample twice.

At the I2 level I think yes, but only a few.

ArmandoR1b
01-18-2020, 08:07 PM
For KH150190, here is the Y haplogroup as per ISOGG:


What program do you use? You missed some important calls such as FGC18616 and S22470



About your results on KH150190: it looks like I* (quite a surprising result), but let me check the file myself. And always look not just at the positive SNPs, but also at the negative ones. Does he have any calls at the I2 or I1 level?

Here is a table of pertinent derived calls for KH150190 that are downstream of I and are not G->A or C->T, so they are not likely to be due to deamination:



pos       marker_name     haplogroup  mutation  anc  der  reads  called_perc  called_base  state
8211182   PF3651          I2          C->G      C    G    2      100          G            D
8571993   PF3893          I2a2        T->C      T    C    1      100          C            D
8634040   S6631/SK1266    I2a2        C->A      C    A    4      100          A            D
6934194   PF6915          I2a2a       A->G      A    G    1      100          G            D
18426990  FGC18616/Y7867  I2a2a1a     T->C      T    C    1      100          C            D
20818893  S22470          I2a2a1a     G->T      G    T    1      100          T            D
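
The filter described above (keep only derived calls whose mutation is not G->A or C->T) can be sketched in a few lines of Python. The names DEAMINATION_PRONE and is_deamination_prone are hypothetical, and the tuple layout of the calls is an assumption made for illustration:

```python
# Mutation types that post-mortem deamination can mimic in ancient DNA
DEAMINATION_PRONE = {("C", "T"), ("G", "A")}


def is_deamination_prone(ancestral: str, derived: str) -> bool:
    """True if an ancestral->derived change looks like deamination damage."""
    return (ancestral.upper(), derived.upper()) in DEAMINATION_PRONE


# (marker, ancestral, derived); L1251.1 is the C->T SNP discussed earlier
calls = [
    ("PF3651", "C", "G"),
    ("PF3893", "T", "C"),
    ("L1251.1", "C", "T"),
]
robust = [name for name, anc, der in calls
          if not is_deamination_prone(anc, der)]
print(robust)
```

A derived call that fails this filter is not necessarily wrong; it just needs support from other samples or from more reads before it is trusted.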

Romilius
01-18-2020, 08:59 PM
What program do you use? You missed some important calls such as FGC18616 and S22470

I use the old version of the BAM Analysis Kit, because the new one gives me some problems.

Romilius
01-18-2020, 09:00 PM
KH150623 is I2a and maybe I2a2 (some calls on I2a2 level, only one on I2a1 level).

ArmandoR1b
01-18-2020, 10:27 PM
I use the old version of the BAM Analysis Kit, because the new one gives me some problems.

Which old one? I can upload a file that you can use with the old program so you can get a better output but I would need to know which version so I can upload it with the right format and file name.

artemv
01-19-2020, 05:41 AM
OK, KH150614_KH150615 has finished.
He is I-M838 (L880). Again a haplogroup that is very rare now: just 2 examples on y-full, no internal subbranches.
Just a short reminder: this is one of the haplogroups found in nearby Blätterhöhle. I think that's an important result.
https://anthrogenica.com/showthread.php?8066-Genetic-Genealogy-amp-Ancient-DNA-in-the-News-(DISCUSSION-ONLY)/page301

Some details:
All the L880 synonyms are derived:
L880+ (2 reads)
Y28228+ (5 reads)
Y28222+ (3 reads)
Y28224+ (6 reads)
Y28226+ (5 reads)
Y28223+ (5 reads)
Y28225+ (1 read)
Y28227+ (9 reads)
Y28229+ (5 reads)
M838+ (8 reads)
M4120+ (3 reads)
S7845+ (6 reads)
Y28221+ (8 reads)
Y28230+ (3 reads)

Now let's return to KH150190.
I do not know what is wrong with your software. He is definitely pre-I-S9234, like the majority of people from this paper.
Out of 5 S9234 synonyms we have 4 derived, 1 ancestral.

artemv
01-19-2020, 05:52 AM
What program do you use? You missed some important calls such as FGC18616 and S22470



Here is a table of pertinent derived calls for KH150190 that are downstream from I that are not G->A or C->T so not likely to be due to deamination



pos       marker_name     haplogroup  mutation  anc  der  reads  called_perc  called_base  state
8211182   PF3651          I2          C->G      C    G    2      100          G            D
8571993   PF3893          I2a2        T->C      T    C    1      100          C            D
8634040   S6631/SK1266    I2a2        C->A      C    A    4      100          A            D
6934194   PF6915          I2a2a       A->G      A    G    1      100          G            D
18426990  FGC18616/Y7867  I2a2a1a     T->C      T    C    1      100          C            D
20818893  S22470          I2a2a1a     G->T      G    T    1      100          T            D

Neither of his C->T SNPs is due to deamination, because all the other samples from the study that are pre-I-S9234 always have the same S9234 synonyms ancestral or derived.
L1251.1 and FGC18554 are positive in all samples I've marked as pre-I-S9234 that have calls for them.
If this were deamination, we would see mixed calls.

Romilius
01-19-2020, 05:08 PM
Which old one? I can upload a file that you can use with the old program so you can get a better output but I would need to know which version so I can upload it with the right format and file name.

It is the version you can download from y-str.org: version 1.8 (August 2015).

Romilius
01-19-2020, 08:11 PM
KH150619 seems to be a woman.

I think now there are only 2 samples left to run.

ArmandoR1b
01-19-2020, 11:41 PM
Neither of his C->T SNPs is due to deamination, because all the other samples from the study that are pre-I-S9234 always have the same S9234 synonyms ancestral or derived.
L1251.1 and FGC18554 are positive in all samples I've marked as pre-I-S9234 that have calls for them.
If this were deamination, we would see mixed calls.

The C->T SNPs that I was referring to aren't phylogenetic equivalents of I-S9234, and since there are plenty of derived reads due to deamination, I filtered all of them out. I am interested in the derived SNPs that can't be argued to be due to deamination. By the way, a synonym and a phylogenetic equivalent are two different things, and you are using the word synonym when the context of your statement shows that you are referring to phylogenetic equivalents. FGC18616 and Y7867 are an example of synonyms because they refer to the exact same mutation: different names, same thing. That is what a synonym (https://www.dictionary.com/browse/synonym) is. FGC18616/Y7867 and S22470 are an example of phylogenetic equivalents, since they are on the same level.

Lastly, there is no reason to call a person who is positive for an SNP that is a phylogenetic equivalent of I-S9234, but has no coverage on S9234 itself, pre-I-S9234. "Pre" would mean prior to S9234, and with no read for S9234 there is no way to know whether a specimen is ancestral for it. Only a specimen that is ancestral for S9234 while derived for other SNPs on the same level could be called pre-I-S9234.
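
The labelling rule argued for above can be written down as a small decision function. This is only a sketch of that reasoning, not any tool's behaviour: the function name, the label strings, the "A" code for ancestral (the table only shows "D") and the call states in the usage lines are all assumptions made for illustration.

```python
def label_for(defining, equivalents, states):
    """Label a sample relative to a branch.

    defining:    name of the branch-defining SNP (e.g. "S9234")
    equivalents: names of SNPs on the same level as the defining SNP
    states:      dict of SNP name -> "D" (derived) or "A" (ancestral);
                 an SNP with no reads is simply absent from the dict
    """
    derived_equivalent = any(states.get(snp) == "D" for snp in equivalents)
    own = states.get(defining)  # None means no read covers the defining SNP

    if own == "D":
        return f"I-{defining}"
    if own == "A" and derived_equivalent:
        # Ancestral for the defining SNP but derived on the same level.
        return f"pre-I-{defining}"
    if own is None and derived_equivalent:
        # "pre" cannot be established without a read on the defining SNP.
        return f"I-{defining} (equivalent only, {defining} not covered)"
    return "undetermined"


equivalents = ["L1251.1", "FGC18554"]
# Hypothetical call states, for illustration only:
print(label_for("S9234", equivalents, {"L1251.1": "D"}))
print(label_for("S9234", equivalents, {"S9234": "A", "FGC18554": "D"}))
```

The first usage line gives the "equivalent only" label and the second gives the "pre-" label, which is the distinction being drawn in the paragraph above.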

ArmandoR1b
01-20-2020, 12:15 AM
It is the version you can download from y-str.org: version 1.8 (August 2015).

It turns out that for that version I would have to use an Excel macro to get a report with a more current list of SNPs. I can get you a more up-to-date Excel file with a macro, if you want it, so that your report doesn't miss so many pertinent SNPs.

artemv
01-20-2020, 01:22 AM
The C->T SNPs that I was referring to aren't phylogenetic equivalents of I-S9234, and since there are plenty of derived reads due to deamination, I filtered all of them out. I am interested in the derived SNPs that can't be argued to be due to deamination. By the way, a synonym and a phylogenetic equivalent are two different things, and you are using the word synonym when the context of your statement shows that you are referring to phylogenetic equivalents. FGC18616 and Y7867 are an example of synonyms because they refer to the exact same mutation: different names, same thing. That is what a synonym (https://www.dictionary.com/browse/synonym) is. FGC18616/Y7867 and S22470 are an example of phylogenetic equivalents, since they are on the same level.

Lastly, there is no reason to call a person who is positive for an SNP that is a phylogenetic equivalent of I-S9234, but has no coverage on S9234 itself, pre-I-S9234. "Pre" would mean prior to S9234, and with no read for S9234 there is no way to know whether a specimen is ancestral for it. Only a specimen that is ancestral for S9234 while derived for other SNPs on the same level could be called pre-I-S9234.

1. OK, it looks like I have indeed mixed up "phylogenetic equivalent" and "synonym".
2. I call them pre-I-S9234 because all of the samples are positive for some phylogenetic equivalents of I-S9234, while they are all negative for others.
I-S9234 is a 3000-year-old haplogroup, while the samples are 5000 years old.

I took all the samples with good coverage that belong to this haplogroup and compared the phylogenetic equivalents of I-S9234.
Out of 53 I-S9234 phylogenetic equivalents (on ISOGG) I can see 17 negative SNPs (in all the files they are always either negative or no call) and 35 positive SNPs (in all the files they are always either positive or no call).

I found only one SNP that is sometimes positive and sometimes negative (BY1802 is positive for KH150637 and KH150613_KH180043 but negative for KH150633). But after checking the results more closely, I decided that this is likely an error: BY1802 has only a single read for KH150633, and the SNP itself is located on the edge of a read.
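
The comparison described above (each SNP is always positive, always negative, or mixed across samples, with no-calls ignored) could be sketched like this. It is an illustration only, not the procedure actually used: the function name and data layout are assumptions, and the few calls listed are the ones quoted in this thread (KH150287 from the first post, BY1802 from this post), not the full set of 53 SNPs.

```python
from collections import defaultdict


def classify_snps(calls):
    """Group SNPs by how consistently they are called across samples.

    calls: iterable of (sample, snp, state) with state "+" or "-";
           no-calls are simply left out of the input.
    """
    seen = defaultdict(set)
    for _sample, snp, state in calls:
        seen[snp].add(state)

    result = {}
    for snp, states in seen.items():
        if states == {"+"}:
            result[snp] = "always positive"
        elif states == {"-"}:
            result[snp] = "always negative"
        else:
            result[snp] = "mixed"
    return result


calls = [
    ("KH150287", "FGC18575", "+"),
    ("KH150287", "Y1049.2", "-"),
    ("KH150287", "F313", "-"),
    ("KH150637", "BY1802", "+"),
    ("KH150613_KH180043", "BY1802", "+"),
    ("KH150633", "BY1802", "-"),  # single read, at the edge of the read
]

for snp, verdict in classify_snps(calls).items():
    print(snp, verdict)

# Sanity check on the counts in the post: negative + positive + mixed.
assert 17 + 35 + 1 == 53
```

A "mixed" verdict only flags an SNP for a closer look; as with BY1802, a single low-quality read can be enough to produce it.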

Romilius
01-21-2020, 05:20 AM
KH150622 seems to be a woman.