Central-West Asian Y-DNA I



DMXX
02-06-2017, 12:19 AM
Just revisited Di Cristofaro et al. 2013's uniparental data from Central Asia and had a good look at the Y-DNA I's for some unknown reason (intuition? Cannot pin down the precise motivation here).

This peculiar result popped out (showing Y-Filer panel):



Y defining marker: I-M258
Sample: AZ6_5
Population: Hazara-Bamiyan

Marker              Value
DYS389-I            12
DYS389-II           28
DYS390              22
DYS456              15
DYS394/19           15
DYS385a             12
DYS385b             14
DYS458              16
DYS437              16
DYS438              10
DYS448              20
YGATAH4             12
DYS391              10
DYS392              11
DYS393              15
DYS439              11
YGATAC4/Y_DYS635    21
DYS388              12
DYS426              11


I've had a few back and forths in the past with K Nordtvedt regarding Asian Y-DNA I. It's mostly I2 (specifically I2a2a-M223; found in Iran and Central Asia). This sample is negative for M223, M436 and M438.

Any thoughts, ladies and gents? I'll provide the full 44 Y-STR results if requested/when I find time later.

Gravetto-Danubian
02-06-2017, 12:30 AM
You're saying it's mostly M223 but it seems negative for M223?

DMXX
02-06-2017, 12:41 AM
Sorry if my wording was confusing:

The majority of Central Asian (+Iranian) I is I2a2a-M223, both in this study and in earlier ones. However, this particular sample does not belong to I2a2a-M223, I2a2-M436 or even I2-M438.

The authors included the following upstream I SNPs in their genotype panel:



... I-M258, I1-M253, I2-M438 ...


By deduction, this sample has to be I*-M258. Which, I believe, is the first time that has been reported anywhere outside Europe.
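For anyone who wants the deduction spelled out, here is a minimal Python sketch of it. It is only an illustration: the tiny tree holds just the branches named in this thread, the function name deepest_branch is made up, and the M253-negative call for AZ6_5 is an assumption implied by the deduction rather than something quoted from the paper.

```python
# Tiny slice of the haplogroup I tree, limited to branches named in the thread.
# This is an illustration, not the study's full genotyping panel.
CHILDREN = {
    "I-M258": ["I1-M253", "I2-M438"],
    "I2-M438": ["I2a2-M436"],
    "I2a2-M436": ["I2a2a-M223"],
}


def snp_of(branch):
    """Return the SNP part of a label such as 'I2-M438' -> 'M438'."""
    return branch.split("-")[1]


def deepest_branch(calls, root="I-M258"):
    """Follow derived (True) calls down the tree from root.

    calls maps SNP name -> True (derived), False (ancestral) or is missing
    (not tested). Returns None if the root SNP is not derived. A '*' is added
    only when every known child branch was tested and came back ancestral.
    """
    if calls.get(snp_of(root)) is not True:
        return None
    branch = root
    while True:
        children = CHILDREN.get(branch, [])
        derived = [c for c in children if calls.get(snp_of(c)) is True]
        if derived:
            branch = derived[0]
            continue
        if children and all(calls.get(snp_of(c)) is False for c in children):
            name, snp = branch.split("-")
            return f"{name}*-{snp}"
        return branch


# AZ6_5 as described in the thread; M253 = False is an assumption.
az6_5 = {"M258": True, "M253": False, "M438": False, "M436": False, "M223": False}
print(deepest_branch(az6_5))
```

Note that this logic cannot tell a genuine I* from a false-positive M258 call: both produce the same pattern of one positive root and negative subbranches.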

Grugni et al. 2012 reported a handful of IJ*-M429 in Iran. (http://3.bp.blogspot.com/-9ItFg3ZDOCc/UAfrJKKY7aI/AAAAAAAAFC8/WQfxUO6_9Vw/s1600/journal.pone.0041252.t001.jpg) I haven't seen any detailing of Iranian I through STRs or SNPs to ascertain whether they're recent or prehistoric relics, potentially relating to the pre-Villabruna population.

Impression: Wayward lineage ?source. ;)

Agamemnon
02-06-2017, 12:43 AM
Sorry if my wording was confusing:

The majority of Central Asian (+Iranian) I is I2a2a-M223, both in this study and in earlier ones. However, this particular sample does not belong to I2a2a-M223, I2a2-M436 or even I2-M438.

The authors included the following upstream I SNPs in their genotype panel:

... I-M258, I1-M253, I2-M438 ...

By deduction, this sample has to be I*-M258. Which, I believe, is the first time that has been reported anywhere outside Europe.

Grugni et al. 2012 reported a handful of IJ*-M429 in Iran.

Impression: Wayward lineage ?source. ;)

I remember that, IJ* was also found in one of the Vestonice samples IIRC. Looks like the Iranian plateau might prove very relevant to IJ's emergence and diversification, I'd certainly like to see more ancient data from Iran.

DMXX
02-06-2017, 12:55 AM
I remember that, IJ* was also found in one of the Vestonice samples IIRC.

Right you are - Genetiker called it. (https://genetiker.wordpress.com/y-snp-calls-for-vestonice-43/)

There are two likely scenarios with this Afghan I* sample: either it is a prehistoric and (literally) drifted offshoot of the pre-Villabruna population that existed further west, or it is a more recent export from the Eurasian steppes.

All the Y-DNA I retrieved so far from the steppes has been I2a2a (L699+ in the Yamnaya Samara site, Genetiker also identifies Ukraine_HG1 as I2a2a-M223 (https://genetiker.wordpress.com/2017/02/04/y-snp-calls-from-mesolithic-and-neolithic-latvia-and-ukraine/)). We can reasonably determine that the I2a2a-M223 found in Central Asia (including Iran) probably has LNBA steppe origins... But not this I* outlier.

Agamemnon
02-06-2017, 01:12 AM
Right you are - Genetiker called it. (https://genetiker.wordpress.com/y-snp-calls-for-vestonice-43/)

There are two likely scenarios with this Afghan I* sample: either it is a prehistoric and (literally) drifted offshoot of the pre-Villabruna population that existed further west, or it is a more recent export from the Eurasian steppes.

All the Y-DNA I retrieved so far from the steppes has been I2a2a (L699+ in the Yamnaya Samara site, Genetiker also identifies Ukraine_HG1 as I2a2a-M223 (https://genetiker.wordpress.com/2017/02/04/y-snp-calls-from-mesolithic-and-neolithic-latvia-and-ukraine/)). We can reasonably determine that the I2a2a-M223 found in Central Asia (including Iran) probably has LNBA steppe origins... But not this I* outlier.

The former scenario (pre-Villabruna) makes more sense to me right now, but we can't jump to conclusions just yet, especially considering the lack of ancient data from the region.

RCO
02-06-2017, 01:25 AM
We desperately need a good article on NGS Y-DNA from Iran, the Caspian Sea, Central Asia, Eastern Anatolia and Northern Iraq, but after two years nothing has happened.

DMXX
02-06-2017, 01:31 AM
The most instructive piece of additional evidence to weigh either scenario against the other, for the time being, is STR haplotype comparisons. I have little familiarity with I haplotypes, so I'm hoping a more learned enthusiast would shed some light regarding how "typical" this sample looks relative to the more basal European I's that have been picked up.

Gravetto-Danubian
02-06-2017, 02:09 AM
Sorry if my wording was confusing:

The majority of Central Asian (+Iranian) I is I2a2a-M223, both in this study and in earlier ones. However, this particular sample does not belong to I2a2a-M223, I2a2-M436 or even I2-M438.

The authors included the following upstream I SNPs in their genotype panel:

... I-M258, I1-M253, I2-M438 ...

By deduction, this sample has to be I*-M258. Which, I believe, is the first time that has been reported anywhere outside Europe.

Grugni et al. 2012 reported a handful of IJ*-M429 in Iran. (http://3.bp.blogspot.com/-9ItFg3ZDOCc/UAfrJKKY7aI/AAAAAAAAFC8/WQfxUO6_9Vw/s1600/journal.pone.0041252.t001.jpg) I haven't seen any detailing of Iranian I through STRs or SNPs to ascertain whether they're recent or prehistoric relics, potentially relating to the pre-Villabruna population.

Impression: Wayward lineage ?source. ;)


I'd treat those results with caution, because they were non-reproducible.
I also recall another study with purported I* (based on limited SNP tests), but the STR profile looked like J or even G2.

Agamemnon
02-06-2017, 02:33 AM
We desperately need a good article on NGS Y-DNA from Iran, the Caspian Sea, Central Asia, Eastern Anatolia and Northern Iraq, but after two years nothing has happened.

I'd like to see ancient data from all over Iraq, Syria and the Arabian peninsula for that matter, not just the northern parts of SW Asia.

eastara
02-06-2017, 11:01 AM
I think this is just a bug in Table S4. If you scroll further up you will see identical haplotypes listed as G-P303: see Hazara-Bamiyan AZ6_53 and AZ6_64.

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0076748#s6

gravetti
02-06-2017, 11:17 AM
Haplogroup I was found outside of Europe in 2,000-year-old remains from Mongolia.

Two of four Hungarian conquerors from the 10th century are I2a.

Sample 12 is I2a1b3a>S17250.
Probability 63,42
Fitness 48,81

mtDNA: A12

Sample 17 is I2a1b3a>4460>4318
Probability 87,52
Fitness 29,66

mtDNA: H6a1a

http://link.springer.com/article/10.1007/s00438-016-1267-z

http://biorxiv.org/content/biorxiv/early/2016/12/07/092239.full.pdf

DMXX
02-06-2017, 12:53 PM
Very keen eye eastara, thanks for pointing that out!

This I* sample indeed matches several G2-P303 samples exactly across 17/17 Y-STRs, so it looks like a typo in the original entry.
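For reference, this kind of marker-by-marker check is easy to script. Below is a minimal Python sketch, not the method used in the paper. Since the G-P303 rows from Table S4 are not reproduced in this thread, the example compares the two 19-marker haplotypes that are posted here (AZ6_5 and the Tajik I2b-M223). The function name compare is made up for illustration.

```python
# Marker order as used in the haplotype tables posted in this thread.
MARKERS = [
    "DYS389-I", "DYS389-II", "DYS390", "DYS456", "DYS394/19", "DYS385a",
    "DYS385b", "DYS458", "DYS437", "DYS438", "DYS448", "YGATAH4", "DYS391",
    "DYS392", "DYS393", "DYS439", "YGATAC4/Y_DYS635", "DYS388", "DYS426",
]

# Haplotypes copied from the thread.
AZ6_5 = [12, 28, 22, 15, 15, 12, 14, 16, 16, 10, 20, 12, 10, 11, 15, 11, 21, 12, 11]
TAJIK_M223 = [14, 32, 23, 13, 16, 14, 14, 16, 14, 10, 19, 11, 10, 12, 14, 13, 21, 13, 11]


def compare(a, b, markers=MARKERS):
    """Return (matching markers, total repeat-step difference, per-marker diffs).

    Simplification: every marker is treated independently. In practice
    DYS389-II includes the DYS389-I repeats and DYS385a/b is a multi-copy
    marker, so a careful comparison would handle those separately.
    """
    if not (len(a) == len(b) == len(markers)):
        raise ValueError("haplotypes and marker list must have the same length")
    diffs = {m: abs(x - y) for m, x, y in zip(markers, a, b) if x != y}
    matches = len(markers) - len(diffs)
    steps = sum(diffs.values())
    return matches, steps, diffs


matches, steps, diffs = compare(AZ6_5, TAJIK_M223)
print(f"{matches}/{len(MARKERS)} markers match, {steps} repeat steps apart")
```

An exact match like the one reported for the G-P303 samples would come back with every compared marker matching and zero steps.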

eastara
02-06-2017, 11:42 PM
Most predictors gave him as haplogroup G, that's why I looked at the other Gs in the study and noticed the exact matches.
However, it may not be just a typo but a lab mistake. If they tested in stages, a person falsely typed as I-M258+ would have been tested further for the haplogroup I subbranches and would, of course, have been negative for everything. This may also be the case for other I* and IJ* from different studies, and even for the large percentage of I2a found among Iranians and Kurds in one.
Lately I noticed an interesting case in the Bulgarian DNA project. A Bulgarian Turk, who is however from the Alevi minority, has tested as what looks like Dinaric South. However, his haplotype is unusual and has no matches at any level. The Alevis are supposedly of Iranian origin, so I was wondering: is this the mythical Iranian I2a at last? In this case it is unlikely to be just a Turkified local, as the Alevi were persecuted by the Ottoman authorities and it was not beneficial to become one.
The person has promised to test deeper, so we will see.

gravetti
02-07-2017, 10:24 AM
The Tajik-Takhar sample TJ7_14 in Table S5 (no. 336) is I-M423. Is that correct or a lab mistake?

Gravetto-Danubian
02-07-2017, 11:31 AM
The Tajik-Takhar sample TJ7_14 in Table S5 (no. 336) is I-M423. Is that correct or a lab mistake?

You also said a Mongol had I.
Which study?

eastara
02-07-2017, 11:42 AM
The Tajik-Takhar sample TJ7_14 in Table S5 (no. 336) is I-M423. Is that correct or a lab mistake?

Predictors give the Tajik as 100% I2a-L621 according to his STR haplotype, so it is not a mistake. However it could be a Russian Slav introgression and not native for the region.

MfA
02-07-2017, 02:21 PM
Can you guys check whether any prediction in the spreadsheet below doesn't seem right, and whether you have a better one? (FYI: the rows with an empty cell under the Haplogroup column are predictions and/or STR comparisons from FTDNA projects, not SNP-confirmed.)

https://docs.google.com/spreadsheets/d/1wI9uarzrf4i9yrI1j1HKcuOqJbQY2XWfAh9JSr2klg8/edit?usp=sharing

gravetti
02-07-2017, 03:37 PM
You also said a Mongol had I.
Which study?

https://en.wikipedia.org/wiki/Haplogroup_I-M170
1.Di Cristofaro 2013. (1/160 hg.I)
2.Keyser-Tracqui...August 2003.

gravetti
02-07-2017, 03:54 PM
Predictors give the Tajik as 100% I2a-L621 according to his STR haplotype, so it is not a mistake. However it could be a Russian Slav introgression and not native for the region.

Who knows? There are 5x I-M223, 1x M423, 2x M436 and 2x M438 in this study.

I-M436, MG3_53 is from Northwest Mongolia.

There is I-M170 in Siberia too.

DMXX
02-07-2017, 06:49 PM
As stated earlier, I've been in contact with Nordtvedt over the years regarding the sporadic cases of Y-DNA I reported in West and Central Asia. The STR count in the older studies prevented him from pinning down the exact subclades some of those earlier samples belonged to, but he did notice one of the Tajik haplotypes I sent him resembled the "Continental2" cluster, which he told me was characteristic of Central Europe.

Thousands of Russians were purportedly in Afghanistan (https://en.wikipedia.org/wiki/Russians_in_Afghanistan) prior to the Soviet-Afghan war. However, there's a rather obvious culturo-religious divide between the two peoples. There probably are some Russian-derived Y-DNA I lines in Afghanistan, but I consider it quite unlikely that ethnic Russians are the primary source of these haplotypes.

[Edit]: Fished up the haplotype Nordtvedt said resembled the Central European cluster:



Y defining marker: I2b-M223
Sample: Tajik

Marker              Value
DYS389-I            14
DYS389-II           32
DYS390              23
DYS456              13
DYS394/19           16
DYS385a             14
DYS385b             14
DYS458              16
DYS437              14
DYS438              10
DYS448              19
YGATAH4             11
DYS391              10
DYS392              12
DYS393              14
DYS439              13
YGATAC4/Y_DYS635    21
DYS388              13
DYS426              11

dp
02-08-2017, 12:54 AM
The most instructive piece of additional evidence to weigh either scenario against the other, for the time being, is STR haplotype comparisons. I have little familiarity with I haplotypes, so I'm hoping a more learned enthusiast would shed some light regarding how "typical" this sample looks relative to the more basal European I's that have been picked up.

If I scanned the original haplotype correctly, it was not tested for DYS455. As such it may not be relevant but the danishdemes site (http://danishdemes.org/YDNA-results-HgI1.html) puts DYS455=8 being found in 98% of I1's, with the other 2% being either 7s or 9s.


Among European haplotypes, DYS455=8 is virtually exclusive to I1. Most males are 11 at this marker, and the deletion to 8 in Hg I1 is believed to have taken place about 10,000 years ago. Since then, about 2% of I1's have mutated to 7 or 9, but some 98% of I1's still remain 8 at DYS455

One comment they make on DYS455=8 is that it is also found in a branch of J


Only a small, Middle Eastern subclade of J2b1a is known to also be 8 at DYS455, however that deletion is believed to have taken place only about a thousand years ago.


Perhaps this is why I've seen posts in this thread mentioning predictions in both I and J.

eastara
02-08-2017, 05:58 AM
As stated earlier, I've been in contact with Nordtvedt over the years regarding the sporadic cases of Y-DNA I reported in West and Central Asia. The STR count in the older studies prevented him from pinning down the exact subclades some of those earlier samples belonged to, but he did notice one of the Tajik haplotypes I sent him resembled the "Continental2" cluster, which he told me was characteristic of Central Europe.

Thousands of Russians were purportedly in Afghanistan (https://en.wikipedia.org/wiki/Russians_in_Afghanistan) prior to the Soviet-Afghan war. However, there's a rather obvious culturo-religious divide between the two peoples. There probably are some Russian-derived Y-DNA I lines in Afghanistan, but I consider it quite unlikely that ethnic Russians are the primary source of these haplotypes.

[Edit]: Fished up the haplotype Nordtvedt said resembled the Central European cluster:



Y defining marker: I2b-M223
Sample: Tajik

Marker              Value
DYS389-I            14
DYS389-II           32
DYS390              23
DYS456              13
DYS394/19           16
DYS385a             14
DYS385b             14
DYS458              16
DYS437              14
DYS438              10
DYS448              19
YGATAH4             11
DYS391              10
DYS392              12
DYS393              14
DYS439              13
YGATAC4/Y_DYS635    21
DYS388              13
DYS426              11


When I said I2a-M423 may be a Slavic admixture, I did not mean something left directly by the Russian occupation of Afghanistan in the 80s. One route of arrival could be the Volga and Crimean Tatars of Russia, who have some I2a.
Remember the excitement when haplogroup I was found in the study of the Uyghur province in China? However, most did not notice that it was found only among the Tatar minority. These were descendants of Volga Tatars who came to China as traders in the 19th century; some ended up there after the demise of the Siberian White Army, a large part of which was Tatar.
Well, some Tatars do live in Bamyan, and maybe in other Afghan provinces, and may have mixed with the locals. We can go further back to the time of the Mongols and the Golden Horde; Russians were once a part of it. Not to mention the Balkan connection with Alexander the Great, which was a favourite explanation for J2b2 in that area until it was found that they belong to a completely different branch than the Europeans.
However, there is definitely I2a2a (I2b1) in that region, which goes back to the Bronze Age. I2b1 was found in Eastern Europe during the Neolithic and the Bronze Age and was spread with the back migration of "Scythians" to Central Asia. It is not just a German haplogroup, as was speculated before. The only other I branch which is found outside Europe and is probably ancient is I2c2; however, it is localised mainly around Anatolia and the Caucasus.

eastara
02-08-2017, 06:20 AM
There is some homoplasy between I1 and G2a haplotypes, especially at 12 STR markers.
We had several cases in the Bulgarian DNA project where a newly tested member was not assigned a haplogroup, and while we were arguing whether he was I1 or I2 the Backbone test came back as G.
When predicting unusual and ambiguous haplotypes I always apply Occam's razor. For example, if a Bulgarian matches both West European and Balkan haplogroups and branches (as sometimes happens with R1b), I presume he is from the Balkans.
If a Central Asian resembles both I and J, I would presume he is J, as that is common in that region.

bicicleur
02-08-2017, 08:12 AM
I remember that, IJ* was also found in one of the Vestonice samples IIRC. Looks like the Iranian plateau might prove very relevant to IJ's emergence and diversification, I'd certainly like to see more ancient data from Iran.

I agree with you. The Vestonice IJ probably went extinct.
But the Gravettians are I and IJ newcomers in Europe, mixing with the autochthonous C1a1.
And J appears as Epigravettian in Georgia.
Modern humans are present in Georgia since 42 ka.
It looks like this is the place where IJ split and I wouldn't be surprised if IJ itself came from further east.

DMXX
02-08-2017, 08:14 AM
Holding these plausible scenarios aside for a moment - I should state that several of these purported Y-DNA I lines in Central Asia and Iran did, in fact, come from studies where backbone SNP testing was undertaken and included the more basal Y-DNA I subclades. I've just re-reviewed Di Cristofaro et al. 2013, Haber et al. 2012 and Haber et al. 2011. All yielded at least 19 Y-STRs alongside backbone SNP testing. There are 16 confirmed Y-DNA I's in Iran and Afghanistan between these three papers (Nordtvedt received a couple of these courtesy of me). Therefore, the focus on haplotype predictions is somewhat beside the point. We know these are Y-DNA I lines (with the exception of the likely typing error you picked up).

Addressing the subclades and reconciling the samples with your information, from the above papers (most downstream confirmed SNP shown):
8x I2b1-M223 (4 Hazaras, 1 Tajik, 3 Iranians from Isfahan, Khorasan, South Iran - two of the four Hazaras from Bamiyan and share the same haplotype)
2x I2-M438 (2x Tajiks from Balkh, share the same haplotype)
1x I2a1b-M423 (1x Tajik)
2x I2a2-M436 (1x Mongolia, 1x South Iran)
3x I-M170 (2x West Iran, 1x East Iran - This paper didn't include anything further downstream SNP-wise)

As expected, mostly I2b1. I suspect the other confirmed I2's are due to the various scenarios you envisaged (Tatars, Slavic admixture, bidirectional movements due to the various Turkic or Mongol empires).
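A quick sanity check of the subclade tally in this post, as a minimal Python sketch. The counts are copied from the list; the dictionary and function names are made up for illustration.

```python
# Counts as listed in this post (most downstream confirmed SNP per sample).
CONFIRMED_I = {
    "I2b1-M223": 8,
    "I2-M438": 2,
    "I2a1b-M423": 1,
    "I2a2-M436": 2,
    "I-M170": 3,
}


def tally(counts):
    """Return the total sample count and each subclade's share of it."""
    total = sum(counts.values())
    shares = {clade: n / total for clade, n in counts.items()}
    return total, shares


total, shares = tally(CONFIRMED_I)
print(f"total confirmed Y-DNA I: {total}")
for clade, share in shares.items():
    print(f"{clade}: {share:.0%}")
```

Keep in mind that the three I-M170 samples were not typed further downstream, so the M223 share computed here is a floor rather than an exact figure.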

eastara
02-08-2017, 08:25 AM
Can you guys check whether any prediction in the spreadsheet below doesn't seem right, and whether you have a better one? (FYI: the rows with an empty cell under the Haplogroup column are predictions and/or STR comparisons from FTDNA projects, not SNP-confirmed.)

https://docs.google.com/spreadsheets/d/1wI9uarzrf4i9yrI1j1HKcuOqJbQY2XWfAh9JSr2klg8/edit?usp=sharing

Lately I am using the nevgen.org predictor. It was developed recently by a Serbian guy and includes all Balkan and East European haplotypes. Previous predictors like Athey's are based almost exclusively on West European samples and do not work correctly with more Eastern haplotypes.

Regarding the Kurdish samples, I find the last two ambiguous. The I2b2, in my view, is most likely what we previously called J2a* (probably J2a PF5050 now). It would be normal for the last to be I2c2, but I2c1!? It is found in Western Europe only, with some offshoots in Eastern Europe.
The Nevgen predictor also thinks the first is more likely to be O!? Haplogroup O is found in Azerbaijan, so it is not improbable for a Kurd with it.

MfA
02-08-2017, 08:45 AM
Lately I am using the nevgen.org predictor. It was developed recently by a Serbian guy and includes all Balkan and East European haplotypes. Previous predictors like Athey's are based almost exclusively on West European samples and do not work correctly with more Eastern haplotypes.

Regarding the Kurdish samples, I find the last two ambiguous. The I2b2, in my view, is most likely what we previously called J2a* (probably J2a PF5050 now). It would be normal for the last to be I2c2, but I2c1!? It is found in Western Europe only, with some offshoots in Eastern Europe.
The Nevgen predictor also thinks the first is more likely to be O!? Haplogroup O is found in Azerbaijan, so it is not improbable for a Kurd with it.

What're their IDs in the spreadsheet?

I don't see any O at all. As I said, the ones with empty cells are predictions based on Nevgen and the FTDNA database; the others are SNP-confirmed I2 cases. D24 is I2, no question at all. Its poor fit is purely down to the database the Nevgen predictor uses, which is probably mainly European, and since D24 is SNP-confirmed, we can say its STR profile is unique and likely pretty diverged from the rest, based on low-res STRs.

eastara
02-08-2017, 11:57 AM
Not sure how Nevgen works, but let's take Kurd3, who is predicted as I2b2 L38.
If you enter only the first 12 (in fact 10) markers, it comes back with J2a1 Z1846> Y8378 96.53%.
Only if you enter the last tested markers, 437=15 and 438=10, does it come back with I2a2b-L38 83.46%. Somehow it thinks this marker combination changes the prediction radically. However, I can see in the J2 project that quite a few from J2a-PF5050 have these values as well, which are otherwise unusual for J2a.
Regarding haplogroup O, see for yourself by entering the STRs of D24.
O3a2 85.39%

DMXX
02-08-2017, 01:38 PM
I assume D24 is the same as the I2a2a-M223 sample from Dogukoy which Palisto identified as being Kurdish (http://kurdishdna.blogspot.ch/2012/12/kurdish-y-dna-part-viii.html), MfA?

Here is the paper. (http://onlinelibrary.wiley.com/doi/10.1111/j.1548-1433.2010.01310.x/full) They did not employ backbone Y-SNP testing, as most of the Central Asian and Iranian Y-DNA I's I shared above had. Gokcumen et al. 2011 only used 17 STR's for the paternal side and HVR1 for the maternal.

To make matters worse:



To dissect the temporal origins of the genotypes observed in Yuksekyer, we allocated Y-STR haplotypes to SNP haplogroups using Haplogroup Predictor Online Software (Athey 2004; see Figure 6).


Athey's calculator is one of the most outdated ones they could've used; several better alternatives have existed since 2010. Here's D24 in the supplementary data:



Sample: D24
Hg: I1b1
Prob.: 95.4

Marker      Value
DYS 19      15
DYS 385a    15
DYS 385b    17
DYS 389I    12
DYS 389II   31
DYS 390     24
DYS 391     12
DYS 392     13
DYS 393     12
DYS 437     15
DYS 438     10
DYS 439     14
DYS 448     20
DYS 456     15
DYS 458     16
DYS 635     25
H4-GATA     21


I recall running this data through a more up-to-date predictor four years ago and found some of the other assignments were incorrect.

On the basis of the above, D24 is not guaranteed to belong to Y-DNA I, and any Y-STR predictor superior to Athey's should be considered more authoritative.

[ADMIN] Changing title to reflect broadening of discussion.

lgmayka
02-08-2017, 02:26 PM
One comment they make on DYS455=8 is that it is also found in a branch of J
My own project has examples of DYS455=8 in R1a and R1b.