
View Full Version : Some provisional calculations for haplogroup R1a based on the first FGC result



Michał
10-25-2013, 06:29 PM
Here are some very provisional calculations based on the recently received full Y-chromosome sequencing results for the first member of our R1a1a and Subclades project tested at FGC (kit 208920, clade L1029). Of course, this will be refined once we receive his raw data (BAM file) and once more R1a members are tested.

[Attachment 833: table of provisional TMRCA estimates]

parasar
10-25-2013, 06:36 PM
Michał ,

Thanks!
So it is within the realm of possibility that that Siberian Baikal 24000ybp Y could well be R1. Now we need an R2 to test to capture that string length.

parasar
10-26-2013, 03:34 AM
CTS4385 and Z283 must have branched very close in time ~ just a few hundred years!

Michał
10-26-2013, 08:10 AM
Our R1a1a and Subclades project is looking for more R1a members who are ready to spend their money on sequencing their entire chromosome Y.
Here is a list of all FGC testees from the R1a haplogroup who are known to us:

Branch L664
???

Branch Z93
221090 (Y57)
241703 (CTS6)
251286 (CTS6)

Branch Z284
???

Branch M458
208920 (L1029) - already completed

Branch Z280
46500 (CTS1211*) Old Carpathian - just completed!
199037 (CTS3402* or CTS8816?) Volga-Carpathian
211024 (CTS3402* or CTS8816?) Volga-Carpathian
152839 (?) (CTS1211/CTS3402?) Southern Baltic type, unclustered
E4464 (L1280)
B1578 (L1280)
B1171 (L1280)


It's a pity that we don't have any L664 or Z284 members in the above group. Also, it would be good to have at least one result for large clades such as L658, L260 and Z92 (and of course for both L448 and Z287). Additionally, I would love to see some CTS4385*, Z93*, Z94*, CTS2124*, Z283*, Z282*, Z280*, Z284* and M458* members being sequenced.

Please let us know about any R1a people who are missing from the above list.

Michał
10-26-2013, 08:49 AM
CTS4385 and Z283 must have branched very close in time ~ just a few hundred years!
This is indeed quite interesting. The very close distance between the founders of the Z93, M458, Z280 and Z284 branches is also very intriguing. Actually, the founders of the M458, Z280 and Z284 branches (and all major lineages from the paragroup Z282*) could have been just members of one large family that was suddenly expanding in different directions.

Another thing that had already been suspected based on the STR results (but is now clearly confirmed by the SNP data) is the very large distance (about 2000 years) between the initial separation of the two major subclades of M458 (L260 and CTS11962/L1029) and their subsequent (much more recent) massive explosion. This characteristic structure of branch M458 somewhat resembles the structure of its Scandinavian sister branch Z284 (where the two major sub-branches L448 and Z287 are both relatively young, and everything indicates that this is related to the very recent expansion of the Norse Vikings). All this, however, is in sharp contrast to the predicted structure of the Z280 branch, which expanded significantly shortly after its birth (producing multiple sublineages that have managed to survive until today in very different regions of Europe, ranging from the Urals to Ireland and from Finland to the Balkans), but none of its subclades was able to undergo such a massive expansion in more recent times (though Z92 and the hypothetical clade CTS8816 also expanded significantly at about the same time as CTS11962/L1029 and L260). This suggests that the initial localization and the subsequent fate were quite different in the case of M458 and Z280, although it also seems evident that at some point these two branches (or at least their major portions) became a part of the same phenomenon, likely associated with the expansion of the Early Slavs.

parasar
10-26-2013, 03:28 PM
This is indeed quite interesting. The very close distance between the founders of the Z93, M458, Z280 and Z284 branches is also very intriguing. Actually, the founders of the M458, Z280 and Z284 branches (and all major lineages from the paragroup Z282*) could have been just members of one large family that was suddenly expanding in different directions.

Another thing that has been already suspected based on the STR results (but is now clearly confirmed by the SNP data) is the very large distance (about 2000 years) between the initial separation of the two major subclades of M458 (L260 and CTS11962/L1029) and their subsequent (much more recent) massive explosion ...predicted structure of the Z280 branch that has significantly expanded shortly after its birth...

Something similar looks possible for L657 too - a major massive expansion like M458, while Z2124 looks more like Z280.

I will not be surprised if Z93 was also a member of that large family!

alan
10-26-2013, 05:46 PM
I would think the opposite. The calculations seem very unlikely to suggest that IMO. Archaeologically it is hard to see the big branching explosion around the Z282 and Z93/94 expansion phases as linked to anything other than the Yamnaya and related phases.

Archaeological calibration might be speculative but worth a go -

Z645, the shared and not too distant ancestor of both the European and Asian R1a lineages, is only a couple of SNPs older than Z282, which is the ancestor of most European R1a. If Z645 is the shared ancestor of most of the big R1a expansion, then it's almost impossible not to place that ancestor just before Yamnaya and Afanasievo divided -- no later than 3500BC and probably no earlier than 4000BC.

I would expect Z282 to date to between 3300 and 3000BC if it's the common ancestor of Yamnaya lines that headed west in the Yamnaya expansion and were partly incorporated into Corded Ware from 3000BC.

M458 sounds most likely to be linked to Corded Ware to me, perhaps including links to Fatyanovo and Abashevo, which had Corded Ware links. From memory, that would be expected to date to maybe 2800BC or so.

For M417xZ645, with its odd unexpected distribution, it is hard to find an archaeological link, and its branching and convergence dates are so far apart that it's hard to say. The impression I get is that not much happened between M417 and Z645, so I would date M417 no later than 4000BC.

So, when I look at the table, the dates in the bolded rightmost column fit almost freakily well with the archaeological expectations for R1a lineages. So, I think that column makes by far the most archaeological sense and must be on the money.

The upshot of the column being correct is that R1 is about 2500 years younger than the R boy at Mal'ta. When, for example, the far left column is looked at, there is a very poor correspondence with archaeological expectations.

So Michal, congratulations on that - you might need to tweak the R1a tree (which I desperately need when reading work on R1a) unless it already has been - the link is a few months old now.

http://www.familytreedna.com/public/R1a/default.aspx?section=results

The fit is just so incredibly good I would say you had rigged it if you were not the honest man we know you are :-)





Jean M
10-26-2013, 06:49 PM
This is indeed quite interesting. The very close distance between the founders of the Z93, M458, Z280 and Z284 branches is also very intriguing. Actually, the founders of the M458, Z280 and Z284 branches (and all major lineages from the paragroup Z282*) could have been just members of one large family that was suddenly expanding in different directions.

I am taking notes!


Another thing that has been already suspected based on the STR results (but is now clearly confirmed by the SNP data) is the very large distance (about 2000 years) between the initial separation of the two major subclades of M458 (L260 and CTS11962/L1029) and their subsequent (much more recent) massive explosion.

A perfect fit for the early Slavs, as you say. For new readers coming in - the distribution of M458 is pretty solidly Slavic. So we can picture M458 enjoying a long quiet period in successive cultures of the Middle Dnieper before the Post-Roman explosion.


All this, however, is in a huge contrast to the predicted structure of the Z280 branch that has significantly expanded shortly after its birth

Its present distribution is reminiscent of the Battle Axe Cultures, is it not? Linguistically that would partly fit with waves of IE spreading up river from the Middle Dnieper into Fatyanovo (3200 BC-2300 BC) and then up to the Baltic, gradually spreading the Baltic family, but thinly.

alan
10-26-2013, 07:26 PM
If you look at the central column, which dates R1 to 17325 years, it makes all the SNPs much too young to make sense. That is interesting and suggests Karafet's fairly similar rough estimate centrepoint of 18500 years for R1 is too young. The bolded right-hand column fits archaeological expectations so well that it is hard not to conclude that your estimate of 21560 years of age, or c. 19500BC, is an improvement on Karafet's estimate. It still all implies that the R boy at Mal'ta must be R* (although I have raised the outside and much less likely possibility that he could be R2, in theory at least).

Anyway, assuming this new date for R1 is correct, it is interesting in that it pushes it back some 2000 years. That would actually mean that R1 emerged within the later part of the LGM period rather than just after its end. I think that could have some implications for the history of R1. There is a lot of debate, but I think the strongest cards are held by those who think Siberia was mostly abandoned shortly after the R Mal'ta boy was buried. That would confirm the suspicion that R* had headed off shortly after the Mal'ta boy was buried c. 24000 years ago and that R1 seems to have been born c. 2500 years later, after the evacuation of most of Siberia and in a place no further north than the southernmost fringes of Siberia or south/south-westwards of there.

That also raises the question of R2. It allegedly arose before R1 and therefore would have existed in parallel with R* lineages for the first 3000 years or so of its existence. Perhaps R2 arose among the first of the R* lineages to move south as conditions deteriorated c. 25-26000 years ago or so, while some other groups of R* remained in the harsh conditions of Siberia until around the time of the boy's burial. By then R2 might have sewn up the more pleasant climates of the LGM in the Indian subcontinent, leaving the rest of R* to wander further west.

I don't know the details, but it is possible that not leaving Siberia until the LGM had a grip may have made a journey into the Indian subcontinent through the mountains not an option for the late R* stragglers who didn't move as fast as the R2 lines. That might explain why the R* line ancestor soon to lead to R1* appears to have moved west rather than more directly south.

If that is correct, then the ancestors of R1 arrived as late R* lines coming from Siberia and heading west. As far as I understand, what is now Central Asia was a desert from the Caspian to China, so that now seems an unlikely route to me.

I cannot see an LGM escape from Siberia south on these maps:

http://www.esd.ornl.gov/projects/qen/euras18k.gif

http://upload.wikimedia.org/wikipedia/commons/e/e4/Last_glacial_vegetation_map.png

http://www.esd.ornl.gov/projects/qen/euras(2.gif

http://www.esd.ornl.gov/projects/qen/NEW_MAPS/eurasia1.gif

This actually might shift my opinion on the route of escape the R* ancestors of R1 took. A route south after 22000BC/24000 years ago would not seem rational. To head south would seem to head towards temperate and polar deserts, glaciers and polar alpine area of mountains. It would be far easier to simply move directly west along the steppe tundra belt that was their more familiar environment. That belt would lead them west and stretch the length of Europe and western Asia at a certain latitude.

However, just because in theory they could move that length doesn't mean they did - they don't seem to have, judging by modern clades.

There is also growing evidence that even the hardened hunters on the steppe-tundra were squeezed towards its southern edges and less harsh spots at the height of the LGM. So, a route following the southernmost edge of the steppe-tundra belt would seem the most likely to me. That would lead more to the NE corner of the Caspian than to the south, as I had previously thought.

Michał
10-27-2013, 12:41 PM
@Alan

Please keep in mind that all those TMRCA estimates provided in my first post were based on just one FGC result for a single L1029 member, so we should avoid jumping to any definite conclusions regarding the age of R1 or any of its subclades. It seems very likely that after testing more R1a (and R1b!) members, the (average) number of mutations downstream of R1 will change significantly. Also, assuming that the age of R1 is roughly in the range of 18-25 ky, it will be practically impossible to provide a TMRCA estimate for such an old haplogroup with a margin of error of less than 3 or 4 ky.

Additionally, the huge problems with choosing an appropriate calibration method make any attempt to provide an absolute age for any clade very uncertain. Therefore, all we can quite safely infer from those data is the relative age of particular clades and subclades.
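
To illustrate why relative ages are safer than absolute ones, here is a minimal sketch assuming the simplest SNP-counting model (age = SNP count times a fixed number of years per SNP). The SNP counts and the three calibrations below are hypothetical, chosen only for illustration; nothing here is taken from the FGC data.

```python
def tmrca_years(snp_count, years_per_snp):
    """Simplest SNP-counting model: age scales linearly with the SNP count."""
    return snp_count * years_per_snp


def relative_age(younger_snps, older_snps, years_per_snp):
    """Age of the younger clade as a fraction of the older clade's age."""
    return tmrca_years(younger_snps, years_per_snp) / tmrca_years(older_snps, years_per_snp)


# Hypothetical SNP counts for two nested clades (not real data).
OLDER_SNPS, YOUNGER_SNPS = 100, 25

# Hypothetical calibrations in years per SNP; the "right" one is unknown.
for rate in (50.0, 100.0, 150.0):
    older = tmrca_years(OLDER_SNPS, rate)
    younger = tmrca_years(YOUNGER_SNPS, rate)
    ratio = relative_age(YOUNGER_SNPS, OLDER_SNPS, rate)
    print(f"{rate:5.0f} y/SNP: older={older:7.0f} y, younger={younger:7.0f} y, ratio={ratio:.2f}")
```

Under this model the absolute ages triple between the lowest and highest calibration, while the ratio between the two clades stays the same, because the calibration cancels out.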

As for the R1/R2 issue in the context of the recent Siberian finding, it seems obvious to me that these two ancestral lineages were born (founded) at the same moment, as they are obviously derived from just two brothers descending from the same father. Thus, even if one of those lineages is likely to have expanded earlier than the other, it is still possible to find an R1 member once we know that the R2 lineage already existed at a given moment. Of course, we don’t know the sequence in which the particular mutations defining the R1 and R2 clades arose, which makes this situation a bit complicated, as we are currently unable to exclude the possibility that a member of haplogroup R who is negative for M420 is in fact a very early member of clade R1 (let’s say positive for L62). The same applies to a hypothetical ancient member of haplogroup R who is negative for M479, as he can easily represent an early (extinct) sublineage of R2 or even a major sublineage of R2 (just before the M479 mutation first arose). Only by identifying a mutation specific for R2 or by excluding any single mutation that arose before R1 and R2 were separated will we be able to exclude that this Siberian male represented an R1 lineage.




So Michal, congratulations on that - you might need to tweak the R1a tree (which I desperately need when reading work on R1a) unless it already has been - the link is a few months old now.
http://www.familytreedna.com/public/R1a/default.aspx?section=results
The fit is just so incredibly good I would say you had rigged it if you were not the honest man we know you are :-)
:) I should again remind all of you that all credits for creating this beautiful tree should go to our main admin Łukasz Łapiński.

Michał
10-27-2013, 12:43 PM
Its present distribution is reminiscent of the Battle Axe Cultures, is it not?
It will depend on your definition of the Battle-Axe cultures, but I would rather move this discussion to a new thread. :)

alan
10-27-2013, 04:25 PM
Oh yes, I realise the limitations and the confidence interval etc. I just hope the caveats are understood on this site without repeating them every time. However, I did find it an uncanny match for archaeology, which suggests to me it's nearly on the money, and even if it's out by several centuries that wouldn't change that general observation. It's good to see SNP-based calculations reasonably close to STR ones too. So there are a couple of reasons to be optimistic about this.



palamede
10-27-2013, 05:24 PM
I'd like to know what reference data were used to determine the location in the phylogenetic tree of each mutation of this first sample.

For example, for the 375 identified mutations (307 high qual + 57 low qual) of the sample downstream of R1, the distribution could be done as
- 277 between R1 and M417
- 9 between M417 and Z645
- 3 between Z645 and Z283/Z282
and so on
...........


I suppose the mutations of this first sample were compared to other samples in the genetic databases, with the location of each mutation already determined.

Michał
10-27-2013, 07:49 PM
I suppose the mutations of this first sample were compared to other samples in the genetic databases, with the location of each mutation already determined.
EDIT: Comparing different FGC-tested samples will become the best method in the future, but right now it is possible to reach a similar effect by comparing the FGC data with the previously sequenced 1KG and PGP samples only.

For example, by comparing the FGC-tested sample with some samples representing different sub-branches of M417 (for example L664 and Z645) and some R1b samples, we should identify all SNPs between the R1 and L664/Z645 level (as they should be positive in the L664 and Z645 members while being negative in all R1b samples). To identify the mutations at the Z645 level, we should look for those mutations that are present in both Z93 and Z283 but absent in L664. Mutations at the Z283/Z282 level should be present in M458, Z284 and Z280, but absent in branches Z93 and L664.
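
The comparison logic described above can be sketched with plain set operations. This is only an illustration under stated assumptions: the sample labels follow the clades named in the post, but the SNP names (`r1_a`, `m417_a`, and so on) are made-up placeholders, and real call sets would also need no-call and quality handling that is ignored here.

```python
def snps_at_level(calls, present_in, absent_in):
    """SNPs derived in every sample of `present_in` and in none of `absent_in`."""
    shared = set.intersection(*(calls[s] for s in present_in))
    excluded = set().union(*(calls[s] for s in absent_in))
    return shared - excluded


# Hypothetical derived-SNP sets, one sample per clade (placeholder SNP names).
calls = {
    "R1b":  {"r1_a"},
    "L664": {"r1_a", "m417_a", "l664_a"},
    "Z93":  {"r1_a", "m417_a", "z645_a", "z93_a"},
    "M458": {"r1_a", "m417_a", "z645_a", "z283_a", "m458_a"},
    "Z284": {"r1_a", "m417_a", "z645_a", "z283_a", "z284_a"},
    "Z280": {"r1_a", "m417_a", "z645_a", "z283_a", "z280_a"},
}

z283_branches = ["M458", "Z284", "Z280"]

# Between R1 and the L664/Z645 split: shared by all M417 samples, absent in R1b.
below_r1 = snps_at_level(calls, ["L664", "Z93"] + z283_branches, ["R1b"])

# Z645 level: present in both Z93 and Z283, absent in L664.
z645_level = snps_at_level(calls, ["Z93"] + z283_branches, ["L664"])

# Z283/Z282 level: present in M458, Z284 and Z280, absent in Z93 and L664.
z283_level = snps_at_level(calls, z283_branches, ["Z93", "L664"])

print(below_r1, z645_level, z283_level)
```

Each level is simply "shared by everything below the node, seen in nothing outside it", so adding more sequenced samples per branch only tightens the intersections.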

Baltimore1937
10-28-2013, 08:26 AM
Our R1a1a and Subclades project is looking for more R1a members who are ready to spend their money on sequencing their entire chromosome Y.

No money & credit cards are maxed out. Otherwise I would love to participate.

Michał
10-28-2013, 03:09 PM
No money & credit cards are maxed out. Otherwise I would love to participate.
You may find it interesting that I have been informed about two members of CTS4385 (including CTS4385* and L664, cluster C1) who have recently ordered this test.

parasar
10-28-2013, 03:32 PM
EDIT: Comparing different FGC-tested samples will become the best method in the future, but right now it is possible to reach a similar effect by comparing the FGC data with the previously sequenced 1KG and PGP samples only.

For example, by comparing the FGC-tested sample with some samples representing different sub-branches of M417 (for example L664 and Z645) and some R1b samples, we should identify all SNPs between the R1 and L664/Z645 level (as they should be positive in the L664 and Z645 members while being negative in all R1b samples). To identify the mutations at the Z645 level, we should look for those mutations that are present in both Z93 and Z283 but absent in L664. Mutations at the Z283/Z282 level should be present in M458, Z284 and Z280, but absent in branches Z93 and L664.

And, I would add full genome comparisons with ancient DNA. Let's hope Eske Willerslev and team come up with really good data from Mal'ta and Afontova Gora to compare with.

T101
10-28-2013, 11:40 PM
Archaeologically it is hard to see the big branching explosion around the Z282 and Z93/94 expansion phases as linked to anything other than the Yamnaya and related phases.

Archaeological calibration might be speculative but worth a go -


Z282 originates as early as the Yamna horizon, and then blossoms within the Corded Ware horizon (Battle Axe, Single Grave), where it gives birth to Z280 and its younger siblings M458 and Z284.

Z280 expands in the Middle Dnieper Culture as part of the Corded Ware Horizon, with Z92 an early breakaway as part of Fatyanovo and CTS1211 succeeding the Globular Amphora culture in present day Poland and Northeastern Germany.

CTS3402 along with the other sons of CTS1211 expand in the Lusatian culture and its offspring are later dispersed from Spain to the Urals by the Germanic people, Huns, Avars, Slavs and others.

Whereas M458 with the Slavs, and Z284 with the Norse Vikings, experience expansive growth post Roman era.

Or something like that in a nutshell! lol

bolek
10-29-2013, 08:03 AM
Z282 originates as early as the Yamna horizon, and then blossoms within the Corded Ware horizon (Battle Axe, Single Grave), where it gives birth to Z280 and its younger siblings M458 and Z284.

Question:

Can Z282 be anything but proto-Slavic or proto-Balto-Slavic?

Could you put Y-DNA lineages on language tree differently than it is done here:

http://www.anthrogenica.com/showthread.php?1519-Languages-and-Y-DNA-lineages&p=17647#post17647

Michał
10-29-2013, 12:19 PM
Question:
Can Z282 be anything but proto-Slavic or proto-Balto-Slavic?

I think Z282 is too old to be called “Proto-Balto-Slavic”, but something like “Pre-Balto-Slavic” would indeed make sense in my opinion. I will write more on this subject in your thread about a possible relationship between different Y-DNA haplogroups and the classification of the IE languages.

Michał
10-30-2013, 01:31 PM
As recently announced by Semargl on the Molgen forum, the YFull team has created a special R1a project (http://www.yfull.com/groups/r1a/) at the YFull site, where all fully sequenced R1a members will be able to upload their raw data and compare their results with the results of other R1a members (including the R1a samples from the 1KG project that have already been included there). These data will be publicly available, so everyone will have a chance to search for some specific mutations or some selected positions on chromosome Y.

alan
10-30-2013, 02:54 PM
Michal, I really value your opinions as you seem very well informed. Does the probable R* boy at Mal'ta in Siberia surprise you?



Michał
10-30-2013, 04:06 PM
Does the probable R* boy at Mal'ta in Siberia surprise you?
You put me in an awkward situation, Alan, as I don’t want to sound like a “Besserwisser” in this particular case. Nevertheless, I must admit that it didn’t surprise me at all, mostly because I have always believed that not only haplogroup R but also R1, R2 and even R1a and R1b were all born somewhere in Central Asia, or at least not far from that region. Another reason why this wasn’t a huge surprise for me was that my previous SNP-based TMRCA estimates suggested that R1 is about 25-27 ky old, while haplogroup R should be at least 5000 years older (i.e. about 31-33 ky old):
http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting&p=15936#post15936

If these estimates are more or less correct, it would mean that it could be possible to find some ancient R1-M173* cases dated not only to about 24-25 ky ago but also to any period from the range of 25-34 kya. And since those calculations have also suggested that haplogroups R and Q were separated about 38-40 kya, this would even mean that we could find some ancient R* cases dated to any period between 40 and 30 kya (or to some more recent times, if a potential existence of some extinct sister clades of R1 and R2 is taken into account).

alan
10-30-2013, 09:06 PM
I wasn't massively surprised either, as it broadly fits a lot of people's predictions about R. I maybe didn't expect it quite as far east, but I suppose we have no idea if the Mal'ta boy was part of a Siberian-specific group or the eastern end of a wider spread.

So, in terms of Central Asia, it's a very broad term and there are aspects like the LGM. Do you see R1 as being driven south into south Central Asia or west along the steppe tundra into eastern Europe?



parasar
10-30-2013, 10:51 PM
You put me in an awkward situation, Alan, as I don’t want to sound like a “Besserwisser” in this particular case. Nevertheless, I must admit that it didn’t surprise me at all, mostly because I have always believed that not only haplogroup R, but also, R1, R2 and even R1a and R1b, were all born somewhere in Central Asia, or at least not far from that region. Another reason why this wasn’t any huge surprise for me was because my previous SNP-based TMRCA estimates have suggested that R1 is about 25-27 ky old, while haplogroup R should be at least 5000 years older (i.e. about 31-33 ky old):
http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting&p=15936#post15936

If these estimates are more or less correct, it would mean that it could be possible to find some ancient R1-M173* cases dated not only to about 24-25 ky ago but also to any period from the range of 25-34 kya. And since those calculations have also suggested that haplogroups R and Q were separated about 38-40 kya, this would even mean that we could find some ancient R* cases dated to any period between 40 and 30 kya (or to some more recent times, if a potential existence of some extinct sister clades of R1 and R2 is taken into account).
So M173 is possible?
How about M343 or M417?

Michał
10-31-2013, 08:57 AM
So, in terms of central Asia, its a very broad term and there are aspects like the LGM. Do you see R1 as being driven south into south central Asia or west along the steppe tundra into eastern Europe.
I don’t think we have enough data to point to a specific region in Central Asia as the most likely place of origin for haplogroup R or R1. However, it seems to me that once the R1a and R1b lineages had separated, the R1a* group went immediately west towards Eastern Europe. Since the R1b1* group (or R1b-L278) is likely to have resided in the western part of Central Asia (not far east of the Caspian Sea), either R1b was born there or, alternatively, its presence there was the result of a slightly earlier south-westward migration of R1b* (at least this is the most likely scenario IMO).

Michał
10-31-2013, 08:59 AM
So M173 is possible?
How about M343 or M417?
M343, a mutation defining the R1b level, and M420, as one of the mutations defining the R1a level, cannot be ruled out in my opinion, although I would consider the chances of finding R1a* or R1b* to be significantly smaller than in the case of R1*, R2* or R*.

As for M417, this seems to be absolutely impossible, so if it’s found there, it will indicate an apparent contamination.

alan
11-01-2013, 01:22 AM
The main help in all this is the climate. Much of what is now desert in Central Asia was also desert during the LGM, and much of the upland to the south was also very inhospitable. I think it may have been more natural and feasible to follow the same sort of land they had been used to in Siberia by following the steppe-tundra belt west, which stretched from western Siberia to western Europe. The LGM just seems too late to move south due to the desert. It could however simply be that, as the archaeology suggests, the R boy at Mal'ta was a late straggler whose family stayed behind in the teeth of the LGM when others of the same lineage headed west and south earlier, just before the desertification of Central Asia.



parasar
11-25-2013, 02:25 AM
Something similar looks possible for L657 too - a major massive expansion like M458, while Z2124 looks more like Z280.

I will not be surprised if Z93 was also a member of that large family!

So with a few hundred years here and there, it can be said that:
1. Age of Z93 = Age of Z283 = Age of CTS4385 - about 6500 years
2. Age of L657 = Age of M458 - about 5500 years

1 does give us a split, but not a clear sharp split between India and Europe.
2 gives us a mutually exclusive split between India and Europe.

Therefore, if R1a correlates with IE, then age of IE is at a minimum about 5500 years, though an age of >6500 years is more likely.

lgmayka
11-25-2013, 10:52 AM
So with a few hundred years here and there, it can be said that:
1. Age of Z93 = Age of Z283 = Age of CTS4385 - about 6500 years
2. Age of L657 = Age of M458 - about 5500 years
That depends on the set of figures you choose. At the moment, I prefer the rightmost column of the provisional SNP-based TMRCA calculations that Michał posted recently. (They best match the TMRCA calculations from STRs.)

R1 : 21560
R1a-M417 : 6048
R1a-Z645 : 5544
R1a-Z282 : 5376
R1a-M458 : 4704
R1a-L1029 : 2744

TMRCAs will continue to improve relative to each other as we get additional full-Y examples and average their SNP counts; but the scaling of the entire TMRCA set still requires correlation with an archaeologically datable event, and/or a reliable stable SNP mutation rate.
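
The difference between relative and absolute ages can be illustrated with a minimal sketch. It uses the figures listed above and rescales the whole set to a single calibration point while keeping every ratio unchanged. The anchor of 27,600 years for R1 is only an assumed calibration value (it is the separate R1 estimate cited elsewhere in this thread), and the function name `rescale` is hypothetical.

```python
# Provisional SNP-based TMRCAs (years) from the list above.
tmrca = {
    "R1": 21560,
    "R1a-M417": 6048,
    "R1a-Z645": 5544,
    "R1a-Z282": 5376,
    "R1a-M458": 4704,
    "R1a-L1029": 2744,
}


def rescale(ages, anchor, anchor_age):
    """Rescale every age so that `anchor` equals `anchor_age`, keeping all ratios."""
    factor = anchor_age / ages[anchor]
    return {clade: age * factor for clade, age in ages.items()}


# ASSUMPTION: 27,600 years for R1 is only an illustrative calibration point.
rescaled = rescale(tmrca, "R1", 27600)

for clade, age in rescaled.items():
    print(f"{clade:10s} {tmrca[clade]:6d} -> {age:8.0f}")
```

Any other datable anchor would shift all the ages by a common factor in the same way, which is why additional full-Y results improve the proportions but not the overall scale.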

parasar
11-25-2013, 08:18 PM
That depends on the set of figures you choose. At the moment, I prefer the rightmost column of the provisional SNP-based TMRCA calculations that Michał posted recently. (They best match the TMRCA calculations from STRs.)

R1 : 21560
R1a-M417 : 6048
R1a-Z645 : 5544
R1a-Z282 : 5376
R1a-M458 : 4704
R1a-L1029 : 2744

TMRCAs will continue to improve relative to each other as we get additional full-Y examples and average their SNP counts; but the scaling of the entire TMRCA set still requires correlation with an archaeologically datable event, and/or a reliable stable SNP mutation rate.

That is true. I was going by the Mal'ta boy's age of 24000 years. He was a pre-R derivative - 5 mutations ancestral and with 35 private mutations (1.5X on 5.8 million bases).
I have made an assumption that his age is the approximate age of R1 for ballpark dates.

Michał
11-26-2013, 12:01 AM
That is true. I was going by the Mal'ta boy's age of 24000 years. He was a pre-R derivative - 5 mutations ancestral and with 35 private mutations (1.5X on 5.8 million bases).

Actually, these new data provide some important information regarding the age of R1. If the Mal'ta boy was indeed positive for as many as 35 private mutations, this suggests that the separation between R1a and R1b took place significantly earlier than 24 kya, most likely about 27-28 kya. Frankly speaking, I was extremely surprised that these data fit nearly exactly my previous calculations regarding the age of R and R1 (though I am perfectly aware that this is mostly coincidental):
http://www.anthrogenica.com/showthre...5936#post15936

When taking the distance between the TMRCAs for P and R1 from the above link (40.2-27.6=12.6) and dividing this by the number of SNPs separating these two levels in the Mal'ta study (42), we get 300 years per each mutation. Most importantly, the Mal'ta boy shows 54 mutations downstream of hg P, which places him 16.2 ky (54 x 0.3) downstream of 40.2 kya, which is 24 kya, exactly as suggested by the calibrated radiocarbon dating. Here is a tree based on the above calculations:
[Attachment 971]

Of course, all of the above would not be true if a substantial number of the private mutations found in the Mal'ta boy were false positives (which cannot be easily verified).
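
The arithmetic above can be checked with a few lines of code. This is only a sketch of the calculation as described in this post, with the variable names chosen for illustration; all input values are the ones quoted above.

```python
# Values quoted in the post above (ages in thousands of years, ky).
age_p = 40.2          # TMRCA of haplogroup P
age_r1 = 27.6         # TMRCA of haplogroup R1
snps_p_to_r1 = 42     # SNPs separating P and R1 in the Mal'ta study
snps_p_to_ma1 = 54    # mutations the Mal'ta boy (MA-1) shows downstream of P

# Calibrate the clock on the P -> R1 segment.
ky_per_snp = (age_p - age_r1) / snps_p_to_r1

# Place MA-1 by counting his mutations downstream of P.
ma1_age = age_p - snps_p_to_ma1 * ky_per_snp

print(f"{ky_per_snp * 1000:.0f} years per mutation")
print(f"MA-1 placed at {ma1_age:.1f} kya")
```

The result depends entirely on the two TMRCAs used as input, so it shows internal consistency rather than an independent confirmation.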

lgmayka
11-26-2013, 12:42 AM
If the Mal'ta boy was indeed positive for as many as 35 private mutations, this suggests that the separation between R1a and R1b took place significantly earlier than 24 kya, most likely about 27-28 kya.
You are apparently assuming that the very specific, unusual sequencing of the Mal'ta boy's ancient DNA is directly comparable to modern high-quality DNA scans (in terms of the number of genuine mutations vs. false positives vs. false negatives). That is not a reasonable assumption.


Frankly speaking, I was extremely surprised that these data fit nearly exactly my previous calculations regarding the age of R and R1 (though I am perfectly aware that this is mostly coincidental):
http://www.anthrogenica.com/showthre...5936#post15936
That link does not work.

parasar
11-26-2013, 01:50 AM
Actually, these new data provide some important information regarding the age of R1. If the Mal'ta boy was indeed positive for as many as 35 private mutations, this suggests that the separation between R1a and R1b took place significantly earlier than 24 kya, most likely about 27-28 kya. Frankly speaking, I was extremely surprised that these data fit nearly exactly my previous calculations regarding the age of R and R1 (though I am perfectly aware that this is mostly coincidental):
http://www.anthrogenica.com/showthre...5936#post15936

When taking the distance between the TMRCAs for P and R1 from the above link (40.2-27.6=12.6) and dividing this by the number of SNPs separating these two levels in the Mal'ta study (42), we get 300 years per each mutation. Most importantly, the Mal'ta boy shows 54 mutations downstream of hg P, which places him 16.2 ky (54 x 0.3) downstream of 40.2 kya, which is 24 kya, exactly as suggested by the calibrated radiocarbon dating. Here is a tree based on the above calculations:
[Attachment 971]

Of course, all of the above would not be true if a substantial number of the private mutations found in the Mal'ta boy were false positives (which cannot be easily verified).

The paper notes that some may be.

"SNPs were filtered for quality (using VQHIGH as the threshold, as defined by Complete Genomics) ... only a few (11) mutations were detected that are likely to be false positives ...it is assumed that recurrent mutation is less likely than a sequencing error ... . One position among the 35 private to MA-1 is characteristic of a distant hg – namely C3c14 ... 10 additional phylogenetically non-concordant positions in MA-1 were found ...."
http://www.nature.com/nature/journal/vaop/ncurrent/extref/nature12736-s1.pdf

The two R1a on their NJ tree happen to be GIH L657+

parasar
11-26-2013, 02:45 AM
...

That link does not work.

http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting&p=15936#post15936


40.2 (34.3-52.9) haplogroup P
...
33.5 (28.6-44.1) haplogroup R
27.6 (23.5-36.3) haplogroup R1

Michał
11-26-2013, 10:47 AM
You are apparently assuming that the very specific, unusual sequencing of the Mal'ta boy's ancient DNA is directly comparable to modern high-quality DNA scans (in terms of the number of genuine mutations vs. false positives vs. false negatives). That is not a reasonable assumption.
This is not exactly true. My main conclusion that R1a is older than 24 ky was actually based solely on the sequencing results received for the Mal'ta boy, though of course this required assigning all his alleles (both ancestral and derived) to some proper positions on the Y-DNA tree (and this was the only moment when the data from the modern high-quality scans was used).

Let me explain my line of reasoning again. If the analysis of all Y-DNA sequences available for the Mal'ta boy identified 35 loci that show the presence of some derived alleles that are not shared by any known R1 members, while the same analysis revealed 23 loci that are known to have the derived status in all known R1 members but are ancestral in the Mal'ta boy, this clearly suggests that R1 is older than 24 ky.

The main source of uncertainty here is the unknown number of false positives among those 35 private mutations found in MA-1, but we simply need to trust the authors of the paper, who have tried to exclude all such non-secure "private" mutations (i.e. false positives). Judging by the relatively low proportion of discordant SNPs found among all phylogenetically relevant SNPs analyzed in the Mal'ta boy, I wouldn't expect the number of false positives among those 35 private mutations to be high enough to undermine the above conclusion.
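
The reasoning can be put into numbers with a short sketch. It assumes a constant rate of 300 years per mutation and an age of 40.2 ky for haplogroup P, as in the earlier calculation in this thread, and it infers (this figure is not quoted directly anywhere above) that MA-1 shares 54 - 35 = 19 mutations with the R1 lineage downstream of P.

```python
KY_PER_SNP = 0.3      # ASSUMED rate: 300 years per mutation
AGE_P = 40.2          # ASSUMED TMRCA of haplogroup P, in ky
AGE_MA1 = 24.0        # calibrated radiocarbon age of MA-1, in kya

ma1_downstream_of_p = 54   # all MA-1 mutations below P
ma1_private = 35           # derived in MA-1, ancestral in modern R1
r1_only = 23               # derived in modern R1, ancestral in MA-1

# INFERRED: mutations shared by MA-1 and the R1 lineage below P.
shared = ma1_downstream_of_p - ma1_private

# Point where the MA-1 line left the line leading to R1.
split_age = AGE_P - shared * KY_PER_SNP

# The R1 node lies 23 further mutations down from that split.
r1_age = split_age - r1_only * KY_PER_SNP

print(f"split of the MA-1 line: {split_age:.1f} kya")
print(f"age of R1:              {r1_age:.1f} kya")
```

Note that the ordering does not depend on the assumed rate: under any constant clock, 23 mutations take less time to accumulate than 35, so the R1 node is reached before the time at which the Mal'ta boy lived.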


That link does not work.
Sorry about that (and thank you, Parasar, for providing a correct link).

Michał
11-26-2013, 10:58 AM
"SNPs were filtered for quality (using VQHIGH as the threshold, as defined by Complete Genomics) ... only a few (11) mutations were detected that are likely to be false positives ...it is assumed that recurrent mutation is less likely than a sequencing error ... . One position among the 35 private to MA-1 is characteristic of a distant hg – namely C3c14 ... 10 additional phylogenetically non-concordant positions in MA-1 were found ...."

This indicates that they have tried to exclude all those potential "false positives", while reporting only those 35 private SNPs that are likely to be "genuine" mutations.

jamesdowallen
11-26-2013, 11:37 AM
Hello. I'm a new member with a layman's interest in Y-chromosome dating (especially R1b-L11). My comment isn't directly about R1a: Is there a better thread for discussing Y-chromosome mutation rates in general?

I think the recent analysis of the 24,000-year old Siberian boy by Raghavan-Skoglund will finally place tighter bounds on the mutation rate. I hope forum members will review my comments here (http://fabpedigree.com/james/yhchart.htm).

The chart at the top of that page has date labels based on my old guess -- a 1/140 mutation rate. Just click here (http://fabpedigree.com/james/yhestimf.jpg) to ignore my uninteresting discussion and see the same tree with the 1/165 mutation rate implied by Raghavan-Skoglund's paper.

(I do hope experts will review my derivation of the mutation rate, and help guess how big the "error bars" are.)

Jean M
11-26-2013, 12:05 PM
Welcome to the forum James Allen!

Michał
11-26-2013, 04:07 PM
I hope forum members will review my comments here (http://fabpedigree.com/james/yhchart.htm).

I think your approach is a correct one. There are some minor points in the description of your procedure that are not clear to me. For example, you seem to equate the R-L node in the Francalacci tree with the R-N node in the Raghavan tree (they are not the same). Also, I found some minor errors in your calculations (for example, "the distance from F-breakup to Siberian boy and to present" is 101 and 160 SNPs, respectively, and not 98 and 155 as you wrote), but this does not affect your final outcome.

Most importantly, your average mutation rate (when recalculated from the value of 166 years per mutation for the region covered by Francalacci) is about 0.66 x 10^-9 per nucleotide per year, which is consistent with the average value for the three previously known mutation rates, as provided by Francalacci, Poznik and Mendez, respectively. This is also very close to the 0.7 mutation rate I have used in my previous calculations cited above.
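
The recalculation mentioned above is a simple unit conversion, sketched below. The size of the region covered by Francalacci is not given in this thread; the value of about 8.97 million bases used here is an assumption, and the result shifts slightly with the exact figure.

```python
YEARS_PER_MUTATION = 166      # value quoted above for the Francalacci region

# ASSUMPTION: roughly 8.97 million bases covered; not stated in this thread.
REGION_BP = 8.97e6

# One mutation per 166 years, spread over the whole region,
# expressed per nucleotide per year.
rate = 1.0 / (YEARS_PER_MUTATION * REGION_BP)

print(f"{rate:.2e} mutations per nucleotide per year")
```

With this assumed region size the result comes out close to the quoted value of about 0.66 x 10^-9 and to the 0.7 rate mentioned above.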

lgmayka
11-26-2013, 05:53 PM
If the analysis of all Y-DNA sequences available for the Mal'ta boy identified 35 loci that show the presence of some derived alleles that are not shared by any known R1 members, while the same analysis revealed 23 loci that are known to have the derived status in all known R1 members but are ancestral in the Mal'ta boy, this clearly suggests that R1 is older than 24 ky.
Again, you are comparing two very different qualities of scan. The figure of 35 for the Mal'ta boy is based on low-quality scanning of ancient DNA, whereas the figure of 23 is based on high-quality scans of modern R1 men.

Full scanning of ancient (i.e., highly degraded) Y chromosomes is still in its infancy. I think we need to remain cautious until we get enough successful examples to cross-check against each other. Perhaps the most egregious recent failure was that of an alleged Neanderthal Y chromosome, whose scan results were finally determined by experts to be that of a female.

EDIT: In re-reading your previous posts, I now see the reason for my "visceral" reaction to your age estimates. You wrote: "...the separation between R1a and R1b took place significantly earlier than 24 kya, most likely about 27-28 kya." This terminology is confusing and somewhat incorrect. What you meant was: "...the TMRCA of R1a and R1b--i.e., the age of R1--is significantly earlier than 24 kya..." You are referring to a split of R1 into two branches which eventually became R1a and R1b. M420 and M343 (the mutations typically associated with R1a and R1b) may have occurred much later, and the TMRCAs of their modern descendants may be much later than that.

RCO
11-26-2013, 07:23 PM
Still, the Francalacci phylogenetic diagram looks to be the best proportional tree available so far. The chronology can be disputed, but the branch lengths should reflect the number of SNPs in each branch. The first FG results make sense there. We need a proportional diagram of the 1000 Genomes SNPs, with branch lengths proportional to SNP counts, as a tool of analysis.

alan
11-26-2013, 11:48 PM
I can certainly see implications if R1 really does date back to c. 25000BC. That takes us into a slightly different climate-environmental setting. I have wondered about the Mal'ta boy's date in relation to the LGM and the disappearance of his type of middle upper Palaeolithic culture at that time. It looked to me that he was almost the last of his culture and was in a very bad situation by that time, as no retreat south was possible by 22000BC in the teeth of the LGM. So, his line going extinct looked a very good bet to me.

However, if R dated back to c. 30000BC and R1 to perhaps 25000BC, this has a number of implications:

1. The date of R rather nicely fits the appearance of the Siberian middle upper Palaeolithic culture. This suggests a very good match.

2. A date for R1 of perhaps 25000BC would allow R1 to have had the option of heading either directly west through the steppe-tundra or beating the LGM desert formation to head south-west towards north Iran and the area of central Asia at a similar latitude a little to the east.

3. That might possibly allow for a geographical divide between R1a and b to have formed by the worst of the LGM, something that I think would have been impossible by the time the Mal'ta boy lived.

4. If P25 dates to something like 17000BC, then it can easily be envisaged that the massive expansion of the Caspian after the LGM could have hugely pushed apart P25 groups, with some remaining close to the southern Caspian (which was little affected) or being pushed deep into north-west central Asia or deep into eastern European Russia. I am even guessing that this pushing back of dates might push P297 back a few millennia from its usual date of around c. 9-10000BC. If that is the case, then even P297 could have been split well apart, perhaps eventually coming to light with M269 and M73.

jamesdowallen
11-27-2013, 01:02 AM
I think your approach is a correct one. There are some minor points in your description of your procedure that are not clear to me. For example, you seem to equate the R-L node in the Francalacci tree with the R-N node in the Raghavan tree ... etc.

Thank you very much for your comments. Browsing the thread I see you posted your own estimates 6 weeks ago, very similar to the dates I derive with the "1/165" rate. Thus my contribution, if any, is to demonstrate that Raghavan-Skoglund data seem to confirm the mutation rate you derived independently.

My conflation of R-L and R-N is because neither paper I started with had a complete repertoire of reference genomes. (Francalacci is missing M,N, etc. Raghavan is missing L,M etc.) I'll look into the other numeric discrepancies; although, as you say, it's hardly urgent. :)

What I would like to see is a phylogeny chart, similar to Francalacci's with SNP counts, for the R1 or R1b tree, or even just the R1b-L11 subtree. (I know I could search for such a thing, but would be grateful if anyone has a link handy!)

Michał
11-27-2013, 01:23 AM
Again, you are comparing two very different qualities of scan. The figure of 35 for the Mal'ta boy is based on low-quality scanning of ancient DNA, whereas the figure of 23 is based on high-quality scans of modern R1 men.
I'm afraid this is not correct. Both numbers (35 and 23) were derived from the low-quality scanning of ancient DNA and both of them were based on a comparison with some high-quality scans of some modern people from different haplogroups (not only R1). The only difference was that the 35 loci were associated with the presence of derived alleles in MA-1 and ancestral alleles in modern R1 samples, while the 23 loci were associated with the presence of ancestral alleles in MA-1 and derived alleles in modern R1 samples.


Full scanning of ancient (i.e., highly degraded) Y chromosomes is still in its infancy. I think we need to remain cautious until we get enough successful examples to cross-check against each other. Perhaps the most egregious recent failure was that of an alleged Neanderthal Y chromosome, whose scan results were finally determined by experts to be that of a female.
I agree that we need to be cautious, but we cannot say that only the derived alleles shared by modern R1 people and the ancestral alleles not shared by modern R1 people are OK in such a case, while all "private" derived alleles not shared by modern R1 people should be neglected. With such an approach, we would never be able to acknowledge that a given ancient sample contains a set of private mutations that likely indicate an extinct sublineage of a known haplogroup. Therefore, as long as the general level of discordant SNP results is very low (as in this particular case and unlike in the above-mentioned Neanderthal Y-DNA case), we simply have no reason to reject all those private SNPs that have successfully passed the relatively strict criteria for reliability.



In re-reading your previous posts, I now see the reason for my "visceral" reaction to your age estimates. You wrote: "...the separation between R1a and R1b took place significantly earlier than 24 kya, most likely about 27-28 kya." This terminology is confusing and somewhat incorrect. What you meant was: "...the TMRCA of R1a and R1b--i.e., the age of R1--is significantly earlier than 24 kya..." You are referring to a split of R1 into two branches which eventually became R1a and R1b. M420 and M343 (the mutations typically associated with R1a and R1b) may have occurred much later, and the TMRCAs of their modern descendants may be much later than that.
It seems to me that you are exaggerating a bit. Are you saying that these two sublineages of R1 (R1a and R1b) were not separated from each other before the M420 and M343 mutations first arose? This would be like saying that your patrilineage and my patrilineage were not separate lineages until we both were born. The fact that we currently define R1a using M420 while R1b is most frequently defined by M343 is a pure convention, and once we learn that there is another mutation upstream of M420 that is parallel to M343, we will simply redefine R1a, though this will change nothing in respect to the age of R1 (and in respect to the moment of separation between R1a and R1b, though the age (TMRCA) of R1a itself will certainly be affected).

Michał
11-27-2013, 01:32 AM
What I would like to see is a phylogeny chart, similar to Francalacci's with SNP counts, for the R1 or R1b tree, or even just the R1b-L11 subtree. (I know I could search for such a thing, but would be grateful if anyone has a link handy!)
I don't think you will find such a comprehensive tree for R1b-L11, but there is a chance that someone will soon create this kind of scheme based on the recently ordered (and partially completed) FGC results for a relatively large number of R1b-L11 people.

alan
11-27-2013, 02:51 PM
I have posted before too that there is a sequence of climatic-environmental aspects to all of this. I think the new dates Michal is suggesting actually have a very good fit to major climatic events.

1. The beginning of the Siberian middle upper Palaeolithic culture around 30000BC would not only coincide with the date for R but also coincides with a relatively good period of climate that allowed this culture to enter Siberia after a period of abandonment. (Previous early upper Palaeolithic settlement, dating from c. 40000BC onwards, had abandoned the area due to an earlier climate downturn. I don't know who these people were, and the problem is made worse by the tendency of Russian archaeologists to see them as smoothly evolving from archaic pre-homo-sapiens-sapiens groups.)

2. The end of this middle upper Palaeolithic culture took place c. 25-22000BC, with the Mal'ta boy being a very late example and most indicators of settlement ending closer to 25000BC. The beginning of the LGM c. 25000BC not only coincides with the general abandonment of south-central Siberia other than the Mal'ta boy but also roughly coincides with the date Michal is suggesting for R1. This is a shift from Karafet, who placed it at the end of the LGM. It suggests to me that R1 occurred around the time that the bulk of people of the middle upper Palaeolithic Siberian culture had left south-central Siberia, leaving hold-outs like the Mal'ta boy behind. My guess is that R1 occurred among the refugees heading west, probably near the start of their period of abandoning south-central Siberia. Most evidence points to a route west or south-west. I would guess R1 happened somewhere in north central Asia on the escape route west, early on in that journey. R2 is probably evidence for other lines who took a variant route out of Siberia. I don't know much about R2 other than that it's big in India and also has a showing in central Asia, but I have read that from south-central Siberia you either have to go through the Dzungarian gates into Mongolia or head west at least as far as Afghanistan in order to pass through to south Asia.

3. I am agnostic about whether a date for R1 of 25000BC means they ended up on opposite sides of the LGM north central Asian desert that ran from the east Caspian to China. It means it's theoretically possible, but they didn't have a lot of time before splitting like that would become impossible. The real question then is how many SNPs R1a and R1b share after R-M173, the R1* defining SNP. Has that been established? Clearly the key to this is how soon after R1 arose the lineage split into two lines.

4. Michal's suggested P25 date would now coincide with the end of the LGM and perhaps part of the mega-expansion of the Caspian.

ADW_1981
11-27-2013, 03:16 PM
I have posted before too that there is a sequence of climatic-environmental aspects to all of this. I think the new dates Michal is suggesting actually have a very good fit to major climatic events.

1. The beginning of the Siberian middle upper Palaeolithic culture around 30000BC would not only coincide with the date for R but also coincides with a relatively good period of climate that allowed this culture to enter Siberia after a period of abandonment. (Previous early upper Palaeolithic settlement, dating from c. 40000BC onwards, had abandoned the area due to an earlier climate downturn. I don't know who these people were, and the problem is made worse by the tendency of Russian archaeologists to see them as smoothly evolving from archaic pre-homo-sapiens-sapiens groups.)

2. The end of this middle upper Palaeolithic culture took place c. 25-22000BC, with the Mal'ta boy being a very late example and most indicators of settlement ending closer to 25000BC. The beginning of the LGM c. 25000BC not only coincides with the general abandonment of south-central Siberia other than the Mal'ta boy but also roughly coincides with the date Michal is suggesting for R1. This is a shift from Karafet, who placed it at the end of the LGM. It suggests to me that R1 occurred around the time that the bulk of people of the middle upper Palaeolithic Siberian culture had left south-central Siberia, leaving hold-outs like the Mal'ta boy behind. My guess is that R1 occurred among the refugees heading west, probably near the start of their period of abandoning south-central Siberia. Most evidence points to a route west or south-west. I would guess R1 happened somewhere in north central Asia on the escape route west, early on in that journey. R2 is probably evidence for other lines who took a variant route out of Siberia. I don't know much about R2 other than that it's big in India and also has a showing in central Asia, but I have read that from south-central Siberia you either have to go through the Dzungarian gates into Mongolia or head west at least as far as Afghanistan in order to pass through to south Asia.

3. I am agnostic about whether a date for R1 of 25000BC means they ended up on opposite sides of the LGM north central Asian desert that ran from the east Caspian to China. It means it's theoretically possible, but they didn't have a lot of time before splitting like that would become impossible. The real question then is how many SNPs R1a and R1b share after R-M173, the R1* defining SNP. Has that been established? Clearly the key to this is how soon after R1 arose the lineage split into two lines.

4. Michal's suggested P25 date would now coincide with the end of the LGM and perhaps part of the mega-expansion of the Caspian.

It should also be considered that the Mal'ta cultures were a northern extension of R*. When we talk about where the R1 guy lived, and the subsequent descendants M420 and M343, we could be talking about a region near the Caspian: Turkmenistan, Afghanistan, Kazakhstan, etc. It should be noted that the R* lineage in Siberia is, for all intents and purposes, a dead end. It doesn't look like any men survived from this group. Following that far "north and east" trail and then tying it back directly to the distribution of M173, M343, and M420 adds additional assumptions, IMHO.

I don't mean to take this thread off course :)

parasar
11-27-2013, 03:58 PM
I can certainly see implications if R1 really does date back to c. 25000BC. That takes us into a slightly different climate-environmental setting. I have wondered about the Mal'ta boy's date in relation to the LGM and the disappearance of his type of middle upper Palaeolithic culture at that time. It looked to me that he was almost the last of his culture and was in a very bad situation by that time, as no retreat south was possible by 22000BC in the teeth of the LGM. So, his line going extinct looked a very good bet to me ...

His line is definitely gone, and perhaps was gone soon after his demise. But his type is seen in Afontova Gora II 7000 years later.


results from a second south-central Siberian from Afontova Gora-2 site are presented in order to address human occupation of the region during and after the Last Glacial Maximum (LGM; ca. 26,000 to 19,000 years ago), a climatically cold period when glacial ice sheets extended to their maximum range.

At approximately 17,000 years ago, this post-LGM individual demonstrates similar genomic signatures as MA-1, with close affinity to modern western Eurasians and Native Americans and none to present-day East Asians. This result indicates that genetic continuity persisted in south-central Siberia throughout this climatically harsh period, which is a significant consideration for the peopling of Beringia, and eventually the Americas some 15,000 years ago. http://news.ku.dk/all_news/2013/2013.11/ancient_siberian_genome_reveals_genetic_origins_of_native_americans/

The blogger Genetiker performed an analysis and has this to say:

The results indicate that the Mal’ta boy was a proto-Aryan ... Proto-Aryans would have been evolutionarily intermediate between Veddoids and Nordics, and their DNA will therefore appear to be a mix of Veddoid and Nordic elements, as here.

I.e., while they appear to be a mix, it is more correctly a transition from the Vedda type (which we see more clearly in the still earlier Kostenki 14).

There seems to be a progressive transition in the next 7000 years, with a further shift away from Vedda.


The Afontova Gora man had more of the Nordic components and less of the Veddoid component than the Mal’ta boy. This makes sense, because the Afontova Gora man was 7,000 years further up the path of Nordic evolution than the Mal’ta boy.
http://genetiker.wordpress.com/2013/11/24/analyses-of-the-afontova-gora-genome/

alan
11-27-2013, 04:26 PM
I don't want to take seriously someone using terms like Aryan and Nordic in the same sentence.


His line is definitely gone, and perhaps was gone soon after his demise. But his type is seen in Afontova Gora II 7000 years later.

http://news.ku.dk/all_news/2013/2013.11/ancient_siberian_genome_reveals_genetic_origins_of_native_americans/

The blogger Genetiker performed an analysis and has this to say:


I.e., while they appear to be a mix, it is more correctly a transition from the Vedda type (which we see more clearly in the still earlier Kostenki 14).

There seems to be a progressive transition in the next 7000 years, with a further shift away from Vedda.


http://genetiker.wordpress.com/2013/11/24/analyses-of-the-afontova-gora-genome/

parasar
11-27-2013, 04:37 PM
I am not too sure I want to take seriously someone using terms like Aryan and Nordic in the same paragraph.

My focus is on the analysis rather than the terms he uses (but then I'm not burdened by the ugly history of those terms; the term Arya-Putra is still often used by us, especially at ceremonies).

The content, though, is consistent with what the paper's authors reported (at the first link in my post).

TigerMW
11-29-2013, 04:15 PM
Here are some very provisional calculations based on the recently received full Y chromosome sequencing results for a first member of our R1a1a and Subclades project tested...

Michal, thank you for your work. You call these TMRCAs. I just want to be sure I understand if the applicability is the same as what I'm used to, if as I understand it, you are counting SNPs only.

My first impression is that, since you are specifically looking at branch lines, your estimated dates are the dates at which the SNPs are predicted to have actually occurred. If so, there may be other, younger phylogenetically equivalent SNPs that mark the Most Recent Common Ancestors, at least in some cases. Do I understand that correctly? Perhaps it doesn't matter.

alan
11-29-2013, 04:30 PM
I am very curious to know more about Michal's calculations too. Michal presented two alternatives. He expressed a preference for the first variation because it fitted the Mal'ta boy better. However, his second variation is much more in line with the STR-based calculations that people have come up with. The first variation would have implications for M269, as it would push its earliest subclades into a period between the first farmers and the Copper Age and would be a good fit for the option of the Marmara dairy pastoralists' second Neolithic wave, c. 5550BC. It would also indicate that the SNPs themselves are considerably older and existed for 1000 years or so before expansion.

The other finding, about L51 being as old as the L23xL51 clade, is interesting, and I believe someone else suggested this a few months ago, I think using STRs.

His calculations also make M269* not quite as old as the above two clades but still significantly older than L11.

Finally, Michal's V88 dates from both methods are much older than the 5000BC date I had previously heard of from STR variance. That would place V88 in an entirely different context.

I don't suppose you could calculate for M73, Michal? That would be very interesting.


Michal, thank you for your work. You call these TMRCAs. I just want to be sure I understand whether they apply in the same way as what I'm used to, given that, as I understand it, you are counting SNPs only.

My first impression is that, since you are specifically looking at branch lines, your estimated dates are the dates at which the SNPs themselves are predicted to have occurred. If so, there may be other, younger phylogenetically equivalent SNPs that mark the Most Recent Common Ancestors, at least in some cases. Do I understand that correctly? Perhaps it doesn't matter.

alan
11-29-2013, 04:40 PM
I think I am asking a similar question to Mike's, but where in this list do you switch from SNP ages to clade ages? Obviously I can tell from the way it is presented that the M269 and L23 ages are SNP ages, as further down you note the actual subclades. However, I am not clear about L51, V88 and P25: are they SNP ages or the ages of the L51* clades, the P25* paragroup and the V88 clade?

63.0 (53.7-82.8) haplogroup F
61.9 (52.8-81.3) haplogroup IJK
58.8 (50.1-77.3) haplogroup K
40.2 (34.3-52.9) haplogroup P
36.6 (31.2-48.2) haplogroup I
33.5 (28.6-44.1) haplogroup R
27.6 (23.5-36.3) haplogroup R1
22.9 (19.5-30.1) R1b-P25
14.9 (12.5-19.6) R1b-V88
8.6 (7.3-11.3) R1b-M269
8.3 (7.1-10.9) R1b-L23
7.6 (6.5-10.0) R1b-L51
7.4 (6.3-9.7) R1b-Z2105
7.2 (6.1-9.5) R1b-M269(xL23)
6.6 (5.6-8.6) R1b-L11
6.2 (5.3-8.2) R1b-P312
6.1 (5.2-8.0) R1b-U152

TigerMW
11-29-2013, 06:08 PM
... The other finding about L51 being as old as the L23XL51 clade is interesting and I believe someone else suggested this before a few months ago, I think using STRs....

This is the example I was thinking about when I asked the question. The Z2105 SNP (some associate with L23xL51) may have occurred later than L51. However, that doesn't mean the L51 remnant left today is as old as the Z2105 remnant. I tend to see Z2105 STR diversity as higher than L51 and L51 has the consistent 426=13 marker. L51 may have had a period of struggle before L51xL11 426=13 and L11 emerged from the bottleneck.

Michał
11-30-2013, 01:44 AM
Michal, thank you for your work. You call these TMRCAs. I just want to be sure I understand whether they apply in the same way as what I'm used to, given that, as I understand it, you are counting SNPs only.

My first impression is that, since you are specifically looking at branch lines, your estimated dates are the dates at which the SNPs themselves are predicted to have occurred. If so, there may be other, younger phylogenetically equivalent SNPs that mark the Most Recent Common Ancestors, at least in some cases. Do I understand that correctly? Perhaps it doesn't matter.
Mike, these TMRCA values have the same meaning as in the case of your STR-based estimates. They are supposed to represent the branching points and not the positions of particular SNP markers within a group of markers currently assigned to the same level.

Michał
11-30-2013, 01:52 AM
Michal presented two alternatives. He expressed a preference for the first variation because it fitted Mal'ta boy better.
This is not correct. Those calculations were performed before the Mal'ta boy results were known (please check the dates). The reason why I suggested that the first set of estimates is slightly more reliable is because the second set was more likely to underestimate the age of some young clades (and I have actually explained it in detail when presenting those results). However, I think it is still possible that those younger ages are a bit closer to the real values, so the whole question remains open.



The other finding about L51 being as old as the L23XL51 clade is interesting and I believe someone else suggested this before a few months ago, I think using STRs.
As far as I remember, this was first noticed by several members of this forum, including Richard Rocca and myself, as soon as the Sardinian SNP data were published. I don't recall anyone suggesting it based on some STR results, but I wouldn't be surprised if someone has noticed this before.

One important thing to note is that my calculations are based exclusively on the Sardinian data, so if any early subclade of L23 (or of Z2105) is not represented there, this would lead to some underestimated TMRCA values. The same applies of course for M269(xL23), though based on some M269* STR haplotypes I've seen, I wouldn't expect this new potential brother clade of L23 to be significantly older than what I have suggested, unless there are some very rare early separated subclades of M269 that are present in some undertested regions only (for example somewhere in Asia or in some very specific locations in the Balkan peninsula).



Finally, Michal's V88 dates from both methods are much, much older than the 5000BC date I previously had heard of from STR variance. That would place V88 in an entirely different context.
This is because V88 is clearly divided into two early separated subclades. Both of them are found in Sardinia and both are much younger than their parental clade V88.



I don't suppose you could calculate for M73, Michal? That would be very interesting.
Unfortunately, there was no M73 in Sardinia. I would of course love to see such calculations being performed for this intriguing clade.

Michał
11-30-2013, 01:55 AM
I think I am asking a similar question to Mike's, but where in this list do you switch from SNP ages to clade ages? Obviously I can tell from the way it is presented that the M269 and L23 ages are SNP ages, as further down you note the actual subclades. However, I am not clear about L51, V88 and P25: are they SNP ages or the ages of the L51* clades, the P25* paragroup and the V88 clade?

Alan, they were all calculated the same way. For example, the TMRCA value for haplogroup R1 corresponds to the R1a-R1b branching point. Similarly, the TMRCA value for L51 corresponds to the branching point for its two major subclades, i.e. L11 and another subclade of L51 that remained unnamed at that time (a brother clade of L11).

alan
11-30-2013, 01:59 AM
Thanks for the feedback. I didn't mean to imply that you chose based on the Mal'ta results. I realise you did your calculations and decided which might be better before Mal'ta. What I meant was that you recently noted that Mal'ta does support your earlier conclusion.

Regarding M73, it's amazing how little interest there is in this clade. There wasn't even a section for M73 under the R1b section of this site until I posted a couple of threads on it. I think M73 is crucial to understanding M269, its brother clade. However, I have never even seen a single distribution map, although I have hinted that Maciamo should have a go at it as he is known for his maps.


This is not correct. Those calculations were performed before the Mal'ta boy results were known (please check the dates). The reason why I suggested that the first set of estimates is slightly more reliable is because the second set was more likely to underestimate the age of some young clades (and I have actually explained it in detail when presenting those results). However, I think it is still possible that those younger ages are a bit closer to the real values, so the whole question remains open.



As far as I remember, this was first noticed by several members of this forum, including Richard Rocca and myself, as soon as the Sardinian SNP data were published. I don't recall anyone suggesting it based on some STR results, but I wouldn't be surprised if someone has noticed this before.

One important thing to note is that my calculations are based exclusively on the Sardinian data, so if any early subclade of L23 (or of Z2105) is not represented there, this would lead to some underestimated TMRCA values. The same applies of course for M269(xL23), though based on some M269* STR haplotypes I've seen, I wouldn't expect this new potential brother clade of L23 to be significantly older than what I have suggested, unless there are some very rare early separated subclades of M269 that are present in some undertested regions only (for example somewhere in Asia or in some very specific locations in the Balkan peninsula).



This is because V88 is clearly divided into two early separated subclades. Both of them are found in Sardinia and both are much younger than their parental clade V88.



Unfortunately, there was no M73 in Sardinia. I would of course love to see such calculations being performed for this intriguing clade.

Michał
11-30-2013, 02:06 AM
This is the example I was thinking about when I asked the question. The Z2105 SNP (some associate with L23xL51) may have occurred later than L51. However, that doesn't mean the L51 remnant left today is as old as the Z2105 remnant.
As I wrote above, it is still possible that clade Z2105 is slightly older than L51, as we cannot rule out that some early separated subclades of Z2105 are absent in Sardinia. On the other hand, the same can be said about some rare subclades of L51. Honestly speaking, I don't think it is important whether one of those subclades is slightly older than the other. What is important is that, when focusing on the major subclades of L23 only, it seems obvious that L23 is divided into two major subclades of about the same age (L51 and Z2105) that are about 700-900 years younger than their parent clade (L23). Since L51 and Z2105 show different geographical distributions, this clearly suggests that before they expanded they were already represented by two separate populations that lived in distinct locations. Personally, I would assign L51 to the first "Kurgan" wave from the North Pontic region (Suvorovo, Cernavoda, Ezero, Cotofeni, Baden, Vucedol), while Z2105 (probably accompanied by M269*) would represent the Western Yamnaya population that moved to the Balkans at a much later date (when most L51 people had already moved west).
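
The "700-900 years younger" figure can be read straight off the point estimates in the list quoted earlier in the thread (8.3 for L23, 7.6 for L51, 7.4 for Z2105, in thousands of years). A minimal sketch of that subtraction follows; the dictionary and function names are illustrative assumptions, and the confidence ranges are ignored.

```python
# Point estimates in thousands of years, copied from the list quoted earlier in the thread.
tmrca_kya = {"L23": 8.3, "L51": 7.6, "Z2105": 7.4}


def gap_years(parent, child, ages=tmrca_kya):
    """Years between the TMRCA of a parent clade and that of one of its subclades."""
    # round() removes floating-point noise such as 700.0000000000007
    return round((ages[parent] - ages[child]) * 1000)


gaps = {child: gap_years("L23", child) for child in ("L51", "Z2105")}
print(gaps)
```

Because the ranges of the three estimates overlap heavily, the gap itself is much less certain than the two numbers suggest.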

parasar
12-06-2013, 05:55 AM
That depends on the set of figures you choose. At the moment, I prefer the rightmost column of the provisional SNP-based TMRCA calculations that Michał posted recently. (They best match the TMRCA calculations from STRs.)

R1 : 21560
R1a-M417 : 6048
R1a-Z645 : 5544
R1a-Z282 : 5376
R1a-M458 : 4704
R1a-L1029 : 2744

TMRCAs will continue to improve relative to each other as we get additional full-Y examples and average their SNP counts; but the scaling of the entire TMRCA set still requires correlation with an archaeologically datable event, and/or a reliable stable SNP mutation rate.

lgmayka, Michał,

How do I interpret ~137 SNPs at the Y9 or earlier level?
Thanks.

Michał
12-06-2013, 12:19 PM
How do I interpret ~137 SNPs at the Y9 or earlier level?

I am not sure what you mean by that. Could you please provide some more details?

parasar
12-06-2013, 03:35 PM
I am not sure what you mean by that. Could you please provide some more details?

It was from the YFull site from their interpretation of the N12617.BAM data.
But it is gone now, and the SNP qualifiers have since changed. Perhaps the Full Genomes .BAM data is still being analyzed. I will check with Vladimir.

Now the 137 Y9 equivalent or upstream have been broken down, and it shows:
R-Y2392 terminal 1 SNP
R-Y2392 private 87 SNPs
R-Y2392 equivalent or upstream 13 SNPs
R-Y2392 2 SNPs
R-Y2353 equivalent or upstream 18 SNPs
R-Y2353 5 SNPs
R-Y9 equivalent or upstream 10 SNPs
R-Y9 1 SNP
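
As a quick consistency check, the eight categories above do add up to the earlier figure of 137. The sketch below only sums the counts as posted; treating the 137 as the sum of all eight rows is an assumption about how YFull grouped them.

```python
# SNP counts per category, copied from the breakdown above.
breakdown = [
    ("R-Y2392 terminal", 1),
    ("R-Y2392 private", 87),
    ("R-Y2392 equivalent or upstream", 13),
    ("R-Y2392", 2),
    ("R-Y2353 equivalent or upstream", 18),
    ("R-Y2353", 5),
    ("R-Y9 equivalent or upstream", 10),
    ("R-Y9", 1),
]

total = sum(count for _, count in breakdown)
print(total)
```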

For SNP names and positions on YFull tree:
http://www.yfull.com/tree/R-Y9/

parasar
12-07-2013, 04:34 AM
Terminal
Y2392 = FGC7405

Total positive SNPs=1982
So if we go by an average of one mutation every third generation at 25 years per generation (75 years per mutation), separation from root ('African Adam') = 148650; from BT ('Eurasian Adam') = 112800; from MP (Papuan M1205) = 63750; from P = 48375 (potential calibration point with Mal'ta MA1, though coverage, quality, and scan differ).

L657 11475
Z93 12300
R1a 36450
R1 42225
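
The arithmetic behind these figures is a single multiplication: SNP count times 3 generations times 25 years. A minimal sketch follows, with hypothetical function names; only the 1982 total is given in the post, so the per-node SNP counts printed at the end are back-calculated from the posted ages rather than taken from the data.

```python
YEARS_PER_GENERATION = 25   # as stated in the post
GENERATIONS_PER_SNP = 3     # "one mutation every third generation"
YEARS_PER_SNP = YEARS_PER_GENERATION * GENERATIONS_PER_SNP  # 75


def snps_to_years(snp_count):
    """Age implied by a number of SNPs at a flat 75 years per SNP."""
    return snp_count * YEARS_PER_SNP


def years_to_snps(years):
    """Inverse: the SNP count that a posted age implies."""
    return years / YEARS_PER_SNP


print(snps_to_years(1982))  # separation from the root

# Back-calculated (not posted) SNP counts for the other nodes.
posted_ages = {"BT": 112800, "MP": 63750, "P": 48375,
               "R1": 42225, "R1a": 36450, "Z93": 12300, "L657": 11475}
implied_snps = {node: years_to_snps(age) for node, age in posted_ages.items()}
print(implied_snps)
```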

palamede
12-07-2013, 12:28 PM
The problem of the average number of generations (or years) between mutations remains difficult.

With high-quality mutations, the average interval is certainly a lot lower for the upper part of the tree than for the ends of the branches, because much of the tree is covered by a very large number of reads from numerous samples, so the positions of the mutations are more certain, whereas the ends of the branches have few reads and their exploration is much less reliable.

Therefore an average of one mutation every third generation is about right for high-quality mutations over most of the tree, but the value may differ for the ends of the branches.

By taking a total of 1982 positive SNPs, parasar uses both high- and low-quality mutations, not all of which are certain, but perhaps he thereby avoids the preceding disadvantages from a statistical point of view; to me these dates seem more plausible than calculations with the high-quality mutations only.
They seem to agree with the tree of Francalacci et al. and a mutation rate of 0.5 x 10^-9/bp/year.

An average of one mutation every third generation (1 per 75 years) is for a 24 Mbp test, and we can also assume that the high-quality mutations counted for the common parts of the tree are for 24 Mbp, because mapping problems are solved for these well-known parts of the tree.

For the 12 Mbp with fewer mapping problems, on the other hand, high-quality mutations are proportionally more numerous at the branch ends, but the rate is one high-quality mutation every sixth generation (1 per 150 years, and maybe many more years due to the lack of comparison of mutations).

I am not a specialist at all, and maybe I am being foolish in my understanding.

parasar
12-10-2013, 05:50 PM
The problem of the average number of generations (or years) between mutations remains difficult.

With high-quality mutations, the average interval is certainly a lot lower for the upper part of the tree than for the ends of the branches, because much of the tree is covered by a very large number of reads from numerous samples, so the positions of the mutations are more certain, whereas the ends of the branches have few reads and their exploration is much less reliable.

Therefore an average of one mutation every third generation is about right for high-quality mutations over most of the tree, but the value may differ for the ends of the branches.

By taking a total of 1982 positive SNPs, parasar uses both high- and low-quality mutations, not all of which are certain, but perhaps he thereby avoids the preceding disadvantages from a statistical point of view; to me these dates seem more plausible than calculations with the high-quality mutations only.
They seem to agree with the tree of Francalacci et al. and a mutation rate of 0.5 x 10^-9/bp/year.

An average of one mutation every third generation (1 per 75 years) is for a 24 Mbp test, and we can also assume that the high-quality mutations counted for the common parts of the tree are for 24 Mbp, because mapping problems are solved for these well-known parts of the tree.

For the 12 Mbp with fewer mapping problems, on the other hand, high-quality mutations are proportionally more numerous at the branch ends, but the rate is one high-quality mutation every sixth generation (1 per 150 years, and maybe many more years due to the lack of comparison of mutations).

I am not a specialist at all, and maybe I am being foolish in my understanding.
Same here. :)

I think your points are well taken. The 24000 year old Mal'ta boy is a pre-R derivative with 5 mutations ancestral and with 35 private mutations.
His full genome was scanned at 1.5X over 5.8 million bases.


Due to low depth-of-coverage of the MA-1 individual (1.5X on 5.8 million bases), genotyping at each site on the Y chromosome was performed by selecting the allele with the highest frequency of bases with a base quality of 13 or higher. Additionally, a multi-fasta file was generated from the variable positions on the Y chromosomes available from 24 Complete Genomics public genomes1. SNPs were filtered for quality (using VQHIGH as the threshold, as defined by Complete Genomics), with tri-allelic positions excluded and only those Y chromosome regions determined as being phylogenetically informative being used2. This yielded a final dataset of 22492 positions ... In the course of this analysis, the original dataset was severely pruned... MA-1 is determined to be carrying the derived allele in 183 sites and the ancestral allele in 1706 sites. http://www.nature.com/nature/journal/vaop/ncurrent/extref/nature12736-s1.pdf

With this low depth of coverage it is quite likely that they missed many mutations.

50x, on the other hand, is almost fully complete, as 28x is considered to be sufficiently complete.

To find the complete genomic sequence of one person with current sequencing platforms requires sequencing that person's DNA the equivalent of about 28 times (called 28X). If the amount of sequence done is only an average of once across the genome (1X), then much of the sequence will be missed, because some genomic locations will be covered by several pieces while others will have none. The deeper the sequencing coverage, the more of the genome will be covered at least once. Also, people are diploid; the deeper the sequencing coverage, the more likely that both chromosomes at a location will be included. In addition, deeper coverage is particularly useful for detecting structural variants, and allows sequencing errors to be corrected. http://www.1000genomes.org/about

Full Genomes conducts 50 reads per base over "about 20,000,000 bp" and reports on about 28,000 known variants. YFull notes about 43,000 SNPs. There are 1981 SNPs for YF01382 - https://docs.google.com/file/d/0B-b3KeGG3Un1VUJTSU12UUVkXzQ/edit - including 87 private SNPs (cf. Full Genomes Total 'private' high quality mutations at 95%+ level 43).

Based on the above, if I take one mutation every 75 years as reported at YFull's quality control sufficiency, then each mutation for the Mal'ta boy would have to account for many more years - about 4-10 times as much depending on quality (Which gives me 24000 + ((35-5)x75x4)=33000 years or 24000 + (30x75x10)=46500 for R).
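
The two bracketed sums can be written as one small function. This is only a sketch of the arithmetic as posted: the function and parameter names are illustrative assumptions, and the factor of 4 to 10 for mutations missed at low coverage is the poster's own guess rather than a measured value.

```python
def age_of_r_from_malta(sample_age=24000, private_snps=35, ancestral_snps=5,
                        years_per_snp=75, undercount_factor=4):
    """Age of R implied by the Mal'ta boy calibration described in the post.

    undercount_factor scales the years per observed mutation to allow for
    mutations missed at 1.5X coverage (4 to 10 in the post).
    """
    observed = private_snps - ancestral_snps  # the (35-5) term in the post
    return sample_age + observed * years_per_snp * undercount_factor


low = age_of_r_from_malta(undercount_factor=4)
high = age_of_r_from_malta(undercount_factor=10)
print(low, high)
```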

The higher number is in line with what I calculated before: "from P 48375 (potential calibration point with Mal'ta MA1, though coverage, quality, and scan differs).
L657 11475
Z93 12300
R1a 36450
R1 42225"


Calibrating by the lower number, I would get:
L657 7828
Z93 8390
R1a 24865
R1 28804
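
The lower set appears to be the higher set rescaled by 33000/48375, i.e. the lower Mal'ta-based age for R divided by the earlier figure for P. That scale factor is an inference from the posted numbers, not something the post states; the sketch below shows that it reproduces the four values to within a year (the posted figures are rounded inconsistently).

```python
higher = {"L657": 11475, "Z93": 12300, "R1a": 36450, "R1": 42225}
posted_lower = {"L657": 7828, "Z93": 8390, "R1a": 24865, "R1": 28804}

# Assumed scale factor: lower calibration age / earlier age used as calibration point.
SCALE = 33000 / 48375

rescaled = {node: age * SCALE for node, age in higher.items()}

for node, age in rescaled.items():
    print(f"{node}: {age:.1f} (posted {posted_lower[node]})")
```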

palamede
12-12-2013, 09:10 AM
The higher number is in line with what I calculated before: "from P 48375 (potential calibration point with Mal'ta MA1, though coverage, quality, and scan differs).
L657 11475
Z93 12300
R1a 36450
R1 42225

If we suppose this hypothesis is right, the first arrival of R1a in India would date to what is called the Younger Dryas (or "Dryas 3", 12800-11500 calibrated) in Europe: a sudden cold spell after the warmer Bølling and Allerød phases (15000-12800), which had seen the population increase by a large factor.
R2 would have come into the Indian peninsula before the LGM, about 30,000 BP calibrated.
http://en.wikipedia.org/wiki/Younger_Dryas

This is a lot more coherent with the present distribution of L657 in India than an arrival with Aryan invasions 3500 years ago; those would have involved other Z93 branches that had remained in Central Asia, like Z2123.

Rathna
12-12-2013, 10:05 AM
I'd like to remind you all of some posts of mine:

Y-DNA, Mt-DNA, Autosomal DNA
http://eng.molgen.org/
Is it R1a older than it is usually thought?

http://eng.molgen.org/viewtopic.php?f=77&t=81
Posted: Sun Mar 18, 2012 7:22 am
by Gioiello

This posting of mine, posted here and also published by Dienekes, is awaiting some response, above all from Anatole Klyosov:


An interesting haplotype of R1a1a (M17) has been found in the paper of Gunjan Sharma et al., Genetic Affinities of the Central Indian Tribal Population, PLoS one, February 2012:
DYS19=18
DYS385=14-17
DYS389=15-30
DYS390=28
DYS391=12
DYS392=14
DYS393=13
DYS437=17
DYS439=13
DYS448=22
DYS456=17
DYS458=17

At first sight it might seem we have found the R-M420 not found so far in India, with its DYS392=14, which presupposes a 13, whereas all the other R1a1a haplotypes have 11, or 10 and 12 derived from 11. But this haplotype has been tested for M17, so it isn’t an R-M420. Also, the extremely large variance of the other markers makes us think that this value of 14 derives from a modal 11 (or whatever the modal was at the origin of this subclade). So, once again, all the talk about “modal” and “variance” is, as I have argued many times, worth nothing.
But I think there is something to say about the TMRCA of 10.97+/-1.86 kya (25 years per generation), even though it was calculated with the Zhivotovsky rate. It is clear that these R1a1a-s belong to different clades, and the massive presence of the most commonly found clade distorts the calculation. It is clear that this haplotype is an outlier, but it is all the more interesting for that, because it shows mutations that have mostly gone off at a tangent and not around the modal. If we calculate the intraclade age between two of these haplotypes, for instance between this one and one closer to the modal: 15, 11-14, 14-32, 24, 10, 11, 12, 14, 10, 20, 15, 16, we have 32 mutations. Also using the usual mutation rate of 0.0022, we have:
(454x32)/28=518
518x25=12,950
and I have used a generation of 25 years and not 32 as I use usually, and I haven’t considered other mutations around the modal.
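
The two lines of arithmetic above can be reproduced as follows. This is only a sketch of the calculation as posted: 454 is taken to be roughly 1/0.0022, while the mutation count of 32 and the divisor of 28 are used exactly as given without being re-derived, and the function and parameter names are illustrative.

```python
def pairwise_age(mutations=32, divisor=28, inverse_rate=454, years_per_generation=25):
    """Generations and years separating two haplotypes, as computed in the post.

    inverse_rate is about 1 / 0.0022 generations per mutation per marker.
    The result is truncated to whole generations, matching the posted 518.
    """
    generations = (inverse_rate * mutations) // divisor
    return generations, generations * years_per_generation


generations, years = pairwise_age()
print(generations, years)

# The same separation with the 32-year generation the poster says he normally uses.
print(pairwise_age(years_per_generation=32))
```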

Conclusions? The ancientness of the haplogroups is much much more than it is usually thought.

Re: Is it R1a older than it is usually thought?
Posted: Sun Mar 18, 2012 3:59 pm
by lgmayka

Of course, one cannot come to conclusions based on a single short haplotype taken from a sample for which further testing is not available. We do have 67-marker examples of R1a1a* (R-M198*), and even a 111-marker example of R1a1* (R-SRY10831.2*).

Nevertheless, I encourage any South Asian R1a with an unusual haplotype to order SNP tests and more markers. :)

Re: Is it R1a older than it is usually thought?
Posted: Sun Mar 18, 2012 4:28 pm
by Gioiello

Of course what you say is right, and we hope that all these tests will be done. My analysis wasn’t meant to argue for an Asian origin of this haplogroup, but only to affirm that Y (and mt) haplogroups are older than is usually thought, because the usual calculations don’t take into consideration my theories, that:
1) mutations happen around the modal
2) there is a convergence to the modal as time passes
3) only sometimes mutations go for the tangent, and we have the outliers, which, of course, are a few, but I think aren’t negligible.
If R1a1a is so ancient (and probably much more so if we calculate its age taking into consideration what I have said above), how ancient is R1a/M420? And does it mean nothing that it has been found, as far as I know, only in Europe (East, South (Italy), the Isles)?

About Anatole Klyosov I can say that I have discussed these problems with him in the past on “Dienekes’ Anthropology blog” and I have posted this there, on Worldfamilies and now here. If he has something to say, I am here.

alan
01-02-2014, 04:49 PM
To elaborate, Michal's calculations suggest three stages to me. I tend to believe that the variant of Michal's method that produces the older date range is closest to reality if the low end of his range is considered (BOLDED).

1. A c. 1000 year period of existence of the M269 and L23 SNPs but non-expansion/bare survival c. 5-6000BC

8.6 (7.3-11.3) R1b-M269
8.3 (7.1-10.9) R1b-L23

2. A period of near-simultaneous expansion of three parallel clades at the time of the collapse of old Europe c. 4500-4000BC

7.6 (6.5-10.0) R1b-L51
7.4 (6.3-9.7) R1b-Z2105
7.2 (6.1-9.5) R1b-M269(xL23)

3. A period of secondary takeoff in west-central and western Europe c. 3600-3200BC

6.6 (5.6-8.6) R1b-L11
6.2 (5.3-8.2) R1b-P312
6.1 (5.2-8.0) R1b-U152
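
The BC dates attached to the three stages follow from the low end of each range. A minimal sketch of the conversion, assuming "years ago" is counted from roughly 2000 AD (an assumption; the post does not say which reference year is used):

```python
PRESENT_YEAR_AD = 2000  # assumed reference year for "years ago"

# Low end of each range (thousands of years), copied from the list above.
low_end_kya = {
    "M269": 7.3, "L23": 7.1,                       # stage 1
    "L51": 6.5, "Z2105": 6.3, "M269(xL23)": 6.1,   # stage 2
    "L11": 5.6, "P312": 5.3, "U152": 5.2,          # stage 3
}


def kya_to_bc(kya):
    """Convert an age in thousands of years before present to a year BC."""
    return round(kya * 1000) - PRESENT_YEAR_AD


dates_bc = {clade: kya_to_bc(age) for clade, age in low_end_kya.items()}
print(dates_bc)
```

With that reference year the low ends give 5300-5100 BC for stage 1, 4500-4100 BC for stage 2 and 3600-3200 BC for stage 3, which matches the ranges quoted for each stage.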

parasar
01-07-2014, 03:30 PM
Terminal
Y2392 = FGC7405

Total positive SNPs=1982
So if we go by an average of one mutation every third generation at 25 years per generation (75 years per mutation), separation from root ('African Adam') = 148650; from BT ('Eurasian Adam') = 112800; from MP (Papuan M1205) = 63750; from P = 48375 (potential calibration point with Mal'ta MA1, though coverage, quality, and scan differ).

L657 11475
Z93 12300
R1a 36450
R1 42225

The above dates look to be on good ground based on the new Cruciani paper that JaG posted.

Our dating results and phylogeography led to the following main conclusions: 1) patrilineal lineages with ages approaching those of early AMH fossils survive today only in central-western Africa; 2) only a few evolutionarily successful MSY lineages survived between 160 and 115 kya; 3) an early exit out of Africa (before 70 kya), which fits recent western Asian archaeological evidence, should be considered.

Jean posted this phylogeny:

http://www.anthrogenica.com/attachment.php?attachmentid=1173&d=1389101515

My calculation for BT was 112800ybp. Cruciani calculates a bit higher age (~114800ybp) for BT as his root is a little older.
I think his P triangle vertex (44000ybp) is also in the same ballpark as my calculations from both SNP counts from root and pre-R calibration with Mal'ta.

palamede
01-07-2014, 05:58 PM
I can compare this tree with the one I posted on molgen (Y-haplogroup B) yesterday morning.

Poznik et al. find an average mutation rate of 0.82 x 10^-9/bp/year (range 0.72-0.92 x 10^-9). For 9.99 Mbp this gives about 1 mutation every 122 years.

If we calibrate the tree in the paper of Poznik et al. 2013 by placing the bifurcation of Y-haplogroups F, G, H, IJ and K (with Mt hgs M and N) in the warm Odderade phase, with the 'Out of Arabia' population of South Asia reaching as far as Sunda (West Indonesia) and South China, 86-74,000 years ago, I obtain an average of about 1 mutation every 180 years. That corresponds to an average mutation rate of 0.555 x 10^-9/bp/year, very close to the 0.53 x 10^-9/bp/year of Francalacci et al. 2013.
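
Both conversions above are the same relation read in opposite directions: years per mutation = 1 / (rate per bp per year x number of bases). A minimal sketch, assuming the same 9.99 Mbp region applies to the 180-year figure (the post does not say so explicitly); the function names are illustrative.

```python
BASES = 9.99e6  # callable region stated for Poznik et al.


def years_per_mutation(rate_per_bp_per_year, bases=BASES):
    """Average number of years between mutations over the whole region."""
    return 1.0 / (rate_per_bp_per_year * bases)


def rate_from_interval(years, bases=BASES):
    """Mutation rate per bp per year implied by one mutation every `years` years."""
    return 1.0 / (years * bases)


poznik_interval = years_per_mutation(0.82e-9)   # about 122 years
recalibrated_rate = rate_from_interval(180)     # about 0.556e-9 per bp per year
print(round(poznik_interval), f"{recalibrated_rate:.3e}")
```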

Dating with 1 mutation every 180 years, I obtain:

- Separation of A1 and BT : about 212,000 years ago (near the birth of Mts L1 and L2'3'4'5'6)

- Separation of B and CT: about 155,000 years ago (near the birth of Mts L2 and L3'4'6). This would precede an "Out of Africa" of 130-125,000 years ago, while no CT sub-branches (except the E branch) are widely found in Africa, and it seems E was born in Arabia. But maybe CT was born in North-East Africa at the beginning of the Nubian Culture before 130,000 BP, one lineage migrated to Arabia/the Near East about 130,000 BP, and the remaining lineages disappeared in Africa, maybe with the return of E into Africa after 115,000 BP. I would point out that many archaeological traces of the Nubian Culture have been found in Arabia from about 110-100,000 years ago, and modern skulls in the Near East from 115,000-90,000 years ago. Many geneticists have refused to consider these, preferring the lower and more politically correct date of 70-55,000 BP (and even that date only because they were forced to it by Australian scientists' dating of the arrival of modern humans in Sahul (Australia and New Guinea) to more than 50,000 years ago), and they continue to refuse an arrival in America before the LGM despite the clearer archaeological and linguistic clues.

- Separation of B2a and B2b : about 105,000 years ago .

- Separation of DE and FT : about 115,000 years ago (near the birth of Mts L3 and L4) .

- Separation of E1b1a and E1b1b: about 75-70,000 years ago .

- Separation of F, G, H, IJ and K : about 88-80,000 years ago (near the birth of Mts M and N) .

- Separation of K, LT, MP, NO : about 82-75,000 years ago (near the birth of Mt R) .

- Separation of N and O: about 75-70,000 years ago. At the beginning (very contrasted between 75,000 BP and 68,000 BP) of the first pleniglacial phase.

- Separation of P and Q : about 59-55,000 years ago . After the end of the first pleniglacial phase (59,000BP)

- Separation of Q-L54 and Q-M3: about 40-35,000 years ago, maybe a little later (35-30,000) as the tree length is shorter. As Q-M3 was probably born in America, the arrival of haplogroup Q-L54 in America (with Mt hgs C1, D1 and X2; Mt hgs A and B arrived later) should be dated to the warmer Arcy oscillation (34-33,000 BP), at the same time as R1b and the Gravettian Culture entered Europe from Russia. It is possible that the first arrival of modern humans (Australoid type) was earlier, about 55,000 BP, with Mt-hg C3* and Mt-hg D1 (the rare South American C3* remains to be analysed to establish its separation date from the Asian and North American C3). I don't dare to imagine a migration before the first pleniglacial, about 77-70,000 BP, at the time of the Toba volcano explosion (73,000 BP).

Separation of A1 (in fact A1b in the Poznik tree) and BT (in fact A1a-T): about 212,000 compared to 195.7/213.5.

Separation of B and CT: about 155,000 compared to 114.8/120.2.

Separation of B2a and B2b: about 105,000 compared to 75.7/76.2. In fact 105,000 is more comparable to the separation of B1 and B2, at 108/110.1.

Separation of DE and CF: about 115,000, older than 85.5/87.9.

Separation of F, G, H, IJ and K: 88-80,000, older than 63.0/66.0.

My dates are 1.32, 1.38, 1.33 and 1.30 times older than Cruciani's dates, for an average factor of 1.33.
Their average mutation rate is therefore about 0.555 x 1.33 x 10^-9/bp/year = 0.74 x 10^-9/bp/year.
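
The four ratios and the implied rate can be reproduced from the figures above. The sketch assumes each ratio is taken against the midpoint of Cruciani's pair of dates, and uses the midpoint of 88-80 for the F, G, H, IJ, K split; both are inferences that happen to reproduce the posted values, not something stated in the post. The A1/BT comparison is left out, as it is in the posted list of ratios.

```python
# (date in this post, (Cruciani low, Cruciani high)), all in thousands of years
comparisons = {
    "B / CT": (155.0, (114.8, 120.2)),
    "B2a / B2b": (105.0, (75.7, 76.2)),
    "DE / CF": (115.0, (85.5, 87.9)),
    "F, G, H, IJ, K": ((88.0 + 80.0) / 2, (63.0, 66.0)),
}

ratios = {
    split: mine / (sum(theirs) / len(theirs))
    for split, (mine, theirs) in comparisons.items()
}
mean_ratio = sum(ratios.values()) / len(ratios)

# Rate implied for the Cruciani tree, in units of 10^-9 per bp per year.
implied_rate = 0.555 * mean_ratio

print({split: round(r, 2) for split, r in ratios.items()})
print(round(mean_ratio, 2), round(implied_rate, 2))
```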

palamede
01-07-2014, 06:13 PM
A problem with the Poznik and Cruciani trees is that the clade/subclade symbols for haplogroup A are from the old ISOGG nomenclature and not the recent one.

I guess their A1b could be the present A0 in ISOGG 2013 and 2014.
A1a remains A1a.
A2 becomes A1b1a1a.
A3b1 becomes A1b1b2a
The Nilotic A3b2 becomes A1b1b2b.

palamede
01-09-2014, 07:19 AM
Dienekes notes that the authors (Scozzari, Cruciani et al.) infer a Y chromosome mutation rate of 0.64 x 10^-9 via the autosomal mutation rate.

Starting from the Poznik tree, I calculated 0.74 x 10^-9 for the Scozzari tree. Either I made a mistake, or the SNP measurement of Poznik et al. is deeper than that of Scozzari, Cruciani et al. by about 0.74/0.64, i.e. about 15% (because Scozzari gets lower dates with a slower rate of 0.64, and I must add a ratio of 0.74/0.64, about 15%, to obtain the same bifurcation dates with the same rate) ... or both, a mistake and a deeper measurement!

parasar
01-09-2014, 05:47 PM
...
The higher number is in line with what I calculated before: "from P 48375 (potential calibration point with Mal'ta MA1, though coverage, quality, and scan differs).
L657 11475
Z93 12300
R1a 36450
R1 42225"


Calibrating by the lower number, I would get:
L657 7828
Z93 8390
R1a 24865
R1 28804
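
The rescaling here amounts to one proportional factor applied to every node. A rough sketch, assuming the lower calibration corresponds to an age of about 33,000 ybp for P (a figure that is only stated later in this thread) against the 48,375 used for the higher set; the dictionary names are illustrative:

```python
# Higher-calibration ages (ybp) quoted above.
higher = {"P": 48375, "R1": 42225, "R1a": 36450, "Z93": 12300, "L657": 11475}

# Assumption: the lower calibration puts P at roughly 33,000 ybp.
lower_p_age = 33000
scale = lower_p_age / higher["P"]

# Every node is scaled by the same factor.
lower = {clade: age * scale for clade, age in higher.items()}

for clade, age in lower.items():
    print(f"{clade}: {age:.0f}")
```

The scaled values agree with the figures posted above to within a year; the small differences presumably come from rounding.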

Dr. Klyosov has some numbers for some R1a1 branches:

3.5±0.5 R1a1a-Iran
4.975 R1a1a-dys393=14; 635=23-Iran4
3.375 R1a1a-dys390=24; 635=23-Iran8
3.075 R1a1a-gatah4=13; 635=23; 389ii=17-Iran5
3.075 R1a1a-dys438=10; 385a,b= 11,11-635=23-Iran4
3.025 R1a1a-gatah4=12; 635=23-Iran4
3.000 R1a1a-dys390=25; 635=24-Iran5
2.000 R1a1a-gatah4=12; 635=24-Iran4
1.950 R1a1a-gatah4=13; 635=23-Iran6
1.650 R1a1a-dys438=10; 635=22-Iran5

3.5±0.5 R1a1a1-xM458-Afghanistan
3.900 R1a1a1-xM458-AfghanUzbek3
3.425 R1a1a1-xM458-Hazara3(4)
2.675 R1a1a1-xM458-AfghanTajik13(16)
2.225 R1a1a1*-dys390=24-Afghan33(71)
2.150 R1a1a1*-dys390=25-Afghan17(29)
1.850 R1a1a1-xM458-Pashtun11(14)
1.675 R1a1a1*-dys393=14-SAfghan7(8)
1.450 R1a1a1-xM458-dys393=14- Pashtun8(9)

5.85±0.3 Z94  R1a1a1h1a-L342.2, ..., L349.2

5.9±1.4 R1a1a1h1a*-L342.2*-L657ng

4.350 R1a1a1h1a*-L342?-Kaz2Arm2TurKarach-6

http://aklyosov.home.comcast.net/~aklyosov/06_04_2013.pdf

parasar
01-09-2014, 05:48 PM
Though for some reason in his write-up on "Subclade R1a-L342-L657" he calls L657 young!
http://translate.google.com/ http://aklyosov.home.comcast.net/~aklyosov/6_3_2013.pdf

R M207
• R1 M173
• • R1a L62/M513, L146/M420
• • • R1a1 L120/M516, SRY10831.2
• • • • R1a1a M17, M198, M512, L168
• • • • • • R1a1a1 M417
• • • • • • • Z283 ( Eurasian branch )
• • • • • • • • Z282 ( Young Eurasian branch )
• • • • • • • • • M458 ( European branch )
• • • • • • • • • Z280 ( Central Eurasian branch )
• • • • • • • • • Z284 ( Scandinavian branch )
• • • • • • • Z93 ( South- eastern branch )
• • • • • • • • Z94 ( Young south- eastern branch )
• • • • • • • • • L342.2 ( Aryan branch )
• • • • • • • • • • L657 ( Young Aryan branch )
...


Dr. Klyosov gives his analysis as to why he thinks South Siberia/Andronovo etc. were L657:




Subclade L342.2 for several reasons it is advisable to call the " Aryan " on at least at the level of our current knowledge about it. according to
current concepts, it was formed in Europe about 4900 years ago , went to the Russian plain near the Dniester , and continued migration to the east along the southern steppe and forest-steppe zone. not possible that it was formed in the Dniester- Don region, from preceding it subclades Z94, dating which is still unclear , but probably between 5700 and 4900 years ago. According to some sources , there, in the Dniester- Don region , between 4900 and arias 4,500 years ago , were created the first chariots , and that gave them a speedy quality for subsequent movements


L657 ... the south, the Arabian Peninsula , and in the east, India. How could this happen?

There are at least two alternative explanations .
One - that L657 SNP mutation did not occur after the separation of the Aryans ...

The second option - the mutation L657 centuries gone one direction or to the south, through the Caucasus and Mesopotamia to the Middle East and Arabia either east to Central Asia and later in Iran , to the Urals , and later in India ...

L657 gone in only one direction, and then there was infiltration across the Indian Ocean ...

then common ancestor of these two subclades lived about (4900 +4050 +1175) / 2 = 5060 years ago , and that is the lifetime of the founder within L342.2 error calculations...

Armenia Turkey, in which no L657 ...

"north of the Mongolian border ... Keyser et al ( 2009 ) ... And here we come to the meat of this article . In 2009 came the work of Keyser et al., which identified the 17 - marker haplotypes of the three minerals bone remains Andronovo culture, dating to 3800-3400 years ago, that is when the L657 + should have already formed...

Two other Andronian haplotypes were identical in all 17 markers :
13 25 16 11 11 14 X X 11 32 10 14 - 15 14 20 16 23 11

The authors reported that they could not find the same haplotypes in
databases. From today's point of view, it is not surprising
since then L657 subclade was not known , and then haplotypes
was much less defined .
But now the first 12 alleles look very familiar. While the basic
haplotype L657 + on the first 12 markers is somewhat different,
but fossil haplotype , even in the Urals , should not be
necessarily base :
13 25 16 11 11 14 12 12 10 13 11 30 ( base haplotype subclades L657)
At the same time, among haplotypes subclades L657 + are replete with such
a pair 14-32 (DYS389):
13 25 16 11 11 14 12 12 10 14 11 32 India ( Brahmin )
13 25 15 11 11 14 12 12 10 14 11 32 India
13 25 15 10 11 14 12 13 10 14 11 32 Iran
13 25 16 11 11 13 12 12 11 14 11 32 UAE
13 25 15 10 11 14 12 12 10 14 11 32 Arab
13 25 15 10 11 14 12 12 10 14 11 32 Arab
13 25 15 11 11 14 12 12 10 14 11 32 Bahrain
13 24 15 10 11 14 12 12 10 14 11 32 Saudi Arabia
13 25 16 11 11 14 X X 11 32 10 14 ( fossil haplotype )

Indian Brahmin haplotype is identical to fossil first panel haplotype and all the rest - a characteristic triple 14.11.32 in the last three markers.

If fossil haplotype - really L657 +, on the basis that data presented here is very likely that in the range of 3800-3400 R1a-L342.2-L657 + has proved beyond the Urals , which is expected since about 3500 years ago, their brothers went to India.

In case subclade L657 + hit in the Middle East really across the Indian Ocean, and it is now clear his absence in Armenia Turkey, Egypt .

L342.2 + / L657 + L342.2/L657- along with India, Iran, Pakistan and Kazakhstan suggests that carriers of both lines migrated together, and both survived

...


So Dr. Klyosov's theory essentially is:
1. Z94 born in Europe.
2. L342.2 goes south as well as crosses the Urals.
3. Andronovo was L657
4. L657+ and L342.2+L657- both enter India.
5. L657+ crosses the Indian Ocean to Arabia and neighborhood.

For #3 he relies on the Indian Brahmin sequence matching the Andronovo one, on some other cherry-picked samples with 14,32, and on the remnant Kazakh L657+.

I think Dr. Klyosov's analysis is wrong (and contrived to show L657 entering India in the theorized Aryan timeframe), and that Andronovo matching the Brahmin is just a coincidence, as I have indicated elsewhere (http://eng.molgen.org/index.php):


...
Based on a report by Ryschev from a Poster Presentation it appeared that the Andronovo R1a1 was a match to a Tanmoy Bhattacharya from Calcutta, and therefore in Oct 2007 I had sent him an email on the possible match.

“Those from the Andronovo culture (2300–1000 BC), often associated with the Indo-Iranian innovations, had (this is not in the paper: it was reported in a discussion forum)
DYS 19:16 385a:11 385b:14 389I:14 389II:32 390:25
391:11 392:11 393:13 437:14 438:11 439:10
which is an exact match for me.” http://tanmoy.tripod.com/geneology.html

The Initial Paper: First successful assay of Y-SNP typing by SNaPshot minisequencing on ancient DNA C. Bouakaze & C. Keyser & S. Amory & E. Crubézy & B. Ludes
Received: 31 October 2006 /Accepted: 7 May 2007

From Poster and Ysearch:

Afanassievo culture
yDNA - no data

Andronovo culture
yDNA – R1a1
16-11,14-14-32-25-11-11-13-14-11-10
(DYS19-385a,b-389I-389II-390-391-392-393-437-438-439) - 2 sample
User ID Last Name Origin 393, 390, 19, 391, 385, 426, 388, 439, 389i, 392, 389ii
NFVJH Bhattacharya Kolkata, India 13 25 16 11 11 14 12 12 10 14 11 32
B87RC Bukowsky Essex, England 13 25 16 11 11 14 12 12 10 14 11 32
YGZS4 Hemskey Poland 13 25 16 11 11 14 12 12 10 14 11 32

Tagar culture
yDNA – R1a1
17-11,14-13-31-24-11-11-13-14-11-10
16-11,14-13-31-24-11-11-13-14-11-10
16-11,14-14-31-25-11-11-13-14-11-10
17-11,14-13-31-24-11-12-13-14-11-10
(DYS19-385a,b-389I-389II-390-391-392-393-437-438-439)

Tachtyk culture
yDNA – R1a1
17-11,14-13-31-24-11-11-13-14-11-10
(DYS19-385a,b-389I-389II-390-391-392-393-437-438-439)

While the first 12 (or actually 10) did match, it appears now that that match was coincidental.

Keyser published STR data in a following paper:
Sample DYS19 DYS385 DYS389I DYS389II DYS390 DYS391 DYS392 DYS393 DYS437 DYS438 DYS439 DYS448 DYS456 DYS458 DYS635 YGATA Haplogroup
S07 15 12/13 14 30 22 9 12 14 14 10 11 19 15 16 22 11 C(xC3)
S10/S16 16 11/14 14 32 25 11 11 13 14 11 10 20 16 15 23 12 R1a1
S24/S34 17 11/14 13 31 24 11 11 13 14 11 10 20 16 15 23 13 R1a1
S25 – 11/14 13 31 24 11 11 13 14 11 10 20 16 15 23 – R1a1
S26 16 11/14 13 31 24 11 11 13 14 11 10 20 16 15 23 13 R1a1
S28 16 11/14 14 31 25 11 11 13 14 11 10 20 16 15 23 12 R1a1
S29 – 11/14 14 31 25 11 11 13 14 11 – – 16 15 23 12 R1a1
S32 17 11/14 13 31 24 11 12 13 14 11 10 20 16 15 23 13 R1a1
http://www.hamagmongol.narod.ru/library/keyser_2009_e.pdf

My guess is that the S. Siberian folk could be Z93+, but I doubt they are L657+.
All the ancient S. Siberians look to be derivatives of: 16, 11/14, 13, 31, 24, 11, 11, 13, 14, 11, 10, 20, 16, 15, 23, 12

Rearranging in FTDNA order we have (with a couple of interpolations for 388, 426):

393, 390, 19, 391, 385, 426, 388, 439, 389i, 392, 389ii, 458.437.448.GATA.456.438.635

13, 24, 16, 11, 11/14, 12, 12, 10, 13, 11, 31, 15...14...20...12...16...11...23

What we find for the most part from FTDNA R1a1 and Subclades project dataset:

For L657
DYS456=15
DYS458=16

But for Z93+, L657-
DYS456=16
DYS458=15

I therefore feel that the Krasnoyarsk samples are not L657+.
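
The DYS456/DYS458 check can be written out explicitly. This is a minimal sketch, not a validated classifier: the rule is only the tendency described above ("for the most part") in the project dataset, the marker values are copied from the Keyser et al. table for the R1a1 samples, and the names `classify` and `keyser_r1a1` are hypothetical.

```python
def classify(dys456, dys458):
    """Apply the DYS456/DYS458 tendency described above (heuristic only)."""
    if (dys456, dys458) == (16, 15):
        return "Z93+ L657- pattern"
    if (dys456, dys458) == (15, 16):
        return "L657 pattern"
    return "unclear"


# (DYS456, DYS458) for the ancient R1a1 samples in the Keyser et al. table.
keyser_r1a1 = {
    "S10/S16": (16, 15),
    "S24/S34": (16, 15),
    "S25": (16, 15),
    "S26": (16, 15),
    "S28": (16, 15),
    "S29": (16, 15),
    "S32": (16, 15),
}

results = {sample: classify(*values) for sample, values in keyser_r1a1.items()}
for sample, label in results.items():
    print(sample, label)
```

Every ancient R1a1 sample falls on the Z93+ L657- side of this two-marker rule. Two STRs alone cannot prove SNP status, so this only supports the impression stated above.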

On the other hand, based on additional STRs, Mr. Bhattacharya looks to be likely L657+, Y7+:
DYS 393 390 19/394 391 385a* 385b*
value 13 25 16 11 11 14
DYS 426 388 439* 389-1 392 389-2
value 12 12 10 14 11 32
DYS 458* 459a 459b 455 454 447
value 17 10 10 11 11 24
DYS 437 448 449* 464a* 464b* 464c*
value 14 20 32 15 15 16
DYS 464d* 460 GATAH4 YCAIIa YCAIIb 456*
value 16 12 12 19 23 15
DYS 607 576* 570* CDYa* CDYb* 442
value 16 19 18 36 42 14
DYS 438
value 11
http://tanmoy.tripod.com/Ychrom.html

Therefore it appears that he is not as closely related to the S. Siberian Andronovos as initially thought

parasar
01-14-2014, 07:03 PM
http://biorxiv.org/content/biorxiv/suppl/2014/01/13/001792.DC1/001792-1.pdf

Looks like this paper did not calibrate R age with the Mal'ta boy. Plus they have a problematic phylogeny, perhaps due to lack of sufficient HGDP samples with R*(xM479, xM173).


Bayesian tree for sequences from NRY haplogroup R. This haplogroup has an estimated age of 21 ky based on the HGDP sequences, slightly younger than previous estimates of about 27 ky [14]. Whereas in the SNP-based tree of NRY haplogroups R* is ancestral to other R haplogroups, in the HGDP tree the R* sequences diverged from R2 sequences about 11 kya, while R1a1 sequences diverged from R1b1, R1b1b1, and R1b1b2 sequences about 18 kya. R* and R2 sequences are almost exclusively from central Asia, while R1a1 and R1b1 sequences are more widespread and include central Asia, the Middle East, and Europe (mostly Adygei and Russians from eastern Europe). The R1b1b1 sequences are only from the Hazara, and diverged from R1b1b2 sequences (which are almost exclusively European) about 11 kya. The diversity within R1b1b2 dates to about 8 kya, so the age and geographic distribution of subhaplogroup R1b1b2 suggest a possible spread to Europe during the Neolithic [38]

palamede
01-15-2014, 12:53 PM
R* and R2 sequences are almost exclusively from central Asia, while R1a1 and R1b1 sequences are more widespread and include central Asia, the Middle East, and Europe (mostly Adygei and Russians from eastern Europe). The R1b1b1 sequences are only from the Hazara,

Apart from the dates, which leave me sceptical, the geographical sentences are full of asinine statements. The great majority of geneticists should never speak of geography and history; they lack general knowledge, and even in genetic geography their knowledge is weak.
Nevertheless, they should have verified what they claim.

For R2, they ignored the Indian Peninsula and the Middle East.
For R1b1b1-M73, it exists among the Hazaras, but mainly among Bashkirs and Tatars. It also exists among Balkars and Karachais in the North Caucasus.

There is a lot of R1 among Russians, but increasing from 35% in the North to 60% in the South West.
The funniest part is that, wanting to be erudite, they cited the Adygei, who are less than 20% R1.


and diverged from R1b1b2 sequences (which are almost exclusively European)

They ignored that R1b1b2-M269 is present in about 15% of Turks, 30% of Armenians, and 25% of Assyrians and Alawites, with a non-negligible presence in North Iran, especially among the Gilaki and Luri. There is some presence in the North-East Caucasus and the South-East Urals, and about 10% along the Mediterranean coasts of North-West Africa.
They ignored R1b-V88 in Africa, with very high frequencies in the Chad Basin (90% in some small ethnic groups, and probably more than 10 million males in Africa), especially the 20 to 40% in the large Hausa ethnic group.

parasar
01-25-2014, 02:02 AM
Apart from the dates, which leave me sceptical, the geographical sentences are full of asinine statements. The great majority of geneticists should never speak of geography and history; they lack general knowledge, and even in genetic geography their knowledge is weak.
Nevertheless, they should have verified what they claim.

For R2, they ignored the Indian Peninsula and the Middle East.
For R1b1b1-M73, it exists among the Hazaras, but mainly among Bashkirs and Tatars. It also exists among Balkars and Karachais in the North Caucasus.

There is a lot of R1 among Russians, but increasing from 35% in the North to 60% in the South West.
The funniest part is that, wanting to be erudite, they cited the Adygei, who are less than 20% R1.

...
You are being charitable! It is not an intellectually dishonest paper like that Sharma paper (The Indian origin of paternal haplogroup R1a1*...), but displays a pervasive lack of curiosity on part of the authors. A simple search would have shown them that all their R* samples were found by Myres et al to be R2-M479. A review of YFull team's tree would have given them an idea of many new SNPs so that they could have avoided that embarrassment of an R tree.

Their R1a1 tree is incomprehensible, but Semargl has made some sense out of it here - http://eng.molgen.org/download/file.php?id=548&mode=view


Anyway, even from this mess of a paper, a few points relevant to R come across:

1. Their R* age of 21 ky is actually the age of R2-M479. This comports well with the age of the pre-R derivative from Mal'ta

2. From Semargl's analysis we see that Sindhis are mainly L657,Y9; Baloch and Brahui are L657,Y8; Pathans are mixed in their R1a1 types; Burusho are Y40. There are a couple of Uyghur samples that are L657,Y9, which could be interesting in light of their proximity to the Tarim mummies.
http://books.google.com/books?id=xH1agYR4w40C&pg=PA117

BMG
01-25-2014, 02:51 AM
From Semargl's analysis we see that Sindhis are mainly L657,Y9; Baloch and Brahui are L657,Y8; Pathans are mixed in their R1a1 types; Burusho are Y40. There are a couple of Uyghur samples that are L657,Y9, which could be interesting in light of their proximity to the Tarim mummies.


Where did u get these details ?

parasar
01-25-2014, 03:15 AM
Where did u get these details ?

http://eng.molgen.org/viewtopic.php?f=77&t=1412&p=19673&hilit=BUR_341#p19673

Mehrdad
01-25-2014, 04:16 AM
You are being charitable! It is not an intellectually dishonest paper like that Sharma paper (The Indian origin of paternal haplogroup R1a1*...), but displays a pervasive lack of curiosity on part of the authors. A simple search would have shown them that all their R* samples were found by Myres et al to be R2-M479. A review of YFull team's tree would have given them an idea of many new SNPs so that they could have avoided that embarrassment of an R tree.

Their R1a1 tree is incomprehensible, but Semargl has made some sense out of it here - http://eng.molgen.org/download/file.php?id=548&mode=view


Anyway, even from this mess of a paper, a few points relevant to R come across:

1. Their R* age of 21 ky is actually the age of R2-M479. This comports well with the age of the pre-R derivative from Mal'ta

2. From Semargl's analysis we see that Sindhis are mainly L657,Y9; Baloch and Brahui are L657,Y8; Pathans are mixed in their R1a1 types; Burusho are Y40. There are a couple of Uyghur samples that are L657,Y9, which could be interesting in light of their proximity to the Tarim mummies.
http://books.google.com/books?id=xH1agYR4w40C&pg=PA117

Thanks Parasar, and at the same time we see Y6 and Y7 amongst south asians and arabs alike. Makes me wonder if they (Y6 and Y7) were born in either Sindh or Baluchistan.

BMG
01-25-2014, 05:54 PM
Thanks Parasar, and at the same time we see Y6 and Y7 amongst south asians and arabs alike. Makes me wonder if they (Y6 and Y7) were born in either Sindh or Baluchistan.
Not necessarily. We know parasar himself is Y9+ Y7- and we need more results from that region. But I agree that Arab Y7/Y6 would probably be from Sindh/Baluchistan.
By the way, are you Y7+?

Mehrdad
01-25-2014, 06:07 PM
Not necessarily. We know parasar himself is Y9+ Y7- and we need more results from that region. But I agree that Arab Y7/Y6 would probably be from Sindh/Baluchistan.
By the way, are you Y7+?

Yes sir, I'm Y7+

BMG
01-25-2014, 06:35 PM
I would like to test myself for Y7/Y6 through but the cost seems to be a little too high at $39 for testing a single SNP .Waiting for them to reduce the price

parasar
01-25-2014, 07:55 PM
I would like to test myself for Y7/Y6 through but the cost seems to be a little too high at $39 for testing a single SNP .Waiting for them to reduce the price

You could request Thomas Krahn to create a primer. http://shop.yseq.net/product_info.php?products_id=108
Unfortunately the price at his new company has gone up to $35 for each SNP test.

parasar
01-25-2014, 08:40 PM
Thanks Parasar, and at the same time we see Y6 and Y7 amongst south asians and arabs alike. Makes me wonder if they (Y6 and Y7) were born in either Sindh or Baluchistan.

The Sindh/Baluchistan/Gujarat region has the highest R1a1 diversity - all due to Z93/Z94. If that had all been due to L657 (and not also Y40 and Z2124), then that region would have been the best candidate for the origin of L657. Right now I think the Indus was where Z2124 and Y40 were born, and L657 perhaps a little more to the east.

If you look at the distribution all over the subcontinent it is almost impossible to come to any clear conclusion.
L657 and Y40 https://docs.google.com/file/d/0B-b3KeGG3Un1VUJTSU12UUVkXzQ/edit
Z2124 https://docs.google.com/file/d/0B-b3KeGG3Un1RXEweGJsYUlrNHc/edit

I have been totally wrong in my thinking that Y-R was born in Europe (unless by some chance Kostenki 14 tests R) as ancient DNA is showing absence of R there even in the neolithic, so perhaps ancient DNA may be the only reliable indicator. In light of the Baikal Mal'ta R, I am hoping that these Chinese, Russian, and Mongol R1a1 can be revisited and tested for downstream markers.
http://link.springer.com/article/10.1007%2Fs00439-009-0683-0 "nearly all subjects belong to haplogroup R1a1-M17"
http://onlinelibrary.wiley.com/doi/10.1002/ajpa.21242/abstract "a male of distinctive Indo-European lineages (R1a1) was present in the Xiongnu of Mongolia"
http://www.biomedcentral.com/1741-7007/8/15 "The Y chromosome haplogroup of the seven males were all assigned to haplogroup R1a1a"

Mehrdad
01-25-2014, 09:55 PM
I would like to test myself for Y7/Y6 through but the cost seems to be a little too high at $39 for testing a single SNP .Waiting for them to reduce the price

I'm like you, I'm waiting for the Big Y test to go on sale before I purchase it. Must be part of our genes - finding the best deals before putting down money for it :D

I'm pretty sure our ancestral genes are part of our intuition; it's like a depository of feelings and intelligence, like that of the "Supreme Intelligence" of the Kree empire (Marvel Comics).

Mehrdad
01-25-2014, 11:51 PM
Interesting, so the theory that R1 was from the Steppe still holds? I thought I read something on the old DNA forums about this.

I'm still baffled that R1 didn't make a huge dent in the Americas.

alan
01-26-2014, 12:21 PM
I think, though, that placing the R* Mal'ta boy in context, as a very late person in the south-central Siberian middle Upper Palaeolithic culture (duration c. 30000-22000BC, and possibly descended in even deeper time from the early Upper Palaeolithic south-central Siberian culture c. 40000-30000BC), kind of rules out Kostenki from being an R guy. He may have lived about 30000BC but he was from a different cultural group. As far as I can understand, his cultural group and the one Mal'ta belonged to are different and may not have any sort of shared ancestry until 40000BC or earlier, which is pretty well back to the beginning of modern human settlement of north Eurasia. That makes it very unlikely that they are from the same haplogroup branch IMO, and they probably don't share ancestry until the IJK period or earlier. In fact the famous Kostenki man is from a pre-Gravettian layer, so I don't even think he was haplogroup I, which is perhaps best linked with the Gravettian culture. Pre-Gravettian cultures I think probably had haplogroups which are very rare today.



The Sindh/Baluchistan/Gujarat region has the highest R1a1 diversity - all due to Z93/Z94. If that had all been due to L657 (and not also Y40 and Z2124), then that region would have been the best candidate for the origin of L657. Right now I think the Indus was where Z2124 and Y40 were born, and L657 perhaps a little more to the east.

If you look at the distribution all over the subcontinent it is almost impossible to come to any clear conclusion.
L657 and Y40 https://docs.google.com/file/d/0B-b3KeGG3Un1VUJTSU12UUVkXzQ/edit
Z2124 https://docs.google.com/file/d/0B-b3KeGG3Un1RXEweGJsYUlrNHc/edit

I have been totally wrong in my thinking that Y-R was born in Europe (unless by some chance Kostenki 14 tests R) as ancient DNA is showing absence of R there even in the neolithic, so perhaps ancient DNA may be the only reliable indicator. In light of the Baikal Mal'ta R, I am hoping that these Chinese, Russian, and Mongol R1a1 can be revisited and tested for downstream markers.
http://link.springer.com/article/10.1007%2Fs00439-009-0683-0 "nearly all subjects belong to haplogroup R1a1-M17"
http://onlinelibrary.wiley.com/doi/10.1002/ajpa.21242/abstract "a male of distinctive Indo-European lineages (R1a1) was present in the Xiongnu of Mongolia"
http://www.biomedcentral.com/1741-7007/8/15 "The Y chromosome haplogroup of the seven males were all assigned to haplogroup R1a1a"

alan
01-27-2014, 12:11 PM
One reason, which I have just outlined in the ANE and IE thread, is that at the latitude where Mal'ta boy was living, the steppe-tundra habitat his family were using came to an end not far east of Baikal, and the route further east was blocked by desert until environmental changes after the LGM, around 7000 years after he died. We could then ask why R didn't move east once the LGM was over, like Q seems to have. All I can offer is that Mal'ta boy was found on the west side of the very large Lake Baikal, and between it, mountains to the south, harsh conditions to the north etc., an eastward move may have been a bit long-winded. For the entire period c. 25000BC to perhaps 16000BC, the only 'out' from the area west of Baikal that retained a similar steppe-tundra environment without a large detour would have been west. Perhaps that 9000-year period simply meant that they established traditions of movement that didn't involve moving east of Baikal.

Q on the other hand might have been located in south Altai judging from the best matches with native Americans and from the very long period of microblade cultures there. That position may have allowed a movement into east Asia through Mongolia. You do hear people saying that the only good route from Siberia south across the entire area from Afghanistan to the Pacific was the Dzungarian gate which is in Altai. So perhaps it was simply a more natural route for them to reach the more attractive areas of Pacific east Asia from.


Interesting, so the theory that R1 was from the Steppe still holds? I thought I read something on the old DNA forums about this.

I'm still baffled that R1 didn't make a huge dent in the Americas.

parasar
03-12-2014, 04:26 AM
Yes sir, I'm Y7+

A Y7+ (243297 Kutbi http://www.familytreedna.com/public/r-arabia/default.aspx?section=yresults ) has his Big Y result as reported by Michal:


The only known Big Y result for the largest branch of R1a (Z93) is for kit 243297 from clade L657. He was initially classified in our project as a non-clustered member of category 9.C (9.C*). However, as has already been noted by Semargl in another thread, the Big Y results place kit 243297 in a large subclade Y7 of clade Y9 (thus in category 9.C2A* in our project). More specifically, he is a member of a relatively large subclade within Y7 called Y30.

In accordance with his known L657 membership, his "known SNPs" detected in Big Y include, among others, Z93/F992, Z94/F3105 and L657.

His list of high quality "novel variants" includes 138 positions, but after eliminating all SNPs that are either non-reliable (probably false positive) or upstream of Z93, we are left with 23 "novel variants" only. However, this still includes some known SNPs at the Z93 level:

18985344 (Z2479)

at the L657 level:

6810936 (M605)
8439885 (Y27/M634)
17533368 (Y2/M723)
21610995 (Y26/M780)
22479856 (Y13)
22479878 (previously found in our FGC-tested colleague parasar, see: viewtopic.php?f=77&t=463&start=176)

and downstream of L657:

18083234 (Y30)

Thus, after excluding all above known SNPs, the remaining group includes only 15 novel SNPs (probably downstream of L657, and mostly downstream of Y30). We are still waiting for a permission to reveal these new SNPs on this forum.
http://eng.molgen.org/viewtopic.php?f=77&t=1469
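
The bookkeeping in the quoted report can be checked with a short tally. This is just a sketch that counts the positions listed above; the grouping labels and variable names are illustrative, and the figure of 23 variants remaining after filtering is taken directly from the report.

```python
# Positions listed above as already-known SNPs among the remaining variants.
known = {
    "Z93 level": [18985344],
    "L657 level": [6810936, 8439885, 17533368, 21610995, 22479856, 22479878],
    "downstream of L657": [18083234],
}

# Variants left after removing unreliable and upstream-of-Z93 positions.
remaining_after_filtering = 23

known_count = sum(len(positions) for positions in known.values())
novel_count = remaining_after_filtering - known_count

print(f"known SNPs among the remainder: {known_count}")
print(f"truly novel SNPs: {novel_count}")
```

Eight known positions out of 23 leave the 15 novel SNPs mentioned in the report.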

parasar
03-12-2014, 04:30 AM
A Y7+ (243297 Kutbi http://www.familytreedna.com/public/r-arabia/default.aspx?section=yresults ) has his Big Y result as reported by Michal:

http://eng.molgen.org/viewtopic.php?f=77&t=1469

He is Y30+
Position of Y30 https://docs.google.com/file/d/0B-b3KeGG3Un1ZTA4eEdZdTJzQXM/edit?pli=1

Mehrdad
03-12-2014, 03:00 PM
He is Y30+
Position of Y30 https://docs.google.com/file/d/0B-b3KeGG3Un1ZTA4eEdZdTJzQXM/edit?pli=1

This is really interesting, I'm starting to wonder if the Y9, Y7 and Y6 found in Arabia occurred right after the decline of the Harappan civilization, where one group went into the Near East since there were already established relationships with the people of that region.

Mehrdad
03-12-2014, 03:36 PM
He is Y30+
Position of Y30 https://docs.google.com/file/d/0B-b3KeGG3Un1ZTA4eEdZdTJzQXM/edit?pli=1

Hi parasar, I'm curious about your results from both the Big Y and the Full Genome.

parasar
03-13-2014, 01:25 AM
Hi parasar, I'm curious about your results from both the Big Y and the Full Genome.

Mehrdad,
I have not tested with Big Y. A summary of my Full Genome test is here:
http://www.anthrogenica.com/showthread.php?1707-Relative-finder-distribution-of-South-Asian-Caste-and-Tribal-Haplogroups&p=23656&viewfull=1#post23656

As of now my terminal SNP under Y9 is Y2392.
https://docs.google.com/spreadsheet/ccc?key=0Alkh_vXxrFrOdENKdWZHZ2VHak1Mc29MUlg4RlJ2S3c&usp=drive_web#gid=0
I share Y2392 with a Telugu sample: http://www.yfull.com/tree/R-Y2392/

Lappa had compared Big Y and Full Genome for another subgroup - a R1a1-M458 sample:


You can see the difference between FTDNA's Big Y and FGC in the result of #208920 (an FGC user).

If he had been tested at FTDNA, the following SNPs would not have been discovered:
3250662 (FG1217)
3357346 (FG1218)
4179805 (FG1219)
5023164 (FG1220)
5131016 (FG1221)
5388254 (FG1222)
6430129 (FG1223)
6539506 (FG1224)
7008383 (FG1226)
10065992 (FG1229)
13397116 (FG1230) less rel.?
13699255 (FG1231) less rel.?
15699369 (FG1235)
18054203 (FG1238)
21474211 (FG1242)

which are outside the range scanned by Big Y.

Big Y also did not test the following SNPs near the L1029 level (?):
3743294 (Z2950)
5073448 (Z2949)
15995561 (Z2932)
17785637 (Z2936)
23448179 (Z2943)
26526756 (Z2946)

and gave bad results of:
9162901 (Z2921)
9382910 (Z2922)
18873388 (Z2937)

So it's clear that Big Y is half the price of FullGenomes, but it also looks like it gives ... half the results.
http://eng.molgen.org/viewtopic.php?f=77&t=1460&p=20049&hilit=208920#p20049

Dr_McNinja
05-02-2014, 11:03 AM
So do you guys completely disagree with the old picture of R1a evolving around Central Asia and then branching into Z93, Z94 on its way south, then Z2124 east in South Central Asia and L657 southwest in southern Iran/Persian Gulf and then moving east to India? You think L657 and Z2124 evolved within India and moved out from there?

My uncle's kit is L657. He's in both R1a project pages, kit number is 311047. His Y-STR pattern seems a little more rare, have you seen it anywhere else among L657 people? There are some Arabs with 14 @ DYS393. Or is that specific marker not that relevant?

Mehrdad
05-02-2014, 02:59 PM
I'm more towards L657 and Z2124 evolving in either the Sindh or Gujarat areas of South Asia and then migrating out of that region.

parasar
05-02-2014, 04:59 PM
So do you guys completely disagree with the old picture of R1a evolving around Central Asia and then branching into Z93, Z94 on its way south, then Z2124 east in South Central Asia and L657 southwest in southern Iran/Persian Gulf and then moving east to India? You think L657 and Z2124 evolved within India and moved out from there?

My uncle's kit is L657. He's in both R1a project pages, kit number is 311047. His Y-STR pattern seems a little more rare, have you seen it anywhere else among L657 people? There are some Arabs with 14 @ DYS393. Or is that specific marker not that relevant?

More likely Y7 (or at least Y9, Y7+ tend to have DYS447<25) than Y6. I have a feeling that your uncle is in a separate clade from this Arab Y7 (al-Hashmi) who belongs to the DYS459=10,10 clade under Y7. I'm DYS459=9,9 sharing with Sandhu from the Punjab and one Afghan (161626). The Afghan happens to be DYS385=11,15 so I think we may share a relatively recent ancestor.

Two items are needed to really nail down the origin of these lines.
1. Ancient DNA (researchers already have a lot of Siberian and Chinese ancient R1a1 DNA so it should be relatively easy to test those again).
2. A thorough review of central and north Indian tribes at a much higher number and SNP resolution.

parasar
05-30-2014, 03:22 PM
...
...

Calibrating by the lower number, I would get:
L657 7828
Z93 8390
R1a 24865
R1 28804

Based on some of the new papers with L657 lines compared, it appears the above numbers are pretty reasonable.
From the above we get the age of P as 33000ybp.

Calibrating with Anzick-1 below we have an almost identical number.
35 mutations in 12600 years or 360years/mutation.
Total mutations from the split 91.
Approximate age of P 91x360=32760ybp from "P-M45 subtree, represented by 16 carriers of the M45 SNP"
http://www.nature.com/nature/journal/v506/n7487/images/nature13025-sf2.jpg
http://www.nature.com/nature/journal/v506/n7487/extref/nature13025-s1.pdf


The Pakhtoon sample above, 29. Pathan HGDP00243 R-L657, has 105 mutations for 32760 years, or 314 years per mutation for this L657 line (a little different from 360 years/mutation for Anzick-1, but in the same ballpark).
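
The Anzick-1 calibration is a simple rate calculation and can be laid out step by step. The sketch below uses only the figures quoted in this post; the variable names are illustrative. Note that 32,760 / 105 gives 312 years per mutation, so the 314 quoted above appears to correspond to rounding the age of P up to 33,000 first.

```python
# Anzick-1 calibration: mutations accumulated over the age of the sample.
anzick_age_years = 12600
mutations_in_interval = 35
years_per_mutation = anzick_age_years / mutations_in_interval

# Age of P from the total mutation count since the split.
mutations_since_split = 91
age_of_p = mutations_since_split * years_per_mutation

# Same age spread over the 105 mutations of the Pathan L657 line.
pathan_mutations = 105
pathan_years_per_mutation = age_of_p / pathan_mutations

# Variant using the rounded age of P (33,000 ybp).
pathan_years_per_mutation_rounded = 33000 / pathan_mutations

print(f"years per mutation (Anzick-1): {years_per_mutation:.0f}")
print(f"age of P: {age_of_p:.0f} ybp")
print(f"Pathan line: {pathan_years_per_mutation:.0f} years per mutation")
print(f"Pathan line, P at 33,000: {pathan_years_per_mutation_rounded:.0f}")
```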
Underhill sequenced the same sample - 13. HGDP00243 3.6x Pakistani: Pashtun R1a-L657, 10-Mb - below (167+6+28=201) downstream from R1. The divergence time of ∼25 000 calculated by Underhill for R1 looks to be a slight underestimate.

http://www.nature.com/ejhg/journal/vaop/ncurrent/images/ejhg201450f5.jpg


Overall picture from Poznik et al:
http://stanfordcehg.files.wordpress.com/2014/02/f2-large.jpg

parasar
06-03-2014, 02:13 PM
More likely Y7 (or at least Y9, Y7+ tend to have DYS447<25) than Y6. I have a feeling that your uncle is in a separate clade from this Arab Y7 (al-Hashmi) who belongs to the DYS459=10,10 clade under Y7. I'm DYS459=9,9 sharing with Sandhu from the Punjab and one Afghan (161626). The Afghan happens to be DYS385=11,15 so I think we may share a relatively recent ancestor.

Two items are needed to really nail down the origin of these lines.
1. Ancient DNA (researchers already have a lot of Siberian and Chinese ancient R1a1 DNA so it should be relatively easy to test those again).
2. A thorough review of central and north Indian tribes at a much higher number and SNP resolution.


the patrilineal gene pool revealed the presence of different haplogroups (Q1a2a1-L54, R1a1a1b2-Z93 and C)

http://www.anthrogenica.com/showthread.php?2649-Strong-genetic-admixture-in-the-Altai-at-the-Middle-Bronze-Age&p=41660&viewfull=1#post41660
Apparently, that Y-C Keyser found in her 2009 paper still remains unresolved for lower clades. [Edit: This looks to be a different sample-set]


....
Now with Z93, it would seem that the European matches may just be coincidences, while the Baikal ones are real.

http://bialczynski.files.wordpress.com/2010/09/aa-scyci-dzisiaj-gdzie-mieszkaja-ystr.jpg
http://www.hamagmongol.narod.ru/library/keyser_2009_e.pdf

Plus, as I reasoned before, none of these are L657+
http://www.anthrogenica.com/showthread.php?1507-Some-provisional-calculations-for-haplogroup-R1a-based-on-the-first-FGC-result&p=26441&viewfull=1#post26441

The absence of L657 (M780 below) in that region was also seen in Underhill:
http://www.nature.com/ejhg/journal/vaop/ncurrent/images/ejhg201450f3.jpg

Michał
08-10-2014, 04:30 PM
Calibrating with Anzick-1 below we have an almost identical number.
35 mutations in 12600 years or 360years/mutation.
Total mutations from the split 91.
Approximate age of P 91x360=32760ybp from "P-M45 subtree, represented by 16 carriers of the M45 SNP"

As you know, my previous estimates (based on two different sets of data) have suggested a significantly older age for haplogroup P (about 38-40 ky (http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting/page9&p=26002#post26002)), so I was wondering if this difference could be related to a very specific set of data you were using, or maybe to the way you have performed your calculations. It seems that both these things could have contributed to the observed discrepancy.

For example, instead of comparing Anzick-1 to just one (the most closely related) modern sublineage of Q-L54 (showing 91 SNPs downstream of P), I would rather compare it to the average number of SNPs found in all sublineages of haplogroup P that were analysed in that paper (which would be 85.6, according to my calculations). After deducting the number of SNPs found in Anzick-1 (56) from that average value, we get 29.6 SNPs per 12,600 years, or 426 years per each SNP tested in that study. When multiplying those 426 years by the average number of 85.6 SNPs under hg P, we get 36,465 years to a common ancestor of all those hg P members.

Another thing that makes me suspect that the above age for haplogroup P could be an underestimation is related to a very striking observation that the only R1a lineage used in that study showed many more SNPs under P (105) than any of the multiple Q lineages studied (all ranging from 67 to 96, with an average value of 84.2). This suggests that haplogroups Q and R show a significantly different average number of SNPs under P, so each age estimate for haplogroup P that was based on haplogroup Q only (or mostly), instead of using a more balanced group of sublineages, could lead to a significantly underestimated result.

Having all above in mind, I would consider it most likely that the "true" age of haplogroup P is between 36 and 41 ky, or close to 38.5 ky.
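
The averaged calculation above can be reproduced step by step. This is a sketch only: the variable names are hypothetical, and the inputs (85.6, 56, 12,600) are taken from the post rather than recounted from the paper.

```python
# Sketch of the averaged Anzick-1 calibration described above.
# Names are illustrative; input values are those quoted in the post.

AVG_SNPS_UNDER_P = 85.6     # average SNPs under P across all sublineages
ANZICK_SNPS_UNDER_P = 56    # SNPs under P found in Anzick-1
ANZICK_AGE_YEARS = 12600    # age of Anzick-1

# SNPs accumulated, on average, since the time of Anzick-1
snps_since_anzick = AVG_SNPS_UNDER_P - ANZICK_SNPS_UNDER_P

# Years per SNP, rounded to whole years as in the post
rate = ANZICK_AGE_YEARS / snps_since_anzick
rounded_rate = round(rate)

# Age of P with the rounded rate (the figure given in the post)
age_p_rounded_rate = rounded_rate * AVG_SNPS_UNDER_P

# Age of P without the intermediate rounding, for comparison
age_p_unrounded = rate * AVG_SNPS_UNDER_P

print(f"{snps_since_anzick:.1f} SNPs, {rounded_rate} years/SNP")
print(f"P: {age_p_rounded_rate:.0f} (rounded rate) vs "
      f"{age_p_unrounded:.0f} (unrounded rate) years")
```

The two results differ by only a few dozen years, which is far smaller than the uncertainty coming from the SNP counts themselves.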

alan
08-10-2014, 05:41 PM
I think when all is in doubt you have to work back from what is known. What we know is that Mal'ta was an R*/R3 person who lived c. 24000 years ago but who was one of the final people of the middle upper palaeolithic culture that commenced around 33000 years ago in south-central Siberia. So, I think what we can probably say by the simplest of inferences is that the cultural founder and genetic ancestor of Mal'ta boy was in situ by 33000 years ago in south-central Siberia. Now that probably has to have been either a P person or a very early R person depending on calculations. I am pretty confident of this much. It gets more speculative further back because the exact relationship of this to the earlier upper Palaeolithic culture in the same area isn't 100% clear, although local evolution is usually considered likely.


As you know, my previous estimates (based on two different sets of data) have suggested a significantly older age for haplogroup P (about 38-40 ky (http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting/page9&p=26002#post26002)), so I was wondering if this difference could be related to a very specific set of data you were using, or maybe to the way you have performed your calculations. It seems that both these things could have contributed to the observed discrepancy.

For example, instead of comparing Anzick-1 to just one (the most closely related) modern sublineage of Q-L54 (showing 91 SNPs downstream of P), I would rather compare it to the average number of SNPs found in all sublineages of haplogroup P that were analysed in that paper (which would be 85.6, according to my calculations). After deducting the number of SNPs found in Anzick-1 (56) from that average value, we get 29.6 SNPs per 12,600 years, or 426 years per each SNP tested in that study. When multiplying those 426 years by the average number of 85.6 SNPs under hg P, we get 36,465 years to a common ancestor of all those hg P members.

Another thing that makes me suspect that the above age for haplogroup P could be an underestimation is related to a very striking observation that the only R1a lineage used in that study showed many more SNPs under P (105) than any of the multiple Q lineages studied (all ranging from 67 to 96, with an average value of 84.2). This suggests that haplogroups Q and R show a significantly different average number of SNPs under P, so each age estimate for haplogroup P that was based on haplogroup Q only (or mostly), instead of using a more balanced group of sublineages, could lead to a significantly underestimated result.

Having all above in mind, I would consider it most likely that the "true" age of haplogroup P is between 36 and 41 ky, or close to 38.5 ky.

parasar
08-12-2014, 03:09 PM
As you know, my previous estimates (based on two different sets of data) have suggested a significantly older age for haplogroup P (about 38-40 ky (http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting/page9&p=26002#post26002)), so I was wondering if this difference could be related to a very specific set of data you were using, or maybe to the way you have performed your calculations. It seems that both these things could have contributed to the observed discrepancy.

For example, instead of comparing Anzick-1 to just one (the most closely related) modern sublineage of Q-L54 (showing 91 SNPs downstream of P), I would rather compare it to the average number of SNPs found in all sublineages of haplogroup P that were analysed in that paper (which would be 85.6, according to my calculations). After deducting the number of SNPs found in Anzick-1 (56) from that average value, we get 29.6 SNPs per 12,600 years, or 426 years per each SNP tested in that study. When multiplying those 426 years by the average number of 85.6 SNPs under hg P, we get 36,465 years to a common ancestor of all those hg P members.

Another thing that makes me suspect that the above age for haplogroup P could be an underestimation is related to a very striking observation that the only R1a lineage used in that study showed many more SNPs under P (105) than any of the multiple Q lineages studied (all ranging from 67 to 96, with an average value of 84.2). This suggests that haplogroups Q and R show a significantly different average number of SNPs under P, so each age estimate for haplogroup P that was based on haplogroup Q only (or mostly), instead of using a more balanced group of sublineages, could lead to a significantly underestimated result.

Having all above in mind, I would consider it most likely that the "true" age of haplogroup P is between 36 and 41 ky, or close to 38.5 ky.

Yes, it is possible that averaging using additional strings may give a different value. I just wanted to limit my calculation to using the modern sample (HGDP00877) closest to the dated ancient sample (Anzick-1).

The paper itself on the other hand averaged using three modern samples:

four hg Q samples for whom the false negative rate is minimized due to haploid coverage of at least 5× (HGDP00877, HGDP00856, HG01124, and Anzick-1) ... We infer an average of approximately 36.7 (48.7 – 12) transversions to have accumulated in the past 12.6 ky and therefore estimate the divergence time of Q-M3 and Q-L54*(xM3) to be approximately 16.8 ky (12.6 ky × 48.7 / 36.7) ... Poisson process model for mutations on the tree and used the constrOptim() function in R to compute a maximum likelihood TMRCA estimate of 16.9 ky. We then repeated this for 100,000 bootstrap simulations to yield a 95% confidence interval of 13.0–19.7 ky.


Using their averaging and maximum likelihood calculation, we have 16.9 - 12.6 (age of Anzick-1) = 4.3 ky as the spacing of Anzick-1 from Q-M3, Q-L54 divergence, which going backwards on the same scale gives us about 32.7 ky for R, Q divergence.

I think this lower age is quite possibly the age of P-M45. I would definitely put a much higher age for P-P295 at about 46000 years.

parasar
08-13-2014, 02:14 PM
Assuming the above ~33 ky R,Q divergence age is correct, it appears certain that the ages listed by Yan et al. are on the low side.

http://1.bp.blogspot.com/-G-47CuvgUzg/Ul7YmkNcYMI/AAAAAAAAJO0/QbG8MRvvhmA/s1600/china.jpg
http://arxiv.org/ftp/arxiv/papers/1310/1310.3897.pdf
Their K-M9 age is 33 ky.

Going by Karafet et al that the K-M9 is about 17 ky distant from R,Q divergence we get 33 (calibration with Anzick-1) + 17 = 50 ky as the approximate age for K-M9.

So a correction factor of 1.51 should be applied to Yan et al.'s ages. This correction would actually bring their three Chinese Neolithic Super-grandfathers more in line with the beginning of Chinese agriculture:

The earliest agriculture in North China emerged before 10 kya, however, no distinct Y chromosomal expansion could be related to this event. The three star-like expansions happened several thousand years later ...
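
The correction factor above follows from two numbers in the post. A minimal sketch, with hypothetical variable names and all ages in ky:

```python
# Sketch of how the correction factor above is obtained (ages in ky).
# Names are illustrative; the inputs are the figures quoted in the post.

YAN_K_M9_AGE_KY = 33.0          # K-M9 age listed by Yan et al.
RQ_DIVERGENCE_AGE_KY = 33.0     # R,Q divergence calibrated with Anzick-1
K_TO_RQ_GAP_KY = 17.0           # K-M9 to R,Q distance, per Karafet et al.

# Recalibrated K-M9 age: R,Q divergence plus the gap back to K-M9
recalibrated_k_m9 = RQ_DIVERGENCE_AGE_KY + K_TO_RQ_GAP_KY

# Ratio of the recalibrated age to the published one
correction_factor = recalibrated_k_m9 / YAN_K_M9_AGE_KY

print(f"K-M9: {recalibrated_k_m9:.0f} ky, factor about {correction_factor:.3f}")
```

The factor is the ratio of one recalibrated node to its published age, so applying it to every other age assumes the published ages are off by the same proportion throughout the tree.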

Michał
08-13-2014, 03:33 PM
So a correction factor of 1.51 should be applied to Yan et al.'s ages.
I would actually use an even higher correction factor (something between 1.6 and 1.7).

For example, when using the 1.65 correction factor, we would get:
10.3 kybp for the start of the Chinese Neolithic (and the average expansion date of several subclades of hg O)
39.8 kybp for haplogroup P (the Q-R split)
40.8 kybp for haplogroup O
49.5 kybp for the N-O split
54.5 kybp for haplogroup K
59.7 kybp for haplogroup F

The above numbers fit my own estimates (http://www.anthrogenica.com/showthread.php?828-STR-Wars-GDs-TMRCA-estimates-Variance-Mutation-Rates-amp-SNP-counting/page8&p=15936#post15936) quite well, so at least most of those SNP-based estimates (that are based on some very different sets of data) are consistent as regards the proportions between the ages of particular subclades. Thus, all we need is to get at least a few more Palaeolithic radiocarbon-dated samples as some reference points, to make the general picture crystal clear. :)
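
Applying a correction factor is a plain rescaling of the published age. The sketch below uses only the K-M9 age of 33 ky quoted earlier in the thread, since the other published ages are not given here; the function name is hypothetical.

```python
# Sketch of rescaling a published age by a correction factor (ages in ky).
# Only the K-M9 age quoted in the thread is used; names are illustrative.

def rescale(published_age_ky: float, factor: float) -> float:
    """Published age multiplied by the chosen correction factor."""
    return published_age_ky * factor


YAN_K_M9_AGE_KY = 33.0

for factor in (1.51, 1.65):
    corrected = rescale(YAN_K_M9_AGE_KY, factor)
    print(f"factor {factor}: K-M9 at about {corrected:.2f} ky")
```

With the 1.65 factor the K-M9 value comes out just under 54.5, in line with the figure listed for haplogroup K above.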