Recent radiation within R-M269 led to high Y-STR haplotype resemblance



Jean M
02-25-2014, 07:40 AM
Maarten H. D. Larmuseau et al., Recent Radiation within Y-chromosomal Haplogroup R-M269 Resulted in High Y-STR Haplotype Resemblance, Annals of Human Genetics, Volume 78, Issue 2, pages 92–103, March 2014

http://onlinelibrary.wiley.com/doi/10.1111/ahg.12050/abstract


Y-chromosomal short tandem repeats (Y-STRs) are often used in addition to Y-chromosomal single-nucleotide polymorphisms (Y-SNP) to detect subtle patterns in a population genetic structure. There are, however, indications for Y-STR haplotype resemblance across different subhaplogroups within haplogroup R1b1b2 (R-M269) which may lead to erosion in the observation of the population genetic pattern. Hence the question arises whether Y-STR haplotypes are still informative beyond high-resolution Y-SNP genotyping for population genetic studies. To address this question, we genotyped the Y chromosomes of more than 1000 males originating from the West-European regions of Flanders (Belgium), North-Brabant and Limburg (the Netherlands) at the highest resolution of the current Y-SNP tree together with 38 commonly used Y-STRs. We observed high resemblance of Y-STR haplotypes between males belonging to different subhaplogroups of haplogroup R-M269. Several subhaplogroups within R-M269 could not be distinguished from each other based on differences in Y-STR haplotype variation. The most likely hypothesis to explain this similarity of Y-STR haplotypes within the population of R-M269 members is a recent radiation where various subhaplogroups originated within a relatively short time period. We conclude that high-resolution Y-SNP typing rather than Y-STR typing might be more useful to study population genetic patterns in (Western) Europe.

Rathna
02-25-2014, 07:54 AM
http://onlinelibrary.wiley.com/doi/10.1111/ahg.12050/abstract

This is contradicted by the difference between the Most Recent Common Ancestor dates calculated from STRs and from SNPs. It is true that we are still debating how much time to assign to a single SNP (3-4 generations, or 8-10?), but many aDNA samples found in Europe (hgs. G and E above all) suggest that STR-based times should be multiplied by a factor of 2.5.
As you may know, I have tried for many years to explain this fact with my 3 or 4 golden principles:
mutations happened around the modal
there is a convergence to the modal as time passes
only sometimes does a mutation go off on a tangent
multistep mutations are rare.
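
A minimal sketch of why back-mutation can make STR-based age estimates look too young, assuming a symmetric single-step mutation model. The mutation rate, generation count, marker count and function names are illustrative assumptions, not values from the paper or from this thread. Each marker gains or loses one repeat with equal probability, so steps in opposite directions cancel and the visible difference from the starting value undercounts the mutations that actually occurred.

```python
import random


def simulate_marker(generations, rate, rng):
    """Simulate one STR marker along one lineage.

    Each generation the marker mutates with probability `rate`,
    gaining or losing exactly one repeat with equal chance.
    Returns (net_change, mutation_count).
    """
    net = 0
    count = 0
    for _ in range(generations):
        if rng.random() < rate:
            net += rng.choice((-1, 1))
            count += 1
    return net, count


def visible_vs_actual(markers=2000, generations=400, rate=0.01, seed=1):
    """Total visible repeat differences versus total mutations that occurred.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    visible = 0
    actual = 0
    for _ in range(markers):
        net, count = simulate_marker(generations, rate, rng)
        visible += abs(net)  # what a comparison with the ancestor would show
        actual += count      # what really happened along the lineage
    return visible, actual


if __name__ == "__main__":
    visible, actual = visible_vs_actual()
    print("visible steps:", visible)
    print("actual mutations:", actual)
    print("actual / visible:", round(actual / visible, 2))
```

Under these assumptions the ratio of actual to visible steps gives a rough sense of how much a naive step count could understate elapsed time. It is not an attempt to reproduce the 2.5 factor mentioned above.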

VinceT
02-25-2014, 08:00 AM
If they're only typing STRs using the standard forensic 17-marker Y-Filer set, then yes, they won't find much if any to differentiate between sub-clades with any degree of confidence. Maximizing STRs (i.e. 111 markers) in tandem with high-resolution Y-SNP typing would be preferred, but then theirs may be a cost-based conclusion.

More data will always be better.

ArmandoR1b
02-25-2014, 12:56 PM
If they're only typing STRs using the standard forensic 17-marker Y-Filer set, then yes, they won't find much if any to differentiate between sub-clades with any degree of confidence. Maximizing STRs (i.e. 111 markers) in tandem with high-resolution Y-SNP typing would be preferred, but then theirs may be a cost-based conclusion.

More data will always be better.

The study deals directly with "subtle patterns in a population genetic structure" and I don't think that even 111 markers are good for that if the only choice is between those and high-resolution Y-SNP genotyping, especially if cost is considered. 111 STR markers won't always tell us that a person definitely belongs to a subclade. Only Y-SNP testing can tell us that. 111 STR markers are better for more recent ancestry (20 generations), not "subtle patterns in a population genetic structure".

You can see which STR markers and which SNP markers the authors used by going to http://onlinelibrary.wiley.com/doi/10.1111/ahg.12050/suppinfo and by downloading the supporting info at http://onlinelibrary.wiley.com/store/10.1111/ahg.12050/asset/supinfo/ahg12050-sup-0001-TableS1.docx?v=1&5545d1bb

Dienekes also has an image with the SNP markers at http://1.bp.blogspot.com/-E1MMnc_de-s/Uwu7U0eDVGI/AAAAAAAAJgY/KsjMcuIsvIw/s1600/larmuseau.png

It makes me very happy to see that they included P312 and Z195, which have been missing from many previous studies but are very common in Western Europe.

On October 18, 2008 Dienekes posted "Why Y-STR haplotype clusters are not clades", which brought this problem to my attention back then: http://dienekes.blogspot.com/2008/10/why-y-str-haplotype-clusters-are-not.html Recently Mike Walsh has posted spreadsheets to help R1b people determine their subclade based on STR markers, and others have been posting modals in Ysearch. So far I have found it is best to just get people to order SNP tests even if they have a 37 or 67 marker match with another person, because the STR markers seem to vary too much. That doesn't mean that at some point there won't be an easy match of 67 STR markers with a specific SNP, but in the meantime, for the consumer, both need to be used, and for "subtle patterns in a population genetic structure" high-resolution Y-SNP genotyping should be used.

edit: I want to state that I applaud what Mike Walsh has done with the spreadsheet and anyone that has posted a modal in Ysearch. They have been helpful in determining what can and can't be done with STR markers.

R.Rocca
02-25-2014, 01:19 PM
Their median joining network diagram shows just how meaningless STRs are at such high level to trace deep ancestry. Most of the relationships are between samples that, from an SNP perspective, we know have little to do with one another.

IMO, even 111 STRs are insufficient to rectify these discrepancies with an acceptable level of confidence. The 600 STRs that FGC offers are nice, but when one has a full list of SNPs, STRs become superfluous.

alan
02-25-2014, 04:26 PM
For a moment there I thought we had been subject to some sort of solar radiation :0)

alan
02-25-2014, 04:34 PM
It's nice to have another paper concluding this, although STR wizards like Ken N realised this many years ago and the vast majority in this hobby came to a broad consensus that this was the case several years ago. I wonder if they had a stab at dating this 'radiation' of M269. One of the interesting things in SNP counting dating by Michal and others is that M269 seems to have existed for a considerable time before it expanded and left clades that survive today.

alan
02-25-2014, 05:00 PM
It's not really my area of knowledge, but I tend to get the same impression that STRs have taken things as far as they could in terms of new knowledge and the future looks like it's going to be SNP based. Still, we owe a big debt to all those people in this hobby who have done so much amazing work using STRs to take their use to the maximum, untangle the human story and keep us all interested over the years.


Their median joining network diagram shows just how meaningless STRs are at such high level to trace deep ancestry. Most of the relationships are between samples that, from an SNP perspective, we know have little to do with one another.

IMO, even 111 STRs are insufficient to rectify these discrepancies with an acceptable level of confidence. The 600 STRs that FGC offers are nice, but when one has a full list of SNPs, STRs become superfluous.

vettor
02-25-2014, 05:21 PM
This is contradicted by the difference between the Most Recent Common Ancestor dates calculated from STRs and from SNPs. It is true that we are still debating how much time to assign to a single SNP (3-4 generations, or 8-10?), but many aDNA samples found in Europe (hgs. G and E above all) suggest that STR-based times should be multiplied by a factor of 2.5.
As you may know, I have tried for many years to explain this fact with my 3 or 4 golden principles:
mutations happened around the modal
there is a convergence to the modal as time passes
only sometimes does a mutation go off on a tangent
multistep mutations are rare.

you left one out

mutations happened with geographical change

Rathna
02-26-2014, 02:50 AM
you left one out

mutations happened with geographical change

It is possible, but I don't know of any serious study that has investigated this from a scientific point of view. I think we should add that mutations seem to become more frequent with the age of the father, and surely there are haplotypes (or haplogroups) which mutate faster or slower than others.
All this means STRs must be used with care. The same applies to mtDNA: mutation rates are usually calculated as averages, but we should also take into account the number of meioses. A rare haplotype mutates more slowly than a more widespread one; where it is rare it is more likely to go extinct, and where it is more widespread it is more likely that only a few lines survive, and those few survivors accumulate many SNPs (and STR variance) at the nodes.

ArmandoR1b
02-26-2014, 11:28 AM
It's nice to have another paper concluding this, although STR wizards like Ken N realised this many years ago and the vast majority in this hobby came to a broad consensus that this was the case several years ago. I wonder if they had a stab at dating this 'radiation' of M269. One of the interesting things in SNP counting dating by Michal and others is that M269 seems to have existed for a considerable time before it expanded and left clades that survive today.

Did Ken N ever post that anywhere? Oddly enough I am finding people that want to continue using 37 or fewer markers to predict subclades.

GoldenHind
02-26-2014, 06:48 PM
Their median joining network diagram shows just how meaningless STRs are at such high level to trace deep ancestry. Most of the relationships are between samples that, from an SNP perspective, we know have little to do with one another.

IMO, even 111 STRs are insufficient to rectify these discrepancies with an acceptable level of confidence. The 600 STRs that FGC offers are nice, but when one has a full list of SNPs, STRs become superfluous.

I think most of us who have spent any amount of time trying to decipher the R1b tree have been aware for some time that using STRs alone is largely useless. It is pretty common knowledge that using genetic distance alone can show members of different subclades in one's closest matches. This is doubtless due to the comparatively recent expansion of R1b in Europe. Perhaps this tendency would disappear with a sufficient number of markers, but I agree that it would probably require a greater number than 111.

However, STRs can be a valuable tool, and should not be disregarded entirely. I have found that using STRs to look for shared off-modal matches is far more significant than using them to calculate genetic distance. Nordtvedt of course first demonstrated this principle some years ago with his identification of several R1b varieties, nearly every one of which has since been connected with a particular R1b SNP.
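
To make the contrast concrete, here is a small sketch that compares plain genetic distance with a count of shared off-modal values. The marker values, the modal and the three haplotypes are made up for illustration, and the function names are hypothetical. Genetic distance is taken here as the sum of absolute repeat differences, which is one common convention and only an assumption about how any particular tool counts it.

```python
# Hypothetical modal and haplotypes: illustrative values only, not real data.
MODAL = {"DYS393": 13, "DYS390": 24, "DYS19": 14,
         "DYS391": 11, "DYS439": 12, "DYS389i": 13}

person_a = {"DYS393": 13, "DYS390": 23, "DYS19": 14,
            "DYS391": 10, "DYS439": 13, "DYS389i": 13}
person_b = {"DYS393": 12, "DYS390": 23, "DYS19": 15,
            "DYS391": 10, "DYS439": 11, "DYS389i": 13}
person_c = dict(MODAL)  # someone who sits exactly on the modal


def genetic_distance(a, b):
    """Sum of absolute repeat differences over the markers both have."""
    return sum(abs(a[m] - b[m]) for m in a.keys() & b.keys())


def off_modal(haplotype, modal):
    """Markers (with values) where the haplotype differs from the modal."""
    return {m: v for m, v in haplotype.items() if m in modal and v != modal[m]}


def shared_off_modal(a, b, modal):
    """Markers where both haplotypes carry the same non-modal value."""
    oa, ob = off_modal(a, modal), off_modal(b, modal)
    return sorted(m for m in oa.keys() & ob.keys() if oa[m] == ob[m])


print(genetic_distance(person_a, person_b))         # 4
print(genetic_distance(person_a, person_c))         # 3
print(shared_off_modal(person_a, person_b, MODAL))  # ['DYS390', 'DYS391']
print(shared_off_modal(person_a, person_c, MODAL))  # []
```

In this toy case person A is closer to the modal person C by genetic distance, yet shares two off-modal values with person B, which is the kind of signal that genetic distance alone would hide.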

Last year I spent some time examining the haplotypes of those who were in the former P312** group. I first noticed there was a small group who shared six different off-modal STR matches with me, and even though they did not have a particularly close genetic distance to each other, I suspected I had discovered a variety which would ultimately share a subclade. Since that time every one of the group who has tested for DF99 has received positive results, though this group has proved to be only a small component of DF99. I have since been going through those who remain P312** after testing DF99-, and have discovered several more clusters who share various off-modal values. I have very little doubt they too will eventually be found to have an as yet undiscovered SNP in common.

One must always be wary though when looking at R1b STRs, as it is always possible to get false positives as well as false negatives, doubtless due to convergence and the close relationship between R1b in Europe.