Italian Study comparing Y-STR TMRCAs with documented genealogies

06-21-2019, 02:39 PM
I haven't seen this posted here yet; Blaine Bettinger originally posted the link on FB. It's a study published today that takes 120 Italian individuals with deep documented (and apparently valid!) genealogical pedigrees and compares their documented TMRCAs against the TMRCAs predicted by the Bruce Walsh method (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1461668/) (which is probably the LEAST accurate of the various STR-based methods, although they're all reasonably close to each other).

Estimating Y-STR Mutation Rates and TMRCA Through Deep-Rooting Italian Pedigrees

Alessio Boattini, Stefania Sarno, Alessandra M. Mazzarisi, Cinzia Viroli, Sara De Fanti,
Carla Bini, Maarten H. D. Larmuseau, Susi Pelotti & Donata Luiselli

In the population genomics era, the study of Y-chromosome variability is still of the greatest interest for several fields ranging from molecular anthropology to forensics and genetic genealogy. In particular, mutation rates of Y-chromosomal Short Tandem Repeats markers (Y-STRs) are key parameters for different interdisciplinary applications. Among them, testing the patrilineal relatedness between individuals and calculating their Time of Most Recent Common Ancestors (TMRCAs) are of the utmost importance. To provide new valuable estimates and to address these issues, we typed 47 Y-STRs (comprising Yfiler, PowerPlex23 and YfilerPlus loci, the recently defined Rapidly Mutating [RM] panel and 11 additional markers often used in genetic genealogical applications) in 135 individuals belonging to 66 deep-rooting paternal genealogies from Northern Italy. Our results confirmed that the genealogy approach is an effective way to obtain reliable Y-STR mutation rate estimates even with a limited number of samples. Moreover, they showed that the impact of multi-step mutations and back mutations is negligible within the temporal scale usually adopted by forensic and genetic genealogy analyses. We then detected a significant association between the number of mutations within genealogies and observed TMRCAs. Therefore, we compared observed and expected TMRCAs by implementing a Bayesian procedure originally designed by Walsh (2001) and showed that the method yields a good performance (up to 96.72%), especially when using the Infinite Alleles Model (IAM).
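
For anyone who wants to see roughly what a Walsh-style IAM calculation looks like, here is a minimal Python sketch. It is not the paper's implementation: it assumes a single average mutation rate for all markers (the study applied individual per-marker rates) and a flat prior on the number of generations, and the function names, the 37-marker comparison, and the rate of 0.004 are all made up for illustration. Under infinite alleles, a marker still matches after t generations with probability exp(-2*mu*t), because 2t meioses separate the two men.

```python
import math

def iam_log_likelihood(t, n_markers, n_mismatch, mu):
    """Log-likelihood of t generations to the MRCA under infinite alleles."""
    p_match = math.exp(-2.0 * mu * t)  # 2t meioses separate the two men
    ll = (n_markers - n_mismatch) * math.log(p_match)
    if n_mismatch:
        ll += n_mismatch * math.log1p(-p_match)
    return ll

def tmrca_mle(n_markers, n_mismatch, mu):
    """Closed-form maximum of the likelihood (needs n_mismatch < n_markers)."""
    return math.log(n_markers / (n_markers - n_mismatch)) / (2.0 * mu)

def tmrca_posterior(n_markers, n_mismatch, mu, t_max=200):
    """Posterior over t = 1..t_max generations, assuming a flat prior."""
    ts = list(range(1, t_max + 1))
    lls = [iam_log_likelihood(t, n_markers, n_mismatch, mu) for t in ts]
    peak = max(lls)
    weights = [math.exp(ll - peak) for ll in lls]  # rescaled to avoid underflow
    total = sum(weights)
    return ts, [w / total for w in weights]

def credible_interval(ts, probs, mass=0.95):
    """Equal-tailed interval read off the cumulative posterior."""
    lo_cut = (1.0 - mass) / 2.0
    hi_cut = 1.0 - lo_cut
    cum, lo, hi, found_lo = 0.0, ts[0], ts[-1], False
    for t, p in zip(ts, probs):
        cum += p
        if not found_lo and cum >= lo_cut:
            lo, found_lo = t, True
        if cum >= hi_cut:
            hi = t
            break
    return lo, hi

if __name__ == "__main__":
    # Hypothetical comparison: 37 markers, 2 mismatches, average rate 0.004.
    ts, probs = tmrca_posterior(37, 2, 0.004)
    print("MLE (generations):", round(tmrca_mle(37, 2, 0.004), 2))
    print("95% interval (generations):", credible_interval(ts, probs))
```

Multiply generations by whatever years-per-generation figure you prefer to get calendar dates. Note that every mismatched marker counts once here no matter how many repeats apart the two values are, which is what distinguishes IAM from a stepwise count.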


Some observations from me about their conclusions:

- Yes it's a small data set, but it's good to see at least some published validation of many observations that are already common in the citizen-scientist community.

- They found that STR-based TMRCAs (at least when using individual mutation rates, as they did with the Walsh method) are reasonably accurate within genealogical timeframes but tend to underestimate older ages. In practice we say "older" means more than 1500-2000 years ago; the study didn't specify a cutoff, it just noted the same trend.

- Back-mutations and convergence do not significantly affect age estimates within genealogical time for a group that is already known to be related within genealogical time.

- Single-step mutation approaches should not be used for multi-copy markers. The study found that using infinite alleles for multi-copy markers worked better, which is also the approach commonly used by the various user-developed tools.

- They originally started with 135 individuals and found that 15 of them were NPEs/MPEs (Non-Paternal Event / Misattributed Paternal Event, pick your favorite term). Dividing by the number of births across the genealogies reported by the group gives an NPE rate of 1.85% per birth event, which is very close to the 2% commonly used in the community.
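
To put the per-birth NPE figure in perspective, here is a small Python sketch of how it compounds down a paternal line. It assumes each birth is an independent event with the same rate, which is a simplification of mine and not something the study claims, and the generation counts are hypothetical.

```python
def prob_line_intact(rate_per_birth, generations):
    """Chance that no NPE occurred anywhere along a line of this length."""
    return (1.0 - rate_per_birth) ** generations

def prob_at_least_one_npe(rate_per_birth, generations):
    """Chance that at least one NPE occurred along the line."""
    return 1.0 - prob_line_intact(rate_per_birth, generations)

if __name__ == "__main__":
    # 1.85% is the study's per-birth figure; 2% is the community rule of thumb.
    for rate in (0.0185, 0.02):
        for gens in (5, 10, 20):  # hypothetical line lengths
            p = prob_at_least_one_npe(rate, gens)
            print(f"rate={rate:.4f} generations={gens:2d} P(>=1 NPE)={p:.3f}")
```

A small per-birth rate still adds up over a deep pedigree, so a fair number of breaks across dozens of long genealogies is not surprising.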

06-21-2019, 03:27 PM
That is reassuring in my own quest to nail down the TMRCA with my closest paternal match. Per our previous conversations, your SAPP tool gives a tight range of 1050 to 1300 AD, with a midpoint of 1175 AD. This range overlaps with the estimates from the SNP dating methodologies of YFull, Ytree, and McDonald. My confidence is growing that this will turn out to be close to the truth.