CSYseq: The first Y-chromosome sequencing tool typing a large number of SNPs and STRs

09-07-2021, 07:34 PM
CSYseq: The first Y-chromosome sequencing tool typing a large number of Y-SNPs and Y-STRs to unravel worldwide human population genetics (https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1009758)
Sofie Claerhout, Paulien Verstraete, Liesbeth Warnez, Simon Vanpaemel, Maarten Larmuseau, Ronny Decorte
Published: September 7, 2021

Male-specific Y-chromosome (chrY) polymorphisms are interesting components of the DNA for population genetics. While single nucleotide polymorphisms (Y-SNPs) indicate distant evolutionary ancestry, short tandem repeats (Y-STRs) are able to identify close familial kinships. Detailed chrY analysis provides thus both biogeographical background information as paternal lineage identification. The rapid advancement of high-throughput massive parallel sequencing (MPS) technology in the past decade has revolutionized genetic research. Using MPS, single-base information of both Y-SNPs as Y-STRs can be analyzed in a single assay typing multiple samples at once. In this study, we present the first extensive chrY-specific targeted resequencing panel, the ‘CSYseq’, which simultaneously identifies slow mutating Y-SNPs as evolution markers and rapid mutating Y-STRs as patrilineage markers. The panel was validated by paired-end sequencing of 130 males, distributed over 65 deep-rooted pedigrees covering 1,279 generations. The CSYseq successfully targets 15,611 Y-SNPs including 9,014 phylogenetic informative Y-SNPs to identify 1,443 human evolutionary Y-subhaplogroup lineages worldwide. In addition, the CSYseq properly targets 202 Y-STRs, including 81 slow, 68 moderate, 27 fast and 26 rapid mutating Y-STRs to individualize close paternal relatives. The targeted chrY markers cover a high average number of reads (Y-SNP = 717, Y-STR = 150), easy interpretation, powerful discrimination capacity and chrY specificity. The CSYseq is interesting for research on different time scales: to identify evolutionary ancestry, to find distant family and to discriminate closely related males. Therefore, this panel serves as a unique tool valuable for a wide range of genetic-genealogical applications in interdisciplinary research within evolutionary, population, molecular, medical and forensic genetics.

Author summary
Around 95% of the male-specific Y-chromosome (chrY) is non-recombining and therefore inherited in a conserved manner from father to son. It can therefore serve as a powerful marker for interdisciplinary genetic-genealogical research as it provides a strong link between genetic information and a family tree or pedigree. While Y-chromosomal short tandem repeats (Y-STRs) discriminate close paternal kinships, single nucleotide polymorphisms (Y-SNPs) enables the identification of far evolutionary ancestry. Unfortunately, an extensive chrY-specific sequencing panel combining a large number of familial Y-STRs and evolutionary Y-SNPs was not yet available. Therefore, chrY is rarely included in research projects and not often linked to a genealogical, history-demographical or life science database. In this way, the importance of chrY still remains not yet fully understood. Massive parallel sequencing (MPS) allows the simultaneous analysis at sequence level of Y-SNPs and Y-STRs with variable mutation rates in a large number of males. However, up until today, no commercial kit is exploiting the full potential that MPS offers on chrY. Therefore, we developed the ‘CSYseq’, which is the first extensive chrY-specific sequencing panel. The CSYseq simultaneously identifies 9,014 slow mutating Y-SNPs to identify evolutionary ancestry, and 202 rapid mutating Y-STRs to investigate paternal relationships. We validated and optimized the panel through the analysis of 130 males distributed over 65 families. This novel MPS panel is useful for biogeographical identity and ancestry analysis, together with Y-chromosome profiling for the identification of patrilineages and discrimination of closely related males. As the CSYseq includes a very diverse set of markers that can be easily interpreted, it is interesting for different interdisciplinary applications within evolutionary, population, molecular, medical and forensic genetics.
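
As a quick sanity check on the figures quoted in the abstract above, here is a minimal Python sketch. All numbers are copied from the abstract; the variable and function names are purely illustrative and are not part of the CSYseq paper or any tool. The per-pedigree average assumes the 1,279 generations are simply divided over the 65 pedigrees, which the abstract does not state explicitly.

```python
# Figures copied from the quoted abstract; names are hypothetical, for illustration only.
str_classes = {"slow": 81, "moderate": 68, "fast": 27, "rapid": 26}
total_y_strs = 202
targeted_y_snps = 15_611
informative_y_snps = 9_014
males = 130
pedigrees = 65
generations = 1_279


def summarize():
    """Return a few simple ratios derived from the abstract's numbers."""
    return {
        # Do the four mutation-rate classes add up to the stated Y-STR total?
        "str_sum_matches": sum(str_classes.values()) == total_y_strs,
        # Share of targeted Y-SNPs described as phylogenetically informative.
        "informative_snp_fraction": informative_y_snps / targeted_y_snps,
        # Average males sequenced per pedigree.
        "males_per_pedigree": males / pedigrees,
        # Assumption: generations spread evenly over pedigrees (a rough average only).
        "mean_generations_per_pedigree": generations / pedigrees,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

The four Y-STR classes do add up to 202, a bit over half of the targeted Y-SNPs are listed as phylogenetically informative, and the validation set works out to two males per pedigree.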

09-07-2021, 08:53 PM
And it's going to be marketed as PANDORA'S BOX.

09-08-2021, 05:13 PM
This appears to be a chip array approach as it targets a huge subset of the YCHR. With only 9,000 YSNPs, this does not come even close to determining the 22,000 branches under haplogroup R or the 47,000 YSNP branches at FTDNA. Also, this will obviously not include many branch equivalents or private YSNPs, which total 439,000 YSNP mutations at FTDNA. 202 YSTRs is far fewer than the Y700 YSTRs, plus there are many more that need to be added - Y1000. This looks like a super SNP pack that includes YSTRs. It is also doubtful that it could read the longer YSTRs found in Y111, so using these against the FTDNA YSTR database will be problematic.

If this test is under $100, it probably could be a useful test - but like SNP packs, this will just be followed up eventually with an NGS/WGS test. Does this test include a database? Without a database, it is hard for researchers to collect this information. Being a chip array test, this test will not get updated until the setup costs for the chip are recovered. But if they split up the chip into multiple tests over the years, they would become more useful. I wish that they had kept this a YSNP-only test and increased the coverage, as YSTRs are not going to be very compatible with the FTDNA database. Also, it could have been more comprehensive in coverage by limiting it to R-L21, where sales would be robust. This product is too broad and too late but could be an economical alternative that would bring in more testers over time (even though this would be a throwaway test, as many would eventually upgrade to NGS/WGS).

Like SNP packs, these kinds of tests are really not that useful: 1) they are just extra costs on the way to NGS/WGS testing; 2) they waste space on YSTRs just to prove it could be done - more YSNPs would have been better; 3) they take a one-size-fits-all approach - these kinds of tests should be targeted to very large haplogroups, not the whole YCHR tree; 4) the cost is unlikely to be under $100, making this a questionable use of funds; 5) an improved NatGeo test would have been good ten years ago, but with so many YSNPs and YSTRs, the coverage will be less than desirable; 6) it is doubtful that they will have a useful database to find this information (the bane of most FTDNA competitors).

09-12-2021, 01:41 AM
... If this test is under $100, it probably could be a useful test - but like SNP packs, this will just be followed up eventually with an NGS/WGS test. ...
Yes, sooner or later you have to bite the bullet and take a true SNP discovery test..... or else your relatives do and then you do SNP Sanger singleton testing. I call it SNP chasing, with the same frustrations as a wild goose chase, and not necessarily a low cost over time. I used to do this myself but I am reformed.