11-23-2020, 08:57 PM
The scope of L96 has remained small compared to the rapid growth of many early YSNP discoveries. In 2012, the scope was estimated at 20 testers at Y67 or more. This analysis shows that the number has since doubled to 40 testers (confirmed and predicted). The new model currently has an accuracy of 100% across 57,000 testers of haplogroup R, based on a strong eight-marker signature. Here is the source file for this analysis. Column K contains the formula for the signature match, and column L shows the genetic distance from the signature. Column DT contains the actual binary logistic regression model (the constants were derived with the statistical software tool AcaStat, which can be purchased for only $10 plus tax). The tab labeled AcaStat has both the input to and the output from AcaStat.

www.rcasey.net/DNA/Temp/L96_HG_R_20201122A.xlsx (http://www.rcasey.net/DNA/Temp/L96_HG_R_20201122A.xlsx)
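
For anyone who prefers to see the logic outside the spreadsheet, here is a rough Python sketch of what columns K, L and DT do conceptually: compare a tester to the signature, compute a genetic distance, and feed both into a binary logistic model. This is only an illustration under stated assumptions. The marker names (M1 to M8), the repeat values and the three constants are placeholders, not the real L96 signature or the AcaStat output, and the genetic distance here is a simple sum of absolute repeat differences, which may differ from the spreadsheet formula.

```python
import math

# Placeholder eight-marker signature. Names and repeat values are
# hypothetical, NOT the actual L96 signature from the spreadsheet.
SIGNATURE = {"M1": 13, "M2": 24, "M3": 14, "M4": 11,
             "M5": 12, "M6": 29, "M7": 17, "M8": 9}

# Hypothetical logistic regression constants. The real ones come from
# AcaStat and live in column DT of the source file.
INTERCEPT = -4.0
COEF_MATCH = 1.0
COEF_DISTANCE = -1.5


def signature_match(tester):
    """Count how many signature markers the tester matches exactly."""
    return sum(1 for marker, value in SIGNATURE.items()
               if tester.get(marker) == value)


def genetic_distance(tester):
    """Sum of absolute repeat differences from the signature (assumed step-wise model)."""
    return sum(abs(tester[marker] - value)
               for marker, value in SIGNATURE.items())


def predicted_probability(tester):
    """Binary logistic model: probability that the tester belongs to the branch."""
    z = (INTERCEPT
         + COEF_MATCH * signature_match(tester)
         + COEF_DISTANCE * genetic_distance(tester))
    return 1.0 / (1.0 + math.exp(-z))


# A tester matching the signature exactly, and one who is off on two markers.
exact = dict(SIGNATURE)
off_by_two = dict(SIGNATURE, M1=15, M2=22)

print(predicted_probability(exact) > 0.5)       # predicted positive
print(predicted_probability(off_by_two) > 0.5)  # predicted negative
```

With real constants from the AcaStat tab substituted in, a probability above 0.5 would be read as a predicted positive for the branch.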

You can add to and update this file for your own use. The formulas will automatically update for the new rows, but you have to copy and paste in the following order: 1) columns DU to DZ (formulas for each YSTR panel); 2) columns K to L for the signature and genetic distance formulas; 3) column DT, the model, which uses K and L. If you want to play around with AcaStat, you can copy the AcaStat tab, paste it into a new file, and save that file as CSV (comma-delimited), which AcaStat requires as its input format. Here are the step-by-step instructions to run AcaStat (you will need to download the software and pay the $10 plus tax as well):


The SAPP tab contains the input to the charting tool, SAPP. If you want the PNG graphic file, just remove the option TEXTOUT and download the PNG file. Since L96 currently includes only 40 testers and only eight YSNP branches, this is a pretty small file for the amount of accuracy it delivers. SAPP does find the signatures for surname clusters quite well, but the earlier branches and TMRCA estimates are not going to be the most accurate. You should really take this file, analyze it for enhancements, and create your own chart manually (which takes a lot of work). The analysis is based on 57,000 Y67 testers but is probably missing 5,000 to 10,000 testers across 5,000 projects. I did try to look through all the surname projects for new YSNP test results. With any analysis, accuracy depends on sample size. I created this data primarily for YSNP prediction, but SAPP charting is still very useful as a starting point. The input file can be found on the SAPP tab of the Excel file.

www.rcasey.net/DNA/Temp/SAPP_Output_20201121G.html (http://www.rcasey.net/DNA/Temp/SAPP_Output_20201121G.html)

To understand the basics of charting: