RobertCasey
05-12-2020, 01:07 AM
I spent several hours analyzing the haplogroup FGC6550. This analysis is by no means complete, but I attempted to pull all
FGC6550 testers in the BigTree, updated the YSNP branches to FTDNA labels and looked at several key surname projects. I have pulled
most of the testers that have tested positive for branches under FGC6550 (those in public YSTR reports). Of course, many are private
and are not included (I do not feel comfortable using any private data in this analysis). This is about the 20th predictable
haplogroup under R-L21 that I have analyzed in this manner, so the procedure is pretty well established.
What is really missing:
1) I only looked at the obvious surname projects for data (I did not pull DF49 or L21 data).
2) I primarily pulled data that was confirmed positive for FGC6550 only.
3) I used very old (3 years old) data for predicted testers (the primary surname projects can now be pulled again and many more testers would be added as predicted FGC6550).
4) I deleted 90 % of the HG R testers that have no chance of being positive for FGC6550 (to reduce the amount of irrelevant data).
I did create a model for prediction of FGC6550 which should be 98 to 99 % accurate. Currently the AcaStat model shows 100 % accuracy. The model is in
column CB (you can just copy and paste it into new rows).
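For anyone who wants to check an accuracy figure like this outside of AcaStat, here is a minimal Python sketch. It only compares a predicted yes/no status against the confirmed YSNP result for each tester and reports the fraction that match. The function name and the toy data are made up for illustration; they are not taken from the spreadsheet, and this is not the AcaStat model itself.

```python
def prediction_accuracy(predicted, confirmed):
    """Return the fraction of testers whose predicted FGC6550 status
    matches the confirmed YSNP result (hypothetical helper)."""
    if len(predicted) != len(confirmed):
        raise ValueError("predicted and confirmed must have the same length")
    if not confirmed:
        raise ValueError("need at least one tester to compute accuracy")
    matches = sum(p == c for p, c in zip(predicted, confirmed))
    return matches / len(confirmed)


# Toy data (assumption, not real testers): True = FGC6550 positive.
confirmed = [True, True, False, True, False]
predicted = [True, True, False, True, False]

accuracy = prediction_accuracy(predicted, confirmed)
print(f"Accuracy: {accuracy:.0%}")
```

A 100 % result on the testers used to build a model does not guarantee the same result on new testers, which is why the expected figure is stated as 98 to 99 %.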
This analysis includes:
1) 50 confirmed FGC6550 testers (this could probably be increased 10 or 20 % from public reports).
2) 23 predicted FGC6550 testers (with more pulls, this would easily double in size).
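The rough size of the data set after those extra pulls can be worked out from the two counts above. This is just a small Python sketch of the arithmetic; the 10 to 20 % increase and the doubling are the estimates given above, not measured results, and the variable names are only for illustration.

```python
confirmed = 50   # confirmed FGC6550 testers in this analysis
predicted = 23   # predicted FGC6550 testers in this analysis

# Integer arithmetic avoids floating-point rounding noise.
confirmed_low = confirmed * 110 // 100    # +10 % from public reports
confirmed_high = confirmed * 120 // 100   # +20 % from public reports
predicted_doubled = predicted * 2         # "easily double" with more pulls

current_total = confirmed + predicted
projected_low = confirmed_low + predicted_doubled
projected_high = confirmed_high + predicted_doubled

print(f"Current total: {current_total}")
print(f"Projected total: {projected_low} to {projected_high}")
```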
The first file is my source file for all analysis (it has three tabs):
http://www.rcasey.net/DNA/Temp/FGC6550_HG_R_Master_20200510J.xlsx
1) Tab FGC6550 has the original data.
2) Tab AcaStat is the input and output of the AcaStat statistical program (only costs $20 to download).
3) Tab SAPP is the source file for SAPP input (just needs copy and pasting to a Notepad txt file).
http://www.rcasey.net/DNA/Temp/FGC6550_SAPP_Input_20200510L.txt
Here is a link to the SAPP tool:
http://www.jdvtools.com/SAPP/
Here is the SAPP output file (html version):
http://www.rcasey.net/DNA/Temp/FGC6550_SAPP_Output_20200510M.html
Here is the SAPP output file (graphic version):
http://www.rcasey.net/DNA/Temp/FGC6550_SAPP_Output_20200510J.png
Anyone should feel free to update these files with new data. Collecting the data took 90 % of the time. AcaStat and SAPP took less than one hour once the data was collected. I did not really sanity check the data that well. Feel free to ask questions. I have several YouTube videos that explain this process. Search "Genetic Genealogy Robert Casey" for a dozen presentations.