Guessing SNPs from Y-Str's



jamesdowallen
04-21-2017, 07:18 PM
My 12 STRs (FamilyTreeDNA) are:

14 23 14 11 11-15 12 12 12 13 13 30

Is there a free online resource where I can type that in and have it show me which R1b subclade is most likely?

aTdHvAaNnKcSe, James

TigerMW
04-21-2017, 08:33 PM
I do a lot of this kind of thing to help people decide which SNPs they might want to test for, but I gave up on it for anything less than 67 STRs several years ago.

I think 111 STRs is generally a good thing, as it
1) sets you up very well in the matching database for verifying good matches and avoiding false matches and missed matches
2) can help you home in on just a couple of SNPs, as long as people who are potentially related to you have done the SNP discovery work via Next Generation Sequencing tests
3) will be helpful for fine family lineage demarcations like those used in "Mutation History Trees".

Petr
04-22-2017, 09:33 AM
A good tool is http://www.nevgen.org/ but 12 STRs are insufficient for a better guess than R1b.

Dewsloth
05-16-2017, 11:25 PM
Just a word of warning. Nevgen doesn't work so well for every R1b subclade:

I input a full Y-111, and while it's pretty sure I'm R1b, it can't tell any more about me. They don't seem to have a lot of DF19 subclade info.

# | Haplogroup | Probability | Fitness | Fitness 2
1 | R1b L51>L151>CTS4528>S14328 | 2.52 | 34.23 | 0.68
2 | R1b L21>>ZZ10>Z253>Z2186>L1066 | 1.54 | 34.05 | 0.65
3 | R1b Z2103>Z2106 | 0 | 32.35 | 0.71
4 | R1b DF27>ZZ12>ZZ39 | 0 | 32.2 | 0.62
5 | R1b L21>DF13>Z39589>Z251>Z16943 | 0 | 31.6 | 0.61
6 | R1b U152>L2>DF103 | 0 | 31.39 | 0.62
7 | R1b U106>Z18>Z17>>L257>>Z375 | 0 | 31.22 | 0.52
8 | R1b DF27>Z196>L176.2>L165 | 0 | 31 | 0.57
9 | R1b L21>DF13>FGC5494 | 0 | 29.59 | 0.59
10 | P312>DF19>DF88>>S4268>Z17112>Z27257 | 0 | 30.37 | 0.53
11 | R1b P312>DF19>Z302 | 0 | 29.77 | 0.59
12 | R1b P312>L238 | 0 | 29.64 | 0.53
13 | R1b U152>L2>Z258>L20 | 0 | 28.84 | 0.6
14 | R1b L51>PF7589 | 0 | 28.95 | 0.59
15 | R1b U152>L2>PF6658>Z192 | 0 | 28.79 | 0.55
16 | R1b DF27>ZZ12>FGC20747 | 0 | 28.83 | 0.54
17 | R1b U106>Z381>Z301>S1688>U198 | 0 | 28.39 | 0.52
18 | R1b U106>Z381>Z301>L48>L47>Z159>S3251 | 0 | 28.64 | 0.53
19 | R1b L21>DF13>Z39589>Z251>>S11556 | 0 | 28.13 | 0.55
20 | R1b DF27>ZZ12>Z2552>L617 | 0 | 28.28 | 0.52
21 | P312>DF19>DF88>>S4268>Z17112>S17075 | 0 | 28.51 | 0.46 | So close, but I'm S17075- and they can't see me
22 | R1b U106>Z381>Z301>L48>S23189 | 0 | 27.51 | 0.52
23 | R1b L21>DF13>L513>S5668>Z16340 | 0 | 28.21 | 0.46
24 | R1b U152>Z36>BY1328 | 0 | 27.7 | 0.51
25 | R1b Z2103>Z2106>>Y5587 | 0 | 27.34 | 0.49
26 | R1b U106>Z381>Z156>DF98>S1911 | 0 | 27.51 | 0.5
27 | R1b U106>Z381>Z156>DF96>S11515>L1 | 0 | 26.99 | 0.48
28 | R1b DF27>Z196>Z209>Z295>Z216 | 0 | 27.26 | 0.51
29 | R1b Z2103>Z2106>>BY250 | 0 | 27.01 | 0.51
30 | R1b L21>>Z39589>DF49>ZP20>ZP21 | 0 | 26.36 | 0.5
31 | R1b U106>S12025 | 0 | 26.89 | 0.44
32 | R1b L21>FGC11134>>A1135 | 0 | 26.76 | 0.43
33 | R1b U106>Z381>Z156>DF96>S11515>FGC8410 | 0 | 26.96 | 0.46
34 | R1b L21>DF13>CTS1751 | 0 | 26.25 | 0.51
35 | R1b Z2103>L277 | 0 | 26.58 | 0.53
36 | R1b U106>Z381>Z301>>Z30>>Z346>DF101 | 0 | 26.69 | 0.45
37 | R1b DF27>Z196>L176.2>SRY2627>Z207 | 0 | 26.5 | 0.5
38 | R1b U106>>Z301>>Z30>>FGC7559>CTS10893 | 0 | 26.31 | 0.46
39 | R1b L21>DF13>Z39589>DF49>>ZP75 | 0 | 26.26 | 0.47
40 | R1b U106>Z381>Z301>L48>Z9>>Z326>>FGC10367 | 0 | 26 | 0.48
41 | R1b L21>DF13>FGC5494>Z16502 | 0 | 26.02 | 0.46
42 | R1b DF27>ZZ12>ZZ19>Z34609>Z2571 | 0 | 25.52 | 0.41
43 | R1b L21>DF13>Z39589>S1051 | 0 | 25.62 | 0.49
44 | R1b DF27>Z196>L176.2>SRY2627>CTS606 | 0 | 25.61 | 0.45
45 | R1b L21>DF13>DF21>FGC3213>S3058 | 0 | 25.76 | 0.44
46 | R1b U106>Z381>Z301>L48>Z9>>FGC12346 | 0 | 25.44 | 0.47
47 | R1b P312>DF99 | 0 | 25.46 | 0.48

Dewsloth
05-22-2018, 05:17 PM
I hereby rescind my previous critique of Nevgen as articulated above. In the last year, they have managed to update their site enough so that my results now read:

The calculated group
P312>DF19> DF88>>S4268>Z17112> Z43162 100%



Good job, folks!

Webb
05-24-2018, 04:03 PM
I hate to be a Debbie Downer, but they have a database, and you match spot on because you are in their database, so they already know what you are positive for. I found myself in their database, hence they accurately predicted my SNPs as well. Sorry.

Dewsloth
05-24-2018, 05:31 PM
You aren't a Debbie Downer at all: My point was they weren't sitting on their butts and had made an effort to update their site (which is all we can ever really hope for).
That "terminal" haplogroup of mine didn't exist until several months ago; in fact, it's likely I am the database sample they used.

Webb
05-24-2018, 07:07 PM
That’s what I mean. You match yourself. I match myself too!!!! It seems fishy.
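
To make the worry concrete, here is a rough sketch of why matching yourself proves little. It is not how NevGen works (a plain nearest-neighbour lookup stands in for their model as an assumption), and the haplotypes, labels, and function names are all made up for illustration. A fairer check is to hide each sample from the database before predicting it (leave-one-out).

```python
# Hypothetical 4-marker haplotypes with made-up haplogroup labels.
DATABASE = [
    ((13, 24, 14, 11), "A"),
    ((13, 24, 14, 10), "A"),
    ((14, 23, 15, 11), "B"),
    ((14, 23, 15, 12), "B"),
    ((13, 23, 14, 12), "C"),  # the only sample of its haplogroup
]


def genetic_distance(a, b):
    """Sum of absolute repeat-count differences, marker by marker."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict(haplotype, database):
    """Return the label of the closest database entry (stand-in predictor)."""
    closest = min(database, key=lambda entry: genetic_distance(haplotype, entry[0]))
    return closest[1]


def accuracy(database, leave_one_out):
    """Fraction of samples predicted correctly, optionally hiding each one first."""
    hits = 0
    for i, (haplotype, label) in enumerate(database):
        reference = database[:i] + database[i + 1:] if leave_one_out else database
        if predict(haplotype, reference) == label:
            hits += 1
    return hits / len(database)


print(accuracy(DATABASE, leave_one_out=False))  # every sample finds itself
print(accuracy(DATABASE, leave_one_out=True))   # the lone "C" sample is missed
```

With this toy data the first figure is 1.0 and the second is 0.8: the lone "C" sample is only ever predicted correctly when it is allowed to match itself.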

RobertCasey
05-24-2018, 08:13 PM
As many may know, I have the R-L21 SNP predictor tool, which works for around 50% of R-L21 but has not been updated since the rollout of Big Y:

http://www.rcasey.net/DNA/R-L21_SNP_Predictor_Intro.html

However, my new YSNP predictor is vastly improved over this version but is implemented for only a few L21 haplogroups. Since I am the admin for R-L226, I ran several L226 YSTR testers through NevGen that are boundary condition testers but have not been YSNP predicted. I did around 20 or 30 and, to my amazement, there was a 100% match between my new YSNP predictor and NevGen (for L226). So this technology works very well for haplogroups that have been analyzed, but there are a lot of branches to analyze, which takes a lot of time. NevGen uses a very different mathematical model from my binary logistic regression model, but with math there are always many ways to get the same answer.

However, their model cannot be inspected visually the way a binary logistic regression can. With the signature in hand for your haplogroup (1,200 to 2,500 years old only), you can add two columns to your spreadsheet: signature match and genetic distance from signature. With these two variables and an EXCEL macro, you can get 100% prediction accuracy until a new extreme boundary condition tester tests positive for the first time. Then you just run the model again to get new constants, and the new model's accuracy goes back to 100%. You can implement YSNP prediction yourself for any haplogroup that qualifies for prediction:

1) It must be older than 1,200 years and younger than 2,500 years using a 67 marker signature.
2) It must have minimal convergence.
3) You need at least 10 or 20 testers at 67 markers that are YSNP confirmed.
4) Prediction using 37 markers is not consistently reliable.
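
For anyone who would rather see the two spreadsheet columns outside EXCEL, below is a minimal Python sketch of the idea, not the actual macro. Everything concrete in it is an assumption for illustration: the marker values, the three-marker signature, the five-marker modal (a real one would use 67 markers), the function names, and the constants `b0`, `b_match`, and `b_dist`, which in practice would come from fitting the regression to YSNP-confirmed testers. Genetic distance here is a simple sum of step differences, which ignores special handling for multi-copy markers.

```python
import math

# Hypothetical values only: a real signature and modal would be derived
# from YSNP-confirmed testers at 67 markers.
SIGNATURE = {"DYS390": 23, "DYS391": 10, "DYS439": 11}
MODAL = {"DYS393": 13, "DYS390": 23, "DYS19": 14, "DYS391": 10, "DYS439": 11}


def signature_matches(haplotype, signature):
    """Column 1: how many signature markers the tester matches exactly."""
    return sum(1 for marker, value in signature.items()
               if haplotype.get(marker) == value)


def genetic_distance(haplotype, modal):
    """Column 2: summed absolute step differences from the haplogroup modal."""
    return sum(abs(haplotype[marker] - value) for marker, value in modal.items())


def probability_positive(matches, distance, b0, b_match, b_dist):
    """Binary logistic model: estimated chance the tester is YSNP positive."""
    z = b0 + b_match * matches + b_dist * distance
    return 1.0 / (1.0 + math.exp(-z))


# A made-up tester who misses one signature marker (DYS439).
tester = {"DYS393": 13, "DYS390": 23, "DYS19": 15, "DYS391": 10, "DYS439": 12}

matches = signature_matches(tester, SIGNATURE)   # 2
distance = genetic_distance(tester, MODAL)       # 2
# Placeholder constants, not fitted values.
p = probability_positive(matches, distance, b0=-4.0, b_match=3.0, b_dist=-0.5)
print(matches, distance, round(p, 3))
```

Because the model is just three constants applied to two visible columns, you can sort the spreadsheet by probability and see exactly which boundary testers sit near the 50% line, which is the visibility advantage mentioned above.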

Here is a paper that I recently updated for this prediction methodology:

http://www.rcasey.net/DNA/R_L21/Math_behind_R_L21_SNP_Predictor_20180202A.pdf

Here is the implementation of this methodology in EXCEL for R-L226:

http://www.rcasey.net/DNA/R_L226/Haplotrees/L226_Signatures.xlsx


If you are an admin of a large haplogroup, it is well worth your time to implement this prediction in EXCEL.