
Thread: A Mutation Tree for YFull's NGS "500 STRs"

  1. #1
    Registered Users
    Posts
    339
    Sex
    Location
    USA
    Nationality
    USA
    Y-DNA (P)
    R1b-L21 L513*

    United States of America Ireland Germany Belgium Wallonia

    A Mutation Tree for YFull's NGS "500 STRs"

    This is just a first pass at organizing the STR data from NGS testing, but it shows that doing so is at least possible.

    YFull currently has 317 kits in the L21 group reporting up to 503 STR markers from their NGS analyses. I ran those through SAPP to build an L21 mutation history tree for the 317 kits for all markers.

    When I say "all markers", I did have to clean up the data for no-reads, etc. I included only the 439 STRs with >50% reporting among the 317 kits. Perhaps more importantly, for this first run I simplified the remaining data by using the group modal values in place of the remaining blanks and converting the compound/complex repeats (e.g. "10.3", "16.g", etc.) to a "simple"/integer form. In later runs I can try different assumptions. I also need to gather more STR mutation rates for the non-FTDNA-tested markers; for now I just used the NIST standard 1.3x10^-3 for those.
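The cleanup steps described above can be sketched roughly as follows. This is only an illustration, not SAPP's input handling or the spreadsheet actually used: the function names, the dictionary layout, and in particular the rule for turning "10.3" or "16.g" into an integer (keep the digits before the dot) are assumptions.

```python
from collections import Counter

# Fallback rate quoted in the post for markers without a known rate.
DEFAULT_RATE = 1.3e-3


def to_integer_repeat(value):
    """Convert a raw STR value such as "10.3" or "16.g" to an int.

    Assumption: keep only the digits before the first dot.
    Blanks and no-reads come back as None.
    """
    if value is None:
        return None
    head = str(value).strip().split(".", 1)[0]
    return int(head) if head.isdigit() else None


def rate_for(marker, known_rates):
    """Use a known per-marker rate if available, else the default."""
    return known_rates.get(marker, DEFAULT_RATE)


def clean_kits(kits, min_coverage=0.5):
    """kits: {kit_id: {marker: raw_value}} (hypothetical layout).

    Keeps markers reported by more than min_coverage of the kits and
    fills the remaining blanks with the group modal value.
    Returns (cleaned_kits, modal_haplotype).
    """
    parsed = {
        kit: {m: to_integer_repeat(v) for m, v in row.items()}
        for kit, row in kits.items()
    }
    markers = {m for row in parsed.values() for m in row}
    n_kits = len(parsed)
    modal = {}
    for marker in sorted(markers):
        values = [row[marker] for row in parsed.values()
                  if row.get(marker) is not None]
        if len(values) / n_kits > min_coverage:
            modal[marker] = Counter(values).most_common(1)[0][0]
    cleaned = {
        kit: {m: row[m] if row.get(m) is not None else modal[m]
              for m in modal}
        for kit, row in parsed.items()
    }
    return cleaned, modal
```

Returning the modal haplotype alongside the cleaned kits makes it easy to see later which values were reported and which were filled in.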

    And finally I also included in the run YFull's Y-SNP tree under L21 and the terminal SNPs for each of the 317 kits so the program would follow the Y-SNP tree.

    SAPP completed the 317 kits with 439 data points in about 15 minutes. I have attached the output below. This is a huge amount of data to absorb, and neither the picture output nor the table output is particularly easy to follow. However, there are interesting points in the mutation history, including several possible STR signatures for sub-groups of L21 among the non-FTDNA markers. The run table output also includes 439-STR-long modal haplotypes for the major sub-L21 SNPs.
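For readers who want to reproduce something like the modal haplotypes and kit-to-kit genetic distances from the run tables, here is a minimal sketch. It assumes integer marker values keyed by kit and a terminal-SNP label per kit (both layouts are hypothetical), and it uses a plain sum of absolute repeat differences as the distance; how SAPP itself computes these is not stated here.

```python
from collections import Counter, defaultdict


def modal_haplotypes(kits, kit_snp):
    """Most common value per marker within each SNP group.

    kits: {kit_id: {marker: int}}, kit_snp: {kit_id: snp_label}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for kit_id, row in kits.items():
        for marker, value in row.items():
            grouped[kit_snp[kit_id]][marker].append(value)
    return {
        snp: {m: Counter(vals).most_common(1)[0][0]
              for m, vals in markers.items()}
        for snp, markers in grouped.items()
    }


def genetic_distance(a, b):
    """Sum of absolute repeat differences over markers in both haplotypes.

    Assumption: a simple stepwise count, one unit per repeat of difference.
    """
    return sum(abs(a[m] - b[m]) for m in a.keys() & b.keys())
```

The same `genetic_distance` function works on two kits or on two modal haplotypes, since both are just marker-to-value dictionaries.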

    Ignore kits YF01441 and YF01525 in the data; they're non-L21 kits in the group that the program correctly shows as outliers.

    For anyone outside of L21 I'm happy to repeat the exercise for other groups.

    Files (Dropbox):

    Output Run Tables (input tables plus SNP modal haplotypes and genetic distances between kits) (warning: 28Meg)

    439-STR Mutation History Tree as Picture (PNG) (warning: 14Meg & 54133x2946 pixels. Normally STR mutations are listed above the nodes/kits; here they are above and to the right of each node or kit for readability).

    439-STR Mutation History Tree as Table (HTML)
    Last edited by Dave-V; 01-31-2017 at 05:55 PM.

  2. The Following 12 Users Say Thank You to Dave-V For This Useful Post:

     cairn (02-01-2017),  dp (01-31-2017),  Eochaidh (02-01-2017),  gotten (01-31-2017),  Mag Uidhir 6 (01-31-2017),  MitchellSince1893 (01-31-2017),  Net Down G5L (02-05-2017),  palamede (02-01-2017),  Pribislav (02-02-2017),  rivergirl (02-02-2017),  Safedrugs (01-31-2017),  TigerMW (01-31-2017)

  3. #2
    Registered Users
    Posts
    4,210
    Sex
    Location
    Tierra de Ayllon
    Y-DNA (P)
    U152>L2>Z49>Z142>
    mtDNA (M)
    H1bt
    Y-DNA (M)
    I2a2a1b2a1b1>Y4925
    mtDNA (P)
    H37

    England Scotland Wales Germany Ireland Sweden Finns
    Quote Originally Posted by Dave-V View Post
    ...

    For anyone outside of L21 I'm happy to repeat the exercise for other groups.

    Files (Dropbox):

    Output Run Tables (input tables plus SNP modal haplotypes and genetic distances between kits) (warning: 28Meg)

    439-STR Mutation History Tree as Picture (PNG) (warning: 14Meg & 54133x2946 pixels. Normally STR mutations are listed above the nodes/kits; here they are above and to the right of each node or kit for readability).

    439-STR Mutation History Tree as Table (HTML)
    Thanks for the kind offer. Request a run on R-U152
    Y DNA line continued: Z142>Z12222>FGC12378>FGC12401>FGC12384
    37% English, 26% Scot/Ulster Scot, 14% Welsh, 14% German, 3% Ireland, 3% Nordic, 2% French/Dutch, 1% India

  4. #3
    Registered Users
    Posts
    178
    Sex
    Location
    Reston, VA, USA
    Ethnicity
    Anglo-Norman Irish
    Nationality
    USA
    Y-DNA (P)
    R1b-Z49>S8183
    mtDNA (M)
    U4c1a

    Ireland Germany England Scotland Wales
    Quote Originally Posted by MitchellSince1893 View Post
    Thanks for the kind offer. Request a run on R-U152
    Second the request. Thanks!

  5. #4
    Gold Class Member
    Posts
    2,065
    Sex
    Location
    Virginia, USA
    Y-DNA (P)
    DF27, FGC15733
    mtDNA (M)
    T2f3

    I don't want to be late to the party, so I request a run on R-DF27. Really it should be paired with U152 anyhow, but I gather the data will be completely out of hand without doing that. I don't have a computer at present that can handle it, but am currently shopping; maybe I will have a new one before you get around to us. Meanwhile, others are not so handicapped.

  6. The Following User Says Thank You to razyn For This Useful Post:

     dp (01-31-2017)

  7. #5
    Registered Users
    Posts
    4,291
    Sex
    Y-DNA (P)
    R1b
    mtDNA (M)
    H

    Quote Originally Posted by Dave-V View Post
    ...
    Dave, we have a DF1 YFull group too. Would it help you to be an admin for that? I haven't compared but there may be some DF1/L513 people who aren't in the L21 group.

  8. #6
    Registered Users
    Posts
    339
    Sex
    Location
    USA
    Nationality
    USA
    Y-DNA (P)
    R1b-L21 L513*

    United States of America Ireland Germany Belgium Wallonia
    Quote Originally Posted by MitchellSince1893 View Post
    Thanks for the kind offer. Request a run on R-U152
    MitchellSince1893 sent me the U152 NGS kits; results attached (note in this case I took out kits YF05298, YF01457, and YF05911 as they were not in R-U152). There were 445 STRs with >50% read coverage for this group.

    I have also attached here the input file to SAPP for reference, and the spreadsheet I used to create the various sections of the input file.

    Picture Output (PNG)

    Table Output (HTML) (same as picture output but in tabular form; easier to read/print/etc)

    Run Tables (HTML) (kit inputs, modal haplotypes, etc)

    Input File

    Spreadsheet used to create input file.

    Dave

  9. The Following 3 Users Say Thank You to Dave-V For This Useful Post:

     dp (01-31-2017),  MitchellSince1893 (02-01-2017),  Osiris (01-31-2017)

  10. #7
    Registered Users
    Posts
    339
    Sex
    Location
    USA
    Nationality
    USA
    Y-DNA (P)
    R1b-L21 L513*

    United States of America Ireland Germany Belgium Wallonia
    Quote Originally Posted by Mikewww View Post
    Dave, we have a DF1 YFull group too. Would it help you to be an admin for that? I haven't compared but there may be some DF1/L513 people who aren't in the L21 group.
    I had forgotten to join the DF1 YFull Group but just did this week (thanks!). Those kits should hopefully be in the L21 set too, but here are the results for DF1 by itself. 451 STRs in this set had >50% reporting, so there are a few more captured than in the L21 output.

    Output picture (PNG)

    Output table (HTML) (same as above but in table form)

    Run Tables (modal haplotypes, etc)

    Input File

    Spreadsheet used to create input file

  11. The Following 2 Users Say Thank You to Dave-V For This Useful Post:

     dp (01-31-2017),  Mag Uidhir 6 (02-03-2017)

  12. #8
    Registered Users
    Posts
    4,210
    Sex
    Location
    Tierra de Ayllon
    Y-DNA (P)
    U152>L2>Z49>Z142>
    mtDNA (M)
    H1bt
    Y-DNA (M)
    I2a2a1b2a1b1>Y4925
    mtDNA (P)
    H37

    England Scotland Wales Germany Ireland Sweden Finns
    Quote Originally Posted by Dave-V View Post
    ...

    When I say "all markers", I did have to clean up the data for no-reads, etc. I included only the 445 STRs with >50% read coverage for this group... Perhaps more importantly for this first run I simplified the remaining data by using the group modal values in place of the remaining blanks and converting the compound/complex repeats (e.g. "10.3", "16.g", etc) to a "simple"/integer form. In later runs I can try different assumptions. I need to also gather more STR mutation rates for the non-FTDNA-tested markers; for now for those I just used the NIST standard 1.3x10^-3....
    Feedback for you: This approach of using modal values in place of blanks resulted in me having a 36-marker difference with myself. I'm YF01489 and YY06577 (my father's BigY result and FGC result).

    I'd be interested to see if different assumptions get that number down.
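One way to see how modal filling can inflate a self-comparison like this is to count differences twice: once over only the markers that both results actually report, and once after filling blanks with the modal. The sketch below does that on made-up toy values (the marker names and numbers are invented for illustration and are not taken from either kit).

```python
def compare_reported(raw_a, raw_b):
    """Compare only markers with a real value in both results.

    raw_a, raw_b: {marker: int or None}. Returns (n_compared, n_different).
    """
    shared = [m for m in raw_a.keys() & raw_b.keys()
              if raw_a[m] is not None and raw_b[m] is not None]
    different = sum(1 for m in shared if raw_a[m] != raw_b[m])
    return len(shared), different


def compare_modal_filled(raw_a, raw_b, modal):
    """Count differences after replacing blanks with the group modal."""
    def fill(row):
        return {m: row[m] if row.get(m) is not None else modal[m]
                for m in modal}
    a, b = fill(raw_a), fill(raw_b)
    return sum(1 for m in modal if a[m] != b[m])


# Toy data: the same man tested twice, one test missing marker M2.
test_one = {"M1": 12, "M2": None, "M3": 15}
test_two = {"M1": 12, "M2": 14, "M3": 15}
group_modal = {"M1": 12, "M2": 13, "M3": 15}

print(compare_reported(test_one, test_two))
print(compare_modal_filled(test_one, test_two, group_modal))
```

In the toy case the two tests agree on every marker they both report, but the modal fill creates a difference at M2 because the man's real value there is off-modal.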
    Last edited by MitchellSince1893; 02-01-2017 at 01:40 AM.
    Y DNA line continued: Z142>Z12222>FGC12378>FGC12401>FGC12384
    37% English, 26% Scot/Ulster Scot, 14% Welsh, 14% German, 3% Ireland, 3% Nordic, 2% French/Dutch, 1% India

  13. #9
    Registered Users
    Posts
    339
    Sex
    Location
    USA
    Nationality
    USA
    Y-DNA (P)
    R1b-L21 L513*

    United States of America Ireland Germany Belgium Wallonia
    Quote Originally Posted by MitchellSince1893 View Post
    Feedback for you: This approach of using modal values in place of blanks resulted in me having a 36 marker difference with myself. I'm YF01489 and YY06577. (My father's BigY result and FGC result).

    I'd be interested to see if different assumptions get that number down.
    Good example, thanks. A better approach than just using modals might be to use the nearest reported value based on how the kits fall on the SNP tree. I'll try that next.
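A rough sketch of what "nearest reported value on the SNP tree" could look like: for each blank, walk up from the kit's terminal SNP and take the most common reported value among the other kits under the first ancestor that has any, falling back to the group modal. The tree layout (a child-to-parent dictionary), the function names, and the tie-breaking are all assumptions for illustration, not how SAPP or any later run actually did it.

```python
from collections import Counter


def ancestors(node, parent):
    """Yield the node itself, then each ancestor up to the root."""
    while node is not None:
        yield node
        node = parent.get(node)


def fill_from_nearest(kits, kit_snp, parent, fallback):
    """Fill None values from the closest kits on the SNP tree.

    kits: {kit_id: {marker: int or None}}
    kit_snp: {kit_id: terminal_snp}
    parent: {snp: parent_snp or None} (hypothetical tree encoding)
    fallback: {marker: group modal value}
    """
    # Map each SNP node to every kit at or below it.
    below = {}
    for kit_id, snp in kit_snp.items():
        for node in ancestors(snp, parent):
            below.setdefault(node, []).append(kit_id)

    filled = {}
    for kit_id, row in kits.items():
        new_row = dict(row)
        for marker, value in row.items():
            if value is not None:
                continue
            new_row[marker] = fallback[marker]
            for node in ancestors(kit_snp[kit_id], parent):
                seen = [kits[k].get(marker) for k in below[node]
                        if k != kit_id and kits[k].get(marker) is not None]
                if seen:
                    new_row[marker] = Counter(seen).most_common(1)[0][0]
                    break
        filled[kit_id] = new_row
    return filled
```

With this approach, two tests of the same man that sit on the same terminal SNP would tend to borrow each other's values for blanks, which should shrink the kind of self-difference reported above.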

  14. The Following User Says Thank You to Dave-V For This Useful Post:

     MitchellSince1893 (02-01-2017)

  15. #10
    Gold Class Member
    Posts
    2,718
    Sex

    Quote Originally Posted by Dave-V View Post
    I ran those through SAPP to build an L21 mutation history tree for the 317 kits for all markers.
    YFull already does something along these lines itself, although I don't know whether the employed algorithm is documented anywhere--it may still be very experimental. Just as an example, below is the signature YFull gives for L226. The so-called mutation rate actually has an inverse logic--more stars imply a slower mutation rate.

    This is found on the STRs tab of the "Info" on a tree branch.
    ---
    STR      RATE  ANC → DER  SHARED WITH
    DYS538   ****  10 → 9
    DYS716   ***   26 → 24
    DYS592   ***   11 → 12    YF04772
    DYS522   ***   10 → 11    YF06573
    DYR59    ***   13 → 15
    DYS526A  ***   13 → 14
    DYS510   ***   17 → 18    YF01431
    DYR44    **    11 → 10
    DYS557   **    16 → 15    YF03622, YF06573
    DYS439   **    12 → 11    R-Z2189
    DYS551   **    13 → 14
    DYR101   **    11 → 12    R-A16
    DYS679   **    13 → 14    R-A16
    DYS622   **    19 → 20
    DYR161   **    16 → 15    YF04772
    DYS456   **    16 → 15    R-Z2189
    DYS517   **    14 → 15    R-A16, YF06573
    DYS722   **    21 → 22    R-A16, YF06573
    DYR33    **    14 → 15    R-A16, YF04772, YF06573
    DYS546   **    15 → 16    R-Z2189, YF01431
    DYR75    **    14 → 13
    DYS630   **    21 → 20    YF01431, YF04772
    DYR1     **    17 → 16    R-Z2189
    DYS523   **    13 → 12
    DYR55    **    11 → 12    R-A16, YF01431, YF03622, YF04772
    DYR160   **    13 → 15
    DYF393   **    27 → 28    YF03622
    DYS471   **    28 → 29
    DYS626   **    26 → 27    R-A16, YF06573
    DYS664   **    50 → 51    YF01431, YF03622, YF04772
    DYS627   **    28 → 27    R-A16
    DYS526B  **    35 → 36
    DYS684   **    57 → 59
    DYS612   **    31 → 30    R-A16
    DYR170   **    37 → 34
    DYS688   **    79 → 80
    ---
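A signature like this is easier to scan once it is in a structured form. The sketch below hand-copies a few rows from the listing above into tuples and pulls out two things: markers with no "shared with" note, and markers whose ancestral-to-derived change is more than one repeat. Reading "no shared-with note" as "seen only on this branch" is an assumption about what YFull's note means, and the tuple layout is just for illustration.

```python
# (marker, ancestral, derived, shared_with) copied from a few rows above.
signature = [
    ("DYS538", 10, 9, []),
    ("DYS716", 26, 24, []),
    ("DYS592", 11, 12, ["YF04772"]),
    ("DYR59", 13, 15, []),
    ("DYS439", 12, 11, ["R-Z2189"]),
    ("DYR170", 37, 34, []),
]


def summarize(rows):
    """Return (markers without a shared-with note, multi-step changes).

    Multi-step changes are reported as {marker: derived - ancestral}.
    """
    unshared = [marker for marker, _, _, shared in rows if not shared]
    multi_step = {marker: der - anc
                  for marker, anc, der, _ in rows
                  if abs(der - anc) > 1}
    return unshared, multi_step


unshared, multi_step = summarize(signature)
print(unshared)
print(multi_step)
```

Multi-step changes are worth flagging because a tree builder that assumes one repeat per mutation event would count them as two or three events rather than one.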
    Last edited by lgmayka; 02-01-2017 at 03:42 PM.

  16. The Following User Says Thank You to lgmayka For This Useful Post:

     Dave-V (02-01-2017)
