Thread: Yfull - Accuracy of Estimated Dates

  1. #1
    Registered Users
    Posts
    769
    Sex
    Y-DNA (P)
    J-M241
    mtDNA (P)
    M5a2a

    Yfull - Accuracy of Estimated Dates

    Dates estimated by YFull play a significant role in everything from validating a claimed historical event to assessing the genetic impact of one or more historical events.


    • How accurate are the dates estimated by YFull for various clades?
    • Are there cases where they have underestimated or overestimated a date, whether by a small time frame or by a large margin?
    • Are there ways to validate the accuracy of an estimate?
    Last edited by discreetmaverick; 06-19-2021 at 02:53 PM.

  2. The Following 3 Users Say Thank You to discreetmaverick For This Useful Post:

     IronHorse (06-21-2021),  JMcB (06-19-2021),  Riverman (06-22-2021)

  3. #2
    Gold Class Member
    Posts
    2,971
    Sex
    Location
    Florida, USA.
    Ethnicity
    English, Scottish & Irish
    Nationality
    American
    Y-DNA (P)
    I-FT80854
    mtDNA (M)
    H1e2
    mtDNA (P)
    K1

    England Scotland Ireland Germany Bayern Italy Two Sicilies France
    All age estimations have to be considered within the parameters of their Confidence Intervals which are usually quite wide.


    For example, the age estimation for my branch is 225 ybp (or 1795). Our genealogical records indicate that none of us share a common ancestor after 1720, and other measurements point to a common ancestor who lived in the 1600s.


    YFull’s CI gives us the following range of dates: TMRCA CI 95% 400<->100 ybp
    (Or 1620 - 1920)

    All in all, I would say they’ve done about as well as can be expected, but there’s still a lot of leeway there, which we have been able to narrow down using genealogical records.


    https://www.yfull.com/tree/I-A13242/



    If I go further back in time to an earlier branch the ranges are naturally even wider:

    TMRCA 1150 ybp (Or 870 AD)

    formed CI 95% 2400<->1400 ybp, TMRCA CI 95% 1650<->700 ybp
    (Or Formed 380 BC - 620 AD, TMRCA 370 AD - 1320 AD)
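The calendar years given in parentheses above follow from subtracting each ybp figure from a reference year. Here is a minimal sketch of that conversion, assuming a reference year of 2020 (inferred from 225 ybp being given as 1795; the post does not state it) and ignoring the absence of a year zero, as the post's figures do. The function names are hypothetical.

```python
REFERENCE_YEAR = 2020  # assumption: inferred from "225 ybp (or 1795)"


def ybp_to_calendar(ybp, reference_year=REFERENCE_YEAR):
    """Convert years-before-present to a label such as '870 AD' or '380 BC'."""
    year = reference_year - ybp
    if year > 0:
        return f"{year} AD"
    # Simplification: no correction for the missing year zero.
    return f"{-year} BC"


def ci_to_calendar(older_ybp, younger_ybp, reference_year=REFERENCE_YEAR):
    """Convert an interval written as 'older<->younger ybp' to calendar labels."""
    return (ybp_to_calendar(older_ybp, reference_year),
            ybp_to_calendar(younger_ybp, reference_year))


print(ybp_to_calendar(225))        # 1795 AD
print(ci_to_calendar(400, 100))    # ('1620 AD', '1920 AD')
print(ybp_to_calendar(1150))       # 870 AD
print(ci_to_calendar(2400, 1400))  # ('380 BC', '620 AD')
print(ci_to_calendar(1650, 700))   # ('370 AD', '1320 AD')
```

All five results match the dates quoted in the post, so the figures are internally consistent under that reference year.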


    https://www.yfull.com/tree/I-A13248/



    Here’s an approximate visualization of the above, although the formed date has changed slightly since I made it:


    98A0E8E0-AC17-400C-8D85-0405C6AA139F.jpeg
    Last edited by JMcB; 06-20-2021 at 02:57 PM.
    Paper Trail: 42.25% English, 31.25% Scottish, 12.5% Irish, 6.25% German, 6.25% Sicilian & 1.5% French. Or: 86% British Isles, 6.25% German, 6.25% Sicilian & 1.5% French.
    LDNA(c): 86.3% British Isles (48.6% English, 37.7% Scottish & Irish), 7.8% NW Germanic, 5.9% Europe South (Aegean 3.4%, Tuscany 1.3%, Sardinia 1.1%)
    BigY 700: I1-Z140 >I-F2642 >Y1966 >Y3649 >A13241 >Y3647 >A13248 (circa 620 AD) >A13242/YSEQ (circa 765 AD) >FT80854 (circa 1650 AD).

  4. The Following 6 Users Say Thank You to JMcB For This Useful Post:

     Agamemnon (06-22-2021),  deadly77 (06-20-2021),  discreetmaverick (06-19-2021),  IronHorse (06-21-2021),  Michał (06-20-2021),  Riverman (06-22-2021)

  5. #3
    Moderator
    Posts
    979
    Sex
    Location
    United Kingdom
    Ethnicity
    European
    Nationality
    British
    Y-DNA (P)
    I-FGC74348
    mtDNA (M)
    J1c1

    United Kingdom Northumberland European Union
    I'd agree with JMcB on this. Most people just look at the midpoint of the age estimate listed on YFull's tree without considering the spread of the 95% confidence interval (found by hovering your cursor over the date). Within this confidence interval, it's fair to say that the TMRCA age estimates are accurate but imprecise (these two criteria often seem to be conflated, but really shouldn't be), with the caveat that, by construction, the true value is expected to fall outside a 95% confidence interval about 5% of the time.
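The difference between accuracy and precision, and the 5% caveat, can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not YFull's procedure: estimates are drawn from a normal distribution centred on a hypothetical true age, and the interval is the symmetric ±1.96σ band. The `true_age` and `sigma` values are made up for illustration.

```python
import random


def coverage(true_age, sigma, trials=10_000, z=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the true age."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        estimate = rng.gauss(true_age, sigma)  # unbiased but noisy estimate
        low, high = estimate - z * sigma, estimate + z * sigma
        if low <= true_age <= high:
            hits += 1
    return hits / trials


# Hypothetical values: a wide interval (imprecise) and a narrow one (precise).
wide = coverage(true_age=1150, sigma=240)
narrow = coverage(true_age=1150, sigma=40)
print(f"wide: {wide:.3f}, narrow: {narrow:.3f}")
```

Both runs should land near 0.95. The width of the interval changes how precise the estimate is, not how often the interval contains the true value.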

  6. The Following 4 Users Say Thank You to deadly77 For This Useful Post:

     Agamemnon (06-22-2021),  discreetmaverick (06-20-2021),  JMcB (06-20-2021),  Michał (06-20-2021)

  7. #4
    Registered Users
    Posts
    769
    Sex
    Y-DNA (P)
    J-M241
    mtDNA (P)
    M5a2a

    Quote Originally Posted by poi View Post
    FWIW - I am now permanently in YFull's Y-Tree, representing (for now) the R-BY160158 line. I have also messaged the unknown sample YF17019 who shares the same line as me within ~1900 years. I will update this thread when/if the person responds with his paternal origin. The other sample is a Tamil Srilankan from 1000Genomes, so effectively anonymous.

    Attachment 37492
    Quote Originally Posted by poi View Post
    YFull just increased my paternal line's TMRCA by 1000 years. So, instead of 1900ybp, as it was earlier, it is now 2900ybp.

    https://www.yfull.com/tree/R-BY160158/


    Continuing here,

    That is really a large discrepancy of 1000 years. Did YFull provide an explanation for such a large error in the estimate?

    Can this be independently verified?

  8. #5
    Gold Class Member
    Posts
    2,971
    Sex
    Location
    Florida, USA.
    Ethnicity
    English, Scottish & Irish
    Nationality
    American
    Y-DNA (P)
    I-FT80854
    mtDNA (M)
    H1e2
    mtDNA (P)
    K1

    England Scotland Ireland Germany Bayern Italy Two Sicilies France
    Quote Originally Posted by discreetmaverick View Post
    Continuing here,

    That is really a large discrepancy of 1000 Years, did Y full provide explanation for such a large error in estimation?

    Can this be independently verified ?

    Not sure if this was meant for this thread or another one, because poi hasn’t posted in this thread, so he might not see it if it was intended for somewhere else. Be that as it may, judging from the information in the [info] box, there’s a large difference (15.5 SNPs) in the number of Novel Variants each tester has, which is why the date has changed so much. At any rate, poi can always email them to see what’s up.



    Edit: Plus, judging from the coverage, one looks like it may be Y500 and the other Y700 or WGS.
    Last edited by JMcB; 06-22-2021 at 02:41 AM.

  9. The Following 3 Users Say Thank You to JMcB For This Useful Post:

     Agamemnon (06-22-2021),  discreetmaverick (06-22-2021),  parasar (06-22-2021)

  10. #6
    Registered Users
    Posts
    769
    Sex
    Y-DNA (P)
    J-M241
    mtDNA (P)
    M5a2a

    Quote Originally Posted by JMcB View Post
    Not sure if this was meant for this thread or another one because poi hasn’t posted in this thread. So he might not see this, if it was intended for somewhere else. Be that as it may, judging from the information in the [info] box, there’s a large difference (15.5 SNPs) in the number of Novel Variants each tester has. Which is why the date has changed so much. At any rate, poi can always email them to see what’s up.



    Edit: Plus, judging from the coverage, one looks like it may be Y500 and the other Y700 or WGS.

    PMed him.

    STR haplotype for the MRCA of R-BY160158 (minimum-parsimony hypothetical using 2 samples) with 632 STRs.


    The age is estimated as the average of the two samples: (1890 + 3963) / 2 = 2926 ybp

    The age estimates for the two samples differ by 2073 years.
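The arithmetic above can be checked directly. This sketch only reproduces the two per-sample figures quoted in the post; it says nothing about how YFull arrives at them.

```python
# Per-sample age estimates (ybp) quoted in the post for R-BY160158.
age_by_line = [1890, 3963]

mean_age = sum(age_by_line) / len(age_by_line)
spread = max(age_by_line) - min(age_by_line)

print(mean_age)  # 2926.5
print(spread)    # 2073
```

The exact mean is 2926.5 ybp, which the post gives as 2926.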




    Can the use of Big Y500 versus Big Y700 or WGS result in a difference of 2000 years?

    The other person is from the 1000 Genomes Project: https://www.internationalgenome.org/...sample/HG03685

    If the other person took Big Y700, would his result be closer to poi's? And wouldn't the date estimate for the clade then be increased by another 1000 years?

    Am I understanding this correctly?
    Last edited by discreetmaverick; 06-22-2021 at 01:58 PM.

  11. The Following User Says Thank You to discreetmaverick For This Useful Post:

     JMcB (06-22-2021)

  12. #7
    Gold Class Member
    Posts
    2,971
    Sex
    Location
    Florida, USA.
    Ethnicity
    English, Scottish & Irish
    Nationality
    American
    Y-DNA (P)
    I-FT80854
    mtDNA (M)
    H1e2
    mtDNA (P)
    K1

    England Scotland Ireland Germany Bayern Italy Two Sicilies France
    Quote Originally Posted by discreetmaverick View Post
    PMed Him,

    STR haplotype for MRCA of R-BY160158 (minimum parsimony hypothetical using 2 samples) with 632 STRs.

    Age estimated based on Average of two, Formula: (1890+3963)/2 = 2926 ybp

    Age estimation difference between the two samples is nearly 2073 yrs.




    Does use of Big 500 and Big 700 can result in difference of 2000 yrs?

    If other person get Big 700 wold be closer to Poi? then wouldn't Date estimate for the clade would be pushed forward by another 1000 yrs ?

    Am I understanding it correctly?

    The increased coverage in a Y700 test can lead to more Novel Variants being found. For example, in my Y500 results I had 7 Novel Variants and in my Y700 test I received 14 NVs. So, assuming I’ve interpreted the coverage figures correctly, it’s a possibility, as YFull uses the better-quality NVs for dating purposes. Do you know which version of the test these two samples have taken?

    I also noticed that YFull is in the process of updating their tree, and the difference in Novel Variants has changed slightly since last night (it’s now 14.35). So it’s possible they haven’t finished analyzing the samples.

  13. The Following 2 Users Say Thank You to JMcB For This Useful Post:

     Agamemnon (06-22-2021),  discreetmaverick (06-22-2021)

  14. #8
    Registered Users
    Posts
    695
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA (P)
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    YFull, the Big Tree and U106 use very similar TMRCA estimation methods. These are quite accurate down to predictable haplogroups, which originate from 1,500 to 2,500 YBP. Most of these estimates are within 10 to 20%, but some are not. Accuracy assumes a reasonable sample size; otherwise accuracy can suffer. Unfortunately, YFull continues to use this methodology for very recent times, where accuracy is very questionable, whereas the Big Tree normally cuts off these questionable TMRCAs.

    Below predictable haplogroups, counting YSNP mutations in a bottom-up approach is much more accurate. But this requires access to private YSNPs, and the estimates are strongly affected by the sample size. The problem with this approach is that the years per YSNP vary dramatically depending on the sample size and on which NGS test was used. Big Y700 has better coverage and slightly longer read lengths, which produce a much lower years-per-YSNP figure. Also, statistical variation can be very significant in the genealogical time frame. The assumptions for years per YSNP vary a lot, which is the weak part of these estimates. You should also use some blend of years per YSNP to adjust for the mixture of Big Y500 and Big Y700 testers.
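The post does not say how the blend should be computed. One simple possibility is an average weighted by the number of testers on each test, sketched here. The per-test rates and tester counts are hypothetical placeholders, not measured values, and `blended_years_per_snp` is an illustrative name.

```python
def blended_years_per_snp(tester_counts, rates):
    """Tester-weighted average of years per YSNP across test types."""
    total = sum(tester_counts.values())
    if total == 0:
        raise ValueError("need at least one tester")
    weighted = sum(tester_counts[test] * rates[test] for test in tester_counts)
    return weighted / total


# Hypothetical figures for illustration only.
rates = {"Big Y500": 120.0, "Big Y700": 80.0}   # years per YSNP
tester_counts = {"Big Y500": 1, "Big Y700": 3}  # testers under the branch

print(blended_years_per_snp(tester_counts, rates))  # 90.0
```

As more testers upgrade to the higher-coverage test, the blended figure moves toward that test's lower years per YSNP.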

    If you have over 500 Y67 testers under your predictable haplogroup, I use yet another method, which depends on surname clusters (I use 1000 AD for my Irish haplogroup). This is by far the most accurate measurement between the origin of the haplogroup and L226's 25 surname clusters. You can actually calculate the real years per YSNP for 20 to 30% of the upper part of the haplotree. This adjusts for statistical variation, as the years per YSNP is allowed to vary between 48 and 350 years. From these branches, you can calculate a pretty reasonable "average" years per YSNP, which will decline over time as the sample size increases (now around 70 years per YSNP branch). However, in the genealogical time frame, estimates will remain much older until you have around 100 testers under a surname cluster, which rarely happens. Using the average produces somewhat conservative estimates that are usually somewhat older than they should be. Here is a presentation on this approach:

    https://www.youtube.com/watch?v=sKaxanrxBgs&t=1682s

    This approach is best of breed for larger predictable haplogroups: L1065, M222, L193, CTS4466, etc. Its weak point is that TMRCA estimates under surname clusters, and TMRCAs where no surname clusters exist, tend to be older if the sample sizes are lacking.
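One reading of the surname-cluster approach is that each branch running from the haplogroup origin down to a surname cluster gives its own years per YSNP, and those are then averaged. The sketch below follows that reading and may not match the method in the presentation. The haplogroup origin year and the SNP counts are hypothetical, and the 48 to 350 range from the post is used only as a sanity check, which is an interpretation.

```python
MIN_RATE, MAX_RATE = 48, 350  # range of years per YSNP quoted in the post


def years_per_snp(older_year, younger_year, snp_count):
    """Years elapsed between two dated nodes divided by the YSNPs between them."""
    if snp_count <= 0:
        raise ValueError("snp_count must be positive")
    return (younger_year - older_year) / snp_count


# Hypothetical branches: (haplogroup origin year AD, surname-cluster year AD,
# number of YSNPs between them). 1000 AD is the surname date used in the post.
branches = [(300, 1000, 10), (300, 1000, 7), (300, 1000, 4)]

branch_rates = [years_per_snp(*branch) for branch in branches]
usable = [rate for rate in branch_rates if MIN_RATE <= rate <= MAX_RATE]
average_rate = sum(usable) / len(usable)

print(branch_rates)  # [70.0, 100.0, 175.0]
print(average_rate)  # 115.0
```

Branches with few YSNPs pull the average up, which fits the point that estimates run older when sample sizes are small.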

  15. The Following 6 Users Say Thank You to RobertCasey For This Useful Post:

     deadly77 (06-22-2021),  discreetmaverick (06-22-2021),  IanFitzpatrick (06-22-2021),  JMcB (06-22-2021),  Riverman (06-22-2021),  vettor (06-22-2021)

  16. #9
    Registered Users
    Posts
    103
    Sex
    Ethnicity
    Irish/Scot/English
    Y-DNA (P)
    FGC11134/ZZ44/BY9002
    mtDNA (M)
    H4a1a1

    Ireland Scotland England
    Everything Robert says above is spot on; you simply need a large sample size for accuracy.

    YFull, from what I have seen in groups I am familiar with, seems to do a reasonable job with dating but simply does not have enough samples on the vast majority of lines, especially when it comes to the last 1000 years.

  17. The Following 3 Users Say Thank You to IanFitzpatrick For This Useful Post:

     deadly77 (06-22-2021),  discreetmaverick (06-22-2021),  JMcB (06-22-2021)

  18. #10
    Registered Users
    Posts
    769
    Sex
    Y-DNA (P)
    J-M241
    mtDNA (P)
    M5a2a

    Quote Originally Posted by JMcB View Post
    The increased coverage in a Y700 test can lead to more Novel Variants being found. For example, in my Y500 results I had 7 Novel Variants and in my Y700 test I received 14 NVs. So, assuming I’ve interpreted the coverage figures correctly, it’s a possibility. As YFull uses the better quality NVs for dating purposes. Do you know what version of test these two sample have taken?

    I also noticed last night that YFull is in the process of updating their tree and the difference in Novel Variants has slightly changed from last night (it’s now 14.35). So it’s possible they haven’t finished analyzing the samples.
    The other person is from the 1000 Genomes Project: https://www.internationalgenome.org/...sample/HG03685

    How many SNPs/STRs were tested for that person?

    Sorry, I don't know about poi.


    How much does a novel variant add, on average, to the time estimate? Where in the formula or calculation does this value of 14.35 get plugged in?
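As a rough way to approach this question, a difference in SNP counts can be turned into years by multiplying by an assumed years-per-SNP rate. The rate of 144.41 years per SNP used below is an assumption taken from YFull's published description of its method, not from this thread, and the real calculation also adjusts for coverage, so treat this as a sketch only.

```python
YEARS_PER_SNP = 144.41  # assumption: not stated anywhere in this thread


def snp_gap_to_years(snp_difference, years_per_snp=YEARS_PER_SNP):
    """Approximate age difference implied by a difference in SNP counts."""
    return snp_difference * years_per_snp


print(round(snp_gap_to_years(14.35)))  # 2072
print(round(snp_gap_to_years(15.5)))   # 2238
```

Under that assumed rate, the 14.35 SNP difference corresponds to about 2072 years, close to the 2073-year gap between the two per-sample estimates (1890 and 3963 ybp). That would put each novel variant at roughly 144 years.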

  19. The Following User Says Thank You to discreetmaverick For This Useful Post:

     JMcB (06-22-2021)
