Thread: Explanations for varying rates of Y-Chromosome mutation

  1. #11
    Registered Users
    Posts
    77
    Sex
    Location
    Phoenix, AZ - USA
    Ethnicity
    Caucasian
    Nationality
    American
    Y-DNA
    R-Z49
    mtDNA
    N1b2

    Quote Originally Posted by redifflal View Post
    If the younger to procreate theory does hold water, it would be interesting to understand how that sustains over time. It would have to be a cultural context that allows for such a phenomenon beyond just availability of resources. It might even have very little to do with the so called dominance of the male line and more to do with the resident female population's cooperativity as was noted in the other thread recently about grandma power
    https://www.npr.org/sections/goatsan...ntent=20180607
    In my view, this doesn’t persist over time and that article has little relevance. Remember we’re not talking about hunter gatherers. We’re talking about subsistence farmers who had developed a complex society with power systems, including chieftains, fealty, slavery, and even a legal system. But wealth generation also depended on expanding into new territories. I think this was a very difficult time for women, who were forced to adjust to mobile warring tribes of men who used their superior weaponry to kill the women's sons and husbands and keep the women as trophies.

  2. #12
    Registered Users
    Posts
    514
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    Quote Originally Posted by theImmortal View Post
    I don't see how greater sample size = higher observed mutation rates. We should see a regression toward the mean. As you point out, we'll see random statistical variation for smaller sample sizes, but shouldn't they be scattered equally between high and low outliers? I take it you agree they are not.
    For L226 - As we get more and more YSNP branches, it becomes very obvious that the true years per YSNP mutation is much less than the figure used in predicting much older haplogroups. During my previous iteration of analysis, I used 70 years per YSNP, which made the dating of the YSNP branches fit neatly into the surname clusters. Since the number of branches went from 45 to 78, many of the YSNP branches now date younger than the surname clusters. Also, for very large surname clusters of 200 or 300 testers, they are finding years per YSNP to be around 30 years.

    Just think when L226 reaches 200 branches, a lot will be in the last 1,000 years - but we are continually finding more branches older than 1,000 years as well. As we continue to add more and more intermediate branches, the years per YSNP mutation will have to come down. Also, we are finding more and more surname clusters as well (now 25 vs. 15 for the previous analysis). This also contributes to forcing years per YSNP to decrease, since there are more fixed time points that the older YSNP progression must fit into. Another general rule of thumb: the more recent the time frame that you are analyzing, the higher the number of YSNP branches present. The older your time frame, the more lines have died out, which hides mutations that could have been observed if all lines had survived. So the younger the time frame, the fewer lines have died out.

    Also, as pointed out, the origin of surnames varies dramatically by geography. The example that I was giving of 1,000 years for surname creation applies to Irish and Scottish surnames. English surnames are a little younger, by another 100 or 200 years. Plus there are exceptions, as some royal lines used surnames more than 1,000 years ago. This surname date obviously does not work for Swedish lines, since they only converted to inherited surnames in the last 100 to 200 years. Turkey did not allow surnames for most people until just around 100 years ago.
    Last edited by RobertCasey; 06-13-2018 at 04:09 AM.

  3. The Following 2 Users Say Thank You to RobertCasey For This Useful Post:

     JMcB (06-14-2018), redifflal (06-13-2018)

  4. #13
    Junior Member
    Posts
    9
    Sex

    Quote Originally Posted by RobertCasey View Post
    For L226 - As we get more and more YSNP branches, it becomes very obvious that the true years per YSNP mutation is much less than the figure used in predicting much older haplogroups. During my previous iteration of analysis, I used 70 years per YSNP, which made the dating of the YSNP branches fit neatly into the surname clusters. Since the number of branches went from 45 to 78, many of the YSNP branches now date younger than the surname clusters. Also, for very large surname clusters of 200 or 300 testers, they are finding years per YSNP to be around 30 years.
    I am intrigued but baffled by this observation. Is this a consequence of imperfect coverage by any single Big Y test, with the increased number of identified SNPs in a given line with more men tested due to the higher likelihood of two men sharing a region of the Y chromosome usually underrepresented by Big Y testing? Surely if coverage were perfect and complete, the number of SNP differences between any random pair of men would, after dividing by two, be an unbiased estimate of the number of mutations in the time since their common ancestor so long as there are no back mutations. The estimate in that case should not systematically decrease by adding more descendants of that ancestor, hence my interest to understand this better.
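As a rough illustration of the pairwise estimate described in this post, here is a minimal Python sketch. It assumes perfect coverage and no back mutations, as the post does. The function name and the SNP labels are hypothetical and do not come from any real test results.

```python
def mutations_per_lineage(snps_a, snps_b):
    """Estimate mutations on each line of descent since the common ancestor.

    A SNP carried by exactly one of the two men arose on one of the two
    lines after they split, so the count of such SNPs is divided by two.
    """
    differing = set(snps_a) ^ set(snps_b)  # SNPs found in only one man
    return len(differing) / 2


# Hypothetical SNP labels for two men descended from the same ancestor.
man_1 = {"shared_1", "shared_2", "a1", "a2", "a3"}
man_2 = {"shared_1", "shared_2", "b1", "b2", "b3", "b4", "b5"}

estimate = mutations_per_lineage(man_1, man_2)
print(estimate)  # 4.0
```

Under these assumptions the estimate depends only on the two men compared, which is why it would not be expected to shift systematically as more descendants are tested.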

  5. The Following 2 Users Say Thank You to neanderling For This Useful Post:

     JMcB (06-14-2018), redifflal (06-13-2018)

  6. #14
    Registered Users
    Posts
    514
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    Quote Originally Posted by neanderling View Post
    I am intrigued but baffled by this observation. Is this a consequence of imperfect coverage by any single Big Y test, with the increased number of identified SNPs in a given line with more men tested due to the higher likelihood of two men sharing a region of the Y chromosome usually underrepresented by Big Y testing? Surely if coverage were perfect and complete, the number of SNP differences between any random pair of men would, after dividing by two, be an unbiased estimate of the number of mutations in the time since their common ancestor so long as there are no back mutations. The estimate in that case should not systematically decrease by adding more descendants of that ancestor, hence my interest to understand this better.
    Years per YSNP is based only on shared YSNP mutations that define branches; the number of private SNPs and branch equivalents is not in play. Around every six months or so, I attempt to determine the TMRCA values. As more and more intermediate branches between 1,000 and 1,500 years are discovered via a larger sample size of Big Y testers (and increased numbers of branches under L226), the more I have to reduce the years per YSNP for two reasons: 1) there are now many more intermediate branches that have been discovered between 1,000 and 1,500 years; 2) there are now many more surname clusters found that are around 1,000 years of age, which require YSNP branches not to have estimates younger than 1,000 years.

    As the sample size grows for Big Y testers, the number of branches continues to increase. Many times this involves branch equivalents becoming branches, which adds another branch in their path that is older than 1,000 years. It also includes new YSNPs that create intermediate YSNP branches. So the growing number of branches between 1,500 years (the age of L226) and 1,000 years (where an increasing number of surname clusters are being added) results in having to continually decrease the years per YSNP.

    Here is an example. Previous evaluation (45 branches) - L226 > FGC5660 > Z17669 > DC63 > DC29 > DC30 > DC189. We knew that DC189 was older than 1,000 years due to the mixture of surnames: 6 McNamara, 8 Bryan, 2 Small, 2 Davis, 2 O'Neil and 6 other surnames. Obviously, we are getting close to 1,000 years with 8 Bryans and 6 McNamaras - but having nine other surnames implies it is just over 1,000 years. Plus the Bryans are tightly clustered together. DC189 was estimated to be 1,500 - (6 x 70) = 1,080, which is pretty close.

    Fast forward to the present analysis (78 branches) - L226 > FGC5660 > Z17669 > DC63 > DC29 > DC367 > DC31 > DC30 > DC189 > DC191 > DC340. The Bryans were split off to their own branch, but DC191 remains older than 1,000 years due to surnames: 5 McNamara, 2 Davis, 2 O'Neil and 4 other surnames. So using 70 years per YSNP, we get the following estimate for DC191 = 1,500 - (9 x 70) = 870 years old. This is younger than what the surnames imply, which is around 1,060 years. Even with the new 60 years per YSNP, the estimate for this path still comes out low: 1,500 - (9 x 60) = 960. Allowing for some statistical variation, the SNP and surname estimates are much closer with 60 years - but 870 is just way too young. For this particular path, 50 years per YSNP would be more appropriate: 1,500 - (9 x 50) = 1,050, which is very close to 1,060 years (two generations before surnames).
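The arithmetic in the two examples above can be reproduced with a short Python sketch. It takes the 1,500 year age of L226 and the SNP level counts from the post. The function names are hypothetical, and the second function simply inverts the same formula to show which rate a surname-based date would imply.

```python
L226_AGE = 1500  # years; the age of L226 used in the post


def branch_age(snp_levels, years_per_snp, root_age=L226_AGE):
    """Age of a branch sitting snp_levels branch-defining SNPs below the root."""
    return root_age - snp_levels * years_per_snp


def implied_years_per_snp(snp_levels, target_age, root_age=L226_AGE):
    """Years per SNP that would make the branch exactly target_age years old."""
    return (root_age - target_age) / snp_levels


# DC189 in the 45-branch tree: 6 levels below L226.
print(branch_age(6, 70))  # 1080

# DC191 in the 78-branch tree: 9 levels below L226.
for rate in (70, 60, 50):
    print(rate, branch_age(9, rate))  # 870, 960, 1050

# Rate that would place DC191 at the 1,060 years suggested by surnames.
print(round(implied_years_per_snp(9, 1060), 1))  # 48.9
```

Every intermediate branch added to a path raises the level count, so holding the surname-based date fixed pushes the implied years per YSNP down.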
    Last edited by RobertCasey; 06-13-2018 at 02:37 PM.

  7. The Following User Says Thank You to RobertCasey For This Useful Post:

     redifflal (06-13-2018)

  8. #15
    Junior Member
    Posts
    9
    Sex

    Quote Originally Posted by RobertCasey View Post
    the more I have to reduce the years per YSNP for two reasons: 1) there are now many more intermediate branches that have been discovered between 1,000 and 1,500 years; 2) there are now many more surname clusters found that are around 1,000 years of age, which require YSNP branches not to have estimates younger than 1,000 years.
    So the reason for the dependency on sample size is that more samples increases the likelihood of finding two men who descend from different brothers of a very early user of a given surname. If you have only two men with a surname and don't have the paper trail that connects them, perhaps they actually have a common ancestor from 300 years ago, but you can wonder if the common ancestor was 800 years ago. Perhaps the fifth man tested with that surname connects from an earlier branch, demonstrable by SNPs, making it no longer possible to consider the connection of the first two men to have been so ancient. Once you've got men from the earliest still extant split of the surname (though you never know when that has happened without a full paper trail), you have the best estimate that you can achieve. The other element here is having a good estimate of when surnames were adopted, which is what provides the number of years.

    I wonder if the high rates that you are seeing may also be providing clues as to breakdowns in the assumptions that go into mutation rate estimates. Specifically, to convert SNPs into years rather than into generations, you have to assume either that the higher mutation rate in older fathers conveniently cancels out the number of extra years elapsed (e.g., that the mutation rate in a 72 year old father is twice that of a 36 year old father and 4 times that of an 18 year old father) or that the distribution of paternal ages over the course of a few hundred years doesn't vary much from one lineage to another. I can imagine that a pedigree who consistently married early would vastly outnumber a pedigree that consistently married late after several hundred years, so there is a selection bias towards overestimating the mutation rate if you don't have a paper trail and have to base the analysis on surnames and the date of surname adoption. If the increased mutation rate in older men falls short of compensating for the decreased number of generations ("older" here might just mean age 30 versus age 18), this would skew even more severely.
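To give a feel for how strongly an early-marrying pedigree could come to outnumber a late-marrying one, here is a deliberately simplified Python sketch. Every number in it is an assumption made for illustration only: a fixed paternal age in each pedigree, exactly two surviving sons per father, and a 300 year window. None of these figures comes from the thread or from real demographic data.

```python
def generations(span_years, father_age):
    """Whole generations that fit into span_years at a fixed paternal age."""
    return span_years // father_age


def male_lines(span_years, father_age, sons_per_father=2):
    """Male-line descendants of one founder if every father leaves the same number of sons."""
    return sons_per_father ** generations(span_years, father_age)


SPAN = 300  # years; an arbitrary illustrative window

for age in (18, 30):
    print(age, generations(SPAN, age), male_lines(SPAN, age))
# 18 -> 16 generations, 65536 lines
# 30 -> 10 generations, 1024 lines
```

With these toy inputs the early-marrying pedigree ends up with 64 times as many male lines, so a random sample of living testers would be dominated by lineages that passed through more generations in the same number of years.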

    If there are systematic cultural variations in reproductive age, this does not bode well for a universally valid estimation of the date of the most recent common ancestor based on SNP counts within the genealogically relevant time frame, no matter how many men are tested. The more reliable estimates will necessarily derive from early branches that have certain paper trail documentation and SNP results from known descendant lines and from an understanding of the dates and patterns of surname adoption.

  9. The Following User Says Thank You to neanderling For This Useful Post:

     JMcB (06-14-2018)

  10. #16
    Registered Users
    Posts
    514
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    Quote Originally Posted by neanderling View Post
    Specifically, to convert SNPs into years rather than into generations, you have to assume either that the higher mutation rate in older fathers conveniently cancels out the number of extra years elapsed (e.g., that the mutation rate in a 72 year old father is twice that of a 36 year old father and 4 times that of an 18 year old father) or that the distribution of paternal ages over the course of a few hundred years doesn't vary much from one lineage to another. I can imagine that a pedigree who consistently married early would vastly outnumber a pedigree that consistently married late after several hundred years, so there is a selection bias towards overestimating the mutation rate if you don't have a paper trail and have to base the analysis on surnames and the date of surname adoption. If the increased mutation rate in older men falls short of compensating for the decreased number of generations ("older" here might just mean age 30 versus age 18), this would skew even more severely.

    If there are systematic cultural variations in reproductive age, this does not bode well for a universally valid estimation of the date of the most recent common ancestor based on SNP counts within the genealogically relevant time frame, no matter how many men are tested. The more reliable estimates will necessarily derive from early branches that have certain paper trail documentation and SNP results from known descendant lines and from an understanding of the dates and patterns of surname adoption.
    I used to think years per generation was an issue. But in my calculations, this is not a factor. There are two assumptions: 1) the accuracy of the TMRCA date of your particular haplogroup - this methodology is pretty well established and accepted (with the exception of how people deal with massive branch equivalents, which are common). For L226, we used to have this exposure, but with two recent Big Y tests, 80 % of the L226 equivalents have moved up to newly discovered branches just above L226; 2) the dating of surname clusters based on surname diversity. Surname origins may be a little older or somewhat younger for different surnames, and this extra 100 years makes a difference. Also, what are the acceptable NPE rates for surname clusters? I have made a stab at using variable surname cluster dating based on the number of surnames involved and the percentage of the main surname. I am still playing around with varying the 1,000 year date from 900 to 1,100 years based on how many surnames are involved (could be an early NPE event) and what the percentage of the main surname is (obviously a cluster with 100 % one surname is probably more recent than one with 70 % of one surname).

    The age at reproduction and the average years per generation are minor factors. I think with 150 read length NGS testing, the real YSNP rate will approach one mutation per generation with enough testing. At 2,000 read length, this then becomes two YSNP mutations per generation. So all we need is 1000X testers and it will truly reveal the YSNP rate. Also, another factor is usage of YSNPs in complex areas and inserts/deletes. This increases the years per YSNP by 20 or 30 percent or more. My analysis includes these - YFULL does not include these in their estimates.

  11. The Following User Says Thank You to RobertCasey For This Useful Post:

     JMcB (06-14-2018)

  12. #17
    Registered Users
    Posts
    85
    Sex
    Location
    Ireland
    Y-DNA
    R1b-Z255
    mtDNA
    K1C1B

    Ireland Ireland County Tipperary
    In terms of dating surnames I think it is fair to believe that they were first adopted by the then elite and only a few generations after that by the branches that split off earlier. In the case of L226 I think the O'Brien surname would have come before many of the others. I see this in my own group [Z255/S219 > Z16429 > BY519 > ZZ7 > DYS435=12 > Z16430] where the Byrne surname seems to emerge earlier than the disparate others under Z16430.

  13. The Following User Says Thank You to CillKenny For This Useful Post:

     JMcB (06-14-2018)

  14. #18
    Registered Users
    Posts
    514
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    Quote Originally Posted by CillKenny View Post
    In terms of dating surnames I think it is fair to believe that they were first adopted by the then elite and only a few generations after that by the branches that split off earlier. In the case of L226 I think the O'Brien surname would have come before many of the others. I see this in my own group [Z255/S219 > Z16429 > BY519 > ZZ7 > DYS435=12 > Z16430] where the Byrne surname seems to emerge earlier than the disparate others under Z16430.
    We know that many O'Briens are direct descendants of King Brian Boru (born 941). So it may be off one or two generations for this royal line. But they did not use surnames until several generations later - and probably many of his descendants took other surnames by the time that surnames were used. However, Sir Conor O'Brien, who is a direct descendant of King Brian Boru, has been YDNA tested. He is a proven descendant of King Brian Boru since one of his titles is "Chief of the O'Briens." This title has been formally passed down for 40 generations and documentation of this title is still available today.

    We now have ten YSNP branches that are associated with this very large surname cluster. Y5610 dates to 1,080 years old, 1,500 - (7 x 60) = 1,080, which is pretty close to the time of King Brian Boru. Another major branch, DC36, which has a large share of O'Briens (60 %), may later be proven to be part of this line as well (brother of Y5610). However, many branches that are not YSNP tested have been predicted as DC36, and further YSNP testing could move predicted DC36 testers to other non-DC36 branches.

    The royal O'Brien surname cluster has an amazing ten YSNP branches associated with it. It also has another twelve YSTR branches, for a total of 22 branches that divide up this surname cluster. This surname cluster really wants to get good time estimates, but years per YSNP gets less and less accurate the more recent you get due to lack of sample size. Their lowest branch (Y44000) is 12 levels below L226, giving it a date of 780 years ago - this estimate is way too old due to lack of sample size. During each iteration, they add more and more branches to their surname cluster - but they do not understand that other older lines become more tested as well, and every six to twelve months I have to reduce the years per YSNP (this time from 70 years per YSNP to 60 years per YSNP).

    The only alternative for dating would be YSTR based - but the statistical variation and accuracy of these estimates would be very problematic. So as they add more YSNP branches and get more levels, the time estimates do not change, because the years per YSNP gets lowered as the levels are added. But YSTR estimates below YSNP dating are probably better than pure speculation - but not by much.
    Last edited by RobertCasey; 06-14-2018 at 02:29 PM.

  15. The Following User Says Thank You to RobertCasey For This Useful Post:

     JMcB (06-14-2018)
