
Age of Z251 Based on YFull SNP Rate Constants



seferhabahir
03-21-2015, 03:32 AM
See the following paper (published today) on new rate constants for Y-SNPs based on NGS data

http://rjgg.molgen.org/index.php/RJG...e/view/151/175

I went and checked all of my FGC SNPs to see how many fell into the paper's combBED regions (defined in Table 1 of the paper's supplement) and came up with 28 of them. So, using their 160 years per SNP, this gives 4480 years ago, consistent with my previous statements that my particular lineage likely diverged from the rest of the Z251 cousins 4000-5000 years ago.

Those 28 included FGC11986 (in a combBED region) but not A555 (not in a combBED region). If I add the SNPs we know are above FGC11986 (S9294, S11556, Z251, and DF13), that makes 32 SNPs, which gives 5120 years ago. That is consistent with my thinking on DF13 and with YFull's estimates of 5200 years ago for DF13 and 5100 years ago for Z251.

Perhaps we are seeing the dots line up well enough to have a reasonable story. It would be interesting to see if these dates hold up for other Z251 Big Y results. You will need to take your list of unique SNPs and see how many are in their combBED regions (which total about 7.6 Mbp). Note that most Big Y tests looked at more of the Y, up to perhaps 10.3 or 11.0 Mbp, in which case you would need to use fewer than 160 years per Big Y SNP. The paper states 118 years as the coefficient for 10.3 Mbp and 111 years for 11.0 Mbp. For FGC coverage, which at 23 Mbp is much more than Big Y, they use 53 years per SNP as the rate.
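The coefficients quoted above appear to follow from a single rate constant divided by the number of base pairs covered. The following is a minimal sketch under that assumption, using the 8.2E-10 rate constant quoted later in this thread; the function names are illustrative and are not taken from the paper.

```python
# Rate constant quoted in this thread from the YFull paper:
# SNP mutations per base pair per year.
RATE_PER_BP_PER_YEAR = 8.2e-10


def years_per_snp(coverage_bp, rate=RATE_PER_BP_PER_YEAR):
    """Average years between SNPs across `coverage_bp` base pairs (assumed model)."""
    return 1.0 / (rate * coverage_bp)


def block_age(snp_count, coverage_bp, rate=RATE_PER_BP_PER_YEAR):
    """Estimated age in years of a block of `snp_count` SNPs."""
    return snp_count * years_per_snp(coverage_bp, rate)


if __name__ == "__main__":
    # Coverage figures mentioned in the post above.
    for label, coverage in [
        ("combBED (7.6 Mbp)", 7.6e6),
        ("Big Y (10.3 Mbp)", 10.3e6),
        ("Big Y (11.0 Mbp)", 11.0e6),
        ("FGC (23 Mbp)", 23e6),
    ]:
        print(f"{label}: {years_per_snp(coverage):.1f} years per SNP")
```

With the unrounded coefficient, 28 combBED SNPs come out slightly above the 4480 years obtained from the rounded 160 years per SNP.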

seferhabahir
03-23-2015, 11:27 PM
If I add the SNPs we know are above FGC11986 (S9294, S11556, Z251, and DF13), that makes 32 SNPs, which gives 5120 years ago. That is consistent with my thinking on DF13 and with YFull's estimates of 5200 years ago for DF13 and 5100 years ago for Z251.

It appears that the latest incarnation of the YFull tree puts L21 and DF13 at 4700 ybp, and Z251 at 4200 ybp. I think Z251 is closer to the age of L21 and DF13 than they suggest. In any case, I think it is still likely that the line represented by A555 diverged from the rest of Z251 perhaps 4000 years ago or more, because of the 28 unique SNPs in the combBED regions.

http://www.yfull.com/tree/R1b/

donmatthews0910
03-25-2015, 04:44 AM
Are the 28 unique SNPs under S11556 below S9294?

seferhabahir
03-25-2015, 05:42 AM
Are the 28 unique SNPs under S11556 below S9294?

Yes, they are below S11556>S9294>FGC11986 and seem to only appear in the 251-EE group (the one identified by A555).

Dave-V
03-28-2015, 06:06 PM
You will need to take your list of unique SNPs and see how many are in their combBED regions (which total about 7.6 Mbp). Note that most Big Y tests looked at more of the Y, up to perhaps 10.3 or 11.0 Mbp, in which case you would need to use fewer than 160 years per Big Y SNP. The paper states 118 years as the coefficient for 10.3 Mbp and 111 years for 11.0 Mbp. For FGC coverage, which at 23 Mbp is much more than Big Y, they use 53 years per SNP as the rate.

I wrote up a little Excel file to automate this. You cut-and-paste in a list of SNPs (by position, not the SNP names) and it picks out the ones within the YFull CombBED region, then calculates the age of the SNP list (treating it as a single phylogenetic block) using the YFull coefficients.

The Excel file is at https://dl.dropboxusercontent.com/u/106196821/YFull-based%20SNP%20Block%20Age%20Estimator.xlsx.
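For anyone who prefers a script, here is a rough sketch of the kind of calculation described above. It is not the spreadsheet's actual formulas. The intervals in `EXAMPLE_REGIONS` are made up purely for illustration; the real CombBED regions are listed in Table 1 of the paper's supplement and are not reproduced here. The function names are also hypothetical.

```python
from bisect import bisect_right

# HYPOTHETICAL intervals for illustration only. Replace with the real
# CombBED regions from the paper's supplement (sorted, non-overlapping).
EXAMPLE_REGIONS = [
    (2_800_000, 3_000_000),
    (6_700_000, 9_000_000),
    (14_000_000, 17_500_000),
]


def in_regions(pos, regions):
    """Return True if `pos` lies inside any inclusive (start, end) interval."""
    starts = [start for start, _ in regions]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and regions[i][0] <= pos <= regions[i][1]


def estimate_block_age(positions, regions, tested_length_bp=7.6e6, rate=8.2e-10):
    """Count the positions inside `regions` and convert the count to years.

    The SNP list is treated as a single phylogenetic block. Defaults are the
    figures quoted in this thread (7.6M bp average, 8.2E-10 rate constant).
    """
    hits = [p for p in positions if in_regions(p, regions)]
    years_per_snp = 1.0 / (rate * tested_length_bp)
    return len(hits), len(hits) * years_per_snp


if __name__ == "__main__":
    positions = [2838619, 4222379, 6753258, 13384620, 14203503]
    count, age = estimate_block_age(positions, EXAMPLE_REGIONS)
    print(f"{count} SNPs in the example regions, about {age:.0f} years")
```

The binary search keeps the lookup fast for long SNP lists, but a plain loop over the intervals would give the same answer.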

Mag Uidhir 6
03-29-2015, 12:04 AM
Thanks Dave!
Using your tool and Mike's discovery spreadsheet, I ran the numbers for L513-A2. It shows a phylogenetic age of 2727.86 ybp (i.e., 778 BC). Hmmmmm? I think I like this.
Brad




Muircheartaigh
03-29-2015, 11:01 AM
I ran Dave's program, inserting my BAM-filtered Full Genome results for SNPs downstream of Z253. Total SNPs in the CombBED regions = 33; calculated age of Z253 = 5295 years.

This figure compares with c. 4800 years if I use my 41 Big Y filtered SNPs and the 118 years per Big Y SNP suggested in the YFull paper. My Big Y coverage is 10.3 million bp.


Coefficients for Age Calculations

Rate Constant of SNP Mutations: 8.2E-10
95% CI Min: 7.00E-10
95% CI Max: 9.40E-10

Tested CombBED Length: 7.60E+06 bp (or enter exact value from YFull BAM Report)

Resulting SNP Mutation Rate: 160.46 years per SNP mutation
(High of 187.97 years, Low of 139.98 years)

Phylogenetic Block Age Estimates:
Number of SNPs in CombBED Region: 33

Calculated Age of Phylogenetic Block: 5295.25 ybp = 3345 BC
95% CI High = 6203.01 ybp = 4253 BC
95% CI Low = 4619.26 ybp = 2669 BC

POS        POS        In CombBED Y/N (1/0)?
2838619    2838619    1
2888763    2888763    1
4222379    4222379    0
4880969    4880969    0
6450549    6450549    0
6753258    6753258    1
6972701    6972701    1
7301905    7301905    1
7814875    7814875    1
8237712    8237712    1
8823011    8823011    1
13384620   13384620   0
14203503   14203503   1
14361011   14361011   1
14454459   14454459   1
14706616   14706616   1
14809558   14809558   1
15937461   15937461   1
16716134   16716134   1
16850461   16850461   1
16959270   16959270   1
17101407   17101407   1
17234684   17234684   1
17473270   17473270   1
17539309   17539309   0
17814452   17814452   1
17874584   17874584   1
18543377   18543377   0
18706870   18706870   1
18880196   18880196   1
18948915   18948915   1
18995241   18995241   1
19262308   19262308   1
21132347   21132347   1
22466087   22466087   0
22475629   22475629   0
23954436   23954436   0
28627906   28627906   0
3013190    3013190    0
3747196    3747196    0
4338602    4338602    0
4355698    4355698    0
5108731    5108731    0
5713815    5713815    0
5875596    5875596    0
5946702    5946702    0
6932320    6932320    1
7914194    7914194    1
13284248   13284248   0
13457518   13457518   0
13595280   13595280   0
14018526   14018526   1
14618077   14618077   0
18725920   18725920   1
21248023   21248023   0
22236716   22236716   0
22901682   22901682   1
25029998   25029998   0
24272809   24272809   0
25931596   25931596   0
25982323   25982323   0
26069601   26069601   0
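The summary figures in this output can be reproduced with a few lines. This is a sketch only: it assumes the age is the SNP count divided by (rate times tested length), with the 95% range coming from the two rate limits, and that "ybp" is counted back from 1950. The 1950 reference is inferred from the figures above (5295 ybp = 3345 BC), not stated anywhere in the thread, and the function names are hypothetical.

```python
# Rate constant and 95% limits as shown in the spreadsheet output above.
RATE = 8.2e-10
RATE_CI_MIN = 7.0e-10
RATE_CI_MAX = 9.4e-10

# ASSUMPTION: "present" is 1950, inferred from 5295 ybp = 3345 BC.
REFERENCE_YEAR = 1950


def age_with_ci(snp_count, tested_length_bp=7.6e6):
    """Return (estimate, low, high) ages in years before present.

    A faster mutation rate gives a younger age, so the upper rate limit
    produces the low end of the age range and vice versa.
    """
    def age(rate):
        return snp_count / (rate * tested_length_bp)

    return age(RATE), age(RATE_CI_MAX), age(RATE_CI_MIN)


def ybp_to_bc(ybp):
    """Convert years before present to a BC year by simple subtraction."""
    return ybp - REFERENCE_YEAR


if __name__ == "__main__":
    estimate, low, high = age_with_ci(33)
    for label, value in [("Estimate", estimate), ("Low", low), ("High", high)]:
        print(f"{label}: {value:.2f} ybp = {ybp_to_bc(value):.0f} BC")
```

The simple subtraction ignores the missing year zero, which matches the rounded figures shown above.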

Mag Uidhir 6
03-29-2015, 08:40 PM
Dave,

Using Dennis Wright's Big Tree data in conjunction with Mike Walsh's L21 SNP Discovery Tree to populate your Age Estimate tool, I ran the numbers per block as Dennis IDs the SNP blocks and found branch points in the SNP timeline that roughly match known Maguire pedigree family branches.

Overall our age counts back to 3529 years below SNP Z16340 (L21>DF13>L513>S5668>Z16340). This is in line with current L21, DF13, and L513 estimates. The group has 14 NGS-tested kits beginning at FGC9807 (1123 years of mutations), 13 kits at FGC9795 (1283 years), 8 kits at FGC9804 (321 years), 6 kits at FGC9809 (481 years), and 5 kits at Z16337 (321 years).

Our pedigree (family tree) is best seen here: http://homepages.rootsweb.ancestry.com/~amaguire/Pedigree.htm. Using the times above, it looks very hopeful.
Many thanks for the tool!!

Brad

Dave-V
03-29-2015, 09:01 PM
I ran Dave's program, inserting my BAM-filtered Full Genome results for SNPs downstream of Z253. Total SNPs in the CombBED regions = 33; calculated age of Z253 = 5295 years.

This figure compares with c. 4800 years if I use my 41 Big Y filtered SNPs and the 118 years per Big Y SNP suggested in the YFull paper. My Big Y coverage is 10.3 million bp.

While it's all within the 95% confidence ranges, the main variation in applying this method to individual Big Y tests seems to come from the variation in coverage of the CombBED region in each Big Y test. The CombBED region (defined by YFull in their paper) is 8.4M base pairs (bp) long, and while YFull says that the average Big Y covers 7.6M bp of that, the actuals I've seen quoted so far suggest that individual Big Ys probably vary from 7.0M to 8.3M, if not more. That would cause the individual years-per-SNP rates to vary from around 145 to 175 years per SNP.

For example, in my case the tested CombBED coverage in my Big Y is 8,104,097, which makes my individual years-per-SNP about 150. If that were true of your Big Y also, your calculated age of Z253 would fall to 4950 ybp.

You can change the "Tested CombBED Length" in the spreadsheet to compensate for the difference from average (7.6M). But only YFull's BAM analysis is reporting it.
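To see how much the tested length matters, here is a small sketch that recomputes the age of a 33-SNP block for several tested CombBED lengths. It assumes the same formula the spreadsheet figures imply (age = SNP count / (8.2E-10 x tested length)); the function name is illustrative.

```python
RATE = 8.2e-10  # SNP mutations per bp per year, as quoted in this thread


def block_age(snp_count, tested_length_bp, rate=RATE):
    """Age in years of a SNP block for a given tested CombBED length."""
    return snp_count / (rate * tested_length_bp)


if __name__ == "__main__":
    # Tested lengths mentioned in this thread: the suggested Big Y range
    # (7.0M to 8.3M), the 7.6M average, one individual value, and 8.4M.
    for length in (7.0e6, 7.6e6, 8_104_097, 8.3e6, 8.4e6):
        per_snp = block_age(1, length)
        print(f"{length:>11,.0f} bp: {per_snp:6.1f} years per SNP, "
              f"33 SNPs = {block_age(33, length):,.0f} years")
```

The unrounded result for 8,104,097 bp is a little above the 4950 quoted above, because that figure uses the rounded 150 years per SNP.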

Muircheartaigh
03-29-2015, 10:00 PM
While it's all within the 95% confidence ranges, the main variation in applying this method to individual Big Y tests seems to come from the variation in coverage of the CombBED region in each Big Y test. The CombBED region (defined by YFull in their paper) is 8.4M base pairs (bp) long, and while YFull says that the average Big Y covers 7.6M bp of that, the actuals I've seen quoted so far suggest that individual Big Ys probably vary from 7.0M to 8.3M, if not more. That would cause the individual years-per-SNP rates to vary from around 145 to 175 years per SNP.

For example, in my case the tested CombBED coverage in my Big Y is 8,104,097, which makes my individual years-per-SNP about 150. If that were true of your Big Y also, your calculated age of Z253 would fall to 4950 ybp.

You can change the "Tested CombBED Length" in the spreadsheet to compensate for the difference from average (7.6M). But only YFull's BAM analysis is reporting it.

Thanks Dave. The 33 SNPs are from my FGC results, so I changed the value of "Tested CombBED Length" to 8.4M in accordance with my YFull value for "Length Coverage for Age".

This reduces the age of Z253 from 5295 years to 4790 years, with a mutation rate of 145.18 years per SNP.

Dave-V
03-29-2015, 10:22 PM
Thanks Dave. The 33 SNPs are from my FGC results, so I changed the value of "Tested CombBED Length" to 8.4M in accordance with my YFull value for "Length Coverage for Age".

This reduces the age of Z253 from 5295 years to 4790 years, with a mutation rate of 145.18 years per SNP.

Perfect. Exactly how it's supposed to work. I've seen similar quotes for FGC results coverage (8.43-8.44M) on the other thread that's discussing this, so sounds like 8.4M in general is more appropriate than 7.6M for that field for anyone with an individual SNP block from FGC.

Incidentally, I've been focused on making sure the spreadsheet returns the right results according to the methodology, but I need to be careful not to oversell the methodology. The YFull paper is brand-new and not yet peer-reviewed to my knowledge.

seferhabahir
03-30-2015, 12:02 AM
I'm seeing a lot of agreement on 4700-4800 years for sons of DF13 (e.g., Z251, Z253) with this tool. I am glad to see other people plugging in their FGC and Big Y results. I wonder how much concurrence we would get by testing all the immediately downstream DF13 sons.

[Edit: If I go back and plug in 8.4M instead of 7.6M, then for what I think are my appropriate FGC SNPs (30 downstream from FGC11986 in the combBED regions, not taking into account any recurrent SNPs or INDELs) I get 4355 ybp. Adding another 160 years each for FGC11986, S9294, and S11556 gets back to 4835 ybp, and another 160 years for Z251 gets back to about 5000 ybp.]

David Wilson
03-31-2015, 12:04 AM
Just to make sure I am not misunderstanding something, is it then the case that any FGC-tested individual should just use 8.4M for the Tested CombBED Length?

I'm coming at this from the M222 branch, where I have been nervous for a long time about the three dozen equivalent SNPs that we jointly call M222 and which lie between Z2961 and the first SNPs below M222. Some 23 of those are in the CombBED region, suggesting a 3700 year gap between the grandparent haplogroup and the child haplogroups. After M222, my own CombBED SNPs total 7, which puts the age of the block running from there to the present at just over 1000 years if I am correct to use the 8.4M figure rather than the average 7.6M.

Thanks to Dave for the elegant Excel tool that has greatly simplified the process of handling these calculations.

Dave-V
03-31-2015, 12:49 AM
Just to make sure I am not misunderstanding something, is it then the case that any FGC-tested individual should just use 8.4M for the Tested CombBED Length?

[Edit: Corrected my 8.x figures :-)]

That's correct. There is no published "average" figure for FGC Elite tests, but the data points so far suggest they consistently cover 8.43M-8.44M of this area (the CombBED), so I've started recommending using 8.435M for individual tests from FGC Elite. But there should be no meaningful difference between 8.435M and 8.44M, really.

Bear in mind though the range to 2 standard deviations (95% confidence) is still pretty large; so for instance 23 SNPs gives 3700 years +/- about 500 years, etc (the spreadsheet shows the 95% confidence range because YFull published those high/lows for their constant rate of SNP mutation calculation).

joeflood
04-23-2016, 07:31 AM
Couldn't work out how to access that journal, unfortunately. It sounded interesting.

cairn
05-03-2016, 05:00 PM
Couldn't work out how to access that journal, unfortunately. It sounded interesting.

Is this the paper you're looking for?

Defining a New Rate Constant for Y-Chromosome SNPs based on Full Sequencing Data
Dmitry Adamov, Vladimir Guryanov, Sergey Karzhavin, Vladimir Tagankin, Vadim Urasin
http://rjgg.molgen.org/index.php/RJGGRE/article/view/151

The full text is available as a PDF on that page.

joeflood
10-18-2020, 01:38 AM
Looks too old to me. YFull now has it at 3900 ybp, 900 years younger. With BigY700 we are now using 82 years per SNP.