19 Subclades and Counting - L2 subclades and their modal values



R.Rocca
08-05-2015, 01:36 PM
As many of you are now aware, we are up to a whopping 19 known subclades directly below L2 (see U152 Project Tree). Two separately published studies have highlighted L2 as one of the European male lineages with the highest branching. Perhaps the first "Mr. L2" was the Charlemagne of the Bronze Age, but that is all for ancient DNA to figure out. One thing is clear however...the expansion of L2 was so rapid and so successful, that it is nearly impossible to guarantee an SNP subclade based on haplogroup signatures only. Certainly the new R1b Backbone test will help some, but there are some STRs that may help as well, even though none are of 100% certainty. I thought I'd start this thread to discuss some of those that I know of off the top of my head. Perhaps others can add to it. They are as follows:



L2 Subclade: Off Modal STR Value

Z49: DYS456 = 16
Z367: DYS460 = 10
DF90: DYS413a/b = 21-23
FGC10516: DYS442 = 11
FGC13620: DYS385b = 15, DYS389i = 14
FGC22501: DYS438 = 13
L199: DYS385a = 10
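
For anyone who wants to screen a spreadsheet of kits against the list above, here is a minimal Python sketch. The dictionary simply encodes the values from this post; splitting DYS413a/b = 21-23 into separate `DYS413a` and `DYS413b` keys, the function name `candidate_subclades`, and the example kit are all assumptions made for illustration. A hit is only a hint about which SNP to test, since none of these values is 100% certain.

```python
# Off-modal STR hints taken from the list in this post (hints only, not proof).
OFF_MODAL_HINTS = {
    "Z49": {"DYS456": 16},
    "Z367": {"DYS460": 10},
    "DF90": {"DYS413a": 21, "DYS413b": 23},  # assumed encoding of "21-23"
    "FGC10516": {"DYS442": 11},
    "FGC13620": {"DYS385b": 15, "DYS389i": 14},
    "FGC22501": {"DYS438": 13},
    "L199": {"DYS385a": 10},
}


def candidate_subclades(haplotype):
    """Return the subclades whose off-modal values all appear in the haplotype."""
    return [
        subclade
        for subclade, markers in OFF_MODAL_HINTS.items()
        if all(haplotype.get(marker) == value for marker, value in markers.items())
    ]


# Hypothetical kit, not a real result.
kit = {"DYS456": 15, "DYS442": 11, "DYS438": 13, "DYS385a": 10}
print(candidate_subclades(kit))
```

A kit can match several hints at once, as the hypothetical one does here, so the output is best read as a shortlist of SNPs worth testing.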

lgmayka
08-05-2015, 07:25 PM
L2 Subclade: Off Modal STR Value

Z367: DYS460 = 10
FGC10516: DYS442 = 11
FGC22501: DYS438 = 13
DF90: DYS413a/b = 21-23
FGC13620: DYS385 = 11-15, DYS389i = 14


How do you classify the cluster that has both DYS442=11 and DYS438=13 ? (e.g., kits 61501 and 225559)

Kwheaton
08-06-2015, 12:10 AM
Rich,

19 subclades is awesome compared to where we were just a couple of years ago!!!!

I was very excited to hear FGC22501 was included in the new Backbone test, only to realize it will likely be an unstable marker due to its location in the X area of the Y.
As you stated, FGC22501: DYS438 = 13 seems to be true for all who have tested positive so far and certainly is a place to start. Those include WHEATON, DOOTZ, KIDD, TRIPP, and WHIFFING.

If the testing turns up more L2s, it's a good thing. Even if FGC22501 is unstable, perhaps those who test positive for L2 and negative for all the other branches will be helpful for us in FGC22501 world.

FGC22500 is available at YSEQ and the following were positive for it: WHEATON, TRIPP, HG01431, HG02253
FGC22538 is available at YSEQ and the following were positive for it: WHEATON, TRIPP, DOOTZ, & KIDD

Kelly

R.Rocca
08-06-2015, 01:12 AM
How do you classify the cluster that has both DYS442=11 and DYS438=13 ? (e.g., kits 61501 and 225559)

I didn't see kit 61501, but I did add L199, as they all have DYS385a= 10

MitchellSince1893
08-06-2015, 01:59 AM
Compared to L2 as a whole, Z49 is more likely to have these off modal values:

DYS456: L2 = 15. Z49 = 16.
DYS650: L2 = 19. Z49 = 18.

For what it's worth, I have the L2 values rather than the Z49 values for the above two STRs. And the off-modal values are by no means found in the vast majority of Z49.

In other words Z49 and L2 are too much alike at the modal level to really distinguish them.

jbarry6899
08-06-2015, 02:44 AM
Compared to L2 as a whole, Z49 is more likely to have these off modal values:

DYS456: L2 = 15. Z49 = 16.
DYS650: L2 = 19. Z49 = 18.



Our cluster in Z49>S8183>FGC20812> FGC31474> Y11179 et al has DYS456=16 but DYS650=20.

greystones22
08-06-2015, 10:25 AM
In Europe it looks like DF27, DF13 and L2 were all the subject of major population expansions.
Do you think this is a chance event, or the same mechanism drove all these groups to expand? Are there any clues from the relative dates of MRCA?

haleaton
08-06-2015, 11:10 AM
The number of STRs and biases in the chosen order by FTDNA up to 111 always interested me.

Are the Haplogroup predictions that FTDNA makes based on just the first 11? Usually it just determines if you are R-M269.

YFull now has an STR list that goes up to 491, which you can compare with others who pay $50 to submit their BAM file and join the U152 group, which currently has 90 members, more than half of them L2. Typically around 400 STRs can be compared with others, though there is some variation.

Is there anything to be learned, now that so many L2's have BIG Y data, from comparing some of these other STRs?

My impression was that STRs 68-111 have not been particularly useful to L2, though I heard people in other Haplogroups sometimes gain "matches" by going to 111.

MitchellSince1893
08-06-2015, 04:17 PM
Our cluster in Z49>S8183>FGC20812> FGC31474> Y11179 et al has DYS456=16 but DYS650=20.
You are proving my point...it's tough to distinguish Z49 from L2 at 111 markers.

I will take a look at the yfull strs to see if anything stands out in these markers between Z49 and L2.

MitchellSince1893
08-06-2015, 08:45 PM
...I will take a look at the yfull strs to see if anything stands out in these markers between Z49 and L2.
I compared yfull STRs for Z49 and U152 overall, and couldn't find one that would have been much use in distinguishing between the two.

So maybe the answer is: If your STRs are close to the L2 modals then test for Z49.

Kwheaton
08-06-2015, 11:57 PM
Hal,
For the most part, in our L2 group there is nothing to be learned from extending to 111. No strong patterns, so I no longer recommend it for our group, although I did not know that before testing a bunch. I think that's the deal: you don't know the value or lack thereof until after you test (Geno 2.0, Big Y vs Elite, MyOrigins, etc.). This Backbone is a not-so-well-vetted panel. It is amazing that FTDNA manages to get people to pay for research they should be investing in themselves. It's a great business model for them, but I have not liked their track record. That does not mean it does not have value; it's just that I tend to be rather frugal with my money, and certainly when I am advising group members. My credibility is on the line and I simply can't be hawking all their BETA offerings and then be left holding the bag. $79 is too much to spend for our group purposes since we already know where we are. For those who are sitting way up the R1b tree, $79 is worth it.


Interesting point about the BAM files and more markers but I do not have any group BIG Y in our L2's.

R.Rocca
08-07-2015, 02:31 PM
I agree... I've never been a fan, and have never upgraded to Y111.

haleaton
08-07-2015, 03:16 PM
I agree... I've never been a fan, and have never upgraded to Y111.

Rich, your advice is always spot on when it comes to phylogenetic bang for the buck. But what I was wondering about is Y112-Y481. Or, when they picked the first 11, then up to 67, did they look at all of them and make wise choices for the genetic demographics they understood at the time? I would think once they had the reference sequence all STRs are known, though there must be a length minimum.

With the L2+ branch defining match I have in mutual Big Y's, 4/16 of the established SNP matches and 2/6 of the recent SNP non-matches are sequential doublets (is there a better term?) in tandem, but really really short (joke fail). I do not know if there are any scientific studies to show these are likely simultaneous mutations. I wonder how common this is, two sequential SNPs. [I am stretching the relevance to this STR thread a bit here.]

In my surname group going to 111 did not really reveal much either, though I have read that some people in other haplogroups have found additional matches up there that did not meet the criteria at 67. If that happened to any L2 person then it might point to something. Is there a way to reorder them specifically for L2 once we have more data? No more information, just easier to follow.

R.Rocca
08-07-2015, 03:53 PM
I'm not really sure how they came up with 1-12, 13-37, 38-67, 68-111 or if they took into consideration other STRs. The frustrating thing is that in some cases, it seems like we get a different amount of repeats depending on the vendor, or even the test within the vendor. So, a couple of STRs I have seen as "interesting" in FGC data, had inconsistencies with FTDNA 111 data, which in turn had inconsistencies with manually fired Big-Y LobSTR data. So, given time constraints, I've dropped it off of my "things to get to the bottom of" list, especially given the growing SNP data we have.

haleaton
08-09-2015, 07:09 PM
You would think STRs can be manually counted from NGS data or a few overlapping Sanger traces, though there may be STR definition issues with the test method used for regular STR testing at FTDNA or YSEQ, such as that it determines the total length of tandem repeats but ignores whether there is a SNP in the middle of the STR or some no calls.

Several U152 & Subclades SNPs fall in an STR region. I wonder if that would change the count and explain abrupt shifts in STR counts between Haplogroups. It is easy to look at this by calling up the STR in YSEQ; there are usually SNPs there from somebody.

Shouldn't there be a base STR count for the Reference Sequence (which is partially L2, if I remember)? Changes in an STR could then be viewed as Insertions/Deletions, but we don't know where in the STR (?). I am misunderstanding something fundamental and basic about the data here or the way it is processed and compared against the Reference sequence.

I avoided looking at STRs myself.

haleaton
08-13-2015, 06:30 PM
By looking at DYS456 with some tricks using YFull, Excel & Word, across 92 (almost all) U152+ samples in the YFull Group: U152, I reduced my massive STR ignorance by an iota. Other exotic STRs like DYS713 are complex and appear to have multiple definitions. What an FTDNA STR count actually measures is a whole other issue which I still need to learn. People go to jail or get set free based on their STR counts by these test methods, so I hope it's highly accurate. This study applies to NGS data.

DYS456 is a simple "AGAT" repeat motif located in the b37/hg19 range ChrY:4270942..4271090 (YBrowse), which is a wide region (149) for sampling reasons only; the action actually takes place beginning at 4270959, followed by the N repeats (N x "AGAT"), then the remaining sampling region. So if a SNP falls in the sampling region, it is usually outside of the actual STR.

The YFull Group: U152, as of yesterday, has Y-STR Y-Results for 45 user-submitted samples, but it also has a Y-Browser which can list the results (ten at a time) at hg19 Y-Chr positions (comma-delimited) for 92 samples, including the added U152 1K Genomes samples, plus the values of the Reference. Y-Browser summarizes the results with letter codes, but on hovering the cursor the actual reads in the raw data will appear (with "I" for Insertions, which are not defined unless the insertion was dominant and called out as the Sample Allele for that cell).

From MS Excel, you can make a list of b37 locations, ten at a time, Cut and Paste into the YFull (Group) Y-Browser, then Cut and Paste Special (text) back into a spreadsheet, and CONCATENATE the entire STR region into one string. This gives the top-level sequence data, as processed by YFull, for the entire group. This takes a few minutes at ten at a time. Hint: the F9 function in Excel can help.

Using a fixed-width font, such as Courier, this can be pasted into MS Word, which allows you to search substrings, highlight regions, and quickly determine a manual STR count because it is all aligned visually in Word. For the Reference, which is taken as absolute truth, DYS456=15. Much of the Reference, if I remember right, was chosen from some L2+ guy, and about half of the samples exactly match the Reference for all 149 positions in the DYS456 sampling range. I attach a screen shot as an example, though it is barely readable.
[Attachment 5580]

YFull members can look at this for themselves by pasting these ten into the YFull Group Y-Browser function: 4270957,4270958,4270959,4270960,4270961,4270962,4270963,4270964,4270965,4270966 which is the beginning of the DYS456 "AGAT" repeat region.
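
The manual count done in Word above can also be scripted once the region has been concatenated into one string. This is only a rough Python sketch: the function name `longest_repeat_run` and the toy sequence (made-up flanks around 15 repeats) are assumptions, not real DYS456 data, and it counts the longest unbroken run of the motif, so a no call or SNP inside the repeat would split the run just as it would when counting by eye.

```python
import re


def longest_repeat_run(sequence, motif="AGAT"):
    """Return the largest number of back-to-back copies of motif in sequence."""
    # Each match is one unbroken stretch of the motif repeated one or more times.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Synthetic example: hypothetical flanking bases around 15 AGAT repeats.
toy_region = "GGCTT" + "AGAT" * 15 + "GAAAGG"
print(longest_repeat_run(toy_region))
```

Running it on each sample's concatenated string gives one count per sample, which can then be compared with the YFull and FTDNA values.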

Generally, the manual MS Office STR counts match the ones given for the 45 user samples provided by YFull, but for several, including mine, they do not. I compared a few of these against FTDNA measurements, which use a method other than NGS sequencing, and they matched the YFull STR count, not my manual MS Office count. In my case, I have both FGC BGI & Big Y BAM data in the YFull U152 group and they had different DYS456 manual counts (15 & 16), but the YFull STR counts were both 16 and agreed with my FTDNA result of DYS456=16, though I am L2+ Z49-. FGC results for both my BAMs were also DYS456=16.

The reason, I think, is differing heterozygosity in the reads in the two different BAMs, which gave a different YFull Sample Allele status for my two samples. I can get more details on reads for just my own samples.

Sample: #YF01461 (R-A197) ChrY, position: 4270959 (+strand), Reads: 19 Position data: 12A 1AAGATAGATAGAT 6AAGAT, Weight for A: 0.63601532567, Weight for INS: 0.36398467433, Probability of error: 0.36398467433 (0<->1), Sample allele: A, Reference (hg19) allele: A [This is my FGC BGI BAM data.]

Sample: #YF02170 (R-A197), ChrY position: 4270959 (+strand), Reads: 23, Position data: 4A 2TAGAT 1AAGAN 16AAGAT, Weight for A: 0.179696616103, Weight for INS: 0.820303383897 Probability of error: 0.179696616103 (0<->1), Sample allele: AAGAT, Reference (hg19) allele: A [This is my FTDNA BAM data.]

So in one case the criteria used by YFull weighted the multiple reads of an AGAT insertion (4 reads apiece) similar to a single read at that position for no Insertion, I think. The combined weight of the 6AAGAT reads could be counted at 6x4=24 versus the 4 single A reads, which would have statused the Insertion as derived for this sample.
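
To make the two ways of counting concrete, here is a small Python sketch that parses the "Position data" strings quoted above and computes the share of evidence for an insertion, either per read or with each insertion read weighted by the number of inserted bases (the 6x4 idea). The function names are hypothetical, and this is not YFull's actual formula: the plain read fractions (7/19 and 19/23) only come close to the reported INS weights of 0.364 and 0.820, so something extra, perhaps read quality, is presumably folded in.

```python
import re


def parse_position_data(text):
    """Turn '12A 1AAGATAGATAGAT 6AAGAT' into [(12, 'A'), (1, 'AAGATAGATAGAT'), (6, 'AAGAT')]."""
    return [(int(count), allele) for count, allele in re.findall(r"(\d+)([A-Z]+)", text)]


def insertion_share(observations, ref="A", weight_by_length=False):
    """Share of evidence for an insertion, per read or per inserted base."""
    ref_weight = 0
    ins_weight = 0
    for count, allele in observations:
        if allele == ref:
            ref_weight += count
        else:
            inserted_bases = len(allele) - len(ref)
            ins_weight += count * (inserted_bases if weight_by_length else 1)
    return ins_weight / (ref_weight + ins_weight)


fgc_bam = parse_position_data("12A 1AAGATAGATAGAT 6AAGAT")
big_y_bam = parse_position_data("4A 2TAGAT 1AAGAN 16AAGAT")

print(insertion_share(fgc_bam))                         # per read
print(insertion_share(fgc_bam, weight_by_length=True))  # per inserted base
print(insertion_share(big_y_bam))                       # per read
```

Counted per read, the first sample stays below one half and the reference "A" wins; counted per inserted base, the insertion dominates in both samples.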

A look at the other cases where manual STR counts did not match the YFull STR counts indicated a similar problem, but because the actual Insertion is only statused by YFull as a single "I", a more detailed look was not possible. Generally, though, it indicates that the manual count should be raised by multiples of 4 for this STR.

One of the interesting things about STRs is the probability that a mutation will take a particular motif like "AGAT", and whether there is a probabilistic reason to assume that the mutation takes that form when there is a no call in one position out of four.

This appeared to me to be very apparent in the NGS data in the YFull Group: U152 data. I was also able, in almost all cases, to make a manual STR count on the 1K Genomes samples, though for many only a single read is at each location, but they consistently conformed to the AGAT motif in the specific DYS456 location. This is all processed data at a top level. In similar cases for user data where there was a YFull STR count available, the processing of the raw data seems to have taken account of a missing single read out of four.

So anyway, no clear conclusions at this point about L2 subclades and their modal values with respect to DYS456 and Z49. I don't know if there is any "raw" data associated with STR counts done by FTDNA that could be analyzed in a similar way, position by position. I thought this was an interesting way to look at NGS data so I wrote up this long post.

haleaton
08-19-2015, 04:37 AM
If we get enough samples in L2 and a subclade can we calculate a modal time difference between the two? I have never looked into this. I am sure it has been discussed. If there is enough data to establish an off modal value for a STR in a subclade should that not lead to a time difference?

For my little subclade of L2, which is defined by a slew of old and young SNPs, I attempted to estimate the age (the younger age, not the modal age) based on assuming L2 is 4600 ybp (YFull) and uniform Big Y SNP mutations over time with small numbers.

It is important to get the number of non-matching private SNPs and INDELs, though this can be hard since Big Y reports a no call the same as not having a Non-Matching Known SNP. A no call looks the same as a negative. A non-match can be as useful as a match, if it is an established Private SNP, for calculating the younger age.

[Attachment 5626]

Here the FTDNA STR GD calculation would completely fail if the age is at all close. However the age between L2 and U152 should be close and along with one of the SNPs in the list--though we do know which.

Attached also shows the coverage differences between FGC Elite 1.0 and Big Y and which SNPs that can be Sanger sequenced.

Kwheaton
08-20-2015, 01:11 AM
Hal,

I was working similarly with our newest Big Y TRIPP match who just got their BAM analysis back from FGC. Besides the 6 shared that showed up at FTDNA there was a shared insertion in the BAM that FGC picked up. All together with the estimated FGC22501 assumed + We have these shared SNPS downstream of L2 (Y= Yes Sanger N=No blank is maybe)
FGC22500+Y
FGC22501+N
FGC22516+
FGC22528+Y
FGC22533+
FGC22538+Y
FGC22547+Y
FGC22548+N

There are over 40 additional novel WHEATON SNPS either not tested or negative in their Big Y. Of those 19 are Sanger sequenceable 9 maybe and the rest no.

I am not sure how to come up with an age estimate. The GD at its closest is 50/67 with our WHEATON group. A wild guess is the MRCA is 2000-2500 ybp. Does this seem reasonable or is it likely further back?

haleaton
08-20-2015, 11:16 AM
I am just trying to understand if any valid age estimates can be made by carefully counting the verified non-matching mutations (Private SNPs) in equal coverage situations from NGS data sets. Equal coverage has to account for no calls or SNPs under study as error sources. This assumes a uniform mutation distribution in time since whatever other analyses claim is the L2 age, or the later age of major subclades of L2 with enough data to do modal MRCA comparisons. It also assumes that mutations are uniformly distributed over the Y Chromosome and that SNP reporting using the various reliability conditions is consistent in each sample.

The most accurate time point we know is about when each of the current samples was tested though the age of the individual donor is a small complication.

So here (only two donors) it was interesting that the number of Private SNPs in each more modern line was about the same, 10, and so the total since L2 is about the same, 30. If L2 is 4500 ybp and "present" is 1950 (if I remember), then 10/30 of 4500 years is 1500 years back, which is about 450 AD, with big error bars from the age of L2 calculation, which is from YFull. Both samples were English, separated geographically since around 1600 AD.
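
The arithmetic behind that 450 AD figure, written out as a tiny Python sketch. It assumes a uniform SNP rate along the line and takes "present" as 1950; the function name `split_year_estimate` is made up for illustration, and with counts this small the result is only a ballpark.

```python
def split_year_estimate(private_snps, total_snps_since_clade, clade_age_ybp, present_year=1950):
    """Scale the clade age by the fraction of SNPs that are private to the line."""
    years_before_present = clade_age_ybp * private_snps / total_snps_since_clade
    return present_year - years_before_present


# 10 private SNPs out of about 30 since L2, with L2 taken as 4500 ybp.
print(split_year_estimate(10, 30, 4500))  # 450.0, i.e. about 450 AD
```

Swapping in 4600 ybp for L2 (the other YFull figure mentioned in this thread) moves the answer by only a few decades, which is small next to the uncertainty from the SNP counts themselves.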

Obviously, the unlawfulness of low numbers makes us wary of these sorts of estimates. Unfortunately, I do not have two FGC NGS wider-coverage cases to compare, which could widely swing the estimate, as only 55% of my novel variants were found in my Big Y.

Kwheaton
08-20-2015, 02:25 PM
Hal,
Anything with these low numbers is wildly speculative. Of the downstream markers with a known MRCA born c. 1606, we came up with 4 unshared mutations (1 from the WTY and 3 from the Elite) based on the 21 we tested. There's probably another ten that may be able to be Sanger sequenced, but we need to round up 3 more willing to test to see if YSEQ can develop these primers.

I guess my question is: Is there an age estimate for STR differences in L2? Is there an age estimate for SNPs downstream of L2? Or is it just too premature?

If we assume a 4500 ybp for L2 then did most of the expansion happen shortly thereafter? I guess we just need more results...can't help myself from trying to read the tea leaves.

haleaton
08-20-2015, 04:22 PM
Yes, low numbers. Also assumption of mutation time and Chr Y coverage uniformity. Also low number of matching samples in most of the L2 subclades.

YFull has estimates of the TMRCA of the subclades, though I have not studied whether they are based on STRs or SNPs or both. My own subclade R-A197's estimate of 150 ybp is based on the differences between my FGC Y Prime and FTDNA Big Y data sets. YFull has 9 of 19 L2 subclades currently.

I should get (finally) an FGC Y-Elite 2.0 of a STR GD 6/67 cousin in a few days, with a TMRCA ancestor born around 1545 AD, and with uniformity I would expect around 5 Private SNPs in each sample that do not match. This is sort of similar in time frame to your Wheaton example. Will others who count the Private SNPs against known dates find something similar?

Nice thing about reading the tea leaves is you get to drink the tea! If it turns right you were correct, though it may be luck and then you are lucky, or if it turns out way off you learn something ignoring the fact you may have just been unlucky and incorrect.

If the U152 tree was based all on same coverage NGS sequencing and carried down to the private SNPs found in each of the samples of people living now then it would be interesting to compare number of SNPs in each branch. I think eventually that picture may emerge, but would be too much at this time. I assume Rich has a day job.

The close time-frame expansion of U152 & L2 is very interesting, whether due to harems, male popularity in a matriarchy, bronze swords, axes, plows, weather, or something else like bell-beakers filled with brew [I may be conflating things here]. An interesting demographic issue is whether L2 has now had a rapid expansion due to immigrants to the New World from Germany, the British Isles, and Italy moving into open territory where the indigenous people did not do so well at the same time. This would have paper documentation. How has population growth in Europe compared during the same time period, which had large numbers of male war losses, particularly in Russia? Scientific publications can probably tell this without the sampling bias of US folks doing ancestry.

wisconsinwalter
08-22-2015, 07:34 PM
Trying to understand the first post in this thread. I am L2 but fit in none of the groups in that post or subsequent later examples.

Would it be safe to say I fit one of the other 19? FYI, I am currently family less, I still haven't figured out my paternal line, though it is leaning towards Yates.

Kwheaton
08-22-2015, 10:05 PM
wisconsinwalter,
Are you a part of the U152 group at FTDNA? If not, do join, then ask for some advice on further testing. If you want to post your kit number, that helps too.
Kelly

wisconsinwalter
08-23-2015, 12:47 AM
My kit is 135765. I am part of that group.

R.Rocca
08-23-2015, 01:26 PM
My kit is 135765. I am part of that group.

Walter, you would definitely benefit from the R1b Backbone test.

wisconsinwalter
08-24-2015, 12:52 AM
Thanks, I am still waiting on 111 and MTDNA. The admin helping me find my rightful family wants to see my 111 to compare to my closest matches who have theirs completed.

haleaton
09-06-2015, 07:19 PM
Are the YSearch files QM4ES R1b-U152 (S28) & 6JCVN R1b-L2 (S139) done by Mike Walsh in 2012 the most current, for now? There is a place holder on the U152 & Subclades page for modal values. With now so many tiny subclades the MIN, MAX, & MODE is kind of less useful. It did not stop me though.

I did start looking closely for the first time at my little L2 > FGC5301 et al. STRs, pooling the larger data sets from the Eaton and Whitney surname projects respectively, for a total of 35 samples, albeit all probably related within the last thousand years, in four groups each with an MRCA in the last 500 years.

I found all were YCAIIa=18 which is pretty rare and also DYS594=11. All the Whitneys which merge with the Eatons upstream of the present were DYS385b=16, while the Eatons were 15 with one 14. Modal is 14.
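
For pooled project data like this, the MIN, MAX, & MODE per marker can be tabulated with a few lines of Python. A sketch only: the function name `marker_summary` and the four sample rows are hypothetical stand-ins (loosely patterned on the DYS385b and YCAIIa values mentioned above), not the actual project results.

```python
from statistics import mode


def marker_summary(haplotypes):
    """Return min, max and mode for every marker seen in a list of haplotypes."""
    markers = sorted({marker for haplotype in haplotypes for marker in haplotype})
    summary = {}
    for marker in markers:
        # Skip samples that were not tested on this marker.
        values = [h[marker] for h in haplotypes if marker in h]
        summary[marker] = {"min": min(values), "max": max(values), "mode": mode(values)}
    return summary


# Hypothetical samples for illustration only.
samples = [
    {"DYS385b": 15, "YCAIIa": 18},
    {"DYS385b": 15, "YCAIIa": 18},
    {"DYS385b": 14, "YCAIIa": 18},
    {"DYS385b": 16, "YCAIIa": 18},
]
for marker, stats in marker_summary(samples).items():
    print(marker, stats)
```

With tiny subclades the mode can flip on a single kit, which fits the point above that these statistics become less useful as the groups get smaller.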

Kwheaton
09-06-2015, 08:20 PM
Since we have at least 19 Subclades for L2 can we make any geographical distinctions? Or any rational time frame estimates?

This is meant to be a provocative question and any discussion should be open ended. It seems that some subclades are geographically specific and some may have followed armies northward and spread broadly.

Let me show my thinking based on our own FGC22501 and the matches we have received

Nov 2014 DOOTZ from Romania w.poss 1200's Flanders origins
Feb 2015 KIDD from UK no specifics
Mar 2015 Anonymous from the Bristol, ENG area
July 2015 TRIPP from Lincolnshire, ENG Also Shares FGC22538, FGC22516, FGC22528, FGC22533, FGC22547 and FGC22500
July 2015 WHIFFING Surrey, ENG
Aug 2015 VANDERBEECK from Belgium going back to about 800AD FGC22538, FGC22516, FGC22528, FGC22533, FGC22547 and FGC22500 and an un-named SNP Rich discovered.

And here is the GD at 67 markers
[Attachment 5829]

And the projected age to MRCA at 67 markers at 95%
[Attachment 5830]

So what started out as a hunch with a connection to Flanders is looking very promising. If we look at Belgica and the Belgae settlement of Britain and then later incursions of Flemish with the Roman army and then the Norman conquest it is possible to account for all these men having the same ancestral origin in one of the Alpine Celtic tribes whether they were later absorbed into others or not. The spread would be up the western Rhine into Belgica and then spreading in a Fan shaped pattern across Roman, Flemish and then Norman holdings. So the spread was not from within Britain but these isolates each represent one man traveling either Pre-Roman times, Roman conquest, Norman conquest and/or Flemish trade....The TRIPP VANDERBEECK connection is -10 GD at 67 and there is a viable genealogical connection. The WHEATON cluster is at its closest -17 with DOOTZ but that is misleading as our closer connections (SNPS) are with TRIPP and VANDERBEECK who are -18 and -20 respectively.

And although this appears to be our pattern I wonder how many other British subclades may have similar origins, where the MRCA is likely found in mainland Europe---thoughts?

haleaton
09-09-2015, 01:02 AM
I enjoyed this post. I wonder if there is an automated way to generate the GD cross matrix for all the L2 in the project? Maybe it could be done by modals grouped at the first subclade level so it would only be 19 x 19, currently.
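A cross matrix like that could be sketched in a few lines of Python. This is only a rough illustration, not a project tool: it counts GD as the sum of step differences per marker (real calculators treat multi-copy markers and infinite-allele markers differently), and the "L2" baseline values below are assumed placeholders. Only the off-modal values for Z49, Z367 and FGC22501 come from the first post in this thread.

```python
from itertools import combinations

def genetic_distance(a, b):
    """Simple step-wise GD: sum of absolute repeat differences over shared markers."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)

def gd_matrix(modals):
    """Return (sorted names, nested dict) holding the GD between every pair of modals."""
    names = sorted(modals)
    matrix = {n: {n: 0} for n in names}
    for x, y in combinations(names, 2):
        d = genetic_distance(modals[x], modals[y])
        matrix[x][y] = d
        matrix[y][x] = d
    return names, matrix

# Assumed baseline (placeholder values, for illustration only).
L2_BASE = {"DYS456": 15, "DYS460": 11, "DYS438": 12}

# Off-modal values as listed in the first post of this thread.
modals = {
    "L2": dict(L2_BASE),
    "Z49": {**L2_BASE, "DYS456": 16},
    "Z367": {**L2_BASE, "DYS460": 10},
    "FGC22501": {**L2_BASE, "DYS438": 13},
}

names, matrix = gd_matrix(modals)
for row in names:
    print(row.ljust(9), " ".join(str(matrix[row][col]).rjust(2) for col in names))
```

With all 19 subclade modals loaded in place of the toy dictionary, the same loop would give the 19 x 19 table.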

I only have closer matches, though in the 1500's the Eatons of Dover, England were Merchants of the Staple who had homes in Calais and Antwerp and sent their sons to schools in France to learn the language. Calais was conquered by the English in 1347 and retaken by France in 1558. I would think there would have been a lot of back and forth since 1066 involving the parts of England nearest to Belgium and France.

For my earlier Whitney match I am getting a combBED TMRCA analysis done by YFull. The fact that they both are English names sort of localizes them in England at the time people got surnames. There are paper genealogy indications that these Eaton & Whitney families were from inland Cheshire and Shropshire and part of Norman families--but nothing proven. My Eatons have a long, dubious paper genealogy to William Fitz Alan, which conflicts with Stewart haplogroup studies.

This is all recent compared with the time when the descendants of L2 were expanding throughout Europe, apparently uniformly from the origin at the level it is currently sampled.

[Kelly, I did get my 11th 1R Eaton cousin's Y Elite 2.0 results, after the same sample failed twice at the old lab, which is a good development now that FGC is using the Complete Genomics lab. My results are getting some additive sequencing as they ended up smaller than the others, though almost all SNPs were present and consistent with my old Y Prime data--just INDELs and STRs were spotty and a lot more ***. It may indicate that getting WGS 2X ($225) for a SNP survey, with the option for additive later testing if interested, may be a good way to go--though I await somebody reporting this to work.]

Kwheaton
09-09-2015, 01:51 AM
Thanks Hal...

I have found the last week I am scrambling from Tribes in Celtic Europe to Roman armies to Norman Armies etc. I found some excerpts online from a book, "Flanders and the Anglo-Norman World, 1066-1216", which were quite interesting, and my new Verbeeck match is a wonderful source of information. Bottom line is all these cultures/empires exchanged goods, DNA and legacies. Over and over again I am struck with how early humankind was exploiting resources and traveling far afield. The Royals and the landed gentry types had marriages of alliance and entourages that went with them. Their land holdings might include Flanders, Normandy and England---and not just a specific area of England but all over the place. So assumptions just fly out the window.

There is even the possibility of back migrations of DNA that came over in the Iron or Copper Age---migrating back to mainland Europe so quite a web to untangle. But what else should we be doing? :beerchug:

BTW I have proven Royal Stewarts in my Tree---and this was always a contention of my 2nd great grandmother.

And I think the WGS 2X may be an awesome way to go if it pans out---- keep me posted on what you find out.

I am currently struggling with whether to start a SNP specific FTDNA group and then get folks to test and join up---but I am still ruminating on that.
Kelly

MitchellSince1893
09-09-2015, 01:58 AM
The Z49 subclade of L2 currently has two large subclades Z142 and S8183 and two smaller ones.

Looking at the FTDNA U152 project, Z142 has more Western European samples while S8183 is more Central European.

Drilling down 13 SNP below L2 on my own branch of Z142, it's still found throughout Western Europe in Spain, the Netherlands, and the UK. Yfull estimates the age at this point to be 3600 years ago to the MRCA.

I have a feeling, eventually, all 19 subclades right below L2 will be found to be widespread throughout Western Europe.

One will probably have to get 20+ SNPs below L2 to get to British specific branches.

paulgill
09-09-2015, 02:06 AM
What is L2 anyways? Why can't you add R1b-L2 as there is also Y-Dna haplogroup L2 and that is the real L2, yours is only a sub-sub-sub clad of R1b? Stop wasting time of real L2, would you please?

MitchellSince1893
09-09-2015, 02:39 AM
What is L2 anyways? Why can't you add R1b-L2 as there is also Y-Dna haplogroup L2 and that is the real L2, yours is only a sub-sub-sub clad of R1b? Stop wasting time of real L2, would you please?

In case you didn't notice, this thread is in the "R>R1b-P312>R1b-U152>L2" subforum; so "R1b-L2" is a given.

Kwheaton
09-09-2015, 03:05 AM
The Z49 subclade of L2 currently has two large subclades Z142 and S8183 and two smaller ones.

Looking at the FTDNA U152 project, Z142 has more Western European samples while S8183 is more Central European.

Drilling down 13 SNP below L2 on my own branch of Z142, it's still found throughout Western Europe in Spain, the Netherlands, and the UK. Yfull estimates the age at this point to be 3600 years ago to the MRCA.

I have a feeling, eventually, all 19 subclades right below L2 will be found to be widespread throughout Western Europe.

One will probably have to get 20+ SNPs below L2 to get to British specific branches.

Mitchell--
I think the time frames as you suggest are still way back there...but also what I am seeing suggests that these widely disseminated groups are going to cluster somewhere on the mainland...even those that spread out. Under L2 our FGC22501 is definitely old and widespread, but with the matches that share this SNP and another 7 downstream I think we are honing in on a central location in Flanders, then moving from there to Normandy and on into Britain, where they may end up all over the map. I expect there are many of these which may never have huge numbers but are pockets that survived here and there. Within our WHEATON project we have about 50 SNPS under L2 and all but 3 appear to be shared (of those that we have Sanger sequenced), indicating our MRCA bc1606. With no matches closer than -17 we are nowhere close to finding that kind of 20 SNP depth. We do have the 2 matches to a depth of 8, one from Lincolnshire and one from Belgium.

My point is progress is being made, each of us working our area of the jigsaw puzzle. I am pleasantly surprised at the progress we each are making.

Kwheaton
09-16-2015, 05:15 PM
Mitchell I want to reiterate your comment


I have a feeling, eventually, all 19 subclades right below L2 will be found to be widespread throughout Western Europe.

I keep tweaking my map of our FGC22501 Subclade but I think it shows how accurate that comment can be.
5925

And here is our mini-branch of the phylogenetic tree: 5926

I think each of us working on our particular branch of L2 is going to find different patterns of dispersal, but most of them are going to be widely disseminated due to the repeated waves of trade, conquest and empire building.

haleaton
09-29-2015, 06:36 PM
Special thanks to MitchellSince1893 for his post http://www.anthrogenica.com/showthread.php?5429-How-to-see-STR-matches-beyond-what-FTDNA-allows which inspired me to look at STRs in my L2 > FGC5301 et al. subclade, which consists of Eatons, Whitneys, and possibly one Day sample for a total of 36 data sets.

I took those, pooled them, and then compared them with members of the U152 group and a few others from YSearch and surname or location group projects, and found 17 candidates that were mostly L2 and not yet found to be in any of its subclades. They are shown in the following table (I left off all the other Eatons and Whitneys--the entire group was used to select matches):

http://www.anthrogenica.com/attachment.php?attachmentid=6093&stc=1

The TMRCA for the two Eatons shown (who are known cousins) was amazingly accurate, since their common ancestor was born about 480 years ago based on genealogy, but the Whitney-Eaton TMRCA is an outlier based on the other Whitney samples' TMRCAs (not shown). I should get YFull combBEDage estimates in a few days. So I wonder how accurate the more distant age estimates really are for STR counting at all?

We also have seen how quickly Y haplogroup prediction falls apart using STR counts outside of the FTDNA reported ones. For different subclades of L2 all these ages would fall later than their genesis. It was a nice group of possible matches from the British Isles, Belgium, France, Germany, Iberia, and one Italian with a 600 TMRCA match with Poland.

I looked at the STR values of the Eaton-Whitney-Day FGC5301 et al. subclade and this group of possible matches, and found the following when comparing the off-modal STR values:

All Eatons, Whitneys, & Day are YCAIIa=18
Eatons are DYS385b=15 w/ one 14 and Whitneys are DYS385b=16 (recent mutation)
All Eatons, Whitneys,& Day are DYF406s1=9
All Eatons, Whitneys,& Day are DYS594=11
All Eatons, Whitneys,& Day are DYS557=17


All New Possibles are YCAIIa=19, which is L2 Modal.
New Possibles are DYS385b=14 w/ two 13s; L2 Modal is 14.
New Possibles are DYF406s1=10 w/ two 11s; 10 is L2 Modal.
All New Possibles are DYS594=10, which is L2 Modal.
New Possibles are DYS557=16 w/ one 15 & one 17; 16 is L2 Modal.

So my subclade's possible matches have just the L2 Modal values and nothing unique to the Eaton-Whitney-Day subclade, despite TMRCA predictions that fall well after the L2 subclades were established. Or maybe they are future matches?

haleaton
09-29-2015, 07:04 PM
Special thanks to MitchellSince1893 for his post http://www.anthrogenica.com/showthread.php?5429-How-to-see-STR-matches-beyond-what-FTDNA-allows which inspired me to look at STRs in my L2 > FGC5301 et al. subclade ...



Here are the same TMRCA charts: one using 95% probability and the FTDNA rates that Kelly used, and one at 95% using the custom rates in the STR tool.

6094

6095

Kwheaton
09-29-2015, 07:22 PM
Hal,

I will have to try my suspected FGC22501+. I should have cut and pasted the full rows but I only collected the MRCA and Kit number. My major link is DYS438=13 if they are already tested to L2. (There are some L21 who also are DYS438=13.) It would be interesting to see if the time frames fall into any patterns. Thanks for the inspiration. I may not have time to play with this til the weekend....

Kelly

haleaton
09-29-2015, 07:43 PM
Hal,

I will have to try my suspected FGC22501+. I should have cut and pasted the full rows but I only collected the MRCA and Kit number. My major link is DYS438=13 if they are already tested to L2. (There are some L21 who also are DYS438=13.) It would be interesting to see if the time frames fall into any patterns. Thanks for the inspiration. I may not have time to play with this til the weekend....

Kelly

This one is for you Kelly as FGC22501 is one of my favorite L2 subclades. I posted in the data for your group and plotted them with the previous using your 95% FTDNA rates:

6099

Hard to read but you get the idea.

What I see are two distinct L2 subclades, though I can't figure out the TMRCA ages. So there may be something to this, but I will look at the different subclades.

EDIT: After looking at other large L2 subclades I did find that there are always one or two outliers that if they had not already been assigned, either manually or by test, I would have selected as possible matches. So many of the matches could be just outliers in STR probabilities.

TRanger
09-29-2015, 08:15 PM
Hello Everyone,

This is my first post here. My paternal line prior to the 1830's is uncertain. Through the Backbone test, I am L2+ Z49- Z367- PF6610- CTS5689- L408- DF110- DF103- FGC22501-. My knowledge on this subject is growing, but still very limited.

I am FTDNA 413844, YSearch 2YCYG and GEDMatch A345434.

Allen Slaughter

Kwheaton
09-29-2015, 08:20 PM
This one is for you Kelly as FGC22501 is one of my favorite L2 subclades.

Hal,
I am feeling the love! :love:
We have been in this a long time together and I feel as if we are following parallel paths. Here's my analysis in a nutshell. We essentially have the following major time frames in which there was large-scale migration of L2 to Britain:
Pre-Roman period of Celtic traders/settlers
Celtic mercenaries, warriors, conscriptees coming with the Romans
Celtic DNA....coming with the Saxons
L2 coming with the Normans

And it is possible that in our subclades like FGC22501 it could have arrived in different waves, giving rise to small pockets broadly spread in Britain.....time will tell.

lgmayka
09-29-2015, 09:02 PM
It was a nice group of possible matches from the British Isles, Belgium, France, Germany, Iberia, and one Italian with a 600 TMRCA match with Poland.
Do you mean that a Polish entry appears close in Y-STRs but has a different SNP result?

haleaton
09-29-2015, 09:30 PM
Do you mean that a Polish entry appears close in Y-STRs but has a different SNP result?

No, the FTDNA SNP status of the two is R-L2 Z49-, Z367-, FGC10543- and R-L2 L20-; it was just interesting as it might show a migration between Italy and Poland. They are close in STRs and possibly in the same haplogroup--though distant from me, and they may or may not be in the same haplogroup as me.

Kwheaton
09-29-2015, 10:41 PM
Hi Allen!

Welcome to our little L2 world. Too bad you weren't FGC22501+ ;-)
In any event you have several directions you can pursue at this point.

You can do educated SNP chasing....that is test a few SNPS that others with a similar haplotype have tested for (these are $17.50 each at Yseq.org or if available $39 at FTDNA)

You can do a Full Genomes Corp Elite Y reg $750 on sale thru Sept for $700. This should get 99+% of your Y SNPS

You can do a Big Y from FTDNA for $575 which will cover 40-60% of your SNPS

Or you can wait for an L2 exploratory panel from either FTDNA or YSEQ (probable cost between $88-233 depending on the number of SNPS included).

Rich may have some suggested SNPS if thats the route you want to go.
You have picked a very good time as most of us here are making some progress!

Kelly

TRanger
09-29-2015, 11:01 PM
I am a bit confused. I am seeing L199 and FGC10516 listed in different places depending on the tree. Mr. Rocca has them under R-L2. By the first post in this thread, L199: DYS385a= 10 and FGC10516: DYS442 = 11. I am both. Actually, I am 10.2 on DYS385a but I do not know if that makes any difference. I have taken the Y37 test. Is there any way to narrow it down with other STR's?

Kelly, thanks for the welcome and the information.

Allen

Kwheaton
09-29-2015, 11:36 PM
I am a bit confused. I am seeing L199 and FGC10516 listed in different places depending on the tree. Mr. Rocca has them under R-L2. By the first post in this thread, L199: DYS385a= 10 and FGC10516: DYS442 = 11. I am both. Actually, I am 10.2 on DYS385a but I do not know if that makes any difference. I have taken the Y37 test. Is there any way to narrow it down with other STR's?

Kelly, thanks for the welcome and the information.

Allen

Allen,
The best source for position on the haplotree is always Rich Rocca's version. He is my go-to person for L2 and we are very lucky to have him as one of the co-admins of the U152 project. As for STRS denoting SNPS, this is a rather squishy rubric that sometimes works and sometimes doesn't. It often depends on how slowly a marker's values change.....but because they change back and forth...that is, adding or dropping a repetition...they can be misleading. But in the absence of other indicators we go with any pattern that may show a possible connection to a SNP. In the case of DYS438=13 so far if someone is L2 and DYS438=13 they have proven positive for FGC22501 but that does not mean it will always be....so far so good. There have been ones we have seen in the past that have turned out not to indicate a particular SNP even though it looked good on paper.

Hopefully Rich will weigh in and have a specific suggestion for you. If not, you may have to go with one of the other options I laid out. Depending on your interest and wallet...you can hardly go wrong with the Y Elite, and the Big Y would be my second choice. If you are considering a Big Y, keep your eyes open for the sale usually announced at the FTDNA Admins Conference in Nov.
Kelly

lgmayka
09-30-2015, 12:06 AM
In the case of DYS438=13 so far if someone is L2 and DYS438=13 they have proven positive for FGC22501 but that does not mean it will always be....so far so good.
#196632 of Poland has DYS438=13 but has tested DF110+ .


196632 Lewicki, Laszki Murowane (Chyrow), PL till 1939 Poland R-DF110 13 24 14 11 11 14 11 12 12 14 13 30 17 9 9 11 11 24 15 19 31 15 16 17 18 11 11 19 23 16 15 20 17 37 37 12 13


#61501 and #225559 have DYS438=13 but belong to an Ashkenazi Jewish cluster that has tested L408+ within R-L2.


61501 Poland Poland R-M269 12 23 14 10 10 10 12 12 13 13 13 29 18 9 9 11 11 25 15 19 33 15 16 16 17 12 11 19 23 17 15 19 17 35 37 11 13
225559 Jankiel Kaufman, b~1795 Panemune Lithuania Lithuania R-L2 13 23 14 10 10 10 12 12 12 13 13 30 18 9 9 11 11 26 15 19 32 15 16 16 17 11 11 19 23 17 15 19 17 35 37 11 13


#203572 of Poland has DYS438=13 but has only tested L2+ L20- .


203572 Wojtkowski Poland R-L2 13 24 14 10 11 16 12 12 13 13 13 29 18 9 10 11 11 25 15 20 30 14 15 16 17 12 11 19 23 14 15 18 16 34 38 12 13

Kwheaton
09-30-2015, 12:57 AM
#196632 of Poland has DYS438=13 but has tested DF110+ .


196632 Lewicki, Laszki Murowane (Chyrow), PL till 1939 Poland R-DF110 13 24 14 11 11 14 11 12 12 14 13 30 17 9 9 11 11 24 15 19 31 15 16 17 18 11 11 19 23 16 15 20 17 37 37 12 13


#61501 and #225559 have DYS438=13 but belong to an Ashkenazi Jewish cluster that has tested L408+ within R-L2.


61501 Poland Poland R-M269 12 23 14 10 10 10 12 12 13 13 13 29 18 9 9 11 11 25 15 19 33 15 16 16 17 12 11 19 23 17 15 19 17 35 37 11 13
225559 Jankiel Kaufman, b~1795 Panemune Lithuania Lithuania R-L2 13 23 14 10 10 10 12 12 12 13 13 30 18 9 9 11 11 26 15 19 32 15 16 16 17 11 11 19 23 17 15 19 17 35 37 11 13


#203572 of Poland has DYS438=13 but has only tested L2+ L20- .


203572 Wojtkowski Poland R-L2 13 24 14 10 11 16 12 12 13 13 13 29 18 9 10 11 11 25 15 20 30 14 15 16 17 12 11 19 23 14 15 18 16 34 38 12 13


lgmayka,
Thanks for illustrating my point! I had not seen these---probably because I have been geographically focused on the British Isles and the Norman-Flemish areas. Aside from the (to date) relatively few ambiguous SNPS, SNPS are much more definitive than STRS. That does not mean STRS do not have value in guesstimating possible SNP groupings, but they are far from definitive.
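To put rough numbers on that, here is a small Python tally of the DYS438=13 results mentioned in this thread only. It is a sketch, not a project census: the five FGC22501+ surnames are the ones named earlier in the thread, the four kit numbers are the ones lgmayka listed, and the result labels are my own shorthand.

```python
from collections import Counter

# (sample, DYS438 value, SNP result as reported in this thread)
reports = [
    ("WHEATON", 13, "FGC22501+"),
    ("DOOTZ", 13, "FGC22501+"),
    ("KIDD", 13, "FGC22501+"),
    ("TRIPP", 13, "FGC22501+"),
    ("WHIFFING", 13, "FGC22501+"),
    ("196632", 13, "DF110+"),
    ("61501", 13, "L408+ cluster"),
    ("225559", 13, "L408+ cluster"),
    ("203572", 13, "L2+ L20- only"),
]

def tally_by_result(records, marker_value=13):
    """Count samples carrying the given DYS438 value, grouped by SNP result."""
    return Counter(result for _, value, result in records if value == marker_value)

tally = tally_by_result(reports)
total = sum(tally.values())
for result, count in tally.most_common():
    print(f"{result}: {count} of {total}")
```

So even within this one thread, DYS438=13 turns up on at least three different branches under L2, which is why it is a starting point and not a confirmation.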

Allen---in my FGC22501 example the range of GD at 67 markers is as high as -22 (that is, 22 marker values are mismatches, yet the two men still share the SNP FGC22501 and more downstream SNPS). There is an overall subtle shift toward NGS (Next Generation Sequencing), i.e. Big Y or Y Elite type tests, because they are much more powerful and conclusive. And although the initial cost is much higher, they are arguably cheaper in the long run. [Those who have gone through rounds of Walk Through the Y, Geno 2, Britains 2.0 etc know what I mean]

Bolgeris
09-30-2015, 07:38 PM
There are three Polish men in my subclade L2+ L20+ CTS9733..
all with the same surname.. visible only in Big Tree results.. Mah.
Southern Wielkopolska, Poland
Interesting..

TRanger
10-01-2015, 10:45 AM
Thanks again. I will likely save up my nickels and dimes and get the Elite Y so hopefully I will be done buying tests for now. It looks like I will be able to get more useful info from it than the Big Y. I am guessing the project administrators can manually put the needed data into the project.

Allen

Kwheaton
10-01-2015, 01:45 PM
Allen,

In terms of clustering you with others in the U152 project, yes, that can be done by hand. Unfortunately there is no upload option at this time. The Big Y has proved useful as well; it just has a probability of leaving more out.

Regards,
Kelly

MitchellSince1893
10-01-2015, 08:08 PM
Here are the same TMRCA charts: one using 95% probability and the FTDNA rates that Kelly used, and one at 95% using the custom rates in the STR tool.

6094

6095

Your 174343 James Lincoln, d. 1967 in the chart above is in my branch of U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.

Assuming the kits are in the same haplogroup; based on what I'm seeing the TMRCA dates become unreliable once you get past the GD that FTDNA uses to filter their 37, 67, and 111 marker matches.

For example, the SNP based TMRCA date at Yfull's ytree for my terminal branch is 1000 BC. The STR based McGee Y-Utility tool has it at ~700 AD. The 111 marker GD is 21.

On the branch one up from my terminal, Yfull has 1600 BC and Y-Utility has ~600 AD. This branch has a 37 marker/GD = 8, 67 marker/GD = 14, and 111 marker/GD = 21.

So at these GDs the STR based TMRCA can be unreliable.

Kwheaton
10-01-2015, 08:24 PM
Your 174343 James Lincoln, d. 1967 in the chart above is in my branch of U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.

Assuming the kits are in the same haplogroup; based on what I'm seeing the TMRCA dates become unreliable once you get past the GD that FTDNA uses to filter their 37, 67, and 111 marker matches.

For example, the SNP based TMRCA date at Yfull's ytree for my terminal branch is 1000 BC. The STR based McGee Y-Utility tool has it at ~700 AD. The 111 marker GD is 21.

On the branch one up from my terminal, Yfull has 1600 BC and Y-Utility has ~600 AD. This branch has a 37 marker/GD = 8, 67 marker/GD = 14, and 111 marker/GD = 21.

So at these GDs the STR based TMRCA can be unreliable.

I totally agree as far as reliably predicting years with precision-- however I still find it informative. Our branch isn't at YFULL so I do not have estimates there. Interestingly, SNPS shared downstream of FGC22501 indicate a wide year span via STR differences: -10 to -20 at 67 markers yields 700-2300 YBP, yet they all share the same 7 SNPS or indels below FGC22501. The MRCA by educated guess is about 2,000 YBP, or right around the Roman expansion/invasion period, which would make sense. The two with the closest STRS are likely to have an MRCA sometime before the Norman conquest, so maybe 1,000 YBP or closer.

For all of us it's a matter of data points, the more the better. Obviously SNPS are a much better measurement than STRS when calculating ages. But for the time being STR data is easier to come by.

haleaton
10-01-2015, 09:15 PM
Your 174343 James Lincoln, d. 1967 in the chart above is in my branch of U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.

Assuming the kits are in the same haplogroup; based on what I'm seeing the TMRCA dates become unreliable once you get past the GD that FTDNA uses to filter their 37, 67, and 111 marker matches.

For example, the SNP based TMRCA date at Yfull's ytree for my terminal branch is 1000 BC. The STR based McGee Y-Utility tool has it at ~700 AD. The 111 marker GD is 21.

On the branch one up from my terminal, Yfull has 1600 BC and Y-Utility has ~600 AD. This branch has a 37 marker/GD = 8, 67 marker/GD = 14, and 111 marker/GD = 21.

So at these GDs the STR based TMRCA can be unreliable.

I thought Lincoln was familiar, though I thought I checked. I think most of these are not in my L2 subclade because they lack any of the off L2-modal STR values.

I also agree on GD and TMRCA error bars, particularly when outside of the FTDNA STR reporting range. A match just outside of reporting at the 67 STR level is getting an R1b Backbone SNP Pack test--so it will be interesting to see if FGC5301 comes up positive.

Using the McGee Y-Utility tool has really shown this to me. Using data from surname projects with a number of closely related relatives, you can see the large spread of GD/TMRCA values within the group compared with another closely related group. Even folks in "distant" haplogroups like DF27 will compare similarly to two subclades of L2 with multiple members.

What I did find useful was using http://www.semargl.me/ to locate nearer GD samples, then using their links to the FTDNA pages to examine the STRs against my group for GD, and then comparing the individual off-modal STR values. I did come up with several candidates this way.

Also, I notice more people in surname projects are getting SNP Pack testing and the results are appearing, so it may be worthwhile to look at Y-STR 12 matches again, as results are appearing there with e-mails but no kit #s . . .

Kwheaton
10-01-2015, 09:40 PM
Hal,


STR Data Mining for Pyrite

:lol: Love it!

Ravai
10-11-2015, 08:02 AM
Dear All,

Yesterday I applied to the test FTDNA BIG-Y. Hopefully soon I can share with you the results.

Yours sincerely

David Rabai

R.Rocca
10-11-2015, 11:35 AM
Dear All,

Yesterday I applied to the test FTDNA BIG-Y. Hopefully soon I can share with you the results.

Yours sincerely

David Rabai

Great David, the area of Savona is very under-represented in our project, so your result will be very important.

Kwheaton
10-11-2015, 10:09 PM
FYI
Just by happenstance I found that FTDNA IS offering a stand-alone FGC22501 SNP test ($39). It is not found in their advanced order menu, but can be found on their Haplotree and SNPS from one's individual pages.

6277

So any L2's with DYS438=13 can confirm FGC22501 without taking the full backbone M343 test.

haleaton
10-11-2015, 11:11 PM
FYI
Just by happenstance I found that FTDNA IS offering a stand-alone FGC22501 SNP test ($39). It is not found in their advanced order menu, but can be found on their Haplotree and SNPS from one's individual pages.

6277

So any L2's with DYS438=13 can confirm FGC22501 without taking the full backbone M343 test.

Interesting, I wonder what test method they are using?

Kwheaton
10-11-2015, 11:34 PM
Interesting, I wonder what test method they are using?

My thoughts precisely. We know that they are getting accurate results in the panel--- maybe something will be forthcoming at the conference. Weird that it is listed one place and not the other. I have requested downstream SNPS as well but these aren't showing anywhere.

haleaton
10-14-2015, 07:47 PM
In theory with enough data and widely dispersed subclade demographics (actual and sampled) the modal values should approach the actual STR set of the MRCA person for the ensemble.

Using the U152 & Subclades data from two weeks ago I did that for U152, L2, Z367, Z49, and also L2 xZ367 xZ49 to include the entire set. I also did my own L2 subclade FGC5301+ et al., though there are only 36 STR data sets from three surname families with a TMRCA < 1200 yrs (40 generations) based on SNP counting.
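For anyone who wants to reproduce the modal step outside the McGee Y-Utility, a minimal Python sketch follows. It just takes the most common value per marker across a group; breaking ties by taking the smallest value is my own assumption, and the four toy haplotypes are made up to loosely echo the Eaton/Whitney DYS385b and YCAIIa pattern, not real kits.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common repeat value per marker; ties resolved to the smallest value."""
    markers = set().union(*(h.keys() for h in haplotypes))
    modal = {}
    for marker in sorted(markers):
        counts = Counter(h[marker] for h in haplotypes if marker in h)
        top = max(counts.values())
        modal[marker] = min(v for v, c in counts.items() if c == top)
    return modal

# Toy data (illustrative values only, not actual project kits).
group = [
    {"DYS385b": 15, "YCAIIa": 18},
    {"DYS385b": 15, "YCAIIa": 18},
    {"DYS385b": 14, "YCAIIa": 18},
    {"DYS385b": 16, "YCAIIa": 18},
]

group_modal = modal_haplotype(group)
print(group_modal)
```

Whether the modal really approaches the MRCA's haplotype depends on the sampling, which is the caveat raised above.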

Here are the results using the McGee Y-Utility. I used Generations rather than years with the default 30 years/generation.

6328

You can see that U152 and L2 have the same modal values, with several generations between the L2 subclade groups, and a large generational difference to my recent subclade's modal value. Though if the YFull estimate of L2 & U152 being 4500 years old, or 150 generations, is right, then these estimates are way off. I know we don't expect STR calculations to be accurate over large GDs.

A question I have: when we calculate TMRCAs it is for two separate people living at the same time, so it counts mutations from the MRCA person (the modal HT) back down to both living people. Is that the same as calculating it between an ancient ancestor (e.g. Modal L2) and a modern person, or the modal HT values of a descendant subclade?

As a thought experiment, imagine a descendant with a given GD to an ancestor, and a hypothetical second descendant who has the same GD to that ancestor but with completely different STRs. The distance counted between the two would then be twice the GD of one by itself, which is consistent with the TMRCA calculation assumptions.

So when comparing TMRCA's between a modal value and descendant value do we need to double it? [Maybe this is something well-known?]

We may be seeing this comparing the GD between modal Z49 and modal Z367 which is about twice the value from each to L2 modal.

Also if we take the average generations between L2 and FGC5301 et al. modal values, 39, double it and then add 40 (a rough number) for the time between modal FGC5301 et al. and a living descendant, we get 118 generations or 3540 years, closer to estimates of L2's age.
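Spelled out as a quick Python check, using the figures from this post and the 30 years/generation default mentioned above (the function name is just for illustration, and the doubling itself is the open question, not an established rule):

```python
YEARS_PER_GENERATION = 30  # default used with the McGee Y-Utility in this post

def doubled_estimate(modal_to_modal_gens, modal_to_living_gens,
                     years_per_gen=YEARS_PER_GENERATION):
    """Double the modal-to-modal figure, then add the span down to a living descendant."""
    generations = 2 * modal_to_modal_gens + modal_to_living_gens
    return generations, generations * years_per_gen

generations, years = doubled_estimate(39, 40)
print(generations, years)  # 118 3540
```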

haleaton
11-04-2015, 08:45 PM
Just a comment on something I noticed, that may be well known.

The U152 & L2 Modal HT are the same based on the R U152 Project members. But also if you look at the SWAMH-67, the Super Western Atlantic Modal Haplotype (for 67 markers), it only differs from U152/L2 by 1, for DYS456, which is a fast marker that competes for modal value.

So despite being perhaps 3-4 times older than U152, this is not discernible from STR values. My hunch is that the farther away from the modal values, the better STR counts, within the FTDNA criteria, are able to predict age and also membership in the same L2 subclade. I only have two examples of this.

R.Rocca
11-04-2015, 09:37 PM
Just a comment on something I noticed, that may be well known.

The U152 & L2 Modal HT are the same based on the R U152 Project members. But also if you look at the SWAMH-67, the Super Western Atlantic Modal Haplotype (for 67 markers), it only differs from U152/L2 by 1, for DYS456, which is a fast marker that competes for modal value.

So despite being perhaps 3-4 times older than U152, this is not discernible from STR values. My hunch is that the farther away from the modal values, the better STR counts, within the FTDNA criteria, are able to predict age and also membership in the same L2 subclade. I only have two examples of this.

I'm not sure I follow... what is it that is 3-4 times older than U152?

Krefter
11-04-2015, 09:39 PM
Did the Bell Beaker U152 guy get a call for L2? I bet he had it.

haleaton
11-04-2015, 10:05 PM
I'm not sure I follow... what is it that is 3-4 times older than U152?

I meant the age of the subclade R1b (M343) which the SWAMH-67 is supposed to represent. Unless I misunderstood something fundamental.

I was basing this on YFull's estimate for R1b, "formed 22000 ybp, TMRCA 18400 ybp" and for U152/L2 "formed 4500 ybp, TMRCA 4500 ybp". Which is 4x.

Not certain if sampling for SWAMH-67 STR was really representative enough for R1b. YFull's estimate of M269 is "formed 13200 ybp, TMRCA 6400 ybp". Which is 3x, possibly more representative of the sampling. [Edit: Using TMRCA's it is 1.4x.]
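For reference, the ratios work out as follows in a short Python check. The ages are the YFull figures quoted in this post; the dictionary layout and names are only for illustration.

```python
# YFull ages in years before present, as quoted in this post
ages = {
    "R1b (M343)": {"formed": 22000, "tmrca": 18400},
    "M269": {"formed": 13200, "tmrca": 6400},
}
U152_L2_TMRCA = 4500

# Ratio of each older age to the U152/L2 TMRCA, rounded to one decimal
ratios = {
    clade: {kind: round(age / U152_L2_TMRCA, 1) for kind, age in values.items()}
    for clade, values in ages.items()
}
for clade, values in ratios.items():
    print(clade, values)
```

The "4x" corresponds to the R1b TMRCA figure, the "3x" to the M269 formation age, and the "1.4x" to the M269 TMRCA.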

Also, I used a published value for the SWAMH-67 STR values that was somewhat old. http://www.hprg.com/R1b/page2.html

haleaton
11-04-2015, 11:12 PM
Did the Bell Beaker U152 guy get a call for L2? I bet he had it.

Don't know if this is directed at my little post. I don't really follow pottery issues and do not know accuracy of the YFull age estimations.

I suppose the SWAMH-67 could represent something much closer in time than the original R1b person.

How long it takes for a person with a Y mutation to expand and for the descendants to have archaeological impact is an interesting question on which I have no opinion.

R.Rocca
11-05-2015, 12:11 AM
I meant the age of the subclade R1b (M343) which the SWAMH-67 is supposed to represent. Unless I misunderstood something fundamental.

I was basing this on YFull's estimate for R1b, "formed 22000 ybp, TMRCA 18400 ybp" and for U152/L2 "formed 4500 ybp, TMRCA 4500 ybp". Which is 4x.

Not certain if sampling for SWAMH-67 STR was really representative enough for R1b. YFull's estimate of M269 is "formed 13200 ybp, TMRCA 6400 ybp". Which is 3x, possibly more representative of the sampling. [Edit: Using TMRCA's it is 1.4x.]

Also, I used a published value for the SWAMH-67 STR values that was somewhat old. http://www.hprg.com/R1b/page2.html

Oh, OK. I think that the SWAMH can only really be applied to P312 and its siblings though, since U106 has its different modals, Z2103 has its distinct modals and so on.

R.Rocca
11-05-2015, 12:14 AM
Did the Bell Beaker U152 guy get a call for L2? I bet he had it.

Unfortunately it was a no call. In fact, he does not have a single matching SNP with any known U152 individuals. Either he was a dead end, or U152 was so new that most of the known subclades right below it hadn't formed yet. That is why I think we are seeing plenty of SNPs below Z2103 individuals in Yamnaya samples, but they are more difficult for Bell Beaker.

lgmayka
11-05-2015, 12:15 AM
I meant the age of the subclade R1b (M343) which the SWAMH-67 is supposed to represent.
To the extent that WAMH has any useful meaning at all, it is (perhaps) the modal haplotype of R1b-P312. No higher than that.

Krefter
11-05-2015, 12:21 AM
Unfortunately it was a no call. In fact, he does not share a single downstream SNP with any known U152 individual. Either he was a dead end, or U152 was so new that most of the known subclades right below it hadn't formed yet. That is why I think we are seeing plenty of SNPs below Z2103 in Yamnaya samples, while they are more difficult to find for Bell Beaker.

By no call, do you mean he was negative for L2? Some interpreted U152 in the Alps 4,000 years ago as evidence that R1b-L51 expanded in the Neolithic and that the modern distribution of P312 already existed 4,000 years ago. But his being U152*, along with the U106* from Sweden, is evidence that wasn't the case.

R.Rocca
11-05-2015, 12:45 AM
By no call, do you mean he was negative for L2? Some interpreted U152 in the Alps 4,000 years ago as evidence that R1b-L51 expanded in the Neolithic and that the modern distribution of P312 already existed 4,000 years ago. But his being U152*, along with the U106* from Sweden, is evidence that wasn't the case.

No call as in "no data at the L2 position".

haleaton
11-05-2015, 01:08 AM
Oh, OK. I think that the SWAMH can only really be applied to P312 and its siblings though, since U106 has its different modals, Z2103 has its distinct modals and so on.


To the extent that WAMH has any useful meaning at all, it is (perhaps) the modal haplotype of R1b-P312. No higher than that.

Thanks!

Also, I did not distinguish between inclusive and exclusive modal values and there are various differing values floating around in the Interwebs . . .

According to ISOGG, the FTDNA WAMH (12 markers) is multi-valued, and the "badge" is used to encourage further 67-STR and 111-STR testing.

Walkman
11-05-2015, 03:44 AM
As many of you are now aware, we are up to a whopping 19 known subclades directly below L2 (see U152 Project Tree) ... Certainly the new R1b Backbone test will help some, but there are some STRs that may help as well, even though none are of 100% certainty.



L2 Subclade: Off Modal STR Value
Z49: DYS456 = 16
Z367: DYS460 = 10
DF90: DYS413a/b = 21-23
FGC10516: DYS442 = 11
FGC13620: DYS385b = 15, DYS389i = 14
FGC22501: DYS438 = 13
L199: DYS385a = 10



Rich,

Do you know if FTDNA is developing a backbone test for L2 or Z49? I have kit #282689 in the R1b-U152 Project. Thanks for all of your work as administrator.

Ravai
11-06-2015, 08:54 PM
Good evening. A few hours ago I was notified that I am FGC13620. Is there any theory or study of this group? Is there a colleague in the forum who has this subclade?

Regards

Ravai
11-07-2015, 07:10 AM
FGC13620: DYS385b = 15, DYS389i=14

Hi Richard, in my case DYS389i = 13.

Moreover, I could not find the STR DYS385b in https://www.familytreedna.com/public/R1b-U152?iframe=yresults

Thanks

Regards

Ravai
11-07-2015, 07:36 AM
Hello Richard, could these Italians be included as candidates for FGC13620? What do you think?

My kit:
N119885 | Giovanni Domenicus Rabai, b. 1609, Savona, Italy | Italy | R-L2 | 14 | 24 | 15 | 10 | 11-14 | 12 | 12 | 12 | 13 | 13 | 30

Kit Nº | Paternal Ancestor Name | Country | Haplogroup | DYS393 | DYS390 | DYS19 | DYS391 | DYS385 | DYS426 | DYS388 | DYS439 | DYS389i | DYS392 | DYS389ii
E5448 | Antonino Leto b 1787 Monreale. d 8/7/1837 Monreale | Italy | R-L2 | 14 | 24 | 15 | 11 | 11-14 | 12 | 12 | 12 | 13 | 13 | 31
N113294 | Luigi Taruffi, b. 1888, Bologna, Italy | Italy | R-L2 | 13 | 24 | 14 | 11 | 11-14 | 12 | 12 | 13 | 13 | 13 | 30
N16074 | Giovanni Dalle Castegnare b. 1485 d. 1555 | Italy | R-U152 | 13 | 25 | 14 | 11 | 11-14 | 12 | 12 | 10 | 13 | 13 | 30
118549 | Giovanni Mazzer, born abt 1860, Scomigo, Italy | Italy | R-U152 | 13 | 23 | 14 | 11 | 11-13 | 12 | 12 | 13 | 13 | 13 | 30
N19339 | Giovanni Maria Meaggia, Cuglieri, Sardinia | Italy | R-U152 | 13 | 24 | 14 | 10 | 10-16 | 12 | 12 | 12 | 13 | 13 | 30

DYS389i=13 and Italian persons (attachment 6581)

DYS389i=13 and 14 and Italian persons (attachment 6585)
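
For anyone who wants to repeat this check on other kits, here is a rough sketch of comparing the kits listed above with the FGC13620 signature proposed earlier in the thread. It assumes the second number of the DYS385 pair (e.g. the 14 in 11-14) is what the signature calls DYS385b; the names `SIGNATURE`, `KITS` and `matched_markers` are only illustrative. As already noted in this thread, none of these STR values is 100% certain, so a match or mismatch is a hint and not a substitute for SNP testing.

```python
# Off-modal values proposed in this thread for FGC13620 (a hint, not proof).
SIGNATURE = {"DYS385b": 15, "DYS389i": 14}

# (kit, DYS385 reported as "a-b", DYS389i), copied from the kit rows above.
KITS = [
    ("N119885", "11-14", 13),
    ("E5448", "11-14", 13),
    ("N113294", "11-14", 13),
    ("N16074", "11-14", 13),
    ("118549", "11-13", 13),
    ("N19339", "10-16", 13),
]


def matched_markers(dys385, dys389i):
    """Return the signature markers whose value this kit shares."""
    # Assumption: the second value of the reported pair is DYS385b.
    dys385b = int(dys385.split("-")[1])
    observed = {"DYS385b": dys385b, "DYS389i": dys389i}
    return [marker for marker, value in SIGNATURE.items()
            if observed[marker] == value]


if __name__ == "__main__":
    for kit, dys385, dys389i in KITS:
        hits = matched_markers(dys385, dys389i)
        print(kit, hits if hits else "no signature markers")
```

With the values listed above, none of the six kits carries either off-modal value.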

Ravai
11-07-2015, 07:12 PM
Hello, I created the map below for FGC13620 and its downstream branches. What do you think? Regards

(attachment 6595)

Armenis
11-07-2015, 09:02 PM
Welcome to the FGC13620+ group Manuel. I am very excited!

Ravai
11-08-2015, 08:06 AM
Hello Armenis, nice to meet you. I am also excited about the new discovery.

We have few kits with the FGC13620 mutation, but I think this mutation could have arisen in Liguria and moved to the British Isles and Greece. What do you think?

Incidentally, this summer we were on vacation in Crete, and on the flight back I was talking to a priest from the Monastery of Corfu, who is Argentine. Are you from Corfu?

Regards

Armenis
11-08-2015, 04:17 PM
Hello Armenis, nice to meet you. I am also excited about the new discovery.

We have few kits with the FGC13620 mutation, but I think this mutation could have arisen in Liguria and moved to the British Isles and Greece. What do you think?

Incidentally, this summer we were on vacation in Crete, and on the flight back I was talking to a priest from the Monastery of Corfu, who is Argentine. Are you from Corfu?

Regards

Hi David, yes I am from Corfu. The author Mugnos wrote that the Armeni family were from Piacenza or Milan and there are some medieval documents that claim a few individuals with the family name Armenus were from Genova (Januensis).

From Mugnos we learn that this is a noble family originally from Piacenza, which moved to Sicily under the Aragonese kings and settled in the city of Lentini, where it held the principal offices. Another branch: a very ancient family founded in the city of Messina by one Giorgio Armeni, a member of the household of King Martino. He was charged by his king with visiting all the churches and the poor, relieving them in their needs. http://www.heraldrysinstitute.com/cognomi/Armeni/Italia/idc/6715/lang/es/

Alex

Ravai
11-08-2015, 06:19 PM
Hi Alex, Pavia is a 54-minute drive from Piacenza. A Rabai family lives in that city, although this does not mean anything; they may simply not have the same mutation. There are also Rabai families in Milan. I saw that someone living there is interested in genealogy. I can tell him about the new developments and see if he wants to take a genetic test.

Armenis
11-09-2015, 02:56 AM
Hi Alex, Pavia is a 54-minute drive from Piacenza. A Rabai family lives in that city, although this does not mean anything; they may simply not have the same mutation. There are also Rabai families in Milan. I saw that someone living there is interested in genealogy. I can tell him about the new developments and see if he wants to take a genetic test.

Good plan! A test for FGC13620 in the Rabai man from Milan would be great.

dp
01-21-2016, 07:51 PM
Z49: DYS456=16; tested 16
Z367: DYS460=10; tested 11
DF90: DYS413a/b=21-23; not tested
FGC10516: DYS442=11; tested 12
FGC13620: DYS385b=15, DYS389i=14; tested 385b=14, 389i=12
FGC22501: DYS438=13; tested 12
L199: DYS385a=10; tested 11

Rich, what do you make of these STR values? What percentage of L2 (or U152) men who have DYS456=16 are in Z49?
Thanks,
dp :-)
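
The comparison above can be repeated for any kit with a small lookup. A minimal sketch, assuming the off-modal values from the table in the first post and the tested values listed in this post; `OFF_MODAL`, `TESTED` and `classify` are made-up names, and since none of these STRs is 100% certain, a "match" only suggests which SNP to test first.

```python
# Off-modal STR values per L2 subclade, as listed in the first post.
OFF_MODAL = {
    "Z49": {"DYS456": "16"},
    "Z367": {"DYS460": "10"},
    "DF90": {"DYS413a/b": "21-23"},
    "FGC10516": {"DYS442": "11"},
    "FGC13620": {"DYS385b": "15", "DYS389i": "14"},
    "FGC22501": {"DYS438": "13"},
    "L199": {"DYS385a": "10"},
}

# Tested values from this post; DYS413a/b was not tested, so it is absent.
TESTED = {
    "DYS456": "16", "DYS460": "11", "DYS442": "12", "DYS385b": "14",
    "DYS389i": "12", "DYS438": "12", "DYS385a": "11",
}


def classify(tested):
    """Label each subclade 'match', 'no match' or 'not tested'."""
    result = {}
    for subclade, signature in OFF_MODAL.items():
        known = {m: v for m, v in signature.items() if m in tested}
        if any(tested[m] != v for m, v in known.items()):
            result[subclade] = "no match"      # a tested marker differs
        elif len(known) == len(signature):
            result[subclade] = "match"         # every marker tested and equal
        else:
            result[subclade] = "not tested"    # marker(s) missing, none differ
    return result


if __name__ == "__main__":
    for subclade, status in classify(TESTED).items():
        print(f"{subclade}: {status}")
```

For the values in this post, only Z49 comes back as a match, DF90 is not tested, and the rest do not match.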


As many of you are now aware, we are up to a whopping 19 known subclades directly below L2 (see U152 Project Tree). Two separately published studies have highlighted L2 as one of the European male lineages with the highest branching. Perhaps the first "Mr. L2" was the Charlemagne of the Bronze Age, but that is all for ancient DNA to figure out. One thing is clear however...the expansion of L2 was so rapid and so successful, that it is nearly impossible to guarantee an SNP subclade based on haplogroup signatures only. Certainly the new R1b Backbone test will help some, but there are some STRs that may help as well, even though none are of 100% certainty. I thought I'd start this thread to discuss some of those that I know of off the top of my head. Perhaps others can add to it. They are as follows:



L2 Subclade: Off Modal STR Value
Z49: DYS456 = 16
Z367: DYS460 = 10
DF90: DYS413a/b = 21-23
FGC10516: DYS442 = 11
FGC13620: DYS385b = 15, DYS389i = 14
FGC22501: DYS438 = 13
L199: DYS385a = 10