U152 > L2 > ZZ48



R.Rocca
05-21-2016, 02:44 PM
This thread is to keep folks up to date on L2 subclade ZZ48. This mutation is on palindrome P6 and shows up as a mixture of ancestral and derived reads at positions ZZ48_1 and ZZ48_2. The positions are as follows:

ZZ48_1:
18316065(A>G)
ZZ48_2:
18493038(T>C)

Current subclades include:

Group A: FGC10516+ FGC10536+ FGC10543+
Rocca (Italy), Simonetti (Italy), Bonfanti (Italy), Gronella (Italy), Fiero (Italy), two anonymous Sardinians, England (England), Curtis (England), Lambert (France), Robert (France), two anonymous Bristol England samples
Group B: FGC5301 et al
Eaton (England), Whitney (England), Day (England)
Group C: FGC5336 et al
Montaguti (Italy), two anonymous Sardinians
Group D: BY3477+ BY3486+ L199+
Neese (German), Epstein (German Ashkenazi)

Groups A and D have an almost absolute DYS442=11 modal value.

haleaton
05-21-2016, 07:22 PM
Rich, thanks for starting this thread. I have had a hard time understanding the specifics of the ZZ SNPs first identified by Alex Williamson for The Big Tree:

http://www.ytree.net/DisplayTree.php?blockID=1431

Is there something specific that connects the two locations for ZZ48 in the Palindrome?

Also, to identify a ZZ SNP, it seems that other samples must not show heterozygous results at the position, which is the sign that it is unique. Is there a way to check all the public samples to support the validity of ZZ-type SNPs?

Acque agitate
05-21-2016, 09:54 PM
First of all I thank Richard and Alex for the fantastic job that they have done.
After many years in which I thought I was the last of the Mohicans of FGC5336 et al. (along with two anonymous Sardinians), I finally found out that I belong to a new subgroup.

First of all, a question:
Richard reported that ZZ48+ is divided into four distinct subgroups, while the site http://www.ytree.net indicates that there are only three subgroups (Group D: BY3477+, BY3486+, L199+ is placed under Group A: FGC10516+, FGC10536+, FGC10543+).
Which is correct?

In any case, based on some of my own analysis, I think it is very likely that these samples belong to Group D:
1) Simon Shonkwiler (sample FTDNA No. 238720);
2) Szarvas (sample FTDNA No. 194683); this sample should belong to the Epstein group.

I think it is interesting that the Shonkwiler family is also originally from Germany, in fact from near Frankfurt.
The Szarvas family should be more closely connected with the Epstein family. It may not be a coincidence that this family is originally from Bukovina.

In conclusion, one last reflection. If it is confirmed that the ZZ48+ group has four subgroups, it would clearly be the most prolific direct subgroup of R1b-L2. This could help us understand the historical position of the man in whom the ZZ48 mutation arose.

R.Rocca
05-22-2016, 11:51 AM
Hal, this may help explain the relationship, especially the graphic of the palindromic arms and the correlation between genes on both sides... https://phylogenous.wordpress.com/2010/10/01/y-chromosome-vii-why-palindromes/

R.Rocca
05-22-2016, 11:57 AM
Alex is still looking at the Epstein/Newman files, so they are temporarily placed there. Remember that Big-Y misses a lot of positions, so some SNPs are not covered and therefore other samples like FullGenomes, 1000 Genomes etc need to be looked at to properly place them.

haleaton
05-22-2016, 02:38 PM
Thanks. My question is this: in a palindrome of known length and position, to the extent the reference sequence can determine it, does a SNP location on one arm determine the location on the other arm, or was this done empirically by Alex (though there really is limited Big Y data in this case)? Is there a mathematical formula that gives one location from the other?

I include my results for myself (FGC, Big Y), Whitney (Big Y), and my 11th 1R Eaton cousin (FGC). Comparing the others in the YFull U152 group, only one other, who is anonymous as far as paper ancestry goes, has this level of heterozygosity to indicate a SNP.

However, it would be interesting to see the heterozygosity across a larger sample of all human Y haplogroups.

ZZ48_1:
18316065(A>G)

YF01461 (R-A197) 32A 39G Eaton#1 FGC Elite 1.0 BGI
YF02170 (R-A197) 51A 39G Eaton#1 Big Y
YF04174 (R-A197) 35A 28G Whitney Big Y
YF05537 (R-A197) 57A 49G Eaton#2 11th 1R Eaton#1 cousin Elite 2.0

YF02458 (L-2*) 35A 53G (Big Y) Anonymous in YFull U-152 Group (probably one of the samples in Rich's post)

ZZ48_2:
18493038(T>C)

YF01461 (R-A197) 30T 27C
YF02170 (R-A197) 1T (Big Y)
YF04174 (R-A197) no call position (Big Y)
YF05537 (R-A197) 63T 48C

YF02458 no call (Big Y)

I have three Elite 2.1 samples (Eaton#3, Whitney, Day) in batch 8007 due in September. While all of these samples are probably related within the last 2000 years at most, they may show the same clear evidence for ZZ48 as my FGC sample and my 11th 1R cousin, separated by 500 years.
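For anyone who wants to compare read counts like those above, here is a minimal Python sketch that turns a count string such as "32A 39G" into the share of derived reads. The function names, the 0.2 to 0.8 "mixed" band, and the 10-read minimum are illustrative assumptions, not thresholds used by YFull or The Big Tree.

```python
import re

def derived_fraction(counts, ancestral, derived, min_reads=10):
    """Return the derived-read share from a string like '32A 39G'.

    Returns None when fewer than min_reads reads cover the position
    (min_reads is an arbitrary illustrative cutoff).
    """
    tally = {base: int(n) for n, base in re.findall(r"(\d+)([ACGT])", counts)}
    total = tally.get(ancestral, 0) + tally.get(derived, 0)
    if total < min_reads:
        return None  # treat as a no call
    return tally.get(derived, 0) / total

def looks_mixed(fraction, low=0.2, high=0.8):
    """True when ancestral and derived reads are both well represented.

    The band is an assumption; a two-arm palindrome SNP is expected to sit
    near one half derived reads.
    """
    return fraction is not None and low <= fraction <= high

# Read counts copied from the post above (ZZ48_1 is A>G, ZZ48_2 is T>C).
ZZ48_1_READS = {
    "YF01461": "32A 39G",
    "YF02170": "51A 39G",
    "YF04174": "35A 28G",
    "YF05537": "57A 49G",
    "YF02458": "35A 53G",
}

for sample, counts in ZZ48_1_READS.items():
    share = derived_fraction(counts, "A", "G")
    print(sample, counts, round(share, 3), looks_mixed(share))
```

A single-read result such as the Big Y "1T" at ZZ48_2 falls below the minimum and comes back as None, which matches treating it as uninformative rather than as ancestral.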

R.Rocca
05-22-2016, 04:51 PM
My understanding is that, once you know the length of the palindrome, you will know the relationship between the two values. I suspect Alex just checks a site like NCBI, as he already knows that a specific location sits in a specific arm of a palindrome.
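As a rough illustration of that relationship: in an idealized palindrome whose arms are exact reverse complements with no insertions or deletions, a position and its mirror always add up to the same constant, and the base change on the far arm is the complement of the one on the near arm. The sketch below derives the constant from the two ZZ48 coordinates themselves (18316065 + 18493038 = 36809103) rather than from published P6 boundaries, which are not given in this thread. Real arms can differ by small indels, so treat this as an approximation; the function names are made up for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

ZZ48_1 = 18316065  # A>G
ZZ48_2 = 18493038  # T>C

# Assumption: the sum is taken from the known pair, not from P6 boundary coordinates.
AXIS_SUM = ZZ48_1 + ZZ48_2

def mirror_position(pos, axis_sum=AXIS_SUM):
    """Position on the opposite arm of an idealized, indel-free palindrome."""
    return axis_sum - pos

def mirror_change(ancestral, derived):
    """The same mutation as read on the opposite (reverse-complement) arm."""
    return COMPLEMENT[ancestral], COMPLEMENT[derived]

print(mirror_position(ZZ48_1))   # lands on the ZZ48_2 coordinate
print(mirror_change("A", "G"))   # A>G on one arm reads as T>C on the other
```

This is also why ZZ48_1 is listed as A>G while ZZ48_2 is listed as T>C: they are the same change seen on opposite strands.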

R.Rocca
05-24-2016, 05:14 PM
Hal, this is the explanation I received from Alex via PM...


Thomas Krahn has the positions of the majority of the palindromes loaded into YBrowse (attached). I also went through the reference sequence myself trying to identify the boundaries of each of the palindromes. I did that before Thomas added palindromes to YBrowse, but I haven't tried to compare them. I think they could both be tweaked a little.

By your second question, I take it you're asking how to find the "same" position on the various other arms of the palindromes? If not, let me know.

I use BLAT to find them.

https://genome.ucsc.edu/cgi-bin/hgBlat?command=start

Take ZZ48 for example. ZZ48_1 = 18316065-A-G. It is on palindrome P6. To find the same position on the opposite arm, I first retrieve the reference sequence with say 200 bp on either side of 18316065. Then I search for that sequence in BLAT. There will be two possibilities with about a 100% match. One of them will be ZZ48_1 and the other ZZ48_2.

In the other palindromes, there may be 3 or 4 different matches. They tend not to show up as well in BigY results, and seem a little patchy. That's probably because only 1/3 or 1/4 of the reads will have the desired mutation and it won't always be called.

Regards,
Alex
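The idea behind Alex's BLAT lookup can be sketched on a toy sequence: take a window around the position, then look for that window, or its reverse complement, elsewhere in the reference. This is only an exact-match stand-in written for illustration. BLAT itself tolerates mismatches and works against the full genome, and the sequence, function names, and 3-base flank here are invented for the example (Alex uses about 200 bp on either side).

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def paired_positions(reference, pos, flank):
    """0-based positions whose surrounding window matches the one at pos.

    The window is searched on both strands with exact matching only,
    a simplification of what BLAT does.
    """
    start, end = pos - flank, pos + flank + 1
    if start < 0 or end > len(reference):
        raise ValueError("window runs off the end of the reference")
    window = reference[start:end]
    hits = set()
    for query in (window, reverse_complement(window)):
        i = reference.find(query)
        while i != -1:
            hits.add(i + flank)  # the centre of the matched window
            i = reference.find(query, i + 1)
    return sorted(hits)

# Toy palindrome: one arm, a short spacer, then the arm's reverse complement.
ARM = "ACGTTGCAAGGCTTAC"
REFERENCE = ARM + "TTTTT" + reverse_complement(ARM)

print(paired_positions(REFERENCE, 6, 3))
```

In a two-arm palindrome such as P6 this yields two hits, the queried position and its partner on the other arm; in the larger palindromes Alex mentions, three or four hits would be expected.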

R.Rocca
05-26-2016, 04:20 PM
In an interesting twist, these two groups now share a subclade below ZZ48 known as ZZ56:

Group A: FGC10516+ FGC10536+ FGC10543+
Group D: BY3477+ BY3486+ L199+

As mentioned in the opening post, both groups have an almost absolute DYS442=11 modal value, so it looks like the lone FGC10516+ FGC10536+ FGC10543+ sample with DYS442=12 was a back mutation to the P312 modal value.

Acque agitate
05-26-2016, 10:36 PM
Congratulations Richard,
I propose a daring hypothesis:
1) the ZZ48 group could share DYS449 = 30-31
2) the Venetian subgroup of FGC10530 (Gronella, Bonfanti, Simonetti) might have had a strong back mutation from DYS449 = 30-31 to DYS449 = 27

R.Rocca
05-26-2016, 10:51 PM
Correct, it seems that all ZZ48 is above the modal and the move down to 27 was likely a one time multi-step event.

haleaton
03-23-2017, 04:59 PM
So the mass array SNP Pack seems effective in identifying ZZ48+ samples. I noticed ZZ48_1 and ZZ48_2 are also available as individual orders for $39 each. I was wondering whether they are really amenable to Sanger sequencing and whether anybody has tried an individual order of just one. YSEQ does not offer ZZ48 and I never checked if it was possible.

When there were just a few SNPs, we used to screen folks who were just L2+ with a few SNPs. I am curious how effective it might be to get folks to test just ZZ48_1, rather than the more expensive SNP Pack, for the selfish reason of finding more ZZ48+ candidates for further testing.

haleaton
03-23-2017, 05:16 PM
Actually, I did check with YSEQ and they declined to offer ZZ48 for the following reasons:

ZZ48_1 18316065 A_G palindromic region*
ZZ48_2 18493038 T_C palindromic region*

It also turns out that, for some reason, in a recent test FGC Elite Prime 2.1 did not reliably cover ZZ48, unlike previous versions of their Y tests. I don't know about the various WGS tests they offer, though I should find out soon on a couple of samples.

Big Y BAM does have ZZ48_1 but ZZ48_2 was questionable in the couple of samples I looked at.

FTDNA and YFULL do not include ZZ48 in their tree, only on Alex Williamson's The Big Tree.

So for existing STR testers at FTDNA, the most cost-effective way of screening for ZZ48 is an individual order, though a SNP Pack would give much more info for negatives.

R.Rocca
03-23-2017, 07:00 PM
Slight correction... FTDNA does now include ZZ48_1 and ZZ48_2 in their tree, but not subclade ZZ56.

haleaton
03-23-2017, 07:51 PM
You are correct, of course, Rich. I missed it. We were just able to split FGC5301/FGC5303 and FGC5306/FGC5308 by getting an L2 SNP Pack for somebody we knew would be that from NGC Y Elite Testing.

So I am wondering whether they can now reliably test ZZ48_1, and whether they would retest or run the mass array, all for $39.

An L2+ person wishing to spend their own money to learn more would be better off getting the L2 SNP Pack for $119. However, if other ZZ48+ folks wanted to pay for or subsidize a test for another individual to see if they are ZZ48+, then this might be an option. It is all up to the admins and how reliable the individual-order ZZ48_1 test is.