R-A197 or How I got to stay near the top of the YFull R-L2 subtree - UPDATE



haleaton
04-04-2015, 05:30 PM
U152 > L2 > A197?

In YFull Experimental YTree v3.7 at 28 March 2015, at the top of subtree under R-L2 is a new entry:
R-A197 A197 formed 4700 ybp, TMRCA 150 ybp [formed CI 95% 5200<->4100 ybp, TMRCA CI 95% 375<->50 ybp]
id: YF01461 ENG Jonas Eaton b. c.1618 Kent, England (FGC BGI FGC-FG1059A)
id: YF02170 ENG Jonas Eaton b. c.1618 Kent, England (FTDNA BigY FTDNA-125963)

Both are data sets from samples of the same person: me. My two samples match on 396/396 YFull STRs.
If ybp is defined from 1 JAN 1950, then the TMRCA is actually -8 ybp [assuming I am my own ancestor in the calculation.] :)
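
For anyone who wants the arithmetic behind that joke, here is a minimal sketch. It assumes the common convention that "present" in years-before-present means 1950, and a hypothetical birth year of 1958 that is only inferred from the -8 figure above. I have not checked which reference year YFull actually uses.

```python
# Assumption: "before present" is counted back from the calendar year 1950.
BP_REFERENCE_YEAR = 1950


def year_to_ybp(calendar_year: int) -> int:
    """Convert a calendar year (CE) to years before present."""
    return BP_REFERENCE_YEAR - calendar_year


def ybp_to_year(ybp: int) -> int:
    """Convert years before present back to a calendar year (CE)."""
    return BP_REFERENCE_YEAR - ybp


# Hypothetical birth year, chosen only because it reproduces the -8 above.
assumed_birth_year = 1958
print(year_to_ybp(assumed_birth_year))  # a date after 1950 gives a negative ybp
print(ybp_to_year(150))                 # the listed TMRCA of 150 ybp as a calendar year
```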

The SNP A197, aka 13677425(A/T), was found in both my BGI data (with ** reliability) and my Big Y data (with *** reliability). It was not named by the FGC analysis because of its lower reliability, though FTDNA did identify it in their Big Y as a medium quality novel variant.

A197 was named and put in YBrowse after I asked YSeq to determine whether primers could easily be made. They could not, because it is in a centromeric GGAAT repeat region, so I could not verify A197 by Sanger sequencing at YSeq.

So YFull probably named R-A197 because, out of all the common SNPs found in the FGC and BigY coverage, it had a name. And fortunately for me, YSeq uses the first letter of the alphabet, "A", so I get to be just after my old R-L2* position on the YFull tree. ;)

A197 is no joke, however, because it was also found in the BigY sample FTDNA-92676 and put at the bottom of the L2 branch of Rocca's U152 Project Tree:
FGC5301, FGC5303, FGC5304, FGC5306, FGC5308, FGC5311, FGC5312, FGC5313, FGC5314, FGC5315, FGC5316, FGC5317, FGC5318, FGC5321, FGC5322, FGC5323, 13677424(G/A), 13677425(A/T), 21852729(C/G), 22467805(G/T), 22467806(A/G)

FTDNA-92676 is Hiram C. Whitney, Orangeburg, South Carolina, who is a likely descendant of Samuel Whitney (c. 1615-1684), who went from England to Bermuda in the 1600s, about the same time my Eaton ancestor left for New England. We are in the "U152> L2> FGC5301 et al." branch in the U152 group results, along with others who are close STR matches. FTDNA-92676 and I, FTDNA-125963, match at BigY, with an STR distance of 6/111.

My distant Eaton cousin, FTDNA-41600 (John Eaton, b. c. 1636, Watertown, MA), had his FGC saliva sample fail BGI Q/A twice, so it was sent to FGC's other lab in the US, and we are waiting to see if that lab can extract enough DNA. FTDNA-41600 and FTDNA-125963 are 6/67 on STRs and on paper have a common male ancestor, William Eaton (c. 1540-1584), of Dover, Kent.

I sent one e-mail to the FTDNA Eaton/Whitney/Day STR matches, at 37 markers or higher, asking them to consider testing BigY, but did not get a response other than that one person was tired of all the FTDNA e-mail traffic on things like STR-12 matches. [I do like Spam in Musubi or Budae-jjigae. :P]

In my samples, A197 was 6A 24T 1DEL reads in FGC BGI and 1A 84T 2AATGC reads in FTDNA BigY. It cannot be (easily) Sanger sequenced and was only named because I asked YSeq about it. It was not named by FGC, which judged it unreliable, but it would have been given an internal name by YFull. If the Whitney FTDNA-92676 sample, which would have matched, had also been contributed to YFull, then the branch would likely be R-FGC5301, as that is the first SNP in B37 order and was already in YBrowse.
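
To put those read counts in perspective, here is a rough sketch that turns them into per-allele fractions. It assumes "1DEL" means one read showing a deletion and "2AATGC" means two reads showing an AATGC insertion, which is my reading of the notation rather than anything the labs state. The function name is made up for illustration.

```python
def allele_fractions(counts: dict[str, int]) -> dict[str, float]:
    """Return the fraction of reads supporting each observed allele."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads at this position")
    return {allele: n / total for allele, n in counts.items()}


# Read counts at 13677425 as quoted above (reference A, derived T).
fgc_bgi = {"A": 6, "T": 24, "DEL": 1}
ftdna_bigy = {"A": 1, "T": 84, "AATGC": 2}

for label, counts in (("FGC BGI", fgc_bgi), ("FTDNA BigY", ftdna_bigy)):
    fractions = allele_fractions(counts)
    depth = sum(counts.values())
    print(f"{label}: depth {depth}, derived T fraction {fractions['T']:.3f}")
```

On these numbers the derived T is supported by 24 of 31 reads in the BGI data and 84 of 87 in the Big Y data, so both samples lean the same way even though neither is a clean call.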

So it will be interesting to see how long R-A197 remains high on the YFull L2 tree, or if it truly ends up as a non-family L2 subclade name, thankfully high in the alphabet under L2. :D

R.Rocca
04-04-2015, 06:50 PM
I'm not sure I follow. YFull puts SNPs on a tree even though only one kit has tested positive for it?...or in this case, wouldn't all of the two Eaton kits' SNPs need to be added?

lgmayka
04-04-2015, 06:53 PM
1) You really need to tell YFull that those two samples are the same person, and should be merged rather than treated as a subclade.

2) If #92676 submits his BAM file to YFull, you will have a legitimate subclade.

haleaton
04-04-2015, 07:38 PM
They do know these kits are the same person, but my guess is much of the tree is automated. Unless they forgot. I like being R-A197.

haleaton
04-04-2015, 08:07 PM
Yes, Richard Rocca has put 13677425(A/T), aka A197, on the U152 Project Tree and Group as "U152> L2> FGC5301 et al." If he chooses to recognize A197 as the name for 13677425(A/T), that would make me happy, though I don't know if that would change my position under L2 in what I consider to be the definitive U152 tree at this time. I always enjoyed being R-L2* up at the top. B)

I guess YFull is also including the ISOGG v9.29 tree of 2 March 2014. It is not clear that my own Eaton and the Whitney results, with an STR distance of 6/111, meet the rules about SNPs being 500 years old (I have not recently checked on rule changes), but my paper guess is that our MRCA is probably older than that.

YFull does know the two Eaton samples are the same person, but they process a lot of data. I will check if they use any data that is not shown, but that is not likely. So it is likely a mistake of sorts, but one that turns out to be true. The TMRCA date is sort of a giveaway: just the difference between FGC Elite and Big Y, I guess.

haleaton
04-04-2015, 08:59 PM
I checked the currently available version of YBrowse for the SNPs that do not have names in Rocca's U152 Project Tree.

13677424(G/A) - This was not named, along with its twin 13677425(A/T), as they were both (**) in FGC BGI and (***) in BigY for reliability. Also, as a doublet, GA>AT (***), they were classed as too unreliable to be named. FTDNA did note them as medium quality positive novel variants and matched me with the Whitney sample. This one is available for naming, but it was not (easily) amenable to Sanger sequencing at YSeq, as both are near the centromeric GGAAT repeat. I will recheck with YSeq sometime.

13677425(A/T) aka A197 which matched the Whitney sample.

21852729(C/G), aka YFS071059, an internal name given by YFull. It was not named by FGC or FTDNA, as the results were ** and *** and it was outside of FTDNA's BED file. It was in the data, however, and matched the Whitney sample. It was verified in my sample by Sanger as G+.

Another doublet, GA>TG (***):
22467805(G/T), aka YFS071066: ** and *** in FGC and not named, rated by YFull as acceptable and named internally, medium positive novel variant in Big Y, matched with the Whitney sample.
22467806(A/G), aka YFS071067: ** and *** in FGC and not named, rated by YFull as unreliable and named internally, medium positive novel variant in Big Y, matched with the Whitney sample.

Neither of these was amenable to Sanger sequencing, being part of a "125bp repeat", but I will recheck with YSeq.

These are good examples of the different cases with all sorts of exceptions.
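
As an aside, the two doublets above can be picked out mechanically from the position list. This is only a sketch using the five unnamed positions from Rocca's tree; the function name is made up, and "adjacent positions" is my working definition of a doublet here, not a rule any of the labs publish.

```python
# (position, reference allele, derived allele) for the five unnamed variants.
variants = [
    (13677424, "G", "A"),
    (13677425, "A", "T"),
    (21852729, "C", "G"),
    (22467805, "G", "T"),
    (22467806, "A", "G"),
]


def find_doublets(variants):
    """Pair up variants at directly adjacent positions into 2-base changes."""
    ordered = sorted(variants)
    doublets = []
    for (pos1, ref1, alt1), (pos2, ref2, alt2) in zip(ordered, ordered[1:]):
        if pos2 - pos1 == 1:
            # Concatenate the two bases so the change reads like GA>AT.
            doublets.append((pos1, ref1 + ref2, alt1 + alt2))
    return doublets


for start, ref, alt in find_doublets(variants):
    print(f"{start}-{start + 1}: {ref}>{alt}")
```

This reports the GA>AT pair at 13677424-13677425 and the GA>TG pair at 22467805-22467806, leaving 21852729 as the only stand-alone SNP of the five.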

R.Rocca
04-04-2015, 11:45 PM
OK, I added the SNP names and re-uploaded the spreadsheet. Thanks.

haleaton
04-05-2015, 02:50 AM
Thanks. The reason 13677424(G/A) was not named is that I didn't ask YSeq about it: when they said 13677425(A/T) was not amenable to sequencing, I assumed the previous location was not either. I will recheck with them, and if they give it a name and put it in YBrowse I will let you know.

There were 9 other SNPs and an INDEL that were in my BigY data but were excluded by the BED file, and I do not know if a comparison with the Whitney FTDNA-92676 sample has been done for them. Did you ever get an FTDNA-92676 BAM or find out if it was submitted to FGC for analysis? 8 out of 10 were validated by Sanger methods at YSeq, on my sample only.

26 identified SNP locations in my FGC data are not in the BigY due to the coverage difference; of these, only 7 were verified, on my sample only, by Sanger methods. For the others, they could not easily create primers.
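
For anyone wondering what "excluded by the BED file" means in practice, here is a minimal sketch of the check. It relies on the standard BED convention (0-based start, end not included), while SNP positions like the ones in this thread are 1-based. The intervals below are hypothetical and are NOT taken from FTDNA's actual BED file; they are just arranged so that 21852729 falls outside, as described earlier in the thread.

```python
from bisect import bisect_right


def in_bed(intervals, pos_1based):
    """True if a 1-based position falls inside sorted, non-overlapping
    BED-style intervals given as (start_0based, end_exclusive)."""
    pos0 = pos_1based - 1  # convert to the 0-based coordinate BED uses
    starts = [start for start, _ in intervals]
    i = bisect_right(starts, pos0) - 1  # last interval starting at or before pos0
    return i >= 0 and pos0 < intervals[i][1]


# Hypothetical covered regions for illustration only.
hypothetical_bed = [
    (21_850_000, 21_852_000),
    (22_467_000, 22_468_000),
]

for pos in (21852729, 22467805):
    status = "inside" if in_bed(hypothetical_bed, pos) else "outside"
    print(pos, status)
```

A read can still exist in the BAM at a position outside the BED regions, which is why those SNPs were "in the data" but not in the reported results.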

haleaton
09-25-2015, 05:35 PM
I got there before by matching myself, with my FGC and Big Y samples treated as different persons . . .

Will see if the age estimates change.

haleaton
09-26-2015, 09:56 PM
My biggest complaint with YFull right now is that they seem to be exercising selection bias (Big-Y over FGC) in positioning SNPs on their tree.

Almost everything else I've found outstanding, and I recommend that both FGC and BigY BAM files are worthy of submission.



I moved your comment over here < L2. This may have some entertainment value to a small list of readers . . . I too find YFull outstanding.

Earlier this year I sneaked onto the YFull tree under L2 < U152 as R-A197, I assume because I submitted both my FGC Y Prime (BGI) and my FTDNA Big Y, while making it plain that they were samples from the same person. I became a subclade of one.

A197, aka 13677425(A/T), was a ** reliability SNP (6A 24T 1DEL reads), so FGC did not name it. It was a *** SNP (1A 84T 2AATGC reads) in my Big Y. Even YFull did not call it out, but for some reason FTDNA classified it as medium quality positive. So I asked YSeq, and they said it could not be easily Sanger sequenced, but they did give it a name and put it in YBrowse. All of my FGC named SNPs (blank or * reliability) are put quickly in YBrowse.

I have never understood why YFull chose A197, which they ranked as four stars. Rich Rocca wondered why they only listed one instead of all the SNPs on which I match myself. YFull said they did not use data from any sources they do not name. I assumed it was just because it was first in alphabetical order.

It turned out in the meantime that another person's Big Y matched A197 along with others, and Rich put 21 SNPs together to define a little subclade of L2.
One of those SNPs, FGC5301, was chosen for the R1b-M343 Backbone SNP Pack, which is a good choice out of many.

As we speak, this Big Y has been submitted to YFull, with final processing for SNPs estimated for 10/2/2015, though the dates vary every day. I can use the Browser function now for YF04174 . . . A197 is a clean 60T read.

So I still wonder just why YFull chose A197 and how the tree updates. There were many FGC named SNPs to choose from, which may relate to your comment.

If there is a lesson, it may be FTDNA is not always wrong, and hopefully right even more than twice a day--assuming a 12 hour clock . . .

Kwheaton
09-27-2015, 11:36 PM
Hal,
I just wanted to congratulate you on your WHITNEY match---just happened to notice it and then saw this post of yours. And congrats on the A197---it's easier than my mouthful FGC22501, although that's not too hard to remember. Here's raising a glass to you :beerchug:
Kelly

haleaton
09-28-2015, 01:08 AM
FGC5301 is a better choice than A197, which is unreliable, unnamed by YFull, can't easily be Sangered in the centromeric GGAAT repeat region, and was deemed ** or *** by FGC and so also left unnamed. For some reason FTDNA said it was medium quality in Big Y. Rich Rocca did put it on the U152 & Subclades tree way back, along with its twin.

A197 is actually part of a couplet MNP, GA>AT, together with 13677424(G/A), now called A7393. I checked again whether it could be Sangered, and the new assistant approved it, not recognizing the difficulties; Thomas later said he could not guarantee the reliability of the results . . .

Thanks! Kelly. I enjoyed being on the YFull Tree based, I think, on having two samples from me. But now it will be legit. :amen:

haleaton
10-08-2015, 08:40 PM
Decided to post this here.

Another English surname sample, at GD 7/67 from me, tested FGC5301+ on the R1b-M343 Backbone SNP Pack test, which was only ordered on 9/22/2015!

This shows the SNP Pack is working for one SNP in this L2 subclade, which was found in my Big Y and one other.

Because all three samples are more recently related, the FGC5301 test may not catch earlier lines that branched off.

So far FTDNA has not placed R-FGC5301 on their public tree, though it is listed "on tree" in Big Y.


In other news YSEQ came back today with positive Sanger results in my sample for the MNP A7393 & A197 GA>AT (13677424,13677425)!

Previously, Thomas Krahn advised me:

"For your understanding A197 and A7393 are both in the centromeric GGAAT
repeat region. Of course we can design primers for it and maybe we can
get a readable sequence. However this SNPs will be meaningless for your
research.

You may have heard that the centromere is a highly repetitive region
that develops in a permanent flow of self-recombination. Mutations come
and go because they are deleted through LOH between the large scale
repetitive elements. The reference sequence only represents a small
fraction of the repeats that exist in real Y chromosomes. It just can't
get sequenced with any sequencing technology, especially not with NGS
short read sequencing. The reads just align on the A197 region because
the actual regions where they come from are missing in the reference
sequence.
We cannot give any warranty that this sequencing will be successful.

We'll still order this pair of primers, but don't expect too much..."

Good SNP or not? YFull still has it as a simple subclade-defining SNP, though it now has a sample different from mine to justify it.

haleaton
05-06-2016, 08:45 PM
So in the last few days, FTDNA did reclassify my Y haplogroup as R-FGC5301 (I assume based on Big Y), along with my Whitney Big Y match and the Day sample, who took the R1b SNP pack and was positive.

ColinUSA
09-12-2016, 03:56 PM
Hi Hal,
I am interested in your results because it is hard to find YFull data for persons sharing a known ancestor with known TMRCA expressed as generations.

On the Eaton project pages, the descendant tree for John Eaton, b 1611 England d.1658 Dedham, MA is cutoff after 9 generations. How many more generations to yourself YF02170 ?

The descendant tree for Jonas Eaton, b c1618, England is cutoff after 8 generations. Do you know how many more generations to the tester YF05537 ?

I understand Jonas and John are first cousins, i.e. 2 more generations to William Eaton (c. 1540-1584), of Dover, Kent., is that correct?

Regards,
Colin

haleaton
11-16-2016, 03:59 PM
Yes. Been busy offline.

haleaton
11-17-2016, 04:54 PM
I finally got the FGC Y Elite 2.1 results on the FTDNA GD 7/67 sample, and initially it appears it will split off from R-A197, aka FGC5301 et al., at a much earlier date, based on the number of assumed negative SNPs and the large number of private SNPs.

One problem I have with the FGC reports is that they do not call out negatives well, even for samples previously tested by them.

haleaton
11-18-2016, 04:41 PM
I just noticed FTDNA is running a Y-HAP-Backbone test on me on their own:

"AWAITING TESTS
Tests            Lab Procedure   Batch   Expected
Y-HAP-Backbone   Backbone        692     11/23/2016 - 12/07/2016

Results for your SNP testing have been delayed. We have updated your predicted completion date, and action by you is not usually required. Note that predicted dates are an estimate and subject to lab volume and quality control."

Not clear why they are doing this or if this is all that common.

Anybody know?