Big-Y results!



Pigmon
02-03-2015, 08:13 PM
Hey guys I finally got my results! There are so many SNPs on here that I have not even seen before that I don't know how to use them.

What do I do now?

Curtis

ArmandoR1b
02-03-2015, 08:39 PM
Download the files, upload them to Dropbox, order a professional analysis by YFull at http://www.yfull.com/ and then give them the link to your files.

R.Rocca
02-03-2015, 08:45 PM
Hey guys I finally got my results! There are so many SNPs on here that I have not even seen before that I don't know how to use them.

What do I do now?

Curtis

Curtis, I checked your results and you are still L2 > Z49, Z68 > Z142 > Z150, Z12222, 22867545(A/T). If you can have FTDNA generate your BAM file and place it on Google Docs or Dropbox, that would be great.

Pigmon
02-03-2015, 09:52 PM
I'm lost when it comes to dropbox. I have sent an email to [email protected] and asked them to upload the file.

Also I created an account with dropbox and uploaded the file.

I think it is called a VCF file. Is that the one you need? Or is the BAM file different?

MitchellSince1893
02-04-2015, 02:39 AM
Wow that was a quick turn around! Congrats. Now the fun begins.

Pigmon
02-04-2015, 04:25 PM
Yes that was not as long a wait as I was expecting! So I guess the BAM is a separate file.

Here is what ftdna said:

"Thank you for your email. I have requested your BAM file be generated. Please allow 1 - 2 business days for this to complete processing. You should be able to log into your kit and download that file by clicking on the blue Download Raw Data button from your Big Y results page. Again, this may take up to 2 business days to generate and be available. Please let me know if you encounter any issues."

Richard, I will let you know in a couple of days when this comes in.

Curtis

Pigmon
02-04-2015, 04:32 PM
Wow that was a quick turn around! Congrats. Now the fun begins.

Mark,

Here are my top 2 matches:



Match Name | Shared Novel Variants | Known SNP Difference | Non-Matching Known SNPs | Matching SNPs | Match Date
Mr. Charles Lincoln | 52 | 1 | PF7214 | 25,738 | 2/2/2015
Mr. R Edward Mitchell | 52 | 1 | PF7214 | 25,188 | 2/2/2015



Are you sure you aren't my cousin? lol

MitchellSince1893
02-04-2015, 05:12 PM
Be careful of the Big Y matches. They can be misleading. I bet you are the first of a new branch of Z150.

Pigmon
02-04-2015, 05:45 PM
Be careful of the Big Y matches. They can be misleading. I bet you are the first of a new branch of Z150.

I hadn't thought of that. Makes a lot of sense though because I thought I would have some of the FGC SNPs and there were none.

MitchellSince1893
02-04-2015, 09:11 PM
I hadn't thought of that. Makes a lot of sense though because I thought I would have some of the FGC SNPs and there were none.
On your bigy matches click on "Shared Novel Variants" header. When I do that it does show my closest match at the top. Maybe it will turn out to do the same for you.

razyn
02-04-2015, 09:47 PM
In my experience (not under U152, but with Big Y files) if you scroll to the bottom of your list of shared novel variants, and find someone who shares only two or three (usually not just one -- because one could easily be random, and spurious) that will be your closest match, or group of matches with the least-familiar SNPs. Just make sure they don't share a "known" SNP that is known to be in a different haplogroup... as most of the people on your list will.

MitchellSince1893
02-04-2015, 11:55 PM
In my experience (not under U152, but with Big Y files) if you scroll to the bottom of your list of shared novel variants, and find someone who shares only two or three (usually not just one -- because one could easily be random, and spurious) that will be your closest match, or group of matches with the least-familiar SNPs. Just make sure they don't share a "known" SNP that is known to be in a different haplogroup... as most of the people on your list will.

Also look in the non matching SNP column. If your match has a SNP such as Z142, Z49, U152 or P312 listed then it's a false positive (this only applies to Pigmon and other Z142+ samples).

Here are some screenshots of what I was saying before about sorting by "Shared Novel Variants" vs. the default view. The closest and 2nd closest matches shown are my actual phylogenetic matches, i.e. the match with 66 shared novel SNPs shares my terminal SNP, while the two 2nd closest matches are on the next branch up from my terminal SNP branch.
[Attached screenshots: Big Y matches sorted by "Shared Novel Variants" vs. the default view]

So in my particular case, sorting by Shared Novel Variants works to a certain degree (matches 1-6 look good), but as you can see it's not perfect. There are phylogenetic matches that are shown as more distant than false positive ones. Matches 7-12 are false positives while match 13 (CTS7970, CTS8125 SNPs) is valid. That's why you also have to check the "Non-Matching Known SNPs" column to look for mismatches (indicated by the red rectangles).
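
A minimal sketch of the two-step check described above: throw out any match whose "Non-Matching Known SNPs" include a marker upstream of the tester, then sort what is left by shared novel variants. The marker set comes from the post (it only applies to Z142+ samples); the field names and the three sample matches are invented for illustration and are not anyone's real results.

```python
# Upstream markers named in the post; valid only for a Z142+ tester.
UPSTREAM_SNPS = {"Z142", "Z49", "U152", "P312"}


def is_false_positive(non_matching_known_snps):
    """True if the match differs from the tester at any upstream marker."""
    return bool(UPSTREAM_SNPS & set(non_matching_known_snps))


def rank_matches(matches):
    """Drop false positives, then sort by shared novel variants, highest first."""
    kept = [m for m in matches if not is_false_positive(m["non_matching"])]
    return sorted(kept, key=lambda m: m["shared_novel"], reverse=True)


# Invented sample data (hypothetical field names).
matches = [
    {"name": "match B", "shared_novel": 52, "non_matching": ["U152", "L20"]},
    {"name": "match C", "shared_novel": 40, "non_matching": ["PF7214"]},
    {"name": "match A", "shared_novel": 66, "non_matching": []},
]

for m in rank_matches(matches):
    print(m["name"], m["shared_novel"])
```

Match B is dropped even though it shares more novel variants than match C, which is the situation the screenshots illustrate.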

Pigmon
02-05-2015, 02:39 PM
Thanks razyn and Mark,

I see what you are saying. It is kind of a toss up which ones are the closest matches. A lot of my closest matches don't have U152, Z49, or Z142. Some of them are L20.

It looks like everyone on the Non Matching Known SNPs in Shared Novel Variants have PF7214 except me. Sort of strange.

R.Rocca
02-05-2015, 03:54 PM
Thanks razyn and Mark,

I see what you are saying. It is kind of a toss up which ones are the closest matches. A lot of my closest matches don't have U152, Z49, or Z142. Some of them are L20.

It looks like everyone on the Non Matching Known SNPs in Shared Novel Variants have PF7214 except me. Sort of strange.

Curtis, this means that you have a back mutation at PF7214 that no one else has (or at least not anyone within FTDNA Big-Y testers).

Pigmon
02-05-2015, 04:39 PM
Curtis, this means that you have a back mutation at PF7214 that no one else has (or at least not anyone within FTDNA Big-Y testers).

Ok, that makes sense. I hadn't even thought of that.

So that proves I really am an alien! lol

Titus Valerius
02-05-2015, 09:11 PM
Hi all! I just got my Big Y results, but no matches were found! I just requested my BAM file.

razyn
02-05-2015, 11:19 PM
Hi all! I just got my Big Y results, but no matches were found! I just requested my BAM file.

I think the Matches typically don't post until some hours later than the more basic stuff (comparing you against an outdated, fixed list of "Known SNPs" -- rather than against their database of Big Y tested samples). If your display says something like "No data..." the program that looks for it just hasn't finished running, yet. Check later.

It also seems to me that FTDNA's dreaded IT people have in the past 24-36 hours tinkered with the way matching novel variants are displayed, such that I don't get nearly as much out of that display as I formerly did. Has anybody else noticed that problem? What has happened to the list of positions, with decreasing numbers of people matching you at each position listed, until at the bottom there may be only one?

lgmayka
02-05-2015, 11:33 PM
It also seems to me that FTDNA's dreaded IT people have in the past 24-36 hours tinkered with the way matching novel variants are displayed, such that I don't get nearly as much out of that display as I formerly did. Has anybody else noticed that problem? What has happened to the list of positions, with decreasing numbers of people matching you at each position listed, until at the bottom there may be only one?
Yes, they definitely introduced a bug.

jbarry6899
02-06-2015, 01:02 AM
Yes, they definitely introduced a bug.

Yes, I noticed it yesterday and reported it to the helpdesk.

R.Rocca
02-06-2015, 01:20 AM
I think the Matches typically don't post until some hours later than the more basic stuff (comparing you against an outdated, fixed list of "Known SNPs" -- rather than against their database of Big Y tested samples). If your display says something like "No data..." the program that looks for it just hasn't finished running, yet. Check later.

It also seems to me that FTDNA's dreaded IT people have in the past 24-36 hours tinkered with the way matching novel variants are displayed, such that I don't get nearly as much out of that display as I formerly did. Has anybody else noticed that problem? What has happened to the list of positions, with decreasing numbers of people matching you at each position listed, until at the bottom there may be only one?

What Titus meant is that I compared his results and he has no matches. I do not rely on the FTDNA matches.

razyn
02-06-2015, 02:07 AM
What Titus meant is that I compared his results and he has no matches. I do not rely on the FTDNA matches.

OK, if true, but I'm glad I asked the other question, in that it got answers.

Titus said he just got his results, and in fact, the FTDNA "matches" don't show up at first -- whether you happen to look at those, or not.

jbarry6899
02-06-2015, 12:59 PM
Looking at two sets of BigY results, I am comparing my results with those of my closest STR match, same surname, 103/111.

He and I share 93 novel variants and have no differing known SNPs.

There are 7 SNPs for which he has the ancestral value and I have the derived.

There are 7 different SNPs for which I have the ancestral value and he has the derived.

So for purposes of estimating the TMRCA do I use 7 or 14 SNP differences?

What is the current thinking on the frequency of SNP mutations? I've seen estimates ranging from one every three generations to one every five. Doug McDonald has an option on his TMRCA calculator that seems to imply a higher frequency. For example:

If 7 is correct, then the STR TMRCA from the FTDNA Tip Report is roughly the same as the SNP TMRCA from Doug McDonald's Calculator:

STR: 13 generations (50% probability); S. D. 5
SNP: 14.5 generations; S. D. 4.5

Thanks for any insights.

Jim

Pigmon
02-06-2015, 02:05 PM
Curtis, I checked your results and you are still L2 > Z49, Z68 > Z142 > Z150, Z12222, 22867545(A/T). If you can have FTDNA generate your BAM file and place it on Google Docs or Dropbox, that would be great.

Rich, I have downloaded my BAM file and uploaded it to dropbox. I'm new with dropbox - do I need to do anything else?

Curtis

R.Rocca
02-06-2015, 02:16 PM
Looking at two sets of BigY results, I am comparing my results with those of my closest STR match, same surname, 103/111.

He and I share 93 novel variants and have no differing known SNPs.

There are 7 SNPs for which he has the ancestral value and I have the derived.

There are 7 different SNPs for which I have the ancestral value and he has the derived.

So for purposes of estimating the TMRCA do I use 7 or 14 SNP differences?

What is the current thinking on the frequency of SNP mutations? I've seen estimates ranging from one every three generations to one every five. Doug McDonald has an option on his TMRCA calculator that seems to imply a higher frequency. For example:

If 7 is correct, then the STR TMRCA from the FTDNA Tip Report is roughly the same as the SNP TMRCA from Doug McDonald's Calculator:

STR: 13 generations (50% probability); S. D. 5
SNP: 14.5 generations; S. D. 4.5

Thanks for any insights.

Jim

Jim, I don't know if you saw it on the U152 Project results page, but the Barry cluster now has a sub-cluster as well... https://www.familytreedna.com/public/R1b-U152/default.aspx?section=results

R.Rocca
02-06-2015, 02:17 PM
Rich, I have downloaded my BAM file and uploaded it to dropbox. I'm new with dropbox - do I need to do anything else?

Curtis

Yes, you have to "share" the file or folder it is in with my email address.

Pigmon
02-06-2015, 02:28 PM
Yes, you have to "share" the file or folder it is in with my email address.

I just did it so it should be in there.

jbarry6899
02-06-2015, 04:24 PM
Jim, I don't know if you saw it on the U152 Project results page, but the Barry cluster now has a sub-cluster as well... https://www.familytreedna.com/public/R1b-U152/default.aspx?section=results

Thanks, Rich--I had not seen that. I found that subgroup as well in looking at the latest results. Interestingly, that indicates that the sub-cluster with DYS388=12, the modal value, is older than the group with DYS388=11, so that is a backward mutation. Also, looking at the two subgroups in the Barry project, the one with 12 is more geographically dispersed than the group with 11, which tends to be concentrated in Cork in an area associated with one particular family branch.

We have one other test in process from a project member who is DYS388=11 but has a different surname.

Any thoughts on SNP counting conventions to estimate the ages of these sub-clusters?

Jim

R.Rocca
02-09-2015, 02:02 AM
Thanks, Rich--I had not seen that. I found that subgroup as well in looking at the latest results. Interestingly, that indicates that the sub-cluster with DYS388=12, the modal value, is older than the group with DYS388=11, so that is a backward mutation. Also, looking at the two subgroups in the Barry project, the one with 12 is more geographically dispersed than the group with 11, which tends to be concentrated in Cork in an area associated with one particular family branch.

We have one other test in process from a project member who is DYS388=11 but has a different surname.

Any thoughts on SNP counting conventions to estimate the ages of these sub-clusters?

Jim

Jim, after looking at the BAM file of kit no. N53153, the only SNP he does not share with the other two Barry kits is 7823757(G/T). Goes to show how important the BAM files are and how unreliable FTDNA's SNP discovery algorithm can be.

lamahorse
02-13-2015, 03:14 AM
Ah, looking at the updated tree; there is a McCarthy cluster under L2. I suppose that concludes that we're all L2+. Right?

Is there any competent way of guesstimating the timescales of divergence between the various groups?

My McCarthy group is very closely related to Newman/Noonan via Y-111 results. It's quite possible that our McCarthys are the result of a non-paternal event with a Noonan a few hundred years ago.

Is there any way of guessing a statistical date of common ancestry between my McCarthy and the American McCarty using the Big Y? The link between our families is somewhat of an enigma as both paper trails don't shed much light on the link. My family is purely West Cork, tenant farming Catholics back to at least 1807 but these American McCartys are Protestant and landed to their earliest ancestor (b. 1753 or so). They have no link to Ireland as he was born in America.

Recently another McCarthy appeared on the Y-111 who was the same distance to myself as the American McCartys. They were Catholic but through some digging, it turned out that their oldest paternal ancestor was an orphan in the 1860s. Turns out that the man was orphaned by the Civil War where the father was killed in action for the South. Due to his surname, he was fostered in a Catholic Orphanage and the nuns corrected his surname. The father was probably a cousin of the American McCarty branch who lost contact.

History really is quite amazing.

razyn
02-26-2015, 02:41 PM
Yes, they definitely introduced a bug.


Yes, I noticed it yesterday and reported it to the helpdesk.

The Novel Variants matching function is working, this AM. First time since Feb. 4th, I think. Rejoice with them that rejoice... and, knock wood.

jbarry6899
02-26-2015, 02:45 PM
The Novel Variants matching function is working, this AM. First time since Feb. 4th, I think. Rejoice with them that rejoice... and, knock wood.

But unfortunately the fix created another issue. It is now not possible to click on "Haplogroups" and filter by subclade. Reported to helpdesk and on FTDNA forum.

Jim

haleaton
10-29-2015, 12:18 PM
Since August all 22 new FTDNA Big Y "matching" novel variants have my sample sharing L858 & L862 widely across distinct haplogroups. I asked the Help Desk a couple of times and it was forwarded to the FTDNA "Big Y Department" but I never got a reason or a fix. I am negative for L858 & L862, also.

So this week a new single match sharing V1515 & Z842 appeared who is R-L2, actually U152> L2> DF103,FGC4183> FGC4166,FGC4220> BY1020. (FTDNA-99809). (I am DF103-.)

The SNPs V1515 & Z842, for which I am also negative, are now in the pull down menu for "Shared Novel Variants."

Just mentioning this. Maybe others in L2 Big Ys are seeing random weirdness if they ever bother to go back and check the Big Y Matches page.

I don't think just one person can get them to either fix or explain this. :\

kinman
10-29-2015, 03:18 PM
Hi All,
I am waiting for FTDNA to get my BAM file ready. In the meantime, my "Big Y Results" show one exact match (Known SNP difference = 0) and that is with Matt Langley (as we expected).

From the Non-Matching Known SNPs, Matt Langley and I apparently share one SNP (which none of my other matches have). That is M359. Since the M359 mutation has evolved independently in two other (unrelated) Haplogroups, I assume this is what they call a "floating SNP".

Anyway, it is called M359.1 and M359.2 in those other two haplogroups, so I guess our version will be called M359.3 (is that right?). I see that YSEQ.net already has single SNP tests for M359 and M359.2, but wouldn't those two tests be identical (same mutation at the same position)? If so, I assume anyone in our Langley-Maynor-Kinman group could take either one of those two tests and probably test positive.
------------Ken
--------------------------------------------------------------------------------------------------------------------------

kinman
10-30-2015, 12:31 AM
These are probably dumb questions, but here goes:
(1) Do you have to have a dropbox to share BAM files with project administrators, and if so, is a 2GB dropbox big enough?
(2) For many of my Big Y SNP results, it says "Yes" to being derived, but the same letter (base) is listed for the Reference and the Genotype. Wouldn't those be ancestral instead of derived??
--------------Ken

haleaton
10-30-2015, 01:07 AM
These are probably dumb questions, but here goes:
(1) Do you have to have a dropbox to share BAM files with project administrators, and if so, is a 2GB dropbox big enough?
(2) For many of my Big Y SNP results, it says "Yes" to being derived, but the same letter (base) is listed for the Reference and the Genotype. Wouldn't those be ancestral instead of derived??
--------------Ken

1) The last Big Y BAM I downloaded, after they started stripping off the non-Y data, was 1.23 GB. However, FTDNA provides a link which you can e-mail the admins and they can download it directly.

2) I believe large portions of the Y Reference sequence are based on a person who was L2+ L20+ so this is an adjustment of what is positive based on position in the tree. I could be wrong on that and FTDNA could be wrong.

haleaton
10-31-2015, 06:27 AM
Since August all 22 new FTDNA Big Y "matching" novel variants have my sample sharing L858 & L862 widely across distinct haplogroups. I asked the Help Desk a couple of times and it was forwarded to the FTDNA "Big Y Department" but I never got a reason or a fix. I am negative for L858 & L862, also.

So this week a new single match sharing V1515 & Z842 appeared who is R-L2, actually U152> L2> DF103,FGC4183> FGC4166,FGC4220> BY1020. (FTDNA-99809). (I am DF103-.)

The SNPs V1515 & Z842, for which I am also negative, are now in the pull down menu for "Shared Novel Variants."

Just mentioning this. Maybe others in L2 Big Ys are seeing random weirdness if they ever bother to go back and check the Big Y Matches page.

I don't think just one person can get them to either fix or explain this. :\

"Hi there,

Thanks for your email, and I apologize for these bugs- I can see that the L858 and L862 both show in addition to the V1515 and Z842 SNPs. I will report this to our IT department instead of the Big Y department as this seems to be a system error that our Big Y department may not be able to correct."
--FTDNA Help Desk

Who knows maybe they will fix it. They are pretty good about responding quickly, just hard to close out things. [Or maybe they are trolling me . . . ;) ]

kinman
10-31-2015, 10:12 PM
My BAM file was ready this morning. But it was only half that size ( about 600 MB ). Took over an hour to download. Anyway, I sent the link to Richard.
--------------Ken
P.S. My computer (Windows 7) said it couldn't open the zip BAM file. I hesitate downloading a program to decompress it, because I hear that some of the free programs come with a lot of junk and problems. Anybody have any suggestions?
----------------------------------------------------------------------------------------------------------------------------------------------------------------


haleaton
10-31-2015, 11:46 PM
My BAM file was ready this morning. But it was only half that size ( about 600 MB ). Took over an hour to download. Anyway, I sent the link to Richard.
--------------Ken
P.S. My computer (Windows 7) said it couldn't open the zip BAM file. I hesitate downloading a program to decompress it, because I hear that some of the free programs come with a lot of junk and problems. Anybody have any suggestions?
----------------------------------------------------------------------------------------------------------------------------------------------------------------

I personally had no issues with http://www.7-zip.org/, though one can never guarantee your experience will be the same at a different date.

Pigmon
11-01-2015, 09:53 PM
Can anyone tell me - is the SNP DF90 part of the Big Y test?

Bolgeris
11-01-2015, 10:01 PM
yes..
I am L20+ and
DF90-
Position data: 93T

Pigmon
11-02-2015, 02:31 PM
My test date was 2/2/15 so I suppose since the DF90 does not show up as a positive SNP for me that it is negative.

Do you know when this SNP was added to the Big Y?

MitchellSince1893
11-02-2015, 03:35 PM
My test date was 2/2/15 so I suppose since the DF90 does not show up as a positive SNP for me that it is negative.

Do you know when this SNP was added to the Big Y?

Another option is that there were too few reads, or too many conflicting reads. i.e. you were 75% negative and the threshold is set at 90%. Richard Rocca could tell you the answer, or if you have your data loaded on Yfull you could look it up yourself.
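
A rough sketch of the thresholding idea in the paragraph above. The 90% figure is the example from the post; the minimum read count and the function itself are assumptions for illustration, not FTDNA's actual calling rules.

```python
def call_snp(derived_reads, ancestral_reads, min_reads=2, threshold=0.90):
    """Classify one SNP position from its read counts.

    min_reads and threshold are hypothetical defaults, not FTDNA's settings.
    """
    total = derived_reads + ancestral_reads
    if total < min_reads:
        return "no call"      # too few reads to say anything
    if derived_reads / total >= threshold:
        return "positive"
    if ancestral_reads / total >= threshold:
        return "negative"
    return "no call"          # too many conflicting reads


print(call_snp(1, 3))   # 75% negative, below the 90% threshold
print(call_snp(0, 10))  # every read ancestral
print(call_snp(0, 1))   # a single read only
```

Under these assumptions, a SNP that is missing from the positive list can be a no call and not a true negative.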

R.Rocca
11-02-2015, 03:45 PM
My test date was 2/2/15 so I suppose since the DF90 does not show up as a positive SNP for me that it is negative.

Do you know when this SNP was added to the Big Y?

DF90 is on a separate branch from your Z49 branch, so you are definitely negative.

kinman
11-05-2015, 02:17 AM
Hi All,
I can't figure out if my Big Y results included SNP L389 or its equivalents (PF6531, L388, or PF6468). Could it be listed under the "novel variants"? I couldn't find any of these SNPs listed on ISOGG, so I don't know the position number of any of them. SNP L389 is so far upstream, I assume that I am positive for it, but I still would like to verify it. I've found all the other upstream SNPs as testing positive, but nothing on L389 or its equivalents.
----------------------Ken
---------------------------------------------------------------------------------------------------------------

haleaton
11-05-2015, 03:23 AM
Hi All,
I can't figure out if my Big Y results included SNP L389 or its equivalents (PF6531, L388, or PF6468). Could it be listed under the "novel variants"? I couldn't find any of these SNPs listed on ISOGG, so I don't know the position number of any of them. SNP L389 is so far upstream, I assume that I am positive for it, but I still would like to verify it. I've found all the other upstream SNPs as testing positive, but nothing on L389 or its equivalents.
----------------------Ken
---------------------------------------------------------------------------------------------------------------

L389 & PF6531 are just different names at same b37 location (ybrowse.org).
L388 & PF6468 are just different names at same b37 location (ybrowse.org).

All four SNPs are covered by Big Y, as seen by looking in YFull's U152 Group. All the samples, except for a few ?, were positive for all four.

In my Big Y - Results under Known SNPs, I was positive for L388 and ? for L389.

Since YFull had me positive for both, it could be that for me L389 was excluded by their BED file, but could be found in the Big Y BAM file.

You can check your BED & VCF files to see if it was called or was a no call, but examining the BAM itself can be done through YFull, FGC for a fee, or using Samtools yourself.

YFull has L388 & L389 in parallel directly under R1b1, with a subclade R-927 which should be on their public tree.
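
A small sketch of the BED check mentioned above, assuming the BED file follows the usual convention of 0-based, half-open intervals while SNP positions (as in a VCF) are 1-based. The function names and the coordinates in the example are made up; they are not the real L388/L389 positions.

```python
def parse_bed(lines):
    """Read (chrom, start, end) from BED lines, skipping headers and blanks."""
    regions = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def is_covered(regions, chrom, pos):
    """pos is 1-based; a BED interval [start, end) covers it when start < pos <= end."""
    return any(c == chrom and start < pos <= end for c, start, end in regions)


# Invented intervals for illustration only.
bed_lines = ["chrY\t100\t200", "chrY\t500\t650"]
regions = parse_bed(bed_lines)
print(is_covered(regions, "chrY", 150))  # inside the first interval
print(is_covered(regions, "chrY", 300))  # between intervals
```

A position that falls outside every interval would show up as "?" in the results even if reads for it exist in the BAM.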

kinman
11-05-2015, 02:50 PM
Thanks,
I finally found L388 and L389 and I am positive for both (as expected). I didn't find them on the CSV file that I had downloaded (which only shows 5,000 entries). However, I did find them in a more complete CSV file which I had just copied and pasted (which showed over 36,000 entries). Not sure why that downloaded CSV file was incomplete. But the "cut and paste" got the whole thing.
Anyway, I look forward to learning how to use Samtools when I have time. Thanks for that tip as well.
-------------------Ken
--------------------------------------------------------------------------------------------------------------------------------------------------------------------


haleaton
11-05-2015, 11:44 PM
"Hi there,

Thanks for your email, and I apologize for these bugs- I can see that the L858 and L862 both show in addition to the V1515 and Z842 SNPs. I will report this to our IT department instead of the Big Y department as this seems to be a system error that our Big Y department may not be able to correct."
--FTDNA Help Desk

Who knows maybe they will fix it. They are pretty good about responding quickly, just hard to close out things. [Or maybe they are trolling me . . . ;) ]

When looking at my Big Y to see whether the IT department had fixed things the Big Y group could not, I did notice that two shared novel SNPs had been given names, though both are actually useless SNPs:

S4063 aka rs370504741, shared with 200 out of 202 Big Y matches and also widely shared across multiple distant haplogroups.

BY181 aka FGC22094 aka ZS4363, same location as rs201870012 (TA>T), shared with 2 out of 202 Big Y matches, one of which is in my L2 > FGC5301 subclade and the other is L2 > L20. Also found in public data sets in HG00190 (I-Z74), HG01051 (T-L131), HG01530 (T-L131), HG02259 (Q-M848), HG03115 (E-Z5946), NA18632 (O-CTS52), NA20509 (R-Z56), NA20758 (T-L131), & NA19189 (E1a2a1a1)

The last one was a better trolling of me by FTDNA as BY181 has been found in U152, two in L2, and one in my own subclade.

On the other hand Big Y did call out what is now my YFull L2 Subclade A197 as medium quality positive, and today I found out it has been successfully Sanger sequenced and validated by YSEQ in two samples.

Formerly I was advised:

Thomas Krahn:
For your understanding A197 and A7393 are both in the centromeric GGAAT
repeat region. Of course we can design primers for it and maybe we can
get a readable sequence. However this SNPs will be meaningless for your
research.

You may have heard that the centromere is a highly repetitive region
that develops in a permanent flow of self-recombination. Mutations come
and go because they are deleted through LOH between the large scale
repetitive elements. The reference sequence only represents a small
fraction of the repeats that exist in real Y chromosomes. It just can't
get sequenced with any sequencing technology, especially not with NGS
short read sequencing. The reads just align on the A197 region because
the actual regions where they come from are missing in the reference
sequence.
We cannot give any warranty that this sequencing will be successful.

We'll still order this pair of primers, but don't expect too much...
***

Now that is masterful trolling . . . ;)

haleaton
11-06-2015, 07:03 PM
Thomas had some additional wise comments on the two A197+ results yesterday:

"You can't really see how meaningful a SNP is from two results. The SNP can mutate back in a few more generations and you can't tell if a person is related or not. I don't mind if you order this SNP for additional people in the future for your research, but you should have this in mind if you evaluate the results.

I also noticed that there is another mutation one base to the left (A7393). I'm not sure if we have discussed this before.
However this is another indication that this base change has more likely been a product of a recombination event (rather than a spontaneous mutation)."

I was considering A197 & A7393 to be an MNP. The two matching result samples differed by at least about 30 generations, based on YFull age estimates.

Shamash
11-06-2015, 09:33 PM
Hi guys, today I received my grandfather's BigY results. I don't know if the match algorithm is any better than a year ago. Back then it was simply misleading. Anyway my two top matches are a certain Mitchell and Lincoln. Will try to roam through the data when I'm back home! :) Marco

Edit1: seems like we are FGC12378+! :)

Edit2: we are U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.> FGC12383>19436291,21290212,22476861 and belong to the Mitchell, Roberts subclade! Vittoria! :)

Shamash
11-06-2015, 10:10 PM
I wonder how many other SNPs I share with Mitchell and Roberts and in which timeframe we are connected and via which population?

MitchellSince1893
11-07-2015, 07:20 AM
Hi guys, today I received my grandfather's BigY results. I don't know if the match algorithm is any better than a year ago. Back then it was simply misleading. Anyway my two top matches are a certain Mitchell and Lincoln. Will try to roam through the data when I'm back home! :) Marco

Edit1: seems like we are FGC12378+! :)

Edit2: we are U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.> FGC12383>19436291,21290212,22476861 and belong to the Mitchell, Roberts subclade! Vittoria! :)

I predicted you would be positive for FGC12383, http://www.anthrogenica.com/showthread.php?2924-Two-branches-of-Z142-Z150-Z12222-and-Z51&p=113554&viewfull=1#post113554

I'm going to go out on a limb and predict you are positive for FGC12383, which is one branch below FGC12378. Currently there is a Spaniard (HG01777) and 2 Dutch (GoNLx2) members of this branch.


But it's even closer than that!

Welcome to our little branch :)

It's a good genetic genealogy day when you get a new member on your terminal branch. :beerchug:

MitchellSince1893
11-07-2015, 07:29 AM
I wonder how many other SNPs I share with Mitchell and Roberts and in which timeframe we are connected and via which population?

According to Yfull's dates it's somewhere between 2400 and 3500 years ago, with 3000 years ago being the current estimate.
http://www.yfull.com/tree/R-Y9080/

I'm YF01489 and Roberts is YF02596


BRANCH ID / SAMPLE ID | NUMBER OF SNPS | COVERAGE (BP) | FORMULA TO CORRECT SNPS NUMBER | CORRECTED NUMBER OF SNPS | FORMULA TO ESTIMATE AGE | AGE BY THIS LINE ONLY
YF01489 | 14.0 | 7269144 | 14.0/7269144*8467165 | 16.31 | 16.31*144.41+60 | 2415
YF02596 | 22.0 | 7746819 | 22.0/7746819*8467165 | 24.05 | 24.05*144.41+60 | 3532
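
For anyone who wants to redo the arithmetic in the table above, here is a minimal Python sketch. The three constants (8467165, 144.41 and 60) are copied from the formula columns; the function names are made up for illustration, and the simple average at the end is only an assumption about how the per-line ages might be combined, not a description of how YFull actually does it.

```python
# Sketch of the per-line age arithmetic from the table above.
# Constants are taken from the table's formula columns; all names are illustrative.

REFERENCE_COVERAGE_BP = 8467165  # length the SNP count is normalised to
YEARS_PER_SNP = 144.41           # multiplier in the age formula
AGE_OFFSET_YEARS = 60            # constant added in the age formula


def corrected_snp_count(snps, coverage_bp):
    """Scale the observed SNP count up to the reference coverage."""
    return snps / coverage_bp * REFERENCE_COVERAGE_BP


def age_by_line(snps, coverage_bp):
    """Age estimate for one line, using the unrounded corrected count."""
    return corrected_snp_count(snps, coverage_bp) * YEARS_PER_SNP + AGE_OFFSET_YEARS


# (number of SNPs, coverage in bp) for the two kits in the table
samples = {"YF01489": (14.0, 7269144), "YF02596": (22.0, 7746819)}

ages = {sample_id: age_by_line(*values) for sample_id, values in samples.items()}
for sample_id, age in ages.items():
    print(sample_id, round(age))

# Assumption: a plain mean of the two lines, just to compare with the quoted estimate.
mean_age = sum(ages.values()) / len(ages)
print("mean of both lines:", round(mean_age))
```

The two per-line values reproduce the last column of the table, and the plain mean lands a little under 3000 years, which is in the same range as the estimate quoted above.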

Based on a ~1000 BC MRCA, I would guess our shared group might be a Celtic tribe on the N. side of the Alps, with your line entering Northern Italy around 400 BC, and Roberts' and my lines going north and eventually ending up in Britain.

Or it could have been related to the Romans, with Roberts and my lines entering Britain during the Roman occupation.

Of the two I would lean towards the former scenario.

If you send your data to yfull for analysis it will further refine this date.

Shamash
11-07-2015, 08:15 AM
Very nice results! I checked the Yfull tree and could confirm that we really sit on the same terminal branch of the tree as we are also FGC12401+/FGC12402+/FGC12403+/FGC12404+/FGC12405+

Will send the bam file to Yfull!

MitchellSince1893
11-07-2015, 08:22 AM
Very nice results! I checked the Yfull tree and could confirm that we really sit on the same terminal branch of the tree as we are also FGC12401+/FGC12402+/FGC12403+/FGC12404+/FGC12405+

Will send the bam file to Yfull!

Do you know yet if you share any more SNPs with Roberts or me....beyond what you listed above?

If so it would create a new terminal branch with you and Robert or you and me on this new branch.

Here is a list of most of my novel SNPs



Name Position
FGC12384 7547230
FGC12385 7881772
FGC12386 8173636
FGC12387 8194117
FGC12388 8290583
FGC12389 9795464
FGC12390 9862775
FGC12391 10024162
FGC12392 13713820
FGC12393 14497774
FGC12394 15129680
FGC12395 15607220
FGC12396 15878557
FGC12397 15971853
FGC12398 16185253
FGC12399 17003970
FGC12400 17299391
FGC12406 23969601
FGC12408 7900146
13871171
20408815
22318347

Shamash
11-07-2015, 08:28 AM
If you could send me a list of the novel variants I could check that out. From my account I can only see that we share 66 of 119 novel variants. In order to check I will need the SNP list...

MitchellSince1893
11-07-2015, 08:38 AM
If you could send me a list with the novel list variants I could check that out. From my account I can only see that we share 66 of 119 novel variants. In order to check I will need the SNP list...

See post above yours. I posted my FGC and Yfull Novel SNP lists.

I've looked at the BigY matches data. You, me and Roberts share 66 SNPs, but most of these are above U152 so they aren't helpful

Shamash
11-07-2015, 01:25 PM
Do you know yet if you share any more SNPs with Roberts or me....beyond what you listed above?

If so it would create a new terminal branch with you and Robert or you and me on this new branch.

Here is a list of most of my novel SNPs

Hi, I checked them all and am negative for all the above SNPs! What a pity...

MitchellSince1893
11-07-2015, 03:46 PM
Hi, I checked them all and am negative for all the above SNPs! What a pity...

There is a 4th individual that shares FGC12401 with us. Currently he can only trace his ancestry back to the early 1800s in North Carolina. His ancestry is probably English or possibly German (there were a lot of German immigrants to Orange County, NC back then).

Based on our STR matches he agreed to take a test at Yseq.net to confirm he was positive for FGC12401 and just submitted his FGC Y Elite 2.0 sample. It was received 29 Oct.

haleaton
11-10-2015, 04:05 AM
When looking at my Big Y to see whether the IT department had fixed things the Big Y group could not, I noticed two shared novel SNPs had been given names, though both are actually useless SNPs:

S4063 aka rs370504741: shared with 200 out of 202 Big Y matches and also widely shared across multiple distant haplogroups.

BY181 aka FGC22094 aka ZS4363, same location as rs201870012 (TA>T): shared with 2 out of 202 Big Y matches, one of which is in my L2 > FGC5301 subclade and the other is L2 > L20. Also found in public data sets: HG00190 (I-Z74), HG01051 (T-L131), HG01530 (T-L131), HG02259 (Q-M848), HG03115 (E-Z5946), NA18632 (O-CTS52), NA20509 (R-Z56), NA20758 (T-L131), & NA19189 (E1a2a1a1)

The last one was a better trolling of me by FTDNA as BY181 has been found in U152, two in L2, and one in my own subclade.


FTDNA still has not fixed anything but just added Z842 as a matching Novel Variant with a single L2 person. Of course I am negative for Z842.

Every time I flag a bug with the Help Desk a week or so later a new bug appears . . .

kinman
11-11-2015, 12:33 AM
Hi All,
I am sort of in a state of shock, uncertain what my newest Big Y match might mean. I was not particularly surprised when Matt Langley was my closest Big Y match ( 0 known SNP difference and 87 shared novel variants ).
However, another Langley has just gotten Big Y results back today, and not only with the expected 0 SNP difference, but also an incredible 105 shared novel variants. That is 105 novel variants shared out of 117.
Now I am wondering if that definitely means that I am more closely related to this Langley than I am to Matt Langley.
--------------Ken

paulgill
11-11-2015, 12:47 AM
Hi All,
I am sort of in a state of shock, uncertain what my newest Big Y match might mean. I was not particularly surprised when Matt Langley was my closest Big Y match ( 0 known SNP difference and 87 shared novel variants ).
However, another Langley has just gotten Big Y results back today, and not only with the expected 0 SNP difference, but also an incredible 105 shared novel variants. That is 105 novel variants shared out of 117.
Now I am wondering if that definitely means that I am more closely related to this Langley than I am to Matt Langley.
--------------Ken

FTDNA BIG Y results mean absolutely nothing. You need an expert to look at both, or all three, files: FullGenomes https://www.fullgenomes.com/purchases/11/, YFull https://www.facebook.com/groups/yfullcom/, and/or your FTDNA Project Administrator. Only then will you get the true picture of the relationship between these kits.

MattL
11-11-2015, 01:22 AM
Hi All,
I am sort of in a state of shock, uncertain what my newest Big Y match might mean. I was not particularly surprised when Matt Langley was my closest Big Y match ( 0 known SNP difference and 87 shared novel variants ).
However, another Langley has just gotten Big Y results back today, and not only with the expected 0 SNP difference, but also an incredible 105 shared novel variants. That is 105 novel variants shared out of 117.
Now I am wondering if that definitely means that I am more closely related to this Langley than I am to Matt Langley.
--------------Ken

Just saw them pop up on mine too... They were the one with their STR (111 marker) kit #31971, furthest ancestor a William C Langley. At 37 markers he's a GD of 2 from me.

Will be very interesting to see how all of our results compare on YFull, maybe Richard can offer some insights.

kinman
11-11-2015, 01:55 AM
Hi Matt,
The weird thing is that I am a closer match to your 37 markers than I am to his. And I already found one novel variant that I share with you but not with him. So maybe I'll be equally related to you both after all. I'm sure Richard understands such results better than any of us.
Anyway, I am curious how many novel variants you had, and how many of those you share with William C. Langley.
----------------Ken

--------------------------------------------------------------------------------------------------------------------------------------------------


Just saw them pop up on mine too... They were the one with their STR (111 marker) kit #31971, furthest ancestor a William C Langley. At 37 markers he's a GD of 2 from me.

Will be very interesting to see how all of our results compare on YFull, maybe Richard can offer some insights.

Kwheaton
11-11-2015, 02:43 AM
I have spent the last 24 hours analyzing a new Big Y in my project and comparing it with other Big Y's and although I am happy overall..... the craziness of the reporting is amazing.

19 novel SNPs were not novel at all: the ancestral and derived were the SAME. You really do need to send these off for analysis or go through the matches with a fine-tooth comb. Many of these novel SNPs have been named and found in other samples. Some are upstream of L2, so they are not of much value in defining the lower parts of the branches... It's no wonder people are confused....

The match lists are very misleading as well. As an admin I was able to download the CSV files of 3 Big Y's, take the novel SNPs, organize them by position, and then compare across the 3 files, plus compare with a Y Elite.
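
A rough Python sketch of that kind of position-by-position comparison is below. It assumes each export has the columns Position, Reference, Genotype, Confidence (the header that appears with the private SNP lists posted in this thread); the real FTDNA export may be laid out differently. The two kits are invented toy rows with made-up positions, not anyone's actual results, and the function names are hypothetical.

```python
import csv
import io


def load_variants(csv_text):
    """Return {position: (reference, genotype)} for one kit.

    Rows where the genotype equals the reference are dropped, since a
    "novel variant" whose ancestral and derived alleles match is not a variant.
    """
    variants = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        ref, genotype = row["Reference"], row["Genotype"]
        if ref == genotype:
            continue  # not actually a variant
        variants[int(row["Position"])] = (ref, genotype)
    return variants


def shared_positions(*kits):
    """Positions where every kit carries the same reference/genotype pair."""
    common = set(kits[0].items())
    for kit in kits[1:]:
        common &= set(kit.items())
    return sorted(position for position, _ in common)


# Invented toy data (hypothetical positions), only to show the mechanics.
kit_a = """Position,Reference,Genotype,Confidence
1001,G,A,High
1002,T,C,High
1003,C,C,High
"""
kit_b = """Position,Reference,Genotype,Confidence
1002,T,C,High
1004,A,G,High
"""

variants_a = load_variants(kit_a)
variants_b = load_variants(kit_b)
shared = shared_positions(variants_a, variants_b)
print(shared)
```

With real files you would read each CSV from disk instead of from a string, and the shared list still needs checking against a tree, because many shared positions will turn out to sit far upstream.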

If you are planning on sending to YFULL or FGC please check to see which has other samples you may match. Our branch has no files at YFULL so FGC is more helpful for us. Your mileage may vary.

MattL
11-11-2015, 09:19 AM
Hi Matt,
The weird thing is that I am a closer match to your 37 markers than I am to his. And I already found one novel variant that I share with you but not with him. So maybe I'll be equally related to you both after all. I'm sure Richard understands such results better than any of us.
Anyway, I am curious how many novel variants you had, and how many of those you share with William C. Langley.
----------------Ken

--------------------------------------------------------------------------------------------------------------------------------------------------

I have 93 shared novel variants with him (87 with you).

paulgill
11-11-2015, 10:00 AM
I have 93 shared novel variants with him (87 with you).

Really?

kinman
11-11-2015, 02:48 PM
I think paulgill is probably right that the numbers on the Big Y Matches page can be misleading. My sharing 87 novel variants with you and 105 with William C. Langley probably does not indicate I am any closer to William.
However, one thing that their Matches page does clearly show is that the three of us are the only ones (tested thus far) who have the M359.3 mutation. Once one of our Maynor relatives gets a Big Y test, it will no doubt show that the Maynor relatives have it as well.
Anyway, if the three of us share other SNPs (other than M359.3), that information is buried somewhere in those massive BAM files, and I haven't yet had time to delve into that (trying to find a few "needles in a haystack"). I suspect Richard will find them before I do.
----------------Ken
------------------------------------------------------------------------------------------------------------------------------------------------


I have 93 shared novel variants with him (87 with you).

Shamash
11-11-2015, 07:49 PM
Do you know yet if you share any more SNPs with Roberts or me....beyond what you listed above?

If so it would create a new terminal branch with you and Robert or you and me on this new branch.

Here is a list of most of my novel SNPs

These are my 22 private SNPs:

Position,Reference,Genotype,Confidence
2742106,G,A,High
6737738,G,A,High
6942372,G,A,High
6992879,T,C,High
13992494,A,T,High
14095536,T,G,High
14403973,T,C,High
16387379,A,G,High
16412011,C,G,High
16554909,A,T,High
17050243,C,T,High
17329844,T,C,High
18066518,G,T,High
18198976,A,T,High
18656568,A,G,High
22045415,T,G,High
22445399,G,C,High
22477902,T,G,High
22477925,T,G,High
22478089,G,T,High
22478105,G,C,High
25533979,A,G,High

MattL
11-11-2015, 08:13 PM
I think paulgill is probably right that the numbers on the Big Y Matches page can be misleading. My sharing 87 novel variants with you and 105 with William C. Langley probably does not indicate I am any closer to William.
However, one thing that their Matches page does clearly show is that the three of us are the only ones (tested thus far) who have M359.3 mutation. Once one of our Maynor relatives get a Big Y test, it will no doubt show that the Maynor relatives have it as well.
Anyway, if the three of us share other SNPs (other than M359.3), that information is buried somewhere in those massive BAM files, and I haven't yet had time to delve into that (trying to find a few "needles in a haystack"). I suspect Richard will find them before I do.
----------------Ken
------------------------------------------------------------------------------------------------------------------------------------------------

Yeah agreed... I've been meaning to run some of the tools and crack open my BAM file and dig into it and understand it a bit more (I'm a Software Engineer so it intrigues me from that perspective), just haven't had time yet.

I'm betting the same thing, Richard will probably figure out any shared downstream SNPs long before we would.

Shamash
11-11-2015, 08:13 PM
I seem to have a brand-new BigY match as of today: Humberto J. Vidaillet

His results rolled in today, but I can't verify whether he is any closer than Edward Mitchell.

I share 85 out of 119 novel variants with him (on the other hand I share 66 with Mitchell and we are on the same branch!)

Richard, could you check, when you get a chance, which SNPs I share with Vidaillet? What a pity I can't do it myself...

EDIT1: It's possible to compare by clicking on the shared novel variants number. Vidaillet is FGC12378+ and FGC12383+

EDIT2: Vidaillet is negative for these three subclade-defining SNPs: 19436291,21290212,22476861 which Mitchell, Roberts and I share. He sits on another branch below FGC12378 & FGC12383
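
The placement logic in the two edits above can be written down as a small Python sketch. The path is the one quoted earlier in this thread (U152 down to the three unnamed positions); the function name is hypothetical, and untested markers are simply skipped, which is an assumption rather than how any of the testing companies actually place kits.

```python
# Path from the thread: each tuple is one level of equivalent SNPs, root first.
PATH = [
    ("U152",),
    ("L2",),
    ("Z49", "Z68"),
    ("Z142",),
    ("Z150", "Z12222", "Z26720"),
    ("FGC12378",),
    ("FGC12383",),
    ("19436291", "21290212", "22476861"),
]


def deepest_level(calls):
    """Return the deepest level on PATH supported by the kit's calls.

    `calls` maps a marker name to True (positive) or False (negative).
    Markers missing from `calls` are treated as untested and skipped.
    The walk stops at the first level with any negative call.
    """
    terminal = None
    for level in PATH:
        tested = [calls[marker] for marker in level if marker in calls]
        if tested and not all(tested):
            break
        if tested:
            terminal = level
    return terminal


# Calls as reported in the post above (upstream markers left untested here).
vidaillet = {
    "FGC12378": True,
    "FGC12383": True,
    "19436291": False,
    "21290212": False,
    "22476861": False,
}
# A kit positive for the three positions, like the Mitchell/Roberts branch.
same_branch_kit = {"FGC12383": True, "19436291": True, "21290212": True, "22476861": True}

print(deepest_level(vidaillet))
print(deepest_level(same_branch_kit))
```

This only tells you where a kit stops on one known path; finding a separate branch below FGC12383 still means comparing the kit's remaining novel variants with other kits.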

MitchellSince1893
11-11-2015, 08:23 PM
These are my 22 private SNPs:

Position,Reference,Genotype,Confidence
2742106,G,A,High
6737738,G,A,High
6942372,G,A,High
6992879,T,C,High
13992494,A,T,High
14095536,T,G,High
14403973,T,C,High
16387379,A,G,High
16412011,C,G,High
16554909,A,T,High
17050243,C,T,High
17329844,T,C,High
18066518,G,T,High
18198976,A,T,High
18656568,A,G,High
22045415,T,G,High
22445399,G,C,High
22477902,T,G,High
22477925,T,G,High
22478089,G,T,High
22478105,G,C,High
25533979,A,G,High

I'm negative for all of them except for the last one. On 25533979 it's undetermined as I'm 59% G (same as you) and 41% A (the reference allele)


ChrY position: 25533979 (+strand)
Reads: 153
Position data: 63A 90G
Weight for A: 0.410824108241
Weight for G: 0.589175891759
Probability of error: 0.126113755562 (0<->1)
Sample allele: R (A or G)
Reference (hg19) allele: A
Reference sequence (100bp): AATCTATTATGACCTAGCCTCTGCCTGCTTTCTTAAAATCACCTATCCTC
A
TTCTCTCTCTCTCTTCCAACTTTCCTTTTTTTGTTGTGGGCTTCATCTTT

So we may indeed share this one.
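
As a rough check on the numbers above, here is a small Python sketch that turns raw read counts into allele fractions and an IUPAC code. It uses plain read counts only, so it gives 63/153 and 90/153 rather than the exact weights in the report; the small difference is presumably because the report weights reads by quality, but that is a guess. The 0.9 threshold and the function name are arbitrary choices for illustration.

```python
# IUPAC codes for two-base ambiguity (R = A or G, as in the report above).
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def call_allele(counts, min_fraction=0.9):
    """Return (call, fractions) from raw read counts per base.

    If the most common base reaches `min_fraction` of the reads it is
    returned as the call; otherwise the two most common bases are
    reported as an IUPAC ambiguity code. The threshold is hypothetical.
    """
    total = sum(counts.values())
    fractions = {base: n / total for base, n in counts.items()}
    ranked = sorted(fractions, key=fractions.get, reverse=True)
    if fractions[ranked[0]] >= min_fraction:
        return ranked[0], fractions
    return IUPAC_TWO_BASE[frozenset(ranked[:2])], fractions


# Read counts at ChrY 25533979 from the report above: 63 A and 90 G.
call, fractions = call_allele({"A": 63, "G": 90})
print(call, {base: round(f, 4) for base, f in fractions.items()})
```

A split like this is why the position is reported as undetermined: neither allele is close to being the only one seen in the reads.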

MitchellSince1893
11-11-2015, 08:39 PM
I seem to have a brandnew BigY match per today: Humberto J. Vidaillet

His results have rolled in today but I can't verify if he is any closer as Edward Mitchell?

I share 85 out of 119 novel variants SNPs with him (on the other hand I share 66 with Mitchell and we are at the same branch!)

Richard, could you check occasionally which SNPs I share with Vidaillet??? What a pity I can't do it myself...
Until we hear from Richard...

On BigY matches screen all my matches (Brace, Lincoln, Smith) at:
U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al have 60-61 matches with me.


Both of my matches (you and Roberts) at:
U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.> FGC12383>FGC12401 et al have 66 matches

Humberto J. Vidaillet has 63 matches with me so in between the two. Without knowing your 85 matches with Vidaillet I would have thought he would have been on FGC12383.

However, with that many additional shared SNPs it's most likely he's on a newly discovered branch with you below FGC12401 et al.

Shamash
11-12-2015, 10:19 AM
Until we hear from Richard...

On BigY matches screen all my matches (Brace, Lincoln, Smith) at:
U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al have 60-61 matches with me.


Both of my matches (you and Roberts) at:
U152> L2> Z49,Z68> Z142> Z150,Z12222,Z26720> FGC12378 et al.> FGC12383>FGC12401 et al have 66 matches

Humberto J. Vidaillet has 63 matches with me so in between the two. Without knowing your 85 matches with Vidaillet I would have thought he would have been on FGC12383.

However, with that many additional shared SNPs it's most likely he's on a newly discovered branch with you below FGC12401 et al.

EDIT1: It's possible to compare by clicking on the shared novel variants number. Vidaillet is FGC12378+ and FGC12383+

EDIT2: Vidaillet is negative for these three subclade-defining SNPs: 19436291,21290212,22476861 which Mitchell, Roberts and I share. He sits on another branch below FGC12378 & FGC12383

MitchellSince1893
11-12-2015, 01:18 PM
EDIT1: Its possible to compare clicking on the shared novel variants number. Vidaillet is FGC12378+ and FGC12383+

EDIT2: Vidaillet is negative for these three subclade-defining SNPs: 19436291,21290212,22476861 which Mitchell, Roberts and I share. He sits on another branch below FGC12378 & FGC12383

Does he share this branch with Brace and Lincoln?....hard to check this on my phone.

EDIT: Never mind, if he's positive for FGC12383 then he shouldn't be on a branch with Lincoln and Brace, but he may be on the same branch as 1KG-HG01777 or one of the GoNLx2 on FGC12383 (see Richard Rocca's tree)

Shamash
11-12-2015, 05:11 PM
Does he share this branch with Brace and Lincoln?....hard to check this on my phone.

EDIT: Nevermind, If he's positive for FGC12383 then he shouldn't be on a branch with Lincoln and Brace, but he may be on the same branch as 1KG-HG01777 or one of the GoNLx2 on FGC12383 (see Richard Rocca's tree)

As far as I remember Lincoln is FGC12378+ but FGC12383- while Vidaillet is FGC12378+ and FGC12383+

A third migration scenario is possible for my Italian FGC12378/FGC12383 line: French ancestors... after all the Normans were from Northern France and also the Anjou brought hundreds of French families to Southern Italy in the 13th century...

MitchellSince1893
11-12-2015, 06:31 PM
As far as I remember Lincoln is FGC12378+ but FGC12383- while Vidaillet is FGC12378+ and FGC12383+

A third migration scenario is possible for my Italian FGC12378/FGC12383 line: French ancestors... after all the Normans were from Northern France and also the Anjou brought hundreds of French families to Southern Italy in the 13th century...

Funny thing is I'm not shown as positive for FGC12383, position 7733015, on FTDNA's BigY results page, i.e., it shows up for Vidaillet but not me :\

Another example of why you need a third party i.e. Richard Rocca, FGC, and/or Yfull to look at your results.

kinman
11-12-2015, 11:50 PM
Hi All,
While I was looking through my Big Y results (CSV file), I came across some positive results close together that just didn't look quite right for some reason (P97, L1060, L1061, L1062). I looked them up on ISOGG, and instead of being in Haplogroup R, they were all instead Haplogroup BT.
So is it a coincidence that I'm finding a bunch of floating SNPs that I share with Haplogroup BT? Or could they be false positives? Or is something else weird going on?
I'm not sure what to think.
----------------Ken

MitchellSince1893
11-13-2015, 12:28 AM
Hi All,
While I was looking through my Big Y results (CSV file), I came across some positive results close together that just didn't look quite right for some reason (P97, L1060, L1061, L1062). I looked them up on ISOGG, and instead of being in Haplogroup R, they were all instead Haplogroup BT.
So is this coincidence that I'm finding a bunch of floating SNPs that I share with Haplogroup BT? Or could they be false positives? Or something else weird going on?
I'm not sure what to think.
----------------Ken

You share them because you are a member of Hg BT http://www.yfull.com/tree/BT/

Scroll to the bottom of the link and you will see R1b

kinman
11-13-2015, 01:35 AM
Thanks,
I'm now banging my palm against my forehead for not checking the YFull tree in the first place. I've never really paid much attention to the base of the Haplotree. I can understand why they would name Haplogroups like DE and CF. However, BT and CT are peculiar names. Oh well, live and learn.
--------------Ken
----------------------------------------------------------------------------------------------------


You share them because you are a member of Hg BT http://www.yfull.com/tree/BT/

Scroll to the bottom of the link and you will see R1b