View Full Version : DF27 Phylogeny from FGC



rgd
01-13-2015, 09:46 PM
Is FGC at all concerned about the way YFull is sticking their new names on FGC-named SNPs? They are using these names in their tree without any indication that the SNPs were previously identified and named by FGC. See YFull's tree for R-Z2571, where the SNPs under R-Y8397 are all SNPs previously named by FGC, but there is no way to identify them as such in the YFull tree.

Is YFull just trying to fill a vacuum or are they trying to subvert normal naming protocols and take that over?

lgmayka
01-13-2015, 09:55 PM
Is FGC at all concerned about the way YFull is sticking their new names on FGC named SNPs?
Does FGC check YFull's database before assigning names?

Actually, though, there is indeed a fundamental philosophical difference in naming practice. FGC follows the old publish-or-perish academic rule of naming every single arguable private SNP, in order to claim discovery. The disadvantage is that in our new era of full-Y scanning, this will quickly lead to millions of SNP names and a bloated spreadsheet. (The Z series has already needed a second, "supplemental" spreadsheet just for itself.)

YFull assigns an internal YFS name to every SNP, but only publishes reliable SNPs confirmed in two or more examples. This has the advantage of keeping the spreadsheet manageable.
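As a rough illustration of the "two or more examples" rule, here is a minimal sketch. It is not YFull's actual code or data model; the kit IDs, SNP names, and the `min_samples` parameter are placeholders made up for the example.

```python
from collections import defaultdict

def publishable_snps(calls, min_samples=2):
    """Return SNP names seen as derived in at least `min_samples` distinct samples.

    `calls` is an iterable of (sample_id, snp_name) pairs. Both values are
    hypothetical placeholders, not real kit numbers or SNP names.
    """
    samples_by_snp = defaultdict(set)
    for sample_id, snp_name in calls:
        # A set ignores repeated calls from the same sample.
        samples_by_snp[snp_name].add(sample_id)
    return sorted(
        name for name, samples in samples_by_snp.items()
        if len(samples) >= min_samples
    )

calls = [
    ("kit1", "SNP_A"), ("kit2", "SNP_A"),  # seen in two kits
    ("kit1", "SNP_B"),                     # private to one kit
    ("kit3", "SNP_C"), ("kit3", "SNP_C"),  # duplicate call, still one kit
]
print(publishable_snps(calls))
```

Under this rule a private SNP still gets an internal name but stays out of the published list until a second sample confirms it.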

FGC Corp
01-13-2015, 10:11 PM
We name all SNPs because those are generally asked for by our customers. The second point is that ISOGG has the standard tree which has multiple confirmations for each SNP.

As for third parties everyone has their own approach.

We don't make money off of naming SNPs, so it serves as info for the community to use as their needs require.

FGC Corp
01-13-2015, 10:13 PM
Secondly we don't limit our data releases so that all data becomes public as it becomes available. Put differently we make $0 from naming SNPs.

I think I would essentially note that SNP naming/phylogenetic discovery, while some measure of progress, just isn't fundamental to a genetics business. New sequencing products or biotech technology drive any business in this area; SNP discovery is part of the hobby but it doesn't contribute to the business. So I would regard YFull as somewhat helpful, since it serves as an additional 3rd party assessment of products, particularly with respect to Y chr coverage of the different products.

rgd
01-13-2015, 10:31 PM
Does FGC check YFull's database before assigning names?


YFull assigns an internal YFS name to every SNP, but only publishes reliable SNPs confirmed in two or more examples. This has the advantage of keeping the spreadsheet manageable.

I don't understand how YFull sticking new names on previously identified and named SNPs keeps the spreadsheet manageable--perhaps it makes their spreadsheet more manageable, but not mine. Perhaps your question should have been: does YFull check whether a SNP has been named and then use that name rather than sticking a new name on it? If they don't, why not?

FGC Corp
01-13-2015, 10:34 PM
I don't understand how YFull sticking new names on previously identified and named SNPs keeps the spreadsheet manageable--perhaps it makes their spreadsheet more manageable, but not mine. Perhaps your question should have been: does YFull check whether a SNP has been named and then use that name rather than sticking a new name on it? If they don't, why not?

As I discussed with a community member a while back a SNP database is needed, since it is better for this type of task, but that is a mission for third parties. It's inevitable given the current set-up for some SNPs to have multiple names, in the absence of a dynamically updated central public database.
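To make the idea concrete, here is a minimal sketch of how a shared database could detect multiple names for the same SNP by keying on the mutation itself rather than on the name. The record layout, the lab prefixes `LABA`/`LABB`, and the positions are all invented for illustration; no real database schema or real coordinates are implied.

```python
from collections import defaultdict

def build_alias_index(records):
    """Map each (position, ancestral, derived) key to the set of names used for it.

    `records` is an iterable of (name, position, ancestral, derived) tuples.
    All values in the example below are hypothetical.
    """
    index = defaultdict(set)
    for name, position, ancestral, derived in records:
        index[(position, ancestral, derived)].add(name)
    return index

def duplicated_names(index):
    """Return sorted alias groups for mutations that carry more than one name."""
    return sorted(sorted(names) for names in index.values() if len(names) > 1)

records = [
    ("LABA-101", 1000, "C", "T"),  # first lab's name for a mutation
    ("LABB-7", 1000, "C", "T"),    # second lab's name for the same mutation
    ("LABA-102", 2000, "G", "A"),  # named by one lab only
]
index = build_alias_index(records)
print(duplicated_names(index))
```

With an index like this, a lab could look up a position before assigning a name, and a tree could show every alias next to the one it chose to display.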

KSDA
01-13-2015, 10:57 PM
How much does it really matter if YFull or FGC puts their names on an SNP first? If anything it's proof of independent verification from two separate groups, and both designations will show up eventually in YBrowse and the ISOGG tree anyway.

Also, is there any place where I can find a full list of FGC SNPs and their phylogenetic importance? I'm not really familiar with their findings so far.

FGC Corp
01-13-2015, 11:50 PM
How much does it really matter if YFull or FGC puts their names on an SNP first?

I don't see that it matters.


If anything it's proof of independent verification from two separate groups, and both designations will show up eventually in YBrowse and the ISOGG tree anyway.

I agree.


Also, is there any place where I can find a full list of FGC SNPs and their phylogenetic importance?

Email us.

As I see it, this is the future:
https://www.fullgenomes.com/news/

lgmayka
01-14-2015, 11:28 AM
I don't understand how YFull sticking new names on previously identified and named SNPs
You didn't get my point at all, which is that both companies have more important things to do than to check for duplication of SNP names. I suspect that YFull made a copy of the Ybrowse database about a year ago, and does not duplicate names found there at that time.

VinceT
01-15-2015, 05:39 AM
Just echoing:

It does take a significant amount of time to download, audit, curate, and re-import into their respective internal databases. This *is* done (i.e. FGC had stated elsewhere that they do this every 4 to 6 weeks typically), but not as frequently as some would like.

Unfortunately, ISOGG simply does not possess the financial or human resources to host and curate the centralized open-access Y-SNP repository database that we all long to have.

I really wish that organizations and institutions such as NCBI, the Wellcome Trust, Erasmus University, KU Leuven, and so forth, would join forces on a cooperative level with citizen science to set something like this up. Oh, silly dreams.

rgd
01-15-2015, 06:11 PM
As I discussed with a community member a while back a SNP database is needed, since it is better for this type of task, but that is a mission for third parties. It's inevitable given the current set-up for some SNPs to have multiple names, in the absence of a dynamically updated central public database.

Perhaps I should have been more explicit. On 21 Apr 2014 Jim Turner showed the following tree regarding his new FGC SNPs:

https://groups.yahoo.com/neo/groups/R1b-DF27-Project/conversations/topics/935
DF27+
1. Z2571+
1.1. DF84+, Z2568+, Z2569+, Z2570+, and Z2572+
1.2. FGC11368+ and dozens of other new SNPs (Rox2)
1.2.1 FGC11367+ and FGC11397+

YFull now shows the same Z2571 in their Experimental YTree v3.1 as:
http://www.yfull.com/tree/
• R-Z2571*
• R-Y8397 Y8842 * Y8840 * Y8839... 16 SNPs
  o id:HG00107GBR
  o id:YF02413SWE [SE-AC]
• R-Z2568 Z2572/CTS11567 * Z2570 * Z2569... 2 SNPs
  o id:NA19762MXL
  o id:HG01577PEL

The two trees do not look anything alike. Turner's "FGC11368+ and dozens of other new SNPs (Rox2)" appears to have become in part YFull's "R-Y8397 Y8842 * Y8840 * Y8839... 16 SNPs." This seemed to me to possibly be more than just an assignment of SNP names to just some SNPs by YFull, though I may be wrong.

I agree there needs to be a dynamically updated central public database.

lgmayka
01-15-2015, 06:29 PM
Turner's "FGC11368+ and dozens of other new SNPs (Rox2)" appears to have become in part YFull's "R-Y8397 Y8842 * Y8840 * Y8839... 16 SNPs." This seemed to me to possibly be more than just an assignment of SNP names to just some SNPs by YFull, though I may be wrong.
I have no idea what else you could be imagining. A footnote on the YFull haplotree says: "Based on ISOGG v9.29 at 2 March 2014." I suspect that on or near that same date, YFull made a flattened copy of Ybrowse. Any FGC SNPs that were already in Ybrowse at that time would then keep those names in YFull; but YFull probably does not re-copy Ybrowse periodically, and YFull almost certainly does not poke around FGC's database (just as FGC almost certainly does not poke around YFull's database (https://docs.google.com/a/ieee.org/spreadsheet/ccc?key=0Alkh_vXxrFrOdENKdWZHZ2VHak1Mc29MUlg4RlJ2S3c&usp=sharing#gid=0)).

Again, the reality is that neither company has the "excess" clerical staff to be checking painfully slow databases (like Ybrowse) all over the web. YFull, in particular, was a strictly volunteer effort until April 2014, when the tsunami of BigY results practically forced them to start charging a fee.

FGC Corp
01-19-2015, 06:35 PM
I have no idea what else you could be imagining. A footnote on the YFull haplotree says: "Based on ISOGG v9.29 at 2 March 2014." I suspect that on or near that same date, YFull made a flattened copy of Ybrowse. Any FGC SNPs that were already in Ybrowse at that time would then keep those names in YFull; but YFull probably does not re-copy Ybrowse periodically, and YFull almost certainly does not poke around FGC's database (just as FGC almost certainly does not poke around YFull's database (https://docs.google.com/a/ieee.org/spreadsheet/ccc?key=0Alkh_vXxrFrOdENKdWZHZ2VHak1Mc29MUlg4RlJ2S3c&usp=sharing#gid=0)).

Again, the reality is that neither company has the "excess" clerical staff to be checking painfully slow databases (like Ybrowse) all over the web. YFull, in particular, was a strictly volunteer effort until April 2014, when the tsunami of BigY results practically forced them to start charging a fee.

Exactly. There's a good argument for setting up a database via ISOGG. I don't think that would require a very substantial sum of money.

rgd
01-27-2015, 04:09 PM
I have no idea what else you could be imagining.

Without your background on the company I merely imagined that YFull would do what they offer on their web site: “All newly discovered SNPs will be added to the Y-series.” They did have the resources to designate R-Y8397 as a new subclade under R-Z2571 for the SNPs they named. Perhaps you can help me understand the criteria used by YFull to derive the new subclade or branch they show on their tree.

I went to the YFull database site that you cited and was impressed with the information provided by YFull. I have not found where FGC provides similar information. Certainly it would help if they did. Clearly the companies that are assigning new SNP names should come together and establish a dynamically updated central public database. And surely customers would expect the cost of that to be part of their fee.

When I looked in some detail at the SNPs that they named I found that apparently they have arbitrarily designated one of their 19 SNPs as the ancestor SNP for a branch on the tree. Then they show that the other SNPs lie below it. Their name for that SNP is Y8397, the first on the list of their named SNPs. Their site shows the SNP is highly similar to ChrX.

Then looking farther I found that YBrowse shows, from the location, that the SNP Y8397 was previously named FGC11374, which is one of 45 apparently phylogenetically equivalent SNPs that FGC named in my kit in the Rox2 cluster.

What evidence did YFull use to define this R-Y8397 subclade or branch of their tree? Have they complied with the listing criteria for SNP inclusion into the ISOGG Y-DNA Haplogroup Tree? Shouldn’t their criteria be the same for their tree?

I have not seen any evidence that the SNP Y8397/FGC11374 is the ancestor of the other SNPs they named or the ones that FGC named. Does this make R-Y8397 an invalid subclade? If it is, shouldn’t YFull delete it from their tree?

Petr
01-27-2015, 04:31 PM
How would you feel if you suddenly found the YFull website was using your clipboard without your permission? And what would you do?
I would create a new thread, since this thread is about Full Y Chromosome Sequencing: Phase III Pilot.

Then I would add a screenshot of the message. According to my experience this information about using the clipboard appears from time to time for no apparent reason.

Then I would ask YFull team.

Petr

FGC Corp
01-27-2015, 07:19 PM
Hello everyone,

I believe that discussions of YFull's haplotree should be continued in a different thread, and that any other discussions about YFull should also be done in a separate thread, particularly since it gives their team an opportunity to address those questions directly.

We are not familiar with YFull's database and have no comment on their operations.

EDIT:
My recommendation is that any specific concerns about operations are best addressed by a direct email to the respective company's team, as suggested by Petr.

lgmayka
01-27-2015, 07:48 PM
First, I must point out that I am not affiliated with YFull, only a very satisfied user of their service.


When I looked in some detail at the SNPs that they named I found that apparently they have arbitrarily designated one of their 19 SNPs as the ancestor SNP for a branch on the tree. Then they show that the other SNPs lie below it. Their name for that SNP is Y8397, the first on the list of their named SNPs.
You misinterpret. When multiple SNPs appear (so far) to be phylogenetically equivalent, YFull--just like FTDNA and now to some extent ISOGG--rather arbitrarily designates one of those SNPs to be the name of the clade. This does not mean that the chosen SNP is the "ancestor" of the others in any way.
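A minimal sketch of that idea, assuming a simple model in which SNPs are treated as phylogenetically equivalent when exactly the same set of samples is derived for them. The SNP names, kit IDs, and the "alphabetically first" labelling rule are my own placeholders, not how YFull, FTDNA, or ISOGG actually pick a clade name.

```python
from collections import defaultdict

def equivalent_blocks(derived_samples):
    """Group SNPs that are derived in exactly the same set of samples.

    `derived_samples` maps SNP name -> set of sample IDs testing positive.
    Returns {label: {"equivalents": [...], "samples": [...]}}, where the
    label is an arbitrary pick from the block (here: alphabetically first).
    """
    blocks = defaultdict(list)
    for snp, samples in derived_samples.items():
        blocks[frozenset(samples)].append(snp)

    result = {}
    for samples, snps in blocks.items():
        snps.sort()
        label = snps[0]  # arbitrary choice; implies no ancestry within the block
        result[label] = {"equivalents": snps[1:], "samples": sorted(samples)}
    return result

derived = {
    "SNP_X": {"kit1", "kit2"},
    "SNP_Y": {"kit1", "kit2"},  # same carriers as SNP_X, so equivalent so far
    "SNP_Z": {"kit1"},          # found in kit1 only, so a separate block
}
print(equivalent_blocks(derived))
```

Any other member of the block would serve equally well as the label, and the block splits as soon as a new sample is derived for some of its SNPs but not others.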

What evidence did YFull use to define this R-Y8397 subclade or branch of their tree?
Apparently, the SNP was found in two or more samples, and in YFull's judgment (or perhaps according to some automatic algorithm) was considered reliable enough for phylogenetic purposes.

Have they complied with the listing criteria for SNP inclusion into the ISOGG Y-DNA Haplogroup Tree?
Heaven forbid! ISOGG's tree has a very different purpose, heavily tied into Sanger sequencing and individual SNP testing, and is maintained entirely by volunteers. As a result, ISOGG's tree is generally quite reliable (though it does contain some stubborn fossilized errors); but it definitely lags far behind the best current knowledge. YFull's tree is clearly meant to reflect the latest discoveries and inferences from (nearly-)full-Y sequencing--on the "bleeding edge" of science--but is therefore somewhat more speculative.

TigerMW
01-27-2015, 07:54 PM
I agree with FGC. Let us treat this thread with respect and move the YFull tree discussions to their category or under a general Y tree thread.


Hello everyone,

I believe that discussions of YFull's haplotree should be continued in a different thread, and that any other discussions about YFull should also be done in a separate thread, particularly since it gives their team an opportunity to address those questions directly.

We are not familiar with YFull's database and have no comment on their operations.

EDIT:
My recommendation is that any specific concerns about operations are best addressed by a direct email to the respective company's team, as suggested by Petr.