clarifYDNA.com : clarifY your "next-generation" Y-DNA results



cmorley
06-13-2014, 09:13 PM
I am excited to announce the launch of www.clarifYDNA.com (https://www.clarifYDNA.com), a service for generating comprehensive Y-trees from "next-generation" Y-DNA sequencing results.

This service's first offering: personalised Y-tree reports produced from BigY VCF/BED data (https://www.clarifydna.com/productdisplay/subscription-us-dollars). Here is a sample (https://www.clarifydna.com/resources/clarifY-Demo-Analysis.pdf). The website is now accepting orders.

Subscribers will periodically receive personalised reports showing their position on the Y-tree relative to others in their Y-chromosomal neighbourhood. Reports build on recent versions of the ISOGG haplotree, and are able to indicate which aspects of the phylogenetic structure are robust and which are more tenuous. Sequential reports will reflect phylogenetic progress resulting from growth in our service's membership. Subscribers will have opt-in control over how much of their genetic and ancestral information is shared with their genetic neighbours.

I originally designed the underlying phylogenetic (tree-generating) software for the purpose of handling Geno 2.0 data. The first experimental phylogeny was released in July 2013, with subsequent releases (http://ytree.morleydna.com/experimental-phylogeny) last August, September, November and February. These reports have been lauded by the genetic genealogy community and have also received some corporate and academic praise. Since November I have been adapting my software for "next-generation sequencing" results, such as Full Genome Corp.'s comprehensive Y test and FTDNA's BigY product. I have also spent time streamlining the non-phylogenetic aspects -- the parts of the process before and after the generation of the Y-tree -- a workflow for receiving and queuing data and generating and distributing personalised phylogenetic reports.

This service is not affiliated with any Y-DNA testing company (despite having last year initiated talks to explore that option).

I will gladly answer questions about this service, in this thread.

Chris Morley
www.clarifYDNA.com (https://www.clarifYDNA.com)

razyn
07-05-2014, 03:13 PM
This post mentions the FGC comprehensive Y test, which I've had, but the clarifYDNA website doesn't -- it seems only to solicit Big Y VCF/BED files. I didn't take a Big Y test.

I also wonder whether the very reasonably priced subscription is for a specific period, calendar year, number of updates, or... what? I couldn't tell, from the website.

Anyway, somebody needed to ask a question, to bump the thread. (It's kind of a slow and lazy summer period, apparently. Not meaning to be northern-hemispherist, or anything.) Chris is one of the good guys, people -- let's have a little chatter.

cmorley
07-05-2014, 05:43 PM
Thanks razyn.

To answer your questions:

(1) My focus at the moment is processing Big Y VCF/BED data. I could launch a similar service for FGC data, if there is sufficient interest. My software has already been used to process FGC variantCompare/haplogroupCompare/gtype files -- I prepared a pilot phylogeny for FGC in November. Since then, I have made further enhancements to my software. Some are BigY-specific, and some are more general. I would have to do some testing to see how these changes behave when applied to an FGC dataset. I would also have to adjust my "backoffice" scripts (handling non-phylogenetic logistical aspects) to accommodate FGC datasets as well.

(2) Per https://www.clarifydna.com/productdisplay/subscription-us-dollars (click on the "read more" button): "Subscriptions will last until at least the end of 2014." The frequency of updates will be driven by (a) my development workload, and (b) the popularity of this product.

Chris

J1 DYS388=13
07-07-2014, 07:12 AM
How does what you are selling differ from the "Morley" tab on FTDNA's BigY results matching page?

cmorley
07-09-2014, 06:58 AM
The "Morley" tab you speak of gets inserted on your BigY results page by a third-party Chrome extension. I have no part in that extension, nor did I ever. I was not notified ahead of time that the extension would, for the purpose of BigY analysis, try to harness ytree.MorleyDNA.com's functionality.

The tool at ytree.MorleyDNA.com was built for Geno 2.0 data. The online predictor and its underlying dataset haven't been modified since July or August of 2013 -- before BigY's announcement. The output of the ytree.MorleyDNA.com tool is essentially static: the classifications and lists of unplaced SNPs will not change, even as new data becomes available. I don't have any enhancements scheduled for ytree.MorleyDNA.com, but the tool will remain online, primarily as a service to those who ordered Geno 2.0. www.clarifYDNA.com is my current focus.

BigY data is more complicated to work with. And there are known basal discrepancies between the phylogeny implied by Geno 2.0 data (or, at least, FTDNA's Geno 2.0 calls), and the one implied by "next-gen" data. Using a Geno-tuned predictor on next-gen data can produce some strange results.

The phylogenetic algorithm underlying clarifYDNA's product is the result of several months' additional development, in order to address the added complexities of "next-gen" data. Moreover, the output (personalised reports) provided by clarifYDNA is dynamic. Our knowledge of a mutation's reliability changes as more test results become available. So does our knowledge of the scope of each mutation -- whether a kit's currently "novel" mutations (or at least the test subject's instances thereof -- let's not forget that mildly recurrent mutations can still be helpful) are truly localised to the test subject and his immediate patrilineal relatives; or are found in a whole branch of that surname; or in most members of that surname; or in men sharing a MRCA around the time hereditary surnames were adopted; or in a distinctive STR cluster. As stated above,


Subscribers will periodically receive personalised reports showing their position on the Y-tree relative to others in their Y-chromosomal neighbourhood. Reports build on recent versions of the ISOGG haplotree, and are able to indicate which aspects of the phylogenetic structure are robust and which are more tenuous. Sequential reports will reflect phylogenetic progress resulting from growth in our service's membership.

In short: the "Morley" tab just uses a component from last year's model, and it uses that component in a way that was never intended. www.clarifYDNA.com is this year's model. A lot has changed in the last year, and a lot will continue to change.
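
To illustrate the point about a mutation's scope changing as more results arrive, here is a minimal Python sketch. It is not clarifYDNA's algorithm: the function name, the kit IDs and the surnames are all hypothetical, and the three scope labels are a simplification of the categories described above.

```python
from typing import Dict, Set


def mutation_scope(carriers: Set[str], surname_of: Dict[str, str]) -> str:
    """Describe how widely a mutation is currently known to be shared.

    carriers:   IDs of kits that are derived (+) for the mutation so far.
    surname_of: kit ID -> surname (hypothetical metadata, an assumption).
    """
    if not carriers:
        return "not observed"
    if len(carriers) == 1:
        return "novel: seen in one kit only"
    surnames = {surname_of[kit] for kit in carriers}
    if len(surnames) == 1:
        return "shared within one surname"
    return "shared across surnames"


# Hypothetical kits, invented purely for illustration.
surname_of = {"kit1": "Smith", "kit2": "Smith", "kit3": "Jones"}

# The same mutation, assessed before and after a second kit joins the dataset.
first_report = mutation_scope({"kit1"}, surname_of)
later_report = mutation_scope({"kit1", "kit2"}, surname_of)
```

The same mutation gets a different label in the later report only because the dataset grew, which is the sense in which the reports are dynamic.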

vettor
07-09-2014, 06:48 PM
Thanks for the explanation.

Can you answer me then: as a T1 person with CTS8862, have you resolved this SNP? If the answer is the same as that of the ytree.MorleyDNA predictor, then I fail to see why I need to invest in clarifYDNA.

regards

DebbieK
08-23-2014, 10:26 PM
This is a very useful new service from Chris Morley. He is currently providing analysis of Big Y results but hopes to extend the service to cover other tests in due course:

http://cruwys.blogspot.co.uk/2014/08/clarify-dna-new-y-snp-analysis-service.html

Scarlet Ibis
08-24-2014, 12:31 AM
Yes, he announced his new service a few months ago, and said he's available for Q&A in the original post (see link below). Seems promising.

http://www.anthrogenica.com/showthread.php?2690-clarifYDNA-com-clarifY-your-quot-next-generation-quot-Y-DNA-results

DebbieK
08-24-2014, 09:51 AM
Thanks. I didn't realise there was another thread. I wonder if a moderator could merge this thread with the older one. Chris did send me the details of the new service back in June but I've only just had time to investigate it and write about it.

GTC
08-24-2014, 03:19 PM
Merged, as requested.

cmorley
08-25-2014, 06:18 PM
Everything at ytree.MorleyDNA.com is based on Geno 2.0 data rather than BigY or FGC data. I see that CTS8862 was tested on the Geno chip, but it doesn't appear in my latest Geno-based report, and I haven't looked at the Geno 2.0 dataset in several months.

However, I have done a similar analysis (unreleased) on Chromo2 data. That report puts CTS8862 at the "T1a2b~2" level: below L446, and approximately phyloequivalent to CTS3767, CTS9984 and CTS11984 (and three other markers that recur elsewhere in the tree). I have 18 candidate markers "between T1a-L131 and T1a2b-L446" (the dataset lacks representation from T1a2a-P322). There are 7 S-series markers at the "T1a2b~1" level, and a further 7 at the "T1a2b~1~1" level. Granted, each of these clades is based on only one sample.

I expect an analysis of several FGC or BigY kits from this part of the haplotree would reveal other SNPs of comparable importance.

I actually don't have any haplogroup T kits in my BigY dataset. The first order at clarifYDNA.com for analysis of haplogroup T BigY data will get two more BigY haplogroup T analyses free of charge: a three-for-one deal for the first customer submitting their haplogroup T data. I will post here once that offer has been claimed.

vettor
08-25-2014, 06:47 PM
ISOGG T is now interested in
T1a2b CTS11796, CTS12108, L446 ... most likely they are all equal

In regards to your other query: I have tested P322 and I am negative, and I have tested P327 (Palestinian marker) and I am negative. Maybe these have to do with my lack of basal M184, as I do not have it in tests with FTDNA, 23andMe and Geno 2.0. I have the other basal, M272.
Anyway, P322 represents north-Germanic and Scandinavian T markers.

I seem to only have associations/matches with the Alpine area (and near neighbours: south German, north Italian, eastern French) and the British Isles (Ireland and Wales in the majority).

I have replied to your personal question and await your response.



I currently have a query with FTDNA in regards to my T tree; there has been correspondence twice in the past week. I am unsure if FTDNA, ISOGG and yourself are in alignment or not. Confusion reigns supreme in this topic for the T group.

MfA
08-25-2014, 06:55 PM
Are there any new SNPs between E-CTS1096 and E-PF6751 on the unreleased tree you can share?

cmorley
08-25-2014, 07:11 PM
vettor, I just replied to your private message.

Let me add, quickly, that the Chromo2 report I mentioned above was made in February. ISOGG added CTS11796 and CTS12108 to its tree in July. Those were two of the 18 markers my Chromo2 report had placed "between T1a-L131 and T1a2b-L446". Without a P322+ kit, my algorithm couldn't know which of the 18 markers were common to T1a2, and which were exclusive to T1a2b. Your stated P322- result reaffirms that P322 and L446 are on parallel branches, but it cannot be used to resolve the placement of the 18 markers. One would have to test a P322+ kit for these 18 markers.
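
The reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm used for the Chromo2 report: the function name and the kit records are hypothetical, and every candidate marker is assumed to be already known as derived (+) in the L446 lineage.

```python
from typing import Dict, List

# A kit is modelled as marker name -> call: True = derived (+), False = ancestral (-).
Kit = Dict[str, bool]


def place_candidate(marker: str, kits: List[Kit]) -> str:
    """Place a marker already seen as derived in the L446 lineage.

    Only P322+ kits are informative here, because they sit on the parallel branch.
    """
    sibling_calls = [
        kit[marker] for kit in kits if kit.get("P322") is True and marker in kit
    ]
    if not sibling_calls:
        return "unresolved: between T1a-L131 and T1a2b-L446"
    if all(sibling_calls):
        return "common to T1a2"
    if not any(sibling_calls):
        return "exclusive to T1a2b"
    return "conflicting calls"


# Hypothetical kits. The first mirrors a P322-, L446+ result; the second is an invented P322+ kit.
l446_kit = {"P322": False, "L446": True, "CTS11796": True}
p322_kit = {"P322": True, "L446": False, "CTS11796": False}

without_sibling = place_candidate("CTS11796", [l446_kit])
with_sibling = place_candidate("CTS11796", [l446_kit, p322_kit])
```

With only the P322- kit the marker stays unresolved, however many such kits are added. The placement changes only once a P322+ kit supplies a call for that marker; the ancestral call used here is invented to show the mechanics, not a claim about where CTS11796 actually sits.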

cmorley
08-25-2014, 07:23 PM
I think we're getting off-topic, but I'll answer MfA's question here.

My Chromo2 tree has PF6751 one level below CTS1096. I have about 40 SNPs approximately equivalent to CTS1096, and I don't have a Chromo2 result for E-M136. If somebody in E-M136 is negative for any of these 40 SNPs, then that would suggest a new level in between E-CTS1096 and E-PF6751 (conditional on accepting E-PF6751's singleton sibling as a valid clade).
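
As a rough sketch of how a single E-M136 result could split that block of equivalent SNPs, consider the following Python. The function and the placeholder SNP names are hypothetical and are not taken from the Chromo2 tree; the sketch only assumes that the E-M136 kit is derived for CTS1096 itself.

```python
from typing import Dict, List, Optional, Tuple


def split_level(
    equivalents: List[str], m136_calls: Dict[str, Optional[bool]]
) -> Tuple[List[str], List[str], List[str]]:
    """Split SNPs currently equivalent to CTS1096 using calls from one E-M136 kit.

    Calls: True = derived (+), False = ancestral (-), None or missing = no call.
    Returns (stay_at_CTS1096_level, new_intermediate_level, still_unresolved).
    """
    stay, new_level, unresolved = [], [], []
    for snp in equivalents:
        call = m136_calls.get(snp)
        if call is True:
            stay.append(snp)        # shared with E-M136, so it remains at the upper level
        elif call is False:
            new_level.append(snp)   # absent in E-M136, so it falls below the split
        else:
            unresolved.append(snp)  # no information from this kit
    return stay, new_level, unresolved


# Placeholder SNP names and invented calls, for illustration only.
equivalents = ["SNP_A", "SNP_B", "SNP_C"]
m136_calls = {"SNP_A": True, "SNP_B": False}

stay, new_level, unresolved = split_level(equivalents, m136_calls)
```

Any SNP that lands in `new_level` would define the suggested level between E-CTS1096 and E-PF6751; no-calls leave the remaining SNPs where they are until further kits are tested.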

duckie
09-25-2014, 06:30 AM
Do you have any kits from haplogroup N?

vettor
09-25-2014, 09:23 AM
contact these for a positive P322

161332 Johann Gottfried Schuetz

or the norwegians
136605 Hendrick Evertsen Nop
123373 Ole Canuteson

as I state, I tested for it and I am negative

cmorley
12-01-2014, 06:30 AM
I originally stated that clarifYDNA.com (https://www.clarifYDNA.com) analyses would be provided until at least the end of 2014. I am extending this to until at least the end of April 2015, and this applies to past and future orders.

In addition to BigY analyses (https://www.clarifydna.com/productdisplay/subscription-us-dollars), clarifYDNA is now also analysing FGC Y Elite data (https://www.clarifydna.com/productdisplay/yelite-analysis).

vettor
12-01-2014, 06:49 AM
However, I have done a similar analysis (unreleased) on Chromo2 data. That report puts CTS8862 at the "T1a2b~2" level. Below L446, and approximately phyloequivalent to CTS3767, CTS9984 and CTS11984 (and three other markers that recur elsewhere in the tree). I have 18 candidate markers "between T1a-L131 and T1a2b-L446" (the dataset lacks representation from T1a2a-P322). There are 7 S-series markers at the "T1a2b~1" level, and a further 7 at the "T1a2b~1~1" level. Granted, these clades are based on only one sample per clade.



Well I have these as positive S-series

S27238+, S137+, S27574+, S138+, S27644+, S27554+, S27681+, S27568+, S27128+, S27531+, S27605+, S27247+, S26596+, S2041+, S27445+, S27774+

Then again, it's a pity I am not ready for your service. Regards and all the best.