
Thread: Tracking R1b novel SNPs and their availability incl. a la carte ordering

  1. #1
    Registered Users
    Posts
    4,411
    Sex
    Y-DNA (P)
    R1b
    mtDNA (M)
    H

    Tracking R1b novel SNPs and their availability incl. a la carte ordering

    I don't think most people will do full genome scans without a dramatic reduction in price. That means that a la carte (stand-alone or one at a time) SNP ordering will remain critical. We know FTDNA raised their prices from $29 to $39 each earlier this year. It appears they removed the one-time lab transfer fee in the process. We know our favorite (if not best or primary) SNP advocate, Thomas Krahn, is no longer employed at FTDNA, so I don't know what that means for SNP discovery and offering development at FTDNA. We know that some WTYs on order have been cancelled. My guess is they are watching the competition and are working on new offerings. That's just a guess.

    Regardless, the a la carte (Advanced Orders) process for single SNPs is still available, and as far as we know, SNP tests are still being developed. I propose that we share and track SNP primer requests and availability. I've started this thread to do that for R1b.

    DF99 is one SNP I have questions about. It appears to be a new peer subclade to U152, L21, DF27, DF19 and L238 under P312. Does anyone know its status at FTDNA or how we can determine it?

    FTDNA would probably appreciate it if we had efficient communications with them on these matters.
    Last edited by TigerMW; 01-14-2014 at 03:43 PM. Reason: corrected spelling

  2. #2
    Gold Class Member
    Posts
    2,718
    Sex

    Quote Originally Posted by Mikewww View Post
    as far as we know, SNP tests are still being developed.
    I have the contrary impression, based on what Bennett Greenspan wrote to me on September 4:
    ---
    For the time being I see no change [due to the departure of Thomas Krahn] other than the SNPs that others have asked him to calibrate up or a few WTYs to be scored, nothing should be out of the norm.
    ---

    Thomas Krahn was the only one at FTDNA, to my knowledge, who had any interest in designing and offering new SNP tests. He had already ordered the primers for several more on my behalf when he was let go. I have no reason to believe that those tests will ever emerge.

    My own suspicion right now is that, in regard to SNP tests, FTDNA will execute a mere holding action until it can:
    - In the shorter term, offer the new version of Geno or a competitor to it
    - In the longer term, offer full Y sequencing.

    With some justification, FTDNA probably feels that individual SNP testing is an ever-deepening bed of quicksand. The requests for new tests continue to grow while the revenue per test continues to decline as tests become more specific to tiny recent clades.

    Maybe a better modern analogy is an ever-widening sinkhole:
    Georgia sinkhole threatens to swallow entire apartment complex
    Louisiana sinkhole swallows 100-foot-tall trees
    Florida sinkhole swallows resort near Disney World

    The inability to get new individual SNP tests is quite tragic in the short term, of course, because there is no cost-effective alternative. Full Y sequencing is currently still much too expensive for routine recommendation, and chip offerings like Geno 2.0 cater primarily to Utahans and Sardinians.
    Last edited by lgmayka; 09-16-2013 at 02:48 PM.

  3. #3
    Administrator
    Posts
    2,461
    Sex
    Y-DNA (P)
    R1b-Z12
    mtDNA (M)
    I3b (FMS)

    Quote Originally Posted by lgmayka View Post
    I have the contrary impression, based on what Bennett Greenspan wrote to me on September 4:
    ---
    For the time being I see no change [due to the departure of Thomas Krahn] other than the SNPs that others have asked him to calibrate up or a few WTYs to be scored, nothing should be out of the norm.
    ---
    I've had the feeling for some time that Bennett Greenspan is not up to speed with the operational details of the a la carte SNP and WTY side of the business. With Thomas' departure I would not be surprised if no more new SNPs are added to the menu.

  4. #4
    Registered Users
    Posts
    714
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA (P)
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    I think that we really need to push Full Genomes for alternatives for testing individual YSNPs or groups of YSNPs. If FTDNA dramatically slows down the offering of newly discovered YSNPs, they are making a big business mistake by taking their eyes off their core business of YSTR testing AND extensive YSNP testing (not less than now). Testing individual YSNPs is probably not a money maker for FTDNA, since the overhead of scanning for one mutation on the Y-chromosome must cost about the $39 that they are collecting. I am afraid that their strategy will be pumping out new versions of Nat Geo every couple of years, which just will not be adequate. This could put a business squeeze on Full Genomes: if FTDNA does not offer individual YSNP tests, Full Genomes' business will really slow down. We should give FTDNA a few weeks to propose their new strategy on YSNP discovery and testing of newly discovered YSNPs. Our requests to add Full Genomes' newly discovered YSNPs will reveal a lot in the near future about FTDNA's long term strategy. If Full Genomes finds a good alternative, I would really hate to manually keep track of these tests vs. FTDNA YSNP reports (even with all their problems).

    If the Nat Geo / FTDNA contract allows, FTDNA could statically test 500 to 1,000 newly discovered YSNPs with new tests every 3 to 6 months for around $99. Over time, there could be around a dozen or so $99 tests that would give us the equivalent of the $39 individual tests. I am not sure if the economics of setting up several static chips would allow this, but it would also eliminate many unstable / private SNPs in making the Nat Geo test. Nat Geo probably does not want to add all the private YSNPs because they do not really care about private/genealogical YSNPs. Then every couple of years, the Nat Geo test could add the broader and older YSNPs and remove any unstable or private YSNPs, which would only be available via the static $99 tests. This would be a reasonable solution that would allow us to continue to test groups of YSNPs at less than the Nat Geo test costs and to test newly discovered SNPs much more frequently than a Nat Geo update every two years.

    The 111 marker test ($359) and the Nat Geo test ($199) together already cost $558 - this is 37 % of the cost of the Full Genomes test. However, $1,499 is just not viable except for those of us who are willing to test at those costs and can afford to test at those costs. If FTDNA does not keep the ball rolling and the Full Genomes test falls to $750, FTDNA would have a major exposure to losing their lucrative and exclusive testing of profitable YSTR 67 and 111 marker tests, as well as orders for Nat Geo from those who recommend YSNP testing. It is really sad that Ancestry.com and 23andme have such dismal and disappointing long term strategies. All three companies are pursuing high volume atDNA tests, which are good for many scenarios but cannot address the older brick walls that YDNA can reveal.
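    As a quick check of the price comparison in this post, here is a minimal Python sketch. The three prices are the ones quoted in the post; the variable names are only illustrative.

```python
# Prices as quoted in the post (USD).
y111_price = 359          # 111 marker YSTR test
nat_geo_price = 199       # Nat Geo (Geno 2.0) test
full_genomes_price = 1499 # Full Genomes test

combined = y111_price + nat_geo_price
share = combined / full_genomes_price

print(f"Combined: ${combined}")
print(f"Share of Full Genomes price: {share:.0%}")
```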

    The above only addresses testing issues - we still have massive database problems and a lack of tools, which is another major issue. We have to download Nat Geo tests manually to make up for the broken Nat Geo to FTDNA transfer, and we manually track Walk the Y test results (vs. having them load into the YSNP reports). You know that FTDNA will never allow uploads of Full Genomes YSNP or YSTR results. These are just the growing pains of a new and leading edge niche industry.
    Last edited by RobertCasey; 09-16-2013 at 06:05 PM.

  5. #5
    Registered Users
    Posts
    4,411
    Sex
    Y-DNA (P)
    R1b
    mtDNA (M)
    H

    Quote Originally Posted by RobertCasey View Post
    ... The above only addresses testing issues - we still have massive database problems and a lack of tools, which is another major issue. We have to download Nat Geo tests manually to make up for the broken Nat Geo to FTDNA transfer, and we manually track Walk the Y test results (vs. having them load into the YSNP reports). You know that FTDNA will never allow uploads of Full Genomes YSNP or YSTR results. These are just the growing pains of a new and leading edge niche industry.
    I was thinking about that database issue too. I see we have different volunteers in U106, Z18 and L21 using different formats to do the comparative analysis. We also have Chris Morley's tool, which must have its own database.

    I wish there was a better example, but just as there is a Ysearch database with STR data elements and "one" haplogroup designation per record/kit/row, we need something similar for SNPs. There should be a record ID, an associated kit # or some other identifier, a surname as a double check, and then a series of structured data items/columns dependent on the haplogroup... something along those lines. Everybody should use the same format for the SNP mutation label/location and the allele, be it "A", "G", etc.

    There are many, many SNPs, of course, so I'd recommend something like a relevant set of SNPs for each haplogroup. This needs to be worked out, but a secondary index file might hold the haplogroup definitions, with a list of relevant SNPs for each haplogroup and the ancestral allele/value for each of those SNPs. The primary/large file would have to store an index key in each record to point to the appropriate "owning" haplogroup.
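    To make the proposed layout concrete, here is a minimal Python sketch of one possible reading of it: a secondary index of haplogroup definitions (relevant SNPs plus ancestral alleles) and a primary record that stores a key pointing at its "owning" haplogroup. All class names, field names, SNP labels and alleles below are hypothetical placeholders, not an endorsed standard and not real calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HaplogroupDef:
    """Secondary index entry: relevant SNPs and their ancestral alleles."""
    name: str
    ancestral: Dict[str, str]  # SNP label -> ancestral allele


@dataclass
class SnpRecord:
    """Primary record: one kit, keyed to its owning haplogroup."""
    record_id: int
    kit_id: str
    surname: str      # kept as a double check on the kit identifier
    haplogroup: str   # index key into the haplogroup definitions
    calls: Dict[str, str] = field(default_factory=dict)  # SNP label -> observed allele


def derived_snps(record: SnpRecord, index: Dict[str, HaplogroupDef]) -> List[str]:
    """Return tested relevant SNPs whose observed allele differs from ancestral."""
    definition = index[record.haplogroup]
    return sorted(
        snp
        for snp, ancestral_allele in definition.ancestral.items()
        if snp in record.calls and record.calls[snp] != ancestral_allele
    )


# Placeholder data only: "HG-X", "SNP1".."SNP3" and the alleles are invented.
index = {"HG-X": HaplogroupDef("HG-X", {"SNP1": "A", "SNP2": "G", "SNP3": "C"})}
rec = SnpRecord(1, "K0001", "Example", "HG-X", {"SNP1": "G", "SNP2": "G"})
print(derived_snps(rec, index))
```

    SNPs that were never tested (SNP3 here) are simply absent from the record rather than stored as blanks, which keeps the per-haplogroup column set from growing without bound.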

    What do you think? Essentially, I think something like ISOGG needs to endorse use of an independent database structure. There should be guidelines/controls for registering the SNPs, etc. They already have an SNP index and perhaps that could be another auxiliary file to be used as the "correct" / consistent SNP names.

    We've had a few years of WTY, then the larger volume of Geno2, and now multiple other formats are on the way. I'm afraid it's getting ahead of us.

    I suppose a database structure should be endorsed for Y STRs too since Ysearch doesn't even hold FTDNA's 111 and it sounds like we have more than that coming in these new offerings.

    I recognize that the operational support of such databases is another issue, but at least we should come up with a common format so various project data can be joined together for analysis, etc., without a bunch of transformations. I'm advocating setting data format and naming standards, not providing the service itself. Companies will probably attempt to, but we will see bias in these towards their own offerings or lackluster up/down support.
    Last edited by TigerMW; 09-16-2013 at 09:36 PM.

  6. #6
    Registered Users
    Posts
    4,411
    Sex
    Y-DNA (P)
    R1b
    mtDNA (M)
    H

    Quote Originally Posted by Mikewww View Post
    ... DF99 is one SNP I have questions about. It appears to be a new peer subclade to U152, L21, DF27, DF19 and L238 under P312. Does anyone know its status at FTDNA or how we can determine it? ...
    I'm not sure of their status but Z192 is an important subclade under U152 and then we have DF100 directly under L11 as a peer to P312 and U106.

  7. #7
    Registered Users
    Posts
    714
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA (P)
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    I think that there is hope for a common YSTR format since it is an easier issue:

    1) The dash format of multi-copy markers has to be parsed into separate fields for analysis (first deviation from FTDNA format);
    2) Conversion to 389-delta format - far superior for analysis (second deviation from FTDNA format - change in my format - my database is already in delta format);
    3) Multi-copy markers beyond normal values should be kept separate from normal markers (change in your format for 464 - eliminate blank columns and put them out in a special area);
    4) Must include the extra Full Genomes YSTRs (up to 400) - needed for future and current YSTR research;
    5) Must include source project name and date extracted fields - I am slowly adding these so that others can easily pull values - I still can not find around 50 of your submissions (labeled SS);
    6) Some way to load non FTDNA YSTR report submissions (YSearch, WFN and custom reports) - too hard to maintain but we need source field to track their original source;
    7) Some way to track no-call YSTRs and missing YSTRs (currently listed as 0* and 0 in FTDNA reports);
    8) Manually be able to add the CCTG 464 variations as special fields;
    9) Full Genomes has to be a special source field vs. just using FTDNA project name improperly;
    10) Splitting up f and FTDNA ID fields - bad database design to combine fields - should be report source field and report ID field (change in both of our formats);
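
    Items 1, 2 and 7 in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 389 delta is taken to be DYS389II minus DYS389I, and the reading of "0*" as a no-call and "0" as a missing value simply follows the order given in item 7.

```python
from typing import List, Optional, Tuple


def split_multicopy(value: str) -> List[int]:
    """Split a dash-formatted multi-copy value such as '11-14' into separate fields."""
    return [int(part) for part in value.split("-")]


def to_389_delta(dys389i: int, dys389ii: int) -> Tuple[int, int]:
    """Return (DYS389I, DYS389II minus DYS389I), i.e. the assumed 'delta' format."""
    return dys389i, dys389ii - dys389i


def parse_call(raw: str) -> Tuple[Optional[int], str]:
    """Map a raw report cell to (value, status).

    Assumption: '0*' marks a no-call and '0' marks a missing marker.
    """
    raw = raw.strip()
    if raw == "0*":
        return None, "no-call"
    if raw == "0":
        return None, "missing"
    return int(raw), "ok"


print(split_multicopy("11-14"))
print(to_389_delta(13, 29))
print(parse_call("0*"), parse_call("0"), parse_call("24"))
```

    Keeping the status separate from the value means a no-call is never mistaken for a true repeat count of zero when distances are computed later.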

    I try to emulate your spreadsheet (no need to re-invent the wheel) with a couple of minor exceptions. Hopefully, it will become the de facto standard to emulate.
    The YSTR and YSNP reports must be separate database tables but output summary export reports can be joined together. Probably need a separate table for genealogical
    data (Surname and estimated European place of origin - derived fields that are major pain and deviation from FTDNA) as well as original FTDNA fields such as Oldest known
    proven ancestor, origin, donor field and grouping header (I parse this).
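
    A minimal sketch of the "separate tables, joined for export" idea, using Python's built-in sqlite3 module. The table and column names are hypothetical, the rows are invented placeholders, and the split of the identifier into a report source and a report ID follows item 10 of the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genealogy (
    report_source TEXT, report_id TEXT, surname TEXT, origin TEXT,
    PRIMARY KEY (report_source, report_id));
CREATE TABLE ystr (
    report_source TEXT, report_id TEXT, marker TEXT, value INTEGER,
    PRIMARY KEY (report_source, report_id, marker));
CREATE TABLE ysnp (
    report_source TEXT, report_id TEXT, snp TEXT, result TEXT,
    PRIMARY KEY (report_source, report_id, snp));
""")

# Placeholder rows only; none of these are real kits or results.
conn.execute("INSERT INTO genealogy VALUES ('SRC-A', 'K0001', 'Example', 'Unknown')")
conn.executemany(
    "INSERT INTO ystr VALUES (?, ?, ?, ?)",
    [("SRC-A", "K0001", "DYS393", 13), ("SRC-A", "K0001", "DYS390", 24)],
)
conn.execute("INSERT INTO ysnp VALUES ('SRC-A', 'K0001', 'SNP1', '+')")

# Export report: join the three tables on the (source, ID) pair.
rows = conn.execute("""
SELECT g.surname, s.snp, s.result, t.marker, t.value
FROM genealogy AS g
JOIN ysnp AS s
  ON s.report_source = g.report_source AND s.report_id = g.report_id
JOIN ystr AS t
  ON t.report_source = g.report_source AND t.report_id = g.report_id
ORDER BY t.marker
""").fetchall()

for row in rows:
    print(row)
```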

    Normalization of tables has major tradeoffs. Normalization states that any table entries that are not dependent should be separated. We obviously currently have
    L21 tables (about all our spreadsheets can handle and that is about to get ugly with too many more entries). David Reynolds started doing normalization for his
    Nat Geo / WTY summaries which is the correct thing to do. I would combine L96, L144.1, L371, L679 and CTS2457.2 into one table, DF13private. I would keep
    all the others in DF13new - which would include DF13*, which has already been done. Since M222 has so many SNPs (and Full Genomes will add many more) as well
    as having so many submissions: you could split DF49 into DF49-M222 and DF49-other. Same with DF21: DF21-Z246 and DF21other (normalize a few of these
    to get everyone used to constant changes for normalization).
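
    One way to read the split proposed above is as a routing rule that assigns each submission to a table based on its positive SNPs. The Python sketch below uses the table names from this post; the function name, the rule order and the sample inputs are assumptions made for illustration only.

```python
from typing import Set

# SNPs the post proposes combining into a single DF13private table.
DF13_PRIVATE = {"L96", "L144.1", "L371", "L679", "CTS2457.2"}


def table_for(positives: Set[str]) -> str:
    """Pick a target table from a submission's set of positive SNPs."""
    if positives & DF13_PRIVATE:
        return "DF13private"
    if "DF49" in positives:
        return "DF49-M222" if "M222" in positives else "DF49-other"
    if "DF21" in positives:
        return "DF21-Z246" if "Z246" in positives else "DF21other"
    return "DF13new"  # everything else, including DF13*


print(table_for({"DF13", "L371"}))
print(table_for({"DF13", "DF49", "M222"}))
print(table_for({"DF13", "DF21"}))
print(table_for({"DF13"}))
```

    Because the rule is a pure function of the test results, a submission can be re-routed automatically whenever a new result arrives, which is the "constant changes" cost mentioned above.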

    However, this would greatly complicate our lives since we would have to learn how to join numerous tables together for reports and analysis. Looking up ISOGG requirements
    for new DF13 sons would require looking at most tables or joining them together. We will have no choice but to eventually normalize our data into different tables. Another
    major problem would be convergence. My recent analysis of CTS2687 showed major overlap between L1066 (Z253), L1333 (L513), CTS2687 (L513), P314.2 (DF21), 1515Who
    and 513-V (L513). As testing progresses, we would have to constantly move submissions from one table to another. Also, where do we place 1515Who - under L21* or DF13* or its
    own separate table as L21unk?

    Normalization could help the YSNP issue. Under normalization, you only have to store the raw data (un-parsed YSNP) and relevant SNPs under L513 if in the L513 table. For
    relevant YSNPs, they really need to be separate fields. Output reports of relevant YSNPs would be derived combination of separate fields for spreadsheet conversion. YSNPs can get
    real ugly fast: we need to classify them as terminal, duplicate, unstable, volatile (L159.2 and L69.5) which is subject to a lot of debate. Do we only allow terminal YSNPs
    to be ISOGG driven or do we allow private YSNPs to be terminal YSNPs? Or do we just punt and give them the entire list or filter some (unstable)?
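
    A minimal Python sketch of the classify-then-filter idea: each SNP carries a status, and the export report joins the separately stored fields while hiding whichever categories are filtered out. The enum, the function name and the sample calls are hypothetical; L159.2 and L69.5 are tagged volatile only because the post cites them as examples.

```python
from enum import Enum
from typing import Dict, FrozenSet


class SnpStatus(Enum):
    TERMINAL = "terminal"
    DUPLICATE = "duplicate"
    UNSTABLE = "unstable"
    VOLATILE = "volatile"


def report_snps(
    calls: Dict[str, str],
    status: Dict[str, SnpStatus],
    hide: FrozenSet[SnpStatus] = frozenset({SnpStatus.UNSTABLE}),
) -> str:
    """Join separately stored SNP fields into one report string, skipping hidden categories."""
    kept = [f"{snp}{result}" for snp, result in calls.items()
            if status.get(snp) not in hide]
    return ", ".join(kept)


status = {"L159.2": SnpStatus.VOLATILE, "L69.5": SnpStatus.VOLATILE}
calls = {"L21": "+", "L159.2": "+"}  # invented sample calls

print(report_snps(calls, status))                                   # default filter
print(report_snps(calls, status, hide=frozenset({SnpStatus.VOLATILE})))
```

    Storing the status separately from the result leaves the policy question (ISOGG-driven only, or private SNPs too) to the report filter rather than baking it into the data.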

    Another major issue for me is prediction. Most of the single fingerprint YSNPs are probably predictable without L21 positive being a requirement. If I filter out P312 and others,
    I will not find convergence and would not be able to properly predict without testing positive for L21. M222 and L226 are very isolated and do not require testing L21+ and
    probably 80 % of the single fingerprint SNPs fit the same category. So for this analysis, I really need to combine L21 with its near relatives. We are already up to half of
    L21 being single fingerprints and the percent will continue to rise. Also, if we can prove some of your single pattern signatures are genetically isolated, they could be
    predicted as L21 with minimal analysis. Of course, any signature / fingerprint can range from very stable to very speculative - so we will constantly have to move submissions
    between tables - a hassle that we do not currently have much of.
    Last edited by RobertCasey; 09-16-2013 at 11:07 PM.

  9. #8
    Banned
    Posts
    7
    Sex
    Omitted

    Quote Originally Posted by RobertCasey View Post

    Normalization of tables has major tradeoffs. Normalization states that any table entries that are not dependent should be separated. We obviously currently have L21 tables (about all our spreadsheets can handle and that is about to get ugly with too many more entries). David Reynolds started doing normalization for his Nat Geo / WTY summaries which is the correct thing to do. I would combine L96, L144.1, L371, L679 and CTS2457.2 into one table, DF13private. I would keep all the others in DF13new - which would include DF13*, which has already been done. Since M222 has so many SNPs (and Full Genomes will add many more) as well as having so many submissions: you could split DF49 into DF49-M222 and DF49-other. Same with DF21: DF21-Z246 and DF21other (normalize a few of these to get everyone used to constant changes for normalization).

    However, this would greatly complicate our lives since we would have to learn how to join numerous tables together for reports and analysis. Looking up ISOGG requirements for new DF13 sons would require looking at most tables or joining them together. We will have no choice but to eventually normalize our data into different tables. Another major problem would be convergence.

    Another major issue for me is prediction.
    Robert, please drop your thinking about "normalization" and "predictions" and "Excel gymnastics" for a moment and consider this: L371 is a stand alone SNP with a Welsh heritage and does not deserve to be COMBINED nor MIXED IN with other SNPs. One part of your brain seems to get clouded over from your overly Excelling at juggling things around. Just because there have not been so many people tested for L371 is no reason for you to relegate and COMBINE it.

  10. #9
    Registered Users
    Posts
    714
    Sex
    Location
    Texas
    Ethnicity
    English, Irish, German
    Nationality
    US
    Y-DNA (P)
    L21>L226>FGC5639

    England Germany Netherlands France Ireland Switzerland
    Quote Originally Posted by jonesge View Post
    Robert, please drop your thinking about "normalization" and "predictions" and "Excel gymnastics" for a moment and consider this: L371 is a stand alone SNP with a Welsh heritage and does not deserve to be COMBINED nor MIXED IN with other SNPs. One part of your brain seems to get clouded over from your overly Excelling at juggling things around. Just because there have not been so many people tested for L371 is no reason for you to relegate and COMBINE it.
    You are confusing database-related issues with reporting issues. Currently, L371 is lumped into all of R-L21: both Mike W. and I have R-L21 databases (Mike also has an R-P312 database). Normalization is necessary for database design, and R-L21 is getting too big for export to our external reporting spreadsheets. I will probably continue to create external reports at the YSNP level (L371 has a separate report) and to export L371-related reports as standalone spreadsheets and web pages. Prediction has the potential to save everyone a lot of unnecessary testing costs, and that is what Mike and I strive for (as best we can). EXCEL gymnastics is a necessary evil, as that is the lowest common denominator for public consumption. When you look at a spreadsheet, wouldn't you prefer to see only the smaller DF13 SNPs vs. all of R-L21? That is what normalization is all about. What we are talking about is splitting up the L21 database into more manageable chunks, and what to do with new L21 submissions when they are added to our database. Putting it all into one massive L21 database does have advantages (to recognize overlap of L371 with non-L371 submissions). From a prediction point of view, if I combine all of R-M269 into one database, I will be able to predict most L21 private YSNPs without the need to test positive for L21. This could avoid the Nat Geo test in the future and could allow people to test directly for L371 if they match. These issues are very important and affect your and your L371 cousins' pocketbooks for testing costs and the ability to rapidly discover and analyze newly discovered YSNPs.

    Creating standards such as Mike W has suggested means analysis between different parts of the haplotree will be more consistent in nature and that we are better able to merge new and better ideas from other groups. By doing so, we will be able to provide better analysis of L371 and any future descendants of L371. There are now around 20 R-L21 individuals in the Full Genomes tests, so database related issues will become more important with the doubling of R-L21 SNPs by the end of the year. You might want to consider the Full Genomes test for L371 - you will probably discover 20 to 40 new YSNPs under and just above L371. With the impending deluge of new YSNPs coming, we are all concerned about the need to improve and standardize our analysis methodologies. This would separate all the Pugh submissions that dominate L371 from the 20 % non-Pugh submissions. Our goal is to get you a son and grandson of L371 that identifies your particular part of L371. With 100,000+ YSNPs on the near horizon combined with over 400 YSTRs, we may be able to assign combinations of YSNPs and YSTRs to individuals on your pedigree chart - and make some significant progress on our brick walls.

  11. #10
    Banned
    Posts
    7
    Sex
    Omitted

    With the impending deluge of new YSNPs coming, we are all concerned about the need to improve and standardize our analysis methodologies. This would separate all the Pugh submissions that dominate L371 from the 20 % non-Pugh submissions. Our goal is to get you a son and grandson of L371 that identifies your particular part of L371.

    Robert, In an earlier post I did specifically explain to you the "reason" for the higher percentage of "Pugh" surnamed individuals who have the L371 SNP. There are even some African American "West" who are L371 but are genetically YDNA L371 Welsh Pughs. That "reason" is that several years back there was a Project Admin "PROACTIVELY SEEKING" Pughs and Wests to be DNA tested. These surnamed Pughs are hence not a fairly distributed part of the larger Welsh heritage L371 males being tested. Plus, these are mainly DNA tests from USA males .... the real L371 hotspot is in NW Wales (and I have those results now in conjunction with my own research and the POBI project). Those are the hard and fast facts and hopefully I trust you can now accept that reality when you do your Excelling - Standardizing - or Whatever you do on your IBM computers.

    Before making such assertions in the future about any L21 or L371 surname, you might also want to validate and get a pragmatic sanity check with what I am again explaining to you by looking at USA Surname frequencies based on the 2000 Census.

    Pugh ranks # 817 whereas Jones ranks #5 and the Griffith surname ranks # 369. My Jones YDNA line descended from a Griffith L371 line circa 1400s to early 1500s. Again, I doubt your Excelling would discover that very quickly.

    [[[ Edited out and copied elsewhere see note below from moderator]]]

    The British are coming ... The British are coming!!! The Data is coming ... The Data is coming!!! Your sophomoric pleadings to get L371s and other L21s to shuck out big bucks for high priced "Full Genome Testing" are comical. You said: "You might want to consider the Full Genomes test for L371 - you will probably discover 20 to 40 new YSNPs under and just above L371." Robert, I will take this Full Chromosome test ... will you pay 50% of the cost if you are really wrong on the new (20 to 40!!!) YDNA SNPs you predict an L371 individual will have?

    L371 is a relatively younger Y-SNP and looking at Y-STRs right now is the best way to ID Welsh surname relationships as noted in the above examples.

    I would rather fund 10 new Y-37 DNA tests in NW Wales to define some L371 boundaries there rather than shucking out over $1500 for a Full Chromosome DNA Test ... but I'll do that Full Chromosome test now if you want to put some money (upfront) where your mouth is. This topic is about "Novel" SNPs (and you brought my L371 into this discussion), so I think my comment calling you out is more than appropriate.


    [[[Mikewww/moderator on 9/18/2013: I copied verbatim the two contiguous off-topic paragraphs on privacy concerns over to a general category thread about DNA project data and privacy. Go here if interested: http://www.anthrogenica.com/showthre...A-projects-etc I think there are other comments going off on a tangent here too. ]]]
    Last edited by TigerMW; 09-18-2013 at 08:15 PM.
