Page 3 of 3
Results 21 to 23 of 23

Thread: WHY HASN'T NATIONAL GEOGRAPHIC/HELIX DONE ANYTHING ABOUT RELEASING RAW DATA FILES ?

  1. #21
    Registered Users
    Posts
    208
    Sex

    Quote Originally Posted by wombatofthenorth View Post
    It would be odd to go for 95% and not just 100% at that part. Maybe they went for 99% (at least minus the last few at the end that don't change) and ended up with 5% or so where the chip fails and gives no calls. Anyway, it seems possibly far better than 23 or LivingDNA.
    It's a shame that more people around the world don't take Geno, since this could've been a way to really build mtDNA studies in a huge way.
    I agree that it seems odd, but it's sequencing, not a chip process. So far the gaps have been in similar places, so perhaps the problem is aligning the short reads and assigning them to the right place. Pure speculation on my part.

  2. #22
    Junior Member
    Posts
    1
    Sex
    Location
    Israel
    Ethnicity
    Cuban
    Nationality
    Cuban-American-Israeli

    Convert new Geno 2.0 file to GEDmatch - help

    Quote Originally Posted by Ann Turner View Post
    Thanks to William Bruce for sending me his raw data from the Helix version of the Genographic Project.

    He was unable to upload the file to GEDmatch because of the file format. It is sorted by rsid instead of chromosome and position, it lacks a column for position, and it puts the two alleles in separate columns. I used Excel's VLOOKUP function against a 23andMe v3 template to identify the position, concatenated allele1 and allele2 to make a genotype column, rearranged columns, and sorted by chromosome and position. This was sufficient for GEDmatch to recognize the file format (23andMe). There is no X data at all, and GEDmatch made note of that but still accepted the file.
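For anyone who would rather script this than use Excel, the steps described above could be sketched in Python roughly as follows. This is only an illustration, not the workflow actually used in the post: the input layout (tuples of rsid, allele1, allele2), the function names `load_positions`, `chrom_key` and `convert`, and the choice to drop SNPs missing from the template are all assumptions. The template is assumed to be a tab-separated 23andMe-style file with `#` comment lines and rsid, chromosome, position, genotype columns. The rsids in the demo are made-up toy values.

```python
def load_positions(template_lines):
    """Build an rsid -> (chromosome, position) lookup from a 23andMe-style file."""
    positions = {}
    for line in template_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment/header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # malformed row; simply ignored in this sketch
        rsid, chrom, pos = fields[0], fields[1], fields[2]
        positions[rsid] = (chrom, int(pos))
    return positions


def chrom_key(chrom):
    """Sort key: numeric chromosomes first, then X, Y, MT (assumed labels)."""
    order = {"X": 23, "Y": 24, "MT": 25}
    return int(chrom) if chrom.isdigit() else order.get(chrom, 99)


def convert(helix_rows, positions):
    """Turn (rsid, allele1, allele2) rows into (rsid, chrom, pos, genotype) rows."""
    out = []
    for rsid, allele1, allele2 in helix_rows:
        if rsid not in positions:
            continue  # assumption: SNPs absent from the template are dropped
        chrom, pos = positions[rsid]
        out.append((rsid, chrom, pos, allele1 + allele2))  # concatenate alleles
    out.sort(key=lambda row: (chrom_key(row[1]), row[2]))  # chromosome, position
    return out


# Toy data only; these are not real SNPs or real file contents.
template = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs3\t2\t500\tAA\n",
    "rs1\t1\t900\tCT\n",
    "rs2\t1\t100\tGG\n",
]
helix_rows = [("rs1", "C", "T"), ("rs2", "G", "G"), ("rs3", "A", "A"), ("rs9", "A", "C")]

converted = convert(helix_rows, load_positions(template))
for rsid, chrom, pos, genotype in converted:
    print(f"{rsid}\t{chrom}\t{pos}\t{genotype}")
```

The real Helix export may name or order its columns differently, so the parsing step would need to be adjusted against an actual file before the output is uploaded anywhere.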

    General observations:

    The genotype distribution is unremarkable. SNPs whose two alleles are complementary base pairs (A/T or C/G) are avoided. Transitions (A<->G or C<->T) are more common than transversions.
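The three categories mentioned above can be made concrete with a small classifier. This is a minimal sketch; the function name `classify` and the label strings are illustrative choices, not anything from the Helix file itself. Complementary pairs (A/T and C/G) are commonly avoided because they read the same on both strands, which makes strand orientation ambiguous.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(allele_a, allele_b):
    """Label a biallelic SNP by the relationship between its two alleles."""
    if allele_a == allele_b:
        raise ValueError("a SNP needs two different alleles")
    if COMPLEMENT[allele_a] == allele_b:
        return "complementary"  # A/T or C/G: strand-ambiguous
    pair = {allele_a, allele_b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"  # A<->G or C<->T
    return "transversion"  # A/C or G/T (the non-complementary transversions)


for a, b in [("A", "G"), ("C", "T"), ("A", "T"), ("C", "G"), ("A", "C"), ("G", "T")]:
    print(a, b, classify(a, b))
```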

    The no-call rate is quite high at 13%. Genealogy companies aim for no more than 3% and often achieve much better than that. It would be useful to know if the no-calls have a consistent pattern.

    The overall homozygosity is 60.2%, lower than that found in the SNP selections for 23andMe v3 (70.5%) or LivingDNA (83%). Homozygosity increases when more SNPs with rare alleles are added, since most people will share the more common allele. The Genographic Project may have looked for SNPs that are somewhat common globally but have different distributions in various parts of the world.
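Both the no-call rate and the homozygosity figure can be computed from a genotype column along these lines. This is a sketch under two assumptions that the post does not state: no-calls are written as `--` (the 23andMe convention), and homozygosity is measured over called genotypes only. The function name `genotype_stats` and the sample genotypes are made up for illustration.

```python
def genotype_stats(genotypes, no_call="--"):
    """Return (no_call_rate, homozygosity) as fractions between 0 and 1."""
    total = len(genotypes)
    called = [g for g in genotypes if g != no_call]
    no_call_rate = (total - len(called)) / total if total else 0.0
    # A called genotype is homozygous when both alleles are identical.
    homozygous = sum(1 for g in called if len(g) == 2 and g[0] == g[1])
    homozygosity = homozygous / len(called) if called else 0.0
    return no_call_rate, homozygosity


# Toy genotype column, not real data.
sample = ["AA", "AG", "--", "CC", "CT", "GG", "--", "TT", "AC", "AA"]
rate, homozygosity = genotype_stats(sample)
print(f"no-call rate: {rate:.1%}, homozygosity: {homozygosity:.1%}")
```

If homozygosity were instead measured over all SNPs including no-calls, the denominator would change and the percentage would come out lower.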

    Attachment 20032

    William has a file from LivingDNA for comparison. GEDmatch Genesis shows that 99.5% of the calls in the Helix file match the calls in the LivingDNA file (about 237 differences out of the 47,423 SNPs available for comparison). This sounds like a decent concordance rate, but it is lower than achieved by chip technology. My son's LivingDNA file compared to his 23andMe file shows 54 differences out of 183,824 SNPs (99.97%).
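The arithmetic behind those two concordance figures can be checked directly. The counts are the ones quoted above; the helper name `concordance` is just for illustration.

```python
def concordance(differences, compared):
    """Percentage of compared SNPs whose calls agree."""
    return 100.0 * (1 - differences / compared)


helix_vs_livingdna = concordance(237, 47423)
livingdna_vs_23andme = concordance(54, 183824)

# Ratio of the two discordance rates (differences per SNP compared).
discordance_ratio = (237 / 47423) / (54 / 183824)

print(f"Helix vs LivingDNA:   {helix_vs_livingdna:.2f}%")
print(f"LivingDNA vs 23andMe: {livingdna_vs_23andme:.2f}%")
print(f"Discordance ratio:    {discordance_ratio:.1f}x")
```

On these numbers the Helix comparison shows roughly 17 times as many differences per SNP compared as the chip-to-chip comparison, which is why 99.5% is less reassuring than it sounds.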

    The SNP overlap with different platforms is important for GEDmatch. These stats are comparisons of specific files, so they would vary slightly depending on no-calls in the other files.

    Attachment 20033

    With the current GEDmatch algorithm, William's LivingDNA and Helix files show lots of gaps in sections where the SNP overlap falls below their threshold. He shows only a 76.3% similarity with himself.

    Attachment 20034

    The Helix file fared better with a comparison with a parent/child kit from 23andMe v4, where there are more SNPs in common. There were still some small breaks: there were 66 segments vs the expected 22. However, the total cM added up to 3455, just shy of the 3565 I see for a 23andMe v3 kit. GEDmatch also introduces breaks for a v3 kit in regions where the SNP density is low, so 45 segments were reported for a self-to-self comparison.

    I need help to convert my new Geno 2.0 file to GEDmatch. Tx.

  3. #23
    Registered Users
    Posts
    29
    Sex

    Quote Originally Posted by aosamores View Post
    I need help to convert my new Geno 2.0 file to GEDmatch. Tx.
    Would someone be kind enough to tell us, step by step, how to upload the data to GEDmatch? Everyone should be aware that Geno 2.0, from both FTDNA and Helix, often gives customers the wrong haplogroup. Apparently the cause is National Geographic's bad algorithm. They have terrible customer service. I'm also disappointed by the rate of no-calls in Helix compared to FTDNA. I can't wait to see the haplogroup they identify once it is corrected.
