What to do with raw DNA data?



to147628
07-27-2018, 09:19 PM
Hello,

This is my first time posting here, so I would like to apologize in advance if this was posted in the wrong (sub)forum.

Anyway,
I recently got my hands on the raw DNA data of one of my relatives. Sadly, he passed away before being able to see the results. Since we share a Y chromosome, I'm really interested in finding out more about our general ancestry. My question is: how can I do this with the raw data in my possession? Because of his sudden death, I don't even know which company ran the test, and it appears to be a full genome analysis (?). The data is about 4.5 GB in total, and the main file is a .vcf file named "XXXXXXXXX-DNA_XXX.genome", along with 24 .gz files and a .tbi file (picture of it here: https:// imgur. com/a/ Yyzmgtc). So what can I do from here? Are there any sites/programs I should check out? I'm not too keen on uploading the data to a third-party site due to privacy concerns, but I'm open to any suggestions. Help would be really appreciated, since I'm at a loss here and just don't know where to start.

Thank you very much

Regards,

to147628

Vrump
07-29-2018, 08:47 AM
Gedmatch?

anglesqueville
07-29-2018, 10:10 AM
to147628, you will not be able to do anything with this file as it is, in particular with GEDmatch or with any do-it-yourself program. You have to extract the usual SNPs from the VCF file and write them out in a usable format, depending on the tools that you intend to use (PLINK, EIGENSTRAT, or a text format such as 23andMe's or FTDNA's). The classic tool for VCF files is vcftools: http://vcftools.sourceforge.net/index.html . The link in your post is not readable, so I can only guess that the 24 .gz files you mention are VCF files, one per chromosome. I also suppose that these are post-imputation data? Which company?
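To make the extraction step above a bit more concrete, here is a minimal Python sketch (not vcftools itself) that turns records from a single-sample VCF into 23andMe-style tab-separated lines (rsid, chromosome, position, genotype). It assumes the ID column already carries rs numbers and that the genotype is stored under GT; the function names are made up for this example. Positions are copied as-is, so they stay on whatever reference build the VCF uses, and sites that are not listed in the VCF are simply absent from the output.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def vcf_line_to_23andme(line):
    """Return 'rsid<TAB>chrom<TAB>pos<TAB>genotype' for a SNP record, else None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT/sample columns
    chrom, pos, ids, ref, alt, _qual, _filt, _info, fmt, sample = fields[:10]
    rsid = ids.split(";")[0]
    if not rsid.startswith("rs"):
        return None  # assumption: only keep records that carry an rs number
    alleles = [ref] + alt.split(",")
    if any(len(a) != 1 or a not in "ACGT" for a in alleles):
        return None  # skip indels, symbolic alleles and records without an ALT
    keys = fmt.split(":")
    if "GT" not in keys:
        return None
    gt = sample.split(":")[keys.index("GT")]
    calls = gt.replace("|", "/").split("/")
    if any(c == "." for c in calls):
        genotype = "--"  # no-call
    else:
        genotype = "".join(alleles[int(c)] for c in calls)
    chrom = chrom[3:] if chrom.startswith("chr") else chrom
    return "\t".join([rsid, chrom, pos, genotype])


def convert(vcf_path, out_path):
    """Write every convertible record of vcf_path to out_path."""
    with open_vcf(vcf_path) as src, open(out_path, "w") as dst:
        dst.write("# rsid\tchromosome\tposition\tgenotype\n")
        for line in src:
            if line.startswith("#"):
                continue  # header lines
            row = vcf_line_to_23andme(line)
            if row:
                dst.write(row + "\n")
```

Everything runs locally, which may matter given the privacy concerns raised earlier in the thread. Whether a particular tool accepts the resulting file still has to be checked against that tool's own requirements.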

Cofgene
07-29-2018, 12:54 PM
Since you have the genome.vcf.tbi you can submit that to https://promethease.com/ . You will get some haplogroup information along with trait analysis. You can also consider submitting the VCF to YFull.com to see where they place the paternal line on their tree. If your haplogroup falls under R1b, then submitting the VCF to haplogroup-r.org would also be a good thing from an analysis perspective.

These are the analyzed results from the genome test. Do you have the BAM file in a folder? If not, try to find the order/account info and get access to the BAM. The file naming convention should provide a hint as to which firm did the sequencing.
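If only the paternal line is of interest, a smaller file holding just the Y-chromosome records may be easier to handle than the whole genome. The following is a minimal Python sketch under that assumption; the function names are invented for illustration, it assumes the chromosome is labelled "Y" or "chrY", and whether a given service accepts such a subset is something to confirm with the service itself.

```python
import gzip

# Assumption: the Y chromosome is labelled one of these ways in the CHROM column.
Y_NAMES = {"Y", "chrY"}


def keep_y_lines(lines):
    """Yield header lines plus records whose CHROM column is Y or chrY."""
    for line in lines:
        if line.startswith("#") or line.split("\t", 1)[0] in Y_NAMES:
            yield line


def extract_y(vcf_path, out_path):
    """Copy the header and Y-chromosome records of vcf_path into out_path."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as src, open(out_path, "w") as dst:
        dst.writelines(keep_y_lines(src))
```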

to147628
07-29-2018, 01:15 PM
Thank you very much anglesqueville, I will definitely check vcftools out!
The link should be readable; I just added spaces in between because I'm not yet allowed to post links on this forum. Remove them and it should be fine (https:// imgur. com/a/Yyzmgtc ). The 24 .gz files are indeed named after chromosomes (chr1 -> chr22, X and Y). As for the data being post-imputation data, I simply have no idea (I had to look up this term too). As I stated before, I also have no idea which company ran the test.

What I forgot to mention in my previous post is that along with the data, a .pdf with instructions on how to install "IGV" was included. I tried loading the data through this program and it worked fine; it's just that I don't really know what I'm doing, since IGV seems to be software for professional researchers.
If I do manage to figure out how to use vcftools, what should I do with the data? GEDmatch is mentioned all the time, but are there any other things I should check out?

to147628
07-29-2018, 01:33 PM
Thank you Cofgene, this is really helpful!

Promethease's privacy policy appears to be really sound, so I will definitely try them out.
There sadly doesn't appear to be a BAM file in the folder. I will check if I can find any information on the order/account, but this will likely take some time. The files are all named "LPXXXXXXX-DNA_E01", with the Xs being a string of numbers (if this helps at all).
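Besides the file names, the VCF header itself sometimes carries clues about where the data came from. Here is a minimal Python sketch that collects a few meta-information lines (fileformat, source, reference, fileDate) from the top of a VCF; the function names are made up for this example, and not every pipeline writes all of these lines, so an empty result does not mean much.

```python
import gzip


def read_vcf_header(path):
    """Return the '##' meta-information lines at the top of a VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    meta = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # reached the column header or the first record
            meta.append(line.rstrip("\n"))
    return meta


def header_clues(meta_lines, keys=("fileformat", "source", "reference", "fileDate")):
    """Pick out selected '##key=value' lines that may hint at the data's origin."""
    clues = {}
    for line in meta_lines:
        key, sep, value = line[2:].partition("=")
        if sep and key in keys:
            clues.setdefault(key, value)  # keep the first occurrence of each key
    return clues
```

Calling `header_clues(read_vcf_header(path))` on the main .vcf file, with `path` pointing at it, returns a small dictionary that can be read at a glance.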

Dmitry
08-01-2018, 01:35 AM
There is a website akin to Promethease called codegen.eu. It is free, too. You can upload your data anonymously and get a health/traits report.

Rpink53
08-10-2018, 01:38 PM
Can you upload to YFull for free, or are all services paid?

spruithean
08-10-2018, 04:38 PM
Can you upload to YFull for free, or are all services paid?

It is roughly $50 IIRC to upload to YFull.

C J Wyatt III
08-10-2018, 04:53 PM
I may be a bit out of line here, but when I read the title of this thread, I can't get Bubba's recitation of what you can do with shrimp out of my head:

https://www.bing.com/videos/search?q=bubba+gump&view=detail&mid=34F8FE1687E00F39634134F8FE1687E00F396341&FORM=VIRE

Jack

rms2
08-12-2018, 07:56 PM
I have Kaiser Permanente health insurance through my employer. Recently I agreed to participate in a Kaiser DNA study and went to my local Kaiser lab, where several vials of my blood were drawn for that purpose.

I am hoping to get the raw results, although I doubt they will be especially eager to give them to me.

This thread has been interesting in that connection. At least now I will have some idea of what to do with the raw data, if I get them.