Thread: Dante Labs (WGS)

  1. #621
    Registered Users
    Posts
    179
    Sex

    One can do quick & dirty variant calling with bcftools from the bam with:
    wget http://hgdownload.cse.ucsc.edu/golde...ips/hg19.fa.gz
    samtools index some-mapped-sorted-bam-file.bam
    bcftools mpileup -d 10000 -r chrM -f hg19.fa.gz some-mapped-sorted-bam-file.bam | bcftools call -c --ploidy 1 | grep -v "[[:space:]]0:" --line-buffered | tee some-mapped-sorted-bam-file.chrM.vcf
    (You can add -v to "bcftools call" for a similar effect to the grep, but grep with --line-buffered shows progress. Alternatively, you can use the multiallelic caller with -m instead of -c, but then you'll have to contend with results like "Adenine OR Cytosine", so the consensus caller may be easiest for genealogical mtDNA. For me the results are identical.)
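For anyone who would rather post-process the VCF than rely on the grep, here is a minimal Python sketch of the same filter. It assumes the sample column starts with the haploid genotype (as in "GT:PL" output); the function name and the example records are made up for illustration and are not real bcftools output.

```python
import re

# Same test as the grep pattern "[[:space:]]0:": whitespace followed by "0:".
REFERENCE_CALL = re.compile(r"\s0:")


def drop_reference_calls(vcf_lines):
    """Keep header lines and records whose genotype is not the reference (0)."""
    return [line for line in vcf_lines if not REFERENCE_CALL.search(line)]


# Hypothetical records, only shaped like a haploid consensus-caller VCF.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample",
    "chrM\t73\t.\tA\t.\t60\t.\tDP=900\tGT:PL\t0:0",
    "chrM\t150\t.\tT\tC\t225\t.\tDP=950\tGT:PL\t1:255,0",
]
kept = drop_reference_calls(example)
for line in kept:
    print(line)
```

Like the grep, this matches on raw text, so a header line that happened to contain whitespace followed by "0:" would be dropped too.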

    Of course, that'll still be in the coordinates of whatever reference was used for the BAM, in this case the hg19 Yoruban chrM. Grabbing the chrM reads and re-mapping them to either the rCRS or RSRS reference would also be possible, but at that point you're starting to look at a full-fledged genome analysis pipeline. Galaxy, Sequencing.com, YFull (for males) and FGC can all do that as well if you can get the BAM file online.
    Last edited by Donwulff; 04-14-2019 at 01:58 PM. Reason: Changed displaying progress and max. read depth to 10,000 to use all data; not that you'll need that much...

  2. The Following 3 Users Say Thank You to Donwulff For This Useful Post:

     bjp (04-14-2019),  ChrisR (04-15-2019),  pmokeefe (04-14-2019)

  3. #622
    Registered Users
    Posts
    205
    Sex

    Quote (originally posted by Giosta):
    My Dante vcf does not contain any chrM lines
    Nor did mine, but today I received working instructions on how to request a report. The first link they sent gave me a 404 error message.

    Go to our website dantelabs.com
    Click on Products
    Click on My Full DNA: Whole Genome Sequencing with mtDNA
    Over the far right, you should see a brief description of what this Kit offers as well as a "form" hyperlink above the cost.
    Click on form and fill out the requested information.


    It should look like the below:

    If you seek to get a Customized Report on a disease or gene panel, after receiving your saliva collection kit, fill out the form

    If results come back aligned to the Yoruba Reference Sequence, James Lick's Advanced utility lets you select that. I don't have any direct experience with it.

    https://dna.jameslick.com/mthap-new/advanced.php

  4. The Following 3 Users Say Thank You to Ann Turner For This Useful Post:

     Donwulff (04-15-2019),  Giosta (04-16-2019),  pmokeefe (04-15-2019)

  5. #623
    Registered Users
    Posts
    424
    Sex
    Location
    Praha, Czech Republic
    Ethnicity
    Czech
    Nationality
    Czech
    Y-DNA
    R-Y14088
    mtDNA
    J1c1i

    Czech Republic Austria Austrian Empire Bohemia Carinthia
    Update:
    Quote (originally posted by Petr):
    Now I sent 3rd to 7th kit:
    Delivered (tracking): 2019-01-03
    and I'm curious to see what will be the turnaround time this year.
    3rd to 7th kit:
    Delivered (DHL tracking): 2019-01-03
    Dante Labs: Kit received: 2019-01-13
    Successful DNA extraction - Level A: 2019-02-27
    DNA Sequencing was completed with success: 2019-03-22
    Your results are ready!: 2019-04-02
    HDD ordered: 2019-04-04
    More results (indel, CNV, SV): 2019-04-10
    So for 4 kits I'm waiting for HDD and one kit still has status "Success DNA - A" and according to Dante support "The expected results are in the early of May".
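The turnaround for kits 3 to 7 can be worked out from the dates listed above. A small Python sketch, using only those dates (the stage labels are shortened and the function name is made up):

```python
from datetime import date

# Milestone dates for the 3rd to 7th kit, copied from the list above.
milestones = [
    ("Delivered", date(2019, 1, 3)),
    ("Kit received", date(2019, 1, 13)),
    ("DNA extraction", date(2019, 2, 27)),
    ("Sequencing completed", date(2019, 3, 22)),
    ("Results ready", date(2019, 4, 2)),
]


def stage_durations(steps):
    """Return (stage label, days) for each pair of consecutive milestones."""
    return [
        (f"{name_a} -> {name_b}", (day_b - day_a).days)
        for (name_a, day_a), (name_b, day_b) in zip(steps, steps[1:])
    ]


durations = stage_durations(milestones)
total_days = (milestones[-1][1] - milestones[0][1]).days

for label, days in durations:
    print(f"{label}: {days} days")
print(f"Total: {total_days} days")
```

That comes to 89 days from delivery to "results ready", with the longest wait (45 days) between kit receipt and DNA extraction.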

    8th to 10th kit:
    Delivered (UPS tracking): 2019-03-08
    Dante Labs: Kit received: 2019-04-02?? (no e-mail received)
    So these kits still have status "Kit received".
    Y-DNA: R-Y14088 (ISOGG: R1b1a1a2a1a2b1c2b1a1a)
    mtDNA: J1c1i (J1c1 + 7735G and 8848C) Extras: 198T 12007A 16422C 16431A

  6. The Following User Says Thank You to Petr For This Useful Post:

     ChrisR (04-15-2019)

  7. #624
    Registered Users
    Posts
    34
    Sex
    Location
    Porto, Portugal
    Nationality
    Portuguese
    Y-DNA
    DF27
    mtDNA
    H100

    Portugal Brazil Spain France Galicia United Kingdom
    There are some relevant sub-topics here, arising from the massive Dante Labs approach.
    1. Which file format would be the best deliverable for home users with genealogical and possibly health interests? A BAM file may be essential for a complete product, but BAM is a very big file full of irrelevant information. VCF only has the results that differ from a reference genome; however, it should also include mtDNA, and it would lack the indels and other undefined information. I think that delivering both BAM and VCF may be the best approach (if the VCF includes mtDNA).
    2. Quality. We can verify the quality of the results. I must say, from my direct knowledge, that Portuguese DNA labs are extraordinarily good, and I also have a very good opinion of the Italian and Finnish ones. They have a track record of a very ethical, research-based attitude, and of respecting the GDPR set by the EU Commission (no Brexit jokes here). FDA reports are not so relevant outside the USA.
    3. Interpreting results. A new avenue is open that leads both to commercially driven services and to highly specialized end users who can do the job themselves. I like to think of myself as part of the second group. GEDmatch closed the door to VCF; I do not know why, but I assume the complexity is very high. There may be room for a seqmatch service accepting only VCF files. For Y, tools to clarify haplogroups like ytree.morleydna.com could be better adapted to this new reality. For mtDNA, the James Lick tool is excellent, but you still need to have the mtDNA results in a usable form, which may include extracting the mtDNA (and Y) from the whole VCF.

    Just some directions for reflection.

  8. The Following User Says Thank You to kafky For This Useful Post:

     ChrisR (04-15-2019)

  9. #625
    Registered Users
    Posts
    34
    Sex
    Location
    Porto, Portugal
    Nationality
    Portuguese
    Y-DNA
    DF27
    mtDNA
    H100

    Portugal Brazil Spain France Galicia United Kingdom
    Success on Y haplogroup definition.

    The steps I took were:

    1. Converted the VCF to raw format with DNA Kit Studio (option without RSID).
    2. In a text editor, deleted all data except the header and the Y results, and saved the file as txt.
    3. Uploaded the resulting file to ytree.morleydna.com.

    Voilà. DF27, without any further definition than this.
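The text-editor step can also be scripted. A minimal Python sketch, assuming the raw file is tab-separated, that header lines start with "#", and that the chromosome is in the second column (as in 23andMe-style raw data). The actual column layout of the DNA Kit Studio output without RSID may differ, so the column index is a parameter. The function name and the example rows are made up.

```python
def keep_header_and_y(lines, chrom_column=1, y_labels=("Y", "chrY")):
    """Keep comment/header lines plus the rows whose chromosome is Y."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            kept.append(line)  # header/comment line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > chrom_column and fields[chrom_column] in y_labels:
            kept.append(line)
    return kept


# Hypothetical rows: identifier, chromosome, position, genotype.
raw = [
    "# rsid\tchromosome\tposition\tgenotype",
    "id1\t1\t10177\tAA",
    "id2\tY\t2655180\tG",
    "id3\tY\t2657176\tC",
    "id4\tMT\t73\tG",
]
y_only = keep_header_and_y(raw)
print("\n".join(y_only))
```

Writing `y_only` back out to a .txt file gives the kind of file described in step 2.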

  10. #626
    Registered Users
    Posts
    201
    Sex
    Location
    South Tyrol, Alps, Central Europe

    Austria Tirol Italy Trentino Alto Adige Germany Bayern Italy Veneto Switzerland France
    Quote (originally posted by kafky):
    1. Which file format would be the best deliverable for home users with genealogical and possibly health interests? A BAM file may be essential for a complete product, but BAM is a very big file full of irrelevant information. VCF only has the results that differ from a reference genome; however, it should also include mtDNA, and it would lack the indels and other undefined information. I think that delivering both BAM and VCF may be the best approach (if the VCF includes mtDNA).
    I think it would suffice for most users interested in genetic genealogy if they also made the uniparental raw data (FASTQ and/or BAM for the Y and mt) available for download, and this should be possible for at least a certain amount of time (3-6 months or even longer). If standardized, it should not add much post-sequencing cost (analysis pipeline and hosting).
    The full FASTQ and BAM can remain an additional "product", delivered either by HDD (or SD card/USB stick) or by download.
    gVCF (records for all sites, whether or not there is a variant call there, plus an accurate estimate of the confidence that each site is homozygous-reference or not) might be worthwhile as a replacement for the currently used VCF.

    Quote (originally posted by kafky):
    3. Interpreting results. A new avenue is open that leads both to commercially driven services and to highly specialized end users who can do the job themselves. I like to think of myself as part of the second group. GEDmatch closed the door to VCF; I do not know why, but I assume the complexity is very high. There may be room for a seqmatch service accepting only VCF files. For Y, tools to clarify haplogroups like ytree.morleydna.com could be better adapted to this new reality. For mtDNA, the James Lick tool is excellent, but you still need to have the mtDNA results in a usable form, which may include extracting the mtDNA (and Y) from the whole VCF.
    I think for Y (and now also for mt) sequence analysis and phylogeny reconstruction the best option is currently YFull. mthap is excellent as long as the latest PhyloTree Build is comprehensive enough. The same goes for other mt/Y phylogeny tools, where the reference used is key to their value.
    Autosomal matching is much more complex, and I'm not an expert. I see:
    a) the difficulty of getting huge numbers of people to upload in one place: while good, GEDmatch and DNA.Land may not have any of the matches needed for a good analysis
    b) the difficulty of moving from sub-1-million-SNP chips to full-genome variants while keeping robust matching against all the old SNP-platform results
    Particularly interested in: DNA/Admixture from Central Europe (Alps, Tyrol, Dolomites); Y-DNA J2a-PF5169, J2a-L210, R1a-M17, R1b-U106; mtDNA J1b1b, J1c1d, U5a2b2, U5b1b1. Projects in which I work: J2a-PF5197, ISOGG Wiki, GenWiki (German)

  11. The Following User Says Thank You to ChrisR For This Useful Post:

     kafky (04-15-2019)

  12. #627
    Registered Users
    Posts
    1,122
    Sex
    Location
    Glasgow, Scotland
    Ethnicity
    Pictland/Deira
    Y-DNA
    R1b-M222-FGC5864
    mtDNA
    H5r*

    One does wonder if YFull will expand further into WGS analysis?
    YSEQ:#37; YFull: YF01405 (Y Elite 2013)
    GEDMatch: A828783 (autosomal DNA), 9427684 (GEDCOM) for segment matching DO NOT POST ADMIXTURE REPORTS USING MY KIT
    WGS (Full Genomes Nov 2015, YSEQ Feb 2019, Dante Mar 2019) - further WGS tests pending ;-)
    Ancestry GCs: Scots in central Scotland & Ulster, Ireland; English in Yorkshire & Pennines

  13. The Following User Says Thank You to MacUalraig For This Useful Post:

     kafky (04-15-2019)

  14. #628
    Registered Users
    Posts
    179
    Sex

    A traditional medical/bioinformatics way of delivering results would be a VCF plus a BED file of confident regions, i.e. which parts of the genome were successfully tested. FTDNA BigY was delivered like this, and after I pointed it out YFull started accepting their VCF files. I assume YFull could do the same for Dante Labs if they started delivering gVCF and/or BED files, though it must be remembered that YFull is in the business of sequencing data. GEDmatch should have pre-emptively applied some default BED files for WGS and exome, derived from a few model samples. The gVCF format might be winning, and I see a singular benefit for it in online use: the callable regions are encoded in the same file, so there's no "forgot to upload the BED or uploaded the wrong one".
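To illustrate how the callable regions sit inside a gVCF, here is a rough Python sketch that turns gVCF-style records into BED intervals. It assumes reference blocks carry an END= tag in the INFO column (a common convention, but not the only one) and that records are sorted by position. It ignores the per-block quality values that a real pipeline would threshold on. The function name and example records are made up.

```python
def gvcf_to_bed(records):
    """Convert (chrom, pos, ref, info) records into merged BED intervals.

    pos is 1-based as in VCF; BED intervals are 0-based and half-open.
    """
    intervals = []
    for chrom, pos, ref, info in records:
        start = pos - 1
        end = start + len(ref)  # span of a plain record
        for item in info.split(";"):
            if item.startswith("END="):
                end = int(item[4:])  # 1-based inclusive END equals the BED end
        if intervals and intervals[-1][0] == chrom and intervals[-1][2] >= start:
            # Adjacent or overlapping with the previous interval: extend it.
            prev = intervals[-1]
            intervals[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            intervals.append((chrom, start, end))
    return intervals


# Hypothetical records: two reference blocks around a variant site, then a gap.
records = [
    ("chrM", 1, "G", "END=72"),
    ("chrM", 73, "A", "."),
    ("chrM", 74, "T", "END=100"),
    ("chrM", 151, "C", "END=200"),
]
bed = gvcf_to_bed(records)
for chrom, start, end in bed:
    print(f"{chrom}\t{start}\t{end}")
```

Positions 101 to 150 have no record in this example, so they fall outside the BED output, which is exactly the no-call information a plain VCF loses.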

    Dante Labs & Sequencing.com have already declared they're working on the transfer ("working seamless within few weeks" or something like that) of the raw data to Sequencing.com, where people can get the additional reports they need for a slight charge. I get that people want everything for free, and quite honestly I could only see that as a strong sales benefit for Dante Labs, but being able to get most formats you want, up to date, for a small charge is a great option. The first wellness report from Dante Labs is already produced with Sequencing.com. I don't know the technical details of that, but it seems that Sequencing.com *already* has access to your results (likely the VCF) whether you choose to use their services further or not.

    Not sure what's up with GEDmatch Genesis/VCF, since I had to leave there after their limits on "artificial kits", which seem like they could cover most bioinformatics workflows (i.e. anything not directly obtained from a vendor), but back then it seemed that gVCF files were too large to upload there as is. A kit which has the superset of SNPs on all autosomal tests might be one of the best options. For myself (before obtaining the BAM file) I experimented with putting the sequencing results through the Haplotype Reference Consortium imputation pipeline, which has two benefits: filling in any missing no-calls, and reducing file size to known variants only. Of course, imputation can also add some spurious matches/non-matches, so this is clearly inferior to knowing the no-calls.

    Useful deliverables would overall include:
    gVCF file for Promethease; of course companies try to provide their own health interpretations for most consumers.
    23andMe/AncestryDNA tab-files with identical SNPs (some sites don't seem to be able to handle extra SNPs) for current genealogical services
    mtDNA FASTA for mtDNA analysis
    FASTQ/BAM as currently
    Y-chromosome & mtDNA BAM separated (because YFull takes either/both)
    Additionally:
    VCF file with all latest dbSNP variants, whether ancestral, derived or no-call, and any derived novel variants.
    BED file of confident regions.
    A variant browser that can handle those, online for mobile devices like most genomics services provide right now.
    For future:
    BAM of unmapped/weakly mapped reads because metagenome (ie. bacteria etc.) is still poorly tapped
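As a rough illustration of why the confident regions matter for the chip-style tab-files in the list above, here is a Python sketch that builds 23andMe-style rows for a fixed SNP list. It assumes diploid autosomal sites, a target list that already carries the reference allele, variant genotypes already decoded to allele letters, and "--" as the no-call marker. All names and data here are hypothetical.

```python
def call_targets(targets, variants, callable_bed):
    """Build tab-separated rows: id, chromosome, position, genotype.

    targets:      list of (snp_id, chrom, pos, ref_allele), pos is 1-based
    variants:     dict mapping (chrom, pos) to a genotype string such as "AG"
    callable_bed: list of (chrom, start, end), 0-based half-open intervals
    """
    def is_callable(chrom, pos):
        # 1-based pos lies in a 0-based half-open interval when start < pos <= end.
        return any(c == chrom and s < pos <= e for c, s, e in callable_bed)

    rows = []
    for snp_id, chrom, pos, ref in targets:
        if (chrom, pos) in variants:
            genotype = variants[(chrom, pos)]   # variant call from the VCF
        elif is_callable(chrom, pos):
            genotype = ref + ref                # covered, so homozygous reference
        else:
            genotype = "--"                     # not covered: no-call
        rows.append("\t".join([snp_id, chrom, str(pos), genotype]))
    return rows


# Hypothetical chip SNP list, sample variants and confident regions.
targets = [("id1", "1", 1000, "A"), ("id2", "1", 2000, "C"), ("id3", "1", 9000, "T")]
variants = {("1", 1000): "AG"}
callable_bed = [("1", 500, 2500)]
rows = call_targets(targets, variants, callable_bed)
print("\n".join(rows))
```

Without the BED (or a gVCF), the second and third SNPs would be indistinguishable: both are simply absent from the VCF, yet only one of them is a true reference call.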

    Indels can be in the same VCF file as far as I can see, as long as they're normalized in the normal format. Everything I know of just ignores indels it can't handle, and the same VCF format handles indels and SNPs just fine. In fact, in some cases you can have something like "Ancestral: A, Sample: G *and* AG", so separating them doesn't make much sense. Structural variants, where a segment can be flipped or moved elsewhere, are different, but few things use them yet and short-read sequencing isn't good at detecting them.
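A small Python sketch of how a tool might tell the allele types apart within one VCF record, using the "A, G and AG" case above. It only looks at allele lengths and does not handle symbolic alleles such as structural-variant tags. The function names are made up.

```python
def classify_alt(ref, alt):
    """Classify one ALT allele against REF by length only."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "MNP"  # same length, more than one base


def classify_record(ref, alts):
    """Classify every ALT allele of a (possibly multi-allelic) record."""
    return [classify_alt(ref, alt) for alt in alts]


# The example from the post: reference A, sample carries G and AG.
mixed = classify_record("A", ["G", "AG"])
print(mixed)
```

A tool that only understands SNPs can keep the first allele and skip the second, which is why one combined file works.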

    I'm intentionally describing (somewhat) what would be useful "right now", but it should be borne in mind that what's most useful next year may not be the same, which is one reason the "Vendor provides everything at once and you need nothing more" isn't realistic, and third party services serve a part in converting bioinformatics formats (To say nothing of re-analysis with new genomic references and tools). With race-to-the-bottom sequencing costs you can't necessarily expect vendors to provide these & updates for free indefinitely, though it's certainly an important business edge.

    Also, DNA.Land, MyHeritage and other established genealogical matching sites that use imputation can technically start taking whole-sequence data almost any time they want, which is interesting. In some ways it doesn't make sense to start a competing service, because they could beat any competition the moment they want to, but until then that service is missing. (However, whole-sequence data doesn't currently add much to autosomal matching, because genealogical matches come in IBD segments; it's more about data compatibility, and perhaps later genome phasing and indels.)

    Every company that targets/markets to the EU is bound by the GDPR. Of course, just because there's a law for something doesn't mean everybody will obey it; on the other hand, the GDPR means that there is at least some recourse available should a company get caught breaking it. However, that's veering dangerously into opinions and controversial topics... (Whee, long post now, lol)
    Last edited by Donwulff; 04-15-2019 at 11:38 PM.

  15. The Following User Says Thank You to Donwulff For This Useful Post:

     kafky (04-15-2019)

  16. #629
    Registered Users
    Posts
    179
    Sex

    If people are getting their BAMs on hard drives, it's relatively easy to build a JavaScript/browser-based BAM reader that would extract and send over only the chrY or mtDNA mapped reads. I've been racking my brain trying to figure out how you would *prove* that only the selected data is sent, though. But I can't see hard-disk delivery surviving long in the Internet age (and the Sequencing.com deal is a good indication), so it seems a waste of effort, although I'm aware that "sneaker-net" beats the best Internet transfer speeds if we're starting to see really massive data. The latest mobile phones have something like 1TB of storage, though, so perhaps people will be keeping their whole family's BAM files on one, so they only have to worry about it being stolen, scrambled or hacked.

    An autosomal BAM probably has less direct use once it has been processed into a high-quality gVCF/VCF+BED, beyond very computationally intensive re-analysis for newish references etc. That said, long-read sequencing and polishing with short-read raw data are a good example of new uses which may not be immediately apparent. Joint genotyping, family-trio analyses etc., which take advantage of similarities between the genomes of close relatives, are another significant use for the BAM. But of course, for starters you need one really good genotyping of the sequence anyway.
    Last edited by Donwulff; 04-15-2019 at 10:10 PM.

  17. #630
    Registered Users
    Posts
    31
    Sex

    If you inquired about your kit and received the following response
    "I have checked on your DNA sample and appears to be still in the sequencing process which is a long and complex process, composed of several steps. Your results were expected to be ready on 12th last month. We are very sorry we failed to inform you there could be delays."

    You should be alarmed, because it is auto-generated. It sounds personal but is actually auto-generated. Has anyone received results from last year's Thanksgiving promotion? I am curious as to whether they sequence the full-pay customers first or a randomly selected few first. If the latter, some from last year's promo should have gotten their results by now.

  18. The Following User Says Thank You to ybmpark For This Useful Post:

     kafky (04-16-2019)
