Page 78 of 80
Results 771 to 780 of 800

Thread: Dante Labs (WGS)

  1. #771
    Junior Member
    Posts
    6
    Sex

    Quote Originally Posted by Donwulff View Post
    how will you be able to tell the third-party software won't upload your genome, and everything else found on your computer at the same time, to a server in China or Russia or whatever your Evil Empire is?
    The program is open source, source is included.

  2. #772
    Registered Users
    Posts
    257
    Sex

    Quote Originally Posted by mwasser View Post
    The program is open source, source is included.
    So did you investigate and analyze all the binaries in the package? The issue is hypothetical, of course, much like people objecting before that they can't use Sequencing.com because they may send their DNA data to China (Or Russia in the case of YFull). I'm just pointing out that running a Windows software package is more insecure than sending your DNA data for third party analysis.
    In this case there was absolutely no information about the author/provenance of the software, or its license.

    I did download the package, and it's a bunch of binaries and a couple of scripts, with the reference genomes making up the bulk of the download. Source is actually not included, but everything looks to be open source. Individual people aren't really equipped to audit binaries, much less source code, though. Open source protects against trojans & backdoors when you pull your code from a trusted location used by thousands of users and developers, but I'm not really here to argue that... If you want to be relatively safe, you could always run it in a throwaway virtual machine without a network connection.

    However, we really need secure web-services for processing genomic data. Unfortunately, as noted, Sequencing.com has started returning "403 Forbidden" to my attempts to retrieve/share my BAM files, which suggests either they didn't have backups or their storage isn't as unrestricted as they implied (No, I didn't contact their support yet, I just Googled/searched their knowledge base and was surprised to find no matches). So I'm not sure I can recommend them any more on data integrity grounds.

  3. #773
    Junior Member
    Posts
    7
    Sex
    Nationality
    Finnish
    Y-DNA
    I-Y39168
    mtDNA
    U5b1b1a1a

    Finland
    Quote Originally Posted by Donwulff View Post
    However, we really need secure web-services for processing genomic data. Unfortunately, as noted, Sequencing.com has started returning "403 Forbidden" to my attempts to retrieve/share my BAM files, which suggests either they didn't have backups or their storage isn't as unrestricted as they implied (No, I didn't contact their support yet, I just Googled/searched their knowledge base and was surprised to find no matches). So I'm not sure I can recommend them any more on data integrity grounds.
    I've done a few EvE premium runs lately, and every time the results return a 403 and you have to ask support to give you a "48 hour" link so you can download them to your own computer. Paradoxically, they can't save the results under your account; you actually have to upload the file yourself if you want to use it on sequencing.com. I really need to look into building my own cluster with working pipelines so I can stop paying sequencing.com for each run.

  4. #774
    Registered Users
    Posts
    3,816
    Sex
    Y-DNA
    R1b
    mtDNA
    H

    ...and thirdly assuming this is due to privacy, how will you be able to tell the third-party software won't upload your genome, and everything else found on your computer at the same time, to a server in China or Russia or whatever your Evil Empire is?
    ... much like people objecting before that they can't use Sequencing.com because they may send their DNA data to China (Or Russia in the case of YFull). I'm just pointing out that running a Windows software package is more insecure than sending your DNA data for third party analysis.
    These are considerations. I'm not sure I agree that Windows is less secure than sending your DNA data for third-party analysis. I don't think anyone can answer that with some objective numbers.

    Regardless, sending your Whole Genome Sequencing (WGS) data across the internet (anywhere) is something to consider heavily. Essentially, your WGS data is the blueprint to yourself as well as parts of your family members.

    This is the reason I won't recommend WGS at this point. We need clear laws on this and secure systems. I don't think mom and pop businesses can support this. The big governments and hackers are fighting cyberwars.

  5. #775
    Registered Users
    Posts
    244
    Sex
    Location
    Wisconsin, USA
    Nationality
    American
    Y-DNA
    R1b-FGC29071
    mtDNA
    U5a1b1g*

    Ireland England Netherlands Germany France
    Quote Originally Posted by Mikewww View Post
    These are considerations. I'm not sure I agree that Windows is less secure than sending your DNA data for third-party analysis. I don't think anyone can answer that with some objective numbers.
    There is little stopping a trojan Windows application from phoning home with your genetic data. How does the operating system know if small encrypted packets broadcasting back to AWS contain sensitive information instead of typical check-for-updates chatter? Unless you are compiling the source for these newer GUIs popping up, after auditing the code, you have no idea what's in the distributed archives. Most interested parties don't have the capacity to compile the code, much less understand the details of its operations.

    On the other hand, if you are installing the individual components like GATK, bwa, Yleaf, etc. from their repositories, there are thousands of eyes on the ball. There's little risk in doing the analysis from a command prompt, and judicious use of Google makes that pretty simple these days.

    Web-hosted services like sequencing.com are convenient, but unless their marketplaces pick up or they score million-dollar contracts to access the data, I don't see them remaining free. Cloud compute and storage resources are not inexpensive.

  6. #776
    Registered Users
    Posts
    257
    Sex

    I should stress I have no information that WGS Extract is dangerous, and I'm not making that accusation; it's probably safe. I've just seen people be very security-conscious even on that thread, so I thought to mention that running a program (or a script running programs) found on the Internet on your genome probably isn't the best course if you're worried about the security of your genomic data. And it was more than a little concerning to me that the web page gives no hint as to the author/origin/license of that software. In 2017 there was new malware every 4.2 seconds; today they're popping up faster than new humans: https://www.gdatasoftware.com/blog/2...re-trends-2017 But yes, one way to mitigate would be to run it in some sort of sandbox, like a virtual machine without network access.

    Sequencing.com is a good case in point on the price. They started off with some nice, free analysis products, but at the present time I think "gVCF to VCF" is the ONLY free thing they have, and the individual tool prices have been rising exponentially, so they're practically pricing themselves out of the market. I don't think "free" is the important word here, but $19.99 for extracting a 23andMe-compatible file? You're probably better off buying a 23andMe test. Another good example is Helix: they even do their own sequencing, but apparently the sequencing & genomics marketplace wasn't profitable enough, so they made a strategic shift away from personal genomics. Of course, getting a computer cluster & learning all the tools to use on it for a one-time genomic analysis is still going to be a way, way more expensive option. Galaxy is still free (until people rush that), but it takes more effort and doesn't have specific tools. One could always just drop a pre-configured virtual machine on AWS/Azure and pay for the computing time and storage, though that has a somewhat similar issue to the "found binary on the Internet".

  7. #777
    Junior Member
    Posts
    6
    Sex

    WGS Extract has got the license text included. It is the file "wgsextract-license" in the "open_source_licenses" subdirectory of the download archive. It is GPL V3.

    The source code is also included. It is the file "wgsextract.py" in the "programs/wgsextracty" subdirectory. The program itself even runs from source; there is no binary. The batch file in the main folder calls the Python interpreter and lets it run wgsextract.py.
    If you look at the Python source code, you can see that this file doesn't do anything evil. It just shows the GUI, using the official Tkinter module of Python, and runs samtools, bcftools and other open source products.
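
    For anyone who wants a quick first pass before reading the whole file, here is a minimal sketch (not part of WGS Extract) that lists which modules a Python script imports and flags the ones usually used for network access. The function names and the NETWORK_MODULES watch-list are my own assumptions for illustration, and the sample string only stands in for a real script.

```python
import ast

# Assumed watch-list for illustration only; it is not exhaustive.
NETWORK_MODULES = {"socket", "ssl", "http", "urllib", "ftplib", "smtplib", "requests"}


def imported_modules(source: str) -> set[str]:
    """Return the top-level names of all modules imported in the given source."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os.path" is recorded as "os"
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from urllib.request import urlopen" is recorded as "urllib";
            # "from . import x" has no module name and is skipped
            found.add(node.module.split(".")[0])
    return found


def network_imports(source: str) -> set[str]:
    """Return the imported modules that appear on the watch-list."""
    return imported_modules(source) & NETWORK_MODULES


if __name__ == "__main__":
    # Stand-in source text; for a real check, read the script from disk instead.
    sample = "import tkinter\nimport subprocess\nfrom urllib.request import urlopen\n"
    print(sorted(imported_modules(sample)))
    print(sorted(network_imports(sample)))
```

    This only looks at static import statements. It says nothing about dynamic imports, or about what the external programs started by the script do, so an empty result is a hint and not a proof.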

    The only binary files in the archive are the third-party software products that come included, so that the user doesn't have to install anything and can run it immediately. Those binaries don't open any network connections, as can be seen in a network sniffer. But if somebody still suspects something evil, then he can download Python, Cygwin etc. from the official homepages and overwrite the .exe and .dll files in the archive with them. However, somebody who is that cautious would also not upload his whole genome to a commercial company like sequencing.com.
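
    Before overwriting anything, one could also just compare the bundled binaries against the official downloads by hash. A minimal sketch, assuming both directory trees use the same relative paths (which a real official install may not); the function names, directory names and file contents are made up for the demo.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_binaries(bundled_dir: Path, official_dir: Path,
                        suffixes=(".exe", ".dll")) -> list[str]:
    """List bundled binaries that are missing from, or differ from, official_dir.

    Assumption: both trees use the same relative paths.
    """
    mismatches = []
    for bundled in sorted(bundled_dir.rglob("*")):
        if not bundled.is_file() or bundled.suffix.lower() not in suffixes:
            continue
        relative = bundled.relative_to(bundled_dir)
        official = official_dir / relative
        if not official.is_file() or sha256_of(bundled) != sha256_of(official):
            mismatches.append(relative.as_posix())
    return mismatches


if __name__ == "__main__":
    # Self-contained demo with made-up file names and contents.
    with tempfile.TemporaryDirectory() as tmp:
        bundled, official = Path(tmp, "bundled"), Path(tmp, "official")
        bundled.mkdir()
        official.mkdir()
        (bundled / "same.dll").write_bytes(b"identical")
        (official / "same.dll").write_bytes(b"identical")
        (bundled / "tool.exe").write_bytes(b"bundled build")
        (official / "tool.exe").write_bytes(b"official build")
        print(mismatched_binaries(bundled, official))
```

    A mismatch does not prove tampering, since a different version or build of the same tool will hash differently; it only tells you which files are not byte-identical to the copy you trust.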

    That's just my humble opinion. At the end of the day, everybody has to decide for himself which way to take to extract the files.

  8. #778
    Registered Users
    Posts
    3,816
    Sex
    Y-DNA
    R1b
    mtDNA
    H

    Quote Originally Posted by Mikewww View Post
    ... Regardless, sending your Whole Genome Sequencing (WGS) data across the internet (anywhere) is something to consider heavily. Essentially, your WGS data is the blueprint to yourself as well as parts of your family members.

    This is the reason I won't recommend WGS at this point. We need clear laws on this and secure systems. I don't think mom and pop businesses can support this. The big governments and hackers are fighting cyberwars.
    How does the hard drive method of transmission work in terms of safety? Does it provide extra security, or are there holes in this method?

    I have some information on my home systems cordoned off network-wise and only backed up with portable hard drives that go into safes. Maybe I'm overdoing it, but there is some data I want only my family to be able to retrieve. Hard drives are cheap and everyone should have a solid fireproof safe.

  9. #779
    Registered Users
    Posts
    244
    Sex
    Location
    Wisconsin, USA
    Nationality
    American
    Y-DNA
    R1b-FGC29071
    mtDNA
    U5a1b1g*

    Ireland England Netherlands Germany France
    That depends on chain of custody. Many of the current WGS offerings are sending the bar-coded sample to partners for sequencing. You would have to inquire about how they get the FASTQ data back before delivery to you. I would imagine it's a mix of encrypted file transfer or hard disk depending on volume.

  10. #780
    Registered Users
    Posts
    22

    Virtually all trojan/virus/malicious software is created to make money in one way or another.
    While stealing/uploading DNA seq data is probably not that hard, I just can't think of a scenario where a trojan writer could make money off the genomes stolen from random people.
    Maybe with the exception of targeted thefts aimed at politicians and celebrities who could be blackmailed.

    Once DNA printing advances, it might become useful to frame someone for a crime, but at the same time the public trust in DNA fingerprinting would nosedive, so it might not make much of a difference.
    Those who have the most to gain from DNA data would probably be the insurance industry, but they won't risk the media firestorm if they are exposed buying customer DNA from dodgy botnet operators.

    At the moment, the cost/payoff ratio for DNA theft just means that a criminal is much better off fishing for your bank data than creating a DNA database in the hopes of selling it at some point.
    For authoritarian nation-state actors the equation changes a bit, but I still don't think even China would be interested in creating a DNA database for foreigners - if I were a Chinese citizen, I would be concerned.

    I guess we'll have to live with the idea that our genome is not private - not any more private than a literal fingerprint. Nanopore sequencers are already the size of a large flash drive, and most of the tech in there is commodity electronics - the price will continue to go down. We are literally one order of magnitude in price reduction away from $50, and then lifting a full genome off of a used coffee cup will be as easy as lifting off a fingerprint.

    Trying to ban people from using tech that is cheap and can be used secretly is not going to be successful, the best we can hope for is stopping companies from discriminating using it.
