Thread: Admixture Databases and models

  1. #1
    Registered Users
    Posts
    1,780
    Sex
    Location
    Canada
    Ethnicity
    Germanic
    Nationality
    Canadian
    Y-DNA (P)
    R1B-DF99(FGC16982)

    Germany Imperial Normandie Switzerland

    Admixture Databases and models

    Does anyone know whether the databases behind the GEDmatch admixture projects are available to the public? I would be interested in trying different statistical models for matching autosomal signatures. Thanks.

  2. #2
    Registered Users
    Posts
    2,289
    Sex
    Location
    Canada
    Ethnicity
    Mixed Euro/Near East
    Nationality
    Canadian
    Y-DNA (P)
    R1a-YP4516/YP4807*
    mtDNA (M)
    H11a2a3

    Canada Franco-Manitoban European Union Ottoman Empire Russia Imperial United States Grand Union
    ^ If you mean the population sample sets for the autosomal admixture analyses, those are mainly taken from direct user submissions to the various projects. Out of curiosity, what models were you thinking of using instead of Fst?

    Other ancestral Y lines:

    E1b-M81 Ukraine (Ashkenazi)
    E1b-V13 England
    I1-M253 Ireland
    I2-M423 Ukraine
    R1a-L176.1 Scotland
    R1b-L584 Syria/Turkey (Sephardi)
    R1b-L20 Ireland
    R1b-L21 (1)England; (2)Wales?>Connecticut
    R1b-L48 England
    R1b-P312 Scotland
    R1b-FGC32576 Ireland

    Other ancestral mtDNA lines:

    H1b2a Ukraine (Ashkenazi)
    H6a1a3 Ukraine
    K1a9 Belarus (Ashkenazi)
    K1c2 Ireland
    V7a Ukraine

  3. #3
    I was thinking of training neural networks on the data, sub-population by sub-population, to build and classify a library of aggregated admixture signatures. That library would then serve as the base for detecting which class a given signature (mine, yours, etc.) belongs to. The program would output the likelihood of belonging to each aggregated class. The Oracle does something like that, but I don't know its statistical methodology.
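The idea in this post can be sketched in a few lines of Python. This is only a minimal illustration, not anything GEDmatch or the Oracle is known to use: it trains a single softmax layer (the simplest neural classifier, standing in for a larger network) on a handful of invented signatures. The component fractions, the labels `pop_A`, `pop_B`, `pop_C`, and all function names are hypothetical.

```python
import math
import random

# Hypothetical training set: each signature is a list of admixture
# component fractions, labelled with a made-up sub-population name.
TRAIN = [
    ([0.52, 0.28, 0.14, 0.06], "pop_A"),
    ([0.49, 0.31, 0.13, 0.07], "pop_A"),
    ([0.30, 0.50, 0.08, 0.12], "pop_B"),
    ([0.27, 0.53, 0.09, 0.11], "pop_B"),
    ([0.25, 0.10, 0.40, 0.25], "pop_C"),
    ([0.22, 0.12, 0.42, 0.24], "pop_C"),
]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, biases, x):
    """Linear score per class: w . x + b."""
    return [sum(w * xi for w, xi in zip(ws, x)) + b
            for ws, b in zip(weights, biases)]

def train(data, epochs=2000, lr=0.5, seed=0):
    """Fit a single softmax layer by stochastic gradient descent."""
    rng = random.Random(seed)
    classes = sorted({label for _, label in data})
    n_in = len(data[0][0])
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_in)]
               for _ in classes]
    biases = [0.0] * len(classes)
    for _ in range(epochs):
        for x, label in data:
            probs = softmax(class_scores(weights, biases, x))
            for k, cls in enumerate(classes):
                # Cross-entropy gradient with respect to the class score.
                grad = probs[k] - (1.0 if cls == label else 0.0)
                for j in range(n_in):
                    weights[k][j] -= lr * grad * x[j]
                biases[k] -= lr * grad
    return classes, weights, biases

def predict(model, x):
    """Return a {class: probability} mapping for one signature."""
    classes, weights, biases = model
    return dict(zip(classes, softmax(class_scores(weights, biases, x))))

model = train(TRAIN)
print(predict(model, [0.29, 0.51, 0.09, 0.11]))
```

With real data the training rows would be the per-individual component percentages from a project's sample set, and a held-out portion of each sub-population would be needed to check that the predicted probabilities mean anything.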




  4. #4
    ^ Interesting thought. I believe all the admixture analyses online use the standard Fst model to find components, then compare an individual's combination of components against the sample populations to find the least-squares fit. But one other approach I've seen is D-statistics:

    http://dienekes.blogspot.com/2012/12...omponents.html
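The least-squares comparison described in this post can be illustrated as follows. This is a sketch under stated assumptions, not the code of any actual calculator: the reference averages in `REFERENCE`, the population names, and the function names are invented, and the fit is taken to be the plain Euclidean distance between an individual's component percentages and each population's average.

```python
import math

# Hypothetical population averages, in percent per admixture component.
REFERENCE = {
    "pop_A": [50.0, 30.0, 14.0, 6.0],
    "pop_B": [28.0, 52.0, 9.0, 11.0],
    "pop_C": [24.0, 11.0, 41.0, 24.0],
}

def distance(signature, reference):
    """Square root of the summed squared differences, component by component."""
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(signature, reference)))

def rank_populations(signature, references):
    """Sort the reference populations from closest to farthest."""
    scored = [(name, distance(signature, ref)) for name, ref in references.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Usage: an invented individual signature, in the same component order.
for name, dist in rank_populations([47.0, 33.0, 13.0, 7.0], REFERENCE):
    print(f"{name}: {dist:.2f}")
```

The smaller the distance, the closer the individual sits to that population's average signature.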
     


  5. The Following User Says Thank You to AJL For This Useful Post:

     dp (03-11-2016)

  6. #5
    I will look into that, thanks. Ultimately, once a population cluster is defined and known, it becomes a simple classification problem: how close is an individual to an aggregated signature? There are different ways to approach this, but I believe that using a NN to classify and detect matching patterns is a valid one. I believe PCA is already used in some research that looks at the shared fraction of the sample variance. I need to better understand the Oracle method that is used right now. It is interesting to see how forcing a 2-, 3- or 4-population solution generates quite informative results.
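A forced two-population solution of the kind mentioned here can be sketched as below. The poster does not know how the Oracle works, so this is not its algorithm, only one plausible reading: for every pair of reference populations, find the mixing proportion that minimises the squared distance to the individual's signature, then rank the pairs by what is left over. The `REFERENCE` averages and all names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical population averages, in percent per admixture component.
REFERENCE = {
    "pop_A": [50.0, 30.0, 14.0, 6.0],
    "pop_B": [28.0, 52.0, 9.0, 11.0],
    "pop_C": [24.0, 11.0, 41.0, 24.0],
}

def best_two_way_mix(signature, p, q):
    """Least-squares share t of p (and 1 - t of q), clipped to [0, 1]."""
    diff = [pi - qi for pi, qi in zip(p, q)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        t = 1.0  # identical references: any split gives the same mix
    else:
        t = sum((s - qi) * d for s, qi, d in zip(signature, q, diff)) / denom
        t = min(1.0, max(0.0, t))
    mix = [t * pi + (1.0 - t) * qi for pi, qi in zip(p, q)]
    residual = math.sqrt(sum((s - m) ** 2 for s, m in zip(signature, mix)))
    return t, residual

def rank_pairs(signature, references):
    """Return (name_1, name_2, share_of_name_1, residual), best fit first."""
    results = []
    for (n1, p), (n2, q) in combinations(references.items(), 2):
        t, residual = best_two_way_mix(signature, p, q)
        results.append((n1, n2, t, residual))
    return sorted(results, key=lambda row: row[3])

# Usage: an invented signature that is an even blend of pop_A and pop_C.
for n1, n2, t, residual in rank_pairs([37.0, 20.5, 27.5, 15.0], REFERENCE):
    print(f"{t:.0%} {n1} + {1 - t:.0%} {n2}  (distance {residual:.2f})")
```

Extending this to 3 or 4 populations means solving the same least-squares problem with more mixing weights, constrained to be non-negative and to sum to 1.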


