Results 1 to 10 of 12

Thread: How to build G25 coordinates from Raw data

  1. #1
    Registered Users
    Posts
    119
    Sex
    Ethnicity
    North Africa
    Nationality
    French
    Y-DNA (P)
    E-M183
    mtDNA (M)
    H1

    France Morocco Occitania

    How to build G25 coordinates from Raw data

    Hello,
    Who knows how to build G25 coordinates from raw data?
    And what are the differences between scaled and unscaled coordinates?

  2. The Following User Says Thank You to Itrane2000 For This Useful Post:

     Hurricane (10-28-2020)

  3. #2
    Registered Users
    Posts
    722
    Sex
    Location
    Poland
    Nationality
    Polish
    Y-DNA (P)
    R1a1a1b1a2b3
    mtDNA (M)
    U5a2b

    Poland Russian Federation Bulgaria Ottoman Empire Israel
    Quote Originally Posted by Itrane2000 View Post
    Hello,
    Who knows how to build G25 coordinates from raw data?
    And what are the differences between scaled and unscaled coordinates?
    If you want your G25 coordinates, you should contact Davidski and send him your raw data: https://eurogenes.blogspot.com/2017/...ore-to-be.html
    I would also like to know the difference between scaled and unscaled, but I could not find an answer on this forum or on other websites.
    23andme 5.9: 86,8% Eastern European, 9,1% Greek and Balkan, 0,3% Spanish and Portuguese, 1,2% Anatolian, 0,7% Ashkenazi Jewish, 1,4% Broadly Southern European, 0,5% Broadly European

    Distance: 1.8082% / 0.01808160
    Target: Cyryl_scaled | ADC: 0.25x RC
    85.3 East_Europe
    7.4 Southeast_Europe
    5.0 Sephardic
    2.3 Asia_Minor



    Ancestry: 68,75% Polish minority in Ukraine, 12,5% Northern Russian, 12,5% Alfatar Bulgarian ( + Sephardic ancestry ) , 6,25% Kuban Cossack

  4. The Following User Says Thank You to CyrylBojarski For This Useful Post:

     AliB12 (11-02-2020)

  5. #3
    Registered Users
    Posts
    119
    Sex
    Ethnicity
    North Africa
    Nationality
    French
    Y-DNA (P)
    E-M183
    mtDNA (M)
    H1

    France Morocco Occitania
    Quote Originally Posted by CyrylBojarski View Post
    If you want your G25 coordinates, you should contact Davidski and send him your raw data: https://eurogenes.blogspot.com/2017/...ore-to-be.html
    I would also like to know the difference between scaled and unscaled, but I could not find an answer on this forum or on other websites.
    I already have my G25 coordinates, but there is a lot of public raw data in genetic databases on the web that has not been converted to G25, and I would like to convert it.

  6. #4
    Registered Users
    Posts
    928
    Sex
    Location
    Krakow
    Ethnicity
    Polish
    Nationality
    Pole
    Y-DNA (P)
    R1a > M458 > Y23110
    mtDNA (M)
    J1c1

    Poland
    Quote Originally Posted by Itrane2000 View Post
    I already have my G25 coordinates, but there is a lot of public raw data in genetic databases on the web that has not been converted to G25, and I would like to convert it.
    Only Davidski can do this for G25.
    If you think something is missing, I would recommend sending this data to Davidski.

  7. The Following User Says Thank You to michal3141 For This Useful Post:

     Ruderico (10-28-2020)

  8. #5
    Registered Users
    Posts
    159
    Sex
    Location
    Finland, Pohjois-Pohjanmaa
    Ethnicity
    Finn
    Nationality
    Finnish
    Y-DNA (P)
    I1a1b-L22 [P109-]

    Finland
    Quote Originally Posted by Itrane2000 View Post
    I already have my G25 coordinates, but there is a lot of public raw data in genetic databases on the web that has not been converted to G25, and I would like to convert it.
    What data specifically, may I ask?

  9. #6
    Gold Class Member
    Posts
    2,957
    Sex
    Location
    Pennsylvania
    Ethnicity
    West European
    Nationality
    USA
    Y-DNA (P)
    R1b-U152-Z36+FGC6511
    mtDNA (M)
    H11a2a

    United States of America Germany England France Scotland Ireland
    On unscaled vs scaled coordinates: the G25 coordinates are a 25-dimensional principal component analysis (PCA) abstraction of your raw DNA that lets you visualize your position on a genetic map relative to world populations. It turns out that these coordinates also work well with optimization algorithms that model your admixture/ancestry. The first dimension captures the largest share of the variation, the second dimension the next largest, and so on. The later dimensions delineate smaller kinds of variation.

    To make the numbers useful on a graph, the PCA "normalizes" the dimensions so that the X and Y axes are similar in length if, for example, you want to plot yourself and others on a two-dimensional graph of PC1 against PC2. These original numbers are the "unscaled" raw coordinates.

    Someone got the idea that the relative importance of the dimensions, from first to last, could be used to modify the raw numbers so that the modeling tool also puts more weight on the more important dimensions. The amount of variation each dimension represents is associated with what is called the eigenvalue of that dimension. If I am remembering correctly, each unscaled coordinate is multiplied by the square root of its dimension's eigenvalue to create the "scaled" value.

    Using the scaled values, I think, can make some modeling tasks easier, because the lesser dimensions can introduce some "noise" into the model, and the scaling greatly reduces or almost eliminates their impact on the model. However, some of those lesser dimensions distinguish things like Siberians from Amerinds, or different kinds of African populations, in useful ways. So I think those dimensions are useful for my family, for example, who have some minor admixture of those types. You just have to be a little more careful.

    Still, the scaled values can make some tasks easier, like modeling with the really ancient references, so it helps to be adaptable. In this thread, I show in visual terms how each dimension impacts the model, first unscaled and then with the scaled values:

    https://anthrogenica.com/showthread....l=1#post566344
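
    To make the scaling step concrete, here is a minimal Python sketch, assuming the square-root rule as I remember it. The eigenvalues and sample coordinates are made up for illustration and are not the real G25 values, and the function names are my own.

```python
import math

def scale_coordinates(unscaled, eigenvalues):
    """Multiply each unscaled PC value by the square root of its eigenvalue."""
    if len(unscaled) != len(eigenvalues):
        raise ValueError("need one eigenvalue per dimension")
    return [u * math.sqrt(ev) for u, ev in zip(unscaled, eigenvalues)]

def euclidean_distance(a, b):
    """Plain Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical eigenvalues for a 3-dimensional toy PCA (NOT the G25 values).
eigenvalues = [100.0, 25.0, 1.0]

# Made-up samples: B differs from A only in the last (minor) dimension,
# C differs from A only in the first (major) dimension, by the same amount.
sample_a = [0.010, 0.020, 0.030]
sample_b = [0.010, 0.020, 0.050]
sample_c = [0.030, 0.020, 0.030]

scaled_a = scale_coordinates(sample_a, eigenvalues)
scaled_b = scale_coordinates(sample_b, eigenvalues)
scaled_c = scale_coordinates(sample_c, eigenvalues)

# Unscaled: both differences count equally.
d_ab_unscaled = euclidean_distance(sample_a, sample_b)
d_ac_unscaled = euclidean_distance(sample_a, sample_c)

# Scaled: the difference in the major dimension dominates.
d_ab_scaled = euclidean_distance(scaled_a, scaled_b)
d_ac_scaled = euclidean_distance(scaled_a, scaled_c)

print("unscaled:", d_ab_unscaled, d_ac_unscaled)
print("scaled:  ", d_ab_scaled, d_ac_scaled)
```

    With unscaled values the two differences look equally large to a distance-based modeling tool; after scaling, the difference in the first dimension counts far more than the one in the last dimension.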

    I hope that helps!
    Last edited by randwulf; 10-28-2020 at 09:49 PM.

  10. The Following 9 Users Say Thank You to randwulf For This Useful Post:

     Bart (11-02-2020),  jadegreg (10-29-2020),  jelliedsoup (10-28-2020),  JerryS. (10-29-2020),  maroco (10-30-2020),  Michalis Moriopoulos (10-29-2020),  ph2ter (10-29-2020),  Ruderico (10-29-2020),  ThaYamamoto (10-29-2020)

  11. #7
    Moderator
    Posts
    7,126
    Sex
    Location
    Normandy
    Ethnicity
    northwesterner
    Y-DNA (P)
    R-BY3604-Z275
    mtDNA (M)
    H5a1

    Normandie Orkney Netherlands Friesland East Frisia Finland
    Quote Originally Posted by Itrane2000 View Post
    I already have my G25 coordinates, but there is a lot of public raw data in genetic databases on the web that has not been converted to G25, and I would like to convert it.
    G25 coordinates don't result from anything like a "conversion". "G25" per se is the matrix of a Principal Component Analysis (reduced to 25 dimensions). This PCA was computed with the Eigensoft program "smartpca" from a huge matrix of raw data. That matrix, and the filtering choices (etc.) made on it, are the property of Davidski. If you learn how to handle genetic data and use smartpca, you will be able to build something from the same "family" as Eurogenes-G25, but it will not be Eurogenes-G25, and will likely be very far from it. That said, learning to manipulate genetic data and practising with genetic tools would be a good idea.
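
    For anyone who wants to see what such a PCA does in principle, here is a toy sketch in plain Python. It is only an illustration under simplifying assumptions: the genotype matrix is invented, the markers are only mean-centred, and the first component is found by power iteration. It is not smartpca and will not reproduce G25.

```python
import math

# Invented genotype matrix: rows are samples, columns are markers,
# values are allele counts (0, 1 or 2). Samples 1-2 and 3-4 form two groups.
genotypes = [
    [0, 0, 2, 2, 1, 0],
    [0, 1, 2, 2, 1, 0],
    [2, 2, 0, 0, 1, 1],
    [2, 2, 0, 1, 1, 2],
]

def center_columns(matrix):
    """Subtract each marker's mean so every column averages to zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]

def sample_covariance(centered):
    """Sample-by-sample covariance matrix (n_samples x n_samples)."""
    n_markers = len(centered[0])
    return [[sum(a * b for a, b in zip(r1, r2)) / n_markers for r2 in centered]
            for r1 in centered]

def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    n = len(matrix)
    # Start from a non-constant vector: a constant vector is annihilated
    # by the covariance of column-centred data.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    image = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    eigenvalue = sum(v * w for v, w in zip(vec, image))
    return eigenvalue, vec

cov = sample_covariance(center_columns(genotypes))
eigenvalue_1, pc1 = top_eigenpair(cov)

# pc1 holds one "unscaled" PC1 coordinate per sample.
print("eigenvalue:", eigenvalue_1)
print("PC1 coordinates:", pc1)
```

    The two invented groups end up on opposite sides of PC1. Every coordinate depends on which samples and markers went into the matrix, which is why a PCA built on a different dataset gives different numbers.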
    En North alom, de North venom
    En North fum naiz, en North manom

    (Roman de Rou, Wace, 1160-1170)

  12. The Following 11 Users Say Thank You to anglesqueville For This Useful Post:

     Bart (11-02-2020),  Bygdedweller (10-29-2020),  fabrice E (10-31-2020),  Itrane2000 (10-28-2020),  jadegreg (10-29-2020),  Megalophias (10-28-2020),  NixYO (10-29-2020),  ph2ter (10-29-2020),  randwulf (10-28-2020),  Ruderico (10-29-2020),  ThaYamamoto (10-29-2020)

  13. #8
    Global Moderator
    Posts
    3,418
    Sex
    Location
    Beira Douro
    Ethnicity
    Portuguese highlander
    Y-DNA (P)
    E-Y31991>FT17866
    mtDNA (M)
    H20 (xH20a)

    Asturias Galicia Portugal 1143 Portugal 1485 Portugal Order of Christ PortugalRoyalFlag1830
    Quote Originally Posted by randwulf View Post
    On unscaled vs scaled coordinates: the G25 coordinates are a 25-dimensional principal component analysis (PCA) abstraction of your raw DNA that lets you visualize your position on a genetic map relative to world populations. It turns out that these coordinates also work well with optimization algorithms that model your admixture/ancestry. The first dimension captures the largest share of the variation, the second dimension the next largest, and so on. The later dimensions delineate smaller kinds of variation. To make the numbers useful on a graph, the PCA "normalizes" the dimensions so that the X and Y axes are similar in length if, for example, you want to plot yourself and others on a two-dimensional graph of PC1 against PC2. These original numbers are the "unscaled" raw coordinates. Someone got the idea that the relative importance of the dimensions, from first to last, could be used to modify the raw numbers so that the modeling tool also puts more weight on the more important dimensions. The amount of variation each dimension represents is associated with what is called the eigenvalue of that dimension. If I am remembering correctly, each unscaled coordinate is multiplied by the square root of its dimension's eigenvalue to create the "scaled" value. Using the scaled values, I think, can make some modeling tasks easier, because the lesser dimensions can introduce some "noise" into the model, and the scaling greatly reduces or almost eliminates their impact on the model. However, some of those lesser dimensions distinguish things like Siberians from Amerinds, or different kinds of African populations, in useful ways. So I think those dimensions are useful for my family, for example, who have some minor admixture of those types. You just have to be a little more careful. Still, the scaled values can make some tasks easier, like modeling with the really ancient references, so it helps to be adaptable. In this thread, I show in visual terms how each dimension impacts the model, first unscaled and then with the scaled values:

    https://anthrogenica.com/showthread....l=1#post566344

    I hope that helps!
    I believe David originally did the scaling by simply multiplying each dimension's value by its corresponding eigenvalue, but Huijbregts intervened and said that, at the very least, the square root of the eigenvalue should be used instead, which, as you said, is the current version. Huijbregts, however, never supported this transformation in the first place from a methodological point of view (he criticised it a lot), which is why you never see him use it in his work.
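
    A rough way to see the difference between the two multipliers: in a Euclidean distance the differences are squared, so multiplying by the square root of the eigenvalue weights each dimension by its eigenvalue, while multiplying by the eigenvalue itself weights it by the eigenvalue squared. The sketch below uses hypothetical eigenvalues, not the real G25 ones, and the helper name is my own.

```python
import math

# Hypothetical eigenvalues for four dimensions (NOT the G25 values).
eigenvalues = [100.0, 25.0, 4.0, 1.0]

def relative_weights(factors):
    """Weight of each dimension in a squared Euclidean distance, relative to PC1.

    A coordinate multiplied by factor f contributes f**2 times its
    unscaled squared difference.
    """
    return [(f / factors[0]) ** 2 for f in factors]

# No scaling: every dimension is multiplied by 1.
unscaled_weights = relative_weights([1.0] * len(eigenvalues))
# Square-root scaling: multiply by sqrt(eigenvalue).
sqrt_weights = relative_weights([math.sqrt(ev) for ev in eigenvalues])
# Full-eigenvalue scaling: multiply by the eigenvalue itself.
full_weights = relative_weights(eigenvalues)

for name, weights in [("unscaled", unscaled_weights),
                      ("sqrt(eigenvalue)", sqrt_weights),
                      ("eigenvalue", full_weights)]:
    print(name, [round(w, 4) for w in weights])
```

    With these made-up numbers the last dimension keeps 1% of the weight of PC1 under square-root scaling, but only 0.01% under full-eigenvalue scaling, so the latter suppresses the minor dimensions much more strongly.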
    YDNA E-Y31991>PF4428>Y134097>Y134104>Y168273>FT17866 (TMRCA ~1100AD) - Domingos Rodrigues, b. circa 1690, Viana do Castelo, Portugal - Stonemason, miller.
    mtDNA H20 - Monica Vieira, b. circa 1700, Porto, Portugal

    Global25 PCA West Eurasia dataset

    [1] "distance%=1.6023"

    Ruderico

    Iberia_IA+Hallstatt,82.2
    Berber_EMA,10
    Rome_Imperial,7
    Levant_Roman,0.8

  14. The Following 5 Users Say Thank You to Ruderico For This Useful Post:

     jadegreg (10-29-2020),  ph2ter (10-29-2020),  randwulf (10-29-2020),  sktibo (10-29-2020),  ThaYamamoto (10-29-2020)

  15. #9
    Gold Class Member
    Posts
    2,957
    Sex
    Location
    Pennsylvania
    Ethnicity
    West European
    Nationality
    USA
    Y-DNA (P)
    R1b-U152-Z36+FGC6511
    mtDNA (M)
    H11a2a

    United States of America Germany England France Scotland Ireland
    Quote Originally Posted by Ruderico View Post
    I believe David originally did the scaling by simply multiplying each dimension's value by its corresponding eigenvalue, but Huijbregts intervened and said that, at the very least, the square root of the eigenvalue should be used instead, which, as you said, is the current version. Huijbregts, however, never supported this transformation in the first place from a methodological point of view (he criticised it a lot), which is why you never see him use it in his work.
    Yes, I agree with your understanding of it. I myself almost always model with the unscaled (original) PCA values, as I find that I get more realistic models that include my family members' minor ancestry, with no real problems for our major ancestry.

  16. The Following 4 Users Say Thank You to randwulf For This Useful Post:

     jadegreg (10-29-2020),  Ruderico (10-29-2020),  sktibo (10-29-2020),  ThaYamamoto (10-29-2020)

  17. #10
    Global Moderator
    Posts
    3,418
    Sex
    Location
    Beira Douro
    Ethnicity
    Portuguese highlander
    Y-DNA (P)
    E-Y31991>FT17866
    mtDNA (M)
    H20 (xH20a)

    Asturias Galicia Portugal 1143 Portugal 1485 Portugal Order of Christ PortugalRoyalFlag1830
    To be honest I've stopped caring what people use, as long as they don't annoy me because of my own choice

  18. The Following 5 Users Say Thank You to Ruderico For This Useful Post:

     fabrice E (10-31-2020),  jadegreg (10-29-2020),  JerryS. (11-07-2020),  randwulf (10-29-2020),  sktibo (10-29-2020)
