
Thread: Incorporating individual samples when modelling oneself, or exclusively averages?

  1. #1
    Registered Users
    Posts
    160
    Sex
    Omitted

    Hello all,

    I recently got into G25 modelling and was wondering whether it is OK to use individual samples in combination with averages, for example when trying to detect minute amounts of ancestry from a faraway population.

    Thanks in advance!

  2. The Following User Says Thank You to bovefex For This Useful Post:

     JMcB (02-13-2021)

  3. #2
    Gold Class Member
    Posts
    3,183
    Sex
    Location
    Pennsylvania
    Ethnicity
    West European
    Nationality
    USA
    Y-DNA (P)
    R1b-U152-Z36+FGC6511
    mtDNA (M)
    H11a2a

    United States of America Germany England France Scotland Ireland
    Quote Originally Posted by bovefex View Post
    Hello all,

    I recently got into G25 modelling and was wondering whether it is OK to use individual samples in combination with averages, for example when trying to detect minute amounts of ancestry from a faraway population.

    Thanks in advance!
    People develop different work patterns. I use all individuals, usually. Sometimes, I do model with averages. I don't usually mix them. I guess that would work, but I am not sure if there are pitfalls to it.

  4. The Following 5 Users Say Thank You to randwulf For This Useful Post:

     bovefex (02-13-2021),  JerryS. (02-14-2021),  JMcB (02-13-2021),  RVBLAKE (02-14-2021),  ThaYamamoto (02-14-2021)

  5. #3
    Registered Users
    Posts
    5,556
    Sex
    Location
    Zagreb
    Ethnicity
    Kaikavian Croatian
    Nationality
    Croatian
    Y-DNA (P)
    I2a1a2b-A815
    mtDNA (M)
    HV0-T195C!

    Croatia European Union
    My experience is that individual samples can give better, more accurate results, but you must take into account the obvious outliers that exist in the spreadsheet.
    A sample labelled German (or any other nationality or ethnicity) does not always plot like a typical member of that group.
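One way to spot such outliers before modelling is to compare each individual with the average of its own population. The following is only a minimal sketch: the coordinates are invented 3-dimensional stand-ins for real G25 rows, the `Population:SampleID` label format follows the outputs posted in this thread, and the "more than twice the median distance" rule (`factor=2.0`) is an arbitrary assumption, not an established cutoff.

```python
import math
from collections import defaultdict

def population_centroids(samples):
    """Average the coordinates of all individuals sharing a population label."""
    groups = defaultdict(list)
    for label, coords in samples.items():
        population = label.split(":", 1)[0]
        groups[population].append(coords)
    return {
        pop: [sum(axis) / len(rows) for axis in zip(*rows)]
        for pop, rows in groups.items()
    }

def flag_outliers(samples, factor=2.0):
    """Return labels whose distance to their own population average exceeds
    `factor` times the median distance within that population."""
    centroids = population_centroids(samples)
    distances = defaultdict(dict)
    for label, coords in samples.items():
        population = label.split(":", 1)[0]
        distances[population][label] = math.dist(coords, centroids[population])
    flagged = []
    for per_sample in distances.values():
        ordered = sorted(per_sample.values())
        median = ordered[len(ordered) // 2]
        for label, d in per_sample.items():
            if median > 0 and d > factor * median:
                flagged.append(label)
    return sorted(flagged)

# Hypothetical coordinates (real G25 rows have 25 dimensions).
samples = {
    "German:G1": [0.130, 0.135, 0.060],
    "German:G2": [0.128, 0.133, 0.058],
    "German:G3": [0.131, 0.136, 0.061],
    "German:G4": [0.129, 0.134, 0.059],
    "German:G5": [0.110, 0.150, 0.080],  # deliberately shifted away from the rest
}
print(flag_outliers(samples))  # ['German:G5']
```

A flagged sample is not necessarily wrong; it may simply be mixed or regionally shifted, so it is worth looking at it on a PCA before dropping it.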
     
    Some simple calculations, maps and plots I make for free, but for more complicated maps, plots and calculations I ask for a donation of $2 or 1.7 per map/plot via Hidden Content PayPalHidden Content account.
    Every forth map is free. Admix decomposition is $7 or 6.

  6. The Following 6 Users Say Thank You to ph2ter For This Useful Post:

     bovefex (02-14-2021),  JerryS. (02-14-2021),  JMcB (02-13-2021),  randwulf (02-14-2021),  RVBLAKE (02-14-2021),  Sorcelow (02-13-2021)

  7. #4
    Registered Users
    Posts
    2,806
    Sex
    Location
    America
    Ethnicity
    North & Ionian Seas
    Nationality
    American
    Y-DNA (P)
    I1 (P109)

    England Italy Germany Scotland
    Quote Originally Posted by ph2ter View Post
    My experience is that individual samples can give better, more accurate results, but you must take into account the obvious outliers that exist in the spreadsheet.
    A sample labelled German (or any other nationality or ethnicity) does not always plot like a typical member of that group.
    I found this to be the case for me, particularly with my north-shifted German ancestry.
    Last edited by JerryS.; 02-14-2021 at 12:20 AM.

  8. The Following 2 Users Say Thank You to JerryS. For This Useful Post:

     bovefex (02-14-2021),  JMcB (02-14-2021)

  9. #5
    Registered Users
    Posts
    160
    Sex
    Omitted

    Do you guys then also use each and every individual sample from a population, or only specific ones? In my experience, some samples give no results, others give weird results, and some give results that make sense.

  10. #6
    Administrator
    Posts
    2,407
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    Quote Originally Posted by bovefex View Post
    Do you guys then also use each and every individual sample from a population, or only specific ones? In my experience, some samples give no results, others give weird results, and some give results that make sense.
    Depending on what I'm looking for, if I'm using individual samples I'll use the one I fit best with. However, I also run these through a calculator first to make sure they represent the majority of my ancestral component (and aren't going to overfit/underfit). One example: I don't use the Maori population when modelling, as it's 50/50 Polynesian and Western Euro, and sometimes I score more of it than I should because of this. I always use a Papuan population instead.
    Ancestry on paper: English, Scottish, Irish, Welsh, Croatian, Ashkenazi, Polish and Māori.

  11. The Following 3 Users Say Thank You to BalkanKiwi For This Useful Post:

     bovefex (02-14-2021),  JerryS. (02-14-2021),  JMcB (02-14-2021)

  12. #7
    Registered Users
    Posts
    160
    Sex
    Omitted

    Quote Originally Posted by BalkanKiwi View Post
    One example: I don't use the Maori population when modelling, as it's 50/50 Polynesian and Western Euro, and sometimes I score more of it than I should because of this. I always use a Papuan population instead.
    Kind of unrelated, but Maori are ~50% Western European?

  13. The Following User Says Thank You to bovefex For This Useful Post:

     JMcB (02-14-2021)

  14. #8
    Administrator
    Posts
    2,407
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    Quote Originally Posted by bovefex View Post
    Kind of unrelated, but Maori are ~50% Western European?
    The particular Maori sample in the G25 is. It's normal for modern Maori to be mixed with something (for example Croatian, due to Yugoslav migrations); however, you can still get Maori who are >60% Polynesian. The North Island of New Zealand, for example, has had a lot of migration from Samoa, Tonga and Fiji, and those migrants have mixed with Maori.

  15. The Following 3 Users Say Thank You to BalkanKiwi For This Useful Post:

     bovefex (02-14-2021),  JerryS. (02-14-2021),  JMcB (02-14-2021)

  16. #9
    Gold Class Member
    Posts
    3,183
    Sex
    Location
    Pennsylvania
    Ethnicity
    West European
    Nationality
    USA
    Y-DNA (P)
    R1b-U152-Z36+FGC6511
    mtDNA (M)
    H11a2a

    United States of America Germany England France Scotland Ireland
    Quote Originally Posted by bovefex View Post
    Do you guys then also use each and every individual sample from a population, or only specific ones? In my experience, some samples give no results, others give weird results, and some give results that make sense.
    I tend to use a regional group of references in full for convenience. This means my reference groups tend to be proxies at times, sort of the way BalkanKiwi is describing. My example: my dad models well as German and Ashkenazi, and I can use all of the individuals in his model and it works very well. My mother is South German and Isles, but if I include German in the model, some of the northerly Germans in the set of individuals overlap with some of the English and the percentages become odd. If I instead use other regional Germanic types, like Swiss German or Swiss French, it works very well for her. So you have to consider mixed individuals and outliers in the set, as well as the considerable overlap between some regions.

  17. The Following 4 Users Say Thank You to randwulf For This Useful Post:

     bovefex (02-14-2021),  JerryS. (02-14-2021),  JMcB (02-14-2021),  Nino90 (02-14-2021)

  18. #10
    Registered Users
    Posts
    2,546
    Sex
    Location
    Sweden
    Ethnicity
    Italo-Germanic
    Nationality
    Swedish
    Y-DNA (P)
    R-L2 / R1b-U152
    mtDNA (M)
    H1a1

    Sweden Italy Italy 1861-1946 Sami Vatican Italy Tuscany
    Quote Originally Posted by randwulf View Post
    I tend to use a regional group of references in full for convenience. This means my reference groups tend to be proxies at times, sort of the way BalkanKiwi is describing. My example: my dad models well as German and Ashkenazi, and I can use all of the individuals in his model and it works very well. My mother is South German and Isles, but if I include German in the model, some of the northerly Germans in the set of individuals overlap with some of the English and the percentages become odd. If I instead use other regional Germanic types, like Swiss German or Swiss French, it works very well for her. So you have to consider mixed individuals and outliers in the set, as well as the considerable overlap between some regions.
    Well explained.

    If I use the individual Swedish samples, it works best when I choose the ones that are closest to my mother.

    Some of the Swedish samples are much more southern/continental-shifted, and others have some Finnic and Siberian admixture.

    Target: Nino_scaled
    Distance: 1.8542% / 0.01854211
    66.6 Swedish
    25.0 Italian_Tuscany
    8.4 Saami
     
    Target: Nino_scaled
    Distance: 1.8542% / 0.01854211
    53.8 Swedish:Sweden10
    21.4 Italian_Tuscany:NA20504
    12.8 Swedish:Sweden12
    8.4 Saami:saami6
    3.6 Italian_Tuscany:VO59


    VS

    All Swedish samples:

    Target: Nino_scaled
    Distance: 1.7974% / 0.01797445
    69.2 Swedish
    22.8 Italian_Tuscany
    8.0 Saami
     
    Target: Nino_scaled
    Distance: 1.7975% / 0.01797522
    27.6 Swedish:Sweden10
    20.6 Swedish:Sweden17
    18.2 Italian_Tuscany:NA20504
    7.6 Swedish:Sweden10
    6.0 Swedish:Sweden18
    4.8 Saami:saami6
    4.6 Italian_Tuscany:Tuscany27
    4.0 Swedish:Sweden12
    3.6 Swedish:Sweden16
    2.4 Saami:saami9
    0.6 Saami:saami3



    Not an extreme change in percentages. But notice that the second option pulls in many more individual samples than the first model.
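To compare an individual-sample model with the averaged one, the per-sample percentages can be summed by population. This is a small sketch that assumes each output line has the form `percent Population:SampleID`, as in the outputs above; the function name `totals_by_population` is made up for illustration. The input lines are copied as posted, including `Swedish:Sweden10` appearing twice.

```python
from collections import defaultdict

def totals_by_population(model_lines):
    """Sum the percentages of individual samples that share a population prefix."""
    totals = defaultdict(float)
    for line in model_lines:
        percent, label = line.split(maxsplit=1)
        population = label.split(":", 1)[0]
        totals[population] += float(percent)
    # Round to one decimal to hide floating-point noise from the additions.
    return {pop: round(total, 1) for pop, total in totals.items()}

# Individual-sample model with all Swedish samples available, as posted above.
individual_model = [
    "27.6 Swedish:Sweden10",
    "20.6 Swedish:Sweden17",
    "18.2 Italian_Tuscany:NA20504",
    "7.6 Swedish:Sweden10",
    "6.0 Swedish:Sweden18",
    "4.8 Saami:saami6",
    "4.6 Italian_Tuscany:Tuscany27",
    "4.0 Swedish:Sweden12",
    "3.6 Swedish:Sweden16",
    "2.4 Saami:saami9",
    "0.6 Saami:saami3",
]
print(totals_by_population(individual_model))
# {'Swedish': 69.4, 'Italian_Tuscany': 22.8, 'Saami': 7.8}
```

These totals sit within 0.2 points of the aggregated output for the same run (69.2 / 22.8 / 8.0), which matches the observation that the percentages barely change even though many more samples are drawn on.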

    [Image: Vahaduo Global 25 Views PCA plot]

    This PCA shows that some Swedish samples are strongly Uralic-shifted, while others plot close to Germans.

    It is also very easy to manipulate your results:

    Here I just chose the most Uralic-shifted Swedish sample:

    Target: Nino_scaled
    Distance: 3.0525% / 0.03052475
    46.0 Swedish
    42.2 Italian_Tuscany
    11.8 Saami

    Target: Nino_scaled
    Distance: 3.0525% / 0.03052475
    46.0 Swedish:Sweden15
    42.2 Italian_Tuscany:VO59
    11.8 Saami:saami6
    As expected, the South European share increases, I would guess because that Swedish sample has lower levels of EEF.
    Also notice the bad fit.
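The "Distance" line in these outputs shows the same number twice, as a percentage and as a raw value (3.0525% is 0.03052475 times 100). Assuming that value is the Euclidean distance between the target's coordinates and the weighted blend of the chosen sources, the effect of a poorly matching source can be sketched as below. All coordinates and names here are invented 2-dimensional toy values, not real G25 data, and the weights are fixed by hand rather than fitted.

```python
import math

def blend(sources, weights):
    """Weighted average of source coordinates; weights are fractions summing to 1."""
    dims = len(next(iter(sources.values())))
    return [
        sum(weights[name] * sources[name][i] for name in weights)
        for i in range(dims)
    ]

def fit_distance(target, sources, weights):
    """Euclidean distance between the target and the blended sources."""
    return math.dist(target, blend(sources, weights))

# Hypothetical toy coordinates.
target = [0.10, 0.03]
sources = {
    "North_typical": [0.12, 0.00],
    "North_shifted": [0.16, 0.00],  # stands in for an outlier-like reference
    "South": [0.08, 0.04],
}

good_fit = fit_distance(target, sources, {"North_typical": 0.5, "South": 0.5})
bad_fit = fit_distance(target, sources, {"North_shifted": 0.5, "South": 0.5})

for name, d in (("typical reference", good_fit), ("shifted reference", bad_fit)):
    # Same layout as the outputs above: percentage / raw value.
    print(f"{name}: Distance: {d * 100:.4f}% / {d:.8f}")
```

A real fit would also search over the weights, so a shifted reference changes the percentages as well as the distance, which is what the last model above shows.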

    I also noticed that I most often get modelled with two of the most north-shifted Tuscan samples, which makes sense because my Italian ancestors were all from northern Tuscany.
    Last edited by Nino90; 02-14-2021 at 01:59 PM.


    Target: Nino_scaled
    Distance: 1.6700% / 0.01669968 | R3P
    53.6 Goth_Kowalewko = Germanic
    39.2 Levant_LBN_MA_o6 = South Euro
    7.2 RUS_Khaptsagai_IA = Steppe + Siberian

  19. The Following 3 Users Say Thank You to Nino90 For This Useful Post:

     bovefex (02-14-2021),  JMcB (02-14-2021),  randwulf (02-14-2021)
