
Thread: Compact Moderns+Ancients model - amateur version

  1. #1
    Registered Users
    Posts
    252
    Sex
    Ethnicity
    Malayali Muslim
    Nationality
    Indian
    Y-DNA
    HM82->Z4507->Z5890
    mtDNA
    R30a

    Compact Moderns+Ancients model - amateur version

    Hola folks..so I put together a model to explore a couple of ideas I've come across here -

    - Gradient descent-like or zero penalty overfitting models can be helpful to show pulls towards groups/regions.

    - More mixed people may sometimes benefit more from such models as they don't penalize distant admixtures.

    - It would be interesting to have one such (compact) ancients+moderns model geared towards South Asians. For this one, I used unscaled co-ords from the G25 datasheets of modern South Asian pops, a few ancients (Sintashta, Srubnaya, Afanasievo, Ganj Dareh N, Sarazm EN) & other moderns w/ recent history w/ the sub-continent -> Dutch, English, Portuguese, Cambodian, French, Med/West Asian, ME & African groups. In addition, I sprinkled in a few nearby moderns - like Han, Burmese - and a couple of Eurasian ones that would be along or at the end of various migratory routes - Greenland, Finland, Belarusian. I wanted to add Sinhalese & some NE Indian groups too, hopefully some of you can recommend/provide those.

    - Something like this may be helpful when trying to model one group with other groups. See UP Brahmin & Kerala Nair examples below.
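
    To make the pen=0 idea above concrete, here's a toy sketch in Python. The form of the penalty term (a weighted sum of squared source-to-target distances) and the names mix, objective and pen are my own assumptions for illustration, not taken from the actual tool. The point is only that with pen=0 the score is pure fit, so two distant sources that average out onto the target can beat one nearby source.

```python
import math

def mix(weights, sources):
    """Weighted average of the source coordinate vectors."""
    dims = len(next(iter(sources.values())))
    return [sum(w * sources[name][d] for name, w in weights.items())
            for d in range(dims)]

def objective(target, weights, sources, pen=0.0):
    """Fit distance plus an ASSUMED penalty on how far each source sits from the target."""
    blended = mix(weights, sources)
    fit = sum((b - t) ** 2 for b, t in zip(blended, target))
    spread = sum(
        w * sum((s - t) ** 2 for s, t in zip(sources[name], target))
        for name, w in weights.items()
    )
    return math.sqrt(fit + pen * spread)

# Toy 2-D coordinates, not real G25 data.
sources = {"FarEast": [1.0, 0.0], "FarWest": [-1.0, 0.0], "Near": [0.1, 0.0]}
target = [0.0, 0.0]
distant_mix = {"FarEast": 0.5, "FarWest": 0.5}
near_only = {"Near": 1.0}

# pen=0: the distant mix fits perfectly and wins.
zero_pen = (objective(target, distant_mix, sources), objective(target, near_only, sources))
# pen>0: the distant sources are charged for their distance and the nearby one wins.
with_pen = (objective(target, distant_mix, sources, pen=0.05),
            objective(target, near_only, sources, pen=0.05))
```

    In the toy case zero_pen favours the distant mix and with_pen favours the nearby source, which is the "pull" vs "closest neighbour" difference described above.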

    Some notes:

    - I think this model, as currently constructed, should work for most Indians, some Pakistanis & Bangladeshis.
    - All runs are posted with pen=0. Default penalty runs on this model will probably come out the same as your G25 full moderns runs for most.
    - I labelled the Gujarati samples from the G25 moderns sheet as A & B. To me, they all looked very Iran shifted. Samples that modeled w/ 10-30% Irula/Kol/Pulliyar/Velamas were labelled GujaratiB, those with <10% of the same were labelled GujaratiA. I could definitely use some eyes on these, so let me know if something should be changed.
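
    The A/B labelling rule above can be written down as a small function, in case anyone wants to re-check the split. The function name and the dict-of-percentages input are made up for illustration; what happens above 30% isn't covered by the rule, so the sketch just leaves those unlabelled.

```python
# Groups whose combined share decides the label, per the note above.
PROXY_GROUPS = ("Irula", "Kol", "Pulliyar", "Velamas")

def label_gujarati(run):
    """run: {population: percent} from one sample's model output.

    Returns "GujaratiA" for <10% combined share, "GujaratiB" for 10-30%,
    and None above 30% (the rule above does not say what to do there).
    """
    share = sum(run.get(group, 0.0) for group in PROXY_GROUPS)
    if share < 10:
        return "GujaratiA"
    if share <= 30:
        return "GujaratiB"
    return None

# Toy runs, not real samples.
example_a = label_gujarati({"Irula": 4.0, "Balochi": 96.0})
example_b = label_gujarati({"Irula": 8.0, "Kol": 6.0, "Sindhi": 86.0})
```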

    The datasheet attached here -
    Compact Moderns + Ancients South Asians.zip

    Some AG member runs -


    "distance%=0.2985" - Kulin

    Kshatriya,18.8
    Irula,18
    Bengali_Bangladesh,10.4
    Balochi,9
    Brahmin_Uttar_Pradesh,8.8
    Pulliyar,7.2
    Brahmin_Manipuri,6.4
    Brahmin_West_Bengal,5
    Gujar,4.2
    Cambodian,3.8
    Burmese,2
    Belarusian,1.4
    Sintashta_MLBA,1.4
    Ethiopian_Amhara,1
    Greenlander_East,1
    Iyer,0.6
    Jarawa,0.4
    Ethiopian_Anuak,0.2
    Ethiopian_Oromo,0.2
    Han_NChina,0.2


    "distance%=0.8528" - Frost

    Punjabi_Lahore,21.8
    Bengali_Bangladesh,19.6
    Brahmin_Uttar_Pradesh,16.6
    Irula,11.2
    Afanasievo,8.8
    GujaratiA,7.2
    Burusho,4.8
    Berber_Morocco_TIZ,4
    Sintashta_MLBA,2.8
    Iyer,2.4
    Srubnaya_Alakul_MLBA,0.6
    GujaratiB,0.2

    Frost's results are pretty intriguing. I know we've speculated that there could be some relatively recent ancestry on his dad's side. It prefers steppe to Bengali Brahmin or European moderns, though that could just be the lack of better samples on this model for him, of course. The Berber is interesting too.


    "distance%=0.5683" - Censored

    Irula,32
    Brahmin_Gujarat,18
    Kol,13.6
    GujaratiA,12.4
    Brahmin_Uttar_Pradesh,9.4
    Sarazm_Eneolithic,5.2
    Iranian_Bandari,2.2
    Gujar,2
    Afanasievo,1.8
    Iranian_Mazandarani,1.2
    Chechen,0.8
    Ethiopian_Anuak,0.8
    Velamas,0.6


    "distance%=0.2406" - Rusty

    Velamas,18.2
    GujaratiB,12.2
    Balochi,11.4
    Brahmin_Uttar_Pradesh,10.2
    Punjabi_Lahore,10
    Shahr_I_Sokhta_BA3,8
    Irula,6
    Sindhi,5.2
    Ganj_Dareh_N,4.4
    Iraqi_Jew,4
    Bhumij,2.8
    Iyer,2.8
    Ethiopian_Tigray,1.6
    Yemenite_Al_Jawf,1.6
    Bengali_Bangladesh,0.6
    Hawaiian,0.4
    Assyrian,0.2
    Ethiopian_Amhara,0.2
    Yemenite_Mahra,0.2


    "distance%=0.441" - Mandoos

    Brahmin_Uttar_Pradesh,18.8
    GujaratiB,14.6
    Iyer,14
    Velamas,13.6
    GujaratiA,11.4
    Brahmin_Gujarat,9.4
    Irula,5.8
    Balochi,4.6
    Kerala_Nair,2.4
    Berber_Morocco_ERR,1.6
    Gupta,1.6
    Han,1.4
    Berber_Morocco_TIZ,0.4
    Brahmin_Manipuri,0.4


    "distance%=0.3688" - Scobar

    Velamas,20.9
    Pulliyar,15.5
    Brahmin_West_Bengal,11.8
    Brahmin_Uttar_Pradesh,10.8
    GujaratiB,10.8
    Ganj_Dareh_N,5.3
    Irula,4.2
    Chechen,3.9
    Greek_Central_Anatolia,3.3
    GujaratiA,3.2
    Greek_Trabzon,2.4
    Ethiopian_Tigray,1.8
    Turkish_Trabzon,1.1
    Ethiopian_Afar,1
    Brahmin_Manipuri,0.9
    Ethiopian_Anuak,0.9
    Iranian_Fars,0.8
    Yemenite_Al_Bayda,0.4
    Sindhi,0.3
    Ethiopian_Amhara,0.2
    Greenlander_West,0.2
    Yemenite_Al_Jawf,0.2
    Greenlander_East,0.1


    "distance%=0.5265" - Kerala_Nair:Kalashviv when run w/ his sample (the lone Nair one on the model :0) removed from the datasheet -

    Punjabi_Lahore,43
    GujaratiB,12.4
    Kol,10.8
    Irula,8.2
    Ganj_Dareh_N,5.2
    Brahmin_Uttar_Pradesh,4.8
    Brahmin_Gujarat,3.4
    Brahmin_Manipuri,3.4
    GujaratiA,3.2
    French,1.6
    Sarazm_Eneolithic,1.6
    Velamas,1.4
    Gupta,0.8
    Esan_Nigeria,0.2


    "distance%=0.943" - Brahmin_Uttar_Pradesh:177e when run w/ all UP Brahmin samples removed from the datasheet.

    Sindhi,25.6
    GujaratiA,19.2
    Iyer,15
    Sintashta_MLBA,12.6
    Shahr_I_Sokhta_BA3,11.2
    Gujar,9.8
    Afanasievo,2.4
    GujaratiB,2
    Gupta,1.2
    Kol,1
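
    For the last two runs, the target's own sample (or whole population) was taken out of the datasheet first. A minimal sketch of that step, assuming rows look like "Population:SampleID,coord1,coord2,..." as in the labels above; drop_from_sheet is a hypothetical helper name, not part of any tool.

```python
def drop_from_sheet(lines, label):
    """Drop rows matching `label`.

    `label` may be a full row label ("Pop:ID", removes one sample)
    or a bare population name ("Pop", removes every sample in it).
    """
    kept = []
    for line in lines:
        row_label = line.split(",", 1)[0]
        population = row_label.split(":", 1)[0]
        if label in (row_label, population):
            continue  # this row belongs to the target, leave it out
        kept.append(line)
    return kept

# Toy rows with made-up coordinates.
sheet = [
    "Brahmin_Uttar_Pradesh:177e,0.01,0.02",
    "Brahmin_Uttar_Pradesh:other,0.02,0.03",
    "Kerala_Nair:Kalashviv,0.03,0.04",
    "Irula:x1,0.05,0.06",
]
without_up_brahmins = drop_from_sheet(sheet, "Brahmin_Uttar_Pradesh")
without_one_sample = drop_from_sheet(sheet, "Kerala_Nair:Kalashviv")
```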

  2. The Following 8 Users Say Thank You to scobar For This Useful Post:

     26284729292 (06-24-2019),  bmoney (06-25-2019),  client (06-25-2019),  FrostAssassin0701 (06-24-2019),  kush (06-26-2019),  noman (06-26-2019),  Rustyshakelford (06-25-2019),  vishankar (06-27-2019)

  3. #2
    Registered Users
    Posts
    836
    Ethnicity
    Brahmin (mixed)
    Nationality
    Indian
    Y-DNA
    R-1A (Z-93)
    mtDNA
    M-30

    Are the Brahmin samples group averages? Or particular individuals within these groups?

    Because last time these weren’t averages and we got outliers for some of these individuals, which skewed the proportions.
    Last edited by 26284729292; 06-24-2019 at 05:56 AM.

  4. #3
    Registered Users
    Posts
    252
    Sex
    Ethnicity
    Malayali Muslim
    Nationality
    Indian
    Y-DNA
    HM82->Z4507->Z5890
    mtDNA
    R30a

    Quote Originally Posted by 26284729292 View Post
    Are the Brahmin samples group averages? Or particular individuals within these groups?

    Because last time these weren’t averages and we got outliers for some of these individuals, which skewed the proportions.
    I took them from the G25 individual sheet, not the pop averages. Does running with higher Nbatch alleviate the issue you're talking about?

  5. #4
    Registered Users
    Posts
    836
    Ethnicity
    Brahmin (mixed)
    Nationality
    Indian
    Y-DNA
    R-1A (Z-93)
    mtDNA
    M-30

    Quote Originally Posted by scobar View Post
    I took them from the G25 individual sheet, not the pop averages. Does running with higher Nbatch alleviate the issue you're talking about?
    I don't think it would. Pop averages would help a ton. I'm not sure if that's possible, but I remember in Mandoos' case, some of them were outliers.
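
    On whether pop averages are possible: they can be computed from the individual sheet by grouping rows on the part of the label before the colon. A rough sketch, assuming rows of the form "Population:SampleID,coord1,coord2,..." with no header row; population_averages is a made-up name. Note a plain mean only dilutes an outlier, it doesn't remove it.

```python
from collections import defaultdict

def population_averages(lines):
    """Average the coordinates of all individuals sharing a population prefix."""
    sums = {}
    counts = defaultdict(int)
    for line in lines:
        label, *values = line.strip().split(",")
        population = label.split(":", 1)[0]
        coords = [float(v) for v in values]
        if population not in sums:
            sums[population] = [0.0] * len(coords)
        sums[population] = [s + c for s, c in zip(sums[population], coords)]
        counts[population] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}

# Toy rows with made-up coordinates.
rows = ["Iyer:a,0.1,0.2", "Iyer:b,0.3,0.4", "Kol:x,1.0,2.0"]
averages = population_averages(rows)
```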

  6. The Following User Says Thank You to 26284729292 For This Useful Post:

     FrostAssassin0701 (06-24-2019)

  7. #5
    Bronze Class Member
    Posts
    1,926
    Sex
    Location
    Krasnoyarsk
    Ethnicity
    Munda

    Quite a lot of west Asian for someone like me

  8. #6
    Registered Users
    Posts
    252
    Sex
    Ethnicity
    Malayali Muslim
    Nationality
    Indian
    Y-DNA
    HM82->Z4507->Z5890
    mtDNA
    R30a

    Quote Originally Posted by Censored View Post
    Quite a lot of west Asian for someone like me
    What did your gradient descent one look like?

  9. #7
    Registered Users
    Posts
    252
    Sex
    Ethnicity
    Malayali Muslim
    Nationality
    Indian
    Y-DNA
    HM82->Z4507->Z5890
    mtDNA
    R30a

    Quote Originally Posted by 26284729292 View Post
    I don't think it would. Pop averages would help a ton. I'm not sure if that's possible, but I remember in Mandoos' case, some of them were outliers.
    So I got the G25 sheet off Davidski's site. Do you remember which group in particular had the outliers?

  10. #8
    Registered Users
    Posts
    252
    Sex
    Ethnicity
    Malayali Muslim
    Nationality
    Indian
    Y-DNA
    HM82->Z4507->Z5890
    mtDNA
    R30a

    Quote Originally Posted by 26284729292 View Post
    I don't think it would. Pop averages would help a ton. I'm not sure if that's possible, but I remember in Mandoos' case, some of them were outliers.
    So I used Nbatch=1000 and Ncycles=1000 for these runs. My understanding is that it picks Nbatch samples from the datasheet per cycle. The datasheet itself has a little over 1000 samples. Assuming it averages 5-10 samples per group w/ 0-2 outliers, would the outliers still have an outsized impact on the result? Maybe poi or bmoney or someone knows more about these configurations.
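
    Here's a toy Python sketch of how I picture the Nbatch/Ncycles loop, purely as an assumption about the general approach and not the tool's actual code: keep a batch of Nbatch rows, and in each cycle try swapping every slot for a random row, keeping the swap only if the batch mean gets closer to the target. The names monte_carlo_fit, nbatch, ncycles and seed are mine.

```python
import random
from collections import Counter

def monte_carlo_fit(target, rows, nbatch=100, ncycles=50, seed=0):
    """rows: list of (population, coords). Returns (distance, {population: percent})."""
    rng = random.Random(seed)
    dims = len(target)
    batch = [rng.randrange(len(rows)) for _ in range(nbatch)]
    # Running coordinate sums, so the batch mean is cheap to update after a swap.
    total = [sum(rows[i][1][d] for i in batch) for d in range(dims)]

    def dist(sums):
        return sum((s / nbatch - t) ** 2 for s, t in zip(sums, target)) ** 0.5

    best = dist(total)
    for _ in range(ncycles):
        for slot in range(nbatch):
            cand = rng.randrange(len(rows))
            old, new = rows[batch[slot]][1], rows[cand][1]
            trial = [tot - o + n for tot, o, n in zip(total, old, new)]
            d = dist(trial)
            if d < best:  # greedy: keep the swap only if the fit improves
                batch[slot], total, best = cand, trial, d
    counts = Counter(rows[i][0] for i in batch)
    return best, {pop: 100.0 * n / nbatch for pop, n in counts.items()}

# Toy 2-D rows, not real G25 data; the target coincides with population "A".
toy_rows = [("A", [1.0, 0.0]), ("B", [-1.0, 0.0]), ("C", [0.0, 1.0])]
distance, percents = monte_carlo_fit([1.0, 0.0], toy_rows)
```

    If the real thing works roughly like this, each slot holds one individual and percentages move in steps of 100/Nbatch, so an outlier only shows up when swapping it in lowers the distance. A bigger Nbatch would then give finer percentages rather than filter outliers, but that hinges on the assumption above.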

  11. The Following User Says Thank You to scobar For This Useful Post:

     26284729292 (06-24-2019)

  12. #9
    Registered Users
    Posts
    836
    Ethnicity
    Brahmin (mixed)
    Nationality
    Indian
    Y-DNA
    R-1A (Z-93)
    mtDNA
    M-30

    Quote Originally Posted by scobar View Post
    So I used Nbatch=1000 and Ncycles=1000 for these runs. My understanding is that it picks Nbatch samples from the datasheet per cycle. The datasheet itself has a little over 1000 samples. Assuming it averages 5-10 samples per group w/ 0-2 outliers, would the outliers still have an outsized impact on the result? Maybe poi or bmoney or someone knows more about these configurations.
    You may be correct. I'd inquire with both.

  13. #10
    Bronze Class Member
    Posts
    1,926
    Sex
    Location
    Krasnoyarsk
    Ethnicity
    Munda

    Quote Originally Posted by scobar View Post
    What did your gradient descent one look like?
    What’s that?
