
Thread: West Eurasian (ANI) and ASI estimates for South Asians (Sengupta et al. 2016)

  1. #1
    Registered Users
    Posts
    67
    Sex

    West Eurasian (ANI) and ASI estimates for South Asians (Sengupta et al. 2016)


    From Sengupta et al., 2016

    As to how these groups subdivide:

    GIH 1 probably corresponds to HarappaWorld's Gujarati-a (which is believed to be a group of Patels) and GIH 2 corresponds to Gujarati-b. HarappaWorld also divided GIH into two groups (Gujarati-a and Gujarati-b) with the same size as GIH 1 and 2 respectively.

    The divide between PJL 1 and PJL 2 is probably the low castes vs. Biraderis divide. PJL 1 is probably Chamars, Churas etc. while PJL 2 is the rest of Punjabis.

    I am not sure about the division between ITU 1 and ITU 2.

    As a side note, this study is 2 years old, but it is not outdated. Sapporo is correct that newer papers have successfully divided the South Asian genome further using ancient samples, but the cumulative ANI, ASI and AAA estimates that this study provides are still pretty accurate.
    Last edited by coolguy; 06-30-2018 at 09:53 PM.

  2. The Following 2 Users Say Thank You to coolguy For This Useful Post:

     kush (06-30-2018),  misanthropy (06-30-2018)

  3. #2
    Registered Users
    Posts
    628
    Sex
    Ethnicity
    Dakhni
    Y-DNA (P)
    E-V22
    mtDNA (M)
    M2B

    Huh, I always imagined ASI and Austro-Asiatic being like the same thing or very similar. Didn't know they were separate and that people had both. I imagined Austro-Asiatic simply being a mix of ASI + some East Asian component.

  4. #3
    Gold Class Member
    Posts
    2,097
    Sex
    Location
    NorCal
    Ethnicity
    Punjabi Jatt Sikh
    Nationality
    American & Canadian
    Y-DNA (P)
    L-M357 > Y33989
    mtDNA (M)
    HV2a3
    Y-DNA (M)
    L-M27
    mtDNA (P)
    R0

    Quote Originally Posted by misanthropy View Post
    Huh, I always imagined ASI and Austro-Asiatic being like the same thing or very similar. Didn't know they were separate and that people had both. I imagined Austro-Asiatic simply being a mix of ASI + some East Asian component.
    Austro-Asiatics within the South Asian vicinity are a mix of East Asian and AASI. I believe if you look at how they (Munda, Ho or Santal, etc.) score on Harappa and other GEDMatch calculators, they're a combination of East Asian/SE Asian + South Asian/South Indian components. For example, the Santal/Santhal score 72% South Indian and 17% SE Asian on Harappa.

    Anyways, this paper is grossly outdated. VM Narasimhan and Reich's 2018 paper has already moved on to using ancient South Asian and Central Asian samples to model modern South Asians and give more accurate estimates of ancestral populations such as Iran N, Steppe and AASI.
    Last edited by Sapporo; 06-30-2018 at 05:34 AM.
    pegasus modeling:
    "sample": "Punjabi_Jat:Sapporo_AGUser",
    "fit": 1.1506,
    "IRN_Shahr_I_Sokhta_BA3": 43.33,
    "TKM_Gonur1_BA": 31.67,
    "RUS_Sintashta_MLBA": 25,
    "closestDistances": [

    avatar credit goes out to aaronbee2010

  5. The Following 8 Users Say Thank You to Sapporo For This Useful Post:

     agent_lime (06-30-2018),  bmoney (06-30-2018),  Eвa (07-04-2018),  jortita (06-30-2018),  midichlorian (06-30-2018),  misanthropy (06-30-2018),  Observer (07-02-2018),  redifflal (07-02-2018)

  6. #4
    Registered Users
    Posts
    3,372
    Sex
    Location
    Chicago
    Ethnicity
    Baloch Kashmiri Uzbek Kho
    Nationality
    USA
    Y-DNA (P)
    R-Y17491 > R-YP4858
    mtDNA (M)
    A8a

    Quote Originally Posted by coolguy View Post

    From Sengupta et al., 2016

    As to how these groups subdivide:

    GIH 1 probably corresponds to HarappaWorld's Gujarati-a (which is believed to be a group of Patels) and GIH 2 corresponds to Gujarati-b. HarappaWorld also divided GIH into two groups (Gujarati-a and Gujarati-b) with the same size as GIH 1 and 2 respectively.

    The divide between PJL 1 and PJL 2 is probably the low castes vs. Biraderis divide. PJL 1 is probably Chamars, Churas etc. while PJL 2 is the rest of Punjabis.


    I am not sure about the division between ITU 1 and ITU 2.

    As a side note, this study is 2 years old, but it is not outdated. Sapporo is correct that newer papers have successfully divided the South Asian genome further using ancient samples, but the cumulative ANI, ASI and AAA estimates that this study provides are still pretty accurate.
    The study used all the GIH samples and all the PJL samples, so to match them up I double-checked some divisions.
    Which letters correspond to which numbers is a strong inference, because I know how many samples are in each group of both divisions. I have the original data sets: both Zack Ajmal's divided set and the unadulterated 1000 Genomes data set.

    By the numbers:
    GIH_1 = GIH_Patel in the Zack Ajmal Dataset. Exactly 67.
    GIH_2 = GIH_ANI 1/2/3/4 = Gujarati A/B/C. Exactly 36.
    PJL_1 = PJL_A minus 2 samples OR PJL D minus 4 samples.
    PJL_2 = PJL_B/C/D plus 2 samples OR PJL A/B/C plus 4 samples.

    Subgroup      n     Sengupta et al. group    n
    PJL A         33    PJL 1                    31
    PJL B          9    PJL 2                    65
    PJL C         19
    PJL D         35
    GIH ANI 1      8    GIH 1                    67
    GIH ANI 2      8    GIH 2                    36
    GIH ANI 3     15
    GIH ANI 4      5
    GIH PATEL     67

    Corresponds exactly to the divisions.
    Every sample is used. No sample is unaccounted for.
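
    For anyone who wants to re-check the arithmetic, here is a minimal Python sketch that uses only the counts listed above. The variable and function names are made up for illustration. It only confirms that the totals reconcile; it says nothing about which individual samples sit in which group.

```python
# Sample counts copied from the post; all names here are illustrative only.
pjl_subgroups = {"PJL A": 33, "PJL B": 9, "PJL C": 19, "PJL D": 35}
gih_ani = {"GIH ANI 1": 8, "GIH ANI 2": 8, "GIH ANI 3": 15, "GIH ANI 4": 5}
gih_patel = 67
study = {"PJL 1": 31, "PJL 2": 65, "GIH 1": 67, "GIH 2": 36}


def reconcile():
    """Return a dict of named True/False checks on the group totals."""
    a, b, c, d = (pjl_subgroups[k] for k in ("PJL A", "PJL B", "PJL C", "PJL D"))
    ani_total = sum(gih_ani.values())
    return {
        "GIH 1 equals GIH PATEL": study["GIH 1"] == gih_patel,
        "GIH 2 equals ANI 1-4 combined": study["GIH 2"] == ani_total,
        "every GIH sample used": study["GIH 1"] + study["GIH 2"] == ani_total + gih_patel,
        "every PJL sample used": study["PJL 1"] + study["PJL 2"] == a + b + c + d,
        # The two alternative readings of the PJL split given in the post.
        "PJL 1 equals A minus 2": study["PJL 1"] == a - 2,
        "PJL 2 equals B+C+D plus 2": study["PJL 2"] == b + c + d + 2,
        "PJL 1 equals D minus 4": study["PJL 1"] == d - 4,
        "PJL 2 equals A+B+C plus 4": study["PJL 2"] == a + b + c + 4,
    }


for name, ok in reconcile().items():
    print(f"{name}: {ok}")
```

    Note that both readings of the PJL split pass the count check, so the counts alone cannot tell you which one is right.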



    EDIT: Originally, I wrote which ANI/PATEL group corresponds with which GIH_A/B/C/D. However, I need to double-check before I type that out.

    This diagram and table list n for each sample set from the Sengupta et al. study and their subsequent divisions.






    As for how the GIH_Patel group got its name, or why they are assumed to be Patels: when the GIH samples first came out, they plotted like this:



    So every single person who analyzed it knew that there were at least two distinct groups. Zack Ajmal then compared that to known Patels he had from his Harappa project, and the assumption became that these ~67 are Patels.
    The rest of the samples then got divided by ANI, so you got 1/2/3/4.

    This part is important. ORIGINALLY, Gujarati-a represented the group that clustered tightly. Gujarati-b were those that DID NOT. Gujarati-a got renamed GIH_Patel AFTER Zack's analysis. All calculators at that time only had two divisions for Gujarati, A and B.
    Subsequently, the Gujarati A/B/C/D names were made, corresponding to ANI 1/2/3/4 and Patel. Which is which, I am not exactly sure; I will have to dig more deeply into that. However, the important thing is that the definition of Gujarati A on admix calculators flipped!
    Gujarati A/B/C/D are referenced in Lazaridis et al. 2016, "Genomic insights into the origin of farming in the ancient Near East".

    So they took the GIH samples and divided them, either based on another study (I'm trying to find out) or on their own; in any case, they analyzed them as such, and it's all the GIH samples:




    So if you see an admix calculator with gujarati-a and gujarati-b in its oracle, gujarati-a is probably the "Patel" group.

    If you see an admix calculator with Gujarati A/B/C/D, Gujarati A is probably ANI 1, Gujarati B is probably ANI 2 and 3, Gujarati C is probably GIH_Patel, and Gujarati D is probably ANI 4.
    The Bohra_Patels score near ANI_1 and ANI_2.

    Here are the GIH in admixture:

    Last edited by khanabadoshi; 07-02-2018 at 03:58 PM.
    “Chahar chez est tohfay Multan, Gard-o- Garma, Gada-o- Goristan”.

    Four things are the gift of Multan: Dusty winds, hot seasons, beggars and graveyards.




  7. The Following 2 Users Say Thank You to khanabadoshi For This Useful Post:

     bmoney (07-03-2018),  MonkeyDLuffy (07-02-2018)

  8. #5
    Registered Users
    Posts
    4,834
    Sex

    Quote Originally Posted by khanabadoshi View Post
    The study used all the GIH samples and all the PJL samples, so to match them up I double-checked some divisions.
    Which letters correspond to which numbers is a strong inference, because I know how many samples are in each group of both divisions. I have the original data sets: both Zack Ajmal's divided set and the unadulterated 1000 Genomes data set.

    By the numbers:
    GIH_1 = GIH_Patel in the Zack Ajmal Dataset. Exactly 67.
    GIH_2 = GIH_ANI 1/2/3/4 = Gujarati A/B/C. Exactly 36.
    PJL_1 = PJL_A minus 2 samples OR PJL D minus 4 samples.
    PJL_2 = PJL_B/C/D plus 2 samples OR PJL A/B/C plus 4 samples.

    Subgroup      n     Sengupta et al. group    n
    PJL A         33    PJL 1                    31
    PJL B          9    PJL 2                    65
    PJL C         19
    PJL D         35
    GIH ANI 1      8    GIH 1                    67
    GIH ANI 2      8    GIH 2                    36
    GIH ANI 3     15
    GIH ANI 4      5
    GIH PATEL     67

    Corresponds exactly to the divisions.
    Every sample is used. No sample is unaccounted for.



    EDIT: Originally, I wrote which ANI/PATEL group corresponds with which GIH_A/B/C/D. However, I need to double-check before I type that out.

    This diagram and table list n for each sample set from the Sengupta et al. study and their subsequent divisions.




    If you think about it, Punjabi farmer and merchant Biradaris literally cluster with Sindhi. So it doesn't make sense to put those in separate groups. But again this is an old paper with outdated data.

  9. The Following 3 Users Say Thank You to MonkeyDLuffy For This Useful Post:

     bmoney (07-03-2018),  redifflal (07-02-2018),  Sapporo (07-02-2018)

  10. #6
    Gold Class Member
    Posts
    4,034
    Location
    Shangri La

    FYI, that paper is considerably outdated, considering the papers which have dropped this year.

  11. The Following 5 Users Say Thank You to pegasus For This Useful Post:

     bmoney (07-03-2018),  khanabadoshi (07-02-2018),  Observer (07-03-2018),  redifflal (07-02-2018),  Sapporo (07-02-2018)

  12. #7
    Registered Users
    Posts
    1,361
    Sex
    Location
    Loe Banr
    Y-DNA (P)
    H1a* (H-Y33670*)
    mtDNA (M)
    M5a

    Quote Originally Posted by khanabadoshi View Post
    The study used all the GIH samples and all the PJL samples, so to match them up I double-checked some divisions.
    Which letters correspond to which numbers is a strong inference, because I know how many samples are in each group of both divisions. I have the original data sets: both Zack Ajmal's divided set and the unadulterated 1000 Genomes data set.

    By the numbers:
    GIH_1 = GIH_Patel in the Zack Ajmal Dataset. Exactly 67.
    GIH_2 = GIH_ANI 1/2/3/4 = Gujarati A/B/C. Exactly 36.
    PJL_1 = PJL_A minus 2 samples OR PJL D minus 4 samples.
    PJL_2 = PJL_B/C/D plus 2 samples OR PJL A/B/C plus 4 samples.

    Subgroup      n     Sengupta et al. group    n
    PJL A         33    PJL 1                    31
    PJL B          9    PJL 2                    65
    PJL C         19
    PJL D         35
    GIH ANI 1      8    GIH 1                    67
    GIH ANI 2      8    GIH 2                    36
    GIH ANI 3     15
    GIH ANI 4      5
    GIH PATEL     67

    Corresponds exactly to the divisions.
    Every sample is used. No sample is unaccounted for.



    EDIT: Originally, I wrote which ANI/PATEL group corresponds with which GIH_A/B/C/D. However, I need to double-check before I type that out.

    This diagram and table list n for each sample set from the Sengupta et al. study and their subsequent divisions.






    As for how the GIH_Patel group got its name, or why they are assumed to be Patels: when the GIH samples first came out, they plotted like this:



    So every single person who analyzed it knew that there were at least two distinct groups. Zack Ajmal then compared that to known Patels he had from his Harappa project, and the assumption became that these ~67 are Patels.
    The rest of the samples then got divided by ANI, so you got 1/2/3/4.

    This part is important. ORIGINALLY, Gujarati-a represented the group that clustered tightly. Gujarati-b were those that DID NOT. Gujarati-a got renamed GIH_Patel AFTER Zack's analysis. All calculators at that time only had two divisions for Gujarati, A and B.
    Subsequently, the Gujarati A/B/C/D names were made, corresponding to ANI 1/2/3/4 and Patel. Which is which, I am not exactly sure; I will have to dig more deeply into that. However, the important thing is that the definition of Gujarati A on admix calculators flipped!
    They didn't ask for background or caste before collecting samples. The idea that the bulk of GIH samples are Patels is just a good educated guess based on the results of individual Gujaratis with known backgrounds, like you said.

    Gujarati B's 36 samples and Gujarati A's 67 samples are pretty much all of GIH, right?

    I just found the 36 individual results of "Gujarati-B". This is the group they divided into A/B/C/D for use in genetic studies. So there is really no need to upload GIH to GEDmatch all over again, IMO. We already have them.

    https://docs.google.com/spreadsheets...0iw/edit#gid=0

    There are like 5-6 results out of 100-plus which score like Gujarati ANI 1. Based on the results we have seen so far, a couple of them are Lohana-like and a few others are Gujarati Brahmin-like.

    For PJL, for some reason every study uses ANI 4-like samples, even though they are only about 30% of the overall PJL samples, or 45% if one includes ANI 3 as well. That's where the confusion comes from.
    Last edited by bol_nat; 07-02-2018 at 03:58 PM.

  13. The Following 4 Users Say Thank You to bol_nat For This Useful Post:

     bmoney (07-03-2018),  khanabadoshi (07-02-2018),  MonkeyDLuffy (07-02-2018),  Sapporo (07-02-2018)

  14. #8
    Banned
    Posts
    440
    Sex
    Ethnicity
    Bengali
    Y-DNA (P)
    R2A2 (R-P267)
    mtDNA (M)
    M52A

    Agree. The Narasimhan and Reich work is the first time any kind of aDNA relevant to the Indian subcontinent has been published alongside modern population data. As inadequate as it is, you have to go with that one over anything that had no aDNA basis whatsoever.

  15. The Following User Says Thank You to redifflal For This Useful Post:

     bmoney (07-03-2018)

  16. #9
    Registered Users
    Posts
    1,361
    Sex
    Location
    Loe Banr
    Y-DNA (P)
    H1a* (H-Y33670*)
    mtDNA (M)
    M5a

    Quote Originally Posted by MonkeyDLuffy View Post
    If you think about it, Punjabi farmer and merchant Biradaris literally cluster with Sindhi. So it doesn't make sense to put those in separate groups. But again this is an old paper with outdated data.
    It's because genetic studies have always used PJL ANI 4-like samples. It's easy to verify with the IDs and samples khanabadoshi has uploaded.

  17. #10
    Registered Users
    Posts
    3,372
    Sex
    Location
    Chicago
    Ethnicity
    Baloch Kashmiri Uzbek Kho
    Nationality
    USA
    Y-DNA (P)
    R-Y17491 > R-YP4858
    mtDNA (M)
    A8a

    Quote Originally Posted by bol_nat View Post
    They didn't ask for background or caste before collecting samples. The idea that the bulk of GIH samples are Patels is just a good educated guess based on the results of individual Gujaratis with known backgrounds, like you said.

    Gujarati B's 36 samples and Gujarati A's 67 samples are pretty much all of GIH, right?

    I just found the 36 individual results of "Gujarati-B". This is the group they divided into A/B/C/D for use in genetic studies. So there is really no need to upload GIH to GEDmatch all over again, IMO. We already have them.

    https://docs.google.com/spreadsheets...0iw/edit#gid=0

    There are like 5-6 results out of 100-plus which score like Gujarati ANI 1. Based on the results we have seen so far, a couple of them are Lohana-like and a few others are Gujarati Brahmin-like.

    For PJL, for some reason every study uses ANI 4-like samples, even though they are only about 30% of the overall PJL samples, or 45% if one includes ANI 3 as well. That's where the confusion comes from.
    The Gujarati B you found is not the Gujarati B from the A/B/C/D group; it's the "non-Patel" Gujarati-b group, that is, the ANI 1, 2, 3, and 4 GIH guys. I know it's confusing.
    Lazaridis (or whoever he sourced the samples from) made their own A/B/C/D divisions, and all admixture calculators after that use them as standard.
    I am fairly sure Gujarati-a, which became GIH_Patel, eventually became Gujarati C. Gujarati-b became GIH_ANI_1/2/3/4, which is now Gujarati A, B, and D: ANI 1 became A, ANI 2 and 3 became B, and ANI 4 became D.
    The correspondence may not be exactly 1-to-1, but that's likely the pattern.

    We don't know if Gujarati-a (and whatever later group they became, B/C/D) are in fact Patels; what we do know is that a bunch of known Patels placed in their extremely tight cluster, and everyone else in Gujarat did not.
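
    The renaming chain described here can be written down as a small lookup table. This is only a sketch of the pattern guessed at in this post, not a confirmed mapping. The dictionary and function names are made up for illustration.

```python
# Tentative label history for the GIH subgroups, as guessed in this post.
# Key: ANI/Patel subgroup name.
# Value: (original two-way calculator label, later A/B/C/D label).
# The A/B/C/D column is an assumption, not a verified correspondence.
GIH_LABELS = {
    "GIH_Patel": ("Gujarati-a", "Gujarati C"),
    "GIH_ANI_1": ("Gujarati-b", "Gujarati A"),
    "GIH_ANI_2": ("Gujarati-b", "Gujarati B"),
    "GIH_ANI_3": ("Gujarati-b", "Gujarati B"),
    "GIH_ANI_4": ("Gujarati-b", "Gujarati D"),
}


def subgroups_for(abcd_label):
    """List the ANI/Patel subgroups assumed to sit behind an A/B/C/D label."""
    return sorted(
        name for name, (_, later) in GIH_LABELS.items() if later == abcd_label
    )


print(subgroups_for("Gujarati B"))
print(subgroups_for("Gujarati C"))
```

    Laid out this way, the "flip" is easy to see: the tight cluster was the original Gujarati-a, but under the later scheme Gujarati A would be ANI 1, a non-Patel subgroup.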
    Last edited by khanabadoshi; 07-02-2018 at 04:10 PM.




  18. The Following User Says Thank You to khanabadoshi For This Useful Post:

     bmoney (07-03-2018)
