
Thread: G25 coordinates for ethnic minorities from China

  1. #11
    Quote Originally Posted by altvred View Post
    Hey, can you re-share the coordinates from the Sudanese study and the following; the original links have expired!

    https://anthrogenica.com/showthread....-Human-Origins

  2. The Following User Says Thank You to norseking For This Useful Post:

     chum (02-01-2023)

  3. #12
    Gold Class Member
    Posts
    866
    Sex
    Y-DNA (P)
    R-Y33
    mtDNA (M)
    J1c2
    Y-DNA (M)
    E-Y6938
    mtDNA (P)
    G2a

    https://drive.google.com/file/d/1tZf...ew?usp=sharing

     


    The samples with abbreviated pop names are from here:

    https://zenodo.org/record/6816509
    https://www.frontiersin.org/articles...22.995783/full

    We collected saliva samples from 239 individuals from the Tai-Kadai-speaking populations in Guizhou, southwest China. There were 50 individuals of Congjiang Zhuang in southeastern Guizhou (Zhuang_Congjiang), 51 individuals of Guanling Bouyei (Bouyei_Guanling), 22 individuals of Majiang Mulao in southeastern Guizhou (Mulao_Majiang, 木佬), 48 individuals of Liping Dong in southeastern Guizhou (Dong_Liping), 20 individuals of Dong in Dong_Tongren, and 48 individuals of Wuchuan Gelao in Zunyi (Gelao_Wuchuan). The geographical locations of the above six populations are shown in Supplementary Figure 1. Participants whose parents and grandparents are indigenous people reside in Guizhou for at least three generations and should have no consanguineous marriage with other groups. The Medical Ethics Committee of Guizhou Medical University approved the study (2019 Ethics Approval Document No. 74). We followed the recommendations provided by the revised Helsinki Declaration of 2000. All the participants signed written informed consent before participating in the study.
    I couldn't locate any supplementary files that included precise sample or population labels, so I decided to keep the labels abbreviated instead of guessing based on ID or number of samples.

  4. The Following 3 Users Say Thank You to altvred For This Useful Post:

     Aben Aboo (02-06-2023),  Michalis Moriopoulos (02-05-2023),  MNOPSC1b (02-05-2023)

  5. #13

    Quote Originally Posted by norseking View Post
    Hey, can you re-share the coordinates from the Sudanese study and the following; the original links have expired!

    https://anthrogenica.com/showthread....-Human-Origins
    https://drive.google.com/file/d/1Kpl...ew?usp=sharing
    https://drive.google.com/file/d/1tj1...ew?usp=sharing

  6. The Following 8 Users Say Thank You to altvred For This Useful Post:

     Aben Aboo (02-06-2023),  Agamemnon (02-07-2023),  alienation (02-05-2023),  drobbah (02-06-2023),  norseking (02-05-2023),  pegasus (02-05-2023),  PLogan (02-05-2023),  Tsakhur (02-05-2023)

  7. #14
    Registered Users
    Posts
    2,114
    Sex
    Location
    Central Florida
    Ethnicity
    Greek + Anglo-American
    Nationality
    American
    Y-DNA (P)
    J-PF5197
    mtDNA (M)
    J1b1a

    Greece United States of America
    This is a significant update for East Asia. Great work as always, Altvred! Keep 'em coming!

    I wish we could find some publicly available Ainu genomes.
    Ελευθερία ή θάνατος.

  8. The Following 4 Users Say Thank You to Michalis Moriopoulos For This Useful Post:

     Agamemnon (02-07-2023),  alienation (02-06-2023),  altvred (02-06-2023),  xerxez (02-06-2023)

  9. #15
    Quote Originally Posted by altvred View Post
    I couldn't locate any supplementary files that included precise sample or population labels, so I decided to keep the labels abbreviated instead of guessing based on ID or number of samples.
    Not sure I understand this part. Are those abbreviations you give to samples in the Google drive sheet also used in the genotype files themselves for those specific samples? Because if they give those abbreviations then it should be pretty clear what's what, right?
    Ελευθερία ή θάνατος.

  10. #16

    Quote Originally Posted by Michalis Moriopoulos View Post
    Not sure I understand this part. Are those abbreviations you give to samples in the Google drive sheet also used in the genotype files themselves for those specific samples? Because if they give those abbreviations then it should be pretty clear what's what, right?
    The pop labels I assigned were based on sample IDs.

    The original FAM file lacked FIDs (the first column, which is usually used for population labels).
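
    For anyone wanting to reproduce the labelling step, here is a minimal sketch of filling the empty FID column from the sample IDs. It assumes each IID starts with an alphabetic population prefix (for example `CZ001`), which is a guess about the ID layout and not something confirmed from the dataset; the function name and the example rows are hypothetical.

```python
import re
from collections import Counter


def add_fids_from_iids(fam_lines):
    """Return .fam lines with column 1 (FID) set to the alphabetic prefix of the IID.

    A PLINK .fam line has six whitespace-separated columns:
    FID, IID, father ID, mother ID, sex, phenotype.
    """
    relabelled = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        iid = fields[1]
        prefix = re.match(r"[A-Za-z]+", iid)
        # Fall back to the IID itself when it has no alphabetic prefix.
        fields[0] = prefix.group(0) if prefix else iid
        relabelled.append(" ".join(fields))
    return relabelled


# Hypothetical rows; the real sample IDs may look different.
example = [
    "0 CZ001 0 0 0 -9",
    "0 CZ002 0 0 0 -9",
    "0 TD017 0 0 0 -9",
]
relabelled = add_fids_from_iids(example)

# Tally samples per derived label, like the per-label counts in this post.
label_counts = Counter(row.split()[0] for row in relabelled)
print(label_counts)
```

    The fallback to the full IID keeps unlabelled samples visible in the tally instead of silently merging them into one group.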


    https://zenodo.org/record/6816509

    We genotyped about 700,000 single nucleotide polymorphisms (SNPs) of 239 individuals from six Tai-Kadai-speaking populations residing in the mountainous Guizhou Province of southwestern China.
    There are actually 251 individuals in that dataset, not 239.


    We collected saliva samples from 239 individuals from the Tai-Kadai-speaking populations in Guizhou, southwest China. There were 50 individuals of Congjiang Zhuang in southeastern Guizhou (Zhuang_Congjiang), 51 individuals of Guanling Bouyei (Bouyei_Guanling), 22 individuals of Majiang Mulao in southeastern Guizhou (Mulao_Majiang, 木佬), 48 individuals of Liping Dong in southeastern Guizhou (Dong_Liping), 20 individuals of Dong in Dong_Tongren, and 48 individuals of Wuchuan Gelao in Zunyi (Gelao_Wuchuan).
    Here is the number of samples for each abbreviated pop label.
    Code:
         51 CZ
         48 GBY
         46 LD
         39 MM
         20 TD
         47 WGL
    Now,
    TD=Dong_Tongren (only pop with 20 samples)

    WGL=Gelao_Wuchuan

    According to the way they cluster, I am fairly confident that is the case.

    The Gelao_Wuchuan and Dong_Tongren clustered with southern Han Chinese with amounts of Sino-Tibetan-related components.
     


    The rest probably are:
    LD=Dong_Liping
    MM=Mulao_Majiang
    CZ=Zhuang_Congjiang
    GBY=Bouyei_Guanling
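
    To show why sample counts alone don't settle the assignment, here is a small sketch comparing the per-label counts in the genotype files with the counts stated in the paper, under the tentative mapping given in this post. The mapping is only the poster's best guess ("probably"), not something confirmed by the authors; the variable names are illustrative.

```python
# Sample counts per abbreviated label, as found in the genotype files.
observed = {"CZ": 51, "GBY": 48, "LD": 46, "MM": 39, "TD": 20, "WGL": 47}

# Sample counts per population, as stated in the paper's methods section.
paper = {
    "Zhuang_Congjiang": 50,
    "Bouyei_Guanling": 51,
    "Mulao_Majiang": 22,
    "Dong_Liping": 48,
    "Dong_Tongren": 20,
    "Gelao_Wuchuan": 48,
}

# Tentative mapping from this post (an assumption, not confirmed by the authors).
tentative_mapping = {
    "TD": "Dong_Tongren",
    "WGL": "Gelao_Wuchuan",
    "LD": "Dong_Liping",
    "MM": "Mulao_Majiang",
    "CZ": "Zhuang_Congjiang",
    "GBY": "Bouyei_Guanling",
}

total_observed = sum(observed.values())
total_paper = sum(paper.values())

# Difference (files minus paper) for each label under the tentative mapping.
differences = {
    label: observed[label] - paper[population]
    for label, population in tentative_mapping.items()
}

print(f"files: {total_observed}, paper: {total_paper}")
for label, diff in differences.items():
    print(f"{label} -> {tentative_mapping[label]}: {diff:+d}")
```

    Only TD matches its paper count exactly; the other labels are off by one to seventeen samples, so clustering behaviour has to carry most of the weight.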

    https://pastebin.com/pbsYVmXV

    There are some outliers that need to be removed. The GBY ones are fairly obvious.

     

  11. The Following 3 Users Say Thank You to altvred For This Useful Post:

     Agamemnon (02-07-2023),  Michalis Moriopoulos (02-07-2023),  MNOPSC1b (02-06-2023)

  12. #17

    Quote Originally Posted by Michalis Moriopoulos View Post
    This is a significant update for East Asia. Great work as always, Altvred! Keep 'em coming!

    I wish we could find some publicly available Ainu genomes.
    Cheers.

    Btw, I just sent David the dataset from Fortes-Lima et al. 2022.

    SNP coverage isn't an issue with the Illumina Omni2.5-v8 chip. There were 327 samples from the Sahel, 109 of them from Chad. G25 definitely has a blind spot there.

    Here is where they plot on a PCA alongside the Sudanese samples from Hollfelder 2017. (If you're wondering, this isn't G25, but rather my own smartpca run.)

     


  13. The Following 8 Users Say Thank You to altvred For This Useful Post:

     Agamemnon (02-07-2023),  chum (02-08-2023),  CopperAxe (02-07-2023),  ESPLover4 (02-08-2023),  Megalophias (02-07-2023),  Michalis Moriopoulos (02-07-2023),  PLogan (02-06-2023),  Ryukendo (02-07-2023)

  14. #18
    Global Moderator
    Posts
    923
    Sex
    Location
    Missouri, U.S.
    Ethnicity
    Colonial American
    Nationality
    American
    aDNA Match (1st)
    VK2020_Scotland_Orkney_VA:VK207
    Y-DNA (P)
    R1b-U152 >R-FTA96415
    mtDNA (M)
    J1b1a1a
    Y-DNA (M)
    I2-P37 > I-BY77146
    mtDNA (P)
    H66a

    United States of America Scotland England Netherlands
    Can you educate me... never seen these before.



    What are these data files they provide?

    XXX_R01C01_Grn.idat
    XXX_R01C01_Red.idat

    32MB each

    Each strand of SNPs read?

    UPDATE: Found the plink format files...


  15. #19

    Quote Originally Posted by PLogan View Post
    Can you educate me... never seen these before.



    What are these data files they provide?

    XXX_R01C01_Grn.idat
    XXX_R01C01_Red.idat

    32MB each

    Each strand of SNPs read?
    IDAT files are Illumina's raw microarray intensity data. The Grn and Red files are the two fluorescence channels of the same array, not separate strands. The way it works is that you use Illumina software to call genotypes from the raw data and then convert them to whatever format you want.

    I recommend using the gtc2vcf bcftools plugin.

    https://github.com/freeseek/gtc2vcf#...-and-cel-files

    Anyway, the data for that particular dataset is already available in PLINK format.
    https://www.ebi.ac.uk/biostudies/fil...l_MBE_2022.bed
    https://www.ebi.ac.uk/biostudies/fil...l_MBE_2022.bim
    https://www.ebi.ac.uk/biostudies/fil...l_MBE_2022.fam

  16. The Following User Says Thank You to altvred For This Useful Post:

     PLogan (02-06-2023)

  17. #20
    Those Savannah samples are seriously high quality: 2.2 million SNPs.

    On average, each sample is 99.05% complete within that 2.2 million.


    Missing SNPs
    Code:
    Mean	0.00946244
    Standard Error	0.002189761
    Median	0.004009
    Mode	0.003748
    Standard Deviation	0.039597762
    Sample Variance	0.001567983
    Kurtosis	135.6833522
    Skewness	11.17522859
    Range	0.540847
    Minimum	0.002953
    Maximum	0.5438
    Sum	3.094218
    Count	327
    
    Population	Samples
    Chad_ArabBaggara	25
    Chad_Dangaleat	25
    Chad_Daza	22
    Chad_Maba	25
    Chad_Zaghawa	12
    Guinea_Fulani	25
    Senegal_Bedik	25
    Senegal_Halpularen	25
    Sudan_ArabBaggara	23
    Sudan_ArabKababish	25
    Sudan_ArabRashaayda	20
    Sudan_Daju	25
    Sudan_NubaKoalib	25
    Sudan_Zaghawa	25
    Grand Total	327
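
    As a sanity check on the figures above, here is a short sketch that recomputes the derived values from the numbers reported in this post. It uses only the posted sum, count, standard deviation and population table; nothing is read from the genotype files, and the variable names are illustrative.

```python
import math

# Per-sample missingness summary, copied from the table in this post.
missing_sum = 3.094218
sample_count = 327
standard_deviation = 0.039597762

# Mean missing fraction and the matching completeness percentage.
mean_missing = missing_sum / sample_count
completeness_pct = (1 - mean_missing) * 100

# Standard error of the mean is the standard deviation over sqrt(n).
standard_error = standard_deviation / math.sqrt(sample_count)

# Population sample counts, copied from the table in this post.
population_counts = {
    "Chad_ArabBaggara": 25, "Chad_Dangaleat": 25, "Chad_Daza": 22,
    "Chad_Maba": 25, "Chad_Zaghawa": 12, "Guinea_Fulani": 25,
    "Senegal_Bedik": 25, "Senegal_Halpularen": 25, "Sudan_ArabBaggara": 23,
    "Sudan_ArabKababish": 25, "Sudan_ArabRashaayda": 20, "Sudan_Daju": 25,
    "Sudan_NubaKoalib": 25, "Sudan_Zaghawa": 25,
}
total_samples = sum(population_counts.values())
chad_samples = sum(
    count for name, count in population_counts.items() if name.startswith("Chad_")
)

print(f"mean missing: {mean_missing:.8f}")
print(f"completeness: {completeness_pct:.2f}%")
print(f"standard error: {standard_error:.9f}")
print(f"total samples: {total_samples}, from Chad: {chad_samples}")
```

    The high skewness and the 0.5438 maximum in the table suggest the mean is pulled up by a few poorly genotyped samples, which is why the median (0.004009) sits well below it.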
