
Thread: New Reich ancient DNA data - V54.1

  1. #1
    Global Moderator
    Posts
    848
    Sex
    Location
    Missouri, U.S.
    Ethnicity
    Colonial American
    Nationality
    American
    aDNA Match (1st)
    VK2020_Scotland_Orkney_VA:VK207
    Y-DNA (P)
    R1b-U152 >R-FTA96415
    mtDNA (M)
    J1b1a1a
    Y-DNA (M)
    I2-P37 > I-BY77146
    mtDNA (P)
    H66a

    United States of America Scotland England Netherlands

    New Reich ancient DNA data - V54.1


  2. The Following 24 Users Say Thank You to PLogan For This Useful Post:

     anglesqueville (11-18-2022),  Brent.B (11-18-2022),  Bygdedweller (11-18-2022),  capsian (11-19-2022),  crawford (11-18-2022),  Dewsloth (11-18-2022),  Erikl86 (12-25-2022),  Fried (11-26-2022),  Gentica277282 (11-17-2022),  JMcB (11-18-2022),  Jon (11-18-2022),  Kale (11-18-2022),  Luso (11-18-2022),  Mnemonics (11-19-2022),  parastais (11-19-2022),  pmokeefe (11-18-2022),  procoptodon (01-09-2023),  R.Rocca (11-18-2022),  Ruderico (11-18-2022),  teepean47 (11-19-2022),  TheHolyGoose (11-18-2022),  Wâldpykjong (11-17-2022),  xerxez (11-18-2022),  Æsir (11-24-2022)

  3. #2
    Need some help here, mates...

    Not sure what I'm doing differently this time in converting the file formats from EIGENSTRAT to PLINK.

    I converted both packages, but my fam(ily) files seem to have the family/population names stripped from them. My previous conversions, such as v50 and v52, brought that information over.

    Here is my parameter file:

    Code:
    genotypename:    v54.1_1240K_public.geno
    snpname:         v54.1_1240K_public.snp
    indivname:       v54.1_1240K_public.ind
    outputformat:    PACKEDPED
    genotypeoutname: v54.1_1240K_public.bed
    snpoutname:      v54.1_1240K_public.bim
    indivoutname:    v54.1_1240K_public.fam
    Example of the v52 fam file and the v54 fam file vs. the ind file


  4. The Following User Says Thank You to PLogan For This Useful Post:

     JMcB (11-18-2022)

  5. #3
    Registered Users
    Posts
    8,690
    Sex
    Location
    Normandy
    Ethnicity
    northwesterner
    aDNA Match (1st)
    VK245 Sandoy Faroe islands early medieval
    aDNA Match (2nd)
    I21275 England Middle Iron Age
    aDNA Match (3rd)
    I19874 England Middle Iron Age
    Y-DNA (P)
    R-BY3604-Z275
    mtDNA (M)
    H5a1
    Y-DNA (M)
    I-M253
    mtDNA (P)
    K1c1c

    Normandie Orkney Netherlands Friesland East Frisia Finland
    Some long-awaited novelties inside (including Saxons !!)
    MyHeritage
    North and West European 55.8%
    English 28.5%
    Baltic 11.5%
    Finnish 4.2%
    GENETIC GROUPS Scotland (Aberdeen and Aberdeenshire)

  6. The Following 2 Users Say Thank You to anglesqueville For This Useful Post:

     JMcB (11-18-2022),  pmokeefe (11-18-2022)

  7. #4
    Registered Users
    Posts
    346
    Sex
    Ethnicity
    Finnish

    Quote Originally Posted by PLogan View Post
    my fam(ily) files seem to have the family/population names stripped from them.
    `convertf` removes the population names, but you can add them back to the `fam` file like this: `f=v54.1_1240K_public;awk 'NR==FNR{a[$1]=$3;next}{$1=a[$2]}1' $f.{ind,fam}>$f.temp;mv $f.{temp,fam}`.
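
For anyone who wants the one-liner above spelled out, here is a minimal sketch of the same join as a shell function. It assumes the `.ind` columns are sample ID, sex, population, and that column 2 of the `.fam` holds the same sample IDs. `restore_pops` is a name made up for this example, not part of EIGENSOFT or PLINK. Unlike the one-liner, it leaves a row unchanged when its sample ID has no match in the `.ind`.

```shell
# Sketch: copy population labels from an EIGENSTRAT .ind file into
# column 1 (family ID) of a PLINK .fam file.
# Assumption: .ind columns are "sample sex population".
# restore_pops is a hypothetical helper name.
restore_pops() {
    ind_file=$1
    fam_file=$2
    tmp_file="$fam_file.tmp"
    awk '
        NR == FNR   { pop[$1] = $3; next }   # first file (.ind): remember population per sample
        ($2 in pop) { $1 = pop[$2] }         # second file (.fam): replace family ID when known
                    { print }                # print every .fam row, changed or not
    ' "$ind_file" "$fam_file" > "$tmp_file" && mv "$tmp_file" "$fam_file"
}
```

Usage would be something like `restore_pops v54.1_1240K_public.ind v54.1_1240K_public.fam`. Rewritten rows come out single-space separated, which PLINK should accept since `.fam` files are whitespace-delimited.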

    However, when the combined length of the population name and sample name of any sample is over 39 characters, EIGENSOFT tools like SmartPCA exit with an error like this: `idnames too long Russian_Archangelsk_Krasnoborsky Rakr-203 ll: 41 limit: 39`. In that case you can convert the population names back to a sequence of integers: `f=v54.1_1240K_public;awk '{$1=NR}1' $f.fam|sponge $f.fam`.
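
A rough sketch for spotting the offending rows before running anything from EIGENSOFT. In the quoted error the two names are 32 and 8 characters long and `ll` is 41, so the count appears to include one separator character. That is an inference from this single message, not something taken from the EIGENSOFT source. `long_ids` and `number_families` are hypothetical helper names. The second one writes through a temp file so it does not need `sponge`.

```shell
# long_ids FAM [LIMIT]: print "population sample length" for rows whose
# combined name length exceeds LIMIT (default 39).
# Assumption: length = population + sample + 1 separator, inferred from
# the example error message in the post above.
long_ids() {
    fam_file=$1
    limit=${2:-39}
    awk -v limit="$limit" '
        { n = length($1) + length($2) + 1 }
        n > limit { print $1, $2, n }
    ' "$fam_file"
}

# number_families FAM: replace column 1 with the row number, in place.
number_families() {
    fam_file=$1
    tmp_file="$fam_file.tmp"
    awk '{ $1 = NR } 1' "$fam_file" > "$tmp_file" && mv "$tmp_file" "$fam_file"
}
```

If `long_ids v54.1_1240K_public.fam` prints nothing, the population names can probably stay as they are. Otherwise `number_families` falls back to integers as described above.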

  8. The Following 2 Users Say Thank You to Nganasankhan For This Useful Post:

     crawford (11-18-2022),  JMcB (11-18-2022)

  9. #5
    Thank you friend, good to see you again.

  10. The Following User Says Thank You to PLogan For This Useful Post:

     digital_noise (11-19-2022)

  11. #6
    Global Moderator
    Posts
    4,908
    Sex
    Location
    Vissaiom
    Ethnicity
    Portuguese highlander
    aDNA Match (1st)
    Romano-Lusitanian:Miroiço_R10500
    aDNA Match (2nd)
    Lusitanian:Conimbriga_R10488
    Y-DNA (P)
    E-Y31991>FT17866
    mtDNA (M)
    H20 (xH20a)

    Asturias Galicia Portugal 1143 Portugal 1485 Portugal Order of Christ PortugalRoyalFlag1830
    Can anyone please check whether or not R10488 is present in the dataset?

    Edit: Never mind, I see he is. How does one check SNP data? Would like to see whether he has certain mutations or not
    YDNA E-Y31991>PF4428>Y134097>Y134104>Y168273>FT17866 (TMRCA ~1100AD) - Domingos Rodrigues, b. circa 1690 Hidden Content , Viana do Castelo, Portugal - Stonemason, miller.
    mtDNA H20 - Monica Vieira, b. circa 1700 Hidden Content , Porto, Portugal

    Hidden Content
    Global25 PCA West Eurasia dataset Hidden Content

    [1] "distance%=1.497"
    Ruderico

    Galaico-Lusitanian,72.2
    Berber_IA_EMA,9.6
    Briton_IA,9.2
    Roman_Colonial,9

    RY'N NI YMA O HYD!

  12. The Following 4 Users Say Thank You to Ruderico For This Useful Post:

     JMcB (11-18-2022),  Luso (11-18-2022),  pmokeefe (11-18-2022),  TheHolyGoose (11-19-2022)

  13. #7
    Quote Originally Posted by Ruderico View Post
    Can anyone please check whether or not R10488 is present in the dataset?
    It's in both sets

    R10488.SG M Portugal_Conimbriga_Roman.SG

    R10488.SG M Portugal_Conimbriga_Roman.SG
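
A lookup like the one above can be sketched as a small shell function, assuming the usual three-column `.ind` layout (sample ID, sex, population). `find_sample` is a made-up name for this example. It matches the ID exactly or followed by a dot suffix such as `.SG`, so `R10488` will not also match a longer ID like `R104880`.

```shell
# find_sample ID FILE...: print .ind rows whose first column is ID,
# or ID followed by a "." suffix (e.g. R10488.SG), with the file name.
# Assumption: .ind columns are "sample sex population".
# find_sample is a hypothetical helper name.
find_sample() {
    sample_id=$1
    shift
    awk -v id="$sample_id" '
        $1 == id || index($1, id ".") == 1 { print FILENAME ": " $1, $2, $3 }
    ' "$@"
}
```

Passing the `.ind` files of both packages at once, for example `find_sample R10488 v54.1_1240K_public.ind other_package.ind` (the second file name is a placeholder), shows which sets contain the sample.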

  14. The Following 3 Users Say Thank You to PLogan For This Useful Post:

     Luso (11-18-2022),  Ruderico (11-18-2022),  TheHolyGoose (11-19-2022)

  15. #8
    Quote Originally Posted by Ruderico View Post
    Can anyone please check whether or not R10488 is present in the dataset?

    Edit: Never mind, I see he is. How does one check SNP data? Would like to see whether he has certain mutations or not
    I can extract it later for you if nobody beats me to it. Will be several hours though.

  16. The Following 2 Users Say Thank You to PLogan For This Useful Post:

     JMcB (11-18-2022),  Ruderico (11-18-2022)

  17. #9
    Quote Originally Posted by PLogan View Post
    I can extract it later for you if nobody beats me to it. Will be several hours though.
    Thanks, that would be very kind of you

  18. The Following 2 Users Say Thank You to Ruderico For This Useful Post:

     Gentica277282 (11-18-2022),  JMcB (11-18-2022)

  19. #10
    Registered Users
    Posts
    2,779
    Sex
    Omitted

    Can someone pretty please upload a plink version? The compilations have gotten so big that my old dino computer doesn't have the power to convert them anymore.
    Thanks in advance

    EDIT: First observations looking at .anno
    - Looks like the Southern arc samples are in there (didn't check for all)
    - Klein7.SG - 7244-6796BCE - Austria_Kleinhadersdorf_LBK_EN.SG - mt: W1-119 - Conflict between date and cultural attribution
    - I19561 - the Bosnian of unknown date in Southern Arc, suspected to be recent - 1223-1273 calCE
    - A lot of samples from the country Gernamy, woops!
    - Nea2.SG & Nea3.SG - new Greece_EN samples
    - Some new samples from Indonesia, ~2k BP
    - The SNP count for UKY001 is different than in v52.2, hopefully it's the real UKY001 now instead of KPT002.
    - Ukraine Trypillian samples from GelabertSciRep2022 present
    - Looks like I21276 is still missing from .anno (it was missing in the v52.2 .anno but present in the dataset)
    Last edited by Kale; 11-18-2022 at 04:15 PM.
    Collection of 14,000 d-stats: Hidden Content Part 2: Hidden Content Part 3: Hidden Content PM me for d-stats, qpadm, qpgraph, or f3-outgroup nmonte models.

  20. The Following 10 Users Say Thank You to Kale For This Useful Post:

     Albruic (01-04-2023),  crawford (11-20-2022),  Dewsloth (11-18-2022),  Gentica277282 (11-18-2022),  Gordius (11-18-2022),  J Man (11-18-2022),  JMcB (11-18-2022),  MethCat (11-19-2022),  peloponnesian (11-18-2022),  Ruderico (11-18-2022)
