
Thread: Making drift-free northern European (Celtic, Baltic, Scandinavian, and Slavic) samples

  1. #21
    Moderator
    Posts
    2,669
    Sex
    Ethnicity
    Pashtun/East-Slavic
    Y-DNA (P)
    R1a-Z93

    Quote Originally Posted by Nganasankhan View Post
    I'm getting additional EHG for Uralic populations in qpAdm models that include Corded Ware in left populations, like for example the first model below. But then in models with Yamnaya and Turkey_N in place of Corded Ware, like the fifth model below, often I don't get extra EHG.



    Also in these models for Saami, the first model which doesn't include Corded Ware doesn't get EHG, but the third model which is the highest-p model with Corded Ware gets high EHG:



    Code:
    library(admixtools)
    library(tidyverse)
    library(reshape2) # assumed source of melt(); none of the other packages here attach it
    library(colorspace)
    library(cowplot)
    
    # outgroups (right), candidate sources (left) and target for qpAdm
    right=c("Russia_Kostenki14","Karitiana","Russia_Ust_Ishim.DG","Iran_GanjDareh_N","Spain_HG_published.SG","Han")
    left=c("Turkey_N.DG","Russia_Samara_EBA_Yamnaya","Estonia_CordedWare.SG","Hungary_EN_HG_Koros_published.SG","Russia_HG_Samara","Nganasan")
    target="Saami.DG"
    
    qp=qpadm("v44.3_HO_public",left,right,target)
    
    # keep feasible models, drop the f4rank 0 rows, sort by p-value (highest first);
    # columns are selected by position, so this depends on the layout of qp$popdrop
    t=qp$popdrop%>%filter(feasible&f4rank!=0)%>%arrange(desc(p))%>%select(1,4,5,7:last_col(5))
    # long format: one row per model pattern (pat) and source population
    t2=melt(t[-c(2,3)],id.vars="pat")
    
    # row labels: the p-value of each model, e.g. ".013" or "5e-5"
    lab=sub("e-0","e-",sub("^0","",sprintf(ifelse(t$p<.001,"%.0e","%.3f"),t$p)))
    # lab=paste0(lab," (",ifelse(t$chisq<10,sub("^0","",sprintf("%.1f",t$chisq)),round(t$chisq)),")")
    
    subtit=str_wrap(paste("Outgroups:",paste(sort(right),collapse=", ")),width=50)
    
    legcols=1
    
    # one stacked bar per model, with the weight of each source printed as a percentage
    p=ggplot(t2,aes(x=fct_rev(factor(pat,levels=t$pat)),y=value,fill=variable))+
    geom_bar(stat="identity",width=1,position=position_fill(reverse=TRUE),size=.1,color="gray20")+
    geom_text(aes(label=round(100*value)),position=position_stack(vjust=.5,reverse=TRUE),size=3.5)+
    ggtitle(paste("Target:",target),subtitle=subtit)+
    coord_flip()+
    scale_x_discrete(expand=c(0,0),labels=rev(lab))+
    scale_y_discrete(expand=c(0,0))+
    scale_fill_manual(values=colorspace::hex(HSV(c(30,60,90,200,230,280),.5,1)))+ # manual colors
    # scale_fill_manual(values=colorspace::hex(HSV(head(seq(0,360,length.out=ncol(t)-2),-1),.5,1)))+ # automatic colors
    guides(fill=guide_legend(ncol=legcols,byrow=FALSE))+
    theme(
      axis.text=element_text(color="black",size=11),
      axis.text.x=element_blank(),
      axis.ticks=element_blank(),
      axis.title=element_blank(),
      legend.direction="horizontal",
      legend.key=element_rect(fill=NA),
      legend.margin=margin(-4,0,1,0),
      legend.text=element_text(size=11),
      legend.title=element_blank(),
      plot.subtitle=element_text(size=11),
      plot.title=element_text(size=16)
    )
    
    ggdraw(p)
    # relative heights of the plot panel and the legend, also used as the image height
    hei=c(.5+.2*(str_count(subtit,"\n")+1)+.25*nrow(t),.1+.23*ceiling(length(unique(t2[!is.na(t2$value),2]))/legcols))
    ggdraw(cowplot::plot_grid(p+theme(legend.position="none"),cowplot::get_legend(p),ncol=1,rel_heights=hei))
    ggsave("a.png",width=4,height=sum(hei))
    Could you check what happens if you add WSHG/Botai? I would think Uralics got their extra EHG-rich ancestry from them rather than from Volosovo/EHG.

  2. #22
    Gold Class Member
    Posts
    555
    Sex
    Y-DNA (P)
    R-Y33
    mtDNA (M)
    J1c2
    Y-DNA (M)
    E-Y6938
    mtDNA (P)
    G2a

    Quote Originally Posted by Coldmountains View Post
    AFAIK (I could be wrong), some Uralics, especially people like Khanty and Mansi, have significant WSHG, and more western Uralics may have a small amount of WSHG ancestry brought by early Uralic speakers and Steppe_IA (Steppe Iranics) from the east. EHG, on the other hand, looks very much like a population that was replaced from almost all directions. We see a lot of Indo-Iranians since 2000 BC who are rich in WSHG, but so far nobody rich in extra EHG, despite them living earlier in the EHG territories of Central-East Russia. WSHG ancestry seems to have persisted much longer than EHG: up to 0-500 AD we get populations very rich in WSHG ancestry in Central Asia/Siberia, whereas EHG seems to be insignificant after Fatyanovo and CWC (maybe it survived better in unsampled regions north of Fatyanovo?).
    Then that EHG shift in Udmurts I posted might be MA1/WSHG ancestry. I will try running the same setup but with Botai or Tyumen_HG instead of EHG.

    Of the Volga Uralic speakers, WSHG admixture is absent in Mordvins from what I've seen; in fact, they have very little kra001/Uralic ancestry to begin with. In qpAdm, more than half of the roughly 7-8% 'Siberian' admixture they score shows up as Slab Grave or some other South Siberian/Eastern Steppe source. Udmurts and Mari, on the other hand, show up as having only kra001 admixture.
    YDNA (P): R-Y33
    YDNA (P, maternal line): R-Y20756
    YDNA(M): E-Y6938

  3. The Following 2 Users Say Thank You to altvred For This Useful Post:

     Alain (08-01-2021),  Coldmountains (07-31-2021)

  4. #23
    Gold Class Member
    Posts
    555
    Sex
    Y-DNA (P)
    R-Y33
    mtDNA (M)
    J1c2
    Y-DNA (M)
    E-Y6938
    mtDNA (P)
    G2a

    Quote Originally Posted by Coldmountains View Post
    Could you check what happens if you add WSHG/Botai? I would think Uralics got their extra EHG-rich ancestry from them rather than from Volosovo/EHG.
    I replaced EHG with Botai in the f4 ratio function:

     


     



    It would appear that Mansi and Siberian Tatars have legit WSHG ancestry, and Udmurts do to a far lesser extent. There are no statistically significant results indicating WSHG admixture for other Uralic speakers or for Volga-Ural populations in general.

     


     

  5. The Following 2 Users Say Thank You to altvred For This Useful Post:

     Alain (08-01-2021),  Coldmountains (07-31-2021)

  6. #24
    Banned
    Posts
    223
    Sex
    Location
    Altai-Sayan

    Another option would be to create all of these groups as a simple mix of early Corded Ware (as opposed to the one I used) + Austria_N + Villabruna (or Koros_HG?), since EHG is not a big factor here. Balto-Slavic drift seems even more severe on G25 than I thought, skewing things so much that it can't decide whether to take in WHG or EHG.

    @Coldmountains @altvred what do you think?

    If you have a link to a good chart based on formal stats for all these groups, it would be helpful; I can model ghosts based on that.
    Last edited by Cynic; 07-30-2021 at 03:55 AM.

  7. #25
    Registered Users
    Posts
    642
    Sex
    Omitted

    Quote Originally Posted by Nganasankhan View Post
    I'm getting additional EHG for Uralic populations in qpAdm models that include Corded Ware in left populations, like for example the first model below. But then in models with Yamnaya and Turkey_N in place of Corded Ware, like the fifth model below, often I don't get extra EHG.



    Also in these models for Saami, the first model which doesn't include Corded Ware doesn't get EHG, but the third model which is the highest-p model with Corded Ware gets high EHG:



    Fascinating. Maris are indeed around 32-38% East Eurasian, while Saamis are around 27-33% East Eurasian, which is similar to the runs of their samples on G25. Can you do the same thing for Udmurts, Chuvash, Tatar_Siberians and Bashkirs?

    Also, can you try including Iran_N/BMAC sources like Ganj_Dareh/TJK_Sarazm_En/Bustan_BA and MNG_North_N in your runs? I want to see if these Volga Uralics also have some Central Asian or Turkic-related ancestry deriving from ancient Iranian tribes roaming near the Ural region and from close geographical proximity to Kazakhstan.
    Last edited by Tsakhur; 09-13-2021 at 10:58 AM.

  8. The Following User Says Thank You to Tsakhur For This Useful Post:

     Nganasankhan (09-13-2021)

  9. #26
    Registered Users
    Posts
    159
    Sex
    Ethnicity
    Finnish

    Quote Originally Posted by Tsakhur View Post
    Fascinating. Maris are indeed around 32-38% East Eurasian, while Saamis are around 27-33% East Eurasian, which is similar to the runs of their samples on G25. Can you do the same thing for Udmurts, Chuvash, Tatar_Siberians and Bashkirs?

    Also, can you try including Iran_N/BMAC sources like Ganj_Dareh/TJK_Sarazm_En/Bustan_BA and MNG_North_N in your runs? I want to see if these Volga Uralics also have some Central Asian or Turkic-related ancestry deriving from ancient Iranian tribes roaming near the Ural region and from close geographical proximity to Kazakhstan.
    At first I got a high amount of Iran_N in the highest-p models for most populations, including 36% for Finns. Then I added Armenia_C to the outgroups in order to reduce the amount of Iran_N in the models, but now none of the highest-p models has Iran_N. Anyway, the selection of outgroups has a huge effect on the results in qpAdm, and I still have no idea how to choose the outgroups, so don't read too much into these results. I could try ten or twenty different combinations of outgroups until I eventually got results similar to G25, but it's a major pain in the ass.

    Last edited by Nganasankhan; 09-13-2021 at 12:22 PM.
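
    One way to make the outgroup sensitivity described above less painful is to generate the candidate outgroup sets programmatically and rerun the same model over each of them. The sketch below, in base R, only builds the sets. The split into `core` and `optional` is an arbitrary illustration, `Armenia_C` is written as in the post and may not match the exact label in the dataset, and the commented-out `qpadm()` call simply mirrors the one in the script quoted earlier in the thread.

```r
# Outgroups kept in every run (hypothetical split, for illustration only).
core <- c("Russia_Kostenki14", "Karitiana", "Russia_Ust_Ishim.DG", "Han")

# Outgroups to rotate in and out; "Armenia_C" is the name used in the post
# and may differ from the exact population label in the dataset.
optional <- c("Iran_GanjDareh_N", "Spain_HG_published.SG", "Armenia_C")

# Every subset of the optional outgroups: the empty set first,
# then all combinations of size 1, 2, ... up to all of them.
optional_sets <- c(
  list(character(0)),
  unlist(
    lapply(seq_along(optional), function(k) combn(optional, k, simplify = FALSE)),
    recursive = FALSE
  )
)

# Full right-population list for each run.
right_sets <- lapply(optional_sets, function(extra) c(core, extra))

# Print one line per candidate set.
for (i in seq_along(right_sets)) {
  cat(i, ":", paste(right_sets[[i]], collapse = ", "), "\n")
}

# With admixtools and the data available, each set could then be passed to
# qpadm() in place of `right`, for example:
# results <- lapply(right_sets, function(r) qpadm("v44.3_HO_public", left, r, target))
```

    With three optional outgroups this gives 2^3 = 8 sets, so the number of runs doubles with every optional population added. Comparing how the weights move across the sets shows how stable a model is, but it does not by itself say which outgroup set is the right one.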

  10. The Following User Says Thank You to Nganasankhan For This Useful Post:

     Tsakhur (09-13-2021)

  11. #27
    Registered Users
    Posts
    642
    Sex
    Omitted

    Quote Originally Posted by Nganasankhan View Post
    At first I got a high amount of Iran_N in the highest-p models for most populations, including 36% for Finns. Then I added Armenia_C to the outgroups in order to reduce the amount of Iran_N in the models, but now none of the highest-p models has Iran_N. Anyway, the selection of outgroups has a huge effect on the results in qpAdm, and I still have no idea how to choose the outgroups, so don't read too much into these results. I could try ten or twenty different combinations of outgroups until I eventually got results similar to G25, but it's a major pain in the ass.

    I see. Thanks for your efforts! So it looks like we are still uncertain whether Volga Uralics actually have Iran_N ancestry or not?

    Sorry, what do numbers such as "0.013", "0.02", "5e-5" and "3e-18" represent? Are they distance fits?

    It's strange how, when adding Iran_N and Mongolia_North_N along with Yamnaya_Samara_EBA, the Central Siberian (Nganasan-related) ancestry in the Mari decreases. Iran_N and Yamnaya seem to be absorbing some of the East Eurasian or ANE score. Also, Mongolia_North_N is more East Eurasian and contains less ANE ancestry than Nganasan, which means that in runs where Mongolia_North_N pops up instead of Nganasan, the Maris and other Uralics will seem slightly less North Asian than they should be.

    Maybe you can try replacing Nganasan with Krasnoyarsk_BA instead? I want to see if the East Eurasian share in these groups increases again.
    Last edited by Tsakhur; 09-13-2021 at 01:28 PM.
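
    Going by the plotting script quoted in this thread, the numbers next to each bar appear to be the qpAdm p-values of the individual models (the `p` column of `qp$popdrop`) rather than distance fits, since the script builds its `lab` vector from `t$p`. The sketch below reuses that formatting line on a few made-up p-values to show how labels of that shape arise. The input values are illustrative only and are not taken from any actual run.

```r
# Illustrative p-values (not taken from any actual run).
p <- c(0.013, 0.02, 5e-5, 3e-18)

# Same formatting as the `lab` line of the quoted script:
# scientific notation below .001, three decimals otherwise,
# with the leading zero and the exponent's padding zero stripped.
fmt <- ifelse(p < .001, "%.0e", "%.3f")
lab <- sub("e-0", "e-", sub("^0", "", sprintf(fmt, p)))

print(lab)
```

    In qpAdm a very small p-value means the model is a poor fit and is conventionally rejected, so a label like 3e-18 marks a model that fails, whatever proportions it shows.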

  12. #28
    Registered Users
    Posts
    171
    Sex

    Quote Originally Posted by Nganasankhan View Post
    I'm getting additional EHG for Uralic populations in qpAdm models that include Corded Ware in left populations, like for example the first model below. But then in models with Yamnaya and Turkey_N in place of Corded Ware, like the fifth model below, often I don't get extra EHG.



    Also in these models for Saami, the first model which doesn't include Corded Ware doesn't get EHG, but the third model which is the highest-p model with Corded Ware gets high EHG:



    Does anything change if you use exclusively non-SG/DG samples as the sources and outgroups? SG samples are known to be noisy.
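
    A minimal sketch of that filtering step, assuming that shotgun-derived populations can be recognised by a `.SG` or `.DG` suffix on the label, as in the population names of the script quoted in this thread. The helper `is_shotgun` and the `*_filtered` variables are hypothetical names introduced here.

```r
# Population labels copied from the script quoted in this thread.
right <- c("Russia_Kostenki14", "Karitiana", "Russia_Ust_Ishim.DG",
           "Iran_GanjDareh_N", "Spain_HG_published.SG", "Han")
left <- c("Turkey_N.DG", "Russia_Samara_EBA_Yamnaya", "Estonia_CordedWare.SG",
          "Hungary_EN_HG_Koros_published.SG", "Russia_HG_Samara", "Nganasan")

# TRUE for labels that end in ".SG" or ".DG" (hypothetical helper).
is_shotgun <- function(pops) grepl("\\.(SG|DG)$", pops)

# Keep only the labels without a shotgun suffix.
right_filtered <- right[!is_shotgun(right)]
left_filtered <- left[!is_shotgun(left)]

print(right_filtered)
print(left_filtered)
```

    This only drops the suffixed populations; it does not swap in an equivalent without the suffix, so replacements for Turkey_N, Corded Ware, Koros_HG and the removed outgroups would still have to be picked by hand. The target in the quoted script, Saami.DG, carries the same suffix and is left untouched here.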

  13. The Following User Says Thank You to bce For This Useful Post:

     Nino90 (09-13-2021)
