An experiment with DNALand and Admixtools

03-27-2017, 04:12 PM
Last week, working with qpAdm and my father's 23andMe data merged with the HumanOrigins dataset, I found this model for my father:

left pops:

(For those who are not familiar with qpAdm: this means that the first pop, my dad, is modelled as a mix of English and Finnish in terms of gene flow.) I got:
best coefficients: 0.824 0.176
The fit indicators are:
std. errors: 0.128 0.128
chi2:3.930 tail_prob: 0.863398

This means the model fits very well, in fact nearly optimally. The problem was that only 121388 SNPs were used for the analysis (only the SNPs in common between 23andMe v4 and Lazaridis' HumanOrigins dataset), which was not satisfactory.
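For readers who want to see how the chi2 and tail_prob values relate, here is a minimal sketch that recomputes the tail probability from the chi-square value. The degrees of freedom are not given in the post; df = 8 is an assumption on my side (it depends on the number of right pops, which are not listed), chosen because it lands close to the printed tail_prob. The function name is only illustrative.

```python
import math

def chi2_tail_prob_even_df(chi2: float, df: int) -> float:
    """Upper-tail probability P(X > chi2) for a chi-square variable.

    Uses the closed form that is valid for even df only:
    for df = 2k, P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = chi2 / 2.0
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total

ASSUMED_DF = 8  # assumption: not stated in the post

# chi2 value printed by qpAdm for the 23andMe-only run
p_23andme = chi2_tail_prob_even_df(3.930, ASSUMED_DF)
print(f"tail_prob for chi2=3.930, df={ASSUMED_DF}: {p_23andme:.4f}")
```

A tail probability near 1 means the observed chi-square is small for that many degrees of freedom, i.e. the model is not rejected; values below about 0.05 are usually taken as a poor fit.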

Now (and this is why I'm posting in this thread), I had the dangerous idea of seeing what I could do with my dad's DNALand imputed data. That was my afternoon's work, and it took some effort (and pain): I had to extract the SNPs shared with HumanOrigins from the imputed file (with vcftools), prune, recode, convert to bed, merge with HumanOrigins (with plink), convert to Eigenstrat format (with Eigensoft), and finally run qpAdm. Results (with 598391 SNPs):
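The chain of tools described above can be sketched roughly as follows. This is not the exact set of commands that was run: all file names and prefixes are hypothetical, the pruning/filtering step is left out because the post does not say which criteria were used, and it assumes HumanOrigins has already been converted to plink bed format with SNP IDs that match the imputed file. The script only builds the command lines and a convertf parameter file; it does not execute anything.

```python
from typing import List

def build_pipeline(imputed_vcf: str, positions_file: str,
                   ho_prefix: str, out_prefix: str) -> List[List[str]]:
    """Return the command lines for each step, in order (nothing is run)."""
    extracted = f"{out_prefix}_ho"      # hypothetical output prefix
    merged = f"{out_prefix}_merged"     # hypothetical output prefix
    return [
        # 1. keep only positions present in HumanOrigins, write a new VCF
        ["vcftools", "--gzvcf", imputed_vcf, "--positions", positions_file,
         "--recode", "--out", extracted],
        # 2. convert the recoded VCF to plink bed/bim/fam
        ["plink", "--vcf", f"{extracted}.recode.vcf", "--make-bed",
         "--out", extracted],
        # 3. merge with HumanOrigins (assumed to be in bed format already)
        ["plink", "--bfile", ho_prefix, "--bmerge", f"{extracted}.bed",
         f"{extracted}.bim", f"{extracted}.fam", "--make-bed", "--out", merged],
        # 4. convert to Eigenstrat with convertf, driven by a parameter file
        ["convertf", "-p", f"{merged}.convertf.par"],
        # 5. run qpAdm with its own parameter file (left/right pops go there)
        ["qpAdm", "-p", f"{merged}.qpAdm.par"],
    ]

def convertf_par(merged: str) -> str:
    """Parameter file text for convertf: packed ped in, Eigenstrat out."""
    return "\n".join([
        f"genotypename: {merged}.bed",
        f"snpname: {merged}.bim",
        f"indivname: {merged}.fam",
        "outputformat: EIGENSTRAT",
        f"genotypeoutname: {merged}.geno",
        f"snpoutname: {merged}.snp",
        f"indivoutname: {merged}.ind",
    ])

commands = build_pipeline("dad_imputed.vcf.gz", "ho_positions.txt",
                          "HumanOrigins", "dad")
for cmd in commands:
    print(" ".join(cmd))
```

In practice the merge step is where most of the pain tends to be (strand flips, mismatched SNP IDs), so expect to iterate on it rather than run it once.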
best coefficients: 0.725 0.275
std. errors: 0.136 0.136
chi2:7.161 tail_prob: 0.519384
The fit is not as good, but still acceptable (nothing is perfect). Notice that, taking the standard errors into account, the coefficients are in the same range. Of course, 477003 of the SNPs were imputed, but as I kept only the alleles with a high probability, I believe that's not dramatic. All of this is, IMO, rather reassuring. Please do not ask me to do this for you: I'll likely do this job again for my mother (since DNALand claimed after the last update that she is 100% Northwest European, I'm curious to see what qpAdm will say), and after that ... finished!
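To put a number on "the coefficients are in the same range", here is a rough check using the figures from the two runs on my father's data. It treats the two estimates as independent, which is an assumption that does not really hold (same individual, overlapping SNPs), so read the z value as an order of magnitude only. The function name is illustrative.

```python
import math

def rough_z(coef_a: float, se_a: float, coef_b: float, se_b: float) -> float:
    """Difference between two estimates in units of their combined std. error.

    Assumes the two estimates are independent (a simplification here).
    """
    return (coef_a - coef_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# English coefficient: 23andMe-only run vs. imputed run
z = rough_z(0.824, 0.128, 0.725, 0.136)
print(f"difference in combined standard errors: {z:.2f}")

# SNPs that come from imputation: total used minus those in the first run
imputed_snps = 598391 - 121388
print(f"imputed SNPs: {imputed_snps}")
```

A difference well under one combined standard error supports the remark that the two runs agree within their uncertainty.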

03-27-2017, 09:35 PM
Same job with my mother, and even more satisfying. I only searched for very simple models, and it would very likely be possible to find better ones. But the two results I got allow me to conclude that this experiment is a success. That said, two cases don't make a general rule.
After update:

left pops:

numsnps used: 598373
best coefficients: 0.632 0.368

chisq tail prob
5.386 0.715662

This one is slightly better:

left pops:

best coefficients: 0.501 0.499
chisq tail prob
5.344 0.720268

03-28-2017, 01:36 PM
Your mother is arguably one of the most un-French individuals I've seen from Normandy so far; I'm not surprised she's more or less identical to my mother, genetically speaking.

03-28-2017, 01:48 PM
Just as an anecdote: before the update, her ancestry composition on DNA.Land was far more complicated, with Finnish, East Slavic, and so on ... I was very skeptical, because on her GEDmatch list she has only 4 or 5 Finns and Lithuanians, nothing compared with my dad's 120 Finns. The first time I saw her updated results, with this beautiful green ring, I burst out laughing. But actually, this 50/50 English/Norwegian qpAdm model is nearly optimal.