qpAdm-How to identify a good fitting model



Bas
11-19-2017, 02:34 AM
What are the important things to look for in the output of qpAdm? I get this output:


Left:
Iberia_EN
Anatolia_N
Iberia_HG
Hungary_HG

jackknife block size: 0.050
snps: 1151063 indivs: 143
number of blocks for block jackknife: 713
dof (jackknife): 619.341
numsnps used: 379951
codimension 1
f4info:
f4rank: 2 dof: 2 chisq: 0.397 tail: 0.820146509 dofdiff: 4 chisqdiff: -0.397 taildiff: 1
B:
scale 1.000 1.000
Eskimo -0.638 1.425
Mbuti 1.717 0.879
Karitiana -0.796 0.606
Ulchi -0.104 0.911
A:
scale 1109.175 7801.996
Anatolia_N 0.224 0.445
Iberia_HG -1.242 -1.091
Hungary_HG -1.187 1.270


full rank 1
f4info:
f4rank: 3 dof: 0 chisq: 0.000 tail: 1 dofdiff: 2 chisqdiff: 0.397 taildiff: 0.820146509
B:
scale 1.000 1.000 1.000
Eskimo -0.618 1.450 -1.186
Mbuti 1.717 0.875 0.024
Karitiana -0.809 0.609 0.729
Ulchi -0.129 0.873 1.436
A:
scale 1103.742 7689.244 74708.748
Anatolia_N 0.216 0.426 1.665
Iberia_HG -1.245 -1.118 0.447
Hungary_HG -1.184 1.253 -0.167


best coefficients: 0.851 0.241 -0.091
ssres:
-0.000023884 -0.000014637 0.000019307 0.000028112
-1.082349727 -0.663311893 0.874963854 1.273960225

Jackknife mean: 0.850314358 0.176871913 -0.027186271
std. errors: 0.057 0.244 0.242

error covariance (* 1000000)
3194 -1864 -1330
-1864 59296 -57432
-1330 -57432 58762


fixed pat wt dof chisq tail prob
000 0 2 0.397 0 0.851 0.241 -0.091 infeasible
001 1 3 0.583 0.900212 0.848 0.152 0.000
010 1 3 1.710 0.634764 0.859 0.000 0.141
100 1 3 4.460 0 0.000 39.072 -38.072 infeasible
011 2 4 8.854 0.0648513 1.000 -0.000 0.000
101 2 4 52.924 8.84048e-11 0.000 1.000 0.000
110 2 4 55.005 3.2416e-11 0.000 -0.000 1.000
best pat: 000 0 - -
best pat: 001 0.900212 chi(nested): 0.187 p-value for nested model: 0.665509
best pat: 011 0.0648513 chi(nested): 8.271 p-value for nested model: 0.00402907

## end of run

I hear people talk about chisq (as low as possible?) and tail prob (close to 1%?). The output gives chi and p-values at the end and in a few other places, but which is the most important one, the one that tells me whether the model works?

Kale
11-19-2017, 02:58 AM
Hi Bas, I can give you a quick noob answer until someone comes along with a better one.

You want low std. errors, 0.057 isn't bad, it means Iberia_EN is 85.1% +/- 5.7% Anatolia_N. 0.244 is terrible, it means Iberia_EN is 24.1% +/- 24.4% LaBrana.
You want low chisq,
001 1 3 0.583 0.900212 0.848 0.152 0.000 - really good
010 1 3 1.710 0.634764 0.859 0.000 0.141 - not quite as good, still not bad
And you want high tail prob
001 1 3 0.583 0.900212 0.848 0.152 0.000 - really good
010 1 3 1.710 0.634764 0.859 0.000 0.141 - not bad

The conclusion there is that...
001 1 3 0.583 0.900212 0.848 0.152 0.000
This model looks the best. It has Iberia_EN as 84.8% Anatolia_Neolithic, 15.2% LaBrana1. Not sure where the std. errors are though.

The initial result that says infeasible should also be taken into consideration...it suggests that something farther along the KO1-LB1 cline than LB1 may be a better fit.

Also extremely important to consider are the populations in the Right list:
You should put here as many relevant populations as you can that do not share specific drift with any of your left pops. Anatolia_N and Iberia_EN shouldn't have any particular relation with Natufians, for example. Iberia_Chalcolithic, on the other hand, would be a terrible idea, because Iberia_MN is directly ancestral to them and so shares excess drift with them; no combination of Anatolian and LB1 could replicate that shared drift. Good populations to have in the right in this case would be something like Ust_Ishim, Kostenki14, GoyetQ-116, Vestonice16, Levant_Neolithic, Natufian, ElMiron, Iran_Neolithic, Kotias, MA1, and probably some more as well.
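
As a rough sketch of the checks described above (feasible weights, high tail prob, low chisq), the pattern table from Bas's run can be parsed and ranked in Python. The helper name `parse_pattern_rows` and the 0.05 cutoff are assumptions made for illustration, not part of qpAdm; the row layout is taken from the output posted above.

```python
# Rows copied from the "fixed pat wt dof chisq tail prob" table in the first post.
ROWS = """\
000 0 2 0.397 0 0.851 0.241 -0.091 infeasible
001 1 3 0.583 0.900212 0.848 0.152 0.000
010 1 3 1.710 0.634764 0.859 0.000 0.141
100 1 3 4.460 0 0.000 39.072 -38.072 infeasible
011 2 4 8.854 0.0648513 1.000 -0.000 0.000
101 2 4 52.924 8.84048e-11 0.000 1.000 0.000
110 2 4 55.005 3.2416e-11 0.000 -0.000 1.000
"""

def parse_pattern_rows(text):
    """Parse pattern rows into dicts (hypothetical helper, not a qpAdm API)."""
    models = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # qpAdm appends the word "infeasible" when a weight falls outside 0..1.
        infeasible = fields[-1] == "infeasible"
        if infeasible:
            fields = fields[:-1]
        models.append({
            "pattern": fields[0],
            "fixed": int(fields[1]),
            "dof": int(fields[2]),
            "chisq": float(fields[3]),
            "tail": float(fields[4]),
            "weights": [float(w) for w in fields[5:]],
            "feasible": not infeasible,
        })
    return models

models = parse_pattern_rows(ROWS)

# Keep feasible models above an assumed tail-prob cutoff of 0.05,
# then list the highest tail prob first.
passing = sorted(
    (m for m in models if m["feasible"] and m["tail"] > 0.05),
    key=lambda m: m["tail"],
    reverse=True,
)
for m in passing:
    print(m["pattern"], m["dof"], m["chisq"], m["tail"], m["weights"])
```

On this table the sketch lists pattern 001 first, which is the same model picked out above.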

Bas
11-19-2017, 03:17 AM
Thanks, that's really helpful. So if I understood correctly, looking at another run from someone else (which admittedly produced bad fits), would the run I highlighted be the best of these?


fixed pat wt dof chisq tail prob
0000 0 12 12.008 0.445043 -17.854 -0.184 15.344 3.695 infeasible
0001 1 13 38.894 0.000207903 2.295 0.392 -1.688 0.000 infeasible
0010 1 13 33.938 0.00122978 0.304 0.335 0.000 0.361
0100 1 13 11.158 0.597625 -12.233 -0.000 10.582 2.651 infeasible
1000 1 13 30.554 0.00391643 0.000 0.317 0.267 0.416
0011 2 14 125.593 5.02173e-20 0.459 0.541 -0.000 0.000
0101 2 14 19.914 0.132868 4.400 -0.000 -3.400 0.000 infeasible
0110 2 14 136.169 4.08862e-22 0.458 0.000 -0.000 0.542
1001 2 14 162.703 2.02883e-27 -0.000 0.583 0.417 0.000
1010 2 14 126.468 3.37779e-20 0.000 0.492 -0.000 0.508
1100 2 14 123.156 1.5135e-19 0.000 -0.000 0.395 0.605
0111 3 15 287.190 0 1.000 -0.000 -0.000 0.000
1011 3 15 236.718 0 0.000 1.000 -0.000 0.000
1101 3 15 296.545 0 0.000 -0.000 1.000 0.000
1110 3 15 310.900 0 0.000 0.000 -0.000 1.000

The chisq is high and tail prob really low but do you think that this would also be the best fit in this output in this case?

Kale
11-19-2017, 06:35 AM
"Best" is kind of a relative term there I suppose, they are all absolutely terrible, that is just the least terrible.

anglesqueville
11-19-2017, 08:17 AM
Choosing the right pops is genuinely difficult. When I was learning a few months ago, I took my cue from Patterson's text and tried to choose as right pops only populations very distant from the left pops. But studying Generalissimo's models on Eurogenes, I saw that he applies this rule rather freely. The same goes for tail_prob: when I was starting with qpAdm I kept only models with tail_prob>0.7, but you can easily find models in the papers with tail_prob<0.5, and even with negative coefficients. There is a large empirical element in all of this.

Generalissimo
11-19-2017, 12:13 PM
Look for p-values (tails) of over 0.05. The higher the better.

Bas
11-20-2017, 09:16 PM
Is the chisq more important than tail prob or vice versa? Say I have two models, one with chisq 1.5 and tail prob of 0.55 and another with chisq 6 and tail prob of 0.85, which one would I pick as being the more likely?

Generalissimo
11-20-2017, 09:27 PM
Look mostly at the taildiff.

Chad Rohlfsen
11-21-2017, 05:45 PM
Chi-square

Generalissimo
11-21-2017, 07:54 PM
Pretty sure P-value > Chi-square.

Chad Rohlfsen
11-21-2017, 09:07 PM
If you have a high chi-square the whole thing is shaky. It's a goodness of fit.

Chad Rohlfsen
11-21-2017, 09:21 PM
You want a rank2 model, 2 dof, with a low chi-square. Tail-probabilities are important though, yes. But, you're unlikely to see a very high tail-probability with a high chi-square. Keep standard errors low too. Preferably under 5%, with a p-value under 0.05 in the nested value. I think that is the way to go.
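
The dof values printed in this thread are consistent with the usual rank-test count: with m sources on the left (target excluded) and n right pops listed under B (base excluded), a rank r model has (m - r)(n - r) degrees of freedom. A minimal sketch, assuming that formula; the function name `rank_dof` is made up for illustration.

```python
def rank_dof(n_sources, n_right, rank):
    """Degrees of freedom for a rank test on an m x n f4 matrix.

    n_sources: left pops excluding the target (rows listed under A).
    n_right:   right pops excluding the base (rows listed under B).
    Assumes dof = (m - rank) * (n - rank); hypothetical helper.
    """
    return (n_sources - rank) * (n_right - rank)

# First run in the thread: 3 sources under A, 4 right pops under B.
print(rank_dof(3, 4, 2))  # compare with the "f4rank: 2 dof: 2" line
print(rank_dof(3, 4, 3))  # compare with the "f4rank: 3 dof: 0" line
```

In the pattern table of that same run, each source fixed to zero adds one more dof (2, then 3, then 4).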

Chad Rohlfsen
11-21-2017, 09:28 PM
Now, if you have a chi-square of 2 with a .95 tail, or a chi-square of 0.5 with a .9 tail, I would still lean towards the latter. You're unlikely to have a low tail with a low chi-square, or a high tail-prob with a high chi-square, anyway. For instance, you won't have an 11 chi-square with a .95 tail-probability, or a 0.5 chi-square with a .20 tail-probability.
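
The tail prob that qpAdm prints appears to be simply the upper tail of a chi-square distribution evaluated at the reported chisq and dof, so the two numbers are not independent pieces of evidence; they can only rank models differently when the dof differ. A minimal stdlib-only sketch, with `chisq_tail` as an illustrative helper (not part of qpAdm), compared against values posted in this thread:

```python
import math

def chisq_tail(chisq, dof):
    """Upper tail probability of a chi-square statistic, stdlib only.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    with y = chisq / 2, starting from a = 1/2 (odd dof) or a = 1 (even dof).
    """
    if dof <= 0:
        return 1.0
    y = chisq / 2.0
    if dof % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while 2 * a < dof:
        if y > 0:
            q += math.exp(a * math.log(y) - y - math.lgamma(a + 1))
        a += 1.0
    return q

# qpAdm printed 0.820146509 here, computed from the unrounded chisq.
print(chisq_tail(0.397, 2))
# Pattern 001 in the first run: the table lists tail prob 0.900212.
print(chisq_tail(0.583, 3))
# Same chisq, different dof: the tail changes a lot.
print(chisq_tail(6.0, 2), chisq_tail(6.0, 10))
```

Because the printed chisq is rounded to three decimals, the recomputed tails only agree with the log to about three decimal places.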

Generalissimo
11-21-2017, 10:38 PM
I do look at the chi-square, but it's worth noting that it doesn't even get a mention in any of the papers that use qpAdm. In these papers the results are always ranked by highest taildiff.

Bas
11-24-2017, 11:16 PM
left pops:
Italian_South
Iberia_EN
Bell_Beaker_Germany
Kumtepe_4
Minoan

right pops:
Han
Eskimo
Mbuti
Karitiana
Ulchi
Papuan
Kostenki_14
Afontova_Gora
Vestonice_16
Goyet_Q116

0 Italian_South 5
1 Iberia_EN 5
2 Bell_Beaker_Germany 13
3 Kumtepe_4 1
4 Minoan 10
5 Han 43
6 Eskimo 20
7 Mbuti 10
8 Karitiana 12
9 Ulchi 25
10 Papuan 14
11 Kostenki_14 1
12 Afontova_Gora 1
13 Vestonice_16 1
14 Goyet_Q116 1
jackknife block size: 0.050
snps: 1145807 indivs: 162
number of blocks for block jackknife: 713
dof (jackknife): 410.948
numsnps used: 1343
codimension 1
f4info:
f4rank: 3 dof: 6 chisq: 1.364 tail: 0.967985262 dofdiff: 8 chisqdiff: -1.364 taildiff: 1
B:
scale 1.000 1.000 1.000
Eskimo -0.056 0.323 0.460
Mbuti 1.355 1.076 0.864
Karitiana 1.260 0.758 -1.534
Ulchi 0.524 0.630 -1.132
Papuan 0.445 1.253 -0.101
Kostenki_14 0.775 0.380 0.957
Afontova_Gora 1.613 -2.189 -0.684
Vestonice_16 0.544 -0.489 1.146
Goyet_Q116 1.266 0.140 1.303
A:
scale 837.854 959.284 1259.009
Iberia_EN -1.290 0.286 0.182
Bell_Beaker_Germany 0.554 -1.629 0.725
Kumtepe_4 1.330 0.509 -1.122
Minoan 0.508 1.003 1.478


full rank 1
f4info:
f4rank: 4 dof: 0 chisq: 0.000 tail: 1 dofdiff: 6 chisqdiff: 1.364 taildiff: 0.967985262
B:
scale 1.000 1.000 1.000 1.000
Eskimo 0.106 0.378 0.571 -0.402
Mbuti 0.965 0.965 0.448 1.689
Karitiana 1.302 0.784 -1.589 -0.103
Ulchi 0.697 0.672 -1.051 -0.639
Papuan 0.412 1.267 -0.268 0.319
Kostenki_14 0.292 0.214 0.477 1.679
Afontova_Gora 1.505 -2.198 -0.762 0.454
Vestonice_16 0.364 -0.555 1.039 0.652
Goyet_Q116 1.796 0.260 1.698 -1.419
A:
scale 775.326 941.245 1297.093 1978.141
Iberia_EN -0.985 0.267 0.507 -1.643
Bell_Beaker_Germany 0.599 -1.637 0.919 -0.342
Kumtepe_4 1.547 0.458 -0.577 -1.031
Minoan 0.526 1.019 1.602 0.345


best coefficients: 0.423 0.210 0.297 0.070
ssres:
0.000385252 -0.000545465 -0.000049318 0.000223295 -0.000164415 -0.000691260 0.000005664 0.000043214 0.001675739
0.591868941 -0.838007852 -0.075768361 0.343051577 -0.252593315 -1.061995626 0.008701784 0.066390144 2.574468353

Jackknife mean: 0.282333485 0.245479742 0.207858234 0.264328539
std. errors: 0.413 0.285 0.193 0.457

error covariance (* 1000000)
170321 -63706 31573 -138188
-63706 81055 -7458 -9891
31573 -7458 37061 -61176
-138188 -9891 -61176 209256


fixed pat wt dof chisq tail prob
0000 0 6 1.364 0.967985 0.423 0.210 0.297 0.070
0001 1 7 1.397 0.985673 0.423 0.210 0.297 0.070
0010 1 7 5.453 0 -2.313 1.317 0.000 1.996 infeasible
0100 1 7 1.939 0.963156 0.634 0.000 0.342 0.024
1000 1 7 2.796 0.903199 -0.000 0.395 0.279 0.326
0011 2 8 7.112 0.524594 0.851 0.149 0.000 0.000
0101 2 8 1.941 0.982755 0.651 0.000 0.349 0.000
0110 2 8 5.677 0.683324 0.297 0.000 0.000 0.703
1001 2 8 3.771 0.877212 0.000 0.575 0.425 0.000
1010 2 8 5.097 0.747109 0.000 0.249 0.000 0.751
1100 2 8 6.010 0.646087 0.000 -0.000 0.104 0.896
0111 3 9 7.160 0.62045 1.000 0.000 0.000 0.000
1011 3 9 11.802 0.224732 0.000 1.000 0.000 0.000
1101 3 9 8.410 0.493373 0.000 0.000 1.000 0.000
1110 3 9 6.140 0.725857 0.000 0.000 0.000 1.000
best pat: 0000 0.967985 - -
best pat: 0001 0.985673 chi(nested): 0.032 p-value for nested model: 0.857565
best pat: 0101 0.982755 chi(nested): 0.545 p-value for nested model: 0.460465
best pat: 1110 0.725857 chi(nested): 4.198 p-value for nested model: 0.0404645

## end of run

I was looking at these values in red to judge the model. What are the 'nested model' values at the bottom? Are they what I should be looking at instead?

And how does the rank and dof/dofdiff play into all of this? Because in that section of the run, a lot of the models given there are later labelled 'infeasible' in the section I highlighted in red above (not in this one though).
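
On the 'nested model' lines: the numbers in the run above are consistent with a chi-square difference test. Each `chi(nested)` matches the chisq of the best pattern at that number of fixed sources minus the chisq of the best pattern one level up, and the p-value matches a 1-dof chi-square tail of that difference. Note that the "best pat" at each level is chosen independently, so the two patterns compared are not always strictly nested (0101 and 1110, for instance). A minimal sketch under that reading; `nested_p` is an illustrative name, not a qpAdm function.

```python
import math

def nested_p(chisq_full, chisq_reduced, extra_dof=1):
    """Chi-square difference test between a model and a reduced one.

    Only handles extra_dof == 1, where the tail is erfc(sqrt(x / 2)).
    """
    if extra_dof != 1:
        raise ValueError("this sketch only covers one extra degree of freedom")
    diff = max(chisq_reduced - chisq_full, 0.0)
    return diff, math.erfc(math.sqrt(diff / 2.0))

# chisq values copied from the pattern table in the run above.
steps = [
    ("0000 -> 0001", 1.364, 1.397),
    ("0001 -> 0101", 1.397, 1.941),
    ("0101 -> 1110", 1.941, 6.140),
]
for label, full, reduced in steps:
    diff, p = nested_p(full, reduced)
    print(f"{label}: chi(nested) ~ {diff:.3f}, p ~ {p:.4f}")
```

On that reading, a small nested p-value (such as the 0.0404645 on the last line of the run) means that dropping the extra source makes the fit significantly worse, while a large one means the simpler model loses little.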

Chad Rohlfsen
11-24-2017, 11:45 PM
You have many problems here. The two main ones are that your standard errors are horrible and so is your SNP count. Don't use Kumtepe4. It is useless. You need to keep SNPs used as high as possible. Replace Afontova_Gora with MA1 too. Lastly, you'll never sort out farmer ancestry without having prights that can differentiate it, like Iran N, Levant N, and Anatolia N.
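
To put a number on "horrible": the std. errors line in the run above matches the square roots of the diagonal of the error covariance block (after undoing the 1000000 scaling), and every coefficient is smaller than twice its standard error. A small sketch using the values from that run; the variable names are illustrative only.

```python
import math

# "best coefficients" and "error covariance (* 1000000)" from the run above.
names = ["Iberia_EN", "Bell_Beaker_Germany", "Kumtepe_4", "Minoan"]
coeffs = [0.423, 0.210, 0.297, 0.070]
cov_x1e6 = [
    [170321, -63706, 31573, -138188],
    [-63706, 81055, -7458, -9891],
    [31573, -7458, 37061, -61176],
    [-138188, -9891, -61176, 209256],
]

# Standard error = square root of the diagonal, after undoing the 1e6 scaling.
std_errs = [math.sqrt(cov_x1e6[i][i] / 1e6) for i in range(len(names))]

# Ratio of each estimate to its standard error; below 2, the rough
# interval coefficient +/- 2 * SE reaches zero.
ratios = [c / se for c, se in zip(coeffs, std_errs)]

for name, c, se, r in zip(names, coeffs, std_errs, ratios):
    print(f"{name}: {c:.3f} +/- {se:.3f} (coefficient / SE = {r:.2f})")
```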

Chad Rohlfsen
11-24-2017, 11:48 PM
Trying to have 4 mixing pops is going to cause problems too. 3 is pushing it. You need to learn qpGraph too.

Bas
11-25-2017, 12:06 AM
Cheers Chad, yeah, I don't know why I'm having issues with the SNP count. 1343?! That's ridiculous, I didn't even notice that. I'm using the latest Human Origins public and the Minoan sets, so maybe it's just one or two of my samples? And about the right pops, could you think of any other additions on top of the Neolithic pops you mentioned?

Chad Rohlfsen
11-25-2017, 01:39 AM
It all depends on the pops you test. Your SNP count is dreadful because of Kumtepe4. Don't ever use it. It's a different capture method and probably should only be run with transversion sites, but it doesn't have enough markers to do that. If you're testing Euros, a WHG in the pright will help too. Think of it this way. You want stuff in your pright that creates the most significant Dstats in your plefts. That will flesh out stuff from kinda related pops.
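
One way to spot which population is dragging the SNP count down is to compare the per-population `coverage:` lines, where the log includes them (Bas's run above does not; the version 650 run posted later in this thread does). A minimal sketch with a hypothetical helper `lowest_coverage`, fed a few lines from that later run:

```python
def lowest_coverage(log_text, n=3):
    """Return the n populations with the fewest covered SNPs.

    Parses lines of the form 'coverage: <pop> <count>'; hypothetical helper.
    """
    rows = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "coverage:":
            rows.append((parts[1], int(parts[2])))
    return sorted(rows, key=lambda r: r[1])[:n]

# A few lines copied from the LBK_EN run later in the thread.
log = """\
coverage: LBK_EN 574017
coverage: EHG 545164
coverage: Kostenki14 530430
coverage: MA1_HG 414502
coverage: Han 593124
numsnps used: 403594
"""
for pop, count in lowest_coverage(log):
    print(pop, count)
```

In that run MA1_HG has the lowest coverage (414502) and numsnps used (403594) sits just below it, which fits the idea that with allsnps: NO only SNPs covered in every population get used.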

Chad Rohlfsen
11-25-2017, 01:43 AM
The use of a lot of Asians is unnecessary too if you aren't testing someone closely related to them. You don't need anything more than like an Ami or Karitiana. Having Chukchi and Ulchi is pointless.

Danimf
01-31-2018, 03:05 PM
Could someone help me figure out the best model of the following test for the LBK, considering ranks 0, 1, and 2 all provide good tailprob (0.18-0.37)?
I would be inclined for the simplest model (rank 0), but I have more recent samples with clear increased WHG ancestry (as suggested in admixture and f3/f4) who still show rank 0 for Anatolia_N with p>0.05... Thanks!


qpAdm: parameter file: qpAdm_LBK_EN_AS_Anatolia_Neolithic_PLUS_WHG_PLUS_EHG
### THE INPUT PARAMETERS
##PARAMETER NAME: VALUE
indivname: Laz16ALLPolarizeEIG1200KAutosomesForStatsNEWlitQPadmLev.ind
snpname: Laz16ALLPolarizeEIG1200KAutosomesForStatsNEWlitQPadmLev.snp
genotypename: Laz16ALLPolarizeEIG1200KAutosomesForStatsNEWlitQPadmLev.geno
popleft: left_LBK_EN_AS_Anatolia_Neolithic_PLUS_WHG_PLUS_EHG
popright: right_LBK_EN_AS_Anatolia_Neolithic_PLUS_WHG_PLUS_EHG
maxrank: 4
allsnps: NO
## qpAdm version: 650
seed: 1730420728

left pops:
LBK_EN
Anatolia_Neolithic
WHG
EHG

right pops:
Ust_Ishim_HG
Kostenki14
MA1_HG
Han
Papuan
Onge
Chukchi
Karitiana
Mbuti

0 LBK_EN 9
1 Anatolia_Neolithic 20
2 WHG 3
3 EHG 2
4 Ust_Ishim_HG 1
5 Kostenki14 1
6 MA1_HG 1
7 Han 8
8 Papuan 4
9 Onge 9
10 Chukchi 6
11 Karitiana 11
12 Mbuti 10
jackknife block size: 0.050
snps: 593124 indivs: 85
number of blocks for block jackknife: 711
## ncols: 593124
coverage: LBK_EN 574017
coverage: Anatolia_Neolithic 576274
coverage: WHG 576894
coverage: EHG 545164
coverage: Ust_Ishim_HG 575780
coverage: Kostenki14 530430
coverage: MA1_HG 414502
coverage: Han 593124
coverage: Papuan 593122
coverage: Onge 593124
coverage: Chukchi 593124
coverage: Karitiana 593124
coverage: Mbuti 576983
dof (jackknife): 610.218
numsnps used: 403594
codimension 1
f4info:
f4rank: 2 dof: 6 chisq: 7.001 tail: 0.320726995 dofdiff: 8 chisqdiff: -7.001 taildiff: 1
B:
scale 1.000 1.000
Kostenki14 0.378 2.351
MA1_HG 1.995 -0.049
Han 0.419 -0.153
Papuan 0.148 -0.013
Onge 0.234 0.012
Chukchi 1.092 -0.226
Karitiana 1.505 -0.425
Mbuti -0.412 -1.489
A:
scale 449.227 2013.990
Anatolia_Neolithic -0.049 -0.057
WHG 0.451 1.670
EHG 1.671 -0.457


full rank
f4info:
f4rank: 3 dof: 0 chisq: 0.000 tail: 1 dofdiff: 6 chisqdiff: 7.001 taildiff: 0.320726995
B:
scale 1.000 1.000 1.000
Kostenki14 0.369 -2.337 0.622
MA1_HG 2.015 0.029 -1.035
Han 0.409 0.166 0.466
Papuan 0.146 0.037 0.192
Onge 0.220 -0.010 0.480
Chukchi 1.083 0.261 0.618
Karitiana 1.494 0.471 1.178
Mbuti -0.400 1.489 -2.071
A:
scale 446.681 1882.622 10772.739
Anatolia_Neolithic -0.021 -0.121 -1.404
WHG 0.464 -1.703 -0.878
EHG 1.669 0.295 -0.507


best coefficients: 0.945 0.037 0.018
Jackknife mean: 0.945381096 0.036909263 0.017709641
std. errors: 0.036 0.049 0.022

error covariance (* 1000000)
1273 -1589 315
-1589 2371 -783
315 -783 467


summ: LBK_EN 3 0.320727 0.945 0.037 0.018 1273 -1589 315 2371 -783 467

fixed pat wt dof chisq tail prob
000 0 6 7.001 0.320727 0.945 0.037 0.018
001 1 7 7.619 0.36741 0.934 0.066 0.000
010 1 7 7.592 0.369942 0.970 0.000 0.030
100 1 7 119.478 9.83841e-23 0.000 1.478 -0.478 infeasible
011 2 8 11.307 0.184925 1.000 0.000 0.000
101 2 8 237.622 0 0.000 1.000 0.000
110 2 8 792.779 0 0.000 0.000 1.000
best pat: 000 0.320727 - -
best pat: 010 0.369942 chi(nested): 0.591 p-value for nested model: 0.442153
best pat: 011 0.184925 chi(nested): 3.715 p-value for nested model: 0.0539387

coeffs: 0.945 0.037 0.018

## dscore:: f_4(Base, Fit, Rbase, right2)
## genstat:: f_4(Base, Fit, right1, right2)

details: Anatolia_Neolithic Kostenki14 0.000023 0.115955
details: WHG Kostenki14 0.002469 6.118548
details: EHG Kostenki14 0.001006 2.064895
dscore: Kostenki14 f4: 0.000130 Z: 0.681538

details: Anatolia_Neolithic MA1_HG 0.000036 0.187860
details: WHG MA1_HG 0.002178 5.785034
details: EHG MA1_HG 0.007664 15.628680
dscore: MA1_HG f4: 0.000250 Z: 1.319058

details: Anatolia_Neolithic Han -0.000111 -0.761027
details: WHG Han 0.000238 0.862752
details: EHG Han 0.001552 4.425303
dscore: Han f4: -0.000069 Z: -0.481023

details: Anatolia_Neolithic Papuan -0.000053 -0.328964
details: WHG Papuan 0.000105 0.338111
details: EHG Papuan 0.000577 1.510145
dscore: Papuan f4: -0.000036 Z: -0.227791

details: Anatolia_Neolithic Onge -0.000090 -0.584392
details: WHG Onge 0.000196 0.686686
details: EHG Onge 0.000815 2.264091
dscore: Onge f4: -0.000063 Z: -0.420111

details: Anatolia_Neolithic Chukchi -0.000171 -1.124657
details: WHG Chukchi 0.000841 2.956709
details: EHG Chukchi 0.004089 11.403239
dscore: Chukchi f4: -0.000059 Z: -0.392057

details: Anatolia_Neolithic Karitiana -0.000272 -1.656485
details: WHG Karitiana 0.001043 3.296909
details: EHG Karitiana 0.005643 14.037116
dscore: Karitiana f4: -0.000119 Z: -0.735263

details: Anatolia_Neolithic Mbuti 0.000172 1.178055
details: WHG Mbuti -0.001584 -5.574412
details: EHG Mbuti -0.001143 -3.283675
dscore: Mbuti f4: 0.000084 Z: 0.584163

gendstat: Ust_Ishim_HG Kostenki14 0.682
gendstat: Ust_Ishim_HG MA1_HG 1.319
gendstat: Ust_Ishim_HG Han -0.481
gendstat: Ust_Ishim_HG Papuan -0.228
gendstat: Ust_Ishim_HG Onge -0.420
gendstat: Ust_Ishim_HG Chukchi -0.392
gendstat: Ust_Ishim_HG Karitiana -0.735
gendstat: Ust_Ishim_HG Mbuti 0.584
gendstat: Kostenki14 MA1_HG 0.578
gendstat: Kostenki14 Han -1.139
gendstat: Kostenki14 Papuan -0.852
gendstat: Kostenki14 Onge -1.040
gendstat: Kostenki14 Chukchi -1.069
gendstat: Kostenki14 Karitiana -1.338
gendstat: Kostenki14 Mbuti -0.268
gendstat: MA1_HG Han -1.969
gendstat: MA1_HG Papuan -1.606
gendstat: MA1_HG Onge -1.833
gendstat: MA1_HG Chukchi -1.918
gendstat: MA1_HG Karitiana -2.269
gendstat: MA1_HG Mbuti -0.956
gendstat: Han Papuan 0.276
gendstat: Han Onge 0.058
gendstat: Han Chukchi 0.129
gendstat: Han Karitiana -0.476
gendstat: Han Mbuti 1.268
gendstat: Papuan Onge -0.206
gendstat: Papuan Chukchi -0.181
gendstat: Papuan Karitiana -0.605
gendstat: Papuan Mbuti 0.872
gendstat: Onge Chukchi 0.039
gendstat: Onge Karitiana -0.425
gendstat: Onge Mbuti 1.140
gendstat: Chukchi Karitiana -0.607
gendstat: Chukchi Mbuti 1.103
gendstat: Karitiana Mbuti 1.478

## end of run

I used the O9 outgroups from Lazaridis et al. 2016.

Chad Rohlfsen
02-01-2018, 10:56 PM
Your standard errors are bigger than the coefficients for WHG and EHG. Bad deal. You could also use Levant N and Iran in the pright. 3-way mixes are much harder to fit than simple 2-way, rank 2 runs. You need outgroups that create the biggest z score. Onge, Papuan, Chukchi really aren't needed here.
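
The `details:` lines in the run above appear to give an f4 value and Z score for each source against each right pop (relative to the first right pop, Ust_Ishim_HG), so they offer one rough way to see which outgroups produce the biggest Z scores. A sketch under that assumption; `max_abs_z_by_right` is a hypothetical helper, and taking the largest |Z| per right pop is just one possible summary.

```python
from collections import defaultdict

# "details:" lines copied from the LBK_EN run above.
DETAILS = """\
details: Anatolia_Neolithic Kostenki14 0.000023 0.115955
details: WHG Kostenki14 0.002469 6.118548
details: EHG Kostenki14 0.001006 2.064895
details: Anatolia_Neolithic MA1_HG 0.000036 0.187860
details: WHG MA1_HG 0.002178 5.785034
details: EHG MA1_HG 0.007664 15.628680
details: Anatolia_Neolithic Han -0.000111 -0.761027
details: WHG Han 0.000238 0.862752
details: EHG Han 0.001552 4.425303
details: Anatolia_Neolithic Papuan -0.000053 -0.328964
details: WHG Papuan 0.000105 0.338111
details: EHG Papuan 0.000577 1.510145
details: Anatolia_Neolithic Onge -0.000090 -0.584392
details: WHG Onge 0.000196 0.686686
details: EHG Onge 0.000815 2.264091
details: Anatolia_Neolithic Chukchi -0.000171 -1.124657
details: WHG Chukchi 0.000841 2.956709
details: EHG Chukchi 0.004089 11.403239
details: Anatolia_Neolithic Karitiana -0.000272 -1.656485
details: WHG Karitiana 0.001043 3.296909
details: EHG Karitiana 0.005643 14.037116
details: Anatolia_Neolithic Mbuti 0.000172 1.178055
details: WHG Mbuti -0.001584 -5.574412
details: EHG Mbuti -0.001143 -3.283675
"""

def max_abs_z_by_right(log_text):
    """Largest |Z| seen for each right pop, sorted high to low."""
    best = defaultdict(float)
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) == 5 and parts[0] == "details:":
            right, z = parts[2], float(parts[4])
            best[right] = max(best[right], abs(z))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

ranking = max_abs_z_by_right(DETAILS)
for right, z in ranking:
    print(f"{right}: max |Z| = {z:.2f}")
```

On this run Papuan and Onge come out at the bottom, while Chukchi ranks third, right behind MA1_HG and Karitiana; a pure max-|Z| ranking does not account for outgroups that carry overlapping information.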

Bas
02-02-2018, 04:08 AM
Yes, definitely get Levant_N and Iran_N in there, your Anatolia_N will need that. Possibly also some older Euro HGs, as you are using WHG-EHG as your references (maybe Bichon and Villabruna are going to be too close to the 3 WHG samples in your left to be in your right pops, but possibly Goyet and/or El_Miron could just scrape by and provide some extra Euro HG depth in your right). What WHGs are you using in your left?

Funnily enough though, a roughly 95% Anatolia_N and 5% WHG is what it should be. Whether the same pright can work out a more complex mix without letting just any combo pass tail prob is another matter. While testing the effect of different prights on LBK, using a similar pright as Mathieson I've had qpadm tell me that LBK could be modelled as Anatolia_N + La_Brana. No problem. Running Mota/Ust/Kost/Ma1/WHG/EHG/Iran_N/Levant_N/Natufian also said La_Brana was a plausible mixing pop. Only when I put El_Miron in the right did it give a tail prob under 0.05, while it passed Koros_HG.

This might sound like a stupid question, but why have the big papers so frequently gone with a pright setup of moderns like Mbuti, Han, Papuan, Karitiana etc. over a pright of what might be a closer and more relevant '4 pillars' of West Eurasia such as WHG/EHG/Iran_N/Levant_N, plus basals like Ust-Ishim, Kostenki, MA_1?

Kale
02-05-2018, 06:09 AM
Academic conservatism probably? It's not a good idea / valid method until someone else has published a paper saying it is? (Even though obviously it is)

Bas
02-05-2018, 11:10 PM
There might need to be a few tweaks made to some models, knowing how sensitive qpAdm is to the pRight selection.