Experimentation with Admixture Software

Dr_McNinja
02-19-2014, 06:04 AM
I wanted to see the feasibility of making a Eurogenes K36-like calculator for India, so I used a supervised admixture run at k=30 with the populations from this study:

http://www.cell.com/AJHG/abstract/S0002-9297(11)00488-5

And here were the results:


HRP0349:

0.37% Gond
1.23% Brahmin from Uttaranchal
60.25% Kshatriya
38.12% Brahmin from Uttar Pradesh

HRP0341:

3.57% Brahmin from Tamil Nadu
1.03% Meghawal
54.21% Kshatriya
41.16% Brahmin from Uttar Pradesh

HRP0370:

0.92% Meena
33.89% Kshatriya
65.17% Brahmin from Uttar Pradesh

Their number of samples was pretty low though. I think this is doable, but which datasets to include? Maybe include one of the HGDP Pakistani groups (Sindhi probably). Which datasets have the largest number of diverse Indian samples? This was at around ~500k SNPs.

EDIT: We could make reference groups using our own raw data if we could get enough, but that'd be pretty time consuming.
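For anyone wanting to try a supervised run like this, here is a minimal Python sketch of building the .pop file that ADMIXTURE's --supervised mode reads alongside the .bed/.bim/.fam files. As far as I know, each line of the .pop file lines up with the same line of the .fam file, holding a population label for a reference sample and "-" for a sample whose ancestry should be estimated. The function name, the reference IDs, and the label map are made up for illustration and are not the setup used above.

```python
def build_pop_lines(fam_lines, reference_labels):
    """Return one .pop entry per .fam line, in the same order.

    fam_lines: lines of a PLINK .fam file (FID IID PAT MAT SEX PHENO).
    reference_labels: hypothetical dict mapping IID -> population label
    for reference samples; anyone not in it is treated as unknown.
    """
    pop_lines = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:
            # A skipped line would misalign .pop against .fam, so fail loudly.
            raise ValueError("malformed .fam line: %r" % line)
        iid = fields[1]
        # "-" marks an individual whose ancestry should be estimated.
        pop_lines.append(reference_labels.get(iid, "-"))
    return pop_lines


# Toy example: two invented reference samples and one project participant.
fam = [
    "ref1 KSH001 0 0 0 -9",
    "ref2 UPB001 0 0 0 -9",
    "harappa HRP0349 0 0 1 -9",
]
labels = {"KSH001": "Kshatriya", "UPB001": "Brahmin_UP"}
pop = build_pop_lines(fam, labels)
print("\n".join(pop))
```

Saved as data.pop next to data.bed, the run would then be started with something like "admixture --supervised data.bed 30", where K should match the number of distinct reference labels.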

Dr_McNinja
02-19-2014, 01:46 PM
I was fooling around and used the India study I mention above with me (HRP0349), Sapporo (HRP0341), and the Afghan Pashtun (HRP0370) in unsupervised mode with 2 populations.

It split into two weird populations. Population 1 peaked in Dharkar and Dusadh (also above 90% in Kanjar, Brahmins, Kshatriya, etc) and Population 2 peaked in Naga and Pulliyar. HRP0341 and HRP0370 were 99% Population 1 while I was ~95% with ~5% of Population 2. Naga are a Northeastern Indian tribal group:


Naga; 4; 25.67, 94.11; Sino-Tibetan; Tibeto-Burman; Kuki-Chin-Naga. Naga (folk people) refers to a conglomeration of many tribes inhabiting the northeastern part of India who have similar cultures. More than 15 subtribes inhabit the northeastern state of Nagaland.

I did it again without HRP0370 and got mostly the same:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdEJKN3I1el8tbTFpdUVTR1Rjd3I3dkE&usp=drive_web#gid=0

I don't know what that is, but I'm pretty sure that represents the little bit extra of S-Indian I've been getting in calculators.

Population 2 drops off sharply after Naga/Pulliyar.

I also used only myself, Sapporo, and HRP0370 with ~900+k SNPs in unsupervised mode with K=2, K=4, and K=8. At k=2, I clustered with HRP0370 @ 100% as Population 1 and HRP0341 clustered by themselves as Population 2. At k=8 the three of us split into non-overlapping populations of various amounts. At k=4, however, there was some overlap:

HRP0349:

Pop1: 0.39%
Pop2: 85.3%
Pop3: 14.3%
Pop4: 0%

HRP0341:

Pop1: 56.1%
Pop2: 40.1%
Pop3: 0%
Pop4: 3.84%

HRP0370:

Pop1: 0%
Pop2: 0%
Pop3: 54.12%
Pop4: 45.88%

Finally I ran k=32 in supervised mode with the previous dataset of k=30 plus HGDP Pathans and Sindhis and got this:

HRP0349:

10.80% Gond
1.30% Brahmin from Uttaranchal
34.45% Kshatriya
50.43% Pathan

HRP0341:

3.49% Dharkar
1.37% Meghawal
5.73% Brahmin from Tamil Nadu
13.52% Kshatriya
32.37% Sindhi
41.28% Pathan

HRP0370:

100% Pathan

I ran it at k=33 with Hapmap's GIH population and it didn't change much.

What I notice is that as soon as populations outside modern India are introduced (i.e., HGDP Pakistanis), I start developing this extra little bit of some ASI-like component, which is then compensated for by an increased western shift in my other components (manifested as European admixture that is West-leaning in the calculators which feature those components). It's not a gradient like it is for other users (here shown as HRP0341 having Sindhi admixture to go with the Pathan). Or maybe it's the other way around: it detects some Western Eurasian admixture that is compensated for by pulling out some East Eurasian which has ASI affinity. That could be the overlap I had with HRP0370 in the ~900+k SNP k=4 run. (EDIT: The Gond had like ~8-9% SE-Asian in Harappa)

Dr_McNinja
02-19-2014, 02:15 PM
I re-ran the k=2, k=4, k=8 unsupervised runs with just me, Sapporo, and HRP0370, but pruned with plink's --indep-pairwise 50 5 0.3 to see what difference LD-based pruning made, and it pruned down to ~15k SNPs (don't know why it pruned so much).
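To make the pruning idea concrete, here is a simplified Python sketch of r^2-based pruning. It is not plink's exact algorithm: it ignores the step size and just greedily keeps a SNP if its r^2 with every already-kept SNP in the preceding window is at or below the threshold. One plausible reason for the heavy pruning, assuming the LD was estimated from only three samples, is that r^2 from three genotypes is extremely coarse, so many SNP pairs look perfectly correlated. The genotype vectors below are invented.

```python
def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic in this sample, correlation undefined
    return cov * cov / (var_x * var_y)


def greedy_prune(genotypes, window=50, threshold=0.3):
    """Return indices of SNPs kept by a simplified greedy LD prune.

    A SNP is kept only if its r^2 with every kept SNP among the
    previous `window` positions is <= threshold.
    """
    kept = []
    for i, snp in enumerate(genotypes):
        recent = [j for j in kept if i - j < window]
        if all(r_squared(snp, genotypes[j]) <= threshold for j in recent):
            kept.append(i)
    return kept


# Invented genotypes: one row per SNP, one column per individual (3 samples).
snps = [
    [0, 1, 2],
    [0, 1, 2],  # identical to SNP 0, r^2 = 1
    [2, 1, 0],  # mirror image of SNP 0, r^2 = 1
    [1, 0, 1],  # uncorrelated with SNP 0
    [0, 0, 1],  # r^2 = 0.75 with SNP 0
]
kept_indices = greedy_prune(snps)
print(kept_indices)  # [0, 3]
```

With only three individuals, three of the five toy SNPs are dropped, which is the kind of over-pruning a tiny sample can produce.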

Results changed to, k=2:

HRP0349:

52.57% Population 1
47.43% Population 2

HRP0341:

100% Population 1

HRP0370:

100% Population 2

At k=4,

HRP0349:

52.42% Population 1
28.14% Population 2
19.44% Population 3
0% Population 4

HRP0341:

1.49% Population 1
98.5% Population 2
0% Population 3
0% Population 4

HRP0370:

0% Population 1
0% Population 2
54.38% Population 3
45.62% Population 4

EDIT: I should mention plink keeps detecting chromosomes 23, 24, and 26 in our data.
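Those codes are plink's numeric names for the non-autosomal chromosomes: 23 is X, 24 is Y, 25 is the pseudo-autosomal region of X, and 26 is mitochondrial. Since admixture runs are normally done on autosomal SNPs only, a small Python sketch for picking out chromosomes 1-22 from a .bim file could look like this. The SNP IDs and positions are invented.

```python
AUTOSOMES = {str(c) for c in range(1, 23)}


def autosomal_snp_ids(bim_lines):
    """Return the SNP IDs on chromosomes 1-22 from PLINK .bim lines.

    Each .bim line is: chromosome, SNP ID, genetic distance,
    base-pair position, allele 1, allele 2.
    """
    ids = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank lines
        chrom, snp_id = fields[0], fields[1]
        if chrom in AUTOSOMES:
            ids.append(snp_id)
    return ids


# Invented .bim rows covering autosomes plus codes 23, 24, and 26.
bim = [
    "1 snp_a 0 752566 A G",
    "22 snp_b 0 16050408 C T",
    "23 snp_c 0 2700157 A G",
    "24 snp_d 0 2655180 C T",
    "26 snp_e 0 263 A G",
]
keep = autosomal_snp_ids(bim)
print(keep)  # ['snp_a', 'snp_b']
```

The resulting list, written one ID per line, can be handed to plink's --extract option to keep only those SNPs.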

EDIT #2: Pruning down to ~300k keeps the results mostly similar to the first run but HRP0341's little bit of Population 4 disappeared after pruning down.

BTW, if anyone else wants to send me their 23andMe v3 raw data, I'll include you and re-do these runs (without pruning at ~900+k and then again at 2 or 3 different levels of pruning).

parasar
02-19-2014, 04:35 PM
I was fooling around and used the India study I mention above with me (HRP0349), Sapporo (HRP0341), and the Afghan Pashtun (HRP0370) in unsupervised mode with 2 populations.

It split into two weird populations. Population 1 peaked in Dharkar and Dusadh (also above 90% in Kanjar, Brahmins, Kshatriya, etc) and Population 2 peaked in Naga and Pulliyar. HRP0341 and HRP0370 were 99% Population 1 while I was ~95% with ~5% of Population 2. Naga are a Northeastern Indian tribal group:


Regarding the two populations at the extremes:
The Pulliyar were one of the populations (the Irula were another) that showed a Caucasus component but no Baloch when the latter broke up.
The Dharkar as per Moorjani et al were one of the later populations to admix.

Sein
02-19-2014, 09:54 PM
Dr_McNinja, could you add the Di Cristofaro samples from Afghanistan (5 populations), include 7 HGDP South-Central Asian populations (Brahui, Burusho, Hazara, Kalash, Pashtun, Sindhi, and Uyghur. Basically all of them, but excluding Baloch and Makrani, and including the Brahui in their place), include the K=30 Indian data-set, include the GIH, and run a supervised admixture run? So supervised K=43 with all of these populations? And I'd love to send my raw-data.

Here is the link for the samples from Afghanistan:

http://www.evolutsioon.ut.ee/MAIT/public_data/afghan/

everest59
02-19-2014, 11:36 PM
I would like to send my data as well. Also, I can provide you with Singapore Indian samples if you want. I think it's like a 100 or so individuals.

Dr_McNinja
02-20-2014, 01:22 AM
I would like to send my data as well. Also, I can provide you with Singapore Indian samples if you want. I think it's like a 100 or so individuals.

Sure, e-mail it to me. I'm more interested in the Hapmap samples that had 1+million SNPs. I wonder how many of those overlap with 23andMe v3? Or was the dataset you sent already reduced to the ones that overlapped with 23andMe?
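A rough way to check the overlap is to intersect the SNP IDs of the two datasets. Here is a minimal Python sketch that does this on .bim lines. It matches on the ID column only, so it says nothing about strand flips, allele mismatches, or build differences, and the IDs shown are invented.

```python
def snp_ids(bim_lines):
    """Collect the SNP ID column (second field) from PLINK .bim lines."""
    return {line.split()[1] for line in bim_lines if line.strip()}


def shared_snps(bim_a, bim_b):
    """Return the set of SNP IDs present in both datasets."""
    return snp_ids(bim_a) & snp_ids(bim_b)


# Invented examples standing in for a reference panel and a 23andMe v3 file.
reference_bim = [
    "1 snp_a 0 1000 A G",
    "1 snp_b 0 2000 C T",
    "2 snp_c 0 3000 A C",
]
personal_bim = [
    "1 snp_b 0 2000 C T",
    "2 snp_c 0 3000 A C",
    "3 snp_d 0 4000 G T",
]
common = shared_snps(reference_bim, personal_bim)
print(len(common), sorted(common))  # 2 ['snp_b', 'snp_c']
```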

Dr_McNinja
02-20-2014, 01:31 AM
Dr_McNinja, could you add the Di Cristofaro samples from Afghanistan (5 populations), include 7 HGDP South-Central Asian populations (Brahui, Burusho, Hazara, Kalash, Pashtun, Sindhi, and Uyghur. Basically all of them, but excluding Baloch and Makrani, and including the Brahui in their place), include the K=30 Indian data-set, include the GIH, and run a supervised admixture run? So supervised K=43 with all of these populations? And I'd love to send my raw-data.

Here is the link for the samples from Afghanistan:

http://www.evolutsioon.ut.ee/MAIT/public_data/afghan/

Sure, I'll do that next. Will have to wait until the weekend though. Why do you want to exclude Baloch/Makrani but not Brahui?

Sein
02-20-2014, 01:40 AM
Sure, I'll do that next. Will have to wait until the weekend though. Why do you want to exclude Baloch/Makrani but not Brahui?

Thanks!

For the Balochistanis, they are all genetically very similar, so it might be redundant to have three nearly identical populations as three separate reference poles in a supervised Admixture run. And the Baloch are basically a cosmopolitan spin on the Brahui. Also, the Brahui are the most homogenous out of all three populations, and the least genetically African (especially in comparison to the Makrani), so I guess the Brahui are the best population to keep out of all three.

everest59
02-20-2014, 03:08 AM
Sure, e-mail it to me. I'm more interested in the Hapmap samples that had 1+million SNPs. I wonder how much of those overlap with 23andMe v3? Or was the dataset you sent already reduced to the ones that overlapped with 23andMe?

I don't remember why the file has only 600k snp's. I will give the big one to you by Saturday. I'll need to download it and do some work on it.
I think the Singapore Indian samples have >million snps if I remember correctly.

Dr_McNinja
02-20-2014, 03:54 AM
Alright, Sein sent me 3 files, for himself, another Afghan Pashtun, and a Khetran Baloch.

If you guys know any more individuals with 23andMe v3 raw data who you can convince, have them e-mail it to me. Anyone from the South Asia (especially anyone with very high S-Indian % in Harappa), Southwest Asia, Caucasus, Europe cline especially.

Does anyone know how Zack manages Harappa? Does he collect user submitted samples into one giant dataset and run it in unsupervised mode or did he select a particular group of reference populations and make a calculator, and run user samples through the calculator?

Sein
02-20-2014, 02:13 PM
Alright, Sein sent me 3 files, for himself, another Afghan Pashtun, and a Khetran Baloch.

If you guys know any more individuals with 23andMe v3 raw data who you can convince, have them e-mail it to me. Anyone from the South Asia (especially anyone with very high S-Indian % in Harappa), Southwest Asia, Caucasus, Europe cline especially.

Does anyone know how Zack manages Harappa? Does he collect user submitted samples into one giant dataset and run it in unsupervised mode or did he select a particular group of reference populations and make a calculator, and run user samples through the calculator?

I think he did the former, created a huge data-set, and ran it in unsupervised mode.

I think the latter is never a wise idea. The "calculator effect" always makes things problematic. But regardless, Zack didn't try this option, which I think was a good decision on his part.

Dr_McNinja
02-20-2014, 02:35 PM
I think he did the former, created a huge data-set, and ran it in unsupervised mode.

I think the latter is never a wise idea. The "calculator effect" always makes things problematic. But regardless, Zack didn't try this option, which I think was a good decision on his part.

I'm surprised the results have stayed so consistent with almost 400 user submitted samples. Makes me wonder why the results are getting noisier as well. Shouldn't some of the results for earlier participants have changed with the introduction of that many newer ones?

Sein
02-20-2014, 02:46 PM
A very interesting question, have you asked him?

MfA
02-20-2014, 04:13 PM
Follow the posts after Curious' and Paul's comments; there are 2 sets of test results with slightly different admixtures, pre-HRP0239 and a later iteration.
http://www.harappadna.org/2013/08/harappaworld-hrp0328-hrp0351/#comments

Dr_McNinja
02-20-2014, 04:55 PM
First, you create an admixture calculator using all available data. Then you compute the component percentages for the newer samples a few samples at a time so that you get the same components. That's how every genome blogger has done it.
I can redo the whole admixture analysis including everyone, but that won't be the same as HarappaWorld at all. Also, it is a fairly laborious process where I select samples and SNPs. Then run ADMIXTURE at various K and also compute cross validation error to figure out which K is the best. Also I have to run admixture at one K multiple times with different seeds to find out stable, likely solutions. Some samples, like the Kalash, can create their own useless component which needs to be gotten rid of. Finally, there's bootstrapping and error estimation.

Does this mean he introduces every new group of samples into his reference dataset in an admixture run, inputs their results into the web spreadsheet, and so on and so forth, without permanently adding new individuals to the original reference dataset? I'll e-mail him and ask.
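The "compute cross validation error to figure out which K is the best" step in the quoted explanation can be sketched in Python. When ADMIXTURE is run with --cv it prints a line of the form "CV error (K=4): 0.55320" for each run, and the usual rule of thumb is to prefer the K with the lowest error. The log text and error values below are invented for illustration.

```python
import re

# Matches lines such as "CV error (K=4): 0.55320" in ADMIXTURE output.
CV_PATTERN = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def cv_errors(log_text):
    """Map each K found in the log text to its cross-validation error."""
    return {int(k): float(err) for k, err in CV_PATTERN.findall(log_text)}


def best_k(errors):
    """Return the K with the lowest cross-validation error."""
    return min(errors, key=errors.get)


# Invented log lines, concatenated from three hypothetical runs.
log_text = """\
CV error (K=2): 0.56010
CV error (K=4): 0.55320
CV error (K=8): 0.55890
"""
errors = cv_errors(log_text)
print(best_k(errors))  # 4
```

The same parsing could be repeated over logs from several seeds at one K to see whether the solution is stable, which is the other check mentioned in the quote.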

everest59
02-20-2014, 05:18 PM
Well try experimenting with termination criteria. I think that's what causes noise. He does not let the program run till the end probably to save time.

Dr_McNinja
02-21-2014, 02:38 PM
Zack said he's working on doing a totally new admixture run.

Dr_McNinja
02-21-2014, 02:52 PM
Also, I removed the first and second degree relatives from the India study and re-did the unsupervised k=2 with me, Sapporo, and HRP0370 and my 5% of Population 2 (where Naga/Pulliyar were Pop2 and Dusadh/Dharkar were Pop1) virtually disappeared to 0.6%.

Dr_McNinja
02-21-2014, 05:50 PM
Damn, this link seems to have gone down. Did anyone have this dataset?

http://www.evolutsioon.ut.ee/MAIT/public_data/munda/

From this paper: http://mbe.oxfordjournals.org/content/28/2/1013.long

everest59
02-21-2014, 06:22 PM
I may have it. I think I downloaded it just a few days ago.

Dr_McNinja
02-22-2014, 02:55 PM
Sein, I ran unsupervised k=2, k=4, k=8 with Brahui, Sindhi, Pathan plus the India study (plus some extra samples from the India Austroasiatic speakers study (http://mbe.oxfordjournals.org/content/28/2/1013.long), minus the Burmese) and all of us so far:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDk0NTZQdWxTcEdrMmVqdVdLWG5Oc3c#gid=2

I notice one pretty interesting thing actually. A few people (the Afghan Pashtun and South Asian you gave me) have what ought to be their Brahui component registering as one or the other South Indian/East Eurasian component. A complete lack of Brahui. HRP0370 didn't have any either but registered as all Pathan.

Also k=4 is weird and I'm not sure what's going on. I think that Population 1 is East Asian and that Population 4 is South Indian (like Harappa's South Indian).

I'm thinking I should add the rest of the hgdp populations in, minus Kalash. Then see how that goes, then see about adding Hapmap.

Dr_McNinja
02-22-2014, 03:00 PM
I e-mailed Dr. Reich for his dataset with the Onge. If/when they come through, I want to add all those in and also run some supervised sessions with Onge.

Dr_McNinja
02-22-2014, 03:03 PM
Btw, I'm still looking for as many 23andMe v3 raw data files as possible to build up the ~900k SNP dataset. All ethnicities pretty much but especially European, Caucasian, Mideast, South Asian (really interested in some South Indians), Central Asian, East Asian.

Sein
02-22-2014, 03:05 PM
Sein, I ran unsupervised k=2, k=4, k=8 with Brahui, Sindhi, Pathan plus the India study (plus some extra samples from the India Austroasiatic speakers study (http://mbe.oxfordjournals.org/content/28/2/1013.long), minus the Burmese) and all of us so far:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDk0NTZQdWxTcEdrMmVqdVdLWG5Oc3c#gid=2

I notice one pretty interesting thing actually. A few people (the Afghan Pashtun and South Asian you gave me) have what ought to be their Brahui component registering as one or the other South Indian/East Eurasian component. A complete lack of Brahui. HRP0370 didn't have any either but registered as all Pathan.

Also k=4 is weird and I'm not sure what's going on. I think that Population 1 is East Asian and that Population 4 is South Indian (like Harappa's South Indian).

I'm thinking I should add the rest of the hgdp populations in, minus Kalash. Then see how that goes, then see about adding Hapmap.

Cool! Thanks Dr_McNinja. Although, it says I need permission to see the spreadsheet.

That should turn out interesting, with all of those populations.

Dr_McNinja
02-22-2014, 03:13 PM
Sorry fixed sharing

Dr_McNinja
02-22-2014, 03:18 PM
Also, Sapporo and the other Jatt Muslim, my brother in law, are coming up mostly even until k=8 when the three of us hit 50% Pathan.

Sein
02-22-2014, 03:18 PM
Thanks!

K=2 is rather interesting, basically what one would expect

everest59
02-22-2014, 03:21 PM
Mcninja, do you by any chance have the Afghan data set and the Caucasus data set? That site is down. The Caucasus data set has populations like chechens and lezgins.

Dr_McNinja
02-22-2014, 03:22 PM
Mcninja, do you by any chance have the Afghan data set and the Caucasus data set? That site is down. The Caucasus data set has populations like chechens and lezgins.

The site is relocated here:

http://evolbio.ut.ee/

everest59
02-22-2014, 03:23 PM
Sweet. Thanks for the link.

Sein
02-22-2014, 03:31 PM
I wonder if anything will change, if you add GIH, the other HGDP populations, and all of the populations from Afghanistan, with K=43?

everest59
02-22-2014, 03:39 PM
About the Reich data set, I think you are not allowed to publish it individually. Meaning you need to hide the ID's. I think it should be okay as long as individuals can't be identified. I basically ignore the data set. Also only 400k something snp's and the files I received were in a very different format.

Sein
02-22-2014, 03:48 PM
There is some new data here:
http://evolbio.ut.ee/khazar/

The actual paper:
http://digitalcommons.wayne.edu/cgi/viewcontent.cgi?article=1040&context=humbiol_preprints

Dr_McNinja
02-22-2014, 03:49 PM
I wonder if anything will change, if you add GIH, the other HGDP populations, and all of the populations from Afghanistan, with K=43?

Probably not, that's why I'm waiting for Onge for more supervised runs.

Sein
02-22-2014, 03:58 PM
Probably not, that's why I'm waiting for Onge for more supervised runs.

I think you might see some changes. The East Asian will get better allocation with Hazara, and the Afghan Pashtun and Afghan Tajik results will make more sense with Afghan populations. Also, I think the snp overlap isn't good with the Reich data-set.

Also, I'm somewhat confused about one aspect of the results. Aren't all the HGDP Pashtuns supposed to be 100% "Pathan", since this is a supervised run? Almost all of them have other components.

Dr_McNinja
02-22-2014, 06:04 PM
That last one was unsupervised. That's how k=4 was able to generate a pseudo-South Indian component.

Sein
02-22-2014, 06:46 PM
That last one was unsupervised. That's how k=4 was able to generate a pseudo-South Indian component.

Ah, I have to seriously improve my reading and attention skills! That makes so much more sense.

This probably explains why the Pashtun component is modal to HRP0370. She is probably coming from a lineage that has experienced genetic drift, and she does have some inbreeding in her history. I think that explains why the other Afghan Pashtun has such different results from her.

If you try the supervised K=43, it would be great for South Asians. You could even get it into GEDMatch as a calculator comparable to Eurogenes K36. I probably sound like a broken record, but I still recommend this. Your K=30 supervised data-set, all HGDP South and Central Asian populations (with the exception of the Baloch and Makrani, and Kalash are okay, since this is supervised), GIH, and all of the Di Cristofaro populations.

Dr_McNinja
02-22-2014, 07:06 PM
here's a supervised k=44, it seems mostly the same as the last k30+ runs I did,

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDI3eXlFWW9RdTRJazI3UzJRNlBKYXc#gid=0

It's just India, Brahui, Sindhi, Pathan though.

I should start condensing the Indian populations into groups, but I don't know where to begin as I don't even recognize half of them.

Sein
02-22-2014, 07:20 PM
here's a supervised k=44, it seems mostly the same as the last k30+ runs I did,

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDI3eXlFWW9RdTRJazI3UzJRNlBKYXc#gid=0

It's just India, Brahui, Sindhi, Pathan though.

I should start condensing the Indian populations into groups, but I don't know where to begin as I don't even recognize half of them.

The Balochi individual's results are very nice considering their background. The Khetran, although a Baloch tribe, claim Pashtun origins, and most anthropologists think they are a very old mixture between Pashtuns, Baloch, and Saraiki people, but at some point, they became assimilated to the Baloch people, and adopted the identity.

But the Afghan Pashtun I sent you is still somewhat unusual? I think their East Asian admixture is being consumed by their Sindhi score.

Dr_McNinja
02-22-2014, 07:40 PM
The Balochi individual's results are very nice considering their background. The Khetran, although a Baloch tribe, claim Pashtun origins, and most anthropologists think they are a very old mixture between Pashtuns, Baloch, and Saraiki people, but at some point, they became assimilated to the Baloch people, and adopted the identity.

But the Afghan Pashtun I sent you is still somewhat unusual? I think their East Asian admixture is being consumed by their Sindhi score.

For some reason the South Asians (India/Pakistan) have been showing some affinity to that individual in almost every run I've done.

Sein
02-22-2014, 07:45 PM
For some reason the South Asians (India/Pakistan) have been showing some affinity to that individual in almost every run I've done.

That's pretty interesting.

Maybe the supervised runs are too South Asian-centric, so people from the far western fringe might not be getting proper parsing, since the variation is strongly contingent on the India-based population coverage? This is one of the reasons I recommend adding more HGDP Pakistani populations, and adding the samples from Afghanistan.

MfA
02-22-2014, 08:14 PM
These are anonymous and probably US whites, but there are 500+ V2 and V3 23andMe raw data files; maybe you can filter them by running admixture.

https://my.pgp-hms.org/public_genetic_data?data_type=23andMe

Dr_McNinja
02-22-2014, 09:20 PM
These are anonymous and probably US whites, but there are 500+ V2 and V3 23andMe raw data files; maybe you can filter them by running admixture.

https://my.pgp-hms.org/public_genetic_data?data_type=23andMe

Most of these are labeled 'B36'. Everest, will that 23andme.pl conversion script (I got it from Zack's site) be able to convert these to Plink format or can we only use the B37 files?

everest59
02-22-2014, 09:26 PM
Most of these are labeled 'B36'. Everest, will that 23andme.pl conversion script (I got it from Zack's site) be able to convert these to Plink format or can we only use the B37 files?

You need lift over. I think the last time you may have unzipped the chain file. Try it this time without unzipping the file.
Edit: But yes it should convert the file. I use Razib's script.

everest59
02-23-2014, 04:42 AM
I did a quick admixture run using only 30k snp's. The main purpose was to check for mislabeling in my samples. This was a quick run, and probably didn't take even an hour. K of 10. I'm not claiming this is accurate. In a real run, the Kalash would have to be removed. You can all see they formed their own little component. Also, there are two native american-specific components. And btw the samples labeled "Brahmins" are a mixture of Tamil Brahmins, Uttar Pradesh Brahmins, and one Uttaranchal Brahmin.

https://docs.google.com/spreadsheet/ccc?key=0AnvEDdpZDjUpdGNFTGwzMmdta1RLSGhwb2pmenlHNnc&usp=sharing

So a lot more work in a real run.
And as you can see, quite a bit of noise.
I have included everybody who sent me their samples.

ZephyrousMandaru
02-23-2014, 06:12 AM
Thanks Everest.

Papuan 1.13%
Native American One 1.20%
SW Asian-Med 38.16%
East Asian 0%
Native American Two 0%
Gedrosia-Caucasus 50.63%
European 5.03%
South Asian 0%
Kalash 2.73%
African 0%

MfA
02-23-2014, 10:27 AM
.....

Dr_McNinja
02-23-2014, 11:41 AM
I did an unsupervised k=5, k=6, and k=8:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdEE4YzJaNkRRZ3dNa09DTnRTVUFPNEE&usp=drive_web#gid=0

I'm not sure what happened but going from k=6 to k=8 a second Indian component appeared (peaking in Dharkar/Kanjar) which took an additional 10% out of everyone's West Eurasian.

What do you suppose that is? And what do you think the discrepancy between S-Indian and Dharkar/Kanjar means? Some have more of one than the other.

Someone mentioned that the Dharkar were one of the later populations to admix (between ANI-ASI I assume).

The individual labeled "SouthAsian" has this DIYHarappa result:


S-Indian 30.86%
Baloch 32.20%
Caucasian 14.81%
NE-Euro 9.62%
SE-Asian 1.99%
Siberian 2.34%
NE-Asian 3.63%
Papuan 0.14%
American 0.98%
Beringian 1.06%
Mediterranean -
SW-Asian 2.24%
San -
E-African 0.11%
Pygmy -
W-African -

I've looked at it more closely, and Siberian/Beringian/Amerindian all seem to be getting caught in one of the Indian components.

Dr_McNinja
02-23-2014, 11:44 AM
Whoops, shared the document

Dr_McNinja
02-23-2014, 11:59 AM
The interesting one is HRP0370 whose results are very close to their Harappa breakdown:


S-Indian 3.55%
Baloch 35.80%
European 15.96%
Indian 14.58%
Caucasus 24.46%
Med 3.37%
East Asian 2.28%

This would seem to imply their ~15% Indian in Harappa came entirely from the ~2kya admixture event (with around 1% ASI getting caught in East Asian perhaps) and their S-Indian score here is completely their Arctic. Or something like that.

Dharkar/Kanjar are ~54% S-Indian in Harappa while Onge, usually the substitute for ASI, is also around ~50%. Bhunjia and Dhurwa are 80% S-Indian here and ~73% in Harappa, the difference possibly attributable to Arctic-type components.

Some other people might have a component shifting between Baloch and Caucasus, so it's useful to compare the sum of those two between calculators. Nonetheless for the rest of us actually from India or India-bordering regions, our Caucasus hasn't changed a lot and some of our Baloch has gone into the Indian components. But which one is the question...

Sein's Indian component would also be Dharkar/Kanjar-heavy once we remove the Arctic components from the S-Indian result.

Dr_McNinja
02-23-2014, 02:04 PM
Did another run of k=8 with Everest (HRP0315 - Nepalese Brahmin):

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFdwdElaOUJ2MnpTazFtcVRpd0xjWWc#gid=3

Dr_McNinja
02-23-2014, 02:23 PM
Sein, does the person whose data you sent me ("AfghanPashtun") have a gedmatch kit? Do you know their Harappa results?

everest59
02-23-2014, 02:27 PM
I did a quick PCA of Metspalu Indians, and it seems that the Pulliyar form their own cluster. In my opinion, you should remove them.

Edit: Nevermind, I was wrong. It was a 2x3 plot instead of the regular 1x2.

Dr_McNinja
02-23-2014, 02:37 PM
I did a quick PCA of Metspalu Indians, and it seems that the Pulliyar form their own cluster. In my opinion, you should remove them.

Edit: Nevermind, I was wrong. It was a 2x3 plot instead of the regular 1x2.

I'll probably do that next run. Here they seem like a substitute for Papuan, SE-Asian or perhaps Amerindian.

Dr_McNinja
02-23-2014, 03:04 PM
Something else interesting I noticed about 23andMe. A Punjabi Jatt Sikh family recently shared with me on there, a mother, father, and their son. The parents were around 92-93% South Asian but the son, with Split View and everything, was at 99% South Asian. The husband and wife did share the same maternal lineage but were from different parts of Punjab so their common descent on the maternal side probably wasn't anytime recent.

Have you guys seen this phenomenon with other families?

Dr_McNinja
02-23-2014, 03:27 PM
Sein, does the person whose data you sent me ("AfghanPashtun") have a gedmatch kit? Do you know their Harappa results?

Nevermind, I checked it myself:


Population
S-Indian 21.89%
Baloch 33.91%
Caucasian 17.54%
NE-Euro 11.27%
SE-Asian 1.31%
Siberian 1.94%
NE-Asian 5.21%
Papuan -
American 2.23%
Beringian 1.04%
Mediterranean 1.79%
SW-Asian 1.57%
San 0.27%
E-African -
Pygmy -
W-African -

Sein
02-23-2014, 03:48 PM
Sein, does the person whose data you sent me ("AfghanPashtun") have a gedmatch kit? Do you know their Harappa results?

I wish I could share information like that, but I don't have permission. I might be able to send another Afghan Pashtun with similar results, but I just need that individual's permission. There are quite a few Afghan Pashtuns at 23andMe, and their results are wildly diverse, but I just need to get them into open endeavors like this.

Looking through Everest's K=10 run, I think Tadjik1_44Af is among the samples which show similarity to myself. My similarity to this sample leaves no more doubt in my mind, you guys were right, this sample is probably an Afghan Pashtun. Some mislabeling must have happened during the sampling.

My results:
Gedrosia-Caucasus=42.58%
South Asian=29.57%
Kalash=8.12%
European=7.02%
SW Asian-Med=4.44%
Native American 1=2.63%
Papuan=2.26%
East Asian=1.71%
Native American 2=1.63%

Tadjik1_44Af:
Gedrosia-Caucasus=40.68%
South Asian=26.76%
European=10.93%
Kalash=7.61%
East Asian=7.40%
Native American 1=3.40%
SW Asian-Med=2.88%
Native American 2=0.30%

On second thought, this sample is pretty different from me, besides the Gedrosia-Caucasus and South Asian components. This sample is much more East Eurasian than myself.

Side note: Everest, could you post Fst distances between the components?

Edit: I read the post above.

everest59
02-23-2014, 04:10 PM
Well Sein I did not save the fst distance output unfortunately. I wonder if there's a command to print it out. Unfortunately can't use it any time soon.
Anyways, do the Kalash really have to be removed? For example Lazaridis et al decided not to remove them from their runs even though they like to form their own cluster.
Also, sure they are inbred, but it doesn't seem it's too bad. It seems that the Karitiana for example are even more inbred. I guess I feel that it's better not to remove any populations.
Yes, it's a little confusing. I told Mcninja to remove the Pulliyars, but I'm not sure if it's a good idea. I say keep experimenting.

Sein
02-23-2014, 04:16 PM
Well Sein I did not save the fst distance output unfortunately. I wonder if there's a command to print it out. Unfortunately can't use it any time soon.
Anyways, do the Kalash really have to be removed? For example, Lazaridis et al. decided not to remove them from their runs even though they like to form their own cluster.
Also, sure they are inbred, but it doesn't seem it's too bad. It seems that the Karitiana for example are even more inbred. I guess I feel that it's better not to remove any populations.

With the Kalash, I don't think you should remove them. They do create their own component at some Ks, but I think it depends on the level of granularity. Also, when you do a more serious run, could you include Tadjik1_44Af as a Pashtun, and Pashtun2_6Af as an Afghan Tajik?

How many SNPs will you try with a more intensive run?

everest59
02-23-2014, 04:22 PM
With the Kalash, I don't think you should remove them. They do create their own component at some Ks, but I think it depends on the level of granularity. Also, when you do a more serious run, could you include Tadjik1_44Af as a Pashtun, and Pashtun2_6Af as an Afghan Tajik?

How many SNPs will you try with a more intensive run?

I think 200k is a good number. Sure I'll switch those samples.
Actually, the IDs (whether they are Pashtun or not) do not matter, as Admixture does not look at them.

Sein
02-23-2014, 04:30 PM
I think 200k is a good number. Sure I'll switch those samples.
Actually, the IDs (whether they are Pashtun or not) do not matter, as Admixture does not look at them.

200k sounds good.

What I meant is if you ever do population averages, could you count the mislabeled Tajik as a Pashtun.

everest59
02-23-2014, 04:35 PM
200k sounds good.

What I meant is if you ever do population averages, could you count the mislabeled Tajik as a Pashtun.

Sure.
However, Mcninja is doing a good job. He has a good computer, so his runs will be much faster than mine.
Don't worry, I'll do what I can.
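A rough sketch of how the relabeling could be applied when computing population averages. The two sample IDs and the swap are the ones discussed above; everything else (the function name, the third sample ID, and the two-component proportions) is hypothetical and only there to make the example run.

```python
from collections import defaultdict

# Sample ID -> corrected population label (the two swaps discussed above).
RELABEL = {"Tadjik1_44Af": "Pashtun", "Pashtun2_6Af": "Afghan Tajik"}

def population_averages(samples, relabel=RELABEL):
    """samples: list of (sample_id, original_label, [proportions]).
    Returns {label: [mean proportion per component]} after relabeling."""
    grouped = defaultdict(list)
    for sample_id, label, proportions in samples:
        grouped[relabel.get(sample_id, label)].append(proportions)
    return {
        label: [sum(column) / len(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

# Hypothetical two-component proportions, for illustration only.
samples = [
    ("Tadjik1_44Af", "Afghan Tajik", [0.60, 0.40]),
    ("Pashtun_example", "Pashtun", [0.70, 0.30]),  # made-up sample ID
    ("Pashtun2_6Af", "Pashtun", [0.50, 0.50]),
]
averages = population_averages(samples)
for label, means in averages.items():
    print(label, [round(m, 3) for m in means])
```

The ADMIXTURE run itself is untouched; only the grouping step afterwards uses the corrected labels.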

Dr_McNinja
02-23-2014, 06:33 PM
Tweaked the earlier run and removed all Indian samples but Kanjar/Dharkar and ran k at 6, 7, 8:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDdxcU5CUm1lSDVic3FUblpNM2lHZ3c#gid=2

A Hazara component appeared and pulled some of the Euro into other components. I think a little bit of the Euro or Caucasus is caught in Hazara itself which is interesting (also the levels it reaches in Russians).

Dr_McNinja
02-23-2014, 08:00 PM
Removed all Indian but Nihali, Dhurwa, Bhunjia, and ran k 6, 7, and 8:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdE5NV1BHZTU1YmM3ZFFYZDNBck5hZGc#gid=2

Some European components split up this time. Sheet 2 (k=7) was interesting because of two components:

"Euro1" only reached a max of ~60-66% in Orcadians primarily but also Russians and Basque. I wonder what that is?

And another one which was like a cross between Chechen and some kind of Eastern European which ate up most people's Caucasian. But not everyone, some still had the old Caucasus component.

Dr_McNinja
02-23-2014, 08:16 PM
And just for kicks, one more run at k 6, 7, 8 with no Indian samples aside from the participants:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFg1TEFCTlBrLW05VTgtMXB3bHpaT1E#gid=1

Dr_McNinja
02-23-2014, 08:38 PM
Well, tinkering with the Indian samples shed no light on what that S-Indian/Dharkar split was. The Gondal Jatt seemed to exhibit a pattern where he would have less Hazara or whatever component substituted for S-Indian and more Baloch than HRP0341 even though they were very close when a S-Indian population was present (with HRP0341 having less S-Indian in Harappa), which led me to believe some of his Indian component was very similar to Baloch.

Tomorrow I'm going to try adding some Siberian to the original dataset. I don't really like the idea though, it seems to imply some kind of North Eurasian ancestry when, correct me if I'm wrong, these Arctic/North Eurasian populations emigrated out of Central Asia and probably reached that point from South Asia. It's some kind of ancestral South Central Asian that they represent in South Asians, which has a close affinity to ASI. It's the same mark left on Malta and most Central Asians and North Europeans of today who still have those trace marks of Siberian/Amerindian/etc. It might be what's left of the original North Indians (before the West Eurasian-like ANI came back into the picture) not long after they came out of Africa via Yemen perhaps.

MfA
02-24-2014, 01:15 AM
Although Caucasus seemed to stay the same, I believe it's an illusion. The removed portions of Baloch (about -2%) and Mediterranean (about -2%) merge with Caucasus. Northern-shifted Caucasus alleles (about +4%) create the Chechen component, which also includes various European alleles (Orcadian +1%, plus minor Basque +1%).
I see a similar trend in the Italian and Turkish samples too, though it seems to be the opposite for Central and South Asians, whose Baloch and Hazara #2 values increase regardless.


^South Asian
What I see here is that most of the removed Caucasus alleles merge with Baloch, and almost all of the Indian merges with Hazara and creates Hazara #2. It's likely that the extra Indian alleles (about +5%) join with Baloch, but some Baloch alleles have probably been transferred into Caucasus, so it seems to be compensated in the end.
Russian splits into Euro and Chechen, since some Baloch alleles have been transferred into Caucasus and some northern-shifted Caucasus alleles also create Chechen, as explained above (Chechen + Euro > Russian).

Dr_McNinja
02-24-2014, 01:33 PM
Made a few more runs of the initial dataset which doesn't have the new people who just sent me their data.

k=9, 10, 11:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdG5KbUxQTnJJeGl4Y25YLVZoTUlpRkE#gid=0

k=12, 13:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdHoxUUxIaFltYXNvc1JGRXZYc1lPeFE#gid=0

At higher K the Indian component fractured again in surprising ways. A new Indian component emerged which sucked out some of our Baloch. In my case by the highest k run, it took out almost half my Baloch %, more than anyone else I noticed, which I guess means my Baloch is similar to Indian (sort of like how the Gondal Jatt had it happen a little in the reverse in the previous post). So that ~5-6% extra S-Indian I'm getting could be from my Baloch category, which would put me in line with the other Pakistani Jatts. I deduced as much from my mother's results though since hers was similar to mine but her S-Indian and Baloch swung the other way by an almost equivalent amount.

Taking it up to the highest, k=13, resulted in even more Baloch pulled to Indian and also pulled to Caucasus.

EDIT: This was shown in the k44 supervised run too where "SouthAsian" and myself had virtually no Brahui, instead getting a few percentage points of Dusadh in its place. Aside from that, the k-30something run with no Brahui seemed to be the most geographically accurate for myself, making me essentially something like 35% Uttar Pradesh Kshatriya, 50% Pathan, and 10% Gond which averages out to south Indian Punjab. When Brahui was introduced, the Kshatriya and Pathan shed about 5% each into the Dusadh category for some reason.

Dr_McNinja
02-24-2014, 02:26 PM
My brother in law's DNA relatives just finished and he shows up as a 5th cousin to my mom. Their overlap is on a region of Chromosome 9 which 23andMe identifies as European for him, but South Asian for my mom.

everest59
02-24-2014, 03:07 PM
Hey, so I don't see my name in there. Are you going to include new samples any time soon?

parasar
02-24-2014, 06:06 PM
Well, tinkering with the Indian samples shed no light on what that S-Indian/Dharkar split was. The Gondal Jatt seemed to exhibit a pattern where he would have less Hazara or whatever component substituted for S-Indian and more Baloch than HRP0341 even though they were very close when a S-Indian population was present (with HRP0341 having less S-Indian in Harappa), which led me to believe some of his Indian component was very similar to Baloch.

Tomorrow I'm going to try adding some Siberian to the original dataset. I don't really like the idea though, it seems to imply some kind of North Eurasian ancestry when, correct me if I'm wrong, these Arctic/North Eurasian populations emigrated out of Central Asia and probably reached that point from South Asia. It's some kind of ancestral South Central Asian that they represent in South Asians, which has a close affinity to ASI. It's the same mark left on Malta and most Central Asians and North Europeans of today who still have those trace marks of Siberian/Amerindian/etc. It might be what's left of the original North Indians (before the West Eurasian-like ANI came back into the picture) not long after they came out of Africa via Yemen perhaps.

1. Yes that is possible that Siberians came via South Asia, ie, SE Asia (Y- MP) to South Asia (Y- P, Q, R) to Eastern Europe (Y- Q, R) to Siberia (Y- Q, R) and then back into India and Europe. They would have almost nothing to do with Yemen though, except perhaps during an exit from Africa 100000ybp or even earlier.

2. There is also a possibility they came from the east, ie, SE Asia (Y-P) to East Asia to Siberia (Y-P, Q, R) and then into India (Y-P, Q,R) and Europe (Y-R). There is also a very slight chance that SE Asia to Siberia path came through the Americas. There is clear connection between American Indians and the Baikal Yeniseians.

I think scenario 1 is the more likely one, as that explains why MA1 had both 0% W. Asian and ~0% E. Asian affinity but still shows some Papuan affinity.

Both scenarios would give ASI affinity to Siberians.

Dr_McNinja
02-24-2014, 08:15 PM
Everest or anyone, do you know how to create a dendrogram from plink's .cluster files?

Dr_McNinja
02-24-2014, 08:16 PM
Here's a West vs. East Eurasian run with and without the India and Caucasus study samples:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdHlHS1RQX3pMNnVmOV9raHJlRUZPYmc#gid=0

Dr_McNinja
02-24-2014, 08:20 PM
Here were the contents of the .cluster3 file:

Using v3 and v4 overlap (~500k SNPs):


HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

AfghanPashtun 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0

KhetranBaloch 2 2 2 2 2 2 1 1 1 1 1 0 0 0 0

HRP0349 3 3 3 3 3 1 0 0 0 0 0 0 0 0 0

HRP0341 4 3 3 3 3 1 0 0 0 0 0 0 0 0 0

HRP0370 5 4 4 4 4 3 2 2 2 0 0 0 0 0 0

SouthAsian 6 5 5 5 5 4 3 3 3 2 2 1 0 0 0

Kurd 7 6 6 6 6 5 4 4 4 3 3 2 1 1 0

AfghanTajik 8 7 7 7 7 6 5 5 5 4 4 3 2 0 0

Gondal 9 8 8 0 0 0 0 0 0 0 0 0 0 0 0

Turk1 10 9 9 8 8 7 6 4 4 3 3 2 1 1 0

NEEuro1 11 10 10 9 9 8 7 6 6 5 3 2 1 1 0

Italian1 12 11 11 10 8 7 6 4 4 3 3 2 1 1 0

Kaundinya 13 12 12 11 10 9 8 7 3 2 2 1 0 0 0

Turk2 14 13 6 6 6 5 4 4 4 3 3 2 1 1 0


Using only v3 (~800-900k SNPs):


HRP0282 HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0

AfghanPashtun AfghanPashtun 1 1 1 1 1 0 0 0 0 0 0 0 0 0

KhetranBaloch KhetranBaloch 2 2 2 2 2 1 1 1 1 1 1 0 0 0

HRP0349 HRP0349 3 3 3 3 1 0 0 0 0 0 0 0 0 0

HRP0341 HRP0341 4 3 3 3 1 0 0 0 0 0 0 0 0 0

HRP0370 HRP0370 5 4 4 4 3 2 2 2 0 0 0 0 0 0

SouthAsian SouthAsian 6 5 5 5 4 3 3 2 0 0 0 0 0 0

Kurd Kurd 7 6 6 6 5 4 4 3 2 2 2 1 1 0

AfghanTajik AfghanTajik 8 7 7 7 6 5 5 4 3 3 3 2 0 0

NEEuro1 NEEuro1 9 8 8 8 7 6 6 5 4 2 2 1 1 0

Turk1 Turk1 10 9 9 9 8 7 4 3 2 2 2 1 1 0

Italian1 Italian1 11 10 10 9 8 7 4 3 2 2 2 1 1 0

Kaundinya Kaundinya 12 11 11 10 9 8 7 6 5 4 0 0 0 0

Turk2 Turk2 13 12 6 6 5 4 4 3 2 2 2 1 1 0
^ I'm pretty surprised the Gondal Jatt didn't cluster with me and Sapporo.
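The rows above can be read programmatically. As far as the PLINK documentation describes it, a .cluster3 file lists every clustering step from left to right: the first label column has each individual in its own cluster, and the last has everyone in cluster 0. Under that reading, two samples are joined at the first column where their labels match. The sketch below uses a few rows copied from the ~500k table; the function names are made up, and the parser assumes sample IDs are not purely numeric.

```python
# A few rows copied from the ~500k SNP table above.
CLUSTER3_TEXT = """
HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
HRP0349 3 3 3 3 3 1 0 0 0 0 0 0 0 0 0
HRP0341 4 3 3 3 3 1 0 0 0 0 0 0 0 0 0
Gondal 9 8 8 0 0 0 0 0 0 0 0 0 0 0 0
"""

def parse_cluster3(text):
    """Map each sample ID to its list of cluster labels, one per clustering step."""
    table = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # Assumption: the first token is the ID; any repeated non-numeric ID
        # column (FID/IID) is skipped, and numeric tokens are cluster labels.
        table[tokens[0]] = [int(t) for t in tokens[1:] if t.isdigit()]
    return table

def first_merge_step(table, a, b):
    """Index of the first step at which samples a and b share a cluster."""
    for step, (label_a, label_b) in enumerate(zip(table[a], table[b])):
        if label_a == label_b:
            return step
    return None

table = parse_cluster3(CLUSTER3_TEXT)
for pair in [("HRP0349", "HRP0341"), ("Gondal", "HRP0282"), ("Gondal", "HRP0349")]:
    print(pair, "merge at step", first_merge_step(table, *pair))
```

On these rows HRP0349 and HRP0341 join at step 1, while Gondal joins HRP0282 at step 3 and only meets HRP0349 at step 6, which is the pattern remarked on above.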

Dr_McNinja
02-24-2014, 08:24 PM
Btw, all my admixture runs above are with around ~450-475k SNPs.

everest59
02-24-2014, 08:54 PM
Mcninja, feed it to R. I've never tried it though. I think Humanist may know.

Sein
02-24-2014, 09:38 PM
Here's a West vs. East Eurasian run with and without the India and Caucasus study samples:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdHlHS1RQX3pMNnVmOV9raHJlRUZPYmc#gid=0

Your run without India corresponds very strongly to 23andMe's old results with only "European", "African", and "Asian".

MfA
02-24-2014, 11:36 PM
Here were the contents of the .cluster3 file:
Using only v3 (~800-900k SNPs):

^ I'm pretty surprised the Gondal Jatt didn't cluster with me and Sapporo.

only v3

http://abload.de/img/dendrogram_mfa_pdr2l.png

MfA
02-25-2014, 01:46 PM
Made a few more runs of the initial dataset which doesn't have the new people who just sent me their data.

k=9, 10, 11:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdG5KbUxQTnJJeGl4Y25YLVZoTUlpRkE#gid=0

k=12, 13:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdHoxUUxIaFltYXNvc1JGRXZYc1lPeFE#gid=0

At higher K the Indian component fractured again in surprising ways. A new Indian component emerged which sucked out some of our Baloch. In my case by the highest k run, it took out almost half my Baloch %, more than anyone else I noticed, which I guess means my Baloch is similar to Indian (sort of like how the Gondal Jatt had it happen a little in the reverse in the previous post). So that ~5-6% extra S-Indian I'm getting could be from my Baloch category, which would put me in line with the other Pakistani Jatts. I deduced as much from my mother's results though since hers was similar to mine but her S-Indian and Baloch swung the other way by an almost equivalent amount.

Taking it up to the highest, k=13, resulted in even more Baloch pulled to Indian and also pulled to Caucasus.

EDIT: This was shown in the k44 supervised run too where "SouthAsian" and myself had virtually no Brahui, instead getting a few percentage points of Dusadh in its place. Aside from that, the k-30something run with no Brahui seemed to be the most geographically accurate for myself, making me essentially something like 35% Uttar Pradesh Kshatriya, 50% Pathan, and 10% Gond which averages out to south Indian Punjab. When Brahui was introduced, the Kshatriya and Pathan shed about 5% each into the Dusadh category for some reason.

based on K12, K13 results, only participants.

http://abload.de/img/dendrogram_k12kgo4b.png
http://abload.de/img/dendrogram_k13jwp2g.png

Sapporo
02-25-2014, 02:28 PM
Interesting that HRP0370 is clustering with the Khetran Baloch. Kind of surprised that HRP0282 and AfghanPashtun are not clustering immediately but SouthAsian (some type of mixed South Asian) is with AfghanPashtun.

MfA
02-25-2014, 04:27 PM
Interesting that HRP0370 is clustering with the Khetran Baloch. Kind of surprised that HRP0282 and AfghanPashtun are not clustering immediately but SouthAsian (some type of mixed South Asian) is with AfghanPashtun.

I gave it another try, this time using Pearson dissimilarity rather than the Euclidean distance used in the previous iteration.

http://abload.de/img/k13_pearsonqwku3.png
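For anyone wondering why the choice of metric changes the tree, here is a small sketch of the two measures. It is not the tool used for the plots above (that isn't stated); the three profiles are invented. Euclidean distance looks at absolute differences in percentages, while Pearson dissimilarity (1 minus the correlation) only looks at whether two profiles rise and fall together across components.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two admixture profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_dissimilarity(a, b):
    """1 - Pearson correlation: 0 for the same pattern, up to 2 for opposite ones."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

# Hypothetical four-component profiles (percentages), for illustration only.
ref = [40, 30, 20, 10]
same_shape = [34, 28, 22, 16]   # same ordering of components, just flatter
near_values = [42, 27, 20, 11]  # numerically close, slightly different pattern

for name, other in [("same_shape", same_shape), ("near_values", near_values)]:
    print(f"{name:12s} euclidean={euclidean(ref, other):.2f} "
          f"pearson_dissim={pearson_dissimilarity(ref, other):.4f}")
```

Here Euclidean distance ranks near_values as the closer profile, while Pearson dissimilarity ranks same_shape as closer, so samples can swap branches between the two dendrograms without any change in the underlying admixture results.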

Dr_McNinja
02-26-2014, 07:12 PM
Here's some runs with the Saqqaq study's Siberian populations added into the dataset:

https://docs.google.com/spreadsheets/d/1SGQlslPoSxQEf70d6DIEyaUXv3aqjU63pdnTuDDdlEI/edit#gid=670495311

They're all pretty interesting, similar phenomenon with certain components breaking up and recombining at different k levels. I noticed the Indian components were peaking at higher percentages in the modal populations than with Harappa's S-Indian component and that was also true for the Baloch and in general most of the populations. At one point a NE-Euro-like component split off and I began labeling it as Euro2 because it only peaked at near 60% in Russians. The way our North Eurasian-like components (including ASI in S-Indian) were recombining varied. More S-Indian bled off into Siberian-type components in some individuals than others (HRP0341 consistently had pretty high Siberian components that didn't show up in the normal gedmatch calculators).

k=14 was interesting since it pulled the S-Indian apart (~9%) from my other Indian and I was equivalent to HRP0341 and the other Jatt (who is going to be HRP0402) in the other Indian component (peaking in Dharkar/Kanjar). Everest wound up right with us too. It was pretty stark that that (the ~9%) was equivalent roughly to the difference in Baloch between us.

The single k8 pruned run was interesting because the S-Indian/Baloch values for the Gondal Jatt and HRP0341 were close to their Harappa results.

(These don't include the new participants like soulblighter and Zeph since I started these a few days ago before I got their data)

Dr_McNinja
02-26-2014, 07:29 PM
^ Also at k=9 and k=10, the Yukagir component somehow picked up a bunch of NE-European, that's why it got so high in everyone. (Edit: Although now we've had concrete examples of Admixture shifting S-Indian<->Siberian, Siberian<->NE-Euro, and S-Indian<->Baloch into one another).

everest59
02-26-2014, 08:23 PM
This is very comprehensive.
The Caucasus component is consistently at >10% in my case. The Gedmatch calculators have it between 6-9%, and almost always at less than 10%. Calculator effect at play here? Also, Baloch in the low 30's, which is typical in my case.
Also, look at the Northern European in UP Brahmins. BR084 is almost a whopping 20% Northern European at K=16. I'm at 15% in comparison.

MfA
02-26-2014, 08:42 PM
Thanks. I have no idea how the Admixture procedure works, but can you check cross-validation to see which K is the most stable?
K12 and K14 look good to me, though. Maybe the next step would be an African cluster.
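On the cross-validation question: ADMIXTURE has a `--cv` option that prints a line of the form `CV error (K=12): 0.51234` for each run, and the usual guidance is to prefer the K with the lowest error. A minimal sketch for collecting those lines follows. It assumes the output of each run was saved as text; the log excerpts and error values below are invented.

```python
import re

# Matches lines such as "CV error (K=12): 0.51234" in ADMIXTURE output.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")

def parse_cv_errors(log_texts):
    """Collect {K: cross-validation error} from the text of several run logs."""
    errors = {}
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)] = float(err)
    return errors

def best_k(errors):
    """Return the K with the lowest cross-validation error."""
    return min(errors, key=errors.get)

# Hypothetical log excerpts; the numbers are made up for illustration.
logs = [
    "Summary:\nCV error (K=12): 0.5123\n",
    "Summary:\nCV error (K=13): 0.5098\n",
    "Summary:\nCV error (K=14): 0.5110\n",
]
errors = parse_cv_errors(logs)
for k in sorted(errors):
    print(k, errors[k])
print("lowest CV error at K =", best_k(errors))
```

In practice the differences between neighboring K values can be small, so the error curve is better read as a guide than as a verdict.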

Hanna
02-26-2014, 09:23 PM
This is very comprehensive.
The Caucasus component is consistently at >10% in my case. The Gedmatch calculators have it between 6-9%, and almost always at less than 10%. Calculator effect at play here? Also, Baloch in the low 30's, which is typical in my case.
Also, look at the Northern European in UP Brahmins. BR084 is almost a whopping 20% Northern European at K=16. I'm at 15% in comparison.

My Caucasus component is slightly more than normal in this run. I am Turk2.

everest59
02-26-2014, 10:27 PM
I downloaded Zack's Harappa calculator, and created Zombies out of the allele frequencies. The following plot is a little hard to read, but it makes sense as far as I can tell. So, the Baloch looks a little shifted towards South Asian. Furthermore, the SW Asian is shifted slightly towards African.
I was trying to create zombies out of the run I did earlier, but I got weird results. This one appears fine to me.

https://drive.google.com/file/d/0B3vEDdpZDjUpSkRaX1JKSGJYVGM/edit?usp=sharing

The next step is to calculate what each component is made up of by conducting supervised runs (e.g., compare the South Indian component to other components, etc.).
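For readers new to the term, a "zombie" here is a simulated individual drawn from a component's allele frequencies. The exact procedure used for the plot above isn't given, so the following is only a sketch of the general idea: at each SNP, draw two alleles independently with the component's frequency, giving a genotype of 0, 1 or 2. The frequencies, function names and seed are all hypothetical.

```python
import random

def make_zombie(freqs, rng):
    """Draw one diploid genotype (0, 1 or 2 copies of the counted allele) per SNP."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]

def make_zombies(freqs, n, seed=0):
    """Simulate n individuals from one component's allele frequencies."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [make_zombie(freqs, rng) for _ in range(n)]

# Hypothetical allele frequencies for one component at five SNPs.
component_freqs = [0.05, 0.30, 0.50, 0.80, 1.00]
zombies = make_zombies(component_freqs, n=25)

# Sanity check: the mean genotype / 2 should sit near the input frequency.
for i, p in enumerate(component_freqs):
    observed = sum(z[i] for z in zombies) / (2 * len(zombies))
    print(f"SNP {i}: input {p:.2f}, simulated {observed:.2f}")
```

A real run would read the frequencies from the calculator's per-component allele frequency file and write the genotypes in a format PLINK can load, with the allele coding kept consistent with the reference data.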

Sein
02-26-2014, 11:28 PM
I downloaded Zack's Harappa calculator, and created Zombies out of the allele frequencies. The following plot is a little hard to read, but it makes sense as far as I can tell. So, the Baloch looks a little shifted towards South Asian. Furthermore, the SW Asian is shifted slightly towards African.
I was trying to create zombies out of the run I did earlier, but I got weird results. This one appears fine to me.

https://drive.google.com/file/d/0B3vEDdpZDjUpSkRaX1JKSGJYVGM/edit?usp=sharing

The next step is to calculate what each component is made up of by conducting supervised runs (e.g., compare the South Indian component to other components, etc.).

Great!

Do you think you could eventually try something similar to Dienekes' ANI-ASI experiment, after you conduct supervised runs on the components?

everest59
02-26-2014, 11:34 PM
Great!

Do you think you could eventually try something similar to Dienekes' ANI-ASI experiment, after you conduct supervised runs on the components?

Do you have a link to his ANI ASI experiment?
If I remember, it seems like a pain. You need to calculate ANI and ASI using every single SNP. So a bunch of equations per locus that will have to be solved.

Sein
02-26-2014, 11:36 PM
Do you have a link to his ANI ASI experiment?
If I remember, it seems like a pain. You need to calculate ANI and ASI using every single SNP. So a bunch of equations per locus that will have to be solved.

Here is the link http://dodecad.blogspot.com/2011/05/more-zombies-ancestral-north-indians.html

Analysis of HGDP samples http://dodecad.blogspot.com/2011/05/aniasi-analysis-of-hgdp-pakistan-groups.html

A side note, but is it possible for you to project the participants you have onto the PCA plot of the components?

everest59
02-26-2014, 11:49 PM
I don't know how to project. Never tried it before.
Also ran into a problem. My zombies use different allele codes. I tried changing the codes using Plink, but PCA looks weird. All my participants are clustered very closely, all of whom are close to the South Indian. I need to solve this issue first before I can do anything else.

Sapporo
02-27-2014, 12:35 AM
I think K11, K12, K13 and K15 are pretty solid efforts. K14 and K16 seem a bit weird with their South Asian components. K9's Yukagir component also seems strange.

everest59
02-27-2014, 03:15 AM
I tried comparing some Harappa calculator components to other components to see what I get. South Indian is almost 85% Caucasus according to my run. Caucasus itself has relationship to Baloch, Mediterranean, NE Euro, as well as South Indian a little bit.
https://docs.google.com/spreadsheet/ccc?key=0AnvEDdpZDjUpdEJYRTVmMk9FeE5TNzg1aHlFSnp5bXc&usp=sharing

Now, the 85% caught me by surprise. I was expecting 40-50% Caucasus.

Sapporo
02-27-2014, 03:59 AM
I tried comparing some Harappa calculator components to other components to see what I get. South Indian is almost 85% Caucasus according to my run. Caucasus itself has relationship to Baloch, Mediterranean, NE Euro, as well as South Indian a little bit.
https://docs.google.com/spreadsheet/ccc?key=0AnvEDdpZDjUpdEJYRTVmMk9FeE5TNzg1aHlFSnp5bXc&usp=sharing

Now, the 85% caught me by surprise. I was expecting 40-50% Caucasus.


Not sure what to say. I don't think that seems right. Dienekes got about 45-50% Gedrosia for South Asian and HarappaWorld is quite similar to Dodecad K12b. Also fst distances indicate something in the 40-50% range of a West Asian like affinity for South Indian as well.

http://dienekes.blogspot.co.uk/2012/08/inter-relationships-of-dodecad-k12b-and.html

everest59
02-27-2014, 04:06 AM
Yeah, I think there's something wrong here. I'll try a different approach tomorrow.

Sein
02-27-2014, 04:35 AM
I tried comparing some Harappa calculator components to other components to see what I get. South Indian is almost 85% Caucasus according to my run. Caucasus itself has relationship to Baloch, Mediterranean, NE Euro, as well as South Indian a little bit.
https://docs.google.com/spreadsheet/ccc?key=0AnvEDdpZDjUpdEJYRTVmMk9FeE5TNzg1aHlFSnp5bXc&usp=sharing

Now, the 85% caught me by surprise. I was expecting 40-50% Caucasus.

That is pretty surprising, although it isn't prima facie unlikely. The South Indian component is moderately closer to the West Eurasian components, based on the Fst distances. Also, HarappaWorld yields much higher East Asian percentages for many individuals, including myself. I think I'm 9%-10% East Asian for HarappaWorld, but most other calculators put me at 4%-6%. So maybe HarappaWorld's "South Indian" component is just much more West Eurasian than usual? That would explain why all that East Eurasian affinity is being "transferred" to other components. It'll be very interesting if your next attempt yields the same result.

And Sapporo is right, the Dodecad K12b "South Asian" component turned out around 50% "Caucasus", and that component is very similar to HarappaWorld's "South Indian". But if you get the same result, it is what it is. We can't really "force" anything here, and there isn't any specific configuration that "should" be happening. Perhaps in this context, HarappaWorld's "South Indian" just is 85% Caucasian? Again, I wonder if you'll get different results tomorrow.

Dr_McNinja
02-27-2014, 10:14 AM
That is pretty surprising, although it isn't prima facie unlikely. The South Indian component is moderately closer to the West Eurasian components, based on the Fst distances. Also, HarappaWorld yields much higher East Asian percentages for many individuals, including myself. I think I'm 9%-10% East Asian for HarappaWorld, but most other calculators put me at 4%-6%. So maybe HarappaWorld's "South Indian" component is just much more West Eurasian than usual? That would explain why all that East Eurasian affinity is being "transferred" to other components. It'll be very interesting if your next attempt yields the same result.

And Sapporo is right, the Dodecad K12b "South Asian" component turned out around 50% "Caucasus", and that component is very similar to HarappaWorld's "South Indian". But if you get the same result, it is what it is. We can't really "force" anything here, and there isn't any specific configuration that "should" be happening. Perhaps in this context, HarappaWorld's "South Indian" just is 85% Caucasian? Again, I wonder if you'll get different results tomorrow.

The S-Indian component corresponds roughly to double the old 'Onge' component, I thought:

https://docs.google.com/spreadsheet/ccc?key=0AuW3R0Ys-P4HdHpQMmNqbDJGZENVYUdlUXVEeHEwMEE&hl=en_US#gid=0

The Onge itself in the new Harappa calculator is 50% South Indian and 50% East Asian and was being used as a stand-in for ASI. That still leaves the issue of what that 50% of Onge's S-Indian affinity really is. I've been trying to get the Reich dataset without success in order to play around with the Onge data. How did they describe the Onge?

everest59
02-27-2014, 10:40 AM
That is pretty surprising, although it isn't prima facie unlikely. The South Indian component is moderately closer to the West Eurasian components, based on the Fst distances. Also, HarappaWorld yields much higher East Asian percentages for many individuals, including myself. I think I'm 9%-10% East Asian for HarappaWorld, but most other calculators put me at 4%-6%. So maybe HarappaWorld's "South Indian" component is just much more West Eurasian than usual? That would explain why all that East Eurasian affinity is being "transferred" to other components. It'll be very interesting if your next attempt yields the same result.

And Sapporo is right, the Dodecad K12b "South Asian" component turned out around 50% "Caucasus", and that component is very similar to HarappaWorld's "South Indian". But if you get the same result, it is what it is. We can't really "force" anything here, and there isn't any specific configuration that "should" be happening. Perhaps in this context, HarappaWorld's "South Indian" just is 85% Caucasian? Again, I wonder if you'll get different results tomorrow.

I think the best thing I can do is download the K12b calculator and compare my results with those of Dienekes. Let's first see if I get similar results to his.

Dr_McNinja
02-27-2014, 03:46 PM
I made a perl script to measure the overlapping segments in GEDmatch's one to many tool. It was buggy but the idea was to get the entire list of all matches over 5cM then check every kit to weed out all those whose European was less than 70% (NE-Euro+Med on Harappa admixture), and then measure the exact total overlap per chromosome and overall. Of course it wound up being very slow. I had 1195 matches total (I don't know why it got a smaller count than when I do it on the website, which was 1348... or maybe they just added 150 matches recently). And of those initial 1195 matches, 1029 met the threshold for >70% European. It took over 24 hours to go through all the kits which isn't something I can do again since I need to use my computer for other things and can't leave it in Linux all day. Also I don't think the gedmatch administrators would like having repeated drains on their bandwidth of that amount.

Unfortunately I ran the script prematurely without debugging it, so it only kept 400-500-odd kit numbers, of which 220 or so were autosomal matches above 5cM (the rest were matches on the X chromosome). But of those few hundred, it calculated 11.66% overlap with Europeans across 23 chromosomes. So I figure my Photoshop ruler method (which was around 20%) wouldn't be too far off, since that 11.66% came about from only a fifth of my total matches. When I did it by hand, I probably selected 90-95% of the kits when it turns out only 86% were adequately European (if I lower the threshold to include 65-70% matches, it would be close). I compared the one-to-many results at GEDmatch, and the heaviest areas of coverage on the chromosomes correspond roughly to the Countries of Ancestry and advanced Family Inheritance+Relative Finder results. They all show the middle of my Chromosome 15 as European (and I have the most gedmatch hits at that exact same spot). A chromosome painting using gedmatch admixture calculators (like DIYHarappa) shows the middle of Chromosome 15 as barely having any European, and shows most of the European coverage in a completely different spot on the same chromosome.
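The overlap calculation described above can be sketched in a few lines of Python. This is not the Perl script itself (which isn't posted), just the same idea under stated assumptions: each match segment is already available as a record with the kit's European percentage (NE-Euro + Mediterranean), the chromosome, and start/end positions. All field names, lengths and sample records below are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals so shared stretches count once."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def european_coverage(segments, chrom_lengths, min_euro=70.0):
    """Fraction of the genome covered by matches with kits at or above min_euro."""
    per_chrom = {}
    for seg in segments:
        if seg["euro"] >= min_euro:  # drop kits below the European threshold
            per_chrom.setdefault(seg["chrom"], []).append((seg["start"], seg["end"]))
    covered = sum(
        end - start
        for intervals in per_chrom.values()
        for start, end in merge_intervals(intervals)
    )
    return covered / sum(chrom_lengths.values())

# Hypothetical data in arbitrary length units, for illustration only.
chrom_lengths = {"9": 100, "15": 100}
segments = [
    {"kit": "A", "euro": 85.0, "chrom": "15", "start": 20, "end": 50},
    {"kit": "B", "euro": 90.0, "chrom": "15", "start": 40, "end": 60},
    {"kit": "C", "euro": 40.0, "chrom": "15", "start": 0, "end": 100},  # filtered out
    {"kit": "D", "euro": 75.0, "chrom": "9", "start": 10, "end": 20},
]
coverage = european_coverage(segments, chrom_lengths)
print(f"covered: {coverage:.1%}")
```

Merging the intervals before summing is what keeps heavily matched spots, such as the middle of Chromosome 15, from being counted many times over.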

Just reaffirms in my mind the questions raised about what exactly population structure software like ADMIXTURE is actually doing or if it's worth anything to begin with. Serious questions considering these tools have been integrated into the offerings of every autosomal testing service. I suppose these results could be of use in scientific studies of ancestral population links which is a different domain than what the average consumer is interested in (recent and current ancestry) but that doesn't explain their failure to correlate to IBS/IBD matches across populations.

parasar
02-27-2014, 04:20 PM
...

Just reaffirms in my mind the questions raised about what exactly population structure software like ADMIXTURE is actually doing or if it's worth anything to begin with. Serious questions considering these tools have been integrated into the offerings of every autosomal testing service. I suppose these results could be of use in scientific studies of ancestral population links which is a different domain than what the average consumer is interested in (recent and current ancestry) but that doesn't explain their failure to correlate to IBS/IBD matches across populations.

An ADMIXTURE advocate (a developer if I recall) tried to explain it to me, but being statistically challenged I never could understand why if a population B is sourced from A and C from B, then B would appear to be an admixture between A and C.

Then these other components showed up. Some looked reasonable, but others, such as the Kalash, Onge, Orcadian etc., look to be forming components when they are just drifted populations. There appears to be a reverse logic where, rather than parent populations appearing as components in their daughter populations, the daughter populations are the ones that break out earlier as components.

everest59
02-27-2014, 04:34 PM
Those drifted populations that form a cluster probably will have to be removed to get a clearer picture. It seems to me that any population that shows as 100% of a component ( e.g. the Kalash show up as 100% Kalash component) may need to be removed.
Also about the Gedmatch chromosome match, would that not be biased towards more recent ancestry? Let's suppose we have two people with equal Euro admixture. Let us suppose at 10%. One person's ancestry is from 200 years ago, whereas the other person's ancestry is 2000 years ago. What happens is, due to recombination for a longer period of time in the individual with older ancestry, the individual with more recent ancestry will have more matches. Am I correct here?
I don't trust unsupervised admixture runs a whole lot. Once we start getting more ancient DNA samples, then I think supervised admixture runs are the way to go. Unsupervised runs are just a guess.
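The 200-versus-2000-years question can be given rough numbers. A common back-of-envelope model treats the length of an unbroken ancestry tract as exponentially distributed with a mean of about 100/g cM after g generations of recombination. The sketch below applies that model; the 25-year generation time and the 5 cM cutoff (the match threshold mentioned earlier) are assumptions, and real data are messier than this.

```python
import math

GENERATION_YEARS = 25  # assumed generation time

def mean_tract_cm(generations):
    """Approximate mean tract length in cM after the given number of generations."""
    return 100.0 / generations

def fraction_longer_than(threshold_cm, generations):
    """Share of tracts exceeding threshold_cm under the exponential model."""
    return math.exp(-threshold_cm / mean_tract_cm(generations))

for years in (200, 2000):
    g = years // GENERATION_YEARS
    print(
        f"{years} years ~ {g} generations: mean tract {mean_tract_cm(g):.2f} cM, "
        f"{fraction_longer_than(5.0, g):.1%} of tracts longer than 5 cM"
    )
```

Under these assumptions, most tracts from 200 years ago still clear a 5 cM threshold, while only a small fraction from 2000 years ago do, which is the bias towards recent ancestry described above.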

Dr_McNinja
02-27-2014, 04:55 PM
Also about the Gedmatch chromosome match, would that not be biased towards more recent ancestry? Let's suppose we have two people with equal Euro admixture. Let us suppose at 10%. One person's ancestry is from 200 years ago, whereas the other person's ancestry is 2000 years ago. What happens is, due to recombination for a longer period of time in the individual with older ancestry, the individual with more recent ancestry will have more matches. Am I correct here?

Yup, especially for the past several generations. That's why it would be more relevant for consumer interests, which are in current ethnicity and recent ancestry.

But the same affects admixture eventually too. I'm submitting one of my grandparents' kits for Family Finder at FTDNA and I'd be willing to bet money they have less S-Indian and more Baloch than mine. Recombination from successive generations of inbreeding will apparently also deteriorate the old signals. So admixture might be relevant for longer time periods, but it will also succumb to the effects of recombination eventually.

My European signal with IBS/IBD is twice that of my signal from admixture analysis. And it's probably thousands of years old.

Dr_McNinja
02-27-2014, 04:56 PM
Once we start getting more ancient DNA samples, then I think supervised admixture runs are the way to go. Unsupervised runs are just a guess.

I agree. Who knows what relevance the unsupervised algorithm's components might have to reality. At least with supervised we know the limited degree of error we're getting.

everest59
02-27-2014, 05:16 PM
Another thing I wonder about, do these chromosome matches necessarily have to be European? Keep in mind that Europeans themselves have a very ancient near eastern admixture. So the correlation may be with West Eurasian as opposed to exclusively European. Gedmatch however has a real lack of non Europeans.
Where do I go for IBD/IBS runs? I downloaded Beagle but never got to use it.

Sein
02-27-2014, 05:23 PM
We must admit that, in an epistemological sense, the biological assumptions underlying ADMIXTURE are deeply problematic. Panmictic populations don't exist in the real world, "ancestral populations" never actually existed in the real world, and we living people are not "admixed" descendants of these putatively "pure" "ancestral populations". That sort of simple population structure just doesn't apply to most of Africa and Eurasia. What we actually see on an empirical basis is isolation-by-distance, punctuated by gene flow between geographically distant populations, followed by more isolation-by-distance. This sort of model describes genetic variation throughout much of the globe, with few exceptions. In addition to that, we have to remember a few technical issues. For example, ADMIXTURE can't utilize Linkage-Disequilibrium. We are missing out on some very rich haplotype information, and that does have an effect on the inferences we make. Then we have interesting stuff happening with ascertainment bias.

At the end of the day, ADMIXTURE works best when the model underlying it is closest to reality. Take African Americans, for example. We know that African Americans are admixtures between three highly divergent population sources (Sub-Saharan Africans, Europeans, and Native Americans), and that all three sources admixed in very recent history. So ADMIXTURE produces results that make good sense.

But everest makes an excellent point, supervised runs are perfect for ancient DNA. Personally, I couldn't agree more, since in such a case, the empirical reality matches the underlying conceptual ontology.
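
For reference, the model under discussion treats each genotype as two draws from an allele frequency that is a weighted mix of K ancestral-population frequencies. Below is a minimal sketch of that log-likelihood (constant terms dropped) on made-up toy numbers. The names `g`, `q`, and `f` are labels chosen for this example, and the frequencies are hypothetical.

```python
from math import log

def log_likelihood(g, q, f):
    """Log-likelihood of genotypes under a simple admixture model.

    g[i][j]: copies (0, 1, 2) of the counted allele for individual i at SNP j
    q[i][k]: ancestry fraction of individual i from population k
    f[k][j]: frequency of the counted allele in population k at SNP j
    """
    total = 0.0
    for i, row in enumerate(g):
        for j, count in enumerate(row):
            # Individual-specific allele frequency: mix of population frequencies.
            p = sum(q[i][k] * f[k][j] for k in range(len(f)))
            total += count * log(p) + (2 - count) * log(1 - p)
    return total

# Toy data: two ancestral populations, two SNPs, one individual.
f = [[0.9, 0.2],   # population A
     [0.1, 0.8]]   # population B
g = [[2, 0]]       # genotype that looks like population A

pure_a = [[1.0, 0.0]]
mixed = [[0.5, 0.5]]
pure_b = [[0.0, 1.0]]

for name, q in (("pure A", pure_a), ("50/50", mixed), ("pure B", pure_b)):
    print(name, round(log_likelihood(g, q, f), 3))
```

The fit is only as meaningful as the assumption that the rows of `f` correspond to populations that really existed, which is the point made above.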

Sein
02-27-2014, 05:29 PM
Where do I go for IBD/IBS runs? I downloaded Beagle but never got to use it.

Are you planning on trying something with fastIBD?

everest59
02-27-2014, 05:44 PM
Are you planning on trying something with fastIBD?

Definitely will try something.

Sein
02-27-2014, 05:50 PM
Definitely will try something.

Great! This should prove very interesting.

Dr_McNinja
02-27-2014, 05:59 PM
Another thing I wonder about, do these chromosome matches necessarily have to be European? Keep in mind that Europeans themselves have a very ancient near eastern admixture. So the correlation may be with West Eurasian as opposed to exclusively European. Gedmatch however has a real lack of non Europeans.

That's why I filtered out by a minimum of 70% NE-Euro+Mediterranean. The people it was filtering out were all European too but had higher Caucasian admixture, for example. In my case the Mediterranean is representing admixture resembling Atlantic components, thus the repeated Basque I keep getting in the admixture runs. I don't know if West European signals are present in other West Eurasian populations like that; it's usually proportionate to the NE-Euro involved and my admixture is lopsided the other way.

Dr_McNinja
02-27-2014, 06:10 PM
Btw everest do you know where we can get the 1000genomes dataset?

Dr_McNinja
02-27-2014, 06:27 PM
The problem with making a dataset for supervised runs is what population(s) to use for South Asia and India. That's why I figured Onge would be a good fit, but I haven't been able to get the Reich dataset (I asked Dr. Reich, he asked another person on their team, and that person didn't respond). But that's a limited number of SNPs anyway. Is there any Onge-like population in HGDP, Hapmap, or one of the other datasets from studies?

everest59
02-27-2014, 06:28 PM
Not sure. I thought hapmap was 1000 genomes. Some of the populations they list like GIH are found in the Hapmap data set. Are you looking for the Punjabi data set? I think it's not available.

everest59
02-27-2014, 06:30 PM
The problem with making a dataset for supervised runs is what population(s) to use for South Asia and India. That's why I figured Onge would be a good fit, but I haven't been able to get the Reich dataset (I asked Dr. Reich, he asked another person on their team, and that person didn't respond). But that's a limited number of SNPs anyway. Is there any Onge-like population in HGDP, Hapmap, or one of the other datasets from studies?

You can replace Onge with Dai. It works well.
Try emailing that person again. She did not respond the first time I emailed her.

Dr_McNinja
02-27-2014, 06:30 PM
Not sure. I thought hapmap was 1000 genomes. Some of the populations they list like GIH are found in the Hapmap data set. Are you looking for the Punjabi data set? I think it's not available.

I was looking at the Eurogenes .fam files and was wondering what dataset individuals whose IDs were prefixed with "HG" were from. Like:


South_Chinese South_Han_Chinese_HG00403 0 0 0 -9
South_Chinese South_Han_Chinese_HG00404 0 0 0 -9
South_Chinese South_Han_Chinese_HG00406 0 0 0 -9
South_Chinese South_Han_Chinese_HG00407 0 0 0 -9
South_Chinese South_Han_Chinese_HG00419 0 0 0 -9
South_Chinese South_Han_Chinese_HG00421 0 0 0 -9
South_Chinese South_Han_Chinese_HG00422 0 0 0 -9
South_Chinese South_Han_Chinese_HG00427 0 0 0 -9
South_Chinese South_Han_Chinese_HG00428 0 0 0 -9
South_Chinese South_Han_Chinese_HG00436 0 0 0 -9
South_Chinese South_Han_Chinese_HG00437 0 0 0 -9
South_Chinese South_Han_Chinese_HG00442 0 0 0 -9

everest59
02-27-2014, 06:37 PM
That's not HGDP? HGDP seems to have two different types of Han. One says Han Chinese. The other says Han North China.

Dr_McNinja
02-27-2014, 06:42 PM
That's not HGDP? HGDP seems to have two different types of Han. One says Han Chinese. The other says Han North China.

These are throughout the Eurogenes file:

https://docs.google.com/file/d/0B9o3EYTdM8lQUlVTRmxQdGpZdW8/edit?pli=1

I don't think they're HGDP, those are marked separately.
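
A quick way to check how the IDs are marked is to tally them by prefix. This is a minimal sketch, assuming six-column PLINK .fam lines like the ones quoted earlier (family ID, individual ID, father, mother, sex, phenotype) and assuming the sample ID is whatever follows the last underscore in the second column. The three lines are copied from the listing above.

```python
from collections import Counter

FAM_LINES = """\
South_Chinese South_Han_Chinese_HG00403 0 0 0 -9
South_Chinese South_Han_Chinese_HG00404 0 0 0 -9
South_Chinese South_Han_Chinese_HG00406 0 0 0 -9
"""

def sample_id(fam_line):
    """Return the trailing sample ID from the individual-ID column of a .fam line."""
    family_id, individual_id, father, mother, sex, phenotype = fam_line.split()
    return individual_id.rsplit("_", 1)[-1]

def id_prefix(sample):
    """Bucket an ID by prefix; test 'HGDP' first because it also starts with 'HG'."""
    if sample.startswith("HGDP"):
        return "HGDP"
    if sample.startswith("HG"):
        return "HG"
    return "other"

counts = Counter(id_prefix(sample_id(line)) for line in FAM_LINES.splitlines())
print(counts)
```

Run over the whole .fam file, a tally like this would show at a glance how many samples carry each kind of ID.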

everest59
02-27-2014, 06:56 PM
Check the 1000genomes FTP. Let me know if you find anything.

Dr_McNinja
02-27-2014, 06:56 PM
Those are 1000genomes apparently and their website has tools to download:

http://browser.1000genomes.org/Homo_sapiens/UserData/Haploview?db=core;panelurl=ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/phase1_integrated_calls.20101123.ALL.panel;region=;url=ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/ALL.chr.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz

But I can't figure out what to put for the 'region' field. I want all 23 chromosomes, not a particular region...

everest59
02-27-2014, 07:06 PM
I can't check right now since I'm at work.
Did you download plink 2? It will allow you to convert files to plink format. Also, plink 2 has a command to convert 23andme data.

Dr_McNinja
02-28-2014, 01:16 AM
The problem is 1000genomes has data uploaded by chromosome and each file is more than a gig. They have tools to extract specific individuals or populations, but they require a chromosome region and will still give you one chromosome at a time. Is there any way to get multiple chromosomes in one file?

everest59
02-28-2014, 01:29 AM
How have they organized it? Does each chromosome data have multiple individuals? I say check to see if they have a VCF file. Download it and convert to .bed file using Plink2. Do this for every chromosome and merge them all to form a single file.
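
The convert-then-merge workflow can be sketched as a small script that only builds the command lines. This assumes the flag names of PLINK 1.9 (the build referred to as "Plink2" in this thread): `--vcf`, `--make-bed`, `--bfile`, and `--merge-list`. The file names are hypothetical, only the autosomes are covered, and the flags should be checked against the installed version before running anything.

```python
import shlex

AUTOSOMES = [str(n) for n in range(1, 23)]  # chromosomes 1..22 only in this sketch

def vcf_to_bed_command(chrom):
    """Command converting one chromosome's VCF to a .bed/.bim/.fam fileset."""
    vcf = f"chr{chrom}.vcf.gz"  # hypothetical per-chromosome file name
    return ["plink", "--vcf", vcf, "--make-bed", "--out", f"chr{chrom}"]

def merge_list_lines(chroms):
    """One 'bed bim fam' line per fileset to merge into the first one."""
    return [f"chr{c}.bed chr{c}.bim chr{c}.fam" for c in chroms]

def merge_command(first_chrom, merge_list="merge_list.txt", out="all_chromosomes"):
    """Command merging the listed filesets into the first chromosome's fileset."""
    return ["plink", "--bfile", f"chr{first_chrom}",
            "--merge-list", merge_list, "--make-bed", "--out", out]

commands = [vcf_to_bed_command(c) for c in AUTOSOMES]
commands.append(merge_command(AUTOSOMES[0]))
merge_list_text = "\n".join(merge_list_lines(AUTOSOMES[1:])) + "\n"

# Show the first two conversions and the final merge without executing them.
for cmd in commands[:2] + commands[-1:]:
    print(shlex.join(cmd))
```

The text in `merge_list_text` would be saved as `merge_list.txt` before the merge step. Because every per-chromosome file holds the same individuals, the merge stacks the SNPs of all chromosomes under each person.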

Dr_McNinja
02-28-2014, 09:53 AM
Did another run (same individuals) with Karitiana, Papuan, Tuscan, Pygmy, and San added:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDRpaVM5dkJCOWlvSkN6a2hmSlZ6WVE#gid=0

Not a whole lot of difference from the previous run. But in k=4 and k=5, Papuan acted as a substitute for a South Indian putting me at 35.x% and HRP0341 @ 30% and the other Jatt @ 32%, close to our Harappaworld S-Indian numbers which was interesting. Then it split off into its own component after that. Everyone else loses about 5% S-Indian after a certain point which goes into their Baloch except mine.

Dr_McNinja
02-28-2014, 09:55 AM
How have they organized it? Does each chromosome data have multiple individuals? I say check to see if they have a VCF file. Download it and convert to .bed file using Plink2. Do this for every chromosome and merge them all to form a single file.

How does that work? It just combines the chromosomes into individuals?

I can't download their original data unfortunately. Each chromosome file is several gigs and I'm on a 4Mb/s (~500KB/s) connection. Their online tools to download selective parts don't work either.

everest59
02-28-2014, 01:22 PM
Yeah, it will combine all the chromosomes. Even Zack had trouble downloading those files. I think he managed to get a code from Dienekes.

Dr_McNinja
02-28-2014, 06:04 PM
Hm, what would you do for Gedrosian/Baloch in supervised mode? It only peaks at ~50-60% in Balochis, so leave that one out in supervised runs? Is there any good proxy?

Sein
02-28-2014, 06:42 PM
Hm, what would you do for Gedrosian/Baloch in supervised mode? It only peaks at ~50-60% in Balochis, so leave that one out in supervised runs? Is there any good proxy?

I think a component peaks among the Baloch at certain K because they are much more homogenous (and closely related) in comparison to the other HGDP Pakistani samples. Zack made a comment on this in his Eurasian ChromoPainter run. I think the Baloch vs Caucasian difference is just a breakdown of the earlier Caucasian/Pakistani component, which has a bimodal distribution, peaking in the Caucasus and southwestern Pakistan.

Dr_McNinja
03-01-2014, 05:00 PM
Just messing around in supervised mode:


HRP0349:
6.2% East Asian
82.3% Southwest Asian
3.7% Papuan
4.7% European
3.1% Southeast Asian
(87% West Eurasian)

HRP0341:
3.6% East Asian
79.3% Southwest Asian
1.7% Papuan
12.5% European
2.8% Southeast Asian
(92% West Eurasian)

HRP0402:
3.1% Caucasus
3.6% East Asian
81.2% Southwest Asian
2.9% Papuan
6.8% European
1.8% Southeast Asian
(91% West Eurasian)

HRP0282:
24.3% Caucasus
5.3% East Asian
61.4% Southwest Asian
2.2% Papuan
3.0% European
3.8% Southeast Asian
(89% West Eurasian)

HRP0370:
34.6% Caucasus
2.6% East Asian
51.8% Southwest Asian
9.9% European
1% Southeast Asian
(96% West Eurasian)

HRP0315:
10.7% East Asian
75.5% Southwest Asian
3.8% Papuan
8.6% European
1.4% Southeast Asian
(84% West Eurasian)

soulblighter:
11.1% East Asian
83.1% Southwest Asian
5.5% Papuan
0.2% Southeast Asian
(83% West Eurasian)

KhetranBaloch:
18.7% Caucasus
0.5% East Asian
74% Southwest Asian
1.7% Papuan
1.7% European
3.4% Southeast Asian
(94% West Eurasian)
Southwest Asian is Baloch, Brahui, and Bedouin.

European is French, Orcadian, Basque, North Sea, North Atlantic, and Uralic Siberian (Yukagir, Nganassan, and Selkup).

Southeast Asian is Dai and East Asian is Japanese.

I decided to mix Western European with Uralic since I noticed HRP0341 kept showing affinity to Northeastern European which seemed like a mix of European and Uralic. My European component seems allergic to Uralic. I figure that's because all my East Eurasian-like admixture is caught up in other components. In almost every other normal run, I'd be right behind HRP0341 in European (usually 11% and 10%) and HRP0402 (the Gondal Jatt) would be a little further back by roughly 2 or 3%.

HRP0282 also has high East Eurasian. I think something is also going on there with the ancestral South Asian admixture mixing up with the East Eurasian-via-Central Asian admixture present to some degree in most Afghans. I think his European admixture is also kind of opposed to the Uralic element I injected into the European component.
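
The bracketed "West Eurasian" totals in the list above appear to be the sum of the Caucasus, Southwest Asian, and European components. That grouping is inferred from the numbers, not stated in the post. A small sketch that reproduces three of the totals from the figures quoted above:

```python
# Components assumed to make up "West Eurasian" (inferred, not stated in the post).
WEST_EURASIAN = ("Caucasus", "Southwest Asian", "European")

results = {
    "HRP0349": {"East Asian": 6.2, "Southwest Asian": 82.3, "Papuan": 3.7,
                "European": 4.7, "Southeast Asian": 3.1},
    "HRP0341": {"East Asian": 3.6, "Southwest Asian": 79.3, "Papuan": 1.7,
                "European": 12.5, "Southeast Asian": 2.8},
    "HRP0370": {"Caucasus": 34.6, "East Asian": 2.6, "Southwest Asian": 51.8,
                "European": 9.9, "Southeast Asian": 1.0},
}

def west_eurasian_total(components):
    """Sum the components counted as West Eurasian; missing ones count as 0."""
    return sum(components.get(name, 0.0) for name in WEST_EURASIAN)

for sample, components in results.items():
    print(sample, round(west_eurasian_total(components)))
```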

Dr_McNinja
03-01-2014, 05:18 PM
That was at ~437k SNPs, btw. It seems like admixture is handy for comparing between individuals so long as you know the quirks of the software, but it is rather useless for measuring actual ancestral signals. I think it's marginally more useful than basic East vs. West Eurasian distinction. However, I don't want to rush to judgment. It's not all just numbers. Most of my IBD/IBS matches on 23andMe and GEDmatch are West/Northwest European leaning and I've noticed a relative lack of East/Northeast European matches compared to other Punjabis, so there is some correlation to the segment matches. The side effect is that I have relatively lower number of matches overall (and my mother even less) than the other Punjabis I've checked. I even have fewer Indian/South Indian matches than the other Punjabis who are 30% or below in Harappa S-Indian (they usually have a few, I had like a handful). I think it's because there's all this mixup of East-Eurasian-like admixture and West Eurasian-like admixture which isn't genuinely ASI when put together but is being picked up as such by admixture. It no longer resembles North Eurasian/NE-Euro.

This could suggest that the NE-Euro links are real, and not an artifact of East Eurasian bleeding off of ASI. They might be compounded by resemblance to ASI though. But either way, it could still be a product of Caucasian-type West Eurasian mixing with Uralic-like East Eurasian bleedoff from ASI. So I don't know if just software analysis will be enough to settle the debate on whether the European admixture is just by chance due to recombination over many generations (as Razib Khan suggested I believe) or actual ancestral signals. I think haplogroup study and more research into ancient DNA remains is needed. Wasn't Malta like a cross between Siberian, European, and South Asian? Or was he only clustering near South Asians on PCA plots?

Dr_McNinja
03-01-2014, 05:27 PM
The picture painted by all these East/West Eurasian connections seems to resemble the migration map of the Genographic project.

Dr_McNinja
03-01-2014, 06:40 PM
Here are the results with Uralic removed from European (leaving just French, Basque, Orcadian, North Sea, North Atlantic):


HRP0349:
6.8% East Asian
83.1% Southwest Asian
3.8% Papuan
2.9% European (-38%)
3.4% Southeast Asian
(86% West Eurasian)


HRP0341:
9.0% Caucasus
5.2% East Asian
75.9% Southwest Asian
1.8% Papuan
4.6% European (-63%)
3.4% Southeast Asian
(89.5% West Eurasian)

HRP0402:
11.5% Caucasus
4.2% East Asian
78.6% Southwest Asian
3.1% Papuan
0.5% European (-93%)
2.1% Southeast Asian
(90.6% West Eurasian)

HRP0282:
30.6% Caucasus
5.1% East Asian
58.2% Southwest Asian
2.3% Papuan
0.0% European (-100%)
3.8% Southeast Asian
(88.8% West Eurasian)

HRP0370:
39.7% Caucasus
3.2% East Asian
48.7% Southwest Asian
0.4% Papuan
6.9% European (-30%)
1.2% Southeast Asian
(95.3% West Eurasian)

HRP0315:
3.3% Caucasus
11.7% East Asian
74.9% Southwest Asian
3.9% Papuan
4.3% European (-50%)
1.9% Southeast Asian
(82.5% West Eurasian)

soulblighter:
11.0% East Asian
83.0% Southwest Asian
5.7% Papuan
0.3% Southeast Asian
(83% West Eurasian)

KhetranBaloch:
24.0% Caucasus
0.3% East Asian
70.4% Southwest Asian
1.9% Papuan
0.0% European (-100%)
3.3% Southeast Asian
(94.4% West Eurasian)

Unfortunately I don't have any Finnish samples. The next closest I have are Russian, Mordovian, Ukrainian, and these only peaked in NE-Euro at a highest of like 60 something % (Finnish was 80% I believe). But I ran it anyway and the European values went up a little in reverse of the above. Here were the only changes:

HRP0349: 3.2% European (+10%)

HRP0341: 7.9% Caucasian, 5.5% European (+20%)

HRP0402: 10.7% Caucasian, 1.0% European (+50%)

HRP0370: 39.3% Caucasian, 7.3% European (+5.8%)

HRP0315: 2.6% Caucasian, 4.9% European (+14%)

It would be interesting to know if this fake Euro (West Euro + Uralic) was higher than NE-Euro (Finnish) actually was. I think it would've been by a little bit, taking into account the difference between supervised (where it takes the population as 100%) and unsupervised (where Finnish peaked at 80%... meaning NE-Euro would've been even more admixed with East Eurasian).

I don't think NE-Euro should be used as a component in admixture for populations with significant East Eurasian admixture (i.e., other than Europeans). I know you can't control admixture's output in unsupervised mode, but steering it to a fully West Eurasian population (West European) and then separate Siberian components (Uralic, Altaic, Paleo, Amerindian) would be more accurate.

EDIT: I mean, we could keep NE-Euro and label it as something else ("Ancestral West Eurasian"), but it will still fluctuate wildly in some people.
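
The bracketed percentages in this post read as relative changes in the European component between runs, not percentage-point differences. That reading is an inference, but it reproduces the quoted figures. A minimal sketch using the European values with and without Uralic from the two result lists:

```python
def relative_change_percent(before, after):
    """Relative change from `before` to `after`, in percent of `before`."""
    return (after - before) / before * 100.0

# European component per sample, copied from the two runs above.
european_with_uralic = {"HRP0349": 4.7, "HRP0341": 12.5, "HRP0370": 9.9, "HRP0315": 8.6}
european_without_uralic = {"HRP0349": 2.9, "HRP0341": 4.6, "HRP0370": 6.9, "HRP0315": 4.3}

for sample, before in european_with_uralic.items():
    after = european_without_uralic[sample]
    print(sample, round(relative_change_percent(before, after)))
```

For samples with a very small European share, a rounding difference of 0.1 points moves the relative change a lot, so those entries are the least stable.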

Dr_McNinja
03-01-2014, 06:50 PM
Another point of interest: whenever I'd remove South Asian components in supervised mode, the South Asians would often jump completely into West Caucasian (Abkhazian) instead of Dai or Papuan. Only at low k would it go into SE-Asian. So it does seem the Indian component is more Caucasian than East Eurasian.

I was getting frustrated by how they'd jump first into Gedrosian when I took out S-Indian, and then into West Caucasian, so then I put in Bedouin but it wouldn't take the bait so I forced Bedouin and Gedrosian together to leave the Caucasian/European separate.

Dr_McNinja
03-01-2014, 07:01 PM
Compare the above numbers with 23andMe's current Ancestry Composition.

HRP0370:
39.7% Caucasus
3.2% East Asian
48.7% Southwest Asian
0.4% Papuan
6.9% European
1.2% Southeast Asian

Ancestry Comp (in Speculative mode):
51.7% South Asian
41.8% Middle Eastern (a.k.a. Caucasus)
3.8% European
0.5% East Asian

---

HRP0282:
30.6% Caucasus
5.1% East Asian
58.2% Southwest Asian
2.3% Papuan
0.0% European
3.8% Southeast Asian

Ancestry Composition (in Speculative):
78.6% South Asian
11.6% Middle Eastern (a.k.a. Caucasus)
4.0% European
1.7% East Asian
4.2% Unassigned

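Take the combined East Eurasian share from my run (East Asian + Papuan + Southeast Asian = 11.2%) minus 23andMe's East Asian (1.7%), which leaves 9.5%. Pull that out, pull out an equivalent amount of Caucasian, and add both to Southwest Asian (+19%):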

HRP0282:
21.1% Caucasus
1.7% East Asian
77.2% Southwest Asian
0.0% European

So with this method his 'Unassigned' is probably all Caucasian or European, they can't decide. If South Asian is more West Eurasian than East Eurasian, then the unassigned could be East Eurasian.
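
The adjustment for HRP0282 can be checked step by step. This sketch assumes that "East Asian from my run" means the three East Eurasian components added together (East Asian + Papuan + Southeast Asian), which is the reading that reproduces the 9.5% figure.

```python
my_run = {"Caucasus": 30.6, "East Asian": 5.1, "Southwest Asian": 58.2,
          "Papuan": 2.3, "European": 0.0, "Southeast Asian": 3.8}
ancestry_comp_east_asian = 1.7  # 23andMe Ancestry Composition, Speculative mode

# Combined East Eurasian share in the supervised run.
east_eurasian = my_run["East Asian"] + my_run["Papuan"] + my_run["Southeast Asian"]

# Part of it that 23andMe does not report as East Asian.
excess = east_eurasian - ancestry_comp_east_asian

adjusted = {
    "Caucasus": my_run["Caucasus"] - excess,                    # equal amount removed
    "East Asian": ancestry_comp_east_asian,
    "Southwest Asian": my_run["Southwest Asian"] + 2 * excess,  # both amounts added
    "European": my_run["European"],
}

for name, value in adjusted.items():
    print(name, round(value, 1))
```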

Dr_McNinja
03-01-2014, 08:08 PM
plink clustering:


Genotyping rate of 98-99% with ~330k SNPs (mix of v2, v3, v4, and FTDNA Family Finder):

HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
KhetranBaloch 2 2 2 2 2 2 2 2 2 2 2 2 2 1 0 0 0 0 0 0
HRP0349 3 3 3 3 3 3 3 1 1 1 1 1 1 0 0 0 0 0 0 0
HRP0341 4 4 3 3 3 3 3 1 1 1 1 1 1 0 0 0 0 0 0 0
HRP0370 5 5 4 4 4 4 4 3 3 3 0 0 0 0 0 0 0 0 0 0
SouthAsian 6 6 5 5 5 5 5 4 4 4 3 2 2 1 0 0 0 0 0 0
Kurd 7 7 6 6 6 6 6 5 5 5 4 3 3 2 1 1 1 1 1 0
AfghanTajik 8 8 7 7 7 7 7 6 6 6 5 4 4 3 2 2 2 0 0 0
HRP0402 9 9 8 8 8 8 8 7 7 7 6 5 5 4 3 3 3 2 0 0
Turk1 10 10 9 9 9 9 9 8 5 5 4 3 3 2 1 1 1 1 1 0
NEEuro1 11 11 10 10 10 10 10 9 8 8 7 6 3 2 1 1 1 1 1 0
Italian1 12 12 11 11 9 9 9 8 5 5 4 3 3 2 1 1 1 1 1 0
Kaundinya 13 13 12 12 11 11 11 10 9 9 8 7 6 5 4 3 3 2 0 0
Turk2 14 14 13 6 6 6 6 5 5 5 4 3 3 2 1 1 1 1 1 0
Zeph 15 15 14 13 12 12 6 5 5 5 4 3 3 2 1 1 1 1 1 0
soublighter 16 16 15 14 13 13 12 11 10 9 8 7 6 5 4 3 3 2 0 0
karachay 17 17 16 15 14 10 10 9 8 8 7 6 3 2 1 1 1 1 1 0
DMXX 18 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 1 1 1 0
SouthAsian2 19 16 15 14 13 13 12 11 10 9 8 7 6 5 4 3 3 2 0 0

Genotyping rate of 93-94% with ~820k SNPs (for some reason it jumps from 330k to 820k when I move up incrementally in the frequency pruning setting)

HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
KhetranBaloch 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 0 0 0 0 0
HRP0349 3 3 3 3 3 3 3 3 1 1 1 1 1 0 0 0 0 0 0 0
HRP0341 4 4 4 3 3 3 3 3 1 1 1 1 1 0 0 0 0 0 0 0
HRP0370 5 5 5 4 4 4 4 4 3 0 0 0 0 0 0 0 0 0 0 0
SouthAsian 6 6 6 5 5 5 5 5 4 3 3 3 3 2 0 0 0 0 0 0
Kurd 7 7 7 6 6 6 6 6 5 4 4 4 4 3 2 1 1 0 0 0
AfghanTajik 8 8 8 7 7 7 7 7 6 5 5 5 5 4 3 2 0 0 0 0
HRP0402 9 9 9 8 8 8 8 8 7 6 6 6 6 5 4 3 2 1 1 0
Turk1 10 10 10 9 9 9 9 9 8 7 4 4 4 3 2 1 1 0 0 0
NEEuro1 11 11 11 10 10 10 10 10 9 8 7 7 4 3 2 1 1 0 0 0
Italian1 12 12 12 11 11 9 9 9 8 7 4 4 4 3 2 1 1 0 0 0
Kaundinya 13 13 13 12 12 11 11 11 10 9 8 3 3 2 0 0 0 0 0 0
Turk2 14 14 14 13 6 6 6 6 5 4 4 4 4 3 2 1 1 0 0 0
Zeph 15 15 15 14 13 12 12 9 8 7 4 4 4 3 2 1 1 0 0 0
soublighter 16 16 16 15 14 13 3 3 1 1 1 1 1 0 0 0 0 0 0 0
karachay 17 17 11 10 10 10 10 10 9 8 7 7 4 3 2 1 1 0 0 0
DMXX 18 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
SouthAsian2 19 16 16 15 14 13 3 3 1 1 1 1 1 0 0 0 0 0 0 0

I'm going to add a dozen 23andMe v3 European samples tomorrow, about 5 or 6 of which are some kind of Eastern European (NE-Euro is low, Caucasus is high). I was downloading them from that link one of you posted a few days back of a public database where people were uploading their samples. I got "SouthAsian2" from there, I don't know who it is. I'll post a cluster file of the exclusively v3 ~900k SNP dataset as well.

DMXX
03-01-2014, 11:01 PM
plink clustering:

I'm going to add a dozen 23andMe v3 European samples tomorrow, about 5 or 6 of which are some kind of Eastern European (NE-Euro is low, Caucasus is high). I was downloading them from that link one of you posted a few days back of a public database where people were uploading their samples. I got "SouthAsian2" from there, I don't know who it is. I'll post a cluster file of the exclusively v3 ~900k SNP dataset as well.

How would you advise us to interpret these results? I seem to have almost every number in all the lists (0-18).

Dr_McNinja
03-02-2014, 03:40 AM
How would you advise us to interpret these results? I seem to have almost every number in all the lists (0-18).

I hope MfA can make a nice visual out of that like he did last time, I don't know how to do it.
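
One way to read the table, assuming each column is one step of PLINK's agglomerative clustering and that equal numbers within a column mean "same cluster at that step": the earlier two people first share a number, the closer they are. A sketch using four rows copied from the ~820k SNP table:

```python
TABLE = """\
HRP0349 3 3 3 3 3 3 3 3 1 1 1 1 1 0 0 0 0 0 0 0
HRP0341 4 4 4 3 3 3 3 3 1 1 1 1 1 0 0 0 0 0 0 0
HRP0402 9 9 9 8 8 8 8 8 7 6 6 6 6 5 4 3 2 1 1 0
DMXX 18 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
"""

def parse(table):
    """Map each sample name to its list of cluster labels, one per step."""
    rows = {}
    for line in table.splitlines():
        name, *labels = line.split()
        rows[name] = [int(label) for label in labels]
    return rows

def first_shared_step(rows, a, b):
    """Index of the first column where a and b carry the same cluster label."""
    for step, (label_a, label_b) in enumerate(zip(rows[a], rows[b])):
        if label_a == label_b:
            return step
    return None

rows = parse(TABLE)
for a, b in (("HRP0349", "HRP0341"), ("DMXX", "HRP0402"), ("DMXX", "HRP0349")):
    print(a, b, first_shared_step(rows, a, b))
```

Under that reading, a row that keeps a distinct number until the last few columns (like DMXX) belongs to someone who stays on their own branch until late in the clustering, and the sample they join first is their nearest neighbour in this dataset.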

MfA
03-02-2014, 11:14 AM
plink clustering:

I'm going to add a dozen 23andMe v3 European samples tomorrow, about 5 or 6 of which are some kind of Eastern European (NE-Euro is low, Caucasus is high). I was downloading them from that link one of you posted a few days back of a public database where people were uploading their samples. I got "SouthAsian2" from there, I don't know who it is. I'll post a cluster file of the exclusively v3 ~900k SNP dataset as well.

~820k
http://abload.de/img/dendro2_800kkrolh.png

DMXX
03-02-2014, 11:56 AM
~820k
http://abload.de/img/dendro2_800kkrolh.png

Interesting... I've just learned my closest pairing (HRP0402) is a Punjabi Jatt (https://docs.google.com/spreadsheet/ccc?key=0AuW3R0Ys-P4HdGE4eDh6emt1dUs2U2pXTkVjS0lsV1E&hl=en#gid=0).

Hanna
03-02-2014, 12:01 PM
~820k
http://abload.de/img/dendro2_800kkrolh.png

Are you the Kurd? If so, I am close to you.

MfA
03-02-2014, 12:11 PM
Interesting... I've just learned my closest pairing (HRP0402) is a Punjabi Jatt (https://docs.google.com/spreadsheet/ccc?key=0AuW3R0Ys-P4HdGE4eDh6emt1dUs2U2pXTkVjS0lsV1E&hl=en#gid=0).

You are still quite distinct from him. Karachay and Swedish are really weird: they are very close and seem to have diverged recently, and they are more similar to each other than Hanna is to me. Zeph-Turk1-Italian is a typical Mediterranean cline.

everest59
03-02-2014, 02:15 PM
~820k
http://abload.de/img/dendro2_800kkrolh.png

Are you using R to create these dendrograms? Or is there a software that can be downloaded?

Thanks.

Sein
03-02-2014, 02:49 PM
I'm clustering with HRP0370, an Afghan Pashtun, so my results seem reasonable.

MfA
03-02-2014, 04:30 PM
Are you using R to create these dendrograms? Or is there a software that can be downloaded?

Thanks.

Didn't use any tool other than Excel, typing it in manually; it took nearly half an hour to complete. No idea how to handle the unusual input format of Dr. McNinja.

everest59
03-02-2014, 10:50 PM
I just created a dendrogram using 10 dimensions of PCA. Around 28k SNPs.

https://drive.google.com/file/d/0B3vEDdpZDjUpRTkxeEd0aWtkZjQ/edit?usp=sharing

Also, fam file to go with the dendrogram above:
https://drive.google.com/file/d/0B3vEDdpZDjUpYzBCbFZUWjhNWXc/edit?usp=sharing

More than 1700 samples. Pdf format, so you can search yourself.

I say download it. Looks very tiny on google.

Humanist
03-02-2014, 11:10 PM
Thanks, everest59.

The positions of most of the West Asian samples from the forum:

Took your suggestion and downloaded the file. Zephyrous and I cluster with a group that consists almost entirely of Eastern Mizrahim. Which is expected.

http://i1096.photobucket.com/albums/g326/dok101/Faces/human_zeph.jpg


DMXX, NK19, and MfA cluster with a group that is comprised almost exclusively of Iranians and Kurds:

http://i1096.photobucket.com/albums/g326/dok101/Faces/dmxx_mfa_nk19.jpg

Sein
03-02-2014, 11:10 PM
Thanks everest!

For some reason though, the PDF is blank?

everest59
03-02-2014, 11:13 PM
Thanks everest!

For some reason though, the PDF is blank?

It seems fine to me. Did you download it?
Maybe Humanist can help here.

Sein
03-02-2014, 11:18 PM
It seems fine to me. Did you download it?
Maybe Humanist can help here.

I downloaded it, but still blank.

I'd really appreciate any assistance Humanist!

MfA
03-02-2014, 11:19 PM
For some reason Acrobat Reader fails to load the PDF in full, hence the blank screen. At first I thought my aging HDD was the problem, but moving to an SSD didn't help either. I had to download a third-party tool on PC (http://pdf-xchange-viewer.en.softonic.com/); on Android there was no problem.

http://abload.de/img/dendro7vjkg.png

I believe tur182 is of Kurdish origin; his admixture results and high IBD sharing with Kurds make that evident. I commented on him in this (http://corduene.blogspot.com.tr/2013/09/eurogenes-id-kurd13s-ibd-matches.html) post earlier. He could be a Turkish-speaking Alevi, though they are identical to northern Kurds, especially the ones from Erzincan, Erzurum, and probably Sivas and Malatya too. Seeing arm9 and arm4 on the same branch is no surprise either: their admixture results are very similar to northern Kurds, but unlike with Turkish-speaking Alevis, I haven't noticed significant IBD matches with them.

everest59
03-02-2014, 11:28 PM
For some reason Acrobat Reader fails to load the PDF in full, hence the blank screen. At first I thought my aging HDD was the problem, but moving to an SSD didn't help either. I had to download a third-party tool on PC (http://pdf-xchange-viewer.en.softonic.com/); on Android there was no problem.

http://abload.de/img/dendro7vjkg.png

It was loading on my ubuntu reader no problem.
Sein, try it with this software.

Sein
03-02-2014, 11:40 PM
It was loading on my ubuntu reader no problem.
Sein, try it with this software.

It worked!

I'm clustering with HGDP0226.

everest59
03-02-2014, 11:43 PM
It worked!

I'm clustering with HGDP0226.

Afghan Pashtuns aren't too far either. Looking at 226's harappa numbers, they are very similar to yours.

Sein
03-02-2014, 11:51 PM
Afghan Pashtuns aren't too far either. Looking at 226's harappa numbers, they are very similar to yours.

I really like this, it seems very accurate. And nearly all of the Pashtuns are in a pretty compact set of close clusters.

Humanist
03-02-2014, 11:56 PM
I downloaded it, but still blank.

I'd really appreciate any assistance Humanist!

Try downloading it from here (https://app.box.com/s/k3omngzqbuzqkrcu6edz).


It worked!

I'm clustering with HGDP0226.

Strike my post. :)

everest59
03-02-2014, 11:56 PM
I really like this, it seems very accurate. And nearly all of the Pashtuns are in a pretty compact set of close clusters.

I used a program called Hclust, which comes standard with R. I basically ran PCA first, then ran Hclust on the PCA file.
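
The PCA-then-cluster idea can be illustrated without R. What follows is a pure-Python stand-in, not the R code used for the PDF: it applies complete linkage (the default method of R's `hclust`) to Euclidean distances between PCA coordinates. The sample names and the two-dimensional coordinates are hypothetical; the real run used 10 PCA dimensions.

```python
from itertools import combinations
from math import dist

def complete_linkage(points):
    """Agglomerative clustering with complete linkage.

    points: dict mapping sample name -> tuple of PCA coordinates.
    Returns the merges in order as (members_a, members_b, height).
    """
    clusters = [frozenset([name]) for name in points]
    merges = []

    def linkage(pair):
        # Complete linkage: distance between the two farthest members.
        a, b = pair
        return max(dist(points[x], points[y]) for x in a for y in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(a), sorted(b), linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Hypothetical PCA coordinates for four samples.
pca = {"S1": (0.0, 0.0), "S2": (0.1, 0.0), "S3": (1.0, 1.0), "S4": (1.0, 1.2)}

merges = complete_linkage(pca)
for members_a, members_b, height in merges:
    print(members_a, "+", members_b, "at height", round(height, 3))
```

The merge order and heights are what a dendrogram draws: samples joined at a low height sit on neighbouring leaves, and the last merge is the root.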

Sapporo
03-03-2014, 12:50 AM
Okay, I'm not directly clustering with anyone but pretty close to HGDP00222, HGDP00187 and not too far off from HGDP00197 and HGDP00201. All Sindhis with higher than typical NE Euro and less Baloch other than HGDP00187. Reasonable. Although, HGDP00222 is a slightly South Asian shifted HGDP Pashtun.

http://www.harappadna.org/tag/sindhi/

http://www.harappadna.org/2013/07/pathan-pashtun-admixture-results/

Sein
03-03-2014, 01:01 AM
Okay, I'm not directly clustering with anyone but pretty close to HGDP00222, HGDP00187 and not too far off from HGDP00197 and HGDP00201. All Sindhis with higher than typical NE Euro and less Baloch other than HGDP00187. Reasonable. Although, HGDP00222 is a slightly South Asian shifted HGDP Pashtun.

http://www.harappadna.org/tag/sindhi/

http://www.harappadna.org/2013/07/pathan-pashtun-admixture-results/

The cool thing about you clustering with HGDP00222, your new admixture results at K=16 for Dr_McNinja's run are extremely similar to HGDP00222. You both have the same amount of "South Indian".

Sapporo
03-03-2014, 01:05 AM
The cool thing about you clustering with HGDP00222, your new admixture results at K=16 for Dr_McNinja's run are extremely similar to HGDP00222. You both have the same amount of "South Indian".

Fair point. My South Indian and Baloch are extremely similar and my NE Euro is slightly higher while Caucasian is clearly much lower.

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDRpaVM5dkJCOWlvSkN6a2hmSlZ6WVE#gid=10

Actually, I think K14 is most comparable since HAP uses K14 as well.

Baloch: 45.93%
South Indian: 25.34%
Euro 2: 13.57%
Caucasus 10.20%
Greenland: 1.79%
S Siberian: 1.4%
Yukagir: 1.33%
Beringian: 0.27%

Sein
03-03-2014, 01:15 AM
Fair point. My South Indian and Baloch are extremely similar and my NE Euro is slightly higher while Caucasian is clearly much lower.

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdDRpaVM5dkJCOWlvSkN6a2hmSlZ6WVE#gid=10

Actually, I think K14 is most comparable since HAP uses K14 as well.

Baloch: 45.93%
South Indian: 25.34%
Euro 2: 13.57%
Caucasus 10.20%
Greenland: 1.79%
S Siberian: 1.4%
Yukagir: 1.33%
Beringian: 0.27%

HAP is actually K=16, but I'm sure your K=14 results are also similar. In my view, there is definitely going to be overlap between some of the less South Asian-shifted Punjabi Jatts, and the more South Asian-shifted Pashtuns, and your results reflect that. The only difference would be European vs Caucasian.

Sapporo
03-03-2014, 01:19 AM
Oh my mistake. Not sure why I thought K14. Let me look at McNinja's slightly older run.

https://docs.google.com/spreadsheets/d/1SGQlslPoSxQEf70d6DIEyaUXv3aqjU63pdnTuDDdlEI/edit#gid=1378887197

In this one, K13 and K15 make sense of why I'm close to HGDP00222.

soulblighter
03-03-2014, 02:09 PM
Here are the results with Uralic removed from European (leaving just French, Basque, Orcadian, North Sea, North Atlantic):
soulblighter:
11.0% East Asian
83.0% Southwest Asian
5.7% Papuan
0.3% Southeast Asian
(83% West Eurasian)


Thanks Dr_McNinja! I guess I ended up with the most East Eurasian of your current samples, which is not surprising. Although I wonder why I have a lot of East Asian (Japanese) and no SE Asian. I would have assumed that, being from South India, I would be more closely related to SE Asians than to the Japanese. The Papuan admixture probably indicates ASI here? As I am not on your 'K' runs, I do not know how my components split up with each K, so I can't make more sense of these results.
Also from the clustering results, it appears "SouthAsian2" is from South/West/East India (rather than the north).

Dr_McNinja
03-03-2014, 04:13 PM
'SouthAsian2' was getting results similar to yours across the board, that's why I didn't mention them specifically in that last batch. That's probably also why they clustered close to you.

I would assume East Asian represents admixture that no longer resembles SE-Asian/Papuan as closely due to many recombination events. I think we have a higher proportion of SE-Asian in NW India for the same reason we have higher proportions of NE-Euro.

Speaking of which, I wonder how East Eurasian admixed ANI was, because that would also explain a lot. All of these West Eurasian components used (one of my complaints) are East admixed to some degree, NE-Euro perhaps more so than any other (SW-Asian and Mediterranean probably least). I'm guessing ANI was significantly East Eurasian in admixture, but whether it was more or less than today's Central Asians is unknown. Perhaps someone who can do those ANI 'zombies' can try to measure that. I think there was a PCA plot of where ANI placed.

So I think we're all less European/Caucasian/etc and less 'ASI' than these results lead us to believe because we don't have an ANI component to compare with.

Dr_McNinja
03-03-2014, 04:26 PM
Here's the post comparing the components from K12b:

http://dienekes.blogspot.de/2012/08/inter-relationships-of-dodecad-k12b-and.html

It kind of makes sense of HRP0402 (Pakistani Jatt). His Euro component, while lower overall, is West-leaning like mine (even more so than mine according to my runs: where I would get more N/E Euro than West Euro, he'd get entirely West Euro), his Gedrosian is in the upper range, his Caucasian is above average, and his S-Indian is only marginally above the Jatt average. That's what would happen if you cut up the NE-Euro component into the proportions shown in those images and redistributed them accordingly (and wound up pushing some Gedrosian into Caucasus and Siberian, so you always end up with some amount of Arctic admixture). If you go in reverse, he starts to resemble the admixture profile of the other Punjabi Jatts. If you keep going (lowering S-Indian and Baloch even more) you get the Haryana Jatts.

Sein
03-03-2014, 06:52 PM
Dr_McNinja, what I find rather interesting about your newer Admixture runs is that all of us Pashtuns have more "Caucasian" than we do "South Indian". For example, I'm 21% "Caucasian", and 18% "South Indian" for your latest unsupervised K=17 run, the exact opposite of my HarappaWorld results. Now, if you'd add the HGDP Pashtuns, would they get similar results?

MfA
03-03-2014, 07:11 PM
Maybe that is because your Southwest Asian-related alleles in the K17 run probably merged with your Caucasus component.

Sein
03-03-2014, 07:15 PM
This is an excellent point! I didn't even think of this, the Caucasus component probably subsumes the Southwest Asian component.

Dr_McNinja
03-03-2014, 08:11 PM
Yup, I couldn't get the SW-Asian to pull out any admixture at all (until I merged it with Gedrosian).

I wouldn't compare the absolute numbers to Harappa, or between any calculator really. I was using very few reference populations so that's why the peak percentages were rather mercurial in my runs, especially in supervised mode. South Indian was coming in low for people lower than 30% on Harappa (I think Sapporo was hitting 20% in some) and high for people above that.

Another thing I couldn't figure out is why Armenian (South Caucasus) was dominating for HRP0370 whenever I included it whereas in Eurogenes K36, it barely took anything away from North Caucasus for the Pashtun.

everest59
03-03-2014, 10:21 PM
I just completed a zombie run using Dienekes' K12b calculator. I used 20 zombies per component. I'm sure I'd get a more accurate result if I used more zombie samples. First and foremost, a PCA plot:
https://drive.google.com/file/d/0B3vEDdpZDjUpaGtpRWxPYlhYanc/edit?usp=sharing

Second, a supervised run to test out my zombies. Let me know what you guys see:
https://docs.google.com/spreadsheet/ccc?key=0AnvEDdpZDjUpdEZmdVA2YTRES2pnNlR0WDU2ckRHa mc&usp=sharing

Third, a supervised run to test out what the "South Indian" component is made up of:
https://docs.google.com/spreadsheet/ccc?key=0AnvEDdpZDjUpdHlfeEt6cFFRcU1xV1h3aF9JdGQzQ 0E&usp=sharing

I know Dienekes got Caucasus + East Asian. I got Gedrosia plus East Asian.

I went to GEDMATCH and did a quick K12b to see what it produced:
Gedrosia 34.61%
Siberian 1.28%
Northwest_African -
Southeast_Asian 1.78%
Atlantic_Med 3.07%
North_European 13.11%
South_Asian 35.12%
East_African -
Southwest_Asian 0.40%
East_Asian 4.28%
Caucasus 6.34%
Sub_Saharan -

Okay, so the Caucasus got underestimated in my case.
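
For anyone following along who hasn't seen "zombies" before: as I understand the general idea, they are simulated individuals drawn from a component's allele frequencies (the kind of per-component frequencies ADMIXTURE writes to its .P file). Here is a minimal sketch of that idea, not the actual script used for this run. The function name `make_zombies` and the five frequencies are made up for illustration, and it assumes each SNP is drawn independently.

```python
import random

def make_zombies(allele_freqs, n_zombies=20, seed=0):
    """Simulate diploid genotypes (0, 1 or 2 copies of the counted allele)."""
    rng = random.Random(seed)
    zombies = []
    for _ in range(n_zombies):
        # Two independent draws per SNP, one for each chromosome copy.
        genotype = [(rng.random() < p) + (rng.random() < p) for p in allele_freqs]
        zombies.append(genotype)
    return zombies

# Hypothetical allele frequencies for one component at five SNPs.
gedrosia_freqs = [0.10, 0.55, 0.90, 0.00, 1.00]
zombies = make_zombies(gedrosia_freqs)
print(len(zombies), "zombies, first genotype:", zombies[0])
```

More zombies per component just means more draws from the same frequencies, so the simulated group's average sits closer to the component itself.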

Sein
03-03-2014, 10:50 PM
Thanks everest!

That PCA plot is very interesting, nice to see how we all position in relation to the components.

Sapporo
03-03-2014, 10:56 PM
My Dodecad K12b:

Admixture Results:

#Population Percent
1 Gedrosia 40.78
2 South_Asian 30.95
3 North_European 14.48
4 Caucasus 8.88
5 Southwest_Asian 2.32
6 Siberian 2.1
7 Atlantic_Med 0.41
8 East_African 0.1

Supervised run:

1 Gedrosia 40.7081
2 South_Asian 31.7128
3 North_European 16.323
4 Caucasus 6.2934
5 Southwest_Asian 2.1902
6 Atlantic_Med 1.7786
7 Siberian 0.9883
8 East_Asian 0.0016

MfA
03-03-2014, 11:18 PM
Supervised vs Gedmatch



PCA is quite interesting btw

everest59
03-03-2014, 11:31 PM
The nice thing is, I can now calculate "admixture" in ancient DNA samples very easily. For example, here's Mal'ta:

4.7% East African
26.3% Gedrosia
47.8% Northern European
0.3% Siberian
20% South Asian

It's similar to what I've seen before.
(The components I didn't list are at 0%).
The run probably took 3 minutes max. The only real hassle is merging the sample.

Sein
03-03-2014, 11:48 PM
Very interesting, he is about as South Asian as people from South Central Asia.

MfA
03-03-2014, 11:55 PM
He was around 15-20% Amerindian in the study, though. The lack of an Amerindian component in the K12b run is probably what throws it off.

DMXX
03-03-2014, 11:56 PM
My results compared.



Component Supervised K12b
Caucasus 37.3% 37.6%
Gedrosia 25.14% 26.5%
SW_Asian 12.29% 13%
Atl_Med 8.38% 6.3%
N_Euro 7.66% 7.7%
S_Asian 3.44% 3.9%
Siberian 3.4% 1.4%
E_Asian 2% 2.4%
SE_Asian 0.38% 0.8%
E_African 0% 0.3%
NW_African 0% 0.1%
Subsaharan 0% 0%
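
A quick way to put a number on how close the two columns above are is to take the per-component differences. This is just a sketch using the figures from the table; the variable names are my own.

```python
components = ["Caucasus", "Gedrosia", "SW_Asian", "Atl_Med", "N_Euro", "S_Asian",
              "Siberian", "E_Asian", "SE_Asian", "E_African", "NW_African", "Subsaharan"]
supervised = [37.3, 25.14, 12.29, 8.38, 7.66, 3.44, 3.4, 2.0, 0.38, 0.0, 0.0, 0.0]
k12b = [37.6, 26.5, 13.0, 6.3, 7.7, 3.9, 1.4, 2.4, 0.8, 0.3, 0.1, 0.0]

# Supervised minus K12b, in percentage points, for each component.
diffs = {name: round(s - k, 2) for name, s, k in zip(components, supervised, k12b)}

mean_abs_diff = sum(abs(d) for d in diffs.values()) / len(diffs)
largest = max(diffs, key=lambda name: abs(diffs[name]))

print(f"mean absolute difference: {mean_abs_diff:.2f} points")
print(f"largest shift: {largest} ({diffs[largest]:+.2f})")
```

The same few lines work for any pair of runs as long as the component labels are lined up first.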

everest59
03-04-2014, 12:02 AM
I just tried a supervised run on La Brana-1. He's turning out to be 100% Northern European.

everest59
03-04-2014, 12:31 AM
Another ancient DNA, and a final one. This file was provided to me by Generalissimo.

Australian Aborigine

1.1% Atlantic Med
2.8% Caucasus
13% East Asian
3% East African
6% Gedrosia
1.2% NE Euro
4.4% NW African
12.8% SE_Asian
1.8% Siberian
53% South Asian

Clovis (Native American):
28.7% East Asian
16.7% Gedrosia
20.1% Northern European (it seems to me that Gedrosia and Northern European are capturing ANE)
23% Siberian
11.3% South Asian
(Lack of a native american component basically)

Dinka:
54% East African
23.3% North African
3.4% South Asian
18% Sub-Saharan

Saqqaq:
19.4% East Asian
8% East African
2.5% Gedrosia
20% Northern European
34% Siberian
14.8% South Asian

Lots of noise, which is expected in ancient DNA samples. The aborigine sample seems to be captured mainly by South Asian.
And ANE is being captured by Northern European and Gedrosia IMO.
You also see some South Asian. Perhaps it's showing some sort of migratory pattern. I have no idea to be honest.

Sein
03-04-2014, 01:03 AM
Everest, could you try a supervised run with these ancient samples as "fixed" reference poles? I think it would be pretty informative. And I think these samples can't be fit reasonably within modern variation. They all have some South Asian admixture, which must be incorrect.

Thank you.

everest59
03-04-2014, 01:23 AM
Sure, I'll do it when I can. I think the main issue is lack of ancient South Asian and West Asian samples. The South Asian will probably be captured by a combination of Mal'ta and Native American/Siberian types.
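
For reference, ADMIXTURE's supervised mode reads a .pop file with one line per individual, in the same order as the .fam file: a population label for each fixed reference sample and "-" for everyone whose ancestry is to be estimated. Below is a rough sketch of building that file with the ancient samples as the poles. The sample IDs, the labels and the helper name `build_pop_lines` are placeholders, not the real dataset.

```python
def build_pop_lines(fam_lines, reference_labels):
    """Return one .pop line per .fam line: a label for references, '-' otherwise."""
    pop_lines = []
    for line in fam_lines:
        fields = line.split()
        individual_id = fields[1]  # second column of a PLINK .fam file
        pop_lines.append(reference_labels.get(individual_id, "-"))
    return pop_lines

# Hypothetical .fam contents: two ancient references and one project sample.
fam_lines = [
    "MA1 MA1 0 0 1 -9",
    "LaBrana LaBrana1 0 0 1 -9",
    "HRP HRP0349 0 0 1 -9",
]
reference_labels = {"MA1": "Malta", "LaBrana1": "LaBrana"}

pop_lines = build_pop_lines(fam_lines, reference_labels)
print("\n".join(pop_lines))
```

The lines would then be saved next to the .bed file with the same prefix (for example data.pop beside data.bed) and the run started with `admixture --supervised data.bed K`, with K set to the number of labelled reference populations.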

everest59
03-04-2014, 01:45 AM
I created a dendrogram out of the Zombies to see where they cluster. The Atlantic Med and the Caucasus are part of the same branch. Furthermore, the four West-Asian components are part of the same branch. Also, the South Asian is really not clustering with anything else. The North African and SW Asian have some relationship, as they are clustering together.

https://drive.google.com/file/d/0B3vEDdpZDjUpSlYzUWxmUDI0a2c/edit?usp=sharing

everest59
03-04-2014, 03:19 AM
Another dendrogram using first 10 dimensions of PCA, using participants and Zombies:

https://drive.google.com/file/d/0B3vEDdpZDjUpal9nbnFXa1dwbU0/edit?usp=sharing


That's it for now.
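
In case anyone wants to try a dendrogram like this on their own PCA output, here is a bare-bones sketch of average-linkage clustering on PCA coordinates. I don't know which linkage or software was actually used for the plot above, so treat this as one possible approach. The coordinates are invented and use only 2 dimensions for brevity; with real data each tuple would hold the first 10 PCs.

```python
import math
from itertools import combinations

def average_linkage(points):
    """Merge the two closest clusters repeatedly; return the merge steps in order."""
    clusters = {name: [name] for name in points}
    merges = []

    def dist(a, b):
        # Average Euclidean distance over all cross-cluster pairs of members.
        pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
        return sum(math.dist(points[x], points[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda ab: dist(*ab))
        d = dist(a, b)
        merges.append((sorted(clusters[a]), sorted(clusters[b]), round(d, 3)))
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
    return merges

# Invented coordinates (toy 2-D stand-ins for the first 10 PCs).
points = {
    "Gedrosia_zombie": (0.0, 0.0),
    "Pashtun": (1.0, 0.0),
    "Caucasus_zombie": (5.0, 0.0),
    "Iranian": (6.5, 0.0),
}

merges = average_linkage(points)
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

Each merge step is one join in the tree, and the distance is the height at which the two branches meet.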

Sein
03-04-2014, 03:30 AM
This is pretty interesting, all of us South-Central Asians and Northwestern South Asians constitute a clade with the "Gedrosians", and all of the West Asians constitute a clade with the "Caucasians". I find this very fascinating, because the Caucasus and Gedrosia components are a single component at lower Ks.

Also, the Southwest Asian+North African components are more diverged from other West Eurasian components, probably because of the African affinity.

everest59
03-04-2014, 03:35 AM
I feel that this "Baloch" component is a real component. Nobody is 100% "Baloch", unlike the "Kalash" component, which peaks at 100% in the population that is Kalash. The components that need to be doubted are those that peak at 100%.

Dr_McNinja
03-04-2014, 02:18 PM
Did some more runs with the old dataset.

East Asian, two African components, Papuan, Basque, Sardinian, SW Asian: https://docs.google.com/spreadsheets/d/1a6itnwHs2dPY2ouEXbm_QOOGa0eMJk69EEPCrAgLHUw/edit#gid=747615632

Then removed SW Asian: https://docs.google.com/spreadsheets/d/1k-2Njdz73v5lAo6H5xZ8es7PYLQT4KONe7rimYlExlI/edit#gid=1406931695

Noticed Papuan jumped for some of the West Eurasian individuals.

So I removed Papuan:

https://docs.google.com/spreadsheets/d/1wZugJ9HdnRXyM4Er3KJZ320z3wZn_zrtBJJBng6Ma_Q/edit#gid=1493305812

Then removed Sardinian or Basque (two sheets, one for Basque, one for Sardinian): https://docs.google.com/spreadsheets/d/12X03bTUBi9cty6j3o2TKFkGEXQBwmvOWLLKF8Yz46_g/edit#gid=792931946

Then the last, a run with just East Asian and either Basque or Sardinian (two sheets): https://docs.google.com/spreadsheets/d/1kh6HDIwaVegBEo7K3YIQGM7qV05zZd74O0rcuO9X49g/edit#gid=258432243

What happens with West African is pretty interesting. And at the end I'm pretty certain East Asian is now capturing some West Eurasian.

Dr_McNinja
03-04-2014, 04:21 PM
^ Shared the other two I forgot about

soulblighter
03-04-2014, 07:55 PM
This run is very interesting.
With the first spreadsheet, I have the highest Papuan. But when you removed SW Asian, in the second spreadsheet, I end up with minimal Papuan, and most of it gets assigned to Sardinian. The interesting thing is that some of my East Asian also gets assigned to Sardinian! (I would have guessed that the Papuan would have stayed intact or been absorbed into East Asian...)
So approx 8% goes from East Eurasian (Papuan + East Asian) to West Eurasian (Sardinian).
None of the others (i.e. participants of this forum) seem to have such a dramatic change between East and West Eurasian. I wonder what is going on... is it because of the FTDNA dataset?

Further investigation shows some populations losing 100% of their West Eurasian (Maya), while others gain 80+% West Eurasian (Dai).

Below is the data sorted by the change in West Eurasian between your first and second spreadsheets. There are some interesting (but expected) patterns. Along the Native American/Siberian/NE European/NW European arc, the loss of West Eurasian shrinks steadily (the only exception being the Bedouins). The Papuans/Japanese/East Asians seem to show no change, and the South Asians seem to be the only ones to gain West Eurasian (with the exception of the Dai).


ID Population West_Eurasian(with_SW_Asian) West_Eurasian(Without_SW_Asian) Change_in_west_eurasian
HGDP00856 Maya 0.00% 100.00% -100.00%
HGDP00857 Maya 0.00% 100.00% -100.00%
HGDP00865 Maya 0.00% 100.00% -100.00%
HGDP00873 Maya 0.00% 100.00% -100.00%
HGDP00995 Karitiana 0.00% 100.00% -100.00%
HGDP00998 Karitiana 0.00% 100.00% -100.00%
HGDP00999 Karitiana 0.00% 100.00% -100.00%
HGDP01001 Karitiana 0.00% 100.00% -100.00%
HGDP01003 Karitiana 0.00% 100.00% -100.00%
HGDP01006 Karitiana 0.00% 100.00% -100.00%
HGDP01009 Karitiana 0.00% 100.00% -100.00%
HGDP01010 Karitiana 0.00% 100.00% -100.00%
HGDP01012 Karitiana 0.00% 100.00% -100.00%
HGDP01013 Karitiana 0.00% 100.00% -100.00%
HGDP01014 Karitiana 0.00% 100.00% -100.00%
HGDP01015 Karitiana 0.00% 100.00% -100.00%
HGDP01018 Karitiana 0.00% 100.00% -100.00%
HGDP01019 Karitiana 0.00% 100.00% -100.00%
HGDP01037 Pima 0.00% 100.00% -100.00%
HGDP01041 Pima 0.00% 100.00% -100.00%
HGDP01043 Pima 0.00% 100.00% -100.00%
HGDP01044 Pima 0.00% 100.00% -100.00%
HGDP01047 Pima 0.00% 100.00% -100.00%
HGDP01050 Pima 0.00% 100.00% -100.00%
HGDP01051 Pima 0.00% 100.00% -100.00%
HGDP01053 Pima 0.00% 100.00% -100.00%
HGDP01055 Pima 0.00% 100.00% -100.00%
HGDP01056 Pima 0.00% 100.00% -100.00%
HGDP01057 Pima 0.00% 100.00% -100.00%
HGDP01058 Pima 0.00% 100.00% -100.00%
HGDP01059 Pima 0.00% 100.00% -100.00%
HGDP01060 Pima 0.00% 100.00% -100.00%
HGDP00864 Maya 0.00% 99.89% -99.89%
HGDP00859 Maya 0.66% 100.00% -99.34%
HGDP00855 Maya 0.00% 99.30% -99.30%
HGDP00870 Maya 1.14% 100.00% -98.86%
HGDP00872 Maya 1.31% 100.00% -98.69%
HGDP00858 Maya 2.31% 100.00% -97.69%
HGDP00863 Maya 1.55% 99.00% -97.45%
HGDP00854 Maya 4.07% 100.00% -95.93%
HGDP00875 Maya 3.63% 97.71% -94.08%
Bedouin 6.42% 100.00% -93.58%
Bedouin 6.76% 100.00% -93.24%
Bedouin 6.90% 100.00% -93.10%
Bedouin 7.00% 100.00% -93.00%
HGDP00862 Maya 6.56% 99.35% -92.79%
Bedouin 7.22% 100.00% -92.78%
Bedouin 7.48% 100.00% -92.52%
Bedouin 7.62% 100.00% -92.38%
Bedouin 7.62% 100.00% -92.38%
Bedouin 7.70% 100.00% -92.30%
Bedouin 8.20% 100.00% -91.80%
HGDP00877 Maya 7.95% 99.14% -91.19%
HGDP00868 Maya 8.66% 97.46% -88.80%
HGDP00861 Maya 10.74% 97.72% -86.98%
HGDP00869 Maya 13.11% 98.90% -85.79%
Bedouin 17.06% 100.00% -82.94%
HGDP00871 Maya 22.66% 96.98% -74.32%
HGDP00876 Maya 23.21% 96.32% -73.11%
HGDP00860 Maya 24.90% 93.12% -68.22%
eastGreenland7 EastGreenland 0.00% 57.91% -57.91%
eastGreenland3 EastGreenland 0.00% 57.62% -57.62%
eastGreenland17 EastGreenland 0.00% 57.08% -57.08%
eastGreenland1 EastGreenland 6.32% 61.73% -55.41%
eastGreenland10 EastGreenland 7.37% 62.62% -55.25%
westGreenland16 WestGreenland 6.97% 61.49% -54.52%
westGreenland11 WestGreenland 8.11% 60.52% -52.41%
eastGreenland14 EastGreenland 16.57% 66.10% -49.53%
westGreenland20 WestGreenland 17.25% 63.96% -46.71%
eastGreenland16 EastGreenland 20.79% 66.68% -45.89%
westGreenland1 WestGreenland 23.06% 66.29% -43.23%
chukchi119 Chukchi 0.00% 43.04% -43.04%
chukchi143 Chukchi 0.00% 40.46% -40.46%
westGreenland2 WestGreenland 29.87% 69.24% -39.37%
chukchi126 Chukchi 0.00% 38.59% -38.59%
chukchi128 Chukchi 0.00% 38.54% -38.54%
chukchi117 Chukchi 0.00% 38.48% -38.48%
chukchi116 Chukchi 0.00% 37.92% -37.92%
chukchi142 Chukchi 0.00% 37.68% -37.68%
chukchi123 Chukchi 0.00% 37.27% -37.27%
chukchi136 Chukchi 0.00% 36.71% -36.71%
chukchi130 Chukchi 0.00% 36.62% -36.62%
koryakE6 Koryak 0.00% 33.08% -33.08%
chukchi125 Chukchi 0.00% 32.99% -32.99%
westGreenland9 WestGreenland 45.41% 78.34% -32.93%
koryakE151 Koryak 0.00% 32.44% -32.44%
koryakE58 Koryak 0.00% 32.17% -32.17%
eastGreenland5 EastGreenland 47.26% 79.39% -32.13%
koryakE131 Koryak 0.00% 31.71% -31.71%
koryakE35 Koryak 0.00% 31.07% -31.07%
westGreenland5 WestGreenland 46.79% 77.31% -30.52%
koryakE41 Koryak 0.00% 30.31% -30.31%
koryakE67 Koryak 0.00% 30.17% -30.17%
westGreenland6 WestGreenland 47.01% 77.16% -30.15%
koryakE5 Koryak 0.00% 29.23% -29.23%
koryakE114 Koryak 0.00% 28.99% -28.99%
koryakE78 Koryak 0.00% 28.89% -28.89%
eastGreenland9 EastGreenland 48.06% 76.92% -28.86%
koryakE77 Koryak 0.00% 28.81% -28.81%
koryakE66 Koryak 0.00% 28.65% -28.65%
westGreenland17 WestGreenland 57.61% 83.07% -25.46%
westGreenland18 WestGreenland 55.02% 79.96% -24.94%
Yukagir9 Yukagir 0.00% 23.18% -23.18%
koryakE95 Koryak 0.00% 22.28% -22.28%
koryakE102 Koryak 0.00% 21.86% -21.86%
koryakE76 Koryak 0.00% 21.06% -21.06%
Bedouin 79.28% 100.00% -20.72%
chukchi114 Chukchi 45.87% 66.55% -20.68%
eastGreenland6 EastGreenland 66.83% 87.49% -20.66%
chukchi129 Chukchi 47.16% 67.43% -20.27%
chukchi124 Chukchi 43.57% 63.34% -19.77%
Bedouin 80.81% 100.00% -19.19%
Bedouin 81.83% 100.00% -18.17%
Bedouin 82.05% 100.00% -17.95%
Bedouin 82.70% 100.00% -17.30%
Bedouin 82.96% 100.00% -17.04%
koryakE117 Koryak 46.60% 63.58% -16.98%
koryakE137 Koryak 47.15% 63.73% -16.58%
selkup6 Selkup 36.97% 53.43% -16.46%
selkup5 Selkup 32.28% 48.73% -16.45%
selkup1 Selkup 38.20% 54.45% -16.25%
Bedouin 83.75% 100.00% -16.25%
selkup8 Selkup 34.33% 50.53% -16.20%
Bedouin 83.96% 100.00% -16.04%
ket2 Ket 26.78% 42.42% -15.64%
ket1 Ket 31.10% 46.72% -15.62%
selkup2 Selkup 27.42% 42.45% -15.03%
selkup9 Selkup 28.85% 43.45% -14.60%
selkup7 Selkup 21.51% 35.92% -14.41%
Bedouin 85.59% 100.00% -14.41%
selkup3 Selkup 39.63% 54.01% -14.38%
Bedouin 85.94% 100.00% -14.06%
Bedouin 86.09% 100.00% -13.91%
Bedouin 87.10% 100.00% -12.90%
Bedouin 87.18% 100.00% -12.82%
Bedouin 87.19% 100.00% -12.81%
Nganassan10 Nganassan 6.13% 18.91% -12.78%
Nganassan8 Nganassan 5.37% 18.04% -12.67%
Nganassan3 Nganassan 4.23% 16.76% -12.53%
Bedouin 87.78% 100.00% -12.22%
Nganassan7 Nganassan 2.09% 14.20% -12.11%
Nganassan9 Nganassan 3.89% 15.86% -11.97%
Nganassan15 Nganassan 5.92% 17.86% -11.94%
Bedouin 88.06% 100.00% -11.94%
Nganassan14 Nganassan 2.81% 14.68% -11.87%
Bedouin 88.17% 100.00% -11.83%
Nganassan2 Nganassan 6.19% 17.90% -11.71%
Bedouin 88.34% 100.00% -11.66%
Bedouin 88.34% 100.00% -11.66%
Bedouin 88.35% 100.00% -11.65%
Bedouin 88.43% 100.00% -11.57%
Nganassan5 Nganassan 3.99% 15.49% -11.50%
Bedouin 88.63% 100.00% -11.37%
selkup10 Selkup 40.03% 51.36% -11.33%
Bedouin 88.71% 100.00% -11.29%
Nganassan1 Nganassan 3.91% 15.11% -11.20%
Nganassan13 Nganassan 14.54% 25.70% -11.16%
Nganassan12 Nganassan 4.26% 15.33% -11.07%
selkup4 Selkup 58.58% 69.61% -11.03%
Bedouin 88.99% 100.00% -11.01%
Bedouin 89.02% 100.00% -10.98%
Bedouin 89.11% 100.00% -10.89%
Bedouin 89.23% 100.00% -10.77%
Bedouin 89.46% 100.00% -10.54%
Bedouin 89.47% 100.00% -10.53%
tuva7 Tuva 16.95% 27.44% -10.49%
Yukagir7 Yukagir 4.56% 14.84% -10.28%
Bedouin 89.74% 100.00% -10.26%
altai3 Altai 28.65% 38.83% -10.18%
Bedouin 89.84% 100.00% -10.16%
Yukagir10 Yukagir 4.32% 14.37% -10.05%
Bedouin 90.06% 100.00% -9.94%
HGDP00959 Yakut 0.00% 9.83% -9.83%
Bedouin 90.22% 100.00% -9.78%
Dolgan6 Dolgan 5.09% 14.81% -9.72%
Bedouin 90.30% 100.00% -9.70%
altai1 Altai 37.58% 47.14% -9.56%
Bedouin 90.56% 100.00% -9.44%
evenk3 Evenk 3.57% 12.87% -9.30%
evenk6 Evenk 5.96% 14.85% -8.89%
Dolgan7 Dolgan 18.88% 27.70% -8.82%
HGDP00953 Yakut 32.31% 41.12% -8.81%
Nganassan11 Nganassan 48.77% 57.46% -8.69%
altai7 Altai 27.31% 35.95% -8.64%
Nganassan6 Nganassan 47.18% 55.74% -8.56%
evenk2 Evenk 5.97% 14.47% -8.50%
altai13 Altai 24.79% 33.13% -8.34%
tuva14 Tuva 15.07% 23.39% -8.32%
altai11 Altai 22.34% 30.60% -8.26%
tuva8 Tuva 16.59% 24.72% -8.13%
tuva2 Tuva 13.85% 21.93% -8.08%
evenk8 Evenk 7.98% 16.02% -8.04%
Yukagir4 Yukagir 47.79% 55.79% -8.00%
evenk13 Evenk 8.91% 16.90% -7.99%
evenk16 Evenk 7.88% 15.83% -7.95%
Dolgan2 Dolgan 14.67% 22.60% -7.93%
tuva16 Tuva 15.62% 23.54% -7.92%
evenk4 Evenk 9.71% 17.61% -7.90%
altai4 Altai 28.30% 36.15% -7.85%
evenk1 Evenk 6.99% 14.82% -7.83%
HGDP00952 Yakut 10.43% 18.22% -7.79%
HGDP00967 Yakut 10.20% 17.89% -7.69%
HGDP00951 Yakut 8.99% 16.63% -7.64%
Yukagir6 Yukagir 31.13% 38.77% -7.64%
tuva12 Tuva 17.33% 24.96% -7.63%
altai9 Altai 25.13% 32.74% -7.61%
HGDP00968 Yakut 13.94% 21.51% -7.57%
altai14 Altai 27.41% 34.98% -7.57%
altai6 Altai 25.65% 33.18% -7.53%
Nganassan4 Nganassan 16.94% 24.46% -7.52%
HGDP00960 Yakut 8.12% 15.61% -7.49%
Yukagir5 Yukagir 11.28% 18.75% -7.47%
altai8 Altai 25.81% 33.21% -7.40%
evenk11 Evenk 8.18% 15.54% -7.36%
tuva4 Tuva 19.26% 26.62% -7.36%
altai10 Altai 23.44% 30.79% -7.35%
Burya16 Buryat 12.27% 19.62% -7.35%
evenk10 Evenk 7.19% 14.54% -7.35%
HGDP00962 Yakut 8.52% 15.86% -7.34%
tuva1 Tuva 14.67% 21.97% -7.30%
Dolgan5 Dolgan 18.30% 25.57% -7.27%
tuva5 Tuva 17.90% 25.16% -7.26%
HGDP00964 Yakut 11.45% 18.64% -7.19%
tuva13 Tuva 15.66% 22.78% -7.12%
HGDP00955 Yakut 9.13% 16.21% -7.08%
HGDP00946 Yakut 9.39% 16.45% -7.06%
HGDP00948 Yakut 15.17% 22.20% -7.03%
HGDP00949 Yakut 30.54% 37.57% -7.03%
HGDP00894 Russian 85.45% 92.43% -6.98%
HGDP00954 Yakut 9.42% 16.35% -6.93%
evenk9 Evenk 5.06% 11.97% -6.91%
tuva15 Tuva 18.26% 25.17% -6.91%
HGDP00947 Yakut 12.51% 19.40% -6.89%
altai5 Altai 21.84% 28.68% -6.84%
mongol120 Mongol 19.74% 26.56% -6.82%
HGDP00965 Yakut 9.02% 15.81% -6.79%
HGDP00969 Yakut 23.01% 29.78% -6.77%
Yukagir1 Yukagir 69.41% 76.12% -6.71%
HGDP00963 Yakut 8.95% 15.63% -6.68%
mongol115 Mongol 18.48% 25.16% -6.68%
HGDP00945 Yakut 9.76% 16.41% -6.65%
Yukagir2 Yukagir 72.35% 78.92% -6.57%
HGDP00957 Yakut 7.49% 14.04% -6.55%
HGDP00881 Russian 86.60% 93.12% -6.52%
Dolgan10 Dolgan 35.64% 42.16% -6.52%
HGDP00890 Russian 87.05% 93.56% -6.51%
HGDP00966 Yakut 13.56% 20.05% -6.49%
HGDP00956 Yakut 8.89% 15.37% -6.48%
tuva10 Tuva 12.91% 19.39% -6.48%
evenk7 Evenk 14.96% 21.41% -6.45%
HGDP00950 Yakut 17.08% 23.52% -6.44%
Dolgan3 Dolgan 21.72% 28.11% -6.39%
tuva9 Tuva 15.79% 22.17% -6.38%
buryat24 Buryat 12.46% 18.82% -6.36%
HGDP00895 Russian 84.89% 91.19% -6.30%
HGDP00897 Russian 86.33% 92.60% -6.27%
Buryat12 Buryat 11.45% 17.65% -6.20%
mongol131 Mongol 20.53% 26.73% -6.20%
HGDP00958 Yakut 14.90% 21.08% -6.18%
evenk14 Evenk 13.53% 19.69% -6.16%
Buryat9 Buryat 11.14% 17.28% -6.14%
HGDP00898 Russian 84.84% 90.98% -6.14%
Buryat14 Buryat 11.37% 17.45% -6.08%
HGDP00896 Russian 86.14% 92.20% -6.06%
mordovia13 Mordovians 88.01% 94.07% -6.06%
tuva11 Tuva 14.74% 20.79% -6.05%
mongol124 Mongol 21.02% 27.00% -5.98%
evenk5 Evenk 15.94% 21.89% -5.95%
mordovia9 Mordovians 88.59% 94.53% -5.94%
buryat18 Buryat 12.44% 18.37% -5.93%
mongol123 Mongol 20.17% 26.09% -5.92%
mongol107 Mongol 20.84% 26.67% -5.83%
tuva6 Tuva 14.18% 19.94% -5.76%
HGDP00961 Yakut 10.15% 15.89% -5.74%
HGDP00889 Russian 87.18% 92.91% -5.73%
evenk12 Evenk 55.52% 61.23% -5.71%
HGDP00880 Russian 88.30% 94.00% -5.70%
HGDP00893 Russian 85.74% 91.44% -5.70%
mordovia6 Mordovians 87.44% 93.09% -5.65%
Buryat3 Buryat 12.39% 18.03% -5.64%
tuva3 Tuva 16.81% 22.44% -5.63%
Burya10 Buryat 13.24% 18.81% -5.57%
buryat20 Buryat 12.21% 17.77% -5.56%
HGDP00899 Russian 87.25% 92.81% -5.56%
HGDP00883 Russian 85.98% 91.51% -5.53%
altai2 Altai 23.05% 28.57% -5.52%
evenk15 Evenk 9.06% 14.54% -5.48%
mordovia10 Mordovians 87.38% 92.85% -5.47%
Yukagir3 Yukagir 76.15% 81.61% -5.46%
HGDP00887 Russian 87.71% 93.17% -5.46%
buryat25 Buryat 12.73% 18.17% -5.44%
buryat26 Buryat 11.68% 17.10% -5.42%
Buryat2 Buryat 13.38% 18.80% -5.42%
HGDP00882 Russian 84.89% 90.27% -5.38%
Buryat4 Buryat 13.19% 18.55% -5.36%
HGDP00901 Russian 86.30% 91.61% -5.31%
Buryat15 Buryat 12.36% 17.64% -5.28%
mordovia1 Mordovians 87.16% 92.44% -5.28%
mordovia12 Mordovians 88.62% 93.88% -5.26%
HGDP00879 Russian 86.84% 92.09% -5.25%
HGDP00886 Russian 86.55% 91.80% -5.25%
mordovia11 Mordovians 87.34% 92.58% -5.24%
buryat27 Buryat 16.66% 21.87% -5.21%
HGDP00885 Russian 88.07% 93.27% -5.20%
HGDP00892 Russian 87.03% 92.22% -5.19%
mordovia8 Mordovians 88.39% 93.52% -5.13%
HGDP00891 Russian 86.23% 91.35% -5.12%
Buryat17 Buryat 11.75% 16.75% -5.00%
mordovia4 Mordovians 87.30% 92.29% -4.99%
buryat23 Buryat 12.79% 17.75% -4.96%
UkrainePol19 Ukraine 94.42% 99.36% -4.94%
tadjik2 Tajiks 82.46% 87.38% -4.92%
mongol119 Mongol 19.37% 24.25% -4.88%
Buryat6 Buryat 12.71% 17.56% -4.85%
Buryat8 Buryat 14.17% 19.02% -4.85%
mongol128 Mongol 15.32% 20.16% -4.84%
HGDP00884 Russian 90.35% 95.17% -4.82%
mordovia2 Mordovians 89.37% 94.10% -4.73%
HGDP00900 Russian 90.09% 94.77% -4.68%
Ukraine94 Ukraine 94.77% 99.44% -4.67%
mongol114 Mongol 20.23% 24.83% -4.60%
HGDP00903 Russian 87.84% 92.35% -4.51%
HGDP00804 Orcadian 95.22% 99.62% -4.40%
mordovia15 Mordovians 87.41% 91.75% -4.34%
mordovia5 Mordovians 87.20% 91.52% -4.32%
mordovia7 Mordovians 87.22% 91.54% -4.32%
Ukraine97 Ukraine 93.70% 98.01% -4.31%
turkmen1 Turkmens 69.31% 73.56% -4.25%
mordovia14 Mordovians 89.19% 93.40% -4.21%
NA12761 NorthAtlantic 91.22% 95.36% -4.14%
mordovia3 Mordovians 86.93% 91.02% -4.09%
tadjik13 Tajiks 75.39% 79.41% -4.02%
tad838 Tajiks 86.20% 90.17% -3.97%
HGDP00902 Russian 88.60% 92.51% -3.91%
nogay7 Nogais 81.80% 85.71% -3.91%
tadjik8 Tajiks 81.80% 85.69% -3.89%
UkrBel618 Ukraine 94.62% 98.43% -3.81%
UkrLv228 Ukraine 93.08% 96.85% -3.77%
HGDP01388 Adygei 88.27% 92.00% -3.73%
nogay15 Nogais 64.31% 68.04% -3.73%
NA12762 NorthAtlantic 95.99% 99.67% -3.68%
Dolgan8 Dolgan 50.60% 54.25% -3.65%
HGDP00794 Orcadian 95.74% 99.37% -3.63%
nogay14 Nogais 71.48% 75.10% -3.62%
Ukraine130 Ukraine 93.98% 97.57% -3.59%
HGDP00888 Russian 89.25% 92.82% -3.57%
HGDP00797 Orcadian 96.18% 99.68% -3.50%
UkrBel622 Ukraine 94.36% 97.85% -3.49%
HGDP00802 Orcadian 95.65% 99.11% -3.46%
turkmen2 Turkmens 70.61% 74.06% -3.45%
UkrBel614 Ukraine 95.23% 98.58% -3.35%
HGDP00808 Orcadian 95.69% 99.02% -3.33%
ch55 Chechens 95.74% 99.06% -3.32%
UkrLv223 Ukraine 95.10% 98.34% -3.24%
tadjik4 Tajiks 85.02% 88.20% -3.18%
.NEEuro1 94.77% 97.93% -3.16%
ch113 Chechens 95.11% 98.26% -3.15%
HGDP00799 Orcadian 96.53% 99.65% -3.12%
NA12005 NorthAtlantic 96.86% 99.95% -3.09%
Ukraine133 Ukraine 94.45% 97.54% -3.09%
UkrLv215 Ukraine 95.44% 98.47% -3.03%
tadjik6 Tajiks 77.82% 80.78% -2.96%
HGDP00525 French 96.10% 99.04% -2.94%
NA12248 NorthSea 96.70% 99.64% -2.94%
Ukraine141 Ukraine 94.19% 97.11% -2.92%
NA07346 NorthSea 96.49% 99.40% -2.91%
turkm537 Turkmens 72.53% 75.42% -2.89%
HGDP00796 Orcadian 96.56% 99.38% -2.82%
UkrLv226 Ukraine 95.31% 98.13% -2.82%
HGDP00526 French 96.87% 99.68% -2.81%
abh165 Abhkasians 95.57% 98.35% -2.78%
UkrLv237 Ukraine 96.30% 99.07% -2.77%
HGDP00800 Orcadian 96.87% 99.64% -2.77%
UkrBel620 Ukraine 95.13% 97.89% -2.76%
bal31 Balkars 90.89% 93.64% -2.75%
NA12057 NorthSea 95.72% 98.47% -2.75%
nogay16 Nogais 75.18% 77.90% -2.72%
nogay1 Nogais 79.67% 82.35% -2.68%
kumyks13 Kumyks 92.22% 94.89% -2.67%
HGDP00798 Orcadian 96.69% 99.35% -2.66%
UkrBel733 Ukraine 95.11% 97.70% -2.59%
NA11830 NorthSea 96.99% 99.53% -2.54%
nogay11 Nogais 73.35% 75.88% -2.53%
HGDP00523 French 97.30% 99.82% -2.52%
NA12003 NorthSea 93.71% 96.23% -2.52%
ch150 Chechens 95.93% 98.40% -2.47%
NA12234 NorthAtlantic 97.57% 100.00% -2.43%
NA12239 NorthSea 96.69% 99.07% -2.38%
NA12815 NorthSea 96.81% 99.19% -2.38%
HGDP00810 Orcadian 97.19% 99.55% -2.36%
NA12827 NorthAtlantic 96.76% 99.12% -2.36%
nogay4 Nogais 74.68% 77.04% -2.36%
tadjik15 Tajiks 79.53% 81.86% -2.33%
HGDP00057 Balochi 79.47% 81.79% -2.32%
nogay3 Nogais 77.65% 79.94% -2.29%
UkrLv240 Ukraine 95.89% 98.17% -2.28%
HGDP00519 French 96.80% 99.07% -2.27%
NA12044 NorthAtlantic 97.03% 99.29% -2.26%
tadjik7 Tajiks 76.81% 79.06% -2.25%
nogay12 Nogais 80.53% 82.77% -2.24%
turkm3661 Turkmens 82.55% 84.76% -2.21%
ch21 Chechens 94.86% 97.05% -2.19%
nogay10 Nogais 71.41% 73.60% -2.19%
tadjik9 Tajiks 82.77% 84.96% -2.19%
tadjik10 Tajiks 85.47% 87.63% -2.16%
ch16 Chechens 95.94% 98.09% -2.15%
nogay5 Nogais 78.53% 80.68% -2.15%
HGDP00803 Orcadian 97.52% 99.66% -2.14%
ch109 Chechens 95.57% 97.67% -2.10%
Ukraine136 Ukraine 93.87% 95.97% -2.10%
UkrBel736 Ukraine 94.33% 96.41% -2.08%
nogay9 Nogais 81.27% 83.33% -2.06%
HGDP00805 Orcadian 96.98% 99.04% -2.06%
HGDP00535 French 97.42% 99.45% -2.03%
kumyks110 Kumyks 92.48% 94.50% -2.02%
nogay8 Nogais 79.38% 81.40% -2.02%
HGDP00515 French 97.90% 99.91% -2.01%
nogay13 Nogais 73.59% 75.59% -2.00%
HGDP00533 French 97.53% 99.51% -1.98%
turkm7529 Turkmens 77.25% 79.23% -1.98%
turkmu26 Turkmens 84.76% 86.74% -1.98%
ch60 Chechens 96.72% 98.69% -1.97%
tadjik14 Tajiks 86.23% 88.18% -1.95%
HGDP01404 Adygei 90.98% 92.91% -1.93%
turkmBe24 Turkmens 85.50% 87.43% -1.93%
Bulgaria6 Bulgarians 97.07% 98.97% -1.90%
HGDP00511 French 97.59% 99.45% -1.86%
kumyks111 Kumyks 94.35% 96.21% -1.86%
HGDP00019 Brahui 91.14% 92.99% -1.85%
ch86 Chechens 95.53% 97.37% -1.84%
NA12347 NorthSea 97.90% 99.74% -1.84%
ch76 Chechens 94.89% 96.72% -1.83%
turkm1820 Turkmens 86.82% 88.65% -1.83%
HGDP00514 French 97.38% 99.20% -1.82%
.AfghanTajik 68.51% 70.32% -1.81%
HGDP00512 French 97.74% 99.55% -1.81%
NA12249 NorthSea 96.49% 98.30% -1.81%
NorthOssetia1 NorthOssetians 93.43% 95.23% -1.80%
NA12776 NorthSea 98.06% 99.85% -1.79%
Bulgaria5 Bulgarians 97.90% 99.67% -1.77%
tadjik12 Tajiks 77.08% 78.84% -1.76%
HGDP00516 French 98.28% 100.00% -1.72%
ch3 Chechens 96.37% 98.08% -1.71%
ch170 Chechens 96.71% 98.41% -1.70%
HGDP00527 French 97.04% 98.74% -1.70%
ch174 Chechens 96.25% 97.94% -1.69%
nogay6 Nogais 83.96% 85.65% -1.69%
ch101 Chechens 95.72% 97.41% -1.69%
.Zeph 98.33% 100.00% -1.67%
HGDP00795 Orcadian 97.38% 99.05% -1.67%
NA07345 NorthSea 97.40% 99.05% -1.65%
NA11829 NorthSea 97.42% 99.06% -1.64%
HGDP00534 French 97.87% 99.50% -1.63%
HGDP00043 Brahui 92.17% 93.78% -1.61%
HGDP00517 French 97.82% 99.41% -1.59%
tadjik11 Tajiks 82.19% 83.77% -1.58%
HGDP00537 French 97.96% 99.54% -1.58%
HGDP01403 Adygei 97.30% 98.86% -1.56%
HGDP00520 French 97.27% 98.82% -1.55%
ch11 Chechens 94.98% 96.52% -1.54%
kurd1101 97.49% 99.03% -1.54%
kurd1159 97.54% 99.07% -1.53%
tadjik5 Tajiks 80.98% 82.51% -1.53%
kumyks4 Kumyks 92.11% 93.63% -1.52%
kumyks6 Kumyks 92.25% 93.76% -1.51%
kumyks8 Kumyks 92.07% 93.56% -1.49%
UkrainePol25 Ukraine 96.97% 98.46% -1.49%
bal124 Balkars 93.92% 95.41% -1.49%
Bulgaria25 Bulgarians 97.95% 99.44% -1.49%
armenia279 Armenians 98.53% 100.00% -1.47%
turkmu33 Turkmens 83.68% 85.14% -1.46%
kumyks1 Kumyks 87.96% 89.41% -1.45%
HGDP00524 French 98.54% 99.97% -1.43%
NA12399 NorthAtlantic 97.54% 98.96% -1.42%
tadjik3 Tajiks 86.11% 87.53% -1.42%
Bulgaria2 Bulgarians 97.76% 99.17% -1.41%
HGDP00078 Balochi 91.06% 92.47% -1.41%
NorthOssetia8 NorthOssetians 91.88% 93.29% -1.41%
turkmH6 Turkmens 84.49% 85.89% -1.40%
bal149 Balkars 92.55% 93.92% -1.37%
HGDP00806 Orcadian 98.23% 99.59% -1.36%
ch131 Chechens 95.55% 96.89% -1.34%
HGDP01387 Adygei 96.58% 97.92% -1.34%
Bulgaria4 Bulgarians 97.59% 98.91% -1.32%
NorthOssetia4 NorthOssetians 92.10% 93.42% -1.32%
bal26 Balkars 92.75% 94.06% -1.31%
ch31 Chechens 97.91% 99.22% -1.31%
Bulgaria37 Bulgarians 97.02% 98.30% -1.28%
NorthOssetia14 NorthOssetians 91.45% 92.71% -1.26%
armenia139 Armenians 98.74% 100.00% -1.26%
abh24 Abhkasians 97.76% 99.01% -1.25%
HGDP00538 French 98.30% 99.52% -1.22%
HGDP00027 Brahui 92.32% 93.53% -1.21%
HGDP01396 Adygei 92.08% 93.29% -1.21%
.Kurd 98.11% 99.31% -1.20%
Bulgaria33 Bulgarians 98.04% 99.24% -1.20%
kurd1173 98.09% 99.28% -1.19%
HGDP00522 French 98.83% 100.00% -1.17%
NorthOssetia16 NorthOssetians 93.02% 94.19% -1.17%
HGDP00513 French 97.89% 99.04% -1.15%
bal102 Balkars 92.63% 93.77% -1.14%
HGDP00064 Balochi 95.14% 96.28% -1.14%
HGDP01399 Adygei 97.09% 98.23% -1.14%
bal136 Balkars 93.04% 94.16% -1.12%
HGDP00528 French 97.15% 98.25% -1.10%
HGDP00518 French 97.84% 98.92% -1.08%
kurd1160 98.65% 99.72% -1.07%
bal80 Balkars 93.14% 94.20% -1.06%
.Turk2 95.01% 96.05% -1.04%
HGDP00076 Balochi 86.33% 87.37% -1.04%
ch193 Chechens 96.88% 97.92% -1.04%
NorthOssetia13 NorthOssetians 91.95% 92.98% -1.03%
armenia176 Armenians 98.98% 100.00% -1.02%
armenia91 Armenians 98.99% 100.00% -1.01%
kurd1198 98.67% 99.68% -1.01%
.Turk1 89.64% 90.64% -1.00%
HGDP00005 Brahui 93.20% 94.17% -0.97%
HGDP01384 Adygei 89.27% 90.24% -0.97%
bal14 Balkars 91.05% 92.01% -0.96%
.KhetranBaloch 92.80% 93.74% -0.94%
HGDP01386 Adygei 96.37% 97.31% -0.94%
HGDP00529 French 98.82% 99.75% -0.93%
kumyks10 Kumyks 95.78% 96.70% -0.92%
Bulgaria1 Bulgarians 97.32% 98.23% -0.91%
HGDP00080 Balochi 95.27% 96.18% -0.91%
HGDP00539 French 98.00% 98.91% -0.91%
bal32 Balkars 93.95% 94.85% -0.90%
Bulgaria39 Bulgarians 97.59% 98.49% -0.90%
NorthOssetia20 NorthOssetians 92.63% 93.52% -0.89%
HGDP01381 Adygei 92.23% 93.11% -0.88%
ch179 Chechens 97.83% 98.70% -0.87%
armenia36 Armenians 99.14% 100.00% -0.86%
bal42 Balkars 91.21% 92.07% -0.86%
NorthOssetia11 NorthOssetians 92.42% 93.26% -0.84%
armenia80 Armenians 99.19% 100.00% -0.81%
kumyks7 Kumyks 92.45% 93.26% -0.81%
.Italian1 97.50% 98.30% -0.80%
bal50 Balkars 93.63% 94.43% -0.80%
Bulgaria8 Bulgarians 96.21% 97.01% -0.80%
armenia71 Armenians 99.21% 100.00% -0.79%
armenia162 Armenians 99.23% 100.00% -0.77%
armenia7 Armenians 99.23% 100.00% -0.77%
HGDP01397 Adygei 92.74% 93.51% -0.77%
nogay2 Nogais 77.42% 78.19% -0.77%
NA12875 NorthSea 97.35% 98.10% -0.75%
HGDP00029 Brahui 81.64% 82.38% -0.74%
Bulgaria26 Bulgarians 98.02% 98.74% -0.72%
HGDP00521 French 98.49% 99.21% -0.72%
armenia73 Armenians 99.33% 100.00% -0.67%
HGDP01402 Adygei 96.21% 96.88% -0.67%
NorthOssetia17 NorthOssetians 93.73% 94.40% -0.67%
HGDP00088 Balochi 94.82% 95.47% -0.65%
kumyks5 Kumyks 90.97% 91.61% -0.64%
HGDP00807 Orcadian 97.71% 98.33% -0.62%
bal64 Balkars 89.32% 89.93% -0.61%
abh137 Abhkasians 99.07% 99.67% -0.60%
armenia86 Armenians 99.40% 100.00% -0.60%
bal45 Balkars 94.20% 94.79% -0.59%
kumyks15 Kumyks 94.44% 95.02% -0.58%
kurd1156 98.91% 99.49% -0.58%
ch126 Chechens 97.01% 97.58% -0.57%
HGDP01385 Adygei 94.29% 94.85% -0.56%
armenia102 Armenians 99.45% 100.00% -0.55%
abh27 Abhkasians 98.95% 99.49% -0.54%
HGDP00023 Brahui 95.23% 95.77% -0.54%
NorthOssetia2 NorthOssetians 93.12% 93.66% -0.54%
HGDP00060 Balochi 83.56% 84.08% -0.52%
HGDP00062 Balochi 94.54% 95.05% -0.51%
HGDP01382 Adygei 98.61% 99.11% -0.50%
NorthOssetia5 NorthOssetians 92.68% 93.16% -0.48%
.HRP0282 87.94% 88.41% -0.47%
NorthOssetia12 NorthOssetians 92.76% 93.23% -0.47%
abh107 Abhkasians 98.65% 99.08% -0.43%
abh85 Abhkasians 98.33% 98.75% -0.42%
abh45 Abhkasians 99.01% 99.43% -0.42%
abh53 Abhkasians 97.60% 98.02% -0.42%
HGDP00035 Brahui 93.98% 94.40% -0.42%
abh41 Abhkasians 98.72% 99.13% -0.41%
Bulgaria3 Bulgarians 98.80% 99.20% -0.40%
abh119 Abhkasians 98.43% 98.82% -0.39%
abh60 Abhkasians 99.27% 99.66% -0.39%
abh71 Abhkasians 98.61% 99.00% -0.39%
abh74 Abhkasians 98.71% 99.10% -0.39%
bal115 Balkars 94.37% 94.76% -0.39%
NorthOssetia3 NorthOssetians 91.26% 91.65% -0.39%
turkmG31 Turkmens 85.57% 85.96% -0.39%
.AfghanPashtun 83.91% 84.28% -0.37%
abh100 Abhkasians 98.48% 98.85% -0.37%
abh122 Abhkasians 98.84% 99.21% -0.37%
bal108 Balkars 93.17% 93.54% -0.37%
Kumyk22 Kumyks 93.07% 93.41% -0.34%
HGDP00531 French 99.24% 99.57% -0.33%
HGDP00066 Balochi 95.42% 95.75% -0.33%
armenia106 Armenians 99.68% 100.00% -0.32%
NorthOssetia19 NorthOssetians 93.15% 93.47% -0.32%
armenia191 Armenians 99.69% 100.00% -0.31%
HGDP01400 Adygei 95.73% 96.04% -0.31%
armenia3 Armenians 99.71% 100.00% -0.29%
HGDP01401 Adygei 95.76% 96.04% -0.28%
turkmG33 Turkmens 86.08% 86.35% -0.27%
HGDP01398 Adygei 97.65% 97.90% -0.25%
ch34 Chechens 97.04% 97.27% -0.23%
abh135 Abhkasians 99.78% 100.00% -0.22%
bal97 Balkars 90.93% 91.15% -0.22%
turkmE42 Turkmens 88.25% 88.46% -0.21%
armenia293 Armenians 99.80% 100.00% -0.20%
bal88 Balkars 93.72% 93.91% -0.19%
Bulgaria7 Bulgarians 98.98% 99.16% -0.18%
HGDP00047 Brahui 91.30% 91.48% -0.18%
abh133 Abhkasians 99.81% 99.98% -0.17%
bal22 Balkars 92.68% 92.85% -0.17%
kumyks108 Kumyks 93.45% 93.62% -0.17%
HGDP01383 Adygei 97.27% 97.42% -0.15%
.karachay 92.11% 92.23% -0.12%
bal7 Balkars 91.42% 91.54% -0.12%
HGDP00072 Balochi 95.48% 95.60% -0.12%
HGDP00013 Brahui 71.34% 71.44% -0.10%
kumyks11 Kumyks 89.56% 89.58% -0.02%
NorthOssetia9 NorthOssetians 92.25% 92.27% -0.02%
.HRP0370 93.06% 93.06% 0.00%
abh147 Abhkasians 100.00% 100.00% 0.00%
abh154 Abhkasians 100.00% 100.00% 0.00%
abh9 Abhkasians 100.00% 100.00% 0.00%
HGDP00449 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00450 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00454 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00455 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00456 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00457 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00458 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00459 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00460 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00461 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00462 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00463 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00464 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00465 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00466 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00467 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00469 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00470 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00471 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00472 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00473 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00474 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00475 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00476 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00478 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00479 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00540 Papuan 0.00% 0.00% 0.00%
HGDP00541 Papuan 0.00% 0.00% 0.00%
HGDP00542 Papuan 0.00% 0.00% 0.00%
HGDP00543 Papuan 0.00% 0.00% 0.00%
HGDP00544 Papuan 0.00% 0.00% 0.00%
HGDP00545 Papuan 0.00% 0.00% 0.00%
HGDP00546 Papuan 0.00% 0.00% 0.00%
HGDP00547 Papuan 0.00% 0.00% 0.00%
HGDP00548 Papuan 0.00% 0.00% 0.00%
HGDP00549 Papuan 0.00% 0.00% 0.00%
HGDP00550 Papuan 0.00% 0.00% 0.00%
HGDP00551 Papuan 0.00% 0.00% 0.00%
HGDP00552 Papuan 0.00% 0.00% 0.00%
HGDP00553 Papuan 0.00% 0.00% 0.00%
HGDP00554 Papuan 0.00% 0.00% 0.00%
HGDP00555 Papuan 0.00% 0.00% 0.00%
HGDP00556 Papuan 0.00% 0.00% 0.00%
HGDP00665 Sardinian 100.00% 100.00% 0.00%
HGDP00666 Sardinian 100.00% 100.00% 0.00%
HGDP00667 Sardinian 100.00% 100.00% 0.00%
HGDP00668 Sardinian 100.00% 100.00% 0.00%
HGDP00669 Sardinian 100.00% 100.00% 0.00%
HGDP00670 Sardinian 100.00% 100.00% 0.00%
HGDP00671 Sardinian 100.00% 100.00% 0.00%
HGDP00672 Sardinian 100.00% 100.00% 0.00%
HGDP00673 Sardinian 100.00% 100.00% 0.00%
HGDP00674 Sardinian 100.00% 100.00% 0.00%
HGDP00747 Japanese 0.00% 0.00% 0.00%
HGDP00748 Japanese 0.00% 0.00% 0.00%
HGDP00749 Japanese 0.00% 0.00% 0.00%
HGDP00750 Japanese 0.00% 0.00% 0.00%
HGDP00751 Japanese 0.00% 0.00% 0.00%
HGDP00752 Japanese 0.00% 0.00% 0.00%
HGDP00753 Japanese 0.00% 0.00% 0.00%
HGDP00755 Japanese 0.00% 0.00% 0.00%
HGDP00756 Japanese 0.00% 0.00% 0.00%
HGDP00757 Japanese 0.00% 0.00% 0.00%
HGDP00758 Japanese 0.00% 0.00% 0.00%
HGDP00759 Japanese 0.00% 0.00% 0.00%
HGDP00760 Japanese 0.00% 0.00% 0.00%
HGDP00761 Japanese 0.00% 0.00% 0.00%
HGDP00762 Japanese 0.00% 0.00% 0.00%
HGDP00763 Japanese 0.00% 0.00% 0.00%
HGDP00764 Japanese 0.00% 0.00% 0.00%
HGDP00765 Japanese 0.00% 0.00% 0.00%
HGDP00766 Japanese 0.00% 0.00% 0.00%
HGDP00767 Japanese 0.00% 0.00% 0.00%
HGDP00768 Japanese 0.00% 0.00% 0.00%
HGDP00769 Japanese 0.00% 0.00% 0.00%
HGDP00771 Japanese 0.00% 0.00% 0.00%
HGDP00772 Japanese 0.00% 0.00% 0.00%
HGDP00773 Japanese 0.00% 0.00% 0.00%
HGDP00790 Japanese 0.00% 0.00% 0.00%
HGDP00791 Japanese 0.00% 0.00% 0.00%
HGDP00828 Japanese 0.00% 0.00% 0.00%
HGDP00920 Yoruba 0.00% 0.00% 0.00%
HGDP00924 Yoruba 0.00% 0.00% 0.00%
HGDP00925 Yoruba 0.00% 0.00% 0.00%
HGDP00926 Yoruba 0.00% 0.00% 0.00%
HGDP00927 Yoruba 0.00% 0.00% 0.00%
HGDP00928 Yoruba 0.00% 0.00% 0.00%
HGDP00929 Yoruba 0.00% 0.00% 0.00%
HGDP00930 Yoruba 0.00% 0.00% 0.00%
HGDP00931 Yoruba 0.00% 0.00% 0.00%
HGDP00932 Yoruba 0.00% 0.00% 0.00%
HGDP00933 Yoruba 0.00% 0.00% 0.00%
HGDP00934 Yoruba 0.00% 0.00% 0.00%
HGDP00935 Yoruba 0.00% 0.00% 0.00%
HGDP00936 Yoruba 0.00% 0.00% 0.00%
HGDP00937 Yoruba 0.00% 0.00% 0.00%
HGDP00938 Yoruba 0.00% 0.00% 0.00%
HGDP00939 Yoruba 0.00% 0.00% 0.00%
HGDP00940 Yoruba 0.00% 0.00% 0.00%
HGDP00941 Yoruba 0.00% 0.00% 0.00%
HGDP00942 Yoruba 0.00% 0.00% 0.00%
HGDP00943 Yoruba 0.00% 0.00% 0.00%
HGDP00982 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00984 MbutiPygmy 0.00% 0.00% 0.00%
HGDP00985 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00986 BiakaPygmy 0.00% 0.00% 0.00%
HGDP00991 San 0.00% 0.00% 0.00%
HGDP00992 San 0.00% 0.00% 0.00%
HGDP01029 San 0.00% 0.00% 0.00%
HGDP01032 San 0.00% 0.00% 0.00%
HGDP01036 San 0.00% 0.00% 0.00%
HGDP01062 Sardinian 100.00% 100.00% 0.00%
HGDP01063 Sardinian 100.00% 100.00% 0.00%
HGDP01064 Sardinian 100.00% 100.00% 0.00%
HGDP01065 Sardinian 100.00% 100.00% 0.00%
HGDP01066 Sardinian 100.00% 100.00% 0.00%
HGDP01067 Sardinian 100.00% 100.00% 0.00%
HGDP01068 Sardinian 100.00% 100.00% 0.00%
HGDP01069 Sardinian 100.00% 100.00% 0.00%
HGDP01070 Sardinian 100.00% 100.00% 0.00%
HGDP01071 Sardinian 100.00% 100.00% 0.00%
HGDP01072 Sardinian 100.00% 100.00% 0.00%
HGDP01073 Sardinian 100.00% 100.00% 0.00%
HGDP01074 Sardinian 100.00% 100.00% 0.00%
HGDP01075 Sardinian 100.00% 100.00% 0.00%
HGDP01076 Sardinian 100.00% 100.00% 0.00%
HGDP01077 Sardinian 100.00% 100.00% 0.00%
HGDP01078 Sardinian 100.00% 100.00% 0.00%
HGDP01079 Sardinian 100.00% 100.00% 0.00%
HGDP01081 MbutiPygmy 0.00% 0.00% 0.00%
HGDP01086 BiakaPygmy 0.00% 0.00% 0.00%
HGDP01090 BiakaPygmy 0.00% 0.00% 0.00%
HGDP01094 BiakaPygmy 0.00% 0.00% 0.00%
HGDP01357 Basque 100.00% 100.00% 0.00%
HGDP01358 Basque 100.00% 100.00% 0.00%
HGDP01359 Basque 100.00% 100.00% 0.00%
HGDP01360 Basque 100.00% 100.00% 0.00%
HGDP01361 Basque 100.00% 100.00% 0.00%
HGDP01362 Basque 100.00% 100.00% 0.00%
HGDP01363 Basque 100.00% 100.00% 0.00%
HGDP01364 Basque 100.00% 100.00% 0.00%
HGDP01365 Basque 100.00% 100.00% 0.00%
HGDP01366 Basque 100.00% 100.00% 0.00%
HGDP01367 Basque 100.00% 100.00% 0.00%
HGDP01368 Basque 100.00% 100.00% 0.00%
HGDP01369 Basque 100.00% 100.00% 0.00%
HGDP01370 Basque 100.00% 100.00% 0.00%
HGDP01371 Basque 100.00% 100.00% 0.00%
HGDP01372 Basque 100.00% 100.00% 0.00%
HGDP01373 Basque 100.00% 100.00% 0.00%
HGDP01374 Basque 100.00% 100.00% 0.00%
HGDP01375 Basque 100.00% 100.00% 0.00%
HGDP01376 Basque 100.00% 100.00% 0.00%
HGDP01377 Basque 100.00% 100.00% 0.00%
HGDP01378 Basque 100.00% 100.00% 0.00%
HGDP01379 Basque 100.00% 100.00% 0.00%
HGDP01380 Basque 100.00% 100.00% 0.00%
NA18939 JPT 0.00% 0.00% 0.00%
NA18940 JPT 0.00% 0.00% 0.00%
NA18942 JPT 0.00% 0.00% 0.00%
NA18943 JPT 0.00% 0.00% 0.00%
NA18944 JPT 0.00% 0.00% 0.00%
NA18945 JPT 0.00% 0.00% 0.00%
NA18946 JPT 0.00% 0.00% 0.00%
NA18947 JPT 0.00% 0.00% 0.00%
NA18948 JPT 0.00% 0.00% 0.00%
NA18949 JPT 0.00% 0.00% 0.00%
NA18951 JPT 0.00% 0.00% 0.00%
NA18952 JPT 0.00% 0.00% 0.00%
NA18953 JPT 0.00% 0.00% 0.00%
NA18954 JPT 0.00% 0.00% 0.00%
NA18955 JPT 0.00% 0.00% 0.00%
NA18956 JPT 0.00% 0.00% 0.00%
NA18957 JPT 0.00% 0.00% 0.00%
NA18959 JPT 0.00% 0.00% 0.00%
NA18960 JPT 0.00% 0.00% 0.00%
NA18961 JPT 0.00% 0.00% 0.00%
NA18962 JPT 0.00% 0.00% 0.00%
NA18963 JPT 0.00% 0.00% 0.00%
NA18964 JPT 0.00% 0.00% 0.00%
NA18965 JPT 0.00% 0.00% 0.00%
NA18966 JPT 0.00% 0.00% 0.00%
NA18967 JPT 0.00% 0.00% 0.00%
NA18968 JPT 0.00% 0.00% 0.00%
NA18969 JPT 0.00% 0.00% 0.00%
NA18970 JPT 0.00% 0.00% 0.00%
NA18971 JPT 0.00% 0.00% 0.00%
NA18972 JPT 0.00% 0.00% 0.00%
NA18973 JPT 0.00% 0.00% 0.00%
NA18974 JPT 0.00% 0.00% 0.00%
NA18975 JPT 0.00% 0.00% 0.00%
NA18976 JPT 0.00% 0.00% 0.00%
NA18977 JPT 0.00% 0.00% 0.00%
NA18978 JPT 0.00% 0.00% 0.00%
NA18979 JPT 0.00% 0.00% 0.00%
NA18980 JPT 0.00% 0.00% 0.00%
NA18981 JPT 0.00% 0.00% 0.00%
NA18987 JPT 0.00% 0.00% 0.00%
NA18990 JPT 0.00% 0.00% 0.00%
NA18991 JPT 0.00% 0.00% 0.00%
NA18993 JPT 0.00% 0.00% 0.00%
NA18994 JPT 0.00% 0.00% 0.00%
NA18995 JPT 0.00% 0.00% 0.00%
NA18997 JPT 0.00% 0.00% 0.00%
NA18998 JPT 0.00% 0.00% 0.00%
NA18999 JPT 0.00% 0.00% 0.00%
NA19000 JPT 0.00% 0.00% 0.00%
NA19001 JPT 0.00% 0.00% 0.00%
NA19002 JPT 0.00% 0.00% 0.00%
NA19005 JPT 0.00% 0.00% 0.00%
NA19007 JPT 0.00% 0.00% 0.00%
NA19009 JPT 0.00% 0.00% 0.00%
NA19010 JPT 0.00% 0.00% 0.00%
NA19054 JPT 0.00% 0.00% 0.00%
NA19055 JPT 0.00% 0.00% 0.00%
NA19056 JPT 0.00% 0.00% 0.00%
NA19057 JPT 0.00% 0.00% 0.00%
NA19058 JPT 0.00% 0.00% 0.00%
NA19059 JPT 0.00% 0.00% 0.00%
NA19060 JPT 0.00% 0.00% 0.00%
NA19062 JPT 0.00% 0.00% 0.00%
NA19063 JPT 0.00% 0.00% 0.00%
NA19064 JPT 0.00% 0.00% 0.00%
NA19065 JPT 0.00% 0.00% 0.00%
NA19066 JPT 0.00% 0.00% 0.00%
NA19067 JPT 0.00% 0.00% 0.00%
NA19068 JPT 0.00% 0.00% 0.00%
NA19070 JPT 0.00% 0.00% 0.00%
NA19072 JPT 0.00% 0.00% 0.00%
NA19074 JPT 0.00% 0.00% 0.00%
NA19075 JPT 0.00% 0.00% 0.00%
NA19076 JPT 0.00% 0.00% 0.00%
NA19077 JPT 0.00% 0.00% 0.00%
NA19078 JPT 0.00% 0.00% 0.00%
NA19079 JPT 0.00% 0.00% 0.00%
NA19080 JPT 0.00% 0.00% 0.00%
NA19081 JPT 0.00% 0.00% 0.00%
NA19083 JPT 0.00% 0.00% 0.00%
NA19084 JPT 0.00% 0.00% 0.00%
NA19085 JPT 0.00% 0.00% 0.00%
NA19086 JPT 0.00% 0.00% 0.00%
NA19087 JPT 0.00% 0.00% 0.00%
NA19088 JPT 0.00% 0.00% 0.00%
HGDP00536 French 98.99% 98.97% 0.02%
HGDP00021 Brahui 93.45% 93.36% 0.09%
HGDP00039 Brahui 90.57% 90.45% 0.12%
HGDP00096 Balochi 93.59% 93.46% 0.13%
HGDP00082 Balochi 94.36% 94.21% 0.15%
HGDP00045 Brahui 95.56% 95.40% 0.16%
turkmV32 Turkmens 84.68% 84.50% 0.18%
turkmE31 Turkmens 87.53% 87.32% 0.21%
HGDP00086 Balochi 96.29% 96.07% 0.22%
HGDP00001 Brahui 94.67% 94.37% 0.30%
HGDP00015 Brahui 95.78% 95.47% 0.31%
HGDP00094 Balochi 93.57% 93.21% 0.36%
HGDP00031 Brahui 95.04% 94.63% 0.41%
HGDP00054 Balochi 97.32% 96.87% 0.45%
HGDP00037 Brahui 94.99% 94.52% 0.47%
HGDP00052 Balochi 96.11% 95.62% 0.49%
HGDP00530 French 99.77% 99.24% 0.53%
HGDP00009 Brahui 95.32% 94.76% 0.56%
HGDP00098 Balochi 91.61% 91.03% 0.58%
HGDP00049 Brahui 94.95% 94.36% 0.59%
.HRP0341 89.54% 88.93% 0.61%
HGDP00068 Balochi 89.55% 88.90% 0.65%
HGDP00007 Brahui 96.03% 95.32% 0.71%
HGDP00017 Brahui 95.87% 95.11% 0.76%
HGDP00011 Brahui 95.41% 94.64% 0.77%
HGDP00003 Brahui 94.71% 93.93% 0.78%
HGDP00090 Balochi 93.06% 92.24% 0.82%
HGDP00033 Brahui 95.87% 95.01% 0.86%
NAG131 Naga 3.61% 2.74% 0.87%
HGDP00074 Balochi 95.71% 94.63% 1.08%
NAG133 Naga 3.32% 2.21% 1.11%
HGDP00025 Brahui 96.43% 95.29% 1.14%
HGDP00058 Balochi 94.12% 92.91% 1.21%
HGDP00041 Brahui 95.72% 94.49% 1.23%
NAG129 Naga 4.63% 3.26% 1.37%
NAG134 Naga 4.54% 3.09% 1.45%
HGDP00056 Balochi 91.11% 89.53% 1.58%
BR084 BrahminsfromUttarPradesh 87.76% 86.05% 1.71%
.HRP0349 87.54% 85.82% 1.72%
HGDP00092 Balochi 94.34% 92.58% 1.76%
.SouthAsian 83.26% 81.41% 1.85%
BR052 BrahminsfromUttarPradesh 85.70% 83.50% 2.20%
.Gondal 91.15% 88.92% 2.23%
D238 BrahminsfromUttarPradesh 86.57% 84.01% 2.56%
209e Kshatriya 86.64% 83.87% 2.77%
177e BrahminsfromUttarPradesh 88.43% 85.61% 2.82%
H1 BrahminsfromUttaranchal 62.36% 59.49% 2.87%
A105 Meena 88.20% 85.22% 2.98%
208e Kshatriya 86.62% 83.62% 3.00%
.Kaundinya 84.21% 81.10% 3.11%
BRG001 BrahminsfromUttarPradesh 86.50% 83.16% 3.34%
GA23 Garo 20.09% 16.72% 3.37%
199e Kshatriya 86.12% 82.40% 3.72%
BR008 BrahminsfromUttarPradesh 87.76% 83.94% 3.82%
GA1 Garo 19.41% 15.56% 3.85%
GA13 Garo 18.15% 14.20% 3.95%
GA24 Garo 19.35% 15.21% 4.14%
evo_9 BrahminsfromUttarPradesh 88.14% 83.89% 4.25%
195 Kshatriya 84.96% 80.48% 4.48%
201 Kshatriya 86.98% 82.46% 4.52%
GONC1 Gond 87.25% 82.72% 4.53%
HGDP00070 Balochi 84.87% 80.34% 4.53%
204e Kshatriya 83.96% 79.36% 4.60%
198 Kshatriya 87.72% 82.98% 4.74%
KHL4 Khasi 30.66% 25.88% 4.78%
SB001 BrahminsfromTamilNadu 86.04% 80.92% 5.12%
SB003 BrahminsfromTamilNadu 85.93% 80.77% 5.16%
KHP3 Khasi 36.79% 31.62% 5.17%
MV009 Muslim 85.03% 79.62% 5.41%
D260 Tharus 83.07% 77.65% 5.42%
335e Kol 49.40% 43.89% 5.51%
KHP4 Khasi 42.83% 37.12% 5.71%
D442 Meghawal 85.13% 78.90% 6.23%
.soublighter 86.09% 79.83% 6.26%
HA045 Dharkars 85.51% 79.16% 6.35%
evo_34 Kanjars 84.97% 78.50% 6.47%
HA029 Dharkars 87.19% 80.59% 6.60%
298 Kol 83.98% 77.29% 6.69%
HA048 Dharkars 84.66% 77.93% 6.73%
MV012 Muslim 82.88% 76.11% 6.77%
evo_40 Chamar 84.71% 77.92% 6.79%
Z224 UttarPradeshScheduledCaste 85.73% 78.93% 6.80%
Bengali1 Bengali 80.47% 73.65% 6.82%
309e Kol 85.24% 78.38% 6.86%
HA040 Dharkars 85.62% 78.74% 6.88%
evo_35 Kanjars 84.63% 77.70% 6.93%
HA030 Dharkars 85.39% 78.42% 6.97%
Z220 UttarPradeshScheduledCaste 83.96% 76.98% 6.98%
evo_33 Kanjars 84.84% 77.86% 6.98%
HA042 Dharkars 85.22% 78.21% 7.01%
evo_10 Kurmi 84.40% 77.38% 7.02%
288 Kol 85.92% 78.86% 7.06%
314 Kol 85.19% 77.97% 7.22%
A311 Lambadi 84.40% 77.17% 7.23%
319 Kol 85.14% 77.90% 7.24%
HA041 Dharkars 86.62% 79.33% 7.29%
HA037 Dharkars 84.65% 77.33% 7.32%
MV015 Muslim 84.96% 77.55% 7.41%
MV003 Muslim 83.01% 75.59% 7.42%
TN26 TamilNaduScheduledCaste 85.72% 78.24% 7.48%
DEL007 BrahminsfromUttarPradesh 85.80% 78.25% 7.55%
evo_37 Kanjars 86.94% 79.38% 7.56%
VELZ281 Velamas 86.31% 78.75% 7.56%
MV013 Muslim 83.91% 76.29% 7.62%
PK6458 PiramalaiKallars 82.65% 75.02% 7.63%
VELZ275 Velamas 84.95% 77.27% 7.68%
A141 Kanjars 85.81% 78.12% 7.69%
HA038 Dharkars 83.20% 75.50% 7.70%
A143 Kanjars 86.46% 78.74% 7.72%
evo_38 Kanjars 83.71% 75.96% 7.75%
D494 Dusadh 83.26% 75.51% 7.75%
KUR2 Kurumba 85.62% 77.85% 7.77%
296e Kol 84.35% 76.57% 7.78%
VELZ271 Velamas 85.74% 77.94% 7.80%
VELZ260 Velamas 89.00% 81.17% 7.83%
VELZ272 Velamas 86.40% 78.49% 7.91%
D254 Tharus 85.30% 77.33% 7.97%
KUR1 Kurumba 84.29% 76.26% 8.03%
VELZ274 Velamas 84.92% 76.87% 8.05%
VELZ265 Velamas 87.81% 79.72% 8.09%
evo_36 Kanjars 86.51% 78.37% 8.14%
VELZ267 Velamas 85.96% 77.77% 8.19%
VELZ277 Velamas 87.06% 78.86% 8.20%
332 Kol 82.86% 74.61% 8.25%
A392 Dusadh 83.24% 74.98% 8.26%
Z218 UttarPradeshScheduledCaste 84.69% 76.41% 8.28%
A396 Dusadh 83.27% 74.94% 8.33%
PK6490 PiramalaiKallars 82.06% 73.73% 8.33%
PK6491 PiramalaiKallars 84.43% 76.00% 8.43%
A397 Dusadh 83.95% 75.51% 8.44%
CHEND85 Chenchus 80.38% 71.81% 8.57%
HLKP252 Hakkipikki 81.54% 72.97% 8.57%
PK6422 PiramalaiKallars 84.69% 76.12% 8.57%
VELZ264 Velamas 84.20% 75.61% 8.59%
Z225 UttarPradeshScheduledCaste 82.90% 74.31% 8.59%
CHEND95 Chenchus 77.74% 69.11% 8.63%
A388 Dusadh 83.28% 74.61% 8.67%
A386 Dusadh 82.30% 73.58% 8.72%
KUR4 Kurumba 84.54% 75.81% 8.73%
306 Kol 84.20% 75.46% 8.74%
CHEND96 Chenchus 81.43% 72.69% 8.74%
PK6415 PiramalaiKallars 84.35% 75.57% 8.78%
A266 Chamar 81.02% 72.21% 8.81%
A259 Chamar 82.15% 73.33% 8.82%
290 Kol 82.38% 73.54% 8.84%
evo_42 Chamar 82.21% 73.37% 8.84%
DH001 Dharkars 82.57% 73.63% 8.94%
330e Kol 82.60% 73.63% 8.97%
evo_43 Chamar 81.67% 72.67% 9.00%
A261 Chamar 82.27% 73.25% 9.02%
PK6442 PiramalaiKallars 83.77% 74.73% 9.04%
323 Kol 80.01% 70.94% 9.07%
CHEND90 Chenchus 77.95% 68.84% 9.11%
TN18 TamilNaduScheduledCaste 84.10% 74.89% 9.21%
Z217 UttarPradeshScheduledCaste 80.89% 71.63% 9.26%
PK6459 PiramalaiKallars 84.89% 75.63% 9.26%
PK6468 PiramalaiKallars 83.30% 74.03% 9.27%
A260 Chamar 82.93% 73.53% 9.40%
KUR3 Kurumba 84.64% 75.19% 9.45%
328e Kol 82.70% 73.16% 9.54%
NIH1 Nihali 76.00% 66.45% 9.55%
GONC8 Gond 73.95% 64.31% 9.64%
310e Kol 81.97% 72.16% 9.81%
321e Kol 83.27% 73.45% 9.82%
A268 Chamar 81.07% 71.25% 9.82%
308e Kol 83.19% 73.35% 9.84%
HLKP245 Hakkipikki 81.83% 71.93% 9.90%
HLKP243 Hakkipikki 80.07% 69.90% 10.17%
evo_41 Chamar 82.21% 72.03% 10.18%
HLKP241 Hakkipikki 80.18% 69.79% 10.39%
ORI35 Bonda 52.14% 41.17% 10.97%
HO438 Ho 59.66% 48.66% 11.00%
A253 Chamar 70.89% 59.88% 11.01%
JUANA48 Juang 52.88% 41.79% 11.09%
KH15 Kharia 58.27% 47.14% 11.13%
226 Asur 71.43% 60.28% 11.15%
ORI37 Bonda 53.61% 42.45% 11.16%
ORI93 Savara 55.97% 44.73% 11.24%
Mawasi1 Mawasi 62.89% 51.63% 11.26%
ORI36 Bonda 53.11% 41.82% 11.29%
ORI88 Gadaba 54.98% 43.67% 11.31%
A41 Juang 53.01% 41.64% 11.37%
480 Ho 59.68% 48.18% 11.50%
HO446 Ho 59.83% 48.32% 11.51%
GONC5 Gond 74.98% 63.45% 11.53%
KH1 Kharia 58.57% 46.93% 11.64%
HO470 Ho 60.08% 48.38% 11.70%
HO434 Ho 60.27% 48.56% 11.71%
ORI23 Dhurwa 61.27% 49.56% 11.71%
PULD160 Pulliyar 76.20% 64.46% 11.74%
G22 Santhal 65.14% 53.39% 11.75%
PULD171 Pulliyar 76.62% 64.77% 11.85%
ORI34 Bonda 52.69% 40.80% 11.89%
D201 Bhunjia 62.18% 50.26% 11.92%
PULD172 Pulliyar 76.36% 64.39% 11.97%
PULD162 Pulliyar 77.32% 65.31% 12.01%
SAVOR105 Savara 60.32% 48.31% 12.01%
PULD151 Pulliyar 76.60% 64.42% 12.18%
ASUR350 Asur 62.10% 49.89% 12.21%
NIH2 Nihali 67.43% 54.54% 12.89%
G53 Gond 67.39% 54.18% 13.21%
Dai 87.29% 4.77% 82.52%
Dai 87.19% 4.27% 82.92%
Dai 88.17% 4.65% 83.52%
Dai 89.68% 5.36% 84.32%
Dai 88.56% 4.21% 84.35%
Dai 89.15% 4.65% 84.50%
Dai 89.69% 4.34% 85.35%
Dai 89.82% 4.44% 85.38%
Dai 89.07% 3.50% 85.57%
Dai 91.13% 4.33% 86.80%
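
For anyone who wants to sanity-check the table above, here is a minimal Python sketch. It assumes each row ends in three percentages, that the last one is the first minus the second, and that rows missing a sample ID or population label (like the Dai rows) should still parse. The function names are made up for illustration.

```python
import re

# Trailing "A% B% D%" triple; whatever precedes it is the sample ID and/or population.
ROW = re.compile(
    r"^(?P<label>.*?)\s*(?P<a>-?\d+\.\d+)%\s+(?P<b>-?\d+\.\d+)%\s+(?P<d>-?\d+\.\d+)%$"
)

def parse_row(line):
    """Return (label, a, b, d) for one table row, or None if it does not match."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    return m.group("label"), float(m.group("a")), float(m.group("b")), float(m.group("d"))

def diff_is_consistent(a, b, d, tol=0.011):
    """True if d equals a - b, allowing for rounding of two-decimal percentages."""
    return abs((a - b) - d) <= tol

rows = [
    "HGDP00881 Russian 86.60% 93.12% -6.52%",
    "kurd1101 97.49% 99.03% -1.54%",
    "Dai 87.29% 4.77% 82.52%",
]
for line in rows:
    label, a, b, d = parse_row(line)
    print(label, diff_is_consistent(a, b, d))
```

Sorting the parsed rows by the last value would reproduce the ordering used in the table.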

Dr_McNinja
03-04-2014, 10:06 PM
This run is very interesting.
With the first spreadsheet, I have the highest Papuan. But when you removed SW Asian, in the second spreadsheet, I end up with minimal Papuan, and most of that gets assigned to Sardinian. The interesting thing is that some of my East Asian also gets assigned to Sardinian! (I would have guessed that the Papuan would have stayed intact or recombined into East Asian...)
So approx 8% goes from East Eurasian (Papuan+east Asian) to West Eurasian (Sardinian).

Yeah I removed Papuan when I saw it started capturing some West Eurasian after I removed SW-Asian. After I removed Papuan as a component, they broke down in the next run to ~38% East Asian, ~6% W-African, ~55% Sardinian. It seems there's a little bit of mixed East+West Eurasian which doesn't really settle out into one or the other. It either goes completely to one or the other.
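
A rough sketch of how dropping a component from a supervised run could be scripted. ADMIXTURE's supervised mode reads a `.pop` file alongside the `.bed` file, with one line per individual in `.fam` order: a population label for reference samples and `-` for everyone else. The helper name, the toy sample-to-population mapping, and the choice to relabel a dropped population as `-` (rather than removing those samples from the dataset) are assumptions made for illustration, not a record of how the runs in this thread were set up.

```python
def pop_lines(sample_ids, sample_to_pop, reference_pops):
    """Build .pop lines in the same order as the samples in the .fam file.

    Samples whose population is in reference_pops keep their label;
    all others get "-" so their ancestry is estimated rather than fixed.
    """
    lines = []
    for sid in sample_ids:
        pop = sample_to_pop.get(sid)
        lines.append(pop if pop in reference_pops else "-")
    return lines

# Toy inputs: a handful of IDs and labels chosen only for illustration.
samples = ["HGDP00665", "HGDP00540", "HGDP00747", "HRP0349"]
labels = {"HGDP00665": "Sardinian", "HGDP00540": "Papuan", "HGDP00747": "Japanese"}

with_papuan = pop_lines(samples, labels, {"Sardinian", "Papuan", "Japanese"})
without_papuan = pop_lines(samples, labels, {"Sardinian", "Japanese"})

# The K given on the command line should match the number of distinct labels.
k = len(set(without_papuan) - {"-"})
print("\n".join(without_papuan))
print("K =", k)
```

The lines would be written to `<prefix>.pop`, and the run started with something like `admixture --supervised <prefix>.bed K`.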

Ignis90
03-05-2014, 10:29 AM
Really interesting thread! If you want a Berber sample (for your global or West Eurasian runs), I can give you my data. Something I noticed is that I often score some South Asian % (over 1%) while I barely get any West Asian.

soulblighter
03-05-2014, 01:43 PM
Yeah I removed Papuan when I saw it started capturing some West Eurasian after I removed SW-Asian. After I removed Papuan as a component, they broke down in the next run to ~38% East Asian, ~6% W-African, ~55% Sardinian. It seems there's a little bit of mixed East+West Eurasian which doesn't really settle out into one or the other. It either goes completely to one or the other.

What are your thoughts on those Bedouin samples that jump close to 100%? I assume SW Asian was modal among them initially.
With all this extreme shifting in a few populations such as Bedouin, Maya, Pima and Dai, I am starting to wonder if there is an issue with those datasets, or if the statistics/PDF of the alleles chosen is off and doesn't follow the guidelines ADMIXTURE expects. The little bit of mixed East and West Eurasian that shifts can be expected, but the extreme shifting in the populations listed above is rather puzzling.

MfA
03-05-2014, 01:52 PM
Did some more runs with the old dataset.

East Asian, two African components, Papuan, Basque, Sardinian, SW Asian: https://docs.google.com/spreadsheets/d/1a6itnwHs2dPY2ouEXbm_QOOGa0eMJk69EEPCrAgLHUw/edit#gid=747615632

Then removed SW Asian: https://docs.google.com/spreadsheets/d/1k-2Njdz73v5lAo6H5xZ8es7PYLQT4KONe7rimYlExlI/edit#gid=1406931695

Noticed Papuan jumped for some of the West Eurasian individuals.

So I removed Papuan:

https://docs.google.com/spreadsheets/d/1wZugJ9HdnRXyM4Er3KJZ320z3wZn_zrtBJJBng6Ma_Q/edit#gid=1493305812

Then removed Sardinian or Basque (two sheets, one for Basque, one for Sardinian): https://docs.google.com/spreadsheets/d/12X03bTUBi9cty6j3o2TKFkGEXQBwmvOWLLKF8Yz46_g/edit#gid=792931946

Then the last, a run with just East Asian and either Basque or Sardinian (two sheets): https://docs.google.com/spreadsheets/d/1kh6HDIwaVegBEo7K3YIQGM7qV05zZd74O0rcuO9X49g/edit#gid=258432243

What happens with West African is pretty interesting. And at the end I'm pretty certain East Asian is now capturing some West Eurasian.

What's your (@all) opinion about South-Central Asians scoring 20-30% Yoruba on this run (Papuans 65%, Bedouins 2-8%)? The AncAfrican (Pygmy) component looks more logical to me when it comes to capturing African-related ancestry.

soulblighter
03-05-2014, 01:58 PM
What's your (@all) opinion about South-Central Asians scoring 20-30% Yoruba on this run (Papuans 65%, Bedouins 2-8%)? The AncAfrican (Pygmy) component looks more logical to me when it comes to capturing African-related ancestry.

I don't think it is African ancestry but shared ancestry between west africans and south asians (i.e possibly middle eastern like farmer ancestry?) being captured.
I believe the Papuans are at 65% here, because their closest population is probably south Asian (which is not represented), which is then getting projected to the shared allele frequency in west Africans. The other three components modal in sardinian, basque and pygmy seem to be highly divergent and unique to accommodate papuan.

ZephyrousMandaru
03-05-2014, 02:21 PM
I don't think it is African ancestry but shared ancestry between west africans and south asians (i.e possibly middle eastern like farmer ancestry?) being captured.
I believe the Papuans are at 65% here, because their closest population is probably south Asian (which is not represented), which is then getting projected to the shared allele frequency in west Africans. The other three components modal in sardinian, basque and pygmy seem to be highly divergent and unique to accommodate papuan.

You have to wonder why it's being allocated into the African clusters, if it is Middle Eastern farmer ancestry. Seems rather odd.

Ignis90
03-05-2014, 03:14 PM
^ What about y-dna E back migration to Africa? The age of E doesn't correlate with any West Eurasian (thus Middle Eastern farmer) ancestry as it is way too old.

soulblighter
03-05-2014, 04:17 PM
You have to wonder why it's being allocated into the African clusters, if it is Middle Eastern farmer ancestry. Seems rather odd.

Because I don't think Yoruba is purely African. There is evidence of back migration into Africa from the Middle East, as Ignis states (though not necessarily correlated with Y-DNA E; maybe with farming instead). It could be that the allele frequencies in the current samples of the populations that back-migrated into West Africa and those that migrated into South Asia are intersecting in a statistically significant way.
Sardinian, Basque, and Pygmy seem to have highly "specialized" allele frequencies, so they have a very "narrow and peaky" allele frequency distribution (with small variance), whereas my guess is that West Africa has a higher variance in its distribution function (i.e. a "fat" distribution function), and is thus able to capture other "components".

Sein
03-05-2014, 05:08 PM
Not sure why, but the West African correlates rather beautifully with actual ASI estimates for the populations in question. For example, MfA is 7.13% West African, which is the exact amount of ASI one would expect for a person of Kurdish background, and I'm 18% West African, which is probably the actual amount of ASI I have. The Afghan Pashtun (HRP0370, not the Afghan Pashtun I sent) is 15% West African, which is probably the actual amount of ASI she has. Soulblighter is 30%, a very reasonable percentage if this were actually an ASI estimate. Just something I noticed, but I don't know why West African would correlate with ASI. Perhaps it's because all the West Eurasian ancestry is properly sifted out, but there is still a set of allele frequencies which don't fit into West vs East Eurasian, so they go into West African. We may have stumbled onto solid ASI estimates for South Asians, South Central Asians, and Iranians by accident!

And the East Asian might actually be partly ANE, for some participants. Since Sardinians and Basque have no ANE ancestry whatsoever, a lot of the ANE allele frequencies are going into the East Asian. So for some of us, much of the "East Asian" should be counted as West Eurasian, since, at the end of the day, ANE is basically West Eurasian.

everest59
03-05-2014, 05:32 PM
Sein might be on to something. As I showed above, the South Asian of K12b is around 54% ASI if I remember. Which in my case is around 19% ASI. This is close to the Yoruba number.

Sein
03-05-2014, 05:46 PM
Everest makes an excellent point here, the correlations are too robust for this to be coincidence.

Also, I just realized that the Papuans are around 60%-70% West African. This is pretty solid evidence that "West African" is acting as a very excellent/accurate proxy for ASI, since ASI is distantly connected to Oceania.

soulblighter
03-05-2014, 05:56 PM
Very possible. It just seems to indicate that the "south asian" allele frequency has a fat distribution.
Dr_McNinja, what do you think would happen if you made a South Asia-centric calculator and tried to represent the other populations around the world? Would there be one-to-one mappings?
For example, with the Brahui, Pulliyar, and Garo as the basis?

ZephyrousMandaru
03-05-2014, 05:57 PM
^ What about y-dna E back migration to Africa? The age of E doesn't correlate with any West Eurasian (thus Middle Eastern farmer) ancestry as it is way too old.

E's highest diversity is observed in Africa, so I very much doubt E originated in the Middle East.

newtoboard
03-05-2014, 06:01 PM
E's highest diversity is observed in Africa, so I very much doubt E originated in the Middle East.

Diversity isn't always a good indicator of origin.

newtoboard
03-05-2014, 06:07 PM
Do you think DE or D originated in Africa?

ZephyrousMandaru
03-05-2014, 06:10 PM
Because I don't think Yoruba is purely African. There is evidence of back migration into Africa from the Middle East, as Ignis states (though not necessarily correlated with Y-DNA E; maybe with farming instead). It could be that the allele frequencies in the current samples of the populations that back-migrated into West Africa and those that migrated into South Asia are intersecting in a statistically significant way.
Sardinian, Basque, and Pygmy seem to have highly "specialized" allele frequencies, so they have a very "narrow and peaky" allele frequency distribution (with small variance), whereas my guess is that West Africa has a higher variance in its distribution function (i.e. a "fat" distribution function), and is thus able to capture other "components".

The Yoruba samples don't score any percentage in the West Eurasian component. If there was West Eurasian gene flow into West Africa, and the Yoruba were one of the recipients of this admixture, I would think that they would score some percentage in the Basque and Sardinian components, but they don't. It might be capturing more of the genetic variance because the Yoruba, being an African population, are naturally going to be more genetically diverse and therefore might be better suited as a proxy for representing the rest of the genetic variance than more specialized components such as Basque and Sardinian are.

soulblighter
03-05-2014, 06:20 PM
The Yoruba samples don't score any percentage in the West Eurasian component. If there was West Eurasian gene flow into West Africa, and the Yoruba were one of the recipients of this admixture, I would think that they would score some percentage in the Basque and Sardinian components, but they don't. It might be capturing more of the genetic variance because the Yoruba, being an African population, are naturally going to be more genetically diverse and therefore might be better suited as a proxy for representing the rest of the genetic variance than more specialized components such as Basque and Sardinian are.

There is no single West Eurasian component. Yoruba didn't score Basque/Sardinian because the Yoruba component (Yoruba are West African) exists in the run, so they score 100%. If you removed West African too, then we would have to see what they score. For example, see what the Yoruba score in the last run (only East Asian with Basque/Sardinian: they are 13% East Asian and 87% Basque/Sardinian). If we had Ancestral African, Basque and Japanese, I wonder what the results would look like.

Looks like Dienekes already blogged about this: http://dienekes.blogspot.com/2012/03/effects-of-ascertainment-on-admixture.html

Sein
03-05-2014, 06:50 PM
Just a quick note, but the Brahui are 17.3% ASI based on Dienekes' experiment. Taking this at face value, I estimated that I'd be 15% ASI based on Dienekes' experiment, since the Brahui are 19.88% West African in this experiment. HRP0370 ends up being 11.74% ASI, based on Dienekes' experiment. So, she is 3% less ASI than myself. I think this is pretty accurate. MfA turns out around 4% ASI. This is also pretty accurate, since the Dodecad Iranians were around 6%-7% ASI, and being of Kurdish background, MfA should have less ASI than Iranians of mostly Persian background. We didn't even have to do all the complicated work involved with Dienekes' experiment! I believe he did a regression of some sort, and solved a system of equations, or something along those lines, much beyond my head, lol. But if you are of South Asian, Central Asian, or Iranian-Kurdish background, you can calculate pretty accurate ASI percentages for yourself. Just take into account the 17.3% ASI for the Brahui in Dienekes' experiment, and the 19.88% West African in this experiment, and go from there. Also, I'm basing this off the Sardinian sheet, and this is Dienekes' experiment:

https://docs.google.com/spreadsheet/ccc?key=0ArJDEoCgzRKedEd3N2drM05sck1wcG03TFdWUnZaQ mc&authkey=CIHIwKcO&hl=en_US&authkey=CIHIwKcO#gid=0
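
The rescaling described above can be sketched in a few lines. This is only a rough illustration that assumes a simple proportional scaling anchored on the Brahui (17.3% ASI in Dienekes' experiment, 19.88% West African in this run); the function name is made up for the example, and the per-person inputs from the Sardinian sheet are not reproduced here.

```python
# Anchor values quoted in the post above.
BRAHUI_ASI = 17.3            # % ASI for the Brahui in Dienekes' experiment
BRAHUI_WEST_AFRICAN = 19.88  # % West African for the Brahui in this run


def estimate_asi(west_african_pct):
    """Scale a West African percentage to an ASI estimate.

    Assumption (not confirmed by the post): ASI is proportional to the
    West African score, with the Brahui as the single calibration point.
    """
    return west_african_pct * BRAHUI_ASI / BRAHUI_WEST_AFRICAN


if __name__ == "__main__":
    # The rounded 18% West African figure quoted earlier in the thread.
    print(round(estimate_asi(18.0), 2))
```

With a single calibration point, any error in the Brahui figures carries straight through to every other estimate, so the results are best read as ballpark values.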

Ignis90
03-05-2014, 09:02 PM
E's highest diversity is observed in Africa, so I very much doubt E originated in the Middle East.

So does mtdna M1 but it's accepted M1 is the result of a back migration to Africa. Y-dna E being closely related to Asian-restricted D and roughly of similar age as CF (and even C) makes the Eurasian origin credible.

And I am not implying Yorubas have West Eurasian ancestry but some kind of old Eurasian ancestry.

Dr_McNinja
03-05-2014, 10:43 PM
Sein might be on to something. As I showed above, the South Asian of K12b is around 54% ASI if I remember. Which in my case is around 19% ASI. This is close to the Yoruba number.

I'm not sure. HRP0370 scores ~15% in K12b and Harappa while HRP0341 and I are ~30% and ~35-36% respectively. Some people's numbers are staying the same and others' are dropping noticeably.

Then again, ASI itself in these runs is coming out rather volatile. There have been runs where a few individuals are close to the expected marks based off K12b/Harappa but people above and below them are off by a certain factor. That could be the case here, with people near 15-20% coming out accurate and South Asians being off by a certain factor.

In other runs, we've all changed by a similar proportion from our K12b/Harappa runs (i.e, HRP0370 is in the single digits and we're in the teens/20s). It's not unreasonable though because we know the South Indian component is ~50% East Asian so ADMIXTURE will still pick up that same amount/frequency (give or take a little depending on component flows and SNP count) and just put it in something else. That East Asian amount changes once we get down to 3/4 components as some of the East Eurasian becomes undifferentiated from West Eurasian which calls into question that ASI estimate to begin with.
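
As a quick check on that arithmetic, here is a minimal sketch that takes the "~50% East Asian" figure for the South Indian component at face value and applies it to the approximate K12b/Harappa scores mentioned above. The constant and function name are illustrative only, and 35.5 stands in for the "~35-36%" quoted in the post.

```python
# "~50% East Asian" share of the South Indian component, as stated in the post.
EAST_ASIAN_SHARE_OF_S_INDIAN = 0.5


def implied_east_eurasian(s_indian_pct):
    """East Eurasian percentage implied by a South Indian score,
    assuming the share above holds for every individual."""
    return s_indian_pct * EAST_ASIAN_SHARE_OF_S_INDIAN


if __name__ == "__main__":
    # Approximate South Indian scores quoted in the post.
    for label, s_indian in [("HRP0370", 15.0), ("HRP0341", 30.0), ("HRP0349", 35.5)]:
        print(label, implied_east_eurasian(s_indian))
```

Under that assumption the implied values land in the single digits for HRP0370 and in the teens for the other two, which matches the pattern described for the low-K runs.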

Dr_McNinja
03-05-2014, 10:44 PM
Very possible. It just seems to indicate that the "south asian" allele frequency has a fat distribution.
Dr_McNinja, what do you think would happen if you made a South Asia-centric calculator and tried to represent the other populations around the world? Would there be one-to-one mappings?
For example, with the Brahui, Pulliyar, and Garo as the basis?

I did some India-only runs before. UP Brahmin usually becomes the major West Eurasian component for non-Indians, Garo/Nihali/etc become East Asian. Pulliyar diverges into its own thing. (EDIT: I didn't include Gedrosian in that)

ZephyrousMandaru
03-06-2014, 03:16 AM
So does mtdna M1 but it's accepted M1 is the result of a back migration to Africa. Y-dna E being closely related to Asian-restricted D and roughly of similar age as CF (and even C) makes the Eurasian origin credible.

And I am not implying Yorubas have West Eurasian ancestry but some kind of old Eurasian ancestry.

The problem with that is you have to have aDNA to prove that the Yoruba actually do have West Eurasian ancestry, we'd need ancient DNA from West Africa to prove that link. Otherwise it's just an unfalsifiable proposition. The West Eurasian signal could just imply that the closest Non-African populations to the Yoruba are West Eurasians, specifically Middle Eastern West Eurasians which makes sense geographically.

How do you know it's Eurasian ancestry, and not just ancestral African ancestry simply being redefined as Eurasian by ADMIXTURE, because it still persists in both Africans and Eurasians?

ZephyrousMandaru
03-06-2014, 03:20 AM
There is no single West Eurasian component. Yoruba didn't score Basque/Sardinian because the Yoruba component (Yoruba are West African) exists in the run, so they score 100%. If you removed West African too, then we would have to see what they score. For example, see what the Yoruba score in the last run (only East Asian with Basque/Sardinian: they are 13% East Asian and 87% Basque/Sardinian). If we had Ancestral African, Basque and Japanese, I wonder what the results would look like.

Looks like Dienekes already blogged about this: http://dienekes.blogspot.com/2012/03/effects-of-ascertainment-on-admixture.html

Better yet, just remove the West Eurasian components and rerun it with just East Asian and African ancestral clusters. I bet most West Eurasians will score an overwhelming amount of African. If they did, would that imply that West Eurasians are African admixed simply from what those results alone would yield?

Dr_McNinja
03-06-2014, 12:29 PM
I'm not sure. HRP0370 scores ~15% in K12b and Harappa while HRP0341 and I are ~30% and ~35-36% respectively. Some people's numbers are staying the same and others' are dropping noticeably.

Then again, ASI itself in these runs is coming out rather volatile. There have been runs where a few individuals are close to the expected marks based off K12b/Harappa but people above and below them are off by a certain factor. That could be the case here, with people near 15-20% coming out accurate and South Asians being off by a certain factor.

In other runs, we've all changed by a similar proportion from our K12b/Harappa runs (i.e, HRP0370 is in the single digits and we're in the teens/20s). It's not unreasonable though because we know the South Indian component is ~50% East Asian so ADMIXTURE will still pick up that same amount/frequency (give or take a little depending on component flows and SNP count) and just put it in something else. That East Asian amount changes once we get down to 3/4 components as some of the East Eurasian becomes undifferentiated from West Eurasian which calls into question that ASI estimate to begin with.

To add, the difference in East/West Eurasian between me and HRP0341 and HRP0402, other Punjabi Jatts is about 2-3% when we got down to 3 or 4 components. It was 2% actually in one of them. That sort of makes sense since that would be about a 4% difference in S-Indian which is what I would expect, based on my mom's results. Even my own is like 5% off (34.x vs. 29.x in Harappa I think). But that means all that other stuff, the excess Arctic admixture and whatnot is all really representing West Eurasian. Or, well, a few possibilities:

1. That Arctic admixture is not East Asian enough to be captured by S-Indian.
2. ADMIXTURE ran out of Caucasian mixture which it could mate to that East Eurasian admixture (for S-Indian) so it comes off as excess Arctic.

Either the East Asian within S-Indian or this excess Arctic is actually West Eurasian. It could be either. HRP0341 spun off a bunch of excess Arctic in my runs so maybe it's Arctic. But the source of his Arctic seemed to be the South Asian/Indian. But Dr. McDonald did say that the South Asian component seemed European-like. Another thing is that within my runs, I kept winding up with excess South Asian, and was within normal ranges for everything else. Which means it was pulling that Arctic into South Asian. So I don't know which East Eurasian is being kicked out into Arctic or kept within S-Indian, is it the genuine East Asian-like or is it the stuff that's reverting to West Eurasian in the other runs? There is one pattern with Uralic-like admixture being caught in NE-Euro. I figure that admixture in my family is the one that no longer resembles Uralic sufficiently enough anymore and is being confused for other kinds of East Eurasian (I had unusually high Beringian in K12b/Harappa I think).

Other patterns: Wherever my South Asian is significantly elevated, my Gedrosian/Caucasian numbers are lower than the Harappa/K12b baseline. Eurogenes K36 was one example where my Caucasian dropped to ~8% (from ~11% in Harappa/Dodecad K12b) as my Indian went up to 42%, a higher increase than most of the other users (one of the Haryana Jatts also had a slight bump in Indian). My Arctic was also much lower in Eurogenes K36.

I think it just comes down to genetic drift and ADMIXTURE's inability to handle linkage disequilibrium. For whatever reason, my mother and I are drifting in a different direction than the other Jatts. We can identify three distinct drifted Jatt populations. Haryana Jatts, Punjabi Jatts (usually Sikh), and Pahari Jatts (usually Pakistani). A fourth if you count my mom and I (who have no Pahari origins whatsoever, but mostly Punjabi/Indian/Sikh origins). Unlike other Punjabi Jatts, our family is pretty inbred in the last 3 generations and an unknown amount before that (though certain branches wouldn't have been inbred at all, with similar habits to Sikhs). When I included my mom and I together in one of my first runs, our European and South Asian both shot up. The software is extremely sensitive to genetic drift. You don't have to be as isolated as Kalash, any family/group with some recent inbreeding will drift off in their results compared to others of their ethnic/geographic background.

More factors:

1. We have some Pahari ancestry/links. My mom shows up as a distant (5th) cousin to my brother-in-law (HRP0402, the Gondal Jatt) and me and my dad have segments matching another Pahari-origin Jatt (HRP0283).
2. The Balham clan is supposed to be of South Punjabi (Multan) origin, though in recent history our relatives from that clan moved north to near Lahore and admixed with probably other Majhi/Majhail Jatts.
3. The Gills and Pansotas (and perhaps the Balhams) originally came from Indian Punjab (Doaba, the Northeast) and possibly even from Haryana before that (according to what my grandfather used to say).

That's a mixing of a wide variety of Jatts which probably isn't the case for most of the other Jatts who have been tested. They've either been Sikh (who are mostly constituted of inner parts of Majha/Doaba/Malwa, many Majha), Hindu (Haryana, one region), or Pahari-origin Muslim Jatts. We've been a mix of two or possibly all three of those and then inbred on top of that. The majority in recent ancestry have been Punjabi though.

Another thing I'd estimate is that S-Indian is more Caucasian-like than we usually estimate. Don't some S-Indian populations lack Gedrosian completely but still have Caucasian? And Gedrosian, even though it appears related to the Caucasus component, has gone through significant drift itself and is, phenotypically at least, of very little resemblance to the other cluster in the Caucasus. There is a bit of a pattern with S-Indian+Gedrosian being a decent estimate of total "South Asian" admixture (relative to what 23andMe's Ancestry Composition or FTDNA's Population Finder in some cases estimate) give or take a little.

I think three changes are required for any real research to come out of the ADMIXTURE approach:

1. The algorithm should run from low k (2) up to the k specified (n) and search for patterns in component flow.
2. In addition to, or instead of #1 (since that is making up for this shortcoming), make it utilize LD properly and better address genetic drift.
3. We need a much bigger reference dataset. Much higher SNP counts and more ancient DNA.

If we had all three, I'm betting we'd learn a lot of new things and I wouldn't be surprised if they began to paint an entirely new picture of human migratory behavior.
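
Point 1 in the list above can be approximated outside the software by comparing the results of consecutive runs. ADMIXTURE writes a Q matrix per run (one row per individual, one column per component), so a rough sketch is to correlate each component at K+1 with the components at K and see which one it most likely split from. The matrices below are invented toy values, not output from any run in this thread, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def columns(q):
    """Transpose a row-per-individual Q matrix into a list of component columns."""
    return [list(col) for col in zip(*q)]


def component_flow(q_low, q_high):
    """For each component in the higher-K run, return the index of the
    lower-K component it correlates with best across individuals."""
    low_cols = columns(q_low)
    flow = []
    for col in columns(q_high):
        scores = [pearson(col, low) for low in low_cols]
        flow.append(scores.index(max(scores)))
    return flow


# Toy Q matrices for four individuals (illustrative values only).
Q_K2 = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
Q_K3 = [[1.0, 0.0, 0.0], [0.5, 0.4, 0.1], [0.05, 0.05, 0.9], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # Lists, per K=3 component, the K=2 component it most resembles.
    print(component_flow(Q_K2, Q_K3))
```

Correlation across individuals is only one possible matching criterion; it says which components travel together, not why, so it would complement rather than replace the changes to the algorithm itself proposed above.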

Ignis90
03-06-2014, 12:47 PM
The problem with that is you have to have aDNA to prove that the Yoruba actually do have West Eurasian ancestry, we'd need ancient DNA from West Africa to prove that link. Otherwise it's just an unfalsifiable proposition. The West Eurasian signal could just imply that the closest Non-African populations to the Yoruba are West Eurasians, specifically Middle Eastern West Eurasians which makes sense geographically.

How do you know it's Eurasian ancestry, and not just ancestral African ancestry simply being redefined as Eurasian by ADMIXTURE, because it still persists in both Africans and Eurasians?

Pretty much everything at this point is speculation. I don't see why my proposition would be less credible than others.

I can give the example of the first two dimensions of a global PCA plot where almost all African populations are in a cline between African hunter-gatherers (particularly pygmies) and some West Eurasians, some Africans deviating slightly from this cline.

newtoboard
03-06-2014, 01:26 PM
The problem with that is you have to have aDNA to prove that the Yoruba actually do have West Eurasian ancestry, we'd need ancient DNA from West Africa to prove that link. Otherwise it's just an unfalsifiable proposition. The West Eurasian signal could just imply that the closest Non-African populations to the Yoruba are West Eurasians, specifically Middle Eastern West Eurasians which makes sense geographically.

How do you know it's Eurasian ancestry, and not just ancestral African ancestry simply being redefined as Eurasian by ADMIXTURE, because it still persists in both Africans and Eurasians?

Well, it's unlikely there is African ancestry in South Asia, where links with Africa in terms of uniparental markers (IMO M and N originated in Eurasia along with CT, and maybe even B in Arabia, but definitely CF) or autosomal markers (as shown by other admixture runs) seem to be absent.

newtoboard
03-06-2014, 01:28 PM
Pretty much everything at this point is speculation. I don't see why my proposition would be less credible than others.

I can give the example of the first two dimensions of a global PCA plot where almost all African populations are in a cline between African hunter-gatherers (particularly pygmies) and some West Eurasians, some Africans deviating slightly from this cline.

There is no aDNA showing that your hypothesis can be ruled out. So I agree that your idea is just as credible as other ideas. There would be no point in discussing anything on these forums if everything had to be verified by ancient DNA (which we may never obtain in some cases) so the best we can do is work it out using other ideas.

ZephyrousMandaru
03-06-2014, 03:57 PM
Pretty much everything at this point is speculation. I don't see why my proposition would be less credible than others.

I can give the example of the first two dimensions of a global PCA plot where almost all African populations are in a cline between African hunter-gatherers (particularly pygmies) and some West Eurasians, some Africans deviating slightly from this cline.

It may be possible, but the fact of the matter is that Africans have been around longer and have been genetically diversifying since before West Eurasians existed. If all genetic variation is essentially a subset of African variation, then is what's being defined here even "West Eurasian"? It might not actually be Eurasian at all, but simply an artifact from a time period that precedes the OoA migrations but still persists in both Eurasians and Africans.

Ignis90
03-06-2014, 09:37 PM
"West Eurasian" hasn't meant the same thing since the publication of Lazaridis et al. But if we apply the idea of "artifacts"/selection/African structure to the relative closeness of West Africans (with no clear North African ancestry) and "West Eurasians", then we can do the same for inter-Eurasian relationships too (and I'm open to that too).

Dr_McNinja
03-07-2014, 12:54 AM
I added some European individuals from that online database to the ~900k SNP dataset. I pruned it to 3 levels. The lowest (~850k) had 0.999x genotyping rate. The highest (~950k) had 0.990x genotyping rate. The differences are pretty significant even with that little of a change.
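
For readers unfamiliar with the trade-off, here is a minimal sketch of how a per-SNP missingness cutoff changes both the SNP count and the overall genotyping rate. It is not the tool or the thresholds actually used for these datasets; the data are toy values, `None` marks a missing call, and all names are made up for the example.

```python
def missing_rate(calls):
    """Fraction of missing genotype calls (None) for one SNP."""
    return sum(1 for c in calls if c is None) / len(calls)


def filter_snps(snps, max_missing):
    """Indices of SNPs whose missing-call fraction is at most max_missing."""
    return [i for i, calls in enumerate(snps) if missing_rate(calls) <= max_missing]


def genotyping_rate(snps, kept):
    """Fraction of non-missing calls across the kept SNPs."""
    total = sum(len(snps[i]) for i in kept)
    called = sum(1 for i in kept for c in snps[i] if c is not None)
    return called / total


# Toy data: each inner list is one SNP across four individuals.
SNPS = [
    [0, 1, 2, 1],
    [0, None, 2, 1],
    [None, None, 1, 0],
    [2, 2, 2, 2],
]

if __name__ == "__main__":
    for cutoff in (0.0, 0.25):
        kept = filter_snps(SNPS, cutoff)
        print(cutoff, len(kept), genotyping_rate(SNPS, kept))
```

A stricter cutoff keeps fewer SNPs at a higher genotyping rate, which is the same direction as the ~850k versus ~950k sets described above.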

(Euro = Western/Northern Europe, EEuro = Eastern/Northern Europe, EuroEastAsian = individual who looked half East Asian half European)

850k:


HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
KhetranBaloch 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 0 0 0
HRP0349 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 1 0 0 0 0 0 0 0
HRP0341 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 1 1 1 1 1 1 0 0 0 0 0 0 0
HRP0370 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 3 3 3 0 0 0 0 0 0 0 0 0 0
SouthAsian 6 6 6 6 6 6 6 6 6 6 5 5 5 5 5 5 5 4 4 4 3 3 3 2 2 0 0 0 0 0
Kurd 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 6 6 5 5 5 4 4 4 3 3 2 2 1 1 0
AfghanTajik 8 8 8 8 8 8 8 8 8 8 7 7 7 7 7 7 7 6 6 6 5 5 5 4 4 3 3 2 0 0
NEEuro1 9 9 9 9 9 9 9 9 9 9 8 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Turk1 10 10 10 10 10 10 10 10 10 10 9 9 9 9 9 9 9 8 8 8 7 4 4 3 3 2 2 1 1 0
Italian1 11 11 11 11 11 11 11 11 11 11 10 10 10 10 10 6 6 5 5 5 4 4 4 3 3 2 2 1 1 0
Kaundinya 12 12 12 12 12 12 12 12 12 12 11 11 11 11 11 10 10 9 9 9 8 7 3 2 2 0 0 0 0 0
Turk2 13 13 13 13 13 13 13 13 13 13 12 12 12 6 6 6 6 5 5 5 4 4 4 3 3 2 2 1 1 0
Zeph 14 14 14 14 14 14 14 14 14 14 13 13 13 12 12 11 11 10 8 8 7 4 4 3 3 2 2 1 1 0
SouthAsian2 15 15 15 15 15 15 15 15 15 15 14 14 14 13 13 12 3 1 1 1 1 1 1 0 0 0 0 0 0 0
EEuro2 16 16 16 16 16 16 16 16 16 16 15 15 15 14 14 13 12 11 10 5 4 4 4 3 3 2 2 1 1 0
EEuro3 17 17 17 17 17 17 17 17 17 16 15 15 15 14 14 13 12 11 10 5 4 4 4 3 3 2 2 1 1 0
EEuro4 18 18 18 18 18 18 18 18 18 17 16 16 15 14 14 13 12 11 10 5 4 4 4 3 3 2 2 1 1 0
EEuro5 19 19 19 18 18 18 18 18 18 17 16 16 15 14 14 13 12 11 10 5 4 4 4 3 3 2 2 1 1 0
Euro2 20 20 20 19 19 19 19 19 19 18 17 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro3 21 21 21 20 20 19 19 19 19 18 17 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro4 22 22 22 21 9 9 9 9 9 9 8 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro5 23 23 23 22 21 20 9 9 9 9 8 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro6 24 20 20 19 19 19 19 19 19 18 17 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro7 25 24 24 23 22 21 20 19 19 18 17 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro8 26 25 25 24 23 22 21 20 19 18 17 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro9 27 26 26 25 24 23 22 21 20 19 18 17 16 15 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
Euro11 28 27 20 19 19 19 19 19 19 18 17 8 8 8 8 8 8 7 7 7 6 6 6 5 5 4 2 1 1 0
EuroEastAsian1 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 4 3 3 2 0 0

900k:


HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
KhetranBaloch 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 0 0 0
HRP0349 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 0 0 0 0 0 0 0
HRP0341 4 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 1 1 1 1 1 0 0 0 0 0 0 0
HRP0370 5 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4 3 0 0 0 0 0 0 0 0 0 0 0
SouthAsian 6 6 6 6 6 6 6 6 6 6 6 5 5 5 5 5 5 5 4 3 3 3 0 0 0 0 0 0 0 0
Kurd 7 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 6 6 5 4 4 4 3 2 2 2 2 1 1 0
AfghanTajik 8 8 8 8 8 8 8 8 8 8 8 7 7 7 7 7 7 7 6 5 5 5 4 3 3 3 3 2 1 0
NEEuro1 9 9 9 9 9 9 9 9 9 9 9 8 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Turk1 10 10 10 10 10 10 10 10 10 10 10 9 9 9 9 9 9 9 8 7 7 4 3 2 2 2 2 1 1 0
Italian1 11 11 11 11 11 11 11 11 11 11 11 10 10 10 9 9 9 9 8 7 7 4 3 2 2 2 2 1 1 0
Kaundinya 12 12 12 12 12 12 12 12 12 12 12 11 11 11 10 10 10 10 9 8 8 7 6 5 5 0 0 0 0 0
Turk2 13 13 13 13 13 13 13 13 13 13 13 12 12 6 6 6 6 6 5 4 4 4 3 2 2 2 2 1 1 0
Zeph 14 14 14 14 14 14 14 14 14 14 14 13 13 12 11 11 11 9 8 7 7 4 3 2 2 2 2 1 1 0
SouthAsian2 15 15 15 15 15 15 15 15 15 15 15 14 14 13 12 12 3 3 1 1 1 1 1 0 0 0 0 0 0 0
EEuro2 16 16 16 16 16 16 16 16 16 16 16 15 15 14 13 13 12 11 10 9 4 4 3 2 2 2 2 1 1 0
EEuro3 17 17 17 17 17 17 17 17 16 16 16 15 15 14 13 13 12 11 10 9 4 4 3 2 2 2 2 1 1 0
EEuro4 18 18 18 18 18 18 18 18 17 17 16 15 15 14 13 13 12 11 10 9 4 4 3 2 2 2 2 1 1 0
EEuro5 19 19 19 18 18 18 18 18 17 17 16 15 15 14 13 13 12 11 10 9 4 4 3 2 2 2 2 1 1 0
Euro2 20 20 20 19 19 19 19 19 18 18 17 16 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro3 21 21 21 20 20 19 19 19 18 18 17 16 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro4 22 22 22 21 9 9 9 9 9 9 9 8 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro5 23 23 23 22 21 20 9 9 9 9 9 8 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro6 24 20 20 19 19 19 19 19 18 18 17 16 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro7 25 24 24 23 22 21 20 19 18 18 17 16 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro8 26 25 25 24 23 22 21 20 19 18 17 16 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro9 27 26 26 25 24 23 22 21 20 19 18 17 16 15 14 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
Euro11 28 27 20 19 19 19 19 19 18 18 17 16 8 8 8 8 8 8 7 6 6 6 5 4 4 4 2 1 1 0
EuroEastAsian1 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 3 3 3 2 1 0


956k:


HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
KhetranBaloch 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0
HRP0349 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 0
HRP0341 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 0
HRP0370 5 5 5 5 5 5 5 5 5 5 5 5 5 4 4 4 3 3 3 3 3 3 3 3 2 2 1 1 1 0
SouthAsian 6 6 6 6 6 6 6 6 6 6 6 6 6 5 5 5 4 4 4 4 4 4 4 1 1 1 0 0 0 0
Kurd 7 7 7 7 7 7 7 7 7 7 7 7 7 6 6 6 5 5 5 5 5 5 0 0 0 0 0 0 0 0
AfghanTajik 8 8 8 8 8 8 8 8 8 8 8 8 8 7 7 7 6 6 6 6 6 6 5 4 3 0 0 0 0 0
NEEuro1 9 9 9 9 9 9 9 9 9 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Turk1 10 10 10 10 10 10 10 10 10 10 10 10 10 9 9 9 8 8 8 8 8 8 7 6 5 4 3 1 1 0
Italian1 11 11 11 11 11 11 11 11 11 11 11 11 7 6 6 6 5 5 5 5 5 5 0 0 0 0 0 0 0 0
Kaundinya 12 12 12 12 12 12 12 12 12 12 12 12 11 10 10 10 9 9 9 4 4 4 4 1 1 1 0 0 0 0
Turk2 13 13 13 13 13 13 13 13 13 13 13 10 10 9 9 9 8 8 8 8 8 8 7 6 5 4 3 1 1 0
Zeph 14 14 14 14 14 14 14 14 14 14 14 13 12 11 11 11 10 8 8 8 8 8 7 6 5 4 3 1 1 0
SouthAsian2 15 15 15 15 15 15 15 15 15 15 15 14 13 12 12 12 11 10 2 2 2 2 2 2 2 2 1 1 1 0
EEuro2 16 16 16 16 16 16 16 16 11 11 11 11 7 6 6 6 5 5 5 5 5 5 0 0 0 0 0 0 0 0
EEuro3 17 17 17 17 17 17 17 17 16 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
EEuro4 18 18 18 18 18 18 18 18 17 16 16 15 14 13 11 11 10 8 8 8 8 8 7 6 5 4 3 1 1 0
EEuro5 19 19 19 19 18 18 18 18 17 16 16 15 14 13 11 11 10 8 8 8 8 8 7 6 5 4 3 1 1 0
Euro2 20 20 20 20 19 19 17 17 16 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Euro3 21 21 21 20 19 19 17 17 16 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Euro4 22 22 9 9 9 9 9 9 9 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Euro5 23 23 22 21 20 9 9 9 9 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Euro6 24 24 23 22 21 20 19 19 18 17 17 16 15 14 13 13 12 11 10 9 9 8 7 6 5 4 3 1 1 0
Euro7 25 20 20 20 19 19 17 17 16 9 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Euro8 26 25 24 23 22 21 20 19 18 17 17 16 15 14 13 13 12 11 10 9 9 8 7 6 5 4 3 1 1 0
Euro9 27 26 25 24 23 22 21 20 19 18 9 9 9 8 8 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
Euro11 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 8 7 7 7 7 7 7 6 5 4 3 2 2 0 0
EuroEastAsian1 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 6 6 5 4 3 0 0 0 0 0
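
One way to compare the three tables is to ask at which column two individuals first share a label. The sketch below assumes each column is one step of an agglomerative clustering (30 clusters on the left, one on the right) and that equal labels in a column mean the same cluster; the posts do not say what produced the tables, so that reading is an assumption. The four rows are copied from the 850k table, and the function names are hypothetical.

```python
# Four rows copied from the 850k table above.
TABLE_850K = """\
HRP0282 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AfghanPashtun 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
HRP0349 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 1 0 0 0 0 0 0 0
HRP0341 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 1 1 1 1 1 1 0 0 0 0 0 0 0
"""


def parse_table(text):
    """Map each individual's name to its list of integer labels, one per column."""
    table = {}
    for line in text.strip().splitlines():
        name, *labels = line.split()
        table[name] = [int(v) for v in labels]
    return table


def first_shared_step(table, a, b):
    """0-based index of the first column where a and b carry the same label,
    or None if they never do."""
    for step, (x, y) in enumerate(zip(table[a], table[b])):
        if x == y:
            return step
    return None


if __name__ == "__main__":
    rows = parse_table(TABLE_850K)
    print(first_shared_step(rows, "HRP0349", "HRP0341"))
    print(first_shared_step(rows, "AfghanPashtun", "HRP0349"))
```

Running the same function over the 900k and 956k tables would show how much the joining order moves with SNP count, which is the point the tables are making.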

Dr_McNinja
03-07-2014, 01:02 AM
In the admixture runs at k=3 and k=5, the South Asians basically got conflated with the East Asian half of the mixed individual:

https://docs.google.com/spreadsheets/d/1j-J4Q6Xf00HE6cmTXVJAPAskrWH10pinVx5z6lpbykE/edit#gid=1268616534

At 956k it's hard to tell what's going on. At 850k it seems like a Mideast/Caucasus/Southeastern-Europe type of component is discernible with the Italian, Turks, the Kurd and Zeph, which the Pashtun have some of.

Dr_McNinja
03-07-2014, 01:10 AM
^ HRP0370 is one of the few consistently overlapping ones, showing ~60% East Asian and ~30% of another component in all the runs. Compare with my East/West Eurasian runs from the other day where I was coming out with ~50% South Asian and 20-30% Caucasian for them.

Dr_McNinja
03-07-2014, 01:32 AM
SouthAsian2's results were surprising considering their Dodecad K12b results were:

South Asian: 46.64
Gedrosia: 36.04
N-Euro: 5.69
AtlanticMed: 3.63
Caucasus: 4.13
EastAsian: 1.59
SW-Asian: 0.99
SE-Asian: 1.04
NW-African: 0.25

So either their non-South Indian half or their South Indian half is being picked up as Mideast/Caucasian-type. That's really interesting. I wonder if it's because it's easier for ADMIXTURE to make the distinction when it's split like that? It seems to have an easier time with some people (like HRP0370) and gets confused by others (mostly NW Indians/Pakistanis who it grouped as entirely East and South Asian here).

EDIT: EuroEastAsian1's K12b:

NorthEuropean: 34.92
Siberian: 12.42
Gedrosia: 9.52
SE-Asian: 1.41
AtlanticMed: 17.72
SouthAsian: 3.74
E-African: 0.53
SW-Asian: 0.04
EastAsian: 12.86
Caucasus: 4.77
SSA: 1.68

Hanna
03-07-2014, 02:37 AM
In the admixture runs at k=3 and k=5, the South Asians basically got conflated with the East Asian half of the mixed individual:

https://docs.google.com/spreadsheets/d/1j-J4Q6Xf00HE6cmTXVJAPAskrWH10pinVx5z6lpbykE/edit#gid=1268616534

At 956k it's hard to tell what's going on. At 850k it seems like a Mideast/Caucasus/Southeastern-Europe type of component is discernible with the Italian, Turks, the Kurd and Zeph, which the Pashtun have some of.

850k at K5 I am 100% Euro2 and I did not score Caucasus. Why is that?

soulblighter
03-07-2014, 02:53 AM
South Asian is a little like me but not exactly... I wonder what their ancestry is. By the way, did my FTDNA dataset not work for the run?

My K12b

Population
Gedrosia 35.81%
Siberian -
Northwest_African 0.07%
Southeast_Asian -
Atlantic_Med 0.48%
North_European 6.24%
South_Asian 49.74%
East_African -
Southwest_Asian 1.24%
East_Asian 2.49%
Caucasus 3.92%
Sub_Saharan -

Dr_McNinja
03-07-2014, 02:56 AM
850k at K5 I am 100% Euro2 and I did not score Caucasus. Why is that?

It was one of 3 Euro-type components so it might've been something near Mediterranean.

Dr_McNinja
03-07-2014, 02:57 AM
South Asian is a little like me but not exactly... I wonder what their ancestry is. By the way, did my FTDNA dataset not work for the run?

My K12b

Population
Gedrosia 35.81%
Siberian -
Northwest_African 0.07%
Southeast_Asian -
Atlantic_Med 0.48%
North_European 6.24%
South_Asian 49.74%
East_African -
Southwest_Asian 1.24%
East_Asian 2.49%
Caucasus 3.92%
Sub_Saharan -

This was only the ~960k 23andMe V3 data files. I made that dataset just to see how admixture does with that many SNPs.

The curious thing about SouthAsian2 is the high AtlanticMed. That's the highest ratio of AtlanticMed to NE-Euro I've yet seen. I wonder if a lot of their NE-Euro hasn't bled off into S-Indian/East Eurasian.

Hanna
03-07-2014, 03:01 AM
It was one of 3 Euro-type components so it might've been something near Mediterranean.

I would have expected the Italian to score it in that case.
The Italian scored 100% Caucasus.

Dr_McNinja
03-07-2014, 08:08 AM
I would have expected the Italian to score it in that case.
The Italian scored 100% Caucasus.

That isn't actually Caucasus, just what I thought is Caucasus. It's also peaking in the Kurd and Zeph, so it could be something near it or a different type of Caucasus (South Caucasus perhaps?). It is odd because you usually wind up near the Italian if not the other Turk. But the other Turk and the EuroEastAsian individual have a little of that component.

ZephyrousMandaru
03-07-2014, 08:13 AM
That isn't actually Caucasus, just what I thought is Caucasus. It's also peaking in the Kurd and Zeph, so it could be something near it or a different type of Caucasus (South Caucasus perhaps?). It is odd because you usually wind up near the Italian if not the other Turk. But the other Turk and the EuroEastAsian individual have a little of that component.

Do you think it might be more of a generalized Near Eastern cluster, as opposed to something specific such as Caucasus or Mediterranean or Southwest Asian?

Dr_McNinja
03-07-2014, 09:35 AM
Do you think it might be more of a generalized Near Eastern cluster, as opposed to something specific such as Caucasus or Mediterranean or Southwest Asian?

That's possible, but the Europeans it's also peaking in are mostly NE-Euro/AtlanticMed.

Hanna
03-07-2014, 03:47 PM
That isn't actually Caucasus, just what I thought is Caucasus. It's also peaking in the Kurd and Zeph, so it could be something near it or a different type of Caucasus (South Caucasus perhaps?). It is odd because you usually wind up near the Italian if not the other Turk. But the other Turk and the EuroEastAsian individual have a little of that component.

Apart from me EEuro5 and Euro4 also scored 100% on that component. Do you know their ethnicity?

Dr_McNinja
03-07-2014, 03:58 PM
Apart from me EEuro5 and Euro4 also scored 100% on that component. Do you know their ethnicity?

All the Euro individuals had high (usually 40+%) NE-Euro and high (30+%) Atlantic-Med. The EEuro folks had lower NE-Euro and Caucasian in the 20-30+% range.

But those admixture calculators won't have the exact same results. They're using less than 200k SNPs and these runs are using almost a million.

MfA
03-07-2014, 04:03 PM
Dr_McNinja and everest59 do you have any plans to make Admixture runs for X Chromosome?

everest59
03-07-2014, 08:03 PM
Dr_McNinja and everest59 do you have any plans to make Admixture runs for X Chromosome?

The regular run excludes X chromosome? I always thought they were all included.

Dr_McNinja
03-07-2014, 08:05 PM
Dr_McNinja and everest59 do you have any plans to make Admixture runs for X Chromosome?

All chromosomal data is included in my runs, I haven't pruned out any.
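One way to check whether X-chromosome markers really are in a dataset is to count markers per chromosome code in the .bim file. The following is only a sketch: it relies on the standard .bim layout (chromosome code in the first column) and on PLINK's numeric codes 23 to 26 for X, Y, XY and MT, and the three marker rows are made up for illustration.

```python
from collections import Counter

# PLINK numeric codes for the non-autosomal chromosomes.
SPECIAL_CODES = {"23": "X", "24": "Y", "25": "XY", "26": "MT"}

def count_markers_by_chromosome(bim_lines):
    """Count markers per chromosome from the lines of a PLINK .bim file."""
    counts = Counter()
    for line in bim_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom = fields[0]  # column 1 of a .bim row is the chromosome code
        counts[SPECIAL_CODES.get(chrom, chrom)] += 1
    return counts

# Hypothetical three-marker example; real use would pass an open .bim file.
example = [
    "1 rs0000001 0 752566 A G",
    "23 rs0000002 0 2700157 C T",
    "23 rs0000003 0 2800000 G A",
]
example_counts = count_markers_by_chromosome(example)
print(example_counts)
```

A non-zero count under "X" would confirm that the run was not autosomal-only.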

MfA
03-07-2014, 10:08 PM
The regular run excludes X chromosome? I always thought they were all included.

I've never seen GEDmatch or Dodecad/Harappa runs that include the X chromosome. Only Davidski did a special run for the X chromosome once, and it was never released other than for commercial purposes.

Dr_McNinja
03-08-2014, 01:15 PM
http://www.anthrogenica.com/showthread.php?2287-Furutrad-DNA-tools-Chromosome-2D-visualizer-and-comparison-software

This guy's program phases "child" kits as a cross between two autosomal kits which gave me an idea. I made crosses of myself with a bunch of individuals:

With HRP0370, it's very close to an average of all components except Gedrosia/Baloch which is slightly higher by around 1% than both of us, but since we're both near 35%, we can let that slide.

With HRP0282, it's also very close to averages of all components.

With HRP0341, who clusters very close to HRP0282 in all the PCA plots I've seen but is a Jatt who I match pretty well on 23andMe (75.95%), it comes out different. My Gedrosia/Baloch jumps to 39%, almost as high as HRP0341 themselves. My West Eurasian doesn't come out as a perfect average, it comes out closer to them (HRP0341 - 81.42%, Me: 76.16%, Cross: 79.31%). I figure this additional Gedrosian came from Caucasian and Siberian which both dropped, although the overall Arctic stayed close to HRP0341, it was not an average of the two kits. I ran it through Eurogenes K36 and in there the South Asian/South Central Asian did come out as averages but my West European is still all there. HRP0341 had ~2.5 North Sea as their only West European and I had around 6 overall, with most of that being North Sea. In the cross it came out as French 0.11%, North_Atlantic 0.69%, North_Sea 5.05% which is pretty much the same as what I already had. Overall didn't change much (we were 11 and 10 respectively, it was between that). I ran two crosses with HRP0341 with pretty much the same results (only did one each of the other two, they did as expected the first time).
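The "is the cross a plain average?" check can be written out as a small calculation. This is just a sketch using the West Eurasian percentages quoted above; the function name is made up for illustration.

```python
def midparent_deviation(parent_a, parent_b, cross):
    """Return the midparent value and how far the cross departs from it."""
    midparent = (parent_a + parent_b) / 2
    return midparent, cross - midparent

# West Eurasian totals from the post: HRP0341, me, and our cross.
mid, dev = midparent_deviation(81.42, 76.16, 79.31)
print(f"midparent {mid:.2f}%, cross deviates by {dev:+.2f} points")
```

The midpoint is 78.79%, so the cross sits about half a point above it, on the HRP0341 side.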

I added the HRP0282 and HRP0341 crosses here temporarily for comparison: https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFMtRHVlZDBuQ3lMcjhxMDE4V3JoYlE&usp=drive_web#gid=6

I figure that indicates ADMIXTURE is confusing our Gedrosian, Caucasian, Arctic, and European components a bit. It seems as if HRP0341 has a bunch of West European hidden there and I have a bunch of Gedrosian hidden there, or at least hidden to ADMIXTURE. That makes me think most of the Jatts are probably extremely similar, even among the divergent groups (Haryana, Punjabi, Pahari), with similar Gedrosian-heavy results, a general European component which matches across the entire northern continent (west and east), and a bunch of East Eurasian which confuses the software, since all those other groups (Caucasian, Gedrosian, European, South Indian) have some East Eurasian admixture. ADMIXTURE is probably underestimating Gedrosian and overestimating European because the additional East Eurasian admixture from being further in towards India is amplifying the NE-Euro signal, when the European component is not NE-Euro-like at all; it's more like general Indo-European (i.e., Germanic). That makes sense because we should have an Indo-European link, but not a Uralic-Finnic link.

Dr_McNinja
03-08-2014, 01:42 PM
I did a cross with myself, then ran it in Dodecad K12b (easier to run that one than Harappa since it's in DIYDodecadWrapper). Again a little bit of Siberian disappears, about half a percent, along with roughly half my AtlanticMed, and it all goes into North European, Gedrosian, and SW-Asian.

EDIT: Looking at all these unpredictable component jumps/flows with the crosses (which do not happen when mixing recently unrelated populations), I would say Jatts would probably be closer to 25% S-Indian, 55-60% Gedrosian, 5-10% Caucasian, 5-10% AtlanticMed. The latter is probably left over from an Indo-European migration between 3000 and 2000 years ago by a people similar to the Alans. Dr. McDonald's plots pegged my European in Northwestern France and HRP0370's as on the Western/Southern fringes of the Balkans. In K36 HRP0370 had some Iberian, and our cross wound up with some French (and my mom also had some French). These were the areas the Alans settled. However, it's also quite possible there is no direct link to the Alans and instead there's a link to a common ancestor of more Germanic origin and from an even earlier migration (between 10,000 and 4,000 years ago). The Getae are one option but they didn't have the spread to Northern/Western Europe the Germanic tribes and the Alans did. And we know for a fact the Scythians were in Northwestern India around 2000 years ago. Whatever the case, the Scythians, Sarmatians, Getae, and Germanic peoples all probably had the same common proto-Indo-European ancestor from the Pontic-Caspian steppe. Their descendants are now predominantly in Western and Northwestern Europe.

The reason I bring up HRP0370 is that the Durrani Pashtun are mostly L1c-M357. My opinion is that this is a relic of that area's various Indian-origin dynasties (like the Shahis but perhaps even earlier in periods not well recorded in history). I've found a similar mtDNA haplogroup (J1b*) in a Kamboj, and they also were one of the early peoples to have control over that region. I don't think the West European admixture can be an illusion (like the Eastern European admixture); there's nothing else it can come out of. So it's more or less the luck of the draw as to which Pashtun it appears in, but it's persisted in small amounts in some old Indian populations who practiced strict endogamy.

Dr_McNinja
03-09-2014, 10:28 AM
I don't know what just happened:


./plink --noweb --bfile TempD31E_pruned900k --read-genome TempD31_900k_IBD.genome --segment

@----------------------------------------------------------@
| PLINK! | v1.07 | 10/Aug/2009 |
|----------------------------------------------------------|
| (C) 2009 Shaun Purcell, GNU General Public License, v2 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://pngu.mgh.harvard.edu/purcell/plink/ |
@----------------------------------------------------------@

Skipping web check... [ --noweb ]
Writing this text to log file [ plink.log ]
Analysis started: Sun Mar 9 01:52:48 2014

Options in effect:
--noweb
--bfile TempD31E_pruned900k
--read-genome TempD31_900k_IBD.genome
--segment

Reading map (extended format) from [ TempD31E_pruned900k.bim ]
905730 markers to be included from [ TempD31E_pruned900k.bim ]
Reading pedigree information from [ TempD31E_pruned900k.fam ]
30 individuals read from [ TempD31E_pruned900k.fam ]
0 individuals with nonmissing phenotypes
Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
Missing phenotype value is also -9
0 cases, 0 controls and 30 missing
10 males, 2 females, and 18 of unspecified sex
Warning, found 18 individuals with ambiguous sex codes
These individuals will be set to missing ( or use --allow-no-sex )
Writing list of these individuals to [ plink.nosex ]
Reading genotype bitfile from [ TempD31E_pruned900k.bed ]
Detected that binary PED file is v1.00 SNP-major mode
Before frequency and genotyping pruning, there are 905730 SNPs
29 founders and 1 non-founders found
1003 heterozygous haploid genotypes; set to missing
Writing list of heterozygous haploid genotypes to [ plink.hh ]
Total genotyping rate in remaining individuals is 0.998034
0 SNPs failed missingness test ( GENO > 1 )
0 SNPs failed frequency test ( MAF < 0 )
After frequency and genotyping pruning, there are 905730 SNPs
After filtering, 0 cases, 0 controls and 30 missing
After filtering, 10 males, 2 females, and 18 of unspecified sex
30 nuclear families, 29 founder singletons found
0 non-founders with 2 parents in 0 nuclear families
1 non-founders without 2 parents in 1 nuclear families
0 affected offspring trios
0 phenotypically discordant parent pairs found
Added 1 dummy parents
Converting data to Individual-major format
Reading genome-wide IBD estimates from [ TempD31_900k_IBD.genome ]

340 pairs are informative ( 0.0025 <= pihat <= 1 )
Using multipoint analysis: step = 2 and fringe = 0.01 cM

Scanning from autosomes from chromosome 1 to MT

Writing IBD-segment information to [ plink.segment ]
Writing IBD-segment summary to [ plink.segment.summary ]

Minimum segment length is 1000 kb and 100 SNPs
Segment thresholds are 0.25 and 0.25
Maximum intra-segment inter-SNP distance is 1000
Scan region on chromosome 1 from [ rs4477212 ] to [ rs12746903 ]
72870 markers in this scan
Multipoint map has 145743 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 2 from [ rs10195681 ] to [ rs12478296 ]
73317 markers in this scan
Multipoint map has 146637 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 3 from [ rs13060385 ] to [ rs10433653 ]
59915 markers in this scan
Multipoint map has 119833 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 4 from [ rs13125929 ] to [ rs3903261 ]
51956 markers in this scan
Multipoint map has 103915 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 5 from [ rs10076494 ] to [ rs6894609 ]
53056 markers in this scan
Multipoint map has 106115 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 6 from [ rs7774353 ] to [ rs12530134 ]
59576 markers in this scan
Multipoint map has 119155 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 7 from [ rs7456436 ] to [ rs1124425 ]
48036 markers in this scan
Multipoint map has 96075 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 8 from [ rs11780869 ] to [ rs6599566 ]
46545 markers in this scan
Multipoint map has 93093 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 9 from [ rs10814410 ] to [ rs9314655 ]
40613 markers in this scan
Multipoint map has 81229 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 10 from [ rs7909677 ] to [ rs9645536 ]
47646 markers in this scan
Multipoint map has 95295 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 11 from [ rs3802985 ] to [ rs11224228 ]
45352 markers in this scan
Multipoint map has 90707 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 12 from [ rs11063263 ] to [ rs11147298 ]
44545 markers in this scan
Multipoint map has 89093 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 13 from [ rs2762261 ] to [ rs17067959 ]
34113 markers in this scan
Multipoint map has 68229 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 14 from [ rs28699618 ] to [ rs10149476 ]
29172 markers in this scan
Multipoint map has 58347 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 15 from [ i5900373 ] to [ rs4098905 ]
26906 markers in this scan
Multipoint map has 53815 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 16 from [ rs2541696 ] to [ rs4785775 ]
28534 markers in this scan
Multipoint map has 57071 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 17 from [ rs6565703 ] to [ rs7502442 ]
25179 markers in this scan
Multipoint map has 50361 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 18 from [ rs8096071 ] to [ rs12960632 ]
26513 markers in this scan
Multipoint map has 53029 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 19 from [ rs8100066 ] to [ rs10411093 ]
17103 markers in this scan
Multipoint map has 34209 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 20 from [ rs6139074 ] to [ rs6062363 ]
22597 markers in this scan
Multipoint map has 45197 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 21 from [ rs28971224 ] to [ rs9637231 ]
12609 markers in this scan
Multipoint map has 25221 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis

Scan region on chromosome 22 from [ rs12157537 ] to [ rs5771007 ]
13150 markers in this scan
Multipoint map has 26303 positions
IBD calculation: 340 of 340
30 unique, informative individuals in analysis


ERROR: --chr {chromosome} not found:25

Results: http://imgur.com/a/c8RgM#0

wtf? Is it because I didn't prune that much? I thought maybe it was recognizing missing SNPs or something but the genotyping rate is fine. And why would it keep using me of all people? I'm not the first or last one in the file or anything either, I'm 4th. Is it because I'm the one non-founder?

EDIT: Relevant: http://pngu.mgh.harvard.edu/~purcell/plink/ibdibs.shtml


FID1 Family ID of first individual
IID1 Individual ID of first individual
FID2 Family ID of second individual
IID2 Individual ID of second individual
PHE Phenotype concordance: -1,0,1
CHR Chromosome code
BP1 Start physical position of segment (bp)
BP2 End physical position of segment (bp)
SNP1 Start SNP of segment
SNP2 End SNP of segment
NSNP Number of SNPs in this segment
KB Physical length of segment (kb)
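Given those columns, the plink.segment output can be summarised per pair to see who is sharing with whom and by how much. The sketch below assumes the file is whitespace-delimited with a header row that starts with FID1; the sample rows, IDs and lengths are invented for illustration and are not from my run.

```python
from collections import defaultdict

# Column order as listed in the PLINK documentation excerpt above.
COLUMNS = ["FID1", "IID1", "FID2", "IID2", "PHE", "CHR",
           "BP1", "BP2", "SNP1", "SNP2", "NSNP", "KB"]

def total_shared_kb(segment_lines):
    """Sum the KB column for each (IID1, IID2) pair."""
    totals = defaultdict(float)
    for line in segment_lines:
        fields = line.split()
        # Skip blank lines, malformed rows, and the header row.
        if len(fields) != len(COLUMNS) or fields[0] == "FID1":
            continue
        row = dict(zip(COLUMNS, fields))
        totals[(row["IID1"], row["IID2"])] += float(row["KB"])
    return dict(totals)

# Hypothetical rows; real use would pass open("plink.segment").
sample = [
    " FID1 IID1 FID2 IID2 PHE CHR BP1 BP2 SNP1 SNP2 NSNP KB",
    "F1 A F2 B -1 1 1000000 2500000 rs1 rs2 150 1500.0",
    "F1 A F2 B -1 2 3000000 4200000 rs3 rs4 130 1200.0",
    "F1 A F3 C -1 1 1000000 2100000 rs1 rs5 110 1100.0",
]
shared = total_shared_kb(sample)
for pair, kb in sorted(shared.items()):
    print(pair, kb)
```

If one individual shows up in nearly every pair, as I do here, sorting by the totals makes that easy to spot.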

Dr_McNinja
03-09-2014, 11:09 PM
^ I rephased from the crosses. It gave pretty close results to the original Harappa results for HRP0370 which was surprising because it usually mangles the provided parent kit #. Mine came back as the following:

From HRP0370:

S-Indian 30.60%
Baloch 32.47%
Caucasian 10.28%
NE-Euro 10.77%
SE-Asian 0.67%
Siberian 1.62%
NE-Asian -
Papuan 1.27%
American 1.26%
Beringian 1.81%
Mediterranean 3.00%
SW-Asian 2.65%
San -
E-African 2.07%
Pygmy 1.53%
W-African -

From HRP0341:

S-Indian 31.82%
Baloch 33.39%
Caucasian 9.68%
NE-Euro 10.79%
SE-Asian 0.76%
Siberian 1.35%
NE-Asian -
Papuan 1.24%
American 1.01%
Beringian 1.86%
Mediterranean 2.10%
SW-Asian 2.44%
San -
E-African 2.29%
Pygmy 1.28%
W-African -

From HRP0282:

S-Indian 31.20%
Baloch 32.65%
Caucasian 10.69%
NE-Euro 10.43%
SE-Asian 0.48%
Siberian 1.49%
NE-Asian 0.83%
Papuan 1.13%
American 1.14%
Beringian 1.69%
Mediterranean 2.59%
SW-Asian 2.17%
San -
E-African 1.83%
Pygmy 1.44%
W-African 0.23%

Dr_McNinja
03-10-2014, 12:10 AM
Test run in unsupervised mode with Mal'ta and my 23andMe V3 dataset. It only overlapped with about 230k SNPs from 23andMe V3 and a little over 100k with the all-types one (V3, V4, FTDNA, etc). This was run at about 200k SNPs:

https://docs.google.com/spreadsheets/d/19jOgzU4BkgAVKc_oc1l6HCv6tg77AcSXhIPi98ETkXs/edit#gid=515528888

At k=4 it started to become uninterpretable. Kaundinya and HRP0370 split off into their own cluster and chaos followed at higher Ks, but Mal'ta stayed by themselves and only EuroEastAsian1 had any appreciable amount of it.

Thanks to everest for sending me the Mal'ta and LaBrana data.

Dr_McNinja
03-10-2014, 12:22 AM
I reran the --segment analysis at 200k SNPs on the dataset used in the last post, and I came out as sharing like before, but this time just with some of the South Asians and the EuroEastAsian1 person. In the unpruned dataset at ~900k it was with everyone. Still not sure what's happening there...

Dr_McNinja
03-10-2014, 12:40 AM
LaBrana only overlapped with Mal'ta and 23andMe by about 80k SNPs so I ran it with just LaBrana and 23andMe at 215k SNPs and it came out similar to the above run with European, LaBrana, and Asian components. EuroEastAsian1 had 2.8% LaBrana and the Europeans generally had 1.7-2.0% LaBrana.
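The overlap figures come down to intersecting the marker IDs of the datasets. A minimal sketch of that count, assuming each dataset is a PLINK .bim file with the marker ID in the second column; the rows and IDs below are hypothetical, and matching on ID alone does not check positions or strand.

```python
def snp_ids(bim_lines):
    """Collect marker IDs (column 2) from the lines of a .bim file."""
    return {line.split()[1] for line in bim_lines if line.strip()}

def shared_snps(*datasets):
    """Return the marker IDs present in every dataset."""
    id_sets = [snp_ids(lines) for lines in datasets]
    return set.intersection(*id_sets)

# Hypothetical miniature .bim contents for three datasets.
ancient_a = ["1 rs1 0 100 A G", "1 rs2 0 200 C T", "2 rs3 0 300 G A"]
ancient_b = ["1 rs2 0 200 C T", "2 rs3 0 300 G A", "2 rs4 0 400 T C"]
chip_data = ["1 rs1 0 100 A G", "2 rs3 0 300 G A", "2 rs4 0 400 T C"]

print(len(shared_snps(ancient_a, chip_data)))             # two-way overlap
print(len(shared_snps(ancient_a, ancient_b, chip_data)))  # three-way overlap
```

Each dataset added to the intersection can only shrink it, which is why the three-way LaBrana/Mal'ta/23andMe overlap is so much smaller than either two-way one.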

Sein
03-10-2014, 12:45 AM
If possible, could you try a supervised run?

Dr_McNinja
03-10-2014, 02:08 AM
I'll try that. What components?

Sein
03-10-2014, 02:13 AM
If you think this could work, perhaps try MA-1, the Han Chinese, South Bedouins, and the Mbuti Pygmies? And if you'll include Europeans, could you also try La Brana as an extra component? Basically, a repeat of David's experiment, the only difference being Mbuti Pygmies, rather than the Yoruba. I believe David is also going to try this, so it'll be nice to compare results. Thank you in advance.

Dr_McNinja
03-10-2014, 02:18 AM
Don't know why it didn't occur to me earlier to try the previous runs with a West Asian (Caucasus+Gedrosian) component.

https://docs.google.com/spreadsheets/d/1DXKFpaGMm8gSIOM67Rj4LXLXhCLs8yoeWmAQYJgP7ag/edit#gid=1813679910

Two runs, one with West Asian, another with West Asian and North Indian (Kshatriya + Uttar Pradesh Brahmin).
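For anyone repeating this, ADMIXTURE's supervised mode reads a .pop file with the same prefix as the .bed, holding one line per individual in .fam order: a population label for reference samples and "-" for everyone to be estimated. Below is a sketch of building those labels, where merging two reference groups into one component is just a matter of giving them the same label. The sample IDs and the West Asian reference are hypothetical.

```python
def build_pop_labels(fam_lines, reference_groups):
    """Return one .pop label per .fam row: a component name for reference
    samples, "-" for individuals whose ancestry is to be estimated."""
    labels = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank rows
        individual_id = fields[1]  # .fam columns: FID IID PID MID SEX PHENO
        labels.append(reference_groups.get(individual_id, "-"))
    return labels

# Hypothetical sample IDs; the component names follow the runs above.
reference_groups = {
    "KSH01": "North_Indian",  # Kshatriya reference
    "UPB01": "North_Indian",  # Uttar Pradesh Brahmin reference
    "GEO01": "West_Asian",    # made-up West Asian reference sample
}
fam_lines = [
    "Kshatriya KSH01 0 0 0 -9",
    "UP_Brahmin UPB01 0 0 0 -9",
    "Georgian GEO01 0 0 0 -9",
    "Project HRP0349 0 0 1 -9",
]
pop_labels = build_pop_labels(fam_lines, reference_groups)
print("\n".join(pop_labels))  # this text would be saved as <prefix>.pop
```

The run itself is then started with the --supervised flag, with K normally set to the number of distinct labels.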

Notice how SE-Asian takes over for S-Indian in the first run but there's no Arctic admixture thrown off. With North Indian suddenly European goes up a little and there is Arctic admixture. I would say SibPaleo is actually from ASI while Amerindian is from ANE. SibUralic, not sure, it's raised in soulblighter and Kaundinya.

Even more interesting is how some Europeans are averaging 2% or more North Indian. The Russians and Mordovians especially. The Ukrainians not so much. Probably a sign of that Uralic<->ASI connection responsible for the South Asian affinity for NE-Euro. The Russians and Mordovians were throwing off some SE-Asian until I put in North Indian so it's clearly an ASI-like affinity. The only way for ASI to have gotten up there is via ANE.

Dr_McNinja
03-10-2014, 03:57 AM
Did one with Southeast Indian Tribals (Gadaba, Bonda, Juang) in addition to North Indian and West Asian:

https://docs.google.com/spreadsheets/d/1H4oU8bolUCsO2h_PQ0itUjURvIz5G3sMUH3U_1aUxgo/edit#gid=2095976244

Something interesting happens.

First it appears that adding a North Indian component drew off all of the Southeast Asian plus a big chunk of West Asian. Then it put all but a small portion of that into North Indian and put the remainder into European, Arctic, and East Asian. (EDIT: I added another sheet showing that for us South Asians, it matches up exactly)

Then adding the SE-Indian-Tribal component made everyone lose some more West Asian to North Indian. Except a few people who lose West Asian to SE-Indian-Tribal. Either way, some West Asian jumps into an Indian component, even the Europeans. I just checked it out and it's West Asian, East Asian, and Arctic decreasing across the board and Indian (both components), SE-Asian, and European gaining.
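Tracking where a component "went" between two runs is just a per-component subtraction, and since each run sums to 100% the shifts should sum to zero. A sketch with invented percentages (not taken from the spreadsheet), where the second run adds a North Indian component:

```python
def component_shift(before, after):
    """Per-component change in percentage points between two runs."""
    components = sorted(set(before) | set(after))
    return {c: round(after.get(c, 0.0) - before.get(c, 0.0), 2)
            for c in components}

# Made-up results for one individual, for illustration only.
run_without_ni = {"West_Asian": 55.0, "SE_Asian": 20.0, "European": 20.0,
                  "East_Asian": 3.0, "Arctic": 2.0}
run_with_ni = {"West_Asian": 40.0, "North_Indian": 32.0, "European": 21.5,
               "East_Asian": 3.5, "Arctic": 3.0}

shift = component_shift(run_without_ni, run_with_ni)
for name, delta in shift.items():
    print(f"{name:>12}: {delta:+.2f}")
```

In this toy case SE-Asian and part of West Asian drain away, most of it lands in North Indian, and the small remainder is spread over European, Arctic, and East Asian.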

There is some kind of affinity between Indian/South Asian and European (represented by French Basque) that's showing up here. Possibly SE-Asian as well. It's some kind of connection that skips over West Asia entirely. As people became East-shifted, European continued to increase. Either that or ADMIXTURE is playing games and increasing European to offset the East-shift.

Dr_McNinja
03-10-2014, 01:10 PM
If you think this could work, perhaps try MA-1, the Han Chinese, South Bedouins, and the Mbuti Pygmies? And if you'll include Europeans, could you also try La Brana as an extra component? Basically, a repeat of David's experiment, the only difference being Mbuti Pygmies, rather than the Yoruba. I believe David is also going to try this, so it'll be nice to compare results. Thank you in advance.

How about Biaka and Mbuti Pygmies? Also, I only have Japanese in my last dataset, I'm gonna try it with them first (since adding in the Han would take a bit of time, I already have these three in there).

Dr_McNinja
03-10-2014, 01:23 PM
Came out as I expected, barely any Mal'ta for anyone.

https://docs.google.com/spreadsheets/d/1qcxlGBcecztWm0PBSwbr--fplp_AmMefM2wOmdP6qcE/edit#gid=1434504802

I don't know what David is doing in his runs. Is it because I have too many other populations in there? Should I remove all non-component people except us? Is it the Japanese instead of Han Chinese? I figure JPT are a better component for East Asian in every other situation.

(EDIT: This was around ~110k SNPs)

EDIT #2: The Native Americans are coming out near 100% Mal'ta. Should I remove them?

Dr_McNinja
03-10-2014, 01:56 PM
Removing the Native Americans fixed it:

https://docs.google.com/spreadsheets/d/1KiwzhNnJsWB2Ojr-lXfj0FbazEBjCzbsp0cbTkKl4Xw/edit#gid=769782428

Dr_McNinja
03-10-2014, 02:00 PM
I should remove Papuan and do it again, gimme a minute

Dr_McNinja
03-10-2014, 02:24 PM
Well that kind of made it worse, now it's up to 90% Mal'ta in Pulliyar. Guess I have to remove most of the Indians.
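Dropping whole populations between runs can be scripted by writing a list of FID/IID pairs, which is the format PLINK's --remove option takes. This sketch assumes the family ID column of the .fam holds the population label, which may not be true for every dataset; the population names and sample IDs are only examples.

```python
def build_remove_list(fam_lines, populations_to_drop):
    """Return "FID IID" rows for every individual in a dropped population."""
    rows = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank rows
        family_id, individual_id = fields[0], fields[1]
        # Assumption: the FID column carries the population label.
        if family_id in populations_to_drop:
            rows.append(f"{family_id} {individual_id}")
    return rows

# Hypothetical .fam rows for illustration.
fam_lines = [
    "Karitiana K01 0 0 0 -9",
    "Papuan P01 0 0 0 -9",
    "Sindhi S01 0 0 0 -9",
    "Karitiana K02 0 0 0 -9",
]
remove_rows = build_remove_list(fam_lines, {"Karitiana", "Papuan"})
print("\n".join(remove_rows))  # save as a text file and pass it to --remove
```

Keeping one such list per experiment makes it easier to see which populations were excluded from which run.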