Looking for ASI

parasar
05-28-2015, 03:02 AM
After K9 and K10, everyone feel free to chime in on what you'd like to see. Personally, I'm interested to see how things behave if I throw in the rest of the South Asians. Especially, the K7. It might get rid of that split between the three SE Asians, or at least clean it up.

Chad,

The Ongee do not have a Papuan component, but the Papuans do have an Ongee-like component (based on Denisovan and f-statistics). So what you are seeing in K5 and K6 reflects this Ongee-Papuan relationship, but in reverse.

http://2.bp.blogspot.com/-_-GtLtBKUCo/UE8yPk8lDcI/AAAAAAAAAeo/UFx4610gTAI/s1600/ReichOceania.gif

Chad Rohlfsen
05-28-2015, 03:12 AM
I can try to do a supervised run with just an Onge component, just to see what happens. I was thinking the Papuan reflected something Onge-like with some Denisovan and maybe Austronesian. I'll play around with that later. Right now, I am doing a K7 run with 1800 samples, including about 400 more East and South Eurasians. I'll see if it comes out the same. Something interesting in the K10: there is no Near Eastern, and only very little LBK, in South Central Asia. All of the Basal basically goes into a SC Asian component. I will see if I can do a supervised run without the Papuans. I may just exclude them from the population list. We'll see.

Chad Rohlfsen
05-28-2015, 03:15 AM
Want me to run the unsupervised K7 without Papuans?

Sein
05-28-2015, 03:34 AM
After K9 and K10, everyone feel free to chime in on what you'd like to see. Personally, I'm interested to see how things behave if I throw in the rest of the South Asians. Especially, the K7. It might get rid of that split between the three SE Asians, or at least clean it up.

I have an idea I wanted to pass by you.

Looking at ADMIXTURE, I've noticed that ASI is much more East Asian-shifted, in comparison to the Onge. Basically, even though the Onge themselves are an unambiguously ENA population, closely related to East Asians, it seems that South Asian-specific ENA (ASI) is even closer to living East Asian populations. I base this on the fact that East Asian-specific components always do the work of ASI, in all of your ADMIXTURE runs, despite the inclusion of supervised Onge components (Zack's old Onge run feels like a miracle now, I wonder how he got a component that made so much sense in light of formal stats?).

This is something that we could have predicted, just by looking at the Reich et al. paper. Apparently, "ASI" shares more drift with Dai than it does with Onge. Also, qpAdm gives excellent fits for South Asians, using Dai as the ASI proxy.

So, I was hoping you could try a small supervised run, but with Onge + Dai + Melanesian as a single component. In theory, this should be a much better ASI proxy than having only the Onge. Basically, these would be the supervised components:

Onge + Dai + Melanesian

Nganasan

BedouinB

EEF

Yamnaya

MA1 (ANE)

EHG (Russian hunter-gatherers)

WHG

Mbuti

The populations to be tested would be:

Paniya

UP Brahmins (if you have them)

UP Chamar (if you have them)

Sindhi

Brahui

Baloch

Pathans

Kalash

Pamiri Tajiks

Yaghnobi Tajiks

Iranians

There is a reason for why there are so few populations to be tested. Mainly, I want to see if this works. If it does, you could try these same components with a huge data-set.
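A minimal sketch of how the proposed setup could be written down for a supervised run. As far as I understand ADMIXTURE's supervised mode, it reads a `.pop` file with one line per individual, in the same order as the `.fam` file, giving the reference population name or `-` for individuals whose ancestry is to be estimated. The component names (`ASI_proxy`, `ANE`) and the helper functions below are made up for illustration; only the population list comes from the post.

```python
# Hypothetical component labels. "ASI_proxy" merges Onge, Dai and Melanesian
# into one supervised component, as proposed in the post above.
REFERENCE_COMPONENTS = {
    "Onge": "ASI_proxy",
    "Dai": "ASI_proxy",
    "Melanesian": "ASI_proxy",
    "Nganasan": "Nganasan",
    "BedouinB": "BedouinB",
    "EEF": "EEF",
    "Yamnaya": "Yamnaya",
    "MA1": "ANE",
    "EHG": "EHG",
    "WHG": "WHG",
    "Mbuti": "Mbuti",
}


def pop_file_lines(sample_populations):
    """Return one .pop line per sample, in the order given.

    Reference samples get their component label; every other sample
    (the populations to be tested) gets "-".
    """
    return [REFERENCE_COMPONENTS.get(pop, "-") for pop in sample_populations]


def number_of_components():
    """K for the supervised run is the number of distinct component labels."""
    return len(set(REFERENCE_COMPONENTS.values()))


if __name__ == "__main__":
    # Toy sample order; a real run would read this from the .fam file.
    samples = ["Onge", "Dai", "Paniya", "Sindhi", "Mbuti"]
    print("\n".join(pop_file_lines(samples)))
    print("K =", number_of_components())
```

With the list above this gives nine components, so K would be 9. Merging three populations under one label is the whole point of the proposal: the program then estimates a single set of allele frequencies for the combined proxy.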

Chad Rohlfsen
05-28-2015, 03:41 AM
I could see what happens, but without a lot of samples it can go a little nutty. One thing I've noticed is when unsupervised goes full Bedouin, South Asians do not like them, and avoid them altogether. I may have to make it unsupervised. Also, the Dai appear to get a decent amount of Onge, when I go supervised, which could dilute it down. I'll keep plugging away. I won't stop until you guys are satisfied. What about Austronesians? Is there any Onge expected there?

Edit: As far as making ASI, let's do a few unsupervised runs. When something looks to make the most sense and close to statistics, I'll try to make an artificial ASI.

Sein
05-28-2015, 03:45 AM
I could see what happens, but without a lot of samples it can go a little nutty. One thing I've noticed is when unsupervised goes full Bedouin, South Asians do not like them, and avoid them altogether. I may have to make it unsupervised. Also, the Dai appear to get a decent amount of Onge, when I go supervised, which could dilute it down. I'll keep plugging away. I won't stop until you guys are satisfied. What about Austronesians? Is there any Onge expected there?

Thanks!

I'd just like to see this supervised ADMIXTURE run tried once, as it makes sense in light of the formal stuff we've seen.

If it fails to deliver, we can attack the issue from another angle.

Dr_McNinja
05-28-2015, 03:47 AM
I tried to "fix" Gedrosian, based on the following:

Yamnaya = 50% EHG, 50% mystery South Central Asian population which is 31.16 ANE, 5.94 ASE, 42.96 ENF, 18.98 WHG, 0.78 West African, 0.17 East African (or thereabouts)

Or,

EHG mixed with Gedrosian (HAP Baloch as a stand-in) results in this: 36.56% ANE, 6.45% ASE, 27.56% ENF, 2.49% East Eurasian, 26.20% WHG.

Baloch/Gedrosian is ~35% ANE, ~9% ASE, ~55% ENF.

So I used the first one, the mystery hypothetical South Central Asian ancestor of Yamnaya as Gedrosian.
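The arithmetic behind this kind of guess can be sketched as a simple linear "unmixing": if a mixture is assumed to be a fixed fraction of one known source, the other source is whatever is left over. The sketch below applies that to the EHG + Baloch numbers quoted above, assuming a 50/50 mix (the post states 50/50 for Yamnaya; applying it to this line is my assumption) and using the rounded Baloch averages. The function name `unmix` is hypothetical.

```python
def unmix(mixture, known, known_fraction=0.5):
    """Back out the second source of a two-way mixture.

    mixture and known map component names to percentages. The result can
    contain small negative values, which would signal that the assumed
    fraction or the stand-in population does not fit.
    """
    other_fraction = 1.0 - known_fraction
    components = sorted(set(mixture) | set(known))
    return {
        name: (mixture.get(name, 0.0) - known_fraction * known.get(name, 0.0))
        / other_fraction
        for name in components
    }


# Numbers quoted in the post (percentages).
ehg_plus_gedrosian = {
    "ANE": 36.56, "ASE": 6.45, "ENF": 27.56, "East Eurasian": 2.49, "WHG": 26.20,
}
baloch = {"ANE": 35.0, "ASE": 9.0, "ENF": 55.0}  # "~" values, so only approximate

implied_ehg = unmix(ehg_plus_gedrosian, baloch)

if __name__ == "__main__":
    for name, value in implied_ehg.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the implied EHG side is mostly WHG and ANE with almost no ENF, which at least is consistent with the ENF in the mix (27.56) being about half of Baloch's ~55.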

These are the kinds of results I'm seeing:

HRP0349

0.02% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
13.28% ASI
0.36% Western-Steppe
11.91% Eastern-Steppe
50.32% South-Central-Asian
21.39% Caucasus
0.00% Mideast
0.00% Europe
0.20% West-African
2.52% East-African

HRP0393

0.00% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
10.43% ASI
2.88% Western-Steppe
6.39% Eastern-Steppe
74.92% South-Central-Asian
3.28% Caucasus
0.00% Mideast
0.00% Europe
0.00% West-African
2.10% East-African

HRP0370

1.15% SE-Asian
0.22% NE-Asian
0.51% Indo-Tibetan
5.65% ASI
10.18% Western-Steppe
3.51% Eastern-Steppe
30.71% South-Central-Asian
42.54% Caucasus
2.97% Mideast
1.00% Europe
0.10% West-African
1.45% East-African

HRP0341

0.00% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
10.55% ASI
0.38% Western-Steppe
7.37% Eastern-Steppe
58.98% South-Central-Asian
20.77% Caucasus
0.00% Mideast
0.00% Europe
0.00% West-African
1.94% East-African

HRP0353 (Punjabi Gujjar)

0.01% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
13.20% ASI
0.18% Western-Steppe
4.81% Eastern-Steppe
51.51% South-Central-Asian
28.16% Caucasus
0.00% Mideast
0.00% Europe
0.34% West-African
1.79% East-African

Soulblighter

0.15% SE-Asian
0.00% NE-Asian
0.02% Indo-Tibetan
18.08% ASI
0.33% Western-Steppe
14.03% Eastern-Steppe
33.30% South-Central-Asian
30.29% Caucasus
0.00% Mideast
0.00% Europe
0.09% West-African
3.71% East-African

Khetran Baloch

0.02% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
9.20% ASI
0.41% Western-Steppe
2.32% Eastern-Steppe
43.54% South-Central-Asian
41.58% Caucasus
0.00% Mideast
0.00% Europe
0.54% West-African
2.39% East-African


This could go somewhere, but a confusing issue is the Caucasian component: South Asian ANE is going into Caucasian because I only have ASI for South Asia. The issue is that if I put ENF in there too, it will suddenly start competing with, and winning against, the new Gedrosian component because it really prefers ANE/ENF/ASE without WHG/EHG (which makes me wonder if this is a historically valid guess, even though it seems to make sense).

So I'm thinking an ANE/ASE/East Eurasian component for South Asia, but I don't know if this is a warranted guess, historically.

I chose to stick with Western Steppe (Yamnaya) as outlined earlier since I think people would be more interested in ancestry from there, even though if there was an EHG component there, Yamnaya should get 50% EHG, 50% South Central Asian. So South Central Asian here represents SC-Asian ancestry above and beyond that contained within Western Steppe.

I think some amount of EHG in the area ancestrally makes sense. Old Gedrosian is like West Caucasus, and "South Central Asian" here is like the Northern Caucasus populations who have some WHG. Or "North Indus" vs. "South Indus". Though technically this population would probably have existed somewhere in what is now the southern fringe of Central Asia (northern Afghanistan, Tajikistan, Turkmenistan, northern Pakistan).

One reason a component didn't come out by itself was because of Europeans and the nature of WHG. We didn't even know about EHG until recently.

EDIT: Since this is based on K7 with its WHG/ANE issues (this WHG is more like EHG since it contains some ANE), this SC-Asian component is more susceptible to a calculator effect than the other components which are based off actual K7 averages to compensate.

Chad Rohlfsen
05-28-2015, 04:14 AM
Should I expect any Near Eastern in ASI, or am I looking for straight ENA?

Sein
05-28-2015, 04:15 AM
Should I expect any Near Eastern in ASI, or am I looking for straight ENA?

Just ENA.

Chad Rohlfsen
05-28-2015, 04:42 AM
Alright. I've got about 400 SC and South Asians about to finish an unsupervised K7. It may alter the results. If it looks good, I'll post it. Otherwise, I will begin working on your request, Sein.

Chad Rohlfsen
05-28-2015, 04:59 AM
All of those pops made the K7 a dud. I am going to do K4-6, and K10, while I sleep. I'll analyze that tomorrow. I don't have any homework for a couple days, so I will go to town on this. If nothing looks good unsupervised, I will go down to just a few hundred samples across Iran to South Asia, with a few European and Asian outliers. After a couple unsupervised runs, I'll go all supervised.

Chad Rohlfsen
05-28-2015, 05:04 AM
If that first K7 is giving a correct indication, ASI could be something modeled as 22% Papuan, 19% Nganasan, 23% Atayal, and 36% Onge.
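One way to picture an "artificial ASI" built from those proportions is as a weighted average of the reference populations' allele frequencies at each SNP. This is only a sketch under that assumption; it is not necessarily how the synthetic component was actually built, and the frequencies below are placeholder values, not real data.

```python
# Proportions from the post; they sum to 100%.
ASI_WEIGHTS = {"Papuan": 0.22, "Nganasan": 0.19, "Atayal": 0.23, "Onge": 0.36}


def mix_frequencies(frequencies, weights):
    """Weighted average of per-SNP allele frequencies.

    frequencies maps population name to a list of allele frequencies,
    one per SNP, all lists in the same SNP order.
    """
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total}, expected 1.0")
    n_snps = len(next(iter(frequencies.values())))
    return [
        sum(weights[pop] * frequencies[pop][i] for pop in weights)
        for i in range(n_snps)
    ]


if __name__ == "__main__":
    # Placeholder frequencies for two SNPs, for illustration only.
    toy = {
        "Papuan": [0.0, 1.0],
        "Nganasan": [0.0, 1.0],
        "Atayal": [1.0, 1.0],
        "Onge": [1.0, 0.0],
    }
    print(mix_frequencies(toy, ASI_WEIGHTS))
```

Synthetic individuals could then be drawn from the mixed frequencies, but that step is left out here.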

tamilgangster
05-28-2015, 08:59 AM
Going from the K7/K8 breakdown of other calculator components and various individuals, it looks like there's some East Eurasian that is correlated to ASE whenever a South Asian component is formed. However, that is lacking in the Baloch component itself which only has ASE and not East Eurasian. It's also lacking in other ancient populations or components which don't have that ASE-entangled East Eurasian.

Going by general K7 results, it looks like it should be around a 90% ASE, 10% East Eurasian composition. Hitting as high as 92% in individuals with the least East Eurasian (Punjabi Arain, Punjabi Gujjar, Baloch, some Sindhis).

Not addressing this admixture results in it pulling ASE away from a South Asian component and becoming "Southeast Asian". Malay are around 50-60% ASE I think, highest in a standalone population, and the HAP South Indian component has 70-72% ASE (out of ASE+East Eurasian total). That made me think Onge would be around 70 ASE as well. More than 60 at least, otherwise the Malay would be more ASE than the Onge.

But that would mean the source of ASE in a lot of India was not Onge or an Onge-like population because there just isn't enough East Eurasian.

So how's this picture look?

Northeast Asian: 3.71% ANE, 11.52% ASE, 0.06% WHG, 83.86% East Eurasian, 0.21% West African, 0.32% East African, 0.31% ENF

Himalayan/Indo-Tibetan?: 30% ASE, 70% East Eurasian?

Southeast Asian: 1.79% ANE, 50.28% ASE, 1.70% WHG, 43.57% East Eurasian, 0.13% West African, 0.61% East African, 1.93% ENF

ASI?: 70% ASE, 30% East Eurasian?
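The percentages being compared here are ASE as a share of the ASE + East Eurasian total, ignoring everything else. A small sketch of that calculation using the two breakdowns given above (the function name is made up):

```python
def ase_share(ase, east_eurasian):
    """ASE as a percentage of the ASE + East Eurasian total."""
    return 100.0 * ase / (ase + east_eurasian)


northeast_asian = ase_share(11.52, 83.86)   # roughly 12%
southeast_asian = ase_share(50.28, 43.57)   # roughly 54%

if __name__ == "__main__":
    print(f"Northeast Asian: {northeast_asian:.1f}% ASE")
    print(f"Southeast Asian: {southeast_asian:.1f}% ASE")
```

On this scale the guessed Himalayan/Indo-Tibetan (30) and ASI (70) values sit on either side of Southeast Asian, which is the ordering the post is arguing for.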

A confounding factor is archaic admixture. I've only been able to isolate Ust-Ishim which I suspect is basically just tying up the archaic signal seen previously in Oceanian with some WHG/ANE/ENF. We can see hints of this archaic signal perhaps in Southeast Asian above. It should be high in Himalayan/Indo-Tibetan as well, but we have no idea what it would be like in ASI.

1) The Harappa DNA South Indian component is 50% West Eurasian. The Paniya, who are 85% South Indian (and lack the Baloch component), score 50% West Eurasian on one of the earlier calculators (the one with an Austronesian component). They have the Bedouin-type ENF and EHG (but in a lower ratio). This implies that there has to be a pre-Gedrosian source of ANE.

2) The East Eurasian component is proportional to the amount of ASE within the same area, but it increases as you go towards areas like Bengal and is very minute in NW India. The ASE in India was not the same population as the Onge, due to 60,000 years of separation. Also, the primary source of the East Eurasian component was the Austroasiatic migration from the northeast, which is why the East Eurasian to ASE ratios are not always consistent. This migration also carried a lot of ASE.

tamilgangster
05-28-2015, 09:11 AM
Wow. That's surprising. It's a small sample for comparison, but I think what shows there would actually put Tianyuan closer to West Eurasians than East Eurasians. Han and Dai are the same distance from Tianyuan as French and Sardinian. And that is WITH the modern Euros containing Basal Eurasian, which should drag them further away. Karitiana is closer than the other East Eurasians...at 40kya I don't think there would be enough divergence in the East Eurasian clade for Tianyuan to have any preference among them (except maybe against Papuans or Onge). So the only other thing that could potentially bring Karitiana closer is the ~40% West Eurasian ancestry.

Karitiana score 40% ANE on Eurogenes K7.

tamilgangster
05-28-2015, 09:15 AM
Wow. That's surprising. It's a small sample for comparison, but I think what shows there would actually put Tianyuan closer to West Eurasians than East Eurasians. Han and Dai are the same distance from Tianyuan as French and Sardinian. And that is WITH the modern Euros containing Basal Eurasian, which should drag them further away. Karitiana is closer than the other East Eurasians...at 40kya I don't think there would be enough divergence in the East Eurasian clade for Tianyuan to have any preference among them (except maybe against Papuans or Onge). So the only other thing that could potentially bring Karitiana closer is the ~40% West Eurasian ancestry.

Does this mean that Tianyuan was a proto-East Eurasian population that predated the East Eurasian split?

tamilgangster
05-28-2015, 09:57 AM
So, here's a makeshift calculator I'm tweaking:

HRP0349:

0.87% SE-Asian
0.11% NE-Asian
0.30% Indo-Tibetan
10.57% ASI
7.84% Western-Steppe
7.89% Eastern-Steppe
66.81% Gedrosian
2.27% Caucasus
0.00% Mideast
0.02% Europe
0.34% West-African
2.98% East-African

HRP0393 (Haryana Jatt):

0.57% SE-Asian
0.04% NE-Asian
0.15% Indo-Tibetan
9.22% ASI
24.44% Western-Steppe
3.48% Eastern-Steppe
55.55% Gedrosian
3.48% Caucasus
0.02% Mideast
0.16% Europe
0.00% West-African
2.88% East-African

HRP0341:

0.52% SE-Asian
0.04% NE-Asian
0.15% Indo-Tibetan
8.14% ASI
10.73% Western-Steppe
3.86% Eastern-Steppe
70.29% Gedrosian
3.57% Caucasus
0.01% Mideast
0.05% Europe
0.00% West-African
2.64% East-African

East-African is like basal Eurasian (it's from Eurogenes K7).

There's a "DIY" South Asian component: just multiply ASI by 3, then subtract the difference from Gedrosian. This works because HAP S-Indian is basically a version of ASI mixed with Oceanian/East Eurasian mixed with, literally, HAP Baloch. The ratios even add up to make it unmistakable (31-35 ANE, rest ENF).

So I'd have:

0.87% SE-Asian
0.11% NE-Asian
0.30% Indo-Tibetan
10.57% ASI
7.84% Western-Steppe
7.89% Eastern-Steppe
66.81% Gedrosian
2.27% Caucasus
0.00% Mideast
0.02% Europe
0.34% West-African
2.98% East-African
(31.71% South Asian)
(45.67% Gedrosian)

and HRP0393 would have,

0.57% SE-Asian
0.04% NE-Asian
0.15% Indo-Tibetan
9.22% ASI
24.44% Western-Steppe
3.48% Eastern-Steppe
55.55% Gedrosian
3.48% Caucasus
0.02% Mideast
0.16% Europe
0.00% West-African
2.88% East-African
(27.66% South Asian)
(37.11% Gedrosian)

and HRP0341 would have:

0.52% SE-Asian
0.04% NE-Asian
0.15% Indo-Tibetan
8.14% ASI
10.73% Western-Steppe
3.86% Eastern-Steppe
70.29% Gedrosian
3.57% Caucasus
0.01% Mideast
0.05% Europe
0.00% West-African
2.64% East-African
(24.42% South Asian)
(54.01% Gedrosian)
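The bracketed figures in the three lists follow directly from the "multiply ASI by 3, subtract the difference from Gedrosian" rule. A short sketch that reproduces them (the function name and the `factor` parameter are illustrative only):

```python
def diy_south_asian(asi, gedrosian, factor=3.0):
    """Apply the DIY rule from the post.

    South Asian = factor * ASI; the extra amount added on top of ASI
    is taken out of Gedrosian. Returns (south_asian, adjusted_gedrosian),
    both rounded to two decimals like the calculator output.
    """
    south_asian = factor * asi
    borrowed = south_asian - asi
    return round(south_asian, 2), round(gedrosian - borrowed, 2)


if __name__ == "__main__":
    samples = {
        "HRP0349": (10.57, 66.81),
        "HRP0393": (9.22, 55.55),
        "HRP0341": (8.14, 70.29),
    }
    for sample, (asi, gedrosian) in samples.items():
        south_asian, adjusted = diy_south_asian(asi, gedrosian)
        print(f"{sample}: {south_asian:.2f}% South Asian, {adjusted:.2f}% Gedrosian")
```

The total across components is unchanged by this step; the rule only moves a slice of Gedrosian into the new South Asian label.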

Or you could combine ASI with Gedrosian and East-African in whatever proportions you'd like. It's not technically a proper admixture calculator that way since there's a bunch of ASE in Gedrosian which would then get pulled into a South Asian component that incorporated ANE/ENF, but it's probably just as accurate a method (in terms of application to reality).

I'm gonna tinker with it some more.

EDIT: Here's soulblighter (South Indian Brahmin),

Soulblighter

0.76% SE-Asian
0.12% NE-Asian
0.31% Indo-Tibetan
14.94% ASI
0.07% Western-Steppe
9.42% Eastern-Steppe
69.91% Gedrosian
0.23% Caucasus
0.00% Mideast
0.00% Europe
0.05% West-African
4.19% East-African

What population did you use for the Gedrosian proxy? The Gedrosian figure seems too high and the ASI appears to be too low. The Gedrosian sample probably includes some Basal Eurasian.

tamilgangster
05-28-2015, 10:23 AM
K7, ASI looks like it is about 10% Near East, and the rest is Onge, Papuan, some Siberian and Austronesian like.

https://docs.google.com/spreadsheets/d/1ljkxdB3IAcU-Nu-cQjvn3jPo8dgME32Z4knde8ClLgE/edit#gid=244244658

The reason South Asians score such high East Asian in the K5 is EHG admixture, I'm guessing, but once you create a Siberian component it goes down significantly. The large SE Asian component on the K7, once the Onge is separated, probably represents real East Eurasian ancestry.

tamilgangster
05-28-2015, 10:38 AM
If that first K7 is giving a correct indication, ASI could be something modeled as 22% Papuan, 19% Nganasan, 23% Atayal, and 36% Onge.

If you include Nganasan in the ASI, a lot of the ANE/EHG will show up as ASI.

Kale
05-28-2015, 01:56 PM
I think Tianyuan would be slightly closer to Loschbour than to MA1, and more distant to Kostenki 14 who has Basal.

Kostenki has no basal in the sense of the basal found in neolithic farmers. Ust-Ishim is equally related to Loschbour, MA1, Dai, and Kostenki.

So far I think the evidence suggests the initial West-East Eurasian split shortly before Ust-Ishim's time, up to maybe 50kya. With a bit of mixing in the middle shortly after Kostenki's time (37kya), contributing a small amount of ENA to Loschbour and MA1.

Kale
05-28-2015, 01:59 PM
Does this mean that Tianyuan was a proto EastEurasian population, which predated the East Eurasian split

To me it suggests Tianyuan might have had West Eurasian affinities. But that sample is so small as to not be definitive. I'd say it warrants a more detailed examination, instead of just dismissing him as an early, undifferentiated sort of East Eurasian.

Dr_McNinja
05-28-2015, 02:55 PM
What population did u use for the gedrosian proxy. THe gedrosian sample seems too high and the ASI appears to be too low. The gedrosian sample probably includes some Basal Eurasian.

Gedrosian will always be high for South Asians in lieu of a competing ANE/ENF containing component. ASI is pure ASE/East Eurasian, unlike Harappa's S-Indian or South Asian in other calculators which is a mix of ASI and Baloch/Gedrosian which artificially lowers Gedrosian itself.

The Eurogenes Hindu Kush and MDLP K14 ANI calculators indicate a Gedrosian area component will be a lot higher than the old equivalents. It will be the majority component in all of India.

parasar
05-28-2015, 02:59 PM
Kostenki has no basal in the sense of the basal found in neolithic farmers.

The basal in Kostenki 14 is a ~39,000-year-old split of the same Basal, so obviously it cannot be the same. But it is the same basal in the sense that both are limited to Western Eurasia and are not seen on the non-Basal side, e.g. in the Amazonian Karitiana. In other words, K14's basal and the EEF basal are splits of the same Basal; you could call them Basal 1 and Basal 2, respectively. As we find more ancient samples we will see more of these splits on the Basal side, much as we are seeing on the non-Basal side.



Ust-Ishim is equally related to Loschbour, MA1, Dai, and Kostenki.
...

Sure since Ust-Ishim is himself basal to much of K14 (except to the ~20% Basal portion in K14).

Kale
05-28-2015, 05:36 PM
The basal in Kostenki 14 is ~39000 old split of the same Basal. So obviously it cannot be the same. But it is the same basal in the sense that both are limited to Western Eurasia and are not seen on the non-Basal side - eg. in the Amazonian Karitiana. In other words, K14's basal and the EEF basal split of the same Basal, you could entitle them say Basal 1 and Basal 2, respectively. As we find more ancient samples we will see more of these splits on the Basal side, much as we are seeing on the non-Basal side.

I'm referring to the following possibility as outlined in Haak et al. 2015:

"The hypothesis of Basal Eurasian ancestry in Kostenki14 needs to be further tested, as the negative D(Mbuti, Han; Loschbour, Kostenki14) statistic could also reflect gene flow between Han and Loschbour; this is a priori plausible, as these populations are much younger than Kostenki14 and may share intra-Eurasian genetic drift that Kostenki14 lacks because of its age. The possibility of later gene flow between Europeans and eastern non-Africans must be further tested with additional ancient samples from Upper Paleolithic Europe and Asia."

This possibility is supported by the differing relationships of Neolithic Farmers and Kostenki to Ust-Ishim.
Chimp Ust'-Ishim Kostenki14 Dai 0.0043 0.635
Chimp Ust_Ishim LBK_EN Dai 0.0199 4.663
Your 'basal 1 and basal 2' would still fall under a main branch of basal, as they diverge from each other after they diverge from the main Eurasian branch. So 'basal 1 and basal 2' would not differ in their relationship to the main Eurasian branch. Basal was originally defined as 'ancestry that diverged before the common ancestor of Loschbour and Onge'. So if Loschbour and Onge share later contact after their initial divergence, any relatives of Loschbour that did not partake in this event (i.e. Kostenki, because this event hadn't happened yet) would be considered to have 'basal ancestry'.
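For anyone not used to reading the two rows of statistics in this post: each appears to list four populations, then the statistic, then its Z-score (0.635 is nowhere near the conventional |Z| > 3 cutoff, 4.663 is well past it). The post does not say whether the values are D or f4, so the sketch below shows both, computed from per-SNP allele frequencies with the usual (w - x)(y - z) numerator. The frequencies are placeholder values, and the Z-score itself would need a block jackknife, which is left out.

```python
def f4(w, x, y, z):
    """f4(W, X; Y, Z): mean over SNPs of (w - x) * (y - z)."""
    return sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z)) / len(w)


def d_stat(w, x, y, z):
    """D(W, X; Y, Z): the f4 numerator normalised by heterozygosity terms."""
    numerator = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    denominator = sum(
        (a + b - 2 * a * b) * (c + d - 2 * c * d)
        for a, b, c, d in zip(w, x, y, z)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Placeholder allele frequencies at three SNPs, for illustration only.
    w = [0.1, 0.2, 0.3]
    x = [0.4, 0.1, 0.6]
    y = [0.5, 0.5, 0.2]
    z = [0.3, 0.6, 0.7]
    print("f4:", f4(w, x, y, z))
    print("D: ", d_stat(w, x, y, z))
```

A value near zero means the first pair is symmetrically related to the second pair. Swapping the last two populations flips the sign, so the direction of any asymmetry depends on the order in which the populations are listed.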

Dr_McNinja
05-28-2015, 05:46 PM
I tried to "fix" Gedrosian, based on the following:

Yamnaya = 50% EHG, 50% mystery South Central Asian population which is 31.16 ANE, 5.94 ASE, 42.96 ENF, 18.98 WHG, 0.78 West African, 0.17 East African (or thereabouts)

Or,

EHG mixed with Gedrosian (HAP Baloch as a stand-in) results in this: 36.56% ANE, 6.45% ASE, 27.56% ENF, 2.49% East Eurasian, 26.20% WHG.

Baloch/Gedrosian is ~35% ANE, ~9% ASE, ~55% ENF.

So I used the first one, the mystery hypothetical South Central Asian ancestor of Yamnaya as Gedrosian.

These are the kinds of results I'm seeing:

HRP0349

0.02% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
13.28% ASI
0.36% Western-Steppe
11.91% Eastern-Steppe
50.32% South-Central-Asian
21.39% Caucasus
0.00% Mideast
0.00% Europe
0.20% West-African
2.52% East-African

HRP0393

0.00% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
10.43% ASI
2.88% Western-Steppe
6.39% Eastern-Steppe
74.92% South-Central-Asian
3.28% Caucasus
0.00% Mideast
0.00% Europe
0.00% West-African
2.10% East-African

HRP0370

1.15% SE-Asian
0.22% NE-Asian
0.51% Indo-Tibetan
5.65% ASI
10.18% Western-Steppe
3.51% Eastern-Steppe
30.71% South-Central-Asian
42.54% Caucasus
2.97% Mideast
1.00% Europe
0.10% West-African
1.45% East-African

HRP0341

0.00% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
10.55% ASI
0.38% Western-Steppe
7.37% Eastern-Steppe
58.98% South-Central-Asian
20.77% Caucasus
0.00% Mideast
0.00% Europe
0.00% West-African
1.94% East-African

HRP0353 (Punjabi Gujjar)

0.01% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
13.20% ASI
0.18% Western-Steppe
4.81% Eastern-Steppe
51.51% South-Central-Asian
28.16% Caucasus
0.00% Mideast
0.00% Europe
0.34% West-African
1.79% East-African

Soulblighter

0.15% SE-Asian
0.00% NE-Asian
0.02% Indo-Tibetan
18.08% ASI
0.33% Western-Steppe
14.03% Eastern-Steppe
33.30% South-Central-Asian
30.29% Caucasus
0.00% Mideast
0.00% Europe
0.09% West-African
3.71% East-African

Khetran Baloch

0.02% SE-Asian
0.00% NE-Asian
0.00% Indo-Tibetan
9.20% ASI
0.41% Western-Steppe
2.32% Eastern-Steppe
43.54% South-Central-Asian
41.58% Caucasus
0.00% Mideast
0.00% Europe
0.54% West-African
2.39% East-African


This could go somewhere, but a confusing issue is the Caucasian. Because South Asian ANE is going into Caucasian since I only have ASI for South Asia. The issue is that if I put ENF in there too, it will suddenly start competing with, and winning against, the new Gedrosian component because it really prefers ANE/ENF/ASE without WHG/EHG (which makes me wonder if this is a historically valid guess, even though it seems to make sense).

So I'm thinking an ANE/ASE/East Eurasian component for South Asia, but I don't know if this is a warranted guess, historically.

I chose to stick with Western Steppe (Yamnaya) as outlined earlier since I think people would be more interested in ancestry from there, even though if there was an EHG component there, Yamnaya should get 50% EHG, 50% South Central Asian. So South Central Asian here represents SC-Asian ancestry above and beyond that contained within Western Steppe.

I think some amount of EHG in the area ancestrally makes sense. Old Gedrosian is like West Caucasus, and "South Central Asian" here is like the Northern Caucasus populations who have some WHG. Or "North Indus" vs. "South Indus". Though technically this population would have existed probably somewhere in what is now the southern fringe of Central Asia (northern Afghanistan, Tajikstan, Turkmenistan, northern Pakistan).

One reason a component didn't come out by itself was because of Europeans and the nature of WHG. We didn't even know about EHG until recently.

EDIT: Since this is based on K7 with its WHG/ANE issues (this WHG is more like EHG since it contains some ANE), this SC-Asian component is more susceptible to a calculator effect than the other components which are based off actual K7 averages to compensate.

Ah, no wonder. That "Caucasus" is really "West Asian". I had mixed up the components. That makes more sense now. I'm trying one with a real Caucasus component (it pushes a lot of admixture back into South Central Asian).

parasar
05-28-2015, 07:12 PM
I'm referring to the following possibility as outlined in Haak Et. Al. 2015

"The hypothesis of Basal Eurasian ancestry in Kostenki14 needs to be further tested, as the
negative D(Mbuti, Han; Loschbour, Kostenki14) statistic could also reflect gene flow between Han and Loschbour
this a priori plausible, as these populations are much younger
than Kostenki14 and may share intra-Eurasian genetic drift that Kostenki14 lacks because of
its age. The possibility of later gene flow between Europeans and eastern non-Africans must
be further tested with additional ancient samples from Upper Paleolithic Europe and Asia."

This possibility is supported by the differing relationships of Neolithic Farmers and Kostenki to Ust-Ishim.
Chimp Ust'-Ishim Kostenki14 Dai 0.0043 0.635
Chimp Ust_Ishim LBK_EN Dai 0.0199 4.663
Your 'basal 1 and basal 2' would still fall under a main branch of basal...as they diverge from each other after they diverge from the main Eurasian branch. So 'basal 1 and basal 2' would not differ in their relationship to the main Eurasian branch. Basal was originally defined 'ancestry that diverged before the common ancestor of Loschbour and Onge'....So if Loschbour and Onge share later contact after their initial divergence, any relatives of Loschbour that did not partake in this event (I.E. Kostenki...because this event hadn't happened yet) would be considered to have 'basal ancestry'.

Haak says Basal diverged by at least 45000ybp, i.e. latest by Ust-Ishim time-frame.
"the split between “Basal” and other Eurasians must have taken place no later than ~45,000 years ago (the age of the Ust-Ishim individual)."

So I would agree with the definition since, as I understand it, the Ongee and Loschbour have common ancestry at 42500ybp, after Ust-Ishim, who is 45000ybp.

As to whether K14's ancestors partook in intra-Eurasian exchange, I think they did: not with the Han or the Ongee branches, but with ancestors of the ANE and Loschbour branches.

Chad Rohlfsen
05-28-2015, 09:01 PM
Supervised try with a possible ASI zombie is on the way. Attempt number one of what may be 100. I'll post the first one that gets close.

Chad Rohlfsen
05-29-2015, 01:51 AM
MA1 is too old. It acts like a Hindu Kush/Kalash/SC Asia component. I've dropped it and started again.

VOX
05-29-2015, 03:32 AM
http://i62.tinypic.com/u0bar.jpg

http://i59.tinypic.com/25insqx.jpg

Has anyone ever heard of Sundaland? Before 10kya South East Asia was connected due to low sea levels. Sundaland was separated from Sahul which was a landmass that joins Papua New Guinea and Australia.

Since there is a Sahul component (Australia + New Guinea), I theorize that there must be a Sunda component, since at that time the connected areas allowed for gene flow. I propose that such a component would have stretched from India to Vietnam and areas above the Wallace line.

The problem today is that Indians and South East Asians are mixed with newer populations from the north and the west. As a result admixture components like "South Asian" and "East Asian" likely carry a Sunda like genetic substrate. I think the Sundaland substrate could also be the reason that these two components are genetically close.

To get a Sunda-like component to pop out of admixture, I think there needs to be heavy and high-resolution sampling of pure or close to pure former populations of Sundaland such as the Onge. To date this has never really been done, which is disappointing and a source of much confusion about the 1.7 billion people in South Asia and the 0.6 billion people in SEA.

Chad Rohlfsen
05-29-2015, 03:41 AM
I'm using the Onge.

VOX
05-29-2015, 04:05 AM
Yep, they do the trick and cover for ASI some of the time, like in Zack's K=11, but in order for a more robust, solid and general component to emerge, I think there needs to be a higher diversity of Sundaland population samples. The Onge, from what I understand, have been isolated for a long time.

Chad Rohlfsen
05-29-2015, 04:07 AM
Yep, they do trick and cover for ASI some of the time, like Zacks K=11 but In order for a more robust, solid and general component to emerge I think there there needs to higher diversity of Sundaland population samples. The Onge, from what I understand have isolated for a long time.

I have Onge, Malays, Papuans, Bougainville Papuans, and Australians.

Chad Rohlfsen
05-29-2015, 04:08 AM
Okay, going Dai, Papuan, and 2 Onge, didn't work. It came up as 94% in the Paniya. I switched from Dai to the Atayal and added a 3rd Onge. I'll see how it goes and adjust from there. I may substitute in the Yi, as my East Asian, instead of the Atayal.

VOX
05-29-2015, 04:34 AM
I have Onge, Malays, Papuans, Bougainville Papuans, and Australians.

I would consider the Onge purely Sunda-like. Malays are most likely a fusion between East Asians (75%) and Sunda Aboriginals (25%). Papuans and Australians belong to their own component because of the Wallace barrier separating Sunda from Sahul.

In the future, perhaps this year there will be studies focusing on SEA and Sunda-like populations like the Jehai and Orang Asli. When those become available, see if you can do something with them. The abstracts were available somewhere, couldn't remember where though.

VOX
05-29-2015, 04:42 AM
On another note, you should give this software (sNMF) a try for calculating admixture coefficients. I managed to use it about a year ago without ever having used ADMIXTURE (and still haven't :\). It goes super fast, but you have to convert everything to a .geno file or something using plink.

http://membres-timc.imag.fr/Olivier.Francois/snmf/index.htm

Sein
05-29-2015, 04:58 AM
MA1 is too old. it acts like a Hindu Kush/Kalash/SC Asia component. I've dropped it and started again.

Very interesting, is it possible to share the output?

Chad Rohlfsen
05-29-2015, 05:04 AM
I'll post it tomorrow. I've gotta crash. Hopefully, the current run works.

Sein
05-29-2015, 05:09 AM
I'll post it tomorrow. I've gotta crash. Hopefully, the current run works.

Thanks!

Chad Rohlfsen
05-29-2015, 11:34 AM
Does anyone have a PCA including Onge and Papuans?

tamilgangster
05-29-2015, 12:30 PM
Gedrosian will always be high for South Asians in lieu of a competing ANE/ENF containing component. ASI is pure ASE/East Eurasian, unlike Harappa's S-Indian or South Asian in other calculators which is a mix of ASI and Baloch/Gedrosian which artificially lowers Gedrosian itself.

The Eurogenes Hindu Kush and MDLP K14 ANI calculators indicate a Gedrosian area component will be a lot higher than the old equivalents. It will be the majority component in all of India.

The Gedrosian component competes with archaic ENF and ANE strains in South Asia which predate Gedrosians. The South Indian component contains those archaic ENF and ANE strains.

tamilgangster
05-29-2015, 12:51 PM
Okay, going Dai, Papuan, and 2 Onge, didn't work. It came up as 94% in the Paniya. I switched from Dai to the Atayal and added a 3rd Onge. I'll see how it goes and adjust from there. I may substitute in the Yi, as my East Asian, instead of the Atayal.

I think you should run a separate test specifically just for East Eurasian populations.

Han, Atayal, Dai, Aeta, Gedrosian, Onge, Ryukyuan, Papuan, and a general West Eurasian.

There has been a test like this done on Discover Magazine.
http://blogs.discovermagazine.com/gnxp/files/2011/07/F1.jpg

tamilgangster
05-29-2015, 01:04 PM
I would consider the Onge, purely Sunda-like. Malays are most likely a fusion between East Asians (75%) and Sunda Aboriginals (25%). Papuans and Australians belong to their own component because the wallace barrier separating Sunda from Sahul.

In the future, perhaps this year there will be studies focusing on SEA and Sunda-like populations like the Jehai and Orang Asli. When those become available, see if you can do something with them. The abstracts were available somewhere, couldn't remember where though.

http://www.harappadna.org/2012/03/pan-asian-ref3-k11-admixture/

Javanese are around 23% Onge; Sumatrans, Mon and Malays are 15-20% Onge and 5-10% South Asian. Orang Asli show up as Onge. Aetas show up as primarily East Asian, but have around 17% Papuan, 7% Onge and 5% South Asian. Most Southeast Asians have visible Australoid ancestry. In Mainland SEA and Western Indonesia the Australoid component is most related to Onge, but in Eastern Indonesia it is more related to Papuans, and it's hard to tell in Aetas. South Asian admixture only shows up in certain populations, implying recent admixture. Jehai are 70% Onge, 30% East Asian. Also, the Siberian component shows up among Mainland SE Asian populations but is lacking among Austronesian populations.

Chad Rohlfsen
05-29-2015, 02:15 PM
I know about the Austro-Asiatics, but creating ASI is failing so far. I've had to drop MA1 and Yamnaya. Getting the right amount of Onge and everything else isn't working. It's not matching the k7. I'll post the last run after work.

VOX
05-29-2015, 05:10 PM
Hi Chad, I'm assuming that you merged a lot of data sets together into .bed, .ped and .fam files. If you're willing to share some data sets, give me a shout. I experimented a year back briefly with plink and admixture-like programs in Linux but need to get back into it in light of new data sets and ancient DNA. :)

Sein
05-29-2015, 07:29 PM
I know about the Austro-Asiatics, but creating ASI is failing so far. I've had to drop MA1 and Yamnaya. Getting the right amount of Onge and everything else isn't working. It's not matching the k7. I'll post the last run after work.

It'll be very interesting to see how that run turned out.

The best solution I can come up with:

Try using all of the Onge samples and all of Dai samples together, as a single component, and drop Papuan. Also, drop EHG, WHG, and EEF. So, the components would be:

Onge + Dai

Nganasan

BedouinB

Yamnaya

MA1

Mbuti

Same populations would be tested.
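
A rough sketch of how that component list could be turned into the label file for a supervised ADMIXTURE run. As far as I know, supervised mode reads a .pop file with one line per individual in .fam order, where reference individuals carry their component label and everyone else gets "-". The assumption that the first .fam column holds the population name, the sample IDs, and the helper name `pop_labels` are all hypothetical.

```python
def pop_labels(fam_lines, component_of):
    """Return one .pop label per .fam line.

    fam_lines: lines of a plink .fam file.
    component_of: maps a population name to its supervised component label.
    Assumption: the first .fam column (FID) holds the population name.
    Individuals from populations not listed get "-" (ancestry to be estimated).
    """
    labels = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        labels.append(component_of.get(fields[0], "-"))
    return labels


# The six proposed components; Onge and Dai share one label.
component_of = {
    "Onge": "Onge_Dai",
    "Dai": "Onge_Dai",
    "Nganasan": "Nganasan",
    "BedouinB": "BedouinB",
    "Yamnaya": "Yamnaya",
    "MA1": "MA1",
    "Mbuti": "Mbuti",
}

# Hypothetical .fam rows for illustration only.
fam = [
    "Onge ONG1 0 0 0 -9",
    "Dai DAI1 0 0 0 -9",
    "Paniya PAN1 0 0 0 -9",
    "Yamnaya YAM1 0 0 0 -9",
]
print("\n".join(pop_labels(fam, component_of)))
```

The labels would be written to a .pop file sharing the .bed prefix and the run started with the `--supervised` flag and K=6, so that K matches the number of distinct labels.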

Chad Rohlfsen
05-29-2015, 08:45 PM
Sein,

I'll give that a shot. MA1 is really old and not the same as David's ANE. It is very much behaving as something that has Basal Eurasian, due to his age. For instance, with huge amounts of samples, Native Americans avoid MA1 and go with Yamnaya. It's very odd. I've seen this repeated over and over, with good amounts of samples (800+). I am tempted to go unsupervised with all 2500 samples, just to see how that changes everything, perhaps up to K11-12.

VOX,

I have the full dataset from Reich and a South Asian tribal plus Austronesian Dataset. I am not able to share them though, as I had to sign a waiver.

Chad Rohlfsen
05-29-2015, 08:46 PM
Sein,
Do you know where I can find a PCA that includes all South Asians, along with the Onge and Papuans? I am wondering what is going on here.

Sein
05-29-2015, 08:56 PM
Sein,
Do you know where I can find a PCA that includes all South Asians, along with the Onge and Papuans. I am wondering what is going on here.

Chad,

I'm not really aware of any PCA that has all South Asians, Onge, and Papuans together (that would be rather nice).

Hopefully, the modified component list works out. I think the previous list had too many components.

If it doesn't, I think the best path is qpAdm.

parasar
05-29-2015, 09:42 PM
Sein,
Do you know where I can find a PCA that includes all South Asians, along with the Onge and Papuans. I am wondering what is going on here.

Zack had done one with South Asians, Europeans, Ongee, and the Ongee component. No Papuans though. We have to keep in mind that the Papuans are Denisovan rich.

The real Ongee are at the bottom left.
http://www.harappadna.org/wp-content/uploads/2011/05/cline_onge11_sa11_sa12_pca.png

Dr_McNinja
05-29-2015, 09:48 PM
Here's what that new South Central Asian component looks like so far:

https://docs.google.com/spreadsheets/d/1BdR1BbYAijPNtfY02sbQNXZXqqwVOtuwrVjAcd4ljd0/edit#gid=686287602

This is still quite preliminary. It basically just shows the spread.

I'm still not sure what to do about ASI. It's completely overruling SE-Asian and Indo-Tibetan, but if I leave it out, then SE-Asian will work for the East Asian (a Han Chinese individual I think, which would be like 30% SE-Asian I think, maybe more Indo-Tibetan (should be renamed Tibeto-Burman)), but then it's not really applicable for South Asians and starts eating up their non-ASI related East Eurasian. I've stuck to 90% ASE, 10% East Eurasian because that's about the ratio I see at the fringe of the subcontinent near Balochistan, where actual non-ASI related East Eurasian should be rare to non-existent.

I've experimented with making it a hybrid. HAP S-Indian can be considered either 2/3 Gedrosian, 1/3 a population which is 47% ASE, 27% East Eurasian, 17% East African, etc or 1/2 Mideast, and 1/2 a population which is 42% ANE, 40% ASE, 15% East Eurasian, etc.
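
To make the arithmetic of the first option concrete, here is a small sketch that expands a hybrid into its weighted parts. Only the percentages quoted above are used; since the breakdown ends in "etc", the listed fractions do not sum to 100% and the remainder is simply left out. The function name `mix` is made up.

```python
def mix(parts):
    """Combine (weight, profile) pairs into one profile.

    Each profile maps a component name to a fraction; the result is the
    weight-scaled sum of the profiles.
    """
    combined = {}
    for weight, profile in parts:
        for component, fraction in profile.items():
            combined[component] = combined.get(component, 0.0) + weight * fraction
    return combined


# Option 1 from the post: 2/3 Gedrosian plus 1/3 of a second population.
# The second profile is incomplete in the post ("etc"), so it is incomplete here too.
option_1 = mix([
    (2 / 3, {"Gedrosian": 1.0}),
    (1 / 3, {"ASE": 0.47, "East Eurasian": 0.27, "East African": 0.17}),
])

for component, fraction in option_1.items():
    print(f"{component}: {fraction:.1%}")
```

Under that reading, the ASE share of HAP S-Indian in option 1 is one third of 47%, so a bit under 16%.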

The first looks like some kind of Southeast Asian population mixed with a bunch of archaic/basal Eurasian. The second looks like a southern version of Karitiana... which actually is an interesting, although so far baseless, proposition.

But neither of those solutions will play nice with South Central Asian. I can make my own customized South Asian component which is like 50% ASE, 50% Gedrosian/Baloch, and that is alright, and lowers West Asian in South Asians. It could represent early Neolithic India as the ENF wave hit South Central Asia, and then began to move into the subcontinent. That might fix ASI issues in non-South Asian populations.

I also have to remove the modern Europe component and add Neolithic era European populations perhaps. Or maybe just a South Europe/EEF type population.

Yamnaya only seems to matter for people literally near that area, but that's because South Central Asian is the non-EHG half of Yamnaya which still has plenty of EHG (around 18-19% WHG) and is taking up what would otherwise go into Yamnaya.

Here's a map of how it seems to look at the moment: http://i.imgur.com/Qmpp1z0.png

It looks like it's getting fractured around southwestern Afghanistan since HRP0370 is showing Western Steppe, so I don't know why it isn't just going up again there. Western Steppe and a little West Asian admixture together could both have just gone into South Central Asian. I'm guessing the West Asian signal had a strong influence and stuck together. HRP0370's Caucasian has usually been hard to break up or make go into other components. In terms of numbers, there isn't enough ENF for both the demand in SC-Asian and West-Asian so it went into the latter and let the EHG/WHG go into Western Steppe rather than break the EHG/ENF proportions of SC-Asian.

On the other hand, you can see the dip in ASI, Eastern Steppe, and West Asian near Haryana. This would otherwise correlate with a peak in Western Steppe. Eastern Steppe and West Asian do not average out to South Central Asian because they don't have enough EHG (whose gradient drops off sharply on the other side of the Eurasian diagonal from northeastern Europe to South Asia).

Eastern Steppe is also likely being inflated by excess East Eurasian due to ASE being locked down tight in the "ASI" component (which would otherwise free enough up for the other East Asian components).

Dr_McNinja
05-29-2015, 10:12 PM
Here's what that new South Central Asian component looks like so far:

https://docs.google.com/spreadsheets/d/1BdR1BbYAijPNtfY02sbQNXZXqqwVOtuwrVjAcd4ljd0/edit#gid=686287602

This is still quite preliminary. It basically just shows the spread.

I'm still not sure what to do about ASI. It's completely overruling SE-Asian and Indo-Tibetan, but if I leave it out, then SE-Asian will work for the East Asian (a Han Chinese individual I think, which would be like 30% SE-Asian I think, maybe more Indo-Tibetan (should be renamed Tibeto-Burman)), but then it's not really applicable for South Asians and starts eating up their non-ASI related East Eurasian. I've stuck to 90% ASE, 10% East Eurasian because that's about the ratio I see at the fringe of the subcontinent near Balochistan, where actual non-ASI related East Eurasian should be rare to non-existent.

I've experimented with making it a hybrid. HAP S-Indian can be considered either 2/3 Gedrosian, 1/3 a population which is 47% ASE, 27% East Eurasian, 17% East African, etc or 1/2 Mideast, and 1/2 a population which is 42% ANE, 40% ASE, 15% East Eurasian, etc.

The first looks like some kind of Southeast Asian population mixed with a bunch of archaic/basal Eurasian. The second looks like a southern version of Karitiana... which actually is an interesting, although so far baseless, proposition.

But neither of those solutions will play nice with South Central Asian. I can make my own customized South Asian component which is like 50% ASE, 50% Gedrosian/Baloch, and that is alright, and lowers West Asian in South Asians. It could represent early Neolithic India as the ENF wave hit South Central Asia, and then began to move into the subcontinent. That might fix ASI issues in non-South Asian populations.

I also have to remove the modern Europe component and add Neolithic era European populations perhaps. Or maybe just a South Europe/EEF type population.

Yamnaya only seems to matter for people literally near that area, but that's because South Central Asian is the non-EHG half of Yamnaya which still has plenty of EHG (around 18-19% WHG) and is taking up what would otherwise go into Yamnaya.

Here's a map of how it seems to look at the moment: http://i.imgur.com/Qmpp1z0.png

It looks like it's getting fractured around southwestern Afghanistan since HRP0370 is showing Western Steppe, so I don't know why it isn't just going up again there. Western Steppe and a little West Asian admixture together could both have just gone into South Central Asian. I'm guessing the West Asian signal had a strong influence and stuck together. HRP0370's Caucasian has usually been hard to break up or make go into other components. In terms of numbers, there isn't enough ENF for both the demand in SC-Asian and West-Asian so it went into the latter and let the EHG/WHG go into Western Steppe rather than break the EHG/ENF proportions of SC-Asian.

On the other hand, you can see the dip in ASI, Eastern Steppe, and West Asian near Haryana. This would otherwise correlate with a peak in Western Steppe. Eastern Steppe and West Asian do not average out to South Central Asian because they don't have enough EHG whose gradient drops off sharply on the other side of the Eurasian diagonal from northeastern Europe to South Asia).

Eastern Steppe is also likely being inflated by excess East Eurasian due to ASE being locked down tight in the "ASI" component (which would otherwise free enough up for the other East Asian components).

Compare to the results with a normal Gedrosian component:

http://www.anthrogenica.com/showthread.php?4425-Looking-for-ASI&p=86413&viewfull=1#post86413

Both are essentially saying the same thing. An incursion of EHG admixture from Northwest Eurasia into South Asia, with the nexus being Haryana/Punjab, and the Jatts being the closest population left. The South Central Asian picture just reflects an alternate history, instead of some epic invasion from the Western Steppe (as is imagined by the 24% Yamnaya of Haryana Jatts), it shows a natural gradient of EHG which reaches South Asia. A back migration from South Central Asia to the Western Steppe created Yamnaya which spread into Europe. South Asians being descended from a common ancestor of Yamnaya is why the affinity is there (and yet not there, since the WHG/EHG in South Asia is not that which Yamnaya have).

But this does change the local picture significantly as it appears more than half the admixture of some populations in South Asia is essentially of closely related but different immediate (post-IE) origin to one another. Or to put it in different terms, Khetran Baloch is 39.89% "Old, Southern Baloch" (West Asian in this case meaning "Old, Old Baloch"... the common ancestor of both Gedrosia/Caucasus) centered in Balochistan and 45.84% "Northern or Newer Baloch" centered in South Central Asia.

Dr_McNinja
05-29-2015, 10:20 PM
As for why West Asian/Gedrosian moved up into Pakistan and interrupted the gradient of EHG/SC-Asian/Steppe/whatever... that could be from the southern Jatts (and related groups) from Sindh/Balochistan who moved up the Indus to settle Punjab.

Chad Rohlfsen
05-29-2015, 10:47 PM
Hmm... I wonder if using Bedouin and EHG would be better. Yamnaya kind of takes it all, with nothing much in Bedouin. Paniya came out 45% of this ASI and 55% Yamnaya. Or, trade Bedouin out for LBK. Bedouins being like EEF with 8-10% SSA could be an issue.

tamilgangster
05-30-2015, 12:04 AM
I know about the Austro-Asiatics, but creating ASI is failing so far. I've had to drop MA1 and Yamnaya. Getting the right amount of Onge and everything else isn't working. It's not matching the k7. I'll post the last run after work.

The problem with isolating ASI is that its mixtures are so archaic that they diverge into their own component. On some tests it will be mixed in with the Austroasiatic component (the SE Asian component on Genographic), but on other tests it will be mixed in with SW Asian type ENF and archaic ANE (e.g. the South Indian component on Harappa DNA).

Sein
05-30-2015, 12:21 AM
Hmm... I wonder if using Bedouin and EHG would be better. Yamnaya kind of takes it all, with nothing much in Bedouin. Paniya came out 45% of this ASI and 55% Yamnaya. Or, trade Bedouin out for LBK. Bedouins being like EEF with 8-10% SSA could be an issue.

Very interesting, the ASI score is basically perfect (it's what we expected from Zack's Onge run).

They get Yamnaya, because the Paniya do have very high levels of ANE ancestry (they are at North Caucasian levels for that component). So the software has no other proxy to capture the 55% of their ancestry which is BEA + ANE (aka, ANI).

But the ASI score seems quite solid/reasonable. If possible, could you share the full output?

Edit: Thankfully, the ASI score is finally in line with what we expect. I guess this means that all Onge samples + all Dai samples is the way to go, whenever creating a supervised "ASI" component.

Chad Rohlfsen
05-30-2015, 12:29 AM
Reich had ASI at 64% in Paniya, 62% in Mala, so it's not there yet. I'm still playing around. I've added EHG, to see what happens.

Sein
05-30-2015, 12:33 AM
Reich had ASI at 64% in Paniya, 62% in Mala, so it's not there yet. I'm still playing around. I've added EHG, to see what happens.

Ah, but those estimates were always incorrect. They didn't account for North Eurasian ancestry in South Asia.

When using ancient DNA with f4 stats, qpAdm gives northern South Asians 10% ENA, even though the Reich paper gave them 30% ENA. Basically, those percentages aren't the real goal.

Anything resembling Zack's K11 is good, as those percentages have been verified via new formal methods and aDNA.
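
For readers unfamiliar with the f4 stats mentioned here, a bare-bones sketch of the statistic itself: f4(A, B; C, D) is the average over SNPs of (pA - pB) * (pC - pD), where p is the allele frequency in each population. This is only the core quantity. qpAdm builds a whole fitting procedure and standard errors on top of such statistics, none of which is reproduced here. The frequencies below are invented toy numbers.

```python
def f4(p_a, p_b, p_c, p_d):
    """Average of (pA - pB) * (pC - pD) across SNPs.

    Each argument is a list of allele frequencies, one per SNP, in the same
    SNP order for all four populations.
    """
    n = len(p_a)
    if n == 0 or not (len(p_b) == len(p_c) == len(p_d) == n):
        raise ValueError("need four non-empty frequency lists of equal length")
    return sum((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


# Toy allele frequencies at two SNPs (hypothetical values).
value = f4([0.1, 0.5], [0.3, 0.5], [0.2, 0.4], [0.6, 0.4])
print(round(value, 6))
```

A value near zero suggests that A and B are symmetrically related to C and D; a clear deviation from zero points to shared drift along one pairing.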

Chad Rohlfsen
05-30-2015, 12:36 AM
That set had the Kharia at the highest with 54%, so I'm not sure it's legit. I'll play with it for just a bit. Should I add one Papuan?

Sein
05-30-2015, 12:38 AM
That set had the Kharia at the highest with 54%, so I'm not sure it's legit. I'll play with it for just a bit.

Very interesting!

Certainly, it'll be interesting to see how things look with EHG.

Out of curiosity, what does HGDP00244 score, and how do Iranians and Pamiri Tajiks look, in that set?

Chad Rohlfsen
05-30-2015, 12:42 AM
Which ethnic group are they?

Sein
05-30-2015, 12:44 AM
Which ethnic group are they?

HGDP00244?

It's a Pashtun/Pathan sample, from the HGDP data-set, also found in the full Reich data-set.

Chad Rohlfsen
05-30-2015, 12:56 AM
HGDP00244?

It's a Pashtun/Pathan sample, from the HGDP data-set, also found in the full Reich data-set.

Pathans are around 35% Bedouin, 6% ASI, 4% Siberian, and 55% Yamnaya. Kalash have 38% Bedouin, 0% ASI, 4% Siberian, and 58% Yamnaya. Some Lodhi, Piramalai, Dusadh, and Chamar had the highest Yamnaya at 74%.

Sein
05-30-2015, 01:02 AM
Pathans are around 35% Bedouin 6%, ASI, 4% Siberian, and 55%Yamnaya. Kalash have 38% Bedouin, 0% ASI, 4% Siberian, 58%Yamnaya. Some Lodhi, Piramalai, Dusadh, and Chamar had the highest Yamnaya at 74%.

That's actually pretty reasonable, but only for the Pashtuns and Kalash (the South Central Asian results seem solid). In fact, qpAdm tends to have Pashtuns at 50%-55% Yamnaya, and 10%-15% ENA, in the best fits, which is quite close to the 55% Yamnaya and 11% ENA in this run.

The South Asian results are strange though.

I guess exploring this with EHG might yield something interesting.

Chad Rohlfsen
05-30-2015, 01:07 AM
That's actually pretty reasonable, but only for the Pashtuns and Kalash (the South Central Asian results seem solid). In fact, qpAdm tends to have Pashtuns at 50%-55% Yamnaya, and 10%-15% ENA, in the best fits, which is quite close to the 55% Yamnaya and 11% ENA in this run.

The South Asian results are strange though.

I guess exploring this with EHG might yield something interesting.

I think adjusting the Near East component will help. They likely got farming from Northern Mesopotamia and the Caucasus, so there shouldn't be any SSA, and should have some EEF in it. I may throw a couple LBK's with the Bedouins, as one component, to see if that brings it out better.

parasar
05-30-2015, 01:08 AM
Very interesting, the ASI score is basically perfect (it's what we expected from Zack's Onge run).

They get Yamnaya, because the Paniya do have very high levels of ANE ancestry (they are at North Caucasian levels for that component). So the software has no other proxy to capture the 55% of their ancestry which is BEA + ANE (aka, ANI).

But the ASI score seems quite solid/reasonable. If possible, could you share the full output?

Edit: Thankfully, the ASI score is finally in line with what we expect. I guess this means that all Onge samples + all Dai samples is the way to go, whenever creating a supervised "ASI" component.

Sein,

Are you sure that the Paniya have ANE? I suspect they should have minimal ANE (~0-3%) looking at their Y lines. http://journals.plos.org/plosone/article/figure/image?size=large&id=info:doi/10.1371/annotation/8663819b-5ff0-4133-b70a-2d686dfb0a44.t002.cn

Sein
05-30-2015, 01:11 AM
I think adjusting the Near East component will help. They likely got farming from Northern Mesopotamia and the Caucasus, so there shouldn't be any SSA, and should have some EEF in it. I may throw a couple LBK's with the Bedouins, as one component, to see if that brings it out better.

That sounds like a good idea.

Sein
05-30-2015, 01:13 AM
Sein,

Are sure that the Paniya have ANE? I suspect they should have minimal ANE (~0-3%) looking at their Y lines. http://journals.plos.org/plosone/article/figure/image?size=large&id=info:doi/10.1371/annotation/8663819b-5ff0-4133-b70a-2d686dfb0a44.t002.cn

Parasar,

Certainly, as everything David (of Eurogenes) tried had them at around 24%-27% ANE (a PCA-based test had them at 27%, and ADMIXTURE tends to give them around 24%).

Chad Rohlfsen
05-30-2015, 01:16 AM
What about putting Ust-Ishim into the ASI? Something deep and ancestral. Maybe that is the reason for the bogus Papuan?

Sein
05-30-2015, 01:22 AM
What about putting Ust-Ishim into the ASI? Something deep and ancestral. Maybe that is the reason for the bogus Papuan?

I think this wouldn't be good, as Ust-Ishim is equally related to all non-Africans without BEA, while ASI is just ENA.

Chad Rohlfsen
05-30-2015, 01:26 AM
Okay. I'll see how the run with EHG goes. I may take Yamnaya off of the pops too. I'll do these two, and then mess with the Near East stuff. One other thing is that when I adjust the number of Dai that I select, the numbers for ASI change. Kharia had 64% ASI with all Dai, while Paniya were in the upper 30s. Removing three Dai had the Kharia at 54% and the Paniya at 45%. I'll need suggestions as far as what to do.

parasar
05-30-2015, 01:31 AM
Parasar,

Certainly, as everything David (of Eurogenes) tried had them at around 24%-27% ANE (a PCA-based test had them at 27%, and ADMIXTURE tends to give them around 24%).

Wow! That's a lot. Then I would have to reconsider my Y correlations, as I thought that ANI ancestors as well as ASI ancestors (Ust-Ishim) derive from a Paniya-like population and that the Paniya derive about 10% of their ancestry from a (back-flow) population related to Ust-Ishim, which from their Y-DNA looks to be from the ANE branch under Ust-Ishim. That is why I put their ANE at about 0-3%.

Chad Rohlfsen
05-30-2015, 01:37 AM
Here's that K6

https://docs.google.com/spreadsheets/d/1CGwcVCgRByRas9nAiuxR1up-CDE_jbHR9_zv6zxcp00/edit?usp=sharing

Sein
05-30-2015, 01:44 AM
Okay. I'll see how the run with EHG goes. I may take Yamnaya off of the pops too. I'll do these two, and then mess with the Near East stuff. One thing too, is that adjusting the number of Dai that I select, the numbers for ASI change. Kharia had 64% ASI with all Dai, while Paniya were in the upper 30s. Removing three Dai had the Kharia at 54 and the Paniya at 45%. I'll need suggestions as far as what to do.

Sounds good.

For what it's worth, I think you should remove 4 Dai.

Also, you could try to combine MA1 and both EHG samples, as one North Eurasian component.

Sein
05-30-2015, 01:48 AM
Here's that K6

https://docs.google.com/spreadsheets/d/1CGwcVCgRByRas9nAiuxR1up-CDE_jbHR9_zv6zxcp00/edit?usp=sharing

This is very good.

Looking at the results, I think it is best to drop Nganasan, and just have Onge + Dai as the sole ENA reference.

Edit: Also, MA1 isn't his own supervised component in this run. I think this is why Yamnaya dominates South Asia, as they are the closest group to ANI among the supervised components. I think you might get better results, if both Yamnaya and MA1 are separate supervised components.

Chad Rohlfsen
05-30-2015, 01:53 AM
I need something for the Turkic and Mongol among the Hazara, Tajiks, Iranians, Uygurs, and Uzbeks.

Sein
05-30-2015, 01:55 AM
I need something for the Turkic and Mongol among the Hazara, Tajiks, Iranians, Uygurs, and Uzbeks.

Certainly.

Also, I think you might need to have MA1 as a supervised component, alongside a Yamnaya supervised component.

Chad Rohlfsen
05-30-2015, 02:03 AM
Certainly.

Also, I think you might need to have MA1 as a supervised component, alongside a Yamnaya supervised component.


I can try, but it may go all nutty, with all basal going into MA1.

Sein
05-30-2015, 02:07 AM
I can try, but it may go all nutty, with all basal going into MA1.

This is a good point.

For what it's worth, I think this test is pretty good. Still, I think the only supervised component that needs to be added is MA1. It'll make more sense for South Asia, as the North Eurasian affinity will be better distinguished from IE genetic input. We could see if this works. If not, we'll rethink some things.

Sein
05-30-2015, 02:14 AM
A side note, but I think a Han sample was accidentally included in the ASI reference (it's 100% ASI). I think it would be best to remove it from the ASI component.

Chad Rohlfsen
05-30-2015, 02:15 AM
Okay. I'll try that, after this run I'm doing. I'll add MA1 to this, and then to that K6.

Chad Rohlfsen
05-30-2015, 02:16 AM
Yes, adding the Han was the one that changed the numbers, for some reason. Maybe Austro Asiatics came after some Chinese Neolithic stuff. It helped bring up the Paniya. I'll try that K6 without the Han, to show you. Give me a bit here. All of these tests will take a couple of hours.

Sein
05-30-2015, 02:23 AM
Yes, adding the Han was the one that changed the numbers, for some reason. Maybe Austro Asiatics came after some Chinese Neolithic stuff. It helped bring up the Paniya. I'll try that K6 without the Han, to show you. Give me a bit here. All of these tests will take a couple of hours.

I have to say, thank you very much for exploring these issues. The analyses you're trying here are exceedingly interesting, and your work is sincerely appreciated.

I have to crash, so I'll rejoin the discussion tomorrow night.

Chad Rohlfsen
05-30-2015, 02:47 AM
Just EHG didn't work. I'll try it with Yamnaya and EHG.

Chad Rohlfsen
05-30-2015, 02:47 AM
I have to say, thank you very much for exploring these issues. The analyses your'e trying here are exceedingly interesting, and your work is sincerely appreciated.

I have to crash, so I'll rejoin the discussion tomorrow night.

You're welcome. I'll keep plugging away and post anything that looks interesting.

tamilgangster
05-30-2015, 11:08 AM
Here's what that new South Central Asian component looks like so far:

https://docs.google.com/spreadsheets/d/1BdR1BbYAijPNtfY02sbQNXZXqqwVOtuwrVjAcd4ljd0/edit#gid=686287602

This is still quite preliminary. It basically just shows the spread.

I'm still not sure what to do about ASI. It's completely overruling SE-Asian and Indo-Tibetan, but if I leave it out, then SE-Asian will work for the East Asian (a Han Chinese individual I think, which would be like 30% SE-Asian I think, maybe more Indo-Tibetan (should be renamed Tibeto-Burman)), but then it's not really applicable for South Asians and starts eating up their non-ASI related East Eurasian. I've stuck to 90% ASE, 10% East Eurasian because that's about the ratio I see at the fringe of the subcontinent near Balochistan, where actual non-ASI related East Eurasian should be rare to non-existent.

I've experimented with making it a hybrid. HAP S-Indian can be considered either 2/3 Gedrosian, 1/3 a population which is 47% ASE, 27% East Eurasian, 17% East African, etc or 1/2 Mideast, and 1/2 a population which is 42% ANE, 40% ASE, 15% East Eurasian, etc.

The first looks like some kind of Southeast Asian population mixed with a bunch of archaic/basal Eurasian. The second looks like a southern version of Karitiana... which actually is an interesting, although so far baseless, proposition.

But neither of those solutions will play nice with South Central Asian. I can make my own customized South Asian component which is like 50% ASE, 50% Gedrosian/Baloch, and that is alright, and lowers West Asian in South Asians. It could represent early Neolithic India as the ENF wave hit South Central Asia, and then began to move into the subcontinent. That might fix ASI issues in non-South Asian populations.

I also have to remove the modern Europe component and add Neolithic era European populations perhaps. Or maybe just a South Europe/EEF type population.

Yamnaya only seems to matter for people literally near that area, but that's because South Central Asian is the non-EHG half of Yamnaya which still has plenty of EHG (around 18-19% WHG) and is taking up what would otherwise go into Yamnaya.

Here's a map of how it seems to look at the moment: http://i.imgur.com/Qmpp1z0.png

It looks like it's getting fractured around southwestern Afghanistan since HRP0370 is showing Western Steppe, so I don't know why it isn't just going up again there. Western Steppe and a little West Asian admixture together could both have just gone into South Central Asian. I'm guessing the West Asian signal had a strong influence and stuck together. HRP0370's Caucasian has usually been hard to break up or make go into other components. In terms of numbers, there isn't enough ENF for both the demand in SC-Asian and West-Asian so it went into the latter and let the EHG/WHG go into Western Steppe rather than break the EHG/ENF proportions of SC-Asian.

On the other hand, you can see the dip in ASI, Eastern Steppe, and West Asian near Haryana. This would otherwise correlate with a peak in Western Steppe. Eastern Steppe and West Asian do not average out to South Central Asian because they don't have enough EHG whose gradient drops off sharply on the other side of the Eurasian diagonal from northeastern Europe to South Asia).

Eastern Steppe is also likely being inflated by excess East Eurasian due to ASE being locked down tight in the "ASI" component (which would otherwise free enough up for the other East Asian components).

I believe the cause of the fracture is SW Asian and Persian gene flow from groups like the Sassanids and the Caliphate, which explains the lower levels of EHG in those areas.

tamilgangster
05-30-2015, 11:17 AM
Hmm... I wonder if using Bedouin and EHG would be better. Yamnaya kind of takes it all, with nothing much in Bedouin. Paniya came out 45% of this ASI and 55% Yamnaya. Or, trade Bedouin out for LBK. Bedouins being like EEF with 8-10% SSA could be an issue.

That's because there was probably an archaic type of EHG-related component in South Asia, and that mixed with an ENF group will create the illusion of it being related to Yamnaya. For these tests, you can't use both archaic and modern components, because they might compete with one another and generate a lot of noise. It's best to use the Mal'ta boy instead for that reason. I believe LBK is as close to pure ENF as it gets. Is there a way you could isolate Basal Eurasian into a separate component? That would be interesting to see.

tamilgangster
05-30-2015, 11:27 AM
Pathans are around 35% Bedouin 6%, ASI, 4% Siberian, and 55%Yamnaya. Kalash have 38% Bedouin, 0% ASI, 4% Siberian, 58%Yamnaya. Some Lodhi, Piramalai, Dusadh, and Chamar had the highest Yamnaya at 74%.

The lack of an archaic EHG/ANE-related proxy is resulting in Yamnaya taking over. It interprets the combination of ENF and ANE as Yamnaya.

tamilgangster
05-30-2015, 11:42 AM
Sein,

Are sure that the Paniya have ANE? I suspect they should have minimal ANE (~0-3%) looking at their Y lines. http://journals.plos.org/plosone/article/figure/image?size=large&id=info:doi/10.1371/annotation/8663819b-5ff0-4133-b70a-2d686dfb0a44.t002.cn

They score ANE like every other population in South Asia, but it's fairly low. From these tests I'm estimating around 10-15%. Y-DNA haplogroups only represent a small percent of the DNA. Are you basing this on the low levels of R1a1? Piramalai Kallars have ANE of 30% and they have similar levels of Haplogroup R.

tamilgangster
05-30-2015, 11:56 AM
Okay. I'll see how the run with EHG goes. I may take Yamnaya off of the pops too. I'll do these two, and then mess with the Near East stuff. One thing too, is that adjusting the number of Dai that I select, the numbers for ASI change. Kharia had 64% ASI with all Dai, while Paniya were in the upper 30s. Removing three Dai had the Kharia at 54 and the Paniya at 45%. I'll need suggestions as far as what to do.

Kharias are an Austroasiatic tribal group; they score 69% South Indian and 25% Southeast Asian. This is basically excess East Eurasian ancestry that did not fit into the South Indian component. Paniyas have 84% South Indian. I'd expect Paniyas to have higher ASI, but their East Eurasian is much lower. You need a separate Austroasiatic-specific component. Dais, though southern Mongoloids, are not Austroasiatic. I'd recommend obtaining an East Eurasian sample which is Austroasiatic-specific. H'tin people are as close to pure Austroasiatic as one can get. Try contacting Razib to see if he has their DNA sample.

parasar
05-30-2015, 02:38 PM
They score ANE like every other population in South Asia, but it's fairly low. From these tests I'm estimating around 10-15%. Y-DNA haplogroups only represent a small percent of the DNA. Are you basing this on the low levels of R1a1? Piramalai Kallars have ANE of 30% and they have similar levels of haplogroup R.

K-M9 derivatives, and a lack of overlapping H lines.
The Y chromosome may be a small part, but for the most part we can figure out the components from the Y-distributions at the population level (rather than for an individual).

parasar
05-30-2015, 03:05 PM
... at around 24%-27% ANE (a PCA-based test had them at 27%, and ADMIXTURE tends to give them around 24%).



K-M9 derivatives, and a lack of overlapping H lines.
The Y chromosome may be a small part, but for the most part we can figure out the components from the Y at the population level (rather than for an individual).

I was looking at their k12b numbers and it is difficult to figure out how they can have the high ANE %ages noted by Sein. Their South Asian overlap will give them some.

PANIYA
Gedrosia 0
Siberian 2.7
Northwest_African 0
Southeast_Asian 62.4
Atlantic_Med 0
North_European 0.2
South_Asian 23.8
East_African 0.5
Southwest_Asian 0
East_Asian 8.8
Caucasus 0
Sub_Saharan 1.6
https://docs.google.com/spreadsheet/ccc?key=0ArJDEoCgzRKedEY4Y3lTUVBaaFp0bC1zZlBDcTZEYlE

Sein
05-30-2015, 05:01 PM
I was looking at their k12b numbers and it is difficult to figure out how they can have the high ANE %ages noted by Sein. Their South Asian overlap will give them some.

PANIYA
Gedrosia 0
Siberian 2.7
Northwest_African 0
Southeast_Asian 62.4
Atlantic_Med 0
North_European 0.2
South_Asian 23.8
East_African 0.5
Southwest_Asian 0
East_Asian 8.8
Caucasus 0
Sub_Saharan 1.6
https://docs.google.com/spreadsheet/ccc?key=0ArJDEoCgzRKedEY4Y3lTUVBaaFp0bC1zZlBDcTZEYlE

These are different Paniya samples, from the Behar data-set. Zack found out that these are mislabeled Southeast Asian samples + actual Paniya samples.

Chad Rohlfsen
05-30-2015, 06:08 PM
I just had a very strange run. After I put in a bunch of Malays, it went crazy. I now have a component that is 74% in Paniyas and 23% in the Kalash. Maybe it's just another form of that South Asian though.

Sein
05-30-2015, 06:24 PM
I just had a very strange run. After I put in a bunch of Malays, it went crazy. I now have a component that is 74% in Paniyas and 23% in the Kalash. Maybe it's just another form of that South Asian though.

That's quite interesting, as the behavior of this component is identical to HarappaWorld's "South-Indian".

For what it's worth, the Malay have some West Eurasian admixture, via India.

Chad Rohlfsen
05-30-2015, 06:35 PM
Yeah, I'm trying to figure something out here. Only being about 50% ASI doesn't make a lot of sense to me, if they're about 45% Onge. Although, I'm not sure how Zack did his run. These tribals are lower coverage, so maybe he is not getting the real signal. I've got a list that makes 98.9% coverage. Does mine seem more accurate than his?

parasar
05-30-2015, 06:42 PM
I notice that the Puliyar have 0.5% Gedrosia.

Metspalu's Paniya have no k5 - which should translate to no Gedrosia.
On the PCA these Paniya fall within South Asians farthest down the cline towards Dravidians - Supplementary figure 12 - http://www.sciencedirect.com/science/MiamiMultiMediaURL/1-s2.0-S0002929711004885/1-s2.0-S0002929711004885-mmc1.pdf/276895/html/S0002929711004885/46015407af7db60fe139cbf0ee8160ee/mmc1.pdf

http://ars.els-cdn.com/content/image/1-s2.0-S0002929711004885-gr2.jpg


Raghavan et al. used the Behar Paniya too.
Behar et al . 2010 Paniya 4 who share some of the lowest drift among South Asians with MA-1
http://img811.imageshack.us/img811/4151/zubo.png


Moorjani et al. had 5 post curation Paniya samples that had the lowest ANI - "ANI ancestry along the Indian cline ranges from as low as 17% (Paniya)"

Arunkumar's dataset had 72 Paniya samples. 75% of these are Y-F (likely early divergent H lines) and 15% are Y-C, showing about 10% ANI influence.
Very puzzling!

Kurd
05-30-2015, 06:52 PM
I just had a very strange run. After I put in a bunch of Malays, it went crazy. I now have a component that is 74% in Paniyas and 23% in the Kalash. Maybe it's just another form of that South Asian though.

I don't have access to my computer or spreadsheets, but I think I recall those numbers being consistent with the "S Asian" numbers in one of the low K calculators.

Chad Rohlfsen
05-30-2015, 06:58 PM
Sein,
Question for you. I dropped most of the Malays. I've got one that only shows 2% Yamnaya. I'm thinking of dropping a Dai for them. That would only make the component about .1% Yamnaya. Would that be reasonable, to see if it gives a better result?

Chad Rohlfsen
05-30-2015, 07:08 PM
Those k runs show similar stuff to what I get, unsupervised. I think showing 50% Yamnaya is feasible, when they lack much near eastern to make ANI. If Yamnaya can be modeled as 35% ANI and 65% EHG, it makes sense. Georgians were the best proxy and also go into Yamnaya at about that rate. This might be alright. I'll play around a bit and post another run soon.

One of you can take the scores and fsts to make a plot. We'll see how it looks.

Kurd
05-30-2015, 07:18 PM
Those k runs show similar stuff to what I get, unsupervised. I think showing 50% Yamnaya is feasible, when they lack much near eastern to make ANI. If Yamnaya can be modeled as 35% ANI and 65% EHG, it makes sense. Georgians were the best proxy and also go into Yamnaya at about that rate. This might be alright. I'll play around a bit and post another run soon.

One of you can take the scores and fsts to make a plot. We'll see how it looks.

I don't mind putting your spreadsheet on a PCA. Takes me about 30 mins. I should have my computer back in the next 1 to 3 days.

Chad Rohlfsen
05-30-2015, 07:51 PM
I don't mind putting your spreadsheet on a PCA. Takes me about 30 mins. I should have my computer back in the next 1 to 3 days.

Okay, I'll do a few more runs and post one up here, with the fst distances.

Chad Rohlfsen
05-30-2015, 08:47 PM
These are different Paniya samples, from the Behar data-set. Zack found out that these are mislabeled Southeast Asian samples + actual Paniya samples.

It depends on reference pops too. If they lack an Onge and Papuan, they'll be covered by Tai Dai speakers.

everest59
05-30-2015, 09:27 PM
I was looking at their k12b numbers and it is difficult to figure out how they can have the high ANE %ages noted by Sein. Their South Asian overlap will give them some.

PANIYA
Gedrosia 0
Siberian 2.7
Northwest_African 0
Southeast_Asian 62.4
Atlantic_Med 0
North_European 0.2
South_Asian 23.8
East_African 0.5
Southwest_Asian 0
East_Asian 8.8
Caucasus 0
Sub_Saharan 1.6
https://docs.google.com/spreadsheet/ccc?key=0ArJDEoCgzRKedEY4Y3lTUVBaaFp0bC1zZlBDcTZEYlE

Actually, I uploaded a Paniya sample to Gedmatch a while ago. The Kit # is M960763. This is what I get:

Gedrosia 21.07%
Siberian 0.32%
Northwest_African -
Southeast_Asian 3.08%
Atlantic_Med -
North_European -
South_Asian 75.54%

You know what else is interesting? The eye-color prediction.

Now, the K12b South Asian is basically similar to Harappa's South Indian. IMO, it does not have a whole lot of ANE. As we can see though, the Paniya tend to get some Gedrosia as well.

Actually, I just did a one-to-one comparison between Paniya and MA-1. At 2.0 cM with minimum threshold of 100 SNPs and bunching limit of 50, the sample gets around 7.3 cM total match.
I get 25.9 cM with the same parameters.
So, the Paniya probably have some ANE, but I highly doubt 20 something percent.

Here is a comparison to a Northern European individual. With the same options selected, this European individual gets 14.4 cM.
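
For anyone wanting to reproduce the "total match" figures offline, here is a minimal sketch of how such a total could be tallied from a list of shared segments. It assumes the total is simply the sum of segments that pass the minimum-length and minimum-SNP thresholds; the `Segment` record, the function name, and the example segments are all hypothetical, and the bunching limit is not modeled, so this is not a description of how GEDmatch actually computes it.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One shared segment between two kits (hypothetical record layout)."""
    length_cm: float  # genetic length in centimorgans
    snp_count: int    # number of SNPs inside the segment


def total_shared_cm(segments, min_cm=2.0, min_snps=100):
    """Sum the lengths of segments that pass both thresholds."""
    return sum(
        s.length_cm
        for s in segments
        if s.length_cm >= min_cm and s.snp_count >= min_snps
    )


# Made-up segments, for illustration only (not real GEDmatch output).
example = [
    Segment(3.1, 240),  # kept
    Segment(2.4, 150),  # kept
    Segment(1.8, 300),  # dropped: shorter than 2.0 cM
    Segment(2.2, 80),   # dropped: fewer than 100 SNPs
]
print(round(total_shared_cm(example), 1))
```

Under this reading, lowering either threshold can only raise the total, which is one reason totals taken at different settings are not comparable.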

Dr_McNinja
05-31-2015, 01:19 AM
Actually, I uploaded a Paniya sample to Gedmatch a while ago. The Kit # is M960763. This is what I get:

Gedrosia 21.07%
Siberian 0.32%
Northwest_African -
Southeast_Asian 3.08%
Atlantic_Med -
North_European -
South_Asian 75.54%

You know what else is interesting? The eye-color prediction.

Now, the K12b South Asian is basically similar to Harappa's South Indian. IMO, it does not have a whole lot of ANE. As we can see though, the Paniya tend to get some Gedrosia as well.

Actually, I just did a one-to-one comparison between Paniya and MA-1. At 2.0 cM with minimum threshold of 100 SNPs and bunching limit of 50, the sample gets around 7.3 cM total match.
I get 25.9 cM with the same parameters.
So, the Paniya probably have some ANE, but I highly doubt 20 something percent.

Here is a comparison to a Northern European individual. With the same options selected, this European individual gets 14.4 cM.

Eurogenes K7:

Population
ANE 27.00%
ASE 27.69%
WHG-UHG -
East_Eurasian 10.63%
West_African 0.44%
East_African 6.54%
ENF 27.70%

More ANE and ASE than HAP S-Indian. But less ENF.

Dr_McNinja
05-31-2015, 01:29 AM
Eurogenes K7:

Population
ANE 27.00%
ASE 27.69%
WHG-UHG -
East_Eurasian 10.63%
West_African 0.44%
East_African 6.54%
ENF 27.70%

More ANE and ASE than HAP S-Indian. But less ENF.

The East African is slightly higher which suggests it's possibly not related to the ENF. Modern Bedouin is only around 8-9% East African. So if it's from something Gedrosian-like (has both ANE and ENF), that leaves a non-Gedrosian ASI that has 20% ANE.

Chad Rohlfsen
05-31-2015, 02:02 AM
K5 fellas. It's looking better and better. Still tinkering.

https://docs.google.com/spreadsheets/d/1ESoOszTLdzPhKrnN8bw4B3UZS0AXrkabsm9JoNn9g-8/edit?usp=sharing

Dr_McNinja
05-31-2015, 02:54 AM
Actually, I uploaded a Paniya sample to Gedmatch a while ago. The Kit # is M960763. This is what I get:

Gedrosia 21.07%
Siberian 0.32%
Northwest_African -
Southeast_Asian 3.08%
Atlantic_Med -
North_European -
South_Asian 75.54%

You know what else is interesting? The eye-color prediction.

Now, the K12b South Asian is basically similar to Harappa's South Indian. IMO, it does not have a whole lot of ANE. As we can see though, the Paniya tend to get some Gedrosia as well.

Actually, I just did a one-to-one comparison between Paniya and MA-1. At 2.0 cM with minimum threshold of 100 SNPs and bunching limit of 50, the sample gets around 7.3 cM total match.
I get 25.9 cM with the same parameters.
So, the Paniya probably have some ANE, but I highly doubt 20 something percent.

Here is a comparison to a Northern European individual. With the same options selected, this European individual gets 14.4 cM.

Not sure if that method is the most reliable. I ran a bunch of kits through that and for example, Sapporo gets 16 cM.

Dr_McNinja
05-31-2015, 02:57 AM
Eurogenes K7:

Population
ANE 27.00%
ASE 27.69%
WHG-UHG -
East_Eurasian 10.63%
West_African 0.44%
East_African 6.54%
ENF 27.70%

More ANE and ASE than HAP S-Indian. But less ENF.

The ASE:East_Eurasian ratio here is 72:28 or so, still near the 70:30 mark of HAP S-Indian. It seems like that's the combination Admixture keeps focusing on but that isn't relevant for populations outside that narrow geographic range of Southeast India.

But it does point to a high ANE/high ASE population if the trend is drawn out. Like what I found earlier about a "southern" version of Karitiana.
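
The 72:28 figure can be checked directly from the two K7 values quoted above. A quick sketch (the function name is just for illustration):

```python
def two_way_split(a, b):
    """Return the shares of a and b as percentages of their sum."""
    total = a + b
    return 100 * a / total, 100 * b / total


# Eurogenes K7 values for the Paniya sample quoted above.
ase, east_eurasian = 27.69, 10.63

ase_share, ee_share = two_way_split(ase, east_eurasian)
print(f"{ase_share:.1f}:{ee_share:.1f}")
```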

everest59
05-31-2015, 03:00 AM
Not sure if that method is the most reliable. I ran a bunch of kits through that and for example, Sapporo gets 16 cM.

Yeah, I think you may be right about that. The program doesn't seem very well-developed at all.

Chad Rohlfsen
05-31-2015, 03:23 AM
I'm adding EHG and Ulchi for NE Asian. Hopefully, it cleans it up more.

Sein
05-31-2015, 03:30 AM
Yeah, I'm trying to figure something out here. Only being about 50% ASI doesn't make a lot of sense to me, if they're about 45% Onge. Although, I'm not sure how Zack did his run. These tribals are lower coverage, so maybe he is not getting the real signal. I've got a list that makes 98.9% coverage. Does mine seem more accurate than his?

The K6 (with Yamnaya, and a Onge + Dai "ASI" component) seemed very accurate at distinguishing between different kinds of ENA. There is a rather good separation between South Asian-specific ENA ("ASI"), East Asian ENA ("SE Asian"), and Siberian ENA + ANE ("Siberian"), in that run.

The only trouble with that run is that it seemed to make Yamnaya into an "Ancestral North Indian" component.

Then again, this might not be inaccurate, if Yamnaya's "Armenian" half was from a West Asian population that hugely affected the genetic landscape in South Asia.

Sein
05-31-2015, 03:31 AM
K5 fellas. It's looking better and better. Still tinkering.

https://docs.google.com/spreadsheets/d/1ESoOszTLdzPhKrnN8bw4B3UZS0AXrkabsm9JoNn9g-8/edit?usp=sharing

This one looks pretty good!

Chad Rohlfsen
05-31-2015, 03:31 AM
Opinions on the K5?

Chad Rohlfsen
05-31-2015, 03:35 AM
Yeah, If ANI is Georgian like, then those with Near East plus Yamnaya are getting exactly that. Those with no Near East are little ANI with extra EHG/ANE. I'm hoping this run can sort that out. Doubtful, but worth a shot. There's still an obvious cline. The Chamar were 27% ANE, correct? They're almost 70% Yamnaya, which would cover it.

Sein
05-31-2015, 03:36 AM
Opinions on the K5?


In all honesty, it looks better than the K6. I guess Onge + Kinh yields better results, versus Onge + Dai.

I think you're quite close to having a solid ADMIXTURE run for South Asians.

Chad Rohlfsen
05-31-2015, 03:41 AM
In all honesty, it looks better than the K6. I guess Onge + Kinh yields better results, versus Onge + Dai.

I think you're quite close to having a solid ADMIXTURE run for South Asians.

Thanks. I have a feeling there is something that I need to do. I think that adding Ust Ishim to it, would be like finding something deeper still in India or make up for the drift in the Onge. It would only be one sample among 18 others, so it shouldn't really affect West Eurasian stuff. I think it just lacks a modern proxy with drift. He must be so basal to ENA that only a modern West Eurasian and African can cover it. That African could really help, as it would pull my ASI from SE Asia and more into South Asia.

Sein
05-31-2015, 03:46 AM
Thanks. I have a feeling there is something that I need to do. I think that adding Ust Ishim to it, would be like finding something deeper still in India or make up for the drift in the Onge. It would only be one sample among 18 others, so it shouldn't really affect West Eurasian stuff. I think it just lacks a modern proxy with drift. He must be so basal to ENA that only a modern West Eurasian and African can cover it. That African could really help, as it would pull my ASI from SE Asia and more into South Asia.

It's worth a try, but I doubt it'll yield anything good. I think Everest tried to fit South Asians with Ust-Ishim, using qpAdm, and the fits were always failures.

But perhaps you could find a signal, via ADMIXTURE.

Chad Rohlfsen
05-31-2015, 03:49 AM
Yeah, mine will be a little different as this component is flooded with the Onge and Kinh. It shouldn't make a drastic shift.

Chad Rohlfsen
05-31-2015, 04:01 AM
I think I should remove one Kinh from the list. I don't like 30% ASI in Cambodians and Malays. How about you guys? It'll probably drop the Paniya to 50-50.

Sein
05-31-2015, 04:13 AM
I think I should remove one Kinh from the list. I don't like 30% ASI in Cambodians and Malays. How about you guys? It'll probably drop the Paniya to 50-50.

Sounds good.

For what it's worth, I was wondering if adding MA1 could also help, as some of the East Asian percentages are reflective of ANE, since MA1 scores around 17%.

Chad Rohlfsen
05-31-2015, 04:20 AM
MA1 and EHG tend to steal away the South Indians. I've had to drop it. I'm keeping the Ulchi, for now. I'll have that run in the next hour.

Dr_McNinja
05-31-2015, 04:20 AM
Thanks. I have a feeling there is something that I need to do. I think that adding Ust Ishim to it, would be like finding something deeper still in India or make up for the drift in the Onge. It would only be one sample among 18 others, so it shouldn't really affect West Eurasian stuff. I think it just lacks a modern proxy with drift. He must be so basal to ENA that only a modern West Eurasian and African can cover it. That African could really help, as it would pull my ASI from SE Asia and more into South Asia.

The Ust-Ishim signal kept disappearing for South Asians for me with anything beyond the basic ANE/ENF/ASE components. It reappeared immediately outside South Asia in Afghans, Central Asians, and Iranians (highest in Afghans). Even when I had just South Central Asian and Eurogenes ASE, it still disappeared into South Central Asian which had WHG. Perhaps with a WHG-less Gedrosian component it would reappear. Although I was thinking of just combining it as a minor part of South Central Asian since that's close to where it's peaking.

Chad Rohlfsen
05-31-2015, 04:24 AM
I think EHG/ANE needs to wait until I can add Europeans. I'm starting with 500 extra West and East Asians.

Chad Rohlfsen
05-31-2015, 04:26 AM
The Ust-Ishim signal kept disappearing for South Asians for me with anything beyond the basic ANE/ENF/ASE components. It reappeared immediately outside South Asia in Afghans, Central Asians, and Iranians (highest in Afghans). Even when I had just South Central Asian and Eurogenes ASE, it still disappeared into South Central Asian which had WHG. Perhaps with a WHG-less Gedrosian component it would reappear. Although I was thinking of just combining it as a minor part of South Central Asian since that's close to where it's peaking.

Interesting. If this run fails, I'll go back and adjust the k5.

Chad Rohlfsen
05-31-2015, 04:34 AM
David has also provided me with a new SNP list. I'll try a run with that one after finding something that makes sense.

Sein
05-31-2015, 04:56 AM
I think EHG/ANE needs to wait until I can add Europeans. I'm starting with 500 extra West and East Asians.

That might be a very good addition, perhaps Europeans are needed?

Chad Rohlfsen
05-31-2015, 05:06 AM
The K6 may actually work. I'm removing a couple of Kinh from the ASI.

Edit: Scratch that. I'm going back to K5, with all of the new pops. I'll have to adjust it after that.

parasar
05-31-2015, 05:09 AM
Actually, I uploaded a Paniya sample to Gedmatch a while ago. The Kit # is M960763. This is what I get:

Gedrosia 21.07%
Siberian 0.32%
Northwest_African -
Southeast_Asian 3.08%
Atlantic_Med -
North_European -
South_Asian 75.54%

You know what else is interesting? The eye-color prediction.

Now, the K12b South Asian is basically similar to Harappa's South Indian. IMO, it does not have a whole lot of ANE. As we can see though, the Paniya tend to get some Gedrosia as well.

Actually, I just did a one-to-one comparison between Paniya and MA-1. At 2.0 cM with minimum threshold of 100 SNPs and bunching limit of 50, the sample gets around 7.3 cM total match.
I get 25.9 cM with the same parameters.
So, the Paniya probably have some ANE, but I highly doubt 20 something percent.

Here is a comparison to a Northern European individual. With the same options selected, this European individual gets 14.4 cM.

Is that the case with just this sample or with others too?
If Moorjani et al. is correct and the Paniya have 17% ANI, I can't see how they can have anywhere close to even 20% ANE.

Sein
05-31-2015, 05:33 AM
The K6 may actually work. I'm removing a couple of Kinh from the ASI.

Edit: Scratch that. I'm going back to K5, with all of the new pops. I'll have to adjust it after that.

Sounds great.

For what it's worth, I have a few ideas. Mainly, you could reduce the Kalash to only 1 sample, and have only 1 sample per Pamiri Tajik sub-population (1 Rushan, 1 Ishkashim, 1 Shugnan, etc). The Kalash are very drifted, so having only one sample will give better results.

Also, if you add Karitiana and Kets + Selkup, South Indians won't overtake the MA1 and EHG components. In fact, with Native Americans and those particular Siberians added, you'll probably be able to manage two North Eurasian components (EHG and ANE)

Sein
05-31-2015, 05:34 AM
Is that the case with just this sample or with others too?
If Moorjani et al. is correct and the Paniya have 17% ANI, I can't see how they can have anywhere close to even 20% ANE.

In Chad's K5, the Paniya are around 44% ANI. I think they were also around 45% West Eurasian in Zack's K11.

David's first supervised MA1-based ADMIXTURE tests also had them at around 40%-45% West Eurasian, and his PCA-based ANE test had them at 49% West Eurasian.

Generalissimo
05-31-2015, 05:55 AM
The most ENA Paniya gets the following in the K8.

ANE 20.14
South_Eurasian 24.95
ENF 17.98
East_Eurasian 17.52
WHG 0.01
Oceanian 11.4
Pygmy 0
Sub-Saharan 8

This is supported by genotype-based PCA, because the Paniya aren't all that far from other Indians, which means that an ANI proportion of 17% is totally unrealistic.

Anyway, the ASI/ANI analysis was designed before anyone knew about ANE, so who knows how accurate it is.

Kurd
05-31-2015, 06:21 AM
@chad
Here is your K5 PCA. I can plot averages to reduce clutter tomorrow. Alternatively, on your next spreadsheet, you can omit similar populations. The k5 did not look bad. I will take a closer look tomorrow.
http://i.imgur.com/QKKM8jT.jpg
http://i.imgur.com/U6jBJS0.jpg

Chad Rohlfsen
05-31-2015, 06:36 AM
@chad
Here is your K5 PCA. I can plot averages to reduce clutter tomorrow. Alternatively, on your next spreadsheet, you can omit similar populations. The k5 did not look bad. I will take a closer look tomorrow.
http://i.imgur.com/QKKM8jT.jpg
http://i.imgur.com/U6jBJS0.jpg

It'll change slightly tomorrow. I'm going to omit a Kinh. I may look for a sub that brings it up in South Asia. It may take another day. I'm still running some stuff.

Chad Rohlfsen
05-31-2015, 06:56 AM
The extra pops are really making this ASI cluster behave strangely. It's still huge in the Kharia and Paniya. Paniya are up to 59% of this, but it's also huge in SE Asia. For instance, the Cambodians and Thais get around 40% of it, Malays at almost 50%. That was with 3 Kinh dropped off already. I just dropped off two more. I'll see what happens.

Sein
05-31-2015, 07:04 AM
The extra pops are really making this ASI cluster behave strangely. It's still huge in the Kharia and Paniya. Paniya are up to 59% of this, but it's also huge in SE Asia. For instance, the Cambodians and Thais get around 40% of it, Malays at almost 50%. That was with 3 Kinh dropped off already. I just dropped off two more. I'll see what happens.

In theory, ASI should be heavy in Southeast Asia, as ASI in India represents a genetic continuum with Southeast Asia.

How do the Pashtuns and Baloch look?

Chad Rohlfsen
05-31-2015, 07:19 AM
In theory, ASI should be heavy in Southeast Asia, as ASI in India represents a genetic continuum with Southeast Asia.

How do the Pashtuns and Baloch look?

Pathan all between 9-17% and Balochi from 6-15%. It's odd though, as in Zack's run the Kalash had 0% Onge, but getting 6-9% ASI here, and Cambodians had 17% Onge, or about 25% ASI, but it's up over 40% here. It's in West Asia now, at 1-5%, but that may be extra basal. I'm not so sure about this. Some Han are as high as 30%. It doesn't seem right. Maybe, switching back to Dai, and a separate Onge component will help. I'm going to dump the LBK or throw some Bedouin in it.

Sein
05-31-2015, 07:23 AM
Pathan all between 9-17% and Balochi from 6-15%. It's odd though, as in Zack's run the Kalash had 0% Onge, but getting 6-9% ASI here, and Cambodians had 17% Onge, or about 25% ASI, but it's up over 40% here. I'm not so sure about this. Some Han are as high as 30%. It doesn't seem right. Maybe, switching back to Dai, and a separate Onge component will help.

I think it's a good sign that the Kalash are finally showing ASI, that makes sense. The Pashtun and Baloch percentages also make sense.

Although, I do agree that the Han percentages are quite higher than anything expected.

If possible, could you share the full output?

Edit: It's nice that it's showing in West Asia, as qpAdm puts a very small gap between Pashtun and Iranian ENA levels.

Chad Rohlfsen
05-31-2015, 07:33 AM
I think it's a good sign that the Kalash are finally showing ASI, that makes sense. The Pashtun and Baloch percentages also make sense.

Although, I do agree that the Han percentages are quite higher than anything expected.

If possible, could you share the full output?

I think I know what to do. I'm going to make a NE Asian pop, to get rid of a good chunk of the ASI in China. Japanese have no Onge, apparently, so I'll use them. Everyone else is too much of a wildcard on the ANE/EHG.

Chad Rohlfsen
05-31-2015, 07:38 AM
https://docs.google.com/spreadsheets/d/1QrEKtd0y7gGfHD-8fXuCotCMhoq1lOqRMubr8H_BSdE/edit?usp=sharing

Chad Rohlfsen
05-31-2015, 07:45 AM
Actually, picking out those two Ami that are usually peaking should fix it, I hope.

Sein
05-31-2015, 07:54 AM
Actually, picking out those two Ami that are usually peaking should fix it, I hope.

Also, perhaps adding 1 Kinh back.

Chad Rohlfsen
05-31-2015, 07:58 AM
David and anyone else,

Looking at some PCA's it looks like the Ami and Atayal are Onge shifted, compared to the Han. Should I use the Han, or stick to the Ami and Atayal?

Generalissimo
05-31-2015, 08:08 AM
The Han are Onge shifted as well. In other words, they're also mixed with something ASI-like.

Have a look at the North Han and Koreans. They should look somewhat different, and if so, then you're on the right track.

Chad Rohlfsen
05-31-2015, 08:11 AM
Back in business here. It's working with all these pops. Just adding a couple more Kinh and it should be good to go.

Chad Rohlfsen
05-31-2015, 08:12 AM
Yeah, I am wondering if I should use the Japanese. The Koreans fall within them. North Han have like 1-2% Onge. Southern Han show a little more. I know Koreans and Japanese show none. The tricky part is which one is more West Eurasian. Do you have any idea? I'm using the Ulchi, but I don't know how well they'll behave.

Chad Rohlfsen
05-31-2015, 08:18 AM
What do Japanese and Koreans get for ANE on the K8? Or, who peaked out the East Eurasian?

Generalissimo
05-31-2015, 08:50 AM
What do Japanese and Koreans get for ANE on the K8? Or, who peaked out the East Eurasian?

Japanese and North Han get 1-2% of ANE. Don't know about Koreans.

East Eurasian peaks in Hezhen at 91%.

Chad Rohlfsen
05-31-2015, 09:10 AM
Hmm. Do I dare go with the Japanese? Or, would the Ulchi make more sense?

Generalissimo
05-31-2015, 09:19 AM
Hmm. Do I dare go with the Japanese? Or, would the Ulchi make more sense?

Ulchi look even more East Eurasian than Japanese based on what I've seen.

Dr_McNinja
05-31-2015, 12:53 PM
The most ENA Paniya gets the following in the K8.

ANE 20.14
South_Eurasian 24.95
ENF 17.98
East_Eurasian 17.52
WHG 0.01
Oceanian 11.4
Pygmy 0
Sub-Saharan 8

This is supported by genotype-based PCA, because the Paniya aren't all that far from other Indians, which means that an ANI proportion of 17% is totally unrealistic.

Anyway, the ASI/ANI analysis was designed before anyone knew about ANE, so who knows how accurate it is.

Can you run the Onge through the K8?

Generalissimo
05-31-2015, 01:15 PM
Can you run the Onge through the K8?

Nope.

Kurd
05-31-2015, 02:15 PM
This is based on your latest spreadsheet. I averaged your pops to reduce clutter. I have also posted PC3 PC4. You can get total distance by adding the distance from one pop to the other from PC1/PC2 to the distance for PC3/PC4. Thus, you can check the total distance from say Mbuti to Onge, or Mbuti to LBK, or Mbuti to Ust to see if it is consistent with population dispersion/differentiation theories. You can also check to see if the clines from one pop to the other are consistent with the latest theories.

http://i.imgur.com/chwfVIC.jpg
http://i.imgur.com/qViZpOv.jpg
http://i.imgur.com/q2OtUJh.jpg
http://i.imgur.com/JXgjhCQ.jpg


EDIT: Observation: the Paniya and Austronesian pops are most similar to Ust Ishim according to your data
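
A note on combining distances across the two plots, as a minimal sketch. If the PCs are treated as orthogonal axes (the usual PCA setup), the distance over all four PCs is the square root of the sum of the squared per-plot distances, so simply adding the PC1/PC2 and PC3/PC4 distances gives an upper bound rather than the exact figure. The coordinates below are hypothetical and were not read off the posted plots.

```python
import math


def pc_distance(p, q):
    """Euclidean distance between two points given as PC coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical PC1..PC4 coordinates for two populations (not from the plots).
pop_a = (0.0, 0.0, 0.0, 0.0)
pop_b = (3.0, 4.0, 0.0, 12.0)

d12 = pc_distance(pop_a[:2], pop_b[:2])  # distance on the PC1/PC2 plot
d34 = pc_distance(pop_a[2:], pop_b[2:])  # distance on the PC3/PC4 plot
d_all = pc_distance(pop_a, pop_b)        # distance over all four PCs

# d_all equals hypot(d12, d34) and never exceeds d12 + d34.
print(d12, d34, d_all, d12 + d34)
```

Whether the axes should also be scaled by their eigenvalues depends on how the PCA was produced, which the posts do not specify.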

Chad Rohlfsen
05-31-2015, 03:27 PM
Almost got it guys!!!! It's very close. Just had a run with the Paniya at top with 62% ASI. I might have it today.

parasar
05-31-2015, 03:36 PM
The most ENA Paniya gets the following in the K8.

ANE 20.14
South_Eurasian 24.95
ENF 17.98
East_Eurasian 17.52
WHG 0.01
Oceanian 11.4
Pygmy 0
Sub-Saharan 8

This is supported by genotype-based PCA, because the Paniya aren't all that far from other Indians, which means that an ANI proportion of 17% is totally unrealistic.

Anyway, the ASI/ANI analysis was designed before anyone knew about ANE, so who knows how accurate it is.

ANI/ASI is a two component scenario and its accuracy is not related to ANE.

From what we know now ASI shares ancestry with ANE but has no ANE. So if the Paniya are 83% ASI, their ANE has to be well below 17%.

This is consistent with their location on the PCA, no doubt they are not far from other Indians, but they anchor the two clines on the PCA.

http://2.bp.blogspot.com/-E9i6jahCv_4/UgPApW5yg1I/AAAAAAAAJBA/OvlrTcEdAYE/s1600/moorjani.jpg

Chad Rohlfsen
05-31-2015, 03:50 PM
ANI/ASI is a two component scenario and its accuracy is not related to ANE.

From what we know now ASI shares ancestry with ANE but has no ANE. So if the Paniya are 83% ASI, their ANE has to be well below 17%.

This is consistent with their location on the PCA, no doubt they are not far from other Indians, but they anchor the two clines on the PCA.

http://2.bp.blogspot.com/-E9i6jahCv_4/UgPApW5yg1I/AAAAAAAAJBA/OvlrTcEdAYE/s1600/moorjani.jpg

I've only seen the Paniya listed as high as 69% ASI. I don't think it's quite that high. They don't even look to have that much East Eurasian.

Kurd
05-31-2015, 04:25 PM
This is based on your latest spreadsheet. I averaged your pops to reduce clutter. I have also posted PC3 PC4. You can get total distance by adding the distance from one pop to the other from PC1/PC2 to the distance for PC3/PC4. Thus, you can check the total distance from say Mbuti to Onge, or Mbuti to LBK, or Mbuti to Ust to see if it is consistent with population dispersion/differentiation theories. You can also check to see if the clines from one pop to the other are consistent with the latest theories.

http://i.imgur.com/chwfVIC.jpg
http://i.imgur.com/qViZpOv.jpg
http://i.imgur.com/q2OtUJh.jpg
http://i.imgur.com/JXgjhCQ.jpg


EDIT: Observation: the Paniya and Austronesian pops are most similar to Ust Ishim according to your data

Some other observations supported by the spreadsheet:

ASI and Yamnaya are only slightly positively correlated: an increase in one goes with only a slight increase in the other (almost uncorrelated). The same holds true for ASI vs Atayal.
SSA and ASI are negatively correlated. Generally, an increase in one goes with a decrease in the other.
Yamnaya is most similar to MA1.
Yamnaya cannot be accurately modeled as a mix of any 2 ancestral spreadsheet populations.
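
Correlations like the ones listed above can be checked from the spreadsheet columns with a plain Pearson coefficient. Here is a minimal sketch; the per-population percentages are placeholders, not values from Chad's spreadsheet, and the method used for the observations above is not stated in the thread.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder component percentages, one entry per population (hypothetical).
asi = [5.0, 12.0, 30.0, 45.0, 60.0]
yamnaya = [50.0, 55.0, 48.0, 52.0, 38.0]

print(round(pearson(asi, yamnaya), 2))
```

One caveat worth keeping in mind: ADMIXTURE proportions sum to 100% per sample, so components are not free to vary independently, and some negative correlation between them is built in.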

everest59
05-31-2015, 04:25 PM
I think Paniyas are probably a 50-50 mix between ANI and ASI, although the new software seems to suggest 30% ASI from what I recall.

Anabasis
05-31-2015, 04:34 PM
I am not an expert like you, but my view on creating components is that no Bronze Age population should be used as a reference population. The Yamna samples are not a "pure", specific population; they had already mixed with other populations. Whenever I see a component at 99% for a Yamnaya sample, it does not make sense to me. That will confuse everything in your work. That's my opinion. Regards.

everest59
05-31-2015, 04:42 PM
Is that the case with just this sample or with others too?
If Moorjani et al. is correct and the Paniya have 17% ANI, I can't see how they can have anywhere close to even 20% ANE.

They underestimated ANI in the Paniya. Now, some of the results I posted before seemed to suggest 20-30% ANE, but the issue there is the software cannot separate out ANE and ASI.
So I feel that the shared drift data are more valid. If I had to guess, probably 6 or 7% at best.
We need to realize, MA-1 is a 20k year old sample. Comparing it to modern samples may not be a good idea. Now, the Yamnaya is at least more modern...I can see why Haak came up with a new model.

parasar
05-31-2015, 04:45 PM
I've only seen the Paniya listed as high as 69% ASI. I don't think it's quite that high. They don't even look to have that much East Eurasian.

In the simplistic two-way admixture scenario of Reich's ANI/ASI, as modified by Moorjani, the Paniya are at 83% ASI and 17% ANI.

Table S4: Ancestry estimates from F4 Ratio Estimation

Paniya 5 Dravidian Tribal

We performed F4 Ratio Estimation to estimate the proportion of ANI ancestry in Indians. Specifically, we use the following statistics:

16.7 ± 2.4 §ANI ancestry (Pop2 = Basque) = f4(YRI, Basque; X, Onge)/ f4(YRI, Basque; Georgian, Onge);

16.8 ± 2.1 §ANI ancestry (Pop2 =Abhkasian) = f4(YRI, Abhkasian; X, Onge)/ f4(YRI, Abhkasian; Georgian, Onge);

22.5 ± 1.6 §ANI ancestry (Reich et al., 09) = f4(Adygei, Papuan; X, Onge)/ f4(Adygei, Papuan; CEU, Onge).

We computed standard errors using a Block Jackknife with a block size of 5cM.
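
To make the quoted statistic concrete, here is a simplified sketch of the first ratio, f4(YRI, Basque; X, Onge) / f4(YRI, Basque; Georgian, Onge), with a leave-one-block-out jackknife. Everything numeric is a toy assumption: six invented allele frequencies, three equal-sized blocks standing in for the 5 cM blocks, and an equal-weight jackknife rather than the weighted version the actual software may use. X is constructed as exactly 17% Georgian and 83% Onge, so the ratio should recover 0.17.

```python
import math

def f4_sum(freqs, a, b, c, d, idx):
    """Sum over SNPs of (a - b) * (c - d); the SNP count cancels in the ratio."""
    return sum((freqs[a][i] - freqs[b][i]) * (freqs[c][i] - freqs[d][i]) for i in idx)

def ani_ratio(freqs, x, idx):
    """f4(YRI, Basque; X, Onge) / f4(YRI, Basque; Georgian, Onge)."""
    num = f4_sum(freqs, "YRI", "Basque", x, "Onge", idx)
    den = f4_sum(freqs, "YRI", "Basque", "Georgian", "Onge", idx)
    return num / den

def block_jackknife(freqs, x, blocks):
    """Point estimate plus an equal-weight leave-one-block-out standard error."""
    all_idx = [i for block in blocks for i in block]
    estimate = ani_ratio(freqs, x, all_idx)
    leave_one_out = []
    for k in range(len(blocks)):
        idx = [i for j, block in enumerate(blocks) if j != k for i in block]
        leave_one_out.append(ani_ratio(freqs, x, idx))
    g = len(leave_one_out)
    mean = sum(leave_one_out) / g
    se = math.sqrt((g - 1) / g * sum((t - mean) ** 2 for t in leave_one_out))
    return estimate, se

# Toy allele frequencies (invented for illustration only).
freqs = {
    "YRI": [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
    "Basque": [0.2, 0.6, 0.3, 0.7, 0.1, 0.8],
    "Georgian": [0.3, 0.7, 0.2, 0.8, 0.2, 0.9],
    "Onge": [0.6, 0.3, 0.6, 0.4, 0.5, 0.4],
}
# X is built as an exact 17% Georgian / 83% Onge mixture at every SNP.
freqs["X"] = [0.17 * g + 0.83 * o for g, o in zip(freqs["Georgian"], freqs["Onge"])]

blocks = [[0, 1], [2, 3], [4, 5]]  # stand-ins for 5 cM blocks
estimate, se = block_jackknife(freqs, "X", blocks)
print(f"ANI estimate: {estimate:.3f} +/- {se:.3f}")
```

With real data X is not an exact per-SNP mixture, so the leave-one-out estimates differ from block to block and the standard error is no longer essentially zero.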

everest59
05-31-2015, 04:52 PM
In the simplistic two-way admixture scenario of Reich's ANI/ASI, as modified by Moorjani, the Paniya are at 83% ASI and 17% ANI.

Table S4: Ancestry estimates from F4 Ratio Estimation

Paniya 5 Dravidian Tribal

We performed F4 Ratio Estimation to estimate the proportion of ANI ancestry in Indians. Specifically, we use the following statistics:

16.7 ± 2.4 §ANI ancestry (Pop2 = Basque) = f4(YRI, Basque; X, Onge)/ f4(YRI, Basque; Georgian, Onge);

16.8 ± 2.1 §ANI ancestry (Pop2 =Abhkasian) = f4(YRI, Abhkasian; X, Onge)/ f4(YRI, Abhkasian; Georgian, Onge);

22.5 ± 1.6 §ANI ancestry (Reich et al., 09) = f4(Adygei, Papuan; X, Onge)/ f4(Adygei, Papuan; CEU, Onge).

We computed standard errors using a Block Jackknife with a block size of 5cM.

Here is something else that is pretty amusing. If you replace the Onge by Paniya in the formula above, and try to calculate ANI in UP Brahmins, the numbers do not change a whole lot. Does that mean the Paniyas do not have a whole lot of ANI?
So perhaps some ANE may be part of ASI.
And I suppose ASI itself is a composite of some sort.
The Paniya get some significant match with Stuttgart.

Chad Rohlfsen
05-31-2015, 04:57 PM
That seems way too high. That was before EHG and MA1, which looks half South Asian and Native American. That has to be skewing things. Unsupervised, they come out 50-60% ENA. They're about as close to west Asians as they are to the Onge and they don't have much NE Asian.

parasar
05-31-2015, 05:12 PM
They underestimated ANI in the Paniya. Now, some of the results I posted before seemed to suggest 20-30% ANE, but the issue there is the software cannot separate out ANE and ASI.
So I feel that the shared drift data are more valid. If I had to guess, probably 6 or 7% at best.
We need to realize, MA-1 is a 20k year old sample. Comparing it to modern samples may not be a good idea. Now, the Yamnaya is at least more modern...I can see why Haak came up with a new model.

It is important to keep in mind that ASI is a component present in South Asians but absent in most other West Eurasians.
So ASI cannot have ANE, otherwise we should redefine what ASI is, or use qualifiers such as David's ASI.
Reich had noted:

We caution that “models” in population genetics should be treated with caution. While they provide an important framework for testing historical hypotheses, they are oversimplifications. For example, the true ancestral populations of India were probably not homogeneous as we assume in our model but instead were likely to have been formed by clusters of related groups that mixed at different times. However, modeling them as homogeneous fits the data and appears to capture meaningful features of history


ASI is most like Metspalu's k6 component.

the dark green ancestry component, k6 is primarily restricted to the Indian subcontinent with modest presence in Central Asia and Iran. Haplotype diversity associated with dark green ancestry is greatest in the south of the Indian subcontinent, indicating that the alleles underlying it most likely arose there and spread northwards. It is notable that this ancestry component also exhibits greater haplotype diversity than European or Near Eastern components despite the fact that the Illumina genotyped markers were principally ascertained in a sample of European individuals....

The geographic spread of the Indian-specific PC2 (or k6) could at least partly correspond to the genetic signal from the ASI and PC4 (or k5), distributed across the Indus Valley, Central Asia, and the Caucasus, might represent the genetic vestige of the ANI (Figure S2). However, within India the geographic cline (the distance from Baluchistan) of the Indus/Caucasus signal (PC4 or k5) is very weak, which is unexpected under the ASI-ANI model, according to which the ANI contribution should decrease as one moves to the south of the subcontinent. This can be interpreted as prehistorical migratory complexity within India that has perturbed the geographic signal of admixture.

parasar
05-31-2015, 05:16 PM
Here is something else that is pretty amusing. If you replace the Onge by Paniya in the formula above, and try to calculate ANI in UP Brahmins, the numbers do not change a whole lot. Does that mean the Paniyas do not have a whole lot of ANI?
So perhaps some ANE may be part of ASI.
And I suppose ASI itself is a composite of some sort.
The Paniya get some significant match with Stuttgart.

It is better to use the Ongee as they have 0 ANI, but yes the fact that replacing the Ongee by the Paniya results in little change reflects the minimal ANI in the Paniya.

Chad Rohlfsen
05-31-2015, 05:48 PM
They always get about 15-20% Bedouin plus the central Asian component. Higher ks will always make population specific clusters, but rarely pure ones, unless you have a pure reference. I'll have something soon. It's looking good.

Helgenes50
05-31-2015, 05:55 PM
They always get about 15-20% Bedouin plus the central Asian component. Higher ks will always make population specific clusters, but rarely pure ones, unless you have a pure reference. I'll have something soon. It's looking good.

Is the South-Central_Asian of K8 different to the Teal component, or is it the same ?

Chad Rohlfsen
05-31-2015, 06:21 PM
Is the South-Central_Asian of K8 different to the Teal component, or is it the same ?

It's different. I think the Hindu Kush vs Teal is a matter of East Eurasian in the mix, plus a little less EHG than the teal.

Chad Rohlfsen
05-31-2015, 06:22 PM
I'm doing some fine tuning. I'm hoping to have this done before the night is up.

Sein
05-31-2015, 07:09 PM
I think Paniyas are probably a 50-50 mix between ANI and ASI, although the new software seems to suggest 30% ASI from what I recall.

Everest,

You could always verify this via qpAdm. Try Paniya as a mix of BedouinB + MA1 + Dai, and see what happens. With qpAdm, you can always check the chisq and tail probability to see whether what you're seeing is a good fit.

Edit: These would constitute your right (popright) populations:

Ju_hoan_North
Mbuti
Karitiana
Surui
Han
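
A small sketch of how that setup could be written out for qpAdm, assuming the usual ADMIXTOOLS layout: a left list with the target first and the sources after it, a right list of outgroups, and a parameter file pointing at both. The file prefix `data` and the file names are hypothetical, and the population labels have to match whatever is in your own .ind file.

```python
# Left list: target first, then the proposed sources.
left = ["Paniya", "BedouinB", "MA1", "Dai"]
# Right list: the outgroups suggested above.
right = ["Ju_hoan_North", "Mbuti", "Karitiana", "Surui", "Han"]

def qpadm_parfile(prefix, left_file, right_file):
    """Return the text of a qpAdm parameter file (prefix and file names are placeholders)."""
    lines = [
        f"genotypename: {prefix}.geno",
        f"snpname: {prefix}.snp",
        f"indivname: {prefix}.ind",
        f"popleft: {left_file}",
        f"popright: {right_file}",
        "details: YES",
    ]
    return "\n".join(lines) + "\n"

left_text = "\n".join(left) + "\n"
right_text = "\n".join(right) + "\n"
parfile_text = qpadm_parfile("data", "left.txt", "right.txt")
print(parfile_text)
```

After saving the three texts to disk, the run itself would be along the lines of `qpAdm -p parfile`, and the chisq and tail probability show up in the output.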

Chad Rohlfsen
05-31-2015, 10:22 PM
Getting closer. Still tinkering.

https://docs.google.com/spreadsheets/d/1v8WEETPIfGPo3xnaZ5wWlbf9CKjopEqk3DjrnjXUCKo/edit?usp=sharing

Chad Rohlfsen
05-31-2015, 10:29 PM
I'm adding a Bedouin component for K7. I'm on the fence about ASI in West Asia. What's everyone's opinion on this?

Sein
05-31-2015, 10:48 PM
I'm adding a Bedouin component for K7. I'm on the fence about ASI in West Asia. What's everyone's opinion on this?

That sounds good.

For what it's worth, the ASI cluster in this run seems somewhat inflated in West Asia. Also, MA1 is 4% ASI in this run, although he was 0% in your K5 (which also matched Zack's old Onge run).

I think the ASI cluster in K5 was close to perfect, in terms of capturing South Asian-specific ENA.

I'm not sure if this is reasonable, but you could perhaps combine EEF and BedouinB? I'm not really sure how that would affect the results.

Also, I think you could probably remove Nganasan from your NE Asian component, and make it all Ulchi. The Ulchi samples who aren't part of the NE Asian component aren't showing any West Eurasian admixture, while the Nganasan always seem to be around 15-20% West Eurasian.

Chad Rohlfsen
05-31-2015, 10:58 PM
That was using the Ulchi. Nganasan must be less west Eurasian. I typically get 8-12% Yamnaya for them.

Sein
05-31-2015, 11:02 PM
That was using the Ulchi. Nganasan must be less west Eurasian. I typically get 8-12% Yamnaya for them.

Ah, I see. But all the Nganasan are 100% NE Asian in this run, while only 2 Ulchi behave as such?

I was thinking you could probably have all the Ulchi samples to constitute your supervised NE Asian component, and leave out Nganasan.

Chad Rohlfsen
05-31-2015, 11:02 PM
I think that using Bedouins will drop ASI by 2-4%. It'll be done soon.

Chad Rohlfsen
05-31-2015, 11:03 PM
I can try it.

Chad Rohlfsen
05-31-2015, 11:06 PM
Maybe a run with an Amerindian component would sort out whom to use.

Dr_McNinja
05-31-2015, 11:07 PM
That's a good run. ASI looks like 16% ANE, 45% Eurogenes ASE, ~24% East Eurasian, ~15% African.

Chad Rohlfsen
05-31-2015, 11:09 PM
Remember, the top Paniya was 26% Onge, 19% Atayal, and 15% Papuan on unsupervised. This is pretty close to that level of ENA.

Chad Rohlfsen
05-31-2015, 11:09 PM
They might be 60% ENA. I've got a Bedouin run going.

Sein
05-31-2015, 11:12 PM
Maybe a run with an Amerindian component would sort out whom to use.

I think adding Native Americans will make it possible to have ANE/EHG components, alongside Yamnaya.

But I think you'll have to follow David's strategy, which was to use only a few Karitiana (I'm thinking of his first supervised runs). David did the same with the Kalash (only 1 sample), and it helped.

tamilgangster
05-31-2015, 11:28 PM
Actually, I uploaded a Paniya sample to Gedmatch a while ago. The Kit # is M960763. This is what I get:

Gedrosia 21.07%
Siberian 0.32%
Northwest_African -
Southeast_Asian 3.08%
Atlantic_Med -
North_European -
South_Asian 75.54%

You know what else is interesting? The eye-color prediction.

Now, the K12b South Asian is basically similar to Harappa's South Indian. IMO, it does not have a whole lot of ANE. As we can see though, the Paniya tend to get some Gedrosia as well.

Actually, I just did a one-to-one comparison between Paniya and MA-1. At 2.0 cM with a minimum threshold of 100 SNPs and a bunching limit of 50, the sample gets around 7.3 cM total match.
I get 25.9 cM with the same parameters.
So, the Paniya probably have some ANE, but I highly doubt 20 something percent.

Here is a comparison to a Northern European individual. With the same options selected, this European individual gets 14.4 cM.

On Harappa DNA they completely lack any Baloch component; the Gedrosia seems too high.

tamilgangster
05-31-2015, 11:36 PM
The East African is slightly higher which suggests it's possibly not related to the ENF. Modern Bedouin is only around 8-9% East African. So if it's from something Gedrosian-like (has both ANE and ENF), that leaves a non-Gedrosian ASI that has 20% ANE.

Modern Bedouins have also diverged from the original Basal Eurasian and have received admixture from more modern ENF types. Gedrosians have very little East African, and the amount of Gedrosian they have is minimal; they completely lack any Baloch component.

everest59
05-31-2015, 11:44 PM
On Harappa DNA they completely lack any Baloch component; the Gedrosia seems too high.

There seems to be a difference due to the calculator effect.
I am getting 21% Baloch in the Paniya using the Harappa calculator.
On the other hand, the Harappa database shows Paniya with 0% Baloch.
Actually, it just proves to me that the South Indian of Harappa has ANI (estimate being close to 50%).

tamilgangster
05-31-2015, 11:50 PM
K5 fellas. It's looking better and better. Still tinkering.

https://docs.google.com/spreadsheets/d/1ESoOszTLdzPhKrnN8bw4B3UZS0AXrkabsm9JoNn9g-8/edit?usp=sharing

This calculator overestimates the ASI component, and using Kinh as an ASI proxy skews the values. It also underestimates the East Asian values. Since the Kinh are Austroasiatic, a lot of the Austroasiatic admixture in South Asia shows up as ASI. This also results in Han scoring over 10% ASI, which is also flawed.

tamilgangster
06-01-2015, 12:00 AM
The ASE:East_Eurasian ratio here is 72:28 or so, still near the 70:30 mark of HAP S-Indian. It seems like that's the combination Admixture keeps focusing on but that isn't relevant for populations outside that narrow geographic range of Southeast India.

But it does point to a high ANE/high ASE population if the trend is drawn out. Like what I found earlier about a "southern" version of Karitiana.

Piramalai Kallar also have an ASE to East Eurasian ratio of 75 to 25, and the Hakkipikki and Kurumba scored similarly. Among Austroasiatic Adivasis it's around 1:1, and it's 4:3 among Bengalis. The ASE to East Eurasian ratio among NW populations is much higher.
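
To put the ratios above on one scale, here is a tiny sketch that converts an ASE:East Eurasian ratio into the ASE share of the combined total. The function name is just for illustration.

```python
def ase_share(ase, east_eurasian):
    """ASE as a percentage of ASE + East Eurasian."""
    return 100.0 * ase / (ase + east_eurasian)

# Ratios mentioned in the posts above.
ratios = {
    "K run (72:28)": (72, 28),
    "Piramalai Kallar (75:25)": (75, 25),
    "Austroasiatic Adivasis (1:1)": (1, 1),
    "Bengalis (4:3)": (4, 3),
}

for label, (ase, east) in ratios.items():
    print(f"{label}: ASE share {ase_share(ase, east):.1f}%")
```

On that scale, 4:3 works out to roughly 57% ASE and 1:1 to 50%.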

Chad Rohlfsen
06-01-2015, 12:09 AM
This calculator overestimates the ASI component, and using Kinh as an ASI proxy skews the values. It also underestimates the East Asian values. Since the Kinh are Austroasiatic, a lot of the Austroasiatic admixture in South Asia shows up as ASI. This also results in Han scoring over 10% ASI, which is also flawed.

That's the old one.

Chad Rohlfsen
06-01-2015, 12:49 AM
That k6 dropped ASI for all but the Paniya and Mala. It's more in line with the Onge on Zachs run. I've got another one coming soon.

tamilgangster
06-01-2015, 01:55 AM
The most ENA Paniya gets the following in the K8.

ANE 20.14
South_Eurasian 24.95
ENF 17.98
East_Eurasian 17.52
WHG 0.01
Oceanian 11.4
Pygmy 0
Sub-Saharan 8

This is supported by genotype-based PCA, because the Paniya aren't all that far from other Indians, which means that an ANI proportion of 17% is totally unrealistic.

Anyway, the ASI/ANI analysis was designed before anyone knew about ANE, so who knows how accurate it is.

The ANI/ASI analysis is only good for determining Gedrosian and post-Gedrosian admixture, but pre-Gedrosian ANE gives off noise as ANI.
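
For reference, the K8 row quoted above can be tallied with a few lines. Grouping South_Eurasian, East_Eurasian and Oceanian together as "ENA" is my assumption for this sketch, not a definition taken from the calculator.

```python
# K8 values for the most ENA Paniya, as quoted above.
k8 = {
    "ANE": 20.14,
    "South_Eurasian": 24.95,
    "ENF": 17.98,
    "East_Eurasian": 17.52,
    "WHG": 0.01,
    "Oceanian": 11.4,
    "Pygmy": 0.0,
    "Sub-Saharan": 8.0,
}

# Assumed grouping of the eastern components (illustrative only).
ENA_COMPONENTS = ("South_Eurasian", "East_Eurasian", "Oceanian")

total = round(sum(k8.values()), 2)
ena = round(sum(k8[name] for name in ENA_COMPONENTS), 2)
print(f"total = {total}, ENA = {ena}")
```

Under that grouping the row adds up to 100 and the eastern components come to a little under 54%.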

tamilgangster
06-01-2015, 01:59 AM
Pathan all between 9-17% and Balochi from 6-15%. It's odd though, as in Zack's run the Kalash had 0% Onge, but getting 6-9% ASI here, and Cambodians had 17% Onge, or about 25% ASI, but it's up over 40% here. It's in West Asia now, at 1-5%, but that may be extra basal. I'm not so sure about this. Some Han are as high as 30%. It doesn't seem right. Maybe, switching back to Dai, and a separate Onge component will help. I'm going to dump the LBK or throw some Bedouin in it.

I believe the reason for that is that some of the archaic ANE and ENF might show up as ASI because the mixture is archaic.

tamilgangster
06-01-2015, 02:13 AM
I'm adding a Bedouin component for K7. I'm on the fence about ASI in West Asia. What's everyone's opinion on this?

I believe the ASI is overestimated in this run by a small amount. A Bedouin component will be good because it can isolate the archaic ENF strains, which I believe show up as ASI. I also believe that for the SE Asian proxy you should add Dai and Kinh as samples, because a lot of East Eurasian in South Asia might also be showing up as ASI. If you create separate E Asian and SE Asian samples, that might lead to creating a specific population cluster.

I think you should try using these proxies:
Bedouin
Mal'ta boy
E Asian (with Dai, Kinh, and Atayal as proxies)
ASI (using Onge and Papuan components as proxies)
SSA
This will be very good for South Asian-specific populations. People might disagree with using Papuan as an ASI proxy, but on other tests South Asians score the Oceanian component as noise.
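
A sketch of how that proxy list could be turned into the labels for a supervised ADMIXTURE run. In supervised mode ADMIXTURE reads a .pop file with one line per individual, in the same order as the .fam file: the component name for reference samples and "-" for everyone to be estimated. The sample labels below are placeholders, and Yoruba as the SSA reference is my assumption since no SSA population was named.

```python
# Component -> reference populations, following the proxy list above.
PROXIES = {
    "Bedouin": ["BedouinB"],
    "ANE": ["MA1"],
    "E_Asian": ["Dai", "Kinh", "Atayal"],
    "ASI": ["Onge", "Papuan"],
    "SSA": ["Yoruba"],  # assumption: the post does not name an SSA reference
}

def pop_labels(sample_pops):
    """One .pop label per individual: the component for references, '-' otherwise."""
    lookup = {pop: comp for comp, pops in PROXIES.items() for pop in pops}
    return [lookup.get(pop, "-") for pop in sample_pops]

# Population of each individual in .fam order (hypothetical example).
sample_pops = ["Onge", "Paniya", "Dai", "MA1", "Pathan", "Yoruba"]
labels = pop_labels(sample_pops)
k = len(PROXIES)  # K has to equal the number of supervised components

print("\n".join(labels))
```

The run would then be something like `admixture --supervised data.bed 5`, with the labels saved as `data.pop` next to the .bed file (the `data` prefix is hypothetical).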

tamilgangster
06-01-2015, 02:16 AM
That's a good run. ASI looks like 16% ANE, 45% Eurogenes ASE, ~24% East Eurasian, ~15% African.

Where are you getting these numbers from? These actually seem accurate, though, for describing the ASI proportion on the most recent test.

tamilgangster
06-01-2015, 02:19 AM
There seems to be a difference due to the calculator effect.
I am getting 21% Baloch in the Paniya using the Harappa calculator.
On the other hand, the Harappa database shows Paniya with 0% Baloch.
Actually, it just proves to me that the South Indian of Harappa has ANI (estimate being close to 50%).

It's well established that the South Indian component on Harappa DNA is 50% West Eurasian, but the ANI found among the Paniya is just noise from ANE and ENF; it's not Gedrosian-related in any way.

Dr_McNinja
06-01-2015, 02:32 AM
Where are you getting these numbers from? These actually seem accurate, though, for describing the ASI proportion on the most recent test.

Converted over to K7 values using Ust-Ishim as a reference.

The only issue is the still inflated East Eurasian in all these components. This will cause problems in northern India where there's East Eurasian from other sources or with Oceanian.

I wonder if it's possible to actually use a population like the Baloch and perhaps some Sindhis and western Punjabis to tease out an ASI component in admixture. No other Indian populations at all. Basically all those with low East Eurasian.

Edit: Like how David took the non-ENA part of Karitiana for ANE. Take the ENA part of Baloch/Brahui/Makrani and use that as ASI.

Chad Rohlfsen
06-01-2015, 02:43 AM
tamil,
I'm not going to make the Dai and Kinh as something separate as they show Onge admixture. They will have ASI, so I would lose ASI in South Asia. The Paniya should be about 50% or more, and the Dai over 10%. I'm still working on this K7. I may have that before the night is up. Bedouins caused other stuff to shift around, so it's taking a little longer than expected.

Chad Rohlfsen
06-01-2015, 02:50 AM
I'm tempted to just throw in all Africans and Eurasians then go crazy on the components. Every time I add a component, everything changes. I feel like I might as well go all out and have a run like this..
Khomani
Mbuti
Yoruba
WHG
EHG
EEF
Bedouin
Yamnaya
Ami
ASI
Chukchi
Nganasan
Papuan

Generalissimo
06-01-2015, 02:57 AM
Like how David took the non-ENA part of Karitiana for ANE. Take the ENA part of Baloch/Brahui/Makrani and use that as ASI.

Karitiana are a two-way mixture, so they're relatively easy to deconstruct. The South Central Asians are much more complex. With them it'd be like trying to pull out a specific flavor from a cocktail.

Sein
06-01-2015, 02:58 AM
I'm tempted to just throw in all Africans and Eurasians then go crazy on the components. Every time I add a component, everything changes. I feel like I might as well go all out and have a run like this..
Khomani
Mbuti
Yoruba
WHG
EHG
EEF
Bedouin
Yamnaya
Atayal
ASI
Chukchi
Nganasan

Maybe that is the best strategy.

My suggestions as far as components go:

Mbuti

BedouinB

EEF

Kostenki-14

EHG

ANE

Yamnaya

ASI

Ulchi

Ami/Atayal

I think you should use these with all the populations you have, but with Oceanians excluded.

Also, test WHG alongside modern global populations, but don't have them as a supervised component. It'll be interesting to see if ADMIXTURE can construe them as Kostenki-14 + EHG/ANE.

Finally, perhaps you should include Native Americans. It would be pretty nice if ADMIXTURE has them as Ulchi + EHG/ANE, or Ami/Atayal + EHG/ANE (I have my fingers crossed).

For what it's worth, we know that the Mbuti are the closest population we have to "unadmixed" Africans (although, I doubt we can construe any population as truly "unadmixed"), but the whole picture in that continent is exceedingly vague and undeveloped. Perhaps ADMIXTURE can model Yoruba as Mbuti + Bedouin, or Mbuti + EEF, or something along those lines. Regardless, it would be very interesting to see.

Helgenes50
06-01-2015, 10:06 AM
Getting closer. Still tinkering.

https://docs.google.com/spreadsheets/d/1v8WEETPIfGPo3xnaZ5wWlbf9CKjopEqk3DjrnjXUCKo/edit?usp=sharing
Chad,

Your results seem very interesting.
Although this test is more for the Asian populations, did you run a few modern Europeans to compare?

Generalissimo
06-01-2015, 10:34 AM
I don't think Kostenki14 will make a cluster. The program will probably just find something else to latch onto, even though Kostenki14 will be marked as 100%.

tamilgangster
06-01-2015, 11:23 AM
Converted over to K7 values using Ust-Ishim as a reference.

The only issue is the still inflated East Eurasian in all these components. This will cause problems in northern India where there's East Eurasian from other sources or with Oceanian.

I wonder if it's possible to actually use a population like the Baloch and perhaps some Sindhis and western Punjabis to tease out an ASI component in admixture. No other Indian populations at all. Basically all those with low East Eurasian.

Edit: Like how David took the non-ENA part of Karitiana for ANE. Take the ENA part of Baloch/Brahui/Makrani and use that as ASI.

For makranis and even some balochi you might get minor amounts of SSA mixed into it

Chad Rohlfsen
06-01-2015, 02:31 PM
Chad,

Your results seem very interesting.
Although this test is more for the Asian populations, did you run a few modern Europeans to compare?

Almost everyone is in it now. It may take a few days. Each run with 2300 samples takes about 7 hours with 12-13 components.

parasar
06-01-2015, 03:12 PM
Okay. I'll see how the run with EHG goes. I may take Yamnaya off of the pops too. I'll do these two, and then mess with the Near East stuff. One thing too, is that adjusting the number of Dai that I select, the numbers for ASI change. Kharia had 64% ASI with all Dai, while Paniya were in the upper 30s. Removing three Dai had the Kharia at 54 and the Paniya at 45%. I'll need suggestions as far as what to do.

On the dendrograms posted by Kurd, the Paniya and the Khariya cluster together, and in turn both cluster with Ust-Ishim.
"Austra. K is paired with Paniya, with Ust Ishim not too far"
http://www.anthrogenica.com/showthread.php?4548-Upcoming-DIY-Eurogenes-K7-amp-K8-Calculator-amp-Oracles-for-tracking-E-Asian-amp-ASI&p=87148&viewfull=1#post87148

The Khariya at present are located in a region previously known as Paniya Bhumi.
http://www.anthrogenica.com/showthread.php?3433-Waves-of-migration-into-South-Asia&p=81141&viewfull=1#post81141

The Paniya Bhumi region of Chota Nagpur ((old) Bihar/Jharkhand/Orissa) is presently considered AA speaking but I have no idea what it was in the past.
The Paniya region: paniya-bhumi https://books.google.com/books?id=d4PCyXslrNsC&pg=PA13 see also paniya jataka and kalpasutta

parasar
06-01-2015, 03:44 PM
...
We need to realize, MA-1 is a 20k year old sample. Comparing it to modern samples may not be a good idea. Now, the Yamnaya is at least more modern...I can see why Haak came up with a new model.

Now that we have MA-1, sure, moving on to Yamnaya is better, at least for Europeans. But MA-1 was key to our understanding of the difference between derived and shared ancestry, which ADMIXTURE conflates. Because of MA-1 and formal stats we now know that both Europeans and Amerindians have shared ancestry that derives from an Ancient North Eurasian population related to MA-1.

What we do not know as of yet is whether the Ancient North Eurasian in South Asia and the Caucasus is derived from an MA-1 like population or shared with or both. For South Asia, it looks to be both. For this to be confirmed we need ancient genomes from South Asia and the Caucasus.

Samara/Karelia helped partly, as it showed that the Yamna had derived ancestry from a Samara/Karelia-like population and something else, and they took this latter ancestry to Europe, where it first appears in Corded Ware.

Chad Rohlfsen
06-02-2015, 02:10 AM
Okay, I've dropped to 1900 pops, eliminating most Native Americans and some Siberians. I'm testing it without a Karitiana component first. Yoruba came out as 100% Mbuti in the first run.

Sein
06-02-2015, 04:25 AM
Okay, I've dropped to 1900 pops, eliminating most Native Americans and some Siberians. I'm testing it without a Karitiana component first. Yoruba came out as 100% Mbuti in the first run.

Is it possible to share the full output?

Chad Rohlfsen
06-02-2015, 04:38 AM
No, It's gone and all of it was a mess outside of Europe. Nothing made sense. It'll take a few runs to get something worth showing. I have to see how they all behave together.

Generalissimo
06-02-2015, 06:11 AM
Here's a suggestion: base your test on the one that produces the best ASI estimate (from memory the last K6), and just add two more clusters to it, Arabian (Yemen Jews as the refs) and Euro_HG (based on Loschbour). Also add the Europeans and Near Easterners that never create clusters, like French, English, Czechs, Hungarians, Spanish, all of the Italians and Sicilians, Greeks and other Balkanoids, Armenians, Lebanese, Syrians, Iranians, Jordanians, Turkish Jews, Moroccan Jews, all of the Turks (so no Druze, Ashkenazi or most other Jews, Mozabites, Caucasians, South Central Asians except maybe Pathans).

Try that, see how you go, and then fine tune to get the best results.

Chad Rohlfsen
06-02-2015, 09:08 PM
3 Bedouins topped the Yemeni Jews, so I'm using them and ditched MA1.

Chad Rohlfsen
06-03-2015, 09:21 PM
Sorry for the delay. It's a little trickier with an extra 1000 samples to go through.

Chad Rohlfsen
06-04-2015, 03:13 AM
I'm getting very close. Unfortunately, I've lost the Paniya, Austroasiatics (Ho, Bonda, etc.), and Malays, so we will have to do without them.

Chad Rohlfsen
06-04-2015, 04:56 AM
Okay! All SA samples are back in. I may be posting it tomorrow.

tamilgangster
06-04-2015, 12:24 PM
Now that we have MA-1, sure, moving on to Yamnaya is better, at least for Europeans. But MA-1 was key to our understanding of the difference between derived and shared ancestry, which ADMIXTURE conflates. Because of MA-1 and formal stats we now know that both Europeans and Amerindians have shared ancestry that derives from an Ancient North Eurasian population related to MA-1.

What we do not know as of yet is whether the Ancient North Eurasian in South Asia and the Caucasus is derived from an MA-1 like population or shared with or both. For South Asia, it looks to be both. For this to be confirmed we need ancient genomes from South Asia and the Caucasus.

Samara/Karelia helped partly, as it showed that the Yamna had derived ancestry from a Samara/Karelia-like population and something else, and they took this latter ancestry to Europe, where it first appears in Corded Ware.

What exactly do you mean by derived vs shared ancestry?

Chad Rohlfsen
06-04-2015, 02:17 PM
MA1 is too old and Crown Eurasian-like to use. It covers up some Near Eastern ancestry and all for tribals. Only something the same age as EHG will be informative.

parasar
06-04-2015, 03:23 PM
What exactly do you mean by derived vs shared ancestry?

Perhaps it is best to illustrate using the tree from Lazaridis.
The node under non-African where Eastern non-African and Ancient North Eurasian-West Eurasian divide is approximately where Ust-Ishim stands.

Europeans derive ANE ancestry from Ancient North Eurasians mixing with other populations to form Europeans, but this is not the only extent of their connection, as Europeans also share very remote ancestry with Ancient North Eurasians at the non-African level. We could resolve this branching thanks to the ancient samples from Europe (entitled EEF and WHG). We need ancient DNA from South Asia to clarify whether the overlaps (probability) we are seeing are due to derived ancestry (admixture) or shared ancestry. [Plus we have that additional problem with ADMIXTURE - the propensity of drifted populations forming components.]

http://3.bp.blogspot.com/-YbYK8NzQNAY/UrihRsR5eSI/AAAAAAAAJbo/TYynaV4cO4Y/s1600/model.png
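
A toy sketch of the derived vs shared distinction, using a stripped-down graph loosely based on the description above. The topology is a deliberate simplification (no Basal Eurasian or EEF, and the node names are my own labels), so treat it as an illustration of the idea and not as the published model. A population with two parents is admixed, and those parents are what it "derives" ancestry from; "shared" ancestry is whatever nodes two populations have in common further up the tree.

```python
# Child -> parents. Two parents means an admixture event (toy topology, assumed).
PARENTS = {
    "Non_African": [],
    "Eastern_non_African": ["Non_African"],
    "ANE_West_Eurasian": ["Non_African"],
    "Ancient_North_Eurasian": ["ANE_West_Eurasian"],
    "West_Eurasian": ["ANE_West_Eurasian"],
    "MA1": ["Ancient_North_Eurasian"],
    "WHG": ["West_Eurasian"],
    "European": ["West_Eurasian", "Ancient_North_Eurasian"],
    "Karitiana": ["Eastern_non_African", "Ancient_North_Eurasian"],
}

def ancestors(node):
    """All nodes above a population in the graph."""
    found = set()
    for parent in PARENTS[node]:
        found.add(parent)
        found |= ancestors(parent)
    return found

def admixture_sources(node):
    """Direct sources of an admixed population (derived ancestry); empty if unadmixed."""
    parents = PARENTS[node]
    return parents if len(parents) > 1 else []

def shared(a, b):
    """Nodes two populations both descend from (shared ancestry)."""
    return ancestors(a) & ancestors(b)

print("European sources:", admixture_sources("European"))
print("WHG sources:", admixture_sources("WHG"))
print("WHG/MA1 shared:", sorted(shared("WHG", "MA1")))
```

In this toy graph WHG and MA1 overlap only through common ancestors, while Europeans and Karitiana both have an admixture edge from the Ancient North Eurasian node. ADMIXTURE percentages alone cannot tell those two situations apart.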

tamilgangster
06-05-2015, 11:15 AM
Perhaps it is best to illustrate using the tree from Lazaridis.
The node under non-African where Eastern non-African and Ancient North Eurasian-West Eurasian divide is approximately where Ust-Ishim stands.

Europeans derive ANE ancestry from Ancient North Eurasians mixing with other populations to form Europeans, but this is not the only extent of their connection, as Europeans also share very remote ancestry with Ancient North Eurasians at the non-African level. We could resolve this branching thanks to the ancient samples from Europe (entitled EEF and WHG). We need ancient DNA from South Asia to clarify whether the overlaps (probability) we are seeing are due to derived ancestry (admixture) or shared ancestry. [Plus we have that additional problem with ADMIXTURE - the propensity of drifted populations forming components.]

http://3.bp.blogspot.com/-YbYK8NzQNAY/UrihRsR5eSI/AAAAAAAAJbo/TYynaV4cO4Y/s1600/model.png

The shared remote ancestry Europeans have with ANE is from WHG, which is from the same lineage as ANE, which this graph gets wrong. WHG is more closely related to ANE than to ENF.

Chad Rohlfsen
06-05-2015, 11:38 AM
I had a run that looked good without EHG, but to include EHG will likely require something like Iraqi Jews for something with EHG and next to no Yamnaya.

tamilgangster
06-05-2015, 12:00 PM
I had a run that looked good without EHG, but to include EHG will likely require something like Iraqi Jews for something with EHG and next to no Yamnaya.

Wouldn't it make more sense to isolate the EHG from a Siberian population, because they would be closer to the type of EHG found in South Asia, and they lack any bit of Yamnaya admixture?

Chad Rohlfsen
06-05-2015, 02:15 PM
Siberians pick Yamnaya over EHG, just as Native Americans. Having EHG on its own leads to tribals hijacking it and messing up Asia. I need a non Yamnaya or EHG source of Eastern ancestry. Bedouins are too SSA for tribals to take them and EHG. There doesn't appear to be any other way around this than using a west Asian pop with little SSA and Yamnayan ancestry.

parasar
06-05-2015, 03:11 PM
The shared remote ancestry europeans have with ANE is from WHG, which is from the same lineage as ANE, which this graph gets wrong.
ANE is a component into Europeans and into MA1. What we can indeed say from the graph is that the Karitiana and Europeans share ancestry from Ancient North Eurasian.
ANE and Ancient North Eurasian are not necessarily the same thing entirely. This part of the phylogeny and how it relates to WHG I had doubted when the Lazaridis pre-print came out. My thinking was that ADMIXTURE was giving results not consistent with Lazaridis' phylogeny.

"MA1 was admixed - he had both ancestry shared with Gedrosia and another pre-ANE ancestry also seen in Native Americans who don't show any Gedrosia."
http://www.anthrogenica.com/showthread.php?2551-Experimentation-with-TreeMix-Software&p=46509&viewfull=1#post46509
"I consider MA1 admixed ... K=6 reveals a south Eurasian component maximized in Papuans, which is also represented in South Asians. MA1 shows some affinity to this component, in contrast to more recent European hunter gatherers who continue to mainly show ties to Native Americans."
http://www.anthrogenica.com/showthread.php?2551-Experimentation-with-TreeMix-Software&p=40917&viewfull=1#post40917

Anyway, the later Haak paper said something similar:


ANE ancestry in Karelia_HG is derived from the branch of “Ancient_North_Eurasian” that goes into the Karitiana Native Americans, rather than the MA1 branch.






WHG is more closely related to ANE than to ENF

The graph is not saying that WHG is more related to ENF than WHG is to ANE.

Kale
06-05-2015, 04:20 PM
Anyway, the later Haak paper said something similar:

ANE ancestry in Karelia_HG is derived from the branch of “Ancient_North_Eurasian” that goes into the Karitiana Native Americans, rather than the MA1 branch.


I'm assuming here you're referring to page 120 or so...

In that paper, several possible conclusions were given that could be true with the evidence available. That was one possibility...although a rather unlikely one, as it puts WHG (Loschbour in this case) as admixed with 48% of an ANE group with a rather short drift parameter from the same input into Karitiana.

The more likely scenario is that Loschbour is mixed 11% with a group most related to Karelia, and in that case, the ANE in Karitiana diverges before the divergence of MA1 and Karelia_HG. If I remember correctly, archaeology suggests there was an isolation of the ancestors of Native Americans in Beringia for quite some time before the trek to America, which would make sense given this scenario.

...Also on the ANE-WHG-EEF split...you got it backwards. WHG is not more related to EEF than ANE, EEF is more related to WHG than ANE...because EEF is 56% or so WHG, and roughly 0% ANE.

parasar
06-05-2015, 05:55 PM
I'm assuming here you're referring to page 120 or so...

In that paper, several possible conclusions were given that could be true with the evidence available. That was one possibility...although a rather unlikely one, as it puts WHG (Loschbour in this case) as admixed with 48% of an ANE group with a rather short drift parameter from the same input into Karitiana.

The more likely scenario is that Loschbour is mixed 11% with a group most related to Karelia, and in that case, the ANE in Karitiana diverges before the divergence of MA1 and Karelia_HG. If I remember correctly, archaeology suggests there was an isolation of the ancestors of Native Americans in Beringia for quite some time before the trek to America, which would make sense given this scenario.

...Also on the ANE-WHG-EEF split...you got it backwards. WHG is not more related to EEF than ANE, EEF is more related to WHG than ANE...because EEF is 56% or so WHG, and roughly 0% ANE.

I think the query was to the ENF part of EEF, but I may be mistaken.

Kale
06-05-2015, 06:33 PM
I don't think it's possible at this point to really clarify ENF's relation to anything, considering 1) ENF likely has West Eurasian admixture...and 2) West Eurasians have East Eurasian mixture, which skews pure West Eurasian into looking like some sort of basal.

Sein
06-05-2015, 06:50 PM
I had a run that looked good without EHG, but to include EHG will likely require something like Iraqi Jews for something with EHG and next to no Yamnaya.

Is it possible to share the output?

Chad Rohlfsen
06-05-2015, 08:29 PM
I'll share one in the next 3-4 hours.

parasar
06-05-2015, 08:44 PM
I don't think it's possible at this point to really clarify ENF's relation to anything, considering 1) ENF likely has West Eurasian admixture...and 2) West Eurasians have East Eurasian mixture, which skews pure West Eurasian into looking like some sort of basal.

I agree.
We don't have any population that is an unmixed descendant of early Near Eastern farmers, and we would need ancient DNA to find out if there was such a population in the past. That ancient DNA would also tell us if they had any West Eurasian. I have my doubts that they had any so-called West Eurasian, but it is possible they had some.

Chad Rohlfsen
06-05-2015, 08:57 PM
The first farmers will be close to LBK. Bedouins fit as 92% LBK and 8% SSA, on unsupervised runs.

Chad Rohlfsen
06-06-2015, 12:39 AM
K8

Not a big fan of this. I'm still working on a way to include the EHG.

https://docs.google.com/spreadsheets/d/1B1oCx_CmKidFyNg2NwEvuDHrgo867MvqE9tpQNMP6_k/edit?usp=sharing

WHG seems too high and Yamnaya too low in Europe.

Chad Rohlfsen
06-06-2015, 01:12 AM
I might have to include an Amerindian component. I'm going to try and get this to make sense.

Sein
06-06-2015, 01:15 AM
K8

Not a big fan of this. I'm still working on a way to include the EHG.

https://docs.google.com/spreadsheets/d/1B1oCx_CmKidFyNg2NwEvuDHrgo867MvqE9tpQNMP6_k/edit?usp=sharing

WHG seems too high and Yamnaya too low in Europe.

Interesting run.

I think one way to reduce WHG in Europe is by only using Loschbour, and not using SHG samples as part of the WHG component.

Also, I think your northern ENA reference should be based on Ulchi, not Nganasan (who have anywhere from 10%-20% West Eurasian admixture).

For what it's worth, the ASI cluster in this one is somewhat inflated, as evidenced by MA1 being around 12% ASI.

At the end of the day, I think David's suggestions provide a strong framework for something that'll be successful.

Perhaps, this will work:

Ulchi as your Northeast Asian/Siberian reference

Ami/Atayal as your Southeast Asian reference

Yamnaya

WHG (Loschbour)

EEF

Yemeni Jews

ASI (the ASI cluster would involve the same configuration seen in this spreadsheet: https://docs.google.com/spreadsheets/d/1ESoOszTLdzPhKrnN8bw4B3UZS0AXrkabsm9JoNn9g-8/edit#gid=659238251)
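For anyone wanting to try a setup like this, here is a minimal sketch of how the reference labels could be written out for a supervised ADMIXTURE run. It assumes the usual PLINK .fam layout and ADMIXTURE's convention that the .pop file has one line per individual, in the same order as the .fam, with a population label for reference samples and "-" for samples to be estimated. The helper name `build_pop_lines`, the component labels, and the population names in the first .fam column are hypothetical; the ASI references are left out because their configuration lives in the linked spreadsheet.

```python
# Sketch: build the lines of a .pop file for a supervised ADMIXTURE run.
# Assumption: the first column of each .fam line is a population label.
# All labels below are illustrative, not taken from any real dataset.

REFERENCE_COMPONENTS = {
    "Ulchi": "NE_Asian",       # Northeast Asian/Siberian reference
    "Ami": "SE_Asian",         # Southeast Asian reference
    "Atayal": "SE_Asian",
    "Yamnaya": "Yamnaya",
    "Loschbour": "WHG",        # WHG based on Loschbour only
    "EEF": "EEF",
    "Yemeni_Jew": "Near_East",
}


def build_pop_lines(fam_lines, reference_components=REFERENCE_COMPONENTS):
    """Return one .pop entry per .fam line, in the same order.

    Reference individuals get their component label; everyone else
    gets "-", meaning their ancestry is to be estimated.
    """
    pop_lines = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            # A blank line would break the .fam/.pop line alignment.
            raise ValueError("blank line in .fam input")
        pop_lines.append(reference_components.get(fields[0], "-"))
    return pop_lines


if __name__ == "__main__":
    example_fam = [
        "Ulchi U1 0 0 1 -9",
        "Pathan P1 0 0 1 -9",
        "Loschbour L1 0 0 1 -9",
    ]
    print("\n".join(build_pop_lines(example_fam)))
```

The output would be saved next to the .bed file with the same prefix and a .pop suffix, and the run started with something like `admixture --supervised mydata.bed 7`, where `mydata` is a placeholder and 7 matches the number of reference components listed above (including ASI).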

The crucial part involves the groups to be tested. I think David's suggestions are solid; it's best that you ditch all South Central Asians in your preliminary test run, with the exception of Pathans/Pashtuns. But include Indian tribal groups. Also, try to first add only these European and West Asian populations:

"Also add the Europeans and Near Easterners that never create clusters, like French, English, Czechs, Hungarians, Spanish, all of the Italians and Sicilians, Greeks and other Balkanoids, Armenians, Lebanese, Syrians, Iranians, Jordanians, Turkish Jews, Moroccan Jews, all of the Turks (so no Druze, Ashkenazi or most other Jews, Mozabites, Caucasians, South Central Asians except maybe Pathans)."

Chad Rohlfsen
06-06-2015, 01:39 AM
I can give the Ulchi a shot, but Nganasans usually peak their component. Motalas chose the component. I didn't fix them.

Chad Rohlfsen
06-06-2015, 01:40 AM
I've got a run finishing up, then I'll try it.

Sein
06-06-2015, 01:44 AM
I can give the Ulchi a shot, but Nganasans usually peak their component. Motalas chose the component. I didn't fix them.

Ah, that's pretty interesting. I think that explains the WHG distribution in this run: it involves an EHG/ANE signal.

Chad Rohlfsen
06-06-2015, 02:39 AM
It's running. It'll be a couple hours.

tamilgangster
06-06-2015, 03:44 AM
Interesting run.

I think one way to reduce WHG in Europe is by only using Loschbour, and not using SHG samples as part of the WHG component.

Also, I think your northern ENA reference should be based on Ulchi, not Nganasan (who have anywhere from 10%-20% West Eurasian admixture).

For what it's worth, the ASI cluster in this one is somewhat inflated, as evidenced by MA1 being around 12% ASI.

At the end of the day, I think David's suggestions provide a strong framework for something that'll be successful.

Perhaps, this will work:

Ulchi as your Northeast Asian/Siberian reference

Ami/Atayal as your Southeast Asian reference

Yamnaya

WHG (Loschbour)

EEF

Yemeni Jews

ASI (the ASI cluster would involve the same configuration seen in this spreadsheet: https://docs.google.com/spreadsheets/d/1ESoOszTLdzPhKrnN8bw4B3UZS0AXrkabsm9JoNn9g-8/edit#gid=659238251)

The crucial part involves the groups to be tested. I think David's suggestions are solid; it's best that you ditch all South Central Asians in your preliminary test run, with the exception of Pathans/Pashtuns. But include Indian tribal groups. Also, try to first add only these European and West Asian populations:

"Also add the Europeans and Near Easterners that never create clusters, like French, English, Czechs, Hungarians, Spanish, all of the Italians and Sicilians, Greeks and other Balkanoids, Armenians, Lebanese, Syrians, Iranians, Jordanians, Turkish Jews, Moroccan Jews, all of the Turks (so no Druze, Ashkenazi or most other Jews, Mozabites, Caucasians, South Central Asians except maybe Pathans)."

The elevated ASI is probably caused by Kinh being used as an ASI proxy, which will also pick up East Eurasian ancestry (especially among Bengalis and tribals) in South Asia. Also, as a result of Yamnaya being the only proxy for West Eurasians in South Asia, a lot of the archaic West Eurasian elements are showing up as ASI.

Chad Rohlfsen
06-06-2015, 03:53 AM
It's not elevated by the Kinh. It's about finding the right mix of reference pops to have it make sense. I could do a run with different pops and more Kinh, yet end up in the single digits for ASI. It's just about trying to get it to make sense. That was the closest so far. Adding the extra pops and components is making it a little trickier. I'll keep plugging away, over the weekend. I should have another run or two done this evening.

Sein
06-06-2015, 04:00 AM
The elevated ASI is probably caused by Kinh being used as an ASI proxy, which will also pick up East Eurasian ancestry (especially among Bengalis and tribals) in South Asia. Also, as a result of Yamnaya being the only proxy for West Eurasians in South Asia, a lot of the archaic West Eurasian elements are showing up as ASI.

Not necessarily, as there were less Kinh used in this run, yet the ASI component went up across the board, and now appears in populations which shouldn't have any ASI. But the K5 was perfect, as far as the ASI cluster was concerned, and it used more Kinh samples.

For what it's worth, EEF, Bedouin, and WHG components are also there to capture West Eurasian ancestry for South Asia (to a lesser extent, the Siberian component can do the same job, in some cases). Yet, Yamnaya is consistently preferred. There is something connecting Yamnaya to South Asia. It's hard to say what this means though. In this sort of case, we really need aDNA from Central and South Asia.

I'm guessing that this involves something really complex. Perhaps, the West Asian ancestry we see in Yamnaya comes from a source which had a huge genetic impact in South Asia. In addition, qpAdm suggests very substantial Yamnaya-related admixture in South Asia, so there must be an element of actual steppe Indo-Iranian ancestry throughout South Asia.

tamilgangster
06-06-2015, 04:55 AM
Not necessarily, as there were less Kinh used in this run, yet the ASI component went up across the board, and now appears in populations which shouldn't have any ASI. But the K5 was perfect, as far as the ASI cluster was concerned, and it used more Kinh samples.

For what it's worth, EEF, Bedouin, and WHG components are also there to capture West Eurasian ancestry for South Asia (to a lesser extent, the Siberian component can do the same job, in some cases). Yet, Yamnaya is consistently preferred. There is something connecting Yamnaya to South Asia. It's hard to say what this means though. In this sort of case, we really need aDNA from Central and South Asia.

I'm guessing that this involves something really complex. Perhaps, the West Asian ancestry we see in Yamnaya comes from a source which had a huge genetic impact in South Asia. In addition, qpAdm suggests very substantial Yamnaya-related admixture in South Asia, so there must be an element of actual steppe Indo-Iranian ancestry throughout South Asia.

The Yamnaya component is the only West Eurasian component high in EHG, causing it to dominate all the West Eurasian ancestry in South Asia. All the Gedrosian admixture is also showing up as Yamnaya. Since the mixture between Bedouin-type ENF in South Asia and ASI is archaic, the excess ENF is probably showing up as ASI. The Near East component probably has some degree of SSA included in it, causing it not to pick up any ENF in South Asia. In order to get a proper sample, all the non-SSA elements from a Yemeni population should be isolated to get a Bedouin-type ENF sample.

Chad Rohlfsen
06-06-2015, 08:15 PM
I'm trying something without ASI, just looking at how the Onge and Atayal show up in various groups. I may end up needing to add Papuans. I'm still having issues with the tribals latching onto EHG, in large amounts. I'm not sure if I'll be able to use them.

Dr_McNinja
06-07-2015, 02:22 PM
I'm trying something without ASI, just looking at how the Onge and Atayal show up in various groups. I may end up needing to add Papuans. I'm still having issues with the tribals latching onto EHG, in large amounts. I'm not sure if I'll be able to use them.

Do they still do that if there's a separate Gedrosian/West Asian type component?

Chad Rohlfsen
06-07-2015, 02:56 PM
Usually they don't. That'll be my next test. I'm combining EHG's and MA1, for one run, leaving Yamnaya out, just to see what happens.

Chad Rohlfsen
06-12-2015, 02:14 PM
While David is putting the new files together, I'll try to have one with Corded Ware up before the weekend ends.

Dr_McNinja
06-12-2015, 03:04 PM
While David is putting the new files together, I'll try to have one with Corded Ware up before the weekend ends.

Which new files? For a calculator?

Chad Rohlfsen
06-12-2015, 08:32 PM
Which new files? For a calculator?

The new ancient samples, to merge with all the others I have.

Chad Rohlfsen
06-13-2015, 04:13 PM
I have a question here folks. Would you prefer that I do not make a synthetic Andronovo, as that can take away from the ASI, or just go with it? Also, I think that I should have something that might be reflective of the Zagros farmers, with a bit of EHG. I was thinking something like Iraqi Jews, or mixing Iraqi Jews with a Bedouin or two. I think that could be representative of non or pre-Yamnaya stuff. I just want to make sure I get the right breakdown in West Asia and SC Asia.

Dr_McNinja
08-04-2015, 04:58 PM
Didn't want to put it in a new thread, but this was a calculator I was/am working on:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFMtRHVlZDBuQ3lMcjhxMDE4V3JoYlE#gid=82

The issue is that the 'Eurasia' component is actually eating up ancient and modern Eurasian (by the latter I mean East Eurasian). A Siberian/Amerindian component would help since as you can see, a mixed South American/European individual got a whopping 90% of this component. If some of that bled off into a Siberian component, that would look better, but I haven't had much success in doing that, it screws everything all up.

I might release the calculator files if I can't tweak it any better.

It's interesting the spread of the components though. European is basically EEF here. Mideast is like Bedouin. Eurasia is ANE-heavy but also has WHG and East Eurasian. It should peak among Central Asians and/or Native Americans (don't have any of the latter to test it). It hits 40% in Northeastern Europe (Sweden, probably higher in Finland/Russia) and 54% in north Caucasus and then is 60+% across Hindu Kush/South Asia.

khanabadoshi
08-04-2015, 05:16 PM
....

The spread is great: the Brahmins and Jatt cluster with each other, the Khetran and the Pashtun do, and so do the Native Americans and East Asians. I think it's the first time I've seen parasar, everest, duffy all very close together. Then again, I'm looking at this all superficially; at least to my untrained eyes, it's a good job.

Kurd
08-04-2015, 07:18 PM
Didn't want to put it in a new thread, but this was a calculator I was/am working on:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFMtRHVlZDBuQ3lMcjhxMDE4V3JoYlE#gid=82

The issue is that the 'Eurasia' component is actually eating up ancient and modern Eurasian (by the latter I mean East Eurasian). A Siberian/Amerindian component would help since as you can see, a mixed South American/European individual got a whopping 90% of this component. If some of that bled off into a Siberian component, that would look better, but I haven't had much success in doing that, it screws everything all up.

I might release the calculator files if I can't tweak it any better.

It's interesting the spread of the components though. European is basically EEF here. Mideast is like Bedouin. Eurasia is ANE-heavy but also has WHG and East Eurasian. It should peak among Central Asians and/or Native Americans (don't have any of the latter to test it). It hits 40% in Northeastern Europe (Sweden, probably higher in Finland/Russia) and 54% in north Caucasus and then is 60+% across Hindu Kush/South Asia.

Which datasets have you included and how many reference samples?

If you do have cluster-forming Siberians such as Nganasan, Itelmen, etc. in your dataset, then you can go to a slightly higher K, and the Siberian should break off from your E Eurasian. This will also give you the opportunity to see what your Eurasian breaks down into.

Dr_McNinja
08-04-2015, 07:47 PM
It's an old Eurogenes ANE K7 prototype I think. I found it on my hard drive and it had 180k SNPs and included an Oceanian component. The results were quite similar to the K7. That calculator isolated ANE/ASE somewhat reliably, and WHG/ENF while not as accurate as K8 were still workable.

I just rearranged the frequencies for those typical pops (EEF for Europe, Bedouin for Mideast), then experimented with trying to simulate an ancient Eurasian steppe population as a "leftover" component that would suitably capture the desired variance or deviation from those two basic geographic axes. For some reason when I made it a combination of ANE, WHG, ENF, and East Eurasian in large proportions, it just "clicked" and left little noise (although South Asians are still showing signs of something archaic... in spite of the Oceanian).

There's no way to get this kind of component in Admixture to my knowledge without manually creating it. I remember trying this last year in supervised mode and it just not working. You can't really create reliable ancient components from modern drifted populations without some tinkering (or ancient DNA).

I think this also illustrates the effect seen in that blog post on Eurogenes showing 60+% ancestry from European-like populations in Central Asia in the Hindu Kush. There's just such high ANE affinity in South Asia that the presence of WHG doesn't even really affect it. There's a lot of WHG in this Eurasia component (but not as much as the EEF/Europe one though that has no ANE). Gedrosian might not have existed in any real sense and we may have very well gotten our ANE from a population which had high WHG in Central Asia. Without finding an ancient South Central Asian HG-less ANE-heavy neolithic population, we can't really be sure. The Steppe groups like Sintashta/Karasuk/Andronovo/etc could very well be it. The penetration of those haplogroups like R1a into the most remote corners of India doesn't rule out a very recent origin for ANE there.

I first stumbled upon this idea by forcing a little WHG into Gedrosian, to make it line up with Yamnaya better. And it peaked in Tajiks and Jatts and acted like Gedrosian for everything else (so, in effect, fixing Gedrosian as an 'ANI'). Then I basically included East Eurasian too since it's endemic in Central Asia and the Eastern Steppe. Eastern Steppe components (based on populations from modern day Central Asia/Western China/North Eurasia/Siberia) worked well in South Asia too.

EDIT: You can see the ingredients of a "South Indian" component there too. In North India, there's excess Mideast (i.e, Caucasus in some calculators) admixture.

Kurd
08-04-2015, 07:59 PM
Ah ok, I thought that your spreadsheet was from the output of a run in ADMIXTURE

Edit: With regard to Gedrosia, my ADMIXTURE runs, even as high as K20, have shown that it is very similar to, and not easily separable from, the Caucasus component, indicating to me that there has likely been recent gene flow from West Asia into SC Asia, possibly via Balochi/Kurdi type groups.

Edit 2: Although I am not certain, Zack's Balochi signal may be based on a small, isolated, drifted Baloch population.

Dr_McNinja
09-05-2015, 05:09 PM
There's some kind of coherent "Neolithic India" signal I got by mixing half ASE and Mideastern ENF ("SW-Asian") which peaks just east of the Indus:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFMtRHVlZDBuQ3lMcjhxMDE4V3JoYlE#gid=85

Everything here is almost half and half something (an experiment to see how the software would group the admixture... since the presence of ANE ruins the signal because there's a large amount of ANE in the subcontinent).

Neolithic India = ASE + ENF
West Asia = ANE + ENF
SE-Asia = ASE + East Eurasian
Arctic = ANE + East Eurasian
EHG = ANE + WHG
EEF = WHG + ENF
Near-East = 90% ENF, 10% East African
NE-Asian = 12-15% ASE, ~85% East Eurasian
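The recipe list above can be read as weighted averages of per-SNP allele frequencies. The following is a minimal sketch of that idea, assuming each source component is stored as a list of allele frequencies over the same SNPs in the same order. The function name `mix_components` and the toy frequencies are made up for illustration, and whether the calculator files were actually edited this way is an assumption.

```python
# Sketch: build a synthetic component as a weighted mix of existing ones.
# Assumption: every component is a list of allele frequencies over the
# same SNPs, in the same order. The numbers below are toy values.


def mix_components(frequencies, weights):
    """Return per-SNP frequencies for a weighted mix of source components.

    frequencies: dict of component name -> list of allele frequencies
    weights:     dict of component name -> mixing proportion (sums to 1)
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    sources = [frequencies[name] for name in weights]
    n_snps = len(sources[0])
    if any(len(source) != n_snps for source in sources):
        raise ValueError("all components must cover the same SNPs")
    return [
        sum(weight * frequencies[name][i] for name, weight in weights.items())
        for i in range(n_snps)
    ]


if __name__ == "__main__":
    toy = {
        "ASE": [0.10, 0.80, 0.50],
        "ENF": [0.30, 0.40, 0.70],
        "East_African": [0.90, 0.20, 0.10],
    }
    # "Neolithic India = ASE + ENF", read here as half and half.
    print(mix_components(toy, {"ASE": 0.5, "ENF": 0.5}))
    # "Near-East = 90% ENF, 10% East African".
    print(mix_components(toy, {"ENF": 0.9, "East_African": 0.1}))
```

A plain weighted average like this ignores drift in the real ancestral population, so the result is a stand-in for a mixed group rather than a reconstruction of one.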

The leftover S-SE-Asia and Oceanian represent the edges of the Indian subcontinent pretty much. They go up as you get closer to the southeast, or the Himalayas, or the southwestern coast. Mixing this stuff with some of "West Asia" would give you the difference between this "Neolithic India" signal and something like HAP S-Indian.

Mixing half and half Neolithic India with West Asia would give you something approaching the HAP S-Indian signal... which suggests this should never have been called "South Indian" to begin with, it's actually just the Indus region plus the extra India-specific East Eurasian-like admixture (extra SE-Asian, NE-Asian, and Oceanian around the fringes). So that was a hybrid component averaging together two opposite halves of India and not representing any actual real population.

I'm still experimenting to see if this can be teased into a coherent signal with more "real" components.

In the meantime, I also got a decent equivalent to HAP S-Indian without the Oceanian:

https://docs.google.com/spreadsheet/ccc?key=0AuXBmvmgdkfVdFMtRHVlZDBuQ3lMcjhxMDE4V3JoYlE#gid=88 (This is still running; I'll fill out the sheet when it is done. These components resemble their actual names better.)

Another thing I noticed, the Haryana Jatt comes out ~20% EHG no matter what I do, and the rest of us around half that, going up a bit in the Nepal Brahmin. And that (the Haryana Jatt) is approximately half a Swede's EHG. Since Western Steppe is more or less half EHG and half Gedrosia-WestAsian-like, that's almost like 40-50% of the Haryana Jatt's admixture that could be represented as Steppe, going down to 25% in the rest of India, with a sharp gradient around Haryana (in eastern Punjab, Nepal, presumably bordering areas of Rajasthan and then other parts of North India).
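The back-of-the-envelope step in the paragraph above can be written out as a tiny sketch. It assumes, as stated there, that a Western Steppe source is roughly half EHG, so an EHG share is converted to an implied steppe share by dividing by that fraction. The function name `implied_steppe_share` is hypothetical and the inputs are the rough percentages from the post, not measured values.

```python
# Sketch: convert an EHG share into an implied steppe share, assuming
# the steppe source itself is a fixed fraction EHG (about half here).


def implied_steppe_share(ehg_share, ehg_fraction_of_steppe=0.5):
    """Return the steppe share implied by a given EHG share."""
    if not 0 < ehg_fraction_of_steppe <= 1:
        raise ValueError("ehg_fraction_of_steppe must be in (0, 1]")
    return ehg_share / ehg_fraction_of_steppe


if __name__ == "__main__":
    # ~20% EHG (the Haryana Jatt figure quoted above)
    print(implied_steppe_share(0.20))
    # ~10% EHG (about half that, as quoted for the others)
    print(implied_steppe_share(0.10))
```

With a steppe source that is exactly half EHG, 20% EHG implies about 40% steppe and 10% EHG implies about 20%; a steppe source somewhat lower in EHG would push both toward the upper figures of 50% and 25%.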

Another side effect of the Neolithic India calculator is that it will split admixture into West Asia and EEF instead of EHG for some Southern Europeans and Caucasians. But not for Northern Europeans or South/SC Asians curiously enough.

I'm trying right now (mostly unsuccessfully) to get the Neolithic India component to coexist properly with a HAP Gedrosian-like component.

EDIT: This also suggests the South Asia or South India scores of Afghans and other populations further away are purely "Neolithic India" (Except for those getting boosted by East Eurasian).