Experimentation with TreeMix Software




everest59
05-07-2014, 04:32 PM
Sein wanted me to calculate ANE ancestry in Pathans using Treemix. What I did was, I used 42% Mal'ta admixture in the Karitiana as a known event. This is the chart that Treemix produced:

https://drive.google.com/file/d/0B3vEDdpZDjUpTXlHSHJPZlhHTHM/edit?usp=sharing

The ANE admixture in Pathans according to Treemix is 37.668%.

Also, note that even Papuans have some ANE, although very low at only 0.08%.

When I tried unsupervised mode, it didn't work.

This was using migration edge of 2.

Dr_McNinja
05-07-2014, 04:41 PM
Sein wanted me to calculate ANE ancestry in Pathans using Treemix. What I did was, I used 42% Mal'ta admixture in the Karitiana as a known event. This is the chart that Treemix produced:

https://drive.google.com/file/d/0B3vEDdpZDjUpTXlHSHJPZlhHTHM/edit?usp=sharing

The ANE admixture in Pathans according to Treemix is 37.668%.

Also, note that even Papuans have some ANE, although very low at only 0.08%.

When I tried unsupervised mode, it didn't work.

This was using migration edge of 2.

This sounds closer to what we'd expect. We know David's admixture runs were probably leaking some ANE into other components.

everest59
05-07-2014, 05:10 PM
This sounds closer to what we'd expect. We know David's admixture runs were probably leaking some ANE into other components.

I just tried it on the Paniya, and Treemix hasn't been able to find any ANE ancestry in the Paniya. However, Papuans had 0.08% ANE ancestry. In any event, whatever ANE ancestry is in the Paniya is probably minimal.

everest59
05-07-2014, 05:32 PM
Okay, some Zombie experiments. Remember that using supervised mode, ADMIXTURE calculated the SW Asian component to be 100% Caucasian. At migration edge of 1, treemix actually found East African ancestry in SW Asian component. See the chart below:
https://drive.google.com/file/d/0B3vEDdpZDjUpREVmSEVwcDQ5Qm8/edit?usp=sharing

I'm getting 20% East African in the SW Asian component.

(You know, the East African component is probably not a pure component either).

Actually, at Migration Edge of 2, the arrow was reversed. It is the East African component that is mixed with SW Asian. I am trying Migration Edge of 10 right now.

everest59
05-07-2014, 05:47 PM
At Migration Edge of 10, I see some weirdness. However, some of it makes sense. For example, the American to Baloch, which is what Admixture found as well. This is basically ANE.

https://drive.google.com/file/d/0B3vEDdpZDjUpaEJGQUVScmtCeVk/edit?usp=sharing

I think some of the "Admixture" heading towards any of the African component may be some basal Eurasian stuff.

Migration edge of 4.
https://drive.google.com/file/d/0B3vEDdpZDjUpa016VExOS1U3S0U/edit?usp=sharing

Anyways, this is enough Zombie experiments.

Sein
05-07-2014, 06:02 PM
This is all extremely awesome. Finally, a solid ANE percentage for Pashtuns (around 38%).

The 10 migration edges graph does look somewhat weird, but the Papuans are interesting. Long ago, Tishkoff et al. found evidence of substantial African ancestry among the Papuan, using STRUCTURE (or was it FRAPPE?). On PCA plots, the Papuans deviate towards Africans and West Eurasians, in comparison to other ENA. Maybe this is all pointing to something real? What's the migration edge strength?

everest59
05-07-2014, 06:05 PM
This is all extremely awesome. Finally, a solid ANE percentage for Pashtuns (around 38%).

The 10 migration edges graph does look somewhat weird, but the Papuans are interesting. Long ago, Tishkoff et al. found evidence of substantial African ancestry among the Papuan, using STRUCTURE (or was it FRAPPE?). On PCA plots, the Papuans deviate towards Africans and West Eurasians, in comparison to other ENA. Maybe this is all pointing to something real? What's the migration edge strength?

Whoops. I did not save the file. It seems to be dark yellow?
I think it could be some basal stuff.
San -----> Papuan is 11.8%.

Sein
05-07-2014, 06:13 PM
Very interesting stuff.

Sein
05-07-2014, 09:33 PM
everest,

If possible, could you try what you did with the HGDP Pashtuns, using the Di Cristofaro Pashtuns? It'll be interesting to compare. I would exclude Pashtun2_6Af, they are probably a mislabeled Uzbek or Tajik.

*Just out of curiosity, did you exclude the four HGDP Pashtun outliers?

Shaikorth
05-07-2014, 09:34 PM
African HG in Papuans could be just proxying for Denisovan. In the Skoglund paper Denisovan is the only admixture Papuans pick, though their main component is also different from those appearing in WHG's, MA-1 and Anzick/Saqqaq.

everest59
05-07-2014, 10:23 PM
everest,

If possible, could you try what you did with the HGDP Pashtuns, using the Di Cristofaro Pashtuns? It'll be interesting to compare. I would exclude Pashtun2_6Af, they are probably a mislabeled Uzbek or Tajik.

*Just out of curiosity, did you exclude the four HGDP Pashtun outliers?

Here you go :
https://drive.google.com/file/d/0B3vEDdpZDjUpRjNoNWZfSHNVeG8/edit?usp=sharing

I'm getting 41.6 %.

Yes, those outliers removed.

However, I wonder if I'm getting the right number.

Sein
05-07-2014, 10:26 PM
Thanks!

Very interesting. So, the East Asian admixture does leak in. ENA and ANE are a clade here, versus something African-like for the rest of Pashtun ancestry. The 18 HGDP Pashtuns you have used completely lack East Asian admixture, only 1% (which is probably noise). By contrast, the 4 Di Cristofaro Pashtuns you used are around 6% East Asian.

Could you try using my data in the same experiment? Just me, if that's possible.

everest59
05-07-2014, 10:32 PM
Thanks!

Very interesting. So, the East Asian admixture does leak in.

Could you try using my data in the same experiment? Just me, if that's possible.

I'm getting 41% in your case also.

https://drive.google.com/file/d/0B3vEDdpZDjUpM0lJZm45OHl2ZWc/edit?usp=sharing


However, this procedure is not working with the Lezgins.

everest59
05-07-2014, 10:37 PM
Okay, I tried it on myself. I'm getting 34.9 %.

Sein
05-07-2014, 10:39 PM
Thanks! You're just awesome.

Not sure what to think. I was assuming East Asian admixture was leaking into ANE. For what it's worth, David got 36% ANE for me and the Karitiana. So, it's possible that I have as much ANE as Karitiana. I don't know.

What do the Lezgins get?

Edit: You have quite a bit of East Asian admixture, but your percentage makes a lot of sense. I always expected the mid 30s. So, I guess I'm probably 41% ANE, the Afghan Pashtuns are 41% ANE, the HGDP Pashtuns are 38% ANE, and you are 35%.

everest59
05-07-2014, 10:40 PM
Thanks! You're just awesome.

Not sure what to think. I was assuming East Asian admixture was leaking into ANE. For what it's worth, David got 36% ANE for me and the Karitiana. So, it's possible that I have as much ANE as Karitiana. I don't know.

What do the Lezgins get?

The software does not find ANE admixture in the Lezgins.

Sein
05-07-2014, 10:42 PM
The software does not find ANE admixture in the Lezgins.

Very interesting, just like the French in Skoglund et al.

everest59
05-07-2014, 10:44 PM
Very interesting, just like the French in Skoglund et al.

Sapporo is getting a whopping 42.5%.

Sein
05-07-2014, 10:46 PM
Sapporo is getting a whopping 42.5%.

Makes sense, he has much more Gedrosia/Baloch, and NE Euro admixture than us.

Quick question. What was my precise estimate?

everest59
05-07-2014, 10:48 PM
Makes sense, he has much more Gedrosia/Baloch, and NE Euro admixture than us.

Quick question. What was my precise estimate?

It was something like 41.11.
I have to create a new file every time I run the program, so your file got deleted. But it was 41.1.

Sein
05-07-2014, 10:50 PM
This is amazing information. Could you try some Brahui or Baloch? Just one random sample, if you like.

everest59
05-07-2014, 10:55 PM
This is amazing information. Could you try some Brahui or Baloch? Just one random sample, if you like.

The Baloch are at 34.16%

Sein
05-07-2014, 10:57 PM
The Baloch are at 34.16%

Perfect!

So the results do make sense. It is interesting that the two groups most closely associated with Scythians/Hephthalites/Kushans (Pashtuns and Jatts) have the highest ANE.

Do you think it could obtain an ANE migration edge for a South Indian population?

everest59
05-07-2014, 11:04 PM
Perfect!

So the results do make sense. It is interesting that the two groups most closely associated with Scythians/Hephthalites/Kushans (Pashtuns and Jatts) have the highest ANE.

Do you think it could obtain an ANE migration edge for a South Indian population?

I'm getting 34.9% in Iranians. Do you think the numbers are off by 20%?

Sein
05-07-2014, 11:08 PM
I'm getting 34.9% in Iranians. Do you think the numbers are off by 20%?

It's probably taking/absorbing UHG ("unknown hunter gatherer" from ancient Near East).

Mehrdad
05-07-2014, 11:13 PM
Perfect!

So the results do make sense. It is interesting that the two groups most closely associated with Scythians/Hephthalites/Kushans (Pashtuns and Jatts) have the highest ANE.

Do you think it could obtain an ANE migration edge for a South Indian population?

So could it be that the Jatt and Pashtun populations were one and the same, say 2000 years ago? Or are my assumptions way off? :)

Sein
05-07-2014, 11:16 PM
So could it be that the Jatt and Pashtun populations were one and the same, say 2000 years ago?

I think this is a very good idea. There is geographic proximity, some cultural traits match, and anthropologists have always attributed substantial ancient Central Asian ancestry to both groups. Maybe both groups are long-separated twigs from the same branch?

Sein
05-07-2014, 11:23 PM
Just a note, but what happens if one changes 42% for the Karitiana into 41%? I'm wondering if we see subtle changes.

everest59
05-07-2014, 11:27 PM
Perfect!

So the results do make sense. It is interesting that the two groups most closely associated with Scythians/Hephthalites/Kushans (Pashtuns and Jatts) have the highest ANE.

Do you think it could obtain an ANE migration edge for a South Indian population?

I tried it on the North Kannadi. Did not work. It didn't work on the Paniya either.

everest59
05-07-2014, 11:30 PM
Okay, it worked on Tamil Brahmins. I got 31 %.

Sein
05-07-2014, 11:35 PM
Okay, it worked on Tamil Brahmins. I got 31 %.

The results for South Asians seem to always make sense.

Sein
05-08-2014, 12:14 AM
A plot with La Brana1, MA1, Karitiana, the Onge, "Bedouin B", the Mbuti Pygmy, and a South Asian population, would be great. If this is actually possible.

everest, could you arrange for something along those lines?

If the Onge have very few shared SNPs, maybe try Dai? They are an excellent ASI proxy.

Also, I believe David might know which HGDP Bedouins are "B" or "A".

This would be an extremely informative graph. You could just fix one migration edge into the Karitiana, 42% ANE, and try different migration edges from there.

everest59
05-08-2014, 01:24 AM
A plot with La Brana1, MA1, Karitiana, the Onge, "Bedouin B", the Mbuti Pygmy, and a South Asian population, would be great. If this is actually possible.

everest, could you arrange for something along those lines?

If the Onge have very few shared SNPs, maybe try Dai? They are an excellent ASI proxy.

Also, I believe David might know which HGDP Bedouins are "B" or "A".

This would be an extremely informative graph. You could just fix one migration edge into the Karitiana, 41% ANE, and try different migration edges from there.

Sure, I'll do it when I get a chance.

Sein
05-08-2014, 01:25 AM
Sure, I'll do it when I get a chance.

Thanks everest!

As a favor, could you first try this with my data? I'd really appreciate this.

Sein
05-08-2014, 02:15 AM
Just one final thing. It is said that having a good outgroup helps. Not sure if you can get the Denisova genome, or perhaps Chimp to test robustness.

Dr_McNinja
05-08-2014, 04:52 PM
I think this is a very good idea. There is geographic proximity, some cultural traits match, and anthropologists have always attributed substantial ancient Central Asian ancestry to both groups. Maybe both groups are long-separated twigs from the same branch?

I think that (long separated branches of the same West Eurasian branch) is definitely true.

I think whoever was the progenitor of the Pashtun people definitely also settled in India and merged into some Punjabi people (since we know even in modern times some Pashtun-origin tribes went really far into India). We don't have any idea about ages though. We know Jatts are likely at least 3000 years old, probably more than 4000 (I wouldn't be surprised if they were a distinct group in India for twice that or basically since whenever the IVC collapsed, and some of their genetic origins may be partially rooted in that civilization even if traces no longer remain). Pashtun could be anywhere from 1500 to 3000 as a distinct ethnic group I think.

One interesting thing to do would be to figure out the MRCA of the L1c-M357 branches in Popalzai Pashtun and Jatts. There are a few of both in the FTDNA South Asian haplogroup projects:

https://www.familytreedna.com/public/India/

https://www.familytreedna.com/public/dna_afghan_pak/

Look at the Sandhu (159251) and Wazirzada (1318) especially. There were some more groups with more Afghan Pashtun L1c-M357 individuals.

It could be possible that at least some Pashtun and Jatts share a direct common ancestor very recently.

Dr_McNinja
05-08-2014, 04:59 PM
Thanks everest!

As a favor, could you first try this with my data? I'd really appreciate this.

Mine as well if you get time!

Sein
05-08-2014, 08:31 PM
Just one final thing. It is said that having a good outgroup helps. Not sure if you can get the Denisova genome, or perhaps Chimp to test robustness.

I take this back. It doesn't sound like a good idea.

No Denisova or Chimp outgroup, just La Brana1, MA1, Karitiana, the Dai, "Bedouin B", the Mbuti Pygmy, and a South Asian population/participant. Experiment with different migration edges, but fix 42% ANE into Karitiana.

parasar
05-08-2014, 09:38 PM
So could it be that the Jatt and Pashtun populations were one and the same, say 2000 years ago? Or are my assumptions way off? :)

I think both the Pakhtoon and Jat have been Indus populations for the last 2000 years. When Arabs came to occupy Sindh and Multan (and even Kabul), they gave a pretty detailed account of clans, castes, tribes all along the Indus and other nearby parts, but never once mentioned the Pakhtoons. Chachanama gives a detailed account of Sindh, Kashmir, Chittor, Kandabel, Kanuj, etc., and has no mention of Pakhtoon in any form (Pathan, Pashtun, etc.): http://persian.packhum.org/persian/main?url=pf%3Ffile%3D12701030%26ct%3D0

Therefore it appears that the people whom the Arabs referred to as Jatan-i-gharbi were to be later known as Pakhtoon.

Sein
05-08-2014, 09:58 PM
I think both the Pakhtoon and Jat have been Indus populations for the last 2000 years. When Arabs came to occupy Sindh and Multan (and even Kabul), they gave a pretty detailed account of clans, castes, tribes all along the Indus and other nearby parts, but never once mentioned the Pakhtoons. Chachanama gives a detailed account of Sindh, Kashmir, Chittor, Kandabel, Kanuj, etc., and has no mention of Pakhtoon in any form (Pathan, Pashtun, etc.): http://persian.packhum.org/persian/main?url=pf%3Ffile%3D12701030%26ct%3D0

Therefore it appears that the people whom the Arabs referred to as Jatan-i-gharbi were to be later known as Pakhtoon.

There are still Jatts in Afghanistan, but it remains to be seen how they relate to Jatts from greater Punjab. They speak Hindko, and are nomadic people. Their lifestyle resembles that of the Romani of Europe.

Al Biruni mentions "Afghan" tribes (which is just the term used in Islamic sources for Pashtuns/Pakhtuns) living all the way down to the Indus. He wrote in the 11th century. Ibn Battuta mentions "Afghans" in what is eastern Afghanistan, and then he mentions "Afghans" robbing/killing travelers in a location that is now in Pakistan. In his time, Pashtuns/Pakhtuns were already performing their traditional calling, acting as highwaymen. He was writing in the 14th century. Amazingly, the Suleiman mountains were already of great importance to Pashtuns, in his day.

everest59
05-08-2014, 10:14 PM
A plot with La Brana1, MA1, Karitiana, the Onge, "Bedouin B", the Mbuti Pygmy, and a South Asian population, would be great. If this is actually possible.

everest, could you arrange for something along those lines?

If the Onge have very few shared SNPs, maybe try Dai? They are an excellent ASI proxy.

Also, I believe David might know which HGDP Bedouins are "B" or "A".

This would be an extremely informative graph. You could just fix one migration edge into the Karitiana, 42% ANE, and try different migration edges from there.

Well, it's not working at all. With the Onge, the SNP count went down all the way to 16k. Then I replaced it with Dai. Even with this adjustment, the SNP count is only 40k.
The South Asian population I chose was HGDP Pathans. Treemix is not finding Mal'ta -------> Pathan admixture with those populations. There are too many populations. The other problem is SNP count.
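
Editor's note: the drop in SNP count described above is what one would expect if only markers genotyped in every population are kept, so a single sparse sample can shrink the whole dataset. The sketch below illustrates that with toy data. The SNP IDs and panel contents are hypothetical and are not taken from the dataset used in this thread.

```python
from typing import Dict, Set


def shared_snps(panels: Dict[str, Set[str]]) -> Set[str]:
    """Return the SNP IDs present in every panel (empty set if no panels)."""
    if not panels:
        return set()
    return set.intersection(*panels.values())


def ids(start: int, stop: int) -> Set[str]:
    """Build hypothetical SNP IDs snp<start> .. snp<stop> inclusive."""
    return {f"snp{i}" for i in range(start, stop + 1)}


# Toy panels (hypothetical): three populations with good overlap.
PANELS = {
    "Mal'ta": ids(1, 6),
    "Pathan": ids(1, 8),
    "Dai": ids(2, 8),
}

# A sparsely genotyped sample sharing only a few markers with the rest.
SPARSE = {"Onge": {"snp2", "snp5", "snp9"}}

if __name__ == "__main__":
    before = shared_snps(PANELS)
    after = shared_snps({**PANELS, **SPARSE})
    print(f"shared SNPs without the sparse panel: {len(before)}")
    print(f"shared SNPs with the sparse panel:    {len(after)}")
```

Checking the overlap this way before a run shows which population is the bottleneck.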

Sein
05-08-2014, 10:15 PM
Well, it's not working at all. With the Onge, the SNP count went down all the way to 16k. Then I replaced it with Dai. Even with this adjustment, the SNP count is only 40k.
The South Asian population I chose was HGDP Pathans. Treemix is not finding Mal'ta -------> Pathan admixture with those populations. There are too many populations. The other problem is SNP count.

Could you show us the tree?

everest59
05-08-2014, 10:17 PM
Could you show us the tree?
There you go.
https://drive.google.com/file/d/0B3vEDdpZDjUpU2YweWFGSHByQWs/edit?usp=sharing

Sein
05-08-2014, 10:19 PM
There you go.
https://drive.google.com/file/d/0B3vEDdpZDjUpU2YweWFGSHByQWs/edit?usp=sharing

Thanks!

What are the migration edges+strength?

everest59
05-08-2014, 10:20 PM
Thanks!

What are the migration edges+strength?

11.9% from LaBrana + Mal'ta to Pathan.
Also, LaBrana + Mal'ta to Karitiana was at 19%.
Mbuti->Bedouin was 14%.

Sein
05-08-2014, 10:22 PM
11.9% from LaBrana + Mal'ta to Pathan.

Very interesting, nothing from the Dai, and the Pashtuns are just in the West Eurasian clade.

How does the tree look without any migration edges?

Also, could you remove La Brana?

Edit: So, you didn't input a known event?

everest59
05-08-2014, 10:27 PM
Very interesting, nothing from the Dai, and the Pashtuns are just in the West Eurasian clade.

How does the tree look without any migration edges?

Also, could you remove La Brana?

Edit: So, you didn't input a known event?

There you go :
https://drive.google.com/file/d/0B3vEDdpZDjUpMll4dXpXRENUanc/edit?usp=sharing

I'll remove LaBrana-1 next.

Yes, I did input Mal'ta --> Karitiana with admixture of 0.42.

Sein
05-08-2014, 10:31 PM
There you go :
https://drive.google.com/file/d/0B3vEDdpZDjUpMll4dXpXRENUanc/edit?usp=sharing

I'll remove LaBrana-1 next.

Beautiful!

As expected, ANE, WHG, and ENA are part of the same big clade.

It'll be interesting to see how this looks without LaBrana-1.

Edit: Without LaBrana-1, could you increase SNP count?

everest59
05-08-2014, 10:35 PM
Very interesting, nothing from the Dai, and the Pashtuns are just in the West Eurasian clade.

How does the tree look without any migration edges?

Also, could you remove La Brana?

Edit: So, you didn't input a known event?

It's still not finding Mal'ta to Pathan.
Even when you enter known event, the program still runs in an unsupervised manner.

Sein
05-08-2014, 10:37 PM
It's still not finding Mal'ta to Pathan.
Even when you enter known event, the program still runs in an unsupervised manner.

What does the tree look like?

If so, don't enter a known event, and see what it does. If it works unsupervised, that is best.

everest59
05-08-2014, 10:38 PM
What does the tree look like?
Actually, it sort of found Mal'ta ---> Pathan when I increased migration edge to 10.
Value of 42.4 %.
https://drive.google.com/file/d/0B3vEDdpZDjUpbnRiVEF2dG9PRU0/edit?usp=sharing

Sein
05-08-2014, 10:41 PM
Actually, it sort of found Mal'ta ---> Pathan when I increased migration edge to 10.
Value of 42.4 %.
https://drive.google.com/file/d/0B3vEDdpZDjUpbnRiVEF2dG9PRU0/edit?usp=sharing

That's interesting, they are on the same branch as MA1.

In your view, why can't it detect ENA to the Pashtuns?

everest59
05-08-2014, 10:43 PM
That's interesting, they are on the same branch as MA1.

In your view, why can't it detect ENA to the Pashtuns?
Good question. I think it may with a population like the Paniya. Perhaps we need a population that is actually closer to ASI.

Sein
05-08-2014, 10:45 PM
Good question. I think it may with a population like the Paniya. Perhaps we need a population that is actually closer to ASI.

Maybe the SNP count also plays a part?

Note: Could I move these posts to a new thread, this software is very interesting.

everest59
05-08-2014, 10:46 PM
Maybe the SNP count also plays a part?

Note: Could I move these posts to a new thread, this software is very interesting.

Yeah, the SNP count may play a part. You know, I should probably create a new dataset. I'm using 60k SNPs right now. I can push that to 100k+.

Sure, you can move it to a new thread.

Sein
05-08-2014, 10:53 PM
Yeah, the SNP count may play a part. You know, I should probably create a new dataset. I'm using 60k SNPs right now. I can push that to 100k+.

Sure, you can move it to a new thread.

Before creating a new dataset, could you try my data on this one?

Completely unsupervised, no LaBrana1. I'd truly appreciate this.

everest59
05-08-2014, 11:01 PM
Before creating a new dataset, could you try my data on this one?

Completely unsupervised, no LaBrana1. I'd truly appreciate this.

You mean something like this?
https://drive.google.com/file/d/0B3vEDdpZDjUpbVB5TUNJWE13ckU/edit?usp=sharing

Or did you want migration edges?

Sein
05-08-2014, 11:03 PM
You mean something like this?
https://drive.google.com/file/d/0B3vEDdpZDjUpbVB5TUNJWE13ckU/edit?usp=sharing

Or did you want migration edges?

Thanks!

What happens with one migration edge?

everest59
05-08-2014, 11:05 PM
Thanks!

What happens with one migration edge?

Without using the known event, i.e. Mal'ta -> Karitiana at 0.42?

Sein
05-08-2014, 11:06 PM
Without using the known event, i.e. Mal'ta -> Karitiana at 0.42?

That would be best, no known event.

everest59
05-08-2014, 11:13 PM
That would be best, no known event.

Well, migration edge of 1 was really weird. There was a line going from MbutiPygmy to Dai at 0.48 or 48%. What does this represent?

However, with migration edge of 2, it found ENA in you of 6.7 %. There is a line going from Dai to you.
https://drive.google.com/file/d/0B3vEDdpZDjUpMmJCUExWQUhqRVk/edit?usp=sharing

Sein
05-08-2014, 11:18 PM
Well, migration edge of 1 was really weird. There was a line going from MbutiPygmy to Dai at 0.48 or 48%. What does this represent?

However, with migration edge of 2, it found ENA in you of 6.7 %. There is a line going from Dai to you.
https://drive.google.com/file/d/0B3vEDdpZDjUpMmJCUExWQUhqRVk/edit?usp=sharing

Thanks!

6.7% sounds very reasonable, especially if that only involves ASI. I think I'm probably 6%-7% ASI, and 4% NE Asian/Siberian.

Could you try 3, 4, 5, 6, 7, 8, 9, and 10 migration edges? I really hope I'm not being annoying. I'm interested in how things change from the addition of another migration edge, each time.

But I must say, 48% Mbuti Pygmy into Dai is rather weird. I think it's best to attribute that to SNP count.
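
Editor's note: a sweep over migration-edge counts like the one requested here can be scripted so that each run writes to its own output stem and earlier results are not overwritten. This is a minimal sketch, not the setup actually used in the thread. It relies only on TreeMix's `-i` (input), `-m` (number of migration edges) and `-o` (output stem) options. The file name and output prefix are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex
from typing import List


def treemix_sweep(input_path: str, out_prefix: str, max_edges: int) -> List[List[str]]:
    """Build one TreeMix argv per migration-edge count, from 0 to max_edges."""
    commands = []
    for m in range(max_edges + 1):
        commands.append([
            "treemix",
            "-i", input_path,             # gzipped allele-count input
            "-m", str(m),                 # number of migration edges to fit
            "-o", f"{out_prefix}_m{m}",   # distinct output stem per run
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical file name and prefix, for illustration only.
    for argv in treemix_sweep("sein_panel.treemix.gz", "sein_run", 10):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Each printed line can be pasted into a shell, or the argv lists can be passed to `subprocess.run` on a machine where TreeMix is installed.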

everest59
05-08-2014, 11:23 PM
Thanks!

6.7% sounds very reasonable, especially if that only involves ASI. I think I'm probably 6%-7% ASI, and 4% NE Asian/Siberian.

Could you try 3, 4, 5, 6, 7, 8, 9, and 10 migration edges? I really hope I'm not being annoying. I'm interested in how things change from the addition of another migration edge, each time.

But I must say, 48% Mbuti Pygmy into Dai is rather weird. I think it's best to attribute that to SNP count.

Well, it seems 6.7% was just East Asian. At migration edge of 3, Dai -> Sein increased to 16.1%. The program may have incorporated ASI this time.

https://drive.google.com/file/d/0B3vEDdpZDjUpVHlKcmRKZVZuTG8/edit?usp=sharing

This is consistent with the theory that 50% of Harappa's South Indian component is ASI.

Sein
05-08-2014, 11:26 PM
Well, it seems 6.7% was just East Asian. At migration edge of 3, Dai -> Sein increased to 16.1%. The program may have incorporated ASI this time.

https://drive.google.com/file/d/0B3vEDdpZDjUpVHlKcmRKZVZuTG8/edit?usp=sharing

This is consistent with the theory that 50% of Harappa's South Indian component is ASI.

Very interesting! Basically, 9% ASI, 7% East Asian.

Migration edge 4?

everest59
05-08-2014, 11:30 PM
Very interesting! Basically, 9% ASI, 7% East Asian.

Migration edge 4?

I'm just going to post the data :
0.472349 NA NA NA MbutiPygmy:0.0654612 Dai:0.086075
0.0756995 NA NA NA Dai:0.086075 .Sein:0
0.0926898 NA NA NA MbutiPygmy:0.0654612 (Dai:0.086075,Karitiana:0.0272592):0.0347537
0.105215 NA NA NA Mal'ta:0.153745 .Sein:0

7.6 % from Dai to Sein
10% from Mal'ta to Sein

The values change quite a bit.
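
Editor's note: the rows posted above can be turned into percentages mechanically. The sketch below assumes each row is laid out as: edge weight, three optional statistics columns (shown as NA here), the source subtree, then the destination subtree. That reading matches the "Dai to Sein" and "Mal'ta to Sein" figures given in the post, but the column interpretation is an assumption, and the helper names are made up for illustration.

```python
from typing import NamedTuple


class MigrationEdge(NamedTuple):
    weight: float   # fraction of ancestry carried by the edge
    source: str     # subtree the edge originates from
    dest: str       # subtree the edge points into


def parse_edge(line: str) -> MigrationEdge:
    """Parse one whitespace-separated migration-edge row."""
    fields = line.split()
    # First field is the weight; the last two are source and destination.
    return MigrationEdge(float(fields[0]), fields[-2], fields[-1])


def label(subtree: str) -> str:
    """Drop the trailing ':branch_length' from a subtree string."""
    return subtree.rsplit(":", 1)[0]


# The four rows posted for migration edge 4.
POSTED_LINES = [
    "0.472349 NA NA NA MbutiPygmy:0.0654612 Dai:0.086075",
    "0.0756995 NA NA NA Dai:0.086075 .Sein:0",
    "0.0926898 NA NA NA MbutiPygmy:0.0654612 (Dai:0.086075,Karitiana:0.0272592):0.0347537",
    "0.105215 NA NA NA Mal'ta:0.153745 .Sein:0",
]

if __name__ == "__main__":
    for row in POSTED_LINES:
        edge = parse_edge(row)
        print(f"{label(edge.source)} -> {label(edge.dest)}: {edge.weight * 100:.1f}%")
```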

everest59
05-08-2014, 11:33 PM
5 as follows:
0.470464 NA NA NA MbutiPygmy:0.0656329 Dai:0.0855654
0.0766708 NA NA NA Dai:0.0855654 .Sein:7.80703e-08
0.074606 NA NA NA MbutiPygmy:0.0656329 (Dai:0.0855654,Karitiana:0.0271839):0.0330153
0.100604 NA NA NA Mal'ta:0.154321 .Sein:7.80703e-08
0.000947448 NA NA NA Mal'ta:0.154321 MbutiPygmy:0.0656329

You know, the patterns move around a lot. Let's just say garbage in garbage out. Got to be careful with the populations that are chosen.
However, I feel that the 16.7% ENA is probably correct.

Sein
05-08-2014, 11:35 PM
I'm just going to post the data :
0.472349 NA NA NA MbutiPygmy:0.0654612 Dai:0.086075
0.0756995 NA NA NA Dai:0.086075 .Sein:0
0.0926898 NA NA NA MbutiPygmy:0.0654612 (Dai:0.086075,Karitiana:0.0272592):0.0347537
0.105215 NA NA NA Mal'ta:0.153745 .Sein:0

7.6 % from Dai to Sein
10% from Mal'ta to Sein

The values change quite a bit.

Interesting. I guess an average of 16% and 8%, or 12% ENA, makes the most sense.

Besides the 10% ANE, and 7.6% ENA, is the rest of my ancestry closest to the Bedouin?

I suppose SNP count has a role to play.

everest59
05-08-2014, 11:37 PM
Interesting. I guess an average of 16% and 7%, or 11.5% ENA, makes the most sense.

Besides the 10.5% ANE, and 7.6% ENA, is the rest of my ancestry closest to the Bedouin?

Migration edge 5?

I posted 5 above. Here's 10 in case you're interested :
0.468441 NA NA NA MbutiPygmy:0.0664884 Dai:0.0853287
0.088006 NA NA NA Dai:0.0853287 .Sein:0.00209956
0.0697183 NA NA NA MbutiPygmy:0.0664884 (Dai:0.0853287,Karitiana:0.0271414):0.0326086
0.14216 NA NA NA Mal'ta:0.154463 .Sein:0.00209956
0.00689038 NA NA NA Mal'ta:0.154463 MbutiPygmy:0.0664884
0.00034138 NA NA NA Mal'ta:0.154463 Dai:0.0853287
2.7366e-05 NA NA NA Karitiana:0.0271414 .Sein:0.00209956
0.00083923 NA NA NA Bedouin:0.00405196 Dai:0.0853287

Okay, I gotta go now.
If people have anything specific in mind, let me know and I'll run it when I get the opportunity.
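
Editor's note: one rough way to judge how much the estimates "move around" is to line up the same edge across the runs posted above. The sketch below uses only the Dai -> Sein and Mal'ta -> Sein weights reported for 4, 5 and 10 migration edges. The spread it prints is a descriptive range, not a statistical confidence interval.

```python
from typing import Dict, Tuple

# Edge weights copied from the posted output, keyed by number of migration edges.
WEIGHTS: Dict[str, Dict[int, float]] = {
    "Dai -> Sein": {4: 0.0756995, 5: 0.0766708, 10: 0.088006},
    "Mal'ta -> Sein": {4: 0.105215, 5: 0.100604, 10: 0.14216},
}


def weight_range(by_m: Dict[int, float]) -> Tuple[float, float]:
    """Return the (smallest, largest) weight seen across runs."""
    values = list(by_m.values())
    return min(values), max(values)


if __name__ == "__main__":
    for edge, by_m in WEIGHTS.items():
        low, high = weight_range(by_m)
        print(f"{edge}: {low * 100:.1f}% to {high * 100:.1f}% "
              f"(spread {100 * (high - low):.1f} points)")
```

An edge whose weight shifts by several points as edges are added is less trustworthy than one that stays put.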

parasar
05-09-2014, 12:49 AM
There are still Jatts in Afghanistan, but it remains to be seen how they relate to Jatts from greater Punjab. They speak Hindko, and are nomadic people. Their lifestyle resembles that of the Romani of Europe.

Al Biruni mentions "Afghan" tribes (which is just the term used in Islamic sources for Pashtuns/Pakhtuns) living all the way down to the Indus. He wrote in the 11th century. Ibn Battuta mentions "Afghans" in what is eastern Afghanistan, and then he mentions "Afghans" robbing/killing travelers in a location that is now in Pakistan. In his time, Pashtuns/Pakhtuns were already performing their traditional calling, acting as highwaymen. He was writing in the 14th century. Amazingly, the Suleiman mountains were already of great importance to Pashtuns, in his day.

That is correct, and therefore it is surprising that in the Arab period there is no mention of Afghans in the same regions. But in the post-Arab Turkic period, Afghans are potentially noted by Arab writer El Masudi (amazing writer) in 956AD as Turks in the form Awkhan or Ayghan (Awkhan, Tubbat, Ayghan combo) which could have become Abghan and later Afghan in Persian.
http://books.google.com/books?id=bCVyhH5VDjAC&pg=PA168

Sein
05-09-2014, 04:04 AM
That is correct, and therefore it is surprising that in the Arab period there is no mention of Afghans in the same regions. But in the post-Arab Turkic period, Afghans are potentially noted by Arab writer El Masudi (amazing writer) in 956AD as Turks in the form Awkhan or Ayghan (Awkhan, Tubbat, Ayghan combo) which could have become Abghan and later Afghan in Persian.
http://books.google.com/books?id=bCVyhH5VDjAC&pg=PA168

For some reason, Turkic origins are ascribed to many Pashtun groups: Ghilzai, Turi, Afridi, Marwat, Yusufzai, etc. I think in many cases this was due to confusion on the part of writers.

Sein
05-09-2014, 06:15 AM
I posted 5 above. Here's 10 in case you're interested :
0.468441 NA NA NA MbutiPygmy:0.0664884 Dai:0.0853287
0.088006 NA NA NA Dai:0.0853287 .Sein:0.00209956
0.0697183 NA NA NA MbutiPygmy:0.0664884 (Dai:0.0853287,Karitiana:0.0271414):0.0326086
0.14216 NA NA NA Mal'ta:0.154463 .Sein:0.00209956
0.00689038 NA NA NA Mal'ta:0.154463 MbutiPygmy:0.0664884
0.00034138 NA NA NA Mal'ta:0.154463 Dai:0.0853287
2.7366e-05 NA NA NA Karitiana:0.0271414 .Sein:0.00209956
0.00083923 NA NA NA Bedouin:0.00405196 Dai:0.0853287

Okay, I gotta go now.
If people have anything specific in mind, let me know and I'll run it when I get the opportunity.

Once you get some leisure time, I'd like to see you try the same procedure on your own data. I'm rather interested in comparing our results, especially our ENA percentages. Also, it seems 16% ENA was a one time thing with my data. I guess the ANE leaked into ENA, as ANE and ENA are part of the same clade (looking at these results, I can't believe people were calling MA1 a "West Eurasian". He is clearly most similar to ENA populations, if we ignore the genetic contributions of his relatives to living West Eurasians). Around 7% ENA seems consistent for myself, when using different migration edges. It also matches 23andMe's 94% "European" (6% "East Asian"), and Dr. McDonald's 91% "European" (9% "East Asian").

If I'm around 6%-8% ENA, which seems to be the case, I think you are probably around 16%-19%. If I'm really at 16.1% ENA (this percentage seems to subsume some ANE, but we'll just assume it's correct, for our present purposes), I think you are probably around 30%-35% ENA.

To me though, it is interesting that the program can't detect any ANE admixture for the Karitiana, yet it finds 10%-15% ANE for myself. It seems my ANE signal is very strong.

For what it's worth, I think you really need a data-set with many more SNPs, around 100K.

*Question: Are these Bedouins the "Bedouin B"? If not, I think that could introduce some problems. Many HGDP Bedouins have minor ANE admixture (David said something about 8%), but the "B" samples have 0% ANE. If these are "Bedouin B", please ignore this.

Edit: New paper with more aDNA, http://www.plosgenetics.org/article/...l.pgen.1004353
They use TreeMix in the paper. TreeMix seems to be a popular tool, and with good reason.

Also, before you create a new data-set, I have one request. Try a graph with Georgians, Paniya, Dai, and MA1. Show us the tree without any migration edges, show us the tree with 1 migration edge, show us the tree with 2 migration edges, show us the tree with 3 migration edges, and show us the tree with 4 migration edges. Tell us the edge strengths for every migration edge, in every tree. Thanks in advance. We all truly appreciate your work with the data, you are absolutely amazing.

Shaikorth
05-09-2014, 07:45 AM
Once you get some leisure time, I'd like to see you try the same procedure on your own data. I'm rather interested in comparing our results, especially our ENA percentages. Also, it seems 16% ENA was a one-time thing with my data. I guess the ANE leaked into ENA, as ANE and ENA are part of the same clade (looking at these results, I can't believe people were calling MA1 a "West Eurasian". He is clearly most similar to ENA populations, if we ignore the genetic contributions of his relatives to living West Eurasians). Around 7% ENA seems consistent for myself, when using different migration edges. It also matches 23andMe's 94% "European" (6% "East Asian"), and Dr. McDonald's 91% "European" (9% "East Asian").

Yeah, Kets have been picked as MA-1's closest living Eurasian relatives by about every method I've seen so far and it would be a stretch to call them West Eurasian. They do have a component that often shows up as European, but breakdowns like Eurogenes' tests indicate their European is almost fully of the type that peaks in Volga populations like Mari and Chuvash (Eastern Euro). They have very little to none of the other westerly European components like North Sea, Atlantic and Baltic, interestingly unlike Mari and Chuvash.

This "European" is also sucked up into some kind of weird type of Siberian (Yeniseian?) component at higher K's. The component is modal in Kets and Selkups, absent in Yakuts or Evens, and forms all non-West Eurasian ancestry in HGDP Russians at K=10-K=13 in this Admixture test from Fedorova et al (http://www.biomedcentral.com/content/supplementary/1471-2148-13-127-s11.pdf) , but not in Chuvash who also have other types of Siberian like Yakut and also Nganassan at K=13 onwards.

This component is noticeably visible in Pathans and Burusho, and some of it is absorbed @K=14 into a Burusho component that is extremely high in Pathans.

Dr_McNinja
05-09-2014, 09:59 AM
There are still Jatts in Afghanistan, but it remains to be seen how they relate to Jatts from greater Punjab. They speak Hindko, and are nomadic people. Their lifestyle resembles that of the Romani of Europe.

Al Biruni mentions "Afghan" tribes (which is just the term used in Islamic sources for Pashtuns/Pakhtuns) living all the way down to the Indus. He wrote in the 11th century. Ibn Battuta mentions "Afghans" in what is eastern Afghanistan, and then he mentions "Afghans" robbing/killing travelers in a location that is now in Pakistan. In his time, Pashtuns/Pakhtuns were already performing their traditional calling, acting as highwaymen. He was writing in the 14th century. Amazingly, the Suleiman mountains were already of great importance to Pashtuns, in his day.

Yeah those are probably related to the Pahari Jatt people if they came from the East. They actually speak a different branch of West Punjabi called Jakati (http://en.wikipedia.org/wiki/Jakati_language) (sometimes called Jatki, in reference to the ancient language of Jats). Hindko is also a West Punjabi language. It's possible they're descended from the ancestors of Jats, or that they're just Jats from the Pakistan area who later moved west. However, they are spread throughout Ukraine and Russia too so we don't know, both scenarios could work. A genetic analysis of them would be fascinating.

Dr_McNinja
05-09-2014, 10:03 AM
Once you get some leisure time, I'd like to see you try the same procedure on your own data. I'm rather interested in comparing our results, especially our ENA percentages. Also, it seems 16% ENA was a one-time thing with my data. I guess the ANE leaked into ENA, as ANE and ENA are part of the same clade (looking at these results, I can't believe people were calling MA1 a "West Eurasian". He is clearly most similar to ENA populations, if we ignore the genetic contributions of his relatives to living West Eurasians). Around 7% ENA seems consistent for myself, when using different migration edges. It also matches 23andMe's 94% "European" (6% "East Asian"), and Dr. McDonald's 91% "European" (9% "East Asian").

If I'm around 6%-8% ENA, which seems to be the case, I think you are probably around 16%-19%. If I'm really at 16.1% ENA (this percentage seems to subsume some ANE, but we'll just assume it's correct, for our present purposes), I think you are probably around 30%-35% ENA.

To me though, it is interesting that the program can't detect any ANE admixture for the Karitiana, yet it finds 10%-15% ANE for myself. It seems my ANE signal is very strong.

For what it's worth, I think you really need a data-set with many more SNPs, around 100K.

*Question: Are these Bedouins the "Bedouin B"? If not, I think that could introduce some problems. Many HGDP Bedouins have minor ANE admixture (David said something about 8%), but the "B" samples have 0% ANE. If these are "Bedouin B", please ignore this.

Edit: New paper with more aDNA, http://www.plosgenetics.org/article/...l.pgen.1004353
They use TreeMix in the paper, :biggrin1:. TreeMix seems to be a popular tool, and with good reason.

Also, before you create a new data-set, I have one request. Try a graph with Georgians, Paniya, Dai, and MA1. Show us the tree without any migration edges, show us the tree with 1 migration edge, show us the tree with 2 migration edges, show us the tree with 3 migration edges, and show us the tree with 4 migration edges. Tell us the edge strengths for every migration edge, in every tree. Thanks in advance. We all truly appreciate your work with the data, you are absolutely amazing.

Seeing this post again makes me wonder, has Everest run anything to test ANE in Baloch? If it's noticeably lower than Pathans or Uttar Pradesh Brahmins, that could explain why there's this weird behavior with the Gedrosian component in admixture, it could mean ANE-heavy populations are being pushed away from it. (EDIT: Also, worth it to compare South Indian ANE affinity to Baloch)

It might also explain why there's a link between Gedrosian and Atlantic-Mediterranean components, and why potentially high-ANE populations also push out more Atlantic-Med. In my case it could be there's a few % of ANE which is very ENA-like to the software (certainly possible as the Jatts from that area are pushing an extra 5 to 10% into NE-Euro). That would explain the strong affinity to any Pathan/Pashtun component as well as North Indian Brahmins, but very low affinity to anything in between (Sindhi, Baloch, Gujarati) because of the Gedrosian and some Atlantic-Med spun off. Because there are Europeans in these calculators, the Atl-Med seems to combine with some ANE and get picked up as Northwest European. Either that or it's a legitimate signal from some recent ancestor (Scythian/Sarmatian/Alan) that's getting split up, hard to say. (EDIT: I think it could be a legitimate ancestry signal, admixture is probably pushing out the extreme west shifted stuff (sometimes into Basque or Sardinian) to compensate for the east-leaning ANE, it's probably getting confused by all the ANE-related West Eurasian populations... the K36 showed an affinity to NW Europe in particular, not a shift west of it).

Without an ancestral population very high in ANE as a proxy it will be tough to get accurate admixture estimations for people around this area. Using one of these populations as the component itself is one solution (I wonder what it would be like if Jatts were used to represent Eurasian Heartland) , but not perfect. Plus all the genetic drift separating some of these famously endogamous populations makes it difficult.

EDIT: And how do we really know the ancestors of MA-1 weren't actually from this region (south central asia)? That would throw a wrench in the works, we'd need pre-MA-1 ancient DNA from this region.

Dr_McNinja
05-09-2014, 10:47 AM
Yeah, Kets have been picked as MA-1's closest living Eurasian relatives by about every method I've seen so far and it would be a stretch to call them West Eurasian. They do have a component that often shows up as European, but breakdowns like Eurogenes' tests indicate their European is almost fully of the type that peaks in Volga populations like Mari and Chuvash (Eastern Euro). They have very little to none of the other westerly European components like North Sea, Atlantic and Baltic, interestingly unlike Mari and Chuvash.

This "European" is also sucked up into some kind of weird type of Siberian (Yeniseian?) component at higher K's. The component is modal in Kets and Selkups, absent in Yakuts or Evens, and forms all non-West Eurasian ancestry in HGDP Russians at K=10-K=13 in this Admixture test from Fedorova et al (http://www.biomedcentral.com/content/supplementary/1471-2148-13-127-s11.pdf) , but not in Chuvash who also have other types of Siberian like Yakut and also Nganassan at K=13 onwards.

This component is noticeably visible in Pathans and Burusho, and some of it is absorbed @K=14 into a Burusho component that is extremely high in Pathans.

In my old admixture runs I grouped Selkups (along with Nganassans I believe) into "Siberian Uralic", Evenki/Yakut into "Siberian Altaic" (Turkic), and Kets into "Siberian Paleo". The Russians were coming out as 80-85% Eastern/Northeastern Europe and 10-15% Siberian Uralic.

I just looked it up, here are two runs I did:

https://docs.google.com/spreadsheets/d/1rKbCU0KPU0mORBaO1kgUGc2_ovcdjH5nplsc_jiOGfs/edit#gid=1381982864

Northwestern India is HGDP Pathan and North Indian Brahmin in the first sheet. The second sheet has separate Pathan, Sindhi, Gujarati, and North Indian components. There's a clear affinity in Mordovians and Russians to Pathans (in addition to Uralic) which is striking in light of what FTDNA did with Pashtun as Eurasian Heartland. Even the Northwestern European populations are getting Pathan more than anything else.

I used Basques/Sardinians as proxies for non-ANE/ENA European.

(EDIT: Also, seeing how my admixture is almost allergic here to Gujarati but had no problem registering 38% in myOrigins shows what I mean about admixture just looking for the best fit in the lineup).

Dr_McNinja
05-09-2014, 11:09 AM
That is correct, and therefore it is surprising that in the Arab period there is no mention of Afghans in the same regions. But in the post-Arab Turkic period, Afghans are potentially noted by Arab writer El Masudi (amazing writer) in 956AD as Turks in the form Awkhan or Ayghan (Awkhan, Tubbat, Ayghan combo) which could have become Abghan and later Afghan in Persian.
http://books.google.com/books?id=bCVyhH5VDjAC&pg=PA168

There was something I read about a South Asian group sacking or attempting to sack Baghdad briefly in the early to mid 9th century AD. Did you read anything about some event like that? Who was it? I read that they called them Jats.

Wikipedia says Barmakids from Balkh (modern day Afghanistan) helped the Abbasids build the city in the 8th century too. It also says about the decline of their influence:


The Barmakid Army: Although technically this army was under the Abbasids, in reality, the soldiers gave allegiance to al-Fadl Ibn Yahya al-Barmaki, Ja'far's brother; it numbered 50,000 soldiers. During their last days, al-Fadl ordered 20 thousand of them to come to Baghdad and claimed to create a legion under the name of the Karnabiya Legion. This made Harun very wary of their intentions

So it could've been Afghans that they just called Jats? Or was it actually Jats who somehow randomly attacked that far west which I always doubted?

I got sucked into that Wikipedia black hole and was reading this: http://en.wikipedia.org/wiki/Battle_of_Rajasthan Talks about Gurjars. Who was "Yasovarman of Kannauj"? Any relation to the Yasodharman of Malwa who defeated the Huns in a similar alliance a few centuries before (and that inscription which says they were Jats)?

Shaikorth
05-09-2014, 11:16 AM
In my old admixture runs I grouped Selkups (along with Nganassans I believe) into "Siberian Uralic", Evenki/Yakut into "Siberian Altaic" (Turkic), and Kets into "Siberian Paleo". The Russians were coming out as 80-85% Eastern/Northeastern Europe and 10-15% Siberian Uralic.

I just looked it up, here are two runs I did:

https://docs.google.com/spreadsheets/d/1rKbCU0KPU0mORBaO1kgUGc2_ovcdjH5nplsc_jiOGfs/edit#gid=1381982864

Northwestern India is HGDP Pathan and North Indian Brahmin in the first sheet. The second sheet has separate Pathan, Sindhi, Gujarati, and North Indian components. There's a clear affinity in Mordovians and Russians to Pathans (in addition to Uralic) which is striking in light of what FTDNA did with Pashtun as Eurasian Heartland. Even the Northwestern European populations are getting Pathan more than anything else.

I used Basques/Sardinians as proxies for non-ANE/ENA European.

(EDIT: Also, seeing how my admixture is almost allergic here to Gujarati but had no problem registering 38% in myOrigins shows what I mean about admixture just looking for the best fit in the lineup).



Kets and Selkups by all indicators are very similar genetically, while Nganassans despite their Samoyedic language are much closer to Yakuts and Evens. Chukchis seem to be their own thing, approaching Eskimos.

The Selkups' presence in the Siberian-Uralic category probably makes those results comparable to the Fedorova et al admixture run for Russians. I think the ideal situation would be Kets and Selkups together without Nganassans.

Northwest Europeans like Orcadians seem to be getting either Basque and minor Sardinian or only Basque, and Russians get only Basque. How would the run look with only Sardinians or only Basques as the non-ANE euro reference?

Dr_McNinja
05-09-2014, 12:41 PM
Kets and Selkups by all indicators are very similar genetically, while Nganassans despite their Samoyedic language are much closer to Yakuts and Evens. Chukchis seem to be their own thing, approaching Eskimos.

The Selkups' presence in the Siberian-Uralic category probably makes those results comparable to the Fedorova et al admixture run for Russians. I think the ideal situation would be Kets and Selkups together without Nganassans.

Northwest Europeans like Orcadians seem to be getting either Basque and minor Sardinian or only Basque, and Russians get only Basque. How would the run look with only Sardinians or only Basques as the non-ANE euro reference?

I did a few like that, it turns out mostly the same with let's say 80-85% now going into Sardinian instead of Basque (Sardinian is interpreted as west of Basque) for Russians. For people with less European it shifted around some (the total Euro amount decreased for some). Basques would be 100% Sardinian in that case.

I couldn't find that run so I don't think I saved it (there's like a hundred randomly named spreadsheets here -_-) but I found these: (Sorry this is getting off-topic since the other thread was for admixture results)

https://docs.google.com/spreadsheets/d/1wZugJ9HdnRXyM4Er3KJZ320z3wZn_zrtBJJBng6Ma_Q/edit#gid=1493305812 (East Asian, two African components, Basque, and Sardinian). That shows the weird west shift in some (me, HRP0349, and soulblighter).

https://docs.google.com/spreadsheets/d/12X03bTUBi9cty6j3o2TKFkGEXQBwmvOWLLKF8Yz46_g/edit#gid=792931946 This was interesting, not sure what it means. It shows the East Eurasian difference isn't as much between me and HRP0341 (Sapporo), but whatever that W-African represents. I think the actual East Eurasian difference between us is 2 to 3% which shows up as a 5% South Indian swing in Harappa. Then there's something which is like East Eurasian but not, and very old.

https://docs.google.com/spreadsheets/d/1a6itnwHs2dPY2ouEXbm_QOOGa0eMJk69EEPCrAgLHUw/edit#gid=747615632 This sheds some more light and raises some more questions. The Sardinian shifted into SW-Asian for those with high Sardinian values. What it was really capturing was some kind of "local" West Eurasian (South Asian most likely) and the Mediterranean is closer in that respect. Which could explain the matches between South Indians and Western Europeans (who are Atlantic-Mediterranean).

https://docs.google.com/spreadsheets/d/1Hr0i9T4f7FcV1Nau2fC1Uo0Mx0O4Xvrt-U2Euv_SMME/edit#gid=1061815567 This run came out somewhat similar to myOrigins but using Baloch. It would appear here that West Caucasian is actually coming from South Indian because adding those two gives the Indian Tectonic values for myself and Sein or thereabouts. That's not too surprising because the West Eurasian part of the Harappa S-Indian component would often leak into Harappa Caucasian which is West Caucasus. Eastern Europe here is similar to myOrigins' North Circumpolar and the two Western Europe components could be close to Trans-Ural Peneplain and some other continental components (I had 3% Trans-Ural, 6% North Circumpolar and Sein had 7% North Circumpolar). However, I would expect the closer affinity to Pashtun to account for a little bit of a swing here for Sein. Some of what's going into North Caucasus would be pulled into Eurasian Heartland. The opposite for Jatts like HRP0341 who have stronger Baloch affinity to explain the 60% value here (so it would probably be in the 50s in myOrigins). I remember this was one of those lucky runs to get us all into close proximity around this South Central Asian component. I'm less drawn to Baloch so mine barely moved when going to a Pashtun Eurasian Heartland (from 50% to 49%). The biggest difference with myOrigins is that myOrigins is picking up significant East Asian. This could be due to some shuffling since they use Gujarati and not SE-Indian Tribal. The latter is similar enough to East Asians to pull most of that in. Based on this if I had to guess a myOrigins result for Sapporo/HRP0341 I'd guess less than 30% Indian Tectonic, mid-50s for Eurasian Heartland, and 10-11% North Circumpolar with very little continental European if any.

Unfortunately my brother-in-law (Gondal) was tested on V4 so I can't have an autosomal transfer to FTDNA.

parasar
05-09-2014, 03:40 PM
There was something I read about a South Asian group sacking or attempting to sack Baghdad briefly in the early to mid 9th century AD. Did you read anything about some event like that? Who was it? I read that they called them Jats.

Wikipedia says Barmakids from Balkh (modern day Afghanistan) helped the Abbasids build the city in the 8th century too. It also says about the decline of their influence:

So it could've been Afghans that they just called Jats? Or was it actually Jats who somehow randomly attacked that far west which I always doubted?


Zott, Jat etc. was just a generic term used by Arabs for Indians they came across in the Sindh area. So yes initially Afghans would also be included in that term.

The Zott in Mesopotamia did indeed engage in a major battle with the Arabs. They lost and were deported perhaps giving rise to the Gypsy influx into Europe. I had a link to a document in Arabic confirming the battle but I can't locate it now.

The timeline is here: http://books.google.com/books?id=pcAm-VTW_CQC&pg=PR19
In 1322AD they appear in Crete. The Roma L657 signature seen in Crete is very likely from that period. http://www.anthrogenica.com/showthread.php?2347-New-R1a-paper-by-Underhill-et-al&p=39041&viewfull=1#post39041



I got sucked into that Wikipedia black hole and was reading this: http://en.wikipedia.org/wiki/Battle_of_Rajasthan Talks about Gurjars. Who was "Yasovarman of Kannauj"? Any relation to the Yasodharman of Malwa who defeated the Huns in a similar alliance a few centuries before (and that inscription which says they were Jats)?

I doubt they were Jats, but that their descendants are Jats is quite possible as the Malwa plateau is one of the regions from where Jats moved to the Punjab (the region's name Malwa moved north too.). At that time they were Olikara (Aulikara) Malavas. The current reading of the Mandasor inscription is doubtful. Most historians equate Mihirakula with Ephthalite Huns, but I don't read it that way - I really can't see how the Huns and Mihirakula can be equated in the inscription.
You can read it and see what you feel: http://www.sdstate.edu/projectsouthasia/upload/Mandasor-of-Yashodharman.pdf
and the Mandasor Inscription of Kumaragupta and Bandhuvarman where the Malavas are also mentioned http://www.sdstate.edu/projectsouthasia/upload/Mandasor-of-Kumara.pdf
Risthal Inscription: http://www.jatland.com/home/Risthal_Inscription_of_Prakashadharman

Yasovarman of Kannauj is different and follows a little bit of a dark period. He was a Maurya remnant, and such remnants are seen in many locations (Rajasthan, Konkan, etc).
During this dark period the Chinese and the Tibetans came into northern India. The Chinese say they came to remove a usurper who had occupied the throne of Magadha. This usurper is likely a Chalukyan (later Solanki Rajputs) chief from Badami Karnataka (Vinayaditya, Vijayaditya period).

parasar
05-09-2014, 04:01 PM
...

EDIT: And how do we really know the ancestors of MA-1 weren't actually from this region (south central asia)? That would throw a wrench in the works, we'd need pre-MA-1 ancient DNA from this region.

If Y-P came from SE Asia up via South Asia to Siberia, then I think it is both:
MA-1's ancestors are from South Asia (Y-P) and MA-1's descendants (Y-R) are in South Asia.

parasar
05-09-2014, 06:54 PM
For some reason, Turkic origins are ascribed to many Pashtun groups: Ghilzai, Turi, Afridi, Marwat, Yusufzai, etc. I think in many cases this was due to confusion on the part of writers.

Some equate Abdali with the Hephthalite Huns/Turks too.
Andre Wink at the link I had posted above goes into great detail (pgs 112-125) on the Upper/Central Asiatic connection of the Zunbils ("brought southward by the Hephthalites") of Zabul/Kandahar/Zamindawar.
Perhaps not a coincidence that when the Zunbils and Turk-Sahis of Kabul disappear, the Afghans come into the historical record. While Wink thinks that they were only part Turk "infiltrated," the Arab writers consistently call them Turks.

http://books.google.com/books?id=bCVyhH5VDjAC&pg=PA112

Sein
05-09-2014, 07:28 PM
Some equate Abdali with the Hephthalite Huns/Turks too.
Andre Wink at the link I had posted above goes into great detail (pgs 112-125) on the Upper/Central Asiatic connection of the Zunbils ("brought southward by the Hephthalites") of Zabul/Kandahar/Zamindawar.
Perhaps not a coincidence that when the Zunbils and Turk-Sahis of Kabul disappear, the Afghans come into the historical record. While Wink thinks that they were only part Turk "infiltrated," the Arab writers consistently call them Turks.

http://books.google.com/books?id=bCVyhH5VDjAC&pg=PA112

This seems quite plausible.

In terms of historical anthropology, the question that consistently comes up in my mind is how such a homogeneous ethnic identity arose in the highlands between Khorasan and Hind? From Kandahar to Taxila, Pashtun ethnic identity is very strong and homogeneous, and dialectal variation is rather minor. Pashtun tribal customs also don't seem to vary much from southwestern Afghanistan all the way to northwestern Pakistan. The question is, why though? Do all Pashtuns perhaps emanate from some sort of ancient expansion, with roots in a single, restricted geographic region? If so, I think the most likely candidate for the primary cauldron of Pashtun/Pakhtun ethnogenesis is the Zhob area of Pakistani Balochistan. In fact, I'd extend this to the whole Suleiman mountain range.

everest59
05-12-2014, 10:23 PM
Okay, I'm trying to see if I can calculate ANI in the Paniya using Treemix. The populations I have used are :
Georgians
Karitiana
Paniya
MbutiPygmy

I went all the way to 4, but no results I wanted.
I'm gonna add more populations and see what I get.

Sein
05-12-2014, 10:28 PM
Okay, I'm trying to see if I can calculate ANI in the Paniya using Treemix. The populations I have used are :
Georgians
Karitiana
Paniya
MbutiPygmy

I went all the way to 4, but no results I wanted.
I'm gonna add more populations and see what I get.

Could you show us the tree?

Also, maybe MA1 is better than Karitiana.

everest59
05-12-2014, 10:29 PM
Okay, I'm trying to see if I can calculate ANI in the Paniya using Treemix. The populations I have used are :
Georgians
Karitiana
Paniya
MbutiPygmy

I went all the way to 4, but no results I wanted.
I'm gonna add more populations and see what I get.

Added CEU, and all it found was 0.08% CEU admixture in the Paniya.

everest59
05-12-2014, 10:31 PM
Could you show us the tree?

Also, maybe MA1 is better than Karitiana.

There you go. BTW, your inbox is full.
https://drive.google.com/file/d/0B3vEDdpZDjUpZTlJa3RuczZHVVk/edit?usp=sharing

Sein
05-12-2014, 10:32 PM
There you go. BTW, your inbox is full.
https://drive.google.com/file/d/0B3vEDdpZDjUpZTlJa3RuczZHVVk/edit?usp=sharing

For the sake of completeness, please add MA1.

everest59
05-12-2014, 10:36 PM
For the sake of completeness, please add MA1.

Interesting. 15% from Karitiana----> Paniya when I added Mal'ta.
Unsupervised mode. See below:

(MbutiPygmy:0.0093367,((Mal'ta:0.0151931,Karitiana:0.0137835):0.0050319,(Georgians:0,(CEU:0.000716342,Paniya:0):0):0):0.0093367);
0.152188 NA NA NA Karitiana:0.0137835 Paniya:0

This was using migration edge of 1.
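
The migration line in the output above can be read column by column: the weight comes first, then three statistic columns (NA in this run), then the subtree below the source of the edge and the subtree below its destination. Here is a rough sketch of a parser, assuming that column layout and assuming the subtree strings contain no spaces. The names MigrationEdge, strip_branch_length and parse_edge are made up for illustration and are not part of TreeMix.

```python
from dataclasses import dataclass


@dataclass
class MigrationEdge:
    weight: float  # fraction of ancestry attributed to this edge
    source: str    # subtree below the origin of the edge
    dest: str      # subtree below the destination of the edge


def strip_branch_length(subtree):
    """Drop the trailing ':length' from a Newick subtree label."""
    return subtree.rsplit(":", 1)[0]


def parse_edge(line):
    """Parse one migration line: weight, three stats, source, destination."""
    fields = line.split()
    weight = float(fields[0])
    # The last two whitespace-separated fields are the source and destination.
    source, dest = fields[-2], fields[-1]
    return MigrationEdge(weight, strip_branch_length(source), strip_branch_length(dest))


edge = parse_edge("0.152188 NA NA NA Karitiana:0.0137835 Paniya:0")
print(f"{edge.source} -> {edge.dest}: {edge.weight:.1%}")  # Karitiana -> Paniya: 15.2%
```

Forum software sometimes inserts stray spaces into long tree strings, so pasted output may need those spaces removed before a parser like this will work.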

Sein
05-12-2014, 10:38 PM
Interesting. 15% from Karitiana----> Paniya when I added Mal'ta.
Unsupervised mode. See below:

(MbutiPygmy:0.0093367,((Mal'ta:0.0151931,Karitiana:0.0137835):0.0050319,(Georgians:0,(CEU:0.000716342,Paniya:0):0):0):0.0093367);
0.152188 NA NA NA Karitiana:0.0137835 Paniya:0

This was using migration edge of 1.

Rather interesting. If you input a known event (ANE admixture in Karitiana), maybe the results might be of interest?

everest59
05-12-2014, 10:40 PM
Interesting. 15% from Karitiana----> Paniya when I added Mal'ta.
Unsupervised mode. See below:

(MbutiPygmy:0.0093367,((Mal'ta:0.0151931,Karitiana:0.0137835):0.0050319,(Georgians:0,(CEU:0.000716342,Paniya:0):0):0):0.0093367);
0.152188 NA NA NA Karitiana:0.0137835 Paniya:0

This was using migration edge of 1.

I went straight to migration edge of 5 :
0.334419 NA NA NA Karitiana:0.0172883 Mal'ta:0.0424241
0.161414 NA NA NA Karitiana:0.0172883 Paniya:0
0.100205 NA NA NA MbutiPygmy:0.00972931 Paniya:0
0.0587607 NA NA NA MbutiPygmy:0.00972931 (Georgians:0,Paniya:0):0
0.0253798 NA NA NA Mal'ta:0.0424241 Paniya:0

So the plot below:
https://drive.google.com/file/d/0B3vEDdpZDjUpZ2pfdmY1bWw3bEE/edit?usp=sharing

I assume Mbuti---> Paniya is basal Euro, but who knows?

So without letting the software know that Mal'ta is an ancient sample, looks like Treemix wasn't able to properly calculate the direction. Which should have been Mal'ta -----> Karitiana.

Sein
05-12-2014, 10:46 PM
I went straight to migration edge of 5 :
0.334419 NA NA NA Karitiana:0.0172883 Mal'ta:0.0424241
0.161414 NA NA NA Karitiana:0.0172883 Paniya:0
0.100205 NA NA NA MbutiPygmy:0.00972931 Paniya:0
0.0587607 NA NA NA MbutiPygmy:0.00972931 (Georgians:0,Paniya:0):0
0.0253798 NA NA NA Mal'ta:0.0424241 Paniya:0

So the plot below:
https://drive.google.com/file/d/0B3vEDdpZDjUpZ2pfdmY1bWw3bEE/edit?usp=sharing

I assume Mbuti---> Paniya is basal Euro, but who knows?

So without letting the software know that Mal'ta is an ancient sample, looks like Treemix wasn't able to properly calculate the direction. Which should have been Mal'ta -----> Karitiana.

Interesting. Assuming Mbuti---> Paniya is "Basal Eurasian", and assuming the Karitiana and MA1 migration edges into Paniya are ANE, the Paniya are around 29% West Eurasian (West Eurasian is just a combination of BEA, ANE, and WHG/SHG/UHG).
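
The ~29% figure is simply the sum of the three edge weights whose destination is Paniya itself. A small sketch of that arithmetic follows, with the weights copied from the 5-edge output above. Treating the sum as total West Eurasian ancestry is the assumption stated above, not something TreeMix reports, and the helper name total_into is made up for illustration.

```python
# (weight, source, destination) for each migration edge in the 5-edge run.
edges = [
    (0.334419, "Karitiana", "Mal'ta"),
    (0.161414, "Karitiana", "Paniya"),
    (0.100205, "MbutiPygmy", "Paniya"),
    # This edge enters the (Georgians, Paniya) clade, not Paniya alone,
    # so it is left out of the sum, as in the estimate above.
    (0.0587607, "MbutiPygmy", "(Georgians,Paniya)"),
    (0.0253798, "Mal'ta", "Paniya"),
]


def total_into(edges, target):
    """Sum the weights of all edges whose destination is exactly `target`."""
    return sum(weight for weight, _source, dest in edges if dest == target)


paniya_total = total_into(edges, "Paniya")
print(f"{paniya_total:.1%}")  # 28.7%
```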

What is the migration strength for Mbuti---> Georgian?

everest59
05-12-2014, 10:51 PM
Interesting. Assuming Mbuti---> Paniya is "Basal Eurasian", and assuming the Karitiana and MA1 migration edges into Paniya are ANE, the Paniya are around 29% West Eurasian.

What is the migration strength for Mbuti---> Georgian?

That was 5.8%.
Anyways, in supervised mode, I'm getting weird results. 67% from Georgian ---> Paniya as you can see below:
0.123521 NA NA NA Karitiana:0.0197707 Paniya:0
0.143951 NA NA NA Paniya:0 Karitiana:0.0197707
0.678019 NA NA NA Georgians:0 Paniya:0
0.0595567 NA NA NA ((Karitiana:0.0197707,Mal'ta:0.016487):0.00706229,(CEU:0,Georgians:0):0):0.0223678 Georgians:0
0.0605919 NA NA NA Paniya:0 Mal'ta:0.016487
0.0420857 NA NA NA Mal'ta:0.016487 Paniya:0

You know 67% from Georgian ---> Paniya could be ASI (software is weird sometimes). If that's the case then they are 33% ANI, which is close to the number above.

Sein
05-12-2014, 10:55 PM
That was 5.8%.
Anyways, in supervised mode, I'm getting weird results. 67% from Georgian ---> Paniya as you can see below:
0.123521 NA NA NA Karitiana:0.0197707 Paniya:0
0.143951 NA NA NA Paniya:0 Karitiana:0.0197707
0.678019 NA NA NA Georgians:0 Paniya:0
0.0595567 NA NA NA ((Karitiana:0.0197707,Mal'ta:0.016487):0.00706229,(CEU:0,Georgians:0):0):0.0223678 Georgians:0
0.0605919 NA NA NA Paniya:0 Mal'ta:0.016487
0.0420857 NA NA NA Mal'ta:0.016487 Paniya:0

The fact that Georgians (and Armenians+West Asian Jews) seem to have South Asian admixture (in formal mixture tests) could be a confounding factor?

Who knows, maybe the Bedouin will give different results. Maybe there are too many populations here with ANE?

everest59
05-12-2014, 10:57 PM
The fact that Georgians (and Armenians+West Asian Jews) seem to have South Asian admixture (in formal mixture tests) could be a confounding factor?

Who knows, maybe the Bedouin will give different results. Maybe there are too many populations here with ANE?

Okay, I'll try Bedouin now. South Asian admixture should not make a difference.

everest59
05-12-2014, 11:04 PM
The fact that Georgians (and Armenians+West Asian Jews) seem to have South Asian admixture (in formal mixture tests) could be a confounding factor?

Who knows, maybe the Bedouin will give different results. Maybe there are too many populations here with ANE?

Using only Bedouin, with no ANE-carrying population aside from Karitiana, Treemix finds only 9.9% from Bedouin ----> Paniya.

Sein
05-12-2014, 11:12 PM
This is rather interesting, there seems to be quite a cline in South Asia. Unsupervised, the software had me at 92%-93% West Eurasian, and the HGDP Pashtuns at something around 95% West Eurasian. At the other end of the spectrum, when run unsupervised, the software has the Paniya at 29% West Eurasian. One also finds a South Indian (Tamil) genome being utilized by Raghavan et al. in their TreeMix figures. The South Indian sample is around 60%-65% West Eurasian, 35-40% ENA.

This would probably describe things in a short/simple manner. In the far northwestern fringe, which is fundamentally intertwined with Central and West Asia, and rather peripheral to the rest of South Asia, people are around 90%-95% West Eurasian (most people). In the far south, most people who are a part of the caste system, and who also don't happen to be Brahmins, are around 60%-65% West Eurasian (again, most people). And among isolated tribal South Indians, West Eurasian admixture is around 30%-40% (this mostly applies to the Paniyar. Other tribal South Indians might have more West Eurasian admixture, although it's unlikely that any population in South India has less than the Paniyar). In terms of physical variation/phenotype, this all makes sense. One can discern a comparable cline in general phenotype (facial features). For example:

Pashtuns, 85%-95% West Eurasian, depending on individual

http://2.bp.blogspot.com/-MVb6Lz14LEA/UhWvOoIHHQI/AAAAAAAAHrI/RNvV-UOaiNM/s1600/Shaista+Khan+Kakar.jpg

http://upload.wikimedia.org/wikipedia/commons/0/0a/Pashtun_clothes.jpg

Tamils, 55%-65% West Eurasian, depending on individual

http://m8.i.pbase.com/o4/05/32605/1/56305398.IMG_2351a.jpg

http://www0.f1online.de/premid/005737000/5737594.jpg

Paniyar, 30%-40% West Eurasian, depending on individual

http://stanthek.files.wordpress.com/2010/05/vellutha1.jpg

http://www.anujnair.net/photos/LIFE/Moopan-original1medium-1.jpg

Regardless, there has to be a way to get more dependable results. I'm not saying the results are incorrect, but they can vary rather wildly. The only solution is a Denisova/Neanderthal outgroup.

everest59
05-12-2014, 11:17 PM
This is rather interesting, there seems to be quite a cline in South Asia. Unsupervised, the program had me at 92%-93% West Eurasian, and the HGDP Pashtuns at something around 95% West Eurasian. At the other end of the spectrum, when run unsupervised, the program has the Paniya at 29% West Eurasian. In terms of physical variation/phenotype, this all makes sense.

Still, there has to be a way to get more dependable results. I'm not saying the results are incorrect, but they can vary rather wildly. The only solution is a Denisova/Neanderthal outgroup.

Alright, I'll try all this out with Neanderthal/Denisova outgroup.

Sein
05-12-2014, 11:25 PM
Alright, I'll try all this out with Neanderthal/Denisova outgroup.

Thanks! It'll be very interesting to see how the results change, or if they don't actually change at all.

Edit: everest, before you start using the Denisova/Neanderthal as outgroups, could you (as a final act, :biggrin1:) run TreeMix on your whole data-set? No migration edges, just a plain tree, with MA1 also included (but no LaBrana1, and no Oceanians. They excluded Oceanians in the paper).

everest59
05-12-2014, 11:37 PM
Thanks! It'll be very interesting to see how the results change, or if they don't actually change at all.

Edit: everest, before you start using the Denisova/Neanderthal as outgroups, could you (as a final act, :biggrin1:) run TreeMix on your whole data-set? No migration edges, just a plain tree, with MA1 also included (but no LaBrana1).

Good idea, but I think the file that'll be created will probably be humongous. I'm gonna try it now.

Edit, anyways here is a fun little plot with all the participants. There are some things that make sense, but not all of them. It was just for fun:
https://drive.google.com/file/d/0B3vEDdpZDjUpX2ltczRTcE1icmc/edit?usp=sharing

Migration edges to go with above:
(((.Sein:0.112802,((.Mcninja:0.0921023,((Karitiana:0.192948,Mal'ta:0.19105):0.0255537,.Everest:0.0918502):0.00247343):0.00171813,.Sapporo:0.135938):0.00170498):0.000878158,(.DMXX:0.0916781,(.NK19:0.0922982,((.Humanist:0.0931185,.Zephyrous:0.0945437):0.00105361,(.mfa:0.0926242,(.Icebreaker:0.123533,.Shaikorth:0.218759):0.00022638):0.000749822):0.00109801):0.00104511):0.00220515):0.0405723,MbutiPygmy:0.0405723);
(((.Sein:0.0188484,((.Mcninja:0,((Karitiana:0.188363,Mal'ta:0.144875):0.0255537,.Everest:0):0.00247343):0.00171813,.Sapporo:0.042099):0.00170498):0.000878158,(.DMXX:0,(.NK19:0,((.Humanist:0.00154389,.Zephyrous:0.00522159):0.00105361,(.mfa:0.00145363,(.Icebreaker:0.0385726,.Shaikorth:0.127078):0.00022638):0.000749822):0.00109801):0.00104511):0.00220515):0.0405723,MbutiPygmy:0.035073);
0.0507763 NA NA NA MbutiPygmy:0.0405723 Karitiana:0.192948
0.333271 NA NA NA (Karitiana:0.192948,Mal'ta:0.19105):0.0255537 .Shaikorth:0.218759
0.370843 NA NA NA .Mcninja:0.0921023 Karitiana:0.192948
0.0914641 NA NA NA Karitiana:0.192948 .Sein:0.112802
0.177045 NA NA NA (Karitiana:0.192948,Mal'ta:0.19105):0.0255537 .Sapporo:0.135938
0.101206 NA NA NA Mal'ta:0.19105 .Icebreaker:0.123533
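
For anyone who wants to read these lines without squinting, here is a minimal Python sketch. It assumes the layout of the lines as pasted above: six whitespace-separated fields, with the migration weight first, three columns that show NA in this run, and then the source and destination subtrees. The function name parse_migration_edges is made up for illustration, and the sketch assumes each subtree is written without internal spaces.

```python
def parse_migration_edges(lines):
    """Parse TreeMix-style migration lines into (weight, source, destination) tuples."""
    edges = []
    for line in lines:
        fields = line.split()
        # Assumption: a migration line has exactly six fields. Tree lines
        # (a single Newick string with no spaces) are skipped by this check.
        if len(fields) != 6:
            continue
        weight = float(fields[0])
        source, destination = fields[4], fields[5]
        edges.append((weight, source, destination))
    return edges


# Two of the lines pasted above, copied as-is.
example = [
    "0.370843 NA NA NA .Mcninja:0.0921023 Karitiana:0.192948",
    "0.101206 NA NA NA Mal'ta:0.19105 .Icebreaker:0.123533",
]

for weight, source, destination in parse_migration_edges(example):
    # Show the weight as a percentage of ancestry in the destination.
    print(f"{source} -> {destination}: {weight:.1%}")
```

The weight is the fraction of ancestry the destination is modelled as receiving through that edge, so it reads directly as a percentage.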

Sein
05-12-2014, 11:46 PM
Good idea, but I think the file that'll be created will probably be humongous. I'm gonna try it now.

Edit, anyways here is a fun little plot with all the participants. There are some things that make sense, but not all of them. It was just for fun:
https://drive.google.com/file/d/0B3vEDdpZDjUpX2ltczRTcE1icmc/edit?usp=sharing

MA1 is clustering with South Asians, B). :P

Edit: Could you do something with South Asia, Afghanistan to South India, and Central Asia also included (all the "stan" samples you got)?

everest59
05-12-2014, 11:48 PM
MA1 is clustering with South Asians, B). :P

Hmm, yeah. I didn't even notice that.

Sein
05-13-2014, 12:04 AM
Hmm, yeah. I didn't even notice that.

Speaking of South Asia, could you try a tree with all South Asians, all West Asians, and all Central Asians, together?

everest59
05-13-2014, 12:06 AM
Speaking of South Asia, could you try a tree with all South Asians, all West Asians, and all Central Asians, together?

Well, I may need to remove some populations because I am getting an allocation error. Wonder if that's got to do with memory. Last try and I'll do what you say.
Edit: Looks like I need to go due to an emergency unfortunately. Will continue later. If not tomorrow.

ZephyrousMandaru
05-13-2014, 12:11 AM
Are Humanist, myself, Icebreaker and Shaikorth more Basal Eurasian admixed than the individuals above us?

Sein
05-13-2014, 12:14 AM
Well, I may need to remove some populations because I am getting an allocation error. Wonder if that's got to do with memory. Last try and I'll do what you say.
Edit: Looks like I need to go due to an emergency unfortunately. Will continue later. If not tomorrow.

No problem my friend.

Sein
05-13-2014, 12:20 AM
Are Humanist, myself, Icebreaker and Shaikorth more Basal Eurasian admixed than the individuals above us?

This is a very good interpretation. The participants clustering with MA1 and Karitiana are predominantly "Derived Eurasian". You can tell among the South Asians which individual has the most BEA, judging by proximity to MA1 and Karitiana. Looking at the graph, it seems I have the most BEA out of all South Asians, and Everest has the least. DMXX has the least BEA out of all West Asians, and mfa seems to have the most.

ZephyrousMandaru
05-13-2014, 12:25 AM
This is a very good interpretation. The participants clustering with MA1 and Karitiana are predominantly "Derived Eurasian". You can tell among the South Asians which individual has the most BEA, judging by proximity to MA1 and Karitiana. Looking at the graph, it seems I have the most BEA out of all South Asians, and Everest has the least. DMXX has the least BEA out of all West Asians, and mfa seems to have the most.

It seems rather unusual that MfA has the most Basal Eurasian admixture out of all the West Asian participants. If anything, I would have expected Humanist and I to have the most. I tend to associate the derived clusters of Mediterranean and Southwest Asian with having the most Basal Eurasian ancestry. Iranians and Kurds have less Mediterranean and Southwest Asian overall. They also have nearly 30% on average of the Gedrosia component, which arguably has more ANE related ancestry in it than Caucasus or even North European.

Sein
05-13-2014, 12:37 AM
It seems rather unusual that MfA has the most Basal Eurasian admixture out of all the West Asian participants. If anything, I would have expected Humanist and I to have the most. I tend to associate the derived clusters of Mediterranean and Southwest Asian with having the most Basal Eurasian ancestry. Iranians and Kurds have less Mediterranean and Southwest Asian overall. They also have nearly 30% on average of the Gedrosia component, which arguably has more ANE related ancestry in it than Caucasus or even North European.

It could be that you and Humanist do have the highest BEA. Since you and Humanist are in the center, another interpretation could be that DMXX+NK19191 are shifted toward South Asia, and mfa towards Europe.

parasar
05-13-2014, 12:54 AM
I went straight to migration edge of 5 :
0.334419 NA NA NA Karitiana:0.0172883 Mal'ta:0.0424241
0.161414 NA NA NA Karitiana:0.0172883 Paniya:0
0.100205 NA NA NA MbutiPygmy:0.00972931 Paniya:0
0.0587607 NA NA NA MbutiPygmy:0.00972931 (Georgians:0,Paniya:0):0
0.0253798 NA NA NA Mal'ta:0.0424241 Paniya:0

So the plot below:
https://drive.google.com/file/d/0B3vEDdpZDjUpZ2pfdmY1bWw3bEE/edit?usp=sharing

I assume Mbuti---> Paniya is basal Euro, but who knows?

So without letting the software know that Mal'ta is an ancient sample, it looks like Treemix wasn't able to properly calculate the direction, which should have been Mal'ta -----> Karitiana.

Karitiana -----> Mal'ta looks correct in a sense, as it is possible one portion of the ancestors of Karitiana (Y-P, seen both in the Americas and Eurasia) merged into the Eurasian ancestors of MA1 (mtDNA U, seen only in Eurasia). I wonder if Raghavan et al would have gotten the results they did if they had removed the later east Asiatic component from the Karitiana and had checked direction with only the early American retained. The earliest Americans do not show east Asiatic features.


To test the statistical support for the direction of gene flow, a bootstrap analysis was performed where an a priori migration edge from the Denisova lineage to the ancestry of the Papuan was included, and TreeMix was then used to optimize a second migration edge. Over 100 bootstrap pseudoreplicates, a migration edge from MA-1 to Karitiana was observed in 99 cases, and a migration edge from Karitiana to MA-1 in 1 case.
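
The tallying step in that bootstrap procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function tally_directions is hypothetical, each replicate is reduced to a (source, destination) pair, and the toy input simply mirrors the 99-to-1 split quoted above rather than any real TreeMix output.

```python
from collections import Counter


def tally_directions(replicate_edges, pop_a, pop_b):
    """Count how often the fitted edge runs pop_a -> pop_b versus pop_b -> pop_a."""
    counts = Counter()
    for source, destination in replicate_edges:
        if (source, destination) == (pop_a, pop_b):
            counts[f"{pop_a}->{pop_b}"] += 1
        elif (source, destination) == (pop_b, pop_a):
            counts[f"{pop_b}->{pop_a}"] += 1
        else:
            # Any edge that does not connect the two populations of interest.
            counts["other"] += 1
    return counts


# Toy input that only mirrors the split quoted above; not real output.
toy_replicates = [("MA-1", "Karitiana")] * 99 + [("Karitiana", "MA-1")]
counts = tally_directions(toy_replicates, "MA-1", "Karitiana")
total = sum(counts.values())
for direction, n in counts.items():
    print(f"{direction}: {n}/{total}")
```

The share of replicates supporting one direction is then a rough measure of how stable the inferred direction is under resampling.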

Sein
05-13-2014, 01:00 AM
"To test the statistical support for the direction of gene flow, a bootstrap analysis was performed where an a priori migration edge from the Denisova lineage to the ancestry of the Papuan was included, and TreeMix was then used to optimize a second migration edge. Over 100 bootstrap pseudoreplicates, a migration edge from MA-1 to Karitiana was observed in 99 cases, and a migration edge from Karitiana to MA-1 in 1 case."

This is actually a good argument for having a Denisova outgroup.

Edit: There seem to be no Dai in the Paniya experiments, the closest ASI proxies besides Onge. I think that might have affected the results, to a degree.

parasar
05-13-2014, 03:15 AM
Karitiana -----> Mal'ta looks correct in a sense, as it is possible one portion of the ancestors of Karitiana (Y-P, seen both in the Americas and Eurasia) merged into the Eurasian ancestors of MA1 (mtDNA U, seen only in Eurasia). I wonder if Raghavan et al would have gotten the results they did if they had removed the later east Asiatic component from the Karitiana and had checked direction with only the early American retained. The earliest Americans do not show east Asiatic features.

Pickrell et al. had it going from the ancestors of Amerindians to Eurasians. The Maya Amerindians are different, as they had a subsequent Eurasian input.

http://i46.tinypic.com/slppbt.jpg
Figure 3: Inferred human tree. A. Maximum likelihood tree.
pg 17 http://arxiv.org/pdf/1206.2332.pdf

MfA
05-13-2014, 09:58 AM
Good idea, but I think the file that'll be created will probably be humongous. I'm gonna try it now.

Edit, anyways here is a fun little plot with all the participants. There are some things that make sense, but not all of them. It was just for fun:
https://drive.google.com/file/d/0B3vEDdpZDjUpX2ltczRTcE1icmc/edit?usp=sharing

Migration edges to go with above:
(((.Sein:0.112802,((.Mcninja:0.0921023,((Karitiana:0.192948,Mal'ta:0.19105):0.0255537,.Everest:0.0918502):0.00247343):0.00171813,.Sapporo:0.135938):0.00170498):0.000878158,(.DMXX:0.0916781,(.NK19:0.0922982,((.Humanist:0.0931185,.Zephyrous:0.0945437):0.00105361,(.mfa:0.0926242,(.Icebreaker:0.123533,.Shaikorth:0.218759):0.00022638):0.000749822):0.00109801):0.00104511):0.00220515):0.0405723,MbutiPygmy:0.0405723);
(((.Sein:0.0188484,((.Mcninja:0,((Karitiana:0.188363,Mal'ta:0.144875):0.0255537,.Everest:0):0.00247343):0.00171813,.Sapporo:0.042099):0.00170498):0.000878158,(.DMXX:0,(.NK19:0,((.Humanist:0.00154389,.Zephyrous:0.00522159):0.00105361,(.mfa:0.00145363,(.Icebreaker:0.0385726,.Shaikorth:0.127078):0.00022638):0.000749822):0.00109801):0.00104511):0.00220515):0.0405723,MbutiPygmy:0.035073);
0.0507763 NA NA NA MbutiPygmy:0.0405723 Karitiana:0.192948
0.333271 NA NA NA (Karitiana:0.192948,Mal'ta:0.19105):0.0255537 .Shaikorth:0.218759
0.370843 NA NA NA .Mcninja:0.0921023 Karitiana:0.192948
0.0914641 NA NA NA Karitiana:0.192948 .Sein:0.112802
0.177045 NA NA NA (Karitiana:0.192948,Mal'ta:0.19105):0.0255537 .Sapporo:0.135938
0.101206 NA NA NA Mal'ta:0.19105 .Icebreaker:0.123533

Thanks everest59, looks interesting. I hope the thread gets the attention of our Armenian user Caesera, Luristani Baibars, Turkish Sangarius, and Hanna, and that they join in.

Ignis90
05-13-2014, 02:40 PM
http://i46.tinypic.com/slppbt.jpg

BTW, I realized later that the Mandenka contributions to Middle Eastern populations and Mozabites were superfluous because it's actually the other way around: Mandenka is actually clearly Eurasian-shifted compared to Yoruba in regular PCA plots and exhibits minor [West] Eurasian component in ADMIXTURE runs.
Makes sense because 33% African contribution to Mozabite is way too high (and 67% Eurasian ancestry is way too low). Same for Middle Eastern populations.

parasar
05-13-2014, 05:40 PM
BTW, I realized later that the Mandenka contributions to Middle Eastern populations and Mozabites were superfluous because it's actually the other way around: Mandenka is actually clearly Eurasian-shifted compared to Yoruba in regular PCA plots and exhibits minor [West] Eurasian component in ADMIXTURE runs.
Makes sense because 33% African contribution to Mozabite is way too high (and 67% Eurasian ancestry is way too low). Same for Middle Eastern populations.

I would agree that it shows affinity more than direction. I'm not even sure of the direction from Denisovans to Papuans, as depicted on these trees. It may be best to confirm using uni-parental markers.

In the case of the Makrani though, it would not be surprising if the flow was from Africa, as seen on the tree. The Makrani are clearly African-shifted, and this is also seen in their uni-parental markers as well as in the historical record.

"Makrani population in Pakistan, which is characterized by a large component of sub-Saharan African lineages" http://www.biomedcentral.com/1471-2148/7/32/
"Makrani, with a postulated origin in Africa, carry the highest frequency of haplogroup 8 chromosomes found in any Pakistani population, as noted elsewhere (Qamar et al. 1999). This haplogroup is largely confined to sub-Saharan Africa, where it constitutes about half of the population" http://www.ncbi.nlm.nih.gov/pmc/articles/PMC447589/
http://historypak.com/makrani-culture/

Sein
05-13-2014, 06:34 PM
Well, I may need to remove some populations because I am getting an allocation error. Wonder if that's got to do with memory. Last try and I'll do what you say...

Maybe, you could try all of the HGDP South/Central Asians (Balochi, Brahui, Burusho, Hazara, Kalash, Makrani, Pashtun, and Sindhi), all Di Cristofaro Afghan populations (Hazara, Pashtun, Tajik, Turkmen, and Uzbek), Central Asians from Yunusbayev and Behar (Yunusbayev Tajiks, Yunusbayev Turkmen, and Behar Uzbeks), Behar Iranians, Yunusbayev Armenians, Behar Lezgin, Yunusbayev Chechen, HGDP Bedouins, Behar Lebanese, HGDP Mozabite, HGDP Daur, HGDP Dai, HGDP She, HGDP Yoruba, HGDP Mbuti Pygmy, MA1, and all of your participants? Just a plain tree, and only these populations. I'd truly appreciate this, and the output would be interesting. Also, it'll be much easier than using all of your South, West, and Central Asian populations. I have a participant's raw-data, I think it'll be interesting to see where they cluster. I'll send you the raw-data. Thanks in advance my friend.

Note: As always, please exclude HGDP00239, HGDP00237, HGDP00232, and HGDP00258 from the Pashtun samples.

ZephyrousMandaru
05-13-2014, 07:50 PM
BTW, I realized later that the Mandenka contributions to Middle Eastern populations and Mozabites were superfluous because it's actually the other way around: Mandenka is actually clearly Eurasian-shifted compared to Yoruba in regular PCA plots and exhibits minor [West] Eurasian component in ADMIXTURE runs.
Makes sense because 33% African contribution to Mozabite is way too high (and 67% Eurasian ancestry is way too low). Same for Middle Eastern populations.

How do you know it's superfluous, or that the gene flow is necessarily unidirectional? Especially in light of this Basal Eurasian component, which splits into and forms its own distinct clade. If it were truly West Eurasian admixture these Africans had, I would have expected it to be a subset of the West Eurasian clade, instead of branching into its own. I'm not saying that there aren't some Africans who have recent West Eurasian ancestry, but I very much doubt all the "Eurasian" admixture in Africans is entirely West Eurasian in origin or even West Eurasian at all.

Ignis90
05-13-2014, 09:39 PM
How do you know it's superfluous, or that the gene flow is necessarily unidirectional? Especially in light of this Basal Eurasian component, which splits into and forms its own distinct clade. If it were truly West Eurasian admixture these Africans had, I would have expected it to be a subset of the West Eurasian clade, instead of branching into its own. I'm not saying that there aren't some Africans who have recent West Eurasian ancestry, but I very much doubt all the "Eurasian" admixture in Africans is entirely West Eurasian in origin or even West Eurasian at all.

I based my statement on the results for Mandenka in other analyses. I am not ruling out bidirectional gene flow; I'm just pointing out that if there were to be ONE unidirectional gene flow, it should have been the other way around (from Mozabites to Mandenka, especially if the outliers in Mozabites have been removed).
As for my use of "West Eurasian", I am using modern West Eurasian populations as a proxy for recent gene flow (and rightfully so; Mozabite gene flow into Mandenka should only date back to the medieval Muslim kingdom of Mali). Of course, when it comes to older Eurasian gene flows into Africa (mtDNA M1 for instance), modern "West Eurasians" (who are basal and derived Eurasian-admixed) may not always be the best proxy (SSA samples at lower Ks, when an East Eurasian-centered component appears, can score some non-West Eurasian %).

So Mandenka contribution is probably "superfluous" in the case of Mozabites (wholly or partially, I don't know) but may not be for others: Makranis for instance as parasar mentioned, although it should have been Bantu instead based on history (maybe Middle East-admixture in Makranis made Mozabite-admixed Mandenka a better fit).

everest59
05-13-2014, 10:18 PM
Maybe, you could try all of the HGDP South/Central Asians (Balochi, Brahui, Burusho, Hazara, Kalash, Makrani, Pashtun, and Sindhi), all Di Cristofaro Afghan populations (Hazara, Pashtun, Tajik, Turkmen, and Uzbek), Central Asians from Yunusbayev and Behar (Yunusbayev Tajiks, Yunusbayev Turkmen, and Behar Uzbeks), Behar Iranians, Yunusbayev Armenians, Behar Lezgin, Yunusbayev Chechen, HGDP Bedouins, Behar Lebanese, HGDP Mozabite, HGDP Daur, HGDP Dai, HGDP She, HGDP Yoruba, HGDP Mbuti Pygmy, MA1, and all of your participants? Just a plain tree, and only these populations. I'd truly appreciate this, and the output would be interesting. Also, it'll be much easier than using all of your South, West, and Central Asian populations. I have a participant's raw-data, I think it'll be interesting to see where they cluster. I'll send you the raw-data. Thanks in advance my friend.

Note: As always, please exclude HGDP00239, HGDP00237, HGDP00232, and HGDP00258 from the Pashtun samples.

A little busy right now. I'll do it tomorrow, I promise.
I actually tried it yesterday with only 400 or so samples, and my computer just froze. I had to pull the plug. I think I may need to include only 5 samples per population.
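
Capping each population at 5 samples could be done along these lines. This is a minimal Python sketch, not what I actually ran: the function subsample, the fixed seed, and the toy sample IDs are all assumptions for illustration, and a hand-picked list of IDs would simply replace the random draw for any population where specific individuals are wanted.

```python
import random
from collections import defaultdict


def subsample(samples, per_population=5, seed=0):
    """Pick at most `per_population` sample IDs from each population.

    `samples` is a list of (sample_id, population) pairs.
    """
    by_population = defaultdict(list)
    for sample_id, population in samples:
        by_population[population].append(sample_id)

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    chosen = {}
    for population, ids in by_population.items():
        # Populations that already have 5 or fewer samples are kept whole.
        k = min(per_population, len(ids))
        chosen[population] = sorted(rng.sample(ids, k))
    return chosen


# Hypothetical IDs and population names, for illustration only.
toy_samples = [(f"A{i:02d}", "PopA") for i in range(8)]
toy_samples += [(f"B{i:02d}", "PopB") for i in range(3)]

picked = subsample(toy_samples)
for population, ids in picked.items():
    print(population, len(ids), ids)
```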

Sein
05-13-2014, 10:25 PM
A little busy right now. I'll do it tomorrow, I promise.
I actually tried it yesterday with only 400 or so samples, and my computer just froze. I had to pull the plug. I think I may need to include only 5 samples per population.

Thanks!

This sounds reasonable, 5 samples per population (all of these populations) would be very nice. For the HGDP Pashtuns, please use HGDP00214, HGDP00259, HGDP00254, HGDP00222, and HGDP00213. The nice thing is that the Afghan populations are all only 5-per group.

Also, I take back using the Behar Lebanese. Instead, just HGDP Palestinians in their place.

everest59
05-13-2014, 10:37 PM
Thanks!

This sounds reasonable, 5 samples per population (all of these populations) would be very nice. For the HGDP Pashtuns, please use HGDP00214, HGDP00259, HGDP00254, HGDP00222, and HGDP00213. The nice thing is that the Afghan populations are all only 5-per group.

Also, I take back using the Behar Lebanese. Instead, just HGDP Palestinians in their place.

Will do. Btw, also got the raw data.

MfA
05-14-2014, 10:46 AM
Maybe, you could try all of the HGDP South/Central Asians (Balochi, Brahui, Burusho, Hazara, Kalash, Makrani, Pashtun, and Sindhi), all Di Cristofaro Afghan populations (Hazara, Pashtun, Tajik, Turkmen, and Uzbek), Central Asians from Yunusbayev and Behar (Yunusbayev Tajiks, Yunusbayev Turkmen, and Behar Uzbeks), Behar Iranians, Yunusbayev Armenians, Behar Lezgin, Yunusbayev Chechen, HGDP Bedouins, Behar Lebanese, HGDP Mozabite, HGDP Daur, HGDP Dai, HGDP She, HGDP Yoruba, HGDP Mbuti Pygmy, MA1, and all of your participants? Just a plain tree, and only these populations. I'd truly appreciate this, and the output would be interesting. Also, it'll be much easier than using all of your South, West, and Central Asian populations. I have a participant's raw-data, I think it'll be interesting to see where they cluster. I'll send you the raw-data. Thanks in advance my friend.

Note: As always, please exclude HGDP00239, HGDP00237, HGDP00232, and HGDP00258 from the Pashtun samples.

I'd add Yunusbayev Kurds as well; since Behar Iranians are mostly South Iranians, adding Kurds for context would be beneficial for Northern Iranians.

Sein
05-14-2014, 07:17 PM
I'd add Yunusbayev Kurds as well; since Behar Iranians are mostly South Iranians, adding Kurds for context would be beneficial for Northern Iranians.

This is a very good idea. It'll be interesting to see how the tree looks without any migration edges, and how things turn out if everest goes to 5 migration edges.

everest59
05-14-2014, 11:11 PM
Here you go.
https://drive.google.com/file/d/0B3vEDdpZDjUpWWx3LVRrSm1VcU0/edit?usp=sharing

The Italians are actually all Sicilians.
I used around 56k SNPs for this treemix run. It was faster than I thought.
Yes, there is some weirdness. Wish I could see the branches better.

Sein
05-14-2014, 11:43 PM
Here you go.
https://drive.google.com/file/d/0B3vEDdpZDjUpWWx3LVRrSm1VcU0/edit?usp=sharing

The Italians are actually all Sicilians.
I used around 56k SNPs for this treemix run. It was faster than I thought.
Yes, there is some weirdness. Wish I could see the branches better.

Amazing, thanks everest!

So far, the results actually make sense. Nothing weird as far as I can see, especially since we aren't dealing with all of the samples, but just 5 random samples per population, which means variation isn't going to be identical to what we usually see. Interestingly, as always, MA1 clusters with ENA populations. Also, it's pretty cool that the Khetran individual clusters with Baloch/Brahui/Makrani. Pretty unambiguous result, despite the anthropological ambiguity regarding their origins. A good example of how population genetics can be used to answer historical questions.

Could you shoot for 9 migration edges? I think the output will be fascinating.

everest59
05-15-2014, 01:38 AM
Amazing, thanks everest!

So far, the results actually make sense. Nothing weird as far as I can see, especially since we aren't dealing with all of the samples, but just 5 random samples per population, which means variation isn't going to be identical to what we usually see. Interestingly, as always, MA1 clusters with ENA populations. Also, it's pretty cool that the Khetran individual clusters with Baloch/Brahui/Makrani. Pretty unambiguous result, despite the anthropological ambiguity regarding their origins. A good example of how population genetics can be used to answer historical questions.

Could you shoot for 9 migration edges? I think the output will be fascinating.

I think I did migration edge of 5. Program is still running. Seems like it'll take some time. Been more than an hour.

Sein
05-15-2014, 01:39 AM
I think I did migration edge of 5. Program is still running. Seems like it'll take some time. Been more than an hour.

Wow, more than an hour!

I wonder what it'll show?

everest59
05-15-2014, 02:20 AM
Wow, more than an hour!

I wonder what it'll show?

Looks like I'll need to let the program run overnight. Very slow.

Sein
05-15-2014, 02:23 AM
Looks like I'll need to let the program run overnight. Very slow.

Thanks though!

I'm already looking forward to seeing the output tomorrow.

everest59
05-15-2014, 02:29 AM
Thanks though!

I'm already looking forward to seeing the output tomorrow.

Just as I typed that, the program stopped running. Here's the output:

https://drive.google.com/file/d/0B3vEDdpZDjUpbWJxOV9mUVZGTnc/edit?usp=sharing

Pretty strange looking plot.

Sein
05-15-2014, 02:32 AM
Just as I typed that, the program stopped running. Here's the output:

https://drive.google.com/file/d/0B3vEDdpZDjUpbWJxOV9mUVZGTnc/edit?usp=sharing

Pretty strange looking plot.

Wow, that is one unique looking graph.

Could you take a look at the migration strengths?

Humanist
05-15-2014, 02:33 AM
Here you go.
https://drive.google.com/file/d/0B3vEDdpZDjUpWWx3LVRrSm1VcU0/edit?usp=sharing

The Italians are actually all Sicilians.
I used around 56k SNPs for this treemix run. It was faster than I thought.
Yes, there is some weirdness. Wish I could see the branches better.

Interesting how Zeph and I appear to be more Mediterranean here than Caucasian. And, the Mediterranean is more like the European variety (Greek, Sicilian), rather than the Levantine variety (Lebanese).

everest59
05-15-2014, 02:40 AM
Wow, that is one unique looking graph.

Could you take a look at the migration strengths?

Heading to sleep. Let me just copy-paste it:

0.107033 NA NA NA Yoruba:0.00996978 Egyptans:0.00698373
0.134228 NA NA NA Dai:0.00520892 Yoruba:0.00996978
0.1394 NA NA NA Mal'ta:0.0567012 ((Lithuanians:0.00938062,Ukranians:0.00893243):0.00019751,(FIN:0.00549555,.Shaikorth:0.0276036):0.000345364):0.000200209

Sein
05-15-2014, 02:55 AM
Heading to sleep. Let me just copy-paste it:

0.107033 NA NA NA Yoruba:0.00996978 Egyptans:0.00698373
0.134228 NA NA NA Dai:0.00520892 Yoruba:0.00996978
0.1394 NA NA NA Mal'ta:0.0567012 ((Lithuanians:0.00938062,Ukranians:0.00893243):0.00019751,(FIN:0.00549555,.Shaikorth:0.0276036):0.000345364):0.000200209

It'll be interesting to see 11 migration edges on the same data, but I think your computer has had enough. We'll see you tomorrow; it'll be rather illustrative to see it work with 11 migration edges.

ZephyrousMandaru
05-15-2014, 05:08 AM
Interesting how Zeph and I appear to be more Mediterranean here than Caucasian. And, the Mediterranean is more like the European variety (Greek, Sicilian), rather than the Levantine variety (Lebanese).

Yeah, I'm also pulling towards the Yoruba. I think I might be slightly more Basal Eurasian than you, Humanist, because in another one of Everest's plots I was shifted downwards relative to where you were, even though we were part of the same branch.

ZephyrousMandaru
05-15-2014, 05:22 AM
Amazing work Everest, these experiments are very interesting and insightful. Keep it up!

Hanna
05-15-2014, 05:54 AM
Thanks everest59, looks interesting. I hope the thread gets the attention of our Armenian user Caesera, Luristani Baibars, Turkish Sangarius, and Hanna, and that they join in.
I would like to join but what should I do?

Ignis90
05-15-2014, 08:37 AM
Just as I typed that, the program stopped running. Here's the output:

https://drive.google.com/file/d/0B3vEDdpZDjUpbWJxOV9mUVZGTnc/edit?usp=sharing

Pretty strange looking plot.

Yes, it is strange and hard to read, there are some unnecessary populations.

I think North African Jews are redundant because they are very similar to each other and are the product of recent mixes anyway, so there is no need for having three of them in a tree.

The same is true for Mozabites and Moroccans, they are exactly the same. Moroccans, because of SSA-admixed outliers, are making a useless branch in the tree, and Mozabites, because of their inbred nature, can very easily skew any run they are in.
So I suggest you merge the two, keeping only the non-outliers and the most outbred Mozabites. Here is the list of the only useful Maghreb/Berber individuals (11):
Mozabite HGDP01263
Mozabite HGDP01256
Mozabite HGDP01259
Mozabite HGDP01273

Moroccan GSM536954
Moroccan GSM536955
Moroccan GSM536956
Moroccan GSM536958
Moroccan GSM536960
Moroccan GSM536961
Moroccan GSM536963

I think it can only make the tree clearer because the focus is Eurasian variation (and Mal'ta).

everest59
05-15-2014, 09:28 AM
Yes, it is strange and hard to read, there are some unnecessary populations.

I think North African Jews are redundant because they are very similar to each other and are the product of recent mixes anyway, so there is no need for having three of them in a tree.

The same is true for Mozabites and Moroccans, they are exactly the same. Moroccans, because of SSA-admixed outliers, are making a useless branch in the tree, and Mozabites, because of their inbred nature, can very easily skew any run they are in.
So I suggest you merge the two, keeping only the non-outliers and the most outbred Mozabites. Here is the list of the only useful Maghreb/Berber individuals (11):
Mozabite HGDP01263
Mozabite HGDP01256
Mozabite HGDP01259
Mozabite HGDP01273

Moroccan GSM536954
Moroccan GSM536955
Moroccan GSM536956
Moroccan GSM536958
Moroccan GSM536960
Moroccan GSM536961
Moroccan GSM536963

I think it can only make the tree clearer because the focus is Eurasian variation (and Mal'ta).

I wasn't sure about which North African populations to keep. This is helpful.

Sein
05-16-2014, 05:15 AM
Okay, I'm trying to see if I can calculate ANI in the Paniya using Treemix. The populations I have used are :
Georgians
Karitiana
Paniya
MbutiPygmy

I went all the way to 4, but no results I wanted.
I'm gonna add more populations and see what I get.

Raghavan et al. used an "Indian" genome in their TreeMix analysis. According to the authors, the "Indian" individual who provided the sample was/is from Tamil Nadu. Also, it seems unlikely that the sample is from a Brahmin. In the TreeMix figures provided, they infer around 60%-65% West Eurasian ancestry for the South Indian sample, and around 35%-40% ENA admixture. Interestingly, the ENA admixture is from the root of all ENA populations (basically, ASI). So, non-Brahmin South Indians are around 60%-65% West Eurasian according to their TreeMix results (compare to Moorjani et al.'s estimate of 37.3% West Eurasian admixture for the average non-Brahmin South Indian). I think it'll be very interesting if you could try the same procedure on the Paniya. Here are the populations they used:

San
Mbuti Pygmy
Mandenka
Yoruba
Dinka

MA1

Sardinian
French
Avar
Mari
Indian

Papuan
Han
Dai
Karitiana

Denisova

Basically, I was hoping you could try something similar, but with the Paniya, rather than a random South Indian. They describe their procedure (and other specific details) in pp. 66-68, here:
http://www.nature.com/nature/journal/v505/n7481/extref/nature12736-s1.pdf

Now, I'm not sure if you have Mari and Avar samples. And I am definitely sure that you don't have a Pamiri Tajik in your data-set (although, I might be wrong on that. If so, that's good news). I was hoping you could find reasonable replacements, if you don't have Mari, Avar, and the lonely Pamiri Tajik. If you do have these genomes, that's great. But if not, Lezgian would be a nice replacement for Avar. In replacement of Mari, I suppose one HGDP Russian should do. And in replacement of a Pamiri Tajik, I would suggest one of the Behar Iranians. If possible, choose one of the Behar Iranians that lacks any Sub-Saharan African admixture. Also, the Denisova outgroup is very important.

For what it's worth, they used a single individual to represent each population. I recommend the same thing, only one Paniya sample, and just one sample for every other population. Finally, I was hoping you could boost the SNP count, since a total of 17 individuals should be rather fast. Just experiment with different migration edges, although they find that 4 migration edges is best. Thanks in advance my friend.

Edit: As Shaikorth suggested, Chuvash are the best replacement for Mari, assuming that you don't have them.

Also, as Soulblighter has suggested, it seems the Indian genome belongs to the lead author, who is probably a Brahmin.

Shaikorth
05-16-2014, 06:30 AM
Chuvash are the best proxies for Mari, looking at the Raghavan paper's admixture test.

Sein
05-16-2014, 06:47 AM
Chuvash are the best proxies for Mari, looking at the Raghavan paper's admixture test.

Ah, this is a very good idea.

everest59
05-16-2014, 01:33 PM
I am pretty sure I have Mari samples. Btw, treemix relies on clusters. I don't think I'm clustering the samples properly. I have been using family ID as cluster name. If I get a chance in the evening, I will try to cluster the samples using Mclust. And then I'll input everything into treemix. Will do this with no migration edge.
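
For anyone following along, here is a rough sketch of why the cluster label matters. TreeMix reads per-population allele counts (a header line of population names, then one line per SNP with "count1,count2" for each population), so whatever label is used as the cluster name decides which samples get pooled together. The sample names, genotypes and cluster labels below are made up purely for illustration, and this is not the conversion script actually used in this thread.

```python
import gzip

def treemix_lines(genotypes, clusters):
    """Build TreeMix-style input lines.

    genotypes: {sample: [copies of allele 1 per SNP, 0/1/2, or None if missing]}
    clusters:  {sample: cluster name} (e.g. a family ID or an Mclust label)
    """
    pops = sorted(set(clusters.values()))
    n_snps = len(next(iter(genotypes.values())))
    lines = [" ".join(pops)]  # header: population names
    for i in range(n_snps):
        counts = {p: [0, 0] for p in pops}
        for sample, calls in genotypes.items():
            g = calls[i]
            if g is None:  # skip missing calls
                continue
            pop = clusters[sample]
            counts[pop][0] += g       # copies of allele 1
            counts[pop][1] += 2 - g   # copies of allele 2
        lines.append(" ".join(f"{counts[p][0]},{counts[p][1]}" for p in pops))
    return lines

def write_treemix(path, lines):
    """Write the lines as a gzipped text file, which is what TreeMix expects."""
    with gzip.open(path, "wt") as handle:
        handle.write("\n".join(lines) + "\n")

# Hypothetical toy data: two Dai samples pooled into one cluster, one Paniya alone.
genotypes = {"Paniya1": [0, 2, 1], "Dai1": [2, 2, 0], "Dai2": [1, None, 0]}
clusters = {"Paniya1": "Paniya", "Dai1": "Dai", "Dai2": "Dai"}
lines = treemix_lines(genotypes, clusters)
```

If two samples that should be kept apart share a cluster label, their counts are simply summed, which is the thing to watch for when switching from family IDs to Mclust output.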

soulblighter
05-16-2014, 01:43 PM
Raghavan et al. used an "Indian" genome in their TreeMix analysis. According to the authors, the "Indian" individual who provided the sample was/is from Tamil Nadu. Also, it seems unlikely that the sample is from a Brahmin. In the TreeMix figures provided, they infer around 60%-65% West Eurasian ancestry for the South Indian sample, and around 35%-40% ENA admixture. Interestingly, the ENA admixture is from the root of all ENA populations (basically, ASI). So, non-Brahmin South Indians are around 60%-65% West Eurasian according to their TreeMix results (compare to Moorjani et al.'s estimate of 37.3% West Eurasian admixture for the average non-Brahmin South Indian). I think it'll be very interesting if you could try the same procedure on the Paniya.



I think it is possibly the lead author's own sample (Raghavan is very common Iyengar name, which is essentially the same as an Iyer ). Hence it could be a Tamil Brahmin.

In the supplement, it says:


"Saliva sample from the researcher who performed extractions and library preparations
of the ancient samples (MR) was collected in 2013 in Copenhagen, Denmark using an
Oragen Dx collection kit (DNA Genotek Inc., Kanata, Canada). MR and her parents
and both sets of grandparents are originally from South India (Tamil Nadu). "

Considering the lead author is the only female (Maanasa) with a Tamil Name (Raghavan), the possibility is very likely!

Edit: I am betting on it now:

From the contributions section of the paper:


"E.W. and K.E.G. conceived the project. E.W. headed the project. E.W. and M.R. designed the experimental research project setup. S.D. and K.E.G. provided access to the Mal’ta and Afontova Gora-2 samples, and K.E.G. provided archaeological context for the samples. T.W.S. Jr performed AMS dating. E.B. and O.B. (Tajik individual), E.K. and S.L. (Mari and Avar individuals) provided modern DNA extracts for complete genome sequencing. E.K. and S.L. (Kazakh, Kirghiz, Uzbek and Mari individuals), L.P.O. (Selkup individuals), S.A.F. (Even, Dolgan and Yakut individuals) and M.I.V. (Altai individuals) provided access to modern DNA extracts for genotyping. R.V. carried out Illumina chip analysis on modern samples. P.F.C. performed DNA extraction from the Indian individual. M.R. performed the ancient extractions and library constructions on the modern and ancient samples —the latter with input from L.O. M.R. coordinated the sequencing. M.R. and S.Ra. performed mapping of MA-1 and AG-2 data sets with input from L.O. S.Ra., T.S.-P. and S.B. provided super-computing resources, developed the next-generation sequencing pipeline and performed mapping and genotyping for all the modern genomes. M.R. performed DNA damage analysis with input from L.O. M.M. performed the admixture analysis. M.M., E.M., K.T. and R.V. performed the mtDNA analysis. M.M., M.K., S.Ro., T.K., R.V. and R.M. performed the Y-chromosome analysis. A.A. and I.M. performed the autosomal contamination estimates, error rate estimates, D-statistics tests based on sequence reads and ngsAdmix analyses. P.S. performed biological sexing, mtDNA contamination estimates, PCA, TreeMix, MixMapper, D-statistic tests based on allele frequencies, f3-statistics and phenotypic analyses, and analysis of AG-2 using nucleotide misincorporation patterns under the supervision of R.N. and M.J. M.R., P.S. and E.W. wrote the majority of the manuscript with critical input from R.N., M.J., M.M., K.E.G., A.A., I.M. and M.D. M.M., A.A. and I.M. 
contributed equally to this work."

BMG
05-16-2014, 02:18 PM
Soulblighter You Sherlock !!

Sein
05-16-2014, 05:20 PM
I think it is possibly the lead author's own sample (Raghavan is very common Iyengar name, which is essentially the same as an Iyer ). Hence it could be a Tamil Brahmin....Considering the lead author is the only female (Maanasa) with a Tamil Name (Raghavan), the possibility is very likely!...

Very good detective work!

So, it's 65% West Eurasian versus Moorjani et al.'s estimate of 50% West Eurasian (Tamil Brahmins will be in between the Kshatriya and Kanjar used in that paper).

Sein
05-16-2014, 05:41 PM
I am pretty sure I have Mari samples. Btw, treemix relies on clusters. I don't think I'm clustering the samples properly. I have been using family ID as cluster name. If I get a chance in the evening, I will try to cluster the samples using Mclust. And then I'll input everything into treemix. Will do this with no migration edge.

That sounds great!

One thing though, it'll probably group Sapporo with the Pashtuns (Mclust can't distinguish between Afghan Pashtuns, Pakistani Pashtuns, and some Punjabi Jatts), and it might group you and Dr_McNinja together. That's just a function of how it handles clinal variation. So, could you look out for that and not include Sapporo with the Pashtuns, and if you belong to the same cluster as Dr_McNinja, could you still use separate IDs for yourself and Dr_McNinja? I'd really appreciate this.

parasar
05-16-2014, 07:56 PM
I think it is possibly the lead author's own sample (Raghavan is very common Iyengar name, which is essentially the same as an Iyer ). Hence it could be a Tamil Brahmin.

In the supplement, it says:



Considering the lead author is the only female (Maanasa) with a Tamil Name (Raghavan), the possibility is very likely!

Edit: I am betting on it now:

From the contributions section of the paper:

That was my understanding too that the first author was the "Indian" sample, though I could not confirm whether she was a Brahmin (Iyer or Iyengar).

everest59
05-17-2014, 12:31 AM
Well, seems like I did not have to cluster using Mclust. What I did seemed to be correct.

Here's another plot with new participants :

https://drive.google.com/file/d/0B3vEDdpZDjUpTUpBa0gyVUJ6RXc/edit?usp=sharing

Sort of a winding plot.

Humanist
05-17-2014, 12:45 AM
Well, seems like I did not have to cluster using Mclust. What I did seemed to be correct.

Here's another plot with new participants :

Sort of a winding plot.

That is interesting. I have never seen Assyrians (e.g. Zeph and me) appear more similar to Levantines and Africans than to populations like Armenians. Neat result.

ZephyrousMandaru
05-17-2014, 03:45 AM
If I'm not mistaken, TreeMix is formal testing software. If so, this raises an interesting question: which is more representative of our ancestry as Middle Easterners? Formal testing or ADMIXTURE?

Generalissimo
05-17-2014, 04:20 AM
If I'm not mistaken, TreeMix is formal testing software. If so, this raises an interesting question: which is more representative of our ancestry as Middle Easterners? Formal testing or ADMIXTURE?

ADMIXTURE, unless you add admixture edges to the TreeMix results to account for horizontal gene flows which, if not accounted for, can skew the results for many samples, and then the outcomes should resemble ADMIXTURE anyway.

Edit: I meant horizontal not parallel.

Hanna
05-17-2014, 04:50 AM
Could someone help interpret my position in the tree? So am I branched out with the Armenians, Europeans and the Lezgins?

I tried to zoom to get a clearer picture

https://scontent-a-lhr.xx.fbcdn.net/hphotos-frc3/t1.0-9/1480493_294843870691292_7767762191902613312_n.jpg

Sein
05-17-2014, 07:20 AM
If I'm not mistaken, TreeMix is formal testing software. If so, this raises an interesting question: which is more representative of our ancestry as Middle Easterners? Formal testing or ADMIXTURE?

I think Generalissimo has hit the nail right on the head, ADMIXTURE is much more relevant than TreeMix, without "migration edges". Nevertheless, I think that when you start adding "migration edges", TreeMix is much more robust (and "accurate", although "accurate" is a fuzzy term) than ADMIXTURE.

Basically, without "migration edges", the deeper relationships underlying the topology are often rather unclear. To use myself as an example, I'm very close to purely "Derived Eurasians", relative to Near Easterners and Europeans. To put that in perspective, MA1 consistently clusters with the only other purely "Derived Eurasians" around, ENA populations. So, any ENA admixture will bring me closer to MA1's position on the tree, and any ANE admixture will bring me closer to the general position of ENA populations on the tree. That can be quite confusing. But when "migration edges" are added, things become much clearer. Using TreeMix, I get 41% ANE admixture. When it comes to ENA admixture, TreeMix infers around 6%-8% for myself (depending on the "migration edge" used, but it seems to settle at 7%). So, it's probable that my shift towards "Derived Eurasians" is overwhelmingly a function of my ANE-rich ancestry (the specific percentage doesn't really matter, I think that I'm almost at whatever % the Karitiana get for ANE admixture, be it 36% or 44%. Although, that could be wrong), with the minor ENA being a significantly weaker factor.

At the end of the day, I'd prefer TreeMix to ADMIXTURE because it relies on "f-statistics" (tests of treeness), and it explicitly models admixture/reticulation. The admixture/reticulation part of the deal is why this software was developed. With ADMIXTURE, one has to deal with many assumptions that are simply false (no such thing as panmictic populations, especially when it comes to humans), and the program has an inherent tendency to create clusters that just reflect genetic drift. Also, it's often difficult to make sense of ADMIXTURE output, since the clusters (in some cases) don't involve the same dynamics for all populations (some examples would be the "South Asian" cluster in ADMIXTURE, as well as the cluster specific to Northeastern Europe).

Sein
05-17-2014, 07:31 AM
Could someone help interpret my position in the tree? So am I branched out with the Armenians, Europeans and the Lezgins?

I tried to zoom to get a clearer picture

https://scontent-a-lhr.xx.fbcdn.net/hphotos-frc3/t1.0-9/1480493_294843870691292_7767762191902613312_n.jpg

I'm a rather fallible person, but I think you're right about this.

everest59
05-17-2014, 11:25 AM
If I'm not mistaken, TreeMix is formal testing software. If so, this raises an interesting question: which is more representative of our ancestry as Middle Easterners? Formal testing or ADMIXTURE?

Well, I'm a bigger fan of the Admixtools software than anything else. Stuff like f4 ratio estimation, the 3 population test, etc. are formal tests of admixture per Patterson et al. Btw, the treemix package has two executables for f3 and f4. I haven't used them yet.

everest59
05-17-2014, 11:27 AM
I'm a rather fallible person, but I think you're right about this.

I think you and Hanna are both correct.
Unfortunately I can't change the shape of that graph. The software created that winding plot. Yes it is hard to read.

Generalissimo
05-17-2014, 12:27 PM
Admixtools and TreeMix are ultimately more sophisticated and useful tools than ADMIXTURE (although a pain in the ass to install). But any trees without admixture or migration edges are going to hide a lot of finescale stuff, and that's what the question above was referring to.

everest59
05-17-2014, 12:35 PM
Admixtools and TreeMix are ultimately more sophisticated and useful tools than ADMIXTURE (although a pain in the ass to install). But any trees without admixture or migration edges are going to hide a lot of finescale stuff, and that's what the question above was referring to.

Yeah, it was definitely a pain in the ass to install. You need to install all those packages, and the thing is they don't install properly. So you need to make a lot of changes, and it really takes a few days to get the programs running.
As far as Admixtools, what I noticed was, some of the scripts had to be changed a little bit. As a result, I think the rolloff software that I have doesn't produce correct results. Not to mention that rolloff is extremely slow on my laptop, so I don't like to run it. On the other hand, stuff like f4 ratio estimation, 3 pop test, etc. are extremely fast.

everest59
05-17-2014, 12:44 PM
It'll be interesting to see 11 migration edges on the same data, but I think your computer has had enough. We'll see you tomorrow, It'll be rather illustrative to see it work with 11 migration edges.

Plot with 11 migration edges on the data above.
There is some weirdness, as you can see. It is better to keep the populations small. I'll try to do the Raghavan treemix method next.
https://drive.google.com/file/d/0B3vEDdpZDjUpZi1DdzJqVTMyZWc/edit?usp=sharing

And here's the edge strength:
0.447502 NA NA NA ((She:0.00601895,Dai:0.00524009):0.00121608,Daur:0.00514907):0.00485798 (Yoruba:0.00564381,MbutiPygmy:0.0116106):0.0434154
0.0728938 NA NA NA (Yoruba:0.00564381,MbutiPygmy:0.0116106):0.0434154 (Bedouin:0.00599138,(Egyptans:0.00550447,.LifeIsDandy:0.0281497):7.14492e-05):0.000181305
0.106702 NA NA NA (Yoruba:0.00564381,MbutiPygmy:0.0116106):0.0434154 Mozabite:0.0086761
0.394173 NA NA NA Turkmen:0.0179696 (CEU:0.00551589,(((Lithuanians:0.00922644,Ukranians:0.00880122):0,(.Shaikorth:0.0276435,FIN:0.00536687):0.000800027):0.000213448,(GBR:0.00558487,Swedish:0.00558753):0):8.22773e-05):0.00108309
0.108599 NA NA NA (Yoruba:0.00564381,MbutiPygmy:0.0116106):0.0434154 .Ignis:0.0352146
0.199032 NA NA NA (She:0.00601895,Dai:0.00524009):0.00121608 (Uzbek:0.00497214,.Sapporo:0.335816):0.000986895
0.199984 NA NA NA Daur:0.00514907 Tadjik:0.00859372
0.380244 NA NA NA ((She:0.00601895,Dai:0.00524009):0.00121608,Daur:0.00514907):0.00485798 Turkmen:0.0179696
0.0685195 NA NA NA MbutiPygmy:0.0116106 Balochi:0.00546538
0.714677 NA NA NA .Mcninja:0.0279851 .Sapporo:0.335816
0.121929 NA NA NA Hazara:0.00170776 (.Shaikorth:0.0276435,FIN:0.00536687):0.000800027
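
For reading lines like these, a small parser can help. The sketch below assumes the usual layout of the migration-edge lines in TreeMix's treeout file: the edge weight, three columns that are NA when standard errors were not requested, then the subtree below the origin of the edge and the subtree below its destination (both in Newick form). The function name is made up for illustration.

```python
import re

def parse_edge(line):
    """Return (weight, source leaves, destination leaves) for one migration-edge line."""
    fields = line.split()
    weight = float(fields[0])
    source, dest = fields[-2], fields[-1]

    def leaves(subtree):
        # A leaf name is any run of characters directly followed by ':' that
        # contains no Newick punctuation; branch lengths after ')' are skipped.
        return re.findall(r"([^(),:]+):", subtree)

    return weight, leaves(source), leaves(dest)

edge = parse_edge(
    "0.447502 NA NA NA "
    "((She:0.00601895,Dai:0.00524009):0.00121608,Daur:0.00514907):0.00485798 "
    "(Yoruba:0.00564381,MbutiPygmy:0.0116106):0.0434154"
)
weight, src, dst = edge
print(f"{weight:.1%} from the clade of {src} into the clade of {dst}")
```

Read this way, the first edge is a 44.75% weight from the branch ancestral to She, Dai and Daur into the branch ancestral to Yoruba and MbutiPygmy.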

I know that Dienekes got 68% from Sardinian ---> Yoruba or other African populations.

I'm consistently getting high Dai (or any other East Eurasian population) ----> African. Some sort of backmigration or what?

One more comment. I forgot to choose a root (which is often Mbuti Pygmy).

everest59
05-17-2014, 02:20 PM
I just discovered the -global option (it wasn't there on the treemix manual). Same data as above, but with the -global option, and root set at MbutiPygmy. This one looks much better (i.e. the tree is more readable).

https://drive.google.com/file/d/0B3vEDdpZDjUpb0czSTh6TjA5MDQ/edit?usp=sharing
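
For anyone who wants to repeat this kind of run, here is a minimal sketch that builds the TreeMix command lines for a range of migration edges with the root and -global set. It only assembles the commands and does not run anything. The input and output file names are placeholders, and the block size passed to -k is an arbitrary example value; -noss (which turns off the sample size correction) is included as an option because several populations in these runs are single individuals.

```python
import shlex

def treemix_cmd(infile, out_stem, m, root=None, use_global=False, k=None, noss=False):
    """Assemble a treemix command as a list of arguments (nothing is executed)."""
    cmd = ["treemix", "-i", infile, "-o", out_stem, "-m", str(m)]
    if root:
        cmd += ["-root", root]   # population used to root the tree
    if use_global:
        cmd.append("-global")    # extra round of global rearrangements
    if k:
        cmd += ["-k", str(k)]    # group SNPs into blocks of k
    if noss:
        cmd.append("-noss")      # no sample size correction
    return cmd

# Placeholder file names; one command per number of migration edges.
commands = [
    treemix_cmd("input.treemix.gz", f"run_m{m}", m, root="MbutiPygmy", use_global=True)
    for m in range(0, 5)
]
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping one output stem per value of -m makes it easier to compare the plots side by side afterwards.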

everest59
05-17-2014, 02:51 PM
I need some advice from everybody. I would like to remove some populations from the data above for better results. I'm removing Finns. What other populations?

Humanist
05-17-2014, 03:14 PM
I need some advice from everybody. I would like to remove some populations from the data above for better results. I'm removing Finns. What other populations?

I would replace Bedouin with Yemen Jews. And I would replace Lebanese with Druze.

Hanna
05-17-2014, 03:19 PM
Could you add Turks?

everest59
05-17-2014, 03:29 PM
Could you add Turks?

Sure. I thought I had turks in there.

Hanna
05-17-2014, 03:34 PM
Sure. I thought I had turks in there.

There is Turkmen but not Anatolian Turks.

Shaikorth
05-17-2014, 03:35 PM
You could add Mordovians if you have the samples. It seems that HGDP Russians behave like a mix of Mordovians and western Russians, and it might be better to have samples from the end of a cline.

Ignis90
05-17-2014, 04:22 PM
People, people! He would like to REMOVE populations.


I need some advice from everybody. I would like to remove some populations from the data above for better results. I'm removing Finns. What other populations?

As I wrote in a previous comment (http://www.anthrogenica.com/showthread.php?2551-Experimentation-with-TreeMix-Software&p=40115&viewfull=1#post40115), there is no need for both Moroccans and Mozabites.
You can also remove Moroccan Jews altogether; they are well known to be recently admixed (Levantine+European+Berber). Unless you want to see if TreeMix picks up these signals.


I also think Anthrogenica's members should be included in a separate run.

ZephyrousMandaru
05-17-2014, 04:34 PM
I would replace Bedouin with Yemen Jews. And I would replace Lebanese with Druze.

Humanist, do you have the raw data for the two Iraqi Mandaeans? If so, I think those two should be added.

Hanna
05-17-2014, 04:58 PM
Everest:
Are you interested in any more West Asian data?

icebreaker
05-17-2014, 06:48 PM
There is Turkmen but not Anatolian Turks.

I'm Turkish.

@everest59

Could you or anyone else help me interpret these results? I feel like I belong nowhere. Thanks.

Hanna
05-17-2014, 07:22 PM
I'm Turkish.

@everest59

Could you or anyone else help me interpret these results? I feel like I belong nowhere. Thanks.

I'm talking about Turkish reference population.

Sein
05-17-2014, 08:22 PM
I just discovered the -global option (it wasn't there on the treemix manual). Same data as above, but with the -global option, and root set at MbutiPygmy. This one looks much better (i.e. the tree is more readable).

https://drive.google.com/file/d/0B3vEDdpZDjUpb0czSTh6TjA5MDQ/edit?usp=sharing

This looks great! I like how Sapporo and Dr_McNinja are on the same branch, rather accurate.

Humanist
05-17-2014, 09:37 PM
This looks great! I like how Sapporo and Dr_McNinja are on the same branch, rather accurate.

There are some atypical results for West Asians. Note the position of Kurds. And the position of ZephyrousMandaru and I. Zeph appears to be located on a branch with Chechens.

Hanna
05-18-2014, 02:12 AM
There are some atypical results for West Asians. Note the position of Kurds. And the position of ZephyrousMandaru and I. Zeph appears to be located on a branch with Chechens.
I feel like it is inaccurate in my case too. That is why I asked for Turks to be added.

everest59
05-18-2014, 02:13 AM
I'm Turkish.

@everest59

Could you or anyone else help me interpret these results? I feel like I belong nowhere. Thanks.

It seems like you're branching out from other West Asian populations but forming a cluster of your own.
You look better here in the sense that you're right next to Hanna, with Lezgins close by.

https://drive.google.com/file/d/0B3vEDdpZDjUpZi1DdzJqVTMyZWc/edit?pli=1

everest59
05-18-2014, 02:13 AM
Everest:
Are you interested in any more West Asian data?

Sure, email them to me.

everest59
05-18-2014, 02:16 AM
I feel like it is inaccurate in my case too. That is why I asked for Turks to be added.

Well, these may not be clusters we are seeing. Perhaps the arrangement we see is due to migration events.
But like you said, it may not be accurate.
Hard to tell.

Chad Rohlfsen
05-18-2014, 02:32 AM
everest,

Are you interested in data from someone who is predominately Northern European? I am curious where I fit in with all the plots. I can email you my 23andme data. Thanks!

everest59
05-18-2014, 02:37 AM
everest,

Are you interested in data from someone who is predominately Northern European? I am curious where I fit in with all the plots. I can email you my 23andme data. Thanks!

Sent you a pm.

I'm accepting data from everybody right now.

Hanna
05-18-2014, 02:41 AM
Sure, email them to me.

Do you use data from FTDNA?

Chad Rohlfsen
05-18-2014, 02:42 AM
Okay, I sent my data. Looking forward to seeing the results! Thanks!

Hanna
05-18-2014, 02:46 AM
Well, these may not be clusters we are seeing. Perhaps the arrangement we see is due to migration events.
But like you said, it may not be accurate.
Hard to tell.

I don't really know how this works, but I would assume the Kurds to be closer to Iranians, and in my case I tend to be in between Turkish and Armenian due to my mother being Armenian like.

everest59
05-18-2014, 02:50 AM
Do you use data from FTDNA?

Yeah, I can use FTDNA data.

everest59
05-18-2014, 02:58 AM
I'm gonna post a quick PCA plot:
https://drive.google.com/file/d/0B3vEDdpZDjUpYXpyMV96WFB6aWs/edit?usp=sharing

Just an experiment. I don't know if Shaikorth for example got the right result.

If the font is too small, just download the pdf file.

Chad Rohlfsen
05-18-2014, 03:13 AM
everest,

Thanks! Any chance you could post my results using for instance, La Brana, Mal'ta, ENA, and something representative of EEF? Or just whatever you think seems best.

everest59
05-18-2014, 03:16 AM
everest,

Thanks! Any chance you could post my results using for instance, La Brana, Mal'ta, ENA, and something representative of EEF? Or just whatever you think seems best.

Sure. I'll do that tomorrow. It's 11 pm here.

Chad Rohlfsen
05-18-2014, 03:17 AM
Sounds good! Thanks!

ZephyrousMandaru
05-18-2014, 03:49 AM
I think the reason these experiments seem to yield wildly different results each time is that there aren't enough samples representing each population globally. So I think these plots may suffer from low sampling density.

Sein
05-18-2014, 04:39 AM
I'm gonna post a quick PCA plot:
https://drive.google.com/file/d/0B3vEDdpZDjUpYXpyMV96WFB6aWs/edit?usp=sharing

Just an experiment. I don't know if Shaikorth for example got the right result.

If the font is too small, just download the pdf file.

Thanks everest!

The output is extremely interesting. Shaikorth's result is very reasonable, since PC2 is driven by both ANE and ENA admixture (PC2 can't distinguish between ENA and ANE). So, I think this is an indication that Shaikorth has a higher level of ANE admixture in comparison to other Europeans. It might also be evidence of greater ENA admixture for Shaikorth in comparison to other Europeans, since both have the same effect here. Only Shaikorth could tell us which is more likely (or, if it's a combination of both). If he doesn't have any real ENA admixture for various DIY calculators, I think this boils down to greater ANE admixture for him.

Shaikorth
05-18-2014, 05:36 AM
Thanks everest!

The output is extremely interesting. Shaikorth's result is very reasonable, since PC2 is driven by both ANE and ENA admixture (PC2 can't distinguish between ENA and ANE). So, I think this is an indication that Shaikorth has a higher level of ANE admixture in comparison to other Europeans. It might also be evidence of greater ENA admixture for Shaikorth in comparison to other Europeans, since both have the same effect here. Only Shaikorth could tell us which is more likely (or, if it's a combination of both). If he doesn't have any real ENA admixture for various DIY calculators, I think this boils down to greater ANE admixture for him.


There could well be both, but I'm still waiting for a tool that can separate them in a "conclusive" way.

Everest, if you have time you could do an approximate f3 test for (Dai, HGDP Russians; Shaikorth)? If the HGDP sample is ENA shifted compared to me it's more likely that excess ANE is affecting the position.

everest59
05-18-2014, 12:50 PM
Okay, one step at a time. This is a pca plot with La Brana and Mal'ta with European populations + Chad Rohlfsen and Shaikorth:

https://drive.google.com/file/d/0B3vEDdpZDjUpWk5PdTFXcFNHekE/edit?usp=sharing

I guess Sardinians are the best replacements for Stuttgart.
I included some Dai and just one Karitiana, as Karitiana seem to cause cluster problems.

everest59
05-18-2014, 12:51 PM
I think the reason these experiments seem to yield wildly different results each time is that there aren't enough samples representing each population globally. So I think these plots may suffer from low sampling density.

Well, I can try to include every single sample I have for those populations, but my computer gets very slow.

Shaikorth
05-18-2014, 04:08 PM
That plot is quite good, it gives a visualization of Karitiana's ANE-mixed nature which is easily obscured if they are allowed to define a dimension.

Chad Rohlfsen
05-18-2014, 04:13 PM
Awesome everest! What do my admixture percentages look like using these components?

everest59
05-18-2014, 08:36 PM
There could well be both, but I'm still waiting for a tool that can separate them in a "conclusive" way.

Everest, if you have time you could do an approximate f3 test for (Dai, HGDP Russians; Shaikorth)? If the HGDP sample is ENA shifted compared to me it's more likely that excess ANE is affecting the position.

The f3 score is positive as you can see below:
Source 1 Source 2 Target f_3 std. err Z SNPs
result: Dai Russian .Shaikorth 0.007510 0.004989 1.505 15029

So an f3 score of 0.00751 with a std. error of 0.004989.
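
As a quick sanity check on how to read that line, the Z column is just the f3 estimate divided by its standard error. The sketch below redoes that arithmetic and also shows a bare-bones f3 computed from allele frequencies. The frequencies in the toy example are invented, and the naive f3 here leaves out the corrections that the real qp3Pop applies, so treat it as an illustration of the idea only.

```python
def f3(target, src1, src2):
    """Naive f3(target; src1, src2): mean of (t - a) * (t - b) over SNPs."""
    return sum((t - a) * (t - b) for t, a, b in zip(target, src1, src2)) / len(target)

def z_score(estimate, std_err):
    """Z is the estimate measured in units of its standard error."""
    return estimate / std_err

# Values from the qp3Pop line above: Dai, Russian; .Shaikorth
z = z_score(0.007510, 0.004989)
print(round(z, 3))  # matches the Z column of the output

# Invented toy frequencies: a target lying between its two sources gives a negative f3.
toy = f3(target=[0.5, 0.5], src1=[0.0, 0.2], src2=[1.0, 0.8])
```

Only a clearly negative f3 (a Z well below zero, commonly below about -3) counts as evidence that the target is a mix of the two sources. A positive value like the one here does not rule admixture out; it just fails to show it.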

Here's calculation of ANE ancestry for you and Chad:

number of blocks for block jackknife: 523
alpha std. err Z (null=0)
result: Papuan Karitiana .Shaikorth Sardinian : Papuan Karitiana Mal'ta Sardinian 0.260037 0.115906 2.244
result: Papuan Karitiana .Chad_Rohlfsen Sardinian : Papuan Karitiana Mal'ta Sardinian 0.121088 0.106495 1.137
## end of run

So, 26% in your case. 12 % in case of Chad.
I'll try to quantify your ENA by using Alder (although Alder does not accept just one individual, I have an idea).

Now, the standard error is high, but we're talking about only 17k SNPs.
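
To make the f4 ratio and its standard error a bit more concrete, here is a rough sketch. It reads the output line as alpha = f4(Papuan, Karitiana; X, Sardinian) / f4(Papuan, Karitiana; Mal'ta, Sardinian), and uses a plain delete-one-block jackknife with equal-sized blocks for the standard error. Admixtools uses a weighted block jackknife over genomic blocks, so this is a simplified stand-in, and the toy allele frequencies are invented.

```python
import math

def f4(a, b, c, d):
    """Naive f4(A, B; C, D): mean of (a - b) * (c - d) over SNPs."""
    return sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)) / len(a)

def f4_ratio(a, o, x, c, b):
    """alpha = f4(A, O; X, C) / f4(A, O; B, C)."""
    return f4(a, o, x, c) / f4(a, o, b, c)

def jackknife_se(estimator, n_snps, n_blocks):
    """Unweighted delete-one-block jackknife standard error.

    estimator takes a list of SNP indices to keep and returns the statistic.
    SNPs beyond n_blocks * block size are never dropped.
    """
    size = n_snps // n_blocks
    thetas = []
    for i in range(n_blocks):
        lo, hi = i * size, (i + 1) * size
        keep = [j for j in range(n_snps) if not lo <= j < hi]
        thetas.append(estimator(keep))
    mean = sum(thetas) / n_blocks
    return math.sqrt((n_blocks - 1) / n_blocks * sum((t - mean) ** 2 for t in thetas))

# Invented toy frequencies at two SNPs, argument order as in the output line.
papuan, karitiana = [1.0, 0.0], [0.0, 1.0]
target, sardinian, malta = [0.5, 0.4], [0.2, 0.6], [0.8, 0.2]
alpha = f4_ratio(papuan, karitiana, target, sardinian, malta)

# Approximate 95% interval for the .Shaikorth estimate reported above.
est, se = 0.260037, 0.115906
ci = (est - 1.96 * se, est + 1.96 * se)
```

With those numbers the interval runs from roughly 3% to 49%, which is another way of saying that 17k SNPs cannot pin the ANE estimate down very tightly.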


Awesome everest! What do my admixture percentages look like using these components?

I calculated ANE above for you, which is at 12%. I don't think I'll be able to calculate WHG until Lazaridis et al. release their dataset. La Brana has some EEF admixture.

Shaikorth
05-18-2014, 09:09 PM
Thanks again. Lazaridis paper notes that relations to these components can also be reduced by Basal Eurasian/African, or increased by lack of them, essentially EEF carries those.

Could you do a f3 test to check if Lithuanians are Basal Eurasian and EEF-shifted in comparison to me? Something like Moroccans, Lithuanians; Shaikorth and Sardinians, Lithuanians; Shaikorth should do.

everest59
05-18-2014, 09:17 PM
Thanks again. Lazaridis paper notes that relations to these components can also be reduced by Basal Eurasian/African, or increased by lack of them, essentially EEF carries those.

Could you do a f3 test to check if Lithuanians are Basal Eurasian and EEF-shifted in comparison to me? Something like Moroccans, Lithuanians; Shaikorth and Sardinians, Lithuanians; Shaikorth should do.

My current dataset does not have Moroccans, but it has Bedouins. See below:
result: Bedouin Lithuanians .Shaikorth 0.006164 0.004902 1.257 16020
result: Sardinian Lithuanians .Shaikorth 0.005446 0.004900 1.111 15937
##end of qp3Pop

It looks like you are less EEF shifted.

BTW, f3 (FIN, Dai; .Shaikorth) is very close to zero. Probably means that your ENA is close to Finns, but it was still on the positive side (around 0.01 f3 score).

Shaikorth
05-18-2014, 09:26 PM
My current dataset does not have Moroccans, but it has Bedouins. See below:
result: Bedouin Lithuanians .Shaikorth 0.006164 0.004902 1.257 16020
result: Sardinian Lithuanians .Shaikorth 0.005446 0.004900 1.111 15937
##end of qp3Pop

It looks like you are less EEF shifted.

BTW, f3 (FIN, Dai; .Shaikorth) is very close to zero. Probably means that your ENA is close to Finns, but it was still on the positive side (around 0.01 f3 score).

Looking plausible. Can you do a couple final tests, similar to previous but Lithuanians replaced with Russians. If they turn out EEF-shifted compared to me, it's possibly reducing their ENA shift.

everest59
05-18-2014, 09:32 PM
Looking plausible. Can you do a couple final tests, similar to previous but Lithuanians replaced with Russians. If they turn out EEF-shifted compared to me, it's possibly reducing their ENA shift.

result: Bedouin Russian .Shaikorth 0.007840 0.004870 1.610 16638
result: Sardinian Russian .Shaikorth 0.006030 0.004838 1.246 16474
##end of qp3Pop

f3 score went up.

Shaikorth
05-18-2014, 09:36 PM
result: Bedouin Russian .Shaikorth 0.007840 0.004870 1.610 16638
result: Sardinian Russian .Shaikorth 0.006030 0.004838 1.246 16474
##end of qp3Pop

f3 score went up.

Figured that might happen, given the relatively high West Asian score Kargopol Russians get in K13.

The supposed African in Basal Eurasian is somewhat of an open question. It might be interesting to see what those tests (with both Lithuanians and Russians) would show if Bedouin/Sardinian were replaced with Yoruba.

everest59
05-18-2014, 10:40 PM
Figured that might happen, given the relatively high West Asian score Kargopol Russians get in K13.

The supposed African in Basal Eurasian is somewhat of an open question. It might be interesting to see what those tests (with both Lithuanians and Russians) would show if Bedouin/Sardinian were replaced with Yoruba.
There you go:

result: Yoruba Russian .Shaikorth 0.006053 0.005104 1.186 15993
result: Yoruba Lithuanians .Shaikorth 0.001954 0.005184 0.377 15419
##end of qp3Pop

Shaikorth
05-18-2014, 10:51 PM
There you go:

result: Yoruba Russian .Shaikorth 0.006053 0.005104 1.186 15993
result: Yoruba Lithuanians .Shaikorth 0.001954 0.005184 0.377 15419
##end of qp3Pop

Some correlation for Russians, not that much for Lithuanians. Dunno if Mandenka or a larger SNP set gives different results.

Thanks a lot for doing this again. Let's end with the eastern affinities of WHG's, how would Dai, LaBrana; Shaikorth show?

everest59
05-18-2014, 10:54 PM
Some correlation for Russians, not that much for Lithuanians. Dunno if Mandenka or a larger SNP set gives different results.

Thanks a lot for doing this again. Let's end with the eastern affinities of WHG's, how would Dai, LaBrana; Shaikorth show?

Here it is:

result: Dai LaBrana .Shaikorth 0.003731 0.006208 0.601 9297

everest59
05-18-2014, 11:24 PM
Let me show you guys something. Lazaridis calculated basal admixture for stuttgart with the following formula:
f4(Stuttgart, Loschbour; Onge, MA1)/f4(Mbuti, MA1; Onge, Loschbour)

However, due to the fact that LaBrana has EEF admixture, we don't get the correct results:

result: Bedouin LaBrana Papuan Mal'ta : MbutiPygmy Mal'ta Papuan LaBrana 0.870667 0.008772 99.259
result: .Shaikorth LaBrana Papuan Mal'ta : MbutiPygmy Mal'ta Papuan LaBrana 0.810572 0.012932 62.677
result: .Chad_Rohlfsen LaBrana Papuan Mal'ta : MbutiPygmy Mal'ta Papuan LaBrana 0.821916 0.012871 63.859
## end of run


That Loschbour sample is necessary to calculate basal eurasian correctly.

Sein
05-19-2014, 01:39 AM
everest, I was hoping you could try a PCA with just MA1 and South Asians, with the addition of the Dai? I would truly appreciate this. As always, no rush whatsoever, only when you find the time. Thanks in advance.

Shaikorth
05-19-2014, 08:35 AM
Let me show you guys something. Lazaridis calculated basal admixture for stuttgart with the following formula:
f4(Stuttgart, Loschbour; Onge, MA1)/f4(Mbuti, MA1; Onge, Loschbour)


I might have missed it reading the supplement, but did they actually test that formula on any modern European or Middle Eastern sample?

everest59
05-19-2014, 09:27 AM
I might have missed it reading the supplement, but did they actually test that formula on any modern European or Middle Eastern sample?

Actually they did not. They only tested it on Stuttgart. Stuttgart we know has some WHG. So the problem may come from ANE admixture, but there is no way of knowing until I can test that formula.

parasar
05-20-2014, 12:05 AM
Figured that might happen, given the relatively high West Asian score Kargopol Russians get in K13.

The supposed African in Basal Eurasian is somewhat of an open question. It might be interesting to see what those tests (with both Lithuanians and Russians) would show if Bedouin/Sardinian were replaced with Yoruba.


Let me show you guys something. Lazaridis calculated basal admixture for Stuttgart with the following formula:
f4(Stuttgart, Loschbour; Onge, MA1) / f4(Mbuti, MA1; Onge, Loschbour)

However, due to the fact that LaBrana has EEF admixture, we don't get the correct results:

result: Bedouin LaBrana Papuan Mal'ta : MbutiPygmy Mal'ta Papuan LaBrana 0.870667 0.008772 99.259
result: .Shaikorth LaBrana Papuan Mal'ta : MbutiPygmy Mal'ta Papuan LaBrana 0.810572 0.012932 62.677
result: .Chad_Rohlfsen LaBrana Papuan Mal'ta : MbutiPygmy Mal'ta Papuan LaBrana 0.821916 0.012871 63.859
## end of run


That Loschbour sample is necessary to calculate Basal Eurasian correctly.

How about if you remove the Africans? Basal Eurasian may be an artifact of gene flow into Africans.



It is that the proposed movement out of Africa through the Levantine corridor around 40 kya did not occur or has no maternal continuity to the present day. This is because: first, in that period the Eurasian haplogroups M and N had already evolved and spread at continental level in Eurasia, and, second, there is no evidence of any L-derived clade outside Africa with a similar coalescence age to that proposed movement. Under this perspective, the late Pleistocene human skull from Hofmeyr, South Africa, considered as a sub-Saharan African predecessor of the Upper Paleolithic Eurasians [51], should be better considered as the southernmost vestige of the Homo sapiens return to Africa.
http://www.biomedcentral.com/content/pdf/1471-2148-14-109.pdf

Sein
05-20-2014, 04:36 AM
everest, I was hoping you could try a PCA with just MA1 and South Asians, with the addition of the Dai? I would truly appreciate this. As always, no rush whatsoever, only when you find the time. Thanks in advance.

You know, I think the addition of HGDP Bedouin, Behar Iranians, and Behar Georgians would be very interesting. Also, She and Han. So, every South Asian population in your data-set, Bedouins, Iranians, Georgians, Dai, She, Han, and MA1. Thanks in advance.

Arbogan
05-20-2014, 07:00 AM
You know, I think the addition of HGDP Bedouin, Behar Iranians, and Behar Georgians would be very interesting. Also, She and Han. So, every South Asian population in your data-set, Bedouins, Iranians, Georgians, Dai, She, Han, and MA1. Thanks in advance.

I'd love to be included. Is there any possibility that I can contribute?

In addition, 5 of the Behar (or Metspalu, since he was responsible for the sampling) Iranians will probably drift towards Bedouins, because they have recent Gulf Arab and African ancestry.

Namely:
GSM536746
GSM536751
GSM536752
GSM536754
GSM536758

Don't know whether I'd use them as examples of overall average Iranians.

everest59
05-20-2014, 09:24 AM
Sure, send me your data. I'm gonna pm you my email address.
I got a bunch of data recently. I'll need to create a new dataset in the near future.

everest59
05-20-2014, 09:28 AM
How about if you remove the Africans? Basal Eurasian may be an artifact of gene flow into Africans.


http://www.biomedcentral.com/content/pdf/1471-2148-14-109.pdf

Well, they are using MbutiPygmy to isolate a certain pathway: if I remember correctly, the non-African pathway where Basal Eurasian resides. So I don't know if it can be calculated without MbutiPygmy.

everest59
05-20-2014, 09:31 AM
You know, I think the addition of HGDP Bedouin, Behar Iranians, and Behar Georgians would be very interesting. Also, She and Han. So, every South Asian population in your data-set, Bedouins, Iranians, Georgians, Dai, She, Han, and MA1. Thanks in advance.

Alright. I'll try to do it tonight. Took a break yesterday.

parasar
05-20-2014, 05:58 PM
Well, they are using MbutiPygmy to isolate a certain pathway: if I remember correctly, the non-African pathway where Basal Eurasian resides. So I don't know if it can be calculated without MbutiPygmy.

They look to be unsure about basal Eurasian:

European hunter-gatherers and especially Ancient North Eurasians deviate towards present-day Eastern non-African populations. This pattern is consistent with our model (Fig. 2a) according to which hunter-gatherers share common genetic drift with Eastern non-Africans that is only partially shared by Early European Farmers who trace part of their ancestry to a “Basal Eurasian” population that diverged prior to the split of European hunter-gatherers from Eastern non-Africans


Because they are assuming an African origin, under that scenario the Basal Eurasian would be the earliest split OoA.



Equally archeologically plausible is basal Eurasian admixture in Onge (Fig. S14.2f), which would partially explain #1. The Onge are a southern Eurasian population, and a scenario of a “southern route” peopling of Eurasia (of which the Onge are plausible partial descendants) might have resulted in them having deep Eurasian ancestry, similar to a model proposed for the early peopling of Australia by anatomically modern humans [15]. Such ancestry would cause them to share less genetic drift with West Eurasians than with the Karitiana




The Near East was the staging point for the peopling of Eurasia by anatomically modern humans. As a result, it is entirely plausible that it harbored deep Eurasian ancestry which did not participate in the initial peopling of Eurasia, but was much later brought into Europe by Near Eastern farmers. [to me, this does not make sense]

... Finally, it could reflect continuing more recent gene flows between the Near East and nearby Africa after the initial out-of-Africa dispersal, perhaps associated with the spread of Y-chromosome haplogroup E subclades from eastern Africa [11, 12] into the Near East, which appeared at least 7,000 years ago in Neolithic Europe [13], or the detection of African skeletal morphology in Epipaleolithic Natufians from Israel [14].


But what if that Y-E is a gene flow into Africa from the Near East? The Near East in this scenario would have some of the youngest populations outside Africa, but older than the Near Eastern component in Africa. This would entail a movement from SE Asia to South Asia to Central Asia and Upper Asia and Europe to Near East to North Africa to Sub-Saharan Africa (Y-E). I don't see that scenario covered in any of the Lazaridis models.

From the mtDNA side it has been proposed:

Outside Africa, the earliest and fastest growth is inferred in Southern Asia ∼52 kya, followed by a succession of growth phases in Northern and Central Asia (∼49 kya), Australia (∼48 kya), Europe (∼42 kya), the Middle East and North Africa (∼40 kya), New Guinea (∼39 kya), the Americas (∼18 kya), and a second expansion in Europe (∼10–15 kya) ... Perhaps most striking, however, is the magnitude of the Southern Asian growth phase, which implies that between approximately 45 and 20 kya over half of the global human population lived on the Indian subcontinent and what is now the Thai and Malay Peninsulas. Our population size estimates indicate that the proportion of people living in the region peaked at over 60% approximately 38 kya. This figure may even be an underestimate ...

By ∼35 kya global growth outside Africa begins to slow, most markedly in Southern Asia and Australia, whereas in Europe estimated effective population size actually declines slightly. This leveling of growth may reflect the impact of Malthusian environmental constraints and, particularly in Europe, increasing glaciation during this time. Climate also seems the most likely explanation for a second major growth phase inferred in Europe beginning 10–15 kya (fig. 1e), shortly after the end of the last ice age ∼15 kya.

http://mbe.oxfordjournals.org/content/25/2/468/F2.large.jpg
http://mbe.oxfordjournals.org/content/25/2/468.full



In terms of autosomal DNA, a similar scenario is indicated by Metspalu, with the European and Near Eastern specific components being far younger than the South Asian specific one.

Sein
05-20-2014, 06:14 PM
Sure, send me your data. I'm gonna pm you my email address.
I got a bunch of data recently. I'll need to create a new dataset in the near future.

Could you include Denisova in your new data-set, if you can get the files?

everest59
05-20-2014, 10:44 PM
Could you include Denisova in your new data-set, if you can get the files?
Anywhere I can download it from?

Anyways, here's a plot that includes Kashmiri Pandits.

https://drive.google.com/file/d/0B3vEDdpZDjUpNnhkdGlvbTVFU2M/edit?usp=sharing

No Europeans here. Ma-1 clustering very close to South Asians, with one Burusho close by. Only 6k snp's used, since the Moorjani et al data is very different from other data.

If you don't see yourself, just search.

Not as accurate as I'd like...

Sein
05-20-2014, 11:22 PM
Anywhere I can download it from?

Anyways, here's a plot that includes Kashmiri Pandits.

https://drive.google.com/file/d/0B3vEDdpZDjUpNnhkdGlvbTVFU2M/edit?usp=sharing

No Europeans here. Ma-1 clustering very close to South Asians, with one Burusho close by. Only 6k snp's used, since the Moorjani et al data is very different from other data.

If you don't see yourself, just search.

Not as accurate as I'd like...

David posted this link:
http://www.cbs.dtu.dk/suppl/malta

He mentioned finding them in the unpublished variants folder. If you can't find them in there, David stated that he had them in VCF. He said he'd put them in PLINK bed files. He probably has, you could ask him.

Thanks, this is very interesting! Just to make sure, more SNPs would be better. I was wondering if you could remove Moorjani et al. data, but keep all other samples, to increase the SNP count.

Sein
05-21-2014, 06:31 AM
I should have mentioned this before, the Burusho results in relation to MA1 are rather interesting. Although eigenvector 1 involves a mix of ANE and ENA (difficult to disentangle the two), and although the Burusho have some very serious ENA admixture (if the HGDP Pashtuns are 5% ENA, and I don't see any reason to doubt TreeMix on this matter, the Burusho are probably around 15%-25% ENA), there is a precedent for the Burusho having the highest level of ANE admixture in South Asia (and since South Asians have the highest level of ANE admixture in Eurasia, that's a big deal). Since the Burusho aren't Indo-European speakers, this is pretty important. And even if they don't have the highest level of ANE in South Asia, I'm sure that they are very similar to Pashtuns when it comes to ANE admixture, probably around 35%-45% ANE. Outside of South Asia but still within Eurasia, ANE peaks among the Lezgians and Chechens, and both of these peoples aren't Indo-European speakers. David found 25% ANE admixture in South India (referring to the most accurate test he tried), which is still greater than the highest ANE % found in Europe. Again, we're not dealing with Indo-Europeans.

Taking all of this into account, and recognizing the fact that substantial ANE ancestry is an important part of the genetic heritage of all South Asians, I think ANE admixture doesn't involve Indo-Aryans for South Asians, and may not involve Indo-Europeans in general (although it might; to be completely honest, I'm not really sure). Instead, there may have been ANE hunter gatherer populations across West Asia, and into northern South Asia. These populations probably mixed with ASI hunter gatherer populations, as that's simply how the world works. After the agricultural revolution, these ANE-ASI mixed populations might have been culturally assimilated by agricultural peoples whose ultimate roots lay in Southwestern Asia, and who possessed an ancestry rich in "Basal Eurasian" admixture. There must have been substantial gene-flow between the ANE-ASI mixed populations and these farming groups. Later incursions into South Asia from the steppe reinforced/augmented an ancient and already pervasive element of South Asian genetic ancestry. To be frank, I just can't imagine a relatively small number of people from the steppe having such a huge impact on South Asian genetics. Mobile cultures with expansionist tendencies are almost always thoroughly absorbed by the settled/sophisticated cultures they encounter. Simply put, nomadic raiding peoples are replaced by the people they "conquer"; their genetic contribution to the "conquered" tends to be rather meager, due to population size. Not to mention the fact that the settled people who lived in Punjab/Sindh before the Indo-Aryans looked very similar to living Punjabis and Sindhis.

My idea concerning ANE populations in West Asia+northern South Asia being replaced by farmers isn't original. It comes from the supplementary materials for Lazaridis et al. They note the puzzling nature of an ANE peak among speakers of North Caucasian languages (they haven't even tackled South Asia, but when they do, I'm sure they'll verify the huge presence of ANE admixture throughout South Asia), and the possibility of ANE populations throughout ancient West Asia is noted.

Not sure if this matters, but many scholars have long argued for a link between North Caucasian languages and Burushaski (the language of the Burusho). In turn, there have been arguments that North Caucasian languages+Burushaski are related to Yeniseian languages, and perhaps to Na Dene languages as well. The most ANE-like of all ENA populations are the Ket, speakers of a Yeniseian language. Perhaps these languages, assuming they are actually related, are descended from the language families prevalent among the ANE? Maybe Burushaski is related to some of the rather ancient, long extinct languages of northern South Asia and West Asia? I'm just thinking out loud here, and engaging in extremely unsubstantiated speculation. I'm no expert on these issues. Still, I just thought I should point this stuff out.

Edit: I still find it cool that if MA1 was alive today, he'd be eligible for participation in Harappa Ancestry Project. :biggrin1:

Even when there is a robust South Asian-specific dimension in PCA, he clusters with South Asians. Most of his "eastern" affinity seems to be specific to South Asia. In unsupervised ADMIXTURE runs, he scores very substantial percentages of the Baloch-specific and South Indian-specific components. If one counts the component modal in South Central Asia as "South Asian", and if one counts the component modal in tribal+"Scheduled Caste" South Indians as "South Asian", the majority of MA1's ancestry is "South Asian", followed by "Northern European", and "Native American" (again, referring to unsupervised ADMIXTURE runs).

everest59
05-21-2014, 11:32 AM
I just tried Plink's mds-plot command. What I did was, I used 20 dimensions to create the following dendrogram. 60k snp's used. Google is not displaying the plot properly, so you will need to download it for sure.
https://drive.google.com/file/d/0B3vEDdpZDjUpU3A2bC1DcGFXZG8/edit?usp=sharing

I did something similar a few months ago. The only difference is, this was created using Plink as opposed to flashpca.
Also, let me know if you need any ID's.
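
For anyone curious how a dendrogram can be built from MDS coordinates, here is a rough sketch in plain Python. It is not necessarily how the plot above was made. It assumes the coordinates have already been loaded into a dict, for example from the `.mds` file PLINK writes when run with `--cluster --mds-plot 20`. The sample labels and numbers are invented, and average linkage is just one possible choice of clustering rule.

```python
import math
from itertools import combinations

def euclid(u, v):
    """Euclidean distance between two coordinate lists of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def average_linkage(coords):
    """Agglomerative clustering; returns the tree as nested tuples of labels.

    coords: dict mapping sample label -> list of MDS coordinates.
    """
    # Each cluster is a pair: (tree built so far, list of member labels).
    clusters = [(label, [label]) for label in coords]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            members_i, members_j = clusters[i][1], clusters[j][1]
            # Mean pairwise distance between the two clusters' members.
            d = sum(
                euclid(coords[a], coords[b]) for a in members_i for b in members_j
            ) / (len(members_i) * len(members_j))
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Toy coordinates in three dimensions (invented, for illustration only).
toy = {
    "Georgian": [0.0, 0.0, 0.0],
    "Turk": [0.1, 0.0, 0.1],
    "Pathan": [1.0, 1.0, 0.0],
    "Sindhi": [1.2, 0.9, 0.0],
}
tree = average_linkage(toy)
print(tree)
```

With real data each list would hold all 20 dimensions, and the nested tuples could be handed to whatever plotting tool you prefer.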

parasar
05-21-2014, 04:05 PM
I should have mentioned this before, the Burusho results in relation to MA1 are rather interesting. Although eigenvector 1 involves a mix of ANE and ENA (difficult to disentangle the two), and although the Burusho have some very serious ENA admixture (if the HGDP Pashtuns are 5% ENA, and I don't see any reason to doubt TreeMix on this matter, the Burusho are probably around 15%-25% ENA), there is a precedent for the Burusho having the highest level of ANE admixture in South Asia (and since South Asians have the highest level of ANE admixture in Eurasia, that's a big deal). Since the Burusho aren't Indo-European speakers, this is pretty important. And even if they don't have the highest level of ANE in South Asia, I'm sure that they are very similar to Pashtuns when it comes to ANE admixture, probably around 35%-45% ANE. Outside of South Asia but still within Eurasia, ANE peaks among the Lezgians and Chechens, and both of these peoples aren't Indo-European speakers. David found 25% ANE admixture in South India (referring to the most accurate test he tried), which is still greater than the highest ANE % found in Europe. Again, we're not dealing with Indo-Europeans.

Taking all of this into account, and recognizing the fact that substantial ANE ancestry is an important part of the genetic heritage of all South Asians, I think ANE admixture doesn't involve Indo-Aryans for South Asians, and may not involve Indo-Europeans in general (although it might; to be completely honest, I'm not really sure). Instead, there may have been ANE hunter gatherer populations across West Asia, and into northern South Asia. These populations probably mixed with ASI hunter gatherer populations, as that's simply how the world works. After the agricultural revolution, these ANE-ASI mixed populations might have been culturally assimilated by agricultural peoples whose ultimate roots lay in Southwestern Asia, and who possessed an ancestry rich in "Basal Eurasian" admixture. There must have been substantial gene-flow between the ANE-ASI mixed populations and these farming groups. Later incursions into South Asia from the steppe reinforced/augmented an ancient and already pervasive element of South Asian genetic ancestry. To be frank, I just can't imagine a relatively small number of people from the steppe having such a huge impact on South Asian genetics. Mobile cultures with expansionist tendencies are almost always thoroughly absorbed by the settled/sophisticated cultures they encounter. Simply put, nomadic raiding peoples are replaced by the people they "conquer"; their genetic contribution to the "conquered" tends to be rather meager, due to population size. Not to mention the fact that the settled people who lived in Punjab/Sindh before the Indo-Aryans looked very similar to living Punjabis and Sindhis.

My idea concerning ANE populations in West Asia+northern South Asia being replaced by farmers isn't original. It comes from the supplementary materials for Lazaridis et al. They note the puzzling nature of an ANE peak among speakers of North Caucasian languages (they haven't even tackled South Asia, but when they do, I'm sure they'll verify the huge presence of ANE admixture throughout South Asia), and the possibility of ANE populations throughout ancient West Asia is noted.

Not sure if this matters, but many scholars have long argued for a link between North Caucasian languages and Burushaski (the language of the Burusho). In turn, there have been arguments that North Caucasian languages+Burushaski are related to Yeniseian languages, and perhaps to Na Dene languages as well. The most ANE-like of all ENA populations are the Ket, speakers of a Yeniseian language. Perhaps these languages, assuming they are actually related, are descended from the language families prevalent among the ANE? Maybe Burushaski is related to some of the rather ancient, long extinct languages of northern South Asia and West Asia? I'm just thinking out loud here, and engaging in extremely unsubstantiated speculation. I'm no expert on these issues. Still, I just thought I should point this stuff out.

Edit: I still find it cool that if MA1 was alive today, he'd be eligible for participation in Harappa Ancestry Project. :biggrin1:

Even when there is a robust South Asian-specific dimension in PCA, he clusters with South Asians. Most of his "eastern" affinity seems to be specific to South Asia. In unsupervised ADMIXTURE runs, he scores very substantial percentages of the Baloch-specific and South Indian-specific components. If one counts the component modal in South Central Asia as "South Asian", and if one counts the component modal in tribal+"Scheduled Caste" South Indians as "South Asian", the majority of MA1's ancestry is "South Asian", followed by "Northern European", and "Native American" (again, referring to unsupervised ADMIXTURE runs).

Quite possible. I had speculated something similar on other threads.


The slightly higher ANE in the Burusho may either point to a non-IE origin nature of ANE with the isolate Burusho retaining more of their composition and language or perhaps a second shot of ANE into the Burusho from east asia

http://www.anthrogenica.com/showthread.php?1781-Post-Your-EEF-WHG-and-ANE-Admixture-Proportions&p=33992&viewfull=1#post33992


So is it possible that the ANE input is the non-Indo European one, and EEF (perhaps Y-F,G,H,J) is the Indo-European one?

http://www.anthrogenica.com/showthread.php?1832-So-are-the-ANEs-the-root-of-PIE&p=34428&highlight=burushaski#post34428

http://www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0091722.g004&representation=PNG_M
http://www.anthrogenica.com/showthread.php?2314-Linguistic-Phylogenies-Support-Back-Migration-from-Beringia-to-Asia

The Chinese border land - Gansu Shanxi - nomads (Huns, Di, Yuezhi etc. http://en.wikipedia.org/wiki/Ethnic_groups_in_Chinese_history ) may have been the mechanism for spread.
http://www.anthrogenica.com/showthread.php?97-Genetic-Genealogy-and-Ancient-DNA-in-the-News&p=37506&viewfull=1#post37506

Some think that Indo-European was also spread by these same or similar nomads. Per Narain noted in Mallory and Mair (2000, p. 281):

“As we have just mentioned, the people who emerge as Tocharians in Western sources are often equated with a branch of the Yuezhi of Chinese sources who were driven first from the Gansu borderlands by the Xiongnu, then further west by the Wusun, arriving at the Oxus, and going on to conquer Bactria and establish the Kushan empire. Narain argues that once one accepts the equation Tocharian = Yuezhi, then one is forced to follow both the Chinese historical sources (which for him would propel the Yuezhi back to at least the 7th century BC) and the geographical reference of their first cited historical location (Gansu) to the conclusion that they had lived there ‘from times immemorial’. Narain infers that they had been there at least since the Qijia culture of c. 2000 BC and probably even earlier in the Yangshao culture of the Neolithic. This would render the Tocharians as virtually native to Gansu (and earlier than the putative spread of the Neolithic to Xinjiang) and Narain goes so far as to argue that the Indo-Europeans themselves originally dispersed from this area westwards. Seldom has a tail so small wagged a dog so large.”


If what Narain theorized is indeed the case, then perhaps both Nadene-Caucasian-Burushaski and IE came from Inner Asia, with the Burusho showing higher ANE due to a subsequent ANE input.

parasar
05-21-2014, 05:33 PM
...

If what Narain theorized is indeed the case, then perhaps both Nadene-Caucasian-Burushaski and IE came from Inner Asia, with the Burusho showing higher ANE due to a subsequent ANE input.
The argument for a subsequent shot of ANE makes more sense, as the Burusho (unlike their neighbors, except the Hazara) show a perceptible amount of Y-C.
http://www.ncbi.nlm.nih.gov/corecgi/tileshop/tileshop.fcgi?p=PMC3&id=537231&s=4&r=1&c=1

Later East Asiatic/Siberian input is also likely the cause of higher levels of ANE in some Europeans:

However, northeastern Europeans fall away from this line in the direction of Han. This is consistent with Siberian gene flow into some northeastern Europeans after the initial ANE admixture, and may be related to the fact that Y-chromosome haplogroup N [30, 31] is shared between Siberian and northeastern Europeans [32, 33] but not with western Europeans. There may in fact be multiple layers of Siberian gene flow into northeastern Europe after the initial ANE gene flow, as our analyses reported in SI 12 show that some Mordovians, Russians and Chuvash have Siberian-related admixture that is significantly more recent than that in Finns (SI12).
http://biorxiv.org/content/biorxiv/early/2013/12/23/001552.full.pdf

Sein
05-21-2014, 06:05 PM
I just tried Plink's mds-plot command. What I did was, I used 20 dimensions to create the following dendrogram. 60k snp's used. Google is not displaying the plot properly, so you will need to download it for sure.
https://drive.google.com/file/d/0B3vEDdpZDjUpU3A2bC1DcGFXZG8/edit?usp=sharing

I did something similar a few months ago. The only difference is, this was created using Plink as opposed to flashpca.
Also, let me know if you need any ID's.

Thanks everest! This looks great, and it's very interesting.

I was wondering what the IDs were for Pathan 1680, Pathan 1671, Tadjik 1923, Pathan 1675, and Pathan 1681? Also, what's the ID for Pathan 1674, right next to Sapporo. Thanks in advance.

Sangarius
05-21-2014, 08:10 PM
Anywhere I can download it from?

Anyways, here's a plot that includes Kashmiri Pandits.

https://drive.google.com/file/d/0B3vEDdpZDjUpNnhkdGlvbTVFU2M/edit?usp=sharing

No Europeans here. Ma-1 clustering very close to South Asians, with one Burusho close by. Only 6k snp's used, since the Moorjani et al data is very different from other data.

If you don't see yourself, just search.

Not as accurate as I'd like...

Interesting. I was a bit confused at first when I saw the name Alper on the plot, which is an abbreviation of my first name. I thought you might have taken it from the email I sent my data with, and its position also roughly matched where I would have expected to plot. But then I discovered the name Sangarius on the plot, surprisingly plotting close to Bedouins. I guess the low number of snps might be the explanation for my position on the plot?

Ignis90
05-21-2014, 08:19 PM
Interesting. I was a bit confused at first when I saw the name Alper on the plot, which is an abbreviation of my first name. I thought you might have taken it from the email I sent my data with, and its position also roughly matched where I would have expected to plot. But then I discovered the name Sangarius on the plot, surprisingly plotting close to Bedouins. I guess the low number of snps might be the explanation for my position on the plot?

You're plotting near the most ANE-admixed Bedouin, not the [supposedly] ANE-free ones.

Hanna
05-21-2014, 08:44 PM
Interesting. I was a bit confused at first when I saw the name Alper on the plot, which is an abbreviation of my first name. I thought you might have taken it from the email I sent my data with, and its position also roughly matched where I would have expected to plot. But then I discovered the name Sangarius on the plot, surprisingly plotting close to Bedouins. I guess the low number of snps might be the explanation for my position on the plot?
Alper is my sister's husband. I recently had him tested.

Sangarius
05-21-2014, 08:56 PM
You're plotting near the most ANE-admixed Bedouin, not the [supposedly] ANE-free ones.

Thanks for the info. But that still doesn't explain the discrepancy between where I plot and where other people plot. At least based on admixture proportions from Eurogenes, Dodecad et al., it doesn't make any sense. Hanna, for example, is quite similar to me in admixture proportions and clusters as expected with the other Turkish user Icebreaker, close to other West Asians/South Caucasians, whereas I plot far off.

Sangarius
05-21-2014, 08:59 PM
Alper is my sister's husband. I recently had him tested.

Aha, from where in Turkey is his ancestry? Is he the one with the red beard?

Sein
05-21-2014, 09:16 PM
Interesting. I was a bit confused at first when I saw the name Alper on the plot, which is an abbreviation of my first name. I thought you might have taken it from the email I sent my data with, and its position also roughly matched where I would have expected to plot. But then I discovered the name Sangarius on the plot, surprisingly plotting close to Bedouins. I guess the low number of snps might be the explanation for my position on the plot?

Interestingly, the lonely Bedouin sample far away from everyone isn't actually a Bedouin, but probably a mislabeled Sudanese sample (I've read something about this. The dendrogram also supports this: this sample clusters with the Yoruba, along with the most Sub-Saharan African admixed Mozabite). So, just like PC2 involves both ANE and ENA, PC1 involves both Sub-Saharan African ancestry and "Basal Eurasian" ancestry. So I think your position implies that you may have either more BEA or SSA admixture than Iranians and Georgians. Maybe both. If you don't have any real SSA admixture on various DIY calculators, your position probably boils down to greater BEA admixture.

This is also why some of the Brahui and Baloch are clustering with Iranians and Georgians, some of them have as much BEA admixture as those populations, not to mention 1%-3% SSA admixture.

everest59
05-21-2014, 10:12 PM
Thanks everest! This looks great, and it's very interesting.

I was wondering what the IDs were for Pathan 1680, Pathan 1671, Tadjik 1923, Pathan 1675, and Pathan 1681? Also, what's the ID for Pathan 1674, right next to Sapporo. Thanks in advance.

Here are the ID's:
HGDP00244 1680
HGDP00226 1671
Tadjik1 44Af 1923
HGDP00262 1685
HGDP00234 1675
HGDP00251 1681

Looking at the Harappa HGDP results, HGDP00244 seems different from you in the sense that he has lower Caucasian but higher Baloch. Besides those two components everything else seems to match yours.

everest59
05-21-2014, 10:18 PM
Interesting. I was a bit confused at first when I saw the name Alper on the plot, which is an abbreviation of my first name. I thought you might have taken it from the email I sent my data with, and its position also roughly matched where I would have expected to plot. But then I discovered the name Sangarius on the plot, surprisingly plotting close to Bedouins. I guess the low number of snps might be the explanation for my position on the plot?

This has got to be it, because if you look at the dendrogram I created above, you are clustering at the right position:
Georgian 915
Georgians 931
Georgian 922
Georgian 917
Turks 1981
Syrians 1902
Armenians 73
.Sangarius 15
Turks 1975
Turks 1969
Turks 1986
.Icebreaker 8
Turks 1976
Turks 1971

I'm gonna create a new plot with higher number of snp's.

everest59
05-21-2014, 10:47 PM
Thanks for the info. But that still doesn't explain the discrepancy between where I plot and where other people plot. At least based on admixture proportions from Eurogenes, Dodecad et al., it doesn't make any sense. Hanna, for example, is quite similar to me in admixture proportions and clusters as expected with the other Turkish user Icebreaker, close to other West Asians/South Caucasians, whereas I plot far off.

This one's better, with 30k snp's:
https://drive.google.com/file/d/0B3vEDdpZDjUpWU81Z3pEMUdyeEE/edit?usp=sharing

I'll say this. The dendrograms I create tend to always be more accurate than pca plots. The reason being, I am plotting only 2 dimensions. However, with my dendrogram, I can use as many dimensions as I'd like.
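
To make that concrete, here is a tiny sketch in plain Python with invented coordinates for two hypothetical samples. It only shows that two points can coincide on the first two dimensions while differing on the later ones, which is the information a 2D plot throws away.

```python
import math

def distance(u, v, dims=None):
    """Euclidean distance using only the first `dims` coordinates (all if None)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u[:dims], v[:dims])))

# Invented coordinates: identical on dimensions 1-2, different on 3-4.
sample_a = [0.02, -0.01, 0.00, 0.05]
sample_b = [0.02, -0.01, 0.04, 0.02]

flat = distance(sample_a, sample_b, dims=2)  # what a 2D plot shows
full = distance(sample_a, sample_b)          # what clustering on all dimensions sees
print(flat, full)
```

On a 2D plot these two would sit on top of each other, while anything that clusters on all four dimensions would keep them apart.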

Sein
05-21-2014, 10:58 PM
This one's better, with 30k snp's:
https://drive.google.com/file/d/0B3vEDdpZDjUpWU81Z3pEMUdyeEE/edit?usp=sharing

I'll say this. The dendrograms I create tend to always be more accurate than pca plots. The reason being, I am plotting only 2 dimensions. However, with my dendrogram, I can use as many dimensions as I'd like.

This one's pretty cool, MA1's clustering with "Scheduled Caste" North Indians.

everest59
05-21-2014, 11:02 PM
This one's pretty cool, MA1's clustering with "Scheduled Caste" North Indians.

Yeah, a little different for sure.

BTW, I just ran treemix with ancient DNA samples. I get an arrow going from LaBrana +Mal'ta to Australian aborigine. Unfortunately only 6k snp's used:

https://drive.google.com/file/d/0B3vEDdpZDjUpTG5FNkZTM1ZTQnM/edit?usp=sharing

Migration edge of 6.2%.

Sein
05-21-2014, 11:04 PM
Here are the ID's:
HGDP00244 1680
HGDP00226 1671
Tadjik1 44Af 1923
HGDP00262 1685
HGDP00234 1675
HGDP00251 1681

Looking at the Harappa HGDP results, HGDP00244 seems different from you in the sense that he has lower Caucasian but higher Baloch. Besides those two components everything else seems to match yours.

Thanks everest!

I really like the dendrograms. For me though, it's interesting that out of all Di Cristofaro populations, it's only the Pashtuns who are very close to fellow coethnics in neighboring countries. The Di Cristofaro Uzbeks, Tajiks, Hazara, and Turkmen seem rather different from their relatives, while all of the Pashtuns are rather tightly clustered. Also, it's interesting how it joins Pashtuns and Balochistanis together. Very fascinating stuff.

Although, 9-10 HGDP Pashtuns are clustering with northwest South Asians (Punjabis like Sapporo and Dr_McNinja, some very northern shifted GIH, and Sindhis). That's actually what we'd expect.

everest59
05-21-2014, 11:06 PM
Thanks everest!

I really like the dendrograms. For me though, it's interesting that out of all Di Cristofaro populations, it's only the Pashtuns who are very close to fellow coethnics in neighboring countries. The Di Cristofaro Uzbeks, Tajiks, Hazara, and Turkmen seem rather different from their relatives, while all of the Pashtuns are rather tightly clustered. Also, it's interesting how it joins Pashtuns and Balochistanis together. Very fascinating stuff.

Although, 9-10 HGDP Pashtuns are clustering with northwest South Asians (Punjabis like Sapporo and Dr_McNinja, some very northern shifted GIH, and Sindhis). That's actually what we'd expect.

Yeah, Pashtuns are pretty tightly clustered.
You know, with Plink, I can probably create PCAs using a much higher number of snp's. Flashpca is limited by available memory.

Another plot with migration edge of 5.
https://drive.google.com/file/d/0B3vEDdpZDjUpWmh1OTFWUjFEcE0/edit?usp=sharing

I'll give the data:
0.0625201 NA NA NA LaBrana:0.027753 Aus_Aboriginal:0.11592
0.0383032 NA NA NA Saqqaq:0.106801 Mal'ta:0.0428713
0.0999811 NA NA NA Aus_Aboriginal:0.11592 Clovis:0.120896
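A small sketch for reading lines like the three above, on the assumption that they follow the migration-edge layout of the TreeMix treeout file: edge weight first, then three columns (jackknife weight, jackknife standard error, p-value) that show NA when they were not computed, then the source and the destination with their branch lengths. That column reading is my understanding of the format, not something stated in this thread, and `parse_migration_line` is a hypothetical helper.

```python
def parse_migration_line(line):
    """Parse one migration-edge line into weight, source and destination."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    weight = float(fields[0])
    # fields[1:4] are assumed to be jackknife estimate, standard error and
    # p-value; they are "NA" in the lines posted above, so they are skipped.
    # Drop the trailing ":branch_length" from each label.
    source = fields[4].rsplit(":", 1)[0]
    dest = fields[5].rsplit(":", 1)[0]
    return {"weight": weight, "source": source, "dest": dest}

lines = [
    "0.0625201 NA NA NA LaBrana:0.027753 Aus_Aboriginal:0.11592",
    "0.0383032 NA NA NA Saqqaq:0.106801 Mal'ta:0.0428713",
    "0.0999811 NA NA NA Aus_Aboriginal:0.11592 Clovis:0.120896",
]

edges = [parse_migration_line(l) for l in lines]
for e in edges:
    print(f"{e['weight'] * 100:.1f}% from {e['source']} into {e['dest']}")
```

Under that reading, the first column is the admixture proportion, so the first line is an edge of a bit over 6% from LaBrana into Aus_Aboriginal.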

Sein
05-21-2014, 11:08 PM
Yeah, a little different for sure.

BTW, I just ran treemix with ancient DNA samples. I get an arrow going from LaBrana + Mal'ta to Australian aborigine. Unfortunately, only 6k snp's were used:

https://drive.google.com/file/d/0B3vEDdpZDjUpTG5FNkZTM1ZTQnM/edit?usp=sharing

Migration edge of 6.2%.

This is pretty cool, because that Australian Aborigine sample has substantial South Asian admixture lacking in Papuans. South Asian admixture seems to be closely tied to ANE.

Did you try the -global option?

Edit: That sounds like an awesome idea! The more snps, the better.

everest59
05-21-2014, 11:13 PM
This is pretty cool, because that Australian Aborigine sample has substantial South Asian admixture lacking in Papuans. South Asian admixture seems to be closely tied to ANE.

Did you try the -global option?

Edit: That sounds like an awesome idea! The more snps, the better.

Yup. I tried the -global option. No difference.
I just tried the residuals (this program doesn't work all the time, though, for some reason):

https://drive.google.com/file/d/0B3vEDdpZDjUpNUZiT2R6aTRDLVU/edit?usp=sharing

Sein
05-21-2014, 11:18 PM
Yup. I tried the -global option. No difference.
I just tried the residuals (this program doesn't work all the time, though, for some reason):

https://drive.google.com/file/d/0B3vEDdpZDjUpNUZiT2R6aTRDLVU/edit?usp=sharing

Thanks!

Those residuals are pretty interesting.

everest59
05-21-2014, 11:21 PM
Thanks!

Those residuals are pretty interesting.

How do I read these? I don't get what S.E. stands for.

Sein
05-21-2014, 11:28 PM
How do I read these? I don't get what S.E. stands for.

Here is the explanation from the paper:
"Positive residuals indicate pairs of populations where the model underestimates the observed covariance, and thus populations where the fit might be improved by adding additional edges. Negative residuals indicate pairs of populations where the model overestimates the observed covariance; these are a necessary outcome of having positive residuals, but can also sometimes be interpreted as populations that are forced too close together due to unmodeled migration elsewhere in the graph."
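On the S.E. question: as far as I can tell it stands for standard error, and the colors in the residual plot are the residuals expressed in units of the average standard error. Here is a rough sketch of that idea with made-up numbers. The matrices, the population order, and the helper `scaled_residuals` are assumptions for illustration, not TreeMix output or TreeMix code.

```python
def scaled_residuals(observed, fitted, std_err):
    """Residual = observed - fitted covariance, divided by the mean standard error."""
    n = len(observed)
    mean_se = sum(std_err[i][j] for i in range(n) for j in range(n)) / (n * n)
    return [
        [(observed[i][j] - fitted[i][j]) / mean_se for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 2-population example
observed = [[0.10, 0.04], [0.04, 0.12]]  # covariance seen in the data
fitted = [[0.10, 0.02], [0.02, 0.12]]    # covariance implied by the tree
std_err = [[0.01, 0.01], [0.01, 0.01]]   # standard errors of the estimates

resid = scaled_residuals(observed, fitted, std_err)
print(resid[0][1])
```

In this toy case the off-diagonal residual is positive, meaning the tree underestimates how related the two populations are, which is the situation the quoted passage says might be improved by adding an edge.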

everest59
05-21-2014, 11:36 PM
Here is the explanation from the paper:
"Positive residuals indicate pairs of populations where the model underestimates the observed covariance, and thus populations where the fit might be improved by adding additional edges. Negative residuals indicate pairs of populations where the model overestimates the observed covariance; these are a necessary outcome of having positive residuals, but can also sometimes be interpreted as populations that are forced too close together due to unmodeled migration elsewhere in the graph."

Well, at migration edge of 5, residuals all went white. So I don't see any colored boxes. Is that expected?

Anyways, I wanted to do the Raghavan treemix methodology. I'll do it tomorrow hopefully.

Sein
05-21-2014, 11:41 PM
Well, at migration edge of 5, residuals all went white. So I don't see any colored boxes. Is that expected?

Anyways, I wanted to do the Raghavan treemix methodology. I'll do it tomorrow hopefully.

Pretty much, it's what one would expect. I think this basically means that 5 migration edges is enough to explain the data, or that nothing important is going to happen beyond 5 migration edges.

That sounds exciting, I'm sure it's going to be very interesting.

Edit: everest, if possible, could you find out which sample on the dendrogram is HGDP00214? I'd deeply appreciate this, my friend. Thanks in advance.

everest59
05-22-2014, 01:57 AM
Pretty much, it's what one would expect. I think this basically means that 5 migration edges is enough to explain the data, or that nothing important is going to happen beyond 5 migration edges.

That sounds exciting, I'm sure it's going to be very interesting.

Edit: everest, if possible, could you find out which sample on the dendrogram is HGDP00214? I'd deeply appreciate this, my friend. Thanks in advance.

Number 1666, and he's clustering with Uzbek8_18af.

Sein
05-22-2014, 02:19 AM
Number 1666, and he's clustering with Uzbek8_18af.

Thanks!

That's rather reasonable. We can't compare both of them directly, because Zack calculated the percentages in a different manner between both data-sets. But even then, their HarappaWorld results are very similar. This seems to be a very accurate method. In fact, impressively accurate.

Again, I'd like to point out the Khetran individual. The Khetran people have a very complicated and ambiguous history. They tell everyone that they are Baloch, but lay claim to Pashtun ancestry when talking about their "ultimate" origins. The British anthropologists thought that they weren't Baloch or Pashtun, but just descendants of "Indian" people who were assimilated by the Baloch. I expected his HarappaWorld results to throw some light on this matter, but things got even more confusing. His "Caucasian" % is typical of Pashtuns, his "South Indian" % is typical of Pashtuns, his "Southwest Asian" is fairly typical of Pashtuns, and his "Baloch" % is more in line with Balochi people. I sent the data to Dr. McDonald, and he thought the best fit was Baloch or Brahui, but noted that his reference populations weren't adequate for this individual. And that's why I have to thank you my friend. Without your analyses, things would still be rather unclear. In everything you've done, he unambiguously clusters with Balochistanis. He does have an affinity to Pashtuns, but his predominant genetic heritage is clearly tied to the Baloch. Also, he is rather distinct from Sindhis or Punjabis, even though a lot of anthropologists have claimed that the Khetran are basically descended from the ancient non-Iranian populations of eastern Balochistan.

In short, you are amazing.

Shaikorth
05-22-2014, 05:33 AM
http://www.ebc.uu.se/Jakobsson/data/SkoglundEtAlData2014/?languageId=1

A bunch of Swedish ancient samples are available for download. The hunter-gatherers might be of particular interest since they're the clearest WHG-ANE hybrids around.

everest59
05-22-2014, 09:33 AM
http://www.ebc.uu.se/Jakobsson/data/SkoglundEtAlData2014/?languageId=1

A bunch of Swedish ancient samples are available for download. The hunter-gatherers might be of particular interest since they're the clearest WHG-ANE hybrids around.

Bam format is always a pain in the butt because I will need to call variants. It takes quite a bit of time. I guess I'll do it on a few of those samples.

soulblighter
05-22-2014, 01:13 PM
I just tried Plink's mds-plot command. What I did was, I used 20 dimensions to create the following dendrogram. 60k snp's used. Google is not displaying the plot properly, so you will need to download it for sure.
https://drive.google.com/file/d/0B3vEDdpZDjUpU3A2bC1DcGFXZG8/edit?usp=sharing

I did something similar a few months ago. The only difference is, this was created using Plink as opposed to flashpca.
Also, let me know if you need any ID's.

Thanks! I am clustered with Indian 1165, followed by Indian 1167. I assume these are the Singapore Indian samples?
Also, Balochi 157 seems to be more Indian than Balochi based on the dendrogram!


This one's better, with 30k snp's:
https://drive.google.com/file/d/0B3vEDdpZDjUpWU81Z3pEMUdyeEE/edit?usp=sharing

I'll say this. The dendrograms I create tend to be more accurate than PCA plots. The reason is that a PCA plot shows only 2 dimensions, whereas with a dendrogram I can use as many dimensions as I'd like.

Is there a possibility that the 30K SNPs chosen for Mal'ta in this run coincidentally overlap mostly with the allele frequencies associated with his 33% South/Central Asian affinity?

parasar
05-22-2014, 01:59 PM
...

Is there a possibility that the 30K SNPs chosen for Mal'ta in this run coincidentally overlap mostly with the allele frequencies associated with his 33% South/Central Asian affinity?

MA1 does overlap about 33% with a 'South Asian' component. But since South Asians themselves share other components with MA1 besides 'South Asian', the overlap with South Asians is higher than 33%. Lazaridis models MA1 as unadmixed, and so considers the 33% overlap to be "shared ancestry" rather than admixture, which may not be the case.

After all Lazaridis even projects WHG onto modern South Asians.
http://4.bp.blogspot.com/-43t2J5JB4wI/U0E3V9B8oCI/AAAAAAAACe4/YdQUgTXvmDg/s1600/PCA+global.png
Figure S10.1: Projection of ancient samples onto “Global” PCA dimensions 1 and 2.
http://biorxiv.org/content/biorxiv/suppl/2014/04/05/001552.DC4/001552-3.pdf

parasar
05-22-2014, 04:11 PM
Bam format is always a pain in the butt because I will need to call variants. It takes quite a bit of time. I guess I'll do it on a few of those samples.

everest59,

Is it possible to obtain and run data for the Mohanna?
Mohanna MHN 48 samples HGDP

Sein
05-22-2014, 06:11 PM
...Also, Balochi 157 seems to be more Indian than Balochi based on the dendrogram!...

This is an outlier Baloch sample that sticks out whenever the Baloch are included on PCA plots or in ADMIXTURE runs. I think this sample is 45% "South Indian" for HarappaWorld. I've always wanted to verify this with Zack.