About a study conducted on 16 Turks



Buhedyar
01-13-2018, 08:48 PM
Hi everyone. You are most likely aware that a genetic study of 16 Turks was published in 2014. The study mostly looks at the racial (Negroid, Asian, Caucasian) components of these Turks. I'm posting the study below.

My main question is this: can we take these Turks' DNA from the study and upload it to Gedmatch? I asked lukas to do it earlier, but unfortunately he didn't. By the way, we know where each sample is from; their home provinces are encoded with licence plate codes.

https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-963


http://media.springernature.com/full/springer-static/image/art%3A10.1186%2F1471-2164-15-963/MediaObjects/12864_2014_Article_6660_Fig1_HTML.jpg
http://media.springernature.com/full/springer-static/image/art%3A10.1186%2F1471-2164-15-963/MediaObjects/12864_2014_Article_6660_Fig2_HTML.jpg

Onur Dincer
01-13-2018, 09:57 PM
Overall it's a good study. But if the references had also included Anatolian Greeks and Armenians, more accurate ancestry estimates could have been made. After all, Anatolian genetics before the Turkic migration was not like Tuscany or Iberia.

Buhedyar
01-13-2018, 11:21 PM
I think the study is really about race rather than ethnicity. According to the study, Turks look almost like a sister people of Northern Europeans, but ethnically there is a world of difference between them. In my view, Northern and Southern Europe represent Caucasoid ancestry (some Northern Europeans shift slightly south; those are the Finns, by the way).

Onur Dincer
01-13-2018, 11:36 PM
Finns don't shift south; they shift east, however slightly, because of East Eurasian admixture. Anatolian Turks have a similar amount of East Eurasian admixture, which is why they shift east too. On a West Eurasian scale rather than a world scale, it is easy to see that Anatolian Turks are genetically much closer to Southern Europeans than to Northern Europeans.

Buhedyar
01-13-2018, 11:53 PM
By south I meant the PCA plot in the study. The Asians are in the south. The Asian samples are Japanese and Chinese. What's more, most of the Turks in the study are more Asian-shifted than the Finns.

Onur Dincer
01-14-2018, 12:18 AM
PCA plots don't have a north, south, east or west, but anyway, that's not important. Most of the Anatolian Turks in the PCA are level with the Finns in terms of distance to East Eurasians, while 5 Anatolian Turks have more East Eurasian admixture than the Finns.

Sangarius
01-14-2018, 12:50 PM
Those are whole-genome sequences, each over 100 GB in size. It's no wonder people decline requests to process them.

Furthermore, those samples wouldn't be useful for regional analysis anyway. To quote the paper on its sampling methodology:


We recruited 16 healthy volunteers from across Turkey (Figure 1A). The individuals were included in the study irrespective of their mother-tongue/ethnicity; we refer to them collectively as “Turkish”.

Why do these people make the extra effort to sample from different regions, if they don't really care WHO they sample? It's like they are trying to out-PC their Western colleagues.

Buhedyar
01-14-2018, 03:58 PM
The problem wasn't the size of the genome sequences. Actually, the person I asked managed to extract one sample's DNA and uploaded it to Gedmatch. But the problem is that his components made no sense. He was your average Turk, but he scored something like 4-5% Negroid. Other samples were like this too. I think he made a mistake while extracting the DNA, which produced this result.


On whether they care who they sample: I think they used one Kurd and one Laz from Van/Artvin, but the rest are Turkish, in my opinion.

Sangarius
01-14-2018, 05:14 PM
I see, from your initial post it sounded like he refused to do it.

How do you know the ancestries of the samples? While from the PCA it looks like most have an East-Eurasian shift, that still doesn't mean that they are regional natives. That's one of the criteria of our project, otherwise we can't use them as representative samples. I still might try processing one if I knew which sample was from which region.


Additional file 1: Table S1: Geographic locations of the samples used in the project and their NCBI accession numbers.

Yet there is no geographic information about the samples in that table.

Buhedyar
01-14-2018, 05:22 PM
Yes, I think some of them, or even more than half, are not regional natives. But I still want to see their Gedmatch results.


Also, I emailed one of the scientists to ask where the samples are from. Here are the cities of the samples.

06A010111 is from Ankara
08P210611 is from Artvin
24D220611 is from Erzincan
25A220611 is from Erzurum
31P140611 is from Hatay
32A140611 is from Isparta
33M140611 is from Mersin
34S291210 is from İstanbul
35C240511 is from İzmir
38I220611 is from Kayseri
42S291210 is from Konya
48S210611 is from Muğla
50G301210 is from Nevşehir
52C130611 is from Ordu
57M220611 is from Sinop
65A220611 is from Van


Their places of origin are encoded with licence plate numbers (the first two digits of each ID).
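A minimal sketch of reading the province off a sample ID, assuming (as the list above suggests) that the first two digits are always the Turkish licence plate code. The function name `province_of` is made up for illustration, and the table only covers the 16 provinces named in this thread.

```python
# Plate code -> province, copied from the list of samples in this thread.
# Only these 16 provinces are covered; anything else is reported as unknown.
PLATE_CODES = {
    "06": "Ankara", "08": "Artvin", "24": "Erzincan", "25": "Erzurum",
    "31": "Hatay", "32": "Isparta", "33": "Mersin", "34": "İstanbul",
    "35": "İzmir", "38": "Kayseri", "42": "Konya", "48": "Muğla",
    "50": "Nevşehir", "52": "Ordu", "57": "Sinop", "65": "Van",
}


def province_of(sample_id: str) -> str:
    """Return the province for a study sample ID based on its two-digit prefix."""
    return PLATE_CODES.get(sample_id[:2], "unknown")


if __name__ == "__main__":
    for sample in ("57M220611", "65A220611"):
        print(sample, province_of(sample))
```

Note that this only tells you where a volunteer was sampled, not whether they are a regional native.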

mephisto
01-17-2018, 04:36 PM
23andme V5 format:
https://drive.google.com/open?id=14gSlZyONggJRCCanj3GUJt25QdRNJ1oD

Sangarius
01-17-2018, 06:50 PM
Dude, you actually downloaded and processed all of them? Incredible job, man!
But do you have them in a different format? The 23andme V5 format only offers about 600,000 SNPs; they reduced the number of SNPs starting with V4.
The V3 format offers the most, with over 900,000 autosomal SNPs. Here is a link to a V3 SNP list: https://drive.google.com/open?id=1YnfqLju7mjLFNnO33ulNCe68jmKUNxsw

lukaszM
01-17-2018, 07:04 PM
Still the same shit, bro :) I was the person who converted one of them for the first time. Really, guys, don't waste your time.

WG2889568
K36 Eurogenes
Amerindian 0.67
Arabian 4.51
Armenian 4.57
Basque 0.27
Central_African -
Central_Euro 1.72
East_African 0.31
East_Asian 4.63
East_Balkan 3.81
East_Central_Asian 2.95
East_Central_Euro 1.37
East_Med 7.64
Eastern_Euro 0.99
Fennoscandian -
French 1.92
Iberian -
Indo-Chinese -
Italian 16.36
Malayan 1.84
Near_Eastern 5.19
North_African 0.74
North_Atlantic -
North_Caucasian 8.04
North_Sea 1.85
Northeast_African 0.31
Oceanian 1.65
Omotic 2.39
Pygmy 1.90
Siberian 6.56
South_Asian 3.78
South_Central_Asian 7.36
South_Chinese 2.13
Volga-Ural -
West_African -
West_Caucasian 4.50
West_Med -


Eurogenes K15

North_Sea 7.98
Atlantic 1.67
Baltic 3.88
Eastern_Euro 4.99
West_Med 7.39
West_Asian 17.09
East_Med 18.55
Red_Sea 4.85
South_Asian 7.31
Southeast_Asian 7.49
Siberian 8.84
Amerindian 1.68
Oceanian 2.15
Northeast_African 5.37
Sub-Saharan 0.77

mephisto
01-17-2018, 07:10 PM
To be honest, I extracted the familytreedna SNP set, not 23andme's. Only the output ("listing") format is 23andme V5; "SNPs of ftdna in 23andme format" would be the accurate description. Every single file contains more than 600,000 SNPs (not as many as 23andme V3, but still).

I already deleted all my working files, so I would have to start from zero again (download everything, etc.); maybe I will do it at a later point. I am having trouble uploading self-made (or even converted) files to Gedmatch: the tokenizing does not finish. How do you upload ancient samples or extracted samples like these to Gedmatch?

Sangarius
01-17-2018, 07:25 PM
Do all of them get this SSA noise? Which sample is that?


Afaik Gedmatch expects an X chromosome if you are uploading in 23andme format. You would need to include X chromosome SNPs after chromosome 22. Just inputting "--" as genotype for those should probably work. Somebody correct me if I'm wrong.
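A rough sketch of that idea, under several assumptions: the file is tab-separated 23andme-style text (rsid, chromosome, position, genotype, with `#` comment lines), and you already have a list of X-chromosome rsids and positions from some genuine template file. The function name `add_x_placeholders` is invented for this example, and whether Gedmatch actually accepts the result is untested here.

```python
def _chrom(row):
    """Return the chromosome field of a data row, or None for comments/blank rows."""
    if not row or row.startswith("#"):
        return None
    fields = row.split("\t")
    return fields[1] if len(fields) >= 4 else None


def add_x_placeholders(lines, x_snps):
    """Insert no-call ("--") X rows right after the last chromosome 22 row.

    lines:  rows of a 23andme-style raw file
    x_snps: (rsid, position) pairs for X markers, taken from a real template
    Files that already contain X rows are returned unchanged.
    """
    rows = [ln.rstrip("\n") for ln in lines]
    if any(_chrom(r) == "X" for r in rows):
        return rows
    # Index of the last chromosome 22 row; fall back to the end of the file.
    last_22 = max(
        (i for i, r in enumerate(rows) if _chrom(r) == "22"),
        default=len(rows) - 1,
    )
    placeholders = [f"{rsid}\tX\t{pos}\t--" for rsid, pos in x_snps]
    return rows[: last_22 + 1] + placeholders + rows[last_22 + 1 :]
```

The rsids and positions should come from a real file on the same genome build; made-up markers would just be ignored or could confuse the parser.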

mephisto
01-17-2018, 07:29 PM
What was the genotype rate before you recoded the bfiles?

Buhedyar
01-17-2018, 07:31 PM
Re "Do all of them get this SSA noise?": this is the problem that I mentioned before.

mephisto
01-17-2018, 07:38 PM
Yes, the quality is bad because of missing SNPs. How were you the first to do this? :) Plink's support for 23andme (in both directions) probably exists because it is used and needed. I have also extracted good files; it depends on the reference itself.

Sangarius
01-17-2018, 07:41 PM
@Mephisto: I just checked the no-call rate in one of the files (Sinop), and it's ~45%. So, it's no wonder that there is so much noise in the results. Seems something went wrong during the processing of the files?
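For anyone who wants to repeat that check, here is a minimal sketch of a no-call counter. It assumes a tab-separated 23andme-style file (rsid, chromosome, position, genotype) with `#` comment lines, and it assumes that a no-call is written as `--`; the function name `no_call_rate` is just for illustration.

```python
def no_call_rate(lines):
    """Return the fraction of SNP rows whose genotype is the no-call marker "--"."""
    total = 0
    missing = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        total += 1
        if fields[3] == "--":
            missing += 1
    return missing / total if total else 0.0


if __name__ == "__main__":
    import sys

    # Usage: python no_call_rate.py sample.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(f"no-call rate: {no_call_rate(handle):.1%}")
```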

mephisto
01-17-2018, 07:46 PM
I extracted the ftdna markers from a list. Very likely these SNPs were not tested by the study's authors themselves, which is also why so many markers are shown only as "--" (missing). I think extracting 23andme V3 would not change much, because the SNPs in scientific samples do not overlap with those of commercial tests. I could have removed the missing ones, but the results would not get more accurate even though the genotype rate would go up. What might help is extracting the SNPs of all commercial tests (23andme V3, ftdna, etc.).
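A small sketch of how the overlap described above could be measured before extracting anything. The two inputs are assumptions for illustration: `target_rsids` is the marker list of a commercial test, and `sample_rsids` is whatever set of rsids actually has calls in the study sample. The function name `marker_overlap` is hypothetical.

```python
def marker_overlap(target_rsids, sample_rsids):
    """Return (fraction of target markers present in the sample, sorted missing rsids)."""
    target = set(target_rsids)
    found = target & set(sample_rsids)
    missing = target - found
    rate = len(found) / len(target) if target else 0.0
    return rate, sorted(missing)


if __name__ == "__main__":
    # Toy example: two of the four wanted markers are present in the sample.
    rate, missing = marker_overlap(
        ["rs1", "rs2", "rs3", "rs4"],
        ["rs2", "rs4", "rs9"],
    )
    print(rate, missing)
```

Running this against several marker lists (23andme V3, ftdna, and so on) would show which list, if any, gives a usable genotype rate.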

Sangarius
01-17-2018, 07:59 PM
Yeah, I understand how you created the final V5 files with Plink. I don't think V3 would result in a higher genotype rate. I'm surprised by the low SNP overlap, given that the sources were whole-genome sequences.
Anyway, I might try uploading them to DNA.Land and see if imputation helps.

Afshar
01-17-2018, 08:01 PM
All samples seem to have SSA noise.

lukaszM
01-20-2018, 02:20 PM
Re "What might help is extracting the SNPs of all commercial tests": nothing would help. These files are completely different from the other VCFs I have processed before. I converted a few hundred samples from the Pagani VCF files and everything was OK. I only have a problem with these Turkish ones. Probably the geneticists used some non-standard chip to process them.