
View Full Version : Somali mtDNA (n=190)



Ezana
08-04-2012, 07:03 PM
Mikkelsen et al. 2012: http://www.springerlink.com/content/x13323337p155h22/fulltext.pdf

Forensic and phylogeographic characterisation of mtDNA lineages from Somalia

http://img515.imageshack.us/img515/2994/somalimtdna.png


115/190 L0-L6 (60.5%)

29/190 M1 (15.3%)

19/190 N1 (10.0%)

11/190 R0 (5.8%)

9/190 K1 (4.7%)

4/190 U3/U9 (2.1%)

3/190 HV (1.6%)
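The percentages above can be re-derived from the raw counts. This is a minimal sketch that assumes the seven counts listed in the post are the complete breakdown of the 190 samples; the names `counts` and `percentages` are illustrative and do not come from the paper.

```python
# Haplogroup counts as listed in the post (assumed to cover all 190 samples).
counts = {
    "L0-L6": 115,
    "M1": 29,
    "N1": 19,
    "R0": 11,
    "K1": 9,
    "U3/U9": 4,
    "HV": 3,
}

def percentages(counts):
    """Return each group's share of the total, in percent, to one decimal."""
    n = sum(counts.values())
    return {hg: round(100 * c / n, 1) for hg, c in counts.items()}

total = sum(counts.values())
pct = percentages(counts)

for hg, share in pct.items():
    print(f"{counts[hg]}/{total} {hg} ({share:.1f}%)")

# R0, K1, U3/U9 and HV all sit under macro-haplogroup R, so their combined
# share should match the single "R" figure quoted from the paper.
r_total = sum(counts[hg] for hg in ("R0", "K1", "U3/U9", "HV"))
r_share = round(100 * r_total / total, 1)
print(f"{r_total}/{total} R combined ({r_share:.1f}%)")
```

Summing the four R-derived groups gives 27/190, which is where the paper's 14.2 % figure for haplogroup R comes from.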


Haplogroup distribution

All 190 haplotypes were assigned to haplogroups according to the mitochondrial phylogeny (Phylotree, build 13 [24]). A total number of 42 discernible haplogroups were found in the dataset (Table 2). In total, 60.5 % of the mtDNA haplotypes belonged to the African haplogroups L1–L4 and L6. The remaining 39.5 % were attributed to haplogroups M1 (15.3 %), N1 (10.0 %) and R (14.2 %).

DMXX
08-05-2012, 03:40 PM
Very interesting paper.

That all the mtDNA M found among Somalis belongs to M1 and its offshoots inclines me to speculate that it is an entrant from somewhere else. Additionally, the presence of M1* followed by a successive group of subclades (M1* -> M1a1* -> M1a1d, as well as M1a5) makes it likely, in my opinion, that the mtDNA M's presumed entrance from somewhere else was prehistoric.

The mtDNA M results specifically are quite similar to those of the Ethiopian Jews from Non et al.'s "Mitochondrial DNA Reveals Distinct Evolutionary Histories for Jewish Populations in Yemen and Ethiopia":



M1 129 154 183C 189 223 249 311
M1a 93 129 189 223 278 294 309
M1a1 129 189 223 249 311 359
M1a1 93 129 189 223 249 311 359
M1a1 93 129 183C 189 213 223 249 311 359
M1b 182C 183C 189 223 249 260 311 320


Note a similar progression from mtDNA M1 upstream is seen here. Clearly the ancestral mtDNA M1 bearer's impact was a regional one and not specific to the modern Somali genepool.

Ezana
08-05-2012, 05:49 PM
Right. Most researchers believe M1 is representative of a prehistoric back-migration from the group that left Africa. See González et al. 2007 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1945034/.

There's a large overlap of M1 and U6 (albeit with opposing East-West gradients), which has led some to believe they were part of the same migration. However, TMRCA estimates for M1 are around 20-30kya, whereas U6's is 40-50kya.

TMRCA of M1a (the major East/Northeast African branch) is about 17kya, while M1c is 19kya. M1abcde is one main branch and M1c the other according to this study.

What complicates the matter is that M1a2 does not reflect a constant rate of mutation, carrying more mutations, particularly in coding regions, compared to M1a. The TMRCA of M1 drops from 26kya to 21kya when removing M1a2 and M1a drops all the way to 4kya! That's likely too low, but it's a large effect that has yet to be adequately corrected for. Still, the diversity of M1a in East Africa points to there as its likely source of origin, although M1 must have an Asian provenance due to the widespread and ancient presence of its progenitor M in Asia.

Ezana
08-05-2012, 05:53 PM
Btw, I found it very interesting that N1 was so high in this sample. 10%! I don't think I've seen it that high in East African populations before, just trace amounts like 1-5%. It may just be an outlier, but it's something to think about. Any ideas on its significance or provenance?

DMXX
08-05-2012, 05:55 PM
What complicates the matter is that M1a2 does not reflect a constant rate of mutation, carrying more mutations, particularly in coding regions, compared to M1a. The TMRCA of M1 drops from 26kya to 21kya when removing M1a2 and M1a drops all the way to 4kya! That's likely too low, but it's a large effect that has yet to be adequately corrected for. Still, the diversity of M1a in East Africa points to there as its likely source of origin, although M1 must have an Asian provenance due to the widespread and ancient presence of its progenitor M in Asia.

Agreed. The sheer clade diversity of mtDNA M in the Indian Subcontinent makes it the likely source of mtDNA M1 in East Africa, albeit at a very distant point in time.


Btw, I found it very interesting that N1 was so high in this sample. 10%! I don't think I've seen it that high in East African populations before, just trace amounts like 1-5%. It may just be an outlier, but it's something to think about. Any ideas on its significance or provenance?

Perhaps it accompanied the mtDNA M1 in its western migration? Both macrohaplogroups M and N followed the Eurasian coastline before dispersing deeper into Asia, according to the consensus view. Therefore, a back-migration towards Africa by mtDNA N1 is fully plausible since M1 has already been implicated in such a scenario.

If the MRCA of Somali N1 is in the same ballpark as the M subclades, that would confirm this notion.

Ezana
08-05-2012, 07:17 PM
Maybe this can help you:


Haplotypes and diversity indices

In the dataset of 190 Somali sequences we observed 114 distinct haplotypes of which 83 were unique (Table 1; Table S1). The two most frequent haplotypes belonged to haplogroups L0a1d (eight individuals, 16129A, 16148T, 16168T, 16172C, 16187T, 16188G, 16189C, 16223T, 16230G, 16293G, 16311C, 93G, 152C, 185A, 189G, 236C, 247A, 263G, 315.1C, 523DEL, 524DEL and 553T) and M1a1d (eight individuals, 16093C, 16129A, 16189C, 16223T, 16249C, 16311C, 16359C, 16519C, 73G, 150T, 189G, 195C, 198T, 263G, 315.1C, 489C, 523DEL and 524DEL). Other frequent haplotypes belonged to haplogroups L2a1h (seven individuals, 16092C, 16183C, 16189C, 16192T, 16223T, 16278T, 16291T, 16294T, 16390A, 73G, 143A, 146C, 152C, 195C, 263G and 315.1C), N1b (six individuals, 16176G, 16223T, 16258C, 16390A, 16519C, 73G, 152C, 263G, 315.1C, 523DEL and 524DEL) and L3h2 (six individuals, 16111T, 16184T, 16223T, 16304C, 16519C, 73G, 150T, 195C, 263G, 315.1C, 318C, 523DEL and 524DEL). Ignoring length variation at positions 16193, 309 and 573, the mean number of pairwise differences between two random Somali individuals was 15.9 ± 7.1 and the probability of two mtDNA sequences being identical was 1.5 %, which corresponds to a power of discrimination of 98.5 %.
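The relationship between the quoted match probability (1.5 %) and power of discrimination (98.5 %) can be sketched as follows. This is only an illustration: the excerpt does not say which estimator the authors used, so the simple sum of squared haplotype frequencies is an assumption here, and `toy_sample` is made-up data, not the Somali dataset.

```python
from collections import Counter

def match_probability(haplotypes):
    """Probability that two random draws (with replacement) share a haplotype.

    Assumed estimator: sum of squared haplotype frequencies.
    """
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())

def power_of_discrimination(haplotypes):
    """Complement of the match probability."""
    return 1.0 - match_probability(haplotypes)

# Hypothetical toy sample: one haplotype seen 3 times, one seen twice,
# and three singletons (8 individuals in total).
toy_sample = ["A", "A", "A", "B", "B", "C", "D", "E"]

rmp = match_probability(toy_sample)
pd = power_of_discrimination(toy_sample)
print(f"match probability: {rmp:.3f}, power of discrimination: {pd:.3f}")
```

The two figures always sum to 1, which is why the quoted 1.5 % and 98.5 % pair up.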

They only did analysis on the diversity of HVS1, but this was the methodology they used:


Population-genetic analyses
Molecular diversity indices, pairwise differences between and within populations as well as analysis of molecular variance were calculated using ARLEQUIN version 3.5 [27]. We compared our data to previously published work describing populations in Saudi Arabia [28], Yemen [29], Soqotra (an island belonging to Yemen in the Gulf of Aden) [30], Libya [31], Egypt [32], Central African Republic [33], Cameroon [33], Congo [33], Gabon [33], Ethiopia [34] and Kenya [35]. All sequences were aligned and trimmed to the greatest common range 16030–16193 and 16194–16370. Multidimensional scaling (MDS) analysis was performed to illustrate the inter-population structure. The two-dimensional MDS plot (Fig. 1) was based on the average values of the pairwise population differences (Table S2).

Lank
08-06-2012, 12:12 PM
Btw, I found it very interesting that N1 was so high in this sample. 10%! I don't think I've seen it that high in East African populations before, just trace amounts like 1-5%. It may just be an outlier, but it's something to think about. Any ideas on its significance or provenance?
The frequency of N1a is similar in the Kenyan Rendille (Castri et al. (http://www.isita-org.com/jass/Contents/2008%20vol86/12_Castri.pdf)) and South Ethiopian Gurage (Kivisild et al. (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182106/), small sample size though).


Six Ethiopian N1a lineages, restricted to Semitic-speaking subpopulations, show low haplotype diversity and include an exact HVSI sequence match with a published N1a sequence from Egypt (Krings et al. 1999). A related sequence, from southern Sudan (Krings et al. 1999), was misclassified as a member of the L1a clade (Salas et al. 2002). Yemeni N1a sequences, on the other hand, display a high level of haplotype (h = 0.89) and nucleotide (ρ = 2.751) diversity, combined with the highest frequency (6.9%) of this haplogroup reported so far.

This lineage may be a result of gene flow from the Nile Valley.

Ezana
08-06-2012, 12:38 PM
Thanks. I forgot about Kivisild's Gurage. I was thinking more of some Southeast African (Tanzanian and Mozambican) samples that I recall carrying a lot of N1.

However, those populations all carry N1a. This sample includes a number of N1b individuals (six out of the nineteen).


N1b (six individuals, 16176G, 16223T, 16258C, 16390A, 16519C, 73G, 152C, 263G, 315.1C, 523DEL and 524DEL)

Awale
08-12-2015, 05:38 PM
I took the time to just go ahead and throw those percentages into a chart:

Somali mtDNA, Mikkelsen et al. 2012 (zoom in, it looks better ;)) (https://docs.google.com/spreadsheets/d/1ccpHA4xD2vgNQrKftOQqbNBqYQu2O1q00mCzMXnx52Y/pubchart?oid=1369933132&format=interactive)

Note: I've colored everything based on the larger mtDNA lineage each subclade belongs to (N1a & R0a* are colored green cos they're both mtDNA N lineages, L2 lineages are colored blue etc.).

Awale
08-13-2015, 02:12 AM
It's pretty cool that this data seems very much in line with what other studies have shown like Boattini et al.:


https://docs.google.com/spreadsheets/d/1ccpHA4xD2vgNQrKftOQqbNBqYQu2O1q00mCzMXnx52Y/pubchart?oid=312151723&format=interactive

The only meaningful difference is that Boattini et al. found mtDNA J, T and I carriers when Mikkelsen et al. didn't and also didn't find the U3a that Mikkelsen et al. encountered.

Just for the heck of it, as I explain below in an email I sent to a friend about an hour ago:


And lol, cos it's summer and I have no "summer job" and only really go out every night or so-> the Sheikh has a lot of "free-time" so I went to work on 23andme Somalis. I basically looked through my relative finder list and listed out all of the mtDNA markers I saw and each one's frequency. I was cautious as feck and made sure not to "double out" by basically adding two people who clearly looked like they were siblings. I then did the same with Isaac's relative finder and made sure to exclude every single person we clearly shared.

I ended up getting about 74 ethnic Somalis (for many I also looked at their ancestry comp to be sure) and got the following results:

https://docs.google.com/spreadsheets/d/1ccpHA4xD2vgNQrKftOQqbNBqYQu2O1q00mCzMXnx52Y/pubchart?oid=254740960&format=interactive

It's astounding how similar this looks to what you see in studies:

https://docs.google.com/spreadsheets/d/1ccpHA4xD2vgNQrKftOQqbNBqYQu2O1q00mCzMXnx52Y/pubchart?oid=1369933132&format=interactive

Meaningful differences would be certain seemingly different subclades and the fact that Mikkelsen et al. & Boattini et al. have our non-L lineage frequency at ~38% when this 23andme set has it at 31%. I double checked for possible mix ups, and while I wanna go back soon and triple check, those 23andme numbers are pretty accurate; I didn't overlook someone me and Isaac share or two people who looked like siblings etc.

-the Sheikh decided to try and add in various 23andme Somalis and the results seem a lot like these studies right down to patterns like L3 being the most common lineage or N lineages being more common than M lineages and so on. I'd say these two studies are pretty representative of Somali mtDNA markers for the most part.
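For a rough sense of whether 31% in a set of about 74 people really differs from the ~38% in the studies, here is a minimal sketch of a 95% interval for a sample proportion. It uses the plain normal (Wald) approximation and treats the 74 profiles as a simple random sample, which a relative-finder list is not, so take it as a ballpark only. The function name `wald_interval` is illustrative.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

# Figures from the post: ~31% non-L lineages among ~74 individuals.
low, high = wald_interval(0.31, 74)
print(f"95% interval: {low:.1%} to {high:.1%}")

# Check whether the ~38% reported by the studies falls inside it.
print("38% inside interval:", low < 0.38 < high)
```

Under these assumptions the interval is about ten percentage points wide on either side of 31%, so the gap to ~38% is within what sampling noise alone could produce.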

Note:

This Isaac (not his real name) is a Somali relative I'm buddy buddies with and he basically gave me the password to his 23andme account...