"Virtual Autosomal Haplogroup"



AJL
04-19-2013, 02:04 AM
It's struck a lot of us, I am sure, that our actual haplogroups -- while important to us, clearly -- are not necessarily most emblematic of the sum total of our ancestry.

Therefore I undertook a little exercise in statistics, constructing "virtual" haplogroups based on the most frequently occurring Y and mtDNA of Relative Finder cousins. This is quite easy since you can download your RF results as a spreadsheet. (If I had more time, I would compare these to expected frequencies of ancestral populations, but that will have to wait for another time.)

PATERNAL GRANDFATHER (about 2,000 cousins)
yDNA J1e "virtual," actual R1a-CTS6
mtDNA K1a1b1a "virtual," actual H6a1a

MOTHER (about 1,400 cousins)
yDNA R1b-L21 "virtual," actual (father's) R1b-L48
mtDNA T2b "virtual," actual H11a2a3

The virtual haplogroups seem at least as emblematic of their known ancestries as their actual ones, but unfortunately this system will work a lot better if you're sharing with a fair number of relatives, and won't work as well if you have multiple origins which are not equally well represented in the database. Still, another way to look at your ancestry.

Feel free to post your results here and comment on how accurate you think they are!

DMXX
04-19-2013, 02:12 AM
Very interesting AJL. I was thinking about this some weeks ago when collating my RF results on the forum. I noticed some of the results didn't jibe very precisely with what is currently known about Iran's haplogroup distribution. How did you calculate the virtual haplogroups?

AJL
04-19-2013, 02:14 AM
I just downloaded the spreadsheet, sorted by haplogroup, and tallied each haplogroup occurrence with a simple formula {EDIT} then picked the most common one.
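For anyone who would rather script the tally than use a spreadsheet formula, here is a minimal sketch of the same idea: count how often each haplogroup appears and take the most common one. The column names (`paternal_haplogroup`, `maternal_haplogroup`) and the sample rows are assumptions made up for illustration, not the actual headers or contents of the Relative Finder download, so adjust them to match your file.

```python
import csv
import io
from collections import Counter


def virtual_haplogroup(rows, column):
    """Return (haplogroup, count, share) for the most common value in `column`.

    Blank cells are skipped (e.g. female cousins have no Y haplogroup).
    Returns None if the column has no usable values.
    """
    counts = Counter(
        (row.get(column) or "").strip()
        for row in rows
        if (row.get(column) or "").strip()
    )
    if not counts:
        return None
    top, n = counts.most_common(1)[0]
    return top, n, n / sum(counts.values())


# Made-up sample standing in for the downloaded spreadsheet (hypothetical headers).
SAMPLE = """name,paternal_haplogroup,maternal_haplogroup
A,R1a1a,H1
B,R1b1b2,H1
C,R1a1a,T2b
D,,H1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
y_result = virtual_haplogroup(rows, "paternal_haplogroup")
mt_result = virtual_haplogroup(rows, "maternal_haplogroup")
print(y_result)
print(mt_result)
```

To run it on a real export, replace `io.StringIO(SAMPLE)` with an opened file. Ties are not handled specially here: `most_common` simply returns one of the tied haplogroups, so with few cousins it is worth looking at the full counts.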

DMXX
04-19-2013, 02:50 AM
My results based on (far) fewer relatives than yours:

"Virtual" mtDNA = T2g (3)
"Virtual" Y-DNA = E1b1b1a2* (5)

As you stated, it's a direct function of the number of relatives one has. Let's see if this changes in the future.

AJL
04-19-2013, 02:51 AM
Hm, I think you need more cousins!

geebee
04-19-2013, 04:19 AM
Interesting idea. I have 1014 RF cousins, 529 of whom are male. The list actually includes my brother and my father, so subtracting for them leaves 527.

I was a bit surprised to see that the greatest number of these are in R1b1b2a1a2f* -- 85. Of course, it doesn't automatically follow that I'm connected to any of these via one of their Y-line ancestors. Still 85 of 527 is a decent-sized portion.

From what I can gather, this seems to have somewhat of a Scottish and Northern Irish association. Only about a quarter of my ancestry should be British according to my paper trail -- although this should indeed be mostly Scots and Scots-Irish, and it is possible that more of my "missing leaves" have Scottish roots than I have assumed.

I'll try the same exercise with my brother and my father later on, and see what I come up with.

AJL
04-19-2013, 12:22 PM
I was a bit surprised to see that the greatest number of these are in R1b1b2a1a2f* -- 85. Of course, it doesn't automatically follow that I'm connected to any of these via one of their Y-line ancestors. Still 85 of 527 is a decent-sized portion.

From what I can gather, this seems to have somewhat of a Scottish and Northern Irish association. Only about a quarter of my ancestry should be British according to my paper trail -- although this should indeed be mostly Scots and Scots-Irish, and it is possible that more of my "missing leaves" have Scottish roots than I have assumed.

Yes, that's R-L21, same as my mother gets. R-L21 is fairly common in France, England, and parts of Germany/Scandinavia too. The Scottish/Irish association sounds more like R-M222, a subclade of R-L21 (that's given at 23andMe as R1b1b2a1a2f2).

Of course there are now a number of other subclades of R-L21, and if 23andMe actually kept up with the various Y subclades, other haplogroups might be in the lead.

It's also worth noting that the database is probably quite heavy in L21 simply because, at the level tested by 23andMe, it is likely the most common haplogroup in the US now, given its heavy distribution in the Isles. For this reason it's perhaps a better indicator of American ancestry than of Isles ancestry.

geebee
04-19-2013, 02:14 PM
^^Makes sense. My total for R1b1b2a1a2f2 is quite a bit lower, just 12.

AJL
04-19-2013, 04:08 PM
My mother has 43 "R1b1b2a1a2f2" which makes sense given that a fair proportion of her ancestry hails from the right area (e.g. Islay, counties Antrim, Down, and Tyrone, etc.).

geebee
04-20-2013, 09:43 AM
I think the most recent two of my Scottish-originated names are Keith and Gregg. Gregg is a variation of Greig, which is a sept of Clan Macgregor (so I'm told).

Most of my other British names are rather generic -- White, Smith, Mills, Brooks, etc.

AJL
04-20-2013, 04:16 PM
I think the most recent two of my Scottish-originated names are Keith and Gregg. Gregg is a variation of Greig, which is a sept of Clan Macgregor (so I'm told).

Very hard to say -- sometimes there's quite a bit of diversity within clans.

http://www.familytreedna.com/scottish-clan-list.aspx

http://www.familytreedna.com/public/MacGregor/default.aspx?section=yresults

You can add to that issue the fact that there were many changes in surnames in Scotland for political, linguistic, or possibly other reasons. I am connected to a "Smith" line from Scotland that was originally probably named McNockard or McNokard (a name that's now almost extinct and means "son of the smith"), or McGowan (literally "son of the blacksmith"), or possibly something else.

geebee
04-20-2013, 04:37 PM
^^That's certainly true. "Allied families" might actually be related ... but they wouldn't have to be.

I also have a Smith line, but no way of telling what the national origin of this line was. I'd say probably British, but whether that means English, Irish, Scottish, or even Welsh -- I have no idea. This is also true of another line, named White.

To further complicate matters, there are also Schmidt/Schmitt/Schmid families who simply became "Smith".

I'm sometimes convinced that half my ancestors must have been cattle thieves, tax evaders, or murderers -- they certainly don't seem to have wanted to be found ...

EDIT: I found a Y-haplogroup project at FTDNA specific to Macgregors. There actually does seem to be some consistency for Gregg, with most of those having that surname being R1b1a2.

J Man
05-30-2013, 09:14 PM
So this exercise is not really useful for us of admixed backgrounds?

AJL
05-30-2013, 09:16 PM
So this exercise is not really useful for us of admixed backgrounds?

Not very -- it's a better test of which of your backgrounds is best represented in the database.

Anglecynn
02-28-2014, 09:14 PM
When all the different sub-groups are added up, it's U106 followed closely by L21.

AJL
02-28-2014, 11:30 PM
When all the different sub-groups are added up, it's U106 followed closely by L21.

Celto-Germanic indeed!

Anglecynn
02-28-2014, 11:40 PM
Celto-Germanic indeed!

Yeah, it makes sense, I guess. :)

It's actually not too far from what population analyses show for parts of England either, so it's more reliable than I'd have expected. I think R1b was about 60% overall, and across the other family members I've tested it's between 55% and 65%, I think. So in that case I'm very much an R1b person, despite my own lineage.

Anglecynn
02-28-2014, 11:48 PM
Here's a pie chart I made:
http://i628.photobucket.com/albums/uu7/Brodir93/a8f37ccc-561b-465d-8e3e-46cea7ff9ac4.png

Humanist
03-01-2014, 12:24 AM
My virtual haplogroups:

Y-DNA: R1b1b2a
mtDNA: HV

AJL
03-01-2014, 12:39 AM
My virtual haplogroups:

Y-DNA: R1b1b2a
mtDNA: HV

Sounds right for northern Iraq or eastern Syria!

Sein
03-01-2014, 12:39 AM
Looking at my matches, this is what I get:

Y-DNA: R1a1a (in a close tie with R1b1b2, 29% R1a1a and 24% R1b1b2)
mtDNA: H1 (by far; H and its various subclades are the predominant mtDNA haplogroup among my matches)

Humanist
03-01-2014, 03:04 AM
Looking at my matches, this is what I get:

Y-DNA: R1a1a (in a close tie with R1b1b2, 29% R1a1a and 24% R1b1b2)

Thanks, Sein. I am a bit surprised by R1b1b2 coming in second for you. Did it surprise you at all, or is R1b1b2 present in Pashtuns at decent frequencies?

Stephen1986
03-01-2014, 12:52 PM
As my brother and I each have close to 2000 matches, I used the method outlined on this page - http://gentoypeme.blogspot.co.uk/2012/06/how-to-count-paternal-haplogroups-of.html.

My top 10 maternal haplogroups -

H1 140
H 131
T2b 79
H3 59
V 43
T1a1 29
H5a1 29
K1a4a1 26
U5a1a1 24
J1c2 24

My top 10 paternal haplogroups -

R1b1b2a1a2f* 190
R1b1b2a1a 86
I1* 84
R1b1b2a1a1* 61
R1a1a 47
R1b1b2a1a1 45
R1b1b2a1a2f2 42
R1b1b2a1a2f 40
R1b1b2a1a1d* 32
I1 29

Brother's top 10 maternal haplogroups -

H1 130
H 100
T2b 69
H3 49
V 38
K1a4a1 30
T1a1 28
H1c 28
H5a1 26
H1c1 23

Brother's top 10 paternal haplogroups -

R1b1b2a1a2f* 181
I1* 86
R1b1b2a1a 73
R1b1b2a1a1* 48
R1b1b2a1a1 47
R1b1b2a1a2f2 46
R1a1a 40
I1 33
R1b1b2a1a1d* 31
R1b1b2a1a2d3* 24
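
Since the lists above split closely related labels across several rows (for example R1b1b2a1a2f*, R1b1b2a1a2f2 and R1b1b2a1a2f), it can help to add subclades up under a parent label before picking a "virtual" haplogroup. Below is a rough sketch that does this by simple prefix matching on my first paternal list. The choice of parent labels is an assumption made for illustration; prefix matching only works because these older letter-and-number names are nested, and a short parent such as "R1" would swallow both R1a and R1b.

```python
from collections import Counter

# First "top 10 paternal haplogroups" list from this post.
PATERNAL = {
    "R1b1b2a1a2f*": 190,
    "R1b1b2a1a": 86,
    "I1*": 84,
    "R1b1b2a1a1*": 61,
    "R1a1a": 47,
    "R1b1b2a1a1": 45,
    "R1b1b2a1a2f2": 42,
    "R1b1b2a1a2f": 40,
    "R1b1b2a1a1d*": 32,
    "I1": 29,
}


def roll_up(counts, parents):
    """Sum counts under the most specific parent label each name starts with.

    Labels that match no parent are kept under their own name (minus any '*').
    """
    rolled = Counter()
    # Longer parents first, so the most specific matching parent wins.
    ordered = sorted(parents, key=len, reverse=True)
    for label, n in counts.items():
        base = label.rstrip("*")
        parent = next((p for p in ordered if base.startswith(p)), base)
        rolled[parent] += n
    return rolled


# Parent labels chosen for illustration only.
rolled = roll_up(PATERNAL, ["R1b1b2a1a2f", "R1b1b2a1a1", "I1"])
for label, n in rolled.most_common():
    print(label, n)
```

On this list the three R1b1b2a1a2f rows combine to 272, the R1b1b2a1a1 rows to 138, and the two I1 rows to 113, while R1b1b2a1a (86) and R1a1a (47) stay as they are.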

Sein
03-01-2014, 02:44 PM
Thanks, Sein. I am a bit surprised by R1b1b2 coming in second for you. Did it surprise you at all, or is R1b1b2 present in Pashtuns at decent frequencies?

I definitely share your thoughts on this, it was rather surprising for me. Not really sure what to make of this.

Dr_McNinja
03-01-2014, 09:59 PM
Here are the haplogroups for HRP0370, Afghan Pashtun:


Y-DNA:

T - 1
R1b - 4 (R1b1b2a1a1d, R1b1b2a1a2d3*, R1b1b2a1a2f*, R1b1b2a1a2f2)
R1a1a - 8
O3a3c1 - 1
L3 - 2
J2a1a - 2
I1 - 1
H1a - 1
G2c1 - 2
G2a3a1 - 1
F - 1
E1b1b1a2* - 1

----

mtDNA:

A4 - 1
D4 - 2 (D4g2a, D4j)
G3 - 3
H - 5
H15a - 1
H4a1 - 1
H5 - 2 (H5a1, H5b)
H7 - 1
HV - 5
I1 - 1
J1b - 2 (J1b, J1b1a)
K1a - 1
M - 1
M18 - 1
M2a1a - 1
M3 - 1
M35b - 1
M3a - 2 (M3a1, M3a2)
M53 - 2
M6 - 1
N1b1 - 1
R0a1a - 2
R2 - 4
T1 - 3 (T1, T1a1, T1a1)
T2 - 2 (T2b, T2f1)
U2b2 - 2
U2c - 1
U4 - 2 (U4, U4a2a)
U5 - 3 (U5a2b, U5a2c, U5b1b1)
U7 - 1
U8b - 1
X - 1

I barely have any matches, like 25 overall. But my highest was a 3-way tie between R1a1a, R1b1b2a1, and I1 at two each. My highest mtDNA was U5a with 3 (U5a1, U5a1b1, U5a2b) and U2b with 3 (U2b, U2b, U2b1).

Total mtDNA list: H1, H1e1, H5b, I1, I1, I3, K2, M30, M4a, M4b, N1d, R30, R5a2, U2a, U2b, U2b, U2b1, U2c, U2e, U5a1, U5a1b1, U5a2b, U7

DMXX
03-01-2014, 10:45 PM
I definitely share your thoughts on this, it was rather surprising for me. Not really sure what to make of this.

Perhaps a qualitative assessment of the R1b relatives may provide some insight into this very curious result, Sein. It's possible the over-representation of European-descent customers on 23andMe is responsible.

Sein
03-01-2014, 11:30 PM
Perhaps a qualitative assessment of the R1b relatives may provide some insight into this very curious result, Sein. It's possible the over-representation of European-descent customers on 23andMe is responsible.

An excellent point. I think that would probably explain this result.

DMXX
03-02-2014, 06:34 PM
My results based on (far) fewer relatives than yours:

"Virtual" mtDNA = T2g (3)
"Virtual" Y-DNA = E1b1b1a2* (5)


Since the above message 11 months ago, I've had quite a few new relatives crop up (n=102). The results look very different to before.

Funnily enough, it mirrors Sein's results for the time being (mtDNA H somewhere, own Y-DNA top, close second other R's).

Virtual mtDNA = U7 (7) [second place at 6 = H*]
Virtual Y-DNA = R2a (7) [second place at 5 = E1b1b1a2*, R1a1a, R1b1b2a]

Humanist
03-13-2014, 08:23 PM
My results based on (far) fewer relatives than yours:

"Virtual" Y-DNA = E1b1b1a2* (5)

As you stated, it's a direct function of the number of relatives one has. Let's see if this changes in the future.


Since the above message 11 months ago, I've had quite a few new relatives crop up (n=102). The results look very different to before.

Funnily enough, it mirrors Sein's results for the time being (mtDNA H somewhere, own Y-DNA top, close second other R's).

Virtual Y-DNA = R2a (7) [second place at 5 = E1b1b1a2*, R1a1a, R1b1b2a]

Your Y-DNA E matches may be related to what is posted below (from another thread):

A comparison of Kurdish and Assyrian Y-DNA frequencies.

Kurdish DNA frequencies gathered from the Kurdish DNA Blogspot (http://kurdishdna.blogspot.com/2013/10/kurdish-y-dna-part-x.html)*. Sources used: Gokcumen et al., 2011; Grugni et al., 2012; Cristofaro et al., 2013; 23andMe/FTDNA Kurdish results.


N=157  N=126
KRD    ASY
7%     25%   -- R1b
30%    13%   -- J2
15%    2%    -- R1a
5%     14%   -- T
16%    8%    -- E <------
11%    17%   -- J1
0%     4%    -- Q1b
3%     1%    -- L
3%     1%    -- I
7%     9%    -- G
0%     2%    -- F
3%     4%    -- R2
1%     0%    -- H
1%     0%    -- R1

Assyrians have more than 3x R1b.
Assyrians have almost 3x more T.
Assyrians have about .5x more J1 (roughly 1.5x as much).
Assyrians have more Q1b, G, F, and R2.

Kurds have more than 7x R1a.
Kurds have more than 2x J2.
Kurds have 2x more E.
Kurds have more L, I, H, and R1.

* The Malyarchuk et al. 2013 frequencies for J1 and J2 were combined, therefore, the Malyarchuk frequencies were excluded.
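
The "Nx" comparisons above can be re-derived from the table with a few lines of code. This is only a sketch: the percentages are copied from the table as posted, where they are already rounded to whole numbers, so the ratios are approximate, and a ratio is left undefined (None) whenever the other group is at 0%.

```python
# haplogroup: (Kurdish %, Assyrian %), copied from the table above
FREQ = {
    "R1b": (7, 25), "J2": (30, 13), "R1a": (15, 2), "T": (5, 14),
    "E": (16, 8), "J1": (11, 17), "Q1b": (0, 4), "L": (3, 1),
    "I": (3, 1), "G": (7, 9), "F": (0, 2), "R2": (3, 4),
    "H": (1, 0), "R1": (1, 0),
}


def ratio(a, b):
    """Return a / b, or None when b is zero and the ratio is undefined."""
    return a / b if b else None


# For each haplogroup, report how many times more common it is in the
# group where it is more frequent.
ratios = {}
for hg, (krd, asy) in FREQ.items():
    if asy > krd:
        ratios[hg] = ("ASY", ratio(asy, krd))
    elif krd > asy:
        ratios[hg] = ("KRD", ratio(krd, asy))
    else:
        ratios[hg] = ("equal", 1.0)

for hg, (group, r) in ratios.items():
    shown = "n/a (other group at 0%)" if r is None else f"{r:.2f}x"
    print(f"{hg}: higher in {group}, {shown}")
```

With these inputs, R1b comes out at about 3.6x and T at 2.8x in favour of Assyrians, J1 at about 1.5x, while R1a is 7.5x, J2 about 2.3x, and E exactly 2x in favour of Kurds.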