Haplogroup percentages for FTDNA's Ireland Y DNA Project



MitchellSince1893
11-01-2015, 06:13 AM
For those interested I went through this project to classify the haplogroups. Of the ~6000 members of this project, I only used samples that said they were born in Ireland or had "Ireland" or "Northern Ireland" in the country column, and had confirmed SNPs (those in green in the haplogroup column).

This resulted in 1675 samples.
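For anyone who wants to repeat the filtering, here is a rough Python sketch of the idea. It is only an illustration: the field names (`country`, `haplogroup`, `confirmed`) and the sample rows are invented for the example, and the green "confirmed SNP" colouring on the project page would have to be turned into a flag by hand.

```python
from collections import Counter

# Hypothetical rows: the field names and values are assumptions for
# illustration, not FTDNA's actual export format.
rows = [
    {"country": "Ireland", "haplogroup": "R-L21", "confirmed": True},
    {"country": "Northern Ireland", "haplogroup": "R-M222", "confirmed": True},
    {"country": "United States", "haplogroup": "R-L21", "confirmed": True},
    {"country": "Ireland", "haplogroup": "R-M269", "confirmed": False},
    {"country": "Ireland", "haplogroup": "I-M223", "confirmed": True},
]

IRISH = {"Ireland", "Northern Ireland"}


def keep(row):
    """Keep only rows with an Irish country entry and a confirmed SNP."""
    return row["country"] in IRISH and row["confirmed"]


def percentages(counts):
    """Convert raw haplogroup counts to percentages of the kept samples."""
    total = sum(counts.values())
    return {hg: round(100 * n / total, 1) for hg, n in counts.items()}


kept = [r for r in rows if keep(r)]
counts = Counter(r["haplogroup"] for r in kept)
print(len(kept), percentages(counts))
```

Rolling terminal SNPs up into parent haplogroups (e.g. counting M222 under L21) would need a separate lookup table, which is not shown here.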

Highlights include:
R1b = 81%
L21 = 60%
I = 13.5%
U106 = 4.8%
DF27 = 2.75%
CTS4528 = 2.1%
U152 = 2%
J = 1.6%
R1a = 1.5%
G = 1.1%

See attachment for a more detailed list

rms2
11-01-2015, 01:30 PM
Busby et al have R1b-L21 at over 70% in all of their sample locations in Ireland.

RCO
11-01-2015, 02:28 PM
What is the proportion of M222 in Ireland ?

MitchellSince1893
11-01-2015, 03:39 PM
Busby et al have R1b-L21 at over 70% in all of their sample locations in Ireland.

Don't forget to account for the 187 R1b samples classified in haplogroups above L21 i.e. M343, P25, M269, L23, L51, M412, L151/P310, P311, P312 (see the image attachment in post 1).

L21 made up 83% of the 1209 samples below P311.

187 x .83 = 155 additional L21 samples
Add 155 to the 1004 L21 samples = 1159
1159/1675 total samples = 69.2% of all samples may have been L21.
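The same arithmetic as a small Python sketch, in case anyone wants to plug in other haplogroups. The function name `extrapolate` is made up for the example, and it rests on the same assumption as the hand calculation: that the unresolved R1b samples split the same way as the resolved ones.

```python
def extrapolate(subclade_n, resolved_n, unresolved_n, total_n):
    """Estimate a subclade's share of all samples.

    Assumes the unresolved samples carry the subclade at the same rate
    as the resolved samples do (a guess, not a measured fact).
    """
    share = subclade_n / resolved_n          # e.g. L21 share below P311
    extra = round(unresolved_n * share)      # estimated hidden subclade samples
    return extra, (subclade_n + extra) / total_n


# L21: 1004 of the 1209 resolved samples, 187 unresolved, 1675 in total.
extra, fraction = extrapolate(1004, 1209, 187, 1675)
print(extra, f"{fraction:.1%}")  # 155 69.2%
```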

Busby got 73.7% overall for Ireland using the numbers you posted here http://eng.molgen.org/viewtopic.php?f=123&t=1053

Continuing this exercise with the other haplogroups below P311.
U106=5.5%. Busby=5.0%
DF27= 3.2%. Busby P312xL21/U152=6.9%
U152=2.3%. Busby=2.4%


Busby's numbers may have been closer to this project's if he had larger sample sizes per region.

As seen in the link above his sample sizes ranged from 67 to 149.
With 67 samples in West Ireland his confidence level is ~90% with a 10% margin of error.

To get a confidence level of 95% with a 5% margin of error for each region, one would need a minimum of 385 samples per region. Busby had 377 for all of Ireland.
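These figures can be checked with the usual sample-size formula for a proportion, n = z^2 * p(1 - p) / e^2, using the worst case p = 0.5. The sketch below is an illustration only: the helper names are made up, and it assumes simple random sampling from a large population, which a volunteer project or a regional study may not satisfy.

```python
import math
from statistics import NormalDist


def z_score(confidence):
    """Two-sided z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def required_n(confidence, margin, p=0.5):
    """Minimum sample size for a given margin of error (large population)."""
    z = z_score(confidence)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


def margin_of_error(n, confidence, p=0.5):
    """Margin of error for a proportion estimated from n samples."""
    return z_score(confidence) * math.sqrt(p * (1 - p) / n)


print(required_n(0.95, 0.05))                # 385
print(round(margin_of_error(67, 0.90), 3))   # roughly 0.10
```

Using p = 0.5 is the conservative choice; for a haplogroup sitting near 70% the margin would come out a little narrower.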

MitchellSince1893
11-01-2015, 03:44 PM
What is the proportion of M222 in Ireland ?

198 of the 1675 say "M222", but this doesn't include all the haplogroups below M222. It will take me a while to calculate this.

MitchellSince1893
11-01-2015, 04:28 PM
What is the proportion of M222 in Ireland ?

343 of 1675 were M222 and its branches...or about 1/3rd of all L21.

M222 may be around 23.6% of the total Ireland project when you estimate how many additional M222 are in the 187 samples above P311.
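The post does not spell out how the 23.6% was reached, so the sketch below is a guess at the method: it assumes the 187 unresolved R1b samples carry M222 at the same rate as the 1209 resolved samples. Under that assumption the numbers reproduce.

```python
# Counts taken from the posts above.
m222 = 343          # M222 and its branches
l21 = 1004          # confirmed L21
resolved = 1209     # resolved R1b samples used as the denominator
unresolved = 187    # R1b samples not tested far enough to place
total = 1675        # all samples kept from the project

share_of_l21 = m222 / l21                         # "about 1/3rd of all L21"
observed = m222 / total                           # M222 as observed
hidden = round(unresolved * m222 / resolved)      # assumed hidden M222 samples
estimated = (m222 + hidden) / total               # M222 after extrapolation

print(f"{share_of_l21:.1%} {observed:.1%} {hidden} {estimated:.1%}")
```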

Forgot to attach a link to the actual project these numbers are coming from https://www.familytreedna.com/public/IrelandHeritage?iframe=yresults

alan
11-01-2015, 07:14 PM
343 of 1675 were M222 and its branches...or about 1/3rd of all L21.

M222 may be around 23.6% of the total Ireland project when you estimate how many additional M222 are in the 187 samples above P311.

Forgot to attach a link to the actual project these numbers are coming from https://www.familytreedna.com/public/IrelandHeritage?iframe=yresults

The dominance of relatively late L21 clades in Ireland, from M222 to other clades linked to AD-period royal lines in the south of Ireland, makes me think we cannot read anything much into absolute totals of L21. If you subtract all the L21 people in Ireland who are linked to clades that are little more than 2000 years old or significantly younger, then the total nosedives. So it's impossible to know if Ireland was particularly high in L21 compared to other parts of the Isles before these clades exploded. It may not have been significantly lower, and Ireland may not have been the L21 peak before that. All we know is that the descendants of a handful of royal men in late BC or first-millennium AD Ireland probably replaced a majority of the male lines that existed before that. How many modern descendants a few royal men from 2000 years ago or less produced in Ireland might be absolutely mind-blowing once the details are known, meaning that modern percentages tell us absolutely zilch about the reality 2000+ years ago.

jdean
11-01-2015, 07:28 PM
According to Chromo2, which used people whose grandfathers came from the respective areas, the frequency for L21 is

Connacht..........73
Leinster............70
Munster............73
Ulster...............61

and M222

Connacht..........29
Leinster............18
Munster.............5
Ulster...............30

ArmandoR1b
11-01-2015, 08:06 PM
According to Chromo2, which used people whose grandfathers came from the respective areas, the frequency for L21 is

Do you know what the frequency of people from Ireland that tested with Chromo2 that are terminal for P312 is?

Heber
11-01-2015, 08:20 PM
Ireland has very high frequencies of L21 and DF13
https://www.pinterest.com/gerardcorcoran/irish-dna/
Irish Provinces have high frequencies of DF49, DF21, C4466 and L513
https://www.pinterest.com/gerardcorc...sh-dna-ulster/
https://www.pinterest.com/gerardcorc...-dna-connacht/
https://www.pinterest.com/gerardcorc...h-dna-munster/
https://www.pinterest.com/gerardcorc...-dna-leinster/

jdean
11-01-2015, 08:21 PM
Do you know what the frequency of people from Ireland that tested with Chromo2 that are terminal for P312 is?

P312*

Connacht.....6
Leinster.......8
Munster.......6
Ulster..........8

SRY2627

Connacht.....0
Leinster.......1
Munster.......1
Ulster..........2

U152

Connacht.....4
Leinster.......0
Munster.......3
Ulster..........2

Data was also given for L165 & L238 but the frequencies were too low to be recorded

rms2
11-01-2015, 08:53 PM
. . .

Busby's numbers may have been closer to this project's if he had larger sample sizes per region . . .


377 is a pretty good sample size, and Busby et al was a scientific paper with controls that one cannot get from an FTDNA project. Project members often claim ancestors they are not sure about. I've mentioned that a number of times, and I know it to be a fact from personal experience. Irish ancestry is something that is really subject to the phenomenon of wannabeism, which serves as an incentive for seeing things the way one wants rather than the way they are (I'm talking about project members and the temptation to grasp at an ancestor or pedigree that seems attractive rather than waiting for solid evidence). Other ancestries with strong wannabe factors are Viking, Germanic in general, and Norman.

FTDNA project stats are interesting and might provide some very rough ballpark info, but they are not all that reliable. Busby et al, IMO, is the best we have for the time being.

MitchellSince1893
11-01-2015, 09:50 PM
377 is a pretty good sample size, and Busby et al was a scientific paper with controls that one cannot get from an FTDNA project. Project members often claim ancestors they are not sure about. I've mentioned that a number of times, and I know it to be a fact from personal experience. Irish ancestry is something that is really subject to the phenomenon of wannabeism, which serves as an incentive for seeing things the way one wants rather than the way they are (I'm talking about project members and the temptation to grasp at an ancestor or pedigree that seems attractive rather than waiting for solid evidence). Other ancestries with strong wannabe factors are Viking, Germanic in general, and Norman.

FTDNA project stats are interesting and might provide some very rough ballpark info, but they are not all that reliable. Busby et al, IMO, is the best we have for the time being.

I would agree when you are talking about small FTDNA sample sizes. But when you have a sample size over 4 times larger than Busby's, the lack of "controls that one cannot get from an FTDNA project", the Irish "wannabeism" and the "project members often claim ancestors they are not sure about" issues are minimized.

Because the FTDNA Ireland project numbers are close to what Busby got for Ireland, I wouldn't be so quick to treat them as unreliable.

I guess we will have to agree to disagree.

MitchellSince1893
11-01-2015, 10:05 PM
According to Chromo2, which used people whose grandfathers came from the respective areas, the frequency for L21 is

Connacht..........73
Leinster............70
Munster............73
Ulster...............61

and M222

Connacht..........29
Leinster............18
Munster.............5
Ulster...............30

U152

Connacht.....4
Leinster.......0
Munster.......3
Ulster..........2


Thanks for posting.

Taking a simplistic average for the above numbers:

Chromo2 got 69.25% L21. I got 69.2% L21 from FTDNA numbers.

They got 20.5% M222. I got 23.6%.

They got 2.25% for U152. I got 2.3%.
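The "simplistic average" is just the unweighted mean of the four provincial figures, which a short Python sketch can confirm. The dictionary layout is only for illustration; the numbers are the Chromo2 percentages quoted above, in the order Connacht, Leinster, Munster, Ulster.

```python
from statistics import mean

# Chromo2 provincial percentages: Connacht, Leinster, Munster, Ulster.
chromo2 = {
    "L21": [73, 70, 73, 61],
    "M222": [29, 18, 5, 30],
    "U152": [4, 0, 3, 2],
}

# Unweighted mean: every province counts equally, whatever its population.
averages = {hg: mean(values) for hg, values in chromo2.items()}
print(averages)  # {'L21': 69.25, 'M222': 20.5, 'U152': 2.25}
```

Because the provinces differ a lot in population, a population-weighted average would give somewhat different figures, so the close match with the FTDNA numbers should be read loosely.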

ArmandoR1b
11-01-2015, 10:10 PM
P312*

Connacht.....6
Leinster.......8
Munster.......6
Ulster..........8

SRY2627

Connacht.....0
Leinster.......1
Munster.......1
Ulster..........2

U152

Connacht.....4
Leinster.......0
Munster.......3
Ulster..........2

Data was also given for L165 & L238 but the frequencies were too low to be recorded

Thanks. Do you know what the sample sizes are? Either way, it points to a serious problem with the Valverde et al. study (http://www.nature.com/ejhg/journal/vaop/ncurrent/abs/ejhg2015114a.html) which showed 17.81% S116* out of 146 people from Ireland. I really wish we knew what they did wrong.

rms2
11-01-2015, 10:25 PM
I would agree when you are talking about small FTDNA sample sizes. But when you have a sample size over 4 times larger than Busby's, the lack of "controls that one cannot get from an FTDNA project", the Irish "wannabeism" and the "project members often claim ancestors they are not sure about" issues are minimized.

I would say they are more likely to be magnified than minimized. In smaller projects it is easier for the admins to control for member errors and unreliability, but even then members will list what they want to list.



Because the FTDNA Ireland project numbers are close to what Busby got for Ireland, I wouldn't be so quick to treat them as unreliable.


Well, they differ by over 10% when it comes to L21 frequency. That's not insignificant.



I guess we will have to agree to disagree.

Yes, I guess we will. FTDNA project stats are okay, but they are not comparable to those from scientific papers like Busby et al.

MitchellSince1893
11-01-2015, 10:46 PM
...Well, they differ by over 10% when it comes to L21 frequency. That's not insignificant...

I guess you missed my post http://www.anthrogenica.com/showthread.php?5744-Haplogroup-percentages-for-FTDNA-s-Ireland-Y-DNA-Project&p=118075&viewfull=1#post118075

They differ by about 4.5% when you factor in the 187 R1b samples above P311...69.2% L21 for FTDNA vs 73.7% Busby. This 69.2% is right in line with what chromo2.0 got...69.25% for L21, and so are the numbers chromo2.0 and I got for U152 and M222. http://www.anthrogenica.com/showthread.php?5744-Haplogroup-percentages-for-FTDNA-s-Ireland-Y-DNA-Project&p=118136&viewfull=1#post118136

That was my point earlier. Busby's small regional sample sizes may account for this 4% difference. Had Busby sampled at 385 or more per region, his L21 numbers may have been closer to what chromo2.0 and I got.

I respect that based on your experience, you are providing cautionary input as to why doing what I did could lead to misleading results. My response is that larger sample sizes can minimize these concerns...again we disagree on this point.

rms2
11-01-2015, 10:58 PM
I did miss that part, but you have to assume that 83% of those are probably L21+. Maybe they are, maybe they aren't. They cannot be counted among the L21+ results.

Busby's regional sample sizes were not small, but they were undoubtedly Irish, which is more than you can say about all the members of the Ireland DNA Project.

MitchellSince1893
11-01-2015, 11:05 PM
I did miss that part, but you have to assume that 83% of those are probably L21+. Maybe they are, maybe they aren't. They cannot be counted among the L21+ results.

Busby's regional sample sizes were not small, but they were undoubtedly Irish, which is more than you can say about all the members of the Ireland DNA Project.

Statistically they were too small for the sampled populations...unless one wants a low confidence level and high margin of error. Something not desirable for accurate results.

Province Population
Leinster 2,504,814
Ulster 2,106,296
Munster 1,246,088
Connacht 542,547

Enter those populations in this calculator at 95% confidence level and 5% confidence interval
http://www.calculator.net/sample-size-calculator.html
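For those who would rather not use the web calculator, here is a Python sketch that applies a standard finite-population correction to the province populations listed above. Treat it as an approximation: the function name is made up, p = 0.5 is the worst-case assumption, and the linked calculator may round or correct slightly differently.

```python
import math
from statistics import NormalDist

# Province populations as listed in the post above.
populations = {
    "Leinster": 2_504_814,
    "Ulster": 2_106_296,
    "Munster": 1_246_088,
    "Connacht": 542_547,
}


def sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Sample size for a proportion with a finite-population correction."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z * z * p * (1 - p) / margin ** 2       # large-population size
    return math.ceil(n0 / (1 + (n0 - 1) / population))


for province, population in populations.items():
    print(province, sample_size(population))
```

With populations this large the correction barely matters, so every province lands within a sample of 385.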

rms2
11-01-2015, 11:15 PM
Statistically they were too small for the sampled populations...unless one wants a low confidence level and high margin of error. Something not desirable for accurate results.

I don't claim to be a statistician, but the smallest sample size was 67, and those were people living in Ireland who, as I recall, had four Irish grandparents. You cannot seriously be arguing that y-dna stats gleaned from FTDNA projects are comparable to those from a scientific study like Busby et al.

Early on, the R L21 and Subclades Project got a number of members who claimed ancestry in Germany, especially the Rhineland. I actually thought we were seeing evidence of a likely L21 hot spot. But then Busby et al came out showing very low L21 percentages in Germany. I was surprised, but then our German L21 results dried up, and it seems likely that some of those may not have been as German as we once believed. When I ran the Normandy Y-DNA Project I encountered all sorts of ancestral claims and pseudo-claims. There are other stories I could tell. I just take FTDNA project stats with a huge grain of salt. Even scientific papers have to be viewed critically (as does everything in life), but at least they have a better system of quality control than most FTDNA projects, which have to please members.

jdean
11-01-2015, 11:38 PM
Thanks. Do you know what the sample sizes are? Either way, it points to a serious problem with the Valverde et al. study (http://www.nature.com/ejhg/journal/vaop/ncurrent/abs/ejhg2015114a.html) which showed 17.81% S116* out of 146 people from Ireland. I really wish we knew what they did wrong.

I'm afraid not. I seem to recall Scotland DNA mentioning a number; however, if so I can't find it now, and either way that would have been for Britain and Ireland combined.

That said, bearing in mind they were able to pick up 1% frequency for E-M123 in England South East (the only stat for that SNP) and likewise for O in Leinster, I'm guessing they must have used a reasonable number of samples.

MitchellSince1893
11-02-2015, 12:51 AM
...You cannot seriously be arguing that y-dna stats gleaned from FTDNA projects are comparable to those from a scientific study like Busby et al...

You're right. I'm not arguing that. If we were comparing an FTDNA geographic project of 1600 confirmed SNP samples to a scientific study with 1600 samples, I would go with the scientific study every time.

There are two issues that we are intertwining, which is my fault: the usefulness of FTDNA data and the statistical accuracy of Busby.

Early on I attempted to show that this FTDNA data may be valid because it was comparable to Busby's results. However, my comment that the small differences between Busby and the FTDNA data may be due to Busby's small regional sample sizes took us on a tangent where you felt the need to defend Busby. I continued the tangent with additional posts and here we are.

Here's what I am arguing: An FTDNA project with a statistically significant sample size, despite all its flaws, has value/can be reliable, and it is useful to post the results for others to see.

Maybe I have misunderstood, but I hear you as saying (1) even with large sample sizes, FTDNA data will not be reliable even when it provides comparable results to other sources, e.g. the Busby and chromo2.0 numbers, (2) no effort/technique can minimize FTDNA's data limitations to make it more reliable, and (3) Busby's regional sample sizes are adequate. If this is the case then we "will have to agree to disagree".

I took the tangent of pointing out that Busby may have some limitations based on his sample sizes per region and I attempted to explain why this was the case based on the target populations in the Irish regions and the minimum sample sizes needed to get more accurate results.

Steve, if I have misunderstood you, I apologize ahead of time.

We may want to take this to pm if you want to continue our discussion further.

rms2
11-02-2015, 01:01 PM
I think stats from FTDNA projects have value, but my chief concern with them is that members are the sources of the ancestral info listed on the project web pages, and much of that info is not reliable. Too many members want to get their y-dna pedigree back across the Atlantic in the worst way, which leads them to do genealogy in the worst way.

That's what concerns me about FTDNA project data. Active administrators can encourage members to be conservative about claiming ancestors and ask them to avoid just grabbing some undocumented info from Ancestry.com or elsewhere on the internet and running with it, but one has to tread softly to avoid offending and alienating his members. It's really impossible to control bogus member info, unless one wants to try to set up his own FTDNA version of the Mayflower Society, and that would be more than a full time job.

Dubhthach
11-02-2015, 01:31 PM
377 is a pretty good sample size, and Busby et al was a scientific paper with controls that one cannot get from an FTDNA project. Project members often claim ancestors they are not sure about. I've mentioned that a number of times, and I know it to be a fact from personal experience. Irish ancestry is something that is really subject to the phenomenon of wannabeism, which serves as an incentive for seeing things the way one wants rather than the way they are (I'm talking about project members and the temptation to grasp at an ancestor or pedigree that seems attractive rather than waiting for solid evidence). Other ancestries with strong wannabe factors are Viking, Germanic in general, and Norman.

FTDNA project stats are interesting and might provide some very rough ballpark info, but they are not all that reliable. Busby et al, IMO, is the best we have for the time being.

I should note, as one of the admins of the Ireland project, that we do look for information with regard to Irish background when people put in a join request. We are quite open as long as an ancestor spent some time in Ireland (e.g. we have a good chunk of people who obviously are of Plantation stock). We do let in the odd outlier, though, particularly when they show a lot of Irish matches (we have a number of Scandinavian members, etc., who have no known Irish MDKA).

Dubhthach
11-02-2015, 01:51 PM
Our major issue, though, I would say, is that the vast bulk of members have done no SNP testing; as a result, predicted M269 is the majority group within the project. Even among those that have, there is a cohort who did Deep Clade testing before L21 was even added to it! We have a separate group within the project listed as "P312+, L21 unknown". DF27 suffers from this issue as well: many of the men in the equivalent P312+ (L21-, U152-) group are probably DF27+.

I'm gonna do an analysis using some of my standard UNIX tools later, see if numbers match up with above, just doing a copy/paste into vi editor at moment!

MacUalraig
11-02-2015, 02:45 PM
I think stats from FTDNA projects have value, but my chief concern with them is that members are the sources of the ancestral info listed on the project web pages, and much of that info is not reliable. Too many members want to get their y-dna pedigree back across the Atlantic in the worst way, which leads them to do genealogy in the worst way.

That's what concerns me about FTDNA project data. Active administrators can encourage members to be conservative about claiming ancestors and ask them to avoid just grabbing some undocumented info from Ancestry.com or elsewhere on the internet and running with it, but one has to tread softly to avoid offending and alienating his members. It's really impossible to control bogus member info, unless one wants to try to set up his own FTDNA version of the Mayflower Society, and that would be more than a full time job.

Well you could run it like the Devon DNA project ... :-)

https://www.familytreedna.com/groups/devon/about/background

"participants must have a documented paper trail to Devon on the patriline for the Y-DNA test or on the matriline line for the mtDNA test...

If your ancestor has emigrated please provide details of the emigrant ancestor documenting how you have proven the link with Devon (eg, an obituary, a newspaper article, a birth or death certificate, etc).

Please note that the project is not able to accept deep-rooted lines from America where the only connection with Devon is in the 1500s or 1600s. These pedigrees are prone to error and are very difficult to verify. There is also the possibility of a "non-paternal event" leading to the introduction of non-Devon DNA. If your surname is from a distant American line please find someone from an English line of more recent origin to represent the surname on your behalf."

rms2
11-02-2015, 07:20 PM
Well you could run it like the Devon DNA project ... :-)

https://www.familytreedna.com/groups/devon/about/background

"participants must have a documented paper trail to Devon on the patriline for the Y-DNA test or on the matriline line for the mtDNA test...

If your ancestor has emigrated please provide details of the emigrant ancestor documenting how you have proven the link with Devon (eg, an obituary, a newspaper article, a birth or death certificate, etc).

Please note that the project is not able to accept deep-rooted lines from America where the only connection with Devon is in the 1500s or 1600s. These pedigrees are prone to error and are very difficult to verify. There is also the possibility of a "non-paternal event" leading to the introduction of non-Devon DNA. If your surname is from a distant American line please find someone from an English line of more recent origin to represent the surname on your behalf."

I'm guessing that's a small project. Most admins wouldn't have time for anything that exacting. I don't imagine a project as big as the Ireland Project would be able to do it. They would be snowed under and have to quit their day jobs.