I wish MyHeritage would filter matches in known pileup regions



firemonkey
07-27-2018, 07:58 AM
Came across a match with 30 cM longest block. Chr 15 20-27 in known pileup region.

geebee
07-28-2018, 02:43 AM
If you're suggesting that any and every "pile-up" is a bad thing in terms of genealogy, you might want to take a look at this:

https://segmentology.org/2015/10/07/pile-ups/

The author suggests that some pile-ups (over 10 cM) likely reflect an overlap of true shared DNA segments that simply have come in on both sides. The example he gives is of someone who is a 2nd cousin on his mother's side and an 8th cousin on his father's side.

geebee
07-28-2018, 02:54 AM
I'll give an example of how simply ignoring pile up regions can mislead. One of my sisters has a match with someone which is largely in what has been identified as one of the pile up regions of chromosome 15. She could ignore it, but it happens that three of her siblings match the same person in the same region, but for a little over 42 cM. So does one of her nieces (my daughter), for the same amount.

In fact, the person is actually a 2nd cousin and has many, many other matching segments with all of us.

EDIT: It's also important for others to note that pile-up regions may not be the same for everyone, even when they're widespread. So don't assume a particular pile-up region will be such for you, until you actually have some evidence that it is. This seems to be true in the OP's case, but not everyone has exactly the same pile-up regions -- though some, as I noted, are fairly widespread.

firemonkey
07-28-2018, 04:37 AM
If you're suggesting that any and every "pile-up" is a bad thing in terms of genealogy, you might want to take a look at this:

https://segmentology.org/2015/10/07/pile-ups/

The author suggests that some pile-ups (over 10 cM) likely reflect an overlap of true shared DNA segments that simply have come in on both sides. The example he gives is of someone who is a 2nd cousin on his mother's side and an 8th cousin on his father's side.


Something else that really throws me: I had always been told to ignore pile-ups. Could these be relevant, then, even though they appear in a pile-up region on chr10? All are Norwegian surnames except one, and that match nonetheless has a lot of Norwegian ancestral surnames.

Name,Match Name,Chromosome,Start Location,End Location,Start RSID,End RSID,Centimorgans,SNPs
Gatty,All selected DNA Matches,10,45573645,53871455,rs11239352,rs11594206,12,3072
Gatty,Slettum,10,44705019,54061937,rs7906325,rs192363474,13,3712
Gatty,Jacobsen,10,44705019,53871455,rs7906325,rs11594206,12.6,3584
Gatty,Svensen,10,44705019,54501717,rs7906325,rs1975458,13.8,3968
Gatty,Jørgensen,10,44870705,54061937,rs17881412,rs192363474,12.8,3584
Gatty,Wynne,10,44705019,54303449,rs7906325,rs10762771,13.3,3840


Ethnicities in common

Slettum - Scandinavian
Jacobsen - Scandinavian
Svensen - Scandinavian, Finnish
Jørgensen - Irish, Scottish and Welsh, Scandinavian
Wynne - Irish, Scottish and Welsh, Scandinavian

So the common link I have with all of them is Scandinavian.

I had all but dismissed these but in light of the link you posted I'm not so sure now.
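
As a rough way to see how much of chromosome 10 all five matches cover in common, here is a minimal Python sketch. The coordinates are copied from the table above; the function name `common_overlap` is made up for illustration and is not part of any MyHeritage export or tool.

```python
# Segment rows copied from the table above: (match name, start, end), chromosome 10.
SEGMENTS = [
    ("Slettum", 44705019, 54061937),
    ("Jacobsen", 44705019, 53871455),
    ("Svensen", 44705019, 54501717),
    ("Jørgensen", 44870705, 54061937),
    ("Wynne", 44705019, 54303449),
]


def common_overlap(segments):
    """Return (start, end) of the region covered by every segment, or None."""
    start = max(seg_start for _, seg_start, _ in segments)
    end = min(seg_end for _, _, seg_end in segments)
    return (start, end) if start < end else None


print(common_overlap(SEGMENTS))  # (44870705, 53871455)
```

Overlap in position alone does not show that the matches also match each other, so this only narrows down where to look.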

msmarjoribanks
07-28-2018, 02:47 PM
How close are those matches? cMs, I mean. Format makes it hard to tell.

I think if you check common matches and don't find any you know, and it's a very long list or known pile-up regions are involved, then you can probably assume it's not real. If there were some other evidence of a connection, it might be different.

It's easier for me because I have both parents on MyHeritage, so I can see whether the match matches them. If not, I assume it's a false match. (False matches often happen because the person has small matching sections with both parents that are too small to show up as matches of theirs.) geebee's example matches on both sides, but would also be a 2nd cousin once removed on one side, which should show up as a significant match with far more than 30 cM anyway (although obviously chance means not always).

firemonkey
07-28-2018, 05:29 PM
Match  Shared cM  Shared segments  Largest segment (cM)
S      13.0       1                13.0
J      12.6       1                12.6
S      13.8       1                13.8
J      12.8       1                12.8
W      13.3       1                13.3

msmarjoribanks
07-28-2018, 09:31 PM
Yeah, probably super ancient or not real.

firemonkey
07-29-2018, 12:49 PM
Blaine compared his matches to those of both of his parents. The matches that he has that neither parent has are false positives. He wrote up his results in the blog post "The Danger of Distant Matches". His results, using AncestryDNA data and the DNAgedcom Match-O-Match tool:

Above 15 cM, a match is 99.3% likely to be a real match, a match shared with either or both parents
Above 10 cM, a match is 94% likely to be a match shared with either or both parents (This statistic includes the range above 15 cM.)
Below 10 cM, a match is 59% likely to be a match shared with either or both parents
Below 7 cM, a match is 40% likely to be a match shared with either or both parents, so more likely to be a false positive, not really family


https://lianejensenresearch.com/2017/02/22/how-to-tell-when-dna-matches-might-be-false-positives/
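
For quick reference, the figures quoted above can be put into a small lookup. This is only a sketch: the function name is hypothetical, the handling of values at exactly 7, 10, and 15 cM is my assumption (the quoted text only says "above" and "below"), and the numbers come from one person's AncestryDNA results, so they may not carry over to other people or other companies.

```python
def quoted_parent_share_rate(cm):
    """Return the quoted chance that a match of this size is shared with a parent.

    The bands are the ones quoted above. The 94% figure covers everything
    above 10 cM, including matches above 15 cM, so it overstates the rate
    for the 10-15 cM band on its own.
    """
    if cm > 15:
        return 0.993
    if cm > 10:
        return 0.94
    if cm >= 7:  # assumption: "below 10" applies down to 7 cM
        return 0.59
    return 0.40


for size in (30.0, 13.8, 8.0, 6.0):
    print(size, quoted_parent_share_rate(size))
```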

If anyone has any other links to IBD rates <7 >7 >8 >9 >10 >11 >12 >13 >14 >15 please post them.

msmarjoribanks
07-29-2018, 02:25 PM
Blaine's piece: https://thegeneticgenealogist.com/2017/01/06/the-danger-of-distant-matches/

Problem is that it's AncestryDNA rates. I'm not sure it will be consistent.

Using Blaine's methodology, I went through my top 5 Norwegian matches:

29.8 cM -- not shared with either parent
19.6 cM -- shared with my mom (but only 9.9 cM) and some matches that seem to be on my Swedish side
13.8 cM -- shared with both parents (11.7 cM and 10.2 cM)
13.7 cM -- not shared with either parent, but distant connection to 13.8 above
12.9 cM -- shared with my mom (12.5) and 13.8 cM above

These could be really distant matches, but they clearly aren't as close as it's presented, and I tend to think they are all false matches.
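
The check above can be sketched in a few lines of Python. The data structure and function name are assumptions for illustration; the cM values are the five listed above. A match missing from both parents' lists is only a candidate false positive, since a parent's own match could also have gone unreported.

```python
# Each entry: (cM shared with the child, list of cM shared with each parent who matches).
MATCHES = [
    (29.8, []),            # not shared with either parent
    (19.6, [9.9]),         # shared with one parent
    (13.8, [11.7, 10.2]),  # shared with both parents
    (13.7, []),            # not shared with either parent
    (12.9, [12.5]),        # shared with one parent
]


def candidate_false_positives(matches):
    """Return the child's cM totals for matches that neither parent shares."""
    return [child_cm for child_cm, parent_cms in matches if not parent_cms]


print(candidate_false_positives(MATCHES))  # [29.8, 13.7]
```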

geebee
08-06-2018, 01:36 PM
If you're looking at Ancestry, it may be impossible to know if a failure to match one parent or the other represents a false positive for the offspring. It might represent a false negative for one of the parents. (Possibly due to Timber.)

I hadn't personally seen this phenomenon of neither parent showing up as someone's shared match. First, because I've spent most of my time looking at my own data with 1000+ "4th cousins or closer". I haven't really looked much at "Distant Cousins". Second, I don't have any tested parents.

However, my daughter does. I manage the files for my daughter, her mother, and me within my account at Ancestry. So I decided to go looking through her matches, beginning with "Distant Cousins".

I only had to go as far as the third of these, who is someone who reportedly shares 20.0 cM with my daughter. Neither my wife nor I were reported as a shared match. But is this a false positive for my daughter? Well, although I'm not one of the shared matches, all four of the shared matches listed happen to be shared matches between my daughter and me.

My suspicion is that this person is not a false positive for my daughter, but a false negative for me.

EDIT: I should also add that among "4th cousins or closer", I have one or two relatives who have the same 2nd cousin once removed (on my side) as a shared match. But they don't have his father, who is my 2nd cousin.

It's possible that we're somehow related through my 2nd cousin once removed's mother, but I doubt it. I think it's more likely that Timber has nixed calling a match with the 2nd cousin, but did call a match with the 2nd cousin's son.

Bottom line: Timber causes Ancestry to fail to report some actual matches, which to me is worse than reporting some false matches. If Timber sees a reason a match should not be reported, then the match should instead be "flagged" in some way, but still reported.
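
To make the "flag it, don't hide it" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the 50% overlap threshold, and the three example matches are invented. The one region used is the chr 15 20-27 range from the opening post, and treating those numbers as megabases is an assumption. This is not how Timber or any vendor filter actually works.

```python
def overlap_fraction(seg_start, seg_end, region_start, region_end):
    """Fraction of the segment's length that lies inside the region."""
    overlap = min(seg_end, region_end) - max(seg_start, region_start)
    return max(0.0, overlap) / (seg_end - seg_start)


def flag_matches(matches, regions, threshold=0.5):
    """Return every match, adding a 'pileup_flag' key instead of dropping any."""
    flagged = []
    for match in matches:
        in_pileup = any(
            match["chrom"] == chrom
            and overlap_fraction(match["start"], match["end"], start, end) >= threshold
            for chrom, start, end in regions
        )
        flagged.append({**match, "pileup_flag": in_pileup})
    return flagged


# Assumed units: megabases. The region is taken from the opening post.
REGIONS = [(15, 20.0, 27.0)]
# Invented example matches, for illustration only.
EXAMPLE = [
    {"name": "A", "chrom": 15, "start": 21.0, "end": 26.0},
    {"name": "B", "chrom": 15, "start": 25.0, "end": 45.0},
    {"name": "C", "chrom": 10, "start": 21.0, "end": 26.0},
]

for row in flag_matches(EXAMPLE, REGIONS):
    print(row["name"], row["pileup_flag"])
```

All three matches stay in the list; only the first is flagged, so the customer can still look at it and decide.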

geebee
08-06-2018, 02:28 PM
Personally, I think excessive filtering can sometimes do more harm than no filter at all; but a filter along the lines of MyHeritage's other filters, which can be turned on or off, might not be bad.

I've found it useful, sometimes, to adjust some of the parameters (especially SNP count) when using GEDmatch. For example, I have a match at GEDmatch who's reported to share 38.4 cM with me on the autosomes, and 19.1 cM on the X chromosome. But the one-to-one X-DNA Comparison showed no matching.

I adjusted the minimum SNP count required to 500 -- which is not unreasonable for two males. (By default, GEDmatch uses 1000 -- which is greater than the default used with autosomes.) This time, the result was 19.0 cM with 637 matching SNPs.
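
A toy model of why the segment appeared only after the change, in Python. This is not GEDmatch's actual algorithm, which I don't know in detail; it treats the minimum SNP count as a simple cut-off on an already detected segment, and the function name is made up.

```python
def reported_segments(segments, min_snps):
    """Keep only (cM, SNP count) segments that meet the minimum SNP count."""
    return [(cm, snps) for cm, snps in segments if snps >= min_snps]


# The X-chromosome segment described above: 19.0 cM with 637 matching SNPs.
X_SEGMENTS = [(19.0, 637)]

print(reported_segments(X_SEGMENTS, min_snps=1000))  # []
print(reported_segments(X_SEGMENTS, min_snps=500))   # [(19.0, 637)]
```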

I can tell from the location of the match that it's on my maternal grandmother's mother's side. It's within the 20 cM region that is all I share with my grandmother on the X chromosome -- all the rest is shared with my grandfather. I know my grandmother inherited this 20 cM from her mother, because I have several half 2nd cousins on her mother's side but not her father's. (Grandchildren of one or another of my grandmother's three half sisters.)

Based on his shared matching with one of those half 2nd cousins, he is that cousin's full sibling. So he also is a half 2nd cousin to me. The amount of sharing is low for the relationship, but within the range. (I share somewhat more DNA with his sister, but still a bit below average for half 2nd cousins.) This relative also belongs to the same mtDNA haplogroup I do, H1bg. That's exactly as you'd expect since we have the same person in both our mtDNA lines.

msmarjoribanks
08-06-2018, 04:37 PM
If you're looking at Ancestry, it may be impossible to know if a failure to match one parent or the other represents a false positive for the offspring. It might represent a false negative for one of the parents. (Possibly due to Timber.)

I was looking at MyHeritage, as the first post was talking about MyHeritage and you can easily filter by location there. That was my point: Blaine's numbers were for Ancestry, which tries to filter out the kinds of matches we were seeing at MyHeritage, so I don't think his numbers would translate to MyHeritage. I think it is likely to have more false matches and, especially, more false matches when you are looking at people in (and from) countries where you have no ancestry that is even somewhat recent.

I am not that bothered by Timber, but my issue with Ancestry is no chromosome browser. I see no reason they shouldn't add one.


I hadn't personally seen this phenomenon of neither parent showing up as someone's shared match. First, because I've spent most of my time looking at my own data with 1000+ "4th cousins or closer". I haven't really looked much at "Distant Cousins", and even if I chose to do so, I don't have two tested parents. (Or even one.)

The "closest" of these matches -- to whom neither of my parents matches -- is labeled 3rd to 5th. This is why I think you have to be careful.

I normally focus on closer ones too, as there is still more value in them for me, and I have lots and lots. Not everyone has that luxury. I looked at these because we were talking about Norwegian matches. I've looked at other smaller ones because someone emails me or I find a surname or location intriguing.

Based on the shared matches and other indications, yes, I'm reasonably certain it's a false match.

I have other similar-sized or even larger matches on other sites that I think may well be false matches or, at least, matches that are much, much farther back than the size of the match indicates. Unfortunately I have my parents on FTDNA and MyHeritage and Gedmatch, but not Ancestry and 23andMe, so it's harder to compare.

Especially if your parents have similar or overlapping ancestral backgrounds, it's not that surprising that some combination of tiny matches with both and just general IBD could end up looking like a closer match than it is.

What I find intriguing is that people of largely British Isles ancestry are seeing this with Norwegian matches (in my case I suspect my mom being a quarter Swedish plays a role too, but the matches seem to be as much with my dad, who is basically British Isles).

geebee
08-06-2018, 07:29 PM
I was looking at MyHeritage, as the first post was talking about MyHeritage and you can easily filter by location there. That was my point: Blaine's numbers were for Ancestry, which tries to filter out the kinds of matches we were seeing at MyHeritage, so I don't think his numbers would translate to MyHeritage. I think it is likely to have more false matches and, especially, more false matches when you are looking at people in (and from) countries where you have no ancestry that is even somewhat recent.

I am not that bothered by Timber, but my issue with Ancestry is no chromosome browser. I see no reason they shouldn't add one.

Two points here. First, while you were looking at MyHeritage, as you said, you did also refer to Ancestry in your answer, which is why I responded.

The second is, you may not be "bothered" by Timber, but I think you should be. I've seen multiple instances in which a person who is clearly a DNA relative (even using Ancestry's own files) is not reported as one. However, I didn't see those instances at Ancestry. Why not? Because for some reason -- I presume Timber -- the match was suppressed, even though the data files show it. I know this, because a number of these matches are visible at GEDmatch.

The single exception I've seen, so far, to Ancestry not showing a match is an indirect one. Ancestry shows a match -- as I discussed -- between my daughter and someone. But neither parent is a shared match.

However, as I've noted all four of the shared matches between my daughter and this individual do show me as a shared match. Not only am I a shared match to all four, but in two cases both of my daughter's 1st cousins at Ancestry are shared matches. Each of these cousins is a child of a different one of my sisters.

So, would you think it's more likely or less likely that this match of my daughter's should also appear in my DNA match list? If Timber is the reason the match isn't being shown in my case, I think that's a problem.


The "closest" of these matches -- to whom neither of my parents matches -- is labeled 3rd to 5th. This is why I think you have to be careful.

I normally focus on closer ones too, as there is still more value in them for me, and I have lots and lots. Not everyone has that luxury. I looked at these because we were talking about Norwegian matches. I've looked at other smaller ones because someone emails me or I find a surname or location intriguing.

Based on the shared matches and other indications, yes, I'm reasonably certain it's a false match.

I don't see where you were able to read what I wrote as saying that caution should not be used. But, you can't use caution in investigating possible matches that you don't even get to see. Your match might well be a false match. But would you actually rather have it suppressed at the possible expense of also suppressing some genuine matches?


I have other similar-sized or even larger matches on other sites that I think may well be false matches or, at least, matches that are much, much farther back than the size of the match indicates. Unfortunately I have my parents on FTDNA and MyHeritage and Gedmatch, but not Ancestry and 23andMe, so it's harder to compare.

You and I are in somewhat the same boat here, in that my father tested at 23andMe only. Unfortunately, my mother passed away in 2002 so she's never been tested at all. Fortunately, with five tested siblings at 23andMe, it's generally possible to identify matches across her entire genome.


Especially if your parents have similar or overlapping ancestral backgrounds, it's not that surprising that some combination of tiny matches with both and just general IBD could end up looking like a closer match than it is.

What I find intriguing is that people of largely British Isles ancestry are seeing this with Norwegian matches (in my case I suspect my mom being a quarter Swedish plays a role too, but the matches seem to be as much with my dad, who is basically British Isles).

My parents, too, have some overlapping ancestries, including some from the British Isles. However, my father's British ancestors were almost exclusively Scots-Irish, where my mother's British ancestors were mostly English. She also has ancestry from southern Ireland, which my father does not.

Both have ancestry from Germany, but these ancestors mostly immigrated from different regions at somewhat different times. My father's German ancestry leans more toward southern Germany and Austria; my mother's is more Alsatian. As I'm sure you know, Alsace-Lorraine is now French.

Besides this, my mother has a fair amount of ancestry my father doesn't: Spanish, French, and Native American among them.

But the bottom line is, I don't see much by way of "pile ups", for example. Not even in areas that are supposedly notorious for pile ups. I think that's because pile ups are very affected by a person's general ancestral background. Even to the extent that my parents have some overlapping ancestries, they don't seem to overlap that much.

(And my father's family was all in Pennsylvania since colonial days, my mother's was mostly in the part of the Deep South heavily influenced by the French and Spanish.)

Regardless, the fact that the files of my father and all six of his and my mother's offspring have been uploaded to GEDmatch does make it possible to make some comparisons with folks not at 23andMe, or not sharing there. This is one way that I was able to determine that a person who's in a couple of my DNA Circles does share DNA with me, even though Ancestry reports her only as a match to the circle. She's nowhere in my match list.

Admittedly, in this case the amount of DNA shared is pretty small. GEDmatch says it's only about 15.6 cM in two segments. Only the larger of the two, which is 10.1 cM, is shown in one-to-one matching using the defaults. Normally, I'd ignore such matches.

The thing is, looking at her match list reveals many of the same names as on my match list -- which is unsurprising, given that we're in a couple of the same DNA Circles. But her match list includes all of my siblings, my 2nd cousin, and lots of other 3rd cousins or closer.

One of my sisters, for example, shows two shared matches. One is on chromosome 1, and is 23.5 cM; and one is on chromosome 8, and is 33.0 cM. Another sibling has three shared matches, ranging from 17.9 cM to 24.5 cM. Two of these are in the same location as for my sister, but one is different.

Among the six of us, I believe we have about 7 shared DNA segments, with mine being the smallest. But, I would never have found this at Ancestry -- since none of my siblings has tested there. Since we are members of a couple of the same DNA Circles, it doesn't matter that much. I'm still able to see how we may be related.

One thing we definitely agree on -- though it's extremely unlikely to happen under the current CEO -- is that a chromosome browser would be more useful than just about anything else they could offer. This would make it possible to "test" possible false positives through triangulation.

But how would you use triangulation to test possible false negatives, if they're suppressed?

msmarjoribanks
08-06-2018, 08:23 PM
Two points here. First, while you were looking at MyHeritage, as you said, you did also refer to Ancestry in your answer, which is why I responded.

I was trying to contrast Ancestry -- as someone else had brought up Ancestry not having many false matches above a certain cM -- and MyHeritage, where I think false matches are more common even at higher cM. The point was that Ancestry won't be as likely to have false matches (although I'm sure it still does at lower levels, as Blaine noted) as sites like MyHeritage that don't filter, so you (general you, not you specifically) have to be more careful and be aware that a match at MyHeritage might not mean much.


The second is, you may not be "bothered" by Timber, but I think you should be. I've seen multiple instances in which a person who is clearly a DNA relative (even using Ancestry's own files) is not reported as one. However, I didn't see those instances at Ancestry. Why not? Because for some reason -- I presume Timber -- the match was suppressed, even though the data files show it. I know this, because a number of these matches are visible at GEDmatch.

I'm not saying you shouldn't be bothered, but that I am not, because I think not having the level of false matches I see at MyHeritage has its positives too. IMO, it's not that big a deal not to see some tiny match that Ancestry makes it nearly impossible to confirm anyway (that's a separate issue and one that does bother me), and I'm not yet convinced that Ancestry blocks significant matches. Also, Ancestry has other tools, like circles, and if someone turns up as a match to a circle I think that's enough evidence to do more research (and to think that they might also be a match to you, and likely a cousin even if they are not, if they match the right people).

But like I said, the problem with Ancestry, and what interferes with its usefulness for me, is the lack of a chromosome browser.

geebee
08-06-2018, 08:38 PM
I would agree that MyHeritage matches, specifically, may require a higher level of scrutiny. This is the only place, to my knowledge, that even I have seen a few rather distant "Norwegian" matches. Nothing near the size that you've reported. It's possible that your parents have more applicable overlapping ancestry than mine do.

So again, I am not saying don't use caution. I'm saying you can't do any sort of comparisons on matches that a company detects by standard methods, but then hides from you to avoid showing you a false positive. The reality is, they have no way of knowing whether the match is or is not a false positive.

Why not let the customer determine the validity of an apparent match?