Guessing original mtDNA haplogroup when one mutation reverted to ancestral state



Shadogowah
03-13-2018, 09:49 AM
Hello all,

First of all, excuse me for my lack of knowledge in the field of genetics. I am trying to learn as fast as I can.

Some time ago I took an mtDNA Full Sequence test with FTDNA and they labeled my kit as H3-T152C!

I learnt that this means that the base at position 152 has reverted to its ancestral state. Indeed, among the HVR2 differences from rCRS I see a "C" at position 152 where the rCRS has a "T"...

However, I read in the forums that this position is rather "volatile" and mutates frequently. I was wondering what stops me from resetting this value back to T, and whether I would then get the "correct" haplogroup of my maternal lineage. Would that be correct?

Jenny
03-13-2018, 01:22 PM
I've never heard of resetting your values, but since we know that FTDNA is the leader in mtDNA testing, I would check back there frequently. You might also check the Facebook group about mtDNA. It's run by a friend of mine who works at FamilyTreeDNA, and she's very knowledgeable.

C J Wyatt III
03-13-2018, 04:19 PM
I can't speak to this situation, but I am very dubious of people resetting their own data.

Jack

Shadogowah
03-13-2018, 07:00 PM
[QUOTE=C J Wyatt III]I can't speak to this situation, but I am very dubious of people resetting their own data.

Jack[/QUOTE]

I am not sure if you understood what I asked.

Shadogowah
03-13-2018, 07:15 PM
What I am asking has nothing to do with FTDNA. They just provided me with my mutations, but the algorithm could not place me in any branch below H3 because of that naughty 152. My question is whether, knowing that 152 is a tricky position, I can run the data through a free online predictor with a T at position 152 instead of a C and 1) get something sensible, and 2) to what extent I can assume that the "fixed" prediction was indeed the haplogroup of the maternal ancestor I had just before the mutation at 152 happened in one of her daughters.

I'll try anyway.

Osiris
03-13-2018, 08:10 PM
I had a good time confirming all my full mito results from FTDNA, although I always stuck to the RSRS version (there's a zip file on this page: http://www.phylotree.org/tree/index.htm). I went to PhyloTree, downloaded the latest version of the tree, pasted it as text into an Excel spreadsheet, and placed all my known mutations and differences into the spreadsheet too. Next I did a CTRL+H, replacing each mutation with the same value but changing the cell color to green. Once I did that, it was pretty easy to look at each level in the tree, see which one was the likely correct one, then hide the branches I knew I didn't have.

This is my step father.
[Attachment 22087]

I've never proved FTDNA wrong for the 5 kits I've done, but there are some mutations that they don't list. For instance, I'm H3h, which is past the C182T to T182C!, C16311T to T16311C! and G15301A to A15301G! mutations, so none of them appear in my mutation list, and I like knowing they happened.
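
To illustrate why a forward mutation followed by its back mutation drops out of a reported list, here is a minimal sketch. It assumes a lineage can be written as an ordered list of labels such as "C182T" and "T182C!" (the "!" marking a back mutation). The function name `net_differences`, the ordering of the path, and the private mutation "A9999G" are made up for the example and are not taken from anyone's real results.

```python
import re

# Labels look like "C182T" or "T182C!" ("!" marks a back mutation).
MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])(!*)$")

def net_differences(path):
    """Walk mutations from root to tip and return only the net changes."""
    original = {}  # position -> base before the first change on this path
    current = {}   # position -> base after the latest change on this path
    for label in path:
        m = MUTATION.match(label)
        if m is None:
            raise ValueError(f"unsupported mutation label: {label}")
        old, pos, new = m.group(1), int(m.group(2)), m.group(3)
        original.setdefault(pos, old)
        current[pos] = new
    # A position that ends where it started cancels out and is not reported.
    return [
        f"{original[p]}{p}{current[p]}"
        for p in sorted(current)
        if original[p] != current[p]
    ]

# Hypothetical path: three forward/back pairs plus one made-up private mutation.
path = ["C182T", "G15301A", "C16311T",
        "T182C!", "A15301G!", "T16311C!",
        "A9999G"]
print(net_differences(path))
```

Only the made-up private mutation survives, which is why walking the tree by hand is the way to see the pairs that cancelled.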

Judith
03-13-2018, 08:13 PM
If I have understood your question right, what you are IS H3, but you have a back mutation at 152 (shown as -), so plain H3 would be your HG if you did not have the back mutation. Yes, H3 formed many thousands of years ago, and some people have, say, 4 mutations added to that and some have zero. You have one back mutation, i.e. one mutation, if I understand you correctly.
But 152 is more volatile and is about the second fastest-moving position. Try Soares et al. https://www.sciencedirect.com/science/article/pii/S0002929709001633 to see it in comparison to other positions in Figure 1.

Do you have any exact matches (GD=0)? If so, together you define a new sub-clade, which may be incorporated into the mtDNA PhyloTree when it is next re-run, if you have all uploaded to GenBank (it needs 3 examples). But because 152 is fast-moving, it may be rejected as defining a sub-clade if there is a more stable and clearer option.
You can edit your file in a text application and play with what group you get too.
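
As a rough sketch of that kind of edit, assuming the results are written as a list of differences from rCRS: dropping the entry at a position is the same as saying "identical to the reference here", which for 152 means T. The helper name `revert_position` and the sample list are illustrative only, not real kit data.

```python
import re

# Substitution labels such as "T152C", "152C" or "T152C!"; insertions like
# "315.1C" do not match and are always kept.
SUBSTITUTION = re.compile(r"^[ACGT]?(\d+)[ACGT]!*$")

def revert_position(differences, position):
    """Return a copy of the list without any substitution at `position`."""
    kept = []
    for label in differences:
        m = SUBSTITUTION.match(label)
        if m and int(m.group(1)) == position:
            continue  # dropped entry = same base as the reference at this position
        kept.append(label)
    return kept

# Illustrative list of differences from rCRS (not anyone's actual results).
sample = ["T152C", "A263G", "315.1C"]
print(revert_position(sample, 152))
```

The edited list can then be submitted to a predictor alongside the untouched one, so the two outcomes can be compared side by side.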

C J Wyatt III
03-13-2018, 09:06 PM
[QUOTE=Shadogowah]I am not sure if you understood what I asked.[/QUOTE]

Probably not.

Jack

Kale
03-14-2018, 03:58 AM
I suppose you could if you were running into problems getting a proper assignment, but I doubt that is the case. I uploaded my mtDNA to https://dna.jameslick.com/mthap/ to get an assignment (I figured 23andMe was being overconservative, as they sometimes are with Y-lines), and despite my having a flip at a diagnostic position, the site was still able to figure out what mtHG I was.

Shadogowah
03-14-2018, 01:40 PM
[QUOTE=Judith]If I have understood your question right, what you are IS H3, but you have a back mutation at 152 (shown as -), so plain H3 would be your HG if you did not have the back mutation. Yes, H3 formed many thousands of years ago, and some people have, say, 4 mutations added to that and some have zero. You have one back mutation, i.e. one mutation, if I understand you correctly.
But 152 is more volatile and is about the second fastest-moving position. Try Soares et al. https://www.sciencedirect.com/science/article/pii/S0002929709001633 to see it in comparison to other positions in Figure 1.

Do you have any exact matches (GD=0)? If so, together you define a new sub-clade, which may be incorporated into the mtDNA PhyloTree when it is next re-run, if you have all uploaded to GenBank (it needs 3 examples). But because 152 is fast-moving, it may be rejected as defining a sub-clade if there is a more stable and clearer option.
You can edit your file in a text application and play with what group you get too.[/QUOTE]

Thanks Judith.

My understanding was that there are no "pure" modal H3 people anymore, because this is a very old mutation and everybody classified as H3 also has other mutations that define more recent lineages below H3. Only the ancestral mother of all of us was H3, and all her offspring accumulated extra mutations over time. I thought that my classification as H3 had to be interpreted as "the most we can tell" because the back mutation at 152 somehow introduced ambiguity, and as far as I know haplogroup classifications based on SNPs always play safe and do not "predict" based on statistics and assumptions.

What I understand from what you say is that my identified markers are just modal H3 plus the back mutation, and if I ignored the back mutation I would be just plain H3... Is that correct?

I do indeed have 20 matches also classified as H3-T152C!, but all of them at GD = 3, so this sort of contradicts what I just interpreted above, doesn't it?

Shadogowah
03-14-2018, 01:47 PM
[QUOTE=Osiris]I had a good time confirming all my full mito results from FTDNA, although I always stuck to the RSRS version (there's a zip file on this page: http://www.phylotree.org/tree/index.htm). I went to PhyloTree, downloaded the latest version of the tree, pasted it as text into an Excel spreadsheet, and placed all my known mutations and differences into the spreadsheet too. Next I did a CTRL+H, replacing each mutation with the same value but changing the cell color to green. Once I did that, it was pretty easy to look at each level in the tree, see which one was the likely correct one, then hide the branches I knew I didn't have.

This is my step father.
[Attachment 22087]

I've never proved FTDNA wrong for the 5 kits I've done, but there are some mutations that they don't list. For instance, I'm H3h, which is past the C182T to T182C!, C16311T to T16311C! and G15301A to A15301G! mutations, so none of them appear in my mutation list, and I like knowing they happened.[/QUOTE]

I am not sure I understood everything, but you are telling me that in some cases the back mutation is not an obstacle to building the branch below, once you find enough samples that match on other SNPs, and that in my case I am sitting on some sort of substitutes' bench, waiting, because I have no closer matches to identify a new branch. Is that right?

Shadogowah
03-14-2018, 01:48 PM
[QUOTE=Kale]I suppose you could if you were running into problems getting a proper assignment, but I doubt that is the case. I uploaded my mtDNA to https://dna.jameslick.com/mthap/ to get an assignment (I figured 23andMe was being overconservative, as they sometimes are with Y-lines), and despite my having a flip at a diagnostic position, the site was still able to figure out what mtHG I was.[/QUOTE]

Hello Kale. Do you mean that the predictor gave you the same result you already had?

Shadogowah
03-14-2018, 02:02 PM
Well, for any of you who might be curious, the predictor run on my results does place H3-T152C! as the most suitable candidate. It considers several "children" below H3 as imperfect matches.

However, it also suggests H22 and H24 as suitable candidates.

Shadogowah
03-14-2018, 02:11 PM
The altered file with position 152 reverted to T again predicts H3, but now it also suggests H20 and H24 with the same confidence. So I think my initial idea was right: the problem is that the back mutation introduces uncertainty about which haplogroup I belong to, either an undiscovered branch of H3 or one below H20, H22 or H24.

Kale
03-14-2018, 06:41 PM
[QUOTE=Shadogowah]Hello Kale. Do you mean that the predictor gave you the same result you already had?[/QUOTE]

It basically said yes, that assignment was correct, and here is why (showing positive, negative, and no-call markers). As it turns out, 23andMe just didn't test very deep, so all markers diagnostic for subclades were untested.


[QUOTE=Shadogowah]The altered file with position 152 reverted to T again predicts H3, but now it also suggests H20 and H24 with the same confidence. So I think my initial idea was right: the problem is that the back mutation introduces uncertainty about which haplogroup I belong to, either an undiscovered branch of H3 or one below H20, H22 or H24.[/QUOTE]

I have a similar thing going on with my Y, kind of frustrating. Positive for the hugely common branch...
R1b1a1a2a1a2b1 (L2), but absolutely nothing downstream. On the other hand, I'm also positive for...
R1b1a1a2a1a2b3b (S47), but 23andme doesn't test for...
R1b1a1a2a1a2b3, or anything downstream of S47.
So either I'm an uncommon basal lineage of the common branch R-L2*, or a regular member of the uncommon branch R-S47.
One of them has to be wrong, unless of course I have two daddies O_o

Osiris
03-14-2018, 06:56 PM
[QUOTE=Shadogowah]I am not sure I understood everything, but you are telling me that in some cases the back mutation is not an obstacle to building the branch below, once you find enough samples that match on other SNPs, and that in my case I am sitting on some sort of substitutes' bench, waiting, because I have no closer matches to identify a new branch. Is that right?[/QUOTE]
Correct, back mutations are common. The mtDNA tree is not that complicated, and it's fairly easy to review your own data against the tree. I personally found it worthwhile because I could see all the back mutations that affected my line.

But I agree with the others that you are almost certainly H3-T152C! and that eventually you'd find other people under that node with mutations that match yours and will form a new branch someday. Depending on what your extra mutations are of course.

Judith
03-17-2018, 12:28 PM
[QUOTE=Shadogowah;363771]

What I understand from what you say is that my identified markers are just modal H3 plus the back mutation, and if I ignored the back mutation I would be just plain H3... Is that correct?

I do indeed have 20 matches also classified as H3-T152C!, but all of them at GD = 3, so this sort of contradicts what I just interpreted above, doesn't it?[/QUOTE]

I understand a little better now! There is a HUGE range in mutation rate, which dominates our instinctive understanding of the mutations. The 50% likelihood of a mutation (on average) is one per 40 generations, i.e. about 1000 years, but some lineages have many and some have few. The value comes from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367903/#!po=18.0556
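
To get a feel for that spread, here is a small sketch. It reads the figure above as an average rate of one mutation per 1000 years and assumes mutations arrive independently, i.e. as a Poisson process. Both are simplifying assumptions made for illustration and are not taken from the linked paper, and 25 years per generation is simply what makes 40 generations come to 1000 years.

```python
import math

YEARS_PER_GENERATION = 25       # assumption: 40 generations ~ 1000 years
GENERATIONS_PER_MUTATION = 40   # figure quoted in the post above
years_per_mutation = YEARS_PER_GENERATION * GENERATIONS_PER_MUTATION

def poisson_pmf(k, mean):
    """Probability of exactly k events when `mean` events are expected."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def mutation_count_distribution(years, max_k=10):
    """Probability of 0..max_k mutations on one lineage over `years`."""
    mean = years / years_per_mutation
    return {k: poisson_pmf(k, mean) for k in range(max_k + 1)}

# Spread of outcomes over a single 1000-year stretch.
for k, p in mutation_count_distribution(1000, max_k=3).items():
    print(f"{k} mutations in 1000 years: {p:.3f}")
```

Under these assumptions a lineage has roughly a 37% chance of picking up no mutation at all in 1000 years and about a 26% chance of picking up two or more, so wide differences between lineages of the same age are expected.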

As a non-member I cannot access the H3 data in the FTDNA project, but you can join and investigate. My personal H4 example: I am H4a1a1a (6000 years old), but I have 5 extra mutations, 2 of which are fast-moving and will never be defining in the tree, and 3 that count and will eventually form further sub-clades and letters; hence I have more than 100 GD=3 matches.
The key to your question is what HG all of your GD=3 matches are. If they are H3 then you are too, and you have 3 further mutations which will eventually become sub-clades. But if they are all different HGs then you could be too. The 152 will probably not be used to make a new sub-clade at all.
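
That check can be sketched as follows, assuming each match comes as a haplogroup label plus a list of mutations, and treating GD as a simple count of mutations found in one list but not the other. That is a simplification; FTDNA's actual GD calculation may differ. All names and data here are hypothetical placeholders.

```python
from collections import Counter

def genetic_distance(mine, theirs):
    """Simplified GD: mutations present in one list but not in the other."""
    return len(set(mine) ^ set(theirs))

def haplogroups_at_distance(mine, matches, distance):
    """Tally the haplogroup labels of matches exactly `distance` away."""
    return Counter(
        label
        for label, mutations in matches
        if genetic_distance(mine, mutations) == distance
    )

# Placeholder mutation names, not real positions.
mine = ["shared", "p1", "p2"]
matches = [
    ("H3-T152C!", ["shared", "q1"]),              # differs by p1, p2, q1
    ("H3-T152C!", ["shared", "p1", "q2", "q3"]),  # differs by p2, q2, q3
    ("H3", ["shared", "p1", "p2"]),               # exact match, GD=0
]
print(haplogroups_at_distance(mine, matches, 3))
```

If the tally at GD=3 is dominated by one label, that supports sharing it; a mixed tally is the case where the assignment could genuinely be something else.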

The mtDNA tree is formed by a maximum-likelihood-based algorithm that works on huge amounts of data, in both number of people and number of mutations, so for any individual there will be a slight range of outcomes.
FTDNA and James Lick both use version 17 of the tree, but neither can give you a clade if it has not been defined yet.
So all in all I completely agree with your approach of editing your 152 back and revisiting the prediction.

Shadogowah
03-20-2018, 10:56 PM
Thanks Judith. All my 20 matches are also labeled H3-T152C!, and they look like a totally heterogeneous bunch, from the expected Iberians to British, Swiss and even Scandinavians.