Thread: OptimaFit Adventures and Experimentation

  1. #1
    Administrator
    Posts
    2,409
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland

    OptimaFit Adventures and Experimentation

    Acknowledging how many credits OptimaFit uses, I've been selective about which populations I put together and about the hypothesis and logic for each run (to the level of my knowledge, anyway). To begin with, when modelling North African populations, Egyptian provides the highest percentage; however, I wanted to see how that would compare when more North African populations are added, and whether it's still the best fit when each model is run. I have no doubt I have some minor North African, which, based on my ancestry, seems logical. For what MyHeritage is worth, I get 1.5% North African and my sister gets 2.9%. She also scores some North African on myOrigins. We also get varying amounts across certain calculators. Ideally, I would also do an OptimaFit for my grandfather (the likely source of the North African); however, I'm patiently counting down until the Eurogenes store reopens to get his coordinates. In the meantime, it may be difficult to draw conclusions based on my results alone.

    For the first experiment, I added a number of North African populations listed in the image below. You can add up to 20 populations on GenoPlot when conducting OptimaFit; however, I thought I'd stick to a smaller number, specifically the ones I have previously modeled and know I score some of.



    Probably not overly surprisingly, Egyptian comes out on top. I plan on doing this for my grandfather to see if Egyptian is also his best fit. Depending on this, I may also do the same for my mother and sister. Can it be said that Egyptian is the possible source of the North African? It's possible, but I need to validate this with more family members (as much as you can validate ancestry without a paper trail from 900+ years ago).

    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.2521
    Results: Irish 84.6
    Croatian 12
    Egyptian 3.4


    For my next experiment, I wanted to look at Levantine. This post shows that Lebanese_Christian is a good fit when using those particular populations, however I wanted to take this a step further and include Levantine populations only. I also added Egyptian, as I was curious to see how strong it is when compared to Levantine populations (I wasn't expecting it to be stronger, but what's the harm in trying with such a great tool).

    Once again I included populations I know I score small amounts of when modelling:



    What surprised me is that Samaritan is the best fit. Admittedly, I've never modeled Samaritan before, and I was expecting Lebanese_Christian to still be the best fit. I've done some brief research and came across this 23andMe article. It is from 2008, so I'm not sure if the consensus has changed since then with new research. If every Ashkenazi Jew were to run this exact Levantine setup, would Samaritan be the best fit for most people? I'm not sure, possibly not if they are a unique population, so hopefully someone more knowledgeable in Jewish genetics can answer. Perhaps I could improve upon the populations I've selected. I'll be curious to see if my grandfather also gets the same result. I suppose Samaritan could also just be a "fill in" for something else; however, Samaritan seems distinct enough that this may not be the case, and the purpose of OptimaFit is, in a sense, to avoid using one population as a proxy for another, i.e. if Lebanese_Christian were still the better fit, it wouldn't pick Samaritan to stand in for it, since both are available and it can choose whichever fits best overall. I'm yet to play around with unscaled data and/or turning the penalty off to see what happens.

    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.1805
    Results: Irish 84.8
    Croatian 10.2
    Samaritan 5
    Ancestry on paper: English, Scottish, Irish, Welsh, Croatian, Ashkenazi, Polish and Māori.

  2. The Following 3 Users Say Thank You to BalkanKiwi For This Useful Post:

     DMXX (06-26-2020),  JMcB (06-26-2020),  Nino90 (07-10-2020)

  3. #2
    Administrator
    Posts
    2,409
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    I decided to use OptimaFit to play around with my minor Polynesian/SE Asian ancestry. I put in a number of populations from the region, including Korean and Japanese, to see if they had any influence on what I expected the result to be. I was expecting Nasoi to be the best fit, or at the very least a Papuan-like component, which is the case. Interestingly, on a chromosome painting I have really no SE Asian segments, unlike my grandmother, who has a very large stretch on chromosome 21. Basically all of my Polynesian ancestry consists of small, scattered Oceanic segments, which is likely why Papuan populations are my best fit. I don't seem to have inherited much SE Asian, unlike my grandmother, who has more SE Asian than Oceanian. I plan to also model her to confirm this hypothesis.



    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.1104
    Results: Irish 81.6
    Croatian 11
    Ashkenazi Poland 6.6
    Kosipe 0.8

  4. The Following 2 Users Say Thank You to BalkanKiwi For This Useful Post:

     heksindhi (06-27-2020),  JMcB (06-27-2020)

  5. #3
    Administrator
    Posts
    2,409
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    Some of you may have seen my thread over in the Jewish section regarding East African ancestry and Ashkenazi. I'm hoping my grandfather's coordinates provide some insight into this, as it's somewhat interesting to play around with, acknowledging it could very well be noise. I was curious to see if an OptimaFit would show anything that a normal NMonte run wouldn't.

    Firstly I created a model that included Eritrean or one of the few Ethiopian populations I match, such as Amhara. Shown below is a model I put together with Eritrean. It seems it could be mediated through Sicilian East, as when I remove Eritrean the Sicilian East percentage increases.

    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.2324
    Results: Irish 72.5
    Croatian 14.5
    English Cornwall 9.5
    Sicilian East 3
    Eritrean 0.5


    When I run an OptimaFit (below) with Eritrean and a number of Ethiopian populations, the East African doesn't fit into any model. Whether this is due to it being noise, I'm not sure. My grandfather's results might help with this, but it might highlight the usefulness of OptimaFit for ruling out potential noise. Just because a population has a percentage in a normal NMonte model doesn't mean it's still going to appear in an OptimaFit run. Mind you, it's important to note that we are playing with very minor amounts of ancestry, which needs to be taken into account.

    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.2402
    Results: Irish 82
    Croatian 12.8
    Sicilian East 5.2

  6. #4
    Gold Class Member
    Posts
    4,731
    Sex
    Location
    Calne,England
    Ethnicity
    British and Irish
    Nationality
    Great Britain
    Y-DNA (P)
    E-Y45878
    mtDNA (M)
    H67

    United Kingdom Scotland England Ireland
    I'm unable to run OptimaFit; I get an error message.
    Please support Mental health research and world community grid


  7. #5
    Gold Class Member
    Posts
    4,731
    Sex
    Location
    Calne,England
    Ethnicity
    British and Irish
    Nationality
    Great Britain
    Y-DNA (P)
    E-Y45878
    mtDNA (M)
    H67

    United Kingdom Scotland England Ireland
    Sample: firemonkey ► firemonkey dad
    Fit: 1.3331
    Results: Welsh 46.6
    Scottish 41.8
    Irish 6
    Swedish 5.6


    Sample: firemonkey ► firemonkey
    Fit: 0.786
    Results: Scottish 52.2
    Irish 43
    Swedish 4.8

  8. #6
    Quote Originally Posted by BalkanKiwi View Post
    When I run an OptimaFit (below) with Eritrean and a number of Ethiopian populations, the East African doesn't fit into any model. Whether this is due to it being noise, I'm not sure. My grandfather's results might help with this, but it might highlight the usefulness of OptimaFit for ruling out potential noise. Just because a population has a percentage in a normal NMonte model doesn't mean it's still going to appear in an OptimaFit run. Mind you, it's important to note that we are playing with very minor amounts of ancestry, which needs to be taken into account.

    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.2402
    Results: Irish 82
    Croatian 12.8
    Sicilian East 5.2
    Our primary goal with optimaFit was to provide users with a solid base set of source populations to build upon. In its current iteration, it's configured to optimize down to the set of 4 (or fewer) sources that provides the best fit. We may make that a configurable option in the future to allow users to pick the optimal number of sources.

    It's important to note that increasing the optimal number has an exponential impact on the number of possible combinations. As an example, going to an optimaFit of 6 versus 4 for 20 sources increases the number of possible combinations from 4,845 to 38,760.
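The two figures quoted above match the binomial coefficients C(20, 4) and C(20, 6), i.e. the number of ways to choose exactly 4 or exactly 6 sources out of 20. A minimal sketch to reproduce them follows; the function names are made up for illustration, and the cumulative count (models of 1 up to k sources) is only an assumption about how "4 or fewer" might be enumerated, not something stated in this thread.

```python
from math import comb


def subsets_of_size(n_sources: int, k: int) -> int:
    """Number of distinct models using exactly k of n_sources populations."""
    return comb(n_sources, k)


def subsets_up_to(n_sources: int, k_max: int) -> int:
    """Number of distinct models using between 1 and k_max populations.

    Assumption: smaller models are counted as separate candidates.
    """
    return sum(comb(n_sources, k) for k in range(1, k_max + 1))


if __name__ == "__main__":
    for k in (4, 6):
        print(f"k={k}: exactly k -> {subsets_of_size(20, k):,}; "
              f"1..k -> {subsets_up_to(20, k):,}")
```

With 20 sources, the exact-size counts are 4,845 for k=4 and 38,760 for k=6, which is an eightfold increase in models to evaluate.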

    All of these models run in parallel, so it would not have any impact on run times; however, they do use extra compute capacity, so they would use quite a few more compute credits.

    Setting the value to 4 gives us a nice set of base populations with little to no possibility of over-fitting.
    Explore your Genealogy

  9. The Following 2 Users Say Thank You to GenoPlot For This Useful Post:

     BalkanKiwi (06-27-2020),  JMcB (06-27-2020)

  10. #7
    Administrator
    Posts
    2,409
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    Quote Originally Posted by firemonkey View Post
    Sample: firemonkey ► firemonkey dad
    Fit: 1.3331
    Results: Welsh 46.6
    Scottish 41.8
    Irish 6
    Swedish 5.6


    Sample: firemonkey ► firemonkey
    Fit: 0.786
    Results: Scottish 52.2
    Irish 43
    Swedish 4.8
    Which other Nordic populations did you include in the model?

  11. #8
    Administrator
    Posts
    2,409
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    Quote Originally Posted by GenoPlot View Post
    Our primary goal with optimaFit was to provide users with a solid base set of source populations to build upon. In its current iteration, it's configured to optimize down to the set of 4 (or fewer) sources that provides the best fit. We may make that a configurable option in the future to allow users to pick the optimal number of sources.

    It's important to note that increasing the optimal number has an exponential impact on the number of possible combinations. As an example, going to an optimaFit of 6 versus 4 for 20 sources increases the number of possible combinations from 4,845 to 38,760.

    All of these models run in parallel, so it would not have any impact on run times; however, they do use extra compute capacity, so they would use quite a few more compute credits.

    Setting the value to 4 gives us a nice set of base populations with little to no possibility of over-fitting.
    From what I've seen so far, I figured 4 populations was the maximum. So essentially, to use Eritrean as an example (though it's probably the same for other smaller amounts), even if it appears in a normal NMonte run and then doesn't appear in an OptimaFit, that doesn't confirm it's noise (by itself anyway, it could very well be), but rather that the 2-3 bigger populations, when combined, fit better when the lesser population is excluded?

  12. #9
    Administrator
    Posts
    2,409
    Sex
    Y-DNA (P)
    DF27
    mtDNA (M)
    V33

    New Zealand Croatia Star of David Ireland England Poland
    I decided to play around with ancient samples and modelling, which I haven't done much of before. Firstly I wanted to use OptimaFit to see how the ancient Vanuatu and Tongan populations fit into an ancient model. There's a good chance there are better European populations I could use, but I need to experiment more. In any case, below are the populations I used. I should note I used default penalty with unscaled coordinates.



    I'm not surprised by the results below. There is a clear lack of ancient Oceanian samples and therefore unavoidable bias, as Vanuatu samples outnumber the single Tongan sample. Even with more Tongan samples, I'd still expect a Vanuatu sample to be selected. The 2300BP sample is mostly of Papuan ancestry according to this study by Lipson et al. As most of my minor Polynesian seems to be Papuan-like, as shown in the posts above, it seems logical that an ancient sample of mostly Papuan ancestry would be selected as the best fit. Hopefully, in the future, more ancient DNA from this region will be found (and be usable). This OptimaFit modelling could also be expanded to include ancient samples from SE Asia, which I plan to do in the near future. Because mine is minor, it may not make a big difference, so I'll be curious to see if my grandmother throws up something different.

    Sample: Balkan Kiwi ► BalkanKiwi
    Fit: 1.2348
    Results: England MBA 87.6
    BKG N 10.6
    2300BP all (Vanuatu) 1.8
    Last edited by BalkanKiwi; 06-28-2020 at 06:23 AM. Reason: Additions.

  13. #10
    Quote Originally Posted by BalkanKiwi View Post
    From what I've seen so far, I figured 4 populations was the maximum. So essentially, to use Eritrean as an example (though it's probably the same for other smaller amounts), even if it appears in a normal NMonte run and then doesn't appear in an OptimaFit, that doesn't confirm it's noise (by itself anyway, it could very well be), but rather that the 2-3 bigger populations, when combined, fit better when the lesser population is excluded?
    Yes, that's correct. For a given sample of sufficiently differentiated ancestry that exceeds n (in this case n being 4), the major sources of the ancestry will determine the optimal fits. That's why it's important to run these models against parents and siblings, as the combined results will give a much clearer picture of the ancestral sources.

    When it comes to minor ancestry, it can be helpful to view the optimaFit results in tandem with the accompanying oracles. The oracles, while a bit of a blunt instrument in comparison, will often point to alternative sources of minor ancestry with decreasing probabilities. For example, in your case:

    Oracle                                    Distance
    86.6% Irish + 13.4% Ashkenazi Poland      1.1651
    92.6% Irish + 7.4% Samaritan              1.201
    92% Irish + 8% Lebanese Christian         1.2123
    92.8% Irish + 7.2% Syrian                 1.2292
    92.4% Irish + 7.6% Lebanese Druze         1.2297
    92.2% Irish + 7.8% Druze                  1.2336
    74.8% Irish + 25.2% Croatian              1.2857
    91.6% Irish + 8.4% Syrian Jew             1.301
    87.8% English + 12.2% Ashkenazi Poland    1.3551
    94% Irish + 6% Iranian Fars               1.3666


    Note: the oracle distances are only directly comparable to optimaFit runs with the default penalty "off".
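For readers wondering how a two-way oracle table like the one above could be produced in principle, here is a rough sketch. It assumes plain Euclidean distance between the target's coordinates and the best weighted mix of each pair of sources; the thread does not say how GenoPlot actually computes its oracles or its penalty, so this is not their implementation. The population names and coordinates are invented toy values.

```python
from itertools import combinations
from math import sqrt


def best_two_way(target, a, b):
    """Return (weight_of_a, distance) for the best mix w*a + (1-w)*b.

    The unconstrained optimum is the projection of the target onto the
    line through a and b; it is then clamped to the range 0..1.
    """
    diff = [x - y for x, y in zip(a, b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        w = 1.0  # identical sources: any weight gives the same mix
    else:
        w = sum((t - y) * d for t, y, d in zip(target, b, diff)) / denom
        w = min(1.0, max(0.0, w))
    mix = [w * x + (1 - w) * y for x, y in zip(a, b)]
    dist = sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))
    return w, dist


def two_way_oracle(target, sources):
    """Rank every pair of sources by distance of its best mix to the target."""
    rows = []
    for (name_a, a), (name_b, b) in combinations(sources.items(), 2):
        w, dist = best_two_way(target, a, b)
        rows.append((dist, name_a, w, name_b, 1 - w))
    return sorted(rows)


if __name__ == "__main__":
    # Toy 2-D coordinates, purely illustrative (not real population data).
    toy_sources = {"PopA": [0.0, 0.0], "PopB": [10.0, 0.0], "PopC": [0.0, 10.0]}
    toy_target = [2.0, 0.0]
    for dist, na, wa, nb, wb in two_way_oracle(toy_target, toy_sources):
        print(f"{wa:.1%} {na} + {wb:.1%} {nb}  {dist:.4f}")
```

In this toy case the target lies on the line between PopA and PopB, so that pair ranks first with a distance of essentially zero, and pairs that cannot reach the target rank lower, much like the decreasing quality of the rows in the oracle list.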
    Last edited by GenoPlot; 06-28-2020 at 05:32 PM. Reason: note

  14. The Following User Says Thank You to GenoPlot For This Useful Post:

     BalkanKiwi (06-28-2020)
