
Thread: Automated qpGraph: possibilities for the consideration of technically savvy Users?

  1. #111

    Never mind: you have to put the full file path in the first argument of the scan command, not just the file name. Amateur mistake on my part.
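A minimal sketch of the path issue, assuming the "scan command" here is base R's `scan()`. The file name `pops.txt` and the temporary folder are placeholders for this example, not the poster's actual setup.

```r
# Hypothetical setup: write a small population list into a temporary folder.
pop_dir  <- tempdir()
pop_file <- file.path(pop_dir, "pops.txt")   # full path, not just "pops.txt"
writeLines(c("Chimp.REF", "Denisova.DG", "Dinka.DG"), pop_file)

# scan() resolves a bare file name relative to getwd(), so pass the full path
# (or setwd() to the folder first). what = character() reads labels as strings.
stopifnot(file.exists(pop_file))
pops <- scan(pop_file, what = character(), quiet = TRUE)
print(pops)
```

Checking `file.exists()` before reading gives a clearer failure than the error `scan()` raises when it cannot open the file.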

    How many populations do you guys typically put in your pops file? I got a little aggressive and went with 18 in mine. When I first ran extract_f2 with the default maxmiss = 0 setting, it filtered out all but 303 SNPs, so garbage. I re-ran it with maxmiss = 1, and now I have over 1.1 million SNPs remaining, and it needs almost 9,000 MB of RAM to compute... Think I might blow up my laptop with this one. The instructions for extract_f2 do say to limit the number of populations if you lose too many SNPs the first time, but I felt like being greedy.
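A toy illustration of why the SNP count collapses with many populations under maxmiss = 0. This is not ADMIXTOOLS code: it assumes maxmiss is the largest fraction of populations in which a SNP may be missing, and the 30% missingness rate and independence between populations are made-up assumptions.

```r
set.seed(1)
n_snps <- 1000
n_pops <- 18

# Toy data: TRUE means the SNP has no genotype call in that population.
# Assumed 30% missingness per population, independent across populations.
missing <- matrix(runif(n_snps * n_pops) < 0.3, nrow = n_snps, ncol = n_pops)

# Fraction of populations in which each SNP is missing.
frac_missing <- rowMeans(missing)

# Keep SNPs whose missing fraction does not exceed the threshold.
snps_kept <- function(maxmiss) sum(frac_missing <= maxmiss)

kept_strict <- snps_kept(0)  # SNP must be present in all 18 populations
kept_loose  <- snps_kept(1)  # no SNP is dropped for missingness
cat("maxmiss = 0:", kept_strict, "SNPs\n")
cat("maxmiss = 1:", kept_loose, "SNPs\n")
```

Under these assumptions every added population multiplies the chance that a SNP survives the strict filter by 0.7, which is why a long population list with sparse ancient samples can leave only a few hundred SNPs.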

    EDIT: It actually wasn't that bad; it finished extracting in a little over 30 minutes. But now on to another issue. I'm able to run some f4s successfully, and I created a popfilelist like in the old ADMIXTOOLS with 45 rows of tests. However, when I run the f4 command here it only returns the first 10 tests from my popfilelist, and the following 35 seem to be hidden. It can't be a computation issue, because it produced the first 10 results almost instantly.

     
       pop1      pop2        pop3                         pop4                         est      se      z      p
     1 Chimp.REF Denisova.DG Altai_Neanderthal.DG         Vindija_Neanderthal.DG       -3.48e-4 1.46e-4  -2.38 0.0172
     2 Chimp.REF Denisova.DG Altai_Neanderthal.DG         Indian_GreatAndaman_100BP.SG -3.03e-2 5.33e-4  -56.8 0
     3 Chimp.REF Denisova.DG Vindija_Neanderthal.DG       Indian_GreatAndaman_100BP.SG -2.99e-2 5.34e-4  -56.1 0
     4 Chimp.REF Denisova.DG Altai_Neanderthal.DG         Dinka.DG                     -3.11e-2 5.09e-4  -61.1 0
     5 Chimp.REF Denisova.DG Vindija_Neanderthal.DG       Dinka.DG                     -3.07e-2 5.09e-4  -60.5 0
     6 Chimp.REF Denisova.DG Indian_GreatAndaman_100BP.SG Dinka.DG                     -8.05e-4 3.45e-4  -2.33 0.0197
     7 Chimp.REF Denisova.DG Indian_GreatAndaman_100BP.SG Cameroon_SMA.SG              -3.86e-4 3.30e-4  -1.17 0.242
     8 Chimp.REF Denisova.DG Indian_GreatAndaman_100BP.SG Ethiopia_4500BP_published.SG -5.73e-4 3.68e-4  -1.56 0.119
     9 Chimp.REF Denisova.DG Indian_GreatAndaman_100BP.SG South_Africa_2200BP.SG        1.96e-5 4.23e-4 0.0463 0.963
    10 Chimp.REF Denisova.DG Indian_GreatAndaman_100BP.SG Esan.DG                      -6.72e-4 3.20e-4  -2.10 0.0356
    # ... with 35 more rows
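For reference, a sketch of loading a four-column test list like the one described in this post. The file name and its three rows are placeholders. The commented `f4()` call is an assumption about how the table is handed to the admixtools package, with a placeholder f2 directory.

```r
# Hypothetical four-column test list, one f4 test per row.
list_file <- file.path(tempdir(), "f4_tests.txt")
writeLines(c(
  "Chimp.REF Denisova.DG Altai_Neanderthal.DG Vindija_Neanderthal.DG",
  "Chimp.REF Denisova.DG Altai_Neanderthal.DG Dinka.DG",
  "Chimp.REF Denisova.DG Vindija_Neanderthal.DG Dinka.DG"
), list_file)

tests <- read.table(list_file, header = FALSE, stringsAsFactors = FALSE,
                    col.names = c("pop1", "pop2", "pop3", "pop4"))
print(nrow(tests))   # number of tests that should come back from f4

# Assumption: with admixtools installed and a precomputed f2 directory
# (placeholder path), the combinations can be passed as a matrix:
#   res <- admixtools::f4("path/to/f2_dir", as.matrix(tests))
#   nrow(res) == nrow(tests)   # compare counts instead of trusting the printout
```

The `# ... with 35 more rows` line in the output above is how a tibble reports rows that were computed but not printed, so comparing `nrow()` values is a quick way to confirm nothing was dropped.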
    Last edited by TuaMan; 02-28-2021 at 04:53 AM.

  2. The Following User Says Thank You to TuaMan For This Useful Post:

     discreetmaverick (03-23-2021)

  3. #112

    This is more of a coding issue than an ADMIXTOOLS issue, i.e. "how to look at all of the rows of a table in R", or maybe "how to print rows in tibble to a file in R". Or maybe even "what is a tibble".

    Search for the strings above on Google and look at the results on Stack Overflow or some other place. This is the usual way people pick up R, Python, etc.

  4. The Following 2 Users Say Thank You to Ryukendo For This Useful Post:

     discreetmaverick (03-23-2021),  TuaMan (02-28-2021)

  5. #113

    >options(tibble.print_max = Inf)

    ^That did the trick.
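Some alternatives to changing the global option, sketched on a stand-in table. The `res` tibble below is made-up data with 45 rows, not real f4 output, and the output file name is a placeholder.

```r
library(tibble)

# Stand-in for an f4 result with 45 rows.
res <- tibble(test = seq_len(45), z = round(seq(-3, 3, length.out = 45), 2))

# Option 1: print every row of this one object without touching global options.
print(res, n = Inf)

# Option 2: convert to a plain data frame, which prints in full
# (up to getOption("max.print") entries).
res_df <- as.data.frame(res)

# Option 3: write the whole table to a tab-separated file for later use.
out_file <- file.path(tempdir(), "f4_results.tsv")
write.table(res_df, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Writing to a file is usually the most convenient route once the test list grows beyond what is comfortable to read in the console.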

    Whenever you guys run new f2s, do you typically just set overwrite = TRUE to wipe whatever is in your current directory, or do you create a new folder for each batch of f2s?

    My first run included 18 populations on my file list, and with the default maxmiss = 0 my SNP count got killed. Do you guys try to keep your pop list on the smaller side, or do you just put maxmiss = 1 and try to work with as many pops as you can?
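One way to keep batches apart is a fresh output folder per run. This is a sketch only: `new_f2_dir()`, its arguments, and the folder naming are hypothetical, and the commented `extract_f2()` call uses a placeholder genotype prefix.

```r
# Build a fresh output folder per batch instead of overwriting old f2 blocks.
# `base_dir` and `label` are placeholders chosen for this example.
new_f2_dir <- function(base_dir, label) {
  out <- file.path(base_dir,
                   paste0("f2_", label, "_", format(Sys.Date(), "%Y%m%d")))
  dir.create(out, recursive = TRUE, showWarnings = FALSE)
  out
}

outdir <- new_f2_dir(tempdir(), "18pops_maxmiss1")
print(outdir)

# Assumption: the folder is then passed as `outdir` to extract_f2(), e.g.
#   admixtools::extract_f2("path/to/genotype_prefix", outdir,
#                          pops = pops, maxmiss = 1)
```

Putting the population count and the maxmiss setting in the folder name makes it easier to tell later which SNP filter a given set of f2 blocks was computed with.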
    Last edited by TuaMan; 03-01-2021 at 05:46 AM.

  6. #114

    Yep, maxmiss = 1. I think I did 44 pops. Just split it into smaller chunks by using the maxmem argument in extract_f2. I've had errors when setting it too low (1000 or 2000), but 6000 or 8000 seems to work.
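A hedged sketch of those settings as an `extract_f2()` call. The genotype prefix, output folder, and population list are placeholders, and the call only runs if the admixtools package and the input files are actually present.

```r
# Placeholder inputs: replace with a real genotype prefix and population list.
prefix <- "path/to/genotype_prefix"
outdir <- file.path(tempdir(), "f2_44pops")
pops   <- c("Chimp.REF", "Denisova.DG", "Dinka.DG")

args <- list(pref = prefix, outdir = outdir, pops = pops,
             maxmiss = 1,      # do not drop SNPs for being missing in some pops
             maxmem  = 8000)   # memory budget in MB; lower means more chunks

# Only run when the package and the input files are available.
can_run <- requireNamespace("admixtools", quietly = TRUE) &&
  file.exists(paste0(prefix, ".geno"))
if (can_run) do.call(admixtools::extract_f2, args)
```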

  7. The Following User Says Thank You to Kale For This Useful Post:

     TuaMan (03-01-2021)

  8. #115

    Here's an interesting graph.
    This was the initgraph (I think I posted it earlier) optimized for score (54.40).
     

    digraph G {
    size = "7.5,10";
    R -> South_Africa_2000BPSG [ label = "21" ];
    R -> Rr [ label = "21" ];
    Rr -> MotaSG [ label = "140" ];
    Rr -> Rrrl [ label = "31" ];
    MA1SG_l -> Yana_UPSG [ label = "10" ];
    Rrrl -> Rrl_m [ label = "4" ];
    Rrrl -> admixi [ style=dotted, label = "57%" ];
    Rrl_m -> Rrrll [ label = "1" ];
    Rrl_m -> Rrl [ label = "13" ];
    Rrrll -> OngeDG_b [ label = "5" ];
    Rrrll -> Ust_IshimDG [ label = "3" ];
    BachoKiro_Gravettian_c -> BachoKiro_Gravettian [ label = "53" ];
    OngeDG_b -> OngeDG [ label = "31" ];
    OngeDG_b -> OngeDG_ba [ label = "2" ];
    SunghirSG_gk -> SunghirSG [ label = "5" ];
    SunghirSG_gk -> admixz [ style=dotted, label = "62%" ];
    admix -> BachoKiro_Gravettian_cm [ style=dotted, label = "90%" ];
    Rrll -> BachoKiro_Gravettian_c [ label = "83" ];
    Rrll -> admixh [ label = "13" ];
    Rrrllu -> admixb [ style=dotted, label = "21%" ];
    admixb -> Rrll [ label = "2" ];
    OngeDG_ba -> admixr [ style=dotted, label = "46%" ];
    admixr -> MA1SG_l [ label = "3" ];
    BachoKiro_Gravettian_y -> admixb [ style=dotted, label = "79%" ];
    BachoKiro_Gravettian_y -> admixr [ style=dotted, label = "54%" ];
    Rrl -> SunghirSG_g [ label = "1" ];
    SunghirSG_g -> Kostenki14 [ label = "132" ];
    admixi -> Rrrllu [ label = "19" ];
    OngeDG_basl -> Tianyuan [ label = "38" ];
    admixz -> admix [ style=dotted, label = "97%" ];
    BachoKiro_Gravettian_cm -> Gravettian [ label = "44" ];
    admixh -> admixhy [ label = "43" ];
    MA1SG_l -> admixe [ style=dotted, label = "93%" ];
    admixe -> MA1SG [ label = "139" ];
    admixhy -> admixe [ style=dotted, label = "7%" ];
    Rrrllu -> Rrrllut [ label = "43" ];
    Rrrllut -> BachoKiro_IUP [ label = "5" ];
    BachoKiro_Gravettian_c -> admixu [ style=dotted, label = "73%" ];
    admixu -> BachoKiro_Gravettian_cm [ style=dotted, label = "10%" ];
    Rrrllut -> admixu [ style=dotted, label = "27%" ];
    SunghirSG_g -> SunghirSG_gk [ label = "6" ];
    Rrl -> BachoKiro_Gravettian_y [ label = "4" ];
    OngeDG_ba -> OngeDG_bar [ label = "6" ];
    OngeDG_bar -> OngeDG_bas [ label = "9" ];
    admixh -> admixj [ style=dotted, label = "93%" ];
    admixj -> GoyetQ116_1 [ label = "123" ];
    OngeDG_bas -> admixj [ style=dotted, label = "7%" ];
    OngeDG_bas -> admixi [ style=dotted, label = "43%" ];
    OngeDG_bar -> OngeDG_basl [ label = "90" ];
    admixhy -> admixz [ style=dotted, label = "38%" ];
    OngeDG_basl -> admix [ style=dotted, label = "3%" ];
    }

    I ran that optimized for F3 residual, and this is what came out (F3 residual improved from 1.93 to 1.53):
     

    digraph G {
    size = "7.5,10";
    R -> South_Africa_2000BPSG [ label = "21" ];
    R -> Rr [ label = "21" ];
    Rr -> Rrrl [ label = "0" ];
    Rr -> Rrl_m [ label = "35" ];
    Rrrl -> MotaSG [ label = "140" ];
    Rrrl -> admixi [ style=dotted, label = "9%" ];
    Rrl_m -> Rrrll [ label = "1" ];
    Rrl_m -> Rrl [ label = "11" ];
    admixi -> Rrrllu [ label = "62" ];
    MA1SG_l -> Yana_UPSG [ label = "10" ];
    MA1SG_l -> admixe [ label = "6" ];
    admixe -> admixhy [ label = "155" ];
    admixe -> MA1SG [ label = "134" ];
    Rrrll -> OngeDG_b [ label = "3" ];
    Rrrll -> Ust_IshimDG [ label = "3" ];
    Rrl -> SunghirSG_g [ label = "3" ];
    Rrl -> BachoKiro_Gravettian_y [ label = "1" ];
    OngeDG_b -> OngeDG [ label = "32" ];
    OngeDG_b -> OngeDG_bas [ label = "2" ];
    SunghirSG_g -> SunghirSG_gk [ label = "5" ];
    SunghirSG_g -> Kostenki14 [ label = "132" ];
    BachoKiro_Gravettian_y -> admixr [ style=dotted, label = "63%" ];
    BachoKiro_Gravettian_y -> admixb [ style=dotted, label = "91%" ];
    OngeDG_ba -> admixr [ style=dotted, label = "37%" ];
    OngeDG_ba -> OngeDG_bar [ label = "55" ];
    BachoKiro_Gravettian_c -> BachoKiro_Gravettian [ label = "46" ];
    BachoKiro_Gravettian_c -> admixu [ style=dotted, label = "65%" ];
    admixu -> BachoKiro_Gravettian_cm [ style=dotted, label = "13%" ];
    admixr -> MA1SG_l [ label = "3" ];
    OngeDG_bar -> OngeDG_basl [ label = "54" ];
    SunghirSG_gk -> SunghirSG [ label = "5" ];
    SunghirSG_gk -> admixz [ style=dotted, label = "78%" ];
    admixz -> admix [ style=dotted, label = "98%" ];
    Rrll -> BachoKiro_Gravettian_c [ label = "88" ];
    Rrll -> admixh [ style=dotted, label = "83%" ];
    admixh -> admixj [ style=dotted, label = "98%" ];
    admixhy -> admixz [ style=dotted, label = "22%" ];
    admixhy -> admixh [ style=dotted, label = "17%" ];
    admixj -> GoyetQ116_1 [ label = "130" ];
    admixb -> Rrll [ label = "4" ];
    Rrrllu -> Rrrllut [ label = "7" ];
    Rrrllut -> BachoKiro_IUP [ label = "0" ];
    OngeDG_basl -> Tianyuan [ label = "25" ];
    admix -> BachoKiro_Gravettian_cm [ style=dotted, label = "87%" ];
    BachoKiro_Gravettian_cm -> Gravettian [ label = "45" ];
    OngeDG_bas -> admixi [ style=dotted, label = "91%" ];
    OngeDG_bas -> OngeDG_ba [ label = "1" ];
    Rrrllut -> admixu [ style=dotted, label = "35%" ];
    Rrrllu -> admixb [ style=dotted, label = "9%" ];
    OngeDG_basl -> admix [ style=dotted, label = "2%" ];
    OngeDG_bar -> admixj [ style=dotted, label = "2%" ];
    }

    That para-MA1 into Gravettian and Goyet has to be just a fudge factor; a drift length of 155 is insane.
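A small base R sketch for spotting edges like that one in the DOT text. It uses seven edges copied from the F3-optimized graph above. The regular expression assumes every edge line follows exactly the format shown in these graphs.

```r
# A few edges copied from the F3-optimized graph; paste the full text to scan it all.
dot <- c(
  'admixe -> admixhy [ label = "155" ];',
  'admixe -> MA1SG [ label = "134" ];',
  'Rrll -> admixh [ style=dotted, label = "83%" ];',
  'admixhy -> admixh [ style=dotted, label = "17%" ];',
  'SunghirSG_gk -> admixz [ style=dotted, label = "78%" ];',
  'admixhy -> admixz [ style=dotted, label = "22%" ];',
  'Rrrl -> MotaSG [ label = "140" ];'
)

# Capture: source node, target node, optional dotted style, number, optional %.
pat <- paste0('^[[:space:]]*([^[:space:]]+) -> ([^[:space:]]+) ',
              '\\[ (style=dotted, )?label = "([0-9.]+)(%?)" \\];')
m <- do.call(rbind, regmatches(dot, regexec(pat, dot)))
edges <- data.frame(from = m[, 2], to = m[, 3],
                    admix = m[, 6] == "%", value = as.numeric(m[, 5]),
                    stringsAsFactors = FALSE)

# Drift edges sorted by length, longest first.
drift <- edges[!edges$admix, ]
drift <- drift[order(-drift$value), ]
print(head(drift, 3))

# Admixture weights entering each admixed node should add up to 100%.
weights <- tapply(edges$value[edges$admix], edges$to[edges$admix], sum)
print(weights)
```

Sorting the drift edges puts branches such as `admixe -> admixhy` at the top, and the weight check is a quick sanity test that a graph was copied without losing an admixture edge.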

    Here's another graph with the addition of PrimorskyKrai_Boisman_MN as an East Asian reference. The worst F3 residual is 2.17 (I'm thinking a cutoff of 2 is probably a healthy balance between addressing poor fits and avoiding overfitting).
    I don't like that Yana is coming out unadmixed. I think I'll try adding another admixture event to see if that helps; it helped a lot when working with lowest score as the goal.
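A sketch of the residual check, assuming "worst F3 residual" means the largest absolute difference between observed and fitted f3, measured in standard errors. The numbers and population labels below are made up for illustration, and the column names are assumptions rather than actual qpGraph output.

```r
# Made-up observed vs fitted f3 values with standard errors (not real results).
f3 <- data.frame(
  pop2 = c("popA", "popB", "popC"),
  pop3 = c("popD", "popD", "popD"),
  est  = c(0.0520, 0.0710, 0.0480),
  fit  = c(0.0512, 0.0699, 0.0482),
  se   = c(0.0006, 0.0005, 0.0007),
  stringsAsFactors = FALSE
)

# Residual expressed in standard errors.
f3$z <- (f3$est - f3$fit) / f3$se

# Row with the largest absolute residual.
worst <- f3[which.max(abs(f3$z)), ]
print(worst)

threshold  <- 2                          # the cutoff suggested in the post
needs_work <- abs(worst$z) > threshold   # TRUE means the fit is still too poor
```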
     

    digraph G {
    size = "7.5,10";
    admixk -> admixs [ label = "70" ];
    admixs -> admixbq [ style=dotted, label = "26%" ];
    admixs -> admixb [ style=dotted, label = "8%" ];
    admixsz -> admixm [ style=dotted, label = "8%" ];
    admixsz -> Gravettian [ label = "37" ];
    admixbq -> PrimorskyKrai_Boisman_MN [ label = "29" ];
    admixb -> admixsz [ label = "9" ];
    admixm -> MA1SG [ label = "140" ];
    Rrrllrrl -> admixx [ label = "22" ];
    Rrrllrrl -> Rrrllrrlp [ label = "19" ];
    admixx -> admixxc [ label = "48" ];
    admixx -> GoyetQ116_1 [ label = "111" ];
    Rrrllrrlp -> BachoKiro_Gravettian [ label = "117" ];
    Rrrllrrlp -> admix [ style=dotted, label = "33%" ];
    admixxc -> admixd [ style=dotted, label = "22%" ];
    admixxc -> BachoKiro_IUP_qy [ style=dotted, label = "53%" ];
    admixd -> admix [ style=dotted, label = "67%" ];
    BachoKiro_IUP_qy -> admixk [ style=dotted, label = "66%" ];
    admix -> admixb [ style=dotted, label = "92%" ];
    Rrrrr_ur -> Rrrrr_urm [ label = "0" ];
    Rrrrr_ur -> admixc [ style=dotted, label = "14%" ];
    Rrrrr_urm -> admixk [ style=dotted, label = "34%" ];
    Rrrrr_urm -> BachoKiro_IUP [ label = "41" ];
    BachoKiro_IUP_q -> Rrrll_c [ style=dotted, label = "3%" ];
    BachoKiro_IUP_q -> Tianyuan [ label = "20" ];
    Rrrll_c -> Rrrll_ce [ label = "7" ];
    Rrrll_ce -> admixbq [ style=dotted, label = "74%" ];
    Rrrll_ce -> OngeDG [ label = "27" ];
    BachoKiro_IUP_qg -> BachoKiro_IUP_qy [ style=dotted, label = "47%" ];
    BachoKiro_IUP_qg -> BachoKiro_IUP_q [ label = "38" ];
    admixz -> Rrrrr_ur [ label = "26" ];
    admixz -> BachoKiro_IUP_qg [ label = "81" ];
    SunghirSG_s -> admixr [ style=dotted, label = "98%" ];
    admixr -> SunghirSG [ label = "3" ];
    Rrr -> MotaSG [ label = "140" ];
    Rrr -> Rrrrr_u [ label = "35" ];
    Rrrrr_u -> admixz [ label = "3" ];
    Rrrrr_u -> Rrrr [ label = "1" ];
    R -> Rq [ label = "7" ];
    R -> South_Africa_2000BPSG [ label = "7" ];
    Rq -> admixr [ style=dotted, label = "2%" ];
    Rq -> Rrr [ label = "28" ];
    Rrrrl_l -> Rrrll_c [ style=dotted, label = "97%" ];
    Rrrr_q -> SunghirSG_s_h [ label = "0" ];
    Rrrr_q -> Yana_UPSG [ label = "10" ];
    admixc -> Rrrllrrl [ label = "3" ];
    Rrrr -> Rrrrl_l [ label = "0" ];
    SunghirSG_s_h -> admixm [ style=dotted, label = "92%" ];
    SunghirSG_s_h -> SunghirSG_s_hi [ label = "5" ];
    SunghirSG_sb -> SunghirSG_s [ label = "8" ];
    SunghirSG_s_hi -> admixc [ style=dotted, label = "86%" ];
    SunghirSG_s_hi -> SunghirSG_sb [ label = "3" ];
    SunghirSG_s -> admixd [ style=dotted, label = "78%" ];
    SunghirSG_sb -> Kostenki14 [ label = "131" ];
    Rrrrl_l -> Ust_IshimDG [ label = "5" ];
    Rrrr -> Rrrr_q [ label = "7" ];
    }

    Here's another go with the same setup; it just happened to find a better model. Worst F3 residual = 2.04.
     

    digraph G {
    size = "7.5,10";
    admixk -> admixs [ label = "42" ];
    admixs -> admixbq [ style=dotted, label = "20%" ];
    admixs -> admixb [ style=dotted, label = "8%" ];
    admixsz -> admixm [ style=dotted, label = "5%" ];
    admixsz -> admix [ label = "23" ];
    admixbq -> PrimorskyKrai_Boisman_MN [ label = "32" ];
    admixb -> Gravettian [ label = "15" ];
    admixm -> MA1SG [ label = "140" ];
    Rrrllrrl -> admixx [ label = "20" ];
    Rrrllrrl -> Rrrllrrlp [ style=dotted, label = "84%" ];
    admixx -> admixxc [ label = "73" ];
    admixx -> GoyetQ116_1 [ label = "114" ];
    Rrrllrrlp -> BachoKiro_Gravettian [ label = "135" ];
    admixxc -> admixd [ style=dotted, label = "20%" ];
    admixxc -> BachoKiro_IUP_qy [ style=dotted, label = "70%" ];
    admixd -> admixsz [ label = "12" ];
    BachoKiro_IUP_qy -> admixk [ style=dotted, label = "72%" ];
    admix -> admixb [ style=dotted, label = "92%" ];
    admix -> Rrrllrrlp [ style=dotted, label = "16%" ];
    Rrrrr_ur -> admixk [ style=dotted, label = "28%" ];
    Rrrrr_ur -> Rrrrr_urm [ label = "0" ];
    Rrrrr_urm -> BachoKiro_IUP [ label = "26" ];
    Rrrrr_urm -> admixc [ style=dotted, label = "10%" ];
    Rrrll_c -> BachoKiro_IUP_q [ label = "10" ];
    Rrrll_c -> Rrrll_ce [ label = "3" ];
    BachoKiro_IUP_q -> BachoKiro_IUP_qg [ label = "123" ];
    BachoKiro_IUP_q -> admixz [ style=dotted, label = "36%" ];
    Rrrll_ce -> admixbq [ style=dotted, label = "80%" ];
    Rrrll_ce -> OngeDG [ label = "28" ];
    BachoKiro_IUP_qg -> BachoKiro_IUP_qy [ style=dotted, label = "30%" ];
    BachoKiro_IUP_qg -> Tianyuan [ label = "3" ];
    admixz -> Rrrrr_ur [ label = "43" ];
    SunghirSG_s -> admixr [ style=dotted, label = "99%" ];
    SunghirSG_s -> SunghirSG_sb [ label = "0" ];
    admixr -> SunghirSG [ label = "9" ];
    Rrr -> MotaSG [ label = "140" ];
    Rrr -> Rrrrr_u [ label = "32" ];
    Rrrrr_u -> admixz [ style=dotted, label = "64%" ];
    Rrrrr_u -> Rrrr [ label = "3" ];
    R -> Rq [ label = "1" ];
    R -> South_Africa_2000BPSG [ label = "1" ];
    Rq -> admixr [ style=dotted, label = "1%" ];
    Rq -> Rrr [ label = "42" ];
    Rrrrl_l -> Rrrll_c [ label = "4" ];
    Rrrrl_l -> Rrrr_q [ label = "6" ];
    Rrrr_q -> SunghirSG_s_h [ label = "0" ];
    Rrrr_q -> Yana_UPSG [ label = "11" ];
    admixc -> Rrrllrrl [ label = "2" ];
    Rrrr -> Rrrrl_l [ label = "1" ];
    Rrrr -> Ust_IshimDG [ label = "5" ];
    SunghirSG_s_h -> admixm [ style=dotted, label = "95%" ];
    SunghirSG_s_h -> SunghirSG_s_hi [ label = "4" ];
    SunghirSG_sb -> admixd [ style=dotted, label = "80%" ];
    SunghirSG_sb -> Kostenki14 [ label = "130" ];
    SunghirSG_s_hi -> SunghirSG_s [ label = "5" ];
    SunghirSG_s_hi -> admixc [ style=dotted, label = "90%" ];
    }
    Last edited by Kale; 03-03-2021 at 08:28 AM.
