If anyone needs Y-DNA haplogroup test data

12-31-2018, 03:05 PM
If this would be of help to anyone, I recently created a Y-DNA test data generator that simulates the STR and SNP mutation evolution over any number of generations from a common ancestor.

The program starts with two descendants along separate lines from a common ancestor N generations back, and then attaches branching points for new descendants at random generations until the desired number of descendants is reached. That creates a randomly generated haplotree. The program then assigns a SNP mutation at each of the branching points, so each current descendant is positive for the SNPs on his ancestral branches. I currently have it reporting these SNP results for each kit in several ways, like every other SNP or just the SNPs in the upper branches, etc., to simulate various scenarios where the group might only have partial SNP information about their ancestry, as is common today. So no SNP mutation rate is assumed; the program just creates ALL the SNPs that would be useful to know for branching purposes and then I can limit the ones that it reports in various ways.
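For anyone who wants to picture the tree-building step, here is a minimal Python sketch of the idea described above. It is not the actual program (which is written in PHP and has not been posted); the function names `build_haplotree` and `snps_for`, the data layout, and the `SNP0`, `SNP1`, ... labels are all assumptions made for illustration. It also assumes the common ancestor itself carries one SNP shared by everyone.

```python
import random

def build_haplotree(generations, descendants, seed=None):
    """Return (parent, gen, snp, leaves) for a randomly generated haplotree."""
    if generations < 2 or descendants < 2:
        raise ValueError("need at least 2 generations and 2 descendants")
    rng = random.Random(seed)
    parent = {0: None}   # node id -> parent id; node 0 is the common ancestor
    gen = {0: 0}         # node id -> generation (0 = ancestor, N = today)
    snp = {0: "SNP0"}    # branching node id -> fictitious SNP label
    leaves = []

    def new_node(p, g):
        nid = len(parent)
        parent[nid], gen[nid] = p, g
        return nid

    # Two initial descendants on separate lines from the common ancestor.
    leaves += [new_node(0, generations), new_node(0, generations)]

    while len(leaves) < descendants:
        # Pick an existing line and a generation between ancestor and today.
        child = rng.choice(leaves)
        g = rng.randint(1, generations - 1)
        # Climb until the edge (parent[child], child) spans generation g.
        while gen[parent[child]] >= g:
            child = parent[child]
        if gen[child] == g:
            branch = child                        # reuse an existing branching point
        else:
            branch = new_node(parent[child], g)   # split the edge at generation g
            parent[child] = branch
            snp[branch] = f"SNP{len(snp)}"
        leaves.append(new_node(branch, generations))
    return parent, gen, snp, leaves

def snps_for(leaf, parent, snp):
    """SNPs a descendant is positive for, ordered from the ancestor downward."""
    path = []
    node = parent[leaf]
    while node is not None:
        path.append(snp[node])
        node = parent[node]
    return path[::-1]

parent, gen, snp, leaves = build_haplotree(generations=20, descendants=6, seed=1)
for kit, leaf in enumerate(leaves, 1):
    full = snps_for(leaf, parent, snp)
    # Full SNP path, then an "every other SNP" partial report.
    print(kit, full, full[::2])
```

The `full[::2]` slice mimics the "every other SNP" reporting mode; reporting only the upper branches could be done the same way with something like `full[:2]`.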

The program then “walks” the haplotree starting with an assigned STR haplotype at the common ancestor and evolves the Y111 STR set from FTDNA over the generations backwards or forwards randomly according to their published mutation rates (I’m using Chandler/Little rates but these can be adjusted). This produces randomized but representative STR data at the Y111 level for the entire group as well. Note that this is truly random and done step-wise (1 step at a time) for ALL markers, so it doesn’t try to replicate all the oddities of STR mutations like multi-copy marker infinite-allele mutations, RecLOHs, null values, microalleles, or the possible higher odds of back-mutations at higher allele values.
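The step-wise STR evolution can be sketched roughly as follows, again in Python rather than the PHP the program actually uses. The sketch assumes "backwards or forwards" means a marker moves down or up by one repeat with equal odds. The function name `evolve`, the three-marker haplotype, and the mutation rates are placeholders chosen for illustration only; they are not the Chandler/Little rates or a real Y111 haplotype.

```python
import random

def evolve(haplotype, rates, generations, rng):
    """Return a copy of `haplotype` after step-wise mutation over `generations`."""
    result = dict(haplotype)
    for _ in range(generations):
        for marker, rate in rates.items():
            # Each marker mutates independently with its per-generation rate,
            # moving exactly one repeat up or down.
            if rng.random() < rate:
                result[marker] += rng.choice((-1, 1))
    return result

rng = random.Random(42)

# Placeholder ancestral haplotype and rates (illustrative values only).
ancestor = {"DYS393": 13, "DYS390": 24, "DYS19": 14}
rates = {"DYS393": 0.001, "DYS390": 0.003, "DYS19": 0.002}

# Tiny tree: a branching point 8 generations below the ancestor, then two
# descendants 12 generations further down, plus one descendant on a separate
# line 20 generations straight from the ancestor.
branch = evolve(ancestor, rates, 8, rng)
kit_a = evolve(branch, rates, 12, rng)
kit_b = evolve(branch, rates, 12, rng)
kit_c = evolve(ancestor, rates, 20, rng)

for name, kit in (("A", kit_a), ("B", kit_b), ("C", kit_c)):
    print(name, kit)
```

Because kits A and B both start from the haplotype at the shared branching point, any mutation that occurred above that point is inherited by both, which is what makes the STR data consistent with the tree.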

Since this is randomized test data it won’t reproduce the actual mutations in anyone’s existing matches, but if you want more relatives that evolved along new branches I can create as many as you’d like :).

I’m using the output to test a tool but if it will help anyone else I’m happy to generate more. All I really need for the simulation is 1) what starting haplotype you’d like to use, 2) how many generations back to the common ancestor, and 3) how many current descendants to produce.


12-31-2018, 03:21 PM
When do you plan to release it?

12-31-2018, 03:30 PM
When do you plan to release it?

I hadn’t created a user-friendly interface for it because I didn’t think it was going to be widely needed, but I can. It’s a PHP program on a common server so it just needs an interface.

I can run it by request for now but I’ll put up a front-end for it in a week or so.

12-31-2018, 08:39 PM
Could this be used to find common descent from historical figures in other haplogroups outside your own?

12-31-2018, 10:07 PM
Could this be used to find common descent from historical figures in other haplogroups outside your own?

I don’t see how. This only mimics the evolutionary mutation process for STRs and SNPs over a period of time from a known common ancestor to a certain number of descendants today. It could, for instance, generate data simulating a whole haplogroup project.

Normally in Y-DNA testing we’re trying to recreate the ancestral tree for a group of men using the STR and SNP data discovered through testing; this starts the other way and creates STR and SNP data from a randomized ancestral tree. The value is that the ancestral tree is known already. But the data itself is fictitious, and the SNP haplotree and SNP labels are all new and don’t match any of the existing SNPs out there.

12-31-2018, 10:09 PM
Dave, sent you an email with a possible candidate.