After five years of searching, one gene at a time, one variant at a time, I have discovered six or seven novel variants, and about the same number of extremely rare variants, in my son's whole exome.

So, the obvious question: researchers routinely state in their papers that they have discovered some number of novel and/or extremely rare variants in however many whole genomes or whole exomes they sequenced.

How are they finding those novel and extremely rare variants? Allele frequencies are not typically part of the annotation data for whole exomes and whole genomes, so it really beats me how they are finding those variants. Presumably they are using sophisticated software to search through millions of recorded variants in public and private databases to determine the population frequency of each and every one of them. The computing power needed to accomplish that seems impossible to me, but it must be fairly common. I guess I'm assuming that they are using huge arrays of linked computers to accomplish this? Variants are identified by comparing the sequencing results against the human reference sequence, but beyond that, the frequency of each variant must be determined by checking public (and possibly private) databases. Am I wrong? Is it possible to do this with a PC?
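
To make the question concrete, this is roughly the lookup I imagine happens for each variant. It is only a toy sketch in Python: the function name classify_variants, the 0.001 rarity cutoff, and the coordinates and frequencies are all made up by me for illustration, and a real database would hold millions of entries rather than two.

```python
from typing import Dict, List, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
VariantKey = Tuple[str, int, str, str]


def classify_variants(
    sample: List[VariantKey],
    population_af: Dict[VariantKey, float],
    rare_threshold: float = 0.001,  # assumed cutoff, not a standard value
) -> Dict[str, List[VariantKey]]:
    """Sort a sample's variants into novel, rare, and common groups."""
    groups: Dict[str, List[VariantKey]] = {"novel": [], "rare": [], "common": []}
    for variant in sample:
        af = population_af.get(variant)  # None if the database has never seen it
        if af is None:
            groups["novel"].append(variant)
        elif af < rare_threshold:
            groups["rare"].append(variant)
        else:
            groups["common"].append(variant)
    return groups


# Made-up stand-in for a population frequency database.
population_af = {
    ("1", 12345, "A", "G"): 0.25,
    ("2", 67890, "C", "T"): 0.0002,
}

# Made-up stand-in for the variants called in one exome.
sample = [
    ("1", 12345, "A", "G"),
    ("2", 67890, "C", "T"),
    ("7", 55555, "G", "A"),
]

groups = classify_variants(sample, population_af)
for label, variants in groups.items():
    print(label, variants)
```

With this made-up data, the first variant ends up under "common", the second under "rare", and the third under "novel" because it is not in the table at all. What I can't judge is whether the same idea holds up against the real databases.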

Anybody have any ideas?