Several ancient genomes have been posted online as text files and uploaded to GEDmatch over the last couple of weeks, and many more are likely to follow in the future. A lot of people have already taken this opportunity to analyze these files with various online ancestry tools, usually DIY calculators.
That's actually not a bad way of doing things, as long as everyone's aware that almost all of these calculators produce biased results. They produce biased results because they violate a very basic rule of science, which is this:
Do not test more than one variable at a time.
Obviously, the variable we want to test with these calculators is ancestry. However, when the reference samples are tested in a different way to the test samples, which is what usually happens, then this adds another variable to the proceedings. As a result, we simply can't compare the results of the reference samples to those of the test samples.
I know that a lot of people find this difficult to grasp, and many just seem hell-bent on not grasping it. However, anyone who isn't completely insane, and takes five minutes out of their day to try to understand the concepts involved, has to agree that this is a real problem. It can be proven empirically, as I did over two years ago (see here).
I suspect that a lot of confusion has been caused by the fact that the people who were used as reference samples in the making of the various DIY calculators saw highly accurate results when running them, and so assumed everything was fine. The accuracy of the DIY calculators for such people is indeed impressive, and I show that at the link above, but unfortunately the story is very different for everyone else.
Here's the good news: the Eurogenes calculators don't suffer from this bias, known as the calculator effect. That's because the reference samples are treated in the same way as the test samples, so there's only one variable: ancestry. What this means is that when you run a modern or ancient genome with a Eurogenes calculator you can confidently compare its result to those of the reference samples (provided enough SNPs are used), and then make sensible inferences about its genetic origins.
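To make the idea concrete, here's a toy simulation in Python. It's only a sketch under my own assumptions, not a reproduction of how any DIY calculator or Eurogenes calculator actually works: two made-up populations, simulated genotypes, and a simple least-squares score standing in for a real admixture estimate. The function name `simulate`, its parameters, and the leave-one-out step are all hypothetical choices for illustration. The point is just to show that samples which helped build the reference allele frequencies score differently from equally "pure" samples that didn't.

```python
import random


def simulate(n_ref=10, n_snps=2000, seed=1):
    """Toy model: score population-A individuals against reference frequencies."""
    rng = random.Random(seed)

    # Hypothetical true allele frequencies for two closely related populations.
    p_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
    p_b = [min(0.95, max(0.05, p + rng.gauss(0.0, 0.05))) for p in p_a]

    def genotype(freqs):
        # Diploid genotype per SNP: 0, 1 or 2 copies of the tracked allele.
        return [(rng.random() < p) + (rng.random() < p) for p in freqs]

    def allele_freqs(samples):
        # Estimated allele frequency per SNP from a list of genotypes.
        return [sum(col) / (2 * len(samples)) for col in zip(*samples)]

    def score(sample, f_a, f_b):
        # Unconstrained least-squares estimate of the A-like fraction:
        # close to 1 means "looks like reference A", close to 0 "like B".
        num = sum((g / 2 - b) * (a - b) for g, a, b in zip(sample, f_a, f_b))
        den = sum((a - b) ** 2 for a, b in zip(f_a, f_b))
        return num / den

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    ref_a = [genotype(p_a) for _ in range(n_ref)]   # references, population A
    ref_b = [genotype(p_b) for _ in range(n_ref)]   # references, population B
    test_a = [genotype(p_a) for _ in range(n_ref)]  # fresh people, population A

    f_a, f_b = allele_freqs(ref_a), allele_freqs(ref_b)

    # 1. References scored against frequencies that include their own genotypes.
    in_sample = mean(score(s, f_a, f_b) for s in ref_a)
    # 2. Test samples from the same population, scored against the same frequencies.
    out_of_sample = mean(score(s, f_a, f_b) for s in test_a)
    # 3. References treated like test samples: each one is left out before the
    #    A frequencies are estimated (an assumption made for this toy only).
    held_out = mean(
        score(s, allele_freqs(ref_a[:i] + ref_a[i + 1:]), f_b)
        for i, s in enumerate(ref_a)
    )
    return in_sample, out_of_sample, held_out


if __name__ == "__main__":
    for label, value in zip(("in-sample", "out-of-sample", "held-out"), simulate()):
        print(f"{label}: {value:.2f}")
```

In this toy, the first average comes out at 1 by construction, because the reference samples are being compared to frequencies built from their own genotypes. The test samples come from exactly the same population, yet should score well below 1, which is the second variable sneaking in. Once each reference sample is held out and treated like a test sample, its score should drop to roughly the same level as the test samples, and the two groups become comparable again.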