Focusing on European population genetics and modern physical anthropology.
Monday, March 28, 2016
PC/nMonte open thread
Below are a few nMonte models of ancient individuals based on 25 principal components (PCs). The relevant datasheet and nMonte R script can be downloaded here and here, respectively.
Many of the outcomes are basically perfect. Others could certainly be better. But they all make sense.
The more complex the ancestry, the more difficult it is to model. Also, deamination, low coverage and missing markers are probably skewing the results to some degree for most of these samples. So, although it's time-consuming, it might be a good idea to model population averages, minus the most obvious outliers, rather than individual samples.
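For anyone who wants to try the averaging idea, here is a minimal Python sketch of one way to build a population average after dropping obvious outliers. It is not part of the nMonte script. The function name `trimmed_population_average`, the `max_ratio` cutoff and the median-distance rule are all assumptions made for illustration, and the coordinates in the usage note are made up.

```python
import math
from statistics import mean, median


def trimmed_population_average(samples, max_ratio=2.0):
    """Average PC coordinates for one population, skipping obvious outliers.

    samples   -- list of equal-length lists of PC coordinates (one per individual)
    max_ratio -- hypothetical cutoff: a sample is dropped when its distance from
                 the per-dimension median exceeds max_ratio times the median
                 distance of all samples in the population
    Returns (average_coordinates, number_of_samples_dropped).
    """
    dims = range(len(samples[0]))
    # Use the per-dimension median as a robust centre for the population.
    center = [median(s[d] for s in samples) for d in dims]
    dists = [math.dist(s, center) for s in samples]
    cutoff = max_ratio * median(dists)
    kept = [s for s, dist in zip(samples, dists) if dist <= cutoff]
    average = [mean(s[d] for s in kept) for d in dims]
    return average, len(samples) - len(kept)
```

Calling it on `[[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]` drops the last sample and averages the other four. The cutoff is arbitrary, so it is worth checking by eye which samples get removed before trusting the average.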
Are there any other ways to improve the analysis? Is 25 dimensions too many or too few? Let's run plenty of tests and see where this takes us.
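To make the dimension question easier to test, here is a rough Python sketch of a random-search mixture fit that uses only the first `n_dims` PCs. It is a stand-in for experimenting, not the nMonte R script, and it may differ from how nMonte works internally. The function name `fit_mixture`, the `slots`, `iters` and `seed` parameters, and the source names in the usage note are all assumptions.

```python
import math
import random


def fit_mixture(target, sources, n_dims, slots=100, iters=5000, seed=0):
    """Fit target as a mixture of sources using the first n_dims coordinates.

    target  -- list of PC coordinates for the individual being modelled
    sources -- dict mapping a source name to its list of PC coordinates
    slots   -- hypothetical resolution: the mixture is built from this many
               equal-weight slots, so proportions come in steps of 1/slots
    Returns (proportions_by_source, final_distance_to_target).
    """
    rng = random.Random(seed)
    names = sorted(sources)
    t = target[:n_dims]
    src = {name: sources[name][:n_dims] for name in names}

    # Start from a random assignment of sources to slots.
    assign = [rng.choice(names) for _ in range(slots)]
    total = [sum(src[a][d] for a in assign) for d in range(n_dims)]

    def distance(tot):
        # Euclidean distance between the slot average and the target.
        return math.dist([x / slots for x in tot], t)

    best = distance(total)
    for _ in range(iters):
        i = rng.randrange(slots)
        new = rng.choice(names)
        old = assign[i]
        if new == old:
            continue
        # Swap one slot and keep the change only if the fit improves.
        cand = [tot - src[old][d] + src[new][d] for d, tot in enumerate(total)]
        cand_dist = distance(cand)
        if cand_dist < best:
            assign[i], total, best = new, cand, cand_dist

    proportions = {name: assign.count(name) / slots for name in names}
    return proportions, best
```

Running the same target and sources with, say, `n_dims=9` and `n_dims=25` and comparing the proportions and distances is one simple way to see how much the extra dimensions change the outcome. Note that distances from fits with different numbers of dimensions are not directly comparable, since adding dimensions can only keep the distance the same or make it larger.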
I can update the datasheet with many more populations and dimensions later this week. Feel free to post your requests in the comments and I'll run them if I have them. Also, if anyone's wondering, I don't know yet which commercial genotype files I can run in this test, if any. I'll check.
Update 04/04/2016: A modified datasheet with 50 dimensions and many more samples is available here. It should be more useful for modeling South Central Asians, especially the Kalash. However, as far as I can tell, using just 9 dimensions, as in the version here, is faster and produces more accurate results.