Monday, December 16, 2013
Cluster analysis of West Eurasia: 13 clusters from 18 dimensions
I ran a quick Mclust analysis to get a better idea of the substructures in my recently updated dataset of West Eurasian samples. Mclust found that the optimal outcome was produced with 18 dimensions of genetic variation and 13 clusters, the latter of which are superimposed on a two-dimensional MDS plot below. I chose the labels for the clusters myself and flipped the canvas to fit geography.
Here you can see the 13 clusters superimposed on all possible combinations of the 18 dimensions. Clicking on the image will take you to a 10.3MB PDF file.
It's interesting to note the presence of the very tight Jewish cluster, which includes Ashkenazi, Sephardic and Moroccan Jews. The Basques and Sardinians also cluster together, despite being clearly distinct from each other in the first two dimensions. This is fascinating because these two groups have been mentioned a few times now in various studies and presentations as being the best modern proxies for Europe's Neolithic farmers.
The widespread Central and Eastern European cluster mostly includes individuals from populations that aren't easily characterized in these sorts of tests, basically because they're of mixed origin. Indeed, I suspect things would look somewhat different in that part of the plot if I had larger samples from Germany, Scandinavia, Poland and nearby areas.
Mclust can produce many more clusters than just 13 from the same data, but as per above, I wanted to see what would happen if it was asked to come up with the optimal solution. For more on this type of analysis check out the articles here, here and here.
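For anyone wondering what an "optimal solution" means in practice: Mclust is an R package that fits Gaussian mixture models and ranks them with the Bayesian Information Criterion (BIC), which rewards fit but penalizes extra parameters. The Python below is only a rough sketch of that idea, not Mclust itself and not the code used for this analysis. It works on a single made-up dimension with three artificial groups, and the function names (`fit_gmm_1d`, `bic`), the toy data and the variance floor are all my own assumptions for illustration.

```python
import math
from statistics import NormalDist


def mixture_loglik(xs, weights, means, variances):
    """Log-likelihood of the points under a 1-D Gaussian mixture."""
    total = 0.0
    for x in xs:
        dens = sum(
            w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(max(dens, 1e-300))  # guard against underflow to zero
    return total


def fit_gmm_1d(xs, k, iters=200):
    """Fit a k-component 1-D Gaussian mixture by EM; return its log-likelihood."""
    n = len(xs)
    ordered = sorted(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    min_var = 1e-4 * grand_var  # assumed floor: stops a component collapsing to a spike

    # Deterministic start: means at evenly spaced quantiles, one broad shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: how responsible is each component for each point?
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)
            weights[j] = nj / n
    return mixture_loglik(xs, weights, means, variances)


def bic(loglik, k, n):
    """BIC in the lower-is-better form: -2 * loglik + (free parameters) * ln(n)."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return -2 * loglik + n_params * math.log(n)


# Toy data: three well separated groups of 40 points each (purely artificial).
offsets = [NormalDist().inv_cdf((i + 0.5) / 40) for i in range(40)]
data = [centre + z for centre in (0.0, 10.0, 20.0) for z in offsets]

# Try 1 to 6 clusters and keep the solution with the lowest BIC.
bics = {k: bic(fit_gmm_1d(data, k), k, len(data)) for k in range(1, 7)}
best_k = min(bics, key=bics.get)
print(best_k, {k: round(v, 1) for k, v in bics.items()})
```

The same kind of comparison can be repeated across different numbers of input dimensions as well as different numbers of clusters. Asking for more clusters than the best-scoring solution is always possible, but the extra parameters have to pay for themselves in fit.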
Update 17/12/2013: On a related note, here's an Mclust analysis of West, Central and South Asia. The optimal result was obtained with 10 dimensions and 14 clusters. Please note that although some of the clusters have the same names as in the analysis above, they aren't the same clusters.
Principal component analysis (PCA) of West Eurasia
Multidimensional views of South Asia, West Asia and Eastern Europe
Eurogenes' North Euro clusters - phase 2, final results