Friday, April 3, 2015
The teal people: did they actually exist, and if so, who were they?
The ADMIXTURE analysis in Haak et al. 2015 includes a series of intriguing teal-colored components from K=16 to K=20 (see image here). The main reason I'm so intrigued by these components is that they generally make up over 40% of the genetic structure of the potentially Proto-Indo-European Yamnaya genomes.
But there's only so much one can learn by staring at a bar graph, so I thought I'd have a go at isolating the same signal with ADMIXTURE to study it in more detail. You can view the results of my experiment in the spreadsheet here.
I wasn't able to completely nail any one of the teal components from Haak et al., because I don't have access to all of the samples used in the paper (I'd have to sign a waiver to get them). Nevertheless, the signal looks basically the same.
Below is a bar graph based on the output featuring selected populations and ancient genomes from Europe and Asia. The Fst genetic distances between the nine components are available here.
Note that the teal component peaks in the Caucasus and the Hindu Kush, and generally shows a strong correlation with regions of relatively high MA1-related or Ancient North Eurasian (ANE) admixture. On the other hand, the orange component peaks among Early European Farmers (EEF), who basically lack ANE.
K8 model. As expected, the teal component harbors a high level of ANE, while the orange component lacks it altogether. Refer to the spreadsheet here.
It's very likely that the teal and orange components from Haak et al. share these traits. I think this is obvious from their frequencies across space and time in Eurasia.
I also analyzed the synthetic individuals with PCA based on their K8 ancestry proportions. The samples representing the orange component fall just south of the Stuttgart genome from Neolithic Germany, and this is basically where I expect Neolithic genomes from the Near East to cluster when they become available.
Interestingly, the samples representing the blue component are dead ringers for Scandinavian hunter-gatherers (SHG). However, I suspect this is something of a coincidence caused by the small number of Western European hunter-gatherer (WHG) and Eastern hunter-gatherer (EHG) genomes in the dataset. The algorithm probably doesn't have enough variation to latch onto to create both WHG and EHG components, and in the end settles for something in between, which just happens to resemble SHG.
But the fact that the orange and blue samples more or less pass for ancient populations leaves open the possibility that the same might be said for the teal samples.
So did the teal people actually exist, and if so, who were they?
My view at the moment is that a population very similar to the teal samples formed in Central Asia or the North Caucasus during the Neolithic as a result of admixture between MA1-like and Near Eastern groups. This population, I believe, then expanded into the Russo-Kazakh steppe by the onset of the Eneolithic.
Were they perhaps the Proto-Indo-Europeans? Probably not. I'd say they were Neolithic farmers who eventually played a role in the formation of the Proto-Indo-Europeans. In any case, someone had to bring the Caucasian or Central Asian admixture to the steppe, and I have it on good authority that it was already present among the Khvalynsk population of the Eneolithic, albeit at a lower level than among the Yamnaya of the early Bronze Age.
Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317
Update 16/11/2015: 'Fourth strand' of European ancestry originated with (Caucasus) hunter-gatherers isolated by Ice Age