Thursday, April 30, 2015

The enigma of the Kalash

Last year Garrett Hellenthal et al. claimed that the Kalash people of the Hindu Kush received a large pulse of admixture from somewhere in the west, possibly Europe, as late as 990–210 BCE. They even suggested that Alexander's soldiers may have been the culprits. But this was naive and wrong.

Now, Qasim Ayub et al. are claiming that the Kalash are an Ancient North Eurasian (ANE) population that has remained genetically isolated for the past 11,800 years. This is also naive and wrong.

One day, perhaps in the not too distant future, someone will study the population history of the Hindu Kush using ancient DNA and methods that actually work. What I think they will find is that the Kalash, just like most of their neighbors, are largely the result of an admixture event during the Bronze Age between Indo-Iranian migrants from the steppe and Central Asian agriculturists. They will confirm that the Kalash are an extreme isolate, but only since the Bronze Age, not the early Neolithic.

These results will correlate very nicely with mainstream linguistics and archeology, latest expansion dates for uniparental markers, and even common sense.


Ayub et al., The Kalash Genetic Isolate: Ancient Divergence, Drift, and Selection, The American Journal of Human Genetics (2015),

See also...

The teal people: did they actually exist, and if so, who were they?

Friday, April 3, 2015

The teal people: did they actually exist, and if so, who were they?

The ADMIXTURE analysis in Haak et al. 2015 includes a series of intriguing teal-colored components from K=16 to K=20 (see image here). The main reason I'm so intrigued by these components is that they generally make up over 40% of the genetic structure of the potentially Proto-Indo-European-speaking Yamnaya people.

But there's only so much one can learn by staring at a bar graph, so I thought I'd have a go at isolating the same signal with ADMIXTURE to study it in more detail. You can view the results of my experiment in the spreadsheet here.

I wasn't able to completely nail any one of the teal components from Haak et al., because I don't have access to all of the samples used in the paper (I'd have to sign a waiver to get them). Nevertheless, the signal looks basically the same.

Below is a bar graph based on the output featuring selected populations and ancient genomes from Europe and Asia. The Fst genetic distances between the nine components are available here.

Note that the teal component peaks in the Caucasus and the Hindu Kush, and generally shows a strong correlation with regions of relatively high MA1-related or Ancient North Eurasian (ANE) admixture. On the other hand, the orange component peaks among Early European Farmers (EEF), who basically lack ANE.

To learn about the structure of the three main West Eurasian components - blue, orange and teal - I made synthetic individuals from the P output to represent each of the components, and tested them with my K8 model. As expected, the teal component harbors a high level of ANE, while the orange component lacks it altogether. Refer to the spreadsheet here.
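For anyone wondering what making synthetic individuals from the P output might look like in practice, here's a minimal sketch. It assumes the P file has been loaded as a matrix with one row per SNP and one column per component, each entry being an allele frequency, and it simply draws two alleles per SNP at the chosen component's frequency. The function name, the toy matrix and the column index are all made up for illustration, and this isn't necessarily the exact procedure used for the spreadsheet.

```python
import numpy as np

def synthetic_individuals(p_matrix, component, n_individuals, seed=0):
    """Draw diploid genotypes (0, 1 or 2 allele copies) for one component.

    p_matrix is assumed to be SNPs x components, holding allele frequencies.
    """
    rng = np.random.default_rng(seed)
    freqs = np.clip(np.asarray(p_matrix, dtype=float)[:, component], 0.0, 1.0)
    # Each genotype is the count of the tracked allele across two
    # independent draws at that SNP's frequency in the chosen component.
    return rng.binomial(2, freqs, size=(n_individuals, freqs.size))

# Toy stand-in for a P matrix: 4 SNPs x 3 components (invented numbers).
toy_p = np.array([
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
    [0.2, 0.8, 0.5],
    [0.9, 0.1, 0.5],
])

# Five synthetic individuals for the first column (hypothetical component).
genotypes = synthetic_individuals(toy_p, component=0, n_individuals=5)
print(genotypes)
```

The resulting genotype matrix can then be written out in whatever format the downstream test expects. Independent draws per SNP ignore linkage, which is a simplification.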

It's very likely that the teal and orange components from Haak et al. share these traits. I think this is more than obvious by looking at their frequencies across space and time in Eurasia.

I also analyzed the synthetic individuals with PCA based on their K8 ancestry proportions. The samples representing the orange component fall just south of the Stuttgart genome from Neolithic Germany, and this is basically where I expect Neolithic genomes from the Near East to cluster when they become available.
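A PCA on ancestry proportions is easy enough to reproduce in principle. The sketch below centers a samples-by-components table and takes its singular value decomposition with NumPy. The proportions and sample labels are invented purely for illustration, so they aren't the K8 results, and the function name is hypothetical.

```python
import numpy as np

def pca_scores(proportions, n_components=2):
    """Project samples onto the top principal components of a proportions table."""
    x = np.asarray(proportions, dtype=float)
    centered = x - x.mean(axis=0)  # remove the per-component mean
    # SVD of the centered table: rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Invented ancestry proportions: 4 hypothetical samples x 3 components.
labels = ["sample_a", "sample_b", "sample_c", "sample_d"]
toy_proportions = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.15, 0.15, 0.70],
    [0.40, 0.40, 0.20],
])

scores = pca_scores(toy_proportions)
for label, (pc1, pc2) in zip(labels, scores):
    print(f"{label}: PC1={pc1:.3f} PC2={pc2:.3f}")
```

Because each row of proportions sums to one, the table has at most one fewer informative dimension than it has components, so the first couple of PCs usually capture most of what there is to see.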

Interestingly, the samples representing the blue component are dead ringers for Scandinavian hunter-gatherers (SHG). However, I suspect this is something of a coincidence caused by the small number of Western European hunter-gatherer (WHG) and Eastern hunter-gatherer (EHG) genomes in the dataset. The algorithm probably doesn't have enough variation to latch onto to create both WHG and EHG components, and in the end settles for something in between, which just happens to resemble SHG.

But the fact that the orange and blue samples more or less pass for ancient populations leaves open the possibility that the same might be said for the teal samples.

So did the teal people actually exist, and if so, who were they?

My view at the moment is that a population very similar to the teal samples formed in Central Asia or the North Caucasus during the Neolithic as a result of admixture between MA1-like and Near Eastern groups. This population, I believe, then expanded into the Pontic-Caspian steppe by the onset of the Eneolithic.

Were they perhaps the Proto-Indo-Europeans? Probably not. I'd say they were Neolithic farmers who eventually played a role in the formation of the Proto-Indo-Europeans. In any case, someone had to bring the Caucasian or Central Asian admixture to the steppe, and I have it on good authority that it was already present among the Khvalynsk population of the Eneolithic, albeit at a lower level than among the Yamnaya of the early Bronze Age.


Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

Update 16/11/2015: 'Fourth strand' of European ancestry originated with (Caucasus) hunter-gatherers isolated by Ice Age