Thursday, April 30, 2015

The enigma of the Kalash

Last year Garrett Hellenthal et al. claimed that the Kalash people of the Hindu Kush received a large pulse of admixture from somewhere in the west, possibly Europe, as late as 327–326 BCE. They even suggested that Alexander's soldiers may have been the culprits. But this was naive and wrong.

Now, Qasim Ayub et al. are claiming that the Kalash are an Ancient North Eurasian (ANE) population that has remained genetically isolated for the past 11,800 years. This is also naive and wrong.

One day, perhaps in the not too distant future, someone will study the population history of the Hindu Kush using ancient DNA and methods that actually work. What I think they will find is that the Kalash, just like most of their neighbors, are largely the result of an admixture event during the Bronze Age between Indo-Iranian migrants from the steppe and Central Asian agriculturists. They will confirm that the Kalash are an extreme isolate, but only since the Bronze Age, not the early Neolithic.

These results will correlate very nicely with mainstream linguistics and archeology, latest expansion dates for uniparental markers, and even common sense.


Ayub et al., The Kalash Genetic Isolate: Ancient Divergence, Drift, and Selection, The American Journal of Human Genetics (2015),

Tuesday, April 21, 2015

West Eurasian mtDNA lineages in India

It might be interesting to compare the modern-day Indian mtDNA from this paper to the ancient European mtDNA from Haak et al. 2015. The data is freely available in the spreadsheet here. Anybody up for it?

There is no indication from the previous mtDNA studies that west Eurasian-specific subclades have evolved within India and played a role in the spread of languages and the origins of the caste system. To address these issues, we have screened 14,198 individuals (4208 from this study) and analyzed 112 mitogenomes (41 new sequences) to trace west Eurasian maternal ancestry. This has led to the identification of two autochthonous subhaplogroups, HV14a1 and U1a1a4, which are likely to have originated in the Dravidian-speaking populations approximately 10.5-17.9 thousand years ago (kya). The carriers of these maternal lineages might have settled in South India during the time of the spread of the Dravidian language. In addition to this, we have identified several subsets of autochthonous U7 lineages, including U7a1, U7a2b, U7a3, U7a6, U7a7, and U7c, which seem to have originated particularly in the higher-ranked caste populations in relatively recent times (2.6-8.0 kya with an average of 5.7 kya). These lineages have provided crucial clues to the differentiation of the caste system that has occurred during the recent past and possibly, this might have been influenced by the Indo-Aryan migration. The remaining west Eurasian lineages observed in the higher-ranked caste groups, like the Brahmins, were found to cluster with populations who possibly arrived from west Asia during more recent times.

Palanichamy et al., West Eurasian mtDNA lineages in India: an insight into the spread of the Dravidian language and the origins of the caste system, Human Genetics, 2015 Apr 2. [Epub ahead of print]

Friday, April 3, 2015

The teal people: did they actually exist, and if so, who were they?

The ADMIXTURE analysis in Haak et al. 2015 includes a series of intriguing teal-colored components from K=16 to K=20 (see image here). The main reason I'm so intrigued by these components is that they generally make up over 40% of the genetic structure of the potentially Proto-Indo-European Yamnaya genomes.

But there's only so much one can learn by staring at a bar graph, so I thought I'd have a go at isolating the same signal with ADMIXTURE to study it in more detail. You can view the results of my experiment in the spreadsheet here.

I wasn't able to completely nail any one of the teal components from Haak et al., because I don't have access to all of the samples used in the paper (I'd have to sign a waiver to get them). Nevertheless, the signal looks basically the same.

Below is a bar graph based on the output featuring selected populations and ancient genomes from Europe and Asia. The Fst genetic distances between the nine components are available here.
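For readers who want a feel for how distances like these can be computed, here is a minimal sketch that derives pairwise Fst values from a matrix of component allele frequencies. It is an illustration only: ADMIXTURE reports its own Fst estimates, and the simplified Hudson-style ratio used here (no sample-size correction) is a stand-in that won't necessarily reproduce its numbers. The function names and the toy frequencies are hypothetical.

```python
from itertools import combinations


def hudson_fst(p1, p2):
    """Hudson-style Fst from two lists of allele frequencies (one value per SNP).

    Uses the ratio of sums across SNPs and assumes the frequencies are
    population values, so no sample-size correction is applied.
    """
    numerator = sum((a - b) ** 2 for a, b in zip(p1, p2))
    denominator = sum(a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2))
    if denominator == 0:
        # Both components are fixed for the same allele at every SNP.
        return 0.0
    return numerator / denominator


def pairwise_fst(p_matrix):
    """p_matrix: list of SNP rows, each row holding one frequency per component."""
    k = len(p_matrix[0])
    columns = [[row[j] for row in p_matrix] for j in range(k)]
    return {
        (i, j): hudson_fst(columns[i], columns[j])
        for i, j in combinations(range(k), 2)
    }


# Toy example: three components, two SNPs (made-up numbers).
toy = [
    [0.10, 0.10, 0.90],
    [0.20, 0.20, 0.80],
]
distances = pairwise_fst(toy)
for pair, value in sorted(distances.items()):
    print(pair, round(value, 3))
```

With real output, the rows would come from the full set of SNPs and the dictionary keys would map onto the nine components.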

Note that the teal component peaks in the Caucasus and the Hindu Kush, and generally shows a strong correlation with regions of relatively high MA1-related or Ancient North Eurasian (ANE) admixture. On the other hand, the orange component peaks among Early European Farmers (EEF), who basically lack ANE.

To learn about the structure of the three main West Eurasian components - blue, orange and teal - I made synthetic individuals from the P output to represent each of the components, and tested them with my K8 model. As expected, the teal component harbors a high level of ANE, while the orange component lacks it altogether. Refer to the spreadsheet here.
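For anyone wanting to try something similar, below is a rough sketch of one way to build synthetic individuals from a P file. It assumes the P file is whitespace-delimited text with one row per SNP and one column per component, that each value is the frequency of the counted allele in that component, and that genotypes can be drawn as two independent allele draws per SNP. The function names are hypothetical, and this isn't necessarily the exact procedure behind the spreadsheet.

```python
import random


def read_component_frequencies(p_lines, component):
    """Pull one component's allele frequencies out of P-file style lines."""
    freqs = []
    for line in p_lines:
        fields = line.split()
        if fields:  # skip blank lines
            freqs.append(float(fields[component]))
    return freqs


def simulate_individuals(allele_freqs, n_individuals, seed=0):
    """Draw diploid genotypes (0, 1 or 2 allele copies per SNP).

    Each genotype is the sum of two independent draws with probability p,
    which amounts to sampling from Binomial(2, p).
    """
    rng = random.Random(seed)
    individuals = []
    for _ in range(n_individuals):
        genotype = [
            int(rng.random() < p) + int(rng.random() < p)
            for p in allele_freqs
        ]
        individuals.append(genotype)
    return individuals


# Toy P file with three SNPs and two components (made-up numbers).
toy_p = ["0.0 1.0", "1.0 0.0", "0.5 0.5"]
first_component = read_component_frequencies(toy_p, 0)
synthetic = simulate_individuals(first_component, n_individuals=5)
for genotype in synthetic:
    print(genotype)
```

The simulated genotypes would then need to be written out in whatever format the downstream test expects, which is left out here.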

It's very likely that the teal and orange components from Haak et al. share these traits. I think this is obvious from their frequencies across space and time in Eurasia.

I also analyzed the synthetic individuals with PCA based on their K8 ancestry proportions. The samples representing the orange component fall just south of the Stuttgart genome from Neolithic Germany, and this is basically where I expect Neolithic genomes from the Near East to cluster when they become available.
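To make the idea of a PCA on ancestry proportions concrete, here is a small sketch using only the Python standard library. It centers a table of proportions, builds the covariance matrix, and extracts the top two axes by power iteration with deflation. The sample labels and the three-column proportions are invented for illustration (the K8 model has eight components), and a dedicated statistics package would be the more usual choice for real data.

```python
import math
import random


def center(rows):
    """Subtract each column's mean from every row."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[r[j] - means[j] for j in range(k)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, k = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def top_eigenpair(matrix, iterations=500, seed=0):
    """Largest eigenvalue and its eigenvector via power iteration."""
    rng = random.Random(seed)
    k = len(matrix)
    v = [rng.random() + 0.1 for _ in range(k)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    return sum(a * b for a, b in zip(v, mv)), v


def pca(rows, n_components=2):
    """Return per-sample scores and the variance along each axis."""
    centered = center(rows)
    cov = covariance(centered)
    k = len(cov)
    axes, variances = [], []
    for _ in range(n_components):
        value, vec = top_eigenpair(cov)
        axes.append(vec)
        variances.append(value)
        # Deflate so the next pass finds the next-largest axis.
        cov = [
            [cov[i][j] - value * vec[i] * vec[j] for j in range(k)]
            for i in range(k)
        ]
    scores = [[sum(r[j] * a[j] for j in range(k)) for a in axes] for r in centered]
    return scores, variances


# Hypothetical ancestry proportions (each row sums to 1).
samples = {
    "orange_like": [0.05, 0.90, 0.05],
    "blue_like": [0.85, 0.10, 0.05],
    "teal_like": [0.10, 0.30, 0.60],
    "mixed": [0.35, 0.40, 0.25],
}
scores, variances = pca(list(samples.values()))
for name, (pc1, pc2) in zip(samples, scores):
    print(f"{name}: PC1={pc1:.3f} PC2={pc2:.3f}")
```

Keep in mind that the sign of each axis is arbitrary, so only the relative positions of the samples are meaningful.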

Interestingly, the samples representing the blue component are dead ringers for Scandinavian hunter-gatherers (SHG). However, I suspect this is something of a coincidence caused by the small number of Western European hunter-gatherer (WHG) and Eastern hunter-gatherer (EHG) genomes in the dataset. The algorithm probably doesn't have enough variation to latch onto to create both WHG and EHG components, and in the end settles for something in between, which just happens to resemble SHG.

But the fact that the orange and blue samples more or less pass for ancient populations leaves open the possibility that the same might be said for the teal samples.

So did the teal people actually exist, and if so, who were they?

My view at the moment is that a population very similar to the teal samples formed in Central Asia or the North Caucasus during the Neolithic as a result of admixture between MA1-like and Near Eastern groups. This population, I believe, then expanded into the Russo-Kazakh steppe by the onset of the Eneolithic.

Were they perhaps the Proto-Indo-Europeans? Probably not. I'd say they were Neolithic farmers who eventually played a role in the formation of the Proto-Indo-Europeans. In any case, someone had to bring the Caucasian or Central Asian admixture to the steppe, and I have it on good authority that it was already present among the Khvalynsk population of the Eneolithic, albeit at a lower level than among the Yamnaya of the early Bronze Age.


Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

