Saturday, April 30, 2016

Y-hg J2 cannot be a Proto-Indo-European marker


The claim that the Proto-Indo-Europeans came from West Asia and largely belonged to Y-haplogroup J2 seems to be popular online nowadays. I won't discuss here in detail the reasons why, but suffice it to say it has a lot to do with aggressive lobbying on several online forums and blogs by a few people of Southern European extraction, like Dienekes Pontikos.

It was always a shaky proposition, but difficult to debunk thoroughly. Until now.

Thanks to recent advances in both modern and ancient DNA research, we can now safely say that Y-haplogroup J2 was not involved in any rapid, large-scale population expansions during the Late Neolithic/Early Bronze Age (LN/EBA), the generally accepted Proto-Indo-European time frame.

It thus fails to meet even the most basic criteria of a Proto-Indo-European diagnostic marker. The Proto-Indo-Europeans, after all, were surely highly patriarchal and patrilineal, and are therefore expected to have left a clear signal of their migrations in the Y-chromosomes of many present-day Indo-European speakers.

For instance, an analysis of data from the deep sequencing of human Y-chromosomes as part of the 1000 Genomes Project suggests that not a single major subclade of J2 began expanding even roughly close to the LN/EBA. See here.

In the plot above three lineages jump out at you. E1b, R1a, and R1b. The first is associated with the Bantu expansion, that occurred over the last 4,000 years. The second two are likely associated with Indo-Europeans in both Asia and Europe, respectively. The timescale is on the order of 4 to 5,000 years in the past. The association between culture and genes, or the genetic lineages of males, is rather clear, in these cases. In other instances the growth was more gradual. For example, the lineages likely associated with the first Neolithic pulses, J and G.

Moreover, not a single instance of J2 has been reported from remains classified as belonging to the Andronovo, Battle-Axe, Corded Ware, Khvalynsk, Poltavka, Potapovka, Sintashta, Srubnaya and Yamnaya archaeological cultures. In other words, it's missing from the Kurgan and Kurgan-derived groups generally accepted to be early Indo-European, which is a view that now has very strong support from ancient genomics. See here and here.

To date, most of these samples have probably come from elite burials. So at some point, when many more non-elite samples are sequenced, we are likely to see J2 among a few supposedly early Indo-European individuals. But so what?

There might be a couple of ways to salvage the Proto-Indo-Europeans = J2 theory. We'd have to argue that...

- the Proto-Indo-European time frame was actually the early Neolithic

and/or

- the Proto-Indo-Europeans were a small group that Indo-Europeanized the steppe Kurgan people, perhaps mainly via female migrations, and then did not partake in the main early Indo-European expansions

But the former is not particularly clever when viewed in the context of historical linguistics data. See here.

For instance, almost all IE language branches testify to a word designating ‘wool’. Since archaeological evidence suggests that wool sheep did not exist until the beginning of the fourth millennium BCE, the existence of the word in PIE would indicate that the disintegration of the proto-language could not have taken place before this date. Similarly, words for concepts such as ‘wheel’, ‘yoke’, ‘honey bee’ and ‘horse’ may be correlated directly with concrete, datable archaeological evidence.

And the latter isn't very parsimonious, and to me looks like special pleading. Why even bother?

Monday, April 25, 2016

Signals of ancient population explosions in our Y-chromosomes


Nature Genetics has a massive new paper on human Y-chromosomes based on the latest 1000 Genomes data. I'm still getting my head around the details, but at first glance it looks like a very capable effort. This part basically reads like some of my blog entries in recent years. The emphasis is mine.

In South Asia, we detected eight lineage expansions dating to ~4.0–7.3 kya and involving haplogroups H1-M52, L-M11, and R1a-Z93 (Supplementary Fig. 14b,d,e). The most striking were expansions within R1a-Z93, occurring 4.0–4.5 kya. This time predates by a few centuries the collapse of the Indus Valley Civilization, associated by some with the historical migration of Indo-European speakers from the Western Steppe into the Indian subcontinent [27]. There is a notable parallel with events in Europe, and future aDNA evidence may prove to be as informative as it has been in Europe.

Poznik et al., Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences, Nature Genetics, Published online 25 April 2016; doi:10.1038/ng.3559

See also...

The Poltavka outlier


Sunday, April 17, 2016

Estimating Basal Eurasian ancestry?


Basal Eurasians (BE) are a hypothetical ghost population that apparently split from other Eurasians no later than 45,000 years ago. If they actually existed, they had a significant impact on the ancestry of early Neolithic farmers, and thus all present-day West Eurasians.

Testing ancestry proportions from ghost populations isn't easy. However, Haak et al. 2015 made use of an f4 equation that seemingly gave an accurate estimate of BE admixture in LBK farmer Stuttgart: f4(Stuttgart,Loschbour;Onge,MA1)/f4(Mbuti,MA1;Onge,Loschbour) = 44%. The other LBK farmers scored an average of 40% BE, which also made sense.
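For anyone who wants to see the arithmetic behind a ratio like this, here's a minimal Python sketch. It assumes you already have per-SNP allele frequencies for each population as equal-length arrays, and it treats f4(A,B;C,D) as the plain average of (pA - pB)(pC - pD) across SNPs. The function names and the toy populations P, Q, C, D and X are made up for illustration. This isn't the pipeline used for the numbers in this post; real analyses are usually run with a tool like qpF4ratio from ADMIXTOOLS, which also gives block jackknife standard errors.

```python
import numpy as np


def f4(freqs, a, b, c, d):
    """Average over SNPs of (p_a - p_b) * (p_c - p_d).

    `freqs` maps a population label to an array of allele frequencies,
    one value per SNP, with the same SNP order for every population.
    """
    pa, pb, pc, pd = (np.asarray(freqs[k], dtype=float) for k in (a, b, c, d))
    return float(np.mean((pa - pb) * (pc - pd)))


def f4_ratio(freqs, numerator, denominator):
    """Ratio of two f4 statistics, each given as a 4-tuple of labels."""
    return f4(freqs, *numerator) / f4(freqs, *denominator)


# Toy data only (hypothetical populations, random frequencies).
rng = np.random.default_rng(0)
n_snps = 1000
toy = {name: rng.uniform(0.05, 0.95, n_snps) for name in ("P", "Q", "C", "D")}

# X is built as an exact 40/60 mix of P and Q at the frequency level.
toy["X"] = 0.4 * toy["P"] + 0.6 * toy["Q"]

# f4 is linear in its first argument, so this ratio recovers the 0.4.
alpha = f4_ratio(toy, ("X", "Q", "C", "D"), ("P", "Q", "C", "D"))
print(round(alpha, 6))
```

With real data, the Haak et al. equation above would correspond to numerator=("Stuttgart", "Loschbour", "Onge", "MA1") and denominator=("Mbuti", "MA1", "Onge", "Loschbour"). The toy example only checks the algebra. Whether the ratio can be read as an ancestry proportion depends on the assumed tree being right.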

Unfortunately, this equation doesn't appear to work too well for Caucasus Hunter-Gatherers (CHG) Kotias and Satsurblia. They both score around 25% BE, which, as far as I can see, seems way too low. Perhaps using MA1 in the equation is messing things up because CHG harbor significant MA1-related ancestry?

I tinkered around with Haak's equation and came up with this: f4(X,Iberia_Mesolithic;Dai,Karelia_HG)/f4(Mbuti,Karelia_HG;Dai,Iberia_Mesolithic). The results look solid, at least in relative terms (see image below). But is the equation actually valid?

My main worry is using both Iberia Mesolithic and Karelia HG. They share a lot of drift, much more than Loschbour and MA1. Also, even though both Dai and Onge belong to the so-called Eastern non-African (ENA) clade, they're quite distinct, with Dai a lot less basal in the context of ENA diversity. Any thoughts? Suggestions?


Update 04/18/2016: Interestingly, my f4 equation essentially fails for most post-Neolithic Europeans, particularly those with relatively high ratios of Karelia HG-related ancestry. For instance, Yamnaya Kalmykia scores just 2.9% BE, which can't be right. Yamnaya Samara shows -2.2%, which is obviously wrong.

But I tried several combinations of reference samples and found that by replacing Karelia HG with Hungary HG and Dai with Ust-Ishim I was able to obtain coherent results for a wider range of groups, including Yamnaya.


To be honest, I still don't know what the hell I'm testing here exactly. The results appear to reflect the existence of two components within West Eurasia: one representing ancient hunter-gatherers from Europe and probably surrounding areas of the Near East, and another closely related to present-day Near Eastern populations. The latter might well be a signal of the so-called Basal Eurasians, or perhaps a number of as yet unsampled metapopulations from the ancient Near East?