All of the post-Middle Neolithic samples from the recent Mittnik et al. and Saag et al. preprints on the ancient population history of the Baltic region belonged to Y-chromosome haplogroup R1a. And most of them belonged to the R1a-M417 (R1a1a) subclade that makes up almost 100% of the R1a lineages in the world today. This is what the results look like in a table (the sample IDs are of my own design):
Earlier samples from the same region belonged to Y-haplogroups I2a and R1a, but this was a subclade of R1a defined by the YP1272 mutation that is extremely rare today even in Northeastern Europe.
And now shifting our focus west of Scandinavia: all but two of the post-Middle Neolithic samples from around the North Sea from the recent Olalde et al. preprint on the Bell Beaker phenomenon and ancient population history of Northwest Europe belonged to Y-chromosome R1b, and more specifically to the R1b-M269 (R1b1a1a2) subclade, which makes up almost 100% of the R1b lineages in the world today. Here's a table:
Earlier samples from the same region belonged to Y-haplogroups I2a, I, G2a and CF, and most of the instances of I and the CF would probably be classified as I2a if not for missing data.
Interestingly, despite the R1a vs R1b dichotomy between these obvious post-Middle Neolithic newcomers to the Baltic and North Sea regions, respectively, they were very similar in terms of overall genetic structure. They were obviously closely related, starkly different from Middle Neolithic Northern Europeans, and in all likelihood mainly derived from the same homeland, which was not located in Northern Europe.
So can we locate this homeland with any degree of certainty, you might wonder? In fact, you might ask, isn't this a futile search for the time being, as we await ancient DNA from many prehistoric Eurasian populations?
Not at all, because when attempting to answer this question we're bounded by two key constraints: the exceptionally high frequencies of R1a and R1b in the post-Middle Neolithic Baltic and North Sea samples, and their close genetic affinity to earlier and contemporaneous populations from the Pontic-Caspian steppe, part of which is due to significant Caucasus Hunter-Gatherer (CHG) admixture that was lacking in Middle Neolithic Northern Europeans.
Indeed, to date, the Pontic-Caspian steppe is the only region where both R1a and R1b have been found in ancient remains from the same sites dating to the Mesolithic, Neolithic and Eneolithic. Here's a table based on results from Mathieson et al. 2015 and 2017. The R and R1 might really be R1a or R1b if not for missing data.
The Pontic-Caspian steppe also abuts the Caucasus foothills, and we know that CHG admixture was a major feature of its inhabitants from at least the Eneolithic. So odds are, and make no mistake, these are indeed excellent odds, that the homeland we're looking for was on the Pontic-Caspian steppe.
But of course I2a has also been recorded in prehistoric samples from the Pontic-Caspian steppe. So, you might ask, why did the populations migrating out of the steppe belong to R1a and R1b, and why did some of them seemingly carry only R1a while others only R1b? This can be explained by local founder effects on the steppe due to patrilocality. Moreover, it's possible that some groups moving out of the steppe did carry high frequencies of I2a, but they're yet to enter the ancient DNA record. [
Edit: Maybe they already have? See here]
Now, the aforementioned post-Middle Neolithic newcomers to the Baltic and North Sea regions are most certainly in large part the direct ancestors of modern-day Northern Europeans, who speak languages belonging to three daughter branches of late Proto-Indo-European (PIE): Balto-Slavic, Celtic and Germanic. It's highly unlikely that languages ancestral to these present-day languages were spoken by Middle Neolithic farmers, or that they were introduced into Northern Europe after it was colonized by the migrants from the Pontic-Caspian steppe.
What this strongly suggests is that the Pontic-Caspian steppe was also the late PIE homeland.
But, you might argue, the Pontic-Caspian steppe may have just been the expansion point for some of the late PIE language branches. No, that won't work. For one, modern-day populations speaking languages belonging to all other late PIE branches, such as Armenian, Greek, Indo-Iranian and Italic, show signals of the same population expansion from the Pontic-Caspian steppe that gave rise to modern-day Northern Europeans, in the form of Yamnaya-related genome-wide genetic admixture and appreciable frequencies of Y-chromosome haplogroups R1a-M417 and/or R1b-M269.
Some of these signals are certainly due to fairly recent admixture from Northern Europeans, like in much of Greece as a result of the Slavic expansions during the Early Middle Ages, but most cannot be explained in this way.
Secondly, Balto-Slavic, Celtic and Germanic are not more closely related to each other than to some of the other late PIE branches. For instance, Balto-Slavic is considered far more closely related to Indo-Iranian than to Celtic, which is generally seen as a sister branch to Italic. Therefore, if Balto-Slavic and Celtic derive from a homeland on the Pontic-Caspian steppe, then logically this is also where we should look for the origins of Indo-Iranian and Italic.
So as far as the late PIE homeland is concerned, thanks to ancient DNA, the debate is now practically over. But the PIE homeland debate is still wide open, or so we're told.
Apparently, Mathieson et al. 2017 aren't comfortable with putting the PIE homeland on the Pontic-Caspian Steppe because they can't find any evidence in their ancient DNA dataset of a significant migration through the Balkans that would potentially bring Anatolian languages from the Pontic-Caspian steppe to Anatolia. From the paper:
One version of the Steppe Hypothesis of Indo-European language origins suggests that Proto-Indo European languages developed in the steppe north of the Black and Caspian seas, and that the earliest known diverging branch – Anatolian – was spread into Asia Minor by movements of steppe peoples through the Balkan peninsula during the Copper Age around 4000 BCE, as part of the same incursions from the steppe that coincided with the decline of the tell settlements. [51] If this were correct, then one way to detect evidence of it would be the appearance of large amounts of characteristic steppe ancestry first in the Balkan Peninsula, and then in Anatolia. However, our genetic data do not support this scenario. While we find steppe ancestry in Balkan Copper Age and Bronze Age individuals, this ancestry is sporadic across individuals in the Copper Age, and at low levels in the Bronze Age. Moreover, while Bronze Age Anatolian individuals have CHG/Iran Neolithic related ancestry, they have neither the EHG ancestry characteristic of all steppe populations sampled to date [20], nor the WHG ancestry that is ubiquitous in southeastern Europe in the Neolithic (Figure 1A, Supplementary Data Table 2, Supplementary Information section 1). This pattern is consistent with that seen in northwestern Anatolia [11] and later in Copper Age Anatolia [23], suggesting continuing migration into Anatolia from the East rather than from Europe.
And this...
On the other hand, our data could still be consistent with the Steppe-Balkans-Anatolia route hypothesis model, albeit with constraints. It remains possible that populations dating to around 1600 BCE in the regions where the Indo-European Luwian, Hittite and Palaic languages were spoken did have European hunter-gatherer ancestry. However, our results would require that such ancestry was not ubiquitous in Bronze Age Anatolia, and was perhaps tightly linked to Indo-European speaking groups. We predict that additional insight about the genetic origins of the potential speakers of early Indo-European languages will be obtained when ancient DNA data become available from additional sites in this key period in Anatolia and the Caucasus.
But I'd say the authors are taking that one particular version of the Steppe Hypothesis way too seriously. They might even be implying things that the creator(s) of the said hypothesis never posited.
Why do they seemingly expect a massive surge of steppe admixture into the Balkans during the Copper Age? If the steppe people are just shooting through the Balkans on their way to Anatolia, why would they leave a lot of admixture along the way? And if the locals are abandoning their tell settlements and running for the hills as far away from the oncoming steppe invaders as they can, how exactly would they acquire steppe admixture? Osmosis or what?
The Balkans is not Northern Europe, and the hypothesized migration of the proto-Anatolians from the Pontic-Caspian Steppe to Anatolia through the Balkans was never, as far as I know, meant to parallel the massive Corded Ware expansion across Northern Europe. In other words, why should all of the early Indo-European expansions have been of the same character, especially considering that they moved into such starkly different areas of Eurasia?
Indeed, as Mathieson et al. 2017 point out in the quote above, the evidence for the fleeting presence of steppe peoples in the Copper Age Balkans is in their dataset. For instance, in their Varna 1 sample set from Bulgaria, three out of the five individuals show significant steppe admixture. One of these individuals is almost 50% Yamnaya-like. Surely, there's really no need to expect anything more than that when looking for signals of a proto-Anatolian migration from the Pontic-Caspian Steppe to Anatolia.
In fact, even though I do appreciate the incredible work these guys are doing and the data they're making available to me and everyone else, I suspect that there's a little bit of, shall we say, schadenfreude going on here.
They sequenced all of three Early Bronze Age Anatolians of obscure origin (are they actually suspected Anatolian speakers, like Luwians?), and apparently it's a big deal that they can't find any steppe admixture in Early Bronze Age Anatolia. Come on.
And then we're offered just three Yamnaya samples from the Pontic Steppe in Ukraine. One happens to be a massive outlier towards the Caucasus. Wow, what are the chances of that? And guess what, all three of these Yamnayans are females, so of course we're left wondering about the Y-haplogroups of the Yamnaya males on the Pontic Steppe. What happened to the males? Next paper, that's what.
Update 19/05/2017: Please note that the authors are not holding back any Yamnaya males from Ukraine for a future paper, contrary to my claim in the last paragraph above. They used what they had for the time being.
Update 21/05/2017: Actually, I suspect that we already have a population from the Bronze Age steppe in the ancient DNA record with a high frequency of Y-haplogroup I2a. See here.
See also...
R1a-M417 from Eneolithic Ukraine!!!11
Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts
Eastern Europe as a bifurcation hotspot for Y-hg R1
Globular Amphora people starkly different from Yamnaya people