An abstract book from a recent mathematics meeting in Estonia includes an abstract on the genetic impact of Bronze Age steppe pastoralists on Europe and South Asia. Titled A Pre-Existing Isolation by Distance Gradient in West Eurasia May Partly Account for the Observed “Steppe” Component in Europe, it's mostly authored by scientists from the Estonian Biocentre including Luca Pagani and Mait Metspalu. You can read it here.
Even though it's just an abstract of a paper that might never be published, it's so obviously wrong that I can't let it go. This is the sort of thing I'd expect to see from some of the half deranged visitors in the comments section at this blog, not scientists from the Estonian Biocentre.
First of all, even though the abstract doesn't spell out which data crunching algorithms were used by the authors, it's pretty clear to me that the main part of their analysis was run with ADMIXTURE. That basically makes it a pointless exercise from the outset, simply because ADMIXTURE is not designed for these types of analyses.
Why? Because it's impossible to accurately recapitulate ancient population structure with ADMIXTURE; the results are always significantly skewed in some way, usually by heavy genetic drift in one or more of the test populations. In other words, there's no way to truly revive ancient populations with ADMIXTURE components. And if you can't do that, then how can you estimate their impact more or less accurately? Not possible.
In any case, whether the authors relied on ADMIXTURE or not is immaterial to the fact that all of their main points are clearly wrong. Before I go through these points, and explain why they're wrong, I need to explain exactly what the Steppe component really is and isn't.
The Steppe component is the genetic structure of Early and Middle Bronze Age (EMBA) steppe pastoralist groups Afanasievo, Poltavka and Yamnaya. And it's a very specific thing. It isn't a component inferred from a random run of ADMIXTURE that peaks in Afanasievo, Poltavka and/or Yamnaya, or any other ancient populations.
So, Steppe component = Afanasievo, Poltavka and Yamnaya, or Steppe_EMBA. Nothing more, nothing less. Certainly nothing from outside of the steppe predating Afanasievo and Yamnaya.
Keep in mind also that Steppe_EMBA is a very specific mixture of older and contemporaneous populations. Using the formal-statistics-based qpAdm method, which models ancestry directly based on f4-statistics, Steppe_EMBA is probably best modeled as a mixture of Eastern European Hunter-Gatherers (EHG), Caucasus Hunter-Gatherers (CHG), and Anatolia Chalcolithic (Anatolia_ChL), with ancestry proportions of around 0.453, 0.453 and 0.094, respectively. See here.
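To make the idea of "ancestry proportions" a bit more concrete, here's a minimal sketch of how three mixture weights that sum to 1 can be recovered by least squares from a set of statistics. This is not qpAdm itself, which works on real f4-statistics, estimates standard errors with a block jackknife and tests model fit. The numbers in `F_SOURCES` are invented toy values, and the names `mix` and `fit_three_way` are hypothetical helpers made up for this illustration.

```python
# Toy illustration only: every number in F_SOURCES is invented, not real f4 data.
# Rows: hypothetical statistics against different outgroups.
# Columns: the three sources, in the order EHG, CHG, Anatolia_ChL.
F_SOURCES = [
    [0.010, -0.004, 0.002],
    [-0.003, 0.008, 0.001],
    [0.005, 0.002, -0.006],
    [-0.002, -0.001, 0.009],
]


def mix(weights, rows=F_SOURCES):
    """Statistics expected for a target that is a weighted mix of the sources."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def fit_three_way(target, rows=F_SOURCES):
    """Least-squares weights (w1, w2, w3) constrained to sum to 1.

    Substituting w3 = 1 - w1 - w2 leaves two unknowns, so the normal
    equations form a 2x2 linear system that can be solved directly.
    """
    # Express the first two sources and the target relative to the third source.
    a = [r[0] - r[2] for r in rows]
    b = [r[1] - r[2] for r in rows]
    y = [t - r[2] for t, r in zip(target, rows)]

    saa = sum(x * x for x in a)
    sbb = sum(x * x for x in b)
    sab = sum(x * z for x, z in zip(a, b))
    say = sum(x * z for x, z in zip(a, y))
    sby = sum(x * z for x, z in zip(b, y))

    det = saa * sbb - sab * sab  # zero only if the sources are collinear
    w1 = (say * sbb - sby * sab) / det
    w2 = (saa * sby - sab * say) / det
    return w1, w2, 1.0 - w1 - w2


# Build a synthetic target with the proportions quoted above, then recover them.
weights = fit_three_way(mix((0.453, 0.453, 0.094)))
print([round(w, 3) for w in weights])
```

Because the synthetic target is built exactly from the three toy sources, the fit returns the proportions it was given. With real data the fit is never exact, which is why the standard errors and the model's statistical fit matter as much as the point estimates.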
I believe that in this model Anatolia_ChL represents some type of minor western admixture amongst the close relatives of CHG still living in the Caucasus during the Eneolithic/Early Bronze Age, and/or minor gene flow from the Balkans onto the steppe. But that's a topic for another day, perhaps after the release of the Bell Beaker behemoth?
Below is a visual representation of the model, using a typical Principal Component Analysis (PCA) of Western Eurasian population structure. Note the tight cluster formed by the Steppe_EMBA groups and individuals, which is easily differentiated from all ancient populations outside of the steppe, except, importantly, Corded Ware.
Thus, since I know exactly what the Steppe component is and isn't, I can test for admixture from it and its ancestral components as best I can using qpAdm. Below are results for a few pertinent ancient populations (no idea how to model the farmers from Early Neolithic Iran at this stage, but I've already underlined their unique genetic character here and have no reason to believe that they're responsible for any part of the Steppe_EMBA signal in Europe or South Asia). If you're wondering why I chose Hungary_HG as the potential Western Hunter-Gatherer source, it's because it provided the best statistical fits overall. Also note that Ukraine_HG/N is based on samples from the Pontic Steppe.
The models involving Steppe_EMBA and CHG are almost always worse than the best models without them. As far as I can see, there's no strong evidence here of any mixture from a population even similar to Steppe_EMBA in any of these groups, except perhaps Ukraine_HG/N.
However, qpAdm results are dependent on the choice of pright and pleft populations (outgroups and potential mixture sources, respectively). Therefore, with different pright and pleft populations it might be possible to model all of the above groups with significant Steppe_EMBA admixture.
But of course there are other tests that I can run to double check my qpAdm models, such as the West Eurasian PCA. And clearly, the PCA basically supports the qpAdm results, with none of the test groups showing much, if any, deviation towards Steppe_EMBA or CHG from their main mixture clines.
So now let's take a look at the key points made in the abstract and why they're so far off the mark:
However ancient DNA samples from East European and Caucasian Hunter-Gatherers as well as from Early Iranian Neolithic, dating from before the Yamnaya expansion, already show signs of this so called “Steppe” component (Lazaridis et al. 2016).
There's no persuasive evidence for this; see my qpAdm and PCA models above for CHG and various Eastern European Hunter-Gatherer groups. As for the Early Neolithic farmers from Iran, there are no formal models that really make sense for them; we probably don't yet have old enough Near Eastern genomes to serve as potential mixture sources. But the idea that they're somehow interchangeable with Steppe_EMBA is patently idiotic.
Such an observation is compatible with the presence of a pre-existing genetic gradient ranging from Caucasus/Iran all the way to Europe, which likely formed through isolation by distance over thousands of years.
It's not. Isolation by distance has nothing to do with it, because there's no persuasive evidence for the existence of Steppe_EMBA ancestry, or even anything similar, outside of the steppe until the Late Neolithic/Early Bronze Age (LNBA). All of the evidence available to date points to a sudden, massive and perhaps even violent explosion of Steppe_EMBA peoples deep into Europe and also across much of Asia during the LNBA.
Here we show that such a gradient, defined as decrease of “steppe” component with distance from Iran, can be inferred from ancient samples pre-dating the Yamnaya expansion (r^2 = 0.93).
Not possible, because, as I've just pointed out, pre-Bronze Age samples from Iran (Iran_ChL) do not show strong evidence of Steppe_EMBA ancestry, aka the Steppe component.
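For reference, an r^2 like the one quoted in the abstract is just the squared Pearson correlation between two columns of numbers. Here's a minimal sketch of that calculation. The distances and component values are made up for illustration and have nothing to do with the abstract's actual data, and `r_squared` is a hypothetical helper name.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. r^2 of a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values: distance from a reference point (km) and the share
# of some ADMIXTURE-style component in each sample. Not real data.
distance_km = [0, 1000, 2000, 3000, 4000]
component_share = [0.50, 0.42, 0.30, 0.22, 0.10]

r2 = r_squared(distance_km, component_share)
print(round(r2, 3))
```

A high r^2 only says that the points fall close to a straight line. It says nothing about whether the component being plotted is the same thing as Steppe_EMBA ancestry.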
When analysed in the light of this gradient, later ancient and modern samples from Europe still display an excess of Steppe component, however this excess is less pronounced than previously estimated.
Horseshit. Nothing's changed.
Additionally we found that, of the analysed samples, modern South Asians show the highest excess of “steppe” component, pointing to the documented, recent links between the Caucasus/Iran populations and the South Asian peninsula.
No, you're conflating Steppe_EMBA ancestry with Neolithic ancestry from what is now Iran because you don't know how to differentiate them. But this has already been done many times over on this blog and also in the scientific literature.
By the way, Iosif Lazaridis made a couple of observations related to the Pagani et al. abstract on Twitter. See here and here.
I suspect P10: http://www.karger.com/Article/PDF/469638 … conflates Caucasus/Iran-component with "steppe" ancestry 1/n
Steppe ancestry brought into mainland Europe post-5kya was a mix of Caucasus/Iran-component (Basal Eurasian-rich) with ANE/EHG-component 2/n