Showing posts with label PIE. Show all posts

Saturday, June 18, 2022

David Reich on the origin of the Yamnaya people (!?)


Harvard's David Reich is doing a talk next month about the genetic history of West Asia and nearby parts of Europe. This is a quote from an online abstract of the talk (found here).

The impermeability of Anatolia to exogenous migration contrasts with our finding that the Yamnaya had two distinct gene flows, both from West Asia, suggesting that the Indo-Anatolian language family originated in the eastern wing of the Southern Arc and that the steppe served only as a secondary staging area of Indo-European language dispersal.

If this is actually what David Reich is going to claim then I'd say his team has a lot of work to do before they put out their paper on the topic.

First of all, Yamnaya did not have two distinct gene flows from West Asia. I don't even know what that means exactly, but there's no way that this statement is correct no matter how one interprets it.

In fact, the Yamnaya population formed on the Pontic-Caspian steppe from earlier groups native to this part of Eastern Europe, such as the people associated with the Sredny Stog culture.

That is, there were no migrations from West Asia into Eastern Europe that can be claimed to have been instrumental in the emergence of the Yamnaya population. On the other hand, Yamnaya may have been significantly influenced by cultural impulses from West Asia, but this is nothing new.

In terms of deep population structure, the Yamnaya genotype can be described as a mixture between Eastern European and West Asian-related genetic components. However, these West Asian-related components were already in Europe thousands of years before Yamnaya came into existence.

Indeed, soon to be published ancient DNA shows that hunter-gatherers very similar to the Yamnaya people, packing quite a lot of West Asian-related ancestry, lived in the Middle Don region (just north of the Pontic-Caspian steppe) well before 5,000 BCE (see here).

So, did the West Asian ancestors of these Middle Don hunter-gatherers speak Proto-Indo-European, or, as David Reich calls it, Indo-Anatolian? Keep in mind that most linguists put the birth of Indo-Anatolian around 4,000 BCE, which is actually the Sredny Stog period.

Moreover, in underlining Anatolia's supposed impermeability to exogenous migration, David Reich is arguing against things that no one worth their salt ever claimed. That's because the spread of Indo-Anatolian speakers into Anatolia has never really been described by archeologists and linguists as a massive migration, but rather as an infiltration into lands already heavily populated by the Hattians (for instance, see here).

We may have already seen the genetic evidence of this infiltration in the presence of steppe Y-chromosome haplogroup R-V1636 in a Chalcolithic burial at Arslantepe (see here and here). Let's wait and see what else crops up over the next few years as many more ancient Anatolian genomes are sequenced by David Reich and colleagues.


Tuesday, June 30, 2020

The precursor of the Trojans


Who remembers kum4 from Omrak et al. 2016? I'm pretty sure now that this individual packs a lot of ancestry from the Pontic-Caspian (PC) steppe.

If so, that's a big deal, because her Chalcolithic (or Late Neolithic?) burial was located at Kumtepe. That is, in the same part of Anatolia as the later settlement of Troy, which may have been founded by early Anatolian speakers from Eastern Europe (see here).

The qpAdm mixture models below, featuring kum4 and the likely older kum6, also from Kumtepe, are based on qpfstats output. qpfstats is a new program from the David Reich Lab specifically designed to help analyze low coverage ancients (see here). And kum4 is certainly that.

TUR_Kumtepe_N_kum4
RUS_Progress_En 0.383±0.114
TUR_Barcin_N 0.617±0.114
chisq 7.868
tail prob 0.247957
Full output

TUR_Kumtepe_N_kum4
IRN_Seh_Gabi_C 0.325±0.150
TUR_Barcin_N 0.675±0.150
chisq 14.736
tail prob 0.0224096
Full output

TUR_Kumtepe_N_kum6
RUS_Progress_En 0.121±0.042
TUR_Barcin_N 0.879±0.042
chisq 21.790
tail prob 0.00132149
Full output

TUR_Kumtepe_N_kum6
IRN_Seh_Gabi_C 0.283±0.059
TUR_Barcin_N 0.717±0.059
chisq 6.289
tail prob 0.391566
Full output

Indeed, kum4 and kum6 offer just ~10,000 and ~100,000 "valid SNPs", respectively (see here). However, if nothing else, the results are clearly not random.

For one, because they fit the expected pattern, with the likely older individual lacking ancestry from the PC steppe (her model with RUS_Progress_En shows a weak statistical fit). Moreover, the qpAdm mixture ratios align almost perfectly with the results in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation. Coincidence?
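For anyone who wants to sanity check the tail probabilities quoted above, here's a minimal sketch. It assumes six degrees of freedom for each model, which is my inference from the numbers rather than something stated in the output summaries, and it uses a closed-form expression that only works for even degrees of freedom. The function name and dictionary labels are just illustrative.

```python
import math


def chi2_tail_even_df(chisq: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = chisq / 2.0
    # For even df the survival function is
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    total = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * total


# chisq values copied from the four Kumtepe models above
models = {
    "kum4 + RUS_Progress_En": 7.868,
    "kum4 + IRN_Seh_Gabi_C": 14.736,
    "kum6 + RUS_Progress_En": 21.790,
    "kum6 + IRN_Seh_Gabi_C": 6.289,
}

ASSUMED_DF = 6  # assumption: inferred, not given in the post

tail_probs = {name: chi2_tail_even_df(x, ASSUMED_DF) for name, x in models.items()}

for name, p in tail_probs.items():
    print(f"{name}: tail prob ~ {p:.4f}")
```

Under that assumption the output should land very close to the tail probabilities listed above, with any small differences down to the chisq values being rounded to three decimals.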

See also...

Perhaps a hint of things to come

Tuesday, June 2, 2020

Perhaps a hint of things to come


It's still a mystery how the Hittites and other Anatolian speakers ended up in the Near East. However, the leading theory is that their ancestors migrated from the steppes of Eastern Europe to western Anatolia via the Balkans sometime during the Copper Age.

Consider the qpAdm mixture models below, made possible thanks to some of the ancient samples published recently along with Skourtanioti et al. 2020. The key ancients are described in a text file available here.

TUR_Barcin_C
AZE_Caucasus_lowlands_LN 0.471±0.094
RUS_Vonyuchka_En 0.148±0.040
TUR_Barcin_N 0.381±0.069
chisq 12.874
tail prob 0.116261
Full output

TUR_Barcin_C
RUS_Vonyuchka_En 0.107±0.029
TUR_Buyukkaya_EC 0.893±0.029
chisq 12.107
tail prob 0.207331
Full output

I'd say it's quite clear now that TUR_Barcin_C harbors minor ancestry from the Pontic-Caspian (PC) steppe. This isn't widely accepted yet because demonstrating it convincingly hasn't been possible without a proximate Anatolian ancestry source for TUR_Barcin_C, precisely like TUR_Buyukkaya_EC.

Admittedly, though, the statistical fits in my models aren't all that great. I suspect the problem lies with RUS_Vonyuchka_En, which is likely to be a rather poor stand-in for the people who brought steppe ancestry, and possibly early Anatolian speech, to western Anatolia.

So let's see what happens when I try a more proximate reference for the steppe ancestry in TUR_Barcin_C. How about Yamnaya_BGR, an individual of mixed Balkan and steppe origin from what is now Bulgaria?

TUR_Barcin_C
AZE_Caucasus_lowlands_LN 0.518±0.075
TUR_Barcin_N 0.203±0.056
Yamnaya_BGR 0.279±0.067
chisq 10.602
tail prob 0.225269
Full output

TUR_Barcin_C
TUR_Buyukkaya_EC 0.749±0.058
Yamnaya_BGR 0.251±0.058
chisq 9.687
tail prob 0.376414
Full output

That's a little better. Unfortunately, the problem now is that the models are anachronistic, because TUR_Barcin_C is about a thousand years older than Yamnaya_BGR. Clearly, we need more Copper Age samples from the western edge of the PC steppe, the eastern Balkans, and especially northwestern Anatolia.
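A rough way to gauge how solid the steppe-related coefficients in these four models are is to divide each estimate by its standard error. The sketch below does just that with the numbers from the models above. Treating the ratio as an approximate Z-score is a common rule of thumb, not something qpAdm reports in this form, and the labels are my own.

```python
# (estimate, standard error) for the steppe-related source in each
# TUR_Barcin_C model above
steppe_coefficients = {
    "RUS_Vonyuchka_En (3-way)": (0.148, 0.040),
    "RUS_Vonyuchka_En (2-way)": (0.107, 0.029),
    "Yamnaya_BGR (3-way)": (0.279, 0.067),
    "Yamnaya_BGR (2-way)": (0.251, 0.058),
}


def z_score(estimate: float, std_err: float) -> float:
    """Estimate divided by its standard error (approximate Z-score)."""
    return estimate / std_err


z_scores = {name: z_score(est, se) for name, (est, se) in steppe_coefficients.items()}

for name, z in z_scores.items():
    print(f"{name}: Z ~ {z:.2f}")
```

All four ratios come out above 3, so the steppe-related coefficients don't look like noise, even if the choice of reference populations is still far from ideal.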

The Principal Component Analysis (PCA) below effectively illustrates why my qpAdm models work. It was produced with Global25 data using the Vahaduo PCA tools freely available here. Note that TUR_Barcin_C is shifted away from the essentially perfect cline formed by AZE_Caucasus_lowlands_LN, TUR_Barcin_N and TUR_Buyukkaya_EC towards samples from ancient Eastern Europe, including Yamnaya_BGR.


See also...

Steppe invaders in the Bronze Age Balkans

Friday, August 2, 2019

The PIE homeland controversy: August 2019 status report


Archeologist David Anthony has a new paper on the Indo-European homeland debate titled Archaeology, Genetics, and Language in the Steppes: A Comment on Bomhard. It's part of a series of articles dealing with Allan R. Bomhard's "Caucasian substrate hypothesis" in the latest edition of The Journal of Indo-European Studies. It's also available, without any restrictions, here.

Any thoughts? Feel free to share them in the comments below. Admittedly, I found this part somewhat puzzling (emphasis is mine):

It was the faint trace of WHG, perhaps 3% of whole Yamnaya genomes, that identified this admixture as coming from Europe, not the Caucasus, according to Wang et al. (2018). Colleagues in David Reich’s lab commented that this small fraction of WHG ancestry could have come from many different geographic places and populations.

I think that's highly optimistic. It really should be obvious by now thanks to archeological and ancient genomic data, including both uniparental and genome-wide variants, that the Yamnaya people were practically entirely derived from Eneolithic populations native to the Pontic-Caspian (PC) steppe. So, in all likelihood, this was also the source of their minor WHG ancestry.

Indeed, they clearly weren't some mishmash of geographically, culturally and genetically disparate groups that had just arrived in Eastern Europe, but the direct descendants of closely related and already significantly Yamnaya-like peoples associated with long-standing PC steppe archeological cultures such as Khvalynsk and Sredny Stog. I discussed this earlier this year, soon after the Wang et al. paper was published:

On Maykop ancestry in Yamnaya

I hope I'm wrong, but I get the feeling that the scientists at the Reich Lab are finding this difficult to accept, because it doesn't gel with their theory that archaic Proto-Indo-European (PIE) wasn't spoken on the PC steppe, but rather south of the Caucasus, and that late or rather nuclear PIE was introduced into the PC steppe by migrants from the Maykop culture who were somehow involved in the formation of the Yamnaya horizon.

Inexplicably, after citing Wang et al. on multiple occasions and arguing against any significant gene flow between Maykop and Yamnaya groups, Anthony fails to mention Steppe Maykop. But the Steppe Maykop people are an awesome argument against the idea that there was anything more than occasional mating between the Maykop and Yamnaya populations, because they were wedged between them, and yet clearly distinct from both, with a surprisingly high ratio of West Siberian forager-related ancestry (see here and here).


Despite all the talk lately about the potential cultural, linguistic and genetic ties between Maykop and Yamnaya, including claims that the latter possibly acquired its wagons from the former, my view is that the Steppe Maykop and Yamnaya wagon drivers may have competed with each other and eventually clashed in a big way. Indeed, take a look at what happens after Yamnaya burials rather suddenly replace those of Steppe Maykop just north of the Caucasus around 3,000 BCE.

Yamnaya_RUS_Caucasus
RUS_Progress_En_PG2001 0.808±0.058
RUS_Steppe_Maykop 0.000
UKR_Sredny_Stog_II_En_I6561 0.192±0.058
chisq 13.859
tail prob 0.383882
Full output

Yep, total population replacement with no significant gene flow between the two groups. Apparently, as far as I can tell, there's not even a hint that a few Steppe Maykop stragglers were incorporated into the ranks of the newcomers. Where did they go? Hard to say for now. Maybe they ran for the hills nearby?

Intriguingly, Anthony reveals a few details about new samples from three different Eneolithic steppe burial sites associated with the Khvalynsk culture:

The Reich lab now has whole-genome aDNA data from more than 30 individuals from three Eneolithic cemeteries in the Volga steppes between the cities of Saratov and Samara (Khlopkov Bugor, Khvalynsk, and Ekaterinovka), all dated around the middle of the fifth millennium BC.

...

Most of the males belonged to Y-chromosome haplogroup R1b1a, like almost all Yamnaya males, but Khvalynsk also had some minority Y-chromosome haplogroups (R1a, Q1a, J, I2a2) that do not appear or appear only rarely (I2a2) in Yamnaya graves.

As far as I can tell, he suggests that they'll be published in the forthcoming Narasimhan et al. paper. If so, it sounds like the paper will have many more ancient samples than its early preprint that was posted at bioRxiv last year.

For me the really fascinating thing in regards to these new samples is how scarce Y-haplogroup R1a appears to have been everywhere before the expansion by the putative Indo-European-speaking steppe ancestors of the Corded Ware culture (CWC) people. It's basically always outnumbered by other haplogroups wherever it's found prior to about 3,000 BCE, even on the PC steppe. But then, suddenly, its R1a-M417 subclade goes BOOM! And that's why I call it...

The beast among Y-haplogroups

At this stage, I'm not sure how to interpret the presence of Y-haplogroup J in the Khvalynsk population. It may or may not be important to the PIE homeland debate. Keep in mind that J is present in two foragers from Karelia and Popovo, northern Russia, dated to the Mesolithic period and with no obvious foreign ancestry. So it need not have arrived north of the Caspian as late as the Eneolithic with migrants rich in southern ancestry from the Caucasus or what is now Iran. In other words, for the time being, the steppe PIE homeland theory appears safe.

Update 20/12/2019: A note on Steppe Maykop

See also...

Is Yamnaya overrated?

The PIE homeland controversy: January 2019 status report

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Sunday, March 31, 2019

Map of pre-Corded Ware culture (>2900 BCE) instances of Y-haplogroup R1a (updated)


Below is a map showing the global distribution of Y-chromosome haplogroup R1a prior to the expansions of the R1a-rich Corded Ware culture (CWC) people and their descendants across Europe and Asia from around 2900 BCE. I'll be updating this map regularly and using it to help me narrow down the options for the place of origin of R1a, and also to counter the misinformation about this topic that has appeared in print and online over the years, including in many scientific publications and popular websites such as Wikipedia.


Incredibly, as far as I know, there are just six reliably called instances of R1a in the now ample Eurasian ancient DNA record dating to the pre-CWC period. To put this into perspective, consider that R1a is today the most common Y-haplogroup in much of Europe and Asia. How did that happen, I wonder? However, please note that I chose to base the map only on samples sequenced with the capture and shotgun methods, rather than the PCR method, which is susceptible to producing contaminated results and is no longer used in major ancient DNA studies.

See also...

Y-haplogroup R1a and mental health

The Poltavka outlier

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Monday, March 4, 2019

An exceptional burial indeed, but not that of an Indo-European


Not too many people have been buried sitting on wagons. The most famous case is that of an Early Bronze Age man who, considering his injuries, may have died in a high-speed crash - high-speed for its time anyway - on the Pontic-Caspian steppe in Eastern Europe.

It's likely that this guy was one of the very first wagon-drivers in human history, because his four-wheeled wooden model is dated to 3336-3105 calBCE, which makes it the oldest wagon discovered thus far. His genotype data, under the label Steppe Maykop SA6004, were published recently along with Wang et al. 2019.

Early wagons are very important for a couple of reasons: they revolutionized human transport and warfare, and they're often closely associated with the prehistoric expansions of Indo-European languages.

So I'm pretty sure that many of you must be thinking right now that wagon-driver SA6004 was an early Indo-European, or even a Proto-Indo-European! I bet that's what Wang et al. thought too, considering the conclusion in their paper. But, alas, the chances of this are slim to none.

Steppe Maykop samples show rather peculiar genetic structure considering their geographic origin, with a large proportion of their ancestry deriving from a source closely related to western Siberian hunter-gatherers (aka West_Siberia_N in the ancient DNA record). Indeed, SA6004 basically looks like a 50/50 mix between West_Siberia_N and Piedmont_Eneolithic. Here's a map with all of the relevant details.


Thus, clearly, the Steppe Maykop population wasn't ancestral or even directly related to the steppe and steppe-derived groups generally regarded to have been Indo-European speaking, such as those associated with the Yamnaya, Corded Ware, and Bell Beaker cultures. That's because these groups lack any discernible West_Siberia_N-related ancestry.

It also wasn't ancestral or directly related to any present-day or currently sampled ancient Indo-European speaking populations, again because these populations basically lack West_Siberia_N-related ancestry.

On the other hand, Yamnaya, Corded Ware and other closely related groups show an exceptionally strong genetic relationship with Indo-European speakers, especially those from across Northern Europe, which experienced massive migrations from the Pontic-Caspian steppe during the late Neolithic period, and hardly anything from elsewhere since then.

Case in point, the samples from Wang et al. labeled Yamnaya Caucasus were recovered from the same area of the Pontic-Caspian steppe as their Steppe Maykop samples, and yet, take a look at this linear model based on outgroup f3-statistics. Steppe Maykop does show high genetic affinity to Indo-European speakers (no doubt mediated via its Piedmont_Eneolithic-related ancestry), but, unlike Yamnaya Caucasus, it also shows, for a West Eurasian population, unusually high affinity to Native Americans and Siberians. The relevant datasheet is available here.

So the only way the Steppe Maykop population could have been Indo-European-speaking is if it inherited its Indo-European speech from its Piedmont_Eneolithic-related ancestors. And even if it was Indo-European-speaking, it probably spoke an extinct Indo-European language not closely related to any extant Indo-European languages. In other words, the possibility that Steppe Maykop passed on its language to Yamnaya, along with its wagons, is close to zero. More likely, Yamnaya stole a few wagons from Steppe Maykop, and the rest is history.

See also...

The Steppe Maykop enigma

On Maykop ancestry in Yamnaya

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, March 2, 2019

Maykop: a multi-ethnic layer cake?


Let's speculate about the linguistic affinities of the currently available ancient populations from the Caucasus and surrounds. I put together a series of outgroup f3-stats to help things along. They're available for download here.

Maykop
Georgian 0.258224
Abkhasian 0.257899
Latvian 0.257376
Swedish 0.257301
Turkish_Trabzon 0.256996
Basque_Spanish 0.256589
Chechen 0.256514
Icelandic 0.256418
Norwegian 0.256325
Lezgin 0.256272
Irish 0.256227
Tabasaran 0.256092
Italian_Bergamo 0.25605
English_Cornwall 0.256032
Polish_East 0.255991
Scottish 0.255955
Adygei 0.255913

Steppe_Maykop
Latvian 0.261845
Russian_North 0.26145
Estonian 0.260355
Finnish 0.260211
Lithuanian 0.260072
Udmurd 0.259804
Ingrian 0.259663
Surui 0.259637
Vepsa 0.259608
Karelian 0.259532
Karitiana 0.259482
Russian_West 0.259397
Russian_Central 0.259274
Wichi 0.259106
Saami 0.258982
Komi 0.258945
Icelandic 0.258854
Swedish 0.258814
Mordovian 0.258604
Irish 0.25859

Eyeballing the stats might be enough to get a general impression about what they mean, but to understand them properly it's necessary to get technical with something like PAST3 (see here). That's because f3-stats pick up shared genetic drift from all drift paths, and don't especially focus on more recently shared ancestry. This can often lead to confusing outcomes.
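To give a rough idea of what a linear model of f3-stats involves, here's a minimal sketch using ordinary least squares. It only uses the four populations that happen to appear in both of the lists above (Latvian, Swedish, Icelandic and Irish), which is far too few for any real inference, so treat it purely as an illustration of the mechanics. This isn't PAST3 output, and the variable names are my own.

```python
# Outgroup f3 values copied from the two lists above, restricted to the
# populations present in both: (Maykop, Steppe_Maykop)
shared = {
    "Latvian": (0.257376, 0.261845),
    "Swedish": (0.257301, 0.258814),
    "Icelandic": (0.256418, 0.258854),
    "Irish": (0.256227, 0.25859),
}

xs = [maykop for maykop, _ in shared.values()]
ys = [steppe for _, steppe in shared.values()]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares fit: Steppe_Maykop ~ intercept + slope * Maykop
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# A positive residual means more affinity to Steppe_Maykop than the
# fitted line predicts from the population's affinity to Maykop
residuals = {
    pop: steppe - (intercept + slope * maykop)
    for pop, (maykop, steppe) in shared.items()
}

for pop, res in sorted(residuals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pop}: residual {res:+.6f}")
```

With a full datasheet, the populations sitting well above or below the fitted line are the ones worth a closer look.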

Below are a few examples of linear models based on my f3-stats. Note that many Indo-European speakers, especially from Northern Europe, are foremost attracted to ancient samples from the Pontic-Caspian steppe. On the other hand, non-Indo-European speakers, from such far-flung locations as the Caucasus and Iberia, show relatively stronger affinity to ancient samples from Anatolia and the Caucasus. Moreover, Uralic speakers show elevated affinity to ancient hunter-gatherer samples from Eastern Europe and Siberia. Makes sense, right?

Based on these and other data, I'd say that Maykop and the culturally related Steppe Maykop were something of a multi-ethnic polity, with many near and far related languages spoken by its people, including perhaps Kartvelian, Northwest Caucasian, Yeniseian and Indo-European. But it seems to me that Proto-Indo-European was spoken by steppe foragers turned pastoralists just outside of the Maykop zone. And I'm quite sure that after the Maykop collapse various early Indo-European groups pushed across the Caucasus and deep into the Near East. Just take a look at the f3-stats and linear model for Hajji_Firuz_BA to see what I mean.

See also...

An exceptional burial indeed, but not that of an Indo-European

The Steppe Maykop enigma

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Tuesday, January 1, 2019

The PIE homeland controversy: January 2019 status report


Last year, the preprint that claimed to present archaeogenetic data opening up the possibility of a Proto-Indo-European (PIE) homeland south of the Caucasus was, ironically, also the preprint that considerably strengthened my confidence that the homeland was actually located north of the Caucasus.

Of course, I'm talking about the Wang et al. manuscript at bioRxiv, which is apparently soon to be published as a peer-reviewed paper in Nature Communications (see here).

It'll be fascinating to observe if and how the peer-review process has impacted on the preprint, and especially its conclusion. My impression was that the authors seemed pretty sure that the Maykop people gave rise to the Yamnaya culture, or at least Indo-Europeanized it. But, as far as I saw, the archaeogenetic data didn't bear this out at all, and instead showed a lack of any direct, recent and meaningful genetic relationship between Maykop and Yamnaya (see here). Was this also picked up by the peer reviewers? We shall see.

Moreover, there was some exceedingly interesting fine print in the manuscript's supplementary information:

Complementary to the southern [Darkveti-Meshoko] Eneolithic component, a northern component started to expand between 4300 and 4100 calBCE manifested in low burial mounds with inhumations densely packed in bright red ochre. Burial sites of this type, like the investigated sites of Progress and Vonyuchka, are found in the Don-Caspian steppe [10], but they are related to a much larger supra-regional network linking elites of the steppe zone between the Balkans and the Caspian Sea [16]. These groups introduced the so-called kurgan, a specific type of burial monument, which soon spread across the entire steppe zone.

Always read the fine print, they say. And they're right. Imagine if I only read the preprint's conclusion and missed this little gem; I'd probably think that the PIE homeland was located south of the Caucasus rather than on the Don-Caspian steppe.

Wow, proto-kurgans with inhumations densely packed in bright red ochre? A supra-regional network linking the elites of the steppe all the way from the Balkans to the Caspian Sea? An expansionist culture? And, as evidenced by the ancient DNA from the Progress and Vonyuchka sites, a people who may well have been in large part ancestral to the Yamnaya, Corded Ware and Andronovo populations that have been identified, based on archeological and historical linguistics data, as the main vectors for the spread of Indo-European languages as far as Iberia in the west and the Indian subcontinent in the east?

I wonder if the authors actually asked themselves who these people may have been, before so haphazardly turning to Maykop and, ultimately, the Near East, as the likely sources of the Yamnaya culture? To me they look like the Proto-Indo-Europeans and true antecedents of Yamnaya.

So as things stand, my pick for the PIE homeland is firmly the Don-Caspian steppe. And I genuinely thank Wang et al., and indeed the Max-Planck-Institut für Menschheitsgeschichte (aka MPI-SHH), for their assistance.

But, you might ask, what about the Hittites? Yes, I realize that no one apart from me and a few of my readers here can find any steppe ancestry in the so-called Hittite genomes published to date. However, consider this: if the PIE homeland really was on the steppe, and a dense sampling strategy of Hittite era Anatolia fails to turn up any unambiguous steppe ancestry in at least a few individuals, then there has to be an explanation for it. But let's wait and see what a dense sampling strategy of Hittite era Anatolia actually reveals before we go that far.

See also...

The PIE homeland controversy: August 2019 status report

Yamnaya: home-grown

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, November 1, 2018

Big deal of 2018: Yamnaya not related to Maykop


I was going to write this post after the genotype data from the Wang et al. preprint on the genetic prehistory of the Greater Caucasus became available, because I wanted to demonstrate a few key points with analyses of my own. But I've got a hunch that the formal publication of the manuscript, and thus also the release of the data, has been indefinitely delayed for one reason or another. So here goes anyway, the big deal of 2018...

This year, ancient DNA has revealed that the populations associated with the Maykop and Yamnaya archeological cultures were genetically distinct from each other, and, in all likelihood, didn't mix to any significant degree. Case in point: an ADMIXTURE analysis from Wang et al. 2018.


No doubt, this is quite a shock for many people, especially those of you who consider Maykop to have been a Proto-Indo-European-speaking culture that either gave rise to Yamnaya or at least Indo-Europeanized it. So now, if you still want to see Maykop as the Indo-Europeanizing agent in the Pontic-Caspian steppe, you'll have to rely solely on archeological and linguistics data, and also keep in mind that ancient DNA has slapped you in the face.

In just a few years, ancient DNA has provided us with plenty of shocks, but this is arguably among the biggest.

However, I honestly can't say that it was a huge surprise for me, because I tentatively predicted this outcome more than two years ago based on a handful of mitochondrial (mtDNA) haplotypes (see here). Certainly, analyzing genome-wide genetic data is what I thrive on, but if that's off limits, then eyeballing even a few mtDNA markers can also be very useful.

Wang et al. easily demonstrate the lack of any meaningful genetic relationship between Maykop (including Steppe Maykop, which shows an unusual eastern influence) and Yamnaya using a range of methods. But, judging by their conclusion, in which they still seem to want to see Maykop as the said Indo-Europeanizing agent in the Pontic-Caspian steppe, they're not exactly enthused by their own results. And they also make the following claim (emphasis is mine):

Based on PCA and ADMIXTURE plots we observe two distinct genetic clusters: one cluster falls with previously published ancient individuals from the West Eurasian steppe (hence termed ‘Steppe’), and the second clusters with present-day southern Caucasian populations and ancient Bronze Age individuals from today’s Armenia (henceforth called ‘Caucasus’), while a few individuals take on intermediate positions between the two. The stark distinction seen in our temporal transect is also visible in the Y-chromosome haplogroup distribution, with R1/R1b1 and Q1a2 types in the Steppe and L, J, and G2 types in the Caucasus cluster (Fig. 3A, Supplementary Data 1). In contrast, the mitochondrial haplogroup distribution is more diverse and almost identical in both groups (Fig. 3B, Supplementary Data 1).

I'd say that what they're almost suggesting there is that the Caucasus and Steppe clusters, hence also the Maykop and Yamnaya populations, shared significant maternal ancestry. If this were true, then perhaps it might mean that the Pontic-Caspian steppe was Indo-Europeanized via female-biased migrations from Maykop? Yes, perhaps, if this were true. However, it's not.

To be sure, Yamnaya does show a close genome-wide genetic relationship with an earlier group from the North Caucasus region: the so-called Eneolithic steppe people. But they can't be linked to Maykop or even the roughly contemporaneous nearby Eneolithic Caucasus population, and seem to have vanished, at least as a coherent genetic unit, just as Maykop got going. Wang et al. managed to sequence three Eneolithic steppe samples with the following mtDNA haplogroups: H2, I3a and T2a1b.

H2 is too broad a haplogroup to bother with, but here are the results for I3a and T2a1b from the recently launched AmtDB, the first database of ancient human mitochondrial genomes (see here).


In a database of 1,131 ancient samples, I3a shows up in just five individuals, all of them associated with Yamnaya-related archeological cultures and populations: Poltavka (BARu), Unetice (UNC), Corded Ware (CWC), and Bell Beaker (BBC). Similarly, T2a1b shows up in just four individuals, all of them associated with Corded Ware (CWC) and Bell Beaker-derived Bronze Age Britons (BABI). And if I go back a step to T2a1, then the list reveals two Yamnaya individuals from what is now Kalmykia, Russia.

Thus, using just two mtDNA haplotypes I'm able to corroborate the results from genome-wide genetic data showing a close relationship between Eneolithic steppe and Yamnaya. So like I said, useful stuff.

This obviously raises the question: what does the AmtDB reveal about Maykop mtDNA haplotypes, especially in the context of the genetic relationship, or rather lack thereof, between Yamnaya and Maykop? Yep, again, the AmtDB basically corroborates the results from genome-wide genetic data.

But don't take my word for it. Stick the currently available Maykop mtDNA haplogroups into the AmtDB and see what happens (for your convenience I've made a list available here). Considering the close geographic and temporal proximity of Maykop to Yamnaya, you won't see an overly high sharing rate with Yamnaya and closely related populations. Moreover, Maykop shows several haplogroups, such as HV, M52, U1b, U7b and X2f, that appear highly unusual in the context of the Eneolithic and Bronze Age steppe mtDNA gene pool, and instead link its maternal ancestry to that of early European farmers, West Asians or even Central Asians.

See also...

Yamnaya: home-grown

Yamnaya isn't from Iran just like R1a isn't from India

Big deal of 2016: the territory of present-day Iran cannot be the Indo-European homeland

Friday, April 13, 2018

On the doorstep of India


One of the most remarkable discoveries in the recent Narasimhan et al. 2018 preprint has to be the presence of what are essentially Eastern European migrant populations within the Inner Asian Mountain Corridor (IAMC) during the Middle to Late Bronze Age (MLBA). Remarkable for so many reasons, but seemingly under-appreciated by a lot of people, judging by the online discussions that I've seen about the preprint, and even, I'd say, the authors themselves.

Narasimhan et al. labeled these groups as belonging to the "forest/steppe MLBA" complex (for instance, see the main figure from the preprint here). This is indeed what they are in terms of their genetic structure, but certainly not geography, because the IAMC is well south of the steppe. Thus, in my Principal Component Analysis (PCA) I'm going to label them as part of the "post-steppe herder expansion Turan" complex.

Strikingly, most of these people cluster with Bronze Age Eastern Europeans, and even some Bronze Age Central Europeans. They're also sitting very close to the more easterly present-day Slavic-speakers from Russia and Ukraine, and indeed closer to the bulk of the European cluster than some present-day Turkic and Uralic groups from the Volga-Ural region. Even I never predicted such an outcome. Sure, I was expecting to see ancient genomes from South Central Asia with some very heavy steppe influence, but not this. The relevant datasheet is available here.


Two of the MLBA IAMC individuals are from Kashkarchi in the Ferghana Valley, in what is now Uzbekistan, and basically on the doorstep of the Indian subcontinent. I've made special mention of them on the plot, and I've also highlighted a pair of individuals from the Bronze Age Central Asian sites of Gonur Tepe and Shahr-i Sokhta, who are, in all likelihood, unadmixed migrants from the Indus Valley (for more on that, see here).

It's surely not a coincidence that the ancient and present-day South Asians on the plot (including those from Pakistan's Swat Valley dated to the Iron Age) form an almost perfect cline between these two pairs of individuals. It's also surely not a coincidence that the MLBA IAMC groups are rich in Y-haplogroup R1a-M417, and in particular its R1a-Z93 subclade, which is today an especially frequent marker in Indo-European-speaking South Asians.

Forget about the pre-MLBA populations from the forests, steppe, or IAMC, like those represented by Dali_EBA; they're practically irrelevant to this story. How do I know? Because they have little to no impact on the above mentioned cline. And this can be easily verified with mixture models based on multiple Principal Components (PCs) and formal statistics (for instance, see here).
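For anyone who wants to see what a two-source mixture model on PCs boils down to, here's a minimal sketch. It projects a target population onto the straight line between two source populations in PC space and reports the implied mixture weight, plus how far the target sits off that line. The function name and the coordinates are made up purely for illustration; they are not values from my datasheet, and real analyses use many more PCs and more than two sources.

```python
import math

def cline_position(target, source_a, source_b):
    """Project `target` onto the line between two sources in PC space.

    Returns (weight_a, residual): weight_a is the least-squares share of
    source_a in a two-way mix, residual is the distance off the cline.
    """
    diff = [a - b for a, b in zip(source_a, source_b)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        raise ValueError("the two sources have identical coordinates")
    weight_a = sum((t - b) * x for t, b, x in zip(target, source_b, diff)) / denom
    fitted = [weight_a * a + (1 - weight_a) * b
              for a, b in zip(source_a, source_b)]
    residual = math.sqrt(sum((t - f) ** 2 for t, f in zip(target, fitted)))
    return weight_a, residual

# Hypothetical PC coordinates, invented for this example only.
steppe_like = (1.0, 0.0)
indus_like = (0.0, 0.0)
test_population = (0.25, 0.1)

weight, off_cline = cline_position(test_population, steppe_like, indus_like)
print(f"share of first source: {weight:.2f}, distance off the cline: {off_cline:.2f}")
```

A population sitting right on the cline gives a residual close to zero. A large residual suggests that some third source is pulling it away, in which case a two-way model isn't enough.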


Clearly, many populations in South Asia, particularly those speaking Indo-European languages, derive the bulk of their steppe-related ancestry from the peoples of the MLBA IAMC, and/or their very close relatives. And if you do believe that this inference is just based on coincidences, then I'm sorry to say this, but obviously a new, much less mentally challenging, hobby or profession beckons. All the best with that.

Just to help put all of this in a geographic perspective, here's a topographical map of Eurasia. I've marked the location of the Ferghana Valley. The close relatives of Kashkarchi_BA most likely skirted their way around those winding high mountains and slipped into India via the Khyber Pass, which I've also marked on the map.


And the rest, as they say, is history, including the history described in the ancient Indo-Aryan Sanskrit texts known as the Vedas. I'm sure we'll soon be learning about these events in great detail when many more ancient samples from Pakistan and, hopefully, the first ancient samples from India, are published.

Citation...

Narasimhan et al., The Genomic Formation of South and Central Asia, Posted March 31, 2018, doi: https://doi.org/10.1101/292581

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Wednesday, August 2, 2017

Steppe admixture in Mycenaeans, lots of Caucasus admixture already in Minoans (Lazaridis et al. 2017)


Over at Nature at this LINK. Why is the presence of steppe admixture in Mycenaeans important? And why does it matter if the Minoans already had a lot of ancestry from the Caucasus or surrounds? Because Mycenaeans were Indo-Europeans and Minoans weren't. I'm still reading the paper and will update this entry regularly over the next few days. Below is the abstract and, in my opinion, a key quote. Emphasis is mine.

The origins of the Bronze Age Minoan and Mycenaean cultures have puzzled archaeologists for more than a century. We have assembled genome-wide data from 19 ancient individuals, including Minoans from Crete, Mycenaeans from mainland Greece, and their eastern neighbours from southwestern Anatolia. Here we show that Minoans and Mycenaeans were genetically similar, having at least three-quarters of their ancestry from the first Neolithic farmers of western Anatolia and the Aegean [1, 2], and most of the remainder from ancient populations related to those of the Caucasus [3] and Iran [4, 5]. However, the Mycenaeans differed from Minoans in deriving additional ancestry from an ultimate source related to the hunter–gatherers of eastern Europe and Siberia [6, 7, 8], introduced via a proximal source related to the inhabitants of either the Eurasian steppe [1, 6, 9] or Armenia [4, 9]. Modern Greeks resemble the Mycenaeans, but with some additional dilution of the Early Neolithic ancestry. Our results support the idea of continuity but not isolation in the history of populations of the Aegean, before and after the time of its earliest civilizations.

...

The simulation framework also allows us to compare different models directly. Suppose that there are two models (Simulated1, Simulated2) and we wish to examine whether either of them is a better description of a population of interest (in this case, Mycenaeans). We test f4(Simulated1, Simulated2; Mycenaean, Chimp), which directly determines whether the observed Mycenaeans shares more alleles with one or the other of the two models. When we apply this intuition to the best models for the Mycenaeans (Extended Data Fig. 6), we observe that none of them clearly outperforms the others as there are no statistics with |Z|>3 (Table S2.28). However, we do notice that the model 79%Minoan_Lasithi+21%Europe_LNBA tends to share more drift with Mycenaeans (at the |Z|>2 level). Europe_LNBA is a diverse group of steppe-admixed Late Neolithic/Bronze Age individuals from mainland Europe, and we think that the further study of areas to the north of Greece might identify a surrogate for this admixture event – if, indeed, the Minoan_Lasithi+Europe_LNBA model represents the true history.

Lazaridis, Mittnik et al., Genetic origins of the Minoans and Mycenaeans, Nature, Published online 02 August 2017, doi:10.1038/nature23310
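The f4 comparison described in the quote above can be sketched in a few lines. This is only a toy illustration under my own assumptions, not the authors' pipeline: the function names are hypothetical, each "simulated" model is treated simply as a weighted average of source allele frequencies, and the Z-score comes from a plain delete-one-block jackknife with equal-sized blocks, whereas tools like ADMIXTOOLS use a weighted block jackknife over genomic blocks.

```python
import math

def mix_frequencies(weights, sources):
    """Allele frequencies of a model population built as a weighted mix of sources."""
    n_snps = len(sources[0])
    return [sum(w * src[i] for w, src in zip(weights, sources))
            for i in range(n_snps)]

def f4_with_z(a, b, c, d, n_blocks=5):
    """f4(A, B; C, D) with a simple block-jackknife standard error and Z-score.

    Inputs are per-SNP allele-frequency lists of equal length.
    Returns (estimate, standard_error, z).
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    products = [(a[i] - b[i]) * (c[i] - d[i]) for i in range(len(a))]
    size = len(products) // n_blocks
    if size == 0:
        raise ValueError("not enough SNPs for the requested number of blocks")
    used = size * n_blocks                      # drop any leftover SNPs
    total = sum(products[:used])
    estimate = total / used
    # Re-estimate f4 with each block left out in turn.
    leave_one_out = []
    for j in range(n_blocks):
        block_sum = sum(products[j * size:(j + 1) * size])
        leave_one_out.append((total - block_sum) / (used - size))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (t - mean_loo) ** 2 for t in leave_one_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else math.nan
    return estimate, se, z

# Toy allele frequencies, invented for illustration only.
minoan = [0.1, 0.8, 0.4, 0.6, 0.3, 0.9]
lnba = [0.5, 0.2, 0.7, 0.1, 0.6, 0.4]
armenia = [0.3, 0.6, 0.2, 0.5, 0.4, 0.7]
mycenaean = [0.2, 0.7, 0.5, 0.5, 0.35, 0.8]
chimp = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

simulated1 = mix_frequencies([0.79, 0.21], [minoan, lnba])
simulated2 = mix_frequencies([0.79, 0.21], [minoan, armenia])
print(f4_with_z(simulated1, simulated2, mycenaean, chimp, n_blocks=3))
```

A positive statistic means the target shares more alleles with the first model, a negative one with the second, and by the convention used in the quote only |Z|>3 counts as a clear result.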

Update 03/08/2017: This is my own Principal Component Analysis (PCA) of the Minoan and Mycenaean samples, which are freely available at the Reich Lab website here. The Armenian angle for the eastern admixture in Mycenaeans looks forced. The trajectory of this admixture obviously runs from Northern or Eastern Europe to the Minoans. If it did arrive from Armenia, then realistically only via a heavily steppe-admixed population.


Update 05/08/2017: Much like Lazaridis et al., I ran a series of qpAdm analyses to find the best mixture model for the Mycenaeans. However, just to see what would happen, unlike Lazaridis et al., I didn't group any of the archaeological populations into larger clusters based on their genetic affinities. The three models below stood out from the rest in terms of their statistical fits.

Mycenaean
Minoan_Lasithi 0.786±0.049
Sintashta 0.214±0.049
taildiff: 0.96574059
chisq: 6.030
Full output

Mycenaean
Corded_Ware_Germany 0.210±0.043
Minoan_Lasithi 0.790±0.043
taildiff: 0.961238695
chisq: 6.198
Full output

Mycenaean
Minoan_Lasithi 0.791±0.043
Srubnaya 0.209±0.043
taildiff: 0.950419642
chisq: 6.558
Full output
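The taildiff values above are chi-square tail probabilities, so they can be sanity-checked from the chisq values alone. Here's a minimal sketch. The degrees of freedom aren't listed in the summaries above, so the value of 14 used here is my assumption; it's simply the number that reproduces the reported figures. The function name is hypothetical, and the closed-form series it uses is only valid for even degrees of freedom.

```python
import math

def chisq_tail_prob(chisq, dof):
    """Upper-tail probability P(X > chisq) for a chi-square variable.

    Uses the finite series that is exact when dof is a positive even number.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only handles positive, even dof")
    half = chisq / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, dof // 2):
        term *= half / k          # builds (chisq/2)^k / k! step by step
        total += term
    return math.exp(-half) * total

ASSUMED_DOF = 14  # assumption: not stated in the results above

fits = {
    "Minoan_Lasithi + Sintashta": 6.030,
    "Minoan_Lasithi + Corded_Ware_Germany": 6.198,
    "Minoan_Lasithi + Srubnaya": 6.558,
}
for label, chisq in fits.items():
    print(f"{label}: tail probability {chisq_tail_prob(chisq, ASSUMED_DOF):.4f}")
```

The closer the tail probability is to 1, the better the fit, which is why these three models stand out.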

So it's essentially the same outcome as the one obtained by Lazaridis et al., because Sintashta and Srubnaya are part of their Steppe_MLBA cluster, while Corded Ware is part of their Europe_LNBA cluster, and it's these clusters that, along with Minoan_Lasithi, provided their most successful mixture models for the Mycenaeans. But it's nice to see Sintashta at the top of my results, because it fits so well with the long-postulated archaeological links between Sintashta and the Mycenaeans (for instance, see here).

By the way, here's what I said back in May when the Mathieson et al. 2017 preprint came out (see here). So things are falling into place rather nicely.

The same paper also includes the following individual from present-day Bulgaria dated to the start of the Late Bronze Age (LBA), which is roughly when the Mycenaeans appeared nearby in what is now Greece:

Bulgaria_MLBA I2163: Y-hg R1a1a1b2 mt-hg U5a2 1750-1625 calBCE

This guy is the most Yamnaya-like of all of the Balkan samples in Mathieson et al. 2017, and, as far as I can see based on his overall genome-wide results, probably indistinguishable from the contemporaneous Srubnaya people of the Pontic-Caspian steppe. He also belongs to Y-haplogroup R1a-Z93, which is a marker typical of Srubnaya and other closely related steppe groups such as Andronovo, Potapovka and Sintashta. So there's very little doubt that he's either a migrant or a recent descendant of migrants to the Balkans from the Pontic-Caspian steppe.

See also...

A Mycenaean and an Iron Age Iranian walk into a bar...

Main candidates for the precursors of the proto-Greeks in the ancient DNA record to date

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Friday, May 12, 2017

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...


All of the post-Middle Neolithic samples from the recent Mittnik et al. and Saag et al. preprints on the ancient population history of the Baltic region belonged to Y-chromosome haplogroup R1a. And most of them belonged to the R1a-M417 (R1a1a) subclade that makes up almost 100% of the R1a lineages in the world today. This is what the results look like in a table (the sample IDs are of my own design):


Earlier samples from the same region belonged to Y-haplogroups I2a and R1a, but this was a subclade of R1a defined by the YP1272 mutation that is extremely rare today even in Northeastern Europe.

And now shifting our focus west of Scandinavia: all but two of the post-Middle Neolithic samples from around the North Sea from the recent Olalde et al. preprint on the Bell Beaker phenomenon and ancient population history of Northwest Europe belonged to Y-chromosome R1b, and more specifically to the R1b-M269 (R1b1a1a2) subclade, which makes up almost 100% of the R1b lineages in the world today. Here's a table:


Earlier samples from the same region belonged to Y-haplogroups I2a, I, G2a and CF, and most of the instances of I and the CF would probably be classified as I2a if not for missing data.

Interestingly, despite the R1a vs R1b dichotomy between these obvious post-Middle Neolithic newcomers to the Baltic and North Sea regions, respectively, they were very similar in terms of overall genetic structure, obviously closely related, starkly different from Middle Neolithic Northern Europeans, and in all likelihood mainly derived from the same homeland, which was not located in Northern Europe.

So can we locate this homeland with any degree of certainty, you might wonder? In fact, you might ask, isn't this a futile search for the time being, as we await ancient DNA from many prehistoric Eurasian populations?

Not at all, because when attempting to answer this question we're bounded by two key constraints: the exceptionally high frequencies of R1a and R1b in the post-Middle Neolithic Baltic and North Sea samples, and their close genetic affinity to earlier and contemporaneous populations from the Pontic-Caspian steppe, part of which is due to significant Caucasus Hunter-Gatherer (CHG) admixture that was lacking in Middle Neolithic Northern Europeans.

Indeed, to date, the Pontic-Caspian steppe is the only region where both R1a and R1b have been found in ancient remains from the same sites dating to the Mesolithic, Neolithic and Eneolithic. Here's a table based on results from Mathieson et al. 2015 and 2017. The R and R1 might really be R1a or R1b if not for missing data.


The Pontic-Caspian steppe also abuts the Caucasus foothills, and we know that CHG admixture was a major feature of its inhabitants from at least the Eneolithic. So odds are, and make no mistake, these are indeed excellent odds, that the homeland we're looking for was on the Pontic-Caspian steppe.

But of course I2a has also been recorded in prehistoric samples from the Pontic-Caspian steppe. So, you might ask, why did the populations migrating out of the steppe belong to R1a and R1b, and why did some of them seemingly carry only R1a while others only R1b? This can be explained by local founder effects on the steppe due to patrilocality. Moreover, it's possible that some groups moving out of the steppe did carry high frequencies of I2a, but they're yet to enter the ancient DNA record. [Edit: Maybe they already have? See here]

Now, the aforementioned post-Middle Neolithic newcomers to the Baltic and North Sea regions are most certainly in large part the direct ancestors of modern-day Northern Europeans, speaking languages belonging to three daughter branches of late Proto-Indo-European (PIE): Balto-Slavic, Celtic and Germanic. It's highly unlikely that languages ancestral to these present-day languages were spoken by Middle Neolithic farmers, or that they were introduced into Northern Europe after it was colonized by the migrants from the Pontic-Caspian steppe.

What this strongly suggests is that the Pontic-Caspian steppe was also the late PIE homeland.

But, you might argue, the Pontic-Caspian steppe may have just been the expansion point for some of the late PIE language branches. No, that won't work. For one, modern-day populations speaking languages belonging to all other late PIE branches, such as Armenian, Greek, Indo-Iranian and Italic, show signals of the same population expansion from the Pontic-Caspian steppe that gave rise to modern-day Northern Europeans, in the form of Yamnaya-related genome-wide genetic admixture and appreciable frequencies of Y-chromosome haplogroups R1a-M417 and/or R1b-M269.

Some of these signals are certainly due to fairly recent admixture from Northern Europeans, like in much of Greece as a result of the Slavic expansions during the Early Middle Ages, but most cannot be explained in this way.

Secondly, Balto-Slavic, Celtic and Germanic are not more closely related to each other than to some of the other late PIE branches. For instance, Balto-Slavic is considered far more closely related to Indo-Iranian than to Celtic, which is generally seen as a sister branch to Italic. Therefore, if Balto-Slavic and Celtic derive from a homeland on the Pontic-Caspian steppe, then logically this is also where we should look for the origins of Indo-Iranian and Italic.

So as far as the late PIE homeland is concerned, thanks to ancient DNA, the debate is now practically over. But the PIE homeland debate is still wide open, or so we're told.

Apparently, Mathieson et al. 2017 aren't comfortable with putting the PIE homeland on the Pontic-Caspian Steppe because they can't find any evidence in their ancient DNA dataset of a significant migration through the Balkans that would potentially bring Anatolian languages from the Pontic-Caspian steppe to Anatolia. From the paper:

One version of the Steppe Hypothesis of Indo-European language origins suggests that Proto-Indo European languages developed in the steppe north of the Black and Caspian seas, and that the earliest known diverging branch – Anatolian – was spread into Asia Minor by movements of steppe peoples through the Balkan peninsula during the Copper Age around 4000 BCE, as part of the same incursions from the steppe that coincided with the decline of the tell settlements. [51] If this were correct, then one way to detect evidence of it would be the appearance of large amounts of characteristic steppe ancestry first in the Balkan Peninsula, and then in Anatolia. However, our genetic data do not support this scenario. While we find steppe ancestry in Balkan Copper Age and Bronze Age individuals, this ancestry is sporadic across individuals in the Copper Age, and at low levels in the Bronze Age. Moreover, while Bronze Age Anatolian individuals have CHG/Iran Neolithic related ancestry, they have neither the EHG ancestry characteristic of all steppe populations sampled to date [20] , nor the WHG ancestry that is ubiquitous in southeastern Europe in the Neolithic (Figure 1A, Supplementary Data Table 2, Supplementary Information section 1). This pattern is consistent with that seen in northwestern Anatolia [11] and later in Copper Age Anatolia [23], suggesting continuing migration into Anatolia from the East rather than from Europe.

And this...

On the other hand, our data could still be consistent with the Steppe-Balkans-Anatolia route hypothesis model, albeit with constraints. It remains possible that populations dating to around 1600 BCE in the regions where the Indo-European Luwian, Hittite and Palaic languages were spoken did have European hunter-gatherer ancestry. However, our results would require that such ancestry was not ubiquitous in Bronze Age Anatolia, and was perhaps tightly linked to Indo-European speaking groups. We predict that additional insight about the genetic origins of the potential speakers of early Indo-European languages will be obtained when ancient DNA data become available from additional sites in this key period in Anatolia and the Caucasus.

But I'd say the authors are taking that one particular version of the Steppe Hypothesis way too seriously. They might even be implying things that the creator(s) of the said hypothesis never posited.

Why do they seemingly expect a massive surge of steppe admixture into the Balkans during the Copper Age? If the steppe people are just shooting through the Balkans on their way to Anatolia, why would they leave a lot of admixture along the way? And if the locals are abandoning their tell settlements and running for the hills as far away from the oncoming steppe invaders as they can, how exactly would they acquire steppe admixture? Osmosis or what?

The Balkans is not Northern Europe, and the hypothesized migration of the proto-Anatolians from the Pontic-Caspian Steppe to Anatolia through the Balkans was never, as far as I know, meant to parallel the massive Corded Ware expansion across Northern Europe. In other words, why should all of the early Indo-European expansions have been of the same character, especially considering that they moved into such starkly different areas of Eurasia?

Indeed, as Mathieson et al. 2017 point out in the quote above, the evidence for the fleeting presence of steppe peoples in the Copper Age Balkans is in their dataset. For instance, in their Varna 1 sample set from Bulgaria, three out of the five individuals show significant steppe admixture. One of these individuals is almost 50% Yamnaya-like. Surely, there's really no need to expect anything more than that when looking for signals of a proto-Anatolian migration from the Pontic-Caspian Steppe to Anatolia.

In fact, even though I do appreciate the incredible work these guys are doing and the data they're making available to myself and everyone else, I suspect that there's a little bit of, shall we say, schadenfreude going on here.

They sequenced all of three Early Bronze Age Anatolians of obscure origin (are they actually suspected Anatolian speakers, like Luwians?), and apparently it's a big deal that they can't find any steppe admixture in Early Bronze Age Anatolia. Come on.

And then we're offered just three Yamnaya samples from the Pontic Steppe in Ukraine. One happens to be a massive outlier towards the Caucasus. Wow, what are the chances of that? And guess what, all three of these Yamnayans are females, so of course we're left wondering about the Y-haplogroups of the Yamnaya males on the Pontic Steppe. What happened to the males? Next paper, that's what.

Update 19/05/2017: Please note that the authors are not holding back any Yamnaya males from Ukraine for a future paper, as per my claim in the last paragraph above. They used what they had for the time being.

Update 21/05/2017: Actually, I suspect that we already have a population from the Bronze Age steppe in the ancient DNA record with a high frequency of Y-haplogroup I2a. See here.

See also...

R1a-M417 from Eneolithic Ukraine!!!11

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Eastern Europe as a bifurcation hotspot for Y-hg R1

Globular Amphora people starkly different from Yamnaya people