Monday, March 18, 2019

Open thread: What are the linguistic implications of Olalde et al. 2019?

I was going to write a huge post on the linguistic implications of the latest batch of ancient DNA from Iberia courtesy of Olalde et al. 2019, and then I thought better of it. Admittedly, I don't know enough about the languages of prehistoric Iberia to say anything really useful on the topic. So instead here's an open thread to bounce around a few ideas in the comments.

Just briefly, this is what Olalde et al. say in the abstract of their paper about the relationship between ancestry from the Pontic-Caspian steppe and languages in Iron Age Iberia:

We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia.

However, in the paper it's revealed that "Indo-European regions" actually refers to a Celtic-speaking part of northern Iberia. And it's quite possible that Celts moved into this area from outside of Iberia only during the Iron Age. In other words, the speakers of Indo-European languages here may not have been the descendants of any of the people with steppe ancestry who came to Iberia by ~2000 BCE.

So I'm probably not alone in thinking that the linguistic affinities of the early migrants with steppe ancestry who reached Iberia (mostly associated with the Bell Beaker culture or BBC) remain an open question, especially since they evidently had such a profound genetic impact on the later non-Indo-European-speaking populations of southern Iberia. Could they have been the speakers of unattested Indo-European languages, as well as Proto-Iberian and Proto-Basque? If not, why not?

Below is a Principal Component Analysis (PCA) of West Eurasian genetic variation. I highlighted some of the ancient samples from Olalde et al., as well as Basques and other present-day Iberians. The Basques form a tight cluster with most of the Copper, Bronze and Iron Age Iberians, and, unlike the other present-day Iberians, they basically look like an Iberian population from the metal ages. The relevant datasheet is available here.

This is nothing new and very much in line with the results in Olalde et al., but I wanted to emphasize the point that Basques were not just a group that experienced an extreme founder effect in R1b-P312, which is a Beaker-specific Y-chromosome lineage. Rather, they're still very similar to Iberian Beakers in terms of overall genetic structure. So where did they get their language?

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, March 16, 2019

Let's try a formal heuristic approach

I created a massive outgroup f3-statistics matrix, featuring almost 300 ancient and present-day populations and individuals, for the purpose of running unsupervised, or at least semi-supervised, fine scale mixture tests with nMonte. Most of the stats were computed with 400-900K SNPs, which is a lot and should provide plenty of power. The matrix is available in a zip file here.
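For anyone who hasn't run this kind of test before, here's a rough Python sketch of the general idea behind a Monte Carlo mixture fit: treat each population's row of statistics as a vector, then look for the blend of source rows that sits closest to the target row. This is not the nMonte code. The function name `fit_mixture`, its parameters, and the toy numbers are all made up for illustration, and the real tool has options that aren't reproduced here.

```python
import math
import random


def fit_mixture(target, sources, slots=100, iterations=5000, seed=0):
    """Estimate mixture proportions by random single-slot swaps.

    target  -- list of floats (e.g. one row of a statistics matrix)
    sources -- dict mapping a source label to a list of floats
    Returns (proportions, distance).
    """
    names = list(sources)
    if len(names) < 2:
        raise ValueError("need at least two candidate sources")
    rng = random.Random(seed)

    # Start with every slot assigned to a randomly chosen source.
    counts = {name: 0 for name in names}
    for _ in range(slots):
        counts[rng.choice(names)] += 1

    def distance(c):
        # Euclidean distance between the target and the slot-weighted mean.
        total = 0.0
        for k, t in enumerate(target):
            mixed = sum(c[n] * sources[n][k] for n in names) / slots
            total += (mixed - t) ** 2
        return math.sqrt(total)

    best = distance(counts)
    for _ in range(iterations):
        # Move one slot from a random donor to a random receiver.
        donor = rng.choice([n for n in names if counts[n] > 0])
        receiver = rng.choice([n for n in names if n != donor])
        counts[donor] -= 1
        counts[receiver] += 1
        trial = distance(counts)
        if trial < best:
            best = trial  # keep the swap, the fit improved
        else:
            counts[donor] += 1  # undo the swap
            counts[receiver] -= 1
    return {n: counts[n] / slots for n in names}, best


# Toy values only (hypothetical labels, not taken from the matrix).
toy_sources = {
    "Source_A": [0.30, 0.10, 0.20],
    "Source_B": [0.10, 0.30, 0.20],
    "Source_C": [0.20, 0.20, 0.40],
}
toy_target = [0.20, 0.20, 0.20]  # halfway between Source_A and Source_B
print(fit_mixture(toy_target, toy_sources))
```

The number of slots sets the resolution of the result: with 100 slots the proportions come out in steps of 1%.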

The results I'm getting with this new setup are very similar to those obtained with the Global25. The main differences, as far as I can see for now, are that the f3 data produce more stable results when modeling very deep ancestry, while the Global25 provides more accuracy when modeling fine scale recent ancestry (probably because it's better at picking up more recent genetic drift).

Let's investigate some pertinent issues with the new data using nMonte and PAST. How about we start with these?

- where did Bell Beakers get their steppe ancestry from?

- which Steppe_MLBA group did Indians get their steppe ancestry from?

- do the present-day Irish have any Hallstatt ancestry?

- what is the origin of present-day Basques?

- what is the precise ancestry of Armenia_ChL?

- do the Swat Iron Age samples really lack BMAC ancestry?

- does Anatolia_MLBA really lack steppe ancestry?

Note that the f3 matrix includes the ancients from the new Olalde et al. paper on the genomic history of Iberia (see here). I've also updated the Global25 datasheets with most of these samples.

Global 25 datasheet (scaled)

Global 25 pop averages (scaled)

Global 25 datasheet

Global 25 pop averages

By the way, Hajji_Firuz_ChL I2327, from Narasimhan et al. 2018, is now labeled Hajji_Firuz_IA in the above datasheets, because my understanding is that he's actually from the Iron Age rather than the Chalcolithic period. For background reading about this controversial sample see here and here. I don't have any more info on this topic; we'll just have to wait for the formal publication of the Narasimhan et al. manuscript to get all the details. Apparently it's coming very soon.

See also...

An exceptional burial indeed, but not that of an Indo-European

Maykop: a multi-ethnic layer cake?

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, March 14, 2019

Two new papers on ancient Iberia

Olalde et al. 2019 (Science) at this LINK...

Abstract: We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean. DOI: 10.1126/science.aav4040

Villalba-Mouco et al. 2019 (Current Biology) at this LINK...

Summary: The Iberian Peninsula in southwestern Europe represents an important test case for the study of human population movements during prehistoric periods. During the Last Glacial Maximum (LGM), the peninsula formed a periglacial refugium [1] for hunter-gatherers (HGs) and thus served as a potential source for the re-peopling of northern latitudes [2]. The post-LGM genetic signature was previously described as a cline from Western HG (WHG) to Eastern HG (EHG), further shaped by later Holocene expansions from the Near East and the North Pontic steppes [3, 4, 5, 6, 7, 8, 9]. Western and central Europe were dominated by ancestry associated with the ∼14,000-year-old individual from Villabruna, Italy, which had largely replaced earlier genetic ancestry, represented by 19,000–15,000-year-old individuals associated with the Magdalenian culture [2]. However, little is known about the genetic diversity in southern European refugia, the presence of distinct genetic clusters, and correspondence with geography. Here, we report new genome-wide data from 11 HGs and Neolithic individuals that highlight the late survival of Paleolithic ancestry in Iberia, reported previously in Magdalenian-associated individuals. We show that all Iberian HGs, including the oldest, a ∼19,000-year-old individual from El Mirón in Spain, carry dual ancestry from both Villabruna and the Magdalenian-related individuals. Thus, our results suggest an early connection between two potential refugia, resulting in a genetic ancestry that survived in later Iberian HGs. Our new genomic data from Iberian Early and Middle Neolithic individuals show that the dual Iberian HG genomic legacy pertains in the peninsula, suggesting that expanding farmers mixed with local HGs. DOI:

See also...

Migration of the Bell Beakers—but not from Iberia (Olalde et al. 2018)

Single Grave > Bell Beakers

CHG or no CHG in Bronze Age western Iberia?

Thursday, March 7, 2019

A challenge

The datasheets below contain outgroup f3-statistics for a wide range of ancient and present-day populations. Five of the ancient groups and individuals are labeled "Unknown". In fact, I do know what they are, but I'd like you to try and work out whether they were the speakers of Indo-European or non-Indo-European languages by analyzing the datasheets with, say, PAST or nMonte.



I'll reveal the identities and likely languages of the mystery ancients in a couple of days. It'll be interesting to see if any of you nail this challenge. It shouldn't be too difficult, but to help things along, I color coded the populations in the datasheets (black = Indo-European, blue = Uralic, and grey = neither). If you haven't done this sort of thing before, these blog posts might be useful as background reading.

Maykop: a multi-ethnic layer cake?

Global25 PAST-compatible datasheets

D-stats/nMonte open thread

Update 09/03/2019: Samuel nailed the challenge in the first post below. And then Matt almost figured out the precise identities of the mystery ancients here. In hindsight I should've made this more difficult. Here are the answers:

Unknown1 = England_Anglo-Saxon (Indo-European) > more here
Unknown2 = Levanluhta_IA (non-Indo-European) > more here
Unknown3 = Minoan_Lasithi (non-Indo-European) > more here
Unknown4 = Slavic_Bohemia (Indo-European) > more here
Unknown5 = Turkmenistan_IA (Indo-European) > more here

Monday, March 4, 2019

An exceptional burial indeed, but not that of an Indo-European

Not too many people have been buried sitting on wagons. The most famous case is that of an Early Bronze Age man who, considering his injuries, may have died in a high-speed crash - high-speed for its time anyway - on the Pontic-Caspian steppe in Eastern Europe.

It's likely that this guy was one of the very first wagon-drivers in human history, because his four-wheeled wooden model is dated to 3336-3105 calBCE, which makes it the oldest wagon discovered thus far. His genotype data, under the label Steppe Maykop SA6004, were published recently along with Wang et al. 2019.

Early wagons are very important for a couple of reasons: they revolutionized human transport and warfare, and they're often closely associated with the prehistoric expansions of Indo-European languages.

So I'm pretty sure that many of you must be thinking right now that wagon-driver SA6004 was an early Indo-European, or even a Proto-Indo-European! I bet that's what Wang et al. thought too, considering the conclusion in their paper. But, alas, the chances of this are slim to none.

Steppe Maykop samples show rather peculiar genetic structure considering their geographic origin, with a large proportion of their ancestry deriving from a source closely related to western Siberian hunter-gatherers (aka West_Siberia_N in the ancient DNA record). Indeed, SA6004 basically looks like a 50/50 mix between West_Siberia_N and Piedmont_Eneolithic. Here's a map with all of the relevant details.
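To make the idea of a two-way mix concrete, here's a minimal Python sketch of how a single mixing share can be estimated when there are only two candidate sources. It projects the target vector onto the line between the two source vectors, which is the least-squares answer for a + (1 - a) weights. The function name `two_way_mix` and the inputs are hypothetical, and this is only an illustration of the arithmetic, not the method behind the 50/50 estimate above.

```python
def two_way_mix(target, source_a, source_b):
    """Return the share of source_a in the least-squares mix a*A + (1-a)*B."""
    diff = [a - b for a, b in zip(source_a, source_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two sources are identical")
    # Projection of (target - B) onto (A - B), as a fraction of |A - B|^2.
    alpha = sum((t - b) * d for t, b, d in zip(target, source_b, diff)) / denom
    # Clamp to [0, 1], since negative ancestry shares aren't meaningful.
    return min(1.0, max(0.0, alpha))
```

A value near 0.5 means the target sits about halfway between the two sources. A value pinned at 0 or 1 suggests that one of the sources isn't needed, or that a better source is missing.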

Thus, clearly, the Steppe Maykop population wasn't ancestral or even directly related to the steppe and steppe-derived groups generally regarded to have been Indo-European speaking, such as those associated with the Yamnaya, Corded Ware, and Bell Beaker cultures. That's because these groups lack any discernible West_Siberia_N-related ancestry.

It also wasn't ancestral or directly related to any present-day or currently sampled ancient Indo-European speaking populations, again because these populations basically lack West_Siberia_N-related ancestry.

On the other hand, Yamnaya, Corded Ware and other closely related groups show an exceptionally strong genetic relationship with Indo-European speakers, especially those from across Northern Europe, which experienced massive migrations from the Pontic-Caspian steppe during the late Neolithic period, and hardly anything from elsewhere since then.

Case in point, the samples from Wang et al. labeled Yamnaya Caucasus were recovered from the same area of the Pontic-Caspian steppe as their Steppe Maykop samples, and yet, take a look at this linear model based on outgroup f3-statistics. Steppe Maykop does show high genetic affinity to Indo-European speakers (no doubt mediated via its Piedmont_Eneolithic-related ancestry), but, unlike Yamnaya Caucasus, it also shows, unusually for a West Eurasian population, high affinity to Native Americans and Siberians. The relevant datasheet is available here.
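For readers who'd rather script this than use a plotting package, a linear model of this sort can be sketched in a few lines of Python: regress one column of f3 values on another across the same set of populations, then rank the populations by their residuals. The helper names `fit_line` and `residuals` are my own, the inputs are whatever two columns you choose, and this is only an ordinary least-squares sketch under those assumptions, not a reproduction of the plot.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equally long columns with at least 2 rows")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(labels, xs, ys):
    """Residual per population, sorted from most positive to most negative."""
    slope, intercept = fit_line(xs, ys)
    res = {
        lab: y - (slope * x + intercept)
        for lab, x, y in zip(labels, xs, ys)
    }
    return dict(sorted(res.items(), key=lambda kv: kv[1], reverse=True))
```

Populations with large positive residuals share more drift with the group on the y axis than the overall trend predicts, and those with large negative residuals lean towards the group on the x axis.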

So the only way the Steppe Maykop population could have been Indo-European-speaking is if it inherited its Indo-European speech from its Piedmont_Eneolithic-related ancestors. And even if it was Indo-European-speaking, it probably spoke an extinct Indo-European language not closely related to any extant Indo-European languages. In other words, the possibility that Steppe Maykop passed on its language to Yamnaya, along with its wagons, is close to zero. More likely, Yamnaya stole a few wagons from Steppe Maykop, and the rest is history.

See also...

The Steppe Maykop enigma

On Maykop ancestry in Yamnaya

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, March 2, 2019

Maykop: a multi-ethnic layer cake?

Let's speculate about the linguistic affinities of the currently available ancient populations from the Caucasus and surrounds. I put together a series of outgroup f3-stats to help things along. They're available for download here.

Georgian 0.258224
Abkhasian 0.257899
Latvian 0.257376
Swedish 0.257301
Turkish_Trabzon 0.256996
Basque_Spanish 0.256589
Chechen 0.256514
Icelandic 0.256418
Norwegian 0.256325
Lezgin 0.256272
Irish 0.256227
Tabasaran 0.256092
Italian_Bergamo 0.25605
English_Cornwall 0.256032
Polish_East 0.255991
Scottish 0.255955
Adygei 0.255913

Latvian 0.261845
Russian_North 0.26145
Estonian 0.260355
Finnish 0.260211
Lithuanian 0.260072
Udmurd 0.259804
Ingrian 0.259663
Surui 0.259637
Vepsa 0.259608
Karelian 0.259532
Karitiana 0.259482
Russian_West 0.259397
Russian_Central 0.259274
Wichi 0.259106
Saami 0.258982
Komi 0.258945
Icelandic 0.258854
Swedish 0.258814
Mordovian 0.258604
Irish 0.25859

Eyeballing the stats might be enough to get a general impression about what they mean, but to understand them properly it's necessary to get technical with something like PAST3 (see here). That's because f3-stats pick up shared genetic drift from all drift paths, and don't especially focus on more recently shared ancestry. This can often lead to confusing outcomes.
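To see why these stats pick up drift from all shared paths, it helps to look at how an outgroup f3 value is put together. Here's a bare-bones Python sketch of the textbook form, f3(O; A, B), as the average over SNPs of (o - a) * (o - b), where o, a and b are allele frequencies in the outgroup and the two test populations. The function name `outgroup_f3` is hypothetical, and this naive version leaves out the finite-sample correction and standard errors that dedicated software normally provides, so treat it as an illustration only.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Naive f3(O; A, B): mean of (o - a) * (o - b) across SNPs.

    Each argument is a list of allele frequencies (0.0 to 1.0) for the
    same SNPs in the same order.
    """
    if not outgroup or not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("need three equally long, non-empty frequency lists")
    products = [
        (o - a) * (o - b)
        for o, a, b in zip(outgroup, pop_a, pop_b)
    ]
    return sum(products) / len(products)
```

Every SNP where A and B have both moved away from the outgroup in the same direction adds to the total, no matter when that shared movement happened. That's why a high value on its own doesn't separate recent common ancestry from very old common ancestry.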

Below are a few examples of linear models based on my f3-stats. Note that many Indo-European speakers, especially from Northern Europe, are foremost attracted to ancient samples from the Pontic-Caspian steppe. On the other hand, non-Indo-European speakers, from such far flung locations as the Caucasus and Iberia, show relatively stronger affinity to ancient samples from Anatolia and the Caucasus. Moreover, Uralic speakers show elevated affinity to ancient hunter-gatherer samples from Eastern Europe and Siberia. Makes sense, right?

Based on these and other data, I'd say that Maykop and the culturally related Steppe Maykop were something of a multi-ethnic polity, with many closely and distantly related languages spoken by its people, including perhaps Kartvelian, Northwest Caucasian, Yeniseian and Indo-European. But it seems to me that Proto-Indo-European was spoken by steppe foragers turned pastoralists just outside of the Maykop zone. And I'm quite sure that after the Maykop collapse various early Indo-European groups pushed across the Caucasus and deep into the Near East. Just take a look at the f3-stats and linear model for Hajji_Firuz_BA to see what I mean.

See also...

An exceptional burial indeed, but not that of an Indo-European

The Steppe Maykop enigma

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...