
Sunday, March 31, 2019

Map of pre-Corded Ware culture (>2900 BCE) instances of Y-haplogroup R1a

Below is a map showing the global distribution of Y-chromosome haplogroup R1a prior to the expansions of the R1a-rich Corded Ware culture (CWC) people and their descendants across Europe and Asia from around 2900 BCE. I'll be updating this map regularly and using it to help me narrow down the options for the place of origin of R1a, and also to counter the misinformation about this topic that has appeared in print and online over the years, including in many scientific publications and popular websites such as Wikipedia.

Incredibly, as far as I know, there are just six reliably called instances of R1a in the now ample Eurasian ancient DNA record dating to the pre-CWC period. To put this into perspective, consider that R1a is today the most common Y-haplogroup in much of Europe and Asia. How did that happen, I wonder? Please note, however, that I chose to base the map only on samples sequenced with the capture and shotgun methods, rather than the PCR method, which is susceptible to producing contaminated results and is no longer used in major ancient DNA studies.

See also...

The beast among Y-haplogroups

The Poltavka outlier

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Monday, March 25, 2019

Celtic probably not from the west

The term "Celtic from the west" is the catchphrase for a working theory, offered in a couple of recent books, positing that the earliest speakers of Celtic languages lived in Atlantic Europe during the Bronze Age or even earlier. It'll be interesting to see how this theory holds up against increasing numbers of ancient samples from attested early Celtic-speaking populations.

More popular and long-standing theories postulate that the Proto-Celts are associated with the Urnfield and/or Hallstatt archeological cultures of Late Bronze Age and Iron Age Central Europe. I'm inclined to agree with these more mainstream views when looking at my qpAdm mixture models below of three Celtiberians from what is now La Hoya, northern Spain, from the recent Olalde et al. paper on the genomic history of Iberia.

Halberstadt_LBA 0.207±0.077
Pre-Celtiberian_LaHoya 0.793±0.077

chisq 15.031
tail prob 0.522396
Full output

Halberstadt_LBA 0.196±0.074
Non-Celtic_Iberian 0.804±0.074

chisq 17.366
tail prob 0.362297
Full output
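For anyone wondering how the chisq and tail prob lines in the two outputs above relate to each other, here is a minimal sketch in Python. The tail prob is the upper-tail probability of the chi-square statistic, and a value above 0.05 is conventionally read as the model not being rejected. The 16 degrees of freedom used here are an assumption: the number isn't given in the post, but it is the value that reproduces both reported tail probabilities.

```python
import math

def chi2_tail_prob(chisq: float, dof: int) -> float:
    """Upper-tail probability of a chi-square statistic.

    Uses the closed form that exists for even degrees of freedom:
    Q(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = chisq / 2.0
    term = 1.0   # holds (half ** i) / i!, starting at i = 0
    total = 0.0
    for i in range(dof // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total

# Assumption: inferred from the reported numbers, not stated in the post.
DOF = 16

for label, chisq in [("Pre-Celtiberian_LaHoya model", 15.031),
                     ("Non-Celtic_Iberian model", 17.366)]:
    print(f"{label}: tail prob = {chi2_tail_prob(chisq, DOF):.4f}")
```

Both values come out at roughly 0.52 and 0.36, in line with the outputs above, so both models pass comfortably.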

The Celtiberians show a stronger signal of (Urnfield-related?) ancestry from the northeast than their Bronze Age predecessors in northern Iberia (Pre-Celtiberian_LaHoya) as well as their Iron Age contemporaries from eastern Iberia (Non-Celtic_Iberian). The latter group very likely spoke the non-Indo-European Iberian language. It's not clear what the Bronze Age northern Iberians spoke, but it may have been a language related to Basque, which is also non-Indo-European.

Of course, the fact that the Celtiberians harbored more northern Bell Beaker-related ancestry than basically all earlier Iberian groups was already reported in the Olalde et al. paper (on page 2), but I just wanted to see if I could flesh out some more details in regards to this observation by using chronologically and archeologically more proximate reference populations.

See also...

Open thread: What are the linguistic implications of Olalde et al. 2019?

An exceptional burial indeed, but not that of an Indo-European

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, March 23, 2019

How did Y-haplogroup J2b get to Europe?

Y-haplogroup J2b, defined by the L282 mutation, is found throughout Europe and reaches relatively high frequencies in the southeastern part of the continent. But the question of how and when it got to Europe is still wide open.

It's certainly native to the Near East, where all of the main subclades of Y-haplogroup J2 show more structure than anywhere else. Indeed, it's first attested in the ancient DNA record in an Early Neolithic sample from the Zagros Mountains, in what is now western Iran, dating to ~8,000 calBCE.

It doesn't appear outside of this region until a few thousand years later, when it's recorded in an Early Bronze Age sample dating to ~2,300 calBCE from a site near the Mediterranean Sea in present-day Jordan.

In Europe, it's first attested in a Middle Bronze Age sample from the Caucasus Mountains, in what is now southern Russia, dating to ~1,900 calBCE. However, this individual's burial site is practically in the Near East, and, in fact, in terms of ancestry and archeology he is best described as Near Eastern. Importantly, he's also not directly associated with any population that contributed to the genetic structure of Europeans (for instance, see here).

J2b first appears deep in Europe a little later during the Middle Bronze Age, in several samples from sites near the Mediterranean coast in what are now Croatia and Sardinia. This is obviously nowhere near the Caucasus, but it is in a part of Europe that was linked to the Near East at the time via extensive maritime trade networks. Interestingly, however, all of these individuals are genetically very typical of where and when they lived, in that they don't show any obvious recent foreign admixture.

So how did Y-haplogroup J2b get to Europe? My view for now is that it mostly arrived with a few sailors from the Near East during the Early to Middle Bronze Age. This is just about the only plausible theory that I can come up with when looking at this map.

The idea that J2b moved deep into Europe along with the population movements of early pastoralists from the Pontic-Caspian steppe seems to be fairly popular online. However, it currently has no support from ancient DNA. In fact, it's downright contradicted by ancient DNA, because J2b is missing in tens of samples from a wide range of archeological cultures associated with these population movements. If anyone out there disagrees, then please show me a single instance of J2b in samples from the Khvalynsk, Sredny Stog, Yamnaya, Poltavka, Corded Ware, Bell Beaker, Catacomb, Srubnaya and other closely related ancient European steppe and steppe-derived cultures.

See also...

Ancient island hopping in the western Mediterranean (Fernandes et al. 2019 preprint)

On Maykop ancestry in Yamnaya

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, March 21, 2019

Ancient island hopping in the western Mediterranean (Fernandes et al. 2019 preprint)

Over at bioRxiv at this LINK. Here's the abstract, emphasis is mine:
A series of studies have documented how Steppe pastoralist-related ancestry reached central Europe by at least 2500 BCE, while Iranian farmer-related ancestry was present in Aegean Europe by at least 1900 BCE. However, the spread of these ancestries into the western Mediterranean where they have contributed to many populations living today remains poorly understood. We generated genome-wide ancient DNA from the Balearic Islands, Sicily, and Sardinia, increasing the number of individuals with reported data from these islands from 3 to 52. We obtained data from the oldest skeleton excavated from the Balearic islands (dating to ~2400 BCE), and show that this individual had substantial Steppe pastoralist-derived ancestry; however, later Balearic individuals had less Steppe heritage reflecting geographic heterogeneity or immigration from groups with more European first farmer-related ancestry. In Sicily, Steppe pastoralist ancestry arrived by ~2200 BCE and likely came at least in part from Spain as it was associated with Iberian-specific Y chromosomes. In Sicily, Iranian-related ancestry also arrived by the Middle Bronze Age, thus revealing that this ancestry type, which was ubiquitous in the Aegean by this time, also spread further west prior to the classical period of Greek expansion. In Sardinia, we find no evidence of either eastern ancestry type in the Nuragic Bronze Age, but show that Iranian-related ancestry arrived by at least ~300 BCE and Steppe ancestry arrived by ~300 CE, joined at that time or later by North African ancestry. These results falsify the view that the people of Sardinia are isolated descendants of Europe's first farmers. Instead, our results show that the island's admixture history since the Bronze Age is as complex as that in many other parts of Europe.

Fernandes et al., The Arrival of Steppe and Iranian Related Ancestry in the Islands of the Western Mediterranean, bioRxiv, posted March 21, 2019, doi:

Update: Another preprint on a similar theme by Marcus et al. has appeared at bioRxiv (see here).

Abstract: Recent ancient DNA studies of western Eurasia have revealed a dynamic history of admixture, with evidence for major migrations during the Neolithic and Bronze Age. The population of the Mediterranean island of Sardinia has been notable in these studies -- Neolithic individuals from mainland Europe cluster more closely with Sardinian individuals than with all other present-day Europeans. The current model to explain this result is that Sardinia received an initial influx of Neolithic ancestry and then remained relatively isolated from expansions in the later Neolithic and Bronze Age that took place in continental Europe. To test this model, we generated genome-wide capture data (approximately 1.2 million variants) for 43 ancient Sardinian individuals spanning the Neolithic through the Bronze Age, including individuals from Sardinia's Nuragic culture, which is known for the construction of numerous large stone towers throughout the island. We analyze these new samples in the context of previously generated genome-wide ancient DNA data from 972 ancient individuals across western Eurasia and whole-genome sequence data from approximately 1,500 modern individuals from Sardinia. The ancient Sardinian individuals show a strong affinity to western Mediterranean Neolithic populations and we infer a high degree of genetic continuity on the island from the Neolithic (around fifth millennium BCE) through the Nuragic period (second millennium BCE). In particular, during the Bronze Age in Sardinia, we do not find significant levels of the "Steppe" ancestry that was spreading in many other parts of Europe at that time. We also characterize subsequent genetic influx between the Nuragic period and the present. We detect novel, modest signals of admixture between 1,000 BCE and present-day, from ancestry sources in the eastern and northern Mediterranean. 
Within Sardinia, we confirm that populations from the more geographically isolated mountainous provinces have experienced elevated levels of genetic drift and that northern and southwestern regions of the island received more gene flow from outside Sardinia. Overall, our genetic analysis sheds new light on the origin of Neolithic settlement on Sardinia, reinforces models of genetic continuity on the island, and provides enhanced power to detect post-Bronze-Age gene flow. Together, these findings offer a refined demographic model for future medical genetic studies in Sardinia.

Marcus et al., Population history from the Neolithic to present on the Mediterranean island of Sardinia: An ancient DNA perspective, bioRxiv, posted March 21, 2019, doi:

See also...

Open thread: What are the linguistic implications of Olalde et al. 2019?

Monday, March 18, 2019

Open thread: What are the linguistic implications of Olalde et al. 2019?

I was going to write a huge post on the linguistic implications of the latest batch of ancient DNA from Iberia courtesy of Olalde et al. 2019, and then I thought better of it. Admittedly, I don't know enough about the languages of prehistoric Iberia to say anything really useful on the topic. So instead here's an open thread to bounce around a few ideas in the comments.

Just briefly, this is what Olalde et al. say in the abstract of their paper about the relationship between ancestry from the Pontic-Caspian steppe and languages in Iron Age Iberia:

We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia.

However, in the paper it's revealed that "Indo-European regions" actually refers to a Celtic-speaking part of northern Iberia. And it's quite possible that Celts moved into this area from outside of Iberia only during the Iron Age. In other words, the speakers of Indo-European languages here may not have been the descendants of any of the people with steppe ancestry who came to Iberia by ~2000 BCE.

So I'm probably not alone in thinking that the question of the linguistic affinities of these early migrants with steppe ancestry to Iberia (mostly associated with the Bell Beaker culture or BBC) remains open, especially since they evidently had such a profound genetic impact on the later non-Indo-European-speaking populations of eastern and southern Iberia. Could they have been the speakers of unattested Indo-European languages, as well as Proto-Iberian and Proto-Basque? If not, why not?

Below is a Principal Component Analysis (PCA) of West Eurasian genetic variation. I highlighted some of the ancient samples from Olalde et al., as well as Basques and other present-day Iberians. The Basques form a tight cluster with most of the Copper, Bronze and Iron Age Iberians, and, unlike the other present-day Iberians, they basically look like an Iberian population from the metal ages. The relevant datasheet is available here.

This is nothing new and very much in line with the results in Olalde et al., but I wanted to emphasize the point that Basques were not just a group that experienced an extreme founder effect in R1b-P312, which is a Beaker-specific Y-chromosome lineage. Rather, they're still very similar to Iberian Beakers in terms of overall genetic structure. So where did they get their language?

See also...

Celtic probably not from the west

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, March 16, 2019

Let's try a formal heuristic approach

I created a massive outgroup f3-statistics matrix, featuring almost 300 ancient and present-day populations and individuals, for the purpose of running unsupervised, or at least semi-supervised, fine scale mixture tests with nMonte. Most of the stats were computed with 400-900K SNPs, which is a lot and should provide plenty of power. The matrix is available in a zip file here.
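For readers new to these stats, here is a rough sketch of what a single cell in such a matrix measures. An outgroup f3-statistic, f3(Outgroup; A, B), is the average over SNPs of (o - a) * (o - b), where o, a and b are allele frequencies, so it grows with the amount of genetic drift that A and B share after splitting from the outgroup. The function below is a naive version that skips the finite-sample bias correction applied by real tools, and the allele frequencies are made up purely for illustration.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Naive outgroup f3: mean over SNPs of (o - a) * (o - b).

    Each argument is a list of allele frequencies, one per SNP.
    No bias correction is applied, unlike in real software.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all populations need the same SNPs")
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)

# Hypothetical allele frequencies at five SNPs (not real data).
outgroup = [0.0, 0.0, 1.0, 0.0, 1.0]
pop_x = [0.8, 0.6, 0.2, 0.1, 0.9]
pop_y = [0.7, 0.5, 0.3, 0.1, 0.8]   # drifted together with pop_x
pop_z = [0.1, 0.0, 0.9, 0.6, 0.4]   # shares little drift with pop_x

f3_xy = outgroup_f3(outgroup, pop_x, pop_y)
f3_xz = outgroup_f3(outgroup, pop_x, pop_z)
print(f"f3(Outgroup; X, Y) = {f3_xy:.3f}")
print(f"f3(Outgroup; X, Z) = {f3_xz:.3f}")
```

The higher value for the X/Y pair reflects more shared drift, which is why rows of such a matrix can be treated as coordinates for mixture tests.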

The results I'm getting with this new setup are very similar to those obtained with the Global25. The main differences, as far as I can see for now, are that the f3 data produce more stable results when modeling very deep ancestry, while the Global25 provides more accuracy when modeling fine scale recent ancestry (probably because it's better at picking up more recent genetic drift).
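To give a feel for what a Monte Carlo mixture test does with rows from a matrix like this, here is a toy sketch in Python. It is only loosely in the spirit of nMonte and is not its actual code: it fills a fixed number of slots with source populations, swaps one slot at a time at random, and keeps any swap that doesn't move the slot average further from the target. The function name `fit_mixture`, its parameters, the source labels and all of the numbers are hypothetical, and the target is constructed as an exact 50/50 mix of two of the sources.

```python
import random

def fit_mixture(target, sources, slots=100, steps=20000, seed=1):
    """Toy hill-climbing Monte Carlo fit of mixture proportions.

    target:  list of coordinates (e.g. one row of an f3 matrix)
    sources: dict mapping a label to a row of the same length
    Returns (proportions, distance between fitted mix and target).
    """
    rng = random.Random(seed)
    names = list(sources)
    dims = len(target)
    # Start from a random assignment of slots to sources.
    assignment = [rng.choice(names) for _ in range(slots)]
    # Running per-dimension sum of the rows currently in the slots.
    total = [sum(sources[n][d] for n in assignment) for d in range(dims)]

    def sq_dist(vec_sum):
        return sum((s / slots - t) ** 2 for s, t in zip(vec_sum, target))

    best = sq_dist(total)
    for _ in range(steps):
        i = rng.randrange(slots)
        new = rng.choice(names)
        old = assignment[i]
        if new == old:
            continue
        trial = [s - sources[old][d] + sources[new][d]
                 for d, s in enumerate(total)]
        trial_dist = sq_dist(trial)
        if trial_dist <= best:   # keep swaps that don't make the fit worse
            assignment[i], total, best = new, trial, trial_dist
    proportions = {n: assignment.count(n) / slots for n in names}
    return proportions, best ** 0.5

# Synthetic rows, not real f3 data.
sources = {
    "Source_A": [0.250, 0.210, 0.190, 0.230],
    "Source_B": [0.200, 0.260, 0.220, 0.180],
    "Source_C": [0.230, 0.200, 0.260, 0.240],
}
target = [(a + b) / 2 for a, b in zip(sources["Source_A"], sources["Source_B"])]

weights, distance = fit_mixture(target, sources)
print(weights, f"distance={distance:.6f}")
```

With data this clean the fit should land on, or very near, an even split between Source_A and Source_B and nothing from Source_C. Real ancient DNA data are noisier, so proportions from this kind of test are best read alongside the fit distance.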

Let's investigate some pertinent issues with the new data using nMonte and PAST. How about we start with these?

- where did Bell Beakers get their steppe ancestry from?

- which Steppe_MLBA group did Indians get their steppe ancestry from?

- do the present-day Irish have any Hallstatt ancestry?

- what is the origin of present-day Basques?

- what is the precise ancestry of Armenia_ChL?

- do the Swat Iron Age samples really lack BMAC ancestry?

- does Anatolia_MLBA really lack steppe ancestry?

Note that the f3 matrix includes the ancients from the new Olalde et al. paper on the genomic history of Iberia (see here). I've also updated the Global25 datasheets with most of these samples.

Global 25 datasheet (scaled)

Global 25 pop averages (scaled)

Global 25 datasheet

Global 25 pop averages

By the way, Hajji_Firuz_ChL I2327, from Narasimhan et al. 2018, is now labeled Hajji_Firuz_IA in the above datasheets, because my understanding is that he's actually from the Iron Age rather than the Chalcolithic period. For background reading about this controversial sample see here and here. I don't have any more info on this topic; we'll just have to wait for the formal publication of the Narasimhan et al. manuscript to get all the details. Apparently it's coming very soon.

See also...

An exceptional burial indeed, but not that of an Indo-European

Maykop: a multi-ethnic layer cake?

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, March 14, 2019

Two new papers on ancient Iberia

Olalde et al. 2019 (Science) at this LINK...

Abstract: We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean. DOI: 10.1126/science.aav4040

Villalba-Mouco et al. 2019 (Current Biology) at this LINK...

Summary: The Iberian Peninsula in southwestern Europe represents an important test case for the study of human population movements during prehistoric periods. During the Last Glacial Maximum (LGM), the peninsula formed a periglacial refugium [1] for hunter-gatherers (HGs) and thus served as a potential source for the re-peopling of northern latitudes [2]. The post-LGM genetic signature was previously described as a cline from Western HG (WHG) to Eastern HG (EHG), further shaped by later Holocene expansions from the Near East and the North Pontic steppes [3, 4, 5, 6, 7, 8, 9]. Western and central Europe were dominated by ancestry associated with the ∼14,000-year-old individual from Villabruna, Italy, which had largely replaced earlier genetic ancestry, represented by 19,000–15,000-year-old individuals associated with the Magdalenian culture [2]. However, little is known about the genetic diversity in southern European refugia, the presence of distinct genetic clusters, and correspondence with geography. Here, we report new genome-wide data from 11 HGs and Neolithic individuals that highlight the late survival of Paleolithic ancestry in Iberia, reported previously in Magdalenian-associated individuals. We show that all Iberian HGs, including the oldest, a ∼19,000-year-old individual from El Mirón in Spain, carry dual ancestry from both Villabruna and the Magdalenian-related individuals. Thus, our results suggest an early connection between two potential refugia, resulting in a genetic ancestry that survived in later Iberian HGs. Our new genomic data from Iberian Early and Middle Neolithic individuals show that the dual Iberian HG genomic legacy pertains in the peninsula, suggesting that expanding farmers mixed with local HGs. DOI:

See also...

Migration of the Bell Beakers—but not from Iberia (Olalde et al. 2018)

Single Grave > Bell Beakers

CHG or no CHG in Bronze Age western Iberia?

Thursday, March 7, 2019

A challenge

The datasheets below contain outgroup f3-statistics for a wide range of ancient and present-day populations. Five of the ancient groups and individuals are labeled "Unknown". In fact, I do know what they are, but I'd like you to try and work out whether they were the speakers of Indo-European or non-Indo-European languages by analyzing the datasheets with, say, PAST or nMonte.



I'll reveal the identities and likely languages of the mystery ancients in a couple of days. It'll be interesting to see if any of you nail this challenge. It shouldn't be too difficult, but to help things along, I color-coded the populations in the datasheets (black = Indo-European, blue = Uralic, and grey = neither). If you haven't done this sort of thing before, these blog posts might be useful as background reading.

Maykop: a multi-ethnic layer cake?

Global25 PAST-compatible datasheets

D-stats/nMonte open thread

Update 09/03/2019: Samuel nailed the challenge in the first post below. And then Matt almost figured out the precise identities of the mystery ancients here. In hindsight I should've made this more difficult. Here are the answers:

Unknown1 = England_Anglo-Saxon (Indo-European) > more here
Unknown2 = Levanluhta_IA (non-Indo-European) > more here
Unknown3 = Minoan_Lasithi (non-Indo-European) > more here
Unknown4 = Slavic_Bohemia (Indo-European) > more here
Unknown5 = Turkmenistan_IA (Indo-European) > more here

Monday, March 4, 2019

An exceptional burial indeed, but not that of an Indo-European

Not too many people have been buried sitting on wagons. The most famous case is that of an Early Bronze Age man who, considering his injuries, may have died in a high-speed crash - high-speed for its time anyway - on the Pontic-Caspian steppe in Eastern Europe.

It's likely that this guy was one of the very first wagon-drivers in human history, because his four-wheeled wooden model is dated to 3336-3105 calBCE, which makes it the oldest wagon discovered thus far. His genotype data, under the label Steppe Maykop SA6004, were published recently along with Wang et al. 2019.

Early wagons are very important for a couple of reasons: they revolutionized human transport and warfare, and they're often closely associated with the prehistoric expansions of Indo-European languages.

So I'm pretty sure that many of you must be thinking right now that wagon-driver SA6004 was an early Indo-European, or even a Proto-Indo-European! I bet that's what Wang et al. thought too, considering the conclusion in their paper. But, alas, the chances of this are slim to none.

Steppe Maykop samples show rather peculiar genetic structure considering their geographic origin, with a large proportion of their ancestry deriving from a source closely related to western Siberian hunter-gatherers (aka West_Siberia_N in the ancient DNA record). Indeed, SA6004 basically looks like a 50/50 mix between West_Siberia_N and Piedmont_Eneolithic. Here's a map with all of the relevant details.

Thus, clearly, the Steppe Maykop population wasn't ancestral or even directly related to the steppe and steppe-derived groups generally regarded to have been Indo-European speaking, such as those associated with the Yamnaya, Corded Ware, and Bell Beaker cultures. That's because these groups lack any discernible West_Siberia_N-related ancestry.

It also wasn't ancestral or directly related to any present-day or currently sampled ancient Indo-European speaking populations, again because these populations basically lack West_Siberia_N-related ancestry.

On the other hand, Yamnaya, Corded Ware and other closely related groups show an exceptionally strong genetic relationship with Indo-European speakers, especially those from across Northern Europe, which experienced massive migrations from the Pontic-Caspian steppe during the late Neolithic period, and hardly anything from elsewhere since then.

Case in point, the samples from Wang et al. labeled Yamnaya Caucasus were recovered from the same area of the Pontic-Caspian steppe as their Steppe Maykop samples, and yet, take a look at this linear model based on outgroup f3-statistics. Steppe Maykop does show high genetic affinity to Indo-European speakers (no doubt mediated via its Piedmont_Eneolithic-related ancestry), but, unlike Yamnaya Caucasus, it also shows an affinity to Native Americans and Siberians that is unusually high for a West Eurasian population. The relevant datasheet is available here.

So the only way the Steppe Maykop population could have been Indo-European-speaking is if it inherited its Indo-European speech from its Piedmont_Eneolithic-related ancestors. And even if it was Indo-European-speaking, it probably spoke an extinct Indo-European language not closely related to any extant Indo-European languages. In other words, the possibility that Steppe Maykop passed on its language to Yamnaya, along with its wagons, is close to zero. More likely, Yamnaya stole a few wagons from Steppe Maykop, and the rest is history.

See also...

The Steppe Maykop enigma

On Maykop ancestry in Yamnaya

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, March 2, 2019

Maykop: a multi-ethnic layer cake?

Let's speculate about the linguistic affinities of the currently available ancient populations from the Caucasus and surrounds. I put together a series of outgroup f3-stats to help things along. They're available for download here.

Georgian 0.258224
Abkhasian 0.257899
Latvian 0.257376
Swedish 0.257301
Turkish_Trabzon 0.256996
Basque_Spanish 0.256589
Chechen 0.256514
Icelandic 0.256418
Norwegian 0.256325
Lezgin 0.256272
Irish 0.256227
Tabasaran 0.256092
Italian_Bergamo 0.25605
English_Cornwall 0.256032
Polish_East 0.255991
Scottish 0.255955
Adygei 0.255913

Latvian 0.261845
Russian_North 0.26145
Estonian 0.260355
Finnish 0.260211
Lithuanian 0.260072
Udmurd 0.259804
Ingrian 0.259663
Surui 0.259637
Vepsa 0.259608
Karelian 0.259532
Karitiana 0.259482
Russian_West 0.259397
Russian_Central 0.259274
Wichi 0.259106
Saami 0.258982
Komi 0.258945
Icelandic 0.258854
Swedish 0.258814
Mordovian 0.258604
Irish 0.25859

Eyeballing the stats might be enough to get a general impression about what they mean, but to understand them properly it's necessary to get technical with something like PAST3 (see here). That's because f3-stats pick up shared genetic drift from all drift paths, and don't especially focus on more recently shared ancestry. This can often lead to confusing outcomes.

Below are a few examples of linear models based on my f3-stats. Note that many Indo-European speakers, especially from Northern Europe, are foremost attracted to ancient samples from the Pontic-Caspian steppe. On the other hand, non-Indo-European speakers, from such far flung locations as the Caucasus and Iberia, show relatively stronger affinity to ancient samples from Anatolia and the Caucasus. Moreover, Uralic speakers show elevated affinity to ancient hunter-gatherer samples from Eastern Europe and Siberia. Makes sense, right?
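If you don't have PAST handy, the basic idea behind a linear model like this can be sketched in a few lines of Python. The sketch takes the f3-stats for the same present-day populations against two different ancient groups, fits a straight line, and looks at who sits above or below it. The numbers are the four populations that appear in both of the lists above. The lists aren't labeled in this text, so I refer to them only as the first and second list, and with just four points this is an illustration of the mechanics rather than a result.

```python
# Outgroup f3 values copied from the two lists above, restricted to the
# four populations present in both: (first list, second list).
shared = {
    "Latvian":   (0.257376, 0.261845),
    "Swedish":   (0.257301, 0.258814),
    "Icelandic": (0.256418, 0.258854),
    "Irish":     (0.256227, 0.258590),
}

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(list(shared.values()))

# Positive residual: relatively closer to the second list's ancient group
# than the overall trend predicts; negative residual: the opposite.
residuals = {name: y - (slope * x + intercept)
             for name, (x, y) in shared.items()}

for name, res in sorted(residuals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {res:+.6f}")
```

Of these four, Latvians end up furthest above the line and Swedes furthest below it, with the Irish and Icelanders close to the trend. With the full datasheets the same approach gives a much clearer picture.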
Based on these and other data, I'd say that Maykop and the culturally related Steppe Maykop were something of a multi-ethnic polity, with many closely and distantly related languages spoken by its people, including perhaps Kartvelian, Northwest Caucasian, Yeniseian and Indo-European. But it seems to me that Proto-Indo-European was spoken by steppe foragers turned pastoralists just outside of the Maykop zone. And I'm quite sure that after the Maykop collapse various early Indo-European groups pushed across the Caucasus and deep into the Near East. Just take a look at the f3-stats and linear model for Hajji_Firuz_BA to see what I mean.

See also...

An exceptional burial indeed, but not that of an Indo-European

The Steppe Maykop enigma

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...