Tuesday, March 20, 2018

The Iberomaurusians

I can honestly say that I've suddenly become a more open-minded individual after running the five Iberomaurusian samples from M. van de Loosdrecht et al. 2018 in my Global25 Principal Component Analysis (PCA).

They're certainly a curious bunch. In many pairs of the 25 PCs, they sit alone, in parts of the plots that I never expected to see populated. Interestingly though, modern-day North Africans often "pull" towards them, suggesting moderate to strong genetic continuity in North Africa since the Pleistocene. The PAST datasheet used to produce the plots below is available here.

To analyze this in more detail, I ran a series of nMonte mixture models for seven North African populations using Global25 scaled data. The models show the Iberomaurusians as one of the two best reference options for all of these North African groups except the Egyptians, which, at the very least, is an outcome that fits nicely with geography.

[1] distance%=2.5772 / distance=0.025772


Levant_BA 30.9
Iberomaurusian 24.1
Iberia_EN 17.9
Iberia_BA 14.45
Yoruba 11.85
Ethiopia_4500BP 0.8
Iberia_ChL 0
Iberia_MN 0
Iberia_Southwest_CA 0
Levant_N 0
Natufian 0


[1] distance%=2.7927 / distance=0.027927


Levant_BA 73
Iberia_BA 7.7
Ethiopia_4500BP 7.55
Yoruba 5.3
Iberomaurusian 4.45
Iberia_EN 2
Iberia_ChL 0
Iberia_MN 0
Iberia_Southwest_CA 0
Levant_N 0
Natufian 0


[1] distance%=1.6931 / distance=0.016931


Levant_BA 56.8
Iberomaurusian 11.75
Iberia_BA 10.05
Yoruba 8.55
Natufian 6.55
Ethiopia_4500BP 3.4
Levant_N 2.9
Iberia_ChL 0
Iberia_EN 0
Iberia_MN 0
Iberia_Southwest_CA 0


[1] distance%=1.7158 / distance=0.017158


Levant_BA 35.3
Iberomaurusian 25.85
Yoruba 14.6
Iberia_EN 13.35
Iberia_BA 10.9
Ethiopia_4500BP 0
Iberia_ChL 0
Iberia_MN 0
Iberia_Southwest_CA 0
Levant_N 0
Natufian 0

[1] distance%=2.4367 / distance=0.024367


Iberomaurusian 29.6
Levant_BA 25.9
Iberia_EN 21.7
Iberia_BA 11.55
Yoruba 11.25
Ethiopia_4500BP 0
Iberia_ChL 0
Iberia_MN 0
Iberia_Southwest_CA 0
Levant_N 0
Natufian 0


[1] distance%=2.3656 / distance=0.023656


Iberomaurusian 36.5
Levant_BA 17.15
Levant_N 13.7
Iberia_EN 12.85
Iberia_BA 9.95
Yoruba 9.55
Ethiopia_4500BP 0.3
Iberia_ChL 0
Iberia_MN 0
Iberia_Southwest_CA 0
Natufian 0


[1] distance%=2.0838 / distance=0.020838


Levant_BA 41.85
Iberomaurusian 20.85
Iberia_BA 13.9
Iberia_EN 11.45
Yoruba 9.4
Ethiopia_4500BP 2.55
Iberia_ChL 0
Iberia_MN 0
Iberia_Southwest_CA 0
Levant_N 0
Natufian 0
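
A quick note on reading these outputs: the two numbers in each header are the same thing on different scales, since distance% is simply distance multiplied by 100 (0.025772 becomes 2.5772). As far as I understand the method, the distance is the Euclidean distance between the target's coordinates and the weighted average of the reference coordinates. Here's a minimal Python sketch of that calculation. The function name and the three-dimensional toy coordinates are made up for illustration; they aren't real Global25 values, and this isn't nMonte's own code.

```python
import math


def mixture_distance(target, references, weights):
    """Euclidean distance between the target and the weighted mix of references."""
    total = sum(weights)
    mix = [
        sum(w * ref[i] for w, ref in zip(weights, references)) / total
        for i in range(len(target))
    ]
    return math.dist(target, mix)


# Hypothetical toy coordinates (three dimensions instead of 25).
target = [0.10, 0.02, -0.03]
references = [
    [0.12, 0.00, -0.05],  # stand-in for reference population A
    [0.06, 0.06, 0.01],   # stand-in for reference population B
]
weights = [70, 30]  # ancestry proportions in percent

distance = mixture_distance(target, references, weights)
distance_pct = distance * 100  # same number, expressed as a percentage

# Same header layout as the outputs above.
print(f"distance%={distance_pct:.4f} / distance={distance:.6f}")
```

The lower the distance, the closer the weighted mix sits to the target in the PCA space, which is why the fits above (all under 3%) look so much better than anything in the 25% range.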

Using the same methods, I also basically reproduced the ancestry proportions from the main mixture model for the Iberomaurusians in M. van de Loosdrecht et al. (~60/40% Natufian-like/Sub-Saharan African-related). But clearly, the very poor statistical fits suggest that, much like for the model in the paper, something is way off.

[1] distance%=25.4991 / distance=0.254991


Natufian 55.85
Tanzania_Luxmanda_3000BP 21.5
Ethiopia_4500BP 21
Tianyuan 1.65
ElMiron 0
GoyetQ116-1 0
Levant_N 0
Malawi_Hora_Holocene 0
South_Africa_2000BP 0
Ust_Ishim 0
Vestonice16 0


[1] distance%=24.6253 / distance=0.246253


Natufian 65.45
Dinka 22.9
Yoruba 9.45
Tianyuan 2.2
ElMiron 0
Ethiopia_4500BP 0
GoyetQ116-1 0
Levant_N 0
Malawi_Hora_Holocene 0
South_Africa_2000BP 0
Tanzania_Luxmanda_3000BP 0
Ust_Ishim 0
Vestonice16 0

The updated Global25 datasheets are available at the links below. Here's a challenge for the people in the comments: try to come up with a coherent, chronologically sound, mixture model for the Iberomaurusians that shows a distance of less than 15%. I don't think that this is doable just yet, and won't be until we have at least a few more ancient forager samples from Africa and the Near East, but let's see what happens anyway.

Global 25 datasheet

Global 25 datasheet (scaled)

Global 25 pop averages

Global 25 pop averages (scaled)
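
For anyone taking up the challenge without R at hand, below is a rough Python sketch of a Monte Carlo mixture fit in the spirit of nMonte. It is not nMonte's actual algorithm or code. It fills a fixed-size batch with reference labels, swaps one slot at a time, and keeps a swap only if the batch average moves no further from the target. The function names, the batch size and cycle count, and the toy three-dimensional coordinates are all assumptions made for illustration.

```python
import math
import random


def batch_average(batch, references):
    """Average the coordinates of every reference label in the batch."""
    dims = len(next(iter(references.values())))
    return [
        sum(references[name][i] for name in batch) / len(batch)
        for i in range(dims)
    ]


def fit_mixture(target, references, batch_size=100, cycles=5000, seed=1):
    """Hill-climbing Monte Carlo search for mixture proportions."""
    rng = random.Random(seed)
    names = list(references)
    batch = [rng.choice(names) for _ in range(batch_size)]
    best = math.dist(target, batch_average(batch, references))
    for _ in range(cycles):
        slot = rng.randrange(batch_size)
        previous = batch[slot]
        batch[slot] = rng.choice(names)  # propose a single swap
        trial = math.dist(target, batch_average(batch, references))
        if trial <= best:
            best = trial            # keep the swap
        else:
            batch[slot] = previous  # undo the swap
    shares = {name: 100 * batch.count(name) / batch_size for name in names}
    return shares, best


# Hypothetical toy data: the target is built as exactly 60% A + 40% B.
references = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.0, 1.0, 0.0],
    "C": [0.0, 0.0, 1.0],
}
target = [0.6, 0.4, 0.0]

shares, best = fit_mixture(target, references)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(name, share)
print(f"distance%={best * 100:.4f}")
```

A search like this only minimizes the distance. Whether the resulting model is coherent and chronologically sound still depends entirely on which reference rows are fed into it.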


M. van de Loosdrecht et al., Pleistocene North African genomes link Near Eastern and sub-Saharan African human populations, Science 10.1126/science.aar8380 (2018)

Sunday, March 18, 2018

Max Planck scientists: on a mission against geography

I was just reading the new Marieke van de Loosdrecht et al. 2018 paper [LINK] about the Pleistocene North African hunter-gatherers, and really enjoying it, until I saw this strange map. Please note that I edited the image for the purpose of review and to highlight an error (red pointer and arrow).

This is either a stupid oversight, or the authors of the paper, mainly from the Max Planck Institute for the Science of Human History, and also the scientists who peer reviewed it, don't know where the steppe is located in Eastern Europe. It's certainly not located anywhere near Karelia, Northern Russia, as the map suggests.

Now, you might say that I'm being nitpicky. Well, I'm not, because I can see an alarming trend emerging. Here's a quote from Aida Andrades Valtueña et al. 2017 [LINK], another paper authored mainly by scientists from the Max Planck Institute for the Science of Human History.

The Baltic Late Neolithic Y. pestis genomes (Gyvakarai1 and KunilaII) were reconstructed from individuals associated with the Corded Ware complex. Along with the Croatian Y. pestis genome (Vucedol complex) these are derived from a common ancestor shared with the Yamnaya-derived RK1001 and Afanasievo-derived RISE509. This supports the notion of the pathogen spreading in the context of the large-scale expansion of steppe peoples from Central Eurasia to Eastern and Central Europe.

Thus, what the authors are claiming is that the Pontic-Caspian steppe, which is where the Yamnaya culture was located, is in Central Eurasia rather than West Eurasia.

Obviously, Eurasia is a landmass made up of two continents: Europe and Asia. Try putting your finger in the middle of a map of Europe and Asia and see whether it lands anywhere near the Pontic-Caspian steppe. It won't, unless you've got the shakes or something, because Central Eurasia is more or less located around the Altai Mountains, between the Kazakh and Mongolian-Manchurian steppes, several thousand miles east of the Pontic-Caspian steppe.

Just another oversight, you might say? I doubt it, because here's a very similar case from Alissa Mittnik et al. 2018 [LINK], yet another paper authored mainly by scientists from the Max Planck Institute for the Science of Human History.

Studies of ancient genomes have shown that those associated with the CWC were closely related to the pastoralists of the Yamnaya Culture from the Pontic-Caspian steppe, introducing a genetic component that was not present in Europe previously [2, 3].

Nope, sorry, that doesn't make any sense whatsoever. Why? Because the Pontic-Caspian steppe is west of the Ural Mountains, therefore it's in Europe. You see, according to current geographic conventions, Eurasia west of the Urals and north of the Caucasus is Europe. Right or wrong, as things stand, that's just how it is. And if you happen to be a Max Planck scientist and adamant that I'm wrong, then Google it. I dare you to.

If anyone's still confused, then here's a simple guide, in point form, with a very basic, hopefully easy to grasp map:

- the Eurasian steppe is not a continent nor a country, but a geographical and topographical feature, and, indeed, it's called the Eurasian steppe because it's located on two continents known separately as Europe and Asia, and together as Eurasia

- the western part of the Eurasian steppe is called the Pontic-Caspian steppe, and it's firmly located in Eastern Europe

- the central part of the Eurasian steppe is called the Kazakh steppe, and it's located in Western and Central Asia, while the eastern part of the Eurasian steppe is called the Mongolian-Manchurian steppe, and it's located in East Central Asia

- the Yamnaya culture or horizon was entirely located within the Pontic-Caspian steppe, and therefore in Europe, and more precisely, in Eastern Europe.

Tuesday, March 13, 2018

First real foray into Migration Period Europe: the Gepid, Roman, Ostrogoth and others...

This is going to be our first meaningful look at the all-important Migration Period, thanks to the recently published Veeramah et al. 2018 paper and accompanying dataset (see here). The Migration Period is generally regarded to have been the time when present-day Europe first began to take shape, in a rather sudden and violent way, with, you guessed it, a lot of migrations taking place.

Here's where most of the ancients from Veeramah et al. 2018 cluster in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation. Those East Germanics (the Gepid and Ostrogoth) are certainly very eastern, and indeed more exotic than I would've ever expected them to be. But I do love surprises like this. The relevant datasheet is available here.

Obviously, as per the paper, the ACD in about half of the labels stands for Artificial Cranial Deformation. I've also updated my Global25 datasheets with many of the same ancients. You can use these datasheets to plot them on 2D or 3D "genetic maps", and model their ancestry proportions. Feel free to share your findings in the comments below.

Global 25 datasheet

Global 25 datasheet (scaled)

Global 25 pop averages

Global 25 pop averages (scaled)
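
If you'd like to build your own population averages from the individual coordinates, here's a minimal Python sketch. It assumes each row has the layout "Population:SampleID" followed by comma-separated coordinates, with no header row, and that a population average is just the mean of its members. Both points are my assumptions rather than a description of how the files above were made, and the sample rows are invented (three columns instead of 25).

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the assumed layout "Population:SampleID,PC1,PC2,..."
SAMPLE = """\
Pop_A:ind1,0.10,0.02,-0.03
Pop_A:ind2,0.12,0.04,-0.01
Pop_B:ind3,-0.05,0.08,0.00
"""


def population_averages(handle):
    """Average the coordinates of all individuals sharing a population label."""
    sums = {}
    counts = defaultdict(int)
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        population = row[0].split(":", 1)[0]
        coords = [float(value) for value in row[1:]]
        if population not in sums:
            sums[population] = [0.0] * len(coords)
        sums[population] = [s + c for s, c in zip(sums[population], coords)]
        counts[population] += 1
    return {
        pop: [s / counts[pop] for s in total]
        for pop, total in sums.items()
    }


averages = population_averages(io.StringIO(SAMPLE))
for pop, coords in averages.items():
    print(pop, [round(c, 4) for c in coords])
```

To run it on a downloaded datasheet, pass an open file handle instead of the `io.StringIO` wrapper. If the file has a header row, skip it first, otherwise the `float` conversion will fail.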

Here are a few of my own models for some of the more interesting of these individuals, using nMonte3 and based mainly on Iron Age (IA) reference samples. I used the same data file for all of the models; it includes scaled coordinates and is available for download here.

[1] distance%=3.7819



[1] distance%=3.6339




[1] distance%=2.5535




[1] distance%=2.9444



The Gepid and Ostrogoth show significant Scythian- and Armenian-related ancestry proportions, respectively. Should that be taken literally? Or do we have to wait for, say, Avar and Hunnic genomes to expect more realistic models?

Update 15/03/2018: This is where many of the Medieval German samples cluster in my PCA of modern-day Northern European genetic variation (see here). Obviously, I could only run the individuals with wholly or overwhelmingly North European genomes, and most of these turned out to be the males without any signs of ACD. They look very West Germanic. The relevant datasheet is available here.

Monday, March 12, 2018

Exotic female migrants in Early Medieval Bavaria (Veeramah et al. 2018)

PNAS has a new open access paper on the genomics of Early Medieval Bavarians, with a special focus on women with artificial skull deformation [LINK]. The data also include two very interesting Medieval samples from Crimea and Serbia, associated with the East Germanic Ostrogoths and Gepids, respectively. Both show significant Asian admixture. I'll try to get my hands on the dataset ASAP. Here's the abstract and a couple of quotes from the paper. Emphasis is mine:

Modern European genetic structure demonstrates strong correlations with geography, while genetic analysis of prehistoric humans has indicated at least two major waves of immigration from outside the continent during periods of cultural change. However, population-level genome data that could shed light on the demographic processes occurring during the intervening periods have been absent. Therefore, we generated genomic data from 41 individuals dating mostly to the late 5th/early 6th century AD from present-day Bavaria in southern Germany, including 11 whole genomes (mean depth 5.56×). In addition we developed a capture array to sequence neutral regions spanning a total of 5 Mb and 486 functional polymorphic sites to high depth (mean 72×) in all individuals. Our data indicate that while men generally had ancestry that closely resembles modern northern and central Europeans, women exhibit a very high genetic heterogeneity; this includes signals of genetic ancestry ranging from western Europe to East Asia. Particularly striking are women with artificial skull deformations; the analysis of their collective genetic ancestry suggests an origin in southeastern Europe. In addition, functional variants indicate that they also differed in visible characteristics. This example of female-biased migration indicates that complex demographic processes during the Early Medieval period may have contributed in an unexpected way to shape the modern European genetic landscape. Examination of the panel of functional loci also revealed that many alleles associated with recent positive selection were already at modern-like frequencies in European populations ∼1,500 years ago.


A much more diverse ancestry was observed among the females with elongated skulls, as demonstrated by a significantly greater group-based FIS (SI Appendix, Fig. S35). All these females had varying amounts of genetic ancestry found today predominantly in southern European countries [as seen by the varying amounts of ancestry inferred by model-based clustering that is representative of a sample from modern Tuscany, Italy (TSI), Fig. 3], and while the majority of samples were found to be closest to modern southeastern Europeans (Bulgaria and Romania, Fig. 4C), at least one individual, AED_1108, appeared to possess ∼20% East Asian ancestry (Fig. 3), which was also evident from the high number of haplotypes within the 5-Mb neutralome that were private to modern East Asian 1000 Genomes individuals (EAS), while also demonstrating an overall ancestry profile consistent with Central Asian populations (SI Appendix, Fig. S33). No modern European individual from the Simons Genome Diversity Panel (SGDP) (11) showed any evidence of significant East Asian ancestry except one Hungarian individual with less than 5%. A higher amount of East Asian ancestry was inferred for AED_1108 than all modern Caucasus and Middle Eastern individuals, and 28 of 33 South Asian individuals.


A diverse ancestry was also inferred for the two non-Bavarian samples with elongated heads. KER_1 from Ukraine possessed significant southern European ancestry as well as South Asian ancestry, with an overall profile that best matched modern Turkish individuals. The Gepid VIM_2 from Serbia demonstrated a similar Central Asian-like genetic profile to the Medieval Bavarian AED_1108 with an even larger East Asian component and number of private haplotypes but with less southern European/Middle Eastern ancestry (SI Appendix, Figs. S31 and S33).

Veeramah et al., Population genomic analysis of elongated skulls reveals extensive female-biased immigration in Early Medieval Bavaria, PNAS 2018; published ahead of print March 12, 2018,

Saturday, March 10, 2018

Was Ukraine_Eneolithic I6561 a Proto-Indo-European?

It's certainly a valid question, simply because the remains of this individual (sampled by Mathieson et al. 2018, see here) are from a cemetery of the Sredny Stog culture, which, based on historical linguistics and archaeological data, has already been posited to have been a Proto-Indo-European (PIE) culture, that gave rise to the supposedly Late Proto-Indo-European (LPIE) Yamnaya culture, that swept into Central Europe from the Pontic-Caspian steppe during the 3rd millennium BC. Moreover, consider the following points:

- whatever you might say about calling Y-Chromosome haplogroups "Proto-Indo-European", the fact is that Ukraine_Eneolithic I6561 is the oldest recorded individual belonging to Y-haplogroup R1a-M417, which is not a marker that can be reasonably linked to human expansions dating to the Paleolithic or even Neolithic, and yet today it peaks in frequency in modern-day Indo-European-speaking Eastern and Northern Europeans and South Asians, and is also recorded as the main Y-haplogroup amongst the ancient Scythians, who also were, in all likelihood, Indo-European-speakers, which strongly suggests that it was initially spread far and wide across Eurasia by the early Indo-Europeans

- following on from the last point, R1a-M417 can be divided into three main subclades: R1a-L664, R1a-Z93 and R1a-Z282, the first of which is almost exclusively confined to Northwestern Europe, while the latter two peak in frequency in South Central Asia and Eastern Europe, respectively, and the really interesting and important thing is that R1a-Z93 and R1a-Z282 are more closely related to each other than either is to R1a-L664, which mirrors the relatively close linguistic relationship between Balto-Slavs, who are rich in R1a-Z282, and Indo-Aryans, who are rich in R1a-Z93 (for instance, see here), and renders any arguments in this case based on isolation-by-distance practically useless

- Ukraine_Eneolithic I6561 is the oldest sample with UDG-treated genome-wide data to carry the 13910*T lactase persistence allele, which reaches its maximum frequency in Northwestern Europe, and is also relatively common amongst Indo-European-speaking South Asians, but not Middle Easterners (see here), suggesting that it spread from the Eastern European steppes both into Northwestern Europe and South Asia along with such ancient steppe markers as R1b-M269 and R1a-M417, and Indo-European speech

- based on historical linguistics data, the Proto-Indo-Europeans are generally regarded to have been foragers turned pastoralists, rather than farmers, but nevertheless, pastoralists familiar with farming, and indeed Ukraine_Eneolithic I6561 appears to be mostly a mixture of Eastern European and Caucasus Hunter-Gatherers (EHG and CHG, respectively), but with around 30% input from early European farmers.

Of course, we'll need many more ancient samples from Ukraine and surrounds to cement these findings, and prove, beyond any reasonable doubt, that the Sredny Stog people were indeed the Proto-Indo-Europeans, and that the Yamnaya people were the Late Proto-Indo-Europeans. It might also be necessary to develop new scientific methods that take into account multidisciplinary data to achieve this.

On a related note, the University of Leiden is currently seeking four historical linguists and one bioarchaeologist to take part in a new project titled The Linguistic Roots of Europe's Agricultural Transition. The principal investigator on the project is Guus Kroonen, whom I mentioned in a couple recent blog posts (see here, here and here). This is the project objective:

Today, Europe’s linguistic landscape is shaped almost entirely by a single language family: Indo-European. Even by the dawn of history, a patchwork of Indo-European subgroups, Germanic, Celtic, Italic, Baltic, Slavic and Greek, was covering the continent, and over the centuries, these subgroups evolved into the modern European languages, among which Russian, Italian, German, Lithuanian and Swedish, as well as the global lingua francas French, Spanish, and English.

The Indo-Europeanization of Europe was probably one of the most profound linguistic shifts ever to have taken place in the prehistory of Europe. The origin of the European languages, unsurprisingly, is therefore a matter of intense academic debate. There are currently only two prehistoric events that in the present academic debate are considered as likely driving factors behind the spread of Indo-European speech.

One the one hand, there are those historical linguists who by meticulous comparison of the different Indo-European languages have reconstructed a language and culture that is typical of the early Bronze Age. Terminology for horse-riding and wagon technology provides a possible link to the expansion of the Yamnaya culture on the Pontic-Caspian steppes, which was fueled by the invention of the wheel and the domestication of the horse. Others have suggested that the Indo-European languages diffused from Anatolia together with another major prehistoric event, the spread of agriculture to Europe between the 8th and 5th millennium.

The debate has remained unresolved for over two decades, but a new approach produces potentially decisive results. By studying prehistoric loanwords absorbed by the speakers of Indo-European when they entered Europe, and test the resulting cultural implications against the available archaeological record, new light can be shed on the language of Europe’s first farmers, and whether or not they spoke a form of Indo-European.

If you have the necessary passion and qualifications to apply for these positions, then please do so ASAP via these links:

PhD Candidate or Postdoctoral Researcher in the field of linguistics

Postdoctoral Researcher in the field of archaeology (specialization: bioarchaeology)

Friday, March 9, 2018

Ancient genomes from Southeast Asia (McColl et al. 2018 preprint)

Over at bioRxiv at this LINK. I'm still reading and trying to figure out what the 25 ancient genomes from this preprint say about the peopling of Eurasia and, in particular, South Asian population structure, including the so called Ancestral South Indian (ASI) genetic component. Any ideas? Below are the abstract and Figure 4 from the preprint.

Two distinct population models have been put forward to explain present-day human diversity in Southeast Asia. The first model proposes long-term continuity (Regional Continuity model) while the other suggests two waves of dispersal (Two Layer model). Here, we use whole-genome capture in combination with shotgun sequencing to generate 25 ancient human genome sequences from mainland and island Southeast Asia, and directly test the two competing hypotheses. We find that early genomes from Hoabinhian hunter-gatherer contexts in Laos and Malaysia have genetic affinities with the Onge hunter-gatherers from the Andaman Islands, while Southeast Asian Neolithic farmers have a distinct East Asian genomic ancestry related to present-day Austroasiatic-speaking populations. We also identify two further migratory events, consistent with the expansion of speakers of Austronesian languages into Island Southeast Asia ca. 4 kya, and the expansion by East Asians into northern Vietnam ca. 2 kya. These findings support the Two Layer model for the early peopling of Southeast Asia and highlight the complexities of dispersal patterns from East Asia.

McColl et al., Ancient Genomics Reveals Four Prehistoric Migration Waves into Southeast Asia, bioRxiv, Posted March 8, 2018, doi:

Update 10/03/2018: Harvard and friends strike back with their own preprint on the same topic (LINK). Here's the abstract:

Southeast Asia is home to rich human genetic and linguistic diversity, but the details of past population movements in the region are not well known. Here, we report genome-wide ancient DNA data from thirteen Southeast Asian individuals spanning from the Neolithic period through the Iron Age (4100-1700 years ago). Early agriculturalists from Man Bac in Vietnam possessed a mixture of East Asian (southern Chinese farmer) and deeply diverged eastern Eurasian (hunter-gatherer) ancestry characteristic of Austroasiatic speakers, with similar ancestry as far south as Indonesia providing evidence for an expansive initial spread of Austroasiatic languages. In a striking parallel with Europe, later sites from across the region show closer connections to present-day majority groups, reflecting a second major influx of migrants by the time of the Bronze Age.

Lipson et al., Ancient genomes document multiple waves of migration in Southeast Asian prehistory, bioRxiv, Posted March 10, 2018, doi:

Thursday, March 8, 2018

Beakers vs modern-day Northern Europeans

Here are most of the Beakers from Olalde et al. 2018 in my Principal Component Analysis (PCA) of modern-day Northern European genetic variation. They look rather Celtic or perhaps Celto-Germanic, don't they? The relevant datasheet is available here.

If you're wondering why the Yamnaya and early Baltic Corded Ware individuals are sitting in the middle of the plot, I'd say it's because they don't share enough genetic drift with any specific sub-set of modern-day Northern Europeans to cluster with them. This might be also why the Ukraine Neolithic samples are so dispersed around the middle of the plot. In other words, they're possibly too old to feature in this PCA, unlike the Beakers and Bronze Age descendants of the Baltic Corded Ware people, who are clustering fairly deliberately with their likely closest modern-day relatives.

Tuesday, March 6, 2018

Main candidates for the precursors of the proto-Greeks in the ancient DNA record to date

Thanks to the recent release of the Mathieson et al. 2018 dataset (see here), I've been able to spot a very interesting northwest to southeast genetic cline running from the oldest Peloponnese Neolithic (Peloponnese_N) individuals to the Bronze Age Anatolians (Anatolia_BA). Here it is, highlighted in my Principal Component Analysis (PCA) of ancient West Eurasian variation. The relevant datasheet is available here.

I don't think it's a stretch to assume that this cline represents, more or less, the genetic diversity that existed in the Aegean region during the early Helladic period, just prior to the incursions of Bronze Age steppe or steppe-derived peoples who, according to the current academic consensus, probably gave rise to the proto-Greeks and Mycenaeans (see here).

There are three main reasons for this: 1) the Peloponnese_N samples show a very deliberate "pull" towards Anatolia_BA, suggesting that the Peloponnese population experienced admixture from a source similar to Anatolia_BA prior to the Bronze Age, 2) the cline cuts right through the middle of an "Old European" cluster made up of Minoans, who lived on Crete and other Aegean islands on the eve of the aforementioned steppe-derived incursions, and 3) both the Mycenaeans and Minoans can be modeled in large part as Anatolia_BA and Peloponnese_N.

The identification of this genetic cline, and what it likely stands for, is important, because it should allow us to plausibly point to the source of foreign input that created the Mycenaeans, and thus the Proto-Greeks. And clearly, the trajectory of the Mycenaean "pull" away from this cline is towards most of the samples marked as "Eneolithic and Bronze Age steppe".

However, this doesn't mean that it's necessary, or even sensible, to look for the precursors of the Proto-Greeks amongst these samples. That's because there might be much more proximate options based on, say, geography, archeology, chronology and mixture modeling. Indeed, using various criteria, I've chosen three individuals who sit along the Mycenaean to Eneolithic/Bronze Age steppe cline in the above PCA and might plausibly represent the precursors of the Proto-Greeks, or close relatives thereof. The first two are from Mathieson et al. 2018 and the third from Olalde et al. 2018.

- if, as most academics posit, the people who were to become the Proto-Greeks came from the Early Bronze Age (EBA) Yamnaya horizon on the Pontic-Caspian steppe, then it's possible that they were similar in terms of genome-wide genetic structure to the only Bulgarian Yamnaya sampled to date: Yamnaya_Bulgaria Bul4

- on the other hand, if, as has also been postulated in academic literature, they derived from the Middle Bronze Age (MBA) chariot warrior groups of the post-Yamnaya Pontic-Caspian steppe, then they may have been similar to Balkans_BA I2163, who is also from Bulgaria, but dated to more than a thousand years later than Bul4, and clusters strongly with the said chariot warriors, such as the Sintashta people, and even belongs to the same Y-haplogroup: R1a-Z93

- but if they came from the Yamnaya horizon via the Carpathian Basin, which, I'm told in the comments here, is also a serious option, although admittedly I've missed it in my reading, then they may have been similar to Proto-Nagyrév individual Hungary_BA I7043, who belongs to Western European-specific Y-haplogroup R1b-L51, a marker fairly common amongst modern-day Greeks.

And here's a mixture model for the Mycenaeans, using the Global25/nMonte method (see here and here), and the above trio as potential reference samples, alongside Anatolia_BA and Peloponnese_N.

[1] distance%=1.9802



Thus, it seems that the precursors of the Proto-Greeks came from Bulgarian Yamnaya. However, they, or the Mycenaeans, may also have had minor ancestry from the chariot warriors of the MBA Pontic-Caspian steppe. Yes, I'm probably reading far too much into these results, but I can't help it, because they appear so logical. Indeed, check this out:

[1] distance%=4.209

Mycenaean:I9033 (elite burial)


If this is just an artifact of the method, then it's a really nice one. But who are your main candidates for the precursors of the Proto-Greeks in the ancient DNA record to date? Feel free to let me know in the comments.

Sunday, March 4, 2018

On the origin of steppe ancestry in Beaker people (work in progress)

One of the major themes in the recent Bell Beaker Behemoth (i.e. Olalde et al. 2018) is the presence of Yamnaya- or steppe-related ancestry in most of the Beaker individuals. Up to a whopping 75% in one guy from what is now Hungary. However, as far as I can see, the authors don't go into any specifics about the origin of this admixture. This is about as close as they come. Emphasis is mine:

However, migration had a key role in the further dissemination of the Beaker complex. We document this phenomenon most clearly in Britain, where the spread of the Beaker complex introduced high levels of steppe-related ancestry and was associated with the replacement of approximately 90% of Britain’s gene pool within a few hundred years, continuing the east-to-west expansion that had brought steppe-related ancestry into central and northern Europe over the previous centuries.

During the third millennium bc, two new archaeological pottery styles expanded across Europe and replaced many of the more localized styles that had preceded them [1]. The expansion of the ‘Corded Ware complex’ in north-central and northeastern Europe was associated with people who derived most of their ancestry from populations related to Early Bronze Age Yamnaya pastoralists from the Eurasian steppe [2–4] (henceforth referred to as ‘steppe’).

To be honest, I'm not quite sure what they're saying there. Is it that the steppe ancestry in the Beakers comes from Corded Ware people, one way or another, or that it derives from a later, closely related but separate, population wave from the steppe? Or are they leaving the question wide open for now?

If they are leaving it open, then I'm not surprised. That's because the only way to solve this mystery is to genotype at least a few hundred Eneolithic and Bronze Age skeletons from the Pontic-Caspian steppe in order to pinpoint the shared steppe homeland, or separate steppe homelands, of the Corded Ware and Beaker peoples. No doubt this will happen eventually, but it might take a few years for us to see the results. In the meantime, we can mess around with the data already available to see what it might reveal in regards to this topic.

Of course, I'm well aware that the Y-haplogroup most closely associated with the Corded Ware expansion is R1a, and in particular its R1a-M417 subclade, and that Beaker males with steppe ancestry almost exclusively belong to Y-haplogroup R1b, especially its R1b-P312 subclade. But this means very little for now, because considering the patchy sampling of ancient remains from Eneolithic/Bronze Age Europe, it's still possible that, for instance, these Beakers descend from an as yet unsampled subset of the Corded Ware population rich in R1b.

So for now, as we wait for more ancient data, the pertinent question is: are there any genome-wide genetic signals specific to Corded Ware people that are missing in the Beaker people, and vice versa?

One possible way to catch something like this might be to focus on differences in hunter-gatherer (HG) ancestry. That's because European hunter-gatherers are known to have had low effective populations and, as a result, a lot of population-specific genetic drift. I can try to test this idea using the Global25/nMonte method (see here and here) and the following plausible, at least according to me, reference groups and individuals.

Barcin_N (Neolithic farmers from western Anatolia)
Blatterhole_HG (HG-like Middle Neolithic sample from Germany)
Koros_HG (HG-like Early Neolithic sample from Hungary)
Narva_Lithuania (late HGs from the southern Baltic)
Ukraine_Mesolithic (HGs from the North Pontic steppe)
Yamnaya_Samara (Bronze Age herders from the eastern end of the Pontic-Caspian steppe)
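
For readers who want a feel for what this kind of mixture model does, here's a minimal sketch in Python. It is not the actual nMonte script. It's a simplified stand-in that assumes each population is a single vector of PCA coordinates, and that the fit is found by randomly swapping reference "slots" and keeping any swap that lowers the Euclidean distance to the target. The function names, the slot count, the iteration count and the synthetic Ref_A/Ref_B/Ref_C coordinates are all invented for illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_mixture(target, refs, slots=200, iters=20000, seed=0):
    """Estimate mixture percentages by greedy random slot swaps.

    target: list of PCA coordinates for the population being modeled.
    refs:   dict mapping reference name -> list of coordinates (same length).
    Returns (percentages per reference, final distance).
    """
    rng = random.Random(seed)
    names = list(refs)
    dims = len(target)

    # Each slot holds one reference; percentages are slot counts / slots.
    assign = [rng.randrange(len(names)) for _ in range(slots)]
    total = [sum(refs[names[k]][d] for k in assign) for d in range(dims)]
    best = distance([t / slots for t in total], target)

    for _ in range(iters):
        slot = rng.randrange(slots)
        new = rng.randrange(len(names))
        old = assign[slot]
        if new == old:
            continue
        old_ref, new_ref = refs[names[old]], refs[names[new]]
        trial = [t - o + n for t, o, n in zip(total, old_ref, new_ref)]
        dist = distance([t / slots for t in trial], target)
        if dist < best:  # keep only swaps that reduce the distance
            assign[slot], total, best = new, trial, dist

    props = {n: 100.0 * assign.count(i) / slots for i, n in enumerate(names)}
    return props, best


# Synthetic 25-dimensional references (made up, not real Global25 data).
gen = random.Random(1)
refs = {name: [gen.gauss(0, 1) for _ in range(25)]
        for name in ("Ref_A", "Ref_B", "Ref_C")}
# Target built as a known 70/30 mix of Ref_A and Ref_B.
target = [0.7 * a + 0.3 * b for a, b in zip(refs["Ref_A"], refs["Ref_B"])]

props, dist = fit_mixture(target, refs)
print(props, f"distance%={100 * dist:.4f}")
```

On this synthetic target the sketch should land at or very near 70% Ref_A and 30% Ref_B, with Ref_C close to zero. A reference that comes out at 0 simply means that adding it never brought the mix closer to the target, and that's how the zero rows in the outputs here should be read.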

First up, the Corded Ware Culture (CWC) people, grouped into five sub-populations, based on geography and chronology:

[1] distance%=2.7491



[1] distance%=2.815



[1] distance%=1.9983



[1] distance%=2.9738



[1] distance%=3.2783



I'm pretty happy with these results. They make a lot of sense considering everything that we've seen about these samples to date. For instance, CWC_Baltic_early looks like it might have arrived in the Baltic region straight from the North Pontic steppe, which agrees with scientific literature and my earlier analyses (for instance, see here). Note also the exceptionally high Baltic HG signal in CWC_Baltic, which is missing in CWC_Baltic_early, no doubt caused by increasing gene flow from the indigenous Baltic population into the Corded Ware people. Now the Beakers:

[1] distance%=3.0892



[1] distance%=2.3366



[1] distance%=3.0011



Again, these clearly are very solid outcomes. But what do they tell us about the relationship between these Beakers and the Corded Ware people? To be honest, I'm not sure. The Narva_Lithuania signal is missing, which might be important, but then again, it's also missing in CWC_Czech. And now onto the Hungarian Beakers, grouped into three categories:

[1] distance%=1.9191



[1] distance%=4.9659



[1] distance%=2.4992



Check out the imposing level of Narva_Lithuania ancestry in Beaker_Hungary. Admittedly, I wasn't expecting this. Is there a chance that it's real? I honestly don't know, but we've certainly seen similar signals from Northeastern Europe in later Bronze Age samples from Hungary. On the other hand, Beaker_Hungary_outlier is the guy estimated by Olalde et al. to be as much as 75% steppe-derived. Here he gets a very similar figure of 76% of Yamnaya-like ancestry. Very nice! Finally, here are the Southern European Beakers:

[1] distance%=3.818



[1] distance%=5.4342



[1] distance%=2.992



[1] distance%=4.8488



[1] distance%=4.7903



[1] distance%=3.7945



It might be worth noting the absence of Narva_Lithuania ancestry, and the near-absence of Ukraine_Mesolithic ancestry, in these models. If this is not an artifact of the method, and please note that it very well might be, then it perhaps suggests that the steppe ancestors of the Beakers were basically like Samara Yamnaya, and that the northern and eastern Beakers picked up their Narva_Lithuania and/or Ukraine_Mesolithic-related ancestry by mixing with the descendants of the Corded Ware people.

Or not? At the very least, am I on the right track? How can I improve this analysis? Feel free to let me know in the comments.

Also, I should mention that I had to add a sample from Chalcolithic Anatolia (Anatolia_ChL) to the model for Beaker_Sicily_no_steppe to obtain more plausible ancestry proportions and a better statistical fit. It's intriguing that this type of ancestry is present in this southern Beaker, and missing in all the rest, but we've discussed this issue at length already in an earlier thread (see here).

On a related note, Dutch linguist Guus Kroonen has a new article with his interpretations of the main findings by Olalde et al., freely available on his page at the link below.

Comments to Olalde et al. 2018 on the Bell Beaker phenomenon

It's interesting, I think, that he sees two distinct, and indeed "potentially competing", Indo-European migrations from the steppe, represented by the R1a-rich Corded Ware people and the R1b-P312-rich Beakers.

"The identification of two different Y-chromosomal haplogroups deriving from the Steppe/Caucasus area is relevant for the prehistoric formation of the European linguistic landscape. What it implies is that Europe may have been confronted with originally separated networks of different, potentially competing, steppe-derived groups. It is through these cultural networks that Indo-European dialects may have diffused, probably existing alongside now extinct, non-Indo-European languages (cf. Iversen & Kroonen 2017)."

Thursday, March 1, 2018

Awesome substructure within Czech Corded Ware

This is where the three Czech Corded Ware samples from Olalde et al. 2018 cluster in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation.

The two individuals belonging to Y-haplogroup R1a look like they might be straight from the Pontic-Caspian (PC) steppe. That's because they're sitting right next to an Eneolithic sample from the North Pontic part of the PC steppe, in what is now Ukraine, Eastern Europe. This guy, from Mathieson et al. 2018, also belongs to R1a. And if they're not totally of steppe origin, then clearly they both only have minor ancestry from outside of the steppe.

On the other hand, the third Czech Corded Ware individual, who belongs to the "Old European" Y-haplogroup I2a2a, actually shows no signs of steppe ancestry, because he clusters with Middle Neolithic Central Europeans. Indeed, I can test all of this with the Global25/nMonte method (see here and here), using the Eneolithic North Pontian and Samara Yamnaya as steppe references.

[1] distance%=3.8801 / distance=0.038801


Barcin_N 82.6
WHG 17.4
Ukraine_Eneolithic:I6561 0
Yamnaya_Samara 0

[1] distance%=2.4713 / distance=0.024713


Ukraine_Eneolithic:I6561 63.3
Yamnaya_Samara 24.65
WHG 7.35
Barcin_N 4.7

[1] distance%=2.4089 / distance=0.024089


Yamnaya_Samara 61.25
Barcin_N 17.7
Ukraine_Eneolithic:I6561 12.9
WHG 8.15
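
As a side note on reading these outputs, the two numbers on each distance line are the same quantity, with distance% being distance multiplied by 100. The sketch below shows one way such a figure could be computed for a fixed set of proportions, assuming the distance is the Euclidean distance between the target's coordinates and the weighted mix of the reference coordinates. The three-dimensional coordinates are invented for illustration and are not real Global25 values. Only the 82.6/17.4 split is taken from the first model above.

```python
import math


def model_distance(target, refs, percents):
    """Euclidean distance between a target and a weighted mix of references."""
    weights = {name: pct / 100.0 for name, pct in percents.items()}
    dims = len(target)
    mix = [sum(weights[n] * refs[n][d] for n in weights) for d in range(dims)]
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(mix, target)))


# Invented coordinates for illustration only (not real Global25 data).
refs = {"Barcin_N": [0.10, 0.02, -0.01], "WHG": [0.13, 0.05, 0.03]}
target = [0.11, 0.02, 0.00]
percents = {"Barcin_N": 82.6, "WHG": 17.4}

d = model_distance(target, refs, percents)
print(f"[1] distance%={100 * d:.4f} / distance={d:.6f}")
```

A smaller distance means the mix sits closer to the target in PCA space, so the models with the lowest distance% are the tightest fits.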

But what does this mean? Well, obviously that the R1a in Corded Ware people is not from the PC steppe!

Nah, I'm just messing around; poking a bit of fun at the dumb trolls online still arguing, against all odds, that the steppe ancestors of the Corded Ware people did not carry R1a. But let's just move on, shall we, because there's no longer any doubt that the R1a-M417 subclade of R1a, which encompasses almost 100% of the R1a lineages in the world today, expanded from the PC steppe with the forefathers of the Corded Ware folk. For one, it's found in the aforementioned Eneolithic North Pontian, and two, in the oldest and most steppe-shifted Corded Ware individuals. So that's that.

