Wednesday, September 26, 2018

The Hallstatt effect (?)

Just to see what would happen, I ran a subset of the highest coverage Bronze Age samples from what are now Britain and Ireland in my new Celtic vs Germanic Principal Component Analysis (PCA). Look for the Britain_&_Ireland_BA cluster. The relevant datasheet is available here.

Perhaps it's not a coincidence that the likely Celtic-speaking Iron Age individuals from present-day England (labeled England_IA) are positioned between these older British and Irish samples and the two ancients from Iron Age burials in present-day Bylany, Czechia, associated with the Hallstatt culture (marked with black stars). That's because the Hallstatt people are generally considered to have been the earliest speakers of Celtic languages.

Hence, what the PCA might be showing is a genetic shift in the British and Irish Isles caused by the arrival of Hallstatt Celts in Northwestern Europe.

Interestingly, the present-day English samples appear to be a mixture of Britain_&_Ireland_BA, England_IA and England_Anglo-Saxon. However, a subset of these samples is also heavily shifted "east" towards one of the Hallstatt individuals and present-day Dutch, suggesting that they harbor extra admixture from continental Europe.

This isn't easy to make out on my plot, because of the clutter, but I can assure you that it's true. Keep in mind that you can plug the datasheet into the PAST program (freely available here) to have a much closer look at the PCA and even change the color coding.

To check whether England_IA can be modeled as a mixture of Britain_&_Ireland_BA and Hallstatt with formal methods, I ran an analysis with the qpAdm software using all of the publicly available Bronze Age samples from present-day Britain and Ireland. The standard errors are high, likely because Britain_&_Ireland_BA and Hallstatt are closely related, but, overall, we can probably say that the model does limp across the line.

Britain_&_Ireland_BA 0.555±0.172
Hallstatt 0.445±0.172
chisq 18.513
tail prob 0.100973
Full output

However, the really important thing about this output is that England_IA cannot be modeled as simply Britain_&_Ireland_BA (the chisq and tail prob are way off). Thus, even though the Hallstatt samples from Bylany don't appear to be ideal proxies for the admixture in England_IA that is lacking in Britain_&_Ireland_BA, the signal they produce does suggest that a closely related population arrived in the British Isles during or after the Bronze Age to give rise to England_IA.
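As a rough sanity check on the figures above, here's a minimal Python sketch that expresses the Hallstatt coefficient in units of its standard error and recomputes the tail probability from the chi-square value. Note that the 12 degrees of freedom is an assumption: it isn't quoted above, and is used here only because it reproduces the reported tail prob (the full output is the place to confirm it). The function name is purely illustrative and has nothing to do with qpAdm itself.

```python
import math

def chi2_tail_prob(chisq, dof):
    """Upper-tail probability of a chi-square statistic (even dof only)."""
    if dof <= 0 or dof % 2:
        raise ValueError("closed form used here needs a positive, even dof")
    half = chisq / 2.0
    # For even dof the survival function reduces to a finite sum.
    total = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * total

# Values copied from the qpAdm output above.
hallstatt, std_err = 0.445, 0.172
chisq = 18.513
ASSUMED_DOF = 12  # assumption: not quoted in the post

z_ratio = hallstatt / std_err
tail_prob = chi2_tail_prob(chisq, ASSUMED_DOF)
print(f"Hallstatt coefficient / SE = {z_ratio:.2f}")
print(f"tail prob at {ASSUMED_DOF} dof = {tail_prob:.4f}")
```

The Hallstatt coefficient sits only about 2.6 standard errors from zero, which is why the model can't be said to do more than limp across the line.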

See also...

Celtic probably not from the west

Celtic vs Germanic Europe

Tuesday, September 25, 2018

AmtDB: an interactive ancient human mitogenome database

A very useful resource called AmtDB has just come online. For background info, check out the relevant paper by Ehler et al. here. Below is the paper abstract:

Ancient mitochondrial DNA is used for tracing human past demographic events due to its population-level variability. The number of published ancient mitochondrial genomes has increased in recent years, alongside with the development of high-throughput sequencing and capture enrichment methods. Here, we present AmtDB, the first database of ancient human mitochondrial genomes. Release version contains 1107 hand-curated ancient samples, freely accessible for download, together with the individual descriptors, including geographic location, radiocarbon dating, and archaeological culture affiliation. The database also features an interactive map for sample location visualization. AmtDB is a key platform for ancient population genetic studies and is available at

To give an example of how this thing works, I'll search for a very specific mitochondrial (mtDNA) haplogroup, H6a1b, which was recorded, perhaps unexpectedly, in a sample from Hittite era Anatolia (individual MA2208 from Damgaard et al. 2018). I say perhaps unexpectedly, because it's a marker that is today, by and large, restricted to Northern Europe. Here are the results...

Interestingly, H6a1b only pops up in Copper and Bronze Age individuals from what are now Czechia, Great Britain, Poland and Russia, with not a single instance from the Near East. Moreover, the oldest sample on the list is from a Yamnaya culture burial in Samara, Russia. Thus, if the presence of this marker in the Hittite sample isn't due to contamination or poor quality sequencing, then it's likely that some Hittites belonged to mtDNA haplogroups that arrived in Anatolia from the steppes of what is now Russia.

See also...

Focus on Hittite Anatolia

Saturday, September 22, 2018

Corded Ware people =/= Proto-Uralics (Tambets et al. 2018)

A new paper on the genetic structure of Uralic-speaking populations has appeared at Genome Biology (see here). It looks to me like the prelude to a forthcoming paleogenetics paper on the same topic that was discussed in the Estonian media recently (see here). Although not exactly groundbreaking (because it basically argues what I've been saying at this blog for years, like here), it's a very nice effort all round and must be read by anyone with an interest in this topic. From the paper, emphasis is mine:

Background: The genetic origins of Uralic speakers from across a vast territory in the temperate zone of North Eurasia have remained elusive. Previous studies have shown contrasting proportions of Eastern and Western Eurasian ancestry in their mitochondrial and Y chromosomal gene pools. While the maternal lineages reflect by and large the geographic background of a given Uralic-speaking population, the frequency of Y chromosomes of Eastern Eurasian origin is distinctively high among European Uralic speakers. The autosomal variation of Uralic speakers, however, has not yet been studied comprehensively.

Results: Here, we present a genome-wide analysis of 15 Uralic-speaking populations which cover all main groups of the linguistic family. We show that contemporary Uralic speakers are genetically very similar to their local geographical neighbours. However, when studying relationships among geographically distant populations, we find that most of the Uralic speakers and some of their neighbours share a genetic component of possibly Siberian origin. Additionally, we show that most Uralic speakers share significantly more genomic segments identity-by-descent with each other than with geographically equidistant speakers of other languages. We find that correlated genome-wide genetic and lexical distances among Uralic speakers suggest co-dispersion of genes and languages. Yet, we do not find long-range genetic ties between Estonians and Hungarians with their linguistic sisters that would distinguish them from their non-Uralic-speaking neighbours.

Conclusions: We show that most Uralic speakers share a distinct ancestry component of likely Siberian origin, which suggests that the spread of Uralic languages involved at least some demic component.


Recent aDNA studies have shown that extant European populations draw ancestry from three main migration waves during the Upper Palaeolithic, the Neolithic and Early Bronze Age [2, 3, 45]. The more detailed reconstructions concerning NE Europe up to the Corded Ware culture agree broadly with this scenario and reveal regional differences [65–67]. However, to explain the demographic history of extant NE European populations, we need to invoke a novel genetic component in Europe—the Siberian. The geographic distribution of the main part of this component is likely associated with the spread of Uralic speakers but gene flow from Siberian sources in historic and modern Uralic speakers has been more complex, as revealed also by a recent study of ancient DNA from Fennoscandia and Northwest Russia [68]. Thus, the Siberian component we introduce here is not the perfect but still the current best candidate for the genetic counterpart in the spread of Uralic languages.


Tambets et al., Genes reveal traces of common recent demographic history for most of the Uralic-speaking populations, Genome Biology, (2018) 19:139

See also...

Big deal of 2019: ancient DNA confirms the link between Y-haplogroup N and Uralic expansions

Friday, September 21, 2018

Dzudzuana Ice Age foragers: a different type of Caucasus hunter-gatherer (Lazaridis et al. 2018 preprint)

Over at bioRxiv at this LINK. Below is the abstract. Emphasis is mine.

The earliest ancient DNA data of modern humans from Europe dates to ~40 thousand years ago, but that from the Caucasus and the Near East to only ~14 thousand years ago, from populations who lived long after the Last Glacial Maximum (LGM) ~26.5-19 thousand years ago. To address this imbalance and to better understand the relationship of Europeans and Near Easterners, we report genome-wide data from two ~26 thousand year old individuals from Dzudzuana Cave in Georgia in the Caucasus from around the beginning of the LGM. Surprisingly, the Dzudzuana population was more closely related to early agriculturalists from western Anatolia ~8 thousand years ago than to the hunter-gatherers of the Caucasus from the same region of western Georgia of ~13-10 thousand years ago. Most of the Dzudzuana population's ancestry was deeply related to the post-glacial western European hunter-gatherers of the 'Villabruna cluster', but it also had ancestry from a lineage that had separated from the great majority of non-African populations before they separated from each other, proving that such 'Basal Eurasians' were present in West Eurasia twice as early as previously recorded. We document major population turnover in the Near East after the time of Dzudzuana, showing that the highly differentiated Holocene populations of the region were formed by 'Ancient North Eurasian' admixture into the Caucasus and Iran and North African admixture into the Natufians of the Levant. We finally show that the Dzudzuana population contributed the majority of the ancestry of post-Ice Age people in the Near East, North Africa, and even parts of Europe, thereby becoming the largest single contributor of ancestry of all present-day West Eurasians.

Lazaridis et al., Paleolithic DNA from the Caucasus reveals core of West Eurasian ancestry, bioRxiv, posted September 21, 2018, doi:

See also...

Villabruna cluster =/= Near Eastern migrants

Thursday, September 20, 2018

Early Anatolian farmers were overwhelmingly of local hunter-gatherer origin (Feldman et al. 2018 preprint)

Over at bioRxiv at this LINK. The dataset in this preprint includes just one Anatolian hunter-gatherer, but that's enough to make the point that in Anatolia, unlike in Europe, there was very strong genetic continuity between the local foragers and earliest farmers. His Y-chromosome haplogroup is an interesting one: C1a2, which has been recorded in European remains from the Upper Paleolithic. Below is the abstract and a pertinent quote. I think this preprint basically confirms what I argued about the origin of the so-called Villabruna hunter-gatherer clade back in 2016 (see here). Emphasis is mine.

Anatolia was home to some of the earliest farming communities. It has been long debated whether a migration of farming groups introduced agriculture to central Anatolia. Here, we report the first genome-wide data from a 15,000 year-old Anatolian hunter-gatherer and from seven Anatolian and Levantine early farmers. We find high genetic continuity between the hunter-gatherer and early farmers of Anatolia and detect two distinct incoming ancestries: an early Iranian/Caucasus related one and a later one linked to the ancient Levant. Finally, we observe a genetic link between southern Europe and the Near East predating 15,000 years ago that extends to central Europe during the post-last-glacial maximum period. Our results suggest a limited role of human migration in the emergence of agriculture in central Anatolia.


Among the Later European HG, recently reported Mesolithic hunter-gatherers from the Balkan peninsula, which geographically connects Anatolia and central Europe (‘Iron Gates HG’) [18], are genetically closer to AHG when compared to all the other European hunter-gatherers, as shown in the significantly positive statistic D(Iron_Gates_HG, European hunter-gatherers; AHG, Mbuti/Altai). Iron Gates HG are followed by Epigravettian and Mesolithic individuals from Italy and France (Villabruna [14] and Ranchot88 respectively [17]) as the next two European hunter-gatherers genetically closest to AHG [20] (Fig. 3A and data table S5). Iron Gates HG have been suggested to be genetically intermediate between WHG and eastern European hunter-gatherers (EHG) with an additional unknown ancestral component [18]. We find that Iron Gates HG can be modeled as a three-way mixture of Near-Eastern hunter-gatherers (25.8 ± 5.0 % AHG or 11.1 ± 2.2 % Natufian), WHG (62.9 ± 7.4 % or 78.0 ± 4.6 % respectively) and EHG (11.3 ± 3.3 % or 10.9 ± 3 % respectively); (tables S4 and S9). The affinity detected by the above D-statistic can be explained by gene flow from Near-Eastern hunter-gatherers into the ancestors of Iron Gates or by a gene flow from a population ancestral to Iron Gates into the Near-Eastern hunter-gatherers as well as by a combination of both. To distinguish the direction of the gene flow, we examined the Basal Eurasian ancestry component (α), which is prevalent in the Near East [6] but undetectable in European hunter-gatherers [17]. Following a published approach [6], we estimated α to be 24.8 ± 5.5 % in AHG and 38.5 ± 5.0 % in Natufians (Fig. 3B, table S10), consistent with previous estimates for the latter [6]. Under the model of unidirectional gene flow from Anatolia to Europe, 6.4 % is expected for α of Iron Gates by calculating (% AHG in Iron Gates HG) × (α in AHG). 
However, Iron Gates can be modeled without any Basal Eurasian ancestry or with a non-significant proportion of 1.6 ± 2.8 % (Fig. 3B, table S10), suggesting that unidirectional gene flow from the Near East to Europe alone is insufficient to explain the extra affinity between the Iron Gates HG and the Near-Eastern hunter-gatherers. Thus, it is plausible to assume that prior to 15,000 years ago there was either a bidirectional gene flow between populations ancestral to Southeastern Europeans of the early Holocene and Anatolians of the late glacial or a dispersal of Southeastern Europeans into the Near East. Presumably, this Southeastern European ancestral population later spread into central Europe during the post-last-glacial maximum (LGM) period, resulting in the observed late Pleistocene genetic affinity between the Near East and Europe.

Feldman et al., Late Pleistocene human genome suggests a local origin for the first farmers of central Anatolia, bioRxiv, posted September 20, 2018, doi:
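The 6.4% figure in the quote above is simply the product of two estimates, and it's easy to reproduce. Here's a minimal Python sketch of that arithmetic, using only the numbers quoted from the preprint. The variable and function names are my own, and the "gap in SE" line is a crude illustration that ignores the uncertainty in the expected value, not a test taken from the preprint.

```python
# Point estimates and standard errors quoted from Feldman et al. (in %).
ahg_share_in_iron_gates = 25.8   # % AHG ancestry in Iron Gates HG
alpha_ahg = 24.8                 # % Basal Eurasian ancestry (alpha) in AHG
alpha_iron_gates_obs = 1.6       # % alpha estimated directly in Iron Gates HG
alpha_iron_gates_se = 2.8        # standard error of that estimate

def expected_alpha(source_share_pct, source_alpha_pct):
    """Alpha expected in the recipient if it arrived only via the source."""
    return source_share_pct / 100.0 * source_alpha_pct

expected = expected_alpha(ahg_share_in_iron_gates, alpha_ahg)
# Rough gap in units of the observed estimate's SE only (illustrative).
gap_in_se = (expected - alpha_iron_gates_obs) / alpha_iron_gates_se
print(f"expected alpha in Iron Gates HG: {expected:.1f}%")
print(f"gap to observed estimate: {gap_in_se:.1f} SE")
```

The same function can be pointed at the Natufian-based model by swapping in 11.1 and 38.5 from the quote.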

Sunday, September 16, 2018

Celtic vs Germanic Europe

I have a feeling that ancient DNA from post-Bronze Age Northwestern Europe will be coming thick and fast from now on. To get the most out of such data I've designed a new Principal Component Analysis (PCA) that does a better job of separating the Celtic- and Germanic-speaking populations of Europe than my previous efforts of this sort (see here and here). Below are two different versions of the same PCA. The relevant datasheet is available here.

And here's a Linear Discriminant Analysis (LDA) plot based on the 25 principal components. It further differentiates many of the populations along the east > west cline of genetic diversity.

The difference between the Germanic Anglo-Saxons and the Celtic and Roman Britons of what is now eastern England is obvious. The Anglo-Saxons could pass for Scandinavians, while the Celts and Romans both cluster between the Irish and French. This makes good sense, and is exactly what I was looking for. It's also interesting to see the presumably Celtic-speaking Hallstatt samples from Bylany, Czechia, clustering with the Belgians.

Update 14/12/2019: Pictured below is a new version of my Celtic vs Germanic genetic map. It's based on the same Principal Component Analysis (PCA) as the original, but more focused on Northwestern Europe and produced with a different program.

To see the interactive online version, navigate to Vahaduo Custom PCA and copy and paste the text from here into the empty space under the PCA DATA tab. Then press the PLOT PCA button under the PCA PLOT tab. For more guidance, refer to the screen caps here and here.

To include a wider range of populations in the key, just edit the data accordingly. For instance, to break up the ancient grouping into more specific populations, delete the Ancient: prefix in all of the relevant rows. This is what you should see:

Conversely, you can leave the ancient sample set intact and instead reorder the present-day linguistic groupings into, say, geographic groupings. To achieve this just delete all of the linguistic prefixes, such as Celtic:, Germanic:, and so on. You should end up with a datasheet like this and plot like this.
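If you'd rather not edit the rows by hand, the prefix deletions described above can be scripted. Here's a minimal Python sketch. It assumes each row is comma-separated with the label in the first field and the grouping prefix at the very start of that label. The sample rows and the function name are made up for illustration, so check them against the actual datasheet before relying on this.

```python
def strip_group_prefixes(rows, prefixes):
    """Remove a leading group prefix from the label (first field) of each row."""
    cleaned = []
    for row in rows:
        label, sep, coords = row.partition(",")
        for prefix in prefixes:
            if label.startswith(prefix):
                label = label[len(prefix):]
                break  # strip at most one prefix per row
        cleaned.append(label + sep + coords)
    return cleaned

# Made-up rows in an assumed "label,PC1,PC2,..." layout.
rows = [
    "Ancient:England_IA:sample1,0.011,-0.004",
    "Celtic:Irish:sample2,0.009,-0.002",
    "Germanic:Dutch:sample3,-0.006,0.003",
]

# Break up the ancient grouping, leaving the linguistic groupings intact.
for line in strip_group_prefixes(rows, ["Ancient:"]):
    print(line)
```

Passing `["Celtic:", "Germanic:"]` instead leaves the ancient set intact and drops the linguistic prefixes.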

Of course, you can design your own plot by using any combination of the ancient and present-day individuals and populations that I've already run in this PCA. Their coordinates are listed here. Indeed, if you're in the possession of your own Celtic vs Germanic PCA coordinates, you can add yourself to the plot. And if you're not, see here.

It's also possible to re-process PCA data via the SOURCE tab. But I don't recommend doing this with the Celtic vs Germanic data, which are derived from a fine scale analysis and don't pack much variation. On the other hand, Global25 data are ideal for such re-processing. I made the plots below from subsets of Global25 coordinates available in a zip file here. To see how, refer to the screen caps here and here.

See also...

Modeling your ancestry has never been easier

Getting the most out of the Global25

Modeling genetic ancestry with Davidski: step by step

Wednesday, September 12, 2018

Avars and Longobards

Most of the "barbarians" from today's Amorim et al. paper have made it into the Global25 datasheets. Look for the samples with Collegno and Szolad in their labels. Same links as always...

Global25 datasheet ancient scaled

Global25 pop averages ancient scaled

Global25 datasheet ancient

Global25 pop averages ancient

Here's my usual Principal Component Analysis (PCA) of West Eurasian variation with the same individuals. As seen in the paper, the two females from Avar burials are very European indeed, with no hints of any recent Asian ancestry. The relevant datasheet is available here.

And this is my Global25-derived North European PCA featuring a subset of these samples that plotted firmly with present-day populations from north of the Alps, Balkans and Pyrenees. The aforementioned Avars (red dots) are sitting within the Polish cluster. The relevant datasheet is available here.

See also...

Greeks in a Longobard cemetery

First real foray into Migration Period Europe: the Gepid, Roman, Ostrogoth and others

Tuesday, September 11, 2018

Blast from the past: Yamnaya prediction from 2016

I wonder what's holding up the publication of the Wang et al. "Greater Caucasus" preprint? It was released back in May at bioRxiv (see here). On a related note, I was looking back at some of the stuff that I wrote about the origin of the Yamnaya people (aka Steppe_EMBA), and found this...

But here's my prediction: Steppe_EMBA only has 10-15% admixture from the post-Mesolithic Near East not including the North Caucasus, and basically all of this comes via female mediated gene flow from farming communities in the Caucasus and perhaps present-day Ukraine.

The relevant blog post from 2016 is here. I totally forgot that I made such a bold prediction. But it actually has a very good chance now of being proven correct, more or less.

This, however, depends on the precise origin of the Yamnaya-like Eneolithic populations of the southernmost parts of the Pontic-Caspian steppe. But, considering the data in Wang et al., I think the possibility that they date back to the Pottery Neolithic period, and are thus indigenous to the region, looks quite high.

About a year later I made a prediction about the genetic structure of the Maykop people, and was basically proven right by Wang et al. (see here). Admittedly, my jaw dropped when I saw how the Steppe Maykop individuals came out in the preprint, with their Botai-like ancestry that is missing in all Yamnaya populations sampled to date. But it was an interesting outcome and nice to be surprised by ancient DNA yet again.

See also...

Genetic borders are usually linguistic borders too

Ahead of the pack

Open thread: What are the linguistic implications of Olalde et al. 2019?

Wednesday, September 5, 2018

ISBA 2018 abstracts

The ISBA 2018 conference is in a couple of weeks and the abstract book is now available here. Below are a few examples of what's on offer this year. Admittedly, the Scythian abstract looks a bit weird to me, because we know for a fact that the Scythians who lived in the Pontic-Caspian steppe harbored Siberian genome-wide and maternal admixture (see here and here). The abstract about the horses and mules looks like it's from the major horse paper that I blogged about a few days ago (see here). Emphasis is mine...

Genetic continuity in the western Eurasian Steppe broken not due to Scythian dominance, but rather at the transition to the Chernyakhov culture (Ostrogoths)

Jarve et al.

The long-held archaeological view sees the Early Iron Age nomadic Scythians expanding west from their Altai region homeland across the Eurasian Steppe until they reached the Ponto-Caspian region north of the Black and Caspian Seas by around 2,900 BP. However, the migration theory has not found support from ancient DNA evidence, and it is still unclear how much of the Scythian dominance in the Eurasian Steppe was due to movements of people and how much reflected cultural diffusion and elite dominance. We present new whole-genome results of 31 ancient Western and Eastern Scythians as well as samples pre- and postdating them that allow us to set the Scythians in a temporal context by comparing the Western Scythians to samples before and after within the Ponto-Caspian region. We detect no significant contribution of the Scythians to the Early Iron Age Ponto-Caspian gene pool, inferring instead a genetic continuity in the western Eurasian Steppe that persisted from at least 4,800–4,400 cal BP to 2,700–2,100 cal BP (based on our radiocarbon dated samples), i.e. from the Yamnaya through the Scythian period.

However, the transition from the Scythian to the Chernyakhov culture between 2,100 and 1,700 cal BP does mark a shift in the Ponto-Caspian genetic landscape, with various analyses showing that Chernyakhov culture samples share more drift and derived alleles with Bronze/Iron Age and modern Europeans, while the Scythians position outside modern European variation. Our results agree well with the Ostrogothic origins of the Chernyakhov culture and support the hypothesis that the Scythian dominance was cultural rather than achieved through population replacement.


Unveiling early horse domestication and mule production with ancient genome-scale data

Fages et al.

Despite being one of the last large herbivores to be domesticated, the horse has deeply transformed human civilizations. It provided not only important primary domestication products including both meat and milk, but also invaluable secondary products, such as fast transportation, which impacted patterns of human movements and facilitated the spread of vast cultural and political units across the Old World. The steps underpinning early horse domestication are, however, difficult to track in the archaeological record, especially due to (1) the relative scarcity of horse bone assemblages until the Neolithic and Bronze Age transition, and (2) the absence of clear patterns of size differentiation prior to the Iron Age. Some of the more recent steps accompanying horse domestication, and in particular how it was transformed to fit a range of utilizations in different human groups, are also poorly documented. One such step pertains to the development of mules, and other kinds of F1-hybrids, which are difficult to identify on fragmentary remains using morphology alone. Within the course of the ERC PEGASUS project, we have generated genome-scale sequence information from hundreds of equine archaeological remains spread across Eurasia and spanning the last ~40,000 years. These data helped us test the extent to which candidate domestication centres in Central Asia and Europe contributed to the genetic makeup of the modern domestic horse and propose a minimal time boundary for the earliest utilization of mules by mankind.


The genetic history of the Iberian Peninsula over the last 8000 years

Olalde et al.

The Iberian Peninsula, lying on the southwestern corner of Europe, provides an excellent opportunity to assess the final impact of population movements entering the continent from the east and to study prehistoric and historic connections with North Africa. Previous studies have addressed the population history of Iberia using ancient genomes, but the final steps leading to the formation of the modern Iberian gene pool during the last 4000 years remain largely unexplored. Here we report genome-wide data from 153 ancient individuals from Iberia, more than doubling the number of available genomes from this region and providing the most comprehensive genetic transect of any region in the world during the last 8000 years. We find that Mesolithic hunter-gatherers dated to the last centuries before the arrival of farmers showed an increased genetic affinity to central European hunter-gatherers, as compared to earlier individuals. During the third millennium BCE, Iberia received newcomers from south and north. The presence of one individual with a North African origin in central Iberia demonstrates early sporadic contacts across the strait of Gibraltar. Beginning ~2500 BCE, the arrival of individuals with steppe-related ancestry had a rapid and widespread genetic impact, with Bronze Age populations deriving ~40% of their autosomal ancestry and 100% of their Y-chromosomes from these migrants. During the later Iron Age, the first genome-wide data from ancient non-Indo-European speakers showed that they were similar to contemporaneous Indo-European speakers and derived most of their ancestry from the earlier Bronze Age substratum. 
With the exception of Basques, who remain broadly similar to Iron Age populations, during the last 2500 years Iberian populations were affected by additional gene-flow from the Central/Eastern Mediterranean region, probably associated to the Roman conquest, and from North Africa during the Moorish conquest but also in earlier periods, probably related to the Phoenician-Punic colonization of Southern Iberia.

See also...

How relevant is Arslantepe to the PIE homeland debate?

Sunday, September 2, 2018

Major horse paper coming soon

Horse domestication is an important and controversial topic, in large part because it's intimately tied to the debate over the location of the Proto-Indo-European (PIE) homeland. Based on the currently available genetic and archaeological data, it seems likely that all modern domesticated horse breeds ultimately derive from the Pontic-Caspian steppe in Eastern Europe (see here and here).

In the interview linked to below (click on the image) horse expert Alan Outram reveals that a new paper will be published within months that will test this theory, and either confirm or debunk it.

Outram also talks about the colonization of Central Asia during the Middle Bronze Age by groups from the west associated with the Sintashta culture. He says that this was probably an aggressive affair, akin to the more recent European colonization of North America, that may have pushed the Botai people, who were the indigenous inhabitants of the Kazakh steppe, and their horses far to the east. This, he suggests, might explain why the Przewalski horse of Mongolia appears to be derived from the Botai horse.

See also...

The mystery of the Sintashta people

Focus on Hittite Anatolia

Friendly Yeniseian steppe pastoralists