
Thursday, July 19, 2018

An early Iranian, obviously

Today, the part of Asia between the Caspian Sea and the Altai Mountains, known as Turan, is largely a Turkic-speaking region. But during the Iron Age it was dominated by Iranian speakers. Throughout this period it was the home of a goodly number of attested and inferred early Iranic peoples, such as the Airya, Dahae, Kangju, Massagetae, Saka and Sogdians.

Indeed, the early Iron Age Yaz II archaeological culture, located in southwestern Turan, is generally classified as an Iranian culture, and even posited to have been the Airyanem Vaejah, aka home of the Iranians, from ancient Avestan literature.

That's not to say that Iranian speakers weren't present in this part of the world much earlier. They probably were, and it's likely that we already have their genomes (see here). But the point I'm making is that Turan can't be reliably claimed to have been an Iranian realm until the Iron Age.

Ergo, any ancient DNA samples from Turan dating to the Iron Age, as opposed to, say, the Bronze Age, are very likely to be those of early Iranian speakers. One such sample is Turkmenistan_IA DA382 from Damgaard et al. 2018.

Below is a screen cap of the "time map", with the slider moved to 847 BC, showing the location of the burial site where the remains of DA382 were excavated. The site is marked with the Z93 label because DA382 belongs to the Eastern European-derived Y-chromosome haplogroup R1a-Z93. Interestingly, his burial was located in close proximity to archaeological sites associated with the above-mentioned and contemporaneous Yaz II culture.

DA382 didn't get much of a run in the Damgaard et al. paper, and little wonder because the authors also analyzed 73 other ancient samples. So let's take a close look at this individual's genetic structure to see whether there's anything particularly Iranian about it.

Damgaard et al. did mention that DA382 was partly of Middle to Late Bronze Age (MLBA) steppe origin. And indeed, my own mixture models using qpAdm confirm this finding with very consistent results and strong statistical fits. Here are a couple of two-way examples...

Namazga_CA 0.528±0.040
Srubnaya_MLBA 0.472±0.040
P-value: 0.561330411
Full output

Dzharkutan1_BA 0.530±0.037
Srubnaya_MLBA 0.470±0.037
P-value: 0.485083377
Full output
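As a rough sanity check on the two models above, here is a minimal Python sketch that only re-reads the printed numbers; it doesn't run qpAdm. It assumes that the value after each "±" is one standard error, and that a tail probability above 0.05 counts as a passing fit. The `Component` class, the `summarize` helper and the `ALPHA` threshold are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    source: str        # reference population label, as printed above
    proportion: float  # estimated ancestry proportion
    std_err: float     # value after the "±", read here as one standard error (assumption)


def z_score(component: Component) -> float:
    """Number of standard errors separating the proportion from zero."""
    return component.proportion / component.std_err


# Figures copied from the two DA382 models above.
MODELS = {
    "Namazga_CA + Srubnaya_MLBA": (
        [Component("Namazga_CA", 0.528, 0.040),
         Component("Srubnaya_MLBA", 0.472, 0.040)],
        0.561330411,
    ),
    "Dzharkutan1_BA + Srubnaya_MLBA": (
        [Component("Dzharkutan1_BA", 0.530, 0.037),
         Component("Srubnaya_MLBA", 0.470, 0.037)],
        0.485083377,
    ),
}

ALPHA = 0.05  # assumption: conventional cutoff, not a value taken from the post


def summarize(components, p_value):
    """Collect the three checks: proportions sum to one, every source is
    clearly non-zero, and the model isn't rejected at ALPHA."""
    return {
        "total": sum(c.proportion for c in components),
        "min_z": min(z_score(c) for c in components),
        "passes": p_value > ALPHA,
    }


SUMMARIES = {name: summarize(comps, p) for name, (comps, p) in MODELS.items()}

for name, s in SUMMARIES.items():
    print(f"{name}: total={s['total']:.3f}, "
          f"min z={s['min_z']:.1f}, passes={s['passes']}")
```

Under those assumptions both models pass, and every source sits more than ten standard errors away from zero, which is what "consistent results and strong statistical fits" amounts to in numbers.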

The fact that the MLBA Srubnaya samples from the Pontic-Caspian steppe can be used to model DA382's ancestry (alongside Bronze and Copper Age populations from Turan) with such ease shouldn't be surprising, considering that he belongs to R1a-Z93, which is the dominant Y-haplogroup in the Srubnaya and all other closely related MLBA steppe peoples.

Now, Srubnaya is generally regarded as the proto-Iranian archaeological culture. How awesome is that considering those qpAdm fits? But, admittedly, this is just an inference, even if a robust one, based on genetic, archaeological and historical linguistics data. So apart from the fact that DA382 comes from Iron Age Turan, an Iranian-speaking realm, is there any other way to link him directly to Iranians?

Well, he's very similar in terms of overall genetic structure to some of the least Turkic-admixed Iranian speakers still living in Turan, and might well be ancestral to them.

For instance, below is a Principal Component Analysis (PCA) featuring a wide range of ancient and present-day West Eurasian samples. Note that, in line with the qpAdm models, DA382 clusters about half-way between the populations of the MLBA steppe and pre-Kurgan expansion Turan, and amongst present-day Yaghnobi and Pamiri Tajiks. In fact, he clusters at the apex of a southeast > northwest cline made up of Tajiks that appears to be pulling towards Europeans.

Needless to say, Tajiks, especially Pamiri Tajiks, also pack a lot of Srubnaya-related ancestry. I've talked about this plenty of times at this blog (for instance, see here). But what happens if I try to model Pamiri and Yaghnobi Tajiks with DA382?

Turkmenistan_IA 0.892±0.023
Han 0.108±0.023
P-value: 0.794566182
Full output

Wow, it's an awesome fit! My mind's made up: DA382 was probably an Iranian speaker and, more specifically, an Eastern Iranian speaker. Who disagrees and why? Feel free to let me know in the comments (unless you're banned, in which case, f*ck off).

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Friendly Yeniseian steppe pastoralists

New PCA featuring Botai horse tamers, Hun and Saka warriors, and many more...

Friday, July 6, 2018

"The Homeland: In the footprints of the early Indo-Europeans" time map

Click HERE to view the interactive time map, and give it some time to load if you're on a slow connection. Use the slider tool to explore different time periods from 6385 BC to 1 BC. It's still a work in progress, so feel free to let the author, Mikkel Nørtoft, know if anything should be added, tweaked and/or generally improved.

Below is a screen cap of the map with the slider moved to 3618 BC. Note the sheep in the North Pontic steppe and the wheels just west of it. Wink, wink, nudge, nudge...

See also...

Friendly Yeniseian steppe pastoralists

Ahead of the pack

Genetic borders are usually linguistic borders too

Thursday, July 5, 2018

SMBE 2018 abstracts

Abstracts from the upcoming SMBE 2018 conference (8-12 July) are now available HERE. Below are a few that I found interesting. Emphasis is mine. Feel free to post your own favorites in the comments.

The first Epipaleolithic Genome from Anatolia suggests a limited role of demic diffusion in the Advent of Farming in Anatolia

Feldman et al.

Anatolia was home to some of the earliest farming communities, which in the following millennia expanded into Europe and largely replaced local hunter-gatherers. The lack of genetic data from pre-farming Anatolians has so far limited demographic investigations of the Anatolian Neolithisation process. In particular, it has been unclear whether farming was adopted by indigenous hunter-gatherers in Central Anatolia or imported by settlers from earlier farming centers. Here we present the first genome-wide data from an Anatolian Epipaleolithic hunter-gatherer who lived ~15,000 years ago, as well as from Early Neolithic individuals from Anatolia and the Levant. By using a comparative dataset of modern and ancient genomes, we estimate that the earliest Anatolian farmers derive over 90 percent of their ancestry from the local Epipaleolithic population, indicating a high degree of genetic continuity throughout the Neolithic transition. In addition, we detect two distinct waves of gene flow during the Neolithic transition: an earlier one related to Iranian/Caucasus ancestry and a later one linked to the Levant. Finally, we observe a genetic link between Epipaleolithic Near-Easterners and post-glacial European hunter-gatherers that suggests a bidirectional genetic exchange between Europe and the Near East predating 15,000 years ago. Our results suggest that the Neolithisation model in Central Anatolia was demographically similar to the one previously observed in the southern Levant and in the southern Caucasus-Iran highlands, further supporting the limited role of demic diffusion during the early spread of agriculture in the Near East, in contrast to the later Neolithisation of Europe.


Demographic processes in Estonia from Bronze Age through Iron Age to Medieval times

Metspalu et al.

N3 and R1a are the two most common Y chromosome haplogroups among modern Estonians. R1a appears with Corded Ware culture but the arrival of hg N has not been determined. To this end we have extracted and studied aDNA from teeth of 18 individuals bracketing the changes in the material culture in the end of the Bronze and early Iron Age. We find N3 in Iron Age but not in Bronze Age. Due to the small sample size we cannot refute the existence of hg N in the latter. In genome wide analyses the Bronze Age and especially Iron Age samples appear very similar to modern Estonians implying population continuity. Christianization (13 cc AD) established a new elite of West European origin, which presumably had an impact on the genetic structure of the local population. To investigate this we extracted DNA from teeth of 35 individuals, who have been uncovered from both rural (considered local Estonian population) and town (likely of West European origin) cemeteries of Estonia. We compared the low coverage genomes with each other and with relevant modern and ancient Estonian and other European populations. We find that there is a clear discontinuity between the elite and common people, where the former group genetically with modern German samples and the latter with modern Estonians. We do find three individuals of mixed genetic ancestry. But importantly we do not see a steady shift of either local population strata, which suggests limited contact between the elite and the common people.


Genetic transition in the Swiss Late Neolithic and Early Bronze Age

Furtwaengler et al.

Recent studies have shown that the beginning of the Neolithic period as well as final stages of the Neolithic were marked by major genetic turnovers in European populations. The transition from hunter-gatherers to agriculturalists and farmers/farming in the 6th millennium BP coincided with a human migration from the Near East. In the 3rd millennium BP a second migration into Central Europe occurred originating from the Pontic steppe linked to the spread of the Corded Ware Complex ranging as far southwest as modern day Switzerland. These genetic processes are well studied for example for the Middle-Elbe-Saale region in Eastern Germany, however, little is known from the regions that connect Central and Southern Europe. Here, we investigate genomic data from 69 individuals from the Swiss Plateau and Southern Germany that span the transition of the Neolithic to the Bronze Age (5500 to 4000 BP). Our results show a similar genetic process as reported for the Middle-Elbe-Saale region suggesting that the migration from the Pontic steppe reached all the way into the Swiss plateau. The high quality of the ancient genomic data also allowed an analysis of core families within multiple burials, the determination and qualification of different ancestry components and the determination of the migration route taken by the ancestors of the Late Neolithic populations in this region. This study presents the first comprehensive genome wide dataset from Holocene individuals from the Swiss plateau and provides the first glimpse into the genetic history of this genetically and linguistically diverse region.


Genome-Wide Ancient DNA Portrays the Forming of the Finnish Population Along a 1400-Year Transect

Majander et al.

The Finnish population has long been a subject of interest for the fields of medical and population genetics, due to its isolation-affected genetic structure and the associated unique set of inherited diseases. Recent advances in ancient DNA techniques now enable the in-depth investigation of Finland's demographic past: the impact of migrations, trade and altering livelihood practices. Here we analyse genome-wide data from over 30 individuals, representing ten archaeological burial sites from southern Finland, that span from the 5th to 19th century. We find the historical individuals to differ genetically from Finns today. Comparing them with surrounding ancient and modern populations, we detect a transition from genotypes generally connected with prehistoric hunter-gatherers, and specifically resembling those of the contemporary Saami people, into a more East-Central European composition, associated with the established agricultural lifestyle. Starting from the Iron Age and continuing through the Early Medieval period, this transition dates remarkably late compared to the respective changes in most regions of Europe. Our results suggest a population shift, presumably related to Baltic and Slavic influences, also manifested in the archaeological record of the local artefacts from the late Iron Age. Our observations also agree with the archaeological models of relatively recent and gradual adoption of farming in Finland.


Population migration and dairy pastoralism on the Bronze Age Mongolian steppe

Warinner et al.

The steppe belt that extends across Eurasia was the primary corridor of Eneolithic and Bronze Age migrations that reshaped the genetics of Europe and Asia and dispersed the Indo-European language family. Beginning in the Eneolithic, a new and highly mobile pastoralist society formed on the Western Steppe. These steppe herders expanded both westwards, contributing to the Corded Ware culture of Eastern and Central Europe, and eastwards, contributing to the mobile pastoralist Afanasevo, Sintashta, Andronovo, and Okunevo cultures in Central Asia. The eastern extent of this Western Steppe herder expansion is not well defined. Here we investigate genome-wide ancestry data obtained from 20 Late Bronze Age (16th-9th century BCE) khirigsuur burials from Khovsgol, Mongolia and further investigate evidence for dairy pastoralism by LC-MS/MS analysis of dental calculus. Overall, we observe limited Western Steppe gene flow into Late Bronze Age Mongolia, but adoption of Western ruminant dairying by ca. 1500 BCE.


The Transition to Farming in Eneolithic (Copper Age) Ukraine was Largely Driven by Population Replacement

Schmidt et al.

The transition to a farming-based economy during the Neolithic happened relatively late in southeastern Europe. Material changes occurred through pottery manufacture, but it wasn't until the sixth millennium BCE that farming was adopted by the Cucuteni-Trypillian archaeological complex (4800-3000 BCE). In many parts of Europe, early farmers who were descended from Anatolian migrants slowly admixed with local hunter-gatherers over the course of the Neolithic. In Eastern Europe and the Balkans, this process may have been more complex since early farmers would likely have admixed with local groups prior to spreading into continental Europe. Studies from the Baltic and Estonia suggest little genetic input from early farmers or continuous admixture with hunter-gatherers. Here, we investigate the impact of Trypillian migrations into Ukraine through the analyses of 19 ancient genomes (0.6 to 2.1X coverage) from the site of Verteba Cave. Ceramic typology and radiocarbon dating of the cave indicate continuous occupation from the Mesolithic to the Medieval Period, with peak occupation coinciding with the middle to late Tripolye. We show that the Trypillians replaced local Ukrainian Neolithic cultures. Also, hunter-gatherers contributed very little ancestry to the Trypillians, who are genetically indistinct from early Neolithic farmers. The one exception is a female that has mostly steppe-related ancestry. Direct radiocarbon dating of this individual places her in the Middle Bronze Age (3545 years before present). Her lack of farmer ancestry suggests abrupt population replacement resulting perhaps from inter-group hostilities or plague that spread through Europe during the Late Neolithic.

See also...

Ahead of the pack

Genetic borders are usually linguistic borders too

Yamnaya isn't from Iran just like R1a isn't from India

Wednesday, July 4, 2018

How relevant is Arslantepe to the PIE homeland debate?

Below is an abstract of a presentation from the recent 11th International Congress on the Archaeology of the Ancient Near East (see here). It was discussed on at least a couple of DNA forums, and hailed by some as potentially pivotal to the Proto-Indo-European (PIE) homeland debate. Emphasis is mine:

Palaeogenetic and Anthropological Perspectives on late Chalcolithic and Early Bronze Age Arslantepe

Skourtanioti, Eirini - Max Planck Institute for the Science of Human History, Jena
Selim, Erdal Yilaz - Department of Anthropology, Hacettepe University, Ankara

While Anatolia was highlighted as the genetic origin of early Neolithic European farmers, the genetic substructure in Anatolia itself as well as the demographic and cultural changes remain unclear. In eastern Anatolia, the archaeological record reflects influences from North-Central Anatolia, the northeastern sectors of Fertile Crescent and the Caucasus, and suggests that some of these were brought along with the movement of people. Central to this question is the archaeological site of Arslantepe (6th-1st millennium BC), strategically located at the Upper Euphrates, the nexus of all three regions. Arslantepe also developed one of the first state societies of Anatolia along with advanced metal-technologies. Archaeological research suggests that conflicts with surrounding groups of pastoralists affiliated to the Caucasus might have contributed to the collapse of its palatial system at the end of the Chalcolithic period (4th millennium BC). To test if these developments were accompanied by genetic changes, we generated genome-wide data from 18 ancient individuals spanning from the Late Chalcolithic period to the Early Bronze Age of Arslantepe. Our results show no evidence for a major genetic shift between the two time periods. However, we observe that individuals from Arslantepe are very heterogeneous and differentiated from other ancient western and central Anatolians in that they have more Iran/Caucasus related ancestry. Our data also show evidence for an ongoing but also recent confluence of Anatolian/Levantine and Caucasus/Iranian ancestries, highlighting the complexity of the Chalcolithic and Bronze Age periods in this region.

Actually, I think it's likely that the good people from Max Planck will try to weave these results into their southern PIE homeland theory.

If so, they'll probably claim that since there's no discernible steppe ancestry in any of the 18 ancient Anatolians, then it's unlikely that there were any migrations from the Pontic-Caspian steppe into Anatolia during the Chalcolithic and Bronze Age periods. Ergo, they're likely to conclude that the PIE homeland was located south of the steppe, perhaps in Transcaucasia or what is now Iran, because this is where some of the Arslantepe ancients appear to source a lot of their recent ancestry from. But that would be a mistake.

First of all, the best way to link Arslantepe to the Proto-Indo-Europeans is probably via the massive Arslantepe Royal Tomb, which shows a close archaeological relationship to the kurgan burials in the North Caucasus. Here's an abstract from a paper on the topic from 2011 (freely available here). Emphasis is mine:

The first appearance of the Kurgan funerary tradition in the Northern Caucasus (Majkop‐Novosvobodnaya) dates to the second half of the fourth millennium and records an impressive display and accumulation of wealth in the grave goods (mainly metal objects) which stress the emergence of radical social transformations in the communities of the region. But kurgans also signal a new approach to territory and a different conception of the landscape. The Arslantepe Royal Tomb (in the Upper Euphrates Valley, eastern Anatolia), which is dated to 3100-2900 B.C., shows that far-reaching influences from the Northern Caucasus were already crossing the Greater Caucasus range and that they were being assimilated by the Anatolian power groups. The set of traits that the Arslantepe Royal Tomb shares with the funerary representations of the northern Caucasian Kurgans (ritual, grave goods and eventually the location chosen for the burial) was the result of a symbolic and ideological selection performed by a local community. It aimed at legitimizing and justifying the current historical and political contingencies and the emergence of new images and power rules. What can be grasped of the general sense and cultural values of the phenomenon of the northern Caucasian Kurgans by means of an interpretation performed by an external (and distant) community?

Hence, we can still argue that the PIE homeland was on the steppe, with Anatolian languages splitting from the rest of PIE in the North Caucasus, and then making their way south to Anatolia along with people of fully Caucasian genetic structure. As far as I can tell, that's essentially the theory proposed recently by Damgaard et al. in their linguistic supplement (see here).

But I'm not a huge fan of this solution. Clearly, the archaeological and genetic changes at Arslantepe from the Chalcolithic to the Bronze Age were in large part caused by the arrival of groups associated with the Kura-Araxes archaeological culture, and they were probably speakers of early Hurro-Urartian languages. If there were any migrants directly from the North Caucasus at Arslantepe, then I'd say that they probably spoke Caucasian languages.

Indo-Europeans certainly arrived at Arslantepe during the Late Bronze Age, with the expansion of the Hittite Empire into the region. But it's impossible to say how much, if any, PIE ancestry they brought with them, because obviously language shifts are a complex phenomenon, and this is especially true in state societies, like the Hittite Empire, in which cultural and political domination can lead to widespread language change without any accompanying gene flow.

See also...

Ahead of the pack

Genetic borders are usually linguistic borders too

Yamnaya isn't from Iran just like R1a isn't from India

Tuesday, July 3, 2018

Friendly Yeniseian steppe pastoralists

For most people the Proto-Indo-European (PIE) homeland debate isn't just about language, but also, or even more so, about things like ancestry, politics, racism, and ethnic pride.

I don't want to get into all the dirty details in this post, but, for instance, many of those who argue vehemently against a steppe homeland seem to really hate the idea that their ancestors were, at some level, dominated by a bunch of sheep and cow herders from an obscure part of Eastern Europe. Why? I'm pretty sure because they find this humiliating.

Hence, PIE homeland debates are generally very emotional and often degenerate into shouting matches. So what would happen if we assumed that those sheep herders weren't Indo-Europeans? Let's say, for the sake of argument, that they spoke extinct Yeniseian languages.

Well, nothing really...

We still have them expanding in a big way out of the Pontic-Caspian steppe. And the fact that their impact was mostly male-mediated, especially at the two ends of its range, in Iberia and India, suggests that they weren't just friendly pastoralists looking for new grazing fields.

In fact, in India, steppe ancestry, including Y-chromosome haplogroup R1a, is much more pronounced in the upper castes than in the rest of society. But who knows what that means, right? Maybe those friendly "Yeniseian" steppe pastoralists just got lucky or something?

But that's OK, perhaps this will always remain a mystery? In any case, does anyone know if there's any sort of Yeniseian substrate in Celtic or Mycenaean? Haha.

See also...

"Heavily sex-biased" population dispersals into the Indian Subcontinent (Silva et al. 2017)

Migration of the Bell Beakers—but not from Iberia (Olalde et al. 2018)

Steppe admixture in Mycenaeans, lots of Caucasus admixture already in Minoans (Lazaridis et al. 2017)

Sunday, July 1, 2018

Ahead of the pack

Eurogenes Blog January 2018 (see here):

Yamnaya and other similar Eneolithic/Bronze Age herder groups from the Eurasian steppe were mostly a mixture of Eastern European Hunter-Gatherers (EHG) and Caucasus Hunter-Gatherers (CHG). But they also harbored minor ancestry from at least one, significantly more westerly, source that pulled them away from the EHG > CHG north/south genetic cline.


It's interesting and, I'd say, important to note that the West Asian reference groups produce amongst the worst statistical fits (bolded). What this suggests is that Yamnaya did not harbor extra West Asian ancestry on top of its CHG input.


Rather, Blatterhole_MN is simply the best proxy in this analysis for the non-CHG/EHG ancestry in Yamnaya, and the important question is why?

Considering also the presence at the top of the list of Koros_HG (which includes Hungary_HG I1507), Germany_MN and Vinca_MN, the likely answer is its high ratio of Western European Hunter-Gatherer (WHG) ancestry.


So is the missing piece of the Yamnaya puzzle a population with roughly equal ratios of Early Neolithic (EN) and WHG ancestries from the Carpathian Basin or surrounds? Quite possibly. But let's wait and see what happens when I add the ancient groups from the Balkans and North Pontic steppe from the forthcoming Mathieson et al. 2018 to this analysis.

And now Wang et al. May 2018 (see here).

In principal component space Eneolithic individuals (Samara Eneolithic) form a cline running from EHG to CHG (Fig. 2D), which is continued by the newly reported Eneolithic steppe individuals.


However, PCA results also suggest that Yamnaya and later groups of the West Eurasian steppe carry some farmer related ancestry as they are slightly shifted towards ‘European Neolithic groups’ in PC2 (Fig. 2D) compared to Eneolithic steppe.


Importantly, our results show a subtle contribution of both Anatolian farmer-related ancestry and WHG-related ancestry (Fig.4; Supplementary Tables 13 and 14), which was likely contributed through Middle and Late Neolithic farming groups from adjacent regions in the West. A direct source of Anatolian farmer-related ancestry can be ruled out (Supplementary Table 15).


We find that Yamnaya individuals from the Volga region (Yamnaya Samara) have 13.2±2.7% and Yamnaya individuals in Hungary 17.1±4.1% Anatolian farmer-related ancestry (Fig. 4; Supplementary Table 18) – statistically indistinguishable proportions. Replacing Globular Amphora by Iberia Chalcolithic, for instance, does not alter the results profoundly (Supplementary Table 19). This suggests that the source population was a mixture of Anatolian farmer-related ancestry and a minimum of 20% WHG ancestry, a profile that is shared by many Middle/Late Neolithic and Chalcolithic individuals from Europe of the 3rd millennium BCE analysed thus far.

Strikingly similar, don't you think? However, I'm not implying that they copied me. The point I'm making is that I predicted this outcome ahead of anyone else, and was able to demonstrate it without some of the key ancient samples that Wang et al. had access to. Indeed, kudos to them for finding and successfully sequencing those all important new Eneolithic steppe samples.

Moreover, these results mean that it's no longer plausible to argue that the Yamnaya population, by and large, formed due to recent gene flow from south of the Caucasus, let alone from what is now Iran, into the Pontic-Caspian steppe. Obviously, this is a major problem for anyone arguing that the Proto-Indo-European (PIE) homeland may have been located somewhere south of the Caucasus, such as Paul Heggarty of the Max Planck Institute for the Science of Human History (for instance, see here). But hey, never mind the facts when you have an awesome theory, right?
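The "statistically indistinguishable" remark in the Wang et al. quote above can be checked with some back-of-the-envelope arithmetic. The Python sketch below assumes that the two "±" values are independent standard errors with roughly normal sampling error, which the preprint excerpt doesn't spell out, so treat it as an illustration rather than a re-analysis. The function names are hypothetical.

```python
import math


def difference_z(est_a: float, se_a: float, est_b: float, se_b: float) -> float:
    """z statistic for the difference between two independent estimates."""
    return (est_b - est_a) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_sided_p(z: float) -> float:
    """Two-sided tail probability of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Anatolian farmer-related ancestry (in percent) as quoted from Wang et al.
SAMARA = (13.2, 2.7)   # Yamnaya Samara: estimate, assumed standard error
HUNGARY = (17.1, 4.1)  # Yamnaya Hungary: estimate, assumed standard error

Z = difference_z(*SAMARA, *HUNGARY)
P = two_sided_p(Z)

print(f"difference = {HUNGARY[0] - SAMARA[0]:.1f} points, z = {Z:.2f}, p = {P:.2f}")
```

With these inputs the gap of 3.9 percentage points is well under one combined standard error, so the two proportions can't be told apart under the stated assumptions.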

See also...

Genetic borders are usually linguistic borders too

Yamnaya isn't from Iran just like R1a isn't from India

Paul Heggarty: desperate or clueless?

Tuesday, June 26, 2018

Genetic borders are usually linguistic borders too

Note the awesome correlation between the two maps below. The first map is mine. I posted it on this blog almost a year ago (see here). The second map is from the recent Wang et al. preprint (see here). Also note that the Steppe and Caucasus clusters as defined by Wang et al. are rich in Y-haplogroups R1 and J, respectively (see here).

Very cool indeed. But I'm still scratching my head and wondering why Wang et al. entertained the possibility in their conclusion that Indo-European languages diffused into the steppe from south of the Caucasus. I say that because, as a rule, human genetic borders also represent linguistic borders, and major linguistic families are strongly associated with Y-haplogroups (for instance, see here).

See also...

Matters of geography

Likely Yamnaya incursion(s) into Northwestern Iran

Yamnaya isn't from Iran just like R1a isn't from India

Saturday, June 23, 2018

Guest post: we owe many of our genetic traits to ancient steppe pastoralists, but...

This is a guest post courtesy of Samuel Andrews, a regular commentator for several years at this blog. I did edit parts of the original text submitted to me, but these were just cosmetic changes. If you spot any issues with this article, feel free to complain to Samuel in the comments below.

Massive migrations of pastoralists from the Pontic-Caspian steppe in the 3rd millennium BC abruptly ended generations of genetic stability in Europe. These large-scale population movements spread Yamnaya and Yamnaya-related ancestry throughout most of the continent, and indeed also much of Eurasia. Moreover, they also carried a specific type of European farmer ancestry, which was picked up by the migrating herders just west of the steppe.

By the end of the 3rd millennium BC, people across long distances within Europe shared very recent ancestry from the Pontic-Caspian steppe and surrounds. As a result, faraway locations in Europe were more connected than ever before.

Ancient genome-wide data suggest that these migrations also spread unique genetic traits, such as lactase persistence and fair hair (blonde & red), that were once mostly restricted to a fairly limited region within Europe. However, some of these traits were originally derived from the farmers, or rather agropastoralists, who lived just west of the steppe, such as the Globular Amphora and Funnel Beaker peoples, rather than the steppe herders, and indeed this appears to be the case for both lactase persistence and fair hair.

In the 3rd millennium BC, both of these traits went from relative obscurity to widespread prominence across and beyond Europe. The Bell Beaker people, who dominated the western half of Europe during the Early Bronze Age (EBA), and the Sintashta people, who, soon after, lived just east of Europe's present-day border in the Trans-Ural steppe of what is now Russia, demonstrate this well.

The pre-Sintashta and pre-Beaker populations of present-day Russia and Britain, respectively, showed low frequencies of alleles associated with fair hair. Thus, the Beaker and Sintashta peoples took high frequencies of these alleles, which they both probably inherited from Eastern European farmers, to the opposite ends of Europe, and then, via the expansions of Sintashta-related Andronovo populations, also deep into Asia.

rs4988235 > lactase persistence
rs16891982 > light skin & blonde hair
rs12913832 > blue eyes & blonde hair
rs1805008 > red hair

When the Andronovo groups mixed with the indigenous inhabitants of Asia, the frequencies of most of these European-specific alleles among them were reduced, until they almost disappeared in the new populations that formed as a result of this mixture process. Nevertheless, they continued to exist wherever there was significant Andronovo ancestry, including in the Iron Age peoples of the Swat Valley in what is now Pakistan:

Swat Valley, Udegram_IA, S8195.E1.L1; allele T at rs1805008 (aka R160W), the most common red hair variant among present-day Europeans.

Swat Valley; 17% derived allele frequency at rs16891982; 11% at rs12913832; twice as much as what Neolithic, Chalcolithic, and Bronze Age South Central Asians could boast.

Swat Valley, Barikot_IA, I6547; allele A at rs4988235 (aka I3910-T), the main lactase persistence mutation in both present-day Europe and South Asia.

Hence, I3910-T is a direct link between the populations of the Iron Age Swat Valley and ancient Europe. Indeed, Ukraine_Eneolithic I6561, from a burial associated with the Sredny Stog II archaeological culture in the North Pontic steppe, present-day Ukraine, is the oldest, UDG-treated sample in the ancient DNA record to date to show I3910-T. And he's also the oldest individual to belong to Y-chromosome haplogroup R1a-M417, which is today one of the most common Y-chromosome haplogroups in both Eastern Europe and South Asia, especially among the speakers of Indo-European languages.

After its expansion from Eastern Europe, I3910-T was heavily selected for both across most of Europe and in large parts of South Asia. Today, its frequencies vary significantly by region and ethnic group in India. By and large, much like R1a-M417, it’s more common in Indo-European-speaking North Indians than Dravidian-speaking South Indians. But it clearly peaks in Indian pastoralist populations that consume a lot of dairy products. They carry I3910-T at frequencies equal to those seen in many European groups.

The red hair variant R160W, seen in Swat Valley sample S8195.E1.L1, is another direct link between ancient South Asia and Europe. Just like I3910-T, R160W has been shown to have been present in Europe at least 2,000 years before the Andronovo Culture. Balkans_ChL I2423, a sample from what is now Bulgaria, carries R160W. This individual dates to 4400 BC, so he's of a similar age to Ukraine_Eneolithic I6561, the above mentioned early I3910-T carrier from the North Pontic steppe.

Other than that, as things stand, R160W is absent from pre-Kurgan Europe. In the aftermath of the Steppe migrations, around 2400-1800 BC, R160W pops up in many places in Europe just like I3910-T does. Four Andronovo/Sintashta samples out of about 100 carry R160W. Thus, a few, perhaps 1%, of the Andronovo and Sintashta people almost certainly had red hair.

Though very rare, R160W and other red hair variants do exist in South and Central Asia today. Several South Asians from the 1000 Genomes dataset carry R160W (see here). A Pathan or Pashtun from the HGDP dataset is predicted to have red hair by HIrisPlex-S. There is little doubt that this is associated with their Andronovo ancestry.

Main data sources...

Lazaridis et al. 2016

Lipson et al. 2017

Mathieson et al. 2018

Narasimhan et al. 2018

Olalde et al. 2018

See also...

Yamnaya isn't from Iran just like R1a isn't from India

Thursday, June 21, 2018

A potentially violent end to the Kura-Araxes Culture (Alizadeh et al. 2018)

The Kura-Araxes Culture dominated large parts of West Asia during the Early Bronze Age. It's generally accepted that the peoples associated with this archaeological phenomenon were speakers of early Hurro-Urartian dialects, and that they eventually morphed into the Hurrians and other related groups across the northern Near East.

However, it has also been hypothesized that in and around the Caucasus Mountains they were harried and even violently displaced by invaders pushing down from the Pontic-Caspian steppe in Eastern Europe.

A new paper at AJA Online by Alizadeh et al. explores this angle in detail for a Kura-Araxes site at Nadir Tepesi in the Mughan Steppe, Iranian Azerbaijan, and concludes that it's a very plausible scenario indeed (open access here). Also worth noting in this context, I'd say, is my own recent discovery based on ancient DNA of the rather obvious signals of Yamnaya-related incursions into an area of what is now northwestern Iran not far from the Mughan Steppe (see here). From the paper, emphasis is mine:

By the late fourth to early third millennium B.C.E., Kura-Araxes (Early Transcaucasian) material culture spread from the southern Caucasus throughout much of southwest Asia. The Kura-Araxes settlements declined and ultimately disappeared in almost all the regions in southwest Asia around the middle of the third millennium B.C.E. The transition to the “post–Kura-Araxes” time in the southern Caucasus is one of the most tantalizing subjects in the archaeology of the region. Despite current knowledge on the origins and spread of the Kura-Araxes culture, little is known about the end of this cultural horizon. In this field report, we argue that the Kura-Araxes culture in the western Caspian littoral plain ended abruptly and possibly violently. To demonstrate this, we review the current hypotheses about the end of the Kura-Araxes culture and use results from excavations at Nadir Tepesi in Iranian Azerbaijan.


Following the decline of the relatively dense distribution of the Kura-Araxes settlements, some striking transformations are reflected in material culture. These include a large reduction in the number of settlements, an increase in burial sites, the appearance of collective burials and impressive royal kurgans, increased mobility, and changes in ceramic traditions (i.e., the appearance of Martkopi-Bedeni ceramics). In addition, there was a clear increase in metalwork, especially in the gold and silver attested mostly in rich burials. [10] To some scholars, all these transformations suggest the arrival of new groups of people with a new lifestyle based on transhumant pastoralism. [11]


We postulate that around the mid third millennium B.C.E. Nadir Tepesi was abandoned by the Kura-Araxes community. The end of the Kura-Araxes occupation in TTB and TTC is marked by a characteristic red-orange deposit that suggests a large-scale fire. It is unknown whether the destruction covers the whole settlement or is limited to its southwestern portion. However, it is hard to imagine that the fire was accidental since it represents the end of the Kura-Araxes occupation and an abrupt change in the cultural sequence at the site. The last Kura-Araxes occupational layer was immediately followed by a completely different archaeological repertoire. The thick destruction level followed immediately by a decisive break in the material culture suggests a violent end to the Kura-Araxes community at the site.


Tracing population movement and identifying evidence of migration are major methodological challenges for archaeologists. [49] On one hand, Puturidze argues that there is no evidence supporting the notion of a migration of people into the southern Caucasus. [50] Rather, she associates all the changes in the post–Kura-Araxes period with influences from Near Eastern societies as a result of developing interactions by the end of the third millennium B.C.E. On the other hand, Kohl hypothesizes the possibility of a “push-pull process” [51] in which new groups of people with wheeled carts and oxen-pulled wagons gradually moved from the steppes of the north into the southern Caucasus, and the Kura-Araxes communities subsequently moved farther south. [52]

Kohl also reminds us of the evidence of increased militarism from the Early to the Late Bronze Age that is reflected in more fortified sites, new weaponry, and an iconography of war as seen on the Karashamb Cup. [53] The appearance of defensive mechanisms such as fortification walls, which can be seen at Köhne Shahar, a Kura-Araxes settlement near Chaldran in Iranian Azerbaijan, further emphasizes the increase of inter-group conflicts and militarism during the Early Bronze Age, before the Kura-Araxes culture came to an end. [54] Kohl argues that, while the number of Kura-Araxes settlements decreased in the southern Caucasus, archaeological research indicates that the Kura-Araxes culture spread to western Iran in the Zagros region and to the Levant. [55] In Kohl’s view, as new groups of people moved in, the Kura-Araxes communities abandoned the southern Caucasus and moved farther south, where some of them already resided.


We believe that the evidence supports a less uniform scenario. The Kura-Araxes culture may have disappeared in various ways; the transition to the post–Kura-Araxes time may not be explained by a single model. Different Kura-Araxes settlements may have ended differently. The evidence from Nadir Tepesi could support a violent end at that site, and it is possible that similar evidence will be found at other sites in the Mughan Steppe. At some sites, such as Köhne Tepesi in the Khoda Afarin Plain, [58] the Kura-Araxes occupation also ended abruptly but without any sign of destruction. In other regions, there may be evidence supporting the coexistence of newcomers with Kura-Araxes communities for some period. [59]

Alizadeh et al., The End of the Kura-Araxes Culture as Seen from Nadir Tepesi in Iranian Azerbaijan, American Journal of Archaeology Vol. 122, No. 3 (July 2018), pp. 463–477, DOI: 10.3764/aja.122.3.0463

See also...

Yamnaya isn't from Iran just like R1a isn't from India

Tuesday, June 19, 2018

An exploration of distance-based models of language relationships with a special focus on Indo-European (Kozintsev 2018)

The latest edition of the Journal of Indo-European Studies includes an interesting methodological paper by Alexander Kozintsev, in which the author tests the relationship between Indo-European and other language families using lexicostatistical data and a wide range of distance-based models (see here). My impression, after reading the paper a couple of times, is that we probably have a long way to go before someone comes up with a robust enough way to study languages with these sorts of methods, which are more widely used for the classification of living things.

However, note that Kozintsev's results are very consistent in placing Indo-European, including Hittite (HIT in the figure below), significantly closer to Uralic than to any of the language families south of the Caucasus. This is in line with the general consensus amongst historical linguists working with more traditional methods of studying languages, and, if true, has significant implications for the search for the Proto-Indo-European (PIE) homeland. Why? Because it's very difficult to imagine the PIE homeland being located anywhere south of the Caucasus considering the present-day distribution and likely homeland of Uralic languages well to the north of this region. Emphasis is mine:

The paper explores the informative potential of various distance-based methods of language classification such as cluster analysis, networks, and two-dimensional projections, using lexicostatistical data on 41 languages belonging to seven families (IE, Uralic, Altaic, Yupik-Chukchee, Kartvelian, Semitic, and North Caucasian) represented in the STARLING database. Rooting and weighting are of critical importance, radically affecting the graphic models. Special focus is made on two-dimensional charts generated by the multidimensional scaling and on the little-used minimum spanning tree method. The latter two techniques are employed to test the hybridization/ Sprachbund theory of Indo-European origins. The “Semitic” tendency of IE relative to Uralic is significant whereas neither the “Kartvelian” tendency nor the North Caucasian substratum hypothesis are supported by the two-dimensional models.


Finally, having come full circle, we return to our working hypothesis––that IE is closer to Uralic than to any of the “southern” families. I did not test this assumption because it appeared almost self-evident; now it can be easily tested by the same analysis. But, in fact, even statistical testing is unnecessary, because the triangle data cited above speak for themselves. IE, according to these data, is 20.8% closer to Uralic than to West Caucasian; 18.4% closer to Uralic than to East Caucasian; 13.7% closer to Uralic than to Kartvelian; and 16.9% closer to Uralic than to Semitic. Given the statistical reliability of a 5.6% difference (see above), all these values are highly significant a fortiori.

Kozintsev, Alexander, On Certain Aspects of Distance-based Models of Language Relationships, with Reference to the Position of Indo-European among other Language Families, Journal of Indo-European Studies, Vol. 46, 2018, No. 1 & 2, pp. 1-264
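For anyone curious about the little-used minimum spanning tree method mentioned in the abstract, here's a minimal sketch of how such a tree can be built from a matrix of pairwise distances. It uses Prim's algorithm, which is my choice for illustration and not necessarily what Kozintsev used. The labels A to D and all of the distances are invented placeholders, not values from the paper or the STARLING database.

```python
def minimum_spanning_tree(labels, dist):
    """Build a minimum spanning tree with Prim's algorithm.

    labels: list of node names.
    dist:   symmetric matrix (list of lists) of pairwise distances.
    Returns a list of (node_a, node_b, distance) edges.
    """
    n = len(labels)
    in_tree = {0}          # start from the first label
    edges = []
    while len(in_tree) < n:
        best = None
        # Find the shortest edge linking the tree to a node outside it.
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or dist[i][j] < best[2]):
                    best = (i, j, dist[i][j])
        i, j, d = best
        in_tree.add(j)
        edges.append((labels[i], labels[j], d))
    return edges


# Placeholder language families and made-up distances (NOT real data).
labels = ["A", "B", "C", "D"]
dist = [
    [0.00, 0.70, 0.80, 0.85],
    [0.70, 0.00, 0.90, 0.95],
    [0.80, 0.90, 0.00, 0.88],
    [0.85, 0.95, 0.88, 0.00],
]
edges = minimum_spanning_tree(labels, dist)
for a, b, d in edges:
    print(a, "-", b, d)
```

The tree links every family using the smallest possible total distance, so each family ends up attached to whichever neighbor gives the cheapest connection. In this toy matrix, B, C and D all hang off A.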

Saturday, June 16, 2018

Yamnaya isn't from Iran just like R1a isn't from India

A strange thing sometimes happens in population genetics: highly capable and experienced researchers come up with stupid ideas and push them so hard that, despite all the evidence to the contrary, they become accepted as truths. At least for a little while.

It's obvious now, thanks to full genome sequencing and ancient DNA, that Y-chromosome haplogroup R1a cannot be native to India. It arrived there rather recently from the Eurasian steppe, in all likelihood during the Bronze Age, probably as the Indus Valley Civilization (IVC) was collapsing or, perhaps, just after it had collapsed.

But for quite a few years this was something of a taboo, even politically incorrect, narrative, and it was vehemently rubbished by many Indians, including Indian scientists, and their western academic sympathizers.

Indeed, a whole series of papers came out, often in highbrow scientific journals, claiming that R1a originated in South Asia, and that it spread from there to Europe. This, it was also claimed, was the final nail in the coffin of the so-called Aryan Invasion Theory (AIT), because R1a was often described as the "Aryan" haplogroup.

I wasn't impressed by any of this nonsense. I said so here and elsewhere, to the great annoyance of those who believed, against all reason and logic, that the Indo-Aryans, and even Indo-Europeans, were indigenous to India. Here's a taste of some of my work on the topic going back to 2013.

South Asian R1a in the 1000 Genomes Project

Children of the Divine Twins

The Poltavka outlier

Looking back, it's all a bit rough, but very cool nonetheless. However, I was often accused of being biased, unscientific and even bigoted and racist as a result of offering such commentary and research. Make no mistake, my detractors were seething that I would dare to question what was apparently a scientific reality, and they wanted to shut me up. It was a nasty experience, but it now feels great to be vindicated.

Certainly, nowadays, no objective person who, more or less, knows their stuff would argue that the vast majority of the R1a in India doesn't ultimately derive from the Pontic-Caspian steppe in Eastern Europe.

But otherwise things haven't changed all that much since then. For instance, despite a whole heap of ancient DNA data being available from Eastern Europe and West Asia, there's a widely accepted idea that the Early Bronze Age (EBA) Yamnaya culture formed on the Pontic-Caspian steppe as a result of migrations from what is now Iran.

This is not true. It can't be true, because it's contradicted by all of the data. I've tried to explain this on several occasions, but generally to no avail.

Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic

Another look at the genetic structure of Yamnaya

Likely Yamnaya incursion(s) into Northwestern Iran

Thus, the Yamnaya people and culture were indigenous to Eastern Europe, and basically formed as a result of the amalgamation of at least three different populations closely related to Eastern European Hunter-Gatherers (EHG), Caucasus Hunter-Gatherers (CHG), Early European Farmers (EEF) and Western European Hunter-Gatherers (WHG). They did not harbor any significant ancestry from what is now Iran; at least not from within any reasonable time frame.

However, my communicating this fact has resulted in some rather strange and unsavory reactions from a number of individuals who appear to have a big emotional investment in this issue. They become frustrated and even angry when I try to explain to them that there's no sense in looking for the genetic origins of Yamnaya in Iran, much like the people who argued with me when I tried to reason with them that R1a wasn't native to India. Here's an example from a recent blog post (for the full conversation scroll down to the comments here).

Heh, here we go again with the accusations of bias, scientific impropriety and whatnot. Ironically, the poor chap just couldn't comprehend that he never had an argument to begin with, quite obviously due to his own bias in regards to this topic. Well, at least he didn't call me a racist.

In a recent preprint, Wang et al. correctly characterized Yamnaya as, by and large, a mixture of populations closely related to EHG, CHG, EEF and WHG (see here), with no obvious input from what is now Iran. Sounds familiar, right?

They also discovered that, during the Chalcolithic and Bronze Age, the Caucasus and nearby steppes were mainly home to three quite distinct populations: 1) Steppe groups, including Eneolithic steppe and Caucasus Yamnaya, 2) Caucasus groups, including Kura-Araxes and Maykop, and 3) Steppe Maykop, which they classified as part of 1. These populations were all separated by clear genetic and cultural borders, with significant and unambiguous mixture from the Caucasus cluster only in a couple of Steppe Maykop outliers and one Yamnaya outlier from what is now Ukraine.

Clearly, this leaves no room for any migrations from what is now Iran to the steppe that would potentially give rise to Yamnaya. In other words, the main genetic ingredients for what was to become Yamnaya were already on the steppe well before Yamnaya, during the Eneolithic, and it's quite likely that they were indigenous to the region.

However, interestingly, Wang et al. did appear to try to save the link between Yamnaya and Iran by referring to the CHG-related ancestry in Yamnaya as "CHG/Iranian". I'm not surprised because most of these authors are associated with the Max Planck Institute for the Science of Human History (MPI-SHH), which is currently pushing a proposal that the Proto-Indo-European (PIE) homeland was located in what is now Iran and surrounds (see here). So, obviously, they need to somehow show a relationship between Yamnaya and Iran, because Yamnaya and the closely related Corded Ware archaeological complex are generally seen as early Indo-European cultural horizons. Good luck with that.

Actually, let me make it clear once and for all that I couldn't care less where the very first Indo-European words were uttered. It's just something that I find interesting. I rather doubt that this was within the borders of present-day Iran, and I explained in some detail why in a post almost two years ago (see here). But if someone manages to prove that the PIE homeland was indeed located partly or wholly within what is now Iran, that's OK. I won't be emotionally traumatized as a result.

However, obviously, this will have to be done with the assumption in mind that Yamnaya and Corded Ware became Indo-European-speaking almost purely via linguistic transmission, with hardly any associated gene flow. It's possible, I guess. But then there's almost 200 years of scholarship based on linguistics and archaeological data that generally agrees in favor of the Pontic-Caspian steppe as the PIE homeland.

On a related note, I also couldn't care less whether the Aryan Invasion Theory (AIT) reflects what really happened during the Indo-Europeanization of South Asia, or if it's more appropriate to call it the Aryan Migration Theory (AMT). I'll accept whatever an objective analysis of all of the relevant data shows when we have enough of it to make an informed decision.

However, currently, I see nothing in the data that would prevent the AIT from being true. To me, the profound impact that the Bronze Age steppe peoples obviously had on South Asia, and especially on the Indo-European-speaking Indian upper castes, suggests that, overall, an invasion-like scenario is quite plausible. But I might be wrong, and so what if I am?

See also...

Ahead of the pack

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Genetic borders are usually linguistic borders too

Tuesday, June 12, 2018

Dali_EBA and West_Siberia_N in qpGraph

Below is a qpGraph tree that I've been working on for a while. I'll be posting more output from random analyses like this from now on. The relevant graph file is available in the zip folder here. Any ideas what else can be done with this topology?

See also...

Graeco-Aryan parallels

Friday, June 8, 2018

Of horses and men

Y-HT-1 is today by far the most common Y-chromosome haplogroup in domesticated horse breeds. According to Wutke et al. 2018, this is probably the result of artificial, human induced selection for this lineage, initially on the Eurasian steppe during the Iron Age, and then subsequently in Europe during the Roman period (see here).

However, during the Bronze and Iron Ages, before Y-HT-1 reached fixation, another very important Y-haplogroup in domesticated horses was its older sister clade Y-HT-4.

Indeed, it's likely that both Y-HT-1 and Y-HT-4 first dominated the domesticated horse gene pool during the Bronze Age, probably because they happened to have been present in the horse population exploited by the early Indo-Europeans. This was missed, or at least not directly discussed by Wutke et al., but I'd say it's a fairly obvious conclusion that can be drawn from their data, especially if we consider the fact that horses are the most important animal in the Indo-European pantheon.

Thus, the story of Y-HT-1 and, up to a point, Y-HT-4 is probably very similar to that of two human Y-haplogroups, R1a-M417 and R1b-M269. Both of these lineages also rose to prominence rather suddenly during the Eneolithic and Bronze Age, in all likelihood because they were present amongst early Indo-European-speaking males (see here).

Below is a map of the earliest reliably called and dated instances of Y-HT-1, Y-HT-4, R1a-M417 and R1b-M269 in the ancient DNA record. Not surprisingly, all of the points on the map are located on or very close to the Pontic-Caspian steppe, which is generally accepted to have been the Proto-Indo-European homeland. Fascinating stuff.

See also...

Central Asia as the PIE urheimat? Forget it

Cultural hitchhiking and competition between patrilineal kin groups may have led to the post-Neolithic Y-chromosome bottleneck (Zeng et al. 2018)

Was Ukraine_Eneolithic I6561 a Proto-Indo-European?

Thursday, May 31, 2018

What's Maykop (or Iran) got to do with it? #2

For the past few days I've been trying to copy and also improve on the qpGraph tree in the Wang et al. preprint (see here). I've managed to come up with a new version of my model that not only offers a better statistical fit, but, in my opinion, also a much more sensible solution. For instance, the Eastern Hunter-Gatherer node now shows 73% MA1-related admixture, which, I'd say, makes more sense than the 10% in the previous version. The relevant graph file is available here.

Samara Yamnaya can be perfectly substituted in this graph by early Corded Ware samples from the Baltic region (CWC_Baltic_early) and a pair of Yamnaya individuals from what is now Ukraine. This is hardly surprising, considering how similar all of these samples are to each other in other analyses, but it's nice to see nonetheless, because I think it helps to confirm the reliability of my model.

And yes, I have tested all sorts of other Yamnaya-related ancient and present-day populations with this tree. They usually pushed the worst Z score to +/- 3 and well beyond, probably because they weren't similar enough to Yamnaya. But, perhaps surprisingly, Bell Beakers from Britain produced a decent result (see here).
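As a rough illustration of what the "worst Z score" means, here's a minimal sketch. It assumes each f-statistic comes with an observed value, a value fitted by the graph, and a standard error, and that Z is (fitted - observed) / standard error. The sign convention, the labels and all of the numbers are hypothetical. None of them come from my graph files or from qpGraph output.

```python
def worst_z_score(stats):
    """Return (label, Z) for the f-statistic with the largest |Z|.

    stats: iterable of (label, observed, fitted, std_err) tuples.
    Z is taken as (fitted - observed) / std_err, which is an assumed
    sign convention for this sketch.
    """
    scored = [(label, (fitted - observed) / se)
              for label, observed, fitted, se in stats]
    return max(scored, key=lambda item: abs(item[1]))


# Hypothetical f4-statistics (NOT real output).
example = [
    ("f4(A,B;C,D)", 0.0010, 0.0012, 0.0002),
    ("f4(A,C;B,D)", -0.0030, -0.00125, 0.0005),
    ("f4(A,D;B,C)", 0.0040, 0.0038, 0.0004),
]
label, z = worst_z_score(example)
verdict = "poor fit" if abs(z) >= 3 else "acceptable"
print(label, round(z, 2), verdict)
```

A graph is only as good as its worst-fitting statistic, which is why a single value at or beyond +/- 3 is enough to cast doubt on the whole topology.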

See also...

Ahead of the pack

On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint)

Another look at the genetic structure of Yamnaya

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Friday, May 25, 2018

Cultural hitchhiking and competition between patrilineal kin groups may have led to the post-Neolithic Y-chromosome bottleneck (Zeng et al. 2018)

A very interesting paper has just appeared at Nature Communications that potentially offers an explanation for the well documented explosions of certain Y-chromosome lineages in the Old World after the Neolithic, such as those that led to most European males today belonging to Y-haplogroups R1a and R1b (LINK). I might have more to say about this paper in the comments below after I've read it a couple of times. Emphasis is mine:

In human populations, changes in genetic variation are driven not only by genetic processes, but can also arise from cultural or social changes. An abrupt population bottleneck specific to human males has been inferred across several Old World (Africa, Europe, Asia) populations 5000–7000 BP. Here, bringing together anthropological theory, recent population genomic studies and mathematical models, we propose a sociocultural hypothesis, involving the formation of patrilineal kin groups and intergroup competition among these groups. Our analysis shows that this sociocultural hypothesis can explain the inference of a population bottleneck. We also show that our hypothesis is consistent with current findings from the archaeogenetics of Old World Eurasia, and is important for conceptions of cultural and social evolution in prehistory.


If the primary unit of sociopolitical competition is the patrilineal corporate kin group, deaths from intergroup competition, whether in feuds or open warfare, are not randomly distributed, but tend to cluster on the genealogical tree of males. In other words, cultural factors cause biases in the usually random process of transmission of Y-chromosomes, increasing the rate of loss of Y-chromosomal lineages and accelerating genetic drift. Extinction of whole patrilineal groups with common descent would translate to the loss of clades of Y-chromosomes. Furthermore, as success in intergroup competition is associated with group size, borne out empirically in wars [43] as ‘increasing returns at all scales’ [44], and as larger group size may even be associated with increased conflict initiation, borne out in data on feuds [45], there may have been positive returns to lineage size. This would accelerate the loss of minor lineages and promote the spread of major ones, further increasing the speed of genetic drift.

In addition, the assimilation of women from groups that are disrupted or extirpated through intergroup competition into remaining groups is a common result of warfare in small-scale societies [46]. This, together with female exogamy, would tend to limit the impact of intergroup competition to Y-chromosomes.


Figure 6 shows a striking pattern of differences in shallowness of coalescence in samples from hunter-gatherer, farmer and pastoralist cultures. While hunter-gatherer Y-chromosomes from the same culture, and often the same sites, commonly divide into haplotypes that coalesce in multiple millennia, Y-chromosomes of samples from farmer and pastoralist cultures are more homogeneous and have more recent coalescences. The Bell Beaker culture has a high proportion of sampled males (81%) from a large geographical area (Iberia to Hungary) who belong to an identical Y-chromosomal haplogroup (R1b-S116), implying common descent from a kin group that existed quite recently. Some groups of males share even more recent descent, on the order of ten generations or fewer [64]. Such recent common descent may even be retained in cultural memory via oral genealogies, such as among descent groups in Northern and Western Africa, whose members can trace descent relationships up to three to four centuries before the generation currently living [40]. Likewise, from Germany to Estonia, the Y-chromosomes of all Corded Ware individuals sampled, except one, belong to a single clade within haplogroup R1a (R1a-M417) and appear to coalesce shortly before sample deposition.

Thus, groups of males in European post-Neolithic agropastoralist cultures appear to descend patrilineally from a comparatively smaller number of progenitors when compared to hunter gatherers, and this pattern is especially pronounced among pastoralists. Our hypothesis would predict that post-Neolithic societies, despite their larger population size, have difficulty retaining ancestral diversity of Y-chromosomes due to mechanisms that accelerate their genetic drift, which is certainly in accord with the data. The tendency of pastoralist cultures to show the lowest Y-chromosomal diversity and the shallowest coalescence would also be explained, as they may have experienced the social conditions that characterized cultures of the Central Asian steppes [42]. Indeed, the Corded Ware pastoralists may have been organized into segmentary lineages [65], an extremely common tribal system among pastoralist cultures, including those of historical Central Asia [66].

Zeng et al., Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck, Nature Communications volume 9, Article number: 2077 (2018) doi:10.1038/s41467-018-04375-6
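To get a feel for the argument that deaths clustered by kin group speed up the loss of Y-chromosome lineages, here's a toy simulation. It is not the model from Zeng et al. The group counts, group size, number of generations and replacement rules are all my own simplifying assumptions. The same number of males die each generation in both scenarios. The only difference is whether the deaths hit one whole group or are scattered at random.

```python
import random


def surviving_lineages(n_groups=20, group_size=10, generations=100,
                       clustered=True, seed=1):
    """Count distinct Y lineages left after repeated rounds of male deaths.

    Toy assumptions (not from Zeng et al.): every group starts as its own
    lineage, one group's worth of males dies per generation, and the
    population size is held constant.
    """
    rng = random.Random(seed)
    # groups[g] holds the lineage ids carried by the males of group g.
    groups = [[g] * group_size for g in range(n_groups)]
    for _ in range(generations):
        if clustered:
            # Intergroup competition: one whole kin group is wiped out and
            # replaced by a daughter group split off from another group.
            loser, winner = rng.sample(range(n_groups), 2)
            groups[loser] = list(groups[winner])
        else:
            # Same number of deaths, scattered at random; each dead male is
            # replaced by a copy of a randomly chosen male's lineage.
            for _ in range(group_size):
                g, i = rng.randrange(n_groups), rng.randrange(group_size)
                h, j = rng.randrange(n_groups), rng.randrange(group_size)
                groups[g][i] = groups[h][j]
    return len({lineage for group in groups for lineage in group})


if __name__ == "__main__":
    print("clustered deaths:", surviving_lineages(clustered=True))
    print("random deaths:   ", surviving_lineages(clustered=False))
```

Under these settings the clustered scenario should tend to end up with fewer surviving lineages than the random one, although the exact counts depend on the seed and the parameters.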

Update 30/05/2018: For those clued in, here's an awesome quote from the relevant press release.

The outlines of that idea came to Tian Chen Zeng, a Stanford undergraduate in sociology, after spending hours reading blog posts that speculated - unconvincingly, Zeng thought - on the origins of the "Neolithic Y-chromosome bottleneck," as the event is known.

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, May 24, 2018

What's Maykop (or Iran) got to do with it?

I had a go at imitating this qpGraph tree, from the recent Wang et al. preprint on the genetic prehistory of the Caucasus, using the ancient samples that were available to me. I'm very happy with the outcome, because everything makes good sense, more or less. The real populations and singleton individuals, ten in all, are marked in red. The rest of the labels refer to groups inferred from the data.

However, this is still a work in progress, and, if possible, I'd like to simplify the model and also get the worst Z score much closer to zero. If anyone wants to help out, the graph file is available HERE. Feel free to post your own versions in the comments, and I'll run them for you as soon as I can.

Update 31/05/2018: I've managed to come up with a new version of my model that not only offers a better statistical fit, but, in my opinion, also a much more sensible solution. For instance, the Eastern Hunter-Gatherer node now shows 73% MA1-related admixture, which, I'd say, makes more sense than the 10% in the previous version. The relevant graph file is available here.

For more details and a discussion about the updated model, including additional trees with Baltic Corded Ware and British Beaker samples, please check out my new thread on the topic at the link below.

What's Maykop (or Iran) got to do with it? #2


Wang et al., The genetic prehistory of the Greater Caucasus, bioRxiv, posted May 16, 2018, doi:

See also...

Ahead of the pack

On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint)

Another look at the genetic structure of Yamnaya

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Wednesday, May 23, 2018

More Botai genomes (Jeong et al. 2018 preprint)

Over at bioRxiv at this LINK. Actually, these may or may not be the same Botai genomes that have already been published along with Damgaard et al. 2018 (see comments below for the discussion about that). Here's the abstract. Emphasis is mine:

The indigenous populations of inner Eurasia, a huge geographic region covering the central Eurasian steppe and the northern Eurasian taiga and tundra, harbor tremendous diversity in their genes, cultures and languages. In this study, we report novel genome-wide data for 763 individuals from Armenia, Georgia, Kazakhstan, Moldova, Mongolia, Russia, Tajikistan, Ukraine, and Uzbekistan. We furthermore report genome-wide data of two Eneolithic individuals (~5,400 years before present) associated with the Botai culture in northern Kazakhstan. We find that inner Eurasian populations are structured into three distinct admixture clines stretching between various western and eastern Eurasian ancestries. This genetic separation is well mirrored by geography. The ancient Botai genomes suggest yet another layer of admixture in inner Eurasia that involves Mesolithic hunter-gatherers in Europe, the Upper Paleolithic southern Siberians and East Asians. Admixture modeling of ancient and modern populations suggests an overwriting of this ancient structure in the Altai-Sayan region by migrations of western steppe herders, but partial retaining of this ancient North Eurasian-related cline further to the North. Finally, the genetic structure of Caucasus populations highlights a role of the Caucasus Mountains as a barrier to gene flow and suggests a post-Neolithic gene flow into North Caucasus populations from the steppe.

Jeong et al., Characterizing the genetic history of admixture across inner Eurasia, Posted May 23, 2018, doi:

See also...

New PCA featuring Botai horse tamers, Hun and Saka warriors, and many more...

Global25 workshop 2: intra-European variation

Even though the Global25 focuses on world-wide human genetic diversity, it can also reveal a lot of information about genetic substructures within continental regions.

Several of the dimensions, for instance, reflect Balto-Slavic-specific genetic drift. I ensured that this would be the case by running a lot of Slavic groups in the analysis. A useful by-product of this strategy is that the Global25 is very good at exposing relatively recent intra-European genetic variation.

To see this for yourself, download the datasheet below and plug it into the PAST program, which is freely available here. Then select all of the columns by clicking on the empty tab above the labels, and choose Multivariate > Ordination > Principal Components.


You should end up with the plot below. Note that to see the group labels and outlines, you need to tick the appropriate boxes in the panel to the right of the image. To improve the experience, it might also be useful to color-code different parts of Europe, and you can do that by choosing Edit > Row colors/symbols. Of course, if you have Global25 coordinates you can add yourself to the datasheet to see where you plot.

Components 1 and 2 pack the most information and, more or less, recapitulate the geographic structure of Europe. However, many details can only be seen by plotting the less significant components. For instance, a plot of components 1 and 3 almost perfectly separates Northeastern Europe into two distinct clusters made up of the speakers of Indo-European and Finno-Ugric languages.
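If you'd rather script this than click through PAST, here's a minimal sketch of the same idea in plain Python. It assumes a variance-covariance PCA (an assumption about the PAST settings, not something stated above), and the function names and toy coordinates are made up for illustration; they are not real Global25 values. Eigenvectors are found by simple power iteration, which is fine for a small sheet but not what a proper statistics package would use.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every column averages zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample variance-covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(mat, iters=500, seed=0):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    rng = random.Random(seed)
    d = len(mat)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:  # no variance left to explain
            return v, 0.0
        v = [x / norm for x in w]
    mv = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
    return v, sum(v[i] * mv[i] for i in range(d))


def pca(rows, n_components=3):
    """Return (scores, variances) for the first n_components components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    axes, variances = [], []
    for _ in range(n_components):
        vec, val = top_eigenpair(cov)
        axes.append(vec)
        variances.append(val)
        # Deflation: remove the component just found before looking for the next
        cov = [[cov[i][j] - val * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * ax[j] for j in range(d)) for ax in axes]
              for r in centered]
    return scores, variances


# Hypothetical toy sheet: six samples, three coordinate columns
toy_labels = ["S1", "S2", "S3", "S4", "S5", "S6"]
toy_rows = [
    [7.0, 1.5, 0.2], [8.0, -0.5, 0.2], [9.0, 0.5, -0.4],
    [11.0, 0.5, -0.4], [12.0, -0.5, 0.2], [13.0, 1.5, 0.2],
]

scores, variances = pca(toy_rows, n_components=3)
for label, s in zip(toy_labels, scores):
    # Components 1 and 3, as in the plot discussed above
    print(label, round(s[0], 3), round(s[2], 3))
```

Keep in mind that the sign of each component is arbitrary, so a scripted plot may come out mirrored relative to the one from PAST.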

This plot might also be useful for exploring potential Jewish ancestry, because Ashkenazi, Italian and Sephardi Jews appear to be relatively distinct in this space. Thus, people with significant European Jewish ancestry will "pull" towards the lower left corner of the plot. For example, someone who is half Ashkenazi and half German will probably land in the empty space between the Northwest Europeans and Jews.
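The half-and-half example can be sketched numerically. This assumes, as a simplification of my own, that a mixed individual's coordinates sit roughly at the weighted average of the two source populations' coordinates. The function name and the numbers are hypothetical, not real Global25 averages.

```python
def expected_position(coords_a, coords_b, share_a=0.5):
    """Weighted average of two coordinate vectors (simple linear-mixing assumption)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate vectors must have the same length")
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must be between 0 and 1")
    return [share_a * a + (1.0 - share_a) * b
            for a, b in zip(coords_a, coords_b)]


# Hypothetical two-dimensional positions of two reference groups on a plot
group_a = [0.0, 2.0]
group_b = [1.0, 0.0]

# A 50/50 individual is expected to land midway between the two groups
print(expected_position(group_a, group_b))
```

Real individuals won't follow this exactly, which is why "probably" is the operative word.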

See also...

Global25 workshop 1: that classic West Eurasian plot

Global25 PAST-compatible datasheets

Monday, May 21, 2018

Global25 workshop 1: that classic West Eurasian plot

In this Global25 workshop I'm going to show how to reproduce, more or less, that classic plot of West Eurasian genetic diversity seen regularly in ancient DNA papers and at this blog (for instance, here). To do this you'll need the datasheet below, which I'll be updating regularly, and the PAST program, which is freely available here.


This is what you'll get if you follow my instructions to the letter. Note the fairly strong correlation with geography. I think this is impressive for so many reasons.

OK, so, download the said datasheet, plug it into PAST, select columns 1 to 8, and go to Multivariate > Ordination > Principal Components. Here's a screen cap of me doing it:

The initial output won't resemble my plot above. So you'll need to place PC2 on the X axis, PC1 on the Y axis, and set the image size to 1206x706. After doing that, you should end up with exactly this:

Then, export the image, flip it horizontally with whatever imaging software can do the job, and that's it, unless you want to add some labels like I did. Feel free to ask questions and make suggestions in the comments below.
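For those who script their plots instead, the axis swap and the horizontal flip amount to a simple coordinate transformation. Here's a minimal sketch, assuming you already have (PC1, PC2) scores for each sample; the function name and the numbers are hypothetical.

```python
def arrange_for_plot(scores):
    """Put PC2 on the X axis and PC1 on the Y axis, then mirror the X axis.

    scores: list of (pc1, pc2) tuples. Returns a list of (x, y) tuples.
    Mirroring keeps the same X range, like flipping an exported image.
    """
    if not scores:
        return []
    xs = [pc2 for _, pc2 in scores]
    lo, hi = min(xs), max(xs)
    return [(lo + hi - pc2, pc1) for pc1, pc2 in scores]


# Hypothetical (PC1, PC2) scores for three samples
example = [(1.0, -2.0), (0.5, 4.0), (-1.0, 0.0)]
print(arrange_for_plot(example))
```

The flip is harmless because the sign of a principal component is arbitrary. Mirroring an axis changes the orientation of the plot, not the distances between samples.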

See also...

Global25 workshop 2: intra-European variation

Global25 PAST-compatible datasheets

Saturday, May 19, 2018

Global25 PAST-compatible datasheets

I'm planning to run regular workshops over the next few months on how to get the most out of Global25 data with various programs, and especially PAST (see here). So if you have Global25 coordinates, please stay tuned.

To that end, I've put together four color-coded, PAST-compatible Global25 datasheets with thousands of present-day and ancient samples, available at the links below:





PAST is an awesome little statistical program and simple to use. The manual is available here. To kick things off, here's a quick guide on how to run a Neighbor Joining tree on your Global25 coordinates:

- download the Global_25_PCA_pop_averages_scaled.dat from the last link above

- open the dat file with something a little more advanced than Windows Notepad, like, say, TextPad (see here)

- stick your scaled coordinates at the bottom of the sheet, so that they look exactly like those of the other samples, except give yourself an original symbol, like, say, a black star

- open the edited dat file with PAST and choose all of the columns and rows by clicking the empty tab above the labels

- then, at the top, go to Multivariate > Clustering > Neighbor joining

After a few seconds you should see a nice, color-coded tree like the one below, except you'll also be on it, in black text. I'm very happy with these results, by the way. As far as I can see, all of the populations and individuals cluster exactly where they should.
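To show what's going on under the hood, here's a minimal sketch of the classic neighbor joining algorithm (Saitou and Nei) in plain Python. It assumes Euclidean distances between coordinate vectors, which is my assumption rather than a statement about PAST's settings. The function names and toy coordinates are hypothetical. The output is an unrooted tree in Newick format.

```python
import math
from itertools import combinations


def euclidean_distances(coords):
    """coords: dict of label -> list of floats.

    Returns (labels, dist), where dist maps frozenset({i, j}) to a distance.
    """
    labels = list(coords)
    dist = {}
    for i, j in combinations(range(len(labels)), 2):
        a, b = coords[labels[i]], coords[labels[j]]
        dist[frozenset((i, j))] = math.sqrt(
            sum((x - y) ** 2 for x, y in zip(a, b)))
    return labels, dist


def neighbor_joining(labels, dist):
    """Return a Newick string for the unrooted neighbor joining tree."""
    if len(labels) < 3:
        raise ValueError("need at least three samples")
    dist = dict(dist)  # work on a copy
    newick = {i: name for i, name in enumerate(labels)}
    active = list(range(len(labels)))
    next_id = len(labels)

    def d(i, j):
        return dist[frozenset((i, j))]

    while len(active) > 3:
        n = len(active)
        total = {i: sum(d(i, j) for j in active if j != i) for i in active}
        # Join the pair with the smallest Q value
        a, b = min(combinations(active, 2),
                   key=lambda p: (n - 2) * d(*p) - total[p[0]] - total[p[1]])
        len_a = 0.5 * d(a, b) + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d(a, b) - len_a
        new = next_id
        next_id += 1
        # Distances from the new internal node to every remaining node
        for c in active:
            if c != a and c != b:
                dist[frozenset((new, c))] = 0.5 * (d(a, c) + d(b, c) - d(a, b))
        newick[new] = f"({newick[a]}:{len_a:.4f},{newick[b]}:{len_b:.4f})"
        active = [c for c in active if c != a and c != b] + [new]

    # The last three nodes meet at a single internal node
    i, j, k = active
    li = 0.5 * (d(i, j) + d(i, k) - d(j, k))
    lj = 0.5 * (d(i, j) + d(j, k) - d(i, k))
    lk = 0.5 * (d(i, k) + d(j, k) - d(i, j))
    return f"({newick[i]}:{li:.4f},{newick[j]}:{lj:.4f},{newick[k]}:{lk:.4f});"


# Hypothetical toy coordinates, not real Global25 values
toy = {
    "Pop_A": [0.10, 0.02], "Pop_B": [0.11, 0.03],
    "Pop_C": [-0.05, 0.08], "You": [0.09, 0.02],
}
labels, dist = euclidean_distances(toy)
print(neighbor_joining(labels, dist))
```

Note that neighbor joining can produce slightly negative branch lengths on noisy data, so don't be alarmed if you see one. The Newick string can be pasted into any tree viewer that accepts the format.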

Those of you who are already very proficient in using PAST, feel free to go nuts with these new datasheets and show us the results in the comments below. I'll try to put together a workshop for beginners within the next couple of weeks.

See also...

Global25 workshop 1: that classic West Eurasian plot

Global25 workshop 2: intra-European variation

Wednesday, May 16, 2018

On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint)

Finally, the focus shifts to the Eneolithic/Bronze Age North Caucasus. In a new manuscript at bioRxiv, Wang et al. present genome-wide SNP data for 45 prehistoric individuals from the region along a 3000-year temporal transect (see here). From the preprint (emphasis is mine):

Based on PCA and ADMIXTURE plots we observe two distinct genetic clusters: one cluster falls with previously published ancient individuals from the West Eurasian steppe (hence termed ‘Steppe’), and the second clusters with present-day southern Caucasian populations and ancient Bronze Age individuals from today’s Armenia (henceforth called ‘Caucasus’), while a few individuals take on intermediate positions between the two. The stark distinction seen in our temporal transect is also visible in the Y-chromosome haplogroup distribution, with R1/R1b1 and Q1a2 types in the Steppe and L, J, and G2 types in the Caucasus cluster (Fig. 3A, Supplementary Data 1). In contrast, the mitochondrial haplogroup distribution is more diverse and almost identical in both groups (Fig. 3B, Supplementary Data 1).

Thus, the most important "Indo-European" Y-haplogroups today, R1a-M417 and R1b-M269, did not arrive in Europe from the Caucasus or Near East. They're native to Europe. Hence, it appears that Eneolithic/Bronze Age Eastern Europeans mostly acquired their Near Eastern-related ancestry via female exogamy from populations in the Caucasus. That's basically what I've been arguing for a few years now. It feels good to be vindicated, especially considering the unfair criticism that I was subjected to here and elsewhere because of expressing this opinion (for instance, see here).

However, as far as I can see, based on the samples in this preprint, neither the Caucasus Maykop nor steppe Maykop appears to be an unambiguous source of this southern admixture in ancient Eastern Europe. That's because the Caucasus Maykop mtDNA profile still looks somewhat off in this context, while steppe Maykop harbors West Siberian forager-related genome-wide ancestry that is practically absent in the Yamnaya and all other closely related peoples.

In any case, please note the happy coincidence that academia has finally caught up to this blog and managed to find European farmer-derived ancestry in Yamnaya:

Importantly, our results show a subtle contribution of both Anatolian farmer-related ancestry and WHG-related ancestry (Fig.4; Supplementary Tables 13 and 14), which was likely contributed through Middle and Late Neolithic farming groups from adjacent regions in the West. A direct source of Anatolian farmer-related ancestry can be ruled out (Supplementary Table 15). At present, due to the limits of our resolution, we cannot identify a single best source population. However, geographically proximal and contemporaneous groups such as Globular Amphora and Eneolithic groups from the Black Sea area (Ukraine and Bulgaria), which represent all four distal sources (CHG, EHG, WHG, and Anatolian_Neolithic) are among the best supported candidates (Fig. 4; Supplementary Tables 13,14 and 15).

Check out what I had to say about this issue exactly two years ago: Yamnaya = Khvalynsk + extra CHG + maybe something else. Not bragging, just making a point that I do know what I'm doing here, most of the time anyway.

Wang et al. conclude their preprint with, unfortunately I have to say, some downright bizarre comments with regard to the Proto-Indo-European (PIE) homeland debate. But I'll get back to that later, when the ancient data from this and forthcoming related papers are released online.


Wang et al., The genetic prehistory of the Greater Caucasus, bioRxiv, posted May 16, 2018, doi:

See also...

Ahead of the pack

Genetic borders are usually linguistic borders too

What's Maykop (or Iran) got to do with it?