
Tuesday, June 26, 2018

Genetic borders are usually linguistic borders too

Note the awesome correlation between the two maps below. The first map is mine. I posted it on this blog almost a year ago (see here). The second map is from the recent Wang et al. preprint (see here). Also note that the Steppe and Caucasus clusters as defined by Wang et al. are rich in Y-haplogroups R1 and J, respectively (see here).

Very cool indeed. But I'm still scratching my head, wondering why Wang et al. entertained the possibility in their conclusion that Indo-European languages diffused into the steppe from south of the Caucasus. That's because, as a rule, human genetic borders also represent linguistic borders, and major linguistic families are strongly associated with Y-haplogroups (for instance, see here).

See also...

Matters of geography

Likely Yamnaya incursion(s) into Northwestern Iran

Yamnaya isn't from Iran just like R1a isn't from India

Saturday, June 23, 2018

Guest post: we owe many of our genetic traits to ancient steppe pastoralists, but...

This is a guest post courtesy of Samuel Andrews, a regular commentator for several years at this blog. I did edit parts of the original text submitted to me, but these were just cosmetic changes. If you spot any issues with this article, feel free to complain to Samuel in the comments below.

Massive migrations of pastoralists from the Pontic-Caspian steppe in the 3rd millennium BC abruptly ended generations of genetic stability in Europe. These large-scale population movements spread Yamnaya and Yamnaya-related ancestry throughout most of the continent, and indeed also much of Eurasia. Moreover, they also carried a specific type of European farmer ancestry, which was picked up by the migrating herders just west of the steppe.

By the end of the 3rd millennium BC, people across long distances within Europe shared very recent ancestry from the Pontic-Caspian steppe and surrounds. As a result, far-flung locations in Europe were more closely connected than ever before.

Ancient genome-wide data suggest that these migrations also spread unique genetic traits, such as lactase persistence and fair hair (blond and red), that were once mostly restricted to a fairly limited region within Europe. However, some of these traits were originally derived not from the steppe herders but from the farmers, or rather agropastoralists, who lived just west of the steppe, such as the Globular Amphora and Funnel Beaker peoples. Indeed, this appears to be the case for both lactase persistence and fair hair.

In the 3rd millennium BC, both of these traits went from relative obscurity to widespread prominence across and beyond Europe. The Bell Beaker people, who dominated the western half of Europe during the Early Bronze Age (EBA), and the Sintashta people, who, soon after, lived just east of Europe's present-day border in the Trans-Ural steppe of what is now Russia, demonstrate this well.

The pre-Sintashta and pre-Beaker populations of present-day Russia and Britain, respectively, showed low frequencies of alleles associated with fair hair. Thus, the Beaker and Sintashta peoples took high frequencies of these alleles, which they both probably inherited from Eastern European farmers, to the opposite ends of Europe, and then, via the expansions of Sintashta-related Andronovo populations, also deep into Asia.

rs4988235 > lactase persistence
rs16891982 > light skin & blonde hair
rs12913832 > blue eyes & blonde hair
rs1805008 > red hair

When the Andronovo groups mixed with the indigenous inhabitants of Asia, the frequencies of most of these European-specific alleles among them were reduced, until they almost disappeared in the new populations that formed as a result of this mixture process. Nevertheless, they continued to exist wherever there was significant Andronovo ancestry, including in the Iron Age peoples of the Swat Valley in what is now Pakistan:

Swat Valley, Udegram_IA, S8195.E1.L1; allele T at rs1805008 (aka R160W), the most common red hair variant among present-day Europeans.

Swat Valley; 17% derived allele frequency at rs16891982; 11% at rs12913832; twice as much as what Neolithic, Chalcolithic, and Bronze Age South Central Asians could boast.

Swat Valley, Barikot_IA, I6547; allele A at rs4988235 (aka 13910-T), the main lactase persistence mutation in both present-day Europe and South Asia.

Hence, 13910-T is a direct link between the populations of the Iron Age Swat Valley and ancient Europe. Indeed, Ukraine_Eneolithic I6561, from a burial associated with the Sredny Stog II archaeological culture in the North Pontic steppe, present-day Ukraine, is the oldest UDG-treated sample in the ancient DNA record to date to show 13910-T. And he's also the oldest individual to belong to Y-chromosome haplogroup R1a-M417, which is today one of the most common Y-chromosome haplogroups in both Eastern Europe and South Asia, especially among the speakers of Indo-European languages.

After its expansion from Eastern Europe, 13910-T was heavily selected for both across most of Europe and in large parts of South Asia. Today, its frequencies vary significantly by region and ethnic group in India. By and large, much like R1a-M417, it’s more common in Indo-European-speaking North Indians than Dravidian-speaking South Indians. But it clearly peaks in Indian pastoralist populations that consume a lot of dairy products. They carry 13910-T at frequencies equal to those seen in many European groups.

The red hair variant R160W, seen in Swat Valley sample S8195.E1.L1, is another direct link between ancient South Asia and Europe. Just like 13910-T, R160W has been shown to have been present in Europe at least 2,000 years before the Andronovo Culture. Balkans_ChL I2423, a sample from what is now Bulgaria, carries R160W. This individual dates to 4400 BC, so he's of a similar age to Ukraine_Eneolithic I6561, the above-mentioned early 13910-T carrier from the North Pontic steppe.

Other than that, as things stand, R160W is absent from pre-Kurgan Europe. In the aftermath of the steppe migrations, around 2400-1800 BC, R160W pops up in many places in Europe just like 13910-T does. Four Andronovo/Sintashta samples out of about 100 carry R160W. Thus, a few, perhaps 1%, of the Andronovo and Sintashta people almost certainly had red hair.

Though very rare, R160W and other red hair variants do exist in South and Central Asia today. Several South Asians from the 1000 Genomes dataset carry R160W (see here). A Pathan or Pashtun from the HGDP dataset is predicted to have red hair by HIrisPlex-S. There is little doubt that this is associated with their Andronovo ancestry.

Main data sources...

Lazaridis et al. 2016

Lipson et al. 2017

Mathieson et al. 2018

Narasimhan et al. 2018

Olalde et al. 2018

See also...

Yamnaya isn't from Iran just like R1a isn't from India

Thursday, June 21, 2018

A potentially violent end to the Kura-Araxes culture (Alizadeh et al. 2018)

The Kura-Araxes Culture dominated large parts of West Asia during the Early Bronze Age. It's generally accepted that the peoples associated with this archaeological phenomenon were speakers of early Hurro-Urartian dialects, and that they eventually morphed into the Hurrians and other related groups across the northern Near East.

However, it has also been hypothesized that in and around the Caucasus Mountains they were harried and even violently displaced by invaders pushing down from the Pontic-Caspian steppe in Eastern Europe.

A new paper at the AJA Online by Alizadeh et al. explores this angle in detail for a Kura-Araxes site at Nadir Tepesi in the Mughan Steppe, Iranian Azerbaijan, and concludes that it's a very plausible scenario indeed (open access here). Also worth noting in this context, I'd say, is my own recent discovery based on ancient DNA of the rather obvious signals of Yamnaya-related incursions into an area of what is now northwestern Iran not far from the Mughan Steppe (see here). From the paper, emphasis is mine:

By the late fourth to early third millennium B.C.E., Kura-Araxes (Early Transcaucasian) material culture spread from the southern Caucasus throughout much of southwest Asia. The Kura-Araxes settlements declined and ultimately disappeared in almost all the regions in southwest Asia around the middle of the third millennium B.C.E. The transition to the “post–Kura-Araxes” time in the southern Caucasus is one of the most tantalizing subjects in the archaeology of the region. Despite current knowledge on the origins and spread of the Kura-Araxes culture, little is known about the end of this cultural horizon. In this field report, we argue that the Kura-Araxes culture in the western Caspian littoral plain ended abruptly and possibly violently. To demonstrate this, we review the current hypotheses about the end of the Kura-Araxes culture and use results from excavations at Nadir Tepesi in Iranian Azerbaijan.


Following the decline of the relatively dense distribution of the Kura-Araxes settlements, some striking transformations are reflected in material culture. These include a large reduction in the number of settlements, an increase in burial sites, the appearance of collective burials and impressive royal kurgans, increased mobility, and changes in ceramic traditions (i.e., the appearance of Martkopi-Bedeni ceramics). In addition, there was a clear increase in metalwork, especially in the gold and silver attested mostly in rich burials. [10] To some scholars, all these transformations suggest the arrival of new groups of people with a new lifestyle based on transhumant pastoralism. [11]


We postulate that around the mid third millennium B.C.E. Nadir Tepesi was abandoned by the Kura-Araxes community. The end of the Kura-Araxes occupation in TTB and TTC is marked by a characteristic red-orange deposit that suggests a large-scale fire. It is unknown whether the destruction covers the whole settlement or is limited to its southwestern portion. However, it is hard to imagine that the fire was accidental since it represents the end of the Kura-Araxes occupation and an abrupt change in the cultural sequence at the site. The last Kura-Araxes occupational layer was immediately followed by a completely different archaeological repertoire. The thick destruction level followed immediately by a decisive break in the material culture suggests a violent end to the Kura-Araxes community at the site.


Tracing population movement and identifying evidence of migration are major methodological challenges for archaeologists. [49] On one hand, Puturidze argues that there is no evidence supporting the notion of a migration of people into the southern Caucasus. [50] Rather, she associates all the changes in the post–Kura-Araxes period with influences from Near Eastern societies as a result of developing interactions by the end of the third millennium B.C.E. On the other hand, Kohl hypothesizes the possibility of a “push-pull process” [51] in which new groups of people with wheeled carts and oxen-pulled wagons gradually moved from the steppes of the north into the southern Caucasus, and the Kura-Araxes communities subsequently moved farther south. [52]

Kohl also reminds us of the evidence of increased militarism from the Early to the Late Bronze Age that is reflected in more fortified sites, new weaponry, and an iconography of war as seen on the Karashamb Cup. [53] The appearance of defensive mechanisms such as fortification walls, which can be seen at Köhne Shahar, a Kura-Araxes settlement near Chaldran in Iranian Azerbaijan, further emphasizes the increase of inter-group conflicts and militarism during the Early Bronze Age, before the Kura-Araxes culture came to an end. [54] Kohl argues that, while the number of Kura-Araxes settlements decreased in the southern Caucasus, archaeological research indicates that the Kura-Araxes culture spread to western Iran in the Zagros region and to the Levant. [55] In Kohl’s view, as new groups of people moved in, the Kura-Araxes communities abandoned the southern Caucasus and moved farther south, where some of them already resided.


We believe that the evidence supports a less uniform scenario. The Kura-Araxes culture may have disappeared in various ways; the transition to the post–Kura-Araxes time may not be explained by a single model. Different Kura-Araxes settlements may have ended differently. The evidence from Nadir Tepesi could support a violent end at that site, and it is possible that similar evidence will be found at other sites in the Mughan Steppe. At some sites, such as Köhne Tepesi in the Khoda Afarin Plain, [58] the Kura-Araxes occupation also ended abruptly but without any sign of destruction. In other regions, there may be evidence supporting the coexistence of newcomers with Kura-Araxes communities for some period. [59]

Alizadeh et al., The End of the Kura-Araxes Culture as Seen from Nadir Tepesi in Iranian Azerbaijan, American Journal of Archaeology Vol. 122, No. 3 (July 2018), pp. 463–477, DOI: 10.3764/aja.122.3.0463

See also...

Yamnaya isn't from Iran just like R1a isn't from India

Tuesday, June 19, 2018

An exploration of distance-based models of language relationships with a special focus on Indo-European (Kozintsev 2018)

The latest edition of the Journal of Indo-European Studies includes an interesting methodological paper by Alexander Kozintsev, in which the author tests the relationship between Indo-European and other language families using lexicostatistical data and a wide range of distance-based models (see here). My impression, after reading the paper a couple of times, is that we probably have a long way to go before someone comes up with a robust enough way to study languages with these sorts of methods, which are more widely used for the classification of living things.

However, note that Kozintsev's results are very consistent in placing Indo-European, including Hittite (HIT in the figure below), significantly closer to Uralic than to any of the language families south of the Caucasus. This is in line with the general consensus amongst historical linguists working with more traditional methods of studying languages, and, if true, has significant implications for the search for the Proto-Indo-European (PIE) homeland. Why? Because it's very difficult to imagine the PIE homeland being located anywhere south of the Caucasus considering the present-day distribution and likely homeland of Uralic languages well to the north of this region. Emphasis is mine:

The paper explores the informative potential of various distance-based methods of language classification such as cluster analysis, networks, and two-dimensional projections, using lexicostatistical data on 41 languages belonging to seven families (IE, Uralic, Altaic, Yupik-Chukchee, Kartvelian, Semitic, and North Caucasian) represented in the STARLING database. Rooting and weighting are of critical importance, radically affecting the graphic models. Special focus is made on two-dimensional charts generated by the multidimensional scaling and on the little-used minimum spanning tree method. The latter two techniques are employed to test the hybridization/ Sprachbund theory of Indo-European origins. The “Semitic” tendency of IE relative to Uralic is significant whereas neither the “Kartvelian” tendency nor the North Caucasian substratum hypothesis are supported by the two-dimensional models.


Finally, having come full circle, we return to our working hypothesis––that IE is closer to Uralic than to any of the “southern” families. I did not test this assumption because it appeared almost self-evident; now it can be easily tested by the same analysis. But, in fact, even statistical testing is unnecessary, because the triangle data cited above speak for themselves. IE, according to these data, is 20.8% closer to Uralic than to West Caucasian; 18.4% closer to Uralic than to East Caucasian; 13.7% closer to Uralic than to Kartvelian; and 16.9% closer to Uralic than to Semitic. Given the statistical reliability of a 5.6% difference (see above), all these values are highly significant a fortiori.

Kozintsev, Alexander, On Certain Aspects of Distance-based Models of Language Relationships, with Reference to the Position of Indo-European among other Language Families, Journal of Indo-European Studies, Vol. 46, 2018, No. 1 & 2, pp. 1-264
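
For anyone who wants to check the arithmetic in the last quote, here's a minimal Python sketch. The four percentages and the 5.6% figure are taken straight from the quoted passage. Treating 5.6% as a simple cut-off that each difference must exceed is my own reading of the quote, not necessarily the exact test used in the paper, and the function names are made up for illustration.

```python
# Figures quoted from Kozintsev (2018): how much closer IE is to Uralic
# than to each of the "southern" families, in percent.
CLOSER_TO_URALIC_PCT = {
    "West Caucasian": 20.8,
    "East Caucasian": 18.4,
    "Kartvelian": 13.7,
    "Semitic": 16.9,
}

# Difference described in the quote as statistically reliable, in percent.
# ASSUMPTION: used here as a plain threshold, which may simplify the paper's test.
RELIABLE_DIFFERENCE_PCT = 5.6


def margins_over_threshold(differences, threshold):
    """Return, per family, by how many percentage points the difference exceeds the threshold."""
    return {family: round(value - threshold, 1) for family, value in differences.items()}


def all_exceed(differences, threshold):
    """True if every quoted difference is larger than the threshold."""
    return all(value > threshold for value in differences.values())


if __name__ == "__main__":
    margins = margins_over_threshold(CLOSER_TO_URALIC_PCT, RELIABLE_DIFFERENCE_PCT)
    # List the families from the smallest margin to the largest.
    for family, margin in sorted(margins.items(), key=lambda item: item[1]):
        print(f"{family}: exceeds the threshold by {margin} percentage points")
    print("All differences exceed the threshold:",
          all_exceed(CLOSER_TO_URALIC_PCT, RELIABLE_DIFFERENCE_PCT))
```

Even the smallest of the four differences, the one for Kartvelian, is well over twice the 5.6% figure, which is presumably what the author means by "a fortiori".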

Saturday, June 16, 2018

Yamnaya isn't from Iran just like R1a isn't from India

A strange thing sometimes happens in population genetics: highly capable and experienced researchers come up with stupid ideas and push them so hard that, despite all the evidence to the contrary, they become accepted as truths. At least for a little while.

It's obvious now, thanks to full genome sequencing and ancient DNA, that Y-chromosome haplogroup R1a cannot be native to India. It arrived there rather recently from the Eurasian steppe, in all likelihood during the Bronze Age, probably as the Indus Valley Civilization (IVC) was collapsing or, perhaps, just after it had collapsed.

But for quite a few years this was something of a taboo, even politically incorrect, narrative, and it was vehemently rubbished by many Indians, including Indian scientists, and their western academic sympathizers.

Indeed, a whole series of papers came out, often in highbrow scientific journals, claiming that R1a originated in South Asia, and that it spread from there to Europe. This, it was also claimed, was the final nail in the coffin of the so-called Aryan Invasion Theory (AIT), because R1a was often described as the "Aryan" haplogroup.

I wasn't impressed by any of this nonsense. I said so here and elsewhere, to the great annoyance of those who believed, against all reason and logic, that the Indo-Aryans, and even Indo-Europeans, were indigenous to India. Here's a taste of some of my work on the topic going back to 2013.

South Asian R1a in the 1000 Genomes Project

Children of the Divine Twins

The Poltavka outlier

Looking back, it's all a bit rough, but very cool nonetheless. However, I was often accused of being biased, unscientific and even bigoted and racist as a result of offering such commentary and research. Make no mistake, my detractors were seething that I would dare to question what was apparently a scientific reality, and they wanted to shut me up. It was a nasty experience, but it now feels great to be vindicated.

Certainly, nowadays, no objective person who, more or less, knows their stuff would argue that the vast majority of the R1a in India doesn't ultimately derive from the Pontic-Caspian steppe in Eastern Europe.

But otherwise things haven't changed all that much since then. For instance, despite a whole heap of ancient DNA data being available from Eastern Europe and West Asia, there's a widely accepted idea that the Early Bronze Age (EBA) Yamnaya culture formed on the Pontic-Caspian steppe as a result of migrations from what is now Iran.

This is not true. It can't be true, because it's contradicted by all of the data. I've tried to explain this on several occasions, but generally to no avail.

Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic

Another look at the genetic structure of Yamnaya

Likely Yamnaya incursion(s) into Northwestern Iran

Thus, the Yamnaya people and culture were indigenous to Eastern Europe, and basically formed as a result of the amalgamation of at least three different populations closely related to Eastern European Hunter-Gatherers (EHG), Caucasus Hunter-Gatherers (CHG), Early European Farmers (EEF) and Western European Hunter-Gatherers (WHG). They did not harbor any significant ancestry from what is now Iran; at least not from within any reasonable time frame.

However, my communicating this fact has resulted in some rather strange and unsavory reactions from a number of individuals who appear to have a big emotional investment in this issue. They become frustrated and even angry when I try to explain to them that there's no sense in looking for the genetic origins of Yamnaya in Iran, much like the people who argued with me when I tried to reason with them that R1a wasn't native to India. Here's an example from a recent blog post (for the full conversation scroll down to the comments here).

Heh, here we go again with the accusations of bias, scientific impropriety and whatnot. Ironically, the poor chap just couldn't comprehend that he never had an argument to begin with, quite obviously due to his own bias regarding this topic. Well, at least he didn't call me a racist.

In a recent preprint, Wang et al. correctly characterized Yamnaya as, by and large, a mixture of populations closely related to EHG, CHG, EEF and WHG (see here), with no obvious input from what is now Iran. Sounds familiar, right?

They also discovered that, during the Chalcolithic and Bronze Age, the Caucasus and nearby steppes were mainly home to three quite distinct populations: 1) Steppe groups, including Eneolithic steppe and Caucasus Yamnaya, 2) Caucasus groups, including Kura-Araxes and Maykop, and 3) Steppe Maykop, which they classified as part of 1. These populations were all separated by clear genetic and cultural borders, with significant and unambiguous mixture from the Caucasus cluster only in a couple of Steppe Maykop outliers and one Yamnaya outlier from what is now Ukraine.

Clearly, this leaves no room for any migrations from what is now Iran to the steppe that would potentially give rise to Yamnaya. In other words, the main genetic ingredients for what was to become Yamnaya were already on the steppe well before Yamnaya, during the Eneolithic, and it's quite likely that they were indigenous to the region.

However, interestingly, Wang et al. did appear to try to save the link between Yamnaya and Iran by referring to the CHG-related ancestry in Yamnaya as "CHG/Iranian". I'm not surprised because most of these authors are associated with the Max Planck Institute for the Science of Human History (MPI-SHH), which is currently pushing a proposal that the Proto-Indo-European (PIE) homeland was located in what is now Iran and surrounds (see here). So, obviously, they need to somehow show a relationship between Yamnaya and Iran, because Yamnaya and the closely related Corded Ware archaeological complex are generally seen as early Indo-European cultural horizons. Good luck with that.

Actually, let me make it clear once and for all that I couldn't care less where the very first Indo-European words were uttered. It's just something that I find interesting. I rather doubt that this was within the borders of present-day Iran, and I explained in some detail why in a post almost two years ago (see here). But if someone manages to prove that the PIE homeland was indeed located partly or wholly within what is now Iran, that's OK. I won't be emotionally traumatized as a result.

However, obviously, this will have to be done with the assumption in mind that Yamnaya and Corded Ware became Indo-European-speaking almost purely via linguistic transmission, with hardly any associated gene flow. It's possible, I guess. But then there are almost 200 years of scholarship, based on linguistic and archaeological data, that generally favors the Pontic-Caspian steppe as the PIE homeland.

On a related note, I also couldn't care less whether the Aryan Invasion Theory (AIT) reflects what really happened during the Indo-Europeanization of South Asia, or if it's more appropriate to call it the Aryan Migration Theory (AMT). I'll accept whatever an objective analysis of all of the relevant data shows when we have enough of it to make an informed judgment.

However, currently, I see nothing in the data that would prevent the AIT from being true. To me, the profound impact that the Bronze Age steppe peoples obviously had on South Asia, and especially on the Indo-European-speaking Indian upper castes, suggests that, overall, an invasion-like scenario is quite plausible. But I might be wrong, and so what if I am?

See also...

Yamnaya: home-grown

Ahead of the pack

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Friday, June 8, 2018

Of horses and men

Y-HT-1 is today by far the most common Y-chromosome haplogroup in domesticated horse breeds. According to Wutke et al. 2018, this is probably the result of artificial, human-induced selection for this lineage, initially on the Eurasian steppe during the Iron Age, and then subsequently in Europe during the Roman period (see here).

However, during the Bronze and Iron Ages, before Y-HT-1 reached fixation, another very important Y-haplogroup in domesticated horses was its older sister clade Y-HT-4.

Indeed, it's likely that both Y-HT-1 and Y-HT-4 first dominated the domesticated horse gene pool during the Bronze Age, probably because they happened to have been present in the horse population exploited by the early Indo-Europeans. This was missed, or at least not directly discussed, by Wutke et al., but I'd say it's a fairly obvious conclusion that can be drawn from their data, especially if we consider the fact that horses are the most important animal in the Indo-European pantheon.

Thus, the story of Y-HT-1 and, up to a point, Y-HT-4 is probably very similar to that of two human Y-haplogroups, R1a-M417 and R1b-M269. Both of these lineages also rose to prominence rather suddenly during the Eneolithic and Bronze Age, in all likelihood because they were present amongst early Indo-European-speaking males (see here).

Below is a map of the earliest reliably called and dated instances of Y-HT-1, Y-HT-4, R1a-M417 and R1b-M269 in the ancient DNA record. Not surprisingly, all of the points on the map are located on or very close to the Pontic-Caspian steppe, which is generally accepted to have been the Proto-Indo-European homeland. Fascinating stuff.

See also...

Central Asia as the PIE urheimat? Forget it

Cultural hitchhiking and competition between patrilineal kin groups may have led to the post-Neolithic Y-chromosome bottleneck (Zeng et al. 2018)

Was Ukraine_Eneolithic I6561 a Proto-Indo-European?