Friday, July 26, 2019

Afanasievo people may well have been proto-Tocharian speakers (Ning et al. 2019)

Update 17/08/2019: A surprising twist to the Shirenzigou nomads story


During the Early Bronze Age, around 2,900 BCE, a population associated with the Yamnaya archeological culture migrated from the Pontic-Caspian steppe in Eastern Europe deep into Asia, as far as the Minusinsk Basin in South Siberia.

This rapid, long-range expansion was likely to have been the first significant migration of a Yamnaya-related group far to the east of the Ural Mountains, and it resulted in the formation of the Afanasievo archeological culture (see here).

The appearance of Tocharian languages in the Tarim Basin, in what is now western China, is often associated with the Afanasievo culture, mainly because of the confirmed presence of European-related populations in the Tarim Basin during the Bronze Age, as well as the likely highly divergent position of the Tocharian node in the Indo-European language phylogeny.

But the Afanasievo people were separated by considerable distance in space and time from the Tocharians, and can't yet be reliably linked to them with archeological or genetic data. So even though the inference that the former are linguistically ancestral to the latter is quite plausible, it's far from certain.

However, thanks to a new paper in Current Biology by Ning et al., at least we now know that a population with significant Yamnaya/Afanasievo-related ancestry was living in the eastern Tian Shan Mountains just a few hundred years before Tocharian languages were attested nearby. Below is the paper summary (emphasis mine):

Recent studies of early Bronze Age human genomes revealed a massive population expansion by individuals related to the Yamnaya culture, from the Pontic Caspian steppe into Western and Eastern Eurasia, likely accompanied by the spread of Indo-European languages [1, 2, 3, 4, 5]. The south eastern extent of this migration is currently not known. Modern-day human populations from the Xinjiang region in northwestern China show a complex population history, with genetic links to both Eastern and Western Eurasia [6, 7, 8, 9, 10]. However, due to the lack of ancient genomic data, it remains unclear which source populations contributed to the Xinjiang population and what was the timing and the number of admixture events. Here, we report the first genome-wide data of 10 ancient individuals from northeastern Xinjiang. They are dated to around 2,200 years ago and were found at the Iron Age Shirenzigou site. We find them to be already genetically admixed between Eastern and Western Eurasians. We also find that the majority of the East Eurasian ancestry in the Shirenzigou individuals is related to northeastern Asian populations, while the West Eurasian ancestry is best presented by ∼20% to 80% Yamnaya-like ancestry. Our data thus suggest a Western Eurasian steppe origin for at least part of the ancient Xinjiang population. Our findings furthermore support a Yamnaya-related origin for the now extinct Tocharian languages in the Tarim Basin, in southern Xinjiang.

Ning et al., Ancient Genomes Reveal Yamnaya-Related Ancestry and a Potential Source of Indo-European Speakers in Iron Age Tianshan, Current Biology, July 25, 2019, DOI:

zardos said...

Well, this would prove that Yamnaya and some lineages of R1b spoke IE with fairly high certainty.

To which groups are those Q-variants related? West Siberians?

Ebizur said...

Most of their haploid lineages appear to be of Western Eurasian derivation, but a few of their mtDNA lineages (G3b, D4j1b, A17) appear to be related to those of some modern individuals in Tibet and vicinity.

Matt said...

Davidski: However, thanks to a new paper at Current Biology by Ning et al., at least we now know that a population with significant Yamnaya/Afanasievo-related ancestry was living in the eastern Tianshan Mountains just a few hundred years before Tocharian languages were attested nearby

That's certainly the case, but with the caveat that strictly speaking populations with significant Yamnaya/Afanasievo-related ancestry were also living in the western Tian Shan, the Pontic-Caspian and throughout Central Asia.

Whether the Iron Age Shirenzigou had a separate or special connection to Afanasievo beyond what is found at contemporary more western sites on Tian Shan and throughout Central Asia requires going beyond the analysis which this paper completes. Eurogenes and commentators, and subsequent academic work, will probably be necessary to work this out!

Just as with samples that putatively represent Anatolian speakers, samples that are presented as being representative of Tocharian speakers will need care. The same questions of whether these actually can be securely tied to Tocharian speakers should be asked with the same degree of scrutiny as it was asked whether the Anatolian MLBA samples represent Anatolian language speakers.

Matt said...

Copy pasting my comment about this paper from the last post's comment thread, if that's OK, so it is in the same place and can be part of the conversation about this paper:

Choice of ancestor populations for modelling is pretty kookalamanza:

"We continued to use qpAdm [4, 27] to estimate the admixture proportions in the Shirenzigou samples by using different pairs of source populations, such as Yamnaya_Samara, Afanasievo, Srubnaya, Andronovo, BMAC culture (Bustan_BA and Sappali_Tepe_BA) and Tianshan_Hun as the West Eurasian source and Han, Ulchi, Hezhen, Shamanka_EN as the East Eurasian source.

In all cases, Yamnaya, Afanasievo, or Tianshan_Hun always provide the best model fit for the Shirenzigou individuals, while Srubnaya, Andronovo, Bustan_BA and Sappali_Tepe_BA only work in some cases [3] (Tables 2 and S2; Data S2A). The Yamnaya_Samara or Afanasievo-related ancestry ranges from 20% to 80% in different Shirenzigou individuals, consistent with the scattered distribution on the East-West cline in the PCA (Figure 2)."

The supplement shows that Tian Shan Saka work equally well to Hun; not sure why this isn't in the main text. In fact, with Baikal EN and Baikal EBA, they actually usually get higher P fits in general than Yamnaya and Tian Shan Hun...

So why in the heck are you gonna use Tian Shan Hun and Saka as your sole Iron Age Central Asian source? Like, Sarmatians, Scyths, much? If you have samples which are almost contemporary, in the right place....?

Unless it's data quality limitations, very strange choice.

Likewise they connect R1b directly with Afanasievo, but this glosses over - - "The Iron Age nomads (Cimmerians, Scythians, and Sarmatians) mostly carried the R1b Y haplogroup, which is characteristic of the Yamnaya of the Russian steppe". And Tian Shan Hun DA81 also shows R1b. (Edit: Davidski presented a spreadsheet that clarified that under yfull only 3/15 of assignments as R1b actually turned out to be R1b, while 8/15 were R1a, and the balance of 4 were I2a, E1b, Q1a and unclassifiable. Only 3/9 R1b were correctly called. Shirenzigou are out with yfull)

The null hypothesis should really be that these individuals are largely continuous with other Tian Shan populations from the west, which is plausible on the genetics they present.

It's hard to understand why they have gone down the route of this far fetched adventure of linking the Iron Age Shirenzigou site with early Bronze Age Afanasievo rather than Iron Age R1b populations that are actually their contemporaries and who seemingly work as ancestors in qpAdm.

I suspect they've had the same problem the earliest papers on Sarmatians and Scythians had, finding these populations seemingly descended from Yamnaya rather than Andronovo, when this was due to being Andronovo+other admixture (IranN related+ANE related, etc.).

(Visually, using G25 PCA, we wouldn't expect points that are close to Tian Shan Hun to be modeled with Sappali_Tepe and Bustan_BA, but Sarmatians might work -

zardos said...

The problem with Anatolia is that we know there lived non-IE and the population density was high. In classic literature about the Hittites there was always the concept of elite dominance and a weak genetic influence, weaker than in India for example.
So testing a non-IE inhabitant of Anatolia was always a likely scenario and only carefully chosen, securely placed elite burials are truly relevant.
The same can't be said for many other IE. Otherwise I agree with you anyway, but Anatolia is really a special case.

Matt said...

Would add, even if I'm not too keen on the models here, I would qualify my above comment that I probably wouldn't blame the Chinese researchers on this paper for that too much, as perhaps they're not 100% as familiar with the whole literature as they could be (and this is maybe a first paper for some of them).

I have more the feeling that Johannes Krause of Max Planck/Copenhagen as the supervisor with the history in adna papers could have pushed them a bit more on these questions.

If not on answering them, explaining why they were not able to... (Wang is experienced as well, but I don't think this is as much his main focus as for Krause...).

@Zardos, that is a reasonable qualification. The problem of steppe / Tian Shan in assigning to Tocharian is almost opposite? Mobility is high and density is low; testing a non-Tocharian speaking sample seems quite likely, not because there are large dense, established, local populations who are likely to be non-Tocharian and absorb an earlier population of Tocharian speakers (as would be the case in an analogy to Hittite?) as much as that it is hard to say how much long residence Tocharian speakers ever had in a location, and thereby securely link to a sample.

Though I would say on Hittite, my understanding is that although an elite transfer case could be made for Hittite (or Nešili), there is no direct / indirect evidence actually for this being the case for any of the other Anatolian languages (Luwian, etc.), even if there is no evidence against it. For Anatolian it's very much not the case of being able to ascribe Hittite specifically to elite processes, and anything explaining Anatolian within Anatolia must explain the whole Anatolian family.

Drago said...

A broader sampling (time & geography) of the Tarim Basin will help elucidate with which wave from the western steppe, and how, Tocharian arrived.
The mtDNA Us show this probably included women & children.

zardos said...

@Drago: Why just U? Of course they brought women and children, most of the typical mtDNA haplogroups being in this small sample already.

For IE Anatolia is like Hungary for the Ugrians: A warrior alliance captured a region and formed a literate state with local preconquest and neighbouring cultural elements in a majority non-IE population in which even more non-Hittites came in.

I would not trust single samples. Remember Hungary again. You need a large sample from elite burials. If no Yamnaya related ancestry is present at all, this would be a problem for the steppe theory.
But even those small numbers of IE went through mixture and filters before making it to Anatolia, so even smaller amounts and lineages will suffice.

Dragos said...

@ Zardos

''For IE Anatolia is like Hungary for the Ugrians: A warrior alliance captured a region and formed a literate state with local preconquest and neighbouring cultural elements in a majority non-IE population in which even more non-Hittites came in.''

It seems you have figured it out, when even specialists aren't really sure what happens between the Neolithic & the emergence of the Bronze Age - at least 2 millennia are still poorly pieced together :) I'm not sure how relevant, or not, the analogy to the Carpathian basin is.
People often mistake the conquest of Kanesh - a city - with the presence of Indo-European speakers in central & western Anatolia at large. It's not even entirely clear who conquered whom in Kanesh.
You also seem to be defensively linking my lack of conviction of your elite conquest explanation (which seems to be a kind of default go to), with debating against a steppe model.
My point rather was, successful & lasting elites tend to emerge from a significant demographic and social backdrop which can support them, and can be used by them to justify their position. Hence, the I-E presence in Anatolia was relatively large and perhaps lasted hundreds of years if not more. This is the only way to sensibly explain the presence of several IE languages strewn over Anatolia, and their lasting a very long time.

zardos said...

Yet at the time of the samples taken, how much of the IE ancestry will be left? You find many historical examples of small groups capturing a foothold in a region, mixing and spreading as a mixed group into the neighbourhood.
Now if we start with the steppe, these early Anatolians will be heavily mixed already before entering the region.
Next question: Where was their starting point in Anatolia, where the IE Anatolians were less diluted than in a later phase?
For this we need to know their migration path and the material culture associated with them...

Drago said...

@ Zardos

''Yet at the time of the samples taken, how much of the IE ancestry will be left? ''

I suspect that the population-genetic structure in BA Anatolia is discernable into a pattern which may, tentatively, & with more samples, correlate with the various ethno-linguistic groups.

Davidski said...

I'd say that the concept of the "transient gene flows" that we've seen in the recent papers about the ancient Levant is going to make a regular appearance in aDNA papers about the Near East (including Anatolia) over the next few years.

Andrzejewski said...

Correct! Prior to the Yamnaization associated with Anatolian languages Anatolia was densely populated by admixed populations of mostly ANF + CHG like the Hatti, Hurrians and Kaska, and that’s perhaps why Steppe elite dominance haps are rare to find

Nick Patterson (Broad) said...

Jim Mallory wrote a survey paper on the origins of Tocharian

He is characteristically cautious but clearly leans towards an Afanasievo origin. The new paper makes that overwhelmingly likely it seems to me.

Andrzejewski said...

One thing can be sure: if Tocharians were Afanasievo/R1b, then it surely rules them out as the source of the (in)famous Tarim Basin Mummies because the latter almost all (11/12) turned out to be R1a1. Thus the TBM must be Andronovo/Saka/Cimmerians or Sarmatians

Andrzejewski said...

How “highly divergent” can the Tocharian language be if it allegedly arrived in Xinjiang 2900 BCE? I mean Corded Ware spread 2800 BCE in Eastern Europe so would 100 years make such a difference?

Andrzejewski said...

We also have to remember that the Tocharians (if they are to be identified with the Yuezhi, Wusun or Kuchans) interacted extensively for centuries with other European people of the Andronovo stock like Saka or Scythians and Vedic Indians/Nepali, and were instrumental in the Silk Road transmission of Buddhism. As far as I can remember, Tocharian was replaced as state language by Saka with the Buddhization of the Kuchan/Yuezhi kingdom. (Europeans back then in general were all over Western China, Siberia, Western Mongolia and Central Asia plus Turkey and Northwest India). And don’t forget Alexander the Great’s soldiers in Afghanistan, which was part of this Kushan empire (called “Bactria” then).

I’m stating all this data because the lines between Scythians and Tocharians got blurred throughout the centuries so it’s hard to fish for the hard core Tocharians.

Ric Hern said...

If this is accurate then the Centum Connection of Tocharian with Celtic/Germanic etc. looks basically like a done deal. Seeing that Afanasevo Ancestors spread early Eastwards, what happened at the same time towards the West?

Matt said...

@andrezjewski, trees:

Trees primarily using core lexicon divergence (tree 1 and 2) generally find that Tocharian diverges early on the order of 500 years or so, as in the first tree by Chang, though not necessarily (Tocharian is not really an outgroup in the second tree).

Dating trees built from phonological and grammatical features (tree 3 and 4) isn't really possible, as these features are strongly interdependent and more dependent on chance events (due to fewer linguistic characters); it is easier for rates to scale up or down due to interdependent shifts between features.

The latter generally find Tocharian to be the outgroup to "core IE" or "late IE", but they cannot fit divergence to time in a meaningful way. (Again not always -

Matt said...

@Andrzejewski: I’m stating all this data because the lines between Scythians and Tocharians got blurred throughout the centuries so it’s hard to fish for the hard core Tocharians.

And, I mean, this is only the stuff which is relatively clearly the case as we enter the historical period, let alone prehistory before this.

It's possible that the Afanasievo hypothesis is linguistically correct (though I don't think there's much evidence for it).

But the idea that Tocharian speakers in the Eastern Tian Shan at 200 BCE would be unadmixed with Andronovo and then later Iranic speaking related populations from the west is basically suggesting...

... that almost immediately with the Yamnaya horizon or its immediate predecessor, an offshoot migrates 2000 miles to the east in basically no time at all, and then they stay put for about 2500 years and do not mix with *anyone* except that for some reason they happily admix with East Eurasian groups.

It may have happened if the genetic data shows this (when looked at more completely) but I am not sure what reason there would naively be to expect that any of that would be the case.

(This all despite the absence of any clear separation and continuation of Afanasievo and successor cultures throughout this period. And that there is no sign of Afanasievo descended groups on the Eastern Eurasian Steppe, but there is some small signature of Sintashta-Andronovo ancestry - - "The Western outlier ARS026 ... is well modeled as a two-way mixture of Shamanka_EBA and Sintashta (P = 0.307; 48.6 ± 2.0% from Sintashta) (SI Appendix, Table S7). Similar to ARS026, contemporaneous LBA Karasuk individuals from the Altai (1400–900 BCE) (1, 29) also exhibit a strong extra genetic affinity with individuals associated with the earlier Sintashta and Andronovo cultures (SI Appendix, Fig. S14). ").

Ric Hern said...

@ Matt

Not mix with "Anyone" when we see Haplogroup Q among the samples...? After all migration towards the West met significantly more people yet retained a lot of the PIE related Language...? Why should Lithuanian have conserved more connections to PIE than Tocharian if it originated from Corded Ware at roughly the same time as Tocharians originated from Afanasevo ?

Matt said...

@ric, are you understanding that I don't actually mean to endorse that scenario?

Ric Hern said...


Sofia Aurora said...


David, is this paper the one you mentioned when you posted the dissertation about the Xiaohe people about a month ago?

Or is there another?

Also do you have any news about the Cimmerians' paper?

Ric Hern said...

Wonder how many Sister Languages PIE had ?

Ric Hern said...

If Hittite was a Sister and separated before the Yamnaya expansion then surely there could have been a few more Sisters in and around the Pontic Caspian Steppe area...?

FrankN said...

Nick: "Jim Mallory wrote a survey paper on the origins of Tocharian. He is characteristically cautious but clearly leans towards an Afanasievo origin."

Just wanted to post about that paper, you beat me to it. IMO an absolute "must read", as Mallory is one of the foremost experts when it comes to Xinjiang's archeology, and he drew heavily on D. Adams' linguistic expertise (Adams' publications include a Tocharian etymological dictionary).

I interpret his paper quite differently. In fact, he seems to retreat from the Afanasievo scenario he himself proposed in 2000.

But step by step:

1. Acc. to Mallory, medieval Tocharians either dressed like Indian monks or Sassanid warriors. It would be absolutely impossible to figure out if any early medieval person (mummy) from the Tarim Basin spoke Tocharian, Indo-Iranian, or whatever else (Chinese, Turkic, Bactrian Greek, etc.)

2. The IA Tarim Basin shows a patchwork of cultures (essentially each oasis seems to have differed from the next one), with the "Tocharian" area being divided into a "painted pottery" (China-derived) and a "Grey ware" (Iran/ Turan-derived) zone. Among the manifold East Iranian (Saka) borrowings in Tocharian, three are culturally significant, namely "iron", "canal", and "clay brick". This implies for Mallory the impossibility to archeologically distinguish IA Proto-Tocharians from East Iranians.

3. The key to Tocharian in his opinion lies with the EBA Xiaohe culture (ca. 2000–1700 BCE) that encompassed a good part of the area from where Tocharian A-C have been attested, and provides the earliest evidence of Europoid mummies.


FrankN said...

Continuing my previous post

4. Mallory then (re-)examines possible connections between Afanasievo and Xiaohe. Among other things, he notes a chronological gap: Afanasievo ended some 500 years before the onset of Xiaohe. If at all, Okunevo could be chronologically regarded as ancestral to Xiaohe, but that poses obvious aDNA problems (briefly glossed over by Mallory), and substantial archeological problems discussed by him in quite some detail, e.g. ceramicless Xiaohe burials vs. pottery-rich Andronovo and Okunevo ones, rectangular Okunevo "kurgan" enclosures vs. circular Xiaohe ones, etc.

5. He concludes: "This paper has not only failed to provide a solution to the problem of Tocharian origins—it has even helped undermine the author’s earlier solution (Mallory and Mair 2000). (..) The Eurasian steppe model (Early Bronze Age) that sets the Indo-European homeland in the Pontic-Caspian region and identifies the ancestors of the Tocharians as members of the earliest eastward expansion of steppe pastoralists from the Urals eastwards to the Altai and Yenisei, i.e., the Afanasievo culture (Mallory and Mair 2000; Anthony 2007, 307–311) (..) satisfies those who regard Tocharian as a very early departed language, geographically peripheral to the other Indo-European branches, and eliminates the problem of dating contacts between Tocharians and Indo-Iranians to any period earlier than the entry of the Saka into the Tarim Basin. Among its major problems are: 1) it lacks any evidence of the suite of domestic cereals which the ancestors of the Tocharians should have known; 2) while there may be some Afanasievo artifacts associated with the Qiemu’erqieke culture in the Junghhar basin, these are really totally different cultures, so there is no evidence for an Afanasievo migration south through the Junghhar Basin towards the land of the historical Tocharians; 3) the archaeological case for contacts between the Afanasievo and later Okunevo cultures with the Early Bronze Age culture of the Tarim Basin (Xiaohe) is, other than burial posture, generally weak and circumstantial."

6. He then cautiously argues for a "combined Steppe and Central Asian model that sets the Indo-European homeland in the Pontic-Caspian but argues that steppe populations intruding into the indigenous agricultural societies of Central Asia adopted many elements of material culture without undergoing language shift." In this context, he explicitly mentions Gonur Depe and Sarazm IV, hinting at (but not explicitly expressing) the possibility that Xiaohe and ultimately Tocharian might have originated there. "No one seems certain precisely how one might link the European steppe, the Zervashan Valley of Tajikistan and the Minusinsk Basin together (mobile traders from the European steppe, a single interaction sphere of exchange relationships, Frachetti’s “Intermountain Corridor”?), but there is clearly evidence in both the Afanasievo and subsequent Okunevo periods for some form of mutual contact. As I indicated above, the reason for suggesting this model is that it places steppe populations in an area where cereal agriculture was well established, so it reduces both the spatial and temporal lacuna between their homes in the Pontic-Caspian region and their possible approach to the Tarim Basin. Unfortunately, the spatial and temporal lacuna with respect to domestic plants now appears not merely between the Urals and the Altai but even farther, between the Dnieper and the Altai (Mallory 2014). I do not know how we are going to be able to resolve these issues, but if we really want to trace the Tocharians to their origins we might paraphrase the immortal lines of ‘Deep Throat’ and “follow the cereals.”"

Davidski said...

@Sofia Aurora

David is this paper the one you mentioned when you posted the dissertation about the Xiaohe people about a month ago?

Yes, this is it.

Also do you have any news about the Cimmerians' paper?

It was published recently...

Davidski said...


The more I think about this, the less I'm convinced that these eastern Tianshan Iron Age samples were Tocharian or proto-Tocharian speakers.

They may have been, but we would need Medieval samples from the Tarim Basin where Tocharian was attested to see if people there had specifically Afanasievo-derived ancestry and whether they were directly related to the eastern Tianshan Iron Age population, and also what other types of ancestries they had.

FrankN said...

My summary of Mallory 2015 above has failed to mention a main point: Mallory, based on Adams, argues that Tocharian possessed all PIE agricultural terms, including those related to cereal agriculture. [Of course, as PIE isn't an ANF/Levantine Neolithic language, most of these terms wouldn't have originated in PIE, but they had been acquired already at the PIE stage, i.e. prior to the splits of Anatolian and Tocharian]. Phonetics would rule out a later borrowing, e.g. from Iranic.

If correct, this means that PIE must have been a cereal farming culture. And this, in turn, rules out non-cereal farming cultures like Yamnaya and Afanasievo as PIE origin (they nevertheless may have been vectors).

Drago said...

@ Dave

“The more I think about this, the less I'm convinced that these eastern Tianshan Iron Age samples were Tocharian or proto-Tocharian speakers.”

I bet they’d be close to western Xiongnu; not saying that’s what they spoke; however

EastPole said...

“If correct, this means that PIE must have been a cereal farming culture. And this, in turn, rules out non-cereal farming cultures like Yamnaya and Afanasievo as PIE origin (they nevertheless may have been vectors).”

Tocharians grew millet. But Tocharian word for millet is cognate with Slavic word:

I wrote about Slavic influence on Tocharian in previous thread:

So maybe Tocharians with Slavic linguistic influence also got some agriculture they didn’t have in Yamnaya/Afanasievo times.

Sofia Aurora said...


Many thanks

FrankN said...

East Pole: "Tocharians grew millet. But Tocharian word for millet is cognate with Slavic word." Yep. Both facts are also stated and discussed in Mallory 2015.

"So maybe Tocharians with Slavic linguistic influence also got some agriculture they didn’t have in Yamnaya/Afanasievo times."

Nope. It certainly went the other way around. Millet was domesticated in China, from where it reached the Tarim Basin. For a long time, it was believed that millet was already present in Neolithic Ukraine. However, a few years ago some of those millet samples were directly AMS-dated (would need to look up the source), and it turned out that they only dated to the MBA and had been washed into lower, Neolithic layers. The arrival of millet farming in E. Europe (and also Caucasia, where it equally arrived during the MBA) apparently pre-dates the Scythians, but the corresponding vocabulary may well have entered Slavic from Scythian (E. Iranian), ultimately Tocharian.

"I wrote about Slavic influence on Tocharian in previous thread". Yes, I saw that, but you forgot to mention the source for your first caption. Might interest D. Adams who, albeit being the author of the probably most comprehensive treatise on Tocharian vocabulary, apparently isn't aware of that influence yet.

Bob Floy said...

"Might interest D. Adams who, albeit being the author of the probably most comprehensive treatise on Tocharian vocabulary, apparently isn't aware of that influence yet."

Hmm. I wonder why the expert wouldn't be aware of that? Seems like a hard thing to miss.

FrankN said...

This wasn't the millet link I originally meant, but it's equally interesting and reports the same result: Supposedly Neolithic millet finds (here a/o German LBK, Hungarian EN Sopot culture, Bulgarian early Neolithic) were AMS dated to sometimes as late as the 15th century AD. The earliest date comes from Fajsz, Hungary (1606–1414 BC).

In the Caucasus, millet appeared by around 2000 BC, see link below (in French). Previous speculations about Shulaveri-Shomu already cultivating millet apparently were just that - speculations.

Drago said...

Mallory suggests that the Tocharian languages began splitting c. 500 BC (quoting several references therein), so it's a bit hard to link it with Afanasievo (c. 3000 BC), in addition to the chronological & cultural "gap" between the Tarim Basin horizon & the Afanasievo culture. Not only is there a gap between the two, but it in fact looks like whatever groups were left after Afanasievo dispersed might have been acculturated into East Asian groups, yet providing the western ancestry evident in groups like the Royal Xiongnu.

Andrzejewski said...

@Drago Not only that, but finding 11/12 of TBM were R1a1 and NOT R1b rules the Tarim Basin Mummies out as Proto-Tocharians.

You might also want to consider that the East Asian admixture in Tocharians (if indeed descendants of Afanasievo) might’ve come from Okunevo

FrankN said...

Drago: "Mallory suggests that the Tocharian languages began splitting c. 500 BC."

Yeah, that's a problem. A "linguistic generation" is usually estimated to last around a millennium, based on the securely dated evolution from Latin (Caesar's conquest of Gaul) to Old French (Strasbourg Oaths). That would place Proto-Tocharian rather around 1500 BC, i.e the MBA, than with the EBA Xiaohe culture as proposed by Mallory, and certainly much later than anything inferred from those phylogenies posted by Matt above.

Now, we know about cases where a language survived more than a millennium w/o splitting into sub-branches. Most prominent in this respect is Arabic, earliest attested in writing in the 4th century AD, but with "Arabic" homonyms apparently being recorded in Mesopotamia as early as 500 BC. East Slavic, the origin of which is dated to the times of the Kievan Rus, that "officially" only branched after 1991 into Russian, Belorussian and Ukrainian, is another (obviously contentious) case that clearly made it beyond the millennium mark.

Tocharian might theoretically fall into this category as well. However, if one regards contact with other languages (from the same, or different language families) as a main trigger of linguistic differentiation, Tocharian seems pre-disposed to have evolved faster than average (a point a/o brought forward by Häkkinen in his criticism of IE phylogenies, which is well worth reading [link to be added later]). More specifically, Tocharian was substantially reshaped by contact with an agglutinating language, setting it apart from the PIE mainstream. This language could have been Uralic or Altaic, in line with a hypothetical Afanasievo descent. However, as Mallory spells out "almost all the languages that the potential ancestors of the Tocharians could have possibly come into contact with tend to be agglutinative, e.g., Sumerian, Hattic, Hurrian, Elamite, and even Tibetan displays agglutination."

It would certainly be helpful if linguists could provide an estimate about when that agglutinative reshaping (or alternatively absorption of agglutinating substrate) took place. I guess they can't, otherwise we would have this information by now...
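
For anyone who wants to check the arithmetic behind the "linguistic generation" argument above, here is a minimal sketch. The anchor years (50 BCE for the end of Caesar's conquest of Gaul, 842 CE for the Strasbourg Oaths) and all function names are assumptions added for illustration; only the roughly 1000-year generation and the c. 500 BC split date come from the discussion itself. It is a back-of-envelope calculation, not a dating method.

```python
# Rough check of the "linguistic generation" reasoning.
# Years are signed integers: negative = BCE, positive = CE.
# No correction is made for the missing year 0; this is only approximate.

CAESAR_GAUL_END = -50     # assumed anchor: end of Caesar's conquest of Gaul
STRASBOURG_OATHS = 842    # assumed anchor: Strasbourg Oaths (early Old French)
TOCHARIAN_SPLIT = -500    # split of Tocharian A and B, c. 500 BC (from the thread)


def span_years(start: int, end: int) -> int:
    """Number of years between two signed years."""
    return end - start


def estimated_proto_date(split_year: int, generation_years: int = 1000) -> int:
    """Go back one 'linguistic generation' from the observed split."""
    return split_year - generation_years


def fmt(year: int) -> str:
    """Format a signed year as BCE/CE."""
    return f"{abs(year)} {'BCE' if year < 0 else 'CE'}"


latin_to_old_french = span_years(CAESAR_GAUL_END, STRASBOURG_OATHS)
proto_tocharian = estimated_proto_date(TOCHARIAN_SPLIT)

print(f"Latin -> Old French span: about {latin_to_old_french} years")
print(f"Proto-Tocharian estimate: around {fmt(proto_tocharian)}")
```

With these assumed anchors the Latin to Old French span comes out just under a millennium, which is where the round 1000-year figure and the resulting c. 1500 BC estimate come from. Changing `generation_years` shows how sensitive the estimate is to that single assumption.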

Ric Hern said...

Even if those R1b guys were not Tocharians, it is still interesting that Afanasevo descendants were still around 2200 years ago. Okunevo didn't totally replace them.

Drago said...

Yep they were certainly an expansive lot- from Atlantic to the Far East; & Mediterranean seas. Must have been part of a lot of histories

Matt said...

FrankN: That would place Proto-Tocharian rather around 1500 BC, i.e. the MBA, than with the EBA Xiaohe culture as proposed by Mallory, and certainly much later than anything deduced from those phylogenies posted by Matt above.

Do you mean "much earlier" rather than "much later"?

Proto-Tocharian in the sense Mallory is referring to is the common ancestor of Tocharian A and Tocharian B before the internal split. Not the split off of the ancestor of Tocharian A+B from other IE languages.

Only two of the models I shared above resolve the split of Tocharian A and B, one by Heggarty and one by Tandy+Warnow. Heggarty's tree places this at about 700 years before their attestation, and Tandy+Warnow at around 0 BCE to 500 BCE.

So the internal splits of Toch A+B provided by these models, where they are resolved, are roughly coincident with Mallory's date of around 500 BCE.

Ric Hern: Even if those R1b guys were not Tocharians, it is still interesting that Afanasevo descendants were still around 2200 years ago. Okunevo didn't totally replace them.

Why would the R1b be from Afanasievo people thousands of years prior to them when there is lots of R1b around in the Western Steppe and subsequently anyway?

Davidski said...


I've just read the paper again, but this time focusing more on the genetic results and archeological notes, rather than the linguistic theory, and I have to say that these eastern Tianshan pastoralists appear to be related to the Saka and Huns. They may even have been a proto- or para-Hunnic group.

That's not to say that closely related groups that settled in the Tarim Basin weren't Indo-European-speaking and didn't eventually become the Tocharians. But how is it that the authors didn't notice and elaborate on the rather obvious Saka/Hun links?

Matt said...

@Davidski, yes, included as almost a coda was:

"We also cannot rule out that the genetic heterogeneity had already existed in the western source populations before they entered the eastern Tianshan region to form the Shirenzigou individuals, since the genomic composition of nomadic populations such as Scythian, Hun, and Saka at that period in the region were also highly dynamic.

The existing archaeological evidences suggest that the Shirenzigou site shows typical characteristics of the agro-pastoral Yanbulake Culture from the Bronze-Age Hami Basin located in the southern slope of the East Tianshan Mountains. The animal motifs such as the deer-shaped Griffin in the site also reflects the influences from the Pazyryk Culture from the Altai region. Besides the cultural exchange between the eastern Tianshan Mountains and the Altai region, a special funeral ritual appeared in the Shirenzigou site, whereby the upper torsos were disturbed.

This was a common custom in the Neolithic Gansu corridor, in Northwestern China. Furthermore, the chemical composition analysis of ancient glass beads excavated from the site indicated they were imported from the central region of China. The diverse cultural elements observed in the same site provide evidence that different populations once came to this region and admixed with each other to form the genetic structure of the Shirenzigou people, which is well supported by the ancestry profile inferred in this study. We found most Shirenzigou individuals derived a large amount of their ancestry from Ulchi or Hezhen related populations, which might be associated with ancient nomadic people in northern Asia. The Shirenzigou samples also harbor a Han Chinese-related component, which may be introduced into this region by the farming populations from surrounding regions, such as Gansu and Qinghai, who also contributed to present-day Han Chinese."

(and further information on the Pazyryk related materials in the Archaeological supplement in STAR METHODS : Archaeological Context and Skeletal Materials of Shirenzigou Site).

I am not sure if they had not originally set out to write a paper which had a bearing on ideas such as "(t)he origin of the Tocharian speakers has been long debated (and how) archaeologists have proposed two hypotheses, the “Bactrian Oasis hypothesis” and the “Steppe hypothesis,” to explain the first appearance of agricultural communities in the region ca. 2,000 BCE" and not just to be presenting another bunch of individuals similar to Iron Age individuals already sequenced. Then ran with that.

Ric Hern said...

@ Matt

Yes Western Steppe but not particularly as numerous on the Eastern Steppe as R1a. And we are talking Eastern here, aren't we ?

Slumbery said...

"..., while Srubnaya, Andronovo, Bustan_BA and Sappali_Tepe_BA only work in some cases [3] (Tables 2 and S2; Data S2A)."

I admittedly just glanced at their tables, but it is not clear to me what samples they used as "Andronovo". Andronovo had a huge range and, alongside a Sintashta-derived main group, many various outliers, including some that had much less European Farmer ancestry than the main group and/or more "pure" Yamnaya ancestry with various eastern admixtures, and even some with Yamnaya-derived R1b. My first (possibly too quickly made) impression, then, is that the single "Andronovo" reference population in the article probably does not represent this variety, and that there is actually nothing in these results that would rule out the possibility that the western ancestry of the tested group was mediated by an Andronovo group (and then by its descendants).

Ric Hern said...

Didn't Andronovo spread over some previous Afanasevo occupied territory ? Was it really so unlikely for some Afanasevo descendants to survive in relative isolation in Siberia like Lithuanians in the West...?

Slumbery said...

@Ric Hern

Setting aside that I'd have a beef with your Lithuanian analogy, it is not really the main problem (for me) how likely it is. Not very, but things with smaller probability can happen. The main problem is that this still remains a mere speculation, because the article is not convincing about the existence of a specific link.
(And then we do not even know if these were even Tocharians.)

EastPole said...

Here is a new paper trying to reconstruct Indo-European phylogeny. They link Tocharian with Afanasievo culture, Proto-Indo-Iranian with Sintashta, Balto-Slavic-Indo-Iranian with Corded Ware and Italo-Celto-Germanic with Bell Beakers.
Their tree:

“Chronological intervals obtained by Bayesian MCMC analysis and summarized in Table 3 do not contradict the expert views. E.g., the initial IE bifurcation, the Anatolian split-off, falls within the range 4139–3450 BC that is compatible with traditional estimations, e.g.,
Moreover, Bayesian chronological intervals (Table 3) do not contradict radiocarbon datings of archaeological cultures which may be associated with the spread of Indo-European languages. Thus, the split-off of Tocharian can be identified with the migration that gave rise to the Afanasievo culture …
The 29th century BC date for the rise of Afanasievo aligns well with the Bayesian date for the Tocharian split-off: 3727–2262 BC (mean 3011 BC).
The end of the Sintashta archaeological culture, frequently associated with Proto-Indo-Iranian speakers is dated to the beginning of the 18th century BC
Cf. the Bayesian dates for the break-up of Proto-Indo-Iranian: 2044–1458 BC (mean 1740 BC).
According to ancient DNA data, it is likely that the population of the Sintashta culture is at least partially descended from that of the Corded Ware culture
Since the Corded Ware culture is usually associated (non-exclusively) with the ancestors of Balto-Slavic peoples it seems reasonable to suppose that the Balto-Slavic–Indo-Iranian break-up may be correlated with the end of Corded Ware. According to the current view, “[t]he years between 2300 and 2100 B.C. were a period during which the Corded Ware culture ended in most regions, especially in the southern part of its domain (basins of the Danube, Upper Rhine, Elbe, and Vistula). Only in the Russian Plain did it last until 2000 B.C.” These datings align relatively well with the Bayesian dates for the Balto-Slavic–Indo-Iranian break-up: 2723–1790 BC (mean 2241 BC).
Finally, it is not excluded that the Italic-Germanic-Celtic unity should be associated with the Bell Beaker culture (cf. similarly Mallory (2013), where it is proposed to connect the Bell Beaker culture, due to its chronological depth, not with Proto-Celtic per se, but generally with an ancestor of “North-West” Indo-European languages). Recent ancient DNA studies have confirmed that the spread of this culture in most places (with the significant exception of Iberia) was associated with a real migration rather than simply with a dissemination of a “cultural package” (Olalde et al. 2018). The latest dates for this culture extend into the first centuries of the second millennium B.C. (Czebreszuk 2004b: 482). Cf. the Bayesian dates for the Italic-Germanic-Celtic break-up: 2655–1537 BC (mean 2080 BC).

That said, we avoid any further discussion on the IE homeland issue in this paper, since it would be a serious simplification to assume direct one-to-one correlations between our current results and the geographic distribution of early Proto-Indo-European speakers”.
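
The comparison made in the quoted passage can be laid out as a simple containment check. This is only a sketch: the Bayesian intervals and means are taken from the quote, while the archaeological ranges are one reading of its wording ("29th century BC" as 2900-2801 BC, "beginning of the 18th century BC" as 1800 BC, "first centuries of the second millennium B.C." as 2000-1800 BC) and should be treated as assumptions. The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalBC:
    """A date range in years BC; 'older' is the larger number."""
    older: int
    younger: int

    def contains(self, other: "IntervalBC") -> bool:
        return self.older >= other.older and self.younger <= other.younger


def gap_from_mean(mean_bc: int, arch: IntervalBC) -> int:
    """Years between the Bayesian mean and the archaeological range (0 if inside)."""
    if mean_bc > arch.older:
        return mean_bc - arch.older
    if mean_bc < arch.younger:
        return arch.younger - mean_bc
    return 0


# (Bayesian interval, Bayesian mean, archaeological range as assumed above)
COMPARISONS = {
    "Tocharian split-off vs rise of Afanasievo":
        (IntervalBC(3727, 2262), 3011, IntervalBC(2900, 2801)),
    "Proto-Indo-Iranian break-up vs end of Sintashta":
        (IntervalBC(2044, 1458), 1740, IntervalBC(1800, 1800)),
    "Balto-Slavic-Indo-Iranian break-up vs end of Corded Ware":
        (IntervalBC(2723, 1790), 2241, IntervalBC(2300, 2100)),
    "Italic-Germanic-Celtic break-up vs latest Bell Beaker":
        (IntervalBC(2655, 1537), 2080, IntervalBC(2000, 1800)),
}


def summarize(comparisons):
    """Map each label to (archaeological range inside interval?, gap from mean)."""
    return {
        label: (bayes.contains(arch), gap_from_mean(mean, arch))
        for label, (bayes, mean, arch) in comparisons.items()
    }


if __name__ == "__main__":
    for label, (inside, gap) in summarize(COMPARISONS).items():
        print(f"{label}: inside interval = {inside}, gap from mean = {gap} years")
```

With intervals this wide (roughly 600 to 1,500 years), containment is a weak test, which fits the authors' own wording that the dates "do not contradict" the archaeology rather than confirm it.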

Drago said...

What if the PIE tree was flat, as some have suggested? Spreading widely c. 2500-2000 BC

JuanRivera said...

But then, Olalde et al (2019) showed steppe ancestry spreading into Iberia in that timeframe.

Davidski said...

Yeah, there was a Beaker migration into Iberia but not out of Iberia. Those linguists confused the two things.

Samuel Andrews said...

Slumbery, andronovo/sintashta was actually genetically very uniform.

Ric Hern said...

I rather see it as Proto-Yamnaya equals Proto-Indo-European. Where did R1a and R1b share a Culture before the Great Divide ? Khvalynsk currently looks like it for me, unless I missed something. I still wonder how many Sisters of PIE existed before the Yamnaya expansion... Were those Piedmont Eneolithic guys Proto-Indo-European or just one of a network of Sisters ?

Ric Hern said...

I get more and more the feeling that Women were the glue for the formation of PIE...

Davidski said...


Mesolithic, Neolithic and Eneolithic steppe populations show a wide variety of Y-chromosome haplogroups, including I2, Q1a, R1a and R1b.

So all of these markers may have, and probably did, exist in the closely knit PIE community. But once the break up of PIE started on the steppe, and different male clans expanded within and out of the steppe, certain Y-haplogroups became fixed, or almost fixed, in frequency in specific PIE offshoots.

This is what confuses a lot of people into thinking that males from post early PIE communities weren't related to each other and must have come from very different populations.

Ric Hern said...

@ Davidski

Yes precisely what I think also, that is why Yamnaya does not really fit with PIE but was rather most likely a daughter...

Open Genomes said...

Was anyone able to successfully align the FASTQ files for the R1b1a1a2-M269s and determine if they are in any subclades?

What does the 1240k SNP array say for L23, PF7562, Z2103, and L51?

Ric Hern said...

MtDNA Haplogroup H2a1 is particularly interesting. In Khvalynsk, Samara, Sredny Stog II and Eneolithic Piedmont in both R1a and R1b Males...

Davidski said...

@Open Genomes

What does the 1240k SNP array say for L23, PF7562, Z2103, and L51?

I don't have it yet.

Ric Hern said...

MtDNA Haplogroup H15 in Yamnaya, Britain and Shirenzigou...

Slumbery said...

Samuel Andrews
"andronovo/sintashta was actually genetically very uniform."

No. The majority of the Andronovo samples belong to a rather uniform Sintashta derived line (that you remember correctly), but other populations are clearly present. Also it is safe to assume that they are underrepresented in the samples, because the studies focused on sites that are well identified as archaeologically Andronovo.
When Sintashta expanded while transforming into Andronovo they engulfed a huge territory and although there was significant colonization, other groups were not completely replaced.

I show you one example of such a population. It is not directly relevant to the currently discussed paper, just an example.

"sample": "RUS_Sintashta_MLBA:Average",
"fit": 1.7966,
"Yamnaya_RUS_Kalmykia": 69.17,
"POL_Globular_Amphora": 30,
"RUS_West_Siberia_N": 0.83,
"RUS_Shamanka_N": 0,
"TKM_Geoksiur_En": 0,

"sample": "RUS_Srubnaya_MLBA:Average",
"fit": 1.9524,
"Yamnaya_RUS_Kalmykia": 71.67,
"POL_Globular_Amphora": 27.5,
"RUS_West_Siberia_N": 0.83,
"RUS_Shamanka_N": 0,
"TKM_Geoksiur_En": 0,

"sample": "RUS_Srubnaya_o:Average",
"fit": 3.3061,
"Yamnaya_RUS_Kalmykia": 52.5,
"RUS_West_Siberia_N": 40.83,
"TKM_Geoksiur_En": 6.67,
"POL_Globular_Amphora": 0,
"RUS_Shamanka_N": 0,

"sample": "KAZ_Kairan_MLBA_o:Average",
"fit": 2.0918,
"Yamnaya_RUS_Kalmykia": 47.5,
"RUS_West_Siberia_N": 26.67,
"RUS_Shamanka_N": 12.5,
"POL_Globular_Amphora": 10,
"TKM_Geoksiur_En": 3.33,

"sample": "KAZ_Maitan_MLBA_Alakul_o:Average",
"fit": 1.6648,
"Yamnaya_RUS_Kalmykia": 47.5,
"RUS_West_Siberia_N": 33.33,
"POL_Globular_Amphora": 11.67,
"TKM_Geoksiur_En": 6.67,
"RUS_Shamanka_N": 0.83,

The point I would like to make with this series is that the Srubnaya outlier and the Maitan and Kairan outliers represent more or less the same population. A population that is basically Yamnaya + WSHG + some southern ancestry (reminiscent of Steppe Maykop, but with less southern ancestry). The Kairan and Maitan outliers have some Sintashta admixture on top of that, and Kairan even has some BHG, but the Srubnaya outlier shows that this population was still out there in pure (no western admixed) form in the middle of the second millennium BC, well into Andronovo times. (I also suspect that Potapovka had admixture from this population, but that is another story.)

Slumbery said...

I could as well add Sintashta outlier group 1 to the above list.

"sample": "RUS_Sintashta_MLBA_o1:Average",
"fit": 2.7319,
"RUS_West_Siberia_N": 56.67,
"Yamnaya_RUS_Kalmykia": 24.17,
"TKM_Geoksiur_En": 19.17,
"POL_Globular_Amphora": 0,
"RUS_Shamanka_N": 0,

They had more southern ancestry and roughly proportionally less Yamnaya ancestry than the later representatives, showing that the southern ancestry is connected to the Siberian side.
The narrative that comes to mind is that there was a southern-admixed "Siberian" population in the East Caspian that expanded significantly northward at some point and might be a source for earlier Steppe Maykop. Later they became Yamnaya-admixed (not necessarily literal Yamnaya, could have happened later) and this mixed population still existed in Andronovo. However, apparently they did not make it into the Iron Age.

But again, I do not think they were the only population under the umbrella of Andronovo that was not Sintashta derived. This is just an example that groups with alternative ancestries were out there and could have impacted Iron Age populations to some extent.
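
For anyone who wants to compare the outliers side by side, here is a minimal sketch that takes the percentages from the model outputs above, checks that each model sums to about 100, and rescales the three components of the proposed "base" population (Yamnaya, West Siberia N, Geoksiur) to their own total. The rescaling is a crude illustration and an assumption of this sketch, not part of the original modelling: any Sintashta-type admixture also carries Yamnaya-related ancestry, so the rescaled Yamnaya share is overstated for the admixed samples. The names MODELS, BASE, check_totals and base_profile are hypothetical.

```python
# Percentages copied from the model outputs listed above.
MODELS = {
    "RUS_Sintashta_MLBA": {
        "Yamnaya_RUS_Kalmykia": 69.17, "POL_Globular_Amphora": 30.0,
        "RUS_West_Siberia_N": 0.83, "RUS_Shamanka_N": 0.0, "TKM_Geoksiur_En": 0.0},
    "RUS_Srubnaya_MLBA": {
        "Yamnaya_RUS_Kalmykia": 71.67, "POL_Globular_Amphora": 27.5,
        "RUS_West_Siberia_N": 0.83, "RUS_Shamanka_N": 0.0, "TKM_Geoksiur_En": 0.0},
    "RUS_Srubnaya_o": {
        "Yamnaya_RUS_Kalmykia": 52.5, "POL_Globular_Amphora": 0.0,
        "RUS_West_Siberia_N": 40.83, "RUS_Shamanka_N": 0.0, "TKM_Geoksiur_En": 6.67},
    "KAZ_Kairan_MLBA_o": {
        "Yamnaya_RUS_Kalmykia": 47.5, "POL_Globular_Amphora": 10.0,
        "RUS_West_Siberia_N": 26.67, "RUS_Shamanka_N": 12.5, "TKM_Geoksiur_En": 3.33},
    "KAZ_Maitan_MLBA_Alakul_o": {
        "Yamnaya_RUS_Kalmykia": 47.5, "POL_Globular_Amphora": 11.67,
        "RUS_West_Siberia_N": 33.33, "RUS_Shamanka_N": 0.83, "TKM_Geoksiur_En": 6.67},
    "RUS_Sintashta_MLBA_o1": {
        "Yamnaya_RUS_Kalmykia": 24.17, "POL_Globular_Amphora": 0.0,
        "RUS_West_Siberia_N": 56.67, "RUS_Shamanka_N": 0.0, "TKM_Geoksiur_En": 19.17},
}

# Components of the proposed "Yamnaya + WSHG + southern" base population.
BASE = ("Yamnaya_RUS_Kalmykia", "RUS_West_Siberia_N", "TKM_Geoksiur_En")


def check_totals(models, tolerance=0.05):
    """True if every model's percentages add up to 100 within the tolerance."""
    return all(abs(sum(m.values()) - 100.0) <= tolerance for m in models.values())


def base_profile(model):
    """Rescale the BASE components so that they sum to 100 (rounded to 0.1)."""
    total = sum(model[name] for name in BASE)
    return {name: round(100.0 * model[name] / total, 1) for name in BASE}


if __name__ == "__main__":
    print("totals ok:", check_totals(MODELS))
    for sample, model in MODELS.items():
        print(sample, base_profile(model))
```

Run as a script, this prints one rescaled profile per sample, which makes it easier to see how close the Srubnaya, Kairan and Maitan outliers are to each other once the Globular Amphora and Shamanka shares are set aside.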

Samuel Andrews said...

One population founded Andronovo. The few outlier samples are immigrants from foreign populations.

Slumbery said...

@Samuel Andrews

"One population founded Andronovo."
That is exactly what I am saying, so what is your point? They still assimilated other groups later.

"The few outlier samples are immigrants from foreign populations."

A claim based on nothing. If anything, the Sintashta related founders were immigrants in a considerable part of Andronovo range even in Andronovo's own context and virtually in the entire range if we start from the beginning of Sintashta, as Sintashta was itself an intrusive population.
The population I just showed as an example was obviously native in the area (in the context of Sintashta/Andronovo, that is, they lived there before the arrival of Sintashta people).
Or where outside the Andronovo range could a population like that have lived, according to you?

Matt said...


It seems that they really do not like the way that all the conventional trees using lexicostatistics (core lexicon cognacy) reconstruct the IE tree with an early branch of Indo-Iranian prior to the split of Balto-Slavic from Celtic-Germanic-Italic, and that they probably also do not like deeper splits of Greek-Armenian from Core-IE and would prefer a simultaneous split.
And so they have some grumbles that the datasets assembled for these trees must be wrong and re-write the dataset to produce the answer which they were looking for in the first place.

This includes an obvious feature of using reconstructed wordlists* rather than cognacy data directly from modern and attested languages.

There are some other changes as well which they include to resolve issues discussed by Chang et al. Thinning their dataset to remove certain items or mark them differently. Though these seem to have minimal impact on the trees (other than the placement of Albanian), judging by the supplement, where they explicitly show the stages of analysis, excerpts:

I think that a critical response from people who produced previous trees will probably focus on whether the use of reconstructed wordlists is mere circularity or not.

*"Our analysis is based on the 110-item Swadesh wordlist as it is currently defined in the Global Lexicostatistical Database project (G. Starostin 2011). Since the subgroups (such as Slavic, Germanic, Albanian and so on) within the IE family are uncontroversial, for each subgroup we prefer to use a reconstructed proto-language wordlist (e.g., Proto-Germanic for the Germanic group) or, where available, a list for an attested language which can be roughly equated with a proto-language (e.g., Vedic for the Indo-Aryan group).

We believe that the use of reconstructed wordlists for intermediate proto-languages instead of the more traditional approach that requires a great number of wordlists from modern languages is preferable for two reasons.
(This seems to be that their 110-item wordlist cannot handle very large numbers of languages within their method (I think other analyses do use a 200-207 item list, which probably does scale within the parameters they give) and that more languages create more noise and homoplasy).

Open Genomes said...

The two R1b individuals M15-1 and M012 are R1b2-PH155 and probably R-PH200:

Tianshan M15-1 and M012 Y-DNA R SNPs

R-PH155 on the YFull tree

This sounds more like Botai / WSHG than Yamnaya. Interestingly a Tien Shan Hun and an "Asiatic" East Germanic Gepid were R-PH155.

What does this imply about the relationship of R1b and Proto-Indo-European?

Slumbery said...

@Open Genomes

I do not think there is much of an implication, as PH155 split off way before PIE and it is not as if R1b as a whole was assumed to be IE-specific. So PH155 has nothing to do with the PIE question.
All of these samples had Siberian (or "Siberian") ancestry that can explain the presence of this paternal line.

Davidski said...

These guys were para-Turks, not Indo-Europeans.

Joe Flood said...

Yet another paper where the Y haplogroups have been mis-attributed. Is anyone at all doing quality control on this stuff?

Running through the BAMs we find that both the R1bs are actually R-PH200 (R1b1b2) a very rare basal subclade so far found only in Turkey, Bahrain and Iraq. Presumably arriving with Turkic migrations.

It's hard to say when the R1b would have penetrated that far east. This eastwards Uralic R1b expansion probably took place during the Sintashta period (2100 BC) mixed with majority R1a.

Slumbery said...

@Joe Flood

"It's hard to say when the R1b would have penetrated that far east. This eastwards Uralic R1b expansion probably took place during the Sintashta period (2100 BC) mixed with majority R1a."

The earliest known Siberian/Central Asian R1b is from 3300 BC, sample BOT14. There are also the Afanasievo samples, which predate Sintashta and are pretty far east. Afanasievo was actually east of the Tian Shan (but also way north of course, and they are totally the wrong branch).

However, I would not be sure that this is a west-to-east penetration in the case of PH200. R itself is from Central Asia / Siberia. It is possible that a branch of R1b that separated from the dominant branch circa 20,000 years ago is actually native to Siberia / Central Asia. Or, even if it formed in Europe, it might very well have moved to Asia already in the Paleolithic.

Ric Hern said...

So did R-L754 split around the Urals, with his Brother R-PH155 staying behind in Siberia ?

So looking at this and the previously postulated "R1b from Western Asia" it is understandable to see where the confusion originated. Turkics carrying these clades into Western Asia.

I always wondered why a Mammoth Hunting Tribe adapted to extreme cold suddenly preferred to migrate through a Central Asian Desert rather than migrating and staying at plus/minus the same latitude with plus/minus the same type of ecosystem...