Thursday, July 19, 2018

An early Iranian, obviously

Today, the part of Asia between the Caspian Sea and the Altai Mountains, known as Turan, is largely a Turkic-speaking region. But during the Iron Age it was dominated by Iranian speakers. Throughout this period it was the home of a goodly number of attested and inferred early Iranic peoples, such as the Airya, Dahae, Kangju, Massagetae, Saka and Sogdians.

Indeed, the early Iron Age Yaz II archaeological culture, located in southwestern Turan, is generally classified as an Iranian culture, and even posited to have been the Airyanem Vaejah, aka home of the Iranians, from ancient Avestan literature.

That's not to say that Iranian speakers weren't present in this part of the world much earlier. They probably were, and it's likely that we already have their genomes (see here). But the point I'm making is that Turan can't be reliably claimed to have been an Iranian realm until the Iron Age.

Ergo, any ancient DNA samples from Turan dating to the Iron Age, as opposed to, say, the Bronze Age, are very likely to be those of early Iranian speakers. One such sample is Turkmenistan_IA DA382 from Damgaard et al. 2018.

Below is a screen cap of the "time map", with the slider moved to 847 BC, showing the location of the burial site where the remains of DA382 were excavated. The site is marked with the Z93 label because DA382 belongs to the Eastern European-derived Y-chromosome haplogroup R1a-Z93. Interestingly, his burial was located in close proximity to archaeological sites associated with the above mentioned and contemporaneous Yaz II culture.

DA382 didn't get much of a run in the Damgaard et al. paper, and little wonder because the authors also analyzed 73 other ancient samples. So let's take a close look at this individual's genetic structure to see whether there's anything particularly Iranian about it.

Damgaard et al. did mention that DA382 was partly of Middle to Late Bronze Age (MLBA) steppe origin. And indeed, my own mixture models using qpAdm confirm this finding with very consistent results and strong statistical fits. Here are a couple of two-way examples...

Namazga_CA 0.528±0.040
Srubnaya_MLBA 0.472±0.040
taildiff: 0.561330411
Full output

Dzharkutan1_BA 0.530±0.037
Srubnaya_MLBA 0.470±0.037
taildiff: 0.485083377
Full output
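
For anyone who wants to sanity-check figures like these, here's a minimal Python sketch. It isn't part of the qpAdm run; it just takes the coefficients and standard errors reported above and works out rough 95% intervals. The `Source` class, the `summarize` helper, the normal approximation (estimate ± 1.96 standard errors) and the 0.05 cutoff for reading taildiff as a pass/fail p-value are all my own illustrative assumptions, not anything qpAdm itself outputs.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Source:
    """One reference population in a mixture model (hypothetical helper type)."""
    name: str
    proportion: float  # estimated ancestry proportion
    std_err: float     # reported standard error

    def interval(self, z: float = 1.96) -> Tuple[float, float]:
        # Rough 95% interval under a normal approximation (an assumption).
        return (self.proportion - z * self.std_err,
                self.proportion + z * self.std_err)


def summarize(sources: List[Source], taildiff: float,
              threshold: float = 0.05) -> Dict[str, object]:
    """Basic checks on a reported model; the 0.05 threshold is a convention."""
    total = sum(s.proportion for s in sources)
    return {
        "sums_to_one": abs(total - 1.0) < 1e-6,
        "not_rejected": taildiff > threshold,
        "intervals": {s.name: s.interval() for s in sources},
    }


# Values copied from the first two-way model above.
model = [
    Source("Namazga_CA", 0.528, 0.040),
    Source("Srubnaya_MLBA", 0.472, 0.040),
]
summary = summarize(model, taildiff=0.561330411)

for name, (low, high) in summary["intervals"].items():
    print(f"{name}: {low:.3f} to {high:.3f}")
```

On this reading, the Srubnaya_MLBA coefficient sits many standard errors away from zero, so the steppe-related signal isn't an artifact of a wide error bar.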

The fact that the MLBA Srubnaya samples from the Pontic-Caspian steppe can be used to model DA382's ancestry (alongside Bronze and Copper Age populations from Turan) with such ease shouldn't be surprising, considering that he belongs to R1a-Z93, which is the dominant Y-haplogroup in the Srubnaya and all other closely related MLBA steppe peoples.

Now, Srubnaya is generally regarded to be the proto-Iranian archaeological culture. How awesome is that considering those qpAdm fits? But, admittedly, this is just an inference, even if a robust one, based on genetic, archaeological and historical linguistics data. So apart from the fact that DA382 comes from Iron Age Turan, an Iranian-speaking realm, is there any other way to link him directly to Iranians?

Well, he's very similar in terms of overall genetic structure to some of the least Turkic-admixed Iranian speakers still living in Turan, and might well be ancestral to them.

For instance, below is a Principal Component Analysis (PCA) featuring a wide range of ancient and present-day West Eurasian samples. Note that, in line with the qpAdm models, DA382 clusters about half-way between the populations of the MLBA steppe and pre-Kurgan expansion Turan, and amongst present-day Yaghnobi and Pamiri Tajiks. In fact, he clusters at the apex of a southeast > northwest cline made up of Tajiks that appears to be pulling towards Europeans.

Needless to say, Tajiks, especially Pamiri Tajiks, also pack a lot of Srubnaya-related ancestry. I've talked about this plenty of times at this blog (for instance, see here). But what happens if I try to model Pamiri and Yaghnobi Tajiks with DA382?

Turkmenistan_IA 0.892±0.023
Han 0.108±0.023
taildiff: 0.794566182
Full output

Wow, it's an awesome fit! My mind's made up: DA382 was probably an Iranian speaker and, more specifically, an Eastern Iranian speaker. Who disagrees and why? Feel free to let me know in the comments (unless you're banned, in which case, f*ck off).

See also...

A Mycenaean and an Iron Age Iranian walk into a bar...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

New PCA featuring Botai horse tamers, Hun and Saka warriors, and many more...


  1. Why have you denoted him as Zarafshan_IA? He is not found anywhere close to Zarafshan.

  2. It's unclear whether or not tribal federations like the Massagetae and Dahae (much like the Yuezhi) were originally Iranian, especially since we don't know much about their original languages before they dissolved or migrated away from the Huns etc. (westward, into Parthia and Bactria respectively). Lazy historians tend to refer to them all as Scythians/Sakas (although these are really specific groups from the western steppe and Sogdia respectively). There is onomastic evidence linking both the Dahae and Massagetae to Balkan IE peoples with similar names, i.e. the Dacians and Getae. Sure, the Parni (part of the Dahae) became Iranian-speaking when they took over Parthia (and founded the Arsacid Empire). Likewise, the Massagetae are widely believed to have become the Zaths/Jatts in South Asia (also adopting Indo-Iranian ways). Note that the Jatts are also reputed to have the highest levels of Western Steppe/EHG in South Asia.

  3. Cool how the small shadows are lit up on the timescale map.
    Great new tool. I'd be interested to see something similar with just Beakers. Probably a useless shotgun blast, but who knows, visual depiction is pretty powerful.

  4. @rozenfag

    Why have you denoted him as Zarafshan_IA? He is not found anywhere close to Zarafshan.

    That's what he's called in the dataset. And the paper says that his burial is in the Zarafshan Mountains. See page 5...

    Interestingly, for two northern groups, the only tested model we could not reject included the Iron Age (~900 to 200 BCE) individual (Turkmenistan_IA) from the Zarafshan Mountains and the Xiongnu_IA as sources.

    I guess I can rename him to Turkmenistan_IA though.

  5. @Leucuuo

    There is onomastic evidence linking both the Dahae and Massagetae to Balkan IE peoples with similar names, i.e. the Dacians and Getae.

    Yeah, I know, but that's not very plausible.

    There's a reason why this sample is so close to Tajiks, and why the Scythians, Saka and Eastern Iranians carry so much R1a-Z93 (even as much as 80% in some populations).

  6. I'm not sure it's really meaningful to describe someone living in 850BC as "Eastern Iranian". As I understand it, there are very serious questions whether "Eastern Iranian" ever actually existed at all - it seems more likely that 'northeastern' (or 'old northwestern') and 'southeastern' are two different branches off from proto-iranian, and are simply grouped by shared archaisms or early areal influence, and really just indicate two varieties of 'not western' (just as 'northwestern' iranian is now recognised as a catchall for everything that isn't southwestern (i.e. Persian or a very close relative)).

    There would then also be questions whether 'eastern iranian', if it did exist, would have existed so early, hundreds of years before the era when it's usually spoken of.

    The only early languages attested that aren't persian are older and younger avestan (which don't appear particularly related). They're traditionally called 'eastern', but don't actually share notable characteristics with the other 'eastern' languages. [Similarly, the "Pamiri" languages don't actually appear to form a family, and are just an areal grouping of Iranian languages]. The language of the Yaz II culture may well have been, for example, Younger Avestan; or Younger Avestan may be the only attested language from an originally much larger, now otherwise unknown branch of Iranian ("Central Iranian").

    It's also worth pointing out that the area has historically been quite mobile in its affiliations; many polities have been multilingual and multiethnic, with ruling classes temporarily imposing their languages onto the confederation as a whole - so there isn't necessarily going to be a clear equation between someone's genes and their language, certainly to the level of being able to pick out specifically eastern vs western iranian affiliation. [as a modern example: note that Tajik itself is Western Iranian, and a relatively recent import to the region]

    I think the more parsimonious solution is simply to say that the reason this person looks Pamiri is just that they represent an archaic Iranian population, and that the Pamiris represent a relatively undiluted modern development of that population, while other Iranian populations have higher levels of admixture from either substrate (eg pre-steppe Iranian) or superstrate (eg Turkic) populations.

  7. @Wastrel

    Pamiri is an Eastern Iranian language. And even though Tajiks now speak Western Iranian, they're originally an Eastern Iranian ethnic group.

    Also, I've seen Yaz II linked to the precursors of Iranians and/or Eastern Iranians, so it depends who you believe.

    But a migration of Z93-rich proto-Eastern Iranians from that region to the east does look plausible based on all the evidence.

  8. Large amounts of R1a Z93 Sintashtaish ancestry in early historical Iranians would be the nail in the coffin. It's crazy that after Narasimhan 2018 lots of posters here still opposed a Steppe origin for Indo-Iranian languages. They won't have much of an argument against a Steppe origin of both Iranian and Aryan languages if it is confirmed the first known Iranian-speakers were of 50% Sintashta-like origin.

  9. @Samuel Andrews

    It's crazy that after Narasimhan 2018 lots of posters here still opposed a Steppe origin for Indo-Iranian languages.

    I wouldn't say lots. Rather, just a few crackpots who were very loud and annoying about their denial, while the silent majority who accepted the facts said nothing.

    There are many people who come here to read the comments, and they understand all of the issues quite well, but they never post anything themselves.

  10. Off topic but interesting study: - "Their main finding is that where wealth is highly concentrated, few men are wealthy enough to afford more than a single wife, and the very wealthy, while often polygynous, do not take on wives in proportion to their wealth ...

    But the new study showed that the impediments to polygyny even among the very rich went way beyond the need to share the male's wealth among rival wives. "This is what really surprised us," said Borgerhoff Mulder. "Our estimates show that a very rich man with four wives will have far fewer kids than two men with two wives and with the same total wealth divided between them."

    Supports that the Steppe_EMBA cultures with an expansion of y-chromosomes probably did not have this driven by an unusual concentration and stratification of wealth in their elite relative to the male elites of other societies (supports archaeology on Yamnaya and Corded Ware, etc.). Egalitarian wealth distribution supporting milder but more pervasive polygyny, if anything, could be more favorable to y expansion.

    That helps push away from wealthy elites to the ideas of competition between relatively "closed" male groups as important in expansion (Tian Chen Zeng and colleagues' paper as academic first.) Even if I'm not so sure about the idea that these were actually bound culturally by knowledge of having a shared patrilineal ancestor (e.g. Corded Ware Culture in Germany's common y ancestor was about 2500 years before the samples; that's about 83 patrilineal generations at minimum - did they really have this long, almost 100 generation cultural memory?).

  11. @Davidski, off topic again, if you have a chance, and the right sets in your datasets, would you mind running the following d-stats for me?:

    I want to have a quick go at estimating population replacement in Iberia in terms of the proximate source of steppe ancestry into Iberia, which is probably represented by either Beaker_Southern_France, Beaker_Netherlands or Beaker_Central_Europe.

    So using f4 ratios of difference on stats distinguishing between Iberia related Neolithic populations and GAC_Poland, for a "direct ratio", and between different WHG related Euro HG (El Miron, WHG, Iron_Gates, Narva), for indirect ratio.

    Ideally would like to use only GAC_Poland and not all GAC as they have some different properties in D-stats, but if you only have the GAC as a whole, then n/p, I'll see if that works.

    Global_25 is great, but I think this method could give a finer point estimate of exact replacement. Then it will be easier to talk about the R1b skew of recent northern Beaker origin in a more meaningful way than the raw steppe vs neolithic % estimate tells us.

  12. @ Matt

    Maybe wealth was still important when negotiating a brideprice for their sons' brides? You could have had many sons, but that helps very little if you cannot help them to pay the brideprice..?

  13. @ Matt

    In a society where women mostly produced the tradable goods, wouldn't it be more profitable to have many wives ?

  14. @Wastrel. Avestan does share more similarities with eastern Iranian languages, when compared to languages of the western Iranian group such as Persian, or Kurdish.

  15. @Matt

    Here are those D-stats. But note that the Iberian Beakers aren't divided up into "steppe" and "no_steppe" samples, because I only did that for the few individuals who made it into the Global25 datasheet. Apart from that, it should all be as requested, so let me know if I missed anything.

  16. Davidski: "Pamiri" isn't a language. It's an ethnic and geographical grouping of at least four different families of languages*. They can't be shown to be particularly closely related within Iranian. Indeed, since Wakhi seems close to the old Saka languages, and the others don't, and Saka has been suggested to be the first split from Iranian, it's likely that "Pamiri" straddles the deepest divide in the family.

    On prehistory and Avestan, the EI says:
    "The term “Eastern Iranian” is of limited utility with reference to the Old Iranian period. Of the two attested Old Iranian languages, Old Persian is a typical representative of South-Western Iranian. Avestan geographically belongs to the eastern Iranian area, but shows few if any of the distinctive characteristics of the later Eastern Iranian languages... One may suppose that at this stage the Iranian languages had only recently begun to diverge from one another, and that only the more peripheral languages had already developed markedly individual traits. Among such peripheral languages one would include Old Persian in the extreme south-west, which displays [soundchange]... At the opposite end of the Iranian world the languages of the nomadic Saka peoples of the Eurasian steppes show a different but equally distinctive development of [etc]". Avestan it characterises as "central Iranian", lacking the features of either old persian or saka.

    On Eastern Iranian itself: "Within Eastern Iranian one can establish several sub-groups of languages which are particularly closely related to one another... However, it does not seem possible to regard the Eastern Iranian group as a whole—even excluding Parachi and Ormuri—as a genetic grouping... if one reconstructs “proto-Eastern Iranian” in such a way as to account for all the features of the group, it proves to be identical to the “common Iranian” reconstructible as the ancestor of the whole Iranian family..."

    Basically, the most likely development is that a large dialect-continuum of Iranian languages broke down, because the languages in the middle of the continuum were increasingly areally influenced by Old Persian in the southwest and by Saka (Khotanese, etc) in the east. The "Eastern" languages are simply those more influenced by Saka; the "Southwestern" languages are those that underwent the distinctive changes of Old Persian and its close relatives; and the "northwestern" languages are those that aren't closely related to Old Persian but were heavily influenced by it.

    So at the time of this person, there probably was no such thing as an "Eastern Iranian" linguistic, ethnic or genetic grouping, descending from a common 'proto-Eastern Iranian'. Rather, this person was similar to those who would LATER BECOME Eastern Iranian under the influence of the Saka kingdoms. And the main genetic feature of those people is not a common innovation, but an archaism: any random Iranian who wasn't genetically influenced by pre-steppe Iran (or by later Turkish or Tocharian influence) would look like a modern Pamiri, because originally all Iranians were like modern Pamiris.

    [not that I think this matters, but for the record: the Tajik ethnogenesis was with the invasion of the region by Persians. The word 'Tajik' originally meant 'Arab' (because the Persians were Muslim and often fought with or under Arab soldiers). Modern Tajiks, as I understand it, are the result of interbreeding between the original Persian colonists and the local Eastern Iranians, and then further interbreeding between that mixed population and a later layer of Turkic settlers. This is why the Tajiks in your chart form a continuum pulling away from Europe, and why this older sample is at the European (i.e. non-Old Iran, non-Turk) end of that continuum.]

    *the encyclopaedia iranica lists twelve Pamiri languages, one of them extinct, though some of these in turn have their own dialects, so the precise number is debatable.


  17. By the way, I'm confused: on the one hand you associate these remains with the Yaz culture, but on the other hand the paper says they're from the Zarafshan mountains. Is my geography just really terrible, or aren't the Zarafshan mountains of northern Tajikistan a very long way (relatively speaking!) from the Yaz culture of southern Turkmenistan/northeastern Iran? Is there another Zarafshan, or was Yaz just way more massive than previously thought?

    [and this conversation about Yaz, btw, feels much weirder having watched The Americans...]

  18. @Wastrel

    The Pamir languages are an areal group of the Eastern Iranian languages, spoken by numerous people in the Pamir Mountains, primarily along the Panj River and its tributaries.

    The Ethnologue lists the Pamir languages along with Pashto as Southeastern Iranian, however, according to Encyclopedia Iranica, the Pamir languages and Pashto belong to the North-Eastern Iranian branch.

    And it's possible that there's an error in the paper in regards to DA382 being from the Zarafshan Mountains. But he's definitely from Turkmenistan and from an area close to former Namazga Copper Age sites, so definitely also close to Yaz.

  19. @Davidski, cheers for those; I didn't formulate them right for an f4 ratio I think, but I have tried to use them for estimates, and in case anyone is interested in how they correlate:

  20. @Davidski

    "Srubnaya is generally regarded to be the proto-Iranian archaeological culture."

    I thought that was Andronovo?

  21. @Philippe

    There are all sorts of theories, but Andronovo is usually described as Indo-Iranian and/or Indo-Aryan, while Srubnaya as Proto-Iranian or some type of Iranian. For instance, see here...

    Srubna Culture "Encyclopedia of Indo-European Culture"

    For the time being I'd just say that Sintashta, Andronovo and Srubnaya were early Indo-Iranian. More ancient samples, especially hi-res ancient Y-chromosome sequences, will sort it out soon.

  22. Maybe the Poltavka Culture mixed with some Abashevo ? Abashevo certainly looks like the Corded Ware link ?

  23. @ Davidski

    Can the Srubnaya likeness of DA382 be traced back to the Eastward expansion of Corded Ware (Abashevo ?) ? Or is he just related to a population (Poltavka ?) who didn't migrate out of the Pontic Caspian Steppe when Corded Ware did ?

  24. @Ric

    It's the same story as with Sintashta. These populations were either derived from Corded Ware or Sredni Stog, or both.

    The mystery of the Sintashta people

  25. @Wastrel

    You make some useful points, but I'm pretty much convinced that DA382 was an Iranian speaker.

    On a related note, Mikkel updated his map with a new feature. Have a look at the screen cap above to see what I mean, and indeed check out the map again.


