
Wednesday, June 28, 2017

Iron Age nomads vs Bronze Age herders: Sarmatians and Yamnaya in qpGraph

If we are to take these qpGraph models fairly literally, and I don't see why not, since they're very tight fits overall, then the early Sarmatians from what is now Pokrovka, Russia, derived almost 80% of their ancestry from Yamnaya or a very closely related group, while the rest of their ancestry came from a source that was a ~50/50 mixture between Han-like East Asians and a population closely related to Neolithic and Chalcolithic farmers from what is now Iran.
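As a rough illustration of how those proportions combine, the arithmetic can be sketched as below. The numbers are the rounded figures from the paragraph above ("almost 80%" and "~50/50"), not actual qpGraph output, and the variable names are made up for this sketch.

```python
# Rounded proportions taken from the post's wording, not from qpGraph output.
yamnaya_share = 0.80                     # "almost 80%" Yamnaya-related
other_share = 1.0 - yamnaya_share        # remainder from the mixed source
han_within_other = 0.50                  # "~50/50" split inside that source
iran_within_other = 1.0 - han_within_other

# Multiply along each path to get the share in the Sarmatians as a whole.
ancestry = {
    "Yamnaya-related": yamnaya_share,
    "Han-like": other_share * han_within_other,
    "Iran_N/ChL-related": other_share * iran_within_other,
}

for source, share in ancestry.items():
    print(f"{source}: {share:.0%}")
```

Under these rounded assumptions, the Han-like and Iran-related streams each come out at roughly a tenth of the total.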

This topology also tests for the same Iran Neolithic/Chalcolithic-related input in Yamnaya, and I think it's very important to note that the relevant admixture edges (D7 to D9) are 0%, which suggests that Yamnaya did not harbor this type of ancestry. I didn't bother testing for East Asian-related admixture in Yamnaya in the same way, because it never shows such signals in other analyses.

The clearly more complex ancestry of the Sarmatians is probably best explained by the fact that they belonged to a true nomadic warrior culture, and indeed one that managed to spread its influence across vast stretches of Eurasia. So these two Sarmatian individuals, both from Unterländer et al. 2017, may have had recent ancestors from as far afield as Central Asia and Siberia. On the other hand, Yamnaya was a semi-nomadic pastoralist population, and although also highly mobile and prone to long-distance expansions, probably not as mobile as the Sarmatians.

Update 30/06/2017: Interestingly, adding Siberian Upper Paleolithic genome MA1 to the topology in the main model slightly shifts the admixture coefficients for Yamnaya, resulting in an arguably more accurate outcome in which it's modeled as a 50/50 mixture of populations closely related to Eastern European and Caucasus Mesolithic foragers.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts


Joe said...

Is the East Asian component in Sarmatian groups like the Alans the reason North Europeans have a minor East Asian element in their ancestry?

Slumbery said...

Is the East Asian element really East Asian proper or some kind of Siberian? I assume that the populations used in this calculation are not sufficient to tell these apart. (Or you tell me if I am wrong.)
If I remember correctly the Scythian article said Nganasan is also a good reference for the "East Asian" element.

Anthro Survey said...

So it seems Sarmatians were similar to Volga Tatars in terms of deep ancestry.

Also, what is the margin of error in these magnitudes? From what I've seen Yamnaya usually tends to pull slightly more EHG than CHG.

Matt said...

@Slumbery, you could always add Ulchi, etc. to the graph to see which is a better fit, though IIRC from the Indian case, adding multiple ENA to the same graph with Yamnaya, EHG, etc. tends to mean you need additional admix edges between at least one of the ENA and Yamnaya or EHG to get Z<4.

@Davidski, I guess you could equally run this graph with admixture from D7 and D10 into D11, but there's no reason to do it either way, and maybe this looks cleaner.

Davidski said...

Han is more or less the best East Asian reference in this type of model, but Nganasan also works well. So it's hard to say where these Sarmatians got their East Asian-related admixture; maybe both from Central Asia and Siberia?

It's likely that some of this admixture spread deeper into Europe, but the signal of East Eurasian-related ancestry in North Europeans that was found a few years ago probably isn't strongly linked to Sarmatian expansions, but rather to the relationship between ANE-rich ancient groups, mostly EHG, and East Asians. This relationship is yet to be fully resolved.

I don't know what the error margins are for the mixture edges in this model, but keep in mind that in such fine scale analyses the results are usually somewhat fuzzy, because some of the reference samples, like CHG and EHG, carry ancestry from essentially the same sources, so the mixture coefficients depend on how each method interprets reality.

The key finding here, I think, is that these Sarmatians readily take Iran_N/Iran_ChL mixture, while Yamnaya refuses to do so. And this is in line with uniparental marker data, which shows South Caspian-specific markers, like mtDNA U7, in Iron Age steppe nomads, but not in Yamnaya or any Bronze Age steppe groups.

Matt said...

@Davidski, along the lines of the models where you modelled Pathan vs Kalash with Andronovo vs Yamnaya, what is the fit like in this model with Andronovo, Sintashta, etc. in place of Yamnaya? Fail or work? I ask because there was controversy over these Iron Age samples when they were published, and it was questioned which steppe populations were their most probable ancestors.

Matt said...

Of course, one difference between qpGraph and qpAdm is that qpGraph cannot really use information well from populations that are not well fit to the tree (qpAdm has other problems, of course).

Davidski said...


The same model with Andronovo and Iran_N just fails (Z = 3.115), but works well with Andronovo and Iran_ChL (Z = 1.827). I think this has something to do with the topology being more realistic for Andronovo when Iran_ChL is there, because it has Anatolia_N related ancestry which Iran_N lacks.

Sintashta is a fail (Z >5 and >4).

So I think it's entirely possible that the Sarmatians and Scythians had some ancestry from Andronovo, and thus minor Anatolia_N/EEF input, but not from Sintashta, which has too much EEF.
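The pass/fail calls in the comment above can be written out as a small check. This is only a sketch: it assumes the quoted Z values are each model's worst-fitting statistic and that a cutoff of |Z| < 3 is being used, which is a common convention but is not stated in the comment. The Sintashta runs are left out because only bounds (>5 and >4) were given.

```python
# Z values as quoted in the comment; the cutoff of 3 is an assumed convention.
Z_CUTOFF = 3.0

worst_z = {
    "Andronovo + Iran_N": 3.115,
    "Andronovo + Iran_ChL": 1.827,
}


def passes(z, cutoff=Z_CUTOFF):
    """Return True when the worst |Z| stays under the assumed cutoff."""
    return abs(z) < cutoff


verdicts = {model: passes(z) for model, z in worst_z.items()}

for model, ok in verdicts.items():
    print(f"{model}: {'works' if ok else 'fails'}")
```

With this cutoff the Iran_N version only narrowly misses, which matches the "just fails" wording.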

Seinundzeit said...

This is somewhat off-topic, but at the same time it's still somewhat relevant...

Personally, I am 100% certain that the levels of steppe ancestry in South Central Asia and South Asia will turn out to be very substantial. There is no way around that.

But, the percentages might be lower than 50% for people like the Kalash, if they have some extra ANE (and since Steppe_EMBA is the only place for their extra ANE to go, we see the Kalasha end up being as Steppe_EMBA-admixed as Lithuanians).

So, the notion that Iran_Neo-related ancestry in South Asia might be more ANE-rich (in comparison to the current Zagrosian samples) is one which deserves testing/exploration.

With that in mind, what sort of topology can one create to allow extra ANE in Central/South Asia, via an Iran_Neolithic-related population (with the Kalasha as test, since they don't show any signs of Iran_Chalc)?

Samuel Andrews said...

I have a strong feeling South Asian mHGs U2a, U2b, U2c are all mostly from their Iran Neo ancestry.

MaxT said...

"It's likely that some of this admixture spread deeper into Europe, but the signal of East Eurasian-related ancestry in North Europeans that was found a few years ago probably isn't strongly linked to Sarmatian expansions, but rather to the relationship between ANE-rich ancient groups, mostly EHG, and East Asians. This relationship is yet to be fully resolved."

I expect a lot of surprises in coming years in terms of modeling deep relationships for Eurasians.

Davidski said...


By the way, I should have mentioned in the post that these Sarmatians can also be modeled excellently as Yamnaya + extra CHG + Han. The highest Z score is basically the same as in the Iran_N model, and the number of Z scores over 1 is only a little higher (I think 4 more). Here's the graph.

But I'd say the Sarmatians must be receptive to Iran_N/ChL admixture for a good reason, because they are and Yamnaya isn't, and the best explanation for this is that they have some of this type of ancestry. This also fits with the presence of Iran_N/ChL mtDNA lineages on the Iron Age steppe.


That's a tough one, but I'll try a few different runs and see what happens. Also, maybe Matt can think of something? The graph file for the main South Asia model is here.

Rob said...

Dave, nice layout there.

The history of the west Eurasian steppe is certainly becoming clear.

1) a distinct pulse of Caucasian admixture, bringing with it a package of pastoralism and metallurgy, which set the foundation of steppe populations for millennia.

2) These Pokrovka individuals probably obtained their Iran_Neol & ENA in concert, from Central Asia. Several Sarmatian cultural elements are due to interactions with south-central Asian communities.

Seinundzeit said...


I really look forward to seeing what you find.

On a related note, any news on possible South Asian/Central Asian aDNA papers?

Honestly, I really hope that they don't make us wait another year.

Matt said...

@ Davidski, thanks for all that.

@ Sein, worth a try; questions for trying that are:

1) where does the ANE edge happen? it may matter where the edge happens, e.g. should we put ANE in the model to i) South Asian (Onge related) ancestry, ii) Caspian, iii) Steppe_EMBA?

2) how does ANE fit on the topology to begin with?
For a first try, a modification of a model parallel fitting Kalash, Brahmin_India, Gond: Strip out the Brahmin_India, Gond lines if needed (last two lines in the pops and admix edges).

I've gone with the ANE input feeding into "ANI" (Steppe_EMBA+Caspian related side) and taken the safest way to fit ANE as to be diverging off after split from C (East Eurasian) but before split of EHG from Caspian-Caucasian pre-Basal (which seems to usually work better).

Maybe we can see what this comes back with and then shift populations and edges around based on outliers.

@ Davidski, also off topic, another graph around Yamnaya, CHG, Iran_N: This one is *not* meant to be a literal model, just another way to look at how the non-Basal parts of Yamnaya, CHG, Iran_N relate to each other. (Maybe resolve some of those small outliers whereby CHG is closer to Yamnaya, Barcin_N, WHG related than Iran_N to a greater extent than the different Basal between CHG and Iran_N).

EastPole said...

“The key finding here, I think, is that these Sarmatians readily take Iran_N/Iran_ChL mixture, while Yamnaya refuses to do so. And this is in line with uniparental marker data, which shows South Caspian-specific markers, like mtDNA U7, in Iron Age steppe nomads, but not in Yamnaya or any Bronze Age steppe groups.”

This suggests that indeed there were migrations from Iran to the steppe and that these Sarmatians could be Iranians. But it is limited to the nomads east of Don river, we still don’t know what those western Scythians around Dnieper were like. Those western Scythians were assimilated by Slavs and we don’t see much Iranian or East Asian influences in our genes.
This also suggests that early Iranians, those who went west to Iran after Indo-Iranians in Central Asia split into Iranians and Indo-Aryans, were more Yamnaya like. But it means also that early Indo-Iranians were more Yamnaya like.
Since Indo-Iranians were more Yamnaya like, we cannot link their origin to Sintashta. Sintashta and probably Andronovo were then more likely closer to Balto-Slavic or Slavic.
There are Slavic influences in Tocharian and plenty of Slavic borrowings in Turkic languages for example in Kirgizstan and this probably comes from Sintashta/Andronovo.

The relations between Balto-Slavic and Indo-Iranian languages become more complex in such situation because in addition to common PIE origin there were also probably Balto-Slavic or Slavic influences on Indo-Iranians which most likely came from Sintashta/Andronovo. We see it especially in religion. So Slavic Sintashta/Andronovo would also explain the similarity between Indo-Iranian and Greek which were much later than PIE.

The history of IE languages should be explained by both family-tree model and wave model:

The spread of Slavic linguistic influences in wave like manner could be explained by the spread of religious influences. Both Greeks and Indo-Iranians got them from the North and they were much later than PIE.

Rob said...

Reality check
The western Scythians were long gone by the time of the Slavic expansion

EastPole said...

“Reality check
The western Scythians were long gone by the time of the Slavic expansion “

“Early Slavs (Proto-Slavs) marginalized Eastern Iranian dialects in Eastern Europe as they assimilated and absorbed the Iranian ethnic groups in the region”

Rob said...

Oh okay if Wikipedia says so....
But let's pretend I'm wrong for a second - and given that we know that late Scythians and Sarmatians were Z93 and some G and J picked up too - can you highlight for us which subclades of Z93, G and J show a recent expansion correlating with Proto-Slavs?

Bulan Goldstein said...


"There are Slavic influences in Tocharian and plenty of Slavic borrowings in Turkic languages for example in Kirgizstan and this probably comes from Sintashta/Andronovo."

Oh my Tengri! Slavic borrowings in Turkic 4000 years ago!? Can you show us a source and some examples?

Davidski said...

@Matt & Sein

And your second model returned this:

fatalx: input wts for node pSteppe_EMBA_HG not set correctly

EastPole said...


“Oh my Tengri! Slavic borrowings in Turkic 4000 years ago!? Can you show us a source and some examples?”

Yes Tocharian, Turkic borrowed from Slavic:

“As the Tocharians began to move east, the last contacts that they had with other Indo-Europeans (before their much later interaction with the Indians and Iranians) was with the Slavs, resulting in some Slavic influence in the lexicon, but no impact on the essential structure of the language.”

As for Turkic languages, a well-known example is Slavic ‘koza’ “goat”:

But it is not only this: derived from Slavic ‘koza’ “goat”, “sheepskin coat” is ‘kozan’ in Polabian Slavic dialects and ‘kazan’ in Kyrgyz and Kazakh languages.

I have a book about it, but it is in Polish:

Prof. Moszyński, in “Pierwotny zasięg języka prasłowiańskiego”, writes about 30 such cases, and probably many more existed. You cannot explain it in any other way than language contact.

Words which have Slavic etymology and are widely used but isolated and without etymology in Turkic are surely borrowed from Slavic to Turkic. Moreover we have now genetic evidence that migrations of certain plants and animals were from west to east.

Bulan Goldstein said...

@Everyone who believes Scythians, Sarmatians, Alans, etc. Iron Age nomadic steppe people were Iranians.

1. Scythians, Sarmatians, Alans, etc. all were kurgan people who had a distinctive kurgan burial tradition; actually the kurgan was their main attribute. Can anyone show us Iranian or Indo-Aryan kurgans in Iran and India? Also please show us an Iranian or Indo-Aryan religion behind the kurgan burial tradition. Kurgan culture and people without kurgans and a religion to support kurgan burial?!

2. If Scythians, Sarmatians, Alans, etc. were Iranians, then this means at least two millennia of Iranian nomadic domination from Central Europe to China. After two millennia of domination, Iranian people should have left innumerable Iranian borrowings in Germanic, Baltic, Slavic, etc. languages. But there are no Iranian borrowings or Iranisms in those languages, only a bunch of controversial words. Can anyone explain the absence of Iranian borrowings in Germanic, Baltic, Slavic, etc. languages after two millennia (forget the cultural, religious, etc. influences)?!

Davidski said...

Trzciniec Culture was a Kurgan Culture; it was derived from Corded Ware and its direct descendants are Balts and Slavs.

How do we know this? Because we have Trzciniec genomes, and they share a lot of recent genetic drift with Balts and Slavs, and carry R1a-M417.

Who else carries a lot of R1a-M417? Duh, other people derived from steppe Kurgan groups including Indo-Aryans.

On the other hand, Turks are a very recent feature of much of Eurasia. Turkic speakers who carry R1a-M417 are the descendants of language shifters, usually Iranics who carried R1a-Z93.

Deal with it.

Anonymous said...

1. David answered.

2. The last of "Iranians" (which just means Aryan in Iranian) in Europe were the Alans. They left stuff there, just search it.
Also, in the late Byzantine period, they were crushed into almost nothingness (they were set up to be the powerhouse of Eastern Europe; maybe people there wouldn't be speaking Slavic languages these days if the Byzantines hadn't favored the Slavs over the Alans).
Also, you say "Central Europe", but the end for Scythians was only the Ukrainian Steppe, same for the Sarmatians. Alans went all over Europe, but never stayed anywhere.

About Turks, they were clearly Mongols (literally, Mongolian, from Mongolia, not just "Mongoloid") who mixed with Eastern Scythians.
You can see that in today's Mongols, who have residuals from the time - probably all the mixed population separated and went westward (after the Chinese chased them out) and little remained in actual Mongols.

Bulan Goldstein said...


There is nothing to deal with, David. I appreciate your work but not the interpretation. I totally agree with you that "Ancient herders from the Pontic-Caspian steppe crashed into India" but it is not relevant to the language. There are too many questions that should be answered, like the absence of kurgans, lifestyle, cultural materials, etc. in South Asia.

Would you answer the questions which I have asked a comment before about Iranian borrowings, IA and Iranian kurgans and an Iranian and IA religion to support kurgan burial tradition?

The problem is that, like most people, you believe that the Turkic homeland is in the Far East and that Turkic languages are relatives of Mongolian and Tungusic. The Altaic Language Family is an obsolete theory. The Turkic urheimat is the Ponto-Caspian steppes. Actually I have always thought that the Ponto-Caspian steppes should be the second homeland of Turkic people. The first homeland should be Eastern Anatolia, Caucasus, Zagros, and maybe the area up to Pamir. At least we now know Yamnaya had this component (CHG).

If we follow your logic "Who else carries a lot of R1a-M417?" then who carried the nomadic steppe lifestyle, kurgan burial tradition and a religion that supports kurgan burial tradition up to the 20th century? Who roamed from Siberia to India, from the Pacific to the Atlantic? Slavs? Iranians? Or Turkic people?

Also, can you explain how Turkic languages could give cultural borrowings to Proto-Slavic if Turkic people were in Mongolia in the Bronze Age? The example below means Turkic languages existed in Eastern Europe and gave cultural borrowings to Proto-Slavic, the ancestor language of all Slavic languages before they split.


From a Turkic language; compare Proto-Turkic *tilmaç (Turkish dilmaç (“interpreter”), Uzbek tilmoch), from Proto-Turkic *til (“tongue”) (Turkish dil) + Proto-Turkic *-maç (“noun suffix”).

*tъlmačь m
1. interpreter


East Slavic:
Old East Slavic: тълмачь (tŭlmačĭ)
Russian: толмач (tolmač)

South Slavic:
Bulgarian: тълма́ч (tǎlmáč)
Old Serbo-Croatian: (tlĭmačĭ)
Cyrillic: ту̀ма̄ч

Latin: tùmāč

Slovene: tolmáč (tonal orthography)

West Slavic:
Czech: tlumač
Kashubian: tłómôcz
Polish: tłumacz
Slovak: tlmač
Upper Sorbian: tołmač

Non-Slavic languages:
Hungarian: tolmács
Luxembourgish: Dolmetscher
Old High German: tolmetsche
German: Dolmetscher
Romanian: tălmaci

Bulan Goldstein said...


"But it is not only this, derived from Slavic ‘koza’ “goat” “sheepskin coat” is ‘kozan’ in Polabian Slavic dilalects and ‘kazan’ in Kyrgyz and Kazakh languages ."

Dear EastPole, you confuse who gave borrowings to whom. The word is common to all Turkic languages but not common to Indo-European languages. This should show the direction of borrowing. Also please read below about initial k-.



Possibly from a Turkic language. Possibly related to Albanian kedh (“kid”), which would then render the Proto-Indo-European reconstruction as *koǵʰeh₂.

In older sources it is usually grouped with PIE *h₂eǵós (“he-goat”) but initial *k- does not match, or with a set of Germanic cognates such as Old English hæcen (“kid”) and Middle Dutch hoeke, which is precluded by Winter's law.

*kozà f
1. goat


East Slavic:
Belarusian: каза́ (kazá)
Russian: коза́ (kozá)
→ Latvian: kaza
→ Veps: koza
Ukrainian: коза́ (kozá)
South Slavic:
Old Church Slavonic:

Cyrillic: коза (koza)
Glagolitic: ⰽⱁⰸⰰ (koza)

Bulgarian: коза́ (kozá)
Macedonian: ко́за (kóza)

Cyrillic: ко̀за
Latin: kòza

Slovene: kóza (tonal orthography)
West Slavic:
Czech: koza
Kashubian: kòza
Polabian: ťözâ
Polish: koza
Slovak: koza
Lower Sorbian: koza
Upper Sorbian: koza

Rob said...

The chronology of historical Iranians in east & SEE begins in the Iron Age, if we leave aside the Cimmerians for a moment. The Scythian kingdoms end c. 3rd century BC, classically explained as due to the arrival of their 'cousins' the Sarmatians, whose homeland was thought to be in the middle Volga and Ural region, and whose earliest cultural manifestation is the so-called Sauromatian culture (but I'm not sure this is true given the aDNA). These Scythians and Sarmatians were obviously Iranic speakers - we have coins with the name Farzoi as one of their chiefs.

The other 'invaders' were from the west - Galatians and groups in contact with La Tene & Pomoranian cultures. However this process is pretty sketchy in the literary descriptions from Greeks. The named historical Sarmatians (Roxolani, Iazygi, etc) emerge somewhat later, making their mark by the turn of the Era.
The primordial Goths arrived on the Black Sea steppe c. 3rd century AD, absorbing many Sarmatian groups - clearly evident in the archaeological record, and in the fact that they continued to be attested.
The real big change came with the arrival of the Huns, which radically altered the centuries of ad-migration and cohabitation of various groups which had been occurring on the westernmost steppe & surrounds. However, even after the end of the Hun Empire, a few small Sarmatian tribes are attested around Banat, etc. Around the mid-5th century is the last time they are attested in SEE, although they obviously continued to live east of the Azov-Don, and we hear of those in Iberia & north Africa.

The main phase of the (most recent) Slavic expansion occurred about 100 years after that, so there is a whole century of interpositional history to be accounted for, which is connected with the 'Roman Goths'. I think in the past the impact of Iranic on the development might have been overestimated, because the main language into which early proto-Slavic speakers (500-650) expanded was (East) Germanic. On the other hand, when the Slavs (i.e. Rus-Slavs) expanded onto the steppe, they encountered Oghur-Turkic speakers, although as mentioned above, Sarmatian-Alani tribes continued to live in parts of the Saltovo culture, esp. toward the Caucasus.

Anonymous said...

Explain Japanese and Korean (I know it's debatable, but it's the best accepted) if Turkic originated in Eastern Anatolia/Caucasus/Zagros and why no one ever spoke about it in the ancient world and it left no trace of itself apart from the Altaic Steppe until the Turkic expansion.

Now, just to complement my earlier thoughts:
Transitional populations from Early to Late Steppe to Siberians, Mongolians and Eskimo are:
Uzbek, Turkmen, Uygur, Hazara, Mansi, Even, Selkup, Aleut, Tlingit, Tubalar, Altaian, Kyrgyz, Dolgan and Yakut.

They create a fine continuum in Eurasia and are live evidence of contact between Steppe peoples and Altaic ones.
The original Turks would probably be something like the Tubalar people, that went on and mixed with people from the Indo-European Central Asia. Later admixture in Central Asians would be from the Mongol expansion of the Khan.

About the whole language thing, I've already answered you. Iranians never penetrated nor resided in Europe; past the Ukrainian Steppe there was a wall to them, except for the Alans, but they too never settled, only roamed (and made alliances for that; they were subservient to the Goths, for instance).

About Slavic, Baltic, and Iranian connections to Turkic, it is pretty obvious that those people interacted with each other for thousands of years, there's no secret here - and that DOESN'T mean that they were together in origin, but that the Steppe people really roamed a lot and far away.

Ebizur said...

"Explain Japanese and Korean"

Oh, expletive...

Most words shared between Turkic and Mongolic languages are too similar to be anything but historical (first to second millennium CE) loanwords. There may be a few older loanwords in the mix, but they are not great in number in any case, and they do not prove a genetic relationship between the Turkic and the Mongolic languages.

Personally, I have found many more promising potential cognates between the Uralic and the Turkic languages.

On the other hand, I think the case for a genetic relationship between the Mongolic languages and the Tungusic languages is strong, but the modern languages seem to be complex mosaics of elements inherited from various ancestral dialects, many of which have not survived in whole form. It will be very difficult to disentangle the threads and prove the hypothesis of a common origin of the Mongolic and Tungusic languages beyond a reasonable doubt.

The Korean language probably has interacted with some of those ancient para-Mongolic and para-Tungusic dialects, and it does contain some patently obvious Mongolic loanwords (these probably dating back only to the Yuan era), but I do not think that a genetic relationship between Korean and my hypothetical proto-Mongolic-Tungusic will ever be demonstrated.

As for Japanese, there is nothing particularly similar between it and any of the other languages mentioned. It just happens to be another language with SOV syntax and the usual concomitant morphology. There are plenty of other languages in the vicinity and elsewhere that share those same vague typological similarities. Well, for the sake of completeness, it should be mentioned that Japanese is phonologically not very dissimilar to some Tungusic languages, but I do not find a common origin to be probable.

Anonymous said...

To add to what Rob said:
"given that we know that late Scythians and sarmatians were Z93 and some G and J picked up too"

The Scythians may have had more Z93 than Sarmatians. Among Sarmatians G seems to have cropped up more often in the studies so far, though some J and Z94 were found too. Maybe this means Sarmatians had far more Iranian and Caucasus admixture. (Though Anthrogenica's current thread on ancient genomes from Poland occasionally refers to this as "South Central Asian" admixture.)

"Sarmatians", wikipedia December 2016‎,


"In a study conducted in 2014 by Gennady Afanasiev et al. on bone fragments from 10 Alanic burials on the Don River, DNA could be extracted from a total of 7. Four of them turned out to belong to yDNA Haplogroup G2 and six of them possessed mtDNA haplogroup I.[23]

In 2015, the Institute of Archaeology in Moscow conducted research on various Sarmato-Alan and Saltovo-Mayaki culture Kurgan burials. In these analyses, the two Alan samples from the 4th to 6th century AD turned out to belong to yDNA haplogroups G2a-P15 and R1a-z94, while two of the three Sarmatian samples from the 2nd to 3rd century AD were found to belong to yDNA haplogroup J1-M267 while one belonged to R1a.[24] Three Saltovo-Mayaki samples from the 8th to 9th century AD turned out to have yDNA corresponding to haplogroups G, J2a-M410 and R1a-z94.[25]"

Anonymous said...

The emphasized statement has been removed from the wikipedia page, logs show that a need for clarification was cited. The use of STR may make it out of date, but the referenced work of Gennady Afanasiev 2014 states on p.3 (p.314),

"Исследование мужской линии проводнлосъ путем аналиэа 23 микросателлитных локусов (STR) Y-хромосомы человека. Определение гаплогруппы женской линии осушествлялосъ изучением гипервариабелъных регионов мтДНК. задача состояла в выявлении нуклеотидных последователъностей гипервариабелъных регионов HVS-1, HVS-2, HVS-3 мтДНК с последуюшим определением гаплогруппы индивидуумов и предсказаний путей миграции. В итоге, в 6 образцах была обнаружена мужская гаплогруппа G2 и в 6 образцах - женская гаплогруппа I."

With Google translate, there's no need for clarification,

"The study of the male line was carried out by analyzing 23 microsatellite loci (STR) of the human Y chromosome. The definition of female line haplogroup was carried out by studying the hypervariable regions of mtDNA. The task was to identify the nucleotide sequences of the hypervariable regions HVS-1, HVS-2, HVS-3 mtDNA, followed by the determination of the haplogroup of individuals and the predictions of migration routes. As a result, in 6 samples a male haplogroup G2 was found and in 6 samples a female haplogroup I was found."

So that's 6 x G2 in Sarmatian burials in Gennady Afanasiev's work, which is actually 2 more occurrences of this Hg than the line in wikipedia which was citing the paper.

Then the rest of the genetics section in wikipedia mentions G2a-P15, R1a-z94, 2 x J1-M267, R1a, G, J2a-M410 and R1a-z94.

The Unterlaender paper found 2 x R1a (one was Z93), 1 x R1b and 1 x Q1a. But since only one of them, the R1b, was labelled Sarmatian, perhaps that means the others were Scythian.
Supplementary material from Unterländer et al 2017, p.71-72,

"Individual I0563 (Pazyryk) belonged to the Z93 clade 45 which is frequent in Central Asia 45, 46 and was also recorded in Bronze Age individuals from Mongolia 47 and the Sintashta culture from Samara 33. Individual I0577 (Aldy Bel) also belonged to haplogroup R1a1a1b but could not be determined more downstream. Individual I0575 (Sarmatian) belonged to haplogroup R1b1a2a2, and was thus related to the dominant Y-chromosome lineage of the Yamnaya (Pit Grave) males from Samara 37 (~3000BCE). Individual IS2 belonged to haplogroup Q1a which was also found in the Eneolithic period in Samara 33 in Europe but is most commonly found in present-day people from Siberia and the Americas 48 (Supplementary Table 22)."

The other of the two labelled as Sarmatian on p.53 was female. So one Sarmatian R1b from the 5th-2nd century BC.

Total Sarmatian Y Hg: 8 x G (6+2), 4 x R (3 R1a, 1 R1b), 3 x J (one of which is J2a).

9 of the 15 Sarmatians so far don't have R1/J1 steppe Y markers, but G and J2.
Not just in terms of mtDNA but also Y lineages, there looks like a significant founding impact from the Caucasus and Iran in forging the Sarmatians. And if the J1 in the 2 Sarmatians were Near Eastern clades rather than mediated via the bronze age steppes where an instance of some J1 was found, then that could mean a Near Eastern origin for an even greater majority of the Sarmatian Y lineages sampled so far.
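The tally above can be checked with a few lines of Python. This is just a sketch of the arithmetic: the counts are copied from the comment as given (6 G2 from Afanasiev plus 2 G from the other studies, 3 R1a, 1 R1b, 2 J1, 1 J2a), not re-derived from the papers, and whether every sample is properly labelled Sarmatian is taken on the commenter's word.

```python
from collections import Counter

# Counts as tallied in the comment, not independently verified.
y_hg = Counter({
    "G": 6 + 2,   # 6 x G2 (Afanasiev 2014) plus 2 further G samples
    "R1a": 3,
    "R1b": 1,
    "J1": 2,      # the two J1-M267 samples
    "J2a": 1,
})

total = sum(y_hg.values())
# Samples that are neither R1 nor J1, i.e. G and J2a.
non_r1_j1 = y_hg["G"] + y_hg["J2a"]

print(f"total samples: {total}")
print(f"G + J2a: {non_r1_j1} ({non_r1_j1 / total:.0%})")
```

So the "9 of the 15" figure is the 8 G samples plus the single J2a.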

Bulan Goldstein said...


Please answer the questions if you want to discuss. And please no personal opinion and thoughts.

You have written "Explain Japanese and Korean". Explain what?! Japanese, Korean, Mongolian and Tungusic have nothing to do with Turkic languages except heavy Turkic influence since the Bronze Age (also some re-borrowings and borrowings from Mongolian into some Turkic languages after the Genghisid Empire).

Let's check the cardinal numbers of so-called Altaic languages:

Proto-Mongolic - Proto-Tungusic - Old Japanese - Korean - Proto-Turkic

1 *nikēn - ämün - pitö - hana - *bir
2 *koxar - zhör - puta - dul - *ėki
3 *gurbān - ilan - mi - set - *üč
4 *dörbēn - dügün - yö - net - *tȫrt
5 *tabūn - tuñga - itu - daseot - *bēš
6 *jirguxān - ñöngün - mu - yeoseot - *altï
7 *doluxān - nadan - nana - ilgop - *yeti
8 *na(y)imān - zhapkun - ya - yeodeol - *sekiz
9 *yersǖn - xüjägün - kökönö - ahop - *tokuz
10 *xarbān - zhuwan - töwo - yeol - *on

Here is your Altaic Language Family. Check it yourself please and see if there is any similarity.

You have written that "Now, just to complement my earlier thoughts: Transitional populations from Early to Late Steppe to Siberians, Mongolians and Eskimo are: Uzbek, Turkmen, Uygur, Hazara, Mansi, Even, Selkup, Aleut, Tlingit, Tubalar, Altaian, Kyrgyz, Dolgan and Yakut."

What are you talking about? Evenks?!, Selkup?!, Aleut?! What the?! Just check the cardinal numbers above. And please no personal opinion! BTW Yakuts called themselves Sakha and they migrated to Yakutia in 13th century A.D. What continuum?!

Hector said...

Ragerage sounds exactly like a Turkic version of Davidski, concentrating instead on linguistics he knows nothing about.

There just is no way in hell that Turkic has an origin in Anatolia. Ask linguists. Like you know... how would you explain the numerous Chinese loan words into Old Turkic like ...
"As is known, the Chinese loanwords are among the oldest borrowings in

Your imminent outburst - accusing scholars (even Turkish ones like the author of the above article) of hating Turks - will make you even closer to Davidski.

Unknown said...

"So that's 6 x G2 in Sarmatian burials in Gennady Afanasiev's work"

They are not Sarmatians, they are Alans of the Saltovo-Mayaki culture.

Seinundzeit said...


Hmm, I'm thinking of maybe adding an extra shot of ANE admixture to "Caspian" (in this case, D7)?



Interesting that it doesn't change the proportions in any serious way.

Anthro Survey said...

I hope this isn't a duplicate. Phone's acting up.

Could the increase of Western farmer alleles on the later steppe be the reason why Sarmatians take Iran_Chl much better than Yamna do?

Also, couldn't mt lineages like U7 also be attributed to the extra ~10% Iran_Neo presumably picked up in Ferghana or Transoxiana?

Matt said...

@Sein, though 2% extra ANE into "ANI" is not *necessarily* so small when you consider my best model (encompassing Ust Ishim, EHG, WHG, MA1, CHG, Barcin, Iberia_Chal, Onge, Ami, Iran_N, Yamnaya) would give ANE fractions: Yamnaya - 31%, CHG - 21%, Iran_N - 14%.

For a model like you discuss: (basically moved D11 node to between D5 and D7).

A few things about this model though:

- It's relatively hard to fit Barcin_N, WHG and Levant_N to these graphs, and these are the populations that show the most distinctive character against South Asia. There might be more information in including them than is here, but I found them hard to fit.

- The level of Basal Eurasian is pretty important. Note that there is no shared drift between D6 (previously "North Caucasian") and CHG, or D7 (previously "Caspian") and Iran_Neolithic.

So why does the model find the right answer to reject D7 into Steppe_EMBA and choose D6? The model constrains Basal Eurasian in D6 to match CHG and BEu in D7 to match Iran_N. The BEu level in D6 *can* work with EHG to fit Yamnaya's relative relatedness to Onge, MA-1, EHG and the West Asians, while the BEu level in D7 *can't*.

(Of course, it might actually improve the fit of the model to have additional nodes where CHG and D6 drift together and the same for Iran_N and D7. But the point is the BEu level itself is sufficient to choose the right population!).

Consider anything that might limit or constrain the BEu level to be too low.

All this said, it's a pretty good fitting model already when we come down to it!

@ Davidski: "And your second model returned this: fatalx: input wts for node pSteppe_EMBA_HG not set correctly"

Dammit. Labelling was slightly off, should be:

Seinundzeit said...


Now that you mention it, I wonder if you could go with your best model (the one encompassing Ust Ishim, EHG, WHG, MA1, CHG, Barcin, Iberia_Chal, Onge, Ami, Iran_N, Yamnaya), and add the Kalash to it (with the Kalash as a product of something Yamnaya-related, something Iran_N-related, and something ENA-related). On the ENA angle, Mark Lipson told me that "ASI" is intermediate between the Onge and Ami, with no clear/unambiguous preference, so that might be something to consider.

Probably a very tall order.

Regardless, you do make a very interesting point concerning the role of BEu levels in this sort of modelling.

So, perhaps adding extra ANE to a hypothetical Iran_N-related population for Central/South Asians isn't fruitful, since South Asians choose D7 precisely because it is highly BEu-admixed (so low ANE/EHG isn't an issue; quite the contrary).

Davidski said...

@Anthro Survey

"Could the increase of Western farmer alleles on the later steppe be the reason why Sarmatians take Iran_Chl much better than Yamna do?"

Yes, that's probably one of the reasons.

"Also, couldn't mt lineages like U7 also be attributed to the extra ~10% Iran_Neo presumably picked up in Ferghana or Transoxiana?"

Yes, from those parts, probably via the Inner Asian Corridor.

By the way, adding MA1 to the topology in the main Sarmatian model corrects the admixture coefficients for Yamnaya.

Seinundzeit said...


Thanks for running that; this adds some much needed clarification.

For the most part, I guess we can lay to rest the whole notion that "perhaps there's substantial extra local ANE admixture in Central and South Asia, which dates prior to the steppe-related influx, so the Yamnaya-related percentages are highly inflated".

I was beginning to gravitate towards that line of thinking myself, but I'm not seeing the extra ANE admixture make a huge difference in the Yamnaya-related proportions for Central/South Asians.

Matt said...

@ Davidski:

OK, so this model is aimed at allowing a free mix of a Steppe_HG, West_Asian and Basal_Eurasian ancestry into Yamnaya, CHG and Iran_N, as a check on which share the most HG ancestry and as a check on levels of Basal Eurasian.


Yamnaya: BEu - 20%, Steppe_HG - 78%, West_Asian - 2%. Ratio Steppe_HG:West_Asian 97:3.

Caucasus_Mesolithic: BEu - 52%, Steppe_HG - 46%, West_Asian - 2%. Ratio Steppe_HG:West_Asian 97:3.

Iran_N: BEu - 64%, Steppe_HG - 27%, West_Asian - 9%. Ratio Steppe_HG:West_Asian 74:26.
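For anyone who wants to check the Steppe_HG:West_Asian ratios, here is a minimal sketch that renormalizes the non-BEu ancestry using only the rounded percentages quoted above. The function name is hypothetical, and because the inputs are rounded, the output can differ by a point or so from the quoted ratios, which were presumably computed from the unrounded coefficients.

```python
def steppe_share_of_non_beu(steppe_hg: float, west_asian: float) -> float:
    """Steppe_HG as a percentage of the non-Basal-Eurasian ancestry."""
    return 100.0 * steppe_hg / (steppe_hg + west_asian)

# Rounded percentages from the comment above: (Steppe_HG, West_Asian).
rounded = {
    "Yamnaya": (78.0, 2.0),
    "Caucasus_Mesolithic": (46.0, 2.0),
    "Iran_N": (27.0, 9.0),
}

for pop, (steppe_hg, west_asian) in rounded.items():
    share = steppe_share_of_non_beu(steppe_hg, west_asian)
    print(f"{pop}: Steppe_HG:West_Asian = {share:.1f}:{100.0 - share:.1f}")
```

The point of the renormalization is that Yamnaya and CHG land close together, while Iran_N has a clearly larger West_Asian share of its non-BEu ancestry.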

So this is kind of a stylized model. What it does seem to me to show is

a) Slight decrease of Basal Eurasian ancestry relative to the constraint of modeling Yamnaya as CHG+EHG. In this model BEu in Yamnaya at 20% is 40% of CHG, while the general models give 50% CHG ancestry, which here would have implied 26%.

b) The non-BEu ancestry in CHG does not seem different from Yamnaya, while the non-BEu ancestry of Iran_N does seem different to both Yamnaya and CHG.
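
The arithmetic behind point a) can be sketched as follows. This is only an illustration from the rounded figures above, and it assumes (as the 26% figure implies) that the non-CHG half of Yamnaya in the CHG+EHG models carries no Basal Eurasian ancestry.

```python
# Rounded coefficients quoted in the comment above (percent).
beu_yamnaya = 20.0  # Basal Eurasian in Yamnaya in this model
beu_chg = 52.0      # Basal Eurasian in Caucasus_Mesolithic (CHG)

# Observed: Yamnaya's BEu expressed as a fraction of CHG's BEu.
observed_fraction = beu_yamnaya / beu_chg

# Expected BEu in Yamnaya if it were exactly 50% CHG and the other
# 50% (EHG-like) contributed no Basal Eurasian (assumption).
chg_share = 0.50
expected_beu = chg_share * beu_chg

print(f"Yamnaya BEu as a fraction of CHG BEu: {observed_fraction:.1%}")
print(f"BEu implied by 50% CHG ancestry: {expected_beu:.0f}%")
```

So the free-mix model gives roughly 40% of CHG's BEu level, against the 50% that a straight CHG+EHG model would imply.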

Considering b) we may get better fit on the model from modeling CHG as from the same D3 as Yamnaya rather than the same D2 as Iran_N.

Model with that modification to test if poss.:

Anthro Survey said...

Wow, that is pretty gnarly, in that adding MA1 to the topology only affects the proportions for Yamna, not the extra "southern" % overall.

Matt said...

@ Sein, yeah, this model is very consistent with the last.

Previous had 2% extra ANE (E2) direct into ANI, while this has 4% into the "Caspian" ancestor of ANI, which gets cut down by 50% Steppe to 2% in ANI. Agree that, from these models, extra ANE does not look able to substitute for Steppe ancestry.
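
The dilution step is just a product of admixture proportions along the path. A minimal sketch, using the figures quoted above and a hypothetical helper name:

```python
def diluted_share(share_in_source: float, source_fraction: float) -> float:
    """Share of a component in the target, given its share in a source
    population and the fraction of the target derived from that source."""
    return share_in_source * source_fraction

ane_into_caspian = 0.04         # extra ANE (E2) into the "Caspian" ancestor
caspian_fraction_in_ani = 0.50  # the other 50% of ANI is Steppe

ane_in_ani = diluted_share(ane_into_caspian, caspian_fraction_in_ani)
print(f"extra ANE reaching ANI: {ane_in_ani:.0%}")
```

That is why 4% into the "Caspian" ancestor and 2% directly into ANI amount to the same thing.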

As a last try, a model that has ANI having an ancestor that doesn't share the D5 (or proto-Caspian) node with Iran_N, and is instead a free mix of E2 (ANE), D2 (West Asian) and B (BEu):

This is a more free model in that E2 and B are not in any kind of tradeoff and both can increase or decrease in exchange for D2 if that improves fit.

Re: the complex model, I will have a look at how to add Kalash in and see if it's feasible. Even on that, the best Z scores were I think in the high range of 3<z<4, though this is driven by the complexity of admixture between the Mesolithic-Neolithic West Eurasian populations.

Re: whether Onge or Ami or intermediate works better for ASI, it is pretty unclear which is better. That may actually have some small effect on models, as there are Z scores on these trees for asymmetric relationships to EHG, Yamnaya, CHG, Iran_N by Onge and Ami.

Davidski said...


Matt said...

@ Davidski, cheers. Using Steppe_HG rather than West_Asian as non-BEu ancestor of CHG seems to result in a nice slight improvement to the model: (left vs right). However, it does introduce a few zero drift edges, so the model could be simplified further.

OTOH, Sein, the model I was working with mixing ANE, West_Asian and BEu to replace Caspian into South Asia works slightly less well:

Most of the highest stats are of the form f4 (And Kal X Eas) where X is Yamnaya or MA1. I guess that these tend to indicate that the non-Andamanese-like portion of Kalash is closer to EHG than these models suggest? This might be less of a problem if ASI was based on a sister to Dai / Ami instead? Or if instead of Caspian (D7) the model was allowed to use a ghost Steppe_HG+high Basal population instead (though that may not make a lot of geographic sense).
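
For readers unfamiliar with the statistic, f4(A, B; C, D) is the average over SNPs of (a - b)(c - d), where the lowercase letters are allele frequencies in the four populations. Here is a minimal sketch with made-up toy frequencies (not real data). It covers only the statistic itself, not the graph fitting or the Z scores that qpGraph reports for the gap between observed and fitted values.

```python
from statistics import mean

def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    if not (len(p_a) == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("all populations need the same number of SNPs")
    return mean([(a - b) * (c - d)
                 for a, b, c, d in zip(p_a, p_b, p_c, p_d)])

# Toy allele frequencies at four SNPs (hypothetical, for illustration).
pop_a = [0.10, 0.40, 0.80, 0.25]
pop_b = [0.20, 0.35, 0.70, 0.30]
pop_c = [0.15, 0.50, 0.90, 0.20]
pop_d = [0.15, 0.50, 0.90, 0.20]  # identical to C, so f4 is zero
pop_e = [0.30, 0.45, 0.60, 0.10]

print(f4(pop_a, pop_b, pop_c, pop_d))
print(f4(pop_a, pop_b, pop_c, pop_e))
```

A value near zero means the A-B difference is uncorrelated with the C-D difference. Swapping A and B flips the sign.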

Seinundzeit said...


(I'm sorry for taking so long to respond)

I think it would be very fruitful to see the Kalash integrated into your general working model for ancient Eurasians, so I really look forward to seeing your ideas on that.

With regard to those stats, I think your latter suggestion could help, but I also agree with you on the geographical implausibility.

Then again, perhaps the Kalash need something like the Srubnaya_outlier?

For what it's worth, she has much more ANE, and much less Near Eastern ancestry, in comparison to Yamnaya.

Also, Central and South Asians tend to prefer her to Yamnaya, in simple PCA-based nMonte analyses.

I think that could be of interest/worth exploring (using Srubnaya_outlier, rather than Yamnaya).

Anonymous said...

@Ir Pegasus
"They are not Sarmatian, they are Alan of Saltovo-Mayaki culture."

Alans tend to be listed as one of the Sarmatian tribes. Not just on the wikipedia page I referred to, which listed aDNA from Alan burials in its genetics section on the Sarmatians page.

For instance, refer to

"During the 3rd century B.C. new powerful Sarmatian tribes were formed - the Aorsi, the Roxolani, the Alans, and the Iazyges advanced westwards. The massive Sarmatian western expansion ultimately brought down Scythian rule in the North Black Sea area between the end of the 3rd century and early 2nd century B.C."

The page lists the various eras of the Iranic populations called Sarmatians. It seems the Alans took a prominent position among the Sarmatians during the "Late Sarmatian" stage:

Late Sarmatian - the Alan or Shipovskaya cultures, 2nd - 4th century A.D.
Late Sarmatian sites were first identified by P.D. Rau, who also associated the Late Sarmatian sites with the historical Alans. At the beginning of the 1st century A.D., the Alans had occupied lands in the northeast Azov Sea area, along the Don. Based on the archaeological material they were one of the Iranian-speaking nomadic tribes that began to enter the Sarmatian area between the middle of the 1st and the 2nd century A.D. The written sources suggest that from the second half of the 1st to 4th century A.D. the Alans had supremacy over the tribal union and created a powerful confederation of tribes. They continued to rule in the North Black Sea steppes until they were invaded by the Huns in the late 4th century A.D. Most of the Alans were absorbed by the Huns while a small number of them fled to the North Caucasus or went west and reached the shores of Gibraltar.

Over on the anthrogenica page about the ancient Polish genomes on preview, the discussion supposes that the Roman era samples are Goths as well as some Sarmatians among them.

Perhaps it's relevant that so far 2 instances of G have been found among the Iron Age samples from Poland.

Vadim Verenich had identified all the likely male samples among the Roman era (Late Iron Age) genomes from Poland that have been made available. Another member remarked that these amounted to 6 Iron Age males in all, all from Kowalewko.

Gravetto Danubian had reported someone's analysis of 3 of these Iron Age samples to be specific subclades of G2, I1 and I2.

Vadim Verenich reported that Yfull's Vladimir T came back with subclades of I1, I2 and 2 x G2 for 4 of the 6 Iron Age male samples from Poland, 3 of which overlap with and confirm what Gravetto Danubian had reported for them. (Both reports further found an I1 for the sole Medieval sample either's source had so far analysed.)

Matt said...

@ Sein, good point re: the Srubnaya outlier sample and South Central Asia.

First, I'm going to have a go and see if that technique of changing CHG ancestry from a separate West Asian population to coming from the Steppe_HG improves the general model.

@ Davidski, could you please run:

Modified 1:
Modified 2:

See if this can reduce some of the complexity that began emerging in the basic model with additional edges between Anatolian and Caucasus (that even still left CHG a little less related to Yamnaya than worked). Then see how to best fit Kalash in the model with Yamnaya, and if that works, try substituting in the Srubnaya_outlier for Yamnaya.

Davidski said...


Matt said...

Davidski - Cheers, looks like that modification puts Iran_N too far away from CHG+Yamnaya? This might work better (Steppe_HG into Iran_N as well):

Davidski said...


Matt said...

Hmm... Didn't work out either. Would you mind uploading the outliers file for that one?

Davidski said...


Matt said...

@ Davidski, couldn't really get much clearly out of the outliers, could you try running this model for me? Expect it would probably work worse, but just wondering if I've added too much complexity into the previous few models.

Davidski said...


Matt said...

@Davidski, actually not too bad. The worst is 3.995, *but* that seems to mainly relate to Ust_Ishim and MA1, and most of the other poorly fitting stats relate to Ust_Ishim and MA1. May relate to increased Neanderthal / archaic (though this tree doesn't model that explicitly to allow to check).

One more model to see if adding edges between Ust_Ishim and MA-1 and allowing some more common drift within WHG-UHG, Caucasus-Iran and the Barcin groups can get the stats down:

Davidski said...


Matt said...


Error in that graph (no connection between Basal_Iran and pCaspian), should have been:

vivek said...

@Davidski: "The key finding here, I think, is that these Sarmatians readily take Iran_N/Iran_ChL mixture, while Yamnaya refuses to do so. And this is in line with uniparental marker data, which shows South Caspian-specific markers, like mtDNA U7, in Iron Age steppe nomads, but not in Yamnaya or any Bronze Age steppe groups." Do you mean to say that U7 is associated with Scytho-Sarmatian cultures? My understanding is that it is widespread among South Asian peoples and in the Near East (where it probably originated), and that U7a is also seen among the Saudis. Just needed some clarification, thanks.

Davidski said...

Some Scytho-Sarmatian samples belong to U7, which suggests that their ancestors had contacts with peoples from the Caucasus and maybe even from south of the Caucasus and the Caspian Sea.

On the other hand, we don't see U7 in Yamnaya, which is the point I was making.

vivek said...

thanks David!