Tuesday, January 22, 2019

Hungarian Yamnaya predictions

About ten thousand ancient burial mounds still stand in the Carpathian Basin and surrounds. Many of these kurgans or tumuli show direct archeological links with the highly mobile Yamnaya culture of the Pontic-Caspian steppe to the east, and may have been built by Yamnaya migrants.

The testing of ancient DNA from the remains in these burials is important, because the results are likely to be informative about the profound genetic, cultural and linguistic changes that took place in what is now Hungary and the Balkans during the Copper and Bronze Ages.

But, alas, probably to the disappointment of some readers, my great prediction is that they're not going to be overly relevant to what happened at this time in Northern and Western Europe, and won't upend the current consensus that the Corded Ware culture (CWC) was the main vector for the spread of steppe ancestry and Indo-European languages into these parts of the continent.

The important thing to understand about the Yamnaya expansion into the Carpathian Basin is that it mostly stopped at the Tisza River. It's true that some archeological cultures west of the Tisza, such as Mako and Vucedol, do show fairly strong Yamnaya influences, but they can't be regarded as part of the Yamnaya colonization of Central Europe. Below is a slightly modified map from Heyd 2011 to illustrate my point.

In fact, four early Yamnaya period samples from one of the few kurgans west of the Tisza have already been published along with the Olalde et al. 2018 paper on the Bell Beaker culture (BBC). And one of these samples, labeled I5117, even represents a male buried in a Yamnaya-like pose. But this is how three of these individuals cluster in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation.

They sit firmly among other Copper Age and Neolithic samples from west of the Pontic-Caspian steppe. In other words, they show practically zero Yamnaya-related or steppe ancestry. Moreover, both of the males belong to Y-haplogroup G2a-L91, which is yet to be found in any samples from the Copper and Bronze Age steppe.

That's not to suggest, however, that the spread of the Yamnaya culture into the Carpathian Basin was a cultural process with little or no genetic impact. It probably wasn't, because five samples labeled "Yamnaya Hungary" were featured in the Wang et al. 2018 preprint on the genetic prehistory of the Greater Caucasus, and judging by their PCA and ADMIXTURE results (in the figure below from the said preprint) they're not very different from most other Yamnaya samples, such as those from far to the east in Kalmykia or Samara.

But the point I'm making is that not every one of the ten thousand kurgans and tumuli in the Carpathian Basin and surrounds was built by newcomers from the steppe, and, thus, my other prediction is that a fair proportion of the Yamnaya-related burial mounds, especially west of the Tisza, might contain remains without any steppe ancestry.

As far as I know, the Y-haplogroups of the aforementioned Yamnaya Hungary samples haven't yet been reported anywhere. But there are three ancients in the Mathieson et al. 2018 paper on the genetic prehistory of southeastern Europe that are probably highly informative about what we can expect in this context, because based on their archeology and ancestry, they're likely to be closely related to the Hungarian Yamnaya population. They are:

Balkans_BronzeAge I2165: Y-hg I2a-L699 3020-2895 calBCE

Vucedol_Croatia I3499: Y-hg R1b-Z2103 2884-2666 calBCE

Yamnaya_Bulgaria Bul4: Y-hg I2a-L699 3012-2900 calBCE

That's not much to work with, you might say. Perhaps, but keep in mind that R1b-Z2103 has now been reported in Yamnaya samples from Ciscaucasia, Kalmykia, and Samara, and I2a-L699 in a Yamnaya singleton from Kalmykia. Thus, a lot of outcomes are still possible, but some are more likely than others. So I'm expecting most Hungarian Yamnaya males to belong to R1b-Z2103 and I2a-L699, or perhaps even the other way around!

However, in line with my great prediction, I don't expect to see any R1a-M417 or R1b-L51, the two most common Y-haplogroups among present-day Europeans living north and west of the Balkans. And I think that if these markers do actually show up, then they'll be represented by nowadays rare or even extinct lineages that aren't very important to the peopling of Europe. Any thoughts? Feel free to share them in the comments.

See also...

Hungarian Yamnaya > Bell Beakers?

Single Grave > Bell Beakers

Dutch Beakers: like no other Beakers


Gabriel said...


And that, to me, confirms Bell Beaker comes from Corded Ware. Some have tried to explain that somehow R1b-P312 is from Yamnaya and R1b-U106 is from Corded Ware, but that makes no sense. Celtic and Germanic are obviously closely related not only genetically but also linguistically and this is most likely due to shared origins, most likely in Single Grave.

Drago said...

“This sample is dated to 2275-2032 calBCE, but considering the Scandinavian farmer ancestry, it seems like his ancestors were already in that area around 3000 BCE.”

Pfft C’mon. Not even CWC reached Scandinavia until 2700 BC.
And as we know CWC is all R1a

Davidski, You can’t just make up stories dude

Davidski said...


I'm not going to cite perfect dates every time I drop a comment here. You just have to try a little harder to put things into the right context. I'll explain this time, for the benefit of others...

TRB is dated to around 3,500 BCE and Corded Ware was in the southern Baltic region from as early as 2,900 BCE.

Ergo, if Nordic_LN RISE98 is a mixture of these groups, and he does seem to be, then like I said, his ancestors were in the area (and mixing) already around 3,000 BCE.

Apart from that, there's no way to explain the Globular Amphora and TRB ancestry in the Rhenish Beakers if their R1b ancestors weren't part of the massive migration from the steppe across the North European Plain that formed the Single Grave population.

These people deforested most of southern Scandinavia while trying to replicate a steppe environment for their herds. I'm pretty sure they didn't just evaporate when the Single Grave culture ended.

Drago said...

I agree , sorry for being cheeky
Anyhow, this 2200 BC-> period corresponds to what is called the “Flint Dagger period” of LN Scandinavia, I believe

Samuel Andrews said...

From the looks of it, everything northern European comes from Corded Ware. This includes Germanic & Y DNA I1.

Samuel Andrews said...

Really Corded Ware is the secondary PIE homeland not the PC Steppe. Balto-Slavic, Germanic, Celtic, and Indo Iranian all ultimately trace back to Corded Ware.

Gabriel said...

And Italic. Think about it. The Romans.

It's hard to deny it. With the existence of shared phenotypes from Portugal and Ireland to Mongolia and Pakistan as well as shared pigmentation traits and genotype most of all despite the R1b/R1a divide it's obvious that Corded Ware is the main vector of Indo-European languages into Europe and the world, with Sintashta being so similar to Beakers, and Trzciniec to both, in ways that can't be explained other than by common origins.

That sample with R1b-U106 is just a smoking gun for all of that being the case, due to his very local origins yet presence of a Beaker-related haplogroup. Coupled with the autosomal similarities between Beakers and Corded Ware, the conclusions couldn't be more obvious.

So sorry guys, but Bell Beakers and the Irish are probably Corded Ware people just like Slavs and Indo-Iranians.

On the other hand, maybe that means that I1 is from TRB? Maybe we already or finally know where it came from.

Ric Hern said...

@ Them Meee

Yes, as I see it Proto-Gaelic was the Poetic Shorthand version of the PIE Languages... The VSO word order most probably originated because of Poetry (Amergin the Bard...) and it looks like PIE words were shortened, something similar to what Danish does with North Germanic Languages...

Davidski said...

Sintashta is basically like an eastern twin of steppe-admixed Bell Beakers that just happened to be dominated by Z93 and decided to move to Asia instead of Western Europe.

Samuel Andrews said...

I1 is Indo-European. Its age estimates & star-like tree structure are exactly like those of R1b L151 & R1a Z283. It probably ultimately is from 'farmers' but it became popular with IEs.

Davidski said...


[1] "distance%=3.2295 / distance=0.032295"


"Progress_Eneolithic" 41.05
"Khvalynsk_Eneolithic" 36.2
"Vonyuchka_Eneolithic" 6.65
"Globular_Amphora_Poland" 5.75
"Ukraine_Mesolithic" 4.95
"CHG" 4.45
"Armenia_EBA" 0.95
"Abdul_Hosein_N" 0
"AfontovaGora3" 0
"Anatolia_BA" 0
"Anatolia_ChL" 0
"Anatolia_EBA" 0
"Armenia_ChL" 0
"Baden_LCA" 0
"Balaton_Lasinja_CA" 0
"Balkans_ChL" 0
"Balkans_N" 0
"Barcin_N" 0
"Blatterhole_HG" 0
"Blatterhole_MN" 0
"Boncuklu_N" 0
"Dzharkutan1_BA" 0
"EHG" 0
"Ganj_Dareh_N" 0
"Geoksiur_Eneolithic" 0
"Glazkovo_EBA" 0
"Globular_Amphora_Ukraine" 0
"Gonur1_BA" 0
"Gonur2_BA" 0
"Hajji_Firuz_ChL" 0
"Hotu_HG" 0
"Iberia_Central_CA" 0
"Iberia_ChL" 0
"Iberia_MN" 0
"Iberia_N" 0
"Iberia_Southwest_CA" 0
"Iron_Gates_HG" 0
"Koros_HG" 0
"Koros_N" 0
"LBK_N" 0
"LBK_N_Austria" 0
"Levant_BA_North" 0
"Levant_BA_South" 0
"Levant_ChL" 0
"Levant_N" 0
"Namazga_Eneolithic" 0
"Narva_Lithuania" 0
"Parkhai_Eneolithic" 0
"Protoboleraz_LCA" 0
"Sappali_Tepe_BA" 0
"Sarazm_Eneolithic" 0
"Seh_Gabi_ChL" 0
"Seh_Gabi_LN" 0
"Shahr_I_Sokhta_BA1" 0
"Shahr_I_Sokhta_BA2" 0
"Shahr_I_Sokhta_BA3" 0
"TDLN" 0
"Tepe_Anau_Eneolithic" 0
"Tepe_Hissar_ChL" 0
"Tepecik_Ciftlik_N" 0
"Tisza_LN" 0
"Tiszapolgar_ECA" 0
"Trypillia" 0
"Ukraine_N" 0
"Varna" 0
"Vinca_MN" 0
"West_Siberia_N" 0
"West_Siberia_N_low_res" 0
"Wezmeh_Cave_N" 0
"WHG" 0

Drago said...

@ Sam
''From the looks of it, everything northern European comes from Corded Ware. This includes Germanic & Y DNA I1.''

Woah there cowboy !

Drago said...

Blogger Dragos said...
Nice work Davidski.
Seems like there is some minor gene flow from Turan to Yamnaya ?

Caucasus Eneolithic Steppe
Samara_Eneolithic 50.5%
Sarazm_Eneolithic 28.1%
CHG 21.4%
Baden_LCA 0%
West_Siberia_N 0%
Tepecik_Ciftlik_N:Tep002 0%
Armenia_EBA 0%
Armenia_ChL 0%

Distance 3.2346%

(Dist. would improve with Majkop or Meshoko ? )

KcEnlStp 49.5%
Samara_Eneolithic:I0122 37.6%
Globular_Amphora 12.7%
Ukraine_N 0.2%
EHG 0%
Trypillia 0%
Baltic_HG:Spiginas4 0%

Distance 4.2944%
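For readers unfamiliar with the output format above, here is a minimal sketch of what this kind of model is doing, assuming the reported distance is the Euclidean distance between the target and the weighted average of the source coordinates. It is a simple greedy stand-in, not nMonte's actual algorithm, and the coordinates are made-up toy values rather than real Global25 data. The names `mixture_distance` and `fit_mixture` are hypothetical.

```python
import math

def mixture_distance(target, sources, weights):
    """Euclidean distance between target and the weighted average of sources."""
    dims = len(target)
    mix = [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))

def fit_mixture(target, sources, slots=200):
    """Greedy search over non-negative weights of the form k/slots that sum to 1."""
    n = len(sources)
    counts = [0] * n
    counts[0] = slots  # start with everything assigned to the first source
    best = mixture_distance(target, sources, [c / slots for c in counts])
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(n):
                if i == j or counts[i] == 0:
                    continue
                # Try moving one slot from source i to source j.
                counts[i] -= 1
                counts[j] += 1
                d = mixture_distance(target, sources, [c / slots for c in counts])
                if d < best - 1e-12:
                    best = d
                    improved = True
                else:
                    # Undo the move if it did not reduce the distance.
                    counts[i] += 1
                    counts[j] -= 1
    return [c / slots for c in counts], best

# Toy coordinates (hypothetical, three dimensions only).
toy_sources = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_target = (0.5, 0.0, 0.0)  # an exact 50/50 mix of the first two sources
weights, dist = fit_mixture(toy_target, toy_sources)
print([round(100 * w, 1) for w in weights], f"distance%={100 * dist:.4f}")
```

The point of the toy run is that sources which don't help the fit end up at 0, which is why long lists of zeros show up in unsupervised runs. A low distance only says the weighted average lands near the target in this coordinate space, not that the sources are the real ancestors.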

Davidski said...


Using qpAdm I can model Vonyuchka_Eneolithic with a very good fit as AG3, CHG and EHG. And I can model Progress_Eneolithic as Vonyuchka_Eneolithic plus maybe some extra EHG.

There's no way to fit in Sarazm_Eneolithic. It's rejected every time.

I think that when this sort of discrepancy happens between qpAdm and G25/nMonte it means that there are no real proximate mixture sources available.

And it's even possible that Vonyuchka_Eneolithic is not a recent mixture of anything. Sarazm_Eneolithic is related in some way, but it's hard to say how.

Drago said...

@ Davidski
I see ur point
Maybe using the distal modelling you mentioned is not optimal
On the other hand, the Siberian-like admixture in Sarazm could be falsely “pulling” it ?
Worth probing at

Drago said...

Something like 20% south of Caucasus admixture in Progress seems reasonable and historical.
I’d be very surprised if there was an absolute Barrier between these two cultural groups

Davidski said...

There's no Anatolian ancestry in Progress/Vonyuchka_Eneolithic, and it's already at a high level in Meshoko, which is dated to about the same time.

It got into Yamnaya via the North Pontic steppe, probably from the Balkans, and maybe a little bit from the south via Maykop.

Matt said...

@Dave, so the legendary Eneolithic samples have finally arrived ;)

Obligatory plots, using averages datasheet only:

Straight Global 25, sample of dimensions informative to West Eurasia, with Khvalynsk, Yamnaya_Samara, the Eneolithic samples and CHG broken out:

Reprocessed Global 25, to focus on West Eurasia:

Trees: (Neighbour of Yamnaya in a restricted West Eurasia set built to sift out African and East Eurasian recent ancestry; neighbour of Gonur1_BA_outlier and to a lesser extent Tajiks and Yamnaya_Ukraine_outlier in the full set)

Distance Comparisons: (Compared to Yamnaya, Khvalynsk, Ukraine Eneolithic, Progress is most correlated with Yamnaya, but closer to Sarazm, CHG, Gonur1_BA outlier, Iran_N, while Yamnaya is more correlated with present day and ancient HG rich Russian/Eastern European samples)

Matt said...

To add to last comment, closest populations in present day for Khvalynsk, Yamnaya, CHG and the new Eneolithic samples:

In present day terms they look closest to a South-Central Asian flavour (through different processes). CHG is very Caucasus and Khvalynsk is rather NE European. Yamnaya between Khvalynsk and Progress-Vonyuchka, with a little NW European flavour coming in associated to EEF and WHG.

(Of course, standard caveat, all modern populations very different to ancient - these are only "closest" not close, and probably not including any special drift a modern population may have, particularly prevalent in South Central Asia).

Samuel Andrews said...

The main dimension with Kurgan drift is PC24. Progress_Eneolithic has this drift but Vonyuchka_Eneolithic doesn't.

The results for both are confusing. Probably they don't fit in G25 yet. What is clear is they, especially Progress, have more ANE than EHG can give. Also, Vonyuchka has IranNeo-affinity.

Also, there's no way Yamnaya is just Eneolithic Steppe+farmer. Yamnaya also has lots of EHG/UkraineHG-like stuff on top of that. How did Wang 2018 not pick this up?

The pops who contributed to Corded Ware & Bell Beaker were like Yamnaya. So, maybe Eneolithic_Steppe is a dud/dead end who is related to pops who gave rise to early IE speakers?

Matt said...

Same West Eurasia reprocessing as above with a rotation to make match more closely "HG" aligned with PC1:

Davidski said...

@Samuel Andrews

"The main dimension with Kurgan drift is PC24. Progress_Eneolithic has this drift but Vonyuchka_Eneolithic doesn't.

The results for both are confusing. Probably they don't fit in G25 yet."

You're looking too closely at the least significant dimensions, which are more easily and potentially radically affected by recent drift.

When I model Progress_Eneolithic and Vonyuchka_Eneolithic with qpAdm and G25/nMonte, I get basically the same results. It's only when I do an unsupervised run with G25/nMonte that things look different, because nMonte starts looking for a better fit and more proximate mixture sources and over fits the model.


G25/nMonte:

CHG 53.55
EHG 32.05
AfontovaGora3 14.4

[1] distance%=5.6512 / distance=0.056512

qpAdm:

CHG 0.587±0.031
EHG 0.326±0.051
AfontovaGora3 0.087±0.041

chisq 8.242
Tail prob 0.509972
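As a rough way to see how close the two sets of numbers above are, the sketch below expresses each G25/nMonte percentage as a gap from the corresponding qpAdm estimate, measured in qpAdm standard errors. This is an informal comparison, not a formal test, since the nMonte figures come without error bars of their own. The function name `se_gaps` is hypothetical; the values are copied from the output above.

```python
# Proportions from the two outputs above (same three sources in both).
nmonte = {"CHG": 0.5355, "EHG": 0.3205, "AfontovaGora3": 0.1440}
qpadm = {
    "CHG": (0.587, 0.031),            # (estimate, standard error)
    "EHG": (0.326, 0.051),
    "AfontovaGora3": (0.087, 0.041),
}

def se_gaps(nmonte, qpadm):
    """Gap between the nMonte and qpAdm estimates, in units of the qpAdm SE."""
    gaps = {}
    for source, (estimate, se) in qpadm.items():
        gaps[source] = (nmonte[source] - estimate) / se
    return gaps

gaps = se_gaps(nmonte, qpadm)
for source, gap in gaps.items():
    print(f"{source}: {gap:+.2f} SE")
```

All three gaps come out smaller than two standard errors, which fits the description of the two methods giving basically the same results for this supervised model.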

Also, in the Global25 the main Kurgan dimension, or rather the dimension that shows a very clear EHG > CHG cline, is PC9. Progress_Eneolithic and Vonyuchka_Eneolithic are both sitting on this cline close to each other, except with Vonyuchka_Eneolithic shifted a little closer to the Neolithic samples from Iran, probably because its basal-rich ancestry isn't exactly like the CHG samples currently available.

What is clear both from qpAdm and G25/nMonte is that these Eneolithic samples can be more or less described as mostly mixtures of CHG, EHG and ANE, in that order, but in reality they're probably not recently mixed, but rather just sit on very ancient clines between CHG, EHG, ANE and Iran_N.

Matt said...

@Sam, I haven't done the nMonte, but in reprocessed PCA it looks like if you were to model Yamnaya only allowing Progress-Vonyuchka, EHG and GAC, you'd get about 58:30:12. But anywhere in the interval 25-30% looks like it would be OK in such a model. Still seems like they should've picked it up tho! Harder because Progress-Vonyuchka is closer to EHG than Anatolia_N and WHG for ex, but still should've been found. This probably makes more sense with archaeology, and scenarios with some degree of continuity are always useful.

Quite curious to see what goes down with Steppe_Maykop.

Davidski said...

However, I should mention that these are preliminary genotype data.

So we should hold off from making any conclusions until the genotype data from the paper are released.

Richard Rocca said...

Looks like the Hungarian kurgan samples may or may not be Yamnaya. According to the dating of the Hungarian plain:

Pre-Pit Grave (or Pre-Yamnaya) period: 3400/3350–3300/3000–2750 cal BC
Early Pit Grave (or Yamnaya) period: 3300/3100–2900/2600 cal BC

Drago said...

Vonyuchka requires Iran Neolithic. Using Epipaleolithic sources:

EHG 42.7%
CHG 33.9%
Ganj_Dareh_N 16.2%
Hotu_HG 5.6%
AfontovaGora3:I9050.damage 1.6%
Levant_N 0%
WHG 0%
West_Siberia_N 0%
Lokomotiv_N 0%
Boncuklu_N 0%

Distance 4.3944%

On Matt's plot it appears slightly off the EHG - CHG cline

Davidski said...


But look at the distance in your model. You obviously don't have the right proximate sources; just near and far proxies. Either that, or like I said, Vonyuchka_Eneolithic isn't really a mixture.

And with formal stats, Vonyuchka_Eneolithic doesn't require Iran Neolithic to have its ancestry modeled.

It just needs a lot of EHG, ANE and anything basal with very little or no Anatolian input, that's why it shows affinity to Sarazm_Eneolithic and Ganj_Dareh_N in the Global25.

These Eneolithic samples do not provide evidence of any recent migrations from the Near East into the steppes.

Drago said...

@ Davidski

Obviously neither is superb, but yours is more wanting (4.3 vs 5.6)

The absence of Anatolian-like admixture goes against obvious recent admixture from anywhere near Anatolia; so perhaps somewhere significantly further east; and quite archaic ? I suspect we’re still missing key samples

Davidski said...


These samples lack ancestry from Anatolian and West Siberian/Kazakh steppe foragers.

In fact, even CHG, Ganj_Dareh_N and Sarazm_Eneolithic have too much Anatolian-related ancestry to provide decent fits.

What this means is that there was a population of foragers similar to a mix between EHG, AG3 and CHG living north of the Caucasus, but not exactly like this.

I think you'll find that this is the consensus that is forming in regards to this issue.

Matt said...

@Davidski:"So we should hold off from making any conclusions until the genotype data from the paper are released."

Another point to add is that there may be some substructure among the "Steppe Eneolithic" - we've got in the datasheet PG2001 and VJ1001 who look like the more low EHG samples of the three who sit together, while PG2004 is very slightly later in dating and looks slightly more admixed with EHG.

That might close the EHG gap a little and make it a little more comprehensible why Wang's paper might appear to "undershoot" the extra EHG/WHG of Yamnaya a bit.

In the same way we are probably overshooting the EHGness of Khvalynsk and undershooting its Steppe Eneolithic using models with G25, as I0122 and I0433 are the slightly more EHG-like Khvalynsk samples and I0434 the more Steppe Eneolithic sample. But Davidski can only get on the PCA plot the samples he can. The contributions of Khvalynsk Eneolithic and "Steppe Eneolithic" would be more even with all the samples, although there are questions about whether it makes sense to model with Khvalynsk Eneolithic and "Steppe Eneolithic", or makes more sense to model with EHG and "Steppe Eneolithic" (as the Khvalynsk cemetery gene pool is later, I think, and plausibly formed at least partially by recent emigration from the southern steppe?).

Matt said...

Certainly though it is odd why the D stat D(EHG;CHG,test,Mbuti) comes out as clearly non-sig different to Yamnaya in the D stat in Wang's supplementary material:

But doesn't look to be the case here.

Curiously, D-stats also show Caucasus Eneolithic being rather richer in CHG ancestry than Armenia MBA-LBA, which didn't show up at all on their plot (probably the compression issues?).

Davidski said...


Not sure if adding PG2004 would shift things significantly, but anyway...

EHG CHG Progress_Eneolithic Mbuti 0.0350 6.433
EHG CHG Vonyuchka_Eneolithic Mbuti 0.0325 6.142

EHG CHG Yamnaya_Samara Mbuti 0.0438 13.409

CHG Yamnaya_Samara Progress_Eneolithic Mbuti -0.0298 -6.471
CHG Yamnaya_Samara Vonyuchka_Eneolithic Mbuti -0.0297 -6.606
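The rows above can be read as four population labels followed by the D value and its Z-score. The sketch below parses them and flags which ones pass |Z| >= 3, a commonly used cutoff that is assumed here and isn't stated in the comment itself. The column interpretation is also an assumption based on the layout, and the helper names are hypothetical.

```python
from collections import namedtuple

DStat = namedtuple("DStat", "pop1 pop2 pop3 pop4 d z")

# Rows copied from the comment above.
RAW = """\
EHG CHG Progress_Eneolithic Mbuti 0.0350 6.433
EHG CHG Vonyuchka_Eneolithic Mbuti 0.0325 6.142
EHG CHG Yamnaya_Samara Mbuti 0.0438 13.409
CHG Yamnaya_Samara Progress_Eneolithic Mbuti -0.0298 -6.471
CHG Yamnaya_Samara Vonyuchka_Eneolithic Mbuti -0.0297 -6.606
"""

def parse_dstats(text):
    """Parse whitespace-separated rows: four labels, then D and Z."""
    stats = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        stats.append(DStat(*parts[:4], float(parts[4]), float(parts[5])))
    return stats

def is_significant(stat, z_cutoff=3.0):
    """True when the absolute Z-score reaches the (assumed) cutoff."""
    return abs(stat.z) >= z_cutoff

dstats = parse_dstats(RAW)
for s in dstats:
    flag = "significant" if is_significant(s) else "not significant"
    print(f"D({s.pop1},{s.pop2};{s.pop3},{s.pop4}) = {s.d:+.4f}, Z = {s.z:+.3f} ({flag})")
```

With this cutoff every row is flagged as significant, and the first three are positive while the last two are negative. What the sign means depends on the convention of the tool that produced the numbers, so that part is left out here.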

Matt said...

@David, counting pixels, the results for D(EHG,CHG;test,Mbuti) in Wang's supplement should be, for a sample:

Samara_Eneolithic: 0.1091, Steppe Maykop: 0.0761, Yamnaya_Samara: 0.0443, Steppe_Eneolithic: 0.0398, Armenia_Chl: 0, Caucasus_Eneolithic: -0.0443, Late Maykop: -0.0216, Armenia_EBA: -0.0205, Iran_N: -0.0352

Close enough? If PG2004 would be about 0.0510, then that would square the difference. For that PG2004 would have to be approximately 20% or so further of the distance between the Khvalynsk samples we have on G25 and PG2001 and VJ1001. Looks reasonably plausible given the sample positions on their PCA?
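One rough way to sanity-check that figure is to treat the pooled Steppe_Eneolithic value as a plain average of three per-sample values (PG2001, VJ1001 and PG2004) and solve for the missing one. That is only an approximation, because a D-stat on a pooled population isn't exactly the mean of the per-sample stats, and it assumes the pooled group consists of just those three samples. The variable names are hypothetical.

```python
# Pixel-counted pooled value and the two per-sample values quoted in this thread.
pooled_steppe_eneolithic = 0.0398
known_samples = {"Progress_Eneolithic": 0.0350, "Vonyuchka_Eneolithic": 0.0325}
n_samples = 3  # assumption: PG2001, VJ1001 and PG2004 only

# If the pooled value were a simple mean, the third sample would need this value.
implied_third = n_samples * pooled_steppe_eneolithic - sum(known_samples.values())
print(f"implied value for the third sample: {implied_third:.4f}")
```

Under those assumptions the implied value is about 0.052, in the same ballpark as the 0.0510 suggested above.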

(Stats on the D(AnatoliaN,Ganj_Dareh;test,Mbuti) from the same figure would be about: Armenia_Chl: 0.0545, Late Maykop: 0.0398, Armenia_EBA: 0.0364, Caucasus_Eneolithic: 0.0261, Yamnaya_Samara: 0.0250, Samara_Eneolithic: 0.0239, Steppe_Eneolithic: 0.0114, Steppe Maykop: 0.0068, Iran_N: -0.0909).

Samuel Andrews said...

If you take G25 results literally, Steppe Piedmont can't be the proto-PIE/Kurgan population that Wang 2018 thinks it is. Ukraine_Eneolithic and Corded Ware all share something which had lots more EHG.

Davidski said...

@Samuel Andrews

The Wang et al. supplement suggests that the ancestors of Progress/Vonyuchka_Eneolithic may have come from the north, from the Don Caspian steppe. If so, then Corded Ware, Yamnaya etc. shared ancestry with Progress/Vonyuchka_Eneolithic, rather than being derived from them.

"Complementary to the southern [Darkveti-Meshoko] Eneolithic component, a northern component started to expand between 4300 and 4100 calBCE manifested in low burial mounds with inhumations densely packed in bright red ochre. Burial sites of this type, like the investigated sites of Progress and Vonyuchka, are found in the Don-Caspian steppe [10], but they are related to a much larger supra-regional network linking elites of the steppe zone between the Balkans and the Caspian Sea [16]. These groups introduced the so-called kurgan, a specific type of burial monument, which soon spread across the entire steppe zone."

The PIE homeland controversy: January 2019 status report

Samuel Andrews said...

That makes more sense. Yamnaya might still be a near perfect proxy for the Don-Caspian PIE pop.

Armenia_Chl chooses Steppe_Piedmont over Yamnaya. Anatolia_Chl has some Steppe_Piedmont. Could this be where Hittites come from?

Gabriel said...

How are we sure that Armenia/Anatolia_ChL's Steppe_Piedmont affinities aren't due to extra CHG/Iran_Neo?

Gabriel said...

The genetic landscape of the steppe is finally becoming clear and with it I hope to see which kind of steppe population is the one that expanded into Europe and became Corded Ware, and is most likely to be PIE or Yamnaya's ancestor. For long the steppe was spotty and we knew little of where we came from and what populations used to live there. That is changing very rapidly and the pieces of the puzzle are finally falling into place. So I think we will finally get to know how steppe people formed and how Sredny Stog II, Yamnaya etc. came to be.

Joe Flood said...

What proof do you have that the 'Yamnaya' expanded into the Carpathian basin? Sure there are a bunch of Kurgan sites leading up the lower Danube to the Carpathian, but what makes them 'Yamnaya'? Are they Eastern R1b (R-Z2103)? Are they large scale animal herders or hunters after minerals, as the so called 'Yamnaya' on the steppe seem to have been? Do they have some sort of close genetic match to the Samara sites, as the so-called 'Afanasievo' do?

Sorry but this entire construct not just of yours but of Reich et al is a house of cards based on tiny, heavily selective, inadequate evidence.

Davidski said...

@Joe Flood

"What proof do you have that the 'Yamnaya' expanded into the Carpathian basin?"

Read the blog entry.
