
Thursday, March 8, 2018

Beakers vs modern-day Northern Europeans

Here are most of the Beakers from Olalde et al. 2018 in my Principal Component Analysis (PCA) of modern-day Northern European genetic variation. They look rather Celtic or perhaps Celto-Germanic, don't they? The relevant datasheet is available here.

If you're wondering why the Yamnaya and early Baltic Corded Ware individuals are sitting in the middle of the plot, I'd say it's because they don't share enough genetic drift with any specific sub-set of modern-day Northern Europeans to cluster with them. This might also be why the Ukraine Neolithic samples are so dispersed around the middle of the plot. In other words, they're possibly too old to feature in this PCA, unlike the Beakers and Bronze Age descendants of the Baltic Corded Ware people, who are clustering fairly deliberately with their likely closest modern-day relatives.
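To make the "middle of the plot" effect concrete, here is a toy sketch, not the pipeline used for the plot above. It assumes a principal axis computed from modern samples only, with ancient samples projected onto it afterwards. All frequencies, group names and helper functions are made up for illustration.

```python
import math

def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def first_pc(rows, iterations=200):
    """Leading principal axis via power iteration on the covariance matrix."""
    mean = column_means(rows)
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    dim = len(mean)
    cov = [[sum(r[i] * r[j] for r in centered) / (len(rows) - 1)
            for j in range(dim)] for i in range(dim)]
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(sample, mean, axis):
    """Score of one sample on an axis defined by other samples."""
    return sum((x - m) * a for x, m, a in zip(sample, mean, axis))

# Hypothetical allele frequencies at four markers. The two modern groups
# differ mainly at marker 0, which stands in for group-specific drift.
west = [[0.70, 0.50, 0.52, 0.50], [0.72, 0.48, 0.50, 0.51], [0.68, 0.51, 0.49, 0.50]]
east = [[0.30, 0.50, 0.49, 0.50], [0.28, 0.52, 0.50, 0.49], [0.32, 0.49, 0.51, 0.50]]
mean, pc1 = first_pc(west + east)

old_sample = [0.50, 0.50, 0.50, 0.50]     # carries none of the modern drift
recent_sample = [0.66, 0.50, 0.50, 0.50]  # shares most of the "west" drift

old_score = project(old_sample, mean, pc1)
recent_score = project(recent_sample, mean, pc1)
west_score = project(west[0], mean, pc1)
```

In this toy setup the sample without any group-specific drift scores close to zero, while the sample that shares the "west" drift lands near the "west" group.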

See also...

Genetic and linguistic structure across space and time in Northern Europe


Richard Rocca said...

Global25/nMonte fits for R1b-P312+L2+ Bell Beaker sample from Jordanów Śląski Poland. I used just immediate predecessors to see if fits were better coming from the Balkans or somewhere to the west. He favors Yamnaya_Kalmykia over Vucedol which is interesting since Vucedol is often quoted as a possible origin for steppe Bell Beaker. Best fits favor about 90% CWC_Germany with a remainder of Iberia_Central_CA or France_MLN.

[1] "distance%=2.8745"


[1] "distance%=1.5666"


[1] "distance%=1.5159"


[1] "distance%=1.4595"
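For readers unfamiliar with this kind of fit, the idea can be sketched as a search for the source mixture whose coordinates sit closest to the target. This is not nMonte itself, just a minimal two-source grid search under that assumption. The coordinates and the names `source_a` and `source_b` are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_two_way_fit(target, source_a, source_b, steps=100):
    """Try every weight on a grid and keep the mixture closest to the target."""
    best = None
    for k in range(steps + 1):
        w = k / steps
        mix = [w * a + (1 - w) * b for a, b in zip(source_a, source_b)]
        d = distance(target, mix)
        if best is None or d < best[1]:
            best = (w, d)
    return best

# Made-up coordinates; the target is built as exactly 90% A + 10% B.
source_a = [0.12, 0.03, -0.02]
source_b = [0.10, 0.09, 0.01]
target = [0.9 * a + 0.1 * b for a, b in zip(source_a, source_b)]

weight_a, fit_distance = best_two_way_fit(target, source_a, source_b)
```

A lower distance means the proposed sources reproduce the target's position more closely, which is how the four fits above can be compared.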


Richard Rocca said...

Anyone have a guess as to why most steppe-derived Bell Beaker samples prefer Yamnaya_Kalmykia over Yamnaya_Bulgaria or Yamnaya_Samara?

Anonymous said...


Where do Czech and German CWC land on that plot?

Ryan said...

I think the location for Yamnaya is pretty sensible? Why the skepticism? They're more Uralic-shifted because they're relatively richer in ANE/Siberian ancestry, no?

The Bell Beakers look Celtic/German, but don't the ancient Germans look more Slavic/German? I suspect if you went back 3,000 years things would line up better. There's just a lot of assimilated Celtic ancestry in modern Germans, and a lot of assimilated German ancestry in modern Slavs.

Teper said...

a lot of assimilated German ancestry in modern Slavs


AWood said...

While I realize the U152 Bashkir men have a recent founder effect, I often wonder if the ancestor of their cluster is considerably older, and it has just survived this long through very few offspring and luck. It's still quite possible that L51+ was in eastern Yamnaya and moved westwards trying to escape Yersinia pestis, which is looking more and more likely to be the reason for the migration. It's also interesting that the earliest Dereivka (prior to Alexandria) does not have much CHG ancestry, so it doesn't seem like L51+ sprang from there either.

Queequeg said...

@ Ryan and re "They're more Uralic-shifted because they're relatively richer in ANE/Siberian ancestry, no?"

According to Wong et al:

"Indeed, Yamnaya culture people (5.3-4.7 kya) had stronger genetic affinity to ANE than Mansi or Finns to ANE (Fig. S21a). Yamnaya people shared more alleles with Mansi than Western Europeans French and Sardinian (Fig. S21b), supporting shared ANE ancestry among Mansi and Yamnaya people. And importantly, Yamnaya people have greater affinity to Mansi than to Eastern Siberian population of Even (Fig. S21c), favoring Western Siberians over Eastern Siberians as specifically contributing to the Yamnaya ancestry."

This, despite modern Mansi being almost 60% Evenki-like; because of later admixture? Proto-Mansi, therefore, might have been pure ANE, and if they were, to what extent can ANE be seen as "Siberian"?

Matt said...

@ Davidski, re: Beakers, are these *all* the Beakers or the ones who fall within broadly Northern European ancestry proportions? I guess I ask only because I thought a few of the Hungarian outliers might overlap more with present day Hungarians a bit!

Whether Beakers look Celtic here, I'd definitely say *yes* but also a qualified yes: the D-stats which you and Arza (thanks) have run seem to indicate that the early NW Beakers (Netherlands, Britain) were fairly sharply differentiated between MN references, sharing most of the drift distinguishing a MN reference in order of: Globular_Amphora_Poland -> Scotland_N -> Iberia_Chl -> Globular_Amphora_Ukraine. Highly specific Z scores well >3.

(Overall suggestive to me, if we follow clines, of either the NW Poland reference we have, or something slightly west of that in North Germany or closer to the Netherlands which we haven't got sampled. In reality either may be a composite of EEF+MN-like ancestry picked up elsewhere, so long as it fits with the right place on a rough Iberia->Southern France->Britain->Poland->Ukraine MN cline).

The later British_Iron_Age and Roman_British samples don't seem to do this, having a good bit more affinity for the drift distinguishing the Iberia_Chl ref and/or a more generalized affinity for different MN farmers, without many significant Z scores in D stats between them.

I would have guessed that this may be because of some influx of continental Celtic populations (Celtic proper) with similar overall Steppe:EEF:HG ratios to the early Beakers, together with mixing with slightly different / more widespread MN populations. The Beakers may well be more of a pre-Celtic wave, though contributing ancestry to later British+Irish populations.

This stuff might be hard to capture with PCA based on modern day people, if both streams mostly contribute to the same modern people. Lots of D-stats of the form D(Ancient1,Ancient;Beaker_Brit,later_Brit) from newer ancients all the way up to moderns might start to show some very solid stuff. qpAdm models also should find something *but* I think would probably need to be Lazaridis 2018 Mycenaean style with "big" pright, using lots of MN groups (or at least EN groups) in the pright to find the real best fitting ancestries. I think there could be an opportunity to make a discovery there anyway.
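For anyone who wants to see what such a statistic measures, here is a minimal sketch of D(W, X; Y, Z) computed from per-SNP allele frequencies, with a Z score from a plain delete-one-block jackknife. The unweighted jackknife is a simplification and an assumption of this sketch, not necessarily what the standard tools do. The frequencies and block layout are invented.

```python
import math

def d_stat(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies."""
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    return num / den

def jackknife_z(w, x, y, z, blocks):
    """Z score from an unweighted delete-one-block jackknife."""
    full = d_stat(w, x, y, z)
    estimates = []
    for block in blocks:
        drop = set(block)
        keep = [i for i in range(len(w)) if i not in drop]
        estimates.append(d_stat([w[i] for i in keep], [x[i] for i in keep],
                                [y[i] for i in keep], [z[i] for i in keep]))
    n = len(estimates)
    mean = sum(estimates) / n
    variance = (n - 1) / n * sum((e - mean) ** 2 for e in estimates)
    return full / math.sqrt(variance)

# Invented frequencies: W and Y deviate from X and Z in the same direction.
w = [0.8, 0.6, 0.3, 0.9, 0.5, 0.2]
x = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
y = [0.7, 0.6, 0.4, 0.8, 0.5, 0.3]
z = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
blocks = [[0, 1], [2, 3], [4, 5]]  # stand-ins for contiguous genome blocks

d_value = d_stat(w, x, y, z)
z_value = jackknife_z(w, x, y, z, blocks)
```

With this sign convention a positive D means W shares more drift with Y than X does (or equivalently X with Z). Swapping W and X flips the sign.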

Anonymous said...

@Ryan & Huck

Comb Ware Culture was proven to be almost like EHG. However it may have had a tiny Siberian input:

Y-DNA: Three R1a's found and one N1a.

Maybe there is a clue in that.

Anonymous said...

This picture is easy to interpret. The upper part shows the Siberian shift, the lower one shows the shift towards European farmers. The left shows the shift to Eastern Europe, the center to Central Europe, the right part to Western Europe. The center of the plot is neutral with respect to these relationships.

bellbeakerblogger said...

I'll take a stab at it. First, the shakeout of the Silesian looks good to me since his community is commonly thought to come from Bohemia/Moravia and then further West. With regards to the preference toward the Kalmykian steppe over the other two, if there is a preference towards both Kalmykia and German CWC, I would lean toward a scenario where that signal from the Middle Lower Volga is mostly coming via ancestors of the western CWC, not actually a discrete pre-proto-Beaker population. You might be able to test this using CWC populations, particularly in the West, against the same three Yamnaya. If they shake out to Kalmykia, then this could locate an early ancestral component in the Lower to Middle Lower Volga of the (pre)proto-CWC before it became so mixed with GAC.

Another possibility is that the core of the Yamnaya homeland in the western Caspian depression, possibly being subject to aridization first, left Dodge first and is present in the West apart from and before CW, but is more difficult to detect.

Ryan said...

@Twasztar - Just look at a map. Poland and the Ukraine used to be Germanic speaking ~2,000-1,500 years ago. Crimea was the last place to speak Gothic.

Matt said...

Attempting to label the samples that look particularly outlying in the Baltic BA + Beaker PCA:

Selecting the same samples (where there's overlap) in Global25, reprocessing through PCA and Neighbour Joining, then labelling the outliers there:

Looking at the Ukraine_Eneolithic samples, I6561 is interesting in having a position between certain BK_England / BK_Netherlands / BK_Hungary and CWC_Baltic_early in the G25 data, and neighbour joins closest to a Swedish and Scottish sample in the Baltic BA + Beaker PCA.

The archaeological context for this sample was an "Eneolithic cemetery of the Sredny Stog II culture ... near the village of Alexandria ... Kharkov region".

(Wiki suggests Sredny Stog II, precursor to Corded Ware - "Phase II (ca. 4000–3500 BC) used corded ware pottery which may have originated there, and stone battle-axes of the type later associated with expanding Indo-European cultures to the West.... The culture ended at around 3500 BC, when the Yamna culture expanded westward replacing Sredny Stog, and coming into direct contact with the Cucuteni-Trypillian culture in the western Ukraine.").

This sample is very different from the I5882, I5884, and I4110 samples from Dereivka I. These samples cluster together considering G25 NJ for overlapping samples, though with I5882 and I5884 subclustering together and being much more HG. The closest references to them in this tree in G25 are I7286 BK_Czech_CZE_Outlier, who sits closest to a couple of Swedes.

Paper states "According to craniometric analysis, the Dereivka I population consists of two components, one of which was similar to previous hunter-gatherers of the same region while another is more closely related to individuals from the northern forest zone" so it would be intriguing if genetic structure in this population confirmed this (e.g. I5882/I5884 as "previous hunter-gatherer" and I4110 as "closely related to individuals from the northern forest zone"). Though since these are dental samples, can't be confirmed directly.

Considering Baltic BA vs Beaker PCA, I5882 is closest to Yamnaya and I4110 closest to a Ukrainian sample, while I5884 is closest to a Russian sample.

It may make more sense to label the Ukraine_Eneolithic samples into the Dereivka I and Sredny Stog II subclusters, though of course this means SSII has one sample only...

EastPole said...

Assuming as per Mittnik et al. 2018 “linguistic model that sees an early branching of Balto-Slavic from a ProtoIndo-European language, for which the west Eurasian steppe was proposed as a homeland” we can propose following model of emerging Baltic and Slavic ethnicities:

The main difference between Balts and Slavs both coming from Late Sredny Stog Dereivka culture is that Slavs earlier switched to farming in Ukraine-Poland area and assimilated more Tripolye/TRB/GAC farmer component whereas Balts mixed more with hunter gatherers around Baltic Sea:

Davidski said...

I have a strong feeling that the earliest Slavs, and indeed Proto-Slavs, will cluster with those western-most (right-shifted on the plot) Baltic BA samples.

That's already where one of the two early Bohemian Slavs clusters; the one that probably has much less local pre-Slavic admixture.

EastPole said...

“I have a strong feeling that the earliest Slavs, and indeed Proto-Slavs, will cluster with those western-most (right-shifted on the plot) Baltic BA samples.”

I think early Proto-Slavs were a mixture of pure steppe element and mixed steppe-farmer element groups so they should be located between steppe Balto-Slavs and modern farmer Slavs as per my map.

Where do you put early Proto-Indo-Iranians? They should be adjacent to Proto-Slavs and have less farmer element. Unless early Proto-Indo-Iranians are a mixture of Balto-Slavs with some eastern steppe element.

Davidski said...


I think early Proto-Slavs were a mixture of pure steppe element and mixed steppe-farmer element groups so they should be located between steppe Balto-Slavs and modern farmer Slavs as per my map.

Yes, they probably should, generally speaking, but not in this PCA, because it's intentionally skewed by recent intra-North European ethnic-specific genetic drift.

So anyone who doesn't share much of this drift will be pushed into the middle of the plot, more or less, while anyone who does share a lot of this drift with specific ethnic groups might be pushed way out of the middle, which is then likely to make the sort of linear visualizations like yours impossible.

The precursors of the Proto-Balto-Slavs did come from the steppe, but the genetic drift that defines modern-day Balts and Slavs didn't exist at that time, and its appearance was unlikely to have been a very gradual process, but rather a sudden one outside of the steppe. This is why there's such a jump on the plot between early Baltic Corded Ware and Baltic BA, from the middle of the plot to the lower left corner.

Where do you put early Proto-Indo-Iranians? They should be adjacent to Proto-Slavs and have less farmer element. Unless early Proto-Indo-Iranians are a mixture of Balto-Slavs with some eastern steppe element.

They'll be pushed into the middle of the plot, because they're not closely enough related to modern-day Northern Europeans.

Ryukendo K said...

Ancient Genomics Reveals four Prehistoric Migration Waves into Southeast Asia

Davidski said...

@ryukendo kendow

Nice. I'll start a new thread later today.

Ryukendo K said...

I always found the sociological models for the Bronze Age, with long distance trade and politicking and continent-wide interactions among illiterate people, to be extremely fanciful. What with ideas about "wandering metallurgists", "prospectors" and even "caravans" when even the Roman Empire was riddled with bandits everywhere. How could metal age polities cope with raiding and such? Looking at the distribution of the Welzin warriors is really making me reconsider this position.

Samuel Andrews said...

A few days ago, I modeled modern Slavs as Baltic BA+other BA. West Slavs come out Hungary BA+Baltic BA almost 50/50.

South Slavs come out Croatia BA/Bulgaria IA+Baltic BA along with 25% Anatolia BA. They score 0% Hungary BA. They get 30-40% Baltic BA. Greeks and Albanians get 15%.

Most South Slavs don't have much more Steppe than Croatia BA did, but their WHG is significantly higher. That excess WHG is coming from proto-Slavs. CHG-rich East Med added on top of Baltic-BA type stuff lowers Steppe to levels similar to Croatia BA.

The actual percentage of ancestry from proto-Slavs might be a lot higher than those Baltic BA scores. I expect Russians, Poles, and Ukrainians to be 80-100%. South Slavs, northern Russians, and Czechs might be like 50%.

EastPole said...

“Yes, they probably should, generally speaking, but not in this PCA, because it's intentionally skewed by recent intra-North European ethnic-specific genetic drift.

So anyone who doesn't share much of this drift will be pushed into the middle of the plot, more or less, while anyone who does share a lot of this drift with specific ethnic groups might be pushed way out of the middle, which is then likely to make the sort of linear visualizations like yours impossible.”

I don’t think I understand your argument. By selecting specific populations you can change the geometry of PCA. You can squeeze and turn around some populations, but still we are dealing here with linear transformations only, so basic topology doesn’t change. If farmer Slavs come from steppe Balto-Slavs the intermediate stage of Proto-Slavs must be located between Slavs and Balto-Slavs. There cannot be a break between them as you are suggesting.
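The linearity point can be checked with a toy sketch. It assumes a fixed linear projection, so a pure two-way mixture must land on the line between its sources. The sketch also shows what happens if extra drift is added after the mixture, which is the scenario under dispute. The axes, frequencies and population names are hypothetical.

```python
import math

def project(vec, axes, center):
    """Linear projection of a centered vector onto fixed axes."""
    return [sum(a * (x - c) for a, x, c in zip(axis, vec, center)) for axis in axes]

def mix(p, q, w):
    """Two-way mixture with weight w on p and 1 - w on q."""
    return [w * a + (1 - w) * b for a, b in zip(p, q)]

def off_line(point, start, end):
    """Perpendicular distance from a 2D point to the line through start and end."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    cross = dx * (point[1] - start[1]) - dy * (point[0] - start[0])
    return abs(cross) / math.hypot(dx, dy)

# Hypothetical projection axes and allele frequencies.
axes = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]
center = [0.5, 0.5, 0.5, 0.5]
pop_a = [0.9, 0.2, 0.4, 0.6]
pop_b = [0.1, 0.7, 0.5, 0.3]

mixed = mix(pop_a, pop_b, 0.6)
# Same mixture plus later drift in a direction neither source has.
drifted = [m + d for m, d in zip(mixed, [0.15, 0.15, -0.15, -0.15])]

a_2d = project(pop_a, axes, center)
b_2d = project(pop_b, axes, center)
mixed_gap = off_line(project(mixed, axes, center), a_2d, b_2d)
drifted_gap = off_line(project(drifted, axes, center), a_2d, b_2d)
```

In this sketch the pure mixture stays on the line and the drifted one does not, so the disagreement comes down to how much post-admixture drift there was, not to whether the projection is linear.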
My drawing is an oversimplification of what really happened and the details of the forming of particular Slavic groups may be more complex. For example Poles can come from Dereivka Balto-Slavs mixed with Tripolye farmers who migrated west and mixed with GAC farmers or something like this:

In addition to DNA we should also look at languages, how Slavic languages are related to other IE languages and how DNA correlates with languages. Indo-Iranian languages are very close to Slavic, they preserved many Slavic words almost unchanged, numerals, religious terminology, present tense conjugation system etc. So they were forming close to Proto-Slavs and I am sure they didn’t form in Bohemia but rather somewhere around Eastern Dereivka.

Samuel Andrews said...

I would guess the Germanic people many Beaker folk cluster with are English & Germans who both have significant Celtic ancestry. I doubt Beaker folk have shared drift with Scandinavians.

They definitely share the most drift with Isles Celts. IMO, they will be shown to also share drift with French & Spanish. Don't know if Beaker folk were proto-Celts but no doubt all Celts had lots of Beaker ancestry. All modern Celt-descendants, Isles Celts & French & Spanish, are mostly R1b P312.

Davidski said...


I've added Czech and German Corded Ware to the plot and datasheet. They cluster with Scandinavians.

Davidski said...


One of the early Bohemian Slavs clusters in the ellipse that you labeled "Balts". See here...

So unless she was a Balt, then something is wrong there. My hunch is that she's one of the two early Bohemian Slavs who is much closer to Proto-Slavs.

And yes, I suppose, if we were to sample the precursors to Proto-Slavs all the way from the steppe to the Proto-Slavic homeland, then this would probably show a linear progression from the Yamnaya cluster to the Proto-Slavic cluster.

But if the formation and expansion of the Proto-Slavs was rapid, and associated with a lot of drift, then we'd need some awesome sampling from across space and time to catch all of this on a plot.

So it's more likely that there will be a jump in my PCA from the Yamnaya cluster to the Proto-Slavic cluster. And I think the Proto-Slavic cluster will be in the part of the plot where Poles overlap with Balts. Not sure where the Proto-Baltic cluster will be; maybe left of there?

Davidski said...

Actually, it totally slipped my mind that the "jump" from the steppe to the Proto-Balto-Slavic homeland is already there on the plot, from early Baltic CWC to late Baltic CWC (Spiginas2).

EastPole, check out the distance between early Baltic CWC and Spiginas2 across dimensions 1&2. That's a jump alright, probably caused by admixture and rapid drift during late Corded Ware.

Matt said...

@EastPole, Davidski could project the Globular Amphora and Baltic/Narva HG samples on that PCA; I'd doubt that the Slavic samples would lie between the Yamnaya and Globular Amphora to be honest. Particularly I would doubt GAC, TRB, Tripolye samples would lie where the labels have been placed in the image from your last post.

Weighing in on this topic (heavy handedly), essentially the geometry in all the Global25 PCAs (which reflects underlying genetic reality, if the PCA is accurate) is unfavourable to present day Slavic groups reflecting a simple mix between genetically Yamnaya or genetically early Corded Ware with any of the Middle Neolithic or Copper Age farming groups we've got samples of. This includes the Swedish Funnelbeaker samples, or the Polish or Ukrainian Globular Amphora Culture samples, or any of the offshoots of the Balkan or Hungarian Neolithic groups.

Instead, the admixture cline and genetic sequence in Eastern Europe (from the Baltic->Balkans) seems to reflect a sequence of the persistence of populations who are a) a relatively direct mixture of Steppe-like people with hunter-gatherers (possibly not even with *any* input from the heavily EEF cultures, if the Western steppe already had EEF and depending on the sequence), and b) in the Balkans (and even quite north of there towards Carpathian Basin/Pannonian Plain), populations who picked up fairly modest levels of steppe ancestry on top of being almost completely EEF, with almost no HG ancestry.

This can make some sense in that the climate in Eastern Europe varies between the extremes of climates that are close to the Near East and relatively favourable for agriculture, to climates in the north that are possibly less favourable for farming and herding and more for foraging than climates at the same parallel in Western Europe (more affected by Gulf Stream and westerlies). Although probably the northern populations who are mostly steppe+HG may well have farmed or herded for most of their subsistence (populations like Baltic_BA certainly weren't foragers).

On the other hand, it seems like samples we can attribute to Germanic and Celtic speaking cultural contexts (as well as most Bronze Age Europeans in general) mostly can be modeled simply by the Yamnaya+MN Europeans, without too much need for the persistence of any groups richer or poorer in WHG than MN Europeans. For whatever that's worth.

This doesn't mean that the present day Germanic+Celtic speaking groups are particularly close to the Steppe_EMBA groups in formal measures of shared drift (in fact the Baltic Lithuanians are closest in f3 measures that get at very deep scale sharing), though neither are they less close. While at the same time, Slavic groups are probably closer to the Samara_Eneolithic samples (also reasonably likely to be Indo-European speaking) in formal stats than the Celtic+Germanic populations are.

Of course, I don't know what this means for language dispersal models.

Matt said...

@Ryu, awesome, thanks for the link (and is your (welcome) reappearance here in the comments related to this paper I wonder ;) ?).

I'm very interested in the paper on its own terms, but I'm sure the South Asian posters will be interested in how any new samples which could be a proxy for ASI model their ancestry. I'm sure that there are many who will move to claiming Onge like Hoabhinian samples simply prove that Onge are related to SE Asia and have nothing to do with pre-Holocene India, but hopefully this will move up the pressure on the South Asia paper (a sort of going round India to capture early South and Southeast Asian ancestry). Quality of these samples, and their publication, will be key.

Now to actually read this paper and see if my comment is justified..

Lee said...

Is it at all surprising that we have R1b1a in the Iron Gates HG or in Latvia?

Was this expected at such early dates?

Lee said...

Interesting findings

I did qpAdm work on Latvia HG and Iron_Gates_HG, due to the high R1b content

Latvia comes back with a tail probability of 0.908 with mixture of SHG/WHG 0.66 and 0.34 and 0.082 SE

Iron_gates came back with a tail probability of 0.893 with a mixture of WHG/CHG 0.819 and 0.181 and SE of 0.053.
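One quick way to read numbers like these is to divide the minor coefficient by its standard error. This is only a rough sketch and the helper name is made up. Treating a ratio above about 2 or 3 as "clearly non-zero" is a rule of thumb assumed here, not something qpAdm reports.

```python
def coefficient_z(coefficient, standard_error):
    """How many standard errors a mixture coefficient sits away from zero."""
    return coefficient / standard_error

# Minor components and SEs as quoted in the two fits above.
latvia_whg_z = coefficient_z(0.34, 0.082)
iron_gates_chg_z = coefficient_z(0.181, 0.053)
```

Both minor components come out more than three standard errors from zero on this reading.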

Interesting mixtures
Based on PCA I split Central Europe Beakers into two groups and Iberia Beakers into two groups.

BB_CE_O were: I2741, I1392, I7045, I7044, I2364, I3594, I5524, I3528, I7283, I3597, I5015, I6581, E09538

BB_Iberia comes out with a tail prob of 0.536 and SE of ~0.064, as a mix of Iron_Gates_HG/Anatolia_BA at 0.413/0.587

BB_IberiaS (S for Steppe like) : came up very similar though the outgroup list needed to be modified to reduce the error

So the Bell Beaker Central Europe Outliers: could be modelled as a combo of Anatolia_BA and SHG or Latvia_HG at ~0.75 and 0.25 with an error rate ~ 0.05.
So we have EHG coming from SHG and CHG from Anatolia_BA--so is this really a steppe influence? Or just looks like it?

BB CE minus the outliers--I had to use Corded Ware Germany to get a good mix. It looked like the same base population as BB_CE_O with Corded Ware mixture. Best combo I got was a tail probability of 0.293 with Latvia_HG (0.21), Anatolia_BA (0.362), CWC_germany (0.428) and Standard Errors of 0.038/0.064/0.079
Iron gates works instead of Latvia as well, fairly comparably.

So Overall it appears that a population with Anatolia_BA/Iron_Gates_HG moved into Iberia AND Central Europe---And then mixed with local populations.

It may have been from Iberia or from the Balkans. Based on Historical development of Bell-Beaker coming from Portugal (possibly)

Will need to look at Iberia_Chalcolithic to see if it comes from Iberia or not. 3Pop testing seems to indicate that it may be. BB-Iberia shares a lot of drift with Iberia_Chalcolithic

Interesting--may need to add some qpGraph analysis as well