Thursday, March 1, 2018

Awesome substructure within Czech Corded Ware

This is where the three Czech Corded Ware samples from Olalde et al. 2018 cluster in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation.

The two individuals belonging to Y-haplogroup R1a look like they might be straight from the Pontic-Caspian (PC) steppe. That's because they're sitting right next to an Eneolithic sample from the North Pontic part of the PC steppe, in what is now Ukraine, Eastern Europe. This guy, from Mathieson et al. 2018, also belongs to R1a. And if they're not totally of steppe origin, then clearly they both only have minor ancestry from outside of the steppe.

On the other hand, the third Czech Corded Ware individual, who belongs to the "Old European" Y-haplogroup I2a2a, actually shows no signs of steppe ancestry, because he clusters with Middle Neolithic Central Europeans. Indeed, I can test all of this with the Global25/nMonte method (see here and here), using the Eneolithic North Pontian and Samara Yamnaya as steppe references.

[1] distance%=3.8801 / distance=0.038801


Barcin_N 82.6
WHG 17.4
Ukraine_Eneolithic:I6561 0
Yamnaya_Samara 0

[1] distance%=2.4713 / distance=0.024713


Ukraine_Eneolithic:I6561 63.3
Yamnaya_Samara 24.65
WHG 7.35
Barcin_N 4.7

[1] distance%=2.4089 / distance=0.024089


Yamnaya_Samara 61.25
Barcin_N 17.7
Ukraine_Eneolithic:I6561 12.9
WHG 8.15
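
For anyone wondering what fits like these are doing under the hood, here's a minimal Python sketch of the general idea: split the target into a fixed number of slots, hand each slot to a reference population, and keep swapping slots for as long as the Euclidean distance between the blend and the target shrinks. This is not the nMonte script itself, just an illustration of distance-minimising mixture fitting. The function names, the slot count and the toy three-dimensional coordinates are all made up for the example and are not Global25 data.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def blend(sources, counts, slots):
    """Coordinates of a mixture where each source fills counts[name] slots."""
    dims = len(next(iter(sources.values())))
    mixed = [0.0] * dims
    for name, count in counts.items():
        for i, value in enumerate(sources[name]):
            mixed[i] += value * count / slots
    return mixed


def fit_mixture(target, sources, slots=200, steps=20000, seed=1):
    """Hill-climb over source proportions to minimise distance to the target."""
    rng = random.Random(seed)
    names = list(sources)
    # Start from a roughly even split of the slots between all sources.
    counts = {name: 0 for name in names}
    for i in range(slots):
        counts[names[i % len(names)]] += 1
    best = euclidean(target, blend(sources, counts, slots))
    for _ in range(steps):
        donor, receiver = rng.sample(names, 2)
        if counts[donor] == 0:
            continue
        # Move one slot from donor to receiver and keep it only if it helps.
        counts[donor] -= 1
        counts[receiver] += 1
        trial = euclidean(target, blend(sources, counts, slots))
        if trial < best:
            best = trial
        else:
            counts[donor] += 1
            counts[receiver] -= 1
    percents = {name: 100.0 * counts[name] / slots for name in names}
    return percents, best


if __name__ == "__main__":
    # Hypothetical toy coordinates, NOT real Global25 values.
    toy_sources = {
        "Source_A": [1.0, 0.0, 0.0],
        "Source_B": [0.0, 1.0, 0.0],
        "Source_C": [0.0, 0.0, 1.0],
    }
    toy_target = [0.7, 0.3, 0.0]
    percents, dist = fit_mixture(toy_target, toy_sources)
    for name, pct in sorted(percents.items(), key=lambda kv: -kv[1]):
        print(name, pct)
    print("distance =", round(dist, 6))
```

The distance returned here plays the same role as the distance figures above: the smaller it is, the closer the blend sits to the target.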

But what does this mean? Well, obviously that the R1a in Corded Ware people is not from the PC steppe!

Nah, I'm just messing around; poking a bit of fun at the dumb trolls online still arguing, against all odds, that the steppe ancestors of the Corded Ware people did not carry R1a. But let's just move on, shall we, because there's no longer any doubt that the R1a-M417 subclade of R1a, which encompasses almost 100% of the R1a lineages in the world today, expanded from the PC steppe with the forefathers of the Corded Ware folk. For one, it's found in the aforementioned Eneolithic North Pontian, and two, in the oldest and most steppe-shifted Corded Ware individuals. So that's that.

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...


Davidski said...

Btw, Ukraine_Eneolithic I6561 has a new C14 dating: 4045-3974 calBCE (5215±20BP, PSUAMS-2832).

Not sure how close this is to the birth of R1a-M417. Probably pretty close.

Ric Hern said...

@ Davidski

Thanks. Yes I think the fringe arguments are now just for the sake of arguing with no real bite. It is like someone without their dentures trying to bite into a big hard apple.

Richard Rocca said...


Would the closest direct predecessor to I7272 in your G25 sheet be Globular Amphora?

bellbeakerblogger said...


Can you or someone do some comparisons between Czech CW and BB? What happens when you model CW as one of the ancestral components of Beaker along with GAC?

Salden said...

So over on Anthrogenica, someone posted a link to a study on an ancient Upper Egyptian sample that's around 4,000 years old:

>When only the mtDNA sequences recovered from ancient Egyptian human remains are considered, the Djehutynakht sequence most closely resembles a U5a lineage from sample JK2903, a 2000-year-old skeleton from Abusir el-Meleq

>The deep presence of Eurasian mtDNA lineages in Northern Africa has, therefore, been clearly established with these recent reports and offers further support for the authenticity of the Eurasian mtDNA sequence observed in the Djehutynakht mummy. In the present study, Near Eastern influence has been found in an individual of high social status who lived in Upper Egypt during the Middle Kingdom.

This is big.

namedguest said...

The haplogroup doesn't favour this scenario, although it's tempting.

[1] "distance%=1.2751 / distance=0.012751"


CWC_Czech 100
Globular_Amphora 0

"This is big"
Why? I'm just curious though.

Salden said...

>Why? I'm just curious though.

It's the first reliably tested sample that's:

-Predynastic to First Intermediate Egyptian
-Upper Egyptian
-Upper class

It has a MtDNA haplogroup that:

-Was significantly present in Prehistoric European populations
-Is carried by modern Finns, Sami, and Estonians more than any other populations
-Is carried by select populations in North Africa (see Berbers) and the Near East.

How isn't it?

Salden said...

The haplogroup:

>U5 was the main haplogroup of Mesolithic European hunter gatherers. U haplogroups were present at 83% in European hunter gatherers before influx of Middle Eastern farmer and steppe Indo-European ancestry decreased its frequency to less than 21%.

Ric Hern said...

@ Salden

I wonder if they stumbled upon the migration route of R1b(V88) within this find...?

Simon_W said...

I'm not always in the mood to read interesting new papers ASAP. But eventually I do digest them. So here's my opinion on the Amorim et al. preprint on the Longobards. (Sorry for this off-topic stuff, but it wouldn't be more on topic in the other thread anyway.)

Like many others I have been particularly interested in studying the ancestry composition of the non-Germanic Piedmontese locals of predominantly southern (and partly Iberian-like) descent. In the various PCA plots there are clearly two equally large clusters of them discernible:

- CL25, CL30, CL38 and CL121 plot in between Cypriots and Tuscans. CL31 is similar but has a Macedonian shift (due to the elevated contamination).

- CL23, CL36, CL57 and CL94 plot in between Tuscans, Iberians and French/Swiss. CL47 is similar but has a Slovenian shift (due to higher NEE input).

This sub-structure is corroborated by ADMIXTURE, which shows a strong Iberian component as well as a variously large northern component in the second cluster, while the first cluster is almost purely composed of the Tuscan component.

And according to Table S10.1, which shows the assignment of the most likely modern population using PAA, the most likely POPRES population for the samples of the first cluster is in all cases IT, while the majority of the samples of the second cluster are best matched by Spain or Portugal. The exceptions are CL57, which is better matched by Belgians, and the eastern-shifted CL47, which is best matched by Austrians.

This is further corroborated by the yDNA. In the first cluster we find one R1b-S116, one E1b-PF2211, one R1b-Z2103 and one G2a-Z6644. In the second cluster we find one T1a, and two R1b-L151, one of which is a subvariant of S116.

So it's plain wrong to claim that the pre-Longobard inhabitants of Roman Age northern Italy were all Cypriot-like. About one half was rather South Italian-like, but the other half had a more northern and Iberian-like ancestry composition. It's not very hard to infer that the latter are predominantly descended from the Celto-Ligurian pre-Roman locals and the former came with the Romans mostly from peninsular Italy.

Furthermore, it would be premature to assume that the overall ethnic composition of the site of Collegno was in any meaningful way representative of the ethnic composition of early Medieval Northern Italy as a whole. If that were the case, modern day North Italians wouldn't be South European-like. They would fall within north European genetic variation. The Longobards were not evenly distributed across northern Italy, and as the paper explained, Collegno was an important site for the Longobards. It was selected by the authors for that reason, because they were interested in the Longobards.

Davidski said...

@Richard Rocca

"Would the closest direct predecessor to I7272 in your G25 sheet be Globular Amphora?"

I'd say GAC has too much HG ancestry to be a significant source of ancestry for I7272, as well as Baltic HG ancestry that is missing in I7272.

[1] "distance%=3.6454 / distance=0.036454"


"Barcin_N" 73.65
"WHG" 20.25
"Narva_Lithuania" 6.1

[1] "distance%=3.88 / distance=0.0388"


"Barcin_N" 82.6
"WHG" 17.4
"Narva_Lithuania" 0


"Can you or someone do some comparisons between Czech CW and BB? What happens when you model CW as one of the ancestral components of Beaker along with GAC?"

I have some posts coming up on the genetic structure and origins of different Beakers. The first one should be ready this weekend.

epoch2013 said...

That farmer CW sample is really surprising. So a fully farmer father had a son buried in fully CW style:

"I7272/Grave 23: 2900–2500 BCE. Right-sided crouched burial, head towards the west. Sex: orientation – M, anthropology – ?, DNA – M. Age: infant (4-6 years). Grave goods: vessel (small beaker), pelvis of cattle"

That sounds like he was fully integrated into CWC.

bellbeakerblogger said...

I mean testing actual wallbanger relatedness between Czech BB and CW, instead of them just being similar in proportions of ancient components.
Perhaps using Dutch Beakers and Latvian CW as a measure of Czechness (or something).

If later Beakers do have local CW ancestry, is it possible to estimate what percentage that is?

André de Vasconcelos said...


The paper seems clear that CL30, CL25, CL38, CL121 and I'd guess CL31 are indeed locals from Collegno. CL31 is kindred with SZ1.
CL57 and CL47 are also kindred with elite folk: CL49 and CL53.
CL94 was of high-ish status and non-local. CL23 is non-local as well.

It seems that the local commoners were the southernmost individuals, and that the elite status is associated with people of foreign ancestry.

bellbeakerblogger said...


Thanks, I look forward to seeing those.

Matt said...

Re I7272, it also looks like this is the case for Czech EBA sample I7197, another male with I2a1. (The description of the site is "A total of 29 graves were found, dated to the older phases of the Únětice culture on the basis of grave equipment (ceramic and bronze inventory) and burial ritual"; so at face value an EEF early Únětice individual.)

It certainly looks like sex-biased mass migration could have taken place, but it does also look, from some of these outliers, like males from EEF groups were incorporated to some degree into expanding Indo-European groups. The sort of "Pots not people" mechanisms that appear to have been massively overemphasised, where Indo-European expanded culturally as a network of ideas that outpaced the movements of actual people by offering traditions that fit the emerging Bronze Age societies, must have existed to some small degree.

I still wonder (though I know it's a bit disfavoured on here) at how much absorption of males from EEF societies was erased by later y lineage founder effects, which would tend to favour expansions of subclades of the most common lineages. (If you have a population that is 80% x and 20% y, then you randomly sample 10% and double their representation, for ex, then I believe eventually you will on average tend to converge on 100% x, much more often than, say, 70% x, 30% y.).
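
The intuition about founder effects can be checked with a quick simulation. The sketch below swaps in a standard neutral Wright-Fisher model (constant population size, every son draws his father at random from the previous generation) for the exact sampling scheme described above, so treat it as an illustration rather than a test of that scheme. The population size, replicate count and function names are arbitrary assumptions.

```python
import random


def run_until_fixation(pop_size, start_freq_x, rng):
    """Resample a constant-size population until only one lineage is left.

    Returns (winner, generations), where winner is "x" or "y".
    """
    count_x = round(pop_size * start_freq_x)
    generations = 0
    while 0 < count_x < pop_size:
        freq_x = count_x / pop_size
        # Every son picks his father at random from the previous generation.
        count_x = sum(1 for _ in range(pop_size) if rng.random() < freq_x)
        generations += 1
    winner = "x" if count_x == pop_size else "y"
    return winner, generations


def fixation_share(pop_size=50, start_freq_x=0.8, replicates=400, seed=1):
    """Fraction of replicate populations in which lineage x takes over."""
    rng = random.Random(seed)
    wins = sum(
        1
        for _ in range(replicates)
        if run_until_fixation(pop_size, start_freq_x, rng)[0] == "x"
    )
    return wins / replicates


if __name__ == "__main__":
    print("share of runs ending at 100% x:", fixation_share())
```

In this model every run ends at either 100% x or 100% y, and the share of runs won by x is expected to match its starting frequency, so roughly four in five here. Intermediate outcomes like 70/30 are only transient states along the way.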

Matt said...

Some plotting of those two outliers using the Global25 data and a random set of other EEF as a comparison:

Rob said...

@ Matt/ R.Rocca

Where someone plots on the 2D plot could mask the reality behind their history, based on similar proportions of WHG/ANF/steppe.
Attempting to look more closely:

Eg CWC Czech 7279
Yamnaya 55%
Ukraine Eneol 17%
GAC 15%
Vucedol 7.5%

CWC Czech 7272

Remedello 40%
Iberia Chalc 32%
Greece Pelop. 10%
GAC 5%

Overfitted, but the point is 7272 has little to do with GAC. Indeed, his I2a2 is an I2a2a2, thus far only found around El Portalon, and different to the I2a2a1b2 of GAC men.

Adrian T said...

Matt said: "I still wonder (though I know it's a bit disfavoured on here) at how much absorption of males from EEF societies was erased by later y lineage founder effects, which would tend to favour expansions of subclades of the most common lineages. (If you have a population that is 80% x and 20% y, then you randomly sample 10% and double their representation, for ex, then I believe eventually you will on average tend to converge on 100% x, much more often than, say, 70% x, 30% y.)."

I wonder about this often as well. Mono-parental markers are essentially clones. From what I've read in population genetics, for a given population of say, bacteria, it will take about 100 generations for all members to be descended from the same individual. 100 generations in humans would be about 3000 years.

In humans you have the problem of establishing the limit of what a population is. But I guess this gives an idea how these dynamics may play out. It could also explain why relatively more isolated populations like Basques or the Irish (an island) appear to have less variety of markers than others.

Of course, you still have to explain how at particular times a 'new' marker becomes dominant. So the migration model from some other population which was already dominated by it makes perfect sense.

Re. the IE question, we still need to find out the exact role of yDNA hg I in it. To me, it does look like I1 mixed into the steppe incomers somewhere in Scandinavia or the Northern Baltic, and it later hitched a ride south with the Germanic expansions.

namedguest said...

Davidski said he will post about it in detail, but as per your requests:
"Perhaps using Dutch Beakers and Latvian CW as a measure of Czechness (or something)."
"If later Beakers do have local CW ancestry, is it possible to estimate what percentage that is?"

[1] "distance%=2.2634"



@Adrian T
"R1a", "R1b" and "I" are known haplogroups from Mesolithic Europe, and in the same way R1a and R1b were part of the Steppe genesis, so was I, but it seems to a lower degree.

Samuel Andrews said...

I doubt Northern Beaker has any ancestry from Corded Ware. They're two different tribes represented by two different paternal lineages. Beaker's R1b P312 comes from Yamnaya-like people maybe even Yamnaya itself. So, the high Steppe in northern Beaker has nothing to do with Corded Ware.

Yamnaya moved as far west as Hungary with Vucedol. This is confirmed by the Z2103 in Vucedol. It isn't too crazy to think Yamnaya or some relative moved as far west as Britain without mixing with Corded Ware.

Samuel Andrews said...

Even more amazing, a Beaker burial in Hungary includes a person of entirely local Hungarian farmer descent alongside someone of 75% Steppe origin (Z2103+).

Ric Hern said...

@ Samuel

How did Northern Beakers pick up Narva Ancestry without picking up some Corded Ware ? Did they move earlier up the Elbe than the arrival of Corded Ware in that area ? Or did I miss something ?

Samuel Andrews said...

@Ric Hern,

Narva admixture only applies to Corded Ware in the Baltic States. Also, we don't know if BBC had Narva ancestry; all we know is they have excess WHG.

Considering R1b L151 came from the Steppe, there's no need for Corded Ware admixture to explain their Steppe ancestry. It makes more sense that Bell Beaker and Corded Ware are separate phenomena, like LBK and Cardial were.

namedguest said...

@Samuel Andrews
The thing is, and also what bellbeakerblogger expressed is:
Later after the Beaker genesis and after the CW genesis, have these two groups mixed?

Rob said...

Some people think BB is a subsection of CWC, like copper smiths.
At the very least CWC and B.B. interacted & secondarily admixed.
It’s rather clear in multiple lines of evidence but “proving it genetically” is complex, at least without Hungarian Yamnaya samples.

epoch2013 said...


BB has quite some EEF and WHG. There are two possible scenarios:

1) It picked it up and then migrated into Europe. This requires an admixture near the Yamnaya source. That Yamnaya source should have been in the Steppe or in Eastern Europe, in Hungary. But the latter had already evolved into other cultures by the time BB arrived. So where did they hide all these centuries? Or are any of these cultures the source of BB? The other possibility means BB ran over these cultures in their run to the west.

2) It is local admixture. This only requires R1b in CW, of which we have at least one sample (Polish). It requires some extra WHG which was readily available in the fringes of farmer cultures (Blätterhöhle, Vlaardingen culture possibly as well).

You have a big collection of mtDNA: What can you make of BB's and CW's mtDNA?

Davidski said...

Wow, Maju has totally lost the plot now.

Funny to watch though.

Grey said...

Ric Hern said...
"How did Northern Beakers pick up Narva Ancestry without picking up some Corded Ware ? Did they move earlier up the Elbe than the arrival of Corded Ware in that area ? Or did I miss something ?"

hare and tortoise?

population A traveling west along the Volga, one part stops in central Europe and mixes with local farmer ancestry creating population B while the other carries on to the Baltic and mixes with local HGs there to create population C then at some later point parts of B and C recombine?

Grey said...


or if that's too far north, Danube-Elbe or Dneipr-Vistula etc.

Folker said...

"I still wonder (though I know it's a bit disfavoured on here) at how much absorption of males from EEF societies was erased by later y lineage founder effects, which would tend to favour expansions of subclades of the most common lineages."

If PIE societies were patriarchal, patrilocal and patrilineal, it doesn't mean that admixture was only mediated through females. It doesn't have to.

What it means is that in those cultures there existed a social structure where the main clan dominated socially and dominated reproduction. A man of the main clan had a better chance of finding a bride, having children, and probably having concubines. A man not from the main clan had a lower chance of finding a bride, probably had a lower status, and hence had a lower chance of having many children.

In such groups, the Y haplogroup of the main clan could erase very easily other haplogroups in several decades.

If you look at some Y haplogroups connected to the diffusion of IE, you'll see that some of them are of farmer origin, like G2. Hence probably the presence of G2 in India (specifically in higher castes and ethnic groups).

It was probably already the case in the Steppe during the Yamna-like population ethnogenesis. That's why it is probably wrong to say that the female-mediated CHG admixture was from exchanges of wives, or concubines and so on. But the high level CHG groups weren't probably at the core of pre-PIE culture, and the CHG males weren't part of the dominant clans and hence weren't able to transmit their Y haplogroups to their descendants.

Hence huge founder effects for the main clans.

"Some people think BB is a subsection of CWC CWC, like copper smiths.
At the very least CWC and B.B. interacted & secondarily admixed
It’s rather clear in multiple lines of evidence but “proving it genetically” is complex, at least without Hungarian Yamnaya samples"

Completely agree with you, even if I will not rule out a scenario where neither Hungarian Yamna nor CW took part in BB's ethnogenesis, and where they took their Steppe admixture directly from the Western Pontic Steppe, through a northern route to Central Europe. But Hungarian kurgans will tell us.

Alogo said...

Folker, at face value and without knowing the particular subclades, the G2a in India and Central Asia doesn't implausibly seem connected to eastern Neolithic (there was an Iranian Neolithic sample with G2a too IIRC) as a minor haplogroup. Do you have some interesting subclades in mind that unite India with deep Europe, for example?

Tesmos said...


Where are the Czech_EBA samples on your plot? Would be interesting to compare Czech_EBA with Czech_CWC & BB samples.

Ryan said...

Do we have any R1a Bell Beakers yet? I have a hard time believing CWC got completely rolled over by them.

Samuel Andrews said...

"You have a big cllection of mtDNA: What can you make of BB's and CW's mtDNA?"

I haven't had time to look at the new Beaker mtDNA. But from what I remember, Beaker & Corded Ware have no obvious mtDNA link other than being a farmer, Steppe mix. Many of the links between them I view, based on modern mtDNA research, as generic European lineages. They may be generic today but back then represented a special link. There's really no way to know at this point.

I will say though, Neolithic Britain and Spain have related mtDNA. And it doesn't look like Iberian farmers or Beaker's farmer ancestors had any N1a1a, which in part explains the disappearance of N1a1a. Though it looks like N1a1a might still have a presence in some parts of Central-East Europe, like in Austria.

Gaspar said...

I find it strange that in the Lombard paper people justify their claims by using POPRES country instead of POPRES regional. Why is that?

Clearly these "non-Lombard" samples, including SZ28, require further checking.

Example: CL23 is stated as BUL, has plotted with Hungarian Bronze Age samples, and yet has lots of IBS and minimal SAS. The likely scenario is that his origins began in the Wallachian Plain, where Anatolians combined with steppe people.

epoch2013 said...


We *do* have one R1b CW, and one possible R1b if Mathieson 2018 is correct and RISE436 is R1b rather than R1a. I emailed Iain Mathieson about that sample and he was kind enough to answer that the R1b1 was based on L1349, but as it is a C->T mutation and the sample is not UDG-treated, it could be damage. Also, it didn't look like it's very high coverage, so it could be wrong. (quote edited to fit my response)

epoch2013 said...


Also, BB didn't occupy the entire realm you sometimes see on maps. This is a map from Nature on the Olalde paper.!/image/nature-bell-beaker-map-18-may-17-online.png_gen/derivatives/landscape_630/nature-bell-beaker-map-18-may-17-online.png

So, by sampling Bell Beaker samples we are deliberately targeting an R1b population, which is bias.

MomOfZoha said...

"I still wonder (though I know it's a bit disfavoured on here) at how much absorption of males from EEF societies was erased by later y lineage founder effects, which would tend to favour expansions of subclades of the most common lineages. (If you have a population that is 80% x and 20% y, then you randomly sample 10% and double their representation, for ex, then I believe eventually you will on average tend to converge on 100% x, much more often than, say, 70% x, 30% y.)."

Goes to show you how ridiculous it is to frame entire historical arguments based solely on Y-DNA.

Autosomal will always be far more interesting, entropy preserving form of ancestral information. But, that's harder to really analyze appropriately, with as many pairwise similarity measures as there are genes (I'm exaggerating of course, but)... For example, comparing cM sharing data with the Global 25 data yields vastly different patterns. For example: If a person is half Chinese and half Namibian, then would they not be very far from both of their parents in the Global 25 graphs (PCA or nearest neighbor graph, doesn't matter)?

As for the plausibly bigger question of IE origins, I wish I could care more to devote any effort diving into it. Once upon a time, I was so interested in the roots of Indo-European as any English speaking Latin student would be. Although Turkish is my mother tongue, our school didn't offer any classes in non-IE languages (it was remarkable that such a rural Bible belt school even had a Latin teacher at all -- I was so fortunate as she was the last one). Years later, I have grown more to appreciate Semitic and Turkic languages to which I have been naturally exposed, in addition to the pervasive IE all over.

I cannot then help but wonder if the special curiosity in IE in particular is driven by the circumstance of its dominance today. Why isn't there as great an interest in the origin of language itself? Or is proto-IE implicitly claimed to have a special role in the development of human language?

When millions of $$, time, and effort, are expended by all kinds of very smart people to study the origin of IE, do we get even epsilon closer to understanding the origin of language?

My feeling is that many people have "evolutionary" arguments in their own minds regarding the origin of IE, just like they have "evolutionary" arguments concerning the pervasiveness of certain Y-haplogroups.

You see, what people can say directly is one thing, and what truly moves them is another thing altogether.

I would not at all be surprised if all the IE researchers ultimately believe that some kind of evolutionary "optimality" criteria must have contributed to the pervasiveness of the language group today. Given the overlap between those studying IE and those obsessed with Y-hg R1a, I do not doubt that similar people feel similarly regarding the ancestral forefather of R1a.

I am sure that you think deeply enough to understand the fallacies inherent in such pseudo-evolutionary arguments that are often thought and seldom said. Looking backwards as we think in an IE language, it is natural to feel that there must be something inherently special about this all from the very beginning to bring it to today. How could we possibly *not* think this, huh?

Folker said...

I have nothing in particular in mind about haplogroup G2. Anyway, I've read some interesting analyses linking the diffusion of some haplogroups, like G2, to secondary expansions of IE speakers. For India, there are some possibilities, like the Z30503 or the Z30522 subclades. But Indian genetics are clearly very difficult to decipher, notably given the highly political issues at stake.

Rob said...

It’ll please Maju and everyone that some Mesolithic, Neolithic and Bronze Age studies are due soon, apparently, from France, a big undersampled region.

But still missing from the B.B. data are samples from the earliest phase (A1) from German and Swiss sites.

Davidski said...


You can plot this with PAST.

Samuel Andrews said...

Btw, everybody, I admit I was wrong about significant SW Asian/Natufian stuff in southern Europe. At least, the evidence now indicates it isn't significant. It is only significant in southern Italians. Using D-stats with no SW Asian outgroup, basal-heavy IranNeo and Natufian are almost indistinguishable, which explains the high SW Asian scores I got.

However, I do very much suspect there is some Levantine stuff in much of southern Europe, especially Italy and Greece.

Also, the CHG stuff in mainland Italy, Spain, France, Balkans can't be very old. I'm thinking most of it came in the Bronze age.

Rob said...

@ Sam

“Also, the CHG stuff in mainland Italy, Spain, France, Balkans can't be very old. I'm thinking most of it came in the Bronze age.”

Just keep in mind that “southern Bronze Age” is 2500 BC. That’s as old as Beaker in Northern Europe, & is in the formative phase of “European civ.”, and therefore calling it “recent” is somewhat of a relative miscategorization.
But it’s good to see you’ve eventually come round to the correct conclusions for yourself, which is more convincing than anyone else telling you.

Samuel Andrews said...


Ok, then I'm thinking 2000-1000 BC. There may have been a massive movement from the east med into Italy.

Samuel Andrews said...

Who's the best African reference for southwest Asia?

namedguest said...

@Samuel Andrews
If it's not related to Morocco and America and it's pre-Islamic, then it's NOT Yoruba/West African.
The Khoisan/South Africans never contacted Eurasians prior to the English/Dutch.
Do NOT use the Hadza unless you know what you're doing.

So, what you want is an East African.
Somali, Oromo and Ethiopian are Arab-admixed.
The Dinka has West African ancestry but no Levantine admixture.
The Bantu of Kenya is just like the Dinka, but mainly West African with some East African ancestry.

Rob said...

@ Sam

"Ok, then I'm thinking 2000-1000 BC. "

Can't tell for sure yet, but I'd say more like 2500-1800.
After that, the movements in Italy were probably north to south, apart from a minor hypothesized 'oriental' influence on Etruscans, as some claim.

Samuel Andrews said...

Thanks, namedguest. Looks like those Africans give good fits....




Slumbery said...


I do not think there is such high AN ancestry in Egypt realistically. I think it is inflated significantly in your model, because Levant_N is unaccounted for, and also because Iran_N in Egypt actually arrived via Iran_Chalcolithic. Egyptians should have both direct and indirect Levant_N ancestry, and its absence in the model causes Barcin_N to come up that high.

Arza said...

Mbuti Ukraine_Eneo_I5884 Iran_ChL Armenia_EBA 0,0142 4,246 547912
Mbuti Ukraine_Eneo_I5884 Iran_ChL Armenia_ChL 0,0217 6,55 552212
Mbuti Ukraine_Eneo_I6561 Iran_ChL Armenia_EBA 0,0152 4,617 668786
Mbuti Ukraine_Eneo_I6561 Iran_ChL Armenia_ChL 0,0183 6,085 674449
Mbuti Ukraine_Eneo_I5882 Iran_ChL Armenia_EBA 0,0134 3,751 378136
Mbuti Ukraine_Eneo_I5882 Iran_ChL Armenia_ChL 0,0189 5,38 380525
Mbuti Ukraine_Eneo_I4110 Iran_ChL Armenia_EBA 0,0113 3,395 549621
Mbuti Ukraine_Eneo_I4110 Iran_ChL Armenia_ChL 0,0227 7,025 553058
Mbuti Ukraine_Neolithic_ Iran_ChL Armenia_EBA 0,0166 7,202 891529
Mbuti Ukraine_Neolithic_ Iran_ChL Armenia_ChL 0,0288 12,974 904042
Mbuti Ukraine_Mesolithic Iran_ChL Armenia_EBA 0,0174 6,965 867506
Mbuti Ukraine_Mesolithic Iran_ChL Armenia_ChL 0,0286 11,689 878750

The calls show that I5884 belonged to Y haplogroup R1b1a1a2a2b-BY3293.

R-Y4366 (Y4366): formed 4300 ybp, TMRCA 4300 ybp
id:YF10690 ARM [AM-LO]

Maybe R1b entered Yamnaya via Armenian(-like) population?
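
For readers not used to this kind of output, the rows in the table above can be read with a few lines of Python. This assumes the columns are four population labels followed by the D value, its Z-score and the SNP count, written with decimal commas; that layout is a reading of the table rather than something stated in the comment. The |Z| of 3 cutoff is a common rule of thumb, not a hard rule, and the function names are hypothetical.

```python
def parse_row(line):
    """Split one row into four population labels and three numeric fields."""
    parts = line.split()
    pops = parts[:4]
    # The table uses decimal commas, so swap them for points before converting.
    d_value = float(parts[4].replace(",", "."))
    z_score = float(parts[5].replace(",", "."))
    snps = int(parts[6])
    return {"pops": pops, "d": d_value, "z": z_score, "snps": snps}


def is_significant(row, threshold=3.0):
    """Rule-of-thumb check: treat |Z| at or above the threshold as significant."""
    return abs(row["z"]) >= threshold


if __name__ == "__main__":
    # Two rows copied from the table above.
    rows = [
        "Mbuti Ukraine_Eneo_I5884 Iran_ChL Armenia_EBA 0,0142 4,246 547912",
        "Mbuti Ukraine_Eneo_I4110 Iran_ChL Armenia_EBA 0,0113 3,395 549621",
    ]
    for line in rows:
        row = parse_row(line)
        flag = "significant" if is_significant(row) else "not significant"
        print(row["pops"][1], row["d"], row["z"], flag)
```

Every row in the table clears that cutoff. Which pair of populations a positive D points to depends on the sign convention of the software used, so that is worth checking before reading too much into the direction.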

Arza said...

CWC_Baltic_early 39.8%
Globular_Amphora 34.8%
Yamnaya_Kalmykia 17.2%
Ukraine_Eneolithic:I5884 8.2%
Distance 1.5601%

CWC_Baltic_early 45.8%
Globular_Amphora 31.4%
Yamnaya_Kalmykia 14.8%
Ukraine_Eneolithic:I5884 8%
Distance 1.5406%

Globular_Amphora 52.4%
Yamnaya_Kalmykia 26.6%
CWC_Baltic_early 21%
Ukraine_Eneolithic:I5884 0%
Distance 0.5753%

Globular_Amphora 67%
CWC_Baltic_early 21.2%
Yamnaya_Kalmykia 11.8%
Ukraine_Eneolithic:I5884 0%
Distance 2.0329%

Historical viability unknown.

Arza said...

And now the question appears. Who descends from who and what is the exact origin of CWC, Yamnaya and Srubnaya Outlier.

CWC_Baltic_early 69.2%
Srubnaya_outlier 21%
Armenia_EBA 8%
Samara_Eneolithic 1.8%
Armenia_ChL 0%
Distance 2.7165%

CWC_Baltic_early 63%
Srubnaya_outlier 22.6%
Armenia_EBA 7.2%
Samara_Eneolithic 7.2%
Distance 2.8682%

CWC_Baltic_early 66.8%
Srubnaya_outlier 23.4%
Armenia_EBA 8.4%
Samara_Eneolithic 1.4%
Distance 3.4452%

a said...

The calls show that I5884 belonged to Y haplogroup R1b1a1a2a2b-BY3293
0% Steppe

R-Y4364 (Y4363/M12160 * Y4369 * Y4367/A368/M589 +10 SNPs): formed 5700 ybp, TMRCA 4300 ybp

R-PF331 (FGC35088 * PF331): formed 5700 ybp, TMRCA 5200 ybp

Matt said...

MoZ: For example: If a person is half Chinese and half Namibian, then would they not be very far from both of their parents in the Global 25 graphs (PCA or nearest neighbor graph, doesn't matter)?

Yes, they would. I think what this would mean would be that they would be far from their parents in the % population structured genetic variance captured by the Global25... but as this (the population structured genetic variance) is a very low amount of the total human genetic variance, the child would still be roughly as close in real terms to their parents as any other child, and as easily recognizable as a child of those parents. This is the kind of theoretical issue I have not thought much about though.
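
A toy calculation makes the geometry concrete. Because PCA coordinates are a linear projection of genotypes, a first-generation child is expected to land about halfway between the two parents, which puts the child at half the parent-to-parent distance from each of them. The two-dimensional coordinates below are invented for illustration and ignore sampling noise; they are not Global25 values.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def expected_child(parent_a, parent_b):
    """Expected PCA position of a child: the midpoint of the two parents."""
    return [(x + y) / 2 for x, y in zip(parent_a, parent_b)]


if __name__ == "__main__":
    # Hypothetical coordinates for two parents from very distant populations.
    parent_a = [0.12, -0.03]
    parent_b = [-0.28, 0.05]
    child = expected_child(parent_a, parent_b)
    print("parent to parent:", euclidean(parent_a, parent_b))
    print("child to parent A:", euclidean(child, parent_a))
    print("child to parent B:", euclidean(child, parent_b))
```

So the further apart the parents plot, the further the child plots from both of them, even though the child is no less related to either parent than any other child would be.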

"I cannot then help but wonder if the special curiosity in IE in particular is driven by the circumstance of its dominance today."

Partly I think it's because it has an early and central role in historical linguistics (the first meaningful projects were to address it and ethnocentricism is most likely part of that story to some degree), and then the volume of data built and particularly the way it interacts with archaeology in Europe and Russia (where it is plentiful and easy).

The big spread and corpus of writing also helps to actually study with comparative methods; Austronesian is also subject to a fair amount of work studying the spreads, as there's just this huge spread, but might be a bit more hard work. Bantu languages would also be good. Even Semitic or Sino-Tibetan seem like comparatively harder work.

Though this is just my perception, I don't think that evolutionary reasons favouring particular structural characteristics of IE languages as responsible for IE spread are very popular within historical linguistics of IE. David Anthony does I think talk about how particular reconstructed terminology within IE suggests a particular cultural complex which was able to expand (though we have more solid reasons now to believe it was a case of migration) and also act as a lingua franca etc, but I don't think he or any of his peers would ever credit any grammatical or phonological features of the language as having anything to do with that spread (the proto-language could almost have had any grammar or phonology; the arbitrariness of the sign).

If anything, I would guess actually these ideas that there is an advantage to the language itself would be disfavoured by them, firstly because it gets you into this Sapir-Whorf quagmire about how language affects thought and proposing all kinds of radical ideas which are probably wrong, and secondly because it actually undermines the idea that IE would expand via cultural expansion or migration (if the grammar and phonology of IE languages simply helps people think in a particular way, then the languages could plausibly expand without either a cultural expansion or mass migration).

Similarly for the R1 haplogroup - people who are interested in the markers expanding with the IE cultures and attribute this to associated cultural process actually in my experience tend not to prefer ideas that the haplogroup itself has a particular selective/phenotype advantage (resistance to some disease, etc.) as it's then less "useful" as a selectively neutral marker correlated with their preferred cultural process (migration, the meme-plex and elites).

(Of course, people who just want R1 and IE to be as "glorious" as they can make them will likely combine lots of ideas together which undermine each other, but they don't seem to be much linked to the academics here.)

Slumbery said...

"I cannot then help but wonder if the special curiosity in IE in particular is driven by the circumstance of its dominance today. Why isn't there as great an interest in the origin of language itself? Or is proto-IE implicitly claimed to have a special role in the development of human language?"

Yes, IE languages are in the focus of interest because of their current dominant position, the number of speakers and the economic/cultural weight of the countries that speak them. This seems so evident to me that I never imagined that somebody would "wonder" about it. :)

I never heard anybody claiming that proto-IE has any special role in the development of human languages in general. It was too late for that at any rate. This is not the reason for the interest.

Also, there _is_ a lot of interest in the origin and development of human language in general, but that is a much more difficult topic, and the amount of actual hard data is smaller. So if it seems that there is more activity around IE languages in particular than around the origin of language in general, that is not due to a lack of interest in the latter.
(Also, obviously, we are talking about this in a thematic site with a defined range of interest, where any linguistic debate is tangential at best, and the origin of human language is downright off topic.)

Slumbery said...

Matt: Although many other factors play a role, very often the main reason behind the success of a particular human population or language is just being in the right place at the right time. For example, the reason why the main agricultural revolution happened in the Fertile Crescent is not that people there were smarter or had a better language, but simply that the geographical position (with its particular flora, fauna and climate) was the best for that. This could be said about a lot of differential development and success, even large-scale things, like the differential development of America and Eurasia.

Matt said...

@Folker, yeah, well expanded on, and I would have tended to agree that my intuition would be that those patrilineal main / grafted subbranch and social structure dynamics (as reflected by divisions of property, "justice", conflicts and feuds, etc.) would reinforce the tendency to favour particular groups of males sharing a patrilineal haplotype (tentatively those original or otherwise high status to the admixing groups, which may not always have been the same thing).

At the same time, my above post was a bit like trying a purely statistical version without considering social structure as some folk I've read who seem to know about archaeology seem to be doubtful that early Bronze Age/putatively IE groups sustained this kind of social structure, as there's low evidence for social structure in the archaeology as they see it. (Did they really have it? I dunno.)

Matt said...

Slumbery: "Although many other factors play a role, very often the main reason behind the success of a particular human population or language is just being in the right place at the right time."

Yes, this is the normal position that virtually everyone takes or believes (though usually more like right place->particular cultural practices, which may entail various other things).

Matt said...

@Davidski, if possible could you run these D-stats:

For comparing GAC against other MN pairs. I haven't thought about D-stats for a while, but interested to see what these show (if anything).

cesar sanz said...

Dear Davidski,
I just tried the EUtest V2 K15 model with my results from 23andMe.
I would like to know how you determine the DNA of the different clusters (Atlantic, Baltic, etc.).
Do you use modern or old DNA? If old, how old is it?

Where can I find a description of each cluster in terms of the DNA used to establish it?
Thanks in advance

Davidski said...


No Lengyel_LN in the new dataset. And no French_South in this dataset (they're in another dataset), etc...

@Cesar sanz

See here.


No more discussions about politics, language, or stupid questions why some topics are more popular here than others.

Obviously, this is a blog that foremost deals with European genetic origins and structure.

So if you want to talk about Indo-Europeans and Indo-European languages, this must be done in the context of how the Indo-European expansion impacted on the genetics of Europeans.

Posts that don't stick to this rule will be removed and repeat offenders banned.

Matt said...

Thanks David. A few more I forgot to ask for last time, sorry:

Chad Rohlfsen said...


8 of the TDLN samples are Lengyel.

Chad Rohlfsen said...

If this works as well as it should, it is Lengyel_LN + WHG more than Narva for GAC. The U and P stand for Ukraine and Poland.

result: Mbuti_DG Globular_Amphora_P Germany_MN Iberia_MN 0.0022 0.697 25153 25042 475500
result: Mbuti_DG Globular_Amphora_P Germany_MN Lengyel_LN -0.0012 -0.413 25507 25569 481805
result: Mbuti_DG Globular_Amphora_P Iberia_MN Lengyel_LN -0.0024 -0.936 37737 37917 710718
result: Mbuti_DG Globular_Amphora_U Germany_MN Iberia_MN 0.0011 0.362 30250 30186 574837
result: Mbuti_DG Globular_Amphora_U Germany_MN Lengyel_LN -0.0026 -0.970 30945 31104 584588
result: Mbuti_DG Globular_Amphora_U Iberia_MN Lengyel_LN -0.0052 -2.156 46396 46878 880195
result: Narva Globular_Amphora_P Germany_MN Iberia_MN -0.0026 -0.888 23217 23340 475381
result: Narva Globular_Amphora_P Germany_MN Lengyel_LN 0.0143 5.390 23937 23264 481657
result: Narva Globular_Amphora_P Iberia_MN Lengyel_LN 0.0182 7.744 35554 34280 710517
result: WHG Globular_Amphora_P Germany_MN Iberia_MN -0.0083 -1.903 16421 16694 337782
result: WHG Globular_Amphora_P Germany_MN Lengyel_LN 0.0216 5.348 17082 16360 341290
result: WHG Globular_Amphora_P Iberia_MN Lengyel_LN 0.0320 10.072 24877 23334 490835
result: Narva Globular_Amphora_U Germany_MN Iberia_MN -0.0039 -1.359 28005 28224 573990
result: Narva Globular_Amphora_U Germany_MN Lengyel_LN 0.0127 4.930 29076 28349 583552
result: Narva Globular_Amphora_U Iberia_MN Lengyel_LN 0.0155 6.563 43858 42516 878835
result: WHG Globular_Amphora_U Germany_MN Iberia_MN -0.0092 -2.268 18693 19039 385037
result: WHG Globular_Amphora_U Germany_MN Lengyel_LN 0.0230 6.033 19583 18703 390680
result: WHG Globular_Amphora_U Iberia_MN Lengyel_LN 0.0323 9.945 28926 27119 570628
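
For anyone checking the arithmetic behind these lines: the D values above are consistent with D = (BABA - ABBA) / (BABA + ABBA), reading the last three columns as BABA, ABBA and SNP count. That column reading is an assumption, although it matches the numbers, and the |Z| >= 3 cut-off used below is just the usual convention rather than something stated in this thread. A minimal sketch:

```python
def d_stat(baba, abba):
    """D = (BABA - ABBA) / (BABA + ABBA)."""
    return (baba - abba) / (baba + abba)


def parse_result(line):
    """Parse one 'result:' line; the column meanings are assumed, not documented here."""
    parts = line.split()
    if parts[0] == "result:":
        parts = parts[1:]
    return {
        "pops": tuple(parts[:4]),
        "D": float(parts[4]),
        "Z": float(parts[5]),
        "BABA": int(parts[6]),
        "ABBA": int(parts[7]),
        "SNPs": int(parts[8]),
    }


rows = [
    "result: Mbuti_DG Globular_Amphora_P Germany_MN Iberia_MN 0.0022 0.697 25153 25042 475500",
    "result: WHG Globular_Amphora_P Iberia_MN Lengyel_LN 0.0320 10.072 24877 23334 490835",
]

for line in rows:
    r = parse_result(line)
    recomputed = d_stat(r["BABA"], r["ABBA"])
    # Conventional cut-off (an assumption): |Z| >= 3 counts as significant.
    flag = "significant" if abs(r["Z"]) >= 3 else "not significant"
    print(r["pops"], f"D={recomputed:.4f}", f"Z={r['Z']}", flag)
```

The recomputed D values round to the reported ones, which is a quick sanity check that the columns are being read the right way round.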

Chad Rohlfsen said...

The problem, of course, is higher WHG in Iberia and Germany can skew this. IBD, rare alleles, and such are needed. I still favor Germany at the end of the day, if those types of analysis are done.

Davidski said...

Global 25/nMonte can correctly pick out the different types of forager input in different Corded Ware and Bell Beaker groups. It's very interesting, when done properly.

I've got a blog post coming up about it.

Rob said...

@ Dave

"Global 25/nMonte can correctly pick out the different types of forager input in different Corded Ware and Bell Beaker groups. It's very interesting, when done properly."

I'd agree up to the Early and Middle Neolithic.
Beyond that it gets less useful. E.g. BB & CWC are just going to show variations in EHG, CHG and UkraHG.
To discern the key behind BB & CWC, you'd need to analyse individuals by region, site, haplogroup, etc., with more proximate sources, like what I'm doing.
And I think I've got it, but am having a slight chicken/egg moment.

Chad Rohlfsen said...

Some F3s

Lengyel_LN WHG Globular_Amphora_P -0.003460 0.001443 -2.398 228921
Lengyel_LN Narva Globular_Amphora_P -0.000193 0.001165 -0.166 358071
Germany_MN WHG Globular_Amphora_P 0.006813 0.002001 3.405 143909
Germany_MN Narva Globular_Amphora_P 0.005395 0.001413 3.818 240042
Iberia_MN WHG Globular_Amphora_P 0.009351 0.001666 5.613 216468
Iberia_MN Narva Globular_Amphora_P 0.006910 0.001299 5.320 348333
Germany_MN Lengyel_LN Globular_Amphora_P 0.009334 0.001501 6.220 212371
Iberia_MN Lengyel_LN Globular_Amphora_P 0.007979 0.001229 6.493 314246
Iberia_MN Germany_MN Globular_Amphora_P 0.009689 0.001567 6.183 201406
Lengyel_LN WHG Globular_Amphora_U -0.002963 0.001471 -2.015 344234
Lengyel_LN Narva Globular_Amphora_U 0.000074 0.001257 0.059 596433
Germany_MN WHG Globular_Amphora_U 0.005061 0.001931 2.620 208052
Germany_MN Narva Globular_Amphora_U 0.004405 0.001519 2.901 376518
Iberia_MN WHG Globular_Amphora_U 0.010159 0.001698 5.984 322308
Iberia_MN Narva Globular_Amphora_U 0.006742 0.001421 4.745 570991
Germany_MN Lengyel_LN Globular_Amphora_U 0.008211 0.001496 5.489 326818
Iberia_MN Lengyel_LN Globular_Amphora_U 0.007826 0.001334 5.868 503741
Iberia_MN Germany_MN Globular_Amphora_U 0.008924 0.001531 5.828 308750
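
To read these rows, it helps to know that the Z column is simply f3 divided by its standard error, and that only a clearly negative f3 counts as an admixture signal. A minimal sketch, assuming the columns are Source1, Source2, Target, f3, std. err, Z and SNPs (which fits the numbers but is not stated in the comment), and using the conventional Z <= -3 cut-off:

```python
def parse_f3(line):
    """Columns assumed: Source1 Source2 Target f3 std.err Z SNPs."""
    s1, s2, target, f3, se, z, snps = line.split()
    return {
        "sources": (s1, s2),
        "target": target,
        "f3": float(f3),
        "se": float(se),
        "Z": float(z),
        "SNPs": int(snps),
    }


def verdict(row, threshold=-3.0):
    """Label a row using the conventional Z <= -3 cut-off (an assumption)."""
    if row["Z"] <= threshold:
        return "significantly negative: admixture signal"
    if row["f3"] < 0:
        return "negative, below significance"
    return "positive: no admixture signal from this pair"


rows = [
    "Lengyel_LN WHG Globular_Amphora_P -0.003460 0.001443 -2.398 228921",
    "Iberia_MN WHG Globular_Amphora_P 0.009351 0.001666 5.613 216468",
]

for line in rows:
    r = parse_f3(line)
    z = r["f3"] / r["se"]  # should reproduce the reported Z up to rounding
    print(r["sources"], "->", r["target"], f"Z={z:.3f}", verdict(r))
```

Note that a positive f3 does not rule out admixture; it only means this particular pair of sources gives no signal.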

Rob said...

Speaking of which, and contrary to Olalde, your data is showing that the dominant EEF ancestry in Beakers is Iberian!

Chad Rohlfsen said...

I think nMonte is highly inflating Iberian ancestry due to the HG excess. Just like Admixture will with a component that peaks in the Atlantic MN.

Rob said...

I don’t think so.
Recall that LBK collapsed.
All those Germany MNs and Michelsberg, even GAC, which precedes CWC-BB, will have an Iberian component in some way.

Chad Rohlfsen said...

Why would they have any Iberian? Rossen, Stroke-Ornamented, and Lengyel are from LBK.

Rob said...

Because the main MNE input into BB is GAC.
And that has a split of Iberian and Lengyel.

Chad Rohlfsen said...

No way. There's nothing Iberian-like about GAC. Not culturally or genetically. Those f3s fail and qpAdm/Graph say no too.

If anything, it is a Rossen and Lengyel mix, where you already have beakers, amphorae, and footed bowls.

Chad Rohlfsen said...

Try making an Estonian or Latvian. If you get high Iberian, there's some issues.

Davidski said...


The four Global 25 datasheets have been updated. Many of the Beakers are now grouped into new categories to reflect better the substructures in the data. See here...

Rob said...

@ Chad

"No way. There's nothing Iberian-like about GAC. Not culturally or genetically. Those f3s fail and qpAdm/Graph say no too."

Megalithism stretched from Iberia to the Baltic.
Where's the centre of this phenomenon? France.
What is the foundation of the French Neolithic? Cardial and LBK streams meeting in the centre.

What is the EEF background of GAC ?

Iberia_EN 58.3 %
LBK_EN 20.4 %
Ukraine_N 9.55 % + WHG:I1875 8.05 % + Romania_HG 3.7 %
((WHG:Rochedane 0 %
Loschbour:Loschbour 0 %
Latvia_HG 0 %
Narva_Lithuania 0 %
Narva_Lithuania:Kretuonas1 0 %
SHG 0 %))

So Iberia probably fails in F3 because HG-associated drift skews the output, and the link is indirect (not really 'Iberian', but southern French).

Rob said...

@ Dave

Thanks for the update. I7044 (Hungarian BB Z2013) looks like another outlier.

Beaker_HungaryZ2013:I7044 2500 BC
Baden_LCA 71.8 %
Ukraine_Eneolithic:I5884 19 %
Anatolia_ChL 8.3 %
Yamnaya_Samara:I0370 0.7 %

C.f. Ukraine Eneol. Z2013 (I5884) 2800 BC
Ukraine_N 69.25 %
LBK_EN 20.75 %
Iron_Gates_HG 7.95 %

That's two Z2013's that look more Ukraino-Carpathian than Samaran.

huijbregts said...

The beakers in Global25 continue to be interesting. The maximal score on PC12 is Beaker_Sicily_no_steppe:I4930, followed by Ukraine_N_outlier:I3719, 11 SAA Africans and Beaker_Iberia_no_steppe:I4247.

Alberto said...


To verify these two:

WHG Globular_Amphora_P Iberia_MN Lengyel_LN 0.0320 10.072 24877 23334 490835
WHG Globular_Amphora_U Iberia_MN Lengyel_LN 0.0323 9.945 28926 27119 570628

Wouldn't it be advisable to run something like:

WHG Germany_MN Iberia_MN Lengyel_LN

To see if you see something similar or not? If the result is similarly significant, then the two above won't be very meaningful. But if the result is not significant, then the first two will be showing something probably real.

Grey said...

Chad Rohlfsen said...
"Try making an Estonian or Latvian. If you get high Iberian, there's some issues."

Atlantic Megalith culture?

the megalith culture map looks like maritime hopping to me:

->Southern France

initially obsidian & amber with soft metals added later?

Davidski said...


The beakers in Global25 continue to be interesting. The maximal score on PC12 is Beaker_Sicily_no_steppe:I4930, followed by Ukraine_N_outlier:I3719, 11 SAA Africans and Beaker_Iberia_no_steppe:I4247.

Probably just a reflection of inflated postmortem damage. Even UDG-treated samples suffer from this, and it's likely to be more apparent in low coverage samples with fewer markers available to test.

Alberto said...


Haven't tested with that specific setup, but there are a few Iberian Bell Beakers (with EHG admixture) that also seem to lack CHG, or mostly. Would you try them with that same setup? They are I5665 and I6472. And to a lesser degree also these two: I6539 and I0461.

A couple of Bell Beakers from southern France could have a similar profile: I1388 and to a lesser degree I3874.

Grey said...

Samuel Andrews said...
"Also, the CHG stuff in mainland Italy, Spain, France, Balkans can't be very old. I'm thinking most of it came in the Bronze age."

Magna Graecia


"Who's the best African reference for southwest Asia?"

pure guess but

the people called "worm eaters" (Dawada or Dawwada) from Libya


"The Khoisan/South Africans never contacted Eurasians prior to the English/Dutch."

not the ones from south Africa maybe but what if related groups were more widespread before the Bantu and Eurasian expansions?

if so then you might only find traces of that original population in the most inaccessible regions?

zardos said...

Khoisan have Eurasian admixture. The Hottentots even more than the average Bantu. E1b1b is a good indicator for Eurasian admixture in South Eastern Africa.

Matt said...

@Davidski, sorry to ask again, would you be able to run these as well:

@Rob: Speaking of which, and contrary to Olalde, your data is showing that the dominant EEF ancestry in Beakers is Iberian!

Hmmm... The way I read Olalde is that it is exhausting positions that the Bell Beaker groups are specifically Iberian MN/Chl (that is Iberian Beaker) plus Steppe; and I think there's less focus on which EEF streams contributed to GAC (though there is some qpGraph in there along those lines). This is their way of trying to exclude a specifically out of Iberia movement at the stage of Bell Beaker (and any earlier movement is less in scope). This is the same reason they don't actually test any models using Corded Ware or any other LNBA groups either.

It might be worth repeating their qpAdm tests with a more exhaustive panel of HG groups and European EEF groups. I believe they've *only* (and it's still a lot) restricted to Anatolian/MN/LN/CHL + KO1/Losch/La Brana + steppe; might get better fits from the whole panel including EN as an intermediate stage (and using more HG samples in the pright and pleft to strengthen differentiation among streams).

@Chad Rohlfsen: I think nMonte is highly inflating Iberian ancestry due to the HG excess. Just like Admixture will with a component that peaks in the Atlantic MN.

Strictly, nMonte is not doing anything other than evolving the best-fitting (lowest-distance) vector; if there is any false relatedness between high-HG EEF groups (and this makes vectors of Iberia_EN to GAC better for some reason, despite Hungary_CA having comparable HG ancestry), then it is something in the structure of the Global25 that is amiss (G25 is the input data. You understand the basics of what is being done here, right?).
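
To make the "lowest-distance vector" point concrete, here is a toy sketch of that kind of fit. It is not nMonte's actual Monte Carlo routine: it swaps in a deterministic greedy search that moves one percentage point at a time between sources and keeps any move that shortens the Euclidean distance to the target. The source names and three-dimensional coordinates are made up for illustration (real Global25 rows have 25 dimensions).

```python
import math


def mix(sources, weights):
    """Weighted average of the source coordinate vectors."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]


def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_mixture(target, sources, slots=100):
    """Greedy stand-in for a mixture fit: shift one slot (1%) at a time."""
    n = len(sources)
    counts = [slots // n] * n
    counts[0] += slots - sum(counts)  # make the counts add up to `slots`

    def dist(c):
        return distance(mix(sources, [k / slots for k in c]), target)

    best = dist(counts)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(n):
                if i == j or counts[i] == 0:
                    continue
                trial = counts[:]
                trial[i] -= 1
                trial[j] += 1
                d = dist(trial)
                if d < best - 1e-12:  # keep only strict improvements
                    counts, best, improved = trial, d, True
    return [k / slots for k in counts], best


# Hypothetical sources in a made-up 3-D space (not real Global25 data).
names = ["SourceA", "SourceB", "SourceC"]
sources = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = (0.6, 0.4, 0.0)

weights, dist = fit_mixture(target, sources)
print(f"distance%={100 * dist:.4f}")
for name, w in zip(names, weights):
    print(name, round(100 * w, 1))
```

The search knows nothing about history or geography. If two sources sit close together in the input space for the wrong reasons, the fit will still use whichever one shortens the distance, which is why any problem would lie in the input coordinates rather than in the fitting step.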

It could be that something about how GAC+Atlantic have combined in modern populations in Iberia / France make them false friends in the G25 (close because they have contributed to SW Europe, not close because they share a direct relationship).

Rob said...

@ Matt

Yes, I realise what hypothesis they were testing, I think Chad and I were indeed discussing the deeper streams of ancestry :)

BTW, I imagine GAC can be seen as a semi-kurganized Megalith culture (?)

@ Alberto

The 'ancestry' detected of the steppe-admixed Beakers from SW Europe with G25 is as expected - EHG, Ukr. HG, some CHG.
With more proximate sources, the best fitting option when limited to major 'blocks' of sources are the steppe & GAC combo, combining with local groups, in all the individuals you mentioned.

e.g. I5665 in northern Iberia. R1b-M269, 2200-1900 BC
Yamnaya 12%/ Ukraine 14%/ GAC 16%
Iberia_Chalc 40%
Baden 18%

You can substitute the top combo for BB Central Euro, but in reality the admixture could have come via Britain and/or France.

Matt said...

Yeah, really more commenting on deeper streams of ancestry partly because it's what was being discussed on other threads and for any third parties reading (of course you and he would both understand to what degree this would / would not be compatible with the paper's main findings).


Re: Global 25 again, dimension PC13 looks particularly important in creating a distinction between Iberia_EN / Iberia_MNChl and MN Europeans in general. (Dimensions PC4, PC10, PC13, PC21 and PC23 also look to matter, but PC4 is more general North-South and PC10 East-West, while PC13 is more specific to EEF).

PC10 and PC13 in particular are the dimensions that break out Basque / Sardinian / SW European when graphed against each other.

Plotting these for the purposes of visualisation:

PC10 creates a distinction between WHG and Iron Gates HG, and between Iberia_EN and Levant. PC13 seems not to be meaningful for HGs, but further contrasts populations with ancestry from the same source as the West European Neolithic, particularly with Natufians, and splits apart an Atlantic_MN_CHL->Steppe_EMBA cline from a Hungary_CA->Baltic_BA cline (though the place of Sweden_MN and GAC is slightly hard to discern there visually).

Davidski said...


Matt said...

Using some of those D-stats Davidski has provided (cheers!):

See here:

Stats are generally beneath the significance threshold in most cases.

However, looking at the trend, and at the highly significant Z contrasts between the Beaker_Britain and EBA vs England_Roman, it looks like some decided shifts between England_MBA to England_Roman.

MBA is highly shifted towards Scotland_N on the Scotland_N:Globular Amphora axis, and slightly more so than Beaker (and specifically to Scotland_N relative to Iberia_Chl, providing some evidence this is local absorption).

In contrast England_Roman seems to lose this affinity in exchange for a shift towards GAC on the Scotland_N:Globular Amphora, *but* also gains some Iberia_Chl shift relative to its Scotland_N shift.

Tentatively, this may reflect further immigration from Atlantic Europe and West Central Europe (e.g. Hallstatt Celts)?

(Note all populations in this panel are closer, albeit largely non-significantly, to Scotland_N and Iberia_Chl relative to GAC, except for present day NE and SE Europeans who are closer to GAC than Iberia_Chl, and in one instance for Eastern Ukraine, than to Scotland_N as well).

Running these stats again with many more populations (e.g. many more BA populations and present day pops) could firm up or reject the trends.

Chad Rohlfsen said...


The two need not be related. Megaliths go back to the Mesolithic in Europe. They may be no more related than the groups with megaliths in North Africa, the Levant, East Africa, Caucasus, or even East Asia. I'm referring to shared ancestry in the last few centuries and not on a deep level.

Where are all the megaliths east of Northern Germany or the rest of GAC land to Ukraine if megalith equals Iberian ancestry? Some cultural imitation on the border between the Atlantic and TRB seems more likely. There may be light mixing in the east, but nothing about GAC is "Iberian". I think testing on the alleles will help settle that.

Matt said...

Few more comparisons for the British time series using D stats provided by Davidski:

Show with higher Z what previous series showed; compared to Beaker/England Middle Bronze Age recent Brits shifted away from Scotland_N towards GAC, also shifted away from Iberia_Chl towards GAC relative to Beaker/E MBA, but less dramatically.


A few more pairs of D-stats:

Globular Amphora looks equally related to both Iberia_EN and Tiszapolgar_ECA (Hungary) when WHG is used as a "counterweight" to control for WHG inflating relatedness between GAC and which of Iberia_EN and Tiszapolgar_ECA is more WHG related.

Possibly comparing D(Tiszapolgar_ECA, IronGatesHG; X, Mbuti) to D(Iberia_EN, La_Brana; X, Mbuti) over a range of EEF for X would account for differences in the EEF streams (maybe this would take GAC slightly closer to Iberia_EN relative to Tiszapolgar_ECA?).


A couple of pairs from Chad's stats: These show that Lengyel is significantly less WHG like than Germany_MN, as is well known from virtually all of David's modeling (which show them to be like the Czech MN samples with very little extra WHG compared to the resurgence elsewhere).

But the other stats are non-significant, including the comparisons of similarity to GAC relative to Germany_MN for Lengyel and Iberia_MN (neither Lengyel nor Iberia_MN is significantly closer or less close to GAC compared to Germany_MN).

However, Lengyel may be close to GAC *relative* to its WHG, but this set of stats doesn't really show that. More pairs might help indicate whether this is viable or not.

Matt said...

@Davidski, apologies again, some more D stats to run if possible:

I'd just like to see if any of the trends hold up, re: Scotland_N vs GAC vs Iberia_Chl for BA and post BA, and Iberia_EN vs Tiszapolgar_ECA affinities within EEF attempting to control for HG ancestry.

Davidski said...


I've only got a limited dataset available at the moment for formal stats, with only a subset of the modern pops that were present in past datasets. So some of the pops you listed aren't in this dataset.

These are the pops & individuals that I can run at the moment...

Matt said...

OK, cheers, sifted out the pops you don't have in the dataset:

Thanks again

Davidski said...


Matt said...

Cheers Davidski. So plotting those stats, first pairs in this post:

Y: D(Mbuti, X; Scotland_N, Globular Amphora), X: D(Mbuti, X; Iberia_Chl, Globular Amphora) -

Same, restricting to ancients with >800,000 SNPs -

Present day people show a pretty clear cline where:

a) essentially all Western European populations are more related to Iberia_Chl than Globular Amphora, though non-significantly in the case of most (even Spanish/Basque only have Z= -2.5 toward Iberia_Chl!). The only highly significant stats are for Sardinians.

b) while all Eastern European populations are closer to GAC than Iberia_Chl (in the case of Russia_Central, this is almost as significant as Spanish/Basque's Iberia_Chl affinity!).

On geographic grounds, this all makes fairly perfect sense.

The only exceptions to this are

1) Russian North, who have quite weak relatedness to GAC relative to Iberia_Chl compared to other Slavic populations, and other Russians. I would guess this is explained by a unique demographic history somehow for Russia_North?

2) Swedish, who are much more strongly related to the GAC than populations from NW Europe... but this is perhaps explicable if we think of Swedes as North-Central, rather than NW European?

Scotland_N affinity relative to GAC is mostly correlated with Iberia_Chl, but is generally stronger - almost all present European populations are (non-significantly) closer to Scotland_N than GAC.

There's also a pattern where it looks like populations tend to be closer to GAC relative to Scotland_N if they have more WHG.

Looking at ancients now, and looking only at the >800,000 SNPs:


1) Closer (significantly) to Scotland_N than GAC & closer (significantly) to Iberia_Chl than GAC: Beaker_Iberia, Beaker_Southern_France (e.g. largely Atlantic_Neolithic SW Beaker populations).

2) Closer (significantly) to Scotland_N than GAC & closer (non-significantly) to Iberia_Chl than GAC: England_MBA, England_CA_EBA, Beaker_Britain, Scotland_CA_EBA, Scotland_LBA, Scotland_MBA, Beaker_The_Netherlands, Poland_BA, Czech_EBA, Beaker_CE, Hungary_BA, Beaker_Hungary. (Most Bronze Age populations I've tested).

3) Closer (Non-significantly) to Scotland_N than GAC & closer (Non-significantly) to Iberia_Chl than GAC: England_Roman, Ireland_EBA, England_IA, Netherlands_BA, Battle_Axe_Sweden, Baltic_BA

4) Closer (Non-significantly) to Scotland_N than GAC and closer (Non-significantly) to GAC than Iberia_Chl: CWC_Germany, England_Anglo-Saxon

5) Closer (Non-significantly) to GAC than Scotland_N and closer (Non-significantly) to GAC than Iberia_Chl: Srubnaya, Sintashta
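
The five groups above follow mechanically from the sign and size of the two Z-scores for each population. A minimal sketch, assuming the sign convention implied in this thread (a negative D(Mbuti, X; A, B) means X leans toward A), the conventional |Z| >= 3 cut-off, and made-up Z values for three hypothetical populations:

```python
def lean(z, first, second, threshold=3.0):
    """Describe which of two references X leans toward, given the Z-score
    of D(Mbuti, X; first, second). Negative Z is read as 'closer to first'."""
    if z < 0:
        near, far = first, second
    else:
        near, far = second, first
    strength = "significantly" if abs(z) >= threshold else "non-significantly"
    return f"closer ({strength}) to {near} than {far}"


def describe(pop, z_scotland, z_iberia):
    """Combine the two comparisons into one line per population."""
    return (
        f"{pop}: "
        + lean(z_scotland, "Scotland_N", "GAC")
        + " & "
        + lean(z_iberia, "Iberia_Chl", "GAC")
    )


# Hypothetical Z-scores, for illustration only (not taken from the stats above).
examples = {
    "PopA": (-4.8, -3.6),
    "PopB": (-4.2, -1.1),
    "PopC": (1.4, 0.9),
}

for pop, (z_scotland, z_iberia) in examples.items():
    print(describe(pop, z_scotland, z_iberia))
```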


I'm slightly at a loss to explain the ancients results. It seems plain that there's some degree of a cline here between Western/low Steppe populations tending to favour both Iberia_Chl and Scotland_N strongly and Central/Eastern/high Steppe populations tending to favour GAC. (I'm presuming that for most / all of these ancient populations Scotland_N is simply a proxy for ancestry intermediate the extremes of Iberia_Chl and GAC, inc. France_MN, Germany_MN, etc).

But what is messing with me is how all the Beaker populations can all show a stronger affinity for Scotland_N than Globular_Amphora, to the highly significant level of Z = -4 or -5 while the paper indicates that qpAdm on outgroups makes GAC a better fit than any of the Atlantic MN populations including Scotland_N.

How can this be so, if the qpAdm procedure is actually adequate to detect the differences between those groups? Have Olalde et al gone wrong by focusing their qpAdm exercise on HG outgroups?

I'd really appreciate some ideas on how this could be from anyone else who might be reading (Davidski, and others?).

Rob said...

@ Matt
Maybe the D-stats are reflecting what I suggested above: the overall majority of West European moderns and Beaker folk harbour the majority of their autosomal ancestry from western Neolithics. Moreover, maybe qpAdm can’t discern multiple streams of similar ancestry?

Matt said...

Now for the second pair of stats.

Plotting a bivariate linear fit between D(Tiszapolgar_ECA, WHG; X, Mbuti) and D(Iberia_EN, WHG; X, Mbuti):

The intercept is set to zero, and the end of the line cuts through Balkans_N and Barcin_N which have roughly a difference of ≈0 between both stats. So the residual is (+) how much more a population is related to Tiszapolgar_ECA and (-) how much more a population is related to Iberia_EN, once WHG is accounted for (+/- reversed in second pair in the image).

(I suppose I could run D(Iberia_EN, Tiszapolgar_ECA; X, Mbuti) but this graph makes the relationship with WHG affinity more explicit).
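
For anyone wanting to reproduce the residuals described above: with the intercept fixed at zero, the least-squares slope is sum(x*y) / sum(x*x), and each population's residual is its observed y minus slope times x. A minimal sketch, assuming the Iberia_EN stat is on the x axis and the Tiszapolgar_ECA stat on the y axis (so a positive residual means relatively more Tiszapolgar_ECA-related); the numbers and population labels are made up:

```python
def slope_through_origin(xs, ys):
    """Least-squares slope for a line forced through (0, 0)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def residuals(xs, ys):
    """Observed y minus the value predicted by the zero-intercept line."""
    b = slope_through_origin(xs, ys)
    return [y - b * x for x, y in zip(xs, ys)]


# Hypothetical D values, for illustration only.
pops = ["PopA", "PopB", "PopC", "PopD"]
d_iberia = [0.010, 0.020, 0.030, 0.040]   # x: D(Iberia_EN, WHG; X, Mbuti)
d_tisza = [0.011, 0.019, 0.033, 0.038]    # y: D(Tiszapolgar_ECA, WHG; X, Mbuti)

print("slope:", round(slope_through_origin(d_iberia, d_tisza), 4))
for pop, r in zip(pops, residuals(d_iberia, d_tisza)):
    side = "Tiszapolgar_ECA" if r > 0 else "Iberia_EN"
    print(pop, f"{r:+.4f}", "leans", side)
```

Populations with a residual near zero, as reported above for GAC and Sweden_EN, sit on the line and show no particular lean either way.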

Fairly clearly GAC and Sweden_EN are almost at 0 on the residual, as are Czech_MN and Starcevo.

So this supports the notion that GAC, at least, is not particularly related to the Hungarian Early Copper Age relative to Iberia_EN. On the other hand, neither does Czech_MN, so still arguable that a Central European precursor to GAC existed, but, if so, this precursor didn't share any particular relationship to Hungarian ECA.

I would guess that perhaps Central-Southeastern Europe had more diffuse streams of Early Neolithic ancestry (which makes sense since it's closer on colonisation routes), and that this makes a simple crumbling into Cardial vs Danubian groups a challenge?

So all in all, taking this with my last post, it supports that:

a) In its "deep" Neolithic roots, GAC and Funnelbeaker are probably either uniquely Central European or else linked equally to Hungarian ECA ("Danubian") and Iberia_EN ("Cardial")

but b) by direct comparison GAC doesn't seem to share as much with Beaker cultures as Scotland_N (or less significantly Iberia_Chl), and this seems like a challenge for Beakers as GAC+Steppe_EMBA models...

Matt said...

@Rob, thanks for your comment, yeah, that's certainly a possibility I'm considering, and seems like the most straightforward interpretation of these stats. Mainly trying to work out if there is some simple explanation that I haven't thought of that still means that GAC is the best MN type ancestor...

Would say I'm not so sure about whether qpAdm can or can't do it though, exactly, so much as whether the outgroups they have picked are up to the task; I believe their strategy is that they used a pright of these 9 Outgroups:

Mota, Ust_Ishim, MA1, Villabruna, Mbuti, Papuan, Onge, Han, Karitiana, ElMiron and GoyetQ116-1.

They included El_Miron and GoyetQ116-1 to in theory generate statistical variation (largely) in stats D(Villabruna, El_Miron; X, Y) to split apart the Neolithic waves into Europe. But this may have been an inadequate method, and it may be that they would really need to be using the latest qpAdm and simulation methods from Lazaridis's 2017 paper...! (which lean much more on a host of ancient dna in the pright). I also don't know why they haven't used direct D-stats of the form D(Neolithic1,Neolithic2,X,Outgroup) in their paper at all.

Arza said...

@ Matt

Mbuti Beaker_The_Netherlands Scotland_N Globular_Amphora_Poland 0.0041 2.235 734918
Mbuti Beaker_The_Netherlands Scotland_N Globular_Amphora_Ukraine -0.0089 -5.165 945311
Mbuti Beaker_Britain Scotland_N Globular_Amphora_Poland 0.0023 1.439 748336
Mbuti Beaker_Britain Scotland_N Globular_Amphora_Ukraine -0.0088 -5.697 1012231
Mbuti Scotland_N Globular_Amphora_Ukraine Globular_Amphora_Poland 0.0083 4.090 738342
Mbuti Europe_MNChL_Iberia_ChL Globular_Amphora_Ukraine Globular_Amphora_Poland 0.0084 3.339 710325

Matt said...

@Azra, GAC Poland not too much like GAC Ukraine in these stats! What do we get with


Hence Poland_LN not Globular Amphora Culture in all the models I guess. Any unity for GAC or not?

Matt said...

Sorry, Arza, if poss:


n/p if not in your dataset.

Arza said...

Mbuti Globular_Amphora_Poland Scotland_N Globular_Amphora_Ukraine -0.0019 -0.942 738342
Mbuti Globular_Amphora_Poland France_MLN Globular_Amphora_Ukraine -0.0017 -0.635 717532
Mbuti Globular_Amphora_Poland Europe_MNChL_Germany_MN Globular_Amphora_Ukraine 0.0095 2.950 689221
Mbuti Globular_Amphora_Poland Europe_MNChL_Iberia_Chl Globular_Amphora_Ukraine 0.0005 0.219 710325

It seems that I don't have Sweden_MN (merged Olalde, Mathieson and Laz 2016 datasets)

I1277 M Europe_MNChL_Iberia_Chl
I1272 F Europe_MNChL_Iberia_Chl
I1281 F Europe_MNChL_Iberia_Chl
I1300 F Europe_MNChL_Iberia_Chl
I1303 M Europe_MNChL_Iberia_Chl
I0172 M Europe_MNChL_Germany_MN
I0560 F Europe_MNChL_Germany_MN

Arza said...

Europe_EN_Iberia WHG Globular_Amphora_Poland Mbuti 0.0012 0.392 729797
Europe_EN_Iberia WHG Globular_Amphora_Ukraine Mbuti 0.0039 1.326 952855

Europe_EN_Iberia - I0409, I0410, I0412, I0413

Tisza is also not present in the dataset.

Matt said...

Cheers. GAP is non-sig closer to Scotland_N/France_MLN than to GAU, and non-sig closer to GAU than to Iberia_Chl. Only the Germany_MN stat is almost significant.
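A rough way to tabulate that reading, as a minimal sketch. It assumes each row is four population labels followed by D, Z and the SNP count, and that |Z| >= 3 is the significance cutoff (a common convention, not something stated in this thread). `parse_row` and `describe` are hypothetical helper names. The sign reading follows how the rows are interpreted here: a negative D means the second population is closer to the third, a positive D means it is closer to the fourth.

```python
# Rows copied from Arza's table: four populations, D, Z-score, SNP count.
ROWS = """\
Mbuti Globular_Amphora_Poland Scotland_N Globular_Amphora_Ukraine -0.0019 -0.942 738342
Mbuti Globular_Amphora_Poland France_MLN Globular_Amphora_Ukraine -0.0017 -0.635 717532
Mbuti Globular_Amphora_Poland Europe_MNChL_Germany_MN Globular_Amphora_Ukraine 0.0095 2.950 689221
Mbuti Globular_Amphora_Poland Europe_MNChL_Iberia_Chl Globular_Amphora_Ukraine 0.0005 0.219 710325
"""


def parse_row(line):
    """Split one row into (populations, D, Z, SNP count)."""
    *pops, d, z, snps = line.split()
    return tuple(pops), float(d), float(z), int(snps)


def describe(row, cutoff=3.0):
    """Report which of the last two populations the second one leans towards.

    The cutoff of 3.0 is an assumed convention, not taken from the thread.
    """
    (_, x, third, fourth), d, z, _ = row
    closer = fourth if d > 0 else third
    label = "significant" if abs(z) >= cutoff else "non-sig"
    return f"{x} closer to {closer} ({label}, Z={z})"


parsed = [parse_row(line) for line in ROWS.splitlines()]
summary = [describe(row) for row in parsed]
for text in summary:
    print(text)
```

With this cutoff all four rows come out non-significant, with the Germany_MN row the closest to the threshold.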

I'd guess even Mbuti England_MBA Scotland_N Globular_Amphora_Poland (the lowest stat for Mbuti England_MBA Scotland_N Globular_Amphora) should still be positive (albeit<1?) based on those so far.

I think you've sorted that out, and I should've been thinking about "Poland_LN" to begin with rather than GAC. Though I would think that if the change in D(Mbuti, X; Scotland_N, Globular_Amphora) from Beakers to later Iron Age British cultures is paralleled by D(Mbuti, X; Globular_Amphora_Poland, Globular_Amphora_Ukraine) and D(Mbuti, X; Globular_Amphora_Poland, Iberia_Chl), with a shift from GAC_Poland affinity to more general MN affinity, that would certainly be interesting in itself?

Arza said...

@ Matt

Mbuti Beaker_The_Netherlands Scotland_N GAC_______ -0.0053 -3.69 952904
Mbuti Beaker_The_Netherlands Scotland_N GAC_Poland 0.0041 2.235 734918
Mbuti Beaker_Britain Scotland_N GAC_______ -0.0061 -4.693 1022330
Mbuti Beaker_Britain Scotland_N GAC_Poland 0.0023 1.439 748336
Mbuti England_CA_EBA Scotland_N GAC_______ -0.0065 -4.926 1008688
Mbuti England_CA_EBA Scotland_N GAC_Poland 0.0024 1.444 746771
Mbuti England_MBA Scotland_N GAC_______ -0.0067 -5.249 1013052
Mbuti England_MBA Scotland_N GAC_Poland 0.0033 2.036 747449
Mbuti England_IA Scotland_N GAC_______ -0.0021 -1.2 895066
Mbuti England_IA Scotland_N GAC_Poland -0.0027 -1.27 658730
Mbuti England_Roman Scotland_N GAC_______ -0.0016 -1.114 1022987
Mbuti England_Roman Scotland_N GAC_Poland 0.001 0.574 746795
Mbuti England_Anglo-Saxon Scotland_N GAC_______ 0.0001 0.049 977943
Mbuti England_Anglo-Saxon Scotland_N GAC_Poland 0.002 0.934 721108

Mbuti Beaker_The_Netherlands GAC_Poland GAC_Ukraine -0.0105 -4.604 727386
Mbuti Beaker_The_Netherlands GAC_Poland Iberia_ChL -0.0051 -2.276 709006
Mbuti Beaker_Britain GAC_Poland GAC_Ukraine -0.0089 -4.482 738410
Mbuti Beaker_Britain GAC_Poland Iberia_ChL -0.0042 -2.227 716164
Mbuti England_CA_EBA GAC_Poland GAC_Ukraine -0.009 -4.395 737430
Mbuti England_CA_EBA GAC_Poland Iberia_ChL -0.0039 -1.953 715693
Mbuti England_MBA GAC_Poland GAC_Ukraine -0.0108 -5.278 737845
Mbuti England_MBA GAC_Poland Iberia_ChL -0.0039 -2.047 715918
Mbuti England_IA GAC_Poland GAC_Ukraine 0.0003 0.114 650341
Mbuti England_IA GAC_Poland Iberia_ChL 0.0045 1.69 631302
Mbuti England_Roman GAC_Poland GAC_Ukraine -0.0026 -1.176 736811
Mbuti England_Roman GAC_Poland Iberia_ChL -0.0022 -1.018 714562
Mbuti England_Anglo-Saxon GAC_Poland GAC_Ukraine -0.0032 -1.253 711874
Mbuti England_Anglo-Saxon GAC_Poland Iberia_ChL -0.0063 -2.471 691187

IA, Roman, Anglo-Saxon same as in G25

Arza said...

@ Matt

Beaker_The_Netherlands - proxy for pre-Britain - at least some GAC ancestry
Beaker_Britain - drop in GAC affinity due to mixing with local farmers
England_CA_EBA - stagnation
England_MBA - gaining traction
England_IA - migration of people more related to Iberia_ChL - Italo-Celts?
Roman - mixed influx
Anglo-Saxon - Central Europe again


Matt said...

Arza, thanks. I think that's a good model to put on those stats; I'm surprised that the Roman->IA change looks relatively sharp in these stats. I'd feel a little more comfortable if they were all over 800,000 SNPs, but it looks pretty good.

Some plots of those:

Since GAC_Ukraine seems to be yielding such strong stats vs other EEF, and given the heavy clinality of GAC_Ukraine vs anyone more western, I was wondering where Corded_Ware_Germany and Corded_Ware_Baltic stand in terms of these stats. Maybe it can help resolve the questions we've had about where CW EEF ancestry comes from: whether from Ukraine and the Western steppe (and cultures like GAC_Ukraine), or picked up locally in regions post migration.

And also this hits on whether the Beaker groups could be explained by Corded_Ware_Germany+Yamnaya+other EEF, as well as / instead of using the specific Poland_LN/GAC_Poland group. (E.g. whether a model of Corded+Yamnaya+other EEF works, with varying EEF depending on Beaker subgroup, or whether the stats very specifically track to a fresh off the steppe Steppe EMBA+Poland_LN combination).

So could you possibly also please try these D-stats?

Mbuti Corded_Ware_Germany Scotland_N GAC_______
Mbuti Corded_Ware_Germany Scotland_N GAC_Poland
Mbuti Corded_Ware_Germany Scotland_N GAC_Ukraine
Mbuti Corded_Ware_Germany Germany_MN GAC_Poland
Mbuti Corded_Ware_Germany Germany_MN GAC_Ukraine
Mbuti Corded_Ware_Germany Iberia_ChL GAC_Poland
Mbuti Corded_Ware_Germany Iberia_ChL GAC_Ukraine
Mbuti Corded_Ware_Germany France_MLN GAC_Poland
Mbuti Corded_Ware_Germany France_MLN GAC_Ukraine
Mbuti Corded_Ware_Germany GAC_Poland GAC_Ukraine
Mbuti Corded_Ware_Baltic Scotland_N GAC_______
Mbuti Corded_Ware_Baltic Scotland_N GAC_Poland
Mbuti Corded_Ware_Baltic Scotland_N GAC_Ukraine
Mbuti Corded_Ware_Baltic Germany_MN GAC_Poland
Mbuti Corded_Ware_Baltic Germany_MN GAC_Ukraine
Mbuti Corded_Ware_Baltic Iberia_ChL GAC_Poland
Mbuti Corded_Ware_Baltic Iberia_ChL GAC_Ukraine
Mbuti Corded_Ware_Baltic France_MLN GAC_Poland
Mbuti Corded_Ware_Baltic France_MLN GAC_Ukraine
Mbuti Corded_Ware_Baltic GAC_Poland GAC_Ukraine
Mbuti Sintashta Scotland_N GAC_______
Mbuti Sintashta Scotland_N GAC_Poland
Mbuti Sintashta Scotland_N GAC_Ukraine
Mbuti Sintashta Germany_MN GAC_Poland
Mbuti Sintashta Germany_MN GAC_Ukraine
Mbuti Sintashta Iberia_ChL GAC_Poland
Mbuti Sintashta Iberia_ChL GAC_Ukraine
Mbuti Sintashta France_MLN GAC_Poland
Mbuti Sintashta France_MLN GAC_Ukraine
Mbuti Sintashta GAC_Poland GAC_Ukraine

(Sintashta stats are for whether the GAC_Ukraine vs others contrast can be any help in working out how the pickups of EEF ancestry happened in Steppe_MLBA).

High steppe ancestry might cause these stats to tend to 0, which would be a problem, but I think it's worth a try.

Matt said...

Also re: Corded Ware and D-stats, we could try:

Yamnaya_Samara Corded_Ware_Germany GAC_Poland GAC_Ukraine
Yamnaya_Samara Corded_Ware_Germany Germany_MN GAC_Poland
Yamnaya_Samara Corded_Ware_Germany Germany_MN GAC_Ukraine
Yamnaya_Samara Corded_Ware_Germany Scotland_N GAC_Poland
Yamnaya_Samara Corded_Ware_Germany Scotland_N GAC_Ukraine
Yamnaya_Samara Corded_Ware_Germany Iberia_Chl GAC_Poland
Yamnaya_Samara Corded_Ware_Germany Iberia_Chl GAC_Ukraine
Yamnaya_Samara Corded_Ware_Germany France_MLN GAC_Poland
Yamnaya_Samara Corded_Ware_Germany France_MLN GAC_Ukraine

as a strategy to separate out the non-steppe like part of Corded_Ware_Germany's ancestry (in theory mostly EEF). Samara_Eneolithic might work as well instead of Yamnaya_Samara.

Alberto said...


I didn't have time to follow this in any good level of detail, but one thing I noticed when running models to explore this same thing with Global 25 is that a good part of the EEF admixture in BA Europe already happened in Ukraine. In other words, the Yamnaya in Europe does not seem to be pure Yamnaya, but rather a mix of Yamnaya and Ukraine_Eneolithic, which mostly explains the excess WHG seen in these BA Europeans in a more parsimonious way (since Ukraine_Eneolithic has a good amount of Ukraine_N or SHG-like admixture).

But what matters more about it is that Ukraine_Eneolithic already has similar levels of EEF admixture as CWC. Easily 30-40%. So in a way, that means that there are probably 2 layers of EEF, the first one was in Ukraine since the 5th Mill. (mixed with Ukraine_N, though we hardly have any sample from that 5000-4200 BC period), who mixed with a more eastern (EHG+CHG) population that arrived around 4200 BC probably.

And then a second layer when these guys migrated out of the steppe deeper into Europe, where depending on the route they would mix with different populations. So maybe it's easier to do this in 2 steps, first find the best EEF for Ukraine_Eneolithic and then use Ukraine_Eneolithic as a starting point to see what's on top of that (this way probably the stats for different BA groups become more significant).

Matt said...

@Alberto, thanks for the comment. To be honest though, re: G25, I'm not sure whether Bronze Age Europeans even really do have excess WHG admixture there beyond what can be explained by the various MNChl populations. I've not looked at nMonte models closely enough, however Beakers seem OK to fit MNChl+Yamnaya lines when I look at PCA? I'm not sure Ukraine_Eneolithic is important at all. If you have models that work, may as well post them up though.

Plus issue with Ukraine_Eneolithic is 3/4 are obvious outliers.
Just to express my point of view, I'll reprocess the G25 West Eurasians only, through PCA (which should represent most of the structure in G25 in ways more optimized for West Eurasians alone):

You can see that about 3 out of 4 of the Ukraine_Eneolithic samples are just outliers and have no real role in any of the Europe+Steppe BA clines involving LNBA European or MLBA steppe populations... They're maybe sort of like Baltic_BA in some dimensions, but it seems more like a chance resemblance to me. One of the samples overlaps LNBA Europe, but the above D-stats make it seem unlikely that Europeans are actually descendants of that one (rather than of a similar admixture).

The position of 3/4 of those Ukraine_Eneolithic samples makes it hard for me to believe they are as EEF like (or just generally as "southern") as any of CWC, certainly very clearly for 2/4.

If you had to fit them as anything, then 25% EEF and 75% HG seems like it would work OK... but although the EEF level would be similar to CW Germany, this is obviously way "north" of them, as CW Germany fits on a cline from Yamnaya/CW Early Baltic->MNChl, and has major CHG offset ancestry.

The Globular_Amphora_Ukraine samples seem really interesting to me, in contrast, because they generate these strong Z stats distinguishing them from the MNChl ancestry "in" Beakers, without looking much different from GAC_Poland / Iberia_Chl on the G25.

Does that make sense? Sorry if I've missed out on anything you were saying.

Alberto said...


I didn't check the Globular Amphora Ukraine samples alone. I must have missed the latest updated sheet with different labels.

But I was testing mostly Bell Beakers as mix of all MN-ChL European populations and Yamnaya/Ukraine_Eneolithic. Then I went on with more populations using the same setup. I guess you missed the post:

And I do see quite a bit of Ukraine_Eneolithic (I put those 4 samples in as individuals because they are so diverse). And that way (though I included 3 WHGs to avoid the choice of MN-ChL populations being forced by the amount of HG) there was very little excess of HG (that otherwise you would see when running models with G25).

So I don't know if 3/4 are really outliers or they are a different population that actually did contribute to BA Europeans. The models are purely for testing, to find patterns more than to show anything realistic.

Matt said...

Hmmm... You've got a fit that involves Ukraine_Eneolithic:I4110 at substantial percentages (21%); out of the three outlying samples he's the least off and closest to Europeans (still pretty off the cline, but not too far from the Corded_Ware_Germany outlier). He's HG enriched, but by less than Baltic_BA are; a sample with 21% of this is going to be very modestly off the cline, nothing too decisive. I6561 turns up in some other fits (11% in one fit for Iberia_BA), but I6561 basically just resembles an MLBA steppe sample, mostly by chance I'd say.

The other two outliers don't seem to get more than 2%-7%, for the normal European cline, but have some presence in Hungary_BA, which I'd guess is a proxy for Baltic_BA influence.

Here's a neighbour join of them and a randomly chosen batch of other ancients (can't get them all on as too many samples):

A resampled PCA using the same samples:

Ukraine_Eneos are labeled in black on each.

The clustering positions are highly diverse and not *really* too much like each other, so I'd be very wary about talking of a Ukraine Eneolithic population that contributed to others as if it were a single population. They're nothing like each other's neighbours. They're very diverse and sporadic samples, as you say. Def not a good idea for us to talk about their median ancestry proportion, or any whole-population quality they have when grouped together, and extrapolate back to fits involving a single sample (perils of nMonte3 type methods are extra serious for these samples!)

Arza said...

Mbuti CWC_GER Scotland_N GAC_______ -0,0005 -0,295 1008875
Mbuti CWC_GER Scotland_N GAC_Poland 0,0034 1,746 743563
Mbuti CWC_GER Scotland_N GAC_Ukraine -0,0018 -0,972 999282
Mbuti CWC_GER Germany_MN GAC_Poland 0,009 2,913 695134
Mbuti CWC_GER Germany_MN GAC_Ukraine 0,0037 1,24 941660
Mbuti CWC_GER Iberia_ChL GAC_Poland 0,0051 2,189 712524
Mbuti CWC_GER Iberia_ChL GAC_Ukraine 0 -0,019 883121
Mbuti CWC_GER France_MLN GAC_Poland 0,0016 0,598 720236
Mbuti CWC_GER France_MLN GAC_Ukraine -0,0026 -1,03 911144
Mbuti CWC_GER GAC_Poland GAC_Ukraine -0,004 -1,643 734133
Mbuti CWC_I4629 Scotland_N GAC_______ -0,0071 -2,969 248859
Mbuti CWC_I4629 Scotland_N GAC_Poland -0,0022 -0,661 235995
Mbuti CWC_I4629 Scotland_N GAC_Ukraine -0,0104 -3,318 248574
Mbuti CWC_I4629 Germany_MN GAC_Poland 0,0067 1,25 216434
Mbuti CWC_I4629 Germany_MN GAC_Ukraine -0,0017 -0,32 228407
Mbuti CWC_I4629 Iberia_ChL GAC_Poland -0,0011 -0,278 234940
Mbuti CWC_I4629 Iberia_ChL GAC_Ukraine -0,0085 -2,228 246291
Mbuti CWC_I4629 France_MLN GAC_Poland 0,0022 0,465 235252
Mbuti CWC_I4629 France_MLN GAC_Ukraine -0,0053 -1,202 246914
Mbuti CWC_I4629 GAC_Poland GAC_Ukraine -0,0073 -1,685 235710
Mbuti Sintashta Scotland_N GAC_______ 0,0015 0,904 921345
Mbuti Sintashta Scotland_N GAC_Poland 0,0033 1,533 700776
Mbuti Sintashta Scotland_N GAC_Ukraine 0,001 0,507 913433
Mbuti Sintashta Germany_MN GAC_Poland 0,0089 2,701 653224
Mbuti Sintashta Germany_MN GAC_Ukraine 0,0051 1,591 855609
Mbuti Sintashta Iberia_ChL GAC_Poland 0,0034 1,346 674814
Mbuti Sintashta Iberia_ChL GAC_Ukraine 0,0018 0,731 820932
Mbuti Sintashta France_MLN GAC_Poland 0,0018 0,633 681407
Mbuti Sintashta France_MLN GAC_Ukraine 0,0013 0,472 843700
Mbuti Sintashta GAC_Poland GAC_Ukraine -0,0021 -0,761 692979
Yamnaya_Samara CWC_GER GAC_Poland GAC_Ukraine -0,0005 -0,213 732410
Yamnaya_Samara CWC_GER Germany_MN GAC_Poland 0,0033 1,197 693516
Yamnaya_Samara CWC_GER Germany_MN GAC_Ukraine 0,0028 0,988 935819
Yamnaya_Samara CWC_GER Scotland_N GAC_Poland 0,0016 0,995 741560
Yamnaya_Samara CWC_GER Scotland_N GAC_Ukraine 0,0015 0,931 992377
Yamnaya_Samara CWC_GER Iberia_ChL GAC_Poland 0,0025 1,274 711135
Yamnaya_Samara CWC_GER Iberia_ChL GAC_Ukraine 0,0008 0,402 880125
Yamnaya_Samara CWC_GER France_MLN GAC_Poland 0,003 1,315 718752
Yamnaya_Samara CWC_GER France_MLN GAC_Ukraine 0,0015 0,651 907503

Yamnaya_Samara, CWC_GER and Sintashta - same samples as in G25
From Baltic I have only one - I4629.
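Since this table uses decimal commas and one group is a single lower-coverage sample, here is a small sketch of how the rows could be read and split by SNP count. The 500,000-SNP cutoff is an arbitrary assumption for illustration only, and `parse_row` is a hypothetical helper; only a handful of rows are copied in.

```python
# A few rows copied from the table: four populations, D, Z-score, SNP count.
ROWS = """\
Mbuti CWC_GER Scotland_N GAC_Poland 0,0034 1,746 743563
Mbuti CWC_GER Scotland_N GAC_Ukraine -0,0018 -0,972 999282
Mbuti CWC_I4629 Scotland_N GAC_Poland -0,0022 -0,661 235995
Mbuti CWC_I4629 Scotland_N GAC_Ukraine -0,0104 -3,318 248574
Mbuti Sintashta Scotland_N GAC_Poland 0,0033 1,533 700776
"""

MIN_SNPS = 500_000  # assumed cutoff, chosen only to separate the single-sample rows


def parse_row(line):
    """Parse one row, converting decimal commas to decimal points."""
    *pops, d, z, snps = line.split()
    return {
        "pops": pops,
        "d": float(d.replace(",", ".")),
        "z": float(z.replace(",", ".")),
        "snps": int(snps),
    }


rows = [parse_row(line) for line in ROWS.splitlines()]
high = [r for r in rows if r["snps"] >= MIN_SNPS]
low = [r for r in rows if r["snps"] < MIN_SNPS]

for r in low:
    print("low SNP count:", " ".join(r["pops"]), r["z"], r["snps"])
```

On these rows only the CWC_I4629 stats fall under the cutoff, so their Z-scores are worth reading with more caution than the grouped populations.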

Matt said...

@Alberto, but yeah, I think I may have drifted a little from the topic. For the two Ukraine_Eneolithic samples that are almost on cline for Europe LNBA->steppe MLBA (I4110 and I6561), it could be worth running the same stats I suggested above for CWC_Germany and Sintashta, since these two have fairly overlapping variation with CWC. Esp. I6561.

@Arza, cheers.

Arza said...

Re: Beakers

CWC_Baltic_early added to David's model obliterates Yamnaya. Outlier from Hungary is half-half. Surprisingly CWC_Germany takes much more Yamnaya than Beakers.

That's the reason why they are so vague in the paper about the source of steppe ancestry in Beakers, or even suggest CWC ("400 years earlier").

A good exercise is to model Beakers with a broad set of samples but without Yamnaya or CWC and then to recreate the "steppe" point. It lands shifted away from Yamnaya in the CWC_Baltic_early direction.
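One way to read "recreate the steppe point", sketched under loud assumptions: take the weights from such a fit, drop the farmer-like sources, renormalise what is left, and compute the weighted average position. The two-dimensional coordinates, the source names and the `recreate_point` helper below are all made up for illustration; real Global25 coordinates have 25 dimensions.

```python
def recreate_point(weights, coords, exclude):
    """Weighted centroid of the sources that are not in `exclude`.

    weights: source name -> fitted proportion
    coords:  source name -> list of coordinates
    exclude: names of sources to leave out (e.g. the farmer-like ones)
    """
    kept = {name: w for name, w in weights.items() if name not in exclude and w > 0}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no weight left after excluding sources")
    dims = len(next(iter(coords.values())))
    return [
        sum(w * coords[name][i] for name, w in kept.items()) / total
        for i in range(dims)
    ]


# Toy data only: three hypothetical sources in a 2-D space.
coords = {"Farmer_A": [0.0, 0.0], "Source_B": [1.0, 0.0], "Source_C": [0.0, 1.0]}
weights = {"Farmer_A": 0.5, "Source_B": 0.3, "Source_C": 0.2}

point = recreate_point(weights, coords, exclude={"Farmer_A"})
print(point)
```

The resulting point can then be compared with the positions of Yamnaya and CWC_Baltic_early to see which it sits closer to.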

@ Matt

Hungary_BA, which I'd guess is a proxy for Baltic_BA influence

Hungary_BA = Baltic_BA + WHG + N/MN or simply... Welzin.

Now when it comes to Baltic_BA... Spiginas2 is not what I thought. He's definitely not the earliest example of the freshly formed Baltic_BA cluster. @ PC14 it's clearly visible that he's distinct... and shifted towards another sample.

Baltic_BA:Kivutkalns209 52%
Ukraine_Eneolithic:I5884 48%

Distance 2.9072%

Distances between samples are around 6% CB, 6% CU and 11% BU.
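For a two-source fit like this one, the best mix has a closed form, so it can be checked by hand. The sketch below is not nMonte's own procedure; it simply projects the target onto the line between the two sources and clamps the weight to 0..1. The coordinates are invented toy values and `fit_two_sources` is a hypothetical helper; the distance is shown as a percentage in the same way as the nMonte output (distance x 100).

```python
import math


def fit_two_sources(target, a, b):
    """Return (weight on a, Euclidean distance) for the best mix w*a + (1-w)*b."""
    diff = [x - y for x, y in zip(a, b)]
    denom = sum(v * v for v in diff)
    if denom == 0:
        raise ValueError("the two sources are identical")
    # Project (target - b) onto (a - b), then clamp so the weights stay in 0..1.
    w = sum((t - y) * v for t, y, v in zip(target, b, diff)) / denom
    w = min(1.0, max(0.0, w))
    fitted = [w * x + (1 - w) * y for x, y in zip(a, b)]
    dist = math.sqrt(sum((t - f) ** 2 for t, f in zip(target, fitted)))
    return w, dist


# Toy coordinates only (three dimensions instead of 25).
source_a = [0.10, 0.02, -0.01]
source_b = [0.12, -0.03, 0.04]
target = [0.52 * x + 0.48 * y for x, y in zip(source_a, source_b)]

w, dist = fit_two_sources(target, source_a, source_b)
print(f"source_a {w * 100:.1f}%  source_b {(1 - w) * 100:.1f}%  distance% {dist * 100:.4f}")
```

A real target never sits exactly on the line between two samples, so the distance is the part to watch: the larger it is, the less the two-way split means.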

And of course Baltic_BA cluster is not what everyone thinks.

Samara_Eneolithic:I0122 37.4%
Samara_Eneolithic:I0433 18.95%
CHG:KK1 14.3%
Armenia_EBA:I1633 11.15%
Srubnaya_outlier:I0354 9.4%
Baltic_BA:Turlojiske3 8.8%
Distance 3.6244%

Samara_Eneolithic:I0122 39.45%
Baltic_BA:Turlojiske3 27.15%
Srubnaya_outlier:I0354 14.75%
Armenia_EBA:I1633 10.15%
CHG:KK1 4.95%
Globular_Amphora:I2441 3.5%
Baltic_BA:Kivutkalns222 0.05%
Distance 4.1064%

Samara_Eneolithic:I0122 25.4%
Baltic_BA:Turlojiske3 16.5%
Globular_Amphora:I2441 13.3%
Armenia_EBA:I1633 11.8%
Ukraine_N_outlier:I3719 11.7%
Samara_Eneolithic:I0433 10.9%
Srubnaya_outlier:I0354 9.25%
Armenia_EBA:I1658 1.15%
Distance 3.4863%

All Armenian EBA and ChL, EHG, Iran ChL, GAC, Barcin, Ukraine N and M samples present, but not taken. "Steppe spot" seems to be a cross-section between Samara-Caucasus and Srubnaya_outlier-some_exotic_WHG clines.

Such solution is also supported by Srubnaya_outlier constantly popping out in Indian populations.

Set up as above + Iran_N, Chamar and Paniya:

Chamar:evo_40 45.75%
Chamar:evo_42 13.85%
Srubnaya_outlier:I0354 8.75%
Iran_N:AH4 6.95%
Chamar:A261 6.1%
Iran_N:AH1 4.25%
Chamar:A259 3.7%
Baltic_BA:Kivutkalns209 2.65%
Iran_ChL:I1661 2.45%
Globular_Amphora:I2441 1.9%
Armenia_EBA:I1658 1.6%
Armenia_EBA:I1633 1.5%
Baltic_BA:Kivutkalns222 0.55%
Distance 1.043%

I know that it's crazy, but if something walks like a duck and quacks like a duck...

Matt said...

@Arza, had a chance to look at those stats, comparing only Sintashta and CWC_Germany (high SNPs) to the Beakers:

The affinity to GAC_Ukraine farmers looks relatively high for Sintashta and CWC_Germany compared to the Beaker sample; Sintashta and CWC_Germany are high on Scotland:GAC_All, indicating relatively close to GAC_All and low on GAC_Ukraine:GAC_Poland, relatively close to GAC_Ukraine.

Affinity to GAC_Ukraine:GAC_Poland is also relatively high in Sintashta and CWC_Germany compared to their affinity to GAC_Poland:Scotland_N.
Affinity to Iberia_Chl:GAC_Poland is relatively low in Sintashta and CWC_Germany compared to Beakers.

I think all this points to a MNChl stream in Sintashta and CWC_Germany that is relatively more Eastern (closer to GAC_Ukraine) compared to Beakers, but still more strongly linked to GAC_Poland than GAC_Ukraine.

Some more stats if poss.

Corded_Ware_Germany Beaker_Britain GAC_Ukraine GAC_Poland
Corded_Ware_Germany Beaker_The_Netherlands GAC_Ukraine GAC_Poland
Corded_Ware_Germany Beaker_Britain France_MLN GAC_Poland
Corded_Ware_Germany Beaker_The_Netherlands France_MLN GAC_Poland
Corded_Ware_Germany Beaker_Britain Iberia_Chl GAC_Poland
Corded_Ware_Germany Beaker_The_Netherlands Iberia_Chl GAC_Poland
Corded_Ware_Germany Beaker_Britain Scotland_N GAC_Poland
Corded_Ware_Germany Beaker_The_Netherlands Scotland_N GAC_Poland
Corded_Ware_Germany Beaker_Britain France_MLN GAC_Ukraine
Corded_Ware_Germany Beaker_The_Netherlands France_MLN GAC_Ukraine
Corded_Ware_Germany Beaker_Britain Iberia_Chl GAC_Ukraine
Corded_Ware_Germany Beaker_The_Netherlands Iberia_Chl GAC_Ukraine
Corded_Ware_Germany Beaker_Britain Scotland_N GAC_Ukraine
Corded_Ware_Germany Beaker_The_Netherlands Scotland_N GAC_Ukraine

Baltic_BA cluster is not what everyone thinks

My model was something like "CWC_Baltic_early" + MN input + Narva input. Results in a population similar to early steppe with higher WHG relative to CHG and higher EEF relative to EHG. Right / wrong?

Alberto said...


Yes, the thing about the extra WHG was kind of a side comment. The main point I was trying to make is that figuring out the MN admixture in Bell Beakers is probably not as simple as Yamnaya + "the best MN fit" (or rather "which MN population is favoured by Bell Beakers in D-stats"). What I was getting was a preference for Ukraine_Eneolithic, which contains AN ancestry (mostly via Tripolye, but also probably from the Caucasus) and that part already was acquired in Ukraine before moving further west.

I ran the Iberian Beakers again without Ukraine Eneolithic to see the difference. Some samples don't see a big change, but for others the change is significant and makes sense:

With Uk_En:

Iberia_MN 33%
Ukraine_Eneolithic:I4110 26.3%
Portugal_MN 16.1%
Yamnaya_Samara 12.55%
Iberia_EN 11.9%
Iberia_ChL 0.15%

Distance 4.1549%

Without Uk_En:

Iberia_MN 34.8%
Yamnaya_Samara 24.7%
Globular_Amphora 17.65%
Portugal_MN 12.65%
Wales_N 7.5%
Koros_HG:I4971 2.7%

Distance 4.4015%

With Uk_En:

Iberia_Southwest_CA 32.55%
Ukraine_Eneolithic:I6561 26.5%
Ukraine_Eneolithic:I4110 12.6%
Yamnaya_Kalmykia 12.45%
Iberia_ChL 8.75%
Ireland_MN 2.9%
Blatterhole_HG:I1565 2.6%
Globular_Amphora 1.6%
Yamnaya_Samara 0.05%

Distance 3.4844%

Without Uk_En:

Iberia_Southwest_CA 40.25%
Yamnaya_Kalmykia 33.35%
Globular_Amphora 17.6%
Koros_HG:I4971 4.25%
Yamnaya_Samara 3.75%
Blatterhole_HG:I1565 0.8%

Distance 3.6362%


This one is re: the side comment about the extra WHG, which here is small, but might be quite a bit bigger in other places (like Hungary):

With Uk_En:

Iberia_Southwest_CA 40.45%
Iberia_Central_CA 18.4%
Ukraine_Eneolithic:I4110 17.15%
Yamnaya_Kalmykia 13.55%
Iberia_MN 7.75%
Ukraine_Eneolithic:I5884 2.5%
Yamnaya_Samara 0.2%

Distance 1.9294%

without Uk_En:

Iberia_Southwest_CA 56.65%
Yamnaya_Kalmykia 23.7%
Iberia_Central_CA 12.2%
Koros_HG:I4971 4.6%
Czech_MN 1.85%
WHG:Rochedane 0.9%
Tisza_LN 0.1%

Distance 2.103%

Matt said...

Alberto: What I was getting was a preference for Ukraine_Eneolithic, which contains AN ancestry (mostly via Tripolye, but also probably from the Caucasus) and that part already was acquired in Ukraine before moving further west.

Hmm... We may need to establish how much continuity there was over time with the Ukraine_Eneolithic. The two Yamnaya_Ukraine at Ozera suggest not much influence, assuming Yamnaya moving west? (Not much different from other Yamnaya / closer to Ukraine_Eneolithic from fairly nearby sites).

GAC_Ukraine also doesn't see too much influence from anything else in Ukraine before it (not clear how much local ancestry this absorbed, but it does look like GAC_Ukraine is very distinct from GAC_Poland, hence Arza's D(Mbuti, GAC_Poland; Iberia_Chl, GAC_Ukraine) = 0.0005, Z = 0.219, 710325 SNPs, i.e. GAC_Ukraine is about as distinct from GAC_Poland as Iberia_Chl is).