Wednesday, January 17, 2018

Another look at the genetic structure of Yamnaya


Yamnaya and other similar Eneolithic/Bronze Age herder groups from the Eurasian steppe were mostly a mixture of Eastern European Hunter-Gatherers (EHG) and Caucasus Hunter-Gatherers (CHG). But they also harbored minor ancestry from at least one, significantly more westerly, source that pulled them away from the EHG > CHG north/south genetic cline. This is easy to show with formal statistics (for instance, refer to the qpAdm output here) and illustrate with a decent Principal Component Analysis (PCA).


Over the past couple of years I've come to the conclusion that this minor westerly input probably came from the Carpathian Basin (modern-day Hungary) or somewhere nearby, like the Balkans (see here).

However, this inference was based on just a handful of Neolithic samples from the Carpathian Basin. Now, thanks to Lipson et al. 2017, I have genotype data from dozens of individuals from several different Neolithic and Copper Age cultures from the region. So let's revisit the issue by plugging these new samples into qpAdm, and also using the very latest qpAdm methods as described in the scientific literature (with Ethiopia_4500BP as the base pright sample, alongside 15 other ancient pright groups and individuals).

Below are the results, best to worst, sorted by taildiff. For comparison, I ran extra models with ancient populations from other parts of Europe and also West Asia. It's interesting and, I'd say, important to note that the West Asian reference groups produce some of the worst statistical fits (bolded). What this suggests is that Yamnaya did not harbor extra West Asian ancestry on top of its CHG input. And, by the way, please note that I'm only using Yamnaya_Samara in these runs because I prefer UDG-treated, and thus higher quality, ancient samples.

CHG + EHG + Blatterhole_MN 0.465394061 > full output

CHG + EHG + Koros_HG 0.322245651 > full output

CHG + EHG + Germany_MN 0.321017025 > full output

CHG + EHG + Protoboleraz_LCA 0.315521424 > full output

CHG + EHG + Vinca_MN 0.292074267 > full output

CHG + EHG + Baden_LCA 0.255168297 > full output

CHG + EHG + Tisza_LN 0.246555616 > full output

CHG + EHG + ALPc_MN 0.220623346 > full output

CHG + EHG + Blatterhole_HG 0.219418173 > full output

CHG + EHG + Balaton_Lasinja_CA 0.211230222 > full output

CHG + EHG + Tiszapolgar_ECA 0.207527666 > full output

CHG + EHG + LBK_EN 0.182365613 > full output

CHG + EHG + TDLN 0.176675465 > full output

CHG + EHG + Koros_EN 0.15488361 > full output

CHG + EHG + Starcevo_EN 0.136365203 > full output

CHG + EHG + Armenia_EBA 0.127988891 > full output

CHG + EHG + Armenia_ChL 0.123057884 > full output

CHG + EHG + LBKT_MN 0.122780467 > full output

CHG + EHG + Tepecik_Ciftlik_N 0.110155019 > full output

CHG + EHG + Greece_N 0.105880232 > full output

CHG + EHG + Boncuklu_N 0.094240794 > full output

CHG + EHG + Anatolia_BA 0.069141519 > full output

CHG + EHG + Anatolia_ChL 0.067837662 > full output

...

CHG + EHG + Iran_ChL infeasible > full output
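For anyone who wants to re-rank or filter these models, the list above can be treated as plain data. What follows is only a minimal Python sketch and not part of the qpAdm runs themselves: the `models` list restates the taildiff values reported above (the models hidden behind the "..." are not filled in), `rank_models` is a hypothetical helper name, and the 0.05 cut-off is an assumed, conventional threshold for flagging a poor fit.

```python
# Third source added to CHG + EHG, with the taildiff value reported in the post.
# None marks the model that came back as infeasible.
models = [
    ("Blatterhole_MN", 0.465394061),
    ("Koros_HG", 0.322245651),
    ("Germany_MN", 0.321017025),
    ("Protoboleraz_LCA", 0.315521424),
    ("Vinca_MN", 0.292074267),
    ("Baden_LCA", 0.255168297),
    ("Tisza_LN", 0.246555616),
    ("ALPc_MN", 0.220623346),
    ("Blatterhole_HG", 0.219418173),
    ("Balaton_Lasinja_CA", 0.211230222),
    ("Tiszapolgar_ECA", 0.207527666),
    ("LBK_EN", 0.182365613),
    ("TDLN", 0.176675465),
    ("Koros_EN", 0.15488361),
    ("Starcevo_EN", 0.136365203),
    ("Armenia_EBA", 0.127988891),
    ("Armenia_ChL", 0.123057884),
    ("LBKT_MN", 0.122780467),
    ("Tepecik_Ciftlik_N", 0.110155019),
    ("Greece_N", 0.105880232),
    ("Boncuklu_N", 0.094240794),
    ("Anatolia_BA", 0.069141519),
    ("Anatolia_ChL", 0.067837662),
    ("Iran_ChL", None),
]


def rank_models(models, threshold=0.05):
    """Sort feasible models best to worst and flag infeasible or poorly fitting ones.

    The threshold is an assumption made for illustration, not a qpAdm setting.
    """
    feasible = [(name, p) for name, p in models if p is not None]
    feasible.sort(key=lambda item: item[1], reverse=True)
    flagged = [name for name, p in models if p is None or p < threshold]
    return feasible, flagged


ranked, flagged = rank_models(models)
for name, p in ranked:
    print(f"CHG + EHG + {name:<20} {p:.3f}")
print("infeasible or below threshold:", flagged)
```

Under that assumed threshold only the infeasible Iran_ChL model is flagged, so the ranking is about relative fit rather than outright rejection.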

At the top of the list is Blatterhole_MN. Admittedly this is something of a surprise, considering the geographic distance between Blatterhole, Germany, and Samara, Russia. It's also an intriguing result because of the presence of Y-chromosome haplogroup R1b in both Blatterhole_MN and Yamnaya (see here).

However, this doesn't necessarily mean that Yamnaya harbors direct ancestry from Blatterhole_MN, or even any closely related group from North-Central Europe. Rather, Blatterhole_MN is simply the best proxy in this analysis for the non-CHG/EHG ancestry in Yamnaya, and the important question is why?

Considering also the presence at the top of the list of Koros_HG (which includes Hungary_HG I1507), Germany_MN and Vinca_MN, the likely answer is its high ratio of Western European Hunter-Gatherer (WHG) ancestry. Indeed, when I let qpAdm vary the WHG ratio, by dropping Blatterhole_MN and adding Koros_EN and Koros_HG in its place, I get an even better fit.

CHG + EHG + Koros_EN + Koros_HG 0.612772624 > full output

And for comparison...

CHG + EHG + LBK_EN + WHG 0.551431774 > full output

So is the missing piece of the Yamnaya puzzle a population with roughly equal ratios of Early Neolithic (EN) and WHG ancestries from the Carpathian Basin or surrounds? Quite possibly. But let's wait and see what happens when I add the ancient groups from the Balkans and North Pontic steppe from the forthcoming Mathieson et al. 2018 to this analysis.

Update 17/05/2018: My results have been confirmed in a new preprint from Harvard/Max Planck. See here: On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint).

See also...

What's Maykop (or Iran) got to do with it?

The Yamnaya outlier

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

85 comments:

Shahanshah of Persia said...

Thanks for sharing! You have made a few excellent points here!

Aram said...

Mitogenomes like J2b, T2c1a, T1a in Yamna are almost certainly from EEF side.

Matt said...

Considering the underlying more basic components in the Neolithic samples, all the working (Blatterhole, Koros_EN, Koros_EN+Koros_HG) models are pretty sharp on the total amount of CHG+Anatolia_N together, which are approximately 50% (varies from 49.7% to 51.0% drafting in the WHG+Anatolia_N proportions from Lipson's paper), and the remaining 50% (50.3% - 49%) WHG+EHG.

Within the models, CHG+WHG are positively related (models with slightly higher CHG fraction have slightly higher underlying WHG fraction). With as many outgroup relevant statistics on the right, it seems like this is balancing it to a very precise point on general north-south underlying dimension.

a said...

SM. has placed R1b samples from Blatterhole and Iron Gates between L754+ and V88+
https://kumbarov.com/ht35/R1b_xP312xU106_V.38.1.pdf

epoch2013 said...

@Davidski

"At the top of the list is Blatterhole_MN. Admittedly this is something of a surprise, considering the geographic distance between Blatterhole, Germany, and Samara, Russia."

But what stands out is that Blatterhole_MN has a lot more WHG than other Old Europeans, apart from Koros_HG. Which is the second on the list.

Which makes me wonder, would the Ukrainian Mesolithic from Mathieson + CHG fit better than CHG + EHG? These Mesolithic samples have more WHG than the samples mostly called EHG. Or would CHG + EHG + Ukrainian Mesolithic fit? Or CHG + Ukrainian Mesolithic + some farmer/EEF sample?

Volodymyr Lutsyk said...

Input of women from Cucuteni-Tripolye?

Blasonario Cremonese said...

@ a

Did I read well the PDF of Sergey Malyshev and Atanas Kumbarov? Villabruna is placed on the M73 line... How's it possible?

I noticed also that ATP3 is surely placed on the M269 line...

Grey said...

"and the important question is why?"

any copper/gold/silver mines near Blatterhole?

Grey said...

"any copper/gold/silver mines near Blatterhole?"

or less fun but possibly more likely, salt

Ryan said...

"So is the missing piece of the Yamnaya puzzle a population with roughly equal ratios of Early Neolithic (EN) and WHG ancestries from the Carpathian Basin or surrounds?"

I just want to mention that this is what a lot of us "R1b/Yamnaya skeptics" have been suggesting for a while (ie that R1b-M269 radiated out from around the Carpathians/Danube before some downstream clades then hitched a ride on Yamnaya). Obviously it remains to be seen if that skepticism will be validated.

The "third neolithic" talk on Bell Beaker Blogger's page fits this well IMHO.

epoch2013 said...

@David

Another thing: If I recall correctly it looks like Koros_HG has a tad extra ANE and a tad EEF. I can imagine that would somehow cause qpAdm to choose it over any other WHG sample with a target that has ANE and basal eurasian.

Rob said...

I think individuals like GB1 Eneolithic Romania will be interesting for Yamnaya

Samuel Andrews said...

And also all Steppe ancestry in Europe came along with significant EEF/WHG admixture. Those two mega "races" had a long history of mixing. By the time R1b P312 Steppe folk made it to Spain they may have been only 50% Steppe.

AlanL said...

>> any copper/gold/silver mines near Blatterhole?

> or less fun but possibly more likely, salt

Well, it's in the Ruhr Valley, but I doubt if coal or iron ore were of any interest at that stage. There is copper in Sauerland, 40 to 100 km away, but ten minutes googling didn't turn up any references to pre-Roman mining.

Rob said...

Regarding the speculation about copper and Blatterhohle: there's no basis. In fact, the Blatterhohle R1b-V88 individual was found without context (in a cave without clear TRB or Baalberg or whichever local MNE culture artefacts). But an interesting clue is the links between Iron Gates, Els Trocs and some of the Mariupol (Ukraine) V88 individuals, which is further evidence of intra-European movement patterns before the big steppe migration.

Grey said...

AlanL, Rob

me wildly speculating - the V88 distribution in Africa is either very odd (if the source was distant) or not odd at all (if it was local) but on the assumption it was distant i googled around one time to see if there was anything in the areas of high concentration of V88 that might have drawn people from a long way away hoping for something fun like gold or copper but iirc the only clear potential candidate was salt. One of the high V88 regions was a major salt production center from ancient times iirc (that's if i got the right spot in the first place so may be nothing).

Anthro Survey said...

Epoch's hypothesis mirrors mine and deserves some attention.

Who here remembers the Ukrainian hunter-gatherers and that Eneolithic steppe-like sample from Ukraine from Mathieson et al. 2017?

As many of you probably recall, the Ukrainian HGs were not exactly like EHGs. They had some WHG-like ancestry pulling them westward.

So, WHAT IF the WHG-like ancestry in Ukrainians and Romanians was closely related to UHG in Neolithic Anatolians (and in turn EEFs)? It's a relevant question as Eastern Ukraine was, in all likelihood, the birthplace of the steppe ancestral package considering aDNA and archaeology alike.

In fact, I recall modeling Yamnaya Kalmykians a while back on nMonte using a battery of samples. They took no ANF ancestry at all, but preferred a mix of EHG+some WHG+CHG(don't recall the exact samples), instead. So, no surprise David's models prefer Koros_HG and HG-enriched Blatterhole_MN.

Davidski said...

@Anthro Survey

There are EEF-related archaeological sites derived from the Balkans and/or Carpathian Basin as far east as the Donets in eastern Ukraine/southern Russia.

http://eurogenes.blogspot.com/2017/02/three-way.html

It seems unlikely that these settlements were inhabited by WHG-like populations, as opposed to something resembling a mixture of Koros_EN and Koros_HG.

Anthro Survey said...

@Davidski

In this paper, the authors do not rule out, by any means, the arrival of agricultural, husbandry and pottery-making packages to the Donets/Lower Don regions from the NW Caucasus(or perhaps via the Eurasian route). In fact, they arguably show a tacit preference for this.

Furthermore, agriculture definitively began there at the tail end of 5th millennium BC. This is either contemporaneous with or post-dates some of the earliest "steppic" sites. Hence, if we assume an EEF route of transmission from Tripolye-like peoples, it doesn't preclude the earliest hybrid populations there being CHG+EHG+minor WHG.
Either way, it doesn't require local EEF-like populations of E. Ukraine to have been significant contributors to the steppe package in the long run. They could have lived side-by-side and perhaps engaged in limited mixing but ultimately ended up as a dead-end population.

I remain open-minded, of course, and don't discount limited EEF in Yamnaya.

When do you think those genomes from Ukraine and Romania will become available?

Davidski said...

@Anthro Survey

"When do you think those genomes from Ukraine and Romania will become available?"

Hopefully soon.

Rob said...

Anthro
It is thought these days that pottery making was invented in Russia - Ukraine independently. It’s just silt and clay, which was then fired for hardness

Your statement that “agriculture definitely began there at the tail end of 5th M BC” is more a terminus ante quem.

Aram said...

Khvalynsk and Armenia EBA R1b in the same branch makes sense. That would explain the EHG in Chl Armenia.

But the other R1b from Armenia is MBA not EBA. There is an error there.

supernord said...

Blatterhole_MN has R1b and clear mtDNA links with the northwestern Black Sea region. Most likely, they were some kind of Mesolithic (early? late?) migrants from the western part of the northern Black Sea region. Maybe, under pressure from Neolithic farmers, one part of their ancestral population moved away to Germany, and the other to the Dnieper River in the Neolithic.

Protoboleraz_LCA clearly had cultural contacts with the steppe, and therefore received gene flow. Baden_LCA comes from it and Tisza_LN & Tiszapolgar_ECA & Balaton_Lasinja_CA.


What is Körös_HG?

Lee Albee said...

Hmm. Could this be due to a spread of R1b to the steppe via expansion of the Magdalenian culture? Archaeology demonstrates that the Magdalenian culture did expand out to at least Poland.

Based on Reich's paper on Ice Age Europe, the El Miron cluster was fairly diverse, with the El Miron individual falling in between the rest of the El Miron cluster and the Villabruna (WHG) cluster.

Or it could have been a slightly later, replacement expansion of the Villabruna genetic cluster (WHG), which expanded out to at least the Czech Republic. (https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/FuQ_nature17993.pdf)

Blatterhole_MN is smack in the middle of that range and shows ~70% Villabruna or KO1 related WHG ancestry and ~30% EEF farmer ancestry. (https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/nature24476_final.pdf)

It would seem parsimonious that the R1b came from the Villabruna lineage.

So R1b expanded out to Eastern Europe/the steppe from Western/Central Europe, where it was found only at low percentages, via the Villabruna-like lineage during the Late Paleolithic/early Mesolithic?

Just a thought

Samuel Andrews said...

@Lee Albee,

Um. If early R1b spread with Magdalenian that would really be something considering for like 10 years the consensus was R1b L151 spread with Magdalenian.

epoch2013 said...

@supernord

The area is not far from the Netherlands, where the Vlaardingen culture could be found in the peat- and marshlands. It was contemporary with the Funnel Beaker culture, and even survived well into the Corded Ware era. It showed clear Mesolithic traits, most likely a continuation of the local Dutch variety of Ertebølle called the Swifterbant culture.

https://en.wikipedia.org/wiki/Vlaardingen_culture

We also have a Funnelbeaker graveyard from near Schwerin - the Ostorf flat grave - which had fully mesolithic mtDNA and showed clear signs of being far more hunter-gatherer than normal funnelbeaker.

http://www.sciencedirect.com/science/article/pii/S2352409X17303231?via%3Dihub

Your scenario seems too far fetched with such examples in the neighbourhood.

Rob said...

@ Epoch
So where is all the Mesolithic V88 in Central -Western Europe ??

Chad Rohlfsen said...

MN Euro may not be the way to go. Depends how M269 looked during the Eneolithic. If it was like Khvalynsk, then there is no MN, but 10-15% input from an Armenia EBA/IranChL pop.

Lee Albee said...

@Samuel Andrews

Do you have a reference for that? I would love to read it. I had thought that such an early presence of R1b in Western Europe was fairly unheard of?

Thank you

epoch2013 said...

@Rob

V88 could have been brought to Blätterhöhle by the farmers part of them.

Davidski said...

@Chad

"MN Euro may not be the way to go. Depends how M269 looked during the Eneolithic. If it was like Khvalynsk, then there is no MN, but 10-15% input from an Armenia EBA/IranChL pop."

Yamnaya still shows a strong signal from the west when it's modeled as part Khvalynsk, and no Armenia_EBA/Iran_ChL.

http://eurogenes.blogspot.com/2017/08/genetic-and-archaeological-continuity.html

supernord said...


epoch2013 "Your scenario seems too far fetched with such examples in the neighbourhood."

It means nothing, as we know neither the genetics of the Dutch cultures nor their relations with Blatterhole. What is important is that R1b (V88) was present in the Neolithic both in Blatterhole and on the Dnieper, and that the population on the Dnieper received more WHG in comparison with the Mesolithic.


@Davidski -
What is Körös_HG?

Davidski said...

@supernord

"What is Körös_HG?"

It's the Koros_HG sample from Lipson et al., plus Hungary_HG I1507, which also comes from a Koros Neolithic site, but is overwhelmingly WHG.

https://reich.hms.harvard.edu/datasets

Rob said...

Epoch
Sometimes Supernord does make sense
Although his scenario misses the point I made earlier: Iron Gates expansion to Ukraine and Central Europe

supernord said...

@Davidski
This is the Koros_EN in Lipson et al. In Lipson et al. there is no Koros_HG.

Davidski said...

@supernord

"In Lipson et al. there is no Koros_HG."

In the Lipson et al. dataset ind file there is.

I4971 M Koros_EN_HG

https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/LipsonEtAl2017.tar.gz

Davidski said...

@Chad

By the way, using this latest pright pops set up that I have, I can model Yamnaya as a mixture of CHG, Blatterhole_MN and Samara_Eneolithic, without any additional EHG.

https://drive.google.com/file/d/1Q-Jcv4AFIzpZr3iD9KINW330emfUjOSe/view?usp=sharing

But it's true that Samara_Eneolithic doesn't take any Blatterhole_MN.

https://drive.google.com/file/d/1rpffJguM9jI9Gl16_ja4-02JI_nr15pp/view?usp=sharing

So this makes me think that the archaeologically based idea that Yamnaya came about from a fusion of the westerly Sredny Stog and easterly Khvalynsk might be spot on.

Rob said...

That Koros individual I4971 is from a Neolithic context c. 5300 BC, but seems to be yet another 'assimilated hunter-gatherer'. Reported lineages are I2a2 (Y) and K1 (mtDNA), although he can be further determined to be I2a2b.
Just more confirmation that the area around the eastern Carpathian Basin and adjacent mountains was a hunter-gatherer 'refugium' of I2a2 and probably certain types of R1b.

EastPole said...

@Davidski
“So this makes me think that the archaeologically based idea that Yamnaya came about from a fusion of the westerly Sredny Stog and easterly Khvalynsk might be spot on.”

Maybe it wasn’t the Sredny Stog but Corded Ware which influenced the Yamnaya:

https://www.researchgate.net/publication/292423589_The_Yamna_culture_and_the_Indo-European_homeland_problem

From Allentoft “Population genomics of Bronze Age Eurasia”

https://s10.postimg.org/nv0yovixl/screenshot_324.png

Chad Rohlfsen said...

David,

Isn't that just one way to make an Armenia EBA? You're relying on a pure CHG pop to exist in the Caucasus. That, in all likelihood, has no chance of being true. Why say Hungary_N + Kotias? Just have it be a pop between Armenia_EBA and Iran_ChL.

The Caucasus pops with good Anatolian admixture make the best proxy for an admixing source, going from Khvalynsk > Yamnaya

Steppe_ChL WHG Steppe_EMBA 0.005605 0.001315 4.261
Steppe_ChL Blatterhole_MN Steppe_EMBA 0.002843 0.001059 2.685
Steppe_ChL Vinca Steppe_EMBA -0.001168 0.001108 -1.054
Steppe_ChL Anatolia_N Steppe_EMBA -0.002015 0.000712 -2.829
Steppe_ChL TDLN Steppe_EMBA -0.001191 0.000825 -1.443
Steppe_ChL LBK Steppe_EMBA -0.001502 0.000731 -2.056
Steppe_ChL Armenia_EBA Steppe_EMBA -0.002890 0.000902 -3.203
Steppe_ChL Tiszapolgar Steppe_EMBA -0.001606 0.001002 -1.603
Steppe_ChL Armenia_ChL Steppe_EMBA 0.000495 0.000823 0.601
Steppe_ChL Tepecik_Ciftlik Steppe_EMBA -0.003153 0.001059 -2.977
Steppe_ChL Iran_ChL Steppe_EMBA -0.003186 0.000880 -3.620
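One way to sanity-check a table like the one above: assuming its three numeric columns are the estimate, its standard error, and the Z-score (the columns are not labelled in the comment, so this reading is an assumption), each Z should be roughly the estimate divided by the standard error. The Python sketch below recomputes that ratio and lists the rows from most negative to most positive Z; the names `rows` and `recompute_z` are hypothetical and used only for illustration.

```python
# (second population, estimate, standard error, reported Z), copied from the table above.
# The first and third populations are Steppe_ChL and Steppe_EMBA in every row.
rows = [
    ("WHG", 0.005605, 0.001315, 4.261),
    ("Blatterhole_MN", 0.002843, 0.001059, 2.685),
    ("Vinca", -0.001168, 0.001108, -1.054),
    ("Anatolia_N", -0.002015, 0.000712, -2.829),
    ("TDLN", -0.001191, 0.000825, -1.443),
    ("LBK", -0.001502, 0.000731, -2.056),
    ("Armenia_EBA", -0.002890, 0.000902, -3.203),
    ("Tiszapolgar", -0.001606, 0.001002, -1.603),
    ("Armenia_ChL", 0.000495, 0.000823, 0.601),
    ("Tepecik_Ciftlik", -0.003153, 0.001059, -2.977),
    ("Iran_ChL", -0.003186, 0.000880, -3.620),
]


def recompute_z(estimate, std_err):
    """Z-score as estimate over standard error (assumed column meaning)."""
    return estimate / std_err


# Print rows from most negative to most positive reported Z.
for name, est, se, z in sorted(rows, key=lambda r: r[3]):
    print(f"{name:<16} reported Z = {z:>7.3f}   estimate/std_err = {recompute_z(est, se):>7.3f}")

most_negative = min(rows, key=lambda r: r[3])[0]
print("most negative Z:", most_negative)
```

The small differences between the reported and recomputed values are expected, since the printed estimates and standard errors are rounded.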

Davidski said...

@Chad

My models rely on a largely CHG, but partly EEF and EHG, farmer population existing in the North Caucasus and Don region of the steppe, both of which are yet to be sampled, during the Late Neolithic/Chalcolithic.

This is very plausible considering the archaeological and skeletal finds from that period in the North Caucasus and nearby parts of the steppe.

http://eurogenes.blogspot.com/2017/02/three-way.html

Rob said...

Chad
Doesn’t the high CHG bounceback in Kura-Araxes suggest that there was indeed an almost pure CHG population still living around?

Chad Rohlfsen said...

David,

I think it is probably something very Armenian and Iranian-like. It won't be a little Caucasus, plus MN Euro. The EEF-like mtDNA need not come from Europe, but the Caucasus.

Rob,

It isn't a CHG resurgence. look at PCA. They are right under them, towards Iran_ChL. It is a mixed pop, from near the Zagros or Mesopotamia.

left pops:
Armenia_EBA
Armenia_ChL
Iran_ChL
CHG

right pops:
Mbuti_DG
Kostenki14
Ust_Ishim
Onge
Karitiana
Anatolia_N
WHG
EHG
Iran_N
Levant_N
MA1


best coefficients: 0.518 0.541 -0.059
std. errors: 0.154 0.194 0.111


fixed pat wt dof chisq tail prob
000 0 8 3.861 0.869461 0.518 0.541 -0.059 infeasible
001 1 9 4.188 0.898595 0.517 0.483 0.000

Extra CHG is a no-go. However, look towards Mesopotamia or the Zagros, and that is probably your answer.

left pops:
Armenia_EBA
Armenia_ChL
Iran_ChL
Tepecik_Ciftlik

right pops:
Mbuti_DG
Kostenki14
Ust_Ishim
Onge
Karitiana
Anatolia_N
WHG
EHG
Iran_N
Levant_N
MA1



best coefficients: 0.253 0.661 0.086

std. errors: 0.175 0.158 0.081

fixed pat wt dof chisq tail prob
000 0 8 2.438 0.964543 0.253 0.661 0.086
001 1 9 3.573 0.937198 0.323 0.677 -0.000
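The "tail prob" column in the two outputs above appears to be the upper-tail probability of a chi-square distribution with the listed degrees of freedom. Here is a minimal, standard-library Python sketch of that check, under that assumption; `chi2_sf` is a hypothetical helper written for this illustration, not a qpAdm function, and it only handles positive integer dof.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X > x) for a chi-square variable with integer dof.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with h = x / 2.
    """
    h = x / 2.0
    if dof % 2 == 0:
        a = 1.0
        q = math.exp(-h)               # Q(1, h), i.e. dof = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(h))    # Q(1/2, h), i.e. dof = 1
    while a < dof / 2.0 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


# (dof, chisq, reported tail prob) from the two outputs above.
reported = [
    (8, 3.861, 0.869461),
    (9, 4.188, 0.898595),
    (8, 2.438, 0.964543),
    (9, 3.573, 0.937198),
]

for dof, chisq, tail in reported:
    print(f"dof={dof} chisq={chisq:.3f} reported={tail:.6f} recomputed={chi2_sf(chisq, dof):.6f}")
```

Because the printed chisq values are rounded to three decimals, the recomputed probabilities should only be expected to agree with the reported ones to a few decimal places.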

Chad Rohlfsen said...

I can fine-tune that and do a qpGraph if you like.

Davidski said...

@Chad

"I think it is probably something very Armenian and Iranian-like."

Can't be, because I've included all of the available ancient West Asian groups in my models. The fits were below par or even very poor, with the Iran_ChL model coming back as infeasible.

"The EEF-like mtDNA need not come from Europe, but the Caucasus."

A bigger problem I think is the lack of South Caucasus and South Caspian mtDNA HGs in Yamnaya etc., like U7.

Chad Rohlfsen said...

These guys aren't from south of the Caspian. This is likely Mesopotamian and eastern Anatolian.

Rob said...

Chad
K-A doesn’t appear to have expanded from Iran, but into it.
In fact, it might have originated in eastern Anatolia before finalising in the Kura and Araxes valleys.

Davidski said...

@Chad

Mesopotamia is south of the Caspian, and South Caspian-specific mtDNA HGs are common there. They're found in the Zagros too, including in Iran_ChL.

Anywhere south of the Caucasus is no go for the West Asian-related admix in Yamnaya. Take a look at my models; I've got it all there, and it matches the uniparentals on the ancient steppes.

Chad Rohlfsen said...

Mesopotamia isn't south of the Caspian. There's limited data from here and not enough to say it didn't happen. The trail from the Caucasus farmers is from Mesopotamia. There really isn't any way around that. After someone gets a couple hundred samples across this region and the Caucasus, it will be more clear.

Samuel Andrews said...

@Davidski,
"A bigger problem I think is the lack of South Caucasus and South Caspian mtDNA HGs in Yamnaya etc., like U7."

Maybe the Caucasus lacked south Caspian mHGs but carried EEF mtDNA. That would make a Caucasus route for Yamnaya's EEF stuff possible. But I think it makes a lot more sense that Yamnaya's EEF is from Europe, considering southeast Europe right around the corner was 90% Anatolian.

Davidski said...

@Chad

Mesopotamia is next to the Zagros, which is where Iran_ChL is from, and its mtDNA is South Caspian-specific, and very much unlike the mtDNA in Yamnaya.

So Mesopotamia is a long shot, especially since the North Caucasus is right next to the steppe, and the Carpathian Basin is an extension of the steppe in Europe.

Rob said...

I think Chad is right: there are too few genomes outside Anatolia to really appreciate what’s what.
Anyhow, CHG must be either from south of the Caucasus or south/east of the Caspian. In fact, as I’ve previously suggested, it’s probably several geographic and temporal layers.

Davidski said...

Nonsense.

The Chalcolithic South Caspian populations that were expanding from Mesopotamia and Zagros made it to Egypt and Anatolia. We know this because it's easily seen in the ancient data without having to resort to any mental gymnastics.

They didn't make it onto the steppe, because their markers are missing from Bronze Age steppe populations.

Just follow the data without wishing too hard, and you'll see the reality.

Rob said...

So you’re envisaging a cryptic CHG population living in seclusion somewhere north of the Caucasus ? Which archaeological group would that be ? How’d it get there in the first place ?

And you’ve been corrected before several times- if you want to understand affairs, we need to look at Late neo and Eneolithic steppe markers, not Yamnaya

Davidski said...

@Rob

"So you’re envisaging a cryptic CHG population living in seclusion somewhere north of the Caucasus ? Which archaeological group would that be ? How’d it get there in the first place ?"

They lived in the Northwest Caucasus since the Mesolithic. Duh.

"And you’ve been corrected before several times."

Give me two examples.

Rob said...

“They lived in the Northwest Caucasus since the Mesolithic. Duh.”

No they didn’t. The Mesolithics who at one point lived in the NW Caucasus, and were of southern origin (Imeretian-Zarzian epipaleolithic) appears to have gone extinct. There is a 1000 year hiatus between the Mesolithic and the arrival of Mariupol-like groups in the north Caucasus (which would be on the WHG-EHG cline, and I2/R1).
Then new southerners arrived with the Meshoko horizon from Georgia, and mixed with Sredny Stog people. Then again new southerners arrived with Majkop contacts (northern Ubaid and Halaf, therefore not really “Mesopotamian”).

We also have Kelteminar contacts impacting Elshanka-Volga groups

Davidski said...

@Rob

"No they didn’t."

It looks like they did, or somewhere nearby, because the ancient genetic data show very clearly that the southern ancestry in both Khvalynsk and Yamnaya is overwhelmingly CHG.

Again, I ask you, where are the Mesopotamian/South Caspian/Zagros markers in Khvalynsk, Yamnaya, Catacomb, etc.?

Can you explain this discrepancy?

"The Mesolithics who at one point lived in the NW Caucasus, and were of southern origin (Imeretian-Zarzian epipaleolithic) appears to have gone extinct."

The operative word here is "appears".

It also appeared to many until recently, including to you, that Corded Ware and R1a-M417 weren't from the steppe.

Salden said...

>The Chalcolithic South Caspian populations that were expanding from Mesopotamia and Zagros made it to Egypt and Anatolia. We know this because it's easily seen in the ancient data without having to resort to any mental gymnastics.

Are you advocating the Dynastic Race Theory?

https://en.wikipedia.org/wiki/Dynastic_race_theory

Davidski said...

Later today, or tomorrow, Rob, I'll tell you a little tale about a fortress in the Northwest Caucasus.

Davidski said...

@Salden

I'm not advocating anything about ancient Egypt here; just pointing out patterns in the data that have been discussed in scientific literature and on this blog already (ie. the spread of South Caspian/Mesopotamian/Iran_ChL ancestry to ancient Egypt).

Salden said...

>the spread of South Caspian/Mesopotamian/Iran_ChL ancestry to ancient Egypt

Playing Devil's Advocate here, wasn't that in the Abusir sample? Which was:

A. In Lower Egypt

B. After major movements of Levantines into Egypt during the end of the Middle Kingdom (see Hyksos)

Alberto said...

The way I see it:

The best model posted is the one with CHG + EHG + Koros_EN + Koros_HG, with proportions of 44% + 44% + 6% + 6%.

Ukraine Neolithic was SHG-like, and only received EHG during the Eneolithic with an already admixed Yamnaya-like population.

The North Caucasus (pre-Maykop) had clear links to the NW Black Sea region*, and I'd guess that those Balkan outliers can only realistically come from the North Caucasus (and they plot near modern Europeans).

The South Caucasus (home of CHG) had too much AN ancestry from early on (Armenia_Chl and Armenia_EBA). Everything around the Black Sea was too "western" by the LN/Chl to be the origin of a Yamnaya-like population.

So where does this leave us? We need a population as "eastern" as Samara_Eneolithic (Khvalynsk), but more southern. This would make the lower Volga-Ural region a good candidate. Either that, or Central Asia.

The first Yamnaya-like (but not exactly) sample we have is the Samara Eneolithic sample (probably not from the Khvalynsk tribe) belonging to HG Q, an eastern marker.

The earliest Yamnaya proper (genetically) samples we have are from Samara and the Altai (Afanasievo), ca. 3000 BC.

Apparently (unpublished data), the best match so far for the mtDNA of Yamnaya comes from Ulug Depe**.

* https://www.academia.edu/2543641/The_chronology_of_the_Maikop_culture_in_the_Northern_Caucasus_changing_perspectives

** http://eurogenes.blogspot.com/2017/06/the-pigtailed-figures.html?showComment=1496922199462#c7884162881824309440

supernord said...

@Alberto "Ukraine Neolithic was SHG-like"
Wrong. They were not SHG-like; such a term is not applicable to describe a mixture of EHG and WHG. They had an increase in the amount of WHG compared to the Mesolithic.


"and only received EHG during the Eneolithic with an already admixed Yamnaya-like population."

EHG in Ukraine is present in large quantities in the Mesolithic.

Matt said...

@Rob, re: the discussion in this thread between you and Chad, as you know from past discussion, I was pretty interested in the idea of Iran_Chl admixture into the Armenia_EBA population (simply to tie in with general change in the Near East ME region as a single phenomenon).

But, arguing against it, would say I notice in the new West Eurasian Ancient 67 panel, in the higher dimensions than 1 and 2:

- Dimension 3 adds further distinction between CHG and Iran_N / Natufian / Levant_N Anatolian_N and shows some bending of particularly present day South Caucasian (Georgian / Abkhazian) towards Satsurblia and Kotias. North Caucasus also shows some of the same phenomenon (but seems reduced due to greater EHG affinity).

Armenian and Turkish populations who overlap in Dimension 1 and 2 with South Caucasus listed above do not overlap in this dimension and are more removed to the other, non-CHG end (which makes sense given their languages are from the north / east one way or another and relatively lower isolation means drawing in more ancestry from both Levant and Iran).

Steppe_EMBA samples also seem outbent towards CHG on this dimension (The West Eurasian type Karasuk and Mezhovskaya samples in the run don't seem to be).

- Dimension 4 where NW_Anatolia_N ancestry is distinguished from other Western farmer and Neolithic streams of ancestry (and where Western Europeans and Eastern Europeans are slightly more distinct from each other than PC2), there also seems to be a slight outbending of North Caucasus and Steppe_EMBA samples towards the "other Western farmer" end (where Levantine and Bedouin samples are at the other pole).

...

Overall, I guess I'm in agreement that the sampling of eastern Anatolia and early South Caucasus / Armenian plateau is too weak to reject ancestry from there in Yamnaya (the Armenia Chalc and EBA may just be rejected because they are too complex mixes that don't have enough freedom to be fit).

But also seems like there has to be at least some continuity of CHG in South Caucasus, and don't totally see any compelling reason in modern dna as represented there that a "CHG stronghold" should have been in North rather than South Caucasus (working in complete ignorance of archaeology).

Chad Rohlfsen said...

Armenia EBA and Iran ChL are the best admixing sources to go from Khvalynsk > Yamnaya by f3. Nothing weak or rejected about it. Northern Mesopotamians are the ancestors of Caucasus farmers. Once we get the kind of coverage across West Asia that we have in Europe, it'll be pretty clear.

Matt said...

Well, they're certainly rejected in Davidski's qpAdm models, whatever authoritative assertions we wish to make.

Chad Rohlfsen said...

I haven't looked at his, but I think that has to do with the set-up. I have no issues using them and get fine scores. No rejections.

Matt said...

I think that could be down to a different pright; he seems to be using a greater number of European Upper Paleolithic+Mesolithic samples plus Natufian, ancient Ethiopia, Tianyuan. No recent Mbuti, Onge, Karitiana. (Always appreciate your comments, but won't we get more out of this by comparing the methodology?)

Chad Rohlfsen said...

Tianyuan and Mota really don't matter here. It's having MA1, EHG, WHG, Iran, Levant, Anatolia, that will flesh out what we're looking for. I only use transversion sites too.

Rob said...

Matt
Yes I agree
Georgia was the stronghold of CHG. Glacial forests there survived the longest, and its local Neolithic is different to the Shulaveri group (which came from Halaf?)

Davidski said...

@Chad

Seems to me like your set up doesn't have enough power for this. You need to use the latest methods as described in literature.

Also, no idea why you say that Tianyuan doesn't matter, considering the variable East Eurasian-related input into ancient West Eurasians? Tianyuan, as the only ancient East Asian we have, is crucial.

And of course there's the issue of the lack of Mesopotamian/South Caspian diagnostic markers on the ancient steppe.

Chad Rohlfsen said...

Not enough power? I'm using the same stuff as you, so I don't follow here. Tianyuan is an early East Asian and not relevant. Karitiana will likely be closer to the ENA side of ANE anyway. Groups closer in time matter more here. Try a run with no farmers or mesolithic samples to see. Tianyuan changes nothing in the outcome, really. I've checked.

Also, there is only an H3a from Halaf and no Y-DNA, so let's not get ahead of ourselves and say it's not in the later steppes. Just wait for the samples to come.

Basil S said...

@Dave, Chad

In qpAdm, in terms of modelling a European BA, say Rathlin, on earlier BA populations in the pleft like BB, would including Yamnaya in the pright (along with the obvious necessary ancients) be recommendable in terms of fleshing out things phylogeny wise? Or would that be going too deep/risking too much shared drift between left and right pops?

Or including EEFs as well as Anatolia_Neolithic in the pright?

Davidski said...

@Chad

Not enough power? I'm using the same stuff as you, so I don't follow here.

You're not, otherwise you'd be getting the same results as me.

Also, there is only an H3a from Halaf and no Y-DNA, so let's not get ahead of ourselves and say it's not in the later steppes. Just wait for the samples to come.

There's not much difference between the mtDNA south of the Caucasus today and that of the Neolithic/Chalcolithic.

Ancient steppe mtDNA doesn't derive from this biogeographic zone.

Davidski said...

@Basil S

In qpAdm, in terms of modelling a European BA, say Rathlin, on earlier BA populations in the pleft like BB, would including Yamnaya in the pright (along with the obvious necessary ancients) be recommendable in terms of fleshing out things phylogeny wise? Or would that be going too deep/risking too much shared drift between left and right pops?

Or including EEFs as well as Anatolia_Neolithic in the pright?


Judging by the latest literature on qpAdm, that would be fine.

Chad Rohlfsen said...

There's no difference in our samples or SNPs. What is in your pright?

Davidski said...

@Chad

There's no difference in our samples or SNPs. What is in your pright?

All of the full qpAdm output is linked to above.

Anthro Survey said...

@Seinundzeit,

Somewhat belated, but here it is. I am going to preface this by highlighting three samples of interest: Eneolithic Samara 434, Yamnaya Samara I0357, and Y. Samara I0441 (somewhat less interesting). If you recall from one of Dave's posts, 434 is the Hg Q suspected Kelteminar migrant (or admixed individual).

So, the other day, I was modeling a set of modern Indo-Iranian speaking populations on Monte, namely: UP Brahmins, Gujarati A, Brahmin, Kshatriya, Afghan Pashtuns, Pamiris & Yaghnobis, Dardics, Persians, Ir. Zoroastrians, Lors and Mazandaranis. My input contained a consistent panel of samples: Yamnaya (all), Sintashtans, Andronovo, Iran_N, Iran_chl, a couple of ENA "adjusters", and ASI-rich (Pulliyars & Paniyars).

At a first go, virtually all the Indians just took I0357 and the Eneolithic Samara sample (434) for their steppe and no Sintashta/Andronovo. Kshatriya took 434, primarily. Kalash took mainly the Samara sample, but also I0441. Pashtuns just took I0357, as did Burusho. Iranians from Iran just took Andronovo and Sintashta, though.

Midway through this, I changed my trajectory and decided to do a preliminary assessment of how Iranian Persians relate to Mesopotamians and Arabs, choosing Iraqi_Jews, Arab Israel 1, Saudis, and Leb Muslims.

Guess what? I got back on track as two interesting things happened. The model SNATCHED Iraqi Jews AND opted for I0357 instead of Andronovo/Sintashta. The fit improved from 0.4% to 0.03%. Granted, it was an overfit as it opted to use >10 samples all of a sudden. After a bit of trimming, I was still left with 0.05% or so. One of my best fits(or overfits).
Arab_Israel1 was used, but it's unlikely to reflect Arab-related ancestry since Zoroastrians also took it. Same story with them: fit improved, snatched Iraqi_Jews and opted for I0357. Mazandaranis and Lors "fell into line" in the same fashion.
______________

I've played around with 434 in the past using a panel of various HGs. Unlike the other 2 Eneolithic Samara samples, it actually preferred a good chunk of Iran_N (in addition to CHGs) and the fit was still worse than for the other 2. IIRC, it also took extra ANE.

Spurred on with my Indo-Iranian results, I chose to examine all the Yamnaya_Samara samples, using a panel of HGs AND Eneolithic 434. Guess what? They mainly opted for CHGs and Euro HGs, while I0357 took a considerable slice of 434 and Iran_N. I0441 grabbed these, too, but to a lesser extent.
__________________________________________
This leads me to suspect two things:

1. The Samara Bend area may have experienced some exotic steppe-like influence (if projected on 2D PCA, that is) from further East or SE. Possibly some Iran_N+EHG+extra ANE population. I0357 and 434 could be hints of this. Haplogroup-wise? Maybe Q+J2b.

2. If the first Indo-Iranians were, in fact, R1a-z93 Corded Ware folk carrying some EEF, they hybridized with some of these possible Central Asian (?) populations en route to BMAC and/or India. In other words, Monte's choice of I0357 (and 434 in some) may reflect an imperfect composite tapping into both CWC and a mystery ancestral stream.

Can you look into these 3 samples (or at least I0357) using formal methods somehow?

Chad Rohlfsen said...

What sample(s) makes up WHG?

Davidski said...

Loschbour and La Brana.

Lee Albee said...

@Davidski

I am interested in how you selected your Right versus left populations in this analysis.

Your right populations have groups that could, theoretically, have mixed with your left populations.

Looking at the documentation for the qpAdm program it states:

"Caveat...
1) It is important to realize that the answers are invalid if there has
been post admixture gene-flow between left and right populations."

How sure are you that this analysis is using the appropriate right population? How do you determine that for this work?

Sincerely,
Lee

Davidski said...

@Lee Albee

The qpAdm documentation is no longer current in these aspects. You need to refer to the latest literature to see how its use has developed. For instance, page 26 here...

https://media.nature.com/original/nature-assets/nature/journal/v548/n7666/extref/nature23310-s1.pdf

Lee Albee said...

@Davidski

Thanks..

So you're basing the suitability of your groups on the statistics, then? The tail probability and the error level of the jackknife mean?

Lee

Davidski said...

@Lee Albee

No, I'm just packing the right pops with as many genetically diverse ancient populations and individuals as I can, while at the same time ensuring that my analyses are each based on at least 100K SNPs, so that I have as much discriminatory power as possible.

And then I look at the output, mainly the taildiff, to see how the models perform. If I'm seeing clear patterns that make sense in terms of biogeographical affinities, with, say, most groups being clearly discriminated against relative to a few that are obviously working well, then I'm happy.

By the way, you can e-mail Nick and Iosif about this sort of stuff, especially the more technical aspects. If your questions are legit they'll reply. And if you do find out anything new and useful, feel free to share it here.
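The "look at the taildiff" step described above can be sketched in a few lines of Python. This is only an illustration, not Davidski's actual workflow: the three models and their tail probabilities are copied from the list at the top of the post, while the `rank_models` helper and the 0.05 cutoff are assumptions made for the example.

```python
# Tail probabilities copied from the first three models listed in the post.
models = {
    "CHG + EHG + Koros_HG": 0.322245651,
    "CHG + EHG + Blatterhole_MN": 0.465394061,
    "CHG + EHG + Germany_MN": 0.321017025,
}

# Assumption: a conventional rejection cutoff; the thread does not state one.
ALPHA = 0.05


def rank_models(taildiffs, alpha=ALPHA):
    """Sort models best to worst by taildiff and flag those below the cutoff.

    Returns a list of (model name, taildiff, passes_cutoff) tuples.
    """
    ranked = sorted(taildiffs.items(), key=lambda item: item[1], reverse=True)
    return [(name, p, p >= alpha) for name, p in ranked]


for name, p, passes in rank_models(models):
    status = "ok" if passes else "rejected"
    print(f"{name:28s} {p:.9f} {status}")
```

A ranking like this only says which models fit relatively better. Whether the pattern makes biogeographical sense still has to be judged by eye, as described above.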

Matt said...

@Lee, I believe in the latest papers and generally (Lazaridis 2017 supplement is good to look at) a lot of consideration is given to using qpWave to prune down to the minimal number of necessary populations in the pright (e.g. if you have all of La Brana, El Miron, GoyetQ-116-1, Villabruna in the pright, and the qpWave rank is only 2, then there is some way in which you can simplify down to needing only 2).

That said, even in that paper they abandon this for "All" sets including various outgroups, on the following basis: "Adding these later populations has one disadvantage: if populations A and B are both included in the larger set and are composed of the same ancestral elements in similar proportions then A may be modeled as deriving most of its ancestry from B and vice versa. This does not clarify the ancestral origins of either population. However, this approach also has the advantage of identifying mixture when the admixing populations are themselves complex. For example, if a population A is a mix of B and C, and B and C are themselves 2- or 3-way mixtures, then this approach might identify a simpler mix in the origin of A than would be possible if B and C were not considered as source populations."

We should probably not think of qpAdm as actually less problematic than ADMIXTURE or PCA in regard to the problem of the pright being arbitrary. Using qpWave there is some degree of testing for redundancy, and whether the pright are even distinguished by multiple streams of ancestry, but it ultimately seems like a somewhat arbitrary choice of selecting populations which are believed to be able to distinguish the pleft in formal stats.

The other new approach in the Lazaridis paper (which is not yet part of ADMIXTOOLS, I think) is the simulation approach: directly simulating mixes of n populations, then running f4 of the form f4(real, simulated; X, Y) for various X, Y in a pright. The advantage of that is that the results are directly understandable in terms of comparison to f4(real, real; X, Y) and the f4 Z test for significance.

As well, with the simulation approach, you can run f4(simulated1, simulated2; real, outgroup), so that in the event that two simulations get all the outgroup relationships right, but actually either or both is not very close to the real population, then you could detect that (qpAdm can't really do this at all).

But still this does not get you away from arbitrary elements in the pright choice.
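To make the simulation idea a bit more concrete, here is a minimal Python sketch on purely synthetic allele frequencies. It is not the Lazaridis et al. implementation and not part of ADMIXTOOLS: the function names, the toy populations and the unweighted block jackknife are all assumptions made for illustration. A simulated mixture with roughly the right proportions should give f4(real, simulated; X, Y) close to zero, while wrong proportions should give a large Z-score.

```python
import numpy as np


def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d) on allele frequencies."""
    return float(np.mean((a - b) * (c - d)))


def simulate_mixture(sources, weights):
    """Expected allele frequencies of a mix of `sources` in proportions `weights`."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    return weights @ np.asarray(sources)


def f4_z(a, b, c, d, n_blocks=50):
    """Z-score of f4 via a simple, unweighted delete-one block jackknife.

    Simplification: blocks are equal-sized runs of SNPs, not genomic windows.
    """
    prod = (a - b) * (c - d)
    total, n = prod.sum(), prod.size
    blocks = np.array_split(prod, n_blocks)
    # f4 re-estimated with each block left out in turn
    loo = np.array([(total - blk.sum()) / (n - blk.size) for blk in blocks])
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return float(prod.mean() / se)


# Synthetic toy data (hypothetical populations, not real genotypes).
rng = np.random.default_rng(0)
n_snps = 20_000
src1 = rng.uniform(0.05, 0.95, n_snps)
src2 = rng.uniform(0.05, 0.95, n_snps)
# Right populations X and Y share drift with src1 and src2 respectively.
x = np.clip(src1 + rng.normal(0, 0.02, n_snps), 0, 1)
y = np.clip(src2 + rng.normal(0, 0.02, n_snps), 0, 1)
# The "real" target is a 70/30 mix plus a little sampling noise.
real = np.clip(simulate_mixture([src1, src2], [0.7, 0.3])
               + rng.normal(0, 0.01, n_snps), 0, 1)

z_good = f4_z(real, simulate_mixture([src1, src2], [0.7, 0.3]), x, y)
z_bad = f4_z(real, simulate_mixture([src1, src2], [0.2, 0.8]), x, y)
print(f"Z with 70/30 simulation: {z_good:.2f}")
print(f"Z with 20/80 simulation: {z_bad:.2f}")
```

The same `f4` helper can be called as `f4(simulated1, simulated2, real, outgroup)` for the second kind of test mentioned above. Which X and Y go into the right set is still up to the user, so the arbitrariness does not go away.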