
Friday, January 13, 2017

qpAdm tour of Europe: Mesolithic to Neolithic transition

For a while now I've been trying to work out a way to model present-day Europeans with qpAdm as a mixture of Neolithic and Mesolithic populations. It hasn't been easy, because often what works for some Europeans doesn't work for others. But I've finally figured it out.

The trick is to account for Siberian, East Asian and Sub-Saharan ancestry, by including the Nganasan from Siberia, Onge from the Andaman Islands, and Yoruba from West Africa, respectively, as reference pops.

Below is the spreadsheet with the results and outgroups. Judging by the chisq and tail prob, most of the fits aren't spectacular, but as far as I can tell, they work.
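For anyone unfamiliar with the two fit columns: the tail prob is the upper-tail probability of the chisq value under a chi-square distribution with the run's degrees of freedom, which qpAdm reports alongside it. Here's a minimal Python sketch of that relationship using only the standard library. The function name is made up for illustration, and this is not qpAdm's own code.

```python
import math


def chisq_tail_prob(chisq: float, dof: int) -> float:
    """Upper-tail probability P(X >= chisq) for a chi-square variable with `dof` degrees of freedom."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if chisq <= 0:
        return 1.0
    z = chisq / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-z) times a finite sum of z**i / i!
        term = 1.0
        total = 1.0
        for i in range(1, dof // 2):
            term *= z / i
            total += term
        return math.exp(-z) * total
    # Odd dof: start from erfc(sqrt(z)) and add the half-integer terms.
    total = math.erfc(math.sqrt(z))
    for i in range(1, (dof - 1) // 2 + 1):
        total += math.exp(-z) * z ** (i - 0.5) / math.gamma(i + 0.5)
    return total
```

With invented numbers, `chisq_tail_prob(20.0, 9)` comes out much smaller than `chisq_tail_prob(8.0, 9)`: the higher the chisq for a given number of degrees of freedom, the lower the tail prob and the worse the fit.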

Moreover, in the entire analysis not a single standard error reached 2%. Based on my experience with qpAdm, that's a remarkable thing for such a complex analysis, and I think it suggests that the reference populations are relevant.

Interestingly, while, as expected, the Nganasan-related admixture peaks in far Northern Europe, the Onge-related ancestry is, perhaps surprisingly, most pronounced in Southern Europe. Any ideas why? My thoughts on that here.

Update 14/01/2017: If you guys want to reproduce my analysis, but you don't have the same dataset, which is more than likely, you should be able to get very similar results using the full Human Origins dataset. Try these reference pops and outgroups.

Caucasus_HG (Kotias)
Eastern_HG (Karelia_HG x2, Samara_HG)
Western_HG (Hungary_HG, Iberia_HG, Loschbour)
Yoruba (from the HGDP)

Mbuti (Mbuti.DG x3 from Fu et al.)

If you're seeing "infeasible", then remove the redundant reference population that might be causing problems, usually Yoruba or Nganasan, and run again.

If it's still not working, then maybe your dataset is just too different in some way, perhaps with not enough markers (there should be around 200K SNPs available for these runs).
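As a rough sketch of how such a run might be set up, the Python helper below writes the "left" list (target first, then reference pops), the "right" list (outgroups) and a parameter file using the standard ADMIXTOOLS keys. The function names, file names and dataset prefix are hypothetical, and the population labels have to match whatever is in your own .ind file. The second function just lists the fallback reference sets to try when a run comes back "infeasible".

```python
from pathlib import Path


def write_qpadm_inputs(outdir, prefix, target, references, outgroups):
    """Write left/right population lists and a parameter file for one qpAdm run."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    left = outdir / "left.txt"
    right = outdir / "right.txt"
    par = outdir / "qpadm.par"
    # qpAdm treats the first "left" population as the target and the rest as sources.
    left.write_text("\n".join([target, *references]) + "\n")
    right.write_text("\n".join(outgroups) + "\n")
    par.write_text(
        f"genotypename: {prefix}.geno\n"
        f"snpname: {prefix}.snp\n"
        f"indivname: {prefix}.ind\n"
        f"popleft: {left}\n"
        f"popright: {right}\n"
        "details: YES\n"
    )
    return par


def fallback_reference_sets(references, suspects=("Yoruba", "Nganasan")):
    """Yield the full reference list, then copies with one suspect population dropped."""
    yield list(references)
    for name in suspects:
        if name in references:
            yield [r for r in references if r != name]
```

Each parameter file can then be passed to the program with `qpAdm -p qpadm.par`. Take the complete outgroup list from the spreadsheet; a single outgroup won't be enough for a real run.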

See also...

Ancient ancestry proportions in present-day Europeans


Samuel Andrews said...

Makes sense. These are basically the same as the results we've seen in somewhat recent (now old) ancient DNA papers and with your D-stats. The Onge stuff is the only thing we've never seen.

I think Europeans may be scoring Onge because...
>Mostly because it's a neutralizer. If affinity to the outgroups is slightly too high using only Mesolithic/Neolithic genomes, Onge is used to lower/neutralize it.
>Also because Southern Europeans have Iran Neolithic and Levant Neolithic ancestry. Those two ancients were slightly more distant to the outgroups, at least according to F4 and D-stats, than their cousins CHG and Barcin Neolithic were.

jv said...

Thank you! I need a course on some of these terms. However, I do see that Yamnaya/Samara is rich in EHG & CHG (mtDNA H13 is CHG, but I believe my mtDNA H6a1 is EHG or Central Asian in origin). I would like to know if my maternal Yamnaya lineage is EHG or CHG, and if the women survived the LGM in the PC Steppes or south PC Steppes. Look forward to the Kurgan qpAdm. My mtDNA matches in approx 700 AD are in Croatia (Vucedol is proto-Bell Beaker and my lineage is Corded Ware Culture), Serbia, Macedonia & I guess she followed her R1b men into the Balkans at some point between 3000 BC and 700 AD...jv

Anonymous said...

Look at that difference in Caucasus_HG between German CW and German Bell Beaker.

MaxT said...

I'll just wait for ancient genomes from various parts of Asia to make sense of it all. It's interesting either way.

Shaikorth said...


Onge has preferences towards certain outgroups used here (East > West > Rest) so it shouldn't be a neutralizer. But if reducing outgroups allows adding Iran_N or Levant_N the idea could be tested.

Anonymous said...

Presuming that Onge is indeed a proxy for any East Asian admixture, one might look into historical events: Russia has historically had relatively high contact with Asian populations such as Turkic peoples and Mongolians.

The southern affinity is in the Balkans and Italy. Huns, perhaps? Sicily has a relatively high Y-DNA Q according to a map from Eupedia, for what that is worth.

Alberto said...

Thanks, I was looking forward to seeing something like this.

The numbers are quite in line with what we get with D-stats based nMonte models, loading much more on the Barcin_Neolithic/EHG axis than in the WHG/CHG one. I expected that qpAdm could produce different results by being able to use other outgroups, instead of -quite specially- WHG and CHG, but here I see that Villabruna and Satsurblia are included in the outgroups, which seems strange for a qpAdm setup (?) (because of WHG and CHG already in the left pops).

I wonder if using something like Motala and Armenia_EBA instead of Villabruna and Satsurblia would make any significant difference?

Slumbery said...

Huns in Sicily? Anyway, as far as we know the origin of the Huns, their heritage should not be more Onge than Nganasan in this test, IMHO.
I have my doubts about the putative correspondence between Huns and YHg Q in Europe too.

Unknown said...

David, I think it's due to a deeper BE source. Adding Levantine Neolithic and removing Onge should work.

Ryan said...

"Interestingly, while, as expected, the Nganasan-related admixture peaks in far Northern Europe, the Onge-related ancestry is, perhaps surprisingly, most pronounced in Southern Europe."

Could you add a couple of North African samples to see where the bounds of this are? And a Druze sample?

And maybe Villabruna if that's feasible?

My wild ass guess would be it's some weird basal Eurasian signal (maybe basal Eurasians weren't completely basal), but it would be nice to just test the boundaries of it, and check if it's some sort of Upper Paleolithic Balkans thing.

Mark Moore (Moderator) said...

I am with Chad and Ryan on this one. The populations with more BE also show more Onge. That is, the BE that is predominantly in southern Europe looks more like Onge than the other things you measured it against.

Matt said...

Running the values in MDS mode it looks roughly the right shape half the time:

and half the time comes out with CHG in the "wrong" direction:

In this model CHG Adding more Caucasus people (high CHG combined with Levant / Barcin) might help to "anchor" the CHG component more firmly in the "right" position.

(PCA gives you something like this in its first 2 dimensions - and rotated again perhaps because of the difficulty of placing CHG and maybe of stretching some populations likely to hold with Levant ancestry towards Barcin. WHG does look pretty distant but that's mitigated in the next dimension up).

I don't know about whether the values of the non-West Eurasian populations are correct. They do seem like they have a loose inverse isolation by distance correlation with lower total percentages of non-West Eurasian with distance from the edges of Europe (e.g. Russian North - 16%, Russia West 7% including peak Onge in the set, Greek 7%, Ukrainian West 5%, Polish and Sardinian 4.6%, Basque 3.9%, English 3.7%) and with open vs closed populations (e.g. Lithuanian 4%). They look too high compared to ADMIXTURE.

David: Do you think this set of outgroups would work OK with Levant_N, Iran_N in place of Barcin and CHG (replacing Levant_N as an outgroup with Natufian)?

Anonymous said...


Yes, that would require some creative looking at history. I admit that. Any other suggestion? Sicily is known for its relatively high Arab admixture.

Anonymous said...

@Mark Moore

"The populations with more BE also show more Onge."

But Sardinians show very low levels.

Shaikorth said...

Those groups would leave no Anatolia-N types. Has something like that even been tried to fit a full European set?

Eastern ancestry levels might look more "normal" with Dai replacing Nganasan, given that the papers always model the latter easily with SE-Asian and ANE-EHG types.

Matt said...

Re: PCA again, taking these values and converting so:
Barcin Neolithic = 28% Basal + 72% WHG
CHG = 32% Basal + 34% WHG + 34% ANE
EHG = 50% WHG + 50% ANE

which IIRC is an OK match to Basal Eurasian estimates, generates a PCA like:


(That makes French come out 18.2% Basal; changing the proportions to Barcin 23% Basal and CHG 25% Basal fixes the French to the 14.8% Basal from the latest Lipson paper, and gives the same PCA and closeup.
Other proportions of Basal there: Albanian: 17.6%, Italian East Sicilian: 18%, Saami: 7.2%, Yamnaya: 10.5%, English: 13.6%, Polish: 13%, Lithuanian 11.9%, Corded Ware: 12.2%, Basque 14.8%. You could use other slightly different proportions of Basal in Barcin and CHG potentially).
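For anyone who wants to play with this conversion, here's a minimal Python sketch of the arithmetic: each proximate ancestry proportion is multiplied by the assumed deep make-up of its source and the results are summed. The Barcin, CHG and EHG splits are the ones quoted in the comment. Treating Western_HG as 100% WHG and passing any other reference (Nganasan, Onge, Yoruba) through unchanged are assumptions made here for illustration, as are the function and variable names.

```python
# Assumed deep-ancestry make-up of each proximate source.
DEEP = {
    "Barcin_Neolithic": {"Basal": 0.28, "WHG": 0.72},
    "Caucasus_HG": {"Basal": 0.32, "WHG": 0.34, "ANE": 0.34},
    "Eastern_HG": {"WHG": 0.50, "ANE": 0.50},
    "Western_HG": {"WHG": 1.00},  # assumption: not stated in the comment
}


def to_deep(proportions, deep=DEEP):
    """Convert proximate ancestry proportions into deep components by weighted sum."""
    out = {}
    for source, share in proportions.items():
        # Sources without a listed breakdown are carried over under their own name.
        for component, fraction in deep.get(source, {source: 1.0}).items():
            out[component] = out.get(component, 0.0) + share * fraction
    return out
```

Feeding in a population's qpAdm proportions from the spreadsheet as a dict, for example `to_deep({"Barcin_Neolithic": 0.5, "Eastern_HG": 0.5})` for a made-up 50/50 mix, returns its Basal, WHG and ANE shares. Swapping in the alternative 23% and 25% Basal figures only means editing the `DEEP` table.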

Samuel Andrews said...

"David, I think it's due to a deeper BE source. Adding Levantine Neolithic and removing Onge should work."

That'll replace some Barcin_Neolithic with Levant_Neolithic in all Europeans and it'll also raise WHG to unrealistic levels. I think the best method would be to make Natufian an outgroup and LevantNeolithic an ancestor. Then Levant Neolithic won't be confused with Barcin Neolithic.

Unknown said...

That's what I meant. He knows he'd need Natufians as an outgroup.

Unknown said...

I'll work on this tonight too.

Rob said...

@ All

Could Onge in Southern Europe be something remaining from the Palaeolithic?
I believe that prehistoric North Africa will have a large chunk of Natufian ancestry, and some might have made its way to Southern Europe too.
Dave or Chad, could you check it against La Brana, Oetzi or one of the Remedellos?

Anonymous said...


Both German Corded Ware (0.008) and German Bell Beaker (0.005) have substantially less Onge than current day Germans (0.026).

Grey said...


"Both German Corder Ware (0.008) and German Bell Beaker (0.005) have substantially less Onge than current day Germans (0.026)."

later southern mixture - connected to the mtdna shift maybe?

Davidski said...


"I see that Villabruna and Satsurblia are included in the outgroups, which seems strange for a qpAdm setup (?) (because of WHG and CHG already in the left pops)."

"I wonder if using something like Motala and Armenia_EBA instead of Villabruna and Satsurblia would make any significant difference?"

The relationship between Satsurblia and Kotias, the Caucasus_HG reference, is much older than the arrival of Caucasus_HG admixture in Europe, so as far as I know, my setup is fine.

Same thing with Villabruna; it's much older than the three Western_HG references and their relationship with modern Europeans.

Note that Lazaridis et al. used Switzerland_HG (Bichon) in the outgroups and Western_HG as a reference. They were following the same line of thinking.


"In this model CHG Adding more Caucasus people (high CHG combined with Levant / Barcin) might help to "anchor" the CHG component more firmly in the "right" position."

This model is basically rejected for populations from outside of Europe, so it's not doable.

"I don't know about whether the values of the non-West Eurasian populations are correct."

"Do you think this set of outgroups would work OK with Levant_N, Iran_N in place of Barcin and CHG (replacing Levant_N as an outgroup with Natufian)?"

I think the problem with this test is the lack of enough samples (for instance, Russian_West is based on just two people). I reckon to get more stable results, I'd need to have much larger numbers of both reference and test individuals.

And I don't think Levant_N or Iran_N are realistic Neolithic references for Europeans. Their addition to this model will drive up the levels of Eastern_HG and Western_HG to unrealistic levels, but the aim of this test is to get as close as possible to the real ancestry proportions in Europeans from the most relevant Mesolithic/Neolithic sources.

I'm not really bothered by the sometimes wayward Onge and/or Nganasan percentages, because even if they're off by 1-2%, they help to get the other ancestry proportions correct.

I suspect that in this test, Onge is used to correct for minor Iran Neolithic gene flow that can't be explained by Caucasus_HG. That's why it generally peaks in Southern Europe. But for some parts of Europe it might have more to do with events during the Iron Age.

Seinundzeit said...

This is a very good analysis, solid stuff. As far as I can see, everything makes good sense.

And honestly, I don't find the Onge percentages to be a surprise. Rather, I think qpAdm has a tendency to show very strong estimates of Onge-related admixture. This is a pattern I've noticed from the beginning.

I mean, if qpAdm can construe the Kalash as 20% Onge, it is no surprise that Russian_West get 6% Onge, while West_Sicilians get 5% Onge, and Albanians get 4%.

In analyses where these Europeans are at 0% Onge, people like the Kalash + Pashtuns tend to be around 10%-15% Onge (and closer to 10%, although some Pashtun tribal people are more around 5%), and the Pamiri Tajiks tend to be around 10%-0% (Ishkashimi speakers usually have around 10%, and the Rushan speakers tend to be closer to around 0%-1%, with the other Pamiri peoples in between).

I think the intense Onge percentages for Europeans and South Central Asians (with qpAdm) might track some sort of ancestry for which we don't have any proper aDNA. Who knows really. At the end of the day, we need more aDNA, especially from Central Asia.

Unknown said...

Onge is something that chunks get dumped into when there's no basal enough reference in ADMIXTURE. I'll test if it's the same here. I'll do it later this evening, once my new set with extra SE Asians and Berbers is ready. I have over 4300 samples right now.

Davidski said...

But if Barcin_Neolithic and Caucasus_HG aren't basal enough for Europeans in this model, then why are they all scoring Eastern_HG and at least a few per cent of Western_HG, even the Southern Euros?

Rather, the Onge stuff looks like an eastern shift not explained well enough by Caucasus_HG, Eastern_HG and Nganasan.

Rob said...

OT. Interesting ?

Davidski said...

It is interesting. But it'd be even more interesting if they developed something like that for more distant genealogical connections, like over a few thousand years.

By the way, they mention Srubnaya outlier I0354 as a potential migrant to the Volga-Ural region who produced offspring with the local male I0360. See page 2.

Rob said...

Yes, the Srubnaya outlier; looks like a pre-Andronovo Central Asian.

Unknown said...

Probably from the forest zone, rather than SE.

Davidski said...

From just east of the Urals IMO. One of the Khvalynsk guys, the one with Y-DNA Q, is similar to the Srubnaya outlier. It looks like there was a population like that very close by in Central Asia.

Rob said...

Yes I agree.

Srubnaya Outlier

Karelia_HG 49.5 %
Iran_Neolithic:I1945 21.3 %
AfontovaGora3:I9050.damage 20.2 %
Kotias:KK1 8.6 %
Dai 0.4 %
Hungary_HG:I1507 0 %
Mentese_Neolithic:I0723 0 %
Samara_HG:I0124 0 %
Levant_Neolithic:I1699 0 %
Iran_Hotu 0 %
Yoruba 0 %
Paniya 0 %

-Versus- 'normal' Srubnaya's

Kotias:KK1 30.7 %
EHG 29.5 %
Hungary_HG:I1507 23.5 %
Mentese_Neolithic:I0723 16.2 %
Iran_Hotu 0.1 %
AfontovaGora3:I9050.damage 0 %
Iran_Neolithic:I1945 0 %
Levant_Neolithic:I1699 0 %

Davidski said...

Srubnaya outlier I0354 can't be modeled using exactly the same setup as in my analysis above. The model is basically rejected by a high chisq and very low tail prob.

Rob said...

As per your qpAdm tour? Missing source pops?

Davidski said...

Yeah, Eastern_HG and Caucasus_HG clearly don't have enough ANE to be the only ANE sources for Srubnaya outlier. I would need to add AG3 to the reference list, and I'm using it as an outgroup along with MA1. Also, maybe I'd need to replace Caucasus_HG with Iran Neolithic.

Rob said...

Indeed. Even when offered Yamnaya, Srub_Out doesn't choose it:

Karelia_HG 49.6 %
Iran_Neolithic:I1945 21.35 %
AfontovaGora3:I9050.damage 20.15 %
Kotias:KK1 8.55 %
Dai 0.35 %

Yamnaya_Samara:I0429 61.9 %
LBK_EN 22.1 %
Villabruna:I9030 8.7 %
Samara_HG:I0124 7.05 %
Baalberge_MN 0.25 %
AfontovaGora3:I9050.damage 0 %

I'm not sure if too much can be made of it, but (a) every time, the Srubnaya outlier chooses Karelia_HG over Samara_HG; and (b) it is the 'EMBA steppe' individual of choice for the large majority of South Asians. Of course, the latter could just be due to the elevated ANE of the Srubnaya outlier, but I think it's probably more than nMonte pragmatism.

Garvan said...

Can somebody recommend a reference for qpAdm that would explain how it works, particularly the selection of out-groups?


Davidski said...


Have a look at the supp info PDF to Lazaridis et al. 2016.

Seinundzeit said...

David and Rob,

I completely agree. I think qpAdm is quite liberal with the Onge percentages, for both Europeans and South Central Asians, mainly because we might be missing a crucial piece of the puzzle.

The same pattern appeared for Iranian_Mazandarani. They showed a very striking excess of Onge admixture, compared to other Iranians, despite being from the northern portion of that country.

Basically, the Onge-related percentages in Europe, West Asia, and South Central Asia are often wholly or partially reflective of some ancient Central Asian genetic influence, and not actual ENA ancestry connected to Andamanese people.

In fact, I'd say that qpAdm also inflates Onge-related admixture for peninsular South Asians, due to the same reasons that are operative in the case of Europeans/West Asians/South Central Asians.

Regardless, I do think we'll figure it all out, once we see some aDNA from Central Asia.

And just to add another note for other folks, I think the "more Onge = more Basal Eurasian-admixed" explanation is quite untenable, if we look at the Sardinian result. Basically, I'd imagine that Sardinians are one of the most highly Basal Eurasian-admixed of all contemporary European populations.

Yet, they have one of the lowest Onge values in Europe. Instead, Onge admixture peaks in Russian_West, at 6%. I highly doubt that Russian_West have more Basal Eurasian admixture compared to Sardinians.

Anyway, on a different note, I think it's now quite clear that the Srubnaya outlier is an exceedingly valuable sample.

This is exactly what I meant, when I claimed that she didn't emerge out of thin air; there must have been whole populations like her (perhaps on the steppe, or perhaps in/around Central Asia).

Honestly, I'm thinking that she might be the closest sample we currently have to ancient Central Asian peoples.

So, exploring her genetic affinities, seeing how she compares to other steppe samples, examining the sort of proportions she provides for contemporary populations, etc, should now prove to be of great interest.

Davidski said...

After some discussions with Chad, if anyone wants to reproduce my results, but you don't have the same dataset as me, which is very likely, you should be able to get very similar results using the full Human Origins dataset. Or even not full, if you have the Onge from somewhere else.

Use these reference pops and outgroups.

Caucasus_HG (Kotias)
Eastern_HG (Karelia_HG x2, Samara_HG)
Western_HG (Hungary_HG, Iberia_HG, Loschbour)
Yoruba (from the HGDP)

Mbuti (Mbuti.DG x3 from Fu et al.)

You should get very similar results to me. If you're seeing "infeasible", then remove the redundant reference that might be causing problems, usually Yoruba or Nganasan, and run again.

If it's not working, then maybe the dataset is just too different, perhaps with not enough markers (there should be around 200K available for these runs).

Shaikorth said...

Sein, if you remember the ANE preference test, the ANE in Kalash and Tajiks (as well as IA Scythian) preferred Srubnaya Outlier over Karelia_HG unlike BA steppe samples, and since Karelia HG can't proxy Srubnaya outlier well due to too much ANE, that might suggest a lower than expected European steppe contribution in them. That is, if Srubnaya Outlier is a pre-IE Central Asian.

How about a modified repeat of that test, add some modern Europeans to see if it's just the most recent ANE being universally preferred.

Bichon Euro_EN CHG Iran_N Andamanese Dai Karelia_HG Srubnaya_Outlier AG3/MA1

Based on the phylogenetic suggestions in Reich's latest paper it's tempting to suggest Kostenki14 in Bichon's place, but it's probably too old to work well with PCA.

Matt said...

Davidski: And I don't think Levant_N or Iran_N are realistic Neolithic references for Europeans. Their addition to this model will drive up the levels of Eastern_HG and Western_HG to unrealistic levels, but the aim of this test is to get as close as possible to the real ancestry proportions in Europeans from the most relevant Mesolithic/Neolithic sources.

Fair enough, it depends on whether you want to go deeper back into the composition of CHG and Barcin_N (as in the Laz 2016 supplement) or leave it at the 3 or 4 most proximate big clades likely to account for most of the ancestry of recent Europeans.

Balaji said...

There is a possibility that the Onge element in Southern Europe is real. The Admixture figure in the Haak 2015 paper has a component at K=6 which peaks in Papuans and which is also found in small amounts in Southern Europeans but not in Northern Europeans. Also Davidski calculated the following f3 statistics.

f3(French;LBK_EN,Papuan) = -0.002265 z = -3.548
f3(Spanish;LBK_EN,Papuan) = -0.003703 z = -6.239

This suggests some ENA admixture in these European populations. It is possible that there was a southern route for the introduction of Indo-European languages into Southern Europe which brought along a small amount of ASI. The northern route via the steppe brought in less ASI to Northern Europe.
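To make the logic of these statistics concrete: f3(C; A, B) averages (c - a)(c - b) over SNPs, where a, b and c are allele frequencies, and it goes negative when C's frequencies tend to sit between those of A and B, as expected for a mixture. Below is a simplified Python sketch, assuming plain frequency lists and equal-sized contiguous blocks for the jackknife. It is not the ADMIXTOOLS implementation, which among other things corrects for the target's sample size and defines blocks by genetic distance, and the function names are made up.

```python
import math


def f3(target, source1, source2):
    """Naive f3(target; source1, source2): mean of (c - a)(c - b) over SNPs."""
    n = len(target)
    if n == 0 or len(source1) != n or len(source2) != n:
        raise ValueError("need equal-length, non-empty frequency lists")
    return sum((c - a) * (c - b) for c, a, b in zip(target, source1, source2)) / n


def f3_zscore(target, source1, source2, n_blocks=10):
    """Z-score of f3 from a delete-one-block jackknife over contiguous SNP blocks."""
    n = len(target)
    if not 2 <= n_blocks <= n:
        raise ValueError("n_blocks must be between 2 and the number of SNPs")
    full = f3(target, source1, source2)
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    leave_one_out = []
    for lo, hi in zip(bounds, bounds[1:]):
        # Recompute f3 with SNPs lo..hi-1 removed.
        t = list(target[:lo]) + list(target[hi:])
        s1 = list(source1[:lo]) + list(source1[hi:])
        s2 = list(source2[:lo]) + list(source2[hi:])
        leave_one_out.append(f3(t, s1, s2))
    mean = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((x - mean) ** 2 for x in leave_one_out)
    return full / math.sqrt(var) if var > 0 else float("nan")
```

A Z-score below about -3 is the usual rule of thumb for calling a negative f3 significant, which is why the two values quoted above are read as evidence of admixture.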

Unknown said...

Haven't weird edges from Mbuti to Papuan and also Stuttgart to Mbuti or vice versa been observed using Treemix? Could this be relevant?

Unknown said...

Also Mbuti to LaBrana.

Matt said...

Tangent: Not totally on topic, since this is all about the proximate populations of the Mesolithic and Neolithic, but I was thinking about the Lipson and Reich paper and find it hard to see how they could've got a fraction of 23.2% East Eurasian for French in very deep modelling, when they get 17% as the high bound into MA1.

If you take these and assume 17% EHG + WHG (since outgroup D stats for these with MA1 and ENA are null), and 12% Barcin Neolithic and 11% CHG (taking into account their Basal Eurasian) are East Eurasian, then you would get 14.9% East Eurasian in French. (Europeans who don't have recent above 1% slice of Nganasan go from 16.1% to 12.8%).

You'd need 29%, essentially a third, into WHG+EHG and 22% and 18% in Barcin Neolithic and CHG as East Eurasian to get the 23.2% figure they get for French. (Aforementioned lower Nganasan Europeans go from 25% to 21%).
If you take the low bound in their data of 11% East Eurasian in MA1, then completing the above again, it works out to 10.2% in French. (Lower Nganasan subset of Europeans from 11.4% to 8.6%).

Reich and Lipson did say that the proportions aren't likely to be exact, but I did still think it was worth doing a quick spot check calculation.

Shaikorth said...

It's annoying that the graph with French wasn't shown in the paper. Clearly modern Europeans have a relationship to the other populations which fits the model, even if outgroup D-stats may not indicate it. After all ANE-Onge fits for Native Americans with over 60% Onge work using qpAdm even though direct D-stats imply Karitiana should be closer to ANE than Onge. Bidirectional geneflow between ANE/WHG and EE would end up driving down the ENA in Europe.

Matt said...

Well, for the Onge vs ANE thing, I would say I don't think direct D-stats like that are ever really great for estimating a percentage of ancestry, like, e.g. an equal D-stat means a population has an equal ratio.

If you take D(WHG1,Outgroup1)(WHG2,Outgroup2) it's invariably a much stronger stat than D(EHG1,Outgroup1)(EHG2,Outgroup2).

So from that even a population with equal amounts of both would not be intermediate (at 0) on D(EHG,WHG)(Test,Outgroup), because WHG ancestry gives a stronger relatedness signal to other WHG than EHG to other EHG.

(Plus WHG ancestry through the Barcin_N channel).

Possibly why qpAdm works on correlations and not covariance for that reason.
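For reference, the D-statistic being discussed can be sketched from allele frequencies as a normalised sum of products of frequency differences, which is why its size reflects how much drift the populations share and not just mixture fractions. The snippet below is a minimal Python illustration of the standard frequency-based formula, with a made-up function name and plain lists as input. It leaves out the block jackknife that real tools use to attach a Z-score.

```python
def d_stat(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies of four populations."""
    if not (len(w) == len(x) == len(y) == len(z)) or not w:
        raise ValueError("need four equal-length, non-empty frequency lists")
    # Numerator: covariance-like sum of frequency differences.
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    # Denominator: normalisation that keeps D between -1 and 1.
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d) for a, b, c, d in zip(w, x, y, z))
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den
```

A positive value means W shares more alleles with Y (and X with Z), a negative value the reverse, and a value near zero means the two pairs are symmetrically related.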

For the model in the paper though, it's just particularly difficult to see how French could end up with any more than MA-1's high bound estimates (or even exactly the same).

The WHG should only have as much East Asian as MA-1 assuming their model of splits, because the WHG has an equal stat and is otherwise cladistically equivalent, and the Barcin / CHG seem to be 60-75% pretty much WHG / MA1 derived with no extra East Asian stuff with the rest being Basal. (CHG might have a little ENA? But the French don't have that much of CHG).

Maybe this is another peculiarity of ADMIXTUREGRAPH and uncertainties / degrees of freedom in fits (like the Haak models for K14 I couldn't understand).

Shaikorth said...

WHG fits in a "similar fashion" but Kostenki14 has a very long drift path from the WE node. If Loschbour's non-ENA shares a longer drift path with K14 than MA1 (Fu et al. suggests this is plausible) it could need more ENA to make it fit considering the relatively similar amount of drift shared with eastern pops...

Given how many overlapping SNPs they're getting in the set I assume they use the 16.1x Kostenki14 from Fu et al. instead of the 2.8x (shotgun) sequence used in Haak et al.

CHG is a more complicated fit since it needs basal; I have no idea how much ENA it would get.

Seinundzeit said...


This proved to be rather interesting.

First, I tried it on a northern European population. Although, I also made two minor modifications. For one, instead of CHG, I used Iran_Chalcolithic. Only because the CHG samples do strange things in PCA-based analyses. Iran_Chalcolithic and CHG are strikingly similar, much more so than either of them are to Iran_Neolithic, so I think we're in a good position to do this. Also, I used Oroqen instead of Dai. Again, only because PCA doesn't really seem to allow sensible results with that sort of reference.


40.8% Bichon
29.45% Srubnaya_outlier
22.85% LBK_EN
6.9% Iran_Chalcolithic


Despite having a choice between Karelia_HG, MA1, and the Srubnaya_outlier, the Lithuanians chose the Srubnaya_outlier.

The same exact references, but with South Central Asians:


32.65% Iran_Hotu
26.35% Iran_Neolithic
25.7% Srubnaya_outlier
10.8% Onge
4.5% Bichon


(Note: Iran_Hotu is just Iran_Neolithic + 15% ANE, per analyses I've tried before)


36.25% Iran_Chalcolithic
29.9% Srubnaya_outlier
17.45% Iran_Neolithic
12.85% Onge
3.55% Bichon



42.15% Srubnaya_outlier
41.95% Iran_Chalcolithic
12.4% Onge
2.5% Iran_Neolithic
1% Bichon



41.75% Srubnaya_outlier
38.55% Iran_Chalcolithic
10.5% LBK_EN
4.45% Onge
3.55% Oroqen
1.2% Bichon


The Kalasha and Pashtuns have around the same amount of Srubnaya_outlier as Lithuanians, while the Pamiri people have much more. This is a typical result: Kalasha and Pashtuns are usually comparable/identical to Northern/Eastern Europeans when it comes to ANE-rich steppe ancestry, and the Pamiri peoples have the most compared to everyone else.

Davidski said...


It went like this...

Shaikorth said...


Looks kind of expected, though I also had a suspicion that Bichon+ANE combo might automatically appear for some populations instead of Karelia_HG. Could you try Lithuanians with Bichon replaced by some European MN population and then with just LBK as the WHG source besides Karelia_HG?

By the way, how do the Dai fits work badly, and would Ami or She work better? I'd prefer to avoid using recent mixes from regions with back and forth geneflow, like Siberians, as sources since their constituent parts are almost certainly available to us.

Seinundzeit said...


I'll definitely make sure to give that a spin.

Basically, sometimes the Dai, and other purely East Asian populations, tend to eat into the Onge percentages for South Central Asians, often in a very substantive manner. That's the only real issue. But I'll still try it, and report what I find.

Anyway, although the Srubnaya_outlier is a fascinating sample, a lot of questions could be answered if we knew her true geographic origins, and if we knew her linguistic affiliations.

Objectively, we can't really know anything certain about the latter (quite unfortunate), but if I had to guess, I think she was IE.

If not, why did she end up in Srubnaya country? And since she seems to be the mother of another Srubnaya sample, one can assume she was culturally "integrated" into that population (by cultural integration, I mean a whole broad set of processes/activities/social roles).

So, is this not likely to be reflective of ethno-cultural linkages between the different steppe communities at this period? Which would imply that she was from an IE steppe population, but one which was very genetically distinct from Srubnaya/Andronovo/Sintashta-like steppe peoples?

Personally, I have no clue, so someone who knows ancient steppe anthropology/archaeology should chime in.

Samuel Andrews said...

"Iran_Chalcolithic and CHG are strikingly similar, much more so than either of them are to Iran_Neolithic"

Where did you get this information? If you got that from D stats of the form D(Chimp, Iran_Chal)(CHG, Mbuti)-D(Chimp, Iran_Chal)(Iran_Neo, Chimp) and D(Chimp, CHG)(Iran_Chal, Chimp)-D(Chimp, CHG)(Iran_Neo, Chimp) the results can't tell if Iran_Chal is closer to CHG than to Iran Neo.

D-stats aren't good at telling if Pop A is closer to two pops who are closely related to it. For example we've seen on this blog it can't tell that Europeans are closer to EEF or Corded Ware than to WHG. We've seen on this blog they can't tell that Ashkenazi Jews are more similar to Italians than English.

Everything we've seen doesn't suggest Iran_Chal is closer to Iran_Neo than to CHG. If they were 60%+ Iran_Neo, as most models say, they've got to be closer to Iran_Neo.

"The Kalasha and Pashtuns have around the same amount of Srubnaya_outlier as Lithuanians, while the Pamiri people have much more."

We don't have good models for SC Asians but we do have good models for Lithuanians. Lithuanians scored an unrealistic 40% WHG in this model because Iran_Chal was included. They're 20% or a little above, maybe less than 20% actually, according to most models.

Srubnaya ancestry was replaced with WHG ancestry. We have to use mtDNA as a sanity test for autosomal based results. mtDNA-wise Lithuanians are very similar to Srubnaya (and its relatives). In contrast they're radically different from WHG. Lithuanians have about 3% U5b, WHG had about 80% U5b. Most models would make Lithuanians something like 60%+ Srubnaya.

Shaikorth said...

If that's the issue then Dai can be used. The idea behind this test is not to create a totally comprehensive model but to try checking ANE preference, Oroqen eating ANE is more undesirable than Dai eating Onge.

Srubnaya outlier's origins can be solidly resolved with only ancient DNA, but Caucasus, Iran and regions west of the Srubnaya horizon are probably out of the question. That leaves the forest zone and the parts of Central Asia or South Siberia not inhabited by Afanasievo or Andronovo/Sintashta-like populations (or anything in between) - depending on how far the latter areas were from Srubnaya she might have come a long way in a foreign company.

Seinundzeit said...


It's a long story.

At the end of the day, CHG and Iran_Chalcolithic are quite similar. There is a reason why Iran_Chalcolithic was modeled as 70% CHG in Lazaridis et al. (if my memory serves me right).

Also, that estimate doesn't involve Srubnaya. It involves the Srubnaya_outlier, who seems to be ANE + EHG + Iran_Neolithic. So, she is very different from the other Srubnaya samples, and likely came from somewhere else.

Whenever a steppe reference is very ANE-shifted, Northern Europeans and South Central Asians tend to have identical percentages. This happens all the time, even with Yamnaya. By now, it's a familiar result, not radical at all.

If we used the other samples, then Lithuanians would probably be predominately Srubnaya. You and me, we've covered this before.

Also, PC and ADMIXTURE-based analyses always give Europeans much more WHG, and much less ANE/EHG/Steppe/CHG, compared to formal methods. That's just how the methods differ.

There is no real way to make the formal methods match with the PC-based analyses.

Although, even with formal methods, South Central Asians and Northern Europeans are at the same levels of ANE-rich steppe ancestry (please refer to Lazaridis et al.).

Again, I think we went over this, a few weeks ago.

Seinundzeit said...


In that case, I'll try it out, and we'll see what happens.

And I guess either the forest zone, or perhaps some part of North Central Asia, really are the only options, with regard to her origins.

Shaikorth said...

Iran_Chal could be modeled as 63-70% CHG in qpAdm, the rest being a mix of Iran_N and Levant_N, despite direct D-stats showing high Iran_N similarity (the earlier issues with ANE, Native Americans and Onge, or as Matt mentioned WHG and EHG, come to mind). Makes sense Global10 treats it in a similar fashion.

Seinundzeit said...

On the topic of Iran_Chalcolithic, I have a D-stats nMonte sheet that has a simulated Basal Eurasian population.

It's quite easy to produce a Basal Eurasian ghost with d-stat data, one just has to assume that the Natufians are 50% BEA and 50% West Eurasian (which seemed like a good starting point).
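The ghost construction described here can be sketched as a simple linear extrapolation. This is only a minimal sketch, assuming each population is a row of D-stat values and that admixture acts roughly linearly on those values: if Natufian = 0.5 x BEA + 0.5 x West Eurasian, then BEA = 2 x Natufian - West Eurasian. The function name, the West Eurasian proxy and all numbers below are hypothetical; the comment doesn't say which population stood in for the West Eurasian half.

```python
def simulate_ghost(admixed, known_source, known_fraction):
    """Extrapolate the unsampled source of a two-way mixture.

    Assumes, column by column,
        admixed = known_fraction * known_source + (1 - known_fraction) * ghost
    which is only approximately true for D-stat rows.
    """
    if not 0.0 <= known_fraction < 1.0:
        raise ValueError("known_fraction must be in [0, 1)")
    ghost_fraction = 1.0 - known_fraction
    return [(a - known_fraction * k) / ghost_fraction
            for a, k in zip(admixed, known_source)]


# Hypothetical D-stat rows (made-up numbers, for illustration only).
natufian = [0.010, -0.020, 0.030]
west_eurasian_proxy = [0.030, 0.010, 0.050]

# 50% West Eurasian is the starting assumption stated in the comment.
basal_eurasian = simulate_ghost(natufian, west_eurasian_proxy, 0.5)
```

Mixing the simulated row back with the proxy at 50/50 should return the Natufian row, which is a quick way to check the arithmetic before adding the ghost to a sheet.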

Even with that setup, I found that CHG and Iran_Chalcolithic were very similar, when examined in terms of very deep modelling, but Iran_Neolithic turned out to be quite unique.

So, qpAdm and Global10 are on the right track. CHG and Iran_Chalcolithic are very similar, but perhaps the similarity is a "coincidence", not the product of "direct" linkages between CHG and Iran_Chalcolithic. I can't claim to know.

Anyway, I should share all of those results, eventually. Everything made sense, and it'll prove to be of great interest, since we've now seen some deep modelling of Eurasian populations in the Lipson and Reich paper.

jv said...

Don't Udmurts have the most Yamnaya-like ancestry today? jv

Davidski said...

You'll have to wait for my qpAdm tour of Europe: Kurgan invasion edition. Coming soon...

jv said...

What was the mtDNA for the Srubnaya Outlier? And don't you think an outlier would adapt to a new Culture? My mtDNA was found in a NON-IE culture in 2000 BC in Siberia even though it's associated with the Yamnaya. These folks moved around. Even in the Yamnaya Era there were HG living in the Zhiguli Mts in Samara. Surely Steppe, Forest-Steppe people intermingled for thousands of years.jv

jv said...

I remember reading M Gimbutas. She mentioned that the Srubnaya Culture received input from the western Cultures. Surely looks like it.jv

Samuel Andrews said...

North Africa's West Eurasian mtDNA.

NorthWest Africa is linked with Neo Iberia, Egypt is linked to Neo Levant/Egypt. NW African and NE African mtDNA are completely different.


Srubnaya_outlier belonged to U5a1. It's consistent with her being mostly ANE/EHG.


I didn't notice Shaik used Srubnaya_outlier. There's no reason to use her when modelling Europeans(except Russians) because we have documentation of Yamnaya/MN hybrids deep in Europe.

SC Asians may have as much ANE as Europeans but I don't think many have as much Steppe/EHG(which has lots of WHGish stuff).

Rob said...


It's interesting that ancient Asian populations can be modelled with something basal-rich (Natufian), MA1, and something like Onge/Pulliyar; even Ust-Ishim can be used for the latter.

So eg ChG (Kotias) :
MA1 :62%

IranNeol (WC1):
MA1: 27


As you say, Iran Mesolithic and Neolithic are very similar to CHG, but have something distinct in them which Kotias and Satsurblia do not: something part ASI which was already in northern Iran by the early Holocene.

But this changes if we include Satsurblia as a source population (given it's an early Epipalaeolithic sample, it might be more correct to add it). Whilst we need pre-Ice Age samples from Asia, I think an Onge-type population was widespread through eastern West Asia and perhaps southern Central Asia, into which a Basal-rich population expanded from the Levant or Gulf, and an ANE-type group from southern Siberia (Afontova Gora preferred over MA1).
But I don't know if these 3 metapopulations were ever completely separated during the Ice Age in (milder) West Asia, contra what would be the case in Europe. In other words, a Basal-ANE mix might have existed since 25 kya, and Central Asia might have been characterised by late Upper Palaeolithic foragers which existed along an ANE to Pulliyar/Onge cline.

Shaikorth said...


The point of the test isn't to measure Yamnaya ancestry but which kind of ANE-rich population (among AG, EHG and Srubnaya outlier who have the most of it) is preferred as a source by moderns in Global10.

Matt said...

@ Rob, though I'm not totally sold on Reich and Lipson's new paper as necessarily the clear cut best model, it does seem like it could predict a more or less independent substantial contribution of East Asian related ancestry to both WHG and ANE as they suggest.

So if that's the case that could make it more likely that it may have come from an ENA leaf population in Central Asia, which maybe expanded to East Asia and SE Asia more recently than K-14's time (which is 35,000 YBP - a long time ago), and probably with some admixture (perhaps from "first wave ENA").

K33 said...


It seems that Natufian-related ancestry did make its way to Southern Europe (probably via both N. Africa but also via the Phoenicians) distinct from the Anatolian_Neolithic wave --albeit at trace quantities. ADMIXTURE at K=10 or so always shows Sicilians especially with distinctly Natufian-related (Levantine or North African) admixture even when West African is included as a reference.

I believe if you added "Natufian" as an additional reference pop, the Yoruba percentages would be likely cut in half for Southern Euros....

Samuel Andrews said...

"The point of the test isn't to measure Yamnaya ancestry but which kind of ANE-rich population (among AG, EHG and Srubnaya outlier who have the most of it) is preferred as a source by moderns in Global10."

Ok I get it now. But we know where Europeans' ANE is from, so there's no point in including them in that analysis.

Shaikorth said...

Europeans are there for the very important reason of being a sanity check. If they were showing AG3 ANE while SC Asians were showing something else and South Indians AG3 again something would probably be wrong. Now SC-Asians and Europeans are showing Srubnaya outlier, which is a type of ANE that married into steppe populations.

That Lithuanians are showing Srubnaya outlier and not EHG can mean many things: that the ANE in steppe populations ancestral to them doesn't derive from EHG but from Srubnaya outlier types, or that the source populations are selected poorly (which is why I asked for alternate results without Bichon as a source to see if this shows EHG instead of Srubnaya outlier) or it means Global10 is not suited for this kind of test which makes SC-Asian results questionable too.

Alberto said...

I find it quite strange that Lithuanians prefer Srubnaya_outlier over Karelia_HG, and I certainly can't reproduce those results. This is what I get instead:

"Bichon:Bichon" 32.9
"Karelia_HG:I0061" 27.3
"Iran_Chalcolithic:I1661" 20.2
"LBK_EN:I0100" 19.6
"Srubnaya_outlier:I0354" 0


And the fit is quite a bit worse too, so I don't know if Sein is using a different sheet or what (this is the Global 10 one without any changes to it).
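For readers unfamiliar with how percentages like these come out of a PCA sheet, here is a simplified Monte Carlo sketch in the spirit of nMonte. It is not the nMonte script itself, and the function name, parameters and coordinates are all hypothetical. The mixture is treated as a bag of draws from the source populations, and single draws are swapped at random whenever the swap moves the bag's average closer to the target.

```python
import math
import random


def fit_mixture(target, sources, slots=200, cycles=20000, seed=1):
    """Estimate mixture percentages by random slot swapping.

    The mixture is a bag of `slots` draws from the source populations.
    Each cycle proposes swapping one slot for a random source and keeps
    the swap only if the bag's average gets closer to the target.
    """
    rng = random.Random(seed)
    names = list(sources)
    dims = len(target)
    bag = [rng.choice(names) for _ in range(slots)]
    total = [sum(sources[n][d] for n in bag) for d in range(dims)]

    def distance(sums):
        # Euclidean distance between the bag average and the target.
        return math.sqrt(sum((s / slots - t) ** 2 for s, t in zip(sums, target)))

    best = distance(total)
    for _ in range(cycles):
        i = rng.randrange(slots)
        new = rng.choice(names)
        old = bag[i]
        if new == old:
            continue
        trial = [s - sources[old][d] + sources[new][d] for d, s in enumerate(total)]
        dist = distance(trial)
        if dist < best:
            bag[i], total, best = new, trial, dist
    percents = {n: 100.0 * bag.count(n) / slots for n in names}
    return percents, best


# Hypothetical 2-D coordinates; the target is built as 50% src_a, 30% src_b, 20% src_c.
sources = {"src_a": [0.0, 0.0], "src_b": [1.0, 0.0], "src_c": [0.0, 1.0]}
target = [0.3, 0.2]
percents, dist = fit_mixture(target, sources)
```

Because the result depends on which coordinates and how many dimensions go in, two people running the same kind of fit on differently prepared sheets can get different source preferences and different fit distances.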

Shaikorth said...

Yeah looks odd. Maybe eigenvector correction, though only Sein can answer that.

What do you get for Kalash?

Alberto said...

Yes, there must be something different because also here there is no Onge. I used Dai instead and tried to match the other pops to the model above:

"Iran_Hotu:I1293" 42.55
"Iran_Neolithic:I1290" 22.7
"Srubnaya_outlier:I0354" 21.8
"Bichon:Bichon" 9.15
"Dai" 3.8
"LBK_EN:I0100" 0
"Karelia_HG:I0061" 0
"Iran_Chalcolithic:I1661" 0


For the king said...


"At the end of the day, CHG and Iran_Chalcolithic are quite similar. There is a reason why Iran_Chalcolithic was modeled as 70% CHG in Lazaridis et al. (if my memory serves me right)."

Yeah, but in the same study CHG was modeled as 71.6% Iran Neolithic, 7% WHG and 21.4% EHG. Quite confusing to be honest. What are the best fits for Iran ChL in Global 10 nMonte?

FrankN said...

Re: Srubnaya Outlier: Coming from a completely different angle, I noted something interesting.
Essentially, I was trying to figure out whether, and what kind of, transpacific gene flow may have been going on since the Mesolithic. As an admittedly crude measure, for each pop I calculated the relative difference between the K10 Global PCA distance (uncorrected) to Kennewick Man and to Karitiana. For all populations not involved in transpacific gene flow, those differences should be more or less equal. The average for all non-Amerindian pops stands at 0.0036 = 0.43% less distant to Kennewick than to Karitiana.

Since NW American Amerindians are closely related to Kennewick Man, pops that have received “out of America” gene flow should display lower distance to Kennewick than to Karitiana. In fact, that was also what I found: Clearly closer to Kennewick than to Karitiana were:
- Even (-8.5%)
- Selkup (-8.1%)
- Altaian (-7.8%)
- Karasuk_outlier:RISE497 (-7.4%)
- Ket (-6.9%)
- Tuvinian (-6.6%)
- Nenets (-6.4%)
- Yakut (-6.2%)
and so on, through all NE Siberian groups, down to Oroqen (3.1%) and Kyrgyz (2.7%). [Interestingly, then a “Horn of Africa” cluster follows, starting with Somali (2.6%), and including Mota (2.4%) and Dinka (2.3%). PCA artifact or some real movement (obviously post-Kennewick, pre Mota)?]

Conversely, I expected some more southerly Asian/ Oceanic pops to score substantially closer to Karitiana than to Kennewick, reflecting well documented trans-pacific exchange over the last 3-4 kya (dog (a)DNA, coconut, sweet potato etc.). However, to my surprise, the following turned up:
- Okunevo:RISE516 (3.5%), RISE515 (2.2%)
- Altai_IA:RISE601 (2.5%), RISE600 (2.3%)
- Srubnaya_outlier:I0354 (2.0%)
- MA1 (1.9%)
- AG3 (1.9%)
- Samara_Eneolithic:I0434 (1.9%)

Obviously, this doesn’t have anything to do with rather recent transpacific migrations. Unless we are dealing with a PCA artifact here, the only plausible explanation is
a) The Srubnaya outlier (and Okunevo) capture (part) of the original founding population of South Amerindians, possibly slightly better than does MA1; and
b) Kennewick must have received additional gene flow from a different population, as is already indicated in Dave’s Admix13Q.

Related to a) above, the next pops that the PCA has closer to Karitiana than Kennewick are interesting:
- Kalash (1.8%)
- Burusho (1.7%)
- Yamnaya_Samara:I0438, I0443, I0439, I0429, I0441 (1.7%)
- Satsurblia, Kotias (1.7%)
- Tajik_Ishkashim (1.7%)
- (various Steppe CA/BA)
- Iran_Hotu:I1293 (1.5%)

Finally, for reference:
- Austroasiatic_Khasi (1.3%)
- Bajo (1%)
- Andaman_Onge:avg (0.9%)
- Aeta (0.8%)
- Ami (0.7%)
- Dai (0.6%)
- Lapita_Vanuatu:I1369 (0.6%)
- Papuan (0.5%)
- Han (0.4%)
- Japanese (0)
- Sephardi_Jew (-0.43% = average over all pops)
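The distance comparison FrankN describes above can be sketched as follows. This is a minimal sketch under assumptions: the comment doesn't spell out the normalisation, so dividing by the Karitiana distance is a guess, and the function names and coordinates are hypothetical.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relative_difference(pop, kennewick, karitiana):
    """(d_Kennewick - d_Karitiana) / d_Karitiana.

    Negative values mean the population sits closer to Kennewick.
    The choice of denominator is an assumption, not taken from the comment.
    """
    d_ken = euclidean(pop, kennewick)
    d_kar = euclidean(pop, karitiana)
    if d_kar == 0:
        raise ValueError("population coincides with the Karitiana reference")
    return (d_ken - d_kar) / d_kar


# Hypothetical 2-D coordinates (made-up numbers, for illustration only).
kennewick = [3.0, 4.0]
karitiana = [6.0, 8.0]
pops = {"pop_a": [1.0, 0.0], "pop_b": [5.0, 8.0]}

# Most Kennewick-shifted populations first.
ranked = sorted(pops, key=lambda n: relative_difference(pops[n], kennewick, karitiana))
```

Since both references are Amerindian and sit far from most Eurasian populations in PCA space, small relative differences like the ones listed are sensitive to which dimensions are included.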

Rob said...

@ Alberto, & Shaik.

I'm not sure if I've understood entirely what you're analysing: where the 'ANE' in modern Europeans comes from, right?

Seinundzeit said...


In order to get comparable results to myself, you have to use the first 7 dimensions.

It's a long story. I figured this out the hard way, awhile back. In fact, I have a whole list of reasons for why I don't use dimensions 8, 9, and 10.

But to sum it up quickly, dimensions 8, 9, and 10 involve African and Australasian outliers.

Pragmatically speaking, many results get nonsensical for certain populations on the West Eurasian cline, especially ones with recent Siberian/East Asian admixture.

When using PCA-based nMonte, I try to get results that are as sensible/in tune with the academic literature as I can get them to be, and keeping those dimensions always made things very problematic for West Eurasian or heavily West Eurasian populations. But dropping them always made things much more reasonable.

I would only recommend using 10 dimensions when dealing with Sub-Saharan African or Oceanian modelling.
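To see why the number of dimensions matters, here is a minimal sketch that trims a 10-dimensional sheet to its first 7 columns before computing distances. The helper names and coordinates are hypothetical, and the two populations are deliberately constructed so that they differ only in dimensions 8 to 10.

```python
import math


def first_dims(coords, n_dims=7):
    """Keep only the first n_dims PCA coordinates of every population."""
    return {name: vec[:n_dims] for name, vec in coords.items()}


def euclidean(a, b):
    """Plain Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 10-dimensional coordinates (made-up numbers).
coords = {
    "pop_x": [0.10, 0.02, -0.01, 0.00, 0.03, 0.01, 0.00, 0.20, -0.15, 0.05],
    "pop_y": [0.10, 0.02, -0.01, 0.00, 0.03, 0.01, 0.00, -0.10, 0.30, -0.25],
}

full = euclidean(coords["pop_x"], coords["pop_y"])      # all 10 dimensions
trimmed = first_dims(coords)
reduced = euclidean(trimmed["pop_x"], trimmed["pop_y"])  # first 7 only
```

In this constructed case the two populations are identical on the first 7 dimensions, so all of their distance comes from the last three; any fit run on the full sheet would be driven by exactly the dimensions being dropped here.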

Anyway, we need to remember that Srubnaya_outlier is the closest we have to a recent ANE sample.

What I mean is this: she is EHG + a shot of pure ANE + Iran_Neolithic. And both EHG and Iran_Neolithic are in large part ANE.

So, Srubnaya_outlier is overwhelmingly ANE, in terms of her deep genetic ancestry. And she is far, far younger compared to AG3 and MA1. So, it is no surprise that she is preferred over AG3 and MA1, but also over the more WHG-admixed Karelia_HG.

So, the ANE preference test is simply telling us that many Europeans, and all South Central Asians, prefer the youngest ANE sample when it comes to their ANE ancestry, rather than AG3 or MA1.

That's all. It isn't giving us any information about which steppe references we should use.

Seinundzeit said...


For what it's worth, here is some deep modelling I tried awhile back, for Eurasian populations.

Unfortunately, I lack the computational apparatuses to examine these questions at the level seen in the Lipson and Reich paper, but this was still fun.

My reference populations were:

"Basal Eurasian"

Looking back, the Neanderthal + Denisovan was pointless, as this sheet didn't have an Archaics column. I might redo this, using a D-stats sheet that has a column for Neanderthal/Denisovan.

The Basal Eurasian simulation was based on Natufians, using the assumption that they are 50% West Eurasian. I just used this as a clean starting point, to build a simulation.

I tried these same reference populations with everyone.

As a sanity check, I first tested El-Miron.

Per the supplementary data for Fu et al.:

"ElMiron can be modeled as having 49±13% GoyetQ116-1 and 51±13% LaBrana1 ancestry. (In Supplementary Information section 6 using full Admixture Graph modeling, we obtain a point estimate of 63% GoyetQ116-1 related ancestry, consistent with this model.)"

I find:

59.90% Villabruna
32.60% GoyetQ116-1
4.95% Ami
2.15% Basal Eurasian
0.40% Onge


In the ballpark, so quite good. Although, the lack of an Archaic column is why they score Basal + Ami.


72.75% AG3-MA1
27.25% Villabruna


64.75% Ami
35.25% AG3-MA1


86.95% Ami
13.05% AG3-MA1


68.1% Ami
23.7% Onge
7.3% AG3-MA1

Everything makes sense.

Now, trying Iran_Neolithic:

41.45% AG3-MA1
40.4% Basal Eurasian
8.5% Villabruna
5.9% Onge
3.75% GoyetQ116-1

Fascinating stuff. Compared to Iran_Chalcolithic:

43.45% AG3-MA1
36.20% Basal Eurasian
20.10% Villabruna
0.25% Ami

Compared to CHG:

50.15% AG3-MA1
29.75% Basal Eurasian
20.1% Villabruna

Compared to Iran_Late_Neolithic:

38% Basal Eurasian
34.75% AG3-MA1
18.30% GoyetQ116-1
7.80% Villabruna
0.70% Onge
0.45% Ami

The differences between Iran_Neolithic, Iran_Late_Neolithic, Iran_Chalcolithic, and CHG are quite fascinating.

For the king,

Please see above, I think the differences are interesting.

Unknown said...

LaBrana is probably about 16% Aurignacian. That would be around 58% Goyet and 42% Villabruna.

Rob said...

@ Sein & Chad

What are they testing when they suggest El Miron has LaBrana1 ancestry (shared ancestry)? It pre-dates LB1 by 10000+ years.

Seinundzeit said...


Without focused/extensive sampling of different spatio-temporal levels, they can only make inferences with regard to shared ancestral nodes, rather than direct claims concerning proximate admixing sources.

I guess it's a matter of doing the best they can, with the samples they have.

Things should get far clearer, with more UP data.

Regardless, I'll post some South Asian and South Central Asian results, as I'm seeing some very interesting patterns.

Seinundzeit said...

Okay, so here is an exploration of South Asia and South Central Asia, in terms of deep modelling.

South Asia:


79.7% Onge
11.5% Ami
6.95% AG3-MA1
1.65% Villabruna
0.2% Basal Eurasian


59.1% Onge
27.4% AG3-MA1
8.7% Basal Eurasian
4.5% Villabruna
0.3% Ami


35.8% Onge + 5.5% Ami
37.7% AG3-MA1
13.7% Basal Eurasian
7.3% Villabruna


38.65% AG3-MA1
30.25% Onge + 6.70% Ami
14% Basal Eurasian
10.4% Villabruna


42.8% AG3-MA1
23.1% Onge + 7.3% Ami
16.5% Basal Eurasian
10.3% Villabruna


45.45% AG3-MA1
18.90% Basal Eurasian
12.90% Onge + 9.40% Ami
13.35% Villabruna


45.40% AG3-MA1
21.70% Basal Eurasian
11.75% Onge + 9.10% Ami
12.05% Villabruna

South Central Asia


45.95% AG3-MA1
18.50% Basal Eurasian
18.30% Ami
11.05% Villabruna
5.20% Onge


49.45% AG3-MA1
22.25% Basal Eurasian
14.85% Villabruna
13.45% Ami


52.65% AG3-MA1
22% Basal Eurasian
13.85% Villabruna
11.5% Ami


52.65% AG3-MA1
19.05% Villabruna
16.95% Basal Eurasian
11.35% Ami

Very interesting.

I'm going to perform some tweaks (this is an old sheet), do some experiments, and will report what I find.

Rob said...

Which method/ programme are you using here ^^

Seinundzeit said...


Just an nMonte D-stats sheet.

Nirjhar007 said...

In case of S Asians , Paniya is a better choice .....

Olympus Mons said...


"It went like this...


Bullshit. That is applicable in your polishgene movie... Meanwhile, in the real world, things will surely go down in a very different ensemble. How different it will be is what we will see.

Given the time it is taking to reconcile the Danish Bell Beaker DNA findings with the Reich lab BB findings... Well, prevailing steppe assertions are not being confirmed, otherwise those papers would be headlines by now.

huijbregts said...

Am I the only one who had a quick look at the PCA of the qpAdm sheet?
Dimensions 1 and 2 span a nice triangle between Sardinia(Barcin), Saami(Nganasan) and Poltavka(Eastern_HG). Dim 1-2 explain 70% of the variance.
Dimensions 3-5 explain another 28% of the variance, but are each dominated by one outlier: Hungary_EBA, again Hungary_EBA and Russian_West.
It makes me wonder about the relatives of Hungary_EBA. In the PCA an Onge-effect boils down to Russian_West as an outlier on Dim5/Onge.

This PCA seems strongly overfitted, which is not a surprise with 7 columns and only 40 rows. For the calculation of Euclidean distance or nMonte only the first two dimensions should be used.

German Dziebel said...


"Since NW American Amerindians are closely related to Kennewick Man, pops that have received “out of America” gene flow should display lower distance to Kennewick than to Karitiana. In fact, that was also what I found."

Good. This makes sense. Northern Amerindians and Eskimos should be closer to East Asians, while Southern Amerindians should be closer to a geographically much wider sample.

Recall Raghavan et al.: "Thus, if the gene flow direction was from Native Americans into western Eurasians it would have had to spread subsequently to European, Middle Eastern, south Asian and central Asian populations, including MA-1 before 24,000 years ago."

Also, considering that Lipson & Reich now report "East Asian admixture" in MA-1, Raghavan's reason to reject out-of-America ("Moreover, as Native Americans are closer to Han Chinese than to Papuans (Fig. 3c), Native American-related gene flow into the ancestors of MA-1 is expected to result in MA-1 also being closer to Han Chinese than to Papuans. However, our results suggest that this is not the case (D(Papuan, Han; Sardinian, MA-1) = −0.002 ± 0.005 (Z = −0.36))") is now invalid.

Rob said...

@ Alberto

"I find it quite strange that Lithuanians prefer Srubnaya_outlier over Karelia_HG, and I certainly can't reproduce those results. This is what I get instead:"

I agree. Virtually all ANE in Europe comes from Karelia_HG and its Yamnaya, CWC, BB derivatives.
However, in north-eastern Europe, some extra (Mesolithic) EHG is found in Finns (28%), Estonians (15.5%), Lithuanians (7%), Norway (5%, the limit of its western extent), continuing east to Komi, Vepsa, etc. Some EHG arrived with Iron Age Altai-type groups, but in Europe proper this impact appears limited to Finns.
The more difficult question at present is when and how it expanded to southern Europe.

Matt said...

@ Shaikorth:

Those groups would leave no Anatolia-N types, has something like it even been tried to fit a full European set?

Missed this one earlier. Hmmm... Well, no, as far as I know, only ancient populations were modeled in Lazaridis 2016 as mixtures of the Iran_N, Levant_N, EHG and WHG groups.

But yes, I think a separate Anatolia_N population is required to recreate distances (and derived measures - PCA etc).

Anatolia_N is poorly placed using models from Lazaridis 2016 which have it as admixed with other populations (LevantN and IranN), whereas for other ancient populations, taking proportions from other populations positions them accurately on PCA. The same goes for its derived populations (Europe EN and MN) when using a LevantN+IranN+WHG model and not an AnatoliaN+WHG model.

Plus, the proportions of IranN+LevantN being almost equal suggests that Anatolia_N's non-WHG related ancestry was almost equally related to both.

My motivation was to try and separate out the WHG admix in Anatolia_N through qpAdm. But that would be misleading if it also gave incorrect proportions in the Levant / Iran related ancestry.

Unknown said...

It's possible that some near eastern populations that south europeans are admixed with carried south asian admixture, that's why some south europeans including greeks show some residual australoid admixture.

Rob said...

The Iran Neolithic which is seen in Armenian EBA which admixed into Southern Europe after 2500 BC has something partly akin to ASI

Anonymous said...

Interesting that in the spreadsheet the English have more WHG than any of their NW European neighbors, even those with less Mediterranean ancestry like the Scandinavians. I wish we had an Irish sample to see if mixture from that country is what's driving this increase- I remember seeing some plots where isolated Irish and Icelandic samples seemed to show strong connections to WHG relative to EHG.

Also, very cool to see Germans finally stand as no. 1 in a type of ancestry: Caucasian Hunter-Gatherer.

Matt said...

There was some interesting stuff in the paper on Irish Bronze Age ancestry last year (Cassidy et al 2016) showing that the Loschbour WHG donated more haplotype chunks to Ireland, Scotland, Wales than other populations (not England).

Though I would say here the Scots are slightly lower than Sweden and Norway there, and often come out closer to Irish than English. So conversely that doesn't bode as well for Ireland having a WHG enriched background.

I'm not so sure these ancient components will show great structure between West and Eastern Europe. There's fairly clear structure in Europe, and populations are quite separable along an East-West axis, but when you look at the different European populations projected onto ancient variation, and at estimated ancestry proportions, there's a lot more overlap. The genetic variations that create distinction between the modern populations look somewhat orthogonal to the defined ancient groups (as does the actual total genetic differentiation between populations).

Chad said...

Here are some of the better fits using Boncuklu, CHG, Levant_EN, WHG, and EHG

Grey said...

"Loschbour WHG donated more haplotype chunks to Ireland, Scotland, Wales than other populations (not England).

Though I would say here the Scots are slightly lower than Sweden and Norway there, and often come out closer to Irish than English. So conversely that doesn't bode as well for Ireland having a WHG enriched background."

Purely on an anecdotal basis, I wouldn't be surprised if it turned out there were regional clusters along the Irish Sea coast
- North Wales in Wales
- Cumbria in England
- Strathclyde in Scotland
which might mess with the national averages

Rob said...


You've assumed Boncuklu was somehow "native" to Anatolia, but I'm not so sure. A more "primordial" model would be one using Epipalaeolithic samples; and that'll be something like 20% Kotias, 20% Villabruna and 60% Natufian (invariably for Western Anatolia) and that makes "historical" sense.

Unknown said...

I'm working on an 8000 BCE timeframe. Boncuklu is likely a hunter group in a transition with some cattle and early grain gathering. An independent start is a possibility. Also, if the Caucasus is already 30% BE at 13000BCE, why not Anatolia by 8000BCE or even before?

Rob said...

Boncuklu is pre-Ceramic Neolithic, not a forager site. Very few Epipalaeolithic sites have been noted in Anatolia, and when so, they are in the far west: Marmara and Antalya. Here, I'd bet they'd basically be WHG.
So, with the thawing of the Ice Age, East Anatolia would have been settled by the more mobile type of northern Natufians (=Kebaran) from Syria (**if** the assumption that absence of evidence is evidence of absence holds, which it might not, because the state of Palaeolithic research in Turkey isn't super). Anyhow, we can only go with what is at present physically supportable in "real life". The colonisation of the south Caucasus is potentially another ballgame, but Basal there could be a continuation from even before the Ice Age, or new stock from Syria or even Iran. (We really need some earlier West Asian individuals.)
So I'm not saying your model is wrong. But for interest's sake, see what Boncuklu looks like modelled on CHG, Natufian and WHG.

Unknown said...

Boncuklu still forages too. They haven't found evidence of planned agriculture, as far as I know. It's the cattle that's more important for the label. They are new to Neolithic practices, as can be seen from ROH. Meaning, they'd never had a pop boom with exogamy. It's still a forager density. Their diversity is between WHG and later Anatolians.

Unknown said...

I can do the other model for shits and giggles though.

Matt said...

For anyone who is interested (I doubt anyone is, but just in case I may as well... ;) ).

Correction to my above post: Qualifying it, I mean on the low dimensions "when you look at the different European populations projected onto ancient variation, and at estimated ancestry proportions, there's a lot more overlap".

On the high dimensions of a version of that I have* that Davidski ran there's totally enough structure - (similarity and distance index based on this -

*Can't remember which post linked to this file! As I've downloaded it, the file is titled Central & West Eurasia3.txt / Central & West Eurasia3.dat. Think it was some time after the linked post or in a comment thread. Don't know if you remember what that file was, Davidski, if you're reading.

(Interestingly, the dimension 6 of this PCA that seems to account for most of the ability of the PCA to recapitulate modern European PCA among ancients seems to orient primarily on an axis between different WHG and to differentiate Iberia_Chalcolithic from Caucasus Hunter Gatherers, with Iranian Neolithic and Anatolian Early Neolithic midway between).

(Same thing with Yamnaya, Sardinian and Basques included as well gives a slightly different shape - and similarity and distance matrix -

Davidski said...

I posted it here...

Unknown said...

Best output
7.173 chi square
0.845988 tail probability

48.4% CHG
42.7% Natufian
8.9% WHG
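For reference, the tail probability in this kind of output is the upper tail of a chi-square distribution evaluated at the reported statistic. The comment doesn't give the degrees of freedom, so the value of 12 used below is an assumption, picked because it comes out close to the reported figure. The function is a hypothetical helper that only handles even degrees of freedom, where a closed form exists.

```python
import math


def chi2_tail_even_dof(chisq, dof):
    """Upper tail probability P(X >= chisq) for a chi-square with even dof.

    Uses the closed form exp(-x/2) * sum_{k < dof/2} (x/2)^k / k!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this sketch only handles positive even dof")
    half = chisq / 2.0
    term, total = 1.0, 1.0  # k = 0 term
    for k in range(1, dof // 2):
        term *= half / k     # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# dof = 12 is an assumption; the chi-square value is the one reported above.
tail_prob = chi2_tail_even_dof(7.173, 12)
```

A tail probability this high means the model is not rejected by the outgroups used, which is a weaker statement than saying the model is historically correct.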

jv said...

Interesting that current Hungarian has more EHG than LBA. I thought the Yamnaya migrations into Hungary would reflect more EHG. But it seems later(Magyar?) or another group added the extra EHG. I believe Magyar mtDNA is very similar to Eastern Europe & Central Asia mtDNA. And Croatia seems rather low on EHG. What happened when the Yamnaya invaded the Balkans? Didn't they leave a DNA footprint? (perhaps the women didn't as for every....what 10 or so migrating Yamnaya man, 1 Yamnaya woman migrated. Young sisters perhaps, no children, no relatives to care for, seems those Neolithic Farmer females left their DNA) jv

Davidski said...

Hungary was largely depopulated during the Middle Ages as a result of Mongol and Tatar raids.

To get the country back on its feet, Hungarian rulers encouraged the immigration of northern Slavs and Germans to the Hungarian Plain. So Hungarians today are basically Slavs and Germans with some early Hungarian ancestry.

Croatians are a mixture of native Balkan groups of mostly southern European origin and Slavic newcomers from the north, so their EHG is somewhat lower than among northern Slavs.

Rob said...

@ Chad

Thanks those figures look correct.

Yes, Boncuklu hadn't adopted all of the Neolithic package (although they had domestic plants, and fixed abodes) and hadn't undergone the population boom yet, because that happened 1000 years after. But that doesn't change the fact that the settlers themselves at pre-Ceramic Boncuklu, or their ancestors, had to arrive from somewhere else to begin with, as mobile forager/incipient farmers. And going back to the Late Glacial, this was ultimately ~ Syria +/- Zarzian +/- the Caucasus Epipalaeolithic. Map

jv said...

Thank you Davidski. Just trying to understand how my Pontic-Steppe grandmothers made their presence in the Balkans between about 2000 BCE and 700 AD (and that's a lot of time!). Yes, I suppose it could be a migration in the Middle Ages to Hungary from Northern Europe (H6a1a Corded Ware Culture, Esperstedt, Germany, 2300 BCE). The Magyars did have mtDNA H6a1a and H6a1b. These mtDNAs were present in the Yamnaya (H6a1b) and Srubnaya (H6a1a). So it seems Central Asia & the Ural Mt area still had these lineages in 900 AD. jv

Unknown said...

Nah. The obsidian production looks like a continuation of local UP groups. The mixing happened way before the Neolithic. Even Greek hunters match these Anatolian mtDNA groups. A third Neolithic center seems more likely. 1/3 of Natufian genes are gone by the ceramic period. They never exploded like Central and Eastern Anatolians.

ROH tells us they weren't mobile. So do the houses. They've been inbred for a long period of time. More mobile groups like early Natufians show quite a bit more diversity.

Rob said...

@ JV

The Yamnaya impact in Bronze Age Hungary and the Balkans wasn't massive. It was only large in north-central Europe. Many people are lost on this fact because they're dazed by tabloid headlines and lack a detailed understanding of European prehistory. Southern Europe had its own set of phenomena going on.

In fact, the EHG acquired in Bronze Age Hungary was probably mediated via exogamy with Corded Ware communities in southern Poland, across the northern Carpathians, and vice versa (which is how CWC acquired certain Copper Age Balkan cultural traits).

As David said, we mustn't treat modern Hungarians as a proxy for any earlier Carpathian population - as the amount of admixture and shift in subCarpathian Europe is more complex and multifold than anything in the north European plain.

At the end of the day, you cannot try to understand modern historic populations on something as basal as Dave's Mesolithic-Neolithic sheet here, as it merely represents the final result of 5000 years of processes (since the EBA) without cluing us in as to what they were. Rather, as more genomes come from the late Bronze & Iron Ages, we can try to model modern groups on those.

I think Albanians would be a good clue as to what the Balkans looked like in the Iron Age (even though they too have some Slavic admixture).

Germany_Bronze_Age:RISE471 26.45 %
Baalberge_MN:I0559 26 %
Armenia_EBA 23.6 %
Hungary_CA:I1497 6.45 %
Yamnaya_Samara:I0429 6.4 %
Anatolia_Chalcolithic:I1584 4.7 %

Their main ancestry comes from Middle Neolithic Europe ("Baalberge"), the Tumulus culture (RISE471), and Armenia Bronze Age.

Rob said...

@ Chad.

Well, you have it backwards. Natufians were more sedentary, and the northern Kebaran more mobile. The procurement of obsidian doesn't mean that its consumers were foragers, as incoming Neolithic farmers could and simply did requisition the network.

See "The Prehistory of Asia Minor: From Complex Hunter-Gatherers to Early Urban Societies" by B. Düring.

"There is no a priori reason to assume that these pre-existing groups
living on the Anatolian plateau played a role in the neolithisation of the

"Beyond the Pre-Pottery Neolithic B interaction sphere" by Eleni Asouti

jv said...

@ Rob, Thank you. I've been reading papers on the Yamnaya migrations into Hungary and I guess I assumed they came from Ukraine & Russia directly: the Yamnaya burial positions, kurgans, red ochre, and local Neolithic pottery (maybe steppe herders finding local Neolithic wives). I suppose the Yamnaya mtDNA in the Balkans could be from a migration from the north, from the CWC. jv

Rob said...

@ JV

No you're right. The Yamnaya in Hungary definitely did come from the steppe. But the question of lasting genetic impact on the local fabric is a different one. People could come, and then go....
Mind you, I was referring specifically to BA Hungary here. Who knows what the Ezero culture in Bulgaria, or Vucedol in Serbia/Croatia, will look like?

jv said...

@Rob, yes, I thought of Vucedol regarding my mtDNA. Seems Vucedol is Beaker... well, that I'm not (CWC). So, indeed, folks come and go. Suppose I'll have to patiently wait for ancient mtDNA results to fill in the gaps. jv

Unknown said...

What I mean is the industry and trading shows continuity. There isn't a big disruption. I think Boncuklu type people go way back. They may stretch back to the Antelian period. European and Anatolian farmers have a much better match in uniparentals with Boncuklu, then Natufians to a smaller degree. There won't be any WHGs residing in Anatolia, which you'd need if you think it's fully Levantine migrants. Using transversions, Anatolia is significantly closer to Boncuklu than the Levant EN, arguing against any major direct Levantine admixture. That's more in line with archaeology saying later Levantine influence in western Anatolia.

Rob said...

@ Chad.

I never stated that Anatolian "pre-Ceramic" incipient farmers came from Israel. They came from the northern Fertile Crescent (which is different!)
There are basically no Epipalaeolithic sites in central & eastern Anatolia.

Future research might change this, but we have to work with the present reality, for now.

The map below, modified from Asouti, shows the clusters of pre-Ceramic (PPNA) centres. The Israeli ones were haplogroup E, but the hg G guys are from northern Syria (Turkish-Syrian border area). These still mobile groups moved west into central Anatolia c. 10000 BC, into a more or less empty landscape. Because they were isolated groups at the incipient stages, they show high ROH. Then a second pulse of cultural & genetic miscegenation occurred, c. 8000 BC. This then generated the demographic impetus for the movement of hg G-rich farmers to Greece, Iberia, etc.

" There won't be any WHGs residing in Anatolia, "

I disagree. The foragers in Antalya are completely different from the pre-Ceramic farmer guys in the east. They derive from the local Balkan EpiGravettian, will be "WHG" (if ever a burial is found), and we have indirect confirmation by the presence of I2c in Barcin & Mentese. You know this from previous discussions.

Unknown said...

There are epi-Paleolithic groups in Central and Eastern Anatolia. Where do you think a lot of the lithics came from? Gobekli Tepe is one of the most famous sites in the region. Your timeline is strictly Neolithic. There was hardly any cultivation in Boncuklu. Almost all food was local swamp stuff, wild animals, wild grain, water fowl, frogs, etc.. Big contrast to the east. Also, Natufians and PPNB and C were all more diverse than Boncuklu. Very much so. Incipient agriculture has nothing to do with it. They've been isolated for a long time as an ancient population. Watch future remains from Karain Cave, Hacilar, and Okuzini Cave all be similar to Boncuklu. Lithuania ensembles are more related to the east and older than Greek ones. Marmara is more connected to Bulgaria to Crimea groups than Central Europe. Judging by Greek mtDNA, there won't be a pure hunter in the southern Balkans.

Unknown said...

Lithic ensembles, not Lithuania. Autocorrect got me.

Rob said...

@ chad

The fact is there are like 3 Epipalaeolithic sites in eastern Turkey. You seem to be ignoring that. But if you have found a reference showing proof of something otherwise, please share.

Secondly, high ROH doesn't a priori equate with a local origin. All epiPaleo groups from the Caucasus and northern Levant will have been more or less isolated until the Neolithic.

Lastly, inland Greek foragers were just normal Mesolithic people like in Italy or the Balkans. It was only those in the islands which engaged in obsidian trade with the Near East. Anyhow, they all went extinct by the arrival of actual farmers.

Unknown said...

More like at least 5 in the east and some more in central Anatolia. In fact, right on the Konya Plain where Boncuklu would soon follow. Good cultural continuity. Here's the first I'll give you to read about several inhumations. Actually contemporary with Natufians and elaborate shell headdresses like them, but item "richer" longer than Natufians. Obsidian trade could be a source of wealth.

(link: ...ınarbaşı._Central_Anatolia_from_an_Eastern_Mediterranean_perspective)

Unknown said...

I'll dish out more tomorrow. It's really late here.

Matt said...

@ Davidski: Cheers for man. I was really surprised when I looked at it how much that Dimension 6 in that PCA seems to contribute strongly to capturing all the intra-European differences, when running the recent European subset on that spreadsheet only. (That dimension's a lot less important in structure across the whole set of samples.)

Dimensional loadings in that sheet, in importance of differentiation among the European subset of samples, seem roughly equal between Dimension 1 (Euro HG vs Ancient Near East), Dimension 2 (EHG+CHG vs WHG+Barcin), and Dimension 6. Dimension 6 seems to load on CHG, EHG and recent Eastern European people on one end vs Basque, Sardinian, Iberia Chalcolithic and a cluster of WHG (Continenza, Ranchot88, Falkenstein) at the other. The other dimensions also add to relationships, to a lesser degree.

Building neighbour joining trees for European samples with different subsets of the dimensions - and, with Iberia_Chal and Yamnaya as well.

The Iberian and Northwest European samples are much more likely to form subtrees and a connected subtree when dimensions other than 1 and 2 are taken into consideration. Particularly dimension 6 contributes strongly to that. With dimension 1 and 2 alone in neighbour joining you get a single European cline across West->Northeast, varying according to HG ancestry. With all dimensions in play there's a distinct Norwegian->Basque+Iberia_Chalcolithic and a distinct Hungarian->Cypriot subtree among the samples.

Including the Boncuklu, Tepecik-Ciftlik and Iberia_EN samples, all those populations look like they're pretty distinct with regard to the dimensions that work most to distinguish present day Europeans.

For me, it would be pretty cool, once we have the data from Mathieson's Balkan Farmer paper released, to stick them under the same PCA technique you've used here, and then rerun the same PCA and neighbour joining exercise with them in the European context.

Davidski said...

Please note, I updated the English result, using more samples and markers.

The former result was based on a reduced number of samples, due to mislabeling, and that's probably why the Western_HG ratio was so high for the English.

Davidski said...

I've just discovered something very interesting when running models for my qpAdm tour of Europe Bronze Age edition; Neolithic admixture in Poles probably comes from the Lengyel group.

I'll post the results soon when I put up the new blog entry.

Slumbery said...

Completely irrelevant here, but "admixture in Poles comes from the Lengyel group" sounds like a pun. :) (Because the name of the village that gave its name to the Lengyel culture means Polish in Hungarian.)

Davidski said...

This Lengyel sample is interesting. I'm discovering some great stuff here. The new post will be very informative about the European Bronze Age.

Rob said...

Which Lengyel sample Davo ?

Davidski said...

Hungary Lengyel Late Neolithic I1495 (NE7) Apc-Berekalya I 4491-4416 calBCE

Chad said...


Here's more detail on those burials. I've got several more here that I'm combing through to see if they're of any use.

Hopefully, this link works.

Chad said...

I gave the contact info to Reich. So, hopefully they go after the samples and we see something by next year.

Matt said...

Cool to see what you find David. Being able to link ancestry to specific Neolithic groups would be very nice.

In the West Eurasia spreadsheet you've linked above, calculating the euclidean distance in those dimensions to the Tepecik Ciftlik samples and other Early Neolithic, then comparing produces a subtle, but systematic axis across Europe:

(Not sure if this is varying with CHG - it looks possibly like a different pattern?).
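
A minimal Python sketch of the distance comparison described above, for anyone who wants to try it. The coordinates are invented placeholders, not values from the spreadsheet, and the function names are hypothetical. The idea is just to take each modern sample's Euclidean distance to one reference, subtract its distance to another, and look at how the difference varies geographically.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate vectors."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_contrast(sample, ref_a, ref_b):
    """Distance to ref_a minus distance to ref_b.

    Negative values mean the sample sits closer to ref_a.
    """
    return euclidean(sample, ref_a) - euclidean(sample, ref_b)

# Placeholder coordinates (three PCA dimensions), invented for illustration only.
tepecik = [0.10, -0.02, 0.03]
other_neolithic = [0.12, 0.01, -0.01]
moderns = {
    "Pop_A": [0.05, 0.00, 0.02],
    "Pop_B": [0.06, 0.03, -0.02],
}

for name, coords in sorted(moderns.items()):
    print(name, round(distance_contrast(coords, tepecik, other_neolithic), 4))
```

Using the difference of two distances rather than a single distance removes most of the shared "how far is this sample from Neolithic farmers in general" signal, which is presumably why the resulting axis is subtle.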

Rob said...

Yes that was a very interesting paper Chad, thanks
Dave : if Lengyel is important for Poles, what about other Slavs ?

Davidski said...

Lengyel is important for everyone. It's the only Neolithic sample that produces statistically sound models across Europe, except for the Chuvash.

Not really surprising, considering this sample comes from the final Neolithic of East Central Europe, so it probably represents the kind of farmers that the steppe pastoralists met as they fanned out west.

The interesting thing is that Western European populations can be modeled with a variety of Neolithic samples, while Eastern Europeans usually need Lengyel for solid fits. But I won't have time to systematically test this. All I can say is that Neolithic-derived ancestry in Western Europe generally appears more heterogeneous than in Eastern Europe, and this might be one of the factors that differentiates most Western Europeans from Balto-Slavic groups in particular.

EastPole said...

Dave, could you compare Sintashta samples with Lengyel.
Sintashta’s RISE391 mtDNA is N1a1a1a1 and NE7 Lengyel is N1a1a1a.
Which of your Neolithic samples fits Sintashta best?

Davidski said...

I'll have a look later today. Might take me a little while though.

Matt said...

Self quote: (Not sure if this is varying with CHG - it looks possibly like a different pattern?).

Distance on that ancient West Eurasia PCA from Tepecik-Ciftlik vs other Neolithic (peaks SW Europe vs SE Europe and follows an East-West pattern) is distinct from

CHG vs Anatolian Neolithic (peaks SW vs NE Europe and follows a SW-NE axis)


(Note though: in the graphs above, the size of the distances of modern samples along a Tepecik-Ciftlik:Boncuklu axis is only 1/3 or less of the size along the CHG:Boncuklu axis, including far outliers such as Sardinian and Cypriot.)

EastPole said...

Dave, maybe you can check also Corded Ware. Wikipedia writes about Lengyel culture:

“It was preceded by the Linear Pottery culture and succeeded by the Corded Ware culture. In its northern extent, overlapped the somewhat later but otherwise approximately contemporaneous Funnelbeaker culture”.

Sintashta is linked with Corded Ware and Corded Ware originated in Poland following Lengyel culture.

Here you were comparing Corded Ware and Lengyel in Poland:

Rob said...

Wikipedia is inadequate here. The Lengyel culture ended long before CWC. You'd be better off looking at what contemporary Polish scholars write, and their work is abundant and of top quality. For example, see here for proper chronology: "Neolithisation in Polish Territories: Different Patterns, Different Perspectives, and Marek Zvelebil’s Ideas"
The main Neolithic culture in Polish territory before CWC & GAC was the TRB, and some Baden outcrops.
So it's difficult to tell what Dave is picking up with Lengyel being pertinent in Slavs. Perhaps it's because Lengyel was the culture which expanded east of the Carpathians, which is where Slavs expanded from thousands of years later.

Davidski said...

Lengyel is one of the Danubian Neolithic cultures that expanded north, like Rossen, which covered most of Germany. So Wikipedia has it right in that regard, on their map, and it's pretty likely that both Yamnaya and CWC would have run into the descendants of these final Neolithic Danubians.

EastPole said...

Lengyel culture contributed to both TRB and Baden in Poland.

Rob said...

@ Dave & E.P.

I take what the Lorkiewicz paper suggests, but radiocarbon evidence firmly places the TRB culture as having expanded from north Germany. Some have even linked its catalyst to a colonization from Michelsberg north, then expansion back south, west and east. If so, then its origin isn't from Lengyel, but a post-LBK thrice removed.
GAC begins c. 3000 BC, not much before CWC, and is contemporary. Whilst we await its aDNA, surviving >sub-Neolithic< groups in Poland (esp. Kuyavia) probably played a big role. The Lengyel culture, of the "Danubian world", had collapsed 1000 years prior to this. I doubt it played much of a significant role in the history of Poland subsequently. Rather, where it probably survived was in Ukraine, and offshoot cultures there (eg C-T). Thousands of years later, it expanded northwest with the Slavs moving into Poland. There will be little link between Neolithic Polish Lengyel and modern Poles.

EastPole said...

Read the abstract:
“Of greatest importance is the observed link between the BKG and the TRB horizon, confirmed by an independent analysis of the craniometric variation of Mesolithic and Neolithic populations inhabiting central Europe. Estimated phylogenetic pattern suggests significant contribution of the post-Linear BKG communities to the origin of the subsequent Middle Neolithic cultures, such as the TRB”.

BKG is Brześć Kujawski Group of the Lengyel culture

So Eastern TRB/Baden were influenced by Lengyel and from them early Slavs originated by mixing with Steppe(R1a)/Corded Ware

Rob said...

I did read the abstract. I'll wait for GWA

Alberto said...

The "special" thing about the Lengyel sample is probably that it has all its WHG admixture from an eastern type of WHG (very similar to Hungary KO1). In the Bronze Age there seems to be a big resurgence of KO1-like (or even a bit towards Motala) kind of ancestry. This is quite notable in the BA Hungary samples, especially BR1 (I1052) and BR2 (I1504), which can't be well modelled using Loschbour or Bichon (which are more restricted to Central-Western Europe).

In Cassidy et al, the haplotype analysis of BR2 peaked in Poland at 29.67, while in Spanish, for example, it is 27.09 (compared with Loschbour: Polish 32.95, Spanish 35.32).

Alberto said...

Barcin_Neolithic:I0707 84.8 %
Hungary_HG:I1507 13 %
Motala_HG:I0017 1.8 %
Kotias:KK1 0.4 %
Loschbour:Loschbour 0 %

Barcin_Neolithic:I0707 86.4 %
Loschbour:Loschbour 13.4 %
Hungary_HG:I1507 0.2 %
Kotias:KK1 0 %
Motala_HG:I0017 0 %

Anonymous said...


Could that be due to Magdalenian admixture, absent in KO1 but present - or more abundant - in Loschbour? Would Iberia_EN prefer La Brana over Loschbour?

Rob said...

@ Epoch

It's simply because WHG from western Europe admixed into EN Iberia, perhaps because many EEF made it to western Europe with little WHG admixture en route. On the other hand, those reaching Hungary admixed with the local WHGs. I don't think anything too complex or deep is involved.

Anonymous said...


That is what I hinted at too. OTOH CB13 d-stats from Olalde 2015 choose KO1 over Loschbour or La Brana:

Rob said...

Yeah. I hope we get more Mesolithic stuff from Europe soon

Azarov Dmitry said...

I think Davidski is right about this Lengyel admixture. It looks like there was no dramatic change of population between the Alps and Carpathian mountains from Neolithic until Iron Age (Lengyel->Baden->Urnfield->Hallstatt). So this Lengyel admixture could be explained by mixing of early Slavic people (R1a folks) with pops from Eastern Hallstatt (could be I2-DIN folks).

Davidski said...


Using transversion sites because Sintashta is very damaged, the results suggest that Lengyel_LN is the best source of farmer ancestry in Sintashta. I'm getting very similar results for Corded Ware.

Pop chisq tail_prob Lengyel_LN Steppe_EBA Western_HG
Sintashta 7.713 0.738796 0.411 0.5 0.089

Pop chisq tail_prob Barcin_Neolithic Steppe_EBA Western_HG
Sintashta 11.917 0.369916 0.315 0.554 0.131

Pop chisq tail_prob Germany_LBK Steppe_EBA Western_HG
Sintashta 12.343 0.338398 0.328 0.562 0.11

Pop chisq tail_prob Hungary_MN Steppe_EBA Western_HG
Sintashta 15.341 0.167403 0.387 0.534 0.079

Pop chisq tail_prob Iberia_EN Steppe_EBA Western_HG
Sintashta 16.013 0.140652 0.363 0.542 0.095

Pop chisq tail_prob Hungary_LBK Steppe_EBA Western_HG
Sintashta 20.085 0.0441948 0.339 0.576 0.085
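
For readers who want to compare the six models at a glance, here is a small Python sketch that ranks them by tail probability. The numbers are copied from the output above. The 0.05 cutoff is only an assumed convention for illustration, not something qpAdm itself enforces, and the helper names are hypothetical.

```python
# Rows copied from the qpAdm output above: farmer source, chisq, tail_prob,
# then the proportions for the farmer source, Steppe_EBA and Western_HG.
models = [
    ("Lengyel_LN", 7.713, 0.738796, 0.411, 0.5, 0.089),
    ("Barcin_Neolithic", 11.917, 0.369916, 0.315, 0.554, 0.131),
    ("Germany_LBK", 12.343, 0.338398, 0.328, 0.562, 0.11),
    ("Hungary_MN", 15.341, 0.167403, 0.387, 0.534, 0.079),
    ("Iberia_EN", 16.013, 0.140652, 0.363, 0.542, 0.095),
    ("Hungary_LBK", 20.085, 0.0441948, 0.339, 0.576, 0.085),
]

ALPHA = 0.05  # assumed cutoff, a common convention rather than a qpAdm rule

def rank_models(rows):
    """Sort models from best to worst fit (highest tail probability first)."""
    return sorted(rows, key=lambda r: r[2], reverse=True)

def proportions_sum_to_one(row, tol=0.002):
    """Check that the three ancestry proportions add up to about 1."""
    return abs(sum(row[3:]) - 1.0) <= tol

for source, chisq, tail, farmer, steppe, whg in rank_models(models):
    verdict = "passes" if tail >= ALPHA else "fails"
    print(f"{source:18s} chisq={chisq:7.3f} tail_prob={tail:.4f} {verdict}")
```

With that cutoff only the Hungary_LBK model falls below the line, while Lengyel_LN has the lowest chisq and the highest tail probability.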

Matt said...

Still with my tangent, trying to use the West Eurasia PCA with 4mix* to model fine structure differences between Europeans in different types of Neolithic-Chalcolithic ancestry.

I found that the following two models seemed to spatially reproduce European only PCA (run on the above PCA) and neighbour joining trees quite well (with direction preserved better than distance).

Yamnaya, Iberia_EN, Iberia_Chalcolithic, Tepecik-Ciftlik:

Produces a Northeast-Northwest European cline in Yamnaya->Iberia Chalcolithic, with Central and Southern European populations mainly deflected from local Northern Europeans towards Tepecik Ciftlik, with the exception being Sardinians who deflect to Iberia_EN.

Yamnaya, LBK_EN, Iberia_Chalcolithic, Tepecik-Ciftlik:

Still produces a Northeast-Northwest European cline in Yamnaya->Iberia Chalcolithic, however in this instance mainly Italian and Greek populations are deflected from local Northern Europeans towards Tepecik-Ciftlik, while Central Europeans and particularly Balkan populations and Basques tend to be deflected towards LBK_EN.

Average of both:

Fits are weakest for Baltic populations, as these models could probably benefit from the addition of extra HG-rich Neolithic populations for that region (to contribute to their ancestry in place of some of the Yamnaya they get).

(Trying to use more distant populations like EHG, WHG, Iberia_EN, Tepecik-Ciftlik didn't work so well, since it overdifferentiated some populations and didn't click with some affinities)

Using Germany_MN with Yamnaya, Iberia_Chalcolithic, Tepecik-Ciftlik: produces a stronger distinct affinity between Germany_MN and modern day Central European than LBK does, and excludes Iberia_Chalcolithic from Europeans other than the Atlantic facade. It is weaker at reproducing PCA shape, so I haven't included that.

Combining the average of the models with Yamnaya, Iberia_Chalcolithic, Tepecik-Ciftlik and (Iberia_EN, Germany_MN, LBK_EN) produces a surprisingly very smooth correlation of each of the populations with their closest present day geographical neighbours:

Yamnaya: peaks NE Europe, similar levels in NW and Northern Balkans, very low SW Europe

Germany_MN: peaks Germany, Austria, North Italy and Czechoslovakia, similar levels in NW, NE and Balkan Europe, low at the peripheries.

LBK_EN: similar distribution to Germany_MN, with instead a peak in the Balkans

Iberia_EN: essentially non-existent outside Sardinia (and curiously the Albanian language isolate).

Iberia_Chalcolithic: high across entire Atlantic facade, with a peak in Spain, lowest in SE Europe

Tepecik - Ciftlik: Peak in SE Europe, nadir in British Isles, Baltic and Scandinavia, trace level in Central Europeans

(In this final toy model, percentages are not likely to be accurate, even if the relative shifts are in the right directions).

*nMonte might have been better than 4mix, as there are limits to using 4 as in above and I had to use averages as I extended, but I don't have a script for running mass comparisons.
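
To make the general idea behind this kind of modelling concrete, here is a rough Python sketch. It is not the actual 4mix or nMonte code, just a brute-force search under stated assumptions: the target's PCA coordinates are approximated by a weighted average of the source coordinates, with weights in fixed percentage steps that sum to 100, and the best model is the one with the smallest Euclidean distance. All coordinates and names are invented placeholders.

```python
import math
from itertools import product

def mix(sources, weights):
    """Weighted average of source coordinate vectors."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]

def fit_mixture(target, sources, step=5):
    """Brute-force search over percentages (multiples of `step`) summing to 100.

    `step` should divide 100. Returns (best_weights, best_distance),
    where the weights are fractions between 0 and 1.
    """
    n = len(sources)
    best_weights, best_dist = None, math.inf
    levels = range(0, 101, step)
    for head in product(levels, repeat=n - 1):
        last = 100 - sum(head)  # the last source takes whatever is left
        if last < 0:
            continue
        weights = [p / 100 for p in head + (last,)]
        dist = math.dist(target, mix(sources, weights))
        if dist < best_dist:
            best_weights, best_dist = weights, dist
    return best_weights, best_dist

# Invented placeholder coordinates for four source populations (three dimensions).
sources = [
    [1.0, 0.0, 0.0],  # hypothetical Source_A
    [0.0, 1.0, 0.0],  # hypothetical Source_B
    [0.0, 0.0, 1.0],  # hypothetical Source_C
    [0.0, 0.0, 0.0],  # hypothetical Source_D
]
# A synthetic target built as a known 50/25/25/0 mixture of the sources.
target = mix(sources, [0.5, 0.25, 0.25, 0.0])

weights, dist = fit_mixture(target, sources)
print(weights, dist)
```

The cost of the exhaustive search grows quickly with the number of sources and with finer steps, which is one reason a random-search approach can be more practical once you go beyond four populations.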

Matt said...

nMonte analysis with the PCA and 6 groups: Tepecik-Ciftlik_Neolithic, Yamnaya, Germany_MN, LBK_EN, Iberia_Chalcolithic, Iberia_EN:

Similar to the 4mix averages:
Tepecik-Ciftlik_Neolithic peaks in Southeast Europe (70-50% Southern Italy to Greece) and is last found in substantial amounts in North Italy and Croatia (13-8%)

Yamnaya peaks in Northeast Europe and is present everywhere, with a nadir in Sardinian (7.5%)

Germany_MN peaks in... Germany (50.5%) and drops out from there

Iberia_Chal is essentially only present in populations along the Atlantic facade (80% Basque, 50% Spanish, 25% England-Norway)

LBK_EN and Iberia_EN have very limited contribution independent of other components - only present in Kosovar (11%) and Sardinian (61.5%).