Saturday, July 2, 2016

Ulan IV

If Indo-Iranian languages didn't expand from the Andronovo horizon, but rather from an earlier archaeological steppe culture, which is what it seems like based on the latest analysis of ancient genomes from the steppe (see page 123 here), then I reckon the best option is the Catacomb Culture.

As far as I can tell, one of the Yamnaya samples from Allentoft et al. 2015, RISE552 from the Ulan IV burial, might actually be a Catacomb sample. That's because Ulan IV is classified as a West Manych Catacomb Culture site. Check out this awesome paper on one of the graves from this site here.

Here's a qpAdm model of the Kalasha from the Hindu Kush featuring Ulan IV RISE552, based on over 200K SNPs:


Ulan_IV 0.609 ± 0.051
Iran_Neolithic 0.184 ± 0.066
Andamanese_Onge 0.175 ± 0.041
Han 0.032 ± 0.023

I'm not saying this model is definitive by any stretch, but it's more or less statistically sound, with fairly low standard errors for each of the coefficients (0.051, 0.066, 0.041, 0.023 respectively). It's also very similar to the optimal qpAdm model of the Kalasha in Lazaridis et al. 2016.
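As a quick sanity check on those numbers, here's a minimal Python sketch that sums the coefficients and divides each one by its standard error. The figures are copied from the model above. The helper name and the idea of reading coefficient/SE as a rough distance from zero are my own illustration, not part of qpAdm's output.

```python
# Coefficients and standard errors copied from the qpAdm model above.
model = {
    "Ulan_IV": (0.609, 0.051),
    "Iran_Neolithic": (0.184, 0.066),
    "Andamanese_Onge": (0.175, 0.041),
    "Han": (0.032, 0.023),
}


def summarize(model):
    """Return the coefficient total and the coefficient/SE ratio per source.

    Hypothetical helper for illustration only; it is not a qpAdm function.
    """
    total = sum(coef for coef, _ in model.values())
    ratios = {name: coef / se for name, (coef, se) in model.items()}
    return total, ratios


total, ratios = summarize(model)
print(f"sum of coefficients: {total:.3f}")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} standard errors from zero")
```

The coefficients sum to 1, and only the Han term sits within two standard errors of zero, which fits with it being the smallest and least certain part of the model.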

Interestingly, it also matches closely a TreeMix analysis that I posted at my other blog last year, months before I even knew that ancient genomes from Neolithic Iran were on the way (see here). This is what I said in that blog entry:

Both of these models are correct; they just show the same thing in different ways. So if we mesh them together the Kalash and Pathans come out ~65% LNE/EBA European (which includes substantial Caucasus or Caucasus-related ancestry), ~12% ASI, and ~23% something as yet undefined.

If I had to guess, I'd say the mystery ~23% was Neolithic admixture from what is now Iran.

That's not bad considering how difficult it is to make predictions about ancient population movements without direct evidence from ancient DNA. In any case, it's a lot better than what has been published on the topic in some major journals.


Davidski said...

Ha, last paragraph here...

Helgenes50 said...

This Catacomb culture is also very close, geographically, to the western Yamnaya who are supposed to have appeared on the Hungarian plain, i.e. the ancestors of the Western Europeans.

Rob said...

Interesting paper, although nothing new

"In Eastern Europe, the first wagons were made by the Majkop population living in the northern Caucasus"

It would seem the Caucasus wives hauled them in, bringing elements of their dowry along for the ride

Helgenes50 said...

RISE 552 ( Asi K8)

3.91% Amerindian
0.02% Siberian
51.68% Euro_HG
0.03% Oceanian
0.80% Sub-Saharan
0.00% Southeast_Asian
5.08% LBK
38.48% South-Central_Asian

Rami said...

The Lazaridis paper says 43% of EMBA Steppe is derived from a population similar to Chalcolithic Iranians. I would infer that Central Asian agriculturalists would also already be rich in that component, as their culture is similar to those found around the Caspian.

Davidski said...

Doubt it.

Stats show that ANI is a mix of NEOLITHIC Iranians and steppe pastoralists.

Rob said...

What happens if SAs are modelled as Iran Chalcolithic or Armenian Chalcolithic?

Rob said...

This sample, Ulan IV, is from Kurgan 4, grave 8. Its mean calibrated date is 2500 BC, so it's in the transitional period between Yamnaya and Catacomb-Manych, which I guess is why the authors still put it as "Yamnaya" (Supp Table 1, Allentoft).

Curiously, this Ulan IV RISE552 sample is the one which was I2a2!

But IIRC most of the data from Allentoft was Yamnaya, then Srubnaya, with little of the intervening Catacomb culture.

The origins of the Catacomb rite itself are still debated. Some catacomb burials appear in southern Poland, within the Sandomierz-Krakow groups of CWC and the Zlota culture, but the details of construction, positioning of the deceased and accompaniments differ. Some have also proposed links to dolmen burials in the contemporary North Caucasus group.

So it appears to be a distinct local innovation in the Black Sea steppe (?)

Davidski said...

The majority of South Asians can't be modeled successfully without Bronze Age steppe reference samples.

In other words, modeling them as Armenia_Chalcolithic, Armenia_EBA or Iran_Chalcolithic plus Onge/Han doesn't work well at all.

But it's possible to include the ancient Armenian samples and Iran_Chalcolithic in successful models that are mostly based in varying degrees on the Bronze Age steppe and Iran_Neolithic references.

Rob said...

Ok Thanks.
Plus there's the need to link in Z93.

Shaikorth said...

If Natufians have archaic OOA and not African they perhaps should show increased affinity to Papuans relative to Iran_N instead of the opposite (and not have the relative African-Papuan pattern repeat with Onge)?

This is because there's an upcoming study saying Papuans have early AMH admixture which manifests as increased haplotype sharing with Africans. Some time ago I noticed this sharing with SSA happened even in Anders Pålsen's chromopainter runs using lower coverage genomes (happened with Paniyas etc. too, though Papuans were the extreme).

"Previous human genetic studies, based on sampling small numbers of populations, have supported a recent out-of-Africa dispersal model with minor additional input from archaic humans. Here, we present a novel dataset of 379 high-coverage human genomes from 125 populations worldwide. The combination of high spatial and genomic coverage enabled us to refine current knowledge of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size. The examined Papuan genomes (hereafter taken as representative of the broader Sahul region) show an excess of haplotype sharing with Africans. This is compatible with traces of an early and otherwise extinct non-African lineage in the Papuan genome, evidence of which was so far only available in the Western Asian fossil record. Our tests of positive and balancing selection highlight a number of new metabolism- and immunity-related loci as candidates for local adaptation."

Shaikorth said...

Papuan distance increases from Africans and indeed all AMH's, but that can be expected because of their Denisovan. If Natufians share ancestry with Papuans from the same archaic OOA clade that should bring them together vs others, and in theory show in their fst-ratios too. Or maybe we'll have to think of several "african-looking" early AMH expansions that almost died out.

It would be really interesting to see Natufian formal stats with high coverage genomes in any case. The z-score for EastAsian French Dinka Chimp changed from <-1 to >2 when switching from Human Origins to high coverage (Wong et al).

Alberto said...


It seems we're discussing different things. Ust-Ishim, Vestonice or Kostenki are corner cases, and not really relevant. Maybe not having the right columns for them makes their behaviour weird. The 45 ky of extra drift that modern Eurasians don't share with Ust-Ishim is, in any case, drift away from Africans, so maybe not so weird after all.

For the main point about measuring ancestry in "normal" populations, I'd be glad to discuss it privately if you want. It would be long and complicated to do it here.

Matt said...

@ Ryu:

Re: Levantines have a different fst ratio between Papuans and Africans than other Eurasians, but this need not mean that they are closer to Africans compared to other Eurasians

I dropped a comment on Maju's blog about how it seems like high drift in one branch of a clade would converge the fst scores between that branch and different outgroups, relative to another branch from the same clade that was not as drifted. In theory, high drift makes the relative fsts to each of the outgroups more a function of the drift than of any branching order or admixture.

Re: aDNA in North Africa I ref'd the abstract as - "The genomic enigma of two Medieval North Africans" Gunther along with some other abstracts on

Samuel Andrews said...

"If Indo-Iranian languages didn't expand from the Andronovo horizon, but rather from an earlier archaeological steppe culture, which is what it seems like based on the latest analysis of ancient genomes from the steppe (see page 123 here)"

We shouldn't take the preference of Yamnaya over Andronovo for South Asians in Laz 2016 very seriously.

Samuel Andrews said...


Read my blog for anything about mtDNA.

The ancient Moroccan mtDNA is from an old study, so the results might not be legitimate. They have no U5, U4 or U2. Most are undefined R, two are R0a, two are HV0, and two might be U6d3.

Davidski said...

We shouldn't take the preference of Yamnaya over Andronovo for South Asians in Laz 2016 very seriously.

We shouldn't accept it just yet as fact, because pop genetics methods constantly improve and thus so do inferences, but we should certainly take it as a serious possibility.

Shaikorth said...


If we assume that drift is added linearly and things with Natufians and other Eurasians are like in your example (no SSA to Natufians and no Papuan to everyone but them), the Mbuti/Papuan ratio should approach 1 but never reach or exceed it for non-Natufians (it does) and we'd have to assume Natufians are less drifted because their ratio is lowest. The latter is not the case as we can see their absolute fst-distances are high compared to other populations. This may be artificial drift caused by low coverage but the point stands.

If we reduce an arbitrary number from Natufian and BedouinA's (or Yoruba's) distances to Papuan and Mbuti, the ratio stays below 1 but if we do the same to Iraqi Jews or Iran_N or Irish it stays above 1.
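To make that arithmetic concrete, here's a minimal Python sketch. The fst values and population labels are hypothetical, picked only to show the pattern; they are not taken from any real dataset. It subtracts the same amount from both distances and recomputes the Mbuti/Papuan ratio.

```python
def mbuti_papuan_ratio(fst_mbuti, fst_papuan, removed=0.0):
    """Mbuti/Papuan fst ratio after subtracting the same amount from both."""
    return (fst_mbuti - removed) / (fst_papuan - removed)


# Hypothetical fst values (to Mbuti, to Papuan), for illustration only.
pops = {
    "ratio_below_one": (0.20, 0.22),
    "ratio_above_one": (0.20, 0.18),
}

# Amounts removed from both distances; all smaller than either fst.
removed_amounts = (0.0, 0.05, 0.10)

ratios = {
    name: [mbuti_papuan_ratio(m, p, r) for r in removed_amounts]
    for name, (m, p) in pops.items()
}

for name, values in ratios.items():
    print(name, [round(v, 3) for v in values])
```

Whatever amount is removed, as long as it's smaller than both distances, a ratio that starts below 1 stays below 1 and a ratio that starts above 1 stays above 1. The ratio only moves further from 1 as the shared amount shrinks.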

Matt said...

@ Ryu: Yes, that's pretty much what I'm trying to say. In comparing the ratios of fsts, two effects are present: private drift on a branch making the outgroups more relatively equal, and phylogenetic similarity.

The phylogenetic effect dominates I think. (And I think the cause of the phylogenetic pattern probably is a "Basal Eurasian" style split along the lines of RK's speculation IIUC?).

I would just tend to use something like Mbuti-Papuan though, as the common effect of drift adding to both fst to Mbuti and Papuan would cancel out better, and the rank order makes more sense for ancient and also modern populations. But comparing fst with different outgroups is not invalid to get a sense of phylogeny. That's not what I am trying to say.
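Here's a small Python sketch of why the subtraction cancels shared drift while the ratio doesn't. It assumes, purely for illustration, that private drift adds the same amount linearly to both fsts, and the base values are hypothetical.

```python
def compare(fst_mbuti, fst_papuan, extra_drift):
    """Add the same private drift to both fsts; return (difference, ratio).

    Assumes drift is additive on fst, which is a simplification.
    """
    m = fst_mbuti + extra_drift
    p = fst_papuan + extra_drift
    return m - p, m / p


# Hypothetical base fsts to Mbuti and Papuan before any private drift.
base_mbuti, base_papuan = 0.20, 0.18

results = {d: compare(base_mbuti, base_papuan, d) for d in (0.0, 0.05, 0.20)}

for drift, (difference, ratio) in results.items():
    print(f"drift={drift:.2f} difference={difference:.3f} ratio={ratio:.3f}")
```

Under that assumption the Mbuti-Papuan difference stays at 0.02 however much drift is added, while the Mbuti:Papuan ratio slides from about 1.11 towards 1.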

IMG of comparing Mbuti-Papuan fst to the Mbuti:Papuan fst ratio:

(WHG is below (more Mbuti) compared to LNBA Europe and steppe populations by ratio method, not by subtractive method. I think this is the effect of high drift scaling in the ratio method.).

Arch Hades said...

So where does all the R1a in modern Indo-Iranians come from? Because Allentoft's Yamnaya samples are R1b.

Shaikorth said...

Matt, could you add BedouinA (not drifted, presumed to have SSA), BedouinB (drifted with assumed SSA), South Italian (drifted) and Sicilian (not drifted) to the graph?

Matt said...


( with some recent North-Central Europeans and Sardinians thrown in)

However, the difference in rank order in subtraction vs ratio normed against these ancients would require *very* high FSTs to generate. It seems to me to be present in Euro HGs, but they have an enormous degree of drift (whereas a pop like South Italian is only likely to be highly drifted / isolated relative to recent populations). That said, the recent populations (Sicilian, Lithuanian, Polish, English) do tend to be in the opposite position relative to HGs in terms of ratio relative to subtraction.

Matt said...

Increasingly OT, but: a few neighbour joining dendrograms based on a) raw fst distances to outgroups (top), b) fst distances to outgroups normalised against Mbuti by division (middle), c) fst distances to outgroups normalised against Mbuti by subtraction (bottom):

You can see that some kind of normalisation is required to recreate a dendrogram that reflects what we know about the population relationships (as the top doesn't really work at all, mostly just clustering populations together according to how drifted they are and tendency to have high fsts). I think the subtraction normalised dendrogram / clustering works best to place SHG and WHG logically (obvious limits from using outgroups, as always).

Shaikorth said...

On the graphs, one thing worth noting is Natufians despite obviously having high drift like Euro HG's are also on their opposite side.

huijbregts said...

@ Ryu

I looked into the consequences of a dimension reduction on West-Eurasians. (D-stats3b, without Remedello_BA).
I kept 4 dimensions, which explain 96% of the variance. The higher dimensions are dominated by a single row or column.
A cluster analysis with PAM returned 8 clusters (labels are mine):
- North_African(3)
- Villabruna(6)
- Levant_Neolithic(7)
- Euro_Neolithic(11)
- EHG(12) including steppe
- South_Asia(17)
- Caucasus(26)
- modern European(35) including Bell Beaker etc.
Two pops were oddly classified: GoyetQ111 in South_Asia and Vestonice in modern Europe; maybe they form their own cluster.
Lebanese_Muslim was classified as Levant_Neolithic and Lebanese_Christian was classified as Caucasus.
I think this may be close to the maximum resolution you can demand in PCA and nMonte.
Of course one can see a lot more detail in a PCA, but the danger is just seeing one's pet theory.

Gill said...

If we recover DNA from Neolithic Balochistan/Indus Valley and if it's even more ANE shifted than Iran_N, then an Iran_Chl-like population will be required to explain the final result. It's not unlikely a Steppe population could have picked up such admixture on its way in (via Iran or South Central Asia) during the Bronze Age.

Olympus Mons said...

OT: Loved the map in this post. It's the best hypsometric I have seen. It shows why the populations moved/settled in the Caucasus.

Michael Ellsworth said...

Again returning to David's initial point of the unusually Yamnaya-like character of ANI, should we not expect the Indo-Aryans to have picked up some more strictly Yamnaya-like ancestry in their migration to India? Most models have the Indo-Aryans running ahead of the Iranians in the move south, and as they did so, they ran into Afanasevo people. From the samples so far, Afanasevo is basically indistinguishable from Yamnaya. So what happens if we imagine that the Indo-Aryans were admixed Corded Ware, late Western Afanasevo, and Iranian Neolithic? I understand that Corded Ware and Yamnaya are close, so it might be hard to untangle an admixture between them, but it seems to me like extra Yamnayishness is what we expect.

Davidski said...


You might be right. It's useful to note that one of the outlier Srubnaya samples was classified by Lazaridis et al. as Steppe_EMBA, because it was more similar to Yamnaya than to Andronovo, Sintashta and Srubnaya, in other words the Steppe_MLBA samples.

Considering the patchy sampling of the Bronze Age steppe to date, future sampling might uncover the persistence of whole Yamnaya-like populations in Andronovo and Srubnaya territories.

However, if more Catacomb samples are sequenced, and it turns out that they're excellent proxies for steppe admixture in South Asians, like Ulan IV RISE552 clearly is, and some of them belong to basal clades of R1a-Z93, then that would make things very interesting.