Saturday, July 2, 2016

Ulan IV

If Indo-Iranian languages didn't expand from the Andronovo horizon, but rather from an earlier archaeological steppe culture, which is what it seems like based on the latest analysis of ancient genomes from the steppe (see page 123 here), then I reckon the best option is the Catacomb Culture.

As far as I can tell, one of the Yamnaya samples from Allentoft et al. 2015, RISE552 from the Ulan IV burial, might actually be a Catacomb sample. That's because Ulan IV is classified as a West Manych Catacomb Culture site. Check out this awesome paper on one of the graves from this site here.

Here's a qpAdm model of the Kalasha from the Hindu Kush featuring Ulan IV RISE552 based on over 200K SNPs:


Ulan_IV 0.609 ± 0.051
Iran_Neolithic 0.184 ± 0.066
Andamanese_Onge 0.175 ± 0.041
Han 0.032 ± 0.023

I'm not saying this model is definitive by any stretch, but it's more or less statistically sound, with fairly low standard errors for each of the coefficients (0.051, 0.066, 0.041, 0.023 respectively). It's also very similar to the optimal qpAdm model of the Kalasha in Lazaridis et al. 2016.
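
A quick way to sanity-check those numbers (a back-of-envelope sketch, not part of the qpAdm output): the coefficients should sum to 1, and dividing each coefficient by its standard error gives a crude idea of how far it sits from zero. The dictionary below just restates the table above; the variable names are made up for illustration.

```python
# Coefficients and standard errors copied from the qpAdm model above.
model = {
    "Ulan_IV": (0.609, 0.051),
    "Iran_Neolithic": (0.184, 0.066),
    "Andamanese_Onge": (0.175, 0.041),
    "Han": (0.032, 0.023),
}

# Mixture proportions should add up to 1 (within rounding).
total = sum(coef for coef, _ in model.values())

# Coefficient divided by standard error: a rough guide only,
# not a substitute for qpAdm's own fit statistics.
ratios = {name: coef / se for name, (coef, se) in model.items()}

print(f"sum of coefficients: {total:.3f}")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} standard errors from zero")
```

On this crude measure the Han coefficient is the only one within two standard errors of zero, so it carries the least weight in the model.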

Interestingly, it also matches closely a TreeMix analysis that I posted at my other blog last year, months before I even knew that ancient genomes from Neolithic Iran were on the way (see here). This is what I said in that blog entry:

Both of these models are correct; they just show the same thing in different ways. So if we mesh them together the Kalash and Pathans come out ~65% LNE/EBA European (which includes substantial Caucasus or Caucasus-related ancestry), ~12% ASI, and ~23% something as yet undefined.

If I had to guess, I'd say the mystery ~23% was Neolithic admixture from what is now Iran.

That's not bad considering how difficult it is to make predictions about ancient population movements without direct evidence from ancient DNA. In any case, it's a lot better than what has been published on the topic in some major journals.


Davidski said...

Ha, last paragraph here...

Helgenes50 said...

This catacomb culture is also very close, geographically, to the western Yamnaya who are supposed to have appeared in the Hungarian plains, i.e. the ancestors of the Western Europeans

Rob said...

Interesting paper, although nothing new

"In Eastern Europe, the first wagons were made by the Majkop population living in the northern Caucasus"

It would seem the Caucasus wives hauled them in, bringing elements of their dowry along for the ride

Helgenes50 said...

RISE 552 ( Asi K8)

3.91% Amerindian
0.02% Siberian
51.68% Euro_HG
0.03% Oceanian
0.80% Sub-Saharan
0.00% Southeast_Asian
5.08% LBK
38.48% South-Central_Asian

Rami said...

The Lazaridis paper says 43% of EMBA Steppe is derived from a population similar to Chalcolithic Iranians. I would infer Central Asian agriculturalists would also already be rich in that component, as their culture is similar to those found around the Caspian.

Davidski said...

Doubt it.

Stats show that ANI is a mix of NEOLITHIC Iranians and steppe pastoralists.

Rob said...

What happens if you try to model SAs as Iran Chalcolithic or Armenian Chalc?

Rob said...

This sample, Ulan IV, is from Kurgan 4, grave 8. Its mean calibrated date is 2500 BC, so it's in the transitional period between Yamnaya & Catacomb - Manych, which I guess is why the authors still put it as "Yamnaya" (Supp Table 1, Allentoft).

Curiously, this Ulan IV Rise 552 sample is the one which was I2a2 !

But IIRC most of the data from Allentoft was Yamnaya, then Srubnaya, with little of the intervening Catacomb culture.

The origins of the Catacomb rite itself are still debated. Some catacomb burials appear in southern Poland, within the Sandomierz - Krakow groups of CWC & Zlota culture, but the details of construction, positioning of the deceased & accompaniments differ. Also, some have proposed links to Dolmenic burials in the contemporary Nth Caucasus group.

So it appears to be a distinct local innovation in the Black Sea steppe (?)

Davidski said...

The majority of South Asians can't be modeled successfully without Bronze Age steppe reference samples.

In other words, modeling them as Armenia_Chalcolithic, Armenia_EBA or Iran_Chalcolithic plus Onge/Han doesn't work well at all.

But it's possible to include the ancient Armenian samples and Iran_Chalcolithic in successful models that are mostly based in varying degrees on the Bronze Age steppe and Iran_Neolithic references.

Rob said...

Ok Thanks.
Plus there's the need to link in Z93.

ryukendo kendow said...

Just an idea:

Shaikorth is completely right when he (guessing you're a guy) points out that the branches leading to Levantine Neolithic and Natufian do not lead to any shared drift with either of Mota, Yoruba, or Mbuti; i.e., they come out from the trunk instead of the branches that lead to the African populations. Except in the first treemix, which anyway the algorithm then adjusts more finely when the number of branches is increased. The overall effect of such admixture edges is that Levantines are further away from Eurasians than Eurasians are to each other, but no closer to Africans than other Eurasians are. Assuming that this were actually the case, we solve multiple problems:

Levantines do not show increased shared drift with any Africans compared to Eurasians, and this makes sense as the position of their admixture edges means that they and Eurasians are perfectly symmetrical in their relationships with Africans.

Levantines have a different fst ratio between papuans and Africans than other Eurasians, but this need not mean that they are closer to Africans compared to other Eurasians; this could mean that they are further away from other Eurasians, such as Papuans, than the average Eurasian is, while being no closer to Africans (we can't derive this second part--that they are no closer to Africans--from fsts, which are sensitive to drift path length, but we can get this from Ds, which are pointing in that direction.)

In the fits for levantines in D stat nMonte variation space, levantines and Iran-N receive both African and Munda/Papuan percentages, which suggests that the distance to Eurasians and Africans is being finely adjusted, not just decreased distance to Africans. In PCA, levantines are displaced in PC1 but are not part of the stream of Eurasian populations leading to Africans because they are also displaced in PC3. Other populations displaced in both PC1 and PC3 from the Eurasian cline include Kostenki, Ust-Ishim, Oase, Papuans, etc, suggesting Natufians likewise have some kind of 'weird' ancestry outside the range of modern Eurasian variation.

F3s with CHG are exceptionally low around the Levant and Eastern Mediterranean, even in populations with no African ancestry like Lebanese Christians or Greeks, which I suggested before was due to 'Southwest Asian'.

All these can be explained if Natufians have some pre-OoA ancestry, which can branch off from either between Mbuti and Yoruba, Yoruba and Mota, or Mota and Basal Eurasian--the important thing is that this ancestry shares no drift with later West Africans, East Africans or African HGs, being a parallel branch of its own. Possibly in North Africa.

The biggest probability, I think, is that this population also mixed with WHG or paleo-WHG or even pre-West Eurasian-HG types, having West Eurasian mtDNA, in N Africa and contributed the non-Basal ancestry of Natufians; Natufians are actually less basal than Iran_N, explaining their exceptionally high shared drift to Ust-Ishim for their level of shared drift with later Crown Eurasians, and have quite a bit of Neanderthal ancestry from Europe through North Africa.

Anyway, I remember reading a news article about an upcoming aDNA paper in N Africa from the 14th century (?) that showed a population of people that, unexpectedly, were described best as a mix of S African HGs and Mediterranean Europeans and doesn't match anyone today, but I can't find it anywhere, (does anyone remember such a thing?) so I expect lots of peri-OoA weirdness in N and E Africa.

ryukendo kendow said...

@ Huijbregts

Huij, I agree that it's not clear how the Iberia_EN vs Anatolia_EN distinction we get in basic processing of the variance in the columns can be preserved in the higher dimensions, but such tiny fluctuations must have been preserved *somehow*, because the fine distinctions of source populations are retained in Orthogonalized nMonte such that the identity and percentages of source populations are almost exactly the same. Of course we can't expect to visually recover the Iberia_EN Anatolia_N distinctions that we see in a direct plot of the columns, in any of the higher dimensions, because any such pattern of variation is probably chopped up and mixed up across multiple dimensions; it may even be stacked into patterns that load on outliers that then dominate single dimensions.

Anyway, I have an upcoming experiment to post later, which will show that the information has somehow been preserved, but across multiple dimensions instead of just one.

@ Alberto

Alberto, I agree with Matt that this method is not really genealogically informative, at least not in the 'who shared drift with who' way, because after all what nMonte datasheets are doing is positioning all the row populations between the column populations in variation space, where the variance is in shared drift with column populations, so everything depends on what populations we define as columns. For example, I fully expect that, if you only include West Eurasian HGs as columns, then Asians will turn out to be in between Europeans and Africans, because the columns are measuring shared drift with WHG/ANE and by this measure Asians are intermediate between Europeans and Africans. Similarly, since the column populations we are using now are mostly modern crown Eurasians, populations like Ust Ishim who have the same drift with Africans as modern Eurasians do, but less drift with all the modern Crown Eurasian column populations than actual modern Europeans and Asians, will end up in the middle of the cloud of populations and thus closer to Africans than other modern Eurasians, when in reality this is not the case. Same for Kostenki, Vestonice, et cetera. In fact, this explains Mal'ta's weird behaviour in nMonte too, as we now know it has ~25,000 years less of shared drift with almost all the column populations that more recent genomes have, and since AG3's contribution is now known to extend from Siam to Somalia, from the Andes to the Alps, MA-1 tends to be in the center of the cloud, when more recent HGs and Europeans/Asians are at the corners of it.

Shaikorth said...

If Natufians have archaic OOA and not African they perhaps should show increased affinity to Papuans relative to Iran_N instead of the opposite (and not have the relative African-Papuan pattern repeat with Onge)?

This is because there's an upcoming study saying Papuans have early AMH admixture which manifests as increased haplotype sharing with Africans. Some time ago I noticed this sharing with SSA happened even in Anders Pålsen's chromopainter runs using lower coverage genomes (happened with Paniyas etc. too, though Papuans were the extreme).

"Previous human genetic studies, based on sampling small numbers of populations, have supported a recent out-of-Africa dispersal model with minor additional input from archaic humans. Here, we present a novel dataset of 379 high-coverage human genomes from 125 populations worldwide. The combination of high spatial and genomic coverage enabled us to refine current knowledge of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size. The examined Papuan genomes (hereafter taken as representative of the broader Sahul region) show an excess of haplotype sharing with Africans. This is compatible with traces of an early and otherwise extinct non-African lineage in the Papuan genome, evidence of which was so far only available in the Western Asian fossil record. Our tests of positive and balancing selection highlight a number of new metabolism- and immunity-related loci as candidates for local adaptation."

ryukendo kendow said...

If Natufians have archaic OOA and not African they perhaps should show increased affinity to Papuans relative to Iran_N instead of the opposite (and not have the relative African-Papuan pattern repeat with Onge)?

Why would this be the case though, especially if the pre-OoA clades are parallel? The signal in Papuans and Africans is discovered in similarity in Haplotypes, not in unlinked snps used in formals, are they not? In fact in unlinked snps Papuans are further away from Africans than most Eurasians are.

In unlinked snps used in formals pre-OoA input that does not share drift with late Africans in the Paleo-African, W African and Mota clades should bring populations further away from Eurasians without increasing similarity to Africans, which is what the situation for Natufians seems to be.

Shaikorth said...

Papuan distance increases from Africans and indeed all AMH's, but that can be expected because of their Denisovan. If Natufians share ancestry with Papuans from the same archaic OOA clade that should bring them together vs others, and in theory show in their fst-ratios too. Or maybe we'll have to think of several "african-looking" early AMH expansions that almost died out.

It would be really interesting to see Natufian formal stats with high coverage genomes in any case. The z-score for EastAsian French Dinka Chimp changed from <-1 to >2 when switching from Human Origins to high coverage (Wong et al).

ryukendo kendow said...

@ Shaikorth

I would expect the Papuan phenomena to be due to an entirely different clade, very unlikely that the clade was the same in East Asia and Oceania with the other half left behind in N Africa, and the newest clade occupied all the areas in between, phylogeographically unlikely.

Lol now that you mention it I actually remember reading that post by Anders, but I was really young then and had very little real understanding. It's really interesting that you can do a chunk-based analysis on just 200,000 SNPs with avg chunk size 13. Not being so familiar with this kind of analysis statistically, I would have thought the coverage would need to be much higher.

On another note, here is an old paper which seems little-referenced, about mtDNA in Mechtoids and Iberomaurusian, almost entirely West Eurasian and European HG-like:'etude_du_peuplement_de_l'Afrique_du_Nord

42.8% (9/21) H or U
14.2% (3/21) JT
9.5% (2 individuals) U6

Alberto said...


It seems we're discussing different things. Ust-Ishim, Vestonice or Kostenki are corner cases, and not really relevant. Maybe not having the right columns for them make their behaviour weird. The 45 ky of extra drift that modern Eurasians don't share with Ust-Ishim is, in any case, drift away from Africans, so maybe not so weird after all.

For the main point about measuring ancestry in "normal" populations, I'd be glad to discuss it privately if you want. It would be long and complicated to do it here.

Matt said...

@ Ryu:

Re: Levantines have a different fst ratio between papuans and Africans than other Eurasians, but this need not mean that they are closer to Africans compared to other Eurasians

I dropped a comment on Maju's blog about how it seems like high drift in a branch of clade would converge fst scores between that branch and different outgroups, relative to another branch from the same clade that was not as drifted - In theory high drift making the relative FSTs to each of the outgroups more of a function of the drift, rather than any branching order / admixture.

Re: aDNA in North Africa I ref'd the abstract as - "The genomic enigma of two Medieval North Africans" Gunther along with some other abstracts on

Samuel Andrews said...

"If Indo-Iranian languages didn't expand from the Andronovo horizon, but rather from an earlier archaeological steppe culture, which is what it seems like based on the latest analysis of ancient genomes from the steppe (see page 123 here)"

We shouldn't take the preference of Yamnaya over Andronovo for South Asians in Laz 2016 very seriously.

ryukendo kendow said...

@ Matt

Wow Matt that makes incredible sense! And thanks for the reference, I wonder why it isn't bigger news across the gene blogging world.

I was specifically referring to this point in the reasoning:
So not all Eurasians are equidistant to Africans (that would be an amazing coincidence, if not outright impossible)
This is completely an artifact of using many Eurasian column populations to define the variation we want to see in the cloud. If we use, e.g., only West Eurasian columns, then Asians are closer to Africans instead.

The 45 ky of extra drift that modern Eurasians don't share with Ust-Ishim is, in any case, drift away from Africans, so maybe not so weird after all.
This is not true: all populations in Eurasia, even old ones like Ust Ishim, have the same outgroup D stat magnitude with Africans, because they split off from Africans at the same time; no population drifts away from another population. Maybe in variation space their position changes to become 'further away from Africans', but that is a product of methods like PCA of D stats, where they are moving towards the corners of the cloud, or subtracting euclidean distances using distances from column populations, where their shared drift with modern Eurasians of course makes them further away in euclidean distances than the older populations are.

Samuel Andrews said...


Read my blog for anything about mtDNA.

The ancient Moroccan mtDNA is from an old study, so the results might not be legitimate; they have no U5, U4 or U2. Most are undefined R, two are R0a, two are HV0, and two might be U6d3.

ryukendo kendow said...

@ Samuel

Thanks, I already do and really appreciate your efforts.

Davidski said...

We shouldn't take the preference of Yamnaya over Andronovo for South Asians in Laz 2016 very seriously.

We shouldn't accept it just yet as fact, because pop genetics methods constantly improve and thus so do inferences, but we should certainly take it as a serious possibility.

Shaikorth said...


If we assume that drift is added linearly and things with Natufians and other Eurasians are like in your example (no SSA to Natufians and no Papuan to everyone but them), the Mbuti/Papuan ratio should approach 1 but never reach or exceed it for non-Natufians (it does) and we'd have to assume Natufians are less drifted because their ratio is lowest. The latter is not the case as we can see their absolute fst-distances are high compared to other populations. This may be artificial drift caused by low coverage but the point stands.

If we reduce an arbitrary number from Natufian and BedouinA's (or Yoruba's) distances to Papuan and Mbuti, the ratio stays below 1 but if we do the same to Iraqi Jews or Iran_N or Irish it stays above 1.
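
The arithmetic behind that last point can be sketched in a few lines. This is only a toy check with hypothetical fst values (not the real Natufian or Iran_N numbers), and `shifted_ratio` is a made-up helper name: removing the same amount from both distances never moves the Mbuti:Papuan ratio across 1.

```python
def shifted_ratio(fst_mbuti, fst_papuan, shift):
    """Mbuti:Papuan fst ratio after removing the same amount from both distances."""
    if shift >= min(fst_mbuti, fst_papuan):
        raise ValueError("shift must be smaller than both distances")
    return (fst_mbuti - shift) / (fst_papuan - shift)

# Hypothetical fst pairs (to Mbuti, to Papuan), chosen only so that
# one starts below a ratio of 1 and the other starts above it.
below = (0.20, 0.22)
above = (0.24, 0.22)

shifts = [0.0, 0.05, 0.10, 0.15]
below_ratios = [shifted_ratio(*below, s) for s in shifts]
above_ratios = [shifted_ratio(*above, s) for s in shifts]

for s, lo, hi in zip(shifts, below_ratios, above_ratios):
    print(f"shift={s:.2f}  starts below 1: {lo:.3f}  starts above 1: {hi:.3f}")
```

The ratios move further from 1 as more is subtracted, but each stays on its own side, which is why an equal amount of extra drift alone can't flip the order.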

ryukendo kendow said...

Matt's mechanism may contribute to the effect, but the most straightforward explanation to me is still to decrease similarity with Eurasians but not decrease similarity with Africans.

Matt said...

@ Ryu: Yes, that's pretty much what I'm trying to say: in comparing the ratios of fsts, both the effect of private drift to a branch making outgroups more relatively equal, and phylogenetic similarities, are present.

The phylogenetic effect dominates I think. (And I think the cause of the phylogenetic pattern probably is a "Basal Eurasian" style split along the lines of RK's speculation IIUC?).

I would just tend to use something like Mbuti-Papuan though, as the common effect of drift adding to both fst to Mbuti and Papuan would cancel out better, and the rank order makes more sense for ancient and also modern populations. But comparing fst with different outgroups is not invalid to get a sense of phylogeny. That's not what I am trying to say.

IMG of comparing Mbuti-Papuan fst to the Mbuti:Papuan fst ratio:

(WHG is below (more Mbuti) compared to LNBA Europe and steppe populations by ratio method, not by subtractive method. I think this is the effect of high drift scaling in the ratio method.).
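
A toy illustration of the subtraction-versus-ratio point, assuming (as a simplification) that private drift adds the same amount to a population's fst with both outgroups. The base values and the `compare` helper are hypothetical, not taken from the real fst table.

```python
def compare(fst_mbuti, fst_papuan, extra_drift):
    """Return (Mbuti - Papuan difference, Mbuti:Papuan ratio) after adding
    the same private drift to both fst distances."""
    m = fst_mbuti + extra_drift
    p = fst_papuan + extra_drift
    return m - p, m / p

# Hypothetical starting fsts for one branch of a clade.
base = (0.20, 0.25)
drifts = [0.0, 0.05, 0.20]

results = [compare(*base, d) for d in drifts]
for d, (diff, ratio) in zip(drifts, results):
    print(f"extra drift={d:.2f}  difference={diff:+.3f}  ratio={ratio:.3f}")
```

Under that assumption the difference is unchanged by drift, while the ratio creeps towards 1 as drift grows, which would push a heavily drifted population like WHG to a different rank under the ratio method.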

Arch Hades said...

So where does all the R1a in modern Indo-Iranians come from? Because Allentoft's Yamnaya samples are R1b.

Shaikorth said...

Matt, could you add BedouinA (not drifted, presumed to have SSA), BedouinB (drifted with assumed SSA), South Italian (drifted) and Sicilian (not drifted) to the graph?

Matt said...


( with some recent North-Central Europeans and Sardinians thrown in)

However, the difference in rank order in subtraction vs ratio normed against these ancients would require *very* high FSTs to generate. It seems to me present in Euro HGs, but they have an enormous degree of drift (whereas a pop like South Italian only likely to be highly drifted / isolated relative to recent populations). That said, the recent populations (Sicilian, Lithuanian, Polish, English) do tend to be in the opposite position relative to HGs in terms of ratio relative to subtraction.

Matt said...

Increasingly OT, but: Few neighbour joining dendrograms based on a) raw fst distances to outgroups (top), b) fst distances to outgroups normalised against Mbuti by division (middle), c) fst distances to outgroups normalized against Mbuti by subtraction (bottom):

You can see that some kind of normalisation is required to recreate a dendrogram that reflects what we know about the population relationships (as the top doesn't really work at all, mostly just clustering populations together according to how drifted they are and tendency to have high fsts). I think the subtraction normalised dendrogram / clustering works best to place SHG and WHG logically (obvious limits from using outgroups, as always).

Shaikorth said...

On the graphs, one thing worth noting is Natufians despite obviously having high drift like Euro HG's are also on their opposite side.

huijbregts said...

@ Ryu

I looked into the consequences of a dimension reduction on West-Eurasians. (D-stats3b, without Remedello_BA).
I kept 4 dimensions, which explain 96% of the variance. The higher dimensions are dominated by a single row or column.
A cluster analysis with PAM returned 8 clusters (labels are mine):
- North_African(3)
- Villabruna(6)
- Levant_Neolithic(7)
- Euro_Neolithic(11)
- EHG(12) including steppe
- South_Asia(17)
- Caucasus(26)
- modern European(35) including Bell Beaker etc.
Two pops were oddly classified: GoyetQ111 in South_Asia and Vestonice in modern Europe; maybe they form their own cluster.
Lebanese_Muslim was classified as Levant_Neolithic and Lebanese_Christian was classified as Caucasus.
I think this may be close to the maximum resolution you can demand in PCA and nMonte.
Of course one can see a lot more detail in a PCA, but the danger is just seeing ones pet theory.
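
The clustering step can be sketched roughly as follows. This is a toy k-medoids routine in the spirit of PAM (naive initialisation plus the swap phase), run on made-up 2-D points rather than the reduced D-stats datasheet; the function names and data are hypothetical and this is not the code used for the 8 clusters above.

```python
def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)

def pam(points, k):
    """Toy k-medoids: start from the first k points, then keep swapping a
    medoid for a non-medoid whenever that lowers the total cost."""
    medoids = list(range(k))
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Label each point with the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: euclidean(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels

# Made-up coordinates standing in for populations in a reduced space.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_medoids, toy_labels = pam(toy_points, k=2)
print(toy_medoids, toy_labels)
```

Because every medoid is an actual row of the data, each cluster can be named after a real population, which is handy when putting labels on clusters by hand.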

Gill said...

If we recover DNA from Neolithic Balochistan/Indus Valley and if it's even more ANE shifted than Iran_N, then an Iran_Chl-like population will be required to explain the final result. It's not unlikely a Steppe population could have picked up such admixture on its way in (via Iran or South Central Asia) during the Bronze Age.

Olympus Mons said...

OT: Loved the map in this post. It's the best hypsometric I have seen. It shows why the populations moved/settled in the Caucasus.

Michael Ellsworth said...

Again returning to David's initial point of the unusually Yamnaya-like character of ANI, should we not expect the Indo-Aryans to have picked up some more strictly Yamnaya-like ancestry in their migration to India? Most models have the Indo-Aryans running ahead of the Iranians in the move south, and as they did so, they ran into Afanasevo people. From the samples so far, Afanasevo is basically indistinguishable from Yamnaya. So what happens if we imagine that the Indo-Aryans were admixed Corded Ware, late Western Afanasevo, and Iranian Neolithic? I understand that Corded Ware and Yamnaya are close, so it might be hard to untangle an admixture between them, but it seems to me like extra Yamnayishness is what we expect.

Davidski said...


You might be right. It's useful to note that one of the outlier Srubnaya samples was classified by Lazaridis et al. as Steppe_EMBA, because it was more similar to Yamnaya than to Andronovo, Sintashta and Srubnaya, in other words the Steppe_MLBA samples.

Considering the patchy sampling of the Bronze Age steppe to date, future sampling might uncover the persistence of whole Yamnaya-like populations in Andronovo and Srubnaya territories.

However, if more Catacomb samples are sequenced, and it turns out that they're excellent proxies for steppe admixture in South Asians, like Ulan IV RISE552 clearly is, and some of them belong to basal clades of R1a-Z93, then that would make things very interesting.