
Friday, July 22, 2016

The Basal-rich K7

Update 25/01/2018: The Basal-rich K7 is now available to personal genomics customers for $6 USD a pop (see here).


I've got a new test. Currently I'm only using it to explore ancient genomes, but at some point I'll make another version available to personal genomics customers, one way or another. However, that might take a bit of work and time, to mitigate the calculator effect and so on.

Below is a spreadsheet featuring a wide range of ancient and present-day samples from recent papers. A table with the Fst genetic distances between the seven ancestral populations is available here.

Please note that the Basal-rich component is unlikely to be a perfect representation of the hypothetical Basal Eurasian population. At the same time, it's likely that the two hunter-gatherer components, Ancient North Eurasian (or AG3-related) and Villabruna-related, contain some Basal Eurasian admixture.

Here's a Principal Component Analysis (PCA) of the West Eurasian populations based on their K7 ancestry proportions. It captures all of the main features of West Eurasian genetic diversity, including the two parallel clines made up of Europeans and Near Easterners, and the intermediate position of South Central Asians between the ancient samples from Neolithic Iran and Bronze Age Europe.

An extra large version of the same PCA, with the samples labeled individually, can be downloaded here.

Also, using the K7 ancestry proportions, I modeled the ancient ancestry of a few present-day populations from the Near East, Northern Europe and South Central Asia with the nMonte R script. Bronze Age steppe admixture in groups from the latter two regions is usually inferred at 40-50% with tools based on formal stats, such as qpAdm and TreeMix, so I wanted to check if I could reproduce such results.

Iran_Chalcolithic 42.05
Iran_IA:F38 31.35
Iran_Neolithic 14.95
Andronovo_Kytmanovo 7.6
Yamnaya-Catacomb_Ulan 1.95
Han 1.35
Andamanese_Onge 0.75
Papuan 0

distance%=0.285 / distance=0.00285

Iran_Chalcolithic 40.95
Iran_IA:F38 38.25
Yamnaya-Catacomb_Ulan 11.25
Iran_Neolithic 6.45
Andronovo_Kytmanovo 2.8
Andamanese_Onge 0.15
Han 0.15
Papuan 0

distance%=0.2979 / distance=0.002979

Iran_IA:F38 96.3
Andronovo_Kytmanovo 2.35
Han 0.6
Andamanese_Onge 0.55
Papuan 0.2
Iran_Chalcolithic 0
Iran_Neolithic 0
Yamnaya-Catacomb_Ulan 0

distance%=1.3845 / distance=0.013845

Yamnaya_Peshany 44
Loschbour 30.65
Sweden_MN:Gokhem4 25.35
Barcin_Neolithic 0
Ulchi 0

distance%=1.2569 / distance=0.012569

Sweden_MN:Gokhem4 39.4
Yamnaya_Peshany 39
Loschbour 21.6
Barcin_Neolithic 0
Ulchi 0

distance%=0.5208 / distance=0.005208

Sweden_MN:Gokhem4 42.15
Yamnaya_Peshany 36.55
Loschbour 21.3
Barcin_Neolithic 0
Ulchi 0

distance%=0.939 / distance=0.00939

Iran_Neolithic 54.05
Yamnaya-Catacomb_Ulan 25.5
Andronovo_Kytmanovo 9.35
Han 7.6
Andamanese_Onge 1.95
Papuan 1.55

distance%=0.4592 / distance=0.004592

Iran_Neolithic 54.8
Andronovo_Kytmanovo 31
Han 7.5
Yamnaya-Catacomb_Ulan 3.5
Andamanese_Onge 1.75
Papuan 1.45

distance%=0.4921 / distance=0.004921

Iran_Neolithic 55.25
Yamnaya-Catacomb_Ulan 19.1
Andronovo_Kytmanovo 12.3
Han 8.45
Andamanese_Onge 3.4
Papuan 1.5

distance%=0.4848 / distance=0.004848

Iran_Neolithic 41.7
Andronovo_Kytmanovo 30.65
Yamnaya-Catacomb_Ulan 19.15
Han 6.95
Andamanese_Onge 1.05
Papuan 0.5

distance%=0.5318 / distance=0.005318

Admittedly, these estimates look very conservative, but certainly not out of the ballpark. I suspect that I'll be able to improve the models and statistical fits as new Bronze Age steppe samples become available. Indeed, I'll be updating the spreadsheet above regularly.
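For anyone unfamiliar with this kind of modeling, here's a minimal Python sketch of the general idea behind the fits above: find non-negative weights, summing to one, for a set of reference populations so that the weighted average of their K7 proportions lands as close as possible, in Euclidean distance, to the target. This is not the nMonte R script itself. The function names and the simple step-halving search are assumptions made purely for illustration. The reference rows are the K7 results for Anatolia_Neolithic I0709, Yamnaya_Kalmykia RISE552 and Mota posted in the comments below, and the target is a made-up 60/40 blend, not a real sample.

```python
import math

# K7 order: AG3-MA1, Andamanese, Basal-rich, Oceanian,
# Southeast_Asian, Sub-Saharan, Villabruna (values in percent).
ANATOLIA_N = [0.0, 0.0, 45.44, 0.06, 0.0, 0.12, 54.37]
YAMNAYA_KALMYKIA = [56.24, 0.19, 10.58, 0.03, 0.02, 0.81, 32.14]
MOTA = [0.66, 2.64, 34.51, 2.58, 5.59, 53.63, 0.39]


def mix(weights, sources):
    """Weighted average of the source ancestry vectors."""
    k = len(sources[0])
    return [sum(w * s[c] for w, s in zip(weights, sources)) for c in range(k)]


def distance(a, b):
    """Euclidean distance between two ancestry vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_mixture(target, sources, step=0.25, min_step=1e-4):
    """Shift weight between pairs of sources while the fit improves.

    Illustrative search only (an assumption, not nMonte's procedure):
    weights stay non-negative and always sum to one.
    """
    n = len(sources)
    weights = [1.0 / n] * n
    best = distance(mix(weights, sources), target)
    while step >= min_step:
        improved = False
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                delta = min(step, weights[i])  # never push a weight below zero
                if delta <= 0:
                    continue
                trial = list(weights)
                trial[i] -= delta
                trial[j] += delta
                d = distance(mix(trial, sources), target)
                if d < best - 1e-12:
                    weights, best, improved = trial, d, True
        if not improved:
            step /= 2  # refine the search once no move helps
    return weights, best


# Hypothetical target: an exact 60/40 blend of two of the references.
TARGET = [0.6 * a + 0.4 * y for a, y in zip(ANATOLIA_N, YAMNAYA_KALMYKIA)]
weights, dist = fit_mixture(TARGET, [ANATOLIA_N, YAMNAYA_KALMYKIA, MOTA])
for name, w in zip(["Anatolia_Neolithic", "Yamnaya_Kalmykia", "Mota"], weights):
    print(name, round(100 * w, 2))
print("distance (percentage points):", round(dist, 4))
```

Because the inputs here are in percent, the distance comes out in percentage points rather than on the 0-1 scale used in the outputs above. With real targets the fit won't be exact, and the choice of reference populations matters at least as much as the search method.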


astenb said...

Run Mota please?

Helgenes50 said...

That's good news!

Chad said...

Looks good! I think your Basal rich only has about 20% WHG like ancestry in it.

Anonymous said...

If we assume basal is supposed to be equally related to East and West Eurasians, then these basal figures are too high.

A stat in the form
Mbuti Test Kostenki14 Ust_Ishim
Should be 0 in any Africans share ancestry with Eurasians beyond Mbuti (Yoruba and such), Basal Eurasian, and any miscellaneous Crown Eurasian branches that may exist.

Now if we take the genome that is most related to K14 (V16) we have a stat of -.914

Theoretically, at most 41.1% of Bedouin ancestry should consist of basal, ENA, African and misc crown Eurasian. I say at most because any West Eurasian that split pre V16-K14 will dampen the statistic, and ENA will slightly reverse it (as ENA are closer to Ust than K14).

Matt said...

Assuming those stats, roughly, Kalash would model as 50% Iran_Neolithic, 40% Yamnaya_Kalmykia, 10% ENA (providing the balance of Andaman, Oceanian, SE Asian). Seems OK.

Equally, if you modeled Yamnaya_Kalmykia as 20% Iran_Neolithic, the remaining EHG balance of its ancestry would be 60% AG3-MA1, 40% Villabruna (seems close to what it would be supposed to be?). If you modeled Yamnaya_Kalmykia as 23% Anatolia_Neolithic, the remaining HG would be 73% AG3-MA1, 25% Villabruna.

The Fst stats should be important to see if the Basal-rich meets the criteria of Basal Eurasian.

How do SE Asian populations model in this with the separate Andaman and SE Asian components?

All that said, ultimately not totally sure how well this matches with the estimates from Laz et al 2016 -

which includes BE estimates - Europe_MNChl - 20%, Steppe EMBA -20%, Anatolia_Neolithic - 22%, Iran_N 45%.

I think Chad's argued before, with Shaikorth, for those estimates for Iran_Neolithic particularly being wrong, due to ANE ancestry, and I think he may have had an arguable point (not totally sure I understood it), but if correct it does make that nice correlation with Neanderthal ancestry basically disappear if he's right and BE is overweighted here for all Iranian and Iranian related points! (And that's quite a serious point for Laz 2016, and a strong justification they used to continue with the BE concept).

Chad said...

ANE, to a point. They covered that in a section and had Iran and Natufians equal in BE. Onge or ENA ancestry is 3x a driver in false BE as ANE. I'm pretty sure David's component here is 80% BE and 20% WHG-like.

a said...

Not to make things too complex, is it possible to show the relationship with Neanderthal/Denisovan averages?
Table S21 gives some averages, but not all.
Maybe you could correspond with Broushaki et al and get Neanderthal/Denisovan averages for your components.

Ryan said...

Interesting that Afanasievo has so little Basal and so much Villabruna. Are you warming up to the idea of R1b following a more circuitous route to the Steppe and Europe?

Matt said...

@ Chad, do you have a page ref? The section "Supplementary Information 4" is quite extensive and includes a number of different estimates, so I'm finding it hard to identify the specific bit at a glance.

(I'll give a few examples of the different estimators I can see that they use for populations, examples Anatolia_N, Euro_MNCHL, Iran_N, Steppe_EMBA, Natufian, CHG:

p24 Table S4.4 f4 ratio(Test, WHG; Ust_Ishim,Kostenki14)/f4(Mbuti, WHG; Ust_Ishim, Kostenki14) -
Anatolia_N: 0.344, Euro_MNCHL: 0.293, Iran_N: 0.591, Steppe_EMBA: 0.367, Natufian: 0.460, CHG: 0.505

Anatolia_N: 0.363, Euro_MNCHL: 0.310, Iran_N: 0.591, Steppe_EMBA:0.402, Natufian: 0.438, CHG: 0.497

p28 Table S4.5 ADMIXTUREGRAPH (different topology) -
Anatolia_N: 0.3629, Euro_MNCHL: 0.3098, Iran_N: 0.5784, Steppe_EMBA: 0.4015, Natufian: 0.4378, CHG: 0.4965

p38 Table S4.8: Estimates of Basal Eurasian admixture with qpAdm methodology, with WHG+EHG streams of ancestry -
Anatolia_N: 0.305, Euro_MNCHL: 0.252, Iran_N: 0.561, Steppe_EMBA: 0.239, Natufian: 0.384, CHG: 0.415)

Davidski said...


Please take a look at what I said in my post.

I don't think it's possible to find a pure Basal Eurasian population with methods like ADMIXTURE, STRUCTURE and the like. I've tried both, as well as other programs, and the most basal-rich ghost pop I can find still seems to have at least 10-20% Villabruna and even Onge-like stuff.

Anyway, like I said, I'll update this post later today with more results and the Fst table.

Chad said...

Table S4.9.

Check that too. I think that would make David's component about 50% BE and 50% WHG. That might be why the BE is getting thrown in ANE, with all the Iran linked pops in the test. It's a long process of adding and subtracting samples to get the proportions in Admixture to match things like qpAdm.

Davidski said...

Here's Mota. Not a bad result, considering that this test is absolutely not designed to characterize genetic substructures within Africa.

AG3-MA1 0.66
Andamanese 2.64
Basal-rich 34.51
Oceanian 2.58
Southeast_Asian 5.59
Sub-Saharan 53.63
Villabruna 0.39

Seinundzeit said...

This actually looks incredibly good, probably the best we can get from ADMIXTURE. It's also in agreement with the nMonte experiments that were tried by a few of us.

Looking forward to seeing the spreadsheet.

Gioiello said...

@ Davidski

Afanasievo RISE511
AG3-MA1 53.8
Andamanese 0.24
Basal-rich 7.54
Oceanian 0.31
Southeast_Asian 1.26
Sub-Saharan 0.8
Villabruna 36.06

Anatolia_Neolithic I0709
AG3-MA1 0
Andamanese 0
Basal-rich 45.44
Oceanian 0.06
Southeast_Asian 0
Sub-Saharan 0.12
Villabruna 54.37

Yamnaya_Kalmykia RISE552
AG3-MA1 56.24
Andamanese 0.19
Basal-rich 10.58
Oceanian 0.03
Southeast_Asian 0.02
Sub-Saharan 0.81
Villabruna 32.14

But it is a masterpiece! You have finally demonstrated that migrants from Anatolia of 7000 and Indo-Europeans from the steppes of 5000 are the ancestors of Villabruna of 14000. Finally!

Davidski said...


I suspect that a hybrid Basal/Villabruna-related population very similar to the Natufians was a major feature of the Near East for a very long time, maybe longer than anything purely Basal Eurasian. Btw, the Natufians score around 70% of the Basal-rich component in this test, but essentially no Sub-Saharan admixture.

At the same time, it's also possible that both Villabruna and Samara HG have some Basal Eurasian admixture, so Basal Eurasian ancestry, albeit at low levels, may already have been a feature of many pre-farming populations across northern Eurasia.

Davidski said...

I ran many tests, usually at low K, looking for specific signals, like ANE and WHG. Whenever a nice cluster formed that reflected what I've seen elsewhere, I made synthetic samples out of it, and used them to trigger the same cluster in the following runs.

After doing so many analyses, I don't see a way to extract a pure Basal Eurasian population, although I'm now also sure that such a population, or populations, did really exist.

They seem to have made hybrid composites in the Near East with Villabruna-like groups in the Levant and MA1-like groups in Iran very early, probably before the Ice Age.

I'll be using this test from now on to explore a lot of things because it does reflect really well the genetic patterns that are found across Eurasia. We can all argue until the cows come home about the ancestry proportions. They'll never be perfect in the context of all analyses, most of which are probably skewed to some extent anyway.

I'm running more samples now. I'll make an update soon with more info.

Davidski said...

All of the results above were done in the same way, so the ancestry proportions reflect the same components.

This Afanasievo individual is easily the most northerly shifted in my PCA, so the low Basal-rich result more or less makes sense. Note also that the noise due to deamination here is taken care of by the very low Sub-Saharan proportion. This probably otherwise shows up as increased southern ancestry in other analyses that don't have such an outlet for erroneous basal genotype calls.

Also, I think the Villabruna cluster is indeed part Basal Eurasian. This seems to show up in the Fst and resulting PCA. So this is where some Basal Eurasian will be hiding for many samples.

But all of the Bronze Age steppe samples, especially the early ones, show very low levels of the Basal-rich component, which is something I've noticed before, and that's why I was always stumped by models that showed them to be ~50% Armenian. That never made much sense to me.

Btw, the PCA above is very similar to one that I did with an AG3-MA1 composite, Villabruna, a Levant_Neolithic sample, an Onge, a Papuan, and a few African individuals. The Levant_Neolithic sample was shifted in the same way towards the Africans as the Basal-rich component.

Matt said...

@ Chad, thanks. OK, so page 40 Table S4.9: Estimates of Basal Eurasian admixture with qpAdm methodology, with WHG+EHG streams of ancestry, dropping an outgroup (same as Table S4.8, taking out particular outgroups they used there):

Dropping Ust Ishim:

Anatolia_N: 0.298, Euro_MNCHL: 0.252, Iran_N: 0.577, Steppe_EMBA: 0.235, Natufian: 0.382, CHG: 0.417

Dropping Kostenki14 doesn't produce any sensible values for α

Dropping MA1:

Anatolia_N: 0.253, Euro_MNCHL: 0.210, Iran_N: 0.460, Steppe_EMBA: 0.216, Natufian: 0.448, CHG: 0.347

Dropping Han:

Anatolia_N: 0.309, Euro_MNCHL: 0.260, Iran_N: 0.559, Steppe_EMBA: 0.238, Natufian: 0.385, CHG: 0.414

Dropping Papuan:

Anatolia_N: 0.333, Euro_MNCHL: 0.268, Iran_N: 0.571, Steppe_EMBA: 0.254, Natufian: 0.368, CHG: 0.456

Dropping Onge:

Anatolia_N: 0.309, Euro_MNCHL: 0.261, Iran_N: 0.568, Steppe_EMBA: 0.234, Natufian: 0.393

So you do get a convergent estimate for Natufian and Iran_N under the MA1 dropped model here *only*....

But at the same time, in all of these models you get a convergent Basal level between Steppe_EMBA (Yamnaya / Afanasievo) and Euro_MNCHL, and also Anatolia_N is still around half the BE level of Iran_N.

Assume the Basal Rich here could be around 20% WHG, so if so we'd get:

Anatolia_Neolithic: "True Basal" - 36.3, Sub-Saharan - 0.12, Villabruna - 54.37, "Other WHG" - 9.08

Afanasievo: "True Basal" - 6.03, AG3-MA1 - 53.8, Andamanese - 0.24, Oceanian - 0.31, Southeast Asian 1.26, Sub-Saharan 0.8, Villabruna 36.06, "Other WHG" - 1.50

Iran_Neolithic: "True Basal" - 41.76, AG3-MA1 - 45.3, Andamanese - 0.88, Oceanian - 0.9, Southeast_Asian - 0.12, Sub-Saharan - 0.5, Villabruna - 0.09, "Other WHG" - 10.44

Assume 50% WHG in Basal Rich:

Afanasievo: AG3-MA1 - 53.8 Andamanese - 0.24 “True Basal” - 3.77 Oceanian - 0.31 Southeast_Asian - 1.26 Sub-Saharan - 0.8 Villabruna - 36.06 “Other WHG” - 3.77

Anatolia_Neolithic: AG3-MA1 - 0 Andamanese - 0 “True Basal” - 22.72 Oceanian - 0.06 Southeast_Asian - 0 Sub-Saharan - 0.12 Villabruna - 54.37 “Other WHG” - 22.72

Iran_Neolithic :AG3-MA1 - 45.3 Andamanese - 0.88 “True Basal” - 26.1 Oceanian - 0.9 Southeast_Asian - 0.12 Sub-Saharan - 0.5 Villabruna - 0.09 “Other WHG” - 26.1
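The two what-if scenarios above boil down to a simple split of the Basal-rich score, which can be sketched in Python. The function name `split_component` and the dictionary layout are made up for illustration. The Anatolia_Neolithic and Afanasievo scores are the ones posted earlier in the thread, while the Iran_Neolithic score of 52.2 is only implied by the figures above (41.76 + 10.44) and is treated here as an assumption.

```python
def split_component(component_pct, hidden_share):
    """Split a component score into a 'true' part and a hidden part.

    hidden_share is the assumed fraction of the component that is really
    something else (here, WHG-like ancestry inside Basal-rich).
    """
    hidden = component_pct * hidden_share
    return component_pct - hidden, hidden


# Basal-rich scores in percent. Iran_Neolithic is inferred, not posted directly.
BASAL_RICH = {
    "Anatolia_Neolithic": 45.44,
    "Afanasievo": 7.54,
    "Iran_Neolithic": 52.2,
}

for whg_share in (0.2, 0.5):
    print(f"Assuming {whg_share:.0%} WHG in Basal-rich:")
    for pop, pct in BASAL_RICH.items():
        true_basal, other_whg = split_component(pct, whg_share)
        print(f"  {pop}: True Basal {true_basal:.2f}, Other WHG {other_whg:.2f}")
```

Note that this applies one fixed share to every population, which is exactly the assumption being questioned in this thread, since ADMIXTURE components need not mean the same thing in every sample.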

Doesn't seem like it would have that value that Laz find in all their models, of similar level of BE between EuroMNCHL and Steppe_EMBA, while Iran_N and Anatolia_N are much closer in levels than in the formal models (where Anatolia_N is pretty close to Steppe_EMBA).

Ryu: Matt, I suspect the Basal Component may represent different things in different populations, i.e. there is 'slipping and sliding' going on, so the very low percentage in Afanasievo may not be that representative.

Yes, probably is; at the same time, I'm not sure that the Afanasievo is the only one inconsistent compared to the formal Laz models, if you're saying anything like that, unless I'm reading you wrong.

I think you'd probably need something like 0% Basal in the Villabruna cluster, 40% WHG in Basal Rich, and something like 40% Basal in AG3-MA1 before you got any close estimates, even to the qpAdm model that drops MA1 as an outgroup.

Chad said...

I'm thinking BE was shoved into the ANE component. Admixture can be funky that way. I've got a calculator in progress with separate Natufian, Iranian, and ANE components. As soon as it fits close to qpAdm, I'll see about getting it out.

Chad said...

You would also think that Anatolia would be 15-20% more BE than MN ChL, with all the extra WHG. Anatolia should also be equal to CHG, with the equal Ust-Ishim stats.

Matt said...

@ Chad, thanks.

Note though, 20% more WHG in EuroMNCHL wouldn't mean that they have 20% less Basal Eurasian, but that they had 80% as much Basal as Anatolia_Neolithic.

So if we go with S4.9 without MA1 model, and Anatolia_N has 0.253 Basal, then expected value for EuroMNCHL would be 0.2024 (0.8 * 0.253), which is close to their model's actual 0.210.

Or take the model without Papuan: Anatolia_N 0.333, expected EuroMNCHL 0.266, actual EuroMNCHL 0.268

It wouldn't be 20% less Basal Eurasian unless Anatolia_N was 100% Basal Eurasian. It'd be 10% less Basal Eurasian if Anatolia_N was 50% Basal Eurasian (that's not close to what these models find though).

(Similarly, if their models find CHG is around 0.417 Basal, and Steppe_EMBA is theoretically around 0.5 CHG, then Steppe_EMBA would be predicted to be around 0.208 Basal).
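The dilution arithmetic in this comment can be written out as a short Python sketch. The function names are hypothetical, and the inputs are the Table S4.9 values quoted above. The point is that diluting a source population scales its Basal fraction proportionally rather than subtracting a fixed number of percentage points.

```python
def expected_basal(source_basal, source_share):
    """Basal fraction left after a source is diluted by Basal-free ancestry.

    source_basal: Basal Eurasian fraction in the source population.
    source_share: fraction of the mixed population derived from that source.
    """
    return source_basal * source_share


def basal_drop(source_basal, source_share):
    """Absolute drop in Basal fraction caused by the dilution."""
    return source_basal - expected_basal(source_basal, source_share)


# EuroMNCHL modeled as 80% Anatolia_N plus 20% extra WHG.
print(expected_basal(0.253, 0.8))  # model without MA1 as an outgroup
print(expected_basal(0.333, 0.8))  # model without Papuan as an outgroup

# Steppe_EMBA modeled as roughly half CHG.
print(expected_basal(0.417, 0.5))

# The absolute drop depends on how Basal the source is to begin with.
for source in (1.0, 0.5, 0.253):
    print(source, round(basal_drop(source, 0.8), 4))
```

Only a source that is 100% Basal loses a full 20 points when it is diluted by 20%; a source at 0.253 loses about 5.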

Rob said...

@ RK

Cool. Hopefully we'll get some copper age genomes from northeast Balkans this year to pan out the details

Davidski said...


Admixture doesn't work in that way. In some samples the Basal-rich might be a lot more basal than in others.

Btw, I don't think the AG3-MA1 cluster is influenced by Basal. The Iran_N ancestry proportions seem about right, with 5-10% of something non-Basal in the Basal-rich cluster. The rest is AG3-MA1 related, both near and far.

Also, there's an issue with the Basal estimates for the Iranian samples in Laz et al., caused by an eastern component specific to these samples. It's discussed in the paper, can't remember which page though.

I suspect that the steppe samples might also be affected by the same or similar phenomenon.

Matt said...

@ Davidski:

Yeah, there would still be variances within clusters, though

a) to get anything like their models with the same relative proportions of BE in Steppe_EMBA as Anatolia_N, around 2/3 as much BE in Steppe_EMBA as Anatolia_N, you'd need Basal Rich to be 100% Basal in Afanasievo (7.54%), and then only 25% Basal in Anatolia_Neolithic (to produce around 11% Basal and keep the two populations in ratio).

Or taken more literally - their estimates all show Steppe_EMBA should have around 20% BE - all of the estimates they've included are impossible without BE in another component that is present in Steppe_EMBA to a large degree.

b) if Basal Rich does vary very widely in Basal Eurasian depending on which population it is placed on, it's not a very useful index for how much BE a population actually has?

Graphically comparing the estimates that crossover between theirs and the BEuK7 -

"Also, there's an issue with the Basal estimates for the Iranian samples in Laz et al., caused by an eastern component specific to these samples. It's discussed in the paper, can't remember which page though."

Sorry, I can't find this, only the note on page 39

Three populations (Kostenki14, MA1, and Han), when dropped as outgroups, result in the quadruple (Test, WHG, EHG, Mota) being consistent with 3 streams of ancestry for all (or nearly all in the case of Han) Test populations. Removing Kostenki14 results in a blowup of the standard errors suggesting that it carries important phylogenetic information that is not present in the other outgroups. Removal of MA1 and Han suggests interactions between West Eurasia and Upper Paleolithic Siberia and East Asia which we explore in Supplementary Information, section 11. For our purpose of estimating Basal Eurasian ancestry, however (and unlike with Kostenki14), removing MA1 and Han from the set of outgroups does not result in a blowup of standard errors which remain modest (less than 10% for most Test populations). In Fig. 2 we plot graphically the Basal Eurasian estimate results when removing MA1, which results in successful modeling of all Test populations, and is thus the main estimate we use in the study. (note these are the same I used in the above graph).

If there are reasons the Iranian estimates are not valid, then OK, though that does mean that their association of Basal Eurasian with 0 Neanderthal ancestry sort of ceases to exist, because it is driven by them. That's a pretty big deal for their paper.

Davidski said...


Not home right now, so I can't point you to the specific page. But it's in the supp info, and basically what it says is that the Basal estimates for the Iranian samples might be inflated and not higher than in the Natufians.

Also, the ranges for Basal in the paper, in a table in the supp info, are pretty big, and from memory go down to just 2% for Steppe_EMBA.

So based on what I've seen, plus the uncertainty in modeling Basal ancestry proportions, I don't think it's a done deal that the Early Bronze Age steppe samples have around 20% of Basal. In fact, 20% seems a lot, and I'd say unlikely.

But I haven't tested many samples yet, just a couple of individuals from the steppe, and like I say, variable levels of Basal are probably hiding in the Villabruna cluster for many samples.

Matt said...

Re: error bars and ranges, yeah you're right that those are there, but I did sort of discount them a bit since they exist for all the populations - Table S4.6 - feasible ranges: Steppe_EMBA gets min 2, max 52, but then Anatolia_N gets min 3, max 49.

I did just notice there is one set of models in their paper that does have a slightly better fit with these BEuK7 values: Table S7.26 - table S7.26 - correlation between S7.26 values

What this is the result of is, rather than modeling BE directly into all their populations, they model other populations as admixed between Natufian, Iran_N, EHG and WHG with qpAdm and qpWave, then feed in model BE values for those four from Section 4. - correlation of the S7.26 mixture based values with BEuK7.

So there is that at least. But they've kind of gone with the direct estimate in the Neanderthal correlation, so I don't know which they have more confidence in, and I would assume the direct estimate. To accept the Table S7.26 estimate would also mean accepting that the Steppe and Anatolia Neolithic are really modeled by admixtures with Iran_N, which is a bit dubious from other perspectives (e.g. PCA, etc.)...

Davidski said...

Most of all, Basal Eurasian levels should show an inverse correlation with the levels of north Eurasian forager ancestry, although in the usual West Eurasian PCA samples with inflated levels of ANE, ENA and probably other eastern components are pulled south. So in PCA at least, there's a pseudo-Basal effect for the steppe samples.

Actually, if you look at the graph in the main Laz et al. PDF that compares Basal Eurasian levels with Neanderthal admixture, Anatolia_ChL has about half the Basal of Steppe_EMBA, even though it looks more Near Eastern in all other comparisons. And Iran_Hotu has more Basal than Iran_N, even though it has more EHG affinity.

So WTF is going on there?

I'll run Anatolia_ChL as soon as I get the chance. But you can already see in the above test results that Iran_Hotu looks less Basal than Iran_N, which agrees with everything else, but not the Basal/Neanderthal graph in Laz et al.

Davidski said...

Tomorrow I'll see what happens when I tweak the dataset by removing some of the steppe reference samples. If this increases the Basal in Afanasievo and the whole test still works, then that will be interesting.

If nothing happens, or the whole analysis collapses, that'll be interesting too.

Matt said...

Those two, Iran_Hotu and Anatolia_CHL, are the weirdest outliers, though can some of that be laid at the feet of them both being sample size = 1 for their populations, and maybe not so many SNPs (also Steppe IA), while the larger populations and higher coverage samples can be a bit more robust in the direct model? I'd expect Iran_Hotu to be a lot more like Iran_N, ultimately. Maybe even having 2 or 3 samples makes it much more robust.

Still, Iran_Hotu having more BEu could be consistent with, f4(Iran_N, Iran_HotuIIIb; EHG, Mbuti) = -0.00199 (Z=-2.4), if Iran_Hotu had less of some other kind of non-Basal ancestry that was not closely related to EHG. What they're doing in the Table S4.9 those estimates from the Neanderthal graph comes from is qpAdm with the outgroups (Ust_Ishim, Kostenki14, MA1, Han, Onge, Papuan), when the pright are Mota*, EHG, WHG, and dropping any individual outgroup of them (except Kostenki14, which when dropped gives no meaningful results) doesn't change Hotu's relative position in the BEu ranking.

It might be interesting to run the whole set of D(Iran_N, Iran_Hotu; X, Mbuti) where X is Villabruna, Kostenki14, Ust_Ishim, MA1, AG3, EHG, Han, Onge, Papuan, Kharia to see what comes out of that, and whether those stats themselves agree that Hotu is less Basal Eurasian (particularly should be closer to Ust_Ishim if less BEu, or at least further away) or if it's just more EHG, and looks like a strange, hard to explain "High EHG, high Basal" thing.

Re: Anatolia_CHL, in the comparisons from Table S4.9 which don't drop MA-1 as an outgroup, Anatolia_CHL does get the more stable 20% Basal Eurasian. Still oddly lower than EuroMNCHL and Steppe_EMBA though.

*Doing a qpAdm which includes Mota and a lack of African outgroups, I guess, allows Mota to emulate the behaviour of Basal Eurasian in their theory.

Davidski said...

I think there's a good reason why Iran_Hotu plots north of the Zagros farmers, and that reason is less Basal Eurasian admix.

Btw, these stats are basically in line with my Admixture run. Seems like Iran_Hotu might have something Central Asian related to ASI. The farmers lack this influence.

Iran_Neolithic Iran_Hotu Villabruna Mbuti.DG -0.0251 -2.966
Iran_Neolithic Iran_Hotu Kostenki14 Mbuti.DG -0.0035 -0.426
Iran_Neolithic Iran_Hotu Ust_Ishim Mbuti.DG -0.0088 -1.223
Iran_Neolithic Iran_Hotu MA1 Mbuti.DG -0.0161 -1.697
Iran_Neolithic Iran_Hotu AfontovaGora3 Mbuti.DG -0.0243 -1.78
Iran_Neolithic Iran_Hotu Karelia_HG Mbuti.DG -0.0261 -3.221
Iran_Neolithic Iran_Hotu Han Mbuti.DG -0.0049 -0.848
Iran_Neolithic Iran_Hotu Andamanese_Onge Mbuti.DG -0.0145 -2.301
Iran_Neolithic Iran_Hotu Papuan Mbuti.DG -0.0096 -1.452
Iran_Neolithic Iran_Hotu Munda Mbuti.DG -0.0038 -0.675
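To make the table above easier to read at a glance, here's a small Python sketch that parses the rows and lists which test populations pass a given |Z| cutoff. A cutoff of 3 is a common convention for calling a D-statistic significant, but it is a convention rather than a rule, and the function names here are hypothetical.

```python
ROWS = """\
Iran_Neolithic Iran_Hotu Villabruna Mbuti.DG -0.0251 -2.966
Iran_Neolithic Iran_Hotu Kostenki14 Mbuti.DG -0.0035 -0.426
Iran_Neolithic Iran_Hotu Ust_Ishim Mbuti.DG -0.0088 -1.223
Iran_Neolithic Iran_Hotu MA1 Mbuti.DG -0.0161 -1.697
Iran_Neolithic Iran_Hotu AfontovaGora3 Mbuti.DG -0.0243 -1.78
Iran_Neolithic Iran_Hotu Karelia_HG Mbuti.DG -0.0261 -3.221
Iran_Neolithic Iran_Hotu Han Mbuti.DG -0.0049 -0.848
Iran_Neolithic Iran_Hotu Andamanese_Onge Mbuti.DG -0.0145 -2.301
Iran_Neolithic Iran_Hotu Papuan Mbuti.DG -0.0096 -1.452
Iran_Neolithic Iran_Hotu Munda Mbuti.DG -0.0038 -0.675
"""


def parse_rows(text):
    """Return (test_population, statistic, z_score) for each row."""
    parsed = []
    for line in text.strip().splitlines():
        pop1, pop2, test_pop, outgroup, stat, z = line.split()
        parsed.append((test_pop, float(stat), float(z)))
    return parsed


def passing(rows, z_cutoff):
    """Test populations whose |Z| meets the cutoff, in table order."""
    return [pop for pop, _, z in rows if abs(z) >= z_cutoff]


rows = parse_rows(ROWS)
print("|Z| >= 3:", passing(rows, 3.0))
print("|Z| >= 2:", passing(rows, 2.0))
print("all negative:", all(stat < 0 for _, stat, _ in rows))
```

With only one Iran_Hotu genome behind these stats, the looser cutoff is best treated as suggestive at most.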

Matt said...

Hotu closer to Ust Ishim, albeit at non-sig, therefore unlikely to have more BE than Iran_N....

Samuel Andrews said...

" now that we Know Yamnaya has EEF ancestry, is now an uninterrupted increase in WHG and EEF ancestry from the Khvalynsk-->Yamnaya-->Andronovo-->Iron Age transitions. Which makes me think that your original idea about interactions between the Steppe and CT/other settled societies in Europe creating cultural dynamism, may have something to it."

Andronovo looks more like a brand new population from Central Europe. The change from Yamnaya to Poltavka_Outlier isn't gradual.

Rob said...


I think people would listen to you slightly more if (a) what you wrote made some sense grammatically , (b) you didn't try to force feed your theory down people's throats

Try presenting the data in sensical English, try adopting spacing and bullet points, and drop the ad hominems.

What you're saying might have some validity, but it's really hard to decipher beyond all the **crazy**

Olympus Mons said...

@ rob
A.You are partly right. Sorry for bad wording but writing on the smartphone is actually a problem.
B. I will delete the comment. However you seem to make two mistakes. First is to think anyone "listens" beyond their pet theory. They dont. Second that I try to convince anyone of anything. Because of one,i don't.
C. All I am trying to do is create a digital record of it. Because if I am right, shortly after everybody will argue as if they were never wrong. When in fact all people do is talk amongst themselves to themselves.

Olympus Mons said...

So. My point to Ryu was actually meant for a lot more people. See how, instead of looking for the source of EEF in Yamnaya just next door, where the EEF actually seems to have originated, to fit the pet theory they "pretend not to see", which is in fact a very interesting neurocognitive pathway fired by the anterior cingulate cortex... But that is a different story.

Rob said...

You'd be surprised actually
Most people have their views but will happily go with the data

Unknown said...


You make some solid points. However, if you're looking to validate your opinions through some meaningful discussion, then I think you're giving this place too much credit. Despite its great potential and Davidski's ability, this blog just keeps on propagating the same western European wannabe bullshit.


You'd be surprised how many shades you can get from primary colors.

Davidski said...


I've managed to push the level of the Basal-rich component in Afanasievo RISE511 to over 11%, while at the same time more or less keeping the rest of the analysis as it is.

Let's see where this takes us.

Anonymous said...


" seem to make two mistakes. First is to think anyone "listens" beyond their pet theory. They dont."

You're projecting.
I myself have abandoned numerous pet theories of my own over the years, some of which I felt very strongly about, and I'm far from alone. I'm sure that many people here have. In fact, the scientific method relies on a willingness to abandon one's pet theories, as new data comes in. That's how it's supposed to work, and hopefully most of us understand that, but you don't seem to. So yeah, speak for yourself.

"Second that I try to convince anyone of anything. Because of one,i don't."

Oh. Is that why you spam every other thread on this blog with demands that we all read your thesis?

Samuel Andrews said...


The gradual admixture scenario is possible of course. Except ancient DNA from Samara shows a sudden appearance of R1a Corded Ware-like people. They lived in the same time period as R1b Yamnaya-like people in the Poltavka culture, as if they were two different populations.

It's similar to Bell Beaker and Corded Ware in Germany. Yes they weren't contemporary but they were clearly two different populations, mostly because of different Y DNA.

"Well that's true. The EEF in Andronovo is significantly more similar to Iberia _Neolithic than the EEF in Corded Ware or Yamnaya is."

I'm not confident there's a way to differentiate Iberia_Chl from German_MN.

Rob said...

IIRC we don't have many samples from the Catacomb period (apart from Ulan IV)
So the sampling goes from "developed" Yamnaya (mostly 3000-2700 BC) to Srubnaya (post 2000 BC)

Olympus Mons said...

That is called setting a digital footprint. It's not really meant to convince anyone. You do know what that is, right, and how it's done?

Secondly... thanks. I wish I was that successful at *spamming* my thesis. Not really. Don't have the time to. Although I wish I did a better job at it.

Lastly. If you abandoned several theses it's because those were not particularly good or well crafted, were they? Mine on the other hand has names, dates, phone numbers, street names (sarc)... Not broadly set names such as steppe or OIT or generally meaningless brush pictures. In the end I will be right or wrong, and believe me, either way I won't lose a second of sleep over it. Now one thing you can be sure of... I wouldn't lose a second trying to convince the likes of you. Waste of time.
Have fun and Over and out for you.

Davidski said...

Digital foot in mouth print, more like it.

You wait a bit, there's a new paper coming on the genetics of Neolithic and Bronze Age Europe with a shit load of new samples.

Rob said...

Yeah lad

MfA said...

@David, can you post F38's Eurogenes K13 results?

Matt said...

For the Basal Eurasian estimates from Lazaridis 2016's SI, section 4 and section 9, I thought I'd run them through a PCA along with some estimates of my own on how much of the non-Basal for each was WHG or ANE (unfortunately not provided!):

Olympus Mons said...

I would not pick a fight with the owner of the blog, right? Actually, I don't see you doing it on other people's blogs either; you like it nice and cosy, right? So... let's just leave it alone.

About those samples. Are you taking those out of your ass as you told that other guy, or is it something real?

Because we already have samples for Europe_N and even Europe_B... will we get samples for the Copper Age in southern Europe or not? 3000 to 2300 BC?

Davidski said...


I ain't talking out of my behind. There's a paper on the way with a lot of samples from all over Europe.


Don't have that genome yet, but should soon.

Olympus Mons said...

More samples... that is great. One day we will have enough of those to build a meaningful picture of pre-history. Those guys extracting all this DNA are really a breath of fresh air...

Matt said...

@ Chad / Davidski:

I was having a few more thoughts about the qpAdm method Lazaridis et al use to estimate BEu (on p37-p40 of the supplement), and whether there are some improvements available from the Fu et al samples that weren't available for Laz to use.

The final estimate they use is outgroups Ust_Ishim, Kostenki14, Han, Onge, Papuan - and then model populations Mota, WHG, EHG where Mota is the substitute for the Basal Ancestry.

Useful samples it seems to me that Fu et al make available would be GoyetQ116-1 and Vestonice16 (as other UP Europeans with low relatedness to particular recent groups, like Kostenki14, and lacking ENA relatedness and Basal Eurasian), AG3 (as a better representative of ANE than MA-1), and Villabruna (as IIRC the least ANE / ENA shifted member of the WHG cluster).

So maybe it would be interesting to run that qpAdm again, with outgroups still as Ust_Ishim, Kostenki14, Han, Onge, Papuan, and then the model populations as Mota, Villabruna, AG3, GoyetQ116-1/Vestonice 16.

And also the same with outgroups as Ust_Ishim, Kostenki14, MA-1, Han, Onge, Papuan.

Test populations would be Anatolia_ChL, Anatolia_N, Armenia_ChL, Armenia_EBA, Armenia_MLBA, CHG, Europe_EN, Europe_LNBA, Europe_MNChL, Iberia_BA, Iran_ChL, Iran_LN, Iran_HotuIIIb, Iran_N, Iran_recent, Levant_BA, Levant_N, Natufian, SHG, Steppe_EMBA, Steppe_Eneolithic, Steppe_IA, Steppe_MLBA, Switzerland_HG.

But you could also include as test any high coverage UP Europeans (e.g. whichever of GoyetQ116-1 / Vestonice16 you aren't using), and test the method that way, as they shouldn't score any BEu.

I think that would help split apart true Basal Eurasian in the Near East (if it exists) from ancestry which diverged early from WHG and EHG but is still West Eurasian, just without that increased ENA affinity that shows up in WHG and EHG. Laz 2016 doesn't seem to think this is a concern, but it seems like it should be a concern if ANE / Villabruna cluster members are contributing to and from some ENA groups.

(Appreciate you're busy, so this is if this is quick to run, or food for thought for the future).
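The proposed runs can be written down as an explicit list of models before anything is executed. The following is a minimal Python sketch, not qpAdm itself: it only enumerates the left/right population lists from the comment above. It assumes the usual convention that the left list starts with the test population followed by the sources, the set names `laz_final` and `with_MA1` are made up for illustration, and only three of the test populations are included to keep it short.

```python
from itertools import product

# Outgroup ("right") sets as quoted in the comment above.
# The dictionary keys are hypothetical labels, not names from the paper.
RIGHT_SETS = {
    "laz_final": ["Ust_Ishim", "Kostenki14", "Han", "Onge", "Papuan"],
    "with_MA1": ["Ust_Ishim", "Kostenki14", "MA-1", "Han", "Onge", "Papuan"],
}

# Either UP European can serve as the fourth source; the other one
# can then be used as a test population to check the method.
UP_EUROPEANS = ["GoyetQ116-1", "Vestonice16"]

# A small subset of the test populations listed above.
TARGETS = ["Anatolia_N", "Iran_N", "Natufian"]


def build_models(targets, right_sets, up_europeans):
    """Return one dict per (target, outgroup set, UP European) combination."""
    models = []
    for target, (right_name, right), up in product(
        targets, right_sets.items(), up_europeans
    ):
        models.append(
            {
                "target": target,
                # Assumed convention: target first, then the sources.
                "left": [target, "Mota", "Villabruna", "AG3", up],
                "right": list(right),
                "right_set": right_name,
            }
        )
    return models


models = build_models(TARGETS, RIGHT_SETS, UP_EUROPEANS)
for m in models[:2]:
    print(m["target"], m["right_set"], m["left"][1:])
```

Each entry would still have to be turned into whatever input files the actual tool expects; the point here is only to make the combinations explicit so none is skipped.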

Chad said...

Yeah, I'll be home in a bit. I think Bichon is the least ANE of the group, without at least 10% UP ancestry. In Admixture, Villabruna scores the most ANE of the 15 I use. I think it may be excess Neandertal keeping him away from ANE a sliver more than Loschbour in Dstats. I've got the following components; SSA, San, WHG, ANE, Natufian, Iran, Onge, and Ami. I may add a UP component based on the Aurignacians too.

Alberto said...


If there are reasons the Iranian estimates are not valid, then OK, though that does mean that their association of Basal Eurasian with 0 Neanderthal ancestry sort of ceases to exist, because it is driven by them. That's a pretty big deal for their paper.

I think this is another problem in the paper, because if Basal Eurasian lacked Neanderthal admixture, wouldn't that make them closer to Africans (and in this case to all Africans, you wouldn't even need to find a specific branch)? (It would probably be better worded as Neanderthal admixture would make non-Basal Eurasians further away from Africans than Basal Eurasians are). I think we can see this effect with Ust-Ishim, Kostenki14 or Vestonice16 (and presumably much more with Oase1, though I haven't seen it explicitly), even if they only have a tad more Neanderthal than WHG/EHG. So why don't we see this effect much more clearly when comparing Iran_N and WHG to Africans?

In general, I think it's difficult to test accurately both of these things (Basal Eurasian having no Neanderthal admixture and Basal Eurasian not being closer to Africans). You probably need better materials and methods. But at least theoretically, it seems to me that you can't argue for both at the same time (unless you do show that Basal Eurasians are indeed closer to Africans, but then argue that this is just the effect of them not having Neanderthal admixture and not them being otherwise more related to Africans).

Matt said...

@ Alberto

Interesting comment and questions. I think that such an effect does show to some degree for the Papuans (Denisovan+Neanderthal), and Oase1, so you could infer that if BEu lacked Neanderthal, you would expect it to share more with Africans than more Neanderthal admixed populations in D stat measures.

Another question that comes to my mind is, given Neanderthal is very divergent, if one Eurasian population did lack any Neanderthal ancestry, while a few other Eurasian populations all had it at around 3%, could this create an effect like the Neanderthalised groups forming a clade together? Even if that was not phylogenetically true. Might be worth the study authors giving some thought to.

Like, say you had WestEurasianA (WEA) and WestEurasianB (WEB), who form a clade, and then a separate Ust_Ishim clade and ENA clade, then WEB, Ust_Ishim and ENA all pick up some Neanderthal ancestry, which by the processes of selection smooths out to the same level, while WEA is unaffected. Would WEB show some relatedness to Ust_Ishim and ENA that WEA lacks?

How much of the sharing between Eurasians is mediated by Neanderthal ancestry? How much could a few unrelated clades show as related just by sharing the same % of Neanderthal ancestry? How much differentiation between the Eurasians remains when Neanderthal alleles are masked out? Do the D-stat relations found still exist when Neanderthal derived variants (and variants derived from those variants) are masked?

That seems like it would need consideration in light of the idea of Near Eastern populations and particularly ancient ones systematically being derived from some pop that lacked Neanderthal ancestry.
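The masking question can be made concrete with a small sketch. It assumes the allele-frequency form of the D-statistic, D(W, X; Y, Z) = Σ(w − x)(y − z) / Σ(w + x − 2wx)(y + z − 2yz), and a per-site flag marking variants believed to be Neanderthal derived. The toy frequencies and the flags are invented for illustration, and sign conventions differ between tools.

```python
def d_statistic(freqs, mask=None):
    """D(W, X; Y, Z) from per-site allele frequencies.

    freqs: list of (w, x, y, z) tuples, the frequency of the same allele
           in populations W, X, Y and Z at each site.
    mask:  optional list of booleans; True means "skip this site",
           e.g. because it is flagged as Neanderthal derived.
    """
    num = 0.0
    den = 0.0
    for i, (w, x, y, z) in enumerate(freqs):
        if mask is not None and mask[i]:
            continue
        num += (w - x) * (y - z)
        den += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    if den == 0:
        raise ValueError("no informative sites left after masking")
    return num / den


# Toy data only: four sites, the third flagged as archaic derived.
toy_sites = [
    (1.0, 0.0, 1.0, 0.0),
    (0.0, 1.0, 1.0, 0.0),
    (1.0, 0.0, 1.0, 0.0),
    (1.0, 1.0, 1.0, 0.0),
]
archaic_flag = [False, False, True, False]

d_all = d_statistic(toy_sites)
d_masked = d_statistic(toy_sites, archaic_flag)
print(d_all, d_masked)
```

In this toy case the whole signal sits on the flagged site, so the statistic drops to zero once it is masked. With real data the hard part is the flag itself, that is, deciding which variants actually came in from Neanderthals.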

Rob said...


Are you suggesting that BEu is a sister branch to a population related to an UP European group which simply did not mix with Neanderthals, whilst other west Eurasian, as well as all other divergent Eurasians (ENA, U-I) did (somehow) ?

This would have to essentially exclude a southern coastal AMH route wholly (beyond the Persian Gulf), and mean that all Eurasians apart from BEu colonised the globe via a path going through Neanderthal territory

(Assuming ASI and Aborigines also have Neanderthal admixture)

Anonymous said...


"That is called setting a digital footprint. It's not really meant to convince anyone. You do know what that is, right, and how it's done?"


"I wish I was that successful in *spamming* my thesis. Not really. Don't have the time to. Although I wish I did a better job at it."

You might be more successful if you didn't write like a 13 year old with ADHD who just overdosed on sugar. But probably not ;D

"...brush pictures..."


Grey said...


"This would have to essentially exclude a southern coastal AMH route wholly (beyond the Persian Gulf), and mean that all Eurasians apart from BEu colonised the globe via a path going through Neanderthal territory"

would a southern migration counter clockwise around the himalayas followed by two back migrations, east and west, work?

(so the western back migration mixed with neanderthals in the north)

Rob said...


I'm not sure if I've even understood Alberto & Matt's suggestions, and I'm sure they're just preliminary suggestions.

Firstly, the non-Basal (i.e. crown Eurasian) split was more than a two way split (west & east), but one which has to take into account north Eurasian (ANE, WHG), ENA, Ust-Ishim, etc.

Secondly, Papuans have Neanderthal admixture. If so, and considering the territorial range of Neanderthals, then Asia would need to have been colonised via somewhere near the Southeast Caspian region. It can get tricky about the routes around the Himalayas, south of them, north of them, both, layering, etc.
It's been debated since the early days of mtDNA, and recently with TreeMixes etc.
Indeed, a few threads back, most TMs had crown Eurasian slightly closer to the basal in Iran than Natufians, which suggested to me that Cr Eu split off the basaloid branches closer toward Iran than immediately in the southern Levant

Chad said...

I'll try those Mota runs tomorrow. Sorry, I ran out of time today.

Ryukendo K said...
This comment has been removed by the author.
Matt said...

Rob: Are you suggesting that BEu is a sister branch to a population related to an UP European group which simply did not mix with Neanderthals, whilst other west Eurasian, as well as all other divergent Eurasians (ENA, U-I) did (somehow) ?

Really, it's that since there is an apparent correlation with Near Eastern early Neolithic ancestry and Neanderthal statistics, I'm interested in whether the D(EEF,WHG/ANE,Han;Ust_Ishim,Outgroup) signals would remain when Neanderthal derived variants are removed.

If those signals go (e.g. D(EEF,WHG;U_I,Outgroup) = 0 when Neanderthal derived variants are removed), then something like that topology could be possible.

Seems kind of unlikely to me (three separate edges from Neanderthal is less simple and no sign of it in any treemix with Neanderthal) and how it would make sense in terms of movements I don't know, but seems worthwhile for any academics from the Reich group to test, if they're listening in here.

Rob said...

Yes I see. Interesting, that would have implications for the path trodden through the Eurasian heartland, as I indicated above

Karl_K said...


"I'm interested in whether the signals would remain when Neanderthal derived variants are removed."

It would be fantastic if we had an accurate list of all alleles that entered the modern genepool from Neanderthals or Denisovans.

Unfortunately, this is much more difficult than it seems. Since there was often heterozygosity at the exact same alleles in modern Africans and in archaics, you can't simply separate them on a SNP basis, you would have to use short haplotypes. This requires high coverage sequencing to be very accurate.

The main problem is that we don't have the exact genomic sequence of the Neanderthals that actually admixed at the split toward the Crown Eurasian group.

We are only inferring that info from a very small number of distantly related Neanderthals.

Also, we do not know if the admixing Neanderthals at the base of Crown Eurasians already had any earlier AMH admixture (as the Altai Neanderthal did).

Unknown said...

@ Karl

"Since there was often heterozygosity at the exact same alleles in modern Africans and in archaics, you can't simply separate them on a SNP basis, you would have to use short haplotypes. This requires high coverage sequencing to be very accurate."

There should be plenty of sites where Archaics are heterozygous to the exclusion of Africans and Eurasians, or even heterozygous at AMH heterozygous sites, but with a very different allele frequency. The problem is that we don't have any good arrays that are very well ascertained in Archaics. Even if we put together one from a high coverage genome, the problem is AMH may not be genotyped at those sites (except for some sequences at the Simons Genome Diversity Project).

I have worked with the 110K Lazaridis Denisovan ascertained panel, but found that they had too many overlapping allele frequencies at African sites. I did not spend too much time on identifying Denisovan unique sites. That may be a possibility.

With regards to not having the sequence for the Neanderthal groups that actually admixed with AMH, that is not that big a deal. An analogy is even with Africans being very diverse, we are still able to identify Africans in general from Eurasians.

Karl_K said...


"With regards to not having the sequence for the Neanderthal groups that actually admixed with AMH, that is not that big a deal. An analogy is even with Africans being very diverse, we are still able to identify Africans in general from Eurasians"

No doubt. But you are not trying to identify 0-5% admixture from groups that you have only a small amount of data from.

The available Neanderthal genomes are mostly low quality, with the exception of the Altai Neanderthal that clearly has AMH (African-like) admixture.

There must be a few thousand SNPs that originated in the lines leading to Neanderthals and/or Denisovans that will be 100% indicators of that ancestry at those positions.

However, we can only identify those that were present in the small number of sequenced genomes.

The other factor is (as you mentioned) that the enriched genotyping data is great for calling alleles on known heterozygous sites. But we can't know if those sites were also polymorphic in Neanderthals without more data. And we certainly can't know if those sites were polymorphic in the particular Neanderthals that admixed into the base of Crown Eurasians.

Only many more higher quality Ust-Ishim like genomes will be able to shed light upon that.

Jacob said...

Hi Davidski-
I wanted to know why I got over 1% Oceanian on all the GEDmatch Eurogenes calculators when I have no known ancestry from there and it didn't show up on Ancestry DNA. Thank you,
J Bower

Unknown said...

@ Karl

"No doubt. But you are not trying to identify 0-5% admixture from groups that you have only a small amount of data from."

The amount of admixture is not really the issue. We are able to identify such small amounts of African admixture in Eurasians, even just using a couple of African samples. It would be helpful if we had a few good coverage samples to get an idea of diversity at various Archaic loci, but keep in mind that even if we don't have genomes from the ones that admixed with AMH, Archaics by virtue of their shared drift with each other, should have loci with a shared common "Archaic" allele frequency to the exclusion of AMH. Naturally at some loci, archaics will show varying allele frequencies amongst each other, but at other loci, archaics from various groups should have a common "archaic" allele frequency different from AMH, or perhaps at those sites AMH may have a MAF of 0 (fixed in AMH), or vice versa.
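As a rough illustration of the kind of site selection described here, the sketch below keeps only loci where every archaic sample carries an allele at high frequency while the allele is absent from an AMH reference panel. The record layout, the thresholds and the toy values are all assumptions made for the example, not data from any published panel.

```python
def archaic_informative_sites(sites, min_archaic_freq=1.0, max_amh_freq=0.0):
    """Return IDs of sites where all archaics share an allele that AMH lack.

    sites: list of dicts with hypothetical keys
           "id"            - site identifier
           "archaic_freqs" - allele frequency in each archaic sample/group
           "amh_freq"      - frequency of the same allele in the AMH panel
    """
    keep = []
    for site in sites:
        archaic = site["archaic_freqs"]
        if not archaic:
            continue  # no archaic data at this site
        shared_in_archaics = min(archaic) >= min_archaic_freq
        absent_in_amh = site["amh_freq"] <= max_amh_freq
        if shared_in_archaics and absent_in_amh:
            keep.append(site["id"])
    return keep


# Toy records only.
toy_panel = [
    {"id": "site1", "archaic_freqs": [1.0, 1.0], "amh_freq": 0.0},
    {"id": "site2", "archaic_freqs": [1.0, 0.5], "amh_freq": 0.0},
    {"id": "site3", "archaic_freqs": [1.0, 1.0], "amh_freq": 0.2},
    {"id": "site4", "archaic_freqs": [], "amh_freq": 0.0},
]

informative = archaic_informative_sites(toy_panel)
print(informative)
```

Loosening `min_archaic_freq` would admit sites where archaics vary among themselves, which is the case the comment sets aside.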

"The available Neanderthal genomes are mostly low quality, with the exception of the Altai Neanderthal that clearly has AMH (African-like) admixture."

I believe that I have seen a couple of decent coverage genomes (Denisovan and Neanderthal). The African like admixture is not real, as it is a byproduct of ascertainment bias. The reason for the agreement at those loci between archaics and Africans to the exclusion of Eurasians is not geneflow from Africans to Altai or vice versa, but because, as you mentioned, Africans and Archaics agree at those sites because they are very likely sites ancestral to both AMH and Altai, but have mutated in Eurasians only. Whole genome comparisons don't support the African like admixture in Archaics.

Unknown said...

@ Karl

Forgot to mention, you have a point about not having genomes from those archaics that admixed with AMH, not that we would not be able to identify general archaic geneflow to AMH, but rather that geneflow would likely be underestimated if we used distantly related archaics.

An analogy is if I had a bunch of Eurasian genomes and 1 African genome, I could identify % African in those Eurasians, but that would likely be an underestimate of African admixture for somewhat obvious reasons.

Matt said...

First Farmers (Israel and Iran) is out and published, so there will be no further revisions (if not now, imminently).

Going forward, said Pinhasi, "We're eager to study remains from the world's first civilizations, who succeeded the samples analyzed in the study. The people everyone reads about in history books are now within the reach of our genetic technology."

Harappa, Sumer, Kemet, Elam, Hittites, Minoan Crete, I'd guess.

In other news: "Genomic analysis of Andamanese provides insights into ancient human migration into Asia and adaptation"

10 Andamanese whole genomes and 60 Indian mainland whole genomes (1000 Genomes, Simons Project or others?).

"We show that all Asian and Pacific populations share a single origin and expansion out of Africa, contradicting an earlier proposal of two independent waves of migration. We also show that populations from South and Southeast Asia harbor a small proportion of ancestry from an unknown extinct hominin, and this ancestry is absent from Europeans and East Asians."

I don't know about single wave (given ANE findings in Lazaridis 2016). Is this unknown archaic ancestry the Denisovan related we know of, or something else?

Matt said...

Ah, just looking at their figures for the Andaman paper, it appears to be something else: non-Denisovan.

Idea seems to be based on lower African derived alleles in South Asia (and therefore must be derived from a population that branched off earlier than a H. Sap split?).

See Fig 2.

Alberto said...


Yes, those are good questions for someone to look into if they have the technical ability to do so.

This paper about Andamanese is quite relevant to all this. It clearly shows how archaic admixture pulls populations away from Africans. French and Sardinian (the 2 basal admixed pops in the study) indeed show the lowest levels of Neanderthal (just slightly). And also appear closer to Africans (just slightly too). Also it seems they have a slightly lower Denisovan admixture than the other populations (Supp Fig. 13), though here Papuan is clearly the outlier with high Denisovan (as expected). And then that mysterious 3rd archaic population that admixed into South Asians (but not East Asians - Han and Dai), and is tested indirectly by their lower relatedness to Africans. Here it's Australians that seem to have clearly higher levels than South Asians.

The final effect of all the accumulated admixtures produces the highest stat like (Supp. Table 6):

D(Australian, French; Yoruba, Ancestral) D=-0.0691 Z=-13.986

Quite impressive. (Ancestral is just an outgroup here).

Matt said...

If I understand it, I *think* from their treemix models (and I think I vaguely remember something similar in another paper) "Ancestral" is a kind of simulated population that has ancestral states from which Neanderthal, Denisovan, Human are all derived.

That is, they used the chimp genome and genomes for Neanderthal, Denisovan and Human to simulate the last common ancestor of the whole Homo clade. Rather than any specific archaic population that shows derived states of their own, or the chimp, which would show lots of derived and ancestral states that are totally irrelevant to Homo.

Not 100% sure about that though, I could be spouting nonsense here.

Davidski said...


I had a look at some of those models. They don't seem to be working. Lots of "infeasible" in the output.


Chimp Anatolia_Neolithic Mbuti.DG Villabruna 0.3962 100 847019
Chimp Germany_MN Mbuti.DG Villabruna 0.4169 100 809999
Chimp Hungary_EN Mbuti.DG Villabruna 0.4098 100 826036
Chimp Iberia_Chalcolithic Mbuti.DG Villabruna 0.4249 100 847449
Chimp Iberia_EN Mbuti.DG Villabruna 0.4083 100 834035
Chimp Iberia_MN Mbuti.DG Villabruna 0.425 100 797524
Chimp Israel_Natufian Mbuti.DG Villabruna 0.3738 78.015 428880
Chimp LBK_EN Mbuti.DG Villabruna 0.4016 100 845052
Chimp Levant_Neolithic Mbuti.DG Villabruna 0.3829 94.87 706117
Chimp Remedello_BA Mbuti.DG Villabruna 0.4144 94.735 522957
Chimp Sardinian Mbuti.DG Villabruna 0.4011 100 736049
Chimp Anatolia_Neolithic Mbuti.DG Iran_Neolithic 0.372 100 822016
Chimp Germany_MN Mbuti.DG Iran_Neolithic 0.3635 97.175 783402
Chimp Hungary_EN Mbuti.DG Iran_Neolithic 0.3716 100 800684
Chimp Iberia_Chalcolithic Mbuti.DG Iran_Neolithic 0.3665 100 822593
Chimp Iberia_EN Mbuti.DG Iran_Neolithic 0.3663 100 805231
Chimp Iberia_MN Mbuti.DG Iran_Neolithic 0.3685 100 760145
Chimp Israel_Natufian Mbuti.DG Iran_Neolithic 0.3448 74.645 425175
Chimp LBK_EN Mbuti.DG Iran_Neolithic 0.3707 100 819100
Chimp Levant_Neolithic Mbuti.DG Iran_Neolithic 0.3632 95.831 692260
Chimp Remedello_BA Mbuti.DG Iran_Neolithic 0.365 87.916 510924
Chimp Sardinian Mbuti.DG Iran_Neolithic 0.3598 100 708675
Chimp Anatolia_Neolithic Mbuti.DG Ust_Ishim 0.3217 80.107 1087293
Chimp Germany_MN Mbuti.DG Ust_Ishim 0.3244 74.582 1030835
Chimp Hungary_EN Mbuti.DG Ust_Ishim 0.3192 78.545 995446
Chimp Iberia_Chalcolithic Mbuti.DG Ust_Ishim 0.3245 79.009 1099921
Chimp Iberia_EN Mbuti.DG Ust_Ishim 0.3233 77.297 1041359
Chimp Iberia_MN Mbuti.DG Ust_Ishim 0.3252 75.034 926123
Chimp Israel_Natufian Mbuti.DG Ust_Ishim 0.3077 64.545 480552
Chimp LBK_EN Mbuti.DG Ust_Ishim 0.3224 79.725 1075313
Chimp Levant_Neolithic Mbuti.DG Ust_Ishim 0.3079 69.883 803324
Chimp Remedello_BA Mbuti.DG Ust_Ishim 0.325 69.491 674326
Chimp Sardinian Mbuti.DG Ust_Ishim 0.3169 80.514 934087
Chimp Anatolia_Neolithic Mbuti.DG Ami 0.3285 100 592905
Chimp Germany_MN Mbuti.DG Ami 0.3332 93.421 560940
Chimp Hungary_EN Mbuti.DG Ami 0.3292 97.286 557579
Chimp Iberia_Chalcolithic Mbuti.DG Ami 0.3322 99.809 593440
Chimp Iberia_EN Mbuti.DG Ami 0.3321 95.785 582637
Chimp Iberia_MN Mbuti.DG Ami 0.3365 97.142 554447
Chimp Israel_Natufian Mbuti.DG Ami 0.3096 69.034 268162
Chimp LBK_EN Mbuti.DG Ami 0.3305 100 590953
Chimp Levant_Neolithic Mbuti.DG Ami 0.3176 85.332 454079
Chimp Remedello_BA Mbuti.DG Ami 0.3356 83.663 370834
Chimp Sardinian Mbuti.DG Ami 0.3337 100 593614

Jacob Bower,

It's only 1% so it might be excess archaic ancestry or genotype errors.

Jacob said...

Thanks, Davidski, for the reply. I had one more question. What does the North Atlantic category include? It is my majority ancestry on your calculator.

huijbregts said...

So far Davidski has published the ADMIXTURE data of only 9 pops.
And already I find it hard to combine all the data.
What is helpful for me is a datasheet, which has the rows and columns ordered by dendrogram.
This can be implemented with the R function 'heatmap'.
See heatmap.png in my Dropbox:
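For readers who do not use R, the row and column ordering can be approximated in plain Python. This is a minimal sketch, assuming average-linkage clustering on Euclidean distances; it is not a reimplementation of R's 'heatmap', whose defaults differ, and the population and component names and the numbers are toy values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def leaf_order(vectors):
    """Average-linkage agglomerative clustering; returns indices in leaf order."""
    if not vectors:
        return []
    clusters = [[i] for i in range(len(vectors))]

    def avg_dist(c1, c2):
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = avg_dist(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]  # keep the two groups adjacent
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


def reorder(matrix, row_labels, col_labels):
    """Reorder rows and columns so similar ones sit next to each other."""
    rows = leaf_order(matrix)
    cols = leaf_order([list(col) for col in zip(*matrix)])
    ordered = [[matrix[r][c] for c in cols] for r in rows]
    return ordered, [row_labels[r] for r in rows], [col_labels[c] for c in cols]


# Toy ancestry proportions (rows: populations, columns: components).
toy = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
]
ordered, row_names, col_names = reorder(
    toy, ["pop_A", "pop_B", "pop_C", "pop_D"], ["comp_1", "comp_2", "comp_3"]
)
for name, row in zip(row_names, ordered):
    print(name, row)
```

The reordered table can then be pasted into a spreadsheet or passed to any plotting library; the clustering here is quadratic per merge, which is fine for a few dozen populations.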

Davidski said...


Check out the relevant blog entries here...


The test is coming along nicely. I'll post new results for the samples above tomorrow, and a spreadsheet later this week.

Matt said...

@ Davidski, I see, thanks for trying. If you have a chance, can you see if the model in the paper works, with outgroups as Ust_Ishim, Kostenki14, MA-1, Han, Onge, Papuan, and model populations as Mota, WHG, EHG? If those don't replicate then I must have misunderstood the section somehow. I can't see how the Villabruna / UP Europeans / AG3 in as ancestors in place of WHG and EHG would break it though.

Davidski said...

Getting similar results to what's in the paper. Btw, keep in mind I didn't explore your suggestions in great detail; just ran a couple of quick tests.

Right pops

Left pops

best coefficients: 0.242 0.214 0.544

std. errors: 0.055 0.070 0.106
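One quick way to read output like this is to turn each coefficient and its standard error into a z-score and a rough interval. The sketch below assumes the estimates are approximately normal and uses coefficient ± 1.96 × SE. The source labels are placeholders, since the population lists are not shown above.

```python
# Values copied from the output above; the labels are placeholders.
coefficients = [0.242, 0.214, 0.544]
std_errors = [0.055, 0.070, 0.106]
labels = ["source_1", "source_2", "source_3"]


def summarize(coefs, errors, z=1.96):
    """Return z-score and approximate 95% interval for each coefficient."""
    rows = []
    for coef, se in zip(coefs, errors):
        rows.append(
            {
                "coef": coef,
                "se": se,
                "z_score": coef / se,
                "low": coef - z * se,
                "high": coef + z * se,
            }
        )
    return rows


summary = summarize(coefficients, std_errors)
for label, row in zip(labels, summary):
    print(
        f"{label}: {row['coef']:.3f} "
        f"[{row['low']:.3f}, {row['high']:.3f}] z={row['z_score']:.1f}"
    )
print("sum of coefficients:", round(sum(coefficients), 3))
```

Under this approximation an interval that stays above zero suggests the source is contributing something, though the intervals here are wide relative to the estimates.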

Ryukendo K said...
This comment has been removed by the author.
Davidski said...

Haven't looked at that paper yet, but totally unsupervised Admixture runs aren't usually very informative unless they include a lot of modern and ancient samples.

Chad said...


Which part do you take that from? Dstats in the form of West Asian/CHG Paniya Onge E Asian favor the Onge. I also don't see the West Eurasian part of South Indians being included in TreeMix, which would explain them looking like a separate branch. qpAdm also favors Onge, and pretty much rejects E Asian.

Ryukendo K said...
This comment has been removed by the author.
Unknown said...

Well, since you gave us the results of the Kalash, mind sharing BedouinB's results?


Matt said...

@ Davidski. Thanks. I suspect if there are reasons why the models with the same right and then AG3, Villabruna and UP European as left don't work, rather than just give UP European at 0, it may be because there's not enough information for qpAdm to distinguish between some of the left.

I guess other options to explore the impact of allowing the low ENA affinity, high Ust Ishim affinity UP Europeans in are to make more iterative changes to their model with the UP Europeans (e.g. keep left pops as is, but substitute WHG for Vestonice16 or GoyetQ116-1, or try swapping EHG for AG3) and then see what happens, but basically whether to explore in detail or not is up to you, if and when you have time.

Alberto said...


Yes, that's correct. This Ancestral is a simulated common ancestor of Homo and Chimp. It was used in the Mota paper too as an outgroup for estimating archaic admixture.

BTW, from the Mota paper there are a couple of East African populations that don't show any sign of Eurasian admixture: Sudanese and Anuak. In the absence of more African ancient DNA, they seem more relevant to test if Basal Eurasian is closer to Africans than using Yoruba or Mbuti. Though I'm not sure if those populations are publicly available.

From the Andamanese paper, another thing that caught my eye: While the stats in the form D(Loschbour, X; Andamanese, Yoruba) are significantly negative for South and East Asians as expected (Supp. Fig. 10), the ones in the form D(Mal'ta, X; Andamanese, Yoruba) are insignificantly positive (Supp. Fig. 28). At the same time D(Andamanese, Dai/Han; Mal'ta, Yoruba) is insignificantly positive too (Supp. Table 3). Not sure what to make out of it.

Karl_K said...


"Not sure what to make out of it."

You're not the only one. The new genomes will be a great addition, but the analysis in the paper seems very strange. I would like to see some 3rd party verification of the results.

MfA said...

@David, I can send you the Iron Age Iranian file if you still need it

Davidski said...

Send it over. I'll run a PCA.

MfA said...

Ok sent it to your hotmail.

batman said...

The Middle-East as a continuous melting-pot?

"We found that the relatively homogeneous population seen across western Eurasia today, including Europe and the Near East, used to be a highly substructured collection of people who were as different from one another as present-day Europeans are from East Asians," said David Reich, comparing apples to oranges.

batman said...

It's rather obvious that the area between the Black Sea, The Med and the Persian Gulf has been a sink, rather than a source of the patrilinear cultures that grew into the ancient civilisations.

That they all used domesticated plants and animals is well known.
It's also established that the respective populations were growing out of patrilinear dynasties - forming "extended families" ("ethnicities") - as they grew in numbers and built the first, large civilisations.

What this study establishes is that the various branches of the paleolithic genome - forming these ethnicities - have developed various speciations (sub-groups) of the plants and animals used domestically.

Today we also know that domestication of both plants and animals had already started when settlements like Aurignac, Kostenki, Madeleine, Maglemose, Malta, Mladec, Solutre and Sunghir existed.

Moreover we know that (all) these populations were very homogeneous, sharing 'close genetic ties' across Europe, from Spain to Siberia.

Moreover we know that ALL of the known paleolithic sites went extinct, during the Last Glacial Terminus. This started with the LGM at 23.000 - 18.000 BP and peaked during The Younger Dryas, 12.900-12.100 yrs BP.

It was during the very last period that the last populations of larger species of land-animals disappeared, some 45 of them into extinction.

The same 'evolution' can be spotted among the remains of the arctic humans that populated the arctic Eurasia - where a number survived the LGM, only to disappear during the 'extinction event' known as the Younger Dryas.

This implies that the specific haplotypes from paleolithic Eurasia are basically extinct, too. Except for a small group or two who happened to live in a climatic refugium, where they could survive the mass-extinction of the Younger Dryas.

We still don't know where these refugia were located, but Pinhasi et al have already reported (2014) that only "small groups" of people had survived the devastating end of the Eurasian ice-age.

Since the results from Ust-Ishim, Kostenki and Malta-Buret we have also known that ALL later Eurasians - including the arctic Hunters as well as the arctic Gatherers, Trappers, Fishers, Foragers, Gardeners, Herders, Dog-breeders, Goat-breeders, Horse-breeders and Cattle-breeders alike - have a common ancestry from a (small) group of SURVIVORS from the European Paleolithic.

Consequently we may explain the occurrence of the 'Anatomically Modern Caucasian' as a result of an ice-time refugium - where the specific traits and phenotypes necessary to survive in the arctic hemisphere COULD develop.

Moreover - we may find that the spread of the later, caucasian y-lines - defined by an ancestral CT/CF-macrogroup - are the sole origin to the y-lines that made the respective dynasties that ruled the first, known civilisations - such as hg G,H,I,J, K.

batman said...

Obviously there's a mutation of K2, forming R1a/b, that can be linked to the spread of Cattle-breeders, which seems to be the somewhat limited understanding of "farming", as the term is used by modern academics.

Thus it's good to see that Pinhasi et al have looked a bit deeper into these factors, finally substantiating what most of us already knew - that VARIOUS ethnicities developed a v-a-r-i-e-t-y of produce, each adapted to regional biotopes and climates. Which explains why there are agricultural societies forming already some 10.000 yrs ago - not only in Central America and China, but even in the arctic and semi-arctic regions of northern India, Anatolia, northern Africa and Europe.

An effective spread of the arctic Caucasians was possible only after the Younger Dryas and the end of the Eurasian Ice-time. That may explain how the initial, "frog-leap" spread of both agriculture and the I-E languages can be linked to the older y-dna-lines of G, H, I and J. As their cousin-lines of K2/R1 were able to develop an infant-like ability to digest milk and dairy, we got the later spread of "livestock-agriculture", where the massively effective milking-cows start to roam all plains and lowlands.

Thanks to a common, post-glacial ancestor-group ("Noah") - the homogeneity between the various caucasian populations of mesolithic Eurasia was also very close. Thus the various branches from "Noah" (C/CF) - forming today's 'brother-lines' of G, H, I, J, K+ (etc.) - became distinctively separate family-lines already at the beginning of Holocene, as soon as the repopulation of arctic Eurasia COULD start.

Finally - since the various y-dna lines from Paleolithic Eurasia went extinct as late as the Younger Dryas, we may have to look for a Last Common Ancestor to the e-x-t-a-n-t y-lines (from CF) as a result of the last 12.000 years - only.

Ryukendo K said...
This comment has been removed by the author.
Shaikorth said...

Pontus Skoglund failed to duplicate the D-stats of the Andamanese study, even with the Simons Genome Diversity Project data. Caution is advised.

Karl_K said...


"Pontus Skoglund failed to duplicate the D-stats of the Andamanese study, even with Simons Genome Diversity Panel. Caution is adviced."

Very interesting. I was just saying above that this study may have technical issues.

It is very hard to imagine that mysterious hominins contributed 1-10% admixture to people from India to Australia, yet it hasn't been predicted previously.

This implies that Australians are ~5% Denisovan, ~3% Neanderthal, AND ~7% mystery hominin?

Sounds especially hard to believe, as there is no clear way to explain this from a single admixture event.

Alberto said...


Yes, thanks, good to know that.

I hope it won't become the norm to have papers published with strange looking stats that others can't reproduce. And it would be good to understand why these discrepancies happen in the first place.

Davidski said...

Here's that PCA of Iran Iron Age F38 (red dot). Clusters with the Early Bronze Age Armenians.

Open Genomes said...

David, will you generate a Gedmatch upload for WC1 and also plot it?

It would be useful for checking the IBD with other ancient samples.

The complete BAM file is available here, with the index:

Also, after that's done, how about a K9 or K10 spreadsheet so we can generate a 3-D plot?


Matt said...

A few unsupervised ADMIXTURE analyses by the blogger Genetiker:

K11: (forms a) LevantN/Natufian cluster, b) EuroHG cluster, c) CHG cluster)
K12: (forms a) Natufian cluster, b) Anatolia_N cluster, c) EuroHG cluster, d) CHG cluster)
K13: (forms a) Natufian, b) Anatolia_N cluster, c) EuroHG cluster, d) CHG cluster, e) Yamnaya-steppe cluster)

Interesting facets:
- Invariably some small level of the Levant / Natufian cluster in Mandenka, even where IIRC the old SW Asian clusters never occurred there.
- Zoroastrian cluster seems to contain Euro_HG, and not just as a by-product of containing the Yamnaya cluster. Not sure why.
- Specific Levant/Natufian cluster absent from ancient Europeans always shows up in modern Southern Europe (except Basque).
- Iran_N always takes the CHG cluster plus the South_Asian cluster. Possibly not enough power to make its own component.

(Of course take with Genetiker being kooky in mind).

Any thoughts?

Anonymous said...


Any word on the GedMatch Iranian test?

Chad said...

It's Admixture. It's not very reliable for differentiating ancient mix. It tends to load on groups with lots of samples, such as the Zoroastrians. I can check tomorrow, but I doubt they're very significantly Iranian over Levantine.

Garvan said...

@ Matt
The Chinchorro mummy result looks like it is 1/3 admixed. Any ideas?

Chad said...

Kusunda also look like they have Iranian and ANE ancestry, on top of Onge and Austronesian.

Davidski said...

Genetiker is a schmuck. He probably hasn't cottoned onto the fact yet that some of these Zoroastrians need to be removed from ADMIXTURE runs because they show inflated pairwise IBD sharing, so they form their own cluster at high K, and pull a lot of other South/Southwest Asians into it.

Btw, the Kusunda from the Human Origins are really mixed. They pretty much have everything that exists in South Asia. Hard to make sense of them.

Gioiello said...

@ Davidski

"Genetiker is a schmuck". As Genetiker is a German very likely, and "Schmuck" in German does mean "gioiello" in Italian, and if we put together the two metaphors (Gioiello too is a "schmuck")...
I don't know if you know all the tens of languages I know, but...
1) I think that the results about SNPs of Genetiker are reliable, above all about Villabruna, and that against Sergey Malyshev
2) I don't give a dime to the autosome, I leave it to the Tolemaics

Davidski said...

The fun and games are about to end. There's a big paper coming with Bronze Age samples from all over Europe.

Hopefully when the dust settles, Genetiker leaves his blog online so that we can go back and check out the champagne comedy he served up over there.

Gioiello said...

Davidski, we all are waiting for this paper, because from a scientific point of view the last word has to be given to "proofs", but the only serious theory against mine is that of batman's (vespertilius, who as Hegel's noctule doesn't like Mediterranean light and naked bodies), i.e. that there was one only refuge. We'll see whether in the cold North or in the sunny South.

Davidski said...

Yes, obviously, batman is a leading mind in this field; Indo-European ethnogenesis during the Ice Age. Not only plausible, but simply brilliant.

Samuel Andrews said...

Genetiker did admit he was wrong about R1b btw. But yeah he does come off as sort of crazy.

Gioiello said...

I always left open the possibility that everything might have come from the North, just when I said that R1a-M420 did come from North-Western Europe, and that the close link of IE with Uralic languages should make us think of the North, and also that R-L11* seems to have expanded from the Baltic with the Germanic peoples' migration to South and Western Europe, even though it seems very unlikely to me that some R1 (b or a) were there 14000 years ago.
As usual I am basing myself upon a single Y: I am testing the J1* of an adopted man who is wandering through Europe in search of his true father, who seems to be an Italian, I think from Abruzzi or nearby. His Y separated from the subclades about 15000 years ago... I expect that also J will be found in Italy, older than Satsurblia and all the rest.

Gioiello said...

@ Samuel Andrews

Wrong about what? That he thought that it came from Palaeolithic Europe, not that the expansion happened from Italy and not from the Franco-Cantabrian refugium. That is the mistake that everyone made.

Matt said...

Davidski: He probably hasn't cottoned onto the fact yet that some of these Zoroastrians need to be removed from ADMIXTURE runs because they show inflated pairwise IBD sharing, so they form their own cluster at high K, and pull a lot of other South/Southwest Asians into it.

Probably not, and he doesn't realise that happens at very high K. Though in this case I think I misspoke, and to be clear, the Zoroastrians don't form a component in any of his K11-K13 runs. I should have said "Zoroastrian population samples seem to contain Euro_HG, and not just as a by-product of containing the Yamnaya cluster. Not sure why.", not "Zoroastrian cluster" (which there isn't).

@ Ryu, thanks for the comments. Re: Sardinians, I think it depends on which mainland South European population they are being compared to, as it's different for South Italians / Spanish.

Olympus Mons said...

What do you expect the bronze age paper full of genomes will clarify?

Will it not show those samples highly admixed with the local substratum? The further we move in time, the more detail of the originals we lose?

Olympus Mons said...

... And it's a bit strange, this ganging up on Genetiker. Weird.

CroMagnon said...

Lol The Batman
Not only does he think IE came from ice age west Baltic, but it seems like he's suggesting that all humanity came from a surviving group in the Baltic after the YD, so 11kya, although he freely admits that he doesn't know the first thing about genetics, such as directions of gene flow

It'll also be good when Gio shuts up about everything coming from his granddad's backyard in Tuscany because he "proved" it in, you know, all those thousands of letters he wrote based on STRs, but he was blocked because there is a conspiracy against his genius, like Copernicus when he pointed out the world was round

Exciting times

Davidski said...

I think the new paper will confirm the steppe hypothesis, with both R1a-M417 and R1b-M269 coming from Eastern European Hunter-Gatherers via Khvalynsk, Repin and/or Sredny Stog.

Simon_W said...

Yes, it will be interesting to see what yDNA the early Iberian Bell Beakers had. I think some people here are suggesting they had R1b. And that this had migrated somehow along the Mediterranean from West Asia (or Italy). But this is rather unlikely, because we already know samples from northern Iberia dated to 2900 - 2600 BC, and they were predominantly I2a, complemented with some G2a, I and H. One sample from El Mirador, I1277, was carrying I2a even as late as 2500-2400 BC. And one of the Remedello samples, RISE486, also postdates the Bell Beaker age, he lived after 2134 BC, yet he had haplogroup I. While already at 2500 BC Bell Beakers from Germany were purely R1b. I also point out that from Early Bronze Age Armenia 2600-2500 BC we have only one R1b, and that wasn't M269, and from Chalcolithic Armenia we have several L1a, but no R1b. So it's clear on which thesis I'm going to put my money...

Gioiello said...

@ Davidski
"I think the new paper will confirm the steppe hypothesis, with both R1a-M417 and R1b-M269 coming from Eastern European Hunter-Gatherers via Khvalynsk, Repin and/or Sredny Stog".

Davidski, the first serious thing you have said since I've known you. This is how a Copernican reasons: he makes a theory and looks for proofs to prove or disprove it, even though Koeppernick was very likely closer to Genetiker than you.

@ CroMagnon

You who believe descending from an extinct species, and perhaps coming from Holy Spirit, Villabruna was a slap of mine, confess, but it was only the first... este paratus!

Simon_W said...

It's also noteworthy that modern Iberians have non-negligible, quite substantial LNBA European admixture that is even visible to the naked eye. And this can't be all from Visigoths and Suebi. If Celtic had spread from Iberia, with direct roots in West Asia, then the modern picture would probably look very different.

Shaikorth said...


Here's something on the Kusunda and ADMIXTURE:

MDLP World-22 averages and a TreeMix on the components. First of all, it's obvious that Kusunda are endogamous and get a "Tibetan" component minimized in all other populations because of that first and foremost. TreeMix helps in checking the affinities of said component, once you check where the components peak from the sheet. Note that "Austronesian" is clearly misnamed and should be Australian.

Rob said...


Don't forget the theory which argues that early Iberian BB was steppic from the outset, i.e. steppe colonists somehow reached Iberia already by 3200 BC.
It'll be interesting to see what pans out, but I agree with you that we're looking at a culmination of gene flows from the south (=Iberian maritime beakers) and east (single graves, etc). I think Ryu's tests were a teaser but indicative of the significant complexity we should expect, esp. once we get samples from several Beaker regions.

Open Genomes said...

David, here are the Affymetrix Human Origins Array SNP calls for WC1 in VCF 4.2 and VCF 4.1 formats.

Will you be able to use this to plot WC1 and for your calculator?

Davidski said...

Yes, I think so. I'll be able to download them in about 10 hours when I'm home. On wi-fi right now.

Simon_W said...


Yeah, that's true. But seems kind of hard to believe that Yamnaya herders went to coastal Portugal around 2900 BC.


Of course you could argue that El Mirador was the wrong culture, i.e. not Bell Beaker at all, and moreover in an area where non-IE Basque may have survived for a long time. We know e.g. from the steppe that the culture matters, and even neighbouring cultures may have had very different haplogroup profiles. And you could also specially plead that Northern Italy was only culturally affected by Bell Beakers, but not much by immigrants, and hence explain the late Remedello sample. And the steppe admixture in modern Iberians could be explained as a Celtic back-migration. We've seen a back-migration on the steppe too, so why not in Iberia? These arguments are all possible, but not very parsimonious at the moment. I think they'll only be justified if new data forces us to adopt them.

truth said...

The European LNBA input in Iberia is about 50-60%, and the R1b was most probably introduced by them. The Bell Beakers of Iberia were probably carrying typical Neolithic haplogroups, such as I2, G2a, etc.

Rob said...

And I hope that they've sampled EE & the Baltix well so we can at least go toward arriving at some workable hypothesis for CWC.

Simon_W said...

Addendum to prevent all misunderstandings: What I've just been suggesting was that Celtic and R1b might have back-migrated to Iberia at a later date in the Bronze or Iron age thereby freshly introducing the LNBA ancestry there. That's what I consider the less parsimonious theory atm.

Samuel Andrews said...


R1b-P312 going to Iberia then back east is an unnecessary complication. It went one direction East>West.

I'm most interested in DNA from Bronze age SouthEast Europe. IMO, we'll see gene flow from Northern West Asia and Northern Europe. By the time Greeks began writing they were very mixed by Bronze age standards.

Olympus Mons said...


"I think the new paper will confirm the steppe hypothesis, with both R1a-M417 and R1b-M269 coming from Eastern European Hunter-Gatherers via Khvalynsk, Repin and/or Sredny Stog."

Well. Disappointing. No bronze age samples (2nd millennium) will show you that. Truly think you like alleles but have difficulties with the concept of Time.

Olympus Mons said...

... unless you have Bronze age from all regions (3rd up until end of 2nd).

Karl_K said...

@Samuel Andrews

"R1b-P312 going to Iberia then back east is an unnecessarily complication. It went one direction East>West."

There were obviously many unnecessary complications in human history, and this could still be what actually happened. The data is lacking.

It also seems unnecessarily complicated that the Yamnaya-like steppe people went west and then back east before entering India, but that seems to be what happened anyway.

Olympus Mons said...


Anyway, without a "super paper" on bronze age samples this is what "I know" as I suppose anyone who reads anthropology also should “see”. So, this “super Bronze age” paper is going to disprove me on R1b. Ok

a. R1b M269 was in the southern Caucasus north of the Kura river, so between the Lesser and the Greater Caucasus, and also in the basin of the Araxes river at Aratashen (from 6000 BC to 4900 BC – see how precise the dates are!). The culture is today called Aratashen-Shulaveri-Shomu, because it existed in both pockets (oases), separated by the Lesser Caucasus mountains.
b. Aratashen got kicked out first. Aratashen is near Lake Sevan, where that Bronze Age R1b (P25?) was found. So by 5300 BC they were kicked out by the Ophidians (snake people) coming from beyond the Zagros mountains (a couple of centuries earlier), so the source people of Ubaid, Uruk and whatever was left after the period 5500 BC to 4500 BC. Anyway, Aratashen fell first, at the same time as the Halaf was transitioning to Ubaid. Maybe, just maybe, these were the M269 (xL23).
c. By 4900 BC, Mentesh Tepe fell (that is north of the Lesser Caucasus mountains, where the Shulaveri lived). That is where R1b-M269 lived. That is the point where L23 arose. Some had it, some didn't. The Ophidian people (Ubaid/Uruk) coming from the east/south pushed them north.
d. By 4800 BC, some were near the shores of the Black Sea. Some were in Anaseuli (near the Black Sea), others, with R1b-L23, near Kvachara – where you cross from the southern Caucasus to the northern Caucasus – and later part of Maykop territory. I believe the M269 (xL23) were in Anaseuli and moved back south to Anatolia (where the highest variance of R1b still exists) to coalesce and then immediately south to … that is the rest of my theory.
e. The ones moving north near the Black Sea (also with L23) were in Nalchik by, say… 4800/4700 BC. And kept moving north to Samara, Khvalynsk, Sredny Stog, etc.

Just hope your super bronze age is not talking about R1b-M269-L23 in …. “"I think the new paper will confirm the steppe hypothesis, with both R1a-M417 and R1b-M269 coming from Eastern European Hunter-Gatherers via Khvalynsk, Repin and/or Sredny Stog."” Millennia later, because the right answer to that it would be…. No shit, Sherlock!

Just hope that super Bronze Age paper will give us Bronze Age L51, L11 in Sredny Stog and Eastern Europe, at least by the Bronze Age. Otherwise it's a double "no shit Sherlock".

Olympus Mons said...

What everyone should be trying to figure out is who the Ophidian people were and where they came from, the people who during the 6th millennium BC moved to northwestern Iran, then to north Mesopotamia, to Iraq, Syria, and even up to parts of Anatolia. They all fell fast. Shulaveri (both at Shomutepe and in Aratashen), the Halaf, etc. That is the admixture you see and we have been talking about these last weeks. They admixed with everyone and made the first civilizations… and scared the living shit out of everyone with their deformed heads and face paintings to look like snakes, so much that a whole lot of people ran and kept running until they couldn't see them any more.

Does anyone have an Idea where these guys came from?

Karl_K said...


"Does anyone have an Idea where these guys came from?"

I'm guessing they came from Croatia.

Tesmos said...


Will they finally have samples from Western Yamnaya or nearby in this new huge paper?

Olympus Mons said...

Yes. Coming from you looks like a fair assumption.

Open Genomes said...

Thanks David - I've created VCFs for the Illumina Human OmniExpress 24 v.1.1 too along with the Affymetrix Human Origins Array VCFs, and these will be found in the same location as above:

The VCF 4.2 files are preferable for running an analysis, since they have calls for all the SNPs on their respective chips, and the VCF 4.2 "DRP" field contains all the high-quality read counts for each allele too. (There should be "no reads" for those SNPs that have no reads, so you can distinguish a "homozygous identical with Build37" result from a true "no call".)

WC1 is a whole genome sequence (not from a capture array) at 10.42x coverage with 622,993,765 raw read pairs, extremely high coverage for an ancient sample. Between these two datasets we should be able to get excellent IBD between WC1 and other ancient samples, and all sets of modern individuals too.

Open Genomes said...

Most of you may not realize it, but WC1 is a complete mystery. It's from a femur found in Wezmeh Cave in the Central Zagros, 7455-7082 calBCE (9465-9092 BP). The femur was dragged into this cave in a cliff by an animal, apparently from an exposed grave somewhere else.

Wezmeh Cave, Central Zagros, Iran

WC1 was a full-on grain farmer, whose diet consisted mostly of grain. He possibly also ate domesticated goats and cattle (and dairy?). There's no evidence that he hunted wild animals like gazelle at all. This is based on an isotope analysis of the collagen of the bone. There are very few radiocarbon dated Pre-Pottery Neolithic sites in Iran at this time. There are even fewer excavated sites, because most of these were from surface finds.

New evidence of the Neolithic period in West Central Zagros: the Sarfirouzabad-Mahidasht Region, Iran

Context Database 7,500-7,100 calBCE dated sites in Iran

The contemporary site of Tepe Guran in particular is only 33 miles / 53 km southeast of Wezmeh Cave.

This period in the Central-Northern Zagros of Iran is called the "ACN" or Aceramic Neolithic. It's contemporary with the Middle PPNB and Late PPNB in Northern Syria, Southeast Anatolia, and Northern Iraq, and also with the ECA I (Early Central Anatolian Neolithic I) of Çatalhöyük. It predates the Pottery Neolithic Jarmo Culture of the northern and central Zagros by several hundred years.

I think it's very clear that the ancestors of WC1 recently arrived in the Central Zagros from somewhere else with their wheat and cattle. Where did they come from? The PPNB region of Southeast Anatolia?

Map of radiocarbon dated sites, 8,200-7,500 calBCE, just before WC1 (7455-7082 calBCE)

If that's the case, will we see any IBD between WC1 and the Anatolian Neolithic?

Olympus Mons said...

@open genomes,
Thanks. Something to follow up. Will read about it. Those guys are important to understand it all...

Karl_K said...


"Yes. Coming from you looks like a fair assumption."

You're welcome. Let me know if you have any more questions.

Rob said...


Can you point us to the genetic data which shows this "coming of the snake people" and an emigration of M269 originating from exactly between the R. Kura & the Gr. & Lsr. Caucasus ranges?

Karl_K said...


"Can you point to us which genetic data shows this "coming of the snake people" and emigration of M269 originated form exactly between the R. Kura & Gr & Lsr Caucasus ranges ?"

Please do! This sounds totally reasonable!

Ariel said...

"The fun and games are about to end. There's a big paper coming with Bronze Age samples from all over Europe."
In 2016?

Rob said...

Open Genomes

Interesting insights

But I wonder how much the blank space in those maps simply reflects the state of research in pre-Neolithic Iran and surrounds, which is still poor and lagging behind the Levant.

Moreover, I thought the genetic data suggested the Zagros farmers are distinct (coming from a long-separated Epipalaeolithic background) from those in Anatolia and the Levant, not directly derivative.

MfA said...

If Iran_IA is the baseline for Iranic ancestry in modern western Iranics, then Iranic ancestry should be very low, less than 15%. The non-Iran_IA part of the ancestry of modern West Iranians is most similar to modern Pamiri Tajiks. I don't think anyone is surprised by this though.

Part of the steppe ancestry in modern West Iranians predates the proto West Iranics (total steppe ancestry is probably over 20%). And it is probably from groups like the Hurrians et al.; even though the Hurrians et al. didn't speak an Indo-European language, their religion and culture have IE influence all over.

Iran_IA looks like Armenia_EBA (Kura-Araxes) on a 2D PCA, but its admixture profile, although not by much, is still different. It shows less steppe and more Iran_N and Levant-type admixture, though this should be expected judging by its southern location and more recent date. On a 3D PCA, modern West Iranics are on a cline running through modern South Central Asians to Iran_IA.

I also checked Iran_recent, she's most similar to Feyli Kurds, has slightly more Pamiri Tajik like ancestry than modern West Iranians.

Open Genomes said...

@Rob & @Olympus

It's true that the state of research into the earliest Neolithic in Iran is very poor. It does however look like the early Ganj Dareh people were only nomadic goat herders, not settled farmers. The Ganj Dareh people were "almost" nomadic hunter-gatherers who "herd managed" wild goats.

Of course, the Zagros grain farmers and cattle herders were not directly derived from the Anatolian Neolithic people. We do know that the tMRCA of G2-P287 will be just after the LGM, around 18,200 ybp, not before it. There must be some sort of connection between the two groups that post-dates the LGM. It seems likely that both groups originated in the Middle Euphrates region, and migrated in opposite directions, admixing with the local hunter-gatherers. The Anatolian Neolithic people are all in various branches of Y haplogroup G2a2-L1259 with a tMRCA of 16,800 ybp, while the Neolithic Iranians are in G2b2a-Z8022 (WC1) and G2a1a-Z6553 (SG2/I1671). (There's also a Chalcolithic sample from Seh Gabi, I1674/SG21, in G1a1b-GG362/Z3189.) Even leaving out the G1a, the tMRCA between all the G2-P287 samples is 20,800 ybp, during the LGM.

YFull tree G2-P287

This certainly is in the Upper Paleolithic, but they are not separated by 45,000 years as the authors of the various studies claim. The Anatolian Neolithic Y results indicate that the G2a2-L1259 lineages had been co-migrating together since 16,800 ybp, and they weren't too far off before that from the other G2-P287s.

The paradox is, where is the shared ancestry that must exist between these two groups that is at least 25,000 years later than the claimed earliest shared ancestry? Does this have something to do with the high percentage of so-called "Basal Eurasian" found in both groups, a kind of IBD that may have been lacking in the Iranian Hunter-Gatherers such as the sample from Hotu Cave?

Olympus Mons said...

Hummm, so prior to being able to extract DNA from inhumations we knew nothing about history? So even theories like steppe PIE had no business being put forward prior to DNA extraction?

Can you point us to genetic data that makes us believe that pharaonic Egypt stems from the movement of people from the upper Nile onto the lower Nile and the mixing of both? As far as I know there is no genetic data to support this assumption. So every ancient Egypt expert is an idiot because there is no DNA to support their claims?

Rob, are you that stupid? Apparently so...

Rob said...


You're deflecting the question by false analogy
The expansion of Pharaonic Egypt is documented archaeologically, and makes no bold or specific calls about genetic lineages coming from a specifically defined geography, as you do, hence my question about specific evidence, esp. in light of how "obvious" you keep claiming it to be and the "evidence" you mentioned but did not elaborate upon or cite

Moreover, your theory reads more like a cartoon than one developed from a proper evaluation of evidence & rooted in deep methodological understanding. Some apparent parallels between Zambujal and Shulaveri enclosures, albeit they're separated by 2 millennia with no bridging? Essentially, the snake people chased everyone away - that's the crux of your theory?

Olympus Mons said...

@open genomes,
My problem is, apart from the growing evidence stemming from goats, cattle, cereals and so forth to support your previous comment, what we know is that a leveling and admixing force came to the Caucasus, Mesopotamia, even part of Anatolia, from 5500 BC to 4500 BC and changed the region. Halaf to Ubaid, Ubaid, Uruk, end of Shulaveri, etc. It was the snake people and it first shows up in northwestern Iran, near the Zagros. So, it could have been a local event then. Is there any indication of "foreign DNA" from east or southeast of Mesopotamia/Caucasus/Anatolia that was added to the mix and shows up in Chalcolithic populations there? What is it with the L1a guys found in the Caucasus?

It's intriguing...

Chad said...

G2 in Anatolians can be from Iranian-like admixture. Remember, Lazaridis had Anatolians as about equal parts Iranian, Levantine, and WHG. The Levantine is 2/3 Natufian and 1/3 Anatolian-like, meaning Anatolians have been modeled as a decent amount more Iranian than Natufian. So, Anatolians may be something like 25% Natufian, 40% Iranian, 35% WHG. The G2s may have only recently gone separate ways, even after the Older Dryas.
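
A rough way to check that arithmetic, as a sketch only: assume Anatolians are exactly one third each Iranian, Levantine and WHG, and that the Levantine third is itself 2/3 Natufian and 1/3 Anatolian-like. Substituting the Anatolian-like part back into the same model over and over converges on base proportions. The function name and the exact thirds are simplifying assumptions for illustration, not figures taken from the paper.

```python
def expand_anatolian(iterations=50):
    """Expand the self-referential Levantine share into base components."""
    # Assumed model: Anatolian = 1/3 Iranian + 1/3 Levantine + 1/3 WHG
    # Assumed model: Levantine = 2/3 Natufian + 1/3 Anatolian-like
    mix = {"Iranian": 0.0, "Natufian": 0.0, "WHG": 0.0}
    weight = 1.0  # share of ancestry still labelled "Anatolian-like"
    for _ in range(iterations):
        mix["Iranian"] += weight / 3
        mix["WHG"] += weight / 3
        mix["Natufian"] += (weight / 3) * (2 / 3)
        # The Anatolian-like part of the Levantine third gets expanded next round.
        weight = (weight / 3) * (1 / 3)
    return {name: round(100 * share, 1) for name, share in mix.items()}


print(expand_anatolian())
# {'Iranian': 37.5, 'Natufian': 25.0, 'WHG': 37.5}
```

Under those assumptions the split comes out at about 37.5% Iranian, 25% Natufian and 37.5% WHG, which is in the same ballpark as the rounded figures above.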

Olympus Mons said...

@ rob and friends.
I do not seek your validation, right? So, couldn't really care about what you think...
But let me just put a challenge to you guys. Just yesterday, this copper awl was found in an old pit at Perdigões, and you know how important that is to my thesis. So probably near 3000 BC. See the pic.

This is a picture of the oldest copper awl found in the Middle East, from around 5000 BC (when the Shulaveri were on the run). The origin of this copper awl has been traced to Arukhlo, the heartland of Shulaveri itself, and it was not local!
See the awl!

Now, I do not have a pic, but a similar copper awl was found in Maadi on the lower Nile (so in the Merimde region) so... 3800 BC?

Easy. Go find similar copper awls along a different route in the Copper Age and show us here... Because there are similar ones in the north Caucasus... made from south Caucasus ore. So show the spread via the steppe and so forth.

will be waiting seated.

Olympus Mons said...

Oh, Bell Beakers in Iberia had ivory. But oddly enough the ivory was from "savannah elephant", not even the "jungle elephant" that existed by the Copper Age in north Africa. See, even in east Iberia, all elephant ivory, as in east Mediterranean sites for that matter, was from "Asian elephant" .... do you know where similar ivory was found? In ancient Egypt, at Heliopolis near the Nile Delta... yeah, right.

I could go on forever...

Well, if R1b didn't spread with BB from Iberia, then that is a different story...

Davidski said...


Some of the Early Neolithic Anatolian farmers do have minor CHG or Iran Neolithic admixture, but I can't see them all having around a third of that type of ancestry.

Chad said...


I do agree that they are likely more Natufian than Iran_EN, but check this out. I don't think the Iranian is so minor, but not far from Natufian.

Depending on what is closer to the WHG in Anatolians, we see that the numbers aren't very significant. Hungary_HG, as we know, does look closer to Anatolians, but that might be because of possible minor BE and some ANE, which is likely in Anatolians with CHG/Iranian influence. If the hunter that admixed into Anatolians is more like Loschbour or Villabruna, then they aren't that much more Natufian than Iranian.

result: Loschbour Anatolia_EN Iran_EN Natufian 0.0116 2.370 23236 22703 439804
result: Bichon Anatolia_EN Iran_EN Natufian 0.0233 4.443 23990 22897 444322
result: Villabruna Anatolia_EN Iran_EN Natufian 0.0126 2.324 21579 21044 408466
result: Hungary_HG Anatolia_EN Iran_EN Natufian 0.0167 2.947 16507 15963 311834

Check out from the other direction now. Going from Natufian to Anatolia, Iran_EN admixture looks just as strong as WHG in Anatolia, and CHG is an even better admixing pop than WHG!

result: Natufian Anatolia_EN Loschbour Iran_EN -0.0006 -0.138 22703 22732 439804
result: Natufian Anatolia_EN Bichon Iran_EN 0.0021 0.435 22897 22801 444322
result: Natufian Anatolia_EN Villabruna Iran_EN 0.0043 0.854 21044 20866 408466
result: Natufian Anatolia_EN Hungary_HG Iran_EN -0.0002 -0.046 15963 15971 311834
result: Natufian Anatolia_EN LaBrana1 CHG 0.0090 1.934 24738 24297 483510
result: Natufian Anatolia_EN Loschbour CHG 0.0068 1.508 25371 25028 497627
result: Natufian Anatolia_EN Bichon CHG 0.0106 2.291 25735 25194 502381
result: Natufian Anatolia_EN Villabruna CHG 0.0124 2.593 22972 22410 448235
result: Natufian Anatolia_EN Hungary_HG CHG 0.0056 1.114 17818 17619 351027
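
For anyone who wants to sanity-check the lines above, here is a minimal sketch. It assumes the columns are four population labels, then D, Z, the BABA count, the ABBA count and the number of SNPs, which is consistent with the numbers shown, since (BABA - ABBA) / (BABA + ABBA) reproduces the printed D values. The |Z| >= 3 cutoff is only the usual rule of thumb, and the helper names are made up for illustration.

```python
from typing import NamedTuple


class DStat(NamedTuple):
    pops: tuple
    d: float
    z: float
    baba: int
    abba: int
    snps: int


def parse_result(line):
    """Parse one 'result:' line, assuming the column order described above."""
    fields = line.split()
    if len(fields) != 10 or fields[0] != "result:":
        raise ValueError("not a recognised result line: %r" % line)
    return DStat(tuple(fields[1:5]), float(fields[5]), float(fields[6]),
                 int(fields[7]), int(fields[8]), int(fields[9]))


def d_from_counts(stat):
    """Recompute D as (BABA - ABBA) / (BABA + ABBA)."""
    return (stat.baba - stat.abba) / (stat.baba + stat.abba)


def is_significant(stat, threshold=3.0):
    """Rule-of-thumb check: treat |Z| >= threshold as significant."""
    return abs(stat.z) >= threshold


lines = [
    "result: Loschbour Anatolia_EN Iran_EN Natufian 0.0116 2.370 23236 22703 439804",
    "result: Bichon Anatolia_EN Iran_EN Natufian 0.0233 4.443 23990 22897 444322",
    "result: Natufian Anatolia_EN Villabruna CHG 0.0124 2.593 22972 22410 448235",
]
for line in lines:
    stat = parse_result(line)
    print(stat.pops, round(d_from_counts(stat), 4), stat.z, is_significant(stat))
```

By that cutoff only the Bichon stat clears |Z| = 3; the others sit in the 2 to 3 range.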

I'll try some qpAdm runs now.

Davidski said...

Have a look at Extended Data Figure 5. Their modeled Anatolia_N clearly has eastern admixture that the real Anatolia_N lacks.

Anatolia_N does share ancestry with Iran_N, but I wouldn't characterize this as admixture from the latter into the former.

Essentially, Anatolia_N is just a more northerly version of Levant_N, with more Villabruna-related ancestry, and within that also minor WHG ancestry from the Balkans and surrounds.

D-stats are very clear about that, wouldn't you say?

Mbuti.DG Iran_Neolithic Levant_Neolithic Anatolia_Neolithic 0.0131 4.882 723467
Mbuti.DG Anatolia_Neolithic Levant_Neolithic Iran_Neolithic -0.0668 -20.804 723467
Mbuti.DG Levant_Neolithic Iran_Neolithic Anatolia_Neolithic 0.0798 24.647 723467

Chad said...

The two should share a good amount of drift if they're both a mix of Natufian, WHG, and Iranian. The stat above, Mbuti Iran Levant Anatolian, shows that the Levant + Iran + WHG model is valid. Even with more BE, Levant is further away, so Anatolia must have a good amount of Iranian. Considering Anatolia has significant WHG, which is much closer to Natufian and Levantine than to Iranian, I would think this confirms Iranian into Anatolia must be comparable to WHG, just as Natufian Anatolia WHG Iran confirms.

Davidski said...

Nah, Iran_N has a lot of ANE. That qpGraph in the last paper showing WC1 as 60/40 Basal/ANE isn't far from the truth.

Anatolia_N has no ANE that it didn't get through the Villabruna stuff. So it can't have any Iran_N.

Like I say, Anatolia_N and Iran_N do share ancestry, and some of the Anatolia_N individuals have clear CHG (Iran_N-related) admixture. But the idea that Anatolia_N is significantly Iran_N is just plain stupid, and I can tell you it will eventually be corrected.

Davidski said...

Open Genomes,

I'm having trouble converting the VCF files to PLINK bed. I'm getting empty bed files. Not sure what the problem is.

Can you upload PLINK bed fam & bim for WC1, with markers that overlap with the markers from the Mathieson et al. dataset?

Davidski said...

Some Anatolia_N individuals have some CHG or another type of Iran_N related admixture.

But Anatolia_N by and large can't have a lot of Iran_N ancestry, because Iran_N is rich in ANE, while Anatolia_N lacks it totally.

I can't see any way around this by positing the existence of a ghost hunter-gatherer group. There's simply a problem with the model, possibly related to having Levant_N as an unadmixed pop with no Villabruna-related input, when clearly it has such input.

Matt said...

Outgroups used to build those models of Anatolia_N "successfully" (with lower probability than their other models) are only O9: Ust_Ishim, Kostenki14, MA1, Han, Papuan, Onge, Chukchi, Karitiana, Mbuti, and then for the final model O9E: Ust_Ishim, Kostenki14, MA1, Han, Papuan, Onge, Chukchi, Karitiana, Mbuti, EHG.

Should be taking account of ANE related level of Anatolia_N (relative to references Iran_N, Levant_N, WHG) by Chukchi, Karitiana, MA1, EHG.

Davidski said...

I'm not saying Anatolia_N doesn't have MA1-related ancestry. What I'm saying is that this didn't come from Iran_N admixture, and most of it is contained within its Villabruna-related ancestry, apart from a few Anatolia_N samples that clearly have CHG or some other Iran_N related admixture.

MfA said...

This is what non-Iran_IA part of the modern West Iranics ancestry looks like
36.36 Potapovka_I0419
13.03 Andronovo_SG_RISE505
2.73 Sintashta_MBA_RISE395
29.70 Iran_ChL_I1661
5.45 Pima
11.21 Paniya
1.21 Koryak
0.30 Nganasan

Rob said...

@ RK

About Sardinia
I actually saw some of the Nuraghes head statues last year, but don't know too much about them, indeed they're a mystery to historians. I suspect that they're just Neolithic descendants.

But apparently the Nuraghics are sometimes counted amongst the "sea peoples", so perhaps trans-Mediterranean contact is to be expected. More pertinently, Sardinia was colonized by Phoenicians, becoming a Carthaginian colony until the Romans came. After Rome, it never fell to Arabs / Berbers.

* About Balkans
It was immediately obvious after the Satsurblia genomes came out and Dave did his new CHG K8 that extra CHG was required into Europe, probably via an Anatolia - Balkan route. It makes sense archaeologically. I remember only up until a few weeks ago that Sam attempted to vehemently deny such a movement, but others (eg Roy) thought it should be placed in the 2200 BC period ("bronze age collapse"). I always maintained it could/ should have begun as early c. 4000 BC, after the copper age collapse.

So maybe we should expect Iranian - Anatolian like stuff in *some* (eg lower Thrace), but perhaps not all Balkan Bronze Age samples (eg further north toward Hungary), as it probably penetrated Europe more slowly compared to the Eastern European steppe-like mixture, or was initially limited to Southern Europe. I suspect there were even 2 differential types and streams: an island Cypro-Minoan type and an inland Anatolian-Thracian type. Of course, we might be looking at an additional Anatolia -> SEE in 2200 BC, also.

The Croatian paper you're after is by Pinhasi
Abstract out only

Rob said...

Quite the opposite! But it depends on the region.
In Anatolia, it seems there occurred an "exhibit readjustment" with relative continuity.
But in Greece, there appears to have been fresh migrations from Anatolia, maintained by archaeologists from the 60s and to this day. A second, smaller movement appears to have come from the NW Balkans, from the Cetina culture area.

Rob said...

Agree, it makes sense
Hopefully the new study has got some samples from SEE

Shaikorth said...

"A tutorial on how (not) to over-interpret STRUCTURE/ADMIXTURE bar plots"

Though many here are aware of the limitations of ADMIXTURE and STRUCTURE, this should be a very important read for the wider community.

Additionally contains some analysis of Basu et al. Indian paper's data which is interesting on its own.

Aram said...


I agree with you that Hurrians should also have spread some Steppe like ancestry. After all they were neighbours of IE and they definitely have IE influence, both linguistic and religious (Teshub).
But on the other hand, where does the southern shift of modern Armenians come from? It is impossible that this shift was mediated solely by 'Semitic' influence. After all, we haven't seen any J2a, J1*, G2a in a North Near East BA / IA context. So I still expect substructure in the North Near East mountains. A lot of people should be there without any Steppe, just a mixture of Iran Chl and Anatolia Chl. Plus some Levant with loads of J and G.

MfA said...


I think modern Armenians are mostly from Kingdom of Sophene and west of Lake Wan stock with elites from the Ararat valley. Modern Assyrians are mainly Oshroene people.

Aram said...

Well let's hope we will see some aDNA from West and South/West of lake Van. Btw I think lake Van was a very important place in Paleolithic for understanding Iran and Anatolia Neolithic connections.

MfA said...

According to Palisto's genome mapping tool based on Dodecad k12b, Armenians mainly plot in the area I mentioned earlier. Black: Armenians, blue: Kurds, red: Assyrians.

Iran_IA plots slightly east of Hasanlu with the tool.

It's still accurate despite being based on moderns and an outdated calculator.

Shaikorth said...


We should have the means to do at least some comparisons like they did with South Asians to evaluate the algorithm's performance in other locations - comparing it to Broushaki's modeling of moderns as ancient West Eurasians + mota/yoruba/han. With the obvious limitations, won't work in South/Southeast Asia or deeper in Africa because of a lack of ancients and can't measure something like direct steppe ancestry because there's no high coverage sample.

But let's consider Karitiana. In Lazaridis merge it's 5.22% Ust-Ishim, 9.46% Han and 85.31% self copy. In the Busby merge it's 36.45% Ust-Ishim, 60.81% Han and 2.74% self copy. In this case it's obvious that U-I is standing in for the bulk of its ANE, though Han can contain a bit, and the self-copy was just drift - same thing resulting in the Native American/Amazonian component in ADMIXTURE. The problem is, Ust-Ishim isn't always standing in for ANE - Papuans have 50% in both sets and it means something very different there.

Chad said...

Extra WHG wouldn't create the affinity to Iran, vs Natufians or Levantines. It is significantly closer to Levantines than Iran. Iran isn't even significantly closer to ANE than WHG. If there was no Iran or very minor Iranian in Anatolians we wouldn't see this. Even Levant EN is around Z>3 closer to Iran than Natufians. Then, Anatolia is closer yet, with substantially less BE. This can only be explained by more Iranian than Levantines have. Again, if it were only WHG, Anatolians would be significantly further from Iran than Levantines, and not significantly closer. It's got to be a good amount to completely offset the extra WHG. It may not be 40%, but it's definitely not just a little Iran/CHG in a few Anatolians.

Open Genomes said...

David, I'm converting the files to 23andMe format, including the SNPs that are the same as Build37. From there I can create the PLINK files. The problem is that the tools either create VCF 4.2 files that aren't supported by plink (even plink 1.90) or that VCF 4.1 files only show differences from Build37. The files I'll create will also list the alleles for each set (Human Origins Array and OmniExpress) that were read but agree with the Reference Sequence too.

Just give me a bit to finish it ...

Matt said...

@ Ryu, yes, while the Chromopainter analysis may *somehow* contain more information, it's to me at least quite obscure about what they actually mean, and how to trace back to the actual model.

I haven't read through it very much "A tutorial on how (not) to over-interpret STRUCTURE/ADMIXTURE bar plots", though I have the following comments:

Looking at their K=11 ADMIXTURE simulations, I'm a bit circumspect about their comment that

"Note that these simulations were performed with 12 populations but only results for the four most relevant populations are shown."

Presumably these are other simulated pure populations, but I do think it makes the figure a little more obscure to not include them.

The objections about indistinguishable ADMIXTURE plots are reasonable, but at that stage you would incorporate more information, e.g. in the B Ghost scenario P2 would be poorly fitted while not in A, and in the C recent bottleneck scenario, P1 and P2 would both form an exact clade to outgroups (via fst, D stats, etc.), which you'd not expect if P2 was truly admixed.

In fairness I do think they get into this (though not having read the full paper), at least for the goodness of fit (they could do with discussing cladistics as well). Hence "use [of algorithms like STRUCTURE] should represent the beginning of a detailed demographic analysis, not the end".

I'm also a bit critical about their comment that "This exercise is relevant in particular because human history is in fact full of episodes in which groups such as the Bantu and the Han have used technological, cultural or military advantage or virgin territory to multiply until they make up a substantial fraction of the world’s population. The history of the world told by STRUCTURE or ADMIXTURE is thus a tale that is skewed towards populations that have grown from small numbers of founders, with the bottlenecks that that implies."

While theoretically possible, I would qualify this that I don't think they actually *know* that groups like the Bantu / Han have gone through sharp bottlenecks / founder effects that have an effect in ADMIXTURE following the invention of new technologies.

IIRC, the group we have an example of as early agriculturalists with a decent enough genome quality to recreate population size - the Anatolia Neolithic - specifically seemingly did *not* go through a bottleneck at that time and has a more relaxed population history than many others. That may be generally true of most of the populations who have been "winners" in our history.

So this to me feels very much an assumption and they could stand to be more explicit that it is. It's not enough to support this assumption with vague notions about injustice due to favouring the winners; they should work to explicitly demonstrate that it is plausible with the data we have.

Shaikorth said...

"C recent bottleneck scenario, P1 and P2 would both form an exact clade to outgroups (via fst, D stats, etc.)"

Not necessarily via fst though? Case in point the drifted Italian_South in the clustering diagrams with Lazaridis 2016 ancients you did. Outgroup to everyone or very loose clustering with Sardinians as outgroup to everyone.

Re: other stuff, I think that they were getting to Han & co being oversized populations compared to their effective population size and that's what a bottlenecked population technically is. Problem here is that populations commonly understood as bottlenecked also have other characteristics relative to those that aren't, like more long RoH and so on. Let's consider Japan and Bangladesh. These have populations of considerable size, known to be admixed, but one shows more RoH and gets its own ADMIXTURE component easily (then appearing unadmixed) while the other doesn't...

capra internetensis said...


A discussion over at Jabal al-Lughat translated a passage from the Arab geographer al-Idrisi saying that Sardinians were originally "barbarized Roman Africans".
The article "Sardinia in Arabic sources" has some interesting stuff about North African connections back and forth with Sardinia. There's certainly evidence in the Y DNA.

Matt said...

@ Shaikorth, I think you make sense of the Karitiana result in that analysis, with the Ust Ishim fraction being concordant with the expected ANE fraction, though, even within that data, if you generalize to the other Amerinds, it looks like you have odd scenarios like (using their analysis of Lazaridis data):

Chilote: Ust'-Ishim - 0.8034, Loschbour - 0.1493, Han - 0.0045, Self Copy - 0.0428
Bolivian_Pando: Ust'-Ishim - 0.6038, Han - 0.3864, Self Copy - 0.0098
Pima: Ust'-Ishim - 0.158, Han - 0.135, Self Copy - 0.707
Surui: Ust'-Ishim - 0.065, Han - 0.1119, Self Copy - 0.8231
Bolivian_LaPaz: Ust'-Ishim - 0.3336, Han - 0.6477, Self Copy - 0.0187

and then adjust for self copy fraction:

Chilote: Ust'-Ishim - 0.839, Loschbour - 0.156, Han - 0.004
Bolivian_Pando: Ust'-Ishim - 0.6098, Han - 0.3902
Pima: Ust'-Ishim - 0.539, Han - 0.460
Surui: Ust'-Ishim - 0.367, Han - 0.633
Bolivian_LaPaz: Ust'-Ishim - 0.340, Han - 0.660

No great pattern. IIRC Bolivian_Pando is not admixed with Europeans. Self copy also seems to have a loose relationship with the degree of drift in a branch (Karitiana don't have notably that much more than various of the other groups, IIRC?). Seems like Ust-Ishim can or can't proxy for non-ENA ancestry in Amerinds in this analysis, depending on the population.
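
The self-copy adjustment used above is just a renormalisation: drop the self-copy fraction and rescale what is left so it sums to 1. A minimal Python sketch, using the raw fractions quoted in this comment (the function name and the dictionary layout are only for illustration):

```python
def drop_self_copy(fractions, self_key="Self Copy"):
    """Remove the self-copy fraction and rescale the rest to sum to 1."""
    rest = {k: v for k, v in fractions.items() if k != self_key}
    total = sum(rest.values())
    if total <= 0:
        raise ValueError("nothing left after removing self copy")
    return {k: v / total for k, v in rest.items()}


# Raw fractions as quoted above (analysis of the Lazaridis data).
RAW = {
    "Chilote": {"Ust'-Ishim": 0.8034, "Loschbour": 0.1493,
                "Han": 0.0045, "Self Copy": 0.0428},
    "Bolivian_Pando": {"Ust'-Ishim": 0.6038, "Han": 0.3864, "Self Copy": 0.0098},
    "Pima": {"Ust'-Ishim": 0.158, "Han": 0.135, "Self Copy": 0.707},
    "Surui": {"Ust'-Ishim": 0.065, "Han": 0.1119, "Self Copy": 0.8231},
    "Bolivian_LaPaz": {"Ust'-Ishim": 0.3336, "Han": 0.6477, "Self Copy": 0.0187},
}

for pop, fractions in RAW.items():
    adjusted = drop_self_copy(fractions)
    print(pop, {k: round(v, 3) for k, v in adjusted.items()})
```

This reproduces the adjusted figures to within about 0.001. Note that with a self-copy fraction as large as Pima's or Surui's, the adjusted split rests on a small remainder, so it is much noisier than the same split for the Bolivians.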

Re: Not necessarily via fst though? Possibly not via fst, I think certainly via the D-stats / f4 stats though.

Shaikorth said...

Looking at the results of Indian and Native American populations, it's pretty clear that U-I can proxy for some kind of ENA too, and ANE haplotypes seem to prefer U-I over Loschbour. Lack of high coverage ANE sample hurts here. I made the Karitiana example because they're in both sets and the Busby merge result allowed inferring self-copy as drift for them. It's also true for Surui, they are clear U-I and Han split in the Busby merge and proportions approximate ANE/ENA based on formal testing. Bolivians unfortunately aren't in both sets, though they may have a bit of euro mixture (about the same as Maya).

Given the ancients available it should work best for those with no Ust-Ishim or high self-copy so we can avoid guessing. However in the case of something like Papuans I'm pretty sure the self-copy is more than drift, if the upcoming study about first OOA remnant in Sahul is to be trusted they should have a lot of pre-Ust Ishim ancestry when that's combined with Denisovan.

Samuel Andrews said...

"Even Levant EN is around Z>3 closer to Iran than Natufians. Then, Anatolia is closer yet, with substantially less BE. This can only be explained by more Iranian than Levantines have."

Natufian is less close to all Eurasians (incl. Iran_N) than Levant_N is, and Anatolia_N is closer to all Eurasians (incl. Iran_N) than Levant_N is.

IMO, future ancient genomes will surprise you and show Anatolia_N has little if any Iran_N ancestry. Looking at D-stats from modern Middle Easterners I couldn't see how anyone could have significant CHG ancestry outside of the Caucasus. Then I saw D-stats from Iran_N and Natufians, which were out of this world crazy and were an answer on how some could have significant CHG-related ancestry. We need to think outside of the box of the current ancient genomes we have.

Simon_W said...

Again, regarding Bell Beakers and R1b, what strikes me is the rather negative association between west European R1b and excess West Asian admixture (excess relative to the CHG-related part in Yamnaya). In western Europe R1b peaks among the Basques and some extreme northwestern groups like the Irish. The Basques are known to be rather un-West Asian, compared to others. The modern Irish do have additional West Asian admixture, like all IEs, but R1b was already predominant in their early Bronze Age ancestors who didn't have it. Hence it's extremely unlikely that R1b-M269 reached Iberia from West Asia.

But one comment Alberto made also made sense to me: In Iberia there is no correlation between increased steppe admixture and formerly Celtic language. I would add the same holds true for R1b. To the contrary, in Iberia R1b seems to be most common where in pre-Roman times non-IE languages were spoken. In contrast, the IEs differ from the Basques in their additional West Asian admixture. It's only a minor difference, but it seems to be real. For instance, in the new PuntDNAL K12 calculator, some English people have more than 9% additional Iranian Neolithic-like ancestry, in addition to the steppe stuff. That's probably just a random deviation, because the English average is lower, but the pattern is persistent, and for example my south German + 1/4 Swiss grandmother scored strong southeast European scores in several tests. I'm tempted to think that Celtic started to spread after R1b did, in the Bronze Age, from southeastern Europe, where steppe influence had mixed with Natufian- and Iran_N-related influence that had reached the area after 2600/2500 BC.

Davidski said...

I'm losing my nerve and patience here trying to make everything fit in this test. So I won't, because it's not possible.

I'll run as many samples as I can later today/early tomorrow with the files I have now and post all the relevant data for discussion, but for now here are a few observations.

- I can't push up the level of the Basal-rich component in the Steppe EMBA samples much beyond 16% without blowing out the AG3-MA1 component in Amerindians above 50%

- Natufians and Levant Neolithic samples don't show Sub-Saharan admixture above noise levels, but the Bedouins do, albeit at only 2-3%

- there's nothing I can do to make the Anatolian and Levant farmers show any of the AG3-MA1 component

- conversely, there's nothing I can do to make the Iranian farmers show any of the Villabruna component

- along with the findings in the Lazaridis et al. paper, Iran Hotu is different from the Iranian farmers, and I suspect it has Central Asian admixture, perhaps related to Ancestral South Indians (ASI)

Unknown said...

I am also running out of patience trying to form separate Iran N and CHG clusters. I have tried all kinds of K supervised and unsupervised, with various combinations of references.

This is a shortcoming of ADMIXTURE, which is very sensitive to recent drift. It works fine with moderns, assuming adequate sample sizes and a good genotype rate for the run.

With ancients thrown in, things change, and one finds oneself wrestling to have reasonable clusters form around ancients, and have reasonable mixture proportions for the test samples.

I think there is a 50% chance that with the release of additional ancient near eastern genomes from the recent papers, over the next couple of weeks, it will become easier to form those types of clusters. Then again ADMIXTURE does not perform well with ancient/modern mixes.

Davidski said...

I don't think it'll be possible to get both Neolithic Iranian and CHG clusters in ADMIXTURE, unless we have at least several high quality samples from each grouping that share ethnic-specific drift.

But that won't help in properly capturing the relevant ancient components.

The early Neolithic Iranians and CHG look like populations on almost the same cline. The main differences are that CHG has less Basal and more Villabruna affinity/admixture. Also, considering the really high level of something that looks very close to AG3/MA1 in the Iranian farmers, I don't see the need for an Iranian Neolithic cluster at the same time as an AG3-MA1 cluster.

As I said above, Iran_Hotu does look distinct from the early Neolithic farmers, but like CHG, it's probably on basically the same cline as CHG and the farmers, apart probably from more Central/South Asian forager affinity/admixture.

Davidski said...

I should also add that Anatolia_N, Levant_N and the Natufians look like populations on the same cline.

This doesn't appear to be a coincidence. I can't see them having different components, at least not above a few per cent.
