Wednesday, April 12, 2017

Population geneticists often not very good at population genetics


An abstract book from a recent mathematics meeting in Estonia includes an abstract on the genetic impact of Bronze Age steppe pastoralists on Europe and South Asia. Titled A Pre-Existing Isolation by Distance Gradient in West Eurasia May Partly Account for the Observed “Steppe” Component in Europe, it's mostly authored by scientists from the Estonian Biocentre including Luca Pagani and Mait Metspalu. You can read it here.

Even though it's just an abstract of a paper that might never be published, it's so obviously wrong that I can't let it go. This is the sort of thing I'd expect to see from some of the half deranged visitors in the comments section at this blog, not scientists from the Estonian Biocentre.

First of all, even though the abstract doesn't spell out which data crunching algorithms were used by the authors, it's pretty clear to me that the main part of their analysis was run with ADMIXTURE. That basically makes it a pointless exercise from the outset, simply because ADMIXTURE is not designed for these types of analyses.

Why? Because it's impossible to accurately recapitulate ancient population structure with ADMIXTURE; the results are always significantly skewed in some way, usually by heavy genetic drift in one or more of the test populations. In other words, there's no way to truly revive ancient populations with ADMIXTURE components. And if you can't do that, then how can you estimate their impact more or less accurately? Not possible.

In any case, whether the authors relied on ADMIXTURE or not is immaterial to the fact that all of their main points are clearly wrong. Before I go through these points, and explain why they're wrong, I need to explain exactly what the Steppe component really is and isn't.

The Steppe component is the genetic structure of Early and Middle Bronze Age (EMBA) steppe pastoralist groups Afanasievo, Poltavka and Yamnaya. And it's a very specific thing. It isn't a component inferred from a random run of ADMIXTURE that peaks in Afanasievo, Poltavka and/or Yamnaya, or any other ancient populations.

So, Steppe component = Afanasievo, Poltavka and Yamnaya, or Steppe_EMBA. Nothing more, nothing less. Certainly nothing from outside of the steppe predating Afanasievo and Yamnaya.

Keep in mind also that Steppe_EMBA is a very specific mixture of older and contemporaneous populations. Using the formal-statistics-based qpAdm method, which models ancestry directly based on f4-statistics, Steppe_EMBA is probably best modeled as a mixture of Eastern European Hunter-Gatherers (EHG), Caucasus Hunter-Gatherers (CHG), and Anatolia Chalcolithic (Anatolia_ChL), with ancestry proportions of around 0.453, 0.453 and 0.094, respectively. See here.
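
To make the arithmetic of such a mixture model concrete, here's a minimal Python sketch. It isn't qpAdm, which fits the proportions from f4-statistics. It only shows what the fitted proportions mean under a simple linear admixture model, where the expected allele frequency in the mixed population is the weighted average of the source frequencies. The proportions are the ones quoted above; the allele frequencies and the function name are made up for illustration.

```python
# Ancestry proportions quoted in the post for Steppe_EMBA (qpAdm estimate).
PROPORTIONS = {"EHG": 0.453, "CHG": 0.453, "Anatolia_ChL": 0.094}

# Hypothetical allele frequencies at three SNPs, for illustration only.
TOY_FREQS = {
    "EHG": [0.10, 0.80, 0.50],
    "CHG": [0.30, 0.20, 0.50],
    "Anatolia_ChL": [0.90, 0.40, 0.50],
}


def expected_mixture_freqs(proportions, source_freqs):
    """Return per-SNP allele frequencies expected under linear admixture."""
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"proportions sum to {total}, expected 1")
    n_snps = len(next(iter(source_freqs.values())))
    return [
        sum(proportions[pop] * source_freqs[pop][i] for pop in proportions)
        for i in range(n_snps)
    ]


mixed = expected_mixture_freqs(PROPORTIONS, TOY_FREQS)
print([round(f, 4) for f in mixed])
```

Where all three sources share the same frequency, as at the third toy SNP, the mixture has that frequency too, whatever the proportions are.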

I believe that in this model Anatolia_ChL represents some type of minor western admixture amongst the close relatives of CHG still living in the Caucasus during the Eneolithic/Early Bronze Age, and/or minor gene flow from the Balkans onto the steppe. But that's a topic for another day, perhaps after the release of the Bell Beaker behemoth?

Below is a visual representation of the model, using a typical Principal Component Analysis (PCA) of Western Eurasian population structure. Note the tight cluster formed by the Steppe_EMBA groups and individuals, which is easily differentiated from all ancient populations outside of the steppe, except, importantly, Corded Ware.


Thus, since I know exactly what the Steppe component is and isn't, I can test for admixture from it and its ancestral components as best I can using qpAdm. Below are results for a few pertinent ancient populations (no idea how to model the farmers from Early Neolithic Iran at this stage, but I've already underlined their unique genetic character here and have no reason to believe that they're responsible for any part of the Steppe_EMBA signal in Europe or South Asia). If you're wondering why I chose Hungary_HG as the potential Western Hunter-Gatherer source, it's because it provided the best statistical fits overall. Also note that Ukraine_HG/N is based on samples from the Pontic Steppe.

CHG

Germany_MN 1
Germany_MN 2

Iran_ChL 1
Iran_ChL 2

Karelia_HG 1
Karelia_HG 2

Latvia_HG 1
Latvia_HG 2

Latvia_MN 1
Latvia_MN 2

Ukraine_HG/N 1
Ukraine_HG/N 2

The models involving Steppe_EMBA and CHG are almost always worse than the best models without them. As far as I can see, there's no strong evidence here of any mixture from a population even similar to Steppe_EMBA in any of these groups, except perhaps Ukraine_HG/N.

However, qpAdm results are dependent on the choice of pright and pleft populations (outgroups and potential mixture sources, respectively). Therefore, with different pright and pleft populations it might be possible to model all of the above groups with significant Steppe_EMBA admixture.
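
The reason the choice matters is easier to see from how the underlying statistic is defined. As a rough sketch, and not the ADMIXTOOLS implementation, an f4-statistic for populations A, B, C and D is the average over SNPs of (a - b)(c - d), where the lowercase letters are allele frequencies. Swap the populations and the numbers that qpAdm has to fit change. The frequencies and population labels below are hypothetical toy values.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Average of (a - b) * (c - d) over SNPs, from per-SNP allele frequencies."""
    lengths = {len(freq_a), len(freq_b), len(freq_c), len(freq_d)}
    if len(lengths) != 1:
        raise ValueError("all populations need the same number of SNPs")
    products = [
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    ]
    return sum(products) / len(products)


# Hypothetical allele frequencies at two SNPs, for illustration only.
pop_a = [0.1, 0.8]
pop_b = [0.3, 0.2]
pop_c = [0.5, 0.5]
pop_d = [0.4, 0.9]

value = f4(pop_a, pop_b, pop_c, pop_d)
print(round(value, 4))
```

Swapping the first two populations flips the sign of the statistic, which is one simple way to see that the result depends on which populations are put where.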

But of course there are other tests that I can run to double check my qpAdm models, such as the West Eurasian PCA. And clearly, the PCA basically supports the qpAdm results, with none of the test groups showing much, if any, deviation towards Steppe_EMBA or CHG from their main mixture clines.


So now let's take a look at the key points made in the abstract and why they're so far off the mark:

However ancient DNA samples from East European and Caucasian Hunter-Gatherers as well as from Early Iranian Neolithic, dating from before the Yamnaya expansion, already show signs of this so called “Steppe” component (Lazaridis et al. 2016).

There's no persuasive evidence for this; see my qpAdm and PCA models above for CHG and various Eastern European Hunter-Gatherer groups. As for the Early Neolithic farmers from Iran, there are no formal models that really make sense for them; we probably don't yet have old enough Near Eastern genomes to serve as potential mixture sources. But the idea that they're somehow interchangeable with Steppe_EMBA is patently idiotic.

Such an observation is compatible with the presence of a pre-existing genetic gradient ranging from Caucasus/Iran all the way to Europe, which likely formed through isolation by distance over thousands of years.

It's not. Isolation by distance has nothing to do with it, because there's no persuasive evidence for the existence of Steppe_EMBA ancestry, or even anything similar, outside of the steppe until the Late Neolithic/Early Bronze Age (LNBA). All of the evidence available to date points to a sudden, massive and perhaps even violent explosion of Steppe_EMBA peoples deep into Europe and also across much of Asia during the LNBA.

Here we show that such a gradient, defined as decrease of "steppe” component with distance from Iran, can be inferred from ancient samples pre-dating the Yamnaya expansion (r^2 = 0.93).

Not possible, because, as I've just pointed out, pre-Bronze Age samples from Iran (Iran_ChL) do not show strong evidence of Steppe_EMBA ancestry, aka the Steppe component.
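
For anyone wondering what the r^2 in that claim measures: it's the squared Pearson correlation between the inferred component share and distance. A minimal Python sketch follows. The distances and shares are invented for illustration and are not taken from the abstract, and the function name is my own. Note that a high r^2 only says the numbers line up along a gradient, not what the component being measured actually is.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: distance from a reference point (km) and the share of
# an ADMIXTURE-style component in each sample. Not real results.
distance_km = [0, 500, 1500, 2500, 3500]
component_share = [0.70, 0.62, 0.45, 0.30, 0.12]

r2 = r_squared(distance_km, component_share)
print(round(r2, 3))
```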

When analysed in the light of this gradient, later ancient and modern samples from Europe still display an excess of Steppe component, however this excess is less pronounced than previously estimated.

Horseshit. Nothing's changed.

Additionally we found that, of the analysed samples, modern South Asians show the highest excess of “steppe” component, pointing to the documented, recent links between the Caucasus/Iran populations and the South Asian peninsula.

No, you're conflating Steppe_EMBA ancestry with Neolithic ancestry from what is now Iran because you don't know how to differentiate them. But this has already been done many times over on this blog and also in the scientific literature.

...

By the way, Iosif Lazaridis made a couple of observations related to the Pagani et al. abstract on Twitter. See here and here.

I suspect P10: http://www.karger.com/Article/PDF/469638 … conflates Caucasus/Iran-component with "steppe" ancestry 1/n

Steppe ancestry brought into mainland Europe post-5kya was a mix of Caucasus/Iran-component (Basal Eurasian-rich) with ANE/EHG-component 2/n

See also...

Globular Amphora people starkly different from Yamnaya people

127 comments:

Ryan said...

This actually seems perfectly reasonable, and consistent with what's been discussed on this blog - namely that some ANE influence in Europe predates the IE expansion (Villabruna, KO-1 and their descendants).

Davidski said...

It's bullshit.

ANE is not the steppe component.

Palacista said...

A gradient from Iran?

truth said...

Seems like they got it backwards. It's not EHG and CHG who have Steppe, but the other way around.

pnuadha said...

If the gradient is from Iran (modern day?) then it sounds like they are relabeling steppe to be Iranian bronze age/CHG.

That makes no sense. The "Steppe" component is literally a designation for the people living on the steppe between the neolithic and bronze age. We have not found this component, (.5)EHG/(.5)CHG, anywhere prior to its existence on the steppe. This component has never been prominent in Iran so Iran cannot be a source for this "steppe" component. There is good indication that the "steppe" component developed locally since the steppe gradually went from EHG to (.75)EHG/(.25)CHG to (.5)EHG/(.5)CHG or "steppe" for shorthand. It makes sense that the "steppe" component would develop on the steppe since the steppe was at the border of EHG and CHG.

Now, it would not surprise me if the "steppe" component developed in other places like Romania or Belarus but it pretty much needs to be outside of the middle east and india since neither of these places have ever had sufficient EHG.

So the paper is misusing the term for the "steppe" component. I don't think laz' conclusion, which is that north europeans harbor about 50% "steppe" which came directly from the steppe or near to it. What this paper might try to do is claim that there is a deeper source of population movements which started in Iran and that this population truly ties the indo europeans together. If that is what the authors are trying to prove then they should rewrite their abstract since it sounds like they are disputing laz' claim that north europeans have about half of their heritage come from bronze age steppe groups or close to them and south europeans have around a quarter of their ancestry from such groups.

pnuadha said...

That comment was long.

I have two questions. Are they disputing laz' conclusion that post neolithic steppe people (or a people close in proximity and time) migrated into central europe and drastically changed the demographics? Or are they just disputing the idea that indo european migrations started from the steppe?

Davidski said...

I'll spell out all of their claims and mistakes in a very detailed post with graphs and plots and so on later today or tomorrow.

Al Bundy said...

The Haak paper didn't claim 75, did it? From memory it was around 50 percent in Northern and Central Europe. Maybe they mean the Corded Ware samples or something. Looks like a pretty miserable job. I'm definitely interested in seeing Davidski's critique.

Al Bundy said...

It seems like, based on the abstract, they're leaving Indo-European alone and are just concerned with the steppe genetic impact.

Rob said...

How can one critique an abstract ?

Al Bundy said...

hmm good point but what do you make of what they conclude?

Davidski said...

How can one critique an abstract?

Very easily when it clearly has nothing to do with reality.

TruthPrevails said...

That is on the lines of what a few of us have been saying all along.

Good paper.

Davidski said...

That is on the lines of what a few of us have been saying all along.

Yes, because you're mentally unhinged, but what excuse do the scientists at the Estonian Biocentre have?

TruthPrevails said...

Haha.. you are funny for sure.


Davidski said...

Yes, and I'm sure Luca Pagani et al. will find me very funny later today.

Gioiello said...

There will be an "Italian Conspiracy" seen "Pagani" (even though this surname may belong also to other ethnicity)?

Davidski said...

There's no conspiracy. Luca Pagani just doesn't understand the basics here. I'm going to get some crayons out later today, draw a few pictures showing the basics, and send them over to him.

Nirjhar007 said...

Finally ! some scientific attitude and sense! .

Rami said...

What is the percentage range of CHG in Yamnaya people, David? And in Khvalynsk as well?

Nirjhar007 said...

Though the abstract's main suggestion is IMO in the right direction, we need to wait for the full paper. Uh, one thing: those people are no novices on the subject! But that doesn't mean they will always be wrong or right. We need the details first.

Olympus Mons said...

Yes. Like I have always said... there are those papers that reinforce my view, so great professionals and sooo smart and knowledgeable. The ones that challenge my views... just idiots that don't get it. Does it get more lunatic and psychiatric than this?

Olympus Mons said...

Like any new field in science so will this have its tenets broken apart and rewritten dozens of times before anything is as solid as to allow the amount of crockery we see here from steppe addicts. I am loving this 2017..the year steppe would rule over all others...eheheh. just loving it.

Matt said...

Weird stuff. I hope they've done something more interesting than run ADMIXTURE and declare that a "Steppe" population was ancestral to EHG, CHG, Iran_N, etc, ignoring the substance of the papers they cite.

@ Al Bundy, yes they are talking about Corded Ware ("initial impact"), and that finding (CWG=approx 75% steppe) is robust to several methods of estimation (e.g. also found by the Fst distances).

Karl_K said...

It also turns out that there is an ancient gradient of 'Native American Component' that runs from Southeast Asia to Northwest Europe.

Karl_K said...

It also turns out that there is an ancient gradient of 'African American Component' that runs from West Africa to Eastern Europe.

Ric Hern said...

Whahahaha !!!

Nirjhar007 said...

They are not saying anything new, people knew it from before. But the technical details of how they obtained it will be most welcome.

Karl_K said...

After further analysis, the 'African American Component' was shown to be highest in the Caribbean Islands, which is likely where the population originated before it admixed extensively into European and African populations starting at least 200,000 years ago.

As further evidence, the A00 Y-haplogroup from a Carolina man places the root of all men somewhere in Southeast North America.

Nirjhar007 said...

Cool!, submit for next years meeting .

Alberto said...

Yes, we won't see anything new here. I guess it's kind of based on the idea that with a Basal Eurasian sample that matched the BEA in Iran/Caucasus, we could model Iran_N and CHG as Basal Eurasian + 60-70% Yamnaya, just like we can model Iran_Hotu as Iran_N + 20% Yamnaya, or also with available samples we can model Armenia_ChL as some 30% Yamnaya, even though all these samples predate Yamnaya by thousands of years.

At the same time, we can model Lithuanian as 50% Yamnaya without using Baltic Hunter-Gatherers or it can be reduced to much less (20-30%?) with Latvia_HG.

What these geneticists need to understand is that we need new samples, especially from yet not sampled areas. Reich said there are over 1000 samples they've sequenced but have no time to publish. So all those geneticists should start making an effort to get those published, and if necessary just publish the genomes and let the blogosphere/fora do the analysis for them (which is going to be better in many cases).

Even for relatively well sampled areas like Europe it's still hard to get the fine details. We can model Bell Beakers from Germany as 50% Yamnaya, but without good sampling from Bulgaria, Romania, Poland, etc... it's still quite a rough model lacking any subtle details.

And for Asia it's much worse. So bring those samples soon, because right now, modeling S-C Asians as 40-50% Yamnaya is pretty much useless, and a source of needless speculations and fights over the prehistory of the region.

So yes, we don't need these kind of papers. Skip them and bring new and relevant samples. It's about time.

*End of rant*

Nirjhar007 said...

Wished all the rants were neat like that ;) .

postneo said...

@karl_k
"I work with many highly intelligent people from many places in South Asia. And, as I find history and ancient genomics fascinating, over the years I have had many discussions about these topics of steppe immigration, the indus valley culture, the neolithic, the mesolithic, etc."

Gee karl k, thanks for your prissy condescension. It's sorely needed by 1.2 billion people. Are the steppe believers the intelligentest of them all?

@mahaloeveryone
The steppe hypothesis may turn out to be true, but it's not there yet. But you on the other hand sound like an imbecile with your hindutva bullshit.

Karl_K said...

@postneo

If you actually read what I wrote, you will find that I never said that any of them were 'steppe believers'. I only said that they didn't care, hadn't heard that a controversy even existed, and were willing to accept the facts when they come out.

Whereas if you were to only listen to the highly intelligent comments on this and several other genetics blogs and forums, you might start to believe that all 1.2 billion South Asians are quite caught up in some kind of nationalistic conspiracy theories.

I'm not sure how I was being condescending, as I was only talking about specific people that I know quite well, but I apologize if it came off that way. My bad.

postneo said...

no problem Karl, sorry for the rant, it's statistically impossible for such a population to be on conspiracy theories.

Karl_K said...

@postneo

I think you are incorrect. As an example, I have read, but have not verified myself, that close to 50% of the USA population will believe most conspiracy theories delivered by 'authority figures' (vaccines cause autism, global warming is a hoax, etc.)

I have asked my US friends about these seemingly fringe ideas, and they all say that they are aware and have friends and family who follow and agree with this kind of thing.

In contrast, no South Asians (with probably a sample size of ~100) have ever told me that they knew anything about a major controversy involving Indo-Europeans and South Asia.

My (perhaps incorrect) conclusion was that >99% of South Asians are not caring about, or knowing about, this 'big controversy'.

bellbeakerblogger said...

It sounds like they will arbitrarily redefine what the 'Steppe' component is by Equivocation. It is a specific profile at a specific place and at a specific moment in time.

Because of this, Steppe is not other things, for example, CHG. Steppe is also not other configurations of its constituent populations. They need to use the term as it was used in Haak, 2015 if they are going to argue against the Haak conclusions.


Samuel Andrews said...

@Karl_K,
"My (perhaps incorrect) conclusion was that >99% of South Asians are not caring about, or knowing about, this 'big controversy'."

Or they just never get the chance to become aware of this controversy.

Karl_K said...

@Samuel Andrews

True. Several of my colleagues are from small villages without internet access. But they all eventually went to a university, and most have a PhD.

Can anyone provide any polls or whatever about this? It all sounds like a very minor thing.

Azarov Dmitry said...

Ha-ha-ha-ha. Steppe fun boys should prepare themselves for bigger surprises. As I said many times before R1 folks came from the Iranian Plateau.

Karl_K said...

@Azarov Dmitry

Well, perhaps "R1 folks came from the Iranian Plateau" at some point in time. But implying that 'R1 folks' didn't then leave, go to the steppe, and then to Europe, and then all across Asia, and then (perhaps back) into South Asia is in contrast to a lot of data.

Timing matters a lot, especially in terms of modern human genetics.

Davidski said...

Extremely unlikely that R1 folks came from the Iranian Plateau. They're originally northern folks. Iran is in the south.

TruthPrevails said...

Extremely unlikely that R1 folks came from the Iranian Plateau. They're originally northern folks. Iran is in the south.

First humans were dropped in the Arctic by Aliens and they started coming down to all the places in the world :).

It looks like Davidski thinks along the lines of Ancient Aliens.

Unknown said...

Mr. Davidski, why don't you publish your critique of this work using your full name, just as the authors of the paper are using theirs? It seems only fair to me.

Anonymous said...

@Alberto

I second that rant.

Carlos Aramayo said...

Sorry for changing the topic, just for a moment, but I want to let you know of a new paper's full draft, "The maternal genetic make-up of the Iberian Peninsula between the Neolithic and the Early Bronze Age"

https://www.academia.edu/31588469/The_maternal_genetic_make-up_of_the_Iberian_Peninsula_between_the_Neolithic_and_the_Early_Bronze_Age

Abstract
Agriculture first reached the Iberian Peninsula around 5700 BCE. However, little is known about the genetic structure and changes of prehistoric populations in different geographic areas of Iberia. In our study, we focused on the maternal genetic makeup of the Neolithic (~ 5500-3000 BCE), Chalcolithic (~ 3000-2200 BCE) and Early Bronze Age (~ 2200-1500 BCE). We report ancient mitochondrial DNA results of 213 individuals (151 HVS-I sequences) from the northeast, middle Ebro Valley, central, southeast and southwest regions and thus on the largest archaeogenetic dataset from the Peninsula to date. Similar to other parts of Europe, we observe a discontinuity between hunter-gatherers and the first farmers of the Neolithic, however the genetic contribution of hunter-gatherers is generally higher and varies regionally, being most pronounced in the inland middle Ebro Valley and in southwest Iberia. During the subsequent periods, we detect regional continuity of Early Neolithic lineages across Iberia, parallel to an increase of hunter-gatherer genetic ancestry. In contrast to ancient DNA findings from Central Europe, we do not observe a major turnover in the mtDNA record of the Iberian Late Chalcolithic and Early Bronze Age, suggesting that the population history of the Iberian Peninsula is distinct in character.

batman said...

"A Pre-Existing Isolation by Distance Gradient in West Eurasia May Partly Account for the Observed “Steppe” Component in Europe"

Didn't I mention this possibility, somehow - already?!

postneo said...

@karl K
"I think you are incorrect. As an example I read, but have not verified myself, close to 50% of the USA population will believe most conspiracy theories delivered by 'authority figures' (vaccines cause autism, global warming is a hoax, etc.)"

Carl Sagan once cautioned about superstition in the US, perhaps his concern was exaggerated. The US is a bit more susceptible to fake news than Europe and the rest of the world. They are insulated and under exposed. I live here and can see that. People say things like Assad is the president of Eyeran and so forth...

As for south asia there is no overarching centralized authority or source of conspiracy theories. At most there is family intrigue. Billions of rumours live and level off quickly. As for Steppe/non steppe stuff hardly anyone has heard of it, you would have to pinch people to arouse interest. That's true of the rest of the world as well.

The hyper awareness of readers of this blog is an anomaly.

batman said...

Davidski,

"Extremely unlikely that R1 folks came from the Iranian Plateau. They're originally northern folks. Iran is in the south."

It MAY be that most of the ancient Iranian males - starting the Mesopotamian Mesolithic and the Tauric Neolithic - had an arctic origin, too. Just as the first known populations of the Mediterranean archipelagos - as well as its northern, 'semi-arctic' biosphere.

From the oldest dna we find y-G2 all along the 40-45th parallel, from Spain to Iraq. Then there's one ancient G1 from today's Iran.

If the founders of the 'Minoan civilization' were of the same G-dynasty as Ötzi, the first Iranian/Sumerian civilization seems to be a 'branch' of the same, northern y-dna.

Unknown said...

The only people I know concerned about India and the Steppe Hypothesis are the Estonian genetics research group, Michael Witzel--Sanskrit scholar at Harvard--and Steve Farmer--A Vedic yoga nut! I've known literally hundreds of people from India, none of whom is concerned about, or even has heard of, the controversy.

Davidski said...

You're living in la la land. It's not very difficult now to show that AIT is grounded in scientific reality.

TruthPrevails said...

Wishful thinking...

TruthPrevails said...

You are silly, why did you delete that note about witzel? it is from wiki, so well publicized.

Unless you are part of the group, it shouldnt affect you.

Davidski said...

We don't discuss people or their suspected motives here. We discuss data and related stuff.

See that's your problem; not enough focus on the data. No wonder you think that obvious conclusions based on hard science are "wishful thinking".

TruthPrevails said...

Btw, I just now read your reply to the Estonian Biocentre paper.

It looks like you need some formal qualification in microbiology and statistics, with that reply of yours. Looks very amateur.

What they are saying is steppe folks were already admixed to the hilt, they might have been providing a unique fresh genetic input to Euro zone, but they were already admixed before they came on to the steppe, so what was unique to the european side of the equation was already present in the south asia zone, before steppe habitation and massive expansion into europe.

All of this perfectly matches with what NatGeo is putting out as population references for their project. Except for a few guys who are in denial mode, everything looks perfect.

TruthPrevails said...

But since the steppe component, (though it should not be called steppe component but South and West Asia component) is already present in South and West Asia, when you do your admix runs it looks like the steppe expansion also happened into South West Asia as well as Europe, when in reality the expansion was only into Europe.

You see the problem there, and that is precisely why more aDNA is needed from S&W Asia.

Davidski said...

What they are saying is steppe folks were already admixed to the hilt, they might have been providing a unique fresh genetic input to Euro zone, but they were already admixed before they came on to the steppe, so what was unique to the european side of the equation was already present in the south asia zone, before steppe habitation and massive expansion into europe.

Yeah, but I totally debunked this.

First I broke up Steppe_EMBA into its ancestral components and then I tested for admixture from Steppe_EMBA and its main ancestral components in ancient populations from around the steppe.

I didn't find any persuasive evidence of admixture from Steppe_EMBA or its main ancestral groups in any of the groups apart from maybe Ukraine_HG/N.

I also pointed out that Steppe_EMBA can't be conflated with ancient ancestry from Iran in Europe or South Asia.

You should know this if you read my post...?

Davidski said...

But since the steppe component, (though it should not be called steppe component but South and West Asia component) is already present in South and West Asia, when you do your admix runs it looks like the steppe expansion also happened into South West Asia as well as Europe, when in reality the expansion was only into Europe.

Only when Pagani does it.

That's not a persuasive argument.

Arza said...

minor gene flow from the Balkans onto the steppe
Let me guess... Z2103?

But what does the line going from Iran Neo through CHG to WHG show?

Davidski said...

Let me guess... Z2103?

No, farmer ancestry via females. So, for example, mtDNA H2a2.

Z2103 arrived in the Balkans from the steppe.

But what does the line going from Iran Neo through CHG to WHG show?

Main CHG mixture cline.

Davidski said...

Two tweets on the abstract from Iosif Lazaridis.

https://twitter.com/iosif_lazaridis/status/852277969498378244

https://twitter.com/iosif_lazaridis/status/852278507900215296

P Piranha said...

Even if those favoring steppe movements into India are proven wrong by aDNA, there is still no reason to blame them for their conclusions at this stage.

Let's look at all the data. The oldest case of R1a-z93, which the vast majority of Asian R1a, including South Asian R1a, falls under, comes from the Poltavka Outlier sample, 4925-4536 BP, from the Pontic Steppe--probably a migrant from other areas of Europe, as it differs from other Poltavka samples in Y-haplogroup and by having what people here term EEF (Early European Farmer) ancestry. The Poltavka samples resemble Yamnaya and do not carry such ancestry. It is positive for one downstream marker, z94, implying that z93 arose quite shortly before among its ancestors. This matches very precisely the estimated date of divergence of z93, ~5000 years ago, from Underhill et al 2014 and from Yfull. This sample does not possess either Iran Neolithic ancestry or South Asian ancestry (ASI); it is lower in CHG than the Poltavka samples. Therefore we have one sample of z93 from Eastern Europe several generations after its estimated date of origin that possesses no ancestry from South or Central Asia.

Poltavka outlier is a mix of Yamnaya ancestry and Early European Farmer ancestry, resembling the Corded Ware samples, which occur just before it temporally and beside it geographically. One individual from Esperstedt Corded Ware is R1a1a1-m417, the parent haplogroup of z93.

Then we have I0419/SVP27 from Potapovka culture - 2200-1900 BC, RISE392 from Sintashta culture, also at 2126-1896 BC, and then several from the Srubnaya culture. All of these samples are z93+, none have Iran Neolithic or ASI ancestry, and all demonstrate whole or partial genetic continuity with the prior samples.

From Metspalu et al 2011, the ASI component appears, in fact, to be slightly older in populations of Uttar Pradesh than in South Indian Dravidians. One of David's articles contains a slide from Johannes Krause which seems to show that the Indus Valley genomes are very high in ASI, more than half; given that these researchers (Haak is with Krause in Max Planck) have so many unpublished results I'm inclined to trust them. The Iran Hotu genomes seem to possess a trace of eastern ancestry. All this points to an old provenance for ASI in northern parts of the Subcontinent and Southern Central Asia.

The Iranians show that the major genetic change after the Chalcolithic in Iran is an increase in ancestry from Sintashta, Andronovo or Scythian-type peoples. So at least one group of Indo-Iranian speakers have certainly received a pulse of admixture from those steppe groups which carry early R1a-z93.

P Piranha said...


The comparisons between high and low caste populations in India using Matt's method and formal statistics requested by RK also show that high castes are distinguished by strong affinity to the Sintashta and Andronovo populations that Kurganist linguists have postulated were the original Indo-Aryan speakers, and which carry R1a-z93.

Jaydeep has offered well-crafted arguments that there was a mesolithic introgression of West Eurasian populations into India, or in the opposite direction from the Indus Valley to the Caucasus and West Asia. However, there is no evidence thus far, either from autosomal genetics or haploid markers, for more recent genetic expansions from India, even if there was a CHG expansion in the Mesolithic. Metspalu does not find a recent, non-Mesolithic contribution of Southern genomes to the Caucasus and West Asia. So there is no evidence right now of post-Mesolithic movements of Indian populations into the Steppe. Since the time depth of Indo-European is not mesolithic, even if Indo-European was spoken by CHG peoples, they diverged from Indians after the glacial melt and South or South-Central Asia could not be the Indo-European homeland.

It is still possible that the samples from the Indus Valley may falsify the AIT. But until then, the picture seems, overwhelmingly, to suggest the AIT. Accusations of bias directed at Davidski and the other commenters as regards the interpretation of the evidence available now seem to me to be quite unfounded.

TruthPrevails said...


I think you are cherry picking results to build your house of cards.

Everyone agrees on one thing archaeologically and genetically, that the spread of R1a and Indo-European languages into Europe makes sense, but it totally fails to explain the South Asian expansion.

And regarding the genetic studies I think you are totally confused, because all genetic studies have pointed out that there was genetic input 5000 years ago and recently again 2000 years ago. That totally does not match the Aryan migration theory structure and dates put forth for South Asia. But it does attest to the Near East trade in the Bronze Age, and Indian genetic input has been found in Bronze and Iron Age aDNA from the Mesopotamian region, so there is every possibility of a two-way genetic exchange.

"high castes are distinguished by strong affinity to the Sintashta and Andronovo populations that Kurganist linguists have postulated were the original Indo-Aryan speakers, and which carry R1a-z93."


Affinity can be shown between all the extant populations of the world depending on how you craft it, except pure Africans, because after them everyone is admixed in some form or other. And if you read more, you will understand Sintashta is connected to IVC.


"It is still possible that the samples from the Indus Valley may falsify the AIT. But until then, the picture seems, overwhelmingly, to suggest the AIT."


AIT is already falsified. You are probably living a decade back; the AIT folks have changed their tune to migration in the past decade due to lack of evidence.


So yes, we don't know what the IVC aDNA is going to be, but it has the potential to prove the autochthonous origin of languages and daughter clades of R1 which is rooted in South Asia.

And by the way you need to study in detail the history of Indo-Mediterranean region which was a busy trading and economic corridor
to understand why there was a higher possibility of a complex language to evolve there, which can not be of nomadics.


PS: No AIT/AMT and davidski has no connection, so there is no reason for me to accuse him of anything.

Slumbery said...

TruthPrevails

"to understand why there was a higher possibility of a complex language to evolve there, which can not be of nomadics"

Are you implying that IE languages are specially complex compared to those non-IE languages that developed outside of your favored region? Interesting...

Also, mixture date estimates based on modern DNA data are proven to be very unreliable, especially if multiple mixtures are layered and there are no reliable modern proxies for the original sources (and here I am already assuming that you have not just made up your claim about this 5000 - 2000 years pair in "all" studies).

Anonymous said...

Epoch has already seconded Alberto's justified critique, therefore I third it.

Palacista said...

@TruthPrevails

"And by the way you need to study in detail the history of Indo-Mediterranean region which was a busy trading and economic corridor to understand why there was a higher possibility of a complex language to evolve there, which can not be of nomadics"


On the other hand this comment shows that you are at best innocent of an understanding of the disciplines of historical and comparative linguistics.

Ric Hern said...

Regarding R1 Y-DNA. It is as if some people never heard of Mal'ta Buret in Southern Siberia +-26 000 years ago which was R*.

So taking into account the desertification of Central Asia during the Last Glacial Maximum it is hard to see a migration from the South to the North during this period. So Haplogroup R and its immediate ancestors were trapped in Southern Siberia for a while. Deglaciation only started from about 20 000 years ago and within an estimated 2000 years after that R1b formed.

This tells me that R1 was firmly established in Siberia before Deglaciation commenced and those people firmly adapted to the Northern Climate and its challenges during this isolation.

The Steppe Tundra environment stretched from France to Siberia, so it is not unlikely that R1 people spread as far as Europe during the LGM or just before Deglaciation caused the Volga River to form a Massive Delta, which would have been very difficult to cross taking into account the technology of the day. This is why we see some settlements in Ukraine and the Southern Urals at around +-18 000 years ago and eventually Villabruna R1bs in Northeast Italy +-12 000 years ago.

Matt said...

Davidski: Why? Because it's impossible to accurately recapitulate ancient population structure with ADMIXTURE; the results are always significantly skewed in some way, usually by heavy genetic drift in one or more of the test populations. In other words, there's no way to truly revive ancient populations with ADMIXTURE components. And if you can't do that, then how can you estimate their impact more or less accurately? Not possible.

Just on this point, thinking about this since the latest Goldberg counter-paper I would offer a qualification as I understand it.

It may not be that ADMIXTURE *can't* revive the ancient populations, as such. In theory.

More that at the moment, and for the foreseeable future, we just don't have the required sample sizes and coverage needed for ADMIXTURE to do so. Goldberg's latest (her counter against Lazaridis's failure to replicate her results) has some discussion of this (and the authors of this abstract should read it, if they are relying on ADMIXTURE methods).

This isn't like modern populations where they can just sample more (individuals and coverage), and then adjust the algorithm to suit their larger panel. We have, what, 5 EHGs at the moment, to varying levels of coverage? If we had 50 at a good level of coverage, then these problems with using ADMIXTURE might go away, and it might robustly infer the component. There just aren't enough samples and coverage for ADMIXTURE to say whether EHG really is an optimal coherent cluster or not at a given level K. (Let alone ANE).

(Generally, since running experiments with Fst and PCoA, I'm tempted to see formal stats more as a workaround for issues with ancient DNA - poor sample size, low coverage, damage, limited geographical sample - and less as an inherently superior method as such. They do discard a lot of the drift data that is very specific to real ancestry and fine structure. If better ways emerge in the long term to deal with those issues, we might be able to discard these methods like qpAdm that rely on lossy indirect means to detect ancestry and get back to simpler methods that use all the data.)

Davidski said...

@TruthPrevails

You're flapping about trying to conjure up genetic and linguistic arguments against what the latest data is showing, but you offer nothing of substance.

Try and offer something of substance. Surely if you represent the truth then it shouldn't be all that hard to come up with something at least worth reading?

Rob said...

@ Ric Hern

I'm still wondering how directly relevant Mal'ta is, and it seems doubtful that "Haplogroup R and its immediate ancestors were trapped in Southern Siberia". Quite the contrary, the current data to me suggests that R-lineages became extinct in Siberia around the LGM, because out of 30 or so Mesolithic & Neolithic lineages in the Altai, most are Q and N (2 were R1a). This doesn't look like a zone where R1 was 'trapped'.

It also seems that your assertion that Central Asia was a deserted wilderness isn't quite on point, at least not for all of it, given the recent finding of a LUP individual from Kyrgyzstan (which is being analysed).

I'm also confused about what you're trying to communicate about Ukraine, and it having anything to do with Siberia - the Black Sea was its own refuge zone.
Lastly, I would treat the post-LGM history of R1b and R1a distinctly, for a clear picture. And ultimately, there is no way to link Villabruna to Siberia.

Davidski said...

Affinity between MA1 and ancient steppe populations, which are obviously directly ancestral to modern Europeans, is extremely high. This doesn't look like a signal from an extinct and irrelevant population.

https://4.bp.blogspot.com/-H7ZXZV8JekI/V62REaCGiDI/AAAAAAAAEuo/7ps72XQew5AXubJGuuvRz7c4GIny7doEgCLcB/s1600/Iran_N_Bichon_vs_MA1.png

I'd be shocked if any sample from Central Asia, like Kyrgyzstan, shows even higher affinity to the same ancient steppe groups.

And if it doesn't, then even if it belongs to R, like MA1, there won't be any reason to assume that R1a and R1b on the ancient steppe and in modern Europe came from Central Asia rather than Siberia.

Rob said...

Hhmm. Well, the evidence suggests that R1b isn't from LUP Siberia.
R1a is more complex.
But essentially, I think R1b was already in Europe at or after the LGM, whilst R1a might have arrived later, coupled with other 'eastern' lineages such as Y hg Q and mtDNA C, which would account for the 35% ANE in EHG, according to the new calculations by Haak et al.

Ric Hern said...

@Rob I didn't say that R1b was from Siberia, but its ancestors surely were roaming North of Central Asia during the LGM, because Central Asia was very sparsely populated during this time owing to the Desert-like conditions that prevailed until Deglaciation. It was much drier than today, with much less fauna to hunt.

And yes, they were not totally trapped, but the route South for all intents and purposes was blocked by desert. So the only way open was East and West.

Currently there is only evidence for Villabruna as being R1b. The rest of European samples of roundabout the same timeframe and earlier produced I2 mostly but no R.

postneo said...

@piranha
"Poltavka outlier is a mix of Yamnaya ancestry and Early European Farmer ancestry, resembling the Corded Ware samples"

So there are two humps for the steppe hypothesis to overcome, for both Europe and South Asia.
There is no uniparental match between Yamnaya/CW and Europe even though there is an aDNA match. On the archaeological side there is some support for a Yamnaya/CW "like" movement to Europe.

There is no aDNA match between Poltavka and South Asia even though there is common Y-DNA. There is no archaeological trail.

Of course it may get resolved. But for now there is work to be done in finding the right founding populations. They may not be the same for South Asia and Europe, but may share some ancestry.

Making a hard association of language with these ancient populations and making a belief system out of that is silly. Yamnaya/CW may have spoken extinct IE dialects or non-IE. They may have adopted language from BB GAC. Perhaps they had disparate language groups. The possibilities are endless.

Davidski said...

"Of course it may get resolved."

It's already resolved. Early Baltic Corded Ware are identical to Yamnaya (with no EEF admixture) and carry R1a-Z645.

"Yamnaya/CW may have spoken extinct IE dialects or non IE. they may have adopted language from BB GAC. Perhaps they had disparate language groups. the possibilities are endless."

No, the possibilities are finite to anyone who understands the data we already have.

Rob said...


@ Ric
Have you noticed that all the samples we currently have for LUP are from a small corner of Western Europe (France - Germany)? Which are indeed I2.
We still don't have much Y DNA from the rest of Europe (Iberia Italy Balkans; or LUP Russia -Ukraine).

Also, at least we DO have R1b in late Palaeolithic Italy, as well as Mesolithic northeast and Eastern Europe, and an early offshoot in the Caucasus (although much later in prehistory); not to forget V88 in Neolithic Iberia, as well as the MNE R1b from Blatterhohle, which has nil EHG but is 50% WHG. Against this, we have zero R1b in Siberia before the Copper Age. So this establishes R1b in Central Europe - Caucasia by the LUP, in contexts which have nothing to do with contemporaries in Siberia. Clearly, there must be another explanation.


Lastly, as I stated before, yes, much of CA was a desert, but not the mountain corridors, or the east Aral-Caspian littoral.

postneo said...

@davidski
"No, the possibilities are finite to anyone who understands the data we already have."

That's an empty statement. There is no data. You don't understand what "data" means when it comes to linguistics. It's best we stick to DNA because that alone presents enough challenges without muddling things further.

Atriðr said...

By the way, Iosif Lazaridis made a couple of observations related to the Pagani et al. abstract on Twitter.

To which Mait Metspalu (one of the authors) replied:

Yes - the gradient we think is there is made of some components of future steppe ancestry. But the signal for inflation would remain.

And then Iosif counter-replied:

Thanks, looking forward to reading the details

Yes, we are looking forward to the details.

Unknown said...

Why did -post all the graphs? I thought it was a new study :(
And I was excited, then disappointed.

The new studies will never happen :(

postneo said...

@david
Neither CW Z645 nor Yamnaya are good matches for the bulk of Western European Y-DNA.

bellbeakerblogger said...

I think Mait said 'yes' that they conflated Steppe with Steppe components, which makes me think Iosif's point didn't quite land. Conflation is a bad thing because two logical definitions have become confused.

Future Steppe ancestry isn't Steppe ancestry because Steppe ancestry is a specific thing. A quarter isn't a dollar. Yellow isn't green. Combined with something else, yes. This is simple logic.

If they want to argue CHG is clinal into Eastern Europe, then that's fine, but that doesn't have anything to do with Steppe other than it mixed with something else.

1Mero171 said...

"However ancient DNA samples from Eastern European and Caucasian Hunter-Gatherers as well as from Early Iranian Neolithic, dating from before the Yamnaya expansion, already show signs of this so-called" Steppe "component (Lazaridis et al.


What they are calling "steppe component" is obviously not CHG, since Eastern European HG did not have CHG admixture. Most likely, this "steppe component" is ANE, since it is the common point between all the populations mentioned in the abstract. ANE admixture is present in EHG (WHG + ANE) and in CHG / Neolithic Iran (ANE + Basal Eurasian).

bellbeakerblogger said...

I'm trying not to be a lawyer here, but "steppe component" and "a component of steppe" are two different things. Maybe that was lost in translation.

If the argument is that ANE radiated into Europe from Iran or South Central Asia, and that relates to a possible over-estimation of Steppe admixture in later Europeans, then that argument can be made however defensible. But again, the Steppe profile is not parts of itself.

1Mero171 said...

@bellbeakerblogger

I'm trying to understand what these authors are proposing. Your previous comment was perfect and also defines what I think.

Szkx said...

I think the authors just use old terminology. ANE was called "steppe component" because we thought all of it in Europeans was from the steppe. Now we know ANE predates steppe expansions in Eastern Europe and that is what they want to say. That's all. They just should use the right name, ANE component.

Atriðr said...

@Szkx

"They just should use the right name, ANE component."

Unless they are meaning to imply something else.

Matt said...

Measuring Basal Eurasian is still challenging. Lazaridis 2016's estimates did not directly measure EHG (for some reason) but placed pre-Neolithic populations as follows:

Steppe_Eneolithic - 17%, SHG - 10%, CHG - 35%, WHG - 2.5%, Steppe_EMBA - 21%

Run regressions around the two clines there and you would get EHG as 11% (on the CHG->Steppe_EMBA->Steppe_Eneolithic cline) or 18% (on the Motala->WHG cline); average between them would be 15%. (Assumptions - SteppeEneo - 80% EHG, Steppe_EMBA - 50% EHG, SHG - 50% EHG).
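For anyone who wants to check the arithmetic above, here is a minimal sketch of the two regressions. It uses only the Basal Eurasian percentages and EHG fractions quoted in this comment, plus one assumption that isn't stated there: that CHG and WHG carry 0% EHG, so they anchor the low end of each cline. The function names are just illustrative.

```python
def fit_line(points):
    """Ordinary least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_to_pure_ehg(points):
    """Predicted Basal Eurasian % for a population that is 100% EHG (x = 1.0)."""
    slope, intercept = fit_line(points)
    return slope * 1.0 + intercept


# (EHG fraction, Basal Eurasian %) pairs.
# ASSUMPTION: CHG and WHG are treated as 0% EHG; the other fractions are
# the ones listed in the comment (Steppe_EMBA 50%, Steppe_Eneolithic 80%, SHG 50%).
steppe_cline = [(0.0, 35.0), (0.5, 21.0), (0.8, 17.0)]  # CHG, Steppe_EMBA, Steppe_Eneolithic
hg_cline = [(0.0, 2.5), (0.5, 10.0)]                    # WHG, SHG (Motala)

steppe_estimate = extrapolate_to_pure_ehg(steppe_cline)
hg_estimate = extrapolate_to_pure_ehg(hg_cline)
average_estimate = (steppe_estimate + hg_estimate) / 2

print(f"EHG on steppe cline: {steppe_estimate:.1f}%")
print(f"EHG on HG cline:     {hg_estimate:.1f}%")
print(f"Average:             {average_estimate:.1f}%")
```

Under those assumptions the two extrapolations land close to the 11% and 18% figures given above. The second cline has only two points, so that "regression" is really just a straight line through them.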

Could be wrong of course, but that was their "state of the art" model for estimating Basal Eurasian, and the best that we formally have from the people who estimated that it even exists.

This could allow South->North geneflow into EHG.

Davidski said...

@Szkx

"I think the authors just use old terminology. ANE was called "steppe component" because we thought all of it in Europeans was from the steppe. Now we know ANE predates steppe expansions in Eastern Europe and that is what they want to say. That's all. They just should use the right name, ANE component."

No, they're talking about the Yamnaya-related admixture estimates from Haak et al., because they mention the 75% figure for Corded Ware.

Corded Ware was estimated to be ~75% Yamnaya-related in Haak et al., not ~75% ANE.

Pre-LNBA ANE admixture outside of the steppe is irrelevant here unless it's accompanied by exactly the right proportions of EHG and CHG.

So it's impossible to claim that the Steppe component existed in Europe outside of the steppe prior to the LNBA without CHG being present there, and impossible to claim that the Steppe component existed in Early Neolithic Iran without EHG being present there.

And even if present, they have to come in the right proportions.

Samuel Andrews said...

@Matt,
"Measuring Basal Eurasian is still challenging."

That's because Basal Eurasian is just an explanation for why ancient Middle Easterners are less related to East Asians+UstIshim than ancient North Eurasians are.

I'm not even sure Basal Eurasian ever existed. There are many possible explanations for why ancient Middle Easterners are less related to East Asians than ancient North Eurasians, Basal Eurasian is just one explanation. Maybe it's a combination of something super Basal and something East Asian-like. Even that's possible.

Maybe if David created an ADMIXTURE test with only ancient genomes a Basal Eurasian component would emerge. If Basal Eurasian existed then all the ancient Middle Easterners should share some type of detectable shared ancestry

Nirjhar007 said...

ANE is not the steppe component. Never was.

Now, the attitude for an open discussion shown by the geneticists is very welcome. In future we may see more things like that. Also, some geneticists are opening blogs. This is what I predicted years back and I'm glad to see it's happening, or at least starting to happen.

Karl_K said...

@Samuel

"If Basal Eurasian existed then all the ancient Middle Easterners should share some type of detectable shared ancestry."

Not if there were more than one branch of Basal Eurasians.

But the lowered amount of Neanderthal ancestry in people with Basal Eurasian, while they seem to lack Sub-Saharan ancestry, definitely makes other scenarios unlikely.

The simplest explanation is that there were one or more 'Out-Of-Africa' populations that did not mix with Neanderthals, remained somewhere near the Middle East or North Africa for tens of thousands of years, and later mixed into separate populations in that area.

If East Asians mixed with a super basal group to create the Basal Eurasians, you would see greater relatedness to East Asians from populations that have it, instead you see less.




Matt said...

Sam: If Basal Eurasian existed then all the ancient Middle Easterners should share some type of detectable shared ancestry.

Very low in group sharing of f3 variants could mess with this, if we're measuring via f3 sharing.

Like, if we look at Barcin_N, then we see much lower sharing of f3 variants within the group than we see in WHG (though Boncuklu is more comparable to WHG).

If you extrapolate the trend, you could have individuals within a "Basal Eurasian" population who share population history, but might share more f3 variants with other groups...

Matt said...

Also may be of interest to some, another experiment about Basal Eurasian that I tried with Fst scores.

In theory BEu cancels a shift towards Asia relative to Africa, present in WHG / ANE. This shift is present in D and f4 statistics .

So I wanted to test if this was obvious in Fst scores as well.

So 1) Fst between groups and a Southern ENA panel (Onge, Australian, Papuan, Bougainville): http://i.imgur.com/gGyqNg6.png, 2) Fst between groups and an African panel: http://i.imgur.com/KSPdLWP.png
You can see these tend to rise and fall together, i.e. drift increases distance to both outgroups.

3) Fst shift : http://i.imgur.com/dbrZ0vl.png

With this measure, it looks like the ENA shift tends to fall closest to the neighbouring modern day populations - e.g. in Fst the Levant_N and Natufian are pretty much ENA->Africa shifted as Palestinians or BedouinB, Europe_EN is around where Cypriots are, Iberia_Chal close to Sardinians.

More interestingly, Andronovo, Yamnaya_Kalmykia, WHG, Motala_HG are all in the same range as modern North-Central Europeans (Czech, English, Croatian).

Oddly, by this measure, Kostenki shows more ENA shift than most other ancient Europeans (e.g. WHG), not less.

So a bit of puzzle to me - I'm not sure why an ENA shift would not replicate within Fst... Seems like it should.

Karl_K said...

@Matt

ENA could ALSO have admixture from a different UNIQUE basal modern human source. We know that the Altai Neanderthal had admixture from very basal modern humans, and they might have still been around when OOA groups moved in.

batman said...

Karl,

"We know that the Altai Neanderthal had admixture from very basal modern humans, and they might have still been around when OOA groups moved in."

When is such a 'movement' (OOA to Altai) supposed to have happen?

What facts exist to give evidence that this suggestion have any root in realities?

MfA said...

Haplogroup BT might be responsible for the very basal AMH admixture in Altai Neanderthals.

Matt said...

Off topic:

With the Fst scores again, I thought I would put them into rank order, to see how they compare with outgroup f3 and IBS stats:

Euro-Siberian HG (Upper Paleolithic - Mesolithic): http://i.imgur.com/TLm9HNE.png
Near East (late Paleolithic - Neolithic): http://i.imgur.com/vY9LTDl.png
Europe (Early - Middle Neolithic + Chal): http://i.imgur.com/49os3H5.png
Near East (Chalcolithic - Bronze Age): http://i.imgur.com/9tGxUWT.png
Steppe (Eneolithic - Early-Mid Bronze Age): http://i.imgur.com/bQZQW8f.png
Europe (LNBA): http://i.imgur.com/HqDQJKa.png
Steppe (MLBA): http://i.imgur.com/8gzxAFJ.png
Steppe (post MLBA): http://i.imgur.com/cGNbwXi.png

Some kind of interesting things there, at least:

1) Much stronger affinity for AG3-MA1 in Fst to South Asia than I expected.
2) Ancient Near East connects to their "logical" recent neighbours, not to Sardinian / West Med.
3) Some specific affinities that make sense in the Europe EN->LNBA setup
(e.g. CWG closest to Germany, Albanian and Greek and mainland Italians less differentiated from early farmers in those areas than Sardinians are).

Using the difference in Fst from (Ancient)-Yoruba, you can kind of show a similar pattern to what the f3(Mbuti,Pop,Ancient) stats show:

Euro-Siberian HG - http://i.imgur.com/Apxr7Zl.png, Near East (Early) - http://i.imgur.com/5fDMmwd.png, Near East (CHL-BA) - http://i.imgur.com/IIEEdeB.png.

The same kind of pattern emerges there, where the European samples tend to get an accelerated level of similarity to ancients when you use the difference against an African population. E.g. for Levant_N, Sardinians move up the list; for AG3-MA1, the Volga-Ural area; for Armenia_MLBA, it's Scots, Icelanders, Croats, etc.

Matt said...

Off topic:

With the Fst scores again, I thought I would put them into rank order, to see how they compare with outgroup f3 and IBS stats:

(Full post at https://pastebin.com/ibyYazPh as a lot of links might trigger the spam filter).

Might be of interest to Sein as there are some patterns with SCA there.

Matt said...

Following on, f3 stats vs Fst - http://i.imgur.com/Znn46ae.png

f3 to Loschbour quite linear to Fst to WHG. f3 to Oetzi much less linear to Fst to Hungary N and f3 to MA-1 much less linear to Fst to AG3-MA1.

Matt said...

f3(Mbuti,Ancient,Test) are very linear with Fst(Yoruba;Test)-Fst(Ancient;Test).

E.g - AG3-MA1: http://i.imgur.com/ixctAMH.png / WHG: http://i.imgur.com/611UD6F.png

r2 = 0.98 correlation.
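A rough sketch of how that comparison can be set up, for anyone who wants to try it on their own tables. The "Yoruba-relative Fst" is taken here as Fst(Yoruba;Test) minus Fst(Ancient;Test), as written above, and r2 is the squared Pearson correlation. The numbers in the example are made-up placeholders, not values read off the plots, and the function names are just illustrative.

```python
def relative_fst(fst_outgroup_test, fst_ancient_test):
    """Fst(Outgroup;Test) - Fst(Ancient;Test) for each test population."""
    return [o - a for o, a in zip(fst_outgroup_test, fst_ancient_test)]


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# PLACEHOLDER values for four hypothetical test populations (not real data).
fst_yoruba_test = [0.150, 0.155, 0.160, 0.170]
fst_ancient_test = [0.040, 0.055, 0.070, 0.095]
f3_mbuti_ancient_test = [0.210, 0.205, 0.200, 0.190]

shifted = relative_fst(fst_yoruba_test, fst_ancient_test)
print(f"r2 = {r_squared(shifted, f3_mbuti_ancient_test):.3f}")
```

With real inputs, each list would hold one value per test population, in the same order across all three lists.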

Seinundzeit said...

Matt,

Thanks for looking into this, really appreciate the analyses.

Those are some very interesting patterns.

For what it's worth, the Fst rankings strike me as being more sensible when compared to what we've seen with F3 stats.

One wonders though; is the ANE affinity in South Asia reflective of some local ancient substrate involving forager populations (perhaps quite similar to Iran_Hotu, but with even more ANE-related affinity), or is it reflective of heavy genetic influence from populations similar to Karasuk_outlier and Okunevo (or even perhaps reflective of Srubnaya_outlier-related admixture, which is what PCA-based nMonte consistently showed)?

Matt said...

Thanks. Yeah, I actually think there are definite benefits to both kind of stats (now it seems there's enough data to calculate the Fst sensibly!).

The f3 (and the "Yoruba relative Fst" strongly correlated with it) is useful at telling us more by excluding strong drift, so one use is that Kalash tend to be pretty typical for their region in that stat, unlike the Fst, and another is you can see the strong link to Native Americans.

On the other hand, the Fst seems like it is telling us more about the actual total genetic differentiation between the groups, and which present day groups are least divergent, as a population, from, for example, the AG3-MA1 "North Eurasians".

(I also wonder if the f3(Mbuti,Ancient,Test) isn't somehow weighted down by Basal Eurasian (or an ancient Northeast African population or whatever it is) in some way that isn't necessarily what we're expecting it to do by just removing the effect of drift. So you're not just getting a measure of how close say South Central Asians are to the ANE without drift, but somehow it is depressed by Basal Eurasian in a way which is less true for Volga-Ural region and less again for Native Americans).

Re; why the affinity, those both seem like pretty plausible possibilities (if either would be incorrect straight off the bat I can't think of a reason), so seems like it will depend on adna to constrain.

Alberto said...

@Matt

Thanks again for all of these.

It's indeed very interesting how the Fst rankings are almost the opposite of the f3 ones. Fst seems to favour diverse populations, which tend to get lower distances, while it "punishes" populations with very low diversity (Amerindians, Kalash, to a lesser degree Basques). But this looks quite preferable to f3, which favours low diversity populations and heavily "punishes" diverse ones (in Eurasia, ones like S-C Asians with Basal Eurasian and ENA, and even more so Near Eastern ones with SSA admixture). This is even more evident in Africa, where Fst shows Africans being diverse, but still closer to each other, while formal stats will show very strange results, with Yoruba being closer to Han than to Mbuti or whatever.

(BTW, I spotted some oddities there in your calculations, for example: http://i.imgur.com/IIEEdeB.png, Iran_Chalcolithic being very far from itself because of a very low distance to Yoruba, which would imply significant Yoruba-related admixture. Seems to me like a wrong value got there. Similar with Jordan_EBA.)

Matt said...

Few more graphs:

IBS vs Fst (ANE, WHG, Early Neolithic) - http://i.imgur.com/zKyaxxO.png

Basal K7 proportion vs Fst - http://i.imgur.com/Tmzc7u1.png

IBS vs Fst (Bronze Age version) - http://i.imgur.com/rsLytlk.png

IBS behaves much like f3 outgroup stat (as we know from graphing them against one another).

Arza said...

@ Davidski

Main CHG mixture cline.

I've asked about this line because it's probably the only one here that is in fact very far from being straight.

If you want, you can put CHG at any place around Iran Neo using just a slight rotation of the 3D plot, because in PC3 CHG is way "over" Iran (view on the right). Such a plot will be practically indistinguishable at first glance from the PC1/PC2.

https://2.bp.blogspot.com/-CVHusjdLpr4/WPA819bRSHI/AAAAAAAAAHk/7B7485Y0bBIC8JGVjDdstMseGOOk50kjQCLcB/s1600/hungCHGiran.png

Seinundzeit said...

Matt,

"I also wonder if the f3(Mbuti,Ancient,Test) isn't somehow weighted down by Basal Eurasian (or an ancient Northeast African population or whatever it is) in some way that isn't necessarily what we're expecting it to do by just removing the effect of drift. So you're not just getting a measure of how close say South Central Asians are to the ANE without drift, but somehow it is depressed by Basal Eurasian in a way which is less true for Volga-Ural region and less again for Native Americans".

I find this to be an exceedingly interesting train of thought.

This makes IBS seem quite effective, since it shows similar results to Fst when it comes to the ANE-SCA connection, yet also shows the ANE-Native American connection.

Davidski said...

@Arza

The CHG mixture cline on the plot is testing the validity of the qpAdm mixture model for CHG.

CHG
EHG 0.004
Hungary_HG 0.207
Iran_Neolithic 0.789
chisq 12.265 tail_prob 0.0921732

https://drive.google.com/file/d/0B8XSV9HEoqpFNEduVnNLX3lITk0/view?usp=sharing

These results appear to be in line with the following inference about CHG from Fu et al. 2016, because Hungary_HG is part of the Villabruna cluster.

"The Satsurblia Cluster individuals from the Caucasus dating to ~13,000–10,000 years ago share more alleles with the Villabruna Cluster individuals than they do with earlier Europeans, indicating that they are related to the population that contributed new alleles to people in the Villabruna Cluster, although they cannot be the direct source of the gene flow. One reason for this is that the Satsurblia Cluster carries large amounts of Basal Eurasian ancestry while Villabruna Cluster individuals do not (Supplementary Information section 12; Extended Data Fig. 4)."
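The qpAdm output quoted above can be sanity-checked with a few lines of code. This is only a sketch: it checks that the three mixture weights sum to 1 and recomputes the tail probability from the chi-square value. The output doesn't state the degrees of freedom, so the value of 7 used here is an assumption, picked because it's the one that reproduces the quoted tail_prob. The 0.05 cutoff is the usual convention, not something qpAdm enforces.

```python
import math


def chi2_tail(x, df):
    """Upper-tail probability of a chi-square variable (closed forms for integer df)."""
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)


# Values copied from the qpAdm output for CHG.
weights = {"EHG": 0.004, "Hungary_HG": 0.207, "Iran_Neolithic": 0.789}
chisq = 12.265
reported_tail_prob = 0.0921732

ASSUMED_DF = 7  # ASSUMPTION: not given in the output, inferred from the tail_prob

weight_sum = sum(weights.values())
tail_prob = chi2_tail(chisq, ASSUMED_DF)

print(f"weights sum to {weight_sum:.3f}")
print(f"recomputed tail_prob = {tail_prob:.6f} (reported {reported_tail_prob})")
print("model not rejected at 0.05" if tail_prob > 0.05 else "model rejected at 0.05")
```

A tail probability above 0.05 only means the three-way model isn't rejected by the data; it doesn't show that these were the actual source populations.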

Karl_K said...

@batman

"When is such a 'movement' (OOA to Altai) supposed to have happen?

What facts exist to give evidence that this suggestion have any root in realities?"

I don't know exactly what you mean, what is your question really?

Modern humans from the OOA genetic group almost certainly arrived in that area >45,000 years ago, as they were already in places like Siberia and Sahul by then, and had already admixed with Neanderthals and Denisovans. Modern humans obviously are living there right now.

If you are suggesting that the OOA group of people came from somewhere other than Africa, I think that is highly unlikely due to a number of factual arguments that have been covered on numerous occasions by many people with more time than I have.

Ric Hern said...

@Davidski :

"population that contributed new alleles to people in the Villabruna Cluster"

Was this population closer related to Villabruna or Satsurblia? Or did it contribute more towards Villabruna than to Satsurblia?

Was this population a Native Steppe population or did it migrate from somewhere else, eg. Siberia ?

Davidski said...

"Was this population closer related to Villabruna or Satsurblia? Or did it contribute more towards Villabruna than to Satsurblia?"

The Satsurblia cluster (CHG) has significant Villabruna-related and MA1-related ancestry.

The Villabruna-related ancestry probably moved into the Caucasus from the Western Steppe, while the MA1-related ancestry probably mostly moved into the Caucasus via the South Caspian, along with Basal Eurasian ancestry.

Ric Hern said...

@Davidski

So MA1-related ancestry did not use a Northern Route from +-Siberia to Europe ?

Ric Hern said...

Could there have been two different routes of MA1-related migrations into the Pontic Caspian Steppe ? One North and the other South of the Caspian ?

Davidski said...

MA1-related ancestry moved via the South Caspian into the Caucasus along with Basal Eurasian ancestry to form CHG, and it moved into Europe without Basal Eurasian ancestry to form EHG.

Ric Hern said...

So MA1-related ancestry within EHG was due to a migration that did not come into contact with Basal Eurasian. Could this point to an earlier migration towards the west that predated the migration of Basal Eurasian (Baradostian?) types into the Southern Caspian that formed the CHG (Zarzian?)? Which culture preceded the Zarzian culture in the Southern Caspian?

Olympus Mons said...

@Davidski...
Then I am the one producing fiction? --- Jesus!

Have you ever looked at a hypsometric map? Go to Google Maps and just press "terrain". Then take a second to imagine how "real humans", not "statistic humans", would really move around.

Gioiello said...

@ Rob
"There no problems with MA/AG ancestry in Eastern Europe - microblades and all arriving at the terminal Palaeolithic, but I question if fits straightforwardly linked to Y Hg R. On balance of current evidence, it seems like it's *not*, at least for R1b.
[...] The South Caucasus has a hiatus during the LGM, after which would have bright villabruna / ANE fusion probably from Ukraine and Basal from the levant".

Did you mean that R1b was in Western Europe or at least Westward Ukraine?

Ric Hern said...

Zarzian = Epi-Paleolithic, 18,000 BC.
Mezine = Epi-Paleolithic, 16,000 BC.
Villabruna = Epi-Paleolithic, 12,000 BC.

Rob said...

@ Gio

No I deleted that comment because i had to add more details about the Caucasus/ Iran. Disregard it.

@ Ric.

It is hard to say when ANE arrived in the Caucasus or Iran without pre-LGM samples from either region. In fact, from Iran, we have nothing before 10,000 BC. The problem is that there appears to be a discontinuity of settlement in the Caucasus during the LGM (although new excavations might change that picture), and possibly two gaps in Iran (between the Baradostian and Zarzian, and between the Zarzian and the pre-Ceramic Neolithic, the latter being when our earliest Iran samples come from).
My guess, as far as the Caucasus is concerned, is that ANE could have co-arrived with Villabruna ancestry from the north, and BE separately from the south. With regard to ANE in Iran, it could have come directly from Central Asia.
It would be great to know one day what the Zarzians, or the Baradostians (formerly called Zagros "Aurignacians"), looked like.

Lastly, as i mentioned before, it is possible that ANE in eastern Europe arrived in the Final Palaeolithic with microblade groups from the Urals, but i have doubts they were R1b.

Gioiello said...

@ Rob
"@ Gio No I deleted that comment because i had to add more details about the Caucasus/ Iran. Disregard it".

As some people have many confused ideas about the Epigravettian, I may say to you, who are a friend, that R-V88 is definitely demonstrated to be older in Italy and Western Europe than elsewhere, and R-L389 too, when Sergey Malyshev decides to update his tree, is older in Italy (see Mangino, Big Y tested). We are waiting for the R-M73 of Western Europe to be tested as well, so it will be demonstrated that it is older in Western Europe than in Eastern Europe and Asia, and so on...

Ric Hern said...

@Rob

Yes, there seems to be a connection between the Zarzians and the Southern Urals via microblade technology. But yes, it is still anybody's guess what the Zarzians actually looked like genetically. MA1-related ancestry could have introduced or contributed microblade technology to the remnants of the Baradostians in that area, with some admixing to form the Zarzians.

Other MA1-related groups could have bypassed this admixture event because the area was sparsely populated. It is interesting that the bow and arrow, plus dogs, were introduced into this area during the Zarzian.

But how could MA1-related people migrate from Southern Siberia to the Southern Caucasus during this time period to arrive there at 18,000 BCE?
This was still the early phase of deglaciation, and Central Asia was still relatively dry.

Matt said...

@ Sein, yeah, if you compare the three measures (f3 outgroup stat, outgroup adjusted Fst, IBS), then you do find that the raw IBS is the most akin to the Fsts in representing a strong affinity of the AG3-MA1 population to South Central Asian populations without much drift.

Though note the raw Fst is much more different to any of these measures than they are to each other! It's more like those measures are telling one story and the Fst (role of drift and within population diversity very different in Fst than other measures) another than anything else.

I think whichever measures we use, if we were using them in nMonte or something like that, we would tend to get the right answer with enough relevant stats - because it's the systematic nature of them more than where a pop sits in the rankings that matters; so long as a population places systematically similar to a set of ancients and the stat is of a type where you can apply addition of proportions then you can get the right result - but we would have different intuitions about which populations are closest to the AG3-MA1 population looking at each one of these stats. So we should perhaps be cautious of just using the f3 generally for a measure of which population is closest (even net of drift!)?
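
To make the "addition of proportions" point concrete, here's a minimal sketch of fitting a target as a weighted mix of source populations from a table of stats. This is not nMonte itself, just a generic least-squares fit with the weights constrained to be non-negative and to sum to 1; the function names, the projected gradient approach and the input layout (one row of stats per population) are all assumptions for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                  # sort descending
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_mixture(target, sources, n_iter=5000):
    """Hypothetical helper: find mixture weights for `target`.

    target  : shape (m,), one value per stat (e.g. f3 or IBS vs. m references)
    sources : shape (k, m), one row of the same stats per source population
    Returns weights of shape (k,) that are >= 0 and sum to 1.
    """
    A = np.asarray(sources, dtype=float).T    # (m, k)
    t = np.asarray(target, dtype=float)
    k = A.shape[1]
    w = np.full(k, 1.0 / k)                   # start from equal proportions
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1 / largest eigenvalue of A^T A
    for _ in range(n_iter):
        grad = A.T @ (A @ w - t)              # gradient of 0.5 * ||A w - t||^2
        w = project_to_simplex(w - step * grad)
    return w
```

The fit only recovers sensible proportions if the stat really does behave additively under admixture, which is the caveat above: a ranking by a single stat and a fit across many stats can point to different "closest" populations.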

Lee said...

Any chance you could list out the scripts you used for the analysis?

Rob said...

@ Ric

Well, a similar post-LGM industry appears on both sides of the Greater Caucasus c. 18 kya. So they must have communicated and migrated across.

Have a read of "Epipaleolithic of the Caucasus after the Last Glacial Maximum".

Tell me what you think. I haven't read it in a while

Ric Hern said...

Thanks Rob.

Gioiello said...

From the paper of Pamjav et al. about Hungarian Y nothing new and very old. E808/15, tested R1b-M343, is an "R1b-M73 > M478 > L1432" to compare with
277808 Stepan Kolomoets, Abt.1700 Ukraine R-BY13055
13 19 14 10 13-13 12 12 14 14 13 30 16 9-9 11 11 23 15 19 34 12-15-15-16 11 9 19-25 15 16 17 17 29-35 12 10 11 8 16-16 8 10 10 8 10 11 12 23-23 16 10 12 12 16 6 12 24 21 13 12 11 13 11 11 12 11 33 14 9 15 13 24 26 19 12 11 12 12 13 9 13 11 10 11 12 31 12 13 24 13 9 10 19 15 21 12 23 16 12 15 25 12 25 18 8 14 16 9 11 11
The other 5 samples (E797/15, E775/15, E899/15, E005/2016, E868/15) tested M343 and P25, are very likely all R-L23-Z2013 subclades diffused in Eastern Europe or also come from Asia.
Also the sample E908/15, tested M412: difficult to say if it is an R-L51-PF7589 lacking DYS426, thus nothing from these samples in favour of an origin of the oldest R1b subclades and of R-L51 in Eastern Europe.

Gioiello said...

Of course Z2103 and not Z2013.