Saturday, October 3, 2015

Linguistics, Archeology and Genetics (L-A-G) Conference abstracts

The Max Planck Institute is holding a conference in a few days dedicated to the latest developments in the search for the Indo-European homeland.

Linguistics, Archeology and Genetics: Integrating new evidence for the origin and spread of Indo-European languages

A draft book of presentation abstracts is available here. This one from Copenhagen-based linguist Guus Kroonen looks very promising.

Pre-Indo-European speech carrying a Neolithic signature emanating from the Aegean

Guus Kroonen, Institute for Nordic Studies and Linguistics, Copenhagen University, Copenhagen

When different Indo-European speaking groups settled Europe, they did not arrive in terra nullius. Both from the perspective of the Anatolian hypothesis and the Steppe hypothesis the carriers of Indo-European speech likely encountered existing populations that spoke dissimilar, unrelated languages. Relatively little is known about the Pre-Indo-European linguistic landscape of Europe, as the Indo-Europeanization of the continent caused a largely unrecorded, massive linguistic extinction event. However, when the different Indo-European groups entered Europe, they incorporated lexical material from Europe’s original languages into their own vocabularies. By integrating these “natural samples” of Pre-Indo-European speech, the original European linguistic and cultural landscape can partly be reconstructed and matched against the Anatolia and the Steppe hypotheses. My results reveal that Pre-Indo-European speech contains a clear Neolithic signature emanating from the Aegean, and thus patterns with the prehistoric migration of Europe’s first farming populations. These results also imply that Indo-European speech came to Europe following a later migration wave, and therefore favor the Steppe Hypothesis as a likely scenario for the spread of the Proto-Indo-Europeans.

Also, we've known for a while now that the good people at Broad MIT/Harvard have analyzed remains from Neolithic Anatolia, but it's nice to see this framed in the context of the Indo-European homeland debate.

Close genetic relationship of Neolithic Anatolians to early European farmers

Iosif Lazaridis et al.

We study 1.2 million genome-wide single nucleotide polymorphisms on a sample of 26 Neolithic individuals (~6,300 years BCE) from northwestern Anatolia. Our analysis reveals a homogeneous population that was genetically similar to early farmers from Europe (FST=0.004±0.0003 and frequency of 60% of Y-chromosome haplogroup G2a). We model Early Neolithic farmers from central Europe and Iberia as a genetic mixture of ~90% Anatolians and ~10% European hunter-gatherers, suggesting little influence by Mesolithic Europeans prior to the dispersal of European farmers into the interior of the continent. Neolithic Anatolians differ from all present-day populations of western Asia, suggesting genetic changes have occurred in parts of this region since the Neolithic period. We suggest that the language spoken by the homogeneous Anatolian-European Neolithic farmers is unlikely to have been the same as that spoken by the Yamnaya steppe pastoralists whose ancestry was derived from eastern Europe and a different population from the Caucasus/Near East [Haak et al. 2015], and discuss implications for alternative models of Indo-European dispersals.

Indeed, my view is that the implications of this data for the Anatolian hypothesis are fatal (see here). It might also have dire implications for the Armenian Plateau hypothesis, although for the time being this hypothesis limps on.

Feel free to post and discuss your favorite abstracts in the comments below. If anyone reading is going to this thing, I'd love to hear more about the Y-haplogroups of the Anatolian farmers.


Ariele Iacopo Maggi said...

"60% of Y-chromosome haplogroup G2a" The remaining 40% is going to be interesting.

Krefter said...


The rest will be mostly F*(H or T).

Alberto said...

Certainly a very interesting conference. But it seems they'll be mostly discussing the same things we've been discussing here for the last months/year. Basically that the data we've got partially (or superficially) supports the Steppe Hypothesis (for Europe, mostly), but at a deeper level it raises some serious questions about it that will need further data to be resolved.

One interesting thing I think we haven't discussed so far. From one abstract:

"- Tocharian split. The steppe and Anatolian theories identify the Tocharian split with different events, but fortunately, their timing almost coincides. We posit a single Tocharian-split constraint at 3400-3100 BC, common to both theories."

Indeed, Tocharian could be key to IE origins. It's a very early branching of IE (earliest known together with Anatolian), though it's attested quite late (6th-8th centuries AD, IIRC). This means that a population of very early IE speakers stayed isolated for a very long period in quite an isolated area in the Tarim Basin. We know that after the 8th-9th century Mongolian tribes mixed with them, bringing Turkic languages to the area. But can we know who these Tocharian speakers were?

They have been tentatively related to the Afanasievo people. If this is right, then the descendants of these Tocharian and Turkic people (the Uyghurs) would probably look like a mix of Afanasievo and Mongolian people. I used 4mix and the K8 spreadsheet to test this:

Uygurs = 0% Yamnaya + 37% Yamnaya + 52% Mongolian + 11% Mongolian @ D = 0.1562

Trying instead with Tajiks, one of the best matches for the non-EHG part of Yamnaya - probably the best if they didn't have some ENA admixture (though here it doesn't affect the results because Uyghurs have it themselves):

Uygurs = 27% Tadjik + 19% Tadjik_Ishkasim + 3% Mongolian + 51% Mongolian @ D = 0.0363

As usual, only ancient DNA can elucidate this with certainty, but it would seem like Tocharians didn't descend from Afanasievo, but rather directly from ANI populations. Since Tocharians (or, in this case, their descendants, the Uyghurs) are the best proxy we have for the most ancient PIE speakers, these results look interesting.
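
For anyone unfamiliar with this kind of test, here is a minimal Python sketch of the general idea, assuming the fit works as a grid search: try every four-way split of the reference populations in fixed percentage steps and keep the split whose blended ancestry components lie closest (by Euclidean distance) to the target. This is not the actual 4mix script. The function name `best_four_way_mix` and its parameters are made up for illustration, and the inputs would be rows of ancestry proportions such as those in the K8 spreadsheet.

```python
from itertools import product
from math import sqrt


def best_four_way_mix(target, sources, step=1):
    """Grid-search the four-way mixture of `sources` closest to `target`.

    target  -- ancestry-component fractions of the population being modeled
    sources -- exactly four lists of the same length (reference populations)
    step    -- grid resolution in percent (hypothetical parameter)
    Returns (weights_in_percent, distance).
    """
    if len(sources) != 4:
        raise ValueError("expected exactly four source populations")

    best_weights, best_dist = None, float("inf")
    grid = range(0, 101, step)

    # Choose three weights freely; the fourth is whatever is left of 100%.
    for w1, w2, w3 in product(grid, repeat=3):
        w4 = 100 - w1 - w2 - w3
        if w4 < 0:
            continue
        weights = (w1, w2, w3, w4)

        # Blend the four sources component by component.
        mix = [
            sum(w / 100 * src[k] for w, src in zip(weights, sources))
            for k in range(len(target))
        ]

        # Euclidean distance between the blend and the target.
        dist = sqrt(sum((m - t) ** 2 for m, t in zip(mix, target)))
        if dist < best_dist:
            best_weights, best_dist = weights, dist

    return best_weights, best_dist
```

A lower distance means a closer fit, which is how the D values quoted in these tests should be read. A close fit only shows that the blend is numerically similar to the target, not that the mixture actually happened.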

Davidski said...

The Tarim Basin was initially settled by R1a steppe groups, but very quickly became multi-cultural and some of the groups settling there during the later periods probably came from BMAC.

The R1a groups were probably very similar to Afanasievo.

Alberto said...

But the influence must have been small if Tocharian continued to be the language used until the Turkic/Mongol invasions. That's basically what's interesting about Tocharians. Their language was isolated from their neighbours for 4000 years. Sure, we'll need ancient DNA to really know how much they changed in those 4000 years. But language continuity seems an indication of population continuity.

If they had been mostly from BMAC, they would probably have spoken Indo-Iranian for thousands of years before Tocharian is attested.

Alberto said...

With Tabassaran (to avoid the advantage that ENA might give Tajiks):

Uygur = 0% Lezgin + 39% Tabassaran + 38% Mongolian + 23% Mongolian @ D = 0.057

Rob said...

Alberto / Dave

Whilst not impossible, I find it a bit difficult to fathom that Tocharian continued to be spoken for 4000 years unchanged, and fixed.
All kinds of groups moved in and out of the Tarim Basin between 3000 BC and 600 AD.

Again, a diachronic sampling of aDNA will be telling.

Davidski said...

If the ancestors of the Tocharians arrived in the Tarim Basin more than 4,000 years ago, then those R1a mummies fit the profile.

The fact that they kept their language, which might have been the lingua franca in the region, but not their steppe genetic profile isn't really surprising.

Alberto said...


Yes, you are probably right. I hardly ever rely on linguistic evidence as being anything too solid in itself (at this time depth, going to Prehistory). But in this case, Tocharian being an early branching of PIE and quite unrelated to all other surrounding languages seems quite clear and widely accepted. This doesn't mean the language didn't change in 4000 years, but it means it was not replaced (or too strongly influenced) by another language.

How much that attests to population continuity is dubious. But lacking ancient DNA, it's at least an interesting thing to look at.

Karl_K said...


"If they had been mostly from BMAC, they would probably have spoken Indo-Iranian since thousands of years before Tocharian is attested."

Really? So BMAC spoke Indo-Iranian, then?

Rob said...

In terms of specific relations to other languages, some have pointed to Celtic, others Anatolian, some even Slavic (despite it being a centum idiom).

In fact, if aDNA proves an EE (sensu lato) origin for proto-Tocharians, then it's a bit of a clue as to what pre-Balto-Slavic EE languages looked like.

Kurti said...

I was always against the Anatolian hypothesis. Whenever anyone came up with a PIE homeland in the Near East and talked about Anatolia, I told them on Eupedia or Dienekes.blogspot that I don't think Anatolia has anything to do with Indo-Europeans. I always favored the idea that a region between EastAnatolia and Mesopotamia, all the way into South_Central Asia, should be taken into account. The Zagros-Elborz mountains and Southeast Caucasus were always my favored theory and a better fit for the source than Anatolia, simply by the logic that all of the earliest Indo-European cultures in Anatolia are said to have arrived from the east. The Hittites and their kings are documented by ancient inscriptions to have arrived from a bit further east, and the Mitanni seem to have arrived from a little bit further east as well.

If there is a source for the teal ancestry in Yamna it is definitely not Anatolia but a region between Zagros-Elborz/Southeast Caucasus and South_Central Asia.

Kurti said...

Keep in mind that what I call "EastAnatolia" here is technically not Anatolia but a transition zone between North Mesopotamia and Transcaucasus. Just to avoid confusion.

Rob said...


Where does it say the Hittites and Mitanni "came from the east"?

All I know is that it says they're fugitives from Kanesh.

Alberto said...


"Really? So BMAC spoke Indo-Iranian, then?"

Just according to the mainstream, at some point they started to speak Indo-Iranian. Before that, they didn't speak Indo-European at all.

So if an IE language was taken from BMAC people to the Tarim Basin it could have only been Indo-Iranian (if you accept the mainstream, that is).

If you suggest that actually BMAC spoke Tocharian and they were the ones who took it to the Tarim Basin, that's an interesting idea. But I doubt you are suggesting that.

Kurti said...


I remember inscriptions saying that the Hittite kings came from a city somewhere in the east; I don't remember its name anymore.

And then just recently archeologic settlements of early Hittites have been found in Kurdistan.

I simply don't buy the "Hittites came from the West" theory. Nothing speaks for it, but much more speaks for them coming from the east.

Nirjhar007 said...

Thanks for the post!

Roy King said...

If now, the Iberian and LBK Neolithic samples can be modeled as 90% NW Anatolian Neolithic (which may of course be different from PPNB and contain its own Mesolithic elements) and 10% European/Western Hunter Gatherers, do we know where say contemporary Sardinians would fall under this model? A separate question, of course, would be the origin of the Anatolian farmers.

Krefter said...


Before we had Sintashta DNA most thought they'd have very little WHG, high ENF, and high ANE. This is because they trusted that a Sintashta zombie could be sucked out of modern people in the same region. All kinds of stuff could have gone down in the Tarim Basin besides a migration of Tocharians. Actual ancestry from Tocharians, or from their linguistic ancestors who arrived 1,000s of years earlier, may be very small.

Balaji said...

The Armenian Plateau hypothesis is dead. If EEF = 90% Anatolian + 10% WHG, how could anyone in the Armenian Plateau, so close to Anatolia, be very different?

Ancient DNA is largely vindicating the demic diffusion model. The near-Eastern ancestors of the Yamnaya could not have been derived from the Middle East.

Giacomo Benedetti said...


I agree with your theory about the area of origin of the teal component, have you read my post on Zagros?
And can you tell me more about the "archeologic settlements of early Hittites" found in Kurdistan?

@Rob Yes, Hittites came from Kanesh to Hattusha, and Kanesh is to the South-East of Hattusha. It is also interesting how Luwian inscriptions are concentrated in SE Anatolia.

Alberto said...


Yes, but the difference is that we don't have any written records of the language spoken by the Sintashta people, nor do we know at which point this language was replaced (or if it was replaced at all, and by who exactly, or how many times).

Tocharian is a special case for several reasons: First, it's a very early branch of the IE tree, and therefore its original speakers must have been very close to PIEs. Second, the language is attested with written records, so we know exactly when and where it was spoken. Moreover, we know that it's very significantly different from all the neighbouring languages, which points to language isolation.

A very early branch of IE, and isolated, and attested with written records is something very rare and very interesting. But the best part of it is that we know exactly when and by who the language was replaced: by Uyghurs, who mixed with the Tocharian speaking population.

With languages we have examples for all types of cases, but we could summarize them in 4 categories:

- Small genetic impact with language change
- Small genetic impact with language continuity
- Big genetic impact with language change
- Big genetic impact with language continuity

Of those 4 categories I'm sure you can easily find examples for the first 3. But for the 4th category you'll need to think harder to find examples. It's by far the rarest of all the possibilities.

Why would you suggest this least likely scenario in this specific case? Unless we have good reasons to think otherwise, language continuity (more so if it's an isolated one) strongly suggests genetic continuity.

So if the original speakers of Tocharian were Afanasievo people and they remained isolated and speaking their own language for thousands of years until the Uyghurs arrived and caused a big genetic impact and language change (See? This is the normal thing to happen when there is big genetic impact: language change), by "removing" this Turkic/Mongol genetic part from Uyghurs we would have something that should resemble Yamnaya/Afanasievo types to a good degree.

But instead we have something closer to Tajiks or, more generally, ANI (the non-EHG part of Yamnaya). It obviously doesn't prove anything, but given the hypothetical tests we've been looking at lately, I find this one rather interesting.

(Note: the Tarim Basin might have been inhabited (or not) by Afanasievo people 4000 years ago. Or by others later. But we don't know which language(s) they spoke back then. What we do know with certainty is that Tocharian was spoken in the early centuries of our era by the population living there at that time. That's why that specific population is interesting - and we know quite accurately where their genes went).

Matt said...

@ the 10% estimate: Cool to see some quantification. 10% is more or less similar to the 5-7% HG seen in Haak 2015. So if that's correct, average MN in Spain and Germany would probably be around 25% overall "real" European HG (maybe 33% in the Sweden MN), with 15% or so extra HG. Not too "bad" for small hunter gatherer cultures confronted with a farming culture.

@ Krefter, re: Sintashta, well, the Sintashta sites that were sampled are pretty much Russian today, just north of Kazakhstan.

So would we really have assumed little WHG, high ENF and high ANE by using a Sintashta "zombie" based on modern people from the same region?

Not disputing your general point, or anything, about the unreliability of assuming past populations from modern with turnover - e.g. high turnover in Yamnaya culture area, where modern day samples from fairly close to there do not seem to have much more than 50% Yamnaya ancestry at most, due to replacements by East Asian and MN European ancestry, from the closest references in Haak.

Rob said...


"Tocharian is a special case for several reasons: First, it's a very early branch of the IE tree,"

Sure, I get what you're saying, but is it really? Are we sure?

Alberto said...


Experts seem to agree about this point (and that's quite something).

What do you suggest? Any pet theory about it?

André de Vasconcelos said...


The problem with that argument is that you're talking about something that happened over a timespan of nearly 4000 years. Just because the language remained in use from 3000 BC (let's assume this was the case) until the Uyghurs overran the place, it doesn't mean the area wasn't permeable to small influxes of people who were absorbed by the original Tocharian (Afanasievo?) population over time. I'd be baffled if we found out the original peoples were even similar to the later populations; it's just too much time in a place that connects two otherwise separated regions (Silk Road).

Besides, the written manuscripts are from around 700AD, the language could have changed quite a bit since it first arrived in the Tarim Basin. And remember there were at least two branches of Tocharian by this time.

Alberto said...


So you go for the big genetic impact but language continuity scenario as the most likely?

We don't know where the Tocharian speakers lived over those 4000 years. Maybe in the Tarim Basin, maybe not. Maybe they lived isolated in the Pamir Mountains or somewhere else. But we do know they kept their own language for those 4000 years. Of course the language evolved. But it was mostly intra-evolution, not shared with their neighbours.

I could refer to the Basque example. They kept their own language for thousands of years (most likely, at least). Does it mean they had zero genetic impact from outside? Obviously not. But they did have less impact than any other population around them, and they do resemble quite well the population of the area from 3000-5000 years ago.

Why do people concentrate on the obvious caveats instead of on the obvious advantages of this hypothetical exercise? All these hypothetical exercises are far from perfect, but then just throw to the bin any of them you've seen in the last few years here. They all have bigger ones than this rather unique opportunity to peek into the past (without having ancient DNA).

As I said from the beginning, this proves nothing. But it should give some interesting hints for those wanting to see them. Though I understand they don't fit everyone's agenda.

Davidski said...

The Tarim Basin wasn't like Basque Country. It became a multi-cultural hub not long after the ancestors of the R1a Tarim Basin mummies arrived there.

The language of the Tarim Basin mummies could have survived the population turnover if, say, it was the lingua franca in the area.

Rob said...

Alberto / Dave

"But we do know they kept their own language for those 4000 years"

I don't think we know that at all :)

I don't think we have a reason why Tocharian should be a lingua franca for so long. LFs are actively chosen and changed by elites frequently. E.g. in east central Europe, several "LFs" operated over the time span of ~ 500 years (0 AD - 600 AD), from Celtic to Elbe-Germanic to Roman to Gothic to Slavic. (!)

I have no concrete theory as to how Tocharian got to the Tarim. Yes, it simply could have got there 4000 years ago and slowly evolved despite ongoing admixture. Or it could have got there c. 200 AD, after the demise of the Xiongnu, when an IE group from further west filled the vacuum on the NW-most periphery of their former polity (hypothetically, of course).

But on reviewing some of the most recent consensus views, it indeed seems that any more proximate relations between Tocharian and 'western' IE (Celtic, Germanic, Balto-Slavic) are mostly shared retained archaisms and trivial convergent developments, suggesting it is indeed an early offshoot vis-à-vis the others. However, this gives us a chronological end point, not necessarily a geographic one.

Davidski said...


Considering these results, I'd say that Sardinians have 10-15% European hunter-gatherer ancestry. No more than that.

Alberto said...


I agree that the theory of it being a Lingua Franca is rather baseless and unlikely. But a possible offshoot from Celtic people exiting Gaul after the Romans put some pressure on them and making it to the Tarim Basin ca. 200 AD, it does look plausible ;) (but you are right that the early branching doesn't mean that these people stayed in the Tarim Basin all that time, they might have been in the nearby mountains isolated until they appear in the Tarim Basin in historical times).

And now seriously, this test is purely speculative like any other, and it proves nothing at all. But I have to say that if the result was that removing the Mongolian part of Uyghurs we found that they were some Yamnaya types instead, everyone would be saying how much this *proved* that the theory of Tocharian coming from Afanasievo was correct and how this was another step closer to prove in a definitive way the steppe hypothesis, etc, etc... But unfortunately, the result pointed to ANI, so better to highlight any shortcomings.

I'm just too used to hearing how populations supposedly descended from Yamnaya somehow, for the weirdest reasons, magically lose their EHG ancestry while retaining their ANI ancestry untouched.

Davidski said...

ANI is a complex composite. And it's only relevant to South Asians.

Btw, you should go to South London and see how many pure Anglo-Saxons you'll find there among the English speakers.

Alberto said...

David, true. I forgot that the Tarim Basin of 600 AD was the New York of the time. And probably all the traders coming from far away and passing along spoke Tocharian fluently for convenience. Why would Tocharians themselves learn Iranian or Chinese when Tocharian was the de facto lingua franca of trade at the time? It all makes sense now.

As for ANI, call it "Georgian-like" if you prefer. It's not that complex, nor only relevant to South Asia once you change the name. I just think that ANI has some literature defining it, while other names ("Armenian-like", "Georgian-like", Aryans, "Teal people") are more vague and problematic. But choose your own. I'd be happy to settle with any of them.

Rob said...


So is "ANI" basically ANE + "Georgian farmer/ Teal" ?

Arch Hades said...

"If now, the Iberian and LBK Neolithic samples can be modeled as 90% NW Anatolian Neolithic (which may of course be different from PPNB and contain its own Mesolithic elements) and 10% European/Western Hunter Gatherers, do we know where say contemporary Sardinians would fall under this model? A separate question, of course, would be the origin of the Anatolian farmers."

In Haak et al Sardinians were 90% EEF [LBK Early Neolithic], 3% WHG, 7% Yamnaya.

So if LBK_EN is 90% Anatolian farmer and 10% WHG.

Then Sardinians are really around 81% Anatolian farmer, 12% WHG, and 7% Yamnaya.

Their WHG ancestry shoots up a lot with this model.
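
The substitution above is easy to check with a few lines of Python. This is only a sketch of the arithmetic: the helper `expand_component` is a made-up name, the input percentages are the ones quoted above, and it assumes the two sets of estimates can simply be multiplied through, which ignores their error margins.

```python
def expand_component(mixture, component, recipe):
    """Replace `component` in `mixture` with its own sub-mixture `recipe`.

    mixture   -- dict mapping source name to fraction (fractions sum to 1)
    component -- the key in `mixture` to break down further
    recipe    -- dict describing how `component` is itself composed
    """
    share = mixture.get(component, 0.0)
    # Keep every source except the one being broken down.
    expanded = {k: v for k, v in mixture.items() if k != component}
    # Spread the component's share over its own sources.
    for source, fraction in recipe.items():
        expanded[source] = expanded.get(source, 0.0) + share * fraction
    return expanded


# Figures quoted above: Sardinians per Haak et al., and EEF (LBK_EN)
# per the new abstract.
sardinian_haak = {"EEF": 0.90, "WHG": 0.03, "Yamnaya": 0.07}
eef_recipe = {"Anatolian": 0.90, "WHG": 0.10}

sardinian_expanded = expand_component(sardinian_haak, "EEF", eef_recipe)
for source, fraction in sardinian_expanded.items():
    print(f"{source}: {fraction:.0%}")
```

The WHG share is the 3% modeled directly plus the 9% carried inside the EEF component (0.90 x 0.10), which gives the 12% figure; the Anatolian share is 0.90 x 0.90 = 81%.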

Karl_K said...

@Arch Hades

Yeah, like 400% ! Or is it 9% ? I was never that great with statistics. Actually, I guess it's only a 3% difference between them, right? And I wonder what the error rate might be.

But in any case, it is still a pretty small change considering the many thousands of years and kilometers separating them. Remarkable actually.

Matt said...

Re: Sardinians, although if we ever do find this "other side" of the Yamnaya ancestry, that could be a better model as contributor to Sardinian than Yamnaya, and that would affect the WHG level (as Yamnaya has a lot of EHG "baked in" and that affects how the qpAdm would be finding WHG).

(On a tangent, although I have some doubts about how well those qpAdm Haak stats are really recovering proportions of EN and WHG.

There is a really tight correlation between the proportions they recover in their models like Figure S9.27 and shared drift with Yamnaya as modelled by "direct" D stats... but much less of a tight correlation for the proportions they pick up of EN and WHG and what is observed via direct D stats, visualised as bivariate stats and via correlation PCA.

Some of this will be due to the sharing of WHG ancestry through both EN and direct WHG... as well as shared HG via WHG and EHG, but still... I'm not sure if their qpAdm method is really reliably able to distinguish between WHG and EN via outgroups. The qpAdm is clearly effective at picking up ANE/EHG relatedness via Native American vs East Asian contrasts, beyond that, for relatedness to WHG and EN, not so sure.).