Showing posts with label Anatolia. Show all posts

Wednesday, December 4, 2024

The PIE homeland controversy: December 2024 open thread


It seems like we're getting close to the moment when Iosif Lazaridis has to finally admit that the Proto-Indo-European (PIE) homeland was located in Eastern Europe, and also that the ancestors of the Hittites and other Anatolian speakers entered Anatolia via the Balkans.

Let's discuss.


However, please note that comments from total morons, trolls and/or mentally unstable people will not be approved.

See also...

Indo-European crackpottery

Thursday, February 22, 2024

Berkeley, we have a problem


A new preprint at bioRxiv by Kerdoncuff et al. makes the following, somewhat surprising, claim:

One of the individuals, referred to Sarazm_EN_1 (I4290) described above that was discovered with shell bangles showing affiliation with South Asia, has significant amount AHG-related ancestry, while a model without AHG-related ancestry provides the best fit for Sarazm_EN_2 (I4210) (Table S4.5).

First of all, the authors are actually referring to sample ID I4910, not I4210.

The aforementioned table, based on qpAdm output, shows that I4290 has 15.9% AHG-related ancestry and basically no Anatolian farmer-related ancestry. It also shows that I4910 has no AHG-related ancestry but 17.9% Anatolian farmer-related ancestry.

AHG stands for Andaman hunter-gatherer. The authors are using it as a proxy for South Asian hunter-gatherer ancestry.

However, I've looked at I4290 and I4910 in great detail over the years using ADMIXTURE, Principal Component Analysis (PCA), and qpAdm. And I'm quite certain that they do not show any obvious, above noise level South Asian ancestry. Indeed, I'd say that if they do have some minor South Asian ancestry, then I4910 probably has more of it than I4290.

Kerdoncuff et al. used the following "right pops" or outgroups: Ethiopia_4500BP.SG, WEHG, EEHG, ESHG, Dai.DG, Russia_Ust_Ishim_HG.DG, Iran_Mesolithic_BeltCave and Israel_Natufian.

This means they mixed data that were generated in very different ways (DG, SG and capture) and included some poor quality samples. For instance, the highest coverage version of Iran_Mesolithic_BeltCave offers just ~50K SNPs.

Mixing different types of data and relying on low coverage samples, even in part, often has negative consequences when using qpAdm. So I suspect that the above mentioned mixture results for I4290 are skewed by a poor choice of outgroups.

When I run qpAdm I try to stick to one type of data and avoid low quality singletons in the outgroups. This is the best qpAdm model that I can find for Sarazm_EN:

right pops:
Cameroon_SMA
Morocco_Iberomaurusian
Israel_Natufian
Levant_N
Iran_GanjDareh_N
Turkey_N
Russia_Karelia_HG
Russia_WestSiberia_HG
Mongolia_North_N
Brazil_LapaDoSanto_9600BP

Sarazm_EN
Kazakhstan_Botai_Eneolithic 0.113±0.017
Turkmenistan_C_Geoksyur_subset 0.887±0.017
P-value 0.06392

Sarazm_EN_1 (I4290)
Kazakhstan_Botai_Eneolithic 0.129±0.021
Turkmenistan_C_Geoksyur_subset 0.871±0.021
P-value 0.11019

Sarazm_EN_2 (I4910)
Kazakhstan_Botai_Eneolithic 0.104±0.021
Turkmenistan_C_Geoksyur_subset 0.896±0.021
P-value 0.07427

Also...

Sarazm_EN
Andaman_hunter-gatherer -0.018±0.020
Kazakhstan_Botai_Eneolithic 0.123±0.019
Turkmenistan_C_Geoksyur_subset 0.895±0.020
P-value 0.0298403
(Infeasible model)
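For readers who want to see why that last model gets flagged, here's a minimal sketch in Python. The coefficients and P-values are the ones listed above. The helper name `assess` and the 0.05 cut-off are assumptions made for this example (a common convention, not a qpAdm setting).

```python
# Minimal sketch: flag qpAdm-style models as feasible or not.
# Assumption: a model counts as feasible when every mixture coefficient
# lies between 0 and 1, and as passing when its P-value is >= 0.05.
# The 0.05 threshold is a common convention, not a qpAdm setting.

def assess(coefficients, p_value, threshold=0.05):
    """Return (feasible, passes) for one model."""
    feasible = all(0.0 <= c <= 1.0 for c in coefficients.values())
    passes = p_value >= threshold
    return feasible, passes

two_way = {
    "Kazakhstan_Botai_Eneolithic": 0.113,
    "Turkmenistan_C_Geoksyur_subset": 0.887,
}
three_way = {
    "Andaman_hunter-gatherer": -0.018,
    "Kazakhstan_Botai_Eneolithic": 0.123,
    "Turkmenistan_C_Geoksyur_subset": 0.895,
}

print(assess(two_way, 0.06392))      # two-way Sarazm_EN model
print(assess(three_way, 0.0298403))  # model with the Andaman source added
```

The negative Andaman_hunter-gatherer coefficient is what makes the three-way model infeasible under this check, regardless of its P-value.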

Please note that Turkmenistan_C_Geoksyur_subset is made up of just three relatively high quality individuals: I8504, I12483 and I12487. That's because it's not possible to model the ancestry of Sarazm_EN using the full Geoksyur set, probably due to subtle genetic substructures within the latter.

Below is a PCA plot that, more or less, reflects my qpAdm model. I4290 and I4910 are sitting right next to each other in a cluster of ancient Central and Western Asians, and it's actually I4910 that is shifted slightly towards the South Asian pole of the PCA. Indeed, I can confidently say that there's no way to design a PCA in which I4290 is shifted significantly towards South Asia relative to I4910.

Citation...

Kerdoncuff et al., 50,000 years of Evolutionary History of India: Insights from ∼2,700 Whole Genome Sequences, bioRxiv, posted February 20, 2024, doi: https://doi.org/10.1101/2024.02.15.580575

See also...

The Nalchik surprise

A comedy of errors

Thursday, October 27, 2022

The Yassitepe challenge


This is about the only successful qpAdm model that I can find for the pair of Early Bronze Age (EBA) females from Yassitepe, Turkey, using a decent set of outgroups and markers. I wouldn't take it too literally, but it does suggest a potentially significant level of European ancestry, including some steppe ancestry, in these Yassitepe individuals.

TUR_Aegean_Yassitepe_EBA
AZE_Caucasus_lowlands_LN 0.565±0.054
ROU_N 0.387±0.041
RUS_Progress_En 0.048±0.022

P-value 0.103248
Full output
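One way to see why this model shouldn't be taken too literally is to turn each standard error into a rough 95% interval. This is only a sketch: it assumes the estimates are approximately normally distributed, which the qpAdm output itself doesn't guarantee, and `interval` is a helper named just for this example.

```python
# Rough 95% intervals for the Yassitepe coefficients listed above.
# Assumption: estimate +/- 1.96 * SE is an adequate approximation.

def interval(estimate, se, z=1.96):
    """Return (low, high, estimate / se) for one coefficient."""
    return estimate - z * se, estimate + z * se, estimate / se

model = {
    "AZE_Caucasus_lowlands_LN": (0.565, 0.054),
    "ROU_N": (0.387, 0.041),
    "RUS_Progress_En": (0.048, 0.022),
}

for source, (estimate, se) in model.items():
    low, high, ratio = interval(estimate, se)
    print(f"{source}: {low:.3f} to {high:.3f} (estimate/SE = {ratio:.1f})")
```

Under that assumption the RUS_Progress_En interval runs from about 0.5% to about 9%, so the steppe-related share is positive but poorly constrained.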

If anyone reading this can find a better, more convincing solution then I'd love to see it. Feel free to share it in the comments below.

Obviously, both of the Yassitepe samples are from the recent Lazaridis, Alpaslan-Roodenberg et al. paper. Their EBA dating suggests that they might be relevant to the debate over the origins of Anatolian speakers, such as the Hittites and Luwians.

See also...

Dear Iosif, about that ~2%

The precursor of the Trojans

Friday, September 9, 2022

Dear Iosif, about that ~2%


The debate over the location of the so called Indo-Anatolian homeland won't be decided by the persistence of any type of genetic ancestry in ancient Anatolia.

It'll be decided by a multidisciplinary study on the interactions between the ancient peoples of the North Pontic steppe, the eastern Balkans, and western Anatolia.

If such a study finds a pulse of steppe-related gene flow from the Balkans into Anatolia sometime during the early metal ages, it'll corroborate the linguistic hypothesis that a language ancestral to Hittite, Luwian and related tongues moved into Anatolia from Eastern Europe.

Why do we only need a pulse of gene flow, you might ask? Obviously, because:

- language and genetic ancestry can start with a strong association but, since they're not linked, they can eventually follow very different trajectories

- the dilution of genetic ancestry is an important factor, especially in ancient West Asia, and it must be taken into account in models of language spread, rather than ignored in favor of simple, elegant models that do not reflect reality.

Here's my favorite quote from the recent Lazaridis, Alpaslan-Roodenberg et al. paper, because, probably unbeknownst to the authors, it's exceptionally revealing about the spread of a wide range of Indo-European speakers into Anatolia.

However, in individuals from Gordion, a Central Anatolian city that was under the control of Hittites before becoming the Phrygian capital and then coming under the control of Persian and Hellenistic rulers, the proportion of Eastern hunter-gatherer ancestry is only ~2%, a tiny fraction for a region controlled by at least four different Indo-European–speaking groups.

Indeed, this is exactly what the Lazaridis, Alpaslan-Roodenberg et al. paper should've been about. That is, the authors should've given us a painstaking account of the spread of different ancient Indo-European speaking groups into Anatolia and explained how, overall, their DNA was rapidly diluted to a trace amount.

However, instead they treated us to a make-believe tale about a so called Indo-Anatolian homeland in what is now Armenia.

See also...

Dear Iosif...Yamnaya

But Iosif, what about the Phrygians?

Dear Iosif...

Dear Iosif #2

Dear Iosif #3

Sunday, September 4, 2022

But Iosif, what about the Phrygians?


A paper in Science co-authored by around 200 scientists from some of the world's top academic institutions surely must mean something, right? Not necessarily.

In this short blog post I'll try to explain, as simply as I can, why the Lazaridis, Alpaslan-Roodenberg et al. paper doesn't get us any closer to solving the riddle of the so called Indo-Anatolian homeland.

However, it must be said that the paper does include many interesting and valuable samples. I'll be using six of these samples, labeled TUR_C_Gordion_Anc, to argue my case.

The TUR_C_Gordion_Anc sample set is from Gordion, the capital of ancient Phrygia, and thus, in all likelihood, it represents Phrygian speakers.

Phrygian is an Indo-European language and the leading hypothesis is that it originated in the Balkans.

In terms of fine scale ancestry, TUR_C_Gordion_Anc can be reliably divided into two genetic clusters. In the Principal Component Analysis (PCA) below these clusters are labeled TUR_C_Gordion_Anc1 and TUR_C_Gordion_Anc2.

Note that TUR_C_Gordion_Anc1 is obviously pulling away from TUR_C_Gordion_Anc2 towards samples from the Balkans. I've used ancient samples from what is now North Macedonia, labeled MKD_Anc, to represent the Balkans. To see an interactive version of the plot, paste the PCA coordinates from here into the relevant field here.

Visually, this is not an especially dramatic outcome, but it's an incredible result nonetheless, because it shows that even a few ancient samples can help to solve an age old mystery.

Across many dimensions of genetic variation, the shift in the PCA from TUR_C_Gordion_Anc1 to TUR_C_Gordion_Anc2 represents about 20% admixture from the Balkans, and about 8% from the Eastern European steppe. That's plenty enough to corroborate the linguistic hypothesis that the Phrygians originated in the Balkans, and that some of their ancestors came from the steppe. The mixture models below were done with the tools here.

Target: TUR_C_Gordion_Anc1
Distance: 1.6634% / 0.01663373
40.6 Kura-Araxes_ARM_Kaps
22.2 Anatolia_Barcin_N
21.8 MKD_Anc
13.6 Levant_PPNB
1.4 IRN_Ganj_Dareh_N
0.4 Han

Target: TUR_C_Gordion_Anc1
Distance: 1.7109% / 0.01710904
40.2 Kura-Araxes_ARM_Kaps
37.8 Anatolia_Barcin_N
12.4 Levant_PPNB
8.0 Yamnaya_RUS_Samara
1.2 IRN_Ganj_Dareh_N
0.4 Han

Target: TUR_C_Gordion_Anc2
Distance: 2.0293% / 0.02029339
51.0 Kura-Araxes_ARM_Kaps
26.8 Anatolia_Tepecik_Ciftlik_N
17.6 Anatolia_Barcin_N
4.6 Levant_PPNB
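For anyone unfamiliar with how distance-based mixture models like the ones above work in principle, here's a toy sketch. It searches for non-negative source percentages that add up to 100 and bring the weighted average of the source coordinates as close as possible to the target. Everything in it is an assumption for illustration: the 2D coordinates are made up, the source names are placeholders, and the brute-force 1% grid is not the algorithm used by any particular tool.

```python
# Toy distance-based mixture model on made-up 2D coordinates.
# Assumption: proportions are searched on a 1% grid by brute force;
# real tools work on many more dimensions with faster optimizers.
from itertools import product
from math import dist

sources = {
    "source_A": (0.00, 0.00),
    "source_B": (1.00, 0.00),
    "source_C": (0.00, 1.00),
}
# Hypothetical target built as 60% A + 30% B + 10% C.
target = (0.30, 0.10)

def fit(target, sources, steps=100):
    """Return (distance, percentages) for the closest weighted average."""
    names = list(sources)
    best = None
    # Enumerate all non-negative integer weights that sum to `steps`.
    for combo in product(range(steps + 1), repeat=len(names) - 1):
        last = steps - sum(combo)
        if last < 0:
            continue
        weights = combo + (last,)
        model = tuple(
            sum(w * sources[n][axis] for w, n in zip(weights, names)) / steps
            for axis in range(len(target))
        )
        gap = dist(model, target)  # Euclidean distance to the target
        if best is None or gap < best[0]:
            best = (gap, dict(zip(names, weights)))
    return best

distance, mix = fit(target, sources)
print(f"Distance: {100 * distance:.4f}% / {distance:.8f}")
for name, pct in sorted(mix.items(), key=lambda kv: -kv[1]):
    print(pct, name)
```

The printout mimics the layout of the listings above: a distance line followed by the sources sorted by their share.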

Surprisingly, Lazaridis, Alpaslan-Roodenberg et al. didn't have much to say about this topic. This quote basically sums it up:

However, in individuals from Gordion, a Central Anatolian city that was under the control of Hittites before becoming the Phrygian capital and then coming under the control of Persian and Hellenistic rulers, the proportion of Eastern hunter-gatherer ancestry is only ~2%, a tiny fraction for a region controlled by at least four different Indo-European–speaking groups.

I have no doubt that Lazaridis, Alpaslan-Roodenberg et al. can run a very decent PCA, and then blow it up to a size big enough to show that the Gordion samples represent two genetically somewhat distinct groups. I'm also sure that, if they really try, they can locate significant levels of proximate and relevant European ancestry in some of these samples.

They don't have to use my methods; they can use any methods they like. My point is that they won't find much if they're just looking for genetic signals from the Upper Paleolithic or Mesolithic.

Now, considering the way that the Phrygian question was treated by Lazaridis, Alpaslan-Roodenberg et al., despite the fact that they managed to sequence a few likely Phrygian speakers from none other than the Phrygian capital, let's not pretend that their paper brought us any closer to understanding the genetic origins of Anatolian speakers or pinpointing their ancestral homeland.

In order to even try to solve these problems with ancient DNA, we need a wide range of samples from Hittite, Luwian and other key sites where Anatolian languages were spoken. And then we must analyze them properly.

I'm guessing that Lazaridis, Alpaslan-Roodenberg et al. went out of their way to get such samples, but for one reason or another they failed. If so, that's OK, but I have a feeling that even if they got them, they wouldn't know what to do with them, because at best these samples would only show ~2% Eastern hunter-gatherer ancestry. Haha.

For what it's worth, I believe that the ancient data in the Lazaridis, Alpaslan-Roodenberg et al. paper point to the North Pontic steppe as the Indo-Anatolian homeland, and I'll lay out my arguments in an upcoming blog post.

See also...

Dear Iosif...Yamnaya

Dear Iosif, about that ~2%

Dear Iosif...

Dear Iosif #2

Dear Iosif #3

Saturday, August 27, 2022

Dear Iosif...

Update 29/08/22: Dear Iosif #2

...


I'm skimming through the Lazaridis, Alpaslan-Roodenberg et al. paper that just came out at Science. And I feel like someone punched me in the face.

Nevertheless, I'll try to be diplomatic. Suffice it to say, for now, that there's some rather strange stuff in this paper.

The main problem is that the authors are attempting to study fine scale ancestry with a somewhat rough distal model. As a result, they miss important details.

For instance, this quote is from the paper's supplementary PDF file, freely available here.

However, the complete lack of association of R-haplogroup descendants and EHG ancestry in either Armenia or Iran is consistent with either a massive dilution of EHG ancestry in these populations resulting in the dissociation of Y-chromosome lineages from autosomal ancestry over time, or with a scenario in which R-M269 was not associated with substantial EHG ancestry to begin with.

Obviously, EHG means Eastern European Hunter-Gatherer. But why focus on EHG? Surely, this makes little sense when looking at the genetic prehistory of West Asia, because no one ever argued that this region was settled by EHG populations. It was widely settled by Yamnaya-related groups, with already heavily diluted EHG ancestry, during the metal ages.

OK, so the authors are actually aware of the potential dilution of EHG ancestry, but they don't really do anything about it.

If we're looking at the origins of West Asian R1b-M269, and using its association with autosomal DNA components as a guide, then we should be focusing on Yamnaya-related ancestry.

For instance, here's a fine scale ancient ancestry model based on Principal Component Analysis (PCA) data. It shows the ancestry proportions of two relatively high coverage Iron Age males from two different sites in Iran from the Lazaridis, Alpaslan-Roodenberg et al. dataset. Both belong to R1b-M269 and both show significant Yamnaya-related ancestry.

Target: IRN_HajjiFiruz_IA:I2327_all
Distance: 2.2930% / 0.02292994
39.6 Kura-Araxes_ARM_Kaps
24.2 IRN_Ganj_Dareh_N
18.2 Levant_PPNB
12.4 Yamnaya_RUS_Samara
4.4 Anatolia_Tepecik_Ciftlik_N
1.2 Han

Target: IRN_Hasanlu_IA:I4232_all
Distance: 2.5179% / 0.02517895
26.0 IRN_Ganj_Dareh_N
25.6 Kura-Araxes_ARM_Kaps
24.4 Anatolia_Tepecik_Ciftlik_N
15.8 Yamnaya_RUS_Samara
7.6 Levant_PPNB
0.6 IRN_Shahr_I_Sokhta_BA2

As a control, here's an earlier, Chalcolithic sample bearing Y-haplogroup J2b from the same region. Not surprisingly, this individual totally lacks the Yamnaya-related signal.

Target: IRN_HajjiFiruz_ChL:I4241_all
Distance: 2.7938% / 0.02793782
32.6 Kura-Araxes_ARM_Kaps
25.6 IRN_Ganj_Dareh_N
23.6 Anatolia_Tepecik_Ciftlik_N
18.2 Levant_PPNB

Overall, these results make perfect sense. I could probably locate very minor signals of EHG ancestry in the Iron Age samples, but that would be more difficult and much less certain, so I won't bother.

Soon I'll be able to rerun these analyses with Bronze Age samples from Dagestan and surrounds. That should bump up the levels of Yamnaya-related ancestry and improve the statistical fits (wink, wink, nudge, nudge).

Disappointingly, Lazaridis, Alpaslan-Roodenberg et al. go so far as to suggest that R1b-M269 may have originated in West Asia.

However, considering the scores of ancient Eastern European populations rich in R1b-M269 and many near and far related subclades of R1b, this makes no sense whatsoever.

Indeed, contemplating nowadays that R1b-M269 might be native to West Asia, where R1b only starts showing up in the ancient DNA record during the Copper Age, is about as stupid as claiming that gravity doesn't exist.

Largely due to their distal model approach, Lazaridis, Alpaslan-Roodenberg et al. also argue that the Indo-Anatolian homeland was located in what is now Armenia and surrounds. I'm far from convinced that this solution will stand the test of time.

In terms of the more widely accepted theory that the Indo-Anatolian homeland was located on the Pontic-Caspian steppe in Eastern Europe, the most important samples in the paper are the three Bronze Age individuals from Yassitepe in western Anatolia. That's because they're from a region that is traditionally seen as the entry point of Indo-Anatolian speakers into Anatolia from the European steppe via the Balkans.

Interestingly, individual I5737, dated to 2035-1900 calBCE or the Middle Bronze Age, belongs to Y-chromosome haplogroup I2a-P78, which surely must be a signal of European ancestry. I see this as a significant result.

Here's how the trio from Yassitepe look in my fine scale ancient ancestry model. Minor Yamnaya-related ancestry does show up, although, admittedly, it might just be noise in individual I5735.

Target: TUR_Aegean_Izmir_Yassitepe_MBA:I5737
Distance: 2.7507% / 0.02750748
58.4 Anatolia_Barcin_N
20.4 Kura-Araxes_ARM_Kaps
9.2 Anatolia_Tepecik_Ciftlik_N
5.6 IRN_Ganj_Dareh_N
3.8 Yamnaya_RUS_Samara
2.6 Levant_PPNB

Target: TUR_Aegean_Izmir_Yassıtepe_EBA:I5733
Distance: 2.7969% / 0.02796887
52.0 Anatolia_Barcin_N
27.2 Kura-Araxes_ARM_Kaps
8.6 Levant_PPNB
6.2 Yamnaya_RUS_Samara
6.0 IRN_Ganj_Dareh_N

Target: TUR_Aegean_Izmir_Yassıtepe_EBA:I5735
Distance: 3.1270% / 0.03127009
36.0 Kura-Araxes_ARM_Kaps
32.4 Anatolia_Tepecik_Ciftlik_N
26.0 Anatolia_Barcin_N
2.8 IRN_Shahr_I_Sokhta_BA2
1.2 Yamnaya_RUS_Samara
1.0 Levant_PPNB
0.6 MAR_Taforalt

This isn't much, especially considering it's already late 2022, but it's better than nothing. Fortunately, more samples from Bronze Age western Anatolia are on the way (wink, wink, nudge, nudge).

However, I'm not done with the Lazaridis, Alpaslan-Roodenberg et al. dataset yet. I'm planning to spend much more time on this blog in the coming weeks and months and will be using their samples in a wide range of analyses.

Citation...

Iosif Lazaridis, Songül Alpaslan-Roodenberg et al., The genetic history of the Southern Arc: A bridge between West Asia and Europe, Science 377, eabm4247 (2022)

See also...

Dear Iosif #3

But Iosif, what about the Phrygians?

Dear Iosif, about that ~2%

Dear Iosif...Yamnaya

Saturday, June 18, 2022

David Reich on the origin of the Yamnaya people (!?)


Harvard's David Reich is doing a talk next month about the genetic history of West Asia and nearby parts of Europe. This is a quote from an online abstract of the talk (found here).

The impermeability of Anatolia to exogenous migration contrasts with our finding that the Yamnaya had two distinct gene flows, both from West Asia, suggesting that the Indo-Anatolian language family originated in the eastern wing of the Southern Arc and that the steppe served only as a secondary staging area of Indo-European language dispersal.

If this is actually what David Reich is going to claim then I'd say his team has a lot of work to do before they put out their paper on the topic.

First of all, Yamnaya did not have two distinct gene flows from West Asia. I don't even know what that means exactly, but there's no way that this statement is correct no matter how one interprets it.

In fact, the Yamnaya population formed on the Pontic-Caspian steppe from earlier groups native to this part of Eastern Europe, such as the people associated with the Sredny Stog culture.

That is, there were no migrations from West Asia into Eastern Europe that can be claimed to have been instrumental in the emergence of the Yamnaya population. On the other hand, Yamnaya may have been significantly influenced by cultural impulses from West Asia, but this is nothing new.

In terms of deep population structure, the Yamnaya genotype can be described as a mixture between Eastern European and West Asian-related genetic components. However, these Asian-related components were already in Europe thousands of years before Yamnaya came into existence.

Indeed, soon to be published ancient DNA shows that hunter-gatherers very similar to the Yamnaya people, packing quite a lot of West Asian-related ancestry, lived in the Middle Don region (just north of the Pontic-Caspian steppe) well before 5,000 BCE (see here).

So, did the West Asian ancestors of these Middle Don hunter-gatherers speak Proto-Indo-European, or, as David Reich calls it, Indo-Anatolian? Keep in mind that most linguists put the birth of Indo-Anatolian around 4,000 BCE, which is actually the Sredny Stog period.

Moreover, in underlining Anatolia's supposed impermeability to exogenous migration, David Reich is arguing against things that no one worth their salt ever claimed. That's because the spread of Indo-Anatolian speakers into Anatolia has never really been described by archeologists and linguists as a massive migration, but rather as an infiltration into lands already heavily populated by the Hattians (for instance, see here).

We may have already seen the genetic evidence of this infiltration in the presence of steppe Y-chromosome haplogroup R-V1636 in a Chalcolithic burial at Arslantepe (see here and here). Let's wait and see what else crops up over the next few years as many more ancient Anatolian genomes are sequenced by David Reich and colleagues.


Wednesday, February 2, 2022

The PIE homeland controversy: February 2022 status report


I think we'll see the emergence of two main competing proto-Indo-European (PIE) homeland theories over the next few years:

- a homeland in the Eneolithic North Caucasus, and the spread of Anatolian languages into West Asia with Maykop-related ancestry

- a homeland in the North Pontic region, possibly within the Eneolithic Sredny Stog archeological culture, and the spread of Anatolian languages into West Asia via the Balkans.

Both theories have support from ancient DNA. Some of it has already been published (for instance, see here).

At this point, I can see myself firmly in the North Pontic camp, even if it turns out that North Pontic-related ancestry only made a fleeting impact on Bronze Age Anatolia.

After all, there's no direct relationship between genes and languages, so to prove that Anatolian languages came from the North Pontic, there's no need for North Pontic-related ancestry to persist in Anatolia, as long as we have solid evidence that people with this type of ancestry moved there at the right time.

In my mind, for now, the Maykop culture provides an excellent explanation for non-Indo-European influences in PIE, and there's no need to make it Indo-European speaking, let alone PIE speaking.

See also...

The PIE homeland controversy: June 2021 status report

Sunday, January 17, 2021

A tantalizing link


A new paper at PLoS ONE reports on the first human genomes reliably associated with the Single Grave culture (SGC). They were sequenced from remains in a burial at Gjerrild, Denmark, roughly dating to 2,500 BCE.

Surprisingly, one of the male genomes belongs to Y-haplogroup R1b-V1636, which is an exceedingly rare marker both in ancient and present-day populations.

However, the results do make sense, because the earliest instances of R1b-V1636 are in three Eneolithic males from burial sites on the Pontic-Caspian (PC) steppe in Eastern Europe, which is precisely where one would expect to find the paternal ancestors of the SGC population. The SGC, of course, is the westernmost variant of the Corded Ware culture (CWC), and there's very little doubt nowadays that the CWC had its roots on the PC steppe.

A Copper Age individual from Arslantepe in central Anatolia also belongs to R1b-V1636, which suggests that Northern Europe shared a very specific link with Anatolia via Eastern Europe during a period generally regarded to have been the time of early Indo-European dispersals.

Numerous SGC barrows or kurgans dot the landscape in what are now the Netherlands, northwestern Germany and Denmark. Unfortunately, most SGC human remains have been eaten up by the acidic soils that exist in this area.

Citation: Egfjord AF-H, Margaryan A, Fischer A, Sjögren K-G, Price TD, Johannsen NN, et al. (2021) Genomic Steppe ancestry in skeletons from the Neolithic Single Grave Culture in Denmark. PLoS ONE 16(1): e0244872. https://doi.org/10.1371/journal.pone.0244872

See also...

Maykop ancestry in Copper Age Arslantepe

Wednesday, September 16, 2020

Domestic horses were introduced into Anatolia and Transcaucasia during the Bronze Age (Guimaraes et al. 2020)


Over at Science Advances at this LINK. This is a very important paper because it basically eliminates West Asia as the source of the modern domestic horse lineage, which leaves the Pontic-Caspian steppe in Eastern Europe as the only viable option.

It also corroborates the linguistic theory that the Proto-Indo-European homeland was located on the Pontic-Caspian steppe. That's because the horse is a key animal in the Proto-Indo-European pantheon, and it appears in Indo-European mythology in intricate roles. This suggests that the speakers of Proto-Indo-European weren't just familiar with the horse but also managed to domesticate it. From the paper:

Abstract: Despite the important roles that horses have played in human history, particularly in the spread of languages and cultures, and correspondingly intensive research on this topic, the origin of domestic horses remains elusive. Several domestication centers have been hypothesized, but most of these have been invalidated through recent paleogenetic studies. Anatolia is a region with an extended history of horse exploitation that has been considered a candidate for the origins of domestic horses but has never been subject to detailed investigation. Our paleogenetic study of pre- and protohistoric horses in Anatolia and the Caucasus, based on a diachronic sample from the early Neolithic to the Iron Age (~8000 to ~1000 BCE) that encompasses the presumed transition from wild to domestic horses (4000 to 3000 BCE), shows the rapid and large-scale introduction of domestic horses at the end of the third millennium BCE. Thus, our results argue strongly against autochthonous independent domestication of horses in Anatolia.

Guimaraes et al., Ancient DNA shows domestic horses were introduced in the southern Caucasus and Anatolia during the Bronze Age, Science Advances 16 Sep 2020: Vol. 6, no. 38, eabb0030, DOI: 10.1126/sciadv.abb0030


Tuesday, August 11, 2020

Villabruna people existed in Europe at least 17,000 years ago (Bortolini et al. 2020 preprint)


Over at bioRxiv at this LINK. So, like I said here a few years back, there was no migration into Europe from the Near East ~14,000 years ago. I don't think there was even such a migration ~17,000 years ago. My view is that the so called Villabruna cluster formed somewhere in Europe at least 20,000 years ago. Below is the Bortolini et al. abstract, emphasis is mine:

The end of the Last Glacial Maximum (LGM) in Europe (~16.5 ka ago) set in motion major changes in human culture and population structure. In Southern Europe, Early Epigravettian material culture was replaced by Late Epigravettian art and technology about 18-17 ka ago at the beginning of southern Alpine deglaciation, although available genetic evidence from individuals who lived ~14 ka ago opened up questions on the impact of migrations on this cultural transition only after that date. Here we generate new genomic data from a human mandible uncovered at the Late Epigravettian site of Riparo Tagliente (Veneto, Italy), that we directly dated to 16,980-16,510 cal BP (2σ). This individual, affected by a low-prevalence dental pathology named focal osseous dysplasia, attests that the very emergence of Late Epigravettian material culture in Italy was already associated with migration and genetic replacement of the Gravettian-related ancestry. In doing so, we push back by at least 3,000 years the date of the diffusion in Southern Europe of a genetic component linked to Balkan/Anatolian refugia, previously believed to have spread during the later Bolling/Allerod warming event (~14 ka ago). Our results suggest that demic diffusion from a genetically diverse population may have substantially contributed to cultural changes in LGM and post-LGM Southern Europe, independently from abrupt shifts to warmer and more favourable conditions.

Bortolini et al., Early Alpine human occupation backdates westward human migration in Late Glacial Europe, bioRxiv, posted August 10, 2020, doi: https://doi.org/10.1101/2020.08.10.241430

See also...

Villabruna cluster =/= Near Eastern migrants

Tuesday, June 30, 2020

The precursor of the Trojans


Who remembers kum4 from Omrak et al. 2016? I'm pretty sure now that this individual packs a lot of ancestry from the Pontic-Caspian (PC) steppe.

If so, that's a big deal, because her Chalcolithic (or Late Neolithic?) burial was located at Kumtepe. That is, in the same part of Anatolia as the later settlement of Troy, which may have been founded by early Anatolian speakers from Eastern Europe (see here).

The qpAdm mixture models below, featuring kum4 and the likely older kum6, also from Kumtepe, are based on qpfstats output. qpfstats is a new program from the David Reich Lab specifically designed to help analyze low coverage ancients (see here). And kum4 is certainly that.

TUR_Kumtepe_N_kum4
RUS_Progress_En 0.383±0.114
TUR_Barcin_N 0.617±0.114
chisq 7.868
tail prob 0.247957
Full output

TUR_Kumtepe_N_kum4
IRN_Seh_Gabi_C 0.325±0.150
TUR_Barcin_N 0.675±0.150
chisq 14.736
tail prob 0.0224096
Full output

TUR_Kumtepe_N_kum6
RUS_Progress_En 0.121±0.042
TUR_Barcin_N 0.879±0.042
chisq 21.790
tail prob 0.00132149
Full output

TUR_Kumtepe_N_kum6
IRN_Seh_Gabi_C 0.283±0.059
TUR_Barcin_N 0.717±0.059
chisq 6.289
tail prob 0.391566
Full output

Indeed, kum4 and kum6 offer just ~10,000 and ~100,000 "valid SNPs", respectively (see here). However, if nothing else, the results are clearly not random.

For one, they fit the expected pattern, with the likely older individual lacking ancestry from the PC steppe (her model with RUS_Progress_En shows a weak statistical fit). Moreover, the qpAdm mixture ratios align almost perfectly with the results in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation. Coincidence?
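To make the link between the chisq and tail prob lines in the models above a little more concrete, here's a small sketch that computes the upper-tail probability of a chi-square statistic. The degrees of freedom aren't stated in this post, so the value of 6 used here is an assumption, and `tail_prob` is a helper written for this example rather than anything taken from qpAdm.

```python
# Upper-tail chi-square probability for an even number of degrees of freedom.
# Assumption: 6 degrees of freedom, which is not stated in the post.
from math import exp

def tail_prob(chisq, df=6):
    """P(X >= chisq) for a chi-square variable with even df."""
    if df <= 0 or df % 2:
        raise ValueError("the closed form below needs a positive even df")
    half = chisq / 2.0
    term, total = 1.0, 1.0
    # Sum half**k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total

for label, chisq in [
    ("kum4 + RUS_Progress_En", 7.868),
    ("kum4 + IRN_Seh_Gabi_C", 14.736),
    ("kum6 + RUS_Progress_En", 21.790),
    ("kum6 + IRN_Seh_Gabi_C", 6.289),
]:
    print(f"{label}: {tail_prob(chisq):.4f}")
```

With that assumption the function agrees with the reported tail probabilities to about three decimal places, which is why 6 was chosen. The higher the chisq, the lower the tail probability, and the weaker the statistical fit.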

See also...

Perhaps a hint of things to come

Tuesday, June 9, 2020

Maykop ancestry in Copper Age Arslantepe


At least four individuals from the Late Chalcolithic (LC) burial site of Arslantepe show ancestry typical of the population associated with the contemporaneous Maykop culture in the North Caucasus. They are ART018, ART020, ART027 and ART039 from the recent Skourtanioti et al. paper. I've labeled them TUR_Arslantepe_LC_Maykop in my qpAdm mixture model below:

TUR_Arslantepe_LC_Maykop
RUS_Maykop_Novosvobodnaya 0.318±0.041
TUR_Arslantepe_LC 0.682±0.041
chisq 9.969
tail prob 0.533159
Full output

Considering the tight statistical fit, I think it's even possible that some of these people harbor direct ancestry from Maykop Novosvobodnaya. Here's a Principal Component Analysis (PCA) showing why my qpAdm model works so well. It was produced with the data in the text file here and the Vahaduo PCA tools here.



Moreover, one of the Arslantepe males, ART038, belongs to Y-haplogroup R1b-V1636 (R1b1a2). This is clearly a marker of paternal steppe ancestry, because it's been reported in two Eneolithic samples from the southernmost part of the Pontic-Caspian steppe near the North Caucasus foothills (see here). These individuals are dated to ~4,200 calBCE, so they lived about a thousand years earlier than ART038.

ART038 probably lacks steppe and Maykop-related ancestries on his autosomes. Nevertheless, my point about his Y-haplogroup stands, because autosomal admixture can be bred out and disappear completely within a couple hundred years, or about 6 to 8 generations.
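The arithmetic behind that claim can be sketched as follows. This is a simplified illustration, not a population genetic simulation. It assumes a founder with 100% steppe ancestry whose descendants mate only with people lacking that ancestry, so the expected autosomal share halves each generation. The generation length of 28 years is also an assumption.

```python
def expected_autosomal_fraction(generations: int, start: float = 1.0) -> float:
    """Expected autosomal ancestry share after repeated matings with unadmixed partners."""
    # Each child inherits half of its autosomes from each parent,
    # so the expected share halves with every such generation.
    return start * 0.5 ** generations


# Assumption: roughly 28 years per generation.
YEARS_PER_GENERATION = 28

for g in (6, 8):
    frac = expected_autosomal_fraction(g)
    print(f"{g} generations (~{g * YEARS_PER_GENERATION} years): {frac:.2%} expected")
```

After 6 to 8 such generations the expected share is already well under 2%, and in practice it can drop to zero because ancestry is inherited in chunks. The Y-chromosome, on the other hand, is passed from father to son essentially unchanged.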

Interestingly, Skourtanioti et al. argued against the possibility of significant steppe and Maykop-related ancestries in the Arslantepe LC samples. They also didn't see R1b-V1636 as an obvious signal of paternal steppe ancestry. I find this very puzzling indeed, because to me it seems way off the mark. From the paper:

However, R1b-V1636 and R1b-Z2103 lineages split long before (~17 kya) and therefore there is no direct evidence for an early incursion from the Pontic steppe during the main era of Arslantepe. Lineage L2-L595 found in ALA084 (Alalakh) has previously been reported in one individual from Chalcolithic Northern Iran (Narasimhan et al., 2019) and in three males from the Late Maykop phase in the North Caucasus (Wang et al., 2019). These three share ancestry from the common Anatolian/Iranian ancestry cline described here, which indicates a widespread distribution that also reached the southern margins of the steppe zone north of the Caucasus mountain range.

See also...

Perhaps a hint of things to come

An early Mitanni?

How relevant is Arslantepe to the PIE homeland debate?

Tuesday, June 2, 2020

Perhaps a hint of things to come


It's still a mystery how the Hittites and other Anatolian speakers ended up in the Near East. However, the leading theory is that their ancestors migrated from the steppes of Eastern Europe to western Anatolia via the Balkans sometime during the Copper Age.

Consider the qpAdm mixture models below, made possible thanks to some of the ancient samples published recently along with Skourtanioti et al. 2020. The key ancients are described in a text file available here.

TUR_Barcin_C
AZE_Caucasus_lowlands_LN 0.471±0.094
RUS_Vonyuchka_En 0.148±0.040
TUR_Barcin_N 0.381±0.069
chisq 12.874
tail prob 0.116261
Full output

TUR_Barcin_C
RUS_Vonyuchka_En 0.107±0.029
TUR_Buyukkaya_EC 0.893±0.029
chisq 12.107
tail prob 0.207331
Full output

I'd say it's quite clear now that TUR_Barcin_C harbors minor ancestry from the Pontic-Caspian (PC) steppe. The reason this isn't widely accepted yet is because demonstrating it convincingly hasn't been possible without a proximate Anatolian ancestry source for TUR_Barcin_C, precisely like TUR_Buyukkaya_EC.
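One informal way to gauge how firmly the steppe coefficient differs from zero is to divide each estimate by its standard error. This is only a rough heuristic, not part of the qpAdm output shown above, and the function name is hypothetical.

```python
def z_score(estimate: float, std_err: float) -> float:
    """Estimate divided by its standard error: a rough distance from zero."""
    return estimate / std_err


# RUS_Vonyuchka_En coefficients from the two TUR_Barcin_C models above.
steppe_coefficients = {
    "three-way model": (0.148, 0.040),
    "two-way model": (0.107, 0.029),
}

for label, (estimate, std_err) in steppe_coefficients.items():
    print(f"{label}: {estimate:.3f} / {std_err:.3f} = {z_score(estimate, std_err):.2f}")
```

Both coefficients come out at about 3.7 standard errors above zero.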

Admittedly, though, the statistical fits in my models aren't all that great. I suspect the problem lies with RUS_Vonyuchka_En, which is likely to be a rather poor stand-in for the people who brought steppe ancestry, and possibly early Anatolian speech, to western Anatolia.

So let's see what happens when I try a more proximate reference for the steppe ancestry in TUR_Barcin_C. How about Yamnaya_BGR, an individual of mixed Balkan and steppe origin from what is now Bulgaria?

TUR_Barcin_C
AZE_Caucasus_lowlands_LN 0.518±0.075
TUR_Barcin_N 0.203±0.056
Yamnaya_BGR 0.279±0.067
chisq 10.602
tail prob 0.225269
Full output

TUR_Barcin_C
TUR_Buyukkaya_EC 0.749±0.058
Yamnaya_BGR 0.251±0.058
chisq 9.687
tail prob 0.376414
Full output

That's a little better. Unfortunately, the problem now is that the models are anachronistic, because TUR_Barcin_C is about a thousand years older than Yamnaya_BGR. Clearly, we need more Copper Age samples from the western edge of the PC steppe, the eastern Balkans, and especially northwestern Anatolia.

The Principal Component Analysis (PCA) below effectively illustrates why my qpAdm models work. It was produced with Global25 data using the Vahaduo PCA tools freely available here. Note that TUR_Barcin_C is shifted away from the essentially perfect cline formed by AZE_Caucasus_lowlands_LN, TUR_Barcin_N and TUR_Buyukkaya_EC towards samples from ancient Eastern Europe, including Yamnaya_BGR.


See also...

Steppe invaders in the Bronze Age Balkans

Thursday, May 28, 2020

An early Mitanni?


I've updated my Global25 datasheets with most of the ancients from the new Skourtanioti et al. paper. Here's a Principal Component Analysis (PCA) based on the data. It was produced with the Vahaduo PCA tools freely available here and the text file here.


Note that one of the Bronze Age females from Alalakh, labeled ALA019, appears to have ancestry from Turan and the Eurasian steppe. She may well have been a Mitanni of Indo-Aryan origin.

Interestingly, a Copper Age male from Arslantepe, ART038, belongs to Y-haplogroup R1b1a2 aka R1b-V1636. This is an unusual find, because R1b hasn't yet been reported in any Copper Age or earlier samples from outside of Europe and the Eurasian steppe.

As far as I can tell, this individual doesn't harbor any genome-wide ancestry from north of the Caucasus. However, R1b-V1636 is a rare lineage that is first attested in Eneolithic samples from the North Caucasus Piedmont steppe, so ART038's Y-chromosome might be the first evidence of the presence of steppe ancestry in Copper Age Anatolia.

I've also added most of the ancients from the new Agranat-Tamir et al. paper to the Global25 datasheets. The PCA below is based on the text file available here.


The Megiddo samples include a trio of interesting outliers dated to 1600-1500 BCE with significant ancestry from the steppe. One of these individuals is a male, I2189, who belongs to Y-haplogroup R and probably R1a. So he might also be of Indo-Aryan origin.

Another Megiddo male, S10768, belongs to R1b-M269 and probably shows a few per cent of steppe ancestry. I've already discussed how R1b and steppe ancestry may have ended up in the Bronze Age Near East in a couple of my previous posts:

R1b-M269 in the Bronze Age Levant

How did steppe ancestry spread into the Biblical-era Levant?

R-V1636: Eneolithic steppe > Kura-Araxes?

Sunday, July 7, 2019

How did steppe ancestry spread into the Biblical-era Levant?


It's likely that at least two of the Philistines from Feldman et al. 2019 harbor relatively recent steppe ancestry. They're labeled ASH067 and ASH068 in the paper. The former individual is a male who belongs to Y-chromosome haplogroup R1, which appears to be R1b-M269 judging by the data from the relevant BAM file.

This is just the second instance of Y-haplogroup R1 from the pre-Crusades Levant, and, of course, neither R1 nor R1b-M269 appears in the Near Eastern ancient DNA record prior to the expansions of the Yamnaya and other closely related pastoralist groups from the steppes and forest steppes of Eastern Europe.

So how did the Yamnaya-related ancestry spread into the Biblical-era Levant? Did it come via Anatolia, the Caucasus and/or the Mediterranean?

To try and answer this question I analyzed separately the genome-wide data for ASH067 and ASH068 with qpAdm, relying on outgroup and reference populations that weren't featured in the qpAdm runs in the Feldman et al. paper. I also limited the analyses to what were in my view the most proximate two- and three-way solutions in terms of chronology and geography.

The models with the best statistical fits, each labeled with their "tail probs", are available in a zip file here. From my experience with qpAdm, I'd say that the most useful models generally show comparatively high tail probs but low chisq values and standard errors. Please note also that I discarded all of the models with at least one standard error higher than 0.2 and/or based on fewer than 100K SNPs.
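The screening rule described above can be sketched like this. The cutoffs (0.2 and 100K SNPs) come from the text, but the class, the field names and all of the model values are hypothetical placeholders, because the zip file contents and SNP counts aren't reproduced here.

```python
from dataclasses import dataclass


@dataclass
class QpAdmModel:
    name: str
    std_errors: tuple  # one standard error per ancestry source
    n_snps: int
    chisq: float
    tail_prob: float


MAX_STD_ERR = 0.2   # discard if any standard error is higher than this
MIN_SNPS = 100_000  # discard if based on fewer SNPs than this


def passes_filters(model: QpAdmModel) -> bool:
    """Keep a model only if every standard error and the SNP count are acceptable."""
    return max(model.std_errors) <= MAX_STD_ERR and model.n_snps >= MIN_SNPS


# Toy values for illustration only, not real qpAdm output.
toy_models = [
    QpAdmModel("toy_a", (0.05, 0.05), 250_000, 5.1, 0.92),
    QpAdmModel("toy_b", (0.25, 0.25), 300_000, 4.0, 0.97),        # standard error too high
    QpAdmModel("toy_c", (0.10, 0.12, 0.11), 80_000, 6.2, 0.80),   # too few SNPs
    QpAdmModel("toy_d", (0.06, 0.11, 0.12), 150_000, 9.0, 0.53),
]

kept = sorted(
    (m for m in toy_models if passes_filters(m)),
    key=lambda m: m.tail_prob,
    reverse=True,
)
for m in kept:
    print(f"{m.name}: tail prob {m.tail_prob}, chisq {m.chisq}, max SE {max(m.std_errors)}")
```

Note that toy_b has the highest tail prob of the lot but is still dropped, which is the point of the standard error cutoff: a model can pass easily just because its estimates are too noisy to reject.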

As far as I can see, these two are among the very best outcomes. Bell_Beaker_FRA are nine samples associated with the Bell Beaker culture (BBC) from what is now France. Interestingly, the BBC population was rich in Y-haplogroup R1b-M269.

Levant_ISR_Ashkelon_IA1_ASH067
Bell_Beaker_FRA 0.116±0.059
GRC_Minoan 0.507±0.111
Levant_ISR_Ashkelon_LBA 0.377±0.117
chisq 9.018
tail prob 0.530432

Levant_ISR_Ashkelon_IA1_ASH068
Bell_Beaker_FRA 0.237±0.044
GRC_Minoan 0.763±0.044
chisq 4.736
tail prob 0.943265

In my opinion, these models basically confirm that both ASH067 and ASH068 harbor Yamnaya-related ancestry. It's heavily diluted and minor, but it's there. Admittedly, even after looking over the qpAdm output several times, I'm still not quite sure how their ancestors acquired this ancestry. But for the time being, Mediterranean Europe appears to be the most plausible proximate source one way or another. Any thoughts about that? Feel free to share them in the comments below.

See also...

Evidence of European ancestry in the Philistines

R1b-M269 in the Bronze Age Levant

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Wednesday, July 3, 2019

Evidence of European ancestry in the Philistines


The abstract below has just appeared at the European Nucleotide Archive (see here), so I'm guessing that the relevant paper and accompanying ancient genome-wide data will be published within weeks if not days. Emphasis is mine:

The ancient Mediterranean port-city of Ashkelon, identified as “Philistine” during the Iron Age, underwent a dramatic cultural change between the Late Bronze- and the early Iron- Age. It has been long debated whether this change was driven by a substantial movement of people, possibly linked to a larger migration of the so-called “Sea Peoples”. Here, we report genome-wide data of ten Bronze- and Iron- Age individuals from Ashkelon. We find that the early Iron Age population was genetically distinct due to a European related admixture. Interestingly, this genetic signal is no longer detectible in the later Iron Age population. Our results support that a migration event occurred during the Bronze- to Iron- Age transition in Ashkelon but did not leave a long-lasting genetic signature.

Update 4/7/2019: The paper is now available at Science Advances [LINK]. One of the Ashkelon ancients, who also shows a relatively high level of European ancestry, belongs to Y-Chromosome haplogroup R1 (probably R1b-M269). I've updated my Global25 datasheets with the new samples. Look for the Levant_ISR_Ashkelon prefix. Same links as always...

Global25 datasheet ancient scaled

Global25 pop averages ancient scaled

Global25 datasheet ancient

Global25 pop averages ancient

This is how they cluster in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation. The relevant datasheet is available here. Based on these results, it's tempting to think that the European ancestry in the Philistines may have been of Greek provenance. But keep in mind that this is just a two dimensional view and a simplification of reality. I'll have more to say about the ancestry of these individuals and the origins of the Philistines in future blog posts.

See also...

Five foot Philistines

How did steppe ancestry spread into the Biblical-era Levant?

Saturday, February 23, 2019

Catacomb > Armenia_MLBA


It's now clear, thanks to ancient DNA, that Transcaucasia and surrounds were affected by multiple, and at times significant, population movements from Eastern Europe during the Chalcolithic and Bronze Age periods. Based on the ancient samples from what is now Armenia, I'd say that this process peaked during the Middle Bronze Age. But who exactly were the people who perhaps swarmed south of the Caucasus at this time?

The most likely suspects are the various groups that occupied the southernmost parts of the Pontic-Caspian steppe throughout the Bronze Age. They were associated with the so-called Catacomb, Kubano-Tersk and Yamnaya archeological cultures. Below is a Principal Component Analysis (PCA) that compares samples from these cultures with those from Middle to Late Bronze Age Armenia (labeled Armenia_MLBA). The relevant datasheet is available here.


Note that Armenia_MLBA forms a cline that appears to be stretching out towards the Catacomb, Kubano-Tersk, Yamnaya and other Bronze Age steppe groups, and this suggests that it harbors significant and probably recent steppe-related ancestry. But PCA plots based on just two dimensions of genetic variation can be misleading at times, so let's check this out with some formal mixture models using qpAdm.

Armenia_MLBA
Catacomb 0.234±0.028
Kura-Araxes_Kaps 0.766±0.028
chisq 10.723
tail prob 0.826248
Full output

Armenia_MLBA
Kubano-Tersk 0.254±0.030
Kura-Araxes_Kaps 0.746±0.030
chisq 13.535
tail prob 0.633284
Full output

Armenia_MLBA
Kura-Araxes_Kaps 0.768±0.028
Yamnaya_Kalmykia 0.232±0.028
chisq 14.454
tail prob 0.564954
Full output

Armenia_MLBA
Kura-Araxes_Kaps 0.762±0.029
Yamnaya_Caucasus 0.238±0.029
chisq 15.916
tail prob 0.458816
Full output

All of these models are statistically very sound, and even though I ranked the results by "tail prob", there's nothing in the output that clearly points to any one of the southern steppe groups as the obvious source of the steppe-related ancestry in Armenia_MLBA. But, interestingly, Catacomb tops the ranking, and it probably also makes the most sense based simply on Carbon-14 chronology. So, for now, I'm going with Catacomb.
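To make the comparison concrete, here's a small sketch that ranks the four models above by tail prob and checks whether their steppe estimates overlap within one standard error. The numbers are copied from the output above. The one-standard-error overlap check is just an informal illustration, not a formal test.

```python
# (steppe source, steppe proportion, standard error, chisq, tail prob)
models = [
    ("Catacomb", 0.234, 0.028, 10.723, 0.826248),
    ("Kubano-Tersk", 0.254, 0.030, 13.535, 0.633284),
    ("Yamnaya_Kalmykia", 0.232, 0.028, 14.454, 0.564954),
    ("Yamnaya_Caucasus", 0.238, 0.029, 15.916, 0.458816),
]

# Rank from the highest to the lowest tail prob.
ranked = sorted(models, key=lambda m: m[4], reverse=True)
for source, est, se, chisq, tail in ranked:
    print(f"{source}: {est:.3f} +/- {se:.3f}, chisq {chisq}, tail prob {tail}")

# Range shared by all four estimate +/- one standard error intervals.
shared_low = max(est - se for _, est, se, _, _ in models)
shared_high = min(est + se for _, est, se, _, _ in models)
print(f"shared range: {shared_low:.3f} to {shared_high:.3f}")
```

All four intervals share a common range of roughly 22% to 26% steppe-related ancestry, so the models differ in their tail probs but hardly at all in the mixture ratios.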

I haven't had a chance yet to investigate this issue in detail with the Global25. Does it contradict the results from my PCA and qpAdm analyses? If anyone reading this would like to take a close look that'd be great. Feel free to post your findings in the comments below. And if the answer is indeed Catacomb, then what language did these Catacomb-derived migrants, or perhaps invaders, speak? If not proto-Armenian then what?

By the way, please be aware that the Kubano-Tersk samples in my analyses are the same individuals as those featured in Wang et al. 2019 under the label "North Caucasus".

See also...

Early chariot drivers of Transcaucasia came from...

Likely Yamnaya incursion(s) into Northwestern Iran

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Wednesday, August 2, 2017

Steppe admixture in Mycenaeans, lots of Caucasus admixture already in Minoans (Lazaridis et al. 2017)


Over at Nature at this LINK. Why is the presence of steppe admixture in Mycenaeans important? And why does it matter if the Minoans already had a lot of ancestry from the Caucasus or surrounds? Because Mycenaeans were Indo-Europeans and Minoans weren't. I'm still reading the paper and will update this entry regularly over the next few days. Below is the abstract and, in my opinion, a key quote. Emphasis is mine.

The origins of the Bronze Age Minoan and Mycenaean cultures have puzzled archaeologists for more than a century. We have assembled genome-wide data from 19 ancient individuals, including Minoans from Crete, Mycenaeans from mainland Greece, and their eastern neighbours from southwestern Anatolia. Here we show that Minoans and Mycenaeans were genetically similar, having at least three-quarters of their ancestry from the first Neolithic farmers of western Anatolia and the Aegean [1, 2], and most of the remainder from ancient populations related to those of the Caucasus [3] and Iran [4, 5]. However, the Mycenaeans differed from Minoans in deriving additional ancestry from an ultimate source related to the hunter–gatherers of eastern Europe and Siberia [6, 7, 8], introduced via a proximal source related to the inhabitants of either the Eurasian steppe [1, 6, 9] or Armenia [4, 9]. Modern Greeks resemble the Mycenaeans, but with some additional dilution of the Early Neolithic ancestry. Our results support the idea of continuity but not isolation in the history of populations of the Aegean, before and after the time of its earliest civilizations.

...

The simulation framework also allows us to compare different models directly. Suppose that there are two models (Simulated1, Simulated2) and we wish to examine whether either of them is a better description of a population of interest (in this case, Mycenaeans). We test f4(Simulated1, Simulated2; Mycenaean, Chimp), which directly determines whether the observed Mycenaeans shares more alleles with one or the other of the two models. When we apply this intuition to the best models for the Mycenaeans (Extended Data Fig. 6), we observe that none of them clearly outperforms the others as there are no statistics with |Z|>3 (Table S2.28). However, we do notice that the model 79%Minoan_Lasithi+21%Europe_LNBA tends to share more drift with Mycenaeans (at the |Z|>2 level). Europe_LNBA is a diverse group of steppe-admixed Late Neolithic/Bronze Age individuals from mainland Europe, and we think that the further study of areas to the north of Greece might identify a surrogate for this admixture event – if, indeed, the Minoan_Lasithi+Europe_LNBA model represents the true history.

Lazaridis, Mittnik et al., Genetic origins of the Minoans and Mycenaeans, Nature, Published online 02 August 2017, doi:10.1038/nature23310

Update 03/08/2017: This is my own Principal Component Analysis (PCA) of the Minoan and Mycenaean samples, which are freely available at the Reich Lab website here. The Armenian angle for the eastern admixture in Mycenaeans looks forced. The trajectory of this admixture obviously runs from Northern or Eastern Europe to the Minoans. If it did arrive from Armenia, then realistically only via a heavily steppe-admixed population. Right click and open in a new tab to enlarge:


Update 05/08/2017: Much like Lazaridis et al., I ran a series of qpAdm analyses to find the best mixture model for the Mycenaeans. However, just to see what would happen, unlike Lazaridis et al., I didn't group any of the archaeological populations into larger clusters based on their genetic affinities. The three models below stood out from the rest in terms of their statistical fits.

Mycenaean
Minoan_Lasithi 0.786±0.049
Sintashta 0.214±0.049
taildiff: 0.96574059
chisq: 6.030
Full output

Mycenaean
Corded_Ware_Germany 0.210±0.043
Minoan_Lasithi 0.790±0.043
taildiff: 0.961238695
chisq: 6.198
Full output

Mycenaean
Minoan_Lasithi 0.791±0.043
Srubnaya 0.209±0.043
taildiff: 0.950419642
chisq: 6.558
Full output

So it's essentially the same outcome as the one obtained by Lazaridis et al., because Sintashta and Srubnaya are part of their Steppe_MLBA cluster, while Corded Ware is part of their Europe_LNBA cluster, and it's these clusters that, along with Minoan_Lasithi, provided their most successful mixture models for the Mycenaeans. But it's nice to see Sintashta at the top of my results, because it fits so well with the long postulated archaeological links between Sintashta and the Mycenaeans (for instance, see here).

By the way, here's what I said back in May when the Mathieson et al. 2017 preprint came out (see here). So things are falling into place rather nicely.

The same paper also includes the following individual from present-day Bulgaria dated to the start of the Late Bronze Age (LBA), which is roughly when the Mycenaeans appeared nearby in what is now Greece:

Bulgaria_MLBA I2163: Y-hg R1a1a1b2 mt-hg U5a2 1750-1625 calBCE

This guy is the most Yamnaya-like of all of the Balkan samples in Mathieson et al. 2017, and, as far as I can see based on his overall genome-wide results, probably indistinguishable from the contemporaneous Srubnaya people of the Pontic-Caspian steppe. He also belongs to Y-haplogroup R1a-Z93, which is a marker typical of Srubnaya and other closely related steppe groups such as Andronovo, Potapovka and Sintashta. So there's very little doubt that he's either a migrant or a recent descendant of migrants to the Balkans from the Pontic-Caspian steppe.

See also...

A Mycenaean and an Iron Age Iranian walk into a bar...

Main candidates for the precursors of the proto-Greeks in the ancient DNA record to date

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Friday, May 12, 2017

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...


All of the post-Middle Neolithic samples from the recent Mittnik et al. and Saag et al. preprints on the ancient population history of the Baltic region belonged to Y-chromosome haplogroup R1a. And most of them belonged to the R1a-M417 (R1a1a) subclade that makes up almost 100% of the R1a lineages in the world today. This is what the results look like in a table (the sample IDs are of my own design):


Earlier samples from the same region belonged to Y-haplogroups I2a and R1a, but this was a subclade of R1a defined by the YP1272 mutation that is extremely rare today even in Northeastern Europe.

And now shifting our focus west of Scandinavia: all but two of the post-Middle Neolithic samples from around the North Sea from the recent Olalde et al. preprint on the Bell Beaker phenomenon and ancient population history of Northwest Europe belonged to Y-chromosome R1b, and more specifically to the R1b-M269 (R1b1a1a2) subclade, which makes up almost 100% of the R1b lineages in the world today. Here's a table:


Earlier samples from the same region belonged to Y-haplogroups I2a, I, G2a and CF, and most of the instances of I and the CF would probably be classified as I2a if not for missing data.

Interestingly, despite the R1a vs R1b dichotomy between these post-Middle Neolithic obvious newcomers to the Baltic and North Sea regions, respectively, they were very similar in terms of overall genetic structure, obviously closely related, starkly different from Middle Neolithic Northern Europeans, and in all likelihood mainly derived from the same homeland that was not located in Northern Europe.

So can we locate this homeland with any degree of certainty, you might wonder? In fact, you might ask, isn't this a futile search for the time being, as we await ancient DNA from many prehistoric Eurasian populations?

Not at all, because when attempting to answer this question we're bounded by two key constraints: the exceptionally high frequencies of R1a and R1b in the post-Middle Neolithic Baltic and North Sea samples, and their close genetic affinity to earlier and contemporaneous populations from the Pontic-Caspian steppe, part of which is due to significant Caucasus Hunter-Gatherer (CHG) admixture that was lacking in Middle Neolithic Northern Europeans.

Indeed, to date, the Pontic-Caspian steppe is the only region where both R1a and R1b have been found in ancient remains from the same sites dating to the Mesolithic, Neolithic and Eneolithic. Here's a table based on results from Mathieson et al. 2015 and 2017. The R and R1 might really be R1a or R1b if not for missing data.


The Pontic-Caspian steppe also abuts the Caucasus foothills, and we know that CHG admixture was a major feature of its inhabitants from at least the Eneolithic. So odds are, and make no mistake, these are indeed excellent odds, that the homeland we're looking for was on the Pontic-Caspian steppe.

But of course I2a has also been recorded in prehistoric samples from the Pontic-Caspian steppe. So, you might ask, why did the populations migrating out of the steppe belong to R1a and R1b, and why did some of them seemingly carry only R1a while others only R1b? This can be explained by local founder effects on the steppe due to patrilocality. Moreover, it's possible that some groups moving out of the steppe did carry high frequencies of I2a, but they're yet to enter the ancient DNA record. [Edit: Maybe they already have? See here]

Now, the aforementioned post-Middle Neolithic newcomers to the Baltic and North Sea regions are most certainly in large part the direct ancestors of modern-day Northern Europeans, speaking languages belonging to three of the daughter branches of late Proto-Indo-European (PIE): Balto-Slavic, Celtic and Germanic. It's highly unlikely that languages ancestral to these present-day languages were spoken by Middle Neolithic farmers, or that they were introduced into Northern Europe after it was colonized by the migrants from the Pontic-Caspian steppe.

What this strongly suggests is that the Pontic-Caspian steppe was also the late PIE homeland.

But, you might argue, the Pontic-Caspian steppe may have just been the expansion point for some of the late PIE language branches. No, that won't work. For one, modern-day populations speaking languages belonging to all other late PIE branches, such as Armenian, Greek, Indo-Iranian and Italic, show signals of the same population expansion from the Pontic-Caspian steppe that gave rise to modern-day Northern Europeans, in the form of Yamnaya-related genome-wide genetic admixture and appreciable frequencies of Y-chromosome haplogroups R1a-M417 and/or R1b-M269.

Some of these signals are certainly due to fairly recent admixture from Northern Europeans, like in much of Greece as a result of the Slavic expansions during the Early Middle Ages, but most cannot be explained in this way.

Secondly, Balto-Slavic, Celtic and Germanic are not more closely related to each other than to some of the other late PIE branches. For instance, Balto-Slavic is considered far more closely related to Indo-Iranian than to Celtic, which is generally seen as a sister branch to Italic. Therefore, if Balto-Slavic and Celtic derive from a homeland on the Pontic-Caspian steppe, then logically this is also where we should look for the origins of Indo-Iranian and Italic.

So as far as the late PIE homeland is concerned, thanks to ancient DNA, the debate is now practically over. But the PIE homeland debate is still wide open, or so we're told.

Apparently, Mathieson et al. 2017 aren't comfortable with putting the PIE homeland on the Pontic-Caspian Steppe because they can't find any evidence in their ancient DNA dataset of a significant migration through the Balkans that would potentially bring Anatolian languages from the Pontic-Caspian steppe to Anatolia. From the paper:

One version of the Steppe Hypothesis of Indo-European language origins suggests that Proto-Indo European languages developed in the steppe north of the Black and Caspian seas, and that the earliest known diverging branch – Anatolian – was spread into Asia Minor by movements of steppe peoples through the Balkan peninsula during the Copper Age around 4000 BCE, as part of the same incursions from the steppe that coincided with the decline of the tell settlements. [51] If this were correct, then one way to detect evidence of it would be the appearance of large amounts of characteristic steppe ancestry first in the Balkan Peninsula, and then in Anatolia. However, our genetic data do not support this scenario. While we find steppe ancestry in Balkan Copper Age and Bronze Age individuals, this ancestry is sporadic across individuals in the Copper Age, and at low levels in the Bronze Age. Moreover, while Bronze Age Anatolian individuals have CHG/Iran Neolithic related ancestry, they have neither the EHG ancestry characteristic of all steppe populations sampled to date [20] , nor the WHG ancestry that is ubiquitous in southeastern Europe in the Neolithic (Figure 1A, Supplementary Data Table 2, Supplementary Information section 1). This pattern is consistent with that seen in northwestern Anatolia [11] and later in Copper Age Anatolia [23], suggesting continuing migration into Anatolia from the East rather than from Europe.

And this...

On the other hand, our data could still be consistent with the Steppe-Balkans-Anatolia route hypothesis model, albeit with constraints. It remains possible that populations dating to around 1600 BCE in the regions where the Indo-European Luwian, Hittite and Palaic languages were spoken did have European hunter-gatherer ancestry. However, our results would require that such ancestry was not ubiquitous in Bronze Age Anatolia, and was perhaps tightly linked to Indo-European speaking groups. We predict that additional insight about the genetic origins of the potential speakers of early Indo-European languages will be obtained when ancient DNA data become available from additional sites in this key period in Anatolia and the Caucasus.

But I'd say the authors are taking that one particular version of the Steppe Hypothesis way too seriously. They might even be implying things that the creator(s) of the said hypothesis never posited.

Why do they seemingly expect a massive surge of steppe admixture into the Balkans during the Copper Age? If the steppe people are just shooting through the Balkans on their way to Anatolia, why would they leave a lot of admixture along the way? And if the locals are abandoning their tell settlements and running for the hills as far away from the oncoming steppe invaders as they can, how exactly would they acquire steppe admixture? Osmosis or what?

The Balkans is not Northern Europe, and the hypothesized migration of the proto-Anatolians from the Pontic-Caspian Steppe to Anatolia through the Balkans was never, as far as I know, meant to parallel the massive Corded Ware expansion across Northern Europe. In other words, why should all of the early Indo-European expansions have been of the same character, especially considering that they moved into such starkly different areas of Eurasia?

Indeed, as Mathieson et al. 2017 point out in the quote above, the evidence for the fleeting presence of steppe peoples in the Copper Age Balkans is in their dataset. For instance, in their Varna 1 sample set from Bulgaria, three out of the five individuals show significant steppe admixture. One of these individuals is almost 50% Yamnaya-like. Surely, there's really no need to expect anything more than that when looking for signals of a proto-Anatolian migration from the Pontic-Caspian Steppe to Anatolia.

In fact, even though I do appreciate the incredible work these guys are doing and the data they're making available to myself and everyone else, I suspect that there's a little bit of, shall we say, schadenfreude going on here.

They sequenced all of three Early Bronze Age Anatolians of obscure origin (are they actually suspected Anatolian speakers, like Luwians?), and apparently it's a big deal that they can't find any steppe admixture in Early Bronze Age Anatolia. Come on.

And then we're offered just three Yamnaya samples from the Pontic Steppe in Ukraine. One happens to be a massive outlier towards the Caucasus. Wow, what are the chances of that? And guess what, all three of these Yamnayans are females, so of course we're left wondering about the Y-haplogroups of the Yamnaya males on the Pontic Steppe. What happened to the males? Next paper, that's what.

Update 19/05/2017: Please note that the authors are not holding back any Yamnaya males from Ukraine for a future paper, as per my claim in the last paragraph above. They used what they had for the time being.

Update 21/05/2017: Actually, I suspect that we already have a population from the Bronze Age steppe in the ancient DNA record with a high frequency of Y-haplogroup I2a. See here.

See also...

R1a-M417 from Eneolithic Ukraine!!!11

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Eastern Europe as a bifurcation hotspot for Y-hg R1

Globular Amphora people starkly different from Yamnaya people