
Monday, September 19, 2022

Dear Iosif...Yamnaya


Even though the Yamnaya culture probably originated in what is now Ukraine, the earliest Yamnaya samples currently available are from the modern-day Samara region of Russia. They mostly date to around 3,000 BCE. I can analyze their ancestry using Principal Component Analysis (PCA) data.

Target: RUS_Yamnaya_Samara
Distance: 3.2816% / 0.03281581
81.0 RUS_Progress_En
14.4 UKR_N
4.6 HUN_Vinca_MN
0.0 ARM_Aknashen_N
0.0 ARM_Masis_Blur_N
0.0 AZE_Caucasus_lowlands_LN
0.0 BGR_C
0.0 BGR_Dzhulyunitsa_N
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Hajji_Firuz_C
0.0 IRN_Seh_Gabi_C
0.0 IRN_Tepe_Abdul_Hosein_N
0.0 IRN_Wezmeh_N
0.0 RUS_Darkveti-Meshoko_En
0.0 RUS_Maykop
0.0 RUS_Maykop_Late
0.0 RUS_Maykop_Novosvobodnaya

The above results show exactly zero ancestry from West Asia. Admittedly, both RUS_Progress_En and HUN_Vinca_MN are European ancients with significant West Asian-related ancestry. However, this ancestry is very distantly West Asian-related, and, for instance, it almost certainly has no relevance to the Indo-Anatolian homeland debate.

The Afanasievo culture of Central Asia is regarded as an early offshoot of the Yamnaya culture. A good number of Afanasievo samples are available, so let's see whether their results match those of the Yamnaya folks. And indeed they do, since BGR_C is very similar to HUN_Vinca_MN.

Target: RUS_Afanasievo
Distance: 3.4055% / 0.03405499
84.0 RUS_Progress_En
11.4 UKR_N
4.6 BGR_C
0.0 ARM_Aknashen_N
0.0 ARM_Masis_Blur_N
0.0 AZE_Caucasus_lowlands_LN
0.0 BGR_Dzhulyunitsa_N
0.0 HUN_Vinca_MN
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Hajji_Firuz_C
0.0 IRN_Seh_Gabi_C
0.0 IRN_Tepe_Abdul_Hosein_N
0.0 IRN_Wezmeh_N
0.0 RUS_Darkveti-Meshoko_En
0.0 RUS_Maykop
0.0 RUS_Maykop_Late
0.0 RUS_Maykop_Novosvobodnaya

To try this at home, stick the PCA data in the text file here into the relevant fields here and crank up the "Cycles" to 4X. You should see exactly zero ancestry from West Asia every time.
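If you'd rather see the general idea in code, here's a minimal sketch of a PCA-based mixture fit. It is not the tool linked above, and I'm not claiming that tool works this way. The sketch simply assumes that the target's PCA coordinates are modeled as a weighted average of the source coordinates, with non-negative weights that sum to one, and that "Distance" is the Euclidean distance between the target and that weighted average. The search strategy (moving small amounts of weight between pairs of sources), the function names and the toy coordinates are all hypothetical.

```python
import math


def mix_coords(weights, sources):
    """Weighted average of the source coordinates (weights sum to 1)."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]


def distance(target, weights, sources):
    """Euclidean distance between the target and the modeled mixture."""
    return math.dist(target, mix_coords(weights, sources))


def fit_mixture(target, sources, min_step=1e-6):
    """Assumed fitting strategy: shift weight between pairs of sources,
    keep any shift that lowers the distance, and halve the step size
    whenever no shift helps. Weights stay non-negative and sum to 1."""
    n = len(sources)
    weights = [1.0 / n] * n
    best = distance(target, weights, sources)
    step = 0.5
    while step >= min_step:
        improved = False
        for i in range(n):
            for j in range(n):
                if i == j or weights[i] < step:
                    continue
                trial = weights[:]
                trial[i] -= step
                trial[j] += step
                d = distance(target, trial, sources)
                if d < best - 1e-15:
                    weights, best, improved = trial, d, True
        if not improved:
            step /= 2
    return weights, best


# Hypothetical 2-D coordinates, not real PCA data.
sources = {"SRC_A": (0.0, 0.0), "SRC_B": (1.0, 0.0), "SRC_C": (0.0, 1.0)}
target = (0.3, 0.1)

weights, dist = fit_mixture(target, list(sources.values()))
print(f"Distance: {dist:.8f}")
for name, w in sorted(zip(sources, weights), key=lambda pair: -pair[1]):
    print(f"{100 * w:.1f} {name}")
```

Real PCA data would have many more dimensions and many more sources, but the logic of the sketch is the same: a source only gets a non-zero share if giving it weight actually brings the model closer to the target.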

I can, more or less, reproduce these results with tools that are routinely used in peer reviewed papers. Below is a table of mixture models produced with the qpAdm software. I set the pass threshold to P ≥0.05, which is an arbitrary value, but the pattern is clear. The full output from each qpAdm run is available here.


Importantly, qpAdm needs to be fed the relevant "right pop" outgroups to be able to discriminate accurately between reference populations.

right pops:
CMR_Shum_Laka_8000BP
MAR_Taforalt
Levant_Natufian
IRN_Ganj_Dareh_N
Levant_PPNB
TUR_Marmara_Barcin_N
HUN_Starcevo_N
HUN_Koros_N
SRB_Iron_Gates_HG
Iberia_Southeast_Meso
RUS_Karelia_HG
RUS_West_Siberia_HG
RUS_Boisman_MN
MNG_North_N
TWN_Hanben
BRA_LapaDoSanto_9600BP

So, for instance, if one were to use in this role the modern-day Mbuti people, as opposed to, say, the ancient hunter-gatherers of Shum Laka, one might find that many models look statistically better than they should. And then one might also find that the Yamnaya samples carry significant West Asian ancestry.

Actually, I'm not opposed to the idea of some West Asian ancestry in Yamnaya. Indeed, considering the extraordinary mobility of the Yamnaya people and their Eneolithic predecessors on the Pontic-Caspian steppe, it would be unusual if they didn't come into close contact and mix, to some degree, with their neighbors from West Asia.

However, based on everything I've seen, from uniparental markers to different types of autosomal genetic tests, it's clear to me that there's no substantial West Asian ancestry in any Yamnaya samples, except for an outlier female from modern-day Ozera, Ukraine (see here).

Admittedly, ancient DNA does have a habit of throwing curveballs, so I'm eagerly awaiting new Eneolithic samples from the Pontic-Caspian steppe, particularly those associated with the Yamnaya-like Sredni Stog culture, to help finally settle this issue.

Believe it or not, a contact recently sent me a supposedly unpublished female sample from a ~4,200 BCE Sredni Stog burial in modern-day Igren, east central Ukraine. So what the hell, let's assume for the time being that this sample is genuine. This is how Miss Sredni Stog behaves in my PCA mixture test.

Target: UKR_Sredni_Stog
Distance: 4.0769% / 0.04076877
75.6 RUS_Progress_En
17.8 UKR_N
6.6 HUN_Vinca_MN
0.0 ARM_Aknashen_N
0.0 ARM_Masis_Blur_N
0.0 AZE_Caucasus_lowlands_LN
0.0 BGR_C
0.0 BGR_Dzhulyunitsa_N
0.0 HUN_Vinca_MN
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Hajji_Firuz_C
0.0 IRN_Seh_Gabi_C
0.0 IRN_Tepe_Abdul_Hosein_N
0.0 IRN_Wezmeh_N
0.0 RUS_Darkveti-Meshoko_En
0.0 RUS_Maykop
0.0 RUS_Maykop_Late
0.0 RUS_Maykop_Novosvobodnaya

Wow, just wow. Have we actually found Miss Proto-Yamnaya? What does qpAdm have to say in the matter?

UKR_Sredni_Stog
HUN_Vinca_MN 0.034±0.028
RUS_Progress_En 0.796±0.045
UKR_N 0.170±0.034
P-value 0.41088

Again, this is an excellent match with the results from my PCA test, especially if we take into account the standard errors. However, with qpAdm it's also possible to model this individual's ancestry as part West Asian.
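To make the comparison concrete, here's a small sketch that expresses the gap between the PCA-based proportions and the qpAdm coefficients in units of the qpAdm standard errors. The numbers are copied from the two UKR_Sredni_Stog outputs above. Treating the PCA estimate as a fixed value with no error of its own is a simplifying assumption, and the "within 2 standard errors" reading is just a common rule of thumb, not a formal test.

```python
# PCA-based estimates (percent) and qpAdm estimates (coefficient, standard error),
# copied from the UKR_Sredni_Stog results above.
pca = {"RUS_Progress_En": 75.6, "UKR_N": 17.8, "HUN_Vinca_MN": 6.6}
qpadm = {
    "RUS_Progress_En": (0.796, 0.045),
    "UKR_N": (0.170, 0.034),
    "HUN_Vinca_MN": (0.034, 0.028),
}


def se_gaps(pca_pct, qpadm_est):
    """Difference between the two estimates in units of the qpAdm standard error."""
    return {
        pop: (pca_pct[pop] / 100.0 - est) / se
        for pop, (est, se) in qpadm_est.items()
    }


gaps = se_gaps(pca, qpadm)
for pop, gap in gaps.items():
    print(f"{pop}: {gap:+.2f} SE")
```

All three gaps come out smaller than 2 standard errors in absolute terms, which is the sense in which the two sets of results agree.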

UKR_Sredni_Stog
AZE_Caucasus_lowlands_LN 0.056±0.039
RUS_Progress_En 0.761±0.061
UKR_N 0.183±0.036
P-value 0.465667

As I pointed out above, it's plausible for such people to harbor some West Asian ancestry, but I'm very sceptical that this is really the case here, despite the rather solid qpAdm statistical fit. That's because UKR_Sredni_Stog is not a high quality sample, and, from my experience, qpAdm often has problems analyzing fine scale ancestry in singletons or even small groups that show excess DNA damage and/or offer much less than a million markers.
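As a rough illustration, the sketch below applies the P ≥ 0.05 pass threshold mentioned earlier to the two qpAdm models for UKR_Sredni_Stog, and also reports which coefficient in each model is smallest relative to its standard error. The coefficients and P-values are copied from the outputs above. The coefficient-to-standard-error ratio is my own rule-of-thumb addition for this sketch and is not part of the qpAdm output shown here.

```python
PASS_THRESHOLD = 0.05  # the arbitrary P >= 0.05 cutoff mentioned earlier

# (coefficient, standard error) per source, copied from the qpAdm outputs above.
models = [
    {
        "p": 0.41088,
        "coeffs": {
            "HUN_Vinca_MN": (0.034, 0.028),
            "RUS_Progress_En": (0.796, 0.045),
            "UKR_N": (0.170, 0.034),
        },
    },
    {
        "p": 0.465667,
        "coeffs": {
            "AZE_Caucasus_lowlands_LN": (0.056, 0.039),
            "RUS_Progress_En": (0.761, 0.061),
            "UKR_N": (0.183, 0.036),
        },
    },
]


def summarize(model, threshold=PASS_THRESHOLD):
    """Return (passes, weakest source, coefficient / standard error for that source)."""
    passes = model["p"] >= threshold
    name, (est, se) = min(
        model["coeffs"].items(), key=lambda item: item[1][0] / item[1][1]
    )
    return passes, name, est / se


for model in models:
    passes, name, ratio = summarize(model)
    status = "pass" if passes else "fail"
    print(f"P={model['p']:.3f} ({status}); weakest source: {name} at {ratio:.2f} SE from zero")
```

Both models pass the threshold, and in both cases the minor source sits less than 2 standard errors from zero, so on these numbers alone neither minor component is well resolved.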

See also...

Dear Iosif, about that ~2%

But Iosif, what about the Phrygians?

Friday, September 9, 2022

Dear Iosif, about that ~2%


The debate over the location of the so called Indo-Anatolian homeland won't be decided by the persistence of any type of genetic ancestry in ancient Anatolia.

It'll be decided by a multidisciplinary study on the interactions between the ancient peoples of the North Pontic steppe, the eastern Balkans, and western Anatolia.

If such a study finds a pulse of steppe-related gene flow from the Balkans into Anatolia sometime during the early metal ages, it'll corroborate the linguistic hypothesis that a language ancestral to Hittite, Luwian and related tongues moved into Anatolia from Eastern Europe.

Why do we only need a pulse of gene flow, you might ask? Obviously, because:

- language and genetic ancestry can start with a strong association but, since they're not linked, they can eventually follow very different trajectories

- the dilution of genetic ancestry is an important factor, especially in ancient West Asia, and it must be taken into account in models of language spread, rather than ignored in favor of simple, elegant models that do not reflect reality.
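To show how quickly dilution can work, here's a toy calculation. Every number in it is hypothetical and chosen only for illustration; none of them is an estimate from any ancient DNA dataset. It assumes that an incoming group starts with some share of a given ancestry and that each later admixture pulse replaces a fixed fraction of the group's ancestry with local ancestry lacking that component.

```python
import math


def diluted_share(initial_share, local_share_per_pulse, pulses):
    """Share of the original ancestry left after a number of admixture pulses."""
    return initial_share * (1.0 - local_share_per_pulse) ** pulses


def pulses_to_reach(initial_share, local_share_per_pulse, target_share):
    """Smallest number of pulses that brings the share to target_share or below."""
    if target_share >= initial_share:
        return 0
    ratio = math.log(target_share / initial_share) / math.log(1.0 - local_share_per_pulse)
    return math.ceil(ratio)


# Hypothetical inputs: 25% of the ancestry at arrival, 50% local admixture per pulse.
initial, per_pulse = 0.25, 0.5
for k in range(5):
    print(f"after {k} pulses: {100 * diluted_share(initial, per_pulse, k):.2f}%")
print("pulses needed to drop to 2% or less:", pulses_to_reach(initial, per_pulse, 0.02))
```

Under these made-up inputs the ancestry share halves with every pulse, so a language could remain in place long after the genetic signal that arrived with it has shrunk to a trace.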

Here's my favorite quote from the recent Lazaridis, Alpaslan-Roodenberg et al. paper, because, probably unbeknownst to the authors, it's exceptionally revealing about the spread of a wide range of Indo-European speakers into Anatolia.

However, in individuals from Gordion, a Central Anatolian city that was under the control of Hittites before becoming the Phrygian capital and then coming under the control of Persian and Hellenistic rulers, the proportion of Eastern hunter-gatherer ancestry is only ~2%, a tiny fraction for a region controlled by at least four different Indo-European–speaking groups.

Indeed, this is exactly what the Lazaridis, Alpaslan-Roodenberg et al. paper should've been about. That is, the authors should've given us a painstaking account of the spread of different ancient Indo-European speaking groups into Anatolia and explained how, overall, their DNA was rapidly diluted to a trace amount.

However, instead they treated us to a make-believe tale about a so called Indo-Anatolian homeland in what is now Armenia.

See also...

Dear Iosif...Yamnaya

But Iosif, what about the Phrygians?

Dear Iosif...

Dear Iosif #2

Dear Iosif #3

Sunday, September 4, 2022

But Iosif, what about the Phrygians?


A paper in Science authored by around 200 scientists from some of the world's top academic institutions surely must mean something, right? Not necessarily.

In this short blog post I'll try to explain, as simply as I can, why the Lazaridis, Alpaslan-Roodenberg et al. paper doesn't get us any closer to solving the riddle of the so called Indo-Anatolian homeland.

However, it must be said that the paper does include many interesting and valuable samples. I'll be using six of these samples, labeled TUR_C_Gordion_Anc, to argue my case.

The TUR_C_Gordion_Anc sample set is from Gordion, the capital of ancient Phrygia, and thus, in all likelihood, it represents Phrygian speakers.

Phrygian is an Indo-European language and the leading hypothesis is that it originated in the Balkans.

In terms of fine scale ancestry, TUR_C_Gordion_Anc can be reliably divided into two genetic clusters. In the Principal Component Analysis (PCA) below these clusters are labeled TUR_C_Gordion_Anc1 and TUR_C_Gordion_Anc2.

Note that TUR_C_Gordion_Anc1 is obviously pulling away from TUR_C_Gordion_Anc2 towards samples from the Balkans. I've used ancient samples from what is now North Macedonia, labeled MKD_Anc, to represent the Balkans. To see an interactive version of the plot, paste the PCA coordinates from here into the relevant field here.

Visually, this is not an especially dramatic outcome, but it's an incredible result nonetheless, because it shows that even a few ancient samples can help to solve an age old mystery.

Across many dimensions of genetic variation, the shift in the PCA of TUR_C_Gordion_Anc1 away from TUR_C_Gordion_Anc2 represents about 20% admixture from the Balkans, and about 8% from the Eastern European steppe. That's more than enough to corroborate the linguistic hypothesis that the Phrygians originated in the Balkans, and that some of their ancestors came from the steppe. The mixture models below were done with the tools here.

Target: TUR_C_Gordion_Anc1
Distance: 1.6634% / 0.01663373
40.6 Kura-Araxes_ARM_Kaps
22.2 Anatolia_Barcin_N
21.8 MKD_Anc
13.6 Levant_PPNB
1.4 IRN_Ganj_Dareh_N
0.4 Han

Target: TUR_C_Gordion_Anc1
Distance: 1.7109% / 0.01710904
40.2 Kura-Araxes_ARM_Kaps
37.8 Anatolia_Barcin_N
12.4 Levant_PPNB
8.0 Yamnaya_RUS_Samara
1.2 IRN_Ganj_Dareh_N
0.4 Han

Target: TUR_C_Gordion_Anc2
Distance: 2.0293% / 0.02029339
51.0 Kura-Araxes_ARM_Kaps
26.8 Anatolia_Tepecik_Ciftlik_N
17.6 Anatolia_Barcin_N
4.6 Levant_PPNB

Surprisingly, Lazaridis, Alpaslan-Roodenberg et al. didn't have much to say about this topic. This quote basically sums it up:

However, in individuals from Gordion, a Central Anatolian city that was under the control of Hittites before becoming the Phrygian capital and then coming under the control of Persian and Hellenistic rulers, the proportion of Eastern hunter-gatherer ancestry is only ~2%, a tiny fraction for a region controlled by at least four different Indo-European–speaking groups.

I have no doubt that Lazaridis, Alpaslan-Roodenberg et al. can run a very decent PCA, and then blow it up to a size big enough to show that the Gordion samples represent two genetically somewhat distinct groups. I'm also sure that, if they really try, they can locate significant levels of proximate and relevant European ancestry in some of these samples.

They don't have to use my methods; they can use any methods they like. My point is that they won't find much if they're just looking for genetic signals from the Upper Paleolithic or Mesolithic.

Now, considering the way that the Phrygian question was treated by Lazaridis, Alpaslan-Roodenberg et al., despite the fact that they managed to sequence a few likely Phrygian speakers from none other than the Phrygian capital, let's not pretend that their paper brought us any closer to understanding the genetic origins of Anatolian speakers or pinpointing their ancestral homeland.

In order to even try to solve these problems with ancient DNA, we need a wide range of samples from Hittite, Luwian and other key sites where Anatolian languages were spoken. And then we must analyze them properly.

I'm guessing that Lazaridis, Alpaslan-Roodenberg et al. went out of their way to get such samples, but for one reason or another they failed. If so, that's OK, but I have a feeling that even if they got them, they wouldn't know what to do with them, because at best these samples would only show ~2% Eastern hunter-gatherer ancestry. Haha.

For what it's worth, I believe that the ancient data in the Lazaridis, Alpaslan-Roodenberg et al. paper point to the North Pontic steppe as the Indo-Anatolian homeland, and I'll lay out my arguments in an upcoming blog post.

See also...

Dear Iosif...Yamnaya

Dear Iosif, about that ~2%

Dear Iosif...

Dear Iosif #2

Dear Iosif #3

Thursday, September 1, 2022

Dear Iosif #3


Back in 2016 I made this prediction about the origins of the Yamnaya people (Steppe_EMBA):

But here's my prediction: Steppe_EMBA only has 10-15% admixture from the post-Mesolithic Near East not including the North Caucasus, and basically all of this comes via female mediated gene flow from farming communities in the Caucasus and perhaps present-day Ukraine.

The relevant blog post is still here. Looking back, my analysis is a bit sloppy and I didn't articulate my ideas too well. But that was a pretty good prediction for its time, and I believe it still has a chance of being confirmed, more or less.

On the other hand, the widely publicized hypothesis that the Yamnaya population is a ~50/50 mixture between indigenous Eastern European hunter-gatherers and Near Eastern or West Asian migrants never looked right to me. So I'm glad that it's now dead and buried.

For those of you not up to date with this topic, all you need to know is that the Yamnaya genotype existed in Eastern Europe at least a thousand years before Yamnaya, and, moreover, the Yamnaya people are largely derived from Eastern European foragers already rich in Near Eastern-related ancestry. The relevant ancient genomes are on the way (for instance, see here).

Nevertheless, the narrative that waves of Near Eastern migrants moved into prehistoric Eastern Europe, leading to the emergence of the Yamnaya culture and even the Proto-Indo-European language, is still being pushed by some notable scientists working with ancient DNA.

My hope is that, considering the latest revelations about the genetic origins of the Yamnaya people, these scientists can embrace a more nuanced view. How about something like this?

- people moved around, and they were especially mobile on the Eastern European steppe from the Eneolithic onwards

- when they made contact they sometimes mixed, so there was admixture between far flung steppe groups

- since population densities on the steppe were low until the Yamnaya period, minor admixture that entered the steppe during the Neolithic and Eneolithic wasn't diluted easily.

See also...

Dear Iosif #2

But Iosif, what about the Phrygians?

Dear Iosif, about that ~2%

Dear Iosif...Yamnaya

Sunday, August 28, 2022

Dear Iosif #2


In my last blog post I made a mistake in my interpretation of this quote from Lazaridis, Alpaslan-Roodenberg et al., because it confused the crap out of me:

However, the complete lack of association of R-haplogroup descendants and EHG ancestry in either Armenia or Iran is consistent with either a massive dilution of EHG ancestry in these populations resulting in the dissociation of Y-chromosome lineages from autosomal ancestry over time, or with a scenario in which R-M269 was not associated with substantial EHG ancestry to begin with.

I thought they meant that they couldn't find any Eastern European hunter-gatherer (EHG) ancestry in samples from Armenia or Iran bearing Y-chromosome R1b-M269.

Of course, they did find EHG ancestry in these individuals, it's just that they couldn't establish an association specifically between this type of ancestry and Y-haplogroup R.

That is, males with Y-haplogroup R in Armenia, Iran and everywhere else generally show about the same level of EHG ancestry as their ethnic kin with other Y-haplogroups.

But so what? Why mention this when discussing the origins of R1b-M269, when it has absolutely no value in this context?

Y-haplogroups aren't linked directly to autosomal DNA, and Lazaridis, Alpaslan-Roodenberg et al. are obviously aware of this (hence their point about the potential massive dilution of EHG ancestry).

In regards to the origins of R1b-M269, and the provenance of West Asian R1b-M269, the really powerful observation is that R1b-M269 shows up rather late and suddenly in the West Asian ancient DNA record along with EHG and steppe ancestry.

That, and the fact that Eastern Europe is an ancient R1b hotbed (while West Asia is a desert), means there's virtually no chance that R1b-M269 is native to West Asia. In other words, there was no R1b-M269 in West Asia until the steppe people brought it there from north of the Caucasus.

See also...

Dear Iosif...

Dear Iosif #3

But Iosif, what about the Phrygians?

Dear Iosif, about that ~2%

Dear Iosif...Yamnaya

Saturday, August 27, 2022

Dear Iosif...

Update 29/08/22: Dear Iosif #2

...


I'm skimming through the Lazaridis, Alpaslan-Roodenberg et al. paper that just came out in Science. And I feel like someone punched me in the face.

Nevertheless, I'll try to be diplomatic. Suffice to say, for now, that there's some rather strange stuff in this paper.

The main problem is that the authors are attempting to study fine scale ancestry with a somewhat rough distal model. As a result, they miss important details.

For instance, this quote is from the paper's supplementary PDF file, freely available here.

However, the complete lack of association of R-haplogroup descendants and EHG ancestry in either Armenia or Iran is consistent with either a massive dilution of EHG ancestry in these populations resulting in the dissociation of Y-chromosome lineages from autosomal ancestry over time, or with a scenario in which R-M269 was not associated with substantial EHG ancestry to begin with.

Obviously, EHG means Eastern European Hunter-Gatherer. But why focus on EHG? Surely, this makes little sense when looking at the genetic prehistory of West Asia, because no one ever argued that this region was settled by EHG populations. It was widely settled by Yamnaya-related groups, with already heavily diluted EHG ancestry, during the metal ages.

OK, so the authors are actually aware of the potential dilution of EHG ancestry, but they don't really do anything about it.

If we're looking at the origins of West Asian R1b-M269, and using its association with autosomal DNA components as a guide, then we should be focusing on Yamnaya-related ancestry.

For instance, here's a fine scale ancient ancestry model based on Principal Component Analysis (PCA) data. It shows the ancestry proportions of two relatively high coverage Iron Age males from two different sites in Iran from the Lazaridis, Alpaslan-Roodenberg et al. dataset. Both belong to R1b-M269 and both show significant Yamnaya-related ancestry.

Target: IRN_HajjiFiruz_IA:I2327_all
Distance: 2.2930% / 0.02292994
39.6 Kura-Araxes_ARM_Kaps
24.2 IRN_Ganj_Dareh_N
18.2 Levant_PPNB
12.4 Yamnaya_RUS_Samara
4.4 Anatolia_Tepecik_Ciftlik_N
1.2 Han

Target: IRN_Hasanlu_IA:I4232_all
Distance: 2.5179% / 0.02517895
26.0 IRN_Ganj_Dareh_N
25.6 Kura-Araxes_ARM_Kaps
24.4 Anatolia_Tepecik_Ciftlik_N
15.8 Yamnaya_RUS_Samara
7.6 Levant_PPNB
0.6 IRN_Shahr_I_Sokhta_BA2

As a control, here's an earlier, Chalcolithic sample bearing Y-haplogroup J2b from the same region. Not surprisingly, this individual totally lacks the Yamnaya-related signal.

Target: IRN_HajjiFiruz_ChL:I4241_all
Distance: 2.7938% / 0.02793782
32.6 Kura-Araxes_ARM_Kaps
25.6 IRN_Ganj_Dareh_N
23.6 Anatolia_Tepecik_Ciftlik_N
18.2 Levant_PPNB

Overall, these results make perfect sense. I could probably locate very minor signals of EHG ancestry in the Iron Age samples, but that would be more difficult and much less certain, so I won't bother.

Soon I'll be able to rerun these analyses with Bronze Age samples from Dagestan and surrounds. That should bump up the levels of Yamnaya-related ancestry and improve the statistical fits (wink, wink, nudge, nudge).

Disappointingly, Lazaridis, Alpaslan-Roodenberg et al. go so far as to suggest that R1b-M269 may have originated in West Asia.

However, considering the scores of ancient Eastern European populations rich in R1b-M269 and many near and far related subclades of R1b, this makes no sense whatsoever.

Indeed, contemplating nowadays that R1b-M269 might be native to West Asia, where R1b only starts showing up in the ancient DNA record during the Copper Age, is about as stupid as claiming that gravity doesn't exist.

Largely due to their distal model approach, Lazaridis, Alpaslan-Roodenberg et al. also argue that the Indo-Anatolian homeland was located in what is now Armenia and surrounds. I'm far from convinced that this solution will stand the test of time.

In terms of the more widely accepted theory that the Indo-Anatolian homeland was located on the Pontic-Caspian steppe in Eastern Europe, the most important samples in the paper are the three Bronze Age individuals from Yassitepe in western Anatolia. That's because they're from a region that is traditionally seen as the entry point of Indo-Anatolian speakers into Anatolia from the European steppe via the Balkans.

Interestingly, individual I5737, dated to 2035-1900 calBCE or the Middle Bronze Age, belongs to Y-chromosome haplogroup I2a-P78, which surely must be a signal of European ancestry. I see this as a significant result.

Here's how the trio from Yassitepe look in my fine scale ancient ancestry model. Minor Yamnaya-related ancestry does show up, although, admittedly, it might just be noise in individual I5735.

Target: TUR_Aegean_Izmir_Yassitepe_MBA:I5737
Distance: 2.7507% / 0.02750748
58.4 Anatolia_Barcin_N
20.4 Kura-Araxes_ARM_Kaps
9.2 Anatolia_Tepecik_Ciftlik_N
5.6 IRN_Ganj_Dareh_N
3.8 Yamnaya_RUS_Samara
2.6 Levant_PPNB

Target: TUR_Aegean_Izmir_Yassıtepe_EBA:I5733
Distance: 2.7969% / 0.02796887
52.0 Anatolia_Barcin_N
27.2 Kura-Araxes_ARM_Kaps
8.6 Levant_PPNB
6.2 Yamnaya_RUS_Samara
6.0 IRN_Ganj_Dareh_N

Target: TUR_Aegean_Izmir_Yassıtepe_EBA:I5735
Distance: 3.1270% / 0.03127009
36.0 Kura-Araxes_ARM_Kaps
32.4 Anatolia_Tepecik_Ciftlik_N
26.0 Anatolia_Barcin_N
2.8 IRN_Shahr_I_Sokhta_BA2
1.2 Yamnaya_RUS_Samara
1.0 Levant_PPNB
0.6 MAR_Taforalt

This isn't much, especially considering it's already late 2022, but it's better than nothing. Fortunately, more samples from Bronze Age western Anatolia are on the way (wink, wink, nudge, nudge).

However, I'm not done with the Lazaridis, Alpaslan-Roodenberg et al. dataset yet. I'm planning to spend much more time on this blog in the coming weeks and months and will be using their samples in a wide range of analyses.

Citation...

Iosif Lazaridis, Songül Alpaslan-Roodenberg et al., The genetic history of the Southern Arc: A bridge between West Asia and Europe, Science 377, eabm4247 (2022)

See also...

Dear Iosif #3

But Iosif, what about the Phrygians?

Dear Iosif, about that ~2%

Dear Iosif...Yamnaya

Friday, August 12, 2022

Mediterranean PCA update


I updated my Principal Component Analysis (PCA) of Mediterranean populations with most of the ancient Jewish samples from the Waldman et al. preprint. To view interactive versions of the plots paste the data from here into the PCA DATA field here and press PLOT PCA. The ancient Jews are labeled DEU_MA_Erfurt.

See anything interesting? I'm again seeing more complexity than claimed by Waldman et al., but what would I know anyway?

See also...

My take on the Erfurt Jews

Greeks in a Longobard cemetery

Tuesday, June 21, 2022

My take on the Erfurt Jews


I had a quick look at the genotype data from the recent Waldman et al. preprint focusing on the ancestry of early Jews from Erfurt, Germany. My impression is that the genetic origins of these Jews are somewhat more complex than claimed in the manuscript.

Indeed, I'd say the Waldman et al. characterization of the Erfurt Jews as a three-way mixture between populations similar to present-day Lebanese, South Italians and Russians doesn't exactly reflect reality.

Unlike Waldman et al., I designed an ADMIXTURE analysis that separated East Asian ancestry into East Asian and Siberian clusters, and also included Mediterranean and North African clusters. The output is available in a spreadsheet HERE. Below is a bar graph based on some of the output.

Now, keeping in mind that ADMIXTURE is not a formal mixture test, and that it estimates ancestry proportions from inferred populations, as opposed to ancient groups that actually existed, here are some key observations:

- in terms of fine scale ancestry, the Erfurt Jews show enough variation to be divided into three or four clusters, as opposed to just two as per Waldman et al.

- some of the Erfurt Jews show excess "Mediterranean" ancestry, while others excess "North African" ancestry, and this cannot be explained with ancestral populations similar to Lebanese and/or South Italians, but rather with significant gene flow from the western Mediterranean and possibly North Africa

- several of the Erfurt Jews show relatively high levels of "East Asian" ancestry that cannot be explained by admixture from Russians, or even any Russian-like populations, because such populations almost lack this type of ancestry, and instead show significant "Siberian" admixture

- as far as I can see, there are no correlations between any of the observations above and the quality of the samples. That is, low coverage doesn't appear to be causing the aforementioned excess "Mediterranean", "North African" and/or "East Asian" ancestry proportions.

Investigating this in more detail with, say, formal statistics will take some time. But I was able to reproduce the results from the above ADMIXTURE run using several somewhat different datasets, so that's something.

It seems to me that Waldman et al. want a simple and elegant model to explain the data, which is understandable, but I do think they should at least expand their ADMIXTURE analysis to include "Siberian", "Mediterranean" and "North African" clusters, and go from there depending on what they find.

Citation...

Waldman et al., Genome-wide data from medieval German Jews show that the Ashkenazi founder event pre-dated the 14th century, bioRxiv, posted May 16, 2022, doi: https://doi.org/10.1101/2022.05.13.491805

See also...

Mediterranean PCA update

Saturday, June 18, 2022

David Reich on the origin of the Yamnaya people (!?)


Harvard's David Reich is doing a talk next month about the genetic history of West Asia and nearby parts of Europe. This is a quote from an online abstract of the talk (found here).

The impermeability of Anatolia to exogenous migration contrasts with our finding that the Yamnaya had two distinct gene flows, both from West Asia, suggesting that the Indo-Anatolian language family originated in the eastern wing of the Southern Arc and that the steppe served only as a secondary staging area of Indo-European language dispersal.

If this is actually what David Reich is going to claim then I'd say his team has a lot of work to do before they put out their paper on the topic.

First of all, Yamnaya did not have two distinct gene flows from West Asia. I don't even know what that means exactly, but there's no way that this statement is correct no matter how one interprets it.

In fact, the Yamnaya population formed on the Pontic-Caspian steppe from earlier groups native to this part of Eastern Europe, such as the people associated with the Sredni Stog culture.

That is, there were no migrations from West Asia into Eastern Europe that can be claimed to have been instrumental in the emergence of the Yamnaya population. On the other hand, Yamnaya may have been significantly influenced by cultural impulses from West Asia, but this is nothing new.

In terms of deep population structure, the Yamnaya genotype can be described as a mixture between Eastern European and West Asian-related genetic components. However, these West Asian-related components were already in Europe thousands of years before Yamnaya came into existence.

Indeed, soon to be published ancient DNA shows that hunter-gatherers very similar to the Yamnaya people, packing quite a lot of West Asian-related ancestry, lived in the Middle Don region (just north of the Pontic-Caspian steppe) well before 5,000 BCE (see here).

So, did the West Asian ancestors of these Middle Don hunter-gatherers speak Proto-Indo-European, or, as David Reich calls it, Indo-Anatolian? Keep in mind that most linguists put the birth of Indo-Anatolian around 4,000 BCE, which is actually the Sredni Stog period.

Moreover, in underlining Anatolia's supposed impermeability to exogenous migration, David Reich is arguing against things that no one worth their salt ever claimed. That's because the spread of Indo-Anatolian speakers into Anatolia has never really been described by archeologists and linguists as a massive migration, but rather as an infiltration into lands already heavily populated by the Hattians (for instance, see here).

We may have already seen the genetic evidence of this infiltration in the presence of steppe Y-chromosome haplogroup R-V1636 in a Chalcolithic burial at Arslantepe (see here and here). Let's wait and see what else crops up over the next few years as many more ancient Anatolian genomes are sequenced by David Reich and colleagues.

Wednesday, May 18, 2022

Geography is hard (for some)


It's that time of the academic year again when bioRxiv is inundated with ancient DNA preprints. I'm not complaining, but I almost spat out my coffee when I saw this map in one of the new manuscripts (here).

What's the logic behind labeling almost all of Eastern Europe as "Steppe", and instead labeling just Czechia, Hungary and Slovakia as "Eastern Europe"? In my opinion those three countries, plus Poland, are better described as East Central Europe anyway.

It seems to me that many people working at the highest level in population genetics simply don't know what the Eurasian steppe is. They appear to see it as a continent of its own, when, in fact, it's a topographical feature and ecoregion that straddles the continents of Europe and Asia. That's why it's called the Eurasian steppe, and it's made up of three main parts: the Pontic-Caspian steppe of Eastern Europe, the Kazakh steppe of Central Asia, and the Eastern steppe of Mongolia.

Here's the same map with a few corrections (in red). Much better, don't you think?

Citation...

Antonio et al., Stable population structure in Europe since the Iron Age, despite high mobility, bioRxiv, posted May 16, 2022, doi: https://doi.org/10.1101/2022.05.15.491973

See also...

Matters of geography