Showing posts with label Ukraine. Show all posts

Monday, February 12, 2024

The Nalchik surprise


If, like Iosif Lazaridis, you subscribe to the idea that the Yamnaya people carry early Anatolian farmer-related admixture that spread into Eastern Europe via the Caucasus, then I've got great news for you.

We now have a human sample from the Eneolithic site of Nalchik in the North Caucasus, labeled NL122, that packs well over a quarter of this type of ancestry (see here). Below is a quick G25/Vahaduo model to illustrate the point (please note that Turkey_N = early Anatolian farmers).

Target: Nalchik_Eneolithic:NL122
Distance: 2.1934% / 0.02193447
60.8 Russia_Steppe_Eneolithic
26.2 Turkey_N
13.0 Georgia_Kotias
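
For readers who want to see what this kind of model does under the hood, here's a minimal Python sketch. It is not Vahaduo's actual algorithm; it simply brute-forces whole-percent mixtures of three sources and keeps the one with the smallest Euclidean distance to the target. The coordinates are made-up three-dimensional toy values (real G25 coordinates have 25 dimensions), and the function names are hypothetical. The last line shows how the two distance figures in the output above relate: the percentage is just the raw distance multiplied by 100.

```python
import math

def mix(sources, weights):
    """Weighted average of the source coordinate vectors."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]

def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_three_way(target, sources):
    """Try every whole-percent three-way split and keep the closest fit."""
    best = None
    for a in range(101):
        for b in range(101 - a):
            c = 100 - a - b
            fit = mix(sources, (a / 100, b / 100, c / 100))
            d = distance(target, fit)
            if best is None or d < best[0]:
                best = (d, (a, b, c))
    return best

# Toy coordinates (assumed for illustration, not real G25 data).
toy_sources = [
    [0.10, 0.02, -0.01],
    [0.03, -0.08, 0.04],
    [-0.05, 0.06, 0.09],
]
# Build a target that is exactly 60/30/10 so the answer is known in advance.
toy_target = mix(toy_sources, (0.6, 0.3, 0.1))

best_distance, best_percentages = fit_three_way(toy_target, toy_sources)
print(best_percentages, f"{best_distance * 100:.4f}%")

# The reported "2.1934% / 0.02193447" is the same number in two formats.
print(f"{0.02193447 * 100:.4f}%")
```

A lower distance means the weighted mix of sources sits closer to the target, which is why the models in this post are compared by their distance values.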

On the other hand, if, again like Iosif Lazaridis, you subscribe to the idea that the Indo-European language spread into Eastern Europe via the Caucasus in association with this early Anatolian farmer-related admixture, then I've got terrible news for you.

That's because NL122 is apparently dated to a whopping 5197-4850 BCE (see here). This dating might be somewhat bloated, possibly due to what's known as the reservoir effect, because the Nalchik archeological site is generally carbon dated to 4840–4820 BCE.

However, even with the younger dating, this would still mean that early Anatolian farmer-related ancestry arrived in the North Caucasus, and thus in Eastern Europe, around 4,800 BCE at the latest. That's surprisingly early, and just too early to be relevant to any sort of Indo-European expansion from a necessarily even earlier Proto-Indo-Anatolian homeland somewhere south of the Caucasus.

This means that NL122 effectively debunks Iosif Lazaridis' Indo-Anatolian hypothesis. Unless, that is, Iosif can provide evidence for a more convoluted scenario, in which there are at least two early Anatolian farmer-related expansions into Eastern Europe via the Caucasus, and the expansion relevant to the arrival of Indo-European speech came well after 5,000 BCE.

I haven't done any detailed analyses of NL122 with formal stats and qpAdm. But my G25/Vahaduo runs suggest that it might be possible to model the ancestry of the Yamnaya people with around 10% admixture from a population similar to NL122.

Target: Russia_Samara_EBA_Yamnaya
Distance: 3.4123% / 0.03412328
72.6 Russia_Progress_Eneolithic
18.2 Ukraine_N
9.2 Nalchik_Eneolithic

However, I don't subscribe to the idea that the Yamnaya people carry early Anatolian farmer-related admixture that spread into Eastern Europe via the Caucasus (on top of what is already found in Progress Eneolithic). Based on basic logic and a wide range of my own analyses, I believe that they acquired this type of ancestry from early European farmers, probably associated with the Trypillia culture. For instance...

Target: Russia_Samara_EBA_Yamnaya
Distance: 3.2481% / 0.03248061
80.2 Russia_Progress_Eneolithic
13.6 Ukraine_Neolithic
6.2 Ukraine_VertebaCave_MLTrypillia
0.0 Nalchik_Eneolithic

Another way to show this is with a Principal Component Analysis (PCA) that highlights a Yamnaya cline made up of the Yamnaya, Steppe Eneolithic and Ukraine Neolithic samples. As you can see, dear reader, there's no special relationship between the Yamnaya cline and Nalchik_Eneolithic. The Yamnaya samples, which are sitting near the eastern end of the Yamnaya cline, instead seem to show a subtle shift towards the Trypillian farmers.
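
To make the idea of a sample relating, or not relating, to a cline a bit more concrete, here's a rough Python sketch of one way to quantify it: treat the cline as a straight line between two endpoint positions, then measure how far along the line a sample falls and how far off the line it sits. This is only an illustration and not how the plot was produced. The two-dimensional coordinates are invented toy values and the function name is hypothetical.

```python
import math

def project_onto_cline(point, end_a, end_b):
    """Return (position, offset) of a point relative to the line end_a -> end_b.

    position: 0.0 at end_a, 1.0 at end_b (can fall outside that range).
    offset:   perpendicular distance from the line.
    """
    direction = [b - a for a, b in zip(end_a, end_b)]
    rel = [p - a for a, p in zip(end_a, point)]
    length_sq = sum(d * d for d in direction)
    position = sum(r * d for r, d in zip(rel, direction)) / length_sq
    closest = [a + position * d for a, d in zip(end_a, direction)]
    offset = math.sqrt(sum((p - c) ** 2 for p, c in zip(point, closest)))
    return position, offset

# Invented 2D toy coordinates standing in for PCA positions.
cline_west = [0.0, 0.0]   # hypothetical western end of a cline
cline_east = [4.0, 0.0]   # hypothetical eastern end of a cline
on_cline = [3.0, 0.0]     # a sample sitting on the cline
off_cline = [2.0, 1.5]    # a sample well away from it

print(project_onto_cline(on_cline, cline_west, cline_east))
print(project_onto_cline(off_cline, cline_west, cline_east))
```

A sample with a small offset behaves like a member of the cline, while a large offset suggests it is pulled towards something else.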

Indeed, I also don't exactly understand the recent infatuation among many academics, especially Iosif Lazaridis and his colleagues, with trying to put the Proto-Indo-Anatolian homeland somewhere south of the Caucasus. Considering all of the available multidisciplinary data, I'd say it still makes perfect sense to put it in the Sredny Stog culture of the North Pontic steppe, in what is now Ukraine.

Please note that all of the G25 coordinates used in my models and the PCA are available HERE.

See also...

The Caucasus is a semipermeable barrier to gene flow

Saturday, November 4, 2023

Slavs have little, if any, Scytho-Sarmatian ancestry


Here's an abstract of a new study from the David Reich Lab about ancient Slavs, titled "Genetic identification of Slavs in Migration Period Europe using an IBD sharing graph". Emphasis is mine:

Popular methods of genetic analysis relying on allele frequencies such as PCA, ADMIXTURE and qpAdm are not suitable for distinguishing many populations that were important historical actors in the Migration Period Europe. For instance, differentiating Slavic, Germanic, and Celtic people is very difficult relying on these methods, but very helpful for archaeologists given a large proportion of graves with no inventory and frequent adoption of a different culture. To overcome these problems, we applied a method based on autosomal haplotypes. Imputation of missing genotypes and phasing was performed according to a protocol by Rubinacci et al. (2021), and IBD inference was done for ancient Eurasian individuals with data available at >600,000 1240K sites. IBD links for a subset of these individuals were represented as a graph, visualized with a force-directed layout algorithm, and clusters in this graph are inferred with the Leiden algorithm. One of the clusters in the IBD graph emerged that includes nearly all individuals in the dataset annotated archaeologically as “Slavic”. According to PCA a hypothesis for the origin of this population can be proposed: it was formed by admixture of a Baltic-related group with East Germanic people and Sarmatians or Scythians. The individuals belonging to the “Slavic” IBD sharing cluster form a chronological gradient on the PCA plot, with the earliest samples close to the Baltic LBA/EIA group. Later “Slavic” individuals are shifted to the right, closer to Central and Southern Europeans and probably reflecting further admixture of Slavs with local populations during the Migration Period.

Apparently this abstract is causing a bit of confusion online because of the mention of possible Sarmatian or Scythian ancestry in Slavs.

However, it's important to understand that the authors are referring to certain Slavic or even just Slavic-related individuals, usually from culturally heterogeneous frontier settlements deep in what is now Russia.

So yes, it's possible that some of these individuals carry Sarmatian, Scythian or other exotic eastern ancestry. But even if this is true, then obviously we can't extend this inference to all ancient and modern-day Slavs.

Indeed, below is a G25/Vahaduo Principal Component Analysis (PCA) that shows why modern-day Slavic speakers can't be linked genetically to Sarmatians or Scythians. To experience a more detailed version of the PCA paste the data here into the relevant field here.

As you can see, dear reader, most of the Slavs (Belarusians, Poles, Ukrainians and many Russians) cluster with the Irish near the western end of the plot.

Some Russians are shifted significantly east of them along the "Uralic cline" and, as a result, they cluster with various Uralic speakers such as Mordovians. That's because when Slavs migrated deep into what is now northern Russia they mixed with Uralic speakers who were there before them.

Most of the Sarmatians and Scythians form a cluster southeast of the Slavs and Irish because they carry significant levels of East Asian ancestry. This type of eastern ancestry is basically missing in modern-day Slavs (see here).

Several of the Scythians cluster among the Slavs and Irish, but that's because they're genetic outliers, whose existence, if anything, suggests that some Scythians had significant Slavic-related and/or Irish-related ancestry.

Now, even though most of the Slavs do cluster with the Irish in the above PCA plot, I strongly disagree with the authors of the abstract when they claim that "differentiating Slavic, Germanic, and Celtic people is very difficult" with PCA. It's actually pretty damn easy and I've been doing it successfully for many years. For instance, see here.

See also...

Wielbark Goths were overwhelmingly of Scandinavian origin

The Caucasus is a semipermeable barrier to gene flow

Sunday, July 23, 2023

Dear Sandra, Wolfgang...a problem


In their recent paper, titled Early contact between late farming and pastoralist societies in southeastern Europe, Penske et al. make the following claim:

By contrast, Yamnaya Caucasus individuals from the southern steppe can be modelled as a two-way model of around 76% Steppe Eneolithic and 26% Caucasus Eneolithic/Maykop, confirming the findings of Lazaridis and colleagues [47]. This two-way mix (40% + 60%, respectively) also provides a well-fit model (P = 0.09) for the Ozera outlier individual, consistent with the position in PCA and corroborating an influence from the Caucasus.

Err, nope.

The Ozera Yamnaya outlier, a female dated to 3096-2913 calBCE, is, in fact, a ~50/50 mix between standard Yamnaya and Late Maykop. It's a result that is totally unambiguous.

There are a number of ways to demonstrate this fact. For example, with the qpAdm software that was also used by Penske et al., except with different outgroups or right pops. Please note that in my dataset the Ozera outlier is labeled Ukraine_Ozera_EBA_Yamnaya_o.

right pops:
Cameroon_SMA
Levant_N
Iran_GanjDareh_N
Iran_C_SehGabi
Georgia_HG
Turkey_N
Serbia_IronGates_Mesolithic
Russia_WestSiberia_HG
Russia_Karelia_HG
Latvia_HG
Russia_Boisman_MN
Brazil_LapaDoSanto_9600BP

Ukraine_Ozera_EBA_Yamnaya_o
Russia_Caucasus_EneolithicMaykop 0.554±0.031
Russia_Steppe_Eneolithic 0.446±0.031
P-value 0.00109868 (FAIL)


Ukraine_Ozera_EBA_Yamnaya_o
Russia_LateMaykop 0.512±0.035
Russia_Samara_EBA_Yamnaya 0.488±0.035
P-value 0.462447 (PASS)
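
The two results above can be put side by side with a few lines of Python. This is just a bookkeeping sketch: it checks that the estimated proportions in each model add up to one and applies a pass mark of P ≥ 0.05, which is an assumed conventional cutoff rather than anything produced by qpAdm itself. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class QpAdmModel:
    target: str
    weights: dict      # source label -> estimated proportion
    p_value: float

def passes(model, threshold=0.05):
    """A model 'passes' when its p-value is at or above the threshold."""
    return model.p_value >= threshold

models = [
    QpAdmModel(
        "Ukraine_Ozera_EBA_Yamnaya_o",
        {"Russia_Caucasus_EneolithicMaykop": 0.554, "Russia_Steppe_Eneolithic": 0.446},
        0.00109868,
    ),
    QpAdmModel(
        "Ukraine_Ozera_EBA_Yamnaya_o",
        {"Russia_LateMaykop": 0.512, "Russia_Samara_EBA_Yamnaya": 0.488},
        0.462447,
    ),
]

for m in models:
    total = sum(m.weights.values())
    verdict = "PASS" if passes(m) else "FAIL"
    print(" + ".join(m.weights), f"sum={total:.3f}", f"P={m.p_value:g}", verdict)
```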

I can also do it with the Global25/Vahaduo method. And you, dear reader, can too, by putting the Target and Source Global25 coords from the text file here into the relevant fields here.

Target: Ukraine_Ozera_EBA_Yamnaya_o
Distance: 2.9292% / 0.02929202
50.6 Russia_Samara_EBA_Yamnaya
49.4 Russia_Caucasus_LateMaykop
0.0 Russia_Caucasus_EneolithicMaykop
0.0 Russia_Steppe_Eneolithic

Moreover, here's a self-explanatory Principal Component Analysis (PCA) plot that illustrates why my Late Maykop/Samara Yamnaya combo is much better than the reference populations used by Penske and colleagues. It was done with the PCA tools here.

I'm pointing this out for two main reasons. First of all, this is a fairly obvious mistake that should've been avoided, especially considering the level of expertise and experience among the authors (such as Wolfgang Haak and Johannes Krause).

Secondly, it's important to understand that the Ozera outlier comes out almost exactly 50% Samara Yamnaya because the standard Yamnaya genotype already existed well before she was alive, and thus she cannot be used to corroborate any sort of influence from the Caucasus in the formation of the mainstream Yamnaya population.


As for the Yamnaya Caucasus individuals, I don't know why Penske et al. attempted to model their ancestry as a group, because they don't form a coherent genetic cluster. RK1001 and ZO2002 are fairly similar to standard Yamnaya samples, while RK1007 and SA6010 resemble Eneolithic steppe samples from the Progress burial site. This is what happens when I try to reproduce the Penske et al. model with my outgroups.

Russia_Caucasus_EBA_Yamnaya
Russia_Caucasus_EneolithicMaykop 0.187±0.019
Russia_Steppe_Eneolithic 0.813±0.019
P-value 4.15842e-06 (HARD FAIL)

Oh, and Penske et al. modeled the ancestry of mainstream Yamnaya as a three-way mixture with Steppe Eneolithic, Caucasus Eneolithic/Maykop and Ukraine Neolithic (or Ukraine N). They succeeded, but with my outgroups it's another hard fail.

Russia_Samara_EBA_Yamnaya
Russia_Caucasus_EneolithicMaykop 0.177±0.017
Russia_Steppe_Eneolithic 0.706±0.026
Ukraine_N 0.116±0.014
P-value 4.73919e-07 (HARD FAIL)

Admittedly, proximal models aren't easy to get right. And if you throw enough outgroups into a model, a large proportion of plausible models will fail. But I'm somewhat taken aback by these poor statistical fits.

In my opinion, mainstream Yamnaya doesn't harbor any Caucasus ancestry that wasn't already present on the Pontic-Caspian steppe during the Eneolithic or even much earlier (see here). But ultimately this problem can only be solved with direct evidence from ancient DNA, so let's now wait patiently for the right samples.

Citation...

Penske et al., Early contact between late farming and pastoralist societies in southeastern Europe, Nature, https://doi.org/10.1038/s41586-023-06334-8

See also...

Understanding the Eneolithic steppe

Monday, September 19, 2022

Dear Iosif...Yamnaya


Even though the Yamnaya culture probably originated in what is now Ukraine, the earliest Yamnaya samples currently available are from the modern-day Samara region of Russia. They mostly date to around 3,000 BCE. I can analyze their ancestry using Principal Component Analysis (PCA) data.

Target: RUS_Yamnaya_Samara
Distance: 3.2816% / 0.03281581
81.0 RUS_Progress_En
14.4 UKR_N
4.6 HUN_Vinca_MN
0.0 ARM_Aknashen_N
0.0 ARM_Masis_Blur_N
0.0 AZE_Caucasus_lowlands_LN
0.0 BGR_C
0.0 BGR_Dzhulyunitsa_N
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Hajji_Firuz_C
0.0 IRN_Seh_Gabi_C
0.0 IRN_Tepe_Abdul_Hosein_N
0.0 IRN_Wezmeh_N
0.0 RUS_Darkveti-Meshoko_En
0.0 RUS_Maykop
0.0 RUS_Maykop_Late
0.0 RUS_Maykop_Novosvobodnaya

The above results show exactly zero ancestry from West Asia. Admittedly, both RUS_Progress_En and HUN_Vinca_MN are European ancients with significant West Asian-related ancestry. However, this ancestry is very distantly West Asian-related, and, for instance, it almost certainly has no relevance to the Indo-Anatolian homeland debate.

The Afanasievo culture of Central Asia is regarded as an early offshoot of the Yamnaya culture. A good number of Afanasievo samples are available, so let's see whether their results match those of the Yamnaya folks. And indeed they do, since BGR_C is very similar to HUN_Vinca_MN.

Target: RUS_Afanasievo
Distance: 3.4055% / 0.03405499
84.0 RUS_Progress_En
11.4 UKR_N
4.6 BGR_C
0.0 ARM_Aknashen_N
0.0 ARM_Masis_Blur_N
0.0 AZE_Caucasus_lowlands_LN
0.0 BGR_Dzhulyunitsa_N
0.0 HUN_Vinca_MN
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Hajji_Firuz_C
0.0 IRN_Seh_Gabi_C
0.0 IRN_Tepe_Abdul_Hosein_N
0.0 IRN_Wezmeh_N
0.0 RUS_Darkveti-Meshoko_En
0.0 RUS_Maykop
0.0 RUS_Maykop_Late
0.0 RUS_Maykop_Novosvobodnaya

To try this at home, stick the PCA data in the text file here into the relevant fields here and crank up the "Cycles" to 4X. You should see exactly zero ancestry from West Asia every time.

I can, more or less, reproduce these results with tools that are routinely used in peer reviewed papers. Below is a table of mixture models produced with the qpAdm software. I set the pass threshold to P ≥0.05, which is an arbitrary value, but the pattern is clear. The full output from each qpAdm run is available here.


Importantly, qpAdm needs to be fed the relevant "right pop" outgroups to be able to discriminate accurately between reference populations.

right pops:
CMR_Shum_Laka_8000BP
MAR_Taforalt
Levant_Natufian
IRN_Ganj_Dareh_N
Levant_PPNB
TUR_Marmara_Barcin_N
HUN_Starcevo_N
HUN_Koros_N
SRB_Iron_Gates_HG
Iberia_Southeast_Meso
RUS_Karelia_HG
RUS_West_Siberia_HG
RUS_Boisman_MN
MNG_North_N
TWN_Hanben
BRA_LapaDoSanto_9600BP

So, for instance, if one were to use in this role the modern-day Mbuti people, as opposed to, say, the ancient hunter-gatherers of Shum Laka, one might find that many models look statistically better than they should. And then one might also find that the Yamnaya samples carry significant West Asian ancestry.

Actually, I'm not opposed to the idea of some West Asian ancestry in Yamnaya. Indeed, considering the extraordinary mobility of the Yamnaya people and their Eneolithic predecessors on the Pontic-Caspian steppe, it would be unusual if they didn't come into close contact and mix, to some degree, with their neighbors from West Asia.

However, based on everything I've seen, from uniparental markers to different types of autosomal genetic tests, it's clear to me that there's no substantial West Asian ancestry in any Yamnaya samples, except for an outlier female from modern-day Ozera, Ukraine (see here).

Admittedly, ancient DNA does have a habit of throwing curveballs, so I'm eagerly awaiting new Eneolithic samples from the Pontic-Caspian steppe, particularly those associated with the Yamnaya-like Sredni Stog culture, to help finally settle this issue.

Believe it or not, a contact recently sent me a supposedly unpublished female sample from a ~4,200 BCE Sredni Stog burial in modern-day Igren, east central Ukraine. So what the hell, let's assume for the time being that this sample is genuine. This is how Miss Sredni Stog behaves in my PCA mixture test.

Target: UKR_Sredni_Stog
Distance: 4.0769% / 0.04076877
75.6 RUS_Progress_En
17.8 UKR_N
6.6 HUN_Vinca_MN
0.0 ARM_Aknashen_N
0.0 ARM_Masis_Blur_N
0.0 AZE_Caucasus_lowlands_LN
0.0 BGR_C
0.0 BGR_Dzhulyunitsa_N
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Hajji_Firuz_C
0.0 IRN_Seh_Gabi_C
0.0 IRN_Tepe_Abdul_Hosein_N
0.0 IRN_Wezmeh_N
0.0 RUS_Darkveti-Meshoko_En
0.0 RUS_Maykop
0.0 RUS_Maykop_Late
0.0 RUS_Maykop_Novosvobodnaya

Wow, just wow. Have we actually found Miss Proto-Yamnaya? What does qpAdm have to say in the matter?

UKR_Sredni_Stog
HUN_Vinca_MN 0.034±0.028
RUS_Progress_En 0.796±0.045
UKR_N 0.170±0.034
P-value 0.41088
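
A quick way to compare this output with the PCA-based proportions above is to express each difference in units of the qpAdm standard error. The Python sketch below does that using the numbers already shown in this post. Treating a gap of less than two standard errors as consistent is a rule of thumb assumed here, not a formal test, and the function name is hypothetical.

```python
# qpAdm estimates and standard errors for UKR_Sredni_Stog (from the output above).
qpadm = {
    "RUS_Progress_En": (0.796, 0.045),
    "UKR_N": (0.170, 0.034),
    "HUN_Vinca_MN": (0.034, 0.028),
}
# PCA-based (Vahaduo) proportions for the same target, converted from percent.
pca = {"RUS_Progress_En": 0.756, "UKR_N": 0.178, "HUN_Vinca_MN": 0.066}

def gap_in_standard_errors(estimate, std_err, other):
    """How many standard errors separate 'other' from the qpAdm estimate."""
    return (other - estimate) / std_err

gaps = {
    source: gap_in_standard_errors(est, se, pca[source])
    for source, (est, se) in qpadm.items()
}
for source, gap in gaps.items():
    print(f"{source}: {gap:+.2f} SE")
```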

Again, this is an excellent match with the results from my PCA test, especially if we take into account the standard errors. However, with qpAdm it's also possible to model this individual's ancestry as part West Asian.

UKR_Sredni_Stog
AZE_Caucasus_lowlands_LN 0.056±0.039
RUS_Progress_En 0.761±0.061
UKR_N 0.183±0.036
P-value 0.465667

As I pointed out above, it's plausible for such people to harbor some West Asian ancestry, but I'm very sceptical that this is really the case here, despite the rather solid qpAdm statistical fit. That's because UKR_Sredni_Stog is not a high quality sample, and, from my experience, qpAdm often has problems analyzing fine scale ancestry in singletons or even small groups that show excess DNA damage and/or offer much less than a million markers.
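
One simple check on the minor components in both models is to divide each estimate by its standard error. In the sketch below, a ratio under about 2 is read as "not clearly different from zero". That cutoff is an assumption made for illustration, and the helper name is hypothetical.

```python
# (estimate, standard error) pairs copied from the two qpAdm models above.
model_european = {
    "HUN_Vinca_MN": (0.034, 0.028),
    "RUS_Progress_En": (0.796, 0.045),
    "UKR_N": (0.170, 0.034),
}
model_west_asian = {
    "AZE_Caucasus_lowlands_LN": (0.056, 0.039),
    "RUS_Progress_En": (0.761, 0.061),
    "UKR_N": (0.183, 0.036),
}

def estimate_to_error_ratios(model):
    """Estimate divided by standard error for every source in a model."""
    return {source: est / se for source, (est, se) in model.items()}

labelled = [("Vinca model", model_european), ("Caucasus lowlands model", model_west_asian)]
for name, model in labelled:
    for source, ratio in estimate_to_error_ratios(model).items():
        flag = "" if ratio >= 2 else "  <- within ~2 SE of zero"
        print(f"{name}: {source} {ratio:.1f}{flag}")
```

On these numbers, both minor components (HUN_Vinca_MN and AZE_Caucasus_lowlands_LN) come out below 2, so neither model pins down its smallest source very firmly.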

See also...

Dear Iosif, about that ~2%

But Iosif, what about the Phrygians?

Saturday, August 27, 2022

Dear Iosif...

Update 29/08/22: Dear Iosif #2

...


I'm skimming through the Lazaridis, Alpaslan-Roodenberg et al. paper that just came out at Science. And I feel like someone punched me in the face.

Nevertheless, I'll try to be diplomatic. Suffice to say, for now, that there's some rather strange stuff in this paper.

The main problem is that the authors are attempting to study fine scale ancestry with a somewhat rough distal model. As a result, they miss important details.

For instance, this quote is from the paper's supplementary PDF file, freely available here.

However, the complete lack of association of R-haplogroup descendants and EHG ancestry in either Armenia or Iran is consistent with either a massive dilution of EHG ancestry in these populations resulting in the dissociation of Y-chromosome lineages from autosomal ancestry over time, or with a scenario in which R-M269 was not associated with substantial EHG ancestry to begin with.

Obviously, EHG means Eastern European Hunter-Gatherer. But why focus on EHG? Surely, this makes little sense when looking at the genetic prehistory of West Asia, because no one ever argued that this region was settled by EHG populations. It was widely settled by Yamnaya-related groups, with already heavily diluted EHG ancestry, during the metal ages.

OK, so the authors are actually aware of the potential dilution of EHG ancestry, but they don't really do anything about it.

If we're looking at the origins of West Asian R1b-M269, and using its association with autosomal DNA components as a guide, then we should be focusing on Yamnaya-related ancestry.

For instance, here's a fine scale ancient ancestry model based on Principal Component Analysis (PCA) data. It shows the ancestry proportions of two relatively high coverage Iron Age males from two different sites in Iran from the Lazaridis, Alpaslan-Roodenberg et al. dataset. Both belong to R1b-M269 and both show significant Yamnaya-related ancestry.

Target: IRN_HajjiFiruz_IA:I2327_all
Distance: 2.2930% / 0.02292994
39.6 Kura-Araxes_ARM_Kaps
24.2 IRN_Ganj_Dareh_N
18.2 Levant_PPNB
12.4 Yamnaya_RUS_Samara
4.4 Anatolia_Tepecik_Ciftlik_N
1.2 Han

Target: IRN_Hasanlu_IA:I4232_all
Distance: 2.5179% / 0.02517895
26.0 IRN_Ganj_Dareh_N
25.6 Kura-Araxes_ARM_Kaps
24.4 Anatolia_Tepecik_Ciftlik_N
15.8 Yamnaya_RUS_Samara
7.6 Levant_PPNB
0.6 IRN_Shahr_I_Sokhta_BA2

As a control, here's an earlier, Chalcolithic sample bearing Y-haplogroup J2b from the same region. Not surprisingly, this individual totally lacks the Yamnaya-related signal.

Target: IRN_HajjiFiruz_ChL:I4241_all
Distance: 2.7938% / 0.02793782
32.6 Kura-Araxes_ARM_Kaps
25.6 IRN_Ganj_Dareh_N
23.6 Anatolia_Tepecik_Ciftlik_N
18.2 Levant_PPNB

Overall, these results make perfect sense. I could probably locate very minor signals of EHG ancestry in the Iron Age samples, but that would be more difficult and much less certain, so I won't bother.

Soon I'll be able to rerun these analyses with Bronze Age samples from Dagestan and surrounds. That should bump up the levels of Yamnaya-related ancestry and improve the statistical fits (wink, wink, nudge, nudge).

Disappointingly, Lazaridis, Alpaslan-Roodenberg et al. go so far as to suggest that R1b-M269 may have originated in West Asia.

However, considering the scores of ancient Eastern European populations rich in R1b-M269 and many near and far related subclades of R1b, this makes no sense whatsoever.

Indeed, contemplating nowadays that R1b-M269 might be native to West Asia, where R1b only starts showing up in the ancient DNA record during the Copper Age, is about as stupid as claiming that gravity doesn't exist.

Largely due to their distal model approach, Lazaridis, Alpaslan-Roodenberg et al. also argue that the Indo-Anatolian homeland was located in what is now Armenia and surrounds. I'm far from convinced that this solution will stand the test of time.

In terms of the more widely accepted theory that the Indo-Anatolian homeland was located on the Pontic-Caspian steppe in Eastern Europe, the most important samples in the paper are the three Bronze Age individuals from Yassitepe in western Anatolia. That's because they're from a region that is traditionally seen as the entry point of Indo-Anatolian speakers into Anatolia from the European steppe via the Balkans.

Interestingly, individual I5737, dated to 2035-1900 calBCE or the Middle Bronze Age, belongs to Y-chromosome haplogroup I2a-P78, which surely must be a signal of European ancestry. I see this as a significant result.

Here's how the trio from Yassitepe look in my fine scale ancient ancestry model. Minor Yamnaya-related ancestry does show up, although, admittedly, it might just be noise in individual I5735.

Target: TUR_Aegean_Izmir_Yassitepe_MBA:I5737
Distance: 2.7507% / 0.02750748
58.4 Anatolia_Barcin_N
20.4 Kura-Araxes_ARM_Kaps
9.2 Anatolia_Tepecik_Ciftlik_N
5.6 IRN_Ganj_Dareh_N
3.8 Yamnaya_RUS_Samara
2.6 Levant_PPNB

Target: TUR_Aegean_Izmir_Yassıtepe_EBA:I5733
Distance: 2.7969% / 0.02796887
52.0 Anatolia_Barcin_N
27.2 Kura-Araxes_ARM_Kaps
8.6 Levant_PPNB
6.2 Yamnaya_RUS_Samara
6.0 IRN_Ganj_Dareh_N

Target: TUR_Aegean_Izmir_Yassıtepe_EBA:I5735
Distance: 3.1270% / 0.03127009
36.0 Kura-Araxes_ARM_Kaps
32.4 Anatolia_Tepecik_Ciftlik_N
26.0 Anatolia_Barcin_N
2.8 IRN_Shahr_I_Sokhta_BA2
1.2 Yamnaya_RUS_Samara
1.0 Levant_PPNB
0.6 MAR_Taforalt

This isn't much, especially considering it's already late 2022, but it's better than nothing. Fortunately, more samples from Bronze Age western Anatolia are on the way (wink, wink, nudge, nudge).

However, I'm not done with the Lazaridis, Alpaslan-Roodenberg et al. dataset yet. I'm planning to spend much more time on this blog in the coming weeks and months and will be using their samples in a wide range of analyses.

Citation...

Iosif Lazaridis, Songül Alpaslan-Roodenberg et al., The genetic history of the Southern Arc: A bridge between West Asia and Europe, Science 377, eabm4247 (2022)

See also...

Dear Iosif #3

But Iosif, what about the Phrygians?

Dear Iosif, about that ~2%

Dear Iosif...Yamnaya

Saturday, March 12, 2022

Lousy intel


I don't like discussing current events and politics here, but it's impossible to ignore what is happening in Eastern Europe.

It's a tragedy and catastrophe for both Ukraine and Russia. It's also likely to have a negative impact on ancient DNA research, Indo-European studies, and thus also on this blog.

I'm seeing a lot of confusion online about why Russia invaded Ukraine, but I don't think it's very complicated.

After getting the better of the West in recent years, Russia finally overreached and made a massive tactical blunder, in large part because of lousy intel. More broadly, I also see this as the Soviet Union's dead cat bounce moment.

Russia will now have to reinvent itself, possibly as China's junior partner or even vassal state.

As for the "special military operation", Russia's initial plan was to achieve a quick, relatively bloodless victory, followed by a military parade in Kyiv. But obviously that's not going to happen.

Russia's backup plan, if we can call it that, seems to be to keep pushing into Ukraine at any cost, and hope that the Ukrainians finally tap out. But right now that looks like a long shot.


See also...

Matters of geography

Wednesday, September 15, 2021

Yamnaya people drank horse milk (Wilkin et al. 2021)


Over at Nature at this LINK. I'm guessing the claim that Yamnaya pastoralists lived in Scandinavia is a huge typo. Obviously, the authors are referring to the people of the Corded Ware culture (CWC). From the paper:

During the Early Bronze Age, populations of the western Eurasian steppe expanded across an immense area of northern Eurasia. Combined archaeological and genetic evidence supports widespread Early Bronze Age population movements out of the Pontic–Caspian steppe that resulted in gene flow across vast distances, linking populations of Yamnaya pastoralists in Scandinavia with pastoral populations (known as the Afanasievo) far to the east in the Altai Mountains [1,2] and Mongolia [3]. Although some models hold that this expansion was the outcome of a newly mobile pastoral economy characterized by horse traction, bulk wagon transport [4,5,6] and regular dietary dependence on meat and milk [5], hard evidence for these economic features has not been found. Here we draw on proteomic analysis of dental calculus from individuals from the western Eurasian steppe to demonstrate a major transition in dairying at the start of the Bronze Age. The rapid onset of ubiquitous dairying at a point in time when steppe populations are known to have begun dispersing offers critical insight into a key catalyst of steppe mobility. The identification of horse milk proteins also indicates horse domestication by the Early Bronze Age, which provides support for its role in steppe dispersals. Our results point to a potential epicentre for horse domestication in the Pontic–Caspian steppe by the third millennium BC, and offer strong support for the notion that the novel exploitation of secondary animal products was a key driver of the expansions of Eurasian steppe pastoralists by the Early Bronze Age.

Wilkin, S., Ventresca Miller, A., Fernandes, R. et al. Dairying enabled Early Bronze Age Yamnaya steppe expansions. Nature (2021). https://doi.org/10.1038/s41586-021-03798-4

See also...

On the origin of the Corded Ware people

Monday, July 27, 2020

Ancient ancestry proportions in present-day Europeans (to be continued)


This year has already been massive in all sorts of ways, including for new data and software releases. So I'm thinking it might be time to update many of the analyses that were featured at this blog a while ago.

Let's start with the classic hunter vs farmer vs herder mixture model for present-day European populations. The rules of the game are as follows:


- run the latest version of qpAdm using qpfstats output

- use transversion sites and 1240K capture data

- pick a set of diverse and chronologically sound outgroups

- for a model to be successful the p-value must reach 0.01

- tweak the left pops in models that are clearly underperforming

- follow high end scientific literature, logic and common sense


Obviously, the reason that I decided to limit my analysis to markers from transversion sites is to mitigate problems associated with modeling the ancestry of modern, high quality samples with relatively low quality ancients. One of these problems appears to be qpAdm assigning faux East Asian/Siberian admixture to present-day Europeans (for instance, see figure 4 here).

My starting reference populations and outgroups are listed below. In qpAdm terminology the former are known as the "left pops", while the latter as the "right pops". Most of these samples are freely available at the David Reich Lab website here.

left pops:
HUN_Koros_N_HG
TUR_Barcin_N
UKR_Yamnaya

right pops:
CMR_Shum_Laka_8000BP
MAR_Taforalt
Levant_Natufian
IRN_Ganj_Dareh_N
Levant_PPNB
CZE_Vestonice16
BEL_GoyetQ116-1
Iberia_ElMiron
RUS_Karelia_HG
RUS_West_Siberia_HG
MNG_North_N
RUS_Ust_Kyakhta

As you can see, I picked a wide variety of right pops. But I chose most of them specifically to be able to differentiate the three streams of ancestry - from ancient hunters, farmers and herders - that are the focus of my analysis. I also intentionally avoided using samples in the right pops that may have experienced gene flow, including cryptic gene flow, from the populations in the left pops.

I somewhat speculatively earmarked HUN_Koros_N_HG, from the Early Neolithic Carpathian Basin, and UKR_Yamnaya, from the Early Bronze Age North Pontic steppe in what is now Ukraine, to represent the hunter-gatherer and pastoralist streams of ancestry, respectively.

That's because I expected HUN_Koros_N_HG to be the best proxy for the hunter-gatherer ancestry that was initially absorbed by the early farmers who fanned out from the Aegean region across much of the European continent, and of course it made sense to choose a steppe pastoralist population that was located close to Central Europe where such groups first made the biggest impact outside of the steppe.

Interestingly, HUN_Koros_N_HG and UKR_Yamnaya did prove to be among the most effective choices for the types of ancestries that they represented. For instance, UKR_Yamnaya generally produced much stronger statistical fits than a very similar set of Yamnaya samples from the Caspian steppe (more precisely, from the Samara region in Russia). However, this might well be an artifact, due to very specific characteristics of these few ancient individuals. Larger sample sets would be welcome, especially from Yamnaya sites in Ukraine.

Below, dear audience, is a spreadsheet featuring the preliminary results. Click on the image to view and/or download the spreadsheet. The general rule is that the higher the tail prob, or p-value, the more likely it is that the ancestry proportions are close to the truth (a tail prob of well below 0.05 is usually a strong indication that something isn't right). For a detailed look at each of the qpAdm runs, feel free to consult the zip file here.


Note, however, that many of the European groups in my burgeoning genotype dataset are yet to make an appearance in the spreadsheet. That's because their models with the standard left pops showed p-values well under 0.01, which essentially meant that they failed, and I'm still trying to make them work.

But round one has certainly revealed some fascinating stuff. For instance, except for Hungarians and Estonians, none of the Uralic-speaking groups can be modeled successfully in the standard three-way model.

However, I managed to significantly improve the statistical fits in their models by adding a Siberian population, RUS_Baikal_BA, to the left pops. This is unlikely to be a coincidence, because the Proto-Uralic homeland was almost certainly located in or very near Siberia. Iain Mathieson, please take note.

Saami
HUN_Koros_N_HG 0.134±0.043
RUS_Baikal_BA 0.270±0.015
TUR_Barcin_N 0.081±0.026
UKR_Yamnaya 0.515±0.058
chisq 19.865
tail prob 0.0108571
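
For anyone who wants to sanity check a tail prob, here's a minimal Python sketch that turns a chisq value into an upper-tail chi-square probability. Please note that the degrees of freedom aren't given above, so the value of 8 used here is purely my assumption, and the function name chi2_tail_prob is just an illustrative helper, not something from qpAdm.

```python
import math


def chi2_tail_prob(chisq: float, dof: int) -> float:
    """Upper-tail probability of a chi-square statistic for integer dof."""
    if chisq < 0 or dof < 1:
        raise ValueError("chisq must be >= 0 and dof must be >= 1")
    half = chisq / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)^k / k!
        total = sum(half ** k / math.factorial(k) for k in range(dof // 2))
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(dof-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    total = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (dof - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Saami model from above: chisq 19.865, with an ASSUMED 8 degrees of freedom
print(f"tail prob = {chi2_tail_prob(19.865, 8):.6f}")
```

Under that assumption the result comes out at about 0.0109, which is in line with the reported tail prob of 0.0108571. The small difference is what you'd expect from the chisq value being rounded to three decimals.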



Saturday, July 4, 2020

Fatyanovo males were rich in Y-haplogroup R1a-Z93 (Saag et al. 2020 preprint)


I'd say that thanks to this preprint we're now a lot closer to solving the mystery of the Sintashta people. It's available at bioRxiv (see the citation below). From the preprint:

Transition from the Stone to the Bronze Age in Central and Western Europe was a period of major population movements originating from the Ponto-Caspian Steppe. Here, we report new genome-wide sequence data from 28 individuals from the territory north of this source area - from the under-studied Western part of present-day Russia, including Stone Age hunter-gatherers (10,800-4,250 cal BC) and Bronze Age farmers from the Corded Ware complex called Fatyanovo Culture (2,900-2,050 cal BC). We show that Eastern hunter-gatherer ancestry was present in Northwestern Russia already from around 10,000 BC. Furthermore, we see a clear change in ancestry with the arrival of farming - the Fatyanovo Culture individuals were genetically similar to other Corded Ware cultures, carrying a mixture of Steppe and European early farmer ancestry and thus likely originating from a fast migration towards the northeast from somewhere in the vicinity of modern-day Ukraine, which is the closest area where these ancestries coexisted from around 3,000 BC.

...

Interestingly, in all individuals for which the chrY hg could be determined with more depth (n=6), it was R1a2-Z93 (Table 1, Supplementary Data 2), a lineage now spread in Central and South Asia, rather than the R1a1-Z283 lineage that is common in Europe [38,39].


Saag et al., Genetic ancestry changes in Stone to Bronze Age transition in the East European plain, bioRxiv, posted July 3, 2020, doi: https://doi.org/10.1101/2020.07.02.184507

See also...

Like three peas in a pod

Tuesday, May 19, 2020

A significant finding


At least five individuals from Neolithic burial sites in what is now Ukraine harbor ancestry that is normally associated with much later steppe populations. Labeled UKR_N_admixed in the plot below, these samples were part of the Mathieson et al. 2018 dataset and most were radiocarbon dated to well before 5,000 BCE. An interactive version of the plot is available here.


Their unusual ancestry probably explains why they form a cluster that appears to be pulling away from the ancient European hunter-gatherer cline towards the part of the plot home to RUS_Progress_En (from the Progress-2 Eneolithic burial site in the North Caucasus piedmont region). But, of course, there's more to this. For instance, consider the formal statistics-based qpAdm mixture models below:

UKR_N_admixed
RUS_Progress_En 0.083±0.021
UKR_N 0.917±0.021
chisq 7.461
tail prob 0.589238
Full output

UKR_N_admixed
RUS_Progress_En 0.172±0.021
SRB_Iron_Gates_HG 0.332±0.024
UKR_Meso 0.495±0.035
chisq 9.255
tail prob 0.321282
Full output

UKR_N_I1738
RUS_Progress_En 0.196±0.035
SRB_Iron_Gates_HG 0.414±0.039
UKR_Meso 0.390±0.056
chisq 7.913
tail prob 0.442006
Full output

Ergo, as much as a quarter of the genome of individual I1738, dated to 5473-5326 calBCE, might be derived from a population very similar to RUS_Progress_En. This is a big deal, because it's still widely believed that this type of ancestry didn't exist until the Eneolithic, and that it didn't spread significantly until the migrations of steppe pastoralists associated with the Early Bronze Age Yamnaya culture.
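
One rough way to see where "as much as a quarter" comes from is to look at the range implied by the ± values. The sketch below assumes that these are standard errors, as is usual for qpAdm output, and that a normal approximation with z = 1.96 is good enough for a ballpark interval. The helper names are hypothetical.

```python
def parse_estimate(text: str) -> tuple[float, float]:
    """Split a string like '0.196±0.035' into (estimate, standard_error)."""
    estimate, std_err = text.split("±")
    return float(estimate), float(std_err)


def normal_interval(estimate: float, std_err: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate interval: estimate ± z standard errors (normal approximation)."""
    return estimate - z * std_err, estimate + z * std_err


# RUS_Progress_En coefficients copied from the three models above
models = {
    "UKR_N_admixed (two-way)": "0.083±0.021",
    "UKR_N_admixed (three-way)": "0.172±0.021",
    "UKR_N_I1738": "0.196±0.035",
}

for label, text in models.items():
    est, se = parse_estimate(text)
    low, high = normal_interval(est, se)
    print(f"{label}: {est:.3f} ({low:.3f} to {high:.3f}), {est / se:.1f} SE from zero")
```

For I1738 this gives a range of roughly 0.13 to 0.26, so a quarter sits near the top end of what the model allows, and the estimate is well over five standard errors away from zero.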

I'm confident, nay, certain, that my findings will be confirmed directly with more Neolithic samples from present-day Ukraine and surrounds.

See also...

Understanding the Eneolithic steppe

Ancient DNA vs Ex Oriente Lux

Mixed marriages on the early Eneolithic steppe

Saturday, June 1, 2019

They came, they saw, and they mixed


Y-chromosome haplogroup N is strongly associated with Uralic-speaking populations. That's probably because it was a salient feature of the gene pool of the earliest Uralic speakers, and it went with them as they migrated across northern Eurasia. However, some of its younger subclades appear to have spread with the speakers of Indo-European and Turkic languages.

For instance, N-Y10931 seems to be a marker of the Rurikids, a Varangian dynasty that, according to most sources, ruled the Kievan Rus in what are now Russia and Ukraine. And the Kievan Rus was a loose medieval political federation in which Slavic, Finnic (west Uralic) and Germanic languages were probably spoken. The latest on the genetic genealogy of the Rurikids was presented a couple of days ago at the Centenary of Human Population Genetics conference in Moscow, and there's an abstract of the talk available here (download the PDF and scroll down to page 84).

I'm not aware of any Rurikids among the thousands of ancients in my dataset, or even of any samples belonging to N-Y10931. But I do have the genome of someone who belongs to N-Y4339, which, as per the abstract linked to above, is proximally ancestral to N-Y10931. Not only does this person come from Viking Age Scandinavia, but he was buried in a crouched position typical of Slavic funerary customs of the time.

The individual in question is vik_84001. His genome was published recently along with a paper on the population structure of the Swedish town of Sigtuna way back when it was a Viking stronghold (see here). This is where his Y-chromosome sequence, labeled ERS2540883, is positioned on the YFull Y-chromosome phylogenetic tree. Click on the image to go to YFull.


However, the result is likely to be compromised to some extent by missing data. If so, it's possible that vik_84001 does indeed belong to N-Y10931 and ought to be sitting near or even among that cluster of Russian samples (Rurik descendants?) at the bottom of the page.

In any case, vik_84001 seems to be the closest individual in the ancient DNA record to a Rurikid. The Principal Component Analysis (PCA) below is based on my Global25 data. It features 18 other Viking Age individuals from Sigtuna alongside vik_84001 (look for the black dots). The relevant datasheet is available here. Interestingly, despite his eastern Y-haplogroup, vik_84001 is one of the few Sigtuna ancients who clusters strongly with present-day Swedes.

But here's what happens when I model his ancestry proportions with the Global25/nMonte method using a wide range of reference populations from Northern and Eastern Europe. The Swedes in this model are the same as those in the PCA.

vik_84001
Swedish,84.6
Ingrian,9.2
Russian_Tver,6.2

Belarusian,0
Estonian,0
Finnish,0
Finnish_East,0
Karelian,0
Latvian,0
Mordovian,0
Russian_Kostroma,0
Russian_Kursk,0
Russian_Orel,0
Russian_Pinega,0
Russian_Smolensk,0
Russian_Voronez,0
Ukrainian,0
Vepsian,0

[1] "distance%=2.3778"

Yep, despite his position in the PCA, vik_84001 shows a strong signal of ancestry related to the present-day populations of northwestern Russia. I'm not sure what this means exactly, but it's certainly fascinating stuff. And, by the way, I usually wouldn't use so many similar reference populations in a single Global25/nMonte model because of the problem of "overfitting", but in some cases it's OK to do so if the nMonte algorithm has enough recent genetic drift to latch onto.
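
To give a feel for what this sort of distance-based mixture modeling does, here's a toy Python sketch. It is not the nMonte algorithm, which relies on Monte Carlo sampling. Instead it uses a plain grid search over mixture weights to find the blend of reference coordinates that lands closest to the target. The three reference populations and their coordinates are made up for illustration and have nothing to do with real Global25 data.

```python
import math


def mix(weights, sources):
    """Weighted average of the source coordinate vectors."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]


def compositions(total, parts):
    """Yield every way to split `total` units among `parts` non-negative integers."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def fit_mixture(target, sources, steps=100):
    """Grid-search weights (non-negative, summing to 1) minimising Euclidean distance."""
    best_weights, best_dist = None, math.inf
    for combo in compositions(steps, len(sources)):
        weights = tuple(c / steps for c in combo)
        dist = math.dist(target, mix(weights, sources))
        if dist < best_dist:
            best_weights, best_dist = weights, dist
    return best_weights, best_dist


# HYPOTHETICAL toy coordinates for three reference populations
names = ["Ref_A", "Ref_B", "Ref_C"]
sources = [
    [0.10, 0.02, -0.01],
    [0.05, -0.03, 0.04],
    [-0.02, 0.06, 0.01],
]
# Build a target that is, by construction, 80% Ref_A and 20% Ref_B
target = mix((0.8, 0.2, 0.0), sources)

weights, dist = fit_mixture(target, sources)
for name, w in zip(names, weights):
    print(f"{name},{w * 100:.1f}")
print(f"distance={dist:.6f}")
```

Because the toy target is an exact blend of two of the references, the search recovers those proportions with a distance of zero. With real data, and with many closely related references, several different blends can fit almost equally well, which is the "overfitting" problem mentioned above.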

See also...

More on the association between Uralic expansions and Y-haplogroup N

Fresh off the sledge

Uralic-specific genome-wide ancestry did make a significant impact in the East Baltic

It was always going to be this way

Conan the Barbarian probably belonged to Y-haplogroup R1a