Eurogenes Blog

Saturday, July 25, 2026

The G25 Proto-Indo-European master sheet

I'm putting together a G25 Proto-Indo-European master sheet that I'll be using in a future blog post about the Indo-European, or, if you wish, Indo-Anatolian homeland.

An early version of the master sheet is available HERE. Please note, the first pop label is my own while the second one is from the v66.2M.AADR dataset.

I'm aiming to expand it significantly and update it regularly. Below is a Vahaduo G25 West Eurasia PCA with all the samples. What am I missing? Feel free to let me know in the comments.

Spoiler alert! Here are a few of the revelations that you can expect in the aforementioned blog post about the Indo-European homeland:

- Yamnaya is almost certainly derived from Serednii Stih (aka Sredny Stog). I've been talking about this for years and you can probably pick it up without too much trouble in the PCA above. Finally, even Iosif Lazaridis, David Reich, Nick Patterson and friends are now on board with this idea as per their latest paper on the topic (see Lazaridis et al. 2025).

- However, I'm not convinced by the Lazaridis et al. hypothesis that the Caucasus–lower Volga (CLV) cline is intimately linked to the Indo-Anatolian and proto-Anatolian expansions. That's because the CLV cline is an artifact of isolation-by-distance working for thousands of years across a vast and highly diverse landscape, and it includes a wide range of populations that usually have nothing to do with each other.

- The question of who actually spoke Indo-Anatolian first will, in all likelihood, never be solved to everyone's satisfaction. But the number of true candidate groups is actually quite small, and I reckon they're almost certainly all highlighted in my PCA above. Wink, wink, nudge, nudge.

Saturday, July 18, 2026

The genetic history of Rus (Andreeva et al. 2025 preprint)

Over at bioRxiv at this LINK. As far as I can see, this is the most interesting preprint published in the last 12 months. At the same time, however, the authors could've done a much better job with the fine scale analysis of their samples. For instance, they rely solely on a West Eurasian-level Principal Component Analysis (PCA) to study intra-Northern European diversity. Hopefully, the paper will be better. Abstract:

The foundation of the first ancient Rus’ state occurred as a result of the consolidation of diverse communities inhabiting Eastern Europe during the second half of the first millennium CE. Historical sources imply that these communities mostly include East Slavs, whose settlement across a vast territory led to the emergence of the East Slavic/Rus’ culture within the Rus’ state. We generated genomic data for 200 medieval individuals from different locations to elucidate the origin and genetic structure of the Rus’ population during the early stages of the state formation. Our findings reveal a genetic continuum predominantly shaped by two key genetic groups: a broad Slavic-related continuity of different genetic subclusters of Rus’ occupying the enormous European Plain area, and a Fenno-Ugrian (Uralic)-related component in the Northern Rus’ region. Importantly, both groups have a shared genetic substrate inherited from preceding ancient Baltic region populations. To scale Scandinavian ("Viking") heritage, we traced minor Scandinavian genetic lineages that did not make up the dominating genetic stratum of the early Rus’ state. Our study presents the first comprehensive genomic image of the medieval Rus’, highlighting the intricate cultural and genetic interactions between Slavic, Fenno-Ugrian, and other groups that formed the first Rus’ state affecting Europe’s history.

Andreeva et al., Genetic history of Rus’, bioRxiv, Posted December 30, 2025, doi: https://doi.org/10.64898/2025.12.30.695215

I don't yet have access to the dataset from this preprint, so I grabbed a bunch of already published samples from Medieval and Viking Age Russia and put them into the Vahaduo G25 North Europe PCA. Clearly, many of the Viking Age individuals have significant levels of Scandinavian ancestry, because they're shifted west/northwest on the plot relative to most other Eastern Europeans. And, obviously, this contradicts the Andreeva et al. claim that there's only minor Scandinavian admixture in the Rus population. The datasheet that I used for the PCA can be downloaded here.

See also...

They came, they saw, and they mixed

Saturday, January 17, 2026

New Iron Age samples from southeastern Poland

A new dataset has appeared online from a yet to be published paper titled Cosmopolitanism in the depths of Barbaricum evidenced by archaeogenomic data from the Late Iron Age Goth community of the Masłomęcz group [Update: the paper is now available at this Link].

Most of these Gothic samples are clearly of Scandinavian origin, and very similar to present-day Swedes. Overall, however, they create a somewhat heterogeneous cluster that also overlaps with present-day Poles thanks to the presence of a few Balto-Slavic-related and possibly Roman-related individuals.

The Principal Component Analysis (PCA) plots below were produced with the excellent Vahaduo G25 Global Views tool using the data here.

Their Y-haplogroups more or less reflect the PCA results:

PL046 R-YP6228
PL048 I-PH833
PL049 I-A11537
PL052 R-Y48961
PL059 I-PH833
PL062 I-S15301
PL065 I-Y294193
PL066 R-FGC2555
PL067 R-S7759
PL070 I-CTS10028
PL071 I-BY316
PL076 I-S9318
PL082 I-Z2041
PL085 J-Z38241
PL086 I-FT29339
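
For a quick tally, the calls above can be grouped by major haplogroup. This is only a minimal Python sketch: the sample IDs and calls are copied from the list, while grouping by the letter before the hyphen is a simplification chosen here for illustration, not something taken from the paper.

```python
from collections import Counter

# Sample IDs and Y-haplogroup calls copied from the list above.
Y_CALLS = """\
PL046 R-YP6228
PL048 I-PH833
PL049 I-A11537
PL052 R-Y48961
PL059 I-PH833
PL062 I-S15301
PL065 I-Y294193
PL066 R-FGC2555
PL067 R-S7759
PL070 I-CTS10028
PL071 I-BY316
PL076 I-S9318
PL082 I-Z2041
PL085 J-Z38241
PL086 I-FT29339
"""


def tally_major_haplogroups(text):
    """Count samples per major haplogroup (the part before the hyphen)."""
    counts = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue
        _sample_id, call = line.split()
        counts[call.split("-", 1)[0]] += 1
    return counts


counts = tally_major_haplogroups(Y_CALLS)
total = sum(counts.values())
for haplogroup, n in counts.most_common():
    print(f"{haplogroup}: {n} of {total}")
```

On this list, ten of the fifteen males fall under I, four under R and one under J.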

Saturday, September 6, 2025

Early Slavs from Tribal Period Poland

A paper dealing with the origin of Slavic speakers, titled Ancient DNA connects large-scale migration with the spread of Slavs, was just published at Nature by Gretzinger et al. (see here).

The dataset from the paper includes eight fascinating ancient samples from Gródek upon the Bug River in Southeastern Poland. These individuals are dated to the so-called Tribal Period (8th–9th centuries), and, as far as I know, they represent the earliest Slavic speakers in the ancient DNA record.

The really interesting thing about these early Slavs is that they already show some Germanic and other Western European-related ancestries.

In the Principal Component Analysis (PCA) plots below, three of them cluster near present-day Ukrainians, while the rest are shifted towards present-day Northwestern, Western and Southern Europeans. The plots were produced with the excellent Vahaduo G25 Global Views tool using the data here.

These results aren't exactly shocking, because the people who preceded the early Slavs in the Gródek region were Scandinavian-like and associated with the Wielbark archeological culture. In other words, they were probably Goths who also had significant contacts with the Roman Empire.

However, it's not a given that the ancestors of the Tribal Period Slavs mixed with local Goths. It's also possible that they brought the western admixture, or at least some of it, from the Slavic homeland, wherever that may have been.

That's because the early Slavs who migrated deep into what is now Russia also showed Western European-related admixture. This is what Gretzinger et al. say on page 74 of their supplementary info (emphasis is mine):

The only deviation from this pattern is observed for ancient samples from the Russian Volga-Oka region, where we measure higher genetic affinity between present-day Southern/Western Europeans and the SP population compared to the pre-SP population (Fig. S17). This agrees with the pattern observed in PCA and ADMIXTURE that, in contrast to the Northwestern Balkan, Eastern Germany, and Poland-Northwestern Ukraine, the arrival of Slavic-associated culture in Northwestern Russia was associated with a shift in PCA space to the West, a decrease of BAL [Baltic] ancestry, and the introduction of Western European ancestries such as CNE [Continental North European] and CWE [Continental Western European].

Thus, it's highly plausible that the Tribal Period Slavs from Gródek were very similar, perhaps even practically identical, to the proto-Slavs who lived in the original Slavic homeland. Hopefully we won't have to wait too long to discover whether that's true or not. More Migration period and Slavic period samples from the border regions of Belarus, Poland and Ukraine are needed to sort that out.

On the other hand, most of the post-1000 CE individuals from Gródek are shifted closer to present-day Balts. This is probably due to admixture from nearby Baltic-speaking populations. At the time, Baltic speakers still occupied much of northern and eastern Poland.

I'm still going through the Gretzinger et al. paper and I'll probably have a lot more to say about it in the near future.

However, unfortunately, I've already spotted a silly mistake in the supplementary info that will probably have some very annoying consequences for us on this blog. On page 109 the authors make the false claim that South Asian ancestry is present in a wide range of ancient Eastern European and Central Asian populations from the Bronze Age to the Scythian period.

Furthermore, Sycthian groups from Ukraine show varying fractions of South Asian ancestry (between 5% and 12%), a component present in many ancient individuals from Moldova (e.g. Moldova_IA, Moldova_LBA and Moldova_MBA), Ukraine (Ukraine_Alexandria_MBA and Ukraine_BA_Catacomb.SG), Western Russia (e.g. Russia_EarlySarmatian.SG, Russia_MLBA_Potapovka, or Russia_MLBA_Sintashta) and the Caucasus (Russia_Caucasus_LBA_Dolmen and Russia_North_Caucasus_MBA) but (nearly) absent in the SP genomes from Central and East-Central Europe (<5%) (Fig. S42b).

All ancient and present-day South Asian populations carry what is commonly known as Ancestral South Indian (ASI) ancestry, while all of the above mentioned ancient groups lack it. Ergo, it's impossible for these ancients to have actual South Asian ancestry.

What happened is that Gretzinger et al. created a genetic component in ADMIXTURE based on present-day South Asians. However, South Asians today have very complex ancestry from several different sources, including early pastoralists from the North-Pontic steppe in Eastern Europe and early farmers from Central Asia and what is now Iran. As a result, the groups that share significant amounts of alleles with South Asians via these sources also show so-called South Asian ancestry in the Gretzinger et al. analysis.
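
The effect is easy to reproduce with a toy example. The sketch below is not how ADMIXTURE works internally (that's a likelihood model fitted across all samples at once). It's just a two-component least-squares fit on made-up allele frequencies. All of the names (ASI, STEPPE, IRAN_FARMER, EUROPEAN) and all of the numbers are hypothetical. The point is only that a target with zero ASI by construction still gets a non-zero weight on a "South Asian" component, because that component itself contains steppe ancestry.

```python
def mix(parts):
    """Weighted average of allele-frequency vectors: parts = [(weight, freqs), ...]."""
    n_snps = len(parts[0][1])
    return [sum(w * freqs[i] for w, freqs in parts) for i in range(n_snps)]


def component_weight(target, comp_a, comp_b):
    """Least-squares weight on comp_a for target ~ w*comp_a + (1 - w)*comp_b.

    The weight is clipped to the range [0, 1].
    """
    direction = [a - b for a, b in zip(comp_a, comp_b)]
    residual = [t - b for t, b in zip(target, comp_b)]
    w = sum(d * r for d, r in zip(direction, residual)) / sum(d * d for d in direction)
    return min(1.0, max(0.0, w))


# Hypothetical allele frequencies at four SNPs (illustration only).
ASI = [0.9, 0.1, 0.5, 0.5]
STEPPE = [0.1, 0.9, 0.5, 0.2]
IRAN_FARMER = [0.1, 0.1, 0.9, 0.5]
EUROPEAN = [0.1, 0.1, 0.5, 0.8]

# The "South Asian" component is defined on a mixed present-day population.
south_asian_component = mix([(0.5, ASI), (0.3, STEPPE), (0.2, IRAN_FARMER)])

# An ancient target with steppe ancestry but, by construction, no ASI at all.
ancient_target = mix([(0.3, STEPPE), (0.7, EUROPEAN)])

w_south_asian = component_weight(ancient_target, south_asian_component, EUROPEAN)
print(f"Apparent 'South Asian' weight: {w_south_asian:.2f}")
```

The weight comes out well above zero even though no ASI went into the target, which is the same kind of artifact described above.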

Unless this problem is corrected we're likely to see some nutjobs online using this paper to claim all sorts of nonsense about the origins of ancient Eastern Europeans and Central Asians, especially the Sintashta people and Scythians.

See also...

High-resolution stuff

Leo Speidel & Pontus Skoglund

Tuesday, August 12, 2025

Tripolje, Dereivka, and the Sredni Stog phenomenon (guest post)

This is another guest post by an anonymous contributor. Again, I don't necessarily agree with the author, but is he wrong? Feel free to let me know in the comments below.

The definition of the Sredni Stog phenomenon (SSP) varies, and the term is often loosely applied to pretty much any individual in the Dnieper and Don regions between the ‘Neolithic’ and Yamnaya periods.

In order to elucidate the SSP, some brief remarks on the preceding Mariupol phenomenon are warranted. Understanding the Mariupol horizon is fairly straightforward – its development was catalysed by an intrusion of groups from somewhere west of the Dnieper, ~5500 BC. The ‘proto-Mariupol’ group were genomic and economic ‘hunter-gatherers’, lacking any discernible EEF admixture, and with Y-hg I-L702 as a uniparental ‘trace-dye’. The Mariupol phenomenon predominantly impacted the lower reaches of the Dnieper and Azov steppe (Lower Dnieper-Azov group, ‘LDA’), but extended toward the Don, Volga and even the Kuban steppe in an attenuated form. The elevated levels of ‘Ukr_N’ in Golubaya Krinitsa and the Y-hg I2a-L702 individual at Berezhnovka attest to this movement. The Mariupol phenomenon is associated with the development of formal cemeteries, linking it with Late Neolithic mixed farmer/HG groups in the northeast Balkans. Individuals were buried in a ‘supine straight-legged’ inhumation, with grave goods such as boar-tusk pendants for select males and adorning shell beads for females. This might signal the emergence of gender-differentiation in burials and the rise of local leaders or ‘chiefs’.

Data sets treat the ‘Neolithic’ in Ukraine as a monolithic phenomenon; however, it is important to note that Dereivka stands apart – it is 200 km north of other ‘Ukraine_N’ sites such as Volnienski, Igren and Vovnigi, beyond the ‘Dnipro bend’. Moreover, male individuals from Dereivka are overwhelmingly assigned to Y-hg R1b-V88 and the burial pose (N-S) at Dereivka deviates from the more common E-W orientation seen elsewhere. Quite a few of the published ‘Sredni Stog’ individuals are from Dereivka, and often earlier than 4500 BC, and N. Kotova assigns them to the Dnieper-Donets culture. Moreover, the recently published middle Don individuals, such as those from Golubaya Krinitsa and Vasilyevsky Cordon -17, are also not Sredni Stog, but can be thought of as ‘partially Mariupolised hunter-gatherers’. In another example, the (undated) ‘Sredni Stog’ individual I27930 from Igren was assigned to Y-hg Q and he can be modelled as a 2-way mix of EHG & WHG. This individual is actually from the Mesolithic.

So what occurred during the Sredni Stog period? In contrast to the Mariupol phase, the population dynamics associated with SSP are complex. At least three external flows can be highlighted: (i) the advance of Tripolje communities from the Carpathians to the Dnieper, (ii) the arrival of South-Caucasian/CHG agro-pastoralists in the north Caucasus, and (iii) the arrival of ‘central Asian’ populations in the Volga-Caspian region (represented by the ‘TTK individual’); this is in addition to intra-steppe shifts and flows. Notwithstanding, the ‘ideological background’ of SSP is rooted in the Mariupol horizon. The stereotypical SSP burial features individuals buried on their back, but increasingly with legs up-flexed. And we see the beginnings of kurgan constructions, which vary from stone cairns to soil-thrown barrows. Most are buried in simple pits; however, some have more complex ‘catacomb’ pits.

What happened in the Dniester-Dnieper-Don region during the SSP? We can begin by orientating ourselves with a PCA to observe two main clines developing. One cline develops between ‘Ukr_N’ and EEF and a second cline pulls toward Lower Volga-Caucasus groups. The first cline mostly comprises ‘Farmers’ from Tripolje and ‘hunter-gatherers’ from Dereivka. The second cline consists of individuals from Dereivka and the lower Dnieper-Azov group pulling toward Lower Volga-Caucasus groups.

Admixture analysis with qpAdm reveals 3 groups within the 2 broad clines. The first group can be thought of as ‘core Sredni Stog’. These individuals are 2-way mixes of ‘Ukr_N’ and ‘Steppe Eneolithic’ (sometimes Progress works, sometimes Remontnoye or Berezhnovka). They are both males and females. In our examples, the females are from Kopachiv Yar (4000 BC) and Dereivka (3500 BC). The males come from Dereivka (4300 BC), Molyukhiv Bugor (4000 BC), Vynohrado (4000 BC); they are all derived for Y-hg I2a-L703+. These results represent a blending of social networks between the LDA and various lower Volga-Caucasus groups, and the subsequent expansion by LDA further west. The terminus ante quem of 4300 BC matches the corrected dating of the Kuban steppe sites such as Progress & Vonyuchka.

Another subset comprises individuals from Dereivka and Verteba cave who are situated on a ‘Ukr_N’ <-> EEF cline. Many of the earlier Dereivka individuals are almost 100% Ukr_N. Verteba cave Tripolje can be modelled as 80% EEF + 20% Ukr_N. One individual from Dereivka (I3719) falls outside the Dereivka <-> Tripolje cline, as he plots further ‘south’ with Balkan-LBK farmers. Consistently, he comes out as ~100% EEF with qpAdm. Dating to ~4700 BC, he precedes the arrival of Tripolje groups to the region by hundreds of years. FTDNA have assigned him to I2-Z161-FTH81, which is distinct from the LDA haplotypes and is phylogenetically linked to a Czech LBK individual.

A third group consists of individuals with more complex 3-way ancestries, consisting of EEF, Ukr_N and Steppe_En and/or Maikop. These come from late Dereivka and late Tripolje groups, in archaeological literature often termed ‘Soldanesti’, ‘Zhivotilovka-Volchansk’ and ‘Cernavoda (Kartal)’. Once again, the males from Soldanesti and Cernavoda derive from LDA-related Y-hg I2a-L703 in some shape or form.

Conclusions:

1) Firstly, we note that the Dereivka group was subject to early EEF influence, as soon as eastern LBK groups reached Ukraine after 5000 BC. However, their main interaction occurred with the younger, Tripolje group, which expanded toward the Dnieper after ~4300 BC.

2) In the LDA group we observe patrilineal continuity. These clans created expansive social-networks. They initially mixed with groups from the lower Volga-Caucasus area. Some then moved west, and ‘took over’ the Tripolje region and acquired high levels of EEF.

3) As a third conclusion, we can reject the commonly held notion that Tripolje was ‘conquered by Yamnaya pastoralists’. Our analysis instead highlights that their core structure fragmented as they became intertwined with powerful networks to their west (Tripolje) and east (Sredni Stog). The ‘take-over’ was due to the expansion of LDA/SS groups. Mixed groups emerged such as Cernavoda and Soldanesti, which retained Tripolje ancestry and some cultural traditions. By the time Yamnaya groups reached the Dniester forest steppe, Tripolje had been long gone.

Sunday, June 22, 2025

‘Proto-Yamnaya’ Eneolithic individuals from Kuban steppe c. 3700 BC ? (guest post)

This is a guest post by an anonymous contributor. I don't necessarily agree with its findings, but I think it's a good way to get the ball rolling here again. Feel free to let me know what you think. Please note, however, that any comments that show mental instability will be blocked. No more crazy talk on this blog.

In order to understand who Yamnaya people were, one must first define ‘Yamnaya’. We will adopt a stricto sensu view (e.g. Anthony, Heyd) encompassing burials dating 3200-2600 BC, with a characteristic body position, mound construction, and copper artefacts. These complexes can be linked to a core group of people whose autosomal make-up is quite homogeneous throughout their wide geographic range. Moreover, almost all males belong to Y-haplogroup R1b-M269-Z2103. In this light, ‘core Yamnaya’ does not represent a ‘proto-Indo-European’ population, as commonly proclaimed, but a group which contributed to several post-PIE population-language complexes, such as Tocharian, Armenian and some Paleo-Balkan languages. However, historical linguistics is not the focus of this post.

Archeologists have linked Yamnaya to earlier complexes such as Khvalynsk, Repin and/or Mikhalivka. Given that cultural markers such as pottery and burial customs can be borrowed and copied, ancient DNA can offer a more objective assessment of population origins. However, the cacophony of clusters, clines and other statistical constructs in publications can be confusing. A more rationalized approach is required, and one way is to co-analyse phylogenetically linked individuals across space and time. Apart from a lower-quality individual from Smyadovo (Bulgaria c. 4300 BC), the earliest attestation of R1b-M269 is in two individuals from the Kuban steppe (Stavropol region) c. 3700 BC: NV3003 and KST001 (Ghalichi et al. 2024). However, Y-hg R1b-M269 is missing in currently sampled Kuban steppe and north Caucasian males from the preceding period (5000-4000 BC). Males of the ‘Kuban steppe 4500 BC’ group (Progress, Vonyuchka, Sengeleevskiy, etc.) are instead derived for the phylogenetically divergent Y-hg R1b-V1636. Males from the Nalchik cemetery are also derived for Y-hg R1b-V1636, or related haplotypes, although they were buried in a ‘Caucasian Farmer’ pose and heavily infused with such ancestry, but probably also had a burial mound thrown above. We do not know when the R1b-V1636 clans entered the northern Caucasus region, or from where, but they appear to have been attracted by trade with North Caucasian Farmer (~Eneolithic) groups, termed ‘Meshoko-Zamok’, ‘Chokh’, etc. in the literature. Curiously, the Nalchik group has minimal Central Asian (‘TTK-related’) ancestry, whilst the Kuban steppe group has high levels. This suggests that TTK-related ancestry arrived after R1b-V1636 dominant EHG clans entered the North Caucasus region, but other scenarios are possible. Lastly, two ‘Meshoko culture’ males from Unakozovskaya have been assigned to Y-hg J2a-L26.

A shake-up occurred in the north Caucasus after 4000 BC. As we know, this corresponds to the emergence of the Majkop phenomenon, catalysed by renewed migrations from the south. These were not ‘Uruk migrants’ as sometimes proposed: the Uruk phenomenon occurred several hundred years later and was a south Mesopotamian phenomenon. Instead, these newcomers emerged from southern Caucasus-north Mesopotamian ‘Late Chalcolithic’ groups. They brought with them multiple West Asian lineages, such as Y-hg T, L2, J2a, J2b, G2. Over time they mixed with preceding north Caucasian Eneolithic groups, culminating in the Novosvobodnaja phenomenon.

The emergence of Majkop as a new socio-cultural complex broke down the previous system dominated by Y-hg R1b-V1636 clans. The Majkop sphere consisted of a ‘core’ of heterarchical chiefs buried in elaborate kurgans near the mountains, and a dynamic northern ‘frontier’ in the steppe lands (as far as the lower Don) between 4000 and 3000 BC. At least 3 ‘Majkop periphery’ genetic groups can be defined; in fact all these groups can be termed ‘steppe Majkop’:

1- Group with western Siberian/ north central Asian ancestry (the ‘genetic steppe Majkop’ as defined in Wang et al, 2023)
2- The South Caucasian/north Iranian ‘Zolotarevka’ group
3- The R1b-M269 duo.

Regardless of their lineages and genomic affinities, these individuals were often buried in kurgans which over time formed groups. These were not continuations of pre-4000 BC kurgans, but the communities instead made a conscious choice to build new kurgans after 4000 BC, adding to the idea of discontinuity. But once built, these kurgan clusters continued to be developed for hundreds of years, into the Yamnaya period. This does not imply ethnic homogeneity or continuity, just a ‘continuity of place’. Without a direct attestation of a phylogenetic ancestor, and guesstimating from their (non-identical) genomic profile, we are left to speculate that Y-hg R1b-M269 individuals moved down from somewhere in the Volga-Don interfluve. Perhaps they moved amongst groups utilizing Repin pottery, but if so, they did not continue its use in their new contexts. By 3000 BC, the Majkop system collapsed. Yamnaya groups and their ‘Catacomb’ descendants took control of the north Caucasus region, having benefitted from years of trade/exchange and knowledge gathering. Whether Yamnaya actually descend from individuals like KST001 or NV3003 remains to be seen; however, these are the closest leads we have. Certainly, we can model Yamnaya as deriving from KST001 (88%) + Dnieper_N (12%), but we should be cautious when using singular individuals as ‘sources’.

Monday, January 6, 2025

Hungarian origins

This quote, from a new paper at Nature, High-resolution genomic history of early medieval Europe by Speidel et al., is the most idiotic take on the ancestry of present-day Hungarians that I've ever read.

Present-day populations of Hungary do not appear to derive detectable ancestry from early medieval individuals from Longobard contexts, and are instead more similar to Scythian-related ancestry sources (Extended Data Fig. 6), consistent with the later impact of Avars, Magyars and other eastern groups.

In fact, present-day Hungarians are overwhelmingly derived from West Slavic and German peasants, showing only minor ancestry from early Magyars (or rather Hungarian Conquerors). So in terms of genetic ancestry they're basically typical East Central Europeans.

Scythians and Avars don't even deserve a mention in this context.

The reason that Speidel et al. found present-day Hungarians to be broadly similar to Scythians is that they used so-called Hungarian Scythians in their analysis.

It's important to understand that these Hungarian Scythians are genetically fairly typical Central Europeans for their time, and, by and large, don't show any significant genetic relationship to Asian Scythians, Avars or early Magyars. So they're mostly either just acculturated Scythians or wrongly classified as Scythians by archeologists.

That is, the broad similarity that Speidel et al. found between present-day Hungarians and Hungarian Scythians derives from the fact that both of these populations are genetically Central Europeans, rather than the ridiculously false idea that they show strong genetic links to Avars, Hungarian Conquerors and other eastern groups.

Here's a Principal Component Analysis (PCA) of West Eurasian genetic variation, courtesy of the excellent Vahaduo:Global25 Views, that perfectly illustrates my point.

If Speidel et al. were correct about the genetic origin of present-day Hungarians, then the Hungarian_Modern and Hungary_Scythian samples would be shifted away from other Europeans, much like many of the Hungary_Avar and Hungary_Conqueror individuals. But that's obviously not the case, and instead they cluster strongly with, say, present-day Germans from Hamburg.
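
This sort of check can also be done numerically by comparing group averages. Below is a minimal Python sketch, assuming each sample is a row of PCA coordinates. The three-column rows and the labels target, ref_A and ref_B are made up for illustration. They are not real G25 values (real rows have 25 columns), so the output says nothing about any actual population. With real data, the target might be Hungarian_Modern and the references might be groups like Hungary_Conqueror or the Hamburg Germans.

```python
import math


def centroid(rows):
    """Average each coordinate across the samples in a group."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def rank_references(target_rows, reference_groups):
    """Return (distance, name) pairs sorted from nearest to farthest centroid."""
    target = centroid(target_rows)
    return sorted(
        (math.dist(target, centroid(rows)), name)
        for name, rows in reference_groups.items()
    )


# Made-up 3-D coordinates, NOT real G25 values.
target = [[0.125, 0.130, 0.050], [0.120, 0.128, 0.052]]
references = {
    "ref_A": [[0.130, 0.135, 0.055], [0.128, 0.131, 0.049]],
    "ref_B": [[0.090, 0.025, 0.015], [0.080, 0.010, 0.020]],
}

ranking = rank_references(target, references)
for dist, name in ranking:
    print(f"{name}: {dist:.4f}")
```

Plain Euclidean distance between centroids is a crude summary of what a PCA plot shows, but it's enough to see which reference group a set of samples sits closest to.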

I emailed two of the authors of this paper, Leo Speidel and Pontus Skoglund, when they posted the preprint of the paper at bioRxiv to cordially discuss this issue (see here). But they totally ignored me.

Citation...

Speidel et al., High-resolution genomic history of early medieval Europe, Published online: 1 January 2025, https://doi.org/10.1038/s41586-024-08275-2

Wednesday, December 4, 2024

The PIE homeland controversy: December 2024 open thread

It seems like we're getting close to the moment when Iosif Lazaridis has to finally admit that the Proto-Indo-European (PIE) homeland was located in Eastern Europe, and also that the ancestors of the Hittites and other Anatolian speakers entered Anatolia via the Balkans.

Let's discuss.

However, please note that comments from total morons, trolls and/or mentally unstable people will not be approved.

See also...

Indo-European crackpottery

Monday, March 25, 2024

High-resolution stuff

I just emailed this to the authors of High-resolution genomic ancestry reveals mobility in early medieval Europe, a new preprint at bioRxiv [LINK].

I appreciate that Polish population history is not the main focus of your preprint, and also that you're constrained by the lack of relevant and suitably high quality ancient genomes from East-Central and Eastern Europe. However, I must say that your analysis of the Medieval Polish population and resulting conclusions about Polish population history don't reflect reality.

Your Poland_Middle_Ages genomic cluster is made up of just six samples that don't fully represent the genetic complexity of the core population of Medieval Poland.

As a result, you classified PCA0148 as one of the Poland_Middle_Ages outliers, even though this sample isn't an outlier when analyzed within the context of the full set of published Polish Medieval genomes.

Moreover, PCA0148 is very similar to several Polish Viking Age samples that show Scandinavian-specific genome-wide and Y-chromosome haplotypes, and probably likewise shows some Scandinavian-related ancestry.

This is important to note when attempting to recapitulate Polish population history, because it suggests that Scandinavian-related ancestry played a formative role in the shaping of the core Polish Medieval genetic cluster.

Thus, you might be correct when you claim that the six samples in your Poland_Middle_Ages cluster don't show any "detectable" Scandinavian-related ancestry, but this doesn't necessarily mean that this type of ancestry isn't a key part of the post-Iron Age Polish population history.

Below is a self-explanatory Principal Component Analysis (PCA) plot that illustrates my points. Interestingly, Figure 3c in your preprint shows very similar outcomes in regards to the post-Iron Age Polish population history. But the style and scale of your figure makes it difficult to spot the subtle but likely genuine Northwest European-related genetic shifts shown by PCA0148, the Viking context samples and present-day Poles relative to the Poland_Middle_Ages cluster.

However, I'm also skeptical that your Poland_Middle_Ages cluster doesn't carry any detectable or even significant Scandinavian-related ancestry. That's because I suspect that there might be some technical issues with your analysis that are masking this type of ancestry in the Polish samples.

Your top mixture model for the Poland_Middle_Ages cluster is, in all likelihood, an extreme statistical abstraction of reality, rather than a close reflection of it. That's because, due to a combination of historical, geographical and genetic factors, neither Italy.Imperial(I).SG nor Lithuania.IronRoman.SG are realistic formative source populations for the Medieval Polish gene pool.

One of the reasons why you ended up with such a surprising result is probably the lack of suitable samples from East-Central and Eastern Europe, especially those associated with plausibly the earliest Slavic-speaking populations.

It's also possible that basing your mixture model on formal statistics played a key part.

Formal statistics-based mixture models are known to be biased towards outcomes involving mixture sources from the extremes of mixture clines. If your analysis is affected by this problem, then this would help to explain why you characterized the Poland_Middle_Ages cluster as simply a two-way mixture between a Middle Eastern-related group from Imperial Rome and a Baltic population with a very high cut of European hunter-gatherer ancestry.

I do note that on page 6 of your manuscript you consider the possibility that the Southern European-related signal in the Poland_Middle_Ages cluster might only be very distantly related to Italy.Imperial(I).SG, and that it may even have spread across Poland with early Slavic speakers. This is a great point, and I think it should be emphasized and expanded upon, because I suspect that the problem runs deeper than this.

For instance, if the early Slavic ancestors of Poles carried substantially more Southern European-related ancestry than Lithuania.IronRoman.SG, and this ancestry was, say, more Balkan-related than Italian-related, then this might radically change your modeling of the Poland_Middle_Ages cluster. That's because these early Slavs would be positioned in a very different genetic space than Lithuania.IronRoman.SG, which could potentially require a significant signal of Scandinavian-related ancestry to get a robust mixture model.

Finally, it might be useful to consider Isolation-by-Distance as a partial vector for the Italy.Imperial(I).SG-related signal in Medieval Poland.

The full set of published Polish Medieval genomes includes a number of outliers with obvious ancestry from Western Europe and the Balkans. These people probably don't represent any large-scale migrations into Poland, but rather the movements of individuals and small groups. Over time, such small-scale mobility may have had a fairly significant impact on the genetic character of the Polish population.
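
The point in the email about mixture sources from the extremes of a cline can be illustrated with some simple arithmetic. In the sketch below, populations are reduced to a single made-up position along a cline (0.0 to 1.0). The positions and labels are hypothetical, and this is not qpAdm or any other formal statistics method. It only shows that a target sitting on a cline can be reproduced exactly by very different pairs of sources, so a good fit with extreme sources doesn't by itself show that those sources are the real ones.

```python
def mixture_weight(target, source_a, source_b):
    """Weight w on source_a such that w*source_a + (1 - w)*source_b == target."""
    return (target - source_b) / (source_a - source_b)


# Hypothetical positions along a one-dimensional cline (illustration only).
TARGET = 0.5
SOURCE_PAIRS = {
    "extremes of the cline": (0.0, 1.0),
    "nearby sources": (0.4, 0.7),
}

fits = {}
for label, (a, b) in SOURCE_PAIRS.items():
    w = mixture_weight(TARGET, a, b)
    fitted = w * a + (1 - w) * b
    fits[label] = (w, fitted)
    print(f"{label}: {w:.2f} x {a} + {1 - w:.2f} x {b} = {fitted:.2f}")
```

Both pairs land exactly on the target, one as a 50/50 mix of the extremes and the other as roughly a 67/33 mix of two much closer sources.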

Update 26/03/2024: I sent another email to Speidel et al., this time in regards to their analysis of present-day Hungarians.

Your preprint also claims that present-day Hungarians are genetically similar to Scythians, and that this is consistent with the arrival of Magyars, Avars and other eastern groups in this part of Europe.

However, present-day Hungarians are overwhelmingly derived from Slavic and German peasants from near Hungary. This is not a controversial claim on my part; it's backed up by historical sources and a wide range of genetic analyses.

Hungarians still show some minor ancestry from Hungarian Conquerors (early Magyars), but this signal only reliably shows up in large surveys of Y-chromosome samples.

The Scythians that you used to model the ancestry of present-day Hungarians are of local, Pannonian origin, and they don't show any eastern nomad ancestry. So they're either acculturated Scythians, or, more likely, wrongly classified as Scythians by archeologists.

And since these so-called Scythians lack eastern nomad ancestry, the similarity between them and present-day Hungarians is not a sign of the impact of Avars, Hungarian Conquerors and the like, but rather a sign of the lack of significant input from such groups in present-day Hungarians.

Citation...

Speidel et al., High-resolution genomic ancestry reveals mobility in early medieval Europe, bioRxiv, Posted March 19, 2024, doi: https://doi.org/10.1101/2024.03.15.585102

See also...

Wielbark Goths were overwhelmingly of Scandinavian origin

Thursday, February 22, 2024

Berkeley, we have a problem

A new preprint at bioRxiv by Kerdoncuff et al. makes the following, somewhat surprising, claim:

One of the individuals, referred to Sarazm_EN_1 (I4290) described above that was discovered with shell bangles showing affiliation with South Asia, has significant amount AHG-related ancestry, while a model without AHG-related ancestry provides the best fit for Sarazm_EN_2 (I4210) (Table S4.5).

First of all, the authors are actually referring to sample ID I4910, not I4210.

The aforementioned table, based on qpAdm output, shows that I4290 has 15.9% AHG-related ancestry and basically no Anatolian farmer-related ancestry. It also shows that I4910 has no AHG-related ancestry but 17.9% Anatolian farmer-related ancestry.

AHG stands for Andaman hunter-gatherer. The authors are using it as a proxy for South Asian hunter-gatherer ancestry.

However, I've looked at I4290 and I4910 in great detail over the years using ADMIXTURE, Principal Component Analysis (PCA), and qpAdm. And I'm quite certain that they do not show any obvious South Asian ancestry above noise level. Indeed, I'd say that if they do have some minor South Asian ancestry, then I4910 probably has more of it than I4290.

Kerdoncuff et al. used the following "right pops" or outgroups: Ethiopia_4500BP.SG, WEHG, EEHG, ESHG, Dai.DG, Russia_Ust_Ishim_HG.DG, Iran_Mesolithic_BeltCave and Israel_Natufian.

This means they mixed data that were generated in very different ways (DG, SG and capture) and included some poor quality samples. For instance, the highest coverage version of Iran_Mesolithic_BeltCave offers just ~50K SNPs.

Mixing different types of data and relying on low-coverage samples, even in part, often has negative consequences when using qpAdm. So I suspect that the above-mentioned mixture results for I4290 are skewed by a poor choice of outgroups.
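To make the data type issue a bit more concrete, here's a rough Python sketch that groups the Kerdoncuff et al. right pops by their label suffix. It assumes the usual AADR naming convention, in which .SG marks shotgun data and .DG marks diploid genotype calls. Labels without a suffix go into their own bucket, because the label alone doesn't say how those data were generated. The function names are my own and purely illustrative.

```python
from collections import defaultdict

# Right pops (outgroups) as listed for the Kerdoncuff et al. qpAdm models.
RIGHT_POPS = [
    "Ethiopia_4500BP.SG",
    "WEHG",
    "EEHG",
    "ESHG",
    "Dai.DG",
    "Russia_Ust_Ishim_HG.DG",
    "Iran_Mesolithic_BeltCave",
    "Israel_Natufian",
]


def data_type(label):
    """Guess the data type from the label suffix (assumed AADR convention)."""
    if label.endswith(".SG"):
        return "SG"
    if label.endswith(".DG"):
        return "DG"
    # No suffix: usually capture data, but the label alone can't confirm it.
    return "no suffix"


def group_by_type(labels):
    """Return a dict mapping each data type to the labels that carry it."""
    groups = defaultdict(list)
    for label in labels:
        groups[data_type(label)].append(label)
    return dict(groups)


groups = group_by_type(RIGHT_POPS)
for kind, labels in groups.items():
    print(f"{kind}: {', '.join(labels)}")
```

If more than one bucket comes back non-empty, as it does here, the outgroup set mixes data types and probably deserves a second look.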

When I run qpAdm I try to stick to one type of data and avoid low quality singletons in the outgroups. This is the best qpAdm model that I can find for Sarazm_EN:

right pops:
Cameroon_SMA
Morocco_Iberomaurusian
Israel_Natufian
Levant_N
Iran_GanjDareh_N
Turkey_N
Russia_Karelia_HG
Russia_WestSiberia_HG
Mongolia_North_N
Brazil_LapaDoSanto_9600BP

Sarazm_EN
Kazakhstan_Botai_Eneolithic 0.113±0.017
Turkmenistan_C_Geoksyur_subset 0.887±0.017
P-value 0.06392

Sarazm_EN_1 (I4290)
Kazakhstan_Botai_Eneolithic 0.129±0.021
Turkmenistan_C_Geoksyur_subset 0.871±0.021
P-value 0.11019

Sarazm_EN_2 (I4910)
Kazakhstan_Botai_Eneolithic 0.104±0.021
Turkmenistan_C_Geoksyur_subset 0.896±0.021
P-value 0.07427
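As a rough check on how similar the two individuals look in these models, the Botai-related estimates for I4290 and I4910 can be compared with a simple z-score. This is only a back-of-the-envelope sketch: it treats the two estimates as independent, which isn't strictly true because both models share the same sources and outgroups, and the function name is my own.

```python
import math


def z_difference(est_a, se_a, est_b, se_b):
    """Z-score for the difference between two estimates.

    Assumes the two estimates are independent (a simplification here).
    """
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Kazakhstan_Botai_Eneolithic coefficients and standard errors from above.
z = z_difference(0.129, 0.021, 0.104, 0.021)  # I4290 vs I4910
print(f"z = {z:.2f}")
```

The result is roughly 0.84, well under the usual cutoff of about 2, so under these assumptions the two individuals can't be told apart in terms of their Botai-related ancestry.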

Also...

Sarazm_EN
Andaman_hunter-gatherer -0.018±0.020
Kazakhstan_Botai_Eneolithic 0.123±0.019
Turkmenistan_C_Geoksyur_subset 0.895±0.020
P-value 0.0298403
(Infeasible model)
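The logic behind the "infeasible" label can be sketched in a few lines of Python. This is not qpAdm itself, just a minimal illustration of the usual sanity checks: the coefficients should sum to about 1, none of them should fall outside the 0 to 1 range, and the P-value should clear some threshold. The 0.05 threshold and the function name are my own assumptions for the sake of the example.

```python
def check_model(weights, p_value, p_threshold=0.05, tol=0.005):
    """Basic sanity checks for one qpAdm-style mixture model.

    weights: dict mapping source population to its mixture coefficient.
    p_threshold and tol are illustrative assumptions, not qpAdm defaults.
    """
    total = sum(weights.values())
    out_of_range = [name for name, w in weights.items() if w < 0 or w > 1]
    return {
        "sums_to_one": abs(total - 1.0) <= tol,
        "out_of_range": out_of_range,
        "feasible": not out_of_range,
        "passes_p": p_value >= p_threshold,
    }


# Two-way and three-way models for Sarazm_EN, copied from the output above.
two_way = check_model(
    {
        "Kazakhstan_Botai_Eneolithic": 0.113,
        "Turkmenistan_C_Geoksyur_subset": 0.887,
    },
    p_value=0.06392,
)
three_way = check_model(
    {
        "Andaman_hunter-gatherer": -0.018,
        "Kazakhstan_Botai_Eneolithic": 0.123,
        "Turkmenistan_C_Geoksyur_subset": 0.895,
    },
    p_value=0.0298403,
)

print("two-way:", two_way)
print("three-way:", three_way)
```

The two-way model passes all of the checks, while the three-way model fails on the negative Andaman_hunter-gatherer coefficient and, with the assumed threshold, on the P-value as well.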

Please note that Turkmenistan_C_Geoksyur_subset is made up of just three relatively high quality individuals: I8504, I12483 and I12487. That's because it's not possible to model the ancestry of Sarazm_EN using the full Geoksyur set, probably due to subtle genetic substructures within the latter.

Below is a PCA plot that, more or less, reflects my qpAdm model. I4290 and I4910 are sitting right next to each other in a cluster of ancient Central and Western Asians, and it's actually I4910 that is shifted slightly towards the South Asian pole of the PCA. Indeed, I can confidently say that there's no way to design a PCA in which I4290 is shifted significantly towards South Asia relative to I4910.

Citation...

Kerdoncuff et al., 50,000 years of Evolutionary History of India: Insights from ∼2,700 Whole Genome Sequences, bioRxiv, posted February 20, 2024, doi: https://doi.org/10.1101/2024.02.15.580575

See also...

The Nalchik surprise

A comedy of errors
