
Showing posts with label Balto-Slavic. Show all posts

Saturday, January 17, 2026

New Iron Age samples from southeastern Poland


A new dataset has appeared online from a yet to be published paper titled Cosmopolitanism in the depths of Barbaricum evidenced by archaeogenomic data from the Late Iron Age Goth community of the Masłomęcz group [Update: the paper is now available at this Link].

Most of these Gothic samples are clearly of Scandinavian origin, and very similar to present-day Swedes. Overall, however, they create a somewhat heterogeneous cluster that also overlaps with present-day Poles thanks to the presence of a few Balto-Slavic-related and possibly Roman-related individuals.

The Principal Component Analysis (PCA) plots below were produced with the excellent Vahaduo G25 Global Views tool using the data here.

Their Y-haplogroups more or less reflect the PCA results:

PL046 R-YP6228
PL048 I-PH833
PL049 I-A11537
PL052 R-Y48961
PL059 I-PH833
PL062 I-S15301
PL065 I-Y294193
PL066 R-FGC2555
PL067 R-S7759
PL070 I-CTS10028
PL071 I-BY316
PL076 I-S9318
PL082 I-Z2041
PL085 J-Z38241
PL086 I-FT29339
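
For a quick tally of the list above by major haplogroup, here is a minimal Python sketch. It only counts the first letter of each call as listed in this post; the variable names are made up for illustration.

```python
from collections import Counter

# Y-haplogroup calls copied from the list above (sample ID, terminal call).
calls = """PL046 R-YP6228
PL048 I-PH833
PL049 I-A11537
PL052 R-Y48961
PL059 I-PH833
PL062 I-S15301
PL065 I-Y294193
PL066 R-FGC2555
PL067 R-S7759
PL070 I-CTS10028
PL071 I-BY316
PL076 I-S9318
PL082 I-Z2041
PL085 J-Z38241
PL086 I-FT29339"""

# Keep only the major haplogroup letter that precedes the hyphen.
major = Counter(line.split()[1].split("-")[0] for line in calls.splitlines())

for haplogroup, count in major.most_common():
    print(haplogroup, count)
```

Two thirds of the males in the list belong to haplogroup I, with R and a single J making up the rest.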

See also...

Early Slavs from Tribal Period Poland

Wielbark Goths were overwhelmingly of Scandinavian origin

High-resolution stuff

Saturday, September 6, 2025

Early Slavs from Tribal Period Poland


A paper dealing with the origin of Slavic speakers, titled Ancient DNA connects large-scale migration with the spread of Slavs, was just published at Nature by Gretzinger et al. (see here).

The dataset from the paper includes eight fascinating ancient samples from Gródek upon the Bug River in southeastern Poland. These individuals are dated to the so-called Tribal Period (8th–9th centuries), and, as far as I know, they represent the earliest Slavic speakers in the ancient DNA record.

The really interesting thing about these early Slavs is that they already show some Germanic and other Western European-related ancestries.

In the Principal Component Analysis (PCA) plots below, three of them cluster near present-day Ukrainians, while the rest are shifted towards present-day Northwestern, Western and Southern Europeans. The plots were produced with the excellent Vahaduo G25 Global Views tool using the data here.


These results aren't exactly shocking, because the people who preceded the early Slavs in the Gródek region were Scandinavian-like and associated with the Wielbark archeological culture. In other words, they were probably Goths who also had significant contacts with the Roman Empire.

However, it's not a given that the ancestors of the Tribal Period Slavs mixed with local Goths. It's also possible that they brought the western admixture, or at least some of it, from the Slavic homeland, wherever that may have been.

That's because the early Slavs who migrated deep into what is now Russia also showed Western European-related admixture. This is what Gretzinger et al. say on page 74 of their supplementary info (emphasis is mine):

The only deviation from this pattern is observed for ancient samples from the Russian Volga-Oka region, where we measure higher genetic affinity between present-day Southern/Western Europeans and the SP population compared to the pre-SP population (Fig. S17). This agrees with the pattern observed in PCA and ADMIXTURE that, in contrast to the Northwestern Balkan, Eastern Germany, and Poland-Northwestern Ukraine, the arrival of Slavic-associated culture in Northwestern Russia was associated with a shift in PCA space to the West, a decrease of BAL [Baltic] ancestry, and the introduction of Western European ancestries such as CNE [Continental North European] and CWE [Continental Western European].

Thus, it's highly plausible that the Tribal Period Slavs from Gródek were very similar, perhaps even practically identical, to the proto-Slavs who lived in the original Slavic homeland. Hopefully we won't have to wait too long to discover whether that's true or not. More Migration period and Slavic period samples from the border regions of Belarus, Poland and Ukraine are needed to sort that out.

On the other hand, most of the post-1000 CE individuals from Gródek are shifted closer to present-day Balts. This is probably due to admixture from nearby Baltic-speaking populations. At the time, Baltic speakers still occupied much of northern and eastern Poland.

I'm still going through the Gretzinger et al. paper and I'll probably have a lot more to say about it in the near future.

However, unfortunately, I've already spotted a silly mistake in the supplementary info that will probably have some very annoying consequences for us on this blog. On page 109 the authors make the false claim that South Asian ancestry is present in a wide range of ancient Eastern European and Central Asian populations from the Bronze Age to the Scythian period.

Furthermore, Sycthian groups from Ukraine show varying fractions of South Asian ancestry (between 5% and 12%), a component present in many ancient individuals from Moldova (e.g. Moldova_IA, Moldova_LBA and Moldova_MBA), Ukraine (Ukraine_Alexandria_MBA and Ukraine_BA_Catacomb.SG), Western Russia (e.g. Russia_EarlySarmatian.SG, Russia_MLBA_Potapovka, or Russia_MLBA_Sintashta) and the Caucasus (Russia_Caucasus_LBA_Dolmen and Russia_North_Caucasus_MBA) but (nearly) absent in the SP genomes from Central and East-Central Europe (<5%) (Fig. S42b).

All ancient and present-day South Asian populations carry what is commonly known as Ancestral South Indian (ASI) ancestry, while all of the above mentioned ancient groups lack it. Ergo, it's impossible for these ancients to have actual South Asian ancestry.

What happened is that Gretzinger et al. created a genetic component in ADMIXTURE based on present-day South Asians. However, South Asians today have very complex ancestry from several different sources, including early pastoralists from the North-Pontic steppe in Eastern Europe and early farmers from Central Asia and what is now Iran. As a result, the groups that share significant amounts of alleles with South Asians via these sources also show so-called South Asian ancestry in the Gretzinger et al. analysis.

Unless this problem is corrected we're likely to see some nutjobs online using this paper to claim all sorts of nonsense about the origins of ancient Eastern Europeans and Central Asians, especially the Sintashta people and Scythians.

See also...

High-resolution stuff

Leo Speidel & Pontus Skoglund

Saturday, January 13, 2024

Romans and Slavs in the Balkans (Olalde et al. 2023)


It's always amusing to see some random Jovan or Dimitar arguing online that Slavic speakers have been in the Balkans since at least the Neolithic.

Obviously, Slavic peoples only turned up in the Balkans during the early Middle Ages. It's just that their linguistic and genetic impact on the region was so profound that it may seem like they've been there forever.

A new paper at Cell by Olalde et al. makes this point well. See here.

That's not to say, however, that it's an ideal effort. The paper's qpAdm mixture models probably could've been more precise and realistic. Genes of the Ancients has a useful discussion on the topic here.

Interestingly, Olalde et al. admit that they can't detect much, if any, admixture from the Italian Peninsula in the Balkans, even in samples dating to the Roman period. And yet, this doesn't stop them from accepting that the Roman Empire had a massive cultural and demographic impact on the Balkans.

I also assume that, by extension, they don't deny that Latin was introduced into the Balkans from the Italian Peninsula.

That is, Latin spread into the Balkans without any noticeable genetic tracer dye, and it eventually gave rise to modern Romanian spoken by millions of people today in the eastern Balkans. This might be a useful data point to keep in mind when discussing the spread of Indo-European languages into Anatolia.

See also...

Dear Iosif, about that ~2%

Saturday, November 4, 2023

Slavs have little, if any, Scytho-Sarmatian ancestry


Here's an abstract of a new study from the David Reich Lab about ancient Slavs, titled "Genetic identification of Slavs in Migration Period Europe using an IBD sharing graph". Emphasis is mine:

Popular methods of genetic analysis relying on allele frequencies such as PCA, ADMIXTURE and qpAdm are not suitable for distinguishing many populations that were important historical actors in the Migration Period Europe. For instance, differentiating Slavic, Germanic, and Celtic people is very difficult relying on these methods, but very helpful for archaeologists given a large proportion of graves with no inventory and frequent adoption of a different culture. To overcome these problems, we applied a method based on autosomal haplotypes. Imputation of missing genotypes and phasing was performed according to a protocol by Rubinacci et al. (2021), and IBD inference was done for ancient Eurasian individuals with data available at >600,000 1240K sites. IBD links for a subset of these individuals were represented as a graph, visualized with a force-directed layout algorithm, and clusters in this graph are inferred with the Leiden algorithm. One of the clusters in the IBD graph emerged that includes nearly all individuals in the dataset annotated archaeologically as “Slavic”. According to PCA a hypothesis for the origin of this population can be proposed: it was formed by admixture of a Baltic-related group with East Germanic people and Sarmatians or Scythians. The individuals belonging to the “Slavic” IBD sharing cluster form a chronological gradient on the PCA plot, with the earliest samples close to the Baltic LBA/EIA group. Later “Slavic” individuals are shifted to the right, closer to Central and Southern Europeans and probably reflecting further admixture of Slavs with local populations during the Migration Period.

Apparently this abstract is causing a bit of confusion online because of the mention of possible Sarmatian or Scythian ancestry in Slavs.

However, it's important to understand that the authors are referring to certain Slavic or even just Slavic-related individuals, usually from culturally heterogeneous frontier settlements deep in what is now Russia.

So yes, it's possible that some of these individuals carry Sarmatian, Scythian or other exotic eastern ancestry. But even if this is true, then obviously we can't extend this inference to all ancient and modern-day Slavs.

Indeed, below is a G25/Vahaduo Principal Component Analysis (PCA) that shows why modern-day Slavic speakers can't be linked genetically to Sarmatians or Scythians. To experience a more detailed version of the PCA paste the data here into the relevant field here.

As you can see, dear reader, most of the Slavs (Belarusians, Poles, Ukrainians and many Russians) cluster with the Irish near the western end of the plot.

Some Russians are shifted significantly east of them along the "Uralic cline" and, as a result, they cluster with various Uralic speakers such as Mordovians. That's because when Slavs migrated deep into what is now northern Russia they mixed with Uralic speakers who were there before them.

Most of the Sarmatians and Scythians form a cluster southeast of the Slavs and Irish because they carry significant levels of East Asian ancestry. This type of eastern ancestry is basically missing in modern-day Slavs (see here).

Several of the Scythians cluster among the Slavs and Irish, but that's because they're genetic outliers, whose existence, if anything, suggests that some Scythians had significant Slavic-related and/or Irish-related ancestry.

Now, even though most of the Slavs do cluster with the Irish in the above PCA plot, I strongly disagree with the authors of the abstract when they claim that "differentiating Slavic, Germanic, and Celtic people is very difficult" with PCA. It's actually pretty damn easy and I've been doing it successfully for many years. For instance, see here.

See also...

Wielbark Goths were overwhelmingly of Scandinavian origin

The Caucasus is a semipermeable barrier to gene flow

Thursday, October 6, 2022

Balto-Slavs and Sarmatians in the Battle of Himera


G25 coordinates for most of the samples from the recent Reitsema et al. paper are available in a text file here. They're also in the G25 datasheets at the usual link here.

A basic distance analysis with the G25 data at Vahaduo shows that the two samples labeled Himera_480BCE_3 are either early Balts or Slavs. I suspect that they're Slavs, because I believe that early Slavs had this type of Baltic-like genetic structure before mixing with their non-Slavic-speaking neighbors. Well, that's my pet theory for now, so take it or leave it.

Distance to: ITA_Sicily_Himera_480BCE_3:I10943
0.03393838 HUN_IA_La_Tene_o:I18226
0.03572886 DEU_MA_Krakauer_Berg:KRA001
0.03618075 RUS_Pskov_VA:VK159
0.03899963 SWE_Gotland_VA:VK463
0.03915018 Baltic_EST_IA:s19_V12_1

Distance to: ITA_Sicily_Himera_480BCE_3:I10949
0.03573636 HUN_IA_La_Tene_o3:I25524
0.03698768 HUN_IA_La_Tene_o:I18226
0.03732752 SWE_Skara_VA:VK397
0.03767022 Baltic_EST_IA:s19_V12_1
0.03772687 DEU_MA_Krakauer_Berg:KRA001

On the other hand, I'm almost certain that the two Himera_480BCE_4 samples are Sarmatians. The good old G25 does it again!

Distance to: ITA_Sicily_Himera_480BCE_4:I10944
0.03100861 KAZ_Segizsay_Sarmatian:SGZ002
0.03548059 MDA_Sarmatian:I11925
0.03619219 RUS_Urals_Sarmatian:MJ56
0.03626538 RUS_Urals_Sarmatian:chy001
0.03904260 RUS_Urals_Sarmatian:MJ41

Distance to: ITA_Sicily_Himera_480BCE_4:I10947
0.02989458 RUS_Urals_Sarmatian:MJ43
0.03052790 RUS_Urals_Sarmatian:chy002
0.03170622 KAZ_Kangju:DA226
0.03288789 TUR_BlackSea_Samsun_Anc_C:I4529
0.03310149 KAZ_Aigyrly_Sarmatian:AIG003
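
For anyone who wants to reproduce this sort of ranking offline, here is a minimal Python sketch. It assumes that the distance in question is the plain Euclidean distance between coordinate rows, and that the input is comma-separated text in the form label,c1,c2,... The function names and the three-dimensional toy coordinates are hypothetical and for illustration only; real G25 rows have 25 dimensions.

```python
import math

def parse_coords(text):
    """Parse 'Label,c1,c2,...' lines into a {label: [floats]} dictionary."""
    samples = {}
    for line in text.strip().splitlines():
        label, *values = line.strip().split(",")
        samples[label] = [float(v) for v in values]
    return samples

def nearest(target, references, top=5):
    """Return the `top` closest references as (distance, label), nearest first."""
    ranked = sorted(
        (math.dist(target, coords), label) for label, coords in references.items()
    )
    return ranked[:top]

# Toy data (hypothetical, not real G25 coordinates).
toy_refs = parse_coords("""
Ref_A,0.03,0.04,0.00
Ref_B,0.10,0.00,0.00
Ref_C,0.00,0.00,0.01
""")
toy_target = [0.0, 0.0, 0.0]

for distance, label in nearest(toy_target, toy_refs):
    print(f"{distance:.8f} {label}")
```

The output has the same shape as the lists above: one line per reference sample, smallest distance first.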

See also...

Slavic-like Medieval Germans

Sunday, January 23, 2022

Para-Turbo-Balto-Slavic?


I'm seeing increasing numbers of Bronze and Iron Age samples from Central Europe and surrounds with this peculiar set of traits:

- shared genetic drift with present-day Balto-Slavic speakers to the exclusion of most other Europeans

- and yet, an unusually low level of Yamnaya-related steppe ancestry

- so much so, in fact, that they're often outside the range of modern European genetic variation.

As far as I can tell, currently the best examples of this unusual population are HUN_Mako_EBA_o:I1502 (Mathieson et al. Nature 2015) and HUN_EIA_Prescythian_Mezocsat_o1:I18241 (Patterson et al. Nature 2021). Both are from the Carpathian Basin in what is now Hungary.

I ran a series of qpAdm mixture models to try and learn more about their origins. The most robust outcomes, out of about 50 different attempts, are these:

right pops:
CMR_Shum_Laka_8000BP
MAR_Taforalt
IRN_Ganj_Dareh_N
Levant_PPNB
TUR_Barcin_N
Iberia_Southeast_Meso
UKR_Meso
England_Meso
RUS_Karelia_HG
RUS_West_Siberia_HG
MNG_North_N
TWN_Hanben
BRA_LapaDoSanto_9600BP

HUN_Mako_EBA_o
Baltic_LTU_Narva 0.149 ±0.028
POL_Globular_Amphora 0.613 ±0.028
Yamnaya_RUS_Samara 0.238 ±0.029
chisq 10.836
tail prob 0.370463
Full output

HUN_EIA_Prescythian_Mezocsat_o1
Baltic_LTU_Narva 0.186 ±0.028
POL_Globular_Amphora 0.592 ±0.027
Yamnaya_RUS_Samara 0.222 ±0.029
chisq 12.492
tail prob 0.253499
Full output

Combining the two genomes produces a very similar result:

HUN_EBA-EIA_o
Baltic_LTU_Narva 0.160 ±0.023
POL_Globular_Amphora 0.612 ±0.023
Yamnaya_RUS_Samara 0.227 ±0.023
chisq 14.653
tail prob 0.14524
Full output

Importantly, when I move RUS_Karelia_HG from the right pops to the left pops, to test whether HUN_EBA-EIA_o really has steppe ancestry, as opposed to closely related hunter-gatherer ancestry, I still get a very similar outcome:

HUN_EBA-EIA_o
Baltic_LTU_Narva 0.158 ±0.027
POL_Globular_Amphora 0.605 ±0.033
RUS_Karelia_HG 0.014 ±0.038
Yamnaya_RUS_Samara 0.223 ±0.053
chisq 10.461
tail prob 0.234171
Full output
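
The "tail prob" values in the outputs above can be sanity-checked from the chisq values with a few lines of Python. This sketch assumes that the tail probability is the upper tail of a chi-square distribution, and that the degrees of freedom equal the number of right pops minus the number of source populations (13 - 3 = 10 for the three-way models, 12 - 4 = 8 once RUS_Karelia_HG moves to the left pops). That degrees-of-freedom rule and the function names are assumptions, not something stated in the outputs. Under these assumptions the numbers agree with the reported tail probabilities to about three decimal places.

```python
import math

def chi2_tail_even_df(chisq, df):
    """Upper-tail chi-square probability via the closed form for even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = chisq / 2.0
    term, total = 1.0, 1.0
    # Sum (chisq/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def assumed_df(n_sources, n_right):
    """Assumed degrees of freedom: right pops minus source populations."""
    return n_right - n_sources

# (label, chisq, number of sources, number of right pops) from the models above.
models = [
    ("HUN_Mako_EBA_o", 10.836, 3, 13),
    ("HUN_EIA_Prescythian_Mezocsat_o1", 12.492, 3, 13),
    ("HUN_EBA-EIA_o", 14.653, 3, 13),
    ("HUN_EBA-EIA_o + RUS_Karelia_HG", 10.461, 4, 12),
]

for label, chisq, n_sources, n_right in models:
    df = assumed_df(n_sources, n_right)
    print(label, df, round(chi2_tail_even_df(chisq, df), 6))
```

All four models stay above the usual 0.05 cutoff, which is why they count as passing.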

So these largely Globular Amphora-related individuals do harbor as much as a quarter of steppe ancestry, which is to be expected considering the massive genetic turn-over that most of Europe experienced just before their time as a result of population expansions from the Pontic-Caspian steppe.

Nevertheless, this is ~20% less steppe ancestry than in the present-day populations of the region, and it clearly shows in any decent Principal Component Analysis (PCA) of West Eurasia. For instance:

At the same time, the relatively close genetic relationship between these ancients and present-day Balto-Slavic speaking populations shows up in fine-scale intra-European PCA.

The origins and implications of this population are still a mystery to me. I don't think it's native to the Carpathian Basin. Indeed, my qpAdm models suggest that it may have moved into this region from somewhere to the northeast, because its ancestry is best modeled with ancient groups from present-day Lithuania, Poland and Russia.

I'm adamant that these people weren't Balto-Slavic speakers, and certainly not proto-Slavs. Rather, I suspect that much like the Welzin warriors of Bronze Age North-Central Europe, they were closely related to a contemporaneous group that eventually gave rise to proto-Slavs. At best, they may have somehow contributed to the ethnogenesis of Balto-Slavs.

By the way, using the Global25 to model their ancestry is highly problematic, because of the strong Balto-Slavic genetic drift that affects some of the dimensions. So be careful when you try it, or better yet, don't try it at all, and stick to formal stats in this particular instance.

See also...

Tollense Valley Bronze Age warriors were very close relatives of modern-day Slavs

Friday, August 27, 2021

R1a vs R1b in third millennium BCE Central Europe (Papac et al. 2021)


R1a-M417 and R1b-L51 are by far the most important Y-chromosome haplogroups in Europe today. More precisely, R1a-M417 dominates in Eastern Europe, while R1b-L51 dominates in Western Europe.

It's been obvious for a while now, at least to me, that both of these Y-haplogroups are closely associated with the men of the Late Neolithic Corded Ware culture (CWC). Indeed, in my mind they're the main genetic signals of its massive expansion, probably from a homeland somewhere north of the Black Sea in what is now Ukraine.

I'm still not exactly sure how the east/west dichotomy between R1a and R1b emerged in Europe, but, thanks to a new paper by Papac et al. at Science Advances, at least now I have a working hypothesis about that. Below is a quote from the said paper, emphasis is mine:

In addition to autosomal genetic changes through time, we observe a sharp reduction in Y-chromosomal diversity going from five different lineages in early CW to a dominant (single) lineage in late CW (Fig. 4A). We used forward simulations to explore the demographic scenarios that could account for the observed reduction in Y-chromosomal diversity. Performing 1 million simulations of a population with a starting frequency of R1a-M417(xZ645) centered around the observed starting frequency in Bohemia_CW_Early (3 of 11, 0.27), we assessed the plausibility of this lineage reaching the observed frequency in Bohemia_CW_Late (10 of 11, 0.91) in the time frame of 500 years under a model of a closed population and random mating (Materials and Methods). We reject the “neutral” hypothesis, i.e., that this change in frequency occurred by chance, given a wide range of plausible population sizes. Instead, our results suggest that R1a-M417(xZ645) was subject to a nonrandom increase in frequency, resulting in these males having 15.79% (4.12 to 44.42%) more surviving offspring per generation relative to males of other Y-haplogroups. We also find that this change in Y chromosome frequency is extreme compared to the changes in allele frequencies at fully covered autosomal 1240k sites within the same males, suggesting a process that disproportionately affected Y-chromosomal compared to autosomal genetic diversity, ruling out a population bottleneck as the likely cause. Our results suggest that the Y-lineage diversity in early CW males was supplanted by a nonrandom process [selection, social structure, or influx of nonlocal R1a-M417(xZ645) lineages] that drove the collapse in Y-chromosomal diversity. A simultaneous decline of Y-chromosomal diversity dating to the Neolithic has been observed across most extant Y-haplogroups (64), possibly due to increased conflict between male-mediated patrilines (65). 
We view that changes in social structure (e.g., an isolated mating network with strictly exclusive social norms) could be an alternative cause but would be difficult to distinguish in the underlying model parameters.
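
The kind of forward simulation described in the quote can be illustrated with a toy Wright-Fisher model. This is only a sketch and not the code used by Papac et al. The population size of 200 males, the 25-year generation time (so 20 generations for 500 years), the 0.5 reproductive advantage and all of the function names are assumptions picked purely for illustration.

```python
import random

def final_frequency(start_freq, n_males, generations, advantage, rng):
    """Haploid Wright-Fisher run; returns the lineage frequency at the end."""
    freq = start_freq
    for _ in range(generations):
        # Carriers leave (1 + advantage) times as many sons on average.
        weighted = freq * (1.0 + advantage)
        p = weighted / (weighted + (1.0 - freq))
        # Each male in the next generation draws his father's lineage at random.
        carriers = sum(rng.random() < p for _ in range(n_males))
        freq = carriers / n_males
        if freq in (0.0, 1.0):
            break  # lineage lost or fixed, nothing more can change
    return freq

def prob_reaching(target, start_freq, n_males, generations, advantage, n_sims, seed=0):
    """Fraction of simulated runs that end at or above the target frequency."""
    rng = random.Random(seed)
    hits = sum(
        final_frequency(start_freq, n_males, generations, advantage, rng) >= target
        for _ in range(n_sims)
    )
    return hits / n_sims

# Start at 3/11 and ask how often 10/11 is reached after 20 generations.
neutral = prob_reaching(10 / 11, 3 / 11, 200, 20, 0.0, 100)
favoured = prob_reaching(10 / 11, 3 / 11, 200, 20, 0.5, 100)
print("neutral:", neutral, "with advantage:", favoured)
```

Setting the advantage to zero gives the "neutral" scenario that the authors reject. Raising it shows how a reproductive edge for one lineage makes the observed jump in frequency far easier to reach.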

Right, so even though the CWC was clearly a community of closely related groups, there must have been some competition between its different clans. And since these clans were highly patriarchal and patrilineal, this competition probably led to different paternal lineages dominating different parts of the CWC horizon, with M417 becoming especially common in the east and L51 in the west.

Of course, the expansions of post-Corded Ware groups, such as the M417-rich Slavs in Eastern Europe and L51-rich Celts in Western Europe, were also instrumental in creating Europe's R1a/R1b dichotomy, but obviously these groups were in large part the heirs of the CWC.

By the way, most of the samples from Papac et al. are already in the Global25 datasheets linked here. Look for the labels listed here. Below is a plot made from the Global25 data courtesy of regular commentator Matt.

Citation: L. Papac, M. Ernée, M. Dobeš, M. Langová, A. B. Rohrlach, F. Aron, G. U. Neumann, M. A. Spyrou, N. Rohland, P. Velemínský, M. Kuna, H. Brzobohatá, B. Culleton, D. Daněček, A. Danielisová, M. Dobisíková, J. Hložek, D. J. Kennett, J. Klementová, M. Kostka, P. Krištuf, M. Kuchařík, J. K. Hlavová, P. Limburský, D. Malyková, L. Mattiello, M. Pecinovská, K. Petriščáková, E. Průchová, P. Stránská, L. Smejtek, J. Špaček, R. Šumberová, O. Švejcar, M. Trefný, M. Vávra, J. Kolář, V. Heyd, J. Krause, R. Pinhasi, D. Reich, S. Schiffels, W. Haak, Dynamic changes in genomic and social structures in third millennium BCE central Europe. Sci. Adv. 7, eabi6941 (2021).

See also...

On the origin of the Corded Ware people

Understanding the Eneolithic steppe

Conan the Barbarian probably belonged to Y-haplogroup R1a

Thursday, June 17, 2021

Balto-Slavic drift


A few years ago I began using the term "Balto-Slavic genetic drift" to describe the fine-scale genetic signal that is shared by the speakers of Baltic and Slavic languages to the exclusion of Europeans without significant Balto-Slavic ancestry.

As a result, nowadays, many people online use the term "Balto-Slavic drift" when referring to this phenomenon.

The easiest way to prove that Balto-Slavic drift exists is to run a fine-scale Principal Component Analysis (PCA) of European genetic variation with a lot of Balto-Slavic samples in the mix. Indeed, my Global25 PCA analysis does a great job of illustrating the impact of Balto-Slavic drift on the population structure of Europe both in PCA plots and mixture models (for instance, see here).

It's also possible to tease out Balto-Slavic drift with formal statistics. I showed this indirectly in a recent blog post about Greek population structure (see here). In this post I'm going to demonstrate how to explicitly and formally test for Balto-Slavic drift both in ancient and present-day samples.

To do this we need to find stats that basically split Baltic and Slavic speakers from other Europeans, such as f4(Outgroup,Test;Bell_Beaker_NDL,Baltic_LVA_BA). In this f4-stat, Baltic_LVA_BA is the ancient reference population with an unusually high level of Balto-Slavic drift, while Bell_Beaker_NDL is a fairly similar population overall in terms of ancient ancestry components, but with practically zero Balto-Slavic drift.
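
For readers who haven't worked with f4-stats, here is a minimal Python sketch of the idea. It treats f4(A,B;C,D) as the average over SNPs of (a - b) * (c - d), where the letters are allele frequencies in each population, and it gets a Z score from a simple delete-one block jackknife over equal-sized runs of consecutive SNPs. Real software weights the blocks and defines them by genetic distance, so the block scheme, the function names and the toy frequencies here are all simplifying assumptions.

```python
import math

def f4(p_a, p_b, p_c, p_d):
    """Average over SNPs of (a - b) * (c - d) for four frequency lists."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)

def f4_with_z(p_a, p_b, p_c, p_d, n_blocks=10):
    """Return (f4, Z) using a delete-one jackknife over contiguous SNP blocks."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    n = len(products)
    total = sum(products)
    estimate = total / n
    size = n // n_blocks
    leave_one_out = []
    for k in range(n_blocks):
        start = k * size
        end = n if k == n_blocks - 1 else start + size
        block_sum = sum(products[start:end])
        # Recompute the statistic with this block of SNPs removed.
        leave_one_out.append((total - block_sum) / (n - (end - start)))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in leave_one_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, z

# Toy example: Test and D share the same deviations (shared drift), C does not.
shift = [((i * 7) % 11 - 5) / 100 for i in range(100)]
outgroup = [0.5] * 100
test = [0.5 + s for s in shift]
pop_c = [0.4] * 100
pop_d = [0.4 + 0.5 * s for s in shift]

print(f4_with_z(outgroup, test, pop_c, pop_d))
```

In f4(Outgroup,Test;Bell_Beaker_NDL,Baltic_LVA_BA), a significantly positive result means that Test shares more drift with Baltic_LVA_BA than with Bell_Beaker_NDL, which is the signal being tested for here.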

Note that the statistics with the most significant Z scores (>3) involve populations that speak Baltic or Slavic languages, or their neighbors who plausibly harbor significant Baltic and/or Slavic ancestry. Among the ancient, mostly Scandinavian, populations (from Margaryan et al. 2020 and marked with the VK2020 prefix), significant Balto-Slavic drift only appears in the more easterly and/or later groups from the Viking Age (VA).


Unfortunately, one of the problems with this analysis is that Baltic_LVA_BA and Bell_Beaker_NDL aren't identical in terms of their ancient ancestry proportions. For one, the latter has significantly more Neolithic farmer ancestry. No wonder then, that Greeks, who are mostly of early farmer stock, don't show a significant Z score, despite probably packing a significant amount of Balto-Slavic ancestry dating to the Middle Ages.

In the near future, as more ancient samples become available, it might be possible to find better reference populations for the job and create more accurate, finer-scaled tests.

See also...

Uralian genes

That old chestnut: Northeast vs Northwest Euros

Sunday, January 17, 2021

That old chestnut: Northeast vs Northwest Euros


In the last comment thread reader Greg put forth this question:

David, when are you going to explain the genetic discrepancy between Northeastern and Northwestern Europeans? You know, the one that people believe is due to Baltic Hunter-Gatherer admixture, whereas you believe it is due to genetic drift? You ought to make a post about this issue at some point, because a lot of people are wondering what's causing the differences.

Well, Greg, this issue has been discussed to the proverbial death here and elsewhere. In fact, there were two posts and rather lengthy comment threads on the same topic at this blog just a few months ago. See here and here.

Nevertheless, it seems that a fair number of people are still befuddled, so I'm going to try to explain this one last time, as briefly as I can using just a handful of f4-stats.

Admittedly, Northeast Europeans generally do pack higher levels of indigenous European hunter-gatherer ancestry than Northwest Europeans. This is especially true of Balts, who show more of this type of ancestry than even Scandinavians in practically every type of analysis.

The f4-stats below back this up unambiguously. Note the significantly positive (>3) Z scores, which suggest that Latvians and Lithuanians harbor more Baltic hunter-gatherer-related ancestry than Norwegians and Swedes.

Chimp Baltic_HG Norwegian Latvian 0.001301 7.114
Chimp Baltic_HG Swedish Latvian 0.001017 4.205
Chimp Baltic_HG Norwegian Lithuanian 0.001023 7.341
Chimp Baltic_HG Swedish Lithuanian 0.000763 3.408

Greg, I know what you're thinking: the naysayers are right! But wait, because there's a twist to this tale. Check out these f4-stats:

Chimp Baltic_HG Norwegian Belarusian 0.000265 1.934
Chimp Baltic_HG Swedish Belarusian 0.000152 0.7
Chimp Baltic_HG Norwegian Polish 6.4E-05 0.519
Chimp Baltic_HG Swedish Polish -0.000235 -1.074

Please note, Greg, that none of the Z scores reach significance, which means that these Northwest Europeans and Slavs are symmetrically related to Baltic_HG. They're also symmetrically related to other relevant ancient groups such as the Yamnaya steppe herders. This, of course, suggests that they harbor very similar levels of basically the same ancient genetic components.

Chimp Karelia_HG Norwegian Belarusian 0.000136 0.844
Chimp Karelia_HG Swedish Belarusian 7.9E-05 0.32
Chimp Karelia_HG Norwegian Polish -4.7E-05 -0.304
Chimp Karelia_HG Swedish Polish -0.000134 -0.54

Chimp Yamnaya_Samara Norwegian Belarusian -0.000134 -1.085
Chimp Yamnaya_Samara Swedish Belarusian -6.6E-05 -0.34
Chimp Yamnaya_Samara Norwegian Polish -0.000225 -1.995
Chimp Yamnaya_Samara Swedish Polish -0.000311 -1.574

Chimp Barcin_N Norwegian Belarusian -0.000335 -2.809
Chimp Barcin_N Swedish Belarusian -0.000284 -1.491
Chimp Barcin_N Norwegian Polish -0.000222 -2.057
Chimp Barcin_N Swedish Polish -0.000318 -1.662

Chimp Baikal_N Norwegian Belarusian 0.000186 1.3
Chimp Baikal_N Swedish Belarusian -7E-05 -0.33
Chimp Baikal_N Norwegian Polish -4.6E-05 -0.351
Chimp Baikal_N Swedish Polish -0.000477 -2.277
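
The significance check applied to the f4-stats above is easy to automate. Here is a minimal Python sketch that parses rows in the same layout (four population labels, the f4 value, then the Z score) and flags those with an absolute Z score above 3. The function names are made up for illustration, and the rows are copied from the Baltic_HG tables above.

```python
def parse_f4_rows(text):
    """Parse 'W X Y Z f4 zscore' lines into ((W, X, Y, Z), f4, zscore) tuples."""
    rows = []
    for line in text.strip().splitlines():
        w, x, y, z, value, zscore = line.split()
        rows.append(((w, x, y, z), float(value), float(zscore)))
    return rows

def significant(rows, threshold=3.0):
    """Return the population quartets whose |Z| exceeds the threshold."""
    return [pops for pops, _, zscore in rows if abs(zscore) > threshold]

balts = parse_f4_rows("""
Chimp Baltic_HG Norwegian Latvian 0.001301 7.114
Chimp Baltic_HG Swedish Latvian 0.001017 4.205
Chimp Baltic_HG Norwegian Lithuanian 0.001023 7.341
Chimp Baltic_HG Swedish Lithuanian 0.000763 3.408
""")

slavs = parse_f4_rows("""
Chimp Baltic_HG Norwegian Belarusian 0.000265 1.934
Chimp Baltic_HG Swedish Belarusian 0.000152 0.7
Chimp Baltic_HG Norwegian Polish 6.4E-05 0.519
Chimp Baltic_HG Swedish Polish -0.000235 -1.074
""")

print(len(significant(balts)), "of", len(balts), "Baltic comparisons are significant")
print(len(significant(slavs)), "of", len(slavs), "Slavic comparisons are significant")
```

All four of the Baltic comparisons pass the threshold and none of the Slavic ones do, which is the contrast that the argument rests on.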

Interestingly, pairing up Ukrainians with English samples from Cornwall and Kent produces similar outcomes. But that's because most ancient ancestry proportions in Europe show a closer correlation with latitude than longitude.

Chimp Baltic_HG English_Cornwall Ukrainian 0.000282 2.242
Chimp Baltic_HG English_Kent Ukrainian 0.000225 1.748

Chimp Karelia_HG English_Cornwall Ukrainian 0.000323 2.175
Chimp Karelia_HG English_Kent Ukrainian 0.000239 1.634

Chimp Yamnaya_Samara English_Cornwall Ukrainian -6.6E-05 -0.569
Chimp Yamnaya_Samara English_Kent Ukrainian -0.000112 -0.977

Chimp Barcin_N English_Cornwall Ukrainian -0.000519 -4.641
Chimp Barcin_N English_Kent Ukrainian -0.000598 -5.232

Chimp Baikal_N English_Cornwall Ukrainian 0.000385 2.874
Chimp Baikal_N English_Kent Ukrainian 0.00036 2.836

Now, Greg, if at least in terms of genetic ancestry, Latvians, Lithuanians, Belarusians, Poles and Ukrainians all qualify as Northeast Europeans, then what makes them different, as a group, from Northwest Europeans? Do you believe that the key factor is admixture from Baltic hunter-gatherers? Or is it genetic drift?

Of course, considering all of the f4-stats above, logic dictates that it must be relatively recent genetic drift.

Keep in mind, however, that this only applies to Balto-Slavic speaking Northeast Europeans without significant Uralian ancestry. Overall, Uralic speakers have a more complex population history, and indeed genetic differences between them and Northwest Europeans are in large part due to somewhat different ancestry proportions and also Siberian admixture.

See also...

So who's the most (indigenous) European of us all?

Friday, November 13, 2020

Fatyanovo as part of the wider Corded Ware family (Nordqvist and Heyd 2020)


There's a new archeological paper about the Fatyanovo culture at the Proceedings of the Prehistoric Society [LINK]. It includes this quote on page 18:

In the traditional narrative, the Fatyanovo people – like the CWC populations in general – are regarded as Indo-European, representing the pre-Balto-Slavic (-Germanic) stage (Carpelan & Parpola 2001, 88; Anthony 2007, 380; also Gimbutas 1956, 163; Tretyakov 1966, 109) in the spread of Indo-European languages.

That's correct, but considering the latest ancient DNA research on the Fatyanovo people, the traditional narrative is probably wrong. Fatyanovo males were rich in Y-haplogroup R1a-Z93, which is found at very low frequencies in Balto-Slavic populations (see here). It's actually much more common nowadays in Central and South Asia, where it often reaches frequencies of over 50% in Indo-Iranian speaking groups.

Balts and Slavs are rich in R1a-Z282, which is a sister clade of R1a-Z93 that has been found in Corded Ware and Corded Ware-related samples from west of Fatyanovo sites. That is, in present-day Poland and the Baltic states.

Therefore, the origins of the Balto-Slavs should be sought somewhere west of the Fatyanovo culture, probably in the Corded Ware derived populations from what is now the border zone between Poland, Belarus and Ukraine.

Indeed, in my view the Fatyanovo people are more likely to have spoken Proto-Indo-Iranian rather than anything ancestral to Baltic or Slavic (see here).

Nordqvist and Heyd, The Forgotten Child of the Wider Corded Ware Family: Russian Fatyanovo Culture in Context, Proceedings of the Prehistoric Society, online 12 November 2020, DOI: https://doi.org/10.1017/ppr.2020.9

See also...

The oldest R1a to date

Saturday, November 7, 2020

Slavic-like Medieval Germans


The samples labeled DEU_MA_Krakauer_Berg in the Principal Component Analysis (PCA) plot below are from a recent paper by Parker et al. at Scientific Reports. Their remains were excavated from a Medieval cemetery in the now abandoned village of Krakauer Berg in eastern Germany.

Krakauer sounds sort of like Kraków, doesn't it? That's probably not a coincidence, especially considering how these people behave in my analysis. To see an interactive version of the plot, paste the coordinates from the text file here into the relevant field here.

See also...

Yamnaya-related ancestry proportions in present-day Poles

Warriors from at least two different populations fought in the Tollense Valley battle

Viking world open analysis and discussion thread

Tuesday, September 8, 2020

Warriors from at least two different populations fought in the Tollense Valley battle


I can't get the genotype data from the Burger et al. paper. The lead authors, Joachim Burger and Daniel Wegmann, aren't replying to my emails.

But they were gracious enough to release the BAM files for each of their samples, and these files can be converted to genotype data. So I've included ten of the Tollense Valley warriors (DEU_Tollense_BA) in the Global25 datasheets (see here).

The claim in the paper that these warriors "represent an unstructured population" is absolutely false and extremely naive.

Below are a couple of Principal Component Analysis (PCA) plots produced with Vahaduo Global25 views. The samples are labeled according to their Y-chromosome haplogroups. To see interactive versions of the same plots, paste the Global25 coordinates from the text file here into the relevant fields here.


These warriors are not a single unstructured population, because they cover too much ground in the above plots for that to be possible. It's clear to me that they represent at least two different groups from Central Europe and surrounds.
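
One rough way to put a number on "covering too much ground" is to compare the average pairwise distance between samples within a group. The sketch below assumes each sample is just a tuple of PCA coordinates. The function name mean_pairwise_distance is hypothetical, and the coordinates are invented placeholders, not values from the Global25 datasheets.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of coordinate tuples."""
    pairs = list(combinations(points, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Placeholder 2-D coordinates, invented for illustration only.
tight_group = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
spread_group = [(0.0, 0.0), (0.0, 4.0), (4.0, 0.0), (4.0, 4.0)]

tight = mean_pairwise_distance(tight_group)
spread = mean_pairwise_distance(spread_group)
print(f"tight: {tight:.3f}, spread: {spread:.3f}, ratio: {spread / tight:.1f}")
```

A group whose mean pairwise distance is several times larger than that of a comparable reference population is at least a hint of substructure. It isn't a formal test, and the result depends on which dimensions are included.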

Of course, this would be a lot easier to work out if Burger et al. cared to supply more information about each of the warriors, such as their attire, weapons, circumstances of death, and so on. It's a complete mystery to me why this wasn't included in the paper, and the authors are refusing to talk to me, so it's unlikely that I'll ever be able to get it from them.

In the absence of such crucial archeological and anthropological data, I don't want to speculate too much, and get overly creative, but here are a couple of possible scenarios to explain the ancient DNA results:
- this may have been a battle between two Central European armies, one rich in Y-haplogroup R1b and the other rich in Y-haplogroup I2a, as well as their allies or hired help, including warriors from Eastern Europe belonging to Y-haplogroup R1a

- or perhaps it was an invasion from the east by warriors rich in Y-haplogroup R1a, and it was a success, with the local armies, rich in Y-haplogroups R1b and I2a, losing the battle and suffering most of the casualties.

I'm sure that one day someone will attempt to undertake a decent multidisciplinary study of this epic battle, and we'll at least have a rough idea about what happened. Or not.

Citation...

Burger et al., Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years, Current Biology, Available online 3 September 2020, https://doi.org/10.1016/j.cub.2020.08.033

See also...

Genetic and linguistic structure across space and time in Northern Europe

Sunday, September 6, 2020

Low prevalence of lactase persistence in Bronze Age Europe (Burger et al. 2020)


Over at Current Biology at this LINK. Unfortunately, this is the long-awaited Tollense Valley battle paper. Despite the obvious presence of some very interesting genetic substructures among the Tollense Valley warriors (see here), the authors have the audacity to claim that these individuals represent a "single unstructured Central/Northern European population".

One of the warriors, labeled WEZ56, belongs to Y-haplogroup R1a and shows an exceedingly Balto-Slavic-like genome-wide genetic structure. But none of this is even mentioned in passing in the paper. Indeed, according to Burger et al., WEZ56 is best classified as belonging to R1, even though the R1a classification is quite secure based on the raw data that the authors posted online.

Be extremely wary of what you read in this paper, and anything else that these scientists have published in the past and will publish in the future. Below is the paper summary:

Lactase persistence (LP), the continued expression of lactase into adulthood, is the most strongly selected single gene trait over the last 10,000 years in multiple human populations. It has been posited that the primary allele causing LP among Eurasians, rs4988235-A [1], only rose to appreciable frequencies during the Bronze and Iron Ages [2, 3], long after humans started consuming milk from domesticated animals. This rapid rise has been attributed to an influx of people from the Pontic-Caspian steppe that began around 5,000 years ago [4, 5]. We investigate the spatiotemporal spread of LP through an analysis of 14 warriors from the Tollense Bronze Age battlefield in northern Germany (∼3,200 before present, BP), the oldest large-scale conflict site north of the Alps. Genetic data indicate that these individuals represent a single unstructured Central/Northern European population. We complemented these data with genotypes of 18 individuals from the Bronze Age site Mokrin in Serbia (∼4,100 to ∼3,700 BP) and 37 individuals from Eastern Europe and the Pontic-Caspian Steppe region, predating both Bronze Age sites (∼5,980 to ∼3,980 BP). We infer low LP in all three regions, i.e., in northern Germany and South-eastern and Eastern Europe, suggesting that the surge of rs4988235 in Central and Northern Europe was unlikely caused by Steppe expansions. We estimate a selection coefficient of 0.06 and conclude that the selection was ongoing in various parts of Europe over the last 3,000 years.

Burger et al., Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years, Current Biology, Available online 3 September 2020, https://doi.org/10.1016/j.cub.2020.08.033

See also...

Warriors from at least two different populations fought in the Tollense Valley battle

Saturday, July 4, 2020

Fatyanovo males were rich in Y-haplogroup R1a-Z93 (Saag et al. 2020 preprint)


I'd say that thanks to this preprint we're now a lot closer to solving the mystery of the Sintashta people. Over at bioRxiv at this LINK. From the preprint:

Transition from the Stone to the Bronze Age in Central and Western Europe was a period of major population movements originating from the Ponto-Caspian Steppe. Here, we report new genome-wide sequence data from 28 individuals from the territory north of this source area - from the under-studied Western part of present-day Russia, including Stone Age hunter-gatherers (10,800-4,250 cal BC) and Bronze Age farmers from the Corded Ware complex called Fatyanovo Culture (2,900-2,050 cal BC). We show that Eastern hunter-gatherer ancestry was present in Northwestern Russia already from around 10,000 BC. Furthermore, we see a clear change in ancestry with the arrival of farming - the Fatyanovo Culture individuals were genetically similar to other Corded Ware cultures, carrying a mixture of Steppe and European early farmer ancestry and thus likely originating from a fast migration towards the northeast from somewhere in the vicinity of modern-day Ukraine, which is the closest area where these ancestries coexisted from around 3,000 BC.

...

Interestingly, in all individuals for which the chrY hg could be determined with more depth (n=6), it was R1a2-Z93 (Table 1, Supplementary Data 2), a lineage now spread in Central and South Asia, rather than the R1a1-Z283 lineage that is common in Europe [38,39].


Saag et al., Genetic ancestry changes in Stone to Bronze Age transition in the East European plain, BioRxiv, Posted July 03, 2020, doi: https://doi.org/10.1101/2020.07.02.184507

See also...

Like three peas in a pod

Tuesday, June 16, 2020

Like three peas in a pod


One of the most interesting questions still waiting to be answered by ancient DNA is where exactly the ancestors of the present-day European and South Asian bearers of Y-haplogroup R1a parted ways. Indeed, the answer to this question is likely to be informative about the place and time of the split between the Balto-Slavic and Indo-Iranian language families.

I was doing some reading today and discovered that the peoples associated with the Bronze Age Fatyanovo-Balanovo and Unetice archeological cultures shared strikingly similar metalwork, despite being separated by well over two thousand kilometers of forest and steppe. Apparently, this similarity is especially pronounced in the metalwork of the Unetice culture from what is now Slovakia (see Ancient Metallurgy in the USSR: The Early Metal Age, page 136).

S11953 is currently the only sample from Slovakia associated with the Unetice culture (Sirak et al. 2020). There are no Fatyanovo-Balanovo samples available yet. However, as far as I can tell, I0432 from Samara, Russia, should be a decent stand-in (Mathieson et al. 2015).

Of course, both S11953 and I0432 belong to Y-haplogroup R1a. Moreover, S11953 belongs to a typically Balto-Slavic subclade of R1a, while I0432 belongs to a closely related subclade that is dominant nowadays among the Indo-Iranian speakers of Asia.

S11953 is younger than I0432, but this doesn't necessarily mean that his ancestors arrived in East Central Europe from deep in Russia during the Bronze Age. Indeed, the opposite is more likely to be true. That is, I0432 is probably a recent descendant of migrants from somewhere near the North Carpathians, because he shows elevated European Neolithic farmer ancestry compared to earlier ancients from the Samara region (see here).

Below is a Principal Component Analysis (PCA) showing how S11953 and I0432 compare to each other in the context of ancient West Eurasian genetic variation. Obviously, they're sitting in the same part of the plot, which suggests that they harbor very similar ratios of ancient genetic components and probably share relatively recent ancestry. The relevant PCA datasheet is available here.
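
For anyone who wants to check this sort of thing numerically rather than by eye, here's a minimal sketch that measures the Euclidean distance between samples in coordinate space. It assumes a comma-separated layout with the sample label first and the coordinates after it. The helper parse_coordinates is hypothetical, and the rows are invented placeholders rather than values from the datasheet.

```python
from math import dist


def parse_coordinates(text):
    """Map each sample label to its tuple of float coordinates."""
    samples = {}
    for line in text.strip().splitlines():
        label, *values = line.split(",")
        samples[label.strip()] = tuple(float(v) for v in values)
    return samples


# Placeholder rows, invented for illustration; not real datasheet values.
sheet = """
sample_A,0.10,0.20,0.30
sample_B,0.10,0.20,0.35
sample_C,0.50,-0.40,0.10
"""

coords = parse_coordinates(sheet)
d_ab = dist(coords["sample_A"], coords["sample_B"])
d_ac = dist(coords["sample_A"], coords["sample_C"])
print(f"A-B: {d_ab:.3f}, A-C: {d_ac:.3f}")
```

A small distance across all of the available dimensions is a stronger statement than two points sitting close together on a two-dimensional plot, since the plot only shows two components at a time.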


I've also highlighted myself, Davidski, on the plot. That's because I share the same Balto-Slavic-specific subclade of R1a with S11953 and, in terms of overall ancestry, I'm similar to both S11953 and I0432. Moreover, I'm a speaker of Polish, which is a Balto-Slavic language. What are the chances that we're dealing here with a remarkable string of coincidences? Indeed, was the North Carpathian region perhaps the homeland of the language ancestral to both Balto-Slavic and Indo-Iranian?

However, please note that there's nothing unusual or remarkable about my ancestry. The vast majority of people of Central, Eastern and Northern European origin - that is, mostly the speakers of Balto-Slavic, Germanic and Celtic languages - would also land in this part of the plot.

See also...

On the doorstep of India

Y-haplogroup R1a and mental health

The mystery of the Sintashta people