Eurogenes Blog: Vikings

Monday, March 25, 2024

High-resolution stuff

I just emailed this to the authors of High-resolution genomic ancestry reveals mobility in early medieval Europe, a new preprint at bioRxiv [LINK].

I appreciate that Polish population history is not the main focus of your preprint, and also that you're constrained by the lack of relevant and suitably high quality ancient genomes from East-Central and Eastern Europe. However, I must say that your analysis of the Medieval Polish population and resulting conclusions about Polish population history don't reflect reality.

Your Poland_Middle_Ages genomic cluster is made up of just six samples that don't fully represent the genetic complexity of the core population of Medieval Poland.

As a result, you classified PCA0148 as one of the Poland_Middle_Ages outliers, even though this sample isn't an outlier when analyzed within the context of the full set of published Polish Medieval genomes.

Moreover, PCA0148 is very similar to several Polish Viking Age samples that show Scandinavian-specific genome-wide and Y-chromosome haplotypes, and probably likewise shows some Scandinavian-related ancestry.

This is important to note when attempting to recapitulate Polish population history, because it suggests that Scandinavian-related ancestry played a formative role in the shaping of the core Polish Medieval genetic cluster.

Thus, you might be correct when you claim that the six samples in your Poland_Middle_Ages cluster don't show any "detectable" Scandinavian-related ancestry, but this doesn't necessarily mean that this type of ancestry isn't a key part of the post-Iron Age Polish population history.

Below is a self-explanatory Principal Component Analysis (PCA) plot that illustrates my points. Interestingly, Figure 3c in your preprint shows very similar outcomes with regard to the post-Iron Age Polish population history. But the style and scale of your figure make it difficult to spot the subtle but likely genuine Northwest European-related genetic shifts shown by PCA0148, the Viking context samples and present-day Poles relative to the Poland_Middle_Ages cluster.

However, I'm also skeptical that your Poland_Middle_Ages cluster doesn't carry any detectable or even significant Scandinavian-related ancestry. That's because I suspect that there might be some technical issues with your analysis that are masking this type of ancestry in the Polish samples.

Your top mixture model for the Poland_Middle_Ages cluster is, in all likelihood, an extreme statistical abstraction of reality, rather than a close reflection of it. That's because, due to a combination of historical, geographical and genetic factors, neither Italy.Imperial(I).SG nor Lithuania.IronRoman.SG are realistic formative source populations for the Medieval Polish gene pool.

One of the reasons why you ended up with such a surprising result is probably the lack of suitable samples from East-Central and Eastern Europe, especially those associated with plausibly the earliest Slavic-speaking populations.

It's also possible that basing your mixture model on formal statistics played a key part.

Formal statistics-based mixture models are known to be biased towards outcomes involving mixture sources from the extremes of mixture clines. If your analysis is affected by this problem, then this would help to explain why you characterized the Poland_Middle_Ages cluster as simply a two-way mixture between a Middle Eastern-related group from Imperial Rome and a Baltic population with a very high cut of European hunter-gatherer ancestry.

I do note that on page 6 of your manuscript you consider the possibility that the Southern European-related signal in the Poland_Middle_Ages cluster might only be very distantly related to Italy.Imperial(I).SG, and that it may even have spread across Poland with early Slavic speakers. This is a great point, and I think it should be emphasized and expanded upon, because I suspect that the problem runs deeper than this.

For instance, if the early Slavic ancestors of Poles carried substantially more Southern European-related ancestry than Lithuania.IronRoman.SG, and this ancestry was, say, more Balkan-related than Italian-related, then this might radically change your modeling of the Poland_Middle_Ages cluster. That's because these early Slavs would be positioned in a very different genetic space than Lithuania.IronRoman.SG, which could potentially require a significant signal of Scandinavian-related ancestry to get a robust mixture model.

Finally, it might be useful to consider Isolation-by-Distance as a partial vector for the Italy.Imperial(I).SG-related signal in Medieval Poland.

The full set of published Polish Medieval genomes includes a number of outliers with obvious ancestry from Western Europe and the Balkans. These people probably don't represent any large-scale migrations into Poland, but rather the movements of individuals and small groups. Over time, such small-scale mobility may have had a fairly significant impact on the genetic character of the Polish population.

Update 26/03/2024: I sent another email to Speidel et al., this time regarding their analysis of present-day Hungarians.

Your preprint also claims that present-day Hungarians are genetically similar to Scythians, and that this is consistent with the arrival of Magyars, Avars and other eastern groups in this part of Europe.

However, present-day Hungarians are overwhelmingly derived from Slavic and German peasants from near Hungary. This is not a controversial claim on my part; it's backed up by historical sources and a wide range of genetic analyses.

Hungarians still show some minor ancestry from Hungarian Conquerors (early Magyars), but this signal only reliably shows up in large surveys of Y-chromosome samples.

The Scythians that you used to model the ancestry of present-day Hungarians are of local, Pannonian origin, and they don't show any eastern nomad ancestry. So they're either acculturated Scythians, or, more likely, wrongly classified as Scythians by archeologists.

And since these so-called Scythians lack eastern nomad ancestry, the similarity between them and present-day Hungarians is not a sign of the impact of Avars, Hungarian Conquerors and the like, but rather of a lack of significant input from such groups in present-day Hungarians.

Citation...

Speidel et al., High-resolution genomic ancestry reveals mobility in early medieval Europe, bioRxiv, Posted March 19, 2024, doi: https://doi.org/10.1101/2024.03.15.585102

See also...

Wielbark Goths were overwhelmingly of Scandinavian origin

Tuesday, September 29, 2020

Viking world open analysis and discussion thread

Global25 and Celtic vs Germanic coordinates for most of the samples from the recent Margaryan et al. Viking paper are now available HERE and HERE, respectively. Look for the VK2020 prefix.

Feel free to put them through their paces and let me know what you find. Below are a couple of examples of what can be done with these coordinates using Vahaduo Global25 Views.

Monday, November 25, 2019

Viking Age Iceland

I finally managed to get some of the Icelandic ancients from Ebenesersdóttir et al. 2018 into the Global25 datasheets (see here). Better late than never. Look for the "ISL_Viking_Age" prefix. Below is a screen cap of a Principal Component Analysis (PCA) with the new samples. It was done with an online Global25 PCA runner freely available here.

The individuals classified as unadmixed Gaels and Norse by Ebenesersdóttir et al. generally also look like it based on their Global25 coordinates.

The mixture models below, using all of the populations from the Global25 "modern pop averages scaled" datasheet, were run with an online tool freely available here. Note that the ADD DIST COL option is set to 1X. This is a useful feature for modeling the fine scale ancestry of samples that are derived from very similar populations.

Tuesday, November 5, 2019

Modeling your ancestry has never been easier

An exceedingly simple, yet feature-packed, online tool ideal for modeling ancestry with Global25 coordinates is freely available HERE. It works offline too, after downloading the web page onto your computer. Just copy paste the coordinates of your choice under the "source" and "target" tabs, and then mess around with the buttons to see what happens. The screen caps below show me doing just that.

Another free, easy to use online tool that works with Global25 coordinates is the Principal Component Analysis (PCA) runner HERE. Below is a screen cap of me checking out one of the many PCA plots that it offers.

See also...

Getting the most out of the Global25

Wednesday, July 17, 2019

Viking invasion at bioRxiv

A new preprint featuring hundreds of Viking Age genomes has appeared at bioRxiv [LINK]. Titled Population genomics of the Viking world, it looks like a solid effort overall, although I'm skeptical about its conclusions. I might elaborate on that in the comments below, but I'll have a lot more to say on the topic if and when I get to check out the ancient genomes with my own tools. Details about the new samples, including their Y-chromosome haplogroup assignments, are available here. Below is the abstract, emphasis is mine:

The Viking maritime expansion from Scandinavia (Denmark, Norway, and Sweden) marks one of the swiftest and most far-flung cultural transformations in global history. During this time (c. 750 to 1050 CE), the Vikings reached most of western Eurasia, Greenland, and North America, and left a cultural legacy that persists till today. To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago. We find evidence for a majority of Danish Viking presence in England, Swedish Viking presence in the Baltic, and Norwegian Viking presence in Ireland, Iceland, and Greenland. Additionally, we see substantial foreign European ancestry entering Scandinavia during the Viking Age. We also find that several of the members of the only archaeologically well-attested Viking expedition were close family members. By comparing Viking Scandinavian genomes with present-day Scandinavian genomes, we find that pigmentation-associated loci have undergone strong population differentiation during the last millennia. Finally, we are able to trace the allele frequency dynamics of positively selected loci with unprecedented detail, including the lactase persistence allele and various alleles associated with the immune response. 
We conclude that the Viking diaspora was characterized by substantial foreign engagement: distinct Viking populations influenced the genomic makeup of different regions of Europe, while Scandinavia also experienced increased contact with the rest of the continent.

Margaryan et al., Population genomics of the Viking world, bioRxiv, posted July 17, 2019, doi: https://doi.org/10.1101/703405

See also...

They came, they saw, and they mixed

Who were the people of the Nordic Bronze Age?

Asiatic East Germanics

Friday, July 12, 2019

Getting the most out of the Global25

The first thing you need to know about the Global25 is that I update the relevant datasheets regularly, usually every few weeks, but they're always at these links:

Global25 datasheet ancient scaled

Global25 pop averages ancient scaled

Global25 datasheet ancient

Global25 pop averages ancient

...

Global25 datasheet modern scaled

Global25 pop averages modern scaled

Global25 datasheet modern

Global25 pop averages modern

Global25 data for samples from a variety of papers that have been published recently will eventually be incorporated into the main datasheets linked above, but the process might take several weeks or even months. In the meantime, feel free to use the temporary datasheets below. Thanks for your patience.

Allentoft 2023

Chylenski 2023

Jeong 2024

Koptekin 2022

Olalde 2023

Peltola 2022

Penske 2023

Posth 2023

Sirak 2024

Skourtanioti 2023

Stolarek 2023

Varela 2023

Wang 2023

Yu 2023

Each sample has a population code and an individual code. The population codes represent the countries, ethnic groups and/or archeological affinities of the samples, and I often modify these codes to suit my needs. On the other hand, the individual codes are unique to most of the samples and I usually don't change them.

So if you'd like to know more details about the samples try searching for their individual codes via a decent online search engine. Basic information about many of the samples is also available in the "anno" files here.

The main purpose of the Global25 is to provide data for mixture modeling. In other words, for estimating ancestry proportions, both ancient and modern (see here). This can be done on your computer with the R program and the nMonte R script, or online with a couple of different tools, which I discuss below.

If you don't have R installed on your computer, you can get it here, while nMonte is available here. For this tutorial please download nMonte and nMonte3, and store them in your main working folder (usually My Documents).

Once you have R set up, make sure its working directory is the same place where you stored nMonte. You can check this in R by clicking on "File" and then "Change dir". Additionally, you'll need two nMonte input files in the working directory titled "data" and "target". Examples of these files are available here. We'll be using them to test the ancient ancestry proportions of a sample set from present-day England.

Before you can begin the analysis you need to first call the nMonte script by typing or copy pasting source('nMonte.R') into the R console window, and then hitting "enter" on your keyboard. This is what you should see in the R console window afterwards.

To start the mixture modeling process, type or copy paste getMonte('data.txt', 'target.txt') into the R console window, hit "enter", and wait for the results. After a short time, probably less than a minute or two, you should see this output.

The data and target files contain population averages. And, as you can see, the results that these population averages have produced are in line with what one would expect from such a model focusing on the genetic shifts in Northern Europe during the Late Neolithic. Very similar ancient ancestry proportions have been reported for the English and other Northern Europeans recently in scientific literature.
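For anyone curious about what this kind of mixture modeling involves, here is a minimal Python sketch of the basic idea: find the blend of source populations whose weighted average coordinates sit closest (in Euclidean distance) to the target. This is not the nMonte code, and it isn't meant to reproduce its results. The function names (`mix_distance`, `fit_mixture`), the simple greedy search and the three-dimensional toy coordinates are all assumptions made up for illustration; real Global25 rows have 25 dimensions.

```python
import math


def mix_distance(weights, sources, target):
    """Euclidean distance between the target and the weighted mix of sources."""
    dims = len(target)
    mix = [sum(w * src[d] for w, src in zip(weights, sources)) for d in range(dims)]
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(mix, target)))


def fit_mixture(sources, target, slots=100):
    """Greedy search (illustrative only): ancestry is split into `slots`
    equal shares, and one share at a time is moved between sources for as
    long as the fit to the target keeps improving."""
    names = list(sources)
    coords = [sources[name] for name in names]
    counts = [slots // len(names)] * len(names)
    counts[0] += slots - sum(counts)  # hand any leftover shares to the first source
    best = mix_distance([c / slots for c in counts], coords, target)
    improved = True
    while improved:
        improved = False
        best_move = None
        for i in range(len(names)):
            if counts[i] == 0:
                continue  # nothing to take away from this source
            for j in range(len(names)):
                if i == j:
                    continue
                trial = counts[:]
                trial[i] -= 1
                trial[j] += 1
                dist = mix_distance([c / slots for c in trial], coords, target)
                if dist < best - 1e-12:
                    best, best_move = dist, trial
        if best_move is not None:
            counts = best_move
            improved = True
    percentages = {name: 100 * c / slots for name, c in zip(names, counts)}
    return percentages, best


# Hypothetical toy coordinates, not real Global25 data.
toy_sources = {
    "Pop_A": [1.0, 0.0, 0.0],
    "Pop_B": [0.0, 1.0, 0.0],
    "Pop_C": [0.0, 0.0, 1.0],
}
toy_target = [0.7, 0.3, 0.0]

percentages, distance = fit_mixture(toy_sources, toy_target)
print(percentages)  # {'Pop_A': 70.0, 'Pop_B': 30.0, 'Pop_C': 0.0}
```

With real data the distance never drops to zero, which is why tools of this kind report a distance alongside the percentages: the smaller the value, the closer the fitted blend is to the target.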

However, when focusing on exceptionally fine-scale genetic variation that isn't reflected too well in the Global25 population averages, a more effective strategy might be to use multiple individuals from each reference population and let nMonte3 aggregate and average the inferred ancestry proportions.

This is often the case when attempting to model ancestry proportions for more recent periods, such as the Middle Ages. So let's try this with the English sample set using a modified data file, which is available here.

Replace the old data file with the new one in your working directory, and, like before, copy paste into the R console window the following two commands, hitting "enter" after each one: source('nMonte3.R') and getMonte('data.txt', 'target.txt'). This is what you should eventually see.

It's difficult to say how accurate these estimates are. But they look more or less correct considering the limited and less than ideal reference samples. For instance, the individuals labeled SWE_Viking_Age_Sigtuna are supposed to be stand-ins for Danish and Norwegian Vikings, but they're a relatively heterogeneous group from Sweden, possibly with some British or Irish ancestry, so they might be skewing the results.

However, I'll be adding many more ancient samples to the Global25 datasheets as they become available, including lots of new Vikings, which should greatly improve the accuracy of these sorts of fine-scale mixture models.
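The idea of running a model with individual reference samples and then pooling the results per population can be sketched in a few lines of Python. This is only an illustration under assumptions: it supposes that each reference label has the form `Population:IndividualID`, that pooling means summing the individual percentages within each population, and the function name `aggregate_by_population` and the sample labels and numbers are hypothetical. Whether nMonte3 does exactly this internally isn't something this sketch claims.

```python
from collections import defaultdict


def aggregate_by_population(individual_results):
    """Sum per-individual percentages into per-population totals.

    Labels are assumed to look like 'Population:IndividualID'."""
    totals = defaultdict(float)
    for label, pct in individual_results.items():
        population = label.split(":", 1)[0]  # keep the population code, drop the individual code
        totals[population] += pct
    # Largest contribution first, as in a typical results table
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Made-up labels and percentages, for illustration only.
example = {
    "Pop_X:ind1": 20.0,
    "Pop_X:ind2": 15.5,
    "Pop_Y:ind1": 64.5,
}
print(aggregate_by_population(example))
```

The point of working this way is that individuals carry fine-scale variation that gets smoothed out in a population average, so the fit can draw on whichever individuals resemble the target most, while the final table is still readable at the population level.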

An exceedingly simple, yet feature-packed, online tool ideal for modeling ancestry with Global25 coordinates is the VahaduoJS. It's freely available HERE, and it also works offline after downloading the web page. Just copy paste the coordinates of your choice under the "source" and "target" tabs, and then mess around with the buttons to see what happens. The screen caps below show me doing just that.

However, it's important to note that the Global25 is a Principal Component Analysis (PCA), so it makes good sense to also use it for producing PCA graphs. To do this just plot any combination of two or three of its Principal Components (PCs) to create 2D or 3D graphs, respectively. This can be done with a wide variety of programs, including PAST, which is freely available here.

To produce a 2D graph, open a Global25 datasheet in PAST, choose comma as the separator, highlight any two columns of data, click on the "Plot" tab and, from the drop down list, pick "XY graph". Below is a series of graphs that I created in exactly this way. I also color coded the samples according to their geographic origins. This was done by ticking the "Row attributes" tab.
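The same column-picking step can be done outside of PAST. Below is a minimal Python sketch that pulls any two PCs out of comma-separated datasheet text, ready to hand to whatever plotting program you prefer. It assumes each data row is a sample label followed by its PC values, and that any header row can be skipped because its cells aren't numbers. The function name `pc_columns` and the three-column toy rows are made up for illustration; a real Global25 row has 25 PC values.

```python
import csv
import io


def pc_columns(datasheet_text, pc_x=1, pc_y=2):
    """Return (label, x, y) tuples for two chosen PCs.

    Column 0 is assumed to hold the sample label, so PC1 is column 1,
    PC2 is column 2, and so on."""
    points = []
    for row in csv.reader(io.StringIO(datasheet_text)):
        if not row or not row[0]:
            continue  # blank line, or a header row with an empty first cell
        try:
            x = float(row[pc_x])
            y = float(row[pc_y])
        except (ValueError, IndexError):
            continue  # non-numeric header cells or a short row
        points.append((row[0], x, y))
    return points


# Hypothetical toy datasheet, truncated to three PCs.
toy_sheet = """,PC1,PC2,PC3
Pop_A:ind1,0.12,0.13,0.05
Pop_B:ind1,0.10,-0.02,0.07
"""

for label, x, y in pc_columns(toy_sheet, pc_x=1, pc_y=3):
    print(label, x, y)
```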

PAST can also be used to run PCA on subsets of the Global25 scaled data to produce remarkably accurate plots of fine-scale population structure. For instance, here's a plot based on present-day populations from north of the Alps, Balkans and Pyrenees.

To try this create a new text file with your choice of populations from the Global25 scaled datasheet, open it with PAST and choose Multivariate > Ordination > Principal Components Analysis. I've already put together several datasheets limited to European, Northern European, West Eurasian and South Asian populations. They're available at the links below along with more details on how to run them with PAST.

Global25 workshop 1: that classic West Eurasian plot

Global25 workshop 2: intra-European variation

Global25 workshop 3: genes vs geography in Northern Europe

The South Asian cline that no longer exists
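To give a rough idea of what re-running PCA on a subset involves, here is a minimal pure-Python sketch that centers a small table of coordinates, builds the covariance matrix and finds the first principal component by power iteration. It is a teaching sketch under assumptions, not a description of how PAST works internally: PAST presumably computes all of the components with a full decomposition, whereas this only extracts the first one. The function name `first_principal_component` and the toy rows are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of equal-length lists of numbers, one list per sample."""
    n = len(rows)
    dims = len(rows[0])
    # Center each column on its mean
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    centered = [[r[d] - means[d] for d in range(dims)] for r in rows]
    # Sample covariance matrix
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and renormalize
    vec = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break  # no variance left in the direction being followed
        vec = [v / norm for v in nxt]
    # Project every centered sample onto the component
    scores = [sum(c[d] * vec[d] for d in range(dims)) for c in centered]
    return vec, scores


# Hypothetical toy rows lying on a straight line, so PC1 follows that line.
toy_rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(toy_rows)
print(direction, scores)
```

Because the components are recomputed from the chosen populations only, the axes are free to follow the variation within that subset rather than the much larger differences between distant groups, which is why subset plots can show fine-scale structure more clearly.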

Another free, easy to use online tool that works with Global25 coordinates is the Vahaduo Global25 Views [LINK]. Below is a screen cap of me checking out one of the many PCA plots that it offers.

And if you're fond of tree-like structures as a means to describe fine-scale genetic variation, please see this blog post...

Global25 workshop 4: a neighbour joining tree

See also...

New Global25 interpretation tools

Saturday, June 1, 2019

They came, they saw, and they mixed

Y-chromosome haplogroup N is strongly associated with Uralic-speaking populations. That's probably because it was a salient feature of the gene pool of the earliest Uralic speakers, and it went with them as they migrated across northern Eurasia. However, some of its younger subclades appear to have spread with the speakers of Indo-European and Turkic languages.

For instance, N-Y10931 seems to be a marker of the Rurikids, a Varangian dynasty that, according to most sources, ruled the Kievan Rus in what are now Russia and Ukraine. And the Kievan Rus was a loose medieval political federation in which Slavic, Finnic (west Uralic) and Germanic languages were probably spoken. The latest on the genetic genealogy of the Rurikids was presented a couple of days ago at the Centenary of Human Population Genetics conference in Moscow, and there's an abstract of the talk available here (download the PDF and scroll down to page 84).

I'm not aware of any Rurikids among the thousands of ancients in my dataset, or even of any samples belonging to N-Y10931. But I do have the genome of someone who belongs to N-Y4339, which, as per the abstract linked to above, is proximally ancestral to N-Y10931. Not only does this person come from Viking Age Scandinavia, but he was buried in a crouched position typical of Slavic funerary customs of the time.

The individual in question is vik_84001. His genome was published recently along with a paper on the population structure of the Swedish town of Sigtuna way back when it was a Viking stronghold (see here). This is where his Y-chromosome sequence, labeled ERS2540883, is positioned on the YFull Y-chromosome phylogenetic tree. Click on the image to go to YFull.

However, the result is likely to be compromised to some extent by missing data. If so, it's possible that vik_84001 does indeed belong to N-Y10931 and ought to be sitting near or even among that cluster of Russian samples (Rurik descendants?) at the bottom of the page.

In any case, vik_84001 seems to be the closest individual in the ancient DNA record to a Rurikid. The Principal Component Analysis (PCA) below is based on my Global25 data. It features 18 other Viking Age individuals from Sigtuna alongside vik_84001 (look for the black dots). The relevant datasheet is available here. Interestingly, despite his eastern Y-haplogroup, vik_84001 is one of the few Sigtuna ancients who clusters strongly with present-day Swedes.

But here's what happens when I model his ancestry proportions with the Global25/nMonte method using a wide range of reference populations from Northern and Eastern Europe. The Swedes in this model are the same as those in the PCA.

vik_84001
Swedish,84.6
Ingrian,9.2
Russian_Tver,6.2
Belarusian,0
Estonian,0
Finnish,0
Finnish_East,0
Karelian,0
Latvian,0
Mordovian,0
Russian_Kostroma,0
Russian_Kursk,0
Russian_Orel,0
Russian_Pinega,0
Russian_Smolensk,0
Russian_Voronez,0
Ukrainian,0
Vepsian,0

[1] "distance%=2.3778"

Yep, despite his position in the PCA, vik_84001 shows a strong signal of ancestry related to the present-day populations of northwestern Russia. I'm not sure what this means exactly, but it's certainly fascinating stuff. And, by the way, I usually wouldn't use so many similar reference populations in a single Global25/nMonte model because of the problem of "overfitting", but in some cases it's OK to do so if the nMonte algorithm has enough recent genetic drift to latch onto.

See also...

More on the association between Uralic expansions and Y-haplogroup N

Fresh off the sledge

Uralic-specific genome-wide ancestry did make a significant impact in the East Baltic

It was always going to be this way

Conan the Barbarian probably belonged to Y-haplogroup R1a
