Showing posts with label Saami. Show all posts
Tuesday, September 29, 2020
Viking world open analysis and discussion thread
Global25 and Celtic vs Germanic coordinates for most of the samples from the recent Margaryan et al. Viking paper are now available HERE and HERE, respectively. Look for the VK2020 prefix.
Feel free to put them through their paces and let me know what you find. Below are a couple of examples of what can be done with these coordinates using Vahaduo Global25 Views.
See also...
Viking invasion at bioRxiv
Commoner or elite?
Who were the people of the Nordic Bronze Age?
Labels:
ancient DNA,
Baltic Sea,
Britain,
Denmark,
Fennoscandia,
France,
Ireland,
Kievan Rus,
Nordic,
Norse,
North Sea,
Norway,
Poland,
Russia,
Saami,
Scandinavia,
Sweden,
Viking,
Viking Age,
Vikings
Monday, July 27, 2020
Ancient ancestry proportions in present-day Europeans (to be continued)
This year has already been massive in all sorts of ways, including for new data and software releases. So I'm thinking it might be time to update many of the analyses that were featured at this blog a while ago.
Let's start with the classic hunter vs farmer vs herder mixture model for present-day European populations. The rules of the game are as follows:
- run the latest version of qpAdm using qpfstats output
- use transversion sites and 1240K capture data
- pick a set of diverse and chronologically sound outgroups
- for a model to be successful the p-value must reach 0.01
- tweak the left pops in models that are clearly underperforming
- follow high end scientific literature, logic and common sense
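As a rough illustration of the pass mark in the rules above, here is a minimal Python sketch. The `QpAdmResult` class and `is_successful` helper are hypothetical names invented for this example, not part of qpAdm. The check that every proportion falls between 0 and 1 is an extra assumption; the rules themselves only state the 0.01 p-value threshold. The Saami figures are the ones reported later in this post, and the second model is made up purely to show a failure.

```python
from dataclasses import dataclass

P_THRESHOLD = 0.01  # pass mark stated in the rules above


@dataclass
class QpAdmResult:
    """Hypothetical container for one qpAdm run (not a qpAdm data structure)."""
    target: str
    proportions: dict  # source name -> (estimate, standard error)
    tail_prob: float


def is_successful(result, threshold=P_THRESHOLD):
    """Pass if the p-value reaches the threshold.

    Assumption: proportions outside [0, 1] are also treated as a failure.
    """
    feasible = all(0.0 <= est <= 1.0 for est, _ in result.proportions.values())
    return feasible and result.tail_prob >= threshold


# Four-way Saami model, figures taken from this post.
saami = QpAdmResult(
    target="Saami",
    proportions={
        "HUN_Koros_N_HG": (0.134, 0.043),
        "RUS_Baikal_BA": (0.270, 0.015),
        "TUR_Barcin_N": (0.081, 0.026),
        "UKR_Yamnaya": (0.515, 0.058),
    },
    tail_prob=0.0108571,
)

# Made-up failing model, for illustration only.
failing = QpAdmResult(
    target="Hypothetical_Pop",
    proportions={"TUR_Barcin_N": (0.6, 0.05), "UKR_Yamnaya": (0.4, 0.05)},
    tail_prob=0.0004,
)

for model in (saami, failing):
    print(model.target, "passes" if is_successful(model) else "fails")
```

Under these assumptions the Saami model only just clears the bar, which is consistent with how marginal its tail prob is.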
Obviously, the reason that I decided to limit my analysis to markers from transversion sites is to mitigate problems associated with modeling the ancestry of modern, high quality samples with relatively low quality ancients. One of these problems appears to be qpAdm assigning faux East Asian/Siberian admixture to present-day Europeans (for instance, see figure 4 here).
My starting reference populations and outgroups are listed below. In qpAdm terminology the former are known as the "left pops", while the latter as the "right pops". Most of these samples are freely available at the David Reich Lab website here.
left pops:
HUN_Koros_N_HG
TUR_Barcin_N
UKR_Yamnaya
right pops:
CMR_Shum_Laka_8000BP
MAR_Taforalt
Levant_Natufian
IRN_Ganj_Dareh_N
Levant_PPNB
CZE_Vestonice16
BEL_GoyetQ116-1
Iberia_ElMiron
RUS_Karelia_HG
RUS_West_Siberia_HG
MNG_North_N
RUS_Ust_Kyakhta
As you can see, I picked a wide variety of right pops. But I chose most of them specifically to be able to differentiate the three streams of ancestry - from ancient hunters, farmers and herders - that are the focus of my analysis. I also intentionally avoided using samples in the right pops that may have experienced gene flow, including cryptic gene flow, from the populations in the left pops.
I somewhat speculatively earmarked HUN_Koros_N_HG, from the Early Neolithic Carpathian Basin, and UKR_Yamnaya, from the Early Bronze Age North Pontic steppe in what is now Ukraine, to represent the hunter-gatherer and pastoralist streams of ancestry, respectively.
That's because I expected HUN_Koros_N_HG to be the best proxy for the hunter-gatherer ancestry that was initially absorbed by the early farmers who fanned out from the Aegean region across much of the European continent, and of course it made sense to choose a steppe pastoralist population that was located close to Central Europe where such groups first made the biggest impact outside of the steppe.
Interestingly, HUN_Koros_N_HG and UKR_Yamnaya did prove to be among the most effective choices for the types of ancestries that they represented. For instance, UKR_Yamnaya generally produced much stronger statistical fits than a very similar set of Yamnaya samples from the Caspian steppe (more precisely, from the Samara region in Russia). However, this might well be an artifact, due to very specific characteristics of these few ancient individuals. Larger sample sets would be welcome, especially from Yamnaya sites in Ukraine.
Below, dear audience, is a spreadsheet featuring the preliminary results. Click on the image to view and/or download the spreadsheet. The general rule is that the higher the tail prob, or p-value, the more likely it is that the ancestry proportions are close to the truth (a tail prob of well below 0.05 is usually a strong indication that something isn't right). For a detailed look at each of the qpAdm runs, feel free to consult the zip file here.
Note, however, that many of the European groups in my burgeoning genotype dataset are yet to make an appearance in the spreadsheet. That's because their models with the standard left pops showed p-values well under 0.01, which essentially meant that they failed, and I'm still trying to make them work.
But round one has certainly revealed some fascinating stuff. For instance, except for Hungarians and Estonians, none of the Uralic-speaking groups can be modeled successfully in the standard three-way model.
However, I managed to significantly improve the statistical fits in their models by adding a Siberian population, RUS_Baikal_BA, to the left pops. This is unlikely to be a coincidence, because the Proto-Uralic homeland was almost certainly located in or very near Siberia. Iain Mathieson please take note.
Saami
HUN_Koros_N_HG 0.134±0.043
RUS_Baikal_BA 0.270±0.015
TUR_Barcin_N 0.081±0.026
UKR_Yamnaya 0.515±0.058
chisq 19.865
tail prob 0.0108571
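The tail prob above is the upper-tail probability of the chisq statistic. A minimal Python sketch of that relationship follows, using only the standard library. The degrees of freedom are not stated in this post. The value of 8 used here is an assumption, taken as the number of right pops (12) minus the number of sources (4), and `chi2_tail_prob` is a hypothetical helper name rather than anything from qpAdm.

```python
import math


def chi2_tail_prob(chisq, dof):
    """Upper-tail probability of a chi-square statistic, i.e. Q(dof/2, chisq/2).

    Uses the recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1),
    starting from Q(1, x) = exp(-x) for even dof
    or Q(1/2, x) = erfc(sqrt(x)) for odd dof.
    """
    x = chisq / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < dof / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Assumed degrees of freedom for the four-way Saami model:
# 12 right pops minus 4 sources (an inference, not stated in the post).
n_right_pops = 12
n_sources = 4
dof = n_right_pops - n_sources

print(chi2_tail_prob(19.865, dof))
```

With these assumed degrees of freedom the result should land very close to the reported 0.0108571, which suggests that the assumption is reasonable for this run.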
Monday, December 3, 2018
On the trail of the Proto-Uralic speakers (work in progress)
Historical linguists have long posited that Fennoscandia was a busy contact zone between early Germanic and Uralic languages. The first ancient DNA samples from what is now Finland have corroborated their inferences, by showing that during the Iron Age the western part of the country was inhabited by a genetically heterogeneous population closely related to both the Uralic-speaking Saami and Germanic-speaking southern Scandinavians.
The samples were sequenced and analyzed by two different teams of researchers, and their findings were published recently in Lamnidis et al. and Sikora et al. (see here and here, respectively).
This is how most of these ancients, whose remains were excavated from the Levanluhta burial site dated to 300–800 CE, behave in a Principal Component Analysis (PCA) based on my Global25 data. Levanluhta_IA are the Saami-related samples, while Levanluhta_IA_o is a Scandinavian-like outlier. Baltic_IA is an Iron Age individual from what is now Lithuania from the recent Damgaard et al. paper (see here). Note the accuracy of the Global25 data in pinpointing their genetic affinities and also the trajectory of the Levanluhta_IA cluster, which seems to be "pulling" towards Levanluhta_IA_o.
The Saami and Levanluhta_IA are clear outliers from the main Northern European cluster. There are two reasons for this: excess East Asian/Siberian-related ancestry and Saami-specific genetic drift. However, this eastern admixture and genetic drift are shared in varying degrees by other North European populations, especially those that also speak Uralic languages, and this is why they appear to be "pulling" towards the Saami/Levanluhta_IA clusters in my PCA. Thus, what this suggests is that the expansion of Uralic languages across Northeastern Europe was intimately linked with the spread of Siberian-related ancestry into the region.
This idea has been around for a long time and is now becoming even more widely accepted (see here). However, Lamnidis et al. also featured samples from a likely pre-Uralic (1523±87 calBCE) burial site at Bolshoy Oleni Ostrov in the Kola Peninsula, present-day northern Russia, and, perhaps surprisingly, found that they showed even more Siberian-related ancestry than Levanluhta_IA. So what's going on?
I'm confident that this discrepancy can be explained by multiple waves of migrations from the east into Northeastern Europe, possibly before, during and after the time of the people buried at Bolshoy Oleni Ostrov, by pre-Uralic, para-Uralic and/or Proto-Uralic-speaking populations.
Consider the following qpAdm output, in which Levanluhta_IA is just barely modeled successfully as a two-way mixture between Levanluhta_IA_o and Bolshoy_Oleni_Ostrov. The statistical fit improves significantly with the addition of Glazkovo_EBA as a third mixture source. This is an ancient population from near Lake Baikal dated to 4597-3726 BC from the aforementioned Damgaard et al. paper.
Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.468±0.036
Levanluhta_IA_o 0.532±0.036
chisq 19.129
tail prob 0.0854706
Full output

Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.241±0.092
Glazkovo_EBA 0.162±0.059
Levanluhta_IA_o 0.597±0.046
chisq 7.756
tail prob 0.734966
Full output

For the sake of being complete, I also tested whether Levanluhta_IA_o could be substituted by other similar ancient samples from the neighborhood, including those associated with the Battle-Axe and Corded Ware cultures. There's not much to report; qpAdm returned poor statistical fits and/or implausible ancestry proportions (for the full output from my runs, see here). Baltic_IA did produce a statistically sound model, but with excess Glazkovo_EBA-related ancestry. I also had to drop Bolshoy_Oleni_Ostrov from the analysis to make things work, which suggests to me that the result shouldn't be taken too literally.
Levanluhta_IA
Baltic_IA 0.677±0.034
Glazkovo_EBA 0.323±0.034
chisq 8.547
tail prob 0.741095
Full output

So as far as I can see, the western ancestry in Levanluhta_IA is likely to be mostly of Germanic origin, and thus Indo-European, meaning that it's logical to look east, perhaps far to the east, for the source of its Uralic ancestry. This might seem like a complicated and uncertain task, considering that Levanluhta_IA could well be at least a thousand years younger than the first entry of Uralic speakers into Fennoscandia. However, take a look at what happens when I substitute Glazkovo_EBA with a variety of Uralic-speaking populations from around the Ural Mountains, which is where the Proto-Uralic homeland is generally considered to have been located.
Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.210±0.091
Khanty 0.283±0.090
Levanluhta_IA_o 0.507±0.035
chisq 7.007
tail prob 0.798532
Full output

Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.193±0.098
Levanluhta_IA_o 0.495±0.035
Mansi 0.312±0.100
chisq 7.884
tail prob 0.7237
Full output

Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.300±0.065
Levanluhta_IA_o 0.337±0.072
Mari 0.363±0.121
chisq 8.393
tail prob 0.677705
Full output

Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.238±0.084
Levanluhta_IA_o 0.553±0.036
Nenets 0.209±0.067
chisq 7.210
tail prob 0.78181
Full output

Levanluhta_IA
Bolshoy_Oleni_Ostrov 0.302±0.069
Levanluhta_IA_o 0.324±0.081
Udmurt 0.373±0.135
chisq 9.195
tail prob 0.60393
Full output

All of these models look great, and easily rival the best model with Glazkovo_EBA. Moreover, they make good sense in terms of linguistics. The only problem is that they're anachronistic, because the Uralic-speaking reference populations are younger than Levanluhta_IA. So I can't be certain that they reflect reality without corroboration from ancient DNA. It might turn out, for instance, that a Glazkovo_EBA-like population was already present somewhere deep in Europe before or during the time of Bolshoy_Oleni_Ostrov, while no such population existed around the Ural Mountains until the time of Levanluhta_IA.
By the way, it might be important to note that the present-day Finnish samples in my dataset can't be modeled as a mixture between Levanluhta_IA and Levanluhta_IA_o. But they can be modeled as a mixture between Baltic_IA and Levanluhta_IA. I don't know which part of Finland they're from exactly; probably all over the place, so it'd be useful to test regional Finnish populations to see how they behave in such models. Of course, Finns aren't Saamic speakers, they're Finnic speakers, and they're probably the result of a more recent Uralic expansion into Fennoscandia than the one that gave rise to the Saami.
Finnish
Baltic_IA 0.671±0.076
Levanluhta_IA 0.329±0.076
chisq 14.114
tail prob 0.293508
Full output

Damgaard et al. didn't report the Y-haplogroup for Baltic_IA, but the word round the campfire is that this individual belonged to N1c, which is today the most common Y-haplogroup among Uralic speakers. Obviously, we need a lot more ancient DNA to sort all of this out, but things are already looking pretty much as expected. Stay tuned for new posts in this series following the publication of more ancient DNA relevant to this fascinating topic.
See also...
How did Y-haplogroup N1c get to Bolshoy Oleni Ostrov?
The Uralic cline in the Global25
Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...
Labels:
ancient DNA,
Bolshoy Oleni Ostrov,
Corded Ware Culture,
CWC,
Fennoscandia,
Finnic,
Finno-Ugric,
Indo-European,
Levanluhta,
N-L1026,
N1c,
Northern Europe,
Proto-Uralic,
R1a-Z645,
Saami,
Siberia,
Uralic,
Urals
Saturday, September 22, 2018
Corded Ware people =/= Proto-Uralics (Tambets et al. 2018)
A new paper on the genetic structure of Uralic-speaking populations has appeared at Genome Biology (see here). It looks to me like the prelude to a forthcoming paleogenetics paper on the same topic that was discussed in the Estonian media recently (see here). Although not exactly groundbreaking (because it basically argues what I've been saying at this blog for years, like here), it's a very nice effort all round and must be read by anyone with an interest in this topic. From the paper, emphasis is mine:
Background: The genetic origins of Uralic speakers from across a vast territory in the temperate zone of North Eurasia have remained elusive. Previous studies have shown contrasting proportions of Eastern and Western Eurasian ancestry in their mitochondrial and Y chromosomal gene pools. While the maternal lineages reflect by and large the geographic background of a given Uralic-speaking population, the frequency of Y chromosomes of Eastern Eurasian origin is distinctively high among European Uralic speakers. The autosomal variation of Uralic speakers, however, has not yet been studied comprehensively.
Results: Here, we present a genome-wide analysis of 15 Uralic-speaking populations which cover all main groups of the linguistic family. We show that contemporary Uralic speakers are genetically very similar to their local geographical neighbours. However, when studying relationships among geographically distant populations, we find that most of the Uralic speakers and some of their neighbours share a genetic component of possibly Siberian origin. Additionally, we show that most Uralic speakers share significantly more genomic segments identity-by-descent with each other than with geographically equidistant speakers of other languages. We find that correlated genome-wide genetic and lexical distances among Uralic speakers suggest co-dispersion of genes and languages. Yet, we do not find long-range genetic ties between Estonians and Hungarians with their linguistic sisters that would distinguish them from their non-Uralic-speaking neighbours.
Conclusions: We show that most Uralic speakers share a distinct ancestry component of likely Siberian origin, which suggests that the spread of Uralic languages involved at least some demic component.
...
Recent aDNA studies have shown that extant European populations draw ancestry from three main migration waves during the Upper Palaeolithic, the Neolithic and Early Bronze Age [2, 3, 45]. The more detailed reconstructions concerning NE Europe up to the Corded Ware culture agree broadly with this scenario and reveal regional differences [65–67]. However, to explain the demographic history of extant NE European populations, we need to invoke a novel genetic component in Europe—the Siberian. The geographic distribution of the main part of this component is likely associated with the spread of Uralic speakers but gene flow from Siberian sources in historic and modern Uralic speakers has been more complex, as revealed also by a recent study of ancient DNA from Fennoscandia and Northwest Russia [68]. Thus, the Siberian component we introduce here is not the perfect but still the current best candidate for the genetic counterpart in the spread of Uralic languages.
Citation...
Tambets et al., Genes reveal traces of common recent demographic history for most of the Uralic-speaking populations, Genome Biology, (2018) 19:139 https://doi.org/10.1186/s13059-018-1522-1
See also...
Big deal of 2019: ancient DNA confirms the link between Y-haplogroup N and Uralic expansions




