Saturday, December 14, 2019

Avalon vs Valhalla revisited


Pictured below is a new version of my Celtic vs Germanic genetic map. It's based on the same Principal Component Analysis (PCA) as the original (which can be seen here), but more focused on Northwestern Europe and produced with a different program.


To see the interactive online version, navigate to Vahaduo Custom PCA and copy paste the text from here into the empty space under the PCA DATA tab. Then press the PLOT PCA button under the PCA PLOT tab. For more guidance, refer to the screen caps here and here.

To include a wider range of populations in the key, just edit the data accordingly. For instance, to break up the ancient grouping into more specific populations, delete the Ancient: prefix in all of the relevant rows. This is what you should see:


Conversely, you can leave the ancient sample set intact and instead reorder the present-day linguistic groupings into, say, geographic groupings. To achieve this just delete all of the linguistic prefixes, such as Celtic:, Germanic:, and so on. You should end up with a datasheet like this and plot like this.
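The prefix edits described above can also be scripted rather than done by hand. What follows is only a rough sketch: it assumes that each row of the PCA data is comma-separated text whose first field is the label, and that the group prefix sits at the very start of that label (as in Ancient: or Celtic:). The function name strip_prefixes and the sample rows are made up for illustration, so check the layout against the real datasheet before relying on it.

```python
def strip_prefixes(rows, prefixes):
    """Remove the listed group prefixes (e.g. 'Ancient:') from row labels.

    Assumes each row looks like 'label,coord1,coord2,...'. That layout is
    an assumption for this sketch, not something confirmed by the tool.
    """
    cleaned = []
    for row in rows:
        label, sep, coords = row.partition(",")
        for prefix in prefixes:
            if label.startswith(prefix):
                label = label[len(prefix):]
                break  # strip at most one prefix per row
        cleaned.append(label + sep + coords)
    return cleaned


# Hypothetical rows, invented purely to show the effect.
sample_rows = [
    "Ancient:Sample_A,0.012,-0.034",
    "Celtic:Sample_B,0.020,0.011",
    "Germanic:Sample_C,-0.015,0.027",
]

# Break up only the ancient grouping, leaving the linguistic groups intact.
ancient_split = strip_prefixes(sample_rows, ["Ancient:"])
print("\n".join(ancient_split))
```

Passing ["Celtic:", "Germanic:"] instead leaves the ancient rows alone and strips the linguistic prefixes, which corresponds to the second scenario.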

Of course, you can design your own plot by using any combination of the ancient and present-day individuals and populations that I've already run in this PCA. Their coordinates are listed here. Indeed, if you're in possession of your own Celtic vs Germanic PCA coordinates, you can add yourself to the plot. And if you're not, see here.

It's also possible to re-process PCA data via the SOURCE tab. But I don't recommend doing this with the Celtic vs Germanic data, which are derived from a fine scale analysis and don't pack much variation. On the other hand, Global25 data are ideal for such re-processing. I made the plots below from subsets of Global25 coordinates available in a zip file here. To see how, refer to the screen caps here and here.
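For readers curious about what re-processing a subset of coordinates can involve, here's a minimal sketch of a generic PCA run on a handful of rows. It isn't meant to reproduce whatever the SOURCE tab does internally, which I haven't verified. It simply centers the chosen rows and finds the two strongest axes of variation by power iteration. The function names and the three-dimensional toy coordinates are assumptions made for illustration; real Global25 rows have 25 columns.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(rows):
    """Subtract each column mean so the components describe variation only."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def leading_axis(rows, iterations=500):
    """Find the direction of greatest variance by power iteration."""
    dims = len(rows[0])
    # Uneven starting vector, to avoid starting at a right angle to the answer.
    vec = [1.0 / (j + 1) for j in range(dims)]
    start_norm = math.sqrt(dot(vec, vec))
    vec = [x / start_norm for x in vec]
    for _ in range(iterations):
        scores = [dot(row, vec) for row in rows]
        new = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dims)]
        norm = math.sqrt(dot(new, new))
        if norm == 0.0:
            break  # no variance left in the data
        vec = [x / norm for x in new]
    return vec


def reprocess(rows):
    """Return new (PC1, PC2) coordinates for a subset of samples."""
    centered = center(rows)
    axis1 = leading_axis(centered)
    pc1 = [dot(row, axis1) for row in centered]
    # Remove the PC1 signal, then look for the next strongest axis.
    residual = [[x - s * a for x, a in zip(row, axis1)]
                for row, s in zip(centered, pc1)]
    axis2 = leading_axis(residual)
    pc2 = [dot(row, axis2) for row in residual]
    return list(zip(pc1, pc2))


# Hypothetical three-dimensional coordinates standing in for Global25 rows.
subset = [
    [2.0, 0.0, 0.0],
    [-2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, -1.0, 0.0],
]
new_coords = reprocess(subset)
for pc1_value, pc2_value in new_coords:
    print(round(pc1_value, 3), round(pc2_value, 3))
```

The point of the exercise is that the new axes are fitted to the subset alone, so they can pull apart samples that sit close together when plotted on axes derived from a much broader dataset.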




See also...

Modeling your ancestry has never been easier

Getting the most out of the Global25

Modeling genetic ancestry with Davidski: step by step

Monday, December 9, 2019

The BOO people: earliest Uralic speakers in the ancient DNA record?


N-L1026 is the Y-chromosome haplogroup most closely associated with the speakers of Uralic languages. Thus far, the oldest published instances of N-L1026 are in two Siberian-like samples dating to 1473±87 calBCE from the site of Bolshoy Oleni Ostrov (BOO), located within the Arctic Circle in the Kola Peninsula, northern Russia.

So does this mean that the BOO people were Uralic speakers? I'm now thinking that it probably does, even though, as the scientists who published the BOO samples a year ago pointed out, they predate most estimates of the spread of extant Uralic languages into the Kola Peninsula (see Lamnidis et al. here).

Hundreds of ancient human samples from across Eurasia have been sequenced since last year. In fact, thousands if we count unpublished data. But only a handful of them belong to N-L1026.

Indeed, as far as I know, the next oldest instance of N-L1026 from Europe after those at BOO is still in an Iron Age sample from what is now Estonia published earlier this year as OLS10. Of course, this individual was in all likelihood an early west Uralic (Finnic) speaker (see Saag et al. here).

Moreover, consider these comments by Murashkin et al. regarding the BOO site (referred to as KOG in their paper, available here):

Most of the bodies had been buried in wooden, boat-shaped, lidded caskets, which looked like small boats or traditional Sámi sledges (Ru. kerezhka).

...

The morphological characteristics of the skull series of the KOG are not like those of any other ancient or modern series from the Kola Peninsula, including the Sámi people. Instead, the series shows closer biological affinities with ancient Altai Neolithic and modern, Ugric-speaking Siberian groups (Moiseyev & Khartanovich 2012). It has earlier been suggested that modern Ugric-speaking Siberians, together with Samoyeds and Volga Finnic populations, share some common morphological characteristics that indicate their common origin (Alekseyev 1974; Bunak 1956; Gokhman 1992).

...

Based on the materials from the grave field, we can argue that there were direct or indirect contacts between the inhabitants of the Kola Peninsula and southern and western Scandinavia (Murashkin & Tarasov 2013).

Thus, the BOO people may have spoken an early west Uralic language related to Sami languages. It's also possible that they are in part ancestral to the N-L1026-rich Sami people.

Another intriguing thing about these mysterious ancients is that individual BOO003 belongs to the rare mitochondrial haplogroup T2d1b1. Now, this clearly is not a lineage native to Europe or indeed any part of North Eurasia. Its ultimate source is probably West or Central Asia. So how did this pioneer polar explorer end up with such an unusual and exotic mtDNA marker, and might the answer be an important clue about the origins of the BOO people?

The most plausible explanation is that the ancestors of BOO003 were associated with the Seima-Turbino phenomenon, which stretched from the taiga zone to the oases of what is now western China along the Ob-Irtysh river system, and probably facilitated cultural, linguistic and genetic exchanges between the populations of North Eurasia and Central Asia.

In other words, considering all of the clues, it would seem that the BOO people came from some part of the Ob-Irtysh basin, which might thus be the best place to look for the population with the oldest and phylogenetically most basal N-L1026 lineages. And if we find that, then we've probably found the proto-Uralians and their homeland.


Below is a Principal Component Analysis (PCA) based on Global25 data featuring the earliest likely Uralic speakers in the ancient DNA record. It was produced with an online PCA runner freely available here. EST_IA includes the above-mentioned OLS10, while FIN_Levanluhta_IA is largely made up of Sami-related samples from western Finland. See anything interesting? Feel free to let me know about it in the comments below.


See also...

Big deal of 2019: ancient DNA confirms the link between Y-haplogroup N and Uralic expansions

It was always going to be this way

More on the association between Uralic expansions and Y-haplogroup N

Sunday, December 1, 2019

Big deal of 2019: ancient DNA confirms the link between Y-haplogroup N and Uralic expansions


The academic consensus is that Indo-European languages first spread into the Baltic region from the Eastern European steppes along with the Corded Ware culture (CWC) and its people during the Late Neolithic, well before the expansion of Uralic speakers into Fennoscandia and surrounds, probably from somewhere around the Ural Mountains.

On the other hand, the views that the Uralic language family is native to Northern Europe and/or closely associated with the CWC are fringe theories usually espoused by people not familiar with the topic or, unfortunately it has to be said, mentally unstable trolls.

The likely close relationship between the CWC expansion and the early spread of Indo-European languages was discussed in several papers in recent years (for instance, see here). This year, we saw the first ancient DNA paper focusing on the transition from the Bronze Age to the Iron Age in the East Baltic, including the likely first arrival of Uralic speech in what is now Estonia.

Published in Current Biology courtesy of Saag et al., the paper showed that the genetic structure of present-day East Baltic populations largely formed in the Iron Age (see here). It was during this time, the authors revealed, that the region experienced a sudden influx of Y-chromosome haplogroup N, which is today common in many Uralic speaking populations and often referred to as a Proto-Uralic marker. Little wonder then that Saag et al. linked this genetic shift in the East Baltic to the westward migrations of early Uralic speakers.

The table below, based on data from the Saag et al. paper, surely doesn't leave much to the imagination about what happened.


Unfortunately, I have to say that the genome-wide analysis in the paper was less informative than it could have been. The authors focused their attention on rather broad genetic components, and, as a result, missed an interesting fine scale distinction between their Bronze Age and Iron Age samples. The spatial maps below, based on my Global25 data for most of the ancients from Saag et al., show what I mean. The hotter the color, the higher the genetic similarity between the ancient samples and present-day West Eurasian populations.

Note that the Bronze Age (Baltic_EST_BA) samples are most similar to the Baltic-speaking, and thus also Indo-European-speaking, Latvians and Lithuanians, rather than the Uralic-speaking Estonians, even though they're from burial sites in Estonia. On the other hand, the Iron Age (Baltic_EST_IA) samples show strong similarity to a wider range of populations, including Estonians and many other Uralic-speaking groups.
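One simple way to put a number on this kind of similarity is to rank reference populations by their Euclidean distance from a target in coordinate space. The sketch below does just that. It's an assumption on my part that plain Euclidean distance is a reasonable stand-in here, and it isn't a description of how the maps were made. The population labels and the three-column coordinates are invented, so swap in real Global25 rows to get meaningful output.

```python
import math


def distance(a, b):
    """Euclidean distance between two coordinate vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(target, populations):
    """Return (name, distance) pairs, closest population first."""
    pairs = [(name, distance(target, coords))
             for name, coords in populations.items()]
    return sorted(pairs, key=lambda pair: pair[1])


# Hypothetical coordinates, three columns instead of the real 25.
target = (0.10, 0.02, -0.01)
populations = {
    "Pop_A": (0.30, 0.00, 0.05),
    "Pop_B": (0.11, 0.03, -0.01),
    "Pop_C": (0.05, -0.04, 0.02),
}

ranking = rank_by_similarity(target, populations)
for name, dist in ranking:
    print(f"{name}: {dist:.4f}")
```

A smaller distance means higher similarity, so the population at the top of the list is the one that would show the hottest color for this target.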




See also...

It was always going to be this way

Fresh off the sledge

More on the association between Uralic expansions and Y-haplogroup N

Monday, November 25, 2019

Viking Age Iceland


I finally managed to get some of the Icelandic ancients from Ebenesersdóttir et al. 2018 into the Global25 datasheets (see here). Better late than never. Look for the "ISL_Viking_Age" prefix. Below is a screen cap of a Principal Component Analysis (PCA) with the new samples. It was done with an online Global25 PCA runner freely available here.


The individuals classified by Ebenesersdóttir et al. as unadmixed Gaels and Norse generally also look that way in terms of their Global25 coordinates.

The mixture models below, using all of the populations from the Global25 "modern pop averages scaled" datasheet, were run with an online tool freely available here. Note that the ADD DIST COL option is set to 1X. This is a useful feature for modeling the fine scale ancestry of samples that are derived from very similar populations.
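To give a feel for what a mixture model is doing with coordinates, here's a bare-bones two-source version. It looks for the blend of two reference populations that lands closest to the target. This is a generic least-squares sketch under my own assumptions, not the algorithm used by the online tool, and it doesn't attempt to reproduce the ADD DIST COL option. The function name and the two-column toy coordinates are hypothetical.

```python
import math


def two_way_mixture(target, source_a, source_b):
    """Best-fitting blend of two sources for a target.

    Returns (weight_a, weight_b, fit_distance), with both weights kept
    between 0 and 1 and summing to 1.
    """
    diff = [a - b for a, b in zip(source_a, source_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("the two sources have identical coordinates")
    # Unconstrained least-squares weight for source_a, then clip to [0, 1].
    raw = sum((t - b) * d for t, b, d in zip(target, source_b, diff)) / denom
    weight_a = min(1.0, max(0.0, raw))
    model = [weight_a * a + (1.0 - weight_a) * b
             for a, b in zip(source_a, source_b)]
    fit = math.sqrt(sum((t - m) ** 2 for t, m in zip(target, model)))
    return weight_a, 1.0 - weight_a, fit


# Hypothetical coordinates, two columns instead of the real 25.
source_a = (0.10, 0.00)
source_b = (0.00, 0.10)
target = (0.07, 0.03)

weight_a, weight_b, fit = two_way_mixture(target, source_a, source_b)
print(f"source_a {weight_a:.3f}, source_b {weight_b:.3f}, distance {fit:.4f}")
```

When the sources are very similar to each other, many different blends fit almost equally well, which is why an extra constraint of some sort can help with fine scale models.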






See also...

They came, they saw, and they mixed

Commoner or elite?

Who were the people of the Nordic Bronze Age?

Sunday, November 10, 2019

Open analysis and discussion thread: Etruscans, Latins, Romans and others


I've just added coordinates for more than 100 ancient genomes from the recently published Antonio et al. ancient Rome paper to the Global25 datasheets. Look for the population and individual codes listed here. Same links as always:

Global25 datasheet ancient scaled

Global25 pop averages ancient scaled

Global25 datasheet ancient

Global25 pop averages ancient

Thus far I've only managed to check a handful of the coordinates, so please let me know if you spot any issues. Below is a Principal Component Analysis (PCA) featuring the Etruscan and Italic speakers. I ran the PCA with an online tool specifically designed for Global25 coordinates freely available here.


Can we say anything useful about the origins of the Etruscan and early Italic populations thanks to these new genomes? Also, to reiterate my question from the last blog post, what are the genetic differences exactly between the Etruscans, early Latins, Romans and present-day Italians? Feel free to let me know in the comments below.

Update 13/11/2019: Here's another, similar PCA. This one, however, is based on genotype data, and it also highlights many more of the samples from the Antonio et al. paper. Considering these results, I'm tempted to say that the present-day Italian gene pool largely formed in the Iron Age, and that it was only augmented by population movements during later periods. The relevant datasheet is available here.


Update 13/11/2019: It seems to me that the two Latini-associated outliers show significant ancestry from the Levant, which possibly means that they're in part of Phoenician origin. These qpAdm models speak for themselves:

ITA_Ardea_Latini_IA_o
ITA_Proto-Villanovan 0.547±0.081
Levant_ISR_Ashkelon_IA2 0.453±0.081
chisq 7.573
tail prob 0.87027
Full output

ITA_Prenestini_tribe_IA_o
ITA_Proto-Villanovan 0.679±0.068
Levant_ISR_Ashkelon_IA2 0.321±0.068
chisq 7.222
tail prob 0.89033
Full output

The Proto-Villanovan singleton is also a key part of the models. Dating to the Bronze Age/Iron Age transition, she appears to be of western Balkan origin. Moreover, her steppe ancestry is probably derived directly from the Yamnaya horizon.

ITA_Proto-Villanovan
HRV_Vucedol 0.677±0.031
Yamnaya_RUS_Samara 0.323±0.031
chisq 10.397
tail prob 0.661174
Full output

The cluster made up of four early Italic speakers can be modeled with minor Proto-Villanovan-related ancestry, but, perhaps crucially, it doesn't need to be. Indeed, judging by the qpAdm output below, it's possible that almost all of its steppe ancestry came from the Bell Beaker complex, and, thus, the Corded Ware culture complex before that.

ITA_Italic_IA
Bell_Beaker_Mittelelbe-Saale 0.480±0.055
ITA_Grotta_Continenza_CA 0.411±0.042
ITA_Proto-Villanovan 0.109±0.084
chisq 10.294
tail prob 0.590205
Full output
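A quick way to see why the Proto-Villanovan-related ancestry in this model is optional is to divide each coefficient by its standard error. This is just a rule of thumb applied to the figures listed above, not something that qpAdm reports in this form, and the helper name is made up.

```python
# Coefficients and standard errors copied from the ITA_Italic_IA model above.
model = {
    "Bell_Beaker_Mittelelbe-Saale": (0.480, 0.055),
    "ITA_Grotta_Continenza_CA": (0.411, 0.042),
    "ITA_Proto-Villanovan": (0.109, 0.084),
}


def z_scores(coefficients):
    """Coefficient divided by its standard error, per source."""
    return {name: estimate / error
            for name, (estimate, error) in coefficients.items()}


scores = z_scores(model)
for name, z in scores.items():
    print(f"{name}: {z:.1f} standard errors from zero")
```

The Bell Beaker and Grotta Continenza coefficients are roughly 8.7 and 9.8 standard errors from zero, while the Proto-Villanovan coefficient is only about 1.3, so the data can't clearly distinguish it from no contribution at all.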

Two out of the three available Etruscans look very similar to the Italic speakers in the above PCA plots, and yet they show a lot more Proto-Villanovan-related ancestry in my qpAdm run. The statistical fit is also relatively poor, perhaps suggesting that something important is missing.

ITA_Etruscan
Bell_Beaker_Mittelelbe-Saale 0.186±0.081
ITA_Grotta_Continenza_CA 0.283±0.064
ITA_Proto-Villanovan 0.531±0.126
chisq 17.175
tail prob 0.143143
Full output

Interestingly, the Etruscan outlier with significant North African admixture (proxied in my run by MAR_LN) doesn't need to be modeled with any Bell Beaker ancestry.

ITA_Etruscan_o
ITA_Proto-Villanovan 0.675±0.057
MAR_LN 0.325±0.057
chisq 14.864
tail prob 0.315912
Full output

Update 17/11/2019: The spatial maps below show how three groups of ancient Romans (from the Imperial, Late Antiquity and Medieval periods) compare to present-day West Eurasian populations in terms of their Global25 coordinates. The hotter the color, the higher the similarity. More here.




See also...

Getting the most out of the Global25

Thursday, November 7, 2019

What's the difference between ancient Romans and present-day Italians?


The first paper on the genomics of ancient Romans was finally published today at Science [LINK]. It's behind a paywall, but the supplementary info is freely available here. Below is a quick summary of the results courtesy of the accompanying Ancient Rome Data Explorer.



I'm told that the genotype data from the paper will be online within a day or so at the Pritchard Lab website here. I'll have a lot more to say about ancient Romans and present-day Italians after I get my hands on it.

See also...

Open analysis and discussion thread: Etruscans, Latins, Romans and others

Tuesday, November 5, 2019

Modeling your ancestry has never been easier


An exceedingly simple, yet feature-packed, online tool ideal for modeling ancestry with Global25 coordinates is freely available HERE. It works offline too, after downloading the web page onto your computer. Just copy paste the coordinates of your choice under the "source" and "target" tabs, and then mess around with the buttons to see what happens. The screen caps below show me doing just that.






Another free, easy to use online tool that works with Global25 coordinates is the Principal Component Analysis (PCA) runner HERE. Below is a screen cap of me checking out one of the many PCA plots that it offers.


See also...

Getting the most out of the Global25

Wednesday, October 16, 2019

The Battle Axe people came from the steppe (Malmstrom et al. 2019)


It's been obvious for a while now that the Corded Ware culture (CWC) and its Scandinavian variant, the Battle Axe culture (BAC), originated on the Pontic-Caspian steppe. However, Malmstrom et al. drive the point home in a new open access paper at Proceedings B [LINK]. From the paper, emphasis is mine:

The Neolithic period is characterized by major cultural transformations and human migrations, with lasting effects across Europe. To understand the population dynamics in Neolithic Scandinavia and the Baltic Sea area, we investigate the genomes of individuals associated with the Battle Axe Culture (BAC), a Middle Neolithic complex in Scandinavia resembling the continental Corded Ware Culture (CWC). We sequenced 11 individuals (dated to 3330–1665 calibrated before common era (cal BCE)) from modern-day Sweden, Estonia, and Poland to 0.26–3.24× coverage. Three of the individuals were from CWC contexts and two from the central-Swedish BAC burial ‘Bergsgraven’. By analysing these genomes together with the previously published data, we show that the BAC represents a group different from other Neolithic populations in Scandinavia, revealing stratification among cultural groups. Similar to continental CWC, the BAC-associated individuals display ancestry from the Pontic–Caspian steppe herders, as well as smaller components originating from hunter–gatherers and Early Neolithic farmers. Thus, the steppe ancestry seen in these Scandinavian BAC individuals can be explained only by migration into Scandinavia. Furthermore, we highlight the reuse of megalithic tombs of the earlier Funnel Beaker Culture (FBC) by people related to BAC. The BAC groups likely mixed with resident middle Neolithic farmers (e.g. FBC) without substantial contributions from Neolithic foragers.
...

By contrast, the CWC individuals from Obłaczkowo in Poland (poz44 and poz81) show an extremely high proportion of steppe ancestry (greater than 90%), which is different from the later CWC-associated individuals excavated in Pikutkowo (Poland) [23], but similar to some other CWC-associated individuals from Germany, Lithuania, and Latvia [2,8,31]. Interestingly, these individuals with a large fraction of steppe ancestry have typically been dated to more than 2600 BCE, making them among the earliest CWC individuals genetically investigated. This observation, i.e. early CWC individuals resembled (genetically) Yamnaya-associated individuals, while later CWC groups show higher levels of European Neolithic farmer ancestry (Pearson's correlation coefficient: −0.51, p = 0.006) (figure 2), suggests an initial dispersal that occurred rapidly.

See also...