
Monday, April 26, 2021

Uralians of the Sargat horizon

Many years ago, well before the start of the ancient DNA revolution, someone made the very clever inference that the N-Tat Y-chromosome marker was closely associated with the expansion of Uralic languages.

Since then, N-Tat has been renamed several times over, to the point that I no longer know what it's called, but the aforementioned inference has turned into a very solid consensus backed up by a wide range of studies focusing on modern and ancient DNA.

Nowadays, Y-haplogroup N-L1026, a subclade of N-Tat, is seen as the main genetic signal of the Uralic expansions, along, of course, with Nganasan-related genome-wide genetic ancestry.

A recent paper at Science Advances by Gnecchi-Ruscone et al. featured the first ever genome-wide samples from the Sargat horizon, which is an Iron Age archeological formation in western Siberia normally associated with the Ugric branch of the Uralic language family. Surprisingly, and disappointingly, the authors failed to investigate this widely accepted connection.

If we go by the Y-haplogroup classifications in the paper, which may or may not be the smart thing to do, at least two of the Sargat horizon males belong to N-L1026, and one also to the more derived N-Z1936 subclade, which has been found in the remains of Hungarian Conquerors from Medieval Hungary. Of course, Hungarian is an Ugric language generally thought to have been introduced into the Carpathian Basin by the Hungarian Conquerors, who originally came from western Siberia.

That's probably enough to corroborate the association between the Sargat horizon and the spread of Ugric/Uralic languages, but let's also take a quick look at the autosomal DNA of these Sargat individuals. Firstly, here's a Principal Component Analysis (PCA), based on Global25 data and produced with the Vahaduo G25 Views online tool. The results are self-explanatory.

Interestingly, I can't get a decent statistical fit when I try to reproduce the four-way qpWave/qpAdm model done by Gnecchi-Ruscone et al., probably mostly because my right pops or outgroups are different. This suggests to me that there's something important missing in their model.

MNG_Khovsgol_LBA 0.213±0.044
RUS_Ekven_IA 0.173±0.043
RUS_Sintashta_MLBA 0.548±0.014
TKM_Gonur1_BA 0.065±0.013
chisq 17.387
tail prob 0.0150625
Full output

So how about if I replace RUS_Ekven_IA with kra001, the oldest Nganasan-like individual in the ancient DNA record (see here), and MNG_Khovsgol_LBA with KAZ_Mereke_MBA, to add a more local stream of ancestry?

KAZ_Mereke_MBA 0.134±0.016
kra001 0.300±0.007
RUS_Sintashta_MLBA 0.503±0.023
TKM_Gonur1_BA 0.062±0.015
chisq 10.472
tail prob 0.163387
Full output

That's a better statistical fit and also, I'd say, a more realistic model, at least in terms of distal ancestry proportions. Note that Nganasan-related ancestry makes up 30% of the genome-wide genetic structure of the Sargat samples, which again corroborates the view that Uralic languages were spoken within the Sargat horizon.
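For anyone who wants to check the tail probs without rerunning qpAdm, here's a minimal Python sketch. It assumes that the tail prob is the upper tail of a chi-square distribution and that both models above have 7 degrees of freedom. The degrees of freedom aren't stated in this post; 7 is simply the value that reproduces the reported numbers. The function name chi2_tail is hypothetical.

```python
import math


def chi2_tail(chisq, dof):
    """Upper-tail probability of a chi-square distribution (closed form, integer dof)."""
    if chisq < 0 or dof < 1:
        raise ValueError("chisq must be >= 0 and dof must be a positive integer")
    half = chisq / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum over j < dof/2 of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum of x^(j-1/2) / (1*3*...*(2j-1))
    total = 0.0
    term = math.sqrt(chisq)  # j = 1 term: x^(1/2)
    for j in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= chisq / (2 * j + 1)
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# ASSUMPTION: 7 degrees of freedom, inferred from the reported chisq / tail prob pairs.
ASSUMED_DOF = 7
for label, chisq in [("paper's four-way model", 17.387), ("kra001 + Mereke model", 10.472)]:
    print(f"{label}: chisq {chisq}, tail prob ~ {chi2_tail(chisq, ASSUMED_DOF):.4f}")
```

Under that assumption the two values should land very close to the reported 0.0150625 and 0.163387. The first is below the usual 0.05 cutoff, while the second is comfortably above it, which is all I mean by a better statistical fit.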

Update 28/04/21: This is the best qpAdm model that I could find for Sargat_IA, at least in terms of the chisq and tail prob. It shows that the Sargat population was in large part very similar to that of KAZ_Pazyryk_IA.

KAZ_Mereke_MBA 0.033±0.016
KAZ_Pazyryk_IA 0.695±0.015
RUS_Sintashta_MLBA 0.240±0.022
TKM_Gonur1_BA 0.032±0.014

chisq 2.165
tail prob 0.950146
Full output

It's missing kra001, because KAZ_Pazyryk_IA packs enough kra001-related ancestry for the job, as this model of KAZ_Pazyryk_IA shows:

KAZ_Mereke_MBA 0.143±0.018
kra001 0.430±0.008
RUS_Sintashta_MLBA 0.380±0.026
TKM_Gonur1_BA 0.047±0.018

chisq 8.872
tail prob 0.261943
Full output

The fact that KAZ_Pazyryk_IA can be modeled with significant kra001-related ancestry isn't surprising, considering that its territory was located in Siberia. However, my model doesn't necessarily prove that the Sargat population was largely or even partly of Pazyryk origin. Indeed, N-L1026 hasn't yet appeared in any Pazyryk remains.
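As a rough sanity check on the idea that KAZ_Pazyryk_IA carries the kra001-related ancestry into the Sargat model, the Pazyryk share can be expanded into its own sources. This Python sketch assumes that the last table above is a model of KAZ_Pazyryk_IA, and it treats the point estimates as exact, ignoring the standard errors. The variable and function names are hypothetical.

```python
# Sargat_IA modeled with KAZ_Pazyryk_IA as one of the sources (update of 28/04/21).
sargat_via_pazyryk = {
    "KAZ_Mereke_MBA": 0.033,
    "KAZ_Pazyryk_IA": 0.695,
    "RUS_Sintashta_MLBA": 0.240,
    "TKM_Gonur1_BA": 0.032,
}

# ASSUMPTION: this is the model of KAZ_Pazyryk_IA itself (the last table above).
pazyryk = {
    "KAZ_Mereke_MBA": 0.143,
    "kra001": 0.430,
    "RUS_Sintashta_MLBA": 0.380,
    "TKM_Gonur1_BA": 0.047,
}


def expand_source(model, source, source_model):
    """Replace one source in a mixture model with that source's own mixture."""
    expanded = {name: frac for name, frac in model.items() if name != source}
    share = model[source]
    for name, frac in source_model.items():
        expanded[name] = expanded.get(name, 0.0) + share * frac
    return expanded


implied = expand_source(sargat_via_pazyryk, "KAZ_Pazyryk_IA", pazyryk)
for name in sorted(implied):
    print(f"{name}: {implied[name]:.3f}")
```

The implied kra001 share is 0.695 x 0.430, or about 0.30, which is in line with the 0.300 from the direct kra001 model of the Sargat samples. The other three implied shares also fall within about half a percent of that model's coefficients, so the two ways of modeling Sargat_IA tell much the same story in terms of distal ancestry.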

See also...

The Uralic cline with kra001 - no projection this time

First taste of Early Medieval DNA from the Ural region

Hungarian Conquerors were rich in Y-haplogroup N

More on the association between Uralic expansions and Y-haplogroup N

It was always going to be this way

On the association between Uralic expansions and Y-haplogroup N

Thursday, April 22, 2021

The history of the Scythians (Gnecchi-Ruscone et al. 2021)

Over at Science Advances at this LINK. Many of the samples from this paper are in the Global25 datasheets. Look for the relevant population and individual IDs from the paper.

The Scythians were a multitude of horse-warrior nomad cultures dwelling in the Eurasian steppe during the first millennium BCE. Because of the lack of first-hand written records, little is known about the origins and relations among the different cultures. To address these questions, we produced genome-wide data for 111 ancient individuals retrieved from 39 archaeological sites from the first millennia BCE and CE across the Central Asian Steppe. We uncovered major admixture events in the Late Bronze Age forming the genetic substratum for two main Iron Age gene-pools emerging around the Altai and the Urals respectively. Their demise was mirrored by new genetic turnovers, linked to the spread of the eastern nomad empires in the first centuries CE. Compared to the high genetic heterogeneity of the past, the homogenization of the present-day Kazakhs gene pool is notable, likely a result of 400 years of strict exogamous social rules.

Gnecchi-Ruscone et al. 2021, Ancient genomic time transect from the Central Asian Steppe unravels the history of the Scythians, Science Advances 7 (13), eabe4414, DOI: 10.1126/sciadv.abe4414

See also...

Uralians of the Sargat horizon

Wednesday, April 7, 2021

The Bacho Kiro surprise (Hajdinjak et al. 2021)

Over at Nature at this LINK. The paper focuses on Neanderthal ancestry in Initial Upper Paleolithic (IUP) humans from what is now Bulgaria. But, to me, much more interesting is the claim by its authors that present-day East Asians harbor ancient European, or, at least, European-related ancestry. From the paper, emphasis is mine:

When we explored models of population history that are compatible with the observations above using admixture graphs [28], we found that the IUP Bacho Kiro Cave individuals were related to populations that contributed ancestry to the Tianyuan individual in China as well as, to a lesser extent, to the GoyetQ116-1 and Ust’Ishim individuals (all |Z| < 3; Fig. 2d, Supplementary Information 6). This resolves the previously unclear relationship between the GoyetQ116-1 and Tianyuan individuals [13] without the need for gene flow between these two geographically distant individuals.


In conclusion, the Bacho Kiro Cave genomes show that several distinct modern human populations existed during the early Upper Palaeolithic in Eurasia. Some of these populations, represented by the Oase1 and Ust’Ishim individuals, show no detectable affinities to later populations, whereas groups related to the IUP Bacho Kiro Cave individuals contributed to later populations with Asian ancestry as well as some western Eurasian humans such as the GoyetQ116-1 individual in Belgium. This is consistent with the fact that IUP archaeological assemblages are found from central and eastern Europe to present-day Mongolia [5,15,16] (Fig. 1), and a putative IUP dispersal that reached from eastern Europe to East Asia. Eventually populations related to the IUP Bacho Kiro Cave individuals disappeared in western Eurasia without leaving a detectable genetic contribution to later populations, as indicated by the fact that later individuals, including BK1653 at Bacho Kiro Cave, were closer to present-day European populations than to present-day Asian populations [29,30].

Hajdinjak, M., Mafessoni, F., Skov, L. et al. Initial Upper Palaeolithic humans in Europe had recent Neanderthal ancestry. Nature 592, 253–257 (2021).

See also...

Ust'-Ishim belongs to K-M526

Wednesday, March 31, 2021

Against the conventional wisdom

I've read some very strange theories over the years trying to explain who was responsible for the so-called Caucasus/Iranian-related ancestry in the Yamnaya people.

Proto-Indo-European speaking farmers from what is now Iran? How about Uruk invaders from Mesopotamia? No, wait, they were migrants from India who spoke Sanskrit. Haha.

Nope, it seems that hunter-gatherers rich in this type of ancestry lived north of the Caucasus already during the so-called Pottery Neolithic or even the Mesolithic. That's the impression that I'm getting from watching the clip HERE.

This is basically also the idea that I gradually developed at this blog during the last few years, following common sense and logic, but totally against the conventional wisdom with regard to this topic. For instance, see here...

But here's my prediction: Steppe_EMBA only has 10-15% admixture from the post-Mesolithic Near East not including the North Caucasus, and basically all of this comes via female mediated gene flow from farming communities in the Caucasus and perhaps present-day Ukraine.
Modeling Steppe_EMBA

Of course, I could've done better with many of the details in my posts, like the dates and archeological links. But hey, at least I was smart enough to ignore the conventional wisdom.

I can't wait for the new ancient samples from the Pontic-Caspian steppe that David Anthony featured in his talks recently. Once I have them we'll be able to work out the details here for ourselves.

See also...

Ahead of the pack

Ancient DNA vs Ex Oriente Lux

Understanding the Eneolithic steppe

Monday, March 29, 2021

Khvalynsk is now out of the picture

The population associated with the Khvalynsk culture was not ancestral to the Yamnaya people. Archeologist David Anthony says so HERE. So where did the Yamnaya and Corded Ware populations come from? Anthony doesn't know yet.

See also...

Understanding the Eneolithic steppe

Ancient DNA vs Ex Oriente Lux

A final note for the year

Sunday, March 14, 2021

How the Shirenzigou nomads became Proto-Tocharians

A couple of years ago, the authors of a paper about a group of Iron Age nomads from the site of Shirenzigou, in the eastern Tian Shan, made a mistake. They wrongly assigned two of these nomads to Y-haplogroup R1b-M269.

This faux pas made them believe that the Shirenzigou nomads were closely related to the M269-rich population associated with the Afanasievo culture.

Indeed, since the Afanasievo culture was often credited with the spread of Tocharian languages to the Tarim Basin, these authors, led by Chao Ning, also concluded that the Shirenzigou nomads were potentially the missing link between the Afanasievo culture and the Tocharians (see here).

Moreover, Ning et al. used formal statistics to argue that the Shirenzigou nomads harbored Afanasievo-related genome-wide ancestry, rather than Sintashta-related genome-wide ancestry, despite the fact that the latter ancestry was widespread in the Tian Shan and surrounds during the Bronze and Iron ages. Soon after, another group of authors, led by Chuan-Chao Wang, also went out of their way to link the Shirenzigou nomads to the Afanasievo people by running formal statistics on genome-wide DNA (see here).

Interestingly, one of the Shirenzigou nomads belongs to Y-haplogroup R1a-Z93, which is an obvious Sintashta-related lineage. Both Ning et al. and Wang et al. missed this important fact.

They also missed the key fact that the R1b lineage found in the Shirenzigou nomads actually belongs to a native Central Asian subclade, which is only very distantly related to the originally Eastern European R1b-M269.

Now, formal stats are a very useful tool for studying genome-wide ancestry. But they're not infallible, and that's actually something of an understatement. Indeed, if you don't run sanity checks when using formal stats, you're likely to come to some unusual, even arse about face, conclusions. Uniparental markers, like Y-chromosome haplogroups, can provide a robust sanity check when running formal stats on genome-wide data.

One problem with formal stats is that Sintashta-related ancestry often looks very much like Afanasievo-related ancestry when it's mixed with indigenous Central Asian ancestry. Basically, the reason why this happens is that the Central Asian ancestry dampens the Early European Farmer (EEF) signal in the Sintashta-related ancestry.

This is an artifact that once caused scientists at Harvard to believe that Central Asian Scythians and present-day South Asians lacked Sintashta-related ancestry.

Unfortunately, since the publication of the Ning et al. paper, a consensus has emerged in academia that the Shirenzigou nomads are indeed the missing link between the Afanasievo culture and the Tocharians. But, let's be objective and honest here, it's a consensus based on nothing more than a comedy of errors.

On the other hand, most of the commentators at this blog and I have formed opinions about the Shirenzigou nomads that are totally at odds with the academic consensus, namely that:

- they're a complex mixture of Sintashta-related, indigenous Central Asian and Tibetan-related ancestries, with no clear, unambiguous signal of Afanasievo-related ancestry

- they weren't the speakers of Proto-Tocharian or even related in any specific way to the Tocharians

- they were probably the speakers of a now extinct Indo-Iranian language, and, at least based on geographic proximity, possibly related to the Yuezhi.

Feel free to make up your own mind. But for me, the question of how Tocharian languages ended up in the Tarim Basin remains wide open. I admit though, I'm currently quite partial to the idea floated here by commentator Copper Axe that the Chemurchek culture may have had something to do with it.

See also...

Don't believe everything you read in peer reviewed papers

Saturday, February 13, 2021

The Uralic cline with kra001 - no projection this time

A whole lot of nonsense was posted online, often by people who should've known better, after I claimed that kra001 was a solid proxy for a proto-Uralic genome (see here).

For those of you who still don't get it, below are three Principal Component Analysis (PCA) plots featuring Uralic speakers and other present-day Eurasians. Kra001 is also there. These graphs are based on genotype data, not reprocessed Global25 data. The relevant datasheet is available here.

Compared to my previous PCA with kra001, here I included a bigger range of East Eurasian populations to help mitigate the effects of extreme genetic drift in some of the Siberian groups, at least on the first few Principal Components (PCs). Moreover, kra001 wasn't projected onto PCs computed with modern-day samples, so he was free to influence the outcome of the PCA.

Note the east to west clines made up largely of Uralic speaking groups on the first two plots. These plots are based on PCs 1/2 and 1/3, respectively. The third plot, based on PCs 1/4, is more complex and thus more difficult to interpret, but it also manages to isolate many of the Uralic populations from the others.

The Uralic-specific clines do intersect with the clines and clusters formed by the other linguistic groups. However, based on the three plots, the Yeniseian-speaking Kets are the only Asian group that can plausibly be mistaken for Uralic speakers.

Importantly, apart from the Kets, kra001 is the only Asian individual who shifts his position on all three plots as if he were a Uralic speaker. This might well be a coincidence, and we'll never know what language was spoken by kra001, but it does suggest to me that his genome is a solid proxy for a proto-Uralic genome.

See also...

First taste of Early Medieval DNA from the Ural region

The BOO people: earliest Uralic speakers in the ancient DNA record?

Fresh off the sledge

Friday, February 5, 2021

Finally, a proto-Uralic genome

Obviously, genes don't speak languages, people do. But sometimes it's possible to associate a linguistic group with a very specific genetic signature.

A while ago many of us in the blogosphere spotted an uncanny connection between the Uralic language family, Y-haplogroup N-L1026 and Nganasan-like genome-wide genetic ancestry.

As a result, we expected a Nganasan-like population rich in N-L1026 to eventually appear in the ancient DNA record, probably somewhere in Siberia and in burials from a likely proto-Uralic archeological culture. This hasn't happened yet, but we now have direct evidence that such a population must have existed somewhere deep in Siberia as early as the Bronze Age.

Kra001, whose genome was published recently in Kilinc et al., belongs to a pre-N-L1026 lineage and, at least in terms of genome-wide genetic structure, could well be from a population directly ancestral to present-day Nganasans. Of course, the Nganasan language is part of the Samoyedic branch of Uralic.

Below is a series of Principal Component Analyses (PCA) featuring kra001. He's labeled RUS_Krasnoyarsk_BA, after the location and age of his burial. Note the obvious Uralic cline running across the plots. That is, from west to east. Kra001 is positioned at the end of this cline very close to a small cluster of Nganasans. To see interactive versions of the plots, paste the Global25 coordinates here into the relevant field here.

Admittedly, there's no way of knowing whether this individual spoke proto-Uralic or not. Indeed, he may have spoken something totally unrelated. The important point is that the very specific genetic signature shared by almost all present-day Uralic speakers, except perhaps Hungarians, is now finally represented in the ancient DNA record. And I can reveal to you that we'll soon be seeing many more ancients very similar to kra001 in upcoming papers.

See also...

The Uralic cline with kra001 - no projection this time

The BOO people: earliest Uralic speakers in the ancient DNA record?

Fresh off the sledge

Wednesday, January 27, 2021

The great shift

Here's a Principal Component Analysis (PCA) featuring some of the ancients from the recent Saag et al. paper at Science Advances. To see an interactive version of the plot, paste the Global25 coordinates here into the relevant field here.

Note that the Fatyanovo culture agropastoralists, who are rich in Y-haplogroup R1a and steppe ancestry, cluster with present-day Eastern Europeans. On the other hand, the Volosovo culture singleton sits near the European hunter-gatherer cline that no longer exists.

This Volosovo individual belongs to Y-haplogroup Q1a. However, most of the Volosovo males whose genomes are soon to be published belong to Y-haplogroup R1b.

Thus, in much of Eastern Europe during the Bronze Age, agropastoralists rich in R1a and steppe ancestry replaced hunter-gatherers rich in R1b and with no steppe ancestry. Of course, that's not where the story ends, but I'll get back to that later this year.

By the way, the relatively high coverage Fatyanovo Y-chromosome sequences are being analyzed at YFull. You can check out the results here.

See also...