
Showing posts with label Eurasia. Show all posts

Friday, January 21, 2022

Yamnaya is from Europe, but it's really from Asia


I was about to post a comment under a new preprint at bioRxiv, but the comment section isn't there anymore. Hopefully, this is just a temporary glitch.

The preprint in question is titled Reconstructing the spatiotemporal patterns of admixture during the European Holocene using a novel genomic dating method [LINK]. It's co-authored by Harvard/Broad MIT scientist Nick Patterson who occasionally comments at this blog.

My impression is that the authors see the people associated with the Yamnaya culture as Asians who simply used "far" Eastern Europe as a springboard to expand into other parts of Europe.

If so, they're dead wrong.

There are at least three arguments why the Yamnaya population should be seen as quintessentially European:

- its home was initially and overwhelmingly the Pontic-Caspian steppe, which is entirely located within the present-day borders of Europe

- Yamnaya genomes are clearly different from those of older populations native to nearby parts of Asia, and, in fact, these differences show a very strong correlation with the present-day borders between Europe and Asia

- the Yamnaya people weren't a new population in Europe by any stretch, but must have been overwhelmingly derived from the very similar Eneolithic peoples of the Pontic-Caspian steppe and/or the nearby forest steppe, both of which are located in Eastern Europe.

And yet, this is what the preprint claims:

The beginning of the Bronze Age was a period of major cultural and demographic change in Eurasia, accompanied by the spread of Yamnaya Steppe Pastoralist-related ancestry from Pontic-Caspian steppes into Europe and South Asia (16).

In fact, what really happened at this time was that Yamnaya steppe pastoralist-related ancestry spread from Eastern Europe to other parts of Europe, as well as to Central and West Asia.

The preprint does eventually explain that present-day South Asians derive their Yamnaya-related ancestry from a later eastward expansion of the European Corded Ware culture (CWC), but it completely ignores the fact that the Afanasievo culture was the result of the initial eastward expansion from Europe to Asia. That is, the ancestors of the Afanasievo people were recent migrants from the Pontic-Caspian steppe to Central Asia and Siberia.

There's also this:

Over the following millennium, the Yamnaya-derived groups of the Corded Ware Complex (CWC) and Bell Beaker complex (BBC) cultures brought Steppe pastoralist-related ancestry to Europe.

Seriously? Both the CWC and BBC, just like the Yamnaya culture, were from Europe. In fact, as per above, the descendants of the CWC expanded into Asia.

And this:

The second major migration occurred when populations associated with the Yamnaya culture in the Pontic-Caspian steppe expanded to central and western Europe from far eastern Europe.

The authors basically admit here that Yamnaya came from Eastern Europe, but they call it "far" Eastern Europe. Perhaps they know something I don't, but as things stand, there's no evidence that Yamnaya came from "far" Eastern Europe. In fact, the emerging consensus based on ancient DNA, including pre-publication data, is that Yamnaya may have originated in what is now Ukraine. In my opinion, Ukraine isn't located in "far" Eastern Europe, but more or less in the middle of it.

Inexplicably, this is what they say about the genetic origins of the Yamnaya and Afanasievo peoples:

These groups were likely the result of a genetic admixture between the descendants of EHG-related groups and CHG-related groups associated with the first farmers from Iran (8, 22, 36).

...

Thus, we combined all early Steppe pastoralist individuals in one group to obtain a more precise estimate for the genetic formation of proto-Yamnaya of ~4,400 to 4,000 BCE (Figure 2). These dates are noteworthy as they pre-date the archeological evidence by more than a millennium (37) and have important implications for understanding the origin of proto-Pontic Caspian cultures and their spread to Europe and South Asia.

Not really.

Like I said, the Yamnaya population was overwhelmingly derived from the Eneolithic peoples of the Eastern European steppe and/or forest steppe. And these Yamnaya-like Eneolithic peoples were spread out across a vast area of Eastern Europe by at least ~4,500 BCE. Some of their genomes have been available for several years, and many more are on the way.

It is possible that the Yamnaya and Afanasievo genotype formed in 4,400-4,000 BCE, but if so, then this was due to mixing between the Eneolithic steppe peoples and nearby European farmers. That's because the main difference between the Yamnaya and Eneolithic steppe genotypes is minor (~15%) European farmer admixture in the former.

The really interesting puzzle is exactly where and when the peculiar Eneolithic steppe genotype came into being. Any ideas, Dr Patterson?

See also...

Matters of geography

Understanding the Eneolithic steppe

Wednesday, April 7, 2021

The Bacho Kiro surprise (Hajdinjak et al. 2021)


Over at Nature at this LINK. The paper focuses on Neanderthal ancestry in Initial Upper Paleolithic (IUP) humans from what is now Bulgaria. But, to me, much more interesting is the claim by its authors that present-day East Asians harbor ancient European, or, at least, European-related ancestry. From the paper, emphasis is mine:

When we explored models of population history that are compatible with the observations above using admixture graphs [28], we found that the IUP Bacho Kiro Cave individuals were related to populations that contributed ancestry to the Tianyuan individual in China as well as, to a lesser extent, to the GoyetQ116-1 and Ust’Ishim individuals (all |Z| < 3; Fig. 2d, Supplementary Information 6). This resolves the previously unclear relationship between the GoyetQ116-1 and Tianyuan individuals [13] without the need for gene flow between these two geographically distant individuals.

...

In conclusion, the Bacho Kiro Cave genomes show that several distinct modern human populations existed during the early Upper Palaeolithic in Eurasia. Some of these populations, represented by the Oase1 and Ust’Ishim individuals, show no detectable affinities to later populations, whereas groups related to the IUP Bacho Kiro Cave individuals contributed to later populations with Asian ancestry as well as some western Eurasian humans such as the GoyetQ116-1 individual in Belgium. This is consistent with the fact that IUP archaeological assemblages are found from central and eastern Europe to present-day Mongolia [5,15,16] (Fig. 1), and a putative IUP dispersal that reached from eastern Europe to East Asia. Eventually populations related to the IUP Bacho Kiro Cave individuals disappeared in western Eurasia without leaving a detectable genetic contribution to later populations, as indicated by the fact that later individuals, including BK1653 at Bacho Kiro Cave, were closer to present-day European populations than to present-day Asian populations [29,30].

Hajdinjak, M., Mafessoni, F., Skov, L. et al. Initial Upper Palaeolithic humans in Europe had recent Neanderthal ancestry. Nature 592, 253–257 (2021). https://doi.org/10.1038/s41586-021-03335-3

See also...

Ust'-Ishim belongs to K-M526

Saturday, February 13, 2021

The Uralic cline with kra001 - no projection this time


A whole lot of nonsense was posted online, often by people who should've known better, after I claimed that kra001 was a solid proxy for a proto-Uralic genome (see here).

For those of you who still don't get it, below are three Principal Component Analysis (PCA) plots featuring Uralic speakers and other present-day Eurasians. Kra001 is also there. These graphs are based on genotype data, not reprocessed Global25 data. The relevant datasheet is available here.

Compared to my previous PCA with kra001, here I included a bigger range of East Eurasian populations to help mitigate the effects of extreme genetic drift in some of the Siberian groups, at least on the first few Principal Components (PCs). Moreover, kra001 wasn't projected onto PCs computed with modern-day samples, so he was free to influence the outcome of the PCA.
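To make the difference between projecting a sample and letting it take part in the PCA concrete, here's a minimal sketch in Python. It's a toy under stated assumptions: the two-dimensional points are made up (they aren't genotype data), the `first_pc` and `pc_score` helpers are hypothetical names, and the leading axis is found with plain power iteration rather than with any particular population genetics software.

```python
import math

def first_pc(rows, iterations=200):
    """Return (axis, means): the leading principal axis via power iteration."""
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    centered = [[r[d] - means[d] for d in range(dims)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (len(rows) - 1)
            for b in range(dims)] for a in range(dims)]
    axis = [1.0] * dims  # arbitrary start; fine for this toy data
    for _ in range(iterations):
        axis = [sum(cov[a][b] * axis[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(v * v for v in axis))
        axis = [v / norm for v in axis]
    return axis, means

def pc_score(sample, axis, means):
    """Position of one sample along the given axis."""
    return sum((s - m) * a for s, m, a in zip(sample, means, axis))

# Made-up data: "moderns" vary along x, the "ancient" sample is unusual along y.
moderns = [[0.0, 0.0], [1.0, 0.1], [2.0, -0.1], [3.0, 0.0], [4.0, 0.1]]
ancient = [2.0, 10.0]

# Projection: the axis is computed from the moderns only.
axis_proj, means_proj = first_pc(moderns)
projected_score = pc_score(ancient, axis_proj, means_proj)

# No projection: the ancient sample helps define the axis.
axis_joint, means_joint = first_pc(moderns + [ancient])
joint_score = pc_score(ancient, axis_joint, means_joint)

print("PC1 axis, moderns only:", axis_proj)
print("PC1 axis, ancient included:", axis_joint)
print("ancient PC1 score, projected vs joint:", projected_score, joint_score)
```

In the projected case the axis ignores whatever makes the extra sample distinctive, so the sample lands near the middle. When it's included in the computation, it can pull the axis towards itself.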


Note the east-to-west clines made up largely of Uralic-speaking groups on the first two plots. These plots are based on PCs 1/2 and 1/3, respectively. The third plot, based on PCs 1/4, is more complex and thus more difficult to interpret, but it also manages to isolate many of the Uralic populations from the others.

The Uralic-specific clines do intersect with the clines and clusters formed by the other linguistic groups. However, based on the three plots, the Yeniseian-speaking Kets are the only Asian group that can plausibly be mistaken for Uralic speakers.

Importantly, apart from the Kets, kra001 is the only Asian individual who shifts his position on all three plots as if he were a Uralic speaker. This might well be a coincidence, and we'll never know what language was spoken by kra001, but it does suggest to me that his genome is a solid proxy for a proto-Uralic genome.

See also...

First taste of Early Medieval DNA from the Ural region

The BOO people: earliest Uralic speakers in the ancient DNA record?

Fresh off the sledge

Friday, February 5, 2021

Finally, a proto-Uralic genome


Obviously, genes don't speak languages, people do. But sometimes it's possible to associate a linguistic group with a very specific genetic signature.

A while ago many of us in the blogosphere spotted an uncanny connection between the Uralic language family, Y-haplogroup N-L1026 and Nganasan-like genome-wide genetic ancestry.

As a result, we expected a Nganasan-like population rich in N-L1026 to eventually appear in the ancient DNA record, probably somewhere in Siberia and in burials from a likely proto-Uralic archeological culture. This hasn't happened yet, but we now have direct evidence that such a population must have existed somewhere deep in Siberia as early as the Bronze Age.

Kra001, whose genome was published recently along with Kilinc et al., belongs to a pre-N-L1026 lineage and, at least in terms of genome-wide genetic structure, could well be from a population directly ancestral to present-day Nganasans. Of course, the Nganasan language is part of the Samoyedic branch of Uralic.

Below is a series of Principal Component Analyses (PCA) featuring kra001. He's labeled RUS_Krasnoyarsk_BA, after the location and age of his burial. Note the obvious Uralic cline running across the plots. That is, from west to east. Kra001 is positioned at the end of this cline very close to a small cluster of Nganasans. To see interactive versions of the plots, paste the Global25 coordinates here into the relevant field here.

Admittedly, there's no way of knowing whether this individual spoke proto-Uralic or not. Indeed, he may have spoken something totally unrelated. The important point is that the very specific genetic signature shared by almost all present-day Uralic speakers, except perhaps Hungarians, is now finally represented in the ancient DNA record. And I can reveal to you that we'll soon be seeing many more ancients very similar to kra001 in upcoming papers.

See also...

The Uralic cline with kra001 - no projection this time

The BOO people: earliest Uralic speakers in the ancient DNA record?

Fresh off the sledge

Tuesday, July 21, 2020

The oldest R1a to date


My popular map of the oldest instances of Y-haplogroup R1a in the ancient DNA record has a new entry: PES001 from the recent Saag et al. preprint. PES001 comes from a burial site in what is now northwestern Russia and is dated to a whopping 10785–10626 calBCE.


Indeed, I'm not aware of any R1a samples older than PES001 among the treasure trove of thousands of ancient samples waiting to be published. So it's likely that this individual will remain the oldest member of our R1a clan for some years to come.

See also...

Y-haplogroup R1a and mental health

Like three peas in a pod

The mystery of the Sintashta people

Wednesday, March 25, 2020

The origins of East Asians (Wang et al. 2020 preprint)


Over at bioRxiv at this LINK. Here's the abstract:

The deep population history of East Asia remains poorly understood due to a lack of ancient DNA data and sparse sampling of present-day people. We report genome-wide data from 191 individuals from Mongolia, northern China, Taiwan, the Amur River Basin and Japan dating to 6000 BCE - 1000 CE, many from contexts never previously analyzed with ancient DNA. We also report 383 present-day individuals from 46 groups mostly from the Tibetan Plateau and southern China. We document how 6000-3600 BCE people of Mongolia and the Amur River Basin were from populations that expanded over Northeast Asia, likely dispersing the ancestors of Mongolic and Tungusic languages. In a time transect of 89 Mongolians, we reveal how Yamnaya steppe pastoralist spread from the west by 3300-2900 BCE in association with the Afanasievo culture, although we also document a boy buried in an Afanasievo barrow with ancestry entirely from local Mongolian hunter-gatherers, representing a unique case of someone of entirely non-Yamnaya ancestry interred in this way. The second spread of Yamnaya-derived ancestry came via groups that harbored about a third of their ancestry from European farmers, which nearly completely displaced unmixed Yamnaya-related lineages in Mongolia in the second millennium BCE, but did not replace Afanasievo lineages in western China where Afanasievo ancestry persisted, plausibly acting as the source of the early-splitting Tocharian branch of Indo-European languages. Analyzing 20 Yellow River Basin farmers dating to ~3000 BCE, we document a population that was a plausible vector for the spread of Sino-Tibetan languages both to the Tibetan Plateau and to the central plain where they mixed with southern agriculturalists to form the ancestors of Han Chinese. 
We show that the individuals in a time transect of 52 ancient Taiwan individuals spanning at least 1400 BCE to 600 CE were consistent with being nearly direct descendants of Yangtze Valley first farmers who likely spread Austronesian, Tai-Kadai and Austroasiatic languages across Southeast and South Asia and mixing with the people they encountered, contributing to a four-fold reduction of genetic differentiation during the emergence of complex societies. We finally report data from Jomon hunter-gatherers from Japan who harbored one of the earliest splitting branches of East Eurasian variation, and show an affinity among Jomon, Amur River Basin, ancient Taiwan, and Austronesian-speakers, as expected for ancestry if they all had contributions from a Late Pleistocene coastal route migration to East Asia.

Also this part is interesting, but surprisingly naive:

The findings of the original study that reported evidence that the Afanasievo spread was the source of Steppe ancestry in the Iron Age Shirenzigou have been questioned with the proposal of alternative models that use ancient Kazakh Steppe Herders from the site of Botai, Wusun, Saka and ancient Tibetans from the site of Mebrak 15 in present-day Nepal as major sources for Steppe and East Asian-related ancestry [28]. However, when we fit these models with Russia_Afanasievo and Mongolian_East_N added to the outgroups, the proposed models are rejected (P-values between 10^-7 and 10^-2), except in a model involving a single low coverage Saka individual from Kazakhstan as a source (P=0.17, likely reflecting the limited power to reject models with this low coverage). Repeating the modeling using other ancient Nepalese with very similar genetic ancestry to that in Mebrak results in uniformly poor fits (Online Table 5). Thus, ancestry typical of the Afanasievo culture and Mongolian Neolithic contributed to the Shirenzigou individuals, supporting the theory that the Tocharian languages of the Tarim Basin—from the second-oldest-known branch of the Indo-European language family—spread eastward through the migration of Yamnaya steppe pastoralists to the Altai Mountains and Mongolia in the guise of the Afansievo culture, from where they spread further to Xinjiang [5,7,8,27,29,30]. These results are significant for theories of Indo-European language diversification, as they increase the evidence in favor of the hypothesis the branch time of the second-oldest branch in the Indo-European language tree occurred at the end of the fourth millennium BCE [27,29,30].

I'd say the authors are putting too much faith in their qpAdm mixture models. They ought to know that qpAdm has some serious limitations, especially with regard to fine-scale ancestry. I would urge them to become better acquainted with the uniparental markers of the Iron Age Shirenzigou samples instead of forcing the idea that these individuals harbor Afanasievo-derived ancestry and lack Tibetan-related ancestry.

See also...

They mixed up Huns with Tocharians

A surprising twist to the Shirenzigou nomads story

Afanasievo people may well have been proto-Tocharian speakers (Ning et al. 2019)

Wednesday, September 11, 2019

Y-haplogroup R1a and mental health


I've updated my map of pre-Corded Ware culture R1a samples with a couple of new entries from Central and South Asia (the original is still here). However, before any of you get overly excited, please note that these samples aren't older than the Corded Ware culture. The reason I added them to my map is to counter the ongoing absurd claims online that South Asian R1a isn't derived from European R1a.


Just in case the map can't be viewed in all of its glory on some devices, here's what the fine print says:

The oldest example of R1a in ancient DNA from Central Asia is dated to 2132-1940 calBCE (ID I3770, Narasimhan 2019). Moreover, this sequence is closely related to much older R1a samples from Central, Eastern and Northern Europe, and phylogenetically nested within their diversity. Thus, it must surely represent a population expansion from Europe to Central Asia. Indeed, it's also associated with the Bronze Age Andronovo archeological culture, which is usually seen as an offshoot of the Corded Ware culture (CWC) of Late Neolithic Europe. The vast majority of present-day R1a lineages in Central Asia are closely related to that of I3770, and so must also ultimately derive from Europe.

The oldest instance of R1a in ancient DNA from South Asia is dated to just 1044-922 calBCE (ID I12457, Narasimhan 2019). This sequence, as well as the vast majority of present-day South Asian R1a lineages, are closely related to much older R1a samples from Central, Eastern and Northern Europe, and phylogenetically nested within their diversity. Thus, they must surely represent a population expansion from Europe to South Asia via Central Asia, in all likelihood during the Bronze Age. Even if R1a existed in South Asia before the Bronze Age, which is extremely unlikely, because it's found in samples from indigenous European hunter-gatherers, the vast majority of present-day R1a lineages in South Asia must be ultimately from Europe.

The idea that most, if not all, South Asian R1a is derived from European R1a seriously scares a lot of people. This is obvious in many online discussions on the topic. I suspect they're so frightened by it because, in their minds, it has the potential to encourage discrimination and even racism, perhaps by re-defining the colonization of much of the world by European nations in the recent past as the natural order of things?

In any case, clearly we're dealing with some sort of mass phobia here. I've got advice for those of you suffering from this problem: if you're honestly worried that the geographic provenance and expansion history of some Y-haplogroup is going to negatively impact on your life in any meaningful way, then it's time to find yourself a quality mental health professional. All the best with that.

See also...

The mystery of the Sintashta people

The Poltavka outlier

Yamnaya isn't from Iran just like R1a isn't from India

Sunday, March 31, 2019

Map of pre-Corded Ware culture (>2900 BCE) instances of Y-haplogroup R1a (updated)


Below is a map showing the global distribution of Y-chromosome haplogroup R1a prior to the expansions of the R1a-rich Corded Ware culture (CWC) people and their descendants across Europe and Asia from around 2900 BCE. I'll be updating this map regularly and using it to help me narrow down the options for the place of origin of R1a, and also to counter the misinformation about this topic that has appeared in print and online over the years, including in many scientific publications and popular websites such as Wikipedia.


Incredibly, as far as I know, there are just six reliably called instances of R1a in the now ample Eurasian ancient DNA record dating to the pre-CWC period. To put this into perspective, consider that R1a is today the most common Y-haplogroup in much of Europe and Asia. How did that happen, I wonder? However, please note that I chose to base the map only on samples sequenced with the capture and shotgun methods, rather than the PCR method, which is susceptible to producing contaminated results and is no longer used in major ancient DNA studies.

See also...

Y-haplogroup R1a and mental health

The Poltavka outlier

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, February 15, 2018

Modeling genetic ancestry with Davidski: step by step


There are many different ways to model your genetic ancestry, but I prefer the Global25/nMonte method. This is a step-by-step guide to modeling ancient ancestry proportions with this simple but powerful method using my own genome.


As far as I know, the vast majority of my recent ancestors came from the northern half of Europe. This may or may not be correct, but it gives me somewhere to start, so that I can come up with a coherent model. If you don't have this sort of information, because, perhaps, you were adopted, then just look in the mirror, and work from there. Like I say, it's not imperative that you know anything whatsoever about your ancestry, because your genetic data will do the talking, but you do need a model when modeling.

In scientific literature nowadays Northern Europeans are often described as a three-way mixture between Yamnaya-related pastoralists, Anatolian-derived early farmers, and Western European Hunter-Gatherers (WHG). So let's see if this model works for me. Obviously, if it does, then it'll confirm the information that I have about my origins, but it might also reveal finer details that I'm not aware of. The datasheet that I'm using for this model is available here.

[1] distance%=6.9025 / distance=0.069025

Davidski

Yamnaya_Samara 53.9
Barcin_N 30.75
Rochedane 15.35
Tepecik_Ciftlik_N 0

Yep, the model does work, with a fairly reasonable distance of almost 7%. The ancestry proportions more or less match those from scientific literature and the plethora of analyses that I've featured at this blog on the topic. Please note that I've kept things very simple, using only four reference populations and individuals as proxies for four distinct streams of ancestry. But I've put my own twist on this Neolithic/Bronze Age model by including two populations from Neolithic Anatolia (Barcin_N and Tepecik_Ciftlik_N), just to see what would happen. The WHG proxy is Rochedane.
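For anyone curious about what a fit like this is doing under the hood, here's a minimal sketch in Python. It assumes, as the distance figures in the output suggest, that the goal is to find non-negative proportions summing to 100% that minimize the Euclidean distance between the target's coordinates and the weighted average of the reference coordinates. This is not nMonte itself. The random search below is a simplified stand-in, the function names are hypothetical, and the three-dimensional coordinates and the `SteppeRef`, `FarmerRef` and `ForagerRef` labels are made up rather than taken from the Global25 datasheet.

```python
import math
import random

def mix(weights, refs):
    """Weighted average of the reference coordinate vectors."""
    dims = len(refs[0])
    return [sum(w * r[d] for w, r in zip(weights, refs)) for d in range(dims)]

def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_mixture(target, refs, steps=20000, seed=1):
    """Random search: shift weight between two references, keep improvements."""
    rng = random.Random(seed)
    n = len(refs)
    weights = [1.0 / n] * n  # start from equal proportions
    best = distance(mix(weights, refs), target)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        # Never take more weight from a reference than it currently has.
        delta = min(rng.uniform(0.0, 0.05), weights[i])
        trial = weights[:]
        trial[i] -= delta
        trial[j] += delta
        d = distance(mix(trial, refs), target)
        if d < best:
            weights, best = trial, d
    return weights, best

# Made-up references and a target built as 50% / 30% / 20% of them.
names = ["SteppeRef", "FarmerRef", "ForagerRef"]
refs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [0.5, 0.3, 0.2]

weights, dist = fit_mixture(target, refs)
for name, w in zip(names, weights):
    print(name, round(100 * w, 2))
print("distance:", dist)
```

Because weight is only ever moved from one reference to another, the proportions stay non-negative and keep summing to 100%, which is why a reference that doesn't help the fit can end up at exactly 0.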

Admittedly, though, my Yamnaya cut of ancestry appears somewhat bloated at over 53%, and the model's distance is a little higher than what I normally see for really strong models. So let's check if I can get a better fitting and more sensible result by adding a slightly more easterly forager proxy than Rochedane: Narva_Lithuania.

[1] distance%=5.9331 / distance=0.059331

Davidski

Yamnaya_Samara 45.75
Barcin_N 31.45
Narva_Lithuania 22.8
Rochedane 0
Tepecik_Ciftlik_N 0

The statistical fit does improve, and when given a choice between Rochedane and Narva_Lithuania, the algorithm picks the latter as the only source of extra forager input in my genome.

What could this mean? It might mean that a large part of my ancestry derives from the Baltic region. Actually, I know for a fact that this is true. But even if I had no idea about my genealogy, this result would be a very strong hint about my genetic origins. Indeed, let's follow this trail and try to further improve the fit of the model by adding a more relevant Yamnaya-related proxy, such as early Baltic Corded Ware (CWC_Baltic_early).

[1] distance%=5.444 / distance=0.05444

Davidski

CWC_Baltic_early 54.95
Barcin_N 26.7
Narva_Lithuania 18.35
Rochedane 0
Tepecik_Ciftlik_N 0
Yamnaya_Samara 0

Holy shit! To be honest, I wasn't expecting this sort of resolution and accuracy, and I can't promise that everyone using the Global25/nMonte method will see such incredibly nuanced outcomes, but this isn't a fluke. It can't be, because it gels so well with everything that I know about my ancestry. Please note also that I belong to Y-chromosome haplogroup R1a-M417, which is a lineage intimately associated with the Corded Ware expansion across Northern Europe (for instance, see here).

But of course, the Baltic and nearby regions haven't been isolated from migrations and invasions since the Corded Ware times. For instance, at some point, probably during the Bronze Age, Uralic-speaking groups moved west across the forest zone of Northeastern Europe and into the East Baltic and northern Scandinavia. It's generally accepted that they brought Siberian admixture with them (see here). Moreover, from the Iron Age to the Middle Ages, East Central Europe was under intense pressure from a wide range of nomadic steppe groups with complex ancestry, such as the Sarmatians, Avars, Huns, and Mongolians. Did any of these peoples leave their mark on my genome? At the risk of overfitting the model, let's explore this possibility by adding a few more reference populations.

[1] distance%=5.444 / distance=0.05444

Davidski

CWC_Baltic_early 54.95
Barcin_N 26.7
Narva_Lithuania 18.35
Han 0
Mongolian 0
Nganassan 0
Rochedane 0
Sarmatian_Pokrovka 0
Tepecik_Ciftlik_N 0
Yamnaya_Samara 0

Nothing changes when I add the Han Chinese, Mongolians, Nganasans (a Uralic group from Siberia), and Sarmatians to the model. But what about if I throw in the only ancient Slav in my datasheet?

[1] distance%=2.9904 / distance=0.029904

Davidski

Slav_Bohemia 85.9
CWC_Baltic_early 7.7
Narva_Lithuania 6.4
Barcin_N 0
Rochedane 0
Tepecik_Ciftlik_N 0
Yamnaya_Samara 0

Considering that the vast majority of my recent ancestors were Poles, thus a Slavic-speaking people from near the Baltic, this outcome makes perfect sense. And check out the new distance! But the problem now is that I'm overfitting the model by using two very similar and probably very closely related references, CWC_Baltic_early and Slav_Bohemia. And overfitting should be avoided at all costs. So it might be useful to break up this effort into two models: one focusing on the Neolithic and Bronze Age, and the other on the Iron Age and Middle Ages. I'll do that soon, but not just yet, because there are still too few Iron Age and Medieval samples available from the Baltic region and surrounds for meaningful analyses of this type.
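If you want to keep track of several runs like the ones above, a small script can read the pasted output and line up the distances. The sketch below is in Python and assumes only the text layout shown in this post (a `distance%=` line, the target name, then one `population value` pair per line). The `parse_run` helper is a hypothetical name, and it hasn't been checked against other nMonte output variants.

```python
import re

def parse_run(text):
    """Parse one pasted result block into a small dictionary."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    match = re.search(r"distance%=([0-9.]+)", lines[0])
    if match is None:
        raise ValueError("no distance% field on the first line")
    proportions = {}
    for ln in lines[2:]:
        # Split on the last run of whitespace: population name, then percentage.
        name, value = ln.rsplit(None, 1)
        proportions[name] = float(value)
    return {
        "target": lines[1],
        "distance_pct": float(match.group(1)),
        "proportions": proportions,
    }

# The first and last result blocks from this post, copied as plain text.
FIRST_RUN = """
[1] distance%=6.9025 / distance=0.069025

Davidski

Yamnaya_Samara 53.9
Barcin_N 30.75
Rochedane 15.35
Tepecik_Ciftlik_N 0
"""

LAST_RUN = """
[1] distance%=2.9904 / distance=0.029904

Davidski

Slav_Bohemia 85.9
CWC_Baltic_early 7.7
Narva_Lithuania 6.4
Barcin_N 0
Rochedane 0
Tepecik_Ciftlik_N 0
Yamnaya_Samara 0
"""

for label, text in [("first", FIRST_RUN), ("last", LAST_RUN)]:
    run = parse_run(text)
    used = {k: v for k, v in run["proportions"].items() if v > 0}
    print(label, run["distance_pct"], used)
```

Keep in mind that a lower distance on its own doesn't settle which model is better, because very similar references can drive the distance down through overfitting.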

See also...

Genetic ancestry online store (to be updated regularly)