Monday, November 5, 2018

On the spread of dairy pastoralism to East Asia (Jeong & Wilkin et al. 2018)

Over at PNAS at this LINK. Below is the abstract and a table with the uniparental haplogroups for the 20 ancient samples from the paper. Emphasis is mine.

Recent paleogenomic studies have shown that migrations of Western steppe herders (WSH) beginning in the Eneolithic (ca. 3300–2700 BCE) profoundly transformed the genes and cultures of Europe and central Asia. Compared with Europe, however, the eastern extent of this WSH expansion is not well defined. Here we present genomic and proteomic data from 22 directly dated Late Bronze Age burials putatively associated with early pastoralism in northern Mongolia (ca. 1380–975 BCE). Genome-wide analysis reveals that they are largely descended from a population represented by Early Bronze Age hunter-gatherers in the Baikal region, with only a limited contribution (∼7%) of WSH ancestry. At the same time, however, mass spectrometry analysis of dental calculus provides direct protein evidence of bovine, sheep, and goat milk consumption in seven of nine individuals. No individuals showed molecular evidence of lactase persistence, and only one individual exhibited evidence of >10% WSH ancestry, despite the presence of WSH populations in the nearby Altai-Sayan region for more than a millennium. Unlike the spread of Neolithic farming in Europe and the expansion of Bronze Age pastoralism on the Western steppe, our results indicate that ruminant dairy pastoralism was adopted on the Eastern steppe by local hunter-gatherers through a process of cultural transmission and minimal genetic exchange with outside groups.

Jeong & Wilkin et al., Bronze Age population dynamics and the rise of dairy pastoralism on the eastern Eurasian steppe, PNAS published ahead of print November 5, 2018

See also...

The mystery of the Sintashta people


Ric Hern said...

Does this not make sense when considering that the Okunevo Culture basically displaced the Afanasievo Culture? (Archeology)

Ric Hern said...

@ Davidski

That map is particularly interesting since it shows Western Steppe ancestry in Anatolia... Just like you said.

Ric Hern said...

Or do they imply that Botai people migrated to the East, maybe taking a more northern route as Afanasievo did, and then formed the Okunevo, who pushed back westwards?

Davidski said...


That map shows steppe lands, not steppe ancestry.

EastPole said...

There are plenty of Slavic words in Altaic languages. This was for some linguists the main argument for eastern origin of Slavs i.e. that Slavs came from Asia because they were in contact with Turks 2000-1000 BC and Turks originated in Asia.

But it looks like dairy pastoralism on the eastern Eurasian steppe was introduced by Sintashta which is derived from Corded Ware Culture which originated in Poland. Interesting.

Ric Hern said...

Okay. Sorry, got a little bit carried away there. Heheheeh..

Matt said...

Hmm... On the y, no C2 (a la "Golden Horde Asian" C2b1a3a1a), and no O, but one N1c (with mt U5, which seems a bit unusual as well? But there's nothing autosomally particularly odd about ARS003. Might be worth seeing if he's more ANE rich than other samples). Perhaps ARS015 would show something different if she'd been a male sample with the same autosomal basis.

Autosomally, "However, when Native Americans are added to PC calculation, we observe that LBA Khövsgöls are displaced from modern neighbors toward Native Americans along PC2, occupying a space not overlapping with any contemporary population (Fig. 2A and SI Appendix, Fig. S8). Such an upward shift on PC2 is also observed in the ancient Baikal populations from the Neolithic to EBA and in the Bronze Age individuals from the Altai associated with Okunevo and Karasuk cultures (1). These observations are consistent with LBA Khövsgöls and other ancient Siberians sharing more ancestry with Native American-related gene pools than modern populations in the region do.".

I'd still like to see a Fst analysis and f3 outgroup from these published papers, now that these ancient Northeast Asian samples are coming through. Lots of these East Asian groups are going to be a clade to West Eurasian populations, apart from the ANE fractions, so modelling this group as Baikal+ANE may have some limited meaning. For example, it might tell us more useful things to understand how these samples, Baikal and Devil's Gate relate to the Jomon sample - is there a cline here after already accounting for ANE?

Quite interesting on phenotype SNPs: "Khövsgöl individuals harbor derived alleles for many skin pigmentation variants (i.e. alleles associated with lighter skin pigmentation or under selective sweep), especially for those reaching high frequency in both Europeans and East Asians, such as rs1800404 in the OCA2 gene (Fig. S17). Rs3827760 in EDAR is also found in derived status for most of individuals, with a clear exception of the western outlier ARS026, who is ancestral homozygous. In contrast, variants with more restricted geographic distribution tend to be in lower frequency in Khövsgöl. Especially, three variants mostly confined to East Asia are found mostly ancestral in Khövsgöl: rs1800414 in OCA2 (only two individuals with derived alleles), rs2228479 in MC1R and rs1229984 in ADH1B (no derived allele observed; Fig. S17)."

This suggests that the trend on the Western Steppe and in Europe, where SLC45A2 rose in frequency (implying lighter pigmentation), is actually paralleled by some changes in pigmentation-related allele frequencies in at least parts of East Eurasia? The Asian alcohol dehydrogenase variant is also different. It's more confirmation that EDAR selection would've happened, to at least some extent, in whatever population (probably LGM era) was co-ancestral to the north of East Asia and then to post-LGM Siberians and Native Americans.

Synome said...

Cool stuff!

The dominant presence of Y-hg Q and the heightened affinity to NA is suggestive of an older Paleosiberian related population dwelling in the Baikal region before Turkic and Mongolic speakers entered the area.

This accords with the accounts of some historians who believe that the original core of the Xiongnu were Yeniseian speakers, and that Yeniseian languages had a wider, more southerly distribution in ancient times.

bellbeakerblogger said...

It’s cool to see dental calculus and LP alleles finally being examined together.

andrew said...

Elite dominance? The West Eurasian proportion is not zero and tech transfer can have an outsized cultural impact but still needs a human vector in this time period.

Shaikorth said...

Looks like most of the West Eurasian there is not Steppe_MLBA but from something like Okunevo and thus ultimately derived from WSHG or Botai. The Sintashta-Andronovo types were the cattle herders and their impact looks primarily cultural even if we consider an elite dominance scenario.

Samuel Andrews said...


Pale skin in Asia is definitely "old" considering unrelated groups with little to no common ancestry in the last 10,000 years share it.

Samuel Andrews said...

That doesn't exclude the possibility it was unpopular in some & became popular recently.

Kristiina said...

@ Synome

You are probably right when you write that "the Yeniseian languages had a more southerly distribution". Indeed, these pastoralists inhabited Western Mongolia and not the Baikal area. At the moment, the oldest Q is Kolyma1, 9800 BP, from Northeastern Siberia. He is c. 75% Devil's Gate.

The yDNAs from the Baikal Neolithic belong to N, c. 7000 BP. Q1b1b1-YP4004(Q-L53) appears during the Late Neolithic/Bronze Age. The Ket Q haplotype is not old: the OMRS expansion time of Q1a2a1-L54 in the Yenisei basin is 3.3 kya.

With the current evidence you cannot claim that the Yeniseian languages preceded the Turkic languages in the Baikal area, provided that we presume that there is a connection between Baikal N(N1a2-L666) and the Turkic languages.

Max said...

ARS026 (R1a1) is close to Okunevo, like Karasuk culture members.

Mem said...


Slab grave Khövsgöl genetics contain approximately 4%-7% of the West Eurasian component; otherwise Shamanka EBA in the Lake Baikal-Lena Basin region show no difference from Khövsgöl. So the whole Mongolian and Baikal region was populated by these Paleo-Amerind Y-DNA Q people.

The Urheimat of the Proto-Bulgaro-Turkic language, according to paleolinguistic data, was in the West Siberian Baraba steppes. Proto-Turkic migration to the East seems to have started from 800-500 BC.

"The geolexical analysis of the Proto-Bulgaro-Turkic lexemes leads to the reconstruction of a water-rich ecozone with deciduous woods and particularly beavers, located in the temperate climate near grassland away from arid steppeland and sand deserts.

This seems to exclude any areas around the Aral Sea and the lower part of the Turan Depression, most parts of southern Kazakhstan, and most arid areas located to the south of the Eurasian Barrier, such as Dzungaria, the Taklamakan Desert (West China), the Dzungarian Gobi, the Gobi Desert (Inner Mongolia, China), the Great Lakes Depression (West Mongolia), the Alashan Desert (China), the Ordos Desert (China), etc.

The reconstructability of the beaver ecozone seems to exclude mountain areas along the northern Tian-Shan.

Most of the Mongolian territory may be excluded with some 80% probability based on its cold, dry environment with deforested steppeland and mountainous areas, which contradicts the requirement for multiple species of deciduous trees; beavers; ponds and rivers that should not freeze to the bottom in wintertime. Moreover, unlike the Ural Mountains, northern Kazakhstan and the Irtysh basin, all of which have high outputs of wheat, millet, barley, oats, southern Mongolia is only barely suitable for crop cultivation.

The Ob and Yenisei demoregions cannot be completely excluded as they share environment comparable with the Tobol-Ishim-Irtysh demoregion. Nevertheless, the Yenisei demoregion can be regarded as a much less likely candidate based on the relative scarcity of salt deposits, and the prevalence of taiga forests and the taiga fauna, which is hardly reflected in the reconstructed vocabulary above. This means that the Proto-Bulgaro-Turks might have been unfamiliar with dense woodland, or at least such environment was rather uncommon.

Mem said...

The reconstructed description of the Proto-Bulgaro-Turkic environment

As an additional result of the geolexical analysis above, we can make the following conclusions concerning the ethnological description of the Proto-Bulgaro-Turkic people:

They lived on the border of the deciduous woodland and open steppe that included some bushland of juniper, mugwort and various flower plants; not too far from the highlands with mineral deposits, though not necessarily in the direct vicinity of the highlands. Stone wastelands could be typical in the area.

The winters must have been snowy, severe and windy, but the summers were relatively hot as well, as characteristic of the continental climate.

Various deciduous trees with soft wood abounded, such as willow, aspen, linden, though the birch-tree was among the most notable ones.

The PBT environment was mostly inhabited with relatively small steppeland or grassland fauna such as mice, snakes, stoats, badgers, foxes; sparrows, larks. Birds of prey were common near the highlands, so the Proto-Bulgaro-Turkic people were probably familiar with falconry. In the riparian woods near lakes, beavers were the usual inhabitants.

Rivers and streams were usually rather small or intermittent, whereas lakes and ponds were much more common (especially after the eastern migration of Proto-Turkic); some of them could be saline.

They must have practiced fishing by using nets and sometimes probably building dams.

The lakes were frequently visited by cranes and wild geese. It is logical to assume that the Proto-Bulgaro-Turkic people hunted beavers, stoats, and foxes for fur to make winter clothing, which is confirmed, for instance, by the Baraba Tatars' similar hunting activities.

Marshland regions were situated somewhere nearby. On the contrary, sand dunes were probably atypical, and the dense taiga forest was most likely unknown, either. Generally speaking, beavers and birch-trees must have been among the most distinguished features of the local environment.

The Proto-Bulgaro-Turks bred cattle, horses, and probably goats, though there is no direct geolexical evidence for sheep. Cf. the domesticated animals ratio for the Baraba Tatars from 1889: horses: 40%, cattle: 30%, sheep: 30% [see Myagkov (2009)]. The Proto-Bulgaro-Turks probably used horse-drawn sledges in winter; and apparently were well-familiar with horse riding, saddle and stirrup making, probably availing themselves of cowboy-style nomadism in summer, but living in houses during the wintertime, either keeping the animals in stalls (as the Baraba Tatars do) or letting them out in the open to let them dig the grass from under the snow (as the Kazakhs do). They fed on a variety of dairy products, such as sour cream and quark. They practiced crop cultivation, including barley, spelt, millet, oats, and possibly flax (necessary in fabric making).

They used copper metallurgy (evidently, bronze) and were probably familiar with iron, as well as with silver and gold jewelry.

2.5 Conclusions about the position of the Proto-Bulgaro-Turkic Urheimat

By exclusion, we must conclude that the area consistent with the principle of the maximum diversity, the demographic analysis and the geolexical analysis may have been situated somewhere within the following three closely interconnected geographic areas, largely coinciding with the Tobol-Ishim-Irtysh or Ob or Yenisei demoregions.

The Tobol-Ishim-Irtysh demoregion, which forms a sort of a fertile crescent, seems to meet nearly all the criteria stipulated by the reconstructed geographical lexemes. The location of the PBT Urheimat along the middle course of the Irtysh seems to be particularly likely.

The Yenisei demoregion is much less likely due to the possible partial exclusion of certain reconstructed lexemes and much too eastern location, which is in contradiction with the principle of maximum diversity. However, the Ob demoregion could not be entirely dismissed at the present stage and should be kept in mind for additional consideration."

Matt said...

@Sam, I'm not sure whether we can know at all that there are groups today who share no ancestry <10,000 years old in East Asia, or the threshold for pale skin here really.

Sure, you could have groups who have high levels of derived variants, like SHG in West Eurasia (who even have greater frequencies of derived variants than any modern population), and these could even be relatively more common in East Eurasia than in West Eurasia.

It's all pretty unknown at the moment - I would guess you will see a parallel to West Eurasia as Mathieson's last abstract describes, where there is a robust depigmentation trend over tens of thousands of years, but still some further change in the last 10,000 and even last 4,000 years.

Plus extinct (mostly) groups and understudied groups could have subtly different variants.

Ric Hern said...

@ Samuel

If I remember correctly it was already shown that ancestors of the Khoi-San already showed variations of skintone and it could be as old as 200 000 years or even earlier. After all did you have a look at the skintone of Rhesus Monkeys....?

Kristiina said...

Correction: Kolyma1 is the oldest Q outside of America. The oldest Q is Anzick1, 12600YBP, in Northwest America.

Matt said...

Anyone have any suggestions about what could explain the hiatus of expansion into the Eastern Steppe? Ecological boundaries disfavouring steppe EMBA modes of subsistence? That's something like domestic animals being transferred but problems using more technologically intensive parts of the toolkit?

Or is it just chance and luck that the early eastern steppe groups had technology transfer before Western Steppe populations could gain much demographic prominence (which Central Steppe groups more like Botai and Okunevo seem ultimately less fortunate with)?

EastPole said...

David, the theory of Asiatic origin of Slavs was based among other things on many Slavic words in Altaic languages. Here is a short description of this theory (in Polish):

We know today that Slavs are derived from Corded Ware Culture and have Central-Eastern European origin, but that link with Altaic languages is very interesting in view of recent genetic studies on the spread of dairy pastoralism to East Asia.

The link Slavic languages have with Altaic languages was described by Kazimierz Moszyński in his book “Pierwotny zasiąg [!] języka prasłowiańskiego”:

I made a scan of chapter XVII of this book where prof. Moszyński describes his comparison of Slavic and Turkic words (in Polish):

It is very relevant to Jeong’s article because some of these Slavic words in Altaic languages are related to herding and food production.
For example, on page 227 Moszyński writes about such Slavic words in Turkic languages as ‘bull’ and ‘goat’. In Turkic languages these words are isolated and have no etymology, whereas in Slavic they are well explained. Turkic languages borrowed from Slavic not only the word for ‘goat’ but also for ‘goat’s skin’ and ‘dress made of goat’s skin’.
Other words denote ‘food made of millet’ on page 226 and ‘flour/labor’ on page 228.

Domestic cattle, goats and sheep, millet cultivation and food production were probably introduced to eastern Eurasian steppe by Sintashta herders and linguistics here agrees with genetics.

Another important word Moszyński discusses on pages 217-224 is ‘hop’, used for beer and mead production, where he demonstrates that Turkic ‘kumalak’ is derived from Slavic ‘chъmelь’ and also explains its relation to Indo-Iranian words like ‘soma/haoma’.

Moszyński mentions ‘hop’ in chapter VIII, where he discusses Slavic and Indo-Iranian linguistic similarities, especially in religious vocabulary, on page 85 (in Polish):

In discussed chapter XVII on page 223 he states that we cannot exclude that ‘soma’ was ‘hops’:

I think ‘soma’ was ‘hops’. I know the Rigveda much better than Moszyński and to me his arguments are very convincing. It fits everything perfectly: Slavic religion, the similarity between the Soma cult in India and the Dionysus cult in Greece, Hyperborean religion etc.

Synome said...

@ Matt

I think the two factors are related, with the early culture transfer dependent on the geo-climatic barrier.

There is a continuous chain of mountains from the Altai to the Pacific that along with Lake Baikal separates the Eastern Steppe from the Western Steppe. We've seen how mountains can prove a significant barrier for steppe peoples e.g. the Caucasus. If you look at maps of the Eurasian steppe, the Urals make a bit of a dent, but it's at the Altai where the steppe habitat is significantly broken up and afterwards becomes much narrower.

I think the Altai-Sayan region became an exposure zone where steppe migrants interacted with locals and transmitted their culture and domesticates without being able to overwhelm the population.

This started as early as Afanasievo, and so by the time of the post-Sintashta migrations there was already more local parity in lifestyle and culture that additionally helped to prevent steppe newcomers from dominating.

Samuel Andrews said...

"I'm not sure whether we can know at all that there are groups today who share no ancestry <10,000 years old in East Asia, or the threshold for pale skin here really. "

Lokomotiv (8ky) represents the Asian ancestry of Altaians & people on the Asian Steppe. Devil's Gate forms the foundation of Ulchi ancestry. Kolyma1 (9ky) represents the Asian ancestry of Inuit, Eskimos, Itelmen, etc. Ainu have ancient origins in Japan.

Ebizur said...

The Devil's Gate specimens have mostly been determined to be female through genetic testing (contrary to morphological assessments, which predicted that many of them should be male). They all belong to mtDNA haplogroup D4 (modal Japanese/Korean/Mongol haplogroup), with about half of them belonging to the D4m subclade, which is now split between a Japanese branch and an Amurian (esp. Nivkh) branch.

It seems that most modern Ulchi belong to other haplogroups, mostly Japan-specific or East Siberian-specific clades plus a few generic East Asian/Chinese types.

Ebizur said...

Ulchi mtDNA (Rem I. Sukernik, Natalia V. Volodko, Ilya O. Mazunin, et al. (2012), "Mitochondrial Genome Diversity in the Tubalar, Even, and Ulchi: Contribution to Prehistory of Native Siberians and Their Affinities to Native Americans." American Journal of Physical Anthropology 148:123–138.)

("Historically, the Ulchi are a well-defined tribe of hunters and fishermen dispersed along the lakes and the reaches of the Lower Amur. They speak a language of the Tungusic-Manchu group (Levin and Potapov, 1964; Black, 1988; Krauss, 1988). Previously published mtDNA data obtained from 87 elderly Ulchi residing in Old and New Bulava, two neighboring villages (Starikovskaya et al. 2005), were revised and supplemented by 73 new samples collected in Bogorodskoe and Nizhniy Gavan villages (Ulchi district, Khabarovsk Region) in September 2009. Hence, the total Ulchi sample consisted of 160 individuals, with little admixture with the Nivkhi, Negidal, and Udegey.")

7/160 = 4.38% N9b [Japan-specific]
69/160 = 43.13% Y1a [Amurian/East Siberian-specific]
1/160 = 0.63% B5b2 [Most likely belongs to a B5b2a(xB5b2a1) branch that also has been observed in a Negidal and a Khamnigan. Various subclades of B5b2 have been found in populations ranging from the Arctic coast of Siberia to the Malay Archipelago and eastward as far as the Solomon Islands.]
5/160 = 3.13% F1a [Southeast/East Asian; seems to roughly correlate with the distribution of Y-DNA haplogroup O-M95 and Austroasiatic languages]
1/160 = 0.63% C1a [Also observed in Japan, Nanai, Daur, Buryat, Kyrgyz. Relatively closely related to indigenous American subclades of C.]
3/160 = 1.88% C4a1 [Also observed in Hungary (Szeged region), Poland (Kashubia), Belarus, Russian, Bashkortostan, Dagestan, Iran (Azeri), Turkey, India (Jammu and Kashmir, Iyer from Tamil Nadu), Nepal, Lachungpa, Lepcha, Wancho, Uzbekistan, Kazakhstan, Kyrgyz (Kyrgyzstan, Artux, and Tashkurgan), Sarikoli, Wakhi, Uyghur, Altai, Shor, Teleut, Tubalar, Tofalar, Nganasan, Evenk, Yakut, Buryat, Inner Mongolians, Chinese, Tibet, Ladakh, Thailand/Laos.]
5/160 = 3.13% C4b
11/160 = 6.88% C5
1/160 = 0.63% Z1a2
6/160 = 3.75% M8a

1/160 = 0.63% D4a1
2/160 = 1.25% D3
1/160 = 0.63% D4b2b
2/160 = 1.25% D4c2
3/160 = 1.88% D4e4
1/160 = 0.63% D4g2b
4/160 = 2.50% D4h
3/160 = 1.88% D4j
1/160 = 0.63% D4m2
1/160 = 0.63% D4o1
12/160 = 7.50% D4o2
(31/160 = 19.38% D4 total)

1/160 = 0.63% D5a
12/160 = 7.50% G1b
2/160 = 1.25% G2a1
4/160 = 2.50% M7
1/160 = 0.63% M9a1

Haplogroup D4 is neither overwhelmingly common nor uncommon among modern Ulchi. However, it appears that only the D4o2 and perhaps also the D4h(2?) subclades may be notably frequent among the Ulchi in particular.
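
The percentages in the list above can be re-derived from the raw counts. The following is only a minimal sketch in Python: the counts are copied from the list, while the names `ULCHI_MTDNA_COUNTS`, `D4_GROUP`, and `percent` are hypothetical and introduced just for illustration. It assumes the "D4 total" row includes the D3 row, since the D4-labelled rows alone sum to 29 and adding D3 gives the stated 31. Halves are rounded up via `decimal`, which matches figures such as 43.13%.

```python
from decimal import Decimal, ROUND_HALF_UP

# Counts copied from the list above (Sukernik et al. 2012, n = 160).
ULCHI_MTDNA_COUNTS = {
    "N9b": 7, "Y1a": 69, "B5b2": 1, "F1a": 5, "C1a": 1, "C4a1": 3,
    "C4b": 5, "C5": 11, "Z1a2": 1, "M8a": 6,
    "D4a1": 1, "D3": 2, "D4b2b": 1, "D4c2": 2, "D4e4": 3, "D4g2b": 1,
    "D4h": 4, "D4j": 3, "D4m2": 1, "D4o1": 1, "D4o2": 12,
    "D5a": 1, "G1b": 12, "G2a1": 2, "M7": 4, "M9a1": 1,
}

# Assumption: the "D4 total" row counts these labels, D3 included.
D4_GROUP = ("D4a1", "D3", "D4b2b", "D4c2", "D4e4", "D4g2b",
            "D4h", "D4j", "D4m2", "D4o1", "D4o2")


def percent(count, total):
    """Return count/total as a two-decimal percentage string, rounding halves up."""
    value = Decimal(count) * 100 / Decimal(total)
    return str(value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


TOTAL = sum(ULCHI_MTDNA_COUNTS.values())
d4_total = sum(ULCHI_MTDNA_COUNTS[name] for name in D4_GROUP)

# Print each haplogroup in the same "count/total = pct% label" shape as the list.
for name, count in ULCHI_MTDNA_COUNTS.items():
    print(f"{count}/{TOTAL} = {percent(count, TOTAL)}% {name}")
print(f"({d4_total}/{TOTAL} = {percent(d4_total, TOTAL)}% D4 total)")
```

Python's built-in `round` rounds exact ties to the nearest even digit, so it would give 43.12 rather than 43.13 for 69/160; that is why the sketch uses `ROUND_HALF_UP` instead.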

The Y1a clade that predominates among present-day Ulchi may be a rather young clade. It is also very frequent among modern Nivkh and Nanai populations, and somewhat less frequent (but still notable) among Ewenic populations (Negidal, Even, Evenk) and probably also among Ainus (although I am uncertain which subclade the Ainus belong to). Sukernik et al. (2012) have estimated it to be 6,000 (95% CI 3,300 <-> 8,800) years old. The specimens from Devil's Gate Cave may in fact predate the MRCA of mtDNA haplogroup Y1a. It would be interesting to find a contemporaneous specimen belonging to (pre-)Y1a.

G1b is very common among Chukotko-Kamchatkans and also has been found among Nivkhs, Yukaghirs, Ewenic peoples, and Yakuts.

Grey said...

"Anyone have any suggestions about what could explain the hiatus of expansion into the Eastern Steppe?"

"There is a continuous chain of mountains from the Altai to the Pacific that along with Lake Baikal separates the Eastern Steppe from the Western Steppe."

Another Ertebølle?

relatively well populated (?) and sedentary (?) HG population living in wetland/marsh terrain unsuitable for herd animals blocks the expansion of herders but allows slow cultural exchange so it's HGs from Lake Baikal who adopt the package and expand to the east as herders?

(maybe with females as the vector i.e. alliance marriages of west shore herder females with east shore HG chiefs and the females start raising goats or whatever to add to the HG fishing/hunting package? and that hybrid population with minor WSH expand to the east over the top of the previous non-lake HGs?)

Matt said...

@Sam, this is what abstracts on the topic are currently saying:

Abstracts from

Neolithic contact of populations in Northeast Asia

Chao Ning1, Hai Zhang 2
1. Eurasia3angle Group, Max Planck Institute for the Science of Human History
2. School of Archaeology and Museology, Peking University

Linguistic evidence claims that different languages spoken in Northeast Asia, such as Turkic, Mongolic, Tungusic, Koreanic and Japonic languages derived ultimately from a common ancestral linguistic family and the dispersal of those languages is indeed driven by the spread of millet agriculture. Despite the lack of consensus among archaeologists whether millet was first cultivated in West Liao River or in Yellow River region, recent linguistic studies claimed the former as the key region where proto‐Transeurasian language was spoken. Here we analyzed ancient genomes from West Liao River Valley and Yellow River Valley and integrating the archaeological evidences from those regions, we found that populations in West Liao River region were quite dynamic and genetically closer to modern populations who are speaking Tungusic, Mongolic and Turkic languages. Whereas, populations in Yellow River region showed a more stable genetic structure, consistent with the archaeological findings of this region. Our study mirrors the linguistic studies that millet farming technologies was introduced together with genes.

Bioarchaeological perspective on the expansion of Transeurasian languages in Northeast China

Yinqiu Cui1,2 Quanchao Zhang2

1. School of Life Sciences, Jilin University, China
2. Research Center for Chinese Frontier Archaeology of Jilin University

Northeast Asian has been an important region where a wide variety of languages were spoken, including Japonic, Koreanic, Tungusic, Mongolic and Turkic languages, termed together as Transeurasian language family. The linguist proposed that the spread of Transeurasian languages is driven by agriculture, which emerged in Neolithic Hongshan Culture phase and played an increasingly important role in the means of subsistence and the population grew gradually in the northeast China.

To investigate this hypothesis, we used shotgun data from ancient populations lived between 12000‐2300 years ago from Amur River Basin in Northeast China, where the extant populations mainly speak Tungusic languages. We find evidence for some Early Neolithic contacts with other populations, and otherwise detect genetic continuity beginning in the Early Neolithic, spanning throughout the time series, and extending into the present day.

We also found that there is close genetic affinity along the populations living surrounding Amur River Region, such as Mongol, Hezhe, Oroqen and Ulchi. This finding mirrors the linguistic evidences that populations who spoke Transeurasian languages share a higher genetic affinity.


Matt said...


Abstracts from

The homeland of proto‐Tungusic languages inferred from contemporary words and ancient genomes

Chuan‐Chao Wang1, Martine Robbeets 2
1.Institute of Anthropology, Xiamen University, 361005 Xiamen, China.
2.Max Planck Institute for the Science of Human History, 07745 Jena, Germany

There are two competing hypotheses concerning the possible homeland of proto‐Tungusic languages, namely, the Lake Baikal or the Amur River basin. We here generated genome‐wide data from 15 ancient East Asians, including 11 from the Amur River Basin in the Russian Far East dating to around ~5000 BCE (Middle Neolithic hunter‐gatherers from the Boisman‐2 cemetery), one ~1000 BCE individual (Iron Age from the Yankovsky Culture), two early Medieval individuals dating to ~1000 CE (from the Heishui Mohe culture). We analyzed our data together with published ancient genomes from Devil’s gate (7700 years ago) and Baikal (2000‐4500 years ago). We found a strong genetic overlap between the ancient samples from Amur River stretching as far as the Baikal region and present‐day speakers of Tungusic language. The ancient Amur River samples tended to be in an unadmixed form, while ancient Baikal samples obviously had West Eurasian gene flow. The results give circumstantial genetic evidence for an Amur River Basin homeland for proto‐Tungusic languages. The expansion of proto‐Tungusic people had shaped the genetic structure of the vast region from Amur River basin to Baikal. We also found the Han expansion left a significant genetic signature in Amur River Basin and there is also evidence for West Eurasian admixture into ancient Baikal and Tungusic populations in later time.

ISBA 2018:
Genomic insight into the Neolithic transition peopling of Northeast Asia

C. Ning1

East Asian representing a large geographic region where around one fifth of the world populations live, has been an interesting place for population genetic studies. In contrast to Western Eurasia, East Asia has so far received little attention despite agriculture here evolved differently from elsewhere around the globe. To date, only very limited genomic studies from East Asia had been published, the genetic history of East Asia is still largely unknown. In this study, we shotgun sequenced six hunter-gatherer individuals from Houtaomuga site in Jilin, Northeast China, dated from 12000 to 2300 BP and, 3 farming individuals from Banlashan site in Liaoning, Northeast China, dated around 5300 BP. We find a high level of genetic continuity within northeast Asia Amur River Basin as far back to 12000 BP, a region where populations are speaking Tungusic languages. We also find our Compared with Houtaomuga hunter-gatherers, the Neolithic farming population harbors a larger proportion of ancestry from Houtaomuga related hunter-gathers as well as genetic ancestry from central or perhaps southern China. Our finding further suggests that the introduction of farming technology into Northeast Asia was probably introduced through demic diffusion.

(The site from the last paper in question on Google maps if you are interested: Very far to the North East, if not quite to Ulchi territory -

Kristiina said...

@ Matt

There are high yDNA C-M217 frequencies with many divergent clades in Palaeo-Siberian populations such as Nivkhs (71% Kharkov), Koryaks (59.2% Lell), Itelmens (66.7% Lell) and Yukaghirs (33%). There is C-M217 also in America. According to Wikipedia, Cheyenne are 16% C-M217 and Apache 15% C-M217. Nivkhs speak a Paleo-Siberian isolate language. Koryak and Itelmen languages belong to the Chukchi-Kamchatkan language family. Yukaghirs speak a Paleo-Siberian isolate language. Cheyenne belongs to the Algonquian language family. Apache belongs to the Na-Dené languages.

If you think that the Devil’s Gate population spoke a pre-Proto-Tungusic language in Amur c. 10,000 BP, do you think that the Tungusic languages are related to Nivkh, Chukchi-Kamchatkan and Algonquian? If not, what do you think is the original yDNA of Nivkh, Chukchi-Kamchatkan and Algonquian, and where are they from? If we have different yDNA C ancestors speaking different languages, how do you know that the Devil’s Gate population spoke Proto-Tungusic and not Proto-Nivkh or Proto-Chukchi-Kamchatkan?

Kristiina said...

According to Wikipedia, Tungusic C seems to be mostly C2c1a1a1 M407 (TMRCA 4100 ybp), C2b1a2-M48 and C2b1a3a2 F10283. On yfull, TMRCA of the whole C2b1a2-M48 is only 3800 years.

C2b1a2a M77 is shared between Yukaghirs (Yukaghir language isolate), Nivkhs (Nivkh language isolate), Itelmens (Chukchi-Kamchatkan), Kazakhs, Oirats, Kalmyks, Outer Mongolians (Mongolic group), and Udegeys (Tungusic group), with a moderate distribution among other Southern Tungusic peoples (Tungusic group), Inner Mongolians, Buryats (Mongolic group), Tuvinians, Yakuts, Kyrgyz, Uyghurs, Uzbeks, Karakalpaks (Turkic groups), Chukchi (Chukchi-Kamchatkan) and Tajiks (Iranian). C2b1a2b B90 is shared between Koryaks (Chukchi-Kamchatkan) and sporadically among Evenks, Evens (Tungusic group), and Yukaghirs (Yukaghir language isolate).

Devil’s Gate is much older than M48 and the upstream clades of M48 are found in America (Algonquian speaking Cheyenne and Na-Dené speaking Apache) and in Koryaks. We have to see the broader picture if we want to take a position on the language spoken by the Devil’s Gate population.

Ebizur said...

Kristiina wrote,

"On yfull, TMRCA of the whole C2b1a2-M48 is only 3800 years."

IIRC, the existence of C-M48(xM86, M77) Y-chromosomes was first noted by Pakendorf et al. (2006, 2007) in some samples from northeast Siberia. However, this refinement of the phylogeny was not recognized by ISOGG until several years later.

Additional information about C-M48(xM86, M77) was provided by Karmin et al. (2015). Their sample set included four members (three Koryaks and one Evenk) of C-M48(xM77), all of whom were found to comprise a monophyletic clade marked by the B90 SNP (TMRCA 4,992 [95% CI 4,188 <-> 5,732] ybp). Furthermore, the three Koryaks all belonged to a subclade marked by the B91 SNP (TMRCA 3,812 [95% CI 3,005 <-> 4,654] ybp), whereas the Evenk's Y-DNA exhibited the B93 SNP.

Besides the three Koryak members of C-L1373 > C-M48 > C-B90 > C-B91, the sample set of Karmin et al. also included one Koryak member of C-L1373 > C-B473 > C-B77 and one Koryak member of C-L1373 > C-B79. With respect to C-M48, C-B77 forms a clade with North American C-P39.

It seems clear already that all or nearly all members of C-M48 among Turkic and Mongolic peoples of the steppe belong to the C-M86/M77 subclade and share a fairly recent common ancestor as you have noted. However, because of the lack of resolution regarding M86/M77 versus M48(xM86, M77) in nearly all studies published before the current decade (except the aforementioned studies by Pakendorf et al.) and even in many (perhaps most) studies published in the current decade, the eastern extent of the distribution of C-M86/M77 remains rather unclear.

Lell et al. 2002 found C-M48 in 62.5% (10/16) of a sample of Okhotsk Evenk, 60% (12/20) of a sample of Udegey, 38.9% (7/18) of a sample of Itel'man, 37.7% (20/53) of a sample of "Ulchi/Nanai," 35.3% (6/17) of a sample of Nivkh, 33.3% (9/27) of a sample of Koryak, and 4.2% (1/24) of a sample of Chukchi. There appears to be an error in the study's Figure 2, but the authors most likely also found C-M48 in 52.9% (9/17) of a pair of small samples of Negidal (7/7 = 100% of one sample and 2/10 = 20% of the other sample). The authors did not find any case of C-M48 in a sample of Siberian Eskimo (n=33).
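For anyone who wants to re-check the arithmetic, here is a minimal Python sketch that recomputes the percentages from the carrier counts and sample sizes quoted above. The variable and function names are my own, and pooling the two Negidal samples follows the "most likely" reading given above rather than anything confirmed by the study.

```python
# Counts quoted above from Lell et al. 2002: (C-M48 carriers, sample size).
lell_2002 = {
    "Okhotsk Evenk": (10, 16),
    "Udegey": (12, 20),
    "Itel'man": (7, 18),
    "Ulchi/Nanai": (20, 53),
    "Nivkh": (6, 17),
    "Koryak": (9, 27),
    "Chukchi": (1, 24),
}


def percent(carriers, n):
    """Frequency as a percentage, rounded to one decimal place."""
    return round(100 * carriers / n, 1)


frequencies = {pop: percent(k, n) for pop, (k, n) in lell_2002.items()}

# The two small Negidal samples (7/7 and 2/10), pooled as assumed above.
negidal_samples = [(7, 7), (2, 10)]
negidal_carriers = sum(k for k, _ in negidal_samples)
negidal_total = sum(n for _, n in negidal_samples)
negidal_pooled = percent(negidal_carriers, negidal_total)

for pop, freq in frequencies.items():
    print(f"{pop}: {freq}%")
print(f"Negidal (pooled {negidal_carriers}/{negidal_total}): {negidal_pooled}%")
```

The pooled Negidal figure hides how different the two samples are (100% versus 20%), which is worth keeping in mind when reading any of these small-sample frequencies.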

Xue et al. (2006) found C-M48 in 41.9% (13/31) Oroqen, 26.9% (7/26) Ewenki, 11.1% (5/45) Hezhe, 8.9% (4/45) Inner Mongolian, 4.9% (2/41) Xibe, 2.9% (1/35) Manchu, and 2.6% (1/39) Daur in the PRC. The authors found C-M48 in 20.0% (13/65) Outer Mongolian. The frequency of C-M48 in populations in East Asia appears to be positively correlated with a distinctively Siberian autosomal affinity (notable in e.g. Oroqen and Outer Mongolian populations). Populations with lower frequency of C-M48 Y-DNA tend to be more similar to the East Asian autosomal norm. I would not rush to ascribe that pattern to an origin of C-M48 in Siberia followed by southward migration, however, because the rest of the Y-DNA pools of these minority ethnic groups in the PRC tends to be very similar to Chinese people in general, which might suggest that C-M48-bearing ancestors of indigenous Siberians have dwelt in some part of northern China/East Asia prior to the migration of some part of their populations northward to Siberia and that the other part of their populations (who have remained behind in northern China/East Asia) subsequently have mixed with Chinese (or perhaps Korean) men.

Pakendorf et al. (2006) found the following in their sample of Yukaghir:
7.7% (1/13) C-M48(xM86)
15.4% (2/13) C-M86
7.7% (1/13) C-M217(xM48)

Ebizur said...

Vladimir N. Kharkov has reported the following data regarding the Y-DNA of eastern Siberians:

Koryak (n=33)
48% C3*(xC3c, C3d)
0% C3c
0% C3d

Nivkh (n=52)
71% C3*
0% C3c
0% C3d

Udegey (n=31)
61% C3*
9.7% C3c
0% C3d

Chukchi (n=46)
15% C3*
0% C3c
0% C3d

Evenki (n=32)
3.1% C3*
25% C3c
0% C3d

Yakut (n=225)
2.2% C3*
1.8% C3c
0% C3d

Eskimo (n=43)
9.3% C3*
0% C3c
0% C3d

Note that Vladimir N. Kharkov has a habit of using C3* to mean C-M217(xM77) or C-M217(xM77, M407), C3c to mean C-M77, and C3d to mean C-M407.

Fedorova et al. (2013) found the following:

Central Yakut
1.1% (1/92) C3d-M407

Vilyuy Yakut
3.4% (2/58) C3c-M48

Northern Yakut
4.5% (3/66) C3d-M407
3.0% (2/66) C3*-M217(xM407, M48)
6.1% (4/66) C3c-M48

Evenk (Ust-Maysky, Zhigansky, and Oleneksky districts of Sakha Republic)
5.3% (3/57) C3*-M217(xM407, M48)
26.3% (15/57) C3c-M48

Even (Eveno-Bytantaysky National district and Momsky district of Sakha Republic)
41.7% (10/24) C3c-M48

Yukaghir (Nizhnekolymsky and Verkhnekolymsky districts of Sakha Republic)
27.3% (3/11) C3*-M217(xM407, M48)
9.1% (1/11) C3c-M48

Dolgan (n=57 from Volochanka, Ust-Avam, and Dudinka of Taimyr plus n=10 from Anabarsky district of Sakha Republic)
1.5% (1/67) C3*-M217(xM407, M48)
11.9% (8/67) C3c-M48

Duggan et al. (2013) have found:

Evenk in Taimyr
44.4% (8/18) C3c1-M86

Even in Berezovka village, Srednekolymsky District, Sakha Republic
100% (7/7) C3c1-M86

Even in Esso and Anavgai villages, Kamchatka
100% (15/15) C3c1-M86

Populations for which the proportions of various subclades of C-M217 are unclear include the Chukchi, the Nivkh, and the Udegey.

Chukchi
25% (1/4) 1F-RPS4Y i.e. C-M130 (Karafet et al. 1999)
4.2% (1/24) C-M48 (Lell et al. 2002)
15.2% (7/46) C3*(xC3c, C3d) (Vladimir N. Kharkov)

Nivkh
35.3% (6/17) C-M48 (Lell et al. 2002)
38.1% (8/21) C-M217 (Tajima et al. 2004)
71.2% (37/52) C3*(xC3c, C3d) (Vladimir N. Kharkov)

Udegey
60% (12/20) C-M48 (Lell et al. 2002)
66.7% (14/21) C-M130 (Han-Jun Jin et al. 2010)
61.3% (19/31) C3*(xC3c, C3d), 9.7% (3/31) C3c (Vladimir N. Kharkov)

In the case of the Nivkh and the Udegey, a comparison of the high frequency of C-M48 in Lell's samples with the high frequency of C3*(xC3c, C3d) and the low frequency of C3c in Kharkov's samples suggests that most males in these populations may belong to C-M48(xM77), perhaps C-B90 like one of Karmin's Evenks and three of Karmin's Koryaks. The Chukchi cases of C-M48 and C3*(xC3c, C3d) may also be considered likely to belong to C-B90 because of their geographical and linguistic closeness to the Koryaks.

However, note that Pakendorf et al. (2006) have found 7.7% (1/13) C-M48(xM86), 15.4% (2/13) C-M86, and 7.7% (1/13) C-M217(xM48) in their sample of Yukaghir, whereas Fedorova et al. (2013) have found 27.3% (3/11) C3*-M217(xM407, M48) and 9.1% (1/11) C3c-M48 in their sample of Yukaghir. Pakendorf's sample has a total of 23.1% (3/13) C-M48 and only 7.7% C-M217(xM48), whereas Fedorova's sample has only 9.1% C-M48 and 27.3% C-M217(xM407, M48). The Yukaghir sample of Karafet et al. (1999) is 50% (6/12) C-M130, an even greater proportion than their Koryak sample (4/12 = 33.3% C-M130).

I would say that data remain insufficient to determine the relative proportions of C-M217 and its subclades in various indigenous populations of Eastern Siberia, although the data of Karmin et al. (2015) have revealed that at least the Koryaks contain a great amount of diversity within the C2b-L1373 clade, whose members are now found mainly in Central Asia, Siberia, and North America, and marginally in neighboring regions (Europe, South Asia, East Asia, and probably also South America).
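To illustrate how little samples of this size can pin down, here is a rough Python sketch that puts 95% Wilson score intervals around the two Yukaghir C-M48 frequencies quoted above (3/13 in Pakendorf et al. 2006 and 1/11 in Fedorova et al. 2013). The choice of the Wilson interval and all names in the code are my own assumptions. Neither study reports these intervals.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# C-M48 carriers and sample sizes for the two Yukaghir samples quoted above.
yukaghir_c_m48 = {
    "Pakendorf et al. (2006)": (3, 13),
    "Fedorova et al. (2013)": (1, 11),
}

intervals = {
    study: wilson_interval(k, n) for study, (k, n) in yukaghir_c_m48.items()
}

for study, (low, high) in intervals.items():
    k, n = yukaghir_c_m48[study]
    print(f"{study}: {k}/{n} = {100 * k / n:.1f}% "
          f"(95% interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

Both intervals are tens of percentage points wide and overlap heavily, so the apparent disagreement between the two studies is compatible with sampling noise alone.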

Kristiina said...

Thanks Ebizur! To sum up:

Devil’s Gate genomes are c. 7.7 kya old which requires us to look at C-F4032 which is upstream of C-M48. C-F4032 formed 14700 ybp, TMRCA 14500 and according to Ebizur, it should more or less correspond to C3*(xC3c, C3d). The modern northeast Asian populations that carry C3*(xC3c, C3d) are:
71.2% of Nivkhs (37/52, C3*(xC3c, C3d), Vladimir N. Kharkov)
61.3% of Udegey (19/31, C3*(xC3c, C3d), Vladimir N. Kharkov)
48% of Koryaks (C3*(xC3c, C3d), Vladimir N. Kharkov)
27.3% of Yukaghirs (3/11, C3*-M217(xM407, M48), Fedorova et al. 2013)
16% of Cheyenne (C-P39)
15.2% of Chukchi (7/46, C3*(xC3c, C3d), Vladimir N. Kharkov)
15% of Apache (C-P39)
9.3% of Eskimo (C3*, Vladimir N. Kharkov)
7.7% of Yukaghirs (1/13, C-M217(xM48), Pakendorf et al. 2006)
5.3% of Sakha Evenks (3/57, C3*-M217(xM407, M48), Fedorova et al. 2013)
3.1% of Evenks (C3*, Vladimir N. Kharkov)
2.2% of Yakuts (C3*, Vladimir N. Kharkov) or 3.0% of Northern Yakuts (2/66, C3*-M217(xM407, M48), Fedorova et al. 2013)
1.5% of Dolgans (1/67, C3*-M217(xM407, M48), Fedorova et al. 2013)

We see that most of these populations speak Paleo-Siberian languages, some even Native American languages. Therefore, I think it is highly unlikely that the Devil’s Gate population spoke a Tungusic language. IMO it makes much more sense to presume that the Mesolithic Amur language is related to Nivkh and/or Chukchi-Kamchatkan languages.

Ebizur said...

Kristiina wrote,

"Devil’s Gate genomes are c. 7.7 kya old which requires us to look at C-F4032 which is upstream of C-M48."

According to Table S7 of Karmin et al. (2015), the age of Node 255 (the most recent common ancestor of C-M77/M86 and C-B90, which are the two currently known primary subclades of C-M48) is 12,131 ybp [95% CI 10,916 ybp <-> 13,363 ybp considering only the variance of branch length estimation in BEAST, 8,503 ybp <-> 15,697 ybp considering also the uncertainty due to the confidence intervals of the mutation rate 0.63-0.95 x 10ˉ⁹]. The TMRCA of the C-M48 clade is probably somewhat less than the TMRCA of the Q-M3 clade (15,444 [95% CI 14,390 <-> 16,480 or 11,209 <-> 19,358] ybp) or the Q-F746/NWT01 clade (16,781 [95% CI 15,120 <-> 18,453 or 11,778 <-> 21,675] ybp) and somewhat greater than the TMRCA of the Q-L330 clade (8,506 [95% CI 7,336 <-> 9,699 or 5,715 <-> 11,392] ybp). Most present-day populations in which C-M48 is frequently observed also contain members of N-F1419, so one might consider the TMRCA of that clade, too: 11,659 [95% CI 10,440 <-> 12,936 or 8,133 <-> 15,195] ybp. However, the N-Y23747 (or N-F4063) clade, which has been observed in a few Japanese, Oroqen, Chinese, and Tibetan individuals, is slightly basal to N-F1419; the TMRCA of the entire N-M178 clade, subsuming both N-F1419 and N-Y23747, may be about 1,000 years greater than the TMRCA of N-F1419 according to YFull's estimates. (The data set analyzed by Karmin et al. 2015 does not include any member of N-Y23747.) Anyway, regardless of whether we compare C-M48 and N-F1419 or C-M48 and N-M178, the TMRCA estimates of the two clades do not differ significantly, and members of both clades tend to be found in the same ethnic groups (at least in Asia; C-M48 is extremely rare in Europe despite the high frequency of certain subclades of N-M178 in some European populations), so members of the two clades may have a complicated history of interaction from their very origins. 
I currently would guess that the two clades may have originated in some area(s) between the Liao and Amur rivers (in other words, in Manchuria), possibly among early users of ceramics in that region who subsisted by hunting, gathering, and fishing.
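As a rough illustration of the comparison above, the following Python sketch checks which of the quoted TMRCA intervals from Karmin et al. (2015) overlap that of C-M48. It uses only the narrower 95% CIs (branch length variance only) as quoted above. Interval overlap is a crude heuristic that I am assuming here for illustration, not a formal significance test, and the names in the code are hypothetical.

```python
from typing import NamedTuple


class Tmrca(NamedTuple):
    point: int  # point estimate, years before present
    low: int    # lower bound of the narrower 95% CI
    high: int   # upper bound of the narrower 95% CI


# Values as quoted above from Table S7 of Karmin et al. (2015).
c_m48 = Tmrca(12131, 10916, 13363)
others = {
    "Q-M3": Tmrca(15444, 14390, 16480),
    "Q-F746/NWT01": Tmrca(16781, 15120, 18453),
    "Q-L330": Tmrca(8506, 7336, 9699),
    "N-F1419": Tmrca(11659, 10440, 12936),
}


def intervals_overlap(a, b):
    """True if the two confidence intervals share at least one value."""
    return a.low <= b.high and b.low <= a.high


overlap_with_c_m48 = {
    clade: intervals_overlap(c_m48, est) for clade, est in others.items()
}

for clade, overlaps in overlap_with_c_m48.items():
    verdict = "overlaps" if overlaps else "does not overlap"
    print(f"{clade}: narrow 95% CI {verdict} that of C-M48")
```

With the wider intervals that also include mutation rate uncertainty, C-M48 overlaps all four clades, so the distinctions hold only under the narrower reading.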

Kristiina said...

@ Ebizur
You argue that if two yDNA clades have a similar TMRCA they come from the same place...

It seems as if you are speculating about the macro-Altaic language family which includes Uralic and originated in the West Liao river area. The earliest West Liao samples are dated 6500-5000 BP and they belong to the Chinese N branch which is not N-F1419 but N-F2905 and it diverged from the rest 18 000 years ago. At the moment, the oldest N is from the Baikal area, Shamanka EN DA245, 7100 BP.

"Most present-day populations in which C-M48 is frequently observed also contain members of N-F1419"

We have already concluded that C-M48 is a young Bronze Age branch and it cannot be connected with the Chinese Neolithic. West Liao cultures have yielded the so-called C3e. If N-F1419 and C-M48 were together in some other place, which is not proven in ancient DNA, it is nevertheless important to add that the Uralic populations carry 0% of C-M48. It seems that at the moment the only ancient C-M48 is a medieval nomad from Boz-Adyr, Batken, Kyrgyzstan, DA106, C2b1a2a2a-Y12825.

Kristiina said...

I checked the data from Karmin et al. (2015). The age of C3c is older in Karmin et al. because there is a bifurcation between the Koryak line and the rest. I cannot identify this Koryak line on yfull, so it may be missing. However, the age of the expansive branch (C2b1a2a-M77) that spread the Tungusic languages, though probably not only those, is still only between 1,735 and 4,030 years according to Karmin et al. as well. The other branch, with a deeper age, is probably related to the expansion of the Chukchi-Kamchatkan languages, and in particular to the history of the Koryaks.

Supplemental Figures.pdf