
Tuesday, August 19, 2014

Complex paternal origins of the Han Chinese

There's an intriguing new paper at the AJHB on the paternal ancestry of a population from Iron Age China. It argues that the Han Chinese are the result of fairly recent admixture events, with Y-chromosome haplogroup Q1a1 entering the ancestral territory of the Han, the Central Plain of China, only around 3,000 years ago from the northwest. It's probably a sign of things to come, not only for the Han but many populations generally thought to be genetically homogeneous.

Note also how the Y-chromosome haplogroups appear to be associated with different burial customs and inferred social status. Q1a1 was found in the remains of three aristocrats and eight commoners, most of them buried in the extended prostrate position typical of Bronze and Iron Age steppe nomads of what is now western China. Most of the other remains were buried in the extended supine position, characteristic of the populations of the Chinese Central Plain at the time. I've put the details into a spreadsheet here.

It'll be interesting to learn about the genome-wide genetic structure of the people who introduced haplogroup Q1a1 into the ancestral Han gene pool. Were they perhaps in large part of Ancient North Eurasian (ANE) origin? I say this because Q is the most common Y-chromosome haplogroup in the Americas, where ANE peaks today. It's also the sister clade of haplogroup R, which is the paternal marker of Mal'ta boy, or the MA-1 genome, the main reference sample for ANE.

Indeed, haplogroup R was expanding in a big way across Europe and West and Central Asia at about the same time as Q1a1 in China. It also probably came from the steppe and was in all likelihood associated with the spread of ANE deep into Europe.

Objectives: Y chromosome haplogroup Q1a1 is found almost only in Han Chinese populations. However, it has not been found in ancient Han Chinese samples until now. Thus, the origin of haplogroup Q1a1 in Han Chinese is still obscure. This study attempts to provide answer to this question, and to uncover the origin and paternal genetic structure of the ancestors of the Han Chinese.

Methods: Eighty-nine ancient human remains that were excavated from the presumed geographic source of the Han Chinese and dated to approximately 3,000 years ago were treated by the amelogenin gene polymerase chain reaction test, to determine their sex. Then, Y chromosome single nucleotide polymorphisms were subsequently analyzed from the samples detected as male.

Results: Samples from 27 individuals were successfully amplified. Their haplotypes could be attributed to haplogroups N, O*, O2a, O3a, and Q1a1. Analyses showed that the assigned haplogroup of each sample is correlated to the suspected social status and observed burial custom associated with the sample.

Conclusions: The origins of the observed haplotypes and their distribution in present day Han Chinese and in the samples suggest that haplogroup Q1a1 was probably introduced into the Han Chinese population approximately 3,000 years ago.


Yong-Bin Zhao et al., Ancient DNA evidence reveals that the Y chromosome haplogroup Q1a1 admixed into the Han Chinese 3,000 years ago, American Journal of Human Biology, Article first published online: 18 AUG 2014, DOI: 10.1002/ajhb.22604

See also...

Lots of ancient Y-DNA from China

First genome of an Upper Paleolithic human (Mal'ta boy)

Ancient human genomes suggest (more than) three ancestral populations for present-day Europeans


Chad Rohlfsen said...

Don't forget it's the only known source of ANE in mesolithic Scandinavia.

Chad Rohlfsen said...

Seima-Turbino perhaps....

Davidski said...

How about Ordos?

They were partly West Eurasian, it seems.

Chad Rohlfsen said...

I think everyone in Siberia has West Eurasian ancestry, since at least 24kya. There may have been an almost uniform Amerindian people from near the Altai to the tip of South America, 15kya, as the Ket are nearly identical to Kaitiana in the k6 run.

Chad Rohlfsen said...

Karitiana, excuse the phone typos

Chad Rohlfsen said...

It very well could be the Ordos. It could be a whole lot of different people. Even Papuans have 1% ANE, so they got around.

spagetiMeatball said...

But there's no ANE detected in most Chinese populations, right?
Except those like Uyghurs, I mean.

Davidski said...

The Dai show a bit of ANE in Treemix. Laz et al. found just over 1%, and I sometimes see more than that.

So the Han, especially the northern Han, might have some cryptic ANE.

Matt said...

I don't know uniparental haplogroups, but the distribution of Q does seem to match ANE better than the Northern East Asian component. Presence of ANE could be there as a patriline (maybe one that introduced a few useful genes or cultural ideas), perhaps via a Siberian intermediary (we know mixed EA-ANE populations exist in Siberia today), then autosomally reduced to almost nothing.

I think we'll find when we get adna in China that China isn't admixed with any clades as divergent as BEA and WHG/ANE are supposed to be, but will be between different Eastern Non-African populations which are quite divergent (maybe as divergent as WHG and ANE?). These ENA subpopulations may have left greater echoes in present day populations at the extremes of the range like Sherpas/Tibetans and Ryukyuans/Japanese.

andrew said...

"Samples from 27 individuals were successfully amplified. Their haplotypes could be attributed to haplogroups N, O*, O2a, O3a, and Q1a1. Analyses showed that the assigned haplogroup of each sample is correlated to the suspected social status and observed burial custom associated with the sample."

I would love to hear more details on the social status sorting which implies multiple and not just one instance of admixture if there are more than two burial customs.

Davidski said...

I've updated the entry with more details.

As I anticipated, Y-HG Q in this early Chinese sample seems to be associated with aristocrats and commoners largely of steppe origin. None of the individuals belonging to Q were slaves.

spagetiMeatball said...

Altogether R represents about 80% of European Y-DNA haplogroups, right? And yet ANE only reaches levels on the order of 10-15%...

There's a similar discrepancy in all South Asian, Central Asian and Near Eastern groups carrying R and Q.

It's looking more and more now like the peripheries of Eurasia (Europe, Near East, East Asia) were population sinks, and there were constantly fresh waves of invaders from the steppe, shaking things up in a big way.

Ebizur said...

Two comments on your spreadsheet:

*Q1a1 (I suppose this refers to Q1a1a1-M120, a typically East Asian subclade of Q) is the most common haplogroup in this sample, even if one merges O3a3 and O3a(xO3a3) into a single category. Is the identification of this archaeological population as "ancestors of the Han Chinese" really secure?

*Only 2/3 Q1a1 "aristocrats" and 5/8 Q1a1 "commoners" were buried in a prostrate position. The remainder were buried in a supine position. Thus, it is correct that a majority of the sampled remains of Y-DNA Q1a1 bearers were found in a prostrate position of burial, but it is only two thirds of the whole. One third were found in a supine position of burial instead. With a sample size of only eleven, divided between eight "commoners" and three "aristocrats," I am not sure that one can make any robust inference about the burial traditions of the ancestors of the Q1a1a1-M120 population.
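
The small-sample concern can be made concrete with a confidence interval for the prostrate-burial share. What follows is only a minimal sketch: it pools the aristocrat and commoner counts quoted above and uses the Wilson score interval, which is one common choice for small samples and an assumption here, not anything taken from the paper. The function name `wilson_interval` is illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Counts quoted in the comment: 2 of 3 aristocrats and 5 of 8 commoners
# with Q1a1 were buried prostrate.
prostrate = 2 + 5
total = 3 + 8

low, high = wilson_interval(prostrate, total)
print(f"{prostrate}/{total} = {prostrate / total:.1%}")
print(f"approximate 95% interval: {low:.0%} to {high:.0%}")
```

With eleven burials the interval runs from about 35% to about 85%, so the data alone do not rule out a prostrate share well below one half.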

Davidski said...

"I am not sure that one can make any robust inference about the burial traditions of the ancestors of the Q1a1a1-M120 population."

We can because the paper mentions that two more or less contemporaneous samples of nomads from northwest China carrying Q1a and Q1b were buried in the prostrate position.

Of course, when these nomads moved into the Central Chinese Plain they became acculturated to the local ways after a while, which is probably why they were buried as locals.

Davidski said...

By the way, how do you figure that this isn't an ancestral Han population when Q1a1 is a Han-specific marker today?

Ebizur said...

Davidski wrote,

"We can because the paper mentions that two more or less contemporaneous samples of nomads from northwest China carrying Q1a and Q1b were buried in the prostrate position."

Were those Q1a individuals also members of the Q1a1a1-M120 subclade, or is their phylogenetic position downstream of Q1a unknown?

Davidski wrote,

"By the way, how do you figure that this isn't an ancestral Han population when Q1a1 is a Han-specific marker today?"

It is not a Han-specific marker today. It has been confirmed in samples of Hmongs from Laos, Japanese, Koreans, and Tibetans (including Dbus Tibetans from Lhasa and Khams Tibetans from Xinlong County of western Sichuan), and it may be spread even more widely since many haplogroup Q individuals in samples of Uyghurs, Mongols, Vietnamese, Thais, etc. have not been tested for the M120 SNP. Of course, its frequency in each of these populations is quite low, and they all have interacted with the Chinese historically, so it is possible that it might reflect migration of Chinese to peripheral regions.

However, the frequency of Q1a1a1-M120 even in modern Han Chinese is not very high, so it is curious that an archaeological population that contains such a high frequency of Q1a1a1-M120 would be identified as ancestral Han Chinese. Perhaps these archaeological samples might represent some ethnic group that has been assimilated by ancestors of the Han Chinese in the process of the latter group's expansion, but I doubt that they could represent the mainstream ancestors of the Han Chinese.

Daniel Szelkey said...

Q1a2 is found in the Turkmens. Q1a2 is also found in the Hazara; the entire subclade that was called R1b1a1 has been relabeled as Q1a2. Why do I think this? Because according to
Q1a2 is found in the Turkmens at almost 40% while R1b is very rare, and in the past you all thought the Turkmen had lots of R1b.
The evidence strongly indicates (although not 100% yet) that R1b1a1-M73 does not exist and is really Q1a2-p25. If this is the case then Q1a2-m25 (2011 name) is defined by the presence of SNP marker M73. It has been found at generally low frequencies throughout central Eurasia,[25] and in Altay,[26] but has been found with relatively high frequency among particular populations there including Hazaras in Pakistan (8/25 = 32%);[27] and Bashkirs in Bashkortostan (62/471 = 13.2%), 44 of these being found among the 80 tested Bashkirs of the Abzelilovsky District in the Republic of Bashkortostan (55.0%).[28] Four R-M73 men were also found in a 523-person study of Turkey,[4] and one person in a 168-person study of Crete.[29]

A large proportion of Kumandin Y chromosomes belong to R-M73 which is largely restricted to Central Asia[30] but has also been found in Altaian Kazakhs and other southern Siberians.[31][32][33]
Ironically R1b-P312 is found in the Bashkirs, but if I am right (this is just a theory) then some Q-rich populations exist in Europe, which may suggest that R was in ancient East Euro hunter gatherers. I could be wrong; however, one thing is certain: in the Turkmens, R1b has been confused with Q or vice versa.

Davidski said...

The ancient steppe samples from northwestern China are just listed as Q1a, while another population of nomads from the Central Plain, near the Hengbei site, are listed as Q1a1. So it looks like a specific subset of the steppe nomads introduced Q1a1 to the Chinese Central Plain. This is obviously a very reasonable conclusion.

Also, no one's saying that this ancient Hengbei population is the one that spawned all modern Han Chinese groups. The idea is that it's broadly an example of a founding Han Chinese population of mixed steppe and Central Plain origin.

There's no reason to expect that this population should have exactly the same subclades of Y-chromosome haplogroups as the present-day Han Chinese, nor that these markers ought to be found at comparable frequencies. The fact that the haplogroups, archaeology and geography match is enough, because haplogroups mutate, and their frequencies can change drastically due to founder effects and drift.

Daniel Szelkey said...

This would not be the first time that Y-DNA markers that are somewhat closely related have been mixed up. For example, in the past everyone WAS whining about how I-Y-DNA was found in Iran and Dagestan at high frequencies. It turned out that this I-HG was really J1. Please give me evidence whether this -M73 is R1b or Q.

ryukendo kendow said...

@ Davidski

The ethnic composition of China down to the Zhou period was extremely complex. Both historical records and archaeology bear this out. The Shang and Zhou themselves were very different culturally, with a completely different pantheon. The Sinitic expansions behind these empires saw the imposition of a Sinitic-speaking aristocracy on local populations, and a standardisation of elite funerary practices and giant bronze ritual vessels in a feudal, hierarchical, imperial framework, while the local commoner culture often differed considerably. Cf. Paul Benedict, or any annals of the period.

Of the Iron Age Warring States, the Chu, Wu and Yue states in the south were openly acknowledged to have strong non-Sinitic elements. The commoners of the latter even tattooed their bodies and lived in stilt houses, considered barbaric by the rest. The Zhongshan, Guzhu and Yan states on the northeastern steppe border were similar. Van Driem, Pulleyblank, etc. generally propose that what remained of proto-Koreanics, proto-Tungusics/Mongolics, or Yayoi types comprised the commoners for these. The State of Qin, ditto.

The books even recount Zhao state adopting the dress of steppe nomads to better the skills of their cavalry to drive the nomads away. The slim fit shirt has a long history in China lol.

In fact Caucasoid skulls were found sacrificed in the Shang capital at Anyang, probably POWs. In all probability they introduced the chariot to the Chinese as well. In any case Chinese has IE loanwords at the proto-language level.

蜜 mi, 'honey', Middle Chinese 'miet' -- English 'mead', Sanskrit 'madhu' - sweet, Greek 'μέθυ'

獅子 shi zi 'lion' Middle Chinese 'sei tse' -- Tocharian 'sisak', 'sesake' - lion

The Rong and Di, appellations for that miscellany of semi-pastoralist ethnic groups in early Chinese history in the north, were completely assimilated by the Han period and vanish from history. Giant blenders Han-ised everyone by that time.

So IE and Sino-Tibetan go back a long way.

@ Ebizur

It has long been noticed that the Chinese seem to have an inordinate fondness for sheep/goats.

Here are some characters, traced back to bronze age script, with a decomposition:
羊 yang: sheep
美 mei: beautiful -- 'big' + 'Sheep' = 'beautiful'
羏 yang: excellent, noble -- 'Sheep' + jumble = 'Excellent'
羡 xian: admiration, envy, covetousness --'sheep' + 'occasion' = 'admiration, envy'
and I find this quite funny -
着 kan: look -- 'sheep' + 'eye' = 'look' !!

Similar to the Sanskrit word for war, gavisti - 'desire for cows', 'searching for cows'. Very revealing about their way of life.

Pulleyblank, Benedict, etc. have used this to suggest that the ancestral Chinese were sheep/goat pastoralists on the Tibetan foothills, like the Tibetans, Qiang and other Sino-Tibetan groups are even today. They subsequently invaded and imposed themselves on the Central Plains agriculturalists, who were probably Hmong-Miens/Austroasiatics/Tai+pre-proto-Austronesians. We have no records of the Shang doing this, but for the Zhou, who are recorded as a confederacy coming from the west, we might, depending on interpretation.

So it makes sense for them to have interacted with another pastoralist group further west, explaining the loanwords and the genes, before spreading eastwards.

The Chinese themselves would never propose such a thing lol.

Chad Rohlfsen said...

R1b-m73 is Q? Please cite the source. If that's true, I wonder if any or all of m269 will get pulled later on.

Daniel Szelkey said...
@Chad Rohlfsen: "R1b-m73 is Q? Please cite the source. If that's true, I wonder if any or all of m269 will get pulled later on."
This is brainstorming.
In the past it was thought that the Turkmens had 35% R1b; this has been entirely disproven, as it was relabeled as Q1a2-M25. I know for a fact that the Turkmen do NOT have R1b. I thought this was because of confusion between R1b-P25 and Q1a-M25, which is the brother of this ancient Chinese Q (Afanogora is also Q1a).

After looking at the data in figure S7 from

I see that 23 out of 74 Turkmen have Q-M25 0 have R1b-M478/M73

Of these two the Hazara only have

The mongols have both!

If this is true then the Turkmen don't have R1b while the Altai do.

However, I was thinking that it is possible (but not likely) that these two would be the same if M73 is a subclade of Q-m25 which was incorrectly labeled as R1b.

I wish someone who knew about genetics could show me that the R1b-m73 in the Hazara from

is genuine while also showing evidence that by using the same methodologies the Q-M25 (formerly known as R1b) in the Turkmen is not.

Daniel Szelkey said...

When I say "don't have", I mean generally have at less than 5%.

Ebizur said...

Thank you for the link, Daniel.

Mongol/Central Mongolia (DiCristofaro et al. 2013)
1/18 = 5.6% C2e1a-M407
4/18 = 22.2% C2b2a-M86
3/18 = 16.7% C2-M532
1/18 = 5.6% C2-M401
2/18 = 11.1% D1c1a-M533
1/18 = 5.6% J1a2b-Page8
1/18 = 5.6% N1c2b-P43
1/18 = 5.6% O3a2c1a-M117
1/18 = 5.6% Q1a1a1-M120
1/18 = 5.6% Q1a1b-M25
2/18 = 11.1% R1b1a1-M478/M73

Mongol/Northwest Mongolia (DiCristofaro et al. 2013)
3/97 = 3.1% C2-M386(xM407)
6/97 = 6.2% C2e1a-M407
29/97 = 29.9% C2b2a-M86
2/97 = 2.1% C2-M532
11/97 = 11.3% C2-M401
1/97 = 1.0% D1c1-P47(xD1c1a-M533)
1/97 = 1.0% I2a2-M436
1/97 = 1.0% J1a2b-Page8
2/97 = 2.1% J2a-M410(xJ2a1-Page55)
1/97 = 1.0% J2a1-Page55(xM530, M67)
1/97 = 1.0% J2a1-M530(xDYS445=6)
1/97 = 1.0% J2a1b-M67(xJ2a1b1-M92)
1/97 = 1.0% N1c2b-P43
12/97 = 12.4% N1c1-Tat
1/97 = 1.0% O-M175(xO1a-M119, O2a1-M95, O2b-M176, O3-M122)
3/97 = 3.1% O3-M122(xO3a1-KL2, O3a2-P201)
2/97 = 2.1% O3a2-P201(xO3a2c1-M134)
4/97 = 4.1% O3a2c1-M134(xO3a2c1a-M117)
3/97 = 3.1% O3a2c1a-M117
1/97 = 1.0% Q-M242(xQ1a1a1-M120, Q1a1b-M25, Q1a2-M346, Q1b1-M378)
1/97 = 1.0% Q1a1a1-M120
5/97 = 5.2% Q1a2-M346
2/97 = 2.1% R1a1a-M198/M17
2/97 = 2.1% R1b1a1-M478/M73
1/97 = 1.0% R1b1a2a-L23

Mongol/Southeast Mongolia (DiCristofaro et al. 2013)
2/23 = 8.7% C2e1a-M407
1/23 = 4.3% C2b2a-M86
1/23 = 4.3% C2-M532
1/23 = 4.3% C2e-M546(xC2e1a-M407)
6/23 = 26.1% C2-M401
2/23 = 8.7% D1c1a-M533
1/23 = 4.3% G1-M285
1/23 = 4.3% N1c2b-P43
1/23 = 4.3% O1a-M119
2/23 = 8.7% O3a2c1-M134(xO3a2c1a-M117)
1/23 = 4.3% O3a2c1a-M117
1/23 = 4.3% Q-M242(xQ1a1a1-M120, Q1a1b-M25, Q1a2-M346, Q1b1-M378)
1/23 = 4.3% R2a-M124
1/23 = 4.3% R1b1a1-M478/M73
1/23 = 4.3% R1b1a2-M269(xR1b1a2a-L23)
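
The percentages in these lists follow directly from the raw counts, so they are easy to recompute or extend. A minimal sketch, using the Central Mongolia counts as transcribed above; the helper name `frequencies` and the dictionary layout are illustrative assumptions, not anything from DiCristofaro et al.

```python
def frequencies(counts):
    """Return each haplogroup's share of the sample as a percentage (1 decimal)."""
    total = sum(counts.values())
    return {hg: round(100 * n / total, 1) for hg, n in counts.items()}

# Central Mongolia counts as listed in the comment above.
central_mongolia = {
    "C2e1a-M407": 1,
    "C2b2a-M86": 4,
    "C2-M532": 3,
    "C2-M401": 1,
    "D1c1a-M533": 2,
    "J1a2b-Page8": 1,
    "N1c2b-P43": 1,
    "O3a2c1a-M117": 1,
    "Q1a1a1-M120": 1,
    "Q1a1b-M25": 1,
    "R1b1a1-M478/M73": 2,
}

sample_size = sum(central_mongolia.values())
freqs = frequencies(central_mongolia)

# Pool every subclade whose label starts with "Q" to get the total share of Q.
q_total = sum(n for hg, n in central_mongolia.items() if hg.startswith("Q"))

print(f"n = {sample_size}")
for hg, pct in freqs.items():
    print(f"{central_mongolia[hg]}/{sample_size} = {pct}% {hg}")
print(f"all Q subclades: {q_total}/{sample_size}")
```

The same function can be pointed at the Northwest and Southeast Mongolia counts to check that each list sums to its stated sample size.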

I see that the Mongols possess a dizzying array of Y-DNA haplogroups, including Q1a1a1-M120 (found in Central Mongolia and Northwest Mongolia). Q1a1a1-M120 was found alongside its sister clade, Q1a1b-M25, in Central Mongolia, and alongside Q1a2-M346 and Q-M242(xQ1a1a1-M120, Q1a1b-M25, Q1a2-M346, Q1b1-M378) in Northwest Mongolia.

Davidski, do you believe these Mongols are descendants of Han migrants to Central and Northwest Mongolia, or do you believe that Q1a1a1-M120 originated outside of China and migrated into southern Shanxi (the location of the Hengbei burials) as an already-formed haplogroup?

Matt said...

@ spagetiMeatball - It's looking more and more now like the peripheries of eurasia (europe, near east, east asia) were population sinks, and there were constantly fresh waves of invaders from the steppe, shaking things up in a big way.

At the same time, the Uyghurs are closer to 1/3 AME, 1/2 proto-Japanese and 1/6 ANE for a reason....

If anything, Central Eurasia would seem to have less genetic continuity than the "periphery" (not counting movements within the periphery)... So thinking of Central Asia as a source and Coastal Eurasia as a sink seems not quite right.

About the aristocratic status of Q1a1, it's perhaps interesting that the aristos' burials also had N. Suggestive that whatever might have come off the steppe was Siberian, not anything with another Western ANE connection.

andrew said...

The bottom of the social hierarchy in this time period seems to be made up of O* and O2a. These generally have a geographic association with what is now Southern China prior to Han conquest and SE Asia, consistent with the historical fact of Northern Chinese Han conquest of the South that would have resulted in the assimilation of these populations as slaves and/or inferiors.

I don't recall if family arrangements around this time were patrilineal or matrilineal. In a patrilineal scheme the common Q1a1 individuals could have been bastards. In a matrilineal scheme, links between Y-DNA and social class would be only due to assortative mating.

Daniel Szelkey said...

The Bashkirs are probably dominated by Q1a2 (given the ancient Chinese had Q1a1), and this is why.
Bashkir R1b is mostly so-called R1b-m73. This was first reported in the Hazara in 5 out of 25 samples in 2006. A 2010 study with much better phylogeny by Behar found that these 5 R1b-m73 Bashkirs belonged to PQR2-M73 (or Q-M73, as they were not R1). The only logical explanation for this is that (Q/R1B)-m73 is really just a subclade of Q-m25 which was incorrectly labeled as R1b and then had its subclade (Q-M73) incorrectly labeled as R1b by a lab in India.

$$$Dollar$$$ said...

Hi David:

I am O2A1 myself. When you took your own look at my results you came up with some degree of South-East Asian admixture. My last name is typical Chinese from the Hong Kong region. Is it possible that the Asian is, in fact, derived from the Tanka people? Are these folks represented in your samples?


Davidski said...

I don't have any Tanka samples. But it's very difficult to differentiate ancestry from the various ethnic groups in Southern China. They're all very similar in terms of genome-wide structure.

The best way to track the origins of your paternal ancestry to a specific ethnic group like that might be to get a full Y-chromosome scan.

Lathdrinor said...

@Ebizur the authors of the paper are not saying that Hengbei is *the* mainstream ancestor of the Han Chinese, but that it is *one* of the important ancestors of the Han Chinese and the primary source of Q1a in Han Chinese. The paper's statement that 'Q1a1 was introduced into the Han Chinese population ~3,000 years ago' indicates that it agrees with your assessment that Q1a1 is not the mainstream Han Chinese haplogroup but an appendage. Within the genetics community in China, it is widely agreed that O3 is the mainstream Han Chinese haplotype.

Ebizur said...


The same authors in one of their Chinese papers have argued based on mtDNA that the people of the Hengbei site are the most similar (exhumed and tested) ancient population to modern Han Chinese. They clearly have a hypothesis that the people buried at the Hengbei site are central to the formation of the historical and present Han ethnos (cf. "Eighty-nine ancient human remains that were excavated from the presumed geographic source of the Han Chinese..."). I just find this idea curious, considering the fact that the majority of the tested male specimens from the Hengbei site belong to haplogroup Q-M120, which has been completely absent from some samples of modern Han Chinese and very rare in many others. Q-M120 also has been found in samples of Hmongs in Laos, Japanese in eastern Japan, Koreans in Korea, Tibetans in western Sichuan and in Lhasa, and Mongols in central and northwestern Mongolia, and it potentially may be found in many other populations of Asia, but I do not recall having ever seen any estimate of the TMRCA of all Q-M120 Y-chromosomes.

It is true that the general area of the Hengbei site (Henan, southern Shanxi, southern Shaanxi, etc.) is considered on archaeological and historical grounds to be the geographic source of "Chinese culture." Considering all the facts together, I think that this site at Hengbei most likely should represent a burial ground for the members of some invading foreign (originally non-Han Chinese) population, so I am really curious about any material artifacts that have been found with these human remains and what they might suggest about the cultural background of the deceased.

Lathdrinor said...

@Ebizur without having the authors' other paper on hand, I'm not able to confirm what their particular bias is. I certainly agree that the Hengbei site does not represent the general population of the Central Plains, and that this is a well known fact archaeologically. I have posted details regarding it @ Dienekes, but it has not yet shown up. For the time being, I encourage you to read the following paper about the site from which these samples came: It ought to confirm your belief that the Hengbei site is, in fact, comprised of migrants from outside of China, though they were not necessarily invaders, judging by their status as vassals. It was common practice in ancient China to resettle defeated tribes in China as a reward for surrender and as a way to pacify them.

Ebizur said...


A new paper regarding the phylogeny of haplogroup O-M95 was published a couple months ago. Have you seen it?

Zhang X, Kampuansai J, Qi X, Yan S, Yang Z, et al. (2014) An Updated Phylogeny of the Human Y-Chromosome Lineage O2a-M95 with Novel SNPs. PLoS ONE 9(6): e101020. doi:10.1371/journal.pone.0101020

Ebizur said...

Also, the position of the ancient Taojiazhai population on the PCA plot is very curious. It is in a somewhat intermediate position between the North China cluster (ancient Hengbei plus modern Han of North China, Inner Mongolia, Liaoning, and Shanghai) and the modern Central China cluster, but, if I remember correctly, the Taojiazhai site is a set of Han period tombs located in Xining, Qinghai.

So, we have two ancient samples, including the Xiongnu sample, plus two modern samples (Qinghai Han and Gansu Han) grouping together as a "Northwest China" cluster, but one ancient sample from the same geographic region (Taojiazhai of the Han period) being very distinct and closer to modern Han of North/Central/East China. Perhaps modern Han in Qinghai and Gansu are primarily descendants of assimilated Xiongnu and other non-Han peoples, with the Taojiazhai site representing Han colonists from the east who have managed to Sinicize much of the region in classical antiquity. There are still a lot of divergent (i.e. probably long-isolated) Turko-Mongol minority groups in the region (Salar, Yugur, Bao'an/Bonan, Dongxiang/Santa, Tu/Monguor, etc.), and, a little further to the south, there are many extremely divergent Sino-Tibetan groups (the so-called Qiangic peoples).

Ebizur said...

I'm sorry for the multiple posts, but I thought I should try another way of summarizing the analysis of mtDNA of many ancient and modern samples from China plus one ancient sample from Mongolia (Xiongnu) that these authors have published previously in Chinese:

Niuheliang (site of the Hongshan culture located in Lingyuan, westernmost Liaoning Province of southern Manchuria, approx. 5000 YBP): in no-man's land, nearest to "Northwest China" (Xiongnu/Turko-Mongol?) cluster on PC1, nearest to "Central China" (original Han?) cluster on PC2

Taosi (site of Late Longshan culture in Xiangfen County, southern Shanxi in North China, approx. 4000 YBP): falls within the "Northwest China" (Xiongnu/Turko-Mongol?) cluster on PC1, intermediate between the "North China" (northern barbarian-influenced later Han?) cluster and the "Central China" (original Han?) cluster on PC2, but somewhat closer to the former

Hengbei (site of proto-historical Peng state, vassal of Zhou, in Hengbei Village, Jiang County, Yuncheng, southern Shanxi, ca. 3000 YBP): an eccentric member of the "North China" cluster, at the "Northwestern" extremity of that cluster (alongside a sample of modern Han from Shaanxi and a sample of modern Han from Shandong). Intermediate between ancient Xianbei+Niuheliang and ancient Taojiazhai in regard to PC1, and intermediate between ancient Xiongnu+Gansu/Qinghai and ancient Xianbei in regard to PC2.

Laborers of the Tomb of the First Emperor of Qin (workers who were involved in the construction of the mausoleum, located near Xi'an in southern Shaanxi, of the King of Qin, first Emperor of China, and mostly murdered/"sacrificed" after their work was complete; ca. 2200 YBP): falls within the "South China" cluster, which otherwise consists of samples of modern Han from southern China. It is the most "eastern" (highest value for PC1) and most "southern" (lowest value for PC2) of all the ancient samples in this study. Perhaps the tombbuilders were, like many builders of the Great Wall, forcibly brought from the recently annexed territory of the erstwhile Kingdom of Chu in the Yangtze basin.

Taojiazhai (graveyard of the Han period located in Xining, Qinghai; ca. 2000 YBP): in no-man's land between the "North China" (northern barbarian-influenced later Han?) and "Central China" (original Han?) clusters. Among the ancient samples, it is intermediate between the Hengbei sample and the First Emperor tombbuilder sample in regard to PC1, and intermediate between the Taosi sample and the Niuheliang sample in regard to PC2, though somewhat closer to the former.

Ebizur said...

Xiongnu (a sample of ancient remains from the territory of modern Mongolia that have been associated with the ancient Xiongnu): falls within the "Northwest China" cluster. Minimal value for PC1 (i.e. "westernmost") among all the ancient samples. Second highest value for PC2 (i.e. second "northernmost") among all the ancient samples.

Xianbei (nominal descendants of the classical Donghu; beginning in the late 3rd century and continuing through the 4th century CE, they founded many kingdoms in northern China): in no-man's land roughly between the "Northwest China" and "North China" clusters. Roughly the same as Niuheliang for PC1, between ancient Gansu-Qinghai/Taosi/Xiongnu on the "west" and ancient Hengbei on the "east." Falls between ancient Hengbei/Xiongnu on the "north" and ancient Taosi/Taojiazhai on the "south" in regard to PC2.

Gansu-Qinghai (ancient sample from the territory of Gansu and Qinghai provinces in Northwest China; I am unclear about the date or cultural affiliation of this sample): falls within the "Northwest China" cluster, together with the ancient Xiongnu sample from Mongolia and the modern Han samples from Qinghai and Gansu. Very close to ancient Taosi in regard to PC1, intermediate between ancient Xiongnu on the "west" and ancient Niuheliang/Xianbei on the "east," though closer to the former. Maximal value for PC2 (i.e. "northernmost") among all the ancient samples.

They have performed an admixture analysis according to which the "modern Han" (現代漢族) may be modeled as descendants of the ancient samples in the following proportions:

Hengbei ancient population: 0.396717
Taojiazhai ancient population: 0.382471
Gansu-Qinghai region ancient population: 0.214617
Xianbei population: 0.183724
Taosi ancient population: 0.0654621
Niuheliang ancient population: 0.0560747
Xiongnu population: -0.299065

It does not make sense to me why they would model the modern Han as having 29.9% negative admixture from the Xiongnu of ancient Mongolia. Such a thing is not logically possible, but I suppose it might be hinting at a contribution to the ancestry of the modern Han from a population whose mtDNA profile is diametrically opposed to the mtDNA profile of the ancient Xiongnu sample from Mongolia. Perhaps a sample of ancient mtDNA from southern China might help to clarify this matter.
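
One way to read these numbers: the seven coefficients add up to almost exactly one. That would be consistent with (though it does not prove) a fit that forces the weights to sum to one without also forcing each weight to be non-negative. A minimal sketch checking that arithmetic, using only the values listed above; the variable names are illustrative and nothing here reproduces the authors' actual method.

```python
# Admixture coefficients as quoted above (ancient source -> weight).
coefficients = {
    "Hengbei": 0.396717,
    "Taojiazhai": 0.382471,
    "Gansu-Qinghai": 0.214617,
    "Xianbei": 0.183724,
    "Taosi": 0.0654621,
    "Niuheliang": 0.0560747,
    "Xiongnu": -0.299065,
}

# Sum of all weights, including the negative one.
total = sum(coefficients.values())

# Sources with a negative weight, and the sum of the positive weights alone.
negative = [name for name, w in coefficients.items() if w < 0]
positive_total = sum(w for w in coefficients.values() if w > 0)

print(f"sum of all coefficients: {total:.6f}")
print(f"sources with negative weight: {negative}")
print(f"sum of positive coefficients only: {positive_total:.6f}")
```

If that reading is right, the positive weights overshoot one by about 0.3 and the Xiongnu term pulls the total back down. The negative value would then be an artifact of the fitting procedure rather than a literal ancestry proportion.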

It is also very curious that they have found that the population of Taosi (southern Shanxi ca. 4000 YBP) has bequeathed such a small contribution to the mtDNA of the modern Han, whereas they have found that the population of Hengbei (southern Shanxi ca. 3000 YBP) has bequeathed the greatest contribution to the mtDNA of the modern Han. Both samples are from the same geographic area, but the Hengbei sample is about 1000 years more recent. The Hengbei sample is shifted a great deal toward the "east" and somewhat less (but also significantly) toward the "north" relative to the Taosi sample on the PCA plot.

Lathdrinor said...

@Ebizur mtDNA, Y-DNA, and aDNA are all required to talk about the source of a population, and in this case it is especially important because there is a drastic difference between northern and southern Han when it comes to mtDNA, and a drastic difference between Turko-Mongols and Han when it comes to Y-DNA. There is not, however, as drastic of a difference between the Y-DNA of northern and southern Han, and there is not as drastic of a difference between the mtDNA of northern Han and Turko-Mongols, and for that matter between southern Han and southern aboriginals.

Men and women expand differently, and ethnicity is not transferred equally between the two genders. Taosi and Taojiazhai provide great examples of this point: from what I know, all the Y-DNA tested from these sites ended up being subclades of O3. Thus, though maternally they look rather different, paternally the link becomes obvious. And the same goes for Hengbei - maternally it looks to be the source of the Han population, yet paternally there is a definite gotcha moment.

Ebizur said...


Yes, of course. I have been rambling solely about mtDNA because that is what has been provided in that paper.

Considering the ancient Y-DNA analysis provided in the present paper by the same authors in conjunction with an earlier study of ancient Y-DNA by Li Hui et al. (2007), one may note that both the Y-DNA and the mtDNA of the Hengbei site are distinct from those of the Taosi site of roughly 1000 years earlier in the same area.

The earlier Taosi site (Late Longshan, ca. 4000 YBP) contains 3/5 O3-M122(xM7, M134), 1/5 O3a2c1-M134, and 1/5 "Missing" (no amplifiable product could be obtained). The later Hengbei site (Western Zhou, ca. 3000 YBP) contains 11/27 Q1a1, 5/27 O3a3 (?=O-M134), 4/27 O3a(xO3a3) (?=O3a-M324(xM134)), 4/27 O(xO2a, O3a), 2/27 O2a, and 1/27 N as far as I can tell from Davidski's spreadsheet. Only O-M122(xM134) and O-M134 seem to have been found at both sites.
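The counts above can be turned into frequencies with a few lines of Python. This is only a sketch using the haplogroup counts exactly as listed in this comment; the labels are copied as written, including the uncertain ones, and the helper name `frequencies` is illustrative.

```python
from fractions import Fraction

# Y-DNA counts as listed in the comment above.
taosi = {"O3-M122(xM7, M134)": 3, "O3a2c1-M134": 1, "Missing": 1}
hengbei = {
    "Q1a1": 11,
    "O3a3": 5,
    "O3a(xO3a3)": 4,
    "O(xO2a, O3a)": 4,
    "O2a": 2,
    "N": 1,
}

def frequencies(counts):
    """Return each haplogroup's share of the sample as an exact fraction."""
    total = sum(counts.values())
    return {hg: Fraction(n, total) for hg, n in counts.items()}

hengbei_freq = frequencies(hengbei)

# Combined share of all haplogroup O lineages at Hengbei.
hengbei_o_share = sum(f for hg, f in hengbei_freq.items() if hg.startswith("O"))

for hg, f in hengbei_freq.items():
    print(f"{hg:>14}: {f} = {float(f):.1%}")
print(f"all O lineages: {hengbei_o_share} = {float(hengbei_o_share):.1%}")
```

On these counts Q1a1 is the single most common lineage at Hengbei (11/27), while the various O lineages together still make up a little over half of the sample (15/27).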

The "Genetic Exploration of the Origin and Development of the Han Ethnos" paper shows that the shift of the mtDNA profile from 4000 YBP Taosi to 3000 YBP Hengbei has been toward modern Guangdong, Yunnan, Jiangxi, Fujian, and Guangxi Han along a vector that contrasts those populations with modern Qinghai and Gansu Han as well as ancient Xiongnu from Mongolia, some population(s) from ancient Gansu and Qinghai, and the Taosi population itself. It also has shifted, though to a somewhat lesser degree, toward modern Qinghai, Gansu, Shaanxi, Shanxi, and Shandong Han and the ancient Xiongnu, Gansu-Qinghai population, and Xianbei along a vector that contrasts those populations with modern Guangxi, Guangdong, Fujian, Hubei, Yunnan, and Zhejiang Han as well as the laborers used to construct the mausoleum of the First Emperor.

Unfortunately, we have no sample from an ancient contemporary of Hengbei that has an even higher concentration of Q1a1 Y-DNA (i.e. a potential foreign source of this genetic influx that appears to have replaced a great deal of the Y-DNA of the Taosi/Late Longshan people of the area). We also have no sample of an ancient contemporary that has even higher values for both PC1 and PC2 in the PCA analysis of mtDNA by Zhao, Zhou et al. (2012) (in other words, roughly speaking, no example of a population whose mtDNA pool is even more "northeast-shifted" than the mtDNA pool of the Hengbei population).

The Hengbei population is about 2000 years more recent than the Niuheliang population, so I am curious about what the mtDNA profile of the people in Northeast China (a.k.a. Manchuria) and Korea might have looked like toward the end of the second millennium BCE. Of course, the mtDNA of the Xianbei sample, which is from a supposedly northeastern-influenced population of a much more recent era, is rather intermediate between the mtDNA of Niuheliang and Hengbei, and the ancient Y-DNA of the western Liao River basin (roughly the same as the area of the Niuheliang sample) during the period in question is, according to Cui et al. (2013), mostly N and O3 with a bit of C2, so that does not seem to be a likely source area for the Y-DNA.

If one considers the mtDNA profile of the Taosi sample to be a fluke caused by drift, sampling error, or whatever, then the mtDNA pool of the Hengbei population and the modern "North China" cluster appears to have developed from an admixture event between the "Northwest China" cluster (including ancient Xiongnu and Gansu-Qinghai populations) and the "Central China" cluster (including Taojiazhai as a bit of an outlier).

Ebizur said...

According to another Chinese paper by Zhang et al. (2009), 陶寺中晚期人骨的种系分析 ("Phyletic Analysis of Human Bones from Mid-to-Late Taosi"), which is cited by Zhao et al. (2012) as the source of their Taosi sample, the Taosi sample consists of the remains of 25 human individuals who belonged to the following mtDNA haplogroups:

1 B5 (sex undetermined, ash pit burial)
1 C (sex undetermined, grave burial)
1 D4 (female, ash pit burial)
3 D5 (one male, one female, one sex undetermined, all ash pit burials)
1 D4b1 (male?, ash pit burial)
1 D (male, ash pit burial)
1 F* (female, earth fill burial)
1 G2a (male, ash pit burial)
1 Y (male, grave burial)
7 M10 (two male, two female, three sex undetermined)
5 M7c (two male, one female, one considered likely to be male but uncertain, one sex undetermined)
2 M(xC, D, G, M10, M7c) (one male, one sex undetermined)

Apparently, there was a diachronic transition at the Taosi site, according to which burial in ash pits became progressively more common. One of the grave burials, IIM22 from "the middle period at Taosi" (陶寺中期), is noted for being "of the greatest scale" (规模最大), with the "richest grave goods" (遗物最为丰富) and the "owner of the tomb being placed in a coffin made in the form of a dugout canoe" (墓主被放置在一根整木挖凿出来的船形棺中). The owner of this tomb, a male, is the one who belonged to mtDNA haplogroup Y. The remains of three individuals, one male who belonged to haplogroup M7c, one female who belonged to haplogroup F*, and one individual of undetermined sex who belonged to haplogroup M7c, were recovered from "earth fill" (填土) in a pit that was later dug on top of this tomb.

Of the fourteen M(xC, D, G) individuals, five were recovered from "grave burials" and nine were recovered from "ash pits." The two M7c individuals recovered from "earth fill" above the haplogroup Y individual's tomb probably have been counted as "grave burials," judging from the fact that the F* female found in the same earth fill has been counted as such.

I cannot discern precisely what is responsible for the position of the Taosi sample on the PCA of mtDNA variation in Zhao et al. (2012). It was very low on PC1, which appears to correlate most strongly with longitude, and it was fairly average on PC2 (like the Taojiazhai sample), which appears to correlate most strongly with latitude. This would most likely position the Taosi sample somewhere around Tibet if the PCA plot were projected onto a map of East Asia.

In any case, the most conspicuous feature of the Taosi mtDNA sample is its extremely high frequency of M10 (7/25 = 28%).

Ebizur said...

The Taosi population of Late Longshan culture in southern Shanxi has an idiosyncratic mtDNA profile dominated by three subclades of haplogroup M: M10, D, and M7c. These three subclades alone account for 18/25 = 72% of the entire sample.

At an even lower resolution, the mtDNA pool of the Taosi site is odd for belonging almost entirely to haplogroup M: 22/25 = 88% M, 2/25 = 8% R (1 B5, 1 F), 1/25 = 4% N(xR) (1 Y).
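These tallies can be reproduced from the per-haplogroup counts given earlier in the thread. The Python sketch below uses the assignment of each haplogroup to M, R, or N(xR) described in this comment; the data structure and variable names are illustrative.

```python
from collections import Counter

# (haplogroup, count, macrohaplogroup) for the 25 Taosi individuals,
# following the list attributed to Zhang et al. (2009) in this thread.
taosi_mtdna = [
    ("B5", 1, "R"),
    ("C", 1, "M"),
    ("D4", 1, "M"),
    ("D5", 3, "M"),
    ("D4b1", 1, "M"),
    ("D", 1, "M"),
    ("F*", 1, "R"),
    ("G2a", 1, "M"),
    ("Y", 1, "N(xR)"),
    ("M10", 7, "M"),
    ("M7c", 5, "M"),
    ("M(xC, D, G, M10, M7c)", 2, "M"),
]

sample_size = sum(n for _, n, _ in taosi_mtdna)

# Tally by macrohaplogroup.
macro = Counter()
for _, n, group in taosi_mtdna:
    macro[group] += n

# Tally of the three dominant subclades: M10, D (all D lineages), and M7c.
d_total = sum(n for hg, n, _ in taosi_mtdna if hg.startswith("D"))
counts = {hg: n for hg, n, _ in taosi_mtdna}
top_three = counts["M10"] + d_total + counts["M7c"]

for group, n in macro.items():
    print(f"{group:>6}: {n}/{sample_size} = {n / sample_size:.0%}")
print(f"M10 + D + M7c: {top_three}/{sample_size} = {top_three / sample_size:.0%}")
```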

Modern Chinese, on the other hand, are only about 50% M on average (ranging from about 40% M in the south to about 60% M in the north). This might have something to do with the Taosi population's position very far from the modern Southern Han and the ancient First Emperor tomb builders on the PCA plot of mtDNA variation in 汉族起源与发展的遗传学探索 ("A Genetic Exploration of the Origin and Development of the Han Ethnos") by Zhao Yongbin et al. (2012), since modern Southern Han have high frequencies of mtDNA haplogroup R (B4, B5, B6, F1, F2, R9, etc.).

According to Yong-Gang Yao et al. (2002), "Phylogeographic Differentiation of Mitochondrial DNA in Han Chinese," the modern Han in Xining, Qinghai (location of the Han-era Taojiazhai site) are pretty much typical modern Northern Han, but they also have a Western Eurasian element (4/78 = 5.1% "West Eurasian haplotypes"), and they have the highest frequency of N9a (6/78 = 7.7%) and Y (3/78 = 3.8%) among the thirteen regional Han populations considered in that study. The frequency of Y is roughly the same as that of the ancient Taosi sample (1/25 = 4%). I know that N9a is also common in eastern and southern China, so I suppose that Western Eurasian mtDNA is probably the major distinguishing factor here, and that it is strongly negatively correlated with PC1 and positively correlated with PC2. That certainly would fit with geography.

At this point, I think it might make some sense to consider the Taojiazhai mtDNA profile as a starting point for Han mtDNA, with the variation among modern Han regional populations mainly being caused by historical admixture from people with a Turko-Mongol-like mtDNA profile (causing "northwest shift") or from people with a South China aborigine-like mtDNA profile (causing "southeast shift"). The apparent "northwest shift" is of greatest magnitude in Qinghai and Gansu Han, but also notable throughout northern China and even Shanghai. The apparent "southeast shift" is greatest in Guangdong and Guangxi Han.

On the other hand, the Taojiazhai site was located in Qinghai, in the extreme northwest. It is possible that it might have already been "northwest shifted" to some degree. In that case, the modern "Central China" cluster would best represent the mtDNA profile of the original Han Chinese, with the inferred Turko-Mongol-like influence on modern Han populations of northern China increasing and the South China aborigine-like influence on modern populations of southern China decreasing accordingly.

Lathdrinor said...

I find it difficult to understand the choice of Taojiazhai. As you said, it is liable to have been northwest shifted itself, and is from a geographic region of China that was not at all the center of the ancient Han culture. At the same time, the fact that the modern Han do not have the same mtDNA profile as Taosi does not make Taosi a bad candidate. Certainly, mtDNA is bound to change with the expansion of an ethnic population, especially when it is paternally driven as was the case for Han Chinese. To this end, I think we need to separate the issue of finding the mtDNA profile of the ancient Han and the issue of finding the various mtDNA sources of the modern Han. The two are not the same, and I'm not even sure that there is a 'primary' source population that contributed the bulk of mtDNA lineages in the modern Han. At best, it is a different primary source between the northern and southern Han.

Also, have you seen the mtDNA samples from Erlitou, Erligang, and Yinxu - sites that are closer to the proverbial 'center' of ancient Han culture than Taosi and Taojiazhai? I think your opinion, both of the distribution of mtDNA lineages in ancient China, and of the value of trying to find a primary source of mtDNA for the Han, ought to change after seeing the results. Unfortunately, I only browsed translated versions briefly at a poster session, but I'm sure that you're able to find them given your ability to read Chinese.

Adrian Yohanes Purnomo said...

So, do you mean that "Han" Chinese East Asian men of Y Hg O*-M175 and its subclades, Y Hg O1-M199, O2-M268 and O3-M122, were "slaves" (like the "Sudra" people in India) of northern Chinese such as Turko-Mongols and other Central Asian men who carried Y Hg C3*/C2 (ISOGG)-M217 and Hg Q*, a haplogroup of Siberian and Native American men? So, in other words, my paternal Y Hg O-CTS5492 / O3*-M134, as an Indonesian of Hokkien descent, marks "low social class" people? Oh my.... But I must accept an unpleasant fact, and I like people with logical thinking.

Adrian Yohanes Purnomo said...

I understand your statement.
Turko-Mongol Y Hg C2-M217 and its subbranches vs. Northern Han Chinese Y Hg O2-M122, M134 and M117.
Northern Han Chinese and Turko-Mongols share similar mtDNA Hg M8, CZ, C, Z, M11, G and D (M type).
Northern Han Chinese mtDNA Hg M8, CZ, C, Z, G and D (mostly M type) vs. Southern Han Chinese mtDNA Hg M7 (M type), Hg N9a, X"?", Y (N type), and Hg R11, B, R9 and F (N - R type).
Northern Han Chinese and Southern Han Chinese have similar Y Hg O2-M122, JST00621, P201, M134 and M117. But:
Northern and Southern "Han" Chinese and Southeast Asians (Austroasiatic + Austronesian) differ relatively little in Y-DNA because they share common ancestors, Y Hg NO*/K2a-M214 and O*-M175, with O's subbranches: O1a-M119 (Taiwanese aborigines, Nias people), O1b1-M95 (Vietnamese, Cambodians, Thai, Malaysians and Indonesian Malays like Javanese, Balinese, etc.), O2-M122*, P201, P164, M7, M134 and M117 (Filipinos, Burmese, Vietnamese, Micronesians, Polynesians, Sumatran people in Indonesia and Chinese Hmong-Mien, Sino-Tibetan (Huaxia) and Tibeto-Burmese).
Southern Han Chinese and Southeast Asians have similar mtDNA Hg M7, M9 and E (M type), N9a and Y (N type), R9, F, R11 and B (N - R type).
The ancient ancestors of Southern Han Chinese and Southeast Asians, from Y Hg F*-M89 - K*-M9 - K2-M526 and K2a / NO*-M214 and mtDNA Hg N - R (except M7), differ completely in both haplogroups from Turko-Mongolic people, who carry Y Hg C2-M217, D1-M15, D2-M55 and D3 (not descendants of F*-M89) and are mostly descendants of mtDNA macrohaplogroup M*, except for mtDNA Hg A, X and Y (N type).
It's not easy to determine a Far East Asian autosomal ethnic DNA profile because Northeast Asians like Mongolians, Khakassians, Altaians, the Kazakhs, etc. have a small amount (5 - 15%) of Western Eurasian DNA, while Southeast Asians, especially in the Philippines, have around 5 - 35% Southwest Asian, European, Oceanian and even Native American DNA. But Chinese, Japanese, Korean, Vietnamese, Laotian, Cambodian and Thai people have none, or less than 2%, of West Eurasian, African, Pygmy, Native American and Oceanian autosomal DNA.