

Tuesday, April 21, 2015

West Eurasian mtDNA lineages in India

It might be interesting to compare the modern-day Indian mtDNA from this paper to the ancient European mtDNA from Haak et al. 2015. The data is freely available in the spreadsheet here. Anybody up for it?

There is no indication from the previous mtDNA studies that west Eurasian-specific subclades have evolved within India and played a role in the spread of languages and the origins of the caste system. To address these issues, we have screened 14,198 individuals (4208 from this study) and analyzed 112 mitogenomes (41 new sequences) to trace west Eurasian maternal ancestry. This has led to the identification of two autochthonous subhaplogroups-HV14a1 and U1a1a4, which are likely to have originated in the Dravidian-speaking populations approximately 10.5-17.9 thousand years ago (kya). The carriers of these maternal lineages might have settled in South India during the time of the spread of the Dravidian language. In addition to this, we have identified several subsets of autochthonous U7 lineages, including U7a1, U7a2b, U7a3, U7a6, U7a7, and U7c, which seem to have originated particularly in the higher-ranked caste populations in relatively recent times (2.6-8.0 kya with an average of 5.7 kya). These lineages have provided crucial clues to the differentiation of the caste system that has occurred during the recent past and possibly, this might have been influenced by the Indo-Aryan migration. The remaining west Eurasian lineages observed in the higher-ranked caste groups, like the Brahmins, were found to cluster with populations who possibly arrived from west Asia during more recent times.

Palanichamy et al., West Eurasian mtDNA lineages in India: an insight into the spread of the Dravidian language and the origins of the caste system, Human Genetics, 2015 Apr 2. [Epub ahead of print]

See also...

Indian genetic structure in the context of ancient European DNA

Friday, April 17, 2015

IBS similarity analysis: 60 ancient genomes + 233 present-day pops

IBS stands for Identical-by-State. The full output is available in a zip file here. Below are a few examples in chronological order. Most of these genomes are from Haak et al. 2015.

MA1 or Mal'ta boy


La Brana-1

Motala_HG I0117

Samara_HG I0124

HungaryGamba_HG KO1

Stuttgart LBK380

HungaryGamba_EN NE1

Spain_EN I0412

Spain_MN I0406

Esperstedt_MN I0172

Oetzi the Iceman

Yamnaya I0429

Corded_Ware_LN I0104

Bell_Beaker_LN I0113

Unetice_EBA I0047

HungaryGamba_BA BR2

Hinxton4 ERS389798

Hinxton2 ERS389796

The results obviously make a lot of sense. Also, please note that my Principal Component Analyses (PCA) are usually based on IBS similarity, so it's a method that I have a lot of confidence in. Here are some examples from a few weeks ago featuring samples from the IBS zip file.

Karelia_HG I0061 PCA

Yamnaya I0231 PCA

Yamnaya I0443 PCA

Corded_Ware I0103 PCA

Bell_Beaker I0112 PCA

Update 18/04/2015: Matt posted this PCA based on the IBS similarity stats in the comments section. Sardinians and Samaritans appear to be the two obvious outliers within West Eurasia, which is probably because they harbor significantly lower levels of admixture from the steppe and/or Central Asia than their neighbors.
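
For anyone who wants to experiment with this type of statistic, below is a minimal Python sketch of one common way to compute pairwise IBS similarity. It assumes genotypes coded as 0, 1 or 2 copies of a reference allele, with NaN marking missing calls. The function name is hypothetical, and this isn't necessarily the exact procedure behind the output in the zip file.

```python
import numpy as np

def ibs_similarity(g1, g2):
    """Mean IBS similarity (0 to 1) between two genotype vectors.

    Genotypes are assumed to be coded 0/1/2 (copies of one allele),
    with np.nan marking missing calls. This coding is an assumption.
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    # Only use SNPs that are called in both individuals
    ok = ~(np.isnan(g1) | np.isnan(g2))
    if not ok.any():
        raise ValueError("no overlapping genotype calls")
    # Alleles shared identical-by-state at each SNP: 2 - |g1 - g2|
    shared = 2.0 - np.abs(g1[ok] - g2[ok])
    # Divide by 2 so that identical genotypes give a similarity of 1
    return shared.mean() / 2.0

# Toy example: the four SNPs share 2, 2, 1 and 0 alleles respectively
print(ibs_similarity([0, 1, 2, 2], [0, 1, 1, 0]))  # 0.625
```

With ancient genomes the number of overlapping SNPs can be small, so it's probably worth keeping track of how many sites went into each comparison.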


Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

See also...

Hinxton ancient genomes roundup

Friday, April 3, 2015

The teal people: did they actually exist, and if so, who were they?

The ADMIXTURE analysis in Haak et al. 2015 includes a series of intriguing teal colored components from K=16 to K=20 (see image here). The main reason I'm so intrigued by these components is that they generally make up over 40% of the genetic structure of the potentially Proto-Indo-European Yamnaya genomes.

But there's only so much one can learn by staring at a bar graph, so I thought I'd have a go at isolating the same signal with ADMIXTURE to study it in more detail. You can view the results of my experiment in the spreadsheet here.

I wasn't able to completely nail any one of the teal components from Haak et al., because I don't have access to all of the samples used in the paper (I'd have to sign a waiver to get them). Nevertheless, the signal looks basically the same.

Below is a bar graph based on the output featuring selected populations and ancient genomes from Europe and Asia. The Fst genetic distances between the nine components are available here.

Note that the teal component peaks in the Caucasus and the Hindu Kush, and generally shows a strong correlation with regions of relatively high MA1-related or Ancient North Eurasian (ANE) admixture. On the other hand, the orange component peaks among Early European Farmers (EEF), who basically lack ANE.

To learn about the structure of the three main West Eurasian components - blue, orange and teal - I made synthetic individuals from the P output to represent each of the components, and tested them with my K8 model. As expected, the teal component harbors a high level of ANE, while the orange component lacks it altogether. Refer to the spreadsheet here.
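
The general idea of making synthetic individuals from allele frequencies can be sketched in Python as follows. This is only an illustration under stated assumptions: the P output is taken to be a matrix with one row per SNP and one column per component, each entry being an allele frequency, and each diploid genotype is drawn as two independent copies at that frequency. The function name and the toy matrix are hypothetical, and the exact procedure used for the spreadsheet isn't described here.

```python
import numpy as np

def synthetic_genotypes(p_matrix, component, n_individuals, seed=0):
    """Simulate diploid genotypes for one ancestral component.

    p_matrix: array of shape (n_snps, K) holding allele frequencies,
              assumed to match the layout of an ADMIXTURE P file.
    component: column index of the component to simulate.
    Returns an (n_individuals, n_snps) array of 0/1/2 genotype counts.
    """
    p = np.asarray(p_matrix, dtype=float)[:, component]
    rng = np.random.default_rng(seed)
    # Each genotype is the number of successes in two draws at frequency p
    return rng.binomial(2, p, size=(n_individuals, p.shape[0]))

# Toy P matrix with three SNPs and two components (made-up frequencies)
toy_p = [[0.0, 1.0],
         [1.0, 0.0],
         [0.5, 0.5]]
genos = synthetic_genotypes(toy_p, component=0, n_individuals=5)
print(genos.shape)  # (5, 3)
```

In practice the matrix could be loaded with something like `np.loadtxt`, and the simulated genotypes would then need to be written out in whatever format the downstream test expects.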

It's very likely that the teal and orange components from Haak et al. share these traits. I think this is more than obvious from their frequencies across space and time in Eurasia.

I also analyzed the synthetic individuals with PCA based on their K8 ancestry proportions. The samples representing the orange component fall just south of the Stuttgart genome from Neolithic Germany, and this is basically where I expect Neolithic genomes from the Near East to cluster when they become available.

Interestingly, the samples representing the blue component are dead ringers for Scandinavian hunter-gatherers (SHG). However, I suspect this is something of a coincidence caused by the small number of Western European hunter-gatherer (WHG) and Eastern hunter-gatherer (EHG) genomes in the dataset. The algorithm probably doesn't have enough variation to latch onto to create both WHG and EHG components, and in the end settles for something in between, which just happens to resemble SHG.

But the fact that the orange and blue samples more or less pass for ancient populations leaves open the possibility that the same might be said for the teal samples.

So did the teal people actually exist, and if so, who were they?

My view at the moment is that a population very similar to the teal samples formed in Central Asia or the North Caucasus during the Neolithic as a result of admixture between MA1-like and Near Eastern groups. This population, I believe, then expanded into the Russo-Kazakh steppe by the onset of the Eneolithic.

Were they perhaps the Proto-Indo-Europeans? Probably not. I'd say they were Neolithic farmers who eventually played a role in the formation of the Proto-Indo-Europeans. In any case, someone had to bring the Caucasian or Central Asian admixture to the steppe, and I have it on good authority that it was already present among the Khvalynsk population of the Eneolithic, albeit at a lower level than among the Yamnaya of the early Bronze Age.


Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

See also...

Modeling the Yamnaya with qpAdm

Sunday, March 29, 2015

European foragers were almost wiped out by the ice age

I think what this article is really saying is that the effective population size of Europeans might have dropped to as low as 30 after the LGM peak. If so, that's pretty close to a genetic precipice for most animals. In any case, it looks like there are more hunter-gatherer genomes on the way, including from Denmark and Switzerland, courtesy of Ron Pinhasi's team, which brought us the ancient Hungarian genomes last year (see here).

‘As an archaeologist and anthropologist, I was quite shocked to see how limited, how small the population numbers were. You know, shockingly small,’ said Prof. Pinhasi, based at University College Dublin, Ireland.

‘I think that what happened, it’s on a catastrophic level of demography for a long time in human evolution,’ he said.

The impacts of this are significant for understanding the origins of many Europeans today, as it is forcing researchers to reconsider models of human expansion and colonisation of the continent, as well as our genetic ancestry.

By analysing the genomes of human remains, the researchers are able to gather demographic data and clues to potential population sizes.

Prof. Pinhasi’s team has found that the genomes sequenced from hunter-gatherers from Hungary and Switzerland between 14 000 to 7500 years ago are very close to specimens from Denmark or Sweden from the same period.

These findings suggest that genetic diversity between inhabitants of most of western and central Europe after the ice age was very limited, indicating a major demographic bottleneck triggered by human isolation and extinction during the ice age.

‘We’re starting to be able to reconstruct the actual dynamics of migrations and colonisation of the continent by modern humans and that’s never been done before the genomic era,’ explained Prof. Pinhasi.

He believes that early humans crossed the continent in small groups that were cut off while the ice was at its peak, then successively dispersed and regrouped over thousands of years, with dwindling northern populations invigorated by humans arriving from the south, where the climate was better.

Source: Francesca Jenner, Ice-age Europeans roamed in small bands of fewer than 30, on brink of extinction, 26 March 2015, Horizon Magazine

Saturday, March 28, 2015

Population genetics of Copper and Bronze Age inhabitants of the Eastern European steppe

I'm hoping like hell that the samples from this thesis eventually get the same treatment as those from Haak et al. 2015.

Summary: This dissertation presents the first genetic study of prehistoric populations in the Pontic-Caspian steppe from the Upper Thracian Plain to the Volga. Hypervariable region I (HVR I) and 30 short sections of the coding region containing 32 clade-determining polymorphisms on the mitochondrial DNA, as well as 20 putatively naturally selected autosomal SNPs and a sex-determining locus were analysed using a combination of multiplex PCR and 454 sequencing. Data analysis was performed on the HVR I of 65 of the 180 Eneolithic and Bronze Age samples. (Partial) genotypes were generated from 61 individuals. Published ancient DNA data from Central and Eastern Europe and Central Asia, as well as modern DNA sequences were consulted for comparison.

The genetic data support the inference that early Neolithic farmers from Southeast Europe were involved in establishing pastoralism in the steppes by demic diffusion. The consistently low values of the FST-statistic (the range includes zero) between the Yamnaya Culture of the steppe and a succession of Neolithic cultures in Central Europe indicate continuous or recurrent contacts between the two regions. Between the Yamnaya Culture and its successor, the Catacomb Culture, the incidence of haplogroup U4, which is at high frequency in hunter-gatherer populations of Neolithic Scandinavia and Mesolithic Northwest Russia, rises from approximately 5 % to above 30 %. It is possible that immigrants from Eastern Baltic hunter-gatherer refugia were involved in the genesis of the Catacomb Culture.

The low FST values between the prehistoric steppe populations and the modern populations of Central and Eastern Europe indicate genetic continuity. This is supported by the nuclear genotype frequencies. According to current knowledge the modern European gene pool can be explained by three roots: indigenous Mesolithic hunter-gatherers, early farmers from the Near East, and an ancient North Eurasian component with an Upper Palaeolithic origin. Maybe the third ancestry component was introduced into the late Neolithic European genome by the North Pontic population.

Source: Wilde, Sandra, Populationsgenetik kupfer- und bronzezeitlicher Bevölkerungen der osteuropäischen Steppe, 2014, Dissertation

Tuesday, March 24, 2015

Live reports from AAPA 2015

Chad Rohlfsen is heading off to St. Louis tomorrow for the annual American Association of Physical Anthropologists (AAPA) conference, and will be posting updates from the big event in the comments below. Most of you will know Chad from the comments section on this blog. He's yet to finalize his program, but I know he'll be at this talk on the population history of the Aegean.

The origins of the Aegean palatial civilizations from a population genetic perspective

MARTINA UNTERLÄNDER (1,2), SUSANNE KREUTZER (2) and CHRISTINA PAPAGEORGOPOULOU (1). (1) Department of History and Ethnology, Demokritus University of Thrace, (2) Palaeogenetics Group, Institute of Anthropology, Johannes Gutenberg-University of Mainz.

The present paper investigates the origins of the Aegean pre-palatial civilizations (5th-3rd millennium BC) by applying cutting-edge methods of molecular biology and population genetics. The term Aegean Civilizations refers to the novel human lifeway (agriculture and craft specialization, redistribution systems, intensive trade) that appeared during the end of the Neolithic and the beginning of the Bronze Age in the Aegean. Although many studies exist on archaeological constructions of ethnic and cultural identity on mainland Greece, the Cyclades and Crete, not enough efforts have been made to explore this direction on a population history basis. We have investigated Late, Final Neolithic and Early Bronze Age human skeletons (n=127) from the Aegean using ancient DNA methods, next generation sequencing (NGS) technology and statistical population genetic inferences to i) gather information on diversity, population size, and origin of the pre-palatial Aegean Cultures, ii) to compare them on a genetic basis, in terms of their cultural division (Helladic, Cycladic, Minoan) and iii) to investigate their ancestral/non-ancestral status to the Early and Middle Neolithic farmers from Greece. In addition to mitochondrial DNA genomes, by applying a capture-NGS approach we collected information on functional traits of the early Aegean communities in southeastern Europe. Considering the International Spirit that overwhelms the Aegean during the 3rd millennium BC, seen by the wide distribution of artifacts, this palaeogenetic approach provides valuable new insights on population structure of the groups involved in the Neolithic-Bronze Age transition and the spread of specific alleles in this part of Europe.

Feel free to help Chad plan the rest of his itinerary. The AAPA 2015 website is here. You can download a PDF book with all of the abstracts here.

By the way, Chad is paying for the trip himself. If anyone wants to help him cover the costs, please send contributions via PayPal to c_rohlfsen [at] hotmail [dot] com.

Monday, March 23, 2015

Indian genetic structure in the context of ancient European DNA

Let's turn our attention to South Asia for a moment. I was hoping that someone better informed than myself about India's history and genetics might help me interpret these D-stats:

Dravidian & Indo-Aryan D-stats

I ran the tests using qpDstat and genotype data from Haak et al. 2015 and Metspalu et al. 2011. The Indian samples I chose for the analysis are listed here.

Note that I didn't run any groups from northwest of Uttar Pradesh, because this part of India has an even more complex population history than the rest of the country, and I was just trying to focus on the Neolithic and early Indo-European migrations into South Asia.

Interpreting D-stats in the context of ancient migrations and admixture events is often not a straightforward task, but at least they're easy to read. Each test has the form D(W, X; Y, Z). If the Z-score is positive, then the gene flow occurred between W and Y and/or X and Z. If it's negative, then the gene flow occurred between W and Z and/or X and Y.
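
To make the sign convention concrete, here's a minimal Python sketch of a D-statistic computed from population allele frequencies, using the standard frequency-based form of D(W, X; Y, Z). The Z-score comes from a simple delete-one block jackknife over equal-sized contiguous blocks of SNPs, which is a simplification: qpDstat uses a weighted jackknife over genomic blocks. The function names are hypothetical and the inputs in the usage lines are made-up toy frequencies, not real data.

```python
import numpy as np

def d_stat(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (arrays in [0, 1])."""
    w, x, y, z = (np.asarray(a, dtype=float) for a in (w, x, y, z))
    num = (w - x) * (y - z)
    den = (w + x - 2 * w * x) * (y + z - 2 * y * z)
    # Ratio of sums over SNPs, not a mean of per-SNP ratios
    return num.sum() / den.sum()

def d_stat_z(w, x, y, z, n_blocks=10):
    """Return (D, Z-score) using a simple delete-one block jackknife."""
    w, x, y, z = (np.asarray(a, dtype=float) for a in (w, x, y, z))
    d_full = d_stat(w, x, y, z)
    # Contiguous SNP blocks stand in for genomic blocks (an assumption)
    blocks = np.array_split(np.arange(len(w)), n_blocks)
    loo = np.array([
        d_stat(np.delete(w, b), np.delete(x, b),
               np.delete(y, b), np.delete(z, b))
        for b in blocks
    ])
    n = len(loo)
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
    return d_full, d_full / se

# Toy check of the sign: W matches Y and X matches Z at every SNP
w = [1.0, 0.0, 1.0, 0.0]
x = [0.0, 1.0, 0.0, 1.0]
print(d_stat(w, x, w, x))  # 1.0
print(d_stat(w, x, x, w))  # -1.0
```

A positive value means W shares more alleles with Y (and/or X with Z), while a negative value means W shares more with Z (and/or X with Y), which is the same reading as for the Z-scores in the spreadsheet.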

Below are some of my observations and (potentially wild) suppositions based on the D-stats. Feel free to correct me in the comments section if I have misinterpreted the data in any way:

- the Indo-Aryans share significantly more gene flow with LBK, MA1 and Yamnaya than the Dravidians do (analyses 1-3), which correlates with their more northerly geographic location, and also with the generally accepted idea that their West Eurasian ancestors arrived in India later than those of the Dravidians, and thus had less time to mix with the locals

- the West Eurasian ancestry of the Indo-Aryans is more Yamnaya-like than that of the Dravidians (analysis 4), which fits with linguistic data, because the Indo-Aryans are obviously Indo-Europeans, while the Yamnaya were supposedly late Proto-Indo-Europeans

- the West Eurasian ancestry of the Dravidians is more LBK-like than that of the Indo-Aryans (again, analysis 4), which suggests that proto-Dravidian languages might have been introduced into South Asia by Neolithic farmers

- the West Eurasian ancestry of the Dravidians is also more MA1-like than that of the Indo-Aryans (analyses 5 & 6), which might mean that MA1-related admixture was already present in South Asia before the Indo-Europeans arrived there

- the Near Eastern ancestors of the Dravidians possibly came from Iran and/or the southern Near East, as opposed to the northern Near East or the Caucasus (analyses 7-13)


Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317

Metspalu M et al., Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia, Am J Hum Genet. 2011 Dec 9; 89(6): 731–744. doi: 10.1016/j.ajhg.2011.11.010

See also...

West Eurasian mtDNA lineages in India