
Sunday, August 28, 2016

Ancient vs modern day West Eurasian variation

The Principal Component Analyses (PCA) with ancient samples that I post on this blog are amongst the most accurate and best examples of their kind that you'll see anywhere. That's not just wishful thinking; it's a fact.

My PCA don't suffer from projection bias or shrinkage, which is a handicap of PCA in many ancient DNA papers, and they're run only on observed (rather than imputed) genotypes.

However, even my PCA are far from perfect, because they're based entirely on present-day variation. In other words, I still project the ancients onto eigenvectors computed with modern day reference samples. I guess that's the equivalent of putting the cart before the horse, when originally the horse may have been a donkey, or something like that.

Nevertheless, it's the only sensible way to plot heavily degraded ancient samples with a lot of missing data. But it does often leave me wondering whether the output says anything useful about the ancient world?
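To make the projection idea a bit more concrete, here's a minimal sketch in Python with NumPy. It is not the pipeline behind my plots, just a toy illustration of computing eigenvectors from modern samples and then placing a sample with missing genotypes onto them by least squares. The function names, the 0/1/2 genotype coding and the random toy data are all assumptions made for the example.

```python
import numpy as np


def fit_modern_pca(modern, n_components=2):
    """Return per-SNP means and SNP loadings computed from modern samples only.

    modern: (n_samples, n_snps) array of 0/1/2 genotypes with no missing calls.
    """
    means = modern.mean(axis=0)
    # Rows of vt are the principal axes (eigenvectors) in SNP space.
    _, _, vt = np.linalg.svd(modern - means, full_matrices=False)
    return means, vt[:n_components].T


def project_with_missing(sample, means, loadings):
    """Least-squares projection of one sample, using only its observed SNPs.

    sample: (n_snps,) array of genotypes, with np.nan marking missing calls.
    """
    observed = ~np.isnan(sample)
    coords, *_ = np.linalg.lstsq(
        loadings[observed], sample[observed] - means[observed], rcond=None
    )
    return coords


# Toy data: 20 modern samples, 50 SNPs (hypothetical, random genotypes).
rng = np.random.default_rng(0)
modern = rng.integers(0, 3, size=(20, 50)).astype(float)
means, loadings = fit_modern_pca(modern)

# A fake "ancient" sample with roughly 40% of its calls missing.
ancient = modern[0].copy()
ancient[rng.random(50) < 0.4] = np.nan
print(project_with_missing(ancient, means, loadings))
```

With no missing calls this reduces to the ordinary projection `(sample - means) @ loadings`. Solving only over the observed SNPs avoids the pull toward the origin that you get when missing genotypes are simply replaced by the mean.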

Thanks to the recent release of a lot of fairly high quality ancient genotype data from West Eurasia (most of it freely available at the Reich Lab website here), I can now test how well my trademark PCA of ancient West Eurasia reflects reality.

Below are two PCA featuring ancient composite samples. The first PCA is based on ~650,000 SNPs, with 100% call rates in each of the composites. For the second PCA I pruned the markers to reduce linkage disequilibrium (LD), and also made sure that about half of the SNPs were from transversion sites, which are less likely to be affected by postmortem damage. That left ~125,000 SNPs of hopefully relatively high quality.

Obviously, the plots are very similar, which makes me wonder whether there's any point thinning the markers when running decent quality ancient sequences? The datasheets are available for download here and here.
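For anyone wondering what counts as a transversion site, here's a small Python sketch. Postmortem deamination mostly shows up as C>T changes (or G>A when read from the other strand), which are transitions, so SNPs whose two alleles are a purine and a pyrimidine are the safer ones. The function names and the `(snp_id, allele1, allele2)` tuple layout are made up for the example, and this isn't the script I used for the pruning.

```python
BASES = set("ACGT")
# Transitions swap purine for purine (A/G) or pyrimidine for pyrimidine (C/T).
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transversion(allele1, allele2):
    """True when a biallelic SNP swaps a purine for a pyrimidine or vice versa."""
    a1, a2 = allele1.upper(), allele2.upper()
    if a1 not in BASES or a2 not in BASES or a1 == a2:
        raise ValueError(f"not a biallelic SNP: {allele1}/{allele2}")
    return frozenset((a1, a2)) not in TRANSITIONS


def keep_transversions(snps):
    """Filter an iterable of (snp_id, allele1, allele2) tuples down to transversions."""
    return [snp for snp in snps if is_transversion(snp[1], snp[2])]


# Hypothetical marker list, only to show the call.
markers = [("snp1", "C", "T"), ("snp2", "A", "T"), ("snp3", "G", "A"), ("snp4", "G", "C")]
print(keep_transversions(markers))
```

A filter like this drops every transition, whereas for the second plot I only made sure that about half of the markers were transversions, so treat it as the strict version of the same idea.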

Now, below is a recent example of my PCA of ancient West Eurasia. It's almost identical to the plots above. This is very cool, and also very important, because it means that my strategy for running PCA with ancient samples produces solid and relevant results.

Interestingly, on closer inspection, the distance between the western and eastern Neolithic farmers on the first two plots appears bloated. Conversely, the distances between the northern Hunter-Gatherer (HG) samples are somewhat reduced. Any ideas why?

Update 31/08/2016: Open Genomes generated a 3D plot based on a new PCA datasheet that I posted in the comments. Click on the image below to check it out.

Update 01/09/2016: I added present-day samples to the PCA. Very happy with the outcome. The relevant datasheet is available here.

See also...

Ust'-Ishim man x2

Friday, August 19, 2016

Maybe first direct hints of Yamnaya-related gene flow into South Central Asia

Unfortunately, this is just an abstract for a presentation poster from the upcoming 6th DNA Polymorphisms in Human Populations conference in Paris. However, it might be important because, as far as I know, it's the first ancient DNA report supporting the idea that Bronze Age herders from the Eastern European steppe had a profound impact on the ancient populations of South Central Asia.

At the end of the Bronze Age, the proto-urban Oxus Civilisation in Southern Central Asia (Uzbekistan, Turkmenistan) disappeared and was replaced by Iron Age Yaz Cultures. Environmental changes such as aridification and geopolitical reasons are called for to explain this cultural transition. However, evidences of settlements from Andronovo populations during the late Bronze Age suggest that this transition was associated with migrations from northern steppe populations. Indeed, palaeogenetic studies (Allentoft et al., 2015; Haak et al., 2015) have already shown that gene flow from Yamnaya steppe populations occurred in Europe and Altai at the end of the Neolithic, suggesting that the steppe inhabitants spoke Indo-European languages.

To investigate the role of migrations in the Bronze Age/Iron Age transition in Southern Central Asia, we turned to palaeogenetic studies. DNA was extracted from 17 skeletons excavated in Ulug Depe (Turkmenistan) archaeological site. The hypervariable region I of the mitochondrial (mt) genome was sequenced for 6 individuals from the Bronze Age and 4 from the Iron Age.

Criteria of authentication for ancient DNA were met: experiments were done in a clean room dedicated to ancient DNA analysis, and blank DNA extraction and PCR controls were performed. Indeed, we observed DNA damages specific for ancient DNA and an inverse correlation between the efficiency of the PCR and the length of the amplified DNA fragment. Thus, we first evidenced the preservation of ancient DNA in Southern Central Asia. After sequencing and assignment of individuals to human mitochondrial haplotypes, a high diversity of haplotypes at Ulug Depe was observed. All the haplogroups found in Ulug Depe belong to modern western Eurasian populations.

Haplogroups shared between steppe populations and Ulug Depe were evidenced, suggesting gene flow between Southern Central Asia and the Steppe. Genetic data suggest a close relationship between Yamnaya related populations and Iron Age Ulug Depe population. However, no significant genetic discontinuity between Bronze and Iron Age was shown, that may be due to a limited sample dataset and calls for nuclear DNA analysis.

Monnereau, A., Lhuillier, J., Bendezu-Sarmiento, J., Bon, C., Palaeogenetic analysis of Bronze Age/Iron Age transition in Southern Central Asia, poster, 6th DNA Polymorphisms in Human Populations, Musee de l’Homme, Paris, 7-10 December, 2016

See also...

Pots were people in Bronze Age southern Central Asia too

Tuesday, August 16, 2016

EAA 2016 abstracts

The abstract book for this year's meeting in Vilnius is available here. I'm hoping there's a paper coming real soon based on this talk on the genetic history of the East Baltic. Emphasis is mine.

Recent studies of ancient genomes have revealed two large-scale prehistoric population movements into Europe after the initial settlement by modern humans: A first expansion from the Near East that brought agricultural practices, also known as the Neolithic revolution; and a second migration from the East that was seen in a genetic component related to the Yamnaya pastoralists of the Pontic Steppe, which appears in Central Europe in people of the Late Neolithic Corded Ware and has been present in Europeans since then in a decreasing North-East to South-West gradient. This migration has been proposed to be the source of the majority of today’s Indo-European languages within Europe.

In this paper we aim to show how these processes affected the Eastern Baltic region where the archeological record shows a drastically different picture than Central and Southern Europe. While agricultural subsistence strategies were commonplace in most of the latter by the Middle Neolithic, ceramic-producing hunter-gatherer cultures still persisted in the Eastern Baltic up until around 4000 BP and only adopted domesticated plants and animals at a late stage after which they disappeared into the widespread Corded Ware culture.

We present the results of ancient DNA analyses of 81 individuals from the territory of today’s Lithuania, Latvia and Estonia that span from the Mesolithic to Bronze Age. Through study of the uniparentally inherited mtDNA and Y-chromosome as well as positions across the entire genome that are informative about ancient ancestry we reveal the dynamics of prehistoric population continuity and change within this understudied region and how they are reflected in today’s Baltic populations.

Mittnik et al., A genetic perspective on population dynamics of the pre-historic Eastern Baltic region, EAA 2016 presentation, TH4-11 Abstract 06

Monday, August 15, 2016

A few mito genomes from Maikop (or Maykop)

The mtDNA haplogroup list below is from a new paper at the Journal of Archaeological Science. I can't remember seeing mt-hgs M52, U8 or V7 in any of the results to date from the Bronze Age steppe. So perhaps we can tentatively say that Maikop-Novosvobodnaya populations didn't have an important impact on the maternal ancestry of early steppe pastoralists?

- Krasnodar Krai, Maikop burial, 4000-3000 BCE, mt-hg U8b1a2

- Krasnodar Krai, Maikop burial, 3700-3300 BCE, mt-hg U8b1a2

- Republic of Adygea, Maikop burial, 3700-3300 BCE, mt-hg M52

- Republic of Adygea, Novosvobodnaya burial, 3700-3300 BCE, mt-hg V7

- Krasnodar Krai, unknown burial, 3700-3300 BCE, mt-hg N1b1

- Republic of Adygea, unknown burial, 3700-3300 BCE, mt-hg T2b

Also, interestingly, the Novosvobodnaya individual suffered from Bang's disease (brucellosis), which people usually catch from infected livestock, for instance by drinking unpasteurized milk.


Sokolov et al., Six complete mitochondrial genomes from Early Bronze Age humans in the North Caucasus, Journal of Archaeological Science, Volume 73, September 2016, Pages 138–144, doi:10.1016/j.jas.2016.07.017

See also...

Big deal of 2018: Yamnaya not related to Maykop

Genetic borders are usually linguistic borders too

On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint)

Saturday, August 13, 2016

PCA: Neolithic Central Anatolians

Note that the individuals from the earlier site of Boncuklu basically cluster with early Neolithic Europeans, while those from Tepecik-Ciftlik are shifted south and east, suggesting an influx of admixture into central Anatolia from perhaps eastern Anatolia and the Levant after the early Neolithic. This is in accordance with the findings of Kılınç et al. who published these genomes.

I also tested the same samples with the Basal-rich K7 (refer to the spreadsheet here). Their results appear to correlate very nicely with the PCA. However, I deleted Tep001 from the PCA plot because his PCA and Basal-rich K7 outcomes didn't match, suggesting that either one or the other, or both, were spurious. This isn't surprising, however, since Tep001 only has a coverage of 0.023x.


Gülşah Merve Kılınç et al., The Demographic Development of the First Farmers in Anatolia, Current Biology, August 8, 2016, DOI:

Update 15/08/2016: Below are a few admixture f3-stats from an analysis involving the new Anatolian samples. Please note, the more negative the Z score, the more likely that the target is admixed. Also, I had to use transversion SNPs to make this work, so the Z scores aren't as imposing as they might have been with more markers behind them. I'm posting all of the outcomes with Z scores lower than -1, but it might be best to ignore anything above -2.

Boncuklu_EN + Levant_N > Barcin_N f3 -0.005525 Z -2.62 SNPs 48620
Boncuklu_EN + Natufian > Barcin_N f3 -0.004252 Z -1.34 SNPs 28893

Boncuklu_EN + Natufian > Tepecik-Ciftlik_N f3 -0.013262 Z -1.566 SNPs 4384

Barcin_N + Villabruna > LBK_EN f3 -0.003652 Z -2.685 SNPs 49325
Barcin_N + LaBrana1 > LBK_EN f3 -0.003382 Z -2.462 SNPs 53537
Barcin_N + Motala_HG > LBK_EN f3 -0.002539 Z -2.388 SNPs 57533
Barcin_N + Loschbour > LBK_EN f3 -0.003089 Z -2.272 SNPs 48728
Tepecik-Ciftlik_N + Villabruna > LBK_EN f3 -0.004176 Z -1.452 SNPs 34905
Barcin_N + Hungary_HG > LBK_EN f3 -0.001939 Z -1.32 SNPs 41610
Boncuklu_EN + Levant_N > LBK_EN f3 -0.003035 Z -1.221 SNPs 40815

Barcin_N + Loschbour > Iberia_EN f3 -0.002457 Z -1.171 SNPs 38141
Tepecik-Ciftlik_N + Hungary_HG > Iberia_EN f3 -0.004022 Z -1.039 SNPs 21848

Barcin_N + Villabruna > Hungary_N f3 -0.006408 Z -4.28 SNPs 44545
Barcin_N + Hungary_HG > Hungary_N f3 -0.005216 Z -3.355 SNPs 39808
Barcin_N + Bichon > Hungary_N f3 -0.002554 Z -1.667 SNPs 48500
Barcin_N + Motala_HG > Hungary_N f3 -0.001559 Z -1.298 SNPs 51052
Tepecik-Ciftlik_N + Villabruna > Hungary_N f3 -0.003929 Z -1.239 SNPs 31535
Tepecik-Ciftlik_N + Hungary_HG > Hungary_N f3 -0.003897 Z -1.192 SNPs 28472
Barcin_N + LaBrana1 > Hungary_N f3 -0.00179 Z -1.083 SNPs 46955
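For anyone unfamiliar with how an admixture f3 works, here's a rough Python sketch. The statistic f3(Target; A, B) is the average over SNPs of (t - a)(t - b), where t, a and b are the allele frequencies in the target and the two reference populations. It turns negative when the target's frequencies tend to sit between those of A and B, which is what admixture produces. The Z score below comes from a simple delete-one-block jackknife over contiguous SNP blocks. This is only an illustration on simulated frequencies. The function names and block count are my own assumptions, and it skips the sample-size correction and weighting that dedicated software applies, so it won't reproduce the numbers above.

```python
import numpy as np


def f3(target, src1, src2):
    """Plain f3(target; src1, src2) from per-SNP allele frequencies."""
    return float(np.mean((target - src1) * (target - src2)))


def f3_with_z(target, src1, src2, n_blocks=10):
    """Return (f3, Z), with the standard error from a delete-one-block jackknife."""
    n = len(target)
    estimate = f3(target, src1, src2)
    leave_one_out = []
    for block in np.array_split(np.arange(n), n_blocks):
        keep = np.ones(n, dtype=bool)
        keep[block] = False  # drop one contiguous block of SNPs
        leave_one_out.append(f3(target[keep], src1[keep], src2[keep]))
    leave_one_out = np.array(leave_one_out)
    variance = (n_blocks - 1) / n_blocks * np.sum(
        (leave_one_out - leave_one_out.mean()) ** 2
    )
    return estimate, estimate / np.sqrt(variance)


# Simulated frequencies: the target is an exact 50/50 mix of the two sources.
rng = np.random.default_rng(1)
src1 = rng.random(2000)
src2 = rng.random(2000)
target = 0.5 * (src1 + src2)

estimate, z = f3_with_z(target, src1, src2)
print(estimate, z)
```

In this toy case every per-SNP term equals -0.25 * (a - b)^2, so the statistic can't be positive. Real data are much noisier, which is why I'd ignore anything with a Z score above -2.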

Tuesday, August 9, 2016

On the enigmatic early Neolithic farmers from Iran

There still seems to be a lot of confusion around the traps, including in the comments at this blog, about the genetic structure of the early Neolithic Iranian farmers.

They're certainly a unique and mysterious West Eurasian population, but I'd say the picture is generally pretty straightforward considering that they were dug up on the border between the Near East and Central Asia.

As per my K7 test, they're closely related to other West Eurasians, and especially Near Easterners, via an ancient component that appears to be a mixture of Basal Eurasian and something very similar to the Villabruna cluster (see post here and the last page of the accompanying comments).

Apart from that, they harbor a lot of AG3-related ancestry, albeit probably only distantly related. My guess for now is that this is mostly admixture from an as yet unsampled Central Asian forager population, perhaps with elevated affinity to Ust_Ishim (update: probably not, see here).

The graphs below are based on the datasheet available here. Like I say, these ancient Zagros farmers are unique and eastern shifted, but, at the same time, don't show the type of Southeast Asian pull that characterizes present-day South and South Central Asians.

Saturday, August 6, 2016

Yamnaya dogs (?)

Just in at bioRxiv:

Abstract: Europe has played a major role in dog evolution, harbouring the oldest uncontested Palaeolithic remains and having been the centre of modern dog breed creation. We sequenced the whole genomes of an Early and End Neolithic dog from Germany, including a sample associated with one of Europe’s earliest farming communities. Both dogs demonstrate continuity with each other and predominantly share ancestry with modern European dogs, contradicting a Late Neolithic population replacement previously suggested by analysis of mitochondrial DNA and a Late Neolithic Irish genome. However, our End Neolithic sample possesses additional ancestry found in modern Indian dogs, which we speculate may be derived from dogs that accompanied humans from the Eastern European steppe migrating into Central Europe. By calibrating the mutation rate using our oldest dog, we narrow the timing of dog domestication to 20,000-40,000 years ago. Interestingly, the extreme copy number expansion of the AMY2B gene found in modern dogs was not observed in the ancient samples, indicating that the AMY2B copy number increase arose as an adaptation to starch-rich diets after the advent of agriculture in the Neolithic period.

And on page 17:

The age of the samples provide a time frame, between ~7,000 and 5,000 years ago, for CTC to obtain its additional Indian-like ancestry component. Considering that CTC shows similar admixture patterns to Central Asian and Middle Eastern modern dog populations, as seen in the PCA (Figure 2) and ADMIXTURE (Supplementary Figure S8.3.2.) analysis, and that the cranium was found next to two individuals associated with the Neolithic Corded Ware Culture, we speculate that the Indian-like gene flow may have been acquired by admixture with incoming populations of dogs that accompanied steppe people migrating from the East. Moreover, ADMIXTUREGRAPH and f4 statistics support the possibility that the Indian and the wolf ancestry are the consequence of the same admixture event, involving a dog population that carried the two ancestries. This scenario is further supported by the model estimated by G-PhoCS, which infers substantial migration from wolves to the lineage represented by Indian village dogs (and as much as 0.36 migration rate when Indian wolves are included in the tree (Supplementary Methods 12)).

Botigue et al., Ancient European dog genomes reveal continuity since the early Neolithic, bioRxiv, posted August 5, 2016, doi: