
Thursday, February 23, 2017

Kurgan waves as far as the Atlantic

Paleogeneticist Carles Lalueza-Fox is currently collaborating with Harvard on a major ancient DNA paper on the population history of the Iberian Peninsula, which is scheduled to be published next year. He was recently interviewed by the Spanish newspaper La Vanguardia about the project. Here are a few interesting quotes (pardon the translation):

And the mixing of genes continued...

Yes, 4,000 years ago came the Kurgan people, who domesticated the horse on the Pontic-Caspian Steppe and spoke Proto-Indo-European, the ancestral language to Celtic, Latin, Greek...their impact on our DNA was high.

What does that mean?

Their numbers grew at the expense of the previous population: today the Kurgan genetic footprint makes up 40% of the Western European genome!

Also in the Iberian peninsula?

Fifty per cent of our ancestry is derived from Neolithic farmers, 30% from the Kurgan people and 20% from hunter-gatherers.

Note the date: 4,000 years ago. That's not a Bell Beaker date, it's an Atlantic cist tradition date. What the hell is the Atlantic cist tradition, you're probably asking. Have a look at the video here, or, if you're too lazy, this screen cap.

See also...

Yamnaya-related admixture in Bronze Age northern Iberia

Maykop prediction

It's no secret that Maykop (or Maikop) Culture samples have been sequenced at the Reich and GeoGenetic labs. I don't know when they'll be published, but hopefully soon.

Maykop is arguably one of the most fascinating and important archaeological cultures of the Early Bronze Age (EBA), so there's a lot of interest in how these samples will come out in the context of ancient and modern-day Eurasian genetic diversity.

It's not an easy thing to predict, because Maykop territory basically straddled two perennially highly differentiated West Eurasian biogeographical zones: Eastern Europe and West Asia. So the question is, was the Maykop population, for its time, Eastern European, West Asian, or a rare example of something in between?

If we assume that the Adygei people of the Northwest Caucasus are largely of Maykop origin, but with various post-Bronze Age admixtures from the steppe and perhaps eastern Asia, which I'd say is not a bad assumption for now, then my prediction is that the Maykop samples will be very similar to the three currently available Armenia_EBA or Kura-Araxes individuals.

Consider the following qpAdm models. Armenia_EBA is the key to a tight fit. Barcin_Neolithic and Jordan_EBA help to improve the fit slightly, but also bump up the standard errors. Caucasus_HG does very well alongside Jordan_EBA, but is temporally a less proximate choice than Armenia_EBA.


Armenia_EBA 0.633±0.062
Barcin_Neolithic 0.054±0.042
Scythian_IA 0.260±0.038
Han 0.053±0.011
chisq 4.712 tail_prob 0.787883

Armenia_EBA 0.580±0.127
Jordan_EBA 0.084±0.082
Scythian_IA 0.286±0.053
Han 0.050±0.011
chisq 5.070 tail_prob 0.750115

Armenia_EBA 0.699±0.034
Scythian_IA 0.252±0.039
Han 0.049±0.011
chisq 6.230 tail_prob 0.716716

Caucasus_HG 0.243±0.054
Jordan_EBA 0.342±0.033
Scythian_IA 0.360±0.039
Han 0.055±0.010
chisq 8.370 tail_prob 0.3982

Armenia_Chalcolithic 0.674±0.674
Caucasus_HG 0.147±0.147
Scythian_IA 0.113±0.113
Han 0.066±0.066
chisq 11.020 tail_prob 0.200589

Iran_Neolithic 0.221±0.035
Barcin_Neolithic 0.338±0.027
Scythian_IA 0.390±0.032
Han 0.051±0.011
chisq 12.709 tail_prob 0.079515

Chalcolithic and Neolithic samples from modern-day Iran, even though very similar to Armenia_EBA and Caucasus_HG, don't appear to produce similarly effective models for the Adygei. That's not to say, however, that the Adygei don't have minor ancient ancestry from the Iranian Plateau. It's possible that they do, but I'm not able to test for it with this methodology.
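For anyone who wants to compare the fits above side by side, here's a minimal sketch that parses the `chisq ... tail_prob ...` summary lines and ranks the models by tail probability. The labels, the helper names (`parse_summary`, `rank_models`) and the 0.05 cutoff are my own illustrative assumptions, not part of qpAdm's output; the cutoff is just a commonly used convention for rejecting a model.

```python
import re

# Summary lines copied from the six models above, in the order they appear.
# The labels are just the reference populations joined together (hypothetical naming).
MODELS = [
    ("Armenia_EBA + Barcin_Neolithic + Scythian_IA + Han", "chisq 4.712 tail_prob 0.787883"),
    ("Armenia_EBA + Jordan_EBA + Scythian_IA + Han", "chisq 5.070 tail_prob 0.750115"),
    ("Armenia_EBA + Scythian_IA + Han", "chisq 6.230 tail_prob 0.716716"),
    ("Caucasus_HG + Jordan_EBA + Scythian_IA + Han", "chisq 8.370 tail_prob 0.3982"),
    ("Armenia_Chalcolithic + Caucasus_HG + Scythian_IA + Han", "chisq 11.020 tail_prob 0.200589"),
    ("Iran_Neolithic + Barcin_Neolithic + Scythian_IA + Han", "chisq 12.709 tail_prob 0.079515"),
]

SUMMARY_RE = re.compile(r"chisq\s+([\d.]+)\s+tail_prob\s+([\d.]+)")


def parse_summary(line):
    """Return (chisq, tail_prob) as floats from one summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    return float(match.group(1)), float(match.group(2))


def rank_models(models, min_tail_prob=0.05):
    """Sort models from best to worst fit (highest tail_prob first).

    min_tail_prob is an assumed rejection threshold, not a qpAdm setting.
    """
    rows = []
    for label, summary in models:
        chisq, tail_prob = parse_summary(summary)
        rows.append({
            "label": label,
            "chisq": chisq,
            "tail_prob": tail_prob,
            "passes": tail_prob >= min_tail_prob,
        })
    return sorted(rows, key=lambda row: row["tail_prob"], reverse=True)


ranked = rank_models(MODELS)
for row in ranked:
    print(f"{row['tail_prob']:.3f}  {row['chisq']:6.3f}  {row['label']}")
```

Keep in mind that tail probabilities from models with different numbers of reference populations have different degrees of freedom, so a ranking like this is only a rough guide.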

Wednesday, February 22, 2017

Rumors & leaks thread

Nothing much is happening, so I'm putting up this temporary thread for rumors and leaks about upcoming ancient DNA papers. Feel free to post your rumor or leak anonymously. Here's my somewhat cryptic contribution. Make of it what you will.

Tuesday, February 21, 2017

We're probably smarter than our Bronze and Iron Age ancestors

There's an interesting new preprint at bioRxiv focusing on cognitive ability in Europeans from the Bronze Age to the present:

Abstract: Human populations living in Eurasia during the Holocene experienced significant evolutionary change. It has been predicted that the transition of Holocene populations into agrarianism and urbanization brought about culture-gene co-evolution that favoured via directional selection genetic variants associated with higher general cognitive ability (GCA). Population expansion and replacement has also been proposed as an important source of GCA gene-frequency change during this time period. To examine whether GCA might have risen during the Holocene, we compare a sample of 99 ancient Eurasian genomes (ranging from 4,557 to 1,208 years of age) with a sample of 503 modern European genomes, using three different cognitive polygenic scores. Significant differences favouring the modern genomes were found for all three polygenic scores (Odds Ratio=0.92, p=0.037; 0.81, p=0.001 and 0.81, p=0.02). Furthermore, a significant increase in positive allele count over 3,249 years was found using a sample of 66 ancient genomes (r=0.217, p one-tailed=0.04). These observations are consistent with the expectation that GCA rose during the Holocene.

As far as I can see, the preprint overall makes sense, and this part is interesting in the context of population genetics, even if it just confirms what many of us here have already known.

Late Bronze Age European and Central Asian gene pools resemble present-day Eurasian genetic structure (17). Indeed, with values of Fst ranging from 0.00 to 0.08, the genetic distances between present-day European 1000 Genomes samples and the Ancient samples indicate little to modest levels of genetic differentiation (little differentiation corresponds to an Fst range of 0 to 0.05, and modest to an Fst range of 0.05 to 0.15 [41]). These values are lower than the distance between present-day Europeans and East Asians (Fst=0.11) (17). Despite this the two ancient genomes belonging to the Siberian Okunevo culture (RISE515 and RISE516) were somewhat of an outlier, exhibiting modest differentiation relative to the EUR sample when compared with the other genomes in the sample (average Fst=0.074 vs. 0.016 for the remainder of the sample). Their removal reduced the genetic differentiation between the two samples, yielding 99 ancient genomes, sourced from sites located in present-day Armenia (8.08%), Czech Republic (6.06%), Denmark (6.06%), Estonia (1.01%), Germany (10.1%), Hungary (10.1%), Italy (3.03%), Kazakhstan (1.01%), Lithuania (1.01%), Montenegro (2.02%), Poland (7.07%), Russia (36.36%) and Sweden (8.08%).

But this part is just weird.

Changes in allele frequencies can also occur via population expansion and replacement, perhaps driven in part by the relative advantage in conflict conferred upon populations by GCA. Consistent with this, as a possible result of the Neolithic revolution and during the Bronze Age in Europe, three Y-chromosomal haplogroups (R1a, R1b, I1), which are associated with farming or pastoralist cultures, came to mostly replace the formerly dominant hunter-gatherer lineages (associated predominantly with haplogroups G2a and I2) (32). Ancient farming societies in particular are associated with higher social complexity and the use of more complex tools (11); furthermore the contemporary distribution of these three haplogroups is positively associated with the variation in cognitive ability among contemporary European nations (32). The major population movements occurred in the period between 3.5 and 7.3 kybp, however, as noted in (17), westward migration of populations associated with haplogroup R1a continued from the Pontic-Steppe region between 5 and 1.4 kybp.

Needless to say, an intervention was in order, so I left the following comment at bioRxiv under the preprint. It'll be interesting to see how the authors incorporate this information into their model.

That's not correct.

G2a is the main early farmer lineage of Neolithic Western, Central and Southern Europe, and it arrived in Europe with early Neolithic farmers from Anatolia.

I2 is the main hunter-gatherer lineage of Mesolithic Western, Central and Southern Europe.

R1a and R1b appear to be the main hunter-gatherer lineages of Mesolithic and Neolithic Eastern Europe (keep in mind that the Neolithic in much of Eastern Europe was defined by the presence of pottery, not necessarily any type of farming).

At some point hunter-gatherers native to Western, Central and Southern Europe carrying I2 were acculturated into farming societies, and so I2 rose in frequency in farmer populations at the expense of G2a.

Then, during the Eneolithic/Copper Age, foragers on the Eastern European steppe carrying R1a and R1b mixed with pastoralists from the fringes of the steppe, like the North Caucasus, and became steppe pastoralists.

These steppe pastoralists with Eastern European forager-derived R1a and R1b then expanded rapidly and moved en masse into the rest of Europe, largely replacing the farmer G2a and I2 lineages there.

It's still a mystery how I1 fits into the picture. But it's probably just a North European forager-derived lineage that got caught up somehow in the expansions of the steppe pastoralists or their descendants.


Woodley et al., Holocene selection for variants associated with cognitive ability: Comparing ancient and modern genomes, bioRxiv, February 21, 2017, doi:

The great male migration

The Goldberg et al. preprint that I blogged about late last year has made it into PNAS under a different title and with a few other changes (behind a paywall here). The authors added a couple of lines about R1a and R1b, which is awesome because I think these markers are crucial to the Indo-European homeland debate, and they also changed "horse-driven chariots" to "horse-driven wagons", probably as a result of my comment at bioRxiv (scroll down here).

Based on archeological data, as well as ancient and modern Y chromosome data, the later migration from the Pontic-Caspian Steppe has also been hypothesized to be male-biased (5, 24–29). In particular, multiple large-scale studies of modern Y-chromosome data infer a rapid growth of R1a and R1b haplotypes ∼5,000 y ago (27–29). Similarly, Haak et al. (5) provide evidence that R1a and R1b were rare in central Europe before ∼4,500 y ago, but common soon thereafter. They also observe multiple R1b haplotypes in ancient Yamnaya individuals from the steppe. Populations in the Pontic-Caspian Steppe region, such as the Yamnaya or Pit Grave culture, are thought to have strong male-biased hierarchy, as inferred by overrepresentation of male burials, male deities, and kinship terms (26, 30). The region is a putative origin for the domesticated horse in Europe, and the culture is known for its use of horse-driven wagons, a potential male-biased mechanism of dispersal into central Europe (30).


The signal of a large male bias holds when analyzing late Neolithic Corded Ware individuals and later Bronze Age Unetice individuals separately, with mean X-to-autosomal ancestry ratios in the two groups of 0.716 and 0.474, respectively. Ancestry and sex bias do differ between the groups, with a larger male bias and lower SP ancestry for the later Unetice, although the trend is not statistically significant (SI Appendix, Fig. S1B). Individuals from Bell Beaker archeological sites, a culture that overlapped with Corded Ware and Unetice but occurred over a wider geographic scale, show levels of X and autosomal ancestry suggestive of overall ancestry contributions and levels of sex bias that are similar to Corded Ware and Unetice, with mean X and autosomal ancestry of 0.28 and 0.56, respectively (SI Appendix, Table S7).
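As a quick sanity check on the Bell Beaker figures quoted above, here's a small sketch that turns the mean X and autosomal ancestry values into an X-to-autosomal ratio and puts it next to the Corded Ware and Unetice ratios. The function name is my own, and dividing the two means is a simplification: the paper's group ratios may have been averaged per individual, so treat this as a rough comparison only.

```python
def x_to_autosomal_ratio(x_ancestry, autosomal_ancestry):
    """Ratio of ancestry on the X chromosome to ancestry on the autosomes.

    A value below 1 means the ancestry is under-represented on the X,
    which is the direction the quoted text reads as male bias.
    """
    if autosomal_ancestry <= 0:
        raise ValueError("autosomal ancestry must be positive")
    return x_ancestry / autosomal_ancestry


# Bell Beaker means as quoted: X = 0.28, autosomal = 0.56.
bell_beaker = x_to_autosomal_ratio(0.28, 0.56)

# Corded Ware and Unetice ratios are taken directly from the quoted text.
ratios = {
    "Corded Ware": 0.716,
    "Unetice": 0.474,
    "Bell Beaker (ratio of means)": bell_beaker,
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.3f}")
```

On these numbers the Bell Beaker ratio falls between the Unetice and Corded Ware values, which fits the authors' description of similar levels of sex bias.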

Goldberg et al., Ancient X chromosomes reveal contrasting sex bias in Neolithic and Bronze Age Eurasian migrations, PNAS, February 21, 2017, doi: 10.1073/pnas.1616392114

Monday, February 20, 2017

Bronze Age dope dealers

Over at Vegetation History and Archaeobotany:

Abstract: A systematic review of archaeological and palaeoenvironmental records of cannabis (fibres, pollen, achenes and imprints of achenes) reveals its complex history in Eurasia. A multiregional origin of human use of the plant is proposed, considering the more or less contemporaneous appearance of cannabis records in two distal parts (Europe and East Asia) of the continent. A marked increase in cannabis achene records from East Asia between ca. 5,000 and 4,000 cal bp might be associated with the establishment of a trans-Eurasian exchange/migration network through the steppe zone, influenced by the more intensive exploitation of cannabis achenes popular in Eastern Europe pastoralist communities. The role of the Hexi Corridor region as a hub for an East Asian spread of domesticated plants, animals and cultural elements originally from Southwest Asia and Europe is highlighted. More systematic, interdisciplinary and well-dated data, especially from South Russia and Central Asia, are necessary to address the unresolved issues in understanding the complex history of human cannabis utilisation.

Long, T., Wagner, M., Demske, D. et al., Cannabis in Eurasia: origin of human use and Bronze Age trans-continental connections, Veget Hist Archaeobot (2017) 26: 245. doi:10.1007/s00334-016-0579-6

See also...

RIP with cannabis

Sunday, February 19, 2017

Phylogeography of Y-haplogroup Q3-L275

BMC Evolutionary Biology has a decent new paper on the phylogeography of Y-haplogroup Q3-L275. It would've been a great paper a couple of years ago, but I think that nowadays papers like this should also come with a few kick ass ancient samples to help make their point, otherwise they just feel like a prelude to something else. In this case it's probably a matter of funding and logistics, because the authors appear to be aware of the pitfalls of working with modern-day data:

Haplogroup Q3-L275 results from the first known split within haplogroup Q, which occurred in the Paleolithic epoch: according to previous studies [15, 24], haplogroup Q split into the Q3-L275 and Q1’2-L472 branches around 35 ky ago. Thus the location of this split might help identify the homeland of haplogroup Q, from where it spread throughout Eurasia and the Americas. Our findings better support a West Asian or Central Asian homeland of Q3 than any other area: a higher frequency was found in West Asia and in neighboring Pakistan; and early branches were identified in West Asia, Central Asia and South Asia. Increasing the dataset of ancient DNA might in future identify additional early branches, helping to locate a possible homeland more precisely. The very few samples from present-day (Additional file 3: Table S2) or ancient [43] China do not contradict this hypothesis, as they came from the western provinces located in Central Asia or historically linked to this area. The single Portuguese sample likely reflects the origin of the carrier, rather than more general population history. Thus, Q3 was one of the Paleolithic West Eurasian haplogroups. Its West/Central Asian homeland proposed here is hypothetical, because present-day genetic patterns do not necessarily reflect ancient ones as these can be modified by the more recent demographic events.

I like this diagram. But again, it would've been even better if augmented by a sprinkling of high resolution ancient samples.

Balanovsky et al., Phylogeography of human Y-chromosome haplogroup Q3-L275 from an academic/citizen science collaboration, BMC Evolutionary Biology, 2017, 17(Suppl 1):18, DOI: 10.1186/s12862-016-0870-2

Thursday, February 16, 2017

The Khvalynsk men #2

I didn't run any mixture models of the Khvalynsk men in my original post on these three individuals from the 5200-4000 BCE Eneolithic cemetery at Khvalynsk, Samara Oblast, Russia. That's because at the time I felt that I didn't have the right reference samples and outgroups to produce convincing results. But this is no longer an issue, so here goes, using qpAdm.


Anatolia_Chalcolithic 0.070±0.059
Caucasus_HG 0.136±0.050
Eastern_HG 0.794±0.037
chisq 7.964 tail_prob 0.716545

Anatolia_Chalcolithic 0.046±0.065
Caucasus_HG 0.155±0.058
Eastern_HG 0.799±0.038
chisq 7.970 tail_prob 0.715965

Anatolia_Chalcolithic 0.195±0.200
Caucasus_HG 0.238±0.192
Eastern_HG 0.567±0.076
chisq 10.965 tail_prob 0.446237


Anatolia_Chalcolithic 0.082±0.048
Caucasus_HG 0.135±0.042
Eastern_HG 0.783±0.030
chisq 5.610 tail_prob 0.898074

Caucasus_HG 0.065±0.047
Eastern_HG 0.804±0.025
Iran_Chalcolithic 0.132±0.051
chisq 6.909 tail_prob 0.806405

Caucasus_HG 0.147±0.033
Eastern_HG 0.797±0.027
Lengyel_LN 0.057±0.036
chisq 7.040 tail_prob 0.795835

Caucasus_HG 0.105±0.055
Eastern_HG 0.809±0.028
Iran_Late_Neolithic 0.086±0.051
chisq 8.304 tail_prob 0.685822

Armenia_Chalcolithic 0.130±0.056
Caucasus_HG 0.088±0.043
Eastern_HG 0.782±0.030
chisq 9.121 tail_prob 0.610719


Yamnaya_Samara:I0429 (3339-2917 calBCE)
Anatolia_Chalcolithic 0.190±0.063
Caucasus_HG 0.277±0.056
Eastern_HG 0.533±0.034
chisq 11.732 tail_prob 0.38412

I tried a number of different combinations of reference samples, and the three I settled for produced the best fits and lowest standard errors overall. That doesn't mean they literally show what happened; they're just the best we've got for the time being.

The results are very interesting, and perhaps unexpected, with Samara Eneolithic I0434 packing the highest ratio of Anatolia- and Caucasus-related ancestry, and, as per above, almost looking like he could be an early Yamnaya sample. I say perhaps unexpected because this individual belongs to Y-haplogroup Q1a and mitochondrial haplogroup U4a2, so his uniparental markers don't suggest any strong southern affinities.

But the result, even though only based on 13527 SNPs, looks robust enough, and it basically matches the Principal Component Analysis (PCA) that I featured in my original post.

Keep in mind that I0434 is the individual that appears to have been whacked over the head a few times and simply thrown into a ditch. Perhaps this suggests that the genetic shift in the Samara region from the Eneolithic to the Bronze Age, which saw the dilution of Eastern Hunter-Gatherer (EHG) ancestry by Anatolian- and Caucasus-related gene flows, was not always a peaceful and migrant-friendly process.

Wednesday, February 15, 2017

Post-ANE Siberian admixture in Middle Neolithic East Baltic foragers (?)

This hasn't been reported anywhere before, but it appears that at least one of the Latvian Middle Neolithic (MN) samples from Jones et al. 2017 harbors elevated post-Ancient North Eurasian (ANE) Siberian admixture.

If true, and it needs to be confirmed with more markers, then this individual, dated to ~6,000 cal BP, is the oldest European with this type of ancestry sequenced to date. Consider the following qpAdm models based on ~22K SNPs with Nganasans as the Siberian reference population:


Eastern_HG 0.788±0.096
Western_HG 0.135±0.078
Nganasan 0.076±0.038
chisq 10.493 tail_prob 0.486685

Eastern_HG 0.735±0.090
Western_HG 0.190±0.072
Nganasan 0.075±0.035
chisq 11.189 tail_prob 0.427555

I couldn't test Latvia_MN1 separately due to a lack of markers. However, using exactly the same setup on the older samples from Jones et al. 2017, the Nganasan-related signal fails to show for Latvia_HG and only registers at 0.5% for Ukraine_HG/N. But that 0.5% looks somewhat shaky considering the ten times higher standard error. The other coefficients make good sense.

Eastern_HG 0.314±0.042
Western_HG 0.686±0.042
Nganasan 0
chisq 10.035 tail_prob 0.612908

Eastern_HG 0.676±0.153
Western_HG 0.319±0.129
Nganasan 0.005±0.053
chisq 11.114 tail_prob 0.433755

So, you're probably asking, does Latvia_MN-related ancestry explain the elevated Nganasan-related ancestry in modern-day far Northeastern Europeans such as Finns? Perhaps some of it, but not all of it. Note the slight drop in the Nganasan-related ancestry for the Finns with the inclusion of Latvia_MN in the model.

Lengyel_LN 0.305±0.020
Western_HG 0.135±0.014
Yamnaya_Samara 0.457±0.025
Nganasan 0.104±0.008
chisq 12.401 tail_prob 0.25911

Latvia_MN 0.137±0.113
Lengyel_LN 0.316±0.070
Western_HG 0.119±0.051
Yamnaya_Samara 0.354±0.123
Nganasan 0.074±0.020
chisq 1.429 tail_prob 0.99764

My verdict: the minor Nganasan-related signal in Latvia_MN, or at least Latvia_MN2, is probably real, and the extra Nganasan-related admixture in modern-day Finns possibly arrived in Northeastern Europe in several waves from the Middle Neolithic onwards, including with early speakers of Uralic languages during the Bronze or Iron Age.

Monday, February 13, 2017

Mitogenome diversity in Sardinians

Good stuff at Mol Biol Evol:

Sardinians are “outliers” in the European genetic landscape and, according to paleogenomic nuclear data, the closest to early European Neolithic farmers. To learn more about their genetic ancestry, we analyzed 3,491 modern and 21 ancient mitogenomes from Sardinia. We observed that 78.4% of modern mitogenomes cluster into 89 haplogroups that most likely arose in situ. For each Sardinian-Specific Haplogroup (SSH), we also identified the upstream node in the phylogeny, from which non-Sardinian mitogenomes radiate. This provided minimum and maximum time estimates for the presence of each SSH on the island. In agreement with demographic evidence, almost all SSHs coalesce in the post-Nuragic, Nuragic and Neolithic-Copper Age periods. For some rare SSHs, however, we could not dismiss the possibility that they might have been on the island prior to the Neolithic, a scenario that would be in agreement with archeological evidence of a Mesolithic occupation of Sardinia.

Olivieri et al., Mitogenome Diversity in Sardinians: a Genetic Window onto an Island's Past, Mol Biol Evol, Published: 08 February 2017, DOI: