Thursday, June 30, 2016

The Natufian puzzle

One of the biggest surprises in the new Lazaridis et al. preprint is that the Natufians don't show any Sub-Saharan African admixture when poked and prodded directly with formal statistics. However, TreeMix, which runs on formal statistics, doesn't have much trouble finding Sub-Saharan or related ancestry in both the Natufians and Neolithic farmers from the Levant. So what's going on?

Also worth noting in the fourth graph is the heavy migration edge from near the base of the East Eurasian Dai/Han branch to the early Neolithic farmers from Iran. My bet is that this is a signal of admixture from a Central Asian forager population, perhaps representing a parallel clade to that of the East Asians? Or is it just Ancient North Eurasian (ANE) admixture? I'll try and investigate that soon.

Wednesday, June 29, 2016

German Bell Beakers in the context of the prehistoric Near East

Fascinating stuff, and basically in line with the generally accepted archaeological model of Bell Beaker origins in Iberia.

Of course, these TreeMix results don't necessarily mean that German Beakers are a straight two-way mixture between Yamnaya pastoralists and Chalcolithic Iberians; they simply suggest that the ancestors of German Beakers experienced a significant pulse of admixture from a Chalcolithic Iberian-like population.

All of the samples used in this analysis are freely available for download at the Reich Lab website here.

Update 30/06/2016: Below is the same graph but with German Corded Ware added to the line up. The German Beakers still receive a significant admixture edge from the base of the Iberia Chalcolithic branch. However, the percentage of this edge as a proportion of their ancestry has fallen to 33%.

Saturday, June 25, 2016

D-stats/nMonte open thread #3

For the latest datasheets with D-stats of the form D(Chimp,Columns)(Mbuti.DG,Rows), featuring samples from Lazaridis et al. 2016, see here, here and here.

Datasheets with D-stats of the form D(Chimp,Rows)(Mbuti.DG,Columns) are available here, here and here. D-stats 1 and 1b include Iran_Chalcolithic in both the rows and columns, while D-stats 3 and 3b have Eastern_HG in both the rows and columns.

The interesting question is, which of these sheets is the best for estimating admixture proportions, primarily in populations from West Eurasia?
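
For anyone wondering how a D-stats sheet can be turned into admixture proportions at all, here's a minimal sketch of the general idea in Python. It is not nMonte itself (which is an R script), just a simplified stand-in under some loud assumptions: each population is a row of D-stats, the target is modeled as a weighted average of source rows, and the weights are found by random single-slot swaps that are kept only when they shrink the Euclidean distance to the target. The population names and all of the numbers are made up for illustration.

```python
import random


def mix(sources, counts):
    """Weighted average of the source rows, using slot counts as weights."""
    total = sum(counts.values())
    dims = len(next(iter(sources.values())))
    return [
        sum(counts[name] * sources[name][i] for name in sources) / total
        for i in range(dims)
    ]


def distance(a, b):
    """Euclidean distance between two rows of D-stats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def fit_mixture(target, sources, slots=100, cycles=20000, seed=1):
    """Monte Carlo fit: move one slot at a time, keep only improvements."""
    rng = random.Random(seed)
    names = list(sources)
    counts = {name: 0 for name in names}
    for i in range(slots):  # start from a roughly even split
        counts[names[i % len(names)]] += 1
    best = distance(mix(sources, counts), target)
    for _ in range(cycles):
        give, take = rng.sample(names, 2)
        if counts[give] == 0:
            continue
        counts[give] -= 1
        counts[take] += 1
        d = distance(mix(sources, counts), target)
        if d < best:
            best = d
        else:  # undo the swap if it did not help
            counts[give] += 1
            counts[take] -= 1
    return {name: counts[name] / slots for name in names}, best


# Hypothetical rows: each source has a higher D-stat with "its own" column.
sources = {
    "Source_A": [0.050, 0.020, 0.020],
    "Source_B": [0.020, 0.050, 0.020],
    "Source_C": [0.020, 0.020, 0.050],
}
# Toy target built as 60% Source_A + 40% Source_B.
target = [0.038, 0.032, 0.020]

proportions, fit = fit_mixture(target, sources)
print(proportions, round(fit, 6))
```

The toy target was built as a 60/40 mix of Source_A and Source_B, so the fit should land on or very near those proportions. Real sheets are messier: the answer depends on which columns are included, which is exactly why the choice of sheet matters.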

Thursday, June 23, 2016

A moment of clarity

A lot of things now make so much more sense thanks to all of the recently published ancient DNA. For instance, in the Principal Component Analysis (PCA) below, South Central Asians (SC_Asia) finally look like a three-way mixture of Bronze Age steppe pastoralists, early farmers from Iran and surrounds, and indigenous South Asians, which is exactly what they are.

By the way, I also ran a global analysis but didn't get the chance to make a decent plot. However, the datasheet is available for download here. The samples are from a variety of recent DNA papers and freely available at the Reich Lab site here.

Wednesday, June 22, 2016

Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic

The public version of the Lazaridis et al. 2016 dataset is now available for download at the Reich Lab website here. Many thanks to the authors for releasing their data before formal publication, and in fact apparently even before the end of the peer review.

As usual from this team, it's high quality stuff with hundreds of thousands of SNPs genotyped in most of the 45 ancient samples from the Near East. And to think that only a couple of years ago the idea of getting genome-wide data from even a single ancient individual from hot places like the Near East was just that, an idea.

I'm planning to do a lot with this data, but the first issue I want to tackle is the genetic structure of the Yamnaya pastoralists from the Early Bronze Age (EBA) European steppes.

Lazaridis et al. show that Early to Middle Bronze Age steppe groups, including Yamnaya, tagged by them as Steppe EMBA, are best modeled with formal statistics as a mixture of Eastern European Hunter-Gatherers (EHG) and Chalcolithic farmers from western Iran. The mixture ratios are 56.8/43.2, respectively.

However, they add that a model of Steppe EMBA as a three-way mixture between EHG, the Chalcolithic farmers and Caucasus Hunter-Gatherers (CHG) is also a good fit and plausible.

I've looked at the topic before and concluded that Yamnaya had to be in large part of CHG origin, with only minor admixture from early farmers, probably from Eastern Europe (see here and here). After having a chance to study the data from Lazaridis et al. 2016, I stand by my earlier results.

Below are a couple of TreeMix graphs featuring Yamnaya alongside a variety of modern and ancient groups, including several potentially relevant to its ancestry, such as Armenia_Chalcolithic and Iran_Chalcolithic from Lazaridis et al. 2016. The full output is available for download here.

Now, it is true that TreeMix is a temperamental algorithm. It can react in extreme ways to the types of samples chosen by the user, often showing results that might appear wrong, or at the very least counter-intuitive. On the other hand, my experience shows that it's also exceptionally effective at picking up and characterizing significant and relatively sudden pulses of admixture. Moreover, unlike modeling with formal stats, it's an unsupervised test.

Clearly, the graphs below are very much at odds with the claim that Yamnaya might be in large part of Iranian Chalcolithic or similar ancestry. As per my earlier tests, it appears to be overwhelmingly a mixture between EHG and CHG.

It's also important to note that the uniparental marker data in Lazaridis et al. firmly back up my TreeMix output, with the Steppe EMBA groups showing starkly different Y-chromosome and mitochondrial (mtDNA) haplogroups from the ancient samples from Iran.

Indeed, mtDNA haplogroup U7 is an excellent diagnostic marker for ancestry from the southern Caspian region, and, sure enough, it appears in the Iranian Chalcolithic set. Conversely, it's conspicuous by its absence from all Bronze Age steppe remains tested to date.

Admittedly, it's still extremely difficult to be precise about the source of the southern admixture in Yamnaya without lots of high quality samples from all over the steppe and surrounds. But already Iran looks a highly unlikely proposition.

See also...

Modeling Steppe_EMBA

The story of mtDNA haplogroup U7

Another look at the genetic structure of Yamnaya

Monday, June 20, 2016

Ancient genomes from the Himalayas

Over at PNAS at this LINK:

Abstract: The high-altitude transverse valleys [>3,000 m above sea level (masl)] of the Himalayan arc from Arunachal Pradesh to Ladakh were among the last habitable places permanently colonized by prehistoric humans due to the challenges of resource scarcity, cold stress, and hypoxia. The modern populations of these valleys, who share cultural and linguistic affinities with peoples found today on the Tibetan plateau, are commonly assumed to be the descendants of the earliest inhabitants of the Himalayan arc. However, this assumption has been challenged by archaeological and osteological evidence suggesting that these valleys may have been originally populated from areas other than the Tibetan plateau, including those at low elevation. To investigate the peopling and early population history of this dynamic high-altitude contact zone, we sequenced the genomes (0.04×–7.25×, mean 2.16×) and mitochondrial genomes (20.8×–1,311.0×, mean 482.1×) of eight individuals dating to three periods with distinct material culture in the Annapurna Conservation Area (ACA) of Nepal, spanning 3,150–1,250 y before present (yBP). We demonstrate that the region is characterized by long-term stability of the population genetic make-up despite marked changes in material culture. The ancient genomes, uniparental haplotypes, and high-altitude adaptive alleles suggest a high-altitude East Asian origin for prehistoric Himalayan populations.

Jeong et al., Long-term genetic stability and a high-altitude East Asian origin for the peoples of the high valleys of the Himalayan arc, PNAS June 20, 2016, doi: 10.1073/pnas.1520844113

Saturday, June 18, 2016

Genetics of an early Neolithic pastoralist from western Iran (Gallego Llorente et al. preprint)

Just in at bioRxiv:

Abstract: The agricultural transition profoundly changed human societies. We sequenced and analysed the first genome (1.39x) of an early Neolithic woman from Ganj Dareh, in the Zagros Mountains of Iran, a site with early evidence for an economy based on goat herding, ca. 10,000 BP. We show that Western Iran was inhabited by a population genetically most similar to hunter-gatherers from the Caucasus, but distinct from the Neolithic Anatolian people who later brought food production into Europe. The inhabitants of Ganj Dareh made little direct genetic contribution to modern European populations, suggesting they were somewhat isolated from other populations in the region. Runs of homozygosity are of a similar length to those from Neolithic Anatolians, and shorter than those of Caucasus and Western Hunter-Gatherers, suggesting that the inhabitants of Ganj Dareh did not undergo the large population bottleneck suffered by their northern neighbours. While some degree of cultural diffusion between Anatolia, Western Iran and other neighbouring regions is possible, the genetic dissimilarity of early Anatolian farmers and the inhabitants of Ganj Dareh supports a model in which Neolithic societies in these areas were distinct.

Gallego Llorente et al., The genetics of an early Neolithic pastoralist from the Zagros, Iran, bioRxiv preprint, posted June 18, 2016, doi:

The same individual, tagged as GD13a by Gallego Llorente et al., is also featured in the new Lazaridis et al. preprint as I1290 (see here). Not sure if there's much point in two different papers on the same sample, but at least the sequences are different.

In any case, here's a very interesting part from the paper dealing with the population history of South Asia:

It is possible that farmers related to GD13a contributed to the eastern diffusion of agriculture from the Near East that reached Turkmenistan (34) by the 6th millennium BP, and continued further east to the Indus Valley (35). However, detecting such a contribution is complicated by a later influx from Steppe populations with Caucasus Hunter-Gatherer ancestry during the Bronze Age. We tested whether the Western Eurasian component found in Indian populations can be better attributed to either of these two sources, GD13a and Kotias (a Caucasus Hunter-Gatherer), using D-statistics to detect gene flow into an ancestral Indian component (represented by the Onge). For all tests where a difference could be detected, Kotias acted a better proxy than GD13a (Fig. S9 and Table S6). This result implies that the majority of the West Eurasian component seen in India derives from the Bronze age migrations; this interpretation is supported by dating of last contact based on patterns of Linkage Disequilibrium (36).

Dating admixture events with Linkage Disequilibrium patterns is somewhat controversial, but what they're saying is more or less in agreement with what I've been bleating about on this blog for the last few years (for instance, see here). So I'm very happy to finally see others noticing the same thing.

However, I have to say that the Principal Component Analysis (PCA) in this paper is off the wall. Many of the ancient samples don't appear to cluster where they should. For instance, most of the European Hunter-Gatherers (HGs) are way too close to present-day samples, and in fact in some cases they're overlapping with them, which is just wrong. Note also the unusual elongated cluster formed by the Neolithic Anatolians, stretching all the way from where they should all cluster (south of the Sardinians), to where they really shouldn't (right next to the North Caucasians).

My bet is that the results are affected by missing markers, with the ancient samples with high rates of missing markers being pulled into the middle of the plot (towards 0.00 in both dimensions).

Below is a similar PCA that I ran using my dataset minus GD13a, which I don't yet have access to. Note the relatively tight clusters formed by all of the ancient populations. The European Hunter-Gatherers are distinct from present-day Europeans, while none of the Neolithic Anatolians (Anatolia_N) fall near the North Caucasians in dimension 2. This is of course in line with a wide range of formal stats.
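
The missing-marker effect is easy to reproduce with toy data. Below is a minimal Python sketch, assuming missing genotypes are handled by filling in the per-SNP mean, which is one common approach but not necessarily what Gallego Llorente et al. did. After centering, a mean-filled SNP contributes nothing to a sample's coordinate, so the more markers are missing, the closer the sample lands to 0.00. The axis, genotypes and missing rates are all invented for illustration.

```python
import random


def project(genotypes, means, loadings):
    """Project one sample onto a pre-computed PC axis; None marks a missing genotype."""
    score = 0.0
    for g, m, w in zip(genotypes, means, loadings):
        if g is None:
            # Mean imputation: the centred value is 0, so this SNP adds nothing.
            continue
        score += (g - m) * w
    return score


def mask(genotypes, missing_rate, rng):
    """Randomly blank out a fraction of genotypes to mimic a low-coverage sample."""
    return [None if rng.random() < missing_rate else g for g in genotypes]


rng = random.Random(42)
n_snps = 5000

# Hypothetical axis with equal-sized loadings of random sign.
loadings = [rng.choice([-1.0, 1.0]) / n_snps ** 0.5 for _ in range(n_snps)]
means = [1.0] * n_snps  # toy per-SNP mean genotype (0/1/2 coding)

# Toy sample sitting far out on the positive side of the axis.
sample = [2 if w > 0 else 0 for w in loadings]

for rate in (0.0, 0.5, 0.9):
    coord = project(mask(sample, rate, rng), means, loadings)
    print(f"missing rate {rate:.0%}: PC coordinate {coord:.1f}")
```

The same genome ends up at three very different positions depending only on how many markers it's missing, which would be one way to get hunter-gatherers drifting towards present-day samples and a stretched-out Anatolian cluster.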

See also...

Early Neolithic genomes from the eastern Fertile Crescent (Broushaki et al. 2016)

Friday, June 17, 2016

The genetic structure of the world's first farmers (Lazaridis et al. 2016 preprint)

Huge one from the Laz at bioRxiv:

We report genome-wide ancient DNA from 44 ancient Near Easterners ranging in time between ~12,000-1,400 BCE, from Natufian hunter-gatherers to Bronze Age farmers. We show that the earliest populations of the Near East derived around half their ancestry from a 'Basal Eurasian' lineage that had little if any Neanderthal admixture and that separated from other non-African lineages prior to their separation from each other. The first farmers of the southern Levant (Israel and Jordan) and Zagros Mountains (Iran) were strongly genetically differentiated, and each descended from local hunter-gatherers. By the time of the Bronze Age, these two populations and Anatolian-related farmers had mixed with each other and with the hunter-gatherers of Europe to drastically reduce genetic differentiation. The impact of the Near Eastern farmers extended beyond the Near East: farmers related to those of Anatolia spread westward into Europe; farmers related to those of the Levant spread southward into East Africa; farmers related to those from Iran spread northward into the Eurasian steppe; and people related to both the early farmers of Iran and to the pastoralists of the Eurasian steppe spread eastward into South Asia.

Lazaridis et al., The genetic structure of the world's first farmers, bioRxiv preprint, posted June 16, 2016, doi:

And here's a list of the Y-chromosome haplogroups for the new samples in this paper:

Armenia_ChL (Chalcolithic Armenia)

I1407: L1a
I1632: L1a
I1634: L1a


I1635: R1b1-M415(xM269)

Iran_Mesolithic (Hotu Cave)

I1293: J(xJ2a1b3, J2b2a1a1)


I1945: P1(xQ, R1b1a2, R1a1a1b1a1b, R1a1a1b1a3a, R1a1a1b2a2a)

My guess here is that this is R2, and hopefully we shall see when the bam files are released.

I1949: CT


I1671: G2a1(xG2a1a)

Iran_ChL (Chalcolithic Iran)

I1662: J(xJ1a, J2a1, J2b)
I1674: G1a(xG1a1)


I0861: E1b1b1b2(xE1b1b1b2a, E1b1b1b2b)
I1069: E1b1(xE1b1a1, E1b1b1b1)
I1072: E1b1b1b2(xE1b1b1b2a, E1b1b1b2b)
I1685: CT
I1690: CT


I0867: H2 (PPNB)
I1414: E(xE2, E1a, E1b1a1a1c2c3b1, E1b1b1b1a1, E1b1b1b2b) (PPNB)
I1415: E1b1b1 (PPNB)
I1416: CT (PPNB)
I1707: T(xT1a1, T1a2a) (PPNB)
I1710: E1b1b1(xE1b1b1b1a1, E1b1b1a1b1, E1b1b1a1b2, E1b1b1b2a1c) (PPNB)
I1727: CT(xE, G, J, LT, R, Q1a, Q1b) (PPNB)
I1700: CT (PPNC)


I1705: J1(xJ1a)
I1730: J(xJ1, J2a, J2b2a)

Update 25/07/2016: The peer-reviewed paper was published at Nature today under the title Genomic insights into the origin of farming in the ancient Near East. See here.

See also...

A homeland, but not the homeland

A moment of clarity

Early Neolithic genomes from the eastern Fertile Crescent (Broushaki et al. 2016)

Thursday, June 9, 2016

The discrepancy

I posted this in the comments in the previous blog entry [here], but it hasn't received the attention that it deserves, so it's now getting an entry all of its own. I'm also posting Matt's reply, since he motivated me to look at the formal stats from Hofmanova et al. in more detail. I'd like to get to the bottom of this. Any ideas?

Davidski said...


How would you interpret these sets of f4 and D statistics?

The f4 stats are from Hofmanova et al., while the D stats were run by me. The first set of D stats uses the highest quality Anatolia Neolithic sample from Barcin from Mathieson et al. and CHG genotypes from Fu et al., and the second uses the same Barcin sample plus CHG genotypes from Jones et al.

Also, keep in mind that, as far as I can tell, the Barcin genomes from Hofmanova et al. and Mathieson et al. date to the same period.

f4 Corded_Ware_LN Bar8 Satsurblia Khomani -0.0367 -8.145
f4 Corded_Ware_LN Bar8 Kotias Khomani -0.0193 -3.437

f4 Spain_MN Bar8 Satsurblia Khomani -0.0327 -5.385
f4 Spain_MN Bar8 Kotias Khomani -0.0182 -3.136


D Corded_Ware_Germany BAR20_I0709 Satsurblia Khomani 0.0215 4.299
D Corded_Ware_Germany BAR20_I0709 Kotias Khomani 0.0205 4.408

D Iberia_MN BAR20_I0709 Satsurblia Khomani -0.0003 -0.06
D Iberia_MN BAR20_I0709 Kotias Khomani -0.0017 -0.339


D Corded_Ware_Germany BAR20_I0709 Satsurblia2 Khomani 0.0224 4.38
D Corded_Ware_Germany BAR20_I0709 Kotias2 Khomani 0.0226 5.168

D Iberia_MN BAR20_I0709 Satsurblia2 Khomani 0.0068 1.222
D Iberia_MN BAR20_I0709 Kotias2 Khomani 0 0.004

Clearly, something's horribly wrong. If I made a mistake, my apologies. But I'm pretty sure I didn't make any mistakes. I checked the datasets that I'm using for consistency with the f4 and D stats published in Mathieson et al. and Fu et al., so I can say with confidence that my D stats should not be much different from correctly run f4 stats using the same ancient samples.

June 8, 2016 at 7:06 PM

Matt said...

@ Davidski, yeah I see what's going on there with the D stats giving a result we would expect from previous work - Anatolia Neolithic and Iberia_MN equally related to CHG, Corded Ware more to CHG - with the stats from this paper being different - Bar8 being more related to CHG than Iberia_MN, and Bar8 even more strongly related to CHG relative to Corded_Ware, also implying Iberia_MN more related to CHG than Corded Ware is. I don't know that there's anything about f4 vs D stats themselves that would explain that difference, and as you say, yours are consistent with the previously published.

This is really stuff that should have been caught and resolved at the preprint stage. That's the whole point of the process!

June 9, 2016 at 12:13 AM

Update 12/06/2016: Here are f4 stats using the same data as for the D stats above. They look basically the same as the D stats.

Corded_Ware_Germany BAR20_I0709 Satsurblia Khomani 0.002073 4.294
Corded_Ware_Germany BAR20_I0709 Kotias Khomani 0.001997 4.402

Iberia_MN BAR20_I0709 Satsurblia Khomani -0.00003 -0.06
Iberia_MN BAR20_I0709 Kotias Khomani -0.000157 -0.34

Corded_Ware_Germany BAR20_I0709 Satsurblia2 Khomani 0.0021 4.388
Corded_Ware_Germany BAR20_I0709 Kotias2 Khomani 0.002031 4.971

Iberia_MN BAR20_I0709 Satsurblia2 Khomani 0.000613 1.221
Iberia_MN BAR20_I0709 Kotias2 Khomani 0.000002 0.004
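
The reason the f4 values above are roughly ten times smaller than the D values while the Z scores barely move is easier to see in code. Here's a minimal Python sketch, assuming the usual ADMIXTOOLS-style definitions: both statistics share the numerator (w - x)(y - z) summed over SNPs, but f4 just averages it, whereas D divides it by a normalizing term. The allele frequencies are invented toy values, and the block jackknife needed for Z scores is left out.

```python
def f4_and_d(freqs):
    """Return (f4, D) for populations W, X, Y, Z.

    freqs is a list of (w, x, y, z) allele frequencies, one tuple per SNP.
    Assumes at least one SNP is informative, so the D denominator is not zero.
    """
    numerator = 0.0
    denominator = 0.0
    for w, x, y, z in freqs:
        # Shared numerator: correlated frequency differences between the two pairs.
        numerator += (w - x) * (y - z)
        # D only: normalizing term, which is at most 1 per SNP.
        denominator += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    f4 = numerator / len(freqs)  # plain average over SNPs
    d = numerator / denominator  # normalized version
    return f4, d


# Invented frequencies for four SNPs, purely for illustration.
toy_freqs = [
    (0.50, 0.25, 0.75, 0.25),
    (0.10, 0.20, 0.60, 0.40),
    (0.90, 0.70, 0.30, 0.10),
    (0.40, 0.40, 0.80, 0.50),
]

f4, d = f4_and_d(toy_freqs)
print(f"f4 = {f4:.4f}, D = {d:.4f}")
```

Because the numerator is identical, the two statistics always have the same sign, and D is never smaller in magnitude than f4. So a sign flip between published f4 stats and D stats on supposedly comparable samples can't be put down to the choice of statistic.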

Monday, June 6, 2016

Comic relief from Hofmanova et al. at PNAS

PNAS has a new paper on the Neolithic transition in Europe. I don't know what the authors were puffing on when they computed the inferred mixture coefficients, but they look like crap, with Loschbour-related admixture (in other words, indigenous European ancestry) peaking near the Caspian Sea, including among North Caucasians and Kalmyks. The Kalmyks live just northwest of the Caspian today, but they're recent migrants to Europe from Mongolia.

Moreover, western Turks appear to show fairly even ratios of Loschbour and early Aegean farmer admixture, which is also strange.

I pointed out this problem when the paper was posted for review at bioRxiv (see here), but my comments were ignored. The co-authors responsible for this analysis are Lucy van Dorp, Saioa Lopez and Garrett Hellenthal. The paper was edited by Eske Willerslev of the University of Copenhagen. Holy shit!

Zuzana Hofmanová et al., Early farmers from across Europe directly descended from Neolithic Aegeans, PNAS June 6, 2016, doi: 10.1073/pnas.1523951113

See also...

The discrepancy

Thursday, June 2, 2016

On crop dispersal in prehistoric Central Asia

Were the Harappans what they ate? If so, they were mostly West Asian. Open access at The Holocene:

Abstract: The period from the late third millennium BC to the start of the first millennium AD witnesses the first steps towards food globalization in which a significant number of important crops and animals, independently domesticated within China, India, Africa and West Asia, traversed Central Asia greatly increasing Eurasian agricultural diversity. This paper utilizes an archaeobotanical database (AsCAD), to explore evidence for these crop translocations along southern and northern routes of interaction between east and west. To begin, crop translocations from the Near East across India and Central Asia are examined for wheat (Triticum aestivum) and barley (Hordeum vulgare) from the eighth to the second millennia BC when they reach China. The case of pulses and flax (Linum usitatissimum) that only complete this journey in Han times (206 BC–AD 220), often never fully adopted, is also addressed. The discussion then turns to the Chinese millets, Panicum miliaceum and Setaria italica, peaches (Amygdalus persica) and apricots (Armeniaca vulgaris), tracing their movement from the fifth millennium to the second millennium BC when the Panicum miliaceum reaches Europe and Setaria italica Northern India, with peaches and apricots present in Kashmir and Swat. Finally, the translocation of japonica rice from China to India that gave rise to indica rice is considered, possibly dating to the second millennium BC. The routes these crops travelled include those to the north via the Inner Asia Mountain Corridor, across Middle Asia, where there is good evidence for wheat, barley and the Chinese millets. The case for japonica rice, apricots and peaches is less clear, and the northern route is contrasted with that through northeast India, Tibet and west China. Not all these journeys were synchronous, and this paper highlights the selective long-distance transport of crops as an alternative to demic-diffusion of farmers with a defined crop package.

Stevens et al., Between China and South Asia: A Middle Asian corridor of crop dispersal and agricultural innovation in the Bronze Age. The Holocene, published online before print June 1, 2016, doi: 10.1177/0959683616650268

Wednesday, June 1, 2016

The man with the flat occiput

The image below is of a skull from a Bell Beaker burial in Bee Low, Derbyshire, England, dated to 2200–2030 calBC. Its "extremely low" oxygen isotope value of 16.2‰ matches that of the Amesbury Archer, and suggests that it may have belonged to a migrant from a place with a cold and "continental" climate, possibly outside of Britain. You can read more about this skull and its unusually flat occiput in a new paper about British Beakers by Pearson et al. at Antiquity here. If you don't have academic access to journals, try the link here. Bell Beaker Blogger also has a useful summary of the paper here.

Abstract: The appearance of the distinctive ‘Beaker package’ marks an important horizon in British prehistory, but was it associated with immigrants to Britain or with indigenous converts? Analysis of the skeletal remains of 264 individuals from the British Chalcolithic–Early Bronze Age is revealing new information about the diet, migration and mobility of those buried with Beaker pottery and related material. Results indicate a considerable degree of mobility between childhood and death, but mostly within Britain rather than from Europe. Both migration and emulation appear to have had an important role in the adoption and spread of the Beaker package.

Mike Parker Pearson, Andrew Chamberlain, Mandy Jay, Mike Richards, Alison Sheridan, Neil Curtis, Jane Evans, Alex Gibson, Margaret Hutchison, Patrick Mahoney, Peter Marshall, Janet Montgomery, Stuart Needham, Sandra O'Mahoney, Maura Pellegrini and Neil Wilkin (2016). Beaker people in Britain: migration, mobility and diet. Antiquity, 90, pp 620-637 doi:10.15184/aqy.2016.72

See also...

Who's your (proto) daddy Western Europeans?