
Wednesday, January 8, 2014

Another look at the Lazaridis et al. ancient genomes preprint


I've now had a chance to look over the Lazaridis et al. preprint a few times, and also take part in several online discussions about the results, at these blogs and elsewhere. So I thought it might be useful to put together another post on the paper to report what I've learned and reiterate a few points. First of all, to understand the results, it's really important to know what the four main ancestral components in this study represent:

- West European Hunter-Gatherer (WHG), based on an 8,000 year-old genome from Loschbour, Luxembourg

- Ancient North Eurasian (ANE), based on a 24,000 year-old genome from South Siberia (dubbed Mal'ta boy or MA-1)

- Early European Farmer (EEF), based on a 7,500 year-old genome from Stuttgart, Germany, belonging to the Neolithic Linearbandkeramik (LBK) culture

- Eastern non-African (ENA), this basically means East Eurasian, and is based on samples of present-day Onge, Han Chinese and Atayal from Taiwan

Now, from what I've seen online, many people seem to think that ANE is more East Asian than European, and can be considered a signal of pretty much any population expansion from the east into Europe. This is not true. ANE is Amerindian-like, but actually also very similar to WHG. In fact, they're equidistant from ENA:

The results of Table S12.1 provide suggestive evidence that Onge share more common ancestry with hunter-gatherers than with Stuttgart. All statistics involving two hunter-gatherer populations have |Z|<0.9, so ancient Eurasian hunter-gatherers are approximately symmetrically related to Onge, and they are all more closely related to them than is Stuttgart.

We next consider the relationship of ancient samples to East Asia using the set (Ami, Atayal, Han, Naxi, She). East Asians are more closely related to all hunter-gatherers than to Stuttgart, but there are no significant differences between hunter-gatherers (all such statistics have |Z|<1.1) (Table S12.2).

...

We have conveniently labeled MA1-related ancestry “Ancient North Eurasian” because of the provenance of MA1 in Siberia, but at present we cannot be sure whether this type of ancestry originated there or was a recent migrant from some western region.

The various Uralic, Turkic and Mongolian groups expanding into Europe, usually after the Bronze Age, no doubt carried significant ENA, so these groups can't be the source of the fairly high levels of ANE across Europe today, because most Europeans lack ENA. Below is a graph based on two f4 tests, comparing ANE and ENA ancestry among Europeans, this time with the Han Chinese as ENA proxies. Note that most of the samples fall within a cline that runs from the Stuttgart sample to Estonians. The only outliers in the direction of the Han are groups from current or former Uralic and Turkic speaking areas of Europe.
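For anyone who hasn't worked with f4 tests before, here's a rough sketch of what the basic statistic measures. In its plain form, f4(A, B; C, D) is just the average over SNPs of (a - b) * (c - d), where the lowercase letters are allele frequencies in the four populations. A value near zero means C and D are symmetrically related to A and B, which is the sort of result the quotes above describe for the hunter-gatherers. Note that the frequencies and population names below are made up for illustration and aren't from the paper, and published estimators also attach a standard error and Z-score to each statistic, which this sketch doesn't attempt.

```python
def f4(a, b, c, d):
    """Plain f4(A, B; C, D): the average over SNPs of (a - b) * (c - d)."""
    if not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("all four populations need the same SNPs")
    if not a:
        raise ValueError("need at least one SNP")
    total = sum((pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d))
    return total / len(a)


# Hypothetical allele frequencies at five SNPs (illustration only).
outgroup = [0.10, 0.80, 0.50, 0.30, 0.60]
test_pop = [0.20, 0.70, 0.55, 0.35, 0.50]
hg_one = [0.40, 0.30, 0.70, 0.60, 0.20]
hg_two = [0.40, 0.30, 0.70, 0.60, 0.20]  # identical to hg_one on purpose
farmer = [0.30, 0.50, 0.60, 0.45, 0.40]

# Two populations with the same frequencies are symmetrically related
# to any other pair, so the statistic comes out at zero.
symmetric = f4(outgroup, test_pop, hg_one, hg_two)

# Swap in a population with different frequencies and the statistic
# moves away from zero; its sign shows which way the asymmetry runs.
asymmetric = f4(outgroup, test_pop, hg_one, farmer)
```

With real data the statistic is computed over hundreds of thousands of SNPs, and it's the Z-score, not the raw value, that tells you whether a departure from zero is meaningful.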


ANE was actually present in Scandinavia during the Mesolithic, because Motala12, the 8,000 year-old hunter-gatherer genome from Sweden, has an ANE ratio of 19%. But this isn't enough to explain the ANE levels carried by most present-day Europeans, so it's very likely there were at least two expansions of ANE into Europe.

Considering that Loschbour and Stuttgart totally lack ANE, it's plausible that a major wave of ANE moved across much of Europe sometime after the early Neolithic, but obviously before the Uralic and Turkic expansions, which, as per above, were rich in ENA. Based on recently published ancient mtDNA evidence from Central Europe (see here), Lazaridis et al. propose that this timeframe was the Copper and/or Bronze Age.

This of course is the generally accepted Proto-Indo-European timeframe. Indeed, the theory I put forward in the previous blog entry (see here) that most of the ANE in Europe today was the result of the Proto-Indo-European expansion, probably from Eastern Europe, looks even better on closer inspection.

Note the elongated cline formed by the European samples running from WHG to EEF on Fig 2B, shown below. It correlates well with latitude, and very likely reflects northward migrations of Neolithic farmers into Europe from the Mediterranean Basin, followed by isolation-by-distance. In other words, this cline probably took thousands of years to form.


On the other hand, there is no cline running from WHG/EEF to ANE, but all of the Indo-European and/or Eastern European samples are fairly evenly lifted up towards ANE relative to a few outliers. These outliers are all southwestern Europeans: Basques, Pais Vasco (Basque Country) Spaniards, southern French and Sardinians.

Of course, southwestern Europe is the most distant part of the continent from the generally accepted Indo-European homeland near the middle Volga. Moreover, Basques don't speak an Indo-European language, while Sardinians were only Indo-Europeanized during historic times.

Indeed, even though a couple of tables in the study report considerable ANE ancestry among Basques and Pais Vasco Spaniards, the authors admit that this need not be the case. For instance:

We next attempted to fit individual West Eurasian populations as a mixture of Loschbour and Stuttgart, as representatives of Early European farmers and West European Hunter Gatherers.

Fig. 1B suggests that this is not possible, as most Europeans form a cline that cannot be reconciled with such a mixture [Davidski's note: I think they actually mean Fig. 2B]. Nonetheless, for Sardinians (Extended Data Table 1), the most negative f3-statistic is of the form f3(Test; Loschbour, Stuttgart), which suggests that at least some Europeans may be consistent with having been formed by such a mixture. We thus fit each European population into the topology of Fig. S12.6. Only Basques, Pais_Vasco, and Sardinians, can be fit successfully with this model. Fig. S12.8 shows a successful fit.

Most European populations cannot be fit as this type of 2-way mixture and, intuitively, this is due to their tendency (Fig. 1B) towards Ancient North Eurasians that is not modeled by such a mixture.
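To make the f3 logic in the quote a little more concrete, here's a minimal sketch, assuming the plain definition f3(Test; Ref1, Ref2) = average over SNPs of (t - r1) * (t - r2). If the test population really is a mixture of the two references, its allele frequencies sit between theirs, the two differences have opposite signs, and the statistic comes out negative. The frequencies and the 30/70 mixing ratio below are invented for illustration, and real estimators also correct for sample size and report a Z-score, which I've left out.

```python
def f3(test, ref1, ref2):
    """Plain f3(Test; Ref1, Ref2): average over SNPs of (t - r1) * (t - r2)."""
    if not (len(test) == len(ref1) == len(ref2)):
        raise ValueError("all three populations need the same SNPs")
    if not test:
        raise ValueError("need at least one SNP")
    total = sum((t - r1) * (t - r2) for t, r1, r2 in zip(test, ref1, ref2))
    return total / len(test)


# Hypothetical allele frequencies at five SNPs (illustration only).
ref1 = [0.10, 0.80, 0.50, 0.30, 0.60]
ref2 = [0.40, 0.30, 0.70, 0.60, 0.20]

# Build a test population as an exact 30/70 mixture of the two references.
alpha = 0.3
mixed = [alpha * x + (1 - alpha) * y for x, y in zip(ref1, ref2)]

# Negative: consistent with the test population being a mixture.
f3_mixed = f3(mixed, ref1, ref2)

# Positive: ref1 is not a mixture of the other two.
f3_unmixed = f3(ref1, mixed, ref2)
```

Keep in mind that a negative f3 points to admixture, but a positive one doesn't rule it out, since drift in the test population after the mixture can push the value back above zero.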

Another intriguing thing about the results shown in Fig 2B is that the expansions of ANE across Europe appear not to have disturbed the presumably Neolithic WHG/EEF cline to any great extent. What this suggests is that ANE was spread largely independently of EEF and even WHG. In other words, the groups that pushed ANE deep into Europe probably had very high ratios of this component. This also seems to be true for the groups that brought ANE to the Near East:

A geographically parsimonious hypothesis would be that a major component of present-day European ancestry was formed in eastern Europe or western Siberia where western and eastern hunter-gatherer groups could plausibly have intermixed. Motala12 has an estimated WHG/(WHG+ANE) ratio of 81% (S12.7), higher than that estimated for the population contributing to modern Europeans (Fig. S12.14). Motala and Mal’ta are separated by 5,000km in space and about 17 thousand years in time, leaving ample room for a genetically intermediate population. The lack of WHG ancestry in the Near East (Extended Data Fig. 6, Fig. 1B) together with the presence of ANE ancestry there (Table S12.12) suggests that the population who contributed ANE ancestry there may have lacked substantial amounts of WHG ancestry, and thus have a much lower (or even zero) WHG/(WHG+ANE) ratio.

So perhaps the 17,000 year-old Afontova Gora 2 (AG2) genome from Central Siberia, classified as part of the ANE meta-population by Lazaridis et al., is genetically the closest sample we have to the Proto-Indo-Europeans? Based on a couple of the PCAs from Lazaridis et al. (below) and Raghavan et al. (see here), this genome doesn't appear to be 100% ANE. My very rough estimate is 85/15 ANE/WHG.


If my assumptions are correct here, then it's no wonder that this Bronze Age Danish sample (M4) from the recent Carpenter et al. paper (see here) shows a clear shift towards the Americans on the global PCA. M4 is better known as "the old man" from the giant Borum Eshøj barrow (see here), presumably built by some of the earliest Indo-Europeans in Scandinavia. We can probably expect such Afontova Gora 2-like results from many European samples archeologically linked to the early Indo-Europeans.




As for the first major expansion of ANE into Europe, here's an interesting map that I spotted in one of the online discussions on the paper, which shows the spread of microblade technology in almost all directions from around Lake Baikal just after the LGM (source). Among other things, it offers a very attractive explanation for the presence of ANE in Mesolithic Sweden, as well as the current distributions of Y-chromosome haplogroups R and Q (note that MA-1 belonged to R, which is the brother clade of Q).




But the problem with this scenario is the tight phylogenetic relationship between ANE and WHG. If the former expanded after the LGM from a refugium in South Siberia, then why is it so closely related to the latter, which presumably recolonized Europe from a Southern European LGM refugium, basically at the opposite end of Eurasia?

There also have been a lot of comments online about the potential correlations between ANE and certain clusters generated from modern samples with the ADMIXTURE software. I think it's obvious from just looking at the ADMIXTURE bar graph from Lazaridis et al. that ANE is linked in one way or another to the clusters that peak in Northeastern Europe, the North Caucasus, and South Central Asia (especially among the Indo-Iranian Kalash).

Below is the bar graph from the optimal ADMIXTURE run, the K=16. Note that ANE proxy MA-1 mostly shows membership in the cream and light blue clusters, which peak among the Kalash and Lithuanians, respectively. Click on the image to enlarge.




The Kalash-centered cluster, which actually first appears at K=14, and is more or less repeated in four runs, is particularly interesting, because it shows fairly similar distribution patterns to ANE. Note, for instance, that after South Central Asia it reaches its highest levels in the North Caucasus, which is where ANE also shows a major peak today (see here). Moreover, in Europe it's most pronounced in the east and north, but appears at comparatively trivial levels among the Basques, southern French and Pais Vasco Spaniards, and doesn't show up at all among Sardinians or the ancient European genomes.

However, it's often very difficult to make inferences about ancient population movements from ADMIXTURE results, and I think this is one of those cases. Just because this cluster peaks among the Kalash doesn't mean that it has its origins within this group, or even in Asia. I'd say the most plausible explanation for its existence is that it represents ANE that expanded rapidly across Eurasia, probably during the early Indo-European dispersals, and today reaches its highest frequencies among some of the most isolated and genetically drifted recipients of this ANE gene flow (i.e. those in the Caucasus and Hindu Kush).

By the way, the difference in ANE levels between southwestern Europeans and most other West Eurasians clearly shows on my own PCA and MDS maps. Below is the latest Eurogenes PCA of West Eurasia from a few months ago. Note the pronounced eastern shift among almost all the samples relative to the Basques, Pais Vasco Spaniards, and Sardinians. As per the f4 graph above, only in some instances is this shift also the result of significant ENA ancestry.




It's incredible what a few ancient genomes can add to the context of these sorts of analyses using modern DNA. I didn't really know what was causing this eastern shift when I posted the PCA, and guessed that it might simply be a lack of Mediterranean ancestry across Northern and Eastern Europe (see here).

I also just noticed that Razib posted two articles on the pigmentation traits of the ancient individuals (see here and here). The sample is tiny, but looking back, the fact that the Loschbour hunter-gatherer probably had blue eyes and dark skin, while, on the other hand, the Stuttgart farmer had relatively light skin, is actually quite remarkable.

We'll have a major story on our hands if several other hunter-gatherer genomes come back with similar results. It's just not something anyone would've predicted from modern DNA. Apart from that, there's also the slight shock factor of learning that our not too distant indigenous European ancestors were probably of a deep shade of brown. Imagine that, Europe might have only really lightened up and become white after Near Eastern migrants made their way over. Well, let's wait and see.

Citation...

Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, bioRxiv, Posted December 23, 2013, doi: 10.1101/001552

Raghavan et al., Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans, Nature, (2013), Published online 20 November 2013, doi:10.1038/nature12736

Carpenter et al., Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries, The American Journal of Human Genetics (2013), https://dx.doi.org/10.1016/j.ajhg.2013.10.002

See also...

Ancient human genomes suggest (more than) three ancestral populations for present-day Europeans

The really old Europe is mostly in Eastern Europe

EEF-WHG-ANE test for Europeans

Mesolithic genome from Spain reveals markers for blue eyes, dark skin and Y-haplogroup C6

209 comments:

Onur Dincer said...

"Davidski good observation He is obsessed in proving more ASI genes in Kurds and Iranians than there really is."

I have no such obsession.

"Though Palisto pretty much proved that ASI in Kurds is on average 1% and weaker than in a lot of other Near Eastern groups (Lebanese, Syrians, Turks etc) he doesn't seem to accept that."

Palisto proved no such thing.

"I am getting more and more sure that this guy is actually the Onur who is mostly of Bulgarian Turkish descent and wandering through a lot of Anthro-Forums trying to spread his obsession."

As I told you before, I am not that guy. My known ancestry is only 1/4 from Bulgaria (1/2 from Anatolia and 1/4 from Greece). I do not write in any forum, I only write in blogs. And I have always been on good terms with Kurds.

Chad said...

ajv70 of Sweden is 3.5% Australasian on globe 13 and ajv52 shows 6.6% south Asian. Perhaps, C is the culprit and in Europe very early.

Grey said...

@Chad
"R1b migrated to the Near East. It isn't from there. R is ANE not EEF. If R1b was in the Near East, then the EEF's would have ANE, which they do not."

@Kurti
"R1b is definitely Near Eastern But who says it has to be from the Western parts of the Near East where EEF dominated? R1b is most diverse in the Eastern parts of the Near East, Eastern Anatolia, Iranian plateau, Mesopotamia, the South Caucasus and parts of Central Asia."

The thing is, if Basques are just two components: WHG and Farmers, then the R1b must come from one of those two. If it's not the HG component - which would be very neat but the data currently says no - then it must be the farmer component.

However if the Basque R1b comes from the farmer component and the Basques only have two components then the Basque farmer component must come from somewhere which didn't have any other admixture but which developed farming anyway and then traveled to the Basque region. This must limit the options of where it could have come from.

Also possibly optional as i don't know if this is certain or not, if R was originally in the far north and the ice age pushed it back and split it into R1a and R1b in separate refugiums then the possible location for unadmixed R1b is limited further.

What candidate regions are there for the source of unadmixed R1b farmers?

Kristiina said...

It seems that the Papuan purple component is only seen in Mal'ta at K=6 and K=7, and its share may exceed 10%. That component is clearly present also in Tajik with a similar frequency and in Iranians. Interestingly, Eastern Europeans lack this component, but it is present in Caucasus, and traceable even in Palestinians and Bedouins (if I see the colors correctly). In Southeast Asia, higher frequencies are in Cambodians and Thais (even over 10%), and trace amount are present in Tibeto-Burmans. The highest frequencies are in Papuans (100%), Australians (c. 95%), Bougainvilleans, Onge (60%), Mala, Lodhi, all other Indians (30-40%), Makrani, Balochi, Brahui, Sindhi, Burusho, Kalash, Pathan (Pakistani group have c. 20%), Kusunda, Hazara, Uygur (trace amounts), in this order.
The picture is really very heavy. It takes almost an hour to upload it!

Kristiina said...

The Papuan purple component is only seen in Mal’ta at K=6 and K=7 (even 10%), but not in Motala, Loschbour or Stuttgart samples. Interestingly, purple color is absent also in Eastern Europeans. In the west, the Papuan component appears in Caucasus and even in Palestinians and Bedouins, but not for example in Cypriots, and higher amounts are present in Tajiks and Iranians. Somewhat higher amounts are seen also in Cambodians and Thais (even 10%) and trace amounts in Tibeto-Burmans. The highest frequencies are seen in India and South Pacific area: Papuans (100%), Australians (95%), Bougainvilleans (90%), Onge (60%), all Indians (30-40%, not Nepalese Kusunda), Makrani, Balochi, Brahui, Sindhi, Burusho, Kalash, Pathan (Pakistani groups have c. 15-20%), Hazara, Uygur and Uzbek (only trace elements). Trace amounts of purple color are seen even in East Africa.

It looks like this component did not arrive through Europe but through West Asia. In several tests, I have scored 1% of Oceanian, and I wonder how typical this is in Northern Europe.

The picture is really very heavy, and it takes an hour to upload it! :-)

About Time said...

Wrt South Asian mtDNA around ancient Mitanni/Assyria. We don't know how old those haplogroups are in South Asia until we get ancient DNA. The forthcoming results from IVC site will be helpful. (Not sure what to expect).

spagetiMeatball said...

Btw, David, do you speak any russian? If you do, or know someone who does, then I really recommend this amazing documentary about the scythians/sarmatians: http://www.youtube.com/watch?v=-3Bil-tT5Mc

It has a lot of interviews with russian archaeologists about finds in the 80s and 90s, like those done by Victor Sarianidi. It is very unfortunate that their works and materials are not translated into english and other western languages, because these guys really found treasure troves of data (literally, look at those gold cauldrons)

I don't know how much of it is true, but a few of the researchers talk about a fire-bird myth that passed on from the scythians possibly to the early turks and slavs.

Kristiina said...

I still think that the Comb-Ceramic culture (CCC) is a good candidate for the introduction of N1c and Uralic languages (both Saami and Baltic-Finnic), but there might have been several waves from several directions and also after the CCC, and languages have undergone important developments during later periods and probably also from the Uralic point of view. Finland has been a hot pot of different N lines and they all may have a different history.

Until I get convincing evidence to the contrary, I suppose that N1c came to Finland from Volga area, but the Eastern European lineage had probably a different route from Siberian lineages. As for I1, I think that it came to Finland from the Baltic area. Some haplotypes of R1a seem to have come from the Baltic area during the Battle Ax culture and thereafter. Saami seem to harbour an older R1a haplotype, but it is not typical of Finns. With this, I do not want to say that there is no Germanic I1 or R1a in Finland. Half of R1a may be of Germanic origin, but IMO the percentage is much smaller for I1. As for mtDNA, it seems to have come mainly from Russia and Baltic area.

Seinundzeit said...

A side note, but apparently, Dienekes once found something rather similar to the "Basal Eurasian" concept, using TreeMix on Admixture components:
http://dienekes.blogspot.com/2012/03/using-treemix-with-admixture-components.html
"Now, there appears to be some gene flow from what appears to be an early Proto-Eurasian population into Southwest Asians."

This was back in early 2012. Interestingly, he finds that his K12b "Southwest Asian" component is 18% "Proto-Eurasian". Pretty cool that he found something similar to the "Basal Eurasian" concept, but via a different route. I'm not sure if he even remembers this, since he didn't mention this experiment in his coverage of the Lazaridis et al. paper.
