Wednesday, December 11, 2013
Last year Current Biology put out a paper on the partial genome sequences of two Mesolithic Iberian hunter-gatherers, dubbed La Brana 1 and 2, which showed that they were genetically more similar to modern-day Northern Europeans than Iberians. According to Spanish news portal Leonoticias.com, the genome of La Brana 1 has now been fully sequenced, and the more comprehensive new data not only back up the initial findings, but also suggest that this individual had blue eyes:
El mesolítico 'leonés' afín al ciudadano del norte de Europa (The 'Leonese' Mesolithic man, akin to Northern Europeans)
As per the link above, the new paper will be published in a few weeks. I suppose this means we'll finally see a Y-chromosome haplogroup result from pre-Neolithic Europe. I'm betting on hg R, considering that this was the marker of the Mal'ta boy from Upper Paleolithic South Siberia (see here). Siberia might seem like a long way from Iberia, but in fact, for thousands of years both regions were connected by the Mammoth Steppe, which was inhabited by highly mobile herds of animals and the human hunters who followed them. However, I won't be surprised if it turns out that La Brana 1 belonged to hg I or even Q. Any other suggestions, Maju?
Ancient DNA from Iberian Mesolithic hunter-gatherers
Friday, December 6, 2013
I just had a look at the updated Ancestry Composition (AC) results of many of the people I'm sharing with at 23andMe. Yes, after a year of waiting the AC has finally been updated, with an overfitting fix and an improved disambiguation of African and Asian ancestry. But to be blunt, the AC still sucks.
There's actually no excuse why it should still suck. The 23andMe scientists have obviously done a great job with the algorithms that make up the AC, because they appear to be highly accurate when used together with well defined reference sets that represent robust biogeographical clusters. So the hard part is done. However, the problem is that several of the reference sets aren't well defined, and that's putting it mildly. Here are some examples:
- Croatians are in a Balkan reference set alongside Greeks, and even more unbelievably, Maltese. On the other hand, their genetically very similar neighbors, the Slovenians, are in the Eastern European reference set, alongside the HGDP Russians from Kargopol, in the north of Russia.
- The North African reference set is mostly made up of samples from the Near East, not North Africa. Also, the samples from Northwest Africa, like the Mozabite Berbers, are genetically quite distinct from all of the other samples in this reference set.
- Czechs are in the Eastern European reference set, while their genetically very similar neighbors, the Austrians, are in the French & German reference set. Below is a Principal Component Analysis (PCA) from Nelis et al. showing the genetic relationship between these two Central European populations, relative to the differences within another three European countries.
This appears to be causing problems for some users, because supervised admixture analyses like the AC need relatively pure reference sets from robust biogeographical clusters to work properly. For instance, I'm sharing with Germans and Austrians who, in the standard mode, aren't even classified as 2% French & German, but over 10% Eastern European, and mostly nonspecific Northern European and European. In speculative mode their French & German proportions rise a few per cent.
I think what's happened in this case is that 23andMe has ignored the existence of the biogeographical region known as Central Europe. As a result, people from this area are mostly sitting in a nonspecific European no-man's land. That's because they're not particularly French, nor are they typically Eastern European.
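To make the point about reference sets a bit more concrete, here's a minimal sketch of a supervised two-way admixture estimate. This is a toy of my own, not 23andMe's algorithm: it assumes unlinked SNPs, known alternate-allele frequencies for two hypothetical reference sets, and a simple grid search over the ancestry proportion. All function and variable names are made up for illustration.

```python
import math


def log_likelihood(genotypes, freqs_a, freqs_b, q):
    """Binomial log-likelihood of the genotypes (0/1/2 alternate-allele counts)
    if a fraction q of ancestry comes from reference A and 1 - q from reference B.
    The binomial coefficient is dropped because it does not depend on q."""
    total = 0.0
    for g, fa, fb in zip(genotypes, freqs_a, freqs_b):
        p = q * fa + (1 - q) * fb        # expected allele frequency under the mixture
        p = min(max(p, 1e-6), 1 - 1e-6)  # keep p away from 0 and 1 to avoid log(0)
        total += g * math.log(p) + (2 - g) * math.log(1 - p)
    return total


def estimate_proportion(genotypes, freqs_a, freqs_b, steps=100):
    """Return the grid value of q (ancestry from reference A) with the highest likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(genotypes, freqs_a, freqs_b, q))


# Hypothetical allele frequencies for two well separated reference sets.
ref_a = [0.9, 0.1, 0.8, 0.2]
ref_b = [0.1, 0.9, 0.2, 0.8]

looks_like_a = estimate_proportion([2, 0, 2, 0], ref_a, ref_b)
in_between = estimate_proportion([1, 1, 1, 1], ref_a, ref_b)
```

The closer the two sets of reference frequencies are to each other, the flatter the likelihood becomes across q, and the less the estimate means. That's the sort of thing I'd expect to happen when a reference set lumps together populations that don't really form a cluster.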
Perhaps some might argue that this Central European biogeographical cluster doesn't really exist, and that it's actually a buffer zone between Western and Eastern Europe? If so, I beg to differ. Clusters specific to Central Europe show up in fine scale analyses with such programs as ChromoPainter and Mclust, and they're quite distinct from clusters specific to France (see here).
So unfortunately it seems that the scientists at 23andMe aren't doing enough to search for the most robust clusters in their dataset. Instead, based on what I've read at the 23andMe website, they seem to be using basic PCAs and their customers' self-reported ancestry to guide them. I reckon they should give ChromoPainter and Mclust a go. There's a PDF article about how these two methods compare to each other here. ChromoPainter is better overall, but Mclust is much faster.
If, like me, you're a client at 23andMe and agree with what I've said here, then please send a link to this blog post to the scientists at 23andMe responsible for the AC. I think it'd be a shame to see this powerful tool and the thousands of reference samples available to 23andMe not used to their full potential.
Maybe someone over there will listen, and next time there's an update we'll see Northwest Africans in a reference set of their own, Palestinians, Jordanians and Saudi Arabians not classified as North Africans, the Maltese taken out of the Balkan reference set, Germans in a Central European or their own reference set, and plenty of other improvements.
23andMe’s Ancestry Composition – a preliminary review
Nelis M, Esko T, Mägi R, Zimprich F, Zimprich A, et al. (2009) Genetic Structure of Europeans: A View from the North–East. PLoS ONE 4(5): e5472. doi:10.1371/journal.pone.0005472
Thursday, November 21, 2013
The highly anticipated paper on the genome of the 24,000 year old Mal'ta Siberian is now out at Nature. Here's the study abstract:
The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians [1-3], there is no consensus with regard to which specific Old World populations they are closest to [4-8]. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal’ta in south-central Siberia [9], to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers [10-12], and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages [5]. Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians [2, 13]. Sequencing of another south-central Siberian, Afontova Gora-2 dating to approximately 17,000 years ago [14], revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum.
Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.
Indeed, MA-1 looks like it could belong to an early ancestor of West Eurasians, including and especially Europeans. Mitochondrial haplogroup U was almost fixed in Upper Paleolithic and Mesolithic Europe (see here), while R1a and R1b are, after all, the most common and widespread Y-chromosome haplogroups in Europe today.
Below is the bar graph from the K=9 (nine ancestral populations assumed) ADMIXTURE analysis, which turned out to be the optimal run. Note that the Mal'ta sample appears mostly South Asian (37%), European (34%), and Amerindian (26%), but also with minor Oceanian ancestry (4%). Interestingly, among the Europeans, it's the groups from Northern and Eastern Europe that carry the highest levels of these components. This is probably a reflection, at least in large part, of their elevated indigenous European hunter-gatherer ancestry (see here and here).
At K = 9, MA-1 is composed of five genetic components of which the two major ones make up ca. 70% of the total. The most prominent component is shown in green and is otherwise prevalent in South Asia but does also appear in the Caucasus, Near East or even Europe. The other major genetic component (dark blue) in MA-1 is the one dominant in contemporary European populations, especially among northern and northeastern Europeans. The co-presence of the European-blue and South Asian green in MA-1 can be interpreted as admixture of the two in MA-1 or, alternatively, MA-1 could represent a proto-western Eurasian prior to the split of Europeans and South Asians. This analysis cannot differentiate between these two scenarios. Most of the remaining nearly one third of the MA-1 genome is comprised of the two genetic components that make up the Native American gene pool (orange and light pink). Importantly, MA-1 completely lacks the genetic components prevalent in extant East Asians and Siberians (shown in dark and light yellow, respectively). Based on this result, it is likely that the current Siberian genetic landscape, dominated by the genetic components depicted in light and dark yellow (Figure SI 6), was formed by secondary wave(s) of immigrants from East Asia.
Here's a figure showing the levels of shared genetic drift between MA-1 and 147 present-day non-African populations. Among the Europeans it's the Lithuanians, Northwestern Russians and Baltic and Volga Finns who are most similar to the ancient sample. It's also interesting to note the relatively high position on the list of the Kalash from South Central Asia and Lezgins from the North Caucasus. At the bottom are Bedouins and Palestinians, mainly because of their non-trivial Sub-Saharan admixture, followed by Oceanians, East Asians, and South Indians, probably due to deep differentiation between their main ancestral clades and that of MA-1.
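For anyone wondering how a "shared genetic drift" ranking like this can be put together, below is a rough sketch based on the outgroup f3 statistic, f3(O; A, B), which is the average over SNPs of (o - a)(o - b), where o, a and b are allele frequencies in an outgroup and two test populations. I'm assuming that a statistic of this general kind is behind the figure. The sketch skips the correction for finite sample sizes that real software applies, and the function names and numbers are made up for illustration.

```python
def outgroup_f3(freq_outgroup, freq_a, freq_b):
    """Average of (o - a) * (o - b) across SNPs.
    Larger values mean A and B share more drift since splitting from the outgroup."""
    if not (len(freq_outgroup) == len(freq_a) == len(freq_b)):
        raise ValueError("all populations need frequencies for the same SNPs")
    if not freq_outgroup:
        raise ValueError("at least one SNP is required")
    products = [(o - a) * (o - b) for o, a, b in zip(freq_outgroup, freq_a, freq_b)]
    return sum(products) / len(products)


def rank_by_shared_drift(freq_outgroup, freq_ancient, populations):
    """Sort population names from most to least drift shared with the ancient sample."""
    scores = {
        name: outgroup_f3(freq_outgroup, freq_ancient, freqs)
        for name, freqs in populations.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up allele frequencies at three SNPs, purely to show the mechanics.
outgroup = [0.5, 0.5, 0.5]
ancient = [0.9, 0.1, 0.8]
modern = {
    "pop_x": [0.8, 0.2, 0.7],  # drifted in the same direction as the ancient sample
    "pop_y": [0.5, 0.5, 0.4],  # barely moved away from the outgroup
}

ranking = rank_by_shared_drift(outgroup, ancient, modern)
```

In this toy case pop_x comes out on top, because its frequencies have moved away from the outgroup in the same direction as those of the ancient sample.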
Rumor has it that the same team of scientists is now trying to sequence genomes from Upper Paleolithic sites west of Mal'ta. I wonder how far west? I see that the authors mention the Sungir site from near Moscow a couple of times in the paper, in relation to its similarity to the Mal'ta site. Perhaps they're working on a Sungir genome right now? If so, what's the bet that the Y-DNA turns out to be another basal R?
For more on the Mal'ta genome, including some lengthy discussions on the topic with a few regular readers, click on the links below...
Ancient European admixture in the Americas, or ancient Amerindian admixture in Europe?
Surprising aDNA results from Paleolithic Siberia (including Y-DNA R)
Raghavan et al., Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans, Nature, (2013), Published online 20 November 2013, doi:10.1038/nature12736
Monday, November 18, 2013
Back in June I hypothesized about the "Portuguese Copper Age conquista of Europe" (see here). This was the theory that Bell Beaker culture (BBC) groups expanded out of Iberia during the Chalcolithic and mixed with Corded Ware culture (CWC) and Unetice culture (UC) groups from Eastern Europe to form the modern Central European gene pool. I still think this is essentially correct, but it seems to have been a much more intricate process than I envisaged, which continued well past the early Bronze Age (EBA).
This is most clearly seen when comparing the mtDNA haplogroup frequencies of the UC sample from Brandt et al. 2013 to the late Bronze Age Urnfield culture sample from Schilz 2006. The former appears exceptionally Eastern European, while the latter looks typically Western European. For comparison I've also added the CWC and BBC samples from Brandt et al., and the Urnfield sample looks even more western than the BBC.
Note, for instance, the high levels of H and U5b in the Urnfield dataset, and the much lower levels of these haplogroups in the UC. Instead, the UC and also BBC both show fairly high frequencies of the more typically Eastern European U5a.
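If you'd like to run this kind of comparison yourself, here's a small sketch of how haplogroup frequencies can be tallied from a list of per-individual calls and then compared between two samples. The calls below are invented toy values, not the Brandt et al. or Schilz data, and the function names are my own.

```python
from collections import Counter


def haplogroup_frequencies(calls):
    """Map each haplogroup to its share of the sample."""
    if not calls:
        raise ValueError("the sample is empty")
    counts = Counter(calls)
    return {hg: n / len(calls) for hg, n in counts.items()}


def frequency_differences(sample_a, sample_b):
    """Frequency in sample A minus frequency in sample B, for every haplogroup seen in either."""
    freqs_a = haplogroup_frequencies(sample_a)
    freqs_b = haplogroup_frequencies(sample_b)
    haplogroups = sorted(set(freqs_a) | set(freqs_b))
    return {hg: freqs_a.get(hg, 0.0) - freqs_b.get(hg, 0.0) for hg in haplogroups}


# Toy calls for two hypothetical ancient samples of four individuals each.
toy_western = ["H", "H", "U5b", "U5a"]
toy_eastern = ["H", "U5a", "U5a", "T"]

differences = frequency_differences(toy_western, toy_eastern)
```

With samples this small each individual shifts a frequency by 25 points, which is worth remembering when reading the comparison above.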
Admittedly, the Urnfield sample is small and all the individuals come from the same burial complex called the Lichtenstein caves, so it might not be representative of the wider Urnfield population of Central Europe at the time. However, comparing the UC to present-day Central Europeans (CEM) shows basically the same thing, and this was noticed by Brandt et al:
Notably, the CEM clusters with the Late Neolithic cultures and individuals of the BBC in particular, suggesting that the Western European mtDNA variability had a stronger influence than the contemporaneous eastern CWC/EBA complex, implying yet another shift after the EBA.
The CEM is a Central European metapopulation made up of 500 individuals from Austria, the Czech Republic, Germany and Poland. But it's also not a bad proxy for Ukraine and Western Russia (for instance, see here). So not only are we looking at a significant genetic shift in Central Europe, most likely during the middle and/or late Bronze Age, but also at some point in Eastern Europe.
As I had already mentioned in an earlier blog entry about the Brandt et al. paper, this is all very intriguing, because the UC is generally accepted as an early Western Indo-European culture. Indeed, it's assumed to be the direct ancestor of the middle Bronze Age Tumulus Culture, which then apparently spawned the Urnfield Culture, which is supposed to have given rise to the Iron Age Hallstatt Culture, which is usually seen as the Proto-Celtic culture.
So perhaps there's something very wrong with this archeological chronology of who begat whom in prehistoric Central Europe? Or maybe the CWC/UC complex simply imposed its Indo-European languages and rituals across Central Europe without leaving a significant impression on the mtDNA structure of the region (although keep in mind, Y-DNA and autosomal DNA might be a very different story). But then how is it that the mtDNA structure across Eastern Europe also shifted to the west?
I'd say the only way to explain all of these somewhat unexpected ancient DNA results is to assume that there were major post-EBA expansions from former BBC hot spots in West Central Europe, which resulted in significant west to east population movements across the North European Plain, and even considerable Western European gene flow deep into what is now Russia. I think the main culprits responsible for the latter process were the early Slavs, although non-trivial contributions from various Germanic groups can't be discounted. I actually covered these topics in two earlier blog entries:
Russian mtDNA, Goths of the Ukrainian steppe, and a proto-Slavic expansion from present-day Poland
Post-Mesolithic population replacements/extinctions in Northeastern Europe
Update 20/11/2013: The Brandt et al. paper has become freely available in a preview version of Science here, although I have no idea how long the link will work.
Guido Brandt, Wolfgang Haak et al., Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity, Science 11 October 2013: Vol. 342 no. 6155 pp. 257-261 DOI: 10.1126/science.1241844
Schilz, F., Molekulargenetische Verwandtschaftsanalysen am prähistorischen Skelettkollektiv der Lichtensteinhöhle, Dissertation, Göttingen (2006).
D. Schweitzer, Lichtenstein Cave Data Analysis (2008).
Saturday, November 16, 2013
I'm reading a new paper at PLoS ONE on the mitochondrial DNA of Iranians. It's the first study to tackle the topic of Iranian maternal ancestry using complete mtDNA sequences. Here are a couple of quotes that caught my eye:
Between the third and second millennia BCE the Iranian Plateau became exposed to incursions of pastoral nomads from the Central Asian steppes, who brought the Indo-Iranian language of the Indo-European family, which eventually replaced Dravidian languages, perhaps by an elite-dominance model [13,17,20].
The U5a1a’g cluster itself (based on HVS1 sequence data) is concentrated in populations of the Pontic-Caspian steppe, extending from Romania, Ukraine, southern Russia and northwestern Kazakhstan to the Ural Mountains. The highest frequencies of the U5a1a’g were reported in the Volga-Ural region (5.3%), in particular in Bashkirs (4.3%) and Tatars (3.9%), although the frequency varies from ~2.7% in Russians to ~1.5% in populations of the northern Caucasus [64,76–81]. It is worth mentioning that despite the low frequency of U5a1a’g haplotypes in Central Asian populations of Turkmens, Karakalpaks, Kazakhs and Uzbeks (~1.5% according to the data of [...]), some haplotypes were common between Karakalpaks (haplotype marked by mutation at np 16293), Turkmens (by mutation at np 64) and Iranians. So, it seems likely that the sub-cluster U5a1g or its founder has arrived to Iran from Eastern Europe/southern Ural via the Caspian Sea coastal route.
Derenko M, Malyarchuk B, Bahmanimehr A, Denisova G, Perkova M, et al. (2013) Complete Mitochondrial DNA Diversity in Iranians. PLoS ONE 8(11): e80673. doi:10.1371/journal.pone.0080673
Mitochondrial haplogroup U2e as a maternal marker of the Proto-Indo-European expansion
Thursday, November 14, 2013
Ancient DNA is painting a remarkable picture of the period of European prehistory known as the late Neolithic/early Bronze Age. It's showing that after the collapse of genetically Near Eastern-like farming populations of middle Neolithic Central Europe - probably as a result of climate fluctuations, disease, famine and increasing violence - the vacuum was filled by genetically much more European-like groups from the eastern and western peripheries of Neolithic Europe.
First came the settlers from the east, belonging to the vast archeological horizon known as the Corded Ware Culture (CWC). About three hundred years later they were joined in Central Europe by migrants from the Atlantic Fringe, belonging to the Bell Beaker Culture (BBC). During the early Bronze Age, the CWC disappeared, and was replaced by the Unetice Culture (UC), which briefly overlapped with the late BBC.
Ancient DNA recovered to date suggests that the Bell Beakers were genetically the archetypal Western Europeans, characterized by Western European-specific mtDNA H subclades and Y-chromosome haplogroup R1b. Interestingly, R1b has also been found among remains of aboriginals from the Canary Islands, just off the coast of northwest Africa. It might be a stretch to attribute this directly to the Bell Beakers, but they were certainly capable sailors, so perhaps not?
On the other hand, the CWC and UC populations appear to have been Eastern Europeans to the core, with low levels of mtDNA H and showing mtDNA affinities to Bronze Age Kurgan groups of Kazakhstan and South Siberia. We also know that Y-chromosome haplogroup R1a was present among the CWC of Germany, and it reached frequencies of almost 100% among the Kurgan samples from South Siberia and the European-like mummies of the Tarim Basin in what is now Western China.
Here are a couple of figures from recent studies, Brandt et al. and Brotherton et al., respectively, illustrating much of what I just said.
So it seems everything is falling into place, with ancient DNA, archeology, and modern European genetic substructures all showing basically the same phenomenon.
However, for a while now the ever more precise modern phylogeography of R1b has been hinting that this haplogroup might not have expanded across Europe from the west. That's because the most basal clades of R1b are found in West Asia, and its SNP diversity decreases sharply from east to west across Europe. Below is a schematic of the latest phylogeography of R1b. It was presented at the recently held 9th Annual International Conference on Genetic Genealogy by University of Arizona population geneticist Michael Hammer.
And here is another map shown by Hammer at the same conference, illustrating the frequencies of various R1b subclades across Europe.
I didn't see the presentation, so I don't know what Hammer actually said. But it appears as if his theory is that R1b spread across Europe from the Balkans during the late Neolithic or later, and then exploded in-situ from certain areas of Central and Western Europe during the metal ages. If true, this scenario obviously doesn't match the presumed west to east expansion of the Bell Beakers.
But here's yet another slide from Hammer's talk, which shows the frequency peaks of the most common European subclades of R1b: U106, L21 and U152. Curiously, these peaks are all located in and around former Bell Beaker territory (second image below, from Wikipedia).
Admittedly, we only have two Y-chromosome results from Bell Beaker remains, both from the same site in Germany dated to around 4500 YBP, but both belonging to R1b. Based on that, plus all of the indirect evidence outlined above, it's already very difficult to shake the association between the Bell Beakers and R1b. So I'm thinking there are three possible explanations why the latest R1b phylogeography doesn't support a Bell Beaker-driven expansion of this haplogroup in Europe.
1) The current mainstream theory positing the origin of the Bell Beaker Culture in Portugal is wrong, and the earliest Bell Beakers expanded from East Central Europe, as was once thought.
2) The latest R1b phylogeography is based on limited sampling, and many more individuals need to be tested from former Bell Beaker areas in Iberia and France to catch the basal R1b subclades in these regions.
3) The people who were to become the Bell Beakers in Iberia originally came from the southern Balkans, via maritime routes across the Mediterranean, and then dominated Western and Central Europe via a series of migrations and back migrations. The latest R1b phylogeography is simply not intricate enough to properly describe this complicated process.
The first option basically ignores ancient mtDNA data which shows that the Bell Beakers of Central Europe were of Iberian origin, at least in terms of maternal ancestry. So for now, I'm going with the third option, and looking forward to more ancient DNA results.
A lot can be said about what might have pushed the Balkan proto-Bell Beakers to Western Europe during the late Neolithic, if they actually existed. At the time Bulgaria was being invaded by steppe nomads from just north of the Black Sea, and its agricultural communities were disappearing rapidly. I suppose the ancestors of the Bell Beakers might have been refugees trying to escape these nomads. Then again, perhaps they were the descendants of the nomads who learned to sail after reaching the Mediterranean? I might revisit the issue when I have more data to work with.
A post-EBA genetic shift across Central and Eastern Europe
Michael Hammer, Origins of R-M269 Diversity in Europe, University of Arizona, FamilyTreeDNA, 9th Annual Conference
Guido Brandt, Wolfgang Haak et al., Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity, Science 11 October 2013: Vol. 342 no. 6155 pp. 257-261 DOI: 10.1126/science.1241844
Brotherton et al., Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans, Nature Communications 4, Article number: 1764, Published 23 April 2013, doi:10.1038/ncomms2656
Monday, November 11, 2013
This looks very promising for those interested in the so-called Migration Period of the early Middle Ages. I suppose we'll soon see the full study in print somewhere:
Population Genetic and Craniometric Characterization of an Avar Period Individual from Croatia
Matthew D Teasdale, Noreen von Cramon-Taubadel, Eppie R Jones, Mario Šlaus, Russell L McLaughlin, Daniel G Bradley, Ron Pinhasi
The Avars were a Eurasian equestrian population that migrated from the East to Europe where they settled in the general area of the Carpathian basin and established a kingdom that lasted from the 6th to the 9th century AD (Curta 2006; Curta and Kovalev 2008). The Avars had a rich material culture, which has allowed for the tracking of their migrations and the subsequent assimilation of local populations as they became more sedentary (Sinor 1990; Curta and Kovalev 2008). These archaeological resources make the Avars an appropriate population for the study of early medieval population movements. Next generation sequencing (NGS) (Metzker 2009) has revolutionized the field of ancient DNA, by providing the ability to rapidly sequence thousands of ancestry informative genetic markers in archaeological samples. These markers can then be used to compare ancient samples to modern reference populations of known geographic origin (Sanchez-Quinto et al. 2012; Skoglund et al. 2012). The assessment of an ancient population's affinities and variability based on craniometric data is commonplace in biological anthropology (von Cramon-Taubadel and Pinhasi 2011; Pinhasi and von Cramon-Taubadel 2012), however thus far no study has combined both NGS genetic and craniometric analyses. In this paper we present the results of a combined genetic/craniometric analysis to characterise the affinity of an Avar period individual from Croatia with excellent osteological preservation. Preliminary results from both of these analyses suggest that this individual shares a closer relationship with modern day European populations than those of the proposed Avar homeland in central Asia.
British Association for Biological Anthropology and Osteoarchaeology (BABAO) 2013 conference program