- the chance that the ancient European populations associated with the Yamnaya, Corded Ware and other closely related archeological cultures formed as a result of migrations from Central Asia is zero
- the chance that the Proto-Indo-European homeland was located in Central Asia is zero
- the chance that present-day Europeans, by and large, derive from any ancient Central Asian populations is zero

See also...
Central Asia as the PIE urheimat? Forget it
The Steppe Maykop enigma
Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...
Thursday, April 25, 2019
Some myths die hard
Ancient DNA tells us that the Bronze Age wasn't kind to the indigenous populations of Central Asia. It seems to have wiped them out totally. Indeed, Central Asia might well be the only major world region in which native hunter-gatherers failed to make a perceptible impact on the genetics of any extant populations.

Before the Neolithic transition, much of Central Asia was home to hunter-gatherers closely related to those of nearby western Siberia. During the Neolithic, agriculturalists and pastoralists from the Near East gradually moved into the more arable parts of southern and eastern Central Asia, eventually giving rise to the Bactria-Margiana Archaeological Complex, or BMAC, and other similar communities.

It's not clear what their relationship was like with the native hunter-gatherers in these areas. But they did mix with them to varying degrees. This is obvious because genome-wide genetic ancestry characteristic of the Botai people, who hunted and eventually domesticated horses on the Kazakh steppe during the 4th millennium BCE, and were probably the archetypal Central Asians for their time, is found at significant levels in a number of later samples from Central Asian farmer and pastoralist sites, such as Dali, Gonur Tepe and Sarazm.

Thus, even though the Neolithic transition did have a big impact on Central Asia, and clearly led to large scale population replacements in some parts of the region, this was just the beginning of these population shifts. Moreover, in some cases the expanding farmer and pastoralist populations seem to have acquired significant indigenous Central Asian ancestry and spread it with them.

The precise geographic extent of the relatively unique Botai-related ancestry in prehistoric Eurasia is still something of a mystery.
But to give you a general picture of where it was found from around 6,000 BCE to 2,000 BCE, here's a map with info about samples with significant levels of this type of ancestry from a wide range of sites in space and time.

I have a strong suspicion that the same sort of thing happened to the aforementioned Steppe Maykop people. In other words, they may have been forced out from the Eastern European steppe, and perhaps sought shelter in the Caucasus Mountains?

Admittedly, I'm not offering anything new here. I just wanted to emphasize a few key points, because I'm still seeing some confusion online about the population history of Central Asia, and especially how it relates to the population history of Europe, and also the Proto-Indo-European homeland question. Make no mistake, thanks to the ancient DNA already available from Central Asia, we can confidently infer the following:
Monday, April 22, 2019
R1b-M269 in the Bronze Age Levant
The new Harvard genotype datasets that I blogged about recently include a couple of potentially very useful samples from the Levant dated to 1400-1100 BCE. Search for IDs I2062 and I1934 in the anno files here. They're both from an archeological paper about a Late Bronze Age (LBA) burial site in what is now Israel that was published back in 2017 (see here).

Surprisingly, individual I2062 is listed in the anno files as belonging to Y-haplogroup R1b1a1a2, which is also known as R1b-M269. This is a surprise to me because R1b-M269 is closely associated with the Bronze Age expansions of pastoralists from the Pontic-Caspian steppe in Eastern Europe, and these expansions didn't impact the Levant in any direct or significant way.

The Y-haplogroup assignment may or may not be correct. Sometimes the Y-haplogroups in these sorts of datasheets are indeed wrong. Unfortunately, as far as I know, the BAM file for I2062 isn't available anywhere online, so I can't check whether he really does belong to R1b-M269. But, intriguingly, his autosomes do show a subtle signal of Yamnaya-related ancestry from the Pontic-Caspian steppe that is missing in earlier ancients from the Levant.

To characterize his genome-wide ancestry, I first ran a series of unsupervised and supervised analyses with the Global25/nMonte3 method (using this datasheet). For the sake of simplicity, I narrowed things down to the mixture models below, each based on three reference populations. Levant_ISR_C is made up of Chalcolithic samples from Israel. The identities of the other reference sets should be obvious to most readers. If confused, feel free to ask for more details in the comments below.
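For readers who haven't seen this kind of model before, here is a rough sketch of the underlying idea: find the three weights, summing to 100%, whose weighted average of the reference coordinates lands closest to the target. This is not the nMonte3 code. It swaps in a plain grid search for illustration, and the source names and three-dimensional coordinates are made up rather than taken from the Global25 datasheet.

```python
import math

def fit_three_way(target, sources, step=0.01):
    """Grid-search three mixture weights (summing to 1) that minimise the
    Euclidean distance between the weighted source average and the target."""
    names = list(sources)
    n = round(1 / step)
    best = None
    for i in range(n + 1):
        for j in range(n + 1 - i):
            k = n - i - j
            weights = (i / n, j / n, k / n)
            mix = [
                sum(w * sources[name][d] for w, name in zip(weights, names))
                for d in range(len(target))
            ]
            dist = math.dist(mix, target)
            if best is None or dist < best[0]:
                best = (dist, dict(zip(names, weights)))
    return best

# Hypothetical coordinates, for illustration only.
sources = {
    "Source_A": [0.10, 0.02, -0.01],
    "Source_B": [0.03, -0.08, 0.02],
    "Source_C": [0.12, 0.10, 0.04],
}
# Toy target built as 70% Source_A, 25% Source_B and 5% Source_C.
target = [0.0835, -0.001, 0.0]

distance, weights = fit_three_way(target, sources)
for name, w in weights.items():
    print(f"{name},{100 * w:.1f}")
print(f"distance={distance:.6f}")
```

With real data the fit is never exact, which is why each model reports a distance. A smaller distance means a closer fit, but it says nothing on its own about whether a minor component is really needed.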
Levant_ISR_MLBA:I2062
Levant_ISR_C,66.8
IRN_Seh_Gabi_C,27
Yamnaya_RUS_Samara,6.2
distance%=1.8905

Levant_ISR_MLBA:I2062
Levant_ISR_C,66.2
Kura-Araxes_ARM_Kaps,30.2
Yamnaya_RUS_Samara,3.6
distance%=2.0856

Levant_ISR_MLBA:I2062
Levant_ISR_C,67.8
Kura-Araxes_RUS_Velikent,31.8
Yamnaya_RUS_Samara,0.4
distance%=2.1738

To further confirm the reliability of my models, I tested them with the formal statistics-based qpAdm software. As far as I can tell, the output from qpAdm looks very solid across the board.
Levant_ISR_MLBA_I2062
IRN_Seh_Gabi_C 0.193±0.052
Levant_ISR_C 0.710±0.038
Yamnaya_RUS_Samara 0.098±0.026
chisq 9.304
tail prob 0.67676
Full output

Levant_ISR_MLBA_I2062
Kura-Araxes_ARM_Kaps 0.249±0.076
Levant_ISR_C 0.681±0.051
Yamnaya_RUS_Samara 0.071±0.035
chisq 11.101
tail prob 0.52032
Full output

Levant_ISR_MLBA_I2062
Levant_ISR_C 0.661±0.042
Kura-Araxes_RUS_Velikent 0.339±0.042
chisq 7.979
tail prob 0.844942
Full output

Admittedly, even though I2062 can be modeled with Yamnaya-related admixture, he doesn't need to be. Indeed, his ratio of this type of ancestry varies significantly between the models, from around 10% to nothing. This appears to be dependent on the geography of the non-Levant and non-Yamnaya reference populations; the closer they are to the Pontic-Caspian steppe, the smaller the ratio of Yamnaya-related ancestry in I2062. I'd describe this as an artifact of the isolation-by-distance phenomenon, and it makes total sense, but it prevents me from confirming beyond any doubt that I2062 does harbor genome-wide steppe ancestry. Unfortunately, individual I1934 doesn't offer enough data to be analyzed with the same methods.

Samples associated with the Kura-Araxes or Early Transcaucasian culture are particularly strong references for the eastern ancestry in I2062. This probably isn't a coincidence, and it might also explain his Y-haplogroup, because, at its maximum extent, the territory occupied by the Kura-Araxes culture stretched all the way from the Pontic-Caspian steppe to the southern Levant. The map below is from Wilkinson 2014.

See also...
Downloadable genotypes of present-day and ancient DNA data
Early chariot riders of Transcaucasia came from...
R-V1636: Eneolithic steppe > Kura-Araxes?
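As a footnote to the qpAdm models for I2062 in this post, one quick way to see why the Yamnaya-related signal is suggestive rather than conclusive is to divide each Yamnaya_RUS_Samara coefficient by its standard error. The sketch below uses only the figures from the two three-way models. Treating the estimate as roughly normally distributed, and the 1.96 multiplier for an approximate 95% interval, are simplifying assumptions rather than anything qpAdm reports itself.

```python
# Yamnaya_RUS_Samara coefficients and standard errors from the two
# three-way qpAdm models for I2062 reported in this post.
yamnaya_estimates = {
    "with IRN_Seh_Gabi_C": (0.098, 0.026),
    "with Kura-Araxes_ARM_Kaps": (0.071, 0.035),
}

def z_score(estimate, std_err):
    """Number of standard errors separating the estimate from zero."""
    return estimate / std_err

for label, (est, se) in yamnaya_estimates.items():
    z = z_score(est, se)
    # Rough interval under a normal approximation (an assumption).
    low, high = est - 1.96 * se, est + 1.96 * se
    print(f"{label}: z = {z:.2f}, approx. 95% interval = {low:.3f} to {high:.3f}")
```

In the first model the Yamnaya coefficient sits a little under four standard errors from zero. In the second it is only about two standard errors away, which fits with the two-way model that leaves Yamnaya out altogether also passing.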
Thursday, April 18, 2019
Early chariot riders of Transcaucasia came from...
I'm finding it increasingly difficult nowadays to fully appreciate all of the ancient DNA samples that are accumulating in my dataset. But it's not entirely my fault.

Among the hundreds of ancient samples published last year there were a couple of Middle Bronze Age (MBA) individuals from what is now Armenia labeled "Lchashen Metsamor" (see here). I wasn't planning to do much with these samples because, even after reading the Nature paper that they came with a couple of times over, I didn't have a clue what they were about.

But after some digging around, I now know that their people, those associated with the Lchashen Metsamor archeological culture, were among the earliest in Transcaucasia, and indeed the Near East, to use the revolutionary spoked-wheel horse chariot. How awesome is that?

The invention of the spoked-wheel chariot is generally credited to the Middle Bronze Age Sintashta culture of the Trans-Ural steppe in Central Asia, and its rapid spread is often associated with the early expansions of Indo-European languages deep into Asia. On the other hand, some have argued that this type of chariot was first developed in the Near East, and directly derived from solid-wheeled wagons pulled by donkeys.

It's now obvious, thanks to ancient DNA, that the Sintashta people were by and large migrants to Central Asia from somewhere in Eastern Europe, and that they didn't harbor any recent ancestry from the Near East. So if chariot technology spread into the steppes from the Near East, then it did so without any accompanying gene flow, which is possible but not entirely convincing.

This raises the question of whether the Lchashen Metsamor population was of Sintashta-related origin, because if it was, then this would corroborate the consensus that spoked-wheel chariots were introduced into Transcaucasia from the steppes to the north.

Below is a Principal Component Analysis (PCA) of West Eurasian genetic variation.
It does suggest that the Lchashen Metsamor pair (labeled Armenia_MBA_Lchashen), as well as most of the other currently available samples from what is now Armenia dating to the Middle to Late Bronze Age (MLBA), harbor some steppe ancestry. That's because they appear to form a cline between samples associated with the Sintashta and Kura-Araxes cultures. Of course, the Kura-Araxes culture was a major Early Bronze Age (EBA) archeological phenomenon centered on Transcaucasia and surrounds, so its population can be reasonably assumed to have formed the genetic base of most subsequent populations in the region. The relevant PCA datasheet is available here.
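To make the idea of a cline a bit more concrete, here is a small sketch that projects a sample onto the straight line between two reference averages in PCA space and reports how far along that line it falls. The coordinates are made up for illustration and are not taken from the PCA datasheet. Also, a position on a PCA plot is only a visual hint, not a formal admixture estimate.

```python
import math

def position_on_cline(point, end_a, end_b):
    """Project `point` onto the line from end_a to end_b.

    Returns the fraction of the way from end_a (0.0) to end_b (1.0) and
    the distance between the point and its projection on the line."""
    ab = [b - a for a, b in zip(end_a, end_b)]
    ap = [p - a for a, p in zip(end_a, point)]
    t = sum(x * y for x, y in zip(ab, ap)) / sum(x * x for x in ab)
    foot = [a + t * x for a, x in zip(end_a, ab)]
    return t, math.dist(point, foot)

# Hypothetical PC1/PC2 averages, for illustration only.
kura_araxes_avg = (0.020, -0.040)
sintashta_avg = (0.120, 0.060)
test_sample = (0.042, -0.016)

fraction, offset = position_on_cline(test_sample, kura_araxes_avg, sintashta_avg)
print(f"fraction of the way towards Sintashta: {fraction:.2f}")
print(f"distance off the line: {offset:.4f}")
```

A sample that sits close to the line, with a small offset, is consistent with a simple two-way cline. A large offset would suggest that some other source is pulling it away, which is one reason to follow up with formal tests like qpAdm.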
Armenia_MBA_Lchashen
CWC_Kuyavia 0.183±0.036
Kura-Araxes_Kaps 0.817±0.036
chisq 13.941
tail prob 0.378021
Full output

Armenia_MBA_Lchashen
Balkans_BA_I2163 0.193±0.045
Kura-Araxes_Kaps 0.807±0.045
chisq 14.780
tail prob 0.321267
Full output

Armenia_MBA_Lchashen
Kura-Araxes_Kaps 0.788±0.043
Sintashta_MLBA 0.212±0.043
chisq 14.871
tail prob 0.315451
Full output

I sorted the output by "tail prob", but the fact that Sintashta_MLBA is in third place isn't a problem because the stats in all of these models are basically identical. Indeed, CWC_Kuyavia (Corded Ware culture samples from present-day Kuyavia, North-Central Poland) and Balkans_BA_I2163 (a Bronze Age singleton from what is now Bulgaria) are both very similar and probably closely related to each other and to the Sintashta samples.

Interestingly, and, I'd say, importantly, ancients from the steppe that are closest to Lchashen Metsamor in both space and time, but not particularly closely related to the Sintashta people, don't work too well as a mixture source in such models.
Armenia_MBA_Lchashen
Kubano-Tersk 0.184±0.046
Kura-Araxes_Kaps 0.816±0.046
chisq 22.179
tail prob 0.0526526
Full output

A couple of months ago I suggested that populations associated with the Early to Middle Bronze Age (EMBA) Catacomb culture were the vector for the spread of steppe ancestry into what is now Armenia during the MLBA (see here). After taking a closer look at the Lchashen Metsamor samples, I now think that the peoples of the Sintashta and related cultures were also important in this process.

If so, they may have moved from the steppe into Transcaucasia both from the west via the Balkans and the east via Central Asia, and brought with them spoked-wheel chariots. I don't have a clue what language they spoke, but I'm guessing that it may have been something Indo-European.

See also...
The mystery of the Sintashta people
A potentially violent end to the Kura-Araxes Culture (Alizadeh et al. 2018)
Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...
Friday, April 12, 2019
Armenians vs Georgians
Armenians and Georgians are ethnic groups that live side by side in the south Caucasus, or Transcaucasia. By all accounts, they've both been there since prehistoric times and they're very similar in terms of overall genetic structure. However, they speak languages from totally unrelated families: Indo-European and Kartvelian, respectively. How did this happen and might the answer lie in the small genetic differences that do exist between them?

To investigate this issue, I ran a series of qpAdm formal mixture models of present-day Armenians and Georgians using tens of ancient reference populations. To keep the results as straightforward and meaningful as possible, I restricted myself to two-way models. I then discarded the runs that produced "tail probs" under 0.1 or retained fewer than 400K SNPs. Only a handful of models passed muster, including these two:
Armenian
Mycenaeans_&_Empuries2 0.233±0.041
Kura-Araxes_Kaps 0.767±0.041
chisq 18.422
tail prob 0.142151
Full output

Georgian
Globular_Amphora 0.071±0.025
Kura-Araxes_Kaps 0.929±0.025
chisq 18.419
tail prob 0.142266
Full output

At the most basic level, the results suggest that both Armenians and Georgians are overwhelmingly derived from populations of Bronze Age Transcaucasia associated with the Kura-Araxes archeological culture, albeit with minor ancestries from somewhat different sources from the west. As far as I can see, when using more than 400K SNPs and a wide range and large number of outgroups (or right pops), neither Armenians nor Georgians can pass perfectly for any one ancient population in my dataset.

The best proxies for the minor but significant western ancestry in Armenians are Mycenaeans of the Bronze Age Aegean region and Greek colonists from Iron Age Iberia (Empuries2). Obviously, and perhaps importantly, these are both attested Indo-European-speaking groups.

On the other hand, the very minor western ancestry in Georgians is best characterized as gene flow from Middle to Late Neolithic European farmers rich in indigenous European forager ancestry. It's practically impossible to say what language or languages these farmers spoke. How about something Kartvelian?

In any case, for me, the perplexing thing about present-day Armenians is that they harbor very little steppe ancestry. By and large, no more than a few per cent. Compare that to the currently available samples from what is now Armenia dating to the Middle to Late Bronze Age, which show ratios of steppe ancestry of up to 25%. For now, I'm guessing that what we're dealing with here is the classic bounce back of older ancestry layers that has been documented for different parts and periods of prehistoric Europe.

See also...
Early chariot riders of Transcaucasia came from...
Catacomb > Armenia_MLBA
Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...
Sunday, April 7, 2019
On the association between Uralic expansions and Y-haplogroup N
Almost all present-day populations speaking Uralic languages show moderate to high frequencies of Y-chromosome haplogroup N. I reckon there are two likely explanations for this:
- the speakers of Proto-Uralic were rich in N because they lived in an area, probably somewhere around the Ural Mountains, where it was common, and they spread it with them as they expanded from their homeland
- Uralic languages often came to be spoken in areas of North Eurasia where N was already found at moderate to high frequencies

The major exception to this rule is Hungarians, whose language belongs to the Ugric branch of Uralic. Their frequency of N is close to zero and they don't differ much in terms of overall genetic structure from their Indo-European-speaking neighbors in East Central Europe.

We've since had to wait over a decade to get a more comprehensive look at the Y-chromosome haplogroups of medieval Hungarians. The most useful effort to date, a manuscript courtesy of Neparáczki et al., was posted this week at bioRxiv (see here).

The results in the preprint suggest a much more complex picture than simply a migration of an obviously Uralic-speaking population rich in Y-haplogroup N into the medieval Carpathian Basin. But they do confirm the presence of N in Hungarian conqueror elites, and, in fact, of very specific subclades of N that link them to the present-day speakers of Uralic languages from around the Ural Mountains. Here are some pertinent quotes from the preprint:
Three Conqueror samples belonged to Hg N1a1a1a1a2-Z1936, the Finno-Permic N1a branch, being most frequent among northeastern European Saami, Finns, Karelians, as well as Komis, Volga Tatars and Bashkirs of the Volga-Ural region. Nevertheless this Hg is also present with lower frequency among Karanogays, Siberian Nenets, Khantys, Mansis, Dolgans, Nganasans, and Siberian Tatars 23.

...

It is generally accepted that the Hungarian language was brought to the Carpathian Basin by the Conquerors. Uralic speaking populations are characterized by a high frequency of Y-Hg N, which have often been interpreted as a genetic signal of shared ancestry. Indeed, recently a distinct shared ancestry component of likely Siberian origin was identified at the genomic level in these populations, modern Hungarians being a puzzling exception 36. The Conqueror elite had a significant proportion of N Hgs, 7% of them carrying N1a1a1a1a4-M2118 and 10% N1a1a1a1a2-Z1936, both of which are present in Ugric speaking Khantys and Mansis 23.

...

Population genetic data rather position the Conqueror elite among Turkic groups, Bashkirs and Volga Tatars, in agreement with contemporary historical accounts which denominated the Conquerors as “Turks” 38. This does not exclude the possibility that the Hungarian language could also have been present in the obviously very heterogeneous, probably multiethnic Conqueror tribal alliance.

Indeed, a large proportion of the 44 males from elite Hun, Avar and Hungarian Conqueror burials analyzed in the study belonged to Y-haplogroups that can't be plausibly associated with the earliest Uralic speakers, but rather with those of various Indo-European languages, such as I1 and R1b-U106 (these are Germanic-specific markers), I2a-L621 and R1a-CTS1211 (obviously Slavic) and R1a-Z2124 (largely Eastern Iranian).
If most of these results aren't due to contamination, then it's likely that both the early Hungarian commoners and elites were, by and large, derived from Indo-European-speaking populations. No wonder then, that present-day Hungarians are basically indistinguishable genetically from their Indo-European-speaking neighbors and, like them, show hardly any Y-haplogroup N.

See also...
Hungarian Conquerors were rich in Y-haplogroup N (Fóthi et al. 2020)
More on the association between Uralic expansions and Y-haplogroup N
Ancient DNA confirms the link between Y-haplogroup N and Uralic expansions
Thursday, April 4, 2019
Downloadable genotypes of present-day and ancient DNA data
They're freely available via Harvard University at this LINK. The linked web page includes this message:
We would be grateful if users of this dataset could alert us to any errors they detect and help us to fill in missing data. This could include: (1) errors or missing information for location, latitude, longitude, archaeological context, date, and group label, (2) concerns about Y chromosome or mitochondrial DNA haplogroup determinations, and (3) evidence for other problems in the data or annotations for individuals. Please write to Swapan 'Shop' Mallick and David Reich with any suggestions. We would also be grateful if members of the community could suggest additional content that would be helpful to add to this page to make it maximally useful. Finally, please let us know if there is any ancient DNA data we should be including that we have missed.

By the way, I've updated my Global25 datasheets with many of the samples from this new Harvard release. Same links as always...
Global25 datasheet ancient scaled
Global25 pop averages ancient scaled
Global25 datasheet ancient
Global25 pop averages ancient
...
Global25 datasheet modern scaled
Global25 pop averages modern scaled
Global25 datasheet modern
Global25 pop averages modern

See also...
New release of ADMIXTOOLS with two additional programs
Getting the most out of the Global25
Modeling genetic ancestry with Davidski: step by step