
Tuesday, January 30, 2018

Indian smoke and mirrors

On January 4 this year the Hindi newspaper Dainik Jagran published a so-called special feature on Indo-European languages. In fact, the article claimed to be giving its readers a sneak peek at the results from the upcoming and much anticipated archaeogenetics paper on the northern Indian Harappan site of Rakhigarhi. [LINK]

I knew about this article when it first came out, because it was mentioned in a few off-topic comments on this blog, like this one by commentator Sanuj.

Latest news on the Rakhigarhi results, published in a prominent Hindi paper, also quoting Niraj Rai, the lead geneticist working on it. It is essentially saying that researchers, both foreign and Indian have established that India is home of the Indo-European family and that the aDNA from Rakhigarhi is a close match to North Indian Brahmins. The results are to be published in a leading journal soon.

I deleted these comments soon after I saw them, not only because they were off topic, but also because they made absolutely no sense whatsoever.

Why? For one, because over the past year or so I've managed to gather a little bit of intel on the Rakhigarhi paper from very reliable sources, and all indications were that the results would show significant ancient population movements from West Asia and Eastern Europe to India, and not the other way around.

Moreover, thanks to already published ancient DNA from outside of South Asia, it's obvious that there were significant population movements from West Asia and Eastern Europe to India, and not the other way around. The one exception to this rule is the migration of the Romani (Gypsy) people from northern India to Europe, but this is irrelevant to the topic at hand, because it didn't have much of an impact on the genetic structure or linguistics of Europeans.

So why have I now decided to give Dainik Jagran my full attention? Well, because commentator Sanuj recently resurfaced in another comment thread and said this...

They have been hinting at the outcome, you are just not ready to listen to what they are hinting at, this Jagran article is a case in point. By the way Jagran is the most widely read newspaper in India, and is one of the most credible - rated by Reuters-BBC.

Yep, he's correct: Dainik Jagran is a huge and well respected newspaper.

Please note, however, that the chances of India being confirmed the Indo-European homeland thanks to the ancient DNA from Rakhigarhi are zero; not just low, not almost zero, but zero. Anyone with a generally healthy mind and the ability to be more or less objective in this matter has to admit that this is indeed the case. So why would one of the biggest and most respected Indian newspapers publish such utter crud?

It's an intriguing question to say the least. Moreover, was Niraj Rai actually interviewed by the reporter from Dainik Jagran? If he was, did he really say what he's claimed to have said, or was he grossly misrepresented? If the latter, has he sought a correction? If not, why not? Have the western scientists who are collaborating with Rai asked him what the fig is going on, and have they sought a correction? If not, why not?

Does anyone know if Dainik Jagran has since published a correction, or at least a letter from Rai straightening things out?

Admittedly, I have no idea what's going on now with the Rakhigarhi study and paper; the trail went cold months ago. But whatever it is, it's something peculiar. That's because I find it extremely unlikely that any newspaper, let alone one of the top newspapers in India, would be allowed to get away with misrepresenting and indeed inverting, either by design or mistake, the outcome of such a major international project.

See also...

Indian confirmation bias

The Out-of-India Theory (OIT) challenge: can we hear a viable argument for once?

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Monday, January 29, 2018

Paleoeuropeoid (steppe herder) infiltration into South Central Asia during the Bronze Age (Dubova et al. 2016)

Update 31/03/2018: Check out this awesome new preprint at bioRxiv: The Genomic Formation of South and Central Asia (Narasimhan et al. 2018)


I don't usually take cranial studies very seriously, mostly because they have a history of being way out of the ballpark. However, Interaction between Steppe and Agricultural Tribes during the Bronze Age: Morphological Aspects by Dubova et al. 2016 is, at the very least, a decent read. A preprint of this paper is freely available HERE. One day, hopefully in the not too distant future, we'll see a paper like this based on ancient genomes. And I'm pretty sure that the results won't look much different. Emphasis is mine:

Abstract: Here we discuss the results of research conducted on the variability of anthropological features of the populations of Turkmenistan, Uzbekistan, Tajikistan, Pakistan, China, etc., from the Late Stone Age and Bronze Age. A detailed analysis was carried out on 85 craniological series from burial grounds at Gonur and Buston VI (see Table 1). We examined skulls from the steppe, forest-steppe, desert, and semi-desert areas of Central Asia, Ural, Siberia and the North Caucasus. Factor analysis was used to explore the data obtained. Four factors, describing more than 70% of craniological variations, were extracted. The first factor (describing 29.6% of variability) differentiated groups according to the lengthwise sizes of the head and face, mostly taking into consideration cranial breadth, bezygomatic diameter, and orbit width, as well as minimum frontal diameters, upper face and nose heights. The second factor (17.4% of variability) differentiated groups mainly according to facial height, nose and orbit heights. The highest loadings of the third factor, which determined 14.9% of variability, considered important characteristics such as cranial length and breadth, and the fourth factor (10,4% of variability) – nose breadth. As a result, we identified two major anthropological groups: the first comprising North Kazakhstan, South Siberia, Altai, and Ural-Volga, populations with larger latitudinal proportions of the head and face, as well as a smaller width of the forehead, upper face height, and height of the nose; and the second comprising the southern territories, including the majority of the populations of Iran, Pakistan, the Indus valley, and the southern regions of Uzbekistan, Tajikistan, and Turkmenistan as well, who had the opposite combination of features: long and narrow heads, high, narrow faces and noses, and round orbits. 
The analysis conducted has enabled us to affirm that Southern Turkmenistan manifestations of minimal impurities with regard to anthropological components, which could be linked to pastoral surroundings, were not seen prior to the middle of the 2nd millennium BC.


New data has shed light on the interaction between the steppe pastoralists and the sedentary farmers. Cranial series from the southern regions of Central Asia, representing populations where the features of agricultural and pastoral cultures are combined (Kokcha III, Buston VI, Karaelematasai, and Patmasai, Djarkutan), have been clearly located between ‘typical’ farmers (Hasanlu, Gonur, Mohendjo Daro, Pakistani Timargarha and Butkara) and series from the territory of Kazakhstan, southern Siberia, and the Volga-Ural region. At the same time, Gonur skulls, from the necropolis situated in and around ruins of early buildings, and the Buston VI series, as well as those from later layers of Tepe Hissar in Iran, have been identified as having large transversal dimensions while maintaining the same height-sizes of traits of subjects uncovered from earlier periods at the same monuments. This might be connected primarily to the general brachicephalization processes manifested at that time. But it is also likely that this was the result of a gradual penetration of groups from the Eurasian Steppe to the south, which was initially random but then became increasingly common with frequent mating between steppe groups and farmers. The term “infiltration” best characterizes this process of mixing. It should be noted that the currently available archaeological materials from Gonur Depe reveal that around such major proto-urban centers (which Gonur was at the end of the 3rd-2nd millennium BC) already by the middle of the 2nd millennium BC herders were indigenous, as evidenced by small settlements of cattle breeders in the vicinity of the city walls (see for example: Hiebert & Moore, 2004; Cattani, 2004). In addition, separate (sporadic) steppe pottery fragments have been unearthed from some areas of the site and its surrounding smaller settlements (Sarianidi & Dubova, 2010, pp. 39-42). 
However, we must particularly emphasize that at Gonur (i.e., in Southern Turkmenistan) manifestations of minimal impurities in anthropological components, which could be linked to pastoral surroundings, were not seen prior to the middle of the 2nd millennium BC.

Another important point to bear in mind is that in the southern regions of Central Asia there were no Bronze Age sites (or earlier ones), where the presence of the so-called ‘Protoeuropean’ anthropological type (a massive variant with a large sized head, low and wide face, rectangular orbits, and with a flattening of the upper part of the face) was fixed. This variant has only been described by researchers in the northern regions of Central Asia. The groups with a small proportion of the ‘Paleoeuropeoid’ anthropological component in their composition reached southern regions in the Bronze Age. The most prevalent among them still being the Mediterranean type. Such a situation, of course, leads to an increase in mixed populations (i.e., in later groups including those of the Iron Age) with the characteristics presented in both groups becoming increasingly mixed (e.g. Mediterranean traits).

Dubova N.A., Junusbayev S.M., Saipov A.B., Interaction between Steppe and Agricultural Tribes during the Bronze Age: Morphological Aspects, Int. Journal of Anthropology – Vol. 31 – n.1-2 – 2016, DOI: 10.14673/IJA2016121026
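As a quick sanity check on the abstract's claim that the four factors describe "more than 70%" of the craniological variation, the quoted shares can simply be added up. This is nothing more than arithmetic on the percentages given above; the variable and function names are my own.

```python
# Variance shares (percent) of the four factors, as quoted in the abstract above.
factor_variance = {
    "factor 1": 29.6,
    "factor 2": 17.4,
    "factor 3": 14.9,
    "factor 4": 10.4,  # printed as "10,4%" in the abstract (decimal comma)
}


def cumulative_shares(shares):
    """Return (name, running total) pairs, rounded to one decimal place."""
    total = 0.0
    out = []
    for name, pct in shares.items():
        total += pct
        out.append((name, round(total, 1)))
    return out


cumulative = cumulative_shares(factor_variance)
total_explained = cumulative[-1][1]

for name, running in cumulative:
    print(f"{name}: cumulative {running}%")
print(f"Left unexplained by the four factors: {round(100 - total_explained, 1)}%")
```

So the four factors together cover 72.3% of the variability, which squares with the abstract, and the first two alone account for a little under half.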

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Descendants of ancient European (fair?) maidens in Central Asia's highlands

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, January 27, 2018

mtDNAwiki on "Steppe folk" mtDNA and Indo-Iranian origins

Fascinating stuff from Samuel at mtDNAwiki. Emphasis is mine:

Steppe folk were people who resided in what are today Southern Russia and Eastern Ukraine between 6,000 and 4,000 years ago. They were very different from the Anatolian farmers I discussed earlier.

Ancient DNA shows that, between 3000 and 2000 BC, Steppe folk migrated en masse into Northern Europe, Central Asia and Siberia. Shortly afterwards, Steppe folk settled in South Europe, South Asia (India, Afghanistan, etc.), and Iran.

They contributed huge chunks of ancestry to countless modern ethnic groups. Modern-day Europeans are for the most part a two-way mixture between Steppe folk and European Neolithic farmers (who were mostly of Anatolian origin).


As much as 33% of Tajik mtDNA really does derive from Eneolithic/Bronze Age Eastern Europe. No doubt about it. Yes, Tajiks are an exception, because they have a lot more Steppe mtDNA than essentially all other South Central Asians. However, significant frequencies of Steppe mtDNA exist in every population in this region. For example, the mtDNA in the Kalasha, a small ethnic group from the Hindu Kush, is mostly made up of founder effects involving Steppe mt-HGs U4a1, U4b1a4, U2e1h, and J2b1a. Each of these haplogroups has been found in remains from Eneolithic/Bronze Age Eastern Europe.

Typical European haplogroups such as U5a1a1, H2a1, T1a1, H5a1, H6a1, J1b1a1, J2b1a, H7b, etc. consistently pop up in every South Central Asian population. Realistically, none of these haplogroups are more than 10,000 years old. Indeed, all of them are likely to be less than 7,000 years old. The European-related mtDNA in South Central Asia isn’t derived from distant, Paleolithic shared ancestry between Europeans and Asians. It’s recent stuff from the Steppe.

For over a decade Y-haplogroup R1a-M417 perplexed many geneticists because it was the most common Y-haplogroup in two geographically very distant peoples; Balto-Slavs of Eastern Europe and Indo-Aryans of South Asia. But thanks to ancient DNA, it has now been confirmed that R1a-M417 is an European Steppe lineage which expanded both west and east from the Pontic-Caspian Steppe between 4,600 and 3,500 years ago.

Interestingly, I’ve found mtDNA haplogroups which correlate very well with R1a-M417; meaning that they either exist in South Asians & Eastern Europeans, or in South Asians & ancient Central and Eastern Europeans rich in R1a-M417, such as the Corded Ware and Srubnaya peoples.

J1c1b1a: Russia, Ukraine, Hungary, Romania, Denmark, UK, Spain, Tajik, India. Srubnaya (R1a-Z93), Corded Ware (R1a-M417).
H2a1a: Russia, Hungary=2, Finland, Britain, Ireland, France, Pathan, Tajik=16, Turkey, Siberia. Eneolithic Ukraine (R1a-M417), Bronze age Scotland, Unetice.
H5e1: Russia=2, Hungary, Greece, Tajik=3.
T1a1b: Russia=4, Poland=3, Hungary=2, Iran=2, Turkey, Tajik=4, India. Bronze age Latvia, Sycthian=2.
N1a1a1a1: Estonia=3, Finland=2, Italy, Turkmen, India=2. Sintashta, Sycthian, Sarmatian.
K2a5: Estonia, Ireland, Iran, Sindhi, Pathan, India. Corded Ware Germany, Corded Ware Sweden.
U4b2: Russia, Ukraine, Sweden, Spain, Burosho, Tajik, India.
U4b1a4: Kalash, Tajik, Iran, Siberia=3. Catacomb, Sycthian.
U2e1h: Kalash=3, Tajik=8, Siberia, Italy. Sintashta, Potapovka

The most important mt-HGs here are U2e1h, H2a1a, U4b1a4, T1a1b, and N1a1a1a1. They directly link modern Indo-Iranian speakers in Asia with Eneolithic/Bronze age Eastern Europeans generally considered by historical linguists and archaeologists to be Proto-Indo-European- or Proto-Indo-Iranian-speakers (i.e. Sintashta and Potapovka).

When I put all of this data together, and saw the undeniable links between modern-day Indo-Iranian speakers and Eneolithic/Bronze Age Eastern Europeans, I was amazed. The results confirmed to me, beyond any doubt, that the ancient migrations from the western Steppe deep into Asia long hypothesized by historical linguists and archaeologists did happen. Indo-Iranian languages really did originate in Eastern Europe, probably in what is now Ukraine, then took the long journey all the way to the Indian Subcontinent.

Case in point: ancient DNA sample I6561. That’s his lab ID, but he’s a man who died in what is now Ukraine ~5,500 years ago. He belonged to Y-HG R1a-M417 and mt-HG H2a1a. Today H2a1a is most common in the Tajik people of South Central Asia. The most common Y-HG in Tajiks, and many of their neighbors, such as Pashtuns, Kalasha, northern Indians, etc. is R1a-M417.

All of the evidence suggests that Mr. I6561 belonged to a PIE community whose descendants would go on to settle lands that stretch all the way from modern-day Norway to India. His people are important founders of countless modern ethnic groups; Russians, Czechs, Tajiks, Pashtuns, Indians, and so on. Oh yeah, and also the ancient Scythians, who dominated much of Asia around 500 BC, derived directly from his people. Pretty amazing.


It’s been known for a while, via archaeological data, that Steppe folk traded with these farmers. But now, thanks to ancient DNA, it’s clear that they exchanged more than just goods. Enneolithic and Bronze Age genomes from what are now Ukraine, Romania, and Bulgaria show that the Steppe and farmer folks began mixing by at least 4400 BC.

Hence, when Steppe folk expanded both west and east, they took with them at least a little Anatolian admixture. This is also true for the Steppe folk who went to South Asia. Several of the mt-HGs that I labeled “Steppe” are in fact Anatolian mt-HGs that the Steppe folk acquired through admixture with farmer peoples before their mass migrations. These include mt-HGs H1b1, H5a1, H7b, J1c1b1a, J2b1a, N1a1a1a1, K1b1a1, HV6, and HV9.

It’s often said, in scientific literature as well as on various genetic blogs and forums, that the Steppe folk who moved into South Asia didn’t harbor any Anatolian ancestry. But my mtDNA data easily debunks this claim. South Asians do indeed carry some Anatolian-derived mtDNA which they, in all likelihood, acquired from their Steppe ancestors.
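For anyone who wants to play with the haplogroup list quoted above, here's a minimal sketch that tallies the modern-day hits per population. It only uses the modern part of each line, and it assumes that a population named without "=N" stands for a single sample, which is my reading of the notation rather than something Samuel spells out. Keep in mind that these are raw counts from whatever databases he searched, not population frequencies.

```python
# Modern-population part of each haplogroup line, copied from the quoted list.
# Assumption: a name without "=N" means one sample.
MODERN_HITS = {
    "J1c1b1a": "Russia, Ukraine, Hungary, Romania, Denmark, UK, Spain, Tajik, India",
    "H2a1a": "Russia, Hungary=2, Finland, Britain, Ireland, France, Pathan, Tajik=16, Turkey, Siberia",
    "H5e1": "Russia=2, Hungary, Greece, Tajik=3",
    "T1a1b": "Russia=4, Poland=3, Hungary=2, Iran=2, Turkey, Tajik=4, India",
    "N1a1a1a1": "Estonia=3, Finland=2, Italy, Turkmen, India=2",
    "K2a5": "Estonia, Ireland, Iran, Sindhi, Pathan, India",
    "U4b2": "Russia, Ukraine, Sweden, Spain, Burosho, Tajik, India",
    "U4b1a4": "Kalash, Tajik, Iran, Siberia=3",
    "U2e1h": "Kalash=3, Tajik=8, Siberia, Italy",
}


def parse_hits(text):
    """Turn 'Russia, Hungary=2' into {'Russia': 1, 'Hungary': 2}."""
    counts = {}
    for item in text.split(","):
        name, _, n = item.strip().partition("=")
        counts[name] = int(n) if n else 1
    return counts


def totals_by_population(hits):
    """Sum the hits for each population across all listed haplogroups."""
    totals = {}
    for text in hits.values():
        for pop, n in parse_hits(text).items():
            totals[pop] = totals.get(pop, 0) + n
    return totals


totals = totals_by_population(MODERN_HITS)
# Show the five populations with the most hits (ties broken alphabetically).
for pop, n in sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:5]:
    print(pop, n)
```

Tajiks come out well ahead of everyone else in this tally, which fits the point made in the quote about their unusually high level of Steppe mtDNA.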

See also...

Another look at the genetic structure of Yamnaya

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Descendants of ancient European (fair?) maidens in Central Asia's highlands

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Thursday, January 25, 2018

Unadmixed Basal Eurasians lived throughout the Near East ~45-15 KYA?

Below is a map from a recent review paper at Trends in Genetics by Melinda A. Yang and Qiaomei Fu titled Insights into Modern Human Prehistory Using Ancient Genomes.

It's somewhat speculative and an abstraction of geographic realities (note that the ancient "Karelia" population is placed several thousand miles east of Karelia, in Northern Asia as opposed to Northeastern Europe). Nevertheless, the fact that the authors chose to illustrate the home of the so-called Basal Eurasians as a rather large range in the middle of the Near East, rather than something more remote and limited, like, say, a small part of the Arabian Peninsula or even North Africa, is interesting.

Indeed, they seem to suggest that post-Basal Eurasian Near Eastern populations took shape not as a result of the expansion of Basal Eurasians across the Near East, but rather due to the migration of northern foragers (labeled EUR on the map) from Eastern Europe to the Near East. Like I say, no doubt this is based on some guesswork, and needs to be confirmed with more sampling from the ancient Near East, but it's still noteworthy that it made it onto the map.


Melinda A. Yang and Qiaomei Fu, Insights into Modern Human Prehistory Using Ancient Genomes,

See also...

Villabruna cluster =/= Near Eastern migrants

Wednesday, January 24, 2018

The Kho people: archaic Indo-Aryans

I've managed to get my hands on two Kho samples from Chitral, northern Pakistan, courtesy of Khana from the comments at this blog and someone named Sam Sloan. Here's what Wikipedia has to say about the Kho, who are Dardic-speakers and thus close linguistic relatives of the Kalasha people:

The Kho people are likely descendants of those who arrived in the region during the Indo-Aryan migration.[5] The Kho people formerly observed a form of ancient Hinduism;[6] during the Mongol invasion of India during the 1200s, many of the northern Kho converted to Islam.[7]


The Kho people speak the Khowar language, a member of the Dardic subgroup of the Indo-Aryan language family. The ethnologists Karl Jettmar and Lennart Edelberg noted, with respect to the Khowar language, that: "Khowar, in many respects [is] the most archaic of all modern Indian languages, retaining a great part of Sanskrit case inflexion, and retaining many words in a nearly Sanskritic form."[9]

Moreover, Chitral is near Swat, which is the location of a Bronze Age cemetery that is generally presumed to be the oldest Indo-Iranian archaeological site in South Asia. It'll be interesting to compare the two Kho individuals to samples from this ancient burial ground if and when they're finally published (see here and here). Meantime, this is how they compare to the Kalasha from the HGDP dataset in several of my staple genome-wide analyses:

Overall, the qpGraph trees produce almost identical results for both the Kho and Kalasha. However, on the Kho tree, the drift path leading from C to Han is zero (i.e. no genetic drift), while on the Kalasha tree it's 18. That's a subtle, but perhaps important difference, because it suggests that the Kho and Kalasha have somewhat different types of East Eurasian admixture.

Indeed, in the West Eurasian and world Principal Component Analyses (PCA) the Kho pull more strongly towards the Bronze Age steppe and East Asia, respectively, compared to the Kalasha. This might mean that they've been less isolated genetically than the Kalasha since the initial Indo-Aryan settlement of what is now northern Pakistan.

I've also added the Kho to the Global 10 and Basal-rich K7 datasheets (see here and here, respectively). It might be possible to investigate in more detail the differences between the Kho and Kalasha by using this output to model their ancestry with nMonte (for instance, like here).
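To give a rough idea of what this sort of distance-based ancestry modeling does, here's a toy sketch. It is not nMonte itself, just a much-simplified stand-in: it takes a target's coordinates and looks for the blend of two source averages that sits closest to the target in Euclidean terms. The coordinates are made up for the example and the function names are my own; real runs would use the rows from the datasheets and usually more than two sources.

```python
import math


def mix(a, b, w):
    """Coordinates of a w : (1 - w) blend of two source averages."""
    return [w * x + (1 - w) * y for x, y in zip(a, b)]


def distance(p, q):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def best_two_way_fit(target, source_a, source_b, steps=100):
    """Grid search over mixing weights; returns (weight of source_a, distance)."""
    best_w, best_d = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        d = distance(target, mix(source_a, source_b, w))
        if d < best_d:
            best_w, best_d = w, d
    return best_w, best_d


# Made-up coordinates for illustration only (not taken from any datasheet).
source_a = [0.10, -0.02, 0.03]
source_b = [-0.04, 0.06, 0.01]
# A synthetic target built as an exact 70/30 blend, so the answer is known.
target = mix(source_a, source_b, 0.70)

w, d = best_two_way_fit(target, source_a, source_b)
print(f"source_a: {w:.0%}, source_b: {1 - w:.0%}, distance: {d:.6f}")
```

With real samples the distance never drops to zero, and a small distance only says that the model is a decent geometric fit, not that the sources are the true ancestors.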

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Descendants of ancient European (fair?) maidens in Central Asia's highlands

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, January 20, 2018

Ancient mitogenomes reveal Central Asian (Hunnic?) admixture in Hungarian Conquerors (Neparáczki et al. 2018 preprint)

Over at bioRxiv at this LINK. The number of ancient mitogenomes in this preprint (102) is fairly impressive, but obviously there's only so much insight one can gain from maternally-inherited genetic markers when studying male-driven conquests like that of the Carpathian Basin by the early Hungarians. So yeah, let's wait and see how the conclusions in this preprint gel with the relevant ancient Y-chromosome and genome-wide data when it arrives. Below is the abstract. Emphasis is mine:

It has been widely accepted that the Finno-Ugric Hungarian language, originated from proto Uralic people, was brought into the Carpathian Basin by the Hungarian Conquerors. From the middle of the 19th century this view prevailed against the deep-rooted Hungarian Hun tradition, maintained in folk memory as well as in Hungarian and foreign written medieval sources, which claimed that Hungarians were kinsfolk of the Huns. In order to shed light on the genetic origin of the Conquerors we sequenced 102 mitogenomes from early Conqueror cemeteries and compared them to sequences of all available databases. We applied novel population genetic algorithms, named Shared Haplogroup Distance and MITOMIX, to reveal past admixture of maternal lineages. Phylogenetic and population genetic analysis indicated that more than one third of the Conqueror maternal lineages were derived from Central-Inner Asia and their most probable ultimate sources were the Asian Huns. The rest of the lineages most likely originated from the Bronze Age Potapovka-Poltavka-Srubnaya cultures of the Pontic-Caspian steppe, which area was part of the later European Hun empire. Our data give support to the Hungarian Hun tradition and provides indirect evidence for the genetic connection between Asian and European Huns. Available data imply that the Conquerors did not have a major contribution to the gene pool of the Carpathian Basin, raising doubts about the Conqueror origin of Hungarian language.

Neparáczki et al., Mitogenomic data indicate admixture components of Asian Hun and Srubnaya origin in the Hungarian Conquerors, bioRxiv, Posted January 19, 2018, doi:

The case of Chalcolithic fortresses in the Northwestern Caucasus (Kozintsev 2017)

It's a pity that we still don't have any decent ancient DNA data from the North Caucasus and nearby steppes, apart from, of course, those few intriguing mitochondrial genomes from Maykop burials (see here). This leaves us guessing about the genetic origins of the people who lived in this region across the millennia, and thus their genealogical relationships to near and far ancient and modern-day populations, which might eventually prove pivotal in the search for the Proto-Indo-European homeland.

The most nagging questions to be solved are whether Yamnaya, and other closely related Eneolithic/Bronze Age steppe herder groups, sourced the greater part of their so-called southern ancestry from the North Caucasus, and if so, from whom exactly: groups indigenous to the region, or mixed populations with significant ancestry from, say, Transcaucasia (the Southern Caucasus) or even Mesopotamia?

To make matters worse, the archaeology of the North Caucasus is fairly poorly understood. It's generally assumed that there was indeed a colonization of the Northwestern Caucasus by various peoples from the south, including Uruk migrants from Mesopotamia. But even if so, did they leave a lasting impact on the populations of the Caucasus and, subsequently, the steppes? Despite some strong opinions on the matter, particularly in the comments at this blog, no one can say for sure at this stage.

However, as far as I can see, a fascinating new archaeological paper by A.G. Kozintsev in Archaeology, Ethnology & Anthropology of Eurasia suggests that one such group of southern migrants, who built a fortress at Meshoko, in what is now Southern Russia, during the Chalcolithic, was overrun by people more culturally "archaic" and indigenous to the region. If true, and this wasn't an isolated incident, then for obvious reasons it might help to explain the lack of Mesopotamian- and South Caspian-specific uniparental markers amongst the Eneolithic/Bronze Age steppe herder groups, which is an issue that I have discussed at length in the past (see here, here and here). Below is the abstract from Kozintsev's paper. Emphasis is mine:

A multivariate method for assessing cultural changes at stratified sites is proposed. The variables are technological properties of ceramics, and occurrences of various categories of flint implements. The method is applied to stratigraphic sequences of Chalcolithic fortresses in the northwestern Caucasus dating to the late 5th–early 4th millennia BC: Meshoko and Yasenova Polyana. The properties of ceramics include hardness (assessed on the Mohs scale), wall thickness, and frequency of fragments tempered with calcium carbonate. For Meshoko, S.M. Ostashinsky’s data on the occurrence of implements made of high-quality colored flint, splintered pieces, and the total number of segments, points, inserts, scrapers, and perforators were used as well. Each parameter undergoes regular changes from the lower to the upper units of the sequence: ceramics progressively deteriorate, whereas flint industry becomes more and more sophisticated. These changes occur in parallel. Data were subjected to principal component analysis. The first principal component is regarded as a generalized measure of cultural change. The results support the view of the excavators: changes were caused by the interaction of two cultures differing in origin. The earlier culture, associated with the constructors of the Meshoko fortress, shows no local roots, and was evidently introduced from Transcaucasia. The one that replaced it was significantly more archaic (a few copper tools notwithstanding), and reveals local Neolithic roots. It alone can be termed the culture of ceramics with interior-punched node decoration. The ceramics of Yasenova Polyana, too, indicate cultural heterogeneity and two occupation stages; but cultural changes are more complicated there, probably because the site existed longer, and more than two cultural components were involved.

A.G. Kozintsev, A Generalized Assessment of Cultural Changes at Stratified Sites: The Case of Chalcolithic Fortresses in the Northwestern Caucasus, Archaeology, Ethnology & Anthropology of Eurasia 45 (1) 2017, DOI: 10.17746/1563-0110.2017.45.1.062-075
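The idea in the abstract of using the first principal component as a single "generalized measure of cultural change" is easy to illustrate. Below is a minimal sketch, assuming nothing about Kozintsev's actual implementation: it standardizes a few variables per stratigraphic unit, finds the leading eigenvector of their correlation matrix by power iteration, and scores each unit on it. The numbers are made-up toy values, not measurements from the paper, and all names are my own.

```python
import math

# Made-up toy values for four stratigraphic units, lowest unit first.
# These are NOT measurements from Kozintsev 2017. The hardness and flint trends
# follow the direction described in the abstract; the wall-thickness trend is
# simply assumed for the example.
# Columns: ceramic hardness (Mohs), wall thickness (mm), high-quality flint (%)
units = [
    [4.0, 6.0, 10.0],
    [3.5, 7.0, 20.0],
    [3.0, 7.5, 35.0],
    [2.5, 9.0, 50.0],
]


def standardize(rows):
    """Convert each column to z-scores so that units of measurement don't matter."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def first_component(z, iterations=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, p = len(z), len(z[0])
    corr = [
        [sum(z[k][i] * z[k][j] for k in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0] + [0.0] * (p - 1)  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


z = standardize(units)
loadings = first_component(z)
scores = [sum(a * b for a, b in zip(row, loadings)) for row in z]

# The sign of a principal component is arbitrary; orient it so that the
# score rises from the lowest unit to the highest.
if scores[-1] < scores[0]:
    loadings = [-x for x in loadings]
    scores = [-s for s in scores]

for level, score in enumerate(scores, start=1):
    print(f"unit {level} (from the bottom): PC1 score {score:+.2f}")
```

When the variables change in parallel, as the abstract says they do at Meshoko, the first component soaks up most of the variation and the unit scores line up in stratigraphic order, which is what makes it usable as a one-number summary.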

See also...

Steppe Maykop: a buffer zone?

Genetic borders are usually linguistic borders too

On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint)

Wednesday, January 17, 2018

Another look at the genetic structure of Yamnaya

Yamnaya and other similar Eneolithic/Bronze Age herder groups from the Eurasian steppe were mostly a mixture of Eastern European Hunter-Gatherers (EHG) and Caucasus Hunter-Gatherers (CHG). But they also harbored minor ancestry from at least one, significantly more westerly, source that pulled them away from the EHG > CHG north/south genetic cline. This is easy to show with formal statistics (for instance, refer to the qpAdm output here) and illustrate with a decent Principal Component Analysis (PCA).

Over the past couple of years I've come to the conclusion that this minor westerly input probably came from the Carpathian Basin (modern-day Hungary) or somewhere nearby, like the Balkans (see here).

However, this inference was based on just a handful of Neolithic samples from the Carpathian Basin. Now, thanks to Lipson et al. 2017, I have genotype data from tens of individuals from several different Neolithic and Copper Age cultures from the region. So let's revisit the issue by plugging these new samples into qpAdm, and also using the very latest qpAdm methods as described in the scientific literature (with Ethiopia_4500BP as the base pright sample, plus 15 other ancient pright groups and individuals).

Below are the results, best to worst, sorted by taildiff. For comparison, I ran extra models with ancient populations from other parts of Europe and also West Asia. It's interesting and, I'd say, important to note that the West Asian reference groups produce amongst the worst statistical fits (in bold text). What this suggests is that Yamnaya did not harbor extra West Asian ancestry on top of its CHG input. And, by the way, please note that I'm only using Yamnaya_Samara in these runs because I prefer UDG-treated, and thus higher quality, ancient samples.

CHG + EHG + Blatterhole_MN 0.465394061 > full output

CHG + EHG + Koros_HG 0.322245651 > full output

CHG + EHG + Germany_MN 0.321017025 > full output

CHG + EHG + Protoboleraz_LCA 0.315521424 > full output

CHG + EHG + Vinca_MN 0.292074267 > full output

CHG + EHG + Baden_LCA 0.255168297 > full output

CHG + EHG + Tisza_LN 0.246555616 > full output

CHG + EHG + ALPc_MN 0.220623346 > full output

CHG + EHG + Blatterhole_HG 0.219418173 > full output

CHG + EHG + Balaton_Lasinja_CA 0.211230222 > full output

CHG + EHG + Tiszapolgar_ECA 0.207527666 > full output

CHG + EHG + LBK_EN 0.182365613 > full output

CHG + EHG + TDLN 0.176675465 > full output

CHG + EHG + Koros_EN 0.15488361 > full output

CHG + EHG + Starcevo_EN 0.136365203 > full output

CHG + EHG + Armenia_EBA 0.127988891 > full output

CHG + EHG + Armenia_ChL 0.123057884 > full output

CHG + EHG + LBKT_MN 0.122780467 > full output

CHG + EHG + Tepecik_Ciftlik_N 0.110155019 > full output

CHG + EHG + Greece_N 0.105880232 > full output

CHG + EHG + Boncuklu_N 0.094240794 > full output

CHG + EHG + Anatolia_BA 0.069141519 > full output

CHG + EHG + Anatolia_ChL 0.067837662 > full output


CHG + EHG + Iran_ChL infeasible > full output
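If you'd like to work with these numbers rather than just eyeball them, here's a minimal sketch that parses the list above, sets aside the infeasible model, and ranks the rest. It treats taildiff as a score where larger means a better fit, as the best-to-worst ordering implies. The 0.05 cutoff is an assumption added for illustration, a commonly used but arbitrary threshold, and not something that was applied in the runs above.

```python
# Third source and taildiff for each CHG + EHG + X model, copied from the list above.
RESULTS = """\
Blatterhole_MN 0.465394061
Koros_HG 0.322245651
Germany_MN 0.321017025
Protoboleraz_LCA 0.315521424
Vinca_MN 0.292074267
Baden_LCA 0.255168297
Tisza_LN 0.246555616
ALPc_MN 0.220623346
Blatterhole_HG 0.219418173
Balaton_Lasinja_CA 0.211230222
Tiszapolgar_ECA 0.207527666
LBK_EN 0.182365613
TDLN 0.176675465
Koros_EN 0.15488361
Starcevo_EN 0.136365203
Armenia_EBA 0.127988891
Armenia_ChL 0.123057884
LBKT_MN 0.122780467
Tepecik_Ciftlik_N 0.110155019
Greece_N 0.105880232
Boncuklu_N 0.094240794
Anatolia_BA 0.069141519
Anatolia_ChL 0.067837662
Iran_ChL infeasible
"""


def parse_results(text):
    """Split the listing into (name, taildiff) pairs and infeasible model names."""
    feasible, infeasible = [], []
    for line in text.splitlines():
        name, value = line.split()
        if value == "infeasible":
            infeasible.append(name)
        else:
            feasible.append((name, float(value)))
    return feasible, infeasible


feasible, infeasible = parse_results(RESULTS)
ranked = sorted(feasible, key=lambda item: item[1], reverse=True)

CUTOFF = 0.05  # assumed illustrative threshold, not from the original analysis
below_cutoff = [name for name, p in ranked if p < CUTOFF]

print("best fit:", ranked[0])
print("worst feasible fit:", ranked[-1])
print("infeasible:", infeasible)
print("below cutoff:", below_cutoff)
```

Note that every feasible model here clears the assumed cutoff, so the ranking is about relative fit: none of these three-way models is flatly rejected, but some proxies for the westerly ancestry work a lot better than others.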

At the top of the list is Blatterhole_MN. Admittedly this is something of a surprise, considering the geographic distance between Blatterhole, Germany, and Samara, Russia. It's also an intriguing result because of the presence of Y-chromosome haplogroup R1b in both Blatterhole_MN and Yamnaya (see here).

However, this doesn't necessarily mean that Yamnaya harbors direct ancestry from Blatterhole_MN, or even any closely related group from North-Central Europe. Rather, Blatterhole_MN is simply the best proxy in this analysis for the non-CHG/EHG ancestry in Yamnaya, and the important question is why?

Considering also the presence at the top of the list of Koros_HG (which includes Hungary_HG I1507), Germany_MN and Vinca_MN, the likely answer is its high ratio of Western European Hunter-Gatherer (WHG) ancestry. Indeed, when I let qpAdm vary the WHG ratio, by dropping Blatterhole_MN and adding Koros_EN and Koros_HG in its place, I get an even better fit.

CHG + EHG + Koros_EN + Koros_HG 0.612772624 > full output

And for comparison...

CHG + EHG + LBK_EN + WHG 0.551431774 > full output

So is the missing piece of the Yamnaya puzzle a population with roughly equal ratios of Early Neolithic (EN) and WHG ancestries from the Carpathian Basin or surrounds? Quite possibly. But let's wait and see what happens when I add the ancient groups from the Balkans and North Pontic steppe from the forthcoming Mathieson et al. 2018 to this analysis.

Update 17/05/2018: My results have been confirmed in a new preprint from Harvard/Max Planck. See here: On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint).

See also...

What's Maykop (or Iran) got to do with it?

The Yamnaya outlier

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, January 13, 2018

Genetic maps featuring 67 ancient genomes and more than 3,000 present-day individuals

I've got some eye candy for you guys as we wait for 2018 to really get going. Below are three Principal Component Analysis (PCA) plots, or genetic maps, based on the ancient diploid dataset from Martiniano et al. 2017 (described in more detail here). Click on the images to download hi-res PDFs of each plot. The relevant datasheets are available here.

The important thing about these PCA is that none of the samples in the analyses are missing more than 1% of the ~188K markers used to compute the PCs, which means that I didn't have to resort to any type of projection to get things right. In other words, the relationships between the samples that you see on these plots are direct.
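The missingness cutoff described above can be sketched in a few lines of Python. This is only a toy illustration, not the actual pipeline behind these plots: the sample IDs, the `None` encoding for a missing genotype, and the function names are all assumptions, and the example uses 200 markers rather than ~188K to keep it small.

```python
from typing import Dict, List, Optional

Genotypes = List[Optional[int]]  # None marks a missing call (assumed encoding)


def missing_rate(genotypes: Genotypes) -> float:
    """Fraction of markers with no genotype call."""
    return sum(g is None for g in genotypes) / len(genotypes)


def filter_samples(samples: Dict[str, Genotypes],
                   max_missing: float = 0.01) -> Dict[str, Genotypes]:
    """Keep samples that are missing no more than max_missing of the markers."""
    return {sid: g for sid, g in samples.items()
            if missing_rate(g) <= max_missing}


N_MARKERS = 200
samples = {
    # Hypothetical samples: complete, exactly 1% missing, and 1.5% missing.
    "sample_A": [1] * N_MARKERS,
    "sample_B": [None] * 2 + [0] * (N_MARKERS - 2),
    "sample_C": [None] * 3 + [2] * (N_MARKERS - 3),
}

kept = filter_samples(samples)
print(sorted(kept))
```

Only samples that pass this kind of filter go into the PCA itself, so every sample helps to define the components and none has to be projected onto them afterwards.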

PCA are easy to read. The main thing to keep in mind is that the results are dependent on the samples in the analysis. For instance, note that the Indians (Gujaratis and Brahmins) cluster rather close to some Europeans on the West Eurasian plot, but much further from them on the Eurasian/American plot. Why? Because the addition of hundreds of East Eurasian individuals to the latter plot highlights the significant East Eurasian-related admixture in the Indians, and pulls them away from the Europeans, who generally have much less of this type of ancestry.

It's interesting, I think, that all of the ancients from burial sites within the borders of present-day Europe (discussed in an earlier blog post here) cluster with present-day Europeans, or at least closest to us. See anything else interesting? Feel free to share it in the comments below.

If you're having trouble spotting certain individuals and/or populations, type the relevant individual or population ID in the PDF search box and press Enter. The PDF will initially show you a box where the samples of interest are located; click on the box, and the PDF will zoom into the boxed area and highlight these samples.

See also...

Who's your (proto) daddy Western Europeans?

Wednesday, January 10, 2018

Ancient mitogenomes from Sardinia and Lebanon (Matisoo-Smith et al. 2018)

Over at PLoS ONE at this LINK. Emphasis is mine:

Abstract: The Phoenicians emerged in the Northern Levant around 1800 BCE and by the 9th century BCE had spread their culture across the Mediterranean Basin, establishing trading posts, and settlements in various European Mediterranean and North African locations. Despite their widespread influence, what is known of the Phoenicians comes from what was written about them by the Greeks and Egyptians. In this study, we investigate the extent of Phoenician integration with the Sardinian communities they settled. We present 14 new ancient mitogenome sequences from pre-Phoenician (~1800 BCE) and Phoenician (~700–400 BCE) samples from Lebanon (n = 4) and Sardinia (n = 10) and compare these with 87 new complete mitogenomes from modern Lebanese and 21 recently published pre-Phoenician ancient mitogenomes from Sardinia to investigate the population dynamics of the Phoenician (Punic) site of Monte Sirai, in southern Sardinia. Our results indicate evidence of continuity of some lineages from pre-Phoenician populations suggesting integration of indigenous Sardinians in the Monte Sirai Phoenician community. We also find evidence of the arrival of new, unique mitochondrial lineages, indicating the movement of women from sites in the Near East or North Africa to Sardinia, but also possibly from non-Mediterranean populations and the likely movement of women from Europe to Phoenician sites in Lebanon. Combined, this evidence suggests female mobility and genetic diversity in Phoenician communities, reflecting the inclusive and multicultural nature of Phoenician society.

Matisoo-Smith E, Gosling AL, Platt D, Kardailsky O, Prost S, Cameron-Christie S, et al. (2018) Ancient mitogenomes of Phoenicians from Sardinia and Lebanon: A story of settlement, integration, and female mobility. PLoS ONE 13(1): e0190169.

See also...

Something unexpected from Mesolithic Sardinia

Wednesday, January 3, 2018

A genome from the first founding population of Native Americans (Moreno-Mayar et al. 2018)

Over at Nature at this LINK. By the way, when did Nature start adding those "Life Sciences Reporting Summaries" to its papers? I remember having a chat with Broad MIT/Harvard back in May about adding something like this to ancient DNA papers, especially in regards to data exclusions, right after my blog entry about the somewhat suspiciously missing Yamnaya males in Mathieson et al. 2017 (see here), and suddenly, here it is. Eh, probably a crazy coincidence, but a great move in any case. Below is the Moreno-Mayar et al. abstract and an Admixture graph from the paper:

Despite broad agreement that the Americas were initially populated via Beringia, the land bridge that connected far northeast Asia with northwestern North America during the Pleistocene epoch, when and how the peopling of the Americas occurred remains unresolved [1,2,3,4,5]. Analyses of human remains from Late Pleistocene Alaska are important to resolving the timing and dispersal of these populations. The remains of two infants were recovered at Upward Sun River (USR), and have been dated to around 11.5 thousand years ago (ka) [6]. Here, by sequencing the USR1 genome to an average coverage of approximately 17 times, we show that USR1 is most closely related to Native Americans, but falls basal to all previously sequenced contemporary and ancient Native Americans [1,7,8]. As such, USR1 represents a distinct Ancient Beringian population. Using demographic modelling, we infer that the Ancient Beringian population and ancestors of other Native Americans descended from a single founding population that initially split from East Asians around 36 ± 1.5 ka, with gene flow persisting until around 25 ± 1.1 ka. Gene flow from ancient north Eurasians into all Native Americans took place 25–20 ka, with Ancient Beringians branching off around 22–18.1 ka. Our findings support a long-term genetic structure in ancestral Native Americans, consistent with the Beringian ‘standstill model’ [9]. We show that the basal northern and southern Native American branches, to which all other Native Americans belong, diverged around 17.5–14.6 ka, and that this probably occurred south of the North American ice sheets. We also show that after 11.5 ka, some of the northern Native American populations received gene flow from a Siberian population most closely related to Koryaks, but not Palaeo-Eskimos [1], Inuits or Kets [10], and that Native American gene flow into Inuits was through northern and not southern Native American groups [1].
Our findings further suggest that the far-northern North American presence of northern Native Americans is from a back migration that replaced or absorbed the initial founding population of Ancient Beringians.

Moreno-Mayar et al., Terminal Pleistocene Alaskan genome reveals first founding population of Native Americans, Nature, Published online: 03 January 2018, doi:10.1038/nature25173