
Friday, November 13, 2020

Fatyanovo as part of the wider Corded Ware family (Nordqvist and Heyd 2020)


There's a new archeological paper about the Fatyanovo culture at the Proceedings of the Prehistoric Society [LINK]. It includes this quote on page 18:

In the traditional narrative, the Fatyanovo people – like the CWC populations in general – are regarded as Indo-European, representing the pre-Balto-Slavic (-Germanic) stage (Carpelan & Parpola 2001, 88; Anthony 2007, 380; also Gimbutas 1956, 163; Tretyakov 1966, 109) in the spread of Indo-European languages.

That's correct, but considering the latest ancient DNA research on the Fatyanovo people, the traditional narrative is probably wrong. Fatyanovo males were rich in Y-haplogroup R1a-Z93, which is found at very low frequencies in Balto-Slavic populations (see here). It's actually much more common nowadays in Central and South Asia, where it often reaches frequencies of over 50% in Indo-Iranian speaking groups.

Balts and Slavs are rich in R1a-Z282, which is a sister clade of R1a-Z93, and has been found in Corded Ware and Corded Ware related samples from west of Fatyanovo sites. That is, in present-day Poland and the Baltic states.

Therefore, the origins of the Balto-Slavs should be sought somewhere west of the Fatyanovo culture, probably in the Corded Ware derived populations from what is now the border zone between Poland, Belarus and Ukraine.

Indeed, in my view the Fatyanovo people are more likely to have spoken Proto-Indo-Iranian rather than anything ancestral to Baltic or Slavic (see here).
Nordqvist and Heyd, The Forgotten Child of the Wider Corded Ware Family: Russian Fatyanovo Culture in Context, Proceedings of the Prehistoric Society, online 12 November 2020, DOI: https://doi.org/10.1017/ppr.2020.9

See also...

The oldest R1a to date

Saturday, November 7, 2020

Slavic-like Medieval Germans


The samples labeled DEU_Krakauer_Berg_MA in the Principal Component Analysis (PCA) plot below are from a recent paper by Parker et al. at Scientific Reports. Their remains were excavated from a Medieval cemetery in the now abandoned village of Krakauer Berg in eastern Germany.

Krakauer sounds sort of like Kraków, doesn't it? That's probably not a coincidence, especially considering how these people behave in my analysis. To see an interactive version of the plot, paste the coordinates from the text file here into the relevant field here.


See also...

Yamnaya-related ancestry proportions in present-day Poles

Warriors from at least two different populations fought in the Tollense Valley battle

Viking world open analysis and discussion thread

Wednesday, October 14, 2020

A new model for the genomic formation of First American ancestors in Asia (Ning et al. 2020 preprint)


Over at bioRxiv at this LINK. The main topic of the preprint is largely outside the scope of this blog. However, the manuscript includes a detailed discussion about how to get the most out of the qpAdm mixture modeling program. I've used qpAdm regularly over the years, and I plan to use it more often in the future, so I'll be looking very carefully at the qpAdm methodology that Ning et al. are recommending. Here's the preprint abstract:

Upward Sun River 1, an individual from a unique burial of the Denali tradition in Alaska (11500 calBP), is considered a type representative of Ancient Beringians who split from other First Americans 22000-18000 calBP in Beringia. Using a new admixture graph model-comparison approach resistant to overfitting, we show that Ancient Beringians do not form the deepest American lineage, but instead harbor ancestry from a lineage more closely related to northern North Americans than to southern North Americans. Ancient Beringians also harbor substantial admixture from a lineage that did not contribute to other Native Americans: Amur River Basin populations represented by a newly reported site in northeastern China. Relying on these results, we propose a new model for the genomic formation of First American ancestors in Asia.

Ning et al., The genomic formation of First American ancestors in East and Northeast Asia, bioRxiv, posted October 12, 2020, doi: https://doi.org/10.1101/2020.10.12.336628

See also...

Ancient ancestry proportions in present-day Europeans

Major updates to ADMIXTOOLS

Yamnaya-related ancestry proportions in present-day Poles

Tuesday, September 29, 2020

Viking world open analysis and discussion thread


Global25 and Celtic vs Germanic coordinates for most of the samples from the recent Margaryan et al. Viking paper are now available HERE and HERE, respectively. Look for the VK2020 prefix.

Feel free to put them through their paces and let me know what you find. Below are a couple of examples of what can be done with these coordinates using Vahaduo Global25 Views.
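
The same coordinates can also be explored in a script. Here's a minimal sketch, not an official tool, that ranks samples by Euclidean distance to a chosen sample. It assumes each row of the text file is a comma-separated line with a sample label followed by its coordinates. The rows below are made up and cut down to three dimensions, and the helper names are hypothetical.

```python
import math

def parse_coords(text):
    """Parse 'label,v1,v2,...' rows into a dict of label -> list of floats."""
    samples = {}
    for line in text.strip().splitlines():
        fields = line.strip().split(",")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        samples[fields[0]] = [float(v) for v in fields[1:]]
    return samples

def distance(a, b):
    """Euclidean distance between two coordinate vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors differ in length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(target, samples, k=3):
    """Return up to k (label, distance) pairs closest to target, nearest first."""
    others = [(distance(samples[target], vec), label)
              for label, vec in samples.items() if label != target]
    return [(label, round(d, 4)) for d, label in sorted(others)[:k]]

# Hypothetical rows, truncated to 3 dimensions for readability.
TOY = """
VK2020_toy_A,0.10,0.20,0.00
VK2020_toy_B,0.10,0.23,0.04
VK2020_toy_C,0.40,0.20,0.00
"""
samples = parse_coords(TOY)
print(nearest("VK2020_toy_A", samples, k=2))
```

Distances computed this way are only meaningful between rows from the same datasheet, since scaled and unscaled coordinates aren't comparable with each other.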

See also...

Viking invasion at bioRxiv

Commoner or elite?

Who were the people of the Nordic Bronze Age?

Wednesday, September 16, 2020

Domestic horses were introduced into Anatolia and Transcaucasia during the Bronze Age (Guimaraes et al. 2020)


Over at Science Advances at this LINK. This is a very important paper because it basically eliminates West Asia as the source of the modern domestic horse lineage, which leaves the Pontic-Caspian steppe in Eastern Europe as the only viable option.

It also corroborates the linguistic theory that the Proto-Indo-European homeland was located on the Pontic-Caspian steppe. That's because the horse is a key animal in the Proto-Indo-European pantheon, and it appears in Indo-European mythology in intricate roles. This suggests that the speakers of Proto-Indo-European weren't just familiar with the horse but also managed to domesticate it. From the paper:

Abstract: Despite the important roles that horses have played in human history, particularly in the spread of languages and cultures, and correspondingly intensive research on this topic, the origin of domestic horses remains elusive. Several domestication centers have been hypothesized, but most of these have been invalidated through recent paleogenetic studies. Anatolia is a region with an extended history of horse exploitation that has been considered a candidate for the origins of domestic horses but has never been subject to detailed investigation. Our paleogenetic study of pre- and protohistoric horses in Anatolia and the Caucasus, based on a diachronic sample from the early Neolithic to the Iron Age (~8000 to ~1000 BCE) that encompasses the presumed transition from wild to domestic horses (4000 to 3000 BCE), shows the rapid and large-scale introduction of domestic horses at the end of the third millennium BCE. Thus, our results argue strongly against autochthonous independent domestication of horses in Anatolia.
Guimaraes et al., Ancient DNA shows domestic horses were introduced in the southern Caucasus and Anatolia during the Bronze Age, Science Advances 16 Sep 2020: Vol. 6, no. 38, eabb0030, DOI: 10.1126/sciadv.abb0030


Tuesday, September 8, 2020

Warriors from at least two different populations fought in the Tollense Valley battle


I can't get the genotype data from the Burger et al. paper. The lead authors, Joachim Burger and Daniel Wegmann, aren't replying to my emails.

But they were gracious enough to release the BAM files for each of their samples, and these files can be converted to genotype data. So I've included ten of the Tollense Valley warriors (DEU_Tollense_BA) in the Global25 datasheets (see here).

The claim in the paper that these warriors "represent an unstructured population" is absolutely false and extremely naive.

Below are a couple of Principal Component Analysis (PCA) plots produced with Vahaduo Global25 views. The samples are labeled according to their Y-chromosome haplogroups. To see interactive versions of the same plots, paste the Global25 coordinates from the text file here into the relevant fields here.


These warriors are not a single unstructured population, because they cover too much ground in the above plots for that to be possible. It's clear to me that they represent at least two different groups from Central Europe and surrounds.

Of course, this would be a lot easier to work out if Burger et al. cared to supply more information about each of the warriors, such as their attire, weapons, circumstances of death, and so on. It's a complete mystery to me why this wasn't included in the paper, and the authors are refusing to talk to me, so it's unlikely that I'll ever be able to get it from them.

In the absence of such crucial archeological and anthropological data, I don't want to speculate too much, and get overly creative, but here are a couple of possible scenarios to explain the ancient DNA results:
- this may have been a battle between two Central European armies, one rich in Y-haplogroup R1b and the other rich in Y-haplogroup I2a, as well as their allies or hired help, including warriors from Eastern Europe belonging to Y-haplogroup R1a

- or perhaps it was an invasion from the east by warriors rich in Y-haplogroup R1a, and it was a success, with the local armies, rich in Y-haplogroups R1b and I2a, losing the battle and suffering most of the casualties.

I'm sure that one day someone will attempt to undertake a decent multidisciplinary study of this epic battle, and we'll at least have a rough idea about what happened. Or not.

Citation...

Burger et al., Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years, Current Biology, Available online 3 September 2020, https://doi.org/10.1016/j.cub.2020.08.033

See also...

Genetic and linguistic structure across space and time in Northern Europe

Sunday, September 6, 2020

Low prevalence of lactase persistence in Bronze Age Europe (Burger et al. 2020)


Over at Current Biology at this LINK. Unfortunately, this is the long-awaited Tollense Valley battle paper. Despite the obvious presence of some very interesting genetic substructures among the Tollense Valley warriors (see here), the authors have the audacity to claim that these individuals represent a "single unstructured Central/Northern European population".

One of the warriors, labeled WEZ56, belongs to Y-haplogroup R1a and shows an exceedingly Balto-Slavic-like genome-wide genetic structure. But none of this is even mentioned in passing in the paper. Indeed, according to Burger et al., WEZ56 is best classified as belonging to R1, even though the R1a classification is quite secure based on the raw data that the authors posted online.

Be extremely wary of what you read in this paper, and anything else that these scientists have published in the past and will publish in the future. Below is the paper summary:

Lactase persistence (LP), the continued expression of lactase into adulthood, is the most strongly selected single gene trait over the last 10,000 years in multiple human populations. It has been posited that the primary allele causing LP among Eurasians, rs4988235-A [1], only rose to appreciable frequencies during the Bronze and Iron Ages [2, 3], long after humans started consuming milk from domesticated animals. This rapid rise has been attributed to an influx of people from the Pontic-Caspian steppe that began around 5,000 years ago [4, 5]. We investigate the spatiotemporal spread of LP through an analysis of 14 warriors from the Tollense Bronze Age battlefield in northern Germany (∼3,200 before present, BP), the oldest large-scale conflict site north of the Alps. Genetic data indicate that these individuals represent a single unstructured Central/Northern European population. We complemented these data with genotypes of 18 individuals from the Bronze Age site Mokrin in Serbia (∼4,100 to ∼3,700 BP) and 37 individuals from Eastern Europe and the Pontic-Caspian Steppe region, predating both Bronze Age sites (∼5,980 to ∼3,980 BP). We infer low LP in all three regions, i.e., in northern Germany and South-eastern and Eastern Europe, suggesting that the surge of rs4988235 in Central and Northern Europe was unlikely caused by Steppe expansions. We estimate a selection coefficient of 0.06 and conclude that the selection was ongoing in various parts of Europe over the last 3,000 years.

Burger et al., Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years, Current Biology, Available online 3 September 2020, https://doi.org/10.1016/j.cub.2020.08.033

See also...

Warriors from at least two different populations fought in the Tollense Valley battle

Sunday, August 23, 2020

Fascinating stuff


Coming soon I guess:

But we have results from the Ezero culture, from Southeastern Bulgaria, which is from the early Bronze Age and which seems to connect the people of this culture with the future Hittites and Trojans. This has been confirmed by archeology many times and has been known for at least half a century. But now we see the genetic parallels between the two. Some of these ancient groups from the Bronze Age in one way or another have survived to this day in our country Bulgarians, as we also carry a certain amount of blood and genes from these same people, perhaps in the range of between 5 and 10%, which connects us with the Hittites, ancient Anatolia and the Trojans. There is a huge processing of the results before they are published, but among them there are huge curiosities from now on. One of them is from the necropolis in Merichleri from the Early Bronze Age and in another necropolis in Tsaribrod (the older of the two), these are mound necropolises from the Yamna culture in the Caucasus, of people who migrated here in Bulgaria and connected between you are. They came from the haplogroup R1a, namely Z93, which is the haplogroup again of the Scythian, but more of the Indo-Aryan tribes, the future Indo-Aryans, who later conquered India. But one of the tribes of the Yamna culture seems to have strayed and arrived in the Balkans instead of going to India. And so by chance, because archaeologists and geneticists have chosen between 260 burial mounds from this period, they have chosen only 3-4 and have come across exactly this extremely ancient group, which is from the time before the Indo-European group was divided into Iranians, Indians and Slavs, they were still one people at the time with the same genomes. And yes, one of these groups is among what we call Thracian tribes, but these are not Thracians. 
We have results from both the Early Iron Age and the Late Bronze Age, which are possibly Thracian, but I will keep them a secret at this stage, as I do not want to provoke speculation.

See also...

The precursor of the Trojans

Steppe invaders in the Bronze Age Balkans

Wednesday, August 19, 2020

Yamnaya-related ancestry proportions in present-day Poles


Modeling ancient ancestry proportions in present-day Europeans with the qpAdm software is now a lot more difficult. The reasons for this are updates to qpAdm as well as the availability of more useful outgroups or right pops.

This isn't necessarily a bad thing, because users are forced to work harder to find successful models, which is likely to lead to some interesting discoveries. But it can be very frustrating.

I don't think that settling for poor statistical fits or using a small number of outgroups are acceptable shortcuts. Perhaps sequencing modern-day samples in exactly the same way as the ancient samples, and thus increasing the compatibility between them, might help?

Limiting qpAdm runs to higher quality SNPs from transversion sites does help, but perhaps largely because of the significant reduction in markers?

In any case, I've now given up on running such analyses, at least until I see some serious pointers on the topic from Harvard's qpAdm experts. But before I put this project to bed for the time being, I'd like to share some new results for Poles from eastern and western Poland, respectively.

right pops:

CMR_Shum_Laka_8000BP
MAR_Taforalt
IRN_Ganj_Dareh_N
Levant_PPNB
GEO_CHG
TUR_Barcin_N
RUS_Piedmont_En
SRB_Iron_Gates_HG
WHG
RUS_Karelia_HG
MNG_North_N
RUS_Ust_Kyakhta

left pops:

Polish_East
CWC_Baltic_early 0.572±0.024
SWE_TRB 0.428±0.024
chisq 11.776
tail prob 0.300296
Full output

Polish_West
CWC_Baltic_early 0.587±0.021
SWE_TRB 0.413±0.021
chisq 11.165
tail prob 0.34478
Full output


Even using transversion sites, this is one of the very few combinations of ancient reference samples that works for the Poles with these right pops. That is, the combination of early Corded Ware samples from the East Baltic (CWC_Baltic_early) and Funnel Beaker samples from Scandinavia (SWE_TRB). The former are obviously the proxy here for Yamnaya-related ancestry.

Adding any sort of hunter-gatherer population to this model doesn't help or even makes things worse (for instance, see here and here). It is possible to add Baltic hunter-gatherers to a similar model after dropping CWC_Baltic_early in favor of closely related samples from the Early to Middle Bronze Age Pontic-Caspian steppe. Note, however, that the statistical fits are somewhat poorer.

Polish_East
Baltic_LTU_Narva 0.032±0.014
PC_steppe_EMBA 0.483±0.019
SWE_TRB 0.485±0.019
chisq 17.143
tail prob 0.0465198
Full output

Polish_West
Baltic_LTU_Narva 0.031±0.011
PC_steppe_EMBA 0.491±0.015
SWE_TRB 0.477±0.016
chisq 22.444
tail prob 0.00757421
Full output
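
One quick sanity check on blocks like the ones above is to divide each weight by its standard error and to confirm that the weights add up to about one. Here's a minimal sketch, assuming the condensed layout used in this post (population name, then `source weight±se` lines, then chisq and tail prob), which is not qpAdm's native output format. `parse_block` and `z_scores` are hypothetical helpers.

```python
import re

# One source line, e.g. "SWE_TRB 0.485±0.019"
WEIGHT = re.compile(r"^(\S+)\s+([0-9.]+)±([0-9.]+)$")

def parse_block(text):
    """Parse one condensed result block into a dict."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    result = {"target": lines[0], "sources": {}}
    for line in lines[1:]:
        match = WEIGHT.match(line)
        if match:
            name, weight, se = match.groups()
            result["sources"][name] = (float(weight), float(se))
        elif line.startswith("chisq"):
            result["chisq"] = float(line.split()[-1])
        elif line.startswith("tail prob"):
            result["tail_prob"] = float(line.split()[-1])
        # anything else (such as "Full output") is ignored
    return result

def z_scores(result):
    """Weight divided by its standard error, per source."""
    return {name: w / se for name, (w, se) in result["sources"].items()}

BLOCK = """
Polish_East
Baltic_LTU_Narva 0.032±0.014
PC_steppe_EMBA 0.483±0.019
SWE_TRB 0.485±0.019
chisq 17.143
tail prob 0.0465198
Full output
"""
parsed = parse_block(BLOCK)
total = sum(w for w, _ in parsed["sources"].values())
for name, z in z_scores(parsed).items():
    print(f"{name}: z = {z:.1f}")
print(f"weights sum to {total:.3f}")
```

In this block the Narva weight sits a little over two standard errors from zero, so it's only marginally distinguishable from no contribution at all.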


Interestingly, but not surprisingly, the ancestry of many present-day Northwestern European populations can be modeled in basically the same way. That's because ancient ancestry proportions are more closely correlated with latitude than longitude across much of the European continent.

English_Kent
CWC_Baltic_early 0.527±0.024
SWE_TRB 0.473±0.024
chisq 13.042
tail prob 0.221357
Full output

Icelandic
CWC_Baltic_early 0.586±0.023
SWE_TRB 0.414±0.023
chisq 16.517
tail prob 0.085751
Full output

Scottish
CWC_Baltic_early 0.583±0.021
SWE_TRB 0.417±0.021
chisq 12.144
tail prob 0.275536
Full output
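
The tail prob figures in these blocks behave like chi-square upper-tail probabilities. Here's a minimal sketch that recomputes them with the standard library only. The degrees of freedom are taken as the number of right pops minus the number of sources, which is an inference from the reported figures rather than something the output above states, and `chi2_tail` is a hypothetical helper.

```python
import math

def chi2_tail(chisq, df):
    """Upper-tail probability of a chi-square statistic (closed-form series)."""
    if chisq < 0 or df < 1:
        raise ValueError("need chisq >= 0 and df >= 1")
    half = chisq / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite series in odd powers of sqrt(x)
    term, total = math.sqrt(chisq), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= chisq / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total

N_RIGHT = 12  # number of right pops listed in this post
for label, chisq, n_sources in [
    ("Polish_East, two sources", 11.776, 2),
    ("Polish_East, three sources", 17.143, 3),
    ("Scottish, two sources", 12.144, 2),
]:
    df = N_RIGHT - n_sources  # assumed rule, see the note above
    print(label, "df =", df, "tail prob ~", round(chi2_tail(chisq, df), 4))
```

Under that assumption the results should land close to the 0.300296, 0.0465198 and 0.275536 reported above. If the right pops list changes length, `N_RIGHT` has to change with it.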


A zip file with the qpAdm output from this analysis and a list of the most relevant ancients is available here. I might try to run a few more populations over the next few days, but probably only from the northern half of Europe, so please check the zip file in a week or so to see what else is in there.

If anyone wants to challenge my results, note that these and very similar samples are freely available to the public via Harvard University here and here.

Update 22/08/2020: From Nick Patterson (Broad) in the comments:
My general advice for qpAdm is 1) Work on the right hand set. Don't include irrelevant population (except for one population as an outgroup); picking the best RHS can dramatically reduce s. errors on the admixture weights. 2) If qpAdm gives a very low p-value try and understand why, sometimes it is telling you that the target is not a mixture of the sources but sometimes the assumptions are violated, for example recent gene-flow from left pops -> right.

See also...

Ancient ancestry proportions in present-day Europeans

Tuesday, August 18, 2020

Housekeeping stuff


I'm about to phase out the use of the Global25 datasheets with modern-day samples. In large part, this move is due to the uncertainty about the populations that these individuals represent and the resulting (often idiotic) discussions here and elsewhere about their usefulness.

This uncertainty exists because many, perhaps most, of these people are classified based on their self identity, which may or may not reflect their genetic origins.

Thus, I'll no longer be updating these datasheets and, from next week, I'll also stop linking to them at this blog (like here). The links will remain live for the next few months, so that users can adjust to the change.

However, modern-day samples sequenced from archeological remains, and thus, as a rule, painstakingly classified by experts based on their burial contexts and genetic characteristics, will continue to be featured in the Global25 datasheets.

In other words, as far as the Global25 is concerned, all of the modern-day samples from the living are out, but all of the modern-day samples from the dead will remain, and indeed I'll be adding more of the latter as they become available.

I'm planning to eventually create several sets of Global25 datasheets based on individuals and populations from different periods, including the modern era. But I'll probably need some help with that.

Also, please note that comment moderation will now be the rule here rather than the exception. And I'll be cracking down hard on trolling, insults and any sort of potentially defamatory material, so no more crazy stuff, or else.

See also...

New rules for comments

Friday, August 14, 2020

Awesome new toys from Vahaduo


Vahaduo now offers a 3D PCA experience. Check it out HERE and HERE. Below are a couple of screen caps of me messing around with the new tools.



Vahaduo says:

Hi everyone!

New tool - PCA 3D Viewer.

Global 25 version:

https://vahaduo.github.io/3d/g25

West Eurasia version:

https://vahaduo.github.io/3d/we


Usage:

Dots - ancients, circles - moderns.

Click X, Y, Z or COLOR tab and then click one of the PCx buttons to switch dimensions.

Click already active X, Y, Z or COLOR tab to temporarily reverse selected dimension. It will be restored to a default state when any of the dimensions will be switched to another one.

ADD CUSTOM POINTS - self-explanatory. Points will be added as "+". IMPORTANT - G25 version takes NON-SCALED coordinates. This will be true for any new tool dedicated to G25 and coordinates will be scaled automatically when needed or desired.

"Type parts of names." + TAG button - type parts of names to tag certain samples (try for example "KK1 Afon Pinar"). Search is Case Sensitive. Points will be redrawn as "x".

Next row - Labels and Annotations. Click the right button to cycle trough:

CLEAR LABELS + LABELS:AUTO
CLEAR ANNOTATIONS + ANNOTATIONS:CLICK
CLEAR ANNOTATIONS + ANNOTATIONS:AUTO
CLEAR ALL + LABELS AND ANNOTATION OFF
CLEAR LABELS + LABELS:CLICK

CLICK - click to add/delete labels/annotations.
AUTO - same as CLICK plus labels/annotations will be automatically added to newly plotted or tagged samples. Unfortunately adding new labels and annotations becomes very slow when there is too much of them, so there is a limit for the AUTO setting - 250 labels or 20 annotations at once.

Annotations are editable. They can be dragged to another place and text can be changed. Text can be also wrapped into an HTML SPAN element and some styles can be used, like "font-size" or "color". BR element (new line) works too.

HIGHLIGHT CLICK/OFF/HOVER - highlight all samples that belong to a single population. Set to HOVER to ignore clicks. Set to OFF to disable this feature and to remove highlight triggered by hover (highlights triggered by clicks will stay until they will be cleared or removed by a click). Click white dot to cycle trough available highlight colors.

Plotly buttons:

Default download is set to 1600x1200px PNG.

Custom Plotly buttons:
"Toggle projection: orthographic / perspective" - self-explanatory.
"Toggle background color" - cycles trough dark grey, black, white and light grey. Text color and white highlight will be switched to black when background will be set to white or light grey.
"Toggle color scheme" - cycles trough several gradients.
"Reverse color scheme" - reverses all gradients permanently.
"Download plot as png (custom size)" - default size is the size of the currently displayed plot.

See also...

New Global25 interpretation tools

Tuesday, August 11, 2020

Villabruna people existed in Europe at least 17,000 years ago (Bortolini et al. 2020 preprint)


Over at bioRxiv at this LINK. So, like I said here a few years back, there was no migration into Europe from the Near East ~14,000 years ago. I don't think there was even such a migration ~17,000 years ago. My view is that the so-called Villabruna cluster formed somewhere in Europe at least 20,000 years ago. Below is the Bortolini et al. abstract, emphasis is mine:

The end of the Last Glacial Maximum (LGM) in Europe (~16.5 ka ago) set in motion major changes in human culture and population structure. In Southern Europe, Early Epigravettian material culture was replaced by Late Epigravettian art and technology about 18-17 ka ago at the beginning of southern Alpine deglaciation, although available genetic evidence from individuals who lived ~14 ka ago opened up questions on the impact of migrations on this cultural transition only after that date. Here we generate new genomic data from a human mandible uncovered at the Late Epigravettian site of Riparo Tagliente (Veneto, Italy), that we directly dated to 16,980-16,510 cal BP (2σ). This individual, affected by a low-prevalence dental pathology named focal osseous dysplasia, attests that the very emergence of Late Epigravettian material culture in Italy was already associated with migration and genetic replacement of the Gravettian-related ancestry. In doing so, we push back by at least 3,000 years the date of the diffusion in Southern Europe of a genetic component linked to Balkan/Anatolian refugia, previously believed to have spread during the later Bolling/Allerod warming event (~14 ka ago). Our results suggest that demic diffusion from a genetically diverse population may have substantially contributed to cultural changes in LGM and post-LGM Southern Europe, independently from abrupt shifts to warmer and more favourable conditions.

Bortolini et al., Early Alpine human occupation backdates westward human migration in Late Glacial Europe, bioRxiv, posted August 10, 2020, doi: https://doi.org/10.1101/2020.08.10.241430

See also...

Villabruna cluster =/= Near Eastern migrants

Monday, July 27, 2020

Ancient ancestry proportions in present-day Europeans (to be continued)


This year has already been massive in all sorts of ways, including for new data and software releases. So I'm thinking it might be time to update many of the analyses that were featured at this blog a while ago.

Let's start with the classic hunter vs farmer vs herder mixture model for present-day European populations. The rules of the game are as follows:


- run the latest version of qpAdm using qpfstats output

- use transversion sites and 1240K capture data

- pick a set of diverse and chronologically sound outgroups

- for a model to be successful the p-value must reach 0.01

- tweak the left pops in models that are clearly underperforming

- follow high end scientific literature, logic and common sense


Obviously, the reason that I decided to limit my analysis to markers from transversion sites is to mitigate problems associated with modeling the ancestry of modern, high quality samples with relatively low quality ancients. One of these problems appears to be qpAdm assigning faux East Asian/Siberian admixture to present-day Europeans (for instance, see figure 4 here).
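
For reference, transversions are substitutions between a purine (A, G) and a pyrimidine (C, T). The remaining substitutions, A<->G and C<->T, are transitions, which are the ones most affected by the characteristic damage in ancient DNA. Here's a minimal sketch of such a filter, assuming the SNPs are available as (id, ref, alt) tuples. The IDs below are made up and `keep_transversions` is a hypothetical helper, not part of ADMIXTOOLS.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def is_transversion(ref, alt):
    """True if ref -> alt swaps a purine for a pyrimidine or vice versa."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError(f"unexpected allele pair: {ref}/{alt}")
    return (ref in PURINES) != (alt in PURINES)

def keep_transversions(snps):
    """Filter (snp_id, ref, alt) tuples down to transversion sites."""
    return [snp for snp in snps if is_transversion(snp[1], snp[2])]

# Hypothetical marker list for illustration only.
snps = [
    ("toy_snp_1", "A", "G"),  # transition
    ("toy_snp_2", "A", "C"),  # transversion
    ("toy_snp_3", "C", "T"),  # transition
    ("toy_snp_4", "G", "T"),  # transversion
]
kept = keep_transversions(snps)
print(f"kept {len(kept)} of {len(snps)} markers:", [s[0] for s in kept])
```

A filter like this discards the transitions, which usually make up the majority of sites in a SNP panel, so the number of usable markers drops sharply.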

My starting reference populations and outgroups are listed below. In qpAdm terminology the former are known as the "left pops", while the latter as the "right pops". Most of these samples are freely available at the David Reich Lab website here.

left pops:
HUN_Koros_N_HG
TUR_Barcin_N
UKR_Yamnaya

right pops:
CMR_Shum_Laka_8000BP
MAR_Taforalt
Levant_Natufian
IRN_Ganj_Dareh_N
Levant_PPNB
CZE_Vestonice16
BEL_GoyetQ116-1
Iberia_ElMiron
RUS_Karelia_HG
RUS_West_Siberia_HG
MNG_North_N
RUS_Ust_Kyakhta

As you can see, I picked a wide variety of right pops. But I chose most of them specifically to be able to differentiate the three streams of ancestry - from ancient hunters, farmers and herders - that are the focus of my analysis. I also intentionally avoided using samples in the right pops that may have experienced gene flow, including cryptic gene flow, from the populations in the left pops.

I somewhat speculatively earmarked HUN_Koros_N_HG, from the Early Neolithic Carpathian Basin, and UKR_Yamnaya, from the Early Bronze Age North Pontic steppe in what is now Ukraine, to represent the hunter-gatherer and pastoralist streams of ancestry, respectively.

That's because I expected HUN_Koros_N_HG to be the best proxy for the hunter-gatherer ancestry that was initially absorbed by the early farmers who fanned out from the Aegean region across much of the European continent, and of course it made sense to choose a steppe pastoralist population that was located close to Central Europe where such groups first made the biggest impact outside of the steppe.

Interestingly, HUN_Koros_N_HG and UKR_Yamnaya did prove to be among the most effective choices for the types of ancestries that they represented. For instance, UKR_Yamnaya generally produced much stronger statistical fits than a very similar set of Yamnaya samples from the Caspian steppe (more precisely, from the Samara region in Russia). However, this might well be an artifact, due to very specific characteristics of these few ancient individuals. Larger sample sets would be welcome, especially from Yamnaya sites in Ukraine.

Below, dear audience, is a spreadsheet featuring the preliminary results. Click on the image to view and/or download the spreadsheet. The general rule is that the higher the tail prob, or p-value, the more likely it is that the ancestry proportions are close to the truth (a tail prob of well below 0.05 is usually a strong indication that something isn't right). For a detailed look at each of the qpAdm runs, feel free to consult the zip file here.


Note, however, that many of the European groups in my burgeoning genotype dataset are yet to make an appearance in the spreadsheet. That's because their models with the standard left pops showed p-values well under 0.01, which essentially meant that they failed, and I'm still trying to make them work.
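
The pass/fail rule used in this post can be written down in a few lines. Here's a minimal sketch, assuming the two cut-offs mentioned above: 0.01 as the minimum for a model to count as successful, and 0.05 as the level below which a fit deserves suspicion. The model names, labels and tail probs are illustrative only.

```python
def classify_fit(tail_prob, fail_below=0.01, caution_below=0.05):
    """Label a model by its tail probability using two cut-offs."""
    if not 0.0 <= tail_prob <= 1.0:
        raise ValueError("tail probability must lie between 0 and 1")
    if tail_prob < fail_below:
        return "fail"
    if tail_prob < caution_below:
        return "caution"
    return "pass"

# Illustrative tail probabilities for three hypothetical models.
examples = {"model_A": 0.30, "model_B": 0.011, "model_C": 0.0076}
for name, p in examples.items():
    print(name, p, classify_fit(p))
```

A "pass" here only means that the model isn't rejected by the data. It doesn't prove that the chosen sources are the true ones.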

But round one has certainly revealed some fascinating stuff. For instance, except for Hungarians and Estonians, none of the Uralic-speaking groups can be modeled successfully in the standard three-way model.

However, I managed to significantly improve the statistical fits in their models by adding a Siberian population, RUS_Baikal_BA, to the left pops. This is unlikely to be a coincidence, because the Proto-Uralic homeland was almost certainly located in or very near Siberia. Iain Mathieson please take note.

Saami
HUN_Koros_N_HG 0.134±0.043
RUS_Baikal_BA 0.270±0.015
TUR_Barcin_N 0.081±0.026
UKR_Yamnaya 0.515±0.058
chisq 19.865
tail prob 0.0108571


Tuesday, July 21, 2020

The oldest R1a to date


My popular map of the oldest instances of Y-haplogroup R1a in the ancient DNA record has a new entry: PES001 from the recent Saag et al. preprint. PES001 comes from a burial site in what is now northwestern Russia and is dated to a whopping 10785–10626 calBCE.


Indeed, I'm not aware of any R1a samples older than PES001 among the treasure trove of thousands of ancient samples waiting to be published. So it's likely that this individual will remain the oldest member of our R1a clan for some years to come.

See also...

Y-haplogroup R1a and mental health

Like three peas in a pod

The mystery of the Sintashta people

Tuesday, July 14, 2020

First taste of Early Medieval DNA from the Ural region (Csaky et al. 2020 preprint)


Over at bioRxiv at this LINK. From the preprint:

The ancient Hungarians originated from the Ural region of Russia, and migrated through the Middle-Volga region and the Eastern European steppe into the Carpathian Basin during the 9th century AD. Their Homeland was probably in the southern Trans-Ural region, where the Kushnarenkovo culture disseminated. In the Cis-Ural region Lomovatovo and Nevolino cultures are archaeologically related to ancient Hungarians. In this study we describe maternal and paternal lineages of 36 individuals from these regions and nine Hungarian Conquest period individuals from today's Hungary, as well as shallow shotgun genome data from the Trans-Uralic Uyelgi cemetery. We point out the genetic continuity between the three chronological horizons of Uyelgi cemetery, which was a burial place of a rather endogamous population. Using phylogenetic and population genetic analyses we demonstrate the genetic connection between Trans-, Cis-Ural and the Carpathian Basin on various levels. The analyses of this new Uralic dataset fill a gap of population genetic research of Eurasia, and reshape the conclusions previously drawn from 10-11th century ancient mitogenomes and Y-chromosomes from Hungary.

...

Majority of Uyelgi males belonged to Y chromosome haplogroup N, and according to combined STR, SNP and Network analyses they belong to the same subclade within N-M46 (also known as N-tat and N1a1-M46 in ISOGG 14.255). N-M46 nowadays is a geographically widely distributed paternal lineage from East of Siberia to Scandinavia [33]. One of its subclades is N-Z1936 (also known as N3a4 and N1a1a1a1a2 in ISOGG 14.255), which is prominent among Uralic speaking populations, probably originated from the Ural region as well and mainly distributed from the West of Ural Mountains to Scandinavia (Finland). Seven samples of Uyelgi site most probably belong to N-Y24365 (also known as N-B545 and N1a1a1a1a2a1c2 in ISOGG 14.255) under N-Z1936, a specific subclade that can be found almost exclusively in todays’ Tatarstan, Bashkortostan and Hungary [17] (ISOGG, Yfull).




Csaky et al., Early Medieval Genetic Data from Ural Region Evaluated in the Light of Archaeological Evidence of Ancient Hungarians, bioRxiv, Posted July 13, 2020, doi: https://doi.org/10.1101/2020.07.13.200154

See also...

Hungarian Conquerors were rich in Y-haplogroup N

On the association between Uralic expansions and Y-haplogroup N

More on the association between Uralic expansions and Y-haplogroup N

Ancient DNA confirms the link between Y-haplogroup N and Uralic expansions

Monday, July 13, 2020

Don't believe everything you read in peer reviewed papers


Case in point, here's a quote from a recent paper at the Journal of Human Genetics (emphasis is mine):

The Mordovian and Csango samples have a moderate to slight orientation toward the Central-Asian and Siberian Turkic groups. This could suggest the more significant East Eurasian or Turkic ancestry of these populations, which should be further investigated. German samples are inhomogeneous, and some of the German samples also show this tendency, which can be the result of the recent 20th century Turkish immigration into Germany [42].

Nope, these German samples don't show anything even remotely resembling recent Turkish ancestry. The authors of the paper, Ádám, V., Bánfai, Z., Maász, A. et al., should've been able to figure this out, even with the standard analyses that they ran. Failing that, the peer reviewers at the Journal of Human Genetics should've noticed that the authors were confused.

Moreover, if the authors and peer reviewers had actually bothered to take a closer look at the metadata for these samples, which were sourced from the Estonian Biocentre, they'd have seen that they're not even from Germany. In fact, they represent self-reported ethnic Germans from Russia.

My own quick and dirty analysis of these individuals suggests that many of them harbor East Slavic and/or Volga Finnic ancestries. Indeed, only some of them can pass genetically for run-of-the-mill Germans from Germany. The Principal Component Analysis (PCA) below is self-explanatory. It was plotted with the Vahaduo Custom PCA tools freely available here. The relevant PCA datasheet is available here.


That's not to say, of course, that some Germans don't have recent Turkish ancestry, because an increasing number of Germans nowadays do, nor that people with German heritage in Russia shouldn't identify as Germans, because that's entirely their choice.

This blog post isn't about what it takes to be German, and this is not something that I ever want to discuss for obvious reasons. The point I'm making here is that the authors and peer reviewers of the said paper at the Journal of Human Genetics were sloppy and half-arsed in their approach. And, sadly, this isn't an isolated case in peer reviewed scientific literature dealing with human population genetics.

I feel that the Estonian Biocentre is also partly to blame for this cock up, due to its somewhat peculiar sampling and labelling strategies. For instance, its scientists rely solely on self-reported identity to establish the ethnic origins of their samples, and they apparently never remove genetic outliers from their datasets or even try to identify them.

Unfortunately, I fear that this relaxed approach will eventually lead to basic errors and even unusual conclusions in a number of so-called peer reviewed papers.

I first raised this issue with the Estonian Biocentre about five years ago, when I noticed that some of the supposedly Polish individuals in its dataset were genetically more similar to various groups from northern Russia than to Poles from Poland. These individuals also showed significant Siberian ancestry, which was very unusual indeed. Where the hell did the Estonian Biocentre find Poles who resembled people from near the Arctic Circle, you might ask? Apparently in Estonia.

OK, I can imagine that sampling ethnic Poles from Estonia may have been easier for the Estonian Biocentre than sampling Poles from Poland. And Estonian Poles certainly make for interesting and useful data points. However, as you can see in the PCA below, some of these individuals (labeled Polish_Estonia by me) aren't representative of the native Polish population, and yet the Estonian Biocentre not only lumps them with their Poles from Poland, but even labels them with the word "Poland". The relevant PCA datasheet can be gotten here.


However, based on my communications with some of the scientists at the Estonian Biocentre, including head honcho Mait Metspalu, it seems that nothing will ever change there with regard to this issue. Who knows, perhaps some day we'll see a paper based on Estonian Biocentre data in the Journal of Human Genetics claiming that Poles originated near the Arctic Circle? I wouldn't be shocked if that actually happened.

Citation...

Ádám, V., Bánfai, Z., Maász, A. et al. Investigating the genetic characteristics of the Csangos, a traditionally Hungarian speaking ethnic group residing in Romania. J Hum Genet (2020). https://doi.org/10.1038/s10038-020-0799-6

See also...

Like three peas in a pod

Tuesday, July 7, 2020

On the exotic origins of the Hungarian Arpad Dynasty (Nagy et al. 2020)


Hungarians speak a Uralic and Finno-Ugric language. However, the founders of the Medieval Hungarian state, the Arpad Dynasty, probably had Irano-Turkic paternal origins. There's a very interesting new paper on this topic at the European Journal of Human Genetics (see here). From the paper, emphasis is mine:

The phylogenetic origins of the Hungarians who occupied the Carpathian basin has been much contested [40]. Based on linguistic arguments it was proposed that they represented a predominantly Finno-Ugric speaking population while the oral and written tradition of the Árpád dynasty suggests a relationship with the Huns. Based on the genetic analysis of two members of the Árpád Dynasty, it appears that they derived from a lineage (R-Z2125) that is currently predominantly present among ethnic groups (Pashtun, Tadjik, Turkmen, Uzbek, and Bashkir) speaking Iranian or Turkic languages. However, their closest kin, the Bashkirs live in close proximity with Finno-Ugric speaking populations with the N-B539 haplogroup. A recent study shows that this haplogroup is also found in modern Hungarians [41]. Intriguingly, the most recent separation of the N-B539 derived lineages found in Hungarians and Bashkirs is estimated to have occurred ~2000 years before present [42]. This would suggest that a group of people consisting of a Turkic (R-SUR51) component and a Finno-Ugric (N-B539) component left the Volga Ural region about 2000 years ago and started a migration that eventually culminated in settlement in the Carpathian Basin.

Citation...

Nagy, P.L., Olasz, J., Neparáczki, E. et al. Determination of the phylogenetic origins of the Árpád Dynasty based on Y chromosome sequencing of Béla the Third. Eur J Hum Genet (2020). https://doi.org/10.1038/s41431-020-0683-z

See also...

Hungarian Conquerors were rich in Y-haplogroup N

On the association between Uralic expansions and Y-haplogroup N

More on the association between Uralic expansions and Y-haplogroup N

Ancient DNA confirms the link between Y-haplogroup N and Uralic expansions

Saturday, July 4, 2020

Fatyanovo males were rich in Y-haplogroup R1a-Z93 (Saag et al. 2020 preprint)


I'd say that thanks to this preprint we're now a lot closer to solving the mystery of the Sintashta people. Over at bioRxiv at this LINK. From the preprint:

Transition from the Stone to the Bronze Age in Central and Western Europe was a period of major population movements originating from the Ponto-Caspian Steppe. Here, we report new genome-wide sequence data from 28 individuals from the territory north of this source area - from the under-studied Western part of present-day Russia, including Stone Age hunter-gatherers (10,800-4,250 cal BC) and Bronze Age farmers from the Corded Ware complex called Fatyanovo Culture (2,900-2,050 cal BC). We show that Eastern hunter-gatherer ancestry was present in Northwestern Russia already from around 10,000 BC. Furthermore, we see a clear change in ancestry with the arrival of farming - the Fatyanovo Culture individuals were genetically similar to other Corded Ware cultures, carrying a mixture of Steppe and European early farmer ancestry and thus likely originating from a fast migration towards the northeast from somewhere in the vicinity of modern-day Ukraine, which is the closest area where these ancestries coexisted from around 3,000 BC.

...

Interestingly, in all individuals for which the chrY hg could be determined with more depth (n=6), it was R1a2-Z93 (Table 1, Supplementary Data 2), a lineage now spread in Central and South Asia, rather than the R1a1-Z283 lineage that is common in Europe [38,39].


Saag et al., Genetic ancestry changes in Stone to Bronze Age transition in the East European plain, BioRxiv, Posted July 03, 2020, doi: https://doi.org/10.1101/2020.07.02.184507

See also...

Like three peas in a pod

Tuesday, June 30, 2020

The precursor of the Trojans


Who remembers kum4 from Omrak et al. 2016? I'm pretty sure now that this individual packs a lot of ancestry from the Pontic-Caspian (PC) steppe.

If so, that's a big deal, because her Chalcolithic (or Late Neolithic?) burial was located at Kumtepe. That is, in the same part of Anatolia as the later settlement of Troy, which may have been founded by early Anatolian speakers from Eastern Europe (see here).

The qpAdm mixture models below, featuring kum4 and the likely older kum6, also from Kumtepe, are based on qpfstats output. qpfstats is a new program from the David Reich Lab specifically designed to help analyze low coverage ancients (see here). And kum4 is certainly that.

TUR_Kumtepe_N_kum4
RUS_Progress_En 0.383±0.114
TUR_Barcin_N 0.617±0.114
chisq 7.868
tail prob 0.247957
Full output

TUR_Kumtepe_N_kum4
IRN_Seh_Gabi_C 0.325±0.150
TUR_Barcin_N 0.675±0.150
chisq 14.736
tail prob 0.0224096
Full output

TUR_Kumtepe_N_kum6
RUS_Progress_En 0.121±0.042
TUR_Barcin_N 0.879±0.042
chisq 21.790
tail prob 0.00132149
Full output

TUR_Kumtepe_N_kum6
IRN_Seh_Gabi_C 0.283±0.059
TUR_Barcin_N 0.717±0.059
chisq 6.289
tail prob 0.391566
Full output

Indeed, kum4 and kum6 offer just ~10,000 and ~100,000 "valid SNPs", respectively (see here). However, if nothing else, the results are clearly not random.

For one, because they fit the expected pattern, with the likely older individual lacking ancestry from the PC steppe (her model with RUS_Progress_En shows a weak statistical fit). Moreover, the qpAdm mixture ratios align almost perfectly with the results in my Principal Component Analysis (PCA) of ancient West Eurasian genetic variation. Coincidence?
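
As a rough check on the fit statistics quoted above, the tail probabilities can be recomputed from the chisq values. The sketch below assumes six degrees of freedom for each two-source model. That figure is inferred from the numbers themselves, because it reproduces the reported tail probabilities, and isn't read from the qpAdm output. The function name chi2_tail is hypothetical, and the closed form it uses only works for an even number of degrees of freedom.

```python
import math


def chi2_tail(chisq, dof):
    """Upper-tail probability of a chi-square statistic (even dof only)."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive, even dof")
    half = chisq / 2.0
    # For even dof: P(X > x) = exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)^k / k!
    total = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * total


# chisq values copied from the four Kumtepe models above
models = {
    "kum4 with RUS_Progress_En": 7.868,
    "kum4 with IRN_Seh_Gabi_C": 14.736,
    "kum6 with RUS_Progress_En": 21.790,
    "kum6 with IRN_Seh_Gabi_C": 6.289,
}

DOF = 6  # assumption: inferred from the reported tail probabilities

for name, chisq in models.items():
    print(f"{name}: tail prob ~ {chi2_tail(chisq, DOF):.4f}")
```

Under that assumption the printed values land close to the tail probabilities listed with each model, so the weak fit for kum6 with RUS_Progress_En follows directly from its much larger chisq.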

See also...

Perhaps a hint of things to come

Saturday, June 27, 2020

Major updates to ADMIXTOOLS


An important message from Nick Patterson:

Dear Eurogenes bloggers,

Many of you use ADMIXTOOLS and you might like to know that there is a new release on github [LINK] with some important enhancements.

From the README

*** NEW ***

1)

Version 7.0 has numerous upgrades.

a) Two new executables --qpfstats qpfmv allow precomputation of f-statistic basis. This can greatly reduce computation costs.
b) qpAdm, qpWave, qpGraph support qpfstats output as input.
*** This is a much improved way of running with allsnps: YES. ***
c) A new experimental feature of qpGraph (halfscore: YES) allows comparison of 2 phylogenies + a (weak) goodness of fit score. Be careful if running with a large number of populations and consider reducing block size say blgsize: .005

2)

Note that several of the new ideas implemented in version 7.0 were developed collaboratively with Robert Maier, who has implemented them along with the great majority of other ADMIXTOOLS functionality in R: See https://github.com/uqrmaie1/admixtools
Executables run fast, and it has features not available in this C version, such as interactive exploration of graph phylogenies.
A manuscript describing the algorithmic ideas and providing documentation of the methods is in preparation.

qpfstats is the most important new executable. This estimates f-statistics and covariance on a basis.

a) This can be passed into other programs of the package without having to reaccess the genotype files, greatly speeding the computations.
b) In allsnps: YES mode a new computation is carried out (explained in qpfs.pdf) that is much more logical when there is a lot of missing data. Sometimes standard errors are greatly reduced.
qpfstats can be used with up to 30 populations. Much beyond that the output files become large.

As usual there may be bugs...

Nick Patterson 6/27/2020

Update 29/06/2020: As pointed out above, qpfstats is the most important new executable. Indeed, Nick Patterson now recommends that qpAdm analyses run with the allsnps: YES flag should be based on qpfstats output.

Several of my recent blog posts featured qpAdm models run with the allsnps: YES flag, but they were based on genotype data because obviously I didn't know anything about qpfstats at the time.

So I went back and ran some of these models again, just to make sure that they were still relevant. Below are three examples which you can compare to the original analyses here, here and here, respectively.

TUR_Arslantepe_LC_Maykop
RUS_Maykop_Novosvobodnaya 0.281±0.042
TUR_Arslantepe_LC 0.719±0.042
chisq 10.923
tail prob 0.449752
Full output

TUR_Barcin_C
RUS_Vonyuchka_En 0.137±0.031
TUR_Buyukkaya_EC 0.863±0.031
chisq 15.074
tail prob 0.0889099
Full output

UKR_N_admixed
RUS_Progress_En 0.083±0.020
UKR_N 0.917±0.020
chisq 6.825
tail prob 0.65538
Full output

As far as I can tell, they're very similar to the original runs, which is a relief, because it means that the conclusions in my blog posts still make sense.
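
One rough way to put a number on "very similar" is to divide the change in each mixture proportion by the combined standard error of the two runs. The sketch below does this for the two models whose original estimates are quoted in the June 9 and June 2 posts further down this page. Treating the two runs as independent is a simplifying assumption, since they use the same samples, and the helper name shift_in_se is hypothetical.

```python
import math


def shift_in_se(original, original_se, rerun, rerun_se):
    """Absolute change between two estimates, in units of their combined SE."""
    combined_se = math.sqrt(original_se ** 2 + rerun_se ** 2)
    return abs(rerun - original) / combined_se


# (original, SE, qpfstats-based rerun, SE) for the minor ancestry source
runs = {
    "TUR_Arslantepe_LC_Maykop / RUS_Maykop_Novosvobodnaya": (0.318, 0.041, 0.281, 0.042),
    "TUR_Barcin_C / RUS_Vonyuchka_En": (0.107, 0.029, 0.137, 0.031),
}

for label, values in runs.items():
    print(f"{label}: shift of {shift_in_se(*values):.2f} combined SE")
```

Both shifts come out below one combined standard error, which is in line with the reruns being very similar to the originals.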

Wednesday, June 24, 2020

Armenian Highland population prehistory


A new preprint at bioRxiv claims that some sort of large-scale population movement resulted in the spread of Sardinian-like ancestry into both the Armenian Highland and East Africa during or just after the Middle-Late Bronze Age. See Hovhannisyan et al. here.

In all seriousness, my suggestion is that the authors should familiarize themselves with the scientific concept of the sanity check and then try again.

For what it's worth, here's a brief outline of the population history of the Armenian Highland based on what I've learned about the topic from ancient DNA in recent years:

- overall, the Neolithic populations of the Armenian Highland were surely very similar to the Caucasus_lowlands_LN samples from what is now Azerbaijan from the recent Skourtanioti et al. paper (see here)

- Chalcolithic era migrations from the Pontic-Caspian steppe and/or the North Caucasus introduced steppe ancestry to the Armenian Highland, bringing at least some of its populations closer genetically to those of Eastern Europe (a somewhat outdated but still useful blog post about this subject is found here)

- population expansions during the Early Bronze Age associated with the Kura-Araxes cultural phenomenon, which may have originated in what is now Armenia, resulted in a resurgence of indigenous Caucasus hunter-gatherer (CHG) ancestry across the Caucasus, as well as its spread to many other parts of West Asia (see here)

- another significant pulse of Eastern European admixture affected the Armenian Highland during the Middle-Late Bronze Age and Early Iron Age (see here)

- it's not yet completely clear what happened in the Armenian Highland during the Iron Age in terms of significant genetic shifts, due to the lack of ancient human samples from the region dating to this period, but it's still possible that the speakers of proto-Armenian arrived there from the Balkans at this time

- the present-day Armenian gene pool is the result of the processes described above, as well as later events, such as those associated with the Urartian and Ottoman Empires.

Indeed, it's probably not a coincidence that present-day Armenians cluster more or less between the prehistoric populations from the Armenian Highland and surrounds in the Principal Component Analysis (PCA) below.


To see a more detailed and interactive version of the plot, copy and paste the data from the text file here into the relevant field at the Vahaduo Global25 PCA Views here.

Citation...

Hovhannisyan et al., AN ADMIXTURE SIGNAL IN ARMENIANS AROUND THE END OF THE BRONZE AGE REVEALS WIDESPREAD POPULATION MOVEMENT ACROSS THE MIDDLE EAST, bioRxiv, Posted June 24, 2020, doi: https://doi.org/10.1101/2020.06.24.168781

See also...

Armenian confirmation bias

Perhaps a hint of things to come

Understanding the Eneolithic steppe

Tuesday, June 16, 2020

Like three peas in a pod


One of the most interesting questions still waiting to be answered by ancient DNA is where exactly did the ancestors of the present-day European and South Asian bearers of Y-haplogroup R1a part ways? Indeed, the answer to this question is likely to be informative about the place and time of the split between the Balto-Slavic and Indo-Iranian language families.

I was doing some reading today and discovered that the peoples associated with the Bronze Age Fatyanovo-Balanovo and Unetice archeological cultures shared strikingly similar metalwork, despite being separated by well over two thousand kilometers of forest and steppe. Apparently, this similarity is especially pronounced in the metalwork of the Unetice culture from what is now Slovakia (see Ancient Metallurgy in the USSR: The Early Metal Age, page 136).

S11953 is currently the only sample from Slovakia associated with the Unetice culture (Sirak et al. 2020). There are no Fatyanovo-Balanovo samples available yet. However, as far as I can tell, I0432 from Samara, Russia, should be a decent stand-in (Mathieson et al. 2015).

Of course, both S11953 and I0432 belong to Y-haplogroup R1a. Moreover, S11953 belongs to a typically Balto-Slavic subclade of R1a, while I0432 belongs to a closely related subclade that is dominant nowadays among the Indo-Iranian speakers of Asia.

S11953 is younger than I0432, but this doesn't necessarily mean that his ancestors arrived in East Central Europe from deep in Russia during the Bronze Age. Indeed, the opposite is more likely to be true. That is, I0432 is probably the recent descendant of migrants from somewhere near the North Carpathians, because he shows elevated European Neolithic farmer ancestry compared to earlier ancients from the Samara region (see here).

Below is a Principal Component Analysis (PCA) showing how S11953 and I0432 compare to each other in the context of ancient West Eurasian genetic variation. Obviously, they're sitting in the same part of the plot, which suggests that they harbor very similar ratios of ancient genetic components and probably share relatively recent ancestry. The relevant PCA datasheet is available here.


I've also highlighted myself, Davidski, on the plot. That's because I share the same Balto-Slavic-specific subclade of R1a with S11953 and, in terms of overall ancestry, I'm similar to both S11953 and I0432. Moreover, I'm a speaker of Polish, which is a Balto-Slavic language. What are the chances that we're dealing here with a remarkable string of coincidences? Indeed, was the North Carpathian region perhaps the homeland of the language ancestral to both Balto-Slavic and Indo-Iranian?

However, please note that there's nothing unusual or remarkable about my ancestry. The vast majority of people of Central, Eastern and Northern European origin - that is, mostly the speakers of Balto-Slavic, Germanic and Celtic languages - would also land in this part of the plot.

See also...

On the doorstep of India

Y-haplogroup R1a and mental health

The mystery of the Sintashta people

Saturday, June 13, 2020

The Abashevo axe did it (Mednikova et al. 2020)


Open access at the Journal of Imaging over at this LINK. From the paper, emphasis is mine:

A massive bronze battle axe from the Abashevo archaeological culture was studied using neutron tomography and manufacturing modeling from production molds. Detailed structural data were acquired to simulate and model possible injuries and wounds caused by this battle axe. We report the results of neutron tomography experiments on the bronze battle axe, as well as manufactured plastic and virtual models of the traumas obtained at different strike angles from this axe. The reconstructed 3D models of the battle axe, plastic imprint model, and real wound and trauma traces on the bones of the ancient peoples of the Abashevo archaeological culture were obtained. Skulls with traces of injuries originate from archaeological excavations of the Pepkino burial mound of the Abashevo culture in the Volga region. The reconstruction and identification of the injuries and type of weapon on the restored skulls were performed. The complementary use of 3D visualization methods allowed us to make some assumptions on the cause of death of the people of the Abashevo culture and possible intra-tribal conflict in this cultural society. The obtained structural and anthropological data can be used to develop new concepts and methods for the archaeology of conflict.

...

Human skeletal remains from excavations of the Pepkino burial mound bear many traumatic wounds on the skulls and postcranial bones (Figure 4). The primary hypothesis is that young men of the Abashevo culture fell at the hands of enemies, which were the representatives of another tribe or culture [14,16]. After their discovery in the XX century, the skulls of killed people of the Abashevo culture were restored using anthropological paste, including beeswax.

...

A simple explanation for obtaining such injuries is the conclusion that the victim stood face to face with their assaulter and tried to back away from the battle axe, but fell and received other lethal wounds. The superficial trauma by the battle axe as well as serious damage to a bone structure and deep cracks in the skull are visible in the upper part of the model.

...

The comparison of the real bronze axe with the model obtained from molds indicates their complete identity and the belonging of these axes from different archaeological sites of the Abashevo culture to the same cultural group. This conclusion may indicate intra-cultural conflict among the Abashevo people. As a final note, the presented results of quite diverse imaging methods indicate a new direction in the archaeology of conflicts and the applicability of 3D modeling methods to identify both weapons technologies and the specifics of the use of these weapons to injure humans.



Citation...

Mednikova et al., The Reconstruction of a Bronze Battle Axe and Comparison of Inflicted Damage Injuries Using Neutron Tomography, Manufacturing Modeling, and X-ray Microtomography Data, J. Imaging 2020, 6(6), 45; https://doi.org/10.3390/jimaging6060045


Tuesday, June 9, 2020

Maykop ancestry in Copper Age Arslantepe


At least four individuals from the Late Chalcolithic (LC) burial site of Arslantepe show ancestry typical of the population associated with the contemporaneous Maykop culture in the North Caucasus. They are ART018, ART020, ART027 and ART039 from the recent Skourtanioti et al. paper. I've labeled them TUR_Arslantepe_LC_Maykop in my qpAdm mixture model below:

TUR_Arslantepe_LC_Maykop
RUS_Maykop_Novosvobodnaya 0.318±0.041
TUR_Arslantepe_LC 0.682±0.041
chisq 9.969
tail prob 0.533159
Full output

Considering the tight statistical fit, I think it's even possible that some of these people harbor direct ancestry from Maykop Novosvobodnaya. Here's a Principal Component Analysis (PCA) showing why my qpAdm model works so well. It was produced with the data in the text file here and the Vahaduo PCA tools here.



Moreover, one of the Arslantepe males, ART038, belongs to Y-haplogroup R1b-V1636 (R1b1a2). This is clearly a marker of paternal steppe ancestry, because it's been reported in two Eneolithic samples from the southernmost part of the Pontic-Caspian steppe near the North Caucasus foothills (see here). These individuals are dated to ~4,200 calBCE, so they lived about a thousand years earlier than ART038.

ART038 probably lacks steppe and Maykop-related ancestries on his autosomes. Nevertheless, my point about his Y-haplogroup stands, because autosomal admixture can be bred out and disappear completely within a couple hundred years, or about 6 to 8 generations.
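
To illustrate how quickly that can happen, here's a back-of-the-envelope sketch. It assumes a single fully steppe-derived male ancestor whose descendants in every generation have children with locals lacking steppe ancestry, so the expected autosomal share halves each generation while the Y-chromosome passes unchanged from father to son. The function name and the simple halving model are illustrative assumptions, not something taken from the paper.

```python
def expected_autosomal_share(generations, initial_share=1.0):
    """Expected autosomal ancestry share left after repeated outbreeding."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    # Each child inherits, on average, half of each parent's autosomes
    return initial_share * 0.5 ** generations


for gen in (6, 7, 8):
    share = expected_autosomal_share(gen)
    print(f"after {gen} generations: ~{share * 100:.2f}% expected")
```

On these assumptions the expected share falls below 2% after six generations and below 0.5% after eight, which is low enough to be very hard to detect in genome-wide data.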

Interestingly, Skourtanioti et al. argued against the possibility of significant steppe and Maykop-related ancestries in the Arslantepe LC samples. They also didn't see R1b-V1636 as an obvious signal of paternal steppe ancestry. I find this very puzzling indeed, because to me it seems way off the mark. From the paper:

However, R1b-V1636 and R1b-Z2103 lineages split long before (~17 kya) and therefore there is no direct evidence for an early incursion from the Pontic steppe during the main era of Arslantepe. Lineage L2-L595 found in ALA084 (Alalakh) has previously been reported in one individual from Chalcolithic Northern Iran (Narasimhan et al., 2019) and in three males from the Late Maykop phase in the North Caucasus (Wang et al., 2019). These three share ancestry from the common Anatolian/Iranian ancestry cline described here, which indicates a widespread distribution that also reached the southern margins of the steppe zone north of the Caucasus mountain range.

See also...

Perhaps a hint of things to come

An early Mitanni?

How relevant is Arslantepe to the PIE homeland debate?

Tuesday, June 2, 2020

Perhaps a hint of things to come


It's still a mystery how the Hittites and other Anatolian speakers ended up in the Near East. However, the leading theory is that their ancestors migrated from the steppes of Eastern Europe to western Anatolia via the Balkans sometime during the Copper Age.

Consider the qpAdm mixture models below, made possible thanks to some of the ancient samples published recently along with Skourtanioti et al. 2020. The key ancients are described in a text file available here.

TUR_Barcin_C
AZE_Caucasus_lowlands_LN 0.471±0.094
RUS_Vonyuchka_En 0.148±0.040
TUR_Barcin_N 0.381±0.069
chisq 12.874
tail prob 0.116261
Full output

TUR_Barcin_C
RUS_Vonyuchka_En 0.107±0.029
TUR_Buyukkaya_EC 0.893±0.029
chisq 12.107
tail prob 0.207331
Full output

I'd say it's quite clear now that TUR_Barcin_C harbors minor ancestry from the Pontic-Caspian (PC) steppe. The reason this isn't widely accepted yet is because demonstrating it convincingly hasn't been possible without a proximate Anatolian ancestry source for TUR_Barcin_C, precisely like TUR_Buyukkaya_EC.

Admittedly, though, the statistical fits in my models aren't all that great. I suspect the problem lies with RUS_Vonyuchka_En, which is likely to be a rather poor stand-in for the people who brought steppe ancestry, and possibly early Anatolian speech, to western Anatolia.

So let's see what happens when I try a more proximate reference for the steppe ancestry in TUR_Barcin_C. How about Yamnaya_BGR, an individual of mixed Balkan and steppe origin from what is now Bulgaria?

TUR_Barcin_C
AZE_Caucasus_lowlands_LN 0.518±0.075
TUR_Barcin_N 0.203±0.056
Yamnaya_BGR 0.279±0.067
chisq 10.602
tail prob 0.225269
Full output

TUR_Barcin_C
TUR_Buyukkaya_EC 0.749±0.058
Yamnaya_BGR 0.251±0.058
chisq 9.687
tail prob 0.376414
Full output

That's a little better. Unfortunately, the problem now is that the models are anachronistic, because TUR_Barcin_C is about a thousand years older than Yamnaya_BGR. Clearly, we need more Copper Age samples from the western edge of the PC steppe, the eastern Balkans, and especially northwestern Anatolia.

The Principal Component Analysis (PCA) below effectively illustrates why my qpAdm models work. It was produced with Global25 data using the Vahaduo PCA tools freely available here. Note that TUR_Barcin_C is shifted away from the essentially perfect cline formed by AZE_Caucasus_lowlands_LN, TUR_Barcin_N and TUR_Buyukkaya_EC towards samples from ancient Eastern Europe, including Yamnaya_BGR.


See also...

Steppe invaders in the Bronze Age Balkans

Thursday, May 28, 2020

An early Mitanni?


I've updated my Global25 datasheets with most of the ancients from the new Skourtanioti et al. paper. Here's a Principal Component Analysis (PCA) based on the data. It was produced with the Vahaduo PCA tools freely available here and the text file here.


Note that one of the Bronze Age females from Alalakh, labeled ALA019, appears to have ancestry from Turan and the Eurasian steppe. She may well have been a Mitanni of Indo-Aryan origin.

Interestingly, a Copper Age male from Arslantepe, ART038, belongs to Y-haplogroup R1b1a2 aka R1b-V1636. This is an unusual find, because R1b hasn't yet been reported in any Copper Age or earlier samples from outside of Europe and the Eurasian steppe.

As far as I can tell, this individual doesn't harbor any genome-wide ancestry from north of the Caucasus. However, R1b-V1636 is a rare lineage that is first attested in Eneolithic samples from the North Caucasus Piedmont steppe, so ART038's Y-chromosome might be the first evidence of the presence of steppe ancestry in Copper Age Anatolia.

I've also added most of the ancients from the new Agranat-Tamir et al. paper to the Global25 datasheets. The PCA below is based on the text file available here.


The Megiddo samples include a trio of interesting outliers dated to 1600-1500 BCE with significant ancestry from the steppe. One of these individuals is a male, I2189, who belongs to Y-haplogroup R and probably R1a. So he might also be of Indo-Aryan origin.

Another Megiddo male, S10768, belongs to R1b-M269 and probably shows a few per cent of steppe ancestry. I've already discussed how R1b and steppe ancestry may have ended up in the Bronze Age Near East in a couple of my previous posts:

R1b-M269 in the Bronze Age Levant

How did steppe ancestry spread into the Biblical-era Levant?

R-V1636: Eneolithic steppe > Kura-Araxes?

Wednesday, May 27, 2020

Seven thousand years of French prehistory (Brunel et al. 2020)


Over at PNAS at this LINK. I'm not sure why one of the Bell Beakers, CBV95, is modeled as 100% Yamnaya-like in the paper. I've had a preliminary look at this individual and he appears to be very similar to most Corded Ware samples from Germany, with about 75% Yamnaya-related steppe ancestry. I'll revisit this issue when the authors' genotype data are released, apparently within the next few days. Here's the paper abstract:

Genomic studies conducted on ancient individuals across Europe have revealed how migrations have contributed to its present genetic landscape, but the territory of present-day France has yet to be connected to the broader European picture. We generated a large dataset comprising the complete mitochondrial genomes, Y-chromosome markers, and genotypes of a number of nuclear loci of interest of 243 individuals sampled across present-day France over a period spanning 7,000 y, complemented with a partially overlapping dataset of 58 low-coverage genomes. This panel provides a high-resolution transect of the dynamics of maternal and paternal lineages in France as well as of autosomal genotypes. Parental lineages and genomic data both revealed demographic patterns in France for the Neolithic and Bronze Age transitions consistent with neighboring regions, first with a migration wave of Anatolian farmers followed by varying degrees of admixture with autochthonous hunter-gatherers, and then substantial gene flow from individuals deriving part of their ancestry from the Pontic steppe at the onset of the Bronze Age. Our data have also highlighted the persistence of Magdalenian-associated ancestry in hunter-gatherer populations outside of Spain and thus provide arguments for an expansion of these populations at the end of the Paleolithic Period more northerly than what has been described so far. Finally, no major demographic changes were detected during the transition between the Bronze and Iron Ages.

Brunel et al., Ancient genomes from present-day France unveil 7,000 years of its demographic history, PNAS, first published May 26, 2020 https://doi.org/10.1073/pnas.1918034117

See also...

The Boscombe Bowmen