Wednesday, May 2, 2018

Open analysis thread: genetic distance (Fst) matrix focusing on ancient Central and South Asia


I'm hoping that we can learn something new about the genomic prehistory of Eurasia, and especially Central and South Asia, based on this massive new Fst matrix:

Ancient Central and South Asia genetic distance (Fst) matrix

Hint: it's probably easiest to initially explore this format with a program called PAST. Indeed, if you'd like to model fine scale ancestry proportions based on these data, it might be a good idea to use PAST to first turn the matrix into a principal coordinates (PCoA) datasheet (like this).
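
For anyone who would rather script this step than click through PAST, here is a minimal sketch of textbook principal coordinates analysis (classical scaling) on a square distance matrix. It is not a reproduction of PAST's settings, and the function names (`jacobi_eigh`, `pcoa`), the toy labels and the toy Fst values are all made up for illustration. Clipping negative Fst values to zero is also an assumption on my part.

```python
import math

def jacobi_eigh(matrix, sweeps=50, tol=1e-20):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, vectors); column k of vectors pairs with eigenvalues[k].
    """
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                sign = 1.0 if theta >= 0 else -1.0
                t = sign / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def pcoa(dist, n_axes=2):
    """Classical PCoA: double-centre -0.5 * d^2, then scale the top eigenvectors."""
    n = len(dist)
    # Assumption: negative Fst estimates are treated as zero distance.
    a = [[-0.5 * max(dist[i][j], 0.0) ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in a]
    grand = sum(row_mean) / n
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
         for i in range(n)]
    vals, vecs = jacobi_eigh(b)
    order = sorted(range(n), key=lambda k: vals[k], reverse=True)[:n_axes]
    coords = [[vecs[i][k] * math.sqrt(max(vals[k], 0.0)) for k in order]
              for i in range(n)]
    return coords, [vals[k] for k in order]

if __name__ == "__main__":
    # Invented labels and Fst values, for illustration only.
    labels = ["Pop_A", "Pop_B", "Pop_C", "Pop_D"]
    fst = [
        [0.000, 0.010, 0.030, 0.045],
        [0.010, 0.000, 0.025, 0.040],
        [0.030, 0.025, 0.000, 0.020],
        [0.045, 0.040, 0.020, 0.000],
    ]
    coords, eigenvalues = pcoa(fst, n_axes=2)
    for label, xy in zip(labels, coords):
        print(label, [round(x, 4) for x in xy])
```

The rows of `coords` can then be saved as a datasheet with one population per line. Note that axes with negative eigenvalues, which a non-Euclidean measure like Fst can produce, are simply flattened to zero here.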

On a related note, as I was typing this, commentator Chetan alerted me to a post at the Molgen forum claiming that Y-haplogroup R1b-L51 has turned up in Eneolithic remains from the Pontic-Caspian steppe (see here). If true, then it's a big deal, because it's the best evidence yet that L51 expanded into Central and Western Europe from the steppes. This is the Google translation of the post. Emphasis is mine.

Hello. Today, the XIV Samara Archeological Conference was held. The following reports were heard. Khokhlov AA Preliminary results of anthropological and genetic studies of materials of the Volga-Ural region of the Neolithic-Early Bronze Age by an international group of scientists. In his report, AA Khokhlov. introduced into scientific circulation until the unpublished data of the new Eneolithic burial ground Ekatirinovsky cape, which combines both the Mariupol and Khvalyn features, and refers to the fourth quarter of the V millennium BC. All samples analyzed had a uraloid anthropological type, the chromosome of all the samples belonged to the haplogroup R1b1a2 (R-P 312 / S 116), and the haplogroup R1b1a1a2a1a1c2b2b1a2. Mito to haplogroups U2, U4, U5. In the Khvalyn burial grounds (1 half of the 4th millennium BC), the anthropological material differs in a greater variety. In addition to the uraloid substratum, European broad-leaved and southern-European variants are recorded. To the game haplogroup R1a1, O1a1, I2a2 are added to mito T2a1b, H2a1.

I'd say that this information sounds legit. But let's wait and see if the results are backed up by one of the major ancient DNA labs in the west, like the Reich Lab.

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

47 comments:

Rob said...

If true that’s >1000 years older than current estimates (eg YFull)

The O1a is interesting. The Kazakh DNA team apparently found that in Botai

Dmytro said...

It's worth reading the entire available Molgen discussion. The notion of "Uraloid type" is apparently a private Khokhlov theory not quite mainstream. The L-51 is not questioned but the anthropological label is. Possibly some squabbling involved with respect to the Khokhlov intimation of IE origin in this "uraloid substratum" L-51? In house conflicts (:=)) Also: the date of the sample tested is corrected (not 4th quarter of the 5th, but 4th quarter of the 6th). And the O1a could be a Q1a. The hope is expressed that a more adequate genetic analysis be done (supports Davidski in effect).

Rob said...

Then the whole thing is a mess
It would be quite a thing for P312 to date to 5250 BC
Nor does the Khvalynsk culture date that early. Vybornov starts it at 4800 BC, and that’s even without considering reservoir effects

Richard Rocca said...

R1b1a1a2a1a1c2b2b1a2 is a very downstream marker under U106 and not P312. It has also been found in a Swat valley sample. It is very, very unlikely IMO that we are talking about P312 or U106 based on just that marker. More than likely this is an SNP that exists in more than one subclade. Finding R1b1a2 would not be a surprise given that M269 is also the parent of R-Z2103.

Davidski said...

I corrected an erroneous label in the Fst matrix and uploaded a new version. So please download it again.

Ebizur said...

Rob wrote,

"The O1a is interesting. The Kazakh DNA team apparently found that in Botai"

Are you referring to that website where it has been reported in Russian that a specimen from the Botai culture has been found to belong to haplogroup O1b (at the time known as O2)?

The MRCA of haplogroups O1b (formerly O2) and O1a is estimated by YFull to have lived 29900 [95% CI 27600 <-> 32200] ybp. YFull's estimate for the TMRCA of haplogroup Q is 28700 [95% CI 26400 <-> 30900] ybp and the estimate for the TMRCA of haplogroup R is 28200 [95% CI 25900 <-> 30500] ybp, so the divergence between O1a and O1b is of roughly the same magnitude as the divergence between R1 and R2 or between Q1 and Q2.

Anonymous said...

@David

Wildly offtopic, but could you do the following?

Mbuti Yamnaya_samara GoyetQ116-1 Kostenki14

AWood said...

Text suggests the burials have Mariupol and Khvalynsk features, so it probably is as old as they suggest here. I can't seem to locate where or what the "Ekatirinovsky cape" is though. If it is related to the Mariupol culture, it may suggest that these people interacted with the north Caucasus, ie: early Maykop, which might explain a correlation of P312+ with some G-P303 (whom I believe to be a marker of Maykop) in western-central Europe.

Rob said...

Oh it was O1b not O1a. Cool

Samuel Andrews said...

It's about time R1b L151 was found on the Steppe. I've grown really tired of bs attempts to deny the obvious from Maju and others.

People have complained about not enough Y DNA sampling of Neolithic-Chalcolithic France and Spain. Yet, basically, all our Steppe DNA comes from two sites from eastern Yamnaya. Considering the massive Y DNA founder effects in early IE populations it would be foolish to say those two Yamnaya sites represent all the Y DNA diversity of the ancient Steppe but people did this.

Samuel Andrews said...

The fourth quarter of the fifth millennium BC is 4250-4001 BC. The R1b1a1a2a1a1c2b2b1a2 label is obviously wrong. Makes me wonder if the P312 label is also wrong. For mtDNA results they only mention hunter gatherer haplogroups. Maybe these were hunter gatherers with a form of R1b (maybe M269) other than P312.
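
Since the thread goes back and forth between the 5th and 6th millennium, here is a tiny worked version of the date arithmetic. It assumes the usual convention that the Nth millennium BC runs from N000 BC down to (N-1)001 BC, with no year zero. The function name is made up for illustration.

```python
def millennium_quarter_bc(millennium, quarter):
    """Return (start_bc, end_bc) for one quarter of a millennium BC.

    Assumes the Nth millennium BC spans N*1000 BC down to (N-1)*1000 + 1 BC.
    """
    if millennium < 1 or quarter not in (1, 2, 3, 4):
        raise ValueError("millennium must be >= 1 and quarter must be 1..4")
    start = millennium * 1000 - (quarter - 1) * 250
    end = start - 249
    return start, end

if __name__ == "__main__":
    print(millennium_quarter_bc(5, 4))  # fourth quarter of the 5th millennium BC
    print(millennium_quarter_bc(6, 4))  # fourth quarter of the 6th millennium BC
```

So the fourth quarter of the 5th millennium BC comes out as 4250-4001 BC, while the fourth quarter of the 6th millennium BC would be 5250-5001 BC.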

Synome said...

Location of the site for reference, taken from the paper at the following link (https://cyberleninka.ru/article/v/ekaterinovskiy-mys-novyy-eneoliticheskiy-mogilnik-v-lesostepnom-povolzhie) and google translated:

"Cemetery Catherine Cape is located on the eastern outskirts of the village of Ekaterinovka, on the cape, formed at the confluence of the Bezenchuk river with Saratov Reservoir."

Yekaterinovka on Google maps. I assume this is the village in question: Yekaterinovka, Samara Oblast, Russia, 446232

https://goo.gl/maps/893qokaSQF52

Matt said...

Thanks for doing this!

Among ancients, there seem to be some slightly odd outcomes for Fst from:

- Lepenski Vir: Lots of negative Fst with recent people, even though they're quite distant in the general PCoA

- Petrovka_MLBA: Seems to get some odd drift dimension phenomena, again might be related to a few negatives

- Varna: Kind of seems to have a close relationship with Atlantic Neolithic samples (Wales_N) which seems strange.

- CWC_Baltic: Looks to have suspect low Fst to recent Europeans (only 0.001 to Belarusian?)

France_MLN and Beaker_The_Netherlands and Haji_Firuz_Chl also seem to have some long branches from neighbours that I'm not totally sure about. Could be real or an artefact.

One thing with these Fst PCoA is that there seem to be ancient->modern drift dimensions that emerge which are probably artefactual, and make mixing ancient and modern samples in particular a bit of a challenge.

Among moderns, Russian Central also looks to have some slightly aberrant distances at a quick glance - should be like Ukrainian_W or Belarusian or Russia_W but distances look much lower w/some unusual patterns? But could just be because Russian Central is more diverse than these others.

Few graphics using these data: https://imgur.com/a/U3aZFEQ

Finally, set of rank comparisons to moderns: https://imgur.com/a/PdacOBK
(Like to see what Sein can come up with using this.)

MaxT said...

@Synome

Any information on how he was buried? funerary objects?

MaxT said...

Nvm, thanks for the link.

Davidski said...

Thanks Matt.

It sounds like some of the sample sets are behaving strangely due to a lack of coverage. I'll have to take a closer look and remove the worst affected sets.

Open Genomes said...

What is R1b1a1a2a1a1c2b2b1a2? R1b1a1a2a is R-L23, R1b1a2a2a1a1 should be R-U106, so this essentially would be something "Germanic", perhaps Norse/Varangian. It doesn't make sense. Also, neither the Reich Lab Central Asian nor the Denmark Eurasian Steppe found any such thing (except for the R-U106 of unknown date and origin). On the ISOGG tree, R1b1a1a2a1a1c2b2b1a2 is R-S21728. This is R-S21728 on the YFull tree. R-S116 is R-P312.

Can someone notify them of this error?

Also, the researchers need to stop using the ISOGG tree and just report the terminal SNPs. There are issues with "private SNPs" in the mtDNA that may be aDNA damage that need to be resolved, but we don't really see downstream false positives on the Y, just some upstream false negatives.

Of course, if this is true, we really have to question their autosomal analysis.

And what the hell is "uraloid", when we know that skull shapes don't really reflect genetic ancestry? (In that case, the Scythians apparently would be all "uraloid" or whatever. Are Hungarians "uraloid"? The same as Nganasan?) These researchers have to start becoming a little more scientific. It's easy enough to do mtDNA and autosomal extraction. Notice that in the Eurasian Steppe study almost all the mtDNAs were what would be generally described as "Siberian" or "East Asian".

It's pretty disappointing.

Davidski said...

@All

I've removed these ancient samples from the Fst matrix because they were indeed behaving a little too erratically.

- Lepenski Vir

- Petrovka_MLBA

- Varna

Seinundzeit said...

David,

Good stuff! Thanks for trying this out.

Matt,

Yes sir; I'll be giving this a spin tomorrow night. I think it'll be fun.

Archaelog said...

This is a majority Russian team of researchers. It seems their terminologies and methodologies are different. However there's no doubt they meant P312. I suppose R1b1a1a2a1a1c2b2b1a2 is a deep clade of M73 (the Siberian sister of M269).

Aram said...

In Russian literature uraloid is a racial anthropological term unrelated to language and ethnicity.

Queequeg said...

@D: is there any possibility to have Satan_MLBA_Alakul, Krasnoyarsk_o and Mansi in the Fst-matrix, in order to shed some more light on the Uralic enigma? Many thanks anyways, for the nice work so far.

Samuel Andrews said...

Wonder, if R1b1a2 means V88, the type of R1b popular in Mesolithic Europe. The mtDNA types from the site are U5, U4, U2, there's no mention of Mid Eastern HGs (H, J, T, etc). These could be hunter gatherers with R1b V88.

Open Genomes said...

"Hunnic_Hungary:DA85"

Y-DNA: L-M27* mtDNA: U4 Gedmatch: Z710382

nMonte2, unscaled, no population averages:

[1] "distance%=0.3322 / distance=0.003322"

Hunnic_Hungary:DA85

Tubalar:S_Tubalar-2 19.70
Sintashta_MLBA:I0942 12.15
Afanasievo:I3387 9.30
Kyzlbulak_MLBA2:I4784 9.00
Oy_Dzhaylau_MLBA:I4791 8.85
Shor:shor123 7.70
Sintashta_MLBA:I0986 6.25
Beaker_Central_Europe:I6534 4.85
Uzbek:495_R02C01 4.35
Udegram_IA:I6901 4.15
Tepe_Hissar_ChL:I2921 3.95
Geoksiur_Eneolithic:S8532.E1.L1 3.65
Kalash:HGDP00279 3.15
Sintashta_MLBA_o1:I1017 1.80
Karagash_MLBA:I4262 1.15

This guy looks about as far as you can get from a Hun from Hungary.

He seems to be perhaps some kind of Iron Age Pazyryk Scythian from the Altai. His connections are basically modern day Altaian Turkic, but mostly Afanasievo and Sintashta. Notice he also partakes of Swat Valley Udegram IA, but also pre-Steppe Geoksiur Eneolithic.

What do you think?
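
For readers new to this kind of output: the percentages are weights on source samples whose weighted average lands as close as possible to the target in coordinate space, and "distance" is the leftover gap. The sketch below shows that idea with a simple random hill climb over 100 "slots". It is written in the spirit of such fits and is not the actual nMonte code; the function name `fit_mixture`, the source names and all coordinates are hypothetical.

```python
import math
import random

def fit_mixture(target, sources, slots=100, iterations=20000, seed=0):
    """Search for source weights whose average best matches the target.

    Each of `slots` positions holds one source; a random position is
    reassigned and the change is kept only if the distance shrinks.
    """
    rng = random.Random(seed)
    names = sorted(sources)
    dims = len(target)
    assign = [names[i % len(names)] for i in range(slots)]  # even start
    total = [sum(sources[name][d] for name in assign) for d in range(dims)]

    def distance(tot):
        return math.sqrt(sum((tot[d] / slots - target[d]) ** 2
                             for d in range(dims)))

    best = distance(total)
    for _ in range(iterations):
        i = rng.randrange(slots)
        new = rng.choice(names)
        old = assign[i]
        if new == old:
            continue
        trial = [total[d] - sources[old][d] + sources[new][d]
                 for d in range(dims)]
        trial_distance = distance(trial)
        if trial_distance < best:
            assign[i], total, best = new, trial, trial_distance
    weights = {name: assign.count(name) / slots for name in names}
    return weights, best

if __name__ == "__main__":
    # Hypothetical two-dimensional coordinates, for illustration only.
    sources = {
        "Source_1": [0.0, 0.0],
        "Source_2": [1.0, 0.0],
        "Source_3": [0.0, 1.0],
    }
    weights, dist = fit_mixture([0.3, 0.2], sources)
    for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{name} {100 * w:.2f}")
    print(f"distance={dist:.6f}")
```

A low distance only says the weighted average sits close to the target in the chosen coordinates. It does not by itself say the sources are real ancestors, which is the point being argued in this thread.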

Ric Hern said...

Samuel when did R1b V88 split from the rest? Did L51 split during the Mesolithic? Maybe a Crimea or North Caucasus refuge or somewhere North in the Forest Steppe or Southern Urals? Now I wonder if the ancestors of L51 were the tail end of R1b ancestors' migration to Europe from Siberia with Balkans and Villabruna being the head? Did they get caught and isolated by a Steppe dry spell that prevented further migration into the Balkans?

Davidski said...

@epoch

Mbuti Yamnaya_Samara GoyetQ116-1 Kostenki14 -0.0082 -1.719 733059
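
A small sketch of how a result line like this can be read programmatically. It assumes the three numbers after the four population labels are the D value, the Z-score and the number of SNPs used, and it uses the common |Z| >= 3 rule of thumb for significance. Both of those are assumptions for illustration, as are the names `DStat`, `parse_dstat` and `is_significant`.

```python
from typing import NamedTuple, Tuple

class DStat(NamedTuple):
    pops: Tuple[str, str, str, str]  # D(W, X; Y, Z) population labels
    d: float                         # D value
    z: float                         # Z-score
    n_snps: int                      # SNPs used (assumed meaning of last column)

def parse_dstat(line: str) -> DStat:
    """Parse 'W X Y Z D Zscore SNPs' from one whitespace-separated line."""
    parts = line.split()
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    w, x, y, z_pop = parts[:4]
    return DStat((w, x, y, z_pop), float(parts[4]), float(parts[5]), int(parts[6]))

def is_significant(stat: DStat, threshold: float = 3.0) -> bool:
    """Rule of thumb (an assumption here): treat |Z| >= threshold as significant."""
    return abs(stat.z) >= threshold

if __name__ == "__main__":
    line = "Mbuti Yamnaya_Samara GoyetQ116-1 Kostenki14 -0.0082 -1.719 733059"
    stat = parse_dstat(line)
    print(stat.pops, stat.d, stat.z, stat.n_snps, is_significant(stat))
```

Under those assumptions, a Z-score of -1.719 falls short of the threshold, so the statistic would not be counted as significantly different from zero.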

Rob said...

@ OG

"This guy looks about as far as you can get from a Hun from Hungary."

Why don't you try paring down your setup to come up with 3 to 5 major ancestral streams, eg

Hun DA85
Krasnoyarsk 45%
Karasuk_outlier 28%
Bustan MBA 20%
d 0.026

What did you expect a Hun to look like?

Davidski said...

I've come up with some D-stats graphs that can test for BMAC ancestry in South Asians.

The Iron Age Swat Valley groups do have a lot of BMAC admixture. They also have steppe, but it has to be said that their BMAC ancestry is through the roof.

On the other hand, modern-day South Asians, and especially Brahmins, Tajiks etc. do have a lot of steppe.

I'll post the results tomorrow, plus a new qpGraph for the Swat groups.

Anonymous said...

@Open Genomes

There were quite a few Iranian sounding names among the Huns. I have a hunch these peoples (Avars, Huns, Magyars, Bulgarians) weren't so much people as flocks or confederations of highly different tribes.

Open Genomes said...

@Rob

I'm first trying to narrow down each sample in terms of individuals, because we can't be entirely sure at first if the individuals are typical of a population. Some individuals might be outliers, even if they haven't been detected as such until now, because we're dealing here with entirely new populations. I can do that "supervised" run afterward, when we have a better idea what to expect.

For example, does Geoksiur_Eneolithic:S8532.E1.L1 have that "unexpected" so-called "Siberian_N" admixture? I wouldn't think that Attila and his war-band in Pannonia had people who galloped straight westward from the Altai in just a few years, especially since the historical sources show that the Huns were in the Pontic Steppe and Romania for several generations. Also the sources show that the "Huns" surrounding Attila were a mix of Lir-Turkic- (Bulgar), Iranic-, and Gothic-speakers. And would we expect the Hungarian Huns to apparently have zero descendants anywhere west of the Altai and Uzbekistan?

What would I expect a Hungarian Hun to look like? First off, someone with a better fit to some modern populations, and not mostly Bronze Age and earlier steppe populations with a couple of specifically Altaian Turkic groups. The Swat Valley Udegram_IA:I6901 is interesting because it seems he has a certain specific kind of Steppe ancestry. I would also expect to see at least some relationship to Volga Bulgars (Chuvash) since the Bulgars were descendants of remnant Pontic Steppe Huns, and other Turkic peoples from around the North Caucasus.

That's why I think in fact he may be a Pazyryk Scythian or he might be a Xiongnu. These would be the kind of people who left descendants in the Altai today, rather than Pannonian Huns.

@epoch2013

Of course these people were confederations of various tribes. This is especially documented for the Huns surrounding Attila in Pannonia. The problem here is that this "Hungarian_Hun", aside from the Bell Beaker component, doesn't have any detectable West Eurasian ancestry (or rather, descendants) at all. He just looks "Scythian", but without any direct relationship to the Western (Pontic) Scythians. Even many of his ancient relationships like Afanasievo, Kyzyl Bulak, and Oy Dzhaylau are from just northwest of the Tarim Basin. It's not the admixture that's a problem, it's the locations of his admixture all being so incredibly distant from Pannonia.

Did anyone notice that this "Hungarian Hun" is Y L-M27*, which is primarily found in Afghanistan and South Asia?

Matt said...

A quick thing using Fsts again (sorry focused on recent and ancient Europeans again, really because this is the first matrix that we've had with all the Latvia_HG and Balkans samples, so if you are, rightly given the post, focused on South and Central Asia then I guess tune out):

PCA experiment using Fst from Early and Middle Neolithic populations and European HG groups (from Baltic westwards), as columns, and modern populations as rows: https://imgur.com/a/h04QskQ

PC1 loads on all columns: the positive end reflects being far from both European Neolithic and HG populations, and the negative end being close to both. The columns are slightly biased towards Neolithic populations so apex (average closest) is Spanish, and generally West Eurasians, while the other end (average furthest) is Mbuti and Oceanians, the furthest populations from them.

PC2 then reflects, familiarly, being generally Northern or Southern among West Eurasians; the positive end is Druze and relatively far from European HG, and the other is Karelian and relatively far from Neolithics. The MN populations which are admixed between European HG and Anatolian farmers are of course intermediate in the loadings on this dimension.

Now that's not so interesting. But what might be interesting (if unsurprising) is the next one, PC3, which at the negative end (-) loads on distance from a mix of GAC, German and Atlantic farmers + WHG, and at the positive end (+) on distance from a mix of Balkans and Peloponnese farmers + Latvia HG.

And accordingly, the dimension places specifically recent Southwest Europeans and populations from North Africa and NW Europe that are related to them on the end that is relatively differentiated from Balkans and Peloponnese farmers + Latvia HG, relatively closer to GAC, German and Atlantic farmers + WHG.

This is a really small amount of the variance! It's just cool that the Fst stats on ancients can with the right samples and enough of them still find a very subtle distinction between these populations (modern+ancient).

Correspondence analysis shows similar things: https://imgur.com/a/mg2xxqg

(@Davidski if you bothered reading this far into this tangent, if possible splitting the Globular Amphora into Poland and Ukraine might find some interesting differences in scores, since D stats by Arza suggested these subsets were pretty different in behavior towards the Beakers; GAC Poland a better match for Neolithic ancestry in Beakers than MNChl farmers, GAC Ukraine much worse.)

Rob said...

@ Ted
I see nothing problematic or surprising with this Hun.
It’s commonly known that they came from south Central Asia (eg see Peter Attwood).
Nor is it surprising that he doesn’t look like a Goth. The Goths sometimes fought with the Huns and sometimes were allies or subjects, but they weren't socially panmictic. Such things as distinct ethnicities and even sub-racial groups are real. The world isn’t a rainbow.

Ryukendo K said...

Here are very pleasing new fits with Turkic groups with the new Karluk Turk and Tien Shan Hun. The Tien Shan Hun seems systematically excluded from Turkic groups, but a combination of Scythian_Pazyryk + Karluki Turk + a small slice of Mongola absolutely wipes the table for cleanness and completeness of fits, obviating the need for the complicated and messy Scythian contributions from ZevakinoChilikta and Aldybel and so on.

The fits seem to suggest the ancestral Turkic population ranged within a relatively tight triangle formed by Pazyryk Scythian, Karluki Turk, plus varying amounts of a small slice of Mongola.

Ryukendo K said...

Fits for Uralics generally do not include West_Siberia_N in nMonte. West Siberia N, in the form of the Sintashta_Outlier_3 (~25% Combed Ware, ~50% West Siberia N and rest Sintashta), plays an important role in Khanties and Mansis, as well as Kets and Nganassan, i.e. towards Central Siberia, but even there less than 30% total for Sintashta_Outlier. All these groups are revealed to be Steppe-admixed by the latest genomes.

West_Siberia_N simply contains too much ANE and too little EHG for good representation in Uralics in the Volga and Finnics+Saami however. Uralics west of the Ural Mtns still favour Sintashta+Baltic_BA+Combed Ware+Nganassan over any other combination, with Scythians appearing for the Volga Uralics.

West Siberia N itself is fairly well behaved in nMonte and can be fit very well with the proportions for the qpAdm in Narasimhan et al (which is better than can be said for many modern populations even, e.g. Yakut). However it does not appear to have left much of a legacy anywhere. After Karasuk and Dali_EBA, it is generally absent in the Steppe (Karasuk_outlier + East Asians + Nganassan obviate the need for it in all Scythians and later).

Samuel Andrews said...

Guys, there's a Nepali Brahmin named poi user at Anthrogenica who scores 33% Sintashta even when Sintashta admixed Shahr_I_Sokhta_BA3 is used. Plus, he has more Gonur1 (BMAC) than Shahr_I_Sokhta_BA3.

His Y DNA is R1a L657. Um...

Davidski said...

Pretty sure Shahr_I_Sokhta_BA3 doesn't have any Sintashta/Steppe_MLBA admix.

Aram said...

Davidski thanks for this Fst matrix.
Let's see Hajji_Firuz_Chl.

Hajji_Firuz_ChL 0
Seh_Gabi_ChL 0.032
Hajji_Firuz_BA 0.034
Armenian 0.035
Armenia_EBA 0.035
Armenia_MLBA 0.036
Azeri 0.036
Iranian 0.036
Dzharkutan2_BA 0.037
Kurdish 0.037
Sappali_Tepe_BA 0.037
Assyrian 0.038
Druze 0.038
Dzharkutan1_BA 0.038
Bustan_BA 0.039
Georgian 0.039
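
A list like this is just one row of the matrix sorted from lowest to highest Fst. Here is a minimal sketch of pulling such a list out of a square matrix, assuming a whitespace-delimited text layout with a header row of labels followed by one labelled row per population. That layout, the function names and the toy values are assumptions for illustration; the real file may be arranged differently.

```python
def read_square_matrix(text):
    """Parse a square matrix: header row of labels, then 'label v1 v2 ...' rows."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    labels = rows[0]
    table = {}
    for parts in rows[1:]:
        name, values = parts[0], [float(x) for x in parts[1:]]
        if len(values) != len(labels):
            raise ValueError(
                f"row {name!r} has {len(values)} values, expected {len(labels)}")
        table[name] = dict(zip(labels, values))
    return table

def nearest(table, pop, k=5):
    """Return the k populations with the lowest Fst to `pop`, closest first."""
    others = [(name, fst) for name, fst in table[pop].items() if name != pop]
    return sorted(others, key=lambda item: item[1])[:k]

if __name__ == "__main__":
    # Invented labels and values, for illustration only.
    toy = """
    Pop_A Pop_B Pop_C Pop_D
    Pop_A 0 0.032 0.036 0.039
    Pop_B 0.032 0 0.034 0.041
    Pop_C 0.036 0.034 0 0.030
    Pop_D 0.039 0.041 0.030 0
    """
    for name, fst in nearest(read_square_matrix(toy), "Pop_A", k=3):
        print(name, fst)
```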

Aram said...

Armenia MLBA and Hajji Firuz BA are the closest pops to each other. Without surprise.
Modern pops close to Hajji Firuz BA are the Azeris, Tabassarans, Armenians, Iranians and Circassians.
They entered via Dagestan and Azerbaijan.

Matt said...

Trying to use the procedure from my last post (on the Europe MN, HG and EN Fsts) to try and do something actually on topic for South Asia.

Moderns as rows, Fst from ancient Pakistan_IA and Indus_Periphery as columns: https://imgur.com/a/dnqGTxk

PC1 = separates populations with generally high Fst from all the set (Mbuti) at + end from populations with average low Fst from all the set at the - end (Punjabi Jat).

PC2= separates populations relatively high Fst from Indus_Periphery at the + end from pops with relatively high Fst from Barikot_IA at the - end. Intermediate on the loading in order from most+ to most- are; Indus_Periphery->Aligrama_IA->Butkara_IA->Saidu_Sharif_IA->Loebanr_IA=Udegram_IA=Katelai_IA=Barikot_IA.

Relative distance from Indus_Periphery peaks with a mix of Lithuanian, Latvian, Russian Central-West, Basque, Sardinian. Relative distance from Barikot_IA peaks with Chamar and Austroasiatic_Ho.

PC3 = separates a further distinction between a + end which loads on distance specifically from Aligrama_IA, and a negative end that loads on distance from Indus_Periphery and Barikot_IA. Recent NE European, Gond and Paniya looks to sit on the - end, while West Asians, North Indians and other Europeans look to load on the + end.

Not quite sure what this last dimension really represents. Some way in which Aligrama_IA's affinities differentiate from a combination of Barikot+Indus_Periphery, but other than that... Further dimensions don't look like anything clear to me.

Introducing Sintashta as a column: https://imgur.com/a/hYvoFzy

Relatively similar stuff for PC1 and PC2 as above; slight shifts in PC2 as looking to peak slightly more in NW Europe once Sintashta are included, as against using purely intra IA Pakistani and Indus_Periphery variation. Aligrama_IA as relatively close to NE Europe in PC3 may be relatively more pronounced.

Matt said...

Some plots of residuals in Fst relationships for a few ancient populations and Indus Periphery: https://imgur.com/a/UrUoQlc

A bit of a qualitative shift in which populations show the highest Fst residuals as we move from Aligrama_IA to Sintashta.

War Lord said...

The big deal is that Narasimhan et al. identified R1b-U106 at the Iron Gates (Padina) 8900 BC - and nobody cares.

Davidski said...

No one cares because it's an error.

Matt said...

OK, so a bit more of an experiment on the ancestry of the Beakers, and other BA Europeans, using this data.

First, I generated a PCoA using a subset of Fst scores focused mainly on the Neolithic* and post-Neolithic and on Europe (ancient samples only to avoid any confounds of modern-ancient relatedness).

See here: https://imgur.com/a/Bb8lmQm
Pastebin file: https://pastebin.com/a6UFgLxK

This looked to have some nice structure, and to split apart European Neolithics really quite well in all the dimensions, so then made a calc file (https://pastebin.com/Kd1LVpiE) and ran some fits on Beakers and other European ancients with it: https://pastebin.com/0eq0VUQ7

Regionally, the fits look very logical. There's a preference for Scotland_N though over GAC, which is likely the same issue as seen in D-stats, that the combined GAC culture including the Ukrainian samples then tends to be distant from Beakers and it's specifically GAC Poland that is useful to fit most Neolithic ancestry in NW Beakers; while Scotland_N looks per the paper to be a combination of mainly Atlantic but some German/Central MN ancestry so works quite well.

The PCoA is well able to split apart ancestry in Hungary and the Balkans from the ancestry streams contributing to the NW European Beakers, and populations in Eastern Europe (Poland BA and Baltic BA).

A fit on the Mycenaeans seems to show that, though they have steppe ancestry here, they also look to have ancestry from virtually all directions, mainly Anatolian, but with some substructures among this and also Armenian/BMAC and Levant. Minoans look decidedly quite different beyond just having less steppe ancestry, much more EEF.

CWC_Czech is preferred by almost all populations in the above, so a set of fits without it: https://pastebin.com/K90i67n0

These fits mainly seem to mean that Ukraine_Eneolithic goes up everywhere, while in the first fits with CWC_Czech as an option, Ukraine_Eneo was only really preferred as an addition for Poland and Baltic BA.

*Arguably I should have removed the Natufians, given the selection procedures I was following, but I don't think this would change the results. I also removed a few ancient populations who seemed to have unusual scores to me (Petrovka, France_MLN, etc.).

Matt said...

Further to try and investigate affinities within European Early Farmers and HG groups on the continent, before introduction of "steppe" and CHG ancestry:

First generated a PCoA using the Fst matrix only for a subset of European/Anatolian/Levant early farmers and HGs:
https://imgur.com/a/hmKzQTq. Nice structure. (Matrix: https://pastebin.com/m6P23wjd)

Take the first ten dimensions as data for nMonte: https://pastebin.com/UZcZ2rxU

A: Fits using only minimally admixed Koros_N, Barcin_N and the HG samples: https://pastebin.com/SQbBKRPk

In this simple scenario, i) GAC fits mostly with WHG, plus some Narva and Ukraine_N, ii) Iberia_Central_CA fits almost exclusively with WHG, iii) Trypillia fits with mostly Latvia_MN (EHG) and some WHG.

B: Introducing Iberia_N to take advantage of the patterns where Atlantic samples seem to largely lie between Iberia_N and WHG: https://pastebin.com/E8rciGgn

GAC comes out as about 50% Iberia_N in this test, and the rest of ancestry mostly about 2:1 Barcin/Koros:Narva_Lithuania. This is all fairly plausible given GAC's location, but probably masks a division between the GAC_Poland and GAC_Ukraine. (Confirmation of the G25 models where Iberia_N contributes significantly to GAC?)

Iberia_Central_CA is essentially Iberia_N with a top up of 11% WHG. Trypillia actually does take some Iberia_N as well, but as a minority component at 17% - the rest of ancestry is Barcin/Koros with Eastern HG. Lepenski Vir strongly takes Iron_Gates_HG, Koros_HG and Narva_Lithuania with only minor Iberia_N (12%). Blatterhole, on the other hand, 50% Iberia_N, 30% WHG, 11% Narva.

C: As a final check that Iberia_N isn't just turning up because Koros and Barcin are too early or low HG or something, introducing Balkans_Chl to the models: https://pastebin.com/ZcQ9D13t

GAC takes on a good chunk of Balkans_Chl (28%) essentially replacing much of the Koros/Barcin... but still 50% Iberia_N. Iberia_Central_CA refused to take on any Balkans_Chl, while Trypillia continues to prefer Koros and Barcin. Lepenski takes on about 50% Balkans_Chl, replacing Koros and Barcin, which historically makes no sense, however probably this does reflect patterns in ancestry. Blatterhole takes on about 9% Balkans_Chl, and steadily remains 50% Iberia_N.

Seems like this all reflects (as we found from G25) that there was significant establishment / mixing in Central Europe of EEF groups who specifically had ancestry from both WHG and the autosomally WHG-like groups in East-Central Europe (Narva, Koros, Iron Gates), and specifically from early farmers going through both the Danubian and Cardial routes.

Matt said...

Similar thing using Fst PCoA to model the early steppe.

PCoA: https://imgur.com/a/ECvbjeZ, Matrix: https://pastebin.com/Ds1uhd9w

I used all the populations in the matrix in the calc file, except the early steppe populations and Vucedol. So whatever was chosen, it wasn't because of limited source options.

Fits using 15 dimensions (https://pastebin.com/4LmVxw3d) here: https://pastebin.com/2Pg93RyF

Yamnaya/Afanasievo resolve as roughly 1:1:1 mixes of Ukraine_Eneolithic, Khvalynsk_Eneolithic and a population that appears to be a more CHG shifted version of Armenia_EBA/Chl/Seh_Gabi_Chl, and which I'd guess will represent Maykop. That would be quite nice if it worked, and would seem to make good sense.

CWC_Baltic_Early is similar, but more like 2:1:1 the above ratios.
It seems that in these fits, the Ukraine_Eneolithic is probably the most unifying element.

As a sense check, using 5 dimensions (https://pastebin.com/puT60ZJg) here: https://pastebin.com/knK3aKT8

Still basically similar in the types of ancestor populations (Ukraine_Eneolithic, Khvalynsk_Eneolithic, CHG). Much more dominated by Khvalynsk_Eneolithic, so it seems like the relationship that pushes more Ukraine_Eneolithic in is higher-dimensional.

War Lord said...

"No one cares because it's an error."

Says who? To the contrary, R1b-U106 in the company of many I-M436 looks very plausible.

War Lord said...

"It's about time R1b L151 was found on the Steppe. I've grown really tired of bs attempts to deny the obvious from Maju and others.

People have complained about not enough Y DNA sampling of Neolithic-Chalcolithic France and Spain. Yet, basically, all our Steppe DNA comes from two sites from eastern Yamnaya. Considering the massive Y DNA founder effects in early IE populations it would be foolish to say those two Yamnaya sites represent all the Y DNA diversity of the ancient Steppe but people did this."


"The obvious" is that there is ZERO R1b-M412 east of the Carpathians. And the first R1b found in Europe is that from Villabruna in northern Italy, 14 000 years ago. Steppe invaders riding mammoths probably got there as early as at the end of the Upper Paleolithic, right?