### Unleash the power: Global 25 test drive thread

Ancestry modeling enthusiasts, feel free to do your best (or worst) with these datasheets and share the output, whatever it might be, in the comments below:

Global 25 datasheet

Global 25 datasheet (scaled)

Global 25 pop averages

Global 25 pop averages (scaled)

Global 25 PAST datasheet

The Global 25 is a more powerful version of the Global 10 ancestry analysis (see here). If all goes well in the comments, it'll soon be offered for free to those who already have Global 10 coordinates. After that, we'll see what happens.

Below is a quick attempt to model Samara Yamnaya with its Global 25 coordinates using nMonte2. Hmmm...interesting stuff.

[1] distance%=2.4936 / distance=0.024936

Yamnaya_Samara

Samara_Eneolithic 71.15
Armenia_EBA 13.6
CHG 12.6
Iran_LN 2.65
ALPc_MN 0
Anatolia_BA 0
Anatolia_ChL 0
Armenia_ChL 0
Balaton_Lasinja_CA 0
Baltic_HG 0
Barcin_N 0
Blatterhole_HG 0
Blatterhole_MN 0
Boncuklu_N 0
EHG 0
Greece_N 0
Greece_Peloponnese_N 0
Iran_ChL 0
Iran_N 0
Koros_EN 0
Koros_HG 0
LBKT_MN 0
LBK_EN 0
Levant_BA 0
Levant_N 0
Narva_Estonia 0
Narva_Lithuania 0
SHG 0
Starcevo_EN 0
TDLN 0
Tepecik_Ciftlik_N 0
Tianyuan 0
Tisza_LN 0
Tiszapolgar_ECA 0
Vinca_MN 0
WHG 0

Obviously, this makes a lot of sense, but it's somewhat different from my recent models of Samara Yamnaya using methods based on formal stats (see here and here). In the end, only ancient DNA from the steppes and Caucasian-Caspian region will settle this issue when enough of it is sampled.
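For anyone unfamiliar with what this kind of output means, the general idea can be sketched in a few lines of Python: look for non-negative source proportions that add up to 100% and minimise the Euclidean distance between the weighted average of the source coordinates and the target. This is only a rough illustration using a simple random search. It is not the nMonte2 code, the function names are hypothetical, and the three-dimensional coordinates and population names are invented (real Global 25 rows have 25 columns).

```python
import math
import random


def mix(sources, weights):
    """Weighted average of the source rows; weights are fractions summing to 1."""
    dims = len(next(iter(sources.values())))
    return [sum(weights[n] * sources[n][d] for n in sources) for d in range(dims)]


def distance(a, b):
    """Plain Euclidean distance between two coordinate rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit(target, sources, slots=200, rounds=20000, seed=1):
    """Random search: move one 'slot' between sources, keep moves that do not hurt."""
    rng = random.Random(seed)
    names = list(sources)
    counts = {n: 0 for n in names}
    for _ in range(slots):
        counts[rng.choice(names)] += 1

    def current():
        weights = {n: counts[n] / slots for n in names}
        return distance(mix(sources, weights), target)

    best = current()
    for _ in range(rounds):
        give, take = rng.choice(names), rng.choice(names)
        if give == take or counts[give] == 0:
            continue
        counts[give] -= 1
        counts[take] += 1
        trial = current()
        if trial <= best:
            best = trial
        else:
            # Undo the move because it increased the distance.
            counts[give] += 1
            counts[take] -= 1
    return best, {n: 100 * counts[n] / slots for n in names}


# Invented 3-D coordinates (hypothetical; not taken from any datasheet).
SOURCES = {
    "Source_A": [0.10, 0.02, 0.03],
    "Source_B": [0.02, 0.10, 0.03],
    "Source_C": [0.02, 0.02, 0.11],
}
# The target is built as a known 70/20/10 mix so the fit can be checked.
TARGET = mix(SOURCES, {"Source_A": 0.70, "Source_B": 0.20, "Source_C": 0.10})

best, shares = fit(TARGET, SOURCES)
print(f"distance={best:.6f}")
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(name, round(pct, 2))
```

Because the toy target is an exact mix of the toy sources, the search should recover proportions close to the ones used to build it. With real samples the distance never reaches zero, which is why the fit distance is reported alongside the proportions.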

Update 10/02/201: As per our discussion in the comments, in most cases it might be useful to restore the variance of the raw data (like in the datasheets here and here). This can be done with the EigenScale.R script available here. You'll also need a text file of the relevant eigenvalues, available here. Instructions on how to call the R script are here. Below is the same model of Samara Yamnaya as above, except using the "restored" data. The result is very similar, albeit a little cleaner.

[1] distance%=3.8086 / distance=0.038086

Yamnaya_Samara

Samara_Eneolithic 70.35
Armenia_EBA 14.9
CHG 14.75
ALPc_MN 0
Anatolia_BA 0
Anatolia_ChL 0
Armenia_ChL 0
Balaton_Lasinja_CA 0
Baltic_HG 0
Barcin_N 0
Blatterhole_HG 0
Blatterhole_MN 0
Boncuklu_N 0
EHG 0
Greece_N 0
Greece_Peloponnese_N 0
Iran_ChL 0
Iran_LN 0
Iran_N 0
Koros_EN 0
Koros_HG 0
LBKT_MN 0
LBK_EN 0
Levant_BA 0
Levant_N 0
Narva_Estonia 0
Narva_Lithuania 0
SHG 0
Starcevo_EN 0
TDLN 0
Tepecik_Ciftlik_N 0
Tianyuan 0
Tisza_LN 0
Tiszapolgar_ECA 0
Vinca_MN 0
WHG 0
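The rescaling step mentioned in the update can also be sketched in Python. This assumes that "restoring the variance" means multiplying each PC column by the square root of its eigenvalue. That is one reading of the description, not a transcription of EigenScale.R, and the function name, eigenvalues and coordinates below are made up for illustration.

```python
import math


def restore_variance(row, eigenvalues):
    """Scale each PC coordinate by sqrt(eigenvalue) (assumed scaling rule)."""
    if len(row) != len(eigenvalues):
        raise ValueError("row and eigenvalues must have the same length")
    return [x * math.sqrt(ev) for x, ev in zip(row, eigenvalues)]


# Hypothetical two-PC example; real rows would have 25 columns.
example_row = [0.1, -0.2]
example_eigenvalues = [4.0, 9.0]
rescaled = restore_variance(example_row, example_eigenvalues)
print(rescaled)
```

Under this assumption the leading PCs, which have the largest eigenvalues, are stretched the most, so they carry more weight in any distance-based fit.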

Modeling genetic ancestry with Davidski: step by step

The powerful Global 25 now available via the Eurogenes genetic ancestry online store

Anthro Survey said...

@Simon_W
Ah, that's right. Looked again at Dave's West Eurasia PCA and both 1504 and 1502 are quite WHG-shifted. Between French and Iberians in the orthogonal direction. Confused them for those 2 Maros and all but the WHG-shifted Vatya. Well, considering our timeframe, the WHG-excess couldn't have been obtained via direct WHG admixture, but through Blatterhohle-type EEFs in the vicinity somewhere. Didn't run it, but maybe ~25% steppe.

Good going on your models. The French_South and Nordic_IA make perfect sense. I mean, they essentially describe the "mainstream" modern European spectrum. High-content WHG+varying amounts of steppe. Tuscan_Italy signal, if real, could signify either Ashkenazi or some Roman-age admixture from the Balkans, Italy or even beyond. Not sure what to make of Hungary_IA. Strange and probably a compensatory artifact.

Slav---that could just be bundled in his Prussian Balt ancestry. I mean, modern Lithuanians model as combo of Turlojske and Slav_Bohemia(mostly the former).

I think using Mozabites is a bit problematic even if we assume Roman-age North African influence that far afield. Guanches would be preferable and the genomes are available, btw(I told Dave already). KEB would be even more desirable. Considering location, some ADMIXTURE studies and history, Mozabites probably carry an excess of SSA compared to Roman-age Numidians/Mauri and could be deficient in ANF-like admixture(thus, more similar to IAM people, not KEB) sweeping the region in the late Neolithic as well.

@Rob

I'll say more on this later, but I(and the available evidence) favor 3) and/or 2).
Regarding Minoans---do you mean to say they didn't have an influx of CHG-related ancestry arriving in an Anatolia_BA-like package or that it merely arrived much earlier? I think the former is clearly supported by formal stats, admixture, monte, and hg J.

Matt said...

@Anthro Survey, nope, sorry, the co-ords are for the scaled Global_25; making a sub-PCA / meta-PCA from the Global25 data and the selected samples is only for the purposes of really clearly visualising the two clines and estimating proportions. It's not an important step, just a way of estimation.

@Chad / Sein, here are a few PAST3 graphics plotting where Chad's ASI lands: https://imgur.com/a/bTyle

Neighbour Joining Tree: https://imgur.com/TJAxdhy - places with Negrito populations as per Chad's expectation
PC1 v PC2: https://imgur.com/7YHyYvv / https://imgur.com/dNZNREF
PC1 v PC3: https://imgur.com/u6DyZIF / PC1 v PC4: https://imgur.com/MlrTWOI / PC1 v PC5: https://imgur.com/ZCGfF6u

In all the PC again, pretty close to the Negritos and tends to be pretty far from an extrapolated end beyond SA clines, where the end point of those clines are pointing in an opposite direction from SE Asian populations, which is PC3, PC4, PC5.

I had a go at putting it into "competitive" nMonte3 against the ASI / Austroasiatic zombies I put together, didn't find that any population took it (which makes sense, since it's basically Negrito like, and those populations are far from the South Asian cline in the above, and themselves weren't "competitive").

It would probably better approximate an "ENA ASI" if it were mixed down with Kosipe / Koinanbe / Papuan in about 50% or so. (Various graphics here: https://imgur.com/a/Y5K17). Though that would probably get too much branch specific drift with Papuans, so probably better off to do cline extrapolation.

@Davidski: Peristeria (ID I9033) has a higher cut of Potapovka-like steppe ancestry than the individuals from the commoner graves.
Looks plausible, though in context dispersal v. small among these samples. (No more / less dispersal than within Iberia_EN / Iran_N)

Shift I9033 v Mycenaean average same vector and magnitude as Anatolia_BA-Anatolia_ChL (time direction is opposite).

Matt said...

Models using an ultra basic calc* to try and get Mycenaean:I9033 and I9010 (least and most "north" Mycenaeans) on basically comparable terms:

Mycenaean:I9033 - Anatolia/Greek N 63.6 (Greece_N), Steppe_EMBA,15.8, CHG,14.8, Natufian,5.8

Mycenaean:I9010 - Anatolia/Greek N 75 (Greece_N,46, Anatolia_N,29.2), CHG/Iran_N 13.4(CHG,6.8, Iran_N,6.6), Natufian,6.4, Steppe_EMBA,3.8, WHG 1.2 (Rochedane)

*ultra basic: https://pastebin.com/N1gqifYc
(More complicated models: https://pastebin.com/muay0t3p)

Matt said...

Using another basic calculator to "force" nMonte to fit with either Minoan_Lasithi / Greek_Peloponnese (e.g. eliminating all other pops with substantial Anatolia_N ancestry):

Minoan forced:
Mycenaean:I9033 - Minoan_Lasithi,70.6, Steppe_EMBA,12.4, CHG,7.6, Natufian,5.8, Villabruna,3.6
Mycenaean:I9010 - Minoan_Lasithi,83.6, Natufian,6.4, Rochedane,5, Iran_N,3.8, EHG,1.2

(Remove Natufians and EHG:
Mycenaean:I9033 - Minoan_Lasithi,70.2, Steppe_EMBA,12.8, CHG,7.4, Levant_N,5.8, Villabruna,3.8
Mycenaean:I9010 - Minoan_Lasithi,78.6, Levant_N,10.2, Rochedane,5.4, Iran_N,3.8, Steppe_EMBA,2

Mycenaean:I9033 - Hungary_MNChl,63, CHG,17, Iran_N,5.8, Steppe_EMBA,5.4, Minoan_Lasithi,5.2, Levant_N,3.6
Mycenaean:I9010 - Hungary_MNChl,44.6, Minoan_Lasithi,33.2, Levant_N,9.2, Iran_N,8, CHG,4, Steppe_EMBA,1)

Greek_Peloponnese forced:
Mycenaean:I9033 - Greece_Peloponnese_N,66, CHG,14.6, Steppe_EMBA,13, Natufian,4.4, Villabruna,2
Mycenaean:I9010 - Greece_Peloponnese_N,65.6, Levant_N,15.2, CHG,7.4, Iran_N,4.6, Rochedane,4, Steppe_EMBA,3.2

(Remove Natufians and EHG:
Mycenaean:I9033 - Greece_Peloponnese_N,66, CHG,13.6, Steppe_EMBA,13, Levant_N,4.2, Villabruna,2.2, Iran_N,1
Mycenaean:I9010 - Greece_Peloponnese_N,66, Levant_N,15, CHG,7.2, Iran_N,4.6, Rochedane,3.8, Steppe_EMBA,3.4

Mycenaean:I9033 - Greece_Peloponnese_N,56.2, Steppe_EMBA,14.2, CHG,13, Hungary_MNChl,11.6, Levant_N,3.4, Iran_N,1.6
Mycenaean:I9010 - Hungary_MNChl,52.8, Greece_Peloponnese_N,19, Levant_N,11.4, Iran_N,9, CHG,7.8)

....

At the same time, CHG, Levant_N, Iran_N in excess of Minoan average required, as well as Europe_LNBA related EHG, Steppe_EMBA and WHG related ancestry.

So against Mycenaean as simple mix of Minoan/Greek_Peloponnese+Europe_LNBA/Steppe_MLBA? Or someway inaccurate?

(Force Minoan_I0071 as the most east shifted Minoan:
Mycenaean:I9033 - Minoan_Lasithi_I0071,71.8, Steppe_EMBA,9.6, CHG,8.2, Natufian,6.2, Villabruna,4.2
Mycenaean:I9010 - Minoan_Lasithi_I0071,84.4, Natufian,7, Rochedane,5.2, Iran_N,3.4)

Ariel said...

Like it or not there isn't that much EHG in these Mycenaean samples. Even in that one from the royal tomb. It's surprising if you consider the age of these samples, and the genetics of the rest of Europe in that time frame. I would have thought that Mycenaeans had the same or even more EHG than modern Greeks, but everything points to an early soft migration, and then a more massive "Dorian" invasion, that ultimately led to the collapse of the Mycenaean world.
We are really missing classical era DNA from southern Europe.

Matt said...

Trying an extrapolation and meta-PCA approach to working out Mycenaean and Minoan relation.

First I generated three sims:

Z (where Z=0.13*Yamnaya_Average+0.87*Minoan)
Y (which satisfies Mycenaean_Average=0.2*Y+0.8*Minoan_Average)
Y2 (which satisfies Mycenaean_Average=0.5*Y2+0.5*Greek_Peloponnese_Average)

Then took all of the ancient population averages and these three SIMs, and projected them through another PCA. See: https://imgur.com/a/BFyPo
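The three simulated points described above come down to two operations on coordinate rows: a forward mix (13% Yamnaya average plus 87% Minoan) and an extrapolation that solves a mixing equation for the unknown source. A minimal Python sketch, with hypothetical function names and invented two-dimensional coordinates standing in for the real 25-column rows:

```python
def forward_mix(a, b, share_a):
    """Return share_a * a + (1 - share_a) * b, column by column."""
    return [share_a * x + (1 - share_a) * y for x, y in zip(a, b)]


def extrapolate(mixed, known, share_unknown):
    """Solve mixed = share_unknown * unknown + (1 - share_unknown) * known."""
    return [
        (m - (1 - share_unknown) * k) / share_unknown
        for m, k in zip(mixed, known)
    ]


# Invented coordinates, for illustration only.
known_source = [0.10, 0.30]
hidden_source = [0.50, -0.10]

# Build a 20/80 mix, then recover the hidden source from the mix.
mixed_point = forward_mix(hidden_source, known_source, 0.2)
recovered = extrapolate(mixed_point, known_source, 0.2)
print(recovered)
```

Note that dividing by a small share (0.2 here) magnifies any noise in the averages by a factor of five, so an extrapolated point can easily land somewhere no real population sits.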

In basic terms, in the PC1 and PC2 this forms, you can see that Z is actually a pretty good proxy for Mycenaean average in basic terms... and Y and Y2 point respectively more or less towards the Steppe and towards the North Caucasus, summarizing more or less the vector we'd expect Mycenaean to have relative to Minoan and Greek_Peloponnese respectively.

But you can see that straight after this, another PC forms, PC3, which specifically contrasts Yamnaya *and* the Early Neolithic Southeast Europe on one end against Iran_N, Levant_N, and WHG on the other (with CHG) intermediate. Tepecik_Ciftlik_N, which overlaps with Minoan in simple West-East and North-South terms, here is different in having specifically more drift with Levant_N than Minoan_Lasithi does (and also Greek_Peloponnese_N).

On this PC, Mycenaean is simply too loaded towards the Iran_N, Levant_N, and WHG end to fit as a mix of Minoan_Lasithi and Steppe_EMBA / Europe_LNBA.

Though it might be better fit as Tepecik_Ciftlik+Steppe_EMBA / Europe_LNBA... Or Anatolia_BA+Minoan_Lasithi+Steppe_EMBA / Europe_LNBA

Similarly the extrapolations of the difference between Minoan+Mycenaean give strange positions in this PC which load heavily towards parts of the plot with no ancient samples...

Tentatively, I suspect that this is real and Global25 is picking up a more subtle difference in Levant vs Anatolian ancestry. That the Mycenaeans were offset towards the Levant and West Asia compared to the Minoans, as well as having received European/Steppe related ancestry, both at fairly low levels.

(Apologies for another "book").

@Ariel, modern Greeks look to have Spinigas2 / Baltic_BA related ancestry. Will be interesting to see if immediately post-Mycenaean / Dark Age Greek samples do, or not really until much later in history.

Lauχum said...

People here seem to be very adamant on Roman Era migration from the Levant and North Africa. However, bio-chemical analysis on a total of 166 individuals from commoner graves active from the 1st to 3rd centuries in Rome and Portus have shown the vast majority of people to be local, and if not local then most likely from other parts of Europe. With only a few possible North African samples. So this indicates that even in the most metropolitan part of Italy, migration from outside of Europe was rather uncommon.
http://onlinelibrary.wiley.com/doi/10.1002/ajpa.20541/abstract
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0147585

Here is a study which found negligible influence from North Africa for Sicily and South Italy. This is also confirmed by the lack of Sub-Saharan admixture in Italy, which is a rather unique signature of North Africans among the peoples in the region.
https://web.archive.org/web/20150402102624/http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0096074%2F

Of course, I'm not denying that Italy has non-EEF-WHG-Steppe admixture, but based on these results, North African admixture in modern Italians is negligible and the impact of non-Europeans during the Imperial Roman era also seems to be very minor (again considering this was found in the most metropolitan part of Italy). So attributing this West Asian shift in Italy to Roman and post-Roman era migration seems to be inaccurate.

MomOfZoha said...

@Eren:

"@MomOfZoha:
VIPs should always come first. Hopefully my reply didn’t come off as unpolite, as I only focused on the part of your post about Siberian admixture. I was in a hurry and used your remark as a segue to post my analyses that I wanted to share. "

Not at all unpolite. I am also usually in a hurry and understand that blog comments do not accurately convey "tone", no worries.

"Sorry, about your grandparents. I was lucky enough to test mine when they were all still alive. My maternal grandpa has passed since then. Being able to model my ancestry at the level of my grandparents is definitely great. Don’t give up trying to convince your great uncles."

Thank you. Regarding your dede: Basin sag olsun.

An extra layer of difficulty was added wrt convincing my mom's paternal uncles when my mom's two closest DNA matches turned out to be Iraqi Kurdish. Unfortunately, my extended family is not very open minded, and this extra information was then too much given their pre-existing distrust of "western" DNA tests to begin with. If I were living near them, maybe I would still try to convince their wives to steal their spit surreptitiously, but alas that too is not an option. :)

"Now with the labels that Arza added to your graph (great job btw), a lot of it makes sense. A very different kind of visualization than I’m used to. Very nice."

Happy that you like it. :)

MomOfZoha said...

@Onur:
I don't disagree that all of Anatolia was "Hellenized" at the very least due to the Byzantine empire, on top of already existing Anatolian Greeks. And, maybe that is a sufficient perspective for the sorts of questions that you wish to ask, especially if your questions are primarily genetic ones for much larger time frames. I am more curious about other aspects of Anatolian history and dynamics, and all the groups that have come and gone are important to me for that reason.

As concerns the Isaurians, even the Byzantine (hence Hellenized) emperors they produced -- one of whom notably married a Khazar princess -- were referred to as alternately Armenian or Syrian. Their region was always together with Cilician Armenia, and then that whole region -- particularly Cumra county of Konya -- became the Karamanid center upon the entrance of the Turks. I have way more questions than any answers concerning any of this.

But, to further illustrate complications to "Konya analysis" in general: Abu Hamid, the second traveler in the book I mentioned, writes of Konyali Turkmen mercenaries in the Byzantine army headed to Hungary. It is in the 1100s remember. Abu Hamid himself convinces the Konya Turkmen not to fight, as they would be fighting fellow Muslims, the Bashgird (Bashkir, probably), Turkic speakers.

In other history, the crusades apparently passed through Konya, devastating the city with the help of Hungarians. Much much prior to that, "Iconium" is mentioned in the Bible, along with its important Synagogue. The Synagogue of Iconium...

And, of course, the Galatians were not a figment of Paul’s imagination...

So, yes, south central Anatolian history gets complicated.

@Matt,

Interesting. I am wondering though. According to the model, my ghost isn't Papuan enough, so I am not sure I would want to take it from the Papuan or Negritoes any more than it is. There could very well be a West Eurasian population mixed into South Asia that doesn't fit into the Iran/Steppe category and is more related to UP Euros. We will have to wait and see.

Looking at your graph I took another stab at it. I now have an ASI ghost that is a much better fit than the last. Here is the fit and the coordinates...

[1] "distance%=17.017 / distance=0.17017"

Paniya
"NewASI" 76.6
"Iran_N" 22.4
"MA1" 1
"ASI" 0
"Yamnaya_Kalmykia" 0
"Yamnaya_Samara" 0
"Srubnaya" 0
"Scythian_AldyBel" 0
"Scythian_Pazyryk" 0
"Scythian_Samara" 0
"Scythian_ZevakinoChilikta" 0
"Poltavka" 0
"Papuan" 0
"Karasuk" 0
"Karasuk_outlier" 0
"Iran_ChL" 0
"Iran_IA" 0
"Iran_LN" 0
"Andronovo" 0
"Andronovo_outlier" 0
"Armenia_ChL" 0
"Armenia_EBA" 0
"Armenia_MLBA" 0
"Altai_IA" 0
"Aeta" 0
"Afanasievo" 0
"Agta" 0

NewASI,-0.024,-0.241,-0.13,0.1,0.035,-0.0085,-0.095,0.008,0.048,0.02,0.021,0.0007,-0.002,0.006,-0.016,0.012,0.014,0.0015,-0.007,0.034,0.00015,0.0015,-0.026,-0.0004,0.002

Matt,

Based on where your ASI ghost is, it is about as West Eurasian as the Paniya and Pulliyar. Try making your ghost a mix of my ASI and these pops. I bet it comes out about 80% ASI and 20% Iran. I'll let you guys play with this from here. I've got an assignment to finish and some other work on farmers to finalize. I'm interested to see how that all turns out or what you can make of it. When I have time later today, or next week, I'll hop back in here.

Matt said...

First some plots: https://imgur.com/a/JhY49 (I've called the new sim Chad ASI v2 on these)

PC1 v PC2: ASI_Final_Sim (my ASI) looks roughly between ASIv2 and Iran_N at about 60:40. ASI v2 is pretty close to Onge in PC1 v PC2.

PC3 and PC4 show that ASIv2 is still pretty close to Onge and far away from the end of South Asian cline. But Onge looks marginally closer to ASI_Final_SIM and the end of the SA cline.

Accordingly when I fit ASI_Final_Sim, allowing your SIM and all World and ancient populations without recent SA gene flow, then:

ASI_Sim_Final - Onge,77.2, Iran_N,20.6, MA1,2.2

It prefers to use Onge ("I said, the *real* Onge... Perfection."?)

The squared euclidean distance is fairly high compared to lots of direct fits, at around 0.017 (though based on Fst conversion I made on the first page, if that's valid, does only translate to 0.004 in Fst terms).

CromwellCorner said...

"@ Anthro

I think most people see this

Rob, I believe David Anthony already used advanced statistics to rule out the Caucasus as the PIE homeland. I can find the reference if you want.

Rob said...

@ Cromwell

I’d be curious to see what stats they are, thanks.
Although I’m my own man, and there are numerous problems with Anthony’s models; and his evidence has at times been circumstantial at best, and can be completely wrong at worst.
In fact, the classic kurgan model is a thing of the past.

Rob said...

@ AnthroS

Option 3 is least likely
I merely included it out of courtesy for your feelings :)

Kanishka said...

@Onur Dincer Thanks.

@Anthro Survey Really? Interesting, as I had thought that hybridization between the two groups began much earlier than the late BMAC period.

CromwellCorner said...

Here you go Rob: https://imgur.com/a/56dSW pg. 297 from his book

There are other references to this probability throughout his work.

Matt said...

Thinking more about simulated samples and fits, and South Asia, I tried doing some simple nMonte models of South Asian populations, using only Iran_N, CHG, Onge, Srubnaya_Outlier, Samara_Eneolithic, Steppe_MLBA, Steppe_EMBA.

Results: https://pastebin.com/k4paKSPV

Curiously these only took Onge+Iran_N+Steppe_MLBA+Srubnaya_Outlier and no Steppe_EMBA. Not sure why. (Among these, Steppe_MLBA varies from 23-20 in Kalash/Brahmin to 0 in Dusadh, Velamas, Gond; Onge varies from 15.2 in Kalash to 73 in Gond)

Anyway, took the nMonte output and visualised them with other averages on PCA and neighbour joining: https://imgur.com/a/ZSxcI

The fitted who were mainly made from Iran_N+Srubnaya_Outlier+Steppe_MLBA (Brahui and Kalash) are reasonably close to the real samples... while the ASI heavy populations (e.g. Gond) have fits who are a bit more distant from the real average. Suggesting that Iran_N+Srubnaya_Outlier+Steppe_MLBA seems like a little better proxy for the ANI mixes than Onge (alone) is for ASI.

Generally, comparing fitted to real, the fitted seem a bit further away from other Eurasians in PC2 which is general to modern day East+West Eurasians, and closer to other Eurasians in dimensions that are specific to South Asians (e.g. ASI linked dimensions), and again, particularly for ASI heavy samples.

Vara said...

@Crom

These are mere assumptions. PIE could've been spoken in the Caucasus while a Caucasian language could've been spoken in Anatolia, and the exchanges could've happened there.

Onur Dincer said...

@MomOfZoha

I don't disagree that all of Anatolia was "Hellenized" at the very least due to the Byzantine empire, on top of already existing Anatolian Greeks. And, maybe that is a sufficient perspective for the sorts of questions that you wish to ask, especially if your questions are primarily genetic ones for much larger time frames. I am more curious about other aspects of Anatolian history and dynamics, and all the groups that have come and gone are important to me for that reason.

As concerns the Isaurians, even the Byzantine (hence Hellenized) emperors they produced -- one of whom notably married a Khazar princess -- were referred to as alternately Armenian or Syrian. Their region was always together with Cilician Armenia, and then that whole region -- particularly Cumra county of Konya -- became the Karamanid center upon the entrance of the Turks. I have way more questions than any answers concerning any of this.

Yeah, so actually not all of Anatolia had been Hellenized by the Byzantine times but certainly most of it had been. Some parts in the interior zone had important Armenian and Syriac pockets and in the easternmost parts such non-Greek Christian groups made up the majority of the population. Bear in mind that I use the historical definition of Anatolia here, so I exclude the parts of what is now Turkey east of the Euphrates, in which Greeks have never been the majority in history.

Davidski said...

@Matt

You might find this interesting. A Global 25 datasheet with two different Loschbour sequences...

One is largely made up of imputed calls, and comes from this dataset.

http://eurogenes.blogspot.com/2017/07/new-resource-67-diploid-ancient-genomes.html

EastPole said...

@Rob

“Option 3 is least likely”

Option 3 seems to be the mainstream now. Options 1 and 2 i.e. Renfrew and Gamkrelidze - Ivanov theories have less and less support.

Listen to how Renfrew is praising Gimbutas’ Kurgan theory:

https://youtu.be/y5u7fls9CIs?t=4076

Regarding the relatively low steppe admixture in Mycenaeans, I don’t think Yamnaya-like pure steppe populations migrated to Greece but rather they were more like CWC or Sintashta:

Alberto said...

Some time ago I made these same graphs for SC Asian populations based on D-stats. So now I decided to give it a try with distances from the Global 25 spreadsheet. The results are basically the same, not much to see, really:

https://imgur.com/a/u605d

Apart from a few populations going towards Iran_Neolithic (Brahui, Makrani and Balochi -missing, I forgot to add it) and the Tajiks towards Andronovo, the rest of the populations form a fairly uniform cline.

So then I thought that with an ASI Simulation, we should see the same thing. That the biggest difference between IE and Dravidians, higher caste and lower caste, or north and south is mostly the ratio of ANI/ASI rather than big differences in their West Eurasian components.

This is my go at an ASI Ghost:

ASI_Ghost,-0.014942,-0.231412,-0.246054,0.182356,-0.063257,0.084312,-0.002376,0.023831,0.142769,0.096525,-0.002267,-0.001468,0.00602,0.018795,-0.036669,-0.033767,0.017185,-0.002091,-0.00481,0.031237,0.011215,0.02027,-0.00266,0.009394,-0.013286

No idea where it would land on a PCA, but here are some results from Global 25 fits:

Rob said...

@ EastPole
Yes I’ve heard the talk. Renfrew states the kurgan hypothesis has genetic support. He also was instrumental in making Haak et al rephrase the title of their 2015 study.
But genetics also proves the migration of farmers through to Europe and the steppe and through to the Urals!
As it does the migration from the Caucasus to the steppe.

In case you weren’t paying attention, I’m basing my arguments on my analysis of DNA, not what others think or say.
That the majority of scholars support the steppe hypothesis (based on what is often a one-sided and sometimes flawed analysis) doesn’t impinge on my own observations. In fact, I have reason to doubt a steppe primordiality of PPIE on anthropological grounds: without “farmers” there’d be no kurgans.

And you actually believe Mycenaeans came from CWC or Srubnaya? I see no evidence for that.

Davidski said...

@Eren & Onur

There is indeed a problem with the Kayseri Turks in this analysis. I've removed them from the datasheet.

Matt said...

@Alberto looks to work fairly well as any cline extrapolation would; unlike sims/ghosts that pretty much match Onge or Aeta/Agta, it is at the end point beyond South Asian clines, so should be favoured over those real populations or simulations that resemble them.

For plotting to visualise it: https://imgur.com/a/XcLTq

The further you move from the real cline, the more likely it is that some subtle structure in your populations will move the ghost into a weird place in some dimension that requires an odd admixture to fit - see what I did with the Mycenaeans for example. So that's one question about going beyond the edge of any cline

"north and south is mostly the ratio of ANI/ASI rather than big differences in their West Eurasian components"

That's not to say that they *don't* have substantial differences in the West Eurasian components though; if you had fantasy populations that were 90:10 WHG:Anatolia_N and WHG:CHG, then they would be quite close together on a simple dimension distinguishing Anatolia_N and CHG... (e.g. most South Asian populations are compressed together on West Eurasian related dimensions by shared ASI - dimensions don't separate populations in ratio form!).

Matt said...

@Davidski, thanks.

Putting them on the PCA, then calculating the Euclidean distance between them both and other populations (ancient samples and modern averages) and plot: https://imgur.com/uhYMoe6

Perfect correlation at 1:1 (r = 1, r2 = 0.99999).

Putting the two distances through another PCA: https://imgur.com/a/JagU1.
To the degree there is a difference it reflects EEF being more attracted to imputed, others less, *but* since this is 0.00014995% of the variance, it's far too small for any change in rank orders of populations, or anything practically meaningful etc.

So based on this, in the case of the Loschbour sample, should be no difference between the observed and one largely imputed? (To limits of this resolution).
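The check described in this comment (Euclidean distances from each of the two versions of a sample to a set of reference populations, then the correlation between the two lists of distances) can be sketched roughly as follows. The function names, sample labels and coordinates are all invented for illustration; this is not the workflow or data actually used above.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two coordinate rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_versions(version_a, version_b, references):
    """Correlate the distances of two versions of a sample to the same references."""
    dist_a = [euclidean(version_a, ref) for ref in references.values()]
    dist_b = [euclidean(version_b, ref) for ref in references.values()]
    return pearson(dist_a, dist_b)


# Invented 3-D coordinates (hypothetical labels and values).
REFERENCES = {
    "Ref_1": [0.12, 0.01, -0.03],
    "Ref_2": [0.05, 0.09, 0.02],
    "Ref_3": [-0.02, 0.04, 0.07],
    "Ref_4": [0.08, -0.06, 0.01],
}
observed = [0.130, 0.020, -0.040]
imputed = [0.131, 0.019, -0.040]

r = compare_versions(observed, imputed, REFERENCES)
print(f"r = {r:.5f}, r2 = {r * r:.5f}")
```

A correlation near 1 says the two versions rank the reference populations almost identically by distance, which is the practical question here.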

Davidski said...

@All

I removed a few more modern samples that looked a bit strange and may have been mix ups, and updated the datasheets.

Also, I made a Past datasheet with each major world region color coded, and all of the obvious outliers removed. It looks pretty good I think.

Alberto said...

@Matt

Thanks for the PCAs. It does look in a similar cline to your own simulation, but more distant from extant Indian populations. Not bad, I'll try to test it more later.

And yes, there are differences in the West Eurasian part of SC Asian populations, but they're fairly small for such a big area. Their ASI might have some influence, but I don't think it's the main reason. Paniya might have some 60-70% ASI, but most others are below 50% to just a few percent. So in your example, we could have those 2 populations being very close to each other, but if then we had another two being 60:40 WHG:CHG and WHG:AN respectively, and then another 2 being 40:60 and another 2 being 20:80, we'll clearly have two different clines.

There are a few options for explaining this cline, but I'm not sure which one is more parsimonious, so I won't go for anyone in particular and wait instead for ancient DNA (it must come at some point...)

Matt said...

Re: PCA no probs. Your and my ASI ghost do look generally along the same cline, though I will say there is a distinction in PC6 where your ghost has a position that probably with your ASI would estimate Austroasiatics in India more ASI and requires them to have ancestry from a more general SE Asian population, while in my model, on the contrary, there would probably need to be some low level (single digits) of specific Austroasiatic related ancestry across various Indian subpopulations... Both outcomes fairly plausible.

Re: SA clines, well, I mean it's like, as a general thing - suppose we have a dimension that separates A and B, and point C falls at 0 between A and B. If we then have AC and AB, then AC and AB will be exactly half as distant from each other as AB, on this dimension.

Let's then further assume that actually the points are AD and AE. D is 0.7:0.3 B:C and E 0.3:0.7 B:C (or we could use 0.6:0.4 and 0:1), then total distance DE is about 0.4 on the BC dimension. Mixing down with 0.5 A (at zero) dilutes this again by half. So AD and AE would only have very small distinction (spanning 0.2 of distance) on the BC dimension, despite having what we'd call substantial heterogeneity in the ratio of BC.
It looks like populations spanning Brahmin_UP to Velamas about 0.4-0.6 "ENA ASI" I'd guess.
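The dilution arithmetic above can be checked with a tiny Python sketch. The placement of the points is an assumption made for illustration: on the single B-C dimension, B sits at 1 and both C and A sit at 0. The `blend` helper is hypothetical.

```python
def blend(x, y, share_x):
    """Position of a mix that is share_x of x and the rest y, on one dimension."""
    return share_x * x + (1 - share_x) * y


# Assumed positions on the B-C dimension.
B, C, A = 1.0, 0.0, 0.0

D = blend(B, C, 0.7)   # 0.7:0.3 B:C
E = blend(B, C, 0.3)   # 0.3:0.7 B:C

AD = blend(A, D, 0.5)  # D mixed down with 50% A
AE = blend(A, E, 0.5)  # E mixed down with 50% A

gap_unmixed = abs(D - E)
gap_mixed = abs(AD - AE)
print(gap_unmixed, gap_mixed)
```

The gap between the two mixed populations is half the gap between D and E, even though the B:C ratio inside each of them has not changed at all.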

So I guess I just wouldn't trust myself to get this right by the naked eye is all. From my modeling, to me, something like 55:45 to 10:90 of Steppe:Iran_N in various SA populations actually still looks pretty compatible with how compressed pops appear on cline (though some small amount of "Steppe" and "Iran_N" here may be likely pre-Neolithic / Bronze Age).

Simon_W said...

@ Anthro

"Slav---that could just be bundled in his Prussian Balt ancestry. I mean, modern Lithuanians model as combo of Turlojske and Slav_Bohemia(mostly the former)."

We don't know when exactly this Slavic-like admixture entered the Lithuanian population, could be rather recent. And I would take into account that the Lithuanians originated in a more northern and eastern area than Turlojiske, hence the latter should be a better proxy for the old Prussians.

And finally I would add that real Slavic admixture in some East Prussian Germans doesn't seem that crazy. In the Chelmno land even prior to the conquest by the Teutonic Order there was a mixed population of Poles and Prussian Balts. And starting in the 14th century there was an influx of Masovian Poles into southern East Prussia, which gave rise to the Masurian people. This influx got even stronger in the 16th century. Some of them may have moved around in East Prussia. In fact, Slavic sounding surnames are not uncommon among East Prussian Germans. Also one out of four grandparents of my grandmother carried a Slavic-derived surname. And west of the Vistula (in "West Prussia") the pre-German locals were Slavic Pomeranians anyway; these were partly absorbed by the Germans as well, and certainly there was some mixture and geneflow between Germans from east and west of the Vistula. And even the early German colonists may have carried some Slavic admixture with them, considering the Slavic presence all the way to the Elbe and Saale.

Regarding the Mozabites, well I could try other North African samples in the datasheet, there are all the modern pops available. But I guess they're all quite Sub-Saharan admixed and moreover carry the Arabic admixture.

MomOfZoha said...

@Rob:

Out of curiosity, what kind of "-ski" are you, if you don't mind me asking? Bulgar-ski, Hrvat-ski, Srp-ski, Makedon-ski, Sloven-ski, Rus-ski, Pol-ski, Cze-ski, Slova-ski...

I'm guessing Makedonski or Bulgarski, but I could be totally off... Of course, given this forum, some kind of Polski might make more sense, but with your humor and temperament.... you have *got* to be some kind of BALKANSKI.

Then again, you totally do not have to answer. I'm just having fun here guessing... Hope you don't mind. :)

Anthro Survey said...

@Simon_W

Yeah, the Slav shift in modern Lithuanians could well go back to medieval/early modern times (Kingdom of Lithuania). I mean, heavy Ruthenian presence was attested to in places like Vilnius. Similar situation in the Moldavian principalities where a lot of Ruthenian peasants were invited.
In agreement as well about the possibility of real Slav admixture in eastern Germans. In fact, I would have actually expected a higher signal. In all likelihood, it's higher than 5%, but hiding in other components---either in Nordic_BA or Turlojske or both---given the inter-relatedness of people in that part of Europe. The other early Slav(568) is different than 569 and plots closer to Turlojske.
So, some uncertainty when it comes to teasing apart Slavic and Baltic ancestry in your context.

The other NA samples have just as many(if not more) caveats as Mozabites do. No choice but to wait for Dave to add Guanche and KEB to be on the safe side. Also----have you tried using Samaritan in your runs, considering greater importance of Roman-age Levant? If so, did it work out or did it "compete" with Cypriot and/or Anatolia_BA due to a lot of shared ancestry?

Simon_W said...

@ Lauχum

North African admixture may be unevenly distributed in Italy.

Gianmarco Ferri et al. 2007 found 4.1% E-M123 and 2.0% E-V65 in the native population of Rimini. The former is typical for Egypt and the southern Levant, the latter is typical for Libya. Together they make up 6.1% of this sample of 98 unrelated males from Rimini.
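As a rough check on what these percentages mean with only 98 males, here is a minimal Python sketch. The counts (4 and 2) are not given in the comment; they are inferred from 4.1% and 2.0% of 98, and the Wilson score interval is just one common way to show how wide the uncertainty is at this sample size.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

n = 98  # unrelated males in the Rimini sample
# Assumption: counts inferred from the quoted 4.1% and 2.0% of 98
counts = {"E-M123": 4, "E-V65": 2}

for hg, k in counts.items():
    low, high = wilson_interval(k, n)
    print(f"{hg}: {k}/{n} = {100 * k / n:.1f}% "
          f"(95% CI {100 * low:.1f}-{100 * high:.1f}%)")

total = sum(counts.values())
low, high = wilson_interval(total, n)
print(f"combined: {total}/{n} = {100 * total / n:.1f}% "
      f"(95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

With so few carriers the interval around the combined figure is wide, so the 6.1% is best read as an order-of-magnitude indication rather than a precise frequency.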

And Alessio Boattini et al. 2013 found 3.4% Maghreb-specific E-M81 in their Bolognese sample. (Moreover they found 6.9% J1e-P58 there.)

According to my nMonte run, my Italian ancestry is 6.2% North African - that's of course not an average, but may be at the upper end of the individual variation. I'd say given the above yDNA results it's in the ballpark.

I could also cite Busby et al. 2015 according to which there is some North African admixture even in Tuscans and North Italians.

It's tricky to prove or refute this with modern genomic data alone, because Levantines and North Africans are likely to have changed because of the Arabic expansion and subsequent admixture with Sub-Saharan Africans. Hence Lebanese and Syrian Muslims differ from Druze and Lebanese Christians. So the relative paucity of Sub-Saharan admixture in Italy doesn't necessarily disprove ancient North African admixture. Not to forget that ADMIXTURE analyses are not a reliable tool for estimating the proportion of minor admixtures.

As for the isotopic evidence you cited: Maybe the few graveyards just aren't enough. We haven't seen a lot of autosomal DNA from the Roman empire, yet we've already caught the Bedouin-like outlier from Roman Age Britain. And there is also historical and archaeological evidence about migration from the Near East not always and everywhere being that negligible, like that quote from Juvenal about a man complaining about the Syrian Orontes emptying itself into the Tiber or the spread of oriental cults in the Romagna in the later Roman Age.

And the genomic evidence we've seen from pre-Roman Italy seems to suggest that the Etruscans were rather North Italian-like, i.e. less West Asian shifted than modern Tuscans.

Simon_W said...

@ Anthro

I haven't tried Samaritan, but actually it's a good idea! I'm going to do this later.

And regarding Slavic admixture in East Germans: Of course East Prussians don't equal East Germans, East Prussia was just the northeasternmost part of Germany, and the only one with a Baltic, rather than Slavic substratum. So all other parts of former and present-day eastern Germany undoubtedly have higher Slavic admixture and zilch Baltic admixture.

Anthro Survey said...

@Lauxum

The studies were inherently limited since isotope ratios in the teeth don't have pinpoint accuracy. Secondly, their imagination and historical knowledge were limited, as they mainly considered North Africa to be a potential source of extra-European ancestry (using isotope ratios associated with the region as references). Juvenal never said Carthage was flowing into Rome, but that the Orontes was flowing into the Tiber. Roman Syria was a pivotal and much underappreciated region. Antioch was more important than any European city aside from Rome itself. Not to mention Roman-age Western Anatolia.

So, I actually agree with you---Roman-age North African ancestry shouldn't be significant in mainland Italy. We need to look east-ward, though.

"So attributing this West-Asian shift in Italy to Roman and post Roman era migration seems to be inaccurate."
One word is missing: "....to EXCLUSIVELY Roman and post Roman....".
Yeah, no question that a West_Asian shift in Italy and Balkans predated Roman times due to Bronze Age movements emanating from the eastern Aegean. The question is to what degree Roman-age movements augmented it.

Anthro Survey said...

@Vara

"This is just mere assumptions. PIE could've been spoken in the Caucasus while a Caucasian language could've been spoken in Anatolia, and the exchanges could've happened there."

Not far-fetched at all, but you have to admit that steppe admixture is significantly associated---however loosely----with the spread of IE languages. So, perhaps it was spoken by the CHG ancestors of early steppe (language later replaced by that of Uruk/Ubaid-related migrants) but steppe-derived/admixed people ultimately disseminated it.

Which Anatolia do you mean? I would argue that C/Western Anatolia wasn't the source of present-day Caucasian languages, but Jazira and Eastern Anatolia. The former was a sink in a Middle Eastern Neolithic and Chalcolithic context, while the latter a source.

Anthro Survey said...

@Rob and East Pole

IF the migration went ~Ukraine---Balkans---Greece, then early Greeks would have sat somewhere on the cline between early steppe and HG-poor EEF. Maybe not as steppic as CWC, though. In any case, I don't see the archaeological connection to CWC. We won't see any R1a in future Mycenaean graves, but rather R1b-z2103.

Rob said...

@ Anthro
The Caucasus probably had changes after Eneolithic / Majkop just as other parts of Eurasia. So modern NW Caucasian could be a later arrival.
I doubt a heterogeneous phenomenon like Meshoko-Majkop-Novosvobodnaya spoke a unitary language.

@ MOZ
I’m a unionist, Slav-leaning Macedonian from Pelagonia

Simon_W said...

Holy shit, I've added Samaritans as suggested by Anthro Survey, and the Mozabites melted away almost completely; again with my inferred maternal coordinates:

"French_East" 55.25
"French_South" 27.8
"Samaritan" 12.25
"Minoan_Lasithi" 4.1
"Mozabite" 0.6
"Cypriot" 0
"Hungary_BA:I1504" 0
"Remedello_BA:RISE489" 0
"Mycenaean" 0
"Anatolia_BA" 0
"Anatolia_ChL" 0
"Levant_BA" 0
"England_Roman_outlier:3DT26" 0
"Druze" 0

"distance%=1.6602

Keep in mind it's only half Italian, so most of that French_East is from the non-Italian side.

Anthro Survey said...

@Rob

Yeah, a combo of upper Mesopotamian and native Caucasian origins for the different languages is plausible.

Oh, I thought you were Balkan-leaning? That is, neither Slav-leaning, nor Thracian substrate oriented. I.E. This would entail solidarity with similarly Slav-influenced Romanians(& ofc Moldovans, Tosks and northern Greeks) who just don't happen to speak Slavic but with whom you share other elements of a Balkan cultural sprachbund.

@MomOfZoha

Your intuition was quite sound. You sensed the Anatolia_BA in Rob, so to speak. hehe

Anthro Survey said...

@Simon_W

Haha, true, but now it seems you ran into the same extricating issue I had when using Lebanese-Christians, because of the overlapping layers of ancestry of non-Muslim (and also any northern) Levantine groups with Anatolia_BA and Cyprus.

Do you think you also have minor Ashkenazi ancestry on your maternal side? I used to use Samaritans for them on G10(alongside Germans, Slav569, Tuscans and N. Italians).

Davidski said...

@Anthro Survey

We won't see any R1a in future Mycenaean graves, but rather R1b-z2103.

Consider this...

- there's already a Sintashta-like R1a-Z93 sample from Bulgaria dating to the pre-Mycenaean or early Mycenaean period

- Mycenaeans can be modeled as up to 20% Sintashta-like

- Mycenaean royal shaft graves contain horse cheek pieces similar to those found in Sintashta and other Steppe_MLBA graves

- the Sintashta-related horse chariot complex obviously had a huge impact on Mycenaean culture, so much so that some scholars have posited that the Mycenaeans were a royal clan from the MLBA Trans-Ural steppe.

And I'm pretty sure that when the Mathieson et al. dataset comes out, Peloponnese Neolithic 3000 BC and Bulgaria_MLBA (R1a-Z93) will provide more than a decent mixture model for the high status Mycenaean. Wanna bet?

Rob said...

@ Anthro
My clan are apparently northerners perhaps from the Balto-Slavic homeland ; so perhaps you’re more Paisan with Tosks?:)
Of course I have paleoBalkan heritage too, and given my knowledge of the region feel my duty to help shed light on a region complex for many westerners

Rob said...

@ Dave
The closest analogies of the cheek pieces in Greece are Hungarian ones

Davidski said...

@All

I've updated the post with a graphic showing all of the Global 25 dimensions (each paired with dimension 1).

Andy Warhol would be proud I reckon.

Davidski said...

@Rob

The closest analogies of the cheek pieces in Greece are Hungarian ones.

I've read different, and I'm pretty sure that the MLBA Z93 guy in Bulgaria wasn't from Hungary, but somewhere much further east.

Ariel said...

Simon_W

I'm getting much lower percentage on Samaritan in my nmonte for central italians.

Italian_Tuscan

Greek,52.6
French_South,18.8
Spanish_Cataluna,16.2
Cypriot,11.2
Samaritan,1.2

Even for Italian_South is lower

Greek,49.2
Cypriot,22.4
Samaritan,10.2
Spanish_Cataluna,9.8
French_South,6.6
Egyptian,1.8

MomOfZoha said...

@Anthro:
"@MomOfZoha

Your intuition was quite sound. You sensed the Anatolia_BA in Rob, so to speak. hehe"

Never found a reason to doubt my intuition, grazie. ;)

And, if indeed Rob-ski is of Balto-Slavic descent, then clearly the epigenetic effects of his ancestral Greek immersion have overshadowed his Balto-Slavic gene expression where social interactions are concerned.

He thinks Balto-Slavic logics, but with Rembetiko playing in the background.

Davidski said...

You guys modeling South Asians, have you had a chance to test out Ulan IV RISE552 as a steppe reference? This guy has a high level of ANE, despite his somewhat unusual Y-haplogroup (I2).

Seinundzeit said...

David,

Despite his inclusion, populations from southern Central Asia all the way to southern India still latch onto Srubnaya_outlier + Steppe_MLBA.

For example, the Kalasha:

37.6% Iran_N + 12.6% Armenia_ChL/EBA + 1.3% CHG
22.4% Srubnaya_outlier + 12.8% Steppe_MLBA
13.3% ASI

distance%=0.3179 / distance=0.003179

Until we see relevant aDNA, I will exercise "skepsis" (in the sense used by the ancient Greeks) with regard to the meaning of the "Srubnaya_outlier" percentages.
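For readers wondering what the "distance" in these nMonte outputs measures, here is a minimal Python sketch. It assumes the reported distance is the Euclidean distance between the target's coordinates and the weighted average of the source coordinates, and that distance% is simply that value times 100 (which matches the 0.3179 / 0.003179 pair above). The coordinates and weights below are made up for illustration and are not real Global 25 data.

```python
import math

def mixture_distance(target, sources, weights):
    """Euclidean distance between a target and a weighted mix of sources."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    dims = len(target)
    # Weighted average of the source coordinates, dimension by dimension
    mix = [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))

# Hypothetical 3-dimensional coordinates (real Global 25 rows have 25)
sources = [
    [0.10, 0.02, -0.01],  # hypothetical source population 1
    [0.02, 0.08, 0.03],   # hypothetical source population 2
]
weights = [0.75, 0.25]     # 75% / 25% mixture
target = [0.08, 0.04, 0.00]

d = mixture_distance(target, sources, weights)
print(f"distance%={100 * d:.4f} / distance={d:.6f}")
```

A lower distance only says the weighted mix sits close to the target in this coordinate space; it does not by itself show that the listed sources are the true ancestral populations.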

Simon_W said...

@ Ariel

Sure, I wouldn't claim that my above results can be taken to be typical for Italy as a whole. It's just a consistent pattern in my results, in practically all genetic tests, that I'm relatively low on CHG and comparatively high on Levant- and Natufian-related components. For that reason apparently Samaritans are a good match for my West Asian pull. But in part that may be just individual variation. One DNA relative of mine, also partly Romagnol, clearly has a stronger CHG component which makes him more similar to central Italians. That's the individual variation of the area. A general tendency of Cesena and surroundings seems to be a substantial South Italian-like pull, and with my Sicilian-like pull I'm roughly in line with this.

@ Anthro Survey

"Do you think you also have minor Ashkenazi ancestry on your maternal side?"

Apparently yes, but it must be very minor and hence negligible. I got a 0.1% Ashkenazi segment on 23andme and a few Jewish matches in precisely this segment. But I don't have lots of Jewish matches, unlike anyone with more relevant Ashkenazi ancestry.

Anthro Survey said...

@Davidski

Bulgaria sits at the SW tip of the Pontic-Caspian grasslands and, as such, has been subject to influences from it right into the historical period. Take the Bulgars, for ex, who barely made a demographic dent and didn't even impose their language despite constituting aristocratic circles.
So, I'm not surprised we've found one R-z93 there.

This finding and the chariot evidence influence by no means preclude the (more likely) scenario of another steppe-rich group with R-z2103(we've found z2103 Balkan samples, btw) ultimately being the primary movers in the modest demographic shift. In fact, R1a/Sintashta model doesn't work so well if we assume steppe-DNA carriers to have introduced proto-Greek because the former are suspected of being proto-Indo-Iranic speakers. Armenians are R-2103 heavy today, too, btw, and this fits rather nicely with the Greco-Armenian hypothesis.

qpAdm is lax on recent drift, as you've said yourself. So, sure, Sintashta works but Iran_Chl works for Yamnaya. Does it mean that Yamna were a product of this specific population? Ofc not and that's more reflective of CHG+EEF(or at least maybe just excess WHG-like ancestry from Ukraine sharing a clade with UHGs important for EEF).

Anthro Survey said...

@Ariel

Your model may be closer to the truth, but here, just like with my old models utilizing ancient samples, South_Italy does choose Samaritan to a greater degree than for those southern Tuscans in any given run using the same set of inputs---even in certain runs where Samaritan-like ancestry was likely spiked.

This is also true with formal stats and numerous admixture studies.

Davidski said...

@Anthro Survey

The finding of a Steppe_MLBA individual in the southern Balkans dating to the early Mycenaean period, and Steppe_MLBA-related admixture in Mycenaeans, fits rather well with this...

https://en.wikipedia.org/wiki/Graeco-Aryan

Of course, there may have been, and probably were, individuals belonging to R1b-Z2103 lineages amongst these steppe migrants to Greece, because R1b-Z2103 is actually pretty common in Indo-Iranian-speakers rich in steppe ancestry, but it's hard to imagine that R1a-Z93 won't be found in early Mycenaean remains from burials with Steppe_MLBA-like grave goods and chariots. I mean, chariots, think about it.

Anthro Survey said...

@Rob

Well, not surprised that you've got direct patrilinear connections to the original clan leaders(be it I2a-din or R1a-z282). Half of any given Balkaners with a heavy chunk of Slavic ancestry do. Yet, given that over 1000 years has passed, there's been not only demographic but also cultural homogenization. I2a guys from the same ethnicity aren't more Slavic than E-v13 ones today, on average. So, all that being said, I treat the Balkans as its own deal, not an extension---Slavic speakers or otherwise.

Tosks are in your spectrum, though, and many are rather Slavophilic even if some suppress it, lol. Old school Tosks are unionists, as well.

I haven't seen autosomal comparisons between Tosks and Ghegs, but considering Ohrid's grip being so tight in the former's territory, higher R1a and I2a there today, and the former bearing MORE resemblance to "proper" Slavs (imho), it wouldn't surprise me in the slightest to see them less "Greek-shifted".

Rob said...

@ Dave

Yes, that Bulgarian Z93 could be an Iranic migrant from the Mnogovalikovaya culture IMO.
But acc. to Kristiansen, Drews etc. the chariots were mediated to Greece via the Carpathian Basin.
And my hunch based on archaeology is a series of shifts beginning as early as 3700 BC.
Of course we could find Z93 in Mycenaean shafts; fine, but given that Greeks do have decent amounts of Z2103 and rare Z93, I'd think the weight of evidence favours something more akin to what Anthro and I envisage.
Indeed, from your G25 data, the models favour Yamnaya over any MBA.

@ Anthro /MOH
Thanks for the feedback !

Anthro Survey said...

@Davidski

Which Indo-Iranian populations do you have in mind? Pamiris and Pashtuns---peoples with some of the highest proportions of MLBA ancestry among Indo-Iranics---are almost exclusively R1a-z93 in their R1. Same with Baloch, Punjabi, and so on. Western Iran may have more Z2103 because it was likely at an intersection of IEs(non-Iranic) crossing the Caucasus and Iranics arriving from BMAC area.

There is no guarantee even in those types of burials. I mean, who's to say they didn't (mostly)copy the design(s)? Chariotry, in general, is believed to have arisen in the BA steppe and popularized by steppe folk, spreading as far afield as China and Egypt. Now, because Greece was sequentially low in the chain of transmission, we can expect more striking resemblances to original designs.

Rob said...

@ Anthro
What I think we need to find out is when it happened- immediately or over the Byzantine centuries .
Eg old skool Anthos noted south Slavs were “Nordid” until the 13th c, when Brachycephalization set in ; but that could be due to ?urbanization (<-> intermixture and cosmopolitanism)

Davidski said...

@Rob

Different archaeologists say different things on the matter. But strong links between Mycenaeans and Bronze Age peoples of the Trans-Ural steppes have been posited in literature for a long time.

And if you try modeling the high status Mycenaean, you'll see that her best Neolithic reference is the singleton Peloponnese Neolithic sample. The rest of the model is mostly made up of Bronze Age Armenians and Potapovka.

This makes sense, because it's the best we got right now, but there are more, much younger Peloponnese Neolithic samples on the way, with higher levels of Caucasus admixture. And we also have that Z93 Bulgaria_MLBA coming.

So the final, very best fitting model for this Mycenaean could turn out something like 80% Peloponnese Neolithic from 3000 BC and 20% Bulgaria_MLBA (with nothing from Armenia). If so, I'll take this as a very strong hint of what to expect from the remains in the Mycenaean shaft graves.

@Anthro Survey

There's plenty of R1b-Z2103 among Tajiks and Pathans. It's even found in northern India.

Rob said...

Ok I’ll have a look at it too. Which one was the high status female ?

Anthro Survey said...

@Rob

This was said about old South Slavs or medieval Slavs to the north?
Since the phenotypic diversity of Slavic migrants (ultimately a subset) was reduced, perhaps they were mostly "Nordid", and maybe the basic cranial proportions of these "Nordid" Slavs and "dolicho-Med" paleo-Balkaners were relatively similar (as old schoolers would oft suggest). Could also be predominance of certain S. Slavic clans over others over time.

Anthro Survey said...

@Davidski

I haven't seen that and how plenty is plenty?
Sure, I can envision some early proto-"Greco-Aryan" population initially rich in both R1a-z93 and R1b or just rich in R1b(before picking up R-z93), undergoing intense but imperfect drift w/rt to Hgs shortly before the split of Indo-Iranian and Greco-Armenian. So, again, R1a prolly wasn't huge in the Hellenizers and the Bulgarian burial could have been from an unrelated Iranian(?) culture with little-to-no memory of the split.

Rob said...

@ Anthro

Yep, moreso west SS. Anthropological and mythological links to Bohemia and Sth Poland.

EastPole said...

@Anthro Survey
“Sure, I can envision some early proto-"Greco-Aryan" population initially rich in both R1a-z93 and R1b”

Could you tell us where you can see this population on the diagram:

https://s7.postimg.org/jhiotcuop/screenshot_308.png

Matt said...

@David, nice graphics. Hopefully this will help people think about what the clines and differentiating dimensions mean and why particular populations are favoured in nMonte, which is pretty important, even though we can only tell so much by the naked eye.

Davidski: And if you try modeling the high status Mycenaean, you'll see that her best Neolithic reference is the singleton Peloponnese Neolithic sample. The rest of the model is mostly made up of Bronze Age Armenians and Potapovka.

Though in my experience, in G25, even if you use Minoans at the moment (as a proxy for what later Peloponnese Neolithic may have been like), to get to the Mycenaeans you still need excess admixture from both CHG and Levant_N.

While Minoans / Greek_Peloponnese_N don't seem to have any Levant_N ancestry (Minoans look pretty Anatolia+CHG, while GPN looks straight Anatolia like)... So seems more like Tepecik-Ciftlik/Anatolia_Chl/BA like migrations plus some migration from Bulgaria/Steppe/etc (if assuming any continuity from early Greek Neolithic).

I guess migrations could've taken whatever sequence to the limits of our resolution at the moment? (Whatever form, it seems consistent with the basic picture of interactions between an Indo-European and East Mediterranean myth/culture complex in early Greeks?)

James Goblin said...

Hey Davidski,

what do you think of this: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3971580/

Matt said...

Btw, Davidski, this (from Khan's twitter feed) is interesting - https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1393-5

R-V88 expansion star-like, recent (Bronze Age dates, roughly) while basal diversity of R-V88 in Africa more like Neolithic period.

(I only care about high-coverage resequencing of the y on large populations lol).

R types look star-like everywhere? Even when R-V88 is long separated from today's modal European R1b clade?

Other branches show expansion, look slightly less star-like, can't find an indicator of their "star-like index" parameter for other clades, unfort.

However, "The large majority of nodes joining northern and sub-Saharan patrilineages date back to the Green Sahara period. On the contrary, most clades geographically restricted to one of these two macro-regions coalesced after 5 kya", so for groups specific to North Africa or SSA, post "Bronze Age" expansions.

Ryan said...

@David - "there's already a Sintashta-like R1a-Z93 sample from Bulgaria dating to the pre-Mycenaean or early Mycenaean period"

Do you think we'll find any R1b-M269 other than Z2103?

Lauχum said...

@ Anthro
In regards to Juvenal's account of a man saying that "the Orontes was emptying itself into the Tiber". Historical records are known to be exaggerated and sometimes completely wrong. Like how the Byzantines described the Slavs as red haired and ruddy/swarthy when this is clearly not the case. It could also be describing a temporary presence, such as merchants/traders.

There is historical evidence with the Isotopic evidence however which supports minor Levantine influx during the Roman era. David Noy in his book "Foreigners at Rome: Citizens and Strangers" writes that:
a) "foreigners who forgot or deliberately abandoned their original language would still retain their original names (unless they were replaced by Latin or Greek ones, which was often not the case among slaves and ex-slaves). Kajanto (1980, 85) counted only 530 slaves/ex-slaves with local (i.e. not Greek or Latin) cognomina, out of a total of 25,000 slave names."
b) "local names from the eastern provinces) were very rare for slaves at Rome: only 1.9% in a total of 26,300 Vernae."
c) "in a sample of pagan inscriptions from Rome, Kajanto found that the cognomina were about 41.5% Latin, 56% Greek and only 2.5% local; the local (including Semitic) proportion increases to 3.5% in Christian inscriptions, where Latin names outnumber Greek by two to one."

But aside from this yes it does appear that we are in agreement. Hopefully Iron Age samples from Italy will be released relatively soon to end the speculation.

Davidski said...

@Rob

Mycenaean I9033.

@Matt

I'm waiting for the Peloponnese Neolithic samples from 3,000 BC before I again model the Mycenaeans and Minoans. They appear to be quite different from the Peloponnese Neolithic sample from 5000 BC, so they might be the key here.

@James Goblin

I don't have an opinion about that paper because it's based on modern-day DNA.

@Ryan

I don't know. Btw, if you send me your data files again I can give you the Global 25 coords.

Kanishka said...

@Seinundzeit What are you using for your ASI proxy? It looks highly accurate? Could you give me some tips? Thanks.

Lauχum said...

@ david
Anything on the South Asian paper btw? They really seem to be dragging it out.

Davidski said...

@Lauχum

Apparently, the paper was basically ready at Broad MIT/Harvard months ago, and I can tell you for a fact that it wasn't backing Out-of-India, and that's something of an understatement. Then some Indian scientists were asked to collaborate, and here we are, still waiting.

So I don't know what happened. Best guess, and this is pure speculation on my part, rather than any inside info, is that at least some of the Indians weren't exactly happy with the findings in the paper, hence the delay.

Lauχum said...

Infuriating how they seem to be holding this up over a politically motivated theory which has been dead for years, but thanks for the info.

Onur Dincer said...

@Davidski

There is indeed a problem with the Kayseri Turks in this analysis. I've removed them from the datasheet.

Thanks. When do you think your datasheet takes its final form for your Global 25 analysis?

Davidski said...

@Onur

Chetan said...

@Davidski Any news about the Indian paper David?

Davidski said...

@Chetan

See my second to last post above.

Chetan said...

@David I guess there's not much point in keeping my fingers crossed now. This is going to take a while.

Onur Dincer said...

@Davidski

I'll send them pretty soon. In the meantime I'd like to hear your thoughts on this new paper regarding Dystruct, a new type of model-based genetic analysis that is a candidate to replace ADMIXTURE:

https://www.biorxiv.org/content/early/2018/02/07/261131

It takes into account genetic drift and population history and in analyses involving both ancients and moderns it usually represents modern samples as mixes of ancient samples rather than the other way around.

Eren said...

@Onur: Nice find, I've only read the abstract yet, but Dystruct sounds very interesting!

Davidski said...

@Onur

I can't really comment on the Dystruct algorithm until I've used it, but the analysis in the preprint isn't overly impressive. I'd call it promising.

Onur Dincer said...

How realistic is it to model Yamnaya as almost totally Kostenki14? That is what Dystruct does.

Davidski said...

@Onur

How realistic is it to model Yamnaya as almost totally Kostenki14? That is what Dystruct does.

It's total BS, not to put a finer point on it.

Matt said...

@Davidski, makes sense. They will have to be more Levant+CHG than Minoans to actually model as Steppe_MLBA/Europe_LNBA+them is all.

@Onur, interesting to bring this algorithm to our attention.

Can't comment on the mathematics or modeling assumptions; it looks like their algorithm generally yields less error on their simulated populations, with their assumptions is about all (which does tend to happen with papers offering alternative proposals to ADMIXTURE ;) ).

For the *real* ancient populations though, finding Steppe_EMBA as an unadmixed population and Corded_Ware_Germany as purely admixed between Euro_HG and Steppe_EMBA hardly seems like a clear cut better ordering of things. I suppose this could be improved by the inclusion of Kotias (and more generally using an up to date dataset), but it still gives rise to questions about the assumptions e.g. the ADMIXTURE models that presented Steppe_EMBA as admixed and presented Corded_Ware_Germany as having EEF admixture did prove to actually be telling us something.

Modeling also presents Basques and Sardinians as lacking Steppe admixture and Caucasus populations as extensively Steppe admixed, which would've been a misleading inference had Dystruct existed at the time... (Though it's not like ADMIXTURE detected the CHG fraction here...).

Matt said...

Off topic post here, Davidski, if you're still working with the Ancient67 PCA data, this may be of interest.

I was wondering about the chunkcount data from Martiniano's paper which provides the Ancient67 PCA data. Specifically whether the extra dimensions on the Ancient67 PCA basically make the patterns of relatedness which that shows superfluous. Question being, with the extra dimensions, is the information about chunk sharing redundant (tells us nothing much that the Ancient67 PCA doesn't).

So what I did was matched the distances from the Ancient67 PCA World run (scaled) to the chunkcounts from the Martiniano paper.

First matched by sample id: https://imgur.com/a/hyv2J for a few examples. (Only some of the sample IDs matched because the run contained different individuals from many of the populations to what M et al had used).

Correlations are pretty good (0.57-0.73), and there is the expected relationship (i.e. more chunks sharing correlates with lower population distance in the PCA), but there is a lot of random within population variation in chunk sharing, so you'd need a lot of ancients from the same population to say anything about individual relationships to a population via chunk sharing.

Then aggregated the above sample id matches with an average: https://imgur.com/a/3UL1S

That improves correlation so that they are in the range of about 0.75-0.9. Much better correlation with chunk sharing on a pop level than individual.
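The effect described above (noisy per-individual chunk sharing, cleaner population averages) can be illustrated with a small Python sketch. The rows below are invented for illustration and are not taken from the Ancient67 PCA or from Martiniano et al.; the Pearson function is a plain textbook implementation, not whatever tool was actually used here.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical rows: (population, PCA distance to an ancient, chunk count)
rows = [
    ("PopA", 0.02, 95), ("PopA", 0.02, 80), ("PopA", 0.03, 92),
    ("PopB", 0.05, 70), ("PopB", 0.05, 88), ("PopB", 0.06, 66),
    ("PopC", 0.09, 60), ("PopC", 0.10, 45), ("PopC", 0.09, 62),
]

# Correlation across individuals
r_ind = pearson([r[1] for r in rows], [r[2] for r in rows])

# Average within each population, then correlate the averages
by_pop = defaultdict(list)
for pop, dist, chunks in rows:
    by_pop[pop].append((dist, chunks))
avg_dist = [sum(d for d, _ in v) / len(v) for v in by_pop.values()]
avg_chunks = [sum(c for _, c in v) / len(v) for v in by_pop.values()]
r_pop = pearson(avg_dist, avg_chunks)

print(f"individual-level r = {r_ind:.2f}")
print(f"population-level r = {r_pop:.2f}")
```

Both correlations come out negative (more shared chunks, smaller distance), and averaging removes much of the within-population scatter, which is why the population-level figure is stronger in magnitude.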

Including some other population averages (note these are not from matching sample IDs though): https://imgur.com/a/vMkYn

Correlations do drop down again to middling levels. I'm not sure if this is because of differences between samples Martiniano use for the populations and Ancient67 or because of real differences between chunkcount and Ancient67 PCA distance.

Still, overall, having looked at it, I do think there are some things that the chunkcounts show that aren't quite in the Ancient67 PCA data - particularly things like, 1) Rathlin Bronze Age are still closer to Scots by chunkcounts (by quite a bit) than the distance from Rathlin Bronze Age in Ancient67 world PCA shows, 2) Western Europeans still seem to show some link to the Iberian and West European (Irish) Neolithic in chunkcounts and Southeastern Europeans to Anatolian Neolithic / LBK that is greater than the distances from Ancient67 PCA show, 3) Slav_Bohemia RISE569 is more decisively linked to Baltic Slavic populations in chunks than in Ancient67 PCA distance.

So I'd say there's still a good use for the chunkcount and haplotype data even with good high quality PCA like Ancient67 PCA that manage to capture a lot of fine structure.

Davidski said...

@Matt

Chunk counts are definitely very useful, but they can also be misleading to some degree when used to cluster ancient samples. From memory, Sweden_MN clustered with Portugal_MLBA in that paper, which seems highly unusual, considering that Portugal_MLBA has about 15% Yamnaya-related steppe input, while the only eastern input in Sweden_MN is possibly a trace amount of EHG/SHG. So it seems like they clustered together because they had similar ratios of Hunter-Gatherer and Near Eastern (basal-rich) chunk counts, which is pretty crazy if true.

MomOfZoha said...

Thanks for my family PCA25 coordinates, Davidski.

I just ran my dad via the spreadsheet of all samples. Keep in mind that this is my VERY FIRST nMonte run -- thanks, @Huijbregts.

With eigenvalue scaling, here we go with a distance of 1.8706:

Turkish,51.6
Sephardi_Jew,15.2
Assyrian,4.2
Cypriot,3.2
Druze,1.8
Azeri,1.6
Cherkes,1.4
Karachay,1.4
Lebanese_Christian,1.4
Barcin_N,1.2
Georgian_Jew,1.2
Mycenaean,1.2
Chechen,1
Kumyk,0.8
Anatolia_BA,0.6
BedouinB,0.6
Koros_EN,0.6
Kurdish,0.6
Lebanese_Druze,0.6
Levant_BA,0.6
ALPc_MN,0.4
Balaton_Lasinja_CA,0.4
Georgian_Laz,0.4
Iranian_Fars,0.4
Kabardin,0.4
LBK_EN,0.4
Macedonian,0.4
Saudi,0.4
Abkhasian,0.2
Albanian,0.2
Armenia_MLBA,0.2
Azeri_Dagestan,0.2
Bell_Beaker_Germany,0.2
Eskimo_Naukan,0.2
Georgian_Imer,0.2
Greece_Peloponnese_N,0.2
Greek,0.2
Iran_IA,0.2
Iranian_Jew,0.2
Japanese,0.2
Khakass,0.2
Khanty,0.2
LBKT_MN,0.2
Montenegrin,0.2
Nasoi,0.2
Russian_Orel,0.2
Spanish_Andalucia,0.2
Starcevo_EN,0.2
Swedish,0.2
Tabasaran,0.2
Tajik_Shugnan,0.2
Tatar,0.2
Yemenite_Jew,0.2

Without the scaling, the distance is 1.0518 as follows:

Turkish,37.2
Assyrian,24.2
Cypriot,6.4
Lebanese_Druze,6.4
Druze,6
Italian_Tuscan,2.2
Chechen,1.6
Iran_IA,1.6
Sephardi_Jew,1.2
Karachay,1
Albanian,0.8
Greek,0.8
Mycenaean,0.8
Barcin_N,0.6
Kumyk,0.6
LBK_EN,0.6
Lebanese_Muslim,0.6
Scottish,0.6
Georgian_Imer,0.4
Hazara,0.4
Levant_BA,0.4
Macedonian,0.4
Minoan_Lasithi,0.4
Montenegrin,0.4
Tabasaran,0.4
Tuvinian,0.4
Ulchi,0.4
ALPc_MN,0.2
Anatolia_BA,0.2
BedouinB,0.2
Cherkes,0.2
England_IA,0.2
Georgian_Jew,0.2
Hazara_Afghanistan,0.2
Italian_Bergamo,0.2
Japanese,0.2
Kazakh,0.2
Mongolian,0.2
Poltavka,0.2
Scythian_ZevakinoChilikta,0.2
Slovakian,0.2
Uygur,0.2

@Onur: Can you spell I-S-A-U-R-I-A-N? :P

MomOfZoha said...

And now my dear mama, getting distance 0.5457 with unscaled run as so:

Turkish,56.6
Georgian_Jew,3.2
Greek,3
Kabardin,2.6
Azeri,2.4
Lebanese_Druze,2.2
Assyrian,2
Sephardi_Jew,1.6
Armenia_ChL,1.4
Cherkes,1.4
Iranian_Zoroastrian,1.4
Lebanese_Muslim,1.4
Druze,1
Kumyk,1
Uzbek,0.8
Albanian,0.6
Ashkenazi_Jew,0.6
Azeri_Dagestan,0.6
Cypriot,0.6
Han,0.6
Iranian_Fars,0.6
Iranian_Persian,0.6
Karachay,0.6
Maltese,0.6
Samaritan,0.6
Yakut,0.6
Algerian,0.4
Karakalpak,0.4
Kazakh,0.4
Lebanese_Christian,0.4
Moldovan,0.4
Palestinian,0.4
Tabasaran,0.4
Tunisian_Jew,0.4
Turkmen,0.4
Uygur,0.4
Abkhasian,0.2
Altai_IA,0.2
Armenia_MLBA,0.2
Belarusian,0.2
Bosnian,0.2
Bulgarian,0.2
Chechen,0.2
Colla,0.2
Croatian,0.2
Czech,0.2
Daur,0.2
Egyptian,0.2
Georgian_Imer,0.2
Georgian_Laz,0.2
Hazara,0.2
Hungarian,0.2
Iranian_Lor,0.2
Iraqi_Jew,0.2
Kusunda,0.2
Libyan_Jew,0.2
Miao,0.2
Mongolian,0.2
Moroccan_Jew,0.2
Naxi,0.2
Polish,0.2
Russian_Voronez,0.2
Sakha,0.2
Saudi,0.2
Serbian,0.2
Slovakian,0.2
Tajik_Ishkashim,0.2
Tlingit,0.2
Tu,0.2
Tujia,0.2
Tuvinian,0.2
Xibo,0.2
Zapotec,0.2

With a scaled run, my mama gets distance 1.2345 (hah! WHAT a distance!) like so:

Turkish,61.2
Sephardi_Jew,9.6
Kabardin,4
Azeri,2.4
Cherkes,1.6
Assyrian,1.2
Karachay,1.2
Kumyk,1
Lebanese_Christian,0.8
Azeri_Dagestan,0.6
Druze,0.6
Georgian_Jew,0.6
Iranian_Fars,0.6
Iranian_Zoroastrian,0.6
Abkhasian,0.4
Armenia_MLBA,0.4
Balkar,0.4
Barcin_N,0.4
Cypriot,0.4
Han,0.4
Iraqi_Jew,0.4
Kinh_Vietnam,0.4
Kirghiz,0.4
LBKT_MN,0.4
Romanian,0.4
TDLN,0.4
Ukrainian,0.4
Yemenite_Jew,0.4
Zapotec,0.4
ALPc_MN,0.2
Amerindian_North,0.2
Anatolia_ChL,0.2
Armenia_ChL,0.2
Avar,0.2
BedouinB,0.2
Bulgarian,0.2
Chechen,0.2
Chuvash,0.2
Colla,0.2
Czech,0.2
English,0.2
Eskimo,0.2
Estonian,0.2
Georgian_Laz,0.2
Iranian_Lor,0.2
Irish,0.2
Italian_Tuscan,0.2
Kazakh,0.2
Ket,0.2
Kurdish,0.2
LBK_EN,0.2
Lebanese_Druze,0.2
Lebanese_Muslim,0.2
Levant_BA,0.2
Libyan_Jew,0.2
Mentese_N,0.2
North_Ossetian,0.2
Protoboleraz_LCA,0.2
Samaritan,0.2
Serbian,0.2
Tajik_Shugnan,0.2
Tajik_Yagnobi,0.2
Tubalar,0.2
Tuvinian,0.2
Uygur,0.2
Vinca_MN,0.2

More Caucasus and Iran shifted than my papa... though I should probably remove both of their actual populations (Turkish) from consideration for further analysis...

MomOfZoha said...

Now to my dear Iranian Azeri father-in-law...

With scaling, a distance of 1.413 is obtained as so:

Iranian_Lor,23.6
Turkish,12.8
Sephardi_Jew,6.6
Assyrian,6
Azeri,6
Kurdish,5.2
Azeri_Dagestan,4.8
Iranian_Fars,4.8
Iranian_Zoroastrian,3.2
Georgian_Jew,3
Iranian_Persian,2.6
Armenia_EBA,2
Georgian_Imer,1.4
Georgian_Laz,1.4
Kumyk,1.4
Abkhasian,1.2
Druze,1
Iranian_Mazandarani,1
Armenia_MLBA,0.8
Iranian_Jew,0.8
Kabardin,0.8
Armenia_ChL,0.6
Avar,0.6
Cherkes,0.6
Lebanese_Christian,0.6
Lebanese_Druze,0.6
Palestinian,0.6
Barcin_N,0.4
Chechen,0.4
Iran_ChL,0.4
Karachay,0.4
Tabasaran,0.4
Altai_IA,0.2
Anatolia_BA,0.2
Ashkenazi_Jew,0.2
Balkar,0.2
CHG,0.2
Cypriot,0.2
Egyptian,0.2
Evenk,0.2
German,0.2
Greek,0.2
Hungarian,0.2
Iraqi_Jew,0.2
Libyan_Jew,0.2
Maltese,0.2
Montenegrin,0.2
Tajik_Yagnobi,0.2
Tisza_LN,0.2
Udmurt,0.2

And, without scaling, the following gives a distance of 0.7086:

Turkish,17
Assyrian,12.8
Iranian_Lor,10.8
Georgian_Jew,10.4
Iranian_Zoroastrian,8.2
Iranian_Fars,7.8
Iranian_Mazandarani,5.8
Sephardi_Jew,4
Georgian_Laz,3.2
Armenia_ChL,2.4
Armenia_MLBA,2.4
Azeri,2
Azeri_Dagestan,1.8
Kumyk,1.6
Iranian_Jew,1.4
Armenia_EBA,1
Druze,0.8
Greek,0.8
Kurdish,0.8
Chechen,0.6
Georgian_Imer,0.4
Kabardin,0.4
Karachay,0.4
Lebanese_Christian,0.4
Tabasaran,0.4
Abkhasian,0.2
Australian,0.2
Balkar,0.2
Burmese,0.2
CWC_Baltic_early,0.2
Eskimo_Sireniki,0.2
Iranian_Persian,0.2
Nganassan,0.2
Romanian,0.2
Sakha,0.2
Sicilian_East,0.2
Yakut,0.2

Hmmm... Are the only Armenians in the spreadsheet ancients?

Kanishka said...

@Onur Dincer There's got to be something seriously wrong with that calculator if it's modelling most of the Yamnaya samples as 100% Kostenki.

Nirjhar007 said...

This is going to take a while.

Don't think so bud...

MomOfZoha said...

Re-running unscaled on ancients + few moderns (Zoroastrians, Samaritans, and few Siberians) only in source pops:

"distance%=1.3484"

MoZ_Father

Iran_IA,41
Mycenaean,11.2
Armenia_EBA,6.2
Anatolia_BA,5
Iranian_Zoroastrian,4.8
Levant_BA,4.4
Barcin_N,3.6
England_Roman_outlier,2.4
Minoan_Lasithi,2.2
LBK_EN,1.8
Tepecik_Ciftlik_N,1.6
Altaian,1.4
Anatolia_ChL,1.4
Armenia_MLBA,1.4
Tuvinian,1
LBKT_MN,0.8
TDLN,0.8
ALPc_MN,0.6
England_IA,0.6
Poltavka,0.6
Sarmatian_Pokrovka,0.6
Srubnaya,0.6
Tajik_Shugnan,0.6
Yamnaya_Samara,0.6
Afanasievo,0.4
England_Roman,0.4
Karasuk,0.4
Scythian_Pazyryk,0.4
Yukagir_Tundra,0.4
Altai_IA,0.2
Bell_Beaker_Germany,0.2
Boncuklu_N,0.2
England_Anglo-Saxon,0.2
Eskimo,0.2
Scythian_ZevakinoChilikta,0.2
Sintashta,0.2
Slav_Bohemia,0.2
Tajik_Ishkashim,0.2
Tajik_Rushan,0.2
Tajik_Yagnobi,0.2
Tianyuan,0.2
Yamnaya_Kalmykia,0.2

Mom:

"distance%=0.9604"

MoZ_Mother

Armenia_ChL,28.8
Iranian_Zoroastrian,23.8
Samaritan,8.8
Armenia_EBA,7
Armenia_MLBA,6
Barcin_N,2.4
Anatolia_BA,1.8
Levant_BA,1.4
Minoan_Lasithi,1.4
Tajik_Rushan,1.4
TDLN,1
Tuvinian,1
Yakut,1
LBK_EN,0.8
Protoboleraz_LCA,0.8
Scythian_ZevakinoChilikta,0.8
Altaian,0.6
Hungary_BA,0.6
Hungary_IA,0.6
Iran_IA,0.6
Levant_N,0.6
Scythian_Pazyryk,0.6
Slav_Bohemia,0.6
CWC_Baltic_early,0.4
CWC_Germany,0.4
Greece_N,0.4
Mycenaean,0.4
Nordic_IA,0.4
Tisza_LN,0.4
ALPc_MN,0.2
Battle_Axe_Sweden,0.2
Bell_Beaker_Germany,0.2
Boncuklu_N,0.2
Chukchi,0.2
CWC_Baltic,0.2
Dai,0.2
England_Anglo-Saxon,0.2
England_Roman_outlier,0.2
Eskimo_Naukan,0.2
Kennewick,0.2
Ket,0.2
Koryak,0.2
Sarmatian_Pokrovka,0.2
Scythian_AldyBel,0.2
Sintashta,0.2
Srubnaya_outlier,0.2
Tajik_Ishkashim,0.2
Tajik_Shugnan,0.2
Tepecik_Ciftlik_N,0.2
Tianyuan,0.2
Yamnaya_Kalmykia,0.2
Yamnaya_Samara,0.2
Yoruba,0.2
Yukagir_Tundra,0.2

Father-in-Law:

"distance%=0.8711"

MoZ_FatherInLaw

Iranian_Zoroastrian,46.6
Armenia_ChL,12
Armenia_MLBA,12
Armenia_EBA,11.4
Samaritan,3.4
Iran_IA,3
Iran_ChL,2.6
Anatolia_BA,2
Minoan_Lasithi,2
Yakut,1
Barcin_N,0.8
Greece_N,0.6
LBK_EN,0.6
Anatolia_ChL,0.4
Altaian,0.2
Eskimo,0.2
Evenk,0.2
Levant_BA,0.2
Protoboleraz_LCA,0.2
TDLN,0.2
Tuvinian,0.2

Looks like this will be tweaked and tweaked and tweaked for meaning to emerge...

Davidski said...

@MomOfZoha

Those distances are extremely low because your models are overfitted.

The strategy you used is obviously the same as my own in the Yamnaya models, but it's usually not a useful way to model most modern-day or even ancient people. That's because the algorithm will pick several, or, like in your examples, many reference samples as mixture sources just to minimize the distance, and hence overfit the model.

A better strategy is to be prudent with your choice of reference samples and focus on distinct streams of ancestry, each either represented by single ethnic groups, regional populations (and you can make your own to suit your needs by combining different sample sets in the datasheet) or ancients.
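As a rough illustration of the "make your own regional population" idea, here is a minimal Python sketch that averages several per-sample coordinate rows into one custom reference row. The function name, the group names and the 3-dimensional toy coordinates are all assumptions made for the example; real Global 25 rows have 25 values and would be read from the datasheet.

```python
from statistics import fmean


def average_population(samples):
    """Average per-sample coordinate rows into a single population row."""
    if not samples:
        raise ValueError("need at least one sample")
    dims = len(samples[0])
    if any(len(row) != dims for row in samples):
        raise ValueError("all samples must have the same number of dimensions")
    return [fmean(row[i] for row in samples) for i in range(dims)]


# Hypothetical toy coordinates (3 dimensions instead of 25).
group_a = [
    [0.10, -0.02, 0.03],
    [0.12, -0.04, 0.01],
]
group_b = [
    [0.08, 0.06, -0.01],
]

# Pool the two sample sets and average them into one custom reference.
custom_region = average_population(group_a + group_b)
print(custom_region)
```

Note that pooling like this weights every sample equally, so a group with more samples pulls the average towards itself.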

Of course, this is a lot more time consuming than simply casting a very wide net, but having a model when you're modeling is usually the only way to come up with coherent and useful results.

MomOfZoha said...

@David:

I agree completely! Thirty possible source populations the majority with < 1% contribution to ancestry is not at all useful!

But, hell, I'm using your non-averaged pop sheet which is LARGE -- mainly because I'm not convinced that averaging a population always makes sense. Even if I used the averaged pop sheet, it's still not a small number of things. Instead of approaching it by adding little by little to the reference populations, I was pruning those spreadsheets, which takes time.

I have a problem with a priori limiting the reference populations too severely too since that strikes of tautology to me, although sure it is fun if there is time.

At the end of the day, unless that too is done methodically (scriptable), then it is no more useful than these "overfitted" things.

However, the "overfitted" output is not entirely useless either, since it does give some appropriate comparative results.

And, finally, a more interesting "tweak" to nMonte that we really desire would be simply to disallow multiple contributions of < some x% -- i.e. perhaps two input parameters k and x such that the output is not allowed to consider more than k reference populations that each have contributions less than x%. THAT is really what is desired instead of this alleged calculus of "overfitting". The inherent problem is not the over-fittedness of something but simply the use of too many populations with epsilon contributions (which tends to give rise to overfitting too).
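The k and x idea can at least be checked after the fact on any output. The following is a minimal Python sketch, not a change to nMonte itself: it only tests whether a finished model respects the rule "no more than k populations with less than x%". The function names are made up for the example, and the model is just the top rows of one of the runs posted earlier in this thread.

```python
def small_contributors(model, x):
    """Populations whose share is above zero but below x percent."""
    return [pop for pop, pct in model.items() if 0 < pct < x]


def satisfies_limit(model, k, x):
    """True if at most k populations contribute less than x percent."""
    return len(small_contributors(model, x)) <= k


# Top rows of a run posted earlier in the thread (percentages).
model = {
    "Iranian_Zoroastrian": 46.6,
    "Armenia_ChL": 12,
    "Armenia_MLBA": 12,
    "Armenia_EBA": 11.4,
    "Samaritan": 3.4,
    "Iran_IA": 3,
    "Iran_ChL": 2.6,
}

print(small_contributors(model, x=5))
print(satisfies_limit(model, k=2, x=5))
```

Actually enforcing the rule during the fit would need a change inside the optimiser, which this sketch does not attempt.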

Kanishka said...

@Davidski Isn't it so that the less distance the better? What if I use only ancient populations and get a distance below 1%? Is this good or bad, and what's the ideal distance to aim for? Thanks.

Davidski said...

@MomOfZoha

I have a problem with a priori limiting the reference populations too severely too since that strikes of tautology to me, although sure it is fun if there is time.

This is essentially a supervised test, so there might be all sorts of things wrong with the models being considered, but if the analysis is done objectively and thoroughly, with many different options being considered systematically, then it should be possible to find models that more or less match results from totally unsupervised methods.

Davidski said...

@Kanishka

Isn't it so that the less distance the better?

A really low distance usually means that the model is overfitted in some way. It's usually more realistic to aim for distances of around 4%/0.04 when using the scaled data, and a bit less when using the original data.
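To make the two notations concrete, here is a minimal Python sketch. It assumes the reported distance is the plain Euclidean distance between the target's coordinates and the weighted mix of the source coordinates, and that the percent figure is simply that distance times 100. The function name and the 2-dimensional toy coordinates are invented for the example.

```python
import math


def mixture_distance(target, sources, weights):
    """Euclidean distance between the target and the weighted mix of sources."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mix = [
        sum(w * src[i] for w, src in zip(weights, sources))
        for i in range(len(target))
    ]
    return math.dist(target, mix)


# Hypothetical 2-dimensional coordinates (real rows have 25 values).
target = [0.05, 0.03]
sources = [[0.02, 0.00], [0.02, -0.02]]
weights = [0.5, 0.5]

distance = mixture_distance(target, sources, weights)
print(f"distance%={distance * 100:.4f} / distance={distance:.6f}")
```

Under this definition, adding more candidate sources can never make the best achievable distance worse, which is why a very low distance on its own says little about whether the model is sensible.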

Matt said...

Davidski: From memory, Sweden_MN clustered with Portugal_MLBA in that paper, which seems highly unusual, considering that Portugal_MLBA has about 15% Yamnaya-related steppe input, while the only eastern input in Sweden_MN is possibly a trace amount of EHG/SHG. So it seems like they clustered together because they had similar ratios of Hunter-Gatherer and Near Eastern (basal-rich) chunk counts, which is pretty crazy if true.

My guess would be that (depending on how they did it) the clustering was heavily influenced by haplotype chunk sharing with the Iberian Neolithic, and Sweden MN and Portugal MLBA would both be fairly intermediate on that measure for different reasons; Portugal MLBA because of 10-20% Yamnaya ancestry on top of 60-90% Iberian Neolithic and 10-20% other Neolithic, I think, and Sweden MN because it has Neolithic ancestry related to the Iberian but not the same.

(Like the chunkcounts appear actually more specific to the specific Neolithic populations and look to decay rapidly more outside those groups and with genetic drift, than being Near Eastern / HG in a general way etc.)

But yeah, I actually agree the chunkcount sharing looks harder to use for models that can be decomposed into ancestry proportions for sure. Hard for me to see much to do with them other than wildly point at them and say "Look, there must be some signal of ancestry here!".

...

Interesting chat going on re: nMonte and overfitting. I agree with MOZ re: the tautology issue of population selection (which is what always concerns me a little when I'm trying to prune reference populations - am I just doing this in a way that fits my preconceptions?).

nMonte was a revolution in modelling compared to 4mix by releasing the constraint of 4 populations, back when we had more than 4 very differentiated ancient populations that we wanted to fit moderns (and other ancients as well) to.

But with so many ancient samples, and minimal degrees of difference between best fit, it seems like it might be good if we had a more constrained method with fewer degrees of freedom; something like 8 populations maximum, each can only take proportions in increments of 12.5%.

To give a result more like one of the "Oracles", that is more interpretable in similar terms (e.g. thinking in terms of equivalence to X great grandparents from group Y, X2 great grandparents from group Y2, etc.).
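A minimal Python sketch of what such a constrained fit could look like, assuming a plain exhaustive search: every model is a choice of 8 "slots" (each worth 12.5%) filled from the source list with repetition allowed, and the model with the smallest Euclidean distance wins. This is not how nMonte works internally; the function name, source names and 2-dimensional coordinates are made up, and the search is only practical for a small number of sources.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def best_slot_model(target, sources, slots=8):
    """Exhaustively search mixtures where each source takes n/slots of the total."""
    best = None
    for combo in combinations_with_replacement(sorted(sources), slots):
        mix = [
            sum(sources[name][i] for name in combo) / slots
            for i in range(len(target))
        ]
        dist = math.dist(target, mix)
        if best is None or dist < best[0]:
            best = (dist, combo)
    dist, combo = best
    shares = {name: 100.0 * n / slots for name, n in Counter(combo).items()}
    return dist, shares


# Hypothetical sources and target in 2 dimensions.
sources = {"A": [0.0, 0.0], "B": [0.08, 0.0], "C": [0.0, 0.08]}
target = [0.03, 0.01]

dist, shares = best_slot_model(target, sources)
print(shares, dist)
```

Each 12.5% slot then reads like one great-grandparent from the given group.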

(Slightly different to what MOZ suggests but with a similar intent I think?).

I'm not suggesting modifying the nMonte code (esp. without authorisation from Ger), nor am I actually clear on how this could be done, just that more constraints might help interpretation.

Alberto said...

Yes, reading MoZ's comments I was also thinking along the same lines. It's a slightly different (and quite a bit simpler) script that would be needed to do such a thing. For example, one that takes as a variable the number of generations you want to obtain in the results (1 generation = 2 source pops, 50/50 each, 2 generations = 4 source pops, 25% each, etc...).
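The generations idea maps directly onto a slot count, and it is easy to see how fast the number of candidate models grows. A minimal Python sketch, assuming that a model is any multiset of 2^g slots drawn from the source list (so the same population may fill several slots); the function names are made up for the example.

```python
from math import comb


def slots_for_generations(generations):
    """1 generation = 2 slots of 50%, 2 generations = 4 slots of 25%, etc."""
    return 2 ** generations


def model_count(n_sources, generations):
    """Number of distinct mixtures: multisets of 2**generations slots from n_sources."""
    slots = slots_for_generations(generations)
    return comb(n_sources + slots - 1, slots)


for g in (1, 2, 3):
    slots = slots_for_generations(g)
    print(g, slots, 100 / slots, model_count(20, g))
```

The count is the standard "combinations with repetition" formula, so with a large source list an exhaustive search quickly becomes impractical.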

If I get some time I will give it a try to see how that works (unless someone beats me to it - which I'd prefer, since I'm basically illiterate in R scripting).

Simon_W said...

@ Rob & Anthro

That's a typological map about the subracial classification of early Slavic crania:

https://jpst.it/1b0x7

Source: Ilse Schwidetzky, Rassenkunde der Altslawen, 1938.

According to this, the types shared by all the early Slavic tribes are Nordid and (a still longheaded version of) Osteuropid; in the eastern Balkans more gracile, obviously Mediterranean skulls are also found and in the western Balkans Dinarid individuals.

Davidski said...

I updated the sheets with many new European individuals and several new European populations.

Darn, forgot to add modern-day Armenians again. Will do that tomorrow.

Chetan said...

@Nirjhar You have any inside info on the article?

Arza said...

@ Anthro Survey
When you do find two extrapolated populations with a small distance between them(obv impossible to get a perfect intersection) suspected to be in the vicinity of your ghost, do you then take their average?

It depends. Sometimes there is a well defined cline on the one side, but on the other one you have just two points that point to the neighbouring area. In such case and if there is a big distance between two ghosts I take the one sitting on the visible cline.

But I rather avoid using ghosts in nMonte if the other side of the cline is not well defined (I'm using them rather to explore the geometry of clines). E.g. in the case of India (even if we ignore that the SA cline at closer inspection looks like it's built from 3-4 other clines) we don't know what exactly sits on the other side. So it's hard to make an ASI ghost for nMonte, because as soon as you start to tweak your ghost, you will realize that even the smallest change in ASI can result in big differences in which samples nMonte will pick up on the other side.

Of course you can make a rigid model with Yamnaya and Iran, so nMonte will give you estimated percentages of ASI ancestry, but who really cares if it is 40 or 50%? That's why in Global 10 I'm using Chamar as a proxy. Maybe it's giving the wrong numbers, but as it sits in the right place it gives the right Population_That_Proto-Aryans_Mixed_With-->Brahmin vector, which affects which populations will be picked up on European side.

@ Matt
something like 8 populations maximum, each can only take proportions in increments of 12.5%

G25 vanilla

Nbatch/batch_0/batch_def = 8

Brahmin:Brahmin1
ASI_ghost,37.5
Iran_N:AH1,37.5
Srubnaya:I0361,12.5
Srubnaya:I0430,12.5
"distance%=1.9443"

It's good to raise the Ncycles number, e.g. to 100000.
It's also interesting (and fun) when combined with negative value of penalty in nMonte3 where more distant sources will be preferred (kind of drift simulation for distant=old sources).

Arza said...

^^^
A better method would be penalizing or promoting samples by age relative to the target, e.g.:

3000 years younger than target - penalty = 0.001
same age as target - penalty = 0
3000 years older - penalty = -0.001
10000 years older - penalty = -0.010
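A minimal Python sketch of such an age-based penalty, assuming straight-line interpolation between the four example values above and clamping outside that range. The interpolation rule and the function name are assumptions for illustration; how the resulting value would be fed into nMonte3's penalty is not shown here.

```python
# (years older than the target, penalty) pairs taken from the list above;
# negative years mean the source is younger than the target.
PENALTY_POINTS = [(-3000, 0.001), (0, 0.0), (3000, -0.001), (10000, -0.010)]


def age_penalty(years_older):
    """Piecewise-linear penalty; values outside the listed range are clamped."""
    pts = PENALTY_POINTS
    if years_older <= pts[0][0]:
        return pts[0][1]
    if years_older >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= years_older <= x1:
            t = (years_older - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)


for years in (-3000, 0, 1500, 3000, 6500, 10000):
    print(years, age_penalty(years))
```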

Kanishka said...

@Davidski Thanks. I guess from now on I will have to be careful with my models.

@Simon_W Interesting... Is there a huge difference between north and south Germans in terms of Steppe admixture? I would assume that those from what was once East Prussia, and from northern Germany would have more Steppe admixture than those from Bavaria. I think Prussians would top the list.

Simon_W said...

@ Kanishka

Generally speaking you are right, and steppe admixture in Germany is higher towards the east and north and lower towards the west and south. But individually and probably also micro-regionally it's more complex. My paternal grandfather for instance was from the border area of Southwestern Germany and northwestern Switzerland, and yet gets quite a strong Nordic_IA in my models.

And the name "Prussians" is ambiguous BTW. Many people associate this with the Prussian state which once covered almost the entire northern 2/3 of Germany from the far west to the eastern borders. Because of this many Americans with German roots regard themselves as descended from Prussians. And not to forget the Prussian kings and queens which figure prominently in the European history. But this has little to do with the province East Prussia and the Baltic Prussian substratum there. The reason being that this original Prussia eventually got acquired by Brandenburg and this merged Brandenburgian-Prussian state eventually got called Prussia. But the capital of the Prussian kingdom was Berlin, which lies in Brandenburg, very far from the original Prussia with its German and Polish population on a Baltic Prussian substrate.

Arza said...

Re: Yamnaya as Kostenki

Maybe it's... correct?

As Matt was explaining some time ago, when you mix two distant diverged populations, you are effectively getting closer to the parent node.

Maybe the algorithm mistook admixture with a drift (drift from older to younger vs. admixture bringing population from a "younger state" to being closer to older one)?

In such a case this drifted Kostenki would represent UHG(?), which is present as a part of ancestry in CHG and EHG(?).

And without proper references for e.g. Basal/CHG it mistook both admixtures as an additional drift on top of the drift between Kostenki and UHG.

Imagine that we have an UHG sample. Wouldn't ADMIXTURE show Yamnaya as majority UHG with stripes of Basal, ANE, WHG and whatever sits in EHG?

Simon_W said...

Not wanting to spam this thread with my results, but David's recent addition of a Dutch and another French sample changed my results quite a bit (again working with unscaled data):

East Prussian grandmother:
"Nordic_IA" 41.25
"Dutch" 31.8
"Baltic_BA:Turlojiske3" 26.95
"England_Anglo-Saxon" 0
"Slav_Bohemia" 0
"Hungary_BA:I1502" 0
"Polish" 0

distance%=1.5827

Swabian/Swiss grandmother (also tried some Scythians here):
"French" 39.75
"Hungary_IA" 16.45
"Italian_Tuscan" 15.15
"Nordic_IA" 14.7
"French_South" 9.9
"Scythian_Samara" 4.05
"French_East" 0

distance%=0.6876

inferred position of South German/Swiss grandfather:
"Nordic_IA" 69.05
"Hungary_IA" 16.8
"French_South" 6.9
"French" 6.5
"Italian_Tuscan" 0.75
"French_East" 0

distance%=2.202

inferred maternal side:
"French" 52.7
"French_South" 17.7
"Samaritan" 14.05
"Nordic_IA" 9.35
"Minoan_Lasithi" 6.15
"Mozabite" 0.05
"Hungary_IA" 0
"Cypriot" 0
"Hungary_BA:I1504" 0
"Remedello_BA:RISE489" 0
"Mycenaean" 0
"Anatolia_BA" 0
"Anatolia_ChL" 0
"Levant_BA" 0
"England_Roman_outlier:3DT26" 0
"Druze" 0

distance%=1.5992

And by the way, a while ago I had modeled the Eurogenes K15 data of a DNA relative of mine whose parents are from Cesena and Sardinia with 4mix. The best approximation I had found was:
49% Sardinian + 20% French + 12% Italian_Jewish + 19% South_Italian @ D = 3.6039

So his Romagnol ancestry has a strong French-like and Italian_Jewish-like edge, which is very reminiscent of my above results. Maybe this has something to do with the Jewish community of Forli and its dissolution??
http://www.jewishencyclopedia.com/articles/6234-forli

MomOfZoha said...

@David:
I don't think I could get meaningful results until you add in the Armenians, which I recall made up almost 10% of the samples in your IBD runs of 3K+ people a while back. This is especially the case for my father-in-law whose paternal descent from Karabagh before the Russo-Persian wars is even indicated via his surname. Even in his cM matching IBD analysis you performed, several of his top cM matches were not only Armenians but Armenians from Dprabak (which is transliterated "Karabagh" as it was settled by them).

@Alberto, Matt, and Arza, regarding nMonte tweakings:
It seems there is agreement on the desire to limit the possible source populations in the output. Note that the method that I hastily mentioned above (with parameters k and x such that no more than k source populations may have < x% contribution in the output) also does imply such a limit on the number of sources in the output: Num output source pops <= k+100/x
(where "<=" means $\leq$ rather than $\Leftarrow$ of course) E.g. if x = 12.5% and k=0 is chosen, then num-source pops is limited by 8 by pigeonhole principle. Etc..
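The bound follows because the shares sum to 100%: every population that is not one of the (at most k) small ones takes at least x%, so there can be at most 100/x of those, rounded down. A minimal Python sketch of that arithmetic, with a made-up function name:

```python
import math


def max_output_pops(k, x):
    """Upper bound k + floor(100 / x) on the number of populations in the output."""
    if not 0 < x <= 100:
        raise ValueError("x must be a percentage in (0, 100]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return k + math.floor(100 / x)


print(max_output_pops(k=0, x=12.5))
print(max_output_pops(k=3, x=5))
```

With k=0 and x=12.5 this gives the 8-population limit from the pigeonhole example.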

At any rate, it might be too indirect an approach. While I like Alberto's idea of successively considering great^n grandparents, that still might not be a fine enough granularity for Anatolian Turks like me. Even so, I still think that even really admixed people like myself would prefer to limit output source pops to <= 8 even if we've got other stuff going on. Even if half of my great-great-grandparents are more than half some-kind-of-Assyrian, for example, due to a wide variety of admixture paths, the congregate results are valuable in noting this without the effect of millions of epsilon other contributions.

Ideally, IMO, the search for good reference populations should be guided by an existing hierarchical clustering of populations. Even a non-hierarchical thing can suffice initially. But, to illustrate what I mean for my own familia, consider this:

In going about choosing, say, good ref pops for my parents, I'd like to pick AT MOST ONE from each of the following pop-groups:
* Kurdo-Iranian group: Kurdish, Yezidi, Zaza, or actual Iranians
* Northwest Caucasus group: Ubykh, Kabardin, Adyge, Cerkes, Abkhasian
* Mid-South Caucasus group: Armenian, Georgian, Lezgin, Chechen, Dagestani (I only do not include Azeri here despite her affinity with Azeris because I want to separate the "Turkic" contribution using Turkmen or Siberians)
* Central Asian Turkic and "non-Tajik": Turkmen, Hazara, Uzbek, Burusho, Kazakh, Kyrgyz
* Central Asian not-so-Turkic: Tajik, Kalasha, Pamiri, Pathan
* Anatolian and North-West Semitic: Druze, Assyrian, Samaritan, Lebanese Christian, some-kind-of-Jewish
* South/East European: Hungarian, Bulgarian, Greek, Macedonian, Romanian, Montenegrin, Serbian
* Siberian: Tuvan, Altaian, Ket, Koryak

***
Just had the most awesome interruption from my bestest friend from way-back-when. And, anyway, I wrote too much as usual, so let me wrap this up now!

To be continued.... (maybe maybe not, we'll see) . Cheers..

Matt said...

Re: new samples, I have had a quick go at putting into the population average neighbour joining tree (scaled), with a couple of colour schemes: https://imgur.com/a/LB9Cu

(Though of course the effect of increasing samples is often increasing overlap).

@Arza, thanks for mentioning and running that modification.
Re: age penalty, it's an interesting idea. Functionally, I guess that would work out the same as exaggerating distance on the PCA data itself between samples with decreasing age.

That does seem like it has two risks, a) the penalty is not very large and it doesn't shift the results too much (e.g. it's still optimal to select Yamnaya over EHG+Satsurblia, even though the latter is an older combo) and b) the penalty is high and selects samples with drastically lower fit to the real distance, e.g. Kostenki14 over recent WHG for West Eurasians, Tianyuan+Ust Ishim over modern East Asians for East Asians.

(The latter outcome b) seems like it's similar question to the goals of Dystruct vs ADMIXTURE; are we concerned with fitting the relationships to all samples, including modern samples, optimally, or are we more concerned with models that give best fitting historically logical recent populations as mixtures of older ones, even if this doesn't actually explain the between population variance to the same degree because the set of ancient samples is often an incomplete sampling?)

I'll have to catch up with the rest of this thread later on.

Simon_W said...

I think the strong Hungary_IA in my two Alemannic grandparents loses a lot of its oddity if we consider the wide extent of the Thraco-Cimmerian horizon, another case where the allegedly outdated old migrationist interpretation appears to turn out right in the end:
https://en.wikipedia.org/wiki/Thraco-Cimmerian

And moreover Hungary_IA sample IR1 is only half from the Iron Age steppe, half of his ancestry is local, which means the scores have to be halved to get the real amount of Iron Age steppe admixture.

Kanishka said...

@Simon_W Thanks for your reply. When I said Prussians I did not mean what became of the Prussians later, I meant those of original Prussian stock, i.e. from East Prussia, since East Prussia was the heartland of the original Prussian people. Yes, I know about the union with Brandenburg and how it allowed for the expansion of Prussia. However, when I said Prussia, I meant this: https://en.wikipedia.org/wiki/Duchy_of_Prussia.

Obviously, later on Prussia expanded into most of Germany and Poland, and became the leading German state, as you stated.

Anyhow, was your grandmother from Prussia proper, i.e. Konigsberg and surrounds? I would assume so since you mentioned "East Prussian". I think the original Prussian people were Baltic speakers, who were later assimilated by the Crusaders/Teutonic Knights.

Simon_W said...

@ Kanishka

My grandmother's dad was from Braunsberg (now: Braniewo) and her mum from a village nearby named Pettelkau. So this was an area that hadn't belonged to the Duchy of Prussia as visible in your wikipedia link; instead it's situated in the "wedge" called Warmia (in German: Ermland) that was under direct Polish control from 1466 - 1772. That's the reason why this part of the later Prussian province East Prussia stayed purely Catholic. And while the Protestant parts of Prussia were open to Protestant refugees from other countries, the Catholic population of Warmia was more stable, and also after the unification with the formerly ducal part stayed endogamous. Moreover the northern part of the Ermland and the central part differed strongly in their dialects: Low German in the north and Middle German in the middle, which hampered geneflow even within this Catholic sub-population. So in effect that Catholic and Low German population my grandmother comes from was rather small.

Yes, the Baltic Prussians were the truest Prussians. Their western border was the Vistula, so the area where some of their DNA may have lived on also included a small part of the later Prussian province West Prussia, but mostly it was indeed in East Prussia.

BTW, this Lithuanian language movie is AFAIK the only historical drama film dealing with the Baltic Prussians. It's about the uprising under Hercus Monte:
https://youtu.be/eOZqSH7SDxk

Alberto said...

So I tried that idea of writing a script that would restrict the number of populations in the results (without restricting them in the sources), but it turned out to be a much more complicated thing than I anticipated. Combinations are too high to try them all and an incremental approach is not possible (or difficult) while restricting the output. But anyway I ended up with something that gives some results (clearly not the best possible combination, just one of the many that can be acceptable). I'm not sure if this can be useful at all, but since I wrote it I thought I'd share it for anyone interested in trying it. (With all the usual warnings that I'm not familiar with R scripting, this is almost untested, etc...)

Probably more useful (and much better tested, since I've been using it for a long time) is this other script that works in the usual way (similar to nMonte or 4mix) but has some differences and features (multi target, on the fly scaling) that might be useful for some. I also put it out just for testing purposes, and for anyone interested in fixing/improving it and sharing the modified version (it's Free code, as the former). For the details look at the README file:

Kanishka said...

@Simon_W Thanks for answering my questions. Interesting things you have said, especially the fact that the Polish controlled part of what later became East Prussia, remained heavily Catholic, due to being ruled by a Catholic rather than Protestant nation. I always saw that gap in the old duchy of Prussia and wondered why it was under Polish, rather than Prussian, control. Fairly interesting to say the least. Poland-Lithuania was certainly a superpower for the time, and made sure to counter any Turkish incursions into central and eastern Europe.

Final question, did the Prussians from this region you mention identify more with Poland or with Prussia? Did they maintain their Catholic traditions after the Prussians captured the region from the Poles?

Kanishka said...

@Davidski I was wondering, how much more Steppe do you think the elite Mycenaean woman will have in comparison to the lower class men? Have those less Steppe shifted Mycenaeans been confirmed to be from the lower classes? I would reckon that the Steppe among the Mycenaean elites was around 50% Sintashta/Andronovo/Steppe MLBA-like. Thoughts?

Kanishka said...

@Davidski I am talking about Steppe admixture in the female after this happens:

"This sample probably has the lowest cut of Caucasus ancestry out of the six Greece_Peloponnese_N samples that will eventually be published with Mathieson et al. 2018. So when that happens, the level of steppe admix in Peristeria I9033 is likely to rise in these models."

Arza said...

@ Alberto
Xmix, line 310

< resultsTableRow <- c (populationName, " ", percentages, c (round (sqrt (modelDistance), 6) * 100, "%", sep = ''))
---
> resultsTableRow <- c (populationName, percentages, paste (round (sqrt (modelDistance), 6) * 100, "%", sep = ''))

Anthro Survey said...

@EastPole

I wouldn't be so rigid in locking in linguistic and haplogroup phylogenies with vast archaeological horizons and their successors/predecessors in a way that doesn't leave room for horizontal exchange scenarios. Btw, we only have one sample from SS atm, and its connections to later steppe cultures aren't completely understood yet.

Anyway, going with Taylor's diagram, the node common to Indo-Iranian and Balto-Slavic could have harbored mainly R1a (directly basal to Z93 and Z283), but the population(s) associated with the node directly above it could have had both R1a and Z2103. Alternatively, that node might have been R1a, but merged with a Z2103-dominant culture to create the steppe population associated with the Greco-Armenian node. Of course, such a scenario would entail either language replacement via R1a elites or creolization. It goes without saying that this is all contingent on how close to the truth the Greco-Aryan language hypothesis is.

Anthro Survey said...

@Arza

Yeah, I see what you're saying about SA.

What ghosts have you managed to pin down so far with cline intersection?

Let's say there are four or five real populations suspected of being clinal. The presumed tail and head are taken, and an artificial set of populations is generated from these (essentially, a straight line is drawn in 10D or 25D). This is easy to do with Excel.

The 2-3 middle (real) populations will then be some Euclidean distance away from this generated (c)line. (The "cross-product" won't be anywhere near zero unless they're PERFECTLY clinal, that is.) So, what's your cutoff here to call them clinal, and how far would be too far?
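
For anyone who'd rather script this check than build it in Excel, here is a minimal Python sketch of the idea described above: project a middle population onto the straight line through the presumed tail and head, then measure how far off the line it sits. The function name and the 3-D coordinates are hypothetical and only for illustration (real Global 25 rows have 25 values), and the sketch deliberately doesn't propose a cutoff.

```python
import math


def distance_to_cline(point, tail, head):
    """Return (t, d) for a point relative to the straight line tail -> head.

    t is the position of the closest point on the line (0 = tail, 1 = head),
    d is the Euclidean distance from the point to that closest point.
    """
    direction = [h - a for a, h in zip(tail, head)]
    offset = [p - a for a, p in zip(tail, point)]
    length_sq = sum(c * c for c in direction)
    if length_sq == 0:
        raise ValueError("tail and head must be different points")
    # Scalar projection of the offset onto the line direction.
    t = sum(o * c for o, c in zip(offset, direction)) / length_sq
    closest = [a + t * c for a, c in zip(tail, direction)]
    return t, math.dist(point, closest)


# Hypothetical coordinates, for illustration only.
tail = [0.0, 0.0, 0.0]
head = [1.0, 0.0, 0.0]
middle = [0.4, 0.03, 0.04]

t, d = distance_to_cline(middle, tail, head)
print(round(t, 3), round(d, 3))
```

Here t tells you where along the cline the population falls, and d is the residual that a cutoff would have to be applied to. A perfectly clinal population gives d = 0.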

Davidski said...

@Kanishka

I was wondering, how much more Steppe do you think the elite Mycenaean woman will have in comparison to the lower class men? Have those less Steppe shifted Mycenaeans been confirmed to be from the lower classes?

The social status and archaeological contexts of the Mycenaean samples are discussed briefly in the Laz paper and its supplementary info. Sample I9033 is described as the only high status individual from a royal tomb.

The authors didn't find any significant genetic differences between the four Mycenaean samples, but in my PCA of ancient West Eurasia, I9033 is, by a whisker, the most steppe shifted Mycenaean. And this also appears to be reflected in her Global 25 data, but you can test this yourself using Global 25 and nMonte.

I would reckon that the Steppe among the Mycenaean elites was around 50% Sintashta/Andronovo/Steppe MLBA-like. Thoughts?

Possibly, and if so, this would explain the later Crete_Armenoi individual, who is much more steppe shifted than the four Mycenaeans. See here...

https://3.bp.blogspot.com/-qDAN5pZ8Qm0/WYUXuKv8fPI/AAAAAAAAF80/ltaSA7RXUEMCELq5ceMKYmdDqxHpH4nNwCLcBGAs/s1600/Minoans_%2526_Mycenaeans.png

Alberto said...

@Arza

Indeed, thanks for catching that bug in a last-minute change. Actually, I think it's better like this (with a decimal number instead of a string) for a spreadsheet (as I had it before adding the * 100 to the distance for compatibility reasons).

> resultsTableRow <- c (populationName, percentages, round (sqrt (modelDistance) * 100, 4))
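
To illustrate why a plain number is friendlier to a spreadsheet than a "%" string, here is a small sketch, written in Python rather than the R of the script itself. The first and third distances are the ones quoted in this thread; the middle one is made up for the example.

```python
import math

# Squared model distances; 0.1 is a hypothetical value for illustration.
squared_distances = [0.024936 ** 2, 0.1 ** 2, 0.040847 ** 2]

# Numeric column: sqrt, scale to percent, round to 4 decimals.
as_numbers = [round(math.sqrt(d) * 100, 4) for d in squared_distances]
# String column: the same values with a "%" suffix.
as_strings = [f"{value}%" for value in as_numbers]

print(sorted(as_numbers))  # numeric order, best fit first
print(sorted(as_strings))  # text order, which puts "10.0%" first
```

Sorted as text, the worst fit ("10.0%") ends up ahead of the best one, which is the kind of surprise the numeric column avoids.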

Corrected version here:

Kanishka said...

@Davidski Thank you. I agree with your assessment here. The issue I had with that study was that it tried to say that the Indo-European languages originated in the Caucasus/Iran horizon. Furthermore, another weird thing I got from it was that they tried to say the Crete_Armenoi sample was low resolution and not reliable. Would you agree with their assessment here? Personally, I believe that the Crete Armenoi sample was solid. I also have a hunch that the elites from mainland Greece were even more Steppe shifted than this sample. I guess it is worth waiting and seeing whether or not they test the higher status individuals from those main tombs. It would also be nice to see genetic testing on Classical Greeks. Finally, I think it's odd that modern Greeks would be close to 40% Slavic on average, so probably most of their Steppe admixture is native.

Simon_W said...

@ Kanishka

From what I gathered: the German citizens of the state of the Teutonic Order weren't happy with the rule of the Order either. It was a bit like in Switzerland at the same time: the people wanted to decide for themselves, they wanted self-government. So to be for or against the Teutonic Order at that time wasn't a matter of being ethnically German or Polish. And for centuries the people of Warmia were happy with belonging to Poland. To our modern understanding this sounds strange, but back then the romantic notion of nation states hadn't been invented yet. Thus, the Germans of Warmia under Polish rule were comparable to the German-speaking Alsatians under French rule: they like being French and are not longing to join Germany. But my grandmother, for instance, was born long after this episode. She grew up in the third German empire, in the province of East Prussia, so that's what her homeland is in her memory, and not Poland. But yes, they preserved the Catholic traditions; they were a rather conservative people. As this map shows, predominantly Catholic:
https://jpst.it/1b6Cg
That seems to be referring to the early 20th century. There was only very limited immigration of Protestant Prussian functionaries after the unification. Also politically in the decades around 1900 they strongly preferred the conservative party over the social democrats, unlike the other parts of East Prussia.

Davidski said...

@Kanishka

Furthermore, another weird thing I got from it was that they tried to say the Crete_Armenoi sample was low resolution and not reliable. Would you agree with their assessment here?

It is a low resolution sample, so the authors were right to be cautious. But it's good enough to run in a PCA, and the result isn't likely to change much with a higher quality sequence.

I also have a hunch that the elites from mainland Greece were even more Steppe shifted than this sample. I guess it is worth waiting and seeing whether or not they test the higher status individuals from those main tombs. It would also be nice to see genetic testing on Classical Greeks. Finally, I think it's odd that modern Greeks would be close to 40% Slavic on average, so probably most of their Steppe admixture is native.

Some high status Mycenaean samples might be mostly Steppe_MLBA, but this won't tell us if most of the steppe admixture in modern-day Greece is pre-Slavic or Slavic. Only Iron Age and later ancient samples from Greece can be informative about this, and they might turn out less steppe-admixed than the four currently available Mycenaeans. If so, most of the steppe admixture in Greece today can't be considered pre-Slavic.

Kanishka said...

@Simon_W Thanks, interesting to hear. By the Third German Empire are you referring to the Third Reich? Yeah, but it makes sense why she would have a sense of belonging to Germany and not Poland. It is just sad how many conflicts in the 20th century forced millions of people to abandon their ancestral homelands.

@Davidski Thank you for clarifying. I thought that it might have been a decent resolution sample, hence why you managed to plot it on your PCA, but I guess I was mistaken. Furthermore, if most of the Slavic admixture in Greece is post-Slavic, would that mean that the Slavs who invaded Greece were like the Slav samples from Bohemia?

Davidski said...

@Kanishka

I don't know. You should try modeling Greeks with the Global 25.
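
For readers new to this kind of modeling, here is a rough Python sketch of how a single candidate model can be scored. It assumes the fit is measured as the Euclidean distance between the target's coordinates and the weighted average of the source populations' coordinates, which is my understanding of how nMonte-style tools work. It only scores one set of weights and does not search for the best ones, and the population names and 2-D coordinates are hypothetical.

```python
import math


def mixture_distance(target, sources, weights):
    """Euclidean distance between a target and a weighted mix of sources.

    target  : list of coordinates
    sources : dict of population name -> list of coordinates
    weights : dict of population name -> proportion (must sum to 1)
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mix = [
        sum(weights[name] * sources[name][i] for name in weights)
        for i in range(len(target))
    ]
    return math.dist(target, mix)


# Hypothetical 2-D coordinates, for illustration only.
sources = {"Source_A": [0.10, 0.00], "Source_B": [0.00, 0.10]}
target = [0.06, 0.05]
weights = {"Source_A": 0.6, "Source_B": 0.4}

distance = mixture_distance(target, sources, weights)
print(f"distance%={round(distance * 100, 4)} / distance={round(distance, 6)}")
```

The printed line follows the same distance% / distance layout as the results posted in this thread, where distance% is simply the distance multiplied by 100.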

Kanishka said...

@Davidski Thanks.

Nezih Seven said...

"distance%=4.0847 / distance=0.040847"

Anatolia_BA 59.25
CHG 21.40
Scythian_Samara 17.30
Scythian_Pazyryk 2.05

MomOfZoha said...

Interesting @Nezih... The "steppe" part of my folks' ancestry seems to prefer the Sarmatians:

"distance%=3.6002"

MoZ_Father

Tepecik_Ciftlik_N,54.4
Sarmatian_Pokrovka,25.6
Iran_N,17.6
Lapita_Tonga,2.4

[1] "distance%=4.2271"

MoZ_Mother

Tepecik_Ciftlik_N,46.2
Sarmatian_Pokrovka,31.8
Iran_N,16
Han,4.4
Lapita_Tonga,1.6

Although Sarmatians are part Scythian too, makes sense with my mom that Sarmatian women were totally badass...

Simon_W said...

Oops, just saw that Kanishka is banned! No wonder my replies to her (or rather him) keep being deleted. Alright, I'll respect that from now on.

Open Genomes said...

@David

Here are the 1240k hg19 SNPs for DA101, the Tien Shan Eastern Hun from 1783 BP (uncalibrated) who died of the plague, in 23andMe format.
This has 399529 of the 1198444 autosomal 1240k SNPs.

There is no Y or mtDNA in this autosomal subset.

In this run there was no filter on read depth or quality scores, but I can put in any kind of filter you want, such as depth, quality, and map quality. Also, I can restrict this to only those SNPs whose alleles match either the 1240k reference or alternate alleles, if that turns out to be an issue.

With the Reich Lab SNP names, regardless of whether some "Affy-" and "snp_" SNPs now have rsIDs:

http://www.open-genomes.org/genomes/Eurasian%20Steppe/DA101/genome_DA101-Reich_Lab_names.zip

With as many as possible rsIDs, substituting for the proprietary names:

http://www.open-genomes.org/genomes/Eurasian%20Steppe/DA101/genome_DA101-rsIDs.zip

If you need this in another format besides 23andMe like the ANCESTRYMAP .geno format, I can provide that too.

Rudy Winono said...

Just curious, will there be a Global 50 and beyond in the future? I'm wondering why not just do Global 1000 while we are at it.

Is it difficult to plot PCA with PC1000?