
Friday, August 24, 2018

Global25 workshop 3: genes vs geography in Northern Europe


To produce the intra-North European Principal Components Analysis (PCA) plot below, download this datasheet, plug it into the PAST program, which is freely available here, then select all of the columns by clicking on the empty tab above the labels, and choose Multivariate > Ordination > Principal Components or Discriminant Analysis.


I'd say that the result more or less resembles a geographic map of Northern Europe. Of course, if you have your own personal Global25 coordinates, you can add yourself to this plot to check whether your position matches your geographic origin.

Please keep in mind, however, that the vast majority (>90%) of your ancestry must be from north of the Alps, Balkans and Pyrenees to obtain a sensible outcome. Also please ensure that all of the columns in the datasheet are filled out correctly, including the group column, otherwise your position on the plot will be skewed.
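For readers who would rather script this than click through PAST, here is a minimal Python sketch of the same idea: a covariance-based PCA on a sheet of coordinates. It is not PAST's implementation. The column layout (label, group, then numeric coordinates), the function names and the toy rows are all assumptions made for illustration, so adjust them to match the actual datasheet.

```python
import csv
import io

import numpy as np


def load_datasheet(text):
    """Parse rows of the assumed form: label, group, coord1, coord2, ..."""
    labels, groups, rows = [], [], []
    for rec in csv.reader(io.StringIO(text)):
        if len(rec) < 3:
            continue  # blank or malformed line
        try:
            coords = [float(v) for v in rec[2:]]
        except ValueError:
            continue  # header row or non-numeric cell
        labels.append(rec[0].strip())
        groups.append(rec[1].strip())
        rows.append(coords)
    return labels, groups, np.array(rows)


def pca_scores(X, n_components=2):
    """Project rows of X onto the leading principal components."""
    centered = X - X.mean(axis=0)
    # SVD of the centered matrix gives the covariance-based components.
    _, singular, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = singular**2 / np.sum(singular**2)
    return scores, explained[:n_components]


# Made-up toy rows, NOT real Global25 coordinates.
DEMO = """label,group,C1,C2,C3
Sample_A1,GroupA,0.10,0.02,0.01
Sample_A2,GroupA,0.11,0.03,0.00
Sample_B1,GroupB,-0.09,0.01,0.02
Sample_B2,GroupB,-0.10,0.00,0.01
"""

labels, groups, X = load_datasheet(DEMO)
scores, explained = pca_scores(X)
for label, group, (pc1, pc2) in zip(labels, groups, scores):
    print(f"{label:10s} {group:7s} {pc1: .4f} {pc2: .4f}")
```

To add yourself, append one more row with your own label, a group name and your coordinates before running the PCA. As noted above, a row with a missing group or coordinate will not plot sensibly; this sketch simply skips such rows.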

See also...

Global25 workshop 1: that classic West Eurasian plot

Global25 workshop 2: intra-European variation

Global25 PAST-compatible datasheets

Modeling genetic ancestry with Davidski: step by step

50 comments:

Ric Hern said...

Interesting that the Connection Point between East and West seems to be Hungarian, Czech and Slovakian...

Ric Hern said...

Interesting that Finns and Swedish do not overlap a little...

EastPole said...

Here is my position:

https://s8.postimg.cc/5rz6zbpxx/screenshot_422.png

I wonder what modern or ancient populations would fit in the ellipse with the question mark?

Davidski said...

@Ric

Interesting that the Connection Point between East and West seems to be Hungarian, Czech and Slovakian...

Interesting that Finns and Swedish do not overlap a little...


Practically all of the North-Central Europeans overlap, including Eastern Germans and Western Poles, and I'm sure that Finns would overlap with Swedes too if I had bigger sample sets from these countries.

What I find really interesting is that most Russians appear more "western" than the Balts, especially Latvians.

Davidski said...

@EastPole

You chose your nick wisely. :)

I don't know which ancient population will end up clustering in that area of the plot, but have a look at where Baltic_IA clusters. He's even more eastern than you.

Ric Hern said...

@ Davidski

Yes indeed.

Arza said...

@ EastPole

Welzin or similar IMHO. It looks like a spot somewhere between Baltic_BA and Hungary_BA.

Ric Hern said...

@ Davidski

Could the more Western appearance of Russia maybe be due to Scandinavian (Viking) admixture ?

Davidski said...

@Ric

Could the more Western appearance of Russia maybe be due to Scandinavian (Viking) admixture?

This is unlikely to be a significant factor.

The main factor, I think, is the higher ratio of Middle Neolithic farmer ancestry to Baltic/Eastern Euro forager ancestry among Russians relative to Balts.

There are different reasons for this, but I think the main one is the expansion of Steppe_MLBA and related groups (Slavs) throughout what is now Russia, as well as possibly the presence of Eastern Germanic groups north of the Black Sea, but I guess we'll soon see if the latter is a possibility when the relevant ancient DNA is published.

Genetic continuity in the western Eurasian Steppe broken not due to Scythian dominance, but rather at the transition to the Chernyakhov culture (Ostrogoths)

https://www.isba8.de

Open Genomes said...

@David

How about a separate post about the Levantine Chalcolithic, and the "new" Reich Lab Sidon results?

I have some new Global25 results for the Levantine Chalcolithic, and Levantine Bronze Age that you'll find interesting.

Davidski said...

@OG

I have a new post about ancient Anatolia and the Levant coming mid next week.

Open Genomes said...

@David

Why are Hungary_Medieval:DA199 and Scythian_Hungary:DA199 almost identical in Global_25_PCA.txt?
Are these duplicates?

Davidski said...

@OG

Looks like it...

Hungary_Medieval:DA199 and Scythian_Hungary:DA199

I don't know which of the pop IDs is correct though. I didn't label them.

Davidski said...

Just had a look at the datasets. It appears that the latest, and thus probably correct, label for this sample is Scythian_Hungary:DA199. So feel free to discard the other one.

Open Genomes said...

@David, thanks.

Consider this then, for your post:

This one seems just right, according to the study, although the distance isn't close:

Restricted nMonte3: I1178 Levant_ChL Chalcolithic Near East

However, remember you said that the Bronze Age Sidonians shouldn't cluster with Modern Lebanese? This one does, even though Nick Patterson generated the data:

Restricted nMonte3: ERS1790730 Levant_BA_North Bronze Age Near East

Notice the 15.0% Minoan. Apparently, as archaeology has shown in Dor, the Minoans made it to the Levantine Coast in the Middle Bronze Age.

Levantine Chalcolithic is only 12.0%, but Levant Bronze Age South is 46.8%. The Levantine Chalcolithic are clearly not the ancestors of the Levantine Bronze Age.

The most interesting thing is the small percentage of 1.8% Afanasievo. Early Steppe ancestry started to arrive by the Middle Bronze Age, as part of the Amorite expansion southward. I think if this was the Mitanni, it would be more Sintashta-like, so this may be from the elusive Anatolians. This early Steppe ancestry seems to have arrived via the Caucasus and (Eastern?) Anatolia.

Maybe this small affinity to Afanasievo (a proxy for a pre-Yamnaya population) can be proven with D-statistics?

Ric Hern said...

Wonder why Afanasevo and not Yamnaya ? Mmm...Repin = Hittites ?

Ric Hern said...

Looks like everybody is sampling around the Hittites...a bit of Steppe here and a pinch of Afanasevo there. It is as if they want to build up the tension...

Davidski said...

I don't know if 1.8% Afanasievo-related admixture is something that can ever be reliably confirmed.

When_in_Rome said...

@ Davidski

A bit off topic, but is there a way you could model Italians by % based off of the following populations?:
1. Sardinian-like (to represent the Neolithic Italians)
2. Urnfield Culture or Bell Beaker Culture (to represent the Indo-Europeans entering Italy)
3. Iron Age Greeks (to represent Greek Colonization)
4. Phoenician (to represent Phoenician Colonization or Near Eastern DNA in Italians)
5. Early Middle Age Germanics (to represent the Migration Period)

I think that these populations could represent the peopling of Italy and I would like to know the % of each. Thanks in advance.

Davidski said...

@When_in_Rome

Here are some quick attempts at something like that using qpAdm...

Modern-day Greeks & Italians vs Mycenaeans

But I reckon someone else, maybe at Anthrogenica, like Mike, can come up with a more elaborate and on target analysis now using the G25 and nMonte.

EastPole said...

I have marked ethnic groups with different colours: Slavic - red, Germanic - blue, Ugro-Finnic - aqua, Celtic - green, mixed populations - grey (German, Austrian, Hungarians):

https://s22.postimg.cc/ggsv1lmhd/G25_Etno_Graph.png

I am not sure how Celtic and Germanic should be separated. Are Scottish and Irish Celtic or Germanic genetically? They look mixed too.

Ric Hern said...

@ EastPole

They are genetically Celtic. Mostly R1bs directly related to Bell Beaker People. There was also a lot of admixture from the Middle Bronze Age until the Vikings.

Ric Hern said...

@ EastPole

https://www.google.com/amp/www.dailymail.co.uk/sciencetech/article-5312697/amp/DNA-map-Britain-Ireland-reveals-Viking-genes.html

Ric Hern said...

@ EastPole

http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1007152#sec002

Open Genomes said...

@David

This J1-Z2324 sample has a total of 4.0% Yamnaya Samara and Afanasievo. I don't think that this is a coincidence:

I1705 'Ain Ghazal c. 2030 BCE Levant_BA_South Bronze Age

Open Genomes said...

@David

But I reckon someone else, maybe at Anthrogenica, like Mike, can come up with a more elaborate and on target analysis now using the G25 and nMonte.

Which Modern Greeks and (South) Italians and Mycenaeans do you want? I can run them here.

Samuel Andrews said...

@EastPole,
"Are Scottish and Irish Celtic or Germanic genetically?"

Irish have no significant non-'Celtic' ancestry, which is why they are so similar to Iron Age Celtic Britons. Almost all Scottish don't have big Germanic (English) admixture either. Celtic might not have spread with gene flow. What is referred to as the Celtic genetic signature is probably just British Bell Beaker.

Samuel Andrews said...

@Ric Hern,

Vikings smikings. It sounds cool but unlikely. Raiders can't make a big impact on already established populations.

Davidski said...

@OG

See When_in_Rome's comment here.

Seems like he/she wants a fairly comprehensive analysis of Italian and Greek genetic prehistory. But I'm not sure if we have all of the necessary ancient populations for that.

When_in_Rome said...

@OG, Davidski

I'm trying to understand the peopling of Italy by measuring modern Italians (north, south, central, Sicilian) based on the populations I mentioned above (unless you think there are any better proxies). I picked those populations because I thought they best represented the migrations into the peninsula over the last 3000+ years, but I don't know how to run any stats. Is it possible to run a model like a qpGraph? This way we could understand the percentages of not just the populations I mentioned, but also of the populations that formed them, such as the Steppe component.

Davidski said...

@When_in_Rome

Have you tried Global25/nMonte?

Unleash the power: Global 25 test drive thread

It's easier to use than formal stats modeling but it produces very similar results.

Ric Hern said...

@ Samuel

It looks like the picture changed a bit regarding the Vikings. Looks like they settled and integrated nicely. Limerick and Dublin were especially big Viking trading posts. And there is some evidence about recent finds in Iceland that are interesting.

The picture starts to look more like cooperation than anything else...

Celtic Dragongod said...

When the Germans conquered Celtic areas they killed the men and mated with the women. Therefore, as the Germans proceeded to conquer more and more Celtic territory they ended up acquiring more and more Celtic DNA. This is likely why the English and the Dutch seem more Celtic than the Germans and the Austrians.

Davidski said...

Certainly, the Low Countries, like Belgium, were populated by Celts before the Germanics arrived there, and it's unlikely that they didn't contribute significant ancestry to the modern populations in the region.

Also, it should be noted that there has been a lot of immigration within Northwestern Europe in the past thousand years, including from the British Isles and Ireland to Scandinavia, especially to Iceland and Norway, but also Sweden.

Heck, Scots even emigrated on a large scale to Germany and Poland during the Middle Ages. Some people in Poland still have Scottish names that have been Polonized to sound more Polish.

Ric Hern said...

@ Celtic Dragongod

What precisely is Celtic DNA ?

Because what I see in Germany, France, Britain etc. is a good old mixture of mainly Bell Beaker Culture related DNA....

The Romans couldn't really with absolute certainty distinguish many Celtic and Germanic Tribes living on the opposite sides of the Rhine. Many apparently had mixed Cultural heritage.

When we look at different dialects especially just in the Netherlands we see how extremely different they can be from each other within a very small country....

Garvan said...

Would it be possible to use Global 25 to estimate Hallstatt ancestry in Irish, or even Viking ancestry? Looking at the positions on the PCA, I can't see what populations I should use in nMonte. Just for fun.

Celtic Dragongod said...

In this context, "Celtic DNA" would be genetic material that is shared between the Irish and the Scots but not between the Germans and the Austrians. By the same token, "Germanic DNA" would be genetic material that is shared between the Germans and the Austrians but not between the Irish and the Scots.

When_in_Rome said...

@ Davidski

I'm still trying to figure out the Global25/nMonte. I've downloaded all the files from the link you sent me and converted them to .csv files in Excel. I also selected the populations I thought were best and created a separate file. I know I need to use nMonte in R, but I'm not sure where to go from here. Any advice would be appreciated. Thanks

Davidski said...

@When_in_Rome

Don't convert those Global25 files to anything. You'll need them as text files.

Download this zip folder, extract the files into your Documents, and then follow the quick start guide that is also included. Obviously, you'll also need nMonte3 in your Documents.

LINK

Then manipulate the data and target files accordingly depending on which population you would like to model and how. They're just text files, like the Global25 sheets, and all you need to do is to copy populations from the Global25 sheets to the data and target files.

Just make sure to never try to model scaled data with unscaled data, and vice versa.
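To make the idea behind this kind of modeling concrete, here is a rough Python sketch. It is not nMonte, which is an R script with its own algorithm. This is only a simple greedy stand-in that looks for percentages of reference populations whose weighted average lies closest, in Euclidean distance, to the target. The function names, the 0.5% step and the toy three-dimensional coordinates are all hypothetical. As stressed above, the target and references must come from the same kind of sheet, either all scaled or all unscaled.

```python
import math


def mix(refs, weights):
    """Weighted average of the reference coordinates."""
    dims = len(next(iter(refs.values())))
    return [sum(weights[n] * refs[n][d] for n in refs) for d in range(dims)]


def distance(a, b):
    """Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_mixture(target, refs, step=0.005):
    """Greedy search: shift one `step` of ancestry between references
    for as long as doing so brings the mixture closer to the target."""
    names = list(refs)
    units = round(1 / step)
    counts = {n: 0 for n in names}
    counts[names[0]] = units  # start with 100% of the first reference

    def current_distance():
        weights = {n: counts[n] / units for n in names}
        return distance(mix(refs, weights), target)

    best = current_distance()
    improved = True
    while improved:
        improved = False
        for src in names:
            for dst in names:
                if src == dst or counts[src] == 0:
                    continue
                counts[src] -= 1
                counts[dst] += 1
                d = current_distance()
                if d < best - 1e-15:
                    best = d
                    improved = True
                else:
                    # Undo the move if it did not help.
                    counts[src] += 1
                    counts[dst] -= 1
    percentages = {n: 100 * counts[n] / units for n in names}
    return percentages, best


# Made-up toy coordinates, NOT real Global25 data.
toy_refs = {
    "Ref_A": [1.0, 0.0, 0.0],
    "Ref_B": [0.0, 1.0, 0.0],
    "Ref_C": [0.0, 0.0, 1.0],
}
toy_target = [0.6, 0.4, 0.0]

percentages, fit = fit_mixture(toy_target, toy_refs)
for name, pct in percentages.items():
    print(f"{name},{pct:.1f}")
print(f"distance = {fit:.4f}")
```

The output has the same general shape as an nMonte result (population, percentage, plus a fit distance), but the numbers from a stand-in like this should not be expected to match nMonte exactly.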

When_in_Rome said...

@ Davidski

Thanks, when I figure it out I'll send any good results I get.

When_in_Rome said...

@ Davidski

Italian_South
"distance% = 1.6992"

Mycenaean,27
Barcin_N,24
Levant_BA_North,21.4
Yamnaya_Samara,20.8
Armenia_EBA,4.4
WHG,1.8
Iberomaurusian,0.6

This is what I got for Southern Italians. I was playing around with it for a while to see the best fit for Middle Eastern DNA, and it looks like Northern Levant as opposed to Southern. I know there was Greek colonization, but could the Greek admixture be this high?

Ric Hern said...

@ Celtic Dragongod

Maybe this will be informative ?

https://www.eupedia.com/europe/autosomal_maps_dodecad.shtml#French_German

Davidski said...

@When_in_Rome

Around 27% or even more ancient Greek admixture in southern Italy isn't out of the ballpark IMO.

But when modeling you should try to stick to mostly similar periods and relatively highly differentiated reference populations, otherwise you're offering the algorithm essentially the same components, which can result in confusing results.

For instance, in your model Mycenaean is just another layer of Barcin_N, Yamnaya_Samara, Armenia_EBA, and WHG, except a more recent version, which might confuse the algorithm. So you need to represent ancient southern Italians with something more proximate, like Beaker_Sicily. Actually, that's all there is at the moment for this type of model, but hopefully we'll have more soon.

PF said...

I just got the G25 coordinates of a friend of mine (thanks Davidski) -- first modern Northern Euro genome I'm playing with.

She is Irish and English... at least fully half-Irish, not sure about the exact breakdown of the other half. Of course I expected lots of overlap between people from the British Isles and NW Euros more generally, but damn, didn't think it would be quite this hard to disentangle. Guess getting more meaningful results would require hyper-local regional data.

Here's the PCA generated from the dataset in this post, and, in an effort to zoom in a bit, a second one with all the Eastern Euro populations removed: https://ibb.co/eLvd59

Also, nMonte pop averages results:

"distance%=1.4464"

English_Cornwall,53.4
Orcadian,46.6

Celtic Dragongod said...

PF: It would seem that your friend's "English" ancestry comes from the more Celtic parts of Britain.

Davidski said...

Yeah, there are lots of caveats to modeling such fine scale ancestry.

The major one is that there are very few regional groups from Europe in the Global25 datasheets. So even if your ancestors were, say, German from way back, you might still come out mostly Czech if they were from near the Czech border, and thus shifted southeast relative to the German Global25 national average.

The same thing goes for the British Isles and Ireland. But here it's even more complicated.

Also, a lot of people know the ethnic origins of their ancestors, but they don't know their genetic structure, so they might think that they should come out 50/50 English/Irish, but they won't if, for example, their English ancestor had a lot of Irish ancestry from ancestors who identified as English.

Garvan said...

@PF

You should get better separation along the Irish-English-Dutch cline if you use the data set with the Eastern Euro populations removed, but select "Discriminant analysis LDA" instead of PCA in PAST3.
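Why a discriminant analysis can separate groups better than PCA is easier to see in a small sketch. PCA picks the directions of greatest overall variance, while LDA uses the group column to pick directions that maximize between-group variance relative to within-group variance. The Python below is a generic textbook LDA, not PAST's implementation, and the function name and toy data are made up for illustration.

```python
import numpy as np


def lda_scores(X, groups, n_axes=1):
    """Project samples onto the leading linear discriminant axes."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    grand_mean = X.mean(axis=0)
    dims = X.shape[1]
    within = np.zeros((dims, dims))   # within-group scatter
    between = np.zeros((dims, dims))  # between-group scatter
    for g in np.unique(groups):
        Xg = X[groups == g]
        group_mean = Xg.mean(axis=0)
        within += (Xg - group_mean).T @ (Xg - group_mean)
        diff = (group_mean - grand_mean)[:, None]
        between += len(Xg) * (diff @ diff.T)
    # Axes are eigenvectors of within^-1 @ between, largest eigenvalue first.
    evals, evecs = np.linalg.eig(np.linalg.pinv(within) @ between)
    order = np.argsort(evals.real)[::-1]
    axes = evecs.real[:, order[:n_axes]]
    return (X - grand_mean) @ axes


# Toy data: the groups differ only slightly in the second column,
# while most of the overall variance sits in the first column.
toy_X = [
    [-3.0, 0.2], [0.0, 0.1], [3.0, 0.3],     # GroupA
    [-3.0, -0.2], [0.0, -0.3], [3.0, -0.1],  # GroupB
]
toy_groups = ["GroupA"] * 3 + ["GroupB"] * 3

scores = lda_scores(toy_X, toy_groups)[:, 0]
for group, score in zip(toy_groups, scores):
    print(f"{group} {score: .3f}")
```

In this toy case the first principal component would follow the first column and mix the two groups, whereas the discriminant axis puts them on opposite sides of zero. The trade-off is that LDA depends on the group labels being meaningful, so a mislabeled sample can distort the axes.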

Davidski said...

I'm liking that Discriminant Analysis option. It's squeezing even more out of the data.

Theo Deric said...

How do I acquire my personal Global25 coordinates? I'm new to this.

Davidski said...

@Theo Deric

Genetic ancestry online store (to be updated regularly)