
Wednesday, May 4, 2016

Ice Age Europeans in a global PCA


The datasheet can be downloaded here. It might be possible to model many of the populations in the PCA as mixtures of Ice Age Europeans and other ancients using nMonte, but I haven't tried this yet. Most of the samples in this run are from various Reich lab public data releases available here.
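The mixture-modelling idea mentioned above can be illustrated with a minimal sketch. This is not nMonte itself, whose internals are not described in this post; it is a generic Monte Carlo search, assuming each population is a list of PC coordinates. The population names (`Source_A` and so on), the coordinates, and the functions `distance`, `mix` and `fit_mixture` are all hypothetical and exist only for this example.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mix(sources, weights):
    """Weighted average of the source coordinates, dimension by dimension."""
    dims = len(next(iter(sources.values())))
    return [sum(weights[name] * coords[d] for name, coords in sources.items())
            for d in range(dims)]


def fit_mixture(target, sources, slots=100, iterations=5000, seed=0):
    """Split `slots` units of ancestry among the sources, then repeatedly
    move one unit between two random sources, keeping the move only if
    the mixture lands closer to the target."""
    rng = random.Random(seed)
    names = list(sources)
    counts = {name: 0 for name in names}
    for _ in range(slots):
        counts[rng.choice(names)] += 1

    def current_distance():
        weights = {name: counts[name] / slots for name in names}
        return distance(target, mix(sources, weights))

    best = current_distance()
    for _ in range(iterations):
        donor, recipient = rng.sample(names, 2)
        if counts[donor] == 0:
            continue
        counts[donor] -= 1
        counts[recipient] += 1
        trial = current_distance()
        if trial < best:
            best = trial
        else:
            # Undo the move: it did not improve the fit.
            counts[donor] += 1
            counts[recipient] -= 1
    return {name: counts[name] / slots for name in names}, best


# Made-up coordinates (NOT values from the datasheet).
sources = {
    "Source_A": [1.0, 0.0, 0.0],
    "Source_B": [0.0, 1.0, 0.0],
    "Source_C": [0.0, 0.0, 1.0],
}
# Build a target that is a known 70/30 mix of A and B, then try to recover it.
target = mix(sources, {"Source_A": 0.7, "Source_B": 0.3, "Source_C": 0.0})
weights, fit_distance = fit_mixture(target, sources)
print(weights, fit_distance)
```

With real data, the coordinates would come from the datasheet's dimensions, and a low final distance would only mean the target can be approximated by the chosen sources in this PCA space, not that the mixture actually happened.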
See also...

On the modern genetic affinities of Ice Age Europeans

PC/nMonte open thread

For a typical West Eurasian PCA, the Epipaleolithic is the final frontier

34 comments:

George Okromchedlishvili said...

From this plot it looks like CHGs are less basal than ENFs, or I am misinterpreting stuff.

Olympus Mons said...

Davidski, when I asked for Satsurblia it was just to see how close it would be to Villabruna.... And it's really close.

I thought from the get go that south Caucasus from 8th up to end 6th millennia BC was crowded with population (R1b) that would cluster with Villabruna.

It's just that the Ubaid going to Uruk going to Mesopotamia really kicked the shit out of resident people (Shulaveri, Aratashen, even Tsiak tepe in Iran) by 4900 BC, so badly that it really changed the face of the whole Caucasus, which from that point on was J2/J1 all the way up to the mountains.

Davidski said...

@George

Not sure. CHGs might just share more drift with Central Asians, while the Anatolians share more with most Africans.

I'm not convinced that Satsurblia is 68/32 MA1-like/Basal Eurasian, like it was shown in the Fu paper. The high level of non-Basal ancestry in their models might be an artifact of the data and methods.

Davidski said...

@Olympus

Well, you should try to model Satsurblia and Kotias as part Villabruna using these nine dimensions, and see what happens. It might work.

Olympus Mons said...

Davidski.
If only I knew how to.... :-)

Nirjhar007 said...

Nice PCA.

Karl_K said...

@Olympus Mons

"Davidski, when I asked for Satsurblia it was just to see how close it would be to Villabruna.... And it's really close."

But remember that this is just a 2D plot of clusters of SNPs that differ THE MOST between ALL of the populations involved. It does not imply that those SNPs are connected by history.

Just because they are close on this plot doesn't mean that they would be close on a different plot with a different alternative clustering based on history (which would only be possible with all populations and time points accounted for).

That is why you can't just stick Africans onto a plot based on Eurasians. They will all be 100 times as far from the origin as anyone else.

Olympus Mons said...

@Karl_K
Yes, thank you. It makes sense.

It's just that me and my pet theory are screwed, because if we don't target the south Caucasus for DNA at a specific date range (8th-5th M BC) then we will not find the specific cluster of R1b that became the European R1b subclades. Everything after 4,900 BC is the mess that became the Sumerians and so forth, and that was the merger, or rather the subjugation, of populations by the more evolved Ubaid from the south.
All those layers of ashes and abandoned settlements are telling.

Karl_K said...

@Olympus Mons

"It's just that me and my pet theory are screwed, because if we don't target the south Caucasus for DNA at a specific date range (8th-5th M BC) then we will not find the specific cluster of R1b that became the European R1b subclades."

It could be that your pet theory has been screwed for several thousand years...

Olympus Mons said...

@Karl_K
Ahahah. It might very well be true. But I am sure I am Gioiello version 2.0 ;-)

I dare you!:
http://blogs.sapo.pt/cloud/file/eb6b52b82097d41dfa0e5797a2fa7945/olympusmons/2016/From%20Shulaveri%20to%20Bell%20beaker.pdf

ryukendo kendow said...

Frank, you can't really draw that conclusion that villabruna is ill-defined from pca... pca likewise clusters karelia, stuttgart and goyet together, but they cannot be part of the same cluster. The number of dimensions in this pca is too small, so things are being compressed together.

The place to get immediate visual feedback on the presence of the clusters is fig 3a, where the presence of three yellow squares tells us that those three core groups are quite apparent from just the way the leaves fell. Villabruna has an orange halo, but that just means that its members tend to have higher affinities with a general selection of other European HGs, or that some of them are not pure Villabruna, not that the square, representing a symmetrically autorelated group, comprising Loschbour to Berry for their formals, doesn't exist.

FrankN said...

Dave - two questions:

1.) I have problems telling apart Kostenki from the Vestonice Cluster. Could you mark Kostenki by an arrow? Thx!

2.) Oase and UI seem to be drawn to the African pole of the PCA, which makes sense in general. Which are those three African pops at the eastern end of the African cline?

@rk: I suppose you have been answering my last post in the previous thread:
"Visual inspection of Fig.3 indicates that the Villabruna cluster has been compiled somewhat arbitrarily. La Brana, and Bockstein and Ofnet (both mtDNA U5b1d1) are all quite distant from the remaining cluster members. Motala12 is hardly further distant than the a/m, and in the PCA clearly clusters with Ofnet and Bockstein, so I wonder why it hasn't been included in the Villabruna cluster as well. C.f., e.g., from Tab S5.6/7

D(Motala12,La Brana; Chaudardes,Mbuti) Z=-4.6
D(Ko1,La Brana; Chaudardes,Mbuti) Z=-3.4 "


Note that my point hasn't just been based on the PCA (Fig. 3b), but also on Fig. 3a. The "orange halo" of the wider Villabruna cluster, i.e. La Brana, Ofnet and Bockstein, is virtually indistinguishable from the Motala coloring. Especially in relation to Berry_au_Bac, Chaudardes, Rochedane and Ranchot, Motala becomes quite "yellowish", or at least light orange, more than Ofnet and Bockstein. IOW - for some yet unclear reason, Motala appears to display higher (in fact: the highest) affinity to the Eastern French than to the SW German Mesolithic.

"pca [Fig. 3b] likewise clusters karelia, stuttgart and goyet together" Indeed, and that obviously isn't making any sense! Do you think this is a weakness of just this specific PCA, or of PCA-based analysis in general?

Tobus said...

FrankN: I have problems telling apart Kostenki from the Vestonice Cluster.

There are 2 Kostenki samples (K12 and K14). They are the two (darker) dots just above the Vestonice cluster, and just to the right of the purple Goyet.

ghostnorris said...

It's interesting that the pre-Villabruna Europeans are all on the same Anatolian Neolithic-to-UI cline as CHG, despite the paper stating they lack basal admixture, and Villabruna are on the same cline as EHGs and Mal'ta.

Davidski said...

@Frank

I changed the color of the two Kostenki samples. They should be easier to spot now, but it might be useful to download the plot and zoom in.

The Africans at the far end of the plot are Yoruba from West Africa.

@ghost

Across the nine dimensions the pre-Villabruna and Villabruna are quite distinct from CHG. The position of the pre-Villabruna is probably influenced by Neanderthal admixture.

ryukendo kendow said...

@ Frank

Frank, Motala has the highest sharing with Karelian, MA-1 and Afontova Gora in the Villabruna-like group, seen in the orange stripe. The fact that the orange halo covers Motala probably means that the non EHG/ANE part of Motala is Villabruna-like or is just Villabruna, so Motala is indeed quite close to Villabruna, but the outlier affinities to ANE-like populations would complicate everything if we included him in the formal calculations, I would expect.

The yellow square is there for Villabruna, so the 'pure' cluster exists, but the orange halo probably means that there are many admixed representatives on the edge of Villabruna, unlike El Miron or Vestonice, which are separated from others by chasms. The formal stats in the paper that utilize the clusters restrict the Villabruna to only the very yellow populations, excluding Bockstein and Ofnet, which are probably admixed.

For the PCA, the exchange between ghostnorris and Davidski addresses perfectly what's going on when we make inferences from only 2 dimensions of a PCA. E.g. in the world PCA, NAms, Latin Americans, and South Indians will end up quite close, but this is because the first two dimensions load onto African-vs-Eurasian and European-vs-East Asian, and NAms and S Indians are forced into these two axes of variation, in which case both will be represented as Euro+E Asian mixes and therefore end up in a similar position. In higher dimensions they will be quite far apart, and you can check this by calculating the total distance via Pythagoras' theorem across all the dimensions. In cases such as Kostenki or Ust-Ishim, their individuality, which does not generalise to any other population, may not load onto any dimension at all, so U-I will end up in the middle of Eurasians and Kostenki will end up in the middle of West Eurasians, when, in terms of variation, they should be completely outside of the modern clusters altogether. So there are limitations of PCA to consider as well. Both these phenomena are probably influencing the position of Stuttgart, Goyet and Karelian.

An analogy via parallax: we are 'looking down' a tube from which perspective the difference between Han and Sardinian is the greatest, but from this perspective NAms and S Indians happen to align visually. Similarly for Fig 3b, where the perspective is optimised for distinguishing Han, El Miron and Loschbour, but from this perspective Karelian, Stuttgart and Goyet overlap.
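The "total distance via Pythagoras" check described above can be sketched in a few lines of Python. The coordinates below are invented purely for illustration (they are not taken from the datasheet), and `euclidean`, `pop_x` and `pop_y` are hypothetical names. Whether the dimensions should also be weighted, for example by eigenvalue, is a separate choice not addressed here.

```python
import math


def euclidean(a, b, dims=None):
    """Distance between two samples over the first `dims` PCs (all if None)."""
    if dims is None:
        dims = min(len(a), len(b))
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(dims)))


# Hypothetical PC coordinates for two populations that nearly overlap
# on PC1/PC2 but separate on the higher dimensions.
pop_x = [0.020, -0.010, 0.05, -0.04, 0.00]
pop_y = [0.021, -0.012, -0.06, 0.03, 0.02]

dist_2d = euclidean(pop_x, pop_y, dims=2)   # what the 2D plot shows
dist_all = euclidean(pop_x, pop_y)          # all five dimensions

print(f"first two PCs: {dist_2d:.4f}")
print(f"all PCs:       {dist_all:.4f}")
```

In this toy case the two populations sit almost on top of each other in the plot, yet are far apart once every dimension is counted, which is the parallax effect described above.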

ryukendo kendow said...

@ Shaikorth

Here are those Neand stats, ranked:

Jordanian 0.81 0.282 0.005 0
BedouinB 0.858 0.386 0.007 0
Samaritan 0.888 0 0.002 0
Palestinian 0.909 0.074 0.01 0
Iraqi Jew 0.926 0.231 0.02 0
Yemenite Jew 0.947 0.277 0.012 0
Druze 0.965 0.186 0.011 0
Iranian 0.968 0.351 0.022 0
Greek 0.975 0.579 0.005 0
Abkhasian 0.976 0.1 0.011 0
Crete 0.993 0.187 0.014 0
Chechen 1.019 0 0.025 0
French 1.023 0.188 0.012 0
Turkish 1.024 0.226 0.014 0
Spanish 1.031 0.13 0.018 0
Makrani 1.041 0.141 0.015 0
Tajik 1.064 0.068 0.016 0
Czech 1.067 0 0.028 0
Kapu 1.069 0.705 0.055 0
Balochi 1.07 1.046 0.026 0
Estonian 1.076 0.167 0.021 0
Armenian 1.077 0.121 0.013 0
Bulgarian 1.078 0.25 0.005 0
North Ossetian 1.079 0.226 0.013 0
English 1.085 0.21 0.015 0
Polish 1.086 0.24 0.036 0
Pathan 1.097 0.469 0.041 0
Brahui 1.099 0.261 0.018 0
Basque 1.1 0.098 0.011 0
Brahmin 1.101 0.635 0.064 0
Kalash 1.113 0.409 0.025 0
Hungarian 1.122 0.057 0.019 0
Lezgin 1.125 0.338 0.014 0.019
Adygei 1.126 0.119 0.02 0
Madiga 1.126 0.795 0.073 0
Mala 1.127 0.527 0.052 0
Orcadian 1.132 0.077 0.004 0
Kharia 1.133 0.38 0.085 0
Sardinian 1.133 0.2 0.009 0
Bergamo 1.134 0.015 0.02 0
Georgian 1.134 0 0.012 0
Russian 1.148 0.243 0.018 0
Tuscan 1.151 0.131 0.016 0
Punjabi 1.156 0.156 0.061 0
Norwegian 1.157 0.297 0.001 0
Yadava 1.157 0.469 0.047 0
Finnish 1.165 0.302 0.013 0
Uygur 1.17 0.398 0.057 0.019
Sindhi 1.174 0.188 0.048 0.022
Kashmiri Pandit 1.175 0.235 0.041 0
Relli 1.19 0.572 0.064 0.019
Irula 1.199 0.212 0.089 0
Albanian 1.203 0.334 0.019 0
Hazara 1.225 0.324 0.034 0
Icelandic 1.237 0.147 0.015 0
Bengali 1.261 0.268 0.063 0
Burusho 1.272 0.2 0.035 0.061
Kyrgyz 1.306 0.101 0.04 0
Mansi 1.311 0.091 0.04 0
Kurumba 1.313 0.751 0.081 0

Outside of the East Med, where the stats are depressed, there doesn't really seem to be a pattern. My pet theory is the 'Southwest Asian' component in the East Med hiding some cryptic ancestry from just outside the Eurasian clade, but overall I agree the modern figures don't give us much to work on.

ryukendo kendow said...

@ Alberto

Alberto, it's been a long time, but I'm only now sending some materials to you, so do check your email.

ryukendo kendow said...

@ David
Looking at the table S9.1 where MA-1 is compared to Euro HGs, while none of the figures cross 3, if we use 2 as the threshold there are large numbers of significant figures, especially for Loschbour, La Brana and KO1 against non-Villabruna HGs, but not Rochedane or Villabruna itself; i.e., precisely those pops which give us the East Asian signal. So whichever produced the East Asian signal may also give us some MA-1 signal, which we can hopefully explore via plots. If this turns out to be true, Kristiina's idea may receive some support.

@ FrankN
Frank, the actual cluster formation was done using D statistics to solidify the basic inference from f3s, and Motala does not fit well there either. You can check it out in the supplements.

Rob said...

@ Ryu & Alberto

Well done (to you & others too) for calling the ENA admixture in Final Palaeolithic / Mesolithic European HGs. The paper suggests that the Villabruna cluster itself is rather heterogeneous in terms of differential inputs from the 3 or more sources. However, the authors could only identify 2 of them with confidence: El Miron & the aforementioned eastern stuff. Presumably the 3rd or more sources are in parts of Europe not yet sampled. Have you any further thoughts?

epoch2013 said...

@ryukendo kendow

You can see similar things in figure 4b in the paper: If there is an Asian signal there also is an American signal. That is, for the mesolithic WHG's, Goyet116-a shows an Asian signal without an American signal.

Could this be ANE in Han instead of the other way around? IIRC the paper clearly states that Villabruna has no ANE admixture, though. The point is that not all Villabruna samples show this Asian affinity. The ones that do are neither clustered in time nor geographically together. See the difference in Rochedane and Bichon, living in the same neighbourhood in the same time.

This doesn't look as simple as a straightforward admixture event.

@Rob

The paper builds the case for no Basal Eurasian admixture in K14. That is all nice and dandy but K14 *does* show clear affinity with ENF and Sardinia. So if it isn't Basal Eurasian then something else must do it. Somehow that could be related to the elevated Middle-Eastern affinity in WHG. [1]

Mind you, KO1 shows considerably less El Miron and Goyet116-1. It could be closer to, if not a member of, the third population. It also has "weird" mtDNA, BTW. R1b. It also has the most Han affinity.

[1] That affinity already starts to grow in the El Miron cluster.

Shaikorth said...

It indeed could be a population that affected both East Asians and WHG's. Wong et al's Treemix runs on high coverage sequences kept modeling East Asians as mixes involving Oceanian or, in their absence, African-like or basal stuff. This persisted even when MA-1 was included.

http://oi68.tinypic.com/k2bs7s.jpg
http://oi67.tinypic.com/zujp90.jpg


Looking at extended figure 3 Loschbour and Bichon don't seem to be as Oceanian-shifted compared to older genomes as they are East Asian shifted (perhaps suggesting that the shared side is the non-Oceanian part hinted at by above runs), but could this be affected by older HG's having more archaic ancestry like Oceanians do?

epoch2013 said...

@Shaikorth

Whenever people mention Oceanian admixture I tend to think: "Genetic evidence for two founding populations of the Americas"

http://genetics.med.harvard.edu/reich/Reich_Lab/Welcome_files/nature14895_Skoglund_2015.pdf

There is an interesting little tidbit in the Ice Age paper. Oase 1 shows affinity, this time above significance level, to Karitiana. Karitiana is one of the Indian populations with a clear admixture signal from a mystery population related to Onge and Oceanian. See table S8.1 in the Supplementary PDF.

We could check if our mystery Asian affinity population could possibly contain a similar signal: Take a WHG with a clear Asian signal (Loschbour or KO1), outgroup Mbuti or Yoruba, and have it choose between Mixe and Surui. Repeat with a WHG without a clear Asian signal, such as Villabruna. Repeat with El Miron members.

@David

Could that be done?

Davidski said...

Can you post the stats to run, like this?

Mbuti Bichon Mixe Surui
Mbuti Loschbour Mixe Surui

epoch2013 said...

Certainly. And lets look at more samples.

Mbuti Bichon Mixe Surui
Mbuti Rochedane Mixe Surui
Mbuti Hungarian.KO1 Mixe Surui
Mbuti Villabruna Mixe Surui

Mbuti ElMiron Mixe Surui
Mbuti GoyetQ-2 Mixe Surui

Mbuti Vestonice16 Mixe Surui
Mbuti Vestonice43 Mixe Surui

Mbuti GoyetQ116-1 Mixe Surui
Mbuti Paglicci133 Mixe Surui

epoch2013 said...

@David

Oh, and since Oase 1 shows a slight Karitiana signal:

Mbuti Oase1 Mixe Surui

Mind you, is that the correct notation for Oase1? Similar question for KO1 in the previous row? You recently posted a list with the standard names but I lost it.

Davidski said...

Mbuti Bichon Mixe Surui -0.0003 -0.086 593636
Mbuti Rochedane Mixe Surui 0.0003 0.067 117063
Mbuti Hungary_HG Mixe Surui 0.0018 0.489 430565
Mbuti Villabruna Mixe Surui 0.0032 0.836 506947
Mbuti ElMiron Mixe Surui -0.0009 -0.226 386234
Mbuti GoyetQ-2 Mixe Surui -0.0016 -0.208 35729
Mbuti Vestonice16 Mixe Surui 0.0035 0.883 438073
Mbuti Vestonice43 Mixe Surui 0.0068 1.238 79225
Mbuti GoyetQ116-1 Mixe Surui 0.0071 1.739 413039
Mbuti Paglicci133 Mixe Surui 0.0008 0.124 39962

Davidski said...

Mbuti Oase1 Mixe Surui 0.0076 1.531 121380

Rob said...

I've only now had the chance to read the recent paper in a little more detail, and still rather briefly, but many interesting facets strike me, especially its implications for the archaeology & anthropology of prehistoric Europe, at both the general population and social-contextual levels. Some of it was already evident from the preceding mtDNA paper of Posth 2016. E.g.:

* The Magdalenians who recolonized western, northern and central Europe weren't the decisive element they were long & unanimously held to be. They appear to have been a brave vanguard which first ventured north, characterised by their U8 mtDNA line.

* It has long been recognised that the climatic variations around the Younger Dryas had profound effects on the settlement of north-central Europe. For example Scandinavia, northern Germany & Britain became depopulated again. Further south, in France, the Swiss Alps, etc. there is an Azilianization, whilst new industries - as Frank mentioned - appear in the north, often after a 500 - 1000 year hiatus. It was debated whether this process of change was a mere "adaptation", but we can see population change, with dominance of mtDNA U5 and Y DNA I.

* The Villabruna cluster appears to basically be Mesolithic WHG, and its Late Glacial (immediate) forebears which spread across Europe, with some Magdalenian input increasing toward the west.

* About the CHG, again congrats to all the fellas for figuring this out all along - CHG is BE + some other west Eurasian lineage, in fact mostly ANE (~ 67%).

* Which brings us to what the "Near Eastern" shift of the Villabruna cluster means. The authors aren't shy about ascribing it to a significant migration. Yet it's not the BE portion of CHG, nor the CHG cluster itself, which must mean it's the ANE part of the Satsurblia-like population.


* Finally about the R1b in Villabruna. This is Venetia, NE Italy, which probably had close links with east-central Europe / northern Balkans / Carpathian basin. I'd guess the next place east of that is Ukraine - Russia. So we know R1 has been in Europe since at least 15 kya, and possibly earlier in EE.

epoch2013 said...

@Davidski

Thanks a lot. Not a trace of the mystery population that popped up in American Indians. There were speculations that population is derived from a first wave OoA remnant, so if there was anything left in Europeans I gathered it should pop up in the UP.
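The "not a trace" reading can be checked mechanically. The sketch below parses the rows Davidski posted above and flags any that reach a significance threshold. It assumes, and this is an assumption rather than something stated in the thread, that the three numeric columns are the D value, the Z-score and the SNP count, in that order. The names `ROWS` and `parse_rows` are hypothetical, and the thresholds of 3 and 2 are the ones mentioned earlier in these comments.

```python
# Rows copied from the comments above; the column meaning (D, Z, SNPs)
# is an assumption about the output format, not stated in the thread.
ROWS = """\
Mbuti Bichon Mixe Surui -0.0003 -0.086 593636
Mbuti Rochedane Mixe Surui 0.0003 0.067 117063
Mbuti Hungary_HG Mixe Surui 0.0018 0.489 430565
Mbuti Villabruna Mixe Surui 0.0032 0.836 506947
Mbuti ElMiron Mixe Surui -0.0009 -0.226 386234
Mbuti GoyetQ-2 Mixe Surui -0.0016 -0.208 35729
Mbuti Vestonice16 Mixe Surui 0.0035 0.883 438073
Mbuti Vestonice43 Mixe Surui 0.0068 1.238 79225
Mbuti GoyetQ116-1 Mixe Surui 0.0071 1.739 413039
Mbuti Paglicci133 Mixe Surui 0.0008 0.124 39962
Mbuti Oase1 Mixe Surui 0.0076 1.531 121380
"""


def parse_rows(text):
    """Return (test_sample, d, z, snps) tuples, one per non-empty line."""
    results = []
    for line in text.strip().splitlines():
        fields = line.split()
        sample = fields[1]  # second population is the ancient sample tested
        d, z, snps = float(fields[4]), float(fields[5]), int(fields[6])
        results.append((sample, d, z, snps))
    return results


stats = parse_rows(ROWS)
significant_at_3 = [row for row in stats if abs(row[2]) >= 3.0]
significant_at_2 = [row for row in stats if abs(row[2]) >= 2.0]
largest = max(stats, key=lambda row: abs(row[2]))

print("rows with |Z| >= 3:", significant_at_3)
print("rows with |Z| >= 2:", significant_at_2)
print("largest |Z|:", largest[0], largest[2])
```

Under that reading of the columns, no row reaches even the looser threshold of 2, which is consistent with the conclusion drawn above.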

Chad Rohlfsen said...

CHG has a lot of Villabruna. It is equally close to MA1 and only slightly closer to AG3 than Villabruna.

Olympus Mons said...

@Chad,

Can you elaborate a little more....?

Open Genomes said...

An interactive 3-D version of the Ice Age Eurasians World PC Plot:

http://www.open-genomes.org/analysis/PCA/Eurogenes_Ice_Age_Eurasians_PC_plot_1-2-3.html

See the comments on the next post, "On the modern genetic affinities of Ice Age Europeans".

Davidski said...

Very nice, thanks.