Eurogenes Blog: qpGraph open thread

Thursday, June 8, 2017

qpGraph open thread

I managed to put together a simple qpGraph model for the Kalash using present-day populations. It's largely based on the model for the Paniya by Nakatsuka et al. (see Supplementary Figure 5. here). The graph and pops files for my model can be downloaded here and here, respectively. I'm now working on a more complex model for the Kalash that includes ancient genomes from Eastern Europe and West Asia.

I'm willing to take a few requests for qpGraph models in the comments below. Please note, however, that these requests will have to be accompanied by graph and pops files, and the graph files must be correctly set out; if they don't work, then they don't work, and you won't get your graph. On the other hand, you only need to supply pops files with the correct populations and I'll do the rest.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

80 comments:

P Piranha said...: The graph files have to be specified manually, including admixture percentages, no? Wouldn't this have to involve a huge amount of trial and error before you get any graph for which no statistic is off? Did you go through that process?; June 9, 2017 at 12:15 AM
Davidski said...: No trial and error process to get the graph above; I basically just copied the graph from the Nakatsuka et al. preprint.; June 9, 2017 at 12:20 AM
Matt said...: IIRC, qpGraph is fitting all the f2, f3, f4 statistics possible within the tree, so does any model have to fit all of these exactly? Or is it just a question of better / worse fits and those that fall under "infeasible"? That would give more leeway.; June 9, 2017 at 12:26 AM
Davidski said...: If you're asking if the graph files have to produce models that work, then no, they just have to be correctly set out, and then we'll see if your model works or not. In other words, the graph files will produce a model if correctly set out, even if the model itself is a fail.; June 9, 2017 at 12:34 AM
Matt said...: No, sorry, that's not really what I was trying to say. It was just a response to P Piranha: I would think the graphs may not have to be exhaustive or involve huge trial and error, because not every stat has to be exactly right (just enough that the model is feasible under some p value). Thank you anyway though.

I might have a look at the models in Reich and Lipson's 2017 working model for Eurasians and Fu et al 2016's model for Eurasians, because I had questions there about why they changed between them, if Fu's models fit.

Also would've been interesting to replicate Schlebusch's model with an addition of Neanderthal and Denisovan to the scaffold, and Papuan in place of Sardinian, but I guess that will have to wait until they publish their Ballito Bay data.; June 9, 2017 at 1:05 AM
Unknown said...: Off-topic

Does anyone know why the I1945 & I1949 samples from Neolithic Ganj Dareh appear in the Lazaridis 2016 paper as P1(xQ,R1b1a2,R1a1a1b1a1b,R1a1a1b1a3a,R1a1a1b2a2a) & CT Y-DNA haplogroups, but in the Bell Beaker paper they appear as R??; June 9, 2017 at 1:45 AM
Davidski said...: Probably because they sequenced more of the Y-chromosomes for those samples.

When they sequence even more they'll probably see that they belong to R2.; June 9, 2017 at 2:04 AM
Unknown said...: Thanks.
If you want, take a look at the estimated ages of R1a-M198* in Pakistan/Afghanistan/India & R1b-M269* in Iran/Turkey https://doi.org/10.1371/journal.pone.0041252.s011; June 9, 2017 at 2:16 AM
Davidski said...: Those estimates are outdated and totally useless.

If you want to know where R1a and R1b are oldest and where they may have originated, then just look where they're found in ancient DNA from forager populations.

So far it's Europe and Siberia.; June 9, 2017 at 2:21 AM
Unknown said...: Why are they outdated? I'm not looking at the absolute values but at the ratios with other pops; June 9, 2017 at 2:34 AM
Davidski said...: See here, including the comments...

http://eurogenes.blogspot.com.au/2011/11/origins-of-r1a1a1-in-or-near-europe-aka.html

http://polishgenes.blogspot.com.au/2014/03/the-story-of-r1a-academics-flounder-as.html; June 9, 2017 at 2:37 AM
Unknown said...: The posts are about R1a; the Passarino paper & the first post predate Grugni et al. (2012). Regarding the second (Underhill), in the supp. material the oldest coalescent time estimate is for M417 from Europe (as a total region), followed closely by Pakistan, so IMO it's not very outdated, and it doesn't solve the question about R1b

Anyway, thanks for the links.; June 9, 2017 at 3:33 AM
truth said...: Seems like this was done before knowing about CHG, WHG, Steppe, etc.; June 9, 2017 at 4:33 AM
K33 said...: Help an ignorant guy out...

What do the numbers beside the solid lines mean in qpGraph graphs? I know the dotted lines and their values imply admixture proportions...

I thought the numbers accompanying solid lines meant "kya" but 92,000 years of genetic isolation for the Onge doesn't seem to make sense?; June 9, 2017 at 6:58 AM
Arza said...: What do these numbers mean in the data.pops file?
Are more than one admixture events allowed in the tree?; June 9, 2017 at 8:12 AM
AP said...: @Algan mardi

It is outdated enough that Underhill himself no longer would use those numbers that were based on Zhiv and STRs. Perhaps Roy King who posts here and is a co-author on the paper could confirm.; June 9, 2017 at 10:29 AM
Seinundzeit said...: This is going to be fun; I've always wanted to see David work with this methodology.

Also, considering the reference populations, that model of the Kalasha is very sensible.

If I model the Kalasha with PCA-based nMonte, using only Georgians and an average of the Andamanese_Onge + Andamanese_Jarawa + Austroasiatic_Bonda, this is what I get:

71% Georgian + 29% ASI

distance=1.3491

Very similar to what we see with qpGraph.

Now, with ancient references:

Kalash

51.2% Iran_Meso/Neo (averaged Iran_Neo and Iran_Hotu) + 0.6% MA1

37.1% Steppe (average of Srubnaya_outlier + Potapovka_outliers + Srubnaya)

11.1% ASI (same as in previous model)

distance=0.4191

With aDNA, looks much more sensible.

So, it'll be very interesting to see if David's qpGraph modelling of the Kalasha ends up looking similar to the fit above.

Still, like Matt, my primary interest with this sort of methodology lies in deep modelling, and I would really love to see something like the Reich and Lipson working model, but with the addition of a few extra Paleolithic genomes, the Yoruba, Mota, and modern West Eurasians (I'm thinking maybe French and Kalash).

Note: The qpGraph model for the Paniya from Nakatsuka et al. can be easily replicated via nMonte.

Paniya

83.7% ASI + 16.4% Georgian

distance=1.2698

Identical to what we see with qpGraph.

Now, with aDNA:

Paniya

74.45% ASI

24.40% Iran_Meso/Neo + 1.15% MA1

distance=0.9347; June 9, 2017 at 11:19 AM
capra internetensis said...: @Matt

I don't know, couldn't you skip the BBay genomes and just use Ju|hoan in the first place, if you're going to model the East African gene flow anyway?

I'd like to see how Pygmies do as sisters to the Basal Human element of Yoruba. Don't have time to put anything together right now unfortunately but I am looking forward to what other people come up with.; June 9, 2017 at 11:36 AM
P Piranha said...: David or any others, is there some algorithmic way of approaching the 'best fit' or 'least tense' tree given a set of populations? There will probably be multiple minima in terms of fit, but some will be more and some less parsimonious, and we can filter that way. Otherwise, as all these trees are created manually, the process will prove to be incredibly tedious as we keep feeding in different topologies, different ghost pops and different admixture percentages, not to mention confusing, as each change we make affects multiple stats that are misfit in an array of directions.

Maybe the entire program can be fit in a container Monte Carlo process? That seems extremely inefficient though, there must be some optimisation method somehow.

Anyone familiar with tree-building algorithmic methods with admixture? Maybe look at Treemix code?; June 9, 2017 at 12:45 PM
Ravai said...: Hello David, how about creating a model for the modern Ashkenazi populations with Levant_Neolithic, Italian_Tuscan, Polish, Lithuanian, Iran_Chalcolithic and Sidon_BA? Is that possible?

Regards; June 9, 2017 at 1:44 PM
Samuel Andrews said...: @David,

Have you downloaded any of these new ancient genomes from Romania and Spain...

https://3.bp.blogspot.com/-6wTyrhNIuWM/WSc8Tti_WqI/AAAAAAAAFqA/JlTXFMfDt5Ai5PwD0m4hLcgLdjfOWVYBQCLcB/s480/Gonzalez-Fortes_Table_1.png

If you have, can you do some type of graph with all the European HGs and MA1-AG3?; June 9, 2017 at 5:01 PM
Chad said...: qpGraph up and running here too

https://drive.google.com/file/d/0B962TtPkX1Ync2JOM2FJaTkwOEU/view?usp=sharing; June 9, 2017 at 5:25 PM
Davidski said...: @Arza

The numbers in the pops file are sample counts. Obviously, only I know how many samples from each population are in my dataset, so you can skip this part when making a pops file.

@Piranha

No such thing available at the moment. And I've basically quit TreeMix because it's too temperamental and its output often difficult to interpret.

@Samuel

Those ancient genomes aren't available yet.

@Chad

I don't think many people reading this will be able to view dotfiles. You should post a screencap.

@David Rabaez

As per my blog post, I don't have the time to design topologies for the requested graphs. You guys need to do that and supply me with the graph and pops files.

Even fairly simple graphs can take a couple of hours each to plan properly. Complex graphs can take a full day or even a week.

Good luck.; June 9, 2017 at 6:38 PM
Chad said...: Wasn't thinking there. I'm doing complex ones now. This is cool, but labor intensive.; June 9, 2017 at 7:03 PM
Chad said...: Doubt that this is totally legit, but interesting. Inspiration from one of the graphs in Lazaridis et al (2016), with some extra doctoring.

https://drive.google.com/file/d/0B962TtPkX1YnakgtX1JRVnF4eEk/view?usp=sharing; June 9, 2017 at 9:12 PM
Chad said...: Still playing around...

https://drive.google.com/file/d/0B962TtPkX1YnVUN3X19pN2tzbXM/view?usp=sharing; June 9, 2017 at 9:57 PM
Matt said...: @Davidski: I've had a go at replicating figures from S8 from Haak 2015, to see if I really understand how these files are put together, and it's one I wanted to modify to compare with Reich+Lipson 2017.

S8 3-5 Pops: https://pastebin.com/6tivAcLb
S8-5 Graph: https://pastebin.com/CLck5EBM
S8-3 Graph: https://pastebin.com/ZDipE9R2
S8-5 Original: http://i.imgur.com/5rlJsyc.png
S8-3 Original: http://i.imgur.com/wIg6gue.png

Is this all laid out right and would run in theory?; June 10, 2017 at 2:50 AM
Davidski said...: @Matt

https://drive.google.com/file/d/0B9o3EYTdM8lQV3Q3cEQxanNPUm8/view?usp=sharing

https://drive.google.com/file/d/0B9o3EYTdM8lQeEp1aWV5RURvdFU/view?usp=sharing; June 10, 2017 at 5:12 AM
Matt said...: Ah, thanks. S8-5 looks right, I messed up S8-3 by switching a B and D node, and should be: https://pastebin.com/qLfgnLKm (not to re-run necessarily unless you wanted to test!).

I noticed the proportions changed a little from the input graph, is that an automatic thing or did you have to manually tweak after the software threw back infeasible results?

I modified S8-5 to try and take in some of features of "working model of the deep relationships" model: https://pastebin.com/8Ms91rSC

Also, how do we tell if a model is a success or fail from the three files in the .zip, or if one's a better fit than another?; June 10, 2017 at 6:03 AM
Project "Magnus Ducatus Lituaniae" said...: As far as I understand, a model is considered well-fitted if the absolute value of the final Z-score is less than 3 (|Z| < 3).; June 10, 2017 at 7:44 AM
Davidski said...: @Matt

https://drive.google.com/file/d/0B8XSV9HEoqpFTm9fN1hLNzNHZzg/view?usp=sharing

The other model you wanted failed due to a lack of overlapping SNPs. I need to have a closer look at why that happened.

And yeah, it looks like qpGraph estimates new admix coefficients regardless of what you put in the graph file. Also, the stat at the top of the image tells you how well the model went. Anything less than 3 means the model worked.; June 10, 2017 at 7:50 AM
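The rule of thumb described in the two comments above (a worst-fit Z-score with an absolute value under 3 is treated as a working model) can be sketched as a small check. This is only an illustration: the function name and the example values are hypothetical, and the Z-scores would have to be copied by hand from the qpGraph output, since nothing here parses a real output file.

```python
def model_fits(z_scores, threshold=3.0):
    """Return (fits, worst), where worst is the Z-score furthest from zero.

    The threshold of 3 is the rule of thumb quoted in this thread,
    not a value read from qpGraph itself.
    """
    if not z_scores:
        raise ValueError("need at least one Z-score")
    worst = max(z_scores, key=abs)
    return abs(worst) < threshold, worst


# Hypothetical Z-scores, for illustration only.
print(model_fits([0.8, -1.9, 2.4]))    # worst is 2.4, under the threshold
print(model_fits([0.8, -4.22, 2.4]))   # worst is -4.22, over the threshold
```

Taking the value furthest from zero, rather than the largest value, matters because a strongly negative Z-score is as much of a misfit as a strongly positive one.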
Matt said...: Cheers both.; June 10, 2017 at 8:39 AM
Anonymous said...: @Matt & Davidski

Ha! So the Basal Eurasian in K14 is back on the table, albeit in a watered down version.; June 10, 2017 at 8:48 AM
Samuel Andrews said...: @David, Chad

Bronze age DNA Central Asia, Viking age DNA Britain, recent DNA Denmark.

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0170940

M.Myllyla already did analysis of the Viking age British DNA.

http://terheninenmaa.blogspot.fi/2017/06/british-viking-age-samples-placed-on.html

In D-stats it shares most alleles with Swedish, Irish, and Scottish.; June 10, 2017 at 12:25 PM
Davidski said...: @Arza

You can only have two edges at each population. The output is saying that you have more than 2 at EurAs2.

Also, I don't have all of the samples that you want in the same dataset. So instead of Bonda you'll have to choose Ho, and instead of Paniya you'll have to choose Gond.; June 10, 2017 at 2:26 PM
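The "two edges" limit mentioned above can be checked before submitting a graph file. The sketch below reads that limit as "no node may have more than two outgoing (child) edges", which is an interpretation of the comment rather than something taken from the qpGraph documentation. It also assumes the line layout used in the graph files posted in this thread (`edge <name> <parent> <child>` and `admix <child> <parent1> <parent2> ...`); the function names and the small example graph are hypothetical.

```python
from collections import defaultdict


def parse_children(graph_text):
    """Map each parent node to the list of its children.

    Assumed layout (as in the graph files posted in this thread):
      edge  <edge_name> <parent> <child>
      admix <child> <parent1> <parent2> [weights...]
    """
    children = defaultdict(list)
    for line in graph_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # root/label lines and blanks carry no edges
        if parts[0] == "edge":
            children[parts[2]].append(parts[3])
        elif parts[0] == "admix":
            # An admixed node hangs off both of its parents.
            children[parts[2]].append(parts[1])
            children[parts[3]].append(parts[1])
    return children


def overloaded_nodes(graph_text, limit=2):
    """Return {node: children} for every node with more than `limit` children."""
    found = parse_children(graph_text)
    return {node: kids for node, kids in found.items() if len(kids) > limit}


# Hypothetical graph in which node A has three children.
example = """\
root Root
edge Mbuti Root Mbuti
edge A Root A
edge X A X
edge Y A Y
edge Z A Z
"""
print(overloaded_nodes(example))
```

An empty result means no node exceeds the limit under this reading; it says nothing about whether the model itself fits.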
Davidski said...: @Samuel

Yeah, I'll get those samples.; June 10, 2017 at 2:27 PM
Arza said...: OK, so maybe something like this (and thanks for patience):

Mbuti
Ust_Ishim
Satsurblia
Gond
Igorot
Onge

root Root
label Mbuti Mbuti
label Ust_Ishim Ust_Ishim
label Satsurblia Satsurblia
label Gond Gond
label Igorot Igorot
label Onge Onge
edge Mbuti Root Mbuti
edge A Root A
edge B A B
edge EurAs1 A EurAs1
edge Ust_Ishim EurAs1 Ust_Ishim
edge EurAs2 EurAs1 EurAs2
edge DeepEast EurAs2 DeepEast
edge DeepEast1 DeepEast DeepEast1
edge DeepEast2 DeepEast DeepEast2
edge EurAs3 EurAs2 EurAs3
edge West EurAs3 West
edge Satsurblia West Satsurblia
edge C West C
edge South EurAs3 South
edge Gond South Gond
edge D South D
admix Pre_Igorot C DeepEast1 35 65
edge Igorot Pre_Igorot Igorot
admix Pre_Onge D DeepEast2 79 21
edge Onge Pre_Onge Onge; June 10, 2017 at 3:19 PM
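A quick consistency check between a pops list and the label lines of a graph file, like the pair in the comment above, might look like the sketch below. It assumes that the first token on each pops line is the population name (anything after it, such as a sample count, is ignored) and that label lines read `label <population> <node>`; in this thread both names are always identical, so that ordering is an assumption. The function name is hypothetical.

```python
def compare_pops_and_labels(pops_text, graph_text):
    """Return (missing, extra).

    missing: populations in the pops file with no label line in the graph
    extra:   labelled populations that are absent from the pops file
    """
    # First token of each non-empty line is taken as the population name.
    pops = [ln.split()[0] for ln in pops_text.splitlines() if ln.strip()]
    labels = []
    for line in graph_text.splitlines():
        parts = line.split()
        # Assumption: "label <population> <node>".
        if len(parts) >= 3 and parts[0] == "label":
            labels.append(parts[1])
    missing = [p for p in pops if p not in labels]
    extra = [name for name in labels if name not in pops]
    return missing, extra


# Pops list and label lines copied from the comment above.
pops = "Mbuti\nUst_Ishim\nSatsurblia\nGond\nIgorot\nOnge\n"
graph = """\
root Root
label Mbuti Mbuti
label Ust_Ishim Ust_Ishim
label Satsurblia Satsurblia
label Gond Gond
label Igorot Igorot
label Onge Onge
"""
print(compare_pops_and_labels(pops, graph))
# Adding a population without a matching label line shows up as missing.
print(compare_pops_and_labels(pops + "Paniya\n", graph))
```

This only compares names. It does not check that the populations exist in any dataset, which only the person running qpGraph can know.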
Arza said...: Complex spatio-temporal distribution and genogeographic affinity of mitochondrial DNA haplogroups in 24,216 Danes

http://biorxiv.org/content/early/2017/06/10/148494; June 10, 2017 at 5:03 PM
P Piranha said...: How about something basic, like this?

https://shrib.com/#EarlyTree1

Pops here:
https://shrib.com/#EarlyTree1Pops; June 10, 2017 at 5:53 PM
Davidski said...: @Arza

The >3 Z score suggests the model doesn't work.

https://drive.google.com/file/d/0B8XSV9HEoqpFZlRpQ25qT0RZaVk/view?usp=sharing; June 10, 2017 at 9:42 PM
Davidski said...: @Arza

Trying your model again, this time with the inbreed: YES flag in the parfile.

https://drive.google.com/file/d/0B8XSV9HEoqpFcVVQR3VSUnFXSmc/view?usp=sharing

@Piranha

There's a problem with your graph: Ust_Ishim isn't linked to any part of it.; June 10, 2017 at 10:27 PM
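A population that is not linked to the rest of the graph, as described above, can be found by walking the graph from the root and listing any labelled node that is never reached. The sketch below is a plain breadth-first search under the same assumed line layout as the graph files posted in this thread (`root <node>`, `label <population> <node>`, `edge <name> <parent> <child>`, `admix <child> <parent1> <parent2> ...`). The function name and the example graph are hypothetical.

```python
from collections import defaultdict, deque


def unreachable_labels(graph_text):
    """Return labelled nodes that cannot be reached from the root."""
    root = None
    children = defaultdict(set)
    labelled = []
    for line in graph_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind = parts[0]
        if kind == "root" and len(parts) >= 2:
            root = parts[1]
        elif kind == "label" and len(parts) >= 3:
            labelled.append(parts[2])
        elif kind == "edge" and len(parts) >= 4:
            children[parts[2]].add(parts[3])
        elif kind == "admix" and len(parts) >= 4:
            children[parts[2]].add(parts[1])
            children[parts[3]].add(parts[1])
    if root is None:
        raise ValueError("no root line found")

    # Breadth-first walk from the root along parent -> child edges.
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return [node for node in labelled if node not in seen]


# Hypothetical graph: Ust_Ishim has no edge at all, and Ami hangs off
# a node X that is itself never attached to the tree.
example = """\
root Root
label Mbuti Mbuti
label Ust_Ishim Ust_Ishim
label Ami Ami
edge Mbuti Root Mbuti
edge A Root A
edge Ami X Ami
"""
print(unreachable_labels(example))
```

Both kinds of problem, a label with no edge and a branch that never connects back to the root, are caught by the same walk.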
P Piranha said...: edited, should be correct now; June 11, 2017 at 12:36 AM
Anonymous said...: Is there a quick and dirty howto on how to create these files and on qpAdm/qpGraph itself; June 11, 2017 at 2:18 AM
Garvan said...: Davidski said...
"@Arza
Trying your model again, this time with the inbreed: YES flag in the parfile"

The graphics are identical from the two runs you uploaded, as far as I can see.; June 11, 2017 at 2:20 AM
Matt said...: @Epoch2013: Well, the models from Haak with K14 having some kind of Basal Ancestry still work for those populations they include (and actually maybe better, thanks to improvements to qpGraph). So I'm still unsure why those were rejected in favour of Reich and Lipson's current working model. It may be that as all of the populations in their WM get added, the models with K14 having forms of Basal Ancestry start to fail... At the moment Davidski was having some problems running extensions.

I could upload a quick how to on how it seems to me that Graph input files are put together in a bit.

@ All: Had a go at doing as Capra suggested and inputting the model from Schlebusch 2017 for Africa, and just leaving out Ballito Bay:

Direct Copy: Pops https://pastebin.com/D0cndEDy
Direct Copy: Graph https://pastebin.com/eXAzCykJ

Plus a couple of modifications:

Pops: https://pastebin.com/8Z6UY8d0 (swap LBK_EN for Ust_Ishim, include Altai Neanderthal and Somali)

Graph A: https://pastebin.com/RwLWtSnj
(intended to add in Altai Neanderthal divergence and Neanderthal admixture into Ust Ishim, and Somali as population similar to the admixing population into JuHoanNorth).

Graph B: https://pastebin.com/BVHC8pma
(same as Graph A, but I had an irrational dislike of Mota's unadmixed position in A, so this has an extra admixture model for Mota, even though it's not as parsimonious as a graph).

@Capra, if you have any suggestion about where and how to add Mbuti to these graphs, I will try that.; June 11, 2017 at 2:36 AM
Davidski said...: @Piranha

The Ami are unrooted in your new topology.; June 11, 2017 at 3:59 AM
Davidski said...: @Matt

There are no valid SNPs for the models you posted. I don't know why?

Maybe Chad can try them and see whether I'm missing something?; June 11, 2017 at 4:38 AM
P Piranha said...: Corrected; June 11, 2017 at 5:32 AM
Chad said...: I will shortly. Gotta make my bacon, eggs, and corned beef hash first. I've got a massive tree to share too. It has WHG, East Asian root into ANE, Natufians, and Iran EN. It probably can use more tinkering, but it looks pretty solid.; June 11, 2017 at 5:36 AM
Chad said...: Another tree

https://drive.google.com/file/d/0B962TtPkX1YndUowYXVTNTQzNFU/view?usp=sharing; June 11, 2017 at 6:39 AM
Davidski said...: That looks pretty solid Chad.; June 11, 2017 at 6:50 AM
Chad said...: I'm going to test adding in Kostenki, Chimp, and Neandertal. Kostenki was the one I had trouble placing, which I think is caused by it being 2-3% more Neandertal than the others, so I'll see how it goes.; June 11, 2017 at 7:40 AM
Chad said...: Matt,

I had to do some doctoring to get it to work, but it doesn't look great. What is used for an outgroup here, Chimp?; June 11, 2017 at 8:10 AM
Matt said...: @Chad, thanks, yeah the three models in my post June 11, 2017 at 2:36 AM should diverge from Chimp as the outgroup around the root, then into "BasalHuman" or "Pre-Human" depending on the model.

Is it the basic model or the modifications that don't work so well or both? The basic model (https://pastebin.com/eXAzCykJ) should just have the same output as http://i.imgur.com/RnrA10j.png minus Ballito_Bay so should be able to fit, unless BallitoB is vital for the qpGraph to properly calculate distances etc or I've muffed the topology. Even the modifications should just mainly add admixture from Neanderthal that fits OK.

Nice tree btw. I'd like to see if you can fit K14 and the Upper Paleolithics to it, as that seems tricky with the existing models (leading to an extra Basal edge of some sort to K14, or else an admixture between East and West Eurasia HG).; June 11, 2017 at 8:44 AM
Project "Magnus Ducatus Lituaniae" said...: Chad, have you tried to model Iran_ChL as a two-way mix between Iran_N and Levant_N (Levant_N would require an additional separate stream of Basal-like ancestry)? The 'best' model I've got so far has a Z-score of 4.22, so the model does not fit the data.

https://1drv.ms/f/s!Al0_H1OD8OmfgSZXYdje8eLR5jCb; June 11, 2017 at 9:52 AM
Chad said...: I'll get to that. I've added CHG to the tree and tried EHG, but it is very complex. I'm working on adding Kostenki14 right now. I plan on tinkering with this most of the day, so, I'll post what works.; June 11, 2017 at 9:58 AM
Chad said...: Matt,

Ballito Bay is probably necessary as there are no stats from that making this tree. Adding and removing pops can drastically alter every admixture edge.; June 11, 2017 at 10:08 AM
Unknown said...: Could someone try to model this admixture with Oase1, Oase2, Ust'-Ishim, Kostenki14, Kostenki12 and Vestonice16 instead of Onge and Paniya?; June 11, 2017 at 2:38 PM
Unknown said...: >I will shortly. Gotta make my bacon, eggs, and corned beef hash first. I've got a massive tree to share too. It has WHG, East Asian root into ANE, Natufians, and Iran EN. It probably can use more tinkering, but it looks pretty solid.

Are you sure about Natufians? They and Anatolian Farmers don't seem to be mongoloid shifted unlike all the other mesolithic, neolithic and modern West Eurasians.; June 11, 2017 at 2:47 PM
Chad said...: Mongoloid shifted? What are you talking about? You might want to re-examine that tree.; June 11, 2017 at 3:00 PM
Chad said...: I should say, look at the tree before commenting on it. There's no East Asian in Natufians. I can get minor East Asian in Iran via ANE.; June 11, 2017 at 3:02 PM
a said...: Can someone create qpAdm/qpGraph by cherry-picking from the following samples, to get an alternate graph >3 and excluding an OoA Eastern route ?

Homo heidelbergensis
Graecopithecus freybergi
Neanderthal
Jebel Irhoud
Denisovan
Oase1
Ostuni1
Grotta Paglicci
Villabruna; June 11, 2017 at 3:07 PM
Unknown said...: >Mongoloid shifted? What are you talking about? You might want to re-examine that tree.

Commented on:
>East Asian root into ANE, Natufians, and Iran EN. It probably can use more tinkering, but it looks pretty solid.

Natufian don't seem to have any East Asian admixture unlike all other West Eurasians.

>Ha! So the Basal Eurasian in K14 is back on the table, albeit in a watered down version.

Varying levels of Basal Eurasian admixture in pretty much all available paleolithic samples are likely as they all display near eastern genetic components unlike the Villabruna/El Miron.

https://genetiker.files.wordpress.com/2017/06/all-14-20.png

Basal Eurasian and Papuan/Ust'-Ishim allele frequencies present in Paleolithic Europeans disappeared in WHG probably due to genetic drift during the LGM. It's also possible Basal Eurasian doesn't exist and is just a reflection of internal genetic substructure of Paleolithic Europeans that became dominant in the Mesolithic Near East but not among those who stayed in Europe.
Although the relation between ASI, Ust'Ishim and UP Europeans seems to be unexplored.; June 11, 2017 at 3:56 PM
Chad said...: ^^ last two comments are incorrect. No Basal Eurasian in the UP and they have less affinity to the Near East. There's also no Papuan. Admixture is not to be trusted. UP Euros have more archaic and are less related to other clusters that get pulled at various k levels, so they get dubious results. Also, the reason that Basal may pop up in UP involved graphs is also the extra archaic ancestry making them less related to East Asians than Mesolithics. It may fit, but doesn't make sense.

You might want to study the difference between Admixture and Admixtools. Admixtools results are much better.; June 11, 2017 at 4:08 PM
Unknown said...: >last two comments are incorrect. No Basal Eurasian in the UP and they have less affinity to the Near East. There's also no Papuan. Admixture is not to be trusted. UP Euros have more archaic and are less related to other clusters that get pulled at various k levels, so they get dubious results. Also, the reason that Basal may pop up in UP involved graphs is also the extra archaic ancestry making them less related to East Asians than Mesolithics. It may fit, but doesn't make sense.

Only some UP Europeans had elevated archaic ancestry such as Oase1 and probably MA1. Archaic ancestry can't show as Middle Eastern as Basal Eurasian lowers archaic ancestry and even Papuans have much lower archaic ancestry than the australoid related ancestry UP Europeans are showing. Less affinity to Middle East also makes sense as they are much older, don't have the Middle East-specific drift and perhaps unknown admixtures(different than BE) Middle Easterners might have picked up after leaving Europe in Paleolithic.
Paleolithic Europeans also plot on PCAs near ASI and Basal Eurasian-admixed South Asians.; June 11, 2017 at 4:51 PM
Chad said...: Sorry, but you have no idea what you're talking about. Go read the paper on Ice Age Europe. I don't have time to explain all this.; June 11, 2017 at 5:29 PM
Davidski said...: @Unknown

Quit posting nonsense in this thread.; June 11, 2017 at 5:52 PM
Davidski said...: @Piranha

https://drive.google.com/file/d/0B8XSV9HEoqpFalRJWWpYdzVnQk0/view?usp=sharing; June 11, 2017 at 6:11 PM
Karl_K said...: Wow. I was waiting for the uninformed to start posting on this one.

But seriously. Just read the papers closely. Then you can start some really dumb, but long lasting, discussions. Once you know what to even say in order to argue.; June 12, 2017 at 3:50 AM
Davidski said...: @Unknown

That's it. No more crap. I'll just delete it.; June 12, 2017 at 4:41 AM
Arza said...: @ Davidski
Thanks! So it failed miserably.

But the interesting part is that I can force this model to work with nMonte/4mix by creating DeepEast ghost (global 10):

DeepEast,0.00628,-0.13924,0.04308,0.07007,0.01894,0.09968,0.07214,0.00162,-0.00241,-0.00261

Population,Paniya,DeepEast,Satsurblia:SATP,Ulchi,D statistic
Austroasiatic_Santhal,96,04,00,00,0.0071
Austroasiatic_Asur___,95,05,00,00,0.0077
Austroasiatic_Ho_____,90,10,00,00,0.0057
Austroasiatic_Savara_,90,10,00,00,0.0060
Austroasiatic_Kharia_,86,12,02,00,0.0069
Austroasiatic_Gadaba_,85,14,01,00,0.0041
Austroasiatic_Juang__,84,16,00,00,0.0062
Austroasiatic_Bonda__,84,16,00,00,0.0064
Austroasiatic_Khasi__,44,21,14,21,0.0102
Kusunda______________,45,11,11,33,0.0053
Igorot_______________,00,65,35,00,0.0011
Ami__________________,01,61,34,04,0.0037
Atayal_______________,02,60,33,05,0.0046

Austroasiatic_Bonda
Austroasiatic_Santhal 73.7
Paniya 13.2
DeepEast 13.1
distance%=0.3783 / distance=0.003783

Now I have a headache just from thinking about PCA topology and I'm starting to doubt the reliability of nMonte (at least for "long range" mixes between outermost samples).; June 12, 2017 at 6:34 AM
Mark Moore (Moderator) said...: Not a professional anthropologist here. I take it that the pros on this board are rejecting the idea of Basal Eurasian in the samples that are more than 13K old? If not, what is the oldest genome which is confirmed to show it? Thanks.; June 12, 2017 at 7:21 AM
Davidski said...: Paleolithic Caucasus forager Satsurblia and the Epipaleolithic Natufian foragers are the oldest samples with Basal Eurasian.

So Basal Eurasian was widespread in the Near East and Caucasus already during the Paleolithic, but did not reach Europe, or at least the vast majority of Europe, until the Neolithic.; June 12, 2017 at 7:48 AM
Davidski said...: Finally figured out a model with CHG, EHG, Iran_ChL, Kalash and Yamnaya. It looks pretty strong and all of the ancestry coefficients are basically what we've seen before elsewhere. I'll post it tomorrow in a new blog post.

No idea yet if it works for South Asians other than the Kalash.; June 12, 2017 at 8:12 AM
Mark Moore (Moderator) said...: I for one will look forward to seeing that. I noticed from your earlier work on the subject that it was very difficult to get the CHG "right" from the perspective of Basal Eurasian. It seemed they were "too low" along with other issues. I refer to http://eurogenes.blogspot.com/2016/04/estimating-basal-eurasian-ancestry.html

Feel free to shoot this scenario down if it does not help explain problems with data: Basal Eurasians first admixed (after Satsurblia) with the descendants of those Levant populations whose ancestors had admixed with the Satsurblia ancestors. So Satsurblia read a trace of Basal Eurasian, but not because it had a trace of basal Eurasian, but because his ancestors admixed with the population which first admixed with the BE. Satsurblia mixed with the ancestors of the first farmers, and the first farmers were later admixed with the BEs. Later, the descendants of Satsurblia mixed some with populations which had actual BE in them (as well as the other component of first farmers) but not so much as the First Farmers.

Or maybe I should leave such speculation to the pros.; June 12, 2017 at 11:38 AM
Samuel Andrews said...: I can search for mtDNA matches at the click of a button now. I'm currently jotting down ancient matches.

The following match relates to the controversy over Steppe ancestry in South Asia. A U2e1b founder effect lineage in Andhra Pradesh, India belongs to the same very specific U2e1b subclade as an early Bronze Age sample from England. So far I've only found this U2e1b clade in Andhra Pradesh and EBA England. Mind boggling.; June 12, 2017 at 11:03 PM
Jijnasu said...: @samuel andrews
Any idea what ethnicity the individuals from Andhra Pradesh belong to?; June 13, 2017 at 5:41 AM
Samuel Andrews said...: Three are tribal Dravidians. One is a Middle Caste Dravidian.; June 13, 2017 at 6:56 AM
Anonymous said...: In the 2015 paper on West Eurasian mtDNA in India and Bangladesh, Andhra Pradesh has 6 x U2e1b samples. 5 of them are labelled "Tribe" and the other one "Middle Caste". The linguistic affiliation of all 6 is labelled "Dravidian".; June 14, 2017 at 4:58 AM
Unknown said...: In regards to australoid admixture in the paleolithic Europe. This study clearly says that the oldest WHG sample from Iberia had genetic component that peaks in South India which is obviously ASI and from other studies we know older paleolithic genomes had more of it.
http://www.cell.com/current-biology/fulltext/S0960-9822(17)30559-6
Why it disappeared in later WHG samples is anyone's guess.
And regarding Basal Eurasian if Davidski won't delete this post then I can say Willerslev detected it and one very brief test in the Ice Age article based on comparing Ust'-Ishim to younger UP Europeans is for me not enough to fully discount it. Anyone can have their own opinion but this topic needs more research.; June 19, 2017 at 4:01 PM
