Tuesday, February 9, 2016

CHG admixture in early western Anatolian farmers

Anatolian Neolithic farmer I0708 from the Mathieson et al. 2015 dataset belongs to Y-haplogroup J2a and is the most Caucasus-shifted of the early Anatolian farmers in my Principal Component Analysis (PCA) of West Eurasia (see below). This is unlikely to be a coincidence and provides strong evidence that at least some Neolithic farmers in western Anatolia harbored Caucasus Hunter-Gatherer (CHG) ancestry.

Note that the two CHG genomes sequenced to date courtesy of Jones et al. 2015, Kotias and Satsurblia, belonged to Y-haplogroups J and J2a. Moreover, J2 today shows peaks in frequency and diversity in and around the Caucasus. In other words, Y-haplogroup J, and in particular J2, appears to represent a paternal signal of CHG admixture.

Unfortunately, it's not yet possible to demonstrate with formal tests beyond any doubt that I0708 has CHG admixture.

For instance, the D-stats below, in which a couple of the least Caucasus-shifted Anatolian farmers are Anatolia Neolithic1, while I0708 is Anatolia Neolithic2, fail to reach the usual significance threshold (|Z| ≥ 3). Please note, I ran the stats with the Amerindian and Siberian samples to test for Ancient North Eurasian (ANE) admixture, which appears to be a feature of CHG.

However, the results are all clearly positive, and might reach significance with higher quality data and/or a better reference than Anatolia Neolithic1.
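For readers who haven't run these tests themselves, here is a minimal Python sketch of how a D-statistic of the form D(W, X; Y, Z) and its Z-score can be computed from allele frequencies. It is an illustration only, not the code behind the stats in this post: the function names are made up, the sign convention is an assumption, and the plain delete-one-block jackknife assumes equally sized blocks, whereas real tools use a weighted jackknife over genomic blocks.

```python
import math

def block_terms(w, x, y, z):
    """Sum numerator and denominator terms of D(W, X; Y, Z) over one block.

    w, x, y, z are per-SNP allele frequencies (0..1) in the four populations.
    Sign convention (an assumption): positive D means W is closer to Y
    (or X is closer to Z).
    """
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    return num, den

def d_stat(blocks):
    """D from a list of (numerator, denominator) block sums."""
    return sum(n for n, _ in blocks) / sum(d for _, d in blocks)

def jackknife_z(blocks):
    """Return (D, standard error, Z) using a delete-one-block jackknife."""
    n = len(blocks)
    d_all = d_stat(blocks)
    # Recompute D with each block left out in turn.
    leave_one_out = [d_stat(blocks[:i] + blocks[i + 1:]) for i in range(n)]
    mean = sum(leave_one_out) / n
    se = math.sqrt((n - 1) / n * sum((v - mean) ** 2 for v in leave_one_out))
    return d_all, se, d_all / se

# Toy block sums, invented purely for illustration.
toy_blocks = [(1.0, 10.0), (2.0, 10.0), (3.0, 10.0)]
d, se, z = jackknife_z(toy_blocks)
print(f"D={d:.4f} SE={se:.4f} Z={z:.3f} significant={abs(z) >= 3}")
```

The point of the jackknife is that the same D value can be significant or not depending on how consistent it is across the genome, which is why noisier data or fewer markers can leave a clearly positive D short of |Z| = 3.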

Indeed, the subtle difference in ANE affinity between Anatolia Neolithic1 and Anatolia Neolithic2 is underlined by the D-stats below. Note that here Kotias shows significant signals of admixture when paired with Anatolia Neolithic1, but not when paired with Anatolia Neolithic2. This is despite the fact that Anatolia Neolithic2 is a higher coverage sequence (6.95x vs 2.66x) and offers more markers.

I0708 is unlikely to be the only early western Anatolian farmer with CHG/ANE admixture. The PCA above shows that a couple of others are also pulling strongly towards the Caucasus. Indeed, all of the Anatolian and European Neolithic samples might harbor low levels of CHG ancestry. The problem with testing this idea at present is a lack of more basal Near Eastern ancient genomes from core areas of the Near East, like, say, the Levant.

Hopefully they're on their way, but in any case, it's almost certain now that CHG was already expanding west, and in all likelihood east, during the early Neolithic. This probably has some important implications for the peopling of West Eurasia and their linguistic affinities. Feel free to post what these might be in the comments.

Update 11/02/2016: I came up with new Anatolia Neolithic1 and Anatolia Neolithic2 sets using D-stats (by comparing each of the Anatolians to Kotias versus sample I0708). For a breakdown see here. Anatolia Neolithic2 now shows significant signals of admixture from Kotias, Dai, Surui and Han. This implies that it not only harbors CHG ancestry, but also ANE and East Asian-related admixtures.

ANE admixture in Caucasus Hunter-Gatherer Kotias


Davidski said...

If I didn't mess anything up here, then these results make the discussions about Kum6's CHG admixture redundant.

Krefter said...

This should also be able to detect CHG in Anatolia_Neolithic2.

Anatolia_Neolithic2 Anatolia_Neolithic1 Kotais Stuttgart
Anatolia_Neolithic2 Anatolia_Neolithic1 Kotais HungaryGamba_EN

Davidski said...

Krefter, it's Kotias not Kotais. Anyway I think this looks better. Check out the Iberian EN/MN scores, they're almost significant.

Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Stuttgart -0.0111 -1.584 356188
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Hungary_EN -0.0115 -2.163 406610
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias LBK_EN -0.011 -2.17 416928
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Iberia_EN -0.0128 -2.24 410605
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Iberia_MN -0.014 -2.467 406060
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Iberia_Chalcolithic -0.0089 -1.65 388602
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Esperstedt_MN -0.0053 -0.725 345333
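A quick way to scan a table like this is to parse each row and compare |Z| against the threshold of 3. The sketch below is only an illustration: it assumes each row holds four population labels followed by D, Z and the number of SNPs, and the function and variable names are hypothetical.

```python
ROWS = """\
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Stuttgart -0.0111 -1.584 356188
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Hungary_EN -0.0115 -2.163 406610
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias LBK_EN -0.011 -2.17 416928
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Iberia_EN -0.0128 -2.24 410605
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Iberia_MN -0.014 -2.467 406060
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Iberia_Chalcolithic -0.0089 -1.65 388602
Anatolia_Neolithic1 Anatolia_Neolithic2 Kotias Esperstedt_MN -0.0053 -0.725 345333
"""

def parse_rows(text):
    """Split each row into population labels, D, Z and SNP count (assumed layout)."""
    rows = []
    for line in text.strip().splitlines():
        *pops, d, z, n_snps = line.split()
        rows.append({"pops": tuple(pops), "D": float(d),
                     "Z": float(z), "snps": int(n_snps)})
    return rows

rows = parse_rows(ROWS)
# Rows that pass the |Z| >= 3 threshold.
significant = [r for r in rows if abs(r["Z"]) >= 3]
# Row with the largest absolute Z, i.e. the closest to significance.
strongest = max(rows, key=lambda r: abs(r["Z"]))
print(len(significant), strongest["pops"][3], strongest["Z"])
```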

Simon_W said...

Congrats on this discovery. The question of whether Kum6 had any CHG seems redundant now, but not whether he had any more than the others. There still was a remarkable increase in CHG ancestry in western Anatolia, eventually. And presumably this later spilled over to Greece and Italy. (Two of the Bronze Age Hungarians show a shift along the Greek/Italian cline, and we know there was yDNA J2 among them.)

George Okromchedlishvili said...

I am still entertaining the idea that Kartvelians, despite being probably the most CHG-related language group on Earth (at least on par with the Abkhaz), actually belong to a language group carried by ENF.

There are four strong reasons for my conviction:

1. A very strong presence of G2a in "core" Kartvelian groups like Svans and Mingrelians, especially Svans, who look like they are super-dominated by G2a. I don't think that this is a coincidence, especially given how patrilineal their society is.
2. A very well reconstructed farming vocabulary for proto-Kartvelian.
3. The unrelatedness of Kartvelian to NE and NW Caucasian languages, together with the abundance of J2 and J1 among NE-Caucasian speakers. Again, I don't think that it is a coincidence that guys who are very CHG-like by Y-DNA speak totally different languages from Kartvelians.
4. Isolate status of Kartvelian languages. If they are the remnant of an ENF-like language group, then it is not surprising that they don't have living "relatives". ENFs basically "emigrated" to Europe, where their language carriers got overrun by Indo-Europeans, while "at home" Semitic, IE and Turkic languages have totally wiped out all "ancient" language groups of West Asia.

apostateimpressions said...

"Hopefully they're on their way, but in any case, it's almost certain now that CHG was already expanding west, and in all likelihood east, during the early Neolithic. This probably has some important implications for the peopling of West Eurasia and their linguistic affinities."

David, why do you say "and their linguistic affinities"? Are you saying that IE could have come from them or do you allude to non-IE languages?

I am not sure that we will ever get a definitive answer to the proto-IE question, because we just don't have ancient data in the way we do with genetics. It's not like we can point to sample XYZ from 6,000 years ago and say "he was IE!"

Rob said...

That makes sense to me
A true relict group ;)

Nirjhar007 said...

Incredible, David! Submit this to a journal...

Gökhan said...


Agree with you. Early Anatolian Farmers might have been speaking a pre-South-West Caucasian language. But on the other hand, we should not neglect mtDNA lines as far as linguistics is concerned. I have no idea what the common mtDNA lines between the South Caucasus and Neolithic Anatolians are.

Krefter said...

@Davidski, from previous post.
"Chimp Surui Anatolia_Neolithic2 Kotias 0.0148 3.359 505722
Chimp Itelmen Anatolia_Neolithic2 Kotias 0.0142 3.486 505722
Chimp MA1 Anatolia_Neolithic2 Kotias 0.0139 2.6 365166"

I thought Anatolia_Neolithic was the one who probably has CHG? Also, is the J2a guy a part of Anatolia_Neolithic2?

FrankN said...

Generally, CHG presence in ENF makes a lot of sense.
Animal husbandry seems to have developed along the Upper Tigris. It reached the Levant from there, possibly promoted by and accompanying the obsidian trade from the Lake Van area that is already attested for the late 11th millennium BC (link).

Specifically, it has been demonstrated that the first domesticated NE pigs bore the Arm1T haplotype, which dominates in southeastern Anatolia, Armenia, Syria, Georgia, and Iran. Early Western Anatolian domesticated pigs (late 6m BC) however carried Anatolian Y1/Y2 lineages, and these are also the lineages found in EEF contexts.

"This temporal and geographic pattern (fig. 1) could be the result of two different processes. First, it is possible that genetically differentiated wild boar populations in eastern and western Anatolia were domesticated independently. More likely, however, is a scenario in which southeastern Anatolian wild boar were initially domesticated and subsequently transported west out of the Neolithic “core zone” (Özdoğan 2011). Then, following admixture with female wild boar indigenous to western Turkey, they acquired the local Y1 lineage that prevailed over the Arm1T lineage in this area."

Thus, a pre-/proto-Neolithic linkage between the Lake Van area and the Levant, and ultimately also Western Anatolia, has long been beyond doubt. The only question was/is to what extent the Lesser Caucasus was permeable and allowed interaction between West Georgia and the Lake Van area.
Excavations from Armenia show this to be the case: Armenian highland sites served as seasonal camps during the UP (14-15 ky calBC) and the early Mesolithic (11-10 ky calBC, unreliable due to contamination/reservoir effects; 2nd phase 8th millennium BC). They are culturally clearly tied to W. Georgia. For the Mesolithic Kmlo2 site, east of the Aragats massif, it is stated:

"The most frequent types of microliths are the backed bladelets and scalene (straight-backed and obliquely truncated) bladelets, which are related to the Late Upper Palaeolithic tradition of Kalavan-1, but are found also on Mesolithic sites of the 10-9th millennium BC in Georgia, for example at Kotias Klde (Meshveliani et al. 2007);

Other artefacts (“Kmlo tools”), which carry a very particular retouch (Fig. 10), are reminiscent of the pre-pottery Neolithic (8-7th millennia BC), both of the Near East and of north-western Transcaucasia;"

Obsidian dominated the Armenian lithic assemblage, but came from regional sources, not the Lake Van sources that are characteristic of the NE PPN. Which leaves us with some 400 km distance between CHG and ENF, for which AFAIK evidence of interaction is still lacking. However, obsidian gatherers and sheep and goat hunters/herders coming from the south and the north should have roamed that area. Chance enough for contact.

When looking for CHG further NE - the following article demonstrates the "export" of domesticated sheep through the Caucasus to the Mongolian Plateau some 7 ky ago, making a tentative connection to the arrival of millet in the NCauc foothills at around the same time. [Incidentally, dogs were also on the move from WEurasia to Mongolia during the Mesolithic].

Kurd Dgk said...

@ David

Here is another clue pointing at CHG admixture in Anatolians. I seriously doubt that Iraqi Kurd C3, who is loaded with CHG admixture, could be modeled with qpAdm so well using only Anatolians and Potapovka steppe.

I don't have the exact numbers in front of me but they were very close to:

POPULATION Kurd_C3 Chisq Tail Probability
Anatolian 62.00% 0.55 94%
Potapovka 38.00%
Kotias CHG 0%

compared to these other very good qpAdm fits:

POPULATION Kurd_C3 Chisq Tail Probability
Stuttgart 47.80% 0.235 97.20%
Potapovka 22.60%
Kotias CHG 29.60%

POPULATION Kurd_C3 Chisq Tail Probability
LBK_EN 50.00% 0.35 95.00%
Potapovka 32.20%
Kotias CHG 17.80%

Potapovka would simply not have sufficient CHG for Kurd C3 to get such a good fit, if Anatolians did not contain CHG.

Taymas said...


That would also make sense in terms of climate specialization, no? I've no knowledge as far as 6000 BC, but based on modern maps it would make a lot of sense to me that Georgia would be far more attractive to the neolithic package than the north side of the Caucasus.

Matt said...

Hey, what are the D-stats for the following?

Chimp Atayal Anatolia_Neolithic1 Anatolia_Neolithic2
Chimp Loschbour Anatolia_Neolithic1 Anatolia_Neolithic2 / Chimp WHG Anatolia_Neolithic1 Anatolia_Neolithic2
Chimp Atayal Anatolia_Neolithic1 Kotias
Chimp Loschbour Anatolia_Neolithic1 Kotias / Chimp WHG Anatolia_Neolithic1 Kotias
Chimp Atayal Anatolia_Neolithic2 Kotias
Chimp Loschbour Anatolia_Neolithic2 Kotias / Chimp WHG Anatolia_Neolithic2 Kotias


Davidski said...


You're mixing up two different Anatolia_Neolithic2 sets. Stick to the Anatolia_Neolithic2 in this blog post.


Chimp Atayal Anatolia_Neolithic1 Anatolia_Neolithic2 0.0096 1.988 479518
Chimp Loschbour Anatolia_Neolithic1 Anatolia_Neolithic2 0.0056 0.862 409927

Chimp Atayal Anatolia_Neolithic1 Kotias 0.0115 2.47 445526
Chimp Loschbour Anatolia_Neolithic1 Kotias -0.0226 -3.737 382151

Chimp Atayal Anatolia_Neolithic2 Kotias 0.0008 0.143 468658
Chimp Loschbour Anatolia_Neolithic2 Kotias -0.0306 -4.291 400730

Matt said...

Thanks. Those look consistent to me with AnatoliaNeolithic_2 being like AnatoliaNeolithic_1 plus a measure of CHG and AnatoliaNeolithic_1 having slightly more Basal Eurasian ancestry/less of some low level sharing with ENA that is in CHG.

Do you think qpAdm would work to fit AnatoliaNeolithic_2 as AnatoliaNeolithic_1 and CHG?

Davidski said...

I haven't been able to get it to work. But mixing Anatolia_Neolithic and Kotias in the same qpAdm models is usually very difficult, and especially in this case, I'm not sure which outgroups to use, because anything from North and East Asia, or even the Americas, might skew the results.

I did get a half decent result with about 12% of Okunevo admixture into Anatolia_Neolithic2, using Kostenki14 and Ust_Ishim as some of the outgroups, but the standard errors were huge, something like 24%.

German Dziebel said...


One thing I want to mention in the context of your post is that Svans have preserved a very interesting linguistic/social relic, namely a 4-partite division of siblings into man's brother, woman's brother, man's sister, woman's sister. It's a very rare pattern (I wrote about it in my book The Genius of Kinship) and it's not found anywhere in Eurasia besides Svans (we can reconstruct it for proto-Kartvelian), Basques and Burushaski. The latter two are linked into a hypothetical Dene-Caucasian phylum, but Kartvelians are not. (North Caucasian-speakers may have lost it but the fact remains - they don't have it.) This could be part of Kartvelians' and Basques' agricultural legacy but this pattern is not found anywhere in West Asia or North Africa.

Ebizur said...

German Dziebel wrote,

"One thing I want to mention in the context of your post is that Svans have preserved a very interesting linguistic/social relic, namely a 4-partite division of siblings into man's brother, woman's brother, man's sister, woman's sister. It's a very rare pattern (I wrote about it in my book The Genius of Kinship) and it's not found anywhere in Eurasia besides Svans (we can reconstruct it for proto-Kartvelian), Basques and Burushaski."

Have you not counted Korean among these languages because it recognizes such a four-way distinction only for elder siblings?

male's elder brother: 형 hyeong (apparently from Chinese 兄 Mandarin xiōng, Cantonese hing1 "elder brother," but very well-integrated into the Korean colloquial lexicon with many dialectal variants, such as hei, hing-i, hia, seong, seongga, sei, siya)

female's elder brother: 오빠 oppa, 오라비 orabi, 오라버니 orabeoni, 오라버님 orabeonim (In order from least to most honorific. Some of these words, e.g. orabi, have also been used to refer to a female's younger brother or to all the male siblings of a female regardless of relative age.)

male's elder sister: 누나 nuna, 누님 nunim (the latter is more honorific than the former)

female's elder sister: 언니 eonni (also may be used to refer to the wife of a female's elder brother, and some people have adopted a habit of using this word to refer to any elder sibling, even those who normally should be called hyeong, oppa, nuna, etc.)

Other Korean words for siblings include the following:

아우 au < Late Middle Korean 아ᇫᄋ azgh: An old word for a younger sibling (originally, probably only one of the same sex). Now used mainly to refer to a male's younger brother, though it may sometimes be used to refer to a younger individual among a group of elderly women.

동생 dongsaeng < Chinese 同生 ("same-born," i.e. an individual born of the same parents): This is the normal word for "younger sibling" in Modern Korean. It may refer to a younger sibling of a male or of a female. To refer to a younger brother, the prefix 남 nam- (< Chinese 男 "man, male") may be added. To refer to a younger sister, the prefix 여 yeo- (< Chinese 女 "woman, female") may be added.

누이 nui: A somewhat antiquated word for a male's sister. Apparently related to Modern Korean nuna ~ nunim ("a male's elder sister"), but nui is more commonly used to refer to a male's younger sister.

Aram said...


I think you will be proven correct by future aDNA studies. Moreover, IMHO, G2a/ENF will be found in Western Georgia earlier than most scholars have assumed.

Just one remark.

**3. The unrelatedness of Kartvelian to NE and NW Caucasian languages together with the abundance of J2 and J1 among NE-Caucasian speakers. Again I don't think that it is a coincidence that guys who are very CHG-like by Y-DNA speak totally different languages from Kartvelians**

NW Caucasians aka Abkhaz-Adygheans also have high levels of G2a. Imho their level of J2/J1 is insufficient to link them definitively with NEC groups.
Their G2a is another branch G2a-P303>U1>L1264. The last common ancestor of U1>L1264 and G2a1-Y5797 ( very frequent in Kartvelians ) lived 17600 years ago. So the split happened even before the farmers existed.

Aram said...


**One thing I want to mention in the context of your post is that Svans have preserved a very interesting linguistic/social relic, namely a 4-partite division of siblings into man's brother, woman's brother, man's sister, woman's sister. It's a very rare pattern**

Does this pattern give us a hint that Early Farmers were more inclined to be monogamous?

Davidski said...

Check out the update!!!11

Matt said...

Do the new sets have any interesting stats with European EN or MN or WHG or MA1?

Also, I had a look at the Supplement for Mathieson and the samples for both Anatolia_Neolithic 1 and 2 all look like they're from the Barcin site with one exception, so I guess that means the population structure doesn't map to two different sites, but maybe to different ancestry within one community / location.

"Anatolia Neolithic2 now shows significant signals of admixture from Kotias, Dai, Surui and Han. This implies that it not only harbors CHG ancestry, but also ANE and East Asian-related admixtures."

Another option could be that 1 has more ancestry from the Levant that depresses the relatedness to ANE, ENA, CHG. But we don't have any good evidence of that atm.

In terms of the PCA, the AN group didn't look unusually dispersed compared to the EEF EN and MN, so I guess it's a question of whether the later groups also break up into these subsets of different relatedness to CHG (which I guess would show that the CHG / ENA related ancestry would take a long time to homogenise over history, maybe longer than expected from just purely random mating).

Davidski said...

These are the same as in the updated table. Only Karelia_HG with a significant Z score.

But I think that neither PCA nor D-stats like the way I ran them do a good job of finding the most CHG and least CHG Anatolians, because, as usual, the basal ratios get in the way. The ideal solution would be a Neolithic genome from somewhere like Lebanon or Jordan.

Chimp Karelia_HG Anatolia_Neolithic1 Anatolia_Neolithic2 0.0108 3.166 534096
Chimp Esperstedt_MN Anatolia_Neolithic1 Anatolia_Neolithic2 0.0095 2.721 478532
Chimp Loschbour Anatolia_Neolithic1 Anatolia_Neolithic2 0.0077 2.184 478539
Chimp Iberia_EN Anatolia_Neolithic1 Anatolia_Neolithic2 0.0056 1.994 547210
Chimp Satsurblia Anatolia_Neolithic1 Anatolia_Neolithic2 0.0078 1.94 393709
Chimp Iberia_MN Anatolia_Neolithic1 Anatolia_Neolithic2 0.004 1.459 536772
Chimp Ust_Ishim Anatolia_Neolithic1 Anatolia_Neolithic2 0.0037 1.146 561176
Chimp LBK_EN Anatolia_Neolithic1 Anatolia_Neolithic2 0.0026 1.036 561397
Chimp MA1 Anatolia_Neolithic1 Anatolia_Neolithic2 0.0028 0.77 405968
Chimp Iberia_Chalcolithic Anatolia_Neolithic1 Anatolia_Neolithic2 0.0011 0.407 508964
Chimp Bichon Anatolia_Neolithic1 Anatolia_Neolithic2 -0.0015 -0.413 376613
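Since Z is the D value divided by its standard error, the standard error behind each row can be roughly recovered as D / Z, which shows how large D would need to be to reach |Z| = 3. Below is a small sketch using three rows from the table above. The helper names are hypothetical, and the results are approximate because the printed D values are rounded.

```python
# (test population, D, Z) copied from the first three rows of the table.
ROWS = [
    ("Karelia_HG", 0.0108, 3.166),
    ("Esperstedt_MN", 0.0095, 2.721),
    ("Loschbour", 0.0077, 2.184),
]

def implied_se(d, z):
    """Approximate standard error, using Z = D / SE."""
    return d / z

def d_needed(d, z, threshold=3.0):
    """Smallest |D| that would reach the threshold at the same standard error."""
    return threshold * abs(implied_se(d, z))

for pop, d, z in ROWS:
    print(f"{pop}: SE ~{implied_se(d, z):.4f}, "
          f"|D| needed for |Z| >= 3 ~{d_needed(d, z):.4f}")
```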

Davidski said...

Here's a bit of the Admixture bar graph from the Mathieson et al. preprint. I can actually see the CHG in some of the Anatolians and even LBK farmers here. But the Hungary_EN samples lack it completely.

PF said...

Nice work! You know a discovery is good if there are some 1s after the exclamation points!!!11 ;-)

Well, I mentioned back in the day that Otzi seemed to have some ANE, obviously before the CHG genomes were uncovered, for which I was promptly laughed at. I still don’t know what to make of it, but obviously *something* is going on…

I also find it really interesting that Basal dampens the CHG/ANE signal, yet at the same time where there’s likely a ton of Basal, there’s also a ton of CHG and hg J. Though not vice versa. So there must have been insane population turnovers in the Middle East, or continuous constant gene flow from northeast Africa, which is what “Basal” may simply be reflecting.

No doubt the most significant open question right now is what “Basal” is exactly. Like Davidski said, ancient Levantine DNA is sorely needed — not just one genome, but many.

FrankN said...

Under "Ancient European mitogenomes", I had provided a link to the most recent World Language Tree established by Jaeger e.a., 2016, using automated lexical comparison:

In the tree three types of "unexpected" linkages show up:
a) S(E)A to Africa, partly continuing into S. America,
b) SEA/Oceania to the Americas,
c) East Eurasia to the American Pacific

I'll provide more detail on a) and b) later. Here a focus on c) pertinent to the discussion, where the tree shows:

1. Yeniseian forms a clade with Tsimshian (SE Alaska, BC), which is linked to the Altaic macrofamily

2. Kartvelian forms a clade with Takelma (Oregon coast, extinct). This clade sorts into the Altaic macro-family, in between its Mongolic-Tungusian and Yukaghir-Turkic branches

3. Mochica (NW Peru, extinct) clusters into Uralic, next to Samoyedic. Uralic and IE are sisters.

4. Within the Amerind macrofamily, a closely connected substructure comprises
a) N Caucasian
b) Chukotko-Kamchatkan and Eskimo-Aleut
c) the Mosan Sprachbund (Wakashan, Salish, Chimakuan, Sahaptian, Molala, Coosan, Alsea, Siuslaw) from the US/Canadian Pacific

5. Also in that Amerindian macrofamily, Ainu clusters together with Maku (Braz-Ven border, unclass.). A sister clade is made up by Yawa (Japen Isl. off Papua) and Kebar (Bird's Head), next follow Zaparoan and Peba-Yaguan (b. Ecuad/Peru) [This already goes towards the Oceanian-SAmer links to be described later].

6. A (S)EA macrofamily made up of Sino-Tibetan, Hmong-Mien and Japanese includes Siouan as fourth member. Together with Siouan clusters Yuchi (Tennessee, isol., acc. to WP "well known mound builders"). With Japanese cluster Munichi (NE Peru, isol., ext.) and Busa (NW PNG).

Note that this is just lexical comparison that can't tell apart genetic relationship and borrowing/ "Sprachbund". However, even the latter is sufficient to indicate substantial contact, possibly resulting in genetic exchange.
Also, "chance correspondence" can't be excluded, but the multiplicity of identified connections makes it unlikely they all result from chance.

At least two patterns appear to shine through:

a) A trail from/ to the Caucasus (Kartvelian, N Caucasian) and Urals (Samoyedic, Yeniseian), to/ from the American Northwest. [Note, btw., that the tree denies Dene-Caucasian, even Dene-Yeniseian. Na Dene clusters tightly with Oto-Manguean (SW Mexico).]
In the "out of America/Beringia" direction, there has already been quite some genetic confirmation for that trail. Dave's CHG analysis here might be taken as support for the "out of Eurasia" move. Kartvelian-Takelma (OR) is intriguing. If the connection can be confirmed, it would supply us with a terminus ante quem for the migration, namely early 2m BC, before Svan split from the remainder of Kartvelian.
Equally intriguing is Samoyed-Mochica. The Mochica were likely bearers of the Moche culture (200 BC - 600 AD) that provides the first American evidence of fully developed ore smelting.

b) Japan to SC Amer. Pacific. Several waves may be considered. The first one would be the transfer of Jomon ceramics to coastal Ecuador during the late 4th mBC, signified by the Ainu-Ecuadorian linguistic connection.
A second, broader wave may tentatively be dated to the IA. It possibly rather originated in NE China/ Korea, but may have been better preserved by Japanese which was introduced about the same time. Evidence includes a Chinese report (Hui Sheng, 458 AD), American wolf DNA in Chinese dogs, a rare gene mutation shared by the Xoloitzcuintli, the Peruvian Hairless and the Chinese Crested dogs, and parallels between Japanese maki-e and Mexican maque lacquering technology.
Siouan as East Asian language family is particularly intriguing, and provides a possible link from Chinese to North American mound building, which started around 3500 BC.

Matt said...

Thanks. There's not a strong clear pattern to those (Esperstedt_MN near the highest, then Iberia_Chalcolithic least significant, Iberia_MN intermediate, Bichon non-sig etc.), so maybe Basal Eurasian or other slight differences in population structure affect those. It might be that because those West Eurasian populations are so close, even though the overlapping SNPs are high, random effects can get involved and it's harder to find enough to distinguish them.

PF: continuous constant gene flow from northeast Africa, which is what “Basal” may simply be reflecting

The theory against BE being anything related to Africans is that a) African admixture that would push EEF away from WHG / ANE / ENA would take it closer to Africans than it is and b) more tenuously maybe, the level of Neanderthal admix is the same so Basal Eurasian had to have gone through that event (I say more tenuously because Neanderthal admix is hard to measure).

I think Basal Eurasian not being anything African would also mean D stats like

Chimp Mbuti Anatolia_Neolithic1 Anatolia_Neolithic2
Chimp Yoruba Anatolia_Neolithic1 Anatolia_Neolithic2
Chimp Mbuti Anatolia_Neolithic1 Kotias
Chimp Yoruba Anatolia_Neolithic1 Kotias
Chimp Mbuti Anatolia_Neolithic1 Loschbour
Chimp Yoruba Anatolia_Neolithic1 Loschbour
Chimp Mbuti Anatolia_Neolithic1 Yamnaya
Chimp Yoruba Anatolia_Neolithic1 Yamnaya
Chimp Mbuti Anatolia_Neolithic1 Dai
Chimp Yoruba Anatolia_Neolithic1 Dai

should all be basically around D: 0 and Z:0 (although a population without Eurasian ancestry close to the Eurasian bottleneck like a working version of the Mota sample would work better).

FrankN said...

Continuing with a run through the 2016 World Language Tree, here the SEA/ Oceania connections to the Americas. Linkage has been demonstrated through shared Denisovan ancestry, and in the Raghavan e.a. and Skoglund/Reich e.a. 2015 studies on the peopling of the Americas. Other evidence includes Coconut DNA (from S Phillip. to Mesoamerica by 300 BC) and Polynesian-American contact.
Sorting out the initial and follow-up contacts will be difficult, and should, aside from aDNA (of which Raghavan 2015 has supplied some), also call for much more fine-grained sets of Papuan DNA.

1. Great and South Andamese (incl. Onge) cluster with Maybrat (PNG Bird's head) and Anem (New Brit., isol.). Sister clades are Lakes-Plain, Eleman and W.Bougainville languages (all PNG). The latter form a clade with Piraha (Amaz. Brazil, isol.).

2. Inside a primarily Papuan-Australian macro-family that includes Trans-New Guinea (TNG) and Pama-Nyungan (PN), Asmat-Kamoro (Papua, TNG) cluster with Chapacura (Amaz., extinct). A sister clade combines Kwerba, Kapauri (both NC Papua), Waroani and Arawan (both Amazonia). The "cousin" structure, otherwise mostly Papuan, clusters together Kamsa (Colombia), Yaruro (Venez.) and Senagi (C NGuin), with N. Halmaheran as sister clade. Further down in that structure, deeply nested within various TNG languages, Pawaia (SC NGuin) and Warao (Venez/Guyana) form a sub-clade.

3. Nested in the Amerind macrostructure is a clade of Andoque (S Colomb), Kwomtari (N NGuin) and Enggano (off Sumatra, AN). Sister clades are Tucanoan, Saliban and Yuwana (all Orinoco Basin), and the Tupian family, widespread across Brazil, which ao includes Surui and Karitiana.

4. What appears to be a Pacific Coast family, comprising a/o Yamana (S.Chile), Guahiban (Col), Lencan (Hond/El Salv), Chumash and Penutian (CA to WA), includes the following sub-clades:
a) Pech (Hond., Chibchan), Yokuts (CA, isol) and Kayagar (SW NGuin);
b) Jivaroan (Ecuad) and Yele (Rossel Isl. off SE NGuin)

5. The Timor-Alor-Pantar (TAP) Family includes, in addition to a few Papuan isolates (Yale, Kaure, Hattam), the following subclades:
a) Atakapa (Louisiana, isol.), Pyu (NW PNG, isol.) and Abui (NW Alor Isl., TAP)
b) Wiyot (CA, Algic) and W Bomberai (Papua, TAP)
c) Fur (Darfur, NS), Puri (SW Brazil, MGe)

6. The Tai-Kadai Family includes Guachai (WBraz, isol.)

7. A separate clade, attached to Oceanic (AN), is made up by Central Solomons languages and Karaja (Braz., MGe).

Note in addition the Ainu and Japanese linkages listed in my previous post, which include "stopovers" in Northern Papua en route to Ecuador/Peru. Trans-Indian Ocean linkages to be described in my next post will provide a few more Oceanic-South American links.

FrankN said...

To finish up with the World Language Tree, here the connections between Asia and Africa, and beyond both:

1. Dravidian forms a macro-family with Nilo-Saharan, Niger-Congo and KhoiSan. Kunama (Eritrea, NS) clusters immediately with Dravidian.

2. The Afro-Dravidian macrofamily also includes, next to most of Nilo-Sudanic, a subcluster made up by
a) South Birds Head and Mairasi (both W New Guinea)
b) Temne-Baga (Sierra Leone/ Guinea, NC)
c) Hadza (Tanz,isol) and Shabo (Eth, NS)
d) Sko (PNG, isol) and Kanoé (E Brazil, isol)
e) Ijoid (Niger Delta, NC)

3. Attached to Afro-Dravidian is a separate family, consisting of
a) Kenabol (Malaya, extinct) in a clade with Kulango-Lorhon (CIV, NC);
b) Gumuz-Koman (WEth/SSudan, NS)

4. Within Afro-Dravidian, the first branches off the KhoiSan Family are
a) Sandawe and Kawesqar (Patagonia, isol)
b) Sape (SW Venez, isol) and Kambot (PNG, LSR)

5. Within AfroAsiatic, Nihali forms a clade with some Cushitic languages, Moggodo, a HG language from Central Kenya being the closest relative. The clade furthermore includes South Cushitic languages from Tanzania (Iraqw, Burunge, Jwadza), and Abun (Papua, Birds Head)

6. Within a generally Papuan-Australian macrofamily, the following clades are found:
a) Korean and Burushaski, directly next to
b) Songhay (NS), Pulabu (N NGuin, TNG), Klamath (Oregon coast), Kala Lagaw Ya (N Queensland, PN)
c) Laal (Chad, unclass.) within Nuclear Trans-New Guinea,
plus, all clustering together with Australian languages,
d) Nivkh and Araucarian (SC Chile)
e) Trumai (C. Brazil, isol.) and Kaingang (Brazil, MGe), clustering with Banaro and Awar (b. N NGuin, LSR) and a couple of PN languages from N Queensland.
f) Kordofanian (S Sudan, NC) and Tiwi (Tiwi isl., NAustr., isol)

7. Embedded in a SW American/Andean cluster of Quechua, Aymara, Barbacoan, Chipaya and Yuracare, which also includes Tunica (low.Miss., ext.),
a) Beria and Berti (E. Saharan) form a clade with Kunza (N Peru, isol.);
b) a sister clade is made up by Leko (S L. Chad, NC), Cofan (Ecuad/ Col.) and Paez (Col lowland).

Afro-Dravidian and Nihali-Cushitic correspond to post-Mota Indo-African contacts. The spread of bananas, possibly also taro/yam, from insular SEA to W.Afr. may be reflected in some of the linkages under 2/3/6. Kunza (7) shows strong SEA (TK, AuA) influence, e.g. numerals, that isn't reflected in the tree, but may provide the bridge to East Saharan.
Some of the other clusters (e.g. Nivkh-Australian-Araucarian, KhoiSan-Sape/Venez.) just look weird and so far lack an archeologically plausible explanation. However, remember that there also have been "weird" D-stats showing up in Skoglund/Reich 2015, e.g. for Maya-Turkmen or Pima-Tshwa. Thus, I wouldn't exclude the possibility of quite ancient and basal connections showing up here, possibly somewhat randomly and on the borderline to "chance correspondence".

Otherwise worth mentioning from the tree:

1. Attached to AusA is a clade of Kusunda and Shom Penh;

2. The earliest spin-off within IE is a clade of Celtic, Albanian, and Brahui (NDrav, attached to Albanian). The further splitting sequence is (2) Armenian, (3) Indo-Aryan, (4) Graeco-Romance, (5) Germanic vs. Balto-Slavic. Hittite and Tocharian are not included.

Lexo Gavashelishvili said...


FYI, Imeretians (i.e. another Kartvelian population) have a considerably greater G2a frequency than Megrelians. In particular, the ones in Upper Imereti are almost 'super-dominated' by G2a and represent the largest G2a population among Kartvelians based on pretty representative samples.

George Okromchedlishvili said...


Wow, missed this. Could be further support for "my" "theory" (just a speculation so far).
Then again, we know that Kartvelians have spread from South-West to North-East (for example, most of Kakhetia was Caucasus Albanian just 2 thousand YBP).


That's interesting. Do you have a link to this estimate? I thought it was more like 8-10 thousand years of separation.
NWC is tricky but it seems like it likely is connected to NEC on a deeper level. Or it could be another Neolithic Farmer language, who knows.

German Dziebel said...


"Have you not counted Korean among these languages because it recognizes such a four-way distinction only for elder siblings?"

This Korean type is upstream from the one found among Basques, Svans and Burushaski. Sibling terms get progressively simplified as distinctions get removed: Korean has 6, Svan has 4, Georgian and Mingrelian have 2. Uralics and Altaics removed the relative sex distinctions but kept the relative age distinctions; Basques, proto-Kartvelians and Burushaski did the opposite, keeping the relative sex distinction but dropping relative age. The oldest sibling term sets in Eurasia are found in Ainu, Koreans, some Hmong-Mien groups, some Sino-Tibetan groups (very rare) and Papua New Guineans. But the ones that Svans and Basques have are rather unique.

Davidski said...


Chimp Mbuti Anatolia_Neolithic1 Anatolia_Neolithic2 0.0042 1.652 562367
Chimp Yoruba Anatolia_Neolithic1 Anatolia_Neolithic2 0.0049 2.056 562367

Chimp Mbuti Anatolia_Neolithic1 Kotias 0.0081 2.295 489164
Chimp Yoruba Anatolia_Neolithic1 Kotias 0.0068 1.999 489164

Chimp Mbuti Anatolia_Neolithic1 Loschbour 0.0008 0.21 482190
Chimp Yoruba Anatolia_Neolithic1 Loschbour 0.0029 0.817 482190

Chimp Mbuti Anatolia_Neolithic1 Yamnaya_Samara 0.0053 2.09 564669
Chimp Yoruba Anatolia_Neolithic1 Yamnaya_Samara 0.0046 1.905 564669

Chimp Mbuti Anatolia_Neolithic1 Dai 0.0032 1.183 566992
Chimp Yoruba Anatolia_Neolithic1 Dai 0.0032 1.244 566992

Aram said...


They split just under G2a-P15. Estimate based on YFull.

In fact, the Kartvelian G2a1 seems to be a very old and unique branch.


**2. The earliest spin-off within IE is a clade of Celtic, Albanian, and Brahui (NDrav, attached to Albanian).**

Wow. I think you are aware that Albanians have a high level of J2b2a. But the most perplexing thing is that the basal J2b2* is found in South Asia. I have a strong suspicion that J2b is a secondary, minor IE marker. Some calculations show that perhaps its real TMRCA is much lower than is assumed by YFull.

Matt said...

Thanks David. Those show that, if anything, Mbuti and Yoruba seem to share more with the Anatolia_Neolithic2 group, Kotias, Loschbour, Yamnaya_Samara and Dai than they do with the Anatolia_Neolithic1 group, although without significance.

If you do have time to run

Chimp Mbuti Anatolia_Neolithic Kotias
Chimp Yoruba Anatolia_Neolithic Kotias
Chimp Mbuti Anatolia_Neolithic Loschbour
Chimp Yoruba Anatolia_Neolithic Loschbour
Chimp Mbuti Anatolia_Neolithic Yamnaya_Samara
Chimp Yoruba Anatolia_Neolithic Yamnaya_Samara
Chimp Mbuti Anatolia_Neolithic Dai
Chimp Yoruba Anatolia_Neolithic Dai
Chimp Mbuti Anatolia_Neolithic Egyptian
Chimp Yoruba Anatolia_Neolithic Egyptian
Chimp Mbuti Anatolia_Neolithic BedouinB
Chimp Yoruba Anatolia_Neolithic BedouinB
Chimp Mbuti Anatolia_Neolithic BedouinA
Chimp Yoruba Anatolia_Neolithic BedouinA
Chimp Mbuti Anatolia_Neolithic Palestinian
Chimp Yoruba Anatolia_Neolithic Palestinian

to see where the whole set of Anatolia_Neolithic (1, 2, plus the others) sits, that would be interesting to me. I added in the Palestinian, Bedouin and Egyptian stats just to check my assumption and see if the groups who really do seem to have African ancestry have significant stats.

Chad Rohlfsen said...

Chimp Mbuti Anatolia_Neolithic Kotias 0.0052 1.714 19471 19270 508715
Chimp Yoruba Anatolia_Neolithic Kotias 0.0031 1.055 20790 20661 508715
Chimp Mbuti Anatolia_Neolithic Loschbour -0.0015 -0.445 19353 19411 501921
Chimp Yoruba Anatolia_Neolithic Loschbour -0.0009 -0.269 20745 20782 501921
Chimp Mbuti Anatolia_Neolithic Yamnaya_Samara 0.0025 1.267 23137 23024 588923
Chimp Yoruba Anatolia_Neolithic Yamnaya_Samara 0.0014 0.730 24761 24693 588923
Chimp Mbuti Anatolia_Neolithic Dai 0.0009 0.389 24792 24749 592940
Chimp Yoruba Anatolia_Neolithic Dai -0.0001 -0.053 26529 26535 592940
Chimp Mbuti Anatolia_Neolithic Egyptian 0.0081 6.616 23796 23415 592940
Chimp Yoruba Anatolia_Neolithic Egyptian 0.0111 9.252 25601 25041 592940
Chimp Mbuti Anatolia_Neolithic BedouinB 0.0084 5.945 23294 22905 592940
Chimp Yoruba Anatolia_Neolithic BedouinB 0.0091 6.567 24995 24544 592940
Chimp Mbuti Anatolia_Neolithic BedouinA 0.0082 6.850 23552 23167 592940
Chimp Yoruba Anatolia_Neolithic BedouinA 0.0096 8.268 25296 24816 592940
Chimp Mbuti Anatolia_Neolithic Palestinian 0.0084 7.580 23348 22959 592940
Chimp Yoruba Anatolia_Neolithic Palestinian 0.0102 9.443 25087 24582 592940
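
For anyone wanting to sanity-check rows like the ones above: the two integer columns after the Z-score look like the BABA and ABBA site counts, and D should then be (BABA - ABBA) / (BABA + ABBA). A minimal Python sketch under that assumption (the column reading is inferred from the numbers, the helper name `d_from_counts` is made up for illustration, and the printed counts are rounded, so tiny mismatches are possible):

```python
def d_from_counts(baba: float, abba: float) -> float:
    """D statistic from site-pattern counts: (BABA - ABBA) / (BABA + ABBA)."""
    return (baba - abba) / (baba + abba)


# (African pop, test pop, reported D, BABA, ABBA), copied from three rows above
rows = [
    ("Mbuti", "Kotias", 0.0052, 19471, 19270),
    ("Yoruba", "Egyptian", 0.0111, 25601, 25041),
    ("Mbuti", "Loschbour", -0.0015, 19353, 19411),
]

for african, test_pop, d_reported, baba, abba in rows:
    d = d_from_counts(baba, abba)
    print(f"{african:7s} {test_pop:10s} reported={d_reported:+.4f} recomputed={d:+.4f}")
```

The recomputed values agree with the reported D to four decimals for these rows, which supports the column reading but does not prove it.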

Chad Rohlfsen said...

SSA in Iberia? Quite possible. Either that, or the West Eurasian ancestry in Africans is pretty much identical to the Iberians.

Chimp Mbuti Iberia_EN Kotias -0.0010 -0.295 18474 18512 489952
Chimp Yoruba Iberia_EN Kotias -0.0033 -0.985 19767 19900 489952
Chimp Mbuti Iberia_EN Loschbour -0.0083 -2.099 18092 18394 483258
Chimp Yoruba Iberia_EN Loschbour -0.0068 -1.787 19418 19686 483258
Chimp Mbuti Iberia_EN Yamnaya_Samara -0.0039 -1.454 21968 22140 566958
Chimp Yoruba Iberia_EN Yamnaya_Samara -0.0051 -2.003 23524 23766 566958
Chimp Mbuti Iberia_EN Dai -0.0052 -1.921 23422 23667 568947
Chimp Yoruba Iberia_EN Dai -0.0060 -2.307 25088 25391 568947
Chimp Mbuti Iberia_EN Egyptian 0.0014 0.663 22602 22540 568947
Chimp Yoruba Iberia_EN Egyptian 0.0045 2.202 24331 24114 568947
Chimp Mbuti Iberia_EN BedouinB 0.0016 0.726 22121 22049 568947
Chimp Yoruba Iberia_EN BedouinB 0.0024 1.102 23747 23634 568947
Chimp Mbuti Iberia_EN BedouinA 0.0018 0.854 22348 22270 568947
Chimp Yoruba Iberia_EN BedouinA 0.0034 1.716 24023 23861 568947
Chimp Mbuti Iberia_EN Palestinian 0.0017 0.825 22165 22089 568947
Chimp Yoruba Iberia_EN Palestinian 0.0036 1.828 23831 23658 568947

Krefter said...


Did you get that email with D-stats I sent you, David, and Tobus? The results will be interesting.

Matt said...

@ Chad, thanks for all those. As expected, it looks like the D(Chimp,West African,Anatolia_Neolithic,Eurasian) stats are non-significant (|Z| below 3), particularly with the Loschbour HG and Dai, except for the ones involving modern-day Near East / Middle East populations, who do seem to have African ancestry (BedouinA, BedouinB, Palestinian, Egyptian).

Interesting that the Iberia_EN stats, although they're *all* non-significant (|Z| below 3), consistently change by around -0.0064 in their D (compared to the equivalent Anatolia_Neolithic stat), and this causes the NE / ME comparisons to become non-significant. That sounds like it could be West African ancestry contributing to them, or another population could contribute to both West Africans and Iberia_EN (Mesolithic North Africans?), or it could be some Iberia_EN ancestry in West Africa (but then I would expect that to mean relatedness to Anatolia_Neolithic as well, since there was not much time between Iberia_EN and Anatolia_Neolithic?).
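
The roughly constant shift can be checked directly by subtracting each Anatolia_Neolithic D from the matching Iberia_EN D. A small Python sketch, with the values typed in by hand from Chad's two tables (so any transcription slip is mine, and the dictionary layout is just for illustration):

```python
from statistics import mean

# (D with Mbuti, D with Yoruba) per test population, from the two tables above
anatolia = {
    "Kotias": (0.0052, 0.0031), "Loschbour": (-0.0015, -0.0009),
    "Yamnaya_Samara": (0.0025, 0.0014), "Dai": (0.0009, -0.0001),
    "Egyptian": (0.0081, 0.0111), "BedouinB": (0.0084, 0.0091),
    "BedouinA": (0.0082, 0.0096), "Palestinian": (0.0084, 0.0102),
}
iberia = {
    "Kotias": (-0.0010, -0.0033), "Loschbour": (-0.0083, -0.0068),
    "Yamnaya_Samara": (-0.0039, -0.0051), "Dai": (-0.0052, -0.0060),
    "Egyptian": (0.0014, 0.0045), "BedouinB": (0.0016, 0.0024),
    "BedouinA": (0.0018, 0.0034), "Palestinian": (0.0017, 0.0036),
}

# Iberia_EN minus Anatolia_Neolithic for each of the 16 paired stats
shifts = []
for pop in anatolia:
    for d_anatolia, d_iberia in zip(anatolia[pop], iberia[pop]):
        shifts.append(round(d_iberia - d_anatolia, 4))

mean_shift = mean(shifts)
print(f"mean shift = {mean_shift:+.4f}, range = {min(shifts):+.4f} to {max(shifts):+.4f}")
```

All 16 differences fall in a narrow band around the mean, which is what makes the shift look like a property of Iberia_EN rather than of any one comparison population.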

I also wonder, if Davidski has identified a least admixed subgroup of Anatolia_Neolithic in Anatolia_Neolithic1, whether that would give significant D(Chimp,West African,Anatolia_Neolithic1,Iberia_EN) stats. Also, Chad, do you find any signal in D(Chimp,West African,Loschbour,La Brana), if you tried any of those?

Chad Rohlfsen said...

I can check it out. One interesting thing is that an f3 Anatolian Loschbour EEF pop is negative for LBK and Hungary, but not Iberia. Even though Iberia shares the most drift with Loschbour. There is something special about Iberia.

Seinundzeit said...

This is somewhat off topic, but not too much. At Anthrogenica, Chad has posted these results, in order to gauge levels of steppe ancestry in South Central Asia:

result: Yoruba Loschbour Sintashta Paniya -0.0869 -18.365 6907 8222 102206
result: Yoruba Loschbour Pashtun_Afghanistan Paniya -0.0341 -9.880 7998 8562 113795 = 39.24
result: Yoruba Loschbour Tajik_Afghanistan Paniya -0.0275 -7.686 8029 8482 113795 = 31.65
result: Yoruba Loschbour Tajik_Ishkashim Paniya -0.0391 -9.965 7972 8620 114012 = 44.99
result: Yoruba Loschbour Tajik_Rushan Paniya -0.0489 -12.146 7922 8736 114011 = 56.27
result: Yoruba Loschbour Tajik_Shugnan Paniya -0.0469 -13.023 7937 8719 114013 = 53.97
result: Yoruba Loschbour Kalash Paniya -0.0342 -11.355 7958 8521 114013 = 39.36
result: Yoruba Loschbour Pathan Paniya -0.0309 -10.976 7972 8479 114013 = 35.56

result: Yoruba Esperstedt_MN Sintashta Paniya -0.0835 -17.060 6439 7612 94428
result: Yoruba Esperstedt_MN Pashtun_Afghanistan Paniya -0.0452 -12.294 7510 8220 107819 = 54.13
result: Yoruba Esperstedt_MN Tajik_Afghanistan Paniya -0.0449 -12.684 7502 8207 107819 = 53.77
result: Yoruba Esperstedt_MN Tajik_Ishkashim Paniya -0.0564 -14.352 7436 8326 108047 = 67.54
result: Yoruba Esperstedt_MN Tajik_Rushan Paniya -0.0567 -14.288 7465 8363 108046 = 67.90
result: Yoruba Esperstedt_MN Tajik_Shugnan Paniya -0.0603 -15.780 7449 8405 108048 = 72.22
result: Yoruba Esperstedt_MN Kalash Paniya -0.0470 -14.868 7471 8207 108048 = 56.29
result: Yoruba Esperstedt_MN Pathan Paniya -0.0438 -15.259 7480 8165 108048 = 52.46
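
A note on the trailing "= xx.xx" figures: they appear to be each population's D divided by the Sintashta D from the top row of the same block, times 100. That is my reading of the numbers, not something stated in the output itself. A short Python sketch under that assumption (`ratio_percent` is a made-up helper name):

```python
def ratio_percent(d_test: float, d_reference: float) -> float:
    """Test-population D as a percentage of the reference (Sintashta) D."""
    return 100.0 * d_test / d_reference


# Sintashta D from the first row of each block
d_sintashta = {"Loschbour": -0.0869, "Esperstedt_MN": -0.0835}

# (block, population, D, percentage printed in the comment)
checks = [
    ("Loschbour", "Pashtun_Afghanistan", -0.0341, 39.24),
    ("Loschbour", "Tajik_Rushan", -0.0489, 56.27),
    ("Esperstedt_MN", "Tajik_Shugnan", -0.0603, 72.22),
    ("Esperstedt_MN", "Pathan", -0.0438, 52.46),
]

for block, pop, d, printed in checks:
    pct = ratio_percent(d, d_sintashta[block])
    print(f"{block:14s} {pop:20s} computed={pct:.2f} printed={printed:.2f}")
```

Because the ratio is taken on D values that are themselves rounded to four decimals, the second decimal of the percentage should not be read as meaningful precision.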

As Alberto pointed out, a serious issue here is that Paniya are terrible non-IE references for the Tajiks, and probably for the Pashtuns as well, considering BMAC were quite unlikely to resemble the Paniya (same for IVC as well). This is very problematic, but we can't change that until we have South Central Asian aDNA.

But one thing that can be changed is the European population in the formula.

It really doesn't make sense to use Loschbour, as South Central Asians don't have much in the way of WHG ancestry (although they seem to have very substantial EHG/MA1-related ancestry). This is quite expected, as steppe populations seem to have very little of it themselves (Sintashta and Andronovo, it seems, don't have any WHG admixture at all, just WHG-related ancestry via EEF), and South Central Asians are a mix of ancient steppe, West Asian/Caucasus, and South Asian ancestries.

And surely the middle Neolithic population is very far from being a good gauge of steppe ancestry either, as it lacks both EHG and CHG. It's just Anatolian Neolithic + WHG.

If one wants actual estimates of steppe ancestry, why not use steppe populations? So, I was hoping Chad could try these:

Yoruba Sintashta Pashtun_Afghanistan Paniya
Yoruba Sintashta Tajik_Afghanistan Paniya
Yoruba Sintashta Tajik_Ishkashim Paniya
Yoruba Sintashta Tajik_Rushan Paniya
Yoruba Sintashta Tajik_Shugnan Paniya
Yoruba Sintashta Kalash Paniya
Yoruba Sintashta Pathan Paniya
Yoruba Sintashta GujaratiA Paniya
Yoruba Sintashta GujaratiD Paniya
Yoruba Sintashta Punjabi Paniya

Yoruba Srubnaya Pashtun_Afghanistan Paniya
Yoruba Srubnaya Tajik_Afghanistan Paniya
Yoruba Srubnaya Tajik_Ishkashim Paniya
Yoruba Srubnaya Tajik_Rushan Paniya
Yoruba Srubnaya Tajik_Shugnan Paniya
Yoruba Srubnaya Kalash Paniya
Yoruba Srubnaya Pathan Paniya
Yoruba Srubnaya GujaratiA Paniya
Yoruba Srubnaya GujaratiD Paniya
Yoruba Srubnaya Punjabi Paniya

Yoruba Andronovo Pashtun_Afghanistan Paniya
Yoruba Andronovo Tajik_Afghanistan Paniya
Yoruba Andronovo Tajik_Ishkashim Paniya
Yoruba Andronovo Tajik_Rushan Paniya
Yoruba Andronovo Tajik_Shugnan Paniya
Yoruba Andronovo Kalash Paniya
Yoruba Andronovo Pathan Paniya
Yoruba Andronovo GujaratiA Paniya
Yoruba Andronovo GujaratiD Paniya
Yoruba Andronovo Punjabi Paniya

(If possible, please exclude the Andronovo samples with heavy/moderate Siberian admixture). Thanks in advance.

Chad Rohlfsen said...


What I was doing is using f4. You have to use a West Eurasian that is more distant from L3. That is how you're supposed to do it. I've already done the f4 on Paniyas. They came out 75% Onge, 25% Kotias. What I was looking for is the percent of Sintashta and then comparing it to the Dstat Onge Tajik_Ishkashim Kotias Sintashta. This allows me to compare the best admixing population and how the f4 looks. If you look over the previous two pages on that thread and my explanation to Alberto, you'll understand why I did it that way. What you're asking for is just shared drift with Andronovo. That doesn't tell us much as far as admixing populations and it can't give us any percentages.

Chad Rohlfsen said...

Using the Paniya had nothing to do with presuming what BMAC or anything else was like. But it can tell us just where to go from the Paniya to that pop by looking at my dstat admixing population and the f4. Please read over those pages again. Alberto didn't understand what I was doing and why.

Davidski said...

Using Andronovo (minus outlier RISE512) and Pulliyar.

Yoruba Esperstedt_MN Tajik_Shugnan Pulliyar 0.760079
Yoruba Esperstedt_MN Tajik_Ishkashim Pulliyar 0.711088
Yoruba Esperstedt_MN Tajik_Rushan Pulliyar 0.708293
Yoruba Esperstedt_MN Kalash Pulliyar 0.575286
Yoruba Esperstedt_MN Pathan Pulliyar 0.53606
Yoruba Esperstedt_MN GujaratiA Pulliyar 0.443327
Yoruba Esperstedt_MN GujaratiB Pulliyar 0.394501
Yoruba Esperstedt_MN GujaratiC Pulliyar 0.313591
Yoruba Esperstedt_MN GujaratiD Pulliyar 0.256383
Yoruba Esperstedt_MN Punjabi_Lahore Pulliyar 0.251655

Chad Rohlfsen said...

BTW, I'm very certain Sintashta does have a good amount of WHG. I think it is close to the amount in Iberia_MN. In my new calculator, they're about 26% WHG, 22% Near East, 23% CHG, and 29% EHG.

left pops:

right pops:

0 Sintashta 5
1 Karelia_HG 1
2 Kotias 1
3 Iberia_MN 4
4 Mbuti 10
5 Ami 10
6 Ju_hoan_North 5
7 Japanese 29
8 Yoruba 70
9 Papuan 14
10 Karitiana 12
11 Loschbour 1
jackknife block size: 0.050
snps: 133461 indivs: 162
number of blocks for block jackknife: 703
dof (jackknife): 571.156
numsnps used: 82479
codimension 1
f4rank: 2 dof: 5 chisq: 1.347 tail: 0.930052419 dofdiff: 7 chisqdiff: -1.347 taildiff: 1
scale 1.000 1.000
Ami 0.651 0.386
Ju_hoan_North -0.012 -0.094
Japanese 0.719 0.417
Yoruba -0.061 -0.074
Papuan 0.423 -0.153
Karitiana 1.754 1.635
Loschbour 1.674 -1.991
scale 346.182 717.771
Karelia_HG 1.164 0.992
Kotias -1.270 1.032
Iberia_MN -0.176 -0.975

full rank 1
f4rank: 3 dof: 0 chisq: 0.000 tail: 1 dofdiff: 5 chisqdiff: 1.347 taildiff: 0.930052419
scale 1.000 1.000 1.000
Ami 0.651 0.406 -0.868
Ju_hoan_North -0.011 -0.111 1.099
Japanese 0.720 0.402 0.487
Yoruba -0.061 -0.121 1.766
Papuan 0.421 -0.118 -1.238
Karitiana 1.749 1.625 0.368
Loschbour 1.679 -1.998 0.126
scale 344.789 717.986 10790.179
Karelia_HG 1.145 1.002 -0.828
Kotias -1.285 1.038 -0.521
Iberia_MN -0.195 -0.959 -1.429

best coefficients: 0.293 0.198 0.508
0.000059248 -0.000110587 -0.000079133 -0.000174471 0.000099097 -0.000100166 -0.000133973
0.521807884 -0.973968449 -0.696942693 -1.536606079 0.872767856 -0.882187287 -1.179930715

Jackknife mean: 0.292914522 0.198911256 0.508174222
std. errors: 0.043 0.058 0.071

error covariance (* 1000000)
1839 -107 -1732
-107 3352 -3245
-1732 -3245 4977

fixed pat wt dof chisq tail prob
000 0 5 1.347 0.930052 0.293 0.198 0.508
001 1 6 39.339 6.14089e-07 0.487 0.513 -0.000
010 1 6 11.901 0.0642092 0.305 -0.000 0.695
100 1 6 44.182 6.80228e-08 0.000 0.185 0.815
011 2 7 124.957 7.0985e-24 1.000 -0.000 -0.000
101 2 7 138.767 9.21198e-27 0.000 1.000 0.000
110 2 7 50.642 1.0802e-08 0.000 -0.000 1.000
best pat: 000 0.930052 - -
best pat: 010 0.0642092 chi(nested): 10.555 p-value for nested model: 0.00115905
best pat: 110 1.0802e-08 chi(nested): 38.741 p-value for nested model: 4.83871e-10

I'll continue to work on this.
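
For readers puzzling over the "p-value for nested model" lines in the output above: the reported numbers are consistent with taking the difference in chisq between two patterns and treating it as chi-square with 1 degree of freedom. That is inferred from the printed values, not from the program's documentation. A minimal Python sketch using only the standard library (`chi2_sf_1dof` is a made-up helper name):

```python
import math


def chi2_sf_1dof(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# chisq values copied from the "fixed pat" table above
chisq_full = 1.347          # pattern 000: all three sources kept
chisq_no_kotias = 11.901    # pattern 010: Kotias weight fixed at zero
chisq_iberia_only = 50.642  # pattern 110: Karelia_HG and Kotias both fixed at zero

p_drop_kotias = chi2_sf_1dof(chisq_no_kotias - chisq_full)
p_drop_karelia_too = chi2_sf_1dof(chisq_iberia_only - chisq_no_kotias)

print(f"000 -> 010: chi = {chisq_no_kotias - chisq_full:.3f}, p = {p_drop_kotias:.6g}")
print(f"010 -> 110: chi = {chisq_iberia_only - chisq_no_kotias:.3f}, p = {p_drop_karelia_too:.6g}")
```

Both values land very close to the 0.00115905 and 4.83871e-10 printed in the output, with the small gap explained by the chisq values being rounded to three decimals.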

Chad Rohlfsen said...

When I say Iberia_MN, I mean the European farmer admixture looks very similar to Iberians in terms of WHG/Near East

Davidski said...

And again with Andronovo (minus outlier RISE512), but this time with Paniya. Very similar to the above results with Pulliyar, and my TreeMix runs.

Yoruba Esperstedt_MN Tajik_Shugnan Paniya 0.755721
Yoruba Esperstedt_MN Tajik_Rushan Paniya 0.708391
Yoruba Esperstedt_MN Tajik_Ishkashim Paniya 0.706277
Yoruba Esperstedt_MN Kalash Paniya 0.581948
Yoruba Esperstedt_MN Pathan Paniya 0.542171
Yoruba Esperstedt_MN GujaratiA Paniya 0.45105
Yoruba Esperstedt_MN GujaratiB Paniya 0.404643
Yoruba Esperstedt_MN GujaratiC Paniya 0.328915
Yoruba Esperstedt_MN GujaratiD Paniya 0.2765
Yoruba Esperstedt_MN Punjabi_Lahore Paniya 0.265459
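
To put a number on "very similar", the last column of this table can be compared with the Pulliyar run posted earlier. A small Python sketch, with the values typed in by hand from the two tables (the dictionary names are just for illustration):

```python
# Last-column values from the two runs, keyed by test population
pulliyar = {
    "Tajik_Shugnan": 0.760079, "Tajik_Ishkashim": 0.711088, "Tajik_Rushan": 0.708293,
    "Kalash": 0.575286, "Pathan": 0.53606, "GujaratiA": 0.443327,
    "GujaratiB": 0.394501, "GujaratiC": 0.313591, "GujaratiD": 0.256383,
    "Punjabi_Lahore": 0.251655,
}
paniya = {
    "Tajik_Shugnan": 0.755721, "Tajik_Ishkashim": 0.706277, "Tajik_Rushan": 0.708391,
    "Kalash": 0.581948, "Pathan": 0.542171, "GujaratiA": 0.45105,
    "GujaratiB": 0.404643, "GujaratiC": 0.328915, "GujaratiD": 0.2765,
    "Punjabi_Lahore": 0.265459,
}

# Paniya-based value minus Pulliyar-based value for each population
diffs = {pop: paniya[pop] - pulliyar[pop] for pop in pulliyar}
worst_pop = max(diffs, key=lambda pop: abs(diffs[pop]))

for pop, diff in diffs.items():
    print(f"{pop:16s} {diff:+.6f}")
print(f"largest gap: {worst_pop} ({diffs[worst_pop]:+.6f})")
```

The gaps are a few thousandths for the Tajik groups and grow towards the South Asian end of the list, with the largest one at about two percentage points.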

Btw, in regards to WHG in Sintashta, the issue there I think is that one of the Sintashta with the highest marker count has no WHG above what is in Anatolia_Neolithic, and this probably caused the lack of WHG in Sintashta in the modeling done in the Mathieson paper.

Analyzed individually, most of the Sintashta samples do show extra WHG in all of the analyses that I've run.

Chad Rohlfsen said...

I think that Sintashta had a higher ratio of CHG to EHG than Yamnaya. Which number was that? I'll remove it and try again.

Davidski said...

Can't see Sintashta having a higher ratio of CHG to EHG than Yamnaya Kalmykia.

The modeling in Haak et al. shows that Corded Ware came from a Yamnaya-like pop with a slightly higher EHG ratio than Yamnaya Samara. Sintashta is probably ultimately from the same source, albeit with more EEF admix than Corded Ware.

The Sintashta samples, from most basal to least basal...


The last one definitely has more extra WHG than Srubnaya.

Chad Rohlfsen said...

Never mind. Rise386 has 3% African from the deamination. Rise394 has about 1.5%. That might be the issue. I'll remove them and try it again.

Chad Rohlfsen said...

Without the samples featuring African noise..... BOOM!

left pops:

right pops:

0 Sintashta 3
1 Karelia_HG 1
2 Kotias 1
3 Iberia_MN 4
4 Mbuti 10
5 Ami 10
6 Ju_hoan_North 5
7 Japanese 29
8 Yoruba 70
9 Papuan 14
10 Karitiana 12
11 Loschbour 1
jackknife block size: 0.050
snps: 133461 indivs: 160
number of blocks for block jackknife: 703
dof (jackknife): 571.315
numsnps used: 80726
codimension 1
f4rank: 2 dof: 5 chisq: 0.325 tail: 0.997144984 dofdiff: 7 chisqdiff: -0.325 taildiff: 1
scale 1.000 1.000
Ami 0.666 0.382
Ju_hoan_North -0.027 -0.105
Japanese 0.727 0.415
Yoruba -0.071 -0.091
Papuan 0.443 -0.165
Karitiana 1.758 1.620
Loschbour 1.654 -2.003
scale 347.521 731.244
Karelia_HG 1.182 0.954
Kotias -1.256 1.045
Iberia_MN -0.157 -0.999

full rank 1
f4rank: 3 dof: 0 chisq: 0.000 tail: 1 dofdiff: 5 chisqdiff: 0.325 taildiff: 0.997144984
scale 1.000 1.000 1.000
Ami 0.668 0.420 1.420
Ju_hoan_North -0.024 -0.074 0.911
Japanese 0.727 0.440 1.003
Yoruba -0.071 -0.082 0.381
Papuan 0.442 -0.125 1.409
Karitiana 1.759 1.586 -0.941
Loschbour 1.653 -2.022 -0.361
scale 347.570 726.107 9486.556
Karelia_HG 1.217 0.970 0.759
Kotias -1.226 1.061 0.609
Iberia_MN -0.124 -0.966 1.432

best coefficients: 0.290 0.211 0.498
0.000236563 0.000097413 0.000196141 0.000033281 0.000195600 0.000114560 0.000048174
1.570941255 0.646887934 1.302516128 0.221006683 1.298921042 0.760755873 0.319909338

Jackknife mean: 0.290245196 0.211566107 0.498188697
std. errors: 0.047 0.063 0.077

error covariance (* 1000000)
2183 -123 -2060
-123 3957 -3834
-2060 -3834 5894

fixed pat wt dof chisq tail prob
000 0 5 0.325 0.997145 0.290 0.211 0.498
001 1 6 32.353 1.39607e-05 0.475 0.525 -0.000
010 1 6 10.717 0.097541 0.302 -0.000 0.698
100 1 6 35.668 3.19837e-06 0.000 0.205 0.795
011 2 7 114.000 1.35636e-21 1.000 -0.000 -0.000
101 2 7 116.863 3.4454e-22 0.000 1.000 0.000
110 2 7 42.870 3.53453e-07 0.000 -0.000 1.000
best pat: 000 0.997145 - -
best pat: 010 0.097541 chi(nested): 10.392 p-value for nested model: 0.00126596
best pat: 110 3.53453e-07 chi(nested): 32.153 p-value for nested model: 1.42487e-08

Davidski said...

Yeah, I had a feeling the post-mortem damage might be making them more basal than they really are.

Chad Rohlfsen said...

Yeah. Sintashta has about as much WHG as Central Europeans.

Chad Rohlfsen said...

Those around the Carpathians and Alps that is.

Seinundzeit said...


Now I understand what you were trying to do.


Thanks for running those. It's striking how those numbers match qpAdm, and TreeMix. Rather consistent.

Chad Rohlfsen said...

Just a few with separated Beaker and Corded males and females, with their respective best admixing population.

Yoruba Yamnaya_Samara German_BeakerM German_BeakerF -0.0032 -0.672 5724 5761 114451
Yoruba Karelia_HG German_BeakerM German_BeakerF -0.0068 -0.904 5374 5448 108351
Yoruba Iberia_MN German_BeakerM German_BeakerF 0.0017 0.303 5382 5364 108376
Yoruba Loschbour German_BeakerM German_BeakerF 0.0058 0.823 4969 4912 98512
Yoruba Kotias German_BeakerM German_BeakerF -0.0110 -1.798 4686 4790 99652
Yoruba Yamnaya_Samara Beaker_RiseM Beaker_RiseF -0.0010 -0.120 1943 1947 42944
Yoruba Karelia_HG Beaker_RiseM Beaker_RiseF -0.0284 -2.210 1806 1912 40809
Yoruba Iberia_MN Beaker_RiseM Beaker_RiseF 0.0151 1.495 1883 1827 41217
Yoruba Loschbour Beaker_RiseM Beaker_RiseF -0.0030 -0.268 1652 1663 36896
Yoruba Kotias Beaker_RiseM Beaker_RiseF -0.0110 -1.001 1629 1665 38297
Yoruba Yamnaya_Samara German_CordedM German_CordedF 0.0020 0.717 26494 26389 548789
Yoruba Karelia_HG German_CordedM German_CordedF 0.0056 1.255 25607 25320 526438
Yoruba Iberia_MN German_CordedM German_CordedF -0.0036 -1.136 25047 25226 526720
Yoruba Loschbour German_CordedM German_CordedF -0.0020 -0.441 22333 22424 468323
Yoruba Kotias German_CordedM German_CordedF 0.0032 0.814 21743 21602 473969
Yoruba Yamnaya_Samara German_Corded_RiseM German_Corded_RiseF 0.0052 0.317 485 480 10615
Yoruba Karelia_HG German_Corded_RiseM German_Corded_RiseF -0.0147 -0.600 454 468 9991
Yoruba Iberia_MN German_Corded_RiseM German_Corded_RiseF -0.0290 -1.624 457 484 10446
Yoruba Loschbour German_Corded_RiseM German_Corded_RiseF -0.0077 -0.339 400 406 8852

Chad Rohlfsen said...

Yoruba Kotias German_Corded_RiseM German_Corded_RiseF 0.0259 1.215 424 403 9461
Esperstedt_MN German_BeakerM Kotias Karelia_HG 0.0023 0.187 4280 4260 88906
Esperstedt_MN German_BeakerM Kotias Yamnaya_Samara 0.0086 1.087 4423 4348 93977
Esperstedt_MN German_BeakerM Karelia_HG Yamnaya_Samara 0.0029 0.289 4835 4808 102378
Esperstedt_MN German_BeakerF Kotias Karelia_HG 0.0254 4.100 18441 17528 399066
Esperstedt_MN German_BeakerF Kotias Yamnaya_Samara 0.0162 3.672 18868 18265 421775
Esperstedt_MN German_BeakerF Karelia_HG Yamnaya_Samara -0.0082 -1.575 20568 20909 466038
Esperstedt_MN Beaker_RiseM Kotias Karelia_HG 0.0111 1.171 7032 6878 159008
Esperstedt_MN Beaker_RiseM Kotias Yamnaya_Samara 0.0114 1.700 7249 7086 168797
Esperstedt_MN Beaker_RiseM Karelia_HG Yamnaya_Samara 0.0009 0.114 7875 7861 182577
Esperstedt_MN Beaker_RiseF Kotias Karelia_HG 0.0060 0.479 3196 3158 73926
Esperstedt_MN Beaker_RiseF Kotias Yamnaya_Samara 0.0019 0.212 3315 3303 78228
Esperstedt_MN Beaker_RiseF Karelia_HG Yamnaya_Samara 0.0036 0.347 3639 3613 84919
Esperstedt_MN German_CordedM Kotias Karelia_HG 0.0282 4.091 18013 17026 387137
Esperstedt_MN German_CordedM Kotias Yamnaya_Samara 0.0213 4.155 18388 17622 406122
Esperstedt_MN German_CordedM Karelia_HG Yamnaya_Samara -0.0068 -1.217 20147 20423 451826
Esperstedt_MN German_CordedF Kotias Karelia_HG 0.0321 4.648 18844 17672 399341
Esperstedt_MN German_CordedF Kotias Yamnaya_Samara 0.0200 3.922 19262 18506 422316