Wednesday, January 17, 2018

Another look at the genetic structure of Yamnaya


Yamnaya and other similar Eneolithic/Bronze Age herder groups from the Eurasian steppe were mostly a mixture of Eastern European Hunter-Gatherers (EHG) and Caucasus Hunter-Gatherers (CHG). But they also harbored minor ancestry from at least one, significantly more westerly, source that pulled them away from the EHG > CHG north/south genetic cline. This is easy to show with formal statistics (for instance, refer to the qpAdm output here) and illustrate with a decent Principal Component Analysis (PCA).


Over the past couple of years I've come to the conclusion that this minor westerly input probably came from the Carpathian Basin (modern-day Hungary) or somewhere nearby, like the Balkans (see here).

However, this inference was based on just a handful of Neolithic samples from the Carpathian Basin. Now, thanks to Lipson et al. 2017, I have genotype data from tens of individuals from several different Neolithic and Copper Age cultures from the region. So let's revisit the issue by plugging these new samples into qpAdm, using the very latest qpAdm methods as described in the scientific literature (with Ethiopia_4500BP as the base pright sample, plus 15 other ancient pright groups and individuals).

Below are the results, best to worst, sorted by taildiff. For comparison, I ran extra models with ancient populations from other parts of Europe and also West Asia. It's interesting and, I'd say, important to note that the West Asian reference groups produce amongst the worst statistical fits (in bold text). What this suggests is that Yamnaya did not harbor extra West Asian ancestry on top of its CHG input. And, by the way, please note that I'm only using Yamnaya_Samara in these runs because I prefer UDG-treated, and thus higher quality, ancient samples.

CHG + EHG + Blatterhole_MN 0.465394061 > full output

CHG + EHG + Koros_HG 0.322245651 > full output

CHG + EHG + Germany_MN 0.321017025 > full output

CHG + EHG + Protoboleraz_LCA 0.315521424 > full output

CHG + EHG + Vinca_MN 0.292074267 > full output

CHG + EHG + Baden_LCA 0.255168297 > full output

CHG + EHG + Tisza_LN 0.246555616 > full output

CHG + EHG + ALPc_MN 0.220623346 > full output

CHG + EHG + Blatterhole_HG 0.219418173 > full output

CHG + EHG + Balaton_Lasinja_CA 0.211230222 > full output

CHG + EHG + Tiszapolgar_ECA 0.207527666 > full output

CHG + EHG + LBK_EN 0.182365613 > full output

CHG + EHG + TDLN 0.176675465 > full output

CHG + EHG + Koros_EN 0.15488361 > full output

CHG + EHG + Starcevo_EN 0.136365203 > full output

CHG + EHG + Armenia_EBA 0.127988891 > full output

CHG + EHG + Armenia_ChL 0.123057884 > full output

CHG + EHG + LBKT_MN 0.122780467 > full output

CHG + EHG + Tepecik_Ciftlik_N 0.110155019 > full output

CHG + EHG + Greece_N 0.105880232 > full output

CHG + EHG + Boncuklu_N 0.094240794 > full output

CHG + EHG + Anatolia_BA 0.069141519 > full output

CHG + EHG + Anatolia_ChL 0.067837662 > full output

...

CHG + EHG + Iran_ChL infeasible > full output
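
For anyone who wants to re-sort a list like this themselves, here is a minimal Python sketch. It assumes each result is a plain line of the form "sources value", as in the list above, and it only uses a handful of the rows for brevity. The function names (`parse_line`, `rank_models`) are made up for this illustration and are not part of qpAdm or any related tool.

```python
from typing import List, Optional, Tuple

# A few rows copied from the list above, deliberately out of order.
RAW_RESULTS = """\
CHG + EHG + Anatolia_ChL 0.067837662
CHG + EHG + Koros_HG 0.322245651
CHG + EHG + Iran_ChL infeasible
CHG + EHG + Blatterhole_MN 0.465394061
CHG + EHG + Armenia_EBA 0.127988891
CHG + EHG + Starcevo_EN 0.136365203
"""

def parse_line(line: str) -> Tuple[str, Optional[float]]:
    """Split one result line into (third source, fit value or None if infeasible)."""
    model, value = line.rsplit(" ", 1)
    third_source = model.split(" + ")[-1]
    return third_source, (None if value == "infeasible" else float(value))

def rank_models(raw: str) -> List[Tuple[str, Optional[float]]]:
    """Sort best to worst: highest value first, infeasible models last."""
    rows = [parse_line(line.strip()) for line in raw.splitlines() if line.strip()]
    return sorted(rows, key=lambda row: (row[1] is None, -(row[1] or 0.0)))

ranked = rank_models(RAW_RESULTS)
for name, value in ranked:
    label = "infeasible" if value is None else f"{value:.6f}"
    print(f"{name:<16} {label}")
```

Infeasible models are pushed to the bottom rather than dropped, which mirrors how the Iran_ChL model is listed above.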

At the top of the list is Blatterhole_MN. Admittedly this is something of a surprise, considering the geographic distance between Blatterhole, Germany, and Samara, Russia. It's also an intriguing result because of the presence of Y-chromosome haplogroup R1b in both Blatterhole_MN and Yamnaya (see here).

However, this doesn't necessarily mean that Yamnaya harbors direct ancestry from Blatterhole_MN, or even any closely related group from North-Central Europe. Rather, Blatterhole_MN is simply the best proxy in this analysis for the non-CHG/EHG ancestry in Yamnaya, and the important question is why?

Considering also the presence at the top of the list of Koros_HG (which includes Hungary_HG I1507), Germany_MN and Vinca_MN, the likely answer is its high ratio of Western European Hunter-Gatherer (WHG) ancestry. Indeed, when I let qpAdm vary the WHG ratio, by dropping Blatterhole_MN and adding Koros_EN and Koros_HG in its place, I get an even better fit.

CHG + EHG + Koros_EN + Koros_HG 0.612772624 > full output

And for comparison...

CHG + EHG + LBK_EN + WHG 0.551431774 > full output

So is the missing piece of the Yamnaya puzzle a population with roughly equal ratios of Early Neolithic (EN) and WHG ancestries from the Carpathian Basin or surrounds? Quite possibly. But let's wait and see what happens when I add the ancient groups from the Balkans and North Pontic steppe from the forthcoming Mathieson et al. 2018 to this analysis.

Update 17/05/2018: My results have been confirmed in a new preprint from Harvard/Max Planck. See here: On the genetic prehistory of the Greater Caucasus (Wang et al. 2018 preprint).

See also...

What's Maykop (or Iran) got to do with it?

The Yamnaya outlier

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

83 comments:

  1. Thanks for sharing! You have made a few excellent points here!

  2. Mitogenomes like J2b, T2c1a, T1a in Yamna are almost certainly from the EEF side.

  3. Considering the underlying more basic components in the Neolithic samples, all the working (Blatterhole, Koros_EN, Koros_EN+Koros_HG) models are pretty sharp on the total amount of CHG+Anatolia_N together, which are approximately 50% (varies from 49.7% to 51.0% drafting in the WHG+Anatolia_N proportions from Lipson's paper), and the remaining 50% (50.3% - 49%) WHG+EHG.

    Within the models, CHG+WHG are positively related (models with slightly higher CHG fraction have slightly higher underlying WHG fraction). With as many outgroup relevant statistics on the right, it seems like this is balancing it to a very precise point on general north-south underlying dimension.

  4. SM. has placed R1b samples from Blatterhole and Iron Gates between L754+ and V88+
    https://kumbarov.com/ht35/R1b_xP312xU106_V.38.1.pdf

  5. @Davidski

    "At the top of the list is Blatterhole_MN. Admittedly this is something of a surprise, considering the geographic distance between Blatterhole, Germany, and Samara, Russia."

    But what stands out is that Blatterhole_MN has a lot more WHG than other Old Europeans, apart from Koros_HG. Which is the second on the list.

    Which makes me wonder, would the Ukrainian Mesolithic from Mathieson + CHG fit better than CHG + EHG? These Mesolithic samples have more WHG than the samples mostly called EHG. Or would CHG + EHG + Ukrainian Mesolithic fit? Or CHG + Ukrainian Mesolithic + some farmer/EEF sample?

  6. Input of women from Cucuteni-Tripolye?

  7. @ a

    Did I read well the PDF of Sergey Malyshev and Atanas Kumbarov? Villabruna is placed on the M73 line... How's it possible?

    I noticed also that ATP3 is surely placed on the M269 line...

  8. "and the important question is why?"

    any copper/gold/silver mines near Blatterhole?

  9. "any copper/gold/silver mines near Blatterhole?"

    or less fun but possibly more likely, salt

  10. "So is the missing piece of the Yamnaya puzzle a population with roughly equal ratios of Early Neolithic (EN) and WHG ancestries from the Carpathian Basin or surrounds?"

    I just want to mention that this is what a lot of us "R1b/Yamnaya skeptics" have been suggesting for a while (ie that R1b-M269 radiated out from around the Carpathians/Danube before some downstream clades then hitched a ride on Yamnaya). Obviously it remains to be seen if that skepticism will be validated.

    The "third neolithic" talk on Bell Beaker Blogger's page fits this well IMHO.

  11. @David

    Another thing: If I recall correctly it looks like Koros_HG has a tad extra ANE and a tad EEF. I can imagine that would somehow cause qpAdm to choose it over any other WHG sample with a target that has ANE and basal eurasian.

  12. I think individuals like GB1 Eneolithic Romania will be interesting for Yamnaya

  13. And also all Steppe ancestry in Europe came along with significant EEF/WHG admixture. Those two mega "races" had a long history of mixing. By the time R1b P312 Steppe folk made it to Spain they may have been only 50% Steppe.

  14. >> any copper/gold/silver mines near Blatterhole?

    > or less fun but possibly more likely, salt

    Well, it's in the Ruhr Valley, but I doubt if coal or iron ore were of any interest at that stage. There is copper in Sauerland, 40 to 100 km away, but ten minutes googling didn't turn up any references to pre-Roman mining.

  15. Regarding the speculation about copper and Blatterhohle: there's no basis. In fact, the Blatterhohle R1b-V88 individual was found without context (in a cave without clear TRB or Baalberg or whichever local MNE culture artefacts). But an interesting clue is the links between Iron Gates, El Trocs and some of the Mariupol Ukraine V88 individuals, further evidence of intra-Europe movement patterns before the big steppe migration.

  16. AlanL, Rob

    me wildly speculating - the V88 distribution in Africa is either very odd (if the source was distant) or not odd at all (if it was local) but on the assumption it was distant i googled around one time to see if there was anything in the areas of high concentration of V88 that might have drawn people from a long way away hoping for something fun like gold or copper but iirc the only clear potential candidate was salt. One of the high V88 regions was a major salt production center from ancient times iirc (that's if i got the right spot in the first place so may be nothing).

  17. Epoch's hypothesis mirrors mine and deserves some attention.

    Who here remembers the Ukrainian hunter gatherers and that Eneolithic steppe-like sample from Ukraine from Mathieson et al. 2017?

    As many of you probably recall, the Ukrainian HGs were not exactly like EHGs. They had some WHG-like ancestry pulling them westward.

    So, WHAT IF the WHG-like ancestry in Ukrainians and Romanians was closely related to UHG in Neolithic Anatolians (and in turn EEFs)? It's a relevant question as Eastern Ukraine was, in all likelihood, the birthplace of the steppe ancestral package considering aDNA and archaeology alike.

    In fact, I recall modeling Yamnaya Kalmykians a while back on nMonte using a battery of samples. They took no ANF ancestry at all, but preferred a mix of EHG+some WHG+CHG(don't recall the exact samples), instead. So, no surprise David's models prefer Koros_HG and HG-enriched Blatterhole_MN.

  18. @Anthro Survey

    There are EEF-related archaeological sites derived from the Balkans and/or Carpathian Basin as far east as the Donets in eastern Ukraine/southern Russia.

    http://eurogenes.blogspot.com/2017/02/three-way.html

    It seems unlikely that these settlements were inhabited by WHG-like populations, as opposed to something resembling a mixture of Koros_EN and Koros_HG.

  19. @Davidski

    In this paper, the authors do not rule out, by any means, the arrival of agricultural, husbandry and pottery-making packages to the Donets/Lower Don regions from the NW Caucasus(or perhaps via the Eurasian route). In fact, they arguably show a tacit preference for this.

    Furthermore, agriculture definitively began there at the tail end of 5th millennium BC. This is either contemporaneous with or post-dates some of the earliest "steppic" sites. Hence, if we assume an EEF route of transmission from Tripolye-like peoples, it doesn't preclude the earliest hybrid populations there being CHG+EHG+minor WHG.
    Either way, it doesn't require local EEF-like populations of E. Ukraine to have been significant contributors to the steppe package in the long run. They could have lived side-by-side and perhaps engaged in limited mixing but ultimately ended up as a dead-end population.

    I remain open-minded, of course, and don't discount limited EEF in Yamnaya.

    When do you think those genomes from Ukraine and Romania will become available?

  20. @Anthro Survey

    When do you think those genomes from Ukraine and Romania will become available?

    Hopefully soon.

  21. Anthro
    It is thought these days that pottery making was invented in Russia - Ukraine independently. It’s just silt and clay, which was then fired for hardness

    Your statement that “agriculture definitely began there at the tail end of 5th M BC” is more of a terminus ante quem.

  22. Khvalynsk and Armenia EBA R1b in the same branch makes sense. That would explain the EHG in Chl Armenia.

    But the other R1b from Armenia is MBA not EBA. There is an error there.

  23. Blatterhole_MN has R1b and clear links with the western part of the northern Black Sea region by mtDNA. Most likely, they are some kind of Mesolithic (early? late?) migrants from the western part of the northern Black Sea region. Maybe, under pressure from Neolithic farmers, one part of their ancestral population ended up in Germany, and the other on the Dnieper river, in the Neolithic.

    Protoboleraz_LCA clearly had contacts with the steppe culturally, and therefore received genetic flow. Baden_LCA comes from him and Tisza_LN & Tiszapolgar_ECA & Balaton_Lasinja_CA.


    What is Körös_HG?

  24. Hmm. Could this be due to a spread of R1b to the steppe via expansion of the Magdalenian culture? Archaeology demonstrates that the Magdalenian culture did expand out to at least Poland.

    Based on Reich's paper on Ice Age Europe, the El Miron cluster was fairly diverse, with the El Miron individual falling in between the rest of the El Miron cluster and the Villabruna (WHG) cluster.

    Or it could have been a slightly later, replacement expansion, of the Villabruna genetic cluster (WHG) which expanded out to at least the Czech republic. (https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/FuQ_nature17993.pdf)

    Blatterhole_MN is smack in the middle of that range and shows ~70% Villabruna or KO1 related WHG ancestry and ~30% EEF farmer ancestry. (https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/nature24476_final.pdf)

    Would seem parsimonious that the R1b came from the Villabruna lineage.

    So R1b expanded out to eastern europe/steppe from western/central europe, where it was found only at low percentages, via the Villabruna-like lineage during the Late paleolithic/early mesolithic?

    Just a thought

  25. @Lee Albee,

    Um. If early R1b spread with Magdalenian that would really be something considering for like 10 years the consensus was R1b L151 spread with Magdalenian.

  26. @supernord

    The area is not far from the Netherlands where the Vlaardingen Culture could be found in the peat- and marshlands. It was contemporary to the Funnel Beaker culture, and even survived well into the Corded Ware era. It showed clear Mesolithic traits, most likely a continuation of the local Dutch variety of Ertebølle called the Swifterbant culture.

    https://en.wikipedia.org/wiki/Vlaardingen_culture

    We also have a Funnelbeaker graveyard from near Schwerin - the Ostorf flat grave - which had fully mesolithic mtDNA and showed clear signs of being far more hunter-gatherer than normal funnelbeaker.

    http://www.sciencedirect.com/science/article/pii/S2352409X17303231?via%3Dihub

    Your scenario seems too far fetched with such examples in the neighbourhood.

  27. @ Epoch
    So where is all the Mesolithic V88 in Central -Western Europe ??

  28. MN Euro may not be the way to go. Depends how M269 looked during the Eneolithic. If it was like Khvalynsk, then there is no MN, but 10-15% input from an Armenia EBA/IranChL pop.

  29. @Samuel Andrews

    Do you have a reference for that? I would love to read it. I had thought that such an early presence of R1b in western Europe was fairly unheard of?

    Thank you

  30. @Rob

    V88 could have been brought to Blätterhöhle by the farmers part of them.

  31. @Chad

    MN Euro may not be the way to go. Depends how M269 looked during the Eneolithic. If it was like Khvalynsk, then there is no MN, but 10-15% input from an Armenia EBA/IranChL pop.

    Yamnaya still shows a strong signal from the west when it's modeled as part Khvalynsk, and no Armenia_EBA/Iran_ChL.

    http://eurogenes.blogspot.com/2017/08/genetic-and-archaeological-continuity.html


  32. epoch2013 "Your scenario seems too far fetched with such examples in the neighbourhood."

    It means nothing, as we know neither the genetics of the Dutch cultures nor their relations with Blatterhole. What is important is that in the Neolithic R1b (V88) was present both at Blatterhole and on the Dnieper, and that the Dnieper population received more WHG compared with the Mesolithic.


    @Davidski -
    What is Körös_HG?

  33. @supernord

    What is Körös_HG?

    It's the Koros_HG sample from Lipson et al., plus Hungary_HG I1507, which also comes from a Koros Neolithic site, but is overwhelmingly WHG.

    https://reich.hms.harvard.edu/datasets

  34. Epoch
    Sometimes Supernord does make sense
    Although his scenario misses the point I made earlier: the Iron Gates expansion to Ukraine and Central Europe

  35. @Davidski
    This is the Koros_EN in Lipson et al. In Lipson et al. there is no Koros_HG.

  36. @supernord

    In Lipson et al. there is no Koros_HG.

    In the Lipson et al. dataset ind file there is.

    I4971 M Koros_EN_HG

    https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/LipsonEtAl2017.tar.gz

  37. @Chad

    By the way, using this latest pright pops set up that I have, I can model Yamnaya as a mixture of CHG, Blatterhole_MN and Samara_Eneolithic, without any additional EHG.

    https://drive.google.com/file/d/1Q-Jcv4AFIzpZr3iD9KINW330emfUjOSe/view?usp=sharing

    But it's true that Samara_Eneolithic doesn't take any Blatterhole_MN.

    https://drive.google.com/file/d/1rpffJguM9jI9Gl16_ja4-02JI_nr15pp/view?usp=sharing

    So this makes me think that the archaeologically based idea that Yamnaya came about from a fusion of the westerly Sredny Stog and easterly Khvalynsk might be spot on.

  38. That Koros individual I4971 is from Neolithic context c. 5300 BC, but seems to be yet another 'assimilated hunter-gatherer'. Reported lineages are I2a2 (Y) and K1 (mtDNA), although he can be further determined to be I2a2b.
    Just more confirmation that the area around the eastern Carpathian basin and adjacent mountains was a hunter-gatherer 'refugium' for I2a2 and probably certain types of R1b.

  39. @Davidski
    “So this makes me think that the archaeologically based idea that Yamnaya came about from a fusion of the westerly Sredny Stog and easterly Khvalynsk might be spot on.”

    Maybe it wasn’t the Sredny Stog but Corded Ware which influenced the Yamnaya:

    https://www.researchgate.net/publication/292423589_The_Yamna_culture_and_the_Indo-European_homeland_problem

    From Allentoft “Population genomics of Bronze Age Eurasia”

    https://s10.postimg.org/nv0yovixl/screenshot_324.png

  40. David,

    Isn't that just one way to make an Armenia EBA? You're relying on a pure CHG pop to exist in the Caucasus. That, in all likelihood, has no chance of being true. Why say Hungary_N + Kotias? Just have it be a pop between Armenia_EBA and Iran_ChL.

    The Caucasus pops with good Anatolian admixture make the best proxy for an admixing source, going from Khvalynsk > Yamnaya

    Source1     Source2          Target       f3         std.err   Z
    Steppe_ChL  WHG              Steppe_EMBA   0.005605  0.001315   4.261
    Steppe_ChL  Blatterhole_MN   Steppe_EMBA   0.002843  0.001059   2.685
    Steppe_ChL  Vinca            Steppe_EMBA  -0.001168  0.001108  -1.054
    Steppe_ChL  Anatolia_N       Steppe_EMBA  -0.002015  0.000712  -2.829
    Steppe_ChL  TDLN             Steppe_EMBA  -0.001191  0.000825  -1.443
    Steppe_ChL  LBK              Steppe_EMBA  -0.001502  0.000731  -2.056
    Steppe_ChL  Armenia_EBA      Steppe_EMBA  -0.002890  0.000902  -3.203
    Steppe_ChL  Tiszapolgar      Steppe_EMBA  -0.001606  0.001002  -1.603
    Steppe_ChL  Armenia_ChL      Steppe_EMBA   0.000495  0.000823   0.601
    Steppe_ChL  Tepecik_Ciftlik  Steppe_EMBA  -0.003153  0.001059  -2.977
    Steppe_ChL  Iran_ChL         Steppe_EMBA  -0.003186  0.000880  -3.620

  41. @Chad

    My models rely on a largely CHG, but partly EEF and EHG, farmer population existing in the North Caucasus and Don region of the steppe, both of which are yet to be sampled, during the Late Neolithic/Chalcolithic.

    This is very plausible considering the archaeological and skeletal finds from that period in the North Caucasus and nearby parts of the steppe.

    http://eurogenes.blogspot.com/2017/02/three-way.html

  42. Chad
    Doesn’t the high CHG bounceback in Kura-Araxes suggest that there was indeed an almost pure CHG population still living around there?

  43. David,

    I think it is probably something very Armenian and Iranian-like. It won't be a little Caucasus, plus MN Euro. The EEF-like mtDNA need not come from Europe, but the Caucasus.

    Rob,

    It isn't a CHG resurgence. Look at the PCA. They are right under them, towards Iran_ChL. It is a mixed pop, from near the Zagros or Mesopotamia.

    left pops:
    Armenia_EBA
    Armenia_ChL
    Iran_ChL
    CHG

    right pops:
    Mbuti_DG
    Kostenki14
    Ust_Ishim
    Onge
    Karitiana
    Anatolia_N
    WHG
    EHG
    Iran_N
    Levant_N
    MA1


    best coefficients: 0.518 0.541 -0.059
    std. errors: 0.154 0.194 0.111


    fixed pat wt dof chisq tail prob
    000 0 8 3.861 0.869461 0.518 0.541 -0.059 infeasible
    001 1 9 4.188 0.898595 0.517 0.483 0.000

    Extra CHG is a no-go. However, look towards Mesopotamia or the Zagros, and that is probably your answer.

    left pops:
    Armenia_EBA
    Armenia_ChL
    Iran_ChL
    Tepecik_Ciftlik

    right pops:
    Mbuti_DG
    Kostenki14
    Ust_Ishim
    Onge
    Karitiana
    Anatolia_N
    WHG
    EHG
    Iran_N
    Levant_N
    MA1



    best coefficients: 0.253 0.661 0.086

    std. errors: 0.175 0.158 0.081

    fixed pat wt dof chisq tail prob
    000 0 8 2.438 0.964543 0.253 0.661 0.086
    001 1 9 3.573 0.937198 0.323 0.677 -0.000

  44. I can fine-tune that and do a qpGraph if you like.

  45. @Chad

    I think it is probably something very Armenian and Iranian-like.

    Can't be, because I've included all of the available ancient West Asian groups in my models. The fits were below par or even very poor, with the Iran_ChL model coming back as infeasible.

    The EEF-like mtDNA need not come from Europe, but the Caucasus.

    A bigger problem I think is the lack of South Caucasus and South Caspian mtDNA HGs in Yamnaya etc., like U7.

  46. These guys aren't from south of the Caspian. This is likely Mesopotamian and eastern Anatolian.

  47. Chad
    K-A doesn’t appear to have expanded from Iran but into it
    In fact, it might have originated in east Anatolia before finalising in the Kura and Araxes valleys

  48. @Chad

    Mesopotamia is south of the Caspian, and South Caspian-specific mtDNA HGs are common there. They're found in the Zagros too, including in Iran_ChL.

    Anywhere south of the Caucasus is no go for the West Asian-related admix in Yamnaya. Take a look at my models; I've got it all there, and it matches the uniparentals on the ancient steppes.

  49. Mesopotamia isn't south of the Caspian. There's limited data from here and not enough to say it didn't happen. The trail from the Caucasus farmers is from Mesopotamia. There really isn't any way around that. After someone gets a couple hundred samples across this region and the Caucasus, it will be more clear.

  50. @Davidski,
    "A bigger problem I think is the lack of South Caucasus and South Caspian mtDNA HGs in Yamnaya etc., like U7."

    Maybe the Caucasus lacked south Caspian mHGs but carried EEF mtDNA. That would make a Caucasus route for Yamnaya's EEF stuff possible. But I think it makes a lot more sense that Yamnaya's EEF is from Europe, considering southeast Europe right around the corner was 90% Anatolian.

  51. @Chad

    Mesopotamia is next to the Zagros, which is where Iran_ChL is from, and its mtDNA is South Caspian-specific, and very much unlike the mtDNA in Yamnaya.

    So Mesopotamia is a long shot, especially since the North Caucasus is right next to the steppe, and the Carpathian Basin is an extension of the steppe in Europe.

  52. I think Chad is right: there are too few genomes outside Anatolia to really appreciate what’s what
    Anyhow , CHG must be either from south of the Caucasus or south / east of the Caspian. In fact as I’ve previously suggested, it’s probably several geographic and temporal layers

  53. Nonsense.

    The Chalcolithic South Caspian populations that were expanding from Mesopotamia and Zagros made it to Egypt and Anatolia. We know this because it's easily seen in the ancient data without having to resort to any mental gymnastics.

    They didn't make it onto the steppe, because their markers are missing from Bronze Age steppe populations.

    Just follow the data without wishing too hard, and you'll see the reality.

  54. So you’re envisaging a cryptic CHG population living in seclusion somewhere north of the Caucasus ? Which archaeological group would that be ? How’d it get there in the first place ?

    And you’ve been corrected before several times- if you want to understand affairs, we need to look at Late neo and Eneolithic steppe markers, not Yamnaya

  55. @Rob

    So you’re envisaging a cryptic CHG population living in seclusion somewhere north of the Caucasus ? Which archaeological group would that be ? How’d it get there in the first place ?

    They lived in the Northwest Caucasus since the Mesolithic. Duh.

    And you’ve been corrected before several times.

    Give me two examples.

  56. “They lived in the Northwest Caucasus since the Mesolithic. Duh.”

    No they didn’t . The Mesolithics who at one point lived in the NW Caucasus, and were of southern origin (Imeretian-Zarzian epipaleolithic) appears to have gone extinct. There is a 1000 year hiatus between the Mesolithic and the arrival of Mariupol-like groups in the north Caucasus (which would be on the WHG-EHG cline, and I2 /R1).
    Then new southerners arrived with the Meshoko horizon from Georgia, and mixed with Sredny Stog people. Then again new southerners arrived with Majkop contacts (northern Ubaid and Halaf, therefore not really “Mesopotamian”).

    We also have Kelteminar contacts impacting Elshanka-Volga groups

  57. @Rob

    No they didn’t.

    It looks like they did, or somewhere nearby, because the ancient genetic data show very clearly that the southern ancestry in both Khvalynsk and Yamnaya is overwhelmingly CHG.

    Again, I ask you, where are the Mesopotamian/South Caspian/Zagros markers in Khvalynsk, Yamnaya, Catacomb, etc.?

    Can you explain this discrepancy?

    The Mesolithics who at one point lived in the NW Caucasus, and were of southern origin (Imeretian-Zarzian epipaleolithic) appears to have gone extinct.

    The operative word here is "appears".

    It also appeared to many until recently, including to you, that Corded Ware and R1a-M417 weren't from the steppe.

  58. Later today, or tomorrow, Rob, I'll tell you a little tale about a fortress in the Northwest Caucasus.

  59. @Salden

    I'm not advocating anything about ancient Egypt here; just pointing out patterns in the data that have been discussed in scientific literature and on this blog already (ie. the spread of South Caspian/Mesopotamian/Iran_ChL ancestry to ancient Egypt).

  60. The way I see it:

    The best model posted is the one with CHG + EHG + Koros_EN + Koros_HG, with proportions of 44% + 44% + 6% + 6%.

    Ukraine Neolithic was SHG-like, and only received EHG during the Eneolithic with an already admixed Yamnaya-like population.

    The North Caucasus (pre-Maykop) had clear links to the NW Black Sea region*, and I'd guess that those Balkans outliers can only realistically come from the North Caucasus (and they plot near modern Europeans).

    The South Caucasus (home of CHG) had too much AN ancestry from early on (Armenia_Chl and Armenia_EBA). Everything around the Black Sea was too "western" by the LN/Chl to be the origin of a Yamnaya-like population.

    So where does this leave us? We need a population as "eastern" as Samara_Eneolithic (Khvalynsk), but more southern. This would make the lower Volga-Ural region a good candidate. Either that, or Central Asia.

    The first Yamnaya-like (but not exactly) sample we have is the Samara Eneolithic sample (probably not from Khvalynsk tribe) belonging to HG Q, an eastern marker.

    The earliest Yamnaya proper (genetically) samples we have are from Samara and the Altai (Afanasievo), ca. 3000 BC.

    Apparently (unpublished data), the best match so far for the mtDNA of Yamnaya comes from Ulug Depe**.

    * https://www.academia.edu/2543641/The_chronology_of_the_Maikop_culture_in_the_Northern_Caucasus_changing_perspectives

    ** http://eurogenes.blogspot.com/2017/06/the-pigtailed-figures.html?showComment=1496922199462#c7884162881824309440

  61. @Alberto "Ukraine Neolithic was SHG-like"
    Wrong. They were not SHG-like; such a term is not applicable to describe the mixture of EHG and WHG. They had an increase in the amount of WHG compared to the Mesolithic.


    "and only received EHG during the Eneolithic with an already admixed Yamnaya-like population."

    EHG in Ukraine is present in large quantities in the Mesolithic.

  62. @Rob, re: the discussion in this thread between you and Chad, as you know from past discussion, I was pretty interested in the idea of Iran_Chl admixture into the Armenia_EBA population (simply to tie in with general change in the Near East ME region as a single phenomenon).

    But, arguing against it, would say I notice in the new West Eurasian Ancient 67 panel, in the higher dimensions than 1 and 2:

    - Dimension 3 adds further distinction between CHG and Iran_N / Natufian / Levant_N / Anatolian_N and shows some bending of particularly present day South Caucasian (Georgian / Abkhazian) towards Satsurblia and Kotias. North Caucasus also shows some of the same phenomenon (but seems reduced due to greater EHG affinity).

    Armenian and Turkish populations who overlap in Dimension 1 and 2 with South Caucasus listed above do not overlap in this dimension and are more removed to the other, non-CHG end (which makes sense given their languages are from the north / east one way or another and relatively lower isolation means drawing in more ancestry from both Levant and Iran).

    Steppe_EMBA samples also seem outbent towards CHG on this dimension (The West Eurasian type Karasuk and Mezhovskaya samples in the run don't seem to be).

    - Dimension 4, where NW_Anatolia_N ancestry is distinguished from other Western farmer and Neolithic streams of ancestry (and where Western Europeans and Eastern Europeans are slightly more distinct from each other than PC2), there also seems to be a slight outbending of North Caucasus and Steppe_EMBA samples towards the "other Western farmer" end (where Levantine and Bedouin samples are at the other pole).

    ...

    Overall, I guess I'm in agreement that the sampling of eastern Anatolia and early South Caucasus / Armenian plateau is too weak to reject ancestry from there in Yamnaya (the Armenia Chalc and EBA may just be rejected because they are overly complex mixes that don't have enough freedom to be fit).

    But it also seems like there has to be at least some continuity of CHG in South Caucasus, and I don't totally see any compelling reason in modern DNA as represented there that a "CHG stronghold" should have been in North rather than South Caucasus (working in complete ignorance of archaeology).

    Replies
    1. Armenia EBA and Iran ChL are the best admixing sources to go from Khvalynsk > Yamnaya by f3. Nothing weak or rejected about it. Northern Mesopotamians are the ancestors of Caucasus farmers. Once we get the kind of coverage across West Asia that we have in Europe, it'll be pretty clear.

  63. Well, they're certainly rejected in Davidski's qpAdm models, whatever authoritative assertions we wish to make.

  64. I haven't looked at his, but I think that has to do with the set-up. I have no issues using them and get fine scores. No rejections.

  65. I think that could be a different pright; he seems to be using a greater number of European Upper Paleolithic+Mesolithic samples plus Natufian, ancient Ethiopia, Tianyuan. No recent Mbuti, Onge, Karitiana. (Always appreciate your comments, but would we get more out of this by comparing the methodology?)

  66. Tianyuan and Mota really don't matter here. It's having MA1, EHG, WHG, Iran, Levant, Anatolia, that will flesh out what we're looking for. I only use transversion sites too.

  67. Matt
    Yes I agree
    Georgia was the stronghold of CHG. Glacial forests there survived the longest, and its local Neolithic is different to the Shulaveri group (which came from Halaf?)

  68. @Chad

    Seems to me like your set up doesn't have enough power for this. You need to use the latest methods as described in literature.

    Also, no idea why you say that Tianyuan doesn't matter, considering the variable East Eurasian-related input into ancient West Eurasians? Tianyuan, as the only ancient East Asian we have, is crucial.

    And of course there's the issue of the lack of Mesopotamian/South Caspian diagnostic markers on the ancient steppe.

  69. Not enough power? I'm using the same stuff as you, so I don't follow here. Tianyuan is an early East Asian and not relevant. Karitiana will likely be closer to the ENA side of ANE anyway. Groups closer in time matter more here. Try a run with no farmers or Mesolithic samples to see. Tianyuan changes nothing in the outcome, really. I've checked.

    Also, there is only an H3a from Halaf and no Y-DNA, so let's not get ahead of ourselves and say it's not in the later steppes. Just wait for the samples to come.

  70. @Dave, Chad

    In qpAdm, in terms of modelling a European BA population, say Rathlin, on earlier BA populations in the pleft like BB, would including Yamnaya in the pright (along with the obvious necessary ancients) be recommendable in terms of fleshing out things phylogeny-wise? Or would that be going too deep/risking too much shared drift between left and right pops?

    Or including EEFs as well as Anatolia_Neolithic in the pright?

  71. @Chad

    "Not enough power? I'm using the same stuff as you, so I don't follow here."

    You're not, otherwise you'd be getting the same results as me.

    "Also, there is only an H3a from Halaf and no Y-DNA, so let's not get ahead of ourselves and say it's not in the later steppes. Just wait for the samples to come."

    There's not much difference between the mtDNA south of the Caucasus today and in the Neolithic/Chalcolithic.

    Ancient steppe mtDNA doesn't derive from this biogeographic zone.

    Replies
    1. There's no difference in our samples or SNPs. What is in your pright?

  72. @Basil S

    "In qpAdm, in terms of modelling a European BA population, say Rathlin, on earlier BA populations in the pleft like BB, would including Yamnaya in the pright (along with the obvious necessary ancients) be recommendable in terms of fleshing out things phylogeny-wise? Or would that be going too deep/risking too much shared drift between left and right pops?

    Or including EEFs as well as Anatolia_Neolithic in the pright?"

    Judging by the latest literature on qpAdm, that would be fine.

  73. @Chad

    "There's no difference in our samples or SNPs. What is in your pright?"

    All of the full qpAdm output is linked to above.

  74. @Seinundzeit,

    Somewhat belated, but here it is. I am going to preface this by highlighting three samples of interest: Eneolithic Samara 434, Yamnaya Samara I0357, and Yamnaya Samara I0441 (somewhat less interesting). If you recall from one of Dave's posts, 434 is the Hg Q suspected Kelteminar migrant (or admixed individual).

    So, the other day, I was modeling a set of modern Indo-Iranian speaking populations on Monte, namely: UP Brahmins, Gujarati A, Brahmin, Kshatriya, Afghan Pashtuns, Pamiris & Yaghnobis, Dardics, Persians, Ir. Zoroastrians, Lors and Mazandaranis. My input contained a consistent panel of samples: Yamnaya (all), Sintashta, Andronovo, Iran_N, Iran_chl, a couple of ENA "adjusters", and ASI-rich (Pulliyars & Paniyars).

    At a first go, virtually all the Indians just took I0357 and the Eneolithic Samara sample (434) for their steppe and no Sintashta/Andronovo. Kshatriya took 434, primarily. Kalash took mainly the Samara sample, but also I0441. Pashtuns just took I0357, as did Burusho. Iranians from Iran just took Andronovo and Sintashta, though.

    Midway through this, I changed my trajectory and decided to do a preliminary assessment of how Iranian Persians relate to Mesopotamians and Arabs, choosing Iraqi_Jews, Arab_Israel1, Saudis, and Leb Muslims.

    Guess what? I got back on track as two interesting things happened. The model SNATCHED Iraqi Jews AND opted for I0357 instead of Andronovo/Sintashta. The fit improved from 0.4% to 0.03%. Granted, it was an overfit, as it opted to use >10 samples all of a sudden. After a bit of trimming, I was still left with 0.05% or so. One of my best fits (or overfits).
    Arab_Israel1 was used, but it's unlikely to reflect Arab-related ancestry since Zoroastrians also took it. Same story with them: fit improved, snatched Iraqi_Jews and opted for I0357. Mazandaranis and Lors "fell into line" in the same fashion.
    ______________

    I've played around with 434 in the past using a panel of various HGs. Unlike the other 2 Eneolithic Samara samples, it actually preferred a good chunk of Iran_N (in addition to CHGs) and the fit was still worse than for the other 2. IIRC, it also took extra ANE.

    Spurred on by my Indo-Iranian results, I chose to examine all the Yamnaya_Samara samples, using a panel of HGs AND Eneolithic 434. Guess what? They mainly opted for CHGs and Euro HGs, while I0357 took a considerable slice of 434 and Iran_N. I0441 grabbed these, too, but to a lesser extent.
    __________________________________________
    This leads me to suspect two things:

    1. The Samara Bend area may have experienced some exotic steppe-like influence (if projected on 2D PCA, that is) from further East or SE. Possibly some Iran_N+EHG+extra ANE population. I0357 and 434 could be hints of this. Haplogroup-wise? Maybe Q+J2b.

    2. If the first Indo-Iranians were, in fact, R1a-Z93 Corded Ware folk carrying some EEF, they hybridized with some of these possible Central Asian (?) populations en route to BMAC and/or India. In other words, Monte's choice of I0357 (and 434 in some) may reflect an imperfect composite tapping into both CWC and a mystery ancestral stream.

    Can you look into these 3 samples (or at least I0357) using formal methods somehow?

  75. @Davidski

    I am interested in how you selected your Right versus left populations in this analysis.

    Your right populations have groups that could, theoretically, have mixed with your left populations.

    Looking at the documentation for the qpAdm program it states:

    "Caveat...
    1) It is important to realize that the answers are invalid if there has
    been post admixture gene-flow between left and right populations."

    How sure are you that this analysis is using the appropriate right populations? How do you determine that for this work?

    Sincerely,
    Lee

  76. @Lee Albee

    The qpAdm documentation is no longer current in these aspects. You need to refer to the latest literature to see how its use has developed. For instance, page 26 here...

    https://media.nature.com/original/nature-assets/nature/journal/v548/n7666/extref/nature23310-s1.pdf

  77. @Davidski

    Thanks.

    So you're basing the suitability of your groups on the statistics, then? Tail probability and error level of the jackknife mean?

    Lee

  78. @Lee Albee

    No, I'm just packing the right pops with as many genetically diverse ancient populations and individuals as I can, while at the same time ensuring that my analyses are each based on at least 100K SNPs, so that I have as much discriminatory power as possible.

    And then I look at the output, mainly the taildiff, to see how the models perform. If I'm seeing clear patterns that make sense in terms of biogeographical affinities, with, say, most groups being clearly discriminated against relative to a few that are obviously working well, then I'm happy.
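As a rough illustration of the screening described above (a sketch only, not the actual workflow behind these runs): keep the models that meet the 100K SNP minimum, then sort them by taildiff, best to worst. The model names and taildiff values are the first three listed in the post; the `snps` counts and the `rank_models` helper are hypothetical placeholders.

```python
MIN_SNPS = 100_000  # minimum SNP count per analysis, as stated in the comment

# taildiff values are taken from the post; the snps values are hypothetical
# placeholders, NOT numbers from the linked qpAdm output.
models = [
    {"sources": ("CHG", "EHG", "Germany_MN"), "taildiff": 0.321017025, "snps": 200_000},
    {"sources": ("CHG", "EHG", "Blatterhole_MN"), "taildiff": 0.465394061, "snps": 200_000},
    {"sources": ("CHG", "EHG", "Koros_HG"), "taildiff": 0.322245651, "snps": 200_000},
]


def rank_models(candidates, min_snps=MIN_SNPS):
    """Drop models below the SNP minimum, then sort by taildiff, best first."""
    usable = [m for m in candidates if m["snps"] >= min_snps]
    return sorted(usable, key=lambda m: m["taildiff"], reverse=True)


if __name__ == "__main__":
    for model in rank_models(models):
        print(" + ".join(model["sources"]), model["taildiff"])
```

The ranking only orders the models; judging whether the pattern makes biogeographical sense is still a manual step.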

    By the way, you can e-mail Nick and Iosif about this sort of stuff, especially the more technical aspects. If your questions are legit they'll reply. And if you do find out anything new and useful, feel free to share it here.

  79. @Lee, I believe in the latest papers and generally (Lazaridis 2017 supplement is good to look at) there is a lot of consideration given to using qpWave to prune down to the minimal number of necessary populations in the pright (e.g. if you have all of La Brana, El Miron, GoyetQ-116-1, Villabruna in the pright, and the qpWave rank is only 2, then there is some way in which you can simplify down to needing only 2).

    That said, even in that paper they abandon this for "All" sets including various outgroups, on the basis that ("Adding these later populations has one disadvantage: if populations A and B are both included in the larger set and are composed of the same ancestral elements in similar proportions then A may be modeled as deriving most of its ancestry from B and vice versa. This does not clarify the ancestral origins of either population. However, this approach also has the advantage of identifying mixture when the admixing populations are themselves complex. For example, if a population A is a mix of B and C, and B and C are themselves 2- or 3-way mixtures, then this approach might identify a simpler mix in the origin of A than would be possible if B and C were not considered as source populations").

    We should probably not think of qpAdm as actually less problematic than ADMIXTURE or PCA in regard to the problem of the pright being arbitrary. Using qpWave there is some degree of testing for redundancy, and whether the pright are even distinguished by multiple streams of ancestry, but it ultimately seems like a somewhat arbitrary choice of selecting populations which are believed to be able to distinguish the pleft in formal stats.

    The other new approach in the Lazaridis paper (which is not yet part of ADMIXTOOLS, I think) is the simulation approach - directly simulating mixes of n populations, then running f4 of the form (real,simulated;X;Y) for various X;Y in a pright. The advantage of that is that the results are directly understandable in terms of comparison to (real,real;X;Y) and the f4 Z test for significance.

    As well, with the simulation approach, you can run f4(simulated1,simulated2;real;outgroup), so that in the event that two simulations get all the outgroup relationships right, but actually either or both is not very close to the real population, then you could detect that (qpAdm can't really do this at all).

    But still this does not get you away from arbitrary elements in the pright choice.
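To make the simulation idea concrete, here is a toy sketch (an assumption-laden illustration, not the procedure from the Lazaridis paper or anything in ADMIXTOOLS). It builds a simulated mix as a weighted average of source allele frequencies, computes f4(real,simulated;X,Y) as the mean of (real - simulated) * (X - Y) across SNPs, and gets a Z score from a simplified delete-one block jackknife over contiguous SNP blocks. All frequencies are randomly generated, and the helper names (`simulate_mix`, `f4`, `f4_z`) are hypothetical.

```python
import math
import random


def simulate_mix(sources, weights):
    """Per-SNP allele frequencies of a mix: weighted average of the sources."""
    n_snps = len(sources[0])
    return [sum(w * s[i] for w, s in zip(weights, sources)) for i in range(n_snps)]


def f4_products(a, b, c, d):
    """Per-SNP terms (a - b) * (c - d) of the f4 statistic."""
    return [(ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)]


def f4(a, b, c, d):
    """f4(A,B;C,D) estimated as the mean per-SNP product."""
    terms = f4_products(a, b, c, d)
    return sum(terms) / len(terms)


def f4_z(a, b, c, d, n_blocks=20):
    """Z score from a delete-one block jackknife over contiguous SNP blocks.

    Simplification: blocks are equal-sized index ranges, not genomic windows.
    """
    terms = f4_products(a, b, c, d)
    n, total = len(terms), sum(terms)
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    # Estimate with each block left out in turn.
    loo = [(total - sum(terms[lo:hi])) / (n - (hi - lo))
           for lo, hi in zip(bounds, bounds[1:])]
    mean_loo = sum(loo) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((t - mean_loo) ** 2 for t in loo)
    se = math.sqrt(var)
    return (total / n) / se if se > 0 else 0.0


# Toy data: two sources, and two right populations that share drift with them.
rng = random.Random(0)
n_snps = 2000
src_b = [rng.random() for _ in range(n_snps)]
src_c = [rng.random() for _ in range(n_snps)]


def noisy(freqs):
    """Noisy copy of a frequency vector, clipped to [0, 1]."""
    return [min(1.0, max(0.0, f + rng.gauss(0, 0.05))) for f in freqs]


right_x = noisy(src_b)  # X is related to source B
right_y = noisy(src_c)  # Y is related to source C

real = simulate_mix([src_b, src_c], [0.7, 0.3])      # stand-in for the real population
good_sim = simulate_mix([src_b, src_c], [0.7, 0.3])  # correct proportions
bad_sim = simulate_mix([src_b, src_c], [0.2, 0.8])   # wrong proportions

if __name__ == "__main__":
    print("good:", f4(real, good_sim, right_x, right_y))
    print("bad: ", f4(real, bad_sim, right_x, right_y),
          "Z =", f4_z(real, bad_sim, right_x, right_y))
```

With the right proportions the statistic sits at zero, while the wrong proportions give a clearly non-zero f4 with a large Z. The choice of X and Y still decides what the test can see, which is the arbitrariness point above.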

