Saturday, July 29, 2017

New resource: 67 diploid ancient genomes


Published this week along with Martiniano et al. 2017, a dataset of 67 new and publicly available genomes, genotyped and imputed for 30 million markers:

Data from: The population genomics of archaeological transition in west Iberia: investigation of ancient substructure using imputation and haplotype-based methods

Martiniano R, Cassidy LM, Ó'Maoldúin R, McLaughlin R, Silva NM, Manco L, Fidalgo D, Pereira T, Coelho MJ, Serra M, Burger J, Parreira R, Moran E, Valera AC, Porfirio E, Boaventura R, Silva AM, Bradley DG

Date Published: July 28, 2017

DOI: http://dx.doi.org/10.5061/dryad.g9f5r

Keep in mind, however, that this dataset is specifically designed for haplotype-based tests, like those done with ChromoPainter (for more details, see S5 Text in Martiniano et al. 2017). As far as I know, it should also perform well in ADMIXTURE runs.

On the other hand, the diploid and imputed genotype calls are likely to slightly skew results in formal statistics and formal statistics-based modeling analyses. So it's best to use pseudo-haploid genomes for such tests, and/or high coverage diploid genomes if available, with 100% observed calls.

I'm about to run a quick and dirty haplotype/Principal Component analysis with this dataset using BEAGLE, mainly to check whether South Asians show greater recent genetic affinity to Afanasievo/Yamnaya over Andronovo/Sintashta (for more on this controversy, see here). It's a pity that this dataset doesn't include any genomes from Neolithic Iran, because then I'd also be able to try haplotype-based mixture models for South Asians.

By the way, I won't be using all of the 30 million markers. I've only kept the SNPs that overlap with the Harvard Medical School's 1240K SNP ancient capture array, which should mean that only a small minority of the calls in my analysis won't be real.
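For anyone who wants to try a similar overlap filter, here is a minimal sketch in Python. It is not the exact procedure used for this post. It assumes the 1240K site list is available as an EIGENSTRAT-style .snp file (columns: ID, chromosome, genetic position, physical position, alleles), that the imputed dataset has been converted to PLINK format with a .bim file (columns: chromosome, ID, cM, base-pair position, alleles), and that both use the same genome build and chromosome coding. The function names are hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple

Site = Tuple[str, int]  # (chromosome, base-pair position)


def sites_from_eigenstrat_snp(lines: Iterable[str]) -> Set[Site]:
    """Collect (chromosome, position) pairs from EIGENSTRAT .snp lines.

    Assumed column order: id, chromosome, genetic pos, physical pos, ref, alt.
    """
    sites: Set[Site] = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        sites.add((fields[1], int(fields[3])))
    return sites


def filter_bim_lines(bim_lines: Iterable[str], keep: Set[Site]) -> Iterator[str]:
    """Yield the marker IDs of PLINK .bim lines whose site is in `keep`.

    Assumed column order: chromosome, id, cM, base-pair pos, allele1, allele2.
    """
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if (fields[0], int(fields[3])) in keep:
            yield fields[1]


if __name__ == "__main__":
    # Toy data only; real input would be read from the actual files.
    panel = sites_from_eigenstrat_snp([
        "rs1 1 0.0 1000 A G",
        "rs2 2 0.0 2000 C T",
    ])
    imputed = [
        "1 snpA 0 1000 A G",
        "1 snpB 0 1500 C T",
        "2 snpC 0 2000 C T",
    ]
    for marker_id in filter_bim_lines(imputed, panel):
        print(marker_id)
```

The resulting list of marker IDs can be written to a text file and passed to PLINK's --extract option. Matching on chromosome and position rather than on marker ID is a design choice here, since imputed markers and array markers don't always share names.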

Update 02/08/2017: The BEAGLE run is complete and the analysis is unfolding. See post and comments here.

28 comments:

  1. "I'm about to run a quick and dirty haplotype/Principal Component analysis with this dataset using BEAGLE, mainly to check whether South Asians show greater recent genetic affinity to Afanasievo/Yamnaya over Andronovo/Sintashta..."

    That'll be exceedingly interesting.

    Very much looking forward to seeing what you find.

    "It's a pity that this dataset doesn't include any genomes from Neolithic Iran, because then I'd also be able to try haplotype-based mixture models for South Asians."

    So, no way to eventually integrate WC1 into this data-set?

    Rather unfortunate; it would have been pretty awesome/informative to see haplotype-based mixture models for Central/Southern Asians.

  2. So, no way to eventually integrate WC1 into this data-set?

    No way to do it reliably by anyone but the authors. We have to wait for the dataset to be updated by them.

  3. Ah, that makes sense.

    In that case, I really hope that the authors can eventually fill this lacuna (the Iran_N samples are essential for analyzing contemporary West, Central, and South Asian populations).

  4. David, when do you think the Mathieson 2017 and Olalde 2017 genomes will be released?

  5. Davidski: mainly to check whether South Asians show greater recent genetic affinity to Afanasievo/Yamnaya over Andronovo/Sintashta

    Not sure if you'll get a lot, as Afanasievo/Yamnaya and Andronovo/Sintashta haplotype donation seem *really* correlated in this set, e.g.: http://imgur.com/a/cFtje and http://imgur.com/a/cFtje

    Excesses of Yamnaya donation relative to KK1 donation seem to characterize North Europe and more so Lezgins though: http://i.imgur.com/8sNL0yc.png.

    (Which is interesting - Lezgins possibly because this is a population that traces descent from one which *directly* contributed to Yamnaya? Or, alternatively, directly received Yamnaya genotypes? In the first of these two possibilities, possibly another reason for skepticism that the contribution to Yamnaya was anything other than from a specific North Caucasus early Neolithic population.)

    So could check how South Asia fits in that pattern?

  6. Sein: In that case, I really hope that the authors can eventually fill this lacuna (the Iran_N samples are essential for analyzing contemporary West, Central, and South Asian populations).

    Seems like there's no reason why they couldn't do it soon, as the coverage is good for the early Iran_N (apparently unlike the public EHG samples, which are not good enough for the imputation thresholds per their paper), but I guess they wouldn't in this case, as it's peripheral to what they investigate here and perhaps they wouldn't see value in doing so.

    If any of the Mathieson 2017 and Jones 2017 genomes are good enough, I imagine they'll also tell us much more about Eastern Europe and the emergence of Yamnaya. In time...

  7. @Matt

    Not sure if you'll get a lot, as Afanasievo/Yamnaya and Andronovo/Sintashta haplotype donation seem *really* correlated in this set, e.g.: http://imgur.com/a/cFtje and http://imgur.com/a/cFtje

    Challenge accepted.

  8. Good luck. Iron Age Russia samples also look to have an excess of donation to Chuvash, Iranian, Turkish and marginally Finnish, Adygei, Russian and Lezgin here. May / may not show up with Pakistan / Afghanistan / "Gedrosia" samples.

    Also more OT, but cool: a few of the samples in their Table S9 seem to show anomalous donation from ancients relative to their population: 1) Armenian_arm21 in particular, Armenian_arm7, Armenian_arm14 and Georgian_mg70 look like they have haplotypes intermediate between Russian and the main set of their population, 2) Romanian_Romania7 and Bulgarian_Bulgarian13 look closer to Turkish than most samples from the same population, and 3) Chuvash_chuvash3 looks closer to Russian and distant from other Chuvash: http://imgur.com/a/4lZWD

  9. @Matt
    There are plenty of Georgians and Armenians with recent Russian/East Slavic ancestry

  10. @Matt, George

    Checking LBK relative to Russian_HGDP average on Georgians makes Georgian_mg70 stand out, so this sample probably has recent Russian ancestry. Using non-Balto-Slavic speaking Northern or Eastern Europeans instead of Russians doesn't seem to produce the same effect.

  11. @Davidski and all

    This is the best scenario I am able to suggest, opinions are welcome.

    Yamnaya migration to western Altai, beginning around 3000 BC (where Afanasievo culture developed) before reaching that region crossed Kazakhstan (Anthony and Brown 2017:41), and maybe, from there, part of the migrants separated and moved southwards into the Indus Valley, becoming the first Indo-Europeans to reach South Asia around 2500 BC, at the beginning of the Mature Harappan period. They arrived as sheep-goat-cattle pastoralists with domesticated horses and copper metallurgy, but were absorbed by Harappan culture and almost immediately lost their horses and chariots, assimilating Harappan culture but mixing their Yamnaya ancestry with local people. This ancestry can be mirrored in Poznik et al. 2016, "Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences", in Nature Genetics, where Poznik et al. comment: "The most striking were expansions within R1a-Z93, occurring 4.0-4.5 kya. This time predates by a few centuries the collapse of the Indus Valley Civilization..."

    The subsequent wave of migration was that of Asko Parpola's hypothesis regarding Atharvavedic (or Vratya) people reaching South Asia around 2000-1900 BC. And the third one is that of Vedic Indo-Aryans around or after 1500 BC.

  12. @Carlos,

    What we need is Indus Valley, Hittite, Greek, and Tocharian DNA. I think Indo-Aryans were like Andronovo; I can't imagine proto-Scythians and Indo-Aryans were very different.

  13. @ Samuel

    Could that Steppe admixture of 950 BC in the Canaanite paper be Hittite? Interesting that it is derived from an Early Middle Bronze Age Steppe culture...

  14. @Carlos Aramayo and all

    Historians have been proposing that for a while now. We need to consider at least three waves of Indo-European migration from steppes into South Asia.

    Wave I - Folk migration.
    Wave II - Invasion by chariot riding, Soma drinking, Rigvedic tribes.
    Wave III - Hunas, Scythians and Kushans, who ruled over various parts of South Asia from the 3rd century B.C. to the 6th century A.D.

    "First wave of Indo-Aryans were engulfed by the later soma-pressing Indo-Aryan Andronov tribes that eventually became the composers of the Rigveda wherein they refer to themselves as Aryas."

    Silva et al. also mention: "aDNA will also be needed to test the hypothesis that there were several streams of Indo-Aryan immigration (each with a different pantheon), for example with the earliest arriving ~3.4 ka and those following the Rigveda several centuries later."

  15. @ Carlos

    "Yamnaya migration to western Altai, beginning around 3000 BC (where Afanasievo culture developed) before reaching that region crossed Kazakhstan (Anthony and Brown 2017:41)"

    Not feasible.
    Afanasievo dates to c. 3300 BC, thus as early if not earlier than Yamnaya.

  16. @Rob

    Afanasievo dates to c. 3300 BC, thus as early if not earlier than Yamnaya.

    No it doesn't.

    - Yamnaya, Early Bronze Age, 3000–2450 cal BC

    - Afanasievo, Early Bronze Age, 2900-2500 cal BC

    http://eurogenes.blogspot.com.au/2015/10/essential-reading-paleoecology.html

  17. @Singh

    "First wave of Indo-Aryans were engulfed by the later soma-pressing Indo-Aryan Andronov tribes that eventually became the composers of the Rigveda wherein they refer to themselves as Aryas."

    I agree with this possible situation.

    @Rob

    "Afanasievo dates to c. 3300 BC, thus as early if not earlier than Yamnaya."

    The Afanasievo period is from 3300 BC to 2500 BC; the migration could have happened within this time, not necessarily at the beginning. But your observation is suggestive: the Anthony and Brown (2017) paper does not mention such an upper limit.

  18. Afanasievo is from the Western Steppe and younger than Yamnaya.

  19. Agree that it's Western but we can't just bend the chronology to suit us

  20. Then stop doing it, because Afanasievo is younger than Yamnaya.

  21. No it's not
    As I said, it's just as early or earlier

  22. @ Davidski

    What about that Middle Bronze Age Steppe ancestry in the Canaanite samples (950 BC) of Sidon? Could they be Hittites?

  23. @Rob

    Duh...

    - Yamnaya, Early Bronze Age, 3000–2450 cal BC

    - Afanasievo, Early Bronze Age, 2900-2500 cal BC

    http://eurogenes.blogspot.com.au/2015/10/essential-reading-paleoecology.html

    @Ric

    No idea.

  24. Duh indeed. "Thus, most of the bone dates reported here, and wood and charcoal dates from the literature, are in agreement in placing the start of the Afanasievo culture around 3000 BC,"

    -Svyatko/ Mallory

  25. Yamnaya is not younger than Afanasievo.

    And it takes about six months to move leisurely by cart from the Western Steppe to the Altai.

  26. @Davidski

    It can be obvious that Yamnaya is at least a little earlier than Afanasievo but, re-checking David Anthony's two recent papers, I see that in 2015 he first mentions Afanasievo's onset around 3300 BC, but in his 2017 co-authored paper he suggests a time "around 3000 BC".

