Friday, February 14, 2014

Mapping the history of human admixture (paper + website)

Update 19/09/2015: Recent admixture in West Eurasia (including Europe)


Hellenthal et al. is a valiant attempt to map out the history of human admixture using modern DNA. Hopefully it's the last.

I don't want to sound too harsh, because it's a fascinating read, and the companion website a lot of fun, but studies like this really need ancient genomes nowadays to look convincing. Let's hope these guys can find the resources to repeat this effort using a wide range of carefully chosen ancient remains.

Indeed, here are a couple of examples of the sort of stuff that makes me skeptical about the accuracy of the methods employed by the authors. First of all, unless I'm reading this map wrong, then what I'm seeing here is a 23% contribution from the Hadza of East Africa to the Lithuanians. Apparently, this supposed admixture event happened during the early middle ages. Is this a typo or what?

Secondly, what's with the difference between the Orcadians and Norwegians? How can the Orcadians be unmixed if a large part of their recent ancestry derives from Norse settlers (which it certainly does)? So, did all of this admixture in Norway take place after the pure Norwegians settled the Orkneys? It's possible, but hardly plausible.


Hellenthal et al., A Genetic Atlas of Human Admixture History, Science 343, 747 (2014), DOI: 10.1126/science.1243518


Ponto said...

I think the study says that the Orcadians are Uncertain and that a larger sample of Orcadians could show admixture. The French and Armenians are also Uncertain and need larger samples. In fact according to what I read of the pdf all Northwest Europeans who are Unadmixed are so because the sample sizes used for those groups only show their more recent admixture which happens to be of a similar nature. A larger sample would allow deeper delving, getting past the similar admixture phase.

I do not like the breakdown of ancestry as it is too precise when admixture is not an organized thing, it just happens. I do not like the dating either. It would be easy to refute, just find remains older than the purported admixture event and test the dna. I would not be surprised that the result is similar to their modern descendants or the people occupying the same geography.

About Time said...

NW Europeans probably mixed earlier than others. Or whoever they mixed with wasn't used in the analysis.

Kalash show European admixture c. 500's bc. That's the major surprise in paper IMO.

The Uyghur have Italian like admixture. Interesting bc cave paintings show redheads in the area. N Chinese have some West Asian it looks like, as do (strangely) Pima Indians from a late date.

Druze are least African of Muslim world, prob bc they banned slavery early in history of Islam. Even Iranians and Pakistanis have some late African %. It means Gedrosia etc have mixed since apparent c. 500 BC mixtures with Europeans.

V Robazza said...

Where did you get your Lithuanian example from?

I get only 51.7% Polish, 38.4% Belorussian and 8.9% Russian.

Davidski said...

Look at the screen cap from the admixture atlas (Lithuanian - Full analysis: second event). It shows Hadza admixture among the Lithuanians, does it not?

About Time said...

Yes I was wondering about the Hadza. Bizarre, but who knows. Or is that a misprint for Hazara?

Davidski said...

If that's not a typo, then there's something wrong with the methods.

But check out the Finnish full analysis second event; it shows a 2.2% contribution from the SanKhomani, also, it seems, dated to the early middle ages.

Even if these signals are very low, you really have to wonder WTF is going on.

About Time said...

@Davidski, well Globetrotter is all linkage based if I understand. So technically what it shows is some segments found both places but better preserved in Hadza (?? Not exactly sure how directionality is inferred by software).

If so, then is not excluding that both Hadza and Lithuanians were mixed with "Population X" that is gone today (or not sampled in analysis) but theoretically went both places.

Ditto for "German/Austrian/Orcadian" mixture in Kalash 700-500 BC. Could be "Population Y" that went both places (much easier to imagine this -- some kind of West Asian or steppe element).

Davidski said...

Yeah, but what's the point of showing this sort of thing in this context, and implying that the admixture event, with the Hadza or whomever, took place during the early middle ages? It just looks illogical and puts in doubt all of the other results.

About Time said...

It's good to show oddball results though IMO. For one, they might turn out to be right, but demand scrutiny of method, data, etc and independent corroboration. Curve balls are good though, it helps us shake up our thinking and approaches.

Part of scientific process of constantly testing working hypotheses with new data and revising when the time comes.

The worst science is when no curveballs are shown, and research community becomes complacent and blind to errors in paradigm that don't match emerging data.

V Robazza said...

yes it does.
seems accurate to me especially since the paperwork states they cannot go back further than 5000 years.

Why do the Eastern European people have these extra options while the Western Europeans do not?

Davidski said...

The Hadza admixture among Lithuanians seems accurate to you? Well, all other analyses I've seen, including those using ancient DNA, show that Lithuanians have the most Mesolithic ancestry in Europe, and the lowest levels of Near Eastern and African ancestry.

In any case, the reason the Eastern Europeans have the extra analyses is because they were found to show signals of admixture. But as per one of the comments above, what we're probably seeing here is the result of earlier admixture events in western Europe, but much later events in Eastern and southern Europe. In other words, it looks like the hunter-gatherer and farmer DNA came together much later in the east of the continent, so the signals of this event can still be picked up by this software, unlike in western Europe. But even so, the dating of the admixture events seems off in many cases, and too late. We can probably push many of the dates back to the Iron or even Bronze age.

About Time said...

I have no idea, but best would be to isolate the segments that program decided are "Hadza" and take a closer look. I'm still thinking it might be a misprint for Hazara.

But don't Europeans get noise level West African in GEDmatch? Who knows, maybe it's picking up something. The Ethiopian/Mideast thing in Pickrell is still up for debate. Wrt Sandawe in Finns, African definitely shows up in Middle East and there were slave traders in Eastern Europe at the time. Vikings were active pretty Far East into Baghdad.

I want to know about Gedrosia in west Europe. Phantom of "population y" or really some kind of gene flow to Kalash?

For dates, I like actual ancient specimens. Until then, it's all educated guesswork and hypothesis building.

Davidski said...

Why would Lithuanians have Hazara admixture? Nothing like that shows up anywhere else.

And that still leaves the supposed San ancestry in the Finns.

About Time said...

I agree, it's bizarre. I'd first guess mislabel, then artifact of model, then artifact of tiny sample, then "population x" (Mideast is my guess in that case), then if all else fails - scrutinize the chromosomal segments in a lot more detail and try to duplicate with other means.

barakobama said...

According to Ahmed (link below) everything great in history was achieved by black (Sub Saharan) people, and every place on the planet was originally inhabited by black (Sub Saharan) people. So actually people in Lithuania, Finland, etc. have evil white man ancestry on top of the original Sub Saharan people of that region.

barakobama said...

Davidski, right now I am making maps (with pie charts and other stuff) from admixtures K12b, K7b, globe13, and maybe I will do others (I have already done EEF-WHG-ANE). My focus is on Europe, the Near East, and North Africa (more so Europe), and I am not sure if I should also do South Asia, because there are over 40 South Asian populations with results in these admixtures (it would take days) and I am not sure if it will significantly help me learn about Europe, the Near East, and North Africa. Do you think understanding results in South Asia will significantly help with learning about Europe, the Near East, and North Africa? Of course eventually my focus will shift more towards South Asia and other people, but not yet.

I am most interested in the whole WHG, EEF, ANE thing. mtDNA and Y DNA from FTDNA and papers(whatever you want call it) are other good sources to try to figure this stuff out. OF course learning about ancient people(through archeology, modern societies, etc.), learning the science behind the DNA, etc. are also good tools.

Davidski said...

South Asia isn't very important for Europe in the context of genetic substructures that exist in Europe, but it's important for learning about ANE dispersals, because ANE probably moved deep into Europe and South Asia at about the same time. So it might be useful to pick a few representative South Asian groups to map the region.

V Robazza said...

only confusing parts to me are -
No Albania?
And what's with the German/Austrian combination? Austrians, who only became known in 998AD and are of Bavarian descent, only became Germans less than 1800 years ago.

barakobama said...

Jaxman (I think jackson_montgomery_devoni on this blog) showed me a part of Der Sarkissian 2014 where it says H2a2 is the closest match to Mesolithic H from Karelia.

Here is the quote from Sarkissian 2014 that he showed me.

"The detection of haplogroup H in the Mesolithic site of aUz (one haplotype) is noteworthy. To date, haplogroup H has either been rare or absent in groups of hunter-gatherers previously described. It has not been found in hunter-gatherer mtDNA datasets of eastern Europe [12] and Scandinavia [13], but has been found in two hunter-gatherers of the Upper Palaeolithic sites of La Pasiega and La Chora in northern Spain [20]. The closest match to the ancient H haplotype in aUzPo belongs to sub-haplogroup H2a2 [59], which is more common in eastern Europe [60] with highest frequencies in the Caucasus. Current ancient data is too scarce to investigate the past phylogeography of haplogroup H in full detail. However, together with U4, U5 haplotypes this H haplotype suggests continuity of some maternal lineages in (North) East Europe since the Mesolithic."

According to this Eurogenes thread.

J was also found. All of the mtDNA from the 7,500 year old hunter gatherers from Karelia is typical of European hunter gatherers (U5a, U4, U2e), except that the H (H2a2?), J, and C1f are surprising.

Uznyi Oleni Ostrov, Russia 7,500BP

mtDNA=9: U=5(U2e=2, U4=2, U5a=1), C1f=3, H(H2a2?)=1

Popova, Russia 7,000BP

mtDNA=2: U4=2

Bolshoy Oleni Ostrov, Russia 3,500BP

mtDNA=23: U=8(U5a=6(U5a1=4), U4a1=2), C=8(C*=6, C5=2), D*=3, Z1a=3, T*=1

windy said...

"Wrt Sandawe in Finns, African definitely shows up in Middle East and there were slave traders in Eastern Europe at the time. Vikings were active pretty Far East into Baghdad."

But not so active in what is now Finland. It's possible some Finns were involved in the Varangian trade, but San from southeast Africa ending up there is still stretching it.

In any case, that "component" in Finns looks more like a random mix of populations (Orcadian, Daur, Oroqen, Norwegian, Surui, Hezhen, Mongolia, Basque, SanKhomani, Xibo...!)

Alexandros said...

I came across this publication yesterday and since then, as the majority of people here, I am scratching my head wondering what to make out of it..

There are many bizarre and completely unexplainable results, but there are a few which are in fact spot on and in some cases help clarify some things. I will use the Greek and Cypriot populations to highlight some of these positives and negatives.

1. For the first time, a paper actually quantifies the Slavic expansion into the Greek gene pool, which according to these authors is huge (30%)! I do not have any difficulty accepting this number, given the very large proportion of North European admixture observed among Greeks in practically all admixture calculators found at GEDmatch.

2. For the first time, a paper confirms, genetically, the ancient Greek colonization of Cyprus. In fact, the currently available admixture calculators (including the default 23andme calculator, as well as the Eurogenes k36 calculator) assign an unexplained 25% Italian component to Cypriots. It now becomes clear that this component is in fact Greek and given that it is in close proximity to the East Sicilian component, it suggests to me that it very likely represents an ancient Greek admixture event. This is consistent with both the historical and archaeological evidence, as up to now the Greek colonization of Cyprus seemed like a fairy tale, as far as genetic studies were concerned. If such an event did not occur, we would all have a hard time explaining why Cypriots speak Greek today and not Italian..

1. The 'Mediterranean analysis' for the Greek population (see study website), shows a large admixture event from 'Turkey' into Greece. Of course this is hugely misleading as admixture analysis from all available calculators shows almost 0 admixture of Turkic people into Greece. What this event represents is probably an influx of pre-Ottoman Anatolians into Greece probably during the Byzantine Empire.

2. How come Basque is included under 'Polish-like'?! (see admixture of Greeks in website)

3. How come Iranian is included under 'EastSicilian-like'?! (see admixture of Cypriots in website)

4. How come Armenian is included under both 'Egyptian-like' and 'EastSicilian-like'?! (see admixture of Cypriots in website)

Davidski said...


Those Polish-like, Egyptian-like etc. labels are general descriptions of the various components involved in the admixture events. Basques are probably included under the Polish-like label because Poles have some western European ancestry (in fact, Poles are described as French-like mixed with Lithuanian-like).

That might be fairly confusing, but not as problematic as some of the strange admixture results, late estimated timeframes of admixture, and the suspect theories, some of which are now all over the press, like the one about Alexander's army leaving an impact on the genetic substructure of the Kalash.

Grey said...


"In other words, it looks like the hunter-gatherer and farmer DNA came together much later in the east of the continent...But even so, the dating of the admixture events seems off in many cases, and too late."

The increase in frequency of mtdna H and decrease of mtdna U could be due to selection in place of the surviving farmer dna after the LBK collapse through the female line because of the advantages of genes like SLC24A5.

If so the "admixture" in eastern europe might be measuring an event that wasn't an event but a gradual process that continued over centuries.

Matt said...

I can understand that to approximate haplotype blocks in Lithuanians, from a range of modern populations, then yes, it could make sense to fit them as a mix of about 99% blocks taken from various other East European groups plus around 1% blocks taken from various Native American and Siberian populations.

And if you wanted to approximate haplotype blocks in Poles, using a range of modern populations, then yes, it would make sense to take blocks from a bunch of mainly more Northeastern European populations, mixed with a minority of blocks from diverse, mainly West European samples (but also East Med and North African).

(And under some sampling strategies, it may even make sense to take some Hadza blocks to approximate a Northern European population!)

What any of this has to do with any actual admixture happening though...

I haven't read much of the paper though, and don't have a firm theoretical grasp of what is going on, but one of the things that stands out for me here is that other studies essentially either assume or try and generate an "unadmixed scaffold" to work from, i.e. finding or generating synthetic unadmixed ancestral populations.
This study seems to simply assume that all populations except the target are unadmixed and then generates a sampling (not a random mix) of all the non-target populations to approximate the target.

I can't see this is going to determine direction from admixture, or distinguish admixture from descent from a shared source (which may not be a present day population).

I don't know the paper, and perhaps there it is not as I think or there is a reason why that is non-significant, but that seems like it will lead to the vast majority of results being wrong.
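To make the point in the comment above concrete, here is a minimal sketch of fitting a target as a two-way mix of donor populations. It is not the method used in the paper: the function name `best_two_way_mix` and the three-element "haplotype-sharing profiles" are hypothetical, made up purely for illustration.

```python
def best_two_way_mix(target, donor_a, donor_b):
    """Return w in [0, 1] minimising the squared error between
    target and w * donor_a + (1 - w) * donor_b."""
    diff = [a - b for a, b in zip(donor_a, donor_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("donor profiles are identical")
    # Closed-form least-squares solution for a single mixing weight.
    w = sum((t - b) * d for t, b, d in zip(target, donor_b, diff)) / denom
    # Clip to [0, 1] so the result reads as a mixture proportion.
    return min(1.0, max(0.0, w))


# Hypothetical haplotype-sharing profiles (made-up numbers).
donor_a = [0.6, 0.3, 0.1]
donor_b = [0.1, 0.3, 0.6]
target = [0.5, 0.3, 0.2]

w = best_two_way_mix(target, donor_a, donor_b)
print(f"best fit: {w:.2f} donor_a + {1 - w:.2f} donor_b")
```

The fit here is essentially perfect, yet nothing in the arithmetic says whether the target received ancestry from the donors, gave ancestry to them, or shares an unsampled source with them. Separating those cases needs extra information, such as the dating signal or ancient samples.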

About Time said...

@Matt, your description gets us closer to actual genomic reality, which does not = "ethnicity" despite all efforts to arm wrestle data into modern notions of identity. Sorry folks.

That doesn't mean projected dates aren't up for grabs etc, but science means stretching our (cognitively biased) human brains to be more objective and learn new things.

Dates for European/Orcadian admixture in Kalash (800-600 BC) are way wrong for Macedonians or Alexander (300s BC). Dates could be right for "cities of the Medes" like Kandahar/Mundigak.

Chad Rohlfsen said...

All this is, is a fancy looking Oracle from Gedmatch with bogus mixing dates.

Daniel Falush said...

Dear David, Thank you for your interest in our paper. You are right to look critically at our results and to point out funny features. We agree I think (and have said in our interactions with lots of people!) that ancient DNA is going to transform things for many questions, and it will be v. v. helpful to use ancient sources for events. It should let us directly test hypotheses. We also suspect questions will remain - e.g. which ancient groups are really ancestors of current day populations, etc. etc. and it will probably some time before sample sizes of ancient nuclear DNA genomes are large, and also span a cross-section of times, as really needed.

Firstly, we do not say that there is no admixture in the Orcadians, only that it does not (though there is some evidence) give a significant signal based on the current dataset and on the criteria for significance that we made based on simulations. The interactive map can be used to see the best guess and then it makes it a mix of Welsh vs Norwegian. There will be more on this population when the Peopling of the British Isles paper comes out.

Secondly, you highlight the Lithuanian results in the "full analysis" and the Hadza contribution. It is worth going through these in a bit of detail as it will help in interpreting our results, for this and other groups.

As well as the full analysis, we have also done other analyses where the other East European populations are not included as donors (EastEuropeI analysis; EastEuropeII analysis). We did this because we found fineSTRUCTURE - which we ran initially to understand overall STRUCTURE patterns - had particular trouble splitting populations from Eastern Europe (Figure S17 in the supplement) and because we found - based on our simulation results - and really common sense too - that including very similar groups in the analysis could mask events they share (a caveat we included in the main text). Thus the "EastEuropeI" analyses ought to have more power to find historical events in the region because they do not mask admixture that is common to all East European populations. (In any case, we believe it is therefore worth checking whether groups seen in the Full analysis are verified in these analyses too.)

The total admixture signal in the "full analysis" for the first Lithuanian event is 1% of the genome, with admixing source Daur+Oroqen+Colombian+She. Lithuanians are generally very similar to the other East European populations and these are assigned 99% of the ancestry. This is by far the strongest signal in this analysis, and we think reflects a real event, but because the admixture fraction is potentially tiny, source inference might still be quite inexact, and "masking" may be an issue.

However, despite the very small proportion of incoming DNA, the analysis also suggests that the admixture is multi-way. The p-value for this is 0.02 - which is a marginal signal, though in the EastEuropeI analysis the p-value is more convincing. In the paper we write that our source inference is less good for complex events (like this one) - in fact as explained below for this reason we do not provide direct source inference for secondary events - and so this second signal will be one of the toughest in the dataset for our approach (because it is weaker than a quite weak signal!).

Daniel Falush said...

part 2:
It is worth mentioning what the squares relating to the event mean here, because they're different to the circles shown for the stronger events we find! They're called "contrasts" on the webpage and if you mouse over it says "Square size cannot therefore be directly interpreted as similarity to (unsampled) admixing sources, but rather highlights differences between sources." To be a bit clearer, the second component shows the different directions of the minority sources of the genome; technically, differences in populations' inferred contributions of haplotypes in the mixture representation. This means that they do NOT correspond to admixture proportions (which we do not try to estimate) but rather to populations whose ancestry is most strongly differentiated in the different components of admixture in Lithuanians - because our method is currently not able to infer sources fully for these second events. This is discussed briefly in the main text under things we find challenging. In this case squares show broadly speaking South versus North. North is associated with the first admixture component. Hadza is one of the South components, so what we're saying here is that a haplotype shared with Hadza is very strongly indicative of coming from the "South" group - not that Hadza is necessarily a dominant member of this group. Nevertheless, Hadza admittedly is a pretty odd choice - and not seen in the EastEurope I version where the signal is much stronger - and this may have something to do with the weakness of the signal. There are only 3 Hadza in the dataset and they may have ended up on the extreme end of the PCA decomposition by chance. In any second event where squares are shown, they only relate to these differences between sources.

Although the three-way admixture here is on the border of statistical significance and quite hard to interpret, the fact that there was in fact a three-way admixture event shared by the populations in the region is strongly supported by the EastAsiaI analysis. These are the results we present in the figure in the paper.

Another highlighted case is the Finnish - we only have 2 Finns which is difficult to say much from (and we could have removed cases like this - but chose not to for completeness). Probably as a consequence, their signal is one of the weakest in the whole dataset, in terms of the "fit" of the best curve to the data (0.415). Again, the Finnish second signal seems slightly odd (and includes an African group) - I think most of the same reasons might apply there (though perhaps more strongly) as for the Lithuanians. We're working on the methods for these "complex" events but likely they will remain tricky until we can obtain access to more local samples.

(with contributions from Simon Myers).
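For readers wondering how a linkage-based method can produce a date at all, here is a minimal sketch of the general idea, assuming the admixture signal decays exponentially with genetic distance at a rate equal to the number of generations since the event. This is an illustration only, not the GLOBETROTTER implementation: the function names, the synthetic noise-free curve, the 28 years per generation, and the 1950 reference year are all assumptions made for the example.

```python
import math


def fit_decay_rate(distances_cm, curve):
    """Estimate generations since admixture as minus the least-squares
    slope of log(curve) against genetic distance in Morgans."""
    xs = [d / 100.0 for d in distances_cm]  # centimorgans -> Morgans
    ys = [math.log(v) for v in curve]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


def generations_to_year(generations, years_per_generation=28, reference_year=1950):
    """Convert generations before the reference year into a calendar year.
    Both defaults are assumptions for this sketch."""
    return reference_year - generations * years_per_generation


# Synthetic, noise-free curve for a hypothetical event 40 generations ago.
true_generations = 40
distances_cm = list(range(1, 31))
curve = [0.5 * math.exp(-true_generations * d / 100.0) for d in distances_cm]

estimate = fit_decay_rate(distances_cm, curve)
print(f"{estimate:.1f} generations, about year {generations_to_year(estimate):.0f}")
```

With real data the curve is noisy, and a weak or multi-way signal makes the fitted rate much less certain, which is why a poor curve fit such as the Finnish one warrants caution.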

barakobama said...

This is totally off subject, it is about my own mtDNA. I have U5 and I am pretty sure I have U5b2a2, and I may be a member of a subclade that has not been discovered. I don't know much about how mtDNA haplogroups are determined so that's why I am posting this.

I took the mtDNAplus test at FTDNA.

It tests HVR1 (16001-16569) and HVR2 (00001-00574). They said I have U5 (it must suck for you guys and girls that have farmer maternal lineages), which is what I expected and wanted.

At filtered matches section I can never find HVR1 and HVR2 matches. I have tried probably over 30 times to find HVR1 and HVR2 matches it always takes something like 5-10 minutes before it stops loading and just freezes and says HVR1-HVR2 matches=0.

When I look for HVR1 matches it loads within like 10 seconds and I have 308 HVR1 matches, all that took FMS(HVR1, HVR2, and coding region) are under U5b2a2(with various subclades after that mainly U5b2a2b's but some U5b2a2a's and I didn't see any from the third subclade U5b2a2c).

When I go to the Advanced matches section and look for HVR1 matches I get the same results as in the section I named above. What is strange is when I look for HVR1 and HVR2 matches in the Advanced matches section I get the exact same matches as when I look for HVR1 matches. I don't know why because the other section said I have NO HVR1 and HVR2 matches.

Later I went to this predictor(link below). To try to find if I really have U5b2a2.

It says my best match is U5b2a2 and showed the line from rCRS to U5b2a2 with defining HVR1 and HVR2 mutations I HAVE EVERY SINGLE ONE!! That means I have U5b2a2 right?

I think this is the predictor's source.

The defining mutations of U5b2a2a and U5b2a2b are in the coding region so I might have them and I might not. That explains why my HVR1 matches were a part of U5b2a2b and U5b2a2a. U5b2a2c has defining HVR1 mutations and I don't have any of them.

I have all of the defining HVR1 and HVR2 mutations leading up to U5b2a2 but I also have two extra differences to RSRS in HVR2: 215G and 315.1C. That explains why I have U5b2a2 HVR1 matches but no HVR2 matches. Does this mean I am a part of a new U5b2a2 subclade that is defined by those mutations?

Davidski said...

Hi Daniel,

As you know, I've played around a bit with Chromopainter/fineStructure, and it seems to me that ancient/effective population sizes have a profound effect on these sorts of analyses.

In fact, what I'm seeing in many of these results aren't admixture events within historic times, but rather signals of prehistoric migrations via more recent and very rapid in-situ expansions across low population density areas, and/or significant endogamy.

So basically what I think is happening is that small, chopped up ancient haplotypes are being reconstituted into much larger ancestral haplotypes among many of these groups, and the results look like recent admixture events, when in fact they mainly reflect things that happened during the Bronze Age or even Neolithic.

On the other hand, regions with long histories of large effective population sizes are less likely to be affected by this problem, because their ancestral haplotypes aren't reconstituted so readily.

The Slavic expansion across sparsely populated Eastern Europe during the early middle ages is probably a good example of what I'm describing. This was in all likelihood a population of mixed indigenous European Mesolithic hunter-gatherer/Near Eastern farmer ancestry, probably with minor input from Siberian groups from the taiga belt and steppe nomads like the Sarmatians. However, because its expansion was so rapid (and we can see this particularly well via several Y-chromosome haplogroup subclades, like within haplogroup R1a), in your analysis it basically looks like three distinct groups coming together first in what is now Poland, and then in the rest of Eastern Europe.

Of course, to some extent it's true that the Slavic expansion was, as described in your paper, a mixture event between Northern European-like Slavs and the much more southerly native populations in the Balkans. But this really could not have been the case in Poland. In other words, it's impossible that two such genetically distinct northern and southern populations met in Poland or anywhere nearby at this very late timeframe.

Similarly, I doubt the Kalash have any non-trivial ancestry from Alexander's army and/or Western Europe. What I think they have is the so called Ancient North Eurasian (ANE) ancestry from somewhere like the Volga-Ural region or western Siberia, which moved both deep into Europe and Asia during the Copper Age expansions of the Kurgan cultures, along with haplogroup R1a and maybe R1b. But the Kalash have been stuck in the remote valleys of the Hindu Kush all by themselves for such a long time, that some of their haplotypes look like they arrived from Scotland and other unexpected places only a couple thousand years ago along the Silk Road.

About Time said...

Looking at the website, I see the Irish get some "Greek" %. I'd like to see more about this, and whether it sticks around when other pops are included. Is it more like the Cypriot or Polish part of Greeks?

The "Orcadian" % in Kalash is too early to be from Alexander's time (looking at middle range of CI not tail). But it is not too early to be Mede (Kandahar/Mundigak was a Mede city, and Aramaic was spoken there later on). In the neighborhood and could explain strange old customs of Kalash.

Right time frame for Medes (800-600 BC), who were big in Central Asia for a time before merging / being coopted by Persians (similar culture, but different people maybe).

The "Alexander" thing with Kalash is just lazy thinking. The "civilized world" was centered further east in 800 BC (Greeks were just on the edges and Josephus said they were basically pirates and mercenaries for the "real" empires back then).

Davidski said...

There might be several European-like haplotype layers contributing to this effect among the Kalash, but I'd say all of them are of Indo-Iranian origin, and date as far back as the Sintashta culture or even earlier.

Anders Pålsen said...


When will the Globetrotter software be available?

barakobama said...

If anyone wants to answer my question about my mtDNA here is more info, I have ~80,000 HVR1 matches in Europe and the United States but 0 HVR2 matches. Is that normal? I plan on taking FMS to confirm that I have U5b2a2 and maybe discover I have one of the three subclades. I hope FTDNA will find if I am a first sampled member of an unknown U5b2a2 subclade, I hope they let me name it if I do.

Ryan said...

Just a lay person here, but re: the Lithuania/Hadza link, assuming that Hadza isn't a misprint of Hazara (which is probably the most plausible explanation)...

Wouldn't a pretty straightforward explanation be that a missionary from a Baltic country got a little friendly with one of the locals, and that one of the descendants of that missionary happened to be one of the Hadza sampled? The Hadza were subjects of the German Empire from 1870-1918 I believe, and a decent chunk of Lithuania was part of the German Empire at that time too.

Alternatively, R1b1c's prevalence in Cameroon suggests that there was at least some genetic exchange between European/Eurasian and African hunter-gatherers during the most recent wet phase of the Sahara. Is it totally implausible that some group X living along the rivers of the Sahara imparted some DNA both to European and African hunter-gatherers? I'd note that R1b1c is found further north in the rift valley.

How many loci are tested in this method? Could a small sample size, genetic drift or a founder effect be amplifying some small speck of genetic commonality?

Strange in any case. I'd bet on it just being an error or random noise.

Grey said...

"Alternatively, R1b1c's prevalence in Cameroon"

Alternatively some R1b people from the megalithic culture wandered down south looking for the West African gold fields.

Daniel Falush said...

Ryan (et al): Your intuition is correct. To reiterate, our analysis does not actually imply Hadza ancestry in Lithuanians and yes the fact they come out prominently in the PCA decomposition is likely noise. Read our comments above for a detailed explanation.

David: we will try to respond on Slavs/Kalash soon.

MfA said...

@Daniel Falush, are there any plans to add new samples to your work, such as Abkhazians, Kurds, Ossetians, etc.?

Zachary Shumway said...

Anyone: Is there a better way to ask Davidski if he could take a look at my GEDmatch results? I'm sure he's busy. Was not quite sure how this all works. Thank you!

About Time said...

@Zachary, IMO you will get more insight from the GEDmatch oracle sections using Eurogenes K13.

There is a discussion thread on Davidski's Eurogenes blog linked on the blog menu.

Zachary Shumway said...

About Time: Thank you. I appreciate your help! I'm just kind of reaching around in the dark at this point, especially with such limited ancestral information. The Dodecad K7b is interesting, and I wasn't sure how to look at that; I am mostly European. Thanks again!

Alexandros said...

"So basically what I think is happening is that small, chopped up ancient haplotypes are being reconstituted into much larger ancestral haplotypes among many of these groups, and the results look like recent admixture events, when in fact they mainly reflect things that happened during the Bronze Age or even Neolithic."

Hi David, very interesting to see your above remark. I was actually about to contradict myself and say that what I described in my previous post as 'ancient Greek' admixture in Cypriots may not be so, given the current study's estimated dates of the admixture event from Greece to Cyprus (300-1100 CE). That is basically the Byzantine era in Cyprus. I have difficulty accepting that there was such a big admixture event from Greece (and especially Italy) into Cyprus during that period, as it is not documented at all in the hundreds of historical reports from that period. If, however, we accept that such an event happened in addition to the ancient Greek (Mycenaean) migration into Cyprus, then we could suggest that the Greeks before the Slavic expansion had an almost pure Mediterranean/Anatolian admixture; otherwise it would be hard to explain why Cypriots, with such intense and repeated migration from Greece, still have very low Northern European admixture.

Davidski said...


I don't know much about the population history of Greece, but, as you're probably aware, one of the theories about Greek origins is that the Mycenaeans originally came from the Russian steppe. If that's true, then that means there were two, probably quite distinct migrations from the north into Greece: Mycenaean and Slavic.

Interestingly, in Lazaridis et al., Greeks show a relatively high level of ANE at around 16%, and a surprisingly low level of WHG at less than 6%. But if almost all of the northern-like admixture among modern Greeks was of Slavic origin, then surely their WHG ancestry would be much higher, because all northern Slavs carry a lot of WHG. Well, it actually might be much higher in former and current Slavic-speaking areas, but that hasn't been investigated yet.

Anyway, Cypriots also show quite a bit of ANE, with a maximum estimate of just over 13%, but basically no WHG. So if there was large scale emigration of Greeks to Cyprus, then most of that probably took place before the Slavic invasion of mainland Greece, thereby bringing a lot of Mycenaean (?) ANE to Cyprus but no WHG. That's not to say there weren't any population movements from Greece or even further north and west into Cyprus well after the Mycenaean period, but it looks like these weren't very important and/or they affected only certain parts of the island.

Now, if there indeed were at least two migration waves from the north into Greece and then Cyprus, then what is the software picking up in modern DNA, especially since there might no longer be a decent proxy available for the Mycenaeans among modern populations? Is it picking up all of these population movements and giving us the average, or is the date based on the later waves? Honestly, without a few Mycenaean and Slavic genomes from Greece, we can only speculate.

Please note, I'm in a hurry right now, so hopefully what I just typed actually makes sense.

Davidski said...

Oh, by the way, I should've added somewhere there that it seems to me that ANE is expressed in ADMIXTURE tests as North European, West Asian and Caucasus-related ancestry. So the fact that Cypriots have low North European admixture doesn't argue against what I just said. In other words, in a roundabout way, West Asian/Caucasian ancestry across West Asia might be a signal of North European admixture events, albeit the really early ones of the Bronze Age.

V Robazza said...

You forgot the Doric invasion. The Mycenaeans were Middle Bronze Age people; the Late Bronze Age Doric invasion covered modern Albania, Epirus, the Peloponnese, Crete and Cyrene in North Africa, basically most of the Mycenaean areas.
The Dorics in Albania were eventually, by the early Middle Ages, replaced by Serb Slavs, who were in turn absorbed into Albanian society. What is the ANE of Albanians?

Davidski said...

I did forget the Dorians. But I'm guessing they were very similar to the Mycenaeans...maybe.

Albanians have around 13% of ANE. So less than Greeks.

Alexandros said...

Hi David,

Yes, definitely, what you mention regarding the ancient admixture components makes perfect sense. I have played around with your EEF-WHG-ANE calculator using the K13 population averages, and I have noticed that among West Eurasians the ancient ANE component is positively correlated with the modern Baltic and West Asian (Caucasus) components. I believe Greeks have higher ANE than the rest of the Balkan populations because they have higher Caucasus admixture, in addition to their relatively high Baltic admixture. There are populations with high Caucasus admixture and low Baltic admixture (e.g. the Georgians) that have higher ANE admixture than the Greeks. The Adygei, with both high Baltic and really high Caucasus admixture, have even higher ANE. I have also noticed that the ANE component is negatively correlated with both the East and West Mediterranean components, as well as the Red Sea component. In the context of the Hellenthal et al. paper, I agree with you that if there had been a substantial migration of mainland Greeks into Cyprus after the Slavic expansion, as the paper suggests, we would see some WHG, and in fact also more Baltic admixture and more ANE, in Cyprus. Unless these recent migrations to Cyprus came from a Greek region with low Slavic admixture, such as Crete. I am not aware of the origin of the Greek samples used in that study.

Anyhow, thanks a lot for sharing your thoughts and also thanks for the excellent EEF-WHG-ANE calculator!

Alexandros said...

@V Robazza

The Dorian invasion is a very complicated issue, and the notion that the Dorians were a northern population who descended southwards into Greece has never been proved by any historical or archaeological source. At the moment, the theory that receives the highest acceptance among historians and archaeologists is that the Dorian invasion probably represents an 'internal affair' in the complex ancient Greek world, with an indigenous population introducing a new culture (the Dorian), which replaced an already demised Mycenaean culture. What we know for sure is that after the demise of the Mycenaean culture in mainland Greece, this same culture appeared in a sudden and dramatic fashion on the island of Cyprus and continued to be present there for centuries thereafter, with practically no evidence of Dorian culture, unlike neighbouring islands such as Crete.

Daniel Falush said...

Hi David,
With respect to the Eastern European analysis (part 1):
(1) In general we think that for events that took place at a single time point, our results on the timing of admixture are more robust than those on admixture sources. Even if the admixture source is very badly captured by the inferred sources, the decay curves should still in principle have the right slope, although the confidence limits might be wider than they would be if the source were well captured.

(2) The dates that we estimate are lower bounds because the admixture happens after the populations meet. However, in practice we find that they seem to be bang on in many cases.

(3) Admixture can of course be a gradual process, but we test for multiple admixture dates, and it is possible to look at the curves and see if there is any hint that admixture started earlier than the fitted date. One example of this is the Moroccan curve fitted for the French, which does not seem to fit the single admixture date curve that is fitted for the French (when you force it to).

(4) For the Eastern European populations, the most informative curves are the "EastEuropeI" analysis curves, because these are least affected by masking, as discussed above. Generally, I would say the East European curves fit well, even when the date is a few hundred years later than for Polish (up to 1000 CE). The Hezhen curve for Lithuania (643 CE) does not quite fit at short genetic distances, but I would say it's an exception, and there is no curve that shows real evidence for substantially older admixture occurring; they are all nice, well-behaved looking curves. There is no statistical evidence for two dates for any of these populations. One caveat to this is that in simulations, recent events do often dominate the signals.
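As a rough illustration of point (1), and not the authors' actual pipeline, the idea that the slope of a decay curve carries the date can be sketched as follows. Everything here is an assumption made for the example: the curve is modelled as a single exponential A * exp(-g * d) with d in Morgans, the function names are hypothetical, the data are synthetic and noise-free, and the 28-year generation time and 1950 reference year are just illustrative defaults.

```python
import math

def fit_decay_rate(distances_morgans, signal):
    """Least-squares fit of log(signal) = log(A) - g * d.

    Returns (g, A), where g is the decay rate, read here as
    generations since a single pulse of admixture.
    """
    xs = list(distances_morgans)
    ys = [math.log(s) for s in signal]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

def generations_to_year(generations, years_per_generation=28.0,
                        sampling_year=1950.0):
    """Convert generations to a calendar year (negative means BCE).

    Both defaults are assumptions of this sketch.
    """
    return sampling_year - years_per_generation * (generations + 1)

# Synthetic curve: amplitude 0.2, 50 generations, distances 1-30 cM.
true_g = 50.0
distances = [0.01 * k for k in range(1, 31)]
curve = [0.2 * math.exp(-true_g * d) for d in distances]

g_hat, a_hat = fit_decay_rate(distances, curve)
year = generations_to_year(g_hat)
```

Note that changing the amplitude of the synthetic curve (a stand-in for a poorly matched source) leaves the fitted rate unchanged, which is the intuition behind the claim that dates are more robust than sources.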

Daniel Falush said...

part 2
(5) We have simulated expansion events and find that this makes no difference to estimated admixture times. It is true that very strong genetic drift after admixture can in principle affect estimated admixture times, but we have developed a "NULL INDIVIDUAL" approach to investigate this (see pages 64-65 of the Supplement). Here, we calibrate curves using as denominator the average ancestry of other individuals with the same label at each genetic distance. Drift shared by individuals in the population will not affect these curves (see table S9 of the SOM). We do not use this analysis by default because dividing by a denominator can affect inference of source groups in general, but we have checked all the populations to see whether it changed the dates substantially. There are a few populations where it did, and for these populations admixture has been classified as uncertain (see page 88 of the Supplement). None of the Slavic groups fall into this category; for Polish, for example, the date we get is almost identical at 410 CE, and the oldest confidence limit is actually more recent than in the main analysis, at 38 BCE.

(6) There is no evidence for strong endogamy within these populations, since fineSTRUCTURE does not separate them cleanly. This is another argument against strong drift being a factor.
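The denominator idea in point (5) can be illustrated with a toy calculation. This is only a sketch under strong assumptions, not the procedure in the Supplement: it supposes that shared drift enters each curve as a multiplicative factor common to all individuals with the same label, and the function name, the factor and all numbers are hypothetical.

```python
import math

def null_normalised(within_curve, across_curve):
    """Divide a within-individual curve by the matching curve averaged
    over other individuals with the same label, distance by distance."""
    return [w / a for w, a in zip(within_curve, across_curve)]

distances = [0.01 * k for k in range(1, 11)]  # Morgans

# Hypothetical factor shared by everyone in the population (e.g. drift).
shared = [1.0 + 0.5 * math.exp(-200.0 * d) for d in distances]

# Hypothetical admixture signal, 40 generations old.
admixture = [1.0 + 0.3 * math.exp(-40.0 * d) for d in distances]

# Within an individual both effects are present; across individuals
# only the shared factor remains (an assumption of this toy model).
within = [s * a for s, a in zip(shared, admixture)]
across = shared

ratio = null_normalised(within, across)
```

In this toy model the ratio recovers the admixture signal exactly, which is the sense in which drift shared by individuals in the population does not affect the calibrated curves.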

In sum, there is strong evidence from analysis of multiple East European countries of admixture with an East Asian-like source in the Common Era (CE). There is no evidence from the curves that this source had earlier admixture with other Western Eurasian groups, although this does not completely exclude earlier admixture. Our analyses do not demonstrate that this happened physically in Poland, but they do show that modern Poles have received a substantial fraction of their DNA from this mixture event.

With respect to the Kalash, it is worth noting that although several of the populations show evidence for "ancient" Western Eurasian ancestry, the Kalash component looks significantly more Northern European (this is shown in figure S18; in fact, the Kalash get their own separate North/East Europe component). We state that the results are consistent with Alexander the Great, not that they prove it. There is also no evidence that our dates for this earlier event (which does have wide confidence limits) are substantially affected by genetic drift.

Davidski said...

Hi Daniel,

It's the Kalash who've been affected by endogamy, not the Eastern Europeans.

The Eastern Europeans were affected by massive in-situ expansions and founder effects. That's why today, for example, any two random Balto-Slavic individuals from different ethnic or even linguistic groups, and from locations separated by hundreds of kilometers, share more IBD than any two random Irish individuals.

So the end result is very similar to high endogamy, but it encompasses a much wider area.

Population genetic analyses are often strongly affected by this phenomenon, and obviously so is yours, because you've just told us that you can't separate these populations in fineSTRUCTURE. This is precisely the reason why.

By the way, the highest level of North European ancestry is found in the East Baltic area, not Northwestern Europe (with North European ancestry being defined as the component with the highest level of indigenous European input). That's why if you run the relevant PCA you'll see that East Baltic groups, like Lithuanians and Latvians, are the furthest removed Europeans from Middle Eastern populations, without being shifted towards anyone else from outside of Europe at the same time. This is supported by all ancient DNA to date.

So I find it extremely difficult to believe that there was a late admixture event in this region like the one you're positing. Eventually ancient DNA might support your findings, but based on everything I've seen to date, I doubt it.

Davidski said...

Actually, I should add that when I said the Kalash had Ancient North Eurasian (ANE) admixture rather than North European admixture, I didn't mean your results were way off in this case. That's because present-day Northern Europeans apparently derive a lot of their ancestry from the so called Ancient North Eurasians.

So in other words, the Kalash might indeed have North European-like ancestry, but it probably didn't come from Europe. Indeed, I can't believe that it arrived in the Hindu Kush with Alexander's army, who were in all likelihood Southeast Europeans, Anatolians and Persians anyway.

I think the most plausible explanation for the northern admixture among the Kalash is the expansion of the Sintashta chariot complex from the southern Urals during the Bronze Age. This actually fits perfectly with the spread of Y-chromosome haplogroup R1a-Z93 into South Asia. It also fits well with the post-Neolithic appearance of the Ancient North Eurasian component in Europe, because it seems these people, who were probably the early Indo-Europeans, pushed out in both directions.

It's a shame the dates don't fit, because everything else does, especially if we don't assume that present-day populations are good proxies for ancient population movements.

Davidski said...


Yes, ANE seems to show a high correlation with the Caucasus and related components. But this doesn't necessarily mean that the fairly high ANE among Greeks is due to migrations from the Caucasus and nearby. What it might mean, at least in part, is that populations with a high ratio of ANE expanded into Greece and the Caucasus at about the same time.

I think that makes sense, because there was no ANE influence among the early Neolithic farmers, and these samples also mostly lack the Caucasus-like admixture, which appears to be a hybrid of pre-ANE Mediterranean/Near Eastern and ANE components.
