Thursday, August 2, 2018

Global25 coordinates for almost 500 Ashkenazi Jews

I know that some of you are looking at the genetic structure of Ashkenazi and other Jewish populations with the Global25 data. So, to help things along, here are Global25 coordinates for 471 Ashkenazi individuals from Bray et al. 2010 (see here).

AJ G25 coordinates

AJ G25 coordinates (scaled)

AJ G25 coordinates PAST datasheet (scaled)

I don't know what the genotyping accuracy is for these samples. It's probably very high, but just in case, considering their age, it might be useful to remove the most extreme outliers before trying any fine-scale analyses. In any case, on average, they're very similar to the Ashkenazi Global25 reference panel. To illustrate the point, below is a plot based on the above PAST datasheet.

Indeed, across all 25 dimensions their lowest distance is to the Ashkenazi Global25 reference panel, followed by various Mediterranean populations. So it seems that everything makes sense.

Ashkenazi_G25_reference 1.031630
Maltese 1.995762
Italian_South 2.020773
Sicilian_East 2.166531
Sicilian_West 2.313971
Italian_Abruzzo 2.593161
Italian_Jew 2.757282
Greek_Crete 2.930938
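
The two steps described above (trimming extreme outliers, then ranking reference populations by distance to the sample average) can be sketched roughly as follows. This is only an illustration under stated assumptions: the coordinates are made-up three-dimensional toy values rather than real Global25 rows, the names `RefA`, `RefB` and `RefC` are hypothetical, and the median-absolute-deviation cutoff is one possible outlier rule, not necessarily the one used for the figures in this post.

```python
import math
import statistics


def centroid(rows):
    """Mean of each coordinate across a list of equal-length vectors."""
    return [statistics.fmean(col) for col in zip(*rows)]


def trim_outliers(rows, max_mads=5.0):
    """Drop rows that sit unusually far from the centroid.

    A row is dropped when its distance to the centroid exceeds the median
    distance by more than `max_mads` median absolute deviations
    (an assumed rule, chosen for illustration).
    """
    center = centroid(rows)
    dists = [math.dist(row, center) for row in rows]
    med = statistics.median(dists)
    mad = statistics.median(abs(d - med) for d in dists)
    if mad == 0:
        return list(rows)
    return [row for row, d in zip(rows, dists) if d - med <= max_mads * mad]


def rank_by_distance(target, references):
    """Sort reference populations by Euclidean distance to the target."""
    pairs = ((name, math.dist(target, vec)) for name, vec in references.items())
    return sorted(pairs, key=lambda pair: pair[1])


# Toy stand-ins for per-individual coordinates (real rows have 25 values).
samples = [
    [1.0, 0.0, 0.0],
    [1.1, 0.1, 0.0],
    [0.9, -0.1, 0.0],
    [1.0, 0.1, 0.1],
    [1.0, -0.1, -0.1],
    [9.0, 9.0, 9.0],  # deliberately extreme row
]
# Hypothetical reference population averages.
references = {
    "RefA": [1.0, 0.0, 0.0],
    "RefB": [2.0, 0.0, 0.0],
    "RefC": [5.0, 5.0, 5.0],
}

kept = trim_outliers(samples)
ranked = rank_by_distance(centroid(kept), references)
for name, dist in ranked:
    print(f"{name} {dist:.6f}")
```

With real data, each row of the scaled coordinate file would take the place of a toy vector, and the population averages of the reference sheet would take the place of `references`.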

old europe said...

With the exception of Crete, all the populations listed belong not to the eastern Mediterranean but to the western part.

Davidski said...

@old europe

Fair enough. I've edited it to just Mediterranean.

Samuel Andrews said...

There's lots of discussion about European Jews being an Aegean-Levantine mix at Anthrogenica.

It shouldn't be difficult for the experts to pinpoint exactly who the European ancestors of western Jews are, right? There's definitely a pretty easy way to differentiate between Spanish vs Greek ancestry or between Italian vs Greek ancestry or between German vs Greek ancestry.

If genome-wide data fails, parental markers can give answers. I bet if I took a look at Jewish mtDNA there'd be signals pointing to where their European ancestry is from. I do remember an Ashkenazi customer who belonged to a signature 'Slavic' mtDNA haplogroup, but that's probably an exception.

Just at a glance, it looks obvious that Sephardic, Italian and Ashkenazi Jews share more than Israeli ancestry; they also derive from the same ancient European Jewish populations. So, heavy Spanish & German contribution can probably be ruled out for both. If their European ancestry is Aegean, there should be a way to confirm it.

mzp1 said...

Do you know anything about where the R1a in Ashkenazi Levites could have come from?

Davidski said...

Ashkenazi R1a is probably from Iran...

The genetic variation in the R1a clade among the Ashkenazi Levites’ Y chromosome

mzp1 said...

Slightly off-topic, but unless I'm mistaken, Abraham's destruction of idols is a defining event in the history of Judaism. However, anti-idolatry is also reflected across the Iranian and Germano-Balto-Slavic religions, as none of these are known to have kept idols. The Shahnama also has a reference to a strong prevailing anti-idolatry.

I believe, in addition to other ideas, the anti-idolatry in Judaism, and further in Christianity and Islam, likely has origins in some early IE culture, likely Iranian. This is in contrast to at least IA, Greek, Roman and other Middle Eastern religions, which did keep idols.

R1a in Jews seems to reflect this early borrowing of religious ideas.

Uruk Osmerius said...

Out of curiosity @Davidski,
Do you know the specific region (and province) that the Italian_South reference population is from?

And do you know the specific provinces that the Sicilian_West and Sicilian_East reference populations are from?

Thanks in advance.

Davidski said...

@Uruk Osmerius

Sorry, I don't know that.

Open Genomes said...

Thank you @David for the 471 Ashkenazi Jews from Bray et al. (2010). These are control samples from a set of 547 parents of one schizophrenic child, which were curated to eliminate poor read quality and relatives revealed by high IBD.

My understanding from Dr. Itsik Pe'er is that these "500" DNA samples have had Illumina whole genome sequencing done at the NY Genome Center, with joint variant calling, and right now they are attempting to identify CNVs. This should prove a very valuable dataset for a closely related population isolate.

Here is an initial tree of only this set of Ashkenazi Jews showing two major clusters (colored greenish and reddish) which is the result of some difference between them.

471 Ashkenazi Jews clustered into two groups using Ward's distance-squared algorithm on scaled data
(PDF, enlarge to 110% and scroll all the way to the right to see the individual samples)
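
For readers who want to try a similar split themselves, here is a minimal sketch of agglomerative clustering with Ward's criterion, where each step merges the pair of clusters whose union adds the least within-cluster squared distance. It is not the software used for the tree above, and the points are toy two-dimensional values rather than scaled Global25 coordinates. The naive loop is cubic in the number of samples, so for a set of this size an optimized routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would be the practical choice.

```python
def ward_clusters(points, k):
    """Merge clusters by Ward's criterion until only k remain.

    Returns a list of k sorted lists of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, center_a = clusters[a]
                members_b, center_b = clusters[b]
                na, nb = len(members_a), len(members_b)
                sq_dist = sum((x - y) ** 2 for x, y in zip(center_a, center_b))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, center_a = clusters[a]
        members_b, center_b = clusters[b]
        na, nb = len(members_a), len(members_b)
        merged_center = [
            (na * x + nb * y) / (na + nb) for x, y in zip(center_a, center_b)
        ]
        merged = (members_a + members_b, merged_center)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]


# Toy data: two well-separated groups of three points each.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
groups = sorted(ward_clusters(points, 2))
print(groups)
```

Cutting at two clusters mirrors the greenish/reddish split described above; cutting at a larger `k` gives finer subclusters.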

One possible difference is the degree of North African ancestry, although this doesn't seem to be exactly true in the cases I've checked on the larger Global25 ancient and modern tree. We have seen this kind of split in much earlier attempts at clustering Ashkenazi Jews. This needs to be investigated further.

Open Genomes said...

First, the complete Global25 tree, including all the recent ancient and modern samples, along with the Bronze Age Sidon individuals, the 471 Ashkenazi Jews from Bray et al. (2010), and 6 others: an Iyer Brahmin ("Iyer_Vathula"), a South Italian from Calabria with some Arbereshe (Albanian) ancestry ("Ciliberto"), and 4 Ashkenazi Jews with known ancestries from the Polish-Lithuanian Commonwealth ("OG3", "OG1", "TK", and "MR"):

Eurogenes Global 25 Ward's distance-squared clustering tree of 460 ancient and modern samples, divided into 650 clusters

Here are restricted nMonte3 runs for the 4 specific Ashkenazi individuals mentioned above with known 100% Ashkenazi ancestries from the Polish-Lithuanian Commonwealth. Each of these has been verified in Gedmatch (and Family Finder and Ancestry) as having 100% Ashkenazi or part-Ashkenazi and other Jewish matches down to a very low level of IBD, so they have no recent non-Ashkenazi admixture. These runs are with the distance penalty = 0, and the percent limit (cutoff for the initial nMonte3 run) at 0.75%. These runs vary by which populations and periods were included.
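
To give a feel for what a Monte Carlo mixture fit of this kind does, here is a deliberately simplified sketch. It is not the nMonte3 script: it has no distance penalty (comparable in spirit to setting the penalty to 0), it does not apply a percent cutoff, and the function name `fit_mixture`, the batch size, the number of rounds and the toy sources `A`, `B` and `C` are all assumptions made for illustration. The idea is to hold a batch of source labels, average their coordinates, and accept random swaps only when the average moves closer to the target.

```python
import math
import random


def fit_mixture(target, sources, batch=200, rounds=20000, seed=0):
    """Estimate mixture shares by random swaps that reduce the distance.

    `sources` maps a label to its coordinate vector. Returns (shares, distance),
    where shares maps each label to its fraction of the final batch.
    """
    rng = random.Random(seed)
    names = list(sources)
    picks = [rng.choice(names) for _ in range(batch)]
    # Running coordinate totals over the batch, one entry per dimension.
    total = [sum(sources[p][d] for p in picks) for d in range(len(target))]

    def distance(tot):
        return math.dist(target, [t / batch for t in tot])

    best = distance(total)
    for _ in range(rounds):
        slot = rng.randrange(batch)
        old, new = picks[slot], rng.choice(names)
        if new == old:
            continue
        trial = [t - sources[old][d] + sources[new][d] for d, t in enumerate(total)]
        trial_dist = distance(trial)
        if trial_dist < best:  # keep the swap only if the fit improves
            picks[slot], total, best = new, trial, trial_dist
    shares = {name: picks.count(name) / batch for name in names}
    return shares, best


# Toy example: the target is an even mix of A and B; C is a distractor.
sources = {"A": [0.0, 0.0], "B": [2.0, 0.0], "C": [0.0, 5.0]}
target = [1.0, 0.0]
shares, dist = fit_mixture(target, sources)
print(shares, dist)
```

The reported "distance" in runs like the ones in this thread plays the same role as `dist` here: the smaller it is, the better the chosen sources reproduce the target's coordinates.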

Even though these 4 individuals cluster closely (3 of them cluster together on the tree) and share IBD, there are important differences between them.

OG3 is a rather unique sample: This individual was born in Lublin just after WWI, and has all of his ancestors from just the Lublin region. This individual is a sample of a nearly "vanished" population, Ashkenazi Jews with all of their ancestry from very specific small areas in Europe, without known recent admixture from other areas. This sample does appear to have some differences with the others, who have "mixed" 100% Ashkenazi ancestries from the historical Grand Duchy of Lithuania (Lithuania, Latvia, Belarus, and northeast Poland - Lomza and Suwalki) and the Kingdom of Poland (Southern Poland and Ukraine - the Austrian Kingdom of Galicia, and those who went south from this region to Romania after 1829). These differences may be the result of drift, or may be the result of specific historic founders in the Lublin region.

Open Genomes said...

The runs for the 4 Ashkenazi individuals for all samples outside of their own Ward's distance-squared tree subclusters, including all ancient and modern samples and other Ashkenazi Jews:

Sample: OG1 distance: 0.58480%
Sample: MR distance: 0.14820%
Sample: TK distance: 0.90250%
Sample: OG3 distance: 0.38160%

There are significant differences here in the degree of matching with other Ashkenazi Jews, and in which non-Ashkenazi and non-Jewish components show up. Some significant differences are the type and degree of European, Near Eastern, North African, and Iranian ancestry, which are more apparent when Ashkenazi Jews and other Jews are excluded in the following runs. This is also highlighted when comparisons are limited to only either modern or ancient samples.

Open Genomes said...

The runs for the 4 Ashkenazi individuals for modern-only non-Ashkenazi samples, but including other non-Ashkenazi Jews:

Sample: OG1 distance: 1.14460%
Sample: MR distance: 0.44250%
Sample: TK distance: 1.05180%
Sample: OG3 distance: 0.53790%

There are significant differences in the level of matching with non-Ashkenazi Jewish ancestry. Non-Jewish European ancestry becomes more apparent.

Open Genomes said...

The runs for the 4 Ashkenazi individuals for modern-only non-Jewish samples:

Sample: OG1 distance: 1.04060%
Sample: MR distance: 0.51640%
Sample: TK distance: 0.87010%
Sample: OG3 distance: 0.78200%

Here there are some common ancestral components, such as South or Central Italian, Balkan, French, and Levantine, but differences in the presence of North African (Moroccan) and Iranian ancestry.

Open Genomes said...

The runs for the 4 Ashkenazi individuals for only medieval and ancient samples:

Sample: OG1 distance: 0.47130%
Sample: MR distance: 0.80990%
Sample: TK distance: 0.44670%
Sample: OG3 distance: 0.71830%

The ancient samples provide very good fits. Some significant differences are the level of Germany Medieval ancestry, the presence of Iberomaurusian ancestry, the presence and level of Baltic ancestry (in the two individuals with Lithuanian or Western Belarussian Ashkenazi ancestors), and the non-trivial amount of BMAC ancestry.

DA29, the Medieval "Golden Horde" individual, is clearly from 100% Lithuanian-Belarussian-Polish ancestry. The Pagan Grand Duchy of Lithuania was an ally of the Tatar-Mongol Golden Horde:
DA99 Medieval Golden Horde modern matches showing Lithuanian, Polish, and Belarussian ancestry

Open Genomes said...

While we lack Medieval French and Roman and Medieval Italian samples from Italy, ancient DNA shows both Medieval German and Baltic / Grand Duchy of Lithuania admixture among some Ashkenazi Jews, depending on their region of origin. There is also evidence of native North African admixture from the historically documented Berber conversion to Judaism between the 2nd and 7th centuries CE, but this too is unevenly distributed. Not only did the Jews of Medieval Iberia have North African Jewish ancestry, but after 1149 many North African Jews settled in Sicily, where the use of Judeo-Arabic persisted for a much longer period than in Iberia. Italian admixture seems to be common to all Ashkenazi Jews, and this is because Roman Italy was the first place of settlement for Jews in Europe. Initially, Jewish male captives from Judea would have married (and converted) local Italian women (fellow slaves?), mostly from Rome and Southern Italy. In Southern Italy they would have been primarily of Greek origin.

It's hard to say why particular regions of Poland-Lithuania have differing levels of North African and French ancestry. Like Italy and Germany, France and North Africa were primary regions of Jewish settlement in the Early and High Middle Ages, but unlike Baltic region ancestry, we might expect this to be more homogenous.

The interesting Iranian-BMAC components lacking in Europeans certainly derive from the Near East. We see this on the Y-DNA, the prime example being "Ashkenazi Levite" R1a-CTS6, which has its closest matches in Iran. There is also a historical explanation for this:
During the return from the Babylonian Exile in 444 BCE, there seems to have been a new class of people included, the "Nethinites" ("Netinim" = "given ones"), "temple servants (slaves)" brought from Persia. These seem to have been a rather substantial proportion of the population of the newly-rebuilt Jerusalem. They seem to have had distinctive given names, and in fact one individual in the Dead Sea Scrolls has a given name and patronymic from this set, so they could be said to be archaeologically attested. So perhaps this reflects an origin of the proto-Ashkenazi community in Second Commonwealth Judea in the 1st century, as opposed to the large Jewish communities of Iraq and Egypt.

So, some mysteries remain, but we can trace the origins and migrations of the Ashkenazi Jewish community of Eastern Europe by a close examination of the Global 25 results using a clustering algorithm and nMonte.

Open Genomes said...

One other interesting thing revealed here that is not directly related to Ashkenazi ancestry:

Some Maltese, Cypriots, Greeks from Crete, Sicilians, South Italians, and Tuscans appear "within" the larger "Ashkenazi Jewish" clade. Most significantly, I9033, the Mycenaean outlier who clusters with South Italians on the tree without these 471 Ashkenazi Jews, clusters closely with Ashkenazi Jews as well as an East Sicilian, Maltese, and Italians from Abruzzo. This may not be just due to historic South Italian / Greek admixture among Ashkenazi Jews. Rather, perhaps this reflects the influence of the Aegean Sea Peoples during the Late Bronze Age Collapse in the Levant. While this Mycenaean outlier may have a different result due to aDNA damage, his results seem consistent with what we know historically and archaeologically about the Late Bronze Age collapse in the Mediterranean.

Search for I9033 here in the Ward's distance-squared scaled clustering tree

@Roy King

Open Genomes said...

I forgot to mention that a major potential source of Greek admixture would be the very large Jewish community of Classical Alexandria, which existed until 415 CE. The native Egyptians tended to remain separate from the Greek majority and the Jewish community. The Jews made up 1/5th the population of Classical Alexandria.

We don't know where the Jews of Alexandria ended up, but it's possible they went westward and then some went north to Italy. However, it may be that they are the ancestors of the Libyan Jews, rather than Italian Jews and their descendants, Ashkenazi Jews.

Joshua Lipson said...

@Open Genomes—the data from Ancestry seems to confirm a first-order divide between Litvaks (including the major share of Jews from northern and eastern Ukraine), on one hand, and other more westerly and southerly Ashkenazim, on the other:

(The original published study, Han et al 2017, also identifies a smaller 3rd cluster, but I wasn't able to glean any geographical pattern from it.)

That said, the result of this clustering analysis might not be reflected in PCA. Or maybe there really is a significant difference in certain components—say Baltic—that might reflect exactly that divide. Does PCA, at a high-enough number of dimensions, necessarily converge with network clustering?

Not that I've seen many calculator results for "pure Litvaks" or "pure Galitzianers", but it seems to me that going off PCA, the biggest gap is between Western and Eastern Ashkenazim (like in Behar et al 2013). Network clustering, on the other hand, suggests that gene flow and migration from Germany/Bohemia continued to affect southern Polish/Ukrainian/Romanian Ashkenazim for longer than it affected Litvaks—the difference between a German and a Galitzian Jew most likely being an early infusion of Slavic ancestry in the latter's ancestry (a Litvak would have something similar, but would also be much more genealogically remote from non-Litvaks).