Eurogenes Blog: South Asian R1a in the 1000 Genomes Project

Wednesday, May 15, 2013

South Asian R1a in the 1000 Genomes Project

After a recent update, the 1000 Genomes Project now includes 62 individuals of South Asian origin belonging to Y-DNA haplogroup R1a-M17. Their full Y-chromosome sequences have been analyzed by Semargl and Maximus (aka the YFull project), with some interesting but not unexpected results:

- All individuals belong to R1a-Z93, which appears to totally dominate South Asian R1a-M17.

- A single Punjabi from Lahore, northeastern Pakistan, is ancestral for the Z94 mutation, which is just below Z93. All the other individuals are derived for Z94.

- Six individuals - of Punjabi, Bangladeshi and Gujarati origin - are ancestral for L657 and Z2124, the two main mutations immediately below Z94.

- All individuals of South Indian and Sri Lankan origin are derived for L657 or Z2124.

- Based on this sample, there appears to be no substructure along ethnic or geographic lines within South Asian R1a-M17 derived for L657 and Z2124.

Thus, it seems the SNP diversity of South Asian R1a-M17 is low, and decreases from Pakistan, North India and Bangladesh to South India and Sri Lanka. In comparison, there are only 12 European R1a individuals in the 1000 Genomes sample, and they represent all the major subclades of this haplogroup: R1a-Z283, R1a-Z93 and R1a-L664. Therefore, sampling bias can't be used as an argument for the more diverse result from Europe.

The lack of substructure along ethnic and geographic lines within South Asian R1a-L657 and R1a-Z2124 looks unusual, especially considering the caste system in India, and needs to be verified with more extensive sampling. However, if this outcome holds up, it'll suggest that paternal gene flow across South Asia has not been restricted by the caste system or geography. Then again, it could mean the caste system appeared after R1a-L657 and R1a-Z2124 arrived in South India via massive population movements from the north.

Below are all the results in as much detail as the current R1a SNP tree allows. Key: BEB - Bengali from Bangladesh; GIH - Gujarati from Houston, Texas; ITU - Indian Telugu from the UK; PJL - Punjabi from Lahore, Pakistan; STU - Sri Lankan Tamil from the UK.

Z93+ Z94-
PJL - 1

Z94+ L657- Z2124- Z96-
BEB - 2
PJL - 3
GIH - 1

L657+,Y2+ etc.
1) Y9 (inc. Y7)
GIH - 7
STU - 4
ITU - 4
PJL - 8
BEB - 2

2) Y4+, Y8+, Y28+ (inc. Y6+)
GIH - 6
ITU - 6
PJL - 2
STU - 6
BEB - 5

Z2125+ (Z2124+ Z2122- Z2123-)
PJL - 1

Z2123+ (Z2124+ Z2122- Z2125-)
PJL - 2
STU - 3
BEB - 1
ITU - 6
GIH - 2
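
The counts above can be tallied per population with a short script. This is only a sketch: the numbers are transcribed from the list as given, while the dictionary labels and function names are shorthand invented for this example and are not part of the YFull analysis.

```python
from collections import Counter

# Counts transcribed from the results listed above.
# Keys are shorthand labels for each group (an assumption made for
# this sketch); values map population code -> number of individuals.
RESULTS = {
    "Z93+ Z94-": {"PJL": 1},
    "Z94+ L657- Z2124-": {"BEB": 2, "PJL": 3, "GIH": 1},
    "L657+ Y9": {"GIH": 7, "STU": 4, "ITU": 4, "PJL": 8, "BEB": 2},
    "L657+ Y4": {"GIH": 6, "ITU": 6, "PJL": 2, "STU": 6, "BEB": 5},
    "Z2125+": {"PJL": 1},
    "Z2123+": {"PJL": 2, "STU": 3, "BEB": 1, "ITU": 6, "GIH": 2},
}


def totals_by_population(results):
    """Sum the number of individuals listed for each population."""
    totals = Counter()
    for counts in results.values():
        totals.update(counts)  # adds the per-population counts
    return totals


def groups_by_population(results):
    """Count in how many of the listed groups each population appears."""
    groups = Counter()
    for counts in results.values():
        groups.update(counts.keys())  # +1 per group the population is in
    return groups


if __name__ == "__main__":
    totals = totals_by_population(RESULTS)
    groups = groups_by_population(RESULTS)
    for pop in sorted(totals):
        print(f"{pop}: {totals[pop]} individuals in {groups[pop]} groups")
    print("Total:", sum(totals.values()))
```

With the rows as listed here, PJL shows up in all six groups, BEB and GIH in four each, and ITU and STU in three each, which fits the north-to-south drop in diversity described above. The rows sum to 72 individuals rather than the 62 quoted at the top of the post, so the counts are worth checking against the original YFull analysis.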

7 comments:

SB, May 15, 2013 at 8:17 AM
Semargl and Maximus!

Nirjhar007, May 17, 2013 at 12:31 AM
''Thus, it seems the SNP diversity of South Asian R1a-M420 is low, and decreases from Pakistan, North India and Bangladesh to South India and Sri Lanka. In comparison, there are only 12 European R1a individuals in the 1000 Genomes sample, and they represent all the major subclades of this haplogroup: R1a-Z283, R1a-Z93 and R1a-L664. Therefore, sampling bias can't be used as an argument for the more diverse result from Europe.''
Dear Davidski dude, we need the ages of these SNPs for a scientific analysis of their connection to the IE people. If you mean the dates provided, for example, here:
http://www.familytreedna.com/public/r1a/default.aspx
then they are only hypothetical, with a real risk of biased conclusions.
Of course, aDNA is the only way to confirm these relations. For example, Corded Ware has 4,600-year-old aDNA belonging to R1a1, but we don't know the language of that culture!
It is vital to have the correct ages of the newly discovered SNPs. It is clear that South Asians show a lack of SNPs, but their high STR variance is unmatched, and if Farmana's aDNA is R1a1a Z93+ (and other local ones too), then, as I say, the movement of IE languages will be older than suggested.
The latest update also does not rule out the possibility of a South Asian origin of R1a, as:
''(http://www.familytreedna.com/PDF/New_Y_Chromosome_Binary_Markers_Improve_Phylogenetic_Resolution_Within_Haplogroup_R1a1.pdf) it is said that "the origin of R1a1-M198 arguably occurred somewhere between South Asia and Eastern Europe. Potential candidates could be the Eurasian Steppes (Ukraine – Southern Russia – Kazakhstan – Caucasus) or the Middle East." I would add: between South Asia and Eastern Europe there is also Southern Central Asia: Iran, Turkmenistan, Afghanistan, Tajikistan, even Baluchistan. It is also significant that Z280 is absent in India, showing no movement from Europe to India, whereas Z93+ is present in Europe, not only in Romas. You can find it also often in Arabic populations: http://www.familytreedna.com/public/r1a/default.aspx?vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&vgroup=r1a&section=yresults''
We should also not forget that the archaeological data from South Asia do not show any kind of culture-changing intrusion between 4500 BC and ~600 BC.
Have a good time.