Wednesday, May 15, 2013

South Asian R1a in the 1000 Genomes Project

After a recent update, the 1000 Genomes Project now includes 62 individuals of South Asian origin belonging to Y-DNA haplogroup R1a-M17. Their full Y-chromosome sequences have been analyzed by Semargl and Maximus (aka the YFull project), with some interesting but not unexpected results:

- All individuals belong to R1a-Z93, which appears to totally dominate South Asian R1a-M17.

- A single Punjabi from Lahore, northeastern Pakistan, is ancestral for the Z94 mutation, which is just below Z93. All the other individuals are derived for Z94.

- Six individuals - of Punjabi, Bangladeshi and Gujarati origin - are ancestral for L657 and Z2124, the two main mutations immediately below Z94.

- All individuals of South Indian and Sri Lankan origin are derived for L657 or Z2124.

- Based on this sample, there appears to be no substructure along ethnic or geographic lines within South Asian R1a-M17 derived for L657 or Z2124.

Thus, it seems the SNP diversity of South Asian R1a-M17 is low, and decreases from Pakistan, North India and Bangladesh to South India and Sri Lanka. In comparison, there are only 12 European R1a individuals in the 1000 Genomes sample, yet they represent all the major subclades of this haplogroup: R1a-Z283, R1a-Z93 and R1a-L664. Because the European sample is much smaller than the South Asian one, sampling bias can't explain the more diverse result from Europe.

The lack of substructure along ethnic and geographic lines within South Asian R1a-L657 and R1a-Z2124 looks unusual, especially considering the caste system in India, and needs to be verified with more extensive sampling. However, if this outcome holds up, it'll suggest that paternal gene flow across South Asia has not been restricted by the caste system or geography. Then again, it could mean the caste system appeared after R1a-L657 and R1a-Z2124 arrived in South India via massive population movements from the north.

Below are all the results in as much detail as the current R1a SNP tree allows. Key: BEB - Bengali from Bangladesh; GIH - Gujarati from Houston, Texas; ITU - Indian Telugu from the UK; PJL - Punjabi from Lahore, Pakistan; STU - Sri Lankan Tamil from the UK.

Z93+ Z94-
PJL - 1

Z94+ L657- Z2124- Z96-
BEB - 2
PJL - 3
GIH - 1

L657+,Y2+ etc.
1) Y9 (inc. Y7)
GIH - 7
STU - 4
ITU - 4
PJL - 8
BEB - 2

2) Y4+, Y8+, Y28+ (inc. Y6+)
GIH - 6
ITU - 6
PJL - 2
STU - 6
BEB - 5

Z2125+ (Z2124+ Z2122- Z2123-)
PJL - 1

Z2123+ (Z2124+ Z2122-, Z2125-)
PJL - 2
STU - 3
BEB - 1
ITU - 6
GIH - 2
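
For readers who want to check the per-population pattern themselves, here is a minimal Python sketch that tallies the listing above. The counts are transcribed by hand from that listing. The `GROUPS` structure, the `tally` function and the "basal" flag (meaning ancestral for both L657 and Z2124) are illustrative names of my own, not anything produced by YFull or the 1000 Genomes Project.

```python
from collections import defaultdict

# Counts transcribed from the listing above.
# Each entry: (group label, is_basal, {population code: count}).
# "is_basal" is an illustrative flag for lineages ancestral for both L657 and Z2124.
GROUPS = [
    ("Z93+ Z94-", True, {"PJL": 1}),
    ("Z94+ L657- Z2124- Z96-", True, {"BEB": 2, "PJL": 3, "GIH": 1}),
    ("L657+ Y9", False, {"GIH": 7, "STU": 4, "ITU": 4, "PJL": 8, "BEB": 2}),
    ("L657+ Y4+ Y8+ Y28+", False, {"GIH": 6, "ITU": 6, "PJL": 2, "STU": 6, "BEB": 5}),
    ("Z2125+", False, {"PJL": 1}),
    ("Z2123+", False, {"PJL": 2, "STU": 3, "BEB": 1, "ITU": 6, "GIH": 2}),
]


def tally(groups):
    """Return {population: (basal count, total count)} for the given groups."""
    basal = defaultdict(int)
    total = defaultdict(int)
    for _label, is_basal, counts in groups:
        for pop, n in counts.items():
            total[pop] += n
            if is_basal:
                basal[pop] += n
    return {pop: (basal[pop], total[pop]) for pop in sorted(total)}


if __name__ == "__main__":
    for pop, (b, t) in tally(GROUPS).items():
        print(f"{pop}: {b} of {t} ancestral for both L657 and Z2124")
```

With the counts as transcribed, only PJL, BEB and GIH carry any basal lineages, while ITU and STU carry none. That matches the pattern described in the bullet points.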