Abstract
Do genetic and linguistic affinities necessarily go hand in hand? The present work explores this dimension of population structure using three evolutionarily important TaqI sites (TaqI A, TaqI B, and TaqI D) at the dopamine receptor D2 (DRD2) locus. For the first time, DNA samples from 612 unrelated individuals belonging to 11 Indo-European-speaking tribal population groups of Gujarat, western India, have been analyzed for these three sites. All three sites are polymorphic, with the greatest interpopulation variation seen at the TaqI B site. The average heterozygosity for the haplotype system is high in the populations under study. Most of the populations share six of the eight haplotypes, pointing toward underlying genetic uniformity, which is further supported by regression analysis of heterozygosity on genetic distance. The frequency of the ancestral haplotype B2D2A1 ranges between 1.9% and 15.9%. Linkage disequilibrium between the TaqI B and TaqI D sites and between the TaqI B and TaqI A sites is statistically significant in all but one population. Our findings reveal strong affinities between Indo-European-speaking tribal groups of Gujarat and Dravidian-speaking tribal groups of South India, suggesting that genetic affinities may not necessarily be dependent on linguistic similarities.
Introduction
The tribal population groups are possibly the original inhabitants of India (Thapar, 1966; Ray, 1973). Their dialects have been classified under the Austro-Asiatic (AA), Dravidian (DR), and Tibeto-Burman linguistic families (Kosambi, 1991). However, several tribes in the western and northwestern regions of India speak languages belonging to the Indo-European (IE) linguistic family, and the tribes of Gujarat are no exception. It is said that the tribes of Gujarat who speak IE languages are probably of proto-Australoid racial affinity, belonging to the earliest group of settlers in India, who must have interacted with people of various racial affinities, such as Mediterraneans, Alpines, Dinarics, and others, passing through their habitats while entering India. This would have changed the racial constitution of the Gujarat tribes to some degree (Fuchs, 1964).
There is, thus, considerable debate about the evolutionary histories of the Indian tribes. Some consider the proto-Australoid tribes (speaking AA languages) to be the earliest settlers of the subcontinent (Risley, 1915; Rapson, 1955; Thapar, 1966; Gadgil et al., 1998; Pattanayak, 1998; Chakrabarti et al., 2002; Basu et al., 2003), whereas others consider DR speakers as the original inhabitants (Buxton, 1925; Sarkar, 1958). Several scholars have proposed various routes of entry for the different linguistic families in the Indian subcontinent (Guha, 1935; Renfrew, 1987; Ruhlen, 1991; Diamond, 1997). Efforts have been made to study ethnic affinities of Indian populations based on a number of anthropological variables (Risley, 1908; Guha, 1944; Sarkar, 1954); serological and biochemical genetic markers (Malhotra and Vasulu, 1993; Bhasin et al., 1994; Bhasin and Walter, 2001); and molecular genetic markers: autosomal, mtDNA, and Y-chromosomal markers (Roychoudhury et al., 2000; Bamshad et al., 2001; Majumder, 2001; Mukherjee et al., 2001; Kivisild et al., 2003; Vishwanathan et al., 2004).
One such molecular marker, dopamine receptor D2 (DRD2), has over the past decade emerged as one of several genes increasingly studied for both their functional and evolutionary significance. This gene, spanning over 270 kb and mapped to locus 11q22.3-q23.1 (Grandy et al., 1989), encodes the D2 subtype of dopamine receptor. It is one of the five types of dopamine receptors, encoded by five separate genes, expressed in the central nervous system. DRD2 is of special interest as it is a target site of many neuropsychiatric drugs, and is thus of prime concern in the fields of neurology, psychiatry, and endocrinology, among others. It is a strong candidate gene implicated in behavioral disorders, alcoholism, and other substance use disorders (Neiswanger et al., 1995; Noble et al., 1993; Gejman et al., 1994; Blum et al., 1998; Noble, 1998; Lawford et al., 2000; Dalley et al., 2007). Beginning with detection of the TaqI A site (Grandy et al., 1989), several other restriction polymorphism sites have been identified in the region encompassing the coding sequence of this gene (Iyengar et al., 1998), of which the TaqI A site is the one most frequently studied in association studies (Grandy et al., 1989; Kidd et al., 1996; Iyengar et al., 1998).
Over the years haplotype analysis has emerged as a powerful tool for gaining insights into the population dynamics as it gives more meaningful insights both in evolutionary studies and disease-association studies than unlinked markers (Tishkoff et al., 1996; The International HapMap Consortium, 2005; Plagnol and Wall, 2006). With reference to the DRD2 locus, three single-nucleotide polymorphisms are used to construct haplotypes important for analysis of evolutionary relationships in the order of TaqI B (Hauge et al., 1991), TaqI D (Parsian et al., 1991), and TaqI A (Grandy et al., 1989). The TaqI B site (rs1079597; G→A mutation) is located 913 bp upstream of the initiation codon in exon 2, the TaqI D site (rs1800498; C→T mutation) is located in intron 2, and the TaqI A site (rs1800497; T→C mutation) is located 10,542 bp downstream of the termination codon. B2, D2, and A1 alleles are the ancestral alleles (Castiglione et al., 1995; Kidd et al., 1996; Kidd et al., 1998), and B1, D1, and A2 are the corresponding derived alleles.
The global survey of variation in the DRD2 gene (Kidd et al., 1998) reiterated the findings from nuclear DNA studies that African populations have significantly higher genetic variation than non-African populations (Bowcock et al., 1994; Tishkoff et al., 1996). It also brought into focus the importance of the three DRD2 sites in studying the genetic structure of human populations. The study documented diversity at the DRD2 locus in several populations from different geographical areas of the world but not from India. Studies on the DRD2 locus done in India in the last decade have been mainly restricted to South Indian DR-speaking groups (Vishwanathan et al., 2003; Bhaskar et al., 2008; Prabhakaran et al., 2008; Saraswathy et al., 2009c) and northeastern populations (Saraswathy et al., 2009b). However, to the best of the authors' knowledge, no study has been conducted so far on the DRD2 locus in the tribal groups of northwestern and western India who speak languages belonging to the IE linguistic family. This gap is important because the tribal populations of the northwestern and western regions of India constitute 29.31% of the total tribal population of India (Census of India, 2001). Keeping this in view, the present study aims to assess genetic heterogeneity with reference to the DRD2 locus among 11 tribes of Gujarat, western India, against the backdrop of the continuing debate over the genetic and linguistic similarities of the Indian tribes. Sampling locations of the tribes are given in Figure 1.

Figure 1. Sampling locations of the tribes of Gujarat (western India) under study.
A brief ethnographic profile of each of these 11 tribes follows.
Varli
The term “Varli” is derived from the Sanskrit word “Varal,” meaning uplander. The Varlis generally inhabit hilly terrain and are famous for their ancient Indian folk art tradition of painting. Historians believe that the Varli tradition can be traced back to the Neolithic period between 3000 BC and 2500 BC, indicating the antiquity of the tribe. The Varlis migrated to South Gujarat from the Konkan area of Maharashtra state. They are physically tall, dark, slim, and well built, with features described as proto-Australoid. They were traditionally hunters and gatherers, but now the majority of tribesmen own small land holdings. They have four endogamous divisions, namely, Shuddha, Murdes, Davars, and Nihirs. Each of the four divisions has exogamous clans. Their population size in Gujarat is 255,271 (Census of India, 2001).
Konkana and Dangi Konkana
The Konkanas in Gujarat are immigrants from the western coastal strip of Maharashtra, western India. They are dark, short-statured people and show ethnic affinities with Varlis. Konkanas largely depend upon agriculture, agriculture labor, fishing, and collection of minor forest products for their subsistence. The community is divided into a number of exogamous units like Mahala, Gavid, Gavit, and Gaikwad. They practice group endogamy and clan exogamy. The Konkanas are largely confined to the Valsad and Dang districts of Gujarat. The Konkana of Valsad district are known as Konkana, whereas the Konkana of district Dang are known as the Dangi Konkana. Geographical separation between these two groups has led to the formation of two distinct Mendelian isolates, namely, Konkana and Dangi Konkana. Therefore, in the present study these two groups are studied separately. According to the Census of India (2001), the total population of Konkana in Gujarat is 329,496, whereas that of Dangi Konkana is 50,201.
Kolgha
The Kolghas are classified as a primitive tribal group in Gujarat state of India. The Census of India (2001) records their population size as 48,000. The spoken dialect of Kolgha has a strong admixture of IE and DR language family words. They are mainly dependent on labor, cattle grazing, and tanning of animal hides for their subsistence. Kolghas are divided into several exogamous clans.
Dhodia
The Dhodias are one of the largest tribal groups of Gujarat state (western India), with a total population size of 586,108 (Census of India, 2001). The origin of the Dhodias is shrouded in mystery and has been given different interpretations. Nevertheless, different accounts suggest that they trace their roots to the area of western Gujarat and northern Maharashtra. They are mainly agriculturists. The Dhodias are subdivided into a number of equally ranked exogamous clans.
Dubla
The Dublas are also known as Halpati. They are one of the predominant tribal groups in Gujarat state (western India), with total population of 596,865 individuals (Census of India, 2001). Their origin and ethnic affinities are unclear. Land is their main economic resource, but they are also engaged in tailoring work, shop keeping, government services, diamond-cutting, and polishing. The tribe is subdivided into 20 exogamous clans. They are monogamous and Hindu by religion.
Vasava and Dangi Bhil
The Bhils are one of the largest tribal communities in India. Bhils do not have an exact known history of their origin, but the most reasonable account is that they first originated in the Aravalli hills of northwestern India and then spread to northern, western, and central India. At present, Bhils are spread over contiguous area covering four large Indian states, namely, Gujarat, Maharashtra, Rajasthan, and Madhya Pradesh. Their physical features indicate that they belong to the proto-Australoid racial stock (Fuchs, 1964). Traditionally, they were the inhabitants of forest, practicing hunting and gathering, but now most of them are settled agriculturists and some of them are employed in various government offices. There are several subdivisions among Bhils. The Vasava community constitutes one of the major groups of Bhils in Gujarat (Enthoven, 1975). Vasavas are an endogamous community maintaining clan and village exogamy. They are monogamous and follow patrilineal, patrilocal, and patriarchal rules of marriage and kinship.
The Dangi Bhils are another subdivision of Bhils inhabiting the Dang district of Gujarat. They are geographically separated from other subgroups of Bhils and form a separate endogamous group. Total population size of Dangi Bhils in Dang District is 45,901 (Census of India, 2001).
Gamit
The Gamits are migrants from south of Konkan coast near Goa (Enthoven, 1920). Their physical features qualify them to be categorized under proto-Australoid group (Vyas et al., 1958). Gamits are an endogamous group having a number of exogamous clans. They are mostly monogamous, but a few cases of polygynic marriages are also seen. They are Hindu by religion. Gamits are mainly agriculturists, but some are employed in white-collar jobs and others work as laborers. They are also expert in basket making, weaving, carpentry, and house construction. According to the Census of India (2001) Gamits number 354,362.
Chaudhary and Mota Chaudhary
The Chaudharys are one of the indigenous tribal groups of Gujarat. The traditional occupation of the tribe is agriculture, followed by animal husbandry. Some are employed in government departments, and others are engaged in different trades. Their spoken dialect belongs to the IE family of languages. Their origin has been subjected to various interpretations. However, Shah (1964) points out that they probably migrated from the northern parts of Gujarat. Chaudharys are divided into five subgroups, namely, Mota Chaudhary, Nana Chaudhary, Pavagadhi Chaudhary, Valavada Chaudhary, and Bonda Chaudhary. Marriages between subgroups are permitted among four of the five groups; the Mota Chaudhary are the exception. For this reason, Mota Chaudhary has been considered a separate Mendelian population in the present study. The total population of Chaudharys in Gujarat has been enumerated as 282,392 (Census of India, 2001).
Materials and Methods
The present study was conducted between December 2007 and February 2009. The study population comprised 612 individuals belonging to 11 tribes. Sample sizes subdivided by tribes and birth-place are given in Table 1.
Five milliliters of intravenous blood was collected from each unrelated subject by a trained medical practitioner after taking informed written consent from the subjects. After blood collection, DNA was isolated using the salting-out method (Miller et al., 1988). The three autosomal codominant biallelic DRD2 sites—TaqI A, TaqI B, and TaqI D—were amplified using the standard primers and protocols (Castiglione et al., 1995; Kidd et al., 1996; Iyengar et al., 1998). The polymerase chain reaction products were then digested with the restriction enzyme TaqI as per the manufacturer's recommended conditions. Electrophoresis was subsequently carried out in 2% agarose gel stained with ethidium bromide for observation.
Allele frequency estimates for the three DRD2 sites were made by direct gene counting. The assumption of Hardy-Weinberg equilibrium was tested using chi-square goodness-of-fit test. The average heterozygosity was estimated by Nei's method (1973). Within each population the haplotype frequencies were computed by the maximum likelihood method from the multisite marker typing data, using the program HAPLOPOP (Majumdar and Majumder, 1999). The standardized pairwise linkage disequilibrium value D′ was computed for each pair of markers (Hill, 1974). A regression analysis of heterozygosity on genetic distance (Harpending and Ward, 1982) was carried out to understand the population structure of the tribes under study. A dendrogram was constructed using the neighbor-joining method (Saitou and Nei, 1987) to identify affinities among the tribal populations of Gujarat and South India, and European populations.
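The first two steps above (direct gene counting and the chi-square goodness-of-fit test for Hardy-Weinberg equilibrium) can be sketched as follows. This is a minimal illustration rather than the authors' actual computation: the function names and the genotype counts in the usage note are hypothetical, and the sketch assumes a codominant biallelic site at which both alleles are observed.

```python
def allele_freq_by_counting(n_11, n_12, n_22):
    """Frequency of allele 1 at a codominant biallelic site by direct gene counting.

    n_11, n_12, n_22 are counts of the two homozygotes and the heterozygote.
    """
    n = n_11 + n_12 + n_22
    # Each homozygote carries two copies of allele 1, each heterozygote one.
    return (2 * n_11 + n_12) / (2 * n)


def hwe_chi_square(n_11, n_12, n_22):
    """Chi-square goodness-of-fit statistic for Hardy-Weinberg proportions (df = 1).

    Assumes both alleles are present, so that no expected count is zero.
    """
    n = n_11 + n_12 + n_22
    p = allele_freq_by_counting(n_11, n_12, n_22)
    q = 1.0 - p
    observed = [n_11, n_12, n_22]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution at df = 1 and alpha = 0.05.
CHI2_CRITICAL_DF1 = 3.841
```

For example, with hypothetical counts of 30, 40, and 30 for the three genotypes, `allele_freq_by_counting(30, 40, 30)` gives 0.5 and `hwe_chi_square(30, 40, 30)` gives 4.0, which exceeds `CHI2_CRITICAL_DF1` and would indicate a departure from Hardy-Weinberg proportions at the 0.05 level.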
Results
Allele frequencies at the three DRD2 sites in the 11 tribes of Gujarat are given in Table 2.
B2, D2, and A2 are positive alleles; B1, D1, and A1 are negative alleles.
χ²-values are nonsignificant at df = 1 and p < 0.05.
2n, number of chromosomes tested.
All three sites at the DRD2 locus were polymorphic in the tribal groups of Gujarat. Of the three sites, the greatest variation in allele frequency was seen at the TaqI B site. The allele frequency estimates at these sites were used in a goodness-of-fit chi-square test to determine whether the genotype frequencies in the 11 tribes of Gujarat depart from Hardy-Weinberg proportions. All the genotype frequencies were in agreement with their respective Hardy-Weinberg expectations. The average heterozygosity (Ĥ) among the tribes of Gujarat ranged between 37.99% (Mota Chaudhary) and 47.58% (Dangi Bhil), indicating a high level of diversity.
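As a rough illustration of how an average heterozygosity of this kind can be obtained, the sketch below computes the expected heterozygosity 1 − Σp² at each biallelic site and averages over sites. It assumes that the reported average is taken over the three DRD2 sites, which the text does not state explicitly; the function names and the allele frequencies in the usage note are hypothetical.

```python
def site_heterozygosity(p):
    """Expected heterozygosity at a biallelic site: 1 - (p^2 + q^2), with q = 1 - p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def average_heterozygosity(allele_freqs):
    """Mean expected heterozygosity over sites (assumed here to be the three DRD2 sites).

    allele_freqs holds the frequency of one allele at each site.
    """
    values = [site_heterozygosity(p) for p in allele_freqs]
    return sum(values) / len(values)
```

With hypothetical frequencies of 0.67, 0.5, and 0.3 at the three sites, `average_heterozygosity([0.67, 0.5, 0.3])` is about 0.454. Because a biallelic site cannot exceed 0.5, values in the range reported above lie close to the maximum attainable.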
Table 3 presents computed values of haplotype frequencies among the tribal populations of Gujarat.
Only the Dubla tribe showed all eight possible three-site DRD2 haplotypes, and the Dangi Konkana and Vasava tribes showed seven of the eight. All the tribal populations shared at least five of the eight possible haplotypes. All the populations exhibited the highest frequencies for the same set of three haplotypes: B2D2A2, B2D1A2, and B1D2A1. The frequency of the ancestral haplotype B2D2A1 ranged between 1.9% in Dangi Bhil and 15.9% in Mota Chaudhary. The two haplotypes having a combination of ancestral allele B2 and derived allele A2 (B2D1A2 and B2D2A2) together constituted the majority of haplotypes in nearly all the study populations, ranging from 46.4% in Kolgha to 67% in Dhodia. Haplotypes with derived alleles at the TaqI B and TaqI D sites (B1D1A1 and B1D1A2) were present in 5 of the 11 tribal populations under study. Both were present at low frequencies in Dubla and Vasava. The Dangi Konkana and Gamit tribes showed low frequencies of the B1D1A2 haplotype, while the Chaudhary tribe exhibited a very low frequency of the B1D1A1 haplotype. Similarly, the B1D2A1 haplotype, considered to be derived from the ancestral haplotype by a point mutation, was present at high frequency (23-38.2%) in all the populations except Mota Chaudhary (7.9%).
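The haplotype space referred to above can be enumerated directly. The short sketch below lists the eight three-site haplotypes in the order TaqI B, TaqI D, TaqI A and counts how many derived alleles each carries, using the ancestral (B2, D2, A1) and derived (B1, D1, A2) assignments given in the Introduction. The function names are hypothetical and the code is illustrative only.

```python
from itertools import product

SITES = ("B", "D", "A")
ANCESTRAL = {"B": "B2", "D": "D2", "A": "A1"}
DERIVED = {"B": "B1", "D": "D1", "A": "A2"}


def all_haplotypes():
    """Return the 2 x 2 x 2 = 8 possible haplotypes, ancestral haplotype first."""
    choices = [(ANCESTRAL[s], DERIVED[s]) for s in SITES]
    return ["".join(combo) for combo in product(*choices)]


def derived_count(haplotype):
    """Number of derived alleles carried by a haplotype such as 'B1D2A1'."""
    # Each allele name is two characters long, e.g. 'B1', 'D2', 'A1'.
    alleles = [haplotype[i:i + 2] for i in range(0, 6, 2)]
    return sum(allele == DERIVED[site] for site, allele in zip(SITES, alleles))
```

Here `all_haplotypes()[0]` is the ancestral haplotype `B2D2A1`, and `derived_count("B1D2A1")` is 1, consistent with that haplotype differing from the ancestral one by a single point mutation.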
Data on pairwise linkage disequilibrium values (D′) for the three DRD2 sites are shown in Table 4.
Significant at df = 1 and p < 0.05.
The values were generally low (<0.2) for all the comparisons. All the populations, except Mota Chaudhary, showed significant linkage disequilibrium between TaqI B and TaqI D sites and between TaqI B and TaqI A sites. Similarly, linkage disequilibrium between TaqI A and TaqI D sites was found to be significant in 6 of the 11 studied populations.
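For readers unfamiliar with the statistic, the standardized disequilibrium D′ for a pair of biallelic sites can be sketched as below: D is the difference between the observed two-site haplotype frequency and the product of the allele frequencies, and D′ scales D by its maximum attainable magnitude given those allele frequencies. This is the textbook definition, not necessarily the exact estimation procedure used in the study; the function name and the frequencies in the usage note are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Standardized linkage disequilibrium D' for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele a at site 1 and allele b at site 2
    p_a, p_b: frequencies of allele a and allele b
    """
    d = p_ab - p_a * p_b
    if d == 0:
        return 0.0
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        # A monomorphic site leaves D' undefined; 0.0 is returned here by convention.
        return 0.0
    return d / d_max
```

With hypothetical allele frequencies of 0.5 at both sites, a haplotype frequency of 0.3 gives a D′ of about 0.2 and a haplotype frequency of 0.1 gives about −0.6; the sign only reflects which alleles are labeled a and b.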
Harpending and Ward (1982) explained that local fluctuation in the frequencies of biochemical markers is caused by genetic drift and that gene flow among groups counteracts the effect of drift, so that local structure is driven almost exclusively by migration and drift. They showed that the genetic distance (rii) of the ith subpopulation from a hypothetical centroid of all subpopulations is related to the average heterozygosity (Hi) of the ith subpopulation. If gene flow from outside is uniform, then Hi = b(1 − rii), with the absolute value of b being equal to the average heterozygosity of the pooled population (Ĥ).
Regression analysis: Hi = b(1 − rii); Hi plotted against 1 − rii through the origin has t = −0.6285; df = 9, p > 0.05.
Regression coefficient through origin (b) = 0.4444 ± 0.0164.
Average heterozygosity in pooled population (Ĥ) = 0.4442 ± 0.0124.
Average heterozygosity in the pooled population did not differ significantly from the regression coefficient.
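Because the relation Hi = b(1 − rii) has no intercept, the coefficient b is obtained by least squares through the origin. The sketch below shows that estimator together with a conventional standard error; it is a generic illustration with hypothetical function and variable names, and it does not attempt to reproduce the specific t-test reported above, whose exact construction is not described in the text.

```python
import math


def regression_through_origin(x, y):
    """Least-squares slope b for the model y = b * x, with its standard error.

    x: values of 1 - r_ii for each subpopulation
    y: average heterozygosity H_i for each subpopulation
    """
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    b = sum_xy / sum_xx
    # Residual variance uses n - 1 degrees of freedom (one estimated parameter).
    residual_ss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(residual_ss / (n - 1) / sum_xx)
    return b, se_b
```

Under the model, the fitted b is then compared with the average heterozygosity of the pooled population; a nonsignificant difference is read as consistent with uniform gene flow from outside.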
Discussion
In India the frequency of ancestral allele B2 at the TaqI B site ranges from 36.67% in the Onge tribe (Bhaskar et al., 2008) to 91% in the Toda tribe (Vishwanathan et al., 2003). The median value in Indian populations is estimated to be 0.68 (Vishwanathan et al., 2003; Bhaskar et al., 2008; Prabhakaran et al., 2008; Saraswathy et al., 2009a, 2009b, 2009c). Further, the median value computed for the B2 allele (0.67) in the present study is close to its corresponding value of 0.68 computed from various available studies on DR-speaking tribal populations of South India (Vishwanathan et al., 2003; Bhaskar et al., 2008; Prabhakaran et al., 2008; Saraswathy et al., 2009c), but not close to the median value of 0.9 and 0.85 obtained for African and European populations, respectively (Kidd et al., 1998), and the median value of 0.81 computed for IE-speaking North Indian population groups (Saraswathy et al., 2009a). Similarly, the distribution of A1 and D2 alleles in the tribal groups of Gujarat is comparable to their distribution in the DR-speaking tribal groups of southern India.
Haplotype analysis reveals that most of the study populations share six of the eight haplotypes, pointing toward genetic uniformity of these populations at the DRD2 locus. Genetic structure analysis in the present study, based on regression of heterozygosity on genetic distance, also indicates that none of these 11 tribal groups is either very isolated or overtly admixed. This is further substantiated by the underlying social and cultural homogeneity among them, which is in accordance with the historical past of these groups. The range of frequency observed for the ancestral haplotype B2D2A1 (1.9-15.9%) in the Gujarat tribal populations is not as high as that observed for African populations (Kidd et al., 1998) and some of the South Indian tribal populations (Saraswathy et al., 2009c), but is comparable to the findings from other South Indian populations, where the frequency ranges between 2.1% in Irulas (Vishwanathan et al., 2003) and 16.6% in Malayali (Prabhakaran et al., 2008). This is in contrast to the findings among European populations, where the frequency of the ancestral haplotype ranges mostly between 0% and 4% (Kidd et al., 1998), and among IE-speaking North Indian populations (Saraswathy et al., 2009a), where the frequency of the B2D2A1 haplotype is less variable than in the tribes of Gujarat. Our findings, therefore, do not rule out the possibilities suggested by Prabhakaran et al. (2008) that (1) the ancestral haplotype could have arisen in India and then been carried to other parts of the world and/or (2) this haplotype is not specific to Africa. Further, the most recently derived haplotypes, B1D1A1 and B1D1A2, are present in nearly all of the South Indian DR-speaking populations (Vishwanathan et al., 2003; Bhaskar et al., 2008; Prabhakaran et al., 2008; Saraswathy et al., 2009c). These two haplotypes are not found in any of the North Indian populations (Saraswathy et al., 2009a) or in most of the European populations (Kidd et al., 1998).
Notably, these two derived haplotypes are present in 5 of the 11 tribal groups of Gujarat studied here, implying that these groups could have been part of the older population substratum of the subcontinent.
Moreover, it is a generalized view that linguistic differences in India account for much of the genetic differences (Parpola, 1975; Cavalli-Sforza et al., 1994; Roychoudhury et al., 2001). It may be worth mentioning here that the DR speakers were possibly widespread throughout India before the arrival of IE speakers some 3500 years before present (Poliakov, 1974; Renfrew, 1987; Thapar, 2003), when they were forced to retreat southward due to IE dominance, after an initial period of admixture and language adoption. Recently, Basu et al. (2003) also found strong genetic similarities between the IE-speaking Halba tribal group of Central India and DR-speaking tribal groups of South India. The neighbor-joining tree constructed from the genetic distance matrix reveals strong affinities of IE-speaking tribal population groups of Gujarat with DR-speaking South Indian tribal groups (Fig. 2).

Figure 2. Neighbor-joining tree depicting genomic relationships among tribal populations of Gujarat and South India, and European populations.
Finally, it will not be out of place to emphasize that the original composition of proto-Australoid genes in the tribes of Gujarat might have been diluted to some extent through the differential contributions from various incoming groups from the north and the northwestern corridor of India. In the process of cultural assimilation and absorption, these tribal groups of western India adopted languages that belong to the IE linguistic family, and the proximity of IE-speaking tribes of Gujarat at the DRD2 locus with the DR-speaking tribal populations of South India suggests that genetic affinities may not necessarily be dependent on linguistic similarities.
Acknowledgments
We thank all the individuals who volunteered to provide blood samples for this study on genetic variation. We would also like to extend our gratitude to the Valsad Raktdan Kendra (Centre for Blood Donation, Valsad), Valsad, Gujarat, for helping us collect blood samples. The authors wish to thank the Department of Biotechnology, New Delhi, for providing the necessary grant (vide letter BT/PR9840/MED/12/366/2007) in support of the project. We also wish to thank the Department of Anthropology, University of Delhi, Delhi, for granting the ethics clearance required to obtain the grant from the Department of Biotechnology.
Disclosure Statement
No competing financial interests exist.
