This study was designed to determine the carrier frequency of single gene disorders other than β-thalassemia, for which the carrier frequency in the Indian population is already known and screening is usually done through haematological tests. Recessive disorders are considered to be more prevalent in India owing to the practice of endogamy and consanguinity. β-thalassemia has a carrier frequency of 1–17%, with a mean of about 3.3%.1 SMA, Fragile X syndrome and DMD have been noted to be common in all populations, including South Asians, but are difficult to detect with NGS.34−5 However, the carrier frequency of other single gene recessive disorders is not known, and significant differences in prevalence and pathogenic variants have been seen between populations.35−6
CFTR pathogenic and likely pathogenic variants
Nine significant variants were identified in the CFTR gene in this cohort. Of these, only one case had the common p.Phe508del pathogenic variant (11%, n = 1/9).
Two of the pathogenic variants detected in the CFTR gene in this study (p.Arg75Ter and p.Ser549Asn) have been observed before in our laboratory. The remaining six pathogenic variants have not previously been reported in Indians (Table 2). The variants p.Ser549Asn, p.His199Tyr and p.Arg1070Gln have been described by multiple authors, and functional studies have been carried out that classify them as pathogenic as per ACMG criteria. The other four variants (p.Ile1366Phe, p.Cys491Phe, p.Phe1337Val and p.His620Leu) have been documented in association with disease but lack adequate functional studies; they therefore cannot be classified as pathogenic, although they meet the criteria for likely pathogenic variants (Table 2).
Studies on the genetic profile of cystic fibrosis patients in India show high variability: many rare and new variants have been observed, while only a few pathogenic variants (p.Arg1162Ter, p.Met1Thr, c.1161delC, p.Ser549Asp and c.1525-1G > A) have been reported more than once.37−9 This suggests a lack of founder or common mutations in the CFTR gene, and thus a need for full gene sequencing of CFTR in suspected cases in the Indian population. In the present study, no pathogenic variant other than p.Phe508del was present in the ACMG panel for cystic fibrosis.40 Mandal et al. also suggested that, because of heterogeneity in pathogenic variants, a single panel of pathogenic variants cannot be used for diagnosis or carrier testing of CF in India.41
The CFTR c.3854C > T, p.Ala1285Val variant was identified in three individuals. Although it has been reported in the literature in association with CBAVD (congenital bilateral absence of vas deferens),42 it is more likely to represent a common polymorphism, given its high frequency in NGS data from the Indian population (0.5% minor allele frequency in South Asians in gnomAD exomes). This variant was classified as a VUS and not included in the list of significant variants.
A high carrier frequency of cystic fibrosis was noted in this study in a non-Caucasian population. To date, most studies have either been done on Caucasian populations4 or used targeted genotyping2 and hence cannot be considered representative of the full gene carrier frequency. The pathogenic variants in cystic fibrosis can vary according to ethnic origin, as noted in a recent study by Archibald et al. in which an affected fetus had a pathogenic variant outside the listed ACMG panel for cystic fibrosis.34 Lim et al. reported, using the ExAC database, that the pathogenic variants in the CFTR gene in non-Europeans are different from those in people of European descent. They noted that none of the current genetic screening panels or existing CFTR pathogenic variant databases cover a majority of deleterious variants in any geographical region outside of Europe.43 Among the nine significant variants identified in the CFTR gene in the present cohort, only one case had the common p.Phe508del pathogenic variant, i.e. 11% (n = 1/9) of all significant CFTR variants identified. Kapoor and Kabra et al. studied cord blood samples of 955 newborn babies and reported a p.Phe508del carrier frequency of one in 238 (0.42%).2 This gives an estimated frequency of p.Phe508del homozygotes of 1/228,006. However, this cannot be considered representative of the true prevalence of cystic fibrosis in India, as it accounts for only one pathogenic variant. Comparison with p.Phe508del allele frequencies reported from the West shows that p.Phe508del accounts for a lower percentage of pathogenic alleles in Indians (19–44%).43−5 The lack of hot spots suggests the need for full gene sequencing of CFTR in suspected cases in the Indian population.
Cystic fibrosis was thought to be extremely rare in India. However, a growing number of publications in the last two decades have suggested a higher prevalence.41,46 This indicates that CF is much more common in the Indian population than previously believed, with the majority of cases being missed or undiagnosed. CFTR related pathogenic variants may be rarely recognized in Indians because of the different phenotypes (including cystic fibrosis and congenital absence of vas deferens), variable clinical severity, lack of awareness regarding diagnostic modalities, and absence of newborn screening.
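The step from a carrier frequency of one in 238 to a homozygote frequency of 1/228,006 can be reproduced with a short allele-counting calculation. The sketch below is illustrative only: the carrier count of 4 is an assumption inferred from 955/238 ≈ 4 (it is not stated in the text), all carriers are assumed to be heterozygous, and Hardy-Weinberg proportions (random mating) are assumed, which endogamy and consanguinity would violate in the direction of more homozygotes.

```python
from fractions import Fraction

def allele_frequency(carriers: int, screened: int) -> Fraction:
    """Variant allele frequency by direct counting.

    Each heterozygous carrier contributes one variant allele out of the
    2 * screened chromosomes examined.
    """
    return Fraction(carriers, 2 * screened)

def homozygote_frequency(q: Fraction) -> Fraction:
    """Expected homozygote frequency (q squared) under Hardy-Weinberg proportions."""
    return q * q

screened = 955   # newborns screened (from the text)
carriers = 4     # ASSUMPTION: inferred from 955 / 238, not stated in the text

q = allele_frequency(carriers, screened)
hom = homozygote_frequency(q)

carrier_one_in = screened / carriers   # about 1 in 238.75
hom_one_in = float(1 / hom)            # about 1 in 228,006

print(f"carrier frequency: 1 in {carrier_one_in:.2f} ({carriers / screened:.2%})")
print(f"allele frequency q: {float(q):.5f}")
print(f"expected homozygotes: 1 in {hom_one_in:,.0f}")
```

Exact fractions are used so that the result is not affected by intermediate rounding; squaring the rounded figure of 1/476 instead would give a slightly different denominator.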
GJB2 c.231G > A, p.Trp77Ter and c.71G > A, p.Trp24Ter
The pathogenic variants identified in GJB2 represent the common Indian pathogenic variants that have been previously reported. Ram Shankar et al. studied the pathogenic variants in the GJB2 gene in Indian patients with deafness and found p.Trp24Ter to be the most common pathogenic variant in India, with a likely founder effect based on haplotype analysis.3 In addition, they documented two other common pathogenic variants, p.Trp77Ter and IVS1 + 1G > A.
These variants differ from the common pathogenic variants identified in the Western population (c.35delG)47 and in the Japanese and Korean populations (c.235delC and p.Val37Ile).48−9
SLC26A4 related hearing loss
Hearing loss due to SLC26A4 has been reported as the third most common cause of hearing loss in a study of a pan-ethnic population.50 In this study, two of the four significant variants reported have been previously described in individuals of Indian ethnic origin: p.Arg409Pro51−2 and p.Ile490Leu.53 The other variants found in our study are p.Gly334Val, which has been described chiefly in people of Mediterranean origin,54 and p.Phe335Leu, a common variant reported worldwide.55
Carrier screening and prenatal diagnosis for a disorder like hearing loss, which impairs quality of life, can be perceived differently by families in different countries. Parental perceptions of congenital hearing loss in Indian culture, where resources are scarce, have been pointed out previously by Nahar et al.56 While some families are interested in using the information to help with management, planning and emotional adjustment to the birth of a child with deafness, others opt to discontinue a pregnancy with an affected foetus, especially if financial resources are scarce.
GBA c.1448T > C, p.Leu483Pro and c.866G > C, p.Gly289Ala
The p.Gly289Ala and p.Leu483Pro variants were observed in one individual in the present cohort. Ankleshwari et al.28 studied 33 Indian patients with Gaucher disease and identified p.Leu483Pro as the most common pathogenic variant (60.6%, n = 20/33). In addition, they reported p.Gly289Ala as a novel pathogenic variant. The variant most commonly observed in the Western population (p.Asn370Ser) is observed less commonly in India.
GAA c.1933G > A, p.Asp645Asn variant
We observed three individuals to be carriers of this variant in the GAA gene. This variant was first reported in 1998 by Huie et al., who demonstrated low enzyme activity with this pathogenic variant in vitro and in vivo.29 Subsequently, this pathogenic variant has been reported in patients affected with infantile onset Pompe disease in several studies.57−8 This variant lies in exon 14, which has been reported to be a hot spot for this gene.59 An Indian study, however, reported no hot spots for this gene.60
OCA2 c.1580T > G, p.Leu527Arg variant
This variant was observed in the heterozygous state in two individuals in our cohort. It was reported for the first time by Jowerek et al.31 in a Pakistani albino family with some pigmentation of hair. They reported that this pathogenic variant affects a highly conserved amino acid residue in the transmembrane 8 domain of the protein and segregated with affected members in the family. They noted that this variant had not been reported in any other ethnic population to date and hypothesized that it may be specific to Pakistani albino individuals.
AGXT c.302T > C, p.Leu101Pro variant
We observed one carrier (belonging to the Punjabi community) of this variant in our cohort. This variant was reported for the first time by Williams et al.,43 who demonstrated that the mutant protein had less than 1% of normal activity in vitro. Subsequently, a study by Chanchlani et al. found the p.Leu101Pro variant in the homozygous state in three patients with primary hyperoxaluria type 1.32 All three patients were from north India or Pakistan. The authors suggested that this may be a founder pathogenic variant in India, although larger studies and haplotype analysis are required.
ASPA c.902T > C, p.Leu301Pro
One individual was found to be a carrier of this variant. This variant has been reported in the literature in a patient of Indian ethnicity with classical Canavan disease and raised urine N-acetyl aspartate.61 While functional studies have not been done, on the basis of the reported literature this variant could be reclassified as likely pathogenic using ACMG criteria. To the best of our knowledge, this variant has not been reported elsewhere in the world and could represent an Indian pathogenic variant.
ACADM c.811G > A, p.Gly271Arg
This is a well reported pathogenic variant in the ACADM gene worldwide. It was observed in one individual in this study. The c.985A > G pathogenic variant commonly seen in the West, believed to be a founder pathogenic variant in Caucasians originating from an ancient Germanic tribe, was not observed in the present cohort.62
Disorders like AR Polycystic Kidney Disease, Methyl Malonic aciduria, Galactosemia, Smith-Lemli Opitz Syndrome, Albinism type II, Cystic megalencephalic leukoencephalopathy, Gaucher disease, Phenylketonuria and Junctional Epidermolysis bullosa can be expected to be common in the Indian population, as at least two carriers of each were detected among the 200 individuals screened.
Our group has identified a number of disorders with founder mutations in the Agarwal community. Carriers of only two of these (Calpainopathy and Megalencephalic leukodystrophy with cysts) were identified in the current study, and the mutations detected are not the common ones noted in the Agarwal community. However, only 28 individuals in the cohort belonged to the Agarwal community.
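Two detections among 200 individuals correspond to a point estimate of 1%, but with a sample of this size the uncertainty around that estimate is wide. As a rough illustration, the sketch below computes a 95% Wilson score interval for 2/200. The choice of the Wilson interval and the function name are assumptions made here for illustration; this is not an analysis reported in the study.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Two carriers among 200 screened individuals (figures from the text).
low, high = wilson_interval(2, 200)
print(f"point estimate: {2 / 200:.1%}")
print(f"approximate 95% interval: {low:.2%} to {high:.2%}")
```

Under these assumptions the interval runs from roughly 0.3% to 3.6%, so a count of two in 200 is compatible with carrier frequencies from about 1 in 370 to about 1 in 28.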