The prevalence of diabetes in Pakistan has risen sharply over the years. According to the International Diabetes Federation, 26.7% of adults in Pakistan have diabetes, and this figure may underestimate the true prevalence because a large number of patients remain undiagnosed (31). It was recently reported that the incidence of diabetes is much higher in urban than in rural areas of Pakistan (32). Therefore, the participants in this study were drawn from the urban area of Karachi, Pakistan.
In our case-control study, we first analyzed the HVS2 region of the mitochondrial genome by Sanger sequencing to identify potential variants associated with diabetes. The distribution of two variants, m.315dup and m.309_310insCT, differed significantly between the two groups, with both variants more prevalent in diabetic patients than in controls. Positions m.302–315 cover the interrupted C stretch of the HVS2 region. Both m.309_310insCT and m.315dup are relatively common according to GenBank frequency (25.791% and 30.086%, respectively); however, they are not reported in the gnomAD database. Having found HVS2 variants with a possible association with diabetes, we evaluated them further using NGS to gain more insight into their relationship with the disease. Both variants were also observed quite frequently in our NGS-tested samples, although the differences between groups were not statistically significant, possibly because of the small size of the control group. The most common haplogroups in diabetic samples were H (43.64%) and K (25.45%), whereas in control samples haplogroups U (25%), H (21.88%) and M (21.88%) were the most frequent (Fig. 3).
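As an illustration of how the carrier frequency of a single variant can be compared between cases and controls, the following minimal sketch computes a two-sided Fisher's exact test on a 2x2 table. The statistical test is not named in this section, so the choice of Fisher's exact test is an assumption, as are the function name `fisher_exact_two_sided` and the counts in the example, which are hypothetical and not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: cases carrying the variant      b: cases without it
    c: controls carrying the variant   d: controls without it
    """
    row1, row2 = a + b, c + d      # number of cases, number of controls
    col1 = a + c                   # total number of carriers
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p)


# Hypothetical counts for illustration only (NOT data from this study)
p_value = fisher_exact_two_sided(30, 20, 15, 35)
```

With small groups, an exact test of this kind has limited power, which is one way a difference that is visible in a larger set of samples can fail to reach significance in a smaller one.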
NGS of samples from diabetic subjects revealed 329 variants in the non-coding region of mtDNA. This is more than in any coding region of mtDNA, reflecting the highly polymorphic nature of the mtDNA control region (Fig. 6). Such variants can interfere with regulatory elements in the control region and affect the binding affinity of the promoters that modulate transcription and replication (33). Of these, the m.16189T > C variant was found quite frequently (14.7%) and exclusively in samples from diabetic subjects. Previously, m.16189T > C was reported to be associated with an increased risk of type 2 diabetes in Asians (34), and it was recently reported to be associated with type 2 diabetes in individuals from the Pakistani population (35). Together with other variants, it is considered a genetic marker of type 2 diabetes in Asians (36). Hence, our results agree with these reports.
Protein-coding regions constitute the majority of the mitochondrial genome. We evaluated non-synonymous mtDNA variants found exclusively in diabetic samples. Most of these variants were located in genes encoding subunits of mitochondrial respiratory complex I (ND1-ND6). Previous studies have reported associations of variants in mitochondrial ND genes with type 2 diabetes (37) and other metabolic syndromes (38). Interestingly, all the coding-region mtDNA variants identified in our patients that are linked to diabetes (according to Mitomap patient reports) were non-synonymous and located in the ND1, ND2 and ND5 subunit genes (Tables 3 and 4). The variant m.14766C > T occurred frequently in both the diabetic and control groups because it is a haplogroup marker of several clusters.
Pathogenicity prediction with the MutPred tool allows the functional impact of non-synonymous mtDNA variants to be assessed. We evaluated possibly deleterious variants with MutPred scores above 0.5 and surveyed the literature to determine whether any of these variants had been associated with diabetes. It is important to note that the variants with higher MutPred scores (greater than 0.75) were predominantly transversions. This is consistent with the well-known fact that transversions frequently disrupt protein-coding genes and result in amino acid changes (39).
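The two steps described above, filtering variants by a prediction score and classifying each substitution as a transition or a transversion, can be sketched as follows. This is only an illustrative sketch: the record layout, the function names `substitution_type` and `filter_by_score`, and all scores in the example are hypothetical placeholders, not MutPred output and not data from this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-nucleotide substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = {ref, alt} <= (PURINES | PYRIMIDINES)
    if ref == alt or not valid:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions;
    # any purine<->pyrimidine change is a transversion.
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def filter_by_score(variants, threshold=0.5):
    """Keep variants whose predicted pathogenicity score exceeds the threshold."""
    return [v for v in variants if v["score"] > threshold]


# Placeholder records: names and scores are invented for illustration only
variants = [
    {"name": "variant_1", "ref": "A", "alt": "G", "score": 0.42},
    {"name": "variant_2", "ref": "C", "alt": "A", "score": 0.81},
    {"name": "variant_3", "ref": "T", "alt": "C", "score": 0.63},
]

possibly_deleterious = filter_by_score(variants, threshold=0.5)
high_scoring_types = [
    substitution_type(v["ref"], v["alt"])
    for v in filter_by_score(variants, threshold=0.75)
]
```

The thresholds 0.5 and 0.75 are the ones quoted in the text; everything else in the example is a stand-in.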
Variants identified across the whole mtDNA of diabetic patients were also compared with data from public databases. The Mitomap and gnomAD databases are of particular importance because they include variant details from different populations, including South Asian and Pakistani populations. This helped us identify a novel variant in our study. Mitochondrial DNA variants that were not observed frequently in this study and were not reported in databases or the scientific literature were considered novel. We identified one novel variant, located in the non-coding region of mtDNA (Table 6). When we compared mtDNA variant frequencies between our diabetic samples and the PJL population from the 1000 Genomes Project, eight variants were found to be possibly associated with diabetes (Table 7). Four of these variants, m.7193T > C, m.9336A > G, m.11935T > C and m.14766C > T, were located in the coding region of mtDNA. However, no such association was observed when we compared the diabetic samples with the control samples from the present study. This may be due to the limited size of the control population in our study; a larger sample could reveal additional variants of interest for comparison.
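The novelty criterion described above (a variant that is infrequent in the cohort and absent from databases and the literature) can be expressed as a simple filter. The sketch below is illustrative only: the function name `candidate_novel_variants`, the data layout, the `max_frequency` cut-off, and the variant and sample names are all assumptions, since no numeric threshold for "not observed frequently" is given in this section.

```python
from collections import Counter


def candidate_novel_variants(sample_variants, known_variants, max_frequency=0.05):
    """Return variants that are rare in the cohort and absent from known_variants.

    sample_variants: dict mapping sample ID -> set of variant names
    known_variants:  set of variant names already reported in databases/literature
    max_frequency:   hypothetical cut-off for "not observed frequently"
    """
    n_samples = len(sample_variants)
    if n_samples == 0:
        return []
    # Count how many samples carry each variant (each sample counted once)
    carriers = Counter(
        variant
        for variants in sample_variants.values()
        for variant in set(variants)
    )
    return sorted(
        variant
        for variant, count in carriers.items()
        if variant not in known_variants and count / n_samples <= max_frequency
    )


# Placeholder cohort: sample IDs and variant names are invented for illustration
cohort = {
    "S1": {"var_A", "var_B"},
    "S2": {"var_A"},
    "S3": {"var_A", "var_C"},
    "S4": {"var_C"},
}
reported = {"var_A"}
novel = candidate_novel_variants(cohort, reported, max_frequency=0.25)
```

In practice the `known_variants` set would be assembled from exports of the databases named in the text, and any candidate would still need a manual literature check before being called novel.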
Some variants that are strongly associated with the diabetic phenotype in other populations have not been detected in the Pakistani population. For instance, increasing levels of the heteroplasmic transition mutation m.3243A > G, a known pathogenic variant, are responsible for the pathogenesis of diseases such as diabetes, neuromuscular degenerative disorders, and perinatal lethality in some populations (40, 41). However, m.3243A > G was not found in any diabetic subject in our study, and it was also absent in a recent study of six diabetic individuals from a Pakistani family (42).
Pedigree analyses of families affected by diabetes have shown that inheritance through the maternal line is observed more frequently than through the paternal line (19), which suggests a role for mtDNA in the pathogenesis of diabetes. To establish this in the Pakistani population, type 2 diabetic patients with a family history should be studied together with their close maternal relatives, so that mitochondrial genetic variation within similar haplotypes can give a more detailed picture. Nuclear genomic variants are also involved in the onset of diabetes and can cause aberrations in mitochondrial metabolism. Studying these phenomena together could provide more insight into the molecular basis of diabetes. Furthermore, this study has some limitations, as it reflects data from a limited Pakistani population. Therefore, more detailed studies on a larger number of samples, with case and control groups closely matched for age and gender, would be very useful.