The Identification of Hub Genes and Pathways in Type 2 Diabetes Mellitus by Bioinformatics Analysis

DOI: https://doi.org/10.21203/rs.3.rs-58811/v1

Abstract

Background: This study aimed to identify potential core genes and pathways involved in type 2 diabetes mellitus (T2DM) through comprehensive bioinformatics analysis, in order to elucidate parts of the pathogenesis of T2DM and to screen for potential therapeutic targets.

Methods: The microarray dataset GSE25724 was downloaded from the Gene Expression Omnibus (GEO) database. Data were processed with the limma package in R, and differentially expressed genes (DEGs) were identified. Gene Ontology (GO) functional analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were carried out to identify potential biological functions and pathways of the DEGs. STRING (Search Tool for the Retrieval of Interacting Genes) and Cytoscape were used to establish a protein-protein interaction (PPI) network for the DEGs, and hub genes were identified from the PPI network.

Results: In total, 75 DEGs were identified in T2DM islets, comprising 1 up-regulated gene and 74 down-regulated genes. GO enrichment analysis showed that the DEGs were mainly enriched in the regulation of hormone levels and unfolded protein binding. KEGG pathway enrichment analysis showed that the DEGs were significantly enriched in the fatty acid metabolism pathway, the propionate metabolism pathway, and the valine, leucine, and isoleucine degradation pathway. Furthermore, neuroendocrine protein 7B2 (SCG5), synaptosomal-associated protein 25 (SNAP25), sterol carrier protein 2 (SCP2), carboxypeptidase E (CPE), and proprotein convertase subtilisin/kexin type 1 (PCSK1) were the core genes in the PPI network.

Conclusion: This study identified 5 hub genes as potential biomarkers of type 2 diabetes through bioinformatics analysis, which might increase our understanding of the potential molecular mechanisms of T2DM and provide targets for further research.

Background

With population aging, rising obesity, and increasingly sedentary lifestyles, the incidence of diabetes is increasing rapidly. According to the International Diabetes Federation (IDF), there were 463 million people with diabetes worldwide in 2019 [1]. Diabetes not only raises blood glucose but also leads to acute and chronic complications, such as cardiovascular disease, renal failure, and visual deterioration, and can even cause death. It seriously affects patients' lives and imposes a heavy social burden. However, the pathophysiological mechanism of type 2 diabetes remains unclear. Type 2 diabetes is currently regarded as a metabolic disorder. Numerous changes in protein structure and function have been associated with the occurrence and development of T2DM. These changes are mainly caused by multiple genetic factors combined with environmental influences, and genetic factors account for part of the etiology of type 2 diabetes. To prevent type 2 diabetes and reduce its incidence, it is therefore crucial to clarify its molecular mechanism at the genetic level.

Bioinformatics analysis is a valuable approach for understanding the pathophysiological mechanisms of disease at the genetic level. Many datasets in the Gene Expression Omnibus (GEO) have been used to investigate the pathogenesis of various diseases. The pancreatic islet is a key organ in type 2 diabetes, and islet dysfunction is the central cause of diabetes mellitus. In this study, we aimed to identify potential hub genes and pathways involved in T2DM through bioinformatics analyses of the GSE25724 microarray profiles of pancreatic islets obtained from non-diabetic controls and patients with T2DM. The original microarray data were downloaded from the GEO database and analyzed with R software packages to explore the molecular mechanisms underlying the pathogenesis of diabetes.

1. Materials And Methods

1.1 Database selection

The microarray dataset GSE25724, based on the GPL96 platform ([HG-U133A] Affymetrix Human Genome U133A Array), was obtained from the GEO database (www.ncbi.nlm.nih.gov/geo/). The GSE25724 dataset was provided by Veronica Dominguez et al. Human islets were isolated from 7 non-diabetic and 6 T2DM organ donors by collagenase digestion followed by density gradient purification, and these samples were used for the microarray analysis.

1.2 Methodology

1.2.1 Screening of differentially expressed genes

R was used to analyze the raw microarray data. The normalizeBetweenArrays function of the limma package was applied to normalize expression intensities across arrays. Then, t-tests were performed with the limma package to identify DEGs. DEGs were defined by a p-value < 0.05 and |log2 fold change (FC)| > 2. The EnhancedVolcano and heatmap packages were used to visualize the DEGs.
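As an illustration, the screening rule described above (p-value < 0.05 and |log2 FC| > 2) can be sketched as follows. This is a minimal Python sketch and not the R/limma workflow used in this study; the function names, gene names, and statistics are hypothetical and serve only to show how the two cutoffs combine.

```python
# Toy illustration of the DEG cutoff: p < 0.05 and |log2FC| > 2.
# Gene names and statistics below are invented for demonstration only.

P_CUTOFF = 0.05
LOG2FC_CUTOFF = 2.0


def classify_gene(log2_fc, p_value, p_cutoff=P_CUTOFF, fc_cutoff=LOG2FC_CUTOFF):
    """Return 'up', 'down', or 'ns' (not significant) for one gene."""
    if p_value < p_cutoff and log2_fc > fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc < -fc_cutoff:
        return "down"
    return "ns"


def screen_degs(results):
    """Split (gene, log2_fc, p_value) tuples into up- and down-regulated lists."""
    degs = {"up": [], "down": []}
    for gene, log2_fc, p_value in results:
        label = classify_gene(log2_fc, p_value)
        if label != "ns":
            degs[label].append(gene)
    return degs


toy_results = [
    ("GENE_A", 2.4, 0.001),    # passes both cutoffs, up-regulated
    ("GENE_B", -3.1, 0.0004),  # passes both cutoffs, down-regulated
    ("GENE_C", -2.6, 0.20),    # large change but not significant
    ("GENE_D", 1.2, 0.01),     # significant but change too small
    ("GENE_E", -2.0, 0.01),    # exactly at the cutoff, excluded (strict >)
]
degs = screen_degs(toy_results)
print(degs)
```

Both conditions must hold at the same time, and the inequalities are strict, so a gene with |log2 FC| exactly equal to 2 is not counted as a DEG in this sketch.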

1.2.2 GO function and KEGG pathway analysis

GO functional analysis is a widely used bioinformatics approach for annotating genes and proteins. It integrates annotation data and provides tools to access all the data provided by the project. KEGG integrates currently known information on protein interaction networks. In this study, we used the createKEGGdb, org.Hs.eg.db, and clusterProfiler packages in R to characterize the biological functions of the DEGs. GO functional analysis was applied to annotate the DEGs in terms of biological processes (BP), cellular components (CC), and molecular functions (MF). KEGG analysis was applied to annotate the pathways of the DEGs. The p-value cutoff was 0.05. The ggplot2 package was used to visualize the results of the GO and KEGG pathway analyses.
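Enrichment analyses of this kind are commonly based on an over-representation (hypergeometric) test, which asks whether a GO term or KEGG pathway contains more DEGs than expected by chance. The following minimal Python sketch illustrates that test under stated assumptions; it is not the clusterProfiler implementation, the function name is hypothetical, and the counts are toy numbers rather than results of this study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_background: number of annotated genes in the background set
    n_term:       background genes annotated to the term or pathway
    n_degs:       number of DEGs drawn from the background
    n_overlap:    DEGs that are annotated to the term or pathway
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 in the term, 4 DEGs, 3 of them in the term.
p_value = hypergeom_enrichment_p(n_background=20, n_term=5, n_degs=4, n_overlap=3)
print(f"enrichment p-value = {p_value:.4f}")
print("enriched at the 0.05 cutoff:", p_value < 0.05)
```

In practice, one such test is computed per term or pathway, so the resulting p-values are usually interpreted together with a multiple-testing correction.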

1.2.3 Protein-protein interaction program analysis

Since proteins rarely perform biological functions independently, it is important to understand protein interactions. STRING (Search Tool for the Retrieval of Interacting Genes) (http://string-db.org/) is an online resource for interactions between genes and proteins. Cytoscape is an open-source tool for network visualization of genes and proteins. The PPI network of the DEGs was constructed from the STRING database and visualized with Cytoscape. Cytoscape was run with default parameters, and the degree of connectivity of each node in the network was measured by connectivity analysis. DEGs with a degree of connectivity ≥3 were defined as highly connected and were used to screen for core genes. The top 5 hub genes were selected.
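The degree-based hub selection described above can be sketched as follows. This minimal Python example assumes the PPI network is available as an undirected edge list (for example, one exported from STRING); it does not reproduce the Cytoscape analysis, and the node names, function names, and tie-breaking rule (alphabetical order) are hypothetical.

```python
def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected edge list."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def top_hubs(edges, min_degree=3, top_n=5):
    """Keep nodes with degree >= min_degree and return the top_n by degree."""
    degrees = node_degrees(edges)
    candidates = [(node, deg) for node, deg in degrees.items() if deg >= min_degree]
    # Sort by descending degree; ties are broken alphabetically (an assumption).
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:top_n]


# Toy network with placeholder protein names; ("P2", "P1") duplicates ("P1", "P2").
toy_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P1", "P5"),
    ("P2", "P3"), ("P2", "P4"), ("P3", "P4"), ("P5", "P6"),
    ("P2", "P1"),
]
hubs = top_hubs(toy_edges)
print(hubs)
```

Storing neighbours as sets means that duplicated or reversed edges do not inflate the degree of a node.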

2. Results

2.1 Identification of DEGs between T2DM and normal islet tissues

The dataset was normalized with the normalizeBetweenArrays function of the limma package; duplicated genes and probes lacking a specific gene symbol were then removed. A total of 75 DEGs were obtained, of which 1 gene was up-regulated and 74 genes were down-regulated. The DEGs are presented in the volcano plot (Fig. 1A). A heatmap of the top 30 DEGs is shown in Fig. 1B.
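The removal of duplicated genes and of probes without a gene symbol can be sketched as below. This is a minimal Python illustration only: the text does not state which probe was retained when several probes mapped to the same gene, so keeping the probe with the highest mean expression is an assumption made here, and all probe IDs, gene names, and values are hypothetical.

```python
def collapse_probes(rows):
    """Collapse probe-level rows to one row per gene symbol.

    rows: iterable of (probe_id, gene_symbol, expression_values).
    Probes without a gene symbol are dropped. When several probes map to the
    same symbol, the probe with the highest mean expression is kept
    (an assumed rule, not one stated in the text).
    """
    best = {}
    for probe_id, symbol, values in rows:
        if not symbol or not symbol.strip():
            continue  # no gene symbol: discard the probe
        mean_expr = sum(values) / len(values)
        current = best.get(symbol)
        if current is None or mean_expr > current[1]:
            best[symbol] = (probe_id, mean_expr, values)
    return {symbol: (probe, values) for symbol, (probe, _, values) in best.items()}


toy_rows = [
    ("probe_1", "GENE_A", [5.0, 6.0]),
    ("probe_2", "GENE_A", [7.0, 8.0]),  # duplicate symbol, higher mean: kept
    ("probe_3", "", [4.0, 4.5]),        # empty symbol: dropped
    ("probe_4", None, [3.0, 3.0]),      # missing symbol: dropped
    ("probe_5", "GENE_B", [2.0, 2.5]),
]
collapsed = collapse_probes(toy_rows)
print(collapsed)
```

Other common choices, such as averaging the probes of a gene, would fit the same interface; only the rule inside the loop would change.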

2.2 GO biological process analysis and KEGG pathways enrichment

GO analysis covers molecular functions (MF), biological processes (BP), and cellular components (CC). In our study, GO analysis was used to characterize the functions of the DEGs. A P-value < 0.05 was used as the threshold for GO functional enrichment of the up- and down-regulated genes. The results are presented in Fig. 2A and Table 1. At the BP level, the DEGs were mainly enriched in the regulation of hormone levels, hormone secretion, hormone transport, hormone metabolism, and regulation of insulin secretion. KEGG pathway enrichment analysis showed that the DEGs were significantly enriched in the fatty acid metabolism pathway, the propionate metabolism pathway, and the valine, leucine, and isoleucine degradation pathway. These results are presented in Fig. 2B and Table 2.

Table 1 GO enrichment analysis

Table 2 KEGG pathway analysis

2.3 PPI network constructions

We used the STRING database (https://string-db.org) and Cytoscape software to construct the PPI network of the DEGs, as shown in Fig. 3. SCG5, SNAP25, SCP2, CPE, and PCSK1 were the key genes in the PPI network.

3. Discussion

In recent years, with the rapid development of modern biotechnologies such as biochips and high-throughput sequencing and the increasing maturity of bioinformatics analysis, data analysis and the mining of candidate genes have come to play an increasingly important role in research on disease progression. Bioinformatics analysis can provide fresh ideas for studying the pathogenesis of diseases and for screening therapeutic targets. The incidence of type 2 diabetes is rapidly increasing; nevertheless, its exact pathogenesis is still unclear. Type 2 diabetes is a metabolic disease in which multiple genes are involved. Exploring dysfunction at the molecular level, in particular by targeting key abnormal genes in the islet cells of patients with type 2 diabetes, can reveal differentially expressed genes and the related biological functions and signaling pathways. These are extremely important for elucidating the pathogenesis of type 2 diabetes.

Dominguez et al. isolated human islets from the pancreases of 7 non-diabetic and 6 type 2 diabetic organ donors by collagenase digestion followed by density gradient purification. They performed microarray analysis to evaluate differences in the transcriptome of type 2 diabetic human islets compared with non-diabetic islet samples. The platform was GPL96 ([HG-U133A] Affymetrix Human Genome U133A Array), and the GEO accession number is GSE25724. In our study, we extracted the expression data from GSE25724. Using the limma package in R, we screened for differentially expressed genes that may be associated with the development of type 2 diabetes. To further investigate the functions of the DEGs and the interactions between them, GO functional and KEGG pathway enrichment analyses were performed, and hub genes were identified by PPI network analysis. We obtained 75 DEGs, the vast majority of which were down-regulated; the only up-regulated gene was SRY (sex-determining region Y)-box 4 (SOX4). The top 30 DEGs are shown in Fig. 1B. The GO analysis indicated that, at the level of biological processes (BP), the DEGs were primarily enriched in the regulation of hormone levels, hormone secretion, hormone transport, hormone metabolic processes, and regulation of insulin secretion. At the level of molecular functions (MF), the DEGs were primarily enriched in unfolded protein binding. KEGG pathway enrichment analysis showed that the DEGs were markedly enriched in the fatty acid metabolism pathway, the propionate metabolism pathway, and the valine, leucine, and isoleucine degradation pathway.

Increased circulating lipid levels, alterations in fatty acid metabolic pathways, and disturbed intracellular signaling have been associated with insulin resistance in the muscle and liver of diabetic patients [2]. An imbalance in fatty acid metabolism can impair glucose-stimulated insulin secretion (GSIS), with concomitant oxidative and metabolic stress, endoplasmic reticulum stress, and numerous pro-apoptotic signals, all of which decrease β-cell survival [3]. Propionate inhibits hepatic gluconeogenesis via the G protein-coupled receptor 43/AMP-activated protein kinase (GPR43/AMPK) signaling pathway [4]. Plasma levels of branched-chain amino acids (valine, leucine, and isoleucine) are increased in conditions related to insulin resistance, such as obesity and diabetes [5]. Higher circulating levels of branched-chain amino acids are strongly linked to a higher risk of type 2 diabetes [6,7]. Experimental studies have found that impairment of the adaptive unfolded protein response in mouse β-cells reduces endoplasmic reticulum-to-Golgi protein trafficking and increases β-cell death [8]. Our findings are generally consistent with these reports.

A PPI network of the DEGs was constructed using the STRING database and Cytoscape software. SCG5, SNAP25, SCP2, CPE, and PCSK1 were the hub genes. SCG5 encodes a secretory chaperone that prevents the aggregation of other secreted proteins, including those associated with neurodegenerative and metabolic diseases. It has been studied mostly for its role in the transport and activation of prohormone convertase 2. SCG5 acts as a molecular chaperone for proprotein convertase subtilisin/kexin type 2, also known as prohormone convertase 2 (PCSK2/PC2), preventing its premature activation in the regulated secretory pathway. SCG5 binds to inactive PCSK2 in the endoplasmic reticulum and facilitates its transport to later compartments of the secretory pathway, where it is proteolytically matured and activated. The type 2 diabetic phenotype of GK rats may be caused by the accumulation of multiple genetic variants, including variants in the SCG5 gene, and the mutated genes may affect biological functions including adipocytokine signaling, glycerolipid metabolism, PPAR signaling, T cell receptor signaling, and insulin signaling pathways [9]. The present study found that the SCG5 gene was mainly enriched in unfolded protein binding (GO:0051082), which might be linked to hormone precursor processing and insulin secretion.

SNAP25 interacts with proteins that participate in vesicle docking and membrane fusion. SNAP25 regulates plasma membrane recycling through its interaction with centromere protein F (CENPF). SNAP25 also modulates the gating characteristics of the delayed rectifier voltage-dependent potassium channel KCNB1 (potassium voltage-gated channel, Shab-related subfamily, member 1) in pancreatic beta cells. Studies have evaluated the possible role of SNAP25 polymorphisms in T2DM, suggesting that the minor SNAP25 rs363050 (G) allele, which results in reduced SNAP25 expression, is associated with altered glycemic parameters in patients with T2DM, possibly because reduced functionality of the exocytotic machinery leads to suboptimal release of insulin [10]. Liang et al. [11] found that SNAP23 is the ubiquitous SNAP25 isoform that mediates secretion in non-neuronal cells, similar to SNAP25 in neurons. Pancreatic islet β cells contain an abundance of both SNAP25 and SNAP23. SNAP23 depletion allows SNAP25 to bind calcium channels more quickly and for longer at sites of granule fusion, which increases the efficiency of exocytosis. In this study, we found that the SNAP25 gene was mainly enriched in hormone transport (GO:0009914), which might be linked to insulin exocytosis and secretion.

SCP2 is a non-specific lipid-transfer protein that mediates the transfer of all common phospholipids, cholesterol, and gangliosides between cell membranes. SCP2 may play a role in the regulation of steroidogenesis. SCP2 protein levels were found to be significantly decreased in severely hypercholesterolemic diabetic animals. This differential expression of the sterol carrier protein SCP2 may accompany diabetic dyslipidemia and should be considered a potential contributing mechanism through which cholesterol metabolism may be altered in diabetes [12]. Our study found that the SCP2 gene was mainly enriched in coenzyme binding (GO:0050662), which may be involved in the disorders of lipid metabolism seen in diabetes.

CPE encodes a member of the M14 family of metallocarboxypeptidases. It acts as a sorting receptor that directs hormone precursors into the regulated secretory pathway. It also serves as a hormone precursor processing enzyme in neural and endocrine cells, removing basic residues from the C terminus of peptide hormone precursors after initial endoprotease cleavage. Carboxypeptidase E is thus involved in the cleavage of numerous peptide precursors, including neuropeptides and hormones associated with appetite control and glucose metabolism, such as proinsulin. Diseases associated with CPE include hyperinsulinemia and insulinoma. CPE is involved in the biosynthesis of various neuropeptides and peptide hormones in endocrine tissues and the nervous system. Loss of normal CPE leads to various disorders, including diabetes, hyperinsulinemia, low bone mineral density, and deficits in learning and memory [13]. Truncating mutations in the CPE gene have been shown to cause morbid obesity, intellectual disability, abnormal glucose homeostasis, and hypogonadotrophic hypogonadism, which reveals the importance of CPE in the regulation of body weight, metabolism, and brain and reproductive function in humans [14]. GO annotations associated with this gene include cell adhesion molecule binding, carboxypeptidase activity, and peptide hormone processing (GO:0016486).

PCSK1, also known as neuroendocrine convertase 1, encodes a member of the subtilisin-like proprotein convertase family, which comprises proteases that process protein and peptide precursors trafficking through the regulated or constitutive branches of the secretory pathway. PCSK1 is involved in the processing of hormone and other protein precursors at sites comprising pairs of basic amino acid residues. Its substrates include proopiomelanocortin (POMC), renin, enkephalin, dynorphin, somatostatin, and insulin. The common genetic variants rs6232 and rs6235 in PCSK1 have been found to determine glucose-stimulated proinsulin conversion, but not insulin secretion. In addition, rs6232 influences glucose homeostasis and insulin sensitivity independently of body mass index (BMI) and proinsulin concentrations [15]. Strawbridge et al. [16] identified nine genetic variants associated with fasting proinsulin levels, including a variant at PCSK1; these findings relate to glucose homeostasis and T2DM development in humans and argue against a direct role of proinsulin in coronary artery disease pathogenesis. Enya et al. [17] found that a genetic variant of PCSK1 may influence glucose homeostasis by altering insulin resistance independently of BMI, incretin level, or proinsulin conversion, and may be associated with the occurrence of type 2 diabetes in Japanese individuals. In our study, we found that the PCSK1 gene was mainly enriched in the regulation of hormone levels (GO:0010817), which suggests that the gene is related to the regulation of hormone levels and glucose homeostasis in diabetes mellitus.

Conclusion

In summary, the present data provide a comprehensive bioinformatics analysis of DEGs that might be linked to the progression of T2DM. We identified 75 candidate DEGs and 5 hub genes (SCG5, SNAP25, SCP2, CPE, and PCSK1) based on the expression profile dataset and bioinformatics analyses. To a certain extent, these findings could improve our understanding of the etiology and underlying molecular events of T2DM, and provide a research direction and theoretical basis for revealing the molecular mechanisms and therapeutic targets of T2DM. However, supplementary experiments in vitro and in vivo are needed to validate the role of these screened genes and pathways in the progression of type 2 diabetes.

Abbreviations

DEGs: differentially expressed genes

GO: Gene Ontology

KEGG: Kyoto Encyclopedia of Genes and Genomes

STRING: Search Tool for the Retrieval of Interacting Genes

PPI: protein-protein interaction

IDF: International Diabetes Federation

GEO: Gene Expression Omnibus

BP: biological processes

CC: cellular components

MF: molecular functions

GSIS: glucose-stimulated insulin secretion

SCG5: neuroendocrine protein 7B2

SNAP25: synaptosomal-associated protein 25

SCP2: sterol carrier protein 2

CPE: carboxypeptidase E

PCSK1: proprotein convertase subtilisin/kexin type 1

AMPK: AMP-activated protein kinase

PCSK2/PC2: proprotein convertase subtilisin/kexin type 2 (prohormone convertase 2)

CENPF: centromere protein F

KCNB1: potassium voltage-gated channel, Shab-related subfamily, member 1

T2DM: type 2 diabetes mellitus

POMC: proopiomelanocortin

BMI: body mass index

SOX4: sex-determining region Y-box 4

Declarations

Ethics approval and consent to participate

This analysis was based on a previously published study; no ethical approval or patient consent was required.

Consent for publication

Written informed consent for publication was obtained from all participants.

Availability of data and materials

The datasets used or analyzed during the present study are available from the corresponding author on reasonable request.

Corresponding author

Correspondence to Hong Chen

Competing interests

The authors declare that they have no competing interests.

Funding

Not applicable.

Authors' contributions

Jing Li wrote the manuscript. Hong Chen designed the study and edited the drafts; Hong Chen is the guarantor of this work, had full access to all the data in the study, and takes responsibility for the integrity of the data and the accuracy of the data analysis. Xinkui Jiang researched data, edited the drafts, and co-wrote the final draft. Libo Chen contributed to the discussion. All authors read and approved the final manuscript.

Acknowledgements

Not applicable.

Authors' information (optional)

Jing Li and Xinkui Jiang contributed equally to this work.

  1. Jing Li, Libo Chen

Department of Endocrinology, HuaZhong University Of Science and Technology Union ShenZhen Hospital, Taoyuan Road, Guangdong, 518052, China

  2. Xinkui Jiang

Department of Ultrasonography, HuaZhong University Of Science and Technology Union ShenZhen Hospital, Taoyuan Road, Guangdong, 518052, China

  3. Hong Chen

Department of Endocrinology, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China

References

  1. https://diabetesatlas.org/en/sections/worldwide-toll-of-diabetes.html.
  2. Yazıcı D, Sezer H. Insulin Resistance, Obesity, and Lipotoxicity. Adv Exp Med Biol. 2017;960:277–304.
  3. Ježek P, Jabůrek M, Holendová B, et al. Fatty Acid-Stimulated Insulin Secretion vs Lipotoxicity. Molecules. 2018;23(6):1483.
  4. Yoshida H, Ishii M, Akagawa M. Propionate suppresses hepatic gluconeogenesis via GPR43/AMPK signaling pathway. Arch Biochem Biophys. 2019;672:108057.
  5. Adeva-Andany MM, López-Maside L, Donapetry-García C, et al. Enzymes involved in branched-chain amino acid metabolism in humans. Amino Acids. 2017;49(6):1005–1028.
  6. Lotta LA, Scott RA, Sharp SJ, et al. Genetic Predisposition to an Impaired Metabolism of the Branched-Chain Amino Acids and Risk of Type 2 Diabetes: A Mendelian Randomisation Analysis. PLoS Med. 2016;13(11):e1002179.
  7. Bloomgarden Z. Diabetes and branched-chain amino acids: What is the link? J Diabetes. 2018;10(5):350–352.
  8. Bensellam M, Maxwell EL, Chan JY, et al. Hypoxia reduces ER-to-Golgi protein trafficking and increases cell death by inhibiting the adaptive unfolded protein response in mouse beta cells. Diabetologia. 2016;59(7):1492–502.
  9. Liu T, Li H, Ding G, et al. Comparative Genome of GK and Wistar Rats Reveals Genetic Basis of Type 2 Diabetes. PLoS One. 2015;10(11):e0141859.
  10. Al-Daghri NM, Costa AS, Alokail MS, et al. Synaptosomal Protein of 25 kDa (Snap25) Polymorphisms Associated with Glycemic Parameters in Type 2 Diabetes Patients. J Diabetes Res. 2016. doi: 10.1155/2016/8943092.
  11. Liang T, Qin T, Kang F, et al. SNAP23 depletion enables more SNAP25 calcium channel excitosome formation to increase insulin exocytosis in type 2 diabetes. JCI Insight. 2020;5(3):e129694.
  12. McLean MP, Billheimer JT, Warden KJ, et al. Differential expression of hepatic sterol carrier proteins in the streptozotocin-treated diabetic rat. Endocrinology. 1995;136(8):3360–8.
  13. Ji L, Wu H-T, Qin X-Y, et al. Dissecting carboxypeptidase E: properties, functions, and pathophysiological roles in disease. Endocr Connect. 2017;6(4):R18–38.
  14. Alsters SIM, Goldstone AP, Buxton JL, et al. Truncating Homozygous Mutation of Carboxypeptidase E (CPE) in a Morbidly Obese Female with Type 2 Diabetes Mellitus, Intellectual Disability and Hypogonadotrophic Hypogonadism. PLoS One. 2015;10(6):e0131417.
  15. Heni M, Haupt A, Schäfer SA, et al. Association of obesity risk SNPs of PCSK1 with insulin sensitivity and proinsulin conversion. BMC Med Genet. 2010;11:86.
  16. Strawbridge RJ, Dupuis J, Prokopenko I, et al. Genome-wide association identifies nine common variants associated with fasting proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes. Diabetes. 2011;60(10):2624–34.
  17. Enya M, Horikawa Y, Iizuka K, et al. Association of genetic variants of the incretin-related genes with quantitative traits and occurrence of type 2 diabetes in Japanese. Mol Genet Metab Rep. 2014;1:350–61.