Although the exact cause of T1D is unknown, the condition adds significantly to patients' physical, financial, and emotional burden. Diagnostic biomarkers for T1D have been identified in previous research; in this study, however, T1D and CD are considered together. We employed a variety of bioinformatics methods and machine learning algorithms to identify potential genes and pathways shared between Type 1 Diabetes (T1D) and Celiac Disease (CD). We integrated the DEGs from DESeq2 and edgeR to obtain a total of 568 shared potential candidate genes after removing duplicates. According to the enrichment analysis, these candidate genes are broadly associated with intracellular transport, cellular protein localization, and metabolic pathways. To examine the candidate genes more thoroughly, we then applied a machine learning strategy. The PPI network, which provides insight into gene interactions, was built from the combined candidate genes obtained from random forest (RF) and LASSO. The ClueGO analysis also revealed that the genes interact biologically with their corresponding pathways and activities. The highly diagnostic candidate genes RPL21, HCLS1, and NAA15 were obtained from the intersection of the genes selected by RF and LASSO. The ROC curves and estimated AUC values were then evaluated. In the end, we identified three important candidate genes for the diagnosis of T1D and CD (RPL21, HCLS1, and NAA15) and calculated their diagnostic value.
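The set logic and ROC evaluation summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the placeholder genes GENE_A to GENE_D, the toy scores, and the function names `merge_degs`, `shared_candidates`, and `auc` are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than with any particular package.

```python
from __future__ import annotations

from typing import Iterable, Sequence


def merge_degs(deseq2: Iterable[str], edger: Iterable[str]) -> set[str]:
    """Union of two DEG lists; using sets drops duplicates automatically."""
    return set(deseq2) | set(edger)


def shared_candidates(rf_genes: Iterable[str], lasso_genes: Iterable[str]) -> set[str]:
    """Genes selected by both feature-selection methods."""
    return set(rf_genes) & set(lasso_genes)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case scores above a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy inputs; GENE_A..GENE_D are placeholders, not study results.
deseq2_degs = ["RPL21", "HCLS1", "NAA15", "GENE_A"]
edger_degs = ["RPL21", "NAA15", "HCLS1", "GENE_B"]
candidates = merge_degs(deseq2_degs, edger_degs)

rf_selected = ["RPL21", "HCLS1", "NAA15", "GENE_C"]
lasso_selected = ["NAA15", "RPL21", "HCLS1", "GENE_D"]
diagnostic = shared_candidates(rf_selected, lasso_selected)

# Made-up scores for a single gene: label 1 = patient, label 0 = control.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
print(sorted(diagnostic), round(auc(toy_scores, toy_labels), 3))
```

An AUC of 0.5 corresponds to no discrimination between patients and controls, and a value of 1.0 to perfect separation, which is why the per-gene AUC serves as a summary of diagnostic value.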
According to the functional enrichment analysis, 925 of the DEGs were associated with biological processes, with intracellular protein transport, establishment of protein localization, protein transport, and the cell cycle as key pathways. For 570 genes, the major cellular components were the mitochondrion, catalytic complex, organelle subcompartment, and microtubule cytoskeleton. A total of 257 genes are involved in molecular functions including ATP binding, adenyl nucleotide binding, and protein-containing complex binding. The ribosome, diabetic cardiomyopathy, endocytosis, and metabolic pathways include 109 genes.
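Gene counts per term, such as those above, are commonly assessed with an over-representation test. The sketch below shows a one-sided hypergeometric test as one way to do this; it illustrates the general technique and is not a description of the tool or settings used in this study. The function name `enrichment_p_value` is hypothetical, and every number in the example call is invented.

```python
from math import comb


def enrichment_p_value(background: int, term_genes: int, selected: int, overlap: int) -> float:
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `background` genes, of which `term_genes` carry the annotation."""
    if not (0 <= overlap <= min(term_genes, selected) <= background):
        raise ValueError("inconsistent counts")
    total = comb(background, selected)
    upper = min(term_genes, selected)
    # Sum the hypergeometric probabilities from the observed overlap upward.
    tail = sum(
        comb(term_genes, k) * comb(background - term_genes, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Invented numbers: 20,000 background genes, 500 annotated to a term,
# 100 genes in the candidate list, 10 of them annotated.
p = enrichment_p_value(background=20_000, term_genes=500, selected=100, overlap=10)
print(f"p = {p:.3g}")
```

In practice the p-values from many terms would also be corrected for multiple testing before any term is called enriched.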
NAA15, RPL21, and HCLS1 were the potential genes we identified using machine learning. N-alpha-acetyltransferase 15 (NAA15) is located at chromosomal position 4q31.1. The protein that the NAA15 gene produces is thought to attach to the ribosome so that proteins can be modified post-translationally as they leave the ribosome [31]. Retinoic acid applied to these cells triggered neuronal development and downregulated the neuronal marker genes Ard1 and Narg1 [32]. NARG1 and ARD1 are involved in the development and differentiation of neurons. Mammalian brain development relies on neuronal communication facilitated by the N-methyl-D-aspartate receptor-regulated gene 1 (NARG1, also known as NAA15 [33]), NARG2, and NARG3 family of glutamate receptors [34]. Notably, individuals with Type 1 diabetes exhibited higher levels of brain glutamate. The impact of glutamate extends to the pancreatic islets, where it is released by cells and acts as a signaling mediator that influences hormone production. This neurotransmitter, originating from the central nervous system, similarly affects the pancreatic islets, which underscores its role as an excitatory signal [35], [36].
The RPL21 gene codes for ribosomal protein L21, a component of the 60S subunit of the ribosome. The protein is located in the cytoplasm and belongs to the ribosomal protein L21E family [37]. As is typical for genes that encode ribosomal proteins, multiple processed pseudogenes of this gene are scattered throughout the genome. Ribosomal protein (RP) mutations cause both increased hematologic malignancy risk and disrupted hematopoiesis in conditions such as 5q syndrome [38]. Hematological cancer and diabetes are closely related. Both diabetes and cancer are common diseases linked to high rates of morbidity and death, and diabetes may be a lesser-known risk factor for hematological malignancies [39]. HCLS1 (hematopoietic cell-specific Lyn substrate 1) is a gene specific to hematopoietic cells and is located on chromosome 3q13.33. The product of the human HCLS1 gene is a 75-kD intracellular protein that is expressed only by hematopoietic cell lineages. It is crucial for leukaemia cell movement that is driven by chemokines and that is ROR1-dependent and Wnt5a-enhanced. T1D and early-onset leukaemia are co-morbid conditions [40], [41], [42], [43].
In recent years, the use of bioinformatics technologies, machine learning algorithms, and deep learning techniques to address pertinent medical questions has gained significant traction, and there is an extensive literature on the subject. Methods based on machine learning and network algorithms are also available for identifying genes as prospective diagnostic biomarkers. A novel deep learning system can predict probable correlations of disease-associated metabolites, whereas traditional biological investigations typically require substantial time and financial resources to analyze fluctuations in the concentration of specific metabolites between patients and healthy individuals. Advanced model design has also grown more prevalent, especially model fusion, which is the reasoned combination of several algorithms [44]. A prominent current trend is the use of multiple algorithms to boost model performance and predictive ability. In this work, we integrate two machine learning techniques to significantly improve the prognostic accuracy of T1D and CD.