DEGs of NOA
First, DEGs of NOA were screened against normal controls in the GSE45885 and GSE45887 datasets. The two datasets contained 118 and 116 upregulated genes, respectively. The number of downregulated genes, 871 and 772, respectively, was much higher than that of the upregulated genes (Figure 1). To determine the DEGs shared by the two datasets, a Venn analysis was performed, which identified 91 upregulated and 772 downregulated genes common to both datasets.
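The overlap step amounts to a per-direction set intersection. The following is a minimal sketch; the gene symbols are placeholders rather than genes from the study, and the function name `shared_degs` is hypothetical.

```python
def shared_degs(degs_a, degs_b):
    """Return genes regulated in the same direction in both datasets."""
    return {
        direction: set(degs_a[direction]) & set(degs_b[direction])
        for direction in ("up", "down")
    }


# Placeholder DEG lists (assumed for illustration only).
gse45885 = {"up": {"GENE_A", "GENE_B", "GENE_C"}, "down": {"GENE_X", "GENE_Y"}}
gse45887 = {"up": {"GENE_B", "GENE_C", "GENE_D"}, "down": {"GENE_Y", "GENE_Z"}}

shared = shared_degs(gse45885, gse45887)
print(sorted(shared["up"]), sorted(shared["down"]))
```

Intersecting the up and down lists separately keeps a gene that changes in opposite directions in the two datasets out of the shared set.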
NOA-associated Significant Modules Identified by the WGCNA Assay
To further explore the genes associated with NOA, the WGCNA assay was performed. Weighted gene co-expression network analysis (WGCNA) is a systems biology method that describes gene association patterns across samples. It can be used to identify highly synergistic gene sets and to determine candidate biomarker genes or therapeutic targets based on the association between the gene sets and phenotypes.
No obvious outliers were found in the sample clustering (Figure 2A, 3A), and the connectivity between genes in each gene network followed a scale-free distribution at a soft threshold power of β = 8 and 12, respectively (Figure 2B, 3B). The WGCNA assay of GSE45885 and GSE45887 identified 28 and 26 modules, respectively (Figure 3C, D and 4C, D). In the GSE45885 dataset, the top three modules were the brown, light green, and tan modules, with scores of 0.61, 0.63, and 0.7, respectively. In the GSE45887 dataset, the top three modules were the black, blue, and dark green modules, with scores of 0.56, 0.64, and 0.53, respectively. The module-feature relationships in both datasets were assessed by Pearson’s correlation analysis. The black (r = 0.57, P < 1.9e-105), blue (r = 0.39, P < 1.2e-111), and dark green (r = 0.63, P < 2.6e-07) modules in GSE45887 (Figure 4), and the brown (r = 0.68, P < 1e-200), light green (r = 0.55, P < 3.4e-14), and tan (r = 0.8, P < 3.3e-81) modules in GSE45885 were highly associated with the NOA phenotype (Figure 5).
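A module-feature relationship of this kind can be illustrated as a Pearson correlation between a per-sample module summary and a binary phenotype vector. The sketch below assumes the summary is a module eigengene and that NOA is coded as 1 and control as 0; the values are invented for illustration and do not come from GSE45885 or GSE45887.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Assumed coding: 1 = NOA sample, 0 = control sample.
trait = [1, 1, 1, 0, 0, 0]

# Hypothetical module eigengene values, one per sample.
eigengenes = {
    "brown": [0.9, 0.7, 0.8, -0.6, -0.9, -0.9],
    "tan": [0.1, -0.2, 0.3, 0.2, -0.1, -0.3],
}

module_trait = {name: pearson(values, trait) for name, values in eigengenes.items()}
for name, r in module_trait.items():
    print(f"{name}: r = {r:.2f}")
```

In this toy input the "brown" eigengene tracks the phenotype closely and the "tan" eigengene does not, which is the kind of contrast the module ranking relies on.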
With the help of the NOA-associated modules, the hub genes in each module were determined in both datasets as the NOA-associated genes with an X coefficient higher than 0.8. Then, the intersection of the NOA-associated genes in the two datasets was taken, and the shared genes were defined as the NOA-associated gene set, which was analyzed further.
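The hub-gene selection can be sketched as a threshold filter followed by an intersection. This sketch assumes that each gene carries a single coefficient per dataset; the text does not specify which coefficient is meant, so the scores and gene names below are placeholders.

```python
def hub_genes(scores, threshold=0.8):
    """Keep genes whose coefficient is strictly above the threshold."""
    return {gene for gene, score in scores.items() if score > threshold}


# Hypothetical per-gene coefficients within NOA-associated modules.
scores_gse45885 = {"GENE_A": 0.91, "GENE_B": 0.85, "GENE_C": 0.62}
scores_gse45887 = {"GENE_A": 0.88, "GENE_B": 0.79, "GENE_D": 0.93}

# Genes passing the cutoff in both datasets form the shared gene set.
noa_gene_set = hub_genes(scores_gse45885) & hub_genes(scores_gse45887)
print(sorted(noa_gene_set))
```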
Biological Function of NOA-associated Gene Set
To explore the potential biological characteristics of the NOA-associated gene set, the KEGG signaling pathway and GO assays were performed. The TOP12 KEGG signaling pathways were identified (Figure 6A). The ubiquitin-mediated proteolysis, cellular senescence, and autophagy signaling pathways were highly related to the pathogenesis and biological state of NOA. According to the GO assay, autophagy, protein deubiquitination, and protein modification by small protein removal were highly related to NOA, and they may reflect the biological characteristics of NOA (Figure 6B). Cellular senescence is an important biological process, which can lead to organ dysfunction and may take part in NOA. To identify the most important genes in NOA, the genes in the cellular senescence signaling pathway were intersected with the DEGs of NOA. This analysis showed a prominent role of SMAD2 and HIPK4. SMAD2 was upregulated, whereas HIPK4 was downregulated. Of the two, SMAD2 had the broader biological involvement, taking part in the regulation of six KEGG signaling pathways from the TOP12. In addition to being involved in the cellular senescence signaling pathway, SMAD2 also took part in endocytosis, the Hippo signaling pathway, hepatocellular carcinoma, colorectal cancer, signaling pathways regulating pluripotency of stem cells, gastric cancer, the Relaxin signaling pathway, pancreatic cancer, and proteoglycan signaling pathways in cancer, which also indicated the multi-functional role of SMAD2. In the GO assay, SMAD2 was also related to many biological processes, including protein deubiquitination, protein modification by small protein removal, ncRNA metabolic processes, and nuclear chromosomal-associated processes.
GSEA Assay of the NOA Datasets Classified Based on the Expression Levels of SMAD2 or HIPK4
To further explore the potential biological characteristics of SMAD2 and HIPK4, the samples in the NOA datasets were regrouped based on the expression levels of SMAD2 or HIPK4. Then, the GSEA assay was performed on the newly classified groups. Terms with an FDR q value lower than 0.25 and a p value lower than 0.05 were regarded as differential signaling pathways. In the GSE45887 dataset, the RNA POLYMERASE pathway was downregulated in the SMAD2 high expression group, with an FDR q value of 0.21918765. In the GSE45885 dataset, the Energy dependent regulation of mTOR BY LKB1 AMPK pathway was also downregulated in the SMAD2 high expression group, with an FDR q value of 0.15020823 (Figure 7A, B).
The same assay was also performed for HIPK4. The HUNTINGTONS DISEASE and PARKINSONS DISEASE signaling pathways were upregulated in the HIPK4 high expression group. No downregulated pathways were identified.
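The regrouping and filtering steps in this section can be sketched as follows. The median split is an assumption, since the text does not state the cutoff used to define the high and low expression groups. The sample names, expression values, and result rows are placeholders, and the enrichment scoring itself is not shown.

```python
from statistics import median


def split_by_expression(expression):
    """Split samples into high and low groups around the median (assumed cutoff)."""
    cutoff = median(expression.values())
    high = sorted(s for s, value in expression.items() if value > cutoff)
    low = sorted(s for s, value in expression.items() if value <= cutoff)
    return high, low


def significant_terms(results, fdr_cutoff=0.25, p_cutoff=0.05):
    """Keep terms with FDR q below 0.25 and p below 0.05, as stated in the text."""
    return [
        row["term"]
        for row in results
        if row["fdr_q"] < fdr_cutoff and row["p"] < p_cutoff
    ]


# Hypothetical expression of the grouping gene, one value per sample.
expression = {"S1": 5.1, "S2": 7.4, "S3": 6.0, "S4": 8.2}
high, low = split_by_expression(expression)

# Hypothetical enrichment results; only TERM_A passes both thresholds.
results = [
    {"term": "TERM_A", "fdr_q": 0.21, "p": 0.01},
    {"term": "TERM_B", "fdr_q": 0.40, "p": 0.03},
    {"term": "TERM_C", "fdr_q": 0.10, "p": 0.08},
]
kept = significant_terms(results)
print(high, low, kept)
```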
Co-expression Assay of SMAD2 or HIPK4 and Pathway Assay
The co-expression assay was performed to further explore the biological effects of SMAD2 or HIPK4. The results showed that SMAD2 was co-expressed with 807 genes in GSE45885 and 1507 genes in GSE45887. Moreover, 722 genes were co-expressed with SMAD2 in both datasets, and these were subsequently analyzed with the KEGG and GO pathway assays. The KEGG assay identified the following signaling pathways: cellular senescence, valine, leucine and isoleucine degradation, biosynthesis of unsaturated fatty acids, propanoate metabolism, and ubiquitin-mediated proteolysis. The GO assay showed enrichment of protein deubiquitination, protein modification by small protein removal, and vacuolar transport (Figure 8A, B).
With respect to the HIPK4 gene, 1635 co-expressed genes were identified in GSE45885, and 3033 in GSE45887. Among them, 1500 genes were found in both datasets (Supplementary TXT1), and they were analyzed with the KEGG and GO assays. The KEGG results identified Huntington disease, oocyte meiosis, the PPAR signaling pathway, the glucagon signaling pathway, and the lysosome signaling pathway. In the GO assay, many pathways involved in reproduction and spermatogenesis were identified, including fertilization, single fertilization, spermatid differentiation, spermatid development, sperm-egg recognition, cellular process involved in reproduction in multicellular organism, flagellated sperm motility, and sperm motility (Figure 8C, D).
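A co-expression screen of this kind can be illustrated by correlating every gene with the target gene across samples and then intersecting the hits from the two datasets. The sketch below assumes Pearson correlation, an absolute correlation cutoff of 0.6, and that negatively correlated genes also count as co-expressed. None of these choices is stated in the text, and the expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed(expression, target, min_abs_r=0.6):
    """Genes whose |r| with the target gene meets the assumed cutoff."""
    reference = expression[target]
    return {
        gene
        for gene, values in expression.items()
        if gene != target and abs(pearson(values, reference)) >= min_abs_r
    }


# Hypothetical expression matrices: gene -> values across samples.
dataset_1 = {
    "SMAD2": [1, 2, 3, 4, 5],
    "GENE_A": [2, 4, 6, 8, 10],
    "GENE_B": [5, 1, 4, 2, 3],
}
dataset_2 = {
    "SMAD2": [2, 1, 4, 3, 5],
    "GENE_A": [1, 1, 3, 3, 4],
    "GENE_B": [3, 3, 1, 5, 2],
    "GENE_C": [5, 4, 3, 2, 1],
}

hits_1 = coexpressed(dataset_1, "SMAD2")
hits_2 = coexpressed(dataset_2, "SMAD2")
shared_hits = hits_1 & hits_2
print(sorted(shared_hits))
```

The shared set is what would then be passed to the KEGG and GO enrichment step.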