Weighted Gene Co-Expression Network Analysis Revealed The Cellular Senescence Signaling Pathway Associated With Non-Obstructive Azoospermia

Background Non-obstructive azoospermia (NOA) in males is a disease commonly associated with infertility, and is characterized by a disorder in spermogenesis, which may be caused by chromosomal abnormalities, orchitis, or chemical therapy. Spermogenesis is a process tightly regulated by precise gene expression patterns. However, the key genes and signaling pathways contributing to NOA remain unclear. In this study, weighted gene co-expression network analysis was performed to identify NOA-associated gene clusters, and combined with the


Background
Infertility occurs in 10%-15% of all couples, and male factors contribute to approximately 50% of cases [1,2] .Non-obstructive azoospermia (NOA) is a common disease in male infertility [3] , whose mechanism is different from that of obstructive azoospermia.The pathogeny of NOA is associated with chromosome variation, environmental pollution testis infection endocrine diseases and varicocele [4] .Only a few therapeutic options for NOA are known, including percutaneous epididymis testicular sperm aspiration (PETSA) [5,6] , which often leads to non-ideal results.If mature sperm cannot be obtained from the testes, these patients must rely on other options, such as sperm donors or adoption.
An obvious reproductive duct disorder has not been found in NOA, which results instead from a failure in spermatogenesis.Spermatogenesis is a complex and highly ordered process, mainly involving the storage and differentiation of spermatogonial stem cells (SSCs), and it is regulated by the stem cell niche, which contains the testicular Sertoli cells, various immunological cells, extracellular matrix, and other components [7] .Biological changes in the state of testicular cells directly in uences spermatogenesis, and their dysregulation may contribute to the occurrence of NOA.Previous research has shown that mutation of important genes may play a role in NOA.The mutation of genes such as RNF212, STAG3 [8] , MEIOB [9] , CLCA4 [10] , CDK2 [11] , TDRD9 [12] , and DMC1 [13] , may cause meiotic arrest.Additionally, BNC1 cooperates with TAF7L [14] , and the SOX10 [15] protein plays important roles in spermatogenesis.However, the key genes and signaling pathways associated with NOA remain unclear and need to be further explored.
In this study, we performed weighted gene co-expression network analysis (WGCNA) on different microarrays from human testicular biopsies from men with idiopathic NOA, which is a systems biology approach for describing gene correlation patterns across microarray samples.Based on WGCNA, we built a gene set containing 664 genes associated with the phenotype of the disease, and then, we performed the KEGG pathway assay.The results of the assay, combined with the differentially expressed genes (DEGs) in NOA, showed a signi cant correlation between NOA and the SMAD2 and HIPK4 genes.
Consequently, the database was divided based on the expression level of SMAD2 or HIPK4, and the GSEA assay was performed.Simultaneously, the GO Gene Ontology and KEGG assays of genes co-expressed with SMAD2 or HIPK4 were carried out.This showed that genes co-expressed with SMAD2 also took part in the cellular senescence pathway, while genes co-expressed with HIPK4 played important roles in the regulation of fertilization, sperm differentiation and development, sperm−egg recognition, cellular processes involved in reproduction in multicellular organisms, agellated sperm motility, and sperm motility in general.This indicates that SMAD2 and HIPK4 are relevant for NOA and may be used as therapeutic targets.

Data Sources
The preprocessed mRNA gene expression pro les from the GSE45885 and GSE45887 datasets were downloaded from the GEO database of NIH.GSE45885 contains data from Affymetrix Human Gene 1.0 ST microarrays from 31 testicular biopsy samples, 27 of which were obtained from patients with various types of NOA and 4 from patients with normal spermatogenesis.GSE45887 contains data from 16 patients with NOA and 4 patients with normal spermatogenesis. [16,17]

DEG Screening
The GEO2R interactive webtool was used in order to identify different gene expression levels.Genes with a value of |log2FC| > 1 and a false discovery rate (FDR) < 0.05 were identi ed as DEGs.Then, the two datasets were intersected to generate the nal DEG gene set for subsequent analysis. [18]eighted Gene Co-Expression Network and Genest Identi cation WGCNA is a systems biology method for describing gene correlation patterns across microarray samples.WGCNA can be used to identify the Gene modules that are associated with a disease occurrence.In our study, we rst used the WGCNA package (version 1.68) to analyze the two gene datasets.According to the estimate power function of the WGCNA package, the power of GSE45885 and GSE45887 was set to 8 and 12, respectively.Then, the correlation of each module with NOA was calculated.Genes with a correlation coe cient higher than 0.8 were chosen as hub genes, and they were further assayed to identify the associated signaling pathway in NOA. [19]

KEGG Signaling Pathway Assay and Key Genes Discovery
To identify the function of the hub genes screened in WGCNA analysis, the clusterprolifer package was used to perform GO and KEGG analysis, with p<0.05 set as the threshold value.Thus, in combination with the DEG results, the most important genes associated with NOA could be identi ed. [20]nctional Annotation and Gene Set Enrichment Analysis Based on the Expression Levels of SMAD2 or HIPK4 In order to explore the in uence of SMAD2 and HIPK4 on the biological function of signaling pathways in testicular cells samples from the GSE45885 and GSE45887 datasets were ranked according to the expression levels of SMAD2 and HIPK4 and classi ed accordingly to the high-expression group (upper than median) or low-expression group (lower than median).Then, the Functional Annotation and Gene Set Enrichment Analysis (GSEA) was performed on the samples to unravel the function of SMAD2 and HIPK4.Signi cant terms were de ned based on a p-value < 0.05 and an adjusted p-value < 0.25. [21]-expression of SMAD2 or HIPK4 and Pathway Assay To better understand the pathways in which SMAD2 and HIPK4 may be involved, the co-expressed genes of SMAD2 and HIPK4 were screened from the GSE45885 and GSE45887 datasets.The coe cient between each gene and SMAD2 or HIPK4 was calculated, and the genes with a coe cient higher than 0.7 were identi ed as co-expressed genes.Genes co-expressed with either SMAD2 or HIPK4 from the two databases GSE45885 and GSE45887 were then intersected to generate the nal set of co-expressed genes.Then, the KEGG and GO function enrichment of co-expressed genes assays were performed as previously described.

Results
DEGs of NOA First, DEGs of NOA were screened via normal control in the GSE45885 and GSE45887 databases.The results showed that the number of upregulated genes in the two databases was 118 and 116, respectively.Further, the number of downregulated genes, 871 and 772 respectively, was much higher than that of the upregulated genes (Figure 1).In order to determine the same DEGs in the two databases, the VENN assay was performed, and identi ed 91 upregulated and 772 downregulated genes in both databases.

NOA-associated Signi cant Modules Identi ed by the WGCNA Assay
To further explore the genes associated with NOA, the WGCNA assay was performed.The WGCNA assay is a systems biology method used to describe gene association patterns among different samples, and it can be used to identify highly synergistic gene sets, and to determine candidate biomarker genes or therapeutic targets based on the association between the gene sets and phenotypes.
There was no obvious outlier in the sample clustering (Figure2A 3A), and the connectivity between genes in the gene network met a scale-free network distribution with a soft threshold power of β = 8 or 12, respectively(Figure 2B 3B).The WGCNA assay of GSE45885 and GSE45887 identi ed 28 and 26 modules, respectively (Figure 3C D and 4C D).In the GSE45885 database, the top three modules were the brown, light green, and tan modules, with a score of 0.61, 0.63, and 0.7, respectively.In the GSE45887 database, the top three modules were the black, blue, and dark green modules, with a score of 0.56, 0.64, and 0.53, respectively.The module-feature relationship of both datasets was assayed by Pearson's correlation analysis.The black (r=0.57,P<1.9e-105), blue(r=0.39,P<1.2e-111), and dark green (r=0.63,P<2.6e-07) modules in GSE45887(Figure 4), and the brown(r=0.68,P<1e-200), light green(r=0.55,P<3.4e-14), and tan (r=0.8,P<3.3e-81)modules in GSE45885 were highly associated with the NOA phenotype (Figure 5).
With the help of the NOA-associated modules, the hub genes in each module were determined in both databases, as the NOA-associated genes with an X coe cient higher than 0.8.Then the intersection of NOA-associated genes in the two database was taken, and the genes of interest were identi ed as the NOA-associated gene set, which was further assayed.

Biological Function of NOA-associated Gene set
In order to explore the potential biological characteristics of the NOA-associated gene set the KEGG signaling pathway and GO assays were performed.The TOP12 different KEGG signaling pathways were identi ed (Figure 6A).Ubiquitin-mediated proteolysis, cellular senescence, and autophagy signaling pathways were highly related with the pathogenesis and biological state of NOA.According to the GO assay, autophagy, protein deubiquitination, and protein modi cation by small protein removal were highly related with NOA, and they may refer to the biological characteristic of NOA (Figure 6B).Cellular senescence is an important biological process which can lead to organ dysfunction and may take part in NOA.In order to identify the most important genes in NOA, the genes in the cellular senescence signaling pathway were intersected with the genes in the DEG of NOA.This analysis showed a prominent role of SMAD2 and HIPK4.SAMD2 was upregulated, whereas HIPK4 was downregulated.Between them, SAMD2 exhibited the most biological effects, involved in the regulation of six KEGG signaling pathways from the TOP12.In addition to being involved in the cellular senescence signaling pathway, SMAD2 also took part in endocytosis, the Hippo signaling pathway, hepatocellular carcinoma, colorectal cancer, signaling pathways regulating pluripotency of stem cells, gastric cancer, the Relaxin signaling pathway, pancreatic cancer, and proteoglycan signaling pathways in cancer, which also indicated the multi-functional role of SMAD2.In the GO assay, SMAD2 also related with many biological processes, including protein deubiquitination, protein modi cation by small protein removal, ncRNA metabolic processes, and nuclear chromosomal-associated processes.
GSEA Assay of the NOA Database Classi ed Based on the Expression Levels of SMAD2 or HIPK4 In order to further explore the potential biological characteristics of SMAD2 and HIPK4 the NOA databases were classi ed again based on the expression levels of SMAD2 or HIPK4.Then, the GSEA assay was performed in the newly classi ed groups.Terms with an FDR q value lower than 0.25 and a p value lower than 0.05 were regarded as differential signaling pathways.In the GAE455887 dataset, the RNA POLYMERASE pathway was downregulated in the SMAD2 high expression group, with an FDR q value of 0.21918765.In the GAE455885 dataset, the Energy dependent regulation of mTOR BY LKB1 AMPK pathway was also downregulated in the SAMD2 high expression group, with an FDR q value of 0.15020823.(Figure7A, B) The same assay was also performed for HIPK4.The HUNTINGTONS DISEASE and PARKINSONS DISEASE signaling pathways were upregulated in the HIPK4 high expression group.No downregulated pathways were identi ed.

Co-expression Assay of SMAD2 or HIPK4 and Pathway assay
The co-expression assay was performed to further explore the biological effects of SMAD2 or HIPK4.The results showed that SMAD2 was co-expressed with 807 genes in GSE45885 and 1507 genes in GSE45887.Moreover, 722 genes were co-expressed with SMAD2 in both databases, which were consequently analyzed with the KEGG and GO pathway assays.The KEGG assay identi ed the following signaling pathways: cellular senescence, valine, leucine and isoleucine degradation, biosynthesis of unsaturated fatty acids, propanoate metabolism, and ubiquitin-mediated proteolysis.The GO assay showed that protein deubiquitination, protein modi cation by small protein removal, and vacuolar transport were clustered out.(Figure 8A, B) With respect to the HIPK4 gene, 1635 co-expressing genes were identi ed in GSE45885, and 3033 in GSE45887.Among them, 1500 genes were found in both databases (Supplementray TXT1), and they were analyzed with the KEGG and GO assays.The KEGG results identi ed Huntington disease, oocyte meiosis, the PPAR signaling pathway, the glucagon signaling pathway, and lysosome signaling pathway.In the GO assay, many pathways involved in reproduction and spermatogenesis were identi ed, including fertilization, single fertilization, spermatid differentiation, spermatid development, sperm−egg recognition, cellular process involved in reproduction in multicellular organism, agellated sperm motility, and sperm motility.(Figure 8C, D) Discussion NOA is caused by a failure in spermatogenesis, which is regulated by various factors, especially the genetic background of SSC, and the function of Sertoli cells.NOA was quilt different form obstructive azoospermia which presents with obvious testis microstructure changes.NOA is a result of meiotic arrest, and it has been suggested that microenvironment changes may play critical roles [22] .Therefore, it is important to understand the biological characteristics of testis, and explore the gene expression status and identity of NOA-associated genes.
In order to identify the NOA-associated gene set, the WGCNA assay was performed, which is a systems biology method for describing gene correlation patterns across microarray samples, and can be used for searching disease-associated genes [23].Instead of screening out DEGs, WGCNA can classify clusters of highly correlated genes into one module, which may be more bene cial in identifying clinical biomarkers for diagnosis and therapy.In this study, two NOA databases were analyzed, and the WGCNA assay identi ed 28 and 26 modules, respectively, and the top three of these modules from each database were chosen for further analysis.In order to determine the hub genes associated with NOA in these six modules, genes with an X value higher than 0.8 were chosen to develop a new gene set.This NOAassociated gene set contained about 600 genes.If the X value was set higher than 0.9 the number of genes in the gene set was quite limited.Therefore, 0.8 was the ideal demarcation line, resulting in enough genes with NOA association.
After identifying the NOA-associated hub gene set, it was necessary to explore their potential biological effects, which can be used for further narrowing down the gene set and nd out the core genes in NOA.For this purpose, the KEGG signaling pathway assay and GO assay were carried out.The KEGG signaling pathway assay showed that the TOP ten signaling pathways included the ubiquitin-mediated proteolysis pathway, the Herpes simplex virus 1 infection pathway, the endocytosis pathway, the cellular senescence pathway, the cell cycle pathway, the propanoate metabolism pathway, the autophagy -animal pathway, the Hepatitis-B pathway, and the hepatocellular carcinoma signaling pathway.Combined with the biological characteristic, this analysis indicated that cellular senescence may play the core roles in these pathways.Cellular senescence can be considered as a long-term cell cycle arrest [24] , associated with functional disorder.During the senescence process [25] , changes in protein homeostasis, mitochondrial respiration, autophagy, the number of stem cells, senescence-associated secretion phenotype (SASP), and cellular crosstalk are observed.In the KEGG assay, most of the TOP 10 signaling pathways had a close relationship with senescence, which indicates a key role of senescence in NOA.In the Cellular senescence pathway, 12 genes are clustered, namely, SMAD2, RB1, MAP2K2, PIK3CA, NRAS, TRPM7, MDM2, PPP1CB, RAD50, HIPK4, NBN, ATM.These genes take part in the regulation of senescence at different levels.
In order to explore the core NOA-associated genes in the cellular senescence pathway, the intersection between the cellular senescence pathway and the DEGs of the two databases was analyzed.The upregulated SMAD2 gene and downregulated HIPK4 gene were identi ed in the intersection of the three gene sets, suggesting that these two NOA-associated genes may play key roles.In order to explore the effects of the dysregulation of these two genes on the biological characteristics of testicular tissue, the two databases were divided again according to the expression levels of SMAD2 or HIPK4.Thereafter, the GSEA assay was performed on the SMAD2 high/low expression groups, and HIPK4 high/low expression groups.
SMAD2 belongs to SMAD, a family of proteins similar to the gene products of the Drosophila gene 'mothers against decapentaplegic' (Mad) and the C. elegans gene Sma.It has complex functions and takes part in many pathways, such as the TGF-β signaling pathway [26] , which also play important roles on the spermatogenesis.The GSEA assay on the effects of SMAD2 showed that it can in uence the energy dependent regulation of mTOR by LKB1 AMPK pathway, and RNA polymerase.The mTORassociated signaling pathway acts as a nutrition sensor of nutrition, and regulates cell differentiation and senescence [27] , whose disorder may cause spermogenesis arrest.Similarly, spermogenesis depends on the energy metabolism balance, which also indicated that energy dependent regulation of mTOR by LKB1 AMPK pathway.The KEGG assay was also performed on the genes co-expressed with SMAD2, and the cellular senescence pathway was again identi ed, which strengthens the hypothesis that testicular cell senescence induced by SMAD2 may play a center role in NOA, and may be used as a therapeutic target.
We found that HIPK4, a member of the homeodomain interacting protein kinase (HIPK) family [28] , is downregulated in NOA.HIPK4 may phosphorylate the tumor suppressor protein p53, which play important roles in the regulation of cell proliferation, cellular senescence and aging [29] .The GSEA assay on the effects of HIPK4 showed that HIPK4 can down regulate the HUNTINGTONS DISEASE and PARKINSONS_DISEASE signaling pathways.The GO assay of genes co-expressed with HIPK4 showed the HIPK4 may also take part in spermatogenesis, and can regulated sperm differentiation and development, which is highly relevant for NOA, thus, indicating the important functions of HIPK4 in NOA.
As already mentioned, HIPK4 can induce the phosphorylation of the senescence gene p53, and SMAD2 plays important roles in senescence.The results of the bioinformatics assays described in this study suggest that senescence of Sertoli cells and sperm cells may be the core signaling pathway of NOA.Therefore, methods aiming at the removal of senescent cells or the reversal of senescence altogether may be bene cial for NOA therapy.

Conclusion
In this research, it was revealed that two genes, SMAD2 and HIPK4, were associated with NOA.These genes took part in the regulation of senescence, and the function of senescence cells often was disorder.So, it can be indicated that the cellular senescence of sertoli cells may play important roles in the NOA.This research can help us understand the mechanisms underlying NOA, and suggest anti-senescence therapeutic approaches that target these associated genes, especially SMAD2 and HIPK4.