Identification of Novel Hub Genes Through Expression Profiles Analysis in Adrenocortical Carcinoma


 Background: Adrenocortical carcinoma (ACC) is a heterogeneous and rare malignant tumor associated with a poor prognosis. The molecular mechanisms of ACC remain elusive and more accurate biomarkers for the prediction of prognosis are needed.Methods: In this study, integrative profiling analyses were performed to identify novel hub genes in ACC to provide promising targets for future investigation. Three gene expression profiling datasets in the GEO database were used for the identification of overlapped differentially expressed genes (DEGs) following the criteria of adj.P.Value<0.05 and |log2 FC|>0.5 in ACC. Novel hub genes were screened out following a series of processes: the retrieval of DEGs with no known associations with ACC on Pubmed, then the cross-validation of expression values and significant associations with overall survival in the GEPIA2 and starBase databases, and finally the prediction of gene-tumor association in the GeneCards database.Results: Four novel hub genes were identified and two of them, TPX2 and RACGAP1, were positively correlated with the staging. Interestingly, co-expression analysis revealed that the association between TPX2 and RACGAP1 was the strongest and that the expression of HOXA5 was almost completely independent of that of RACGAP1 and TPX2. Furthermore, the PPI network consisting of four novel genes and seed genes in ACC revealed that HOXA5, TPX2, and RACGAP1 were all associated with TP53. Conclusions: This study identified four novel hub genes (TPX2, RACHAP1, HXOA5 and FMO2) that may play crucial roles in the tumorigenesis and the prediction of prognosis of ACC.


Introduction
Regarded as an orphan disease with an annual incidence worldwide between 0.7 and 2 cases/million, adrenocortical carcinoma (ACC) is a heterogeneous malignancy that originates from the adrenal cortex [1][2][3]. Even though it can occur at any age group, ACC displays a bimodal incidence pattern, peaking at adults, especially women, aged 40-50, and children aged 1-5 [4,5]. In addition to the common and nonspeci c attributes of abdominal mass effect, endocrine disturbances can be seen in 40-60% of adult patients and up to 90% in children with ACC [6][7][8]. At present, authenticated prognostic markers of ACC include initial staging, resection margin status, Ki-67 staining state, autonomous cortisol secretion, and the general condition of patients [9]. Even though these markers serve as relatively good prognostic tools, the outcome of ACC remains rather variable owing to its nature of heterogeneity even within the same disease stage or the same resection margin status [10]. Importantly, the initial staging of ACC has become the foremost factor in the determination of prognosis, with a ve-year survival of 81% for stage I and of only 13% for stage IV [3,6,11]. Unfortunately, early diagnosis of ACC serves as a vast roadblock due to the lack of speci c symptoms and the rarity of this malignancy [10,12]. Currently, limited treatment options are another Gordian knot in ACC. While surgical resection is the last straw of curative treatment for ACC, recurrence, and progression are common events, even in patients undergoing R0 resection [11]. Adjuvant therapies, including the only formally approved drug Mitotane and radiotherapy, Page 3/24 have shown certain improvements on the recurrence-free survival in ACC, but not on the overall survival [3,[13][14][15].
Even though the management of ACC remains challenging because of the heterogeneity and unpredictability, the advances in gene analysis with the help of bioinformatics have bene ted the identi cation of promising targets and established novel approaches for research, to improve our understanding of ACC [5,11]. Therefore, we aimed to identify novel hub genes that may play essential roles in ACC using public datasets to elucidate the development and the prediction of prognosis of ACC in this study. An integrated analysis was conducted using the comprehensive gene expression data to distinguish the differentially expressed genes (DEGs) in ACC. Novel hub genes were screened considering the gene expression and the relation to overall survival using multiple databases. Gene expression analysis and survival analysis were used to screen novel hub genes. Co-expression analyses and the construction of protein-protein interaction (PPI) network consisting of hub genes and seed genes in ACC were completed to offer new opportunities for the investigation of possible mechanisms underlying ACC and thus facilitate the strati cation and the prediction of prognosis for patients with ACC.

Materials And Methods
Inclusion of gene expression datasets. Identi cation and representation of DEGs.
To determine the DEGs, the web tool GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) was used. Through Benjamini & Hochberg false discovery rate method, the adjusted P-value (adj.P.Value) was selected to provide an appropriate balance between the discovery of statistically signi cant genes, and the limitation of false positives. DEGs (adj.P.Value < 0.05 and |log 2 FC|>0.5) comparing ACC with NT from the three datasets were retrieved. Volcano plots were performed to represent the distribution of genes between ACC and NT for three datasets respectively. Overlapped DEGs were then identi ed using Venny 2.1 (http://bioinfogp.cnb.csic.es/tools/venny/index.html) for further analysis. The hierarchical clustering of overlapped DEGs in three datasets for visualization of expression patterns was performed in R.
Screening of novel hub genes.
To determine novel hub genes that might play crucial roles in the development or the prediction of prognosis in ACC, the following procedures were performed. Firstly, using queries for ["gene name of a speci c DEG" and "adrenocortical carcinoma"] on Pubmed to screen novel genes, with the exclusion criterion: retrieved results showing any known correlation between target genes and adrenocortical carcinoma. Secondly, the survival analysis by overall survival (OS) method and the expression values of each gene in ACC compared with NT were performed through GEPIA2 (http://gepia2.cancerpku.cn/#index) and starBase (http://starbase.sysu.edu.cn/), with the inclusion criteria: 1) retrieved results were consistent in two databases; 2) retrieved results were statistically signi cant as P < 0.05. Additionally, the GeneCards database (https://www.genecards.org/) was used to provide information concerning gene-disease associations, especially gene-tumor associations in this study, to help locate possible novel genes that may contribute to the development of ACC.

PPI network construction.
To understand the possible mechanisms underlying novel hub genes in ACC, we constructed a PPI network. Known crucial genes that contribute to the development of ACC were obtained using Phenolyzer software (http://phenolyzer.wglab.org) and were referred to as seed genes, using "adrenocortical carcinoma" as the disease term. Seed genes and identi ed novel hub genes were then entered into the STRING database (https://string-db.org/). Results were retrieved after choosing Homo sapiens as the organism under the default parameters and visualized in a PPI network, constructed in the Cytoscape software (http://www.cytoscape.org/).

Data analysis.
Three datasets used in this present study are all available in the GEO database. Expression values of genes in different stages of ACC were retrieved from the UALCAN database (http://ualcan.path.uab.edu/analysis.html). Co-expression analysis was performed using data from the starBase database. All survival analyses were carried out in the GEPIA2 database and represented by Kaplan-Meier curves, and p values between each group pair were analyzed by the log-rank test. Statistical signi cance of different experimental data was analyzed using GraphPad Prism 8. P < 0.05 was considered to be statistically signi cant.

DEGs were identi ed in ACC.
Repeated measures and DEGs without gene symbols were excluded from the results. Based on the cutoff criteria, 267, 242, and 186 up-regulated DEGs comparing ACC with NT were retrieved in the GSE90173, GSE19776, and GSE14922 datasets, respectively; 714, 943 and 202 down-regulated DEGs were shown as well in volcano plots ( Fig. 1A-C). Notably, a total of 10 up-regulated and 28 down-regulated DEGs were overlapped among the three independent cohorts (Fig. 1D-E) and these DEGs were used for further analysis.
Expression patterns of overlapped DEGs.
To determine the expression patterns of the aforementioned 38 DEGs in three datasets, we performed clustering analyses to visualize the expression pro les of all samples in each heat map in which the genes had been hierarchically clustered to provide an overview of the general pattern (Fig. 2). The expression pro les among ACC and NT in GSE90173 and GSE19776 showed a certain degree of heterogeneity ( Fig. 2A-B) while ACC and NT samples within GSE14922 were placed in two separate clusters (Fig. 2C), which revealed that ACC indeed had an expression pro le distinguishable from NT based on these 38 DEGs.
Four novel hub genes were screened out.
To determine novel hub genes that may serve vital roles in the tumorigenesis and the prediction of prognosis of ACC, we tried to screen among the aforementioned 38 DEGs in a series of procedures. 3 upregulated DEGs, CDK1, CCNB1 and PRC1, and 4 down-regulated DEGs, PHYHIP, IGFBP6, CDKN1C, and CXCL12, were shown to be associated with ACC after searching on Pubmed (Table 1). Therefore, a total number of 7 genes were ruled out rst to nd novel genes. After inputting the remaining 31 DEGs to the GEPIA2 and starBase databases, we obtained 7 up-regulated DEGs and 2 down-regulated DEGs displaying consistent and statistically signi cant results in the expression values and OS of ACC. For these 7 up-regulated DEGs, we additionally searched them in the GeneCards database to seek promising genes showing possible gene-tumor associations, and 2 genes were retrieved with positive results while TPX2 was associated with hepatocellular carcinoma and RACGAP1 with uterine carcinosarcoma (Table 1). Together, we screened out 2 up-regulated DEGs (TPX2 and RACGAP1) and 2 down-regulated DEGs (HOXA5 and FMO2) that may serve crucial functions in ACC. Genes in red indicated known associations with ACC; genes in blue indicated identi ed novel hub genes in ACC.
Expression and prognostic values of novel hub genes in ACC.
After screening of four novel hub genes, we evaluated and visualized the expression levels of each hub gene in ACC compared with NT and the survival analysis of each gene in ACC using data from the GEPIA2 database (Fig. 3). TPX2 and RACGAP1 were overexpressed in ACC while the expression of HOXA5 and FMO2 was down-regulated, which was consistent with the data collected from the three datasets used in the rst place (SupFig. 1). The survival analyses showed that high expression of TPX2 or RACGAP1 (Fig. 3A-B), as well as low expression of HOXA5 or FMO2 (Fig. 3C-D), was associated with shorter OS among ACC patients, implying that all four hub genes might serve as independent prognostic targets in ACC.
Since initial staging of ACC is the most important feature in determining prognosis, we next retrieved the expression values of these four novel genes in different stages of ACC to see whether they played a role in the development of ACC (Fig. 4). A signi cantly elevated expression of TPX2 and RACGAP1 was observed along with the increase of stages in ACC, with the lowest expression in stage I and the highest in stage IV (Fig. 4A-B), indicating that both genes might contribute to the development of ACC and act as markers for staging. As for two down-regulated novel hub genes, the expression of HOXA5 was independent of the stages (Fig. 4C) while a decreasing trend was noticed for the expression of FMO2 in different stages of ACC (Fig. 4D).
Co-expression analysis of four hub genes in ACC.
We next performed co-expression analysis and survival analysis of these four novel hub genes to verify possible associations in between. A strong correlation was observed between the expression of TPX2 and that of RACGAP1 (r = 0.903, Fig. 5A upper image), while a weak association existed between FMO2 and TPX2 (r=-0.376, Fig. 5E upper image) or between FMO2 and RACGAP1 (r=-0.330, Fig. 5C upper image).
Furthermore, the expression of HOXA5 seemed to be independent with that of the rest three novel hub genes (Fig. 5B, D&F, upper images). Furthermore, a total number of 76 patients with ACC were included in the study to assess the relationship between these four genes and the overall survival. While the survival analysis of co-expression of TPX2 and RACGAP1 or that of HOXA5 and FMO2 was consistent with the survival analysis of single-gene expression in ACC ( Fig. 5A&F and Fig. 3A-D, lower images), the impact of higher expression of TPX2 or RACGAP1 outweighed the bene ts that higher expression of HOXA5 or FMO2 brought on the OS of ACC patients, and vice versa ( Fig. 5B-E, lower images). Together, these results indicated that among these four genes, co-expression of TPX2 and RACGAP1 was the most strongly associated and might be of great prognostic value in ACC.
PPI network of novel hub genes and seed genes in ACC.
To better understand the possible roles and mechanisms underlying the aforementioned four novel hub genes in ACC, we constructed a PPI network to establish the correlation between these genes and seed genes in ACC. 16 seed genes in ACC were retrieved from the Phenolyzer software (SupTab. 1). KCNQ1OT1, H19, and WT2 were excluded for further PPI construction since they encoded long noncoding RNA instead of protein products. As was shown in Fig. 6, a total number of 17 nodes and 45 edges were included in the PPI network containing seed genes in ACC and four identi ed novel hub genes. Both TPX2 and HOXA5 were shown to have possibly direct connections with TP53, which was a seed gene with the highest score in ACC (SupTab. 1). Combined with co-expression results that HOXA5 had little relevance with the rest three novel hub genes, it might be assumed that different mechanisms and pathways were involved in the regulation of HOXA5 and TPX2, even though both genes were predicted to be associated with TP53. Furthermore, RACGAP1 might be associated with TP53 through the regulation of TPX2, consistent with the results of the co-expression analysis (Fig. 5A, upper image), while FMO2 was found to have no association with the seed genes in ACC or the other three novel hub genes. Together, this PPI network provided us a basic starting point to further investigate the roles these four novel hub genes might play in ACC.

Discussion
In the present study, we used three gene expression pro ling datasets in the GEO database for the identi cation of overlapped DEGs in ACC compared with normal adrenal tissues. Together, 10 upregulated and 28 down-regulated DEGs were screened out. Clustering analysis revealed a certain inconsistency of the expression of DEGs within the same dataset, especially in GSE90173 and GSE19776 datasets (Fig. 2A&B), highlighting the heterogeneity of ACC. The phenomenon that the GSE14922 dataset showed obvious and clear strati cation might be partially attributed to the small number of specimens involved (Fig. 2C). Next we aimed to identify novel hub genes among these 38 DEGs within a series of procedures. 7 DEGs shown known associations with ACC on Pubmed were ruled out rst (Table 1), among which CDK1, CCNB1, PRC1, and CDKN1C were involved in the regulation of cell cycle while IGFBP6 interacted with IGFs. The expression values and prognostic signi cance of the remaining 31 novel DEGs were cross-checked among the three datasets, the GEPIA2 database, and the starBase database to increase data reliability and reduce false positives, as a result of which 7 up-regulated and 2 down-regulated DEGs were identi ed. To further narrow down the range of promising hub genes, 7 upregulated DEGs were inputted into the GeneCards database and two hub genes were retrieved showing gene-tumor associations. Together, the identi cation of two up-regulated and two down-regulated novel hub genes in ACC was completed. The survival analysis indicated that RACGAP1 had the greatest impact on the OS of ACC, followed by TPX2, FMO2, and HOXA5 (Fig. 3). Since the staging system in ACC is the most important factor in the prediction of prognosis, we analyzed the expression pro le of four novel hub genes in different stages of ACC and found that TPX2 and RACGAP1 were positively correlated with the staging, indicating a promising role of such genes in the strati cation and the prediction of prognosis in ACC [11]. Interestingly, co-expression analysis revealed that the association between TPX2 and RACGAP1 was the strongest among all contrast groups and that the expression of HOXA5 was almost completely independent of that of RACGAP1 and TPX2 (Fig. 5). Therefore, a PPI network was constructed to visualize the possible connections of four novel genes and seed genes in ACC (Fig. 6). Surprisingly, HOXA5, TPX2, and RACGAP1 were somehow all associated with TP53, indicating that at least two pathways were involved. And the phenomenon that RACGAP1 was associated with TP53 through TPX2 partially explained the aforementioned strong association between TPX2 and RACGAP1, implying that future studies concerning TPX2 and RACGAP1 in ACC should be warranted.
TP53 encodes the transcription factor p53, which plays an important role in the inhibition of cancer cells. Along with the accumulation of p53, TP53 inactivation mutation, the incidence of which is around 15-70%, is one of the most common molecular events in the formation of ACC and is closely associated with a larger volume, more advanced stage, a higher rate of metastasis, and shorter disease-free survival in ACC [16][17][18][19][20][21]. Furthermore, TP53 germline mutations are more likely to occur in pediatric ACC and are associated with worse outcomes [22], emphasizing the magnitude of genetic counseling in patients with ACC, especially in those of suspected hereditary cancer syndrome [23]. It has also been reported that ACC is an integral part of Li Fraumeni Syndrome (LFS), which is a familial cancer predisposition caused by germline mutations of TP53 [19]. Since TPX2, RACGAP1, and HOXA5 were found to be associated with TP53 in this study, it could be possible that these three novel hub genes might play important roles in the formation of ACC.
TPX2 gene encodes a microtubule-associated protein, which serves a vital role in mitotic spindle formation and chromosome segregation process [24,25]. The overexpression of TPX2 has been reported in many studies and related to poor clinical outcomes of multiple malignancies, such as lung cancer, thyroid carcinoma, clear cell renal carcinoma, gastrointestinal cancer, and hepatocellular cancer, implying the possibility that TPX2 can be used as a prognostic marker and potential therapeutic target in these tumors [26][27][28][29][30][31][32][33][34]. Furthermore, TPX2 was reported to be involved in the invasion of tumor cells by the elevation of MMP2 and MMP9 through AKT signaling in hepatocellular carcinoma [35,36]. With the help of MYC, TPX2 could also cooperate with AURKA to drive the tumorigenesis of colon cancer, providing a promising therapeutic option for MYC-driven tumors by inhibition of the AURKA-TPX2 axis [37,38]. In terms of the association between TPX2 and p53, a novel regulatory circuitry of TPX2-p53-GLIPR1 was found in bladder cancer to regulate proliferation, migration, invasion, and tumorigenicity, while silencing of TPX2 could activate p53 signaling and inhibit the proliferation of breast cancer cells [39,40]. Together, these data suggest that TPX2 might be a promising target for the prediction of prognosis and therapeutic approaches in ACC and future studies concerning TPX2 in ACC should be warranted.
RACGAP1, which is located in the mitotic spindle and participates in Rho mediated inactivation of various signals, is a member of GTPase activating proteins (GAPs). RACGAP1 is a known regulator of cytokinesis during the normal cell cycle and plays an important role in cell growth regulation, differentiation, cell transformation, invasion, migration, oncogenesis, progression, recurrence, and metastasis [41,42].
RACGAP1 has been correlated with poor outcomes in various tumor types, including breast cancer, hepatocellular carcinoma, gastric cancer, colorectal cancer, epithelial ovarian cancer, esophageal carcinoma, and invasive cervical cancer [43][44][45][46][47][48]. Interestingly, RACGAP1 was reported to be a target gene of mutant p53 [49,50]. Furthermore, AURKA is reported to be directly correlated with the expression of RACGAP1, which is a modulator of the classical Wnt signaling pathway [44]. Since AURKA has been recognized to be closely associated with TPX2 and the Wnt signaling pathway plays an important role in ACC, whether the association between TPX2 and RACGAP1 through AURKA exists and whether RACGAP1 could function through Wnt signaling pathways in ACC are worth exploring [3,37].
HOXA5 is a member of the HOX family that participates in various cellular processes. The methylation status of HOXA5 has also been suggested to be associated with tumor prognosis in multiple studies [51,52]. In breast cancer, the loss of HOXA5 expression, resulting from methylation of its promoter regions, could lead to loss of p53 expression and the functional activation of Twist, and therefore promotes breast tumorigenesis [53][54][55]. Furthermore, the cooperation between HOXA5 and p53 is found to be able to inhibit the invasion of tumor cells, at least partially owing to the decrease of MMP2 activity in NSCLC. [56]. Low expression of HOXA5 is associated with high levels of nuclear β-catenin, which is the hallmark of the Wnt signaling pathway in colorectal carcinoma [57]. Moreover, HOXA5 could also negatively regulate the Wnt canonical pathway in the developing lungs [58]. Since RACGAP1 is a modulator of the classical Wnt signaling pathway [44], whether there is a connection between RACGAP1 and HOXA5 remains to be explored. Together, it is highly promising that the regulation of HOXA5 could provide a potential therapeutic approach in ACC [59].
FMO2, a major isoform mainly and greatly expressed in the human lung, is a member of a superfamily of monooxygenase genes. FMO2 has been shown to regulate oxidative stress levels by producing metabolites that promote the release of ROS in the form of hydrogen peroxide [60]. However, there is little evidence to support the association between FMO2 and tumorigenesis. In experimental animals, the expression of the FMO2 gene is regulated by sex hormones, and putative glucocorticoid responsive elements are found to be located in the 59-anking regions of the rabbit FMO2 gene, indicating that FMO2 may be regulated by the excessive release of hormones in ACC [61]. Whether FMO2 contributes to the development of ACC and how it functions deserves future investigation.
There were still some limitations in this study. The sample size of the datasets we used was limited, and it might be more statistically signi cant to adopt a larger sample size for statistical analysis. In addition, this study was completely based on in silico evidence, which needed to be veri ed by experiments and real-world clinical samples. And due to the default limitations of the online analysis tools we used, unexpected variances might occur during the analyses. In spite of these limitations, we still believe that this study can still provide researchers with some potential research targets in ACC.
In conclusion, this study identi ed four novel hub genes using integrative analysis of gene expression pro ling in ACC. TPX2, RACGAP1, and HOXA5 may function through the regulation of p53 while FMO2 may be associated with the excessive release of hormones. Therefore, this study provided potential prognostic and therapeutic targets for future experimental and clinical investigation of ACC.

Declarations
Availability of data and materials The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.