Weighted Gene Co-Expression Network Analysis to Construct Competitive Endogenous RNAs Network in Chromogenic Renal Cell Carcinoma

Background: This study aimed to construct the competing endogenous RNA (ceRNA) network in chromophobe renal cell carcinoma (ChRCC). Methods: Clinical and RNA sequence proles of patients with ChRCC, including messenger RNAs (mRNAs), microRNAs (miRNAs), and long noncoding RNAs (lncRNAs), were obtained from The Cancer Genome Atlas (TCGA) database. “EdgeR” and “clusterProler” packages were utilized to obtain the expression matrices of differential RNAs (DERNAs) and to conduct gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. Weighted gene co-expression network analysis (WGCNA) was performed to screen the highly related RNAs, and miRcode, StarBase, miRTarBase, miRDB, and TargetScan datasets were used to predict the connections between them. Univariate and multivariate Cox proportional hazards regressions were performed in turn to elucidate prognosis-related mRNAs in order to construct the ceRNA regulatory network. Results: A total of 1628 DElncRNAs, 104 DEmiRNAs, and 2619 DEmRNAs were identied. WGCNA showed signicant correlation in 1534 DElncRNAs, 98 DEmiRNAs, and 2543 DEmRNAs, which were related to ChRCC. Fourteen DEmiRNAs, 113 DElncRNAs, and 43 DEmRNAs were screened. Nine mRNAs (ALPL, ARHGAP29, CADM2, KIT, KLRD1, MYBL1, PSD3, SFRP1, SLC7A11) signicantly contributed to the overall survival (OS) of patients with ChRCC (P < 0.05). Furthermore, two mRNAs (CADM2, SFRP1) appeared to be independent risk factors for ChRCC. Conclusion: The ndings revealed the molecular mechanism of ChRCC and potential therapeutic targets for the disease.


Introduction
As one of the three major renal cell carcinoma histological subtypes, chromophobe renal cell carcinoma (ChRCC), accounts for 4-5% of renal cancer cases [1]. The average diagnostic age of ChRCC is 58 years, and most patients are male [2]. Most patients with ChRCC have good prognoses with 5-year survival rates of 78-100%. However, metastases still occur in about 6-7% of patients and usually affect the liver or lungs [3]. Furman and Panel tumor classi cation schemes have already been proposed for use in staging ChRCC over the past decades [4,5]. However, considering the ambiguity of the grading criteria and the lack of applicability to the characteristics of the nucleus of ChRCC, their prognostic value appears to have been overestimated [6,7]. In order to better standardize treatment and improve patient prognosis, it is critical to elucidate highly speci c biomarkers and effective therapeutic targets. In 2011, Salmena et al.
described the competing endogenous RNA (ceRNA) hypothesis, which re-explored the regulatory function of long noncoding RNAs and the potential network between messenger RNAs (mRNAs), microRNAs (miRNAs), and long noncoding RNAs (lncRNAs) [8]. As a key element in the ceRNA network, miRNAs could simultaneously be competitively antagonized by lncRNA, mRNA, and other RNAs through shared microRNA response elements (MREs). Overexpressed MRE-containing transcripts (so-called "RNA genes and pathways-pathways networks. [15]. After verifying and con rming the optimal soft threshold, we conducted weighted gene co-expression network analysis (WGCNA) using the "WGCNA" package. RNAs were classi ed into different color modules according to the connectivity and synergy between them. In selecting the RNAs for next analyses, connectivity was established between each module and the relevant ChRCC trait.
Finally, univariate and multivariate Cox proportional hazards regressions were performed in turn using the "survival" package of R to elucidate the most signi cant independent risk factor mRNAs associated with the OS of patients with ChRCC. Sample scores were compared to the median risk score and divided into high-risk and low-risk groups. ROC and C-indices were used to evaluate the stability and reliability of the mRNA-based prognostic model. The detailed ow chart is presented in Fig. 1.
Based on the elucidated relationships between lncRNAs-miRNAs and miRNAs-mRNAs, and the Cox results, we were able to derive the lncRNAs-miRNAs-mRNAs competing endogenous network. Cytoscape software (version of 3.6.1) was used to visualize the ceRNA network. The Kaplan-Meier curves were used to analyze the reliability with which each RNA in the ceRNA network was able to predict the patient's OS (with P < 0.05 indicating signi cant reliability).

Results
The lncRNAs, miRNAs and mRNAs expression matrices of the 89 patients (24 normal and 65 with ChRCC) were downloaded from the TCGA dataset. Patients' clinicopathological characteristics are presented in Table 1.  Fig. 2 (A, B, C). GO analysis showed that the top ve functions of the 2619 DEmRNAs focused on organic anion transport, regulation of membrane potential, regulation of ion transmembrane transport, modulation of chemical synaptic transmission, and regulation of trans − synaptic signaling (Fig. 3A).
Meanwhile, the top ve KEGG pathways of these DEmRNAs were enriched in neuroactive ligand − receptor interaction, cAMP signaling pathway, complement and coagulation cascades, retinol metabolism, and chemical carcinogenesis (Fig. 3B). Insulin secretion and connection between pathways were presented in the pathways-pathways network (Fig. 3C). In the pathways-genes network, multiple RNAs were related to ve pathways: complement and coagulation cascades, metabolism of xenobiotics by cytochrome P450, neuroactive ligand − receptor interaction, retinol metabolism, and steroid hormone biosynthesis (Fig. 3D).
In the WGCNA, the power of the soft threshold of the lncRNAs-miRNAs matrix was 10 (Fig. 4A), and the miRNAs-mRNAs matrix was 14 ( Fig. 4B), both of which achieved the best satisfaction and consistency of the scale-free R-squared value. After calculating their adjacency and connectivity, these lncRNAs-miRNAs were classi ed into 10 modules (Fig. 4C), and miRNAs-mRNAs were classi ed into 11 modules (Fig. 4D). Their topological overlap matrix heatmaps are presented in Fig. 4 (E, F). Red, yellow, brown, and grey modules of lncRNAs-miRNAs were found to have signi cant correlation (Fig. 5A), and greater connections were also observed in green, turquoise, and grey modules of the miRNAs-mRNAs (Fig. 5B). Modules in these two groups included a total of 1534 DElncRNAs, 98 DEmiRNAs, and 2543 DEmRNAs, which were also more closely related to ChRCC than the others (Fig. 5C, D).
Additionally, Kaplan-Meier analyses for the ceRNA members showed that low expression of KLRD1 and high expression of LINC00520 signi cantly contributed to worse OS for patients with ChRCC (P < 0.05) ( Fig. 7B, C). Meanwhile, the low-risk group also showed obvious superiority over the high-risk group, despite its P value being slightly greater than 0.05 (P = 0.06016) (Fig. 7D).

Discussion
With the progress of molecular biology, the function of noncoding transcriptome has been extensively explored. Multi-RNAs competition regulatory networks appear to play indispensable roles in the biological processes and courses of cancer diseases [16,17]. Several studies have explored and veri ed ceRNA networks in the past. Wang JD et al. included 407 normal and 151 acute myeloid leukemia (AML) samples from Genotype-Tissue Expression (GTEx) (https://commonfund.nih.gov/GTEx/) and TCGA datasets in their study. They found that the ceRNA network in AML involved 108 lncRNAs, 10 miRNAs, and 8 mRNAs, which appeared to in uence prognosis and cancer progression [18]. Meanwhile, Yao Y et al. also established a ceRNA network from TCGA database comprising 52 lncRNAs, 17miRNAs, and 79 mRNAs in breast cancer's RNAs matrix and in which ve lncRNAs were found to signi cantly affect patients' OS. Furthermore, results of GO and KEGG analyses of these mRNAs were also related to biological characteristics of tumors [19].
WGCNA is a bioinformatic and sensitive method, which is especially suitable for these large and highdimensional data, as well as for low abundance or fold change genes. It is able to cluster highly related genes from microarray samples into different color modules and explore the relationship between the genes and cancer traits [20]. WGCNA has already been used in various oncological studies to explore hub genes and the regulatory relationships between them [21][22][23]. In our study, we preformed WGCNA to select highly related module genes, which helped us elucidate the more meaningful RNAs for further prediction.
Importantly, prediction in multiple datasets allowed us to rapidly lockdown the shared high-value genes similar to previous studies [24][25][26][27]. Another advantage of our study was the application of univariate and multivariate Cox proportional hazards regressions on selected target mRNAs from which we obtained a reliable and stable prognostic model and identi ed important genetic biomarkers for ChRCC within the ceRNAs network. The excellent C-index and 3-and 5-year survival AUCs further proved the superiority of our model. The Kaplan-Meier curves showed that low-risk patients would achieve better long-term OS.
A member of the cell adhesion molecule gene family, CADM2, has been reported to be under-expressed in the nine mRNAs. This might contribute to the progression of various cancers, including prostate cancer, ovarian cancer, lymphoma, melanoma, and clear renal cell cancer (cRCC) [28][29][30][31][32]. CADM2 is believed to prevent tumor progression, invasion and metastasis by maintaining cell's polarity and adhesion [32].
Tyrosine protein kinase (KIT) is overexpressed in various cancers [33,34]. Especially in ChRCC and oncocytoma. Huo L et al. reported that KIT was more sensitive to ChRCC and oncocytoma than other renal cancers, and hence it would be useful in precise tumor classi cation and targeted therapy [35,36].
In the past, SFRP1 has been considered to be a tumor suppressor gene and possibly antagonistic to the wnt signaling pathway [37]. It has been found that increased methylation levels in the SFRP1 promoter region might lead to SFRP1 silencing in Crcc [38,39]. Meanwhile, low SLC7A11 expression was found to be an important target in the p53 tumor suppression pathway, which is closely related to cell-cycle arrest, apoptosis, and senescence. As the main component of the cystine/glutamate antiporter, under-expressed SLC7A11 could inhibit cellular uptake of cystine and eventually lead to increased cell sensitivity to ferroptosis [40]. Additionally, upregulation of ARHGAP29 might be related to metastasis in gastric cancer [41]. ALPL is primarily related to hypophosphatemia [42]. SR Rao et al. found that high expression of ALPL led to poor survival outcomes for patients with prostate cancer [43]. However, another study proposed that ALPL could inhibit the motility and aggression of serous ovarian cancer cells [44]. High expression of KLRD1 was reported to inhibit the function of natural killer cells and cytokine induced killer cells [45,46]. PSD3 is considered to be a candidate metastasis suppressor gene, and its low expression has been observed to be associated with poor prognosis in ovarian cancer and metastasis in breast cancer [47,48]. Moreover, MYBL1 is highly expressed in adenoid cystic carcinoma and is often accompanied by genomic rearrangements [49]. These previous ndings lend con dence to the hypothesis that the ceRNA network plays an important role in the occurrence and development of cancers. Moreover, to our knowledge, this is the rst report regarding the role of these mRNAs in the ChRCC, in which KLRD1 was found by Kaplan-Meier analysis (P = 2.344e-2) to signi cantly affect patients' OS.
Previous studies involving cRCC have reported the importance of the six miRNAs (hsa-mir-222, hsa-mir-204, hsa-mir-206, hsa-mir-183, hsa-mir-372, hsa-mir-221) in the ceRNA network. In particular, hsa-mir-206, hsa-mir-204, and hsa-mir-372 were found to suppress cancer through corresponding biological functions [50][51][52], and hsa-mir-183 was considered to be a potential oncogene [53]. Kaplan-Meier analysis also showed that high expression of LINC00520 had an effect on OS. Chen B et al. in their study based on the cBioPortal dataset, also emphasized its importance in cRCC [54]. However, more studies are needed to fully explore the biological function of the lncRNAs in ChRCC.
In this study, we constructed a ceRNA network including 79 lncRNAs, 6 miRNAs, and 9 mRNAs. Their possible competitive synergistic biological functions might jointly regulate various processes in ChRCC, and, hence, they may provide new therapeutic targets and a new perspective for ChRCC genetic biology studies. However, there were some limitations to our study. Firstly, the prognostic model of mRNA has not been externally veri ed. Also, we lacked in vivo and in vitro experiments to verify our results.

Conclusion
In this study, we established the ceRNA network in ChRCC, which included 79 lncRNAs, 6 miRNAs, and 9 mRNAs. Among them, three mRNAs (CADM2, SFRP1, KLRD1) and one lncRNA (LINC00520) showed promise as potential biomarkers for ChRCC. Our results offer new insights into the diagnosis and treatment of CHRCC and demonstrate the merit of further genetic biology research into ChRCC.