RPS27A is a Poor Prognostic Predictor for Hepatocellular Carcinoma via Interacting with METTL3-Mediated RNA Modications

Background: Increasing evidence has pointed to the involvement of RNA modications in the pathogenesis of human cancers. However, they are rarely studied in hepatocellular carcinoma (HCC). Method: We summarized multiple types of RNA modication-related genes (RMRGs) from public references, and identied differentially expressed RMRGs (DEGs) between HCC tissues and matched normal samples, where their genetic variation were then investigated. The potential hub genes in the protein-protein interaction (PPI) network constructed by co-expression genes of RMRGs were recognized and veried in METTL3-knockdown HCC cell lines by quantitative PCR assay. Results: Seventy-six RMRGs, including six writers, seven readers, and seven erasers, were collected, of which 34 were identied and validated as DEGs. YTHDC2 exhibited the highest mutation rate, while ADAT2 showed widespread deletions. High correlations were observed between the expressions of 34 RMRGs. The PPI network constructed by 1080 co-expression DEGs related to RNA regulations consisted of 513 nodes and 11557 edges, with RPS27A presented the most directed edges and maximum closeness centrality. Patients with high expression of RPS27A showed worse overall survival (P < 0.01) and disease-free survival (P = 0.019). Moreover, RPS27A was found upregulated on high-risk metastatic and recurrent HCC tissues. Quantitative PCR assay indicated that RPS27A was signicantly decreased in cancer cell lines when METTL3 was knocked down. Conclusions: Remarkable differences were observed for RNA modications between HCC and normal samples, and RPS27A could be a poor prognostic predictor for HCC via interacting with METTL3-mediated RNA modications.


Introduction
Currently, more than 100 types of RNA modi cations have been identi ed owing to the extensive usage of high-throughput sequencing technologies [1,2]. It has been reported that these modi cations are participated in RNA metabolisms at multiple steps, including the process of DNA transcription to mRNA translation and post-transcriptional modulations. After the initiation of mRNA transcription, the modi cation of m6A is carried out for un-matured mRNA by a multicomponent named writer complex [3].
The protein synthesis, known as mRNA translation is also ne-tuned widespread with three types of RNAs (mRNA, tRNA, and rRNA) been modi ed [4]. Many genes were identi ed that participated the process of RNA modi cations and they were categorized into three groups which were called writers, erasers, and readers [5]. The deposition of RNA modi cation can be catalyzed by writers and recognized by readers, or removed by erasers. Accumulating evidence has demonstrated that the dysregulations of RNA modi cations were associated with many disease progression including human cancers [6].
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors derived from digestive organ of the liver. The rising incidence makes it ranked the top 3 leading causes of cancer death worldwide [7]. Chronic liver diseases including virus infection, nonalcoholic liver, and hereditary hemochromatosis, are the most important risk factors of HCC [8]. Many molecular variations have been reported that are associated with HCC pathogenesis by the investigation of multi-omics data [9,10], however, the underlying molecular mechanisms are not completely elucidated.
RNA modi cations have been documented that are comprehensively associated with tumorigenesis and cancer progressions, such as acute myeloid leukemia [11], breast cancer [12,13], and liver cancer [14].
Methyltransferase-like 3 (METTL3), one of the writers of m6A modi cation, was reported that its overexpression promoted the progression of liver cancer, and silencing its expression suppressed tumorigenicity [14]. WTAP [15] and KIAA1429 [16] were also found that they were up-regulated in HCC tissues, and both were associated with HCC prognosis. Nevertheless, different roles of RNA modi cations on speci c mRNAs were observed that downregulation of METTL14 promotes the metastasis of HCC [17], which is opposite to that of METTL3. Additionally, the m5C methyltransferase of NSUN1 was discovered upregulating in lung and prostate cancer and served as a predictor of poor prognosis [18].
These ndings suggest a very complex relationship between RNA modi cations and HCC. Therefore, further exploring the expression of RNA modi cations would facilitate better understanding of the potential roles of RNA modi cations in HCC.
In this study, the landscape of RNA modi cations in HCC was comprehensively investigated in seven independent datasets retrieved from public databases. A total of 76 RNA modi cation-related genes (RMRGs) were rst collected from public references and their expressions were then assessed between HCC tissues and normal samples. Co-expression genes were then identi ed using the differentially expressed RMRGs and were used to construct a protein-protein interaction (PPI) network. Hub-genes were regarded according to the parameters of PPI network and veri ed on external independent datasets. This is the rst time that we identi ed RPS27A as the hub gene which might facilitate the poor progression of HCC via RNA modi cations.

Data preparation
Datasets used in this study were retrieved from Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA) ( Table 1). For datasets from GEO, raw data were download and then normalized by the robust multi-array (RMA) method. For TCGA HCC (http://www.cbioportal.org/study? id=lihc_tcga), level 3 data, including RNA-Seq, copy number variation, somatic mutation, and the corresponding clinical features were obtained using TCGAbiolinks [19]. Differentially expressed genes (DEGs) between tumor and normal tissues were identi ed using the two datasets of GSE39791 [20] and GSE57957 [21], as both of them were generated by affymatrix platform and consisted of matched tumor and normal samples. The expression pro le of TCGA HCC was used to verify the ltered DEGs and to identify co-expression genes. GSE130012 [22] and GSE146806 [23] are gene expression pro les of colorectal cancer and PAAD cell lines with METTL3 knocked down. We selected these two datasets to validate the relationship between the identi ed hub gene and RNA modi cation-related genes. GSE14520 [24] is consisted of 488 tumor and paired non-tumor tissue of HCC samples, and we only selected 221 samples that were classi ed as high-risk and low-risk metastasis. GSE56545 contains 12 primary and 9 recurrent tumors. Collection of RNA modi cation-related genes m6A modi cation is the most prominent type of RNA modi cation and many involved genes have been identi ed. However, other types of chemically modi ed DNA including m5C, m1a, etc., which also belong to RNA modi cations are rarely investigated. Therefore, we rst collected RNA modi cation-related genes (RMRGs) by searching public literature. Seventy-six genes that were reported involving in the modi cations of mRNA, tRNA, miRNA, and rRNA by experimental data, were nally retrained (

Differential expression analysis
Low expression genes were de ned as the expression values of more than 90% samples in each dataset equaled zero, and were then excluded before performing differential expression analysis. The normalized gene expression pro les of GSE39791 and GSE57957 were used to identify differentially expressed RMRGs (DEGs). Paired student t-test was used and the corresponding P-value was calculated to evaluate the differential expressions of RMRGs between tumor samples and matched normal samples. Adjusted P values were then calculated by Benjamini and Hochberg's approach to controlling the false discovery rate (FDR). RMRGs with FDR less than 0.05 were recognized as DEGs.
Co-expression analysis RNA-seq data of TCGA HCC was selected to preform co-expression analysis. The expression pro les of identi ed DEGs and other non-RMRGs were extracted and their correlation was assessed by Pearson correlation coe cient (Pearson's r). T-test P-value and the corresponding FDR was then calculated to evaluate the signi cance level of correlation. Non-RMRG was recognized as co-expression gene if its correlation coe cient was >= 0.6 and FDR was <0.05.
Gene Ontology (GO) and KEGG pathway analyses Functional enrichment analysis including GO and KEGG pathway was performed in the web-based tool of GeneCodis 3.0 [27]. A new algorithm that summarizes signi cantly enriched terms was implemented in GeneCodis 3.0 to conduct enrichment analysis. We used all the human genes as background and adjusted P-value of the hypergeometric test to determine signi cantly enriched biological process terms and KEGG pathways.

PPI network and Hub Gene identi cation
The protein-protein interaction (PPI) network of co-expression genes were constructed using the Search Tool for the Retrieval of Interacting Genes/Proteins database (STRINGdb) [28]. We set 0.8 as the minimum cut-off of an interaction score. PPI network was then imported to Cytoscape v3.7.2 [29].
Network topological parameters were analyzed using the plugin of NetworkAnalyzer [30]. The node that had a maximum degree and closeness centrality was recognized as the hub gene of the PPI network.

Cell lines culture and transfection
Three HCC cell lines, SNU398, SNU449, and SNU475, were purchased from Shanghai Biochip Company Ltd. (Shanghai, China). Cells were propagated in Dulbecco's Modi ed Eagle's medium, then 100ul of the cell suspension was transferred to a six-well plate and incubated at 37°C with 5% CO 2 for one day. Next, the small interfering-METTL3 (si-METTL3) and si-NC (Empty vector) were added to 200ul of Opti-MEM (serum-and antibiotics-reduced) mixed with 3 ul lipofectamine 2000 to a nal concentration of 20 nM. Then the above mixture was transferred to the six-well plate containing cell suspension, followed by adding 700 ul of pre-warmed DMEM (no serum and double antibodies). After incubation in a 37°C cell incubator for 4-6h, the medium was replaced by the complete medium with 10% fetal bovine serum and 1% double antibodies. Cells were collected for subsequent experiments after 48h.

Quantitative PCR
Total RNA was extracted from the transfected cells after 48h using the TRIzol™ Reagent (Thermo Fisher Scienti c) under the instructions. The HiScript Q Select RT SuperMix for qPCR kit (vazyme, Nanjing, China) and SYBR Green Master Mix kit (vazyme, Nanjing, China) were utilized for cDNA synthesis and quantitative PCR, respectively. Real-time qPCR was performed in the BioRad CFX96 PCR system. The three genes primers used in the study were as follows, GGACTCATGACCACAGTCCA (GAPDH forward), TCAGCTCAGGGATGACCTTG (GAPDH reverse), AACCAGGGTCTGGATTGTGA (METTL3 forward), TCCAGTTGGGTTGCACATTG (METTL3 reverse), ACTTCGTGGTGGTGCTAAGA (RPS27A forward), and CCCAGCACCACATTCATCAG (RPS27A reverse). GAPDH was selected as an internal reference, and 2 -ΔΔCt was calculated to quantify the relative expressions of METTL3 and RPS27A.

Statistical analysis
Statistical analysis and gures plotting were conducted in R programming software (v3.6.1). Student ttest and Kruskal-Wallis rank test was used for the comparisons of paired groups and multiple groups respectively. For survival analysis, TCGA HCC patients were rst divided into high group and low group according to the median expression value of hub gene. Then cox proportional hazards model was built to evaluate the association of gene expression with patient overall survival (OS) and disease-free survival (DFS) [31]. Difference between the survival curves of high group and low group was tested using the 'survdiff' method in the package 'survival' [32]. In this study, we assigned '****', '***', '**', and '*' indicating P-value was < 1.0e-5, 0.001, 0.01, and 0.05 respectively.

Deregulation of RMRGs in hepatocellular carcinoma
A total of 76 RMRGs belonging to 31 types of RNA modi cations were chosen in this study, including 62 writers, 7 readers, and 7 erasers ( Table 2). mRNA and tRNA were two major targets that accounted for almost 70% of the total targets (35.53% and 31.58%). Sixteen genes (21.05%) are related to m6A modi cation and 8 genes (10.53%) to m5C modi cation. Genes related to other types of modi cation were less than 5 except for m1G modi cation.
We attempted to assess the similarity between normal and HCC samples using a multidimensional scaling method based on the expression pro les of 76 RMRGs. The distances between samples can be interpreted by assigning observations to speci c locations in a two-dimensional space. Results indicated that tumor and normal samples in both GSE39791 (72 vs 72) and GSE57957 (39 vs 39) were separated clearly and clustered together in each group ( Figure 1A&B). Forty-six and 47 DEGs were identi ed in GSE39791 and GSE57957 respectively, with 35 genes shared by the two datasets (Supplementary table  1). We observed much more up-regulated genes (n=32) than down-regulated genes (n=3) in tumor samples ( Figure 1C), of which 28 were writers, 5 were readers and 2 were erasers ( Table 2)

Genomic features of RMRGs in hepatocellular carcinoma
Somatic mutations were found in 43 samples (11.81%) in 27 DEGs (7 genes showed no mutations), with YTHDC2 exhibited the highest mutation frequency (Figure 2A). For copy number variation (CNV), the alterations were prevalent on ampli cation in copy number ( Figure 2B). TRMT12 and YTHDF3 showed the highest proportion of ampli cation, while a widespread frequency of deletion were found in ADAT2, TRMT5, ALKBH8, and YTHDC1. Expressions of 33 DEGs were found up-regulated in tumor tissues except for NSUN6, which presented decreased expression in tumor tissues ( Figure 2C) and its down regulation was observed in early stage of HCC (Supplementary gure 1). Besides, prognostic analysis indicated that its high expression was associated with better patient outcome (Supplementary gure 2). Additionally, we found that ADAT2, which exhibited widespread deletions showed the lowest expression, and it was similar to YTHDF3 and JMJD6 ( Figure 2B&C). Principle component analysis (PCA) based on the expression pro les of 34 DEGs indicated that HCC and normal samples could be clearly distinguished from each other ( Figure 2D). Furthermore, expression pro les of these genes showed signi cantly positive correlations with each other ( Figure 2E).

Co-expression analysis of RMRGs
Co-expression analysis identi ed 2511 genes, of which 1080 genes (43.01%) were differentially expressed between tumor and normal samples. We then performed GO and KEGG pathway enrichments for the 1080 co-expression DEGs to investigate their potential biological functions. A total of 788 GO terms were enriched, including 540 biological processes (BP), 152 cell components (CC), and 96 molecular functions (MF). While only 14 pathways were signi cantly enriched. The top 15 enriched GO terms and KEGG pathways were mainly related to cell division, ncRNA processing, RNA transport, and spliceosome ( Figure 3A&B), which all involved in the process of RNA regulation.
Identi cation of hub genes PPI network was constructed based on the 1080 co-expression DEGs by integrating the protein-protein connection information of STRINGdb [28] and then visualized in Cytoscape [29]. The network consisted of 513 nodes and 11557 edges, among which 7 nodes were RMRGs and they all presented up-regulated in tumor tissues ( Figure 4A). The plugin NetworkAnalyzer [30] was used to assess the topological parameters of this undirected network. RPS27A was found obtained the most directed edges and maximum closeness centrality ( Figure 4B&C). In a given undirected network, the number of edges represented the node degree and closeness centrality re ected information spreading speed from one node to other reachable nodes. Therefore, we regarded RPS27A as the hub gene of this network. Furthermore, no direct connections were found between RPS27A and METTL3, but three genes POLR2K, POLR2J and NCBP2 showed direct interactions with both RPS27A and METTL3 according to the PPI network (Supplementary gure 3). Besides, we investigated the methylation and copy number variation (CNV) data of RPS27A in TCGA HCC dataset. We found a weak positive correlation between the CNV and expression levels of RPS27A (R = 0.28, P < 0.001, Supplementary gure 4A). However, the methylation levels did not show strong correlation (Supplementary gure 4B-C).

Validation of hub genes in cell lines
To verify the relationship between RPS27A and RNA modi cations, the expression pro les of two cancer cell lines, where METTL3 were knocked-down were retrieved from GEO. Both RPS27A and METTL3 were signi cantly up-regulated on HCCs ( Figure 5A). We then investigated their expressions in colorectal cancer cell lines (GSE130012) and pancreatic cancer cell lines (GSE146806  Association of dysregulation RPS27A with poor prognosis of hepatocellular carcinoma Survival analysis showed that HCCs with high-expression levels of RPS27A had worse DFS. We further investigated the relationship between RPS27A and the metastasis and recurrence of HCCs. According to the reference [33] which provided six HCC metastasis-related signatures, we evaluated the metastasis risk score of TCGA HCCs. It was found that high risk score of HCCs presented worse DFS ( Figure 6A) and the expression levels of RPS27A showed positive correlation with metastasis risk scores ( Figure 6B). RPS27A was up-regulated on the high-risk metastasis group ( Figure 6C). In addition, the up-regulated RPS27A was also found in recurrent HCC tissues ( Figure 6D).

Discussion
In this study, 76 genes belonging to 31 types of RNA modi cations were collected from public references. The m6A modi cation was the most prominent and abundant form of RNA modi cations, which accounted majority of RMRGs followed by m5C modi cation and m1A modi cation. Previous studies have focused on the relationship between m6A modi cation and HCC [5, 18], however, other types are rarely concerned. For the rst time, we attempted to evaluate the association of over 30 types of RNA modi cations, not merely concentrating on m6A modi cation, with HCC by integrating multiple datasets. The potential hub gene, RPS27A, identi ed by constructing a PPI network based on the co-expression genes, showed an unfavorable prognostic relationship with HCC patients. These ndings revealed the extensive associations of RNA modi cations with HCC from a landscape view. Further investigations will provide important insights into the development of novel candidate therapeutic targets, diagnostic markers and prognostic predictors for HCC.
We observed extensive variations for RNA modi cations between HCC and normal samples, as more than 60% of RMRGs were differentially expressed on both GSE39791 and GSE57957 datasets, and they were able to clearly separate HCC from normal samples. Almost all DEGs (97.06%) were up-regulated in tumor tissues, re ecting the activation of RNA modi cations in HCC compared to normal tissues. The genomic variation analysis revealed that mutations of RMRGs were only found in about 10% of the samples, indicating a low overall mutation rate in HCC [34]. Interestingly, we also observed that copy number variations (CNV) commonly occurred in the regulators of RNA modi cations and they were dominated by genomic ampli cation, which might interpret the extensive over-expression of RMRGs in HCC. Besides, most of the RMRGs showed strong correlations with each other, suggesting a synergistic regulation of these RMRGs which probably is related to the formed multi-functional complexes of many RMRGs in the process of RNA modi cations [35][36][37].
Notably, we observed a down-regulated NSUN6 in HCC tissues, which was contrary to all other DEGs. NSUN6 belongs to the family of evolutionary conserved m5C RNA methylases [1] and has been reported its expression downregulated in tumor tissues compared to normal tissues [38]. Besides, patients with high expression of NSUN6 showed a better survival rate. m5C was regarded as a prototypic DNA modi cation that participated in gene expression control and epigenetic regulation. It turned out that m5C regulators might be associated with cancerogenesis and progression of HCC via RNA decoration [39].
Ribosomal protein S27a (RPS27A) was recognized in this study as the hub gene of the PPI network constructed by co-expression genes of RMRGs. RPS27A is a member of the S27AE ribosomal protein family and encodes a fusion protein with ubiquitin at the N terminus and RPS27A at the C terminus.
Several studies have reported the over-expression of RPS27A in HBV-induced HCC tissues [40][41][42]. This study revealed that the expression of RPS27A was signi cantly upregulated in HCC tissues. Furthermore, RPS27A expression was signi cantly downregulated in colorectal cancer cell lines (GSE130012) and pancreatic cancer cell lines (GSE146806) when METTL3 was knockdown, and the observation was obtained in three METTL3-knockout HCC cell lines. These results suggested a possible regulation relationship between RPS27A and METTL3-mediated RNA modi cations.
Currently, the association of RPS27A with RNA modi cations is not well studied. Our ndings provide a novel insight to unveil their potential relationship. According to the PPI network, no direct connections were found between RPS27A and METTL3, but three genes POLR2K, POLR2J, and NCBP2 showed direct interactions with both RPS27A and METTL3. As expected, the silencing of METTL3 in three HCC cell lines markedly decreased the expression of RPS27A, further revealed the possible regulation pattern between METTL3 and RPS27A though no causal relationship was demonstrated. However, more thorough molecular evidence is needed to elucidate this direct or indirect regulation.
Meanwhile, we found expression of RPS27A negatively correlated with the HCC patients' survival rate, and RPS27A was also found upregulated on the high-risk metastasis group of HCC and recurrent HCC tissues. These results suggested that RPS27A might interact with METTL3-mediated RNA modi cations to contribute to poor outcomes of HCC patients. In addition, RPS27A was found as an important regulator in decreasing P53 expression in HPV-immortalized cervical epithelial cells [43]. In colorectal cancer, RPS27A not only interacts with apolipoprotein M to promote the growth and inhibit apoptosis of cancer cells [44] but also participates as a hub gene in regulating iNOS/NOS2-mediated poor prognosis [45]. Besides, this study also indicated that RPS27A was differentially expressed between normal and tumor tissues in other cancer types including ESCA, LUAD, LUSC, HNSC, UCEC, CESC, and BRCA. These ndings suggest that RPS27A appears to be a common poor prognostic factor in multiple cancer types, representing a possible convergence of cancer prognostic regulatory pathways that may serve as a general therapeutic target.

Conclusions
In

Declarations
Ethics approval and consent to participate All data used in this study are publicly accessible to any researcher, our research did not require the approval of an ethics committee.

Consent for publication
Not applicable.

Data Availability Statement
The ve datasets used in this study are publicly available.  Red color, blue color, and white color indicated positive, negative and no correlation respectively.