The prognostic roles and clinical characteristics of mRNAsi/corrected in SOC
mRNAsi is a novel stemness index for evaluating the similarity between tumor cells and pluripotent stem cells. It could describe the dedifferentiation potential of tumor cells and be considered a quantitative marker of CSCs stemness. The mRNAsi index was reported to be derived from normal cells and cells with different degrees of stemness via the OCLR algorithm calculating on the TCGA transcriptomic data . Tumor tissues were composed of different kinds of cells, including tumor cells and other types of cells, such as stromal and immune cells. Tumor purity was considered a an interference factor affecting the evaluating of the mRNAsi score. A previous study had reported the tumor purity score of multiple cancers in TCGA evaluated by the ESTIMATE method and we obtained the tumor purity score of OC from the study . The mRNAsi was corrected as previously reported (mRNAsi/tumor purity) . To explore the relevance of mRNAsi/corrected mRNAsi and the clinical features, we performed analyses on SOC patients in TCGA. Because of the extremely small sample size of Grade 1 and Grade 4, as well as stage I and stage IV in the TCGA database, we divided the SOC samples into two groups according to histopathological grades (G1+G2 and G3+G4) and clinical stages (stage I-II and stage III-IV), respectively. As shown in Figure 1A and 1B, SOC samples of higher grades (Grade 3 and Grade 4) had greater mRNAsi and corrected mRNAsi score than those tumor samples of lower grades (Grade 1 and Grade 2) with significance. mRNAsi and corrected mRNAsi scores were not significantly associated with clinical stages (Figure 1C,1D). These results indicated that SOC samples of higher histopathological grades had greater stemness than those of lower grades.
As reported, CSCs were the cause of tumor chemoresistance and recurrence. PFS was used as a surrogate of chemotherapy response in OC . To reveal a potential drug response indicator of stage III-IV SOC by using mRNAsi, we first conducted the PFS analyses on mRNAsi and corrected mRNAsi. In the study, SOC cases of stage III-IV with complete available information of progression-free survival time were divided into low and high score groups based on the mRNAsi or corrected mRNAsi scores. We observed that there was no significant difference in PFS between low and high-mRNAsi/corrected mRNAsi groups (Figure 1E,1F). More unexpectedly, advanced-stage SOC patients with greater mRNAsi score, which indicated stronger stemness of the tumor cells, had an even higher PFS rate within approximately 4 years showed in the survival curve. Similar results were observed in the analysis of corrected mRNAsi. This finding was very different from our understanding that greater stemness of tumor cells would indicate decreased progression-free survival [11, 12]. The above results indicated that the mRNAsi score in SOC samples had a close correlation with the histopathological grades. The corrected procedure didn’t impact the results of clinical correlation analyses. Therefore, mRNAsi, instead of corrected mRNAsi, was used in the subsequent analyses.
DEGs between normal ovarian and SOC tissues
To reveal the potential key genes associated with stemness characteristics of SOC cells according to mRNAsi, we first screened DEGs between normal ovarian samples and SOC samples of all stages (N=379). Thus, we identified 7255 DEGs, including 3790 upregulated ones and 3465 downregulated ones (Figure 2A,2B). The top GO categories were shown in Figure 2C. Among the GO categories, “mitotic nuclear division” was the most enriched biological process (BP) of the DEGs between normal and SOC samples.
Identification of mRNAsi-related modules and key genes by WGCNA
To identify mRNAsi-associated modules and key genes, WGCNA was used to construct a co-expression network of DEGs to cluster the samples and genes into biological modules. In the study, outlier samples were first eliminated from the analysis (Figure S1A) and the remained samples of stage III-IV (N=337) were clustered according to mRNAsi and EREG-mRNAsi score (Figure S1B). β=8 (scale-free R2=0.950) was selected as the soft threshold to ensure the scale-free network (Figure S1C). Thus, we obtained 9 biological gene modules (Figure 2D).
To explore the relationship between gene modules and the mRNAsi scores of the advanced-stage SOC samples, we used MS to quantify the correlation between the overall gene expression level of the corresponding module and mRNAsi. As shown in Figure 2E, the upper row of each module was the R2 value, representing the degree of correlation between gene expression and mRNAsi or EREG-mRNAsi in the corresponding module. According to the R2 value, the green module was most positively associated with mRNAsi (with a correlation coefficient close to 0.80, p value=1.8e-68) and the blue module was most negatively associated with mRNAsi (with a correlation coefficient of 0.71, p value=7.9e-62) (Figure 2F,2G). To study the key genes positively correlated to stemness characteristics of advanced-stage SOC, we chose the green module for further analyses. To avoid missing some crucial enriched pathways caused by a lack of genes, the criteria were defined as cor. MM>0.80 and cor.GS>0.40 (not 0.50 as reported ) to obtain more potential key genes thus we could acquire full information of enriched pathways. Finally, we obtained 42 key genes including AURKA, AURKB, BIRC5, BUB1, CCNA2, CCNB2, CDC20, CDC6, CDCA5, CDK1, CENPA, CKAP2L, DEPDC1, DLGAP5, ECT2, ERCC6L, EXO1, FAM83D, HJURP, KIF15, KIF18B, KIF23, KIF2C, KIF4A, MCM10, MELK, MYBL2, NCAPG, NCAPH, NDC80, NEK2, NUF2, NUSAP1, ORC1, PLK1, RACGAP1, RRM2, SGO1, TOP2A, TPX2, TTK, and UBE2C.
Function annotation and pathway enrichment of the selected key genes
To elucidate the biological process and signaling pathways the selected key genes involved in, GO and KEGG analyses were performed. The results revealed that the top 5 BP of the green module was “mitotic nuclear division”, “chromosome segregation”, “nuclear division”, “nuclear chromosome segregation”, “mitotic sister chromatid segregation” (Figure 3A). These key genes were mainly involved in the pathways of the cell cycle (Figure 3B). Interestingly, the selected key genes were also enriched to the platinum response-associated pathways “cellular senescence” [13-15], “p53 signaling pathway” [16, 17], and “Platinum drug resistance” (Figure 3B). Cell cycle arrest was also the main molecular mechanism of the platinum anticancer effect [18, 19]. To explore the potential predictor of platinum-based therapeutic response, we selected a total of 13 key genes enriched to the four platinum response-associated pathways to be further analyzed in advanced-stage SOC.
Correlation between key genes at mRNA and protein level
To explore the mutual correlation of the key genes and their protein products, we used the Pearson correlation method and STRING online tool to perform the analysis. As shown in the Figure 3C, each node represented a protein in the network. The PPI network constructing with the encoded proteins of the selected key genes showed a wide and strong relationship among the proteins. We calculated the edge connecting nodes in the network (Figure 3D) and found every node in the network was connected to the rest 12 proteins, indicating having a mutual correlation with all of the rest proteins. We observed dense and balanced correlations among all the paired key genes. At the mRNA level, The relationship between PLK1 and CDC20 had the highest correlation coefficient of 0.78. The relationship between BUB1 and TOP2A, as well as BUB1 and CDC20, had a lower correlation coefficient of 0.77 and 0.76. AURKA and ORC1 had the lowest correlation coefficient of 0.58 (Figure 3E). These results demonstrated that the selected key genes composed a strong and dense interaction network with each other.
Expressional validation of key genes in multiple datasets
As shown in Figure 4A and 4B, the 13 selected key genes all had significantly higher expression levels in SOC samples of stage III-IV than in normal ovarian samples (p<0.001). To verify the overexpression of the key genes in SOC samples at the bulk RNA level, we selected two GEO datasets for validation. We first compared the expression of the key genes between normal ovarian samples (N=10) and advanced stage, high-grade SOC samples (N=53) in the GSE18520 dataset, and confirmed that all the key genes were significantly overexpressed (p<0.001) in SOC tissue (Figure 5A). Moreover, we also evaluated the differential expression of the selected key genes between paired normal oviducts and HGSOC samples in GSE69428. As shown in Figure 5B, all of the 13 key genes were significantly upregulated in SOC tissue compared to normal oviducts (p<0.001). These results further confirmed that the overexpression of the screened key genes in HGSOC.
To further understand the expression levels of the key genes in multiple cancer types, we used Oncomine to perform the pan-cancer analyses. Except ORC1 was not in the top 10% gene rank of DEGs, the other 12 key genes were all ranked in the top 10% gene within at least 1 OC dataset (Figure 5C). The results strongly indicated that these key genes might be consistent genes of oncogenes or even consistent stemness biomarkers widely overexpressed in multiple cancer types.
In contrast to traditional bulk RNA-seq, where gene expression was measured by average method across thousands of cells, the technology of scRNA-seq focused on the gene expression in individual cells and could provide more information on tumor heterogeneities. In the analyses with PanglaoDB, we first explored the expression level and positive distribution of the key genes within a scRNA-seq dataset of HGSOC. The 13 key genes had different expression levels and different positive-cell distribution among all the OC cells in the analysis (Figure S2). The positive expression rate of the key genes was very different from each other. However, we observed that the positive-expression cells were distributed in the cell population with a quite uniform pattern. Moreover, we also evaluated the expression pattern of the key genes in another scRNA-seq dataset of ESCs. Interestingly, except for MYBL2 and ORC1, the positive-expression ESCs of the rest 11 key genes tended to be gathered in a small subpopulation at the upright in the t-SNE plots (Figure S3). On one hand, the above results suggested that each of the key genes was expressed in a certain portion of the ovarian cancer cells but not every single cell in the cell population, indicating a presence of intra-tumor stemness heterogeneities in SOC. On the other hand, the expression level and distribution pattern of the key genes in the ESCs further confirmed that the key genes were closely associated with the stemness of cells.
The Prognostic Value and platinum-based chemotherapeutic response of the Key Genes
First, we explored the OS prognostic role of the key genes in SOC patients of stage III-IV by Kaplan-Meire plotter. The results revealed that AURKA, MYBL2, ORC1, and PLK1 were associated with the OS of advanced-stage SOC patients with statistical significance. The higher expression level of AURKA, MYBL2, and ORC1 could predict shorter OS while the higher expression level of PLK1 could predict a longer OS (Figure 6,Table1).
To explore potential indicators of platinum-based chemotheraputic response among the stemness-associated key genes, we conducted OS and PFS analyses on the key genes among stage III-IV patients of SOC treated with chemotherapy containing platinum or the combination of platinum and taxol. Among the 13 key genes, the expression level of only 4 genes, AURKA, CCNA2, MYBL2, and ORC1 significantly affected the OS of advanced stage SOC patients treated with chemotherapy containing platinum (Figure S4). The higher expression level of AURKA, MYBL2, and ORC1 could predict shorter OS while the higher expression level of CCNA2 could predict longer OS. The expression level of 11 key genes (AURKA, BIRC5, CCNA2, CCNB2, CDC20, CDK1, ORC1, PLK1, RRM2, TOP2A, and TTK) were significantly associated with PFS (Figure 7). Among the prognostic genes of PFS, except for ORC1, the higher expression level of the other 10 genes could predict significantly longer PFS (Table 2).
We further explore potential response indicators of chemotherapy containing both platin and taxol by OS and PFS analyses. The expression level of CCNA2, CDK1, ORC1, TOP2A, and TTK were significantly associated with the OS of advanced-stage SOC patients receiving platin combined with taxol and the higher expression level of the 5 genes could predict shorter OS (Figure S5,Table1). The higher expression level of 7 PFS-associated genes, BIRC5, CCNB2, CDC20, MYBL2, PLK1, TOP2A, and TTK, could predict longer PFS (Figure 8,Table 2).
The differential expression of key genes between platinum-resistant and sensitive SOC samples
As shown above, the 13 selected key genes were closely associated with the drug response to platinum-based chemotherapy. Thus, we selected two GEO datasets to perform validation to illuminate whether the key genes played roles in the modulation of platinum sensitivity. The analysis on GSE131978 revealed that BIRC5, BUB1, CDC20, CDK1, and ORC1 had statistically significant differential expression between platinum-resistant and sensitive samples of stage III-IV SOC. And the expression level of the 5 genes was higher in platinum-sensitive samples than in platinum-resistant samples. In the analysis on GSE51373, we found that BUB1, CDC20, PLK1, and TOP2A had differential expression between chemotherapy (contains platinum) resistant and sensitive samples of advanced-stage SOC. Consistent with the results of GSE131978, the 4 genes were also downregulated in chemotherapy-resistant samples compared to the chemotherapy-sensitive ones. The two datasets had only one overlapped DEG CDC20 among the 13 selected genes. Consider the results of survival analyses, the higher expression level of CDC20 could predict longer PFS of advanced-stage SOC patients treated with chemotherapy containing platinum or a combination of platinum and taxol. These results indicated the higher expression level of CDC20 could predict a higher sensitivity to platinum-based chemotherapy.