Ovarian cancer staging and interstage immune cell infiltration based on copper death-associated genes
The flow chart of this study is shown in Fig. 1.
To investigate the relationship and function of copper death-related genes in ovarian cancer, the PPI network analyzed the interrelationship of 10 copper death-related genes, and the results suggested a close relationship between 8 copper death genes (Fig. 2A). The survival curve results suggested that five copper death-related genes (LIAS, LIPT1, MTF1, DLD, and PDHB) were significantly associated with patient prognosis (Fig. 2B-F). TCGA-OV patient samples were clustered and analyzed by the expression levels of 10 copper death-related genes to identify two different subtypes (Fig. 2G). Principal component analysis showed significant differences in transcriptome data between these two subtypes for subtype A (n = 265) and subtype B (n = 114) (Fig. 2H). Tumor immune microenvironment scores obtained by ESTIMATE showed that the immune scores and estimated scores were higher than those in group B, suggesting that group A had a higher immune cell content (Fig. 2I). Further analysis of 23 human immune cells in ovarian cancer by the CiberSort algorithm for inter-subpopulation infiltration revealed a large number of activated immune cell infiltrates in the group A, such as activated B cells, activated CD4 and CD8 T cells and neutrophils (Fig. 2J). Finally inter-subpopulation GSVA analysis suggested that the subpopulation A was significantly enriched in immune activation pathways, including autophagy regulatory pathways, primary immunodeficiency, and natural killer cell-mediated cytotoxicity (Fig. 2K).
Construction and validation of a risk model for copper death-related genes
To construct a more accurate and highly available risk model, 23 inter-subtype differential genes were screened by applying the screening conditions of |logFC|≥1 and adj. P.Value < 0.05 (Fig. 3A). We used the R package "Sva" to correct the 23 inter-subtype difference genes from the patient data in the GEO database and extracted the intersection genes (n = 15).Removed the data with survival time less than 30 days, and used the TCGA data as the training set (n = 378) ,the GEO data as the validation set (n = 554). Single-factor Cox analysis of 15 inter-subtype differential genes finally obtained 5 prognostic differential genes associated with copper death (Fig. 3B), and Lasso regression analysis and multifactor Cox analysis of 5 prognostic differential genes finally obtained 5 (RARRES1, IFI27, IFI6, CXCL10, PI3) risk genes, including 2 high-risk genes (RARRES1, PI3) and 3 low-risk genes (IFI27, IFI6, CXCL10).
Risk score = (0.1415*RARRES1 expression) + (-0.0284*IFI27 expression) + (-0.0553*IFI6 expression) + (-0.1678*CXCL10 expression) + (0.1328*PI3 expression).
Kaplan-Meier survival curves between the high and low-risk groups for the TCGA training set showed that overall survival was significantly higher for patients in the low-risk group (Fig. 3C). Risk scores in the TCGA group were indicated to be significantly associated with survival in both univariate and multifactorial COX regression analyses (HR = 2.907, 95% CI = 1.929–4.380, p < 0.001; HR = 2.816, 95% CI = 1.856–4.271, P < 0.001) (Fig. 3D, E). The Kaplan-Meier survival curves for the high and low -risk groups of the GEO validation set showed that the overall survival of patients in the low-risk group was significantly higher (Fig. 3F). In both univariate and multifactorial COX regression analyses, the risk scores in the GEO group were significantly associated with survival (HR = 1.705, 95% CI = 1.100-2.641, P = 0.017; HR = 1.650, 95% CI = 1.048–1.473, P = 0.013) (Fig. 3G, H).
Prognostic value based on risk model
To analyze the prognostic value of the model for ovarian cancer, we constructed a nomogram containing three clinical factors for predicting 1, 2, and 3-year survival based on TCGA clinical data, and the calibration curves verified the accuracy of the nomogram (Fig. 4A, B). A nomogram was constructed based on GEO clinical data with calibration curves (Fig. 4C, D). The area under the ROC curve of the risk model showed that the risk model had some reliability (AUC = 0.626 (Fig. 4E). The DCA curve plot showed that the risk score had more clinical benefit than other clinical factors (Fig. 4F), The C index indicated that the nomogram had a good predictive value (Fig. 4G). Further analysis of the correlation between risk score and clinicopathology, the survival curves showed that patients in the high-risk group had poorer survival under different pathological conditions (Fig. 4H-K).
Correlation of the model with the immune microenvironment
To elucidate the biological functions and pathways associated with risk scores, GO, KEGG enrichment analysis was performed using differential genes between the high-risk and low-risk groups (Fig. 5A, B), and the results showed that differential genes are involved in multiple immune components such as cytokine activation, chemokine activation and chemokine receptor binding, and in multiple immune regulatory pathways. Next, we analyzed the correlation between risk genes and 10 copper death genes, and the results showed that only 9 genes may be correlated with risk genes. Copper death-related genes (CDKN2A, FDX1) were significantly positively correlated with most risk genes, while LIAS was significantly negatively correlated. (Fig. 5C). When the five risk genes of the model were evaluated in relation to immune cells for abundance, a variety of macrophages and T cells were observed to be significantly correlated with these five genes (Fig. 5D). When immune function analysis was performed between high and low -risk groups, the results showed that multiple immunomodulatory responses showed a significant expression of activation in the high-risk group (Fig. 5E). In the immune subtype analysis, four risk genes (IFI27, IFI6, CXCL10, PI3) exhibited significant differences (Fig. 5F). In addition, we investigated the association between 21 immune checkpoints and risk models, and the results showed that multiple immune checkpoint genes were mainly enriched in the low-risk group (Fig. 5G). Finally, the TME scores between the high and low risk groups showed that the low-risk group had more immune cells (Fig. 5H). Using the CIBERSORTx algorithm, we observed that fibroblasts and neutrophil (white) cells were mainly enriched in the high-risk group, and cytotoxic lymphocytes, NK cells, T cells, B-series cells and CD8 + T cells were mainly enriched in the low-risk group (Fig. 5I-O).
Comprehensive analysis of risk score models
The sensitivity of the risk model as a biomarker for predicting treatment response in ovarian cancer patients was assessed using the R package "pRROphetic". The results showed that patients with low-risk values may have higher sensitivity to Doxorubicin, Etoposide, Cyclopamine, etc.; patients with high-risk values may have higher sensitivity to Imatinib, PHA.665752, X17.AAG, etc. (Fig. 6A-F). The correlation between the risk model and mRNA stem cells was assessed using the R package "ggExtra", and the results showed that the risk score was negatively correlated with stem cell scoring (R=-0.33, P = 3e-08) (Fig. 6G). Next, we analyzed the differential expression of risk genes in the transcriptome (Fig. 6H), and in the protein level (Fig. 6I).