CRGs Genetic and transcriptional alterations in CC
We created a catalog of 19 CRGs including NFE2L2, NLRP3, ATP7B, ATP7A, SLC31A1, FDX1, LIAS, LIPT1, LIPT2, DLD, DLAT, PDHA1, PDHB, MTF1, GLS, CDKN2A, DBT, GCSH and DLST1, 17, 18. Somatic mutations of CRGs were explored in 447 colon cancer samples from TCGA-COAD. 83 samples exhibited mutations of CRGs, with a mutation frequency of 18.57%(Fig. 2a). Above all, the NLRP3 showed the highest mutation frequencies followed by ATP7A. In contrast, the essential factor for cuproptosis, FDX1 and copper-importer SLC31A1 exhibited no mutations. Then, the number variations(CNVs) of CRGs was investigated, as is shown in Fig. 2b, CNV alterations of CRGs in CC were common. The copper exporter ATP7B whereas MTF1 and NLRP3 were focused on the amplification in copy number, while others had more deletion frequency. The CNV alteration mutations on chromosomes are displayed in Fig. 2c. Consequently, most of the CRGs were identified to have expression variation between CC samples and adjacent normal tissue samples(Fig. 2d). Among those differentially expressed CRGs, 7 genes (ATP7B, LIPT1, LIPT2, PDHA1, GLS, CDKN2A and GCSH) were upregulated, while 6 genes (NLRP3, FDX1, DLD, MTF1, DBT and DLST) were downregulated in CC. Of note, CRGs are previously published for their collaboration in copper induced cell death1. Hence, the correlation between these 19 CRGs in CC were evaluated(Table S1, Fig. 2e). Many of CRGs showed apparently correlated with each other. These relationships prompted that the cooperation between the CRGs may promote the progress of copper-induced cell death. As is shown in the scatter plot of Fig. 2e, GLS and NFE2L2 performed the highest correlation. We also analyzed the relationship between independent CRGs expression status and OS of patients, 15/19 CRGs are demonstrated to correlate with prognosis(Fig. S1). Figure 2f showed that CRGs and immune cells are broadly related. Among them, neutrophils and NLRP3 presented the most prominent positive correlation.
Classification of coproptosis clusters in CC
For further analyse the expression patterns of the prognostic value of CRGs, 1097 patients from TCGA-COAD and GEO(GSE39582, GSE17536, and GSE38832) databases with sufficient clinical information were involved in subsequent studies(Table S2). The full-scale of CRGs interactions, associations, and prognostic value in CC patients were demonstrated in a network map(Fig. 3a). Then, we performed a multivariate Cox regression analysis on CRGs, 3 of which (DLD, PDHA1 and PDHB) were identified as independent predictive factors (Table 1). According to the expression patterns of CRGs, patients with CC were categorized by the “ConsensusClusterPlus” package. Our findings showed that k = 2 appeared to be the selectable result for sorting the entire cohort into CRGcluster A (n = 530) and B (n = 594) (Fig. 3b, S2). Using PCA analysis showing the result of significant differences between the two clusters of CRGs transcription level (Fig. 3c). Survival curves revealed a longer OS time in patients in CRGcluster A than that in patients in CRGcluster B (Fig. 3d). In Fig. 3e, the heatmap showing the clinicopathological features differences and CRGs expression variation in CC patients in different CRGclusters was performed, which revealed that cluster B was related to higher lymph nodes positive rate (p < 0.05) compared to those in cluster A(Table S8).
Table 1
Multivariate Cox regression analysis of 3 CRGs associated with CC patients OS.
Gene Symbol | HR | 95.0% CI | pvalue |
---|
DLD | 0.736197 | 0.591526–0.916251 | 0.006079 |
PDHA1 | 0.792511 | 0.644698–0.974214 | 0.027244 |
PDHB | 0.720232 | 0.531829–0.975377 | 0.033914 |
Relationship between CRGs clusters in pathways and immune characteristics in CC
The GSVA result showed that CRGcluster A was significantly enriched in mitochondrial activity and TCA cycle(tricarboxylic acid cycle) metabolism(Fig. 4a-b and Table S3-4). To further comprehend the role of CRGs in the TME of CC, we assessed the correlations between the two CRGclusters and 23 infiltrating immune cells by utilizing the CIBERSORT algorithm(Table S5). The infiltration of most immune cells between the two CRGclusters has shown significant differences(Fig. 4c). However, the CRGcluster B with a poor prognosis has shown a significantly higher proportion of immune cell infiltration. Including some tumor-killing cells such as activated CD8 + T cells and NK cells19. The TME score(stromal score, immune score, and estimate score) of the two CRGcluster was also identified using the “ESTIMATE” package. AS the TME score shows, higher stromal scores or immune scores represented relatively higher contents of stromal cells or immune cells in the TME, while the ESTIMATEScore, indicating the aggregation of stromal or immune scores in the TME, is used to indirectly reflect tumor purity. Similarly, the result in Fig. 4d showed higher TME scores in CRGcluster B patients.
Classification of gene subtypes based on DEGs
In order to explore the possible prognostic predictive potential of each CRGcluster gene expression pattern(Fig. S3), the “limma” package was utilized to identify 136 CRGcluster-related DEGs, followed by the functional enrichment analysis(Fig. 5a-b). We observed that those CRGcluster-related genes were significantly enriched in biological processes that were correlated with metabolism and copper response(Fig. 5a-b). At the aim of further validating this prognostic predictive potential, we used a consensus clustering algorithm to divide patients into 2 gene subtypes I and II based on the screened-out 85 genes from CRGcluster-related DEGs which related to OS time by Cox regression(Table S6 and Fig. 5c). The Kaplan-Meier curves showed that patients with gene subtype II had the relatively lower OS (log-rank test, p < 0.001; Fig. 5d). Through the result of CRGs expression(Fig. 5e), we found that many CRGs showed differences in 2 gene subtypes, indicating subtype II has a lower level of cuproptosis. In Fig. 6 we noticed that CRGcluster-related DEGs have complex connections and interactions in protein level.
Construction and validation of the prognostic risk level
Consequently, based on the subtypes of CRGcluster-related DEGs, the risk level was established. Figure 7a demonstrates the distribution of patients in the two CRGclusters, two relative gene subtypes, and two risk levels. Firstly, we randomly classify the patients into training (n = 549) and testing (n = 548) sets at a ratio of 1:1 using the “caret” package. Next, 10 OS-associated genes from 85 CRGcluster-related prognostic DEGs(Table S7) which remained according to the minimum partial likelihood deviance were precisely selected by LASSO regression analysis(Fig. S4). Those 10 selected genes were used to perform multivariate Cox regression analysis. We then selected candidates from 10 OS-associated genes based on the Akaike information criterion (AIC)20 value to finally obtain four genes (CFTR, HOXC6, MSLN and CKMT2), including two high-risk genes(HOXC6 and MSLN) and two low-risk genes(CKMT2 and CFTR). According to the results of the multivariate Cox regression analysis, the risk score was constructed as follows:
Risk score = (− 0.185* expression of CFTR) + (− 0.0807*expression of CKMT2) + (0.09779* expression of HOXC6) +(0.07347* expression of MSLN).
Whether there is a significant difference in risk_score between CRGclusters and gene subtypes was testified and presented by boxplot in Fig. 5b-c. We observed that the risk level of CRGcluster B and gene subtype II is higher than it in subtype I (Fig. 7b-c). Patients with a risk_score lower than the median were categorized into the low-risk level group (n = 275), whereas those whose risk score was greater than the median were placed in the high-risk level group (n = 274). Figure 7d-e showing the distribution of the risk_score revealed that survival times decreased along with the increase in risk scores. Comparing with the patients with high risk scores, the Kaplan–Meier survival curves revealed that patients with low scores had a relatively higher overall survival rate (log-rank test, p < 0.001; Fig. 7f). In addition, the 1-, 3-, and 5-year survival rates of risk_score were represented by AUC values of 0.709, 0.677 and 0.632, respectively (Fig. 7g). Figure 7h showed the 4 model performing genes expression level in 2 risk levels. To validate the stability and reliability of the risk_score, we performed risk_scores into the testing set and aggregate database (TCGA-COAD and GEO, n = 1097). Fig. S5–S6 indicates that the risk_score had an excellent ability to predict the survival of CC patients. Between the high- and low-risk levels, a chi-square test of clinical information found that the high-risk group had a higher proportion of elderly patients and a relatively late stage (Table S9). We further combine the risk levels and clinical information established the nomogram in order to predict OS rates in 1-, 3-, and 5-year in CC patients (Fig. 7i). Moreover, in both the training and testing sets, comparing to an ideal model, the proposed nomogram had a relatively good performance, which was suggested by the subsequent calibration plots showed in Fig. 7j. Figure 7k showed that in the training set, the 1-, 3-, and 5-year AUC values of the nomogram were 0.833, 0.774, and 0.740, indicating that the prediction power of the nomogram is dependable, and it has certain clinical practicability. The subsequent calibration plots and ROC suggested that the proposed nomogram had a similar performance in the training, testing and all sets compared to an ideal model (Fig. S7).
TME and immune checkpoints were assessed between the high- and low-risk levels
CIBERSORT algorithm was utilized to assess the association between risk score and the abundance of immune cells infiltration. As Fig. 8a the scatter diagrams showing the risk score was positively correlated with M1 macrophages, CD8 + T cells, follicular helper T cells, gamma delta T cells, activated NK cells, neutrophils and negatively correlated with resting NK cells, plasma cells, activated CD4 + T cells, resting CD4 + T cells. In addition, the correlation between the 4 genes in the proposed model and the infiltration of immune cells was also analyzed. A significant correlation of immune cells with those 4 genes in the proposed model was revealed, excepting mast cells, B cells, dendritic cells and eosinophils (Fig. 8b). Interestingly as the results we observed in Fig. 3, a low risk score was also closely associated with a relatively low immune score, whereas a high risk score was associated with a high immune score (Fig. 8c). To supplement, we investigated the variations between immune checkpoints gene expression and risk levels. Figure 8d showed that 22 of 32 immune checkpoints were differentially expressed in the two risk levels, however Current targets for immune checkpoint inhibitors21 such as PD-1, PD-L1, and CTLA-4 showed no difference.
Assessment of MSI and TMB in 2 risk levels
The mutated genes ranking in the high- and low-risk levels were basically the same, APC, TP53, TTN, KRAS, PIK3CA, SYNE1, MUC16, FAT4, ZFHX4, and RYR2 comes to the top 10 (Fig. 9a-b). Patients with a high risk score had markedly lower frequencies of APC and TP53 mutations compared to those patients with a low risk score. However, the exact opposite was observed regarding the mutation levels of other genes on the plot. Both of evidence demonstrate that Colon cancer patients with high microsatellite instability (MSI-H) are more sensitive to immunotherapy22, 23. Compared with low-risk level, the high-risk level showed a higher proportion of MSI-H patients, meanwhile MSI-H group showed a higher risk stratification (Fig. 9c-d). We then performed survival analysis in the MSI and MSS groups to evaluate the influence of MSI status on OS in patients with CC. Although there is no statistical difference. However, in the long-term prognosis, those two groups began to show a separation tendency (p = 0.785; Figure S8). Accumulative evidence shows that patients with a high TMB may benefit from immunotherapy due to their higher numbers of neoantigens24. A positive association between the risk score and the TMB was illustrated by Spearman correlation analysis (p < 0.01; Fig. 9e). By analyzing the mutation data from the TCGA-COAD cohort, higher TMB in the high risk score group than that in the low-risk group was demonstrated in Fig. 9f, implying that the high-risk group might benefit better from immunotherapy. Between high- and low-risk levels, we have tried to select adjuvant therapy drugs currently used to treat colon cancer patients or associated with cuproptosis, then evaluate the susceptibilities. Interestingly, we found that the patients in the low risk score group had a higher IC50 value for Elesclomol(Fig. 9g). Other drugs, although showing differences, however, are not currently used to treat colon cancer(Fig. S9).