3.1 Mutation landscape and copy number alteration of CRGs in STAD
The flowchart of the whole study is presented in Fig. 1. A total of 10 cuproptosis genes were included in this study. Mutational analysis showed that cuproptosis genes had a relatively high mutation frequency in STAD. Among the 433 STAD samples, 56 (12.93%) had cuproptosis gene mutation (Fig. 2a). 8 (80%) genes had different mutation frequencies and types. The highest mutation frequency gene was CDKN2A (4%), followed by DLAT (3%), DLD, MTF1, LIPT1 (2%), and GLS, LIAS, PDHB (1%). We did not observe FDX1 and PDHA1 mutations in STAD samples. We searched the Cbioportal database and recorded each gene's most common mutation sites. The M274V mutation was observed in the second exon of PDHB (Fig. 2b), while FDX1 harbors the G151D mutation in the fourth exon (Fig. 2c). Fig S1 presents the common mutation site of other cuproptosis genes.
Next, we assessed the copy number variation (CNV) of the cuproptosis genes and found that they all had different CNV frequencies. GLS, MTF1, and LIPT1 were associated with an increased CNV, while the CNV was decreased for CDKN2A, DLAT, FDX1, PDHB, and DLD (Fig. 2d). 10 cuproptosis genes were located in the different sites of 23 chromosomes, and it showed that MTF1 was located in the first chromosome, LIPT1 and were located in the second chromosome, and other genes were located in the different chromosomes, respectively (Fig. 2e). We also explored cuproptosis gene expression differences between STAD and normal samples. Nine genes were differentially expressed, and all exhibited significantly higher STAD expression than normal samples. Cuproptosis genes with CNV gain, such as GLS, MTF1, and LIPT1, were significantly elevated in STAD samples, indicating that CNV might regulate mRNA expression of cuproptosis genes. In contrast, genes with CNV loss, such as DLD, PDHA1, CDKN2A, DLAT, FDX1, and PDHB, exhibited upregulated mRNA expression, while other genes with high frequencies of CNV gain or loss showed no differences between tumor and normal samples, such as LIAS, suggesting that CNV is not the only factor influencing the expression of cuproptosis genes. Indeed, other factors, including non-coding RNA, DNA methylation, m6A RNA methylation, and transcription factors, could also regulate cuproptosis gene expression.
3.2 Identification of cuproptosis subtypes in STAD
Univariate Cox regression analysis revealed the prognostic significance of 10 cuproptosis genes in STAD patients (Table S2). Based on the univariate Cox regression analysis, a prognostic cuproptosis network was established, which showed the comprehensive landscape, interrelationships between each cuproptosis gene, and their predictive value in STAD (Fig. 2f). Five cuproptosis genes showed OS-related significance, including FDX1, LIAS, LIPT1, DLAT, and PDHA1. Kaplan-Meier curves showed that STAD patients with high expression of the 5 genes had a longer OS than those with a lower expression (Fig. 2g, 2h, S2). Furthermore, most cuproptosis genes were expressed differentially between STAD and normal samples (Fig. 2i).
According to the expression of CRGs, we divided STAD patients into different clusters using a consensus clustering algorithm (Fig S3). k = 2 was the appropriate choice for dividing the patients into subtypes A (n = 531) and B (n = 245) (Fig. 3a, Table S3). At k = 2, the sample size was relatively balanced between the two subgroups. PCA analysis showed a significant difference in cuproptosis transcription profile between subtypes A and B (Fig. 3b), while the Kaplan Meier curves showed no difference in OS between subtypes A and B (p = 0.487; Fig S4). Further analysis revealed relative differences in cuproptosis gene expression and clinical features between the two subgroups (Fig. 3c).
3.3 Function analysis and TME characteristics in distinct cuproptosis subtypes
To compare functional differences between the two cuproptosis subgroups, we performed GSVA analysis and found that subtype B was significantly enriched in steroid biosynthesis and cell cycle-related pathway, while subtype A was mainly enriched in cell apoptosis and immune-related pathways, including NK cell-mediated cytotoxicity, T cell receptor pathway, B cell receptor pathway, leukocyte transendothelial migration, and chemokine signaling pathway (Fig. 4a, Table S4).
According to the ssGSEA results, we calculated the immune cell infiltration in each STAD sample and observed significant differences in the infiltration of most immune cells between the two cuproptosis subtypes (Fig. 4b). The infiltrating levels of B cell, CD8 + T cell, dendritic cell, Macrophages, mast cells, neutrophils, NK cell, naive T cell (Tfh cell), T helper 1 cell, and T helper 2 cells were significantly higher in subtype A than subtype B. This result was consistent with the GSVA enrichment analysis. Several kinds of immune functions such as APC co-stimulation, immune checkpoint function, cytolytic activity, Chemokines, and Chemokine Receptors (CCR), HLA function, inflammation-promoting function, T cell inhibition, T cell stimulation, and type 2 IFN-γ response exhibited higher enrichment levels in subtype A than in subtype B (Fig. 4c). In addition, we assessed the TME score, including the stromal score, immune score, and estimate score of the two cuproptosis subtypes using the ESTIMATE package. A higher stromal or immune score suggested a higher infiltration and activity of stromal cells and immunity (Fig. 4d). Overall, subtype A was associated with a higher immune and stromal score.
3.4 Cuproptosis gene cluster based on prognostic DEGs
We identified 103 cuproptosis-related DEGs between A and B subtypes using the R package "limma" (Table S5). Functional analysis was performed based on these DEGs. First, GO analysis revealed that DEGs were significantly enriched in biological processes associated with the metabolism of intracellular substances, including metabolism and olefinic compound metabolic process. They were also enriched in molecular functions, such as endopeptidase activity (Fig. 5a).
Then, a univariate Cox regression analysis was performed, and 26 prognostic DEGs related to OS for STAD were identified (p < 0.05, Table S6). A new consensus clustering was conducted to evaluate potential regulation mechanisms based on these prognostic DEGs; STAD patients were divided into three cuproptosis gene subtypes, gene subtype_A, gene subtype_B, and gene subtype_C (k = 3, Fig. 5b, S5, Table S7). Significant differences were observed in six CRG expressions between three gene clusters, including FDX1, LIAS, LIPT1, DLAT, PDHA1, and CDKN2A (Fig. 5c). Figure 5d shows the relative differences in T, N classification, age, gender, TNM stage, pathological grade, and cuproptosis cluster between gene subtypes A, B, and C. The ESTIMATE analysis showed that clusters A, B, and C had significantly different TME scores, and subtype C had the highest stromal score and a relatively higher immune score (Fig. 5e). In addition, the Kaplan-Meier curve showed that gene subtype A had the best OS, while subtype_C had the shortest median OS (p < 0.001, Fig. 5f).
3.5 Construction and validation of the prognostic cuproptosis risk score
STAD patients from TCGA and GEO cohorts were combined and randomly divided into training and validation cohorts at a ratio of 1:1. The training cohort was used for the next analysis. LASSO analysis was performed for 26 prognostic DEGs, and 9 genes were selected for subsequent multivariate Cox regression analysis (Fig. 6a, b). Finally, we obtained 5 prognostic CRGs that were high-risk genes, including RPL39L, PEG10, SYNPO2, MMP11, and KRT17(Table S8). Based on the multivariate Cox regression analysis results, the CRG risk score was calculated as follows: CRG risk score = (0.1098 * expression of RPL39L) + (0.1460 * expression of PEG10) + (0.1657* expression of SYNPO2) + (0.1701* expression of MMP11) +(0.0598* expression of KRT17). Based on the median CRG risk score, all STAD patients were divided into high- and low-risk groups.
The CRG risk_score was significantly different among the two cuproptosis and three gene clusters (Fig. 6c, d). The CRG score was lowest in gene cluster_A; no significant differences were observed between B and C (p = 0.89), consistent with the Kaplan-Meier analysis findings. The Sankey plot allowed visualization of the interrelationships among two cuproptosis clusters, three gene clusters, risk_score, and overall survival status (Fig. 6e). The distribution plot showed that the OS of STAD decreased with increased CRG risk_score (Fig. 6f, g). The correlation results in the whole-patient and validation set are presented in Fig S6. Kaplan–Meier curve analysis showed significantly longer OS in the low-risk relative to the high-risk group (p < 0.001, Fig. 6h). ROC curves indicated that the CRG risk_score had potential value in predicting OS of STAD patients at 1, 3, and 5 years, with AUC values of 0.645, 0.698, and 0.704, respectively (Fig. 6i). We obtained similar results in the whole-patient and validation set (Fig S7). In addition, the majority of cuproptosis-related genes were differentially expressed between the two risk groups, including LIAS, DLAT, PDHA1, PDHB, CDKN2A, and FDX1 (Fig. 6j). In the training cohort, a heatmap was generated to establish the relationships of the 5 prognostic marker genes with CRG risk groups. All 5 genes were highly expressed in the high-risk group (Fig. 6k).
3.6 Validation of the expression of the 5 prognostic cuproptosis genes in the prognostic model
The expression levels of 5 prognostic genes were measured in 14 pairs of STAD tissues and normal adjacent tissues by RT-qPCR. As shown in Fig S8, KRT17, MMP11, and RPL39L exhibited significantly higher expression in STAD tissue than normal tissue. SYNPO2 was lower in the STAD tissue with no significant difference between the expression of PEG10 in STAD and corresponding normal tissues. Results of RT-qPCR are presented in Table S9. According to the HPA database, expression levels of KRT17 were higher in STAD than in normal tissues, suggesting this gene may be a risk factor. Moreover, RPL39L and SYNPO2 exhibited lower expression levels in STAD than in normal tissues, suggesting that these genes are protective factors. There was no significant difference between the expression of MMP11 and PEG10 (Fig. 7). The primers used in PCR assays are listed in Table S10.
3.7 Tumor microenvironment evaluation in the high- and low-risk groups
The CIBERSORT algorithm results revealed positive associations between the prognostic CRG risk_score with resting CD4 T memory cells, activated NK cells, resting mast cells, M0 macrophages, and M2 macrophages, and negative correlations with activated memory CD4 T cells, CD8+T cells, follicular helper T cells, resting NK cells, resting dendritic cells, and M1 macrophages (Fig. 8a, Table S11).
The ESTIMATE algorithm was additionally applied to simulate TIME. The results showed that a high CRG risk_score was associated with a high stromal score in STAD samples, while the immune score was not significantly different between high and low-risk groups (Fig. 8b, Table S12).
The relationship between 5 prognostic cuproptosis genes and 22 human immune-related cells was further examined (Fig. 8c). Most immune cells were significantly positively or negatively regulated by the 5 genes. Importantly, the SYNPO2 gene was positively correlated with resting memory CD4 T cells, activated NK cells, resting mast cells, M2 macrophages, and memory B cells and negatively with follicular helper T cells, activated memory CD4 T cells, resting NK cells, M1 macrophages, and resting dendritic cells, suggesting that SYNPO2 might play a vital role in the regulatory role of cuproptosis in TIME. In addition, the relationship between CRG prognostic risk_score and immune checkpoints was further assessed (Fig. 8d). 9 immune checkpoints were differentially expressed in the different cuproptosis risk groups, including TNFSF4, ICOS, VTCN1, CD44, NRP1, CD160, LGALS9, KIR3DL1, and CD276.
3.8 Correlations of CRG risk_score with MSI, CSC index, HLA gene expression, and TMB score
Experiments were performed to determine the relationship between the CRG risk_score with immunotherapeutic biomarkers, such as MSI, CSC index, HLA gene expression, and TMB score. We first analyzed the distribution of somatic mutations between the two cuproptosis risk groups in the TCGA STAD cohort. Low-risk samples showed relatively higher mutation frequency than the high-risk group (90.57% vs. 83.67%, Fig. 9a,b). Multiple genes displayed distinct mutation types and frequencies in gastric cancer cells. For example, the frequency of TTN mutation was 44% in the high-risk group and 49% in the low-risk group. The top 5 mutated genes in the high- and low-risk groups were TTN, TP53, MUC16, ARID1A, and LRP1B. Besides, the TMB score was negatively associated with the cuproptosis gene cluster in this study (Fig S9) and significantly lower in the high-risk group than the low-risk group (Fig. 9c). The CRG risk_score showed a negative linear correlation with the CSC index (R= -0.57, p < 0.001, Fig. 9d), indicating that STAD cells with high CRG risk_score have less distinct stem cell properties and a higher level of cell differentiation. Notably, a low CRG risk_score was significantly correlated with the MSI-H status. The MSI-H status in the low-risk group was higher than in the high-risk group (20% vs. 12%)(Fig. 9e), and the median risk_score of MSI-H was relatively lower than for MSI-L (p = 0.079) and MSS groups (p = 0.012; Fig. 9f). We additionally performed a correlation analysis between CRG risk_score and HLA gene expression. Our results showed lower expression of HLA-DMA and HLA-F in the high-risk group relative to the low-risk group (Fig. 9g).
3.9 Drug susceptibility analysis
We performed drug susceptibility analysis to select promising chemotherapy or targeted drugs for high- and low-risk groups of STAD. Interestingly, patients in the high-risk group had lower IC50 values for docetaxel, lapatinib, pazopanib, and imatinib, while those in the low-risk group had significantly lower IC50 values for sorafenib (Fig. 9h-l). These results corroborate the value of the cuproptosis risk score in predicting drug sensitivity and selection of potential beneficiaries of specific treatment chemotherapy and targeted therapy drugs among STAD patients.
3.10 Construction and validation of nomogram based on CRG risk_score
Univariate and multivariate Cox regression analyses were performed, and factors, including age, T, N, M classification, and the cuproptosis risk score, were independent prognostic factors in this study (Fig. 10a,b). A nomogram was constructed by incorporating the cuproptosis risk_score and common clinicopathological factors to predict OS at 1, 3, and 5 years (Fig. 10c). The C-index of the model was 0.746, indicating a relatively good prognostic value of the nomogram. The calibration curves of the nomogram indicated excellent consistency with the standard curve between predicted and actual 1-,3-, and 5-year OS rates in STAD patients (Fig. 10d). DCA was conducted to evaluate the predictive value of the nomogram in clinical decision-making (Fig. 10e). Notably, the nomogram showed better reliability than the TNM stage.