Characteristics and cluster analysis of polyamine metabolism genes in colorectal cancer
59 PAM-related genes were obtained from REACTOME_METABOLISM_OF_POLYAMINES in the GSEA MsigDB database (Table S1). We then conducted a differential analysis of the RNA expression levels of these 59 genes in normal tissues and tumor tissues in the TCGA database, and found that the expression of most PAM-related genes were significantly different between colon cancer and normal colon samples (Figure S1A). We also employed heatmap of significantly different PAM genes to show RNA expression between normal tissues and tumors (Figure S1B). GO and KEGG enrichment analysis demonstrated PAM genes were enriched in the pathways related to PAM, protein or amino acid metabolism and the synthesis of enzymes (Figure S1C-D). we also showed the protein-protein interaction (PPI) of the above 59 genes, and found that proteasome family genes and ODC1 played a major role in the network (Figure S1E). Besides, we performed a survival analysis of the above PAM related genes using Kaplan-Meier method and listed genes with significant differences (Figure S2).
Next, we also examined the somatic copy number variation (CNV) of the above 59 genes and found that CNV gain is the most prevalent change, which might account for the high expression of PAM genes in cancer tissues. Besides, strong CNV deletion was observed in several genes such as AZIN2, SMOX, PAOX, PSMF1, PSMD1, SRM, PSMA5, PSMB2, AGAMT, OAZ1, AMD1 and PSMC1, which was also decreased in gene expression (Fig. 2A). Afterwards, we took the intersection of common PAM-related genes in TCGA, GSE39582 and GSE38832, and finally obtained 55 PAM-related genes (Figure S1G). First, PCA analysis revealed there were clear boundaries between three PAM patterns, suggesting that CRC samples of different PAM patterns may have different characteristics. (Fig. 2B). In order to explore the prognostic value of PAM-related genes and the mechanisms for the tumorigenesis and progression of colorectal cancer, 1224 CRC samples from TCGA, GSE39582, and GSE38832 were classified into 3 clusters by unsupervised consistent clustering (Figure S3). There were 547 in pattern A, 370 in pattern B, and 307 in pattern C (Table S2), which were defined as PAMclusterA, B and C. PAMcluster-A showed significant better survival, while PAMcluster-C had the worst prognosis (Fig. 2C). In terms of clinical characteristics, the proportion of gene mutation in different patterns was explored. We found that in the comparison of the results of MMR, KRAS and BRAF mutations, the proportion of patients with mutations in PAMcluster-A was generally higher than that in PAMcluster-B and PAMcluster-C (Fig. 2D-G).
Analysis of differential pathways and immune cell infiltration
GSVA showed that PAMcluster-C was enriched with a variety of nutrient metabolism pathways, including amine metabolism compared with PAMcluster-B. Meanwhile, pathways such as mismatch repair and cell repair were not enriched in PAMcluster-C (Fig. 3A, B). In addition, we analyzed three patterns of PAM based on the top 10 cancer-related signaling pathways summarized by Sanchez-Vega et al[48] (Fig. 3C, Table S3). From this result, most of cancer signaling pathways were lowly expressed in PAMcluster-A. And it is worth noting that the cancer signaling pathways of PAMcluster-C are various. Among them, WNT, NOTCH, and MYC were the most enriched, and HIPPO, PI3K and NRF2 expression were the lowest. Further, we used the ssGSEA algorithm to quantify the immune cell infiltration scores of the three PAM patterns to further demonstrate the relationship between PAM and immune cells in the TME (Table S4). Compared with PAMcluster-B and PAMcluster-C, PAMcluster-A has higher levels of CD4 + T cells and CD8 + Tcells, indicating stronger tumor immune function. The dendritic cell (DC) of PAMcluster-C had higher NK cells, which may be related to the stronger antigen-presenting ability. Classical immunosuppressive cells including MDSCs, regulatory T cells, follicular helper T cells, were significantly elevated in PAMcluster-A, Interestingly, macrophages, B cells, and neutrophils, which are considered to be still controversial for cancer development, were elevated in PAMcluster-C, suggesting that there were significant differences in TME of CRC samples with different polyamine metabolic patterns (Fig. 3D). Consistent with our expectation, PAMcluster-A is characterized by immune activation and high infiltration of tumor immune cells, while PAMcluster-C was generally immunosuppressed.
Selecting prognostic differential gene sets and constructing a polyamine metabolism scoring model
In order to quantify the heterogeneity of each CRC sample, we analyzed the DEGs of the three patterns and got the genes that differ between the pairs (Table S5). Then, we pooled 1742 DEGs according to the changes in transcriptome expression of the three patterns, and selected 328 DEGs with prognostic value according to the above genes through univariate COX regression analysis (Fig. 3E, Table S6). Based on the 328 DEGs, we could divide them into two clusters by consistency clustering (Figure S4A). Therefore, we formed two gene patterns, A and B, and named geneCluster-A and geneCluster-B, respectively (Table S7). Survival analysis showed that geneCluster-A had a significantly better survival prognosis than geneCluster-B (Fig. 3F). In addition, GO enrichment analysis revealed these 328 DEGs were mainly enriched in DNA replication, cell cycle, and chromosomal genetic features (Figure S4B). KEGG enrichment analysis showed that DEGs were associated with mismatch repair, metabolites and metabolic pathways of various substances (Figure S4C). Next, we developed a scoring model by PCA analysis based on the 328 prognostic DEGs. Based on this model, each CRC sample was assigned a score, termed as the polyamine metabolism score (PAMscore). We used “Survminer” package to classify CRC patients into high-PAMscore group and low-PAMscore group (Table S8), and we used Alluvial diagram to depict the relationship between PAMcluster, geneCluster, PAMscore and survival status (Fig. 3G). The results showed that geneClusterA was mainly derived from PAMclusterA and was the main component of the low PAMscore group, and geneClusterB was mainly derived from PAMclusterC and was the main component of the high-PAMscore group. Survival analysis showed that patients with low-PAMscore had a better survival benefit (Fig. 4A). We conducted a spearman correlation analysis between PAMscore and various immune cell ssGSEA scores. The results showed that PAMscore was significantly negatively correlated with tumor immune cells CD4 + T cells and CD8 + T cells, and other immune cells were mostly positively correlated with PAMscore (Fig. 4B). It is suggested that PAMscore is closely related to the immunosuppression of CRC. We analyzed the correlation of PAMscore with PAMcluster and geneCluster, and found that in the PAM pattern, PAMcluster-A with the best prognosis corresponds to the lowest PAMscore, and correspondingly, PAMcluster-A with the worst OS corresponds to the highest PAMscore (Fig. 4C). Likewise, in both A and B gene patterns, better survival pattern of geneClusterA corresponded to lower PAMscore (Fig. 4D). We also produced heatmap of the expression levels of the prognostic characteristic gene of both gene patterns (Figure S4D). In addition, the expression of PD-L1 in CRC patients between the high and low-PAMscore groups were compared. Compared with the high PAMscore group, the low-PAMscore group had higher PD-L1 expression (Fig. 4E). This suggested that PAMscore may be used to distinguish CRC samples with different responses to immunotherapy. Afterwards, we explored whether there were differences of biological processes in TME between the high and low score groups using previous signature[44] (Fig. 4F). The results showed that multiple tumor suppressor signatures such as Base excision repair, CD8 + T effector, DNA damage response, DNA replication, Mismatch repair, Nucleotide excision repair and TMEscore were significantly increased in the low-PAMscore group, while EMT, Pan-fibroblast TGF-β response signature (Pan -F-TBRS) and TME inhibited signature of TMEscoreB significantly increased in the high-PAMscore group. As expected, the low-PAMscore group was associated with immune-related features, while the high-PAMscore group was associated with immunosuppression and stroma-related features.
Evaluation of Risk Characteristics of Scores and Validation of Other Cohorts
To better enhance prognostic risk stratification, we constructed a nomogram with PAMscore and clinicopathological features to quantify individual patient risk assessments (Fig. 5A). In addition to this, calibration analysis was performed to validated the accuracy of the nomogram (Fig. 5B). The results show that the predicted line of the nomogram is close to actual survival. The AUC of the time-dependent ROC curve at 1-year, 3-year and 5-year overall survival was 0.570, 0.605, 0.587, respectively (Fig. 5C). Next, we validated PAMscore in other CRC cohorts (GSE17536, GSE14333), and the results were as expected, with a significant correlation between low PAMscore and favorable prognosis, confirming that PAMscore is a reliable and independent predictive model. (Fig. 5D-E). We also evaluated PAMscore by univariate and multivariate regression analysis, and the P value was less than 0.05, showing significant differences (Figure S4E-F). In addition, we evaluated clinical characteristics using PAMscore based on available CRC clinical cohort information. The results found PAMscore had significant differences in the prediction of 65-year-old age, T3-4, N1, M0 and Stage Ⅲ group (Figure S5). To further analyze the clinical practice of PAMscore, we used the CellMiner database and detected the correlation between PAMscore and drug sensitivity (Fig. 5G, Table S9). The expression of KBTBD8, CARF, LIN54, EXOG, and ZNF117 had a positive relationship with the sensitivity of Nelarabine (p < 0.001). And the higher the expression of CFTR, the stronger the drug sensitivity of Bendamustine (p < 0.001). The higher the expression of PSMB10, the stronger the drug sensitivity of Hydroxyurea (p < 0.001). PLCB4 had a negative correlation with the sensitivity of Vinorelbine (p < 0.001). And the higher the expression of PLCB4, the weaker the drug sensitivity of Eribulin mesylate (p < 0.001).
Research of polyamine scoring model with clinical relevance and response to immunotherapy
Our analysis revealed differences in survival prognosis between the high- and low-PAMscore subgroups. Based on this, we further explore the relevant mechanisms and internal connections of these results. MSI and TMB are of significant value in the prognostic judgment, medication evaluation and immunotherapy prediction of CRC. Therefore, we performed a correlation analysis of MSI status and TMB of CRC samples from TCGA. From the PAMscore, the microsatellite stability (MSS) and MSI-L status had higher scores, while MSI-H corresponded to lower scores (Fig. 6A). The proportions of MSS and MSI-H were also different in the high- and low-PAMscore subgroup (Fig. 6B). In the high-PAMscore subgroup, MSS accounted for 75% of patients compared with 8% of MSI-H patients; in the low-PAMscore subgroup, MSS patients accounted for 59% and MSI-H patients accounted for 25%. Subsequently, we further performed a significantly mutated gene (SMG) analysis of the CRC samples in the high- and low-PAMscore subgroup (Fig. 6C-D). The SMG mutation map indicated that the somatic mutation signature was overall higher in the low-PAMscore subgroup than in the high-PAMscore subgroup, among which, MYC16 (39% vs 21%), SYNE1 (37% vs 23%), PIK3CA (32% vs 25%) %) and FAT4 (29% vs. 20%) had higher mutation rates in the low-PAMscore subgroup. These results suggested that PAMscore was closely associated with genomic variation in CRC and reflected its clinical relevance in predicting possible immunotherapy responses.
Following, based on the association between somatic mutations in tumor genomes and response to immunotherapy, we explored TMB distribution patterns in the high- and low- PAMscore groups. As expected, the results showed that the low-PAMscore subgroup had an extremely high frequency of tumor mutations compared to the high-PAMscore subgroup (Fig. 6E). Further, we combined geneCluster to analyze the degree of correlation between TMB and PAMscore, and found that TMB and PAMscore showed a strong negative correlation (Fig. 6F). Based on the predictive ability of TMB for CRC patients, we further explored whether PAMscore could enhance its survival predictive ability (Fig. 6G-H). The results showed that the low TMB plus low-PAMscore groups had the best prognosis, while the high TMB plus high-PAMscore groups had the worst prognosis, and the remaining two groups were in the middle, which fully confirmed the favorable predictive ability of PAMscore combined with TMB.
From the characteristic results of MSI and TMB in PAMscore, we had reason to believe that the scoring model of PAM in tumors played a crucial role in mediating immune responses and predicting immunotherapy. We first performed a correlation analysis of the expression levels of immune checkpoint genes in high- and low-PAMscore subgroups (Fig. 6I). Most of the immune checkpoint genes were significantly different between the two groups, and important immune checkpoints such as CD274, CD80, ICOS, FNG, IL-12A, JAK1, JAK2, LAG3, PVR were highly expressed in the low-PAMscore subgroup. These results correlate strongly with immunotherapy response, so we next used PAMscore to investigate whether patient response to immune checkpoint blockade (ICB) treatment in two independent immunotherapy cohorts could be predicted. In the Imvigor210 cohort of bladder cancer, all patients treated with atezolizumab, we analyzed the relationship between PAMscore and immunotherapy response and the prediction of survival benefit in this cohort. It can be clearly found that there is a significant difference in PAMscore between CR/PR and SD/PD patients. Consistent with our conjecture, CR/PR patients corresponded to lower PAMscore, while SD/PD patients corresponded to higher PAMscore (Fig. 6J). And then, we also used PAMscore to perform a survival prognostic analysis of the Imvigor210 cohort, and the results also showed that the low-PAMscore subgroup had a better survival benefit (Fig. 6K). Furthermore, we also analyzed the relationship between PAMscore and immunotherapy response in the gastric cancer Kim cohort treated with Pembrolizumab, and the results showed that the PAMscore of responders was significantly lower than that of non-responders (Fig. 6L). The above results demonstrated that patients with low PAMscore could acquire more advantage and greater benefit from ICB treatment, and the PAMscore could also improve the prediction of response to anti-PD-L1 or anti-PD1 immunotherapy.
Screening and identification of characteristic genes
We screened the 328 genes used to construct the PAM scoring model through the random forest method to obtain representative characteristic genes. Also, the "randomForestSRC" package is used for feature selection. We determined the genes with relative importance threshold < 0.4 as the final features, and ranked them according to their importance. Finally, 6 feature genes of ACAT2, SPHK1, SNED1, KPNA2, BZW2 and KIF15 were identified. (Fig. 7A, Table S10). We used these six genes to construct Easy-PAMscore using the same method, and again performed a survival prognostic analysis of high and low PAMscore scores for patients in the CRC cohort, the results still showed that the low-PAMscore subgroup was able to achieve better survival than the high-PAMscore subgroup (Fig. 7B). In order to validate the expression of the above genes in normal and tumor tissues of CRC, we first analyzed the RNA expression levels of six genes in TCGA (Figure S6A). We discovered the expression of all genes except SNED1 were basically higher in tumors than in normal tissues. More specifically, 75 pairs of CRC tissues from the Shanghai Minimally Invasive Surgery Center of Ruijin Hospital (Shanghai, China) were analyzed by IHC staining. We performed immunohistochemical analysis of ACAT2, SPHK1, SNED1, KPNA2, BZW2 and KIF15 in cancer and normal tissues and statistically analyzed the staining intensity (Figure S6B-G). The results indicated that six characteristic genes of the signature were generally overexpressed in the CRC tissues compared to the normal tissues (Fig. 7C).
Single-cell analysis reveals the characteristic of polyamine metabolism in the TME.
Single-cell analysis was performed to reveal and verify the crucial effects of PAM in the TME. To depict the landscape and feature, single-cell transcriptomes from GSE132564 were performed with subcluster analysis and visualized by t-SNE approach. As shown in a t-distributed stochastic neighbor embedding (tSNE) plot, we manually annotated these clusters as the following 6 cell types: B cells, epithelial cells, mast cells, Myeloid cells, stromal cells and T cells (Fig. 8A). Expression level of PAM genes based on PAM score was delivered into AUC scoring algorithm and eventually assign over 60,000 cells into high/ low-AUCscore group. (Fig. 8B). We further measured the percentage composition in the high/ low-AUCscore group and extracted the top 6 markers of each immune cells cluster. In the high-AUCscore group, epithelial cells were in the majority, and immune cells such as T cells and B cells, which were considered to have favorable prognostic characteristics, dominated in the low-AUCscore group (Fig. 8C, Figure S7A). We also demonstrated the markers of 6 immune cell clusters in the high/low -AUCscore group (Figure S7B). To further assess the interaction of the immune cells in the CRC patients under the background of PAM, we investigated the cell-cell communication conditions. To this end, we evaluated the putative crosstalk of 6 immune cells with the R package “CellChat”. We found that the interaction between several cell clusters was so frequent, and both Myeloid cells and T cells received signals from other cell clusters, especially stromal cells as the “sender” had strong communication with Myeloid cells as the “receiver” (Fig. 8D, E). We conducted the correlation analysis to elucidate the communication between six cell clusters and metabolites (Fig. 8F). In the framework of immune cell clusters, the role and association of metabolites might be the crucial way for cell-cell communication. For this purpose, we analyzed the complex network of intercellular metabolites and related molecules (Fig. 8G, H, Table S11). In this network, the core corresponding metabolites include Prostaglandin E2, L-Glutamine, D-Mannose and Cholesterol with the 6 cell types starting. And the pivotal “Transporter” contained SCL7A5, SCL3A2, SCL38A1, SCL2A3, RORA, and PTGER4, and the final “Receiver” were all directed to T cells. Myeloid cells to T cells and Mast cells to T cells has more prominent status in the overall communication network, and the most advantageous communication score focused on L-Glutamine-SLC7A5, L-Glutamine-SCL3A2 and L-Glutamine-SCL38A1. These results suggested underlying mechanisms of between PAM and TME for the cell-cell communication status and the related metabolites functions.