Expression and mutation of ARG in CESC
We determined the expression levels of 181 ARGs in tumor and normal samples from the TCGA TARGET GTEx-CESC dataset. DEGs between tumor and normal tissue were selected and then intersected with the ARGs to obtain the angiogenesis-associated DEGs. As shown in Fig. 1A, most of these DEGs were more highly expressed in tumor tissues. To reveal interactions among the DEGs, we constructed a protein-protein interaction (PPI) network using the STRING website (Fig. 1B). Further, 10 hub genes were identified with Cytoscape: STS3, CASP3, MAPK1, CREB1, MAPK14, MAPK14, FGF2, NFKB1, GLI1, MKI67, and MKI67.
We next explored somatic mutations of the 181 ARGs in CESC. Somatic mutations were found in 137 of 287 samples (44.7%). MKI67 had the highest mutation rate, followed by TP53 and FBN1 (Fig. 1C).
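The selection-then-intersection step described above can be sketched as follows. This is a minimal illustration only: the fold-change and P-value cutoffs, the per-gene statistics, the placeholder gene GENE_X, and the four-gene ARG subset are all assumptions made for the example and are not taken from the study.

```python
# Illustrative sketch: cutoffs and statistics below are assumed, not from the study.

def select_degs(stats, lfc_cut=1.0, p_cut=0.05):
    """Return genes whose |log2 fold change| and adjusted P value pass the cutoffs."""
    return {
        gene
        for gene, (log2fc, adj_p) in stats.items()
        if abs(log2fc) >= lfc_cut and adj_p < p_cut
    }

# Hypothetical tumor-vs-normal statistics: gene -> (log2 fold change, adjusted P).
stats = {
    "MKI67": (2.4, 1e-8),
    "FGF2": (-1.6, 3e-4),
    "CASP3": (0.3, 0.20),
    "GENE_X": (1.2, 0.01),  # placeholder for a DEG that is not an ARG
}

# Hypothetical subset standing in for the 181-gene ARG list.
arg_list = {"MKI67", "FGF2", "CASP3", "MAPK1"}

degs = select_degs(stats)
# Angiogenesis-associated DEGs are the genes present in both sets.
angiogenesis_degs = sorted(degs & arg_list)
print(angiogenesis_degs)
```

Using a set intersection keeps the step independent of gene order and makes it easy to check which ARGs were dropped because they did not pass the differential-expression cutoffs.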
Characterization of the angiogenic subgroup in CESC
To explore the relationship between angiogenesis and CESC, we selected 290 tumor patients from TCGA-CESC. The prognostic value of the 181 ARGs was assessed by univariate Cox regression (Table S2).
To further clarify the relationship between ARG expression and CESC, we classified the CESC patients into Cluster 1 (n = 194) and Cluster 2 (n = 96) by consensus clustering based on ARG expression levels. PCA showed a clear difference in gene expression between the ARG clusters (Fig. 2A). Furthermore, OS differed significantly between the ARG clusters (Fig. 2B). Analysis of the clinical information revealed no statistically significant difference between the two clusters in TNM stage, grade, or stage (Table S3).
Characteristics of TME in different ARG clusters
According to the GSEA results, the DEGs between ARG clusters were mainly enriched in cell movement-related pathways and were therefore associated with tumor metastasis (Table S4). To characterize the TME of the ARG clusters in CESC, we investigated the infiltration levels of immune cell subpopulations in both clusters using CIBERSORT and ssGSEA (Fig. 3A-B). The enrichment of most immune cells differed substantially between the ARG clusters. The enrichment levels of CD8 T cells, CD4 memory T cells, follicular helper T cells, gamma delta T cells, NK cells, M1 macrophages, dendritic cells, Th1 cells, and Th2 cells were significantly higher in Cluster 1 than in Cluster 2, whereas CD8 central memory T cells, mast cells, eosinophils, natural killer cells, and Th17 cells showed the opposite pattern. This suggests that the ARG clusters differ in immune status. We also compared the expression levels of three vital immune checkpoints (ICPs) between the ARG clusters; these ICPs are the inhibitor targets of choice in current clinical trials. PD-1 was highly expressed in Cluster 1, while PD-L1 was highly expressed in Cluster 2 (Fig. 3C).
TME scores assess the abundance of immune and stromal components in the TME. We applied the ESTIMATE algorithm to evaluate the TME scores of the two clusters, including stromal scores, immune scores, and ESTIMATE scores. The TME scores of Cluster 2 were significantly higher than those of Cluster 1 (Fig. 3D). Together, the immune evaluation indicated that the ARG clusters differ in immune status; in particular, CD8 T cell infiltration was significantly different. ICPs shape the immune response by regulating T cells. We therefore speculate that, in CESC, the two clusters may respond differently to immune checkpoint inhibitors (ICIs).
Identification of DEG clusters based on ARG clusters
To investigate the potential biological differences between the ARG clusters, we identified 476 DEGs between the two clusters and performed functional enrichment analysis on them (Fig. 4A). These DEGs were mainly enriched in biological processes such as angiogenesis and cell motility. KEGG pathway analysis revealed associations with angiogenesis, EGFR resistance pathways, cell motility, and related pathways, consistent with the GO enrichment results (Fig. 4B and Table S5). We then conducted a univariate Cox analysis, which identified 52 genes significantly associated with survival (P < 0.05). To investigate specific regulatory mechanisms, we classified the CESC patients into two DEG clusters by consensus clustering based on DEG expression (Fig. 4D). Kaplan-Meier analysis showed a shorter survival time for DEG cluster A (P = 0.033) (Fig. 4C). Figure 4E shows that most ARGs differed substantially in expression between the two DEG clusters. Thus, the DEG clusters corroborated the differences between the ARG clusters.
Establishment and validation of predictive ARGscore
The ARGscore was built from the DEGs between the ARG clusters. We performed LASSO Cox regression on the 52 DEGs to establish the best prediction model and finally obtained 12 genes (Figure S1). The ARGscore was calculated as follows: Risk score = (-0.0487 * CFAP73) + (0.0041 * TSPAN8) + (0.1323 * TLL1) + (0.1427 * IL1B) + (-0.3913 * C11ORF16) + (0.0890 * BCO0) + (0.3722 * ATOH1) + (0.6696 * PCDHAC2) + (1.4462 * CCDC175) + (0.0198 * PAPPA) + (0.5123 * SYT16) + (0.0238 * SLC35F3).
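As a worked illustration of the formula above, the risk score is a weighted sum of the expression values of the 12 genes. The sketch below copies the coefficients and gene symbols verbatim from the text; the expression values of the example patient are hypothetical, and the function name arg_score is introduced here only for illustration.

```python
# Coefficients and gene symbols copied verbatim from the risk score formula in the text.
COEFFICIENTS = {
    "CFAP73": -0.0487,
    "TSPAN8": 0.0041,
    "TLL1": 0.1323,
    "IL1B": 0.1427,
    "C11ORF16": -0.3913,
    "BCO0": 0.0890,
    "ATOH1": 0.3722,
    "PCDHAC2": 0.6696,
    "CCDC175": 1.4462,
    "PAPPA": 0.0198,
    "SYT16": 0.5123,
    "SLC35F3": 0.0238,
}

def arg_score(expression):
    """Weighted sum of expression values over the 12 model genes."""
    missing = sorted(set(COEFFICIENTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

# Hypothetical patient: every model gene at expression 1.0 except CCDC175.
patient = {gene: 1.0 for gene in COEFFICIENTS}
patient["CCDC175"] = 2.0
print(round(arg_score(patient), 4))
```

Because CCDC175 carries the largest coefficient, a one-unit change in its expression shifts the score more than the same change in any other gene of the model.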
Comparing the risk scores of the ARG clusters and DEG clusters, we found that, among the ARG clusters, Cluster 2 had the higher score (Fig. 5A). Among the DEG clusters, Cluster B had the higher score (Fig. 5B). Taken together with the survival analysis, this suggested that a high ARGscore is associated with reduced survival. In addition, Kaplan-Meier analysis showed better survival in low-risk patients (Fig. 5C). The AUCs at 1, 3, and 5 years were 0.815, 0.753, and 0.744, respectively (Fig. 5D). Figure 5E-F also shows that as the ARGscore increases, OS time decreases and mortality increases.
To assess the predictive robustness of the ARGscore, we calculated the risk scores for the validation set and divided the patients into high- and low-risk groups according to the ARGscore (Figure S2). Kaplan-Meier analysis revealed lower survival in the high-risk group than in the low-risk group. Predictions of 1-, 3-, and 5-year survival demonstrated that the ARGscore still achieved a respectable AUC. These results indicate that the ARGscore performs well in assessing the prognosis of CESC patients.
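The comparison of high- and low-risk patients rests on two steps, grouping by ARGscore and estimating survival per group, which can be sketched as below. The text does not state how the cut point was chosen, so the median split here is an assumption, and the function names and example data are hypothetical. The estimator is the standard Kaplan-Meier product-limit formula.

```python
from statistics import median

def split_by_median(scores):
    """Label each patient 'high' or 'low' risk (assumed rule: score above the median is high)."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct time with at least one death.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort: patient id -> ARGscore.
scores = {"P1": 0.2, "P2": 1.9, "P3": 0.7, "P4": 2.5}
groups = split_by_median(scores)

# Hypothetical follow-up in years; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(groups)
print(curve)
```

Censored patients leave the risk set without lowering the survival estimate, which is why follow-up time and event status have to be supplied together.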
Clinical correlation analysis of ARGscore and establishment of a prognostic nomogram
To determine the relationship between the ARGscore and clinicopathological features, we examined the relationship between the ARGscore and stage, grade, T, N, M, and survival status. We found that the higher the T stage, the higher the risk score (Fig. 6A-J). Furthermore, we performed univariate and multivariate Cox analyses to explore the prognostic independence of multiple clinical factors. The risk score was significant in these analyses (Fig. 6K-L), indicating that it is an independent prognostic factor.
Since the risk score is highly correlated with patient prognosis, we built a nomogram combining it with clinical parameters (Fig. 6M). This nomogram was used to assess patients' OS at 1, 3, and 5 years. The calibration curve of the nomogram showed good agreement between the observed and predicted values (Figure S3). In addition, we found that this predictive model with multiple clinical factors had a greater net benefit in predicting prognosis.
Assessing TME and immune checkpoints in high- and low-risk groups
We explored the correlation between the ARGscore and immune cells based on the results of the CIBERSORT analysis. Figure 7A shows that the ARGscore was positively correlated with NK cells, mast cells, B cells, and eosinophils, and negatively correlated with follicular helper T cells, CD8 T cells, CD4 T cells, M1 macrophages, dendritic cells, and neutrophils. We then investigated the correlation between immune cells and the 12 genes used to construct the ARGscore. Most immune cells were closely associated with the selected genes (Fig. 7B). In addition, we selected 22 ICP-associated genes and compared their expression between the high- and low-risk groups. The expression of most ICPs differed significantly between the two groups (Fig. 7C). This indicated that the ARGscore may help guide immunotherapy.
The relationship between ARGscore and TMB and immunotherapy
Numerous studies have demonstrated that TMB is a valuable predictor of tumor immune response. Patients with high TMB can benefit from immune checkpoint inhibitors [14]. We compared TMB between the high- and low-risk groups. The results showed no significant difference in TMB between the groups and no significant correlation between the ARGscore and TMB (Fig. 8A-B). However, when we divided the patients into a high TMB group and a low TMB group (TMB = 2.64 as the cut point), the high TMB group had a better prognosis than the low TMB group (Fig. 8C). We then combined TMB and the ARGscore for survival analysis of CESC patients. The ARGscore eliminated the prognostic benefit in the high TMB group (Fig. 8D).
Immunotherapy is of great clinical value in the treatment of tumors. Nevertheless, only some patients respond to immunotherapy. The TIDE (Tumor Immune Dysfunction and Exclusion) algorithm models two mechanisms of tumor immune escape: induction of T cell dysfunction in tumors with high cytotoxic T lymphocyte (CTL) infiltration, and prevention of T cell infiltration in tumors with low CTL infiltration. By modeling these mechanisms, it can predict the response to tumor immunotherapy [15]. We calculated TIDE scores to assess the immunotherapy response of CESC patients. The TIDE score was higher in the low-risk group than in the high-risk group (Fig. 8E). This indicated that the high-risk group responds less to immunotherapy than the low-risk group, and that the low-risk group is more likely to benefit from immunotherapy.
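The combined stratification by TMB and ARGscore can be sketched as follows. The cut point of 2.64 is taken from the text; treating values strictly above the cut point as high TMB, the group labels, and the example patients are assumptions made for illustration.

```python
from collections import Counter

TMB_CUTOFF = 2.64  # cut point reported in the text

def stratify(tmb, risk_group):
    """Combine TMB status and ARGscore risk group into one of four strata.

    Assumption: TMB strictly above the cut point counts as high TMB.
    """
    if risk_group not in ("high", "low"):
        raise ValueError("risk_group must be 'high' or 'low'")
    tmb_label = "H-TMB" if tmb > TMB_CUTOFF else "L-TMB"
    return f"{tmb_label}+{risk_group}-risk"

# Hypothetical patients: (TMB, ARGscore risk group).
patients = [(3.10, "high"), (2.00, "low"), (5.40, "low"), (1.20, "high")]
strata = [stratify(tmb, group) for tmb, group in patients]
print(Counter(strata))
```

Each of the four strata can then be passed to a survival analysis separately, which is what allows the effect of the ARGscore to be examined within the high TMB group.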