Removing of batch effect
We obtained 11,283 genes in two different CC cohorts (TCGA-CC Project and GES40967 microarray). To identify the comprehensive landscape of TIME, the CIBERSORT algorithm was carried out (Supplementary file 1: Table 1). Figure 1A showed the abundance of 22 TIC types. To further reveal the underlying correlation between these TIC, the connection was employed to visualize the comprehensive landscape of TIME (Figure 1B). PC was most negatively correlated with PC (p < 0.05; r =− 0.4), whereas PC were most positively correlated with T cells CD4+ memory resting (p < 0.05; r = 0.25).
Establishment of the WGCNA network
We performed immune infiltration subgroup analysis on sequencing files of 11,283 genes. The optimal soft threshold power (β) was set to 9 after establishing the scale-free network, as it was the first power value when the scale-free topology index reaches 0.90 (Figure 2A). To enable different modules to form hierarchical clustering trees, the dynamic tree cutting algorithm (module size = 8) introduces genes with similar expression patterns into the same module. According to weighted correlation, hierarchical clustering analysis was employed and the clustering outcome were segmented based on the set criteria to acquire 8 gene modules (Figure 2B). Each row showed the candidate module with characteristics vector genes and each column of presented the 8 TIC types in Figure. 2C. Among 8 candidate modules, the black module was the most correlated with PC (cor = 0.51, p = 5e-33). Therefore, PC-related genes from the black module (Supplementary file 1: Table 2) for further investigate.
Development of risk signature
The expression data and follow-up information from the TCGA-CC project were obtained to further investigate prognostic value of candidate genes. Eight PC-related genes were determined with importance prognostic value by UCR (p < 0.05, Supplementary 1: Table 4). Prognostic features were performed for these HG, lasso regression was employed to avoid overfitting, and 5 PC-related genes correlated with CC prognosis were identified (Figure 3A). After 10 rounds of cross-validation, the optimal value of the penalty parameter was identified (Figure 3B). After MCR analysis, five PC-related genes (CD177, LGALS2, TMC8, TNFRSF13C, VWA5A) were identified, and all were regarded as prognostic indicators (p < 0.05, Supplementary file 2: Table 5). High expression of 3 HG (VWA5A, CD177, LGALS2) was positively correlated with prognosis. TMC8 and TNFRSF13C were the opposite (Figure. 4).
The genome in the TCGA database demonstrated that significantly different expression patterns in CC tissues compared to normal tissues (Supplementary file 2: Figure 1A–E). The HPA database showed that proteins (CD177, LGALS2, TMC8, and VWA5A) were significantly dysregulated in tumor tissue relative to normal samples (Supplementary file 2: Figure 2A–J). Survival analysis of most HG showed abnormal mRNA expression resulted in significantly different OS times (most p < 0.05, Supplementary file 2: Figure 3A–F).
Base on the median expression of HG, all samples were divided into high/low expression group. Subsequently, GSEA identification function enrichment was conducted on the high/low-expressed HG.
As shown in Figure 4A, KEGG showed that high expression of CD177 was concentrated in cell adhesion molecule signaling pathway, neuroactive ligand receptor interaction signaling pathway, and calcium signaling pathway. As shown in Figure 4B, Genesets uncovered that high CD177 expression mainly associated with keratin filament, immunoglobulin complex, and regulation of lymphocyte activation. The three KEGG demonstrated that the high expression of TNFRSF13 were exhibited in Figure 4C, TNFRSF13 high expression was positive enriched in primary immunodeficiency signaling pathway, graft versus host disease signaling pathway, and neuroactive ligand receptor interaction signaling pathway. Figure 4D demonstrated the GO pathway most significantly correlated with high TNFRSF13 expression. The high expression with TNFRSF13 was mainly in adaptive immune response based on somatic recombination of immune, B cell receptor signaling pathway, antigen receptor mediated signaling pathway, and B cell mediated immunity. KEGG showed that the LGALS2 was high expression in allograft rejection signaling pathway, acute myeloid leukemia, and intestinal immune network for IgA production (Figure 4E). Moreover, the GO pathways in immunoglobulin complex circulating and immunoglobulin receptor binding were identified as the most LGALS2 relevant signaling pathways (Figure 4F). KEGG enrichment term exhibited that the high expression of TMC8 was mainly associated with intestinal immune network for IgA production (Figure 4G). Figure 4H demonstrated the GO pathway most significantly correlated with high TMC8 expression. The high expression with TMC8 was mainly in B cell receptor signaling pathways. However, VWA5A did not enrich for the relevant signaling pathways and the GO terms. In addition, we calculated risk scores for 5 HG in the risk profile of CCP. RS = (− 0.1453∗ expression of CD177) + (− 0.2687∗ expression of VWA5A) + (− 0.1755∗ expression of LGALS2) + (− 0.5049∗ expression of TMC8) + (− 0.2988∗ expression of TNFRSF13C). Finally, we classified the corresponding RS into low-risk and high-risk subgroups based on the median cut-off value of the CC samples (1.3001).
Validation of risk prognostic signature
K-M survival curves showed significantly lower OS times in high-risk samples than in low-risk samples (P < 0.001; Figure 3C). Moreover, distributions of dot pot of survival status and RS indicated that shorted OS for high-risk CCP (Figure 3D, E). Next, UCR showed the hazard ratio (HR) of RS was 1.851 (95% CI 1.522−2.250; Figure 3F). The results of MCR (HR =1.765, 95% CI 1.427−2.184; Figure 3G) pointed to RS as an independent prognostic indicator for CC. These outcomes demonstrated that our 5 HG feature has excellent ability to predict clinical of for clinical prognosis.
Correlation of risk signature with clinicopathological variables
As showed in Figure 5A, the distribution of clinical variables in the high/low risk subgroups was recognized and visualized (Figure 5A). Figures 5B-G showed the proportion of clinical subtypes based on age, gender, clinical stage, tumor grade, N category and T status in the low/ high risk subgroups.
Construction of prognostic nomogram
As show in Figure 6A, ROC curves were plotted with AUC value of 0.684, 0.663, and 0.662 for 1-, 3-, and 5-year OS, demonstrating excellent prognostic. To further demonstrated that RS is the most prognostic indicator among multiple clinicopathological variables, we designated age, gender, and clinical stage as candidate prognostic factors.
These clinical characteristics were included in the AUC analysis for 1-year, 3-year, and 5-year OS, and we found that clinical staging acquired the highest AUC values (Figure 6B). Subsequently, a prognostic nomogram including of RS and clinical staging was employed to quantitatively predict prognosis (Figure. 6C). Lastly, calibrate curves demonstrated that the nomogram model had excellent prognosis predictive performance (AUC value > 0.5 Figure 6D).
Correlation of risk signature with TMB
First, TMB levels were detected in both high- and low RS subgroups. The study found that the low RS subgroup had lower TMB levels compared to the low high-risk sample (p = 9.1e-05, Figure 7A). Patients were assigned to different subtypes according to the TMB immune set point [31]. Survival curves discover that the low TMB values available showed longer OS time (P = 0.019, Figure 7B). A correlation analysis further validated the vitual positive association between TMB and RS (R = 0.21, p = 1.41-05; Figure 7C). Subsequently, we validated the cooperative effect of RS and TMB in the prognosis prediction of CC. Stratified survival curves for RS subgroups in low and high TMB status subtypes showed significant prognostic differences (P = 0.001; Figure 7D). Summary, the results demonstrated that RS may serve as an independent prognostic predictor to assess the clinical prognostic of anti-tumor immunotherapy.
In addition, the distribution among high-risk and low-risk scoring subtypes in gene mutations was also identified and visualized. The mutation patterns and clinical characteristics of the top 20 most frequently altered driver genes (Figure 7E, F). The mutation landscape revealed that APC (76% vs. 65%) had a higher somatic mutation rate in the high-RS subtype, while ABCA13 (18% versus 14%) had a higher somatic mutation rate in the low-RS subgroup. The above results provide insight into the intrinsic link between somatic cell mutations and plasma infiltration in CC immunotherapy.
Risk characteristics in TIME context of CC
The intrinsic connection between PC-based RS and TIC explores the fundamental contribution of RS to the sophisticated and variety of TIME. The outcome demonstrated that RS was negatively correlated with subpopulations of T cell CD4+ memory resting, B cells, Macrophage M2, Myeloid dendritic cell resting while positively associated with plenty of Macrophage M1, B PC, T cell follicular helper, T cell CD4+ Th1, Tregs, T cell CD4+ Th2, CD8+ (Supplementary file 2: Figure S4). In addition, as showed in Figure 8A, we further analyzed the Spearman associated of RS with immune infiltration, and detailed results were presented in Supplementary file 1: Table 6.
There was a significant upward trend in stromal scores and immune scores in the low-risk group and an significantly upregulation of ESTIMATE scores in the low-risk samples (Figure 8B).
Enrichment of signaling pathways in high/low risk groups
GSVA was used to explore the biological role of different risk groups in tumorigenesis and progression (Figure 9A). Enhanced activity of the CHEMOKINE pathway, JAK/STAT pathway, T-cell receptor signaling pathway, and B-cells receptor pathway was found in the low-risk group. Gene with high expression levels were enriched in P53 pathway, INSULIN pathway and PPAR pathway in high-risk groups.
Predicting of patients’ clinical outcome to immunotherapy
Response to immunotherapy was further explored. It was discovered that most of the genes associated with ICB (such as PDCD1 and CTLA4, etc.) were importantly and positive associated with RS (Figure. 10B). However, the risk scoring system revealed that scores for IPS-PD1 and CTLA-4 blockers were no significant differences (Supplementary file 2: Figure S5). In summary, these results suggested that RS was potential associated with the response to immunotherapies.
Prediction of response to chemotherapy
The IC50 of 23 chemotherapeutic medicines were estimated in CC patients according to the pRRophetic algorithm. These chemotherapeutic revealed higher IC50 in lower-risk patients (P<0.05; Supplementary file 2: Figure S6). Those results demonstrated that chemotherapeutic agents are more effective in low-risk samples.
Differential expression of CD177 in samples and cells
Dysregulated expression levels of CD177 were most pronounced in these prognostic plasma cell-associated genes. Moreover, the biological function of CD177 gene in CC was further explored in subsequent trial. As showed in Figure 10A, we examined 10 pairs clinical samples by qRT-PCR The results revealed that CD177 expression in CC was substantially lower than that in the adjacent normal tissue. From the protein level low CD177 expression was observed in six patient cancer tissues, while normal tissues had relatively high CD177 expression (Figure 10B).