Landscape and genetic variation of costimulatory molecules in SCLC
GSEA revealed that inhibition of T cell activation is a typical feature of SCLC (Figure 1A). Considering the essential role of costimulatory molecules in regulating the T cell response, we systematically explored the expression profiles of costimulatory molecules from the B7-CD28 and TNF families. The B7-CD28 family consists of 8 ligands (CD80, CD86, ICOSLG, PD-L1, PD-L2, CD276, VTCN1, and HHLA2) and 5 receptors (CTLA4, CD28, ICOS, PD-1, and TMIGD2), and the TNF family consists of 19 TNFSF ligands and 29 TNFRSF receptors, as identified in previous studies (Figure 1B). Next, we visualized the mutation spectrum of these molecules in a waterfall plot (Figure 1C). Exploration of somatic mutations of 60 costimulatory molecules in 110 SCLC patients revealed that 30 samples (27.3%) carried mutations. Missense mutations were the most common mutation type in the SCLC samples. We investigated the discriminatory capacity of costimulatory molecules in SCLC patients by using three-dimensional principal component analysis (PCA) to differentiate groups based on gene expression levels (Figure 1D). In addition, a heatmap showing the differences in gene expression between normal lung and SCLC tissues was generated (Figure 1E). Detailed information regarding the expression levels of costimulatory molecules is presented in Figures 1F and 1G. A correlation matrix showed that the costimulatory molecules were positively correlated with each other, suggesting potentially important synergistic effects (Supplementary Figure 1).
Robust prognostic gene identification and CMS construction in the training cohort
To develop a prognostic model for clinical use, we first used univariate Cox analysis to select prognostic genes from the set of costimulatory molecules, which led to the selection of 18 genes with clinical relevance (P<0.2) (Figure 2A and 2B). Next, LASSO Cox regression identified seven prognosis-related genes, including six protective genes (PDCD1, CD276, TNFSF14, TNFRSF25, ICOSLG, and RELT) and one risk gene (EDA2R) (Figure 2C-2E). These 7 OS-relevant costimulatory molecules were used to build the CMS. The risk score formula was as follows: Risk score = (-0.358 * ICOSLG) + (0.4192 * EDA2R) + (-0.3482 * TNFRSF25) + (-0.2544 * CD276) + (-0.4414 * RELT) + (-0.0781 * PDCD1) + (0.00671 * TNFSF14). The risk score of each patient was calculated, and Pearson correlation analysis showed the relationship between the risk score and the selected prognostic genes (Figure 2F). Patients were ranked according to the continuous risk score and divided into high-risk (n = 42) and low-risk (n = 35) groups according to the optimal cutoff point (Figure 3A). The 7 selected genes were differentially expressed between the two risk groups, and PCA also showed clear separation between the high-risk and low-risk groups (Figure 3A-3B). Kaplan-Meier curves were used to compare the OS of SCLC patients, and the high-risk group had markedly worse OS than the low-risk group (HR = 3.00, 95% CI: 1.68-5.38, P<0.001) in the training cohort (Figure 3C). The prognostic accuracy of the CMS was determined using time-dependent ROC analysis at 1, 3, and 5 years of follow-up (Figure 3D). Additionally, the performance of the CMS was compared with that of widely accepted prognostic and predictive factors for SCLC, including sex, age, tobacco use, and SCLC staging, using ROC curves (Figure 3E). The AUC value of the risk score was the highest for predicting survival (0.729).
The risk score model achieved a C-index as high as 0.815 (Figure 3F), suggesting that the CMS possesses good predictive capacity.
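The risk score formula above can be illustrated with a minimal Python sketch. The coefficients are copied from the text; the function names, the form of the expression input (one value per gene, in whatever normalized unit the model was trained on), and the cutoff value are assumptions made for illustration, since the text does not report the optimal cutoff.

```python
# Coefficients copied from the risk score formula in the text.
CMS_COEFFICIENTS = {
    "ICOSLG": -0.358,
    "EDA2R": 0.4192,
    "TNFRSF25": -0.3482,
    "CD276": -0.2544,
    "RELT": -0.4414,
    "PDCD1": -0.0781,
    "TNFSF14": 0.00671,
}


def cms_risk_score(expression):
    """Return the sum of coefficient * expression over the seven CMS genes.

    `expression` maps gene symbol -> expression value (units are an assumption).
    """
    missing = set(CMS_COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in CMS_COEFFICIENTS.items())


def stratify(score, cutoff):
    """Label a patient 'high' when the score exceeds the cutoff, else 'low'.

    The cutoff is a hypothetical input; the text does not state its value.
    """
    return "high" if score > cutoff else "low"
```

For example, `stratify(cms_risk_score(patient_expression), cutoff)` would assign one patient to a risk group once a cutoff has been chosen on the training data.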
Validation of the performance of the CMS for predicting clinical outcomes in the validation cohort
To assess the prognostic ability of the CMS for SCLC in a different population, we selected 131 SCLC FFPE samples with qRT-PCR data from the NCC as the validation cohort and calculated the risk score of each case. Kaplan-Meier analysis showed that the high-risk (n = 73) and low-risk (n = 58) patients had significantly different survival, with worse outcomes in the high-risk group (HR = 3.55, 95% CI: 2.24-5.61, P<0.001) (Figure 4A). We conducted a time-dependent ROC analysis to assess the prognostic accuracy of the CMS, and the AUC values at 1, 3, and 5 years were 0.744, 0.684, and 0.716, respectively (Figure 4B). A comparison of the predictive performance of the CMS (AUC = 0.684, C-index = 0.801) with other recognized factors, including sex (AUC = 0.50, C-index = 0.498), age (AUC = 0.546, C-index = 0.608), tobacco use (AUC = 0.538, C-index = 0.551), and SCLC staging (AUC = 0.665, C-index = 0.675), showed that the CMS had the highest predictive accuracy for 3-year survival in both the ROC and C-index analyses (Figure 4C and 4D). Moreover, high-risk (n = 73) and low-risk (n = 58) patients showed a clear difference in relapse-free survival (RFS) (HR = 2.69, 95% CI: 1.76-4.11, P<0.001) (Figure 4E), and the AUC values at 1, 3, and 5 years were 0.633, 0.688, and 0.672, respectively (Figure 4F). Subsequent ROC and C-index analyses revealed that the CMS had better predictive capacity for RFS than the other predictive factors (Figure 4G and 4H).
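The C-index values reported above measure how often a model ranks patients correctly by risk. A minimal sketch of Harrell's concordance index follows, assuming right-censored data supplied as parallel lists; this is a textbook formulation for illustration, not necessarily the software or tie-handling used in the study, and the function name is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the patient with the
    shorter observed survival also has the higher risk score.

    times       -- follow-up time per patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output, higher meaning worse expected outcome
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the CMS (0.801) and sex (0.498) are compared above.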
Validation of the CMS in different clinical subgroups
Patients with complete clinical information were subjected to subgroup analysis to confirm the stability of the CMS in the training cohort. The subgroups were defined by sex (male or female), age (≥ 60 or < 60 years), and smoking history (smoker or non-smoker). Kaplan-Meier analysis showed that the high- and low-risk groups had distinct survival, and the low-risk group had markedly better OS in every subgroup (Supplementary Figure 2).
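The Kaplan-Meier curves used throughout these comparisons follow the product-limit estimator. A minimal sketch is given below, assuming follow-up times and event indicators as parallel lists; the function name is hypothetical, and the log-rank test and hazard ratio estimation used for the group comparisons are not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1/True if the event was observed, 0/False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Running the function separately on the high-risk and low-risk patients of a subgroup gives the two curves that are compared in each panel.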
CMS is an independent risk factor for the prognosis of SCLC patients
To further explore whether the CMS could act as an independent predictive factor for SCLC patients, univariate and multivariate Cox regression analyses were performed using the training cohort (Table 2). The results indicated that the CMS was an independent risk factor for SCLC patients’ survival after controlling for sex, age, smoking status, and SCLC staging (HR = 3.997, 95% CI: 2.078-7.688, P<0.001). Next, the CMS was tested in the validation cohort of 131 SCLC patients with qRT-PCR data, and the results confirmed that the CMS was an independent risk factor for OS (HR = 3.696, 95% CI: 2.132-6.409, P<0.001) and RFS (HR = 2.566, 95% CI: 1.572-4.186, P<0.001) after adjusting for confounders (age, sex, smoking, and SCLC staging).
The predictive value in estimating clinical benefits of adjuvant chemotherapy (ACT) for SCLC patients
ACT is the standard of care for SCLC after surgery, but most patients develop drug resistance and the clinical efficacy is far from satisfactory. Therefore, we analyzed the relationship between the risk score and ACT response in SCLC patients. In the training cohort, we calculated the risk score using the same formula as above and divided patients into high-risk (n = 24) and low-risk (n = 26) groups by the optimal cutoff point. Based on the Kaplan-Meier analysis, patients with low risk scores derived greater benefit from ACT (HR = 2.33, 95% CI: 1.11-4.87, P = 0.011) (Figure 5A). The AUC values of the risk score at 1, 3, and 5 years were 0.84, 0.649, and 0.683, respectively, suggesting that the CMS had good predictive accuracy (Figure 5B). Moreover, the ROC curves showed that the CMS (AUC = 0.649) outperformed sex (AUC = 0.615), age (AUC = 0.527), and tobacco use (AUC = 0.503), and was comparable to SCLC staging (AUC = 0.693) (Figure 5C). The C-index also confirmed the better predictive ability of the CMS (C-index = 0.727) compared with the other factors (Figure 5D). In the validation cohort, we also investigated the association between the risk score and ACT response, with RFS and OS used as treatment outcomes. Kaplan-Meier curves indicated that high-risk (n = 63) and low-risk (n = 49) patients had significantly different OS (HR = 3.62, 95% CI: 2.20-5.96, P<0.001) and RFS (HR = 2.59, 95% CI: 1.63-4.10, P<0.001) (Figure 5E and 5I), with high-risk patients having worse clinical outcomes after ACT. The predictive accuracy of the CMS was further confirmed by subsequent ROC and C-index analyses, which demonstrated the superior predictive value of the CMS in comparison with previously recognized predictive factors (Figure 5F-5H and Figure 5J-5K).
Functional enrichment analysis of the CMS
The robustness of the CMS in predicting clinical outcomes prompted us to explore the biological role of its members. We first selected genes that were closely correlated with the risk score (Pearson |R| > 0.4), which yielded 670 negatively correlated genes and 187 positively correlated genes (Figure 6A). Next, we conducted GO and KEGG pathway analyses using the DAVID database. The top enriched GO terms were related to the immune response (Figure 6B), and the KEGG analysis showed that the CMS was closely associated with immune-related pathways (Figure 6C). Moreover, to better understand CMS-related inflammatory activity, the relationships between the CMS and seven metagenes were investigated. A metagene is a pattern of gene expression, and each metagene represents a cluster of genes that are functionally correlated. We generated the seven metagenes using GSVA. The expression profiles of the inflammatory metagenes and the risk score are presented in Figure 6D. Pearson correlation analysis revealed that the risk score was negatively associated with some metagenes, including HCK, LCK, MHC_I, and MHC_II (Figure 6E).
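The gene selection step described above (Pearson |R| > 0.4 against the risk score) can be sketched as follows. The 0.4 threshold comes from the text; the function names and the input layout (a dict mapping each gene to its expression values across the same patients, in the same order as the risk scores) are assumptions for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def split_by_correlation(risk_scores, expression_by_gene, threshold=0.4):
    """Return (positive, negative) gene lists with |R| above the threshold."""
    positive, negative = [], []
    for gene, values in expression_by_gene.items():
        r = pearson_r(risk_scores, values)
        if r > threshold:
            positive.append(gene)
        elif r < -threshold:
            negative.append(gene)
    return positive, negative
```

The two returned lists correspond to the positively and negatively correlated gene sets that would then be passed to the GO and KEGG enrichment tools.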
TIICs and immune checkpoints associated with risk score
Since TIICs are key elements of the TME that are essential for the carcinogenesis of SCLC, we characterized the patterns of immune cell infiltration in SCLC patients. ESTIMATE scoring was used to calculate the stromal and immune scores in the TME. The risk score was significantly negatively correlated with the stromal score (Coefficient = -0.226, P = 0.018) and immune score (Coefficient = -0.269, P = 0.048), and positively correlated with tumor purity (Coefficient = 0.2565, P = 0.024) (Figure 7A). The correlation matrix also showed a negative relationship between the risk score and various immune cells (Figure 7B). The heatmap displayed distinct immune cell profiles in high- and low-risk patients; low-risk patients showed greater infiltration by immune cells, especially CD8+ T cells (Figure 7C). In addition, to examine the relationship between the risk score and immune checkpoints in SCLC, Pearson correlation analysis was performed and showed that the risk score was negatively correlated with immune checkpoints (Figure 7D and 7E).