Cohort demographic and genomic overview
Within the Genomics England 100,000 genomes project cohort (GEL cohort), we identified a total of n = 318 patients who had received either anti-PD-1, anti-PD-L1, anti-CTLA-4 or a combination treatment of CPIs as part of their cancer care and whose tumour was subject to WGS sequencing before CPI treatment (median 221 days). Our pan-cancer cohort comprised 38.99% (124/318) melanoma, 26.41% (84/318) lung, 18.87% (60/318) renal, 4.4% (14/318) bladder, 3.46% (11/318) ovarian, 2.83% (9/318) breast and 5% (16/318) other tumour types (Fig. 1a). The majority of the cohort, 76.73% (244/318), received PD-1/PD-L1 targeting agents as a first CPI therapy, while 22.6% (74/318) had CTLA-4 or CTLA-4/PD-1/PD-L1 combination therapy. As a measure of treatment response, we used the time from the first CPI treatment to the date of the last follow-up or death (days). The median survival time for the cohort was 744 days (95% CI: 544-1,098 days), with a total of 179 events (Fig. 1a). Median coding TMB (cTMB) across the cohort was 4.93 muts/Mb, the median whole-genome TMB (wgTMB) was 6.17 muts/Mb and the median ploidy was 2.10 (Fig. 1a). Demographically, 44.97% (143/318) of the cohort were female and 7.81% (25/318) were non-white participants (Fig. 1b).
Whole-exome derived biomarkers perform similarly in whole-genome data
We first compared TMB derived from coding regions (cTMB) compared to a whole-genome derived TMB (wgTMB). We observed a strong correlation between counts, both pan-cancer (Spearmans rho = 0.98, p < 2.2x10− 16, Fig. 1c), and across individual tumour types (Spearmans rho ≥ 0.84, Extended data Fig. 1a), reflecting that noncoding mutations mirror distinct biological processes, affecting the entire genome22. WgTMB was marginally skewed towards slightly higher counts (Fig. 1d), therefore upon applying a 10 muts/Mb threshold to classify patients into TMB high or low, we observed an additional 4.1% of patients would be reclassified as potentially sensitive to treatment (TMB high) using wgTMB compared to just 0.5% of patients where cTMB predicted a higher estimate (Fig. 1e). Segregating patients into wgTMB high and low based on this threshold showed distinct survival profiles (Fig. 1f). Segregation of patients into discrete groups (high vs low) performed similarly in a tumour-type adjusted Cox model for cTMB (HR 0.608, p = 0.00256) and wgTMB (HR 0.568, p = 0.000392). Across the entire GEL cohort of n = 15,208 cancer patients, TMB showed a treatment-specific association with outcome (HR 0.85, p = 0.03), predicting improved time from diagnosis to date of last follow-up or death, when incorporated into a Cox model adjusted by propensity scores derived from core demographic variables (see methods).
We next benchmarked the performance of canonical determinants of CPI response identified from WES studies against wgTMB and cTMB, including clonal and subclonal TMB14, estimated total predicted neoantigen burden, T cell infiltrate (predicted using T cell ExTRECT)25, aneuploidy26, Indel TMB27 and HLA loss-of-heterozygosity (LOH, using LOHHLA)28. Across the pan-cancer cohort, clonal TMB most significantly predicted improved overall survival (OS) following CPI treatment, with a hazard ratio (HR) of 0.684 (95% CI 0.549–0.852) p = 0.0007, followed by total neoantigen burden HR 0.722 (95% CI 0.566–0.920), p = 0.00841, cTMB HR 0.756 (95% CI 0.602–0.949), p = 0.0160 and wgTMB HR 0.759 (95% CI 0.603–0.954), p = 0.0182 (Fig. 1g). Tumour T cell infiltrate (TCRA fraction) showed a suggestive albeit nonsignificant association with survival following treatment, HR 0.851 (95% CI 0.723–1.0021), p = 0.0531, however, when also adjusted for wgTMB provided independent predictive value HR 0.847 (95% CI 0.721–0.994), p = 0.0429. In a subsample of the cohort, for patients with available digitised H&Es, we noted a moderate to strong correlation between tumour T cell infiltrate and manual TIL scoring performed by a pathologist (Spearman’s rho = 0.58, p = 4x10− 4, Extended data Fig. 1b). Of five measures of lymphocyte spatial arrangement, we noted tumour T cell infiltrate estimated from the genomic TCRA fraction was correlated most strongly with overall stromal lymphocytes and central stromal TILs compared to intratumoural TILs or TILs at the invasive edge (Extended data Fig. 1c). We also observed within this subset, that patients with a high Clonal TMB had higher numbers of TILs (p = 0.0042, Extended data Fig. 1c). When corrected for baseline differences across tumour types a linear model showed clonal tmb and HLA-B LOH were both associated with reduced TIL infiltration on H&Es (adjusted linear model coefficients 8.785, p = 0.0119 and − 17.848, p = 0.0243 respectively).
As biomarkers were called relative to a matched normal DNA, we observed no significant racial inflation and underperformance for significant biomarkers (interaction with ancestry for clonal TMB p = 0.433, total neoantigen burden p = 0.373, cTMB p = 0.374 and wgTMB p = 0.297 respectively)29. Most WES biomarkers also performed similarly across tumour types, however measures of TMB and total neoantigen burden associated more strongly in lung cancer (Extended data Fig. 1e).
Mutational signatures derived from WGS predict differential CPI response
Of 30 single-base WES-derived mutational signatures in COSMIC v2, those linked to tobacco, UV light and APOBEC activity have been previously associated with improved CPI response in a large meta-analysis conducted by our group14. COSMIC v3 now describes an updated compendium of signatures describing 77 single base substitutions (SBS), 11 double base substitutions (DBS) and 18 indel signatures30,31. Within our cohort, signatures associated with improved CPI response were all associated with UV (SBS7a) or tobacco-induced mutational processes (DBS2. DBS4), after adjustment for tumour type and wgTMB (Fig. 2a). Many of these signatures were highly correlated to one another, reflecting the same underlying processes (Extended data Fig. 2a). UV and tobacco signatures were strongly specific for melanoma and lung tumours respectively (Fig. 2b). Interestingly, this analysis also identified four signatures associated with poor outcomes following CPI including SBS44 (defective DNA mismatch repair and microsatellite instability)30, DBS3 (Polymerase epsilon exonuclease domain mutations))30, DBS9 (associated with APOBEC activity)32 and SBS25 (platinum chemotherapy)30. We did not observe any samples with SBS87, a thiopurine chemotherapy exposure-associated signature previously linked to improved CPI response in lung cancer33. SBS38, previously described as CPI resistance-associated34 in WGS-sequenced melanoma, was present in only 6.53% (8/123) of melanoma patients and was not associated with response (HR). Likewise, SBS5, also associated with resistance in melanoma34, was found in all tumour groups and 35.22% (112/318) of the cohort, but did not predict survival after CPI in melanoma or pan-cancer (HR 1.185, p = 0.150 or HR 1.0714, p = 0.398).
As tumours with mutations attributed to chemotherapeutic agents were associated with poor overall survival after CPI agents, we compared the outcome of patients who had received chemotherapy either before (n = 156) or after CPI treatment (n = 71). We observed reduced survival in patients who had received chemotherapy before CPI (Fig. 2c), after correcting for tumour type (multivariate (MV) Cox HR 2.020, p = 0.0020) or tumour type and wgTMB (MV Cox HR 2.31, p = 0.00031). We recapitulated this finding in another independent cohort (p < 0.0001, Fig. 2d). We observed no significant differences in the wgTMB between samples taken before or after chemotherapy (12.571 vs 13.185 respectively, Wilcoxon test p = 0.943) nor a difference in total neoantigen burden (848.11 vs 745.58, p = 0.921). It has previously been noted that patients who have previously received chemotherapy may have higher numbers of subclonal neoantigens35. We observed a higher number of low variant allele frequency (VAF < 5%) mutations in patients who received chemotherapy prior to CPI (one-way Wilcox test p = 0.00707, Fig. 2d), and a lower subclonal TMB (one-way Wilcox test, p = 0.033).
Immune germline SNPs do not strongly predict CPI response
Between 15–20% of the intratumoral variation of interferon signalling is heritable, therefore, modulating the immune response in cancer37. Thus, we considered the effect of germline SNPs which can modulate the tumour-immune microenvironment (TIME)37. Recent studies have identified germline variants in immune cells that can influence the tumour-immune microenvironment37,38. These SNPs have been reported to influence follicular T cell infiltration38, MHC class II expression37, and associate with immune checkpoint blockade response, though the overall heritability of traits is less than 20%37. To characterise the effect of TIME SNPs in our cohort, we considered any association of previously identified SNPs37 with cTMB, wgTMB, and tumour T cell infiltration (quantified from DNA TCRA fraction25. Only two SNPs, rs35356925 and rs1920021, which are negatively associated with interferon expression, were significantly positively associated with tumour TCRA fraction in renal cancer and malignant melanoma (Fig. 3a). These SNPs were significantly negatively associated with cTMB and wgTMB in bladder cancer (Fig. 3a). Next, we used Cox proportional hazards models to test for TIME SNP association with overall survival, in the context of TMB, sex, and disease type (see Methods). No SNPs were significantly associated with overall survival after false-discovery rate (FDR) correction (Fig. 3b), though four SNPs (6:32575658:C:G, OR: -0.19, p = 0.01, q = 0.07; 6:31783507:C:G, OR:-0.15, p = 0.049, q = 0.14, 6:32574305:A:G, OR: -0.16, p = 0.04, q = 0.1, 6:32575493:C:G, OR: -0.16, p = 0.04, q = 0.11), which are negatively associated with MHC-II expression, were nominally associated with overall survival following CPI therapy (Fig. 3b). Empirical power analysis confirmed that our current cohort of n = 318 is powered to detect SNPs with hazard ratios greater than 2, and with an event rate of 0.57 (Fig. 3c), therefore we are able to rule-out single SNPs which may double the risk of death following CPI therapy.
Structural variants facilitate genomic rearrangement with a minimal neoantigen footprint
Structural variants (SVs) represent a potential source of neoantigens39, but also modulate the expression of key genes influencing tumour fates23 and thus, may influence immunotherapy outcome. Measuring the total SV burden, we observed a median of 108 SVs per patient (range 16-1382). SV load significantly differed by tumour type and was highest in breast cancers, and lowest in renal cancers (Extended Fig. 3a). SV burden was only weakly correlated with wgTMB (Spearman’s rho = 0.270, p = 9.78x10− 7).
We next examined how SV burden is associated with outcomes following CPI. Total SV burden was associated with worse outcomes, however, in particular, the total head-to-head (h2hINV), tail-to-tail inversions (t2tINV) and translocations (TRA) most strongly predicted poor survival after CPI treatment in univariable associations, compared to gene duplications (DUP) or deletions (DEL) (Fig. 4a). However, we noted strong collinearity between these SV measures (Extended data Fig. 3b). Tumours with a higher SV burden also had significantly fewer tumour-infiltrating T cells (TCRA fraction, Fig. 4b, two-way Wilcox test p = 6.08x10− 6) when compared to patients with low SV burden. This held true when correcting for both tumour type and wgTMB (linear model coefficient for SV burden [continuous] -0.202, p = 0.000405). Thus, we compared the proportion of SVs which create predicted neoantigens (total SV neoantigens / total SV count), derived from either germline DNA; which are immune tolerated, or somatic DNA samples; which are subject to immune selection. The proportion of SVs forming predicted neoantigens was greatly reduced in somatic samples compared to the proportion in normal (p < 0.00001, Fig. 4c). While all tumours had at least one detected SV, only 27% of patients in the cohort (86/318) had one or more predicted SV neoantigens. In other mutation classes such as SNVs and indels, the total mutation count was strongly correlated with the number of predicted neoantigens (Spearman’s rho = 0.78, p < 2x10− 16, Fig. 4d). In stark contrast, the total SV burden was weakly correlated with the number of predicted SV neoantigens (Spearman’s rho = 0.15, p = 0.014, Fig. 4e).
Chromothripsis, detected using ShatterSeek was present in 31.76% (101/318) of patients in the cohort. The presence of chromothripsis was associated with significantly reduced OS following CPI treatment (Log-Rank p = 0.003, Fig. 4f) and held true when adjusting for tumour type (MV Cox regression HR 1.441, p = 0.0195). The presence of chromothripsis was associated with a significantly higher SV burden (positive mean 146.009 vs negative mean 30.732; two-way Wilcox test p = 2.2x10− 16) and reduced T cell infiltrate (positive mean 0.0683 vs negative mean 0.1207; two-way Wilcox test p = 1.083x10− 5).
TP53 loss-of-function (LOF) mutations are suggested biomarkers for CPI response, often resulting in an increased mutational burden and increased neoantigen load40–43. However, TP53-induced genomic instability is similarly associated with the development of chromothripsis44–47. A significantly higher TMB and neoantigen burden was evident in TP53 LOF mutants (mean 14.636 vs 20.782; p = 0.00898 and 849.182 vs 1219.242; p = 0.00721; Extended data Fig. 3d&e), however, tumours with TP53 mutations were also more likely to have chromothripsis, occurring in 67.56% of tumours with TP53 LOF mutations versus just 37.62% in those without (Chi-squared test of independence p = 1.67x10− 6, Fig. 4g). Patients with TP53 LOF mutations subsequently had lower tumour T cell fractions than those without (Extended data Fig. 3f).
Alterations in the MHC class I pathway associates with CPI resistance
Loss of antigen presentation is a known mechanism of CPI resistance48,49. We aimed to assess the landscape of non-coding short variants affecting regulatory elements (REs) in the MHC class I antigen presentation pathway (see methods, Table 1). We considered mutations in promoter or enhancer regions regulating genes in the class I MHC presentation pathway. We utilized experimentally defined REs inferred from consensus across a range of RE databases, compiled and scored by Genehancer50. REs listed in Genehancer were defined from five databases; dbSUPER, ENCODE, Ensembl, FANTOM5 and VISTA. We then applied stringent filtering to this database to improve confidence in RE:target interactions (see methods). From this filtered list, we assessed mutations in promoters, enhancers and enhancer/promoters targeting genes within the MHC class I presentation and survival following CPI therapy. For mutations occurring in a single enhancer, only one site had a significant association after correction for multiple testing. GH06J033061 (chr6:33061202–33063000 [GRCh38/hg38]), a ~ 1.8kb super-enhancer located on chromosome 6 which targets members of the TAP complex (TAP1, TAP2 and TAPBP), responsible for peptide transport into the ER and acts to stabilize the peptide-MHC complex during loading51. Patients with alterations in this enhancer had significantly shorter survival following CPI treatment (Extended data Fig. 4a). Finally, we also assessed LOH occurring within genes in this pathway and found only loss-of-heterozygosity in ER-resident aminopeptidase ERAP1 was associated with reduced survival following CPI treatment after controlling for multiple testing (Extended data Fig. 4b).
Telomere lengthening reduces CPI efficacy in renal and bladder cancers
Telomere elongation is a common adaptation in tumour cells and increased telomerase reverse transcriptase (TERT) activity has been linked with a more immunosuppressive TME52 and CPI response53–55. Telomere length was estimated in both germline (normal, whole-blood) and tumour WGS samples using Telomerecat56. Consistent with previous observations, telomere attrition was associated with age in normal (whole-blood) derived DNA (Spearman’s rho: -0.3, p = 7x10− 7) but not in tumour samples (Spearman’s rho: -0.041, p = 0.500, Fig. 5a). Telomere length derived from tumour DNA differed marginally across tumour types, with significantly lower length in renal cancer with respect to lung and melanoma (pairwise Dunn’s test p = 0.0238 and p = 0.0293 respectively, Fig. 5b). Overall, compared to normal tissue, tumour tissue had a significantly reduced telomere length (two-sided Mann-Whitney U p = 5.3x10− 5), however, when stratified by tumour type, this divergence was driven primarily by differences in renal and bladder cancers (one-sided Mann-Whitney U with Benjamini-Hochberg correction, p = 1.5x10− 5 and p = 3.55x10− 2 respectively, Fig. 5c&d).
Overall, across the pan-cancer cohort, increased telomere length in tumour tissue did not predict reduced survival following CPI therapy (Cox HR:1.00, p = 0.96). However, it was significantly associated with reduced survival in renal cancers (Cox HR 1.629, p = 0.0046) and bladder cancers (Cox HR 1.963, p = 0.039). Comparing those tumours within the top quartile of telomere lengths to those below this threshold highlights marked differences in overall survival following CPI therapy (Fig. 5e&f). Increased telomere lengths were not a consequence of canonical TERT activating mutations (C250T or C228T) in these cancer types as we observed no significant differences in tumours with or without these mutations (two-way Wilcox test, p = 0.505 and p = 0.1419 for renal and bladder respectively).
Increased intratumoral microbial presence is associated with reduced CPI response
CPI response is influenced by the gut microbiome, with increased gut microbiome diversity associated with improved treatment response. However, there is a paucity of evidence investigating CPI response in relation to the intratumoral microbiome. We therefore explored the association of the intratumoral microbiome with CPI response in a pan-cancer setting. The proportion of reads aligning to the microbiome was similar between tumour types (Kruskal-Wallis p = 0.72, Extended data Fig. 5a). Increased intratumoral microbiome proportion was found to associate with reduced survival following CPI treatment (Extended data Fig. 5b), which remained true when adjusted for tumour type (HR 1.176, p = 0.0236) and Tumour type with wgTMB (HR 1.171, p = 0.03049). Species diversity, measured by Shannon index, was not significantly associated with outcome (HR 1.083, p = 0.2714).
A multivariable model improves the prediction of treatment response compared to TMB alone
Finally, to determine whether WGS-derived biomarkers can improve the prediction of survival following CPI treatment compared to the clinical standard of TMB alone, we created a series of multivariable Cox models. Variables showing an association with survival after CPI in all prior analyses were taken forward for inclusion. To remove strong collinearity, we clustered all variables based on Spearman’s correlation and in highly correlated clusters, selected variables with the strongest univariate association with survival across the pan-cancer cohort. Finally, we applied elastic net penalisation with 5-fold cross validation over 1000 random seeds to remove variables with a coefficient of zero in all permutations. Models were constructed on the pan-cancer dataset and in the most represented tumour types, melanoma, lung, renal and all others. The multivariable pan-cancer model significantly outperformed that of TMB alone (partial likelihood test, p = 0.015), however, it had the lowest overall accuracy (c-index = 0.626, Fig. 6a). Constructing a separate model for the main three tumour types (Melanoma, Lung, Renal) and all others showed promise in improving survival prediction following CPI therapy, with all models outperforming TMB alone (Fig. 6a). This was most apparent in the lung (partial likelihood test p = 0.036, c-index 0.762 vs 0.627) and all others (partial likelihood test p = 0.046, c-index 0.683 vs 0.512). In the multi-variable models, clonal TMB most strongly predicted a positive therapeutic outcome across tumour types, except for the mixed all others group (Fig. 5b). Collectively, these genomic and clinical variables reflect five biological features, firstly at the genomic level; mutation quantity, neoantigen fitness, the broader tumour microenvironment, processes facilitating immune evasion and finally, host immunity (Fig. 5c).