Background: Variability in current methods for estimating and reporting tumor mutational burden (TMB) calls for a harmonized approach to TMB assessment. Here we compared the TMB distributions in different cancer types using two customized targeted panels commonly used in clinical practice.
Methods: The TMB spectra of the 295- and 1021-gene panels in multiple cancer types were compared using targeted next-generation sequencing (NGS). The TMB distributions across a diverse cohort of 2,332 cancer cases were then investigated for their associations with clinical features. Treatment response data were collected for 222 patients who received immune-checkpoint inhibitors (ICIs); their homologous recombination DNA damage repair (HR-DDR) status and PD-L1 expression were additionally assessed and compared with TMB and response rate.
Results: The median TMB values of the two gene panels were similar despite a wide range of TMB values. The highest median TMB was 8 in patients with squamous cell carcinoma (classified by histopathology) and 10 in patients with esophageal carcinoma (classified by cancer type). Patients with high TMB and HR-DDR positive status could benefit from ICIs therapies (23 patients versus 7 patients with treatment response, P = 0.004). Additionally, PD-L1 expression was not associated with TMB or treatment response among patients receiving ICIs.
Conclusions: Targeted NGS assays were able to evaluate TMB in pan-cancer samples as a tool to predict response to ICIs. In addition, TMB combined with HR-DDR positive status could be a significant biomarker for predicting response to ICIs.
The immune system plays a pivotal role not only in cancer recognition but also in monitoring cells for neoantigen expression and modulating antitumor activity (1). Recently, the use of immune checkpoint inhibitors (ICIs), such as those targeting cytotoxic T lymphocyte-associated antigen 4 (CTLA-4) (2, 3) and programmed cell death receptor-1 (PD-1) and its ligand (PD-L1) (4, 5), has shown remarkable clinical benefit in the treatment of melanoma, lung, renal-cell and prostate cancers (6–8). However, since only a subset of patients responds to ICIs (9), it is of paramount importance to identify biomarkers that predict treatment response and outcomes, allowing more efficient and timely treatment.
Defective DNA mismatch repair (dMMR) and PD-L1 expression are the only two predictive biomarkers for immunotherapy approved by the Food and Drug Administration to date (10, 11). Several studies have demonstrated that the presence of these biomarkers was associated with greater numbers of somatic mutations and tumor neoantigens, influencing the sensitivity of different tumor types (e.g., melanoma, non-small cell lung cancer (NSCLC), and mismatch-repair deficient tumors) to ICIs (10). Tumor mutational burden (TMB), an indirect measure of tumor-derived neoantigens, is measured through whole-exome sequencing (WES) or cancer gene panels and is another extensively studied biomarker that shows promising potential in predicting treatment response to ICIs (12, 13). Mutations in the homologous recombination DNA damage repair (HR-DDR) pathway have also been identified as a potential indicator of treatment response to anticancer therapies (14). To date, no research has assessed the correlation between HR-DDR status and TMB, or whether these biomarkers could jointly better stratify patients for response to ICIs.
To better understand the landscape of TMB across the spectrum of human cancers, we performed a cohort study of 2,332 cancer patients who underwent genetic testing with the 295- or 1021-gene panel on a next-generation sequencing (NGS) platform at the Sun Yat-sen University Cancer Center (SYSUCC) in Guangzhou, China. This study aimed to characterize TMB in detail across various cancer types and to understand the significance of combining TMB and HR-DDR status in predicting response to ICIs, using PD-L1 expression as the reference.
Data of 2,332 patients who underwent genomic profiling with hybridization capture-based NGS assay from January 1, 2017 to January 31, 2020, at the SYSUCC were retrospectively retrieved. Eligible patients were defined as those having a pathologically confirmed cancer diagnosis. We collected clinical data of all patients concerning age, sex, smoking status, the percentage of PD-L1 membranous staining of tumor cells, cancer stage, and family history.
A subset of patients (n = 222) received immunotherapy and their treatment response was characterized as complete or partial response (CR/PR), stable disease (SD), or progressive disease (PD) based on the RECIST 1.1 criteria (ref). We defined the effectiveness of ICIs as durable clinical benefit (DCB) or no durable benefit (NDB). DCB was defined as CR/PR or SD for at least one month, whereas NDB was defined as disease progression within one month after start of ICIs treatment. The study protocol is summarized in Fig. 1 and was approved by the ethical review board of SYSUCC (RDD number: RDDA2020001465). All procedures in this study involving human participants were conducted in accordance with ethical standards of the Medical Ethic Committee of SYSUCC.
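The DCB/NDB rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `classify_benefit` and its arguments are hypothetical, the one-month duration requirement is assumed to apply to CR, PR and SD alike, and every case that does not meet the DCB rule is assumed to be NDB.

```python
VALID_RESPONSES = {"CR", "PR", "SD", "PD"}
CONTROLLED = {"CR", "PR", "SD"}


def classify_benefit(best_response: str, months_without_progression: float) -> str:
    """Return "DCB" or "NDB" for one ICI-treated patient.

    Assumptions (not stated in the source): the one-month threshold applies
    to CR, PR and SD alike, and any patient who does not meet the DCB rule
    is labeled NDB.
    """
    response = best_response.strip().upper()
    if response not in VALID_RESPONSES:
        raise ValueError(f"unknown RECIST category: {best_response!r}")
    if response in CONTROLLED and months_without_progression >= 1.0:
        return "DCB"
    return "NDB"
```

For example, a patient with a partial response lasting three months would be labeled DCB, whereas a patient whose disease progressed after two weeks would be labeled NDB.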
Paraffin-embedded tumor biopsy or surgery samples were retrieved from all patients, with a minimum of 20% tumor cells within each tissue for sequencing, as assessed by a pathologist (Yan-Fen F) through examination of hematoxylin and eosin (HE)-stained slides. We used two targeted sequencing assays: the 295 OncoScreen panel, containing whole exons of 287 genes and selected introns of 22 genes (Burning Rock Biotech Ltd., Guangzhou, China; Supplementary Table S1, online only), and the 1021 Gene panel, containing whole exons and selected introns of 288 genes and selected regions of 733 genes (Geneplus-Beijing, Beijing, China; Supplementary Table S1, online only). Briefly, DNA was extracted from the retrieved tumor samples and from matched peripheral blood or adjacent tissue samples, which were used for TMB assessment and for filtering germline mutations across multiple cancer types. DNA fragmentation was conducted using the Covaris M220 Focused-ultrasonicator (Woburn, MA, USA), followed by end-repair, phosphorylation, and adaptor ligation. Barcoded libraries were generated and sequenced across all exons and selected introns of the custom 295-gene panel (1,740 patients) or 1021-gene panel (592 patients). All indexed libraries were sequenced to a minimal unique coverage depth of 100X on a NextSeq 500 platform (Illumina, San Diego, CA) and Gene+Seq-2000 (Geneplus-Beijing Institute, Suzhou, China). Adaptor sequences and low-quality reads were removed, and the clean reads in FASTQ format were mapped to the human reference genome (hg19) using the Burrows-Wheeler Aligner (BWA, version 0.7.10-r1039). Local alignment optimization, variant calling, and annotation were performed using GATK (version 3.4-46-gbc02625), MuTect, and VarScan, respectively. To normalize the somatic TMB across the 295- and 1021-gene panels, the total number of mutations was divided by the size of the coding region captured by each panel (1.02 and 1.60 mega-bases (Mb), respectively).
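The panel-size normalization can be illustrated with a short Python sketch. The coding-region sizes (1.02 and 1.60 Mb) are taken from the text; the dictionary keys and the function name `tmb_per_mb` are hypothetical and serve only as an example.

```python
# Coding region captured by each panel, in megabases (values from the text).
PANEL_CODING_MB = {"295": 1.02, "1021": 1.60}


def tmb_per_mb(n_mutations: int, panel: str) -> float:
    """Normalize a somatic mutation count to mutations per Mb for one panel."""
    if n_mutations < 0:
        raise ValueError("mutation count cannot be negative")
    if panel not in PANEL_CODING_MB:
        raise KeyError(f"unknown panel: {panel!r}")
    return n_mutations / PANEL_CODING_MB[panel]
```

Under this normalization, the same raw count yields a lower TMB on the larger panel: eight mutations correspond to about 7.8 mutations/Mb on the 295-gene panel and 5.0 mutations/Mb on the 1021-gene panel.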
The calculation of TMB was performed using Ion Reporter™ Analysis Software v5.10 (IR) with the Oncomine™ Tumor Mutation Load w2.0 workflow (Thermo Fisher Scientific). For samples profiled with either the 295- or the 1021-gene panel, we counted somatic missense mutations, nonsense mutations, and coding indels, divided the count by the number of exonic bases with at least 500x coverage, and expressed the result as the number of mutations per Mb of captured genome. Fusions, CNVs, and non-coding mutations were not counted (15). The default limit of detection (LOD) was set at 5% allelic frequency (AF) and was adjusted to 10% when potential deamination artifacts were present. Subsequently, we examined the frequency of common oncogenic mutations found in cancers, such as TP53, KRAS and PIK3CA, and their association with TMB. We also analyzed the mutation status of HRD genes in relation to the TMB spectrum in a subset of 222 patients.
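The counting rule above (only missense, nonsense and coding indel variants at or above the allelic-frequency threshold, divided by the adequately covered exonic bases) can be sketched as follows. This is an illustrative sketch only and does not reproduce the Ion Reporter workflow; the `Variant` record, its field names and the effect labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

# Variant classes that contribute to TMB according to the text.
COUNTED_EFFECTS = {"missense", "nonsense", "coding_indel"}


@dataclass(frozen=True)
class Variant:
    gene: str               # gene symbol, e.g. "TP53"
    effect: str             # hypothetical effect label
    allele_fraction: float  # allelic frequency between 0 and 1


def count_tmb_variants(variants: Iterable[Variant], min_af: float = 0.05) -> int:
    """Count variants that are of a counted class and reach the AF threshold."""
    return sum(
        1
        for v in variants
        if v.effect in COUNTED_EFFECTS and v.allele_fraction >= min_af
    )


def tmb_from_variants(
    variants: Iterable[Variant], covered_exonic_bases: int, min_af: float = 0.05
) -> float:
    """Mutations per Mb over the exonic bases with sufficient coverage."""
    if covered_exonic_bases <= 0:
        raise ValueError("covered_exonic_bases must be positive")
    return count_tmb_variants(variants, min_af) / (covered_exonic_bases / 1_000_000)
```

Passing `min_af=0.10` mimics the stricter threshold applied when deamination artifacts are suspected.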
Formalin-fixed, paraffin-embedded (FFPE) tissue blocks were sectioned at 4 µm for PD-L1 staining with a monoclonal anti-PD-L1 antibody (Ventana SP142 assay on the Ventana Benchmark Ultra), based on previously described methods (16). The sections were then dried and adhered to the slides by baking at 60 ℃ for 1 hour. Immunohistochemical staining was conducted on automated platforms according to the manufacturer’s instructions. According to the current convention for PD-L1 expression assessment, the percentage of tumor cells with membranous staining was scored under an Olympus microscope by two pathologists (Yan-Fen F and Yuan H). In cases of disagreement, a consensus was reached after joint review under a multihead microscope. PD-L1 expression was dichotomized as ≤ 1% or > 1%.
Descriptive statistics were used to characterize the demographic and clinical features of the patients, using mean or median values for continuous variables. Differences in baseline characteristics between patients tested with the two gene panels were assessed using the unpaired t-test and Fisher’s exact test. The nonparametric Mann–Whitney U test was used to test for differences in the median TMB values between the 295- and 1021-gene panels. The correlations between TMB, HR-DDR status and ICIs treatment response were examined using Spearman rank correlation coefficients. All statistical analyses were performed using Stata version 14.0 (StataCorp LLC, Texas, USA), GraphPad Prism 7.04 (San Diego, CA) and R version 3.3.3 software (www.R-project.org). A P value < 0.05 was considered statistically significant.
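For readers unfamiliar with the Mann–Whitney U test used for the panel comparison, the statistic can be sketched in plain Python as below. This is a didactic sketch, not the Stata, Prism or R routines used in the study: the function names are hypothetical, tied values receive average ranks, and the two-sided P value relies on a tie-corrected normal approximation without continuity correction, which is only adequate for reasonably large samples.

```python
import math
from typing import List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U_x, U_y, two-sided P) using a normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    tie_term = sum(t ** 3 - t for t in (pooled.count(v) for v in set(pooled)))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u_x, u_y, 1.0  # all values identical: no evidence of a shift
    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, u_y, p_two_sided
```

Applied to two lists of per-patient TMB values, one per panel, the function returns both U statistics and an approximate P value; an exact test is preferable when either group is small.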
The demographic and clinical characteristics of the enrolled patients are detailed in Table 1. The median age at diagnosis was 55 years (range: 1–92); 53.7% of patients were male and 16.4% were current or former smokers. Smoking status was known for 2,103 patients, including 383 (16.4%) smokers. The distribution of cancer types is illustrated in Fig. 2A. The most common cancer type was colorectal cancer (CRC, n = 681, 29.2%), followed by lung cancer (n = 510, 21.9%), melanoma (n = 232, 10.0%), and gastric cancer (n = 143, 6.3%). Histologically, 1,582 (67.8%) of the cases were adenocarcinoma, irrespective of tumor origin (Fig. 2B). The other histopathological types included mesenchymal tumors, squamous cell carcinoma, and adeno-squamous cell carcinoma.
In total, 2,332 tumor samples were successfully sequenced using the 295- and 1021- gene panels. The overall median TMB was 6 (range: 2-227) and 7 (range: 2-802) mutations per Mb in the 295- and 1021- gene panels, respectively (P < 0.0001, Supplementary Figure S1, online only). Not surprisingly, sex and smoking status were associated with median TMB (P < 0.0001 for sex and smoking status, Supplementary Figure S2, online only).
Among all cancer types, the median TMB ranged from 3 to 10, with the lowest median of 3 (range: 2–9) noted in salivary gland carcinoma (n = 46) and the highest median of 10 (range: 3–41) noted in esophageal carcinoma (n = 19, Fig. 3A and Supplementary Table S2). Fourteen patients had a TMB > 100, including eight CRC patients, three endometrial cancer patients and three melanoma patients. The median TMB was 7 (range: 2-802) among CRC patients, and was greater with the 1021-gene panel (median: 8; range: 2-802) than with the 295-gene panel (median: 7; range: 2-145) (P = 0.0003, Fig. 3B). Among patients with endometrial, gallbladder, liver or gastric cancer, the difference in median TMB between the 295- and 1021-gene panels was also statistically significant (P for difference < 0.05, Fig. 3B and Supplementary Table S2, online only). Although a similar trend was noted for lung cancer, the difference was not statistically significant (median: 6; range: 2–37 for the 295-gene panel and median: 7; range: 2–55 for the 1021-gene panel; P = 0.3066; Fig. 3B). Comparisons of median TMB values for other cancer types are shown in Supplementary Table S2 (online only).
Adenocarcinoma was the predominant histological subtype (67.8%, 1582/2332). The median TMB was 6 among patients with adenocarcinoma, and the difference between the two gene panels was statistically significant (median: 6, range: 2-145; and median: 7, range: 2-802; P for difference < 0.0001, Fig. 3C and D; Supplementary Table S3, online only). The greatest median TMB was observed in neuroendocrine carcinoma patients (median: 8; range: 2–50). The median TMB in squamous cell carcinoma was 10 with the 1021-gene panel, which was higher than with the 295-gene panel (P = 0.003, Fig. 3C; Supplementary Table S3, online only). Relatively low median TMB values were found among patients with mesenchymal tumors (median: 4; range: 2-227) and blastoma (median: 4; range: 2–12) (Fig. 3C), with no significant difference observed between the two panels (Fig. 3D). Please refer to Supplementary Table S3 for more details.
In lung cancer, most cases were adenocarcinoma (404/510, 79.2%). The median TMB was higher in squamous cell lung cancer than in adenocarcinoma (P = 0.0195, Supplementary Figure S3, online only). In CRC, no difference in median TMB was found between left- and right-sided colon cancer, irrespective of the gene panel used (P = 0.0780 and P = 0.5072, Supplementary Figure S4A and B, online only). Median TMB values in specific pathological subtypes of ovarian cancer, gastric cancer and melanoma are summarized in Supplementary Figure S4 (online only).
Variations in TP53 were observed in 1,241 of the 1,993 patients analyzed (62.2%; Supplementary Figure S5A, online only). KRAS mutations were observed in 517 of the 1,993 patients (26.0%), including 304 patients with CRC (58.8%), suggesting that mutated KRAS was probably associated with higher TMB. PIK3CA mutations were found in 9 of the 14 patients with a TMB above 100. The mutation types and frequencies of the other main genes are summarized in relation to TMB in Supplementary Figure S5A. More importantly, variations in ARID1A were the most frequent among patients with high TMB (25.3%, 191/754), followed by BRCA2 (21.4%, 161/754) and ATM (19.0%, 143/754) (Supplementary Figure S5B).
The characteristics of the 222 patients treated with ICIs are described in Table 2; the largest groups were patients with melanoma (n = 107, 48.2%) and lung cancer (n = 34; 15.3%). The prevalence of HR-DDR mutation was 15.8% (n = 35), and the proportion of DCB was 46.8% (n = 104). We found a higher median TMB among patients with HR-DDR mutation than among patients without HR-DDR mutation (P < 0.0001, Fig. 4). Interestingly, patients with HR-DDR mutation and high TMB had better disease control than patients with low TMB (P = 0.0001, Fig. 4). Specifically, in the HR-DDR mutant subgroup, two CRC patients with Lynch syndrome had a DCB, with TMB values of 79 and 80 mutations/Mb, respectively.
Among the 116 patients who were evaluable for PD-L1 expression (Supplementary Figure S6, online only), 52 patients had PD-L1 expression ≤ 1% and a median TMB of 5. The remaining 64 patients had PD-L1 expression > 1% and also had a median TMB of 5. There was no statistically significant correlation between TMB and PD-L1 expression. Likewise, no clear difference in response to ICIs was detected by PD-L1 expression status when TMB was not taken into account.
In this study, we evaluated and compared TMB distributions using the two commercially customized NGS panels among 2,332 cancer patients and further assessed the treatment response in relation to TMB and HRD status in a subset of patients who received ICIs therapy (n = 222). The novelty of the study includes the comprehensive description of TMB among more than 20 cancer types with detailed information on pathological subtypes. The finding of a better treatment response in relation to high TMB plus HR-DDR mutant status in the ICIs treated patients is another novelty.
The targeted panel size has been linked to the accuracy of TMB estimation (15). A sequencing panel comprising more than 300 cancer-related genes can help predict TMB, whereas a panel comprising fewer than 150 genes performs poorly (17). However, Li et al. reported that a 106-CDS (coding sequence) mutation panel was also reliable for estimating TMB (18). In the present study, we used both a 1.02-Mb (295 genes) and a 1.60-Mb (1021 genes) NGS panel, both of which are sufficient for accurate TMB estimation. The median TMB was similar (6 versus 7 mutations/Mb), indicating that diagnostic NGS panels targeting several hundred genes could accurately measure TMB and might be clinically useful.
The present results show that esophageal carcinoma patients had a comparatively higher median TMB than other cancer patients, and most of these patients had squamous cell carcinoma. Smokers had greater TMB than non-smokers. This is consistent with many studies demonstrating that smokers have higher TMB than non-smokers (19–21), and smoking is the major risk factor for lung squamous cell carcinoma (22–24). Accordingly, we observed a higher TMB in patients with lung squamous cell cancer in the present study. Furthermore, a lower TMB was found in patients with salivary gland carcinoma than in patients with other cancers, in agreement with previous reports showing low TMB (range, 3–6) in patients with salivary gland carcinoma (25, 26).
Mutations in a number of genes have been found to be responsible for increased TMB (27), which is important for better understanding this key driver of cancer progression and the related molecular mechanisms. In our study, 14 patients exhibited a TMB value > 100, of whom 8 had CRC. More importantly, 6 of the 14 patients had microsatellite instability-high (MSI-H) tumors and 4 had POLE/POLD1 mutations. A positive correlation between altered microsatellite loci and increased TMB is known (28). POLE/POLD1 are key genes for DNA replication, in which defects can lead to an increased somatic mutation rate (29). In the present study, the highest mutation rate was noted for TP53. This is similar to other studies showing that loss of TP53 by somatic mutation, copy number loss, or epigenetic silencing is very common in cancer and can be associated with increased mutation frequency (30, 31). Our study showed that KRAS mutations occurred mostly in CRC patients, who commonly had high TMB, which is in line with a previous study demonstrating that median TMB was higher among KRAS-mutant patients than among KRAS-wild-type patients (32).
In addition to TMB, other molecular features have also been hypothesized to affect treatment response to ICIs. For example, HR-DDR deficiency has been associated with better response to platinum-based neoadjuvant therapy in breast cancer (33). Perturbations in HR-DDR are considered deleterious to genomic integrity (34). Several studies have reported that PD-1/PD-L1 expression has limited predictive power for treatment response to ICIs, although ICIs targeting PD-1/PD-L1 offer a novel treatment avenue in some cancer types such as CRC, lung cancer, metastatic urothelial carcinoma and melanoma (11, 35–39). In a clinical trial, TMB was shown to be more strongly associated with response to ICIs than PD-L1 expression (40). To date, little has been reported regarding the relationships among alterations of HR-DDR pathways, PD-L1 expression and TMB. In the present study, we described the correlations between HR-DDR status, PD-L1 expression and TMB. Patients with HR-DDR positive status had a higher TMB and better treatment response to ICIs than patients with HR-DDR negative status. There was no significant association between PD-L1 expression and response to ICIs, indicating that HR-DDR and TMB might be better markers for response to ICIs than PD-L1 expression. We also found that 11 of the 14 patients with a TMB above 100 had HR-DDR gene mutations, corroborating a recent study showing that alterations in DDR genes were strongly associated with clinical benefit in patients with metastatic urothelial carcinoma (39).
The present study has two limitations. First, we did not use a TMB cutoff value for analysis, mainly because TMB is a continuous variable without a clearly defined cut point below which responses do not occur and above which response is guaranteed. Each cancer type might have a specific cutoff value, which could be determined as cases of each cancer type accumulate in a subsequent study. Second, although the current study had a large sample size, only a small number of cases could be analyzed for the association of ICIs response with TMB and HR-DDR status, owing to a lack of follow-up data, which may limit the power of the conclusions. A larger cohort is needed to determine the causal relationship between HR-DDR status and TMB.
Our findings showed that the two customized NGS assays, targeting 1.02 and 1.60 Mb of coding sequence respectively, could accurately assess TMB in clinical settings, and that patients with HR-DDR gene alterations are more likely to have higher TMB and to respond better to ICIs. Additional investigation is warranted to evaluate the mechanisms that link HR-DDR alterations, TMB and immunotherapy response, which together might represent a useful predictive biomarker for ICIs therapy.
NGS: next-generation sequencing; ICIs: immune checkpoint inhibitors; CTLA-4: cytotoxic T lymphocyte-associated antigen 4; PD-1: programmed death 1; PD-L1: programmed cell death-ligand 1; dMMR: defective DNA mismatch repair; NSCLC: non-small cell lung cancer; TMB: tumor mutational burden; WES: whole-exome sequencing; HR-DDR: homologous recombination DNA damage repair; SYSUCC: Sun Yat-sen University Cancer Center; CR: complete response; PR: partial response; SD: stable disease; PD: progressive disease; DCB: durable clinical benefit; NDB: no durable benefit; HE: hematoxylin and eosin; Mb: mega-base; AF: allelic frequency; LOD: limit of detection; IHC: immunohistochemistry; FFPE: formalin-fixed, paraffin-embedded; CRC: colorectal cancer.
Ethics approval and consent to participate
Prior written informed consent for the use of clinical data for research purposes was obtained from all patients, and the study was approved by the Institute Research Ethics Committee of Sun Yat‑sen University Cancer Center.
Consent for publication
All the authors have reviewed and approved the final manuscript for publication.
Availability of data and materials
The key raw data have been deposited into the Research Data Deposit, with the approval number of RDDA2020001465 and the datasets used in this study are publicly available.
The authors declare that they have no competing interests.
This study was partially supported by Natural Science Foundation of Guangdong Province (2020A1515010313) and the National Natural Science Foundation of China (81602468).
We thank Dr. Seeruttun Sharvesh Raj for the professional English language revision of this manuscript.
Study concepts and design: HYW, FW. Data acquisition: LD, YQL, YH. Experiment conduct: XZ, XZ, YKL, TT. Pathology evaluation: YFF. Data analysis and interpretation: HYW, XHY, FW. Manuscript drafting and revision: HYW, LD, QYL, XZ, YKL, XZ, YFF, YH, TT, XHY, FW. All authors read and approved the final manuscript.
Due to technical limitations Table 1 and Table 2 are available as downloads in the Supplementary Files.