Background: Tumour mutation burden (TMB), defined as the number of somatic mutations per megabase within the sequenced region in the tumour sample, has been used as a biomarker for predicting response to immune therapy. Several studies have been conducted to assess the utility of TMB for various cancer types; however, methods to measure TMB have not been adequately evaluated. In this study, we identified two sources of bias in current methods to calculate TMB.
Methods: We used simulated data to quantify the two sources of bias and their effect on TMB calculation. We down-sampled sequencing reads from TCGA exome sequencing datasets to evaluate the consistency of TMB estimation across sequencing depths, and we analysed data from ten cancer cohorts to investigate the relationship between inferred TMB and sequencing depth.
Results: We found that TMB estimated by counting the number of somatic mutations above a threshold frequency (typically 0.05) is not robust to sequencing depth. Furthermore, we showed that, because only mutations with an observed frequency greater than the threshold are considered, the observed mutant allele frequency provides a biased estimate of the true frequency. This can result in substantial over-estimation of TMB when the cancer sample includes a large number of somatic mutations at low frequencies, and it exacerbates the lack of robustness of TMB to variation in sequencing depth and tumour purity.
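The thresholding effect described in the Results can be illustrated with a small simulation. The sketch below is not the simulation used in the study: it assumes that the number of mutant reads at each site follows a binomial distribution given the sequencing depth and the true allele frequency, and the function name `simulate_calls`, the true frequency of 0.03, the depths, and the number of mutations are all hypothetical choices made for illustration. It counts how many mutations with a true frequency below the 0.05 threshold are nevertheless called at each depth, and reports the mean observed frequency among the called mutations.

```python
import random


def simulate_calls(true_freq, depth, n_mutations, threshold=0.05, rng=None):
    """Return the observed allele frequencies of simulated mutations
    whose observed frequency is strictly greater than the threshold.

    Assumption: mutant read counts are Binomial(depth, true_freq).
    """
    rng = rng or random.Random(0)
    called = []
    for _ in range(n_mutations):
        # Draw each read independently as mutant with probability true_freq.
        alt_reads = sum(rng.random() < true_freq for _ in range(depth))
        observed_freq = alt_reads / depth
        if observed_freq > threshold:
            called.append(observed_freq)
    return called


rng = random.Random(42)   # fixed seed so the run is reproducible
true_freq = 0.03          # illustrative: below the 0.05 threshold
n_mutations = 2000        # illustrative number of low-frequency mutations

results = {}
for depth in (40, 100, 400):  # illustrative sequencing depths
    called = simulate_calls(true_freq, depth, n_mutations, rng=rng)
    mean_freq = sum(called) / len(called) if called else float("nan")
    results[depth] = (len(called), mean_freq)
    print(f"depth={depth:4d}  called={len(called):5d}  "
          f"mean observed frequency among called={mean_freq:.3f}")
```

Under these assumptions, every called mutation has an observed frequency above 0.05 although its true frequency is 0.03, so the observed frequency of the called mutations over-estimates the true frequency. The number of such calls also shrinks as depth increases, which is one way a threshold-based count can depend on sequencing depth.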
Conclusion: Our results demonstrate that care is needed when estimating TMB to ensure that results are unbiased and consistent across studies. We suggest that accurate and robust estimation of TMB could be achieved using statistical models that estimate the full mutant allele frequency spectrum.