To evaluate whether mutated genes within biological processes can predict ICI treatment responses in metastatic melanoma, we obtained training and validation mutation and clinical datasets from metastatic melanoma patients treated with anti-PD1. For all experiments, models were trained on the same designated training dataset and evaluated on the same designated validation dataset (see methods). Throughout this work, we used Gene Ontology (GO)34,35 to aggregate genes into biological processes. We first investigated whether the mutation load in genes belonging to distinct biological processes can accurately predict ICI responses. For each GO biological process, we counted the number of mutations in that process per sample in the training dataset and used these values to predict anti-PD1 responses. These analyses revealed that the total mutation counts in distinct biological processes were only mildly predictive of response (Supp. Table 1). We surmised that only a subset of the mutated genes within a specific biological process may be predictive of ICI responses. To identify subsets of genes within distinct biological processes in which the mutation count best predicts ICI response, we applied feature selection methods to mutations in each biological process.
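The per-process mutation-load predictor described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the gene names, process names, toy samples, and the helper names `roc_auc` and `process_mutation_load` are all hypothetical, and the ROC AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (responder, non-responder) pairs in which
    the responder has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute ROC AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def process_mutation_load(sample_mutations, process_genes):
    """Total number of mutations a sample carries in the genes of one process."""
    return sum(sample_mutations.get(gene, 0) for gene in process_genes)


# Hypothetical toy data: per-sample mutation counts keyed by gene.
samples = [
    {"GENE_A": 2, "GENE_B": 1, "GENE_C": 0},
    {"GENE_A": 0, "GENE_B": 0, "GENE_C": 1},
    {"GENE_A": 1, "GENE_B": 2},
    {"GENE_C": 0},
]
responses = [1, 0, 1, 0]  # 1 = responder, 0 = non-responder

# Hypothetical gene sets standing in for GO biological processes.
go_processes = {
    "toy process 1": {"GENE_A", "GENE_B"},
    "toy process 2": {"GENE_C"},
}

# One AUC per process, using the process mutation load as the score.
auc_by_process = {
    name: roc_auc([process_mutation_load(s, genes) for s in samples], responses)
    for name, genes in go_processes.items()
}
```

In this toy setting the first process separates responders perfectly while the second does not, which mirrors the idea of ranking processes by how well their mutation load predicts response.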
We used the sum of mutations in selected subsets of genes within distinct biological processes to predict melanoma ICI responders vs. non-responders. The area under the receiver operating characteristic curve (ROC AUC) was used to evaluate the predictive capacity of mutations in subsets of genes belonging to each biological process. We first employed greedy forward feature selection, which iteratively finds the best new feature to add to a set of selected features. In this process, the algorithm starts with an empty set and, at each step, iterates over all genes in a biological process to add the gene that most improves the predictive performance. When using the greedy forward selected genes within each biological process, several biological processes showed high predictive performance on the training dataset (ROC AUC > 0.75). However, none of these predictors maintained high performance on the validation dataset (that is, at least 90% of the training performance; Supp. Table 2). We reasoned that the greedy feature selection strategy impaired generalization by converging to a local optimum. We therefore applied randomized forward feature selection, which sequentially selects features to add using a probabilistic function (see methods for details). In contrast to the greedy forward selector, this approach yielded four processes that performed well on the training dataset and maintained high performance when applied to the validation dataset (Supp. Table 2 and Figure 1A). These include RNA polymerase II transcription regulation, enzyme regulator activity, establishment of protein localization and regulatory regions of nucleic acid binding (Figure 1A). We next applied genetic algorithm feature selection36–38. This method outperformed the forward selection algorithms: selected subsets of mutated genes in 15 processes maintained high performance on the validation dataset (Figure 1A and Supp. Table 2).
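The two forward selectors can be contrasted with a short sketch. This is an illustration only: the probabilistic function used for the randomized selector is specified in the methods and is not reproduced here, so the softmax-style weighting, the `temperature` parameter, the stopping rules, and all function and gene names below are assumptions.

```python
import math
import random


def roc_auc(scores, labels):
    """Pairwise (Mann-Whitney) ROC AUC; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subset_auc(X, y, genes):
    """AUC of the sum-of-mutations score over the given gene subset."""
    return roc_auc([sum(row.get(g, 0) for g in genes) for row in X], y)


def greedy_forward(X, y, candidates, max_genes=5):
    """Add the single best gene at each step; stop when nothing improves the AUC."""
    selected, best_auc = [], 0.5
    while len(selected) < max_genes:
        remaining = [g for g in candidates if g not in selected]
        if not remaining:
            break
        aucs = {g: subset_auc(X, y, selected + [g]) for g in remaining}
        top = max(aucs, key=aucs.get)
        if aucs[top] <= best_auc:
            break
        selected.append(top)
        best_auc = aucs[top]
    return selected, best_auc


def randomized_forward(X, y, candidates, max_genes=5, temperature=0.05, seed=0):
    """Sample the next gene with probability increasing in the resulting AUC
    (assumed softmax weighting) and return the best prefix encountered."""
    rng = random.Random(seed)
    selected, best_subset, best_auc = [], [], 0.5
    while len(selected) < max_genes:
        remaining = [g for g in candidates if g not in selected]
        if not remaining:
            break
        aucs = [subset_auc(X, y, selected + [g]) for g in remaining]
        top = max(aucs)
        weights = [math.exp((a - top) / temperature) for a in aucs]
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        selected.append(remaining[pick])
        if aucs[pick] > best_auc:
            best_subset, best_auc = list(selected), aucs[pick]
    return best_subset, best_auc


# Hypothetical toy training data: mutation counts per gene, 1 = responder.
X_train = [
    {"g1": 2, "g2": 0, "g3": 1},
    {"g1": 1, "g2": 1, "g3": 0},
    {"g1": 3, "g2": 0, "g3": 0},
    {"g1": 0, "g2": 1, "g3": 1},
    {"g1": 0, "g2": 0, "g3": 0},
    {"g1": 0, "g2": 2, "g3": 1},
]
y_train = [1, 1, 1, 0, 0, 0]
genes = ["g1", "g2", "g3"]

greedy_genes, greedy_auc = greedy_forward(X_train, y_train, genes)
rand_genes, rand_auc = randomized_forward(X_train, y_train, genes, seed=1)
```

The greedy selector always follows the locally best step, whereas the randomized selector can occasionally accept a weaker gene and so explore subsets the greedy path never visits; a lower `temperature` makes it behave more like the greedy selector.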
The best-performing processes include immune response, leukocyte differentiation and cell motility (Figure 1A). Several genes that were frequently selected within these processes have important roles in melanoma progression and prognosis. These include CD44, shown to affect tumor progression and subsequent poor prognosis39,40, and TNFSF14, a regulator of T-cell proliferation that is commonly expressed in melanomas41.
Importantly, using all three feature selection methods, the biological processes with the best performance on the training dataset performed significantly better on the validation dataset than processes that showed poor performance on the training dataset (Figure 1B). We found a positive correlation between the performances of selected subsets of mutated genes in different biological processes across the feature selection methods (Figure 1C). Overall, these results support the premise that subsets of mutated genes within specific biological processes maintain predictive performance comparable to that of TMB.
Using selected subsets of mutated genes, none of the best-performing processes demonstrated a substantial improvement over TMB. We reasoned that accounting for complex interactions between mutated genes in biological processes may be critical for prediction of ICI response. We therefore applied non-linear classifiers to mutated genes within each biological process. First, we trained decision tree-based algorithms, including gradient boosting (GB) and random forest (RF), using mutations in all sequenced genes within a biological process. The top biological processes using both methods showed a strong predictive capability across the training and validation datasets (Figure 2A). In contrast to the sum-of-mutations classifiers, the top decision-tree predictors substantially exceeded TMB performance on the validation dataset (Figure 2A, Supp. Table 3). Interestingly, leukocyte proliferation regulation and T-cell proliferation regulation were among the top biological processes, both directly linked to ICI-related immune responses: checkpoint inhibitor antibodies prevent T-cell inhibition and promote the proliferation of effector T cells42, and the response to these treatments requires T-cell proliferation and presence in the tumor microenvironment43 (Figure 2B). We investigated the mutated genes in the leukocyte proliferation regulation process with the highest contribution to the RF prediction capacity. We found that mutations in the beta-catenin gene CTNNB1 had the highest contribution to prediction, in agreement with recent findings that activation of this gene is associated with a reduction in T-cell antitumor response44. In addition, among the top contributing genes in that process we found IL2, which has known antitumor activity through increasing T-cell proliferation and has previously been used clinically to treat cancers5,45, and CD137, another known target for antibody-mediated immunotherapy previously tested in clinical trials46 (Figure 2C).
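The following sketch illustrates why a tree ensemble can outperform a sum-of-mutations score when the response depends on an interaction between genes, and how per-gene contributions can be read from a fitted model. It assumes scikit-learn and NumPy are available; the synthetic data, the XOR-style labelling rule, the hyperparameters, and the gene names are hypothetical and are not taken from this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]  # hypothetical process genes

# Synthetic binary mutation matrices (samples x genes).
X_train = rng.integers(0, 2, size=(80, len(genes)))
X_val = rng.integers(0, 2, size=(40, len(genes)))

# Toy response driven by an interaction: exactly one of GENE_A / GENE_B mutated.
y_train = X_train[:, 0] ^ X_train[:, 1]
y_val = X_val[:, 0] ^ X_val[:, 1]

# Random forest trained on all genes in the process.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

# Validation AUC of the forest versus the plain sum-of-mutations score.
rf_auc = roc_auc_score(y_val, rf.predict_proba(X_val)[:, 1])
sum_auc = roc_auc_score(y_val, X_val.sum(axis=1))

# Impurity-based importances, sorted from most to least contributing gene.
ranked = sorted(zip(genes, rf.feature_importances_), key=lambda t: -t[1])
```

Impurity-based importances are only one way to attribute contributions; permutation importance on held-out data is a common alternative when features are correlated.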
To further investigate non-linear predictors that may capture complex interactions between mutated genes within these processes, we evaluated two classes of neural network models using mutated genes within the top processes. Both the feed-forward neural network and the long short-term memory (LSTM) recurrent neural network models demonstrated high predictive capacity when applied to mutations within these biological processes (Figure 2D, Supp. Table 4).
To evaluate the potential clinical utility of these predictors, we examined their performance using an additional dataset in which not all genes used for training were sequenced. This dataset21 comprises mutation and response data from 38 melanoma patients treated with anti-PD1, but includes only 59-68% of the genes used to train the classifiers (Supp. Table 5). Remarkably, despite this, the process mutation decision tree classifiers maintained their high predictive performance on this dataset (Figure 3A-D, Supp. Table 5). To test the robustness of this approach, we evaluated these classifiers when retrained using different random seeds (see methods). This analysis revealed that the performance on both unseen datasets is maintained with the random forest classifiers and is consistently better than that of TMB (Figure 3E). Notably, random forest classifiers were the most robust when presented with missing features in the test dataset21 (Supp. Figure 1).
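Applying a trained classifier to a dataset that lacks some training genes requires aligning the feature vectors. The sketch below shows one simple option, treating unsequenced genes as unmutated; this zero-filling strategy is an assumption made for illustration (the handling used in this work is described in the methods), and the function and gene names are hypothetical.

```python
def panel_coverage(training_genes, sequenced_genes):
    """Fraction of the genes used for training that the new dataset sequenced."""
    covered = sum(1 for gene in training_genes if gene in sequenced_genes)
    return covered / len(training_genes)


def align_features(sample_mutations, training_genes):
    """Feature vector in training-gene order; genes absent from the sample
    (including genes that were never sequenced) are filled with 0."""
    return [sample_mutations.get(gene, 0) for gene in training_genes]


# Hypothetical example: five training genes, three of them sequenced.
training_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
sequenced_genes = {"GENE_A", "GENE_C", "GENE_E"}
new_sample = {"GENE_A": 1, "GENE_E": 2}

coverage = panel_coverage(training_genes, sequenced_genes)
features = align_features(new_sample, training_genes)
```

Zero-filling cannot distinguish a gene that was not sequenced from one that was sequenced and found unmutated, so it tends to understate mutation load in partially covered processes.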
To further evaluate the potential clinical utility of these classifiers, we assessed their ability to predict overall survival in an independent dataset, the Memorial Sloan Kettering Cancer Center (MSKCC) dataset47 of patients treated with anti-PD1. This MSKCC dataset includes 321 melanoma patients treated with anti-PD1; in this dataset the mutation data are limited to the 468 genes in the MSK-IMPACT targeted set. Nevertheless, the RF mutated process models trained previously were significantly predictive of survival in this dataset, and in particular, the leukocyte proliferation regulation process was significant and strongly predictive (Figure 4A, Supp. Figure 2). Using the predictors based on the sum of mutations and the genetic algorithm feature selection, we found that a higher number of mutations in the leukocyte differentiation process was predictive of ICI response (Figure 1A). We found that the sum of mutations in selected genes in this process was also strongly predictive of overall survival in the MSKCC dataset (Figure 4B).
We then evaluated whether the leukocyte proliferation regulation RF classifier, which obtained the best performance over all datasets, may be applicable to other cancer types. To this end, we applied it to predict overall survival for other cancer types included in the MSKCC dataset. In addition to melanoma, three cancers (colon, bladder, and renal) showed a positive association between the leukocyte proliferation regulation predictor and overall survival following anti-PD1 treatment (Figure 4C). When samples from these four cancer types were pooled, the leukocyte proliferation regulation predictor was significantly predictive of overall survival (Figure 4D).