The use of AI is attracting great interest in the medical field and, in particular, in oncology. The recent literature contains a wide range of publications on AI applied to NSCLC, focusing especially on real-world data, genomics, circulomics, and radiomics. In our study, we aimed to develop an algorithm to predict response to and efficacy of IO using real-world data (i.e., clinical, tumour, and treatment data) and translational data (i.e., the result of the MSC test). By combining the current medical literature, the clinical experience of physicians, and ML tools, we developed an algorithm including 5 important features that discriminates between R and NR patients with good accuracy (ACC = 0.756, F1 = 0.722, and AUC = 0.83). The model achieved significantly better results than PD-L1 alone, the only biomarker currently used by physicians in clinical practice to select NSCLC patients for IO, which had an accuracy of ACC = 0.655 on the analysed dataset. To understand whether the algorithm maintains its accuracy using only real-world data, we excluded PD-L1 from the model features. In this case the accuracy of the model decreased, suggesting that, even though PD-L1 alone is not sufficient to provide an effective response prediction, it remains an essential feature for predicting IO response in clinical practice. We did the same with the MSC, since this test is expensive and time-consuming and its introduction into clinical practice therefore needs to be justified. When the MSC was left out of the model, the accuracy decreased, although less than with PD-L1 exclusion, again suggesting that the MSC has a role in our model. We also tested the model after removing the patient’s ECOG, which is a physician-dependent value, and the results demonstrated a significant impact, analogous to that of PD-L1.
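The feature-exclusion analysis described above can be sketched as a leave-one-feature-out loop. The following is a minimal illustration only: the nearest-centroid classifier, the leave-one-out evaluation, the function names, and the toy values are all assumptions made for this sketch, and they are not the model, the validation scheme, or the data of the study.

```python
# Illustrative leave-one-feature-out ablation on synthetic data.
# ASSUMPTIONS: the nearest-centroid classifier, the leave-one-out evaluation,
# and every value below are hypothetical stand-ins, not the study's model or data.


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest (squared Euclidean) to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:  # strict comparison: ties go to the smaller label
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def ablation(rows, labels, feature_names):
    """Accuracy with all features, then with each feature removed in turn."""
    results = {"all": loo_accuracy([list(r) for r in rows], labels)}
    for j, name in enumerate(feature_names):
        reduced = [[v for k, v in enumerate(r) if k != j] for r in rows]
        results["without " + name] = loo_accuracy(reduced, labels)
    return results


# Synthetic patients: (PD-L1, MSC, ECOG), each rescaled to [0, 1] by assumption.
# Label 1 = responder (R), label 0 = non-responder (NR).
toy_rows = [
    (0.9, 0.0, 0.0), (0.8, 0.5, 0.0), (0.7, 0.0, 0.5), (0.6, 0.0, 0.0),
    (0.2, 1.0, 0.5), (0.1, 0.5, 1.0), (0.3, 1.0, 1.0), (0.0, 0.5, 0.5),
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

for setting, acc in ablation(toy_rows, toy_labels, ["PD-L1", "MSC", "ECOG"]).items():
    print(f"{setting}: ACC = {acc:.3f}")
```

Comparing the accuracy of each reduced feature set with the full-model accuracy gives a rough indication of how much each feature contributes; on real data, a drop of this kind would need to be confirmed with an appropriate validation scheme.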
Since the model was able to discriminate between the R and NR groups, we were also able to indirectly predict the PFS and OS of these patients.
With a binary classification approach, we provided a method to identify and predict patients with long OS (≥24 months). Even in this case, the use of ML techniques showed a significant improvement over the use of PD-L1 (ACC = 0.855, F1 = 0.908, and AUC = 0.87 vs. ACC = 0.734).
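For readers less familiar with the three metrics reported here, the following minimal sketch shows how ACC, F1, and AUC are computed for a binary classifier. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not taken from the study.

```python
# Standard definitions of accuracy, F1, and ROC AUC for binary labels.
# ASSUMPTIONS: label 1 stands for long OS (>= 24 months), label 0 for shorter OS;
# the scores and the 0.5 threshold are hypothetical toy values.


def confusion(y_true, y_pred):
    """Return (TP, TN, FP, FN) counts for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of patients classified correctly."""
    tp, tn, _, _ = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, _, fp, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: predicted probabilities of long OS for eight synthetic patients.
toy_true = [1, 1, 1, 1, 1, 0, 0, 0]
toy_scores = [0.95, 0.85, 0.70, 0.55, 0.30, 0.60, 0.25, 0.10]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]

print(f"ACC = {accuracy(toy_true, toy_pred):.3f}")
print(f"F1  = {f1_score(toy_true, toy_pred):.3f}")
print(f"AUC = {roc_auc(toy_true, toy_scores):.3f}")
```

ACC and F1 depend on the chosen decision threshold, whereas AUC is computed from the ranking of the scores alone, which is why the three values can differ noticeably on the same cohort.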
Several papers have recently been published addressing the same unmet clinical need, not only in NSCLC but also in other cancer types.
Radiomic features are frequently used to predict IO response in NSCLC patients. In the study by He et al. (24), which had a dual purpose, radiomics was applied to build a TMB signature. CT images were used to discriminate between High-TMB and Low-TMB patients in a cohort of 327 patients. The model was then applied to an IO dataset of 123 patients to evaluate risk stratification. The TMB radiomic signature reached an AUC of 0.74 (5). This performance was slightly lower than in our study, which may indicate that clinical features and patient presentation are as relevant as tumour features and should be considered in the model.
Khorrami et al. (25) compared changes (“delta”) in the radiomic texture patterns of CT scans (139 patients) and associated them with tumour-infiltrating lymphocyte (TIL) density on the diagnostic biopsies in 36 patients. A linear discriminant analysis classifier yielded an AUC of 0.88 ± 0.08 in distinguishing R from NR patients when CT scan features were combined with TIL density. However, only 36 patients were included in this coupled analysis and, although our study achieved a lower AUC, our model includes 4 real-world features, which are easier to obtain than radiomics and TIL analysis.
Yang et al. (26) used 200 patients to develop a Deep Learning (DL) model integrating different data sources (serial radiomic CT scans, laboratory data, and baseline clinical data) to identify R and NR subgroups for IO in NSCLC patients. The model reported an AUC of 0.80 (95% CI: 0.74–0.86), slightly lower than ours (AUC 0.82). A very interesting study, DeePaN (27), used a deep patient graph convolutional network to investigate IO benefit in NSCLC patients. By integrating real-world data (age, sex, race, histology, stage, ECOG score, smoking status, previous treatment, and blood analyses) and genomics in 1937 patients, the algorithm was able to divide patients into two subgroups, beneficial and non-beneficial, with a mOS of 20.35 and 9.42 months, respectively. Although our sample was smaller, our model was also able to predict survival and response with comparable results. The DeePaN model also demonstrated the positive role of TMB and KRAS mutations in IO patients (27). The study by Tian et al. (28) had a dual purpose: first, to predict a PD-L1 signature (PD-L1ES) using CT images (in 939 patients) and, second, to predict IO response in NSCLC patients by combining PD-L1ES and clinical features (in 77 patients). PD-L1ES was able to distinguish patients with better PFS from those with worse PFS. However, the results of the combined model (PD-L1ES and clinical data) were superior to those of both the clinical and the PD-L1ES models taken individually (28). Our study also confirmed the importance of PD-L1 and the value it adds to clinical features.
The development and validation of a 12-gene immune-relevant prognostic signature for lung adenocarcinoma through ML strategies was investigated in 954 patients to predict IO outcomes. Starting from a discovery dataset of 204 observations, including microarray gene expression data for 1811 genes, Cox regression was used to reduce the number of features to 336. Random Forest was then used to extract the final 12 genes used to compute the risk score. Patients were classified into high- or low-score groups with an AUC of 0.854 (95% CI = 0.79–0.92). Patients with a high risk score experienced shorter survival than those with a low risk score (HR = 10.6, 95% CI = 3.21–34.95, P < 0.001) (29).
Independently of IO, ML and DL techniques are now used in research to predict the prognosis of NSCLC patients treated with different therapies in order to advance precision medicine; however, these techniques are still far from being introduced into clinical practice. An interesting study used DL to implement OS prediction in NSCLC patients by integrating microarray and clinical data. A list of 15 relevant genes was built from 7 known relevant biomarker genes and 8 lesser-known genes. The expression data of the 15 genes and the clinical data were combined to develop an integrative deep NN that predicted the 5-year survival status of NSCLC patients with high accuracy (AUC: 0.8163, accuracy: 75.44%); these data are consistent and comparable with our results (30). Another study developed an algorithm to predict NSCLC survival time in 1000 patients treated with different types of therapy. Thirteen features were included in the algorithm, e.g., number of primaries, tumour size, age, and stage. Random forest was the best model for predicting short-term survival (< 6 months) (31).
Finally, as mentioned above, IO biomarker prediction is an unmet clinical need in other cancer types as well. As in NSCLC, several efforts have been made to find predictive biomarkers of IO response in other cancers using ML or DL methodology. An interesting report on melanoma patients integrated histologic and clinical data to predict IO response. The algorithm consists of a segmentation classifier that takes as input the whole slide image of the patient (haematoxylin and eosin-stained tissue). These results were then combined through a multivariable logistic regression with clinical characteristics such as age, gender, and histologic subtype. The classifier accurately stratified patients into high versus low risk for disease progression with an AUC = 0.80 (32).
Gene expression data were used to separate patients into Durable Clinical Benefit (DCB) and Non-Durable Clinical Benefit (NDCB) groups in metastatic gastric cancer, using a training dataset of 25 (DCB) + 45 (NDCB) patients and a validation cohort of 9 (DCB) + 15 (NDCB) patients, and obtaining an accuracy of ACC = 0.857 on the validation cohort (33).
Lastly, in another work on IO prediction in bladder cancer, CT scans were used to develop an ML model according to RECIST methodology, and the ROIs were processed to extract radiomic features. On a dataset of 43 subjects, the model reached an accuracy of ACC = 0.861 (34).
Our study has several limitations. Firstly, the sample size is limited. Secondly, we did not use radiomic features, and no genomic data were included except for the molecular data required as standard of care.
Many studies are trying to extract more information from imaging (radiomics) and genomic data. Radiomics is a very important frontier, but it is still at an early stage and more time will be needed before it can be included in clinical practice. The same holds for genomics. The approach used in this paper includes routine information from imaging (e.g., RECIST) and real-world genetic data, namely those already investigated as per standard of care; added to the clinical data, both can help to better extract predictive multifactorial information. Such data can be cheaper and easier to collect.