In lung cancer screening, LDCT is used to detect pulmonary nodules and evaluate their size and morphology. Most pulmonary nodules are small (< 5 mm in diameter) and benign, and their morphology is variable. Across the lung cancer screening literature, the major challenge faced by this diagnostic imaging modality is the difficulty of defining a “positive scan” [23, 24]. The false-positive rate of the Lung-RADS has increased because of the large variation in lung cancer demographics between populations, which limits the reliability of this tool. In addition, applying the unitary criteria without appropriate validation may result in false-positive results, overdiagnosis, and unnecessary costs. In this study, the Lung-RADS predicted lung cancer risks for the validation cohort with an AUC of 0.76, which indicated suboptimal discriminative power for assessing lung cancer risks in the population. The principles of the Lung-RADS are uniformity of radiology interpretation, risk assessment, and nodule management in LDCT lung cancer screening programmes. However, although the clinical presentations of lung cancer are likely to vary greatly between populations, some of the corresponding imaging findings are not assessed. One possible remedy for this obstacle is the development of a validated prediction model for lung cancer risk using artificial intelligence algorithms, such as ANNs. In this study, the ANN took many risk factors into account and predicted lung cancer risks for the validation cohort with an AUC of 0.87. Among high-risk groups, overdiagnosis and unnecessary procedures might be avoided when patients are identified correctly by ANNs. Compared to the Lung-RADS, ANNs may be more robust in the prediction of lung cancer. Additionally, the standardized structured reports in this study used lung nodule descriptions from the Lung-RADS lexicon suggested by the ACR. As these input features can be easily identified and are generally assessed by radiologists, the ANN-based LDCT reporting system is both cost-effective and user-friendly.
Along with risk group identification, we investigated the predictors of lung cancer that are potentially useful for identifying patients at high risk for lung cancer. According to the heatmap derived from the ANN, three features, i.e., solid nodules, partially solid nodules, and GGNs, were identified as significant predictors of malignant outcomes. In conformity with the NLST and Lung-RADS criteria, this strongly implies that the ANN predicts lung cancer mainly on the basis of the documented nodule size in each category. Furthermore, this study addressed the diversity of lung cancer risk assessments in populations with a high percentage of non-smoking-related lung cancer. Among the subjects in this study, more than one-third of the confirmed lung cancer lesions presented as GGNs < 20 mm (5 of 19 lung cancer cases in the derivation cohort and 5 of 8 lung cancer cases in the validation cohort). When the Lung-RADS was applied, these patients were classified as category 2; they may have been falsely reassured by the “negative” screening results and therefore not have returned for follow-up scans. Of the 8 lung cancer cases in the validation cohort, 5 initially presented with GGNs < 20 mm and were finally confirmed as adenocarcinoma; the ANN identified all 5 (100%) of these patients. In several studies performed in Asian cohorts, the majority of lung cancer patients were non-smokers with pulmonary adenocarcinoma spectrum lesions, which typically presented as pure GGNs or partially solid nodules [27, 28]. The current literature shows that larger GGNs (variable cut-off, range 10.5 to 15.0 mm) tend to be more aggressive or appear as invasive pulmonary adenocarcinoma [29, 30]. This is a particular concern in Asian populations, where it would be important to report these GGNs and develop corresponding algorithms with follow-up strategies. Therefore, the ANN potentially assimilates population-specific demographic characteristics and provides important insights that improve the efficacy of lung cancer screening programmes.
This study had several limitations. First, classification models based on machine learning tend to be unstable in small datasets. Therefore, both models in this study were externally validated using a prospective cohort. Second, the positive and negative predictive values were influenced by the prevalence of disease in the study population. The assumed prevalence of 3% is a rough estimate, as mentioned above, and is therefore arbitrary to some extent. Finally, the short follow-up period may have caused partial verification bias. A large-scale prospective study with long-term follow-up is required to explore the benefits of using an ANN as part of an LDCT lung cancer screening programme.
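The dependence of the predictive values on prevalence noted in the second limitation follows directly from Bayes' theorem. As a brief illustration, the symbols $\mathrm{Se}$ (sensitivity), $\mathrm{Sp}$ (specificity), and $p$ (disease prevalence) are notation assumed here and are not defined elsewhere in this study:

```latex
% Standard Bayes'-theorem identities; Se, Sp, and p are assumed notation.
\[
\mathrm{PPV} = \frac{\mathrm{Se}\, p}{\mathrm{Se}\, p + (1 - \mathrm{Sp})(1 - p)},
\qquad
\mathrm{NPV} = \frac{\mathrm{Sp}\,(1 - p)}{\mathrm{Sp}\,(1 - p) + (1 - \mathrm{Se})\, p}
\]
```

For fixed sensitivity and specificity, the positive predictive value falls and the negative predictive value rises as $p$ decreases, so predictive values computed under one assumed prevalence may not transfer to a screening population with a different disease burden.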