Background: Acute respiratory distress syndrome (ARDS) is a prevalent complication among critically ill patients, accounting for around 10% of intensive care unit (ICU) admissions, with mortality rates ranging from 35% to 46%. Early recognition and prediction of ARDS are therefore crucial for the timely administration of targeted treatment. However, ARDS is frequently underdiagnosed or diagnosed late, and its heterogeneity diminishes the clinical utility of ARDS biomarkers. This study aimed to observe the incidence of ARDS among high-risk patients and to develop and validate an ARDS prediction model using machine learning (ML) techniques based on clinical parameters.
Methods: This prospective cohort study of critically ill patients in China was conducted to derive and validate the prediction model. The derivation cohort, consisting of 400 patients admitted to the ICU of the Peking University Third Hospital (PUTH) between December 2020 and August 2023, was split for training and internal validation, and an external data set of 160 patients at the FU YANG People's Hospital from August 2022 to August 2023 was used for external validation. Least absolute shrinkage and selection operator (LASSO) and multivariate logistic regression were used to screen predictor variables. Multiple ML classification models were trained and compared to identify the best-performing model. Several evaluation indexes were used to compare predictive performance, including the area under the receiver-operating-characteristic curve (AUC) and decision curve analysis (DCA). SHapley Additive exPlanations (SHAP) was used to interpret the ML models.
Results: A total of 400 critically ill patients were included in the analysis, of whom 117 developed ARDS during follow-up. The final model included gender, Lung Injury Prediction Score (LIPS), hepatic disease, shock, and combined lung contusion. Based on the AUC and DCA in the validation group, the logistic regression model demonstrated excellent performance, achieving an AUC of 0.836 (95% CI: 0.762-0.910). In the external validation cohort of 160 patients, 44 of whom developed ARDS, the AUC was 0.799 (95% CI: 0.723-0.875).
Conclusion: Logistic regression models were constructed and interpreted using the SHAP method, providing a basis for screening groups at high risk of ARDS and for guiding individualized treatment for different patients.
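Illustrative note: the AUC values with 95% confidence intervals reported in the Results can be understood through a minimal Python sketch. It computes the AUC as the probability that a randomly chosen ARDS case receives a higher predicted risk than a randomly chosen non-case, and attaches an approximate interval using the Hanley-McNeil standard error. The abstract does not state how its intervals were obtained, so the choice of Hanley-McNeil is an assumption; the function names and the toy scores and labels are hypothetical and are not study data.

```python
import math


def auc_mann_whitney(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC (Hanley-McNeil standard error).

    Assumption: this is one common approximation, not necessarily the
    method used in the study.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))  # guard against tiny negative rounding
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical toy data: predicted risks and observed outcomes (1 = ARDS).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.4, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

toy_auc = auc_mann_whitney(toy_scores, toy_labels)
toy_ci = hanley_mcneil_ci(toy_auc, n_pos=3, n_neg=5)
print(f"AUC = {toy_auc:.3f}, approx. 95% CI = ({toy_ci[0]:.3f}, {toy_ci[1]:.3f})")
```

With cohorts as small as this toy example the interval is very wide; intervals from this approximation need not coincide with those reported above, since bootstrap or DeLong intervals are also commonly used.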