Baseline characteristics
Our analysis included 2,825 individuals. Of these, 2,078 still met the diagnostic criteria for ARDS after 24 hours of standard ventilatory conditions (PEEP ≥ 10 cm H2O, FiO2 ≥ 0.5), whereas 747 did not. The characteristics of the ARDS and non-ARDS groups are summarized in Table 1. Patients who met the diagnostic criteria for ARDS were older and had significantly higher SAPS II scores, along with higher rates of in-hospital and 28-day mortality, than those who did not. In the ARDS group, the minimum values of pH, calcium, PaCO2, bicarbonate and base excess, and the maximum values of bicarbonate and base excess, were significantly higher (p < 0.001). Conversely, the minimum value of PaO2, the maximum value of SpO2 and the PaO2/FiO2 were significantly lower in ARDS patients than in their non-ARDS counterparts (p < 0.001).
Table 1
Characteristics of participants
Variable | Overall | Non-ARDS | ARDS | p |
| n = 2825 | n = 747 | n = 2078 | |
gender (%) | | | | 0.017 |
male | 1042 (36.9) | 303 ( 40.6) | 739 ( 35.6) | |
female | 1783 (63.1) | 444 ( 59.4) | 1339 ( 64.4) | |
age (years) | 59.31 (24.58) | 56.79 (30.83) | 60.22 (21.83) | 0.001 |
Weight(kg) | 92.51 (27.88) | 92.21 (32.04) | 92.62 (26.22) | 0.727 |
Height(cm) | 171.38 (9.12) | 171.40 (9.24) | 171.37 (9.08) | 0.956 |
Heartrate_min (beats/minute) | 75.40 (16.46) | 76.45 (16.96) | 75.02 (16.26) | 0.041 |
Heartrate_max (beats/minute) | 113.09 (21.83) | 115.29 (20.84) | 112.30 (22.12) | 0.001 |
Sysbp_min (mmHg) | 83.16 (14.89) | 82.87 (15.33) | 83.27 (14.73) | 0.53 |
Sysbp_max(mmHg) | 150.69 (25.13) | 151.12 (24.11) | 150.53 (25.49) | 0.582 |
Diasbp_min(mmHg) | 43.30 (10.36) | 43.11 (10.99) | 43.37 (10.12) | 0.549 |
Diasbp_max(mmHg) | 85.81 (19.51) | 86.84 (19.81) | 85.44 (19.39) | 0.092 |
Meanbp_min(mmHg) | 53.90 (14.10) | 53.94 (14.15) | 53.88 (14.09) | 0.916 |
Meanbp_max(mmHg) | 111.85 (36.04) | 113.84 (36.43) | 111.13 (35.87) | 0.079 |
Respiratory rate_min | 13.16 (4.48) | 13.21 (4.46) | 13.15 (4.49) | 0.744 |
Respiratory rate_max | 30.29 (7.26) | 30.43 (7.18) | 30.24 (7.29) | 0.543 |
Temperature_min(℃) | 36.14 (1.21) | 36.04 (1.40) | 36.18 (1.13) | 0.007 |
Temperature_max (℃) | 37.94 (0.97) | 37.96 (1.02) | 37.93 (0.95) | 0.483 |
SpO2_min (%) | 88.59 (9.66) | 88.86 (9.81) | 88.49 (9.61) | 0.375 |
SpO2_max (%) | 99.65 (0.96) | 99.85 (0.47) | 99.58 (1.08) | < 0.001 |
PaO2/ FiO2 | 122.31 (57.02) | 134.02 (67.17) | 118.10 (52.29) | < 0.001 |
AnionGap_max | 16.88 (5.40) | 17.32 (5.44) | 16.72 (5.37) | 0.009 |
AnionGap_min | 13.04 (3.78) | 13.19 (3.60) | 12.99 (3.84) | 0.218 |
PaCO2_max(mmHg) | 54.56 (16.01) | 54.23 (16.26) | 54.68 (15.92) | 0.503 |
PaCO2_min(mmHg) | 35.48 (8.44) | 34.24 (8.37) | 35.93 (8.43) | < 0.001 |
PH_max | 7.41 (0.07) | 7.40 (0.08) | 7.41 (0.07) | 0.172 |
PH_min | 7.24 (0.12) | 7.22 (0.12) | 7.25 (0.12) | < 0.001 |
PaO2_max (mmHg) | 233.64 (125.93) | 246.51 (116.18) | 229.01 (128.97) | 0.001 |
PaO2_min(mmHg) | 69.09 (23.51) | 74.07 (28.16) | 67.30 (21.32) | < 0.001 |
Lactate_max (mmol/l) | 3.94 (3.59) | 4.13 (3.24) | 3.87 (3.71) | 0.081 |
Lactate_min (mmol/l) | 1.96 (1.53) | 2.02 (1.42) | 1.93 (1.56) | 0.157 |
Bicarbonate_max(mmol/l) | 23.77 (4.56) | 23.04 (4.46) | 24.03 (4.57) | < 0.001 |
Bicarbonate_min(mmol/l) | 20.42 (5.20) | 19.48 (5.16) | 20.76 (5.17) | < 0.001 |
Base excess_max(mmol/l) | 0.23 (4.58) | -0.45 (4.54) | 0.47 (4.57) | < 0.001 |
Base excess_min(mmol/l) | -5.92 (6.23) | -7.08 (6.35) | -5.51 (6.14) | < 0.001 |
INR_max | 1.67 (1.14) | 1.64 (1.12) | 1.69 (1.14) | 0.305 |
INR_min | 1.33 (0.43) | 1.30 (0.37) | 1.34 (0.45) | 0.024 |
PT_max (seconds) | 17.72 (10.39) | 17.28 (9.28) | 17.88 (10.76) | 0.179 |
PT_min (seconds) | 14.58 (4.08) | 14.22 (3.25) | 14.70 (4.34) | 0.006 |
PTT_max (seconds) | 47.77 (31.35) | 48.31 (32.02) | 47.58 (31.11) | 0.582 |
PTT_min (seconds) | 32.00 (9.80) | 31.70 (9.47) | 32.11 (9.92) | 0.32 |
WBC_max(K/ul) | 17.04 (12.55) | 16.77 (9.13) | 17.14 (13.57) | 0.482 |
WBC_min (K/ul) | 11.57 (8.92) | 11.23 (6.68) | 11.69 (9.60) | 0.228 |
Hemoglobin_max (g/dl) | 11.79 (2.22) | 11.90 (2.19) | 11.75 (2.22) | 0.116 |
Hemoglobin_min (g/dl) | 9.99 (2.27) | 10.00 (2.16) | 9.99 (2.30) | 0.967 |
Platelet_max (K/ul) | 226.56 (119.92) | 238.39 (124.23) | 222.31 (118.07) | 0.002 |
Platelet_min (K/ul) | 174.0 (110.09) | 180.59 (111.77) | 171.72 (109.41) | 0.059 |
Hematocrit_max(%) | 35.50 (6.56) | 35.79 (6.46) | 35.40 (6.59) | 0.16 |
Hematocrit_min(%) | 29.59 (6.75) | 29.52 (6.32) | 29.62 (6.90) | 0.753 |
RDW_max | 15.26 (2.13) | 15.13 (2.16) | 15.31 (2.12) | 0.047 |
RDW_min | 14.77 (1.98) | 14.66 (1.98) | 14.82 (1.99) | 0.059 |
Creatinine_max(mg/dl) | 1.69 (1.59) | 1.82 (1.87) | 1.65 (1.47) | 0.012 |
Creatinine_min(mg/dl) | 1.31 (1.21) | 1.36 (1.37) | 1.29 (1.14) | 0.17 |
Sodium_max (mmol/l) | 140.93 (5.38) | 141.01 (5.32) | 140.90 (5.40) | 0.651 |
Sodium_min (mmol/l) | 137.59 (5.25) | 137.37 (5.06) | 137.67 (5.31) | 0.182 |
Potassium_max (mmol/l) | 4.71 (1.00) | 4.76 (0.85) | 4.70 (1.05) | 0.163 |
Potassium_min (mmol/l) | 3.91 (0.60) | 3.87 (0.61) | 3.93 (0.60) | 0.033 |
Chloride_max (mmol/l) | 108.33 (6.43) | 108.99 (6.34) | 108.09 (6.44) | 0.001 |
Chloride_min (mmol/l) | 104.01 (6.44) | 104.41 (6.27) | 103.86 (6.50) | 0.048 |
Calcium_max (mg/dl) | 8.42 (1.32) | 8.43 (1.33) | 8.41 (1.32) | 0.756 |
Calcium_min (mg/dl) | 7.63 (0.93) | 7.52 (0.93) | 7.67 (0.93) | < 0.001 |
BUN_max(mg/dl) | 28.73 (21.43) | 29.39 (23.42) | 28.49 (20.66) | 0.322 |
BUN_min (mg/dl) | 23.60 (18.17) | 23.61 (18.91) | 23.60 (17.90) | 0.984 |
Glucose_max (mg/dl) | 183.7 (102.79) | 187.64 (111.02) | 182.32 (99.65) | 0.226 |
Glucose_min(mg/dl) | 121.80 (42.73) | 119.52 (39.53) | 122.62 (43.80) | 0.089 |
GCS | 11.28 (4.65) | 11.58 (4.50) | 11.17 (4.70) | 0.041 |
SAPS II | 45.77 (15.21) | 44.39 (15.33) | 46.27 (15.14) | 0.004 |
SOFA | 9.25 (4.09) | 9.15 (3.98) | 9.28 (4.13) | 0.457 |
in-hospital mortality(%) | 644 (22.8) | 137 ( 18.3) | 507 ( 24.4) | 0.001 |
28-day mortality (%) | 708 (25.1) | 154 ( 20.6) | 554 ( 26.7) | 0.001 |
Data are presented as %, mean ± SD or median (IQR) |
Sysbp systolic blood pressure, Diasbp diastolic blood pressure, Meanbp mean arterial pressure, SpO2 oxygen saturation, PaO2 arterial partial pressure of oxygen, PaCO2 arterial partial pressure of carbon dioxide, PTT partial thromboplastin time, PT prothrombin time, INR international normalized ratio, WBC white blood cell, RDW red cell distribution width, BUN blood urea nitrogen, GCS Glasgow coma scale, SAPS II Simplified Acute Physiology Score II, SOFA Sequential Organ Failure Assessment. p < 0.05 was considered statistically significant |
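As an illustration of how a categorical comparison in Table 1 can be checked from the reported counts, the following sketch applies a Pearson chi-square test to the in-hospital mortality row. The choice of test is an assumption made here for illustration; this section does not state which test produced the p values, and the helper name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows are groups and columns are outcomes:
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# In-hospital mortality counts taken from Table 1.
non_ards_died, non_ards_total = 137, 747
ards_died, ards_total = 507, 2078

stat, p = chi_square_2x2(
    non_ards_died, non_ards_total - non_ards_died,
    ards_died, ards_total - ards_died,
)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

The printed p value can be compared with the value reported in the table; small differences would be expected if a continuity correction or a different test was used.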
Feature selection
Fig. 2 displays the results of feature screening. Based on the weight attributed to each feature, we identified the 20 variables most strongly associated with the occurrence of ARDS: PaO2/FiO2; height; the maximum values of anion gap, PT, creatinine, hematocrit, SpO2, DiasBP, SysBP, heart rate and platelet; and the minimum values of respiratory rate, SpO2, temperature, PaO2, platelet, BUN, WBC, RDW and glucose.
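A minimal sketch of weight-based screening is shown below: features are ranked by weight and the top k are kept. How the weights were obtained is not described in this section, so the function name `top_k_features` and the example weights are hypothetical and are not values from the study.

```python
def top_k_features(weights, k=20):
    """Return the k feature names with the largest weights, highest first."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical weights for illustration only; not values from the study.
example_weights = {
    "pao2_fio2": 0.31,
    "height": 0.12,
    "aniongap_max": 0.09,
    "lactate_max": 0.02,
}

print(top_k_features(example_weights, k=3))
```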
Model performance comparisons
We developed eight machine learning (ML) models to assess the risk of ARDS following ICU admission. Their discriminative performance is shown as receiver operating characteristic (ROC) curves in Fig. 3. Among the eight models, the CatBoost model showed the highest predictive performance for ARDS, with an area under the ROC curve (AUC) of 0.817, followed by the LightGB and XGBoost models (AUC = 0.793 each). The RF (AUC = 0.732), DTC (AUC = 0.709) and KNN (AUC = 0.692) models all surpassed the LR model (AUC = 0.664) used as a reference, whereas the SVM model (AUC = 0.567) performed worst. Detailed performance metrics are given in Table 2. In the decision curve analysis (DCA) presented in Fig. 4, the CatBoost model showed a substantial net benefit over the alternative models across threshold probabilities. In addition, the calibration curve in Fig. 5 showed that the CatBoost model was well calibrated.
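For readers unfamiliar with decision curve analysis, the quantity plotted on a decision curve is the net benefit at a given threshold probability. The sketch below uses the standard definition (true positives per patient minus false positives per patient weighted by the odds of the threshold); the function names and toy data are illustrative assumptions, not the study's code or data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy labels and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.2, 0.4, 0.1, 0.75]

for t in (0.3, 0.5, 0.7):
    print(t, net_benefit(y_true, y_prob, t), net_benefit_treat_all(y_true, t))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none strategy).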
Table 2
Model performance metrics
Model | AUC | Accuracy | Precision | Recall | F1 score |
LR | 0.664 | 0.745 | 0.705 | 0.745 | 0.685 |
SVM | 0.567 | 0.651 | 0.656 | 0.650 | 0.653 |
KNN | 0.692 | 0.733 | 0.715 | 0.733 | 0.722 |
DTC | 0.709 | 0.728 | 0.681 | 0.728 | 0.628 |
RF | 0.732 | 0.777 | 0.758 | 0.777 | 0.760 |
XGB | 0.793 | 0.801 | 0.837 | 0.841 | 0.775 |
LightGB | 0.793 | 0.841 | 0.783 | 0.831 | 0.803 |
CatBoost | 0.817 | 0.797 | 0.806 | 0.797 | 0.801 |
LR logistic regression, SVM support vector machine, KNN K-nearest neighbor, DTC decision tree classifier, RF random forest, XGB extreme gradient boosting, LightGB light gradient boosting, CatBoost categorical boosting |
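In several rows of Table 2 (LR, KNN, DTC, RF and CatBoost) the recall equals the accuracy. This is what support-weighted averaging across classes produces, because weighted recall reduces algebraically to accuracy. The sketch below illustrates that identity on toy data. Whether the study used weighted averaging is an assumption, since the averaging scheme is not stated here, and the sketch does not account for the rows in which the two values differ.

```python
def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        support = sum(t == label for t in y_true)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / support  # support > 0 because label comes from y_true
        f1_c = 2 * prec_c * rec_c / (prec_c + rec_c) if prec_c + rec_c else 0.0
        weight = support / n
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = ARDS, 0 = non-ARDS).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0]

metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```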
Discussion
In this study, we compared eight widely used machine learning algorithms: LR, KNN, SVM, DTC, RF, XGBoost, LightGB, and CatBoost. We used the PaO2/FiO2 obtained after standardizing ventilator settings as the gold standard for the ARDS diagnostic model. The resulting AUC values were: LR 0.664, KNN 0.692, SVM 0.567, DTC 0.709, RF 0.732, XGBoost 0.793, LightGB 0.793, and CatBoost 0.817. The CatBoost algorithm showed the highest predictive performance among the evaluated models.
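The AUC values above can be read as the probability that a randomly chosen ARDS patient receives a higher predicted score than a randomly chosen non-ARDS patient. The following sketch computes the AUC directly from that pairwise definition; the function name `roc_auc` and the toy scores are illustrative assumptions rather than the study's implementation.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of patients; it is written this way for clarity, not speed.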
Owing to the intricate etiology and pathophysiology of the condition, diagnosing ARDS from the initial PaO2/FiO2 is challenging [24]. The PaO2/FiO2 is influenced by external factors, including conventional mechanical ventilation parameters such as FiO2 and PEEP, and therefore does not accurately reflect the true oxygenation status of the body. At high intrapulmonary shunt (QS/QT) and FiO2 levels above 0.4, the relationship between PaO2/FiO2 and FiO2 remains relatively constant. However, when QS/QT falls between 0.1 and 0.3, PaO2/FiO2 changes significantly as FiO2 is altered [25]. When PEEP is equal to or greater than 10 cm H2O, it can decrease intrapulmonary shunt and significantly increase PaO2, resulting in an increase in PaO2/FiO2 [26]. Because the oxygenation index is unstable, using it as one of the diagnostic criteria for ARDS can lead to misdiagnosis, as several studies have shown. In a multicenter randomized controlled clinical trial [27], 66% of patients initially meeting the ARDS criteria no longer met them after 24 hours under standard ventilation settings (FiO2 > 0.5, PEEP > 5 cm H2O). In two prospective observational studies by Hernu et al. [28] and Casser et al. [29], the Berlin criteria, applied to unnormalized baseline PaO2/FiO2 values, failed to identify subgroups with different degrees of lung injury and failed to classify patients into severity categories with significantly different mortality rates. Several autopsy reports [10, 30, 31] have investigated pathological findings and found no association between the Berlin criteria and diffuse alveolar damage (DAD) in over 50% of patients with moderate and severe ARDS. Interestingly, one of these studies [30] further reported a higher likelihood of DAD on autopsy among patients who met the clinical criteria for ARDS for a longer duration.
DAD was observed in 27% of patients who met the clinical criteria for ARDS for less than 72 hours, compared with 62% of patients who met them for more than 72 hours. These observations indicate that diagnosing ARDS from the oxygenation index at admission is unreliable, given its inherent instability. Patients categorized as having severe ARDS by the Berlin criteria may receive invasive and aggressive treatments that provide no benefit or could be harmful, because a significant proportion of patients transition to less severe forms of ARDS after 24 hours of ventilation. The ESICM guidelines for ARDS [32] also note that some minimum period of stabilization before diagnosing ARDS is likely appropriate, although the length of this period remains uncertain: a longer stable period increases specificity but impedes early therapeutic intervention.
Villar and coworkers [27] demonstrated in a prospective observational cohort (n = 282) that the PaO2/FiO2 obtained on standard ventilator settings (FiO2 > 0.5, PEEP > 5 cm H2O) 24 h after ARDS onset may allow better risk classification. The PaO2/FiO2 determined on these settings no later than 24 h after ARDS onset stratified patients into mild ARDS (PaO2/FiO2 > 200; n = 47, mortality 17%), moderate ARDS (PaO2/FiO2 100–200; n = 149, mortality 40.9%), and severe ARDS (PaO2/FiO2 < 100; n = 86, mortality 58.1%) (p = 0.00001). In another study by Villar [11], 478 patients with moderate to severe ARDS diagnosed by the Berlin definition had their PaO2/FiO2 assessed after 24 hours of standardized ventilator settings (PEEP > 10, FiO2 > 0.5). Hospital mortality differed significantly across PaO2/FiO2 categories under the standardized methodology, and this correlation improved significantly when patients met the PaO2/FiO2 criteria after 24 hours of sustained ARDS diagnosis. Taken together, these observations indicate that the lack of standardized methods for assessing oxygenation criteria can result in misdiagnosis of ARDS and misjudgment in treatment selection. Current research suggests that ARDS diagnosis and risk stratification can be optimized by using the PaO2/FiO2 after 24 hours of standardized ventilator application, which more accurately reflects patient prognosis.
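The three-level stratification described for the first Villar cohort can be written as a small function. This is a sketch under stated assumptions: the mild category is read as PaO2/FiO2 > 200, the boundary values 100 and 200 are assigned to the moderate category, and no upper limit is enforced because none is given in this passage. The function name `stratify_pf` is hypothetical.

```python
def stratify_pf(pf_ratio):
    """Severity category from PaO2/FiO2 (mmHg) measured on standardized settings.

    Assumed cut-offs: > 200 mild, 100-200 (inclusive) moderate, < 100 severe.
    """
    if pf_ratio <= 0:
        raise ValueError("PaO2/FiO2 must be positive")
    if pf_ratio > 200:
        return "mild"
    if pf_ratio >= 100:
        return "moderate"
    return "severe"


# Example: PaO2 of 75 mmHg on FiO2 0.5 gives a ratio of 150.
pao2_mmhg, fio2 = 75.0, 0.5
print(stratify_pf(pao2_mmhg / fio2))
```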
The advancement of artificial intelligence and the emergence of big data have enabled ML to achieve remarkable milestones in disease diagnosis [33, 34]. In contrast to traditional algorithmic models, machine learning processes large-scale and complex datasets efficiently, leading to higher accuracy and improved predictive ability for complex problems [35, 36]. In recent years, the application of machine learning to the clinical diagnosis of ARDS has increased notably. Abigail et al. [17] used the XGBoost algorithm to predict the onset of ARDS. The model achieved AUCs of 0.827, 0.810, and 0.790 at 12 hours, 24 hours, and 48 hours before the onset of ARDS, respectively, implying that an early warning for ARDS can be provided up to 48 hours in advance. Ding et al. [37] conducted a secondary analysis of a multicenter prospective observational cohort study and used a random forest model to predict the occurrence of ARDS, achieving an AUC of 0.82. Yang et al. [18] combined feature selection with machine learning algorithms to build a non-invasive physiological parameter model that estimates the P/F ratio for identifying ARDS. Among the models tested, XGBoost performed best, with an AUC of 0.9128, surpassing the traditional linear model based on non-invasive pulse oxygen saturation (SpO2/FiO2, S/F) (AUC = 0.7738). Incorporating machine learning techniques into clinical prediction models therefore holds promise for improving accuracy over traditional statistical approaches.
This study has several notable strengths. First, we used the PaO2/FiO2 measured after 24 hours of standardized ventilator settings as the diagnostic standard for ARDS, a novel approach in our field. This approach enhances the accuracy of our model, which surpasses other models based on conventional gold standard selection. Second, the incorporation of machine learning techniques improved predictive performance. Our analysis covered eight distinct machine learning algorithms, and the ensemble learning models XGBoost, LightGB, and CatBoost showed marked improvements in predictive capability over the traditional logistic model.
Our study also has limitations. First, it used a single-center retrospective design, and retrospective studies inherently carry a degree of bias that should be considered when interpreting the results. Second, the lack of multicenter participation and external validation limits the reliability of our model; future investigations should involve multiple centers and seek external validation. Third, although previous studies have established associations between biomarkers and the diagnosis of acute respiratory distress syndrome (ARDS), such data were not available in the MIMIC database used for our research. Including these variables in future diagnostic models may further improve model performance and accuracy.