Positive end-expiratory pressure selection by comprehensively considering clinical measurements and patient characteristics may improve ICU outcome in ARDS patients: an observational study

Background: How to set positive end-expiratory pressure (PEEP) for patients with acute respiratory distress syndrome (ARDS) remains controversial. This study aims to provide artificial intelligence (AI)-based suggestions to clinicians in selecting PEEP for ARDS patients receiving invasive mechanical ventilation. Methods: Invasively ventilated ARDS patients in the MIMIC-IV and eICU databases were enrolled in this observational cohort study. An AI model that suggests optimal PEEP, trained by awarding survival, was developed and tested on the MIMIC-IV database and externally validated on the eICU database. Three subgroups were defined in which the PEEP grades set by the AI model were lower than, equal to, and higher than those set by the clinicians (denoted as ΔPEEP-G < 0, ΔPEEP-G = 0, and ΔPEEP-G > 0, respectively). Intensive care unit (ICU) mortality and 28-day ventilation-free days were the primary and secondary outcomes, respectively. Results: A total of 6839 (MIMIC-IV) and 2117 (eICU) ARDS admissions were included in the study. ICU mortality was 10.8% and 8.6% in the ΔPEEP-G = 0 subgroup in the MIMIC-IV and eICU databases, respectively, and was higher in the other two subgroups (MIMIC-IV: 25.6% and 23.7%; eICU: 26.9% and 27.6%). An explainability analysis shows that the clinicians' PEEP setting relates more to oxygenation, respiratory mechanics, and ventilatory settings, while the reinforcement learning (RL) model also pays attention to more comprehensive parameters concerning patient characteristics, such as the Sequential Organ Failure Assessment (SOFA) score and age. Conclusions: AI-based PEEP selection tends to consider clinical measurements and patient characteristics comprehensively, and shows promise for improving ICU outcomes in ARDS patients.



Introduction
The mortality of acute respiratory distress syndrome (ARDS) remains high despite extensive research in the 50 years since its first description [1]. Positive end-expiratory pressure (PEEP) plays a crucial role in the management of ARDS, but it acts as a "double-edged sword". PEEP is applied to open collapsed alveoli and improve arterial oxygenation, to reduce cyclic opening and closing of alveoli and thus mitigate atelectrauma, and to promote more homogeneous ventilation, which reduces the stress at the margins between aerated and collapsed lung tissue [2]. However, excessive PEEP may contribute to the development of ventilator-induced lung injury (VILI) through alveolar overdistention [3].
PEEP titration is often used to select the optimal PEEP level. The approaches currently used for PEEP titration are guided by clinical measurements, including FiO2 [4], arterial oxygenation [5], electrical impedance tomography [6], respiratory compliance [7,8], and esophageal pressure [9][10][11]. Although most PEEP titration approaches can improve oxygenation and/or respiratory mechanics, several large-scale clinical trials have produced conflicting conclusions regarding their effect on clinical outcomes [4,9,10,12,13]. One of the major reasons for the controversy may be that ARDS is not a distinct disease but a heterogeneous syndrome [14].
Therefore, PEEP selection based solely on clinical measurements may be unable to provide the optimal clinical outcome.
Artificial intelligence may shed light on this question because of its ability to discover knowledge comprehensively from large amounts of multi-modal medical data [15]. Reinforcement learning (RL), a branch of artificial intelligence, is capable of processing temporally dynamic data to support decisions in the intensive care unit (ICU) [16].
The key feature of RL is that a computational model learns from human actions to achieve the best outcomes. It has been successfully applied to learn the optimal dosage of medication and volume therapy for sepsis patients [17] and to select the optimal parameters for mechanical ventilation [18]. In the current study, we aimed to develop an RL-based model, trained by awarding survival, that provides personalized suggestions to clinicians in selecting PEEP for ARDS patients, and to explore the factors that the RL model considers in PEEP selection.

Study Population
Due to the retrospective design of the study, ethical approval from the institutional review board was waived. The data in both databases have been deidentified for research purposes.

Model development
The RL model is a computational model that adjusts PEEP in response to the changing states (clinical variables) of the patients to achieve an optimal reward. It is based on the estimation of the Q function $\hat{Q}(s, a)$, which provides the expected return of taking an action $a$ (adjustment of PEEP) at each state $s$ (a set of clinical variables).
The Q-values are updated using the Bellman recursion $Q(s_t, a_t) = r_{t+1} + \gamma \max_{a' \in A} \hat{Q}(s_{t+1}, a')$, where $\gamma$ is the discount factor, which gives the relative weight of the current and future returns, $r_{t+1}$ is the instant reward at time $t+1$, and $A$ is the action space. To learn the optimal PEEP policies, we used a deep Q network (DQN) [23], which approximates $\hat{Q}(s, a)$ with deep neural networks by sampling and minimizing the squared error loss between the output and target Q values. The approximated optimal policy after N iterations is given as $\pi^*(s) = \arg\max_{a} \hat{Q}(s, a)$. The diagram of the DQN model is shown in eFigure 1.
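The Bellman target and the greedy policy described above can be illustrated with a minimal sketch. It is not the authors' implementation: a plain list of Q values stands in for the deep network, and the discount factor, the `terminal` flag, and all function names are assumptions introduced here for illustration.

```python
# Illustrative sketch only: a list of Q values stands in for the deep Q network.
GAMMA = 0.99  # assumed discount factor; the study does not report its value


def bellman_target(reward, next_q_values, gamma=GAMMA, terminal=False):
    """Target Q value: r_{t+1} + gamma * max_a' Q(s_{t+1}, a')."""
    if terminal:
        return reward  # assumption: no future return after the final state
    return reward + gamma * max(next_q_values)


def squared_error(q_value, target):
    """Squared error loss between the output and the target Q value."""
    return (q_value - target) ** 2


def greedy_action(q_values):
    """pi*(s) = argmax_a Q(s, a): index of the action with the largest Q value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q values for three candidate PEEP grades in the next state.
next_q = [0.2, 0.5, 0.1]
target = bellman_target(reward=0.0, next_q_values=next_q)
loss = squared_error(q_value=0.4, target=target)
best_grade = greedy_action(next_q)  # index 1 has the largest Q value
```

In a DQN, the loss would be averaged over a sampled batch of transitions and minimized by gradient descent on the network weights; the sketch only shows the quantities involved for a single transition.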
A set of 25 variables was extracted from the databases, including demographics, clinical scores, vital signs, laboratory values, and ventilator parameters (eTable 1).
They were collected over the whole ventilation period at 1-hour intervals. Variables with multiple measurements within a 1-hour interval were averaged. To address missing or irregularly sampled data, we used sample-and-hold to impute ventilator parameters and linear interpolation to impute laboratory values and vital signs. Missing atemporal variables were imputed with their respective dataset means.
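The preprocessing steps above (hourly averaging, sample-and-hold, and linear interpolation) can be sketched as follows. This is a minimal illustration under assumptions: the function names and the example values are hypothetical, `None` marks an hour without a measurement, and gaps before the first or after the last observation are left unfilled because the study does not state how they were handled.

```python
def hourly_mean(samples, n_hours):
    """Average all (hour, value) samples that fall into the same 1-hour bin."""
    buckets = [[] for _ in range(n_hours)]
    for hour, value in samples:
        buckets[hour].append(value)
    return [sum(b) / len(b) if b else None for b in buckets]


def sample_and_hold(values):
    """Carry the last observed value forward into missing hours."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def linear_interpolate(values):
    """Fill interior gaps linearly between the neighbouring observations."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled  # leading and trailing gaps stay None in this sketch


# Hypothetical ventilator parameter (PEEP, cmH2O): two readings in hour 0.
peep = sample_and_hold(hourly_mean([(0, 8.0), (0, 10.0), (3, 12.0)], n_hours=4))
# Hypothetical laboratory value measured at hours 0 and 3.
lab = linear_interpolate(hourly_mean([(0, 2.0), (3, 5.0)], n_hours=4))
```

Sample-and-hold suits ventilator settings because a setting stays in effect until it is changed, whereas laboratory values and vital signs drift between measurements, which linear interpolation approximates.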
The action space includes 11 grades of PEEP, with < 5 cmH2O as the first grade, ≥ 23 cmH2O as the last grade, and an interval of 2 cmH2O between consecutive grades (eTable 2). The grade is denoted as PEEP-G to distinguish it from the actual PEEP value. For the intermediate grades, the actual PEEP range for a given PEEP-G is between 5 + (PEEP-G - 1) × 2 and 5 + PEEP-G × 2 cmH2O.
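The grading scheme can be made concrete with a short sketch. It assumes that the grades are numbered 0 to 10 (so that grade 1 corresponds to roughly 5 to 7 cmH2O), that each intermediate grade spans the stated 2 cmH2O interval, and that the lower bound of each grade is inclusive; the exact boundary handling is defined in eTable 2, which is not reproduced here. The function names are hypothetical.

```python
def peep_to_grade(peep_cmh2o):
    """Map a PEEP value in cmH2O to a PEEP grade (PEEP-G), assumed 0 to 10."""
    if peep_cmh2o < 5:
        return 0   # first grade: < 5 cmH2O
    if peep_cmh2o >= 23:
        return 10  # last grade: >= 23 cmH2O
    # Intermediate grades: 2 cmH2O per grade, starting at 5 cmH2O.
    return int((peep_cmh2o - 5) // 2) + 1


def grade_to_range(grade):
    """Return (lower, upper) bounds in cmH2O; None marks an open end."""
    if grade == 0:
        return (None, 5)
    if grade == 10:
        return (23, None)
    lower = 5 + (grade - 1) * 2
    return (lower, lower + 2)


grades = [peep_to_grade(p) for p in (4, 5, 8, 22, 24)]  # [0, 1, 2, 9, 10]
```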

Clinical outcome
The primary outcome was the ICU mortality rate. The secondary outcome was the 28-day ventilation-free days.

Data analysis
Only the variables recorded in the first week of mechanical ventilation (MV) under assist/control ventilation mode were used in model development and validation. For each patient, the difference between the PEEP grades given by the RL model and the clinicians (ΔPEEP-G) was defined as $\Delta\text{PEEP-G} = \frac{1}{N_s} \sum_{s} \left( \text{PEEP-G}_{\text{RL}}(s) - \text{PEEP-G}_{\text{clinician}}(s) \right)$, where $s$ indicates the states and $N_s$ indicates the number of states for the investigated patient. In this study, each state corresponds to one hour.
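As a worked illustration of this per-patient difference, the sketch below assumes that ΔPEEP-G is the mean over a patient's hourly states of the RL grade minus the clinician grade, which matches the abstract's description of the subgroups (RL lower, equal, higher). The function names and the example grades are hypothetical.

```python
def delta_peep_g(rl_grades, clinician_grades):
    """Mean hourly difference in PEEP grade (RL model minus clinician)."""
    if len(rl_grades) != len(clinician_grades) or not rl_grades:
        raise ValueError("need two equally long, non-empty grade sequences")
    diffs = [rl - cl for rl, cl in zip(rl_grades, clinician_grades)]
    return sum(diffs) / len(diffs)


def subgroup(delta):
    """Assign a patient to one of the three subgroups by the sign of the difference."""
    if delta < 0:
        return "RL lower than clinician"
    if delta > 0:
        return "RL higher than clinician"
    return "RL equal to clinician"


# Hypothetical patient with three hourly states.
delta = delta_peep_g(rl_grades=[2, 2, 3], clinician_grades=[3, 3, 3])
label = subgroup(delta)  # the RL model recommends lower PEEP on average
```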
Three subgroups were defined according to ΔPEEP-G: ΔPEEP-G < 0, ΔPEEP-G = 0, and ΔPEEP-G > 0. ΔPEEP-G was further divided into ranges with an interval of 1.0 PEEP grade (corresponding to ~2 cmH2O) between neighboring ranges to assess the dependence of ICU mortality, 28-day ventilation-free days, and clinical variables on ΔPEEP-G. Differences among the ranges were assessed using one-way ANOVA. An XGBoost classifier [24] was used to evaluate the feature importance in recommending high PEEP (≥ 10 cmH2O) and low PEEP (< 10 cmH2O) for the clinicians and the RL model, respectively, to explain their actions. The reinforcement learning model, the XGBoost classifier, and all the statistical analyses were implemented using Python 3.8 and the Scikit-learn package.

Optimal PEEP recommended by RL model
In the MIMIC-IV database, the clinicians gave higher PEEP than the RL model in 52.0% of the patients, and this percentage is higher for the more severe patients. For the patients who were recommended higher PEEP by the RL model, the PEEP grades recommended by the RL model and set by the clinicians are 2.4 ± 0.7 and 1.5 ± 0.6, respectively (p < 0.001). These patients are significantly younger than those in the other two subgroups (Table 2).
For the patients who were given equal PEEP by the RL model and the clinicians, the PEEP grades are close to 1 (corresponding to 5 ~ 7 cmH2O). The majority of these patients have indirect ARDS risk factors (direct: 42.5% vs. indirect: 57.4%).
The ICU mortality was higher and the 28-day ventilation-free days were lower when ΔPEEP-G did not equal zero (Table 3). The increase in ICU mortality with the decrease of ΔPEEP-G was more evident in moderate and severe ARDS and was strongly associated with worsening oxygenation and with increased FiO2, body mass index (BMI), mechanical power (MP), and MP normalized to predicted body weight (norMP), but only weakly with tidal volume, driving pressure (DP), and respiratory compliance (Table 3). When ΔPEEP-G equals zero, the patients have the lowest ICU mortality and the longest 28-day ventilation-free days (Table 2).
The XGBoost model, which classifies the high (≥ 10 cmH2O) and low (< 10 cmH2O) PEEP recommendations, suggests that FiO2 and RRset were the most important features for the recommendations of both the clinicians and the RL model (Figure 2).
While the clinicians also considered some clinical observations, such as SpO2, PaCO2, BMI, DP, and P/F ratio, the RL model gave SOFA and demographics (age, BMI) a high priority.
Similar findings are observed in the external validation on the eICU database. An increasing deviation of ΔPEEP-G from zero is also associated with increased ICU mortality and decreased 28-day ventilation-free days (eTable 4 and eFigure 2).

Discussion
The main findings of this study can be summarized as follows. First, the ICU mortality was higher when the clinicians' PEEP setting deviated from that of the RL model.
Second, the performance of the RL model was associated with the severity of ARDS.
For severe ARDS in MIMIC-IV, the percentages of patients for whom the RL model recommended lower, equal, and higher PEEP were 75.9%, 2.4%, and 21.6%, respectively, whereas the percentages were 28.8%, 45.8%, and 25.4%, respectively, for mild ARDS. Third, for the ΔPEEP-G < 0 subgroup, the decrease in ΔPEEP-G was associated with an increase in the mortality rate and a decrease in 28-day ventilation-free days, while the tidal volume, driving pressure, and respiratory compliance did not differ significantly. Fourth, although the PEEP setting is related to FiO2, RRset, and BMI for both the clinicians and the RL model, the PEEP setting by the clinicians is more related to clinical measurements such as P/F, DP, SpO2, and PaCO2, while the RL model focuses on the SOFA score and age.
The results in Table 3 and eTable 4 indicate that for the patients with higher ICU mortality, the PEEP settings made by the RL model, which was trained by awarding survival, were likely to deviate more strongly from those of the clinicians.
We suggest that in these patients, setting higher PEEP did not further improve respiratory mechanics but further increased Pplat, MP, and norMP (Table 3), resulting in a stronger degree of lung injury [26] and thus poorer outcomes. By contrast, the ΔPEEP-G > 0 subgroup is significantly younger and has a better SOFA score and a better P/F than the ΔPEEP-G < 0 subgroup. The percentages of these patients in the mild, moderate, and severe groups are also close.
A growing body of evidence shows the importance of the biological and clinical heterogeneity of ARDS for the selection of ventilatory strategy [14]. The feature importance analysis (Figure 2(a)) shows that the PEEP setting by the clinicians is more related to well-established targets such as oxygenation [5], respiratory mechanics [7,8], and BMI [29]. By contrast, in addition to FiO2, the RL model was more inclined to consider the SOFA score and age, which may convey more comprehensive information regarding patient characteristics and which exhibit a high rank of feature importance. The driving pressure of these patients was not found to be strongly associated with the PEEP selection decision of the RL model, probably because it was well controlled in the investigated cohorts.
The study has several limitations. First, the RL model's recommendations did not result in real clinical actions because of the retrospective observational design of the study. It is therefore an "off-line" reinforcement learning process. Even so, the results can be interpreted to mean that the outcome was worse when the clinicians' decisions deviated from those of the RL model. Second, the chest X-ray imaging required for defining ARDS according to the Berlin definition is missing. In addition, the Berlin definition itself has been criticized in some cases [31]. We included the patients who have the common risk factors for ARDS [22] to make the cohort as suitable as possible for the aim of this study. In the future, stricter recruitment of subjects should be conducted to validate the effectiveness of the RL model. Third, it is unknown what approaches were adopted to titrate PEEP in the investigated cohort, or whether a recruitment maneuver was performed, which prevents an in-depth analysis of the association between ΔPEEP-G and the titration approaches.

Conclusion
In conclusion, the RL model trained by awarding survival succeeds in recommending personalized PEEP for ARDS patients and shows promise for improving clinical outcomes.
Further prospective studies are needed to confirm whether the RL-based PEEP setting strategy can assist the clinical practice.

Ethics approval and consent to participate
Due to the retrospective design of the study, ethical approval from the institutional review board was waived.

Consent for publication
Not applicable.

Availability of data and materials
The MIMIC-IV and eICU databases can be downloaded from https://physionet.org/content/mimiciv and https://eicu-crd.mit.edu/, respectively. The source code of the reinforcement learning model can be obtained from the authors upon reasonable request.

Competing interests
The authors declare no conflicts of interest.

Funding
This study was supported by the National Natural Science Foundation of China