This study was conducted in two steps. First, we prospectively collected patient data (step 1); second, we retrospectively assessed how well experienced physicians (more than 10 years’ experience) and a computational data-driven approach predicted microbial etiology in this dataset (step 2).
Step 1: patient data collection
Prospective data collection was conducted in a single center over an 18-month period. The study complied with French law for observational studies and was approved by the ethics committee of the French Intensive Care Society (CE SRLF 13-28) and by the Commission Nationale de l’Informatique et des Libertés (CNIL) for the treatment of personal health data. We gave written and oral information to patients or next-of-kin. Patients or next-of-kin gave verbal informed consent, as approved by the ethics committee. Eligible patients were adults hospitalized in the ICU for CAP. Pneumonia was defined as the presence of an infiltrate on a chest radiograph and one or more of the following symptoms: fever (temperature ≥ 38.0°C) or hypothermia (temperature < 35.0°C), cough with or without sputum production, or dyspnea or altered breath sounds on auscultation. Community-acquired infection was defined as infection occurring within 48 hours of admission. Cases of pneumonia due to inhalation or infection with pneumocystis, pregnant women and patients under guardianship were not included. Cases with PaO2 ≥ 60 mmHg in ambient air or with the need for oxygen therapy ≤ 4 L/min or without mechanical ventilation (invasive or non-invasive) were not included.
Baseline patient information was collected at case presentation through in-person semi-structured interviews with patients or surrogates (see Supplementary Table 1). Observations from the physical examination at presentation, including vital signs and auscultation of the lungs, were recorded. Results of biological tests performed at presentation (within the first three-hour period), namely hematology and chemistry tests, were also recorded, as were findings from chest radiography. Two physicians interpreted the chest x-rays; a third physician reviewed the images in cases of disagreement.
Microbiological investigations included blood cultures, pneumococcal and legionella urinary antigen tests, bacterial cultures and multiplex PCR RespiFinder SMART 22® (PathoFinder B.V., Oxfordlaan, Netherlands) analyses on respiratory fluids (sputum and/or nasal wash and/or endotracheal aspirate and/or bronchoalveolar lavage [BAL]).
Step 2: clinician and data-driven predictions of microbial etiology
Clinicians and a mathematical algorithm were tasked with predicting the microbial etiology of pneumonia cases based on all clinical (43 items) and biological or radiological (17 items) information available in the first 3-hour period after admission, excluding any microbiological findings (Supplementary Table 1). For this proof-of-concept investigation, we decided to study only CAP caused by a single identified pathogen; cases of CAP with mixed etiology or without microbiological documentation were excluded. From the initial dataset of patients, we randomly generated two groups (prior to any analysis): (i) a work dataset (80% of the initial dataset) dedicated to constructing the mathematical model and training the experts; (ii) an external validation dataset (20% of the initial dataset) dedicated to testing prediction performance. The methodology used is summarized in Figure 1A.
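The 80%/20% random partition described above can be illustrated with a minimal Python sketch. This is not the procedure actually used in the study: the function name `split_work_validation`, the fixed seed, and the 100 hypothetical case identifiers are assumptions made only for illustration.

```python
import random

def split_work_validation(case_ids, validation_fraction=0.2, seed=0):
    """Randomly partition case identifiers into a work set and a validation set."""
    ids = list(case_ids)
    # A fixed seed (an assumption, not from the study) makes the split reproducible.
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_validation = round(len(ids) * validation_fraction)
    validation = sorted(ids[:n_validation])
    work = sorted(ids[n_validation:])
    return work, validation

# Hypothetical example with 100 case identifiers (not the study's sample size).
work, validation = split_work_validation(range(1, 101))
print(len(work), len(validation))
```

Performing the split before any analysis, as stated above, keeps the validation cases out of both model construction and expert training.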
Clinician predictions. An external three-member expert panel first reviewed the work dataset to familiarize themselves with the patient characteristics it contained. The experts were then asked to predict the microbial etiologies in the external validation dataset (Fig 1A). The clinicians had to answer the question: is it a viral or a bacterial pneumonia? They were also asked to give a confidence index regarding the accuracy of their answer: 1 (very low), 2 (low), 3 (moderate), 4 (high). Agreement of at least two of the three experts was required for the final predicted etiology.
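The two-of-three agreement rule can be sketched as follows. The function name `panel_prediction` and the label strings are hypothetical; the sketch only assumes that each expert gives exactly one of the two answers, in which case a majority always exists.

```python
from collections import Counter

ALLOWED = {"viral", "bacterial"}

def panel_prediction(votes):
    """Return the etiology chosen by at least two of the three experts."""
    if len(votes) != 3:
        raise ValueError("expected exactly three expert votes")
    if not set(votes) <= ALLOWED:
        raise ValueError("each vote must be 'viral' or 'bacterial'")
    # With two possible labels and three votes, the top label has count >= 2.
    label, count = Counter(votes).most_common(1)[0]
    return label

print(panel_prediction(["viral", "bacterial", "viral"]))
```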
Data-driven approach predictions. The data were analyzed using an AI method (Figure 1B) involving a logistic regression analysis using forward stepwise inclusion. This method was employed to optimize the ability of the algorithm to distinguish viral from bacterial pneumonia based on the combination of parameters available in the work dataset. All available data were thus included in the model, regardless of the data type. Qualitative data were processed as binary information (e.g. influenza immunization: present “1”, absent “0”). Quantitative values were provided as raw data (no cut-offs defined). We built the predictive mathematical model from the work dataset using the Random Forest method and Leave-One-Out Cross-Validation. We started by determining the most relevant items to use through a variable selection procedure using the Random Forest method and the Mean Decrease in Gini criterion (value 0.75). Then, the population in the work dataset was randomly separated into two independent datasets: 80% of cases were assigned to the training set and 20% were assigned to the test set. N models with bootstrap resampling (with N = 25) were fitted on the training set and validated on the test set. The model providing the best prediction criteria was selected, and the final model was built from the entire work dataset. Finally, the independent validation dataset was used to test the pathogen prediction performance of the AI algorithm. To decipher the relative importance of clinical versus biological/radiological variables in the predictions, we generated three algorithms built from different parameters of the work dataset: (i) clinical variables only, (ii) biological and radiological variables only, and (iii) all variables. For each parameter tested, the area under the ROC curve (AUC) was calculated, and the cutoff value that yielded the highest accuracy was determined along with the corresponding sensitivity and specificity.
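The last step above (AUC and the accuracy-maximizing cutoff) can be illustrated with a short Python sketch. It does not reproduce the Random Forest or variable selection pipeline; it only assumes that a single parameter yields a numeric score per case, that labels are coded 1 for one etiology and 0 for the other, and that a case is called positive when its score is at or above the cutoff. The function names and the six example values are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison
    of positive and negative cases (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_cutoff(scores, labels):
    """Return (cutoff, accuracy, sensitivity, specificity) for the cutoff
    with the highest accuracy; ties keep the lowest such cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        # Predicted positive when score >= c (an assumed convention).
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        acc = (tp + tn) / len(labels)
        if best is None or acc > best[1]:
            best = (c, acc, tp / n_pos, tn / n_neg)
    return best

# Hypothetical scores and labels, for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [0, 0, 1, 1, 1, 0]
print(auc(scores, labels))
print(best_cutoff(scores, labels))
```

Choosing the cutoff by accuracy alone treats both kinds of misclassification as equally costly, which is why sensitivity and specificity are reported alongside it.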
Statistical analysis
We compared the concordance between the predictions and the final microbial etiologies for the experts and for the algorithm and calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and likelihood ratios (LRs) for the predictions [12]. Given the importance of this diagnostic prediction in the patient's therapeutic management, we determined that the discriminant properties should be "high" (LR+ > 10 and/or LR- < 0.1) for the prediction to be considered useful for clinical practice [13, 14]. Table 1 summarizes the LR cutoff values defining the discriminant properties of the predictions [13]. Quantitative data are reported as the median value and interquartile range (IQR). Statistical analyses were done with JMP software (SAS, version 7.2).
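For reference, the performance measures listed above follow from a 2×2 table of predicted versus final etiology. The Python sketch below uses the standard definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not results from this study. The "high" flag applies the LR+ > 10 and/or LR- < 0.1 criterion stated above.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from a 2x2 table of counts."""
    sens = tp / (tp + fn)            # sensitivity
    spec = tn / (tn + fp)            # specificity
    ppv = tp / (tp + fp)             # positive predictive value
    npv = tn / (tn + fn)             # negative predictive value
    false_pos_rate = fp / (fp + tn)  # equals 1 - specificity
    false_neg_rate = fn / (fn + tp)  # equals 1 - sensitivity
    # Likelihood ratios; infinite when the denominator is zero.
    lr_pos = sens / false_pos_rate if false_pos_rate > 0 else float("inf")
    lr_neg = false_neg_rate / spec if spec > 0 else float("inf")
    return {
        "sensitivity": sens, "specificity": spec,
        "ppv": ppv, "npv": npv,
        "lr_pos": lr_pos, "lr_neg": lr_neg,
        # "High" discriminant properties: LR+ > 10 and/or LR- < 0.1.
        "high": lr_pos > 10 or lr_neg < 0.1,
    }

# Hypothetical counts, for illustration only.
m = diagnostic_metrics(tp=18, fp=1, fn=2, tn=19)
print(m)
```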