Study setting
Seven ICUs participating in the Indian Registry of IntenSive care (IRIS), located in six private institutions and one not-for-profit institution, contributed data to this study. Of these, five were general (mixed medical-surgical) ICUs and two were medical ICUs. IRIS, a cloud-based registry of critical care units, was established in January 2019.9 Details of the implementation and preliminary results of the case-mix program have been previously published.9
Patients
All patients reported to the registry between January 2019 and May 2019 were considered. Patients > 18 years of age with an ICU length of stay > 6 hours were included in the study. Patients with missing outcomes and those not meeting the inclusion criteria were excluded.
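The selection rule above can be illustrated with a minimal sketch. The field names (age_years, icu_los_hours, icu_outcome) are hypothetical and are not taken from the IRIS data dictionary; treating a missing age or length of stay as a failure to meet the inclusion criteria is likewise an assumption made for illustration.

```python
def is_eligible(patient):
    """Apply the stated inclusion and exclusion criteria to one record.

    `patient` is a dict with hypothetical keys; None marks a missing value.
    """
    age = patient.get("age_years")
    los_hours = patient.get("icu_los_hours")
    outcome = patient.get("icu_outcome")

    # Exclusion: missing outcome. Assumption: a record whose age or
    # length of stay is missing cannot be shown to meet the criteria.
    if age is None or los_hours is None or outcome is None:
        return False

    # Inclusion: strictly older than 18 years, ICU stay strictly over 6 hours.
    return age > 18 and los_hours > 6


def select_cohort(patients):
    """Return only the records that satisfy is_eligible."""
    return [p for p in patients if is_eligible(p)]
```

Both thresholds are strict inequalities, as in the text, so a patient aged exactly 18 years or with a stay of exactly 6 hours is not included.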
Data collection
This retrospective study used data collected as part of the IRIS dataset. Age, gender, pre-existing co-morbidity, diagnostic category, type of admission (planned, unplanned, medical or surgical), physiological vital signs and laboratory measurements were collected as per the definitions described for e-TropICS (Table 1) for all consecutive admissions. ICU outcomes rather than hospital outcomes were collected due to well-described logistical challenges in such settings.5,8 Data were collected daily by either nursing staff or data collectors appointed to the registry network, all of whom had been trained in the process of data acquisition. Staff from the central coordinating centre made daily telephone reminders encouraging data input and checked the consistency of the number of admissions, discharges and outcomes from each ICU. In-built measures in the data entry portal, such as mandatory fields, range validations, and drop-down menus and checkboxes in place of free-text entries, were employed to promote fidelity of data recording.
Ethics:
The study was approved by the Institutional Ethics Committee centrally at the study coordinating centre (AMH-021/07-19). The informed consent model used in the registry has been described and published previously.9
Statistical analysis
Availability of physiological and laboratory measurements was described using descriptive statistics. e-TropICS was calculated as per the authors’ original methods.8 The area under the receiver operating characteristic curve (AUROC) was used to express each model's power to discriminate between survivors and non-survivors. For all tests of significance, a 2-sided P less than or equal to 0.05 was considered significant. AUROC values were considered poor when less than or equal to 0.70, adequate from 0.71 to 0.80, good from 0.81 to 0.90, and excellent at 0.91 or higher.10 Calibration for each model was assessed using the Hosmer-Lemeshow C-statistic. Overall accuracy of model predictions was assessed using the Brier score. Means of normally distributed continuous variables were compared using Student's t-test. The chi-square test was employed to compare categorical variables and to compare AUROC values. All analyses were performed using Stata software (version 13.1).11
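As an illustration of the discrimination and accuracy measures described above, the following is a minimal sketch in Python rather than the Stata routines used in the study. The function names are hypothetical. The AUROC is computed by pairwise comparison (the Mann-Whitney formulation), which is assumed here for simplicity, and the AUROC is rounded to two decimal places before the interpretive bands are applied, since the bands in the text are stated to two decimals.

```python
def auroc(died, prob):
    """AUROC as the share of (non-survivor, survivor) pairs in which the
    non-survivor has the higher predicted mortality; ties count as 0.5."""
    pos = [p for d, p in zip(died, prob) if d == 1]  # non-survivors
    neg = [p for d, p in zip(died, prob) if d == 0]  # survivors
    if not pos or not neg:
        raise ValueError("both survivors and non-survivors are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(died, prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - d) ** 2 for d, p in zip(died, prob)) / len(died)


def discrimination_band(auc):
    """Map an AUROC to the interpretive bands stated in the text."""
    auc = round(auc, 2)  # assumption: bands apply to two-decimal values
    if auc <= 0.70:
        return "poor"
    if auc <= 0.80:
        return "adequate"
    if auc <= 0.90:
        return "good"
    return "excellent"
```

For example, with outcomes [0, 0, 1, 1] and predicted probabilities [0.1, 0.4, 0.35, 0.8], three of the four non-survivor/survivor pairs are correctly ordered, giving an AUROC of 0.75, which falls in the "adequate" band. A lower Brier score indicates better overall accuracy, with 0 representing perfect prediction.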
Handling of missing data and analysis:
When faced with high proportions of missing data, one approach is to assume that a variable lies within the normal range when it has not been measured or is unavailable, resulting in a score of “0” in weighted scoring systems. Such an approach may not be justified in LMICs, where measurements may be unavailable due to lack of resources or to differing approaches to decision-making in critical illness. Assuming normal values in this manner can adversely impact model performance by underestimating severity scores. In this study, multiple imputation (MI) with chained equations was employed to handle missing data. It was assumed that the missingness of a variable depends on some of the other observed variables, i.e. that data were Missing At Random (MAR). MI was performed using Predictive Mean Matching (PMM) with the “mi impute chained” command in Stata (version 13.1, Stata Corp, College Station, TX, USA). The number of imputations (M) was set at 20 and the number of “k-nearest neighbours” (the knn(#) option in Stata syntax) was set at 10. MI generates several values for each missing observation, reflecting the uncertainty in the estimation of the imputed value. Both continuous and categorical variables were imputed, as PMM draws its imputed values from data that have already been observed within the variable. This ensures that categorical variables, which can only take specific values, are not assigned values that are not allowed for the variable. The scores (and their mortality probabilities) were then calculated individually for each of the 20 imputed datasets. The mean of the 20 probabilities was then calculated and used as the MI mortality prediction.
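The two ideas in this paragraph, donor-based imputation and averaging of predictions across imputed datasets, can be sketched as follows. This is a deliberately simplified Python illustration and not Stata's “mi impute chained” algorithm: it imputes a single variable from a single predictor with ordinary least squares, it does not redraw the regression parameters for each imputation, and it does not cycle through chained equations. All function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return mean_y - b * mean_x, b


def pmm_impute(x, y, k=10, rng=random):
    """Fill None entries of y using simplified predictive mean matching.

    For each missing y, the k observed cases whose predicted means are
    closest to the missing case's predicted mean form the donor pool, and
    one donor's observed value is drawn at random. Imputed values are
    therefore always values already observed for the variable.
    """
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b = fit_line([o[0] for o in observed], [o[1] for o in observed])
    filled = []
    for xi, yi in zip(x, y):
        if yi is not None:
            filled.append(yi)
            continue
        target = a + b * xi
        donors = sorted(observed, key=lambda o: abs((a + b * o[0]) - target))[:k]
        filled.append(rng.choice(donors)[1])
    return filled


def pooled_probability(per_imputation_probs):
    """Average each patient's predicted mortality over M imputed datasets.

    `per_imputation_probs` holds M lists, one per imputed dataset, each
    with one probability per patient in the same patient order.
    """
    m = len(per_imputation_probs)
    return [sum(patient_probs) / m for patient_probs in zip(*per_imputation_probs)]
```

In the study's setting, M would be 20 and k would be 10; the score and its mortality probability would be computed on each imputed dataset and the resulting 20 lists passed to a pooling step such as pooled_probability.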