Ethics and data extraction
Ethical approval for this study was granted after the author Yu-wen Chen completed the National Institutes of Health (NIH) web-based training course “Protecting Human Research Participants” (Certification Number: 28341490). Data used for the prediction of mortality risk were extracted from the Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) database. All data in the database had been treated with data masking to protect patients’ privacy, so written informed consent was not required for the current study. The database contained 61532 records of ICU stays in Beth Israel Deaconess Medical Center ICUs, including clinical notes, physiological waveforms, laboratory measurements, and nurse-verified numerical data (18). The exclusion criteria were as follows: any hospital admission with multiple ICU stays or transfers between different ICUs or wards, which reduced the ambiguity of outcomes associated with hospital admissions rather than ICU stays; patients younger than 16 years; patients whose initial ICU stay was missing or shorter than 48 hours; and ICU stays with no recorded events in the initial 48 hours. A total of 18094 patients were included in the final analysis. As shown in Fig. 1, to avoid overfitting, we split the dataset into a training set (15331 patients, 17917 ICU stays) and a testing set (2763 patients, 3222 ICU stays). Five-fold cross-validation was performed on the training set to determine the model parameters. The best parameters found by cross-validation were then used to score the model on the testing set.
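The split described above can be illustrated with a minimal Python sketch. It assumes the split is made at the patient level and uses a hypothetical list of 100 patient identifiers with an illustrative 15% test fraction; the function names, the seed and the fold assignment are assumptions for illustration, not the procedure actually used in the study.

```python
import random

def train_test_split_ids(ids, test_fraction, seed=0):
    """Shuffle the ids and hold out a fraction of them as the testing set."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_fold_indices(ids, k=5):
    """Yield (train_ids, val_ids); each id is used for validation exactly once."""
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val

# Hypothetical patient identifiers (assumption, not study data).
patient_ids = list(range(100))
train_ids, test_ids = train_test_split_ids(patient_ids, test_fraction=0.15)

# Five-fold cross-validation is run on the training set only.
folds = list(k_fold_indices(train_ids, k=5))
```

Keeping the testing set out of the cross-validation loop means that the final score is computed on patients that played no part in choosing the model parameters.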
We used 17 physiologic variables (shown in Tab. 1), a subset of those in the PhysioNet/CinC Challenge 2012 (12). For each patient, up to 17 variables were recorded at least once during the first 48 hours after admission; not all variables were available in all cases. We used all raw values of the time series measurements included in the score. For the Glasgow Coma Score (GCS), we included GCS (verbal response), GCS (motor response), GCS (eye opening) and GCS (total) as separate features. The remaining variables were weight, height, temperature, respiratory rate (RR), heart rate (HR), diastolic blood pressure (DBP), mean blood pressure (MBP), systolic blood pressure (SBP), fraction of inspired oxygen (FiO2), oxygen saturation (OS), pH, glucose, and capillary refill rate (CRR). Twelve of the variables were continuous and five were discrete. When a value was more than three standard deviations away from the corresponding individual mean value, it was removed. All time series variables were re-sampled to an hourly rate starting from ICU admission. When a continuous variable was missing at a time point, we filled it with the nearest-neighbor value. When a variable had no recorded data during the observation period, we assumed that the nurse had not measured it and that it was normal, so we filled it with the normal value of that variable. For discrete variables, we performed one-hot encoding. For continuous variables, we performed Z-score normalization to scale the feature values. Each patient’s record was summarized into a 59×48 data matrix covering the 48-hour observation period, which served as the input for deep learning.
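As a rough illustration of the preprocessing of one continuous variable, the following Python sketch chains outlier removal, hourly re-sampling, nearest-neighbor filling and Z-score normalization. Several details are assumptions that the text does not specify: the last reading in each hour is kept, ties between equally distant neighbors go to the earlier hour, and the readings, the normal value (86.0) and the normalization constants are hypothetical.

```python
from statistics import mean, pstdev

def remove_outliers(measurements, n_sd=3.0):
    """Drop (time, value) pairs whose value is more than n_sd SDs from the mean."""
    values = [v for _, v in measurements]
    if len(values) < 2:
        return list(measurements)
    mu, sd = mean(values), pstdev(values)
    return [(t, v) for t, v in measurements if abs(v - mu) <= n_sd * sd]

def resample_hourly(measurements, hours=48):
    """Place readings into hourly bins; the last reading in each hour is kept."""
    bins = [None] * hours
    for t, v in sorted(measurements):
        if 0 <= t < hours:
            bins[int(t)] = v
    return bins

def fill_missing(bins, normal_value):
    """Fill empty hours from the nearest observed hour, or with the normal value."""
    observed = [i for i, v in enumerate(bins) if v is not None]
    if not observed:
        return [normal_value] * len(bins)
    filled = []
    for i, v in enumerate(bins):
        if v is None:
            nearest = min(observed, key=lambda j: (abs(j - i), j))
            v = bins[nearest]
        filled.append(v)
    return filled

def z_score(values, mu, sd):
    """Scale values with a mean and SD estimated elsewhere (e.g. on the training set)."""
    return [(v - mu) / sd for v in values]

# Hypothetical heart-rate readings as (hours since ICU admission, value).
readings = [(0.5, 80.0), (1.2, 82.0), (5.7, 90.0)]
bins = resample_hourly(remove_outliers(readings), hours=8)
filled = fill_missing(bins, normal_value=86.0)
scaled = z_score(filled, mu=85.0, sd=5.0)
```

An 8-hour window is used here only to keep the example short; the study used a 48-hour window.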
Model construction for Attention-based TCN
In this work, we developed an attention-based TCN model to predict the mortality risk of ICU patients from time series and static data. The TCN is a convolutional network composed of causal convolutions, dilated convolutions, and residual connections. The causal convolution makes the TCN a strictly temporal model: the output at time t depends only on data from time t and earlier in the previous layer, so no future information is used when the model is trained. Dilated convolution lets the convolution sample its input at intervals, which broadens the receptive field so that more of the input history is used. The residual connections allow the network to pass information across layers and are commonly used to train deep networks. In addition, the TCN applies dropout after each dilated convolution in the residual module for regularization. An attention mechanism was introduced into the TCN model to improve its efficiency and interpretability.
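The causal and dilated convolutions can be made concrete with a small single-channel Python sketch. It is a simplification under stated assumptions and not the model used in this study: there is one convolution per block, no dropout or weight normalization, positions before the start of the sequence are treated as zeros, and the input values and weights are hypothetical.

```python
def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the sequence count as zeros (left padding),
    so y[t] never depends on any later input x[t + 1:].
    """
    y = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                total += w * x[idx]
        y.append(total)
    return y

def residual_block(x, weights, dilation):
    """Simplified residual connection: block input plus ReLU(conv(input))."""
    conv = causal_dilated_conv(x, weights, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]

# Hypothetical input sequence and kernel of size 3.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 1.0, 1.0]
y = causal_dilated_conv(x, w, dilation=2)  # [1.0, 2.0, 4.0, 6.0, 9.0]
out = residual_block(x, w, dilation=2)     # [2.0, 4.0, 7.0, 10.0, 14.0]
```

Changing the last element of x leaves every earlier output unchanged, which is the property that makes the convolution causal.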
The structure of the attention-based TCN model is shown in Fig. 2. Patients’ raw data were preprocessed into the data flow used as model input, and the TCN (Temporal Convolutional Network) (17) was applied directly to the ICU patients' temporal data. The network followed the basic structure described in the literature (17) without further structural optimization. Since the kernel size was 3 and the number of attributes per patient was 59, the number of stacked temporal convolutional layers was set to 7. With 7 layers, the receptive field of the network exactly covered all of the patients' input data. The patients' vital sign data were processed by the 7-level TCN and then passed to the attention layer; finally, the mortality risk was predicted by a linear layer. The implementation parameters of the TCN were batch_size=32, dropout=0.2, kernel size=3, levels of TCN=7, initial learning rate=0.02, number of hidden units per layer=59, and optimization algorithm=Adam. The loss function was binary cross-entropy:
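$$\mathcal{L}(\mathrm{pred}, \mathrm{label}) = -\left[\mathrm{label}\,\log(\mathrm{pred}) + (1 - \mathrm{label})\,\log(1 - \mathrm{pred})\right]$$
where
pred: prediction tensor with arbitrary shape
label: target tensor with values in range [0, 1]. Must have the same size as pred.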
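A minimal Python sketch of the binary cross-entropy loss on flat lists is given below. It assumes that pred already holds predicted probabilities (after a sigmoid) and that the loss is averaged over all elements; the clipping constant eps is an assumption added only to keep the logarithm finite.

```python
import math

def binary_cross_entropy(pred, label, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and targets."""
    if len(pred) != len(label):
        raise ValueError("pred and label must have the same size")
    total = 0.0
    for p, y in zip(pred, label):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(pred)

# Hypothetical predictions for two patients (one death, one survivor).
loss_uncertain = binary_cross_entropy([0.5, 0.5], [1.0, 0.0])
loss_confident = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])
```

The loss falls as the predicted probabilities move toward the true labels, so loss_confident is smaller than loss_uncertain in this example.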
Non-time series model construction
We also predicted the mortality risk with non-time-series ML methods: random forest (RF) (19), logistic regression (LR), decision tree (DT), and support vector machine (SVM). Because these ML methods cannot take time series directly, their input was not the time series data but the results of feature extraction (statistical variables such as the minimum, maximum, and average of each variable).
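The feature extraction step can be sketched in Python as follows. The variable names and values are hypothetical, and only the three statistics named above are computed; the study may have used further statistics.

```python
from statistics import mean

def summarize(series_by_variable):
    """Collapse each variable's time series into min, max and mean features."""
    features = {}
    for name, values in series_by_variable.items():
        features[f"{name}_min"] = min(values)
        features[f"{name}_max"] = max(values)
        features[f"{name}_mean"] = mean(values)
    return features

# Hypothetical hourly series for one ICU stay (shortened to three hours).
stay = {"hr": [80.0, 90.0, 100.0], "rr": [16.0, 18.0, 20.0]}
features = summarize(stay)
```

Each ICU stay then becomes one fixed-length feature vector, which is the kind of input that RF, LR, DT and SVM models expect.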
Model performance was assessed by overall performance, discrimination, and calibration. Overall performance was measured by the F1 score, defined as the harmonic mean of precision and recall, which weights the two equally. Discrimination is the capability to distinguish between patients who survive and those who do not 48 h after admission to the ICU, and was measured by the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUC-PR). The AUC-PR is sensitive to imbalance between negative and positive samples, especially when positive samples make up an extremely small portion of the data. Calibration was assessed by the Brier score, the mean squared deviation between the predicted probability and the actual outcome.
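Three of these metrics can be sketched in plain Python as follows. The labels, the probabilities and the 0.5 decision threshold are hypothetical, and AUROC is computed by direct pairwise comparison, which is simple but slow for large datasets; AUC-PR is omitted for brevity.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def brier_score(y_true, y_prob):
    """Mean squared deviation between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def auroc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = died) and predicted mortality risks.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

f1 = f1_score(y_true, y_pred)
brier = brier_score(y_true, y_prob)
auc = auroc(y_true, y_prob)
```

A lower Brier score indicates better calibration, while higher F1 and AUROC values indicate better overall performance and discrimination.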
The statistical analyses were carried out using SPSS software for Windows, V.19.0 (SPSS). Quantitative variables are presented using basic descriptive statistics: mean and SD (for normally distributed data) or median and IQR (for non-normally distributed data). Comparisons among datasets were performed using the chi-square test, Fisher's exact test, or the Kruskal-Wallis test. All statistical tests were two-sided, and P values less than 0.05 were considered statistically significant.