The performance of a deep learning-based competing-risk model (DeepHit) on COVID-19 clinical data is compared with that of a traditional statistical model (Fine-Gray). These two models are the benchmarks for the competing-risk problem in the current literature. Both are based on the cumulative incidence function (CIF). The cause-specific CIF gives the probability that event $k$ occurs on or before time $t$, conditional on the covariates $x$, i.e.,
$$F_k(t \mid x) = P(T \le t,\ K = k \mid X = x).$$
Fine-Gray model
The Fine-Gray model8 is the most commonly used statistical method for competing-risk problems. It modifies the traditional proportional hazards model through a direct transformation of the CIF. This approach focuses on the subdistribution of a competing risk. For each failure type, the model provides a direct interpretation in terms of survival probabilities. The cmprsk package in R is used to fit the Fine-Gray model. We used the materials provided by Nemchenko et al.7 to fit the model and compute the performance metric.
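For reference, the standard form of the Fine-Gray model can be written as follows. The notation is an assumption made here for illustration and does not come from the source: $\lambda_k$ is the subdistribution hazard of event $k$, $\lambda_{k,0}$ its baseline, $\beta_k$ the regression coefficients, and $x$ the covariates.

$$\lambda_k(t \mid x) = \lambda_{k,0}(t)\,\exp\!\left(\beta_k^{\top} x\right)$$

$$F_k(t \mid x) = 1 - \exp\!\left(-\exp\!\left(\beta_k^{\top} x\right)\int_0^{t} \lambda_{k,0}(s)\,ds\right)$$

The second line shows why the coefficients act directly on the CIF: a positive component of $\beta_k$ raises the cumulative incidence of event $k$ at every time point.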
DeepHit
DeepHit6 is a deep neural network that learns the distribution of survival times directly, without making any assumption about the underlying stochastic process. The model trains a multi-task network to estimate the joint distribution of the first hitting time and the competing events. Here, DeepHit is used to predict two competing risks: discharge from the hospital (event 1) and death prior to discharge (event 2). Since we consider two competing events, the network consists of four layers. The first is a fully connected layer forming the shared sub-network, followed by two fully connected layers for each cause-specific sub-network. The output layer is a softmax layer. ReLU (Rectified Linear Unit) activations are used in all three fully connected layers. The network is trained by back-propagation using the Adam optimizer with a batch size of 50 and a learning rate of . A dropout probability of 0.1 and Xavier initialization were applied to all layers. We used the pycox package to implement the model.13 The data are randomly split into training (60%), test (20%), and validation (20%) sets. For evaluation, 5-fold cross-validation is applied.
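To illustrate how a softmax output layer of this kind relates to the CIF, here is a minimal sketch in plain Python. It assumes the softmax is taken jointly over all (event, time bin) pairs with no separate bin for surviving past the last time point; the logits, the number of time bins, and the function names are hypothetical and are not taken from the study or from pycox.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cif_from_logits(logits, n_events, n_times):
    """Turn flat output-layer logits into one CIF per event.

    The softmax output is read as a joint probability mass over
    (event, time bin). The CIF of event k at bin t is the running sum
    of that event's probabilities up to and including bin t.
    """
    probs = softmax(logits)
    cifs = []
    for k in range(n_events):
        row = probs[k * n_times:(k + 1) * n_times]
        running, cif = 0.0, []
        for p in row:
            running += p
            cif.append(running)
        cifs.append(cif)
    return cifs


# Hypothetical logits: 2 events (discharge, death) x 3 time bins.
logits = [0.2, 1.0, 0.5, -1.0, 0.0, -0.5]
cif_discharge, cif_death = cif_from_logits(logits, n_events=2, n_times=3)
```

Under this assumption the two CIFs at the last time bin sum to one, because every subject is assigned to exactly one (event, time) cell.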
Evaluation metric
The time-dependent concordance index is used to evaluate the discriminative ability of the models.18 The principle of concordance is that the predicted survival probability of a subject who experienced the event should be lower than that of subjects who survived longer. The index ranges between 0 and 1; the closer it is to one, the better the performance of the model.
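The concordance principle can be sketched in a few lines of Python. This is a simplified, cause-specific version that ignores censoring weights and is not necessarily the exact estimator used in the study; the function name, the `cif_at` callback, and the toy data are hypothetical. A lower survival probability for the subject with the earlier event is expressed here as a higher predicted CIF.

```python
def td_concordance(times, events, cif_at, cause):
    """Simplified cause-specific time-dependent concordance index.

    times[i]     : observed time of subject i
    events[i]    : 0 = censored, 1, 2, ... = event type
    cif_at(i, t) : predicted CIF of `cause` for subject i at time t

    A pair (i, j) is comparable when subject i had `cause` at times[i]
    and subject j has a longer observed time. The pair is concordant
    when the model gives i the higher risk at times[i]. Tied risks
    count as 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if events[i] != cause:
            continue
        for j in range(n):
            if times[j] <= times[i]:
                continue
            comparable += 1
            risk_i = cif_at(i, times[i])
            risk_j = cif_at(j, times[i])
            if risk_i > risk_j:
                concordant += 1.0
            elif risk_i == risk_j:
                concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy data: four hypothetical subjects, event 2 = death.
times = [5, 8, 12, 15]
events = [2, 1, 2, 0]
scores = [0.9, 0.3, 0.6, 0.1]  # hypothetical death-risk scores


def toy_cif(i, t):
    # Hypothetical CIF: risk score scaled linearly in time, capped at 20 days.
    return scores[i] * min(t / 20.0, 1.0)


c_index = td_concordance(times, events, toy_cif, cause=2)
```

In this toy example the subjects who died earlier were also given the higher risk scores, so every comparable pair is concordant.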
Data
The raw data of 1863 hospitalized patients were extracted from an open-access COVID-19 epidemiological data website: https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30119-5/fulltext.
A group of researchers collected epidemiological data from different research labs. The data were extracted from online resources and national health portals released by state and local health officials and hospitals in different countries. The dataset consists of a subject ID, date of hospital admission, gender, age, onset date of symptoms, outcome (death/discharge), death/discharge date, history of chronic disease, symptoms, location, and travel history.
Time to event is calculated directly by subtracting the hospital admission date from the date of the outcome. Observations without admission dates, covariate information, and outcome were removed from the study. Discharge from the hospital and in-hospital mortality are event 1 and event 2, respectively. Censoring time is obtained from the last available date. Since the outcomes were well defined, there was no complication in defining death, discharge, and censoring. Only patients admitted up to March 30, 2020, were included, to avoid heavy censoring. The covariates considered in the analysis are age, gender, chronic disease history, latitude, and longitude. Table 1 presents outcome-wise descriptive statistics. All percentages are calculated relative to the 1863 patients. Fig. 1 shows a schematic plot of competing-risk time-to-event data for five hypothetical subjects.
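The time-to-event construction described above can be sketched as follows. The function name, the outcome labels, the event codes for censoring, and the example dates are hypothetical choices made for illustration; only the rule itself (outcome date minus admission date, with censoring at the last available date) comes from the text.

```python
from datetime import date

# Event coding: 1 and 2 follow the text; 0 for censoring is an assumption.
CENSORED, DISCHARGE, DEATH = 0, 1, 2


def time_to_event(admission, outcome, outcome_date, last_available):
    """Return (days, event_code) for one patient.

    `outcome` is "discharge", "death", or None when no outcome was
    recorded, in which case the patient is censored at `last_available`.
    """
    if outcome == "discharge":
        return (outcome_date - admission).days, DISCHARGE
    if outcome == "death":
        return (outcome_date - admission).days, DEATH
    return (last_available - admission).days, CENSORED


# Hypothetical last available date and three hypothetical patients.
last_date = date(2020, 3, 30)
discharged = time_to_event(date(2020, 2, 1), "discharge", date(2020, 2, 14), last_date)
died = time_to_event(date(2020, 3, 1), "death", date(2020, 3, 8), last_date)
censored = time_to_event(date(2020, 3, 20), None, None, last_date)
```

Each result is a (duration in days, event code) pair, which is the input shape competing-risk models typically expect.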
Table 1. Event-wise summary measures of the patients.
| Descriptive statistics | Event 1 (Discharge) | Event 2 (Death) |
|---|---|---|
| No. of subjects (%) | 162 (8.7) | 59 (3.2) |
| Median time to event in days (Range) | 13 (8-18) | 7.5 (4-10.75) |
| Median age in years (Range) | 38 (30-51) | 70 (62.75-79.75) |
| No. of males (%) | 94 (5.1) | 39 (2.1) |