We have developed a predictive model, available as an open-access web-based tool, that performed well and with high certainty: an AUROC of 89·6% [95% CI: 89·5–89·7] and a high accuracy of 96·23% [95% CI: 96·21–96·26]. The model comprises 8 independent predictors, incorporating demographic, clinical, and laboratory variables. The model was developed through a regression search for influential variables of the disease. We believe the model reflects all important underlying pathophysiological aspects of the disease, as represented by these variables, and that this led to its high performance. The model has a very high sensitivity of 99·6% but a modest specificity of 23·6%. Given that such a model is intended to capture and prevent potential deaths, and given the overall high AUROC, the model would be very useful in clinical practice. Misidentification of a non-fatal case is an acceptable compromise compared to missing a fatal case altogether. A possible explanation for the modest specificity is that the real underlying outcome-determining pathophysiological mechanisms in dengue have not been clearly elucidated, and only biomarkers involved in these mechanisms could improve the specificity of any model.
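The relationship between sensitivity, specificity and overall accuracy discussed above can be illustrated with a minimal sketch. The function name and the confusion-matrix counts below are hypothetical and are not taken from the study cohort; the sketch only shows that accuracy is the prevalence-weighted average of sensitivity and specificity.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, prevalence, accuracy) from confusion-matrix counts."""
    positives = tp + fn          # cases that truly belong to the positive class
    negatives = fp + tn          # cases that truly belong to the negative class
    total = positives + negatives
    sensitivity = tp / positives
    specificity = tn / negatives
    prevalence = positives / total
    accuracy = (tp + tn) / total
    return sensitivity, specificity, prevalence, accuracy


# Hypothetical counts, chosen only for illustration (not study data).
sens, spec, prev, acc = diagnostic_metrics(tp=180, fn=20, fp=30, tn=70)

# Accuracy equals the prevalence-weighted mean of sensitivity and specificity.
weighted = sens * prev + spec * (1 - prev)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because of this weighting, overall accuracy is dominated by performance in the more common outcome class, which is why sensitivity and specificity are worth reading alongside it.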
We developed this model to be employed at the time of severe dengue diagnosis, which may reasonably be assumed to be the time of onset of severe dengue. This is likely the time when outcome-determining pathophysiological processes become critical, and it is the earliest and most appropriate time to prognosticate death in a patient with severe dengue. We believe the selection of the time point of prediction is crucial in the development of a dengue death prediction model. An earlier time point would be too early, as pathophysiological processes may not yet have reached outcome-determining significance. A later time point would be too late for the prediction model to be beneficial. With that in mind, and given that the model performed well, we postulate that investigating the processes represented by these variables could elucidate the pathophysiology of dengue with better clarity. We also believe that the model may be used to track the progression of a patient with severe dengue through the course of illness, assisting in prognostication and decision-making.
Making predictions in dengue research has gained momentum. Modelling studies in outbreak prediction have utilised several predictive analytic methods, including ensemble methods, time series regression, and support vector machines.10–16 Modelling studies involving earlier clinical aspects of the management of dengue (identifying and stratifying dengue) have used decision trees, logistic regression, and structural equation models.17–21 However, only three studies have modelled prediction of death.23–25
Huang et al. studied all patients with dengue, including 34 deaths, and identified five independent predictors of death: age >64 years old, diabetes mellitus, systolic blood pressure <90 mmHg, chronically bedridden, and haemoptysis.23 However, this study used a scoring method. Risk scoring is not without weaknesses and has been deemed to have serious problems.35 Risk scoring involves converting continuous predictors into categorical predictors, and the conversion results in loss of the granularity of information contained within continuous predictors.36 Moreover, any of the first three predictors was present in only 162 (20·6%) cases of our cohort, and the combination of all three in only 2 cases. It is imperative in clinical prediction that a predictor is sufficiently prevalent for the achievement of reasonable accuracy.37
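The loss of granularity from converting a continuous predictor into a categorical one can be shown with a small sketch. The ages and outcomes below are hypothetical toy data, not study data, and the helper `auroc` is an illustrative pairwise implementation; only the >64 years cutoff is borrowed from the text above.

```python
def auroc(scores, outcomes):
    """AUROC as the fraction of (death, survivor) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: age in years and outcome (1 = died, 0 = survived).
ages = [30, 45, 60, 66, 70, 85]
died = [0, 0, 0, 0, 1, 1]

# Dichotomise age at the >64 years cutoff, as a risk score would.
age_over_64 = [1 if a > 64 else 0 for a in ages]

auc_continuous = auroc(ages, died)        # 1.0: every death is older than every survivor
auc_binary = auroc(age_over_64, died)     # 0.875: the 66-year-old survivor now ties with both deaths
print(auc_continuous, auc_binary)
```

In this toy example the dichotomised predictor can no longer separate the 66-year-old survivor from the patients who died, so discrimination falls even though the underlying data are unchanged.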
Md-Sani et al. examined severe dengue cases and predicted death at the onset of severe dengue, similar to the current study, but at a single centre.24 That study, however, had only 20 events (deaths).24 Building a predictive model to predict death among severe dengue cases is challenging because dengue death, the event or outcome of interest, is uncommon; in Malaysia it is just above 0·2% of all dengue cases.6 This poses a major problem because the usual approach of statistical modelling using multivariable logistic regression requires 5–10 events per candidate predictor variable (EPV).38 This ratio dictates the number of candidate predictors that may be simultaneously analysed in the multivariable model. When the EPV is lower than this, the number of candidate predictors that may be simultaneously analysed to identify which among them are truly independent predictors is limited. Thus, Md-Sani et al. used adjustment or controlling variables instead of assessing all candidate predictors simultaneously in one model. The present study is larger and multicentre, and used a different analysis technique that addressed overfitting, multicollinearity and the low number of events per variable.
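The arithmetic behind the EPV constraint can be made explicit with a short sketch. The function name is hypothetical; the 20 events and the 5–10 EPV range come from the paragraph above.

```python
def max_candidate_predictors(n_events, epv):
    """Largest number of candidate predictors that keeps at least `epv` events per variable."""
    return n_events // epv


events = 20  # deaths reported in the single-centre study discussed above

# Under the stricter rule (10 EPV) and the more lenient rule (5 EPV).
strict = max_candidate_predictors(events, epv=10)   # 2
lenient = max_candidate_predictors(events, epv=5)   # 4
print(f"{events} events allow {strict} to {lenient} candidate predictors")
```

With only 20 deaths, a conventional multivariable logistic regression could therefore assess roughly two to four candidate predictors at once.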
Pinto et al., who had a large cohort with 61 severe dengue deaths, built a simple predictive model comprising only four categorical predictors: age (binary, cutoff age 55), haematuria, gastrointestinal bleeding and thrombocytopaenia (binary, cutoff platelet count 20,000 cells/mm³).25 In our study, we did not specifically identify the source of bleeding. However, assuming that the variable "warning sign of spontaneous bleeding tendencies" and the variable "severe bleeding" represent haematuria and gastrointestinal bleeding, respectively, any of Pinto et al.’s predictors was present in only 259 (33%) cases in our cohort, and all four predictors were present in only 3 cases. Therefore, these models may not be adequate to prognosticate death in severe dengue. Our model kept variables in their original continuous form, and persistent diarrhoea was the only categorical variable. Persistent diarrhoea occurred in 38·5% of our cohort, which is more prevalent than any of the predictors of Huang et al. and Pinto et al. when applied to our cohort. We attribute the higher accuracy of our model to this higher prevalence and to the predominantly continuous nature of its predictors. Additionally, Pinto et al. used the WHO 1997 dengue classification instead of the WHO 2009 schema, which Malaysia has adopted in clinical practice.6,28
While there are other studies on death in dengue, they examined associations with death in unselected dengue cases and were not prediction modelling studies.39–46 Our study is the first modelling study, based on the latest and widely adopted WHO 2009 classification scheme, to model prediction of death in severe dengue cases. As mentioned above, many studies have built models for identification of dengue and stratification of severity in dengue. Our model complements these models and completes the prediction aspect of clinical management.
The study additionally documented another interesting finding. We found that almost three-quarters of the cases were still in the febrile phase (study definition: temperature >38°C) at diagnosis of severe dengue. This finding supports similar documentation in a previous study.24 A similar proportion was found among those who had shock. This finding differs from what the majority of guidelines state, namely that shock occurs only during the critical phase, i.e. upon defervescence and later.6,28 Current clinical practice recognises severe dengue only during the critical phase, thereby missing severe dengue that could occur earlier in the febrile phase. Our study provides quantified evidence that severe dengue can occur early in the febrile phase, and clinicians should be vigilant of this fact in order to prevent deaths.
A limitation of our study is its retrospective design, in which missing data were inevitable. However, only variables of utmost importance were included as candidate predictors for variable selection. We included serum bicarbonate and serum lactate, although these had higher proportions of missing data (11·6 and 16%, respectively), as we believe they play essential roles in determining not only progression to severe disease but also death.24,47,48 Even though external validation of the model was not performed, we believe that the repeated k-fold cross-validation algorithm we employed ensured the robustness of the model for unseen data. A final limitation is that, although our model may save costs due to its accessibility, it will require additional laboratory-related resources. Nevertheless, any severe dengue case should be treated in a setting with adequate resources to implement evidence-based clinical practice, and the laboratory predictors in our model are commercially available. Zakaria et al. demonstrated that the WHO 2009 dengue severity stratification scheme classifies more patients (4·6%) into the most severe form compared with the previous WHO 1997 scheme (0·7%).49 They highlighted that this might have a significant impact on hospital resources. Our model can potentially prioritise patients for local resources based on their probability of death.
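The repeated k-fold cross-validation mentioned among the limitations above can be sketched in general terms as follows. This is a minimal illustration of the resampling scheme only, not the study's actual implementation; the function name and the values of `n_samples`, `k`, `repeats` and `seed` are assumptions chosen for the example.

```python
import random


def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)                        # a fresh random partition for each repeat
        folds = [indices[i::k] for i in range(k)]   # k disjoint folds covering all samples
        for i in range(k):
            test = folds[i]
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


# Hypothetical settings: 20 cases, 5 folds, 3 repeats, giving 15 train/test splits.
splits = list(repeated_kfold(n_samples=20, k=5, repeats=3))
print(len(splits))
```

Within each repeat every case is held out exactly once, and performance is averaged over all splits. This gives an internal estimate of performance on unseen data, which is distinct from external validation in an independent cohort.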
In conclusion, we have developed a dengue death prediction model comprising clinical and laboratory data and deployed an open-access web-based tool for any centre to utilise for local validation. The findings from this study would be valuable to the global community of clinicians who treat dengue, hopefully paving the way for better and more tailored clinical decision-making and resource management. In terms of research, the tool may be a useful yardstick, similar to how TIMI and APACHE are useful to cardiovascular and critical care medicine, respectively.22,50,51