A tree of life? Multivariate logistic outcome-prediction in disorders of consciousness

ABSTRACT Background: Clinical outcome of patients with disorders of consciousness (DOC) is seen as generally very poor. Here, we specify individual outcome chances for patients with DOC on the basis of clinical and event-related potential (ERP) data and identify subgroups that vary substantially regarding their outcome chances. Methods: We employed data from 102 patients and used standard clinical protocol data (age, etiology, diagnosis, gender), sensory (N100, mismatch negativity) and cognitive (P300, N400) ERPs to predict patients’ recovery rates. Results: Two significant prediction models emerged: In both, subgroups of patients with good (51%, tree 1) to very good recovery chances (97%, tree 2) could be identified. The first model was obtained from standard clinical data. The second model included cognitive ERPs and resulted in considerably better patient classification. Moreover, when taking cognitive ERPs into account, the standard protocol data did not add further significant information, and neither did sensory ERPs. Conclusion: The presented information about outcome chances of individual patients with DOC will be vital for these patients and critical for clinical professionals who have to direct specialized treatments and counsel relatives. Legal guardians and families, in turn, need to know what to expect in the future in order to prepare for the challenges ahead.


Introduction
Brain injuries are the number one cause of death and severe disability among the younger population in industrialized countries (1). Among the most severe disabilities resulting from brain injuries are disorders of consciousness (DOC), encompassing the clinical syndromes of the unresponsive wakefulness syndrome (UWS (2); formerly vegetative state (3,4)) and the minimally conscious state (MCS (5)). In the former, patients show no signs of self-awareness or awareness of their surroundings (3). In the latter, patients show some limited, often inconsistent signs of awareness, which can range from mere eye fixation to following simple commands. A patient is considered to have improved above MCS if functional communication and/or functional object use has been reestablished (5). Both syndromes can be steps toward recovery but can also become permanent conditions in which patients can survive for many years without any apparent steps toward consciousness. Thus, a reliable identification of prognostic factors is of importance to all: the patients themselves, family members, and the medical staff involved.
On the group level, variables such as age, diagnosis, and etiology have been repeatedly found to be statistically reliable prognostic indicators of outcome (4,6,7). Here, younger age, a traumatic injury, and a quick transition into the state of MCS are predictors of a favorable outcome (i.e. emergence from MCS), whereas an age of over 45 or 50 years, prolonged UWS, and an anoxic event are considered to be predictors of an unfavorable outcome (no recovery until death).
Various cerebral measures of information processing have also been found to correlate with outcome (see, for example (8,9)). These include several, mostly auditory, electroencephalographic event-related potentials (ERPs). For example, the absence of the N100, an index of cortical sensory stimulus registration, is considered to be predictive of a negative outcome (10). The presence of the so-called mismatch negativity (MMN), indicating an automatic detection of deviant stimuli in continuous auditory stimulation streams, on the other hand, has been shown to be a positive sign (11,12). The presence of a P300, reflecting higher-order stimulus discrimination, also often correlates with a positive outcome (12,13). Another recently discussed and possibly very promising ERP is the N400, the cortical reaction to semantic violations in spoken speech. Although rarely found in patients, its presence correlates highly with a positive outcome (14).
However, although all these separate prognostic factors are important and relevant, they hold limited information for clinical daily routine. This is mainly because the typical patient presents with a highly individual combination of positive and negative predictors, which can interact in various ways. For instance, a young patient with an anoxic event may demonstrate ERPs like the N100 but no detectable MMN. In this case, the young age and the N100 are positive predictors, whereas the anoxic cause and the missing MMN would foretell a negative outcome. From extant scientific studies, it remains unclear how these prognostic factors interact and how physicians are supposed to weigh factors when a patient presents both positive and negative predictive factors at the same time.
Currently, physicians may overestimate the probability of unfavorable outcomes after severe brain damage (15,16). This is especially concerning, since physicians are often asked to provide counseling regarding end-of-life decisions. Undoubtedly, those are hard decisions to make, and physicians are forced to counsel families with few proven prognostic factors to rely on. Unfortunately, a self-fulfilling prophecy may result if clinicians rate survival chances (or a good outcome) as very poor, advise discontinuation of life-sustaining measures, patients die because life-sustaining measures are ended, and clinicians are confirmed in their prognosis. In fact, as a Canadian retrospective study shows, it seems that up to 70% of the deaths in patients with severe TBI on intensive care units result from the termination of life-sustaining measures (16). Consequently, research is mandated to provide a source of information for physicians to reliably predict the outcome of various subgroups of patients.
For the acute phase of TBI-induced coma, some attempts have been made to provide physicians with 'multi-factor models'. For example, based on data from 102 patients, Jain et al. (17) employed the presence/absence of pupil responses, need of ventilation and Glasgow Coma Scale (GCS) improvement within the first 24 h after the incident in a prediction tree, identifying a subgroup of patients (with pupil responses, no need of ventilation and an increase in GCS scores within the first 24 h) in which the survival rate increased from 6.1% to 57.1%. The prediction tree of Rovlias and Kotsou (18) tested a total of 16 known predictive factors on outcome data of 345 acute patients. The best predictive tree resulted from eight factors (GCS, age, pupillary responses, computed tomographic (CT) findings, hyperglycemia, and leukocytosis) with a predictive accuracy of 86.84%. Furthermore, from the 'International Mission for Prognosis and Analysis of Clinical Trials in TBI' (IMPACT; (19)) with over 9000 patients, outcome chances can be calculated with three models of increasing complexity, yielding the 6-month outcome of adult patients with moderate to severe head injury. However, IMPACT treats death, UWS, and severe disability indiscriminately as unfavorable outcome. Although all current models provide physicians with helpful data about the likelihood of a favorable or unfavorable outcome in the acute phase, given the data of the same patient with DOC, the different models result in very different forecasted recovery chances (20).
Moreover, to the best of our knowledge, no prediction model exists for the post-acute phase when patients have already entered into the stages of prolonged disorders of consciousness (DOC). This might be due to the fact that in order to calculate a prediction tree with various branches, a large number of patients is needed. Such numbers are hard to obtain for patients with DOC since, although growing in numbers, DOC is still a rare and sometimes slowly changing syndrome, requiring long follow-up intervals. Moreover, clinical routines and documentation are rarely standardized between centers, and sometimes the whereabouts of patients post-acute brain damage are not even known.
The aim of the present study is to develop prediction models for DOC outcome on the basis of data from 102 patients with DOC, testing the predictive value of clinical prognostic factors as well as ERPs regarding the outcome of patients with DOC after 8 years on average (range 2 to 17 years). The models demonstrate the influence of the presence or absence of various factors on the probability of a favorable (regaining communication skills) or unfavorable (permanent UWS or MCS until death) outcome. We calculated predictions for two possible scenarios: in the first model, patient demographics were taken into account, which can be easily obtained for every patient (diagnosis, age, etiology, gender). In the second model, we tested outcome prediction relying on information on ERP status (N100, MMN, P300, and N400).

Materials and methods
The initial sample consisted of 175 patients with UWS (n = 92) or MCS (n = 83). Outcome data could be obtained from 102 patients (43 patients with MCS, 59 patients with UWS), who were included in this study. For 72 patients, contact details were unobtainable, and one caregiver refused to answer. The outcome follow-up was a semi-structured telephone interview 2 to 15 years after the patients' discharge. Here, in most cases, relatives of the patients were asked about the physical and cognitive abilities of the patient. Because misdiagnosis rates are very high even among trained clinical staff, relatives were not asked to distinguish between UWS and MCS. Instead, we asked relatives whether or not patients were able to functionally communicate (including but not limited to correct responses to 'test-questions' like 'Is your given name … ?' or 'Do you have two kids?') via speech, eyeblink code or any other means, expecting that this hallmark could be reported reliably. Accordingly, we regarded patients as 'recovered' if functional recovery was obtained, and considered a patient as 'not recovered' if he or she was still in UWS or MCS.
Patients were initially treated at the Neurorehabilitation Hospital 'Kliniken Schmieder' (Allensbach, Germany) between 1994 and 2005. For these patients, clinical files were available and all had undergone ERP examination, including N100, MMN, P300, and N400 ERPs (for paradigm description, ERP analysis, ERP scoring and illustration of individual responses see supplement 1). All ERPs were obtained during the first weeks of rehabilitation. The patients' cognitive functioning was, for one, evaluated using the German Koma Remissions Skala (KRS, coma remission scale). The KRS was developed in Germany specifically to monitor and document the improvements of patients with coma, UWS and MCS in early rehabilitation units. It has good psychometric properties and its use is recommended for neurorehabilitation units in Germany (21). It was used in all patients (every 4 weeks during their stay, which was on average 112 days long) as part of the clinic routine. We further analyzed daily clinical protocols from nurses and therapists for any signs of (even suspected) conscious behaviors (like eye-following or eye-fixation). Patients were assigned to be in UWS if there was no indication of conscious behavior up to the time of ERP-testing. If there was any indication of conscious behavior, patients were assigned to be in MCS.
Demographics of the patients are given in Table 1. Detailed descriptions of the patients, an English translation of the KRS, and information on the ERP paradigms can be obtained from supplemental data of previously published articles (14,22) in which we reported on the prognostic value of P300 and N400 as single predictors. N100 and MMN were not included in the previous reports, and demographics were not tested as predictors.

Statistical analysis
The analyses were conducted using SPSS 21 (IBM SPSS Statistics for Windows, Version 21.0) and R (23). Several logistic regression models were fitted to assess the impact of the prognostic factors on the likelihood that patients would recover communicative abilities. Model 1 included age, diagnosis, etiology, and gender as variables. Model 2 additionally included early sensory (N100, mismatch negativity) and late cognitive (P300, N400) ERPs. Model variables were binary-coded as follows: diagnosis: minimally conscious state (MCS) = 0, unresponsive wakefulness syndrome (UWS) = 1; gender: female = 0, male = 1; event-related potentials: present = 1, not present = 0. Etiology was divided into traumatic brain injury, hypoxic-ischemic injury (as well as respiratory arrest) and others (like meningitis, anesthesia accident, lightning stroke, epileptic seizure, etc.). Age was divided into younger than 31.5 years, between 31.5 years and 58.5 years, and above 58.5 years. These categories were effect coded with Yes = 1 and No = 0.
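The coding scheme above can be illustrated with a minimal Python sketch. It is not the code used in the study; the function name `encode_patient` and the dictionary keys are hypothetical, and only the category boundaries and the 0/1 assignments are taken from the text.

```python
def encode_patient(age, etiology, diagnosis, sex, erps):
    """Return the 0/1 predictor coding described in the text as a dict.

    etiology: "TBI", "hypoxic" or "other"; diagnosis: "MCS" or "UWS";
    sex: "female" or "male"; erps: dict with boolean entries for
    "N100", "MMN", "P300" and "N400" (True = ERP present).
    """
    if etiology not in ("TBI", "hypoxic", "other"):
        raise ValueError("unknown etiology: %r" % (etiology,))
    if diagnosis not in ("MCS", "UWS"):
        raise ValueError("unknown diagnosis: %r" % (diagnosis,))
    if sex not in ("female", "male"):
        raise ValueError("unknown sex: %r" % (sex,))

    row = {
        "uws": 1 if diagnosis == "UWS" else 0,    # MCS = 0, UWS = 1
        "male": 1 if sex == "male" else 0,        # female = 0, male = 1
        # Three age bands, each coded Yes = 1 / No = 0.
        "age_under_31_5": 1 if age < 31.5 else 0,
        "age_31_5_to_58_5": 1 if 31.5 <= age <= 58.5 else 0,
        "age_over_58_5": 1 if age > 58.5 else 0,
        # Three etiology groups, each coded Yes = 1 / No = 0.
        "etiology_tbi": 1 if etiology == "TBI" else 0,
        "etiology_hypoxic": 1 if etiology == "hypoxic" else 0,
        "etiology_other": 1 if etiology == "other" else 0,
    }
    for name in ("N100", "MMN", "P300", "N400"):
        row["erp_" + name.lower()] = 1 if erps[name] else 0  # present = 1
    return row
```

When such indicators enter a regression, one category per factor is usually left out as the reference level; which category served as reference here is not stated in the text.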
The resulting models (Model 1 and 2c) are shown in Table 2; a graphical presentation is given in Figures 1 and 2. In the second model, an upward (stepwise forward) strategy was used: factors were only included if the model comparison was significant according to the likelihood-ratio test (α = .05). Only diagnosis, P300 and N400 increased the model fit significantly, although N100 and MMN were also tested. The first and the second model are not nested and thus cannot be compared via a likelihood-ratio test. Therefore, the Akaike Information Criterion (AIC (24)) was computed for comparing the models.
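The two comparison tools named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS or R procedure used in the study: the helper names are hypothetical, and the test is restricted to the case of one added binary predictor (1 degree of freedom), for which the chi-square tail probability equals erfc(sqrt(x / 2)).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L); smaller is better."""
    return 2 * n_params - 2 * log_likelihood


def lr_test_1df(loglik_small, loglik_big):
    """Likelihood-ratio test for nested models that differ by one parameter.

    Returns the test statistic 2 * (lnL_big - lnL_small) and its p-value
    under a chi-square distribution with 1 degree of freedom.
    """
    stat = 2.0 * (loglik_big - loglik_small)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


def include_factor(loglik_small, loglik_big, alpha=0.05):
    """Forward step: keep the added factor only if the test is significant."""
    _, p_value = lr_test_1df(loglik_small, loglik_big)
    return p_value < alpha
```

For non-nested models, such as Model 1 and Model 2c, only the AIC values are compared: the model with the smaller AIC is preferred.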

Results
Demographic variables of the patients can be seen in Table 1.
Two prediction models of 102 patients with severe brain injury in a DOC are presented in Figures 1 and 2. For the sake of clarity of presentation, we present the model results comparable to those of a partitioning tree, although all model parameters are estimated simultaneously rather than successively in logistic regressions. The base-rate for reestablishing some form of communication is .33. Testing demographic information in the first model (see Figure 1, Table 2, Model 1) revealed that the best predicted probability for a favorable outcome results for young female patients with MCS with TBI or Other causes (43% and 51%, respectively).
In the second model, information about the presence or absence of ERPs was analyzed (Figure 2, Table 2, Models 2a-c). Here, a very good chance of recovery is prognosticated for patients with MCS with both N400 and P300 (up to 97% recovery chance). However, there is also a small subgroup of patients with UWS, namely those with both ERPs present, that reaches very good outcome predictions of up to 92%.
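How a simultaneously estimated logistic model can be laid out like a tree may be easier to see in a short Python sketch. The intercept and coefficients below are placeholder values chosen for illustration only; they are NOT the estimates reported in Table 2, and the function names are hypothetical.

```python
import math
from itertools import product

# Placeholder values for illustration only; NOT the estimates from Table 2.
INTERCEPT = -1.0
COEFS = {"uws": -1.0, "p300": 1.5, "n400": 2.5}


def predicted_probability(intercept, coefs, x):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefs[name] * x[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def tree_leaves(intercept, coefs):
    """List every 0/1 predictor combination with its predicted probability."""
    names = list(coefs)
    leaves = []
    for values in product((0, 1), repeat=len(names)):
        x = dict(zip(names, values))
        leaves.append((x, predicted_probability(intercept, coefs, x)))
    return leaves


if __name__ == "__main__":
    # Print the leaves from the best to the worst predicted chance.
    for x, p in sorted(tree_leaves(INTERCEPT, COEFS), key=lambda leaf: -leaf[1]):
        print(x, round(p, 2))
```

Each combination of binary predictors corresponds to one leaf of the displayed tree, but all coefficients come from a single model fit rather than from successive splits.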

Discussion
Using logistic regression models, this study examined whether it is possible to identify from a fairly large sample of patients with DOC, sub-samples with favorable outcome probabilities.
Our most important finding is that it is indeed possible to identify subgroups of patients with much better chances for a good outcome (regaining the ability to communicate) than currently suggested by most physicians. This is highly important since in clinical practice patients with DOC are sometimes still considered a homogeneous group of hopeless patients, representing the collateral damage of modern medicine (25,26), which may lead to the circle of forecasted devastating outcome, discontinued life support, ensuing death and the subsequent reinforcement of the devastating prognosis (15). However, the perspective on patients with DOC is changing, as evidenced by the nosographic redefinition from 'vegetative state' to 'unresponsive wakefulness syndrome', expressing the concept that patients are considered to be susceptible to potential improvements rather than locked in an irreversible and uniform condition. Furthermore, the most recent works tend to differentiate MCS and UWS into MCS+ and MCS-, UWS+ and UWS- (27). This reflects the emerging necessity to better stratify patients into subgroups with different prognoses and levels of residual brain function, resulting in varying therapeutic demands.
From our results, it is possible to identify patients with a good prognosis by combinations of known and easily obtainable factors like diagnosis, age, etiology, and gender. Furthermore, ERPs can help to further differentiate patients within the same diagnosis category. Patients with both an N400 and a P300 had the best chances of a favorable outcome (97%), with the presence of an N400 causing the biggest change in predicted probabilities. Thus, electrophysiological indicators of high-level cognitive processing are important outcome predictors, whereas lower-level sensory and perceptual processing added no further information to our outcome classification. Moreover, when full electrophysiology measures are available, only diagnosis played an additional significant role. Accordingly, the AIC of our second model (74.66) is considerably smaller than the AIC of the first model (100.51), indicating that model 2 indeed predicts outcome better (24). This further highlights the need for using standardized electrophysiological measures for patients with DOC in order to give the most exact prognosis available before making irreversible recommendations. In both regression models, the current state of a patient emerges as the first powerful predictor of outcome. In our study, we had to diagnose the patients retrospectively, and being aware of the risks of misdiagnosing patients with UWS and MCS (28-30), we made every effort to ensure the diagnoses by extensive study of medical files. For example, we evaluated daily nursing and therapy records that often covered a considerable period of time (average duration of stay was 112 days). It is important to note that nurses and therapists at the 'Kliniken Schmieder' were specially trained to work with patients with prolonged DOC and had years of experience in the treatment and recovery stages of patients.
We further took into consideration the scores of the German 'Koma Remissions Skala' (KRS, coma remission scale), the recommended scale for documenting coma remission, which was filled out every 4 weeks by trained and experienced physicians. Moreover, the Kliniken Schmieder follow the primary nursing model, in which a patient is cared for and treated by the same nurse and therapists/physician during his or her stay, and thus primary caregivers get to know their patients very well during their average hospital stay of 112 days. A patient was only assigned to UWS if there was no indication of consciousness in the medical files. Unclear cases were assigned to the MCS group. However, despite all efforts, cases of misdiagnosis cannot be completely ruled out, as is the case in all medical decisions. Still, our results clearly highlight the need for accurate diagnosis, which is worth an additional effort: knowledge of the correct diagnosis and of the associated changes in outcome chances is very important for physicians, families, facilities and, in particular, the patients, since the correct diagnosis could make a huge difference for patients who are actually aware but are mistaken for being unaware.
The first model, using patient demographics, is significantly better in classifying patient outcome (overall correct classification, OCC: 72%) than the standard assumption that no recovery will occur (which is correct in 67% of the cases in our data). Here, in general, known results can be replicated: younger patients have better chances than older ones (4,6); patients with MCS have better recovery chances than patients with UWS (7); female patients have slightly better general recovery chances than male patients (31); and hypoxic causes are associated with the lowest recovery chances (6,7). In our study, recovery chances vary between 6% for an older male patient in UWS with a hypoxic cause and 51% for a younger female patient in MCS with a cause from the category 'others', but also with TBI (around 45%).
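The comparison between a model's overall correct classification and the "no recovery" baseline can be sketched as follows. This is an illustration only: the function names are hypothetical, the 0.5 cut-off is an assumption (the text does not state the threshold used), and the count of 34 recovered patients is inferred from the reported base rate of .33 in 102 patients rather than quoted from the source.

```python
def overall_correct_classification(probabilities, outcomes, threshold=0.5):
    """Share of patients whose predicted class matches the observed outcome.

    A patient is predicted as recovered (1) when the model probability
    reaches the threshold; 0.5 is an assumed cut-off.
    """
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probabilities, outcomes))
    return hits / len(outcomes)


def majority_baseline(outcomes):
    """Accuracy of always predicting the more frequent outcome."""
    recovered = sum(outcomes)
    return max(recovered, len(outcomes) - recovered) / len(outcomes)


# Assumed split, inferred from the base rate of .33 in 102 patients.
observed = [1] * 34 + [0] * 68
print(round(majority_baseline(observed), 2))  # share correct when always predicting "no recovery"
```

A model is only informative to the extent that its overall correct classification exceeds this baseline.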
In the second model, after the diagnosis, the presence or absence of an N400 and the presence or absence of a P300 cause significant changes in outcome prediction. In this model, the highest predicted chance of recovery, 97%, is reached for a patient with MCS with both a detectable N400 and a detectable P300. The lowest predicted recovery chance (around 10%) emerges for patients with UWS with neither an N400 nor a P300. This model classifies a total of 80% of the cases correctly. This makes the second model more informative than the first. Table 2 demonstrates that every factor in the tree contributes significantly. The model comes close to the correct classification level of Rovlias and Kotsou's model for acute patients with traumatic brain injuries (TBI), which reaches 86.84%. However, the Rovlias and Kotsou model actually needs eight factors to reach this result (18).
One limitation of our regression models is that we started with 'only' 102 patients, although this is a fairly large number for this field and actually the same number as in the model of Jain et al. (17). Additionally, according to Peduzzi et al. (32), consistency of logistic regression estimates depends on the number of events per variable, sample size, and the number of predictors. Peduzzi's simulated data suggest that in our case the maximum number of predictors given the sample size would be three or four. Therefore, a model calculated from a bigger patient group to begin with could very well result in a more fine-grained model, taking more factors into account and reaching even more overall correct predictions. In such a model, factors that at present did not contribute significant information, such as early ERPs, could turn out to be of added value. Currently, however, such numbers would be very hard to obtain since most facilities lose contact with patients after discharge, which makes these patients and their long-term outcome data difficult to obtain. But long-term follow-ups are needed to correctly classify recovered and not recovered patients since UWS and MCS can be very slowly changing syndromes (20,21,33). In our sample of 102 patients, according to the relatives, six patients recovered consciousness and communicative skills only after 3 to 5 years in UWS (five patients) or MCS (one patient). In shorter follow-up periods, the potential of these patients would have been missed.
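The events-per-variable reasoning can be made concrete with a small Python sketch. It assumes the commonly cited rule of thumb of about 10 events per predictor, and it assumes 34 recovered patients, a count inferred from the base rate of .33 in 102 patients rather than quoted from the source; the function name is hypothetical.

```python
def max_predictors(n_events, n_nonevents, events_per_variable=10):
    """Largest number of predictors supported by an events-per-variable rule.

    The limiting quantity is the rarer of the two outcomes, divided by
    the required number of events per predictor (assumed to be 10).
    """
    return min(n_events, n_nonevents) // events_per_variable


# Assumed counts: 34 recovered and 68 not recovered out of 102 patients.
print(max_predictors(34, 68))
```

Under these assumptions the limit is three predictors, which is in line with the "three or four" stated above.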
The research team also discussed whether 'death of a patient' might serve as a separate outcome category. This might supply information of high practical value; however, such an approach would have demanded a different study design than the one we applied. We exclusively recruited patients who survived until after neurorehabilitation; those who died as an immediate consequence of the trauma were not included in this analysis. Our patients lived for at least about half a year after the event; patients with UWS in this sample died on average 2 years (SD 1.89) after the event, and patients with MCS after about 4 years (SD 3.38). This difference narrowly missed significance (Mann-Whitney U = 112.0; p = .056). Even more importantly, the most common cause of death within both patient groups was pneumonia. We reasoned that death occurring after more than 6 months, and on average after about 2 years, does not qualify as an effect of the event itself, but as a subsequent consequence influenced by many different and partly undefinable variables (such as type of sustained physical injuries and ensuing physical condition, type of care, decisions on medication, preexisting medical conditions, end-of-life decisions …), so that predicting it by sociodemographics or early occurrence of ERPs could only be inadequate.
Clearly, our results call for validation in additional samples. Unfortunately, we are not able to run a cross-validation within our sample due to sample size requirements. Nevertheless, we used the rpart package (34) (method = "class") to investigate if our results remain stable across different analytic strategies, namely recursive partitioning. Recursive partitioning strives to correctly classify members of the population by splitting it into sub-populations. Within this process, the first variable is identified which best splits the data into two groups. After that, the next variable is tested independently for each sub-group. The resulting models can therefore be presented as binary trees. Using all predictors simultaneously, we found that again the occurrence of a N400, and for those with a N400, age and the occurrence of a P300 could be identified as important nodes producing the best recovery rates (Figure 3).
Figure 3. Graphical presentation of the recursive partitioning. The boxes represent the nodes of the tree. The first number (%) indicates the percentage of the sample at this node. 0 (no recovery) or 1 (recovery) present the most likely state of the participants. The last number presents the likelihood for being recovered at this node. The annotations at the arrows present the predictor at the preceding node. The last box with entries (14%; 1; .86), for example, indicates that 14% of the participants had a N400 reaction, they were mostly recovered (1) with a likelihood of .86. Patients being younger than 31 years without N400 reaction but with P300 have a probability of recovery of .67. This group consists of 9% of the sample.
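One step of recursive partitioning, as described above, can be sketched in Python. This is not the rpart implementation: it is a minimal illustration that assumes binary 0/1 predictors and uses the Gini impurity as splitting criterion, and the toy data are invented for the example rather than taken from the patient sample.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_binary_split(rows, labels, features):
    """Return (feature, impurity reduction) of the best single split.

    rows is a list of dicts with 0/1 values per feature. Returns None
    if no feature separates the data into two non-empty groups.
    """
    n = len(labels)
    parent = gini(labels)
    best = None
    for feature in features:
        left = [y for row, y in zip(rows, labels) if row[feature] == 0]
        right = [y for row, y in zip(rows, labels) if row[feature] == 1]
        if not left or not right:
            continue  # this feature does not split the node
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if best is None or gain > best[1]:
            best = (feature, gain)
    return best


# Invented toy data: the outcome follows the hypothetical "n400" column.
toy_rows = [
    {"n400": 1, "p300": 1},
    {"n400": 1, "p300": 0},
    {"n400": 0, "p300": 1},
    {"n400": 0, "p300": 0},
]
toy_labels = [1, 1, 0, 0]
print(best_binary_split(toy_rows, toy_labels, ["n400", "p300"]))
```

Applying the same function again within each resulting subgroup yields the binary tree; unlike the logistic regression, each split is chosen successively and separately for each branch.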
In sum, in the present study, which is to our knowledge the first of its kind, it was possible to identify subgroups of patients with DOC with astonishingly high chances of good recovery, which was defined here as, at least, regaining communication skills. Our study further highlights the usefulness of higher cognitive event-related potential measures, like the P300 and the N400. We argue for making these measures a standard examination for patients with DOC since they are relatively accessible, easy to use and comparatively inexpensive, and hold more prognostic information than the standard clinical data. All obtainable information should be used before physicians counsel families about prolonging or terminating a patient's life support.