FOUR score versus GCS in patients with traumatic brain injury in the prehospital setting

Background: In the last few decades, different coma scoring scales have been proposed. The purpose of this study is to compare two coma scales, the Glasgow Coma Scale (GCS) and the Full Outline of UnResponsiveness (FOUR) score, to examine which scale better predicts outcome in traumatic brain injury (TBI) patients in the prehospital setting.
Methods: We evaluated the GCS and FOUR scores at three different prehospital time points and reassessed the scores in surviving patients 24 hours, one month and three months after the injury. Then, we compared the outcomes. We used the χ² method, and based on the analysis with the best cut-off point for each model, we calculated the sensitivity, specificity and correct prediction of outcomes with four severity scores. The Youden index, Z score, McNemar's test and ROC curve were also assessed. P < 0.05 was considered statistically significant. Both scales were ranked with gain ratios.
Results: We included 200 TBI patients who were treated in a prehospital setting by a specialized prehospital medical unit. In terms of the predictions of positive outcomes, our study showed the following: obtained
Conclusions: The results of our research confirm that there are no practical or clinical differences between the GCS and FOUR scores in terms of predicting mortality outcomes 24 hours, one month, and three months after injury. No statistically significant differences were found in the Youden index or the area under the ROC curve 24 hours, one month or three months after the injury.

The FOUR score has been developed [7] to overcome these shortcomings and to provide further neurological details that might lead to a better prediction of outcomes in coma patients.
The FOUR score has four components: eye responses, motor responses, brainstem reflexes, and respiration patterns. Each component ranges from a minimal value of 0 to a maximal value of 4 [7]. The total FOUR score ranges from a minimum of 0 to a maximum of 16 [9]. The components are described in Table 1.
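As an illustration of the scoring arithmetic described above, the following minimal Python sketch sums the four components into the total score. The class and field names are hypothetical and chosen for this example only; the item-level definitions of each component remain those given in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourScore:
    """Hypothetical container for the four FOUR score components (each 0 to 4)."""

    eye: int
    motor: int
    brainstem: int
    respiration: int

    def __post_init__(self) -> None:
        # Each component must lie in the 0..4 range stated in the text.
        for name in ("eye", "motor", "brainstem", "respiration"):
            value = getattr(self, name)
            if not 0 <= value <= 4:
                raise ValueError(f"{name} must be between 0 and 4, got {value}")

    @property
    def total(self) -> int:
        # The total is the plain sum of the components, so it ranges from 0 to 16.
        return self.eye + self.motor + self.brainstem + self.respiration


fully_responsive = FourScore(eye=4, motor=4, brainstem=4, respiration=4)
unresponsive = FourScore(eye=0, motor=0, brainstem=0, respiration=0)
```

Here `fully_responsive.total` is 16 and `unresponsive.total` is 0, matching the range given above.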

Methods
The present study was approved by the National Medical Ethics Committee of the Republic of Slovenia. All procedures were performed in accordance with the relevant guidelines and regulations. We performed a prospective observational cohort study over a one-and-a-half-year period, from March 2012 to September 2013, in a prehospital setting in a large community area. TBI patients were treated and evaluated by emergency prehospital medical unit personnel. The arrival time to the nearest regional hospital was up to 15 minutes. All patients were treated according to the Advanced Trauma Life Support (ATLS) guidelines.
We included patients with minor or moderate-to-severe TBI and altered mental status and/or coma who were either polytraumatized or had isolated head injuries. We excluded patients under the age of 18, patients who required cardiopulmonary resuscitation (CPR) and patients who died before arriving at the hospital.
We evaluated the GCS and FOUR scores in the prehospital setting at three different time points: immediately upon first contact with the patient at the scene, after management of the patient by the prehospital medical unit, and during patient handover by the ambulance staff at the hospital (Table 2).
We included 200 patients with TBI in our study (133 men and 67 women). The sample size was determined with a power analysis. An overview of the clinical characteristics of the study cohort is shown in Table 3.
The sensitivity, specificity, correct prediction and Youden's J-statistic (index) [13] were obtained with two-by-two tables. Youden's J-statistic was used to assess the performance of a dichotomous diagnostic test. For each scale and each outcome, two-by-two tables were constructed for all possible cut-off points. We calculated the sensitivity (true-positive rate) as the proportion of survivors whose score was equal to or higher than the selected cut-off point, and the specificity (true-negative rate) as the proportion of nonsurvivors whose score was lower than the selected cut-off point. The best cut-off points for each of the outcomes were further assessed and compared pairwise. The percentages of correct predictions of outcomes were obtained according to these cut-off points. For each score, receiver operating characteristic (ROC) curves were obtained [14][15][16][17]. The greater the area under the ROC curve, the better the scoring system. Data were analysed with IBM SPSS Statistics for Windows, version 21.0 (IBM Corp., Armonk, NY). The outcome prediction data were compared to the observed data using McNemar's test. The comparisons of the areas under the ROC curves and the analyses of the differences in the Youden index were performed using the method described by . P < 0.05 was considered statistically significant.
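The cut-off analysis can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the function names and the toy data are hypothetical, survival is treated as the positive outcome, and a patient is predicted to survive when the score is equal to or higher than the cut-off.

```python
from typing import NamedTuple, Sequence


class CutoffResult(NamedTuple):
    cutoff: int
    sensitivity: float
    specificity: float
    youden_j: float
    correct: float


def evaluate_cutoff(scores: Sequence[int], survived: Sequence[bool], cutoff: int) -> CutoffResult:
    """Build the two-by-two table for one cut-off and derive the summary measures."""
    pairs = list(zip(scores, survived))
    tp = sum(1 for s, alive in pairs if alive and s >= cutoff)      # survivors predicted to survive
    fn = sum(1 for s, alive in pairs if alive and s < cutoff)       # survivors predicted to die
    tn = sum(1 for s, alive in pairs if not alive and s < cutoff)   # nonsurvivors predicted to die
    fp = sum(1 for s, alive in pairs if not alive and s >= cutoff)  # nonsurvivors predicted to survive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both survivors and nonsurvivors are required")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    youden_j = sensitivity + specificity - 1  # Youden's J-statistic
    correct = (tp + tn) / len(pairs)          # share of correct predictions
    return CutoffResult(cutoff, sensitivity, specificity, youden_j, correct)


def best_cutoff(scores: Sequence[int], survived: Sequence[bool], max_score: int) -> CutoffResult:
    """Scan every cut-off from 1 to max_score and keep the one with the highest J."""
    results = [evaluate_cutoff(scores, survived, c) for c in range(1, max_score + 1)]
    return max(results, key=lambda r: r.youden_j)


# Toy data only (not study data): GCS-like scores and survival status.
toy_scores = [15, 14, 13, 12, 9, 7, 8, 6, 3]
toy_survived = [True, True, True, True, True, True, False, False, False]
best = best_cutoff(toy_scores, toy_survived, max_score=15)
```

With this toy data, the scan selects a cut-off of 9, at which five of six survivors and all three nonsurvivors are classified correctly.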

Results
Within 24 hours after the injury, 6 patients (3%) died. One month after the injury, we registered 23 deceased patients (11.5%), and three months after the injury, we recorded a cumulative total of 25 deceased patients (12.5% of the total number of studied patients). Upon first contact at the scene, we assessed the FOUR 1 score and the GCS 1 score. The average score upon first contact at the scene was 14/16 for the FOUR 1 model and 12.6/15 for the GCS 1 model. The prehospital medical unit performed head-to-toe immobilization and fixation, placed an intravenous (IV) line for IV therapy and administered supplemental oxygen. Among all patients, 33 out of 200 (16.5%) were endotracheally intubated (with appropriate sedation, analgesia and muscle relaxation), mechanically ventilated and monitored with end-tidal CO2. No missing data were present in the final dataset. Table 3 shows an overview of the clinical characteristics of the study cohort and presents data obtained from the assessment of the two components present in both scales: the eye and motor responses. Both the eye and motor responses of the two scales were compared to the severity of TBI (mild, moderate, severe and other). We additionally assessed outcomes after brain injury using the Glasgow Outcome Scale Extended (GOS-E), which has eight levels: 8 and 7 for good recovery, 6 and 5 for moderate disability, 4 and 3 for severe disability, 2 for persistent vegetative state and 1 for death.
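The eight GOS-E levels listed above can be grouped into outcome categories with a simple lookup. The following Python sketch is illustrative only: the dictionary and function names are hypothetical, and it uses the standard GOS-E name "moderate disability" for levels 5 and 6.

```python
# Hypothetical lookup from GOS-E level (1..8) to the outcome category named in the text.
GOSE_CATEGORIES = {
    8: "good recovery",
    7: "good recovery",
    6: "moderate disability",
    5: "moderate disability",
    4: "severe disability",
    3: "severe disability",
    2: "persistent vegetative state",
    1: "death",
}


def gose_category(level: int) -> str:
    """Return the outcome category for a GOS-E level, rejecting values outside 1..8."""
    if level not in GOSE_CATEGORIES:
        raise ValueError(f"GOS-E level must be between 1 and 8, got {level}")
    return GOSE_CATEGORIES[level]


example_category = gose_category(4)  # "severe disability"
```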
The best cut-off points for predicting outcomes within 24 hours after the injury were 12 for the FOUR 1 model and 8 for the GCS 1 model, evaluated immediately upon first contact at the scene (1: first assessment). Tables 4a, 4b, and 4c provide data regarding the sensitivity, specificity and correct predictions of outcome 24 hours after the injury for the best cut-off points for the Youden index. We observed no statistically significant differences within 24 hours after the injury between the FOUR score and the GCS for assessments 2 and 3 (2: after initial management and intervention of the patient, 3: during patient handover by the ambulance staff at the hospital).
These results, according to McNemar's test, confirmed that the FOUR 1 model and the GCS 1 model showed slightly better predictive power in terms of patient outcome (Table 4). The best Youden index values after 24 hours were 0.85 for the FOUR 1 model and 0.88 for the GCS 1 model (Table 4). The area under the ROC curve (area ± standard error) obtained after 24 hours was 0.94 ± 0.02 for FOUR 1 and was almost the same at 0.93 ± 0.02 for GCS 1 (Table 4). We conclude that no differences in the Youden index or area under the ROC curve 24 hours after the injury were found.
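For readers unfamiliar with McNemar's test, the comparison of predicted and observed outcomes can be sketched as below. This is a hedged illustration using the exact (binomial) form of the test; the source does not state which form was computed in SPSS, and the function names and example data are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(predicted: Sequence[bool], observed: Sequence[bool]) -> Tuple[int, int]:
    """Count the two kinds of disagreement between predicted and observed survival."""
    b = sum(1 for p, o in zip(predicted, observed) if p and not o)  # predicted survival, patient died
    c = sum(1 for p, o in zip(predicted, observed) if not p and o)  # predicted death, patient survived
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value based on the discordant pairs only."""
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so there is nothing to test
    k = min(b, c)
    # Under the null hypothesis, each discordant pair falls in either cell with probability 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy example (not study data): one discordant pair of each kind.
b, c = discordant_counts([True, True, False, True], [True, False, True, True])
p_value = mcnemar_exact(b, c)
```

Concordant pairs do not enter the calculation, which is why only the two discordant cells are counted.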
In addition to the 24-hour follow-up outcomes, we also evaluated the GCS and FOUR scores in surviving patients one and three months after the injury.
The two scores were also compared by drawing ROC curves to avoid bias of arbitrary cut-off points. The comparison of the GCS score and the FOUR score in patients who survived for 24 hours (A) and for one (B) and three months (C) after the injury in regard to outcomes showed only a minimal significant difference in the correct prediction of outcomes in surviving patients 24 hours after the injury (Figure 1).
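The area under the ROC curve summarizes performance over all cut-off points at once. The sketch below computes it as the proportion of survivor and nonsurvivor pairs in which the survivor has the higher score (ties count as one half), and approximates its standard error with the Hanley and McNeil formula. Both choices are assumptions made for illustration; the study obtained these values in SPSS, and the function names and toy data are hypothetical.

```python
from math import sqrt
from typing import Sequence


def roc_auc(scores: Sequence[float], survived: Sequence[bool]) -> float:
    """AUC as the share of (survivor, nonsurvivor) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, alive in zip(scores, survived) if alive]
    neg = [s for s, alive in zip(scores, survived) if not alive]
    if not pos or not neg:
        raise ValueError("both survivors and nonsurvivors are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Approximate standard error of an AUC (Hanley and McNeil approximation)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(max(variance, 0.0))  # guard against tiny negative rounding errors


# Toy data only (not study data).
toy_scores = [15, 14, 13, 12, 9, 7, 8, 6, 3]
toy_survived = [True, True, True, True, True, True, False, False, False]
toy_auc = roc_auc(toy_scores, toy_survived)
toy_se = hanley_mcneil_se(toy_auc, n_pos=6, n_neg=3)
```

In the toy data, 17 of the 18 survivor and nonsurvivor pairs are ranked correctly, so the area is 17/18.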
Analysis performed one month after the injury showed that the best cut-off point was 8 for the FOUR 2 and FOUR 3 models and 11 for the GCS 3 model (Table 5). According to McNemar's test, these three models had the highest outcome prediction value among all combined models. The Youden index at the best cut-off point was the same at 0.81 for the FOUR 2, FOUR 3 and GCS 3 models (Table 5). The area under the ROC curve was the same at 0.91 ± 0.04 for both the FOUR 2 and FOUR 3 models and was 0.95 ± 0.05 for the GCS 3 model (Table 5). No differences in the Youden index or area under the ROC curve were found.
Data obtained from the analysis performed three months after the injury showed that among all combined models and according to McNemar's test, the best models were the FOUR 2 and FOUR 3 (both cut-off values of 12) as well as GCS 1 (cut-off value of 12) and GCS 2 (cut-off value of 9) models (Table 6). The Youden index values were 0.77 for the FOUR 2, FOUR 3 and GCS 2 models and 0.78 for the GCS 1 model (Table 6).
The area under the ROC curve was 0.89 ± 0.04 for the FOUR 2 and FOUR 3 models, 0.93 ± 0.02 for the GCS 1 model and 0.95 ± 0.02 for the GCS 2 model (Table 6). No significant differences in the Youden index or area under the ROC curve were found.

Discussion
Impaired consciousness is present in many injured patients. Efforts are directed to evaluate the depth of consciousness in these patients for proper management and prediction. The aim of the study was to compare the GCS and FOUR scores and to verify their ability to predict outcomes in TBI coma patients outside the hospital setting.
Since its introduction in 1974 [1], the GCS has been widely used in the prehospital setting. In 2005, the FOUR score [7] was proposed to reduce some limitations of the GCS. Currently, the FOUR score is mostly employed in intensive and neurological care units.
The advantages of the FOUR score have been assessed by Wijdicks et al. [7], especially in neurologically critically ill patients who are intubated. When we compared the GCS and FOUR scores, we noticed a key difference: the verbal response is not an intrinsic part of the FOUR score. Taking this into consideration, the FOUR score is very useful in intubated patients [7]. Intubation is a common procedure after injury. The FOUR score tests essential brainstem reflexes and provides information about the degree of brainstem injury that is not registered with the GCS. The FOUR score can distinguish a locked-in syndrome and a possible vegetative state [7] and includes signs suggestive of uncal herniation [7]. The evaluation of respiratory patterns in the FOUR score may also add information about the presence of respiratory drive [7]. Studies have also shown that the in-hospital outcomes between the scales were better for the lowest total FOUR scores than for the GCS scores [7]. Conclusions obtained in previous studies have shown that the FOUR score is an accurate predictor of outcome in TBI patients [8], that it has some advantages over the GCS [8] and that it can be performed in a variety of ICU contexts [8]. The FOUR score is easily taught and simple to administer, and it provides essential neurologic information that allows for an accurate assessment of patients with altered consciousness with excellent interrater agreement among medical intensivists [9]. The FOUR score might be a better prognostic tool for ICU outcomes than the GCS, most likely because it integrates brainstem reflexes and respiration [10]. Other studies have shown the predictive value of the FOUR score on admission in patients after moderate and severe TBI [11]. These studies also showed that the predictive ability for the primary outcome 2 weeks after injury was no better than that with the GCS score [11].
For nontraumatic comatose patients, different parameters as predictors of outcome in the prehospital environment were also studied [12].
Analyses performed between the GCS and FOUR scores in the hospital environment have demonstrated that the GCS is missing the key essential elements of a comprehensive neurological examination for comatose patients [18]. In the same study, it was confirmed that the FOUR score maintained simplicity and, at the same time, provided far better information [18]. Other previous studies have demonstrated that the GCS and FOUR scores show comparable results in the assessment of patients with traumatic brain injury [19]. These data show that there are excellent statistical correlations between the two scoring systems [19]. Moreover, these studies show that the FOUR score provides better details regarding the neurological status of patients [19].
The results can be considered clinically relevant because of the strong statistical association obtained as well as the agreement in the literature [19]. Overall, there are currently multiple scores used to determine the prognosis of patients in intensive care units. However, a scoring system should be simple, reliable, and predictive of morbidity and outcome.
Due to the different categories of scores, the FOUR score is more effective in evaluating patients who are unconscious and dependent on mechanical ventilation. Prospective studies with larger cohorts of patients treated in various intensive care units for longer durations are needed to evaluate whether the application of these scales influences functional and cognitive outcomes [20].
In addition, further comparative neurological outcome studies also showed that the outcome of patients admitted to the ICU was significantly higher when the GCS or the FOUR score was used [21]. Discrimination was fair for both scores, but the FOUR score was superior to the GCS [21]. Calibration was better for the FOUR score than for the GCS in the ICU [21]. The sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were also better for the FOUR score than for the GCS [21]. Good correlation was observed between the two scores [21].
A comprehensive overview of the relationship between a patient's FOUR score and outcome is still lacking. A recent study on the FOUR score showed that the FOUR score had a close overall relationship with in-hospital outcomes and poor functional outcomes in patients with impaired consciousness [22]. This research also claimed that there was insufficient evidence to determine whether performance was modified in different groups, and there was some suggestion that the assessment of brainstem reflexes and respiratory patterns made less of a contribution than eye and motor scores [22].
In the present study, our data showed no statistically significant difference in terms of the correct prediction of outcome 24 hours after the injury. We found no statistically significant differences in the Youden index or area under the ROC curve after 24 hours, one month or three months after the injury.
In our opinion, we should obtain a better understanding of the anatomical and pathophysiological pathways that are not evidenced by certain GCS and FOUR scores. Further research should be focused on the comparison between the obtained GCS and FOUR score data and the anatomical substrate changes revealed by diagnostic tools such as head CT scans and brain fMRI. With these data, we could obtain the accurate subanatomical and clinical information needed to perform specific invasive therapy to lead to a far better outcome for patients.

Conclusions
The present study involved a comparison between the GCS and FOUR scores in TBI patients in out-of-hospital scenarios at different follow-up times. We introduced and compared different models for the prediction of mortality outcome 24 hours after the injury and re-evaluated the predictive ability one and three months after the injury.
The results of our research confirm that there are no practical or clinical differences between the GCS and FOUR scores in terms of predicting mortality outcomes 24 hours, one month, and three months after injury. Due to the different assessment categories, the FOUR score is more effective for evaluating patients who are unconscious and dependent on mechanical ventilation.
We believe that the FOUR score has promising predictive outcome potential and could be regularly performed in the prehospital setting, especially in intubated patients with brain injuries.

The present study was approved by the National Medical Ethics Committee of the Republic of Slovenia. We recruited only patients who gave signed consent to participate. Informed signed consent was obtained from the patients or their guardians/caregivers.

Table 5. Sensitivity, specificity, ROC area, and correct prediction of outcomes 1 month after injury for selected cut-off points in the FOUR and GCS models based on the best Youden index. 2: after initial management and intervention of the patient, 3: during patient handover by the ambulance staff at the hospital.

Table 6. Sensitivity, specificity, ROC area, and correct prediction of outcomes 3 months after injury for selected cut-off points in the FOUR and GCS models based on the best Youden index. 1: immediately upon first contact at the scene, 2: after initial management and intervention of the patient, 3: during patient handover by the ambulance staff at the hospital.