RETRAUCI database
RETRAUCI is an observational, prospective, and multicentre nationwide registry that currently includes 52 ICUs in Spain. It has the endorsement of the Neurointensive Care and Trauma Working Group of the Spanish Society of Intensive Care Medicine (SEMICYUC) and currently operates in a web-based electronic format [13]. The present study covers a five-year period (2015–2019). Ethics Committee approval for the registry was obtained (Hospital Universitario 12 de Octubre, Madrid: 12/209). Because this was a retrospective analysis of de-identified data, informed consent was not required.
Mortality within the hospital episode was used as the outcome variable.
The variables collected were classified into several groups (Table 1).
First, we considered patient variables, such as age (Age) and sex (Sex). Injury severity by anatomical area was described according to the AIS model, whose scale ranges from 0 (no involvement) to 6 (maximum involvement) [3]. The anatomical areas were head (AHEAD), neck (ANECK), face (AFACE), thorax (ATHORAX), abdomen (AABDOM), spine (ASPINE), upper extremity (AUPPEREXT), lower extremity (LOWEREXT) and external and thermal injuries (AEXTERNAL). We also considered variables derived from the T-RTS, namely the respiratory rate (PointRR), systolic blood pressure (PointSBP) and Glasgow Coma Score (PointGCS), each scored from 0 points (greatest involvement) to 4 points (normality) [4].
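As an illustration of this 0–4 point coding, the sketch below maps raw clinical values to T-RTS points using the standard coded intervals of the Revised Trauma Score; the registry's exact cut-offs are an assumption here and may differ.

```python
# Hypothetical sketch of the T-RTS point coding (PointGCS, PointSBP, PointRR),
# using the standard Revised Trauma Score intervals (an assumption, not the
# registry's verified implementation).

def point_gcs(gcs: int) -> int:
    """Map a Glasgow Coma Score (3-15) to 0-4 T-RTS points."""
    if gcs >= 13: return 4
    if gcs >= 9:  return 3
    if gcs >= 6:  return 2
    if gcs >= 4:  return 1
    return 0

def point_sbp(sbp: float) -> int:
    """Map systolic blood pressure (mmHg) to 0-4 T-RTS points."""
    if sbp > 89:  return 4
    if sbp >= 76: return 3
    if sbp >= 50: return 2
    if sbp >= 1:  return 1
    return 0

def point_rr(rr: float) -> int:
    """Map respiratory rate (breaths/min) to 0-4 T-RTS points."""
    if 10 <= rr <= 29: return 4
    if rr > 29:        return 3
    if rr >= 6:        return 2
    if rr >= 1:        return 1
    return 0

# A patient with normal physiology scores the maximum T-RTS of 12.
t_rts = point_gcs(15) + point_sbp(120) + point_rr(16)  # 12
```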
Next, we included patient treatment variables, such as the use of mechanical ventilation (MV) in the ICU or the occurrence of a massive haemorrhage (MASSIVEHEM) requiring activation of the massive transfusion protocol [14].
Finally, we used variables that define organ failures: hemodynamic failure (HEMODINAM), defined as an SBP below 90 mmHg requiring the administration of volume, blood products and vasoconstrictor support; respiratory failure (RESPIRATORY), defined as a PO2/FiO2 ratio below 300 during admission; renal failure (AKIDNEY), defined as an increase in creatinine to > 1.5 times the initial value, or a 25% reduction in urine flow to less than 0.5 ml/kg/h for at least 6 h; and coagulopathy (COAGULOP), defined as prolongation of the prothrombin and activated partial thromboplastin times to > 1.5 times the control value, fibrinogen < 150 mg/dl, or thrombocytopenia < 100,000 platelets in determinations from the first 24 hours [13,15,16].
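These definitions can be expressed as simple predicate functions. The sketch below is illustrative only: the function and parameter names are hypothetical, and the logic is simplified (e.g. a single worst PO2/FiO2 value rather than serial readings).

```python
# Illustrative derivation of the binary organ-failure variables from raw
# clinical values, following the definitions in the text. Field names and
# the simplified single-reading logic are assumptions.

def respiratory_failure(po2_fio2: float) -> bool:
    """RESPIRATORY: PO2/FiO2 ratio below 300 during admission."""
    return po2_fio2 < 300

def renal_failure(creatinine: float, creatinine_baseline: float,
                  urine_ml_kg_h: float, hours_oliguric: float) -> bool:
    """AKIDNEY: creatinine > 1.5x the initial value, or urine flow
    reduced to < 0.5 ml/kg/h for at least 6 h."""
    return (creatinine > 1.5 * creatinine_baseline
            or (urine_ml_kg_h < 0.5 and hours_oliguric >= 6))

def coagulopathy(pt_ratio: float, aptt_ratio: float,
                 fibrinogen_mg_dl: float, platelets_per_ul: float) -> bool:
    """COAGULOP: PT or APTT > 1.5x control, fibrinogen < 150 mg/dl,
    or platelet count < 100,000 within the first 24 h."""
    return (pt_ratio > 1.5 or aptt_ratio > 1.5
            or fibrinogen_mg_dl < 150 or platelets_per_ul < 100_000)
```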
Conventional statistics
Variables are described as median (interquartile range) or as a percentage. For the comparison of survivors (A-ALIVE) and non-survivors (D-DIED), the Mann-Whitney U test was used for continuous variables and the chi-square test for categorical variables. A p-value < 0.05 was considered significant.
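For readers reproducing this comparison outside a statistics package, both tests are available in SciPy. The example below uses synthetic data and hypothetical variable names, not the registry data.

```python
import numpy as np
from scipy.stats import mannwhitneyu, chi2_contingency

# Synthetic survivor vs non-survivor comparison (illustrative data only).
rng = np.random.default_rng(0)
age_alive = rng.normal(45, 15, 200)   # continuous variable, e.g. age
age_died = rng.normal(60, 15, 80)

# Mann-Whitney U test for a continuous variable.
u_stat, p_cont = mannwhitneyu(age_alive, age_died, alternative="two-sided")

# Chi-square test for a categorical variable (e.g. MV yes/no),
# expressed as a 2x2 contingency table:
#                  MV=yes  MV=no
table = np.array([[60, 140],    # alive
                  [55,  25]])   # died
chi2, p_cat, dof, expected = chi2_contingency(table)

print(f"Mann-Whitney p = {p_cont:.4g}, chi-square p = {p_cat:.4g}")
```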
Machine learning techniques
We used the WEKA platform (version 3.8): its Explorer module to determine the parameters of the different algorithms and its Experimenter module to establish the differences between the algorithms used. Ten-fold cross-validation was used for all algorithms. WEKA's gain ratio feature evaluator provides an initial selection of variables by ranking them according to their importance [17].
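The gain ratio behind WEKA's evaluator is the information gain of an attribute divided by its split information, which penalises attributes with many values. The following is a from-scratch illustration for discrete attributes, not WEKA's implementation.

```python
import numpy as np

def entropy(labels) -> float:
    """Shannon entropy (bits) of a label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(feature, labels) -> float:
    """Information gain of `feature` divided by its split information."""
    feature, labels = np.asarray(feature), np.asarray(labels)
    values, counts = np.unique(feature, return_counts=True)
    weights = counts / counts.sum()
    info_gain = entropy(labels) - sum(
        w * entropy(labels[feature == v]) for v, w in zip(values, weights))
    split_info = float(-(weights * np.log2(weights)).sum())
    return info_gain / split_info if split_info > 0 else 0.0

perfect = gain_ratio([0, 0, 0, 0, 1, 1, 1, 1],
                     [0, 0, 0, 0, 1, 1, 1, 1])   # fully predictive -> 1.0
useless = gain_ratio([0, 0, 1, 1], [0, 1, 0, 1])  # uninformative -> 0.0
```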
Algorithm selection
Of the multiple algorithms included in WEKA, we selected nine supervised algorithms, classified as traditional or ensemble methods. The first six are traditional models: binary logistic regression (LR), a multilayer perceptron neural network (NN), sequential minimal optimization (SMO), classification rules (JRip), classification trees (CT) and Bayesian networks (BN). We also included three ensemble classification algorithms: adaptive boosting (ADABOOST), bootstrap aggregating (BAGGING) and random forest (RFOREST) [17].
For the LR model, we used a backward stepwise selection procedure, with variable entry at p < 0.05 and removal at p < 0.10. Odds ratios (OR) with 95% confidence intervals were calculated.
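An OR and its confidence interval follow from the fitted coefficient: OR = exp(β), with the 95% CI given by exp(β ± 1.96·SE(β)). The sketch below shows this calculation on synthetic data with a near-unpenalised scikit-learn fit; it is not the registry model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic single-predictor example (illustrative data, not registry data).
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=(n, 1))
logit = -1.0 + 0.8 * x[:, 0]                  # true beta = 0.8, OR ~ 2.23
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression(C=1e6).fit(x, y)   # large C ~ no regularisation
beta = model.coef_[0, 0]

# Standard error from the inverse Fisher information at the fitted model.
X = np.hstack([np.ones((n, 1)), x])           # design matrix with intercept
p = model.predict_proba(x)[:, 1]
cov = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
se = np.sqrt(cov[1, 1])

odds_ratio = np.exp(beta)
ci = np.exp([beta - 1.96 * se, beta + 1.96 * se])
print(f"OR = {odds_ratio:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")
```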
In the CT model, we used the J48 algorithm, based on C4.5, to obtain a pruned tree [18]. The JRip algorithm uses the Repeated Incremental Pruning to Produce Error Reduction (RIPPER) rule learner [19]. We limited tree growth (CT) and the number of rules (JRip) by requiring a minimum of 20 instances.
For the BN, we used the TAN (Tree Augmented Naive Bayes) search algorithm to find relations between variables, which generates an interpretable graph. This method does not assume the independence of the variables [20,21].
SMO implements John Platt's sequential minimal optimization algorithm for training a support vector classifier [22]. For the NN, we used automatic selection of the number of nodes in the hidden layer, with a learning rate of 0.3 and a momentum of 0.2 [23]. For RFOREST, we selected ten trees built with the C4.5 algorithm [24]. For the remaining algorithms (ADABOOST and BAGGING), we used WEKA's default parameters [17,25].
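For readers working outside WEKA (which is Java-based), most of these models have rough scikit-learn analogues; JRip and TAN have no direct equivalent there. The sketch below evaluates such analogues with ten-fold cross-validation on synthetic data, mirroring the stated parameters where possible; the mapping between WEKA and scikit-learn models is an approximation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the registry data (illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "NN": MLPClassifier(solver="sgd", learning_rate_init=0.3, momentum=0.2,
                        max_iter=500, random_state=0),
    "SMO~SVC": SVC(kernel="linear"),            # linear SVM, SMO-trained in WEKA
    "CT": DecisionTreeClassifier(min_samples_leaf=20),  # ~20-instance minimum
    "ADABOOST": AdaBoostClassifier(random_state=0),
    "BAGGING": BaggingClassifier(random_state=0),
    "RFOREST": RandomForestClassifier(n_estimators=10, random_state=0),
}

# Ten-fold cross-validated accuracy for each model.
scores = {name: cross_val_score(m, X, y, cv=10).mean()
          for name, m in models.items()}
for name, acc in scores.items():
    print(f"{name:9s} {acc:.3f}")
```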
Algorithm evaluation
To evaluate the performance of the algorithms, we calculated accuracy, specificity, precision, recall, the F-measure and the area under the ROC curve (AUC). WEKA's Experimenter module, with ten repetitions, allows one to establish whether there are statistically significant differences between the evaluated properties of the algorithms using the corrected paired t-test [17].
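These metrics can be reproduced directly from a model's predictions, as in the sketch below with hypothetical labels and scores (specificity has no dedicated scikit-learn function and is derived from the confusion matrix).

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

# Hypothetical predictions: y_score would be a model's predicted
# probability of in-hospital death, y_pred its thresholded class.
y_true  = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_pred  = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([.1, .2, .6, .3, .8, .9, .4, .2, .7, .1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)        # true-negative rate

print(f"accuracy    {accuracy_score(y_true, y_pred):.2f}")   # 0.80
print(f"specificity {specificity:.2f}")                      # 0.83
print(f"precision   {precision_score(y_true, y_pred):.2f}")  # 0.75
print(f"recall      {recall_score(y_true, y_pred):.2f}")     # 0.75
print(f"F-measure   {f1_score(y_true, y_pred):.2f}")         # 0.75
print(f"AUC         {roc_auc_score(y_true, y_score):.2f}")   # 0.96
```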