Validation and Optimization of the Trigger Tool for the Detection of Adverse Events in General Surgery.



Abstract

Background
General surgery is an area with a high incidence of adverse events, which entail increased use of resources in addition to the injury caused to the patient. The preparation of specific tools in the field of patient safety, and specifically to detect adverse events, is a priority of healthcare systems. Working from the hypothesis that the trigger tool is effective for detecting adverse events in general surgery, this study aimed to validate the tool and propose an optimized version.

Methods
Observational, descriptive, retrospective and multicenter national study in which the trigger tool (40 triggers) was applied to patients who underwent surgery in general surgery departments.
A descriptive analysis was performed. The tool's sensitivity and specificity were studied to assess its predictive capacity. A prediction model based on binary logistic regression was used for the proposed optimization.

Results
A total of 31 hospitals took part. The prevalence of adverse events was 31.53%.
The tool revealed sensitivity and specificity of 86.27% and 79.55%, respectively. A total of 12 triggers comprised the optimized model. An area under the curve of 79.29% was obtained.

Conclusions
The trigger tool is extremely effective for detecting adverse events in surgery. The optimized model significantly reduces the number of triggers used and upholds

Background
Identification of adverse events (AE) is relevant for patient safety. The overall rate of AE during hospitalization ranges from 3% to 17%, of which approximately 50% are deemed preventable.
AE entail a clinical impact and increased use of resources. The most expensive are surgical AE, medication-related AE and those due to diagnostic delay.
Surgical units are the areas with the highest frequency of AE. Surgery-related AE affect 1.9-3.6% of patients admitted to hospital, which represents 46-65% of all AE in hospitalization 3.
The most common methods for detecting AE (incident reporting, incident records and clinical-administrative databases) tend to underestimate the actual number of AE. Since the publication of the Harvard Medical Practice Study (HMPS), retrospective review has been the most commonly used methodology for studying AE.
In 2006 the Institute for Healthcare Improvement (IHI) encouraged healthcare systems to implement the Global Trigger Tool to measure and monitor injury to the patient.
In general surgery the trigger tool has shown sensitivity and specificity of 86.0% and 93.6%, respectively, which indicates that it is highly effective for detecting AE 2.
Developing a specific tool that identifies AE quickly, effectively and at low cost would be of great use in surgery.
The aim of this study is to validate a set of predictive "triggers" for AE in patients operated on in General Surgery and Gastrointestinal System (GSGS) departments.

Study design
Observational, descriptive study with analytical, retrospective and multicenter components to validate the trigger tool for detection of AE in GSGS.
A total of 31 acute care hospitals from the public health system took part in the study (sampling by convenience).
Patients aged over 18 admitted to GSGS from 01-09-2017 to 31-05-2018 who underwent surgery, with full and closed clinical histories and hospital discharge from the same hospital, were included.
Psychiatric patients, transplant recipients and patients referred from other hospitals were excluded.
The sample size was calculated assuming an estimated probability of 90% for detection of AE 2, an estimated population of 80,000 patients, a 95% confidence level and a precision of 0.02. The resulting sample size was 855 histories, selected at random and distributed among the hospitals taking part. The sample was enlarged to allow for possible case losses and incomplete information.
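As a rough check on the sample size quoted above, the sketch below applies the standard formula for estimating a proportion with a finite population correction. The paper does not state which formula was used, so this choice is an assumption, and the function name `sample_size_proportion` is purely illustrative.

```python
from statistics import NormalDist


def sample_size_proportion(p, precision, population, confidence=0.95):
    """Sample size for estimating a proportion, with finite population correction.

    Assumed formula (not stated in the paper):
        n0 = z^2 * p * (1 - p) / d^2
        n  = n0 / (1 + (n0 - 1) / N)
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / precision ** 2
    return n0 / (1 + (n0 - 1) / population)


# Inputs taken from the text: 90% expected detection, precision 0.02,
# population of 80,000 patients, 95% confidence.
n = sample_size_proportion(p=0.90, precision=0.02, population=80_000)
print(round(n))
```

Under these assumptions the result rounds to 855 histories, in line with the figure reported in the text.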

Instrumentation
The trigger tool (TT) was applied to detect AE. A total of 40 triggers were included (Table 1).
AE injury categories were assigned using the classification of the National Coordinating Council for Medication Error Reporting and Prevention (Figure 1).
A screening guide setting out the criteria for identifying triggers and AE was produced, together with a training video tutorial. Where necessary, the training was completed with an individual tutorial.

Review process
Each center had at least two reviewers.
Clinical histories were reviewed in accordance with the screening guide to identify triggers. Both histories that contained triggers and those that did not were reviewed to search for AE. The same information sources and review sequences were used.
Information sources were clinical discharge reports, surgical procedure protocols, medical and nursing clinical course observations from the patient's admission to 30 days post-discharge, reports of additional tests and prescription of medicines.
An AE was defined as any harmful and unintended event that occurred to the patient as a consequence of healthcare practice and was unrelated to their illness.
When an AE was detected, an injury category was assigned and the degree to which it could have been prevented was assessed. The classification used in the ENEAS study was adapted to determine the preventability of the AE.
The study data and variables were recorded in an online database (REDCap). Confidentiality rules were upheld. This study was approved by the coordinating site's ethics committee.

Statistical method
The descriptive analysis used the mean, median and standard deviation for continuous variables and frequency distributions for categorical variables.
The most important variables were compared by means of the non-parametric Mann-Whitney U test, the chi-squared test or Fisher's exact test.
To measure the predictive validity of the tool for detecting AE, diagnostic sensitivity and specificity were used, in addition to positive predictive value (PPV) and negative predictive value (NPV).
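The four measures named above all derive from a 2x2 table that crosses the trigger result (at least one trigger present or none) with the presence of an AE. The following is a minimal sketch of those definitions. The counts are hypothetical: they were chosen only so that the output approximates the percentages reported in this paper and are not the study's actual table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts.

    tp: histories with triggers and an AE     fp: triggers but no AE
    fn: histories without triggers but an AE  tn: no triggers and no AE
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only (not taken from the study data).
metrics = diagnostic_metrics(tp=308, fp=155, fn=49, tn=603)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Note that specificity and NPV can only be computed when histories without triggers are also reviewed, as was done in this study.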
A prediction model based on binary logistic regression was used for the proposed optimization of the tool. The occurrence of AE was entered as the dependent variable and the triggers as independent variables. Only the triggers that were statistically significant on bivariate analysis were included.
The model's results are shown in the form of odds ratios (95% confidence interval [CI]). The model's discriminatory power was assessed by means of the area under the receiver operating characteristic (ROC) curve.
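The two quantities used to report the model can be sketched in a few lines. The first function converts a logistic regression coefficient and its standard error into an odds ratio with a Wald 95% CI; the second computes the area under the ROC curve as the proportion of (AE, no-AE) pairs in which the patient with an AE receives the higher predicted probability. Function names, the coefficient and the toy predictions are all hypothetical; the study itself fitted its models in STATA.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical coefficient for a single trigger (illustration only).
or_, lower, upper = odds_ratio_ci(beta=math.log(2.0), se=0.25)
print(f"OR {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")

# Toy predicted probabilities and observed outcomes (1 = AE, 0 = no AE).
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(f"AUC {auc:.2f}")
```

The pairwise definition is quadratic in the number of patients, which is adequate for a sample of this size; rank-based formulas give the same value more efficiently.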
The prediction model was repeated for relevant clinical entities such as preventable and severe AE and most common procedures.
P<0.05 was considered statistically significant for all analyses.
Data were entered by each center's reviewers into the REDCap database. The statistics program STATA/SE v10.0 was used.
This study has been funded by Instituto de Salud Carlos III through the project "PI17/01374" (Co-funded by European Regional Development Fund/European Social Fund; "A way to make Europe"/"Investing in your future").
The project was approved by the ethics committee of the study coordinating center.

Results
Mean stay was 6.5 days (standard deviation 14.32). Scheduled procedures accounted for 73.7% of the surgical procedures and emergency procedures for 26.1%.

Behavior of the tool
The tool revealed sensitivity and specificity of 86.27% and 79.55%, respectively. PPV and NPV were 66.52% and 92.48%, respectively. For severe AE, sensitivity and specificity were 100% and 26.5%, respectively. For preventable AE, sensitivity and specificity were 90.3% and 66.9%, respectively. Table 2 shows the 38 triggers that were statistically significantly associated with the onset of AE on bivariate analysis, together with their frequency.
The triggers included in the optimized models are shown in Table 3. The model for total AE had 12 triggers. Its predictive capacity is shown in Table 4; its area under the ROC curve was 83.36% (CI 81.14%-85.83%).
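Predictive values depend on prevalence as well as on sensitivity and specificity. The sketch below illustrates that relationship through Bayes' theorem, using the rounded percentages reported in this paper (sensitivity 86.27%, specificity 79.55%, prevalence 31.53%). It is an illustration rather than the authors' calculation: small differences from the reported PPV and NPV are to be expected from rounding and from the exact denominators used in the study, which are not given in the text.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV derived from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # false positive fraction
    fn = (1 - sensitivity) * prevalence        # false negative fraction
    tn = specificity * (1 - prevalence)        # true negative fraction
    return tp / (tp + fp), tn / (tn + fn)


ppv, npv = predictive_values(sensitivity=0.8627, specificity=0.7955, prevalence=0.3153)
print(f"PPV ~ {ppv:.1%}, NPV ~ {npv:.1%}")
```

Both values land within one percentage point of the reported 66.52% and 92.48%.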

Adverse events
The prevalence of AE was 31.53% (357 patients). There was a total of 599 AE. A total of 69 patients presented a second AE (6.10%) and 28 a third AE (2.47%). A total of 16 patients had four or more AE (1.41%).
The most commonly observed AE were infections (35%). The most common was surgical site infection, followed by paralytic ileus, intraabdominal abscess and anastomotic fistula.
The category of AE injury is shown in Graph 1. A total of 34% of AE were deemed preventable.

Discussion
The most important contribution of this study is validation of the TT in GSGS and the proposal for the first time of an optimized model. This enables detecting AE more efficiently, which is extremely useful to improve patient safety.

Methodology for validation of the trigger tool
TT validation studies have been performed in other specialties. Some works have also published results on optimization of the tool in different areas. To date, this study is the first to validate the TT in GSGS and also the first to propose an optimized model for this specialty.
One of the methods used to validate the tool was expert opinion, gathered through Delphi-like surveys on the triggers included in an initial proposal.
In some of these studies the final model included the triggers with a PPV greater than 5% 19. In others, a subsequent validation study was performed by calculating false negatives in a random sample 20.
Some works report the review of trigger histories. This is the case of the Israeli study on the TT for medication-related AE. The optimized model it proposed was based on a PPV over 10% and the opinion of a panel of experts, which removed four of the 17 initial triggers. That study only reports AE related to medication and its final model is not based on multivariate statistical analysis.

Predictive capacity of the optimized tool
Regarding the predictive capacity of optimized models, the study whose results are most similar to ours is the one that uses a similar methodology. In the study by Griffey, the model's area under the curve was 82% with 12 triggers, compared with 83.6% in our study.
The PPV of our model (66%) is much higher than that reported in the remaining publications, which used other methodologies and obtained PPV of 28.5% 19 and 22.1% 22, and in which the selection of triggers is not sufficiently accurate.
The studies identified to date do not report the specificity or NPV of the tools used, because histories that did not contain triggers were ruled out and not reviewed.

Adverse events
The prevalence of AE detected in our study is greater than that reported in studies on AE 17 but similar to that reported in studies where the trigger methodology was used (7-40% of hospitalized patients) 20.
A scoping review performed by Schwendimann et al. concluded that half of AE were deemed preventable, compared with 34% in our study 7. The variability and subjectivity of judgments on the preventability of AE have been discussed previously, and it has been recommended not to use this kind of measure.
In regard to the severity of AE, the most common injury category was F with 58%, followed by category E. These outcomes coincide with those reported in the literature 24.

Limitations
The national study required a large number of reviewers, so there may be a certain degree of variability between reviewers.
The use of HT to identify AE may not capture all AE and information sources may not be reliable. These limitations are part of the IHI's own methodology.

Strengths
Validation was performed in a multicenter study including different kinds of hospitals inside the national health system. To limit variability between reviewers, there was a special focus on reviewer training and homogenization of criteria, with close tutoring by the research team.

Conclusions
The TT proposed in this study is effective for detecting adverse events in general surgery and shows high sensitivity and specificity.
The tool's optimized model has good predictive capacity.

Confidence interval (CI)

Declarations

FUNDING
This study was subsidized by the European Regional Development Fund ("A way to make Europe") by means of a research grant awarded by the Spanish Ministry of Economy, Industry and Competitiveness of the Government of Spain, through Carlos III Health Institute.

STATEMENTS TO COMPLY WITH ETHICS REQUIREMENTS
The research project was approved by the 12 de Octubre University Hospital Ethics Committee.

CONFLICTS OF INTEREST
None of the authors has any conflict of interest to declare in regard to the performance or publication of this work.

INFORMED CONSENT
According to the ethics committee's assessment and given the research project's observational, descriptive and non-interventionist methodology, informed consent was not required from the patients included in the study.
We acted in accordance with the prevailing data protection law and rules that regulate the processing of patient information by the Spanish health system.

AUTHOR CONTRIBUTIONS
AIPZ was the principal investigator of the study and, together with ERC and PRL, developed the research project that was eligible for the Carlos III Institute grant.
MFB was in charge of coordinating the team of collaborators in the centers participating in the study. CLS and EFH facilitated contact with reference centers and provided participants with methodological support to resolve doubts. Lastly, CMA, MTG and AIPZ performed the statistical analysis.

Not applicable
This manuscript does not report on or involve the use of any animal or human data or tissue.

Figure 1. AE by injury category
Category E: temporary injury to the patient that requires intervention; category F: temporary injury to the patient that requires readmission or prolonged hospital stay; category G: permanent injury to the patient; category H: injury that requires essential intervention to keep the patient alive; category I: injury that leads to the patient's death.