This systematic review and meta-analysis was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [37]. The protocol is available in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42019127181).
Search strategy
MEDLINE (OVID interface, including In-process and Epub Ahead of Print) and Embase (OVID interface) databases were searched from 1946 to February 2019 (supplemental 1). The literature search results were uploaded and reviewed using Covidence Software (Covidence, Melbourne, Australia).
Selection criteria
Search results and full-text articles were reviewed independently and in duplicate. Potentially relevant studies were screened by title and abstract (stage 1), followed by full-text screening to assess eligibility (stage 2). Two review authors assessed the eligibility of full reports, and any disagreement was resolved through discussion with a third reviewer. The reasons for excluding studies were recorded. Randomized controlled trials (RCTs) that evaluated any intervention to minimize AL following esophagectomy were included, with no restriction on language. Only studies that reported our primary outcome, AL, were included. Properly conducted RCTs are the gold standard for evaluating the effectiveness of an intervention [13]. Thus, only RCTs were included; other article types, including review articles, editorials, preclinical studies, observational studies, and abstracts, were excluded.
Outcome justification and prioritization
Our primary outcome of interest was anastomotic leak (AL), which had to be reported for both the intervention and control groups. AL was defined as the presence of extraluminal collections of air or contrast, excess bile-stained fluid on drainage, or a combination of these [13]. Secondary outcomes of interest were anastomotic stricture, mortality, and postoperative length of hospital stay.
Data extraction
Patient characteristics and demographic information, methodology, intervention details, outcomes of interest, and risk of bias were recorded. Two reviewers performed all data extraction. The study and patient characteristics recorded for included studies were the first author's name, year of publication, country of origin, number of patients investigated (intervention and control groups), and the indication for esophagectomy (e.g. esophageal cancer). The methods used for interventions to prevent AL were recorded (e.g. omentoplasty, stapled vs. hand-sewn anastomosis, early NG tube removal). Details recorded included the use of neoadjuvant therapy (e.g. radiation and chemotherapy), medical management (e.g. antibiotics), endoscopic management (e.g. nasogastric tube use), or surgical management (e.g. re-operation), the modality used to diagnose AL, and the surgical approach for esophagogastric anastomosis (e.g. cervical or thoracic anastomosis). Disagreements were resolved through discussion with a third-party member.
Summary measures and synthesis of results
DerSimonian and Laird's random-effects method was used to pool risk ratios (RRs) with corresponding 95% confidence intervals (CIs) for dichotomous outcomes. A risk ratio greater than one indicates an increased risk of AL, stricture, or mortality, and a risk ratio less than one indicates a reduced risk. Continuous measures were reported for individual studies as a mean with standard deviation (SD), a median with interquartile range (IQR), or the overall range from minimum to maximum. The pooled mean difference in length of stay between the intervention and control groups was determined using DerSimonian and Laird's random-effects method for continuous outcomes [15]. Studies that reported a median with IQR were excluded from the pooled mean difference estimation for length of stay. A mean difference less than zero represents a shorter length of hospital stay in the intervention group than in the control group, and a value greater than zero represents a longer stay. The heterogeneity of effect sizes for pooled estimates was assessed using the I² statistic. The following overlapping ranges were used to interpret I²: 0–40% (low heterogeneity), 30–60% (moderate heterogeneity), 50–90% (substantial heterogeneity), and 75–100% (considerable heterogeneity) [15]. Open Meta-Analyst (open-source, USA) was used to generate forest plots, heterogeneity statistics, and effect estimates for risk ratios and mean differences.
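For readers who wish to reproduce the pooling step outside Open Meta-Analyst, the following is a minimal sketch of the DerSimonian and Laird estimator, applied to log risk ratios and to mean differences, together with I². It is an illustration under stated assumptions and is not the software used in this review: the function names are our own, the study counts in the demonstration are invented for illustration only, no continuity correction is applied (so it assumes no arm has zero events), and the 95% CI uses the normal approximation with a multiplier of 1.96.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effects (e.g. log RR or mean difference) with the
    DerSimonian-Laird random-effects estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]                    # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0     # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "lo": pooled - 1.96 * se,
            "hi": pooled + 1.96 * se, "tau2": tau2, "q": q, "i2": i2}

def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio and its variance; assumes no zero cells."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, var

def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Raw mean difference (intervention minus control) and its variance."""
    return mean_t - mean_c, sd_t ** 2 / n_t + sd_c ** 2 / n_c

# Overlapping interpretation ranges for I² as listed in the text.
I2_BANDS = [((0, 40), "low"), ((30, 60), "moderate"),
            ((50, 90), "substantial"), ((75, 100), "considerable")]

def describe_i2(i2):
    """Return every label whose range contains i2 (ranges overlap by design)."""
    return [label for (lo, hi), label in I2_BANDS if lo <= i2 <= hi]

# Invented example data: (AL events, n) in intervention, then control.
hypothetical_studies = [(4, 60, 9, 58), (7, 75, 12, 77), (3, 40, 5, 41)]
pairs = [log_risk_ratio(*s) for s in hypothetical_studies]
res = dersimonian_laird([p[0] for p in pairs], [p[1] for p in pairs])
# Pooling is done on the log scale; exponentiate to report the RR and its CI.
print("RR = %.2f (95%% CI %.2f to %.2f), I2 = %.1f%% %s" % (
    math.exp(res["pooled"]), math.exp(res["lo"]), math.exp(res["hi"]),
    res["i2"], describe_i2(res["i2"])))
```

The same `dersimonian_laird` function can be given the outputs of `mean_difference` for length of stay, in which case the pooled value and its interval are reported directly without exponentiation.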
Subgroup analyses examined AL by type of disease (e.g. esophageal cancer), age (≤ or > 18 years old), type of surgery (cervical vs. thoracic anastomosis), and use of induction or neoadjuvant therapy.
Risk of bias
The revised Cochrane risk-of-bias tool for randomized trials was used to evaluate the risk of bias of each study reviewed [17]. Within each risk-of-bias domain, a series of questions ('signaling questions') was used to elicit information about features of the trial considered relevant to the risk of bias. Publication bias was included in the assessment. Each judgement was classified as 'low', 'high', or 'some concerns' [17]. Meta-bias (risk of bias across studies) was summarized by pooling the individual study risk of bias for each risk-of-bias domain.
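The across-study summary can be sketched as a simple tally of judgements per domain. The snippet below is illustrative only: the domain labels follow the five domains of the revised Cochrane tool, the function name and the two example studies are hypothetical, and it does not cover how publication bias was assessed.

```python
from collections import Counter

# The five domains of the revised Cochrane risk-of-bias tool (RoB 2).
DOMAINS = [
    "randomization process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
]
LEVELS = ("low", "some concerns", "high")

def summarize_by_domain(judgements):
    """Return, for each domain, the proportion of studies at each level.

    `judgements` maps a study label to a dict of {domain: level}.
    """
    summary = {}
    for domain in DOMAINS:
        counts = Counter(study[domain] for study in judgements.values())
        total = sum(counts.values())
        summary[domain] = {level: counts.get(level, 0) / total for level in LEVELS}
    return summary

# Hypothetical judgements for two invented studies.
example = {
    "Study A": {d: "low" for d in DOMAINS},
    "Study B": {**{d: "low" for d in DOMAINS},
                "missing outcome data": "some concerns"},
}
for domain, shares in summarize_by_domain(example).items():
    print(domain, shares)
```

Proportions per domain of this kind are what a stacked risk-of-bias bar chart displays.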
Grading of Recommendations, Assessment, Development, and Evaluations
The quality of evidence for the treatment effects was graded using a systematic and comprehensive approach known as GRADE [17]. GRADE provides a reproducible and transparent framework for grading the quality of, or certainty in, the evidence. The quality of evidence reflects the extent to which we are confident that an estimate of the effect is correct. A high grade means that the true effect lies close to the estimate of the effect; a moderate grade means that the true effect is likely to be close to the estimate of the effect; a low grade means that the true effect may differ substantially from the estimate of the effect; a very low grade means that we have little confidence in the effect estimate [17].