Protocol and registration
This systematic review was designed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension statement for reporting systematic reviews that incorporate NMA (e-Table 1 in additional file 1). The review protocol is registered with PROSPERO (CRD42020139112).
Studies, participants, interventions/comparators, and outcomes
We included all reports of randomized controlled trials (RCTs) in English and Japanese regardless of publication status (e.g., published, unpublished, and academic abstracts). Randomized crossover, cluster-randomized, and quasi-experimental trials were excluded. This meta-analysis included studies of adult patients (age ≥16 years) who underwent IMV for more than 12 hours due to ARF and were scheduled for extubation after an SBT. The definitions of acute hypoxic respiratory failure and SBT were those used in each study. This meta-analysis excluded studies that included patients who underwent tracheostomy, experienced accidental extubation or self-extubation, experienced hypercapnia during the SBT, or had do-not-resuscitate (DNR) orders. Studies in which more than half of the study population had an acute exacerbation of chronic obstructive pulmonary disease (COPD), those that included postoperative patients or patients being treated for trauma, and those that included patients with congestive heart failure were also excluded. We included RCTs that compared two of the three available respiratory support devices: (1) COT: low-flow nasal cannula, face mask, and Venturi mask (no flow rate restriction); (2) NPPV: no restrictions on the type of mask, mode, duration of ventilation, or weaning method; and (3) HFNC: no restrictions on the flow rate or FIO2. The primary outcome was the short-term mortality rate (at the end of the follow-up period for each trial within 30 days, at ICU discharge, and at hospital discharge). Secondary outcomes were the reintubation rate within 72 hours (reintubation included the need for intubation and NPPV) and the post-extubation respiratory failure rate (as defined in each study).
Data sources and search details
We searched the Cochrane Central Register of Controlled Trials (CENTRAL), MEDLINE via PubMed, EMBASE, and Ichushi (a database of Japanese papers) for eligible trials. We also searched for ongoing trials in the World Health Organization International Clinical Trials Registry Platform Search Portal. In cases of missing data, we attempted to contact the authors of each study. Searches were performed in December 2020. Details of the search strategy and the search dates are shown in e-Table 2 in additional file 1.
Study selection, data collection process, and data items
Two of three physicians (YO, CN, and HY) screened the titles, abstracts, and full texts for relevant studies during the first and second screenings and independently extracted data from eligible studies into standardized data forms. For abstract-only studies that could not be evaluated against the eligibility criteria, we contacted the authors. Disagreements between the two reviewers were resolved by discussion between them or, when necessary, with the third reviewer. Data extraction from the studies identified during the second screening was done by the three reviewers using two tools: (1) the Cochrane Data Collection Form (RCTs only) and (2) Review Manager (RevMan) software version 5.3.5.
Risk of bias within individual studies
The risk of bias for the primary outcomes was independently assessed using the Cochrane Risk of Bias tool 1.0 [24, 25]. Each bias domain was graded as ‘low risk’, ‘unclear risk’, or ‘high risk’. Discrepancies between reviewers were resolved by mutual discussion.
Direct comparison meta-analysis
A pairwise meta-analysis was performed using RevMan 5.3 (RevMan 2014). Results are presented as forest plots, and effect sizes are expressed as relative risks (RRs) for categorical data and weighted mean differences for continuous data, both with 95% confidence intervals (CIs). Outcome measures were pooled using a random-effects model to account for study-specific effects. A two-sided p-value <0.05 was considered significant.
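The pooling step described above can be illustrated with a minimal sketch. It assumes inverse-variance weights with the DerSimonian-Laird estimate of the between-study variance, which is one common random-effects method and not necessarily the exact configuration used in RevMan for this review. The function names and the trial counts are hypothetical.

```python
import math

def log_rr(events_t, n_t, events_c, n_c):
    """Log risk ratio and its variance for one two-arm trial (assumes no zero cells)."""
    lrr = math.log((events_t / n_t) / (events_c / n_c))
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return lrr, var

def pool_random_effects(trials, z=1.959964):
    """DerSimonian-Laird random-effects pooled RR with a 95% CI (illustrative only)."""
    effects = [log_rr(*t) for t in trials]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]
    w = [1 / vi for vi in v]                      # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(trials) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "tau2": tau2,
        "q": q,
    }

# Hypothetical trials: (events_treatment, n_treatment, events_control, n_control)
example = [(10, 100, 20, 100), (5, 50, 10, 50), (8, 120, 9, 118)]
result = pool_random_effects(example)
```

Working on the log scale keeps the interval symmetric around the pooled log RR; the exponentiated limits are then reported as the 95% CI of the RR.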
Heterogeneity between trials for each outcome was assessed by visual inspection of forest plots and with the I² statistic for quantifying inconsistency (RevMan; I²: 0–40%, 30–60%, 50–90%, and 75–100% as minimal, moderate, substantial, and considerable heterogeneity, respectively). When heterogeneity was identified (I² >50%), we investigated the reason and tested it using the Chi-square test (p-value).
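As a rough illustration of how I² relates to Cochran's Q, the sketch below computes I² = max(0, (Q − df)/Q) × 100 and looks up the descriptive bands quoted above. Because those bands overlap, the lookup returns every band that contains the value rather than forcing a single label; that choice, like the function names, is an assumption made for this example.

```python
def cochran_q(log_effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect pooled estimate."""
    w = [1 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    return sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_effects))

def i_squared(q, df):
    """I² in percent: share of total variation attributed to heterogeneity, not chance."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

# Overlapping bands as quoted in the text: (lower %, upper %, label)
BANDS = [
    (0, 40, "minimal"),
    (30, 60, "moderate"),
    (50, 90, "substantial"),
    (75, 100, "considerable"),
]

def heterogeneity_labels(i2):
    """Return every descriptive band whose range contains the given I² value."""
    return [label for low, high, label in BANDS if low <= i2 <= high]
```

For example, an I² of 55% falls in both the moderate and the substantial band, which is why the interpretation also depends on the size and direction of the effects and on the Chi-square test.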
We planned to use a funnel plot, Begg’s adjusted rank correlation test, and Egger’s regression asymmetry test to assess the possibility of publication bias if ≥10 studies were available (RevMan). However, as <10 studies were included for each outcome, we did not test for funnel plot asymmetry.
Network comparison meta-analysis
A network plot was constructed to show the number of studies and patients included in this meta-analysis. An NMA was performed with the netmeta package (version 0.9-5) in R (version 3.5.1) using a frequentist approach with multivariate random-effects meta-analysis, and effect sizes were expressed as RRs (95% CI). The covariance between two estimates from the same study reflects the variance of the data in the shared arm, as calculated in a multivariate meta-analysis performed using the GRADE Working Group approach for NMA.
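To give a sense of how a network combines evidence, the sketch below shows the simplest building block, an adjusted indirect comparison through a common comparator. This is not the multivariate random-effects model fitted by netmeta; it only illustrates how an indirect RR for A versus C can be derived from A versus B and C versus B. The function name and the example numbers are hypothetical.

```python
import math

def indirect_rr(rr_ab, ci_ab, rr_cb, ci_cb, z=1.959964):
    """Indirect RR for A vs C from A vs B and C vs B (common comparator B).

    Standard errors are recovered from the 95% CIs on the log scale, and the
    variances add because the two estimates come from independent trials.
    """
    se_ab = (math.log(ci_ab[1]) - math.log(ci_ab[0])) / (2 * z)
    se_cb = (math.log(ci_cb[1]) - math.log(ci_cb[0])) / (2 * z)
    log_ac = math.log(rr_ab) - math.log(rr_cb)
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)
    return (
        math.exp(log_ac),
        math.exp(log_ac - z * se_ac),
        math.exp(log_ac + z * se_ac),
    )

# Hypothetical inputs: two devices, each compared with the same reference device
rr, low, high = indirect_rr(0.70, (0.50, 0.98), 0.85, (0.60, 1.20))
```

The indirect interval is always wider than either direct interval, which is one reason comparisons informed only by indirect evidence are treated with more caution.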
The transitivity assumption underlying the NMA was evaluated by comparing the distribution of clinical and methodological variables that could act as effect modifiers across treatment comparisons.
Ranking plots (rankograms) were constructed using the probability that a given treatment had the highest event rate for each outcome. The surface under the cumulative ranking curve (SUCRA), which is a simple transformation of the mean rank, was used to set the hierarchy of the treatments and was calculated using standard software (Stata 15.0, StataCorp, College Station, TX, USA).
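The relationship between SUCRA and the mean rank can be sketched as follows. The example assumes that each treatment has a vector of rank probabilities (position 1 first) and that these sum to 1; the probabilities shown are hypothetical and the function names are not taken from Stata.

```python
def sucra(rank_probs):
    """SUCRA: average of the cumulative rank probabilities over the first a-1 ranks."""
    a = len(rank_probs)
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)

def mean_rank(rank_probs):
    """Expected rank, with ranks numbered from 1."""
    return sum((j + 1) * p for j, p in enumerate(rank_probs))

# Hypothetical rank probabilities for one of three treatments
probs = [0.5, 0.3, 0.2]
value = sucra(probs)                               # about 0.65
via_mean_rank = (len(probs) - mean_rank(probs)) / (len(probs) - 1)
```

With a treatments, SUCRA equals (a − mean rank)/(a − 1), so a treatment that is certain to occupy rank 1 has a SUCRA of 1 and one that is certain to occupy the last rank has a SUCRA of 0.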
Risk of bias across studies
Assessment of the risk of bias across studies followed the considerations for pairwise meta-analysis. Bias across studies was classified as ‘suspected’ or ‘undetected’ according to the presence of publication bias in the direct comparisons.
The indirectness of each study included in the network was evaluated according to its relevance to the research question in terms of the study population, interventions, outcomes, and study setting, and was classified as low, moderate, or high. Study-level judgments could be combined with the percentage contribution matrix.
The approach to imprecision comprised a comparison of the range of treatment effects included in the 95% CI with the range of equivalence. We assessed the heterogeneity of treatment effects for a clinically important risk ratio (<0.8 or >1.25) in the CI.
To assess the amount of heterogeneity, we compared the posterior distribution of the estimated heterogeneity variance with its predictive distribution. The concordance between assessments based on CIs and prediction intervals, which do not and do capture heterogeneity, respectively, was used to assess the importance of heterogeneity. We assessed the heterogeneity of treatment effects for a clinically important risk ratio of <0.8 or >1.25 in the prediction intervals.
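One way to picture the comparison of intervals against the range of equivalence is sketched below. It takes an interval for the RR as given, reports which zones it overlaps (RR <0.8, the equivalence range 0.8 to 1.25, RR >1.25), and checks whether the CI and the prediction interval lead to the same picture. The zone names, the functions, and the example intervals are assumptions for illustration; the text does not specify how zones map to levels of concern, so no grading is attempted here.

```python
LOWER_BOUND = 0.8    # clinically important threshold on the low side
UPPER_BOUND = 1.25   # clinically important threshold on the high side

def zones(lower, upper):
    """Return the set of zones that an RR interval [lower, upper] overlaps."""
    found = set()
    if lower < LOWER_BOUND:
        found.add("rr_below_0.8")
    if upper > UPPER_BOUND:
        found.add("rr_above_1.25")
    if upper >= LOWER_BOUND and lower <= UPPER_BOUND:
        found.add("equivalence")
    return found

def concordant(ci, prediction_interval):
    """True when the CI and the prediction interval overlap the same zones."""
    return zones(*ci) == zones(*prediction_interval)

# Hypothetical intervals for one comparison
ci = (0.85, 1.10)
pi = (0.60, 1.40)
same_conclusion = concordant(ci, pi)   # the wider prediction interval changes the picture
```

When the prediction interval reaches zones that the CI does not, the between-study heterogeneity is large enough to alter the clinical conclusion, which is the situation the concordance check is meant to flag.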
Assessment of inconsistency
The inconsistency of the network model was estimated from inconsistency factors and their uncertainty, and consistency was statistically evaluated using the design-by-treatment interaction test. For comparisons informed only by direct evidence, there was no disagreement between evidence sources and, thus, there was ‘no concern’ for incoherence. If only indirect evidence was included, there was always ‘some concern’. ‘Major concern’ applied when the p-value of the design-by-treatment interaction test was <0.01.