Overview of evidence-based clinical practice guidelines for difficult airway management in adults: a systematic review

Background: The aim of clinical practice guidelines (CPGs) for the management of the difficult airway is to provide optimal responses to a potentially life-threatening clinical problem. Objective: To summarize and compare relevant recommendations and algorithms from evidence-based CPGs (EB-CPGs). Methods: We conducted a systematic review (overview) of CPGs, following Cochrane methods. We summarized the recommendations, their supporting evidence and the strength of the recommendations according to the GRADE methodology. In July 2018, we searched for CPGs published in the last 10 years, without language restrictions, in electronic databases (PubMed, EMBASE, Cochrane Library, LILACS and Tripdatabase) and in additional sources, including specific CPG sources and reference lists, and we consulted experts. Pairs of independent reviewers selected EB-CPGs and rated their methodological quality using the AGREE-II instrument. We included those EB-CPGs reporting standard methods for identification, data collection, study risk of bias assessment and recommendations’ level of evidence. Discrepancies were solved by consensus. Results: We included 11 EB-CPGs out of 2505 references identified in literature searches within the last ten years. Only three of them used the GRADE system. The domains with the best performance in the AGREE-II assessment were ‘adequate description of scoping’ and ‘objectives’, while those with the worst performance were ‘guidelines’ applicability’ and ‘monitoring’. As a result, only three EB-CPGs were classified as ‘Highly recommended’, two as ‘Recommended’ and six as ‘Not recommended’. We summarized 22 diagnostic recommendations, 22% of which were supported by high/moderate quality of evidence (41% of them were considered by developers as strong recommendations), and 16 therapeutic/preventive recommendations, 59% of which were supported by high/moderate quality of evidence (76% strong). Only half of the EB-CPGs were updated in the past five years.
Conclusions: The main EB-CPGs for the management of the difficult airway in anesthesia showed significant heterogeneity in their quality and in the systems used to grade the evidence and the strength of recommendations; most used their own systems. We present many strong recommendations that are ready to be considered for implementation, and we reveal opportunities to improve guidelines’ quality.


Background
An estimated 234 million major surgical procedures are undertaken every year worldwide [1]. The number of patients who undergo surgery is increasing, as is the frequency of comorbidities [2]. Because of the inherent risks of death and complications, the safety of different management strategies has become a significant public-health concern.
Difficult airway (DA) is defined as the clinical situation in which a conventionally trained anesthesiologist experiences difficulty with facemask ventilation, with tracheal intubation, or with both [3]. The rate of difficult intubation ranges from 0.5 to 10% in patients undergoing general anesthesia, depending on the definition adopted [4][5][6][7][8]. The rate of difficult mask ventilation (DMV) ranges from 0.9 to 12.8% of patients undergoing general anesthesia in different series. Difficult intubation and DMV are closely associated [8]. DA is a true emergency because, if not resolved immediately, it can lead to catastrophic outcomes such as permanent brain damage or even death [9,10].
Different surveys have provided detailed information about the factors contributing to poor outcomes associated with airway management and have highlighted deficiencies relating to judgement, communication, planning, equipment, and training [2].
The uptake of clinical practice guidelines (CPGs) may lead to increased quality and safety of care. Their aim is to provide a structured response to a potentially life-threatening clinical problem, in both unanticipated and known difficult intubation. Usually, recommendations balance the risks and benefits of a specific diagnostic or therapeutic procedure and propose an algorithmic pathway for management. Standardization of processes promotes high-quality care in a cost-effective manner. By creating this consensus on bundles of procedures and alerts, and by promoting further research in specific directions, academic societies aim to support health care systems in improving the level of care for patients with DA.
However, the standardization of procedures for a given health facility, and its subsequent benefits, are only as good as the quality of the CPGs themselves. Although not every procedure is backed by strong scientific evidence, CPGs whose recommendations are not fully supported by the best available evidence might promote strategies that are inappropriate for both patients and health systems.
Multiple medical societies and organizations around the world have published CPGs for the management of the difficult airway; however, many of them are not based on solid scientific evidence. Additionally, not all of them use the best methods, such as the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach, which is the most relevant system for rating the quality of evidence in systematic reviews and CPGs [11]. GRADE offers a transparent and structured process for developing and presenting evidence summaries and making recommendations [11].
Difficult airway management represents a complex interaction between patient factors, the clinical setting, and the skills of the practitioner. Difficult airway situations arise in many different settings, and guidelines cannot consider every scenario. We acknowledged this limitation in advance when defining the scope of our review. For this reason, through a systematic review (overview), we aimed to identify and synthesize evidence-based CPGs (EB-CPGs) on DA care in anesthesia that were published globally in the last 10 years. We also intended to rate their quality and to describe the levels of evidence and the strength of their recommendations according to the GRADE approach [11].

Methods
We used the software Covidence (https://www.covidence.org/), which facilitates the initial phases of systematic reviews. One reviewer extracted data while another audited them, using a previously piloted form (which included variables such as search date, objective, setting, target population, target professionals, recommendations, classification system for the quality of evidence and for the strength of recommendations, quality of evidence by recommendation, and strength of each recommendation). Discrepancies were solved by consensus of the whole team. The protocol of this review was not registered.
Guideline quality appraisal and classification: Independent pairs of reviewers rated each EB-CPG using the AGREE-II tool, which consists of 23 key items organized in six domains (scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence) plus two overall evaluation items [17,18]. Each item was graded on a 7-point scale, from 1 ('Strongly disagree') to 7 ('Strongly agree'). Each domain was graded by summing all the item scores in the domain and expressing the total as a percentage of the maximum possible score for that domain (from 0 to 100%). We present the AGREE-II domain scores expressed as percentages across CPGs.
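As an illustration of the domain scoring described above, the following minimal sketch computes a scaled domain score. It assumes the scaling given in the AGREE II user's manual, (obtained score - minimum possible score) / (maximum possible score - minimum possible score), which maps the 1 to 7 item scale onto 0 to 100%. The function name and the example ratings are hypothetical and are not taken from the included guidelines.

```python
def scaled_domain_score(ratings):
    """Return the scaled AGREE-II score (0-100) for one domain.

    `ratings` holds one list per appraiser, each containing that
    appraiser's 1-7 scores for every item of the domain.
    Assumption: scaling follows the AGREE II user's manual,
    (obtained - min possible) / (max possible - min possible).
    """
    if not ratings or not ratings[0]:
        raise ValueError("at least one appraiser and one item are required")
    n_items = len(ratings[0])
    for appraiser in ratings:
        if len(appraiser) != n_items:
            raise ValueError("every appraiser must rate the same items")
        if any(score < 1 or score > 7 for score in appraiser):
            raise ValueError("item scores must lie between 1 and 7")

    n_appraisers = len(ratings)
    obtained = sum(sum(appraiser) for appraiser in ratings)
    min_possible = 1 * n_items * n_appraisers
    max_possible = 7 * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical example: two appraisers rating a three-item domain.
example = [[5, 6, 6], [4, 5, 7]]
print(scaled_domain_score(example))  # prints 75.0
```

With two appraisers and three items, the obtained score is 33, the minimum possible is 6 and the maximum possible is 42, so the scaled score is 27/36, that is, 75%.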
We also categorized each EB-CPG according to the extent to which it successfully addressed the AGREE-II criteria [13]: 'Strongly recommended' (++), for CPGs whose standardized score exceeds 60% in ≥4 AGREE-II domains, with the remaining domains scoring ≥30% and rigor of development scoring >60%; 'Recommended' (+), for CPGs whose standardized score ranges from 30 to 60% in ≥4 AGREE-II domains, with a rigor of development score between 30% and 60%; and 'Not recommended' (-), for CPGs whose standardized score is <30% in ≥4 AGREE-II domains or whose rigor of development score is less than 30%. To deal with discrepancies in the direction and strength of the CPG recommendations, we applied the following rule to decide whether or not to follow a recommendation: Yes (Y) / No (N) when ≥2/3 of the recommendations were in the same direction (for/against) and ≥2/3 were strong; Probably yes (PY) / Probably no (PN) when ≥2/3 of the recommendations were in the same direction (for/against) and <2/3 were strong; and, finally, Uncertain when <2/3 of the recommendations were in the same direction (for/against).
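The two decision rules above can be sketched in code as follows. This is only an illustrative reading of the text: the function names, the dictionary keys and the example scores are hypothetical; the 'Unclassified' result for score profiles that the stated thresholds do not cover is an assumption; and the proportion of strong recommendations is assumed to be computed over all recommendations for a given question.

```python
def classify_guideline(scores, rigor_key="rigor of development"):
    """Classify a guideline from its six AGREE-II domain scores (in %)."""
    rigor = scores[rigor_key]
    values = list(scores.values())
    n_high = sum(v > 60 for v in values)
    n_mid = sum(30 <= v <= 60 for v in values)
    n_low = sum(v < 30 for v in values)

    if n_low >= 4 or rigor < 30:
        return "Not recommended"
    if n_high >= 4 and rigor > 60 and all(v >= 30 for v in values):
        return "Strongly recommended"
    if n_mid >= 4 and 30 <= rigor <= 60:
        return "Recommended"
    # Assumption: the text does not say how other profiles were handled.
    return "Unclassified"


def combine_recommendations(recs):
    """Combine (direction, strength) pairs issued by several guidelines.

    direction is 'for' or 'against'; strength is 'strong' or 'weak'.
    Returns 'Y', 'N', 'PY', 'PN' or 'Uncertain'.
    """
    n = len(recs)
    if n == 0:
        raise ValueError("at least one recommendation is required")
    n_for = sum(direction == "for" for direction, _ in recs)
    n_against = n - n_for
    n_strong = sum(strength == "strong" for _, strength in recs)

    # Integer arithmetic avoids rounding problems around the 2/3 threshold.
    if 3 * n_for >= 2 * n:
        direction = "for"
    elif 3 * n_against >= 2 * n:
        direction = "against"
    else:
        return "Uncertain"

    mostly_strong = 3 * n_strong >= 2 * n
    if direction == "for":
        return "Y" if mostly_strong else "PY"
    return "N" if mostly_strong else "PN"


# Hypothetical domain scores for a single guideline.
example_scores = {
    "scope and purpose": 87,
    "stakeholder involvement": 55,
    "rigor of development": 65,
    "clarity of presentation": 74,
    "applicability": 37,
    "editorial independence": 65,
}
print(classify_guideline(example_scores))
print(combine_recommendations([("for", "strong"), ("for", "weak"), ("for", "weak")]))
```

In this hypothetical example, four domains exceed 60%, rigor of development is above 60% and no domain falls below 30%, so the guideline is classified as 'Strongly recommended'; the three recommendations all point the same way but only one is strong, so the combined decision is 'PY'.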
Synthesis of results: We conducted a tabular synthesis of the whole set of recommendations to better describe their strength and level of evidence according to the GRADE methodology [11], approximating the original grading system used in each guideline to GRADE whenever necessary, so as to compare and integrate the results for each recommendation in a unified manner. In GRADE, the quality of evidence may be rated as high, moderate, low or very low.
Randomized clinical trials (RCTs) always start at high quality of evidence, and non-randomized studies always start at low quality of evidence.
Five criteria can be applied to downgrade the evidence by one or two levels: methodological quality (study limitations), inconsistency of results, indirectness, imprecision and publication bias. Where there are no methodological limitations, three criteria can upgrade it by one or two levels: magnitude of effect, dose-response effect, and confounders underestimating the effect. Regarding the strength of a recommendation, which is defined as the extent to which one can be confident that the desirable consequences of an intervention outweigh its undesirable consequences, GRADE uses four simple categories: 'strong' or 'weak', and 'for' or 'against' a certain diagnostic or therapeutic approach. We presented descriptive statistics as percentages or means with standard deviations.
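The GRADE rating logic summarized above can be illustrated with a short sketch. It simply encodes the rules as stated in this section (starting level by study design, five downgrading criteria, three upgrading criteria applied only when there are no study limitations); the function name, the criterion labels used as dictionary keys and the clamping to the four levels are assumptions made for illustration, and the sketch does not replace the judgement of a guideline panel.

```python
LEVELS = ["very low", "low", "moderate", "high"]

DOWNGRADE_CRITERIA = {
    "study limitations", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
UPGRADE_CRITERIA = {
    "magnitude of effect", "dose-response", "confounding",
}


def grade_quality(randomized, downgrades=None, upgrades=None):
    """Return the GRADE quality level for a body of evidence.

    `downgrades` and `upgrades` map a criterion name to the number of
    levels (0, 1 or 2) by which that criterion changes the rating.
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    for name, allowed in ((downgrades, DOWNGRADE_CRITERIA),
                          (upgrades, UPGRADE_CRITERIA)):
        for criterion, levels in name.items():
            if criterion not in allowed:
                raise ValueError(f"unknown criterion: {criterion}")
            if levels not in (0, 1, 2):
                raise ValueError("each criterion moves 0, 1 or 2 levels")

    # RCTs start at 'high' (index 3); non-randomized studies at 'low' (1).
    level = 3 if randomized else 1
    level -= sum(downgrades.values())
    # As described in the text, upgrading applies only when there are
    # no methodological (study) limitations.
    if downgrades.get("study limitations", 0) == 0:
        level += sum(upgrades.values())
    # Assumption: keep the result within the four GRADE levels.
    return LEVELS[max(0, min(len(LEVELS) - 1, level))]


# RCT evidence downgraded for imprecision and inconsistency.
print(grade_quality(True, {"imprecision": 1, "inconsistency": 1}))  # low
# Observational evidence with a very large effect.
print(grade_quality(False, upgrades={"magnitude of effect": 2}))    # high
```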

Results
The search strategy identified 2588 references after the elimination of duplicates. After the selection process, 81 full-text studies were assessed for eligibility and 11 EB-CPGs published in the last 10 years were included (Figure 1: Study flowchart, Table 1). Three were developed in America (1 from the United States and 2 from Canada), five in Europe (1 from Italy, 1 from Scandinavia, 1 from Germany and 2 from the United Kingdom) and three in Asia (1 from Japan and 2 from India).
Eight of the eleven EB-CPGs (72%) conducted their searches within the last five years.
Of the 11 EB-CPGs identified, seven specifically addressed intubation in the anesthetic setting, two focused on obese patients, one related to the obstetric setting, one included intensive care patients and another addressed pre-hospital airway management. CPGs differed in the recommendation grading systems used by their authors. The grading systems used were GRADE (3 EB-CPGs) [19][20][21] and Oxford Centre for Evidence-Based Medicine 2011 (1 EB-CPG) [22]; the others used their own or modified systems. We presented the scores as a percentage for each AGREE-II domain. The domains with the highest mean ± SD scores were Scope and Objective (87 ± 12%), Clarity of presentation (74 ± 10%) and Editorial independence (65 ± 19%). Stakeholder involvement (55 ± 17%) and rigor of development (54 ± 14%) had an intermediate performance, while 'applicability' was the most deficient (37 ± 10%). Regarding the guideline recommendation category, 3/11 (27%) were classified as highly recommended, 2 as recommended and the rest as not recommended. An overall AGREE-II score is also presented in Table 2, which provides a general description of the included EB-CPGs.
We identified four recommendations related to pre-anesthetic preparation (studies required, strategies, equipment), nine related to strategies during the intubation procedure, five concerning intubation failure and three concerning the post-extubation period.
We detected one recommendation, concerning pre-anesthetic studies in patients presumed to have a difficult intubation, that was weak and had a very low level of evidence, while nine concerned strategies during intubation related to protection against gastric reflux and aspiration pneumonitis.
These nine were strong recommendations but were based on a low level of evidence. Four guidelines specified the equipment that should be available in the setting of difficult airway intubation, all with strong recommendations but, again, with a low level of evidence.
Some discrepancies could be related to the specific population included, the urgency of the procedure and the setting analyzed. The American Society of Anesthesiologists (ASA) defined a difficult airway as the clinical situation in which an anesthesiologist with standard training has difficulty with mask ventilation, intubation or both. The Canadian guideline defines it as the situation in which a professional experienced in airway management has difficulty with mask ventilation, direct or indirect laryngoscopy, intubation, use of supraglottic devices (SGD) or achieving surgical access to the airway. The rest of the guidelines, although addressing these concepts, do not give an explicit definition of DA.
Other definitions related to difficult airway management described in some of the guidelines are: a) Difficult insertion of a supraglottic device (SGD): when multiple attempts are required, in the presence or absence of tracheal pathology (ASA); b) Difficult ventilation with bag and mask or SGD: when adequate ventilation is not achieved because of one or more of the following problems: improper seal, leakage, or excessive resistance to gas inflow or outflow. The Canadian guideline describes the difficulty of bag-mask ventilation as a continuum that goes from the absence of difficulty to the inability to achieve it, in which multiple head and neck adjustments or the help of a second operator may be required; and c) Difficult laryngoscopy: the ASA guideline describes it as complete non-visualization of the vocal cords after multiple attempts at conventional laryngoscopy. Other guidelines use a description of the view obtained to classify it: the Canadian guideline uses
Regarding the physical characteristics of patients that can be associated with a DA, the ASA guideline refers to age, obesity and the presence of mediastinal masses as predictors. The guidelines that address specific populations, such as obese patients and pregnant women, detail the reasons why difficulties in airway management are more common in these patients [11][35][36][37]. The Canadian advanced DA guideline provides strategies to adopt in obese patients, pregnant patients and patients with obstructive airway pathology [38]. The ASA guideline adds other conditions associated with DA, such as obstructive sleep apnea, snoring, ankylosis, degenerative osteoarthritis, subglottic stenosis, lingual thyroid, tonsillar hypertrophy, and Treacher-Collins, Pierre Robin or Down syndromes.
At the time of intubation, seven guidelines strongly recommended preoxygenation; four (67%) reported a low and three (47%) a moderate level of evidence.
We identified four RCTs in the literature on the use of neuromuscular blockade. The five guidelines that mentioned this indication made diametrically opposed recommendations and reported conflicting levels of evidence for this item.
For the other items related to intubation, we observed a high disparity of criteria between the guidelines: of 65 different recommendations, the majority were strong in favor (56%), 35% were weak in favor and only 3% and 6% were strong or weak against, respectively. However, the level of evidence supporting these recommendations was low or very low in 71% of the items.
For some issues considered important or critical in the setting of difficult intubation care, we searched for systematic reviews. In these items, the level of certainty of the outcomes observed was low or very low in 55% of the cases.
Only one guideline made recommendations about extubation technique, with weak strength and a very low level of evidence. The evidence on reintubation is also weak (60% low and 40% very low), although 80% of the recommendations were reported as strong in the guidelines. The recommendations are summarized in Table 4.

Discussion
To the best of our knowledge, the present study is the first overview of guidelines encompassing a broad spectrum of recommendations on difficult airway care in anesthetic patients.
We observed a higher level of evidence supporting therapeutic than diagnostic recommendations (high/moderate quality of evidence 56 vs 22%, respectively). This is not surprising, because cross-sectional or cohort studies can provide high-quality evidence for test accuracy but only indirect evidence for patient-important outcomes. Furthermore, a high level of heterogeneity is almost the rule in diagnostic studies, which further downgrades the level of evidence because of inconsistency [40][41][42]. Although there is consensus on some practices, such as the use of neuromuscular blockers, the importance of patient preoxygenation and the relevant role of videolaryngoscopy, not all recommendations have an adequate level of evidence to support them when considering outcomes relevant to patients. In contrast, no consensus was found on the necessary material that should be available when a difficult airway is expected or on the preferred technique for performing a rescue cricothyroidotomy.
The strength of a recommendation is defined as the extent to which one can be confident that the desirable effects of an intervention outweigh its undesirable ones. We found only 41% 'strong' diagnostic recommendation statements (for and against) based on a high/moderate level of evidence, and 56% for therapeutic/preventive care recommendations. Although higher proportions of high-quality supporting evidence would be desirable, a guideline panel must consider additional factors. To assess competing management alternatives, GRADE proposes considering four domains: estimates of effect for desirable and undesirable outcomes, confidence in the estimates of effect, values and preferences, and resource use. Guideline panels must integrate these factors to make a strong or weak recommendation for or against an intervention [43].
Our updated overview of EB-CPGs may be a useful resource for professionals involved in difficult airway management. We present many strong recommendations that are routinely implemented in clinical practice. However, there is still no published evidence on whether the adoption of CPGs improves critical patient outcomes, and for many recommendations there is a lack of consensus among practitioners as to which approaches to airway management should be adopted.
However, any decision should be taken considering local contextual factors. The heterogeneity of settings is itself a limitation to generalizing every recommendation. Nevertheless, we analyzed the differences according to the setting evaluated, based on expert opinion obtained in a second review of the results. These resolutions are highlighted when the key questions are presented. The experts who collaborated with the project remarked that some specific maneuvers, such as laryngeal manipulation, could be acceptable depending on the situation and on the additional materials available to facilitate intubation. In an unexpected emergency, it could be the most readily available maneuver, combined with stylets or TE, but in a scheduled anesthetic procedure videolaryngoscopy and neuromuscular blockers (if not contraindicated) should be preferred. The unexpected nature of difficult intubation/ventilation and the characteristics of the patients or the procedures make it harder for the recommendations to be universal.
Recommendations can be adopted, modified or even not implemented, depending on institutional or national requirements and legislation and on the local availability of devices, drugs and resources [44]. Decision-makers at the national and subnational levels should be provided with the information they need to apply the evidence and recommendations in their setting [45]. As a limitation, including only EB-CPGs could have resulted in omitting some information, but we prioritized summarizing the highest-quality evidence. Our exclusion criteria for CPGs, which limited the scope to specific conditions, may represent an additional caveat, since some particular diagnostic or therapeutic interventions could also have been excluded. Nevertheless, the large number of recommendations summarized in our study suggests that this is likely to have been only a minor limitation.
Our study will be useful for future developers or adapters of difficult airway guidelines. Consistent with other overviews of clinical guidelines, the domain that received the lowest mean score was the 'applicability' domain of the AGREE-II tool. Similarly, the heterogeneity of the systems for grading the evidence and the strength of recommendations [46][47][48] in this overview echoes that of other clinical guideline overviews. We also found some discrepancies, mainly in the evidence level, within individual recommendations, which did not always discriminate between universal interventions and those suitable only for special target groups or specific surgeries.
Guideline developers should ensure rigorous methodological processes and should also formulate and disseminate recommendations in ways that facilitate their understanding and application by end-users.
Our overview identified several controversies, evidence gaps and problems in difficult airway care guidelines that warrant future research and reveal opportunities to improve their quality.
We are aware that some rare events in difficult airway management cannot be studied in prospective trials, so our most valuable insights come from detailed analyses of adverse events. Every adverse event is unique, and its outcome will ultimately be influenced by patient comorbidities, the urgency of the procedure, the skill set of the anesthetist, and the available resources [2,31].
Standardized management plans are directly transferable from one hospital to another and make it less likely that team members will encounter unfamiliar techniques and equipment during an unfolding emergency. We encourage guideline developers to adopt the GRADE [11] and AGREE-II [18] tools to develop sound future preoperative care guidelines.

Conclusion
We found significant heterogeneity in guidelines' quality and rating systems, as well as deficiencies in several guideline quality domains, which reveal opportunities for quality improvement that deserve careful consideration by future guideline developers. Nevertheless, we present many strong recommendations ready to be considered at present for implementation or discontinuation.

Figure 1: Study flowchart
