Developing clinical questions and important outcomes for the clinical practice guideline for acupuncture and moxibustion for allergic rhinitis


 Background

Acupuncture and moxibustion have been widely applied in treating allergic rhinitis (AR). However, evidence-based guidelines for acupuncture and moxibustion for AR are lacking, so we started a project to develop an international clinical practice guideline (CPG) for acupuncture and moxibustion for AR (WFASRP202001-SC05), approved by the World Federation of Acupuncture-Moxibustion Societies (WFAS). This study aims to formulate the clinical questions and important outcomes for this guideline.
Methods

Following the principles of the WFAS standardization committee, we applied multiple methods: an international PICO question survey, a Delphi survey, and a consensus conference of the guideline development group (GDG). The international PICO questionnaires gathered the demands of the target population. The GDG then selected clinical questions and important outcomes for the guideline through a mixed method of Delphi survey and consensus conference.
Results

Fifteen potential clinical questions and 10 sorts of outcomes were formulated under the supervision of a guideline steering group, based on the analysis of 123 responses from 17 countries on 5 continents. After 2 rounds of the Delphi survey, the GDG reached a consensus to include all of the potential questions. After 3 rounds of the Delphi survey, consensus was reached that 9 of these outcomes were important outcomes.
Conclusion

Fifteen clinical questions and 9 important outcomes were selected for the CPG for acupuncture and moxibustion for AR. Since no standard method has yet been established for formulating the clinical questions and important outcomes of CPGs in acupuncture and moxibustion, this approach may serve as a useful reference.



Background
Allergic rhinitis (AR) is one of the most common chronic diseases, with a prevalence of up to 40% [1].
Without efficient control, AR patients face a high risk of comorbidity with asthma [2] or other disorders of the immune system [3]. AR also seriously affects patients' quality of life and increases the incidence of traffic accidents [4] and suicide [5].

International survey on PICO clinical questions
To comprehensively reflect the demands on each aspect of clinical questions and outcomes, we designed an online survey and sent the questionnaire to target populations in different countries. The questionnaire was also published in a journal so that potential target populations could give feedback [15]. For feedback that did not formulate clinical questions in PICO format, we contacted the participant by telephone or email, explained the PICO structure, and asked them to fill in the questionnaire again. We analyzed the foreground clinical questions of greatest concern and the patients, interventions, comparators, and outcomes related to those questions. The frequency or percentage of each element was calculated and presented in statistical graphs. The guideline drafting group then formulated the lists of potential clinical questions and outcomes based on the feedback, under the supervision of the guideline steering group.

Delphi survey on importance rating of clinical questions and outcomes
Ahead of the consensus conference, we introduced the background knowledge and the formulation process of the potential clinical questions and outcomes in PDF documents and asked the guideline development group (GDG) members to finish the first round of the Delphi survey online. For each clinical question and outcome, GDG members were required to rate its importance, make comments if necessary (comments were required for items that were strongly preferred or disagreed with), and indicate their familiarity with the item (Ca) as well as their judgment basis (Cs). For clinical questions, importance was rated on a 5-point Likert scale (very important = 5, important = 4, moderately important = 3, slightly important = 2, not important = 1) [18]. For outcomes, the GRADE hierarchy rules (7-9 = critical, 4-6 = important, 1-3 = of low importance) were applied [14]. For each item, the mean score, the coefficient of variation (CV = SD / x̄, where x̄ is the mean score), and the authority coefficient (Cr = (Ca + Cs)/2) were calculated and displayed in statistical graphs.
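The per-item statistics described above can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: the SD is taken as the sample standard deviation, Ca and Cs are expressed on a 0-1 scale, and the ratings are hypothetical values rather than data from this survey.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """CV = SD / mean, returned as a fraction (0.25 means 25%).

    Assumption: SD is the sample standard deviation; the text does not
    say whether the sample or the population form was used.
    """
    return stdev(scores) / mean(scores)


def authority_coefficient(ca, cs):
    """Cr = (Ca + Cs) / 2 for one expert (familiarity and judgment basis)."""
    return (ca + cs) / 2


# Hypothetical importance ratings of one clinical question on the 5-point scale.
ratings = [5, 4, 5, 3, 4, 5, 4, 4]

item_mean = mean(ratings)
item_cv = coefficient_of_variation(ratings)

print(f"mean = {item_mean:.2f}, CV = {item_cv:.1%}")
print(f"Cr for Ca=0.8, Cs=0.9: {authority_coefficient(0.8, 0.9):.2f}")
```

How the per-expert Cr values are combined into an item-level Cr (for example, by averaging) is not specified in the text, so the sketch stops at the single-expert formula.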

Consensus conference on clinical questions and important outcomes
The process and results of the first-step survey, as well as the formulation of potential clinical questions and outcomes, were introduced at the online consensus conference (via Tencent Cloud Meeting). The results of the 1st round of the Delphi survey were then displayed anonymously, and each GDG member was required to express their opinion, followed by a full discussion of the items with CV > 25%. After that, the 2nd round of the Delphi survey was carried out in the same way. For both clinical questions and outcomes, an item was included when the mean was > 3 and CV ≤ 25%, and excluded when the mean was < 3 and CV > 25%; otherwise, when CV > 25%, another round of the Delphi survey was carried out. If the CV was still higher than 25% in the 3rd round, the item was excluded [19].
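The inclusion and exclusion rules above can be written as a small decision function. This is only one possible reading: the rules are applied in the order they are stated, the function name and return labels are hypothetical, and the case of a low mean with low CV, which the text does not address, is returned as "unspecified".

```python
def delphi_decision(mean_score, cv, round_number,
                    mean_threshold=3.0, cv_threshold=0.25, final_round=3):
    """Classify one item after a Delphi round.

    cv is a fraction (0.25 means 25%). The rule order is an assumption
    based on the order in which the criteria are stated in the text.
    """
    if mean_score > mean_threshold and cv <= cv_threshold:
        return "include"
    if mean_score < mean_threshold and cv > cv_threshold:
        return "exclude"
    if cv > cv_threshold:
        # Divergent rating: vote again, unless this was already the final round.
        return "exclude" if round_number >= final_round else "another round"
    # Mean at or below the threshold with low CV: not covered by the stated rules.
    return "unspecified"


print(delphi_decision(4.2, 0.18, round_number=1))
print(delphi_decision(4.0, 0.30, round_number=2))
print(delphi_decision(4.0, 0.30, round_number=3))
```

Under this reading, an item with a high mean but a CV above 25% is carried into the next round and dropped only if it is still divergent after the third round.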

Participants in 1st step survey
The international PICO question survey received 123 responses (30 from abroad and 93 from China) covering 17 countries on 5 continents; the domestic participants came from 25 provinces (Fig. 2a-b). Among all the participants, 81% were acupuncturists, 9% were physicians of Chinese medicine, 4% were medical school faculty, 3% were scientific researchers, and 2% were otolaryngologists (Fig. 2c). We also analyzed the participants' years of working experience: 24% had worked for 5 years or less, 33% for more than 5 but no more than 20 years, and 43% for more than 20 years.

Participants in the Delphi survey and consensus conference
The Delphi survey and consensus conference were carried out among the GDG, which comprised experts in acupuncture and moxibustion, evidence-based medicine, otorhinolaryngology, and health economics, together with allergic rhinitis patients (Table 1). The GDG members came from six countries where acupuncture is practiced. Conflicts of interest had to be declared ahead of the survey, and each member signed the declaration of interest form.
The clinical questions of greatest concern were first categorized into 3 types (A. validation of effectiveness; B. optimal efficiency or non-inferiority; C. standardization or optimization of manipulation). The PICO elements were then summarized for each element separately.
As shown in Fig. 3a, questions related to optimal efficiency or non-inferiority accounted for 43% (e.g., Compared with intranasal steroid spray only, could allergic rhinitis patients benefit more from the combination of filiform needle therapy with intranasal corticosteroids? Compared with oral antihistamines, does moxibustion have an equal effect in treating seasonal allergic rhinitis?). Questions related to validation of effectiveness accounted for 36% (e.g., Compared with no treatment, could filiform needle therapy perform better in relieving perennial allergic rhinitis-related symptoms?).
Questions related to standardization or optimization of manipulation accounted for 21% (e.g., Compared with 4 weeks of filiform needle therapy, could 8 weeks of treatment bring allergic rhinitis patients more benefit? Compared with filiform needle therapy, could the addition of moxibustion enhance the therapeutic effect in patients with moderate to severe allergic rhinitis?). The percentage of participants' concern for the specific classifications of each PICO (Patient, Intervention, Comparator, Outcome) element was then calculated.
For the comparator classification (Fig. 3d), conventional treatments (including intranasal corticosteroids, intranasal and oral antihistamines, intranasal and oral decongestants, nasal saline, and leukotriene receptor antagonists) were of most frequent concern (43%), followed by sham acupuncture or waitlist (36%), filiform needle therapy (17%), and moxibustion (5%). Conventional treatments usually appeared as the comparators in questions related to optimal efficiency or non-inferiority, while sham acupuncture or waitlist was associated with questions validating effectiveness. For questions related to standardization or optimization of manipulation, the control intervention might be filiform needle therapy or moxibustion with a different manipulation (e.g., fewer treatment courses, lower treatment frequency, or no additional therapies).
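The percentage breakdown of a PICO element can be sketched as a simple tally. The category labels and counts below are hypothetical and are not the survey data; the function name is likewise an illustrative assumption.

```python
from collections import Counter


def percentage_breakdown(labels):
    """Return each category's share of the responses as a rounded percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders categories from most to least frequent.
    return {label: round(100 * n / total) for label, n in counts.most_common()}


# Hypothetical comparator classifications extracted from returned questionnaires.
comparators = (
    ["conventional treatment"] * 6
    + ["sham acupuncture or waitlist"] * 5
    + ["filiform needle therapy"] * 2
    + ["moxibustion"] * 1
)

for label, pct in percentage_breakdown(comparators).items():
    print(f"{label}: {pct}%")
```

Because each share is rounded independently, the reported percentages of one element need not sum to exactly 100.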
According to the feedback from the target population, we drafted 15 potential clinical questions and 10 sorts of outcomes for the Delphi study and consensus conference (Table 2 and Table 3).

Compared with conventional treatments, could moderate-severe allergic rhinitis patients benefit equally or more from moxibustion therapy?
10 Compared with conventional treatments only, does the addition of moxibustion therapy increase the benefit for moderate-severe allergic rhinitis patients?
11 Compared with filiform needle therapy or moxibustion therapy alone, does the combination of filiform needle therapy and moxibustion therapy increase the benefit for allergic rhinitis patients?
12 With the same treatment frequency, does a longer course of filiform needle therapy increase the benefit for allergic rhinitis patients?
13 With the same treatment course, does a higher frequency of filiform needle therapy increase the benefit for allergic rhinitis patients?
14 With the same treatment frequency, does a longer moxibustion therapy course increase the benefit for allergic rhinitis patients?
15 With the same treatment course, does a higher frequency of moxibustion therapy increase the benefit for allergic rhinitis patients?

The results of the 1st round of the Delphi survey are shown in Fig. 4. According to our inclusion and exclusion criteria, all clinical questions were considered important (Fig. 4a), while the CVs of questions 6, 14, and 15 were higher than 25% (Fig. 4b). The authority coefficients of the clinical questions ranged from 65.8% to 76.3% (Fig. 4c). All of the outcomes were considered important (Fig. 4d), while mental health score, laboratory immunological indicators, Chinese medicine syndrome score, and clinical economic indicators had high CVs (Fig. 4e). The authority coefficients of the outcomes ranged from 61.6% to 75.8% (Fig. 4f).

Consensus conference with Delphi survey
The GDG consensus conference on clinical questions and important outcomes was carried out online, and all 19 GDG members attended. After an introduction to the preparation for the meeting (the members of the guideline drafting group, steering group, and GDG; the WFAS approval of the project on developing the clinical practice guideline for acupuncture and moxibustion for allergic rhinitis; the results of the PICO question gathering; the definition and application of the potential clinical questions and outcomes; and the results of the 1st round of the Delphi survey with anonymous comments), each GDG member gave their views, and the items were discussed (especially those with high CV values).
For clinical questions, the discussion mainly focused on the questions with high CV (6, 14, and 15). These questions concern the efficacy and application of moxibustion, and the difference in opinion arose from the following: (1) Moxibustion is not available in some countries because the smoke produced during treatment is not acceptable. (2) Moxibustion is used less frequently than filiform needle therapy in treating AR. (3) There was not enough evidence to support the efficacy of moxibustion in AR treatment.
The discussion among the GDG can be summarized as follows:
(1) Experts in acupuncture and moxibustion from China recommended that moxibustion therapy be included in the guideline because it is frequently applied in AR treatment. The TCM syndromes and stages of AR are critical factors in the application of acupuncture and moxibustion, so subgroup analysis of specified populations should be carried out when formulating recommendations.
(2) Experts in acupuncture and moxibustion from America, Australia, Switzerland, and Japan said that the low importance rating of moxibustion-related questions was mainly due to the limited practice of moxibustion abroad. As the smoke from moxibustion therapy could trigger the fire alarm system, it is not acceptable indoors without a special ventilation system. They suggested that the setting of moxibustion treatment, the quality of the moxa, and the specific type of moxibustion therapy should be articulated during the systematic review and recommendation formulation.
(3) Experts in otolaryngology questioned whether AR would get worse after moxibustion therapy, because the moxa itself could be an allergen. However, the Chinese medicine expert in otolaryngology said that the main allergen for seasonal AR is mugwort pollen in autumn [20,21], while mugwort leaves collected during April and June are the main component of moxa. From this perspective, moxa would not worsen AR, but it remains controversial whether moxa smoke could worsen AR symptoms. Therefore, moxibustion-related questions were rated as less important.
(4) Experts in evidence-based medicine said that there might not be enough high-quality evidence to support these clinical questions. However, these foreground questions are important in clinical practice. Therefore, the evidence-gathering process should cover all types of evidence, including randomized clinical trials, cohort studies, case-control studies, case series, and expert evidence [22].
(5) Allergic rhinitis patient representatives had previously received different types of Western medication and acupuncture therapy. They felt that acupuncture could control AR symptoms and improve quality of life over a longer period than other types of therapy, so they preferred acupuncture therapy to other therapies.
(6) The expert in health economics explained the economic considerations in clinical practice guidelines and emphasized their importance in weighing benefits and costs.
After a full discussion of the clinical questions, a second round of the Delphi survey was launched, and all GDG members finished the survey. The results were shown anonymously. According to the inclusion and exclusion criteria, all potential clinical questions were rated as important without obvious divergence (Fig. 5a-b), and the authority coefficient of each clinical question was above 70% (Fig. 5c).
Therefore, the GDG reached a consensus to include the 15 clinical questions in the clinical practice guideline for acupuncture and moxibustion for allergic rhinitis.
For each sort of outcome, the different indicators were introduced, including suitable population, construct validity, content validity, criterion validity, reliability, responsiveness, and minimal clinically important difference (MCID) [23]. Every GDG member then gave their opinion on the different types of outcome. The discussion mainly focused on the 4 sorts of outcome with obvious divergence (mental health score, laboratory immunological indicators, Chinese medicine syndrome score, and clinical economic indicators) and is summarized as follows:
(1) Experts in acupuncture and moxibustion considered all the outcomes important; among them, symptom score, which reflects the nasal and non-nasal symptoms related to allergic rhinitis, was the most important. Disease control score, quality of life score, and medication score, which reflect the severity of AR from other aspects, were also frequently used in assessing treatment efficacy. However, laboratory immunological indicators and clinical economic indicators were less commonly used in clinical trials on AR.
(2) Experts in otolaryngology identified symptom score, disease control score, quality of life score, and medication score as important outcomes. They also stressed the importance of mental status in relation to AR and introduced several scales for assessing mental health in AR patients [24][25][26][27].
(3) Experts in evidence-based medicine suggested that all related outcomes (efficacy, safety, clinical economic indicators, etc.) should be included to assess acupuncture and moxibustion in treating allergic rhinitis. Because clinical evidence is limited, different measurements belonging to one sort of outcome could be synthesized via the standardized mean difference (SMD). Since this guideline is developed using the GRADE system, outcomes related to safety and clinical economic indicators should be included. However, laboratory immunological indicators are less important than other outcomes in clinical trials, as they are more commonly applied in mechanism studies than in randomized clinical trials. The Chinese medicine syndrome score is less important because unified scales for Chinese medicine syndromes are lacking.
(4) The expert in health economics pointed out that medication score, adverse event rate, and clinical economic indicators were critical factors from the perspective of health economics. These factors determine the benefits, harms, and resource use of different treatments, so they should be rated as important outcomes.
(5) Allergic rhinitis patient representatives also mentioned the importance of symptom score, while they paid more attention to medication score, adverse event rate, clinical economic indicators, and patient satisfaction score.
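As an illustration of the SMD mentioned by the experts in evidence-based medicine, the following minimal sketch computes it in the Cohen's d form with a pooled standard deviation. This is one common definition and is assumed here; the text does not state which SMD variant (for example, Hedges' g) is intended for the guideline, and the input values are hypothetical.

```python
from math import sqrt


def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """SMD in the Cohen's d form: difference in means over the pooled SD."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


# Hypothetical post-treatment symptom scores: treatment arm vs control arm.
smd = standardized_mean_difference(m1=4.0, sd1=2.0, n1=30, m2=5.0, sd2=2.0, n2=30)
print(f"SMD = {smd:.2f}")
```

Dividing by the pooled SD puts results measured on different scales of the same sort of outcome on a common, unitless footing, which is what allows them to be combined.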
After a full discussion of the outcomes, a second round of the Delphi survey was launched, and all GDG members finished the survey. All outcomes were rated as important (Fig. 5d), while the ratings of laboratory immunological indicators and Chinese medicine syndrome score remained divergent (Fig. 5e). The authority coefficient of each outcome was above 70% except for laboratory immunological indicators (Fig. 5f). Therefore, a second round of discussion on these two outcomes was carried out. For laboratory immunological indicators, GDG members who rated the outcome highly believed that it could objectively reveal the efficacy of acupuncture and moxibustion therapies for AR, whereas members who rated it low thought that the huge number of immunological indicators made it infeasible to synthesize the data. For the Chinese medicine syndrome score, GDG members who rated it highly believed that syndrome differentiation is related to treatment efficiency and that, without this sort of outcome, the guideline would lack the characteristics of Chinese medicine. The GDG members who rated it low argued that the Chinese medicine syndrome score has not been used globally and that the Chinese medicine syndromes of AR have not yet been unified. They also noted that the score was developed from AR symptoms, which can be comprehensively evaluated by symptom score, disease control score, and quality of life score, so the Chinese medicine syndrome score is redundant. Moreover, the syndrome might change spontaneously during the progression of AR, so such changes could not reflect the efficacy of treatment. After full discussion, a third round of the Delphi survey was carried out. Both outcomes were rated important (Fig. 6a), but the CV of the Chinese medicine syndrome score was still higher than 25% (Fig. 6b), so it was excluded. The authority coefficients of both outcomes were above 70%.
Therefore, the GDG reached a consensus to include 9 important outcomes in the clinical practice guideline for acupuncture and moxibustion for allergic rhinitis. Among these outcomes, symptom score was rated as the most important, followed by patient satisfaction score, disease control score, adverse event rate, quality of life score, medication score, clinical economic indicators, mental health score, and laboratory immunological indicators. In total, 15 clinical questions and 9 important outcomes were eventually included in the clinical practice guideline for acupuncture and moxibustion for allergic rhinitis.

Discussion
Since clinical concerns vary among target populations, it is a priority to widely collect the PICO clinical questions that concern them most. The scope of this guideline is to provide clinical practice recommendations for acupuncturists, TCM practitioners, and relevant occupations worldwide. Therefore, the 1st step survey (international survey on PICO questions) covered participants from 17 countries on 5 continents. As the majority of the potential users of this guideline will be Chinese acupuncturists, 81% of the participants in the 1st step survey were acupuncturists and 75% of the participants were from 25 different provinces of China. Other related occupations, such as physicians of Chinese medicine, otolaryngologists, medical school faculty, and scientific researchers who work on AR and acupuncture, were also included. Participants with different lengths of working experience were all included to truly reflect the demands of the target population. Through the 1st step survey, we extensively collected the clinical questions of greatest concern to the target population of this guideline and extracted each PICO element. We then grouped these clinical questions into three types and analyzed the distribution of concern for each element. Finally, under the instruction of the guideline steering committee, the guideline drafting group translated the results of the PICO question analysis into 15 clinical questions and 10 sorts of outcomes that cover most of the target population's concerns.
The Guideline Development Group (GDG) plays a critical role in guideline formulation. Throughout the development of a clinical practice guideline, GDG members need to reach consensus at several key steps, the first of which is consensus on review questions and important outcomes [28][29][30]. In this study, we constructed the GDG following the WFAS requirements [31]. Experts in acupuncture and moxibustion from different countries made up the majority of the GDG. Experts in other fields (otolaryngology, evidence-based medicine, health economics) and AR patient representatives were also indispensable members. All GDG members declared their conflicts of interest [32] and signed the Declaration of Interest Form of the WFAS standard expert committee.
After the generation of potential clinical questions and outcomes, the GDG reached a consensus on the clinical questions and important outcomes. Delphi Method, Consensus Conference, and the Nominal Group Technique are the most often applied strategies in clinical practice guideline development [33,34].
In this study, we applied a mixed method of the Delphi survey and consensus conference. After the formulation of potential clinical questions and outcomes, the 1st round of the Delphi survey was carried out ahead of the consensus conference. As a result, the distribution of importance ratings and concerns about the clinical questions and outcomes could be displayed anonymously during the conference, which made the meeting more efficient in finding discordance and reaching consensus. The 1st round of discussion on the clinical questions was then carried out, during which each GDG member shared their opinions and expertise on the potential clinical questions. In the 2nd round of the Delphi survey, the coefficient of variation decreased while the authority coefficient increased, and consensus was reached.
As this guideline will be developed using the GRADE system [35], the GDG rated the importance of all sorts of outcomes and selected the important and critical outcomes accordingly. Ten sorts of outcome indicators were mentioned in this survey: symptom score, disease control score, quality of life score, mental health score, medication score, laboratory immunological indicators, Chinese medicine syndrome score, adverse event rate, clinical economic indicators, and patient satisfaction score. Among these, symptom score mainly focuses on the nasal and non-nasal symptoms related to allergic rhinitis, including  [51,52], etc. The mental health score is another critical concern in AR management [26]; scales such as BDI (Beck Depression Inventory) [24] and PHQ-2 (Patient Health Questionnaire-2) [53] have been mentioned. Medication score is commonly used to reflect the efficacy of complementary therapy; it could be described as RMS (Rescue Medication Score) [54,55]
For the important outcomes, although all of them were rated as important or critical, there was discordance on several items, and after three rounds of Delphi survey and discussion, 9 outcomes were considered important for this guideline. They were ranked in order of priority, so that the advantages and disadvantages can be weighed during the formulation of recommendations. Among these, symptom score was considered the most important outcome because of its wide application in AR assessment. Patient satisfaction score was the second priority, since both patient representatives and clinical practitioners attached importance to it. The coefficient of variation of the 9 sorts of outcomes decreased over the three rounds of the Delphi survey, which means that the GDG gradually reached a consensus. However, the Chinese medicine syndrome score remained highly divergent and was excluded. The increasing authority coefficient suggests that the discussion and interpretation were comprehensive and effective.

Strengths and limitations
The strength of this study is the joint application of multiple methods (international PICO question survey, Delphi survey, and consensus conference). Previously, the items of a Delphi survey were developed from systematic reviews and interviews [66, 67], whereas we carried out an online clinical question survey among the global target population using a semi-structured PICO questionnaire, which helped us to obtain the demands of the potential users and avoid data-driven problems. After the lists of potential clinical questions and outcomes were formulated, the 1st round of the Delphi survey was held ahead of the consensus conference to summarize the concerns of the GDG. The divergence and comments could therefore be displayed anonymously at the conference to minimize the authoritative effect. Moreover, this made the consensus conference more efficient, because the discussion could focus on the points of divergence. During the conference, each GDG member was required to give their opinions, which represented the potential users, beneficiaries, and methodologists. This enhanced the equality and representativeness of the discussion.
There are also several limitations in this study. Firstly, because clinical practitioners in acupuncture and moxibustion often lacked background knowledge of the PICO framework, the process of gathering clinical questions was time-consuming. Because some of the feedback received during the survey was not in PICO form, we needed to contact the participants, explain the PICO framework, and ask them to fill in the questionnaire again.
Secondly, although the Delphi voting results were displayed anonymously and all GDG members were required to share their views, dominant GDG members might have imposed their opinions upon more reticent colleagues. Therefore, the individual idea generation and round-robin format of the NGT [68-70] might serve as a reference for future improvement.

Conclusion
Fifteen clinical questions and 9 important outcomes were selected by the GDG for the clinical practice guideline for acupuncture and moxibustion for allergic rhinitis. Since this will be the first edition of the WFAS CPG for acupuncture and moxibustion for allergic rhinitis, the clinical questions and important outcomes are universally applicable and reflect the most fundamental and urgent demands of global users. The joint application of multiple methods in this study could be useful for related CPG studies.

Figure 3
PICO elements summary of the 1st step survey. a Clinical question classification; b Participants' concern on patients; c Participants' concern on interventions; d Participants' concern on comparators; e Participants' concern on outcomes. AR (all types of AR), SAR (seasonal AR), PAR (perennial AR), Pers. AR

Figure 4
Results of 1st round Delphi survey. a Importance rating of clinical questions; b CV of clinical question rating; c Cr of clinical questions; d Importance rating of outcomes; e CV of outcome rating; f Cr of outcomes.

Figure 6
Results of 3rd round Delphi survey. a Importance rating of outcomes; b CV of clinical question rating; c Cr of clinical questions