Analysis of Methodological Quality of The Process of Moving From Evidence to Recommendations in Chinese Clinical Practice Guidelines Between 2010 and 2020: A Cross-Sectional Study

Nan Zheng Qilu Hospital of Shandong University Xiaoming Chen Qilu Hospital of Shandong University Xiangying Ren College of Nursing and Health, Henan University Xiangyun Guan Qilu Hospital of Shandong University Ran Tan Qilu Hospital of Shandong University Linxia Su Qilu Hospital of Shandong University Qiao Huang Zhongnan Hospital of Wuhan University, Wuhan University Yingjuan Cao Qilu Hospital of Shandong University Yinghui Jin (  jinyinghuiebm@163.com ) Zhongnan Hospital of Wuhan University, Wuhan University

This study aimed to describe the status and trends of the main methodological characteristics of EtR by analyzing all Chinese CPGs published during the period 2010-2020; compare the methodological characteristics of EtR of the Chinese CPGs in EB-CPGs with CB-CPGs, and GRADE with non-GRADE CPGs; determine the main methodological factors of the rigor of EtR in CPGs and then, make several suggestions for a rigorous EtR for CPGs.

Search strategy
Two main Chinese electronic databases, CNKI (China National Knowledge Infrastructure) and the Wan Fang Database, were systematically searched to identify CPGs. According to the title definition, we classified the CPGs into EB-CPGs and CB-CPGs. In this research, CB-CPGs (including consensus statements and expert opinions) refer to CPGs developed by guideline panels based on only low-quality or very low-quality evidence. The search keywords, applied to titles, were the Chinese words for terms such as "clinical practice guideline", "practice guideline", "clinical guideline", "guideline", "recommendation", "consensus", "consensus statement", "expert consensus", or "expert consensus statement". The search covered 1 January 2010 to 31 December 2020.

Guideline Identification
All CPGs were included if they met both of the following criteria: (1) the articles met the definition of a CPG as proposed by IOM [1,18]; (2) the CPGs illustrated recognizable recommendations [12,19-21], in any of the following ways: (a) the CPGs used Chinese words for terms or phrases such as "should", "strongly recommend", "suggest", "consider", "must", "could", or "should not" to identify recommendations; (b) the CPGs presented recommendations separately by using text boxes or summarizing tables, recommendation flow charts, or propositions of recommendations in an executive summary; (c) the CPGs presented recommendations in italic or bold fonts.
If several versions of one CPG existed, only the version that included the greatest detail on the CPG development methodology was assessed. Where the same CPG was published in different journals simultaneously, only one was chosen. Where one CPG was published in several parts, we merged them into one complete CPG. The Chinese versions of foreign CPGs and adapted versions of CPGs from other countries were excluded. Articles were also excluded if the full text was unavailable.

Screening and review
Two reviewers (Yan SY, Ren XY) independently selected guidelines that met the IOM definition, screening eligible studies by title, abstract, and full text. Then, two reviewers (Zheng N, Ren XY) independently selected CPGs that specified recommendations. Discrepancies were resolved by consensus or discussion with a third researcher (Jin YH).

Data Extraction
To generate potential items for the data-extraction form for the methodological characteristics of EtR, we implemented a 3-aspect approach: (1) we extracted the main methodological characteristics associated with the process of EtR from the development manuals on CPGs, by accessing the websites of all major guideline databases and well-known academic societies or institutions (Appendix A); (2) GRADE [8] (including the GRADE EtD framework [3,4,22,23]), AGREE [24,25] (www.agreetrust.org), and the Guidelines 2.0 checklist [9,26] were consulted for the main methodological characteristics associated with the process of EtR; (3) face-to-face group meetings were then held to discuss and confirm all potential items.
In addition to general characteristics (title, publication year), we also extracted 12 methodological characteristics in 4 domains of EtR from the included CPGs: (1) The basic characteristics: type of CPGs (EB-CPGs or CB-CPGs), grading system (GRADE or non-GRADE); (2) The relevant factors being considered: the effectiveness of the intervention, safety, feasibility, costs, values and preferences, equity, balance of benefits and harms. The CPGs considering relevant factors of EtR were defined as those CPGs whose text cited references, such as original research or systematic reviews, supporting the consideration of relevant factors. In this study, consideration of 3 or more relevant factors was taken to represent a rigorous process of EtR. (3) The information related to the process of consensus or voting: the consensus methods (Delphi Method, Nominal Group Technique, Consensus Development Conference, etc.) used [27], the holding of consensus meetings to discuss or vote on recommendations, staff composition of consensus meetings, the participation of patients, the declaration of conflicts of interest, the production of Evidence Profiles (EP) or Summary of Findings (SoF) tables and the use of supporting tools (such as the GRADE EtD framework or GRADEPro); (4) Other characteristics: the designated level of evidence, and the conduct of external reviews.
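The rigor criterion above (3 or more relevant factors considered) can be expressed as a small classification rule. The following is a minimal sketch only: the factor identifiers and function names are hypothetical labels chosen for illustration, not the coding scheme actually used in the extraction form.

```python
# The seven relevant factors listed in the text.
# The identifier spellings are illustrative assumptions.
RELEVANT_FACTORS = (
    "effectiveness",
    "safety",
    "feasibility",
    "costs",
    "values_and_preferences",
    "equity",
    "balance_of_benefits_and_harms",
)

# "3 or more relevant factors" is the threshold stated in the text.
RIGOR_THRESHOLD = 3


def count_relevant_factors(considered):
    """Count how many of the seven listed factors a CPG considered."""
    return sum(1 for factor in RELEVANT_FACTORS if factor in considered)


def is_rigorous_etr(considered):
    """Return True when the CPG meets the stated rigor threshold."""
    return count_relevant_factors(considered) >= RIGOR_THRESHOLD


# Hypothetical CPG that cites support for three of the factors.
example_cpg = {"effectiveness", "safety", "costs"}
print(is_rigorous_etr(example_cpg))
```

Labels outside the seven listed factors are ignored by the count, so a mistyped or extra label cannot push a CPG over the threshold.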
Data were extracted using a double-extraction method from each eligible CPG and its corresponding appendices by two reviewers (Zheng N, Ren XY) who are familiar with evidence-based medicine and CPG development methodology. Any disagreement was resolved through discussion with a third author (Jin YH). Finally, the two independent reviewers recorded the characteristics of included CPGs in an Excel file. Kappa statistics were calculated to evaluate inter-rater reliability between the two assessors using SPSS 23.0 software. A Kappa value of > 0.75 was considered to indicate high inter-rater reliability.
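For readers unfamiliar with the Kappa statistic, the sketch below computes Cohen's kappa for two raters from first principles. It is an illustration only: the study computed Kappa in SPSS, and the ratings shown here are invented example data, not extracted data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one label per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented "Yes/No" ratings for ten items (the raters disagree on the last one).
reviewer_1 = ["Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No", "Yes", "No"]
reviewer_2 = ["Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No", "Yes", "Yes"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}; high reliability: {kappa > 0.75}")
```

In this invented example the observed agreement is 0.9 and the chance agreement is 0.5, so kappa is about 0.8, which would exceed the 0.75 threshold used in the text.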

Statistical Analysis
Categorical variables were described using frequencies and percentages. The data relating to methodological characteristics of EtR were summarized and stratified by the year of CPG development. A color scale was applied to the charts to provide a visual reference for the reader. We used the Mann-Kendall trend test (M-K test), a non-parametric method, to identify monotonically increasing or decreasing trends of methodological characteristics over the years. A positive z value indicated a monotonic upward trend, and a negative one indicated a downward trend.
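The M-K test described above can be sketched as follows. This is a textbook formulation (S statistic, tie-corrected variance, continuity-corrected z, two-sided normal p-value) written for illustration; the study ran the test in R, and the exact R routine and its options are not stated in the text. The yearly proportions below are invented.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Return (S, z, two-sided p) for a series ordered by time."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S sums the signs of all pairwise later-minus-earlier differences.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with the standard correction for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(values).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected z; S != 0 guarantees var_s > 0.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


# Invented yearly proportions (%) for one characteristic, 2010 to 2020.
yearly_pct = [5.0, 6.5, 8.0, 9.1, 12.4, 15.0, 18.2, 21.7, 25.3, 30.1, 34.8]
s, z, p = mann_kendall(yearly_pct)
trend = "upward" if z > 0 else "downward" if z < 0 else "none"
print(f"S = {s}, z = {z:.2f}, p = {p:.4f}, trend = {trend}")
```

A positive z with P<0.05 corresponds to what the text reports as a significant increasing tendency.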
We classified included CPGs as CB-CPGs or EB-CPGs based on category, and dichotomized CPGs based on whether or not GRADE was used. Chi-squared tests were conducted to assess differences in the methodological characteristics between EB-CPGs and CB-CPGs, and between GRADE and non-GRADE CPGs, published from 2010 to 2020.
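As an illustration of the Chi-squared comparison, the sketch below computes a Pearson chi-squared statistic for a 2x2 table using only the standard library. The counts are the proportions of EB-CPGs (203/386) and CB-CPGs (126/367) considering relevant factors, as reported in the Results; the choice of the uncorrected Pearson statistic is an assumption, since the text does not state which variant SPSS was asked to report.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Rows: EB-CPGs, CB-CPGs. Columns: considered relevant factors, did not.
eb_yes, eb_total = 203, 386
cb_yes, cb_total = 126, 367
chi2, p = chi_squared_2x2(eb_yes, eb_total - eb_yes, cb_yes, cb_total - cb_yes)
print(f"chi-squared = {chi2:.2f}, p = {p:.2e}, significant: {p < 0.05}")
```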
Multi-factor binary logistic regression analysis was used to analyze factors related to the rigor of the process of EtR in the CPGs. The dependent variable was a rigorous process of EtR. The independent variables covered EB-CPGs or CB-CPGs, GRADE or non-GRADE CPGs, the designation of the level of evidence, the use of consensus methods, the holding of consensus meetings to discuss or vote on recommendations, the participation of methodologists, the participation of patients, the declaration of conflicts of interest, the production of EP or SoF tables, the use of supporting tools and the conduct of external reviews. We used "Yes/No" to classify all the dependent and independent variables. For simplification, adjusted odds ratios (ORs) with corresponding 95% confidence intervals (95% CI) were presented for each regression. We evaluated the screened risk factors by the partial chi-square statistic minus the predicted degrees of freedom to measure the importance of each screened risk factor. All statistical tests were 2-sided at a significance level of P < 0.05. The M-K test was conducted in R, version 4.0.1 (R Foundation), and the rest of the analyses were carried out using SPSS software, version 23.0.
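To make the odds ratio and its 95% CI concrete, the sketch below computes an unadjusted OR with a Wald interval from a single 2x2 table. This is deliberately simpler than the multi-factor model described above: it does not adjust for the other independent variables, so it is not a reproduction of the reported regression. The counts are the numbers of EB-CPGs (112/386) and CB-CPGs (72/367) using GRADE, as given in the Results.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted OR and Wald CI for group 1 (a yes, b no) vs group 2 (c yes, d no)."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Group 1: EB-CPGs; group 2: CB-CPGs; outcome: GRADE used (Yes/No).
eb_grade, eb_total = 112, 386
cb_grade, cb_total = 72, 367
or_value, ci_low, ci_high = odds_ratio_wald_ci(
    eb_grade, eb_total - eb_grade, cb_grade, cb_total - cb_grade
)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 corresponds to a difference that is significant at the two-sided 0.05 level; adjusted ORs from a multi-factor model can differ from this crude estimate.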

Flow of included studies
We identified a total of 29,186 articles, of which 18,078 were considered potentially relevant; after screening using titles and abstracts, 2,873 CPGs were selected. In the end, a total of 753 CPGs (EB-CPGs=386/1,127, CB-CPGs=367/1,527) were selected based on the selection criteria (Fig. 1). Over the 11 years, only 28.4% (753/2,654) of the CPGs illustrated recognizable recommendations. There was high agreement between the authors extracting the data (Kappa=0.89; 95% CI 0.73-0.84; P<0.001). The differences in data extraction were resolved by consensus or discussion with the third author.

Number of CPGs that illustrated recognizable recommendations
From 2010 to 2020, it was evident that the production of CPGs specifying recognizable recommendations was increasing annually (Fig. 2A). The number of CB-CPGs illustrating recognizable recommendations published in the last two years was far higher than that of EB-CPGs.

Trends of the methodological characteristics of EtR
According to the results of the M-K test, 5 characteristics showed an increasing tendency (P<0.05) across all the CPGs: the consideration of relevant factors, the holding of consensus meetings to discuss or vote on recommendations, the participation of methodologists, the participation of patients, and the declaration of conflicts of interest (Fig. 2B). As Fig. 2C shows, the EB-CPGs improved in 6 characteristics over the time span according to the M-K test (P<0.05): the consideration of relevant factors, GRADE or non-GRADE CPGs, the participation of methodologists, the participation of patients, the declaration of conflicts of interest, and the conduct of external reviews. In total, 52.6% (203/386) of EB-CPGs considered relevant factors, with an increasing trend (P<0.05). Although only 29.0% (112/386) of EB-CPGs used GRADE to develop recommendations, the proportion of EB-CPGs using GRADE has been increasing year on year (P<0.05).
As presented in Fig. 2D, 5 characteristics showed an increasing tendency over the 11 years based on the M-K test in CB-CPGs (P<0.05): the consideration of relevant factors, the use of consensus methods, the holding of consensus meetings to discuss or vote on recommendations, the declaration of conflicts of interest, and the conduct of external reviews. Although only 34.3% (126/367) of CB-CPGs considered the relevant factors, which was less than in EB-CPGs, a significant improvement was observed in both EB-CPGs and CB-CPGs in this area over the time span (P<0.05). The proportion of CB-CPGs using GRADE to develop recommendations was only 19.6% (72/367), which was also less than in EB-CPGs, and there was no significant upward trend in this area in CB-CPGs (P>0.05). Few CPGs considered costs, and the proportion was even lower for consideration of values and preferences (57, 7.6%) and feasibility (47, 6.2%); no CPG considered fairness.

Methodological characteristics of EtR in EB-CPGs versus CB-CPGs
In total, the Chi-squared test showed statistically significant differences in 10 characteristics (P<0.05) (Fig. 3A), including the consideration of relevant factors, GRADE or non-GRADE CPGs, the designation of the level of evidence, the use of consensus methods, the holding of consensus meetings to discuss or vote on recommendations, the participation of methodologists, the participation of patients, the declaration of conflicts of interest, the production of EP or SoF tables and the conduct of external reviews.

Methodological characteristics of EtR using GRADE versus non-GRADE
In total, 24.4% (184/753) of CPGs used GRADE, and 75.6% (569/753) did not. As Fig. 3B shows, compared with CPGs not using GRADE, we observed higher methodological quality in CPGs using GRADE across 10 methodological characteristics (P<0.05): the consideration of relevant factors, EB-CPGs or CB-CPGs, the designation of the level of evidence, the use of consensus methods, the holding of consensus meetings to discuss or vote on recommendations, the participation of methodologists, the declaration of conflicts of interest, the use of supporting tools, the production of EP or SoF tables, and the conduct of external reviews.

Discussion
This survey shows that the process of EtR in Chinese CPGs over the last 11 years is not satisfactory, especially in terms of the consideration of relevant factors. A number of recommendations were developed without sufficient consideration of feasibility, costs, values and preferences or balance of benefits and harms, which is similar to the findings of Chen YL, et al [14]. Moreover, according to the results of the Chi-squared test and the multi-factor binary logistic regression analysis, the CPGs using GRADE outperformed those that did not, and the EB-CPGs were superior to the CB-CPGs across almost all methodological characteristics of EtR examined.
Authoritative academic societies and institutions have made it clear that, to formulate a recommendation, it is important to take the relevant factors into consideration, such as values and preferences related to the outcomes of an intervention or exposure, balance of benefits and harms, and acceptability. These factors have different degrees of influence on the process of EtR, and sometimes even play a decisive role [8,10,12,26]. These considerations mean that high certainty evidence does not necessarily imply strong recommendations, and strong recommendations can result from low or even very low certainty of evidence when all the relevant factors are taken together [28]. In addition, when making a recommendation, a panel must consider the implications and importance of each relevant factor; where there is uncertainty or disagreement, it helps to consider this explicitly for each criterion [4]. Neglecting relevant factors of EtR may undermine the credibility and acceptability of recommendations, which in turn hinders the promotion and implementation of the CPGs [29,30]. The structure and standardization of Chinese CPGs are still gradually developing. A large proportion of early CPGs focused on health effects and ignored important criteria for decision-making, including feasibility, cost, acceptability, and equity of the interventions [6,14]. Furthermore, the methodologist plays a critical role in EtR by helping the guideline panel to formulate recommendations informed by the evidence in a transparent and explicit manner [10,12]. It has also been reported that CPGs rarely involved patients in the process of EtR, which has led to the neglect of patients' perspectives [14,31]. This study confirmed these conclusions.
Moreover, the absence of methodologists in the process of EtR may also lead to the neglect of evidence regarding patients' values and preferences, costs, and feasibility, which will reduce the rigor of the EtR process. Therefore, the guideline working group should specify and accurately define the relevant factors that need to be taken into account when formulating recommendations, and assess the criteria for the relevant factors quantitatively as far as possible. Guideline developers should also use supporting tools or frameworks to ensure that recommendations are formulated scientifically and transparently.
It is acknowledged that the use of GRADE to develop recommendations is an important safeguard of the structural factors, transparency, and rigor of CPGs, and this has been confirmed by our study [32,33]. Because of the loose methodological requirements for the formulation of CB-CPGs, the development process of CB-CPGs has usually been much simpler and faster, possibly leading to significantly worse methodological quality than that of EB-CPGs [6]. This study has confirmed that the EB-CPGs are superior to the CB-CPGs in the vast majority of characteristics related to the methodological quality of EtR. Therefore, where the actual situation allows, EB-CPGs developed with GRADE under a strict methodology should be preferred over CB-CPGs.
We also make the following suggestions: (1) recommendations in the CPGs should be based on a systematic review of the scientific literature guided by specific key questions about the intervention, exposure or approach under consideration, which should underpin all recommendations on the effectiveness of the intervention, safety, feasibility, costs, values and preferences, equity, and balance of benefits and harms; (2) methodologists should be included in the guideline panels to regulate the process of EtR; (3) importance should be attached to the rigor of formulating recommendations in order to promote the dissemination and implementation of the CPGs.

Strengths and limitations
As far as we are aware, this is the first systematic review on the status of EtR in Chinese CPGs, and it is based on a systematic search of all Chinese CPGs for the last 11 years. The data-extraction form was based on authoritative guidance documents, including the CPG development manuals, GRADE and AGREE websites or papers, and the Guidelines 2.0 checklist. The data-extraction form covered most of the important characteristics that may be relevant in the process of EtR. We also identified several of the most important factors related to the rigor of EtR in CPGs. Overall, our work provides the basis for the important next steps in improving the methodological quality of EtR in Chinese CPGs.
This systematic review has certain limitations. In evaluating the methodological quality of CPGs, we analyzed only the data reported in the literature. Some items may not have been fully reported due to reporting differences or space restrictions of publishers, and we did not contact authors to obtain further information. Consequently, the conclusions of this research may underestimate the methodological quality of EtR in Chinese CPGs.

Conclusion
The CPGs have played a vital role in clinical medicine, and have already become one of the most important tools for potentially improving clinical decision-making and patients' outcomes and reducing medical costs. It is very important to develop recommendations rigorously. The rigor of EtR in CPGs will not only improve the overall quality of the guidelines but also facilitate their dissemination and implementation. However, the status of relevant factors being considered in the process of EtR in Chinese CPGs across the last 11 years is not satisfactory. For guideline panels to reach consensus, it is critical to consider, through rigorous methodology, not only evidence of the effects of an intervention on health outcomes but also all relevant factors.

The flowchart for CPGs selection.