Methodological Quality Assessment of Systematic Review or Meta-Analysis Using AMSTAR-2: the Long-Term Effectiveness or Efficacy of Opioids for Chronic Non-Cancer Pain

Background: Systematic reviews and meta-analyses, a study design regarded as providing high-quality evidence, give inconsistent conclusions about the long-term effectiveness or efficacy of opioids for chronic non-cancer pain. We appraised the methodological quality of these systematic reviews or meta-analyses. Methods: We identified the relevant systematic reviews or meta-analyses by searching Medline, EMBASE, the Cochrane Database of Systematic Reviews, the Cochrane Central Register of Controlled Trials, the International Prospective Register of Systematic Reviews, PsycARTICLES/OVID, the Chinese Bio-Medical Literature Database, the China National Knowledge Infrastructure, and the Wan Fang Data and VIP Database on March 1st, 2019. Methodological quality was assessed with A Measurement Tool to Assess Systematic Reviews-2 (AMSTAR-2). Spearman correlation analysis and non-parametric tests were used to assess the association between methodological quality and review characteristics. Results: Twenty-one systematic reviews or meta-analyses were included in our study, one of which contained no individual study. In terms of overall confidence, twelve reviews were rated critically low, four low, two moderate, and two high. In the systematic reviews or meta-analyses of relatively better methodological quality, whose results and conclusions are more credible, the effectiveness or efficacy of opioids was small to questionable. Cochrane reviews performed better than non-Cochrane reviews in establishing a prior protocol (100% vs 17%, P<0.05), providing a list of excluded studies (100% vs 50%, P<0.05), and taking risk of bias into account when interpreting the results of the review (100% vs 75%, P<0.05). There was a significant positive correlation (ρ=0.526, P<0.05) between the impact factor of the publishing journal and the methodological quality of the systematic review or meta-analysis. Conclusion: The methodological quality of the included systematic reviews or meta-analyses is far from satisfactory and needs improvement, especially in establishing a prior protocol and justifying significant deviations from the protocol, providing a list of excluded primary studies, reporting the funding information of primary studies, and assessing the potential impact of risk of bias in individual studies.

Background
Chronic non-cancer pain (CNCP) is defined as non-cancer pain lasting for more than three months. Approximately 20% of people give up work because of chronic pain 1 , as it affects quality of life, activities of daily living, social life, and work. Moreover, Ratcliffe GE et al. 2 reported that the suicide rate of patients with CNCP remained high because of mental illness. Hence, prompt treatment is indispensable. Opioids are drugs that relieve pain by binding to opioid receptors located in the brain and spinal cord, and they have been endorsed for the relief of CNCP by the American Academy of Pain Medicine (AAPM) 3 . Meanwhile, 3% to 4% of adults (9.6 to 11.5 million people) in the US have been prescribed long-term opioid therapy 4 . However, in view of the adverse effects, the potential for misuse, and the lack of high-quality data on benefits, some guidelines discourage opioid prescribing for chronic pain 5,6 . One RCT published in JAMA 7 also found that treatment with opioids was not superior to treatment with non-opioid medications for improving pain-related function in patients with moderate to severe chronic back pain or hip or knee osteoarthritis. Hence, it remains controversial whether opioids have long-term effectiveness or efficacy for CNCP [8][9][10][11] . Systematic reviews and meta-analyses, a study design regarded as providing high-quality evidence [12][13][14][15][16] , also give inconsistent conclusions. For instance, Devulder et al. 17 found that long-term opioid treatment can lead to significant improvements in functional outcomes. In contrast, the study by Welsch et al. 18 supported the concept of "non-opioid requiring" for CNCP. Differences in methodological quality can threaten the validity of a systematic review or meta-analysis [19][20][21] . 
It was therefore necessary to assess the methodological quality of the systematic reviews or meta-analyses concerning the long-term effectiveness or efficacy of opioids for CNCP, to determine which of them are of high methodological quality, and then to guide subsequent systematic reviews or meta-analyses toward the principles of high methodological quality 22 .
AMSTAR-2 (A Measurement Tool to Assess Systematic Reviews-2) is an updated version of AMSTAR. Unlike AMSTAR, it can assess not only systematic reviews or meta-analyses of RCTs but also those that include non-randomized studies of interventions (NRSI), which reflects the fact that almost half of all published systematic reviews include NRSI 23 . The aim of this study was to use AMSTAR-2 to assess the methodological quality of systematic reviews or meta-analyses on the long-term effectiveness or efficacy of opioids for CNCP, and to identify whether serious methodological flaws existed that could mislead clinical decision-making, so that readers can rely on the conclusions of reviews with high methodological quality and the quality of subsequent studies can be improved.

Methods
Prior to the beginning of the review, a protocol was produced, which included the search strategy, inclusion criteria, and outcomes of interest. The protocol has been registered on the International prospective register of systematic reviews (CRD42018094628).
Changes from the protocol made during the conduct of the review are listed in Additional file 1.

Patient and Public involvement
All the data in our study were extracted from published articles, and none of the authors performed any study with human participants or animals. Informed consent was therefore not required.
Likewise, no patients were involved in the study; hence, there was no plan to disseminate results to participants.

Studies Selection
Inclusion criteria (the included systematic reviews or meta-analyses had to meet all of the following criteria): (1) appraised the long-term effectiveness or efficacy of opioids for CNCP; (2) reported long-term results, defined as lasting more than 4 weeks; (3) compared opioids with a placebo or non-opioid analgesics; (4) stated explicitly that the study was a systematic review or meta-analysis; (5) were published in English or Chinese.
Exclusion criteria (systematic reviews or meta-analyses that met any of the following criteria were excluded): (1) included an appraisal of chronic cancer pain; (2) had a treatment period of less than 4 weeks; (3) were not a systematic review or meta-analysis; (4) included animal studies.

Studies selection
Two reviewers (Qian Li and Xiaoyuan Jiang) independently selected studies according to the inclusion and exclusion criteria, and disagreements about any systematic review or meta-analysis were resolved by discussion. If necessary, a third reviewer (Huan Tao) was involved in the judgment. The specific steps were as follows. Firstly, the two reviewers downloaded all relevant literature after the search was complete. They then imported the literature into an EndNote database (Version 7.0). Next, they screened the titles and abstracts of all identified studies according to the inclusion and exclusion criteria. Finally, they retrieved the full text of all possibly relevant articles to assess their internal validity (quality) and whether they satisfied the inclusion criteria.

Data extraction and management
Two reviewers (Qian Li and Hui Liu) independently extracted relevant data into a data collection template based on the AMSTAR-2 questionnaire. A third reviewer (Ke Deng) was consulted when disagreements could not be resolved by discussion. The following items were extracted from the full text: journal name, first author's name, number of authors, number of authors with a conflict of interest, country, year of publication, impact factor of the publishing journal, databases searched, number of databases searched, additional retrieval, whether the review was registered, whether the protocol was published, number of included studies, type of included studies (RCT or NRSI), drug, treatment duration, instrument used to evaluate the risk of bias, whether a meta-analysis was conducted, primary outcome, and conclusion.

Methodological quality assessment by the tool of AMSTAR-2
Two reviewers (Qian Li and Xiaoyuan Jiang) were trained to assess the methodological quality of systematic reviews or meta-analyses by an experienced reviewer (Jin Chen).
They independently conducted the methodological assessment. Each reviewer provided reasons for their judgments, and disagreements were resolved by face-to-face discussion. The AMSTAR-2 appraisal tool included 16 domains: whether the research questions and inclusion criteria contained the PICO elements, the protocol of the systematic review or meta-analysis, the reason for selecting the study designs, the literature search strategy, study selection, data extraction, specific details of the inclusion and exclusion criteria, adequate detail of the included studies, risk of bias assessment of the included studies, the sources of funding, appropriate statistical methods, evaluation of the impact of the risk of bias (RoB) of individual studies, the explanation of RoB in individual studies, a satisfactory explanation for any heterogeneity, adequate investigation of publication bias, and potential conflicts of interest.
The answer options of AMSTAR-2 were Yes, Partial Yes, and No. "Yes" denoted a positive result. "No" indicated that there was not enough information about the domain. "Partial Yes" indicated partial adherence to the standard. Following BOX 1, AMSTAR-2 treated seven domains (items 2, 4, 7, 9, 11, 13, and 15) as critical, while the other nine domains were non-critical. It then used BOX 2 to rate the overall confidence in each review in four ranks: High (no or one non-critical weakness), Moderate (more than one non-critical weakness but no critical flaws), Low (one critical flaw with or without non-critical weaknesses), and Critically Low (more than one critical flaw with or without non-critical weaknesses).
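The rating rule described above can be sketched in a few lines of Python. This is only an illustration of the BOX 2 logic as stated in the text, not the code used in this study. The function name `overall_confidence` and the input format (a dictionary from item number to answer) are hypothetical, and the sketch assumes that only a "No" answer counts as a weakness or flaw, which the text does not state explicitly.

```python
# Critical domains as listed in the text (BOX 1).
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}


def overall_confidence(answers):
    """Rate overall confidence from AMSTAR-2 answers.

    answers: dict mapping item number (1-16) to "Yes", "Partial Yes" or "No".
    Assumption (not from the source): only "No" is treated as a weakness.
    """
    weak_items = {item for item, answer in answers.items() if answer == "No"}
    critical_flaws = len(weak_items & CRITICAL_ITEMS)
    non_critical_weaknesses = len(weak_items - CRITICAL_ITEMS)

    if critical_flaws > 1:
        return "Critically Low"
    if critical_flaws == 1:
        return "Low"
    if non_critical_weaknesses > 1:
        return "Moderate"
    return "High"


# Hypothetical review: every item answered "Yes" except items 2 and 7.
example = {item: "Yes" for item in range(1, 17)}
example[2] = "No"
example[7] = "No"
print(overall_confidence(example))
```

Under these assumptions, a review with two failed critical items is rated Critically Low regardless of how it performs on the non-critical items, which is why a single missing protocol plus a missing excluded-studies list is enough to reach the lowest rank.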

Statistical analysis
Kappa (κ) was used to measure the inter-rater reliability (IRR) between the two reviewers for AMSTAR-2. A κ value ≤ 0 indicated no agreement, 0.01-0.20 slight agreement, 0.21-0.40 fair agreement, 0.41-0.60 moderate agreement, 0.61-0.80 substantial agreement, and 0.81-1.00 almost perfect agreement 21 . Data were summarized as frequencies or percentages for categorical variables and as mean ± standard deviation or median (interquartile range: the 25th to 75th percentile) for continuous variables, such as the impact factors of the publishing journals. A non-parametric test (Chi-square test) was used to assess whether there was any difference between Cochrane reviews and non-Cochrane reviews in each domain. The association between the AMSTAR-2 overall confidence for each study and the number of authors, the number of searched databases, the impact factor of the publishing journal, and the number of authors with a conflict of interest was analyzed by Spearman correlation analysis. The association between the AMSTAR-2 overall confidence and the year of publication, the continent of publication, the type of included studies, whether the review was a Cochrane review, and the source of funding was analyzed by non-parametric tests (rank-sum test). We used SPSS 20.0 for statistical analysis, and statistical significance was defined as a two-sided P<0.05.
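As an illustration of the agreement statistic described above, the following minimal Python sketch computes an unweighted Cohen's kappa for two raters and maps it to the agreement bands listed in the text. The analysis in this study was run in SPSS, and the text does not say whether a weighted or unweighted kappa was used, so the unweighted form here is an assumption. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to the bands given in the text."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight agreement"
    if kappa <= 0.40:
        return "fair agreement"
    if kappa <= 0.60:
        return "moderate agreement"
    if kappa <= 0.80:
        return "substantial agreement"
    return "almost perfect agreement"


# Hypothetical ratings of four AMSTAR-2 items by two reviewers.
reviewer_1 = ["Yes", "Yes", "No", "No"]
reviewer_2 = ["Yes", "No", "No", "No"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3), agreement_label(kappa))
```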

Results
We identified 6020 records through searching. After removing duplicates, screening titles and abstracts, and screening full texts according to the inclusion criteria, only twenty-one records 17,18,24-42 entered the final methodological quality assessment. The study procedure is shown in Figure 1.

Characteristics of the included systematic review or meta-analysis
Nine systematic reviews or meta-analyses were published in the Cochrane Database of Systematic Reviews 24,26,[28][29][30][31]37,39,40 . Almost half of the studies were published in Europe, nine studies were published in North America 25,[27][28][29][33][34][35][36]41 , and the remaining two were published in Australia 24 and China 42 . Not all of the studies were published in English: one was published only in Chinese 42 , and two were published in German 18,32 with an English version available. In terms of study design, fifteen systematic reviews or meta-analyses included only RCTs, two included only non-randomized studies 38,41 , three included both RCTs and non-randomized studies 17,31,37 , and one included no study 30 . Most systematic reviews or meta-analyses evaluated opioids in general for the treatment of CNCP in general. Because definitions of a long-term treatment period varied, we required that the treatment period of all the individual studies in a systematic review or meta-analysis be at least 4 weeks, or that the review include a subgroup analysis of treatment lasting over 4 weeks. The longest treatment period was at least 6 months 41 . Regarding the primary outcome, only two systematic reviews or meta-analyses assessed effectiveness or efficacy specifically, through quality of life and pain relief 17,25 . The remaining reviews assessed effectiveness, efficacy, and safety. The largest number of original studies in an included systematic review or meta-analysis was ninety-four 27 . One review included no individual study 30 , and two reviews included only one study 26,40 . Most systematic reviews or meta-analyses were conducted by 3 to 6 authors.
Most systematic reviews or meta-analyses were funded by academic or health institutions.

Methodological quality
Two reviewers (Qian Li and Xiaoyuan Jiang) independently conducted the methodological quality assessment. The inter-reviewer agreement on the independent assessment was substantial (κ=0.784, 95% CI 0.747 to 0.821), and disagreements were resolved by face-to-face discussion. Table 1 shows the results of the methodological quality assessment. The detailed AMSTAR-2 overall confidence for each study is listed in Additional file 4. In general, the included systematic reviews or meta-analyses were more likely to frame the research questions and inclusion criteria in the form of PICO (item 1: 100%), provide keywords or a search strategy in addition to searching at least 2 databases (item 4: 100%), perform study selection and data extraction in duplicate (item 5: 85.71%; item 6: 80.85%), and provide a detailed list of the included studies (item 8: 85.71%). Meanwhile, they were more likely to provide a satisfactory explanation for heterogeneity (item 14: 61.90%), account for ROB in individual studies when interpreting or discussing the results of the review (item 13: 80.95%), and report the funding sources they received when conducting the review (item 16: 76.19%). However, they were less likely to extract the funding information of individual studies (item 10: 42.86%), conduct subgroup or sensitivity analyses (item 11: 42.86%), use appropriate statistical methods to assess the heterogeneity of ROB (item 12: 14.28%), and adequately investigate publication bias (item 15: 38.10%). Apart from the systematic reviews or meta-analyses published in the Cochrane Library, only two systematic reviews or meta-analyses 27,38 reported that they had established a protocol before beginning the review and justified significant deviations from the protocol. In the ROB assessment for systematic reviews or meta-analyses including both RCTs and NRSI, only two studies 31,37 assessed the necessary domains of ROB (item 9).
In terms of overall confidence, two systematic reviews or meta-analyses were rated as high 27,28 , two were rated as moderate 26,31 , four were rated as low 18,29,36,40 , twelve were rated as critically low 17,24,25,32,[34][35][36][37][38][39]41,42 , and one review 30 included no individual study. Most reviews supported the interpretation that the effectiveness or efficacy of opioids was small or questionable, including the two 27,28 of high methodological quality. The reviews 18,29,34,36,42 supporting the effectiveness or efficacy of opioids were all of low or critically low quality.

Comparison between Cochrane reviews and Non-Cochrane reviews
As shown in Table 2, when compared with non-Cochrane reviews, Cochrane reviews performed better (P<0.05) in establishing a protocol prior to the beginning of the review (item 2), justifying the exclusion of studies (item 7), and considering risk of bias when interpreting the results of the review (item 13).
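To make the item-level comparison concrete, the sketch below computes a Pearson chi-square statistic for a 2x2 table in plain Python. It is only an illustration, not the SPSS analysis reported in Table 2. The counts are an assumption reconstructed from the percentages reported for item 2 (100% of nine Cochrane reviews and about 17% of twelve non-Cochrane reviews); the actual denominators in Table 2 may differ, and with expected cell counts this small an exact test may be more appropriate. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the upper-tail P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Assumed counts for item 2 (protocol established: yes / no).
# Row 1: Cochrane reviews (9 yes, 0 no); row 2: non-Cochrane reviews (2 yes, 10 no).
statistic, p_value = chi_square_2x2(9, 0, 2, 10)
print(f"chi-square = {statistic:.3f}, P = {p_value:.5f}")
```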

Characteristics and methodological quality
The results of the Spearman correlation analysis and the non-parametric tests (rank-sum test) are summarized in Table 3. In the Spearman correlation analysis, there was a significant positive correlation (ρ=0.526, P<0.05) between the impact factor of the publishing journal and the methodological quality of the systematic review or meta-analysis.
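For readers unfamiliar with the statistic, the following minimal Python sketch shows how a Spearman correlation between journal impact factor and an ordinal quality rating could be computed: both variables are converted to ranks (ties receive the average rank) and a Pearson correlation is taken on the ranks. The numeric coding of the ratings and the example data are hypothetical and are not the data of this study, which was analyzed in SPSS.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical coding of AMSTAR-2 overall confidence as an ordinal score.
RATING_SCORE = {"Critically Low": 0, "Low": 1, "Moderate": 2, "High": 3}

# Hypothetical example data (not taken from this study).
impact_factors = [1.2, 2.5, 3.1, 6.8, 7.4]
ratings = ["Critically Low", "Critically Low", "Low", "Moderate", "High"]
rho = spearman_rho(impact_factors, [RATING_SCORE[r] for r in ratings])
print(round(rho, 3))
```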

Discussion
In summary, our contribution was to use AMSTAR-2 to assess the methodological quality of systematic reviews or meta-analyses on the long-term effectiveness or efficacy of opioids for chronic non-cancer pain. Using this tool, we found that the majority of the included systematic reviews or meta-analyses were rated as critically low because they had more than one critical flaw. This may be a cause of the inconsistent conclusions. Systematic reviews or meta-analyses of better quality were more likely to be published in journals with higher impact factors.
In our study, among the systematic reviews or meta-analyses of relatively better methodological quality, the effectiveness or efficacy of opioids was small or questionable.
Likewise, a number of adverse events, such as nausea, pruritus, and vomiting, were associated with the medium- and long-term use of opioids for CNCP 27,43 . Opioids are highly addictive, which makes stopping the medication difficult 44 . As Matt Hancock said 45 , "the use of opioids must be under the premise of protecting people from the dark side to painkillers." In line with this thinking, our study suggests that better guidelines are needed to help users manage the long-term use of opioids for CNCP.
We found obvious differences between Cochrane reviews and non-Cochrane reviews in establishing a protocol prior to the beginning of the review and justifying significant deviations from the established protocol (item 2), justifying the exclusion of studies (item 7), and considering risk of bias when interpreting the results of the review (item 13). In prior studies 46,47 , Cochrane reviews showed better methodological quality than non-Cochrane reviews. In our study, the difference in overall methodological quality was not statistically significant (P=0.196); however, given the small sample of twenty-one systematic reviews or meta-analyses, there was a risk of obtaining a false-negative result in the statistical analysis 48 . With this in mind, there was not enough evidence to refute the above conclusion. It is possible that the differences we observed between Cochrane reviews and non-Cochrane reviews were the reasons for the better methodological quality of Cochrane reviews. Firstly, all the Cochrane reviews established a protocol before conducting the analysis, registered it on the Cochrane Library platform, and justified any deviations from that protocol. These are mandatory requirements for Cochrane reviews. For non-Cochrane reviews, establishing a protocol before conducting the analysis is optional, and in our study only two non-Cochrane reviews established a protocol. Just as the International Committee of Medical Journal Editors recommended that RCTs be registered on a clinical trial registration platform to ensure their transparency and validity and to avoid selective reporting bias 49 , it was also recommended that secondary research be registered on a specialized platform 50 such as PROSPERO. In line with these recommendations, our study also suggests that establishing a prior protocol and registering it help to improve the methodological quality, validity, and reliability of a systematic review or meta-analysis. 
Secondly, all the Cochrane reviews provided both a list of included studies and a list of excluded studies. However, most non-Cochrane reviews provided only a list of included studies. Excluding studies for inappropriate reasons would introduce bias, which would affect the transparency, validity, and reporting quality of the results 51,52 . This is consistent with the finding that Cochrane reviews showed better methodological quality than non-Cochrane reviews 46,47 .
Thirdly, Cochrane reviews gave better consideration to ROB in their discussions. The reasons for considering ROB are as follows. "Garbage in, garbage out" 53 would occur if a highly biased study were treated the same as a less biased study. Thus, it was important to assess the ROB of individual studies to ensure their validity. Likewise, in light of the increasing number of systematic reviews or meta-analyses that include non-randomized studies, it is important to use a different tool when assessing bias in such studies 54,55 . Meanwhile, the result of a bias assessment would influence the statistical heterogeneity of a sample, which would in turn influence the heterogeneity of the systematic review or meta-analysis and the robustness of its results 56 . With this in mind, it was necessary to assess the impact of the ROB of individual studies on the overall confidence in the systematic review or meta-analysis.
In our study, most systematic reviews or meta-analyses did not extract the funding sources of individual studies. Sreeram 57 found that the funding source was associated with the reporting of a significant primary outcome on ClinicalTrials.gov; specifically, trials that were funded by industry and reported statistically significant outcomes were more likely to be published. Hence, whether or not an individual study was funded by industry would affect the validity of the outcomes of the systematic review or meta-analysis and cause selective reporting bias 58 . It has been reported that systematic reviews or meta-analyses published in journals with higher impact factors were of higher quality, and our study found this again. This may be because journals with higher impact factors were more likely to adopt stricter peer-review mechanisms and to advise or even mandate that authors adopt reporting guidelines, such as PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), and register their protocols.
Compared with other existing tools for methodological quality assessment, such as ROBIS (Risk of Bias in Systematic reviews), AMSTAR-2 was more familiar to users of the original instrument. However, we cannot ignore its obvious weakness: it does not currently specify which risk of bias instruments review authors should use to assess non-randomized studies, and this needs improvement in the future.

Limitations
Our study has several limitations. Firstly, the small sample of systematic reviews or meta-analyses in our study could bias our results. Secondly, the threshold of 4 weeks that we used to define long-term treatment may be too short. Thirdly, two of the systematic reviews or meta-analyses in our study included open-label studies, in which blinding was broken during follow-up.
Hence, it is debatable whether these should be regarded as RCTs when assessing the specific domains of AMSTAR-2.

Conclusions
In conclusion, most systematic reviews or meta-analyses on the effectiveness or efficacy of long-term opioid use for chronic non-cancer pain had poor methodological quality, especially in protocol registration and in the consideration of risk of bias when interpreting the results of the review. In the future, more progress should be made on these items, and methodological quality should be strictly controlled.

Consent for publication:
All authors agreed to the publication of the manuscript.

Contributorship statement:
Qian Li and Ke Deng were co-first authors who wrote the manuscript, designed and conducted the analysis, and analyzed the data. Xiaoyuan Jiang and Huan Tao participated in assessing the quality of the included reviews. Hui Liu was responsible for the interpretation and supervision of this review. Jin Chen, as the corresponding author, was responsible for the conception, design, analysis, and interpretation of the review, and revised it critically for important intellectual content. All authors agreed to the publication of the manuscript.

Competing interests:
The authors declared that they have no conflicts of interest.

Funding:
This study did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data sharing statement:
Since all the data in this study were extracted from published articles, all the data and materials are available.
b, In item 11, item 12 and item 15, the total Cochrane number was 4 (one review did not include any individual study and four reviews did not conduct a meta-analysis).
c, In item 14, the total Cochrane number was 6 (one review did not include any individual study and two reviews included only one study).
d, In item 11, item 12 and item 15, the total Non-Cochrane number was 8 (four reviews did not conduct a meta-analysis).