Methodological Quality and Risk of Bias of Systematic Reviews and Meta-Analyses on Stem Cells for Knee Osteoarthritis: A Cross-Sectional Survey

Background: Clinical guidelines need high-quality studies to support clinical decision-making, and the evidence is often drawn from systematic reviews (SRs) and/or meta-analyses (MAs). At present, the methodological quality and risk of bias (RoB) of SRs/MAs on stem cell therapy for the treatment of knee osteoarthritis (KOA) have been poorly investigated. This study aims to rigorously evaluate the methodological quality and RoB of SRs/MAs on stem cell therapy for KOA. Methods: Four electronic databases (PubMed, Embase, Cochrane Library, and Web of Science) were searched from inception to October 5th, 2021. SRs/MAs involving RCTs or cohort studies on stem cell therapy for the treatment of KOA were included. The methodological quality and RoB were assessed using the AMSTAR 2 and ROBIS tools, respectively. Results: In total, 22 SRs/MAs were included. According to AMSTAR 2, all SRs/MAs were rated as “Critically low”. The main methodological weaknesses were as follows: for eight items, more than 50% of the SRs/MAs were rated “No”; of these, Item 2 Protocol registration (81.82%), Item 7 Study exclusion and justification (86.36%), and Item 15 Investigation and discussion of publication bias (63.64%) were critical items. The ROBIS-based RoB assessment showed that all the SRs/MAs were rated as “High”. Conclusions: The overall methodological quality of the SRs/MAs concerning the application of stem cell therapy in treating KOA is “Critically low”, while the RoB is high. It is difficult for these reviews to provide effective evidence for the formulation of guidelines for KOA treatment. We suggest that the relevant methodological quality assessment should be carried out in the future before SRs/MAs are used as clinical evidence. Registration number: CRD42021246924

conflicting findings may have severe defects and cannot support clinical decision-making [14]. Stem cell therapy is not recommended by the Osteoarthritis Research Society International (OARSI) to treat KOA based on low-quality evidence [15]. Other national guidelines also indicate that the effectiveness of stem cell therapy should be verified by higher-quality studies [8,16].
In SRs/MAs, the methodological quality often affects the reliability and accuracy of conclusions [17]. At present, only one study has used the AMSTAR tool to evaluate SRs/MAs that contain both RCTs and non-RCTs regarding the application of mesenchymal stem cell (MSC) therapy in treating KOA [18]. The results show that the existing SRs/MAs lack high methodological quality [18]. In addition, the applicability of AMSTAR to SRs/MAs that contain non-RCTs is limited [19]. AMSTAR has since been updated to the second edition [20]. As a new tool to evaluate the methodological quality of SRs/MAs, AMSTAR 2 can assess SRs/MAs that include RCTs and/or non-RCTs, and has been applied in obesity [21], osteoarthritis [22], female menopause [23], sleep [24], and other fields. These studies found that most SRs/MAs have low methodological quality. At present, no existing study has employed the AMSTAR 2 tool to evaluate the methodological quality of SRs/MAs involving RCTs and/or non-RCTs regarding the application of stem cell therapy for KOA.
It is necessary to distinguish the term "methodological quality" from "risk of bias (RoB)". Methodological quality reflects the standardization of the overall research and production process. In contrast, bias is derived from the study design, implementation, analysis, and reporting processes, and will affect the final research results [25]. It is therefore also important to assess the RoB in SRs/MAs.

Literature search
Two reviewers (W.Y. and J.C.) independently searched the PubMed, Embase, Cochrane Library, and Web of Science databases from inception to October 5th, 2021. No restriction was applied to country or language of publication. The keywords used in the search strategy included "stem cells," "mesenchymal stem cells," "mesenchymal stromal cell*," "osteoarthritis," "knee osteoarthritis," "systematic review," "meta-analysis," etc. The detailed search strategy was developed for the PubMed database and subsequently adapted for the other databases (Additional file 1). In addition, international databases of registered SRs, such as the International prospective register of systematic reviews (PROSPERO), and the reference lists of the eligible studies were also checked to identify any other potentially eligible SRs/MAs. Any disagreement between the two reviewers was resolved by consulting with a third author (A.L.).
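As an illustration of how the keywords listed above could be combined into a Boolean search string, the following is a minimal sketch. The grouping of the keywords into three concepts (intervention, condition, study design) and the helper names `or_block` and `build_query` are assumptions made for this example; the actual strategy used is the one given in Additional file 1.

```python
# Keywords quoted in the text. Grouping them into three concepts is an
# assumption for illustration; the real strategy is in Additional file 1.
INTERVENTION = ["stem cells", "mesenchymal stem cells", "mesenchymal stromal cell*"]
CONDITION = ["osteoarthritis", "knee osteoarthritis"]
DESIGN = ["systematic review", "meta-analysis"]


def or_block(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine the concept groups with AND (hypothetical helper)."""
    return " AND ".join(or_block(group) for group in groups)


query = build_query(INTERVENTION, CONDITION, DESIGN)
print(query)
```

Synonyms within a concept are joined with OR and the concepts are joined with AND, so a record must match at least one term from each group.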

Literature selection
The retrieved studies were imported into the EndNote 20 software, and then two authors (T.G. and P.N.) independently removed duplicates and irrelevant studies by checking titles and abstracts. The full texts of the remaining studies were then checked for eligibility. Any disagreement between them was resolved by consulting with a third author (A.L.).

Data extraction
One author (H.F.) was responsible for extracting critical data from the included SRs/MAs and importing them into the Excel 2010 software. The extracted data included the first author, number of authors, publication year, country of the first author, journal name and impact factor, searched databases, number of included RCTs, number of recruited participants, registration information, interventions, and funding. The data extraction process was checked independently by a second author (Y.J.). Any disagreement was resolved by consulting with a third author (A.L.).

Assessment of methodological quality
The methodological quality of the included SRs/MAs was assessed by one author (W.Y.) using AMSTAR 2. AMSTAR 2 is a critical appraisal tool used to evaluate the methodological quality of SRs/MAs involving RCTs and/or non-RCTs of healthcare interventions [20]. It consists of 16 items, including 7 critical and 9 non-critical ones. The overall confidence is rated as high, moderate, low, or critically low. In this study, the methodological quality assessment process was checked independently by a second author (J.C.). Any disagreement between them was settled by consulting a third author (A.L.).
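The overall confidence rating can be sketched as a small decision rule. The version below follows the scheme described in the AMSTAR 2 publication [20]: more than one critical flaw gives "Critically low", exactly one gives "Low", more than one non-critical weakness gives "Moderate", and otherwise "High". The set of critical items (2, 4, 7, 9, 11, 13, 15) is taken from that publication; only Items 2, 4, 7, 9, and 15 are marked as critical in this section. Counting only a "No" rating as a flaw (so that "Partial Yes" does not count) is an assumption of this sketch, and `overall_confidence` is a hypothetical name.

```python
# Critical items as listed in the AMSTAR 2 publication [20].
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}


def overall_confidence(ratings):
    """Map per-item ratings to an overall confidence level.

    ratings: dict of item number (1-16) -> "Yes", "Partial Yes", or "No".
    Assumption: only "No" counts as a flaw or weakness.
    """
    flawed = [item for item, rating in ratings.items() if rating == "No"]
    critical_flaws = sum(1 for item in flawed if item in CRITICAL_ITEMS)
    noncritical_weaknesses = len(flawed) - critical_flaws

    if critical_flaws > 1:
        return "Critically low"
    if critical_flaws == 1:
        return "Low"
    if noncritical_weaknesses > 1:
        return "Moderate"
    return "High"


# Hypothetical review: everything "Yes" except two critical items.
example = {item: "Yes" for item in range(1, 17)}
example[2] = "No"   # no registered protocol
example[7] = "No"   # no list of excluded studies
print(overall_confidence(example))
```

Under this rule a review with flaws in any two critical items lands in the lowest category regardless of how well the remaining items are handled.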

Item 2*. Protocol registration (Critical item)
No SRs/MAs were rated as "Yes". Specifically, 18.18% (n=4) of the enrolled SRs/MAs were rated as "Partial Yes", including three that reported a protocol and a PROSPERO registration number. However, one of these SRs/MAs had a search strategy that was inconsistent with the protocol [11], while two lacked complete inclusion and exclusion criteria in the protocol and did not describe the inconsistency between the implementation process and the protocol [12,40]. Another SR/MA reported that a protocol was developed in advance and followed the PRISMA reporting guidelines, but the protocol was not registered or published [43]. Meanwhile, 81.82% (n=18) of the SRs/MAs were rated as "No". Among these 18 SRs/MAs, three referred to a protocol [30,41,42], of which two completed the PROSPERO registration [41,42]. Still, one failed to provide a PROSPERO registration number [42], one had a protocol that was completely inconsistent with the research content [41], and the remaining one did not describe the protocol content [30]. The remaining 15 SRs/MAs did not refer to a protocol, so it was impossible to determine from the content of the studies whether the review methods had been established in advance.

Item 3. Study designs for inclusion
In total, 27.27% (n=6) of the enrolled SRs/MAs reported the reasons why specific study designs were included [12,30,36,37,41,44] (rated as "Yes"). Of them, one included prospective cohort studies with long-term follow-up and emphasized that RCTs alone did not provide sufficient evidence [30], one included RCTs to offer reliable evidence due to the controversial conclusions of previous studies [12], one was carried out because a previously unpublished SR/MA involved RCTs alone [36], two included newly published or high-quality RCTs to update clinical evidence [37,41], and one considered that the overall quality of previous studies was poor and carried out an MA including RCTs to evaluate the clinical efficacy [44]. The remaining 72.73% (n=16) of the SRs/MAs did not explain the reasons why certain study designs were included (rated as "No").

Item 4*. Literature search (Critical item)
None of the enrolled SRs/MAs used a comprehensive literature search strategy that satisfied all the required components. In addition, 68.18% (n=15) of these SRs/MAs searched two or more databases, provided keywords and a search strategy, and justified language or retrieval time restrictions, but did not comprehensively search supplementary sources (such as published reviews, clinical trial registration platforms, field experts, and gray literature). Among these SRs/MAs, one did not consult field experts [11]; of the remaining 14 SRs/MAs, one did not retrieve gray literature [43], three did not search the reference lists of the included studies or gray literature [27,29,37], three did not search clinical trial registration platforms or gray literature [13,33,41], and seven did not search the reference lists of the included studies, gray literature, or clinical trial registration platforms [30,32,34,38,40,42,45]. All 15 met the minimal requirement (rated as "Partial Yes"). However, 31.82% (n=7) of the SRs/MAs failed to meet the minimal requirement (rated as "No"), among which five did not provide complete search strategies [28,35,36,39,44], one did not explain the reason for restricting the search to the English language [12], and one neither offered a comprehensive search strategy nor explained the reason for restricting the search to the English language [31].

22.73% (n=5) of the SRs/MAs did not describe the data extraction process, so it was difficult to judge whether it was independently completed by two reviewers [27,31,35,42,45] (rated as "No").

Item 7*. Study exclusion and justification (Critical item)
Only 13.64% (n=3) of the SRs/MAs provided a list of excluded studies and justified the reasons for exclusions [27,28,44] (rated as "Yes"). The remaining 86.36% (n=19) of the SRs/MAs only described the summary reasons for exclusions and did not provide a list of the specific excluded studies [11-13, 29-43, 45] (rated as "No").

Item 8. Description of included studies
There were no clear indicators to distinguish "Partial Yes" from "Yes". Therefore, we rated all the studies that described PICOS as "Partial Yes" for conservative evaluation. 81.82% (n=18) of the SRs/MAs described the elements of PICOS (rated as "Partial Yes"). The remaining 18.18% (n=4) of the SRs/MAs were rated as "No". In particular, three did not describe the control measures [32,33,45], and one did not describe the control measures and outcomes [34].

Item 9*. Risk of bias assessment (Critical item)
45.45% (n=10) of the enrolled SRs/MAs appropriately used an RoB assessment tool to assess all of the essential biases (rated as "Yes"). In particular, nine only contained RCTs and used the Cochrane RoB tool to assess all the essential biases [12,13,35,37,39-42,44], whereas one included both RCTs and non-RCTs (RCTs: the Cochrane RoB tool, non-RCTs: ROBINS-I) [45]. 9.09% (n=2) of the enrolled SRs/MAs assessed part of the essential biases (RCTs: concealed allocation and blinding; non-RCTs: confounding and selection bias), which met the minimal requirement [30,43] (rated as "Partial Yes"). Among them, one used the Downs and Black scale but did not evaluate the selective reporting bias [30], whereas the other adopted the modified Newcastle-Ottawa scale for non-RCTs and did not evaluate the selective reporting bias [43]. The remaining 45.45% (n=10) of the enrolled SRs/MAs were rated as "No", including two adopting the Jadad scale for RCTs [29,38], five not reporting the significant bias results [11,27,28,32,33], and three not reporting the tool used [31,34,36].

Item 10. Report on the sources of funding for included studies
95.45% (n=21) of the SRs/MAs failed to report the funding information of the included studies (rated as "No"). Only 4.55% (n=1) of the SRs/MAs reported it in a chart [30] (rated as "Yes").

Item 12. Assessment of the potential impact of risk of bias on the results
13.64% (n=3) of the SRs/MAs included RCTs and/or non-RCTs with different RoB and investigated the potential impact of RoB on MAs or other evidence integration [13,30,43] (rated as "Yes"). Of them, one conducted meta-regression [30], one carried out subgroup analysis [13], and one performed sensitivity analysis to evaluate the high-quality studies [43].

68.18% (n=15) of the SRs/MAs were rated as "Yes" [11, 13, 27-32, 34, 35, 39, 42-45]. Among the 15 SRs/MAs, two discussed the RoB resulting from methodological quality [13,27]; two mentioned the RoB of different study designs [28,32]; one reported the RoB of outcome measurement, blinding and randomization [29]; one addressed the RoB of outcome measurement, blinding, randomization, allocation concealment and selective reporting [11]; one discussed the RoB of methodological quality and study design [30]; two addressed the risk of selection bias [31,34]; one discussed the RoB caused by randomization and small sample size [35]; one reported the RoB of outcome measurement and blinding [39]; one discussed the RoB of implementation [42]; one addressed the RoB of outcome measurement and implementation [43]; one discussed the RoB caused by small sample size [44]; and one addressed the RoB of confounding factors [45]. The remaining 31.82% (n=7) of the SRs/MAs did not investigate the impact of RoB on the total effect [12,33,36-38,40,41] (rated as "No").

Item 14. Exploration and explanation of heterogeneity
81.82% (n=18) of the SRs/MAs investigated the sources of heterogeneity and discussed the potential impact on the conclusions (rated as "Yes"). The remaining 18.18% (n=4) of the SRs/MAs were rated as "No" [12,28,29,38], among which three investigated the source of heterogeneity but did not thoroughly investigate the existing heterogeneity [12,28,38], and one did not report the source of heterogeneity [29].

Item 15*. Investigation and discussion of publication bias (Critical item)

Item 16. Report of sources of conflict of interest
45.45% (n=10) of the SRs/MAs were rated as "Yes", including five that stated that they did not receive any funding related to the research and described the potential conflicts of interest [13,34,39,43,44]. The other five provided information on the sources of funding and explained the role of the funds in the research (such as participation in research design, data collection and analysis, and article writing) as well as the potential conflicts of interest [11,12,29,33,40]. The remaining 54.55% (n=12) of the SRs/MAs were rated as "No". In particular, three of them reported the source of funding but did not describe the role of the funds in the research or the potential conflicts of interest [27,28,37]; two described the funding and potential conflicts of interest, but did not explain the role of the funds in the research [30,41]; one stated that it did not receive any research-related funding, but did not describe the potential conflicts of interest [42]; and six did not describe the source of funding [31,32,35,36,38,45].

Rating of each item and overall confidence
Methodological adherence for each item is presented in Figure 2 (Methodology adherence of each item for included systematic reviews based on AMSTAR 2). Overall confidence is displayed in Figure 3 (Methodology adherence of overall confidence for included systematic reviews based on AMSTAR 2). All the SRs/MAs were rated as "Critically low". For eight items (Items 1, 2, 3, 7, 10, 12, 15, and 16), more than 50% of the SRs/MAs were rated as "No". Of them, Items 2, 7, and 15 were critical items.
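The percentages reported per item follow directly from the counts out of 22 SRs/MAs. The sketch below recomputes them for the items whose "No" counts appear in this text and flags those above 50%. The count for Item 15 (n=14) is back-calculated from the 63.64% given in the abstract and is therefore an assumption; Items 1 and 12 are omitted because their "No" counts are not stated in this section. The names `no_counts` and `pct` are hypothetical.

```python
TOTAL = 22  # number of included SRs/MAs

# "No" counts per AMSTAR 2 item as reported in the text.
# Item 15 is back-calculated from 63.64% (assumption).
no_counts = {2: 18, 3: 16, 7: 19, 10: 21, 15: 14, 16: 12}


def pct(n, total=TOTAL):
    """Percentage of reviews, rounded to two decimals as in the text."""
    return round(100 * n / total, 2)


# Items for which more than half of the reviews were rated "No".
majority_no = {item: pct(n) for item, n in no_counts.items() if n / TOTAL > 0.5}

for item, share in sorted(majority_no.items()):
    print(f"Item {item}: {share}% rated 'No'")
```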

Risk of bias for the included SRs/MAs
The details of the ROBIS assessment of the RoB of each SR/MA are presented in Table 2. Figure 4 (Risk of bias for included SRs/MAs based on ROBIS) shows the bias judgments in Phase 2 and Phase 3 of ROBIS as percentages; all SRs/MAs were rated as "High".

Phase 2.1 Study eligibility criteria
Only 4.55% (n=1) of the SRs/MAs were rated as "Low" [11]. However, 63.64% (n=14) of the SRs/MAs were rated as "High". In particular, 13 SRs/MAs did not describe the intervention measures of the control group [12, 28, 29, 30, 32-34, 36, 40-43, 45], and one did not present the intervention measures of the control group, nor did it report the selected outcomes [31]. The remaining 31.82% (n=7) of the SRs/MAs were rated as "Unclear" because they did not describe whether the protocol was registered in advance [13,27,35,37-39,44]. It was difficult to judge whether the predetermined inclusion criteria were followed based on the study contents and whether the limitations were appropriate based on the research characteristics.

Phase 2.3 Data collection and study appraisal
50.00% (n=11) of the SRs/MAs were rated as "High", among which one did not describe the basic characteristics of the included studies in detail [45], seven did not use appropriate tools to evaluate the RoB [11, 27-29, 31, 36, 38], and three had both of these issues [32-34]. The remaining 50.00% (n=11) of the SRs/MAs were rated as "Unclear", since it was difficult to judge whether all the relevant research results were extracted for data synthesis. Five of them did not describe the process of RoB assessment, so it was also not easy to judge whether two reviewers completed the RoB assessment independently [35,37,42-44].
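A quick consistency check on the two ROBIS domains reported above can be sketched as follows: the judgments within a domain should add up to the 22 included SRs/MAs, and the percentages follow from the counts. The "Low" count of 0 for Phase 2.3 is implied by 11 + 11 = 22 rather than stated explicitly, and the function name `summarize` is hypothetical.

```python
TOTAL = 22  # number of included SRs/MAs

# Judgment counts per ROBIS domain as reported in the text.
# "Low": 0 for Phase 2.3 is implied, not stated (assumption).
domains = {
    "2.1 Study eligibility criteria": {"Low": 1, "High": 14, "Unclear": 7},
    "2.3 Data collection and study appraisal": {"Low": 0, "High": 11, "Unclear": 11},
}


def summarize(tallies, total=TOTAL):
    """Check that the counts cover every review, then return percentages."""
    if sum(tallies.values()) != total:
        raise ValueError("judgment counts do not add up to the number of reviews")
    return {judgment: f"{100 * n / total:.2f}%" for judgment, n in tallies.items()}


for name, tallies in domains.items():
    print(name, summarize(tallies))
```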

Discussion
This is the first time that the AMSTAR 2 and ROBIS tools have been used to assess the methodological quality and RoB of SRs/MAs in the field of stem cell therapy for KOA. This study included 22 SRs/MAs from eight countries published before October 5th, 2021.
Our results showed that the overall methodological quality of the enrolled SRs/MAs was disappointing.
Based on our analysis, following the PRISMA reporting guideline did not seem to improve the methodological quality of the studies. Of the 22 studies, 18 adopted the PRISMA guideline, but all of them were rated as "Critically low", similar to the conclusion drawn by Xu et al. [24]. This may be because the PRISMA reporting guideline only provides researchers with a checklist of items to be reported, while the final report content and level of detail are up to the researchers. According to AMSTAR 2, the deficiencies in critical items in this study mainly came from Items 2, 7, and 15. The PRISMA reporting guideline covers these items in Items 16b, 21, and 24 [88]. This indicates that most of the SRs/MAs stated that they used the PRISMA reporting guideline but did not fully implement it, which also contributed to the low methodological quality of the SRs/MAs.
The findings of this work were consistent with the conclusions drawn in other studies. Wu et al. [22] evaluated SRs/MAs in the field of osteoarthritis treatment and found 167 SRs/MAs. Only seven of these studies were rated as high-quality, and 76% of them were rated as "Critically low". Additionally, the main methodological issues were Items 2, 4, and 7 in AMSTAR 2. Storman et al. [21] discovered that 99% of the SRs/MAs in the field of bariatrics were rated as "Critically low", and only 6% of the SRs/MAs were rated as low RoB by ROBIS. Yan et al. [89] evaluated SRs/MAs related to robotic surgery and found that 95.1% of the studies were rated as "Critically low". Typically, Item 2 Protocol registration and Item 10 Report on the sources of funding for included studies were the main methodological issues in AMSTAR 2. Siemens et al. [90] found that 88.1% of the SRs/MAs concerning advanced cancer patients were rated as "Critically low", and the main causes were not providing the list of excluded literature and the reasons for exclusion (83.5%) and not being registered in advance (85.1%). In addition, the AMSTAR 2 tool has been adopted in numerous fields such as sleep [24], hip-related pain [91], minimally invasive glaucoma surgery [92], and proton pump inhibitor safety [93] to evaluate the methodological quality of studies.
Notably, providing the list of excluded literature with reasons for exclusion and registering protocols in advance have significant impacts on the methodological quality of SRs/MAs. For the former, authors should follow the PRISMA reporting guideline carefully and in detail when conducting SRs/MAs. Meanwhile, whether the latter should be regarded as a critical item of AMSTAR 2 is still controversial. Xu et al. [24] did not take protocol registration as an essential item when evaluating the methodological quality of SRs/MAs in the field of sleep, and argued that it did not reflect the level of methodological quality. Siemens et al. [90] found no significant correlation between protocol registration and methodological quality.
In contrast, Ge et al. [94] stated that early registration improved the overall methodological quality of SRs/MAs, but the impact on the overall reporting quality was not significant. Sideri et al. [95] found in the field of orthodontics that the AMSTAR score of the SRs/MAs registered on PROSPERO increased by an average of 6.6%. The registered SRs/MAs had higher methodological quality, despite their small number. In this study, SRs/MAs that registered the protocol in advance or published the research plan were also rated as "Critically low". However, the correlation between protocol registration and methodological quality should be further evaluated in more fields.
Comprehensive literature retrieval is crucial for the comprehensiveness of SRs/MAs. According to the AMSTAR 2 and PRISMA reporting guidelines, literature retrieval based on electronic databases alone (e.g., PubMed, Embase, Cochrane Library, and Web of Science) is not comprehensive. It is also necessary to search published reviews, clinical trial registration platforms, and gray literature, and to consult field experts, to reduce publication bias. Generally speaking, journals are more likely to publish positive results, while studies with negative results are often difficult to publish [96]. In this study, none of the enrolled SRs/MAs adopted a comprehensive literature search strategy. For gray literature retrieval, only 4.55% (n=1) of the SRs/MAs searched a gray literature database [11], and 27.27% (n=6) searched clinical trial registration platforms [11, 27-29, 37, 43]. Winters et al. [97] proposed three methods to control publication bias, namely, (1) registering before the start of the trial, (2) using a funnel plot to evaluate publication bias visually, and (3) using a comprehensive retrieval strategy that covers multiple gray literature sources. Unpublished studies may significantly change the clinical effect estimated by SRs/MAs, and more comprehensive literature retrieval strategies should be developed in the future.
The AMSTAR 2 and ROBIS tools have been used in combination in diverse fields such as chronic heart failure [98], dyslipidemia [99], infertility [100], and psoriasis [101]. A strong correlation has been reported between the evaluations made with the AMSTAR 2 and ROBIS tools [102,103]. However, the evaluation items in ROBIS need to be further improved to enhance the consistency of evaluation results among different reviewers [104]. The difference in evaluation between two reviewers may arise because the AMSTAR 2 criteria are more specific than the ROBIS criteria. Further research is warranted to verify the reliability and applicability of these two tools in combination [105].
According to the "identification and selection of studies" domain in ROBIS, 86.36% (n=19) of the SRs/MAs were rated as "high bias risk", which was mainly ascribed to the inclusion of original studies that did not meet the inclusion criteria. Notably, 42 original studies involved in the 22 SRs/MAs were considered to be RCTs. After screening the full texts, 16 did not meet the inclusion requirements. Moreover, nine SRs/MAs included studies that did not meet the inclusion criteria, which explains why they had a high RoB in ROBIS [12, 13, 29, 36-39, 43, 45]. In addition, appropriate RoB assessment tools are also necessary to ensure the methodological quality. In this study, 59.09% (n=13) of the SRs/MAs used the Cochrane RoB tool [11-13, 33, 35, 37, 39-45], while 13.64% (n=3) adopted the Jadad scale [28, 29, 38]. However, the Jadad scale cannot evaluate the bias caused by improper random allocation or selective reporting.
In this study, the AMSTAR 2 tool was employed to evaluate the methodological quality of the SRs/MAs, which was rated as "Critically low" when there were two or more issues in the critical items. In fact, the number of flaws in critical items also matters for research quality, and a qualitative rating often cannot reflect these differences objectively. The rating scheme recommended by AMSTAR 2 may therefore not be suitable for differentiating among SRs/MAs with "Critically low" methodological quality.
This study comprehensively evaluated SRs/MAs regarding the application of stem cell therapy in KOA treatment, and almost all the published SRs/MAs in the field of stem cell therapy for KOA were included. Therefore, our results were representative. Nonetheless, certain limitations should be noted. In this study, gray literature databases were not searched, which might introduce publication bias. In addition, this study focused on evaluating the methodological quality of SRs/MAs, but did not discuss the heterogeneity of the SRs/MAs, which may have been unaddressed or may have derived from clinical heterogeneity, methodological heterogeneity, or both.

Conclusions
In conclusion, the overall methodological quality of the SRs/MAs concerning the application of stem cell therapy in treating KOA, as evaluated by AMSTAR 2, is "Critically low", while the RoB assessed by ROBIS is high. It is difficult for these reviews to provide effective evidence for the formulation of guidelines for KOA treatment. We suggest that the relevant methodological quality assessment should be carried out in the future before SRs/MAs are used as clinical evidence. Future SRs/MAs should also be conducted in accordance with AMSTAR 2, so as to improve their methodological quality.