Literature Search Results
The literature screening process is illustrated in Figure 1. The initial search retrieved 401 citations, 110 of which were removed as duplicates. After screening the titles and abstracts of the remaining 291 citations, we excluded 262 irrelevant records based on the selection criteria. Twenty-nine studies remained for full-text assessment. Eleven of these were excluded because of ineligible interventions, including the combination of glucosamine hydrochloride, chondroitin sulphate and curcumin [42], curcumagalactomannoside complex [43, 44], CURs combined with diclofenac [45, 46], herbal formulations of different extracts [47-51], and CURs-free CL extracts [52]. Furthermore, we excluded four conference abstracts [53-56] due to incomplete data. Finally, fourteen eligible studies with 1633 patients were included in our analysis [57-71].
Basic Characteristics of Studies
The study characteristics are presented in Table 1. All included studies were trials of CURs interventions that aimed to evaluate the clinical effectiveness of CURs, and all were published between 2009 and 2021. Of these, eleven studies [57-61, 63, 64, 66, 68-72] were conducted in Asian countries, while the other three [62, 65, 67] were from non-Asian regions. Sample sizes ranged from 30 to 331, and follow-up durations did not exceed 6 months. Forty-three percent of the included studies [59, 62-65, 67] were funded by private corporations, 14% [19, 66, 70] declared that no funding was received, and the rest [57, 58, 60, 61, 68, 69] were funded by government or university research departments. The details of CURs preparations and administration protocols are presented in Table 2 and Table 3. Five trials [57, 58, 64, 66, 70] used active-controlled arms (ibuprofen, diclofenac, and paracetamol), and the other nine [59-63, 65, 67-69, 71, 72] were placebo-controlled trials.
Quality Assessment
The results of the quality assessment based on the Cochrane Risk of Bias Tool 1 are presented in Figure 2. Overall, seven studies [57, 62, 65, 67-70] were judged as having low risk of bias, and five studies [59-61, 63, 64] were judged as having moderate risk of bias because of potential reporting bias [59] and attrition bias [63], and insufficient descriptions of randomization [59, 64], allocation concealment [60] or blinding methods [59, 61, 63]. Two studies [57, 66] were judged as having high risk of bias because of inadequate blinding procedures. An appropriate description of random sequence generation was reported in twelve studies [57, 58, 60-63, 65-70], reasonable allocation concealment was performed in thirteen studies [57-59, 61-70], and double-blinded methods were specified in nine studies [58, 60, 62, 63, 65, 67-70].
MCID for Patient-Reported Outcomes
The MCID is defined as the smallest change in a subjective outcome that patients and clinicians would regard as clinically meaningful [27]. Based on the baseline values of the included studies, the threshold for each outcome was calculated as follows: 1.33/10 for VAS for pain, 8.97/96 for WOMAC total score, 2.12/20 for WOMAC pain score, 6.62/68 for WOMAC function score, and 0.76/8 for WOMAC stiffness score.
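The comparison applied in the Primary Outcomes subsections, a pooled WMD set against the MCID of the same outcome, can be sketched in a few lines of Python. The thresholds and the placebo-controlled pooled effects are the values reported in this section; the function name `exceeds_mcid` and the dictionary layout are illustrative assumptions and not part of the original analysis.

```python
# MCID thresholds reported in this section (points on each scale).
MCID = {
    "VAS pain": 1.33,
    "WOMAC total": 8.97,
    "WOMAC pain": 2.12,
    "WOMAC function": 6.62,
    "WOMAC stiffness": 0.76,
}


def exceeds_mcid(wmd, outcome):
    """Return True when the magnitude of the pooled WMD exceeds the MCID.

    Only the magnitude is compared; the direction of the effect is not checked.
    """
    return abs(wmd) > MCID[outcome]


# Pooled CURs-versus-placebo effects as reported in the Primary Outcomes subsections.
placebo_wmd = {
    "VAS pain": -1.94,
    "WOMAC total": -10.47,
    "WOMAC pain": -1.94,
    "WOMAC function": -6.36,
    "WOMAC stiffness": -0.54,
}

for outcome, wmd in placebo_wmd.items():
    label = "exceeds MCID" if exceeds_mcid(wmd, outcome) else "below MCID"
    print(f"{outcome}: WMD {wmd:+.2f} vs MCID {MCID[outcome]:.2f} -> {label}")
```

Under this rule, a result is described as clinically significant only when it is also statistically significant, so the check above covers one of the two conditions.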
Primary Outcomes
VAS for Pain
Ten studies (769 patients) [57, 59-61, 63-68] assessed knee pain using VAS for pain. The overall analysis revealed that CURs had better efficacy in pain reduction than control groups (WMD: −1.46, 95% CI: −2.13 to −0.8, P < 0.001, Figure 3), with high heterogeneity (I2 = 90.6%). The relative improvement (−1.46) in VAS for pain exceeded the MCID (1.33 for VAS for pain). Compared with placebo, CURs were more efficacious in improving VAS for pain (WMD: −1.94, 95% CI: −2.65 to −1.22, P < 0.001, I2 = 86.3%), whereas no significant difference was detected between CURs and NSAIDs (WMD: −0.3, 95% CI: −0.63 to 0.04, P = 0.082, I2 = 6.3%). For the comparison between CURs and NSAIDs, the therapeutic effect (−0.3) was smaller than the MCID. However, the therapeutic effect (−1.94) of CURs in the placebo-controlled group exceeded the MCID, reaching both statistical and clinical significance.
After removing the study of Atabaki et al. [68], which used CURs-loaded nano-micelles as the intervention, heterogeneity in the placebo-controlled group decreased markedly from 86.3% to 36.3%, and the pooled result (WMD: −1.50, 95% CI: −1.85 to −1.15, P < 0.001) was similar to the original analysis. Sensitivity analysis did not identify any individual study that affected the statistical robustness of the overall results.
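As a rough consistency check on results of this kind, the z statistic and two-sided P value can be back-calculated from a pooled WMD and its 95% CI. The sketch below assumes a normal approximation (CI half-width equal to 1.96 standard errors); the function name `z_and_p_from_ci` is hypothetical, and the inputs are the VAS results reported above.

```python
import math


def z_and_p_from_ci(wmd, lower, upper):
    """Back-calculate z and the two-sided P value from a 95% CI.

    Assumes a normal approximation, so the CI half-width equals 1.96 * SE.
    """
    se = (upper - lower) / (2 * 1.96)
    z = wmd / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Overall VAS result: WMD -1.46, 95% CI -2.13 to -0.8.
z_all, p_all = z_and_p_from_ci(-1.46, -2.13, -0.80)
# CURs versus NSAIDs: WMD -0.3, 95% CI -0.63 to 0.04.
z_nsaid, p_nsaid = z_and_p_from_ci(-0.30, -0.63, 0.04)

print(f"overall: z = {z_all:.2f}, P = {p_all:.5f}")
print(f"NSAIDs:  z = {z_nsaid:.2f}, P = {p_nsaid:.3f}")
```

Because the published intervals are rounded, the recovered P values should land near, rather than exactly on, the reported ones.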
WOMAC Total Score
Seven studies (795 patients) [58, 60, 62-64, 69, 70] reported WOMAC total score data. The overall analysis found that CURs were more efficacious than control groups in functional improvement (WMD: −7.06, 95% CI: −12.27 to −1.84, P < 0.001, Figure 4), with high heterogeneity (I2 = 87.9%). The relative improvement (−7.06) in WOMAC total score did not exceed the MCID (8.97 for WOMAC total score). Compared with placebo, CURs were more efficacious in improving WOMAC total score (WMD: −10.47, 95% CI: −15.65 to −5.3, P < 0.001, I2 = 0.0%), whereas no significant difference was found between CURs and NSAIDs (WMD: −0.68, 95% CI: −3.88 to 2.52, P = 0.676, I2 = 80.6%). For the comparison between CURs and NSAIDs, the therapeutic effect (−0.68) did not exceed the MCID, while the effect size (−10.47) of CURs in the placebo-controlled group was larger than the MCID, reaching both statistical and clinical significance.
When the data of Haroyan et al. [62] were omitted, heterogeneity in the placebo-controlled group fell substantially from 80.6% to 0.0%, but the pooled result of the remaining studies (WMD = −12.88, 95% CI: −14.79 to −10.98, P < 0.001) was similar to the original analysis. Sensitivity analysis did not identify any individual study that affected the statistical robustness of the overall results.
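The leave-one-out procedure described here, re-pooling after omitting one study and tracking the change in I2, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses inverse-variance fixed-effect pooling for brevity, which may differ from the model used in the original analysis, and the effect sizes and standard errors are hypothetical values, not data from the included trials.

```python
def pool_fixed(effects, ses):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2 (in %)."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


def leave_one_out(effects, ses):
    """Re-pool after omitting each study in turn; returns (index, pooled, I^2)."""
    results = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_se = ses[:i] + ses[i + 1:]
        pooled, _, i2 = pool_fixed(rest_y, rest_se)
        results.append((i, pooled, i2))
    return results


# Hypothetical mean differences and standard errors (NOT trial data).
effects = [-12.0, -13.0, -12.5, -4.0]
ses = [1.5, 1.6, 1.4, 1.5]

pooled_all, q_all, i2_all = pool_fixed(effects, ses)
print(f"all studies: WMD = {pooled_all:.2f}, I2 = {i2_all:.1f}%")
for idx, pooled, i2 in leave_one_out(effects, ses):
    print(f"omit study {idx}: WMD = {pooled:.2f}, I2 = {i2:.1f}%")
```

In this made-up data set, the last study is the outlier, so omitting it is what brings I2 down; a real analysis would apply the same loop to the extracted trial data.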
WOMAC Pain Score
Eight studies (956 patients) [58, 60-63, 67, 69, 70] reported WOMAC pain score data. The overall analysis indicated that CURs were significantly better than control groups in terms of pain relief (WMD: −1.42, 95% CI: −2.41 to −0.43, P = 0.005, Figure 5), with high heterogeneity (I2 = 85.6%). The relative improvement (−1.42) in WOMAC pain score did not exceed the MCID (2.12 for WOMAC pain score). Compared with placebo, CURs were significantly more efficacious in improving WOMAC pain score (WMD: −1.94, 95% CI: −2.91 to −0.97, P < 0.001, I2 = 79.2%, Figure 4). There was no significant difference between CURs and NSAIDs (WMD: 0.24, 95% CI: −0.47 to 0.96, P = 0.505, I2 = 0.0%, Figure 4). For the comparison between CURs and NSAIDs, the therapeutic effect (0.24) did not exceed the MCID. Similarly, the effect size (−1.94) of CURs in the placebo-controlled group was smaller than the MCID, reaching statistical but not clinical significance.
The study of Srivastava et al. [61] was considered a potential source of heterogeneity, given that the I2 value in the placebo-controlled group decreased from 79.2% to 51.8% after omitting its data, and the pooled result (WMD: −2.28, 95% CI: −3.05 to −1.52, P < 0.001) then exceeded the threshold (2.12) for clinical significance. Sensitivity analysis did not identify any individual study that affected the statistical robustness of the overall results.
WOMAC Function Score
Eight studies (956 patients) [58, 60-63, 67, 69, 70] assessed joint function using the WOMAC function score. The overall analysis showed significantly better functional improvement in the CURs groups (WMD: −5.04, 95% CI: −7.65 to −2.43, P < 0.001, Figure 6), with high heterogeneity (I2 = 83.6%). The relative improvement (−5.04) in WOMAC function score did not exceed the MCID (6.62 for WOMAC function score). Compared with placebo, CURs were significantly more efficacious in improving WOMAC function score (WMD: −6.36, 95% CI: −8.94 to −3.78, P < 0.001, Figure 5). However, there was no significant difference between CURs and NSAIDs (WMD: −0.57, 95% CI: −3.07 to 1.94, P = 0.657, I2 = 0.0%, Figure 4). For the comparison between CURs and NSAIDs, the therapeutic effect (−0.57) did not exceed the MCID. Similarly, the effect size (−6.36) of CURs in the placebo-controlled group was smaller than the MCID, reaching statistical but not clinical significance.
After excluding the study of Haroyan et al. [62], the inter-study heterogeneity in the placebo-controlled group decreased slightly from 79.2% to 70.7%, but the pooled result (WMD: −7.21, 95% CI: −9.71 to −4.72, P < 0.001) then exceeded the threshold (6.62) for clinical significance. Sensitivity analysis did not identify any individual study that affected the statistical robustness of the overall results.
WOMAC Stiffness Score
Eight studies (956 patients) [58, 60-63, 67, 69, 70] evaluated joint stiffness using the WOMAC stiffness score. The pooled analysis found no significant difference between CURs and control groups in relieving joint stiffness (WMD: −0.34, 95% CI: −0.79 to 0.1, P = 0.131, Figure 7), with high heterogeneity (I2 = 80.7%). The effect size (−0.34) for WOMAC stiffness score did not achieve statistical significance in the overall analysis, and the 95% CI encompassed the MCID (0.76/8 for WOMAC stiffness score). However, compared with placebo, CURs were significantly more efficacious in improving WOMAC stiffness score (WMD: −0.54, 95% CI: −1.03 to −0.05, P = 0.031, I2 = 77.6%, Figure 6). There was no significant difference between CURs and NSAIDs (WMD: 0.19, 95% CI: −0.17 to 0.56, P = 0.298, I2 = 0.0%, Figure 4). For the comparison between CURs and NSAIDs, the therapeutic effect (0.19) did not exceed the MCID. Similarly, the effect size (−0.54) of CURs in the placebo-controlled group was smaller than the MCID, reaching statistical but not clinical significance.
After excluding the study of Panda et al. [63], heterogeneity in the placebo-controlled group decreased from 77.6% to 0.0%, but the pooled result (WMD: −0.31, 95% CI: −0.56 to −0.05, P = 0.018) was similar to the original analysis. Sensitivity analysis did not identify any individual study that affected the statistical robustness of the overall results.
Secondary Outcomes
OA Biomarkers
Two studies [60, 61] assessed the antioxidant activity of CURs by measuring serum levels of reactive oxygen species (ROS), superoxide dismutase (SOD), glutathione (GSH) and malondialdehyde (MDA), and found that changes in these biomarkers may contribute to the therapeutic effects of CURs in alleviating OA symptoms. Three studies [61, 64, 72] reported serum levels of inflammatory mediators, such as interleukin-1β (IL-1β), IL-4, IL-6, tumor necrosis factor-α (TNF-α), leukotriene B4 (LTB4) and prostaglandin E2 (PGE2), and suggested that the systemic anti-inflammatory effects of CURs may not be correlated with their therapeutic effects in knee OA. In addition, six studies [62-64, 66, 68, 72] measured serum C-reactive protein (CRP) concentration and erythrocyte sedimentation rate (ESR), two sensitive biomarkers of systemic inflammation. Two studies [64, 65] evaluated cartilage degeneration via serum levels of C-terminal telopeptides of type II collagen (U-CTX-II) and Coll2-1. Similarly, no significant difference between groups was found for these biomarkers.
Withdrawal Rate and Rescue Medications
All included studies reported the withdrawal rate of the follow-up cohort. No participants were lost in either the treatment or the control group of the study of Atabaki et al. [68], so its data could not be pooled in the meta-analysis. The pooled analysis showed no significant difference in withdrawal rate between CURs and the placebo group (RR: 1.12, 95% CI: 0.76 to 1.64, P = 0.57, I2 = 0.0%) or the NSAIDs group (RR: 0.87, 95% CI: 0.63 to 1.21, P = 0.4, I2 = 14.2%, Supplementary Figure 1). Eight studies [58-60, 63, 65-67, 69] reported the administration of concomitant rescue medications for ethical reasons. Of these, five studies [58, 63-66] reported the number of patients using rescue medications; the pooled analysis found no significant difference in the use of rescue medications between CURs and the placebo group (RR: 0.93, 95% CI: 0.6 to 1.43, P = 0.742, I2 = 55.4%) or the NSAIDs group (RR: 0.99, 95% CI: 0.53 to 1.83, P = 0.963, I2 = 38.8%, Supplementary Figure 2). Four studies [59, 60, 67, 69] recorded the discontinuation of rescue medications; the pooled results showed that the cessation rate of rescue medications in the CURs group was significantly higher than in the placebo group (RR: 4.04, 95% CI: 2.43 to 6.71, P < 0.001, I2 = 11.7%, Supplementary Figure 2).
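To illustrate why a trial with no events in either arm cannot contribute to a pooled risk ratio, the following sketch computes a single-study RR with a 95% CI on the log scale. It is a simplified illustration: the function name `risk_ratio` and the counts are hypothetical, and the function returns `None` whenever either arm has zero events, whereas meta-analysis software often applies a continuity correction when only one arm is empty.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio with a 95% CI computed on the log scale.

    Returns None when either arm has zero events, because log(RR) or its
    standard error is undefined without a continuity correction.
    """
    if events_t == 0 or events_c == 0:
        return None
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper


# Hypothetical withdrawal counts (NOT data from the included trials).
print(risk_ratio(10, 100, 10, 100))  # equal risks in both arms
print(risk_ratio(0, 30, 0, 30))      # no withdrawals in either arm
```

A 95% CI that spans 1 corresponds to a non-significant difference between groups, which is the pattern reported above for withdrawal rates.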
Adverse Events
Among the fourteen included studies, two [68, 69] reported no AEs at the end of the trials. According to the data of the remaining twelve studies (1515 patients), AEs were mainly gastrointestinal symptoms, including meteorism, gastro-oesophageal reflux, dyspepsia, nausea, and stomach pain, as shown in Table 4. The pooled analysis revealed no significant difference between CURs and control groups (RR: 0.86, 95% CI: 0.72 to 1.04, P = 0.122, Figure 8), with low heterogeneity (I2 = 49.2%). There was no significant difference in the incidence of AEs between CURs and the placebo group (RR: 1.22, 95% CI: 0.89 to 1.67, P = 0.227, I2 = 34.7%), while a significantly lower incidence of AEs was observed in the CURs group than in the NSAIDs group (RR: 0.71, 95% CI: 0.57 to 0.90, P = 0.004, I2 = 55.8%).
Subgroup Analysis
Subgroup analyses were performed only in the placebo-controlled group because of the limited number of original studies and to avoid interference from different controls. The results of the subgroup analyses are presented in Table 5. We found no significant difference between the subgroup results and the overall analyses for VAS for pain, WOMAC pain score and WOMAC function score, except in the pure extracts subgroup, where there was no significant difference between CURs and placebo in the improvement of WOMAC pain score. As for clinical significance, the effect sizes of the pure extracts and non-Asia subgroups fell below the MCID for VAS for pain. Conversely, in the time < 12 weeks, daily dose < 1,000 mg, total dose < 50 g, bio-optimized extracts, and Asia subgroups, the effect sizes increased to exceed the MCID for WOMAC pain score and WOMAC function score.
Publication Bias
Egger’s linear regression test for VAS for pain, WOMAC total score, WOMAC pain score, WOMAC function score and WOMAC stiffness score did not detect significant publication bias (P = 0.296, 0.96, 0.78, 0.515, and 0.63, respectively). However, asymmetry of the funnel plots was observed on visual inspection, which indicates potential publication bias (Supplementary Figure 3).
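For readers unfamiliar with the test, Egger's regression fits the standardized effect (effect divided by its standard error) against precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a minimal pure-Python illustration with hypothetical data, not the analysis performed here. It returns the intercept, its standard error and the t statistic; converting the t statistic to a P value requires a t distribution with n − 2 degrees of freedom, which is left to a statistics package.

```python
import math


def egger_intercept(effects, ses):
    """Egger's regression: ordinary least squares of effect/SE on 1/SE.

    Returns (intercept, standard error of the intercept, t statistic).
    """
    y = [e / s for e, s in zip(effects, ses)]
    x = [1.0 / s for s in ses]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = resid_ss / (n - 2)  # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept


# Hypothetical effects and standard errors (NOT trial data): the smaller,
# less precise studies report larger effects, mimicking an asymmetric funnel.
effects = [-1.0, -1.5, -2.0, -2.8, -3.5]
ses = [0.2, 0.3, 0.4, 0.6, 0.8]

intercept, se_int, t_stat = egger_intercept(effects, ses)
print(f"intercept = {intercept:.2f}, SE = {se_int:.2f}, t = {t_stat:.2f}")
```

With few studies per outcome the test has limited power, which is one reason a non-significant Egger result can coexist with visibly asymmetric funnel plots.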