2.1. Literature search
Relevant studies were identified by searching PubMed, Embase, the Cochrane Library, Web of Science, and the Cochrane Controlled Trials Register up to June 2019, using the key words “Romosozumab”, “Teriparatide”, and “postmenopausal osteoporosis” combined with the Boolean operators “AND” or “OR”. Only randomized controlled trials (RCTs) performed in humans were included. The trial selection process is presented as a flow chart (Fig. 1). We also used the PRISMA guidelines, the GRADE system, and the Cochrane Handbook to assess the quality of the included studies and to ensure that the data were reliable.
2.2. Selection criteria
Trials were included if they met the PICOS (population, intervention, comparator, outcome, study design) criteria.
Population: Female patients with postmenopausal osteoporosis
Outcomes: The primary outcomes were the percentage change from baseline in bone mineral density (BMD) at the lumbar spine and total hip at month 6 and month 12 in each group. The secondary outcomes were the percentage change from baseline in BMD at the femoral neck at month 6 and month 12 in each group and the incidence of adverse events at month 12 in each group.
Study design: RCT
2.3. Data extraction
A standard data extraction form was used to collect the relevant data from the included studies. Two reviewers extracted the available data independently, and any disagreement between them was resolved by a third reviewer. The extracted data included authors, publication dates, intervention types, age, sample size, outcomes, duration of follow-up, and reference type. Baseline characteristics of the included trials are presented in Table 1. Data on BMD were taken from tables or figures when they were not reported directly in the article text.
2.4. Risk of bias assessment
According to the Cochrane Handbook for Systematic Reviews of Interventions, the methodological quality and risk of bias of the included studies were assessed in the following domains: randomization, allocation concealment, blinding, selective reporting, incomplete outcome data, and other bias (Figs. 2 and 3).
2.5. Grading quality of evidence
We used the GRADE system, implemented in the GRADE software, to evaluate the level of evidence and the strength of recommendations for the included outcomes. Initially, RCTs were considered to provide high confidence in an estimate of effect and cohort studies were considered to provide low confidence. Reasons that might lower the level of confidence include limitations, inconsistency, indirectness, imprecision, and publication bias. Reasons that might raise the level of confidence include a large effect, plausible confounding, and a dose-response gradient. The GRADE evidence was divided into the following categories: (1) High-quality evidence, which indicated that further research was unlikely to change the confidence in an estimate of effect; (2) Moderate-quality evidence, which indicated that further research was likely to have an important impact on confidence in an estimate of effect and might change the estimate; (3) Low-quality evidence, which indicated that further research was likely to have an important impact on confidence in an estimate of effect and was likely to change the estimate; and (4) Very low-quality evidence, which indicated that we were very uncertain about the results. The results of the GRADE analysis are presented in Table 2.
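The rating logic described above can be illustrated with a minimal sketch. It is not the GRADE software: it assumes, for simplicity, that each listed reason moves the rating by exactly one level, whereas GRADE allows a serious concern to count for two levels and relies on reviewer judgment. The function name `grade_rating` and the dictionary keys are hypothetical.

```python
# Simplified illustration of GRADE rating; assumes one level per reason.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting confidence by study design (RCT = high, cohort = low).
START = {"rct": 3, "cohort": 1}

DOWNGRADE_REASONS = {
    "limitations", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
UPGRADE_REASONS = {"large effect", "plausible confounding", "dose-response"}


def grade_rating(design, downgrades=(), upgrades=()):
    """Return the evidence level after applying downgrade/upgrade reasons."""
    down, up = set(downgrades), set(upgrades)
    unknown = (down - DOWNGRADE_REASONS) | (up - UPGRADE_REASONS)
    if unknown:
        raise ValueError(f"unrecognized reasons: {sorted(unknown)}")
    score = START[design] - len(down) + len(up)
    # Clamp to the four GRADE categories.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


if __name__ == "__main__":
    print(grade_rating("rct", downgrades=["inconsistency"]))
    print(grade_rating("cohort", upgrades=["large effect"]))
```

Under these assumptions, an RCT-based outcome with one concern about inconsistency would be rated moderate.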
2.6. Statistical analysis and data synthesis
Meta-analyses were performed with Review Manager Software for Windows (version 5.3; Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014). The mean difference (MD) with a 95% confidence interval (CI) was used to assess continuous outcomes at months 6 and 12, such as BMD at different skeletal sites. Relative risks with a 95% CI were used to assess dichotomous outcomes, such as adverse events (AEs). The inverse variance and Mantel-Haenszel methods were used to pool the study-level estimates. P values <0.05 were considered statistically significant.
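For readers who want to see the arithmetic behind the two pooling methods named above, the following is a minimal fixed-effect sketch. It is not Review Manager's implementation; the function names and the input layout (mean difference with its standard error, and per-study 2×2 counts) are assumptions made for illustration, and no study data from this review are used.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a 95% CI


def pool_mean_difference(studies):
    """Fixed-effect inverse-variance pooling.

    studies: list of (mean_difference, standard_error) tuples.
    Returns (pooled MD, CI lower, CI upper, two-sided P value).
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total = sum(weights)
    pooled = sum(w * md for w, (md, _) in zip(weights, studies)) / total
    se_pooled = math.sqrt(1.0 / total)
    z = pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal test
    return (pooled, pooled - Z_95 * se_pooled,
            pooled + Z_95 * se_pooled, p_value)


def mantel_haenszel_rr(tables):
    """Mantel-Haenszel pooled relative risk.

    tables: list of (events_treatment, n_treatment,
                     events_control, n_control) tuples.
    """
    num = sum(a * n2 / (n1 + n2) for a, n1, c, n2 in tables)
    den = sum(c * n1 / (n1 + n2) for a, n1, c, n2 in tables)
    return num / den


if __name__ == "__main__":
    # Made-up numbers purely to exercise the functions.
    print(pool_mean_difference([(2.0, 1.0), (4.0, 1.0)]))
    print(mantel_haenszel_rr([(10, 100, 20, 100)]))
```

In the inverse-variance method each study is weighted by the reciprocal of its variance, so more precise trials contribute more to the pooled MD; the Mantel-Haenszel method weights each 2×2 table by its size instead.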
2.7. Investigation of heterogeneity and publication bias
Statistical heterogeneity among the included studies was evaluated using the chi-square test and quantified with the P value and the I² statistic. If I² was <50%, heterogeneity was considered unlikely to be important and a fixed-effects model was used. If I² was between 50% and 100%, heterogeneity was considered substantial and a random-effects model was used. Because thresholds for the interpretation of I² can be misleading (the importance of inconsistency depends on several factors), subgroup analysis or sensitivity analysis was performed to explore potential sources of heterogeneity. Because only four studies were included, a test for publication bias was considered unnecessary.
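The heterogeneity rule described above can be sketched as follows. This is an illustration under stated assumptions rather than Review Manager's code: Cochran's Q is computed from inverse-variance weights, I² is derived from Q and its degrees of freedom, and the between-study variance uses the DerSimonian-Laird estimator, which is assumed here as the random-effects method. All function names are hypothetical.

```python
def cochran_q(effects, ses):
    """Return (Q, degrees of freedom) using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q, df):
    """I2 as a percentage: share of variation beyond chance, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def dersimonian_laird_tau2(effects, ses):
    """Between-study variance (assumed estimator for the random model)."""
    weights = [1.0 / se ** 2 for se in ses]
    q, df = cochran_q(effects, ses)
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - df) / c) if c > 0 else 0.0


def choose_model(i2, threshold=50.0):
    """Apply the rule from the text: fixed below 50%, random otherwise."""
    return "fixed" if i2 < threshold else "random"


if __name__ == "__main__":
    # Made-up effects and standard errors, not data from the review.
    effects, ses = [0.0, 10.0], [1.0, 1.0]
    q, df = cochran_q(effects, ses)
    print(q, df, i_squared(q, df), choose_model(i_squared(q, df)))
```

With only a handful of studies, Q has low power and I² is imprecise, which is one reason the threshold is treated as a guide rather than a strict cutoff.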