Quantitative study on the efficacy of acupuncture in the treatment of menopausal hot flashes and its comparison with nonhormonal drugs

Supplemental Digital Content is available in the text.

Abstract

Objective: This study aimed to compare the efficacy of acupuncture with that of sham acupuncture, placebo pills, and nonhormonal drugs to provide the quantitative information needed to establish medication guidelines for menopausal hot flashes.

Methods: A comprehensive literature search was performed using public databases. Randomized clinical studies of acupuncture therapy for hot flashes in menopausal women were identified. A time-course model was established to describe the efficacy characteristics of acupuncture and sham acupuncture, which were compared with the efficacy of nonhormonal drugs and placebo pills reported in the literature.

Results: A total of 17 studies involving 1,123 participants were included. The quality of all included studies was medium to high, and there was no obvious risk of bias. The baseline number of hot flashes was an important factor affecting the efficacy of acupuncture and sham acupuncture. After correcting the baseline to eight hot flashes per day, the decreases in hot flash frequency from baseline at week 8 for traditional acupuncture (TA), electro-acupuncture (EA), TA&EA (merged analysis of TA and EA), and sham acupuncture were 3.1 (95% confidence interval [CI]: 2.8-3.4), 3.6 (95% CI: 3.2-4.0), 3.2 (95% CI: 2.9-3.5), and 2.6 (95% CI: 2.2-3.0) times/d, respectively. Compared with findings reported in the literature, the efficacy of electro-acupuncture was comparable to that of selective serotonin reuptake inhibitors/serotonin–norepinephrine reuptake inhibitors and neuroleptic agents such as gabapentin and escitalopram. Furthermore, the efficacy of TA&EA (merged) was significantly higher than that of placebo pills (2.3, 95% CI: 1.8-2.9).

Conclusions: The efficacy of TA&EA (merged) was higher than that of sham acupuncture and significantly higher than that of placebo pills. The efficacy of electro-acupuncture was higher than that of traditional acupuncture, significantly higher than that of sham acupuncture, and comparable to that of selective serotonin reuptake inhibitors/serotonin–norepinephrine reuptake inhibitors and neuroleptic agents.

Hot flashes are disturbing symptoms during menopause that are caused by declining estrogen levels and are usually accompanied by profuse sweating and a sensation of intense heat in the face, neck, or chest. 1 In America, approximately 75% of menopausal women experience hot flashes, which may impact their quality of life. 2 Hormone therapy can significantly alleviate menopausal hot flashes; however, due to safety concerns, hormone therapy products are contraindicated for women who have breast cancer or a history of breast cancer, venous thromboembolism or a history of venous thromboembolism, thrombophilia, or a history of stroke or myocardial infarction. 3 The selective serotonin reuptake inhibitor (SSRI) paroxetine was approved by the US Food and Drug Administration in 2013 and was the first nonhormonal drug approved for the treatment of menopausal hot flashes. 4 However, the efficacy of paroxetine in relieving menopausal hot flashes is modest. The decrease in the frequency of hot flashes after 12 weeks of treatment is only 14.6% greater than that with placebo, which is equivalent to only 40% of the efficacy of estradiol. 5 Other nonhormonal drugs such as gabapentin, clonidine, SSRIs (except paroxetine), and serotonin–norepinephrine reuptake inhibitors (SNRIs) are often used off-label to treat menopausal hot flashes. [6][7][8][9] However, in addition to a wide range of central side effects and the risk of drug interactions, 10 these drugs fail to demonstrate much of an efficacy advantage, and their maximum efficacy is no more than half that of estradiol. 5 Phytoestrogens are also widely used in the treatment of menopausal hot flashes because of their estrogen-like effects. 11 However, their actual efficacy remains controversial. 
12 For instance, our previous study suggested that the maximum efficacy of soy isoflavones was higher than that of paroxetine, but the onset of action was very slow and at least 16 weeks of treatment was needed to reach or surpass the efficacy of paroxetine. 5 As the efficacy of nonhormonal drugs is generally poor, some individuals use complementary or alternative therapies, of which acupuncture is one of the most common. 13 The prevalence of acupuncture use by women to treat menopausal hot flashes is estimated to be 1% to 10.4%. 1 A large number of studies have confirmed that acupuncture can promote the release of endogenous analgesics, thereby preventing the development of chronic pain. 14 However, the mechanism by which acupuncture alleviates menopausal hot flashes remains unclear. 15,16 A randomized controlled trial published in 2016 showed that acupuncture was no more effective in alleviating menopausal hot flashes than nonintrusive sham acupuncture. 2 Many researchers believe that acupuncture is only a powerful placebo and that its efficacy comes primarily from strong patient beliefs and expectations regarding acupuncture, rather than any direct "biological effect" on the body. Opponents argue that nonintrusive sham acupuncture may also produce certain biological effects through skin irritation and is therefore not completely inert. 17 Consequently, using nonintrusive sham acupuncture as a control may actually increase the placebo effect, making it difficult to draw accurate conclusions from acupuncture trials. 1 Although there have been a number of meta-analyses on the efficacy of acupuncture in relieving menopausal hot flashes, the conclusions are contradictory. [18][19][20][21][22] The contradictions may result from limitations of the studies. First, most acupuncture trials report repeated measurements of hot flashes over time. 
However, previous meta-analyses often aggregated the results of only the final time point (ranging from 4 to 24 wk), which not only excludes valuable information from other follow-up time points, but also results in biased estimates of efficacy because different follow-up time points are used. 11 Second, there is a large amount of heterogeneity between the studies, 19 such as in treatment duration, baseline frequency of hot flashes, the method of blinding, and other factors that affect the reliability of the conclusions. Model-based meta-analysis (MBMA) is an efficient type of meta-analysis that explicitly incorporates the effects of treatment duration and other covariates using pharmacology models, thereby reducing the heterogeneity among studies. MBMA has become an important analytical method in model-informed drug development. 5,11 We previously determined the distribution of efficacy for placebos and nonhormonal drugs in relieving menopausal hot flashes using MBMA. 4,5 In the current study, we used MBMA to quantitatively analyze the efficacy characteristics of acupuncture in treating menopausal hot flashes. This study focused on comparing the efficacy of acupuncture with that of sham acupuncture, the efficacy of sham acupuncture with that of an oral pharmacological placebo, and the efficacy of acupuncture with that of nonhormonal drugs to provide the quantitative information necessary for establishing medication guidelines for treating menopausal hot flashes.

Search strategy
We carried out a comprehensive search for literature indexed in the PubMed, EMBASE, Cochrane Library, Web of Science, and SCOPUS databases, with publication dates ranging from the inception of each database until January 8, 2019. The search terms included "hot flashes" and "acupuncture." Only clinical trials were included and the language was restricted to English. The detailed search strategies are described in the Supplementary Search Strategy, http://links.lww.com/MENO/A729.

Inclusion and exclusion criteria
Only studies that met the following criteria were included: (1) randomized controlled acupuncture trials; (2) participants were premenopausal and postmenopausal women or patients with a history of breast cancer (including breast cancer survivors and patients undergoing chemotherapy or radiotherapy; these patients were included because hot flashes are also one of the most typical symptoms in this population); (3) studies reporting the frequency of hot flashes at baseline and after treatment; and (4) if the study had a cross-over design, only data from the first period were included to avoid carryover (legacy) effects. Cross-over trials that failed to report outcomes for the first period were excluded. To reduce heterogeneity, studies of participants with other comorbidities, or of acupuncture combined with other treatments, were excluded.

Data extraction
Data were independently extracted by two researchers and checked by a third person for any discrepancies. A Microsoft Excel (version 16.0.12130.20232) database was used to categorize relevant information from the included studies. The information collected included literature characteristics (author, year of publication, and clinical trial registration number), trial design (grouping, sample size, treatment duration, blinding method, frequency of acupuncture treatment, and total number of acupuncture sessions), characteristics of participants (age, weight, region, frequency of hot flashes at baseline, duration of disease, and type of participants), and clinical outcomes (frequency of hot flashes at each follow-up time point).
Data from intention-to-treat groups were entered when available. For published per-protocol data, unreported sample sizes during treatment were calculated based on the endpoint sample size. If efficacy results were presented in graphs, the digitizing software Engauge Digitizer (version 4.1, 2002, by Mark Mitchell) was used to extract the graphical data. Data extraction errors between two independent researchers were not allowed to exceed 2%, and the mean values were considered as the final results.

Risk of bias assessment
Two investigators independently extracted relevant information and assessed the risk of bias using the Cochrane Risk of Bias Tool. 23 Any disagreement was resolved through discussion with a third investigator. The evaluation items included random sequence generation, allocation concealment, blinding of participants and personnel, blinding for outcome assessment, incomplete outcome data, selective reporting, and other biases. Other biases were defined as trials in which baseline characteristics were not comparable between different treatment groups.
These studies were graded as high, moderate, or low quality based on the following criteria: (1) studies were considered to be of high quality when both randomization and allocation concealment were assessed as low risks of bias, and all other items were assessed as low or unclear risk of bias in a trial; (2) studies were considered to be of low quality if either randomization or allocation concealment was assessed as a high risk of bias, regardless of the risk of other items; (3) studies were considered to be of moderate quality if they did not meet the criteria for high or low quality.
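The three grading criteria above can be read as a small decision rule. The following Python sketch is only an illustration of that rule; the item names used as dictionary keys are hypothetical labels chosen here, not identifiers from the Cochrane tool or from the study.

```python
# Hypothetical labels for the two items that drive the grading rule.
KEY_ITEMS = ("random_sequence_generation", "allocation_concealment")


def grade_study(risks):
    """Grade one study from its risk-of-bias judgements.

    `risks` maps each assessed item to "low", "unclear", or "high".
    """
    randomization = risks["random_sequence_generation"]
    concealment = risks["allocation_concealment"]
    other_items = [v for k, v in risks.items() if k not in KEY_ITEMS]

    # Criterion 2: a high risk in either key item means low quality,
    # regardless of the other items.
    if randomization == "high" or concealment == "high":
        return "low"
    # Criterion 1: both key items low, all other items low or unclear.
    if (randomization == "low" and concealment == "low"
            and all(v in ("low", "unclear") for v in other_items)):
        return "high"
    # Criterion 3: everything that is neither high nor low quality.
    return "moderate"
```

Under this reading, a study with unclear allocation concealment is graded as moderate quality even when every other item is at low risk.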

Model building
In a previous study, we found that changes in the frequency of hot flashes varied with time and reached a plateau, which is consistent with the Emax model. The Emax model demonstrates good biological plausibility and is frequently used for modeling the pharmacodynamic properties of drugs. 24 This model has two important parameters, Emax and ET50. Emax is the maximum possible efficacy that a treatment can achieve. ET50 is the time at which 50% of the maximum efficacy has been achieved, and it represents the speed of onset of treatment. Different combinations of Emax and ET50 values result in different time-course curves. In the current study, the Emax model was used to fit the longitudinal efficacy data of each treatment group, and the Emax and ET50 values of each treatment group were obtained by Bayesian feedback.
Part of the heterogeneity between trials could be explained by covariates. In the current study, factors that could potentially affect model parameters, such as age, weight, baseline frequency of hot flashes, duration of disease, blinding method, type of participants, and number of acupuncture sessions per week, were screened as covariates on both Emax and ET50. A detailed description of the construction of the model is given in the Supplementary material, http://links.lww.com/MENO/A729.
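As an illustration of how the two parameters shape the time course, the sketch below evaluates the standard hyperbolic form of the model, in which the response equals the maximum effect multiplied by time / (ET50 + time). This form is assumed from the parameter definitions above, and the numeric values in the usage note are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def emax_response(week, e_max, et50):
    """Reduction in daily hot flash frequency from baseline after `week` weeks.

    Assumes the hyperbolic form e_max * week / (et50 + week), so the
    response is exactly half of e_max when week == et50.
    """
    if week < 0:
        raise ValueError("week must be non-negative")
    if et50 <= 0:
        raise ValueError("et50 must be positive")
    return e_max * week / (et50 + week)
```

With a hypothetical maximum effect of 4.0 times/d and a half-effect time of 2.0 weeks, `emax_response(2, 4.0, 2.0)` returns half of the maximum effect, and the curve flattens toward 4.0 as the treatment duration grows.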

Model evaluation
The goodness-of-fit of the final model was evaluated using diagnostic graphs. The sensitivity of the final model was evaluated using the leave-one-out cross-validation method. 25 Briefly, data from one trial at a time were dropped from the full data set and the remaining data were used to refit the final model. The parameter estimates obtained from each reduced data set were compared to investigate the stability of the model. The final model was further validated by a visual predictive check, 26 which is commonly used to determine whether a model is able to reproduce the variability and main trend of observed data. Specifically, 1,000 datasets were simulated using Monte Carlo simulations based on the final model parameters. The observed data were then compared with the 95% confidence intervals (CI) of the simulated data to assess the predictive capacity of the final model.
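The leave-one-out procedure can be sketched in a few lines of Python. This is a deliberately simplified stand-in: instead of refitting the full mixed-effects model, it assumes the hyperbolic Emax form, holds ET50 fixed, and estimates a single maximum effect by ordinary least squares. The record layout `(trial, week, change)` is a hypothetical convention for this example.

```python
def fit_emax_fixed_et50(records, et50):
    """Least-squares maximum effect for change = e_max * week / (et50 + week).

    With et50 held fixed the model is linear in e_max, so the estimate
    has the closed form sum(f * y) / sum(f * f).
    """
    numerator = 0.0
    denominator = 0.0
    for _trial, week, change in records:
        fraction = week / (et50 + week)
        numerator += fraction * change
        denominator += fraction * fraction
    return numerator / denominator


def leave_one_trial_out(records, et50):
    """Refit after dropping each trial in turn; returns {dropped_trial: e_max}."""
    trials = sorted({trial for trial, _week, _change in records})
    return {
        dropped: fit_emax_fixed_et50(
            [r for r in records if r[0] != dropped], et50
        )
        for dropped in trials
    }
```

If the estimates returned for the different dropped trials stay close to the full-data estimate, no single trial dominates the fit, which is the sense in which the text calls the parameters stable.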

Typical efficacy analysis
When a covariate was found to have a significant impact on the model parameters, the parameters were corrected for the covariate to ensure that they were comparable at different covariate levels. The methodology for covariate correction is detailed in the Supplementary material, http://links.lww.com/MENO/A729. A single-arm meta-analysis with a random-effects model was then used to summarize the model parameters according to each type of treatment. The typical value and 95% confidence interval (CI) of the parameters of each treatment were obtained from the analysis. Based on these parameters, the typical efficacy and 95% CI of each treatment at different time points were simulated 1,000 times using Monte Carlo simulation.
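To make the pooling step concrete, the sketch below implements a random-effects summary of per-group estimates using the DerSimonian-Laird moment estimator. The choice of this estimator is an assumption made for illustration, since the text does not state which random-effects method was used, and the inputs in the usage note are hypothetical.

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling of single-arm estimates.

    Returns (pooled, ci_low, ci_high, tau2), where tau2 is the estimated
    between-group variance and the interval is a normal-theory 95% CI.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / total_w

    # Cochran's Q and the moment estimate of between-group variance.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    c = total_w - sum(w ** 2 for w in weights) / total_w
    k = len(estimates)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Re-weight each group by its total (within + between) variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, tau2
```

For example, `pool_random_effects([3.8, 4.1, 3.5], [0.4, 0.5, 0.3])` pools three hypothetical corrected maximum-effect estimates into one typical value with its 95% CI.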

Software
Model parameter estimation was performed using NONMEM 7.4.1 (level 1.0, ICON Development Solutions, New York, NY). Meta-analysis was performed using Stata (Release 13, StataCorp LP, College Station, TX). Plotting, model simulation, and statistical analysis were performed using R software (version 3.6.0, The R Foundation for Statistical Computing, Vienna, Austria).

Characteristics of the included studies
A total of 17 studies including 1,123 participants were ultimately included for analysis (Fig. 1). Of these, 7 studies (329 participants) involved patients with a history of breast cancer. Three types of treatment were involved: traditional acupuncture (TA, N = 631), electro-acupuncture (EA, N = 69), and sham acupuncture (N = 423). The mean age of the participants ranged from 50.4 to 61 years with a median age of 54 years. The mean frequency of hot flashes at baseline ranged from 1.9 to 12.9 times per day with a median frequency of 8.4 times per day. The acupuncture frequency ranged from 0.7 to 12 times per week with a median frequency of 1.3 times per week. The treatment duration ranged from 4 to 24 weeks with a median time of 8 weeks. The sample size of each treatment arm ranged from 10 to 170 with a median value of 24 (Table 1).
Detailed information regarding the included studies is listed in Supplementary Table S1, http://links.lww.com/MENO/A729. Of the 17 studies, 3 (17.6%) were high quality, 14 (82.4%) were medium quality, and no studies were low quality. The quality of the 17 studies is shown in Supplementary Figure S1, http://links.lww.com/MENO/A729.

Model establishment and assessment
The time course of changes in hot flash frequency from baseline was well described by the Emax model. The goodness-of-fit plots of the acupuncture response model indicated a relatively good fit to the observed data (Supplementary Figures S2 and S3, http://links.lww.com/MENO/A729). The conditional weighted residual errors (CWRES), including in plots of population prediction versus CWRES, showed no trend and were randomly scattered around the reference line at CWRES = 0, indicating the suitability of the error model for this study. The visual predictive check plots indicated that the 95% CI of model prediction covered almost all of the observed data, demonstrating good predictability by the model (Fig. 2). The results of leave-one-out cross-validation showed that the distribution of model parameters was stable and only slightly affected by individual trials (Supplementary Figure S4, http://links.lww.com/MENO/A729).

The covariate screening process revealed that the baseline frequency of hot flashes had a significant impact on the Emax value (Supplementary Figure S5, http://links.lww.com/MENO/A729). The final model is given as Equation 1. In Equation 1, Emax,i was the Emax value for the ith group, with 3.64 being the typical Emax value for the overall group, and Baseline i was the frequency of hot flashes for the ith group at baseline. For every one-unit increase in the daily number of hot flashes at baseline, the Emax value increased by 0.343 times/d. Therefore, the correction coefficient of the baseline for Emax was 0.343. Normal (0, 1.049²) was the inter-group variation of the Emax value, described by a normal distribution with a mean of 0 and a standard deviation of 1.049. The optimal Emax,i values and the standard errors for each group as estimated by Bayesian feedback are shown in Supplementary Table S2, http://links.lww.com/MENO/A729. During the model building process, the inter-group variability of ET50 was close to zero, which means that the ET50 value of all the groups was close to 1.89 weeks. Finally, the inter-group variability of ET50 was fixed to 0 to enhance model stability in the final model (Equation 2).
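The displayed equations are not present in this copy of the text. A form consistent with the parameter descriptions in this section is sketched below. The additive linear covariate structure is an assumption, as is the reference baseline, written here as the placeholder symbol \(\text{Baseline}_{\text{ref}}\), because its value is not stated in the surrounding text. The last line is the standard hyperbolic time course implied by the definitions of the maximum effect and the half-effect time.

```latex
\begin{align}
E_{\max,i} &= 3.64 + 0.343 \times \left(\text{Baseline}_i - \text{Baseline}_{\text{ref}}\right) + \eta_i,
  \qquad \eta_i \sim \text{Normal}\left(0,\ 1.049^{2}\right) \tag{1} \\
ET_{50,i} &= 1.89\ \text{weeks} \tag{2} \\
E_i(t) &= \frac{E_{\max,i} \times t}{ET_{50,i} + t} \notag
\end{align}
```

Here \(E_i(t)\) denotes the reduction in daily hot flash frequency from baseline for group \(i\) after \(t\) weeks of treatment.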

Typical efficacy analysis
As the frequency of hot flashes at baseline had a significant impact on the Emax value, it was necessary to perform baseline correction of the Emax value for each treatment group. The correction equation used is given as Equation 3. In Equation 3, Emax,i,corrected was the corrected Emax value for the ith group and Emax,i was the Emax value for the ith group. This equation corrected the Emax values at different baselines to a common baseline of 8 hot flashes per day, thereby eliminating the impact of the baseline frequency of hot flashes on Emax values when comparing the efficacy characteristics of the different treatment groups.
To compare the efficacy characteristics of different interventions, we summarized the corrected Emax value for each group according to the intervention and estimated the 95% CI of the efficacy at week 8 for each type of intervention. The results revealed that the corrected Emax values of TA&EA (merged) and sham acupuncture were 4.0 (95% CI: 3.6-4.3) and 3.2 (95% CI: 2.7-3.7), respectively. Further division of acupuncture into electro-acupuncture and traditional acupuncture resulted in corrected Emax values of 4.4 (95% CI: 3.9-4.9) and 3.8 (95% CI: 3.5-4.2), respectively (Table 2).
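Equation 3 is not printed in this copy, so the following Python sketch shows one form that matches its description: each group's maximum effect is shifted along the reported baseline slope of 0.343 to the reference baseline of 8 hot flashes per day. The additive form is an assumption, and the input values in the usage note are hypothetical.

```python
BASELINE_COEFFICIENT = 0.343  # reported correction coefficient of baseline on the maximum effect
REFERENCE_BASELINE = 8.0      # hot flashes per day used as the common comparison level


def correct_emax(e_max_i, baseline_i):
    """Shift a group's maximum effect to the reference baseline.

    Assumed additive form:
    corrected = e_max_i - 0.343 * (baseline_i - 8).
    """
    return e_max_i - BASELINE_COEFFICIENT * (baseline_i - REFERENCE_BASELINE)
```

A hypothetical group with a maximum effect of 5.0 times/d at a baseline of 10 hot flashes per day would be corrected downward by 0.686 times/d, while a group already at a baseline of 8 is left unchanged.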
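The week-8 point estimates quoted in the abstract can be recovered from these corrected maximum effects and the ET50 of 1.89 weeks reported in the model results. The Python sketch below assumes the hyperbolic time-course form and reproduces only the point estimates; the confidence intervals in the paper come from Monte Carlo simulation, which is not reproduced here.

```python
ET50_WEEKS = 1.89  # reported half-effect time shared by all groups

# Corrected maximum effects (times/d) as reported for each intervention.
CORRECTED_EMAX = {"TA&EA": 4.0, "EA": 4.4, "TA": 3.8, "sham": 3.2}


def efficacy_at(week, e_max, et50=ET50_WEEKS):
    """Reduction in daily hot flash frequency after `week` weeks (hyperbolic form)."""
    return e_max * week / (et50 + week)


# Week-8 reductions, rounded to one decimal as in the reported results.
week8 = {name: round(efficacy_at(8, value), 1)
         for name, value in CORRECTED_EMAX.items()}
```

The resulting values are 3.2 (TA&EA), 3.6 (EA), 3.1 (TA), and 2.6 (sham) times/d, matching the week-8 figures given in the abstract.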

DISCUSSION
Acupuncture is a common supplementary or alternative therapy used to relieve menopausal hot flashes. 13 However, the efficacy of acupuncture remains controversial. 27 Meta-analyses of acupuncture for the treatment of menopausal hot flashes have suggested that there is no significant difference in efficacy between acupuncture and sham acupuncture. 18,[20][21][22] However, due to heterogeneity among the different studies, e.g., in treatment duration, baseline frequency of hot flashes, blinding method, and other factors, the results from previous meta-analyses may be subject to substantial bias.
In this study, we found that the efficacy of acupuncture was significantly related to treatment duration. The reduction in the frequency of hot flashes increased with time and reached a plateau, and the onset time (ET50) of acupuncture was approximately 2 weeks. After 8 weeks of treatment, the efficacy of acupuncture was close to 80% of the maximum effect (Emax) and was approximately 1.6 times that at 2 weeks. In addition, we found that the efficacy of acupuncture was also related to the baseline frequency of hot flashes. The higher the baseline frequency of hot flashes, the greater the efficacy.

Table 2 notes: EA, electro-acupuncture; SA, sham acupuncture; TA, traditional acupuncture; TA&EA, merged analysis of traditional acupuncture and electro-acupuncture. a The data for placebo pills were cited from Li, T 2018. 26 In that study, the efficacy of placebo pills was not associated with the baseline frequency of hot flashes; thus, the corrected Emax value of placebo pills was the same as the original value.

To remove the influence of treatment duration and baseline frequency of hot flashes on efficacy, we compared the efficacy of each intervention at the same treatment duration (8 wk) using the same baseline level of hot flashes (8 times/d). The results showed that, at 8 weeks, TA&EA (merged) treatment reduced the frequency of hot flashes by 0.6 times/d more than sham acupuncture treatment. However, the 95% CIs partially overlapped for these treatments due to the large variations among trials.
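The 80% and 1.6-fold figures quoted in this part of the Discussion follow directly from the reported ET50 of 1.89 weeks, assuming the hyperbolic time-course form in which the fraction of the maximum effect reached at time \(t\) is \(t / (ET_{50} + t)\):

```latex
\begin{align*}
\frac{E(8)}{E_{\max}} &= \frac{8}{1.89 + 8} \approx 0.81, \\
\frac{E(2)}{E_{\max}} &= \frac{2}{1.89 + 2} \approx 0.51, \\
\frac{E(8)}{E(2)} &\approx \frac{0.81}{0.51} \approx 1.6 .
\end{align*}
```

The same fraction applied to the corrected maximum effects of 4.0 and 3.2 times/d gives week-8 reductions of about 3.2 and 2.6 times/d, that is, the 0.6 times/d difference between TA&EA (merged) and sham acupuncture.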
It is worth noting that in our current study, acupuncture intervention was divided into traditional acupuncture and electro-acupuncture, a distinction that was not made in previous meta-analyses. Our study revealed that the efficacy of electro-acupuncture was significantly higher than that of sham acupuncture. The difference in efficacy between electro-acupuncture and sham acupuncture at 8 weeks of treatment was a reduction in the frequency of hot flashes of one per day on average, which is comparable to the pure effect of paroxetine (after deducting the corresponding placebo effect). Since paroxetine was the first nonhormonal drug approved by the US Food and Drug Administration for the treatment of menopausal hot flashes, we believe that the difference in efficacy between electro-acupuncture and sham acupuncture is considerable, but its clinical significance requires further discussion. In addition, we found that the interindividual variation in the efficacy of electro-acupuncture reported in the literature was smaller than that for traditional acupuncture; their relative standard deviations were approximately 30% and 40%, respectively. [28][29][30] The reason for the difference may be that the intensity of electro-acupuncture is controlled by an instrument rather than by an acupuncture therapist. Thus, electro-acupuncture is easier to standardize than manually administered traditional acupuncture, which makes its efficacy more stable. However, it should be pointed out that only three electro-acupuncture trials were included in the current study; more trials are needed to verify the efficacy of electro-acupuncture.
Although previous studies have shown that the incidence and the intensity of hot flashes in patients with breast cancer are higher than those in naturally menopausal women, 31 our previous research quantitatively analyzed the efficacy of different drugs in relieving hot flashes in breast cancer patients and found that the efficacies of placebo and drugs in breast cancer patients are comparable to those in naturally menopausal women. 26 The current study also found no significant difference in the efficacy of acupuncture between breast cancer patients and naturally menopausal women. Therefore, the response to treatment of hot flashes does not appear to differ significantly between breast cancer patients and naturally menopausal women, and the results for these two populations can be combined to increase the accuracy of the efficacy estimates.

LI ET AL
Beyond the comparison with sham acupuncture, we also wanted to know how effective electro-acupuncture was relative to existing drugs. Therefore, in the current study, we compared the efficacy of acupuncture with the efficacies of drugs we had previously reported. The results revealed that the efficacy of electro-acupuncture was approximately half that of progesterone and was comparable to that of SSRIs/SNRIs and neuroleptic agents such as gabapentin and escitalopram. Because of the common side effects of SSRIs/SNRIs and neuroleptic agents, such as insomnia, nausea, dry mouth, dizziness, fatigue, and anxiety, patient compliance with these drugs is not high. 5 Furthermore, some drugs, such as paroxetine and fluoxetine, are inhibitors of CYP2D6, which may prevent tamoxifen from being converted into its active metabolite in the body, thus affecting the efficacy of breast cancer treatments. 5 In addition, some studies have shown that acupuncture can also provide additional physical and mental health benefits such as improved sleep quality and reduced fatigue. 1 Therefore, electro-acupuncture may be a good choice for patients who are worried about drug side effects or adverse drug interactions.
Currently, sham acupuncture is commonly used as the control in clinical trials of acupuncture. However, whether placebo needles are valid controls in acupuncture research is the subject of ongoing debate. Some studies have pointed out that placebo needles may exhibit therapeutic effects, including enhanced touch sensation, direct stimulation of the somatosensory system, and activation of multiple brain systems. 32 These effects may result in physiological responses similar to those of real acupuncture. Therefore, in the current study, we compared the efficacy of sham acupuncture with the placebo response in previous drug intervention trials. The results showed that although the efficacy of sham acupuncture in reducing the frequency of hot flashes was 0.3 times/d higher than that of placebo pills at 8 weeks of treatment, the difference was not statistically significant because of large variations among the trials. In addition, there are various forms of sham acupuncture, including shallow needling at non-acupoints and non-piercing blunt-needle sham acupuncture. Subgroup analyses showed that the efficacies of the non-acupoint shallow needling group and the non-piercing blunt-needle group were 2.4 (95% CI: 1.5-3.2) and 2.9 (95% CI: 2.6-3.2) at 8 weeks, respectively. There was no significant difference between these two treatments. It should also be noted that although the efficacy of sham acupuncture was not significantly better than that of placebo pills, this does not mean that acupuncture is simply a placebo. One study demonstrated that placebo needles have unique characteristics that are different from those of placebo pills. 33 Interestingly, we found that while there was no significant difference between TA&EA (merged) and sham acupuncture, the efficacy of TA&EA (merged) was significantly higher than that of placebo pills. The difference in efficacy between TA&EA (merged) and placebo pills at 8 weeks was 0.9 hot flashes/d. 
These results suggested that even though the difference in efficacy between sham acupuncture and placebo pills was small, the difference was sufficient to influence the conclusion regarding whether acupuncture was effective.
The evaluation of changes in the frequency of hot flashes was relatively subjective, as the daily frequency of hot flashes was self-reported by the patients. 34 Thus, the interindividual variation in efficacy values for such trials is relatively large. Usually, this type of trial requires a large sample size to obtain high power. However, the sample sizes of the acupuncture clinical trials included in our current analyses were generally small, and most of them did not report a sample size calculation. This is one of the reasons why many clinical trials have reached the conclusion that acupuncture is ineffective.
There are some notable limitations of the current study. First, there was a large difference in the baseline frequency of hot flashes among the different studies. In four of these studies, the baseline frequency of hot flashes was less than 5/d (mild to moderate hot flashes). However, sensitivity analysis demonstrated that after removing these studies of mild to moderate hot flashes, the results were basically consistent with those from the original data set (Supplementary Table S3, http://links.lww.com/MENO/A729). This indicated that the original results were robust.
Second, the duration of treatment for most included studies was 8 to 12 weeks. Although the model established in this study can predict the change in the frequency of hot flashes under long-term treatment, the reliability of results extrapolated beyond 12 weeks requires further verification through clinical trials. Third, most trials did not report the race composition of participants; therefore, the impact of race was not explored in this study. In addition, the acupoints used in the various trials were different, which may have also contributed to the inconsistent conclusions among the studies. However, due to the relatively small amount of literature included in the current analysis, it was not possible to analyze the combinations of different acupoints. Finally, only published studies written in English were included, which may have introduced publication bias and language bias.

CONCLUSIONS
In the current study, the efficacy of acupuncture in relieving menopausal hot flashes was quantitatively analyzed by establishing a pharmacodynamic model. We found that the efficacy of acupuncture (traditional acupuncture and electro-acupuncture) was better than that of sham acupuncture and was significantly better than that of placebo pills. We also found that the efficacy of electro-acupuncture was more stable than that of traditional acupuncture, and its efficacy was significantly better than that of sham acupuncture. Furthermore, the efficacy of electro-acupuncture was comparable to that of SSRIs/SNRIs and neuroleptic agents such as gabapentin and escitalopram.