Title: The Effectiveness of Participant Blinding of Non-Penetrating Sham/Placebo Acupuncture in Clinical Trials: A Systematic Review with Meta-Analysis

Background: Acupuncture clinical trials are important for evaluating the efficacy of acupuncture. However, effective blinding is challenging to achieve because of the nature of acupuncture, and a standardised placebo control method has yet to be established. This study focuses on non-penetrating sham acupuncture because it eliminates the placebo effect and generates fewer physiological responses. The study aims to evaluate and compare the participant blinding effectiveness of non-penetrating sham acupuncture devices, and to analyse the factors which may influence participant blinding.
Methods: The study followed the PRISMA guidelines. An electronic search was conducted on PubMed, Ovid and CNKI up to 1st October 2020 to include English and Chinese randomised controlled trials which evaluated awareness of the type of acupuncture (real or sham) in any population who received acupuncture. Data screening, data extraction and quality assessment were done independently by two researchers, and discrepancies were resolved via discussion with a co-researcher. Data analysis was performed using RevMan 5.4.1.
Results: Thirty-four full-text articles were included in the systematic review and meta-analysis. The quality of the studies ranged from moderate to good. Generally, non-penetrating sham acupuncture devices were effective in blinding participants in clinical trials. The foam device demonstrated a better blinding effect, followed by the Streitberger, Park and Takakura devices. Sham needles with no skin contact could not blind the participants successfully. Naive, experienced, healthy and diseased participants could all be blinded using non-penetrating sham acupuncture devices, but naive and healthy participants could be blinded comparatively easily. Acupoints from different regions could achieve blinding; however, acupoints on the back could blind the participants more easily than those in other areas.
Conclusion: Non-penetrating sham acupuncture devices are valid placebo controls for acupuncture clinical trials. The foam device has a better blinding effect, followed by the Streitberger, Park and Takakura devices. Recruiting naive healthy participants and choosing acupoints from the back can achieve better blinding effects in the participants.

As a result, this study aims to assess and compare the blinding effectiveness of non-penetrating sham acupuncture devices, as well as to analyse the factors which may influence the blinding effectiveness.

Methods
This systematic review and meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement guidelines 13.

Search Strategy
An electronic search was carried out in health-related databases, namely PubMed, Ovid and CNKI, for relevant studies. The literature search was conducted using search terms such as sham acupuncture, placebo acupuncture, sham needle, placebo needle, Park needle, Park device, Streitberger needle, Streitberger device, Takakura needle and Takakura device. Similar search terms in Chinese were used to search CNKI. The search was limited to original articles published in English or Chinese up to 1st October 2020. Additional studies were identified manually from the reference lists of potentially eligible articles. The titles and abstracts of those studies were screened to determine their relevance.

Inclusion Criteria
Selection of primary studies for this review was based on the following pre-specified criteria (in PICOS format).
Type of participants (P): Any population who received acupuncture treatment regardless of age, sex, race, region, underlying disease and acupuncture experience
Intervention (I): Non-penetrating
Studies were excluded if they were animal studies, review articles, case reports, editorials, letters or comments. Studies published in languages other than English and Chinese were excluded. Studies which did not meet any of the inclusion criteria were also excluded.

Data Extraction
Data were extracted using a review spreadsheet containing information such as author, publication year, study design (blinding and sample size), participant details (age, gender, health status and acupuncture experience), acupuncture methods (the type of sham acupuncture device and needling location) and results. Corresponding authors were contacted if the data were unclear. The data extraction was conducted separately by two researchers. Any discrepancies were resolved via discussion with a co-researcher.

Quality Assessment
Quality and risk of bias of the eligible studies were assessed independently by two researchers at outcome level using the Cochrane risk of bias tool 14. Quality assessments included random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting and other items. Studies were graded as low, unclear or high risk of bias. Discrepancies were resolved via discussion among the researchers. Publication bias across the studies was presented as a funnel plot.

Data Analysis
Meta-analysis was performed using RevMan 5.4.1 15. The effectiveness of blinding in sham/placebo acupuncture compared to real acupuncture was estimated with the odds ratio (OR) and its 95% confidence interval. OR > 1 means blinding is more likely to occur in the intervention arm (sham group) than in the comparator arm (real group). Trials in which there were no events in both the intervention and comparator arms were excluded from the meta-analysis.
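As an illustration of the effect measure described above, the following minimal Python sketch computes an odds ratio and its 95% confidence interval for a single trial from a 2 × 2 table using the standard log (Woolf) method. It is not RevMan's own computation. The function name `odds_ratio_ci`, the coding of an "event" as a participant counted as blinded, the 0.5 continuity correction for zero cells and the example counts are all assumptions made for illustration.

```python
import math


def odds_ratio_ci(sham_events, sham_total, real_events, real_total, z=1.96):
    """Return (OR, lower, upper) for one trial, or None for a double-zero trial.

    An "event" is assumed to be a participant counted as blinded.
    """
    if sham_events == 0 and real_events == 0:
        # No events in both arms: excluded, as stated in the text
        return None
    a, b = sham_events, sham_total - sham_events
    c, d = real_events, real_total - real_events
    if 0 in (a, b, c, d):
        # Assumption: add 0.5 to every cell when any remaining cell is zero
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Made-up counts for illustration only (not data from the review)
result = odds_ratio_ci(sham_events=30, sham_total=40, real_events=10, real_total=40)
print(result)
```

An OR above 1 with a confidence interval that excludes 1 would correspond to blinding being more likely in the sham arm under this coding.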
Heterogeneity was assessed using the chi-squared (χ²) test and the inconsistency index (I²) statistic. A two-tailed P value of less than 0.05 was considered statistically significant. Subgroup analyses were performed on the factors which could influence the blinding effectiveness of participants.
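The heterogeneity statistics named above can be sketched as follows. This is a minimal illustration of Cochran's Q (the χ² heterogeneity statistic) and I² computed from study-level log odds ratios and their standard errors under inverse-variance weighting. The function name `heterogeneity` and the input values are hypothetical, and the weighting scheme may differ from the one applied in RevMan.

```python
def heterogeneity(log_effects, std_errors):
    """Return Cochran's Q, its degrees of freedom and I^2 (in %)."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I^2 is the share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log odds ratios and standard errors for two studies
q, df, i2 = heterogeneity([0.0, 2.0], [0.5, 0.5])
print(q, df, i2)
```

The P value for Q would be read from a χ² distribution with `df` degrees of freedom; that step is omitted here to keep the sketch free of third-party dependencies.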

Study Selection
A total of 1189 studies were identified from the databases and 18 studies from the reference lists of related studies (Fig. 1). After screening, 34 studies were included in the systematic review and meta-analysis. Twenty-eight studies were excluded with reasons during the screening of full-text articles (Table 1). 50,55,56. The number of participants recruited in these studies ranged from 10 to 321. The total number of participants involved was 2538. 71,72; whereas the rest of them were either healthy or not reported. Most studies reported the acupuncture experience of the subjects, for example, naive (never experienced acupuncture before), experienced (experienced acupuncture at least once before) or mixed.
A variety of non-penetrating sham acupuncture devices was used in the studies. They were classified and named according to the names of the authors (Park, Streitberger, Takakura and Kim) or the materials (foam and cocktail stick). The studies chose acupoints or non-acupoints located on the head, abdomen, back, upper limbs, lower limbs and ears. The coding and naming of acupoints were converted if they did not follow the Proposed Standard International Acupuncture Nomenclature 74 of the World Health Organization (WHO). Some studies selected acupoints based on syndrome differentiation in Chinese medicine 57,58 or tender muscle points 67 and hence did not use constant points throughout the study.
The number of participants who correctly guessed the type of acupuncture (real or sham) is listed in Table 2. Some studies were recorded in the form of the total number of responses from the participants due to cross-over study design 7

Blinding Effectiveness of Sham Acupuncture Devices
The OR of the overall blinding effectiveness of non-penetrating sham acupuncture devices (Fig. 5) was 5.11 [3.36, 7.76] (P < 0.00001), which indicated that blinding was more likely to occur in the sham group and that the result was statistically significant. However, the studies were not homogeneous and heterogeneity was high in the meta-analysis, including the subgroup analyses; hence, a random-effects model was selected. Under the random-effects model, heterogeneity remained high (I² = 88%) in the overall blinding effectiveness analysis, so subgroup analyses were performed to evaluate the factors which may influence the blinding effectiveness of sham acupuncture devices, for example, the acupuncture experience of the participants, the health status of the participants and the location of acupoints.
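To make the random-effects step concrete, the sketch below pools study-level log odds ratios with the DerSimonian-Laird estimator of the between-study variance (τ²). The choice of estimator is an assumption, since the text does not state which one was used, and the function name `random_effects_pool` and the input values are hypothetical.

```python
import math


def random_effects_pool(log_effects, std_errors, z=1.96):
    """Return (pooled OR, lower, upper, tau^2) under a random-effects model."""
    variances = [se ** 2 for se in std_errors]
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # DerSimonian-Laird moment estimate of tau^2, floored at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return (
        math.exp(pooled),
        math.exp(pooled - z * se),
        math.exp(pooled + z * se),
        tau2,
    )


# Hypothetical log odds ratios and standard errors for two studies
pooled_or, lower, upper, tau2 = random_effects_pool([0.0, 2.0], [0.5, 0.5])
print(pooled_or, lower, upper, tau2)
```

Because τ² is added to every study's variance, the weights become more even and the pooled confidence interval widens. Switching models therefore changes the pooled estimate and its interval but leaves I² unchanged, which is consistent with the heterogeneity reported above remaining high.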
A comparison between sham acupuncture devices was performed by evaluating and comparing the individual blinding effectiveness of each device. The Kim and cocktail stick devices were excluded because each was represented by only one study. As shown in Fig. 6 (Fig. 7). Both naive and experienced participants were successfully blinded, as both results were statistically significant (P < 0.00001). The heterogeneity was high within each group, but the two subgroups did not differ much from each other (I² = 37.7%).
As for the health status of participants, both healthy and diseased participants were successfully blinded with statistical significance (P < 0.00001) using sham acupuncture devices (Fig. 8). Healthy participants were blinded slightly better than diseased participants with an OR 6.28 [ (Fig. 9). The back and the upper limbs exhibited a statistically significant result (P < 0.0001), whereas the results for the abdomen and the lower limbs were not statistically significant (P = 0.09). However, heterogeneity was high within each individual subgroup but low across the subgroups (I² = 0%).
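The "across the subgroups" I² values quoted above come from a test for subgroup differences. A minimal sketch of that idea is given below: each subgroup's pooled OR and 95% confidence interval are converted back to a log OR and standard error, and Q and I² are then computed between the subgroup estimates. The function names and the subgroup values are hypothetical, and the exact procedure in RevMan may differ.

```python
import math


def log_or_and_se(or_value, ci_lower, ci_upper, z=1.96):
    """Recover the log OR and its standard error from a reported 95% CI."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return math.log(or_value), se


def subgroup_difference(subgroups):
    """Return Q, df and I^2 (%) between subgroups given (OR, lower, upper) tuples."""
    pairs = [log_or_and_se(*s) for s in subgroups]
    weights = [1 / se ** 2 for _, se in pairs]
    mean = sum(w * y for w, (y, _) in zip(weights, pairs)) / sum(weights)
    q = sum(w * (y - mean) ** 2 for w, (y, _) in zip(weights, pairs))
    df = len(pairs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Overall estimate reported in the text: OR 5.11 [3.36, 7.76]
overall_log_or, overall_se = log_or_and_se(5.11, 3.36, 7.76)

# Hypothetical subgroup summaries (OR, lower, upper); not values from the review
q_between, df_between, i2_between = subgroup_difference(
    [(6.0, 3.0, 12.0), (4.0, 2.0, 8.0)]
)
print(overall_se, q_between, df_between, i2_between)
```

When Q between subgroups is smaller than its degrees of freedom, I² is reported as 0%, meaning the subgroup estimates differ no more than chance would suggest even if heterogeneity within each subgroup is high.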

Discussion
Blinding Effectiveness of Sham Acupuncture Devices

Overall Blinding Effectiveness of Sham Acupuncture Devices
When conducting acupuncture clinical trials, subjects in both the real acupuncture group and the sham acupuncture group should believe that they are receiving real acupuncture, so that the trial mimics the actual scenario of acupuncture treatment. As shown in Fig. 5, the existing non-penetrating sham acupuncture devices such as Park, Streitberger, Takakura, foam and cocktail stick displayed a result which favoured the sham group and was also statistically significant.
In other words, more participants in the sham acupuncture group than in the real group failed to identify their treatment correctly and hence were blinded. As a result, non-penetrating sham acupuncture devices can act as an effective placebo control method in acupuncture clinical trials, especially in replacing less effective types of control such as no treatment, standard/conventional treatment or skin-penetrating sham acupuncture.

Comparison between Sham Acupuncture Devices
Among all the non-penetrating sham acupuncture devices included in the data analysis, the foam device demonstrated the best ability to achieve blinding of participants. Five studies 10,47,50,51,56 used the foam device in their control group. The foam device is usually self-prepared by the researchers, so there are some variations in design. It is made of foam of a certain thickness with double-sided adhesive tape at the bottom (Fig. 10). The foam pad acts as a supportive material to hold the needle in place even in the placebo group, whereas the adhesive tape sticks the device to the skin.
The real foam device uses a real acupuncture needle with a sharp tip which can penetrate the skin of participants; the placebo foam device uses a shorter, blunted acupuncture needle which cannot penetrate the skin. The appearance after needle insertion remains the same in both cases and hence achieves visual blinding of participants. After needle insertion, the real device penetrates the skin to a certain depth, whereas the placebo device only touches the skin, blinding the participants by mimicking the feeling of penetration. In the studies of Fink 50 and Goddard 51, the placebo needles were gently twisted to enhance the feeling of penetration. Besides having a good blinding effect, the foam device is also less expensive and more easily accessible than the other devices, so it can be a good option for the control group of acupuncture clinical trials. However, because the foam device is usually self-made, hygiene during its preparation is the main concern. All the equipment must be sterilised adequately before being applied to the participants.
The other sham acupuncture devices such as Streitberger, Park and Takakura (Fig. 11) are also effective in blinding the participants. Their sham devices look identical to the real devices. Similar to the foam device, their sham devices have shorter, blunt-tipped needles which touch the skin to mimic the sensation of penetration. Hence, all of them can be applied in clinical trials. The characteristics of the foam, Streitberger, Park and Takakura devices are summarised in Table 3. On the other hand, the no-touch Takakura sham device was not statistically significantly superior to the real device (P = 0.19), so it is not recommended for use in clinical trials. The no-touch sham Takakura device has no contact with the skin of participants. It may be useful in achieving visual blinding but not tactile blinding. In short, each sham acupuncture design has its own advantages and limitations. Researchers should take these into consideration when designing the experimental and control methods. Due to the high heterogeneity of the studies across the subgroups (Fig. 6), the OR value may be influenced by other factors and hence the results on the blinding effectiveness of sham devices can only serve as a reference.

Limitations of the Existing Sham Acupuncture Devices
The limitations of the existing sham acupuncture devices can be discussed in several aspects, including the suitability in performing electro-acupuncture, needling location, needling angle and blinding effect in acupuncturists.
The foam, Streitberger, Park and Takakura sham devices rely solely on adhesive double-sided tape or a pedestal to attach to the skin. The attachment is not as firm as in real acupuncture; therefore, sham electro-acupuncture is difficult to perform using these devices. Moreover, the adhesive tape may not be suitable for hairy skin or areas which are not flat. As a result, using sham acupuncture devices on the scalp, hairy regions and skin with a great curvature such as the ears, fingers and toes can be challenging.
Apart from that, the needling angle is usually limited to perpendicular for the foam, Park and Takakura devices due to the presence of a guide tube, so only acupoints that allow perpendicular insertion can be selected when using these devices. Acupoints such as LU7 LieQue and EX-HN3 YinTang, or acupoints on the scalp which require oblique or transverse insertion, cannot be chosen in the trials. In contrast, the Streitberger device, which does not have a guide tube, can perform perpendicular and oblique insertions. Yet having no guide tube can also make the needle unstable, especially in the sham group, increasing the risk of revealing the group allocation.
Last but not least, most of the sham acupuncture devices did not demonstrate the ability to blind acupuncturists, except for the Takakura device. The Takakura device has a lower stuffing added within its guide tube to mimic the feeling of skin penetration when the acupuncturist pushes the sham needle into the lower stuffing. In other words, performance bias will be high in acupuncture clinical trials that use other types of sham acupuncture devices.

Guidelines of Sham Acupuncture Controlled Clinical Trials
In 1995, the Guidelines for Clinical Research on Acupuncture 4 by the WHO stated that placebo acupuncture should fulfil two criteria: it must be a less effective form of acupuncture and it must mimic acupuncture in a credible manner. Zhang's paper 76 also stated that placebo acupuncture should have no or minimal specific treatment effects on the tested disease, and that the treatment and control groups should be identical to achieve blinding. To evaluate the true efficacy of acupuncture, the difference in specific effects should be maximised while the difference in non-specific effects should be minimised (Fig. 12).
Determining a research question is important before selecting the type of control because each control method answers a different type of research question. As shown in Table 4, no treatment and standard treatment as controls can rule out regression to the mean and study the general effectiveness of acupuncture; non-penetrating sham as a control can rule out regression to the mean and psychological responses (placebo effect) and study the efficacy of acupuncture, including skin-penetrating physiological effects and acupoint-specific effects; lastly, penetrating sham as a control can rule out three other aspects and study specifically the efficacy of acupoint-specific effects 77,78. As far as the authors know, non-penetrating sham acupuncture is the only method that can eliminate the placebo effect while minimising the physiological responses generated. It can therefore be used in a broad range of acupuncture clinical trials that study the specific effects of acupuncture. On the other hand, penetrating sham acupuncture (e.g. shallow needling/minimal acupuncture and needling on non-acupoints) is suitable for studying narrower specific effects of acupuncture that are not generated by skin penetration. Note that the acupuncturists performing penetrating sham acupuncture are not blinded most of the time, owing to the different needling techniques and locations in the sham group, which may lead to performance bias of personnel. In addition, it has to be ensured that skin penetration does not trigger any of the specific effects under study; otherwise, there will be no significant difference between the two arms. Note: the boxes with "X" in Table 4 indicate the areas that are eliminated when comparing with the real acupuncture group.

Factors which May Influence the Blinding Effectiveness of Sham Acupuncture Devices
Besides the selection of sham acupuncture method and device, other factors may also play a role in achieving successful blinding, for example, the acupuncture experience and health status of the participants and the needling location. As shown in Fig. 7, naive participants are more easily blinded than experienced participants. Experienced participants are more familiar with the acupuncture process and the DeQi (needling) sensation and hence are more likely to guess the grouping accurately. However, both groups showed a significant difference in blinding effectiveness. Ideally, naive subjects should be recruited in acupuncture clinical trials, but experienced participants can also be considered if naive ones are not available or sufficient.
Next, both healthy and diseased participants demonstrated a significant difference in blinding effectiveness, with the healthy participants slightly superior to the diseased group (Fig. 8). Healthy subjects should be prioritised when designing acupuncture clinical trials; nonetheless, recruiting diseased participants is unavoidable when studying the efficacy of acupuncture for a particular disease.
Lastly, the locations of acupoints may also influence the effectiveness of blinding. As shown in Fig. 9

Limitations of the Study
Four studies were excluded due to a language barrier and seven studies were excluded because the full-text articles were inaccessible. Excluding articles in languages other than English and Chinese may introduce language bias. Excluding inaccessible articles may also reduce the precision of the combined estimates of blinding effectiveness. However, the authors were not able to overcome this due to limited resources. The included studies presented data in different ways, so some data had to be converted before performing the meta-analysis. Also, some studies presented their results in terms of the number of participants, whereas others were based on the total number of responses from the participants.
The variety of study designs also led to high heterogeneity in the results of the meta-analysis. Other potential influencing factors which might contribute to the heterogeneity, such as the diameter of the acupuncture needle, the depth of needle insertion, the duration of needle retention, needle manipulation techniques and the number of treatments, were not analysed in the study. For instance, participants may become aware of the grouping by observing the procedure and treatment effects after multiple or prolonged treatments.
The number of articles in some of the subgroups was small. For example, the Kim device and the cocktail stick device in the analysis of the types of sham acupuncture devices had only one study each; in addition, the head and neck, abdomen and lower limbs in the analysis of the locations of acupoints had a small number of studies.

Conclusions
Non-penetrating sham acupuncture devices are valid placebo controls for acupuncture clinical trials. They should be applied in future clinical trials because they can blind the participants while producing fewer physiological responses. The foam device has a better blinding effect, followed by the Streitberger, Park and Takakura devices. Sham needles with no skin contact could not successfully blind the participants. Naive, experienced, healthy and diseased participants can all be recruited in acupuncture clinical trials, but naive and healthy participants can be blinded more easily. Choosing acupoints from the back can blind participants more easily than choosing acupoints from other areas.
Future clinical trials can study and compare the blinding effectiveness of sham acupuncture devices by including multiple sham groups with a larger sample size to evaluate all the different types of sham devices under the same setting. This can eliminate heterogeneity and show the real blinding efficacy of these devices.
Besides, researchers can also try to develop a better type of non-penetrating sham acupuncture device, taking into consideration the blinding efficacy, strengths and limitations mentioned in the results and discussion.