The protocol for this review was registered on PROSPERO (CRD42019146380) and is available on the PROSPERO website (https://www.crd.york.ac.uk/prospero/). This protocol is reported in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P) statement (Additional file 1). The review will be conducted in accordance with the PRISMA guidelines and will follow the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions [31, 32]. If we refine the procedures described in this protocol, we will update the PROSPERO record and disclose the changes in the final report.
Type of studies All RCTs applying WNA independently or as an adjunct to other active therapies targeting overweight and obesity will be included. Both completed and ongoing trials will be included. Owing to the language proficiency of our researchers, we will restrict the search to literature published in Chinese and English. If a study was designed as a cross-over trial, only the results of the first phase will be analyzed in order to eliminate carry-over effects.
Type of participant Patients older than 18 years who are diagnosed as overweight or obese according to the criteria recommended by the World Health Organization (WHO) will be included. All eligible study participants will be included in this review regardless of their ethnic background or gender.
Type of interventions The experimental group should be treated with WNA, that is, ignited moxa is placed on the handle of the needle after insertion, with the acupoints selected according to TCM nomenclature. WNA combined with other treatments, such as moxibustion, massage, cupping, drugs or Chinese herbs, will also be included. No restrictions will be placed on the type of moxa used or the duration of treatment.
Type of comparators The control group will include patients treated with control interventions, such as no treatment, anti-obesity drugs, lifestyle modification, simple acupuncture (e.g., manual acupuncture, electroacupuncture, acupoint catgut embedding and auriculotherapy), simple moxibustion, cupping therapy, Chinese herbal medicine and any other active therapies. In addition, studies comparing WNA plus one or more therapies with the same therapies alone will also be included.
Type of outcome measurements According to the 2013 American College of Cardiology guidelines, the main purpose of obesity management is to reduce body weight to a reasonable level. BMI is a simple index of weight-for-height that is commonly used to classify overweight and obesity in adults. It is defined as a person's weight in kilograms divided by the square of their height in meters (kg/m²). Therefore, our principal outcome will be the difference in BMI from baseline to the end of the study.
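The BMI definition and the primary outcome can be illustrated with a minimal Python sketch. The function names are hypothetical, and the cut-offs in `who_category` are the commonly cited WHO adult thresholds (18.5, 25 and 30 kg/m²), which are an assumption here because this protocol does not list the exact values it will apply.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Classify an adult BMI (kg/m^2) with assumed WHO cut-offs of 18.5, 25 and 30."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal weight"
    if bmi_value < 30.0:
        return "overweight"
    return "obese"


def bmi_change(baseline_bmi: float, end_bmi: float) -> float:
    """Primary outcome: BMI at the end of the study minus BMI at baseline."""
    return end_bmi - baseline_bmi


# Illustrative values only, not data from any trial.
baseline = bmi(88.0, 1.70)
end_of_study = bmi(82.0, 1.70)
print(who_category(baseline), round(bmi_change(baseline, end_of_study), 2))
```

A negative value of `bmi_change` indicates that BMI fell over the course of the study.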
The secondary outcomes include the changes in weight, percentage of body fat, waist circumference (WC), hip circumference (HC), waist-to-hip ratio (WHR) and serum lipids (such as cholesterol, triglyceride, high-density lipoprotein cholesterol and low-density lipoprotein cholesterol) before and after treatment. The incidence and severity of adverse events (e.g., hematomas, dizziness or local pain) will also be measured as secondary outcomes for safety assessment.
The exclusion criteria are as follows: (1) trials including participants for whom WNA is not appropriate, such as pregnant or lactating women, and those diagnosed with secondary obesity (e.g., caused by drugs such as glucocorticoids or by pituitary disease) or with additional severe diseases; (2) quasi-randomized trials and case reports; (3) studies that report only the effective rate and do not provide BMI data from baseline to the end of the study; (4) studies involving WNA but without a control arm.
Databases and search strategy
We will search, electronically and manually, for all RCTs of warming acupuncture for weight management, regardless of publication status. The electronic databases include CENTRAL, MEDLINE (via PubMed), EMBASE, CINAHL, AMED, Alt Healthwatch, CBM, Wanfang Data, the Chinese Science and Technology Periodical Database (VIP) and the China National Knowledge Infrastructure (CNKI). The key search terms will be developed from Medical Subject Headings (MeSH) and free-text terms, such as (obesity OR overweight) AND (warm-needling acupuncture OR warming acupuncture OR warming needle) AND randomized. The search strategy will be adapted to the requirements of each database, and the full search strategy for PubMed is shown in Additional file 2. Any relevant ongoing or unpublished clinical studies will be sought from the International Clinical Trials Registry Platform (ICTRP), the NIH clinical registry ClinicalTrials.gov, and the Chinese clinical registry. The reference lists of selected studies and published systematic reviews will be screened for additional studies. We will also manually search the grey literature, including conference proceedings (Additional file 3). Finally, we will consult experts in the field to obtain possible studies and the most up-to-date clinical data that are not available through the searches described above.
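The Boolean structure of the example search terms can be sketched as follows. This is only an illustration of how the three concepts (condition, intervention, study design) combine; the helper names are hypothetical, and the full PubMed strategy in Additional file 2 remains the authoritative version. Each concept is wrapped in parentheses because the order in which a database evaluates unparenthesized AND and OR operators may differ.

```python
def or_group(terms):
    """Join synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*concepts):
    """Combine concept groups with AND so every concept must be present."""
    return " AND ".join(or_group(concept) for concept in concepts)


# Free-text terms quoted from the protocol; MeSH headings are omitted here.
CONDITION = ["obesity", "overweight"]
INTERVENTION = ["warm-needling acupuncture", "warming acupuncture", "warming needle"]
DESIGN = ["randomized"]

query = build_query(CONDITION, INTERVENTION, DESIGN)
print(query)
```

The resulting string would still need database-specific field tags and controlled vocabulary before use.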
In the study selection process, search results will be imported from the original databases into EndNote X9, and duplicate records will be removed by the software. Two trained reviewers (JY and LC) will independently screen all retrieved records by reading the title, abstract and keywords to determine which studies meet the inclusion criteria. Where a study is potentially eligible, the full text will be obtained and checked independently by the same two reviewers (JY and LC). Studies excluded after full-text reading will be recorded together with the reasons for exclusion. Any disagreement will be resolved by consensus, and unresolved disagreements will be arbitrated by a third reviewer (LZ).
A pilot extraction will be done before the review is conducted to achieve consistency (at least 80%) between the reviewers collecting data. The following data will be extracted from the eligible studies by two reviewers (JY and LC) independently using a self-designed data extraction form, which includes the following items: (1) details of the study (publication year, country, journal, study design); (2) patient demographics (sample size, age, sex, height, weight, BMI); (3) intervention (duration, frequency, types of warm-needling acupuncture, types of comparators); (4) weight-related outcomes (the difference in BMI, the change in weight, percentage of body fat, WC, HC, WHR and serum lipids); (5) main conclusion, adverse reactions and a list of the Standards for Reporting Interventions in Controlled Trials of Acupuncture (STRICTA). Any discrepancy noticed in the process of data extraction will be resolved through discussion and, if necessary, the advice of a third reviewer (LZ). For publications with insufficient or ambiguous data, we will attempt to obtain information from the corresponding authors by e-mail or telephone.
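One way the 80% consistency target for the pilot extraction could be checked is simple percent agreement between the two reviewers' forms. The protocol does not state which agreement measure will be used, so the measure, the function name and the field names below are all assumptions made for illustration.

```python
def percent_agreement(form_a: dict, form_b: dict) -> float:
    """Share of extraction fields on which two reviewers recorded the same value.

    A field present in only one of the two forms counts as a disagreement.
    """
    fields = form_a.keys() | form_b.keys()
    if not fields:
        raise ValueError("both extraction forms are empty")
    matches = sum(
        1
        for field in fields
        if field in form_a and field in form_b and form_a[field] == form_b[field]
    )
    return matches / len(fields)


# Hypothetical pilot forms for a single study (illustrative values only).
reviewer_1 = {"year": 2018, "sample_size": 60, "design": "RCT",
              "duration_weeks": 8, "comparator": "manual acupuncture"}
reviewer_2 = {"year": 2018, "sample_size": 60, "design": "RCT",
              "duration_weeks": 12, "comparator": "manual acupuncture"}

agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"{agreement:.0%}", "meets target" if agreement >= 0.80 else "below target")
```

Fields on which the reviewers disagree (here the treatment duration) would then be resolved as described above.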
Assessment of risk of bias
Two independent reviewers (HZ and JY) will use the Cochrane Collaboration's risk of bias tool to assess the risk of bias in all included studies. The assessment covers potential selection bias (random sequence generation and allocation concealment), performance bias (blinding of investigators and participants), detection bias (blinding of outcome assessors), attrition bias (incomplete outcome data), reporting bias (selective reporting) and other possible sources of bias (such as funding bias). Our systematic review uses L, U, and H as the key to these assessments, where L (low) indicates a low risk, U (unclear) indicates an uncertain risk, and H (high) indicates a high risk. If the two reviewers reach inconsistent results, the final decision will be made by a third reviewer (YL). In the process of data synthesis, studies with unclear or high risk of bias will be given less weight.
Data analysis and quantitative data synthesis will be performed using RevMan software (V.5.3.5) and Stata software (V.14.0). For continuous data, we will use the standardized mean difference (SMD) with its 95% confidence interval (CI) to measure the therapeutic effect, whereas dichotomous data will be presented as the relative risk (RR) with its 95% CI. If the standard deviation of the change from baseline to the end of the study is not reported but the pre- and post-treatment measurements are, the following formula will be used to calculate it:
$$SD_{\text{change}} = \sqrt{SD_{\text{baseline}}^{2} + SD_{\text{final}}^{2} - 2 \times r \times SD_{\text{baseline}} \times SD_{\text{final}}}$$
where SD_baseline and SD_final are the standard deviations of the pre- and post-treatment measurements, and the correlation coefficient (r) is assumed to be 0.4. Statistical heterogeneity between studies will be assessed using the I² statistic. Heterogeneity will not be considered substantial if I² is less than 50%, in which case a fixed-effects model will be used for data synthesis. Otherwise, a random-effects model will be used. When statistical heterogeneity is identified, we will search for possible causes from the clinical and methodological perspectives, and then conduct a subgroup analysis or descriptive analysis to explore them.
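The imputation and model-selection rules described here can be sketched in Python. The sketch assumes the standard Cochrane Handbook formula for the standard deviation of a change score, SD_change = sqrt(SD_baseline² + SD_final² − 2 × r × SD_baseline × SD_final), and computes I² from Cochran's Q under inverse-variance weights. The function names are hypothetical, and RevMan and Stata will carry out the actual analyses.

```python
import math


def sd_change(sd_baseline: float, sd_final: float, r: float = 0.4) -> float:
    """Impute the SD of the change from baseline, with correlation r (0.4 in this protocol)."""
    return math.sqrt(sd_baseline ** 2 + sd_final ** 2 - 2 * r * sd_baseline * sd_final)


def i_squared(effects, variances) -> float:
    """I^2 (in percent) from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def choose_model(i2_percent: float, threshold: float = 50.0) -> str:
    """Fixed-effects model when I^2 is below the threshold, random-effects otherwise."""
    return "fixed-effects" if i2_percent < threshold else "random-effects"


# Illustrative inputs only, not data from any included trial.
print(round(sd_change(3.0, 3.0), 3))
i2 = i_squared([0.0, 1.0], [0.01, 0.01])
print(round(i2, 1), choose_model(i2))
```

Because the value of r is an assumption, repeating the imputation with other plausible values of r would show how sensitive the pooled estimate is to that choice.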
In the present study, heterogeneity is expected to be significant with respect to the WNA types, subjects, treatment courses, etc. Therefore, subgroup analyses will be conducted according to the form of WNA, the initial BMI of patients, the treatment course or frequency, the comparator, the outcome, and so on.
Multiple sensitivity analyses of the primary outcome will be performed to assess the robustness of the findings and to detect whether any particular study accounts for a large proportion of the heterogeneity. These analyses will be based on different statistical models, sample sizes and levels of methodological quality. The meta-analysis will then be repeated with studies of lower quality excluded.
We will use funnel plots to visually inspect reporting bias and small-study effects, provided that a sufficient number of studies (more than 10 trials) are included. If the funnel plot is found to be asymmetrical, we will analyze the causes using the Harbord test. This test is a modification of Egger’s regression that has higher power for dichotomous outcome data with low heterogeneity, and it is more accurate when combined with visual inspection of the funnel plot. All eligible trials, regardless of their methodological quality, will be included.
Grading the quality of evidence
Two reviewers (HZ and LC) will independently appraise the strength of the body of evidence using the GRADE system. In this process, direct evidence from RCTs starts at high quality and can be downgraded to moderate, low or very low quality on the basis of risk of bias, indirectness, imprecision, inconsistency (or heterogeneity), and/or publication bias. A summary of findings table will be generated with GRADEprofiler software (V.3.6.1) and disclosed in the final publication. Any discrepancy will be arbitrated by a third reviewer (YL).
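The downgrading logic can be made concrete with a minimal sketch. It assumes that each of the five domains contributes zero, one or two levels of downgrading and that the total is subtracted from a starting rating of high. The function and constant names are hypothetical, and in practice GRADE ratings rest on reviewer judgement recorded in GRADEprofiler, not on a mechanical sum.

```python
LEVELS = ["high", "moderate", "low", "very low"]
DOMAINS = {"risk of bias", "indirectness", "imprecision",
           "inconsistency", "publication bias"}


def grade_quality(downgrades: dict) -> str:
    """Rate evidence from RCTs: start at 'high' and move down one level per downgrade.

    `downgrades` maps a domain name to 0, 1 or 2 levels; omitted domains count as 0.
    """
    unknown = set(downgrades) - DOMAINS
    if unknown:
        raise ValueError(f"unknown GRADE domain(s): {sorted(unknown)}")
    if any(levels not in (0, 1, 2) for levels in downgrades.values()):
        raise ValueError("each domain may be downgraded by 0, 1 or 2 levels")
    total = sum(downgrades.values())
    # The rating cannot fall below 'very low', however many downgrades apply.
    return LEVELS[min(total, len(LEVELS) - 1)]


# Hypothetical judgement for one outcome.
print(grade_quality({"risk of bias": 1, "inconsistency": 1}))
```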