We screened 254 titles and abstracts (Fig. 1) and identified six individual RCTs from six reports [18, 19, 22, 25-27] and four quasi-RCTs from five reports [17, 20, 21, 23, 24] for inclusion in the final review. We excluded trials involving massage and bathing because these have effects of their own. Eight of the included studies stated that blinding of participants was not possible because the oil molecules diffuse in the air [17, 19-22, 25-27]. For this reason, four trials described in five reports [17, 20, 21, 23, 24] delivered the interventions on randomly allocated days, designated as either aromatherapy days or placebo days. Although we classified these four trials as quasi-RCTs, we included them because this design follows from the nature of the intervention.
Table 1 provides details of the included studies, which involved 1,238 pregnant women at labor onset [17-27]. Eight trials included only nulliparous women (81.0%) [17, 20-27], one trial did not report parity (8.9%), and one trial reported a mean ± SD parity of 1.31 ± 0.72 in the intervention group and 1.22 ± 0.91 in the control group (9.6%). All of the trials recruited participants with singleton, full-term pregnancies, and none reported existing medical conditions. Most trials recruited predominantly adults (18-35 years old) with cephalic presentation and 3-4 cm cervical dilatation. Nine trials were undertaken in Iran [17-25, 27] and only one was conducted in Thailand. All of the trials were set in hospitals.
Labor pain severity was measured using the Visual Analogue Scale (VAS) or the Numerical Rating Scale (NRS) [17-21, 24, 26, 27]. Both scales have a score range of 0 to 10 [28, 29]. One trial reported pain score changes from baseline; therefore, we used the generic inverse variance method (GIVM) for the meta-analysis. In three studies, Spielberger’s State-Trait Anxiety Inventory (STAI) was used to determine the level of anxiety of the participants [18, 23, 25]. The STAI consists of 40 questions, with scores ranging from 20 to 80. Higher scores indicate greater anxiety. The STAI has a reported reliability (Cronbach’s alpha) of 0.90. One study used the Visual Analog Scale for Anxiety (VASA). This scale ranges from 0 to 10, with 0 indicating no anxiety and 10 the greatest anxiety.
Risk of bias for included studies
Of the 10 trials, most had a low risk of bias in random sequence generation (60%, 6/10), incomplete outcome data (90%, 9/10), selective reporting (60%, 6/10), and other bias (100%, 10/10). However, most had a high or unclear risk of bias in allocation concealment (60%, 6/10), blinding of participants and personnel (100%, 10/10), and blinding of outcome assessment (60%, 6/10). Eight studies mentioned that blinding of participants was not possible owing to the nature of aromatherapy [17, 19-22, 25-27]. Four trials from five reports [17, 20, 21, 23, 24] performed interventions on randomly allocated days; thus, we considered these four trials quasi-RCTs.
Table 1 presents details of the aromatherapy interventions administered in each trial. All trials evaluated inhalation of aroma essence in labor. Two studies had a three-arm design [21, 22]. One study used two kinds of aroma essence (jasmine and Salvia essence); we combined the two arms into one group and used the combined mean ± SD calculated across them. The other study compared inhalation of aroma essence during a footbath, footbath alone, and routine care. We took inhalation of aroma essence with footbath as the intervention group and footbath alone as the control group to exclude the effect of the footbath itself. Moreover, one study used inhalation of aroma essence with a breathing technique as the intervention and the breathing technique alone as the control.
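The merging of two intervention arms into a single group can be illustrated with a short calculation. The sketch below assumes the standard formula for combining two groups given in the Cochrane Handbook; the text above does not state which formula was used, and the function name `combine_groups` and the input values are hypothetical.

```python
import math

def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two arms into one group; returns (sample size, mean, SD)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Total sum of squares: within-arm variation plus the shift between arm means.
    ss = ((n1 - 1) * sd1 ** 2
          + (n2 - 1) * sd2 ** 2
          + (n1 * n2 / n) * (m1 - m2) ** 2)
    sd = math.sqrt(ss / (n - 1))
    return n, mean, sd

# Hypothetical arms (illustrative values only, not data from any included trial).
n, mean, sd = combine_groups(2, 2.0, math.sqrt(2), 2, 6.0, math.sqrt(2))
print(n, round(mean, 2), round(sd, 2))
```

The combined SD is larger than either arm's SD whenever the arm means differ, because the between-arm difference contributes to the overall spread.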
Various aroma essences were used in the included trials. Lavender was the most frequently used aroma oil (four trials) [19, 20, 26, 27] and is also commonly used in practice settings. The next most frequently used essences, in two trials each, were C. aurantium essence [23, 24, 26], geranium rose essence [25, 26], jasmine [21, 26], and R. damascena essence [18, 22]. Salvia essence and Boswellia carterii essence were each used in a single trial.
Labor pain relief
All of the studies that measured labor pain used the VAS or NRS [17-21, 24, 26, 27]. Because one trial reported interquartile ranges, we used its reported score changes. For this reason, we calculated the MD with GIVM for the analysis of labor pain. Across eight trials, aromatherapy significantly reduced labor pain intensity compared with control in the latent phase (MD -1.56, 95% CI -2.45 to -0.67, p = 0.0006, I2 = 97%, eight trials, 1,005 women, low certainty of evidence; Fig. 3). Across six trials, aromatherapy significantly reduced labor pain compared with control in the early active phase (MD -1.69, 95% CI -2.50 to -0.89, p < 0.0001, I2 = 96%, six trials, 689 women, low certainty of evidence; Fig. 4). The same six trials also showed that aromatherapy significantly reduced labor pain in the late active phase (MD -1.52, 95% CI -2.33 to -0.71, p = 0.0002, I2 = 97%, six trials, 689 women, very low certainty of evidence; Fig. 5).
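To make the pooling step concrete, the following is a minimal sketch of generic inverse variance pooling with the I2 statistic. It assumes a DerSimonian-Laird random-effects model, which this section does not specify; the function name `pool_random_effects` and the per-trial values are hypothetical and are not taken from the included trials.

```python
import math

def pool_random_effects(effects, ses, z=1.96):
    """Pool per-trial estimates (e.g. MDs) and their standard errors."""
    w = [1.0 / se ** 2 for se in ses]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # between-trial variance
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0    # heterogeneity in percent
    return {
        "pooled": pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical mean differences and standard errors from two trials.
result = pool_random_effects([-1.0, -3.0], [1.0, 1.0])
print(result)
```

Under this model, a high I2 widens the confidence interval because the between-trial variance is added to each trial's own variance before weighting.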
Three studies used the STAI [18, 23, 25] and one study used the VASA to measure anxiety. Because these instruments use different scales, we calculated the SMD for the analysis of anxiety. Aromatherapy reduced anxiety compared with control in the early active phase (SMD -3.49, 95% CI -6.28 to -0.69, p = 0.01, I2 = 99%, four studies, 392 women, very low certainty of evidence; Fig. 6). In three studies, aromatherapy also significantly reduced anxiety in the late active phase (SMD -5.54, 95% CI -10.39 to -0.69, p = 0.03, I2 = 99%, three studies, 295 women, very low certainty of evidence; Fig. 7).
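The reason an SMD allows the STAI and VASA results to be pooled can be shown with a small example. The sketch below assumes Hedges' adjusted g with a common large-sample variance approximation; the exact estimator used in the analysis is not stated here, and the function name `hedges_g` and all input values are hypothetical.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g) and its standard error."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled                  # Cohen's d
    j = 1 - 3 / (4 * df - 1)                   # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j * math.sqrt(var_d)

# Hypothetical trial on a 20-80 scale (STAI-like) ...
g_stai, se_stai = hedges_g(40.0, 10.0, 10, 50.0, 10.0, 10)
# ... and a hypothetical trial on a 0-10 scale (VASA-like).
g_vasa, se_vasa = hedges_g(4.0, 1.0, 10, 5.0, 1.0, 10)
print(round(g_stai, 3), round(g_vasa, 3))
```

Both hypothetical trials show a difference of one pooled standard deviation, so they yield the same SMD even though their raw scales differ by a factor of ten.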
Duration of contraction
We used SMD for the analysis of duration of contraction because the time unit of the included studies was unclear [17, 18, 23]. Three studies found that aromatherapy did not significantly affect the duration of contractions at 3-4 cm (SMD -0.49, 95% CI -1.41 to 0.43, p = 0.30, I2 = 94%, 347 women; Appendix. Fig. S1), 5-7 cm (SMD 2.94, 95% CI -0.38 to 6.26, p = 0.08, I2 = 99%, 347 women; Appendix. Fig. S2), or 8-10 cm dilatation (SMD 0.05, 95% CI -0.16 to 0.26, p = 0.67, I2 = 49%, 347 women; Appendix. Fig. S3).
We also used SMD for the analysis of labor length because the time unit was unclear [19-21, 26, 27]. Five studies showed that aromatherapy significantly reduced the length of the first stage of labor compared with the control (SMD -0.21, 95% CI -0.37 to -0.06, p = 0.008, I2 = 56%, five studies, 641 women; Appendix. Fig. S4). By contrast, aromatherapy did not significantly reduce the length of the second stage of labor (SMD 0.14, 95% CI -0.36 to 0.63, p = 0.59, I2 = 86%, four studies, 481 women; Appendix. Fig. S5).
Aromatherapy intervention did not significantly affect the Apgar score at 1 min after delivery (MD -0.25, 95% CI -0.84 to 0.35, p = 0.41, I2 = 98%, five studies, 652 women; Appendix. Fig. S6) [17, 18, 20, 21, 27]. These five studies also reported that aromatherapy was not significantly associated with the Apgar score at 5 min (MD 0.03, 95% CI -0.02 to 0.08, p = 0.25, I2 = 0%, five studies, 652 women; Appendix. Fig. S7).
We calculated RR for dichotomous results. Only one study reported Apgar score < 7 at 1 min; however, there was no significant association with aromatherapy (RR 0.51, 95% CI 0.05 to 5.45, p = 0.58, one study, 103 women; Appendix. Fig. S8). An Apgar score < 7 at 5 min was not reported.
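A wide confidence interval such as the one above is typical when an outcome has very few events. As a minimal sketch, the following computes a risk ratio and its 95% CI on the log scale, assuming the usual large-sample standard error; the function name `risk_ratio` and the event counts are hypothetical and are not the data of the included study.

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio with a confidence interval computed on the log scale.

    Assumes at least one event in each group (no zero-cell correction).
    """
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical counts: 1 event in 50 women (aromatherapy) vs 2 in 50 (control).
rr, lower, upper = risk_ratio(1, 50, 2, 50)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

With only three events in total, the interval spans both a large reduction and a large increase in risk, so no conclusion about direction can be drawn.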
Types of delivery
Three studies reported that aromatherapy intervention did not significantly increase spontaneous delivery [18, 21, 26] (RR 1.05, 95% CI 0.96 to 1.15, p = 0.27, I2 = 0%, three studies, 370 women; Appendix. Fig. S9). Aromatherapy did not significantly reduce operative delivery [18, 26] (RR 0.58, 95% CI 0.24 to 1.42, p = 0.24, I2 = 0%, two studies, 214 women; Appendix. Fig. S10) or cesarean sections [18, 21, 23, 26] (RR 0.83, 95% CI 0.47 to 1.49, p = 0.54, I2 = 0%, four studies, 483 women; Appendix. Fig. S11).
Only one study reported the rate of labor augmentation. Aromatherapy did not significantly reduce labor augmentation (RR 0.97, 95% CI 0.68 to 1.37, p = 0.84, one study, 104 women; Appendix. Fig. S12).
Due to the high heterogeneity, we performed a sensitivity analysis excluding the quasi-RCTs and the trials at high risk of bias in random sequence generation [17, 20, 21, 23, 24]. Labor pain relief remained significant during the latent and early active phases (Appendix. Fig. S13, S14). However, there were no significant differences in pain relief during the late active phase or in anxiety relief during any of the active phases (Appendix. Fig. S15-S17). Among the secondary outcomes, there was no significant difference in the length of the first stage of labor (Appendix. Fig. S18).