Finnish version of the Childbirth Experience Questionnaire (CEQ-FI): validity and reliability assessment

Background The Childbirth Experience Questionnaire (CEQ) was developed in Sweden to assess the childbirth experience in multiple dimensions. It has been translated and validated in English, Spanish, Chinese and Persian with the aim of evaluating childbirth experience reliably. This study aimed to validate the Finnish version of the questionnaire, CEQ-FI. Methods Primiparous women who had given birth in Tampere University Hospital between January and May 2019 were included in the study. Women planning a cesarean delivery, women delivering preterm, and women whose children were transferred to neonatal care were excluded. The 450 eligible women were mailed the questionnaire one month postpartum, and those who completed the questionnaire were mailed it again six weeks postpartum. Test-retest reliability was evaluated by computing kappa coefficients for the two responses. Background data were collected from patient records and used to perform known-groups validation. Internal consistency was assessed by calculating Cronbach's alpha.


Introduction
Childbirth is one of the most important and memorable experiences in a woman's life. The experience has been shown to affect the mother's view on childbirth for more than a decade (1). A positive experience may empower women, whereas a negative experience may even lower self-esteem (2). Numerous instruments have been developed to measure maternal childbirth experience (3). Some focus on subgroups with risk factors for a negative experience, such as women suffering from fear of childbirth or women who have undergone cesarean delivery. Many other instruments are designed to measure the experience mainly in terms of satisfaction with the management of the delivery. Such instruments neglect important emotional and psychological dimensions of the childbirth experience (4,5). To our knowledge, no validated instrument for measuring childbirth experience in multiple dimensions exists in Finnish. As a myriad of tools for this purpose have already been developed, it has been recommended that research focus on further developing and assessing existing instruments instead of creating new ones (3).
The Childbirth Experience Questionnaire (CEQ) was developed in Sweden in 2010 (6). It has been shown to measure childbirth experience in multiple dimensions, unlike many previous instruments, which focused solely on one dimension, such as labor pain or quality of care. The original Swedish questionnaire was validated in a cohort of 920 primiparous women (6). The need for a simple, standardized method to assess childbirth experience has been recognized in many countries, and the CEQ has since been translated into and validated in English, Spanish, Chinese and Persian (4,7-9). The questionnaire has been shown to be a reliable tool for assessing childbirth experience (3). The aim of this study was to validate the Finnish translation of the CEQ in a population of Finnish-speaking women giving birth to their first child.

Methods
The CEQ was translated in a collaboration between Finland (ET, OP, JU) and Sweden (AD) in 2012, primarily for use in a study assessing maternal experiences of breech delivery (10). First, two professional translators independently translated the questionnaire from Swedish into Finnish, and then integrated the translations into one Finnish version. This version was back-translated into Swedish in order to ensure that the original content was preserved in the final Finnish version, CEQ-FI.
For this study, primiparous women who gave birth in Tampere University Hospital between January and May 2019 were identified from the delivery ward's logbook. Women who had delivered by a planned cesarean section were excluded, as the CEQ has been designed to also assess the experience of labor. To control for confounding factors, women with preterm (gestational age less than 37 weeks) or twin deliveries were also excluded, as were women whose newborns were transferred to the pediatric ward. However, women undergoing operative vaginal deliveries or intrapartum cesarean deliveries were included. A total of 450 women were mailed the invitation to participate and CEQ-FI one month after delivery. Women who agreed to participate and returned the completed questionnaire were mailed CEQ-FI again and asked to fill it out two weeks after the first one. No reminders were sent, and not responding to the first questionnaire was interpreted as not being interested in participating in the study.
The sample size was chosen based on previous validation studies. In these studies, the response rate has been 59-78% and the preferred number of responses has been 220 (4,7). However, no clear consensus exists on the adequate sample size in validation studies of quality of life assessment tools (11).
The Childbirth Experience Questionnaire consists of 22 items, 19 of which are assessed on a four-point Likert scale and three using a visual analogue scale (VAS). Each completed item is scored from one to four points, with higher scores indicating a more favorable experience. The items approach childbirth experience in four domains: own capacity, professional support, perceived safety and participation. Domain-specific scores are calculated in addition to the total score of the completed questionnaire.
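The scoring described above can be illustrated with a minimal Python sketch. It assumes that every item has already been converted to the one-to-four scale (the conversion of VAS items is not shown), and it computes each score as the mean of the completed items. The function names and the item-to-domain mapping in the example are hypothetical and are not the actual CEQ key.

```python
def mean_of_completed(scores):
    """Mean of the answered items; None marks a missing item."""
    completed = [s for s in scores if s is not None]
    if not completed:
        return None
    return sum(completed) / len(completed)


def score_questionnaire(responses, domains):
    """responses: {item number: score 1-4 or None}.
    domains: {domain name: list of item numbers} (hypothetical mapping).
    Returns the domain scores plus a total over all listed items."""
    result = {
        name: mean_of_completed([responses.get(item) for item in items])
        for name, items in domains.items()
    }
    all_items = [item for items in domains.values() for item in items]
    result["total"] = mean_of_completed([responses.get(i) for i in all_items])
    return result


# Toy example with two made-up domains and one missing item.
example_domains = {"A": [1, 2], "B": [3, 4]}
example_responses = {1: 4, 2: 2, 3: None, 4: 3}
example_scores = score_questionnaire(example_responses, example_domains)
```

Averaging over completed items keeps a respondent with a single missing answer on the same one-to-four scale as a respondent with none.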
Internal consistency of the questionnaire was assessed by calculating Cronbach's alpha for each of the four domains and for the total score. Generally, a Cronbach's alpha > 0.70 is considered satisfactory.
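For readers unfamiliar with the statistic, the following Python sketch shows the standard formula for Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). It is an illustration only, not the SPSS procedure used in this study, and it assumes complete responses with no missing items.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent, no missing values."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Two items that rise and fall together give the maximum value.
alpha_consistent = cronbach_alpha([[1, 2], [2, 3], [3, 4]])
# Two items that vary independently of each other give zero.
alpha_unrelated = cronbach_alpha([[1, 1], [2, 1], [1, 2], [2, 2]])
```

Because the formula depends on the number of items k, a domain with few items tends to yield a lower alpha at the same level of inter-item correlation.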
Construct validity of CEQ-FI was assessed using the known-groups validation method. Based on previous studies, prolonged labor, oxytocin augmentation, and operative delivery all predispose women to a less favorable delivery experience (12,13). As in the original Swedish study and the validation studies of the English and Spanish versions (4,6,7), scores of women with prolonged labor (more than 12 hours), women who had their labor augmented with oxytocin, and women who had an emergency cesarean or vacuum delivery were compared with scores of women without these risk factors for a negative experience. The Mann-Whitney U test was used to compare the scores between groups. Effect sizes were calculated as the difference between group mean scores divided by the pooled standard deviation of the two groups. An effect size of 0.2-0.5 was interpreted as small, 0.5-0.8 as moderate and more than 0.8 as large (11).
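The effect size calculation can be sketched in Python as follows. The text does not state which pooling formula was used, so the sketch assumes the common degrees-of-freedom weighted pooled standard deviation. The label for values under 0.2 and the handling of the shared boundary at 0.5 are likewise assumptions, and the group scores in the example are invented.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled SD weighted by degrees of freedom (assumed formula)."""
    n1, n2 = len(a), len(b)
    return sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))


def effect_size(a, b):
    """Difference between group means divided by the pooled SD."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def interpret(d):
    """Bands from the text: 0.2-0.5 small, 0.5-0.8 moderate, above 0.8 large."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size >= 0.5:  # assumption: 0.5 itself counts as moderate
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below small"  # assumption: the text gives no label under 0.2


# Invented scores for women without and with a risk factor.
without_risk = [3.0, 3.5, 4.0]
with_risk = [2.0, 2.5, 3.0]
d = effect_size(without_risk, with_risk)
```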
Test-retest reliability was assessed using responses from women who had completed the questionnaire twice. A weighted kappa coefficient was calculated for each item as well as for the four domains and the total score to determine the proportion of agreement between the first and second responses.
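As an illustration of the statistic, the Python sketch below computes a weighted kappa for one item answered twice. The weighting scheme is not specified in the text, so linear disagreement weights are assumed here; this is not the Stata routine used in the study. The labelling helper follows the Landis and Koch bands and rounds to two decimals before classifying.

```python
def weighted_kappa(first, second, categories=(1, 2, 3, 4)):
    """Linear-weighted kappa for paired ratings (weighting is an assumption)."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(first)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_obs = weighted_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # disagreement weight, 0 on the diagonal
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row_totals[i] * col_totals[j] / n
    if weighted_exp == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - weighted_obs / weighted_exp


def landis_koch_label(kappa):
    """Agreement bands as described by Landis and Koch."""
    value = round(kappa, 2)
    if value < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if value <= upper:
            return label
    return "almost perfect"


# Identical answers on both occasions give perfect agreement.
kappa_same = weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4])
# Agreement no better than chance gives zero.
kappa_chance = weighted_kappa([1, 1, 2, 2], [1, 2, 1, 2], categories=(1, 2))
```

Weighting penalizes a shift from 1 to 4 more than a shift from 3 to 4, which suits ordinal Likert responses.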
Statistical analyses were performed using SPSS for Windows version 25.0 (IBM Corp., Armonk, NY), with the exception of weighted kappa values, which were calculated using Stata version 16.0.

Results
Of the 450 eligible women, 175 (38.8%) completed at least one questionnaire. One of the women did not respond to items 1-12, and her response was not used in these analyses, as missing items constituted more than half of the questionnaire. Five women had responded to all but one item. Their total and domain scores were calculated as mean scores of the completed items, as instructed in the original study. Table 1 shows key characteristics of the participants, their deliveries and infants. Women whose labor lasted no more than 12 hours were more likely to score higher in Own capacity, Perceived safety and the whole questionnaire than women whose labor continued beyond 12 hours. Similarly, women who needed oxytocin augmentation reported lower scores in these domains as well as a lower total score. Women who had had a spontaneous birth had a higher total score and higher scores in all domains except Professional support compared to women who had had an emergency cesarean or vacuum delivery. Mean scores and effect sizes are described in detail in Table 3.

Discussion
Based on this study, the Finnish translation of the Childbirth Experience Questionnaire (CEQ-FI) is a valid tool for assessing the childbirth experience of Finnish-speaking women. Internal consistency of the questionnaire is good and its repeatability at least moderate, mostly substantial. The questionnaire performed well in known-groups validation, as women with known risk factors for a negative experience had lower scores.
The sample size was small due to a lower-than-expected response rate. However, the results were in line with previous studies validating other translations of the CEQ. Other studies have increased the response rate by sending reminders (4,7), using a web questionnaire or administering the questionnaire during hospitalization immediately after delivery (8,9). However, sending reminders may result in delayed responses, which in turn may affect the item responses (15). Administering the questionnaire while the woman is still receiving care may likewise affect the item responses.
The structure of the CEQ has been modified in studies validating other versions, due to cultural and linguistic differences as well as differences in management practice. The Spanish study made slight alterations to the wording of the items and moved one item to another domain (4). The Chinese CEQ was translated from the English translation, and three items were completely removed to improve construct validity. Furthermore, three items in the Chinese CEQ were moved to another domain. A high ceiling effect was observed especially in items regarding labor pain, which the authors attributed to Chinese traditions of accepting intense labor pain as normal and to lower analgesia use compared to Western countries (8). The Persian translation was not subject to structural changes, but the item evaluating the midwife's support for the partner was seen as irrelevant, as partners are not allowed in deliveries in Iran (9). In contrast to these studies, the structure of the questionnaire could easily be retained in its original form, owing to the very similar culture in both society and hospital practices in Sweden and Finland. We were also able to translate the questionnaire directly from the original Swedish version and preserve the original wording and structure of the items.
This study was designed with the original Swedish validation study in mind. Primiparity predisposes women to a negative childbirth experience (12), and only primiparous women were included to control for this effect, as in the Swedish study. The Spanish validation study nevertheless showed that the CEQ is also appropriate for assessing the childbirth experience of multiparous women. The validation study of the Persian CEQ included only spontaneous vaginal deliveries, and the Chinese and Spanish validation studies excluded all cesarean deliveries. In Finland, 15.6% of primiparous women had an emergency cesarean delivery in 2018 (16). Operative delivery is a risk factor for a negative childbirth experience (12), and the validity of the questionnaire is of paramount importance in this subgroup as well. In this study, labor induction was not an exclusion criterion, unlike in the original Swedish study. Labor induction has become increasingly prevalent: at the time of development of the CEQ in Sweden in 1999, 7.7% of primiparous women were induced (17), whereas 32.8% of primiparous women underwent this procedure in Finland during 2018 (16).
Internal consistency of the questionnaire is good for the whole questionnaire as well as for all domains with the exception of Participation. This was also seen in the original Swedish study, which reported a Cronbach's alpha of 0.62 for Participation. Similarly, effect sizes of known risk factors for a negative experience were quite small, especially for Participation, in both the Finnish and Swedish studies. In the Swedish study, this was thought to be related to the small number of items in the domain, and it was suggested that rewording or adding items could increase the value of the domain (6). A revised version of the original questionnaire has since been developed and validated in English (18), Swedish (19) and Persian (9) language versions.
The English validation study of the original CEQ compared the questionnaire to the previously validated Maternity Survey and showed high correlation (4). No validated tool exists in Finnish, and thus the criterion validity of CEQ-FI cannot be evaluated. However, as known risk factors significantly decreased the total score and the scores of most domains, CEQ-FI is likely to be able to discriminate a negative birth experience.
Test-retest reliability was assessed by computing a weighted kappa coefficient for each item as well as for all domains and the total score, unlike in the English validation study, which did not report item-specific kappa values. Compared to that study, domain-specific and total score agreement was at least as good in our study, and agreement in Participation reached a substantial level (moderate in the English study).
The CEQ and its translations can be used to evaluate individual childbirth experience. In addition, the total or domain scores may be used, for example, to compare the impact of different management practices on maternal experience. The questionnaire may also be used in clinical settings for structuring discussions with women who have given birth and may need extra support. CEQ-FI is thus suitable for research and clinical settings and is a valid and reliable tool for assessing childbirth experience in multiple dimensions.

Declarations
Ethics approval and consent to participate
All participants signed a written agreement to participate in the study. The study design was approved by the Ethical Committee of Pirkanmaa Hospital District (code R18205).

Not applicable
Availability of data and materials

Table 1
Demographic data of the study participants (n = 174)

Table 2 shows Cronbach's alpha for domain scores in this study as well as in validation studies for other versions of the CEQ.

Table 3
Mean scores of CEQ-FI in subgroups with risk factors for a negative childbirth experience. Mann-Whitney U test was used to calculate p values.

The follow-up questionnaire was completed by 111 women (63.4% response rate). Weighted kappa was calculated for each item separately (shown in Table 4) as well as for each domain and the total score. Sixteen items displayed substantial and six items displayed moderate agreement. All domains displayed substantial agreement: κ = 0.69 for Own capacity, κ = 0.76 for Professional support, κ = 0.74 for Perceived safety and κ = 0.67 for Participation. Agreement for the total score was also substantial (κ = 0.80).

Table 4
Items of the CEQ and their test-retest reliability, presented in item-specific weighted kappa coefficients (14). Kappa values in this data are interpreted as moderate (0.41-0.60) or substantial (0.61-0.80), as described by Landis et al. (14).
I felt I could have a say in the choice of pain relief: 0.67