Development and Validation of Models to Predict Cesarean Delivery among Low-Risk Nulliparous Women at Term: A Retrospective Study in China

Background: Intrapartum cesarean delivery has been the focus of many researchers. We derived and validated a model to predict cesarean delivery for low-risk Chinese nulliparous women undergoing induction of labor. Methods: We developed a risk model for cesarean delivery by entering variables into univariate and multivariable logistic regression using the development set (3841 pregnant women). The performance of the model was assessed with the receiver operating characteristic (ROC) curve, calibration, and decision curve analysis (DCA). Additionally, we validated the model externally using an independent dataset (3421 pregnant women). Results: Multivariable logistic regression analysis showed that age, height, body mass index (BMI), weight change during pregnancy, gestational age, premature rupture of membranes (PROM), meconium-stained amniotic fluid, and neonatal sex were independent factors affecting cesarean outcome. Two models were established, depending on whether the sex of the fetus was included. The areas under the ROC curve of the two models were 0.755 and 0.748, respectively. On external validation, the areas under the ROC curve of the two models were 0.758 and 0.758, respectively. The calibration plots demonstrated good agreement between predicted and observed risk. DCA demonstrated that the two models had clinical application value. Online web servers were constructed based on the nomograms for convenient clinical use. Conclusions: These two models can be used as tools to assess the risk of cesarean delivery for low-risk Chinese nulliparous women undergoing induction of labor.


Background
Cesarean delivery is a life-saving surgical procedure in obstetrics. However, it may carry risks for subsequent pregnancies as well as long-term effects for both mothers and their offspring that are still under investigation [1-3]. With urbanization, the relaxation of the one-child policy, and the introduction of the universal two-child policy in China, a growing number of pregnant women are more willing to deliver vaginally because they have learned about the disadvantages and risks of cesarean section by using the Internet to obtain medical information about pregnancy and childbirth [4,5]. Unfortunately, delivery is a complex and dynamic process. Sometimes a trial of vaginal delivery does not go well, and the woman has to undergo an emergency operation. Moreover, maternal and perinatal complications are more frequent when a failed trial of labor results in an emergency cesarean delivery [6-8].
An arduous birth experience can have lifelong negative effects on the mother, child, family, clinic, and society [9,10]. In the era of predictive medicine, a prediction algorithm to identify women at risk of an intrapartum cesarean could help reduce labor-associated morbidity and mortality.
Currently, most published models for predicting cesarean delivery have focused on white populations [7,11-14], whereas data on the Chinese population, the majority of whom are of Han ethnicity, are insufficient. To date, relevant research in China is lacking, and limited empirical evidence and clinical experience have been reported.
Therefore, the objective of our study was to derive and validate a clinical prediction model that uses variables readily available from maternal and fetal data to predict the risk of cesarean delivery for low-risk nulliparous women at term. Such calculators may give physicians an evidence-based tool to assist with patient counseling and to provide individualized guidance on delivery mode.

Participants
We conducted a retrospective cohort study of low-risk nulliparous women with singleton, term, cephalic pregnancies who delivered at the First Affiliated Hospital of Soochow University and Sihong County People's Hospital. The former hospital is a tertiary referral center, while the latter is a secondary referral center. The overall study had two distinct phases: development and validation. The prediction model was developed on a sample of 6,551 women who delivered at the First Affiliated Hospital of Soochow University between January 1, 2011, and August 31, 2017.
External validation of the prediction model was then performed using data from Sihong County People's Hospital collected between January 1, 2013, and December 31, 2019, to assess whether the model had the same predictive ability as in the derivation cohort.
Institutional Review Board approval was obtained from both hospitals, and informed consent was waived for this retrospective study. Methods and reporting followed the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) statement [15] (Additional file 1).

Inclusion and exclusion criteria
Low-risk nulliparous women undergoing labor with singleton, term (37 0/7 weeks of gestation or greater), cephalic pregnancies were recruited. Women with antepartum intrauterine fetal death or fetal anomalies were excluded. Further exclusion criteria were as follows: (1) complications during pregnancy (e.g., cardiac failure, severe liver and kidney diseases, hypertensive disorders of pregnancy, diabetes, oligohydramnios, placenta previa, vasa previa, and fetal growth restriction); (2) a scarred uterus (e.g., after myomectomy); (3) contraindications to vaginal delivery; (4) cesarean delivery on maternal request.
The women who met the criteria were divided into the intrapartum cesarean delivery group and the vaginal delivery group according to their mode of delivery.

Data collection
Data on maternal characteristics and perinatal parameters were collected from the institution's obstetrics database, which was compiled through review of patients' medical records.

Characteristics
The outcome of interest was defined as cesarean delivery. A cesarean delivery was performed for fetal distress, arrested active phase, prolonged latent phase, prolonged second stage, arrested descent, suspected chorioamnionitis, or other medical indications, such as threatened uterine rupture. It is worth noting that we collected only the major indication for each delivery. The candidate predictor variables had to be easily accessible from routinely recorded patient characteristics. To identify predictor variables, a systematic review of the literature was conducted [7,11-14,16-18]. The following variables were recorded: maternal age, height, weight, baseline body mass index (BMI), weight change during pregnancy, gestational age at delivery, premature rupture of membranes (PROM), epidural analgesia, meconium-stained amniotic fluid, intervention measures (oxytocin, amniotomy, disposable cervical dilator balloon, prostaglandin (Propess or misoprostol)), neonatal sex, and neonatal birth weight.

Operational definitions
The relevant guidelines [19,20] were used to determine cesarean delivery indications such as arrest of descent and a prolonged second stage of labor. BMI was calculated as weight (kg)/[height (m)]². Baseline BMI was defined as pre-pregnancy BMI. Gestational age was calculated from the date of the last menstrual period and confirmed by ultrasound examination during the first trimester (by measuring the crown-rump length) or the second trimester (by measuring biparietal diameter, abdominal circumference, and femur length).
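The BMI definition above can be written as a one-line function. This is a minimal sketch; the function name and the example weight and height are illustrative and do not come from the study data.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Illustrative values only: a woman weighing 55 kg at 1.60 m before pregnancy.
baseline_bmi = bmi(55.0, 1.60)
print(f"baseline BMI = {baseline_bmi:.1f} kg/m^2")
```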
We analyzed the labor process of each participant. Participants were divided into a non-intervention group and an intervention group according to whether they received intervention measures. These measures included oxytocin, amniotomy, a disposable cervical dilator balloon, and prostaglandin. The non-intervention group was defined as women who entered labor naturally and whose labor proceeded without intervention. The intervention group was divided into an augmentation subgroup and an induction subgroup according to whether the woman's cervix was dilated by 6 cm at the time of intervention. The cut-off of 6 cm was chosen because contemporary labor data indicate that active labor starts at 6 cm of cervical dilation [21]. The augmentation subgroup comprised women who received intervention measures such as oxytocin augmentation and amniotomy when their cervical dilation was greater than or equal to 6 cm. The induction subgroup comprised women who received intervention measures when their cervical dilation was less than 6 cm. The induction subgroup was further divided by induction method as follows.
(1) Oxytocin induction group: women received oxytocin induction only. This was defined as Induction method 1.
(2) Amniotomy group: women received artificial rupture of membranes, alone or together with oxytocin induction (amniotomy after prostaglandin E2 or a disposable cervical dilator balloon was not included). This was defined as Induction method 2.
(3) Disposable cervical dilator balloon group: women received induction with a disposable cervical dilator balloon. This was defined as Induction method 3.
Because the women in the augmentation subgroup entered the active stage of labor naturally, the augmentation subgroup and the non-intervention group were combined into one group, which served as the reference group for the induction subgroup. For convenience, this combined group was named the spontaneous labor group.
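The grouping rules above can be sketched as a small classification function. This is an illustrative reading of the text, not the authors' code: the record fields are hypothetical, the precedence of balloon over amniotomy follows the exclusion stated for Induction method 2, and prostaglandin induction is given an unnumbered label because it is not enumerated in the list above.

```python
from dataclasses import dataclass
from typing import Optional

ACTIVE_LABOR_CM = 6.0  # cut-off for the start of active labor used in the text


@dataclass
class LaborRecord:
    # Hypothetical field names; the study database schema is not described.
    oxytocin: bool = False
    amniotomy: bool = False
    balloon: bool = False
    prostaglandin: bool = False
    dilation_at_intervention_cm: Optional[float] = None


def labor_group(rec: LaborRecord) -> str:
    """Assign one woman to the spontaneous labor group or an induction method."""
    intervened = rec.oxytocin or rec.amniotomy or rec.balloon or rec.prostaglandin
    if not intervened:
        return "spontaneous"  # non-intervention group
    if rec.dilation_at_intervention_cm is None:
        raise ValueError("cervical dilation at intervention is required")
    if rec.dilation_at_intervention_cm >= ACTIVE_LABOR_CM:
        return "spontaneous"  # augmentation subgroup, merged into the reference group
    # Induction subgroup (dilation < 6 cm). Precedence below is an assumption.
    if rec.balloon:
        return "induction method 3"
    if rec.prostaglandin:
        return "induction: prostaglandin"  # not numbered in the list above
    if rec.amniotomy:
        return "induction method 2"  # amniotomy alone or with oxytocin
    return "induction method 1"  # oxytocin only


print(labor_group(LaborRecord(oxytocin=True, dilation_at_intervention_cm=2.0)))
```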

Statistical analysis
Data analysis was conducted using SPSS (version 24.0) and R (version 3.6.2).
Univariate analysis was performed for all clinical data. For continuous variables, tests of normality were performed first. Student's t-test was used to compare continuous variables with a normal distribution. The Mann-Whitney test was used to compare discrete variables or continuous variables without a normal distribution. The chi-square test and Fisher's exact test were used, as appropriate, for categorical variables. Standard descriptive statistics (mean ± standard deviation or median and interquartile range) were used to summarize continuous variables. Percentages and frequencies were used for categorical variables.
Baseline variables that were considered clinically relevant or that showed a univariate relationship with the outcome (candidate variables with a P value < 0.05 on univariate analysis) were entered into the multivariable logistic regression model. Variables for inclusion were carefully chosen, given the number of events available, to ensure parsimony of the final model. The results of the logistic regression models were presented as odds ratios (ORs) with their 95% confidence intervals (CIs).
The discrimination and calibration of the prediction model were evaluated. Discrimination is the extent to which the model distinguishes women who undergo cesarean delivery from those who do not. Calibration refers to the extent to which the calculated risks reflect the actual percentage of women with the outcome in each group; that is, the agreement between observed outcomes and predictions. The area under the receiver operating characteristic curve (AUC ROC) was calculated to assess discrimination. The AUC ROC was interpreted using the following categories: non-informative (AUC ROC = 0.5), poor accuracy (0.5 < AUC ROC < 0.7), moderate accuracy (0.7 < AUC ROC < 0.9), high accuracy (0.9 < AUC ROC < 1), and perfect accuracy (AUC ROC = 1) [22]. The calibration of the prediction model was assessed using the Hosmer-Lemeshow goodness-of-fit test (P > 0.05 was taken to indicate good fit) and/or a calibration plot. The ideal calibration curve is a 45-degree straight line; perfect calibration is represented graphically by a slope of 1 and an intercept of 0. Decision curve analysis (DCA), a method for evaluating net benefit, was used to assess the clinical value of the model. A decision curve was plotted to inform clinicians about the range of threshold probabilities over which the models would be of clinical value once deployed in clinical practice. We assessed internal validity with a bootstrapping technique (1000 resamples). We conducted an external validation of the final model in another hospital and reported the predictive performance in the validation cohort using the same measures of discrimination and calibration. Based on the final model, we created a graphic nomogram representing the predictive model in R. We established a dynamic nomogram using the rms and DynNom packages and built online web applications through shinyapps. All P values were two-tailed, and a significance level of 5% was used.
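To make the discrimination measure concrete, the sketch below computes the AUC as the probability that a randomly chosen woman with the outcome receives a higher predicted risk than a randomly chosen woman without it, and then maps the value onto the interpretation bands quoted above. The study itself used SPSS and R; this Python version is only an illustration, the toy data are invented, and the handling of the exact boundaries 0.7 and 0.9 and of values below 0.5 (which the quoted bands leave unassigned) is an assumption.

```python
def auc(y_true, y_score):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_category(value: float) -> str:
    """Interpretation bands from the text; boundary handling is an assumption."""
    if value >= 1.0:
        return "perfect"
    if value > 0.9:
        return "high"
    if value > 0.7:
        return "moderate"
    if value > 0.5:
        return "poor"
    return "non-informative"


# Toy data, not from the study: 1 = cesarean, 0 = vaginal delivery.
toy_outcome = [0, 0, 1, 1]
toy_risk = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc(toy_outcome, toy_risk)
print(toy_auc, auc_category(toy_auc))
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a rank-based formula gives the same value more efficiently on large cohorts.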

Study population and outcomes
During the study period, 18,228 deliveries were managed at our tertiary referral center. After exclusion of ineligible patients, 6,551 women were enrolled (Additional file 2). Among the recruited subjects, 576 (8.79%) women gave birth by intrapartum cesarean. Most cesarean deliveries were performed for concerns about fetal heart rate patterns in labor (279/576, 48.44%) and arrest of dilation in the active phase of the first stage of labor (120/576, 20.83%).

Characteristics of pregnancies
Participants were divided into vaginal delivery and cesarean delivery groups; their demographic and clinical characteristics are compared in Table 1.
Women in the cesarean delivery group were older and shorter, had a higher BMI and greater weight gain during pregnancy, and delivered later than those in the vaginal delivery group. The incidence of PROM, the rate of labor induction, and the proportion of cases with meconium-stained amniotic fluid were all higher in the cesarean delivery group. There were also more male infants and a higher neonatal birth weight in the cesarean delivery group. There was no significant difference in the rate of labor epidural analgesia between the two groups. Women who were older than 26.5 years, were shorter than 160.5 cm, had a gestational age of more than 279 days, had a baseline BMI above 21.3 kg/m², gained more than 13.3 kg during pregnancy, or gave birth to a newborn weighing more than 3,465 g were more likely to undergo cesarean delivery.

Derivation of a model
Factors that were statistically significantly associated with cesarean delivery in univariate analysis or that were considered clinically relevant were entered into the multivariable logistic regression. The results of the analysis are shown in Figure 1 and Table 2.
Multivariable logistic regression analysis showed that maternal age, height, gestational age, baseline BMI, weight change during pregnancy, PROM, method of induction, degree of meconium-stained amniotic fluid, and neonatal sex were predictors of cesarean delivery. Developed with data from 6,551 women with complete data, the final equation

Discrimination, calibration and decision curve analysis
The areas under the receiver operating characteristic curve, calibration plots, and decision curve analyses are presented in Figure 2. Model 1 achieved an AUC of 0.782 (95% CI: 0.771-0.791). According to the ROC curve, the optimal cutoff for the predicted probability was 7.45%. Among women with a predicted probability above this cutoff, there were 452 cesarean deliveries, an incidence of 17.6%. The sensitivity of Model 1 was 78.47% and the specificity was 64.55%. The calibration plot of Model 1 showed good agreement between the predicted and actual probabilities of cesarean delivery (Hosmer-Lemeshow test: P = 0.263; the slope and intercept of the calibration plot were 1.000 and 0.000, respectively; Figure 2B). The AUC for Model 2 was 0.774 (95% CI: 0.763-0.784). Using a predicted probability cutoff of 8.7% to define a positive test classified women with good accuracy (sensitivity 70.66%; specificity 70.59%). The Hosmer-Lemeshow goodness-of-fit test showed no statistically significant difference between the calibration plot and the 45-degree line (Hosmer-Lemeshow test: P = 0.817; the slope and intercept of the calibration plot were 1.000 and 0.000, respectively; Figure 2C), suggesting that Model 2 was well calibrated.
DCA is a novel method for examining diagnostic and prognostic strategies; it can be used to evaluate and compare different predictive models and to identify the net benefit of a prediction model [23]. We therefore used DCA to assess the models' prediction of cesarean delivery. The results (Figure 2D-F) indicated that the two models were useful for threshold probabilities of 4% to 60%. There was no significant difference in net benefit between the two models. The clinical impact curves also demonstrated the clinical utility of the models.
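The figures reported for Model 1 at its cutoff can be cross-checked with a short calculation. In the sketch below, the 452 detected cesareans, the 576 total cesareans, and the 6,551 women come from the text; the true-negative and false-positive counts are not reported and are back-calculated here from the stated 64.55% specificity, so they should be read as an assumption.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, positive predictive value and Youden index."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "youden": sensitivity + specificity - 1.0,
    }


total_women = 6551
cesareans = 576
tp = 452                                # cesareans above the 7.45% cutoff (from the text)
fn = cesareans - tp                     # 124
non_cesareans = total_women - cesareans  # 5975
tn = round(0.6455 * non_cesareans)      # assumption: derived from reported specificity
fp = non_cesareans - tn

m = classification_metrics(tp, fn, fp, tn)
print({k: round(v, 4) for k, v in m.items()})
```

Under these assumptions the positive predictive value comes out at about 17.6%, which matches the incidence quoted for women above the cutoff.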
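The net benefit plotted in a decision curve can be sketched as follows, using the standard definition (true-positive rate minus false-positive rate weighted by the odds of the threshold probability). The function names and toy data are illustrative and are not taken from the study; the authors' analysis was run in R.

```python
def net_benefit(y_true, y_prob, threshold: float) -> float:
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold: float) -> float:
    """Reference strategy: classify every woman as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, not from the study: 1 = cesarean, 0 = vaginal delivery.
toy_outcome = [1, 0, 1, 0]
toy_risk = [0.9, 0.2, 0.6, 0.7]
print(net_benefit(toy_outcome, toy_risk, 0.5))
print(net_benefit_treat_all(toy_outcome, 0.5))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy); a decision curve repeats this comparison across a range of thresholds.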

Internal and External validation and predictive performance
Internal validation using the bootstrap method indicated that the cesarean delivery rates predicted by both models were consistent with the observed data (Figure 3A-B).
We validated the two formulas using a separate data set from another hospital (Sihong County People's Hospital). External validation of the logistic regression equations was performed using a cohort of 7,657 low-risk nulliparous women who met the same inclusion criteria as those in the original data set used to create the model (Additional file 3). It is worth noting that the cervical dilator balloon and Propess were not routinely used as induction methods in Sihong County People's Hospital during the study period. In this hospital, 25 micrograms of misoprostol was used as the prostaglandin for cervical ripening and induction. Because misoprostol is also a prostaglandin induction method, it replaced Propess in Induction method 4 for external validation. That is, if misoprostol was used to induce labor, the value of Induction method 4 in the prediction formula was 1. When the two formulas were applied to this cohort, they achieved AUCs of 0.775 (95% CI 0.755-0.796) and 0.775 (95% CI 0.754-0.796) (Figure 3C-D). The calibration plots showed acceptable agreement between prediction and actual observation in the validation cohort for Model 1 and Model 2 (Figure 3E-F).

Nomogram
A nomogram was created to represent the logistic regression model; it can be used to generate a patient-specific risk of intrapartum cesarean (Figure 4). For a given woman, each characteristic is aligned with the corresponding number of points on the points axis, and the points are summed. The total score is then aligned with the predicted probability of cesarean delivery. We also developed a user-friendly software-based calculator that gives the percentage likelihood of cesarean delivery. It can be found at: https://fangcan.shinyapps.io/CSsexDynNomapp/, https://fangcan.shinyapps.io/CSDynNomapp/. We used two examples to illustrate the dynamic web nomogram (Additional file 4).
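A nomogram is a graphical rendering of the underlying logistic regression: the summed points correspond to the linear predictor, which is converted to a probability by the logistic function. The sketch below shows that conversion. The coefficients and predictor names are placeholders chosen purely for illustration; they are NOT the fitted values from this study, which are not reproduced in this section.

```python
import math

# Placeholder coefficients for illustration only; not the study's fitted model.
EXAMPLE_COEFS = {
    "intercept": -2.0,
    "age_years": 0.05,
    "height_cm": -0.03,
    "prom": 0.40,  # 1 if premature rupture of membranes, else 0
}


def predicted_probability(features: dict, coefs: dict) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = coefs["intercept"] + sum(
        coefs[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical woman: 28 years old, 160 cm tall, with and without PROM.
without_prom = predicted_probability(
    {"age_years": 28, "height_cm": 160, "prom": 0}, EXAMPLE_COEFS
)
with_prom = predicted_probability(
    {"age_years": 28, "height_cm": 160, "prom": 1}, EXAMPLE_COEFS
)
print(f"{without_prom:.3f} -> {with_prom:.3f}")
```

With real coefficients, the same function would reproduce the probability that the web calculator reports for a given set of inputs.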

Discussion
Intrapartum cesarean delivery has attracted the attention of many researchers. This study aimed to identify predictors of cesarean delivery among low-risk women and to develop a clinical prediction model. The model allows physicians to assess the individual risk of a pregnant woman before childbirth. It can be used during counseling to increase acceptance of vaginal delivery among those with a high chance of success and to minimize procedures performed on women with a low chance of successful vaginal delivery. Screening for high-risk women and applying adequate interventions may reduce the risk of adverse outcomes. To achieve the best outcomes for mothers and babies, medical staff need to give pregnant women appropriate advice on the mode of delivery.
The model highlights the importance of maternal age, maternal height, pre-pregnancy BMI, gestational age, weight gain during pregnancy, induction method, degree of meconium-stained amniotic fluid, premature rupture of membranes, and male fetus in determining cesarean delivery after a failed trial of labor. These results are consistent with the existing literature on the risk of cesarean delivery [18,24-26]. Past studies also found that gestational age at induction and induction method were related to the incidence of cesarean delivery [27,28]. Ethnic disparity in cesarean delivery rates has also been observed: Stark et al. showed that the frequency of cesarean delivery was lowest in non-Hispanic white women and highest in non-Hispanic black women [29]. In addition, the time of delivery, clinicians' personal beliefs, and pregnant women's own views on decision-making also affect the mode of delivery [30-32].
Given the maternal and fetal implications of intrapartum cesarean, a number of prediction models aimed at determining the likelihood of cesarean delivery have been developed. Levine et al. [12] used a nomogram to develop and validate a predictive model for women undergoing induction of labor with an unfavorable cervix. Nulliparity, BMI, gestational age ≥ 40 weeks, modified Bishop score, and height were significantly associated with cesarean delivery. This model had an AUC of 0.79 in the development cohort and 0.73 in the validation cohort. Jochum et al. [13] developed a scoring system for predicting cesarean delivery after labor induction with cervical ripening, based on a secondary data analysis. Height, BMI, gestational age, parity, dilation, effacement, fetal head station, medical indication, suspicion of macrosomia, PROM, and concerning fetal status were strongly associated with cesarean delivery. The AUC ROC values in the derivation set and internal validation set were 0.76 and 0.74, respectively. Rossi et al. [14] developed and validated a predictive risk calculator for cesarean delivery among women undergoing induction of labor. Seven independent factors were associated with the risk of cesarean delivery: prior vaginal delivery, maternal weight at delivery, height, age, prior cesarean delivery, gestational age at induction, and maternal race. The model had an area under the receiver operating characteristic curve of 0.787 (95% CI 0.786-0.788), and it performed well on external validation (0.783, 95% CI 0.764-0.802).
However, it is uncertain whether these models are applicable to Chinese women, who tend to be relatively lean and to have a lower BMI. To date, relevant studies in China are lacking. Therefore, based on an analysis of the clinically relevant factors in existing prediction models and on China's fertility policy (fetal sex identification is prohibited), this study established risk prediction models for cesarean delivery in low-risk pregnant women using multivariable logistic regression analysis. These prediction models discriminated well (Model 1: AUC ROC 0.782; Model 2: AUC ROC 0.774). Moreover, external validation in women from another hospital demonstrated consistent discrimination, with AUCs of 0.775 (95% CI 0.755-0.796) and 0.775 (95% CI 0.754-0.796), respectively. The results of external validation showed that our prediction models can be extended to the data set of another center, which broadens their potential application. The online web servers were constructed based on the nomograms to facilitate clinical practice. Clinicians can carry out a risk assessment and provide appropriate advice to patients at any time using a mobile phone.
To our knowledge, the model developed in this study, which uses information on maternal factors and robust modeling methods, is the first model applicable to Chinese low-risk women. The models showed good predictive ability in both the internal and external validation populations. Further investigation of model validity and impact should be undertaken before the models are generalized.
This study has several limitations. Firstly, it is retrospective in nature, so some data are inevitably missing and inherently biased. For example, the success of induced labor depends mainly on cervical ripening, and the Bishop score has most often been used to describe cervical ripeness. This score was not well documented in the retrospective cases, so we did not include it in the final prediction model. The literature shows that the Bishop score was originally designed to predict the likelihood that multiparous women at term would enter spontaneous labor, which may make it less predictive of outcome after labor induction in nulliparas. A systematic review also concluded that the Bishop score was a poor predictor of the outcome of induced labor at term [33]. Secondly, the definition of indications for intrapartum cesarean was not completely consistent. The study period, 2011 to 2017, spanned the transition from the old to the new standard for the management of labor. The Chinese expert consensus on the new standard and management of labor was issued in 2014, and the tertiary referral hospital gradually implemented this guideline from January 1, 2015. In our retrospective analysis, we found that some indications did not fully follow the new standard. In fact, it is difficult to unify or standardize labor management in clinical practice: women differ individually, and interventions in labor vary. Moreover, the indication for intrapartum cesarean relates not only to the progress of labor but also to the pregnant woman, the fetus, and the medical staff. Thirdly, the sample size of the study was relatively small, and the included subjects came from a single-center population in the region. The study population was also limited to low-risk nulliparous women. Given the diversity of geography, economy, medical level, and environment throughout China, our findings may not be representative of the Chinese population in many jurisdictions. Fourthly, neonatal birth weight was associated with an increased risk of cesarean delivery, but it was not included in the final prediction model because it cannot be known accurately before delivery. Ultrasonography is currently the main method of estimating fetal weight, but it is not accurate enough. Ethnicity, the interval between ultrasound examination and delivery, fetal sex, fetal position, maternal BMI, and the seniority of the sonographer all influence sonographically estimated fetal weight. Finally, compared with other prediction models, there are variations in induction rates and in the variables most likely to influence successful vaginal delivery. Compared with the scoring system of Jochum et al. [13], the prediction model in this study, developed by logistic regression analysis, involves a more complex process. However, it is accurate in calculation and has high sensitivity and specificity. The nomogram makes the model convenient for clinicians to use in the clinic. In addition, we developed a software-based calculator that gives the percentage likelihood of cesarean delivery.

Conclusion
In conclusion, multivariable analysis showed that maternal age, height, BMI, weight gain during pregnancy, gestational age, mode of labor induction, meconium-stained amniotic fluid, presence of PROM, and fetal sex were independent risk factors for cesarean delivery in this study. These variables were used to develop a clinically useful calculator that provides an individualized risk assessment for intrapartum cesarean among low-risk nulliparous women at term. With the adjustment of China's family planning strategy and the release of the second-child policy, more pregnant women face a choice of delivery mode. The prediction model built on these factors has good predictive performance.
Obstetricians and midwives can use the tool to predict which women are likely to need surgery. Women identified as high risk could be offered elective cesarean delivery, which could avoid additional adverse effects. Those at low risk should be reassured and encouraged to choose vaginal delivery. However, the benefits of using the models should be demonstrated before their routine introduction into clinical practice. Further multicenter studies with large samples are warranted to optimize these models. It is worth noting that the models should be combined with clinical assessment of each patient rather than applied in isolation. The tool provides evidence-based knowledge to support women's choice of delivery mode, to improve maternal and perinatal outcomes, and to optimize the allocation of resources.

Abbreviations
TRIPOD, Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis; BMI, body mass index; PROM, premature rupture of membranes; OR, odds ratio; CI, confidence interval; AUC ROC, area under the receiver operating characteristic curve; DCA, decision curve analysis.

Declarations
Ethics approval and consent to participate
Institutional Review Board approval was obtained from the First Affiliated Hospital of Soochow University and Sihong County People's Hospital, and informed consent was waived for this retrospective study. A total of 14,208 study participants were recruited in two ways. First, 6,551 consecutive pregnant women who delivered from January 1, 2011, to August 31, 2017, were recruited retrospectively and included in the final analysis (2018019). Second, 7,657 women from Sihong County People's Hospital who delivered between January 1, 2013, and December 31, 2019, were recruited retrospectively into a validation cohort.

Figure 3 Internal

Table 1. Maternal and neonatal characteristics by mode of delivery
BMI, body mass index; PROM, premature rupture of membranes. Data are mean ± standard deviation or n/N (%). * Two-sided P based on the χ2 test for categorical variables and the t test for continuous variables.