This study aimed to optimize the Rossi nomogram for predicting cesarean delivery following IoL. In total, 721 patients who underwent induction were included to externally validate and update the original model. In the external validation, the original model exhibited modest discrimination, with an AUC of 0.789. However, the calibration plots demonstrated that the original model fit poorly (HL p-value = 0.0047). After the Bishop score was added as a new variable, the discriminatory performance of the updated model improved from an AUC of 0.789 to an AUC of 0.811, and calibration improved markedly (HL p-value = 0.775). Moreover, both the original Rossi model and the updated model yielded higher net benefits at probability thresholds between 0% and 60%, which means that using the models to guide decisions is beneficial for patients who fall within this range of probability thresholds.
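For readers less familiar with the two metrics summarized above, the following is a minimal sketch of how an AUC and a Hosmer–Lemeshow (HL) statistic can be computed from observed outcomes and predicted probabilities. It is not the software used in this study. The function names, the toy data, the equal-size risk grouping, and the choice of degrees of freedom are all assumptions made for illustration; in particular, the closed-form p-value helper only handles an even number of degrees of freedom.

```python
import math


def auc(y_true, y_prob):
    """Probability that a random event has a higher predicted risk than a
    random non-event (ties count one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow_statistic(y_true, y_prob, groups=10):
    """HL chi-square statistic over equal-size groups of sorted predicted risk
    (the grouping rule is an assumption of this sketch)."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1 - expected / size)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch supports positive even df only")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical toy data, not study data.
y = [0, 0, 1, 0, 1, 1]
p = [0.10, 0.30, 0.35, 0.40, 0.70, 0.80]
stat = hosmer_lemeshow_statistic(y, p, groups=2)
print("AUC:", auc(y, p))
print("HL statistic:", stat, "p-value (df=2):", chi2_sf_even_df(stat, df=2))
```

A small HL p-value, such as the 0.0047 reported for the original model, indicates that observed and expected event counts disagree across risk groups; a large one, such as 0.775, gives no evidence of such disagreement.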
This study was the first to validate the nomogram developed by Rossi et al. in a Chinese population. This nomogram has previously been validated in a different cohort. In 2021, Lopez-Jimenez et al. [17] validated this nomogram in a prospective cohort of Spanish women, obtaining an AUC of 0.752 (95% CI 0.707–0.797) compared with an AUC of 0.787 for the model of Rossi et al., which showed that the model’s performance was modest. They also reported an adequate result for the Hosmer–Lemeshow test (p = 0.094), although the calibration belt and test showed the model to have inadequate calibration (p = 0.032). In our cohort, in which the cesarean delivery rate (183/721, 25.4%) was higher than in the derivation group (800,432/4117,644, 19.2%), the model’s performance was found to be modest (AUC = 0.789; 95% CI 0.753–0.825). However, the calibration revealed that the model was accurate for predictions of < 40%, whereas it underpredicted the actual success for predictions of > 40%. In the original study describing the development of the model by Rossi et al., the model demonstrated excellent calibration until the expected risk exceeded 80%, after which it was noted to overpredict the risk. This variability in accuracy may be partly due to the higher cesarean delivery rate after IoL in this study and to geographic and racial or ethnic diversity, although it may also be that the model is miscalibrated at higher predicted probabilities of success.
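The comparison of calibration below and above a predicted probability of 40% can be illustrated with a simple banded summary of mean predicted probability against observed event rate. This is a minimal sketch under stated assumptions: the function name `calibration_by_band`, the two-band split, and the toy data are hypothetical and do not reproduce the calibration plots of this study.

```python
def calibration_by_band(y_true, y_prob, cutoff=0.40):
    """Compare mean predicted probability with the observed event rate
    below and at/above a cutoff (two bands only, for illustration)."""
    bands = {"below": [], "at_or_above": []}
    for y, p in zip(y_true, y_prob):
        key = "below" if p < cutoff else "at_or_above"
        bands[key].append((y, p))
    summary = {}
    for name, rows in bands.items():
        if rows:
            n = len(rows)
            summary[name] = {
                "n": n,
                "mean_predicted": sum(p for _, p in rows) / n,
                "observed_rate": sum(y for y, _ in rows) / n,
            }
    return summary


# Hypothetical toy data, not study data.
y = [0, 0, 1, 1, 1, 0]
p = [0.1, 0.2, 0.3, 0.5, 0.6, 0.7]
bands = calibration_by_band(y, p)
for name, row in bands.items():
    print(name, row)
```

In a well-calibrated band, the mean predicted probability and the observed rate are close; a band in which the observed rate clearly exceeds the mean prediction is one where the model underpredicts.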
Traditionally, clinicians have used the mother’s cervical status as the best predictor of successful IoL. Many prior studies have identified a favorable starting cervix as a major driver of successful induction [5, 12, 13, 14]. Nevertheless, using the Bishop score as a single predictor variable is questionable because it comprises five individual components, not all of which have been shown to be equally important in determining the success of IoL [13, 15, 18]. Additionally, a systematic review found that the initial Bishop score did not affect the route of delivery after IoL [19]. In our study, the Bishop score was found to be predictive. Compared with the original model, the AUC of the updated nomogram that included the Bishop score at the time of induction was significantly improved from 0.789 to 0.811, and the calibration performance was also improved significantly. Moreover, the models achieved predictive accuracy similar to that of the Levine induction calculator (AUC 0.79) [8], which was developed to calculate the risk for multiparous and nulliparous women with unfavorable cervices. The Levine model uses prospectively collected data and includes the Bishop score as a variable.
A prospective study published in JAMA [20] found that implementing Levine’s predictive calculator to determine the risk of cesarean delivery at the time of IoL could reduce the absolute risk of cesarean delivery by 8%. This suggests that the rational application of a predictive model of IoL in clinical settings may help to reduce cesarean delivery rates. In the present study, both the Rossi model and the adapted model were found to be useful in Chinese pregnant women, although it might be difficult to decrease the cesarean delivery rate after IoL across the whole of China because it depends on various factors, including medical reasons. First, treatment policy differs among obstetricians, which may lead to unsuccessful induction. For example, each obstetrician has their own criterion for performing cesarean delivery in response to an abnormal fetal heart rate during the induction process. Second, health-care providers often believe that induction is associated with failed labor and cesarean delivery [21–23]. This belief stems mostly from results obtained by comparing women who were induced with those who underwent natural labor, which is both a flawed and an unrealistic comparison. Nevertheless, this view makes physicians somewhat more inclined to favor cesarean delivery in the management of labor. Third, as the duration of induced labor extends, the anxiety of pregnant women increases. If effective psychological counseling is not available, “requested cesarean delivery” without clear medical indications may be a reason for the failure of induction. Additionally, differences in obstetricians’ subjective evaluation of the Bishop score may lead to different predictive results when using the updated model. For these reasons, even women undergoing induction with a low predicted cesarean delivery rate face a number of obstacles when pursuing the goal of vaginal delivery.
The strengths of our study include the comparison of the original and adapted models using DCA to aid in counseling patients as they make decisions regarding induction. In the validation study performed by Lopez-Jimenez et al. [17], DCA demonstrated that the Rossi model was adequate only for probabilities ranging between 0.10 and 0.55. In our study, however, both the Rossi model and the updated model added net benefits when a clinician or patient used them to make decisions at predicted probabilities of 0–60%. Therefore, the predictions of both models are most useful for the patient who is predicted to have a < 60% chance of success. In this case, the model could reassure the patient about the likelihood of success. However, use of these model predictions above a probability threshold of 60% does not provide any net benefits over allowing all patients to attempt vaginal birth.
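The net benefit that underlies a decision curve can be sketched as follows. This assumes the standard DCA definition, in which the net benefit at threshold probability p_t is TP/n − (FP/n) × p_t/(1 − p_t), and compares the model with the default strategy of classifying every patient as positive. The function names and the toy data are hypothetical; this is an illustration rather than the analysis code used in this study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions at or above the threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_all(y_true, threshold):
    """Net benefit of the default strategy that classifies everyone as positive."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data, not study data.
y = [1, 1, 0, 0]
p = [0.9, 0.6, 0.4, 0.1]
for t in (0.2, 0.5):
    print(t, net_benefit(y, p, t), net_benefit_all(y, t))
```

A model is considered useful at a given threshold when its net benefit exceeds both that of the default "all" strategy and zero (the net benefit of the "none" strategy). Repeating the calculation over a grid of thresholds traces the decision curve.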
This study also has certain limitations, including its retrospective nature, the relatively small size of the cohort, and the reliance on electronic medical record data entered by providers. Given the diversity of geography, economy, medical services, and environment throughout China, our findings may not be representative of the Chinese populations in all jurisdictions; therefore, the models require external validation in other populations prior to widespread clinical application in China. In addition, the indication for cesarean delivery was not known in this study; this information would have helped to evaluate why IoL failed (e.g., insufficient cervical ripening, labor arrest disorders, non-reassuring fetal status).
In conclusion, the nomogram for predicting cesarean delivery following induction developed by Rossi et al. was validated in a Chinese population in this study. Adaptation to a Chinese population by excluding ethnicity and replacing it with the Bishop score led to better performance. The predictions from both models proved most useful for patients predicted to have a < 60% chance of success. However, the benefits of using the models should be demonstrated prior to routine introduction into clinical practice. Further study is warranted to validate and optimize the models by conducting multicenter research studies with larger samples.