Development and Validation of Nomograms to Predict Long-term Oncological Outcomes in Serous Ovarian Cancer

Background: Nomograms are statistics-based predictive tools that integrate predictive factors. We developed and validated a nomogram to predict overall survival (OS) in serous ovarian cancer (SOC). Methods: In total, 6957 patients from the SEER database were included in the training group; the external validation group included 1244 SOC patients from two Chinese hospitals. The nomogram was built on Cox regression analyses and was evaluated in both the training and validation groups using the concordance index, the area under the receiver operating characteristic curve (AUC), calibration plots, and risk subgroup classification. Kaplan–Meier curves were plotted to compare survival outcomes between subgroups. A decision-curve analysis was used to test the clinical value of the nomogram. Results: The independent factors identified by multivariate analysis in the training cohort and selected for the nomogram included age, tumor grade, and FIGO stage. The concordance indexes for OS were 0.689 (95% confidence interval: 0.677–0.701) in the training cohort and 0.639 (95% confidence interval: 0.601–0.670) in the validation cohort; the AUCs were 0.675 and 0.661 in the training and validation cohorts, respectively. Calibration curves showed good consistency between predicted and actual 3- and 5-year OS. Significant differences were observed in the survival curves of different risk subgroups. The decision-curve analysis indicated that our nomogram was superior to the AJCC staging system. Conclusion: We constructed a nomogram to predict long-term OS in SOC and externally verified it in an Asian population. This nomogram showed more accurate survival predictions, which may help guide personalized treatment and follow-up strategies.


Introduction
Epithelial ovarian cancer (EOC) is one of the most common malignant tumors of the female genital organs. The prognosis of early-stage ovarian cancer (stage I and II) is favorable, with a 5-year overall survival (OS) rate of 92%. However, approximately 70% of newly diagnosed cases are in advanced stages (stage III and IV), for which the 5-year OS rate is only 29%. EOC is generally treated with cytoreductive surgery and platinum-based combination chemotherapy. Although treatment has greatly improved, the incidence and mortality rates have been increasing year by year [1]. As a subtype of EOC, serous ovarian cancer (SOC) presents a biological profile distinct from that of other histological types. In 2002, Singer proposed a two-level classification system for SOC that divided patients into two subtypes: low-grade ovarian serous cancer (LGOS) and high-grade ovarian serous cancer (HGOS) [2]. In 2004, the pathologists Kurman and Shih put forward the binary theory of EOC based on an analysis of clinical data, pathomorphology, and molecular genetics from a large number of EOC patients [3]. According to the binary theory of ovarian cancer, these malignancies can be divided into two types: type I and type II. In SOC, the binary theory of low- and high-grade disease is widely accepted because of the obvious differences in tumorigenesis, malignant biological behaviors, and molecular characteristics.
HGOS is type II ovarian cancer and the most common type of ovarian cancer, accounting for approximately 90% of SOCs. There are no precancerous lesions; thus, it is often found with extensive pelvic spread and high malignancy [4].
LGOS is type I ovarian cancer, and accounts for approximately 10% of ovarian serous adenocarcinomas. These tumors grow slowly and are not sensitive to chemotherapy, but the clinical diagnosis is usually early and the prognosis is good.
In the process of clinical diagnosis and treatment, gynecological oncologists are often asked by patients: "If I have surgery and chemotherapy, how long will I live?" For a doctor, the ability to predict the probability of a certain outcome could change medical practice models and clinical decisions. As tools for risk and benefit assessment, clinical prediction models can provide more objective and accurate information for doctors and patients, so they are becoming more widely used. Nomograms are statistics-based predictive tools that integrate pivotal predictive factors and have been widely used to quantify risks and evaluate the prognosis of many cancer types [4][5][6]. Nomograms have been compared with the traditional staging systems for many cancers and have been proposed as a new standard. However, to the best of our knowledge, no nomograms for patients with SOC have been developed. In this study, we aimed to construct nomograms using data extracted from the Surveillance, Epidemiology, and End Results (SEER) [7] database to predict the prognosis of patients with SOC. The prediction model was then verified externally to determine whether it provides more accurate predictions of patient survival than the currently available staging systems.

Data sources and extraction
The training cohort of ovarian cancer patients was obtained from the SEER database, which contains data of cancer patients from 18 regional registries, covering approximately 34.6% of the total United '8441-8442, and 8460-8463' were used to identify women with SOC [9]. Relevant information was extracted by using SEER*Stat software version 8.3.6.
A retrospective study was conducted on an external validation cohort of patients who underwent cytoreductive surgery for ovarian tumors between January 2009 and June 2015 at two tertiary institutional hospitals in Guangxi Province, China. All patients received satisfactory cytoreductive surgery and paclitaxel combined with platinum chemotherapy.

Inclusion and exclusion criteria
The inclusion criteria were as follows: no history of previous anticancer therapy; no history of other malignancies; satisfactory cytoreductive surgery; a pathological diagnosis of SOC (including LGOS and HGOS) at any stage; at least 3 courses of paclitaxel combined with platinum-based intravenous chemotherapy after surgery; and no restrictions on age or race. The exclusion criteria were as follows: a non-primary tumor; no histological confirmation; survival time shorter than 1 month; no surgery; no chemotherapy; and incomplete information.

Study design and ethics
The final analysis included 8201 individuals. The training cohort extracted from the SEER database (n = 6957) was used for model development, and the remaining validation cohort (n = 1244) was used for model evaluation. This study was approved by the ethics committee, and all patients signed informed consent forms before surgery and chemotherapy.

Follow-Up
Detailed clinicopathological information and follow-up records were collected by four gynecologists who had received unified training. To ensure the accuracy of data entry, two specially trained gynecologists each entered the same medical records, and a unified database was established after doubtful entries had been checked. The SEER program included demographic data, stage of cancer at the time of diagnosis, and treatment information at the time of follow-up.

Variates and outcomes
Variables were grouped for use in the nomograms according to the actual clinical situation. We included the following factors assessed at diagnosis: age (< 50, 50–59, 60–69, 70–79, and ≥ 80 years), grade (LGOS or HGOS), and AJCC stage (I, II, III, or IV). OS was used as the primary endpoint and was defined as the time from diagnosis to death or the last follow-up.

Statistical analysis
We used standard model development and validation methods, including training and external validation cohorts [10]. The primary endpoint was OS. Categorical variables are shown as frequencies and proportions. Comparisons of clinicopathological characteristics between the training and validation cohorts were performed using the chi-squared test or Fisher's exact test. Continuous variables were compared using the t-test, or the Mann-Whitney U test for variables with a non-normal distribution. We performed univariate and multivariate analyses via Cox proportional hazards regression models in the training cohort to analyze the prognostic variables associated with OS. Variables with a p-value < 0.05 in the univariate analysis were included in the multivariate analysis.
A nomogram was formulated based on the independent prognostic factors identified in the multivariate analysis by using the RMS26 package in R version 3.6.2 (www.r-project.org; R Foundation, Vienna, Austria). The performance of the nomogram was validated internally in the training cohort and externally in the validation cohort. Receiver operating characteristic (ROC) curves were used to evaluate the discriminative ability of the nomograms. The larger the area under the ROC curve (AUC), the more accurate the prognostic prediction [11]. Calibration curves (1000 bootstrap resamples) were generated to test the consistency between the predicted and actual 3- and 5-year OS. Moreover, the whole cohort was regrouped into low- and high-risk groups according to the median risk score generated from the nomogram. Kaplan-Meier analysis and the log-rank test were used to explore survival differences between the risk subgroups.
All statistical analyses were performed using SPSS version 25.0 (IBM Corp., Armonk, NY, USA) and R software version 3.6.2. P values < 0.05 were considered statistically significant.
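As an illustration of the Kaplan-Meier step described above, the following minimal Python sketch computes the product-limit survival estimate from follow-up times and event indicators. It is not the code used in this study (the analyses were run in SPSS and R); the function name `kaplan_meier` and the toy data in the usage note are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= leaving
    return curve
```

For example, `kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])` yields estimates at times 1, 3, and 4 only, because censored observations reduce the number at risk without producing a step in the curve.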

Patient characteristics
Detailed characteristics of patients in the training (n = 6957) and validation (n = 1244) cohorts are shown in Table 1. We identified 8201 women with SOC. The majority of cases (> 85%) were diagnosed in women aged 50 years and older. Most patients (84.4%) had HGOS, of whom the majority (77.8%) had advanced-stage disease.

Independent prognostic factors in the training cohort
Univariate analysis of the training cohort showed that age, grade, and AJCC stage were significant risk factors for OS (p < 0.05) (Table 2). The independent prognostic factors in the final model were identified by multivariate analysis. In the multivariate analysis of these three factors, age, grade, and AJCC stage remained independent factors significantly associated with OS (Table 2).

Constructing nomograms for OS
Prognostic nomograms for predicting 3- and 5-year OS were constructed on the basis of the prognostic variables from the training cohort (Figure 1). The complex Cox regression formulas were transformed into visual graphics. The line segment corresponding to each variable was marked with a scale that represented the value range of the variable, and the length of the line segment reflected the contribution of the factor to the outcome event. The nomograms demonstrated that AJCC stage contributed most to OS for SOC patients. Each variable was assigned a score ranging from 0 to 100. To use the nomograms, one draws a vertical line upward from the specific value of each predictor to the 'points' line and then adds up the corresponding points [12]. Then, by drawing a straight line from the total points axis to the 3- and 5-year survival axes, surgeons can obtain the probability of 3- and 5-year survival.
For example, if a patient was 55 years old (18 points), was diagnosed with HGOS (20 points), had undergone satisfactory cytoreductive surgery for AJCC stage III disease (65 points), and received chemotherapy after surgery, the total score of the patient would be 103. Thus, the probability of surviving more than 3 years was 73%, and the probability of surviving more than 5 years was 53%. This visual scoring system can help doctors answer the question: "How long can I live?" The calculated value aims to help clinicians make rapid predictions when counseling patients.
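The additive scoring in this worked example can be sketched in a few lines of Python. The lookup table below contains only the three point values quoted in the example; the remaining entries, and the mapping from total points to survival probability, would have to be read from Figure 1 and are not reproduced here. The names `EXAMPLE_POINTS` and `total_points` are illustrative.

```python
# Points quoted in the worked example (other levels must be read from Figure 1).
EXAMPLE_POINTS = {
    ("age", "50-59"): 18,
    ("grade", "HGOS"): 20,
    ("stage", "III"): 65,
}


def total_points(patient, points=EXAMPLE_POINTS):
    """Sum the nomogram points for each (variable, level) pair of a patient."""
    return sum(points[(variable, level)] for variable, level in patient.items())


patient = {"age": "50-59", "grade": "HGOS", "stage": "III"}
print(total_points(patient))  # 103
```

A level that is missing from the table raises a `KeyError`, which is preferable to silently scoring it as zero.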

Validation of the nomograms
The nomograms were then externally validated. We used the C-statistic, ROC curves, and calibration curves to validate the model. These measures are used to evaluate the accuracy of prediction models [13].

Values of the C-statistic
The C-statistic ranges from 0.5, which indicates absence of discrimination, to 1.0, which indicates perfect discrimination. Generally speaking, if the C-statistic value is > 0.6, the model has good predictive value.
In the training cohort, the C-statistic value for predicting OS was 0.689 (95% CI: 0.677–0.701). In the validation cohort, the C-statistic value for OS was 0.639 (95% CI: 0.601–0.670). Thus, the C-statistic values of the training and validation sets were similar.
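To make the meaning of the C-statistic concrete, the sketch below computes Harrell's concordance index for right-censored data in plain Python. It is a simplified illustration rather than the routine used in this study: pairs with tied follow-up times are ignored, and the function name and toy data are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the model orders correctly.

    A pair is comparable when the patient with the shorter follow-up
    time had an observed death (event == 1). A higher risk score is
    expected to go with shorter survival. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


c = concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1])
print(c)  # 0.8 (4 of 5 comparable pairs are ordered correctly)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which matches the interpretation given above.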

ROC curve analyses
In the training cohort, the AUC for OS was 0.675; in the validation cohort, the AUC for OS was 0.661 (Fig. 2). The AUC ranges from 0.5, which indicates absence of discrimination, to 1.0, which indicates perfect discrimination. Generally speaking, an AUC > 0.6 indicates that a model has moderate predictive value. The AUCs of the training and validation sets were similar (Fig. 2).
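The area under the ROC curve can be read as the probability that a randomly chosen patient who died receives a higher risk score than a randomly chosen patient who survived. The sketch below computes it by direct pairwise comparison for a binary outcome at a fixed time horizon. This is a simplifying assumption: it ignores censoring, whereas survival analyses usually rely on time-dependent ROC methods, and the function name `auc` is illustrative.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels -- 1 if the event occurred by the chosen horizon, else 0
    scores -- predicted risk; higher means the event is more likely
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```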

Calibration curves
The calibration curves indicated excellent agreement between the nomogram predictions and the actual survival outcomes in the training and validation cohorts. The x-axis represents the nomogram-predicted probability of 3- and 5-year OS, while the y-axis represents the actual 3- and 5-year survival rates. If the blue line coincided completely with the black dashed line, the model would be ideal. The agreement between the predicted and observed probabilities is indicated by the calibration curves for 3- and 5-year OS (Figure 3).

Risk stratification of SOC patients and decision-curve analysis (DCA)
A risk score for each variable was generated from the nomogram, and total scores were calculated for all patients. The whole cohort was divided into low- and high-risk subgroups based on the median risk score. According to the survival curves shown in Figure 4A, significant differences in OS were observed between the low- and high-risk groups (P < 0.001), which implied that the nomogram had good risk stratification ability.
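The median split described above can be sketched as follows. The text does not state how patients whose score equals the median were assigned, so placing them in the low-risk group is an assumption, as are the function name and the toy scores.

```python
from statistics import median


def split_by_median(total_scores):
    """Label each patient 'high' or 'low' relative to the median total score.

    Assumption: scores equal to the median are labeled 'low'.
    """
    cutoff = median(total_scores)
    return ["high" if score > cutoff else "low" for score in total_scores]


print(split_by_median([40, 103, 85, 120]))  # ['low', 'high', 'low', 'high']
```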
DCA indicated that the nomogram models made better predictions and outperformed the AJCC staging system (Figure 4B).
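Decision-curve analysis compares models by their net benefit across a range of threshold probabilities. The sketch below shows the standard net benefit formula for a binary outcome, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. It is an illustration under stated assumptions (binary outcome at a fixed horizon, no censoring), not the procedure used in this study, and the function name is illustrative.

```python
def net_benefit(labels, predicted_risk, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    labels         -- 1 if the event occurred, else 0
    predicted_risk -- model-predicted probability of the event
    threshold      -- threshold probability, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    pairs = list(zip(labels, predicted_risk))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


print(net_benefit([1, 1, 0, 0], [0.9, 0.6, 0.7, 0.2], 0.5))  # 0.25
```

Evaluating this function over a grid of thresholds for two models, and for the "treat all" and "treat none" strategies, gives the curves that a decision-curve plot displays.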

Discussion
In the field of medical research, tumor risk prediction models are used to predict the future incidence and prognosis of certain tumors. Specifically, nomograms represent a means to establish a statistical model of the quantitative relationship between multiple risk factors and tumor occurrence and/or prognosis. The purposes of such models include informing patients of the risk of onset or prognosis, screening high-risk groups, and helping doctors make clinical decisions. In 2003, van Zee et al. [14] first proposed a nomogram model to predict the risk of non-sentinel lymph node metastasis in sentinel node-positive breast cancer. This nomogram was based on a regression model to intuitively present the probability of outcomes. The advantage is that it can provide better individualized prognostic risk assessments in the form of intuitive graphics, which have definite value in clinical practice and can provide a reference for individualized clinical decision-making. Thus, such predictive nomograms are widely used in clinical oncology [15][16][17]. All of the nomograms presented in these studies showed better discriminatory capacity than did classical staging systems. However, a lack of external verification was a common limitation of these studies. External validation in cohorts from other countries or in prospective randomized clinical trials is necessary to confirm a model's performance.
A subtype of EOC, SOC accounts for approximately 85% of EOC diagnoses. Therefore, it is important to separately analyze the performance of nomograms with regard to their prognostic predictions for this EOC subtype. In this study, we used the external verification method. We analyzed a training cohort of 6957 SOC patients from the SEER database and a validation cohort of 1244 SOC patients from two tertiary institutional hospitals to develop and validate easy-to-use nomograms for predicting OS at 3 and 5 years. Our study identified age, grade, and AJCC stage as independent predictors of OS. We observed that the important predictors of improved OS were younger age, early clinical stage, and well-differentiated grade, which is consistent with previous studies [18]. In this study, the majority of cases (> 85%) were diagnosed in women aged 50 years and older; the older the patient was, the worse the prognosis was. Generally, older patients were more likely to present worse survival outcomes due to lower immune responses [19].
Many scholars believe that clinical staging is an important factor that affects the prognosis of ovarian cancer. In our study, > 70% of patients were diagnosed with advanced (III-IV) ovarian cancer. The later the clinical stage, the lower the 3- and 5-year OS rates. In patients with early FIGO stage disease, tumors could be removed more thoroughly by surgery, as the residual lesions were relatively small, chemotherapy-sensitive, and had a low risk of recurrence and metastasis; thus, the prognosis of these patients was good. In patients with late FIGO stage disease, tumor cells have spread more widely in the body, making it difficult to achieve complete surgical treatment. Patients with a poor tolerance to chemotherapy have poor prognoses. Especially after stage IIIA, the later the stage, the higher the risk of death and the worse the prognosis [20].
HGOS is the most common subtype of EOC, and a majority of HGOS patients subsequently develop platinum resistance with relapse, which explains their overall poor prognosis [21]. In this study, 84.4% of patients had HGOS. As can be seen from the nomogram, the prognosis of these patients was worse than that of patients with LGOS. It is generally believed that tumors with low degrees of histological differentiation have high degrees of malignancy, with rapid disease progression and poor survival and prognosis due to adverse biological behaviors (rapid cell proliferation, diffusion, and strong invasion). The higher the degree of tissue differentiation, the slower the proliferation of tumor cells in the body. Weaker invasion of body tissues was associated with lower degrees of malignancy, slower disease progression, relative sensitivity to chemotherapy, longer survival times, and better prognoses.
We validated the accuracy of our nomograms using the C-index, ROC analysis, and calibration curves in both the training and validation cohorts. The C-index and AUC both exceeded 0.6 for OS in the external verification process. Calibration curves also demonstrated good performance of the nomograms. These results show that our final nomogram exhibited good discriminatory performance and calibration. Previous studies have reported using the SEER database to establish nomograms to predict the prognosis of EOC. One such study suggested that the log odds of the number of positive lymph nodes to the number of negative lymph nodes is an independent prognostic factor for survival in EOC patients regardless of tumor stage; thus, that nomogram may be superior to the currently used FIGO staging system for predicting OS and cancer-specific survival (CSS) among post-operative EOC patients [25]. The nomogram developed in our study also showed better predictive accuracy for survival compared with the AJCC 7th staging system. This nomogram model enabled risk stratification of patients, thus facilitating personalized treatment plans and follow-up schedules.
Our study had several advantages. First, we developed and validated nomograms using clinically important long-term oncological OS outcomes, which reduced bias by scoring the model performance using high-quality data based on the large sample sizes from SEER. Second, although some studies have established nomograms to predict the prognosis of SOC, the nomogram models of these studies received only internal validation. In contrast, this study adopted the method of external verification, using a cohort from another country, which was necessary to confirm performance. Third, compared with the AJCC stages, DCA curves in this study showed that our nomograms provided greater clinical utility. All variables included in the SOC nomogram could be obtained easily, which could facilitate the application of nomograms in clinical practice. Furthermore, we aimed to establish a prediction model for general SOC patients with common characteristics, such that results were not affected by ethnic and regional differences. Although the SEER database contains information from the US population, our external validation population was from China. The C-statistic and AUC values of the training and validation sets were similar, indicating that the discriminatory power of the nomogram was consistent across cohorts. This also suggests that the nomogram is applicable to regions other than the United States.
However, there were also several limitations to this study. First, the included variables were relatively simple and there were more detailed data that could have been included, such as family history of ovarian cancer, primary tumor diameter, positive lymph nodes, ascites cytological results, location of metastasis, chemotherapy regimens and cycles, sensitivity to chemotherapy, and genetic results. Second, this study was constructed using retrospective data, among which there may have been some undetected potential factors that introduced bias.

Conclusion
In conclusion, we used the SEER database to construct a nomogram of SOC prognosis by integrating a variety of prognostic factors into a simple and intuitive tool, and then externally verified the model in an Asian population to assess how widely it can be applied. Compared with the AJCC staging system, our nomogram improved the accuracy of survival predictions, which could help clinicians provide personalized treatment suggestions and follow-up strategies. In our next study, we will include more detailed variables in the construction of predictive models of SOC prognosis, which should make them more complete and accurate. We will also collect more information to construct a predictive model of chemotherapy resistance.

Declarations
Ethics approval and consent to participate: All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
This article does not contain any studies with animals performed by any of the authors.
Figure 2. ROC curves. Discriminative ability of the model as measured by the AUC in the training cohort (A) and in the validation cohort (B).