Establishment and validation of prognostic nomograms integrating histopathological features in patients with invasive endocervical adenocarcinoma

Background: To develop and verify pathological models using pathological features based on hematoxylin and eosin (H&E) images to predict postoperative survival in patients with invasive endocervical adenocarcinoma (ECA). Methods: A total of 289 patients with ECA were classified into training and validation cohorts. A histological signature was produced in 191 patients and verified in the validation group. Histological models combining the histological features were built. They showed increased value compared to the conventional model in terms of individualized prognosis estimation. Results: Our model included five selected histological characteristics and was significantly related to overall survival (OS). In the training cohort, it had AUC values of 0.862 and 0.955, respectively, for predicting 3- and 5-year survival; in the validation cohort, the equivalent values were 0.891 and 0.801. In the training cohort, it showed better OS evaluation (C-index: 0.832; 95% confidence interval [CI] = 0.751–0.913) than both the FIGO staging system (C-index: 0.648; 95% CI = 0.542–0.753) and treatment (C-index: 0.687; 95% CI = 0.605–0.769), with greater efficiency in classifying survival outcomes. In both cohorts, a risk stratification system was built that could precisely stratify patients with stage I and II ECA into high-risk and low-risk subpopulations with significantly different prognoses. Our nomogram with five histological features provided better OS prediction in patients with ECA.


Introduction
Endocervical adenocarcinoma (ECA) comprises approximately 25% of cervical cancers. It is much more heterogeneous than other types of cervical tumors, with around 15% of cases being unrelated to human papillomavirus (HPV) infection. ECA has a higher prevalence than squamous cell carcinoma because it is difficult to detect glandular lesions using cytological screening [1]. According to the International Endocervical Adenocarcinoma Criteria and Classification (IECC), ECA is categorized into HPV-associated (HPVA) and non-HPV-associated (NHPVA) types. Although most cases of ECA are associated with HPV infection, NHPVA is challenging to diagnose and tends to be aggressive [2][3][4][5][6]. Furthermore, International Federation of Gynecology and Obstetrics (FIGO) stage is the most critical parameter for determining treatment and prognosis, but significant heterogeneities in clinical prognosis occur among patients with ECA who show similar FIGO stage. Therefore, to improve prognosis in patients with ECAs, researchers must ascertain the substantial prognostic determinants.
ECAs are classified based on descriptive morphological characteristics, particularly cytoplasmic features [7]. However, ECAs cannot easily be categorized using the 2014 World Health Organization classification because they are defined based on empirical observation, rather than on clinical or biological features [7]. In our own studies, we have used the IECC classification for histological typing.
Previous studies have identified several pathological variables with prognostic value in ECA, namely tumor size, depth of invasion (DOI), lymphovascular invasion (LVI), and lymph node metastasis (LNM) [8,9]. Therefore, combined analysis of histological features is the most promising approach to improving clinical management. Previous studies have shown that these histological features are correlated with outcomes in patients with ECA. However, to our knowledge, no strategy has been developed that uses histological signatures to predict outcome.
In the present study, we generated and verified a histological feature-based model to predict outcomes. This histological model may accurately stratify patients with ECA into high- and low-risk groups.

Patients and samples
In the present retrospective study, we enrolled 289 patients with histologically confirmed ECA who had been treated at the Sun Yat-sen University Cancer Center between January 2010 and December 2014.
Patients were eligible if (1) they had been diagnosed with primary ECA, and (2) they had complete clinical data available. The exclusion criteria were as follows: (1) systemic metastasis at diagnosis, (2) co-existing malignancies, and (3) a history of anti-cancer therapy. Patients were classified into either the training group (n = 200) or the validation group (n = 89). The last follow-up was conducted in June 2020. The Hospital Ethics Committee at the Sun Yat-sen University Cancer Center, China approved this study.
The critical raw data associated with this article have been uploaded onto the Research Data Deposit public platform (www.researchdata.org.cn), with the following RDD approval number: RDD2020001505.

Histologic features
All slides were evaluated by histologists who had no knowledge of the corresponding patient information.
In accordance with the new pathogenetic classification of the IECC, ECA was classified into HPVA and NHPVA histological types [10]. In all samples, the invasive patterns of the tumors were categorized as A, B, or C [11][12][13][14]. The following features were evaluated: nuclear grade, tumor cell necrotic debris, mitosis/10 high-power fields (HPF), tumor giant cells/10 HPF, LVI, tumoral tumor-infiltrating lymphocytes (TILs), stromal TILs, differentiation, DOI, stromal invasion, nerve invasion, endometrial invasion, LNM, extranodal involvement, and parametrium invasion. Tumor cell necrotic debris was classified as focal, moderate, or extensive. Nuclear grade was classified as previously reported [15]. Mitoses were counted in 10 HPF, as were tumor giant cells, which were categorized as either multi-nucleated tumor giant cells or single giant nuclear cells, with a nucleus 3-4 times larger than the surrounding tumor nuclei. These cells are usually identified using low-power (4×) or intermediate-power (10×) microscopy. Following a standardized method, TILs were evaluated on hematoxylin and eosin (H&E)-stained slides to provide a percentage score for stromal and intratumoral compartments, as described previously [16]. The H&E images of these features are shown in Supplementary Figs. 1 and 2.

Construction and validation of the nomogram
In the construction of the nomogram, we combined the following clinical variables and histological factors as candidate prognostic characteristics: age, menopause, oral contraceptive use, chief complaint, histological type, FIGO stage, tumor size, differentiation, growth pattern, nuclear grade, tumor cell necrotic debris, mitosis/10 HPF, tumor giant cells/10 HPF, tumoral TILs, stromal TILs, LVI, DOI, stromal invasion, nerve invasion, endometrial invasion, LNM, extranodal involvement, parametrium invasion, and treatment. We used least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation to choose the most useful predictive markers and create nomograms of overall survival (OS) from the training cohort.
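For orientation, the general form of the LASSO estimate can be written as below. This is the generic textbook formulation, not a statement of the exact model fitted here: the symbol \ell(\beta) stands for the log-likelihood of whichever regression model is used (for survival outcomes this is commonly the Cox partial log-likelihood, which is an assumption on our part), p is the number of candidate predictors, and \lambda is the penalty weight selected by 10-fold cross-validation.

```latex
\hat{\beta}(\lambda) \;=\; \arg\min_{\beta}
\left\{ -\,\ell(\beta) \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad \lambda \ge 0
```

Because the penalty is the sum of absolute values, coefficients of weakly informative predictors are shrunk to exactly zero as \lambda grows, which is what allows the procedure to act as a variable selector.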

Statistical Analysis
We used IBM SPSS Statistics software version 19.0 (IBM Corp., Chicago, IL, USA) and R version 3.4.0 (http://www.R-project.org/) for statistical analyses. Survival curves were generated using the Kaplan-Meier approach.
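As an illustration of the Kaplan-Meier approach, the following minimal Python sketch computes the product-limit survival estimate from follow-up times and event indicators. It is not the SPSS or R code used in the study; the function name `kaplan_meier` and the example data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up time per patient (any consistent unit)
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients at time t are both removed afterwards.
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up data in months (not taken from the study).
example_times = [6, 12, 12, 30, 45, 60, 60, 72]
example_events = [1, 1, 0, 1, 0, 0, 1, 0]
km_curve = kaplan_meier(example_times, example_events)
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is the defining property of the estimator.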

Development and validation of the histopathological nomogram
A multiple-feature-based histological signature was built to predict survival in the training group. After applying the LASSO logistic algorithm, five of the 22 clinical and histopathological features were finally used to develop our model (Fig. 1A and 1B). The following histopathological features had a non-zero coefficient: histological type, DOI, stromal invasion, LVI, and LNM. Histological images of the above five features are presented in Supplementary Fig. 1. Our model was created using the following formula: risk score = (-0.066 × histological type) + (0.002 × DOI) + (0.218 × stromal invasion) + (0.123 × LVI) + (0.534 × LNM). The contribution of each selected variable to signature construction is shown in Fig. 1C.
These factors were used to develop nomogram models in the training and validation groups. The histological nomograms, FIGO stage, treatment, and OS are presented in Fig. 2A and 2B. The calibration curves showed good agreement between prediction and observation for 1-, 3-, and 5-year OS in the training and validation cohorts (Fig. 2C-2H).
In the training cohort, the AUCs of our 3- and 5-year models were 0.882 and 0.891, respectively (Fig. 3A and 3C). The nomogram subsequently created was confirmed in the validation group. The AUCs of our models were 0.955 and 0.801 for 3- and 5-year OS, respectively, in the validation group (Fig. 3B and 3D).
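The risk-score formula is a plain weighted sum, which the following Python sketch reproduces using the coefficients reported above. How each variable is numerically encoded is not stated in the text, so the dictionary keys and the example patient values (0/1 indicators and DOI in millimetres) are assumptions made purely for illustration.

```python
# Coefficients as reported in the text for the five selected features.
COEFFICIENTS = {
    "histological_type": -0.066,
    "doi": 0.002,
    "stromal_invasion": 0.218,
    "lvi": 0.123,
    "lnm": 0.534,
}


def risk_score(features):
    """Weighted sum of the five features; raises if any feature is missing."""
    missing = set(COEFFICIENTS) - set(features)
    if missing:
        raise ValueError(f"missing features: {sorted(missing)}")
    return sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)


# Hypothetical patient with an ASSUMED numeric encoding (not from the study).
example_patient = {
    "histological_type": 1,
    "doi": 10,
    "stromal_invasion": 1,
    "lvi": 1,
    "lnm": 0,
}
example_score = risk_score(example_patient)
```

The resulting number is only meaningful relative to the study's own variable coding, so a score computed under a different encoding should not be compared against the published cutoff.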
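To make the meaning of a 3- or 5-year AUC concrete, the sketch below computes a simplified fixed-horizon AUC in Python. It treats patients who died by the horizon as cases and patients followed beyond the horizon as controls, and it simply drops patients censored before the horizon. That last step is a simplification; time-dependent ROC methods usually handle censoring with weighting, and the text does not say which estimator was used. The function name and data are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Probability that a case has a higher risk score than a control.

    Cases    -- death observed at or before `horizon`
    Controls -- still under follow-up after `horizon`
    Patients censored at or before `horizon` are excluded (simplification).
    """
    cases = [s for s, t, e in zip(scores, times, events)
             if e == 1 and t <= horizon]
    controls = [s for s, t in zip(scores, times) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(controls))


# Hypothetical scores and follow-up in months (not taken from the study).
example_scores = [0.9, 0.3, 0.5, 0.4, 0.2, 0.1]
example_times = [10, 20, 30, 40, 50, 70]
example_events = [1, 1, 0, 1, 0, 0]
auc_3yr = auc_at_horizon(example_scores, example_times, example_events, 36)
```

An AUC of 0.5 corresponds to scores that rank cases and controls no better than chance, whereas 1.0 means every case outranks every control.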

Risk stratification of OS
At a cutoff risk score of -1.62 (Supplementary Fig. 3), patients were categorized into high-risk (> -1.62) and low-risk (≤ -1.62) groups. The OS rates in the two groups are listed in Table 3. In the training cohort, the 1-, 3-, and 5-year OS rates were 93.2%, 79.5%, and 65.9%, respectively, in the high-risk group, and 99.3%, 96.6%, and 95.9%, respectively, in the low-risk group (Table 3). In the validation cohort, the 1-, 3-, and 5-year OS rates were 94.4%, 66.7%, and 66.7%, respectively, in the high-risk group, and 97.5%, 97.5%, and 95.0%, respectively, in the low-risk group (Table 3). Likewise, significant differences in OS were observed in patients with stage I ECA, patients with stage II ECA, and the whole ECA population in the training and validation cohorts. Patients with lower risk scores generally had better OS (Fig. 4). Each risk subgroup represented a distinct prognosis, and this system accurately separated OS in the two subgroups. Furthermore, our clustering approach revealed C1 and C2 patient groups in both cohorts (Fig. 5A and 5B). In the training and validation cohorts, the correlation coefficient between our model and FIGO stage was similar to that between our model and treatment (Fig. 5C and 5D). There were significantly positive correlations between our model, FIGO stage, and treatment in both cohorts (Table 4).
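The stratification rule is a single threshold comparison, sketched below in Python. The cutoff of -1.62 and the direction of the inequality follow the text; the function names and the example scores are hypothetical.

```python
CUTOFF = -1.62  # cutoff risk score reported in the text


def risk_group(score, cutoff=CUTOFF):
    """Return 'high' for scores above the cutoff, otherwise 'low'."""
    return "high" if score > cutoff else "low"


def split_by_risk(scores, cutoff=CUTOFF):
    """Group patient indices by risk category."""
    groups = {"high": [], "low": []}
    for index, score in enumerate(scores):
        groups[risk_group(score, cutoff)].append(index)
    return groups


# Hypothetical risk scores (not taken from the study).
example_groups = split_by_risk([-2.10, -1.62, -1.61, -0.40])
```

A score exactly equal to the cutoff falls in the low-risk group, matching the "≤ -1.62" definition above.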

Discussion
In the present study, nomograms merging pathological parameters were built to predict the 1-, 3-, and 5-year OS rates of patients with ECA. Both discrimination and calibration were confirmed, and the nomograms may have a wide range of applications. According to the ROC curve and detrended correspondence analysis, the prognostic nomogram exhibited greater accuracy in patients with ECA than the current FIGO staging system. Furthermore, it could classify patients with ECA into low- and high-risk subpopulations, implying that it could routinely be applied to the prognostic assessment of ECA.
Previous research has demonstrated that histological type, DOI, stromal invasion, LVI, and LNM are highly prognostic factors, and that they currently influence patient management [17][18][19]. Comparing HPVA and NHPVA reveals essential differences in tumor behavior and patient survival, with significantly worse clinical outcomes in patients with NHPVA [20]. Tumors with a DOI of less than 3 mm (FIGO stage IA1) have lower rates of lymph-node metastases, parametrial spread, and recurrence than larger tumors (stage IA2, IB1, and IB2). However, measurement of DOI has limitations [21]. Thus, stromal invasion status can complement, but should not replace, the DOI metric. Depending on variables such as LVI, prognosis regarding fertility may be approved [22,23].
Our nomogram scoring systems appeared to perform well in predicting prognosis in ECA. Previous studies attempted to predict outcome in cervical cancer. For example, one study analyzed a four-factor model (histology, tumor size, deep stromal invasion, and LVI). The authors found that the presence of any two factors may predict recurrence in patients with cervical cancer [24]. Previous studies showed that DOI, LVI, LNM, and invasion patterns were strong independent predictors of disease-specific survival in ECA [19][25][26][27]. By incorporating the log odds between the number of positive lymph nodes and the number of negative lymph nodes, the nomogram by Wang may be superior to the FIGO staging system in predicting OS in cervical cancer [28]. The present study extended the analysis of individual H&E morphological characteristics into a nomogram model for estimating survival, demonstrating the incremental value of the histological signature for individualized OS estimation. Our model consisted of five histological characteristics and provided a non-invasive, quick, low-cost, and reproducible method for collecting phenotypic information. As such, it may inform attempts to improve personalized medicine.
However, the present study had several limitations. Firstly, there may have been a selection bias because patients with ECA in situ were not included in the development of the nomograms. Secondly, our study only assessed OS prognosis in patients with ECA. Thirdly, the sample size was rather small. Therefore, further studies are needed to verify the nomogram.

Conclusions
In summary, we generated new nomograms to predict the OS rate in patients with ECA. Our simple and explicit nomograms showed good discrimination and calibration ability and may have good clinical application value. They may be a useful tool for assessing prognosis and managing treatment in patients with ECA.

Declarations
Ethics approval and consent to participate: This study was approved by the Clinical Research Ethics Committee of the Sun Yat-sen University Cancer Center, and all patients provided written informed consent at the first visit to our center.