Development and external validation of a nomogram for predicting the effect of tumor size on cancer-specific survival of resected gallbladder cancer: a population-based study

The impact of tumor size on long-term survival in gallbladder cancer (GBC) patients has been controversial. It is urgently necessary to identify the optimal cut-off value of tumor size in resected GBC, and we attempted to integrate tumor size with other prognostic factors into a prognostic nomogram to predict the cancer-specific survival (CSS) of GBC patients. A total of 1639 patients with resected GBC were extracted from the Surveillance, Epidemiology and End Results (SEER) database. The X-tile program was used to identify the optimal cut-off values of tumor size. A nomogram including tumor size was established to predict 1-, 3- and 5-year CSS based on the independent risk factors chosen by univariate and multivariable Cox analyses. The precision of the nomogram for predicting survival was validated internally and externally with Harrell’s concordance index (C-index), calibration curves, and receiver operating characteristic (ROC) curves. Patients with GBC were classified into 1–13 mm, 14–63 mm and ≥ 64 mm subgroups based on the optimal cut-offs for tumor size in terms of CSS. The nomogram built on the independent factors was well calibrated and displayed better discrimination power than the 7th edition tumor–node–metastasis (TNM) staging system. The results demonstrated that increased tumor size is closely associated with worse CSS. Our novel nomogram, which outperforms the conventional TNM staging system, showed satisfactory accuracy and clinical practicality for predicting the outcome of resected GBC patients.


Introduction
Gallbladder carcinoma (GBC) is a rare, lethal malignancy characterized by vague symptoms at the initial stage. Fewer than one-third of patients are eligible for curative-intent surgical resection at presentation, and only 16% of patients with GBC survive for more than 5 years [1][2][3][4]. Surgical resection remains the primary treatment for GBC and provides an opportunity for a cure or a prolonged life span [4]. Because the gallbladder lacks a serosal layer where it adjoins the liver, direct hepatic parenchymal invasion and metastasis are the major causes of its dismal prognosis [5].
Unlike lymph node metastasis and liver invasion, which are recognized as independent prognostic factors in GBC, the impact of tumor size on long-term survival in GBC patients, especially those who underwent resection, remains controversial worldwide. The widely accepted tumor-node-metastasis (TNM) staging system of the American Joint Commission on Cancer (AJCC) relies mainly on pathological evaluation of tumor infiltration into the layers of the gallbladder, which can only be achieved through surgical excision of the gallbladder, and does not take tumor size into account [6]. Recent studies proposing prognostic nomograms or systems pointed out that patient- and tumor-related factors are closely associated with the prognosis of GBC patients [7,8]. Notably, several studies have found that tumor size, as an easily quantifiable independent indicator, can affect the survival outcome of GBC patients [7][8][9]. However, the optimal cut-off value of tumor size for predicting the prognosis of GBC is still controversial. The Yadav staging system treated a tumor size larger than 5 cm as a factor indicating a survival disadvantage, whereas Zhang labeled 3 cm as an important prognostic threshold [7,8]. In addition, no further studies have examined the association between tumor size and other tumor-related factors, such as tumor grade and liver metastasis, in patients with GBC who undergo surgical resection. Therefore, we sought to explore the prognostic value of tumor size in resected GBC based on data from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute, and assessed the performance of the novel prognostic model using data from our center.
Well-established nomograms are increasingly used as decision-making aids for individual risk evaluation and for predicting therapeutic outcomes [10]. Using the SEER database, we aimed to establish a novel prognostic nomogram that incorporates tumor size and other important factors to predict the cancer-specific survival (CSS) of GBC patients after surgical resection, and to compare it with the current TNM staging system. Validation was performed to evaluate the accuracy of the predictive model using an internal validation set drawn at random from the SEER data and an external validation set from the First Affiliated Hospital of Nanjing Medical University, as representative of the Asian population.

Population and covariates
Information on patients initially diagnosed with GBC between 1975 and 2016 was extracted from the SEER-18 database using SEER*Stat software version 8.3.6. Eligible GBC patients were identified according to the International Classification of Diseases for Oncology, Third Edition (ICD-O-3) codes reported in the AJCC Cancer Staging Manual 2017 (Fig. 1). We limited our analysis to patients with primary GBC confirmed by pathological examination (positive exfoliative cytology or histology) and with complete follow-up data available.
Tumor size was defined as the largest dimension or diameter of the primary GBC from the postoperative pathology report. To ensure an adequate number of samples for analysis and to explore the prognostic value of tumor size for patients after surgical resection, we also excluded patients who did not undergo surgical resection and patients with incomplete information on tumor size, tumor grade, race, regional nodes examined and positive, or TNM stage. The detailed workflow for patient selection is shown in Fig. 1. To estimate the generalizability of the novel nomogram, we performed a separate external validation using 46 eligible GBC patients diagnosed in the First Affiliated Hospital of Nanjing Medical University. The clinicopathological features of all patients are shown in Table 1. The investigation was approved by the ethics committee of the First Affiliated Hospital of Nanjing Medical University.

Definition of variables
Demographic and clinical variables of the 1639 patients fulfilling the inclusion criteria were obtained from the SEER database. Age is not recorded as a continuous variable in SEER. To explore the impact of age on the CSS of resected GBC, we tested each of 50, 55, 60, 65 and 70 years of age at diagnosis as a candidate cut-off and analyzed its prognostic value; patients aged < 60 years showed better survival outcome (Supplementary 1). Accordingly, 60 years was set as the age cut-off in this study. Tumor grade was regrouped as well/moderately differentiated and poorly differentiated/undifferentiated. Regional nodes examined were grouped as ≥ 6, 1-5 and 0 according to the recommendations of the 8th AJCC staging system, and regional nodes positive were stratified as 0, 1-3, ≥ 4 and not examined, as recommended by the 8th TNM staging system. CSS was the primary outcome for patients with GBC, and CSS time was counted from the date of diagnosis to death due to GBC.
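As an illustration of the variable definitions above, the grouping rules can be written as a small function. This is a minimal sketch and not the code used in the study: the function name, the numeric grade coding (1 to 4) and the group labels are assumptions made for the example, and the age rule assumes that a patient aged exactly 60 years falls in the older group.

```python
def code_covariates(age, grade, nodes_examined, nodes_positive):
    """Map raw values to the categories described in the text.

    Assumed inputs (hypothetical coding, for illustration only):
      age            -- age at diagnosis in years
      grade          -- 1 well, 2 moderate, 3 poor, 4 undifferentiated
      nodes_examined -- number of regional nodes examined
      nodes_positive -- number of positive regional nodes
    """
    # Age is dichotomized at 60 years.
    age_group = "<60" if age < 60 else ">=60"

    # Tumor grade is collapsed into two groups.
    if grade in (1, 2):
        grade_group = "well/moderately differentiated"
    elif grade in (3, 4):
        grade_group = "poorly differentiated/undifferentiated"
    else:
        raise ValueError("grade must be 1, 2, 3 or 4")

    # Regional nodes examined: 0, 1-5, >= 6.
    if nodes_examined == 0:
        examined_group = "0"
    elif nodes_examined <= 5:
        examined_group = "1-5"
    else:
        examined_group = ">=6"

    # Regional nodes positive: not examined, 0, 1-3, >= 4.
    if nodes_examined == 0:
        positive_group = "not examined"
    elif nodes_positive == 0:
        positive_group = "0"
    elif nodes_positive <= 3:
        positive_group = "1-3"
    else:
        positive_group = ">=4"

    return {
        "age": age_group,
        "grade": grade_group,
        "nodes_examined": examined_group,
        "nodes_positive": positive_group,
    }
```

Coding "not examined" as its own level keeps patients without lymphadenectomy in the model instead of treating their node status as missing.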

Nomogram development and statistical analyses
The included patients were randomized into a training cohort (n = 1103) and a validation cohort (n = 536) at a ratio of 7:3 [11,12]. Eight clinicopathological features, including age, race, sex, tumor grade, T stage, M stage, regional nodes examined and regional nodes positive, were included in the analysis. Cox proportional hazard models were used to screen the independent risk factors to construct nomograms for predicting 1-, 3- and 5-year CSS. The constructed nomogram was validated with 500 bootstrap resamples in the internal and external validation cohorts. The area under the receiver operating characteristic (ROC) curve (AUC) and the concordance index (C-index) were used to verify the precision of the nomogram [13,14]. Calibration plots were used to evaluate the agreement between the actual outcome and the predicted probability [15]. The age-standardized incidence of GBC was expressed per 100,000 individuals using the 2000 US standard population. Annual percentage change (APC) in the incidence was calculated using the weighted least squares method [16]. Statistical analysis was conducted using R software 3.6.2 (R Foundation, Vienna, Austria) and SPSS version 21.0 (SPSS Inc., Chicago, IL, USA). A two-tailed P value < 0.05 was considered statistically significant.
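To make the discrimination measure concrete, the following is a minimal sketch of Harrell's C-index written in plain Python. It is not the R routine used in the study; the function name and the toy data are assumptions for illustration, and pairs with tied survival times are simply skipped.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- follow-up time of each patient
    events      -- 1 if the patient died of the disease, 0 if censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the earlier member of a usable pair.
            continue
        for j in range(n):
            if times[j] > times[i]:
                # Patient i died before patient j was last seen alive.
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.2, 0.4, 0.7]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. A bootstrap estimate could be obtained by recomputing the index on resampled patient sets.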

Incidence trends
Generally, the overall age-adjusted incidence of GBC continued to increase with an APC of 1.19% between 1978 and 1993 (Fig. 2a). Within this period, the incidence increased steeply from 1989 to 1992 with an APC of 4.17%, whereas it has decreased since 1992. The incidence rate in males was broadly consistent with the overall trend in the whole population (Fig. 2b). Notably, the age-adjusted incidence rate of this disease in females decreased significantly over time (Fig. 2c). In previous studies, sex was considered a prognostic factor in the progression of GBC. Considering the decline in the incidence among women since 1975, we speculated that there may be gender differences in the incidence of GBC.
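For readers unfamiliar with the APC, it is commonly obtained by fitting a straight line to the natural logarithm of the rates against calendar year and converting the slope b into 100 × (exp(b) − 1). The sketch below follows that definition with a weighted least-squares fit. The function name, the optional weights argument and the synthetic rates are assumptions for illustration and do not reproduce the SEER*Stat computation or the study data.

```python
import math


def annual_percent_change(years, rates, weights=None):
    """APC from a weighted least-squares fit of ln(rate) on calendar year."""
    if weights is None:
        # Assumption: equal weights when no variance information is supplied.
        weights = [1.0] * len(years)
    log_rates = [math.log(r) for r in rates]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, years)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, log_rates)) / w_sum
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, years, log_rates))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, years))
    slope = s_xy / s_xx
    return 100.0 * (math.exp(slope) - 1.0)


# Synthetic rates (hypothetical) that grow by exactly 2% per year.
demo_years = list(range(2000, 2006))
demo_rates = [1.0 * 1.02 ** k for k in range(6)]
demo_apc = annual_percent_change(demo_years, demo_rates)
```

Because the fit is on the log scale, the APC describes a constant relative change per year, which is why separate segments such as 1989 to 1992 are fitted separately.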

Patients' characteristics
A total of 1103 patients diagnosed with GBC were included in the training set, and 536 patients were enrolled in the validation set. The external validation cohort included 46 patients from the First Affiliated Hospital of Nanjing Medical University. The descriptive and clinical characteristics of these patients are listed in Table 1. Overall, in the training cohort, a large proportion of the patients (75.2%) were older than 60 years. Well and moderate differentiation (n = 631, 57.2%) was the most common tumor grade. A total of 268 (24.3%) patients had regional lymph node metastasis, and 34 (3.1%) patients had four or more positive regional lymph nodes.

Identification of optimal tumor-size cut-off value with prognosis
The cut-off values of tumor size, 14 mm and 64 mm, were identified with the X-tile plot based on the minimal P-value approach, and the maximum chi-squared log-rank value was 107.2 for CSS (Fig. 3a). To validate the effect of tumor size on CSS, we divided patients from the training cohort and the internal validation cohort into three risk groups using 14 mm and 64 mm as the cut-off values (Fig. 3b, c). The results showed that tumor size is closely associated with the prognosis of GBC patients after resection: patients with tumors smaller than 14 mm had a better prognosis, and tumor size larger than 64 mm was a significant negative factor implying a worse prognosis. To better reveal the clinical value of these tumor-size cut-off values, we compared tumor characteristics among the three groups. We found that increased tumor size was associated with poor tumor grade, advanced T stage, more distant metastasis, more regional lymph nodes examined and positive, and more frequent liver metastasis.
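The idea behind the cut-off search can be sketched as follows: each candidate cut-off splits the patients into two groups, a log-rank chi-squared statistic is computed for the split, and the candidate with the largest statistic is retained. This is a simplified, single cut-off illustration in plain Python, not the X-tile program itself, which evaluates pairs of cut-offs to form three groups. The function names and the toy data are assumptions.

```python
def logrank_chi2(times, events, in_group1):
    """Two-group log-rank chi-squared statistic (1 degree of freedom)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk just before time t, overall and in group 1.
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, in_group1) if x >= t and g)
        # Deaths at time t, overall and in group 1.
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, in_group1)
                 if x == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutoff(sizes, times, events, candidates):
    """Return (cutoff, chi2) for the candidate with the largest statistic."""
    best = None
    for c in candidates:
        group = [s >= c for s in sizes]
        if all(group) or not any(group):
            continue  # a split that leaves one group empty is not usable
        chi2 = logrank_chi2(times, events, group)
        if best is None or chi2 > best[1]:
            best = (c, chi2)
    return best


# Toy data (hypothetical): tumor size in mm, survival in months, all deaths.
toy_sizes = [5, 8, 12, 20, 35, 40, 50, 70]
toy_months = [60, 55, 50, 48, 12, 10, 8, 5]
toy_deaths = [1, 1, 1, 1, 1, 1, 1, 1]
toy_best = best_cutoff(toy_sizes, toy_months, toy_deaths, [10, 30, 60])
```

Because the cut-off is chosen to maximize the statistic, its P value is optimistic, which is one reason the selected cut-offs are checked again in the validation cohorts.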

Effect of tumor size on survival outcome of GBC
In the training cohort, univariate Cox analysis showed that, besides tumor size, age, tumor grade, T stage, M stage, regional nodes examined and regional nodes positive were statistically significant prognostic factors (Table 3, all P < 0.05). All these significant factors were entered into the multivariate Cox analysis to identify independent predictors of survival. In the multivariate Cox analysis of the training cohort, age, tumor grade, T stage, M stage, regional nodes examined, regional nodes positive and tumor size were found to be independent risk factors for CSS in GBC patients with surgical resection (Table 3, all P < 0.05).

Construction and validation of prognostic nomogram for CSS
Prognostic nomograms for GBC were constructed based on the seven independent prognostic variables for CSS from the training cohort (Fig. 4). The nomogram showed that T stage contributed most to CSS. Notably, the effect of tumor size on prognosis was broadly similar to that of tumor grade and the number of positive regional lymph nodes, which are widely accepted as important prognostic factors in GBC. The total risk points can be summed to estimate the 1-, 3- and 5-year CSS according to the survival probability scales in the nomogram. C-index and ROC curves are ordinarily used to evaluate the discriminatory capacity of a prognostic model. The discriminatory capacity of the nomogram and the 7th edition TNM staging system was compared by analyzing the AUC values. In the internal validation, the AUC values of the nomogram for predicting 1-, 3- and 5-year CSS were 0.793, 0.821 and 0.825, whereas the corresponding AUC values for the 7th edition TNM staging system were 0.715, 0.765 and 0.799 (Fig. 5a-c). These results showed that the novel nomogram predicted CSS more accurately than the 7th edition TNM staging system. In the external cohort, the 1- and 3-year AUC values of the nomogram were also higher than those of the 7th edition TNM staging system (1-year 0.737 vs 0.487; 3-year 0.785 vs 0.617, Fig. 5d, e).
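The way a nomogram turns Cox model coefficients into points and survival probabilities can be sketched as below. Every number in this example is hypothetical: the coefficients and baseline survival values are placeholders chosen for illustration and are not the fitted values of the present nomogram, only three of the seven variables are shown, and the function names are assumptions.

```python
import math

# Hypothetical log hazard ratios per category (reference level = 0.0).
COEFS = {
    "t_stage": {"T1": 0.0, "T2": 0.5, "T3": 1.1, "T4": 1.6},
    "tumor_size": {"1-13 mm": 0.0, "14-63 mm": 0.4, ">=64 mm": 0.8},
    "grade": {"well/moderate": 0.0, "poor/undiff": 0.5},
}

# Hypothetical baseline CSS (all variables at reference level) by year.
BASELINE_SURVIVAL = {1: 0.95, 3: 0.88, 5: 0.84}


def points_table(coefs):
    """Rescale coefficients so the most influential variable spans 0-100."""
    max_range = max(max(lv.values()) - min(lv.values())
                    for lv in coefs.values())
    table = {}
    for var, levels in coefs.items():
        lowest = min(levels.values())
        table[var] = {lvl: 100.0 * (beta - lowest) / max_range
                      for lvl, beta in levels.items()}
    return table


def total_points(patient, table):
    """Sum the points of the patient's category on each variable."""
    return sum(table[var][lvl] for var, lvl in patient.items())


def predicted_css(patient, coefs, baseline, year):
    """Cox model prediction: S(t | x) = S0(t) ** exp(linear predictor)."""
    lp = sum(coefs[var][lvl] for var, lvl in patient.items())
    return baseline[year] ** math.exp(lp)


table = points_table(COEFS)
patient = {"t_stage": "T3", "tumor_size": ">=64 mm", "grade": "poor/undiff"}
patient_points = total_points(patient, table)
patient_css_5y = predicted_css(patient, COEFS, BASELINE_SURVIVAL, 5)
```

In this scheme the variable with the widest coefficient range receives the full 0 to 100 point scale, which mirrors the observation that the variable contributing most to CSS has the longest axis in the nomogram.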

Discussion
Lacking specific symptoms and early screening methods, GBC remains a highly malignant tumor with a discouraging prognosis [17,18]. Consistent with previous studies regarding the incidence of gallbladder cancer, we also found that the overall incidence of GBC decreased in the USA [19]. Notably, the incidence rate of GBC in men was similar to the overall trend, whereas it declined significantly year by year among women. These findings showed variation in the incidence patterns of GBC by sex, supporting the notion, proposed by Zhang et al. on the basis of female reproductive factors, that GBC should be considered separately in different genders [20].
Accurate prognostic information is necessary for a surgeon to make better clinical decisions and to counsel GBC patients regarding life expectancy after surgical resection. Currently, the TNM staging system is still an internationally recognized guideline for predicting GBC prognosis. However, the TNM staging system is mainly based on accurate pathological assessment and does not include patient-related factors. These factors and other clinical parameters may be helpful in predicting GBC patient outcomes, especially for patients with surgical resection. Therefore, we aimed to improve the prediction of postoperative prognosis in patients with GBC by combining patient- and tumor-related factors, taking advantage of the accuracy of postoperative clinicopathological factors.
The significance of tumor size in predicting the prognosis of GBC is still controversial worldwide. The current AJCC TNM staging system does not include tumor size as a prognostic factor in GBC. Several studies have found that tumor size may be an independent prognostic factor in GBC, which is in line with our study; however, debate also exists regarding the optimal cut-off value of tumor size in GBC. Yadav et al. reported that tumor size larger than 5 cm has prognostic significance for GBC patients [8]. Zhang et al. demonstrated that tumor size smaller than 3 cm clearly improves the survival outcome of GBC patients [7]. To address this issue, we first used the X-tile program to divide GBC patients into low-, middle- and high-risk groups, identified 14 mm and 64 mm as the optimal cut-off values in GBC, and established a more accurate and effective survival model based on tumor size to predict the prognosis of patients after surgical resection. Notably, our results demonstrated that increased tumor size is associated with poor tumor grade, advanced T stage, more distant metastasis, more regional lymph nodes examined and positive, and more frequent liver metastasis.
It is crucial to determine the generalizability of a novel prediction model and to guard against overfitting by internal and external validation. The calibration plots of our scoring system demonstrated good consistency between the nomogram-predicted and actual observed 1-, 3- and 5-year CSS in both the SEER cohort and the external cohort. Our nomogram also demonstrated better prognostic performance than the AJCC TNM staging system. Remarkably, our nomogram again underlined the role of tumor size in influencing the survival of GBC patients who underwent resection.
This study has limitations inherent to its retrospective nature. First, the SEER registry lacks pre-operative information such as imaging reports and laboratory data, which limits the value of our predictive model for pre-operative assessment of GBC patients. Second, some potential prognostic factors, such as neoadjuvant therapies, were not available in the SEER dataset and therefore could not be evaluated or included in the nomogram [21]. Third, although a large cohort and a limited single-center external validation were available for this study, further external validation in larger cohorts is needed to test the performance of our model.

Conclusions
In conclusion, we demonstrated that tumor size was an independent factor in predicting the CSS of GBC patients after resection. Increased tumor size was associated with poor tumor grade, advanced T stage, more distant metastasis, more regional lymph nodes examined and positive, and more frequent liver metastasis in GBC. We constructed and externally validated a novel nomogram for patients with GBC based on tumor size at the time of diagnosis. We found this system to be superior to the AJCC TNM staging system in predicting CSS in patients with surgical resection.