Development and validation of a nomogram for predicting overall survival in papillary renal cell carcinoma: a retrospective cohort study

Background: Papillary renal cell carcinoma (PRCC) is the second most common pathological subtype of renal cell carcinoma (RCC) after clear cell RCC (ccRCC), representing 10–20% of renal tumors. The aim of this study was to establish a reliable nomogram for predicting overall survival (OS) in patients with PRCC. Patients and methods: In total, 6,028 patients with PRCC from the Surveillance, Epidemiology, and End Results (SEER) database were randomly separated into training (n = 4,220) and validation (n = 1,808) cohorts. Cox regression analyses were used to identify the significant variables. A nomogram was established on the basis of the Cox model to predict OS for an individual patient with PRCC. The predictive accuracy of the nomogram was assessed via discrimination and calibration plots. Results: Age at diagnosis, grade, Tumor-Node-Metastasis stage (TNM, AJCC, 7th edition), surgical treatment, tumor number and marital status were the significant independent prognostic variables. All variables were combined to establish a nomogram. Compared with the 7th edition TNM staging system, our nomogram exhibited more favorable discrimination for OS prediction in both the training and validation cohorts. The calibration curves revealed high consistency between the survival predicted by our nomogram and the actual survival.


Introduction
Renal cell carcinoma (RCC) is a common urological malignant tumor worldwide, accounting for 62,700 new cases in the US [1] and 66,800 in China [2]. Although diagnostic techniques and targeted therapies have developed rapidly in recent years, approximately 20% of patients already have advanced disease at initial diagnosis, and approximately 30% of those with localized RCC who receive curative surgery subsequently experience recurrence [3,4]. Therefore, an accurate estimate of RCC prognosis is essential for selecting the optimal, risk-adapted treatment strategy for each patient.
RCC is divided into clear cell, papillary, chromophobe and other types, the last including less common and unclassified subtypes [5,6]. Papillary RCC (PRCC) is the second most common pathological subtype after clear cell RCC (ccRCC), representing 10-20% of renal tumors [7]. In a previous study, Margulis V et al [8] found significant differences in clinical manifestations, prognostic characteristics, and patient outcomes between PRCC and ccRCC. Many previous studies have proposed methods to assist physicians in treatment decision-making for patients with RCC, but the vast majority focus on RCC overall [9,10] or on ccRCC [11,12]. At present, there is a lack of agreement regarding how best to predict the survival of patients with PRCC.
Currently, the American Joint Committee on Cancer (AJCC) Staging Manual is regarded as the gold standard staging scheme for predicting survival in patients with RCC [13]. In this classification system, patients are stratified on the basis of depth of invasion, number of metastatic nodes and status of distant metastasis. However, survival outcomes may differ considerably even among patients with the same AJCC stage. Moreover, many other significant factors such as age, race, sex, tumor number, tumor differentiation and surgical treatment are also correlated with prognosis in multiple cancers [14,15]. Therefore, a more refined staging system that combines clinicopathological characteristics may predict survival more accurately and credibly than does the AJCC staging manual.
A nomogram, as an efficient and convenient tool for prediction, combines Tumor-Node-Metastasis (TNM) stage and other important clinicopathological characteristics to predict a specific outcome [16-19]. Combined with these clinicopathological variables, the nomogram can provide a reliable individual prediction of overall survival (OS) for patients with PRCC. Compared with the AJCC TNM staging system alone, the combination of clinicopathological variables and AJCC staging can more accurately predict the individual prognosis of OS [20,21]. In this study, we established a prognostic nomogram on the basis of a large amount of population information retrieved from the Surveillance, Epidemiology, and End Results (SEER) database to provide a more accurate and individualized OS prediction for patients with PRCC.

Study cohort
In this study, all clinicopathological information was collected from the SEER database under reference number 14581-Nov2017. Ethics approval was not required for the present study because the data in the SEER database are publicly available. The inclusion criteria were as follows: (1) diagnosed with PRCC; (2) a Histologic ICD-O-3 code of 8130; (3) complete clinicopathological characteristics including age at diagnosis, race, laterality, number of tumors, grade, surgical treatment and TNM stage (derived AJCC, 7th edition [2010+]); and (4) detailed and available survival time and vital status. Patients lacking any of the above clinicopathological features or with insufficient data were excluded from the study.

Variables and endpoints
Clinicopathological variables were: age at diagnosis, sex (female or male), race, laterality (left or right), histological grade, surgical treatment (no surgery, partial nephrectomy, or radical nephrectomy), number of tumors (multiple or single), and 7th edition AJCC stage (T1/T2/T3/T4, N0/N1, M0/M1). Both the number of tumors and age at diagnosis were converted into categorical variables for subsequent analysis. The OS of patients with PRCC served as the endpoint.

Nomogram validation
The internal and external validations were conducted using the training and validation cohorts, respectively. First, the concordance index (C-index), which is similar to the area under the curve (AUC) of the receiver operating characteristic (ROC) curve but more suitable for censored data, was used to evaluate the discrimination of the nomogram [22]. The C-index was also used to assess the performance of the 7th edition AJCC system in prognostic prediction. The C-index ranges from 0.5 (no better than chance) to 1 (perfect discrimination), and a greater C-index represents more favorable prognostic discrimination. Next, calibration curves of the nomogram were used to assess the consistency between predicted and observed OS. To reduce overfitting bias, bootstrap correction with 1,000 resamples was applied [20].
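To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the code used in this study (the analyses were run in R); the function name and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times       -- follow-up time of each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied follow-up times are skipped in this sketch
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk died first: concordant pair
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.2])
chance = harrell_c_index(toy_times, toy_events, [0.5, 0.5, 0.5, 0.5])
```

In this toy example the first risk ordering ranks every comparable pair correctly, while constant predictions give the chance-level value of 0.5.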

Statistical analyses
The independent prognostic variables correlated with OS were identified via univariate and multivariate Cox regression analyses. Then, the risk score (RS) of the multivariate Cox regression model was estimated using the following formula:

$$\mathrm{RS} = \sum_{i=1}^{n} \mathrm{Exp}(i) \times R(i)$$

where n represents the number of clinicopathological variables; Exp(i) represents the value of each variable; and R(i) represents the regression coefficient of each variable. A sample with an RS less than the average RS of all samples was considered low-risk; otherwise, the sample was considered high-risk. Survival curves for the high- and low-risk groups were plotted using the Kaplan-Meier method. In addition, the specificity and sensitivity were estimated using the ROC curve and the area under the ROC curve (AUC value).
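The risk score and the mean-based split into low- and high-risk groups can be sketched as follows. This is a minimal illustration under stated assumptions: the coefficients and patient encodings below are hypothetical placeholders, not the values estimated in this study, and the function names are invented for the example.

```python
from statistics import mean


def risk_score(values, coefficients):
    """Sum of (variable value x Cox regression coefficient) for one patient."""
    if len(values) != len(coefficients):
        raise ValueError("one coefficient is needed per variable")
    return sum(x * b for x, b in zip(values, coefficients))


def assign_risk_groups(scores):
    """Label a patient 'low' if the score is below the cohort mean, else 'high'."""
    cutoff = mean(scores)
    return ["low" if s < cutoff else "high" for s in scores]


# Hypothetical coefficients for two binary variables (illustration only)
example_coefficients = [0.5, 1.0]
example_patients = [[0, 0], [1, 0], [1, 1]]

example_scores = [risk_score(p, example_coefficients) for p in example_patients]
example_groups = assign_risk_groups(example_scores)
```

With these placeholder inputs the scores are 0.0, 0.5 and 1.5, the mean cutoff is about 0.67, and only the third patient falls into the high-risk group.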
In this study, differences were regarded as significant at a p-value < 0.05. Statistical tests were conducted using the R software.
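For readers unfamiliar with the Kaplan-Meier method used for the survival curves, a minimal product-limit estimator is sketched below. It is an illustration rather than the R routine used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    Returns a list of (time, survival probability) pairs, one for each
    distinct time at which at least one death was observed.
    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored cases leave the risk set
    return curve


# Hypothetical follow-up times in months; the second patient is censored
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate steps down only at observed deaths.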

Patient characteristics
Overall, 6,028 eligible patients with PRCC were selected and composed the primary cohort. Of these, 4,220 were randomly assigned to a training cohort for construction and internal validation of our nomogram, while 1,808 patients made up the validation cohort for external validation. The clinicopathologic and demographic features of these cohorts are presented in Table 1. At the end of the follow-up period, 485 patients in the training cohort and 210 in the validation cohort had died. The average OS for both cohorts was 31 months. In the training and validation cohorts, most cases were male, older than 50 years, and white, with T1, N0, and M0 stages, and approximately two-thirds of cases had a single tumor. Notably, the incidence of PRCC originating on the left was similar to that on the right. Moreover, most patients with PRCC underwent nephrectomy, either partial or radical.
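A random assignment of this kind can be sketched as below. The cohort sizes reported above are consistent with an approximately 7:3 split, but the exact procedure is not described in the text, so the 0.7 fraction, the seed, and the function name are assumptions made for illustration.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2017):
    """Randomly split patient identifiers into training and validation sets.

    train_fraction and seed are assumed values chosen for this example.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Placeholder identifiers 0..6027 stand in for the 6,028 eligible patients
train_ids, validation_ids = split_cohort(range(6028))
```

Fixing the seed keeps the two cohorts identical across reruns, which matters when the model is built on one cohort and assessed on the other.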

Independent prognostic variables of the training cohort
The effects of all variables on OS were analyzed using univariate and multivariate Cox regression analyses, and the impact of each variable on OS was quantified using the hazard ratio. The following variables were included in these analyses: age at diagnosis, sex, race, laterality, Fuhrman grade, TNM stage (AJCC, 7th edition), surgical treatment, number of tumors, and marital status. The results are presented in Table 2. Based on the RS, patients in the training cohort were divided into low- and high-risk groups. Survival analysis showed that patients in the low-risk group had markedly longer survival than those in the high-risk group (Fig. 1). Then, time-dependent ROC curve analysis was performed, and AUC values were used to assess the sensitivity and specificity of the survival prediction. The AUC values for three- and five-year survival prediction were 0.777 and 0.757, respectively (Fig. 2), indicating good predictive performance.
According to Table 2, survival curves for these variables were plotted using the Kaplan-Meier method and are presented in Figs. 3 and 4. The results indicated that age at diagnosis, Fuhrman grade, TNM stage, surgical treatment, tumor number and marital status were independent risk factors for OS in patients with PRCC (Fig. 3). However, sex, race and laterality had no significant association with OS (Fig. 4).

Prognostic nomogram for OS
The variables mentioned above were used in the nomogram for predicting OS at three and five years in patients with PRCC (Fig. 5). The nomogram assigns a score to each category of each variable. By adding up the scores for all variables and locating the total score on the bottom scales, the OS probabilities at three and five years can be read off. The nomogram suggested that age over 80 years was the largest contributor to OS, followed by surgical treatment and AJCC TNM stage. Laterality and race showed a minimal effect on OS.
For example, a 70-year-old black, male, married patient with a grade IV, T2N0M0 tumor originating on the right who underwent radical nephrectomy would have a total score > 180, indicating that the probabilities of survival at three and five years are approximately 75% and 60%, respectively.
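The way a Cox-based nomogram turns variable categories into points and a survival probability can be sketched as follows. This is a generic illustration of the usual construction (points scaled so that the variable with the widest effect spans 0 to 100, and survival computed as the baseline survival raised to exp(linear predictor)). All coefficients, category labels, and the baseline survival value are hypothetical and are not taken from the nomogram in Fig. 5.

```python
import math


def nomogram_points(coefs):
    """Scale log-hazard coefficients to points, 0-100 for the widest variable."""
    max_range = max(
        max(levels.values()) - min(levels.values()) for levels in coefs.values()
    )
    return {
        var: {
            level: 100.0 * (beta - min(levels.values())) / max_range
            for level, beta in levels.items()
        }
        for var, levels in coefs.items()
    }


def total_points(points, patient):
    """Add up the points of the categories that apply to one patient."""
    return sum(points[var][patient[var]] for var in patient)


def survival_probability(coefs, patient, baseline_survival):
    """Cox model prediction: S(t | x) = S0(t) ** exp(linear predictor)."""
    linear_predictor = sum(coefs[var][patient[var]] for var in patient)
    return baseline_survival ** math.exp(linear_predictor)


# HYPOTHETICAL log-hazard coefficients (reference category = 0.0)
HYPOTHETICAL_COEFS = {
    "age": {"<60": 0.0, "60-79": 0.5, ">=80": 1.0},
    "surgery": {"partial": 0.0, "radical": 0.25, "none": 0.5},
}
HYPOTHETICAL_BASELINE_3Y = 0.9  # assumed baseline 3-year survival

points = nomogram_points(HYPOTHETICAL_COEFS)
older_patient = {"age": ">=80", "surgery": "none"}
younger_patient = {"age": "<60", "surgery": "partial"}
older_total = total_points(points, older_patient)
older_survival = survival_probability(
    HYPOTHETICAL_COEFS, older_patient, HYPOTHETICAL_BASELINE_3Y
)
younger_survival = survival_probability(
    HYPOTHETICAL_COEFS, younger_patient, HYPOTHETICAL_BASELINE_3Y
)
```

Because total points are a linear rescaling of the linear predictor, a higher total always maps to a lower predicted survival, which is what the bottom scales of a printed nomogram encode.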

Validation of the nomogram
C-indexes for our nomogram were 0.807 (95% CI, 0.779 to 0.834) and 0.800 (95% CI, 0.759 to 0.841) in the training and validation cohorts, respectively, indicating the applicability and feasibility of the established nomogram for patients with PRCC. Moreover, our nomogram and the 7th edition AJCC TNM staging system were compared in terms of OS prediction in the training and validation cohorts. C-indexes for AJCC TNM staging were 0.686 (95% CI, 0.667 to 0.706) and 0.668 (95% CI, 0.638 to 0.697) in the training and validation cohorts, respectively, which are markedly smaller than those of the nomogram. This suggests that the discrimination of our nomogram is superior to that of AJCC TNM staging in the prediction of OS. Additionally, both the internal and external calibration plots for three- and five-year OS indicate good agreement between the nomogram predictions and observations in the training and validation cohorts (Figs. 6 and 7).

Discussion
A previous study [12] found that age at diagnosis, race, sex, Fuhrman grade, marital status, TNM stage and surgical approach were markedly associated with the prognosis of patients with ccRCC, and its authors developed nomograms based on these factors to predict three-, five- and 10-year overall and disease-specific survival. To our knowledge, the first disease-specific survival nomogram for patients with PRCC was constructed by Klatte et al [26]. In their study, T classification, M classification, incidental detection, extent of tumor necrosis, and vascular invasion were considered independent prognostic factors and were combined to develop the nomogram. However, this nomogram relied on a small patient cohort (258 for development and 177 for external validation), which undermined the power of their outcomes.
In our study, for the first time, a nomogram for individual OS prediction was established from a large population of patients with PRCC in the SEER database. Our nomogram combined individual clinicopathological and demographic information and exhibited excellent discrimination in both internal and external validation, showing suitable clinical practicality and appropriateness for patients with PRCC.
Moreover, the nomogram showed more favorable precision and reliability than the 7th edition AJCC TNM staging system in terms of OS prediction. Additionally, survival curves for each independent prognostic variable were estimated using the Kaplan-Meier method, suggesting a significant correlation between these variables and OS.
Our nomogram consisted of nine variables: age at diagnosis, sex, race, laterality, Fuhrman grade, number of tumors, TNM stage, surgical treatment and marital status. Age was considered a significant prognostic variable for OS in previous studies [27,28], possibly because organ function declines with increasing age. Our Kaplan-Meier curve analysis revealed that as age increases, the OS of patients with PRCC decreases, and patients older than 80 years have the shortest survival time. A higher TNM stage and Fuhrman grade were associated with an unfavorable prognosis. Furthermore, patients undergoing partial nephrectomy appeared to have longer OS than those who underwent radical nephrectomy or no surgery. This could be because most patients undergoing partial nephrectomy were at an earlier TNM stage. Based on the survival curves, we noticed that patients with a single tumor tended to have a better prognosis than those with multiple tumors. Notably, our results showed that race, sex, and laterality had no significant correlation with the OS of patients with PRCC, which was in accordance with the results of the multivariate Cox regression analysis.
Lam et al [29] reported that the TNM stage alone is not sufficient to effectively predict prognosis because of survival heterogeneity even within individual stages. Compared with TNM staging, our nomogram is not only simple, but also provides a quantitative and reliable tool for the prognostic prediction of an individual patient with PRCC. For example, consider two PRCC cases (tumor on the right, T3N1M0, grade IV): case A, a 40-year-old black, married, female patient with a single tumor who underwent radical nephrectomy, would score approximately 170 points, and case B, a 50-year-old black, married, male patient with a single tumor who underwent radical nephrectomy, would score approximately 215 points. According to our nomogram, the three- and five-year OS probabilities were approximately 80% and 65% for case A and less than 65% and 45% for case B, respectively. In contrast, both cases would be assigned the same traditional TNM stage, indicating similar prognoses. This illustrates the disadvantages of the TNM staging system in individual prognostic prediction. Therefore, our nomogram is important for the evaluation of patients with PRCC in the follow-up period and for the selection of clinical treatments. For example, younger patients with earlier TNM stages and well-differentiated histology could receive surgery because of a favorable prognosis, while older patients with a later TNM stage and poorly differentiated histology who have a short life expectancy might not be candidates for surgery. However, relying purely on TNM staging to select a therapeutic strategy may be ineffective, and clinicians must also draw on their clinical experience to make a more appropriate decision. Therefore, physicians would be better able to select the optimal therapeutic strategy for patients with PRCC using our nomogram, which incorporates clinicopathological variables.
Overall, our nomogram has some strengths over previous models. First, to our knowledge, this is the first study to predict OS for patients with PRCC based on a large population cohort. Second, this study included several important clinical variables, such as age at diagnosis, race, laterality, Fuhrman grade, tumor number, TNM stage, and marital status, which improved the prognostic accuracy of the nomogram. Third, both C-indexes and calibration plots were used to assess the predictive accuracy of the nomogram. All C-indexes of the nomogram were over 0.7, suggesting good accuracy for OS prediction. Fourth, Kaplan-Meier curve analysis was used in this study, and the results showed that these variables were significantly associated with OS. Finally, the clinicopathological variables in this model are easily available and reflect patient status and tumor features well, thereby providing clinical information regarding PRCC.
Despite our nomogram exhibiting excellent accuracy, several limitations must be considered. First, this study was restricted by the SEER database, which lacks sufficient data on other pivotal prognostic variables, such as the ECOG prognostic scores, detailed histological information and mode of presentation, which have proven to be predictors of survival [30]; these predictors were therefore not analyzed in this study. In addition, chemotherapy and radiotherapy data were unavailable. Moreover, the nomogram was based on retrospective data, which inevitably carries a risk of bias.

Figure caption: The calibration curves of 3- and 5-year OS for the validation cohort (A for 3-year OS, B for 5-year OS). Nomogram-predicted probability of survival is plotted on the x-axis, and the actual survival is plotted on the y-axis. Dashed lines through the point of origin represent the perfect calibration models where the predicted probabilities are identical to the actual probabilities. OS, overall survival.