A Competing Risk Nomogram for Predicting Cancer-Specific Survival Among Patients with Prostate Cancer after Radical Prostatectomy

Background: We aimed to develop a detailed individual survival prognostication tool based on competing risk analyses to predict the risk of 5-year cancer-specific death after radical prostatectomy for patients with prostate cancer (PCa). Methods: We obtained the data from the Surveillance, Epidemiology, and End Results (SEER) database (2004-2016). The main variables obtained included age at diagnosis, marital status, race, pathological extension, regional lymph node status, prostate-specific antigen (PSA) level, and biopsy Gleason score (GS). To identify the independent prognostic factors, the cumulative incidence function was used for the univariable competing risk analyses and the Fine and Gray proportional subdistribution hazard approach was used for the multivariable competing risk analyses. With these factors, a nomogram and a risk stratification based on the nomogram were established. The concordance index (C-index) and calibration curves were used for validation. Results: A total of 95,812 patients were included and divided into a training cohort (n = 67,072) and a validation cohort (n = 28,740). Seven independent prognostic factors, including age, race, marital status, pathological extension, regional lymph node status, PSA level, and GS biopsy, were used to construct the nomogram. The C-index was 0.828 (95% CI, 0.812-0.844) in the training cohort and 0.838 (95% CI, 0.813-0.863) in the validation cohort. The cumulative incidence function showed that the risk stratification based on the nomogram discriminated better than the risk stratification system based on the D'Amico risk stratification. Conclusions: We successfully developed the first competing risk nomogram to predict the risk of cancer-specific death after surgery for patients with PCa. It has the potential to help clinicians improve postoperative management.

specific antigen (PSA) level, clinical stage, Gleason Score (GS), and pathologic extent to predict the prognosis. However, these tools are still flawed. They were mainly developed from small numbers of patients, the relative weight of the various prognostic factors is not clear enough, and some studies have pointed out that their prediction accuracy is often less than 70% [11,12]. In addition, many patients with PCa are elderly people with many comorbidities, and they are more likely to die from cardiovascular disease, infection, or other non-tumor causes. It is therefore more difficult for researchers to accurately determine the prognosis of these patients [13,14]. Competing risk analysis is a time-to-event analysis that accounts for fatal or non-fatal events that may alter or prevent the occurrence of the endpoint of interest, such as cancer-specific death, and it can estimate the patient's true prognosis more accurately and with less bias [14][15][16]. Meanwhile, the nomogram is a widely used risk prediction tool that visualizes complex regression equations and shows the relative weight of each prognostic factor, making the results of a prediction model easier to read and evaluate [17].
To overcome these defects, we used competing risk analyses to evaluate the factors affecting prostate cancer-specific survival (CSS) in a large cohort. We then developed a prognostic nomogram and constructed a risk stratification that may have clinical value in helping clinicians identify patients at high risk of cancer-specific death after RP.

Patient selection
All patient information was obtained from the Surveillance, Epidemiology, and End Results (SEER) database (2004-2016). The SEER database is a public cancer dataset made up of 18 population-based cancer registries and covers about 25% of the population of the United States [18]. From the SEER database, patients with a diagnosis of adenocarcinoma of the prostate (International Classification of Diseases for Oncology, third edition (ICD-O-3) code: C61.9) between 2004 and 2016 were selected. The inclusion and exclusion criteria are shown in detail in the flowchart (Fig. 1). All patients were randomly divided into a training cohort and a validation cohort at a ratio of 7:3.
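A random 7:3 split of this kind can be sketched as follows. This is only an illustration in Python (the study itself used R); the function name `split_train_validation`, the fixed seed, and the use of integer patient identifiers are assumptions, and the sketch is not expected to reproduce the exact cohort sizes reported in this study.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient identifiers into training and validation sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) makes the split reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers 0..95811 stand in for the eligible patients.
train_ids, validation_ids = split_train_validation(range(95812))
```

Every patient lands in exactly one of the two cohorts, so the cohort sizes always sum to the number of eligible patients.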

Variables and endpoint
For each patient, the information extracted from the SEER database included age at diagnosis, marital status, race, pathological extension, regional lymph node status, PSA level, GS biopsy, and follow-up information. For the continuous variables, age at diagnosis and PSA level, X-tile software (Yale University, USA) was used to determine the optimal cut-off values by the minimal p-value approach [19] (Fig. 2). The optimal cut-off values for age at diagnosis were ≤ 5.9, 6.0-14.9, > 14.9. The optimal cut-off values for PSA level were ≤ 5.9, 6.0-14.9, > 14.9.
Cancer-specific death (CSD) was the primary endpoint. CSD was defined as any death caused by prostate cancer, complications of treatment, or unknown processes in patients with active tumor. Other cause-specific death (OCSD) was defined as any death caused by non-cancer events and was treated as the competing event for CSD. Follow-up time was defined as the time between the first treatment and the patient's death or last follow-up.

Statistical analysis
For categorical variables, a χ² test was used to evaluate the differences between the training cohort and the validation cohort, and the results were presented as frequencies with proportions. In the training cohort, we estimated the cumulative incidence function (CIF) for CSD and tested the differences between groups with Gray's test to identify potential prognostic variables (p-value < 0.05). Subsequently, we performed competing risk multivariable analyses based on the Fine and Gray proportional subdistribution hazard approach to identify the independent prognostic variables (p-value < 0.05). All independent prognostic factors were used to construct a nomogram to predict 5-year CSD probabilities for PCa patients after RP.
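To make the CIF concrete, the following is a minimal Python sketch of the standard nonparametric estimator (Aalen-Johansen form) for one cause of death in the presence of a competing cause. It is not the code used in this study, which was run in R; the function name `cumulative_incidence`, the event coding (0 = censored, 1 = CSD, 2 = OCSD), and the toy data are all assumptions for illustration.

```python
from collections import Counter


def cumulative_incidence(times, events, cause):
    """Nonparametric cumulative incidence for one cause.

    events uses an assumed coding: 0 = censored, 1 = CSD, 2 = OCSD.
    Returns a list of (time, cumulative incidence) pairs.
    """
    counts_by_time = {}
    for t, e in zip(times, events):
        counts_by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    overall_survival = 1.0  # probability of being free of any event just before t
    cif = 0.0
    curve = []
    for t in sorted(counts_by_time):
        counts = counts_by_time[t]
        n_any_event = sum(c for e, c in counts.items() if e != 0)
        n_cause = counts.get(cause, 0)
        # Increment: P(event-free just before t) * cause-specific hazard at t
        cif += overall_survival * n_cause / n_at_risk
        overall_survival *= 1.0 - n_any_event / n_at_risk
        n_at_risk -= sum(counts.values())
        curve.append((t, cif))
    return curve


# Toy follow-up data (hypothetical): months and event codes for five patients.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 2]
csd_curve = cumulative_incidence(toy_times, toy_events, cause=1)
ocsd_curve = cumulative_incidence(toy_times, toy_events, cause=2)
```

Because each increment is weighted by the probability of being free of any event, the cumulative incidences of CSD and OCSD never sum to more than 1, which is the property that distinguishes the CIF from one minus a Kaplan-Meier estimate that treats competing deaths as censoring.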
To validate the performance of the nomogram, its discrimination was assessed in both cohorts with Harrell's concordance index (C-index). A C-index of 0.5 indicates no discrimination and a C-index of 1 indicates perfect discrimination, so a higher C-index means better discrimination of the prediction model [20]. In addition, calibration curves with 1000 bootstrap resamples were used to compare the predicted survival outcome with the actual survival outcome. The closer the calibration curve was to the standard curve, the closer the survival outcome predicted by the nomogram was to the actual survival outcome [21]. We also developed a risk stratification based on the nomogram risk score. The cut-off values of the risk scores were determined using X-tile software. We then compared the discrimination ability of this risk stratification with that of the European Association of Urology (EAU) risk stratification based on the D'Amico stratification [2].
The statistical software R (version 3.4.3, The R Foundation) was used for all of the above statistical analyses. A p-value < 0.05 was considered statistically significant.
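As an illustration of what the C-index measures, the sketch below computes Harrell's C-index for right-censored data with a single event indicator. It is a simplified Python example under stated assumptions, not the procedure used in this study: it does not apply any competing-risk adaptation, and the function name `harrell_c_index` and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index; events: 1 = event of interest observed, 0 = censored.

    A pair is usable when the patient with the shorter follow-up had the event.
    It is concordant when that patient also has the higher predicted risk;
    tied risk scores count as half concordant.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs: C-index is undefined")
    return concordant / usable


# Toy data (hypothetical): higher scores are assigned to earlier events.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.5, 0.1]
c_perfect = harrell_c_index(toy_times, toy_events, toy_scores)
c_reversed = harrell_c_index(toy_times, toy_events, toy_scores[::-1])
```

In this toy example the scores rank the patients perfectly, so reversing them gives the worst possible ranking; a model with no discrimination would fall near 0.5.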

Patient characteristics
A total of 95,812 eligible patients were included in this study. Among them, 67,072 patients were assigned to the training cohort and 28,740 patients to the validation cohort. Table 1 shows the patient characteristics in detail. There were no statistically significant differences between the training and validation cohorts except for age at diagnosis.

Identification of independent prognostic factors
We used the CIF and Gray's test for the univariable analyses. The results showed that age, race, marital status, pathological extension, regional lymph node status, PSA level, and GS biopsy had a significant impact on CSD. The Fine and Gray proportional subdistribution hazard approach was used for the multivariable analyses, and the results were consistent: age, race, marital status, pathological extension, regional lymph node status, PSA level, and GS biopsy were significant prognostic factors for CSD. These variables can therefore be regarded as independent prognostic factors for predicting the CSS of PCa patients after RP. The detailed results of the univariable and multivariable analyses are shown in Table 2.
Age, race, marital status, pathological extension, regional lymph node status, PSA level, and GS biopsy were used to construct the nomogram for predicting the probability of 5-year CSD for PCa patients after RP (Fig. 3). The detailed score of each nomogram variable is listed in Table 3. We used the C-index and calibration curves to validate the reliability of the nomogram. The C-index was 0.828 (95% CI, 0.812-0.844) for the training cohort and 0.838 (95% CI, 0.813-0.863) for the validation cohort. The relatively high C-index (> 0.8) indicated good predictive ability of the nomogram. Meanwhile, in both the training and validation cohorts, the calibration curves showed good agreement between the 5-year CSD predicted by the nomogram and the actual 5-year CSD (Fig. 4).

Establishment of risk stratification for cancer-specific death after RP
According to the score corresponding to each nomogram variable, we calculated the total risk score for each patient in both the training cohort and the validation cohort. Using X-tile, patients were divided into three risk groups based on the total risk score from the nomogram. The low-risk group included patients with 0-66 points, the middle-risk group included patients with 67-105 points, and the high-risk group included patients with at least 106 points.
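The grouping rule above can be written as a small function. This is a minimal Python sketch: the thresholds come from the text, while the function name `risk_group` and the returned labels are assumptions, and the total points are taken as already computed from the nomogram.

```python
def risk_group(total_points):
    """Map a nomogram total risk score to the three risk groups in the text."""
    if total_points < 0:
        raise ValueError("total points cannot be negative")
    if total_points <= 66:
        return "low"      # 0-66 points
    if total_points <= 105:
        return "middle"   # 67-105 points
    return "high"         # 106 points or more


# Hypothetical total scores for four patients.
groups = [risk_group(p) for p in (40, 66, 67, 130)]
```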
To verify the predictive value of this risk stratification, we compared it with the EAU risk stratification based on the D'Amico stratification. The EAU risk stratification is one of the most popular risk stratification tools for PCa patients and divides patients into three groups: low-risk, medium-risk, and high-risk. In both the training cohort and the validation cohort, we plotted the CIF curves for the different risk groups defined by the nomogram-based risk stratification and by the EAU risk stratification (Fig. 5). The CIF curves of CSD were more clearly separated between our risk groups than between the EAU risk groups. Meanwhile, in both cohorts, the high-risk group identified by our risk stratification had a significantly higher risk of CSD than of OCSD, while the high-risk group in the EAU risk stratification did not. These results showed that the novel risk stratification based on the nomogram had better prognostic discrimination than the EAU risk stratification.

Discussion
In this study, based on a large cohort of 95,812 patients from the SEER database, we identified seven risk factors and constructed a competing risk nomogram based on these prognostic factors to predict the probability of CSD within 5 years after RP for each patient with PCa. Furthermore, based on the differences in nomogram scores, we developed a novel risk stratification for postoperative CSD in patients with PCa. Our risk stratification has potential clinical value and may help clinicians better identify patients who still need active intervention after RP. The results showed that the discrimination of our stratification system was not weaker than that of the commonly used EAU risk stratification based on the D'Amico stratification.
The competing risk nomogram is a widely used risk prediction model in many fields of oncology, such as lung cancer, breast cancer, and colorectal cancer [22][23][24]. A nomogram can incorporate many key factors of the disease into the prognosis prediction model and can account for the weight of each variable to make the prediction more accurate. In addition, the graphical representation makes it more intuitive to evaluate the individual situation of each patient, which is more practical [25]. At the same time, the competing risk nomogram has unique advantages over traditional nomograms and other prognosis prediction models. It is based on competing risk analysis methods such as the CIF and the Fine and Gray proportional subdistribution hazard approach, rather than the Kaplan-Meier method and Cox proportional hazards regression commonly used in other types of models [16]. Competing risk analyses consider not only the survival and death of patients, but also the impact of deaths from other causes on the endpoint of interest, such as CSD. This is especially important in PCa research, because a large proportion of PCa patients may die from other causes before CSD can occur [13]. To our knowledge, no competing risk prognosis prediction model for PCa patients after RP has yet been reported.
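For reference, the two quantities named above can be written in standard textbook notation. The symbols are assumptions introduced here rather than notation from this study: T is the time to the first event, D is its cause (k for CSD), x is the covariate vector, and β is the vector of regression coefficients.

```latex
% Cumulative incidence function for cause k
\[ F_k(t) = \Pr(T \le t,\ D = k) \]

% Subdistribution hazard for cause k: patients who already died of another
% cause stay in the risk set
\[ \lambda_k(t) = \lim_{\Delta t \to 0}
   \frac{\Pr\bigl(t \le T < t + \Delta t,\ D = k \mid T \ge t \ \text{or}\ (T < t,\ D \ne k)\bigr)}{\Delta t} \]

% Fine and Gray proportional subdistribution hazard model
\[ \lambda_k(t \mid x) = \lambda_{k,0}(t)\,\exp(\beta^{\top} x),
   \qquad
   F_k(t \mid x) = 1 - \exp\!\Bigl(-e^{\beta^{\top} x} \int_0^t \lambda_{k,0}(s)\,ds\Bigr) \]
```

Because the model links covariates directly to the cumulative incidence, a fitted model can be turned into a predicted probability of CSD by a fixed time point, which is what a competing risk nomogram displays.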
In the field of PCa, the most commonly used nomogram at present is the Stephenson nomogram. It was developed by Stephenson et al. to predict disease progression after salvage radiotherapy (SRT), with data from a multi-institutional retrospective cohort of 1540 patients. Seven variables were used to construct the nomogram, including PSA before SRT, surgical margins, GS, PSA doubling time before SRT, lymph node metastasis, and androgen deprivation therapy administration before or during SRT [26]. However, the Stephenson nomogram still has some defects. Because of its inclusion and exclusion criteria, it is not widely applicable to PCa patients who have received RP. It also paid little attention to hard endpoints such as CSD. In the cohort of the original study, its C-index was 0.69, and the C-index obtained when it was tested in another study was even lower [27]. Therefore, a more accurate and versatile nomogram is still needed to better predict the survival of PCa patients after RP.
In our study, the competing risk analyses identified 7 prognostic factors: age, race, marital status, pathological extension, regional lymph node status, PSA level, and GS biopsy. Among them, GS had the greatest influence on survival outcomes. Many studies have reported the relationship between GS and the prognosis of PCa [28][29][30]. The International Society of Urological Pathology (ISUP) reported that GS can be divided into five groups (2-6, 7 (3 + 4), 7 (4 + 3), 8, ≥ 9) according to prognosis, which was consistent with our results [30]. As GS increased, the patient's nomogram score also increased; that is, the probability of the patient developing CSD within 5 years increased. In the nomogram, the GS 4 + 3 = 7 group had a clearly higher score than the GS 3 + 4 = 7 group. This was also consistent with the latest American Urological Association (AUA) clinical guideline, which indicated that many studies had demonstrated that the prognosis of GS 4 + 3 was significantly worse than that of GS 3 + 4 [31,32]. Pathological extension was another important prognostic factor, whose weight was second only to that of GS. It has been widely accepted that poor pathological findings such as extracapsular invasion and seminal vesicle invasion are related to disease recurrence and poor prognosis [33][34][35].
In addition to the well-known prognostic factors mentioned above, our study also found an impact of race and marital status on the prognosis of PCa patients. Our nomogram showed that African Americans had the highest risk of CSD after RP, followed by Caucasians and other races. This finding was consistent with some studies published in recent years. According to published statistics, the average annual incidence of PCa among African Americans was 60% higher than that among Caucasian men. In addition, compared with other races, African Americans have the highest mortality rate [36,37]. The causes of this result are complicated. For example, in the United States, tumors tended to be larger in African Americans and were more likely to metastasize than in white men [38]. From a genetic perspective, some gene alterations related to disease progression, such as TP53 mutations and MYC amplification, are more common in African Americans [39]. Several risk-associated single nucleotide polymorphisms were also found to be more common in African Americans [40]. At the same time, African Americans may face social barriers, such as access to health insurance, which may affect the treatment and management of the disease [41]. Our competing risk analyses also identified marital status as an independent prognostic factor. More and more researchers have paid attention to the impact of this sociological factor on the disease. Numerous studies have shown that being married is a protective factor against the occurrence and progression of a variety of tumors, including PCa. Marriage may be a multifaceted representation of many protective factors, including social support [42,43].
The EAU risk stratification based on the D'Amico stratification is currently a common risk stratification system for PCa patients, which divides patients into low-risk, intermediate-risk, and high-risk groups for predicting the risk of disease recurrence [2]. In our study, we compared the novel risk stratification based on the nomogram with the EAU risk stratification. The results showed that our risk stratification system had better discrimination, with a C-index over 0.8, and could better detect patients at higher risk of postoperative CSD after adjustment by competing risk analyses. The high-risk group obtained through our risk stratification had a significantly higher risk of CSD than of OCSD, which could better exclude the interference of deaths from non-tumor causes with the model. Our advantages may come from many aspects, such as a large cohort, more prognostic factors, and independent analyses of competing risks. At the same time, our research provides quantitative and graphical prognostic tools, which help to make more accurate assessments of each patient.
Our study revealed 7 main independent prognostic factors that affect the occurrence of CSD in patients after RP and explored the application of these factors in identifying high-risk patients through the nomogram and risk stratification. Current guidelines point out that there are multiple management options for patients undergoing RP, including adjuvant treatment, salvage treatment, and watchful waiting. However, because of the lack of high-quality prospective data, the patient selection criteria for these options are still controversial [2,44]. In our study, the risk stratification derived from the nomogram provided a reference for the selection criteria for the postoperative management of patients. Taking into account the differences in the risk of CSD, the high-risk group may require more active intervention, while the low-risk group may be more suitable for watchful waiting.
There are still several limitations to our study. First, our research is based on a large retrospective cohort, and prospective clinical trials are still needed to provide more precise data. Second, because of the limitations of the SEER database, we were unable to obtain some data that could enrich our research outcomes, such as patients' functional status and disease progression, as well as some more detailed clinical parameters such as PSA doubling time. Although our prediction model has reached a relatively high accuracy (C-index > 0.8), in the future we can try to use these parameters to further optimize the nomogram and risk stratification system. Third, we lack an additional independent external validation set, and obtaining one is an important goal of our future work.

Conclusions
In conclusion, we performed a competing risk analysis based on a large cohort of 95,812 patients with nonmetastatic PCa from the SEER database. We identified 7 independent prognostic factors for the occurrence of CSD after RP and constructed a competing risk nomogram using these 7 factors to estimate the risk of CSD for each patient. A risk stratification system was established based on the nomogram to help clinicians better identify patients at high risk of CSD after surgery.