A Novel Risk Classification Model Predicts Overall Survival and Locoregional Surgery Benefit in Colorectal Patients with Distant Metastasis at the Initial Diagnosis

Background and aims: In this research, we aimed to construct a risk classification model to predict overall survival (OS) and locoregional surgery benefit in colorectal cancer (CRC) patients with distant metastasis. Methods: We selected a cohort of 12741 CRC patients diagnosed with distant metastasis between 2010 and 2014 from the Surveillance, Epidemiology and End Results (SEER) database. Patients were randomly assigned to a training group and a validation group at a ratio of 2:1. Univariable and multivariable Cox regression models were applied to screen for independent prognostic factors. A nomogram was constructed and assessed by Harrell's concordance index (C-index) and calibration plots. A novel risk classification model was further established based on the nomogram. Results: Ultimately, 12 independent risk factors, including race, age, marital status, tumor site, tumor size, grade, T stage, N stage, bone metastasis, brain metastasis, lung metastasis and liver metastasis, were identified and adopted in the nomogram. The C-indexes of the training and validation groups were 0.77 (95% confidence interval [CI] 0.73-0.81) and 0.75 (95% CI 0.72-0.78), respectively. The risk classification model stratified patients into three risk groups (low-, intermediate- and high-risk) with divergent median OS (low-risk: 36.0 months, 95% CI 34.1-37.9; intermediate-risk: 18.0 months, 95% CI 17.4-18.6; high-risk: 6.0 months, 95% CI 5.3-6.7). Locoregional therapies including surgery and radiotherapy provided a prognostic benefit for patients in the low-risk group (surgery: hazard ratio [HR] 0.59, 95% CI 0.50-0.71; radiotherapy: HR 0.84, 95% CI 0.72-0.98) and the intermediate-risk group (surgery: HR 0.61, 95% CI 0.54-0.68; radiotherapy: HR 0.86, 95% CI 0.77-0.95), but not in the high-risk group (surgery: … CI …). And all risk groups could benefit from … (… 95% CI …; intermediate-risk: HR …, 95% CI …; high-risk: HR 0.46, 95% CI 0.40-0.53). Conclusion: A novel risk classification model predicting prognosis and locoregional surgery benefit of …


Background
Colorectal cancer (CRC) is among the most common malignant tumors in both sexes globally (third in men, second in women). Approximately 1.8 million patients are newly diagnosed each year, with nearly 0.86 million deaths annually (1,2). Although its incidence and mortality rates in the USA are slowly declining thanks to early detection by colonoscopy, its worldwide incidence remains high and its mortality remains severe, mainly because of distant metastasis (3,4).
About one fifth of CRC patients have metastatic lesions at the time of diagnosis, most of which involve the liver or lung. Approximately 50% of CRC patients will ultimately develop metastasis in their lifetime, which marks the end stage of cancer progression and a poor prognosis (5). Yet different metastatic organs result in different survival outcomes. The subset of CRC with isolated metastasis to the liver and/or lung is currently regarded as potentially curable with surgery, while for other specific metastatic CRC, treatment can be palliative, mainly consisting of systemic chemotherapy (6). The pervasive application of systemic chemotherapy remains controversial. Numerous studies have reported chemotherapy paradigms consisting of diverse agents and indications (7-9).
However, because of tumor heterogeneity as well as various demographic risk factors, standard therapy does not benefit patients of all backgrounds (10,11). The efficacy of systemic treatments depends on geography, race, age, and other clinicopathological features (12). A personalized regimen is required for a better therapeutic effect in the individual patient, and it should be guided by a comprehensive prognostic model that predicts likely survival outcomes under given circumstances (13). To date, no such predictive model for CRC patients has been constructed. Therefore, we aimed to build a risk classification model capable of visualizing the quantified risk factors and suitable for use in clinical practice.

Study population
Patients included in this research were collected from the SEER database via SEER*Stat. Authorization to access the published data was officially obtained from the SEER website (www.seer.cancer.gov). The SEER database obtained informed consent before releasing the data.
Among 185498 patients diagnosed with CRC and registered in SEER between 2010 and 2014, we included cases meeting the following criteria: 1. CRC was pathologically identified as the primary and only malignancy; 2. patients had unequivocal metastasis with TNM stage rated as "M1"; 3. complete information on clinical and pathological parameters was documented in the SEER database. Patients with multiple malignancies, unclear diagnostic evidence, no or uncertain metastatic status, or incomplete clinicopathological data were excluded.
The final cohort consists of 12741 patients, whose documented data on demographic, clinicopathological and treatment parameters, including race, age, marital status, tumor site, tumor size, tumor grade, T and N stage, metastatic status of bone, brain, liver and lung, treatment information on radiotherapy, chemotherapy and surgery, vital status and survival months, were abstracted from the SEER database. For continuous variables such as age and tumor size, we applied the X-tile software to convert them into categorical variables based on the cut-off values of optimal significance (age is divided into "<55 years", "55-75 years" and ">75 years", and tumor size is divided into "<3cm", "3-5cm" and ">5cm").
To simplify the analysis, marital status and treatment data, including radiotherapy, chemotherapy and surgery, were reduced to dichotomous variables ("married" or "unmarried", "yes" or "no"). Death was defined as the main outcome event. Overall survival (OS) time was calculated as the time between diagnosis and death from any cause.
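The categorization described above can be sketched in a few lines. This is only an illustration: the cut points come from the text, but the function names are hypothetical and the handling of boundary values (exactly 55 or 75 years, exactly 3 or 5 cm) is an assumption, since the text does not state which category they fall into.

```python
# Minimal sketch of converting continuous variables into the categories
# named in the text. Boundary handling is an ASSUMPTION: values of exactly
# 55/75 years and 3/5 cm are placed in the middle category here.

def categorize_age(age_years):
    """Map age in years to one of the three age groups used in the text."""
    if age_years < 55:
        return "<55 years"
    if age_years <= 75:
        return "55-75 years"
    return ">75 years"


def categorize_tumor_size(size_cm):
    """Map tumor size in cm to one of the three size groups used in the text."""
    if size_cm < 3:
        return "<3cm"
    if size_cm <= 5:
        return "3-5cm"
    return ">5cm"


if __name__ == "__main__":
    # Invented example values, not patient data.
    for age, size in [(48, 2.1), (63, 4.0), (81, 7.5)]:
        print(age, categorize_age(age), size, categorize_tumor_size(size))
```

The cut points themselves would come from X-tile; the sketch only applies them once they are known.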

Statistical analysis
We randomly assigned the 12741 included cases to the training and validation groups at a ratio of 2:1.
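A random 2:1 assignment of this kind could look like the following sketch. The function name, the seed, and the choice to shuffle and cut at two thirds are all assumptions made for illustration; the text does not specify the procedure actually used, so the exact group sizes it produces need not match those of the study.

```python
import random


def split_two_to_one(case_ids, seed=0):
    """Shuffle case identifiers and split them roughly 2:1.

    ASSUMPTION: a shuffle followed by a cut at two thirds of the cohort.
    The seed is arbitrary and only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3  # size of the training group
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    training, validation = split_two_to_one(range(12741), seed=1)
    print(len(training), len(validation))
```

Every case lands in exactly one group, so the two groups together always reconstitute the full cohort.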
The training group served as the data source for constructing the prognostic model, while the validation group was used to validate it. Initially, a descriptive analysis of the demographic and clinicopathological baseline characteristics of the entire cohort was performed using the chi-square test. Survival analysis was performed by the Kaplan-Meier method in each subgroup. Median survival time with 95% confidence interval (95% CI) was presented along with the frequency distributions. Univariate and multivariate analyses were performed to identify independent risk factors for overall survival. Factors that remained statistically significant in the multivariate model were accepted for constructing the nomogram in R software (Version 1.1.456), using the packages "rms", "VIM" and "survival". Two- and three-year overall survival were adopted as endpoints in the nomogram. Harrell's concordance index (C-index) was used as the indicator of the discriminative capacity of the prognostic model. Calibration curves were plotted to visualize the consistency between predicted and observed survival at the given time points and thereby assess the predictive accuracy of the model. The nomogram assigned each risk factor a quantified score weighted by its degree of risk. Each patient's total prognostic score was calculated by adding the points of all risk factors, and patients were classified according to their total scores for risk stratification. Statistical significance was defined as a two-sided P-value < 0.05. Statistical analyses in this study were performed with SPSS 24.0 (SPSS Inc., Chicago, IL, USA).
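To make the C-index concrete, the sketch below computes Harrell's concordance index directly from its pairwise definition. It is an illustration only, not the R routine used in the study: the function name is hypothetical, pairs with tied survival times are simply skipped (an assumption), and the example values are invented.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index from its pairwise definition.

    times       : follow-up time for each patient (e.g. survival months)
    events      : 1 if death was observed, 0 if the patient was censored
    risk_scores : model output, where a higher score means higher predicted risk

    A pair (i, j) is comparable when patient i died and did so before the
    follow-up time of patient j. The pair is concordant when patient i also
    has the higher risk score; tied scores count as half.
    ASSUMPTION: pairs with identical follow-up times are ignored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Invented toy data: four patients, one of them censored.
    months = [5, 10, 15, 20]
    died = [1, 1, 0, 1]
    scores = [0.9, 0.7, 0.4, 0.1]
    print(harrell_c_index(months, died, scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the reported C-indexes are read.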

Baseline clinicopathological characteristics
In total, 12741 patients were selected based on the inclusion and exclusion criteria, and their clinical information was abstracted from the SEER database (detailed selection protocol shown in Fig. 1). The whole cohort was randomly assigned to the training and validation groups at a ratio of 2:1, with 8510 cases in the training set and 4231 cases in the validation set. Baseline clinicopathological parameters of the two sets are shown in Table 1, along with the OS and 95% CI of each subgroup. For demographic factors such as race, gender, age and marital status, the frequency distributions are relatively homogeneous between the two sets. For clinicopathological factors, about 3.1%, 1.0%, 16.9% and 70.0% of patients in the training group presented with metastasis to bone, brain, lung and liver, respectively, indicating a tendency toward liver metastasis in CRC patients.

Univariate and multivariate analyses for overall survival
We applied univariate analysis to all clinical parameters to screen for potential prognostic indicators. As shown in Table 2, multivariate analysis identified 12 independent risk factors for OS: race, age, marital status, tumor site, tumor size, grade, T stage, N stage, and bone, brain, lung and liver metastasis. The HR with 95% CI of each subgroup is also presented in a forest plot (Fig. 2). These independent risk factors were therefore all adopted for constructing the nomogram.

Nomogram construction and validation
All the verified independent risk factors in Table 2 were incorporated to construct the prognostic model.
The total score was matched to an estimated 2- and 3-year overall survival rate according to the bottom two lines of the nomogram (Fig. 3). As indicators of the discriminative ability of the prognostic model, the C-indexes of the training and validation groups were 0.77 (95% CI 0.73-0.81) and 0.75 (95% CI 0.72-0.78), respectively. Moreover, calibration curves for 2- and 3-year OS showed satisfactory consistency between the predicted and observed survival rates in both the internal and external validation cohorts (Fig. 4).

A novel risk classification system for prognosis
… (Fig. 5). The established risk stratification system proved accurate in predicting survival outcomes of CRC patients with distant metastasis.

Discussion
In the current research, we constructed a prognostic nomogram for predicting the overall survival of CRC patients with distant metastasis, and validated the model in both training and validation cohorts. In all, 12 demographic and clinicopathological parameters were identified as independent risk factors for OS. C-index calculation in both the training and validation groups indicated acceptable discrimination of the nomogram. Calibration curves in both groups confirmed the model's capacity to predict 2- and 3-year OS in CRC patients with distant metastasis. Risk stratification of patients according to the weighted risk scores given by the nomogram can effectively distinguish different OS outcomes, suggesting potential application in clinical practice.
From an epidemiological point of view, age has been widely accepted as a major risk factor for sporadic CRC (14). This is consistent with our findings in this study. Previous epidemiological studies suggested that CRC incidence, especially that of large bowel cancer, begins to increase from the age of 40, and age-specific incidence rates keep increasing in the succeeding decades (15,16). In this research, we further showed that not only incidence but also survival outcomes are independently influenced by age. CRC patients older than 55 years had a shorter life expectancy, and those older than 75 years may fare even worse. Different outcomes according to race appear to be attributable to different lifestyles and genetic backgrounds. We were surprised to notice that marital status also contributed to patients' survival outcomes. Previous studies have indicated a correlation between marital status and survival outcomes of cancer patients (17,18). Some of them attributed this connection to socioeconomic status and family care and support. We believe more investigations are required to provide further guidance for social support.
Though the liver is the dominant metastatic site for patients with CRC, brain metastasis turned out to be related to the worst prognosis, followed by bone, liver and lung metastasis. The AJCC guidelines for CRC management have pointed out that regional treatment such as surgery for CRC with isolated liver metastasis may be recommended in combination with systemic chemotherapy (19). Our findings support the proposal that for CRC with isolated metastasis in the liver and lung, relatively aggressive treatment may be considered for optimal survival. The idea that malignancies from solid organs may show organ-specific metastatic tendencies that influence survival outcomes differently has long been raised. We previously reported that the metastatic preference of extrahepatic cholangiocarcinoma ultimately determined different prognoses (20). Depicting the characteristic metastasis patterns of malignancies can be of vital significance in guiding treatment and prognosis prediction.
We also found that different primary sites resulted in different survival outcomes, with the rectum the best and the right colon the worst. Classification of CRC based on primary site has long been a much-discussed issue (21,22). In this research, we defined the site classification following the canonical pattern: the right colon includes the colon from the cecum to the proximal splenic flexure, while the left colon refers to the segments from the distal splenic flexure to the sigmoid colon. Embryologically, the right colon mainly originates from the midgut, whereas the left colon derives from the hindgut. Different histological derivation determines different degrees of malignancy in carcinogenesis. Owing to its thin wall and mucus secretion, malignancy originating from the right colon can be symptom-free in the early phase (23,24). Delayed onset of symptoms means that the cancer is often missed during its early phase.
Both adjuvant and neoadjuvant chemotherapy, as an essential part of systemic treatment for metastatic CRC patients, have been explored in recent decades. Substantial improvement in patients' long-term survival has been achieved with newer chemotherapy regimens such as FOLFOX and FOLFIRI (25). With the assistance of systemic chemotherapy, the indications for surgery in CRC with distant metastasis have also widened. Distant metastasis used to be an absolute contraindication to surgical resection. Now surgical options for CRC with distant metastasis are more radical than ever (26). Yet not all evidence supports that surgical treatment improves the prognosis of all advanced CRC patients. In this research, we found that high-risk CRC patients could not benefit from surgery, suggesting that accurate screening of risk factors is necessary before CRC patients with distant metastasis are considered for surgical intervention. Moreover, the role of locoregional radiotherapy in the treatment of metastatic CRC has been controversial (27). Several RCTs and meta-analyses have debated whether, and to what extent, radiotherapy should be added to the treatment of advanced CRC (28,29). We previously reported that in advanced HCC patients, internal radiation therapy may achieve a better therapeutic effect than external radiation (30). However, in this retrospective study we found radiotherapy to be non-beneficial for high-risk CRC patients, even though it moderately improved prognosis for patients at low and intermediate risk. In conclusion, for high-risk CRC patients, locoregional treatment options including surgery and radiation therapy may not achieve the survival benefit that systemic chemotherapy does. Risk stratification of CRC patients should be evaluated carefully before medical decisions are made.
To the best of our knowledge, this study is among the pioneering work to construct a visualized prognostic model in metastatic CRC patients. Still, as a retrospective study, it has several limitations. Information on treatment provided by the SEER database is general and relatively superficial. Detailed information on doses and regimens of chemotherapy and radiation remains unknown. Moreover, external validation of the nomogram was still performed on cases from the SEER database, so independent external cohorts are required to investigate its performance.
To conclude, in this study an innovative prognostic nomogram was built on data abstracted from the SEER database to predict survival outcomes of patients with metastatic CRC. We anticipate that this prognostic model can be further confirmed by well-designed clinical trials and be of great significance in guiding medical practice and decision making.

Conclusion
We established and validated a novel risk classification model predicting prognosis and locoregional surgery benefit of CRC patients with distant metastasis. This predictive model could be further utilized by physicians and be of great significance for medical practice.

AVAILABILITY OF DATA AND MATERIALS
Publicly available datasets were analyzed in this study. These data can be found here: https://seer.cancer.gov/.