Screening and Validation of a Novel T Stage-Lymph Node Ratio Classification for Colon Cancer

Purpose: Lymph node ratio (LNR) has advantages over the American Joint Committee on Cancer (AJCC) N stage in predicting prognosis. However, the prognostic value of a novel T stage-lymph node ratio (TLNR) classification for colon cancer that combines LNR and T stage is currently unknown. Methods: We included 62,294 stage I-III colon cancer patients from the SEER database as a training cohort. External validation was performed in 3,327 additional patients. A novel LNR stage was established and combined with T stage into a novel TLNR classification. Patients with similar survival were grouped according to T and LNR stages, with T1LNR1 as the reference. Results: We developed a novel TLNR classification: stages I (T1LNR1-2, T1LNR4), IIA (T1LNR3, T2LNR1-2, T3LNR1), IIB (T1LNR5, T2LNR3-4, T3LNR2, T4aLNR1), IIC (T2LNR5, T3LNR3-4, T4aLNR2, T4bLNR1), IIIA (T3LNR5, T4aLNR3-4, T4bLNR2), IIIB (T4aLNR5, T4bLNR3-4), and IIIC (T4bLNR5). In the training cohort, the TLNR had better prognostic discrimination [area under the receiver operating characteristic curve (AUC), 0.621 vs. 0.608, P < 0.001], superior model-fitting ability in predicting overall survival [Akaike information criterion (AIC), 561,129 vs. 562,052], and better net benefits than the AJCC 8th tumor/node/metastasis (TNM) classification. These results were successfully validated for both overall and disease-free survival in an independent validation set.


Background
Colon cancer is one of the most frequently diagnosed cancers and has become a health burden worldwide (Siegel et al. 2020). The American Joint Committee on Cancer (AJCC) tumor/node/metastasis (TNM) classification has been the most important prognostic assessment tool for colon cancer (Amin MB 2017). However, the current AJCC 8th TNM classification has a limited ability to predict survival; for example, some stage III patients have a better prognosis than some stage II patients (Amin MB 2017; Chu et al. 2016; Kim et al. 2015). Several explanations for this paradox have been proposed. It was suggested that, in the TNM staging system, the pT stage carries much less weight than the pN stage (Li et al. 2014; Li et al. 2016). However, it was also shown that the pT stage is comparable in importance to the pN stage, given that T4N0 colon cancer patients had significantly poorer survival than T1-2N1-2a patients, regardless of the number of retrieved lymph nodes (Chan et al. 2019; Rottoli et al. 2012).
Patient survival is also affected by the total number of retrieved lymph nodes, possibly because of the therapeutic benefits of an optimal lymphadenectomy or because more harvested lymph nodes allow more accurate staging; which explanation holds remains controversial. To reduce stage migration, at least 12 lymph nodes should be retrieved to ensure optimal staging (Amin MB 2017). However, the average number of retrieved lymph nodes is often fewer than 12 (Prandi et al. 2002; Wong et al. 2007), because many factors may be associated with the total number of retrieved lymph nodes, such as surgical skill, surgical technique, the way the pathologist collects the lymph nodes, the actual number of regional lymph nodes surrounding the tumor, and the patient's immune response (Simunovic and Baxter 2007). The lymph node ratio (LNR) has been proposed to reduce stage migration (Berger et al. 2005; Rosenberg et al. 2010; Rosenberg et al. 2008). LNR is defined as the ratio of the number of metastatic lymph nodes to the total number of retrieved lymph nodes. It takes into account both the number of positive lymph nodes and the total number of retrieved lymph nodes, and it has been reported to have higher predictive accuracy than the pN stage, especially when an insufficient number of lymph nodes is retrieved (Pei et al. 2019).
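As an illustration of the definition above, the following minimal Python sketch computes LNR from node counts. The function name `lymph_node_ratio` and the example counts are hypothetical and are not taken from the study data.

```python
def lymph_node_ratio(positive_nodes: int, retrieved_nodes: int) -> float:
    """Ratio of metastatic (positive) lymph nodes to all retrieved lymph nodes."""
    if retrieved_nodes <= 0:
        raise ValueError("at least one lymph node must be retrieved")
    if not 0 <= positive_nodes <= retrieved_nodes:
        raise ValueError("positive nodes must lie between 0 and the retrieved count")
    return positive_nodes / retrieved_nodes


# Two hypothetical patients with the same number of positive nodes
# but different numbers of retrieved nodes.
sparse_harvest = lymph_node_ratio(2, 8)   # 2 of 8 nodes positive
ample_harvest = lymph_node_ratio(2, 24)   # 2 of 24 nodes positive
```

Both hypothetical patients have the same positive-node count, so a staging rule based on that count alone cannot separate them, whereas their LNR values differ.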
The prognostic advantages of LNR in colorectal cancer have been widely confirmed (Berger et [...] (2018)). We compared the discrimination performance, model-fitting ability, and net benefits of the novel TLNR classification with those of the AJCC 8th TNM classification in the training cohort (SEER), and further validated the prognostic capacity of the novel TLNR classification in an independent validation cohort.

Patients and eligibility criteria
Operable stage I-III colon cancer patients were included from the SEER database as a training cohort (Howlader et al. 2015), which was mainly used to develop the novel LNR stage and TLNR classification. The eligibility criteria were: (1) primary and single colon cancer; (2) necessary information was available; (3) no distant metastasis (M0); (4) met the criteria for pathologic staging; (5) underwent surgical treatment; (6) follow-up of at least five years or until death; (7) postoperative survival time of more than 1 month; (9) aged at least 18 years (Supplementary Figure 1). The last date of follow-up for the SEER cohort was December 2015. The data-use agreement for the SEER 1973-2015 research data file was approved.
The external validation was conducted using the database of China Medical University Cancer Hospital, which was used to validate the predictive performance of the novel TLNR classification. The eligibility criteria for the external validation cohort were the same as those for the training cohort. The last date of follow-up was January 2020. The ethical review was approved by the Institute Ethics Committees of China Medical University Cancer Hospital (20210206K), and written informed consent was obtained.
Colon cancer with distant metastasis (M1) is widely considered the most advanced stage, with the poorest prognosis, and is generally considered incurable. Therefore, we included only colon cancer patients who underwent curative surgical treatment in this study. Throughout, T1-4b and N0-2b are used as shorthand for pT1-4b and pN0-2b in both the TNM and the novel TLNR classifications.

Statistical Analysis
Overall survival (OS) was calculated from surgery until death from any cause, and disease-free survival (DFS) was calculated from surgery to the identification of cancer recurrence and/or metastasis, or until death if no recurrence or metastasis occurred before death. Kaplan-Meier survival curves with log-rank tests were used to analyze differences in overall and disease-free survival rates. Cox proportional hazards models were applied to estimate hazard ratios (HRs) with 95% confidence intervals (CIs).
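The Kaplan-Meier estimate behind such survival curves can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the SPSS or R routines used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    times  : follow-up time for each patient
    events : 1 if the endpoint (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (months of follow-up and event indicators), not study data.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

With the toy data the estimate steps down at months 1, 2, and 3, and the censored patients at months 2 and 4 only reduce the number at risk.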

Establishment of a novel LNR stage
In the training cohort, all patients were classified into 21 groups (LNR from 0 to 1) in units of 0.05. A Cox proportional hazards model was fitted to estimate HRs for all 21 groups (LNR = 0 as the reference), and the 21 groups were sorted according to their HR values from the lowest (LNR = 0) to the highest (LNR > 0.95). Then, log-rank tests for overall survival were conducted between each pair of sequential LNR groups, generating 20 χ² values. The four largest χ² values were identified as the cutoff values. Finally, using these four χ² cutoff values, we created five categories and developed the novel LNR stage, which parallels the AJCC 8th pN stage.
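The grouping and cutoff selection described above can be sketched as follows. This is a minimal illustration, assuming that group 0 is LNR = 0 and that group i covers the interval ((i-1) × 0.05, i × 0.05]; the function names and the χ² values are hypothetical placeholders, since the actual χ² values are reported in the tables rather than in the text.

```python
import heapq


def lnr_group(positive_nodes: int, retrieved_nodes: int) -> int:
    """Map node counts to one of 21 LNR groups (0..20) in units of 0.05.

    Integer arithmetic (a ceiling division) avoids floating-point edge cases.
    """
    if retrieved_nodes <= 0 or not 0 <= positive_nodes <= retrieved_nodes:
        raise ValueError("invalid node counts")
    return (20 * positive_nodes + retrieved_nodes - 1) // retrieved_nodes


def select_boundaries(chi2_between_adjacent, n_cutoffs):
    """Pick the adjacent-group comparisons with the largest chi-square values.

    Entry i compares group i with group i + 1, so a returned index i means
    'split between group i and group i + 1'.
    """
    indices = range(len(chi2_between_adjacent))
    top = heapq.nlargest(n_cutoffs, indices, key=chi2_between_adjacent.__getitem__)
    return sorted(top)


def stage_of(group: int, boundaries) -> int:
    """Stage number (1-based) of a group given the selected split points."""
    return 1 + sum(1 for b in boundaries if b < group)


# Hypothetical chi-square values for the adjacent pairs of 21 ordered groups.
toy_chi2 = [50.0, 3.1, 9.8, 2.0, 1.5, 7.7, 0.9, 1.1, 0.4, 6.3,
            0.8, 0.2, 0.5, 0.3, 0.6, 0.1, 0.7, 0.2, 0.3, 0.4]
toy_boundaries = select_boundaries(toy_chi2, 4)   # four cutoffs -> five stages
```

With these placeholder values, a patient with 2 positive nodes out of 24 falls in group 2 and a patient with 2 out of 8 falls in group 5; the stage each group receives depends entirely on where the largest χ² values fall.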

Establishment of a novel TLNR classification
In the training cohort, the novel LNR and pT stages were combined into 25 groups, and T1LNR1 was selected as the reference for the HR values. The HR values of the 25 T stage and LNR stage combinations were ordered from the lowest (T1LNR1) to the highest (T4bLNR5) (Table 1). Then, log-rank tests for overall survival were conducted between each pair of sequential groups, generating 24 χ² values. Among the 24 χ² values, the six largest were identified as cutoff values (Table 1). Finally, using these six χ² values, we created the seven categories of the novel TLNR classification, which parallels the AJCC 8th classification.
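For readers who want to apply the resulting grouping, the seven TLNR stages reported in the abstract can be encoded as a lookup table. The sketch below is only one possible encoding; the names `TLNR_SPEC`, `expand`, and `tlnr_stage` are hypothetical.

```python
import re

# Stage definitions exactly as listed in the abstract.
TLNR_SPEC = {
    "I": "T1LNR1-2, T1LNR4",
    "IIA": "T1LNR3, T2LNR1-2, T3LNR1",
    "IIB": "T1LNR5, T2LNR3-4, T3LNR2, T4aLNR1",
    "IIC": "T2LNR5, T3LNR3-4, T4aLNR2, T4bLNR1",
    "IIIA": "T3LNR5, T4aLNR3-4, T4bLNR2",
    "IIIB": "T4aLNR5, T4bLNR3-4",
    "IIIC": "T4bLNR5",
}


def expand(spec: str):
    """Yield (T stage, LNR stage) pairs from entries such as 'T2LNR3-4'."""
    for item in spec.split(","):
        match = re.fullmatch(r"(T\d[ab]?)LNR(\d)(?:-(\d))?", item.strip())
        if match is None:
            raise ValueError(f"unrecognized entry: {item!r}")
        low = int(match.group(2))
        high = int(match.group(3) or low)
        for lnr in range(low, high + 1):
            yield match.group(1), lnr


TLNR_LOOKUP = {
    combo: stage for stage, spec in TLNR_SPEC.items() for combo in expand(spec)
}


def tlnr_stage(t_stage: str, lnr_stage: int) -> str:
    """Return the TLNR stage for a pT stage ('T1'..'T4b') and LNR stage (1..5)."""
    return TLNR_LOOKUP[(t_stage, lnr_stage)]
```

The table covers all 25 combinations of the five pT categories and five LNR stages, so `tlnr_stage("T3", 2)` returns `"IIB"`.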
The model discrimination performance and model-fitting ability of the novel LNR stage versus previously reported LNR stages, and of the novel TLNR versus the AJCC 8th TNM classification, were assessed by the area under the receiver operating characteristic (ROC) curve (AUC) and the Akaike information criterion (AIC), respectively. A higher AUC value suggested better discrimination performance, and a lower AIC value indicated superior model-fitting ability (Hanley and McNeil 1982). Statistically significant differences in AUCs were confirmed using Hanley and McNeil tests (Hanley and McNeil 1982). Clinical benefits were evaluated by decision curve analyses (Fitzgerald et al. 2015). In addition, the prognostic discrimination performance of the novel LNR stage and the novel TLNR classification was further assessed based on 5-year OS and DFS rates, log-rank tests, and HRs from Cox proportional hazards models.
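The two criteria can be made concrete with a short Python sketch. It assumes the standard definition AIC = 2k - 2 ln L (for a Cox model, L is typically the partial likelihood) and the standard-error formula from Hanley and McNeil (1982); the function names are hypothetical, and the study itself used SPSS, R, and MedCalc rather than this code.

```python
import math


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike information criterion: lower values indicate a better fit."""
    return 2 * n_parameters - 2 * log_likelihood


def hanley_mcneil_se(auc: float, n_events: int, n_non_events: int) -> float:
    """Standard error of an AUC following Hanley and McNeil (1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_events - 1) * (q1 - auc ** 2)
        + (n_non_events - 1) * (q2 - auc ** 2)
    ) / (n_events * n_non_events)
    return math.sqrt(variance)


# AIC values reported for overall survival in the training cohort.
aic_tnm, aic_tlnr = 562_052, 561_129
aic_difference = aic_tnm - aic_tlnr   # positive: the TLNR model has the lower AIC
```

The reported AIC values differ by 923 in favor of the TLNR classification. Comparing two AUCs estimated on the same patients additionally requires the correlation between them, which is not sketched here.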
Data were extracted from the SEER database using SEER*Stat version 8.3.5. Statistical analyses were conducted using SPSS version 22.0 and R version 3.5.3. Hanley and McNeil tests were conducted with MedCalc version 18.11.3. All tests were two-sided, and a P value < 0.05 was defined as statistically significant.

Patient characteristics
A total of 62,294 patients with operable stage I-III colon cancer were finally included from the SEER database as a training cohort (Supplementary Figure 1). In addition, 3,327 patients with operable stage I-III colon cancer from China Medical University Cancer Hospital were included as an external validation cohort. The baseline characteristics of the training and validation cohorts are presented in Supplementary Table 1. The mean ages (± SD) were 68.1 (± 13.8) and 59.9 (± 11.6) years in the training and validation cohorts, respectively. The mean number (± SD) of retrieved lymph nodes was 17.2 (± 9.6) and 16.7 (± 10.0) in the training and validation cohorts, respectively. A total of 26.8% of patients in the training cohort and 31.6% of patients in the validation cohort had fewer than 12 retrieved lymph nodes. [...] Table 2). There were two previous LNR stages, which we named LNR-Berger (Berger et al. 2005) and LNR-Rosenberg [...] Table 3). Similar findings were observed in patients with < 12 and ≥ 12 retrieved lymph nodes (Supplementary Table 3).

TLNR classification versus the AJCC 8th TNM classification
In the training cohort, model discrimination and model fit were compared between the novel TLNR and the AJCC 8th TNM classifications. Kaplan-Meier curves with log-rank tests confirmed that the novel TLNR classification showed better model discrimination performance than the AJCC 8th TNM classification: the 5-year overall survival rates steadily decreased as TLNR stage increased, and HRs increased as stage increased (HRs for TLNR stages I to IIIC: 1.00, 1.48, 2.13, 3.07, 4.87, 6.94, and 9.70) (Table 2, Figure 2A, 2B). The novel TLNR showed better prognostic discrimination (AUC, 0.621 vs. 0.608; Hanley and McNeil test, P < 0.001) and superior model-fitting ability (AIC, 561,129 vs. 562,052) than the AJCC 8th TNM classification for overall survival (Table 3). Similar findings were observed in patients with adequate (≥ 12) or inadequate (< 12) retrieved lymph nodes (Table 3). We further performed decision curve analyses to assess clinical utility, and the novel TLNR classification had superior net benefits over the AJCC 8th TNM classification at threshold probabilities of 30-45% in the training cohort (Supplementary Figure 3A). The details of the novel TLNR and the AJCC 8th TNM classifications are presented in Figure 3.
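The net benefit plotted in a decision curve is commonly computed as the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The sketch below assumes that standard definition; the function name and the counts are hypothetical and are not taken from the study.

```python
def net_benefit(true_positives: int, false_positives: int,
                n_patients: int, threshold: float) -> float:
    """Net benefit of a classification rule at a given threshold probability.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    weight = threshold / (1 - threshold)
    return true_positives / n_patients - false_positives / n_patients * weight


# Hypothetical counts: 100 patients, 30 correctly and 20 incorrectly flagged
# as high risk, evaluated across part of the 30-45% threshold range.
toy_curve = [(pt, net_benefit(30, 20, 100, pt)) for pt in (0.30, 0.35, 0.40, 0.45)]
```

For fixed counts the net benefit falls as the threshold probability rises, because each false positive is penalized more heavily; a decision curve compares such values between competing classifications at each threshold.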

External validation
In the external validation cohort, the TLNR still showed better model discrimination performance than the [...] (Table 3). Similar findings were observed in patients with inadequate lymph nodes retrieved (< 12) but not in patients with adequate lymph nodes retrieved (≥ 12), suggesting that the advantages of the novel TLNR classification are most pronounced in patients with inadequate lymph nodes retrieved (Table 3). The decision curve analyses further revealed that the TLNR had superior net benefits [...]

The AJCC TNM classification of colon cancer has long been considered to have a limited ability to predict survival, in that some stage III patients have a better prognosis than some stage II patients (Amin MB 2017; Chu et al. 2016; Kim et al. 2015). It was previously believed that stage migration resulting from an inadequate number of retrieved lymph nodes might be one reason (Hari et al. 2013; O'Connell et al. 2004). Some experts thought that patient survival was affected by the total number of retrieved lymph nodes and that therapeutic benefits could be obtained from an optimal lymphadenectomy. Others believed that this survival benefit might only be due to more accurate staging from a larger number of harvested lymph nodes. However, even among patients with adequate lymph nodes, many stage III patients still have better survival than stage II patients, so the number of retrieved lymph nodes alone cannot explain this paradox.
However, even when every effort is made, the total number of retrieved lymph nodes is frequently inadequate: 26.8% of patients in the training cohort and 31.6% in the validation cohort had an inadequate number of retrieved lymph nodes, which is similar to previous reports (Prandi et al. 2002; Wong et al. 2007). This might be associated with multiple factors, including surgical skill, surgical technique, the way the pathologist collects the lymph nodes, the actual number of regional lymph nodes surrounding the tumor, and even [...] The LNR takes into account the influence of both the number of positive lymph nodes and the number of examined lymph nodes on the stage, and it has been shown to have a prognostic advantage over the AJCC pN stage for colon cancer (Berger et al. 2005; Rosenberg et al. 2010; Rosenberg et al. 2008). However, the prognostic value of a novel TLNR classification for colon cancer that combines LNR and pT stages was still unknown. We established a novel LNR stage with better prognostic discrimination than the AJCC 8th pN stage, and our novel LNR stage showed prognostic discrimination comparable to that of the LNR stages from previous studies (Berger et al. 2005; Rosenberg et al. 2010; Rosenberg et al. 2008). Having confirmed that the LNR stage was better than the pN stage, we established a novel TLNR classification for colon cancer by combining LNR and pT stages, and confirmed that it showed superior prognostic discrimination, model-fitting ability, and clinical usefulness compared with the AJCC 8th TNM classification, especially in patients with inadequate lymph nodes retrieved.
The performance of a classification can be evaluated by the homogeneity within subgroups, the ability to distinguish between different groups, and the monotonicity of the gradient shown by the correlation between stages and survival (Ueno et al. 2001). The novel TLNR classification has several advantages over the AJCC 8th TNM classification. First, HRs and 5-year overall survival rates differed significantly between each pair of stage groups in the novel TLNR classification, suggesting an enhanced stratification ability. Second, the AUCs of the novel TLNR classification were significantly higher than those of the AJCC 8th TNM classification, indicating better prognostic discrimination. Third, the TLNR classification showed superior net benefits compared with the AJCC 8th TNM classification in decision curve analysis. Stratified analyses further confirmed that the novel TLNR classification had good model applicability, especially in patients with inadequate lymph nodes retrieved. We further validated these findings for disease-free survival, and the novel TLNR classification still showed better predictive performance than the AJCC 8th TNM classification. Because the findings were based on a large SEER training set and validated in an external validation set, they should be considered reliable, and they suggest that the TLNR is a more reasonable classification than the AJCC 8th TNM classification. It should be considered a better alternative to the AJCC 8th TNM classification for stratifying patients, especially those with inadequate lymph nodes retrieved.
To the best of our knowledge, this study is the first to establish a TLNR classification for colon cancer by combining pT and LNR stages. In addition, this study was based on a training cohort with a large population and was successfully validated in an external validation cohort. However, the novel TLNR classification was established using only LNR and pT stages, whereas surgical strategies, adjuvant chemotherapy regimens (Lai et al. 2016; Sineshaw et al. 2018), and molecular markers such as microsatellite instability, KRAS, and BRAF can also affect prognosis. Future studies are still required to validate the novel TLNR classification.

Conclusions
In conclusion, the novel TLNR classification provides better prognostic performance for operable stage I-III colon cancer than the AJCC 8th TNM classification, especially for patients with inadequate lymph nodes retrieved. It is a prognosis-based classification that stratifies patients better and can be considered a good alternative to the current AJCC 8th TNM classification for operable colon cancer patients.
