A Nomogram Model Developed and Validated for the Evaluation of Lymph Node Metastasis in Patients with Rectal Cancer

Purpose: The aim of this study was to develop and validate a nomogram model to evaluate lymph node metastasis (LNM) in patients with rectal cancer (RC). Methods: A total of 162 patients with RC treated between 2019 and 2021 were included in the study. Patients were allocated to a training set and a validation set at a ratio of 7:3. The lymph node (LN) status was evaluated retrospectively from magnetic resonance imaging (MRI) images by two radiologists. Based on 103 radiomic features extracted from T2-weighted images (T2WI), the least absolute shrinkage and selection operator (LASSO) was used to screen features and calculate the radiomic feature score (Radscore). The models were constructed using the logistic regression algorithm. The DeLong test and decision curve analysis (DCA) were used to compare the prediction performance and clinical utility of the MRI-reported model, the Radscore model, and the complex model constructed by combining the MRI-reported LN status and the Radscore. A nomogram was constructed to visualize the prediction results of the best model. Model performance was evaluated in the training and validation groups, and the calibration curve and Hosmer-Lemeshow goodness-of-fit test were used to evaluate the calibration. Results: This study included 162 patients with RC, including 54 patients with LNM and 108 patients without LNM. All three models constructed by the logistic regression algorithm were good at identifying LNM. The DeLong test and the DCA results showed that the complex model outperformed the MRI-based model and the Radscore model in predictive performance and clinical utility. The nomogram of the complex model had an area under the curve (AUC) of 0.902 (95% confidence interval (CI): 0.848−0.957) in the training group and an AUC of 0.891 (95% CI: 0.799−0.983) in the validation group. Meanwhile, the calibration curve and the Hosmer-Lemeshow goodness-of-fit test showed good calibration.
Conclusion: The nomogram model constructed based on T2WI radiomics and MRI-reported LN status had good diagnostic efficacy for LNM in patients with RC, and provided a new auxiliary method for accurate and individualized clinical management.


Introduction
Rectal cancer (RC) is one of the most common malignancies of the digestive system, and its incidence and mortality rates have increased rapidly over the past two decades [1]. Currently, patients with RC are classified according to the tumor/lymph node/metastasis (TNM) staging system validated by the American Joint Committee on Cancer [2]. Accurate preoperative identification of lymph node metastasis (LNM) is essential for guiding treatment decisions and predicting patient survival [3][4][5]. For patients with LNM, surgical resection accompanied by lymph node (LN) dissection is necessary; however, surgical treatment is invasive and expensive, and carries an unavoidable risk of postoperative complications. Postoperative mortality for colorectal and rectal cancer surgery has been reported to be approximately 3-6% [6][7]. Endoscopic resection is more appropriate for RC cases without LNM because the risk of distant metastasis without LNM is low (3.6−16.2%) [8]. Thus, patients without LNM may not require additional radical resection and can avoid overtreatment [7]. Patients with LNM have a 5-year survival of 50-68% and a higher risk of locoregional recurrence. In contrast, for patients without LNM, the 5-year survival increases to 95%, and the risk of locoregional recurrence is relatively low [9]. Therefore, the prediction of LNM and the accurate assessment of LN status are essential for treatment decision-making and prognostic assessment in patients with RC.
Currently, magnetic resonance imaging (MRI) is recommended by the European Society for Medical Oncology as part of the standard treatment program for RC [10]. Traditionally, LNs are evaluated based on their size and changes in internal signal; however, because reactive LN hyperplasia can also cause changes in internal structure, it can be difficult to determine whether an LN is metastatic from the change in signal intensity alone [11]. In recent years, the application of diffusion-weighted imaging (DWI) has greatly improved the qualitative diagnostic accuracy of LNM. The LN detection rate using DWI was about 6% higher than that using conventional T2-weighted imaging (T2WI). Seber et al. [12] showed that the apparent diffusion coefficient (ADC) can, to some extent, distinguish between benign and malignant nodes. However, depending on the sample size, the choice of b value, the mathematical model used to calculate the ADC, and the region of interest (ROI), ADC values have shown varying predictive value for LNM in patients with RC. Other studies have explored the diagnostic accuracy for LNM in patients with RC using dynamic contrast-enhanced MRI, magnetic resonance spectroscopy, and blood oxygenation level-dependent MRI; however, no consensus has been reached on these methods, which are strongly affected by the scanning parameters and the technology itself. Although some histopathological findings, such as LN infiltration and tumor differentiation, are known predictors of LNM, they are only available postoperatively [13].
Radiomics is the process of converting medical images into high-dimensional, exploitable data through high-throughput quantitative feature extraction, followed by data analysis for decision-making support [14]. Radiomics has shown promise in assessing tumor heterogeneity, predicting prognosis, and reflecting the tumor microenvironment [15]. Radiomics facilitates the exploration of deeply hidden information in macroscopic medical images to promote precision medicine. Several studies have applied radiomics to LNM in patients with RC; however, a model that is convenient for clinical use in patient management would still be of considerable value. The main aim of this study was to further explore the use of MRI-based radiomics in assessing LNM in patients with RC, and to establish a visual nomogram model.

Patients
This retrospective study was approved by the ethics review board of The First Affiliated Hospital of Harbin Medical University. A total of 290 consecutive patients with RC who were treated between January 2019 and August 2021 were initially enrolled. All patients underwent a rectal MRI examination and a postoperative pathological test. The inclusion criteria were: 1) pathologically confirmed adenocarcinoma < 15 cm from the anal verge, and 2) no history of pelvic surgery. A total of 128 patients were excluded for the following reasons: 1) they underwent neoadjuvant chemoradiotherapy, 2) they had a special histopathological type, including mucinous adenocarcinoma and villotubular adenoma, 3) their MRI scan was not performed or was of poor image quality, or 4) they did not undergo surgery. Ultimately, 162 patients were included in the study. The patients were allocated to a training set (n = 114) and a validation set (n = 48) at a ratio of 7:3 using stratified random sampling. The screening procedure for this study is shown in Figure 1. Baseline prognostic clinicopathological factors, including age, sex, and LN stage, were derived from the patients' electronic medical records.
Every patient fasted for 8 h prior to the scan to empty the intestinal contents. Transverse high-resolution T2-weighted turbo spin echo images were acquired with the following parameters: TR/TE = 4500/110 ms, FOV = 180 × 180 mm², matrix = 320 × 320, slice thickness = 3 mm, gap = 0 mm, acceleration factor = 3, echo train length = 16, and acquisition time = 4 min 10 s.
Region of interest (ROI) delineation was performed by two independent radiologists (reader 1 with 3 years of experience in abdominal imaging, and reader 2 with 8 years of experience in interpreting abdominal MRI) who were aware of the inclusion criteria for the study but were blinded to the other histopathological findings. Manual segmentation may introduce a degree of uncertainty into the determination of the tumor ROI, and some features may be less reproducible when the tumor ROI is manually delineated by different individuals or at different times [16]. To eliminate poorly reproducible features, reader 1 completed the lesion segmentation for all patients. Fourteen days later, reader 2 segmented the ROI in 20 randomly selected patients. The intraclass correlation coefficient (ICC) was used to assess the inter-observer reproducibility of feature extraction. An ICC above 0.75 was considered to indicate good agreement.
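As an illustration of the reproducibility filter described above, the following minimal Python sketch computes a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), for each feature and keeps the features whose ICC exceeds 0.75. The ICC form, the function names, and the toy values are assumptions made for illustration only; the study reports using the R package "irr" and does not state which ICC form was applied.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per patient; each row contains the value of a
    single feature obtained from each reader's segmentation.
    """
    n = len(ratings)        # number of patients
    k = len(ratings[0])     # number of readers
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Mean squares of the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def reproducible_features(feature_ratings, threshold=0.75):
    """Return the names of features whose ICC exceeds the threshold."""
    return [name for name, r in feature_ratings.items() if icc_2_1(r) > threshold]


# Hypothetical toy data: two readers, four patients per feature.
toy = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.1]],
    "unstable_feature": [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]],
}
kept = reproducible_features(toy)
```

In this toy example only the feature on which the two readers nearly agree is retained; the second feature, on which the readers rank the patients in opposite order, is discarded.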
The mean ICC between the two observers was 0.933 ± 0.070. Two gray-level co-occurrence matrix (GLCM) features, ClusterShade and ClusterProminence, were poorly reproducible and were deleted. A total of 101 features were retained.
All data generated or analysed during this study are included in this published article and its supplementary information files.

Radiomics signature building
All features were processed using z-score standardization. The least absolute shrinkage and selection operator (LASSO) method was used to screen the optimal features in the training set. Ten-fold cross-validation was used to compute the optimal lambda. The radiomic signature score (Radscore) was calculated based on the LASSO regression equation.
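The steps above can be written compactly as follows. This is a generic formulation of z-score standardization, LASSO-penalized logistic regression, and a linear signature score, given as a sketch rather than as the authors' exact equation. The symbols ($x_{ij}$ for feature $j$ of patient $i$, $\mu_j$ and $\sigma_j$ for the feature mean and standard deviation, $y_i \in \{0, 1\}$ for LNM status, $n$ patients, $p$ features, penalty $\lambda$) are notation introduced here, and the use of a binomial likelihood is an assumption.

```latex
\begin{align}
z_{ij} &= \frac{x_{ij} - \mu_j}{\sigma_j}, \\
\eta_i &= \beta_0 + \sum_{j=1}^{p} \beta_j z_{ij}, \\
(\hat\beta_0, \hat\beta) &= \arg\min_{\beta_0, \beta}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n}
  \left[ y_i \eta_i - \log\left(1 + e^{\eta_i}\right) \right]
  + \lambda \sum_{j=1}^{p} |\beta_j| \right\}, \\
\mathrm{Radscore}_i &= \hat\beta_0 + \sum_{j=1}^{p} \hat\beta_j z_{ij}.
\end{align}
```

Because the $\ell_1$ penalty shrinks many coefficients exactly to zero, only the features with nonzero $\hat\beta_j$ at the cross-validated $\lambda$ contribute to the Radscore.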

MRI reported
The LN status was evaluated retrospectively by two radiologists with 10 years of experience in abdominal radiodiagnosis. Both radiologists reviewed the images independently and were blinded to each other's assessments. The diagnostic criteria for LN status were: 1) whether the evaluated LNs showed chemical shift effects, 2) a short-axis node diameter > 9 mm, and 3) restricted diffusion on the DWI sequences (the LNs showed a high signal). When the reviewers' opinions differed, agreement on LN status was reached through consultation. The diagnostic results of the two observers were compared with the histopathological results.

Model establishment and comparison
The models were constructed using logistic regression in the training group. The MRI-based model was based on the MRI-reported LN status, the Radscore model was based on the Radscore, and the complex model was based on both the MRI-reported LN status and the Radscore. Model performance was evaluated using receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC). The DeLong test was used to determine whether the AUC values of the three models differed significantly. The clinical utility of the prediction models was determined and compared using decision curve analysis (DCA) by quantifying the net benefit to the patient at different threshold probabilities in the cohort.
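The two evaluation quantities named above can be sketched in a few lines of Python. The AUC is computed as the proportion of positive/negative pairs in which the positive case receives the higher score, and the net benefit follows the usual DCA definition, TP/n − (FP/n) × pt/(1 − pt), at threshold probability pt. The function names and the toy predictions are hypothetical; the study itself used the R packages "pROC" and "dcurves".

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(probs, labels, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy in DCA: treat all patients regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted probabilities and true LNM status (1 = LNM).
probs = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]

model_auc = auc(probs, labels)
model_nb = net_benefit(probs, labels, threshold=0.5)
all_nb = net_benefit_treat_all(labels, threshold=0.5)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and plotting it against the treat-all and treat-none (net benefit of zero) strategies; a model is clinically useful over the threshold range in which its curve lies above both.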

Development and validation of the individualized nomogram
To develop a visually quantitative tool to predict LNM in patients with RC, we developed a nomogram based on the prediction model with the highest AUC value and the greatest clinical utility in the training set. The AUC (95% confidence interval (CI)), sensitivity, specificity, and accuracy of the model were calculated in the training and validation sets. Calibration curves were plotted with bootstrapping (1,000 bootstrap resamples) to assess the calibration of the nomogram in the internal (training set) and external (validation set) validation. The Hosmer-Lemeshow test was used to assess the goodness of fit of the nomogram model.
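The Hosmer-Lemeshow test mentioned above can be sketched as follows. Patients are ordered by predicted risk and divided into G groups, and the statistic sums (O − E)² / (E(1 − E/n)) over groups, where O and E are the observed and expected numbers of events and n is the group size; it is then referred to a χ² distribution with G − 2 degrees of freedom. The choice G = 10, the function names, and the closed-form tail probability (valid for even degrees of freedom only) are assumptions of this sketch; the study used the R package "ResourceSelection".

```python
import math


def hosmer_lemeshow(probs, labels, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups."""
    pairs = sorted(zip(probs, labels), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        stat += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half = x / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


# Toy check: two groups in which observed and expected events coincide.
toy_stat = hosmer_lemeshow([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1], groups=2)

# Tail probabilities for the statistics reported in this article, with df = 8.
p_train = chi2_sf_even_df(6.533, 8)
p_valid = chi2_sf_even_df(9.116, 8)
```

With 8 degrees of freedom, the statistics reported in this article (χ² = 6.533 and χ² = 9.116) give tail probabilities of approximately 0.588 and 0.333, which agrees with the reported P values and is consistent with the use of 10 risk groups.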

Statistical analysis
All statistical analyses and model building were performed in R (version 3.6.3, http://www.r-project.org). The R package "caret" was used to randomly allocate patients to the training and validation groups. Clinical data were expressed as mean ± standard deviation or percentage. An independent samples t-test or the Wilcoxon test was used for continuous variables, and Fisher's exact test or the χ² test was used for categorical variables. The ICC analysis was performed using the R packages "readr" and "irr". LASSO regression and logistic regression model building were performed using the R package "glmnet". ROC curve analysis was performed using the R package "pROC". The nomogram model and the calibration curve were constructed using the R package "rms". The Hosmer-Lemeshow goodness-of-fit test was performed using the R package "ResourceSelection". The DCA curves were plotted using the R package "dcurves". A two-tailed P < 0.05 indicated statistical significance.

Patient characteristics
The cohort consisted of 162 patients with RC, including 57 females (35.2%) and 105 males (64.8%), with a mean age of 63.12 ± 9.95 years. Of those, 54 had LNM and 108 did not. The cohort was randomly divided into a training cohort (n = 114) and a validation cohort (n = 48) at a ratio of 7:3. The clinical characteristics of the 162 patients in the training and validation cohorts are summarized in Table 1. There were no statistical differences in age (P = 0.335), sex (P = 0.389), or N stage (P = 1.000) between the training and the validation cohorts.
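The stratified 7:3 allocation can be illustrated with a short Python sketch. The study reports using the R package "caret" for this step; the code below is not that implementation, and the function name, the random seed, and the rounding rule are assumptions chosen for illustration.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into training and validation sets within each class."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Assumed rounding rule: nearest integer to train_fraction * class size.
        cut = round(train_fraction * len(idx))
        train.extend(idx[:cut])
        valid.extend(idx[cut:])
    return sorted(train), sorted(valid)


# 54 patients with LNM (1) and 108 without LNM (0), as reported above.
labels = [1] * 54 + [0] * 108
train_idx, valid_idx = stratified_split(labels)
```

Under this rounding rule, the 54 LNM and 108 non-LNM patients yield 38 + 76 = 114 training cases and 16 + 32 = 48 validation cases, which matches the reported set sizes. The per-class counts themselves are not stated in the text.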

Performance and clinical utility of the prediction models
The performance of the three models in predicting LNM in patients with RC was evaluated by ROC curves and compared using the DeLong test. The performance of the prediction models in identifying LNM is shown in Figure 4A. The MRI-based model, Radscore model, and complex model all performed well in discriminating LNM, with AUC values of 0.882, 0.728, and 0.902, respectively. The DeLong test showed that the AUC value of the complex model was significantly higher than those of the MRI-based model (P = 0.001) and the Radscore model (P < 0.001). The MRI-based model had a higher AUC than the Radscore model, but the difference was not significant (P = 0.159).
Comparisons of the clinical utility of the models were performed using DCA. The results revealed that the complex model outperformed the MRI-based model and the Radscore model over a wide range of threshold probabilities (Figure 4B). Therefore, the complex model was the most reliable clinical management tool for predicting LNM in patients with RC.
Figure 5B and C). The Hosmer-Lemeshow goodness-of-fit test showed that there was no significant difference between the predicted and observed values in either the training cohort (χ² = 6.533, P = 0.588) or the validation cohort (χ² = 9.116, P = 0.333), indicating a good fit.

Discussion
Because the presence of LNM is an important factor in the recurrence of colorectal cancer (CRC), determining the presence of LNM is important for clinical management and the prediction of survival in patients with CRC [17]. However, the diagnostic efficiency of the TNM staging system remains inadequate in that it cannot fully support the selection of preoperative treatment options [18]. Moreover, pathological LNM can only be sufficiently confirmed by adequate intraoperative dissection of 12 LNs; thus, the LN status may be determined inaccurately in patients who are inoperable or in whom too few LNs are dissected.
Thus, more reliable quantitative detection of LNM may provide a means of determining the optimal treatment for patients with RC.
Radiomics is a recently developed approach that extracts a large number of quantitative features from medical images and comprehensively evaluates tumor heterogeneity. Radiomic characteristics (intensity, shape, texture, or wavelet) provide information on the cancer phenotype and tumor microenvironment that is different from, but complementary to, other relevant data sources [15]. The results of numerous studies have suggested a potential correlation between the radiomic characteristics of primary tumors and LNM [19]. The present study also confirmed that the Radscore constructed from T2WI images differed significantly between the LN states of RC (P < 0.05). These results suggest that radiomic features are potential biomarkers for predicting LNM in patients with RC and support the use of radiomics to predict LN status. It is worth noting, however, that the value of assessing LN status using radiomic features alone was limited: although the model constructed using the Radscore alone was able to predict LNM in patients with RC, its AUC was lower than that of the MRI-based model. Ma et al. compared multiple classifier models for N staging and found that the random forest classifier had the best diagnostic efficiency; however, its AUC was only 0.74 [6]. Therefore, we believe that the value of radiomics alone as a marker of LNM needs to be further confirmed.
The assessment of LN status by conventional T2WI is based on changes in the size, morphology, and signal intensity of the LN. The diagnostic results are highly subjective, which leads to low accuracy and reproducibility. With the development of functional MRI, studies have shown that the detection rate of DWI for LNM was about 6% higher than that of conventional T2WI [20]. In our study, two experienced radiologists assessed LNM based on a combination of T2WI and DWI. As a result, the prediction based on the MRI model was good (AUC: 0.882), suggesting that MRI plays a critical role in the detection of LNM. However, these findings do not mean that the T2WI + DWI imaging model is without drawbacks. Seber et al. reported that the ADC of benign LNs was higher than that of malignant nodes, and that at an ADC of 0.8 × 10⁻³ mm²/s, the sensitivity, specificity, and accuracy for diagnosing LNM were 76.4%, 85.7%, and 80.6%, respectively. Those data indicate that DWI contributes to the diagnosis of LNM. However, this diagnostic method remains insufficient because the ADC values of non-metastatic and metastatic LNs overlap, and hence it cannot fully distinguish benign from malignant LNs [12].
To build a more accurate model, we found that the predictive effect and clinical utility of the complex model combining the Radscore and MRI-based  [13]. Another study performed similar evaluations using MRI, where clinical risk factors were combined with high-resolution MRI factors and radiomic features to achieve good results (AUC: 0.900, 0.870) [21].
of interest from the ROIs of continuous slices may not accurately represent the true shape of the primary lesion [20,4] due to the growth properties of rectal cancer. Therefore, segmenting the slice that shows the largest tumor area may be a more appropriate approach.
This study had the following limitations. First, the sample size was not sufficiently large and should be expanded to reduce the impact of the data size on the accuracy of the results. Second, manual segmentation was used when delineating the ROI. Compared with semi-automatic and automatic segmentation methods, manual ROI segmentation introduces more subjectivity, which may affect the accuracy of the extracted radiomic features. Last, the proportion of patients with LNM was low, which resulted in an unbalanced sample.

Conclusion
In conclusion, the nomogram model constructed based on T2WI radiomics and MRI-reported LN status had good diagnostic efficacy for LNM in patients with RC, and provided a new option for precise, personalized clinical management.

The LASSO algorithm and 10-fold cross-validation were used to extract the optimal subset of radiomic features. A. Optimal feature selection according to the AUC value. When log(lambda) increased to 0.018, the AUC reached its peak, corresponding to the optimal number of radiomic features. B. LASSO coefficient profiles of the 101 radiomic features. The vertical line was drawn at the value selected by 10-fold cross-validation, where the optimal lambda resulted in 12 nonzero coefficients. LASSO: least absolute shrinkage and selection operator; AUC: area under the receiver operating characteristic curve.