Prognostic Value of Multiparametric MRI-Based Radiomics Model: Potential Role for Chemotherapeutic Benefits in Locally Advanced Rectal Cancer

BACKGROUND AND PURPOSE
We aimed to develop a radiomics model for the prediction of survival and chemotherapeutic benefits using pretreatment multiparameter MR images and clinicopathological features in patients with locally advanced rectal cancer (LARC).


MATERIALS AND METHODS
Radiomics features were extracted from the whole tumor on T2-weighted, contrast-enhanced T1-weighted, and ADC images of 186 consecutive patients with LARC. Feature selection was based on feature stability and the Boruta algorithm. Radiomics signatures for predicting disease-free survival (DFS) were then generated using the selected features. A radiomics nomogram combining the signature with clinical risk factors was constructed using a Cox proportional hazards regression model. Predictive performance was evaluated by Harrell's concordance index (C-index) and time-dependent receiver operating characteristic (ROC) analysis.


RESULTS
Four features were selected to construct the radiomics signature, which was significantly associated with DFS (P < 0.001). The radiomics nomogram, incorporating the radiomics signature and two clinicopathological variables (pN and tumor differentiation), exhibited better prediction performance for DFS than the clinicopathological model, with C-indices of 0.780 (95% CI, 0.718-0.843) and 0.803 (95% CI, 0.717-0.889) in the training and validation cohorts, respectively. The radiomics nomogram-defined high-risk group had shorter DFS, DMFS, and OS than the low-risk group (all P < 0.05). Further analysis showed that patients with higher nomogram-defined scores exhibited a favorable response to adjuvant chemotherapy (AC), whereas low-risk patients did not.


CONCLUSION
This study demonstrated that the newly developed pretreatment multiparameter MRI-based radiomics model could serve as a powerful predictor of prognosis, and may act as a potential indicator for guiding AC in patients with LARC.


Background
Outcomes for patients with locally advanced rectal cancer (LARC) have improved over past decades, in part because of the use of neoadjuvant chemoradiotherapy (NCRT) followed by total mesorectal excision (TME) surgery (1,2). The local recurrence (LR) rate has been dramatically reduced to less than 10%.
However, these strategies have not necessarily improved survival, and distant metastasis (DM) remains the major cause of treatment failure in LARC (3,4). Additionally, adjuvant chemotherapy (AC) has been recommended for LARC to reduce the incidence of DM. However, only certain subgroups of patients benefit from AC. Some studies have indicated that AC is unnecessary in patients with pathological complete response (pCR) (5). Therefore, accurate prediction of patient survival and guidance of AC are of paramount importance for devising appropriate treatment and improving the prognosis of LARC.
Currently, risk assessment for survival in LARC is primarily based on the traditional TNM staging system (6)(7)(8). However, these clinicopathological risk factors provide inadequate prognostic information and do not evaluate the intrinsic biological heterogeneity of LARC. Moreover, the TNM system ignores potential pathological risk factors such as tumor regression grade (TRG) and extramural venous invasion (EMVI), as well as the more comprehensive information on the entire tumor that can be obtained from multiparameter MRI (4,9,10). Hence, the identification of new noninvasive prognostic biomarkers that allow assessment of tumor heterogeneity might be helpful for personalized medicine (11).
Nowadays, MRI is widely used in the clinical workup to diagnose, stage, and monitor treatment response in rectal cancer, and can detect several prognostic factors (10,12). Radiomics, which converts these images into mineable high-dimensional features, has recently emerged as a promising method to evaluate tumor heterogeneity, as medical imaging can provide information regarding the underlying pathophysiology (13,14). This strategy has been successfully used for tumor characterization, gene prediction, and therapy guidance in rectal cancer (15)(16)(17)(18)(19). Few studies have evaluated the value of MR image heterogeneity, using texture or radiomics analyses, for predicting survival in LARC (20)(21)(22). However, they may suffer from small sample sizes and from analysis of a single sequence or a single section of the tumor rather than the whole-tumor volume.
Therefore, the present study aims to determine the association between a pretreatment multiparameter MRI-based radiomics signature and survival outcomes in patients with LARC, and further to establish a radiomics nomogram that incorporates the radiomics signature and clinicopathological findings for the individual prediction of survival.

Participant inclusion
Patients with rectal cancer treated at our hospital between December 2012 and November 2016 were reviewed and included in this study if they: (i) had locally advanced rectal adenocarcinoma (staged on pre-CRT T2-weighted MR images as cT3/T4 and/or N-category positive); (ii) underwent pretreatment multiparameter MRI, including high-resolution T2WI, DWI, and contrast-enhanced T1-weighted imaging; and (iii) were treated with long-course NCRT followed by TME surgical resection. The flow chart of the patient recruitment pathway is presented in Fig. 1. Institutional review board approval was acquired for this study, and written informed consent was waived owing to its retrospective nature.
Finally, 186 patients were enrolled and randomly divided into the training cohort (n = 131) and the validation cohort (n = 55) at a ratio of 7:3. Baseline characteristics, including age, gender, clinical T stage, N stage, histological grade, carcinoembryonic antigen (CEA), and carbohydrate antigen 19-9 (CA199), were obtained from medical records.

Neoadjuvant treatment and Surgery
All patients underwent three-dimensional conformal radiation therapy (gross tumor volume, 50-55 Gy; clinical target volume, 1.8-2.0 Gy; a total of 22 fractions). Capecitabine (800 mg/m² orally twice daily) was administered concomitantly with radiation therapy. TME surgery was performed within 8 to 10 weeks (55 days on average, range 50-64 days) after the completion of NCRT. Afterward, patients received AC for 6 months with a FOLFOX or XELOX regimen. Reasons for not receiving AC included age, organ dysfunction suggestive of intolerance to treatment, and the individual patient's refusal.
Surgically resected specimens were evaluated by two dedicated pathologists according to the seventh edition of the American Joint Committee on Cancer (AJCC) TNM system (23). Tumor staging, lymph node involvement, lymphovascular invasion (LVI), and perineural invasion (PNI) were retrospectively collected. Additionally, the pathological TRG was evaluated according to the Mandard method (24).

Clinical endpoints and follow-up
Patients were followed up regularly after surgery, at 3-month intervals for the first 2 years, then at 6-month intervals for the 3rd to 5th years, and annually thereafter. The main endpoint of this study was disease-free survival (DFS), which was measured from the date of surgery until disease progression, death from any cause, or the last follow-up visit (censored); the nomograms were also built on DFS. Disease progression, including local recurrence and distant metastasis, was confirmed by clinical examination, by imaging methods such as chest CT and abdominopelvic CT or MRI, or by biopsy. Other endpoints included distant metastasis-free survival (DMFS, time from surgery to first distant metastasis) and overall survival (OS, time from surgery to death from any cause). The minimum follow-up period to confirm the 3-year DFS status was 36 months after surgery, while the maximum follow-up period was 82 months (median, 44 months).

MRI protocol and Imaging segmentation
All patients underwent pretreatment multiparameter MRI. Detailed information on MRI protocol was presented in Appendix E1.
Oblique axial T2-weighted (T2W) and contrast-enhanced T1-weighted (cT1W) images, as well as ADC images, were retrieved from the picture archiving and communication system (PACS, Carestream, Canada) and then loaded into ITK-SNAP software (3.8.0, www.itksnap.org) for manual segmentation. A radiologist (reader 1) with 8 years of experience in pelvic MR imaging interpretation outlined the whole-tumor volumes of interest (VOIs), delineating along the border of the tumor and excluding the intestinal lumen on each slice. For ADC maps, VOIs were first placed on the region of high signal intensity on DW images with a b value of 1000 s/mm² and then copied to the corresponding ADC maps, owing to the higher resolution of DW images in comparison with ADC maps, while all other image sequences were used as references.

Radiomics feature extraction
Before feature extraction, we standardized all of the above MR images by z-score normalization and removed values beyond 3 standard deviations (3 sigma) in order to obtain a standard normal distribution of the image intensities.
Afterward, all voxels were isotropically resampled to 1 × 1 × 1 mm³ using linear interpolation, and each image was quantized to 25 gray levels. Then a total of 1130 radiomics features quantifying phenotypic differences on the basis of first-order (n = 19), shape (n = 16), texture (n = 58), and wavelet and Laplacian of Gaussian (LoG) features were extracted for each sequence by using PyRadiomics. More information about these features is presented in Appendix E2.
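The intensity preprocessing described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `normalize_and_quantize` is hypothetical, out-of-range values are clipped to ±3 sigma rather than discarded, and equal-width binning over the clipped range is assumed, since the text does not specify either choice. The input is a flat list of voxel intensities from one image.

```python
from statistics import mean, pstdev


def normalize_and_quantize(voxels, n_levels=25, n_sigma=3.0):
    """Z-score voxel intensities, clip outliers, and bin into gray levels.

    Assumptions (not stated in the paper): values beyond n_sigma are
    clipped rather than removed, and bins have equal width.
    """
    mu = mean(voxels)
    sd = pstdev(voxels)
    if sd == 0:
        raise ValueError("a constant image cannot be z-scored")

    # Z-score each voxel, then clip to the interval [-n_sigma, n_sigma].
    z = [max(-n_sigma, min(n_sigma, (v - mu) / sd)) for v in voxels]

    # Map the clipped range onto integer gray levels 1..n_levels.
    width = 2 * n_sigma / n_levels
    levels = [min(n_levels, int((zi + n_sigma) / width) + 1) for zi in z]
    return z, levels
```

In practice, a radiomics toolkit applies equivalent steps through its own settings, so a sketch like this is mainly useful for checking what the normalization does to a single extreme voxel.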

Feature Selection and Radiomics Signature building
We devised a two-step procedure for selecting robust features and reducing dimensionality. First, to assess the intra- and inter-observer agreement of the radiomics analysis, 30 patients were selected randomly and segmented again one month later, in a blinded fashion, by reader 1 and by another radiologist (reader 2) with 18 years of experience. The intra- and inter-observer intraclass correlation coefficients (ICCs) were calculated, and radiomics features with both intra-observer and inter-observer ICC values greater than 0.80 were selected for subsequent analysis. Second, the Boruta algorithm, a random forest-based wrapper method, was used to detect the key features for prediction (25). Boruta selects all relevant features rather than only the nonredundant ones. Then, the radiomics score (Radscore) was computed in the training cohort via a multivariate Cox regression model using the selected key features.
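As a minimal sketch of the first and last steps of this procedure, the snippet below filters features by the ICC threshold and computes a Radscore as the Cox linear predictor, that is, the sum of coefficient times feature value. The function names, feature names, and coefficient values are hypothetical placeholders; the actual selected features and fitted coefficients are not given in this section, and the Boruta step is omitted.

```python
def stable_features(intra_icc, inter_icc, threshold=0.80):
    """Keep features whose intra- AND inter-observer ICCs both exceed threshold."""
    return sorted(
        name
        for name in intra_icc
        if intra_icc[name] > threshold and inter_icc.get(name, 0.0) > threshold
    )


def radscore(features, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * feature value."""
    return sum(coefficients[name] * features[name] for name in coefficients)


# Hypothetical ICC values for three illustrative features.
intra = {"feat_a": 0.92, "feat_b": 0.85, "feat_c": 0.60}
inter = {"feat_a": 0.88, "feat_b": 0.79, "feat_c": 0.91}
kept = stable_features(intra, inter)  # only feat_a passes both thresholds

# Hypothetical Cox coefficients applied to one patient's feature values.
score = radscore({"feat_a": 2.0, "feat_b": 2.0}, {"feat_a": 0.5, "feat_b": -0.25})
```

A higher Radscore corresponds to a higher estimated hazard under the fitted model, which is why it can be used directly as a continuous risk variable in the nomogram.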

Development and validation of the radiomics nomogram
In the training cohort, the radiomics signature and all of the clinicopathological candidate predictors mentioned above were tested in a univariate Cox proportional hazards model. Then, variables significant in the univariate analysis (P < 0.05) were entered into a multivariate Cox proportional hazards model to construct the radiomics nomogram for DFS prediction. To address multicollinearity, backward stepwise selection was applied, with the Akaike information criterion (AIC) as the stopping rule.
The potential association of the radiomics nomogram with DFS was first assessed in the training cohort and then tested in the validation cohort by using Kaplan-Meier survival analysis. The optimum cutoff value for classifying patients into high-risk or low-risk groups according to the radiomics nomogram was identified by using the maximally selected rank statistics method. Stratified analyses were performed in all patients to explore the potential association of the radiomics nomogram with DFS according to the clinicopathological risk factors. The prognostic performance of the model was evaluated by Harrell's concordance index (C-index) and time-dependent receiver operating characteristic (ROC) analysis.
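For readers unfamiliar with Harrell's C-index, the following is a minimal sketch of how it can be computed from follow-up times, event indicators, and model risk scores. The function name `harrell_c_index` is a hypothetical choice, pairs with tied follow-up times are simply skipped, and the quadratic-time loop is meant for illustration rather than for large cohorts.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter follow-up time than patient j. The pair is
    concordant when patient i also has the higher risk score; tied
    scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the C-indices reported for the signature and the nomogram.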
To improve the stability of results, we repeated the randomized assignment and model building procedures four additional times. Subsequently, the mean C-index of the radiomics nomogram in the additional divisions was calculated.

Assessment of incremental value of radiomics signature in individual DFS estimation
To demonstrate the clinical benefit that the radiomics signature adds to the clinicopathological risk factors for individualized assessment of DFS in patients with LARC, we also developed a clinicopathological model, which incorporated only the independent clinicopathological risk factors. Calibration curves were generated to compare the predicted survival with the actual survival (26). The net reclassification improvement (NRI) was also quantified to evaluate the improvement in usefulness added by the radiomics signature (27). Finally, a decision curve analysis (DCA) was conducted to estimate the clinical usefulness of our radiomics nomogram by calculating the net benefits at different threshold probabilities.
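The net benefit underlying a decision curve can be sketched as follows. This is a simplified illustration that assumes a binary outcome (for example, progression by a fixed time point) with no censoring, whereas a survival DCA would need to account for censored follow-up. The function names are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(outcome)
    flagged = [(p >= threshold, bool(y)) for p, y in zip(predicted_risk, outcome)]
    true_pos = sum(1 for treat, y in flagged if treat and y)
    false_pos = sum(1 for treat, y in flagged if treat and not y)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(1 for y in outcome if y) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

Plotting `net_benefit` over a range of thresholds for each model, together with the treat-all and treat-none (net benefit 0) references, yields the decision curves that are compared in a DCA.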

Statistical analysis
Differences in clinicopathological characteristics between the training and validation cohorts, or between the high-risk and low-risk groups, were assessed by using an independent samples t test, Mann-Whitney U test, chi-squared test, or Fisher exact test, where appropriate. Kaplan-Meier survival curves and the log-rank test were used to compare survival between the high-risk and low-risk groups. All statistical analyses were performed by using R software version 3.6.3 (http://www.R-project.org, Appendix E3). A two-sided P value of < 0.05 was considered statistically significant.
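The Kaplan-Meier estimator used for the survival curves can be sketched in a few lines. The study itself used R; the Python function below, with the hypothetical name `kaplan_meier`, is only an illustration of the product-limit formula and omits confidence intervals and the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate).

    At each event time t, survival is multiplied by (1 - d / r), where d
    is the number of events at t and r is the number still at risk
    (follow-up time >= t). Censored patients leave the risk set without
    lowering the curve.
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve
```

Computing one such curve per risk group gives the step functions that the log-rank test then compares.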

Baseline information of participants
The patient characteristics were summarized in Table 1. The median follow-up time of the whole cohort was 44 months (range, 4-82 months). The training and validation cohorts were similar in terms of the clinicopathological characteristics (P = 0.373-0.906), except for the pre-CRT CEA level (P = 0.038).
Regarding survival outcomes, 40 patients (30.5%) in the training cohort and 16 patients (29.1%) in the validation cohort experienced confirmed disease progression (P = 0.845).  Fig. 2A and 2B. Similarly, radiomics signatures derived from each individual sequence were also constructed (Appendix E4).
In the training cohort, the radiomics signature derived jointly from the T2W, ADC, and cT1W images achieved better discriminatory ability for DFS prediction (C-index, 0.750; 95% confidence interval [CI], 0.669-0.831) than the Radscore from any single sequence. Similarly good prognostic performance was confirmed in the validation cohort, with a corresponding C-index of 0.752 (95% CI, 0.638-0.866) (Table 2).   (Table 3). Afterward, a radiomics nomogram that incorporated the above independent predictors for individualized DFS estimation was established and is presented in Fig. 3A.

Performance and validation of the radiomics nomogram
For the training and validation cohorts, the C-indices of the radiomics nomogram for DFS prediction were 0.780 (95% CI, 0.718-0.843) and 0.803 (95% CI, 0.717-0.889), respectively (Table 2). The calibration curves of the radiomics nomogram for DFS at 1, 2, and 3 years are illustrated in Fig. 3B and 3C and demonstrated good agreement between prediction and actual observation in both cohorts (both P > 0.05). Moreover, the prognostic accuracy of the radiomics nomogram at 1, 2, and 3 years was also satisfactory (Fig. 4).
We further evaluated the prognostic value of the radiomics nomogram across various clinicopathological risk factors in the whole cohort. The stratified analyses revealed that the radiomics nomogram remained a powerful and statistically significant prognostic predictor (Figure S1). Moreover, the mean C-index of the radiomics nomogram across the four additional divisions was 0.772 (95% CI, 0.722-0.802).

Assessment of incremental value of radiomics signature in individual DFS estimation
When we removed the signature from the radiomics nomogram and kept only the two statistically significant clinicopathological variables (pN and tumor differentiation) to build the clinicopathological model, the C-index dropped to 0.699 (95% CI: 0.627-0.771) in the training cohort and 0.762 (95% CI: 0.650-0.873) in the validation cohort (Table 2). The integration of the MRI-based radiomics signature into the nomogram yielded a total NRI of 0.207 (95% CI: 0.045-0.353) for DFS, indicating improved classification accuracy for survival outcomes. Furthermore, the radiomics nomogram was identified as the best model, with a lower AIC (344.8) than the clinicopathological model (AIC = 539.7) or the radiomics signature (AIC = 447.6). Additionally, time-dependent ROC analysis also confirmed that the radiomics nomogram had the best prognostic power (Figure S2).
The DCA showed that the radiomics nomogram was superior to the radiomics signature and the clinicopathological model over most of the range of reasonable threshold probabilities, indicating the incremental value of nomogram in terms of clinical application (Fig. 6).

Benefit of adjuvant chemotherapy
For the whole cohort, survival outcomes were comparable between the AC and no-AC groups (all P > 0.05, Figure S3). We then applied our radiomics nomogram to detect whether patients could benefit from AC. groups.

Discussion
In the current study, we not only investigated the prognostic value of a pretreatment multiparametric MRI-based radiomics signature in patients with LARC, but also successfully developed and validated a radiomics nomogram, which was powerful in risk stratification and predicted DFS, DMFS, and OS better than the current clinicopathological model. More importantly, the proposed radiomics nomogram might help identify which patients are expected to benefit from AC, indicating that it might be helpful for the future management of LARC. features and higher DFS in 95 patients with LARC (29). All these prior studies have typically focused on a few radiomics features, which seems intuitive but might underestimate the significance of radiomics.
Therefore, construction of multifactor panels is a more common approach to overcome this challenge in outcome estimation.
Radiomics hypothesizes that intratumor heterogeneity, which is difficult to detect visually, is exhibited in the spatial distribution of voxel intensities. For the construction of the radiomics signature, 3930 candidate features were reduced to only four predictors. Intriguingly, all selected features were GLSZM/GLRLM-related features, which take into account the interaction between neighboring pixels and are well suited to measuring different aspects of textural heterogeneity within the tumor (30). Moreover, wavelet features, which reflect multi-frequency information at different scales that is unrecognizable to the naked eye, made up the majority of our optimal radiomics signature (3/4), similar to other MRI-based radiomics studies (16,31).
As demonstrated in the present study, the radiomics signature identified from joint T2W, ADC, and cT1W images performed better than those from any single sequence, and was also an independent risk factor for DFS in patients with LARC, with C-indices of 0.750 and 0.752 in the training and validation cohorts, respectively. A possible explanation is that the multiple sequences used in our study reflect different aspects of the tumor, such as tumor intensity, cellularity, and vascularization, and their combination might capture more comprehensive information about the tumor and improve prognostication.
Moreover, the performance of our radiomics signature was superior to those derived from CT or PET/CT images, which had C-indices of about 0.650 (32,33). This was confirmed in our study, in which the Radscore values were significantly higher in patients with progressive disease. That is, tumors with higher intratumoral heterogeneity are more likely to be resistant to NCRT and to have a poorer prognosis (18).
Furthermore, the radiomics nomogram, incorporating the radiomics signature and clinicopathological factors, had significantly better ability to predict DFS than the clinicopathological model, with a higher C-index of 0.803, a positive NRI, and the lowest AIC, consistent with the results of Meng and Jeon et al (21,22). The results of the time-dependent ROC analysis further support this conclusion. This may be because the clinicopathological features reflect only specific tumor information, whereas multiparametric MRI-based radiomics can comprehensively and quantifiably characterize the tumor phenotype, and thus provides a robust way to characterize intratumoral heterogeneity noninvasively. Additionally, the radiomics nomogram's good ability to predict DMFS and OS also confirmed its prognostic value, and it could be used to stratify patients into corresponding high-risk and low-risk groups.
Currently, AC given after TME has been recommended for most LARC patients and has proved to be a robust tool against DM. However, not all patients benefit from chemotherapy.
Given this, it is of great importance to identify patients who benefit from AC. Previous studies have developed valuable radiomics models derived from different modalities to identify patients with various cancers who will benefit from different therapies (34)(35)(36). Our findings are consistent with the above statement: high-risk patients could benefit from AC, and more intensive observation and aggressive treatment regimens should be considered in these cases. These findings provide a novel tool for guiding AC.
Nevertheless, some points should be emphasized. Firstly, pN and tumor differentiation were identified as independent prognostic factors for DFS in our study, partly consistent with previous studies (22,28,33).
Even so, a single risk factor, however strong, can hardly capture the overall outcome of individual patients. In contrast, the nomogram, taking into account multiple risk factors, could provide more informative metrics and an easy-to-use tool for clinicians. Secondly, whole-tumor VOIs, rather than single largest-slice ROIs, could provide a robust way to characterize the heterogeneity of the entire lesion and reduce concerns about selection bias. Lastly, although external validation was lacking, the clinical utility of the radiomics nomogram was assessed by DCA, confirming the incremental value of the signature over the clinicopathological model for individualized estimation.
Some limitations of our study should be acknowledged. The first is the limited sample size and the retrospective nature of data collection. Further prospective studies involving a larger population are needed to validate our model. Second, study data were collected from a single center. Although we performed four additional divisions with good results, external validation is warranted in the future. Third, the follow-up duration was not long enough; thus, we constructed the radiomics model primarily based on DFS. Finally, other modalities, such as pathological imaging, genomic sequencing, and molecular biomarkers, as well as some MRI morphologic characteristics, should be investigated to construct a more stable and accurate model for tailored treatment in the era of personalized medicine.

Conclusions
In summary, we identified a multiparametric MRI-based radiomics signature as a powerful approach for predicting prognosis in patients with LARC, and it improved the prognostic ability of the clinicopathological model. Additionally, the developed radiomics nomogram successfully classified patients into high- and low-risk groups for all endpoints, and thereby might provide a novel tool for guiding AC and individualized treatment strategies. Finally, future prospective multicenter studies with considerably larger cohorts are needed to validate our findings.
Final approval of manuscript: All authors.

Figure 1
Recruitment pathway for patients in this study.

Figure 2
Radscore and radiomics nomogram-defined scores for each patient with locally advanced rectal cancer (LARC). Radscore in the training cohort (A) and the validation cohort (B); radiomics nomogram-defined score in the training cohort (C) and the validation cohort (D). Red bars represent the scores for patients who did not show disease progression, while blue bars represent the scores for those who showed disease progression.