Magnetic Resonance Imaging-based Radiomics Nomogram for Prediction of the Histopathological Grade of Soft Tissue Sarcomas: A Two-center Study

Background: Preoperative prediction of the grade of soft tissue sarcomas (STSs) is important for treatment decisions. To preoperatively distinguish low-grade (grades I and II) from high-grade (grade III) STSs, we developed and validated a magnetic resonance imaging (MRI)-based radiomics nomogram. Methods: Patients with an STS graded according to the French Federation of Cancer Centers Sarcoma Group grading system at two independent institutions were enrolled (training set, n = 109; external validation set, n = 71). The minimum redundancy maximum relevance method and least absolute shrinkage and selection operator logistic regression were used for feature selection and radiomics signature development. Three radiomics signature models were constructed based on T1-weighted imaging (RS-T1 model), fat-suppressed T2-weighted imaging (RS-FST2 model), and their combination (RS-Combined model). Model performance (discrimination, calibration, and clinical usefulness) was evaluated in the external validation set. Results: The RS-T1, RS-FST2, and RS-Combined models achieved areas under the receiver operating characteristic curve (AUCs) of 0.645, 0.641, and 0.829, respectively, in the external validation set. The nomogram, incorporating significant clinical factors and the RS-Combined model, showed high predictive ability in the training set and external validation set, with AUCs of 0.916 (95% confidence interval, 0.866–0.966) and 0.879 (0.791–0.967), respectively. The nomogram achieved significant patient stratification. Conclusions: The proposed noninvasive MRI-based radiomics nomogram shows good predictive performance in differentiating low-grade from high-grade STS.

Radiomics can express intratumoral heterogeneity noninvasively from large-scale digital medical images through high-throughput extraction of numerous quantitative features [15]. Moreover, radiomics provides comprehensive knowledge because the data are obtained from the entire tumor instead of from just a focal sample [16]. Furthermore, radiomics is reproducible. Thus, MRI-based radiomics would be broadly applicable to patients with sarcomas. Radiomics has been successfully applied in prediction of the histologic grade, local recurrence or distant metastasis, overall survival, and response to neoadjuvant therapy in patients with STSs [17][18][19][20][21][22]. Most previous reports defined high-grade STS as grades II and III, whereas we define high-grade STS as grade III based on recently published studies [6,22]. Additionally, the use of MRI-based radiomics for prediction of high-grade STS (grade III) has not been widely recognized and requires further validation. Moreover, MRI-based radiomics nomograms that combine radiomics and clinical factors for STS grading using data from multiple centers are relatively limited.
This study was performed to develop an MRI-based radiomics nomogram using a two-center dataset and to determine whether the nomogram can preoperatively distinguish the grade of STS. Additionally, a radiomics nomogram-based model was generated for prognostic evaluation. Patient risk stratification and clinical decision benefits were analyzed for the established model in an effort to improve personalized clinical treatment strategies and therapeutic decisions.

Data collection
Ethical approval for this two-center retrospective study was provided by our hospital's institutional review board, which waived the requirement for informed consent. Pre-therapeutic T1-weighted imaging (T1WI) and fat-suppressed T2-weighted imaging  Table 1.
The patients were subsequently grouped into a low-grade group (n = 93) and a high-grade group (n = 87) according to their FNCLCC tumor grade. The patients' basic data are shown in Table 2. The patients comprised 112 male and 68 female patients with a mean age of 52.4 years (range, 1-93 years). Fig. 1 shows a flow chart of the enrolled patients and radiomics implementation.

MRI acquisition and region-of-interest segmentation
All 180 patients underwent MRI scanning using a GE MRI 1.5T, GE Signa HDx 3.0T (GE Medical Systems, Milwaukee, WI, USA), Siemens Skyra 3.0T, Siemens Magnetom Prisma 3.0T (Siemens Healthcare GmbH, Erlangen, Germany), or Philips Achieva 1.5T (Philips Medical Systems, Best, the Netherlands) scanner. The following scanning parameters were used: T1WI (repetition time [TR] / echo time [TE], 420-680 ms / 6.1-20 ms); FS-T2WI (TR / TE, 2640-5000 ms / 30-102 ms); section spacing, 1 mm; section thickness, 3-4 mm; matrix, 320 × 320; field of view, 200-400 mm. Three-dimensional region of interest (3D-ROI) segmentation of all tumors was conducted manually using ITK-SNAP open-source software (v.3.8.0; http://www.itksnap.org). The ROI was outlined along the contour of the tumor on each transverse slice of the preoperative T1WI and FS-T2WI sequences and automatically converted into a 3D-ROI. The 3D-ROI segmentation covered the entire primary tumor and avoided obvious peritumoral edema. Intraobserver and interobserver intraclass correlation coefficients (ICCs) were calculated to test the intraobserver reproducibility and interobserver reliability of radiomic feature extraction in 40 randomly selected patients. Readers 1 and 2 each drew the 3D-ROIs, and Reader 1 repeated the segmentation after 1 month. The ROI segmentations drawn by Reader 1 were used for further analysis. Features with intraobserver and interobserver ICCs of >0.75 were retained for subsequent investigation.
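The reproducibility filter described above can be sketched as follows. The text does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the function names `icc_2_1` and `keep_feature` are hypothetical and are not taken from the study's code.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per patient and one column per reader.

    Assumption: two-way random effects, absolute agreement, single measurement.
    """
    n = len(ratings)       # number of patients (rows)
    k = len(ratings[0])    # number of readers or repeated readings (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-patient mean square
    msc = ss_cols / (k - 1)              # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def keep_feature(intra_icc, inter_icc, threshold=0.75):
    """Retain a feature only if both ICCs exceed the threshold (0.75 in the text)."""
    return intra_icc > threshold and inter_icc > threshold
```

For each feature, `ratings` would hold the feature values from the two segmentations being compared (Reader 1 versus Reader 2, or Reader 1 versus Reader 1 after 1 month) for the 40 patients.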

Image preprocessing and radiomics feature extraction
Preprocessing procedures were applied to compensate for intensity inhomogeneity arising from the different institutions and to decrease feature variability. The number of gray levels was reduced to improve the signal-to-noise ratio of the texture calculations. The 3D-ROIs were then resampled to an isotropic voxel size of 1 × 1 × 1 mm³ using cubic interpolation to standardize the voxel spacing [23,24]. 3D Slicer software (v.4.10.2; https://www.slicer.org/) was used for radiomics feature extraction. The extracted features quantitatively expressed the intratumoral heterogeneity of the segmented 3D-ROIs. The radiomics features (n = 1130) were derived separately from the T1WI and FS-T2WI sequences for each 3D-ROI, comprising shape features, first-order features, texture features (gray-level co-occurrence matrix, gray-level dependence matrix, gray-level size-zone matrix, gray-level run-length matrix, and neighboring gray-tone difference matrix), and wavelet decomposition features.
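The gray-level reduction step can be illustrated with a minimal sketch. The text does not specify the discretization scheme or the number of levels, so a fixed number of equal-width bins (32 by default) is assumed here purely for illustration; `discretize` is a hypothetical helper.

```python
def discretize(intensities, n_levels=32):
    """Map ROI voxel intensities to integer gray levels 1..n_levels.

    Assumption: equal-width bins spanning the ROI intensity range; the
    scheme actually used in the study is not stated in the text.
    """
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A perfectly uniform ROI collapses to a single gray level.
        return [1] * len(intensities)
    width = (hi - lo) / n_levels
    # The maximum intensity falls on the upper edge, so clamp it into the top bin.
    return [min(int((v - lo) / width) + 1, n_levels) for v in intensities]
```

Fewer gray levels mean that each cell of a texture matrix collects more voxels, which is the sense in which discretization stabilizes texture features.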

ComBat compensation method
Effects introduced by the different MRI scanners and protocols were removed using the ComBat compensation method, which preserves the texture patterns of the features and may therefore improve the power and reproducibility of subsequent investigations [25,26].

Patients' clinical data and MRI features
Clinical data (age, sex, and TNM stage) and MRI features were analyzed. The TNM stage was determined using the preoperative MRI and computed tomography information. Each musculoskeletal MRI scan was evaluated by two readers who had 7 and 14 years of experience and were blinded to the clinical and histopathological data. A consensus was reached in cases of discrepancy. The recorded data are described in the Supplementary Material.
A pathologist (F.H.) with 11 years of experience in soft tissue diseases interpreted the pathology, including the stage and histologic subtype. The FNCLCC system assigns a score to each of the tumor's mitotic index, differentiation, and extent of necrosis, and the tumor grade is obtained by summing these three scores. The pathologic TNM stage was determined based on the guidelines in the American Joint Committee on Cancer (AJCC) Cancer Staging Manual, 8th edition.
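The score summation can be written out as a short sketch. The score ranges and cut-offs below are the standard published FNCLCC values, which the text above does not restate; the function names are hypothetical.

```python
def fnclcc_grade(differentiation, mitotic, necrosis):
    """Return the FNCLCC grade (1, 2, or 3) from the three component scores.

    Standard FNCLCC ranges: differentiation 1-3, mitotic count 1-3,
    necrosis 0-2. Total 2-3 -> grade 1, 4-5 -> grade 2, 6-8 -> grade 3.
    """
    if differentiation not in (1, 2, 3):
        raise ValueError("differentiation score must be 1, 2, or 3")
    if mitotic not in (1, 2, 3):
        raise ValueError("mitotic score must be 1, 2, or 3")
    if necrosis not in (0, 1, 2):
        raise ValueError("necrosis score must be 0, 1, or 2")
    total = differentiation + mitotic + necrosis
    if total <= 3:
        return 1
    if total <= 5:
        return 2
    return 3


def is_high_grade(grade):
    """This study treats only grade III as high-grade."""
    return grade == 3
```

Under this study's definition, a tumor with a total score of 5 (grade II) is therefore placed in the low-grade group.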

Construction of radiomics signature
Subsequent statistical analysis was performed using R software (v3.5.1; https://www.R-project.org). The minimum redundancy maximum relevance algorithm was first applied to remove redundant and irrelevant features and retain those most related to the STS grade. The 30 best radiomics features were then entered into least absolute shrinkage and selection operator (LASSO) regression to generate the radiomics signature [27]. Next, the radiomics features with non-zero coefficients in the LASSO regression formed three radiomics signatures based on T1WI sequences (RS-T1 model), FS-T2WI sequences (RS-FST2 model), and their combination (RS-Combined model). The radiomics score (rad-score) was calculated as a linear combination of the selected features weighted by their corresponding non-zero LASSO coefficients.
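As a minimal sketch of the rad-score calculation, the linear combination can be written as below. The feature names, coefficient values, and intercept are hypothetical placeholders, not the coefficients fitted in the study.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of feature values weighted by LASSO coefficients.

    Features whose coefficient was shrunk to zero contribute nothing.
    """
    return intercept + sum(
        coef * features[name]
        for name, coef in coefficients.items()
        if coef != 0.0
    )


# Hypothetical example values for illustration only.
example_coefficients = {"t1_glcm_contrast": 0.5, "fst2_firstorder_skewness": -0.25, "t1_shape_sphericity": 0.0}
example_features = {"t1_glcm_contrast": 2.0, "fst2_firstorder_skewness": 4.0, "t1_shape_sphericity": 100.0}
example_score = rad_score(example_features, example_coefficients, intercept=0.1)
```

Because the coefficients are fixed once they are estimated in the training set, the same function can be applied unchanged to patients in the external validation set.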

Development of clinical model and radiomics nomogram
Univariate logistic regression was performed for the clinical risk factors and MRI features used to evaluate the STS grade. The factors with a two-sided P value of <0.05 were then introduced into a multivariate logistic regression. The variables with a two-sided P value of <0.05 in the multivariate analysis were considered potential independent clinical risk factors associated with the histologic grade and were used to compose a clinical model. Finally, the significant clinical factors and the optimal radiomics signature were combined in the radiomics nomogram.
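A nomogram of this kind is a graphical form of a logistic regression model, so its predicted probability can be sketched as follows. The predictor coding and every coefficient value here are hypothetical assumptions for illustration; the fitted values from the study are not given in the text.

```python
import math


def nomogram_probability(rad_score, t_stage, ill_defined_margin, coef):
    """Predicted probability of high-grade STS from a logistic model.

    `coef` maps 'intercept', 'rad_score', 't_stage', and 'margin' to
    regression coefficients (hypothetical names and values).
    """
    linear = (
        coef["intercept"]
        + coef["rad_score"] * rad_score
        + coef["t_stage"] * t_stage
        + coef["margin"] * ill_defined_margin
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, not the study's fitted model.
HYPOTHETICAL_COEF = {"intercept": -2.0, "rad_score": 1.5, "t_stage": 0.4, "margin": 0.9}
```

With positive coefficients, a higher rad-score, a higher T-stage, or an ill-defined margin each moves the predicted probability toward high grade.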

Validation of the radiomics nomogram and performance assessment of different models
The radiomics nomogram was assessed for discrimination, calibration, and clinical application [28] in both sets. The ability of the clinical model, radiomics signature models, and radiomics nomogram to correctly distinguish the grade was quantified by the AUC and accuracy. Calibration of the nomogram was evaluated with a calibration curve, and the Hosmer-Lemeshow test was used to assess goodness of fit [29]. The external validation set was used to test the radiomics nomogram, and the rad-score was calculated using the formula established in the training set.
The AUCs of the three models were compared pairwise using the DeLong test. Clinical application was assessed by decision curve analysis (DCA) to determine whether the radiomics nomogram can be regarded as robust and effective. The DCA quantified the net benefits over a range of threshold probabilities in the whole cohort.
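The two discrimination measures can be sketched in a few lines. The AUC is computed here through its rank interpretation (the probability that a randomly chosen high-grade case scores above a randomly chosen low-grade case); the 0.5 accuracy cut-off is an assumption, as the text does not state the threshold used.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases.

    labels: 1 = high-grade, 0 = low-grade. Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of cases classified correctly at the given cut-off (assumed 0.5)."""
    correct = sum(1 for s, y in zip(scores, labels) if (s >= threshold) == (y == 1))
    return correct / len(labels)
```

The pairwise form runs in quadratic time, which is adequate for cohorts of this size (109 and 71 patients).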
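The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The sketch below uses that standard formula; the function names are hypothetical.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the 'treat all' reference strategy."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def net_benefit_treat_none():
    """The 'treat none' reference strategy has zero net benefit by definition."""
    return 0.0
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies.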

Follow-up and survival analysis
The patients underwent MRI or computed tomography follow-up examinations every 6 to 12 months for the first 2 years and annually thereafter. Progression-free survival (PFS) was used as the survival endpoint and was calculated from the time of surgery to the time of radiographic detection of recurrence, the time of the last follow-up examination, or the time of death without evidence of progression. Patients without an event were censored on 30 November 2019.
Survival curves were generated based on Kaplan-Meier survival analysis. Differences in survival curves were assessed by the log-rank test. The pathologic grade results model, radiomics signature model, and radiomics nomogram model were further evaluated for their performance in PFS stratification. We combined the nomogram model with the AJCC staging system (Cancer Staging Manual, 8th edition) to analyze its ability in PFS stratification.
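For readers less familiar with the method, the Kaplan-Meier product-limit estimator can be sketched as follows. This is a minimal illustration with a hypothetical function name, not the R routine used in the study.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if progression was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= n_leaving
    return curve
```

Censored patients lower the number at risk without lowering the curve, which is how the estimator uses partial follow-up information.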

Statistical analysis
R software was used to perform the statistical analysis. A two-sided P value of <0.05 was regarded as statistically significant.
A univariate analysis was performed to evaluate the relationships between the patients' characteristics. For continuous variables, Student's t-test or the Mann-Whitney U test was used to determine whether significant differences existed between the low-grade and high-grade groups; for categorical variables, the chi-square test or Fisher's exact test was performed where appropriate. The R packages used are described in the Supplementary Material.

Clinical factors and modeling
The basic clinical characteristics of the 180 patients with STS are shown in Table 2. The T-stage, MRI-reported margin, and median PFS were significantly different between the low-grade and high-grade groups in both the training set and external validation set (all P < 0.05). The remaining factors showed no significant difference between the two groups in either the training set or external validation set. The results of the univariate and multivariate logistic regression analyses with P < 0.05 are shown in Table 3. Based on the results of the multivariate logistic regression analysis, the T-stage and MRI-reported margin were included to create a clinical model, with AUCs of 0.787 and 0.833 in the training set and external validation set, respectively. The results are shown in Table 4 and Fig. 3.
Kaplan-Meier survival curves of the different models are shown in Fig. 4. The stratified analysis is shown in the Supplementary Material. A radiomics nomogram that combined the RS-Combined model and significant clinical factors was subsequently constructed (Fig. 5a). The predictive performance of these models is shown in Table 4. The nomogram-predicted model significantly stratified patients for PFS in both sets (log-rank P < 0.05 in each; Fig. 4k, l).
The calibration curve of the radiomics nomogram showed good agreement between the predicted and observed tumor grade in both sets (Fig. 5b, c). The Hosmer-Lemeshow test result was not statistically significant (P = 0.872 in the training set and P = 0.506 in the external validation set). The DCA indicated that, across the reasonable range of treatment threshold probabilities, the radiomics nomogram had a higher overall net benefit in predicting the preoperative STS grade than the radiomics signature, the clinical model, and the "treat all" and "treat none" schemes (Fig. 5d), supporting its clinical usefulness.

Prognostic radiomic prediction models show moderate performance
We also assessed whether the established models offer increased benefits over the clinical staging system. Finally, the model that combined the best-performing radiomics nomogram model with the AJCC staging system achieved a slightly increased benefit (C-index for radiomics nomogram + AJCC: 0.591; 95% CI, 0.492-0.641). Furthermore, the radiomics nomogram combined with the AJCC staging system significantly stratified patients for PFS (log-rank P < 0.010; Fig. 5e).
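The C-index reported above measures how often the model ranks patients correctly by time to progression. The text does not name the variant, so the sketch below assumes Harrell's concordance index; the function name is hypothetical.

```python
def harrell_c_index(times, events, risk):
    """Harrell's concordance index for right-censored data (assumed variant).

    A pair is comparable when the patient with the shorter follow-up had an
    observed event. It is concordant when that patient also has the higher
    predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a C-index of 0.591 indicates modest prognostic discrimination.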

Discussion
Preoperative grading of STS is critical for choosing the optimal treatment method (neoadjuvant treatment or radiation therapy). Additionally, preoperative grading has been shown to independently affect the prognosis in patients with STS [6,30]. However, the preoperative grade based on biopsy examination results is sometimes underestimated because of tumor heterogeneity [7]. In this study, a noninvasive method derived from clinical and MRI data was investigated for predicting the preoperative histologic grade. The radiomics nomogram, which combined the RS-Combined model with clinical factors, distinguished high-grade from low-grade STS with the highest performance and exhibited good calibration in both sets, indicating the incremental value of the nomogram and showing that it can be a promising tool for clinical strategy adjustment. Additionally, the developed radiomics nomogram model significantly stratified patients into low- and high-risk groups for PFS.
The model that combined the radiomics nomogram and the AJCC staging system showed better prognostic ability than the AJCC staging system alone with significant patient risk stratification. The radiomics models showed moderate performance for the prognostic prediction of PFS.
MRI is widely used for characterization of soft tissue tumors. Previous reports have noted some qualitative MRI characteristics that may serve as potential imaging biomarkers of the histopathologic STS grade. Zhao et al. [14] found that high-grade STS tended to have high peritumoral signal intensity on T2-weighted images. Crombé et al. [6] confirmed that MRI features of necrosis, heterogeneity, and peritumoral enhancement are associated with high-grade STS. In this study, a clinical model that included the T-stage (which represents tumor size) and MRI-reported margin was shown to predict the tumor grade, which is partly consistent with recent reports [13,14]. The results indicated that patients with high-grade STS are more likely to have an ill-defined margin, and one reason may be that high-grade tumors show greater invasive ability in peripheral tissues [14].
However, although traditional MRI interpretation reflects the macroscopic imaging features, it lacks objectivity and quantification and tends to be influenced by the radiologist's experience.
Radiomics is an emerging and promising alternative method for preoperatively describing the quantitative characteristics of tumors. It extracts information from large-scale radiologic images that is beyond qualitative evaluation by visual inspection [31][32][33]. Therefore, radiomics based on MRI scans may support better decision-making. Corino et al. [34] reported that a radiomics classifier based on apparent diffusion coefficients in 19 patients could be used to distinguish grade II from grade III STS. Xiang et al. [35] found that quantitative MRI-based histogram parameters can differentiate the grade of STS. Zhang et al. [17] demonstrated that FS-T2WI-based radiomics could be used to predict the histopathological grade of STS in a small cohort of 35 patients, and Wang et al. [18] found that radiomics signature-based machine-learning classifiers can distinguish low-grade from high-grade STS. Nevertheless, these studies had small sample cohorts and two of them lacked a validation set, which may lead to overfitting. In contrast, we defined high-grade STS as grade III with balanced patient data, retrospectively enrolled more patients from two different hospitals, combined a clinical model to form a radiomics nomogram, and validated model efficiency in an external validation set. Additionally, conventional T1WI and FS-T2WI examinations were used for radiomics feature extraction because these techniques are commonly used, easy to access, familiar to radiologists, and stable in clinical practice. By comparison, multi-modality MRI sequences such as diffusion-weighted imaging readily exhibit distortion and susceptibility artifacts, and dynamic contrast-enhanced images tend to be influenced by the injected contrast medium [36].
The RS-Combined model established in this study showed good reproducibility with a similar AUC and accuracy in the training set and external validation set, indicating that the RS-Combined model was generalizable and stable. Peeken et al. [37] established a radiomics-combined model based on contrast-enhanced T1-weighted fat-saturated (T1FSGd) and fat-saturated T2-weighted (FS-T2WI) MRI sequences with an AUC of 0.76. However, because most patients in our study underwent non-contrast-enhanced T1WI and FS-T2WI examinations, we could only construct a more generalized RS-Combined model based on T1WI and FS-T2WI; it achieved an AUC of 0.829 in the external validation set and showed prediction performance similar to that reported by Peeken et al. [37]. This might indicate that good model performance can also be obtained by incorporating T1WI instead of T1FSGd in future studies. This would allow patients to avoid invasive procedures and the potential risks of contrast-enhanced examinations, such as contrast agent allergy and increased liver and kidney metabolic burdens.
We established a radiomics nomogram that combined independent clinical factors and the RS-Combined model, showing the best AUC in each dataset, better calibration, and highest net benefit in a range of threshold probabilities. This is consistent with recent reports of stratification of patients with glioblastoma and soft tissue tumors [37,38]. The nomogram graphically creates a clinical statistical predictive model, is easy to use, and enables accurate prediction of an individual patient's probability of preoperative stratification. The preoperative nomogram model allowed us to stratify patients, identify patients with high-grade STS who may need adjuvant systemic therapies, and more confidently establish a rational follow-up schedule [39,40]. The nomogram can be applied in different clinical situations to provide complementary staging information, such as when a biopsy specimen is difficult to access anatomically or when the biopsy result is unclear.
We also assessed the prognostic ability of the developed radiomics models. Radiomics models solely developed to predict PFS displayed only moderate predictive ability. The radiomics nomogram grading model significantly stratified patients into low- and high-risk groups in both the training and external validation sets compared with the other models. The radiomics nomogram showed a slightly enhanced prognostic capacity when combined with the AJCC staging system.
Previous investigations have demonstrated that radiomics can be used to predict survival outcomes in patients with STS. Spraker et al. [20] found that radiomics features, alone or combined with age and grade, were independent predictors of overall survival, with the combination achieving the best performance. Peeken et al. [37] demonstrated that the radiomics model based on T1FSGd MRI sequences showed significant patient stratification performance for overall survival, and improved prognostic performance was found with the combination of a T2FS radiomics model with the AJCC system. In contrast, we evaluated survival prediction by incorporating a nomogram-predicted grade model that included clinical stage and MRI features, and it showed superior patient risk stratification performance. Unlike Peeken et al. [37], we found that the pathologic-reported grade model itself achieved significant stratification performance, which supports the proposed radiomics nomogram models as potential surrogate markers. However, the radiomics nomogram models significantly stratified patients in the training set only by a close margin (P = 0.047). Additionally, the radiomics nomogram combined with the AJCC staging system generated more significant PFS stratification performance than the nomogram or the AJCC staging system alone. The latter models might show promise for the long-term management of patients with STS.
Our study had several limitations. First, we retrospectively examined the images; thus, selection bias was unavoidable despite our strict criteria. Second, manual tumor segmentation by a team may introduce variability. Techniques such as automatic segmentation may be more accurate and time-saving and could be used to improve the robustness of radiomics models in future research. Third, our data were obtained from two institutions that used similar but different scanners and protocols. Therefore, resampling and the ComBat compensation method were applied to reduce the differences in image specifications, aiming to increase the stability of the features and models. Finally, larger samples are needed in future studies.

Conclusion
In conclusion, we developed and validated a noninvasive radiomics nomogram that combines the radiomics signature and clinical factors. It exhibited satisfactory predictive performance in differentiating the preoperative histopathological grade of STS and achieved superior patient risk stratification performance. We expect that these results will help to improve clinical treatment strategies and survival outcomes in selected patients.

Consent for publication
All authors have read and approved the content and agree to submit for consideration for publication in the journal.

Availability of data and materials
All data generated or analysed during this study are included in this published article.

Competing interests
The authors declare that they have no competing interests.

Funding
This study received funding from the National Science Foundation for Young Scientists of China (Grant No. 81571673