Prediction of histological grade using preoperative multi-sequence MRI-based radiomics signature in patients with non-small-cell lung cancer

Background: Non-small cell lung cancer (NSCLC) is treatable when caught early, yet few non-invasive methods exist for grading NSCLC. In the present study, we aimed to examine the diagnostic utility of multi-sequence magnetic resonance imaging (MRI) radiomics and clinical features for grading NSCLC. Methods: In this retrospective study, 148 patients with postoperative pathologically confirmed NSCLC were recruited. Both preoperative T2-weighted imaging (T2WI) and multi-b-value diffusion-weighted imaging (DWI) were performed on a 1.5 T MRI scanner. A total of 2775 radiomics features were extracted from the T2WI, DWI, and corresponding apparent diffusion coefficient (ADC) maps. The least absolute shrinkage and selection operator (LASSO) and stepwise regression were used for feature selection in the training cohort (n=110). Next, these features were further assessed in the two cohorts using a non-linear support vector machine (SVM) classifier. Lastly, a Radscore model was used to develop the radiomics-clinical nomogram. Results: The five optimal features achieved favorable discrimination in both cohorts, with areas under the curve (AUCs) of 0.761 and 0.753. In addition, the radiomics-clinical nomogram, which integrated the Radscore with four independent clinical predictors, showed higher discriminative power, with AUCs of 0.814 and 0.767 for the X.T cohort and H.Y cohort, respectively. The nomogram showed excellent predictive performance and potential clinical utility for grading NSCLC. Conclusions: Multi-sequence MRI radiomics features can stratify NSCLC tumor grades noninvasively. The radiomics features can be integrated with clinical features to improve predictive performance.

condition owing to the small size of the extraction site 8,9. Moreover, biopsies are highly invasive and can cause physical and psychological trauma to patients, in addition to the risk of complications during surgery 10. For these reasons, a non-invasive method for preoperatively predicting NSCLC tumor histology grades is desired. Currently, medical imaging is an essential tool for diagnosing NSCLC, and studies have reported strong correlations between specific imaging markers and tumor grades. For example, dynamic multiphase net enhancement and multidetector CT (MDCT) perfusion parameters can help grade tumor differentiation 11,12. In addition, apparent diffusion coefficient (ADC) values derived from diffusion-weighted imaging (DWI) sequences in magnetic resonance (MR) imaging can distinguish between high-grade and low-grade lung cancers: the ADC values of high-grade lung cancer are significantly lower than those of low-grade lung cancer 13,14. As another example, the standardized uptake value (SUV) of positron emission tomography (PET) is higher for high-grade lung cancer due to vascular stroma and fibrosis 11. However, the use of perfusion CT is limited by the increased radiation dose to patients 15, and the ADC values of high- and low-grade lung cancer overlap on DWI 13. Additionally, PET is more expensive than other imaging methods. Hence, newer non-invasive imaging techniques are needed to provide improved information for accurately grading tumors in the clinic.
In recent years, breakthroughs have been made in the acquisition, standardization, and analysis of medical images, which have made it possible to turn accurate and quantitative image descriptions into non-invasive biomarkers for predicting the prognosis of cancer 16. Radiomics is an emerging field based on advanced pattern recognition. Using radiomics, researchers can extract quantitative features from digital images to determine the relationship between the features and the pathophysiology of tissues 17,18. Currently, radiomics strategies based on multi-sequence magnetic resonance imaging (MRI) data, including T2-weighted images (T2WI), DWI, and the respective ADC images, have been used for discriminating various cancers, including renal carcinoma, bladder cancer, prostate cancer, and soft tissue sarcoma, making it possible to accurately assess histological grades and predict outcomes 19-22. However, it remains unknown whether radiomics features extracted from multi-sequence MRI can accurately reflect differences between the histological grades of NSCLC tumors.
Hence, in this study, we first investigated the feasibility of a multi-sequence MRI radiomics strategy for grading NSCLC. Second, we developed and validated a radiomics-clinical nomogram model for individualized risk stratification of NSCLC patients using a combination of imaging radiomics features and clinical factors.

Methods
The retrospective study was approved by the institutional ethics review board of Xijing Hospital (Xi'an, China). Due to the retrospective design, the requirement for informed consent was waived by the committee. All methods were performed in accordance with the relevant guidelines and regulations.
Additional information about the study design is shown in Fig. 1.

Patients
Between January 2015 and December 2018, patients with postoperatively confirmed squamous cell carcinoma or adenocarcinoma of the lung were included from a single clinical center. The inclusion criteria were (1) MRI examination within two months of surgical pathology results or biopsy pathology results; (2) no other history of malignant tumors; and (3) lesions larger than 8 mm to ensure adequate count statistics and consistent analysis of the region of interest (ROI). The exclusion criteria were (1) lung cancer had been treated prior to MRI, for example with chemotherapy or radiotherapy; (2) no pathological results obtained after MRI; (3) MRI showed artifacts or the data could not be measured or analyzed; and (4) patients had contraindications to MRI (e.g., cochlear implants and pacemakers). In cases of multifocal or multicentric lung cancer, only the largest tumor was used. In total, 148 patients (112 males and 36 females; mean age, 59 ± 11 years; age range, 20-79 years) were included in this study.
The age and gender of patients were obtained from the medical records, while tumor characteristics, including the histological type and degree of tumor differentiation, were obtained from the pathological records.
Histological grades were assessed independently by two pathologists, both with more than nine years of clinical experience. The resected tumors were classified into grade 1 or well-differentiated (n = 11), grade 2 or moderately differentiated (n = 60), and grade 3 or poorly differentiated (n = 77). Due to the small number of grade 1 tumors, the patients were divided into two groups for the analysis: a low-grade group containing well-differentiated and moderately differentiated tumors and a high-grade group containing poorly differentiated tumors. Of these patients, 110 (84 males and 26 females) were randomly selected as the training cohort for model development, and 38 (28 males and ten females) were used as the validation cohort for performance verification. For this study, clinical features consisted of sex, age, smoking status, side of lung involvement, location of tumor, histological subtype, carcinoembryonic antigen (CEA), Ki-67, the longest tumor diameter (LD), and the longest perpendicular diameter (LPD). These characteristics were obtained from archived medical records.
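The grouping and cohort split described above can be sketched as follows. This is a minimal illustration only: the function names, the fixed seed, and the use of a simple (non-stratified) shuffle are assumptions, since the text states only that patients were randomly selected.

```python
import random

def binarize_grade(grade):
    """Map histological grade 1 or 2 to 0 (low grade) and grade 3 to 1 (high grade)."""
    if grade not in (1, 2, 3):
        raise ValueError(f"unexpected grade: {grade}")
    return 1 if grade == 3 else 0

def random_split(patient_ids, n_train, seed=0):
    """Shuffle patient IDs and return (training, validation) lists.

    The seed is a hypothetical choice made here for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Grade counts reported in the text: 11 grade 1, 60 grade 2, 77 grade 3.
grades = [1] * 11 + [2] * 60 + [3] * 77
labels = [binarize_grade(g) for g in grades]

train_ids, valid_ids = random_split(range(len(grades)), n_train=110)
```

With the reported counts, this gives 71 low-grade and 77 high-grade cases, split into 110 training and 38 validation patients.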

MR image acquisition and ROI delineation
MR examinations were performed using a MAGNETOM Aera 1.5 T scanner (Siemens Medical Solutions, Erlangen, Germany) equipped with an eight-channel phased-array torso coil. T2-weighted and diffusion-weighted MRI sequences were used in this study. A bi-exponential model (b = 50 and 800 s/mm²) was used to obtain the ADC maps automatically. For more information about the sequence parameters, see the Supplementary Materials.
Prior to delineating the ROI, one axial slice was chosen from each MRI sequence by identifying the slice showing the largest tumor area in the lung region of each patient. The tumor area on the selected image was outlined using a manually drawn polygonal ROI. Since the ADC map was calculated from the DWI with a double exponential model, the ROIs were placed onto the ADC maps to determine the respective tumor regions. Tumors were delineated manually and independently by two radiologists (with nine and five years of MRI diagnostic experience in lung cancer) using the ITK-SNAP software (www.itksnap.org). When the ROI selections diverged, the differences were discussed and corrected to reach a consistent ROI. Details about the delineation of ROIs are shown in Fig. 1.

Extraction of imaging features
High-dimensional radiomics features were extracted from the tumor ROI in the MRI data to describe the tumor phenotype. The features were divided into four groups: (1) 13 intensity features, (2) seven shape and size features, (3) 165 high-order texture (GLRLM, NGTDM, and GLSZM) features, and (4) 740 wavelet features. Each image therefore yielded 925 features, giving 2775 features across the three images, which described the local and regional tissue distributional changes in the tumor. The first group described the gray-level histogram of the tumor region and the gray-level distribution of its voxels. The second group primarily described the tumor phenotype, including shape, area, volume, compactness, roundness, and spiculation. The third group reflected the uniformity or heterogeneity of the tumor's internal structure. The fourth group captured the intensity and texture of the image after wavelet decomposition of the original image. All of the features have been utilized in previous radiomics studies 16,23. Because the original T2WI, DWI, and ADC images had different gray-level ranges, a multi-level normalization strategy was applied to all tumor ROIs of the three MRI sequences prior to the extraction of the high-order GLRLM, NGTDM, and GLSZM texture features. In other words, all values were z-score normalized (the mean was subtracted and the result divided by the standard deviation, giving a mean of 0 and a standard deviation of 1). The features were extracted with a MATLAB software package publicly shared online 24,25.
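The normalization described above (subtract the mean, divide by the standard deviation) is a z-score transform. The sketch below illustrates it in plain Python; the function name is hypothetical, and the use of the population standard deviation (dividing by n rather than n - 1) is an assumption, as the text does not say which was used.

```python
import math

def zscore(values):
    """Return values rescaled to mean 0 and standard deviation 1.

    Assumes the population standard deviation (divide by n).
    A constant input has zero spread, so it is mapped to all zeros.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    sd = math.sqrt(variance)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]

# Example: gray levels on an arbitrary scale become comparable across sequences.
normalized = zscore([1.0, 2.0, 3.0, 4.0])
```

Applying the same transform to T2WI, DWI, and ADC values puts the three sequences on a common scale before texture features are computed.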

Selection of features, evaluation of predictive performance, and generation of Radscore
A combination of previously described methods was used to assess the ability of the features to discriminate between low- and high-grade NSCLC 26. First, features were screened using Student's t-test, which identified 42 features that differed significantly between the two grades in the training cohort. A logistic regression analysis with least absolute shrinkage and selection operator (LASSO) regularization was then performed to determine which features showed the best predictive value for histological classification. LASSO is a shrinkage and variable selection strategy for the regression of high-dimensional data 27. Because LASSO requires tuning of the regularization parameter (λ), ten-fold cross-validation was used to select λ. After the LASSO regression, a total of 17 features remained. Next, the 17 features were further reduced by stepwise regression, leaving five features. The performance of these features was evaluated in the two cohorts. These analyses were performed in R (v. 3.2.3), using the cv.glmnet function from the glmnet package for the LASSO step. In the next step, a logistic regression algorithm was applied to the five optimal features in the training group to obtain the coefficient of each feature, along with the intercept, for generating the Radscore formula 24,25. Using this formula, the Radscores of the patients were calculated for further analysis.
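For reference, the LASSO step can be written as an L1-penalized logistic regression. The form below is the standard binomial LASSO objective, not a formula taken from the original text; the symbols are notation introduced here for illustration: n is the number of training patients, p the number of candidate features entering the LASSO step, x_i the feature vector of patient i, and y_i the grade label (0 for low grade, 1 for high grade).

```latex
\[
(\hat{\beta}_0, \hat{\beta}) \;=\;
\arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n} \sum_{i=1}^{n}
  \left[
    y_i \left( \beta_0 + x_i^{\top}\beta \right)
    - \log\!\left( 1 + e^{\beta_0 + x_i^{\top}\beta} \right)
  \right]
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
\]
```

Larger values of λ shrink more coefficients exactly to zero, which is what makes the method a feature selector; the features that survive are those with non-zero coefficients at the λ chosen by cross-validation.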

Development of the radiomics-clinical nomogram and assessment of its predictive performance
The clinical usefulness of the radiomics model for discriminating between low- and high-grade NSCLC was evaluated using methods already described in the literature 26. Briefly, univariate and multivariate regression analyses were performed using the Radscore and clinical features from the training cohort. Next, the nomogram was created using the independent predictors from the training cohort. The predictive performance of the nomogram was quantitatively evaluated in the training and validation cohorts in terms of sensitivity, specificity, accuracy, and area under the curve (AUC) of the receiver operating characteristic (ROC) 24,25. Hosmer-Lemeshow tests, which assess the agreement between expected and actual observations, and decision curve analyses were conducted to determine the calibration and potential clinical utility of the model.
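The four performance measures named above can be computed from labels and predicted probabilities as in the following sketch. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration; the AUC is computed by its rank interpretation (the probability that a randomly chosen high-grade case scores above a randomly chosen low-grade case, counting ties as one half).

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a given probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sensitivity_specificity_accuracy(y_true, y_prob, threshold=0.5):
    """Return (sensitivity, specificity, accuracy); needs both classes present."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)

def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pp in positives:
        for pn in negatives:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data (hypothetical): 1 = high grade, 0 = low grade.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
sens, spec, acc = sensitivity_specificity_accuracy(y_true, y_prob)
area = auc(y_true, y_prob)
```

Note that sensitivity, specificity, and accuracy depend on the chosen threshold, whereas the AUC summarizes discrimination across all thresholds.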

Statistical analysis
Statistical analysis was conducted using the R statistical software (version 3.4.4, x64). P-values of < 0.05 were considered significant. Both univariate and multivariate regression analyses were used to identify independent predictors for the discrimination task.

Results

Patient characteristics
The baseline demographics and clinical data of patients from the training and validation cohorts are shown in Table 1. The characteristics of patients in the training and validation cohorts were not significantly different.

Performance of the optimal features selected to discriminate the histological grade
After assessing 2775 radiomics features from the training cohort, 42 features were found to be significantly different between the two histological grades. Next, the LASSO method was applied to these features, and 17 were selected. Forward-backward stepwise regression was then used to select five features as the optimal features, and a logistic regression model was constructed (Figures 2A and 2B).

Calculation of Radscore
To further simplify the prediction model, the logistic regression algorithm was used to generate the Radscore formula based on the best features. Figure 2B shows the coefficient of each feature. The formula's intercept was -0.555. Figure 2C shows the sum of the absolute coefficients of these features grouped by image type; the DWI features contributed the most to the Radscore. The Radscores and ROC curves computed with this formula for the two patient cohorts (Figures 3A and 3B) showed a significant difference (p < 0.01) between the low- and high-grade NSCLC cases (Figure 3C).
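A Radscore of this kind is typically a linear combination of the selected features plus the intercept, which a logistic function converts into a probability. The sketch below assumes that form. Only the intercept (-0.555) comes from the text; the feature names and coefficient values are hypothetical placeholders, because the actual coefficients are given only in Figure 2B.

```python
import math

INTERCEPT = -0.555  # intercept reported in the text

# Hypothetical placeholder coefficients; the real values appear in Figure 2B.
COEFFICIENTS = {
    "feature_1": 0.8,
    "feature_2": -0.5,
    "feature_3": 0.3,
    "feature_4": 0.6,
    "feature_5": -0.2,
}

def radscore(features, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Linear predictor: intercept plus the weighted sum of the selected features."""
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )

def high_grade_probability(score):
    """Logistic transform of the linear predictor into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

# Example with normalized (z-scored) feature values for one hypothetical patient.
patient = {
    "feature_1": 1.0,
    "feature_2": 0.0,
    "feature_3": 0.0,
    "feature_4": 0.0,
    "feature_5": 0.0,
}
score = radscore(patient)
probability = high_grade_probability(score)
```

Under this form, a patient whose normalized features are all zero receives a Radscore equal to the intercept.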

The radiomics-clinical nomogram's performance in the discrimination of high- and low-grade NSCLC
To enhance discrimination performance, univariate and multivariate analyses were performed on the clinical factors together with the Radscore. These analyses revealed that sex, smoking, and Radscore were independent predictors of histological grade (Table 2).
In the next step, sex, smoking, and Radscore were integrated to obtain a radiomics-clinical nomogram (Figure 4A). Each patient's risk of being diagnosed with high-grade NSCLC was quantified using the nomogram model. Figure 4B shows that the risk profiles of low- and high-grade histology differed significantly in both cohorts (p < 0.01). With the nomogram, discrimination performance improved (Figure 4C and Table 3). The prediction accuracy and AUC in the training cohort increased to 77.3% and 0.814, respectively, while the prediction accuracy and AUC in the validation cohort increased to 78.9% and 0.767, respectively.
In addition, the p-value of the Hosmer-Lemeshow test was 0.893, which was not statistically significant, indicating that the predictions of the nomogram model agreed well with the observations. Clinical usefulness was assessed with decision curve analysis. When the threshold risk was higher than 0.2, the net benefit of the nomogram was greater than that of the radiomics model alone (Figure 5).
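Decision curve analysis compares models by their net benefit at a given threshold probability. The sketch below uses the standard net benefit definition (true positives minus false positives weighted by the odds of the threshold, per patient); the function names and toy data are assumptions for illustration and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as high grade."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))

# Toy data (hypothetical): 1 = high grade, 0 = low grade.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
nb_model = net_benefit(y_true, y_prob, threshold=0.2)
nb_all = net_benefit_treat_all(y_true, threshold=0.2)
```

A decision curve plots these quantities over a range of thresholds; a model is preferred at the thresholds where its curve lies above the alternatives.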

Discussion
In the present study, we constructed and validated a radiomics-clinical nomogram based on preoperative MRI and clinical characteristics for predicting low- and high-grade NSCLC. In the training and validation cohorts, our nomogram showed good discriminative power and clinical utility. Hence, our proposed nomogram may be an effective and non-invasive tool for preoperative histological grade assessment in patients with lung cancer.
Currently, there is no widely accepted histological grading system with clearly defined criteria and clinical significance for patients with lung cancer. The World Health Organization (WHO) classification (4th edition) grades lung adenocarcinoma as 1 (good differentiation, predominantly lepidic growth), 2 (moderate differentiation, acinar or papillary), or 3 (poor differentiation, mainly solid or micropapillary) 28. These structural patterns are based on the most important tissue classifications for adenocarcinoma. Previously, Weichert et al. developed a grading system for lung squamous cell carcinoma that sums the scores of two independent prognostic markers, tumor budding and tumor nest size 29. However, both definitions depend on the subjective judgment of the pathologist, and the hierarchy is easily confused. To apply a uniform standard, traditional histological grading methods were used in this study, based on the similarity of structural patterns to the normal lung tissue of origin, as well as tumor cell atypia and degree of differentiation. The traditional histological grading system has been established for many years and is clinically relevant and reproducible in breast cancer 30, prostate cancer 31, endometrial cancer 32, soft tissue sarcoma 33, and renal cancer 34. Therefore, we assume that the traditional histological grading system is applicable in NSCLC.
Given this assumption, the feasibility and performance of multi-sequence MRI radiomics for identifying the histological grade of NSCLC patients remained unclear. Therefore, we aimed to (1) explore radiomics strategies based on multi-sequence MRI, such as T2WI, DWI, and ADC, to preoperatively identify high- and low-grade NSCLC, and (2) verify whether the combination of radiomics features and clinical features improves the discriminating ability of the model.
Our findings demonstrate that the radiomics features extracted from the T2WI, DWI, and ADC were strongly correlated with the histological grade of NSCLC. The two MRI sequences used in the current study are standard and used in hospitals worldwide. The radiomics-based classification of NSCLC allowed for the non-invasive prediction and stratification of high- and low-grade NSCLC. In our experiments, Radscore was the most effective method of distinguishing the histological grade of NSCLC. The Radscore combined five optimal features into one biomarker. Recent studies have similarly combined multiple markers into a single marker 35,36. For example, a recent study screened 21 independent genes from patients with breast cancer; together, these 21 genes were identified and validated as the optimal signature for sparing certain groups of breast cancer patients from chemotherapy 36.
The highest sum of absolute coefficients was obtained from the DWI features. Higher DWI signals indicated higher tumor grades, which are characterized by faster cell proliferation, higher cell densities, larger cell nuclei, higher intracellular macromolecular protein content, higher nuclear-to-cytoplasmic ratios, smaller extracellular space 13,37, and more restricted intracellular and extracellular diffusion. At present, DWI is the only imaging method that can detect restricted diffusion in tissue. DWI was first used in the central nervous system, but applying this technology to the lungs is challenging because of its low signal-to-noise ratio due to the inherently low proton density of the lungs, image artifacts caused by cardiac and respiratory motion, and high field gradients arising from the magnetic susceptibility of inflated lung tissue, which increase the likelihood of artifacts 38,39. Our team used the ISHIM-EPI sequence to reduce susceptibility and respiratory motion artifacts. Under free-breathing conditions, the signal-to-noise ratio and image quality of the DWI were improved.
Only a few studies have explored the preoperative differentiation of NSCLC histological grades based on dual-energy CT and ADC in recent years. These tests have limitations, including the ionizing radiation of CT and the overlap of ADC values 13; moreover, traditional medical imaging assessment has been largely qualitative. The rise of radiomics, together with the rapid development of image acquisition methods, standardization strategies, and image analysis tools, has enabled researchers to describe tumor images objectively, accurately, and quantitatively as non-invasive biomarkers for predicting patient prognosis. To our knowledge, no previous study has achieved a quantitative risk stratification of the histological grade of NSCLC patients based on a nomogram model. Previously, Chen et al. extracted 591 radiomics features from enhanced CT images 40. The minimum redundancy maximum relevance algorithm and a logistic regression model were used to reduce dimensionality and select the best features, and a model was established from nine features. This set of radiomics features was validated in an independent validation group. In distinguishing between high- and low-grade lung cancer, the feature set achieved an AUC of 0.763 and an accuracy of 68.7% in the training group, and an AUC of 0.782 and an accuracy of 71.2% in the validation group. This result was similar to our Radscore results from multi-sequence MRI radiomics. We also applied the nomogram model incorporating clinical factors, and its results were slightly better than those of the enhanced CT-based radiomics method in the grade identification task.
Clinical features, such as age, sex, smoking, tumor location, tumor typing, longest tumor diameter, perpendicular tumor diameter, CEA, Ki-67, and histological subtype, are commonly used to diagnose patients with lung cancer. Whether combining these factors with the feature-generated Radscore can improve discrimination performance required further study. Surprisingly, the univariate analysis revealed that none of these factors correlated with the grading of lung cancer. Insufficient predictive strength in univariate analysis is commonly used as a reason to exclude a variable from model development 41.
However, nuances in the dataset may lead this strategy to exclude important predictors. These results may also have been due to confounding by other predictors. We therefore considered all factors together, and the multivariate analysis showed that sex, smoking, and Radscore were significant predictors of histological grade. We then generated the nomogram model from these predictors and obtained satisfactory discrimination, with better performance than that of the radiomics model alone. Therefore, the combination of radiomics features and clinical factors can enhance discrimination. In addition, we verified that the nomogram model showed good prediction accuracy and clinical application value.
This study had some limitations. First, because the study was retrospective and the patient population was relatively limited, inherent bias may exist. The addition of more patients in a multi-center trial would help validate our findings. Second, owing to incomplete archival data, this study did not include other potential clinical features, including genetic mutations and possible molecular markers, which require further analysis. Third, although we used multivariate logistic regression models, ROC curves combined with nomograms, and calibration curves, which are commonly accepted in the field of medical imaging analysis, comparative studies with other methods are needed.

Conclusions
Multi-sequence MRI radiomics features could predict the histological grading of patients with NSCLC noninvasively. Additionally, the integration of radiomics features and clinical features resulted in improved performance. Hence, radiomics signatures may be a promising tool for determining the grade and treatment planning for NSCLC.

Declarations
Ethics approval and consent to participate
This study was approved by the Ethics Committee of Xijing Hospital with number KY20141104-2. Due to the retrospective design, the requirement for informed consent was waived by the committee. All methods were performed in accordance with the relevant guidelines and regulations.

Consent for publication
Not applicable.

Availability of data and materials
The data sets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors of this article declare that there is no conflict of interest.

The italicized, underlined values indicate statistical significance (p value < 0.05) after the multivariable analysis.
* LD, LPD, CEA and OR indicate the longest diameter, the longest perpendicular diameter, carcinoembryonic antigen, and odds ratio, respectively.