Standard deviation of CT radiomic features among malignancies in each individual: prognostic ability in lung cancer patients

It was reported that individual heterogeneity among malignancies (IHAM) might correlate well to the prognosis of lung cancer; however, seldom radiomic study is on this field. Standard deviation (SD) in statistics could scale average amount of variability of a variable; therefore, we used SD of CT feature (FeatureSD) among primary tumor and malignant lymph nodes (LNs) in an individual to represent IHAM, and its prognostic ability was explored. The enrolled patients who had accepted PET/CT scans were selected from our previous study (ClinicalTrials.gov, NCT03648151). The patients had primary tumor and at least one LN, and standardized uptake value of LN higher than 2.0 and 2.5 were enrolled as the cohort 1 (n = 94) and 2 (n = 88), respectively. FeatureSD from the combined or thin-section CT were calculated among primary tumor and malignant LNs in each patient, and were separately selected by the survival XGBoost method. Finally, their prognostic ability was compared to the significant patient characteristics identified by the Cox regression. In the univariate and multi-variate Cox analysis, surgery, target therapy, and TNM stage were significantly against OS in the both cohorts. In the survival XGBoost analysis of the thin-section CT dataset, none FeatureSD could be repeatably ranked on the top list of the both cohorts. For the combined CT dataset, only one FeatureSD ranked in the top three of both cohorts, but the three significant factors in the Cox regression were not even on the list. Both in the cohort 1 and 2, C-index of the model composed of the three factors could be improved by integrating the continuous FeatureSD; furthermore, that of each factor was obviously lower than FeatureSD. Standard deviation of CT features among malignant foci within an individual was a powerful prognostic factor in vivo for lung cancer patients.


Introduction
Under diverse microenvironments, tumor cells within a malignant lesion can slowly evolve genomic and biological variations (Liu et al. 2018;Makohon-Moore et al. 2017;VanderWeele et al. 2019). The phenomenon is called heterogeneity which can be found among tumors of patients (intertumor) or foci of an individual (intratumor). The latter can be further divided into that among different parts of a primary tumor(VanderWeele et al. 2019) or a metastatic lesion, and that among primary tumor and metastatic foci (IHAM in this study) (Hata et al. 2015;Makohon-Moore et al. 2017). Results from molecular biotechnology indicated that heterogeneity would lead to patient misclassification and outcome misjudgment (Lim et al. 2019), and a better understanding of individual heterogeneity would advance Si Hongwei, Hao Xinzhong, Xue Shuqin and Cao Jianzhong have contributed equally to this work.
As an attractive field, radiomics technology provides a new method for heterogeneity analysis in vivo (Bashir et al. 2016). However, the related studies mainly focused on the predictive or prognostic ability of intratumor heterogeneity (Bashir et al. 2016;Yip and Aerts 2016). Typically, radiomic features are extracted from a single lesion, for example, a primary tumor or a metastasis lesion. To analyze the individual heterogeneity among malignancies (IHAM) among malignant foci, we introduced the standard deviation (SD) of descriptive statistical which was known as mean ± SD. It is used to measure the dispersion of a variable, and is calculated as the square root of variance by determining variability relative to the mean.
Above all, SD of CT features (Feature SDs ) among primary tumor and malignant lymph nodes of a patient was served as a representative of IHAM, and its prognostic ability was explored in lung cancer patients.

Patients and follow-up
Pathologically proved lung cancer patients were selected from our previous study (ClinicalTrials.gov, NCT03648151) (Si et al. 2021) which enrolled the patients who accepted PET/CT scans during 2014-2016. The criterion was that, besides the primary tumor, at least one lymph node (LN) could be identified on CT images. The thin-section CT serials after the PET/CT scans were used to evaluate the reproducibility of our analysis. Additionally, because there is no consensus threshold of SUVmax on the differential diagnosis of malignant LNs, the threshold was set as SUVmax ≥ 2.0 and ≥ 2.5 on PET image (Carkaci et al. 2012), and was grouped into the cohort 1 and 2, respectively (Fig. 1). The ethics committee at the First Affiliated Hospital of Shanxi Medical University approved the protocol, and the review boards agreed to waive the requirement for informed consent.
The enrolled patients were staged according to the 7th edition of the UICC-AJCC (Union for International Cancer Control American Joint Committee on Cancer System), and were followed to the end of November 2019. Overall survival (OS) was defined from the day of PET/CT scan to death for any reasons. Treatment modalities after PET/ CT examinations were followed and stratified into single treatment only, combined therapy, and no treatment. Target drugs included signal transduction inhibitors, gene expression modulators, immunotherapy and others.

Image acquisition and reconstruction
After fasting 4-6 h, blood glucose levels of all the patients were tested and were ensured lower than 10 mmol/L. Sixty minutes after intravenous injection of 18F-FDG (89-482 MBq; average activity, 367.69 MBq), patients laid on the table of PET/CT scanner. Firstly, a non-enhanced CT images (combined CT) for attenuation correction were acquired by the GE Discovery VCT scanner (GE Healthcare, WI, USA) at 140 keV with 40-80 mA (couch movement of 0.8 s and 30 mm per rotation) and a slice thickness of 5-mm. Subsequently, PET images were acquired from vertex Fig. 1 Differential diagnosis of lymph nodes. The patient has 3 lymph nodes (LN) and a primary tumor. LN 1 in our previous cohort is discarded from this study. LN 2 is included in cohort 1, but excluded from cohort 2 for SUVmax < 2.5 to mid-thigh of the patient in 3 dimensions with 11-slice overlap (speed: 4 min/bed position). At last, the thin-section serials were acquired with a spacing of 1.25 mm.
Both the combined and thin-section CT images were reconstructed by the workstation of the scanner (5.14 mm and 1.25 mm slice thickness; 21 subsets; 2 iterations, 128 × 128 matrices). Based on the scaling factors of Hounsfield units transformed from materials, the CT images were converted into PET attenuation maps which were used to reconstruct PET images.

Image segmentation and features extraction
Two experienced radiologists identified the boundary of primary tumors and LNs on the combined CT images (shortaxis diameter ≥ 3 mm). Subsequently, volumes of interest (VOI) for primary tumors and LNs were semi-automatically segmented by the 3D-Slicer (version 4.8) software, and were manually revised to better cover their boundaries (Yang et al. 2017). Subsequently, VOIs on the combined CT were copied to the thin-section CT. Finally, CT features, included firstorder, second-order, high-order, absolute gradient, shape, and size (Sollini et al. 2017), were extracted by the IBEX software (version 2.0) (Zhang et al. 2015). Additionally, because the transformed continuous features were directly used, they were not normalized as other studies.
The maximal standardized uptake values (SUVmax) of lymph nodes and primary tumors were measured on the corresponding PET images, and were normalized to the patient's body weight. The higher SUVmax of LN had, the greater malignant odds of it might be (Carkaci et al. 2012). Therefore, to differentiate malignant LNs, we selected the criteria of SUVmax ≥ 2.0 and SUVmax ≥ 2.5, separately.

Features transformation and statistical analysis
According to our malignant criterion (SUVmax ≥ 2.0 or ≥ 2.5), SD of each feature (Feature SD ) was calculated among foci including both primary tumor and LNs for each patient. In variable selection, the extreme gradient boosting decision trees (XGBoost) method was used to sort the contribution of Feature SDs by the comparably predictive index of VIMP (variable importance). XGBoost which implements Machine Learning algorithms is a highly flexible technique, and is suitable for analyzing both binary and continuous variables in survival regression. Additionally, besides the cross-validation, it can also avoid overfitting using shrinkage scales and column subsampling techniques.
Data were analyzed by the R package (version 3.6.1, a language and environment for statistical computing). Besides the basic packages, others included ggplot2, survival, mlr3, Seurat, xgboost, and et al. Two sides of P < 0.05 were considered as a significant level.

Patient characteristics
Among the lung cancer patients in our previous study (n = 137, ClinicalTrials.gov, NCT03648151) (Si et al. 2021), 117 patients had primary tumor and more than one lymph node, simultaneously. Furthermore, among the analyzed lymph nodes of the previous cohort (n = 1239), SUVmax of 413 and 322 ones were higher than 2.0 and 2.5, respectively. The differential diagnosis of lymph nodes is illustrated on Fig. 1. Finally, for the cohort 1 and 2, treatment modalities of 94 and 88 patients could be followed, and median number of malignant foci was 4 (ranged from 2 to 16) and 3.5 (ranged from 2 to 15), respectively. Additionally, median age of cohort 1 was 65y (ranged from 38 to 88 y), and was used to group the patients. In general, patient characteristics were similar between the two cohorts (Table 1), and indicated that the patient stratifications could not be obviously influenced by the thresholds of SUVmax.

Treatment modalities
In the cohort 1, after PET/CT examination, there were 8, 5, 55 and 26 patients in TNM stage I-IV, successively. Among the patients in stage I-II, 9/13 accepted surgery only, and others received chemotherapy only (n = 1), target therapy only (n = 1), combined treatment (n = 1), and no treatment (n = 1). Correspondingly, 3/81 patients in stage III-IV underwent surgery only, and others received combined therapy. The treatment modalities (Fig. 2) did not exist significant difference between stage I and II (x 2 = 0.612, p = 0.450) or stage III and IV (x 2 = 8.482, p = 0.075), but did among stage I-IV (x 2 = 23.759, p < 0.001). Therefore, in our analysis, TNM stages were stratified into I-II and III-IV. Additionally, none of this cohort was administrated radiotherapy alone which might result from the inclusion criteria that treatment after the PET/CT examination were followed.

Survival analysis of patient characteristics
During a median observation time of 40 months (ranged from 34 to 55 months), both 70 patients of cohort 1 (n = 94) and 66 patients of cohort 2 (n = 88) died in a median OS of 10 months (ranged from 0 to 40 months). In the univariate Cox analysis, patient characteristics were separately regressed against OS. In general, the squamous and adenocarcinoma patients in early-stage who accepted surgery or target therapy had longer OS than others. On Fig. 3, hazard ratios (HR) of the factors are similar between the cohorts. Compared to no treatment, surgery and target therapy were significantly against OS in both cohorts. Additionally, although total lesion glycolysis (TLG) was a significant variable in both cohorts, its range of HR was very closed to 1.
In the multi-variate Cox analysis of the two cohorts (Table 2), 3 patient characteristics (TNM stage, surgery and target therapy) were significantly against OS in both cohorts, but others were failed. Therefore, in further analysis, the factors of surgery, target therapy, and TNM stage were used as the basic model for the evaluation of the prognostic ability of Feature SD. . The Harrell's concordance index (C-index) of the model composed of the 3 variables was 0.717 (95% CI 0.662-0.772) and 0.723 (95% CI 0.672-0.774) in the cohort 1 and the cohort 2, respectively. Above all, the criteria of malignant LN could slightly affect the results of survival analysis, and C-index of the regressed model were in an overlapped 95% CI.

Feature selection for the combined and thin-section CT
On both the combined and the thin-section CT images of the enrolled patients, 141 features (1683 variables) were extracted from each lesion, and, using the R package, SD of each feature among malignant foci for every individual (Feature SD ) was calculated as continuous variables. Subsequently, the survival extreme gradient boosting method (survival XGBoost) was used to score variables including continuous ones, and to rank the prognostic ability of each feature by the variable importance (VIMP).
In the survival XGBoost analysis of the data extracted from the combined CT images, the independent variables included Feature SD and the 3 significant characteristics in the Cox regressions. On Fig. 4, 19 and 15 features with a predictive VIMP are selected for the cohort 1 and the cohort 2, respectively. Among the ranked variables, three were repetitive between the cohorts, and the measured correlation of GLCM (Gray Level Cooccurrence Matrix) ranked in the top three in both cohorts. However, none of the 3 patient characteristics were ranked high enough on the list. Above all, as continuous variables, some Feature SDs among malignant foci were strong prognostic factors for lung cancer patients, and even had much higher VIMP than those identified by the Cox analysis. By contrast, the data extracted from thin-section CT images indicated that, besides the 3 patient characteristics, no Feature SD was repetitive between the top 20 VIMP lists of the cohort 1 and cohort 2. Detailed results were in the Supplementary Data (Fig. 1S). Therefore, only the data from the combined CT images was used in our further analysis.

Feature SD evaluation
The Feature SD identified in the survival XGBoost analysis ranged from 0 to 0.437 with a median of 0.163 (0.167 ± 0.116) and from 0 to 0.419 with a median of 0.141 (0.152 ± 0.114) in the cohort1 and 2, respectively. Furthermore, we used the penalized smoothing splines method to determine the threshold of the Feature SD . Figure S2 indicates that, especially in the cohort 2, exponential HR (hazard ratio) increases with increase of the variable, and the reference value is 0.166 and 0.138 for the cohort 1 and 2, respectively. However, when using the reference values to group the patients, C-index of the feature could be even worse than directly using the continuous variable.
In the evaluation, the prognostic ability of the Feature SD was compared to the 3 significant factors selected by the Cox regressions. On Fig. 5, C-index of the Feature SD in the cohort 2 is the highest among the variables, and is the 2nd highest in the cohort 1. Furthermore, C-index of the model included Feature SD was higher than others. Nomogram and calibration plots of the model integrated the Feature SD ( Fig. 6) indicated the reliability of the model. Above all, some Feature SD (correlation of GLCM in this study) among malignant foci within an individual were powerful prognostic factors, and their prognostic ability was obviously higher than the factors of patient characteristics.

Discussion
Our results indicated that the standard deviation of CT radiomics features (Feature SD ) among malignant foci of an individual, as a representative of individual heterogeneity among malignancies (IHAM), was a powerful prognostic factor for lung cancer patients, and their prognostic ability were obviously better than patient characteristics identified by the Cox survival analysis.
In this study, we firstly compared the replicability of Feature SD between the combined and thin-section CT serials, and found that the former's was obviously repeatable than the latter in predicting the prognosis of lung patients. Secondly, in selecting prognostic factors, only one Feature SD from the combined CT images were reliable between the two cohorts (SUVmax ≥ 2.0 or ≥ 2.5). Therefore, Feature SD were sensitive in predicting the prognosis of patients, and a repeatable feature from the combined CT images could be identified.
Heterogeneity within a tumor could be observed by the degree of genomic instability and gene mutations in breast cancer (Liegmann et al. 2021;Pizon et al. 2013) and other cancers (Galvan et al. 2010;Hata et al. 2015; VanderWeele . However, the heterogeneity between the primary tumor and metastases, i.e. IHAM, was not well elucidated, and its drivers and monitors would contribute to better understood naturally occurring metastases, treatmentnaive metastatic disease, and patient prognosis (Makohon-Moore et al. 2017). In a previous study, genome sequencing of 26 metastases from 4 patients indicated that driver gene mutations in each metastatic lesion were identical for every patient, and passenger gene mutations did not account for all the consequences of heterogeneity. Above all, IHAM had critical implications for personalized treatment, and its measurement in vivo would help us to monitor treatment efficacy and to predict the prognosis of patients (Makohon-Moore et al. 2017).
Molecular biotechnology can be used to measure IHAM; however, it should be performed on excised tissue after biopsy. As indicated by our study, radiomics were another method, and its superiority was the ability of obtaining results in vivo. Although there were few radiomics studies Fig. 3 The univariate Cox analysis of cohort 1 and cohort 2. HR with 95% confidence intervals (95% CI) is plotted in the 5th column on IHAM, some studies had shown CT features related to gene mutation status. It was reported that, for both non-small cell and adenocarcinomas lung cancer patients, CT features associated to the somatic mutations of epidermal growth factor receptor (EGFR) (Sollini et al. 2017), and a set of the features could predict the mutation status with an area under the curve (AUC) of 0.647 (Liu et al. 2016). Therefore, Feature SD among malignant foci of a patient could be used to represent individual heterogeneity.
In the results of the combined CT, except Feature SD , the significant factors in the Cox regressions (TNM, surgery, and target therapy) did not appear on the list sorted by the comparable parameter of prognostic ability. Furthermore, from the point of C-index, the Feature SD was obviously higher than the three patient characteristics. Above all, these results indicated that Feature SD , as a representative of IHAM, was a powerful prognostic factor for lung cancer patients.
Considering the sample size of PET study, our patients were not limited to a specific TNM stage or treatment modality. To compensate this, we not only stratified the patients by treatment modalities, but also by each therapy method. Furthermore, all the therapy related factors were both included in the univariate and multi-variate survival analysis. At last, our results indicated that, although surgery and target therapy were significant factors against OS, their prognostic power was inferior to the identified CT Feature SD .
Additionally, not as other studies evaluating classification variables only in survival analysis, we used the survival XGBoost method to include continuous variables. As illustrated by TLG, its small changes was significant in the univariate and was not in the multi-variate Cox regression against OS; therefore, the prognostic ability of continuous variables was much lower than classification variables (Liu et al. 2019). However, the nomogram on Fig. 6 is plotted based on the multi-variate Cox analysis, and Feature SD is a significant factor in both cohorts (p = 0.032 and 0.001). Above all, the transformed continuous CT feature was a strong prognostic factor for lung cancer patients.
Because our goal was to determine whether Feature SD were prognostic factors, the results were not validated by other cohorts. However, the survival XGBoost was used to get relatively reliable results and avoid overfitting. Additionally, some Feature SD on the ranked list between the cohorts were not so repeatable which meant that the results might be sensitive to conventional CT protocols. Therefore, in future studies, we would try to find suitable acquisition protocol Fig. 6 Nomogram (upper) and calibration plot (lower) of the transformed feature and the three Cox factors. Prognostic factors on the nomogram are projected to the points axis, and the total points accumulated by each factor can be used to predict the 1 or 2-year survival rate of the patient and to validate the prognostic ability Feature SD in future studies.
In summary, SD of CT features among malignant foci of an individual, as a representative of individual heterogeneity among malignancies (IHAM), was a strong prognostic factor for lung cancer patients, and it could obviously improve the prognostic ability of the Cox model composed of patient characteristics, and superior to other factors.
Author contributions SH, LS, XS and CJ contributed to the study conception and design. Material preparation, data collection and analysis were performed by XS, WR, LL and. The first draft of the manuscript was written by HX and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding The work was funded by grants from Science and Technology project of Health Commission of Anhui Province (No. AHWJ2021b148), Collaborative Innovation Center for Molecular Imaging and Precise D&T Center (Grant No. MP201604) and Shenzhen Science and Technology Innovation Commission (JCYJ20200109120205924).

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.