Radiomics features from whole thyroid gland tissue for prediction of cervical lymph node metastasis in the patients with papillary thyroid carcinoma

We aimed to develop a clinical-radiomics nomogram that could predict the cervical lymph node metastasis (CLNM) of patients with papillary thyroid carcinoma (PTC) using clinical characteristics as well as radiomics features of dual energy computed tomography (DECT). Patients from our hospital with suspected PTC who underwent DECT for preoperative assessment between January 2021 and February 2022 were retrospectively recruited. Clinical characteristics were obtained from the medical record system. Clinical characteristics and rad-scores were examined by univariate and multivariate logistic regression. All features were incorporated into the LASSO regression model, with penalty parameter tuning performed using tenfold cross-validation, to screen risk factors for CLNM. An easily accessible radiomics nomogram was constructed. Receiver Operating Characteristic (ROC) curve together with Area Under the Curve (AUC) analysis was conducted to evaluate the discrimination performance of the model. Calibration curves were employed to assess the calibration performance of the clinical-radiomics nomogram, followed by goodness-of-fit testing. Decision curve analysis (DCA) was performed to determine the clinical utility of the established models by estimating net benefits at varying threshold probabilities for training and testing groups. A total of 461 patients were retrospectively recruited. The rates of CLNM were 49.3% (70 /142) in the training cohort and 53.3% (32/60) in the testing cohort. Out of the 960 extracted radiomics features, 192 were significantly different in positive and negative groups (p < 0.05). On the basis of the training cohort, 12 stable features with nonzero coefficients were selected using LASSO regression. LASSO regression identified 7 risk factors for CLNM, including male gender, maximum tumor size > 10 mm, multifocality, CT-reported central CLN status, US-reported central CLN status, rad-score, and TGAb. 
A nomogram was developed using these factors to predict the risk of CLNM. The AUC values in the training and testing cohorts were 0.850 and 0.797, respectively. The calibration curve together with the Hosmer–Lemeshow test for the nomogram indicated good agreement between predicted and pathological CLN statuses in the training and testing cohorts. DCA showed that the nomogram offers a superior net benefit for predicting CLNM compared to the “treat all or none” strategy across the majority of risk thresholds. A nomogram comprising the clinical characteristics as well as radiomics features of DECT and US was constructed for the prediction of CLNM for patients with PTC, which may help in determining whether lateral compartment neck dissection is warranted.


Introduction
Papillary thyroid carcinoma (PTC), which comprises 90% of all cases of thyroid cancer, has seen a rise in incidence in recent years (Seib and Sosa 2019). Various clinical characteristics of PTC have been extensively studied, such as extrathyroidal extension (ETE) and lymph node metastasis (LNM) (Kim et al. 2021). PTC is generally indolent and typically responsive to comprehensive, standardized treatments, with an excellent prognosis (Gulec et al. 2021). However, individuals with cervical lymph node metastasis (CLNM) tend to have higher rates of local recurrence, accompanied by poorer overall survival. Regional LNM has been reported to contribute to approximately 31% of local recurrences (Zou et al. 2022).
The cervical lymph nodes (CLNs) are typically the first site of LNM in PTC patients (Alsubaie et al. 2022). CLNs can be classified into two compartments: the central compartment (level VI) and the lateral compartment (levels II-V). The incidence of LNM is high in PTC patients, at around 30-80% for central LNM and up to 40% for lateral LNM (Tong et al. 2022). However, 28-33% of LNM cannot be detected before surgery (Seib and Sosa 2019).
Meanwhile, because PTC is typically indolent, ipsilateral lobectomy is often recommended for small, unifocal tumors without ETE or LNM. Unnecessary neck surgery can increase the likelihood of complications such as recurrent laryngeal nerve injury (RLNI) and permanent hypoparathyroidism (Lee et al. 2021). Thus, indications for neck dissection need to be meticulously evaluated. In PTC patients, precise preoperative diagnosis of LNM is imperative in determining whether lateral compartment neck dissection is warranted (Chang et al. 2022).
Ultrasound (US), computed tomography (CT), and cytopathology are the primary tools for preoperative assessment of CLNM in patients with PTC. Combined US and neck CT has been demonstrated to be considerably more effective than neck US alone (Lu and Chen 2022). Dual-energy CT (DECT) imaging facilitates the reconstruction of material decomposition (MD) images as well as a series of monochromatic images at photon energies in the range of 40-140 keV, from which a spectral Hounsfield unit (HU) curve can be produced. The incorporation of MD images and the spectral HU curve can assist in diagnosing CLNM in PTC (Zhao et al. 2017). However, the sensitivity of the methods above is mostly below 70% (Lu and Chen 2022; Li et al. 2019). Radiomics is a rapidly progressing methodology involving computer-aided detection/diagnosis that combines the analysis of digital images with machine learning algorithms (Mayerhoefer et al. 2020). It can overcome the limitations of subjective image interpretation, which typically relies on the experience and expertise of medical professionals. By extracting a broad spectrum of quantitative features from medical images, radiomics can translate imaging data into mineable information that can potentially be employed as predictive, diagnostic, or prognostic indicators, thereby supporting clinical decision-making.
The present study endeavored to establish a nomogram for the effective prediction of the likelihood of CLNM of PTC patients using clinical characteristics as well as radiomics features of DECT and US, thereby directing clinicians in surgical decision-making.

Patient selection
Patients with suspected PTC who received DECT scanning for preoperative assessment between January 2021 and February 2022 were retrospectively recruited from our hospital. The inclusion criteria were: (1) histologically confirmed PTC; (2) no preoperative anticancer treatment; (3) DECT scanning conducted within 2 weeks before surgery; (4) neck dissection accompanied by ipsilateral lobectomy or total thyroidectomy, with pathological diagnosis of LNs. The exclusion criteria were: (1) absence of PTC in the thyroid gland on postoperative pathological examination; (2) a history of malignant tumor; (3) a primary tumor with a maximum diameter of < 3 mm or unclear on CT images; (4) CT imaging performed more than 2 weeks before surgery or suboptimal image quality. The Institutional Review Board approved this retrospective study, and the requirement for informed consent was waived.

DECT imaging acquisition
A third-generation DECT scanner (Somatom Force; Siemens Healthcare) was used for image acquisition. DECT scans covered the region extending from the base of the skull to the upper margin of the aortic arch. A consistent protocol was used: detector configuration, 192 × 0.6 mm; gantry rotation time, 0.25 s; pitch factor, 0.6; field of view, 150 × 150 mm; slice thickness, 3 mm. Tube A operated at 80 kVp with a reference tube current of 118 mAs, while Tube B operated at Sn150 kVp with a reference tube current of 59 mAs. For contrast-enhanced scanning, 80 mL of a non-ionic iodinated contrast agent (Iohexol, 300 mg/mL iodine or Iodixanol, 320 mg/mL iodine; both from Yangzijiang Pharmaceutical Company) was administered via an automated high-pressure syringe into the cubital vein at a flow rate of 3.5 mL/s. The scan delay time was set at 50 s for the venous phase. Venous phase CT images obtained at 80 kVp were reconstructed with a 1 mm slice thickness and exported in DICOM format for radiomics analysis.

Clinical characteristics and imaging features
Clinical characteristics, such as age, gender, and serological examination results, were retrieved from the medical record system. Stringent verification was conducted by qualified specialists to ensure their accuracy and reliability. In compliance with the American Joint Committee on Cancer (AJCC) staging system (8th edition), patients were allocated into two groups based on age: ≥ 55 or < 55 years (Haugen et al. 2016). Serological examination results were obtained within 1 week before surgical treatment. Normal values were as follows: thyroglobulin (TG), 1.150-131.000 ng/ml; TG antibody (TGAb), 0-4.000 IU/ml; thyroid peroxidase antibody (TPOAb), 0-9.000 IU/ml; thyroid-stimulating hormone (TSH), 0.5600-5.9100 μIU/ml.
CT image analysis was carried out by two readers who had 3 and 10 years of experience in head and neck oncologic imaging, respectively. In cases of discrepancy between the two readers, a third reader with 20 years of experience in the same field was consulted to examine the images and provide a definitive adjudication. Throughout the study, the observers remained blinded to both the study design and the corresponding clinicopathological findings. The status of CLNs on US was taken from the US report.

Volume of interest segmentation and extraction of radiomics features
The radiomics workflow of our study is depicted in Fig. 1. Axial venous phase CT images with a 1 mm thickness were prepared for thyroid segmentation. A radiologist with 10 years of experience in head and neck oncologic imaging (reader 2) delineated the entire thyroid gland using open-source imaging software (ITK-SNAP, version 3.8.0).
Radiomics feature extraction was performed using Pyradiomics (version 2.2.0, https://github.com/Radiomics/pyradiomics), resulting in a total of 960 features. These comprised 14 shape features, 18 first-order features, 68 textural features, and 860 high-dimensional features. The high-dimensional features included 18 first-order features transformed by log-sigma, 154 textural features transformed by log-sigma, 18 first-order features transformed by wavelet, and 670 textural features transformed by wavelet.

Feature selection and rad-score calculation
To eliminate potential differences in the scales of the indices, the radiomics features were standardized using z-scores. Features were then compared between groups by the Mann-Whitney U test, and those without a significant difference (p ≥ 0.05) were removed. LASSO (Least Absolute Shrinkage and Selection Operator) regression with tenfold cross-validation was then used to refine feature selection. Ultimately, the retained features were incorporated into classification models. A radiomics signature was developed by computing a linear combination of the chosen features, weighted by their respective coefficients. The corresponding radiomics score (rad-score) was computed for each patient as follows: $\text{rad-score} = \alpha + \sum_{i=1}^{n} \beta_i X_i$, where $\alpha$ is the intercept, $X_i$ is the value of the $i$-th selected radiomics feature, $\beta_i$ is the corresponding coefficient, and $n$ is the number of selected features.
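The standardization and rad-score steps described above can be sketched as follows. This is a minimal illustration that uses only the Python standard library. The feature values, coefficients, and intercept are hypothetical toy numbers and not values from the study, and the function names `zscore_columns` and `rad_score` are introduced here for illustration only. The Mann-Whitney filtering and LASSO fitting steps are not reproduced; the coefficients are simply assumed to be their output.

```python
from statistics import mean, stdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit sample SD."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def rad_score(features, coefficients, intercept):
    """Linear combination: intercept + sum(coefficient * feature value)."""
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical toy data: 4 patients x 3 features (not values from the study).
raw = [
    [10.0, 0.20, 300.0],
    [12.0, 0.40, 280.0],
    [14.0, 0.10, 310.0],
    [16.0, 0.30, 270.0],
]
# Hypothetical LASSO output; a zero coefficient means the feature was dropped.
coefs = [0.8, 0.0, -0.5]
alpha = 0.1

z = zscore_columns(raw)
scores = [rad_score(row, coefs, alpha) for row in z]
```

Because every standardized column has mean zero, the mean rad-score over the cohort used for standardization equals the intercept, which is a convenient sanity check when reproducing such a signature.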

Risk factors selection and nomogram construction
Univariate and multivariate logistic regression analyses were employed to examine clinical characteristics as well as rad-scores. All features were incorporated into the LASSO regression model, with penalty parameter tuning conducted through tenfold cross-validation, to screen risk factors for CLNM. A radiomics-clinical model was constructed by integrating these risk factors using multivariate logistic regression analysis in the primary cohort, resulting in an easily accessible radiomics nomogram for clinicians.
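For reference, the LASSO screening step can be written in the standard L1-penalized logistic form. This is a sketch under assumptions: the text does not state the exact objective or the rule used to choose the penalty, and the symbols $y_i$ (CLNM status of patient $i$), $x_i$ (vector of candidate predictors), $p$ (number of candidates), and $\lambda$ (penalty parameter) are introduced here for illustration.

```latex
\hat{\beta}(\lambda) = \arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n}\sum_{i=1}^{n}
  \left[ y_i\,(\beta_0 + x_i^{\top}\beta)
       - \log\!\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right) \right]
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
```

In this formulation, tenfold cross-validation would be used to pick $\lambda$ (for example, the value minimizing the mean cross-validated deviance), and the predictors with nonzero entries in $\hat{\beta}(\lambda)$ are the ones retained as risk factors.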

Statistical analysis
SPSS 22.0, R software (version 4.1), and associated packages were used for statistical analyses. Continuous variables are described as median (interquartile range) or mean ± standard deviation (SD) depending on the results of the Shapiro-Wilk test, while categorical variables are expressed as proportions. The chi-squared test, Fisher's exact test, and the Mann-Whitney U test were used to compare differences between groups. To assess the performance of the developed models, receiver operating characteristic (ROC) curve analysis was conducted, with the area under the curve (AUC) serving as the primary performance metric. Calibration curves were employed to assess the calibration performance of the clinical-radiomics nomogram, followed by goodness-of-fit testing. Additionally, decision curve analysis (DCA) was performed to determine the clinical utility of the established models by estimating the net benefit at each threshold probability for the training and testing groups. A p value < 0.05 was deemed statistically significant.
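The two evaluation quantities named above, AUC and DCA net benefit, can be illustrated with a short standard-library sketch. It assumes the usual rank-based definition of AUC and the usual net benefit formula, TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The labels and predicted probabilities are hypothetical toy values, and the function names are introduced here for illustration; the study itself used SPSS and R.

```python
def auc(labels, probs):
    """AUC as the chance that a random positive case outranks a random
    negative case (ties count one half)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the 'treat all' strategy ('treat none' is always 0)."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = CLNM present, 0 = absent.
y = [1, 1, 1, 0, 0, 0, 1, 0]
p = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

model_auc = auc(y, p)
nb_model = net_benefit(y, p, 0.5)
nb_all = net_benefit_treat_all(y, 0.5)
```

A decision curve is obtained by repeating the net benefit calculation over a grid of thresholds and plotting the model against the "treat all" and "treat none" lines.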

Patient characteristics
A total of 461 patients were retrospectively recruited from our hospital, of whom 202 were enrolled in the study. Among them, 102 had central compartment metastases, while the remaining 100 had no metastases. Patients were randomly assigned to training and testing cohorts at a 7:3 ratio (Fig. 2). No significant differences were observed in any factor between the training and testing cohorts (Table 1).

Selection of features and calculation of the rad-score
Of the 960 extracted radiomics features, 192 differed significantly between the positive and negative groups (p < 0.05). LASSO regression was applied to the training cohort, leading to the selection of 12 stable features with nonzero coefficients. This feature set comprised two texture features, three first-order features transformed by wavelets, and seven texture features transformed by wavelets (supplement: Table 3). The rad-score for each patient was calculated from these features and their coefficients.

Selection of risk factors and construction of the nomogram
The results of univariate and multivariate logistic regression analyses are displayed in Table 2. LASSO regression identified seven risk factors for CLNM (Fig. 3), including male gender, maximum tumor size > 10 mm, multifocality, CT-reported central CLN status, US-reported central CLN status, rad-score, and TGAb. As shown in Fig. 4, a clinical-radiomics nomogram based on these factors was developed to predict the risk of CLNM. Each variable was assigned a proportional point value within the 0-100 range in the nomogram, corresponding to the respective regression coefficient for CLNM. To determine the probability of CLNM for an individual, a vertical line is drawn from each variable axis to the point axis to read the point value for that variable, and the point values are summed to give the total score. Finally, the total score is located on the total score scale to read off the corresponding probability of CLNM.
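The point-assignment logic of a logistic-regression nomogram can be sketched as below. This is an illustration under assumptions, not the fitted model: the coefficients, predictor ranges, and intercept are hypothetical, only three of the seven predictors are shown, and the names `MODEL`, `points`, and `probability` are introduced here. The sketch follows the common convention in which the predictor with the largest effect range spans 0-100 points and all others are scaled proportionally.

```python
import math

# Hypothetical (coefficient, min, max) per predictor; NOT the study's fitted values.
MODEL = {
    "male": (0.9, 0.0, 1.0),
    "tumor_gt_10mm": (0.7, 0.0, 1.0),
    "rad_score": (1.2, -2.0, 2.0),
}
INTERCEPT = -1.0  # hypothetical


def max_effect(model):
    """Largest change in the linear predictor any single variable can cause."""
    return max(abs(b) * (hi - lo) for b, lo, hi in model.values())


def points(model, name, value):
    """Points (0-100 scale) for one variable at the given value."""
    beta, lo, hi = model[name]
    # Zero points sit at the end of the axis with the lowest contribution.
    ref = lo if beta >= 0 else hi
    return 100.0 * beta * (value - ref) / max_effect(model)


def probability(model, intercept, patient):
    """Return (total points, predicted probability) for a patient.

    The patient dict is expected to hold a value for every predictor.
    """
    total = sum(points(model, k, v) for k, v in patient.items())
    # Linear predictor corresponding to zero total points.
    base = intercept + sum(b * (lo if b >= 0 else hi) for b, lo, hi in model.values())
    lp = base + total * max_effect(model) / 100.0
    return total, 1.0 / (1.0 + math.exp(-lp))


patient = {"male": 1.0, "tumor_gt_10mm": 0.0, "rad_score": 0.5}
total, prob = probability(MODEL, INTERCEPT, patient)
```

Converting total points back to a probability reproduces the ordinary logistic prediction, so the nomogram is a graphical reading aid for the same model rather than a separate model.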

Model evaluation and clinical use
We conducted ROC analysis on the radiomics nomogram for predicting CLNM in both the training and testing cohorts. The AUC values were 0.850 (95% confidence interval: 0.782-0.91) in the training cohort and 0.797 (95% confidence interval: 0.69-0.902) in the testing cohort (Fig. 5). The calibration curves for the radiomics nomogram showed good agreement between predicted and actual pathological CLN statuses in both the training and testing cohorts (Fig. 6), suggesting no significant deviation from an ideal fit. DCA demonstrated that the nomogram offers a superior net benefit for predicting CLNM compared to the "treat all or none" strategy across the majority of risk thresholds (Fig. 7).

Discussion
Studies have demonstrated that prophylactic central neck dissection (PCND) does not confer any significant benefit in preventing local recurrence of PTC and poses a higher risk of surgical complications (Yan et al. 2021). The extent of surgery is primarily determined by the preoperative assessment of LNs, which is also a crucial factor affecting postoperative recurrence and overall prognosis. Performing PCND in cN0 patients is still a topic of debate because of the increased risk of hypoparathyroidism and RLNI associated with this procedure (Hu et al. 2020). Patients with preoperative evidence of lateral LNM generally require a more aggressive regimen that includes lateral compartment LN dissection as well as high-dose radioactive iodine (RAI) therapy (Park et al. 2020). In the management of LNM, either therapeutic neck dissection or RAI ablation may pose a significant risk of morbidity and degrade patients' quality of life. In this context, complications should be minimized through the judicious and selective use of surgical intervention on the CLNs, reducing the number of unnecessary procedures. Hence, an accurate evaluation of CLNM prior to surgery is of paramount importance in selecting appropriate therapeutic interventions for PTC patients. Currently, preoperative judgement of whether LNs in the neck are involved remains inadequate, which makes it difficult to determine the appropriate surgical approach early on. Therefore, how to apply existing technologies to improve the accuracy of predicting CLNM has become a hot issue in clinical practice.
Presently, most medical practitioners use clinical characteristics to predict the likelihood of CLNM among PTC patients. Thyroglobulin (TG) is a large glycoprotein that serves as a substrate in thyroid hormone synthesis (Coscia et al. 2020). It is secreted by thyroid follicular epithelial cells, both in their normal state and in well-differentiated malignant thyroid tumor cells (Trimboli et al. 2015).
Metastatic LNs often display calcification, cystic necrosis, hyperechogenicity, a round shape, peripheral or mixed vascularity, and absence of an echogenic hilum. US is the primary noninvasive imaging tool for the evaluation of CLN status prior to surgery. Neck US examination plays a crucial role in the staging of PTC. Numerous previous studies have investigated the correlation between CLNM and US features of tumors. For instance, Zhan et al. identified a taller-than-wide shape and tumor size as predictive factors for LNM. In addition, the presence of calcification and a smaller distance between the mass and the capsule have been suggested as potential indicators of LNM (Zhan et al. 2019). Nonetheless, the clinical utility of US in evaluating CLNM in papillary thyroid microcarcinoma remains a topic of ongoing debate (Seib and Sosa 2019). The effectiveness of preoperative US in detecting CLNM is constrained by the presence of the overlying thyroid gland, and such LNs often exhibit little or no abnormality on preoperative imaging or upon direct inspection during surgery. The detection rate of CLNM is suboptimal, largely because US cannot consistently visualize deep anatomic structures or areas subject to acoustic shadowing from bone or air (Guo et al. 2019). Furthermore, the accuracy of US in diagnosing lateral LNM is prone to interobserver variability and subjectivity, since it is highly dependent on the skill and expertise of the operator (Park et al. 2020). Moreover, nodes < 1 cm and those located in proximity to the mandible may be missed by US (Tong et al. 2021). Hence, the predictive accuracy of US in identifying CLNM before surgery may be limited. Although US has been reported to possess high specificity (80.5-97.4%) in many studies, its sensitivity is relatively low, varying between 36.7 and 61.0% (Wang et al. 2020; Tong et al. 2021). Several studies have attempted to address the significant variability in the sensitivity (10.9-94%) and specificity (69-90%) of US when detecting CLNM (Jiwang et al. 2022).
CT holds several advantages in assessing CLNM, as it can effectively visualize LNs and illustrate their spatial relationship to peripheral vessels (Seib and Sosa 2019). Neck CT, as a supportive tool, can aid in the planning of thyroidectomy plus LN dissection, but it also has potential drawbacks. One of the major trade-offs is the exposure to ionizing radiation. Additionally, the use of iodinated contrast material in neck CT can delay or alter the timing of iodine ablation treatment, which is an important consideration for postoperative management of PTC (Lu and Chen 2022). Moreover, neck CT has demonstrated suboptimal sensitivity and specificity for detecting LNM in the central neck compartments. Therefore, we proposed to use radiomics to investigate the potential of whole thyroid CT radiomics for predicting the likelihood of LNM in PTC. However, the above-mentioned studies are based on the analysis of primary lesions, which is not suitable for CLNM evaluation in multiple-lesion PTC. Moreover, a manually drawn primary lesion often cannot reflect the full extent of the lesion. The whole thyroid CT radiomics approach has demonstrated its advantages in addressing these limitations (Xu et al. 2022).
High-quality and standardized imaging is the foundation of radiomics, and compared to traditional CT, DECT has lower radiation doses and can provide better image quality (Ren et al. 2020). In this study, we used the venous phase (50 s after contrast injection) CT images acquired at 80 kV for the radiomics analysis. The CT images at 80 kV have higher contrast and lower signal-to-noise ratio, and the venous phase CT images can better reflect the blood flow in the thyroid gland and primary lesions. The 5 radiomics features selected in this study were all features obtained from wavelet transformation, with 1 being a first-order feature and the other 4 being texture features. These features mainly reflect the values and distribution of image intensities within the region of interest (ROI). The correlation of these features with CLNM may be due to the following reasons: (1) Previous studies have shown that patients with multiple PTC lesions or larger primary lesions are more likely to experience CLNM (Liu et al. 2019; Zhang et al. 2022). These two conditions can increase the ratio of lower-enhanced primary lesions to higher-enhanced thyroid gland, resulting in an uneven distribution of gray values within the ROI. (2) LNM in PTC patients is associated with an increase in vascular endothelial growth factor-D (VEGF-D), which can promote tumor and glandular angiogenesis. These new vessels can increase the lymphatic venous linkage of the tumor, leading to more micro-metastatic cell clusters entering the lymphatic system (Shi et al. 2023). Venous phase CT images can directly reflect the blood flow in the gland and primary lesions, highlighting the texture differences among different groups of ROIs. (3) It has been suggested that chronic lymphocytic thyroiditis can cause extensive infiltration of immune cells in thyroid tissue, thereby exerting regulatory effects on the tumor microenvironment and enhancing the anti-tumor immune response, thus limiting the occurrence of CLNM in PTC patients. Chronic lymphocytic thyroiditis manifests on enhanced CT as decreased and uneven enhancement of the gland, which can also result in changes in ROI gray values.
Among the three radiomics models constructed based on the five radiomics features, the LR model showed the best performance, with an AUC of 0.656 in the training set and 0.631 in the testing set. In previous studies, Shenshasa et al. constructed an LR model based on wavelet texture features of venous phase CT images of primary PTC lesions, which had a slightly lower AUC of 0.602 compared to the LR model in this study. Li et al. applied six classifiers to construct a radiomics model on the basis of CT images of primary PTC lesions to predict CLNM, and the optimal model had better performance (AUC = 0.709) than the model in this study (Li et al. 2021). Yang et al. also used six classifiers to construct a prediction model for LNM in PTC patients, achieving a much higher performance in the testing set, with an AUC of 0.859, compared to the model in this study.
Compared to the aforementioned studies, the advantages of this study include: (1) using the entire thyroid gland instead of primary lesions as the ROI, which reduces the subjectivity of manual delineation and can more objectively reflect the overall extent of the disease; (2) widening the inclusion criteria, no longer limited to cases with a single PTC lesion, resulting in a model with stronger generalization ability. Taking the previous findings into account, we identified the risk factors with higher predictive potential for CLNM in our analysis and selected those factors for inclusion in the nomogram model.
There were several limitations in this study. Due to the complexity of the omics features extracted from the whole thyroid gland, it was challenging to select more specific features, and the predictive efficacy of the nomogram could be further improved. In the future, our research team will attempt (1) to incorporate non-contrast CT images in addition to venous phase CT images to enrich the omics features; (2) to construct a CLNM prediction model using clinical data, CT and US data, dual-energy CT quantitative parameters, etc., and combine it with the omics model; (3) to use more and newer imaging omics classification models such as XGBoost; (4) to explore the feasibility of predicting CLNM using deep learning methods, including convolutional neural networks.

Conclusion
We have developed a clinical-radiomics nomogram that integrates clinical characteristics as well as radiomics features of DECT and US to improve the accuracy of predicting CLNM among PTC patients. The clinical-radiomics nomogram has the potential to assist clinicians in charting an individualized and optimal treatment plan by providing an accurate prediction of CLNM for PTC patients.

Fig. 2 Recruitment pathway for eligible patients in this study

Fig. 3 LASSO regression for risk factors for cervical lymph node metastasis in patients with papillary thyroid carcinoma

Fig. 6 Calibration curve analysis for the clinical-radiomics nomogram in the training set (a) and the validation set (b)

Table 1 Comparison of patient baseline characteristics between the training and testing cohorts

Table 2 Results of univariate and multivariate logistic regression analyses