Predicting malignancy in thyroid nodules based on conventional ultrasound and elastography: the value of predictive models in a multi-center study

doi:10.21203/rs.3.rs-1945305/v1

Background: This multi-center study aimed to establish predictive models based on conventional ultrasound (CUS) and elastography features, in order to support appropriate preoperative diagnosis of malignancy in thyroid nodules across the risk strata defined by the 2017 American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) guidelines.

Methods: Five hundred forty-eight thyroid nodules from three centers, all pathologically confirmed by cytology or histology, were retrospectively enrolled. All nodules were examined by CUS and elastography before fine needle aspiration (FNA) and surgery. CUS characteristics of the nodules were reviewed according to the 2017 ACR TI-RADS. Binary logistic regression was used to develop prediction models from the statistically significant CUS and elastography features in each risk stratum. The models were evaluated in terms of discrimination and calibration.

Results: In the ACR model, binary logistic regression showed that patient age, taller-than-wide shape, lobulated or irregular boundary, extra-thyroid extension, microcalcification and the elastic parameter Virtual touch tissue imaging quantification (VTIQ) max were independent predictors of thyroid malignancy (p<0.05). The area under the curve (AUC) of this model was 0.912 in the training cohort, 0.877 in the internal validation cohort and 0.935 in the external validation cohort. In the ACR TR4 model, VTIQ max was the only predictor (p < 0.001), with AUCs of 0.809, 0.842 and 0.705 in the training, internal validation and external validation cohorts, respectively. In the ACR TR5 model, age, taller-than-wide shape and VTIQ max were predictors, with AUCs of 0.859, 0.830 and 0.906, respectively. All predictive models showed good calibration (p>0.05).

Conclusions: Predictive models that combine CUS and elastography features would help clinicians make an appropriate preoperative diagnosis of thyroid nodules across different risk strata. The elastography parameter VTIQ max is the most useful feature for identifying malignancy in moderately suspicious (ACR TR4) thyroid nodules.

prediction

thyroid nodule

Conventional Ultrasound

elastography

binary logistic regression

Thyroid nodules are a common clinical problem, and the prevalence of thyroid cancer has increased yearly, driven mainly by the rising incidence of papillary thyroid cancer (PTC), which accounts for the vast majority of cases[1]. Thyroid imaging reporting and data systems based on ultrasound (US) features have been widely used to stratify the risk of malignancy and to help determine at an early stage whether FNA is needed[2]. Various guidelines, such as the European Thyroid Association guidelines for ultrasound malignancy risk stratification of thyroid nodules (EU-TIRADS), the American Association of Clinical Endocrinologists, American College of Endocrinology and Associazione Medici Endocrinologi (AACE/ACE/AME) guidelines, and the updated 2017 American College of Radiology (ACR) Thyroid Imaging, Reporting and Data System (TI-RADS), give recommendations for the risk classification of thyroid nodules. In the 2017 ACR TI-RADS guideline, typical ultrasound characteristics are assigned points, and the total places a nodule in one of five levels (TR1 to TR5) representing its risk of malignancy. Such risk stratification systems allow the malignant risk of thyroid nodules to be evaluated more objectively and accurately. The number of suspicious US features and the ACR TI-RADS score are potential risk factors for cervical lymph node metastasis in patients with PTC of less than 10 mm[3]. ACR TI-RADS has been widely used to screen thyroid nodules requiring puncture and further treatment, and a higher ACR TI-RADS score forecasts an increased risk of malignancy[4]. It allows clinicians to manage patients more conveniently, effectively and cost-effectively[5].

Elastography has been recognized as an auxiliary method for ultrasound diagnosis and risk assessment of benign and malignant thyroid nodules. Combining two-dimensional shear wave elastography (2D SWE) with the ACR TI-RADS classification could improve diagnostic sensitivity and accuracy when differentiating malignancy in thyroid nodules with indeterminate FNA cytology[6]. In a multi-center study, the rate of unnecessary biopsy decreased significantly when the ACR TI-RADS classification was combined with elastography[7]. A meta-analysis reported that combining elastography with other ultrasound techniques improves the evaluation of indeterminate thyroid nodules[8]. A prospective study showed that strain elastography performed better than the Kwak TI-RADS classification in discriminating thyroid nodules, while their combination improved sensitivity[9].

Previous studies mostly predicted malignancy from individual CUS and elastography features of thyroid nodules, including 2D SWE and strain elastography (SE). In actual clinical practice, clinicians usually first use a risk stratification system to evaluate the malignant risk of a nodule and then make a comprehensive analysis in combination with the elastography results. Therefore, predicting malignancy with a CUS risk stratification system combined with elastography fits the nodule management process more closely.

Our study aimed to establish prediction models using the CUS risk stratification system and elastic parameters, and to verify their diagnostic efficacy for thyroid nodules in the ACR TR4 and TR5 classifications. In addition, the models were further verified in the different grades of nodules classified according to the ACR TI-RADS guideline, in order to determine an appropriate preoperative diagnostic method for malignancy.

Patients

This multi-center study was approved by the ethics committee of each of the three hospitals and complied with the Declaration of Helsinki. Informed consent was obtained from all individual participants included in the study. The flowchart of the inclusion and exclusion procedure is shown in Fig. 1. The inclusion criteria were as follows: (1) age ≥ 18 years; (2) the nodule was solid or mixed cystic and solid (< 75% solid); (3) sufficient normal thyroid tissue surrounded the nodule; (4) nodule size ranged from 5 mm to 50 mm; (5) the nodule had not been treated before the examinations. The exclusion criteria were as follows: (1) malignant FNA results without confirmation by surgical pathology; (2) indeterminate nodules that did not undergo repeated puncture or surgery; (3) nodules diagnosed as benign by FNA without 1 year of follow-up; (4) incomplete or poor-quality CUS or elastography images. In patients with multiple nodules, the nodule with the most malignant features was selected.

If multiple nodules had the same degree of malignancy, the largest nodule was selected. Finally, a total of 548 nodules from 548 patients were included in the study. Between June 2016 and June 2019, 478 patients were included from center 1 (Department of Medical Ultrasound, Shanghai Tenth People’s Hospital), 60 patients from center 2 (Department of Ultrasound, The First Affiliated Hospital of Harbin Medical University) and 10 patients from center 3 (Department of Ultrasound, The Second Affiliated Hospital of Kunming Medical University).

CUS, SE and 2D SWE examinations

CUS and elastography were performed with the same type of machine (Siemens ACUSON S3000 with a 4–9 MHz multi-frequency 9L4 transducer) at all three centers. Four experienced physicians from the three centers participated in image acquisition.

CUS

The operating standards for CUS images were as follows: (1) grayscale and color images of the long-axis and short-axis sections were stored and measured; (2) the target nodule was placed in the center of the image and occupied about a third of the image area wherever possible; (3) patients were asked to breathe calmly and keep the neck exposed during the examination. Characteristics of the target nodules on grayscale images were recorded, scored and graded according to the 2017 ACR TI-RADS guideline: composition (cystic or almost completely cystic, 0 points; spongiform, 0 points; mixed cystic and solid, 1 point; solid or almost completely solid, 2 points), echogenicity (anechoic, 0 points; hyperechoic or isoechoic, 1 point; hypoechoic, 2 points; very hypoechoic, 3 points), shape (wider-than-tall, 0 points; taller-than-wide, 3 points), margin (smooth, 0 points; ill-defined, 0 points; lobulated or irregular, 2 points; extra-thyroid extension, 3 points), echogenic foci (none or large comet-tail artifacts, 0 points; macrocalcifications, 1 point; peripheral calcifications, 2 points; punctate echogenic foci, 3 points).
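The scoring scheme above can be sketched in a few lines of Python. This is only an illustration, not the software used in the study. The dictionary keys and function names are hypothetical; the 3 points for very hypoechoic nodules and the point ranges for TR1 to TR5 are taken from the 2017 ACR TI-RADS white paper[19] rather than from this section; and the sketch simplifies the guideline by allowing a single choice per category.

```python
# Points per feature as listed in the paragraph above (2017 ACR TI-RADS).
# Key names are hypothetical labels chosen for this sketch.
POINTS = {
    "composition": {"cystic": 0, "spongiform": 0, "mixed": 1, "solid": 2},
    "echogenicity": {"anechoic": 0, "hyper_or_iso": 1, "hypo": 2, "very_hypo": 3},
    "shape": {"wider_than_tall": 0, "taller_than_wide": 3},
    "margin": {
        "smooth": 0,
        "ill_defined": 0,
        "lobulated_or_irregular": 2,
        "extrathyroid_extension": 3,
    },
    "echogenic_foci": {
        "none_or_comet_tail": 0,
        "macrocalcification": 1,
        "peripheral": 2,
        "punctate": 3,
    },
}


def tirads_points(features):
    """Sum the points of one selected value per category."""
    return sum(POINTS[category][value] for category, value in features.items())


def tirads_level(points):
    """Map total points to a TR level (ranges from the ACR white paper).

    Treating 0 or 1 point as TR1 is an assumption of this sketch.
    """
    if points >= 7:
        return "TR5"
    if points >= 4:
        return "TR4"
    if points == 3:
        return "TR3"
    if points == 2:
        return "TR2"
    return "TR1"


# Hypothetical nodule: solid, hypoechoic, taller-than-wide,
# irregular margin, punctate echogenic foci.
example = {
    "composition": "solid",
    "echogenicity": "hypo",
    "shape": "taller_than_wide",
    "margin": "lobulated_or_irregular",
    "echogenic_foci": "punctate",
}
example_points = tirads_points(example)
example_level = tirads_level(example_points)
```

For the hypothetical nodule above the points are 2 + 2 + 3 + 2 + 3, so it falls in the highest risk level.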

SE

Once the target nodule was stable in grayscale mode, the machine was switched to the SE elastic mode using the same probe (4–9 MHz multi-frequency 9L4 transducer, Siemens ACUSON S3000) to acquire images. Slight pressure was applied to the patient's neck, and images were acquired after 5 seconds, once the quality number reached 50. The sampling frame was set to include both the target nodule and the surrounding tissue. Differences in hardness within the region of interest (ROI) are reflected by the color map (red color means harder tissue, blue means softer). Elastic scores were classified into five different patterns as described[10]:

Elastic score-1: the nodule is displayed homogeneously in green.

Elastic score-2: the nodule is displayed predominantly in green with a few blue spots.

Elastic score-3: the nodule is displayed in roughly equal areas of blue and green.

Elastic score-4: the nodule is displayed homogeneously in blue.

Elastic score-5: the nodule and surrounding tissues are displayed homogeneously in blue.

2D SWE

On this machine, 2D SWE is implemented as Virtual touch tissue imaging quantification (VTIQ) in the elastic mode. The sampling frame was set on the grayscale image to fully include the target nodule and the surrounding tissue. The VTIQ mode was started after the image had stabilized, and four maps were acquired: SW quality, SW velocity, SW displacement and SW time. A high-quality image was selected on the basis of the SW quality map, which is shown as a color map (green color means high quality, red means low quality). Differences in hardness between tissues are displayed qualitatively in color (red color means harder tissue, blue means softer tissue) and quantitatively as a velocity (range 0 to 10 m/s). The ROI, with a size of 1 × 1 mm, was placed on the SW velocity map. The measurement was repeated 7 times at different positions within the nodule, avoiding cystic and calcified areas. The maximum of these measurements was recorded as the parameter VTIQ max. According to the machine settings, an SW velocity reading of “High”, which corresponds to the solid portion of nodules, was replaced with the upper threshold (10 m/s).
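The derivation of VTIQ max described above (the maximum of the repeated ROI readings, with a “High” reading replaced by the 10 m/s upper threshold) can be sketched as follows. The function name and the example readings are hypothetical and serve only to illustrate the rule.

```python
UPPER_THRESHOLD_M_PER_S = 10.0  # value substituted for a "High" reading


def vtiq_max(readings):
    """Return the maximum shear wave velocity (m/s) over the ROI readings.

    Each reading is either a number in m/s or the string "High",
    which is replaced by the upper threshold before taking the maximum.
    """
    values = []
    for reading in readings:
        if isinstance(reading, str) and reading.strip().lower() == "high":
            values.append(UPPER_THRESHOLD_M_PER_S)
        else:
            values.append(float(reading))
    if not values:
        raise ValueError("at least one reading is required")
    return max(values)


# Hypothetical sets of 7 readings from one nodule each.
readings_numeric = [2.41, 3.10, 2.87, 2.66, 3.52, 2.95, 3.01]
readings_with_high = [2.41, 3.10, 2.87, "High", 3.52, 2.95, 3.01]

max_numeric = vtiq_max(readings_numeric)
max_with_high = vtiq_max(readings_with_high)
```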

Predictive models

We built predictive models for all nodules and for ACR TR4 and TR5 nodules separately using binary logistic regression. Seventy percent of the 478 nodules from center 1 were assigned to the training cohort, and the remaining 30% were assigned to the internal validation cohort. A total of 70 nodules from center 2 (60/70) and center 3 (10/70) formed the external validation cohort. Binary logistic regression was applied in the training cohort to identify the predictors of malignancy. The performance of the predictive models was evaluated in terms of discrimination and calibration. Receiver operating characteristic (ROC) curves and area under the curve (AUC) values were used to evaluate the discrimination of the predictive models in the training and validation cohorts. The Hosmer-Lemeshow goodness-of-fit test was used to assess whether the three predictive models were well calibrated.
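As a reminder of what the AUC measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen malignant nodule receives a higher predicted score than a randomly chosen benign one, with ties counted as one half. This is a generic illustration in plain Python, not the R code used in the study, and the labels and scores are made up.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for malignant, 0 for benign.
    scores: predicted probabilities (or any monotone risk score).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up example: 3 malignant and 3 benign nodules.
example_labels = [0, 0, 1, 1, 0, 1]
example_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
example_auc = auc(example_labels, example_scores)
```

In the made-up example, 8 of the 9 malignant/benign pairs are ranked correctly, so the AUC is 8/9.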

Statistical analysis

Statistical analysis was performed with SPSS 20.0 and R software. SPSS was used to compare variables between the training and validation cohorts. Quantitative variables were expressed as mean ± standard deviation and compared with the independent t-test; qualitative variables were expressed as counts. A p-value greater than 0.05 indicated no significant difference in a variable between the training and validation cohorts, while a p-value less than 0.05 indicated a significant difference between the benign and malignant groups of the training cohort. R software was used to build the prediction models by binary logistic regression in the training cohort. Predictive models based on CUS and elastography features were established for the whole training cohort and for its TR4 and TR5 classifications. The models were evaluated in terms of discrimination, using the area under the receiver operating characteristic (ROC) curve (AUC) and accuracy (ACC), and calibration, using the Hosmer-Lemeshow test.
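For readers unfamiliar with the Hosmer-Lemeshow test, a minimal sketch is given below. It assumes the conventional form of the test, with cases sorted by predicted probability, split into 10 groups of near-equal size, and the statistic compared with a chi-square distribution with (groups - 2) degrees of freedom. The number of groups used in the study is not stated, so 10 is an assumption. To stay dependency-free, the p-value helper uses the closed form of the chi-square survival function that holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function, valid for even df only.

    For df = 2k: sf(x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper supports positive even df only")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(labels, probs, groups=10):
    """Return (statistic, p_value) for the Hosmer-Lemeshow test."""
    n = len(labels)
    if n < groups:
        raise ValueError("need at least one case per group")
    # Sort by predicted probability only (stable for ties).
    pairs = sorted(zip(probs, labels), key=lambda pair: pair[0])
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[(g * n) // groups:((g + 1) * n) // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1.0 - expected / size)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic, chi2_sf_even_df(statistic, groups - 2)


# Made-up, perfectly calibrated data: every case has p = 0.5
# and exactly half of the cases in each group are malignant.
toy_labels = [0, 1] * 10
toy_probs = [0.5] * 20
toy_stat, toy_p = hosmer_lemeshow(toy_labels, toy_probs)
```

A large p-value (above 0.05) means the test finds no evidence of disagreement between predicted and observed frequencies, which is how the calibration results in this study are read.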

Patients information and nodules characteristics

A total of 548 nodules (306 benign nodules/242 malignant nodules) from 548 patients were enrolled in this study from June 2016 to June 2019. Three hundred and thirty-four nodules of 334 patients (mean age: 49.39±12.35 years, range: 22-80 years) formed the training cohort and 144 nodules of 144 patients (mean age: 47.84±13.69 years, range: 18-78 years) formed the internal validation cohort, both from center 1; 70 nodules of 70 patients (mean age: 50.17±12.00 years, range: 22-76 years) from centers 2 and 3 formed the external validation cohort. Of the 334 training-cohort nodules (benign: 188, malignant: 146), 123 were classified as TR4 (benign: 107, malignant: 16) and 187 as TR5 (benign: 60, malignant: 127). There were no significant differences (Table 1) between the training and validation cohorts in patient age and gender or in nodule characteristics (size, location, ratio of benign to malignant results and elastic parameters) (p>0.05).

CUS findings

Nodules classified as ACR TR4 and TR5 are shown in Table 2. In the training cohort, the features that differed significantly between benign and malignant nodules were echogenicity, shape, margin and echogenic foci (all p<0.001). Composition was the only CUS feature related to malignancy in the TR4 classification (p=0.009). Shape and margin differed significantly between benign and malignant nodules in the TR5 classification (all p<0.001) (Table 3).

SE score and 2D SWE

The elastography results of the thyroid nodules are presented in Table 3. Both elastic parameters, SE score and VTIQ max, differed significantly between benign and malignant nodules in the whole training cohort and in the TR5 classification (all p<0.001). In TR4 nodules, VTIQ max (p=0.012) performed better than SE score and CUS features (Table 3). In the whole training cohort, 91 nodules had a VTIQ max value greater than the cut-off value (2.855 m/s), which included 87 benign and 4 malignant nodules. In the TR4 training cohort, 32 nodules (benign: 20, malignant: 12) had a VTIQ max value greater than the redefined cut-off value (3.225 m/s). These results show that VTIQ max improved the detection rate of malignant thyroid nodules in the TR4 classification (Fig 2, 3).

Predictive models

1. Prediction models based on CUS and elastography features

Based on the training cohort, nodule characteristics on CUS and elastic images were entered into the binary logistic regression predictive models:

ACR model: Binary logistic regression confirmed that age was significantly and negatively associated with the risk of thyroid malignancy. In addition, taller-than-wide shape (OR: 8.130, 95% CI: 3.947-17.867, p<0.001), lobulated or irregular boundary (OR: 3.728, 95% CI: 2.095-6.732, p<0.001), extra-thyroid extension (OR: 9.194, 95% CI: 2.393-47.348, p=0.011), microcalcification (OR: 2.871, 95% CI: 1.680-4.969, p<0.001) and VTIQ max (OR: 4.802, 95% CI: 3.102-7.807, p<0.001) were associated with an increased risk of malignancy in the ACR model.

ACR TR4 model: A positive VTIQ max result (OR: 5.248, 95% CI: 2.390-13.974, p=0.001) was the only factor independently associated with an increased risk of malignancy in the ACR TR4 model.

ACR TR5 model: Taller-than-wide shape (OR: 4.904, 95% CI: 2.421-10.602, p=0.010) and VTIQ max (OR: 4.412, 95% CI: 2.537-8.391, p<0.001) were significantly and positively associated with the risk of thyroid malignancy (Table 4).

2. Three predictive formulas were established by combining the independent risk factors for malignancy:

(1) ACR model: Logit p = - 1.862 - 0.446 * age + 3.420 * taller-than-wider + 2.223* lobulated or irregular boundary + 3.800 * extra-thyroid extension + 2.518 * microcalcification + 1.545 * VTIQ max

(2) ACR TR4 model: Logit p = - 3.278 + 1.657 * VTIQ max

(3) ACR TR5 model: Logit p = - 0.305 - 0.510 * age + 1.399 * taller-than-wider + 1.484 * VTIQ max
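To show how such a formula is turned into a predicted probability, the sketch below applies the logistic function p = 1 / (1 + exp(-logit)) to the ACR TR4 model, the only one with a single variable. The coding of the variables is not given in this section. The sketch therefore assumes that VTIQ max enters the model as a binary indicator (1 for a positive result, 0 otherwise), which is suggested by the wording "positive VTIQ max result" but not confirmed. Function and variable names are hypothetical.

```python
import math

# Coefficients of the ACR TR4 model as reported above.
TR4_INTERCEPT = -3.278
TR4_COEF_VTIQ_MAX = 1.657


def logistic(logit):
    """Convert a logit (log-odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def tr4_probability(vtiq_max_positive):
    """Predicted probability of malignancy for a TR4 nodule.

    ASSUMPTION: VTIQ max is coded 1 (positive) or 0 (negative);
    the source does not state the coding.
    """
    x = 1.0 if vtiq_max_positive else 0.0
    return logistic(TR4_INTERCEPT + TR4_COEF_VTIQ_MAX * x)


p_negative = tr4_probability(False)  # about 0.04
p_positive = tr4_probability(True)   # about 0.17

# Exponentiating a logistic coefficient gives the odds ratio;
# exp(1.657) is about 5.24, in line with the reported OR of 5.248.
tr4_odds_ratio = math.exp(TR4_COEF_VTIQ_MAX)
```

Under this assumed coding, a positive VTIQ max result raises the predicted probability of malignancy in a TR4 nodule from roughly 4% to roughly 17%. The ACR and ACR TR5 formulas could be evaluated the same way once the coding of age and the other variables is known.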

3. Evaluating performance and Goodness Test of prediction models

3.1 Discrimination

The performance of the predictive models in the training and validation cohorts, evaluated by ROC curves (Fig 3) and AUC values, is shown in Table 5. The ACR model yielded an AUC of 0.912 (95% CI 0.880–0.944) and a diagnostic accuracy (ACC) of 85.9% in the training cohort. This was confirmed in the internal and external validation cohorts, with AUCs of 0.877 (95% CI 0.818–0.935) and 0.935 (95% CI 0.884–0.986) and ACCs of 77.7% and 88.2%, respectively. In the ACR TR4 model, a positive VTIQ max result was the only variable associated with increased risk; the model yielded an AUC of 0.809 (95% CI 0.684–0.935) and an ACC of 89.4% in the training cohort, and AUCs of 0.842 (95% CI 0.719–0.962) and 0.705 (95% CI 0.271–1) with ACCs of 82.0% and 80.9% in the internal and external validation cohorts, respectively. In the ACR TR5 model, the AUC was 0.859 (95% CI 0.801–0.918) with an ACC of 82.3% in the training cohort, and the performance was verified in the internal and external validation cohorts, with AUCs of 0.830 (95% CI 0.716–0.945) and 0.906 (95% CI 0.816–0.995) and ACCs of 79.2% and 86.2%, respectively.

3.2 Calibration

All three models were well calibrated (Hosmer-Lemeshow test, all p > 0.05), meaning that all models showed good agreement between prediction and observation (Table 4).

CUS is the main noninvasive imaging tool for assessing the risk of malignancy in thyroid nodules and for guiding FNA. Other imaging methods, such as elastography, which evaluates tissue stiffness, can also distinguish benign from malignant thyroid nodules well[11, 12]. While ultrasound elastography holds promise as a noninvasive means of assessing cancer risk, its performance is highly variable, perhaps because of factors such as operator dependence. In this study, we identified risk factors associated with thyroid malignancy in different risk strata after comprehensively evaluating ultrasound and elastic variables in a training population of 334 patients. We then developed predictive models for the different grades of nodules and validated them in internal and external validation cohorts with respect to discrimination and calibration. This approach helps avoid missing features of malignant thyroid nodules in different risk strata and assists clinicians in making more accurate preoperative decisions.

In this study, according to binary logistic regression, age was a negative independent predictor of thyroid malignancy in the ACR TR5 model (p = 0.012) but not in the ACR TR4 model. This result indicates that younger age may be associated with a higher probability of malignancy in nodules of higher risk grade. Other studies have also identified younger age as an independent risk factor for thyroid cancer. However, it is the incidence of thyroid cancer diagnosis, rather than disease prevalence, that has increased dramatically in the past 30 years[13]. Modern medical practices have resulted in heightened detection of subclinical disease, increasing the representation of low-risk patients in the cohort of diagnosed patients[14].

Distinguishing benign from malignant nodules by FNA is very helpful for the detection of thyroid cancer[15]. However, most nodules are benign and up to one-third of fine-needle aspiration biopsies may be nondiagnostic, so choosing the preferred preoperative approach remains a challenge[16]. A reliable, reasonable and noninvasive method to determine which nodules require FNA is essential. CUS is widely used to differentiate malignant from benign nodules[17]. Five CUS imaging features of thyroid nodules, namely solid composition, hypoechogenicity, taller-than-wide shape, irregular margins and microcalcification, are the most important predictors of malignancy[18].

The 2017 ACR TI-RADS committee has provided recommendations for the diagnosis of thyroid nodules[19]. In this study, the ultrasound features of each nodule were assigned in accordance with this guideline to predict malignancy. Binary logistic regression was performed to evaluate the association between these features and malignancy. Taller-than-wide shape, lobulated or irregular boundary, extra-thyroid extension and microcalcification were confirmed as predictors of malignancy. In the ACR TR5 model, however, the diagnostic value of some of these features decreased once they were entered into the binary logistic regression, probably because their weak predictive role was masked by the other covariates.

In this study, malignant nodules accounted for only 13% of the TR4 classification, and the 2017 ACR guidelines recommend FNA for TR4 nodules larger than 1.5 cm. According to our results, of the 16 malignant TR4 nodules, 75% had diameters of less than 1.5 cm, and on ultrasound only 2 nodules (17%) showed microcalcification and 5 nodules (41.6%) showed irregular boundaries. US therefore has limitations in differentiating malignant from benign nodules in this category. However, 2D SWE, and especially VTIQ max, showed good diagnostic performance for TR4 nodules in the regression analysis. Other studies have also reported that SWE is a highly accurate diagnostic modality for the identification of malignant thyroid nodules[20]. In the study by Bardet et al, different subtypes of thyroid cancer could be quantitatively distinguished by SWE[21]. Other research has reported that, for nodules that cannot be diagnosed by FNA, SWE can distinguish benign from malignant nodules using a cut-off value of 22 kPa[22].

Discrimination and calibration are the most commonly used pair of indicators for evaluating predictive models, and both were examined by internal and external validation in this study. The AUC was used to evaluate discrimination. Witczak et al retrospectively reviewed the demographic, biochemical and ultrasound characteristics of 536 thyroid nodules. They reported that serum thyrotropin, sex, microcalcification and margin were independent predictors of malignancy. A model consisting of these variables and age group was developed and demonstrated an AUC of 0.770[23]. In another study, a radiomics score demonstrated an AUC of 0.921 in the training cohort, which was statistically similar to the AUCs of the ACR score assessed by senior radiologists, and its good discrimination performance was confirmed in the validation cohort[24]. In our research, the overall training cohort demonstrated an AUC of 0.912, which was similar to the AUCs of the internal and external validation cohorts (0.877 and 0.935). Good discrimination performance was also confirmed in the TR4 (0.809, 0.842 and 0.705) and TR5 (0.859, 0.830 and 0.906) classifications. The discrimination performance of the model combining clinical, CUS and elastography characteristics was thus confirmed in the validation cohorts and was no worse than that of the radiomics score. A calibration p-value of more than 0.05 indicates good agreement between the predicted and actual probabilities. The three prediction models were well calibrated, with p-values of 0.211, 0.890 and 0.737, which suggests that there was no statistically significant difference between the predicted and the observed values.

Our study has several limitations. Firstly, the external validation cohort was small, which led to a wide 95% CI (0.271-1) for the AUC of the ACR TR4 model in the external validation cohort. The external sample size would have to be increased to make the results more accurate. Secondly, we did not analyze TR2 and TR3 nodules, because these categories are not suspicious and mildly suspicious for malignancy, respectively, and because their sample sizes were small. However, there was still 1 (25%) PTC in the TR2 classification and 2 (10%) in the TR3 classification of the training cohort, 3 (27%) in TR3 of the internal validation cohort and 3 (30%) in TR3 of the external validation cohort. Therefore, although this study provides initial evidence that multi-factor predictive models can be useful for predicting malignancy of thyroid nodules, more multi-center studies with large samples should be performed to validate our results.

Although the diagnostic performance of CUS and elastography was comparable in the overall sample, the elastography parameter VTIQ max showed diagnostic superiority in moderately suspicious (ACR TR4) thyroid nodules. Tailoring preoperative examinations to the risk stratification helps avoid missing features of potentially malignant thyroid nodules. Predictive models that combine CUS and elastography characteristics would help clinicians make a more appropriate preoperative diagnosis.

Foundation

This work was supported by the National Natural Science Foundation of China (Grants No. 81927801, 81725008, 81772849 and 82171942), the Science and Technology Commission of Shanghai Municipality (Grants No. 19DZ2251100, 19441903200 and 20ZR1443400) and the Shanghai Municipal Health Commission (Grants No. 2019LJ21 and SHSLCZDZK03502).

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Chang-Jun Wu, Qi Chen, Chun-Juan Xia, Bo-Ji Liu, Yun-Yun Liu, Hui-Xiong Xu and Yi-Feng Zhang. The first draft of the manuscript was written by Ying Zhang, Qiong-Yi Huang and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Ethics approval

This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of each of the three hospitals (Shanghai Tenth People’s Hospital, The First Affiliated Hospital of Harbin Medical University and The Second Affiliated Hospital of Kunming Medical University).

Sherman SI. Thyroid carcinoma. Lancet. 361, 501-511 (2003). https://doi.org/10.1016/s0140-6736(03)12488-9
Kwak JY, Han KH, Yoon JH, Moon HJ, Son EJ, Park SH et al. Thyroid imaging reporting and data system for US features of nodules: a step in establishing better stratification of cancer risk. Radiology. 260, 892-899 (2011). https://doi.org/10.1148/radiol.11110206
Park HM, Lee JH, Kwak JY, Park VY, Rho M, Lee M et al. Using ultrasonographic features to predict the outcomes of patients with small papillary thyroid carcinomas: a retrospective study implementing the 2015 ATA patterns and ACR TI-RADS categories. Ultrasonography. 41, 298-306 (2022). https://doi.org/10.14366/usg.21097
Modi L, Sun W, Shafizadeh N, Negron R, Yee-Chang M, Zhou F et al. Does a higher American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) score forecast an increased risk of malignancy? A correlation study of ACR TI-RADS with FNA cytology in the evaluation of thyroid nodules. Cancer Cytopathol. 128, 470-481 (2022). https://doi.org/10.1002/cncy.22254
Horvath E, Majlis S, Rossi R, Franco C, Niedmann JP, Castro A et al. An ultrasonogram reporting system for thyroid nodules stratifying cancer risk for clinical management. J Clin Endocrinol Metab. 94, 1748-1751 (2009). https://doi.org/10.1210/jc.2008-1724
Zhang WB, Li JJ, Chen XY, He BL, Shen RH, Liu H et al. SWE combined with ACR TI-RADS categories for malignancy risk stratification of thyroid nodules with indeterminate FNA cytology. Clin Hemorheol Microcirc. 76, 381-390 (2020). https://doi.org/10.3233/CH-200893
Wang HX, Lu F, Xu XH, Zhou P, Du LY, Zhang Y et al. Diagnostic Performance Evaluation of Practice Guidelines, Elastography and Their Combined Results for Thyroid Nodules: A Multicenter Study. Ultrasound Med Biol. 46, 1916-1927 (2020). https://doi.org/10.1016/j.ultrasmedbio.
Qiu Y, Xing Z, Liu J, Peng Y, Zhu J, Su A. Diagnostic reliability of elastography in thyroid nodules reported as indeterminate at prior fine-needle aspiration cytology (FNAC): a systematic review and Bayesian meta-analysis. Eur Radiol. 30, 6624-6634 (2020). https://doi.org/10.1007/s00330-020-07023-0
Cantisani V, David E, Grazhdani H, Rubini A, Radzina M, Dietrich CF et al. Prospective Evaluation of Semiquantitative Strain Ratio and Quantitative 2D Ultrasound Shear Wave Elastography (SWE) in Association with TIRADS Classification for Thyroid Nodule Characterization. Ultraschall Med. 40, 495-503 (2019). https://doi.org/10.1055/a-0853-1821
Itoh A, Ueno E, Tohno E, Kamma H, Takahashi H, Shiina T et al. Breast disease: clinical application of US elastography for diagnosis. Radiology. 239, 341-350 (2006). https://doi.org/10.1148/radiol.2391041676
Asteria C, Giovanardi A, Pizzocaro A, Cozzaglio L, Morabito A, Somalvico F et al. US-elastography in the differential diagnosis of benign and malignant thyroid nodules. Thyroid. 18, 523-531 (2008). https://doi.org/10.1089/thy.2007.0323
Liu B, Liang J, Zheng Y, Xie X, Huang G, Zhou L et al. Two-dimensional shear wave elastography as promising diagnostic tool for predicting malignant thyroid nodules: a prospective single-centre experience. Eur Radiol. 25, 624-634 (2015). https://doi.org/10.1007/s00330-014-3455-8
Davies L, Welch HG. Current thyroid cancer trends in the United States. JAMA Otolaryngol Head Neck Surg. 140, 317-322 (2014). https://doi.org/10.1001/jamaoto.2014.1
Feinstein AR, Sosin DM, Wells CK. The Will Rogers phenomenon. Stage migration and new diagnostic techniques as a source of misleading statistics for survival in cancer. N Engl J Med. 312, 1604-1608 (1985). https://doi.org/10.1056/NEJM198506203122504
Singh Ospina N, Brito JP, Maraka S, Espinosa de Ycaza AE, Rodriguez-Gutierrez R, Gionfriddo MR et al. Diagnostic accuracy of ultrasound-guided fine needle aspiration biopsy for thyroid malignancy: systematic review and meta-analysis. Endocrine. 53, 651-661 (2016). https://doi.org/10.1007/s12020-016-0921-x
Castro MR, Gharib H. Continuing controversies in the management of thyroid nodules. Ann Intern Med. 142, 926-931 (2005). https://doi.org/10.7326/0003-4819-142-11-200506070-00011
Brito JP, Gionfriddo MR, Al Nofal A, Boehmer KR, Leppin AL, Reading C et al. The accuracy of thyroid nodule ultrasound to predict thyroid cancer: systematic review and meta-analysis. J Clin Endocrinol Metab. 99, 1253-1263 (2014). https://doi.org/10.1210/jc.2013-2928
Smith-Bindman R, Lebda P, Feldstein VA, Sellami D, Goldstein RB, Brasic N et al. Risk of thyroid cancer based on thyroid ultrasound imaging characteristics: results of a population-based study. JAMA Intern Med. 173, 1788-1796 (2013). https://doi.org/10.1001/jamainternmed.2013.9245
Tessler FN, Middleton WD, Grant EG, Hoang JK, Berland LL, Teefey SA et al. ACR Thyroid Imaging, Reporting and Data System (TI-RADS): White Paper of the ACR TI-RADS Committee. J Am Coll Radiol. 14, 587-595 (2017). https://doi.org/10.1016/j.jacr.2017.01.046
Lin P, Chen M, Liu B, Wang S, Li X. Diagnostic performance of shear wave elastography in the identification of malignant thyroid nodules: a meta-analysis. Eur Radiol. 24, 2729-2738 (2014). https://doi.org/10.1007/s00330-014-3320-9
Bardet S, Ciappuccini R, Pellot-Barakat C, Monpeyssen H, Michels JJ, Tissier F et al. Shear Wave Elastography in Thyroid Nodules with Indeterminate Cytology: Results of a Prospective Bicentric Study. Thyroid. 27, 1441-1449 (2017). https://doi.org/10.1089/thy.2017.0293
Samir AE, Dhyani M, Anvari A, Prescott J, Halpern EF, Faquin WC et al. Shear-Wave Elastography for the Preoperative Risk Stratification of Follicular-patterned Lesions of the Thyroid: Diagnostic Accuracy and Optimal Measurement Plane. Radiology. 277, 565-573 (2015). https://doi.org/10.1148/radiol.2015141627
Witczak J, Taylor P, Chai J, Amphlett B, Soukias JM, Das G et al. Predicting malignancy in thyroid nodules: feasibility of a predictive model integrating clinical, biochemical, and ultrasound characteristics. Thyroid Res. 9, 4 (2016). https://doi.org/10.1186/s13044-016-0033-y
Liang J, Huang X, Hu H, Liu Y, Zhou Q, Cao Q et al. Predicting Malignancy in Thyroid Nodules: Radiomics Score Versus 2017 American College of Radiology Thyroid Imaging, Reporting and Data System. Thyroid. 28, 1024-1033 (2018). https://doi.org/10.1089/thy.2017.0525

Table 1

Information for thyroid nodules in training and validation cohorts

Characteristic	Training cohort (n=334)	Internal validation cohort (n=144)	External validation cohort (n=70)	p-value*	p-value#
Sex (male / female)	82 / 252	30 / 114	11 / 59	0.379	0.110
Age	49.39±12.35	47.84±13.69	50.17±12.00	0.225	0.629
Size	13.54±9.13	12.90±8.21	13.90±8.03	0.466	0.764
Location (left / right / isthmus)	184 / 140 / 10	76 / 64 / 4	47 / 43 / 1	0.876	0.445
Pathology result (benign / malignant)	188 / 146	78 / 66	40 / 30	0.669	0.890
Composition (cystic or almost completely cystic / spongiform / mixed cystic and solid / solid or almost completely solid)	0 / 2 / 7 / 325	0 / 2 / 5 / 137	0 / 0 / 5 / 65	0.459	0.064
Echogenicity (anechoic / hyper or iso / hypo / very hypo)	0 / 35 / 289 / 10	0 / 13 / 123 / 8	0 / 18 / 48 / 4	0.377	0.001
Shape, taller/wider (<1 / >1)	276 / 58	116 / 28	54 / 16	0.701	0.508
Margin (smooth / ill-defined / lobulated or irregular / extra-thyroid extension)	131 / 31 / 157 / 15	60 / 12 / 64 / 8	35 / 1 / 31 / 3	0.889	0.101
Echogenic foci (none / macro / peripheral / micro)	159 / 11 / 3 / 161	77 / 4 / 0 / 63	38 / 0 / 3 / 29	0.474	0.049
SE score (1 / 2 / 3 / 4 / 5)	4 / 43 / 90 / 164 / 33	0 / 22 / 44 / 68 / 10	0 / 10 / 20 / 30 / 10	0.454	0.643
VTIQ max	3.39±1.27	3.31±0.90	3.09±0.87	0.079	0.056

p-values reflect the differences between the training and validation cohorts.

* reflects the differences between the training and internal validation cohorts.

# reflects the differences between the training and external validation cohorts.
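
As an illustration of how a categorical row of Table 1 can be compared between cohorts, the sketch below applies an uncorrected Pearson chi-square test to the sex counts. The choice of test is an assumption: this section does not state which test produced the tabulated p-values, and the function name `pearson_chi2_2x2` is hypothetical. Under that assumption the results come out close to the 0.379 and 0.110 listed for sex.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail of the chi-square
    # distribution reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Sex row of Table 1 (male, female): training cohort vs each validation cohort.
training = (82, 252)
for name, cohort in [("internal", (30, 114)), ("external", (11, 59))]:
    chi2, p_value = pearson_chi2_2x2(*training, *cohort)
    print(f"training vs {name}: chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Rows with more than two categories (location, margin, and so on) need the general chi-square distribution and are not covered by this shortcut.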

Table 2

Information for ACR TI-RADS classification of thyroid nodules.

ACR TI-RADS classification	Training cohort (n=334)		Internal validation cohort (n=144)		External validation cohort (n=70)
ACR TI-RADS classification	Benign (n=188)	Malignant (n=146)	Benign (n=78)	Malignant (n=66)	Benign (n=40)	Malignant (n=30)
TR4	107	16	40	10	17	8
TR5	60	127	23	53	5	22

ACR TI-RADS: Thyroid Imaging Reporting and Data System by the American College of Radiology.
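
The counts in Table 2 translate directly into an observed malignancy rate per category and cohort. The short sketch below does that arithmetic; the dictionary layout and the helper name `malignancy_rate` are illustrative choices, not part of the study.

```python
# (benign, malignant) counts per ACR TI-RADS category, copied from Table 2.
counts = {
    "training": {"TR4": (107, 16), "TR5": (60, 127)},
    "internal validation": {"TR4": (40, 10), "TR5": (23, 53)},
    "external validation": {"TR4": (17, 8), "TR5": (5, 22)},
}

def malignancy_rate(benign, malignant):
    """Fraction of nodules in a category that were malignant."""
    return malignant / (benign + malignant)

for cohort, categories in counts.items():
    for category, (benign, malignant) in categories.items():
        rate = malignancy_rate(benign, malignant)
        total = benign + malignant
        print(f"{cohort:20s} {category}: {rate:.1%} ({malignant}/{total})")
```

In the training cohort, for example, this gives 16 of 123 TR4 nodules and 127 of 187 TR5 nodules as malignant.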

Table 3

Information for thyroid nodules in the training cohort

Characteristic	All data			TR4 classification			TR5 classification		
Characteristic	Benign (n=188)	Malignant (n=146)	p-value	Benign (n=107)	Malignant (n=16)	p-value	Benign (n=60)	Malignant (n=127)	p-value
Sex (male / female)	41 / 147	41 / 105	0.186	22 / 85	2 / 14	0.735	15 / 45	38 / 89	0.366
Age	51.36±11.10	46.68±13.39	0.001	51.47±11.03	47.50±12.35	0.189	51.47±11.15	46.53±13.15	0.009
Size	14.02±8.93	12.94±9.37	0.284	14.96±9.04	16.10±11.94	0.652	9.72±4.10	12.40±8.77	0.052
Location (left / right / isthmus)	104 / 77 / 7	80 / 63 / 3	0.650	56 / 47 / 4	9 / 6 / 1	0.823	32 / 25 / 3	70 / 55 / 2	0.399
Composition (cystic or almost completely cystic / spongiform / mixed cystic and solid / solid or almost completely solid)	0 / 1 / 6 / 181	0 / 1 / 144	0.280	0 / 107	0 / 1 / 15	0.009	0 / 60	0 / 127	0.324
Echogenicity (anechoic / hyper or iso / hypo / very hypo)	3 / 28 / 155 / 2	0 / 4 / 134 / 8	0.000	0 / 1 / 11 / 95	0 / 1 / 15	0.811	0 / 58 / 2	0 / 1 / 118 / 8	0.548
Shape, taller/wider (<1 / >1)	180 / 9	96 / 49		107 / 0	16 / 0	/	51 / 9	78 / 49	0.003
Margin (smooth / ill-defined / lobulated or irregular / extra-thyroid extension)	109 / 24 / 53 / 2	22 / 7 / 104 / 13		67 / 19 / 21 / 0	8 / 3 / 5 / 0	0.532	22 / 5 / 31 / 2	11 / 4 / 99 / 13	0.000
Echogenic foci (none / macro / peripheral / micro)	116 / 8 / 2 / 62	43 / 3 / 1 / 99		89 / 8 / 2 / 0	19 / 1 / 0 / 2	0.858	6 / 0 / 54	27 / 2 / 1 / 97	0.152
SE score (1 / 2 / 3 / 4 / 5)	3 / 37 / 73 / 70 / 5	1 / 6 / 17 / 94 / 28		2 / 23 / 43 / 38 / 1	0 / 2 / 4 / 9 / 1	0.086	0 / 11 / 17 / 28 / 4	0 / 14 / 30 / 111 / 31	0.000
VTIQ max	2.89±0.59	4.05±1.59		2.84±0.59	4.15±1.83	0.012	3.01±0.61	4.06±1.57	

All data: All samples included.

p-value reflects the differences between the benign and malignant nodules.

VTIQ: Virtual touch tissue imaging quantification.

SE: strain elastography.

ACR: American College of Radiology.
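
For the continuous rows of Table 3, a benign-versus-malignant comparison can be approximated from the summary statistics alone. The sketch below computes a Welch t statistic and a large-sample (normal approximation) two-sided p-value for age in the "All data" columns. Both the choice of test and the helper name `welch_from_summary` are assumptions, since this section does not name the test behind the tabulated p-values.

```python
import math

def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and a large-sample two-sided p-value from summary statistics."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / se
    # Normal approximation to the t distribution; reasonable for samples this large.
    p_value = math.erfc(abs(t) / math.sqrt(2))
    return t, p_value

# Age, "All data": benign 51.36±11.10 (n=188) vs malignant 46.68±13.39 (n=146).
t, p_value = welch_from_summary(51.36, 11.10, 188, 46.68, 13.39, 146)
print(f"t = {t:.2f}, p = {p_value:.4f}")
```

The result is of the same order as the 0.001 listed for age. The approximation is less trustworthy for the small TR4 malignant group (n=16).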

Table 4

Binary Logistic Regression Analysis for Nodule Malignancy in the Training Cohort

Model	Characteristic	B	SE	OR	95% CI	p-value	p-value* (Hosmer-Lemeshow test)
ACR model	Age	-0.423	0.160	0.645	(0.500, 0.849)	0.008	0.211
	Taller-than-wider	2.095	0.456	8.130	(3.947, 17.867)	0.000	
	Lobulated or irregular boundary	1.315	0.353	3.728	(2.095, 6.732)	0.000	
	Extra-thyroid extension	2.218	0.880	9.194	(2.393, 47.348)	0.011	
	Microcalcification	1.054	0.328	2.871	(1.680, 4.969)	0.001	
	VTIQ max	1.569	0.279	4.802	(3.102, 7.807)	0.000	
ACR TR4 model	VTIQ max	1.657	0.525	5.248	(2.390, 13.974)	0.001	0.890
ACR TR5 model	Age	-0.473	0.188	0.623	(0.452, 0.845)	0.012	0.737
	Taller-than-wider	1.590	0.446	4.904	(2.421, 10.602)	0.000	
	VTIQ max	1.585	0.363	4.412	(2.537, 8.391)	0.000	

B: regression coefficient of the predictor, SE: standard error, OR: odds ratio, CI: confidence interval.

Models with a Hosmer-Lemeshow test p-value* greater than 0.05 were considered well calibrated.

VTIQ: Virtual touch tissue imaging quantification.

ACR: American College of Radiology.
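
In a binary logistic regression, the odds ratio for a predictor is the exponential of its coefficient, OR = exp(B). The sketch below checks this for the VTIQ max row of the ACR model and also computes a Wald 95% interval, exp(B ± 1.96·SE). The Wald interval is shown only as one common construction. This section does not say how the tabulated intervals were obtained, so the two need not coincide. The helper names are hypothetical.

```python
import math

def odds_ratio(b):
    """Odds ratio for a one-unit increase in a predictor with logistic coefficient b."""
    return math.exp(b)

def wald_ci(b, se, z=1.96):
    """Wald confidence interval for the odds ratio: exp(b - z*se), exp(b + z*se)."""
    return math.exp(b - z * se), math.exp(b + z * se)

# VTIQ max in the ACR model (Table 4): B = 1.569, SE = 0.279.
b, se = 1.569, 0.279
low, high = wald_ci(b, se)
print(f"OR = {odds_ratio(b):.3f}, Wald 95% CI = ({low:.3f}, {high:.3f})")
```

A negative coefficient, such as the one for age, gives an odds ratio below 1, meaning the odds of malignancy fall as the predictor increases.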

Table 5

AUC and ACC of training and validation cohorts in three predictive models

Models	Training cohort AUC (95% CI)	Training cohort ACC (%)	Internal validation cohort AUC (95% CI)	Internal validation cohort ACC (%)	External validation cohort AUC (95% CI)	External validation cohort ACC (%)
ACR model	0.912 (0.880-0.944)	85.9	0.877 (0.818-0.935)	77.7	0.935 (0.884-0.986)	88.2
ACR TR4 model	0.809 (0.684-0.935)	89.4	0.842 (0.719-0.962)	82.0	0.705 (0.271-1.000)	80.9
ACR TR5 model	0.859 (0.801-0.918)	82.3	0.830 (0.716-0.945)	79.2	0.906 (0.816-0.995)	86.2

AUC: area under the curve, ACC: accuracy.

ACR: American College of Radiology.
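
To make the two metrics in Table 5 concrete, the sketch below computes AUC and accuracy from predicted probabilities. The labels and scores are hypothetical toy values, not study data, and the 0.5 cutoff used for accuracy is an assumption because this section does not state the threshold applied in the study.

```python
def auc(labels, scores):
    """AUC as the probability that a malignant case outscores a benign one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly when scores >= threshold are called malignant."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical toy data: 1 = malignant, 0 = benign.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.60, 0.35, 0.80, 0.90]
print(f"AUC = {auc(labels, scores):.3f}, ACC = {accuracy(labels, scores):.1%}")
```

AUC depends only on how the scores rank the cases, whereas accuracy changes with the chosen cutoff, which is why a model can rank well and still show a lower ACC.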
