A total of 1,040 patients were included in this study, and all patients underwent total thyroidectomy and central neck dissection. Of these, 180 patients (17.3%) underwent lateral neck dissection due to lateral LN metastasis. The average tumor size was 12.4 mm (range, 2-125 mm). ETE was observed in 586 patients (56.3%), tumor multiplicity in 178 patients (17.1%), and ENE in 159 patients (15.3%). With respect to T classification, 508 patients were T1, 46 patients were T2, 483 patients were T3, and three patients were T4. The average number of metastatic LNs in the central compartment was 1.73 (range, 0-19) and the average number of removed LNs was 8.32 (range, 0-36). LNR was obtained by dividing the number of metastatic LNs in the central compartment by the total number of LNs removed. The average LNR value was 0.20 (range, 0-1). The mean follow-up period was 79.0 months (range, 46-149), and the total number of recurrence events during the study period was 41. Other clinico-pathological information is summarized in Table 1.
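The LNR definition given above can be written as a short function. This is a minimal sketch only: the function name `lymph_node_ratio` is hypothetical, and returning `None` when no LNs were removed is an assumption, since the text does not state how such cases were handled.

```python
from typing import Optional


def lymph_node_ratio(metastatic: int, removed: int) -> Optional[float]:
    """Central-compartment LNR = metastatic LNs / total LNs removed."""
    if metastatic < 0 or removed < 0:
        raise ValueError("counts must be non-negative")
    if metastatic > removed:
        raise ValueError("metastatic LNs cannot exceed removed LNs")
    if removed == 0:
        # Assumption: the ratio is left undefined when no LNs were retrieved;
        # the text does not say how such cases were handled.
        return None
    return metastatic / removed


print(lymph_node_ratio(2, 8))  # 0.25
```

By construction the ratio lies between 0 and 1, which matches the reported range.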
The cut-off value for the LNR was set to give the optimal sensitivity and specificity for recurrence prediction. LNR was significantly correlated with recurrence (p < 0.001), with an AUC of 0.752, and 0.24 was set as the optimal cut-off value. There were 519 patients (49.9%) with LNR = 0, 179 (17.2%) with 0 < LNR < 0.24, and 342 (32.9%) with LNR ≥ 0.24. Recurrence-free survival was significantly lower in the group with LNR ≥ 0.24 than in the other two groups (Fig. 1A). The cut-off value for the number of metastatic LNs was likewise set to give the optimal sensitivity and specificity for recurrence prediction. The number of metastatic LNs was significantly correlated with recurrence (p < 0.001), with an AUC of 0.742, and 2 was set as the cut-off value. There were 519 patients (49.9%) with no metastatic LNs, 161 (15.5%) with one metastatic LN, and 360 (34.6%) with two or more metastatic LNs. Recurrence-free survival was compared across these three groups and was significantly lower in patients with two or more metastatic LNs (Fig. 1B).
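One common way to choose a cut-off that balances sensitivity and specificity is to maximize the Youden index (J = sensitivity + specificity - 1) along the ROC curve. The text does not name the criterion actually used, so the sketch below is an assumption, and the function names and the toy data are hypothetical, not the study cohort.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score,
    calling a case positive when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, tp / pos, tn / neg))
    return points


def youden_cutoff(scores, labels):
    """Threshold maximizing J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)[0]


def auc(scores, labels):
    """AUC as the probability that a recurrent case scores higher than a
    non-recurrent case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (LNR per patient, 1 = recurrence); NOT study data.
lnr = [0.0, 0.0, 0.1, 0.2, 0.3, 0.5, 0.6, 1.0]
recurred = [0, 0, 0, 0, 1, 0, 1, 1]

cutoff = youden_cutoff(lnr, recurred)
area = auc(lnr, recurred)
print(cutoff, round(area, 3))
```

The same procedure applies unchanged to the number of metastatic LNs, since it only needs a numeric score and a binary outcome per patient.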
We analyzed the association between clinico-pathologic factors and recurrence through univariate analysis. Sex, tumor size, ETE, pT classification, pN classification, number of metastatic LNs, and LNR were significantly correlated with disease recurrence. Clinico-pathologic factors related to recurrence were also analyzed with logistic regression. Among the factors included in the analysis, only sex and tumor size showed a significant correlation with disease recurrence (Table 2).
To build a machine learning prediction model, the algorithms were trained using parameters including age, sex, tumor size, tumor multiplicity, ETE, ENE, pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR. Since disease recurred in only 41 of 1,040 cases, the SMOTE technique was applied to correct the class imbalance in the training data. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy of 95%, and the LightGBM and stacking models both showed 93% accuracy. Table 3 summarizes the performance comparison of the five models. The tree structure of the Decision Tree model and its feature importance were visualized using graphic software (Fig. 2). Feature importance was further explored across models to determine the major factors that influence the prediction of recurrence in PTC patients. Although the feature importance results differed slightly between machine learning models, LNR and contralateral LN metastasis were important features in all models (Table 4).
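To illustrate the oversampling step, the sketch below shows the core idea of SMOTE: each synthetic minority case is placed at a random point on the line segment between a real minority case and one of its nearest minority neighbours. This is a simplified, standard-library illustration and not the implementation used in the study; the function name `smote`, its parameters, and the toy feature vectors are all hypothetical.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (simplified illustration of the SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of base among the other minority samples
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: sq_dist(base, m))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to nb
        synthetic.append([b + gap * (n - b) for b, n in zip(base, nb)])
    return synthetic


# Hypothetical feature vectors [tumor size in mm, LNR]; NOT study data.
recurrent = [[22.0, 0.50], [35.0, 0.80], [18.0, 0.40], [40.0, 1.00]]
new_cases = smote(recurrent, n_new=6, k=2)
print(len(new_cases))  # 6
```

With 41 recurrent and 999 non-recurrent cases, fully balancing the classes would require 958 synthetic recurrent cases; the text does not state the target ratio that was used.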