Machine Learning-Based Prediction Model for Papillary Thyroid Carcinoma Recurrence

Background: This study analyzed the prognostic significance of nodal factors, including the number of metastatic lymph nodes (LNs) and the lymph node ratio (LNR), in patients with papillary thyroid carcinoma (PTC), and attempted to construct a disease recurrence prediction model using machine learning techniques. Methods: We retrospectively analyzed clinico-pathologic data from 1,040 patients diagnosed with PTC between 2003 and 2009. Results: We analyzed clinico-pathologic factors related to recurrence through logistic regression analysis. Among the factors included, only sex and tumor size were significantly correlated with disease recurrence. Age, sex, tumor size, tumor multiplicity, extrathyroidal extension (ETE), extranodal spread (ENE), pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR were used as inputs to construct the machine learning prediction models. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy at 95%, and the LightGBM and stacking models both showed 93% accuracy. Conclusions: We confirmed that all machine learning prediction models showed an accuracy of 90% or more for predicting disease recurrence in PTC. Large-scale multicenter clinical studies should be performed to improve the performance of our prediction models and verify their clinical effectiveness.


Introduction
In the past 20 years, the incidence of thyroid cancer has increased rapidly, and most cases are papillary thyroid carcinoma (PTC). (1) PTC has an excellent prognosis and a better survival rate than other carcinomas, but the disease recurs in about 5-21% of PTC patients. (2,3) Recurrent disease usually requires surgical treatment, and re-operation carries a higher risk of complications and morbidity than the first surgery. Therefore, preventing recurrence in PTC patients can reduce the morbidity of reoperation and prevent quality of life from deteriorating. According to previous reports, tumor size, extrathyroidal extension (ETE), age, lymph node (LN) metastasis, tumor multiplicity, and extranodal spread (ENE) are known risk factors for disease recurrence. (4)(5)(6) In particular, LN metastasis occurs in 20-90% of PTC patients and is a significant risk factor for recurrence. (7)(8)(9)(10) The number of metastatic LNs and the lymph node ratio (LNR), which represent the metastatic LN burden, are also important prognostic factors associated with recurrence of PTC. (11)(12)(13) Since various clinicopathological factors, along with nodal factors such as the number of metastatic LNs and LNR, are related to the recurrence of PTC, these factors should be considered in an integrated manner to establish a disease recurrence prediction model.
The 8th TNM staging system was revised by the American Joint Committee on Cancer (AJCC) to more accurately predict the disease-specific survival of PTC patients. However, it does not reflect the biological behavior of PTC and has limitations in predicting the risk of recurrence. (14)(15)(16) In particular, the number and size of metastatic LNs are known to be important prognostic factors for recurrence of PTC, but are not reflected in the revised TNM staging system. The N classification of the revised TNM staging system is divided into only three groups and does not consider other nodal factors. (17)(18)(19)(20)(21)(22) Therefore, a more accurate recurrence prediction model should be established for PTC patients.
Machine learning technology is widely used in the medical field due to the development of image recognition techniques, especially in the fields of radiology, ophthalmology, and dermatology. (23)(24)(25)(26)(27)(28) However, studies on the construction of machine learning models that predict disease recurrence in thyroid cancer are extremely rare. If a robust model for predicting recurrence in PTC patients is established, high-risk patients can be identified so that they can undergo customized treatment according to their risk stratification, and active follow-up can be suggested for them. This study analyzed the prognostic significance of nodal factors, including the number of metastatic LNs and LNR, in patients with PTC, and attempted to construct a disease recurrence prediction model based on various clinico-pathological factors using machine learning techniques.

Materials And Methods
This study was approved by the Institutional Review Board (IRB) of Pusan University. Informed consent was not obtained from any participants because the IRB waived the need for individual informed consent.
This retrospective research was performed in accordance with the Declaration of Helsinki. Medical data of patients diagnosed and treated for PTC at Pusan National University Hospital from June 2003 to December 2009 were analyzed retrospectively. We included patients who were diagnosed with papillary thyroid cancer and underwent total thyroidectomy and central neck dissection with/without lateral neck dissection. We excluded (1) cases with a distant metastasis at the time of diagnosis, (2) patients who received previous surgery or radiotherapy to the head and neck area, and (3) cases with insufficient clinical data that were lost to follow-up after surgery. Finally, 1,040 patients were included in the study, including 147 males and 893 females. Their ages ranged from 13 to 79 years and the mean age was 48.5 years. Tumor stage was classified based on the 8th AJCC staging system.
To detect disease recurrence, all patients underwent physical examination, ultrasound, and thyroglobulin measurement every 6-12 months after surgery. If necessary, additional imaging studies such as computed tomography, whole body iodine scan, and positron emission tomography were performed.
Recurrence was defined as a new lesion, not previously observed, that was detected in the imaging studies and pathologically confirmed through fine needle aspiration cytology.
Tumor size, ETE, multiplicity, ENE, and TNM stage were analyzed. The surgical specimens from central neck dissection were divided into ipsilateral and contralateral areas according to the location of the tumor, and the number of metastatic LNs and the total number of removed LNs were analyzed. LNR was calculated by dividing the number of metastatic LNs by the total number of harvested LNs. The cut-off value of LNR was chosen to optimize the sensitivity and specificity for predicting disease recurrence using a receiver operating characteristic (ROC) curve.
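The LNR calculation and the ROC-based cut-off selection described above can be illustrated with a minimal sketch. The text does not state which optimality criterion was used, so the sketch assumes Youden's J statistic (sensitivity + specificity - 1), a common choice. The function names, the handling of zero harvested LNs, and the toy data are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def lymph_node_ratio(metastatic: int, harvested: int) -> float:
    """LNR = number of metastatic LNs / total number of harvested LNs."""
    if harvested == 0:
        # Assumption: the source does not say how zero harvested LNs were handled.
        return 0.0
    return metastatic / harvested


def youden_cutoff(scores: List[float], recurred: List[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    A patient is predicted to recur when score >= cutoff.
    """
    positives = sum(recurred)
    negatives = len(recurred) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best = (0.0, 0.0, 0.0)
    best_j = -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, r in zip(scores, recurred) if r and s >= cutoff)
        tn = sum(1 for s, r in zip(scores, recurred) if not r and s < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j = j
            best = (cutoff, sensitivity, specificity)
    return best


# Hypothetical toy cohort: (metastatic LNs, harvested LNs) and recurrence status.
toy_counts = [(0, 8), (1, 10), (2, 8), (3, 6), (0, 5), (4, 8)]
toy_recurred = [False, False, True, True, False, True]
toy_lnr = [lymph_node_ratio(m, h) for m, h in toy_counts]
print(youden_cutoff(toy_lnr, toy_recurred))
```

The same routine could be applied to the raw number of metastatic LNs instead of the LNR, since it only needs a numeric score per patient.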
Machine learning was performed using supervised learning. Five models were used: a Decision Tree model and four Ensemble models (Random Forest, XGBoost, LightGBM, and Stacking). All five models were trained, and accuracy was used to compare their performance. Scikit-learn version 12.3 was used for model building and learning. Of the data set, 80% was assigned to the training set and used for learning, and the remaining 20% was used as a test set. To account for selection bias, five-fold cross-validation was applied.
Patients' clinical information, pathologic information, recurrence, and cause of recurrence were collected and analyzed. The chi-square test or the independent two-sample t-test was used to evaluate differences in variables between two independent groups. The multivariate Cox proportional hazards regression model was used to evaluate the effect of several variables on disease recurrence. A p-value<0.05 was considered to indicate statistical significance. Statistical analyses were performed using Python version 3.8 and SPSS 25.0 for Windows (SPSS, Chicago, IL).
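The data partitioning described above can be sketched in plain Python. The text does not say whether the five folds were drawn from the 80% training portion or from the whole data set, nor whether the split was stratified by recurrence status; this sketch assumes unstratified folds drawn from the training portion only. The function names and the seed are hypothetical.

```python
import random
from typing import List, Tuple


def train_test_split_indices(n: int, test_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle patient indices and hold out a fraction as the test set."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices: List[int], k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices into k folds; each fold serves once as the validation set."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, validation))
    return splits


# 1,040 patients, 80% training / 20% test, five-fold cross-validation.
train_idx, test_idx = train_test_split_indices(1040, 0.2, seed=42)
cv_splits = k_fold(train_idx, 5)
print(len(train_idx), len(test_idx), [len(v) for _, v in cv_splits])
```

With only 41 recurrence events in the cohort, a stratified split would keep the recurrence rate similar across folds; that variation is not shown here.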

Results
A total of 1,040 patients were included in this study, and all patients underwent total thyroidectomy and central neck dissection. In total, 180 patients (17.3%) underwent lateral neck dissection due to lateral LN metastasis. The average tumor size was 12.4 mm (range, 2-125 mm). ETE was observed in 586 (56.3%) patients, and tumor multiplicity was observed in 178 (17.1%) patients. ENE was observed in 159 patients (15.3%). With respect to T classification, 508 patients were T1, 46 patients were T2, 483 patients were T3, and three patients were T4. The average number of metastatic LNs in the central compartment was 1.73 (range, 0-19) and the average number of removed LNs was 8.32 (range, 0-36). LNR was obtained by dividing the number of metastatic LNs in the central compartment by the total number of LNs removed. The average LNR value was 0.20 (range, 0-1). The mean follow-up period was 79.0 months (range, 46-149), and the total number of recurrence events during the study period was 41. Other clinico-pathological information is summarized in Table 1.
The cut-off value for the LNR was set to show the optimal sensitivity and specificity for recurrence prediction. LNR showed a statistically significant correlation with recurrence (p<0.001), with an AUC of 0.752, and 0.24 was set as the optimal cut-off value. There were 519 patients (49.9%) with LNR=0, 179 (17.2%) with 0<LNR<0.24, and 342 (32.9%) with an LNR of 0.24 or more. Recurrence-free survival was significantly decreased in the group with an LNR of 0.24 or more compared with the other two groups (Fig. 1A). The cut-off value for the number of metastatic LNs was likewise set to show optimal sensitivity and specificity for recurrence prediction. The number of metastatic LNs was significantly correlated with recurrence (p<0.001), with an AUC of 0.742, and a value of 2 was set as the cut-off. There were 519 patients (49.9%) with no metastatic LNs, 161 (15.5%) with one metastatic LN, and 360 (34.6%) with two or more metastatic LNs. Recurrence-free survival was compared among these three groups and was significantly decreased in patients with two or more metastatic LNs (Fig. 1B).
We analyzed the association between clinico-pathologic factors and recurrence through univariate analysis. Sex, tumor size, ETE, pT classification, pN classification, number of metastatic LNs, and LNR were significantly correlated with disease recurrence. Clinico-pathologic factors related to recurrence were also analyzed with logistic regression. Among the factors included in the analysis, only sex and tumor size showed a significant correlation with disease recurrence (Table 2).
To build the machine learning prediction models, the algorithms were trained using age, sex, tumor size, tumor multiplicity, ETE, ENE, pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR. Since disease recurred in only 41 of 1,040 cases, the SMOTE technique was applied to correct the class imbalance in the training data. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy of 95%, and the LightGBM and stacking models both showed 93% accuracy. Table 3 summarizes the performance comparison of the five models. The tree structure of the Decision Tree model was visualized using graphic software (Fig. 2). In addition, feature importance was visualized and analyzed to determine the major factors that influence the prediction of recurrence in PTC patients. Although the feature importance results differed slightly between machine learning models, LNR and contralateral LN metastasis were important features in all models (Table 4).
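The idea behind SMOTE (synthetic minority over-sampling technique) can be illustrated with a minimal sketch: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. This is a simplified illustration for numeric features only, not the implementation used in the study; the function names, parameters, and toy points are hypothetical.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority: List[Point], n_synthetic: int, k: int = 5, seed: int = 0) -> List[List[float]]:
    """Generate n_synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("at least two minority samples are required")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples, e.g. (scaled tumor size, LNR).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_synthetic = smote(toy_minority, n_synthetic=10, k=2, seed=1)
print(len(toy_synthetic))
```

The text does not state whether oversampling was performed before or after the train/test split. A common practice is to oversample only the training portion, so that the test set keeps the original class ratio. Binary inputs such as sex or ETE would need a categorical-aware variant rather than plain interpolation.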

Discussion
The revised 8th TNM staging system is suitable for assessing the risk of death in patients with PTC, but not for predicting the risk of recurrence. Age, aggressive histology, tumor size, and LN metastasis are known risk factors associated with PTC recurrence. (29) The 2015 American Thyroid Association (ATA) guidelines suggested the number and size of metastatic LNs and ENE as risk factors for recurrence. (30,31) In particular, the LNR, calculated by dividing the number of metastatic LNs by the total number of removed LNs, has been reported in previous studies as a risk factor for recurrence of PTC. (13,32,33) Lee et al. reported that the performance of recurrence prediction increased when LNR was incorporated into the existing 2015 ATA risk stratification. (34) In our study, the sensitivity and specificity for predicting disease recurrence were optimized when the cut-off LNR value was set to 0.24. The number of metastatic LNs also showed a statistically significant correlation with disease recurrence when 2 or more was set as the cut-off value. However, unlike in previous studies, these nodal factors showed no significant correlation with disease recurrence of PTC in multivariate analysis.
In univariate analysis of risk factors for PTC recurrence, sex, tumor size, ETE, pT, pN, number of metastatic LNs, and LNR were significantly correlated with recurrence. In multivariate analysis, only tumor size showed a significant correlation with disease recurrence. Logistic regression analyzes prognostic factors based on a linear combination of variables, so the analysis is limited when the variables are highly correlated with one another. In contrast, since machine learning models do not assume a linear combination of the variables used, the effect of correlation between variables can be diminished. In the feature importance analysis of the parameters used for model construction, contralateral central LN metastasis and LNR were used at high frequency in all machine learning models, and other clinical factors such as tumor size and age also showed significant importance in constructing the machine learning predictive models.
As various clinical and pathologic factors are related to PTC recurrence, a technique that can analyze these factors in an integrated manner must be used to establish a robust prediction model. Among the machine learning techniques used in this study, the Decision Tree model showed the highest accuracy, followed by Ensemble models such as LightGBM and stacking. The other two machine learning techniques also showed 90% or more accuracy. Because the models were trained on data from about 1,000 patients, more patient data are required to increase the performance of our models and apply them in clinical practice, and a multi-institutional clinical study should be performed to verify their clinical effectiveness. This study is of value as the first study on a machine learning model for predicting PTC disease recurrence based on clinico-pathologic factors, and we confirmed that machine learning models showed acceptable performance with an accuracy of 90% or more. However, this study has the following limitations. Since this is a retrospective study conducted at a single institution, the influence of selection bias cannot be excluded. In addition, considering the indolent features of PTC, the relatively short follow-up period may be suboptimal for detecting recurrence in PTC patients.
Various machine learning models were used to construct a model for predicting disease recurrence in PTC patients, and all the models had a confirmed accuracy of 90% or more. In the future, large-scale clinical studies on many patients should be performed to improve the performance of our prediction models, and multicenter clinical studies will be needed to verify their clinical effectiveness.