Machine Learning-Based Prediction Model for Papillary Thyroid Carcinoma Recurrence

DOI: https://doi.org/10.21203/rs.3.rs-113105/v1

Abstract

Background: This study analyzed the prognostic significance of nodal factors, including the number of metastatic lymph nodes (LNs) and the lymph node ratio (LNR), in patients with papillary thyroid carcinoma (PTC), and attempted to construct a disease recurrence prediction model using machine learning techniques.

Methods: We retrospectively analyzed clinico-pathologic data from 1040 patients diagnosed with papillary thyroid cancer between 2003 and 2009.

Results: We analyzed clinico-pathologic factors related to recurrence through logistic regression analysis. Among the factors included, only sex and tumor size were significantly correlated with disease recurrence. Age, sex, tumor size, tumor multiplicity, extrathyroidal extension (ETE), extranodal extension (ENE), pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR were used as inputs to construct the machine learning prediction models. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy at 95%, and the LightGBM and Stacking models each showed 93% accuracy.

Conclusions: We confirmed that all machine learning prediction models showed an accuracy of 90% or more for predicting disease recurrence in PTC. Large-scale multicenter clinical studies should be performed to improve the performance of our prediction models and verify their clinical effectiveness.

Introduction

In the past 20 years, the incidence of thyroid cancer has increased rapidly, and most of these cases are papillary thyroid carcinoma (PTC). (1) PTC has an excellent prognosis and a better survival rate than other carcinomas, but the disease recurs in about 5-21% of PTC patients. (2,3) Recurrent disease usually requires surgical treatment, and re-operation carries a higher risk of complications and morbidity than the first surgery. Therefore, preventing recurrence in PTC patients can reduce the morbidity of re-operation and prevent deterioration in quality of life. According to previous reports, tumor size, extrathyroidal extension (ETE), age, lymph node (LN) metastasis, tumor multiplicity, and extranodal extension (ENE) are known risk factors for disease recurrence. (4-6) In particular, LN metastasis occurs in 20-90% of PTC patients and is a significant risk factor for recurrence. (7-10) The number of metastatic LNs and the lymph node ratio (LNR), which represent the metastatic LN burden, are also important prognostic factors associated with recurrence of PTC. (11-13) Since various clinico-pathological factors, along with nodal factors such as the number of metastatic LNs and LNR, are related to the recurrence of PTC, these factors should be considered in an integrated manner to establish a disease recurrence prediction model.

The 8th TNM staging system was revised by the American Joint Committee on Cancer (AJCC) to more accurately predict the disease-specific survival of PTC patients. However, it does not reflect the biological behavior of PTC and has limitations in predicting the risk of recurrence. (14-16) In particular, the number and size of metastatic LNs are known to be important prognostic factors for recurrence of PTC, but are not reflected in the revised TNM staging system. The N classification of the revised TNM staging system is divided into only three groups and does not consider other nodal factors. (17-22) Therefore, a more accurate recurrence prediction model should be established for PTC patients.

Machine learning technology is widely used in the medical field due to the development of image recognition techniques, especially in radiology, ophthalmology, and dermatology. (23-28) However, studies on the construction of machine learning models that predict disease recurrence in thyroid cancer are extremely rare. If a robust model for predicting recurrence in PTC patients is established, high-risk patients can be identified and offered customized treatment according to risk stratification, and active follow-up can be suggested for them. This study analyzed the prognostic significance of nodal factors, including the number of metastatic LNs and LNR, in patients with PTC, and attempted to construct a disease recurrence prediction model based on various clinico-pathological factors using machine learning techniques.

Materials And Methods

This study was approved by the Institutional Review Board (IRB) of Pusan University. Informed consent was not obtained from participants because the IRB waived the need for individual informed consent. This retrospective research was performed in accordance with the Declaration of Helsinki. Medical data of patients diagnosed and treated for PTC at Pusan National University Hospital from June 2003 to December 2009 were analyzed retrospectively. We included patients who were diagnosed with PTC and underwent total thyroidectomy and central neck dissection with or without lateral neck dissection. We excluded (1) cases with distant metastasis at the time of diagnosis, (2) patients who had previously received surgery or radiotherapy to the head and neck area, and (3) cases with insufficient clinical data that were lost to follow-up after surgery. In total, 1,040 patients were included in the study (147 males and 893 females). Their ages ranged from 13 to 79 years, with a mean age of 48.5 years. Tumor stage was classified based on the 8th AJCC staging system.

To detect disease recurrence, all patients underwent physical examination, ultrasound, and thyroglobulin measurement every 6-12 months after surgery. If necessary, additional imaging studies such as computed tomography, whole body iodine scan, and positron emission tomography were performed. Recurrence was defined as a new lesion, not previously observed, that was detected on imaging studies and pathologically confirmed by fine needle aspiration cytology.

Tumor size, ETE, multiplicity, ENE, and TNM stage were analyzed. The surgical specimens from central neck dissection were divided into ipsilateral and contralateral areas according to the location of the tumor, and the number of metastatic LNs and the total number of removed LNs were analyzed. LNR was calculated by dividing the number of metastatic LNs by the total number of harvested LNs. The cut-off value of LNR was determined using a receiver operating characteristic (ROC) curve, taking into account the sensitivity and specificity for predicting disease recurrence.
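The LNR definition and the ROC-based cut-off selection can be sketched in a few lines of dependency-free Python. The text does not state which criterion was used to balance sensitivity and specificity, so maximizing Youden's J (sensitivity + specificity - 1) is assumed here. Setting LNR to 0 when no LN was harvested, the function names, and the toy cohort are likewise assumptions made only for illustration.

```python
def lymph_node_ratio(n_metastatic, n_harvested):
    """LNR = metastatic LNs / harvested LNs (assumed 0.0 when no LN was harvested)."""
    return n_metastatic / n_harvested if n_harvested > 0 else 0.0


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    A case is called positive when its score is >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


# Hypothetical toy cohort: (metastatic LNs, harvested LNs, recurrence 0/1).
cohort = [(0, 8, 0), (1, 10, 0), (0, 5, 0), (2, 10, 0), (3, 9, 1),
          (1, 12, 0), (4, 8, 1), (0, 0, 0), (5, 10, 1), (2, 4, 0)]
lnr = [lymph_node_ratio(m, h) for m, h, _ in cohort]
recurred = [r for _, _, r in cohort]
cutoff, sens, spec = youden_cutoff(lnr, recurred)
print(f"cut-off={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The same scan could be applied to the number of metastatic LNs instead of LNR; only the score list changes.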

Machine learning was performed using supervised learning. Five models were used: the Decision Tree model and four Ensemble models (Random Forest, XGBoost, LightGBM, and Stacking). Accuracy was used to compare performance between models. Scikit-learn version 12.3 was used for model building and learning. Eighty percent of the data set was used as the training set, and the remaining 20% was used as the test set. To reduce the influence of selection bias, five-fold cross-validation was applied.
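As a rough illustration of the 80/20 split and five-fold cross-validation, the following dependency-free sketch partitions patient indices. Two points are assumptions, since the text does not specify them: the split is stratified by recurrence status, and the five folds are drawn from the training portion only. The function names are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test while keeping the class ratio similar in both."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def k_folds(indices, k=5, seed=0):
    """Yield (fit_idx, validation_idx) pairs; each index is validated exactly once."""
    shuffled = indices[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]


# Labels with the cohort's class balance: 41 recurrences among 1,040 patients.
labels = [1] * 41 + [0] * 999
train_idx, test_idx = stratified_split(labels)
folds = list(k_folds(train_idx))
print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

In practice a library routine would normally replace both helpers; the sketch only makes the partitioning explicit.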

Patients' clinical information, pathologic information, recurrence, and cause of recurrence were collected and analyzed. The chi-square test or independent two-sample t-test was used to evaluate differences in variables between two independent groups. The multivariate Cox proportional hazards regression model was used to evaluate the effect of several variables on disease recurrence. A p-value <0.05 was considered to indicate statistical significance. Statistical analyses were performed using Python version 3.8 and SPSS 25.0 for Windows (SPSS, Chicago, IL).

Results

A total of 1,040 patients were included in this study, and all patients underwent total thyroidectomy and central neck dissection. In total, 180 patients (17.3%) underwent lateral neck dissection due to lateral LN metastasis. The average tumor size was 12.4 mm (range, 2-125 mm). ETE was observed in 586 patients (56.3%), tumor multiplicity in 178 patients (17.1%), and ENE in 159 patients (15.3%). With respect to T classification, 508 patients were T1, 46 were T2, 483 were T3, and three were T4. The average number of metastatic LNs in the central compartment was 1.73 (range, 0-19), and the average number of removed LNs was 8.32 (range, 0-36). LNR was obtained by dividing the number of metastatic LNs in the central compartment by the total number of LNs removed. The average LNR was 0.20 (range, 0-1). The mean follow-up period was 79.0 months (range, 46-149), and 41 recurrence events occurred during the study period. Other clinico-pathological information is summarized in Table 1.

The cut-off value for the LNR was set to show the optimal sensitivity and specificity for recurrence prediction. LNR was significantly correlated with recurrence (p<0.001), with an AUC of 0.752, and 0.24 was set as the optimal cut-off value. There were 519 patients (49.9%) with LNR=0, 179 (17.2%) with 0<LNR<0.24, and 342 (32.9%) with an LNR of 0.24 or more. Recurrence-free survival was significantly decreased in the group with an LNR of 0.24 or more compared with the other two groups (Fig. 1A). The cut-off value for the number of metastatic LNs was likewise set to show optimal sensitivity and specificity for recurrence prediction. The number of metastatic LNs was significantly correlated with recurrence (p<0.001), with an AUC of 0.742, and 2 was set as the cut-off value. There were 519 patients (49.9%) with no metastatic LNs, 161 (15.5%) with one metastatic LN, and 360 (34.6%) with two or more metastatic LNs. Recurrence-free survival was analyzed across these three groups and was significantly decreased in patients with two or more metastatic LNs (Fig. 1B).

We analyzed the association between clinico-pathologic factors and recurrence through univariate analysis. Sex, tumor size, ETE, pT classification, pN classification, number of metastatic LNs, and LNR were significantly correlated with disease recurrence. Clinico-pathologic factors related to recurrence were also analyzed with logistic regression. Among the factors included in the analysis, only sex and tumor size showed a significant correlation with disease recurrence (Table 2).

To build the machine learning prediction models, the algorithms were trained using age, sex, tumor size, tumor multiplicity, ETE, ENE, pT, pN, ipsilateral central LN metastasis, contralateral central LN metastasis, number of metastatic LNs, and LNR. Since disease recurred in only 41 of 1,040 cases, the synthetic minority over-sampling technique (SMOTE) was applied to adjust for the imbalance in the training data. The performance of five machine learning models for recurrence prediction was compared based on accuracy. The Decision Tree model showed the best accuracy at 95%, and the LightGBM and Stacking models each showed 93% accuracy. Table 3 summarizes the performance comparison of the five models. The tree structure of the Decision Tree model was visualized using graphic software, and feature importance was also visualized and analyzed (Fig. 2) to determine the major factors that influence the prediction of recurrence in PTC patients. Although the feature importance results differed slightly between machine learning models, LNR and contralateral central LN metastasis were important features in all models (Table 4).
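The idea behind SMOTE can be illustrated with a minimal, dependency-free sketch: each synthetic recurrence case is placed at a random point on the line segment between a real minority-class sample and one of its nearest minority-class neighbors. The feature pair, the sample values, the neighbor count, and the function name below are hypothetical; the text does not report the SMOTE settings that were actually used, and in practice features would usually be scaled before distances are computed.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating toward near neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class rows: (tumor size in mm, LNR).
recurrence_rows = [(12.0, 0.30), (25.0, 0.55), (18.0, 0.40), (30.0, 0.80), (9.0, 0.25)]
new_rows = smote(recurrence_rows, n_new=10, k=3)
print(len(new_rows), new_rows[0])
```

Because every synthetic row is an interpolation, it always lies within the range spanned by the observed recurrence cases.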

Discussion

The revised 8th TNM staging system is suitable for assessing the risk of death in patients with PTC, but not for predicting the risk of recurrence. Age, aggressive histology, tumor size, and LN metastasis are known risk factors associated with PTC recurrence. (29) The 2015 American Thyroid Association (ATA) guidelines suggested the number and size of metastatic LNs and ENE as risk factors for recurrence. (30,31) In particular, the LNR, calculated by dividing the number of metastatic LNs by the total number of removed LNs, has been reported in previous studies as a risk factor for recurrence of PTC. (13,32,33) Lee et al. reported that the performance of recurrence prediction increased when LNR was incorporated into the existing 2015 ATA risk stratification. (34) In our study, the sensitivity and specificity for predicting disease recurrence were optimized when the cut-off LNR value was set to 0.24. The number of metastatic LNs also showed a statistically significant correlation with disease recurrence when the cut-off was set at two or more. However, unlike in previous studies, neither LNR nor the number of metastatic LNs was significantly correlated with PTC recurrence in multivariate analysis.

In univariate analysis of risk factors for PTC recurrence, sex, tumor size, ETE, pT, pN, number of metastatic LNs, and LNR were significantly correlated with recurrence. In multivariate analysis, only sex and tumor size showed a significant correlation with disease recurrence (Table 2). Logistic regression analyzes prognostic factors based on a linear combination of variables. Therefore, if the variables are highly correlated with one another, the analysis is limited. In contrast, since machine learning models do not assume a linear combination of the variables used, the effect of correlation between variables can be diminished. In the feature importance analysis, contralateral central LN (CLN) metastasis and LNR ranked among the top features in all machine learning models, and other clinical factors such as tumor size and age also showed high importance (Table 4).

As various clinical and pathologic factors are related to PTC recurrence, a technique that can analyze these factors in an integrated manner must be used to establish a robust prediction model. Among the machine learning techniques used in this study, the Decision Tree model showed the highest accuracy, followed by Ensemble models such as LightGBM and Stacking. The other two models also showed an accuracy of 90% or more. Since the models were trained on data from 1,040 patients, more patient data are required to increase their performance and apply them in clinical practice, and a multi-institutional clinical study should be performed to verify their clinical effectiveness. This study is of value as the first study on a machine learning model for predicting PTC recurrence based on clinico-pathologic factors, and we confirmed that the machine learning models showed acceptable performance with an accuracy of 90% or more. However, this study has the following limitations. Since this is a retrospective study conducted at a single institution, the influence of selection bias cannot be excluded. In addition, considering the indolent features of PTC, the relatively short follow-up period may be suboptimal for detecting recurrence in PTC patients.

Various machine learning models were used to construct a model for predicting disease recurrence in PTC patients, and all the models had a confirmed accuracy of 90% or more. In the future, large-scale clinical studies on many patients should be performed to improve the performance of our prediction models, and multicenter clinical studies will be needed to verify their clinical effectiveness.

Declarations

Acknowledgments

This study was supported by a Research Grant from Gangnam Severance Hospital, Yonsei University College of Medicine.

References

  1. Altekruse S, Das A, Cho H, Petkov V, Yu M. Do US thyroid cancer incidence rates increase with socioeconomic status among people with health insurance? An observational study using SEER population-based data. BMJ Open. 2015;5(12):e009843.
  2. Grant CS. Recurrence of papillary thyroid cancer after optimized surgery. Gland Surg. 2015;4(1):52-62.
  3. Liu FH, Kuo SF, Hsueh C, Chao TC, Lin JD. Postoperative recurrence of papillary thyroid carcinoma with lymph node metastasis. J Surg Oncol. 2015;112(2):149-54.
  4. Lan X, Sun W, Zhang H, Dong W, Wang Z, Zhang T. A Meta-analysis of Central Lymph Node Metastasis for Predicting Lateral Involvement in Papillary Thyroid Carcinoma. Otolaryngol Head Neck Surg. 2015;153(5):731-8.
  5. Yan H, Zhou X, Jin H, Li X, Zheng M, Ming X, Wang R, Liu J. A Study on Central Lymph Node Metastasis in 543 cN0 Papillary Thyroid Carcinoma Patients. Int J Endocrinol. 2016;2016:1878194.
  6. Chéreau N, Buffet C, Trésallet C, Tissier F, Leenhardt L, Menegaux F. Recurrence of papillary thyroid carcinoma with lateral cervical node metastases: Predictive factors and operative management. Surgery. 2016;159(3):755-62.
  7. Park CH, Song CM, Ji YB, Pyo JY, Yi KJ, Song YS, Park YW, Tae K. Significance of the Extracapsular Spread of Metastatic Lymph Nodes in Papillary Thyroid Carcinoma. Clin Exp Otorhinolaryngol. 2015;8(3):289-94.
  8. Ji YB, Song CM, Sung ES, Jeong JH, Lee CB, Tae K. Postoperative Hypoparathyroidism and the Viability of the Parathyroid Glands During Thyroidectomy. Clin Exp Otorhinolaryngol. 2017;10(3):265-271.
  9. Podnos YD, Smith D, Wagman LD, Ellenhorn JD. The implication of lymph node metastasis on survival in patients with well-differentiated thyroid cancer. Am Surg. 2005;71(9):731-4.
  10. Zaydfudim V, Feurer ID, Griffin MR, Phay JE. The impact of lymph node involvement on survival in patients with papillary and follicular thyroid carcinoma. Surgery. 2008;144(6):1070-71.
  11. Sugitani I, Kasai N, Fujimoto Y, Yanagisawa A. A novel classification system for patients with PTC: addition of the new variables of large (3 cm or greater) nodal metastases and reclassification during the follow-up period. Surgery. 2004;135(2):139-48.
  12. Park YM, Wang SG, Lee JC, Shin DH, Kim IJ, Son SM, Mun M, Lee BJ. Metastatic lymph node status in the central compartment of papillary thyroid carcinoma: A prognostic factor of locoregional recurrence. Head Neck. 2016;38 Suppl 1:E1172-6.
  13. Vas Nunes JH, Clark JR, Gao K, Chua E, Campbell P, Niles N, Gargya A, Elliott MS. Prognostic implications of lymph node yield and lymph node ratio in papillary thyroid carcinoma. Thyroid. 2013;23(7):811-6.
  14. Momesso DP, Tuttle RM. Update on differentiated thyroid cancer staging. Endocrinol Metab Clin North Am. 2014;43(2):401-21.
  15. Haugen BR. 2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: What is new and what has changed? Cancer. 2017;123(3):372-381.
  16. Kim TH, Kim YN, Kim HI, Park SY, Choe JH, Kim JH, Kim JS, Oh YL, Hahn SY, Shin JH, Kim K, Jeong JG, Kim SW, Chung JH. Prognostic value of the eighth edition AJCC TNM classification for differentiated thyroid carcinoma. Oral Oncol. 2017;71:81-86.
  17. Baek SK, Jung KY, Kang SM, Kwon SY, Woo JS, Cho SH, Chung EJ. Clinical risk factors associated with cervical lymph node recurrence in papillary thyroid carcinoma. Thyroid. 2010;20(2):147-52.
  18. Lang BH, Chow SM, Lo CY, Law SC, Lam KY. Staging systems for papillary thyroid carcinoma: a study of 2 tertiary referral centers. Ann Surg. 2007;246(1):114-21.
  19. Lupi C, Giannini R, Ugolini C, Proietti A, Berti P, Minuto M, Materazzi G, Elisei R, Santoro M, Miccoli P, Basolo F. Association of BRAF V600E mutation with poor clinicopathological outcomes in 500 consecutive cases of papillary thyroid carcinoma. J Clin Endocrinol Metab. 2007;92(11):4085-90.
  20. Xing M, Alzahrani AS, Carson KA, Viola D, Elisei R, Bendlova B, Yip L, Mian C, Vianello F, Tuttle RM, Robenshtok E, Fagin JA, Puxeddu E, Fugazzola L, Czarniecka A, Jarzab B, O'Neill CJ, Sywak MS, Lam AK, Riesco-Eizaguirre G, Santisteban P, Nakayama H, Tufano RP, Pai SI, Zeiger MA, Westra WH, Clark DP, Clifton-Bligh R, Sidransky D, Ladenson PW, Sykorova V. Association between BRAF V600E mutation and mortality in patients with papillary thyroid cancer. JAMA. 2013;309(14):1493-501.
  21. Randolph GW, Duh QY, Heller KS, LiVolsi VA, Mandel SJ, Steward DL, Tufano RP, Tuttle RM; American Thyroid Association Surgical Affairs Committee’s Taskforce on Thyroid Cancer Nodal Surgery. The prognostic significance of nodal metastases from papillary thyroid carcinoma can be stratified based on the size and number of metastatic lymph nodes, as well as the presence of extranodal extension. Thyroid. 2012;22(11):1144-52.
  22. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, Pacini F, Randolph GW, Sawka AM, Schlumberger M, Schuff KG, Sherman SI, Sosa JA, Steward DL, Tuttle RM, Wartofsky L. 2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer. Thyroid. 2016;26(1):1-133.
  23. Mazo C, Bernal J, Trujillo M, Alegre E. Transfer learning for classification of cardiovascular tissues in histological images. Comput Methods Programs Biomed. 2018;165:69-76.
  24. Karri SP, Chakraborty D, Chatterjee J. Transfer learning based classification of optical coherence tomography images with diabetic macular edema and dry age-related macular degeneration. Biomed Opt Express. 2017;8(2):579-592.
  25. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, Venugopalan S, Widner K, Madams T, Cuadros J, Kim R, Raman R, Nelson PC, Mega JL, Webster DR. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA. 2016;316(22):2402-2410.
  26. Kermany DS, Goldbaum M, Cai W, Valentim CCS, Liang H, Baxter SL, McKeown A, Yang G, Wu X, Yan F, Dong J, Prasadha MK, Pei J, Ting MYL, Zhu J, Li C, Hewett S, Dong J, Ziyar I, Shi A, Zhang R, Zheng L, Hou R, Shi W, Fu X, Duan Y, Huu VAN, Wen C, Zhang ED, Zhang CL, Li O, Wang X, Singer MA, Sun X, Xu J, Tafreshi A, Lewis MA, Xia H, Zhang K. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell. 2018;172(5):1122-1131.
  27. Hood DC, De Moraes CG. Efficacy of a Deep Learning System for Detecting Glaucomatous Optic Neuropathy Based on Color Fundus Photographs. Ophthalmology. 2018;125(8):1207-1208.
  28. Tschandl P, Codella N, Akay BN, Argenziano G, Braun RP, Cabo H, Gutman D, Halpern A, Helba B, Hofmann-Wellenhof R, Lallas A, Lapins J, Longo C, Malvehy J, Marchetti MA, Marghoob A, Menzies S, Oakley A, Paoli J, Puig S, Rinner C, Rosendahl C, Scope A, Sinz C, Soyer HP, Thomas L, Zalaudek I, Kittler H. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938-947.
  29. Ito Y, Higashiyama T, Takamura Y, Kobayashi K, Miya A, Miyauchi A. Prognosis of patients with papillary thyroid carcinoma showing postoperative recurrence to the central neck. World J Surg. 2011;35(4):767-72.
  30. Tuttle RM, Haugen B, Perrier ND. Updated American Joint Committee on Cancer/Tumor-Node-Metastasis Staging System for Differentiated and Anaplastic Thyroid Cancer (Eighth Edition): What Changed and Why? Thyroid. 2017;27(6):751-756.
  31. Nixon IJ, Kuk D, Wreesmann V, Morris L, Palmer FL, Ganly I, Patel SG, Singh B, Tuttle RM, Shaha AR, Gönen M, Shah JP. Defining a Valid Age Cutoff in Staging of Well-Differentiated Thyroid Cancer. Ann Surg Oncol. 2016;23(2):410-5.
  32. Schneider DF, Chen H, Sippel RS. Impact of lymph node ratio on survival in papillary thyroid cancer. Ann Surg Oncol. 2013;20(6):1906-11.
  33. Ryu IS, Song CI, Choi SH, Roh JL, Nam SY, Kim SY. Lymph node ratio of the central compartment is a significant predictor for locoregional recurrence after prophylactic central neck dissection in patients with thyroid papillary carcinoma. Ann Surg Oncol. 2014;21(1):277-83.
  34. Lee J, Lee SG, Kim K, Yim SH, Ryu H, Lee CR, Kang SW, Jeong JJ, Nam KH, Chung WY, Jo YS. Clinical Value of Lymph Node Ratio Integration with the 8th Edition of the UICC TNM Classification and 2015 ATA Risk Stratification Systems for Recurrence Prediction in Papillary Thyroid Cancer. Sci Rep. 2019;9(1):13361.

Tables

Table 1. Clinical information for all patients (n=1040) enrolled in the study.

| Variable | Category | No. of patients (%) |
|---|---|---|
| Mean age, y (range) | | 48.5 (13-79) |
| Sex | Male | 147 (14.1) |
| | Female | 893 (85.9) |
| Extrathyroidal extension | Yes | 586 (56.3) |
| | No | 454 (43.7) |
| Tumor multiplicity | Yes | 178 (17.1) |
| | No | 862 (82.9) |
| Extranodal extension | Yes | 159 (15.3) |
| | No | 881 (84.7) |
| pT classification | 1 | 508 (48.8) |
| | 2 | 46 (4.4) |
| | 3 | 483 (46.4) |
| | 4 | 3 (0.3) |
| pN classification | N0 | 506 (48.7) |
| | N1a | 354 (34.0) |
| | N1b | 180 (17.3) |


Table 2. Logistic regression analysis for disease recurrence adjusted for clinico-pathologic factors.

| Factor | Hazard ratio | 95% CI | p-value |
|---|---|---|---|
| Sex | 2.393 | 1.128-5.077 | 0.023 |
| Size | 1.030 | 1.007-1.052 | 0.010 |
| ETE | 3.105 | 0.697-13.834 | 0.137 |
| pT | 1.547 | 0.810-2.954 | 0.186 |
| pN | 0.790 | 0.170-3.683 | 0.765 |
| Number of metastatic CLNs | 1.619 | 0.582-4.502 | 0.356 |
| LNR | 1.337 | 0.480-3.725 | 0.578 |

CI, confidence interval; ETE, extrathyroidal extension; LNR, lymph node ratio.

Table 3. Results of five machine learning models.

| Model | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| Decision Tree | 0.95 | 0.66 | 0.18 | 0.28 |
| Random Forest | 0.91 | 0.25 | 0.27 | 0.26 |
| XGBoost | 0.92 | 0.25 | 0.18 | 0.21 |
| LightGBM | 0.93 | 0.28 | 0.18 | 0.22 |
| Stacking | 0.93 | 0.33 | 0.18 | 0.23 |
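For readers less familiar with the four columns of Table 3, the sketch below shows how accuracy, precision, recall, and F1 score follow from confusion-matrix counts for the recurrence class. The counts are hypothetical and are not taken from the study; the function name is likewise an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive (recurrence) class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test-set counts (not from the study): 11 recurrences among 208 patients.
metrics = classification_metrics(tp=2, fp=1, fn=9, tn=196)
print({name: round(value, 2) for name, value in metrics.items()})
```

In this hypothetical case the accuracy is about 0.95 while the recall is about 0.18, which shows how the columns can diverge when recurrences are rare.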


Table 4. Top five features of importance among machine learning models.

| Rank | Decision Tree | Random Forest | XGBoost | LightGBM |
|---|---|---|---|---|
| 1st | contralateral CLN metastasis | LNR | tumor size | age |
| 2nd | LNR | ipsilateral CLN metastasis | age | tumor size |
| 3rd | age | contralateral CLN metastasis | ipsilateral CLN metastasis | ipsilateral CLN metastasis |
| 4th | | tumor size | LNR | LNR |
| 5th | | age | contralateral CLN metastasis | contralateral CLN metastasis |