OSCC is characterized by a high degree of invasion into the surrounding tissues, as well as a high incidence of lymph node metastasis [19]. Therefore, it is important to determine the invasive ability of the tumor in each case, as this will inform the treatment strategy. In this study, we developed an automatic machine learning-based method for classifying OSCC cases according to the YK classification criteria. Overall, this system yielded relatively accurate results, as indicated by a high F value of 0.87. However, a further analysis of individual grades yielded a relatively low F value for Grade 2, suggesting that our method may not accurately distinguish these tumors. When we analyzed the survival rates according to the clinician-determined YK grade, the survival rate decreased as the grade increased. In contrast, the machine learning-determined YK Grade 2 cases had the second-worst survival rate after Grade 4D. Moreover, only two-thirds of Grade 2 cases were correctly assigned by the machine learning system, and three-quarters of the mismatched cases actually met the criteria of a higher grade.
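To make the distinction between the overall F value and the per-grade F values concrete, the following is a minimal sketch of how such values could be computed from clinician-assigned and machine-assigned grades. It assumes that the F value is the usual harmonic mean of precision and recall and that the overall value is an unweighted (macro) average across grades; the exact definition and averaging used in this study are not restated here. The function names and the ten toy labels are purely illustrative and are not data from this study.

```python
from collections import Counter


def per_grade_f(y_true, y_pred):
    """Return {grade: F value}, where F = 2PR / (P + R) for each grade."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted grade gains a false positive
            fn[truth] += 1  # true grade gains a false negative
    scores = {}
    for grade in sorted(set(y_true) | set(y_pred)):
        p_den = tp[grade] + fp[grade]
        r_den = tp[grade] + fn[grade]
        precision = tp[grade] / p_den if p_den else 0.0
        recall = tp[grade] / r_den if r_den else 0.0
        pr_sum = precision + recall
        scores[grade] = 2 * precision * recall / pr_sum if pr_sum else 0.0
    return scores


def macro_f(scores):
    """Unweighted mean of the per-grade F values (an assumed averaging)."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels (NOT study data): one Grade 2 case is
# assigned to Grade 4C by the classifier.
clinician = ["1", "1", "2", "2", "2", "3", "3", "4C", "4C", "4D"]
machine = ["1", "1", "2", "2", "4C", "3", "3", "4C", "4C", "4D"]

toy_scores = per_grade_f(clinician, machine)
for grade, value in toy_scores.items():
    print(f"Grade {grade}: F = {value:.2f}")
print(f"Macro-averaged F = {macro_f(toy_scores):.2f}")
```

In this toy example, a single Grade 2 case misassigned to Grade 4C lowers the F values of both grades while the averaged value remains high, which mirrors how a good overall F value can coexist with a weak per-grade result.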
Grade 2 may be particularly easy to misjudge via machine learning because these lesions have an unclear borderline and a cord-like shape and are easily misclassified as more invasive tumors (e.g., Grade 4C), even during a subjective clinician-based analysis. Grade 2 cases also comprised the smallest subpopulation in this study. Consequently, the training was likely inadequate for this grade, and many cases were misclassified. We consider this to be why the machine learning-determined Grade 2 cases were associated with the second-worst survival rate. Although machine learning and AI are being promoted in the medical field, particularly as diagnostic approaches to head and neck cancers, our findings suggest that clinicians should consider the risk of misjudgment when using machine learning that is trained on human-determined features [13].
This study was limited by the fact that hematoxylin and eosin (HE)-stained images were largely not used, despite the desirability of such an approach from the perspectives of cost and convenience. However, as this research was a first attempt at applying this technology, we performed IHC to detect claudin-7, which specifically stains OSCC tumor cells, to further clarify the borderline between the tumor and the stroma and ensure clear binary images [20, 21]. The use of HE specimens alone would have made it particularly difficult to capture the sparsely scattered tumor cells in the stromal tissue of Grade 4D specimens. However, the inclusion of a claudin-7 IHC analysis facilitated the detection of tumor cells even in these Grade 4D cases [20]. In the future, advances are needed to ensure that machine learning can produce clear binary images from simpler and more practical HE samples.
To improve the classification accuracy using deep learning, it is necessary to include a substantially larger number of cases; however, we did not have the required number of cases.
However, a classifier can still be created with a limited number of cases if a clinician supplies the supervised image data, which minimizes the amount of learning required. Therefore, we initially aimed to facilitate the creation of classifiers via machine learning by specifying the features used by clinicians to determine the YK classification. The good overall F value suggests that suitable features were extracted. Moreover, this approach might also be useful for constructing an automatic YK classification method, although the accuracy must be improved.
The accuracy of machine learning could potentially be improved by dramatically increasing the number of cases. Although many pathological images and much clinical information can be obtained from The Cancer Genome Atlas database, this information is provided in a pathological image format and the contents are not uniform [22]. Consequently, it is difficult to apply these data in a machine learning setting. In the future, it will be necessary to collect a larger number of cases through a multi-center collaboration. The advances in deep learning and in inter-pathologist reproducibility, including for the YK classification, that such efforts encourage are expected to lead to a breakthrough in the field. Furthermore, increasing numbers of patients will benefit when clinicians and pathologists use a more effective AI system. We should continue to cooperate with the field of AI analysis to develop diagnostic tools.