Background: Traditional Chinese medicine (TCM) constitution types differ in their susceptibility to disease and in disease tendency, so constitution identification is of great significance in TCM clinical practice. Identification based on observation and consultation is subjective, whereas objective identification techniques open up a new way to modernize TCM treatment. Our study aimed to build a TCM constitution identification model based on tongue feature data and machine learning algorithms, in order to provide a new, fast, and accurate method for TCM constitution identification.
Methods: We used the TFDA-1 tongue diagnostic instrument to collect standardized tongue images from people with Yang deficiency constitution, Yin deficiency constitution, and balanced constitution, and used tongue image analysis software (TDAS) to quantitatively analyze tongue color, tongue texture, and tongue coating area. Pearson correlation analysis was used to explore the correlation between tongue features and TCM constitution. Four machine learning algorithms (SVM, decision tree, random forest, and XGBoost) were used to build TCM constitution identification models based on tongue features, and the effectiveness of each model was evaluated.
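The correlation-based feature screening described in the Methods can be illustrated with a minimal sketch. It is not the study's actual pipeline: the data are synthetic, the feature names are hypothetical, the label is simplified to a binary one-constitution-versus-rest indicator, and the cutoff `threshold=0.3` is an assumption chosen for illustration (the study screens features by significance of the correlation, not by a fixed cutoff).

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear association
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, names, threshold=0.3):
    """Keep features whose |r| with the label meets the (assumed) threshold."""
    selected = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        r = pearson_r(column, labels)
        if abs(r) >= threshold:
            selected.append((name, r))
    return selected


# Synthetic stand-in data: feature_a depends on the label, feature_b is noise.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]  # 1 = hypothetical target constitution
rows = [[label * 2.0 + rng.gauss(0, 1), rng.gauss(0, 1)] for label in labels]
names = ["feature_a", "feature_b"]  # hypothetical tongue feature names

selected = select_features(rows, labels, names)
for name, r in selected:
    print(f"{name}: r = {r:.2f}")
```

The retained columns would then form the reduced feature group passed to the classifiers, while the unfiltered columns correspond to the whole feature group.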
Results: XGBoost had the highest accuracy among the four machine learning algorithms and the best performance in model evaluation. Pearson correlation analysis found a specific correlation between TCM constitution and tongue features: 16 tongue features were significantly correlated with the Yang deficiency, Yin deficiency, and balanced constitutions. In addition, the model's accuracy for Group 2, which contained these 16 tongue features, was higher than that for the whole feature group (Group 1). XGBoost was the most effective algorithm in this study for identifying TCM constitution, and the tongue features filtered by correlation analysis led to higher identification accuracy.
Conclusions: Tongue feature information can be an essential reference for TCM constitution identification. Machine learning provides a method for rapid identification of TCM constitution types. The well-performing XGBoost constitution identification model offers a reference for the clinical implementation of "Identifying TCM Constitution by Tongue Image" and contributes to the TCM practice of "Preventive Treatment of Disease", as well as to individualized diagnosis, treatment, and health preservation. In addition, objective identification technology opens up a new way to modernize TCM diagnosis and treatment.