Background: Mining the massive body of Traditional Chinese Medicine (TCM) prescriptions accumulated over several thousand years, in order to discover the essential herbal groups for distinct efficacies, is significant for TCM modernization and has recently begun to draw attention. However, most existing methods for the task treat herbs with different surface forms as orthogonal and determine efficacy-specific herbal groups from the raw frequency with which an herbal group occurs in a collection of prescriptions. Such methods overlook the fact that TCM prescriptions were formed empirically by different people at different historical stages, and are therefore full of herbs whose different surface forms denote the same material, as well as noisy and redundant herbs.
Methods: We propose a two-stage approach for detecting efficacy-specific herbal groups from TCM prescriptions. In the first stage, we devise a hierarchical attentive neural network model to capture the herbs in a prescription that are essential to its efficacy; herbs are encoded as dense real-valued vectors, learned automatically, so that their differences can be identified at the semantic level. In the second stage, frequent patterns are mined from the distilled prescriptions obtained in the first stage to discover the essential herbal groups for an efficacy.
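The two-stage flow can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the herb names are placeholders, the attention weights are invented numbers standing in for the output of the hierarchical attentive network, the `keep_ratio` cut-off is a hypothetical distillation rule, and the second stage uses brute-force subset counting in place of whichever frequent pattern mining algorithm is actually applied.

```python
import math
from collections import Counter
from itertools import combinations


def distill(herbs, attention, keep_ratio=0.5):
    """Stage 1 (stand-in): keep the herbs with the highest attention weights.

    `attention` maps herb -> weight; in the real approach these weights would
    come from the trained network, here they are supplied by hand.
    """
    k = max(1, math.ceil(len(herbs) * keep_ratio))
    ranked = sorted(herbs, key=lambda h: attention[h], reverse=True)
    return frozenset(ranked[:k])


def frequent_groups(prescriptions, min_support, max_size=3):
    """Stage 2 (stand-in): count every herb subset up to `max_size` and keep
    those whose support (fraction of prescriptions containing it) is at least
    `min_support`. Brute force, suitable only for small examples."""
    counts = Counter()
    for herbs in prescriptions:
        for size in range(1, max_size + 1):
            for group in combinations(sorted(herbs), size):
                counts[group] += 1
    n = len(prescriptions)
    return {g: c / n for g, c in counts.items() if c / n >= min_support}


# Toy prescriptions for one efficacy, each paired with invented attention weights.
raw = [
    (["herb_a", "herb_b", "herb_x"],
     {"herb_a": 0.5, "herb_b": 0.4, "herb_x": 0.1}),
    (["herb_a", "herb_b", "herb_y"],
     {"herb_a": 0.45, "herb_b": 0.45, "herb_y": 0.1}),
    (["herb_a", "herb_c", "herb_z", "herb_b"],
     {"herb_a": 0.4, "herb_c": 0.3, "herb_z": 0.05, "herb_b": 0.25}),
]

distilled = [distill(herbs, weights) for herbs, weights in raw]
groups = frequent_groups(distilled, min_support=0.6, max_size=2)

for group, support in sorted(groups.items()):
    print(group, round(support, 2))
```

In this toy run the low-weight herbs are dropped before counting, so they cannot inflate or dilute the support of any group; mining the raw prescriptions directly would count them like any other herb.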
Results: We verify the effectiveness of the proposed approach from two aspects: the ability of the hierarchical attentive neural network model to distill a prescription, and the accuracy of discovering efficacy-specific herbal groups.
Conclusion: The experimental results demonstrate that the hierarchical attentive neural network model is capable of capturing the herbs in a prescription that are essential to its efficacy, and that the distilled prescriptions can significantly improve the performance of efficacy-specific herbal group detection.