Background: Mining the massive body of prescriptions accumulated in Traditional Chinese Medicine (TCM) over several thousand years to discover essential herbal groups for distinct efficacies is significant for TCM modernization and has recently begun to draw attention. However, most existing methods for this task treat herbs with different surface forms as orthogonal, unrelated symbols and determine efficacy-specific herbal groups from the raw frequency with which an herbal group occurs in a collection of prescriptions. Such methods overlook the fact that TCM prescriptions were formed empirically by different people at different historical stages and are therefore full of herbs whose different surface forms denote the same material, as well as noisy and redundant herbs.
Methods: We propose a two-stage approach for detecting efficacy-specific herbal groups from TCM prescriptions. In the first stage, we devise a hierarchical attentive neural network model to capture the herbs in a prescription that are essential to its efficacy; herbs are encoded as dense real-valued vectors, learned automatically, so that their differences are identified at the semantic level. In the second stage, frequent patterns are mined from the distilled prescriptions obtained in the first stage to discover the essential herbal groups for each efficacy.
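The two stages can be illustrated with a minimal Python sketch. It is not the model described here: the attention weights are hand-set placeholder numbers standing in for the output of the hierarchical attentive network, and the herb names, the `threshold` and `min_support` parameters, and the brute-force itemset counting are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def distill(attention, threshold=0.2):
    """Stage 1 stand-in: keep herbs whose (hypothetical) attention weight
    reaches the threshold. A trained model would supply these weights."""
    return {herb for herb, weight in attention.items() if weight >= threshold}


def frequent_groups(prescriptions, min_support=2, max_size=3):
    """Stage 2 stand-in: count every herb group of size 2..max_size and keep
    those occurring in at least min_support distilled prescriptions."""
    counts = Counter()
    for herbs in prescriptions:
        for size in range(2, max_size + 1):
            for group in combinations(sorted(herbs), size):
                counts[group] += 1
    return {group: n for group, n in counts.items() if n >= min_support}


# Placeholder prescriptions for one efficacy: herb -> assumed attention weight.
weighted_prescriptions = [
    {"herb_a": 0.50, "herb_b": 0.40, "herb_c": 0.10},
    {"herb_a": 0.45, "herb_b": 0.45, "herb_d": 0.10},
    {"herb_a": 0.60, "herb_c": 0.30, "herb_e": 0.10},
]

distilled = [distill(p) for p in weighted_prescriptions]
groups = frequent_groups(distilled, min_support=2)
print(groups)
```

In this toy input, low-weight herbs are dropped before counting, so only groups made up of highly weighted herbs can reach the support threshold. A real pipeline would likely replace the exhaustive counting with a dedicated frequent pattern mining algorithm.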
Results: We verify the effectiveness of the proposed approach from two aspects: the ability of the hierarchical attentive neural network model to distill a prescription, and the accuracy of discovering efficacy-specific herbal groups.
Conclusion: The experimental results demonstrate that the hierarchical attentive neural network model is capable of capturing the herbs in a prescription that are essential to its efficacy, and that the distilled prescriptions can significantly improve the performance of efficacy-specific herbal group detection.