Improving Biomedical Named Entity Recognition with Syntactic Information

doi:10.21203/rs.3.rs-21994/v1


Research article


https://doi.org/10.21203/rs.3.rs-21994/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 25 Nov, 2020

Read the published version in BMC Bioinformatics


Background Biomedical named entity recognition (BioNER) is an important task for understanding biomedical texts. The task is challenging because large-scale labeled training data and domain knowledge are scarce. Previous studies have shown that syntactic information can be useful for named entity recognition; however, most of them treat the syntactic information as a gold reference and therefore fail to weigh it according to its contribution.

Results In this paper, we propose BioKMNER, a BioNER model that uses key-value memory networks to incorporate syntactic information extracted from syntactic structures automatically generated by existing toolkits. Our approach outperforms baselines without memories and achieves new state-of-the-art results on four biomedical datasets compared with previous studies: 85.67% on BC2GM, 94.22% on BC5CDR-chemical, 90.11% on NCBI-disease, and 76.33% on Species-800.
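The abstract does not spell out how the memory is read, so the following is only a minimal sketch of a generic key-value memory lookup, not the BioKMNER implementation. It assumes each token has a hidden vector, that keys are embeddings of context features and values are embeddings of the syntactic labels paired with them, and that the weighted values are added back to the hidden vector. The function names `softmax`, `dot`, and `memory_read` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def memory_read(hidden, key_vecs, value_vecs):
    """Weigh each value by how well its key matches the hidden state.

    hidden:     vector for one token (assumed to come from some encoder)
    key_vecs:   one embedding per memory slot (assumption: context features)
    value_vecs: one embedding per memory slot (assumption: syntactic labels)
    Returns the attention weights and the hidden vector plus the weighted values.
    """
    weights = softmax([dot(hidden, k) for k in key_vecs])
    read = [
        sum(w * v[i] for w, v in zip(weights, value_vecs))
        for i in range(len(hidden))
    ]
    augmented = [h + r for h, r in zip(hidden, read)]
    return weights, augmented


# Toy example with two memory slots and two-dimensional vectors.
weights, augmented = memory_read(
    hidden=[1.0, 0.0],
    key_vecs=[[1.0, 0.0], [0.0, 1.0]],
    value_vecs=[[0.5, 0.5], [-0.5, 0.5]],
)
```

In this sketch the weights sum to one, so a syntactic value whose key matches the token poorly contributes little to the result. That is one way to weigh automatically generated syntactic information instead of treating it as a gold reference.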

Conclusion Experimental results on four benchmark datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all of them.

Bioinformatics

Machine Learning

Artificial Intelligence
