The growing use of electronic health record (EHR) systems in health care generates a rapidly growing volume of real-world data, presenting new prospects for clinical investigation [1]. EHRs include demographic information about patients, progress notes, problems, medications, vital signs, prior medical histories, vaccinations, laboratory data, and radiological results. The electronic health record is a set of interconnected patient data systems [2–3]. It includes the infrastructure that links these systems, as well as databases, interfaces, physician order entry, electronic communication systems, and clinical workstations. EHR systems are intended to improve the quality of patient care [4–5] while reducing resource utilization, producing a data stream for computerized billing, and providing electronic assistance for secondary users.
Because clinical reports contain a significant quantity of important clinical information, natural language processing (NLP) techniques from artificial intelligence have been applied to extract information from these narratives. An NLP system enables computers to communicate with humans in natural languages rather than computer languages [6–7]. In research on respiratory illness, as well as in other clinical areas, artificial intelligence has been shown to be capable of recognizing and interpreting pulmonary function tests, detecting asymptomatic left ventricular dysfunction from a 12-lead EKG, and identifying early-stage breast cancer from a biopsy [8–10]. EHR-based informatics studies could offer a number of potential benefits beyond those explored in these studies. A massive quantity of complete, continuous medical information is acquired, maintained, and made accessible digitally, encompassing clinical background, blood tests, drugs, therapies, and therapeutic intervention plans.
Clinical Named Entity Recognition (NER) is one of the vital steps in extracting patient data from medical records. Clinical NER has gained considerable research attention because it is a necessary step in the clinical data mining process. Medical named entity recognition differs from generic NER in several respects. The massive number of alternative spellings and synonyms expands the vocabulary, which lowers the performance of generic NER on medical text. Clinical NER is designed to address these challenges [11–13]. The primary goal of clinical NER is to recognize and categorize clinical terms, such as symptoms, drugs, and therapies, in clinical data. When the task is approached as a sequence labeling problem, entity boundaries and category labels are often predicted simultaneously [14]. Many studies have been undertaken to extract named entities from clinical literature through machine learning methodologies and deep neural network-based approaches [15–16]. The existing conventional NER models all use a neural network design that does not rely on handcrafted features, making them more adaptable to various applications, languages, and domains [17–21].
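To make the sequence labeling formulation concrete, the following minimal sketch decodes token-level tags into entity spans. It assumes the common BIO tagging scheme (`B-` begins an entity, `I-` continues it, `O` is outside any entity); the function name `bio_to_spans`, the label names `PROBLEM` and `DRUG`, and the example sentence are illustrative assumptions, not taken from the cited systems.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse token-level BIO tags into (entity text, label) pairs.

    Hypothetical helper for illustration: boundaries and category labels
    are read off the same tag sequence, which is why both can be
    predicted simultaneously.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag: close the open entity
            # and treat this token as outside any entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None

    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Illustrative sentence with assumed labels.
tokens = ["Patient", "reports", "chest", "pain", "and", "takes", "aspirin", "daily"]
tags = ["O", "O", "B-PROBLEM", "I-PROBLEM", "O", "O", "B-DRUG", "O"]
entities = bio_to_spans(tokens, tags)
```

In this sketch a single tag per token carries both the boundary (`B-`/`I-`/`O`) and the category, so a tagger that predicts the tag sequence yields both at once.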
Another task in knowledge discovery is relation extraction, which tries to discover and describe the semantic relations between biological/clinical entities. The types of relations may vary with genre and domain, including gene-protein interactions, medication interactions, or relations among medical concepts (problems, treatments, or tests). In recent decades, the research community has conducted many competitive evaluations of relation extraction for biomedical texts, resulting in a rise in attention. Shared task systems currently in use [22] rely on supervised approaches and a range of features [23–25]. These investigations, however, share one widely acknowledged problem: the influence of the various types of features has not been properly investigated, and relations that span sentences have not yet been explored. To address these issues, a new information retrieval methodology must be devised.
The main contribution of this research is the extraction of knowledge from clinical text data using natural language processing and relation extraction. The specific contributions are as follows:
- An end-to-end clinical knowledge discovery approach is introduced based on the clinical XLNet model, in which entity recognition and relation extraction are both performed by this model alone.
- Both the independent and dependent clinical relation associations are obtained with joint conditional and multinomial naïve Bayes probability functions.
- Relations between entity pairs occurring in the same sentence, as well as in consecutive sentences, are extracted to discover richer knowledge from clinical narratives.
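The idea of considering entity pairs from the same sentence and from consecutive sentences can be sketched as a simple candidate-generation step. This is only an illustration under stated assumptions: the function `candidate_pairs`, the `max_gap` parameter, and the example entities are hypothetical, and the sketch does not reproduce the proposed model or its probability functions.

```python
from itertools import combinations
from typing import List, Tuple

# (entity text, index of the sentence in which the entity occurs)
Entity = Tuple[str, int]


def candidate_pairs(entities: List[Entity], max_gap: int = 1) -> List[Tuple[str, str, str]]:
    """Enumerate entity pairs whose sentences are at most `max_gap` apart.

    With max_gap=1 (an assumed default), pairs come from the same
    sentence or from two consecutive sentences.
    """
    pairs: List[Tuple[str, str, str]] = []
    for (text_a, sent_a), (text_b, sent_b) in combinations(entities, 2):
        gap = abs(sent_a - sent_b)
        if gap <= max_gap:
            scope = "intra-sentence" if gap == 0 else "inter-sentence"
            pairs.append((text_a, text_b, scope))
    return pairs


# Illustrative entities; sentence indices are assumed to come from a
# prior sentence-splitting and entity-recognition step.
entities = [("chest pain", 0), ("aspirin", 0), ("ECG", 1), ("stent", 3)]
pairs = candidate_pairs(entities)
```

Each candidate pair would then be passed to a relation classifier; restricting the sentence gap keeps the number of candidates manageable while still covering relations that cross a sentence boundary.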
This paper is structured as follows: Section 2 provides a review of recent literature, Section 3 discusses the proposed methodology, Section 4 discusses the results of the implementation, and Section 5 concludes the paper.