Ontologies serve as structured frameworks that depict concepts and their intricate relationships within a defined knowledge domain [1]. In the realm of medicine, their significance lies in their ability to capture and disseminate complex and diverse information sourced from clinical records, biomedical literature, health databases, and patient feedback. The application of ontologies in the medical field offers a multitude of advantages, including improved information retrieval, seamless data integration, enhanced decision support, efficient natural language processing, and valuable knowledge discovery. An illustrative example of Healthcare Ontology and its significance in the field of medicine is presented in Fig. 1.
In addition, it is important to recognize that ontologies are not static or isolated entities; they exhibit dynamic characteristics, adapting to changes in the domain and the evolving needs of users. Furthermore, they actively engage with other ontologies that address related or intersecting facets of the same domain. Consequently, ontology integration emerges as a critical process, where multiple ontologies are skillfully combined into a unified and coherent representation, preserving the semantics and structure of the source ontologies [2].
The integration of ontologies can facilitate the reuse of existing ontologies, promote interoperability, and significantly enrich and enhance the quality and breadth of the integrated ontology [3]. Figure 2 presents the integration of ontologies, shedding light on the importance of seamless knowledge organization in various domains.
Ontology integration encompasses various techniques that vary in granularity and complexity, depending on the similarity and alignment between source ontologies. Common methods include [4]:
1. Ontology Matching: Establishing correspondences or mappings between concepts and relations across multiple source ontologies, using manual, semi-automatic, or automatic approaches based on syntax, semantics, structure, linguistics, logic, or pragmatics. This forms the foundation for ontology alignment, mapping, articulation, or merging [5].
2. Ontology Merging: Creating a new ontology that includes concepts and relations from multiple source ontologies. This involves applying operators or rules to combine them into a unified ontology, either based on ontology matching results or predefined schemas/templates, resulting in union, intersection, difference, or hybrid ontologies [6].
3. Ontology Mapping: Transforming one ontology into another using rules or functions, utilizing ontology-matching results or mapping languages/frameworks. It is commonly used for data integration, query translation, or ontology translation, and can result in direct, indirect, global, local, or mediated mapping [7].
4. Ontology Articulation: Specifying logical connections or constraints between multiple ontologies, utilizing ontology-matching results or articulation languages/frameworks [8]. It is often used for ontology integration, reasoning, or verification, resulting in bridge rules, bridge axioms, bridge ontologies, or ontology relations [9].
These methods offer complementary ways to tackle ontology integration: establishing correspondences, merging ontologies, mapping concepts and relations, and specifying logical connections. Used together, they support knowledge sharing and collaboration across domains and applications.
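To make the matching and merging steps concrete, the following is a minimal sketch and not a method taken from the cited literature. It assumes each ontology is reduced to a dictionary of concept IDs and labels, uses purely lexical similarity from Python's standard `difflib` module, and performs a simple union merge. The ontologies, IDs, function names, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Two toy "ontologies": concept ID -> label. All IDs and labels are hypothetical.
ONTO_A = {"A1": "Myocardial infarction", "A2": "Hypertension", "A3": "Diabetes mellitus"}
ONTO_B = {"B1": "myocardial infarct", "B2": "High blood pressure", "B3": "Diabetes Mellitus"}


def label_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match(onto_a: dict, onto_b: dict, threshold: float = 0.85) -> list:
    """Ontology matching: keep the best-scoring partner in onto_b for each
    concept in onto_a, provided its score reaches the threshold."""
    correspondences = []
    for id_a, label_a in onto_a.items():
        id_b, score = max(
            ((i, label_similarity(label_a, label_b)) for i, label_b in onto_b.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            correspondences.append((id_a, id_b, score))
    return correspondences


def merge_union(onto_a: dict, onto_b: dict, correspondences: list) -> dict:
    """Ontology merging (union): matched concepts collapse into one entry that
    keeps both labels; unmatched concepts are carried over unchanged."""
    partner = {id_a: id_b for id_a, id_b, _ in correspondences}
    matched_b = set(partner.values())
    merged = {}
    for id_a, label_a in onto_a.items():
        labels = {label_a}
        if id_a in partner:
            labels.add(onto_b[partner[id_a]])
        merged[id_a] = labels
    for id_b, label_b in onto_b.items():
        if id_b not in matched_b:
            merged[id_b] = {label_b}
    return merged


correspondences = match(ONTO_A, ONTO_B)
merged = merge_union(ONTO_A, ONTO_B, correspondences)
```

In this toy run, "Hypertension" and "High blood pressure" remain separate concepts because they share almost no characters. This is the kind of case in which semantic or structural matching, rather than string comparison alone, would be needed.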
Ontology Integration in the Medical Field
Ontology integration is crucial in the complex field of medicine due to the diverse and intricate nature of medical knowledge. This field spans multiple subdomains, each with distinct contexts and objectives, involving various sources and applications. Ontology integration in medicine offers significant benefits, including:
1. Improved Information Extraction: Integrated ontologies enhance the accuracy and completeness of information extraction from medical texts by providing a comprehensive range of concepts and relations.
2. Streamlined Data Integration and Analysis: Integrating data from different medical texts using a unified ontology promotes interoperability and consistency, facilitating seamless data integration and analysis across diverse domains.
3. Enhanced Text Generation and Synthesis: An integrated ontology provides a coherent vocabulary and structure, enabling the generation of coherent and rich medical texts through natural language generation techniques.
4. Support for Knowledge Discovery: Linking medical texts to an integrated ontology captures both implicit and explicit connections and patterns among concepts and relations, facilitating the discovery of new knowledge within the medical field.
Ontology integration in the medical field faces several significant challenges:
1. Ambiguity and Variability of Natural Language: Medical texts often contain ambiguity, imprecision, and inconsistencies, making it challenging to identify and match relevant ontological elements. Synonyms and polysemous words further complicate the integration process, as word meanings can vary with context, domain, or author.
2. Incompleteness and Inconsistency of Textual Information: Medical texts may lack comprehensive or consistent information, leading to incomplete representations. Contradictory information and conflicts with other sources pose difficulties in merging concepts and relations from different ontologies.
3. Dynamism and Evolution of Medical Knowledge: Medical knowledge evolves rapidly, making it challenging to keep integrated ontologies up-to-date with the latest developments in the field.
4. Diversity and Heterogeneity of Medical Ontologies: Medical ontologies vary in scope, structure, terminology, and quality, making it complex to resolve conflicts and inconsistencies among them and identify relevant ontologies.
5. Complexity and Scalability: Large and interconnected medical ontologies require substantial computational and cognitive resources for successful integration. Maintaining and evolving integrated ontologies also presents ongoing challenges.
6. Evaluation and Validation: There are no standardized approaches for assessing the quality and utility of ontology integration, so evaluation often requires human involvement and feedback to ensure correctness and adequacy.
Addressing these challenges necessitates innovative approaches capable of handling natural language intricacies, accommodating incomplete and inconsistent textual information, and adapting to the ever-evolving nature of medical knowledge. Overcoming these obstacles in ontology integration can enhance the representation, retrieval, and utilization of medical knowledge from textual sources. This, in turn, contributes to informed decision-making, research progress, and advancements in healthcare practices.
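As an illustration of the evaluation challenge, one commonly used (though not standardized) way to quantify matching quality is to compare predicted correspondences against a manually curated reference alignment using precision, recall, and F1. The sketch below assumes that correspondences are represented as pairs of concept IDs; the function name and the example pairs are hypothetical.

```python
def evaluate_alignment(predicted, reference):
    """Precision, recall, and F1 of predicted correspondences against a
    reference alignment. Both arguments are iterables of (id_a, id_b) pairs."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical concept-ID pairs, for illustration only.
reference = {("A1", "B1"), ("A2", "B2"), ("A3", "B3")}
predicted = {("A1", "B1"), ("A3", "B3"), ("A2", "B4")}
precision, recall, f1 = evaluate_alignment(predicted, reference)
```

Such scores are only as reliable as the reference alignment itself, which still has to be produced and checked by domain experts.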
In the medical field, various ontology integration mechanisms have been proposed and developed to tackle the challenges and complexities associated with integrating ontologies in medical texts. These mechanisms can be broadly classified into two categories: ontology-based and text-based approaches [10].
Ontology-based mechanisms rely on existing ontologies as the primary sources and targets of integration. They employ diverse techniques and tools to discover, compare, match, merge, map, or articulate ontologies from different sources and domains. Examples of ontology-based mechanisms include:
1. UMLS: The Unified Medical Language System (UMLS), as described by Ivanović and Budimac (2014) [10], is a comprehensive compilation of over 200 biomedical terminologies and ontologies that cover various aspects and levels of health and disease. UMLS offers a Metathesaurus, which establishes mappings between concepts from different source ontologies, and a Semantic Network, which defines semantic types and relations for organizing these concepts.
2. PROMPT: The PROMPT suite comprises interactive tools designed for ontology merging and mapping. Noy and Musen (2013) outline the capabilities of PROMPT, which encompass tasks such as identifying differences and similarities between ontologies, merging multiple ontologies into a new one, mapping one ontology to another, extracting modules from ontologies, and partitioning ontologies into smaller ones [7].
3. GALEN: The Generalized Architecture for Languages, Encyclopedias, and Nomenclatures (GALEN) is a framework developed for the creation and utilization of medical ontologies. GALEN offers a core model that establishes a common vocabulary and structure for representing medical concepts and relations. Additionally, it provides a set of modules that extend the core model to cater to specific subdomains [11].
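The idea of a thesaurus that links terms from different source vocabularies through shared concepts can be sketched with a small data structure. This is a toy illustration only and does not reproduce the actual UMLS data model or any UMLS API. The class names, concept identifier, semantic type label, and vocabulary names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One unified concept grouping terms from several source vocabularies."""
    concept_id: str
    semantic_type: str
    terms: dict = field(default_factory=dict)  # source vocabulary -> term


class ToyThesaurus:
    """Hypothetical concept store with a (source, term) lookup index."""

    def __init__(self):
        self.concepts = {}
        self._index = {}  # (source, lowercased term) -> concept_id

    def add(self, concept: Concept) -> None:
        self.concepts[concept.concept_id] = concept
        for source, term in concept.terms.items():
            self._index[(source, term.lower())] = concept.concept_id

    def translate(self, term: str, from_source: str, to_source: str):
        """Map a term from one vocabulary to another via the shared concept.
        Returns None when the term or the target vocabulary is unknown."""
        concept_id = self._index.get((from_source, term.lower()))
        if concept_id is None:
            return None
        return self.concepts[concept_id].terms.get(to_source)


thesaurus = ToyThesaurus()
thesaurus.add(Concept(
    concept_id="X001",
    semantic_type="Disease",
    terms={"vocab_a": "Heart attack", "vocab_b": "Myocardial infarction"},
))
translated = thesaurus.translate("heart attack", "vocab_a", "vocab_b")
```

Routing every translation through a shared concept, rather than storing pairwise term mappings, keeps the number of mappings linear in the number of source vocabularies.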
Text-based mechanisms, by contrast, focus on medical texts as the primary sources and targets of integration. These mechanisms employ natural language processing techniques to extract, analyze, synthesize, or generate medical texts based on ontologies. Some noteworthy examples of text-based mechanisms include:
1. MedLEE: The Medical Language Extraction and Encoding (MedLEE) system, as described by Chiang, Lin, and Yang (2020), is a natural language processing system that extracts structured information from clinical texts. MedLEE utilizes an ontology defining medical concepts and relations, along with linguistic rules that map natural language expressions to ontology terms [12].
2. OntoNLG: The Ontology-based Natural Language Generation (OntoNLG) system, according to Cimiano, Unger, and McCrae (2014), is a natural language generation system that produces coherent and informative summaries from multiple medical texts. OntoNLG leverages an integrated ontology that combines concepts and relations from different source ontologies, along with linguistic rules that generate natural language sentences from ontology terms [13].
3. OntoSum: The Ontology-based Text Summarization (OntoSum) system, proposed by Fiori (2019), is a text summarization system specifically tailored for biomedical literature. OntoSum employs an ontology that captures the primary topics and subtopics within the literature domain, using semantic similarity measures to rank sentences based on their relevance to the ontology. The system generates concise and pertinent summaries.
While these mechanisms demonstrate existing approaches and techniques for ontology integration in the medical field, it is important to recognize their limitations and drawbacks, which drive the necessity for developing an improved ontology integration mechanism [14].
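The general idea of ranking sentences by their relevance to an ontology can be sketched as follows. This is not the OntoSum implementation: it substitutes a simple count of matching ontology terms for a true semantic similarity measure, and the term set, sentences, and function names are hypothetical.

```python
import re

# Hypothetical ontology terms for a cardiology topic (illustrative only).
ONTOLOGY_TERMS = {"myocardial infarction", "troponin", "aspirin", "chest pain"}

SENTENCES = [
    "The patient presented with chest pain and elevated troponin.",
    "He arrived by car in the morning.",
    "Myocardial infarction was diagnosed and aspirin was started.",
    "Follow-up was scheduled.",
]


def relevance(sentence: str, terms=ONTOLOGY_TERMS) -> int:
    """Count how many ontology terms occur in the sentence as whole words,
    ignoring case."""
    text = sentence.lower()
    return sum(
        1 for term in terms
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )


def summarize(sentences: list, k: int = 2) -> list:
    """Return the k highest-scoring sentences in their original order."""
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: relevance(sentences[i]),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


summary = summarize(SENTENCES)
```

Exact term matching of this kind misses synonyms and paraphrases, which is one reason ontology-based summarizers rely on semantic similarity instead.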
Existing ontology-based mechanisms in the field face several limitations. Firstly, they heavily rely on manual or semi-automatic methods for tasks like finding, comparing, matching, merging, mapping, or articulating ontologies, which can be time-consuming, labor-intensive, and error-prone, especially with large and complex ontologies [15].
Secondly, these mechanisms often employ fixed or predefined schemas or templates for ontology integration, which may not be adaptable to different integration purposes or contexts, potentially neglecting user preferences and changes in source ontologies [16]. Moreover, current ontology-based techniques do not adequately consider the unique characteristics of medical texts as integration sources. They overlook elements like headers, tables, figures, and references, as well as the contextual aspects of medical literature, affecting the utility of integrated ontologies in the medical domain.
Lastly, there is a lack of sufficient support for user involvement and feedback in the ontology integration process. Users are typically unable to specify their preferences, constraints, or review and modify integration results, limiting the accuracy and suitability of integrated ontologies for medical texts [17].
Text-based ontology integration mechanisms suffer from several limitations. First, they heavily depend on the quality and availability of medical texts as integration sources, which may be incomplete, inconsistent, outdated, or biased, and in some cases, inaccessible due to privacy concerns [10]. Second, these mechanisms often employ simple or shallow natural language processing techniques for tasks such as extracting, analyzing, synthesizing, or generating medical texts based on ontologies. These techniques may struggle with the inherent ambiguity and variability of natural language and may fail to capture the intricate semantics and structure of medical knowledge [18]. Furthermore, existing text-based approaches underutilize ontologies in the context of medical texts. They do not effectively harness ontologies to guide or enhance natural language processing tasks or enable advanced applications such as information retrieval, data integration, decision support, knowledge discovery, or knowledge exploration in the medical domain, which could significantly benefit medical decision-making and research. Lastly, current text-based approaches inadequately address the challenges associated with ontology integration for medical texts, including the ambiguity and variability of natural language, issues of incompleteness, inconsistency, and the dynamic nature of medical knowledge. These shortcomings directly affect the quality and utility of integrated ontologies for medical texts [19].
As a result, there is a pressing need for an improved ontology integration mechanism that can address these challenges and shortcomings effectively and efficiently.
Proposed Ontology Integration Mechanism
The proposed mechanism for integrating ontologies is based on the following key principles:
1. Ontology-driven: The mechanism utilizes ontologies as the main sources and targets of integration, as well as guides and constraints for natural language processing tasks in medical texts. By leveraging the structured information provided by ontologies, the mechanism improves the accuracy and completeness of information extraction, text analysis, text synthesis, and text generation in the context of medical texts.
2. Text-aware: The mechanism takes into account the specific characteristics and requirements of medical texts as the sources and targets of integration. It considers textual features and formats such as headings, tables, figures, and references. Additionally, the mechanism considers the textual content and context of medical texts, including their purpose, audience, domain, genre, and style. These factors influence the relevance and utility of the integrated ontology for medical texts.
3. User-centric: The mechanism supports user involvement and feedback throughout the ontology integration process. Users can specify their preferences or constraints for ontology integration, such as the desired level of granularity, coverage, consistency, or quality. They can also review or modify the results of ontology integration, including correspondences among ontologies or the structure and content of the integrated ontology. These features enhance the correctness and adequacy of the integrated ontology for medical texts.
4. Adaptive: The mechanism addresses the challenges and complexities associated with ontology integration in the domain of medical texts. It handles the ambiguity and variability of natural language in medical texts, as well as the incompleteness and inconsistency of textual information. Moreover, the mechanism copes with the dynamism and evolution of medical knowledge. These considerations are crucial for ensuring the quality and usefulness of the integrated ontology for medical texts.
The proposed ontology integration mechanism comprises four main components:
1. Ontology Finder: This component is responsible for locating and retrieving relevant ontologies from diverse sources and domains related to medical texts. Various methods and criteria, such as keywords, topics, domains, languages, or formats, are employed to search for ontologies. The component also provides a user interface that allows users to browse and select ontologies for integration.
2. Ontology Integrator: This component integrates multiple ontologies into a coherent and consistent representation that preserves the semantics and structure of the sources. Techniques and tools for ontology matching, merging, mapping, or articulation are utilized in this process. The component also includes a user interface where users can specify their preferences or constraints for ontology integration and review or modify the integration results.
3. Text Processor: This component handles the extraction, analysis, synthesis, or generation of medical texts based on ontologies. Natural language processing techniques are applied to perform tasks such as information extraction, text analysis, text synthesis, or text generation. Ontologies guide and enhance these processes by providing a common vocabulary and structure for representing medical concepts and relationships.
4. Text Application: This component enables advanced applications for medical texts based on ontologies. It utilizes ontologies to provide valuable insights and evidence for medical decision-making and research. Information retrieval, data integration, decision support, knowledge discovery, and knowledge exploration are among the supported functionalities.
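The way the four components could fit together is sketched below. This is a deliberately simplified illustration under stated assumptions, not the proposed mechanism's implementation: ontologies are reduced to dictionaries of concept IDs and labels, integration merges concepts with identical lowercase labels, text processing uses substring lookup, and user review is modeled as an `accept` callback. All function names, parameters, and data are hypothetical.

```python
def find_ontologies(catalog: dict, keywords: tuple) -> list:
    """Ontology Finder: select catalog entries whose name mentions a keyword."""
    return [
        onto for name, onto in catalog.items()
        if any(k.lower() in name.lower() for k in keywords)
    ]


def integrate(ontologies: list, accept=lambda label, concept_id: True) -> dict:
    """Ontology Integrator: merge concepts that share a lowercase label.
    `accept` stands in for user review of each proposed merge.
    Returns normalized label -> set of source concept IDs."""
    integrated = {}
    for onto in ontologies:
        for concept_id, label in onto.items():
            key = label.lower()
            if key in integrated and not accept(key, concept_id):
                continue  # merge rejected; a real system would keep it separately
            integrated.setdefault(key, set()).add(concept_id)
    return integrated


def process_text(text: str, integrated: dict) -> list:
    """Text Processor: list the integrated concepts mentioned in the text."""
    lowered = text.lower()
    return sorted(label for label in integrated if label in lowered)


def retrieve(documents: list, concept: str, integrated: dict) -> list:
    """Text Application: return documents that mention the given concept."""
    return [d for d in documents if concept in process_text(d, integrated)]


# Hypothetical catalog and documents, for illustration only.
catalog = {
    "cardiology-core": {"C1": "Myocardial infarction", "C2": "Aspirin"},
    "drug-list": {"D1": "aspirin", "D2": "Warfarin"},
    "dermatology": {"S1": "Eczema"},
}
documents = [
    "Aspirin was given after the myocardial infarction.",
    "Warfarin dose adjusted.",
    "No medication.",
]

selected = find_ontologies(catalog, keywords=("cardio", "drug"))
integrated = integrate(selected)
hits = retrieve(documents, "aspirin", integrated)
```

Keeping each component behind a plain function boundary means that any one of them, for example the label-equality integrator, could be replaced by a more capable technique without changing the others.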
The proposed ontology integration mechanism offers several notable advantages:
- It synergistically combines ontology-based and text-based approaches, leveraging their respective strengths and compensating for their weaknesses.
- It integrates multiple ontologies from various sources and domains, ensuring comprehensive coverage of concepts and relations relevant to medical texts.
- It produces an integrated ontology that is accurate, complete, consistent, scalable, and suitable for use in medical texts.
- It enhances the representation and processing of medical texts by employing an integrated ontology that facilitates information extraction, text analysis, text synthesis, text generation, information retrieval, data integration, decision support, knowledge discovery, and knowledge exploration.
- It actively involves users in the ontology integration process, allowing them to specify their preferences or constraints and to review or modify the integration results.