Mapping Global Knowledge Domain, Research in Information Retrieval in Medical Sciences: A Scientometric and Evaluative Study

Objective: The main goal of this paper is to visualize and map the intellectual and cognitive structures of information retrieval (IR) in the medical sciences using science mapping. Methods: This scientometric study was undertaken using Mapping Knowledge Domain (MKD) methods, drawing scientific maps to analyze scientific output and show its trends. We retrieved all documents indexed in the Web of Science (WOS) database on the topic of storage and retrieval of information in medical sciences. Three software tools were used to analyze the results: SciMAT-v1.1.04, VOSviewer-v1.6.14, and CitNetExplorer_v1.0.0. Results: Most scientific output in this field falls into two categories: (1) effective methods of organizing information, and (2) the application and operation of IR systems in intelligent question answering, together with the analysis of the information behaviors of physicians and health professionals. The similarity index increased over time from 0.43 to 0.71. The Similarity measures, Expert systems, Concepts, Experience, Answers, and Multi-model IR clusters are mature and fully centralized clusters located in the first quadrant of the strategic chart. Conclusions: The effectiveness of scientific documents that address clinical question answering and health professionals' information behaviors has increased relative to that of documents on search methods and tools.


Introduction
In recent years, the way information is disseminated in the scientific world has changed drastically. The huge volume of scientific evidence makes it more difficult for researchers to access the information they need. Information retrieval (IR) technologies have therefore been developed to answer the information needs of researchers and scholars and to help them retrieve the most appropriate and diverse scientific resources for their questions (Alves et al. 2018; Di Girolamo 2019; Xu et al. 2019).
The tools created in the science of data retrieval are used to reach as much of the produced content as possible, both directly and indirectly. In past decades, language modeling, filtering, recommendation systems, and interactive question answering have become the main areas of research. This research appears to focus increasingly on users, including behavior modeling, and fixing the user interface has become more important. Meanwhile, information technology, information systems, and data retrieval have changed in ways that could not have been imagined. These changes have occurred so rapidly that it is difficult to predict what will happen in the next 20 years, which in turn makes it difficult to assess the quality of scientific development in the field of IR, especially in medical sciences (Harman 2019; Li et al. 2013). Discovering subject trends in medical information retrieval (MIR), as a way to obtain useful information from the explosive amount of data available in the field of health, helps improve healthcare services and provide better therapeutic responses for patients, and has always been of interest to researchers in the field (Irmawati et al. 2019).
In this regard, the ability to map concepts, ideas, and issues in various scientific fields has become very important, because it helps policymakers achieve their goals. As a result of the information explosion in medical sciences, several approaches to knowledge mapping have been developed in this field.
The scientometric method is the quantitative study of documents in a scientific field, carried out through empirical methods such as visualization, citation analysis, vocabulary analysis, etc. (Ding et al. 2001; Janssens et al. 2005).
In scientometric research, one indicator of the maturity of a research field is growth in the number and quality of its research publications. By examining the scientific output of a particular field, it is possible to understand the nature of research in that field. This way of evaluating the status of a field provides a great deal of information and can reflect the field's current content and orientations (Franker 2020; Mohammadi et al. 2019; Rorissa and Yuan 2012). The earlier paradigm and intellectual base of a discipline, as reflected in its existing scientific output, can inform us about future research fronts and about the development of distinct research areas or topics. Overall, these connections "represent the cognitive structure of the research area" (Lu et al. 2020; Rorissa and Yuan 2012).
Tracking and mapping developments in the subject areas of science, by identifying the most important documents and analyzing them in scientometric studies, is of increasing importance and is growing rapidly. In this regard, MKD as a research method provides ways to make sense of the mass of data obtained. MKD, also called knowledge graph or knowledge visualization research, is an interdisciplinary approach that draws on applied mathematics, information science, and computer science, and is a new field of scientometric research. MKD produces graphic representations of the development processes and structural relationships in a field of scientific knowledge. It is an effective tool for tracking the frontiers of science and technology and for presenting scientific research in a way that supports scientific and technological decisions. Mapping what is known in the sciences makes its most notable elements visible and helps researchers extract that knowledge and interact with it (Li et al. 2019; Zeng et al. 2017; Zhu et al. 2015; Zou and Vu 2019).
In this paper, our main goal is to visualize and map the intellectual and cognitive structures of IR in the medical sciences through science mapping. Science mapping, or bibliometric mapping, focuses on delimiting the research areas of a scientific field in order to determine its cognitive structure and evolution (Cobo et al. 2011). By presenting various maps of data on scientific output in the field of IR in medical sciences, and by discovering patterns in this large body of data, we attempt to gain insight into the current research structure of IR knowledge and to provide perspectives for future research in this field. This research has therefore been carried out with MKD methods, drawing scientific maps to analyze scientific output and trace its trends.
In this section, we describe two related categories of work: information retrieval in medicine and MKD in medical science.

Information Retrieval in Medical and Health domain
Research on information retrieval in the field of medical sciences is very diverse and focuses on the application of data analysis methods, especially artificial intelligence tools, to data retrieval. There is also research on optimal retrieval methods and on trial-and-error evaluation of semi-automated search strategy formulation in various fields. The advent of Web 2.0 and social media has also been studied in medical information retrieval. For example, Di Girolamo points to the effective emergence of social networks in medical information retrieval. Milliken et al. designed an accurate medical article retrieval system (ARtPM) that ranks related articles in order to summarize specific medical cases (genetic types, diseases, demographics, and other medical conditions) (Milliken et al. 2019). Salvador-Oliván et al. examined errors in the search strategies of systematic reviews and their effects on data retrieval (Salvador-Oliván et al. 2019). Ahmed et al. used information retrieval and data mining methods to retrieve medical images, one of the most important medical resources (Ahmed et al. 2016). Hess describes techniques for accessing information through various sources on the Internet (Hess 2004).

Mapping knowledge domain
In the field of information retrieval, few studies have been conducted on scientific mapping and the use of MKD. One of the most notable is the research of Zhao and Rui, which uses the CiteSpace II information visualization software to map knowledge based on cross-language information retrieval (CLIR) data and to analyze CLIR research focal points (Zhao and Rui 2011).
Zowj, Ghane, and Ehsanifar identified trends in data retrieval research using the author citation network (Zowj et al. 2019). Ding et al. mapped the intellectual structure of the field of information retrieval (IR) during the period 1987-1997. They used keyword analysis to reveal patterns and trends in the field of IR by measuring the co-occurrence strength of terms in the publications or other texts produced in the field (Ding et al. 2001).
As mentioned, until this research was done, no scientometric or MKD study had been conducted to analyze research on information retrieval in medical sciences. In the most relevant research, Kim et al. used MKD methods to examine historical footprints, emerging technologies, and challenges in using UMLS resources and tools, and to present potential future directions (Kim et al. 2020). MKD has, however, been used in many studies in specific fields of medical sciences. For example, Ebener et al. explored the possibility of integrating knowledge mapping into a conceptual framework that can be used to understand the many complex processes involved in a health system and to identify and address potential gaps in knowledge translation processes (Ebener et al. 2006). Zhang et al. examined the main research topics in military health and medicine, with the aim of providing a reference for development in that area (Zhang et al. 2017). Chen et al. used CiteSpace as a research tool to explore key factors such as the focus of research, key researchers, its evolution, and important results over the past ten years (Chen et al. 2020). Cargnel et al. improved the laboratory diagnostic research capacity for emerging diseases using knowledge mapping (Cargnel et al. 2020). Raju Vaishya examined publication trends in 3D printing in the field of orthopedics using MKD. Zhao et al. conducted research to reveal the general state of research on the Ebola virus by mapping knowledge of the Ebola virus literature around the world (Zhao et al. 2015).

Methods
In this study, we retrieved all documents indexed in the Web of Science database on the topic of storage and retrieval of information in medical sciences. To achieve maximum comprehensiveness, the SCI-EXPANDED and SSCI collections were selected. The search took place on March 15, 2020. To trace the thematic development of the field, all documents available in the Web of Science database for this search that were published between 1968 and 2020 were examined.

Search Strategy
Resources were searched in the Topic field with the following strategy: ((Retrieval AND (information* OR storage* OR data* OR system* OR article* OR research* OR image*)) AND (health* OR Medic*)).
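The structure of this search strategy can be made explicit with a short Python sketch. It is only an illustration of how the three term groups combine; the helper names (`or_group`, `build_topic_query`) are hypothetical and are not part of any Web of Science tool or API.

```python
# Illustrative sketch: compose the Topic-field query from its three term groups.
# The helper names below are hypothetical, chosen only for this example.
CORE_TERM = "Retrieval"
OBJECT_TERMS = ["information*", "storage*", "data*", "system*",
                "article*", "research*", "image*"]
DOMAIN_TERMS = ["health*", "Medic*"]


def or_group(terms):
    """Join truncated terms into one parenthesized OR group."""
    return "(" + " OR ".join(terms) + ")"


def build_topic_query(core, objects, domains):
    """Core term AND any object term, then AND any domain term."""
    return f"(({core} AND {or_group(objects)}) AND {or_group(domains)})"


query = build_topic_query(CORE_TERM, OBJECT_TERMS, DOMAIN_TERMS)
print(query)
```

Read this way, a record matches when its topic fields contain "Retrieval", at least one of the object terms, and at least one of the health or medicine terms.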

Inclusion and Exclusion Criteria
This study examined all articles on information retrieval in the field of medical sciences. The inclusion criteria were therefore as follows: all research articles on information retrieval, data retrieval, data storage and retrieval systems, retrieval systems, article retrieval, research retrieval, or image retrieval in the field of medicine and health. The exclusion criteria were: (1) studies that are not research articles, and (2) studies whose bibliographic information was not sufficient to obtain standard outputs.
Out of the total documents retrieved, 8404 articles were included in the study and 4578 unrelated articles were excluded. An article was removed as irrelevant if, after a review of its abstract or full text, it became clear that its subject was not directly related to information retrieval in medical sciences.

Data analysis
To analyze the results, three software tools were used: SciMAT-v1.1.04, VOSviewer-v1.6.14, and CitNetExplorer_v1.0.0. SciMAT (Science Mapping Analysis Software Tool) is open-source science mapping software designed at the University of Granada (Cobo et al.). It supports scientometric analysis based on bibliographic networks such as co-word, co-citation, author co-citation, journal co-citation, co-author, bibliographic coupling, journal bibliographic coupling, and author bibliographic coupling (Cobo et al. 2012). CitNetExplorer was used in this study to cluster documents based on citation relationships and to analyze results based on individual authors. This software was designed at Leiden University as a tool for visualizing and analyzing citation networks of scientific publications at the level of authors. VOSviewer, a software tool for creating and visualizing bibliographic networks, was used for thematic cluster analysis. While CitNetExplorer is used to analyze a cluster at the level of separate documents, VOSviewer is used to analyze clustering at the level of the entire set of articles (Cobo et al. 2012).
To extract bibliographic and citation information from the documents in a format readable by the software, the data were exported as full records (covering authors and author affiliations, source journal titles, titles, keywords, and abstracts) with cited references, in plain-text format. Some considerations on how to configure the software for analysis are provided in Table 1.

Results
The analysis showed that a total of 3826 publications were connected by 3661 citation links during the study period. Table 2 provides an overview of citation links in three three-year block periods. The highest numbers of citation links and related publications are observed in the third 10-year period. The chronological citation network is shown in Figure 2.
CitNetExplorer was used to visualize the citation network of documents; by default, it labels each document with the first author's last name. In this image, the circles represent documents and the curved lines represent the citation relationships between them (Van Eck and Waltman 2011).
The map above shows that the main and most cited articles fall under two main themes. In the first theme (left), the main content of the resources is effective methods of organizing information. Most of the articles in this category deal with mining methods, ontologies and their application, and the indexing of resources. Over time, the topics of the articles move from mining and its methods for retrieving information to ontologies and their application to the meaning of information. In this regard, the article by Wilbur and Yang is considered a foundational article. They provided a new information-theoretic interpretation of term strength, reviewed some of its uses in the processing of documents for IR, and described new results obtained in document categorization (Wilbur and Yang 1996).
In the second theme, the main content of the documents is the application and performance of IR systems for question answering and the analysis of the information behaviors of health professionals. Over time, the thematic content of the documents has shifted from search, text browsing, and search tools to topics such as answers to physicians' clinical questions and information-seeking behaviors. The article by Hersh and Hickam is considered a foundational article in this category; it discusses the use of IR systems by physicians to answer clinical questions and physicians' information behavior. The purpose of that article is to provide a conceptual framework and to apply the results of previous studies to this framework (Hersh and Hickam 1998).
Thematic clustering of the documents based on the CitNetExplorer analysis is shown in Figure 3. The analysis categorized the documents into 4 thematic clusters, each containing documents that are strongly related to each other. A total of 136 documents are placed in these clusters, and the core clusters are distributed as follows: 48 (35%) in group 1 (blue), 36 (26%) in group 2 (green), 35 (25%) in group 3 (red), and 10 (7%) in group 4 (orange).
The theme of the documents in the blue cluster was "the analysis of physicians' information behavior, IR systems, EBM, and CDSSs", and that of the green cluster was "EHR and Medical Documents". The theme of the red cluster was "text mining and indexing", and that of the orange cluster was "question answering systems".
As can also be seen in the chart above, most of the core publications were published between 2000 and 2010, which indicates the significant impact of the scientific activities of this decade on the scientific output of the next decade. In other words, most of the core scientific products that created the infrastructure for other IR research in medical sciences date from 2000 to 2010. As shown in Table 2, the citation link ratio of the scientific production of this period was higher than the number of its publications (0.67).
Topic networks were built from co-occurrence networks and term maps using VOSviewer. This representation shows the most important terms in the publications belonging to a cluster and the co-occurrence relationships between these terms. In this section, the co-occurrence of words is analyzed to examine thematic trends in the field of IR in medical sciences.
One problem at this stage was the existence of different spellings, singular and plural forms, and synonyms of the same concepts when drawing lexical maps. Therefore, to unify the concepts and prevent the same concept from being dispersed across variants, the researchers first designed a specialized thesaurus for IR in medical science to be used in the VOSviewer analysis. Support for such a thesaurus is one of the specific advantages of VOSviewer. Figure 4 shows a picture of the terminology designed for use in analyzing the data with VOSviewer.
The results of this section showed that the documents examined had a total of 10783 keywords. In addition to the authors' keywords, the Web of Science database provides "KeyWords Plus" terms to give a more accurate overview of the content of articles. Therefore, based on the researchers' experience, both options were selected as keyword sources for deeper analysis. To draw meaningful knowledge maps, the minimum number of occurrences was set to 20, and under this condition 116 keywords were selected as frequent keywords for these articles. Then, to increase accuracy, irrelevant keywords such as "medicine" were removed from the selected keywords. In the end, 80 keywords remained. In all maps, the weight of each word is plotted according to its frequency of occurrence.
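The keyword preparation described above (merging variants through a thesaurus, applying a minimum occurrence threshold, and removing irrelevant terms) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure VOSviewer runs internally: the thesaurus entries, the function names, and the toy records are all hypothetical, and the toy threshold is 2 rather than the 20 used in the study.

```python
from collections import Counter

# Hypothetical thesaurus: variant form -> preferred label (illustrative entries only).
THESAURUS = {
    "ontologies": "ontology",
    "ir systems": "ir system",
    "electronic health records": "electronic health record",
}


def normalize(keyword, thesaurus=THESAURUS):
    """Lowercase a keyword and map known variants to their preferred label."""
    key = keyword.strip().lower()
    return thesaurus.get(key, key)


def frequent_keywords(records, min_occurrences, stop_terms=()):
    """Count normalized keywords per document and keep the frequent, relevant ones."""
    counts = Counter()
    for keywords in records:
        # A set ensures each preferred label is counted at most once per document.
        counts.update({normalize(k) for k in keywords})
    excluded = {normalize(s) for s in stop_terms}
    return {k: n for k, n in counts.items()
            if n >= min_occurrences and k not in excluded}


# Toy records standing in for the keyword lists of three documents.
records = [
    ["Ontologies", "IR systems", "medicine"],
    ["ontology", "IR system", "Medicine"],
    ["Ontology", "MeSH", "medicine"],
]
selected = frequent_keywords(records, min_occurrences=2, stop_terms=["medicine"])
print(selected)
```

In this toy run, "Ontologies" and "ontology" are counted as one concept, "MeSH" falls below the threshold, and "medicine" is dropped as an irrelevant term, which mirrors the reduction from 116 to 80 keywords in spirit only.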
The placement of keywords in clusters and the distance between nodes are based on the co-occurrence of keywords in the same documents. The size of each circle in a cluster indicates the frequency of that word in the cluster (Mohammadi et al. 2019; Rezaei and Mohammadi 2018).
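The co-occurrence counts that underlie such a map can be illustrated with a small Python sketch. It assumes that two keywords co-occur when they appear in the same document, that the number of links of a keyword is the number of distinct keywords it co-occurs with, and that its total link strength is the sum of those co-occurrence counts. The function names and toy records are hypothetical, and the sketch does not reproduce VOSviewer's layout or clustering.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count how many documents each unordered keyword pair shares."""
    pairs = Counter()
    for keywords in records:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


def link_indicators(pairs):
    """Derive per-keyword number of links and total link strength."""
    links = Counter()
    strength = Counter()
    for (a, b), weight in pairs.items():
        for keyword in (a, b):
            links[keyword] += 1          # one more distinct neighbour
            strength[keyword] += weight  # add the co-occurrence count
    return links, strength


# Toy keyword lists for three documents (already normalized).
records = [
    ["mesh", "query expansion", "umls"],
    ["mesh", "query expansion"],
    ["mesh", "umls"],
]
pairs = cooccurrence(records)
links, strength = link_indicators(pairs)
print(dict(pairs))
print(dict(links), dict(strength))
```

Keywords that share many documents end up with a high total link strength, which is why they tend to sit close together and at the center of their cluster on the map.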
After drawing the clusters and examining the keywords, it was found that the analyzed documents fell under the themes of IR technologies and techniques (first cluster), information behaviors and CDSS systems (second cluster), indexing and knowledge representation tools (third cluster), the knowledge of searching for resources and topics related to databases (fourth cluster), and searching for information on the web (fifth cluster). The first and second clusters had the highest number of keywords, with 30 items, followed by the third cluster with 10 items, the fourth with 7 items, and the fifth with 4 items.
In terms of all three indicators (links, total link strength, and keyword occurrences), the most important keywords in the clusters are as follows: in the first cluster, "Information storage and retrieval", "IR system", "Natural language processing", and "Ontologies"; in the second cluster, "Knowledge", "Models", and "Electronic health record"; in the third cluster, "Query expansion", "MeSH", "UMLS", and "Terminology"; and in the fourth cluster, "Bibliographic databases", "Bibliometric", "Databases", and "Literature searching" (Figure 5).
Using two indicators, centrality and density, the strategic chart is divided into four quadrants. The topics in the upper right quadrant (first quarter) are fully developed and are very important for the development of the main research structure in medical science. They are known as special themes due to their high centrality and density.
The placement of topics in this quadrant means that they have the greatest internal coherence and connection and are conceptually very close and related. Topics in the upper left quadrant (second quarter) are still coherent but decentralized, each consisting of smaller specialized areas of science. Topics in the lower left quadrant (third quarter) have low density and centrality and mainly reflect emerging or declining scientific disciplines. Topics in the lower right quadrant (fourth quarter) are important in a research field but have not yet matured and have the potential to become major topics in the field (Abdollahzadeh 2019; Cobo et al. 2011; Ke et al. 2013; Melcer et al. 2015) (Figure 7).
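The quadrant logic described above can be expressed as a short Python sketch. It assumes that each theme already has a centrality and a density value and that the chart origin is placed at the median of each indicator; the choice of the median, the function name, and the example values are assumptions made for illustration and are not taken from the study's data or from SciMAT's implementation.

```python
from statistics import median


def classify_themes(themes):
    """Map each theme to a strategic-chart quadrant.

    themes: dict of name -> (centrality, density).
    The origin is assumed to be the median of each indicator.
    """
    c_mid = median(c for c, _ in themes.values())
    d_mid = median(d for _, d in themes.values())
    quadrants = {}
    for name, (centrality, density) in themes.items():
        central = centrality >= c_mid
        dense = density >= d_mid
        if central and dense:
            quadrants[name] = "first quarter: developed and central"
        elif dense:
            quadrants[name] = "second quarter: coherent but decentralized"
        elif not central:
            quadrants[name] = "third quarter: emerging or declining"
        else:
            quadrants[name] = "fourth quarter: important but not yet mature"
    return quadrants


# Made-up indicator values for four placeholder themes.
themes = {
    "theme A": (0.9, 0.8),
    "theme B": (0.2, 0.9),
    "theme C": (0.1, 0.1),
    "theme D": (0.8, 0.2),
}
quadrants = classify_themes(themes)
for name, label in quadrants.items():
    print(name, "->", label)
```

Because the origin is relative to the set of themes being compared, the same theme can move between quadrants when the period or the set of clusters changes.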
To explain the situation more accurately, a strategic diagram is presented based on the number of scientific products and the citations they have received in the field under study.
Based on the average number of citations to scientific products, the largest clusters are "Similarity measures" (40.41 citations), "Mechanism" (39.37 citations), and "Barriers" (34.82 citations). In the Similarity measures cluster, "Similarity measures" and "distance nodes", with 11 documents, were the largest nodes, followed by "Sets" and "Topic Models" with 6 documents. In the Mechanism cluster, the nodes were "Mechanism" with 15 documents and "Single-molecule magnet" with 3 documents. In the Barriers cluster, the nodes were "Complexes" with 14 documents and "Barriers" with 4 documents.

Discussion
Examining the thematic areas of information retrieval in medical sciences and drawing their maps is one of the most essential methods for anticipating future research based on the path taken so far. This study was carried out to evaluate the evolution of research and to map the global knowledge domain in the literature of this field.
Analysis based on the effectiveness of research in the field of IR in medical sciences (that is, on highly cited documents) shows that most scientific output in this field falls into two categories: (1) effective methods of organizing information, and (2) applications and operations of IR systems, the process of intelligent question answering, and the analysis of the information behavior of physicians and health professionals. The important point here is the growing effectiveness of scientific output on structuring and organizing knowledge and on using tools such as ontologies and other semantic tools to systematize knowledge, compared with methods such as data mining. In other words, over time, research on and attention to pre-designed and semantic tools has increased relative to methods of automatic data extraction and retrieval.
The effectiveness of scientific documents that address clinical question answering and health professionals' information behaviors has also increased compared with that of documents on search methods and tools. This suggests that researchers have focused more on human factors in IR.
Zowj et al., in a study that identified trends in data retrieval research using the author citation network, found 10 clusters: Library and Information Science, Computer Science, Electrical Engineering, Information Retrieval, Information-seeking Behavior, Psychology, Multimedia Information Retrieval, Software Engineering, Ophthalmology, and Surgery. In our research, the documents fell into 4 thematic clusters: "Analysis of Physicians' Information Behavior, IR Systems, EBM and CDSS", "EHR and Medical Documents", "Text Mining and Indexing", and "Question Answer Systems". The reason for this difference, in addition to the focus of the current study on information retrieval articles in the field of medical sciences, was the exclusion of non-information-retrieval articles from our study. Therefore, only articles written directly on information retrieval in medical sciences were included in the cluster mapping. Regarding the information behavior of users, the results of that study are in line with ours. In both studies (information retrieval in general and information retrieval in medical sciences), attention to human dimensions and user behavior has been one of the most important focuses of research (Zowj et al. 2019).
On the other hand, the keyword-based analysis of scientific documents published on IR in medical sciences shows that the thematic cluster "IR technologies and techniques" has been the strongest and most cohesive cluster in terms of all three indicators (links, total link strength, and keyword occurrences). The "Information Behaviors and CDSS Systems" cluster ranks next on all of these indicators, with little difference. This shows that, in terms of the frequency of research subjects, retrieval technologies and techniques are still at the top, but the frequency and strength of human subjects and aspects are significant and close to those of the first cluster. In other words, in terms of the number of items, focus on and attention to human aspects of IR in medical sciences, such as information behavior and the application of technology in clinical processes and related clinical areas, have increased. This confirms the analysis of scientific products based on their effectiveness (that is, on the citation status of published documents).
Ding et al., in their mapping of data retrieval research using keyword analysis, identified 5 main clusters and stated that information retrieval research is moving towards concepts such as the World Wide Web, information retrieval behaviors, artificial intelligence, online databases, electronic publishing, neural networks, knowledge illustration, data mining, and search engines, and that topics such as the information needs of users have been considered in parallel with technical issues of information retrieval. That research is consistent with ours and indicates the continued focus of researchers in this field on the human aspects of information retrieval. The use of intelligent methods of knowledge organization instead of classical, traditional methods has also received more attention (Ding et al. 2001). This part of their results is also consistent with the current results.
From another perspective, Zhao and Rui identified the focal points of cross-language information retrieval research. The main focal points are CLIR techniques, machine translation, query translation, query expansion, and parallel corpora. Similarly, in our study, query expansion was in the third cluster, which shows the importance of query expansion in various areas of data retrieval (Zhao and Rui 2011).
The results also showed that the similarity index increased over time from 0.43 to 0.71.
In Abdollahzadeh's research, which drew a thematic map of the field of library and information science using the co-occurrence method, the metadata cluster was found to be one of the central but undeveloped clusters, which is completely consistent with the results of our research (Abdollahzadeh 2019).

Conclusion
Paying attention to the evolution of scientific fields is one of the most important prerequisites for research policy-making and for predicting the scientific needs of researchers. This study aimed to respond to this need and to draw future perspectives for the highly variable and developing field of IR in medical sciences. The issue is important because IR and its related subjects in medical science need IR techniques to be evaluated as a powerful tool for developing research capabilities. Therefore, the model, maps, and visualizations in this research, which result from a systematic analysis of scientific products in the most prestigious scientific journals in the world, can be effective in understanding research gaps and future needs in IR. Other considerations include a marked convergence of the vocabulary used by researchers (in effect, of research areas) and a relative slowdown in the growth rate of the subject domain in the last decade compared with the period from 2000 to 2010.
Therefore, it seems necessary to pay attention to the expansion of the fields of IR and the application of its concepts in medical information sciences.
In particular, the research findings indicate relative growth in the focus of IR research on the practical and human aspects of IR and on information retrieval behaviors. This reflects the specific situation of IR technologies as applied in medical sciences and the focus on human factors alongside technological factors. It can therefore be recommended that designers of IR systems and techniques in medical information sciences pay closer attention to human factors when developing new technologies and tools.