Meta-Phill: feasibility of a new metadata repository for Evidence-based practice in literature review

doi:10.21203/rs.3.rs-3133674/v1


Research Article


This work is licensed under a CC BY 4.0 License

Version 1


Meta-Phill is a search engine for peer-reviewed articles designed to support systematic reviews and meta-analyses. Its metadata is based on the PICO (Population/problem, Intervention/exposure, Comparison, and Outcome) components that are central to evidence-based literature review. Because the existing literature is so large, we used a tuned Python engine that applies natural language processing through OpenAI API inference, under human supervision, to record the metadata of these articles. Results are saved to a repository every day, currently at an average of 1.43 seconds and $0.0045 per automated entry. Researchers can use the dataset to supplement their search strategies, filtering studies and finding those with similar PICO characteristics that are most suitable for systematic reviews and meta-analyses. We used R-based text analytics to process the dataset and hierarchically cluster the studies. Existing literature is being recorded by AI, while future research will be submitted by humans for journals included in Meta-Phill.

systematic review

meta-analysis

ChatGPT

artificial intelligence

Health is one of the most important aspects of human life, and medicine should always rely on the best treatment option supported by the available evidence, an approach termed evidence-based medicine (EBM) (1). In EBM, medical decisions are based on the strongest available scientific evidence and reliable information (2). Systematic reviews and meta-analyses provide the highest and most reliable level of evidence used in EBM (3). Their results are written into medical textbooks, taught in medical schools, and applied at the bedside (2, 3). A systematic review (SR), which precedes a meta-analysis (MA), searches all available sources with a fully codified, predefined, and reproducible method in order to find and bring together previously published evidence. Conducting an SR is time-consuming (4). Various automated methods based on machine learning and natural language processing have been introduced to make SRs faster and more detailed (5), but most of these tools are used in stages of the SR other than the primary search. Medical peer-reviewed articles are published at a high rate: in 2010, it was estimated that about 75 trials and 11 systematic reviews were being published per day (6). Given this volume, valid and sufficient search strategies are needed to answer medical questions. In EBM, PICO has been used to construct search strategies from well-formed questions (7). Although scientific data exchange and storage have progressed quickly, and valuable search engines and strategies are used in literature review, access to studies that report similar outcomes within similar study designs and samples or populations remains an aspiration of every evidence synthesis specialist or team.
This need becomes apparent when we consider that the Cochrane Library, the most important source of literature for systematic reviews, has categorized its published systematic reviews by PICO terms. But what if this level of categorization were available for original research articles? PubMed contains more than 35 million articles (8) and Elsevier 18 million (9). No team of humans could categorize all existing literature by PICO at an affordable cost. We therefore aimed to generate a metadata format containing the PICO components.

The scope of the repository covers all fields of the medical sciences that involve a study on living organisms or viruses. Chemical sciences and studies consisting purely of chemical analyses are outside the scope of Meta-Phill, as they are rarely targeted for pooled evidence synthesis. Potential stakeholders are Meta-Phill developers, editors of peer-reviewed journals, the publishers and librarians of those journals, individual researchers, research communities, clinicians, policymakers, and systematic reviewers.

Resources:

Each Meta-Phill metadata entry contains 9 pieces of information and weighs 3.6 kilobytes on average; covering all PubMed records alone would therefore require a data repository of about 126 gigabytes. Handling a dataset of this size requires a powerful server or cluster of servers with high-performance CPUs, a large amount of RAM, and sufficient storage space. MySQL, a relational database management system (RDBMS), stores the metadata.
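The storage figure can be checked with simple arithmetic. The sketch below assumes decimal units (1 gigabyte = 1,000,000 kilobytes) and the roughly 35 million PubMed records cited in the background; the variable names are illustrative only.

```python
# Back-of-the-envelope storage estimate for the repository.
# Assumptions (not stated explicitly in the text): decimal units,
# and about 35 million PubMed records in total.
AVG_ENTRY_KB = 3.6            # average size of one metadata entry
PUBMED_RECORDS = 35_000_000   # approximate number of PubMed records

total_kb = AVG_ENTRY_KB * PUBMED_RECORDS
total_gb = total_kb / 1_000_000   # kilobytes -> gigabytes (decimal)

print(f"Estimated repository size: {total_gb:.0f} GB")
```

The estimate covers the metadata rows only; database indexes and backups would add to it.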

The website is hosted on a Linux server with a 4-core CPU and 4 gigabytes of RAM. The website is written in PHP. The online table is served from a MySQL table through the DataTables software. We configured DataTables with the SearchPanes, Select, FixedColumns, Buttons, SearchBuilder, and DateTime extensions.

Metadata automated generation:

The metadata generator engine is an application written in Python (https://github.com/Metaphill/Engine.git). For studies indexed in the PubMed database, abstracts are first saved as text files via the Entrez API. Article DOIs were also used to save abstracts to text files. The engine code defines a list of OpenAI API keys, randomly selects one for use, and sends prompts for different study aspects such as study design and study population. The exact prompts are as below:

# One prompt prefix per metadata field; the abstract text is appended to each.
# Double quotes are used so that the apostrophes in "don't" do not end the string.
prompts = {
    "Study Design": "Given the following abstract, what is the study design? please don't write sentences. just give me maximum 5 words. \n\nAbstract: ",
    "Study Population/Disease/Situation": "Given the following abstract, what is the study population? Don't write sentences. name those. \n\nAbstract: ",
    "Study Comparison/Prognostic Factor": "Given the following abstract, what is the study comparison? Don't write sentences. name those. \n\nAbstract: ",
    "Study Exposure/Intervention": "Given the following abstract, what is the study exposure/intervention? Don't write sentences. name those. \n\nAbstract: ",
    "Study Primary Outcome": "Given the following abstract, what is the study primary outcome? Don't write sentences. name those. \n\nAbstract: ",
}
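The steps described above (locate the abstract, pick an API key at random, and build one prompt per field) can be sketched as follows. This is a minimal illustration, not the engine's actual code: the function names, the placeholder PMID, and the placeholder keys are assumptions, only two of the five prompts are repeated for brevity, and no request is actually sent to PubMed or OpenAI.

```python
import random
from typing import Dict, List, Optional
from urllib.parse import urlencode

# NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def abstract_url(pmid: str) -> str:
    """Build an efetch URL that returns a PubMed abstract as plain text."""
    params = {"db": "pubmed", "id": pmid, "rettype": "abstract", "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"

# Two of the five prompts from the text; the others follow the same pattern.
PROMPTS: Dict[str, str] = {
    "Study Design": "Given the following abstract, what is the study design? "
                    "please don't write sentences. just give me maximum 5 words. "
                    "\n\nAbstract: ",
    "Study Primary Outcome": "Given the following abstract, what is the study "
                             "primary outcome? Don't write sentences. name those. "
                             "\n\nAbstract: ",
}

def build_requests(abstract: str, api_keys: List[str],
                   rng: Optional[random.Random] = None) -> dict:
    """Pick one API key at random and append the abstract to every prompt."""
    rng = rng or random.Random()
    return {
        "api_key": rng.choice(api_keys),
        "prompts": {field: prefix + abstract for field, prefix in PROMPTS.items()},
    }

# Hypothetical inputs, for illustration only.
print(abstract_url("12345"))
request = build_requests("Example abstract text.", ["key-a", "key-b"])
print(sorted(request["prompts"]))
```

Keeping the prompt construction separate from the network calls makes it possible to inspect or log exactly what would be sent before any tokens are spent.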

To determine the exact study design, the Universal Sentence Encoder (USE) model was applied to the study design column. The source code is available on GitHub. Full lists of ICD disease classifications and MeSH terms were downloaded and embedded in a similar Python application that uses TensorFlow and the USE model to select the best-matching term for the study population.
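Selecting the best-matching term amounts to embedding the query and every candidate term, then taking the candidate with the highest cosine similarity. The sketch below shows that selection step only. It deliberately replaces the USE model with a toy character-trigram embedding so that it runs without TensorFlow; the function names and the three-term vocabulary are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List

def trigram_embed(text: str) -> Dict[str, int]:
    """Toy stand-in for a sentence encoder: counts of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_term(query: str, vocabulary: List[str],
              embed: Callable[[str], Dict[str, int]] = trigram_embed) -> str:
    """Return the vocabulary term whose embedding is closest to the query."""
    q = embed(query)
    return max(vocabulary, key=lambda term: cosine(q, embed(term)))

# Hypothetical mini vocabulary; the real lists are the full ICD and MeSH terms.
vocabulary = ["COVID-19", "Diabetes Mellitus", "Hypertension"]
population = "Adolescents with major psychiatric disorders during the COVID-19 pandemic"
print(best_term(population, vocabulary))
```

With a real sentence encoder, only `embed` and `cosine` would change (dense vectors instead of trigram counts); the arg-max selection stays the same.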

Human supervision:

Two trained researchers were asked to review the data provided by ChatGPT and check its validity. More than 800 entries were validated in 6 hours of working time. Personnel involved in this step always have more than 1 year of research experience and more than 10 published articles. Final results are supervised by trained human research experts, who compare the title of the study with the prompt responses, or with the abstract and full text when a mistake by the AI is suspected.

New metadata registration:

For journals that choose to be included in Meta-Phill, metadata for their articles would be generated and uploaded to the repository by human staff. Individual articles can also be included on request to the admin email. Journals could ask authors to provide this metadata at the submission stage, which would improve metadata quality and lower costs.

R Shiny-based application for classifying studies

The supplied application accepts CSV files, which users can easily download from the repository. It applies hierarchical clustering to the Jaro-Winkler distances calculated between the text inputs of each row. The resulting dendrogram is cut at a specified similarity threshold to form clusters. Users can change this threshold to suit the set of articles exported from the Meta-Phill repository.
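The clustering idea can be illustrated outside R. The sketch below is written in Python for illustration and is not the Shiny application's code: it computes Jaro-Winkler distances between short texts and merges clusters agglomeratively until the closest pair is farther apart than a cut height. Complete linkage, the prefix weight of 0.1, the cut height of 0.15, and the sample strings are all assumptions; the text does not state which linkage the application uses.

```python
from typing import List

def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings (1.0 means identical)."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    hit1, hit2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(s1)):
        if hit1[i]:
            while not hit2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3

def jaro_winkler_distance(s1: str, s2: str, p: float = 0.1) -> float:
    """1 - Jaro-Winkler similarity; rewards a shared prefix of up to 4 characters."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return 1.0 - (sim + prefix * p * (1.0 - sim))

def cluster(texts: List[str], cut_height: float) -> List[int]:
    """Complete-linkage agglomerative clustering, cut at a distance threshold."""
    n = len(texts)
    dist = {(i, j): jaro_winkler_distance(texts[i], texts[j])
            for i in range(n) for j in range(i + 1, n)}
    clusters = [[i] for i in range(n)]

    def linkage(a: List[int], b: List[int]) -> float:
        return max(dist[min(i, j), max(i, j)] for i in a for j in b)

    while len(clusters) > 1:
        d, x, y = min((linkage(clusters[x], clusters[y]), x, y)
                      for x in range(len(clusters))
                      for y in range(x + 1, len(clusters)))
        if d > cut_height:
            break  # remaining clusters are too dissimilar to merge
        clusters[x] += clusters[y]
        del clusters[y]
    labels = [0] * n
    for label, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = label
    return labels

# Hypothetical study-design strings, for illustration only.
designs = ["cross-sectional study", "cross sectional study",
           "randomized controlled trial", "randomised controlled trial"]
print(cluster(designs, cut_height=0.15))
```

A lower cut height yields more, tighter clusters; a higher one merges loosely related rows, which mirrors the adjustable threshold described above.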

Meta-Phill is freely available at metaphill.com. A pilot implementation of the engine was performed. The engine generated each metadata entry in 1.43 seconds on average. The text-davinci model from the OpenAI API was used with a temperature of 0.5. Each metadata entry uses 2250 API tokens on average. At a price of $0.002 per 1K tokens, each article analyzed by ChatGPT costs about $0.0045. The validity of responses was more than 90% for all data.
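The per-entry cost follows directly from the token count and the token price. The sketch below reproduces that calculation and, as a rough extrapolation that is not a result of the pilot, scales it to the roughly 35 million PubMed records mentioned in the background. The extrapolation assumes strictly sequential processing and constant pricing, and it ignores human validation time.

```python
# Per-entry cost, from the figures reported in the text.
TOKENS_PER_ENTRY = 2250        # average tokens used per metadata entry
USD_PER_1K_TOKENS = 0.002      # reported API price
SECONDS_PER_ENTRY = 1.43       # average generation time per entry

cost_per_entry = TOKENS_PER_ENTRY / 1000 * USD_PER_1K_TOKENS

# Rough extrapolation (assumption: one sequential pass over ~35 million records).
PUBMED_RECORDS = 35_000_000
total_cost = cost_per_entry * PUBMED_RECORDS
total_days = SECONDS_PER_ENTRY * PUBMED_RECORDS / 86_400  # seconds per day

print(f"Cost per entry: ${cost_per_entry:.4f}")
print(f"All of PubMed:  ${total_cost:,.0f}, about {total_days:.0f} days sequentially")
```

Running several workers in parallel would shorten the wall-clock time proportionally without changing the token cost.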

Authors should not use these text outputs directly in their own manuscripts. The outputs are generated by ChatGPT without any rephrasing prompt, so reusing them might raise plagiarism concerns, even if the manuscript's AI-related statement declares that Meta-Phill provided the text.

Table 1

Sample metadata provided by ChatGPT
Study Design	ICD/Mesh/Keyword	Study Population	Study Comparison	Study Exposure	Study Primary Outcome	Citation	Year
Cross-sectional study	COVID-19	Adolescents with major psychiatric disorders during the COVID-19 pandemic	Internet addiction and residual depressive symptoms among clinically stable adolescents with major psychiatric disorders during the COVID-19 pandemic	Internet addiction during COVID-19 Pandemic	The primary outcome of this study is to assess the inter-relationships between residual depressive symptoms (RDS) and Internet addiction (IA) using network analysis among clinically stable adolescents with major psychiatric disorders during the COVID-19 pandemic.	Cai H, Zhao YJ, He F, Li SY, Li ZL, Zhang WY, Zhang Y, Cheung T, Ng CH, Sha S, Xiang YT. Internet addiction and residual depressive symptoms among clinically stable adolescents with major psychiatric disorders during the COVID-19 pandemic: a network analysis perspective. Translational psychiatry. 2023;13(1):186.	2023

A total of 847 entries were added to the dataset in one working day under the supervision of two researchers. Figure 1 shows the number of articles registered for each study design.

The Meta-Phill search engine provides an interface through which the data repository can be explored using the following features.

Advanced search builder: An advanced search builder lets researchers search a specific PICO component with different keywords, as shown in Fig. 2.
Search panes: Search panes show the number of articles in each category. When an item is selected in one search pane, the contents of the other search panes update accordingly. A button clears the selections for a new search.
Table: The main table of the repository has 10 columns: “Links to study”, “Study design”, “ICD/Mesh/Keyword” (generated by a TensorFlow-based natural language processing (NLP) tool that tokenizes the ChatGPT-provided study population and classifies it by MeSH terms, ICD terms, or keywords), “Study Population”, “Study Comparison”, “Study Exposure”, “Study Primary Outcome”, “Citation”, “Year”, and “source” (human-generated or AI-generated entry).
Exporting data: Users can export CSV files from Meta-Phill for use as a basic spreadsheet when continuing a systematic review. CSV files are generated from the studies selected by clicking on rows. Multiple rows can be selected by holding the “Ctrl” key. Select-all and deselect-all options are also available.
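Once a CSV file has been exported, the same kind of per-component filtering offered by the search builder can be reproduced offline. The sketch below is a minimal illustration using Python's standard `csv` module. The column names follow Table 1, but the exact header of a real export is an assumption, and the second sample row is invented for the example.

```python
import csv
import io
from typing import Dict, List

def filter_rows(csv_text: str, criteria: Dict[str, str]) -> List[Dict[str, str]]:
    """Keep rows in which every named column contains its keyword (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if all(keyword.lower() in (row.get(column) or "").lower()
               for column, keyword in criteria.items())
    ]

# Small in-memory stand-in for an exported file; the second row is hypothetical.
sample_csv = """Study Design,Study Population,Study Exposure,Year
Cross-sectional study,Adolescents with major psychiatric disorders,Internet addiction,2023
Randomized controlled trial,Adults with hypertension,Exercise program,2021
"""

hits = filter_rows(sample_csv, {"Study Design": "cross-sectional",
                                "Study Population": "adolescents"})
for row in hits:
    print(row["Study Design"], "|", row["Study Population"], "|", row["Year"])
```

Substring matching is the simplest option; stricter matching (whole words, controlled vocabulary terms) may suit a formal systematic review protocol better.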

Meta-Phill R Shiny application for classifying studies:

The R Shiny application is available online at https://meta-phill.shinyapps.io/classify/. Studies labeled with the same cluster number are more closely related and potentially more suitable for inclusion in the same systematic review (Fig. 4).

To provide a detailed, dedicated search tool for collecting studies for systematic reviews based on PICO, we developed an automated engine based on natural language processing (NLP) that continuously extracts the PICO items from all available studies. The engine takes DOIs, PubMed articles, and other published articles, and accesses and collects their abstracts. The following prompts are then applied using the ChatGPT API: “Study Design: Given the following abstract, what is the study design? please don’t write the sentence. just give me a maximum of 5 words.” This continues for the study population, comparison, exposure/intervention, and outcome.

We have provided a repository of metadata for published articles. The metadata is based on the PICO questions that are central to evidence-based literature review. Because the existing literature is so large, we used the ChatGPT API to record the metadata of these articles. Boudin et al. analyzed the PICO elements of more than 1.5 million abstracts, introduced a location-based weighting strategy, and found that applying PICO increases the precision of clinical information retrieval (10). Our approach is motivated by this finding, although the project has not yet been launched. Further research could ask researchers to find studies for an SR with and without Meta-Phill and compare the precision of the two approaches. A similar design was evaluated in practice by Schardt et al. (11), who asked researchers to use PICO templates to conduct a PubMed search; participants using the PICO templates achieved higher precision. Kang et al. used the same idea to automatically extract PICO elements from the text of randomized trials using TensorFlow (12). The ChatGPT API, however, makes this feasible for any kind of article, in combination with TensorFlow and other natural language processing methods.

In addition to the studies mentioned, PubMed has also implemented a literature search strategy based on PICO terms. Its PICO search interface (https://pubmedhh.nlm.nih.gov/pico/index.php) allows users to search for articles using PICO terms, with the user interface structured as a search for Patient/Population AND Intervention. Our repository extends this concept by including all PICO elements in a filterable manner, which allows a more refined and targeted search for studies that align with specific research questions and systematic review objectives. We think it is time to use ChatGPT in a beneficial way to advance science.

Data Availability:

All code is freely available on GitHub (https://github.com/Metaphill).

Author Contributions

Naser Hatami and Farshid Javdani have designed the idea of this work, research and development, coding, installations, manuscript writing, and drafting.

Mohammad Zarenezhad and Mojtaba Ghaedi have contributed to hosting installations and data preprocessing.

Farshid Javdani has contributed to coding and research and development.

Pouyan Keshavarz, Navid Kalani, Vida Hafezi, Amin Shafiei, Fateme Bagheriand Safarabadi have evaluated the validity of AI-generated responses.

Alireza Sadeghinikoo and Ali Babou have run the AI-side code, which cannot be accessed in Iran due to sanctions against Iran.

Amir Feily and Minou Najar Nobari have contributed to manuscript writing and drafting.

Alireza Doroudchi has contributed to manuscript writing and drafting.

Seyed Abbas Hashemi has contributed to research and development, manuscript writing, and drafting.

Competing Interests

Naser Hatami and Farshid Javdani are on the Board of Directors of the Radan Modern Medicine-Equipment Co funding agency. Others have none to declare.

1. Sackett DL, Rosenberg WM, Gray JM, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ. 1996 Jan;13(7023):71–2.
2. Mellis C. Evidence-based medicine: What has happened in the past 50 years? J Paediatr Child Health. 2015 Jan;51(1):65–8.
3. Akobeng AK. Principles of evidence based medicine. Archives of Disease in Childhood. 2005 Aug 1;90(8):837–40.
4. Allen IE, Olkin I. Estimating time to conduct a meta-analysis from number of citations retrieved. JAMA. 1999 Aug 18;282(7):634–5.
5. Marshall IJ, Wallace BC. Toward systematic review automation: a practical guide to using machine learning tools in research synthesis. Syst Reviews. 2019 Dec;8:1–0.
6. Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Medicine. 2010 Sep 21;7(9):e1000326.
7. Santos CM, Pimenta CA, Nobre MR. The PICO strategy for the research question construction and evidence search. Rev Latinoam Enferm. 2007;15:508–11.
8. National Institutes of Health (NIH). PubMed Overview. Retrieved from https://pubmed.ncbi.nlm.nih.gov/about/ on 8 May 2023.
9. Elsevier. Elsevier at a glance. Retrieved from https://www.elsevier.com/about/this-iselsevier#:~:text=18m%2B,18%20million%20visitors%20a%20month on 8 May 2023.
10. Boudin F, Nie JY, Dawes M. Clinical information retrieval using document and PICO structure. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. 2010 Jun (pp. 822–830).
11. Schardt C, Adams MB, Owens T, Keitz S, Fontelo P. Utilization of the PICO framework to improve searching PubMed for clinical questions. BMC Med Inf Decis Mak. 2007 Dec;7:1–6.
12. Kang T, Zou S, Weng C. Pretraining to recognize PICO elements from randomized controlled trial literature. Studies in Health Technology and Informatics. 2019 Aug 8;264:188.
