A systematic review protocol of medical and clinical research landscapes and quality in Malaysia and Indonesia [REALQUAMI]

Research landscapes and quality may change in many ways, and research waste has been increasingly reported. Poorly conducted clinical and biomedical research produces misleading evidence that is detrimental to the health of the people and to healthcare performance. Efforts to improve research performance will need good data on the research landscapes and quality. This systematic review aims to describe the characteristics and examine the quality of clinical and biomedical research in Malaysia and Indonesia.


Background
The number of clinical and biomedical studies conducted and published worldwide is increasing, especially those originating from Asia [1]. This growth in quantity has not been matched by a growth in quality. Instead, substantial research waste has been reported because of irrelevance [2], poor research designs [3], inaccessible research data [4] and incomplete reporting [5,6]. Moreover, "It was very easy to make errors", as admitted by John Ioannidis, a co-director of the new Meta-Research Innovation Center at Stanford (METRICS), when describing the challenges along the research process despite the noble intentions of researchers [7]. However, the actual clinical and biomedical research landscapes that have evolved over the past decades in Asia remain uncertain, apart from reports from a few sources that focus mainly on quantity [1]. Similarly, the quality of published research in countries such as Malaysia and Indonesia over the past few decades has not been examined. Such comprehensive assessment and evidence are needed to inform researchers, research institutes and funders in these countries about whether current efforts are adequate or whether existing ways of conducting research need to be improved.
There are about 200 tools available for evaluating research quality or bias in randomized and non-randomized studies [8][9][10]. Nevertheless, most tools for assessing non-randomized studies are themselves of poor methodological quality, which makes consistent assessment of methodological quality and risk of bias across primary studies difficult or impossible [11]. Different tools exist for different study designs, such as the Cochrane Risk of Bias tool for randomized trials [12], the QUADAS-2 tool [13] for diagnostic test accuracy studies, the AMSTAR [14] and ROBIS [15] tools for systematic reviews, and the ROBINS-I tool [16] for non-randomized studies of the effects of interventions. Additionally, there are a few web-based tools and checklists for different study designs, such as the NIH Study Quality Assessment Tools for controlled intervention studies, systematic reviews and meta-analyses, observational cohort and cross-sectional studies, case-control studies, pre-post studies and case series studies (https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools); the Critical Appraisal Skills Programme (CASP) checklists by the Oxford-based Better Value Healthcare Ltd (https://casp-uk.net/casptools-checklists/); and Critical Appraisal Tools (FLC 2.0), a web application developed by OSTEBA, Spain, to guide the critical appraisal process (http://www.lecturacritica.com/es/acerca.php).
Among the more widely used and recommended tools are the Newcastle-Ottawa Scale (NOS) [17], the Downs and Black instrument [18] and the more recent RTI item bank (RTI-IB) [19]. The NOS, which has been used to illustrate issues in data extraction from primary non-randomized studies, has only eight items and is simpler to apply [17]. However, the items may still need to be customized to the review question of interest. The Downs and Black instrument [18] has been modified for use in a methodological systematic review [9]. The reviewers found that some of the 29 items were difficult to apply to case-control studies, that the instrument required considerable epidemiological expertise and that it was time-consuming to use. There are reports that these tools are difficult to apply [20][21][22][23], and agreement between review authors is modest. For the RTI-IB, the median observed inter-rater agreement was 75% (25th percentile [p25] = 61%; p75 = 89%) and the median first-order agreement coefficient statistic was 0.64 (p25 = 0.51; p75 = 0.86). The RTI-IB facilitates a more complete quality assessment than the NOS but is more burdensome. Additionally, epidemiological terminology carries different meanings in different countries; for example, the term 'selection bias' describes what others may call 'applicability' or 'generalizability'. Thus, comprehensive manuals are required to accompany these tools and to offer instructions for standardized interpretation by different users. However, such manuals may themselves pose a great challenge to users, and not many tools have one. Therefore, no tool has been found adequate as an all-round tool for all types of study designs [10], nor is any tool recommended as a relatively quick screening tool for assessing the quality of published research.
Accordingly, we assimilated the quality indicators used in the existing tools, drawing on the series of users' guides to the medical literature by the Evidence-Based Medicine Working Group [24,25], systematic reviews [26,27] and principles of clinical epidemiology [28], and developed a tool for this review project (see below).

Aims of the project
This project aims to systematically identify published research articles by researchers in each participating country. For example, we aim to identify articles published by Malaysian researchers on research conducted in Malaysia. We will subsequently assess the characteristics and quality of the research published in journals, as described below.

Methods/design
This systematic review will consist of two phases. In Phase 1, we will descriptively report the demographics and characteristics of research performed in each country to date (research landscapes).
In Phase 2, we will assess the quality of the research based on the published reports in journals (research quality) (Figure 1).

Inclusion criteria and search strategy
All clinical and biomedical research conducted in Malaysia or Indonesia from January 1962 (Malaysia after Singapore independence) to December 2019 will be identified from the following databases: PubMed, EMBASE, CINAHL and PsycINFO. We will include all published peer-reviewed papers of health and biomedical research done in each country (Malaysia or Indonesia) or by a citizen of each country (Malaysian or Indonesian) with an affiliation to an institution in that country. We will also search for additional literature in the MyMedR database (http://mymedr.afpm.org.my/), as it specifically compiles published papers in health and biomedical research conducted in Malaysia or by authors who have a Malaysian affiliation. MyMedR also draws from MyJurnal, an online system used by the Malaysia Citation Centre (MCC), Ministry of Higher Education Malaysia, to collect and index all Malaysian journals. Search results will be compiled in EndNote reference management software, where duplicates will be removed. If necessary, authors and institutions will be contacted. A medical librarian and a science officer at the Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, will assist in these tasks. The review work will be completed by two separate teams, one based in Malaysia and the other in Indonesia.

Study selection and data extraction
All reviewers will independently screen the identified articles by title and abstract. The full text of each eligible article will be retrieved, and data will be independently extracted using a standard data extraction template. This template has been pilot-tested for clarity by all the reviewers on 10 articles and modified accordingly. Any discrepancy will be resolved by consensus between three or more reviewers. To ensure data quality, a reviewer (BHC) will reassess 10-20% of the articles. The final piloted template is available as Additional file 1.
In the event of duplicate publications or multiple reports of a research study, we will use the most complete data set aggregated across all known publications. Duplicate publications are defined as two or more published articles that report on the same research question.

Research landscapes
Phase 1 of the project will describe the characteristics of the reported research, such as the team members and the journal that published the article. The research characteristics of interest are listed in Additional file 1. In Phase 2 of the study, research quality will be assessed against criteria in three domains: relevance, credibility and usefulness (Table 1). All reviewers will learn the principles of clinical epidemiology through a workshop and reach a consensual understanding of the terms used to represent research quality in this project. During the workshop, we will run a training session in which all reviewers read and score the same articles. This will be followed by a discussion of any similarities or differences in the quality assessments and scores, which will help to ensure a uniform understanding of the quality domains when they are applied to the papers. We will also determine inter-rater agreement using Cohen's kappa (κ) and the intra-class correlation coefficient (ICC). κ is a measure of agreement between different observers beyond chance agreement [29]. The κ statistic will be computed separately for each item within a domain (scored 0 or 1). The ICC will be used to assess the domain subtotals (out of 3, 4 and 3) and the grand total score of the tool (Table 1).
The κ result will be interpreted as follows: values ≤ 0 as indicating no agreement, 0.01-0.20 as none to slight, 0.21-0.40 as fair, 0.41-0.60 as moderate, 0.61-0.80 as substantial, and 0.81-1.00 as almost perfect agreement [30,31]. For the ICC, values < 0.40 are poor, 0.40-0.59 fair, 0.60-0.74 good, and 0.75-1.00 excellent [32,33]. We specify a priori that κ > 0.60 and ICC > 0.75 must be achieved before the second phase of this study begins. Retraining and reassessment of the reviewers on different articles will be conducted until the inter-rater agreement reaches the desired levels. The expected lower bound of the 95% confidence limit for κ is no less than 0.60, assuming the same marginal prevalence of a zero score of 30% for both raters. Using alpha and beta error rates of 0.05 and 0.2, respectively, each pair of reviewers will rate 20 papers [32,33], giving five pairs of reviewers and 100 samples for the subtotal and total ICC estimation [31].
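The agreement check described above can be illustrated with a minimal Python sketch. It computes Cohen's κ for two reviewers scoring one item (0 or 1) and maps the result onto the interpretation bands listed above. The function names and the example scores are hypothetical and are not part of the protocol; band boundaries are treated as inclusive upper limits, which is an assumption for values that fall between the printed ranges.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same articles on one item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of articles")
    n = len(rater_a)
    # Observed proportion of articles on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Map kappa onto the bands used in this protocol (upper limits inclusive)."""
    if kappa <= 0:
        return "no agreement"
    bands = [(0.20, "none to slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical scores from one pair of reviewers on 20 papers (illustration only).
reviewer_1 = [1] * 14 + [0] * 6
reviewer_2 = [1] * 13 + [0] + [1] + [0] * 5

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2), interpret_kappa(kappa), kappa > 0.60)
```

In this made-up example the reviewers agree on 18 of 20 papers, chance agreement is 0.58, and κ is about 0.76, which would fall in the substantial band and exceed the a priori threshold of 0.60.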

Relevance
The relevance of a research study will be assessed from three perspectives: scientific relevance, the composition of the research team and societal relevance. A study is scientifically relevant if it addresses a true and real scientific problem and provides the knowledge needed to understand an existing phenomenon. Scientific relevance also denotes that the research rests on a justified scientific foundation and is informed by existing evidence. Thus, scientifically relevant research is usually globally relevant because its topic and subjects are highly generalizable.
Societal relevance refers to research that addresses a true and real problem in society. This relevance may exist at a narrower or wider level: the research may be relevant to the whole human population, or to a particular condition or disease in a unique population. These two domains of scientific and societal relevance relate to the novelty of the research.
The last domain in the relevance category concerns whether the research team comprises investigators and experts with relevant professional qualifications. This may include patients and members of the public in research areas where the opinions of end-users are considered important, such as interventions or the experience of patients or family members.

Credibility
This category is assessed only after the research has been judged relevant. Four essential features are considered the minimum for research to be credible and for its results to inform or contribute to practice change: data collection design, precision, important sample (external validity) and internal validity.
The data collection design is to be appropriate to the objective or research question. The approach to data collection depends on whether the research is causal or non-causal, and on whether an experimental or non-experimental design would provide better data. The variables involved should be collected in their intended phases or stages, such as a risk factor in the asymptomatic phase, or symptoms or biomarkers in the latent period.
Sampling and samples form the next important credibility domain. The participants sampled are to be the right group of the population for the research. They represent the important population to which the results could later be generalised. However, in causal or experimental research, comparability between groups takes precedence over representativeness, because balance of confounding or prognostic factors between groups is what allows the outcomes to be validly attributed to the exposure.
Quantitative research is essentially about measurement, measuring tools and process. The variables are to be measured with validated tools, through a standardised process and, if necessary, by trained and blinded assessors. Any doubt about the methods of measurement will undermine internal validity.
Credible research provides an appropriate and rational sample size estimation, based on the research question, its primary objective and similar earlier research. An adequate sample size is required for sufficient precision. Achievement of the desired sample size should be reported; failure to achieve it should be justified and discussed.

Usefulness
Research that is credible deserves close attention to its results. The usefulness of research results consists of addressing important outcomes, providing meaningful estimates and drawing a fair conclusion that is supported by the research design.
Important outcomes are those of high priority and concern to the end-users. These generally refer to hard outcomes, or their strong correlates or intermediate markers. Lastly, the conclusion of the research serves as a second testimony alongside the readers' own judgement of the research. As the final interpretation and remarks by the authors and investigators, the conclusion should place the results as evidence in the right context and applicability, taking into consideration the constraints of the research design and the limitations encountered along the whole research process.
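The scoring structure implied by the three domains can be sketched in Python as follows. This is only an illustration under stated assumptions: Table 1 is not reproduced here, so the item labels below are hypothetical names derived from the descriptions in the text, each item is assumed to be scored 0 or 1, and the domain subtotals (out of 3, 4 and 3) are assumed to be simple sums.

```python
# Hypothetical item labels derived from the text; the actual wording is in Table 1.
DOMAINS = {
    "relevance": ["scientific_relevance", "research_team", "societal_relevance"],
    "credibility": ["data_collection_design", "precision",
                    "important_sample", "internal_validity"],
    "usefulness": ["important_outcomes", "meaningful_estimates", "fair_conclusion"],
}


def score_article(item_scores):
    """Return domain subtotals and the grand total for one article.

    item_scores maps each item label to 0 (criterion not met) or 1 (met).
    """
    result = {}
    for domain, items in DOMAINS.items():
        for item in items:
            if item_scores.get(item) not in (0, 1):
                raise ValueError(f"item '{item}' must be scored 0 or 1")
        result[domain] = sum(item_scores[item] for item in items)
    # Grand total over all 10 items.
    result["total"] = sum(result[domain] for domain in DOMAINS)
    return result


# Made-up scores for a single article (illustration only).
example_scores = {
    "scientific_relevance": 1, "research_team": 1, "societal_relevance": 1,
    "data_collection_design": 1, "precision": 0,
    "important_sample": 1, "internal_validity": 1,
    "important_outcomes": 1, "meaningful_estimates": 0, "fair_conclusion": 1,
}
print(score_article(example_scores))
```

Keeping the item-to-domain mapping in one dictionary means the per-item scores used for κ and the subtotals and total used for the ICC come from the same record.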

Data analysis
The principal investigator has the overall responsibility for compilation, maintenance and management of the review database. The database is stored on a password-protected computer.
Every eligible and included journal article will be assessed in two main areas: the research characteristics and the quality of the research as reported in the article. Data will be checked for missing values and errors. The data will be reported descriptively, with frequencies and percentages for categorical data, means and standard deviations for normally distributed continuous data, and medians and interquartile ranges for continuous data that are not normally distributed. Time series plots will be used to investigate the trends and patterns of the research characteristics, health conditions studied and quality of research over the years. Geographic information system (GIS) maps may also be plotted to evaluate the locations and areas where research was conducted. Longitudinal trends of certain research characteristics, health conditions or areas in different settings and across different clinical or biomedical disciplines will be explored.
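The descriptive reporting rules above can be sketched with the Python standard library. This is a minimal illustration, not the analysis code of the project (the protocol names other software for estimation). The function names and example data are hypothetical, and whether a variable is normally distributed is assumed to be decided beforehand and passed in as a flag.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev


def summarise_categorical(values):
    """Frequency and percentage of each category."""
    n = len(values)
    return {category: (count, round(100 * count / n, 1))
            for category, count in Counter(values).items()}


def summarise_continuous(values, normally_distributed):
    """Mean and SD if normally distributed, otherwise median and IQR."""
    if normally_distributed:
        return {"mean": mean(values), "sd": stdev(values)}
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4)
    return {"median": median(values), "iqr": (q1, q3)}


# Made-up extraction data (illustration only).
study_designs = ["cohort", "cohort", "trial", "cross-sectional"]
team_sizes = [1, 2, 3, 4, 5]

print(summarise_categorical(study_designs))
print(summarise_continuous(team_sizes, normally_distributed=True))
print(summarise_continuous(team_sizes, normally_distributed=False))
```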
Associations between the characteristics of the research and its quality will be explored, and the independent effect of each determinant will be quantified in multiple linear regression analysis. Additionally, research quality will be explored as a categorical outcome in tertiles. The highest tertile will be compared with the lowest tertile, and the determinants will be assessed in multiple logistic regression.
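The tertile comparison can be illustrated with a short Python sketch that prepares the binary outcome for such a logistic regression. It is a sketch under assumptions: the cut points are taken from `statistics.quantiles` with its default method, scores equal to a cut point are assigned to the lower group, and articles in the middle tertile are assumed to be excluded from the highest-versus-lowest comparison. The function names and scores are hypothetical.

```python
from statistics import quantiles


def tertile_labels(scores):
    """Label each quality score as lowest, middle or highest tertile."""
    lower_cut, upper_cut = quantiles(scores, n=3)
    labels = []
    for score in scores:
        if score <= lower_cut:
            labels.append("lowest")
        elif score > upper_cut:
            labels.append("highest")
        else:
            labels.append("middle")
    return labels


def high_vs_low_outcome(scores):
    """Pairs of (score, outcome): 1 for highest tertile, 0 for lowest.

    Articles in the middle tertile are dropped from this comparison.
    """
    return [(score, 1 if label == "highest" else 0)
            for score, label in zip(scores, tertile_labels(scores))
            if label != "middle"]


# Made-up total quality scores out of 10 (illustration only).
quality_scores = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(tertile_labels(quality_scores))
print(high_vs_low_outcome(quality_scores))
```

Because the total score takes only integer values from 0 to 10, ties at the cut points are likely in practice, so the resulting groups may not be of exactly equal size.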
Longitudinal trends of the research quality will be explored. A 95% confidence interval and a two-sided α of 0.05 will be used to test significance. Model checking will be conducted to obtain the best and most parsimonious final model that meets the statistical assumptions. Estimates will be obtained with PASW 25.0 (SPSS, Chicago, IL) and MLwiN version 3.02 (Centre for Multilevel Modelling, University of Bristol).

Discussion
The results will inform all stakeholders of clinical and biomedical research in each country about the evolution of research conduct and performance from the past until now. Profiles of the research over the past decades may be studied in relation to socioeconomic, political or policy changes in particular years. The longitudinal and prospective trends of the research profiles and research quality, and the associations between them, could suggest improvement initiatives or reveal an institutional role model that has been 'successful' to some extent. Additionally, knowing which health conditions or areas in different settings are over- or under-studied may help future prioritization of research initiatives and resources. Descriptive comparison between countries may also be possible if similar studies are done in other countries. This would provide meaningful benchmarking and insights into the effects of evolving historical events on clinical and biomedical research activities and quality in each country.
The research quality tool of this study may be a useful screening tool for all quantitative study designs, but not for qualitative studies, case reports or systematic reviews. We hope it will be useful for a quick critical appraisal of research quality. The sequence of Relevance-Credibility-Usefulness enables efficiency and empowers the tool users in the critical appraisal process. The main limitation of this review would be the reporting quality of the research, including non-reporting or non-publication of completed studies [34]. In addition, a relatively large number of graduate and postgraduate students' research projects that were published as theses and not in journals [35] will not be retrievable through the search strategies used in this review project. Reporting quality is not assessed with the research quality tool created for this project because specific guides and checklists already exist for this purpose. The quality and comprehensiveness of research reporting may be less poor than the methodological quality of the research, but may still affect its assessment [36]. The 10 items within the three domains of the research quality screening tool are believed to be the fundamental minimums of most clinical and biomedical research and should be available in most published articles. Contacting the corresponding authors by email or telephone may help to recover missing information in the included articles.

Consent for publication
Not applicable.

Availability of data and materials
Collected data will be made available upon request to the corresponding author, with no time limit. Only de-identified participant data will be shared. All requests must include a clear study protocol addressed to the principal investigator.

Competing interests
The authors declare that they have no competing interests.

Funding
An application for funding has been submitted for this study. The funder will not have any role in any part of the review process, including data interpretation, reporting and publication.