The literature search will combine electronic database searches and hand searches to identify peer-reviewed articles that meet the inclusion criteria. We will search the PubMed/MEDLINE, Scopus, and Cochrane Library electronic databases, and will hand-search important journals through Google Scholar, to identify the diarrheagenic microbes detected by PCR-based methods in studies examining the quality of drinking water supplied to communities in low- and middle-income countries. The reference lists of retrieved articles will also be screened using a snowball technique. The search will use major and broad terms and will be limited to literature published in English. All studies conducted in low- and middle-income countries that applied any type of PCR method to improved water samples will be included in the review and analysis, without restriction on publication date. The initial search will combine the following keywords with Medical Subject Headings (MeSH) terms: (Polymerase chain reaction OR PCR) AND (Characterization OR Identification) AND (Improved water OR Drinking water) AND (Pathogenic microbes OR E. coli OR Bacteria OR Virus OR Protozoa) AND (Low- and middle-income countries OR Africa OR Asia OR Caribbean Countries). The full search strategy for the PubMed/MEDLINE and Cochrane databases is detailed in Table 1. The search results and the full study selection process will be reported in the final report and presented in a PRISMA flow diagram [26] (Fig. 1).
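As an illustration of how the initial search string is assembled, the following minimal Python sketch joins synonyms with OR within each concept group and joins the groups with AND. The function names (`quote`, `build_query`) are hypothetical, and the phrase quoting of multi-word terms is our assumption; it does not reproduce the database-specific syntax or MeSH tagging of the full strategy in Table 1.

```python
from __future__ import annotations

# Concept groups copied from the initial search terms listed above.
CONCEPT_GROUPS = [
    ["Polymerase chain reaction", "PCR"],
    ["Characterization", "Identification"],
    ["Improved water", "Drinking water"],
    ["Pathogenic microbes", "E. coli", "Bacteria", "Virus", "Protozoa"],
    ["Low- and middle-income countries", "Africa", "Asia", "Caribbean Countries"],
]


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes (assumed phrase-search syntax)."""
    return f'"{term}"' if " " in term else term


def build_query(groups: list[list[str]]) -> str:
    """Join synonyms with OR inside each group, then join the groups with AND."""
    blocks = ["(" + " OR ".join(quote(t) for t in group) + ")" for group in groups]
    return " AND ".join(blocks)


if __name__ == "__main__":
    print(build_query(CONCEPT_GROUPS))
```

Keeping the concept groups in one list makes it straightforward to record any later change to the search terms alongside the search date.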
Data collection and analysis
Data management and selection of studies
Data collection will be carried out using a purpose-built Microsoft Excel spreadsheet tool. Search results will be managed using the Mendeley Desktop reference management software, version 1.19.4 (Mendeley Ltd., Elsevier, Netherlands). Two reviewers (STG and NES) will screen the studies independently. Articles identified through database searches and hand searches will be evaluated for inclusion at three levels: by title, then by abstract, and finally by full text. The full text of selected studies will be retrieved and assessed in detail against the inclusion criteria. Discrepancies will be discussed between the reviewers and, where necessary, used to refine the inclusion criteria. At the full-text level, rejection of an article will be decided by the review team on the suggestion of the first reader.
Details of the final inclusion decision for each article will be recorded and archived in a database. Where it is uncertain whether an article should be included or excluded, the article will be carried forward to the next level of screening. Documents without abstracts will be screened at the full-text level. A list of the articles excluded at the full-text level, with the reasons for exclusion, will be provided in the systematic review.
Data extraction and management
Two reviewers (STG and NES) will independently extract data from the included studies using a prepared Excel data extraction spreadsheet. For each study, the following will be extracted: authors’ names, place and year of publication, study period, date of search, country in which the study was conducted, sample type, sample size, type of PCR technique used, and the study results (type of pathogen(s) detected, amount, detection capacity, and health impact). The data extraction form will be pretested and revised.
Risk of bias of included studies
Two authors (STG and NES) will independently evaluate the methodological quality of all studies that meet the selection criteria using the Joanna Briggs Institute (JBI) Critical Appraisal tools [28]. Each study will be assessed at both the outcome and the study level to generate an overall risk of bias score. The reviewers will judge each study as being at critical, serious, moderate, or low risk of bias based on the information gathered. The two authors' ratings for each bias criterion will then be compared, and disagreements on individual criteria will be discussed in an attempt to reach consensus. Disagreements that cannot be resolved through discussion will be referred to a third and/or fourth reviewer (SRG and AFD).
Dealing with missing data
The reviewers will contact the authors of primary studies for missing data and clarification if required; such inclusions will be reported in the review. A study will be excluded if its author(s) do not respond to the request for missing data or clarification.
Data synthesis
The reviewers will first conduct a narrative synthesis to describe the study details, participant and intervention characteristics, and outcomes of the included studies. Stata version 16 (StataCorp, College Station, TX, USA) will be used for the data synthesis. We will calculate 95% confidence intervals and p-values for the outcome.
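To make the calculation of the pooled estimate, its 95% confidence interval, and its p-value concrete, a minimal Python sketch is given below. It assumes that each study contributes an effect estimate and a standard error, that a fixed-effect inverse-variance model is used, and that the p-value tests the pooled estimate against zero; the protocol does not specify any of these choices, the function name `pool_fixed_effect` is hypothetical, and the input values are invented for illustration only. The actual analysis will be run in Stata.

```python
import math
from statistics import NormalDist


def pool_fixed_effect(estimates, std_errors, level=0.95):
    """Inverse-variance pooled estimate with a normal-theory CI and two-sided p-value."""
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total
    se_pooled = math.sqrt(1.0 / total)

    # Critical value of the standard normal distribution (about 1.96 for 95%).
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Two-sided p-value for the null hypothesis that the pooled estimate is 0.
    z = pooled / se_pooled
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

    return {
        "pooled": pooled,
        "se": se_pooled,
        "ci_low": pooled - z_crit * se_pooled,
        "ci_high": pooled + z_crit * se_pooled,
        "p_value": p_value,
    }


if __name__ == "__main__":
    # Invented example values, not data from any included study.
    print(pool_fixed_effect([0.2, 0.4], [0.1, 0.1]))
```

If the outcome is a proportion, such as the share of water samples positive for a pathogen, a transformation or a random-effects model may be more appropriate than this sketch.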
Investigation of sources of heterogeneity
Heterogeneity will be assessed statistically using the chi-square test (Cochran's Q statistic) for the presence of heterogeneity and the I² statistic for its magnitude. I² values will be classified as follows: no relevant heterogeneity (0–25%), moderate heterogeneity (26–50%), and significant heterogeneity (> 50%) [29]. Where two or more similar studies report similar outputs, forest plots will be generated to present the pooled estimates. Publication bias will be examined using funnel plots and Egger’s and Begg’s tests. Sensitivity analyses will be performed by repeating the analysis after excluding studies at high risk of bias and after excluding one study at a time, to observe the impact of each individual study on the pooled estimate.
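The heterogeneity statistics and the leave-one-out sensitivity analysis described above can be sketched as follows. This is a minimal Python illustration under the assumption of fixed-effect inverse-variance weights; the function names are hypothetical, the input values are invented, and the handling of I² values that fall between the stated bands (for example 25.5%) is our assumption. The Q statistic would be compared against a chi-square distribution with k − 1 degrees of freedom, which is not computed here.

```python
def cochran_q(estimates, std_errors):
    """Return Cochran's Q and the inverse-variance pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    return q, pooled


def i_squared(q, k):
    """I² in percent: share of total variation attributed to heterogeneity."""
    df = k - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def classify(i2):
    """Map I² (percent) to the bands used in this protocol.

    Values between bands are assigned to the lower band (assumption).
    """
    if i2 <= 25:
        return "no relevant heterogeneity"
    if i2 <= 50:
        return "moderate heterogeneity"
    return "significant heterogeneity"


def leave_one_out(estimates, std_errors):
    """Pooled estimate recomputed with each study removed in turn."""
    pooled_values = []
    for i in range(len(estimates)):
        ys = estimates[:i] + estimates[i + 1:]
        ses = std_errors[:i] + std_errors[i + 1:]
        pooled_values.append(cochran_q(ys, ses)[1])
    return pooled_values


if __name__ == "__main__":
    # Invented example values, not data from any included study.
    ys, ses = [0.1, 0.3, 0.5], [0.1, 0.1, 0.1]
    q, pooled = cochran_q(ys, ses)
    i2 = i_squared(q, len(ys))
    print(q, pooled, i2, classify(i2))
    print(leave_one_out(ys, ses))
```

A large shift in the pooled estimate when one study is removed flags that study as influential and worth examining against its risk of bias rating.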
Assessment of the quality of the evidence
The quality of evidence for the outcomes will be assessed with the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach using GRADEpro GDT version 3.6.1/2019 (McMaster University, ON, Canada) [30, 31]. Under GRADE, the quality of evidence is assessed for each outcome, taking the combined available evidence into account. The GRADE approach will classify the quality of the evidence into four levels (high, moderate, low, and very low) based on a comprehensive assessment of inconsistency, indirectness (limited generalizability), imprecision, and publication bias. These levels represent the confidence in the estimates of the treatment effects presented. The level of evidence and the strength of recommendation will be determined by discussion involving all authors.