Inclusion Criteria
To be included in the systematic review, articles must report on Bayesian methods and factor analysis in the context of primary care practice or research. For the purpose of this literature review, the definition of primary care follows that of the American Academy of Family Physicians: care that is “comprehensive”, “first contact” and “continuing”, and that covers “any undiagnosed sign, symptom, or health concern” (10). Family medicine is characterized as “an academic and scientific discipline” and “a clinical specialty” “orientated to primary care” (11). We consider primary care as it appears in family medicine, epidemiology, and health services and policy research. We will use a high-sensitivity search strategy to identify all records potentially relevant to family medicine and primary care (12, 13).
Types of studies
We will include quantitative and empirical research studies, methodological studies using Bayesian factor analysis, review articles, conference abstracts, and theses and dissertations. Research studies that apply Bayesian methods to similar model structures, such as structural equation models and latent variable models, or to item response theory, factor loadings and item-domain correlations, in a primary care context will be included. The full text of each included study must be available. There is no language restriction on the articles to be included.
Exclusion Criteria
We will exclude editorials, commentaries, book reviews, hypotheses, critical appraisals, reflections, surveys, case reports, studies using only frequentist statistical methods (such as hypothesis tests, confidence intervals, and p-values), and studies with no information on Bayesian methods. Studies that include some of the key words but use them with a different meaning are excluded. Examples of ineligible use of words include “primary studies”, “prior to”, “human epidermal growth factor” and “genetic factor”.
Studies using Bayesian methods in other types of analyses are excluded, for example Bayes rule, Bayes / Bayesian factor studies, variational Bayes, Bayesian Information Criterion/Criteria, Bayesian random effects models, Bayesian/Bayes networks, belief networks, Bayes(ian) models and probabilistic directed acyclic graphical models. Studies that use hierarchical Poisson models with latent variables, Gaussian processes or risk factor analysis instead of factor analysis are also excluded. Studies that are not in family medicine or primary care but use related words are excluded; examples include “a family of methods” and “exponential family”.
If the title or abstract gives no information about any of the three criteria, i.e. no indication of whether the study is Bayesian, uses factor analysis or is set in primary care, we include the study at the initial stage of screening. When in doubt, especially when the term “factor analysis” is mentioned without specifying whether it is Bayesian, the article is kept for the next round of full-text review.
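The title and abstract screening rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, the three-valued coding (True = clearly met, False = clearly not met, None = not stated or unclear) and the returned labels are hypothetical and are not part of the protocol or of Rayyan.

```python
from typing import Optional


def title_abstract_decision(is_bayesian: Optional[bool],
                            uses_factor_analysis: Optional[bool],
                            in_primary_care: Optional[bool]) -> str:
    """Hypothetical screening rule for the title/abstract stage.

    Each argument is True (criterion clearly met), False (criterion
    clearly not met) or None (no information, or the reviewer is unsure).
    """
    criteria = (is_bayesian, uses_factor_analysis, in_primary_care)
    # A study is dropped only when a criterion is clearly violated.
    if any(c is False for c in criteria):
        return "exclude"
    # Missing or unclear information never excludes a study at this stage.
    return "full_text_review"


# "Factor analysis" mentioned, but unclear whether it is Bayesian.
print(title_abstract_decision(None, True, True))
```

Under this coding, an abstract that says nothing about any of the three criteria is also carried forward to full-text review.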
Search methods for identification of studies
A comprehensive search strategy will be adopted to identify potential studies; it will be developed for PubMed and translated to other databases. The search strategy will include terms (and synonyms) for Bayesian, factor analysis and primary care. The search strategy will be developed with a specialized librarian and conducted by at least two reviewers independently. We will combine searches of electronic databases with hand searches of reference lists. The computer-based searches will combine medical subject headings (MeSH terms) and free text (or full text) searches related to Bayesian factor analysis in primary care and family medicine. We will systematically search for studies using Bayesian factor analysis that report on (i) the data characteristics and (ii) the use of Bayesian factor analysis, from inception to date.
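The three-concept structure of the search (synonyms joined with OR inside each concept, concepts joined with AND) can be sketched as follows. The synonym lists are illustrative placeholders, not the strategy developed with the librarian, and the helper names are hypothetical; only the `[Title/Abstract]` field tag is standard PubMed syntax.

```python
# Illustrative placeholder terms only; the actual strategy is developed
# with a specialized librarian and also uses MeSH headings.
CONCEPTS = {
    "bayesian": ["Bayesian", "Bayes"],
    "factor_analysis": ["factor analysis", "latent variable",
                        "structural equation model"],
    "primary_care": ["primary care", "family medicine", "general practice"],
}


def or_block(terms):
    """Join the synonyms of one concept with OR, searched in title/abstract."""
    return "(" + " OR ".join(f'"{t}"[Title/Abstract]' for t in terms) + ")"


def build_query(concepts):
    """Join the concept blocks with AND so every concept must be present."""
    return " AND ".join(or_block(terms) for terms in concepts.values())


query = build_query(CONCEPTS)
print(query)
```

Translating the strategy to another database would then mostly mean swapping the field tag and subject headings while keeping the same OR/AND structure.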
Electronic searches and time frame
We will identify articles from PubMed, Medline, Embase, Cochrane Library, CINAHL, and Scopus. We consider all relevant articles published before January 1st, 2020.
Searching other resources
The first 100 records in Google Scholar will be manually scanned for supplementary information. The reference lists and forward citations of the retrieved articles will be manually searched in two additional rounds. Fig. 1 shows the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram for the identification of studies (14).
Data collection and analysis
Selection of studies
Titles and abstracts of the studies identified by the search strategy will be screened sequentially by at least two independent reviewers, who will apply the inclusion and exclusion criteria using the software Rayyan (15). Studies that represent a ‘best fit’ will be included. A deadline will be set so that the review is completed within a limited timeframe. The full text of articles that meet the inclusion criteria will be retrieved and examined independently by the reviewers. Any disagreement between the reviewers about the eligibility of specific studies will be discussed, and an additional reviewer will be involved if necessary, until consensus is reached. For studies with multiple publication records, the most comprehensive or up-to-date record will be used.
Data extraction and management
Data extraction will be conducted independently by at least two reviewers. All records will be coded and categorized under the predefined themes (codebook) from the CIHR (Canadian Institutes of Health Research) grants and awards guide (16). Although separate guidelines and recommendations exist for the reporting of general Bayesian methods, confirmatory factor analysis and questionnaire development, no single comprehensive recommendation was found on the reporting of Bayesian confirmatory factor analysis (17-19). Where applicable, the following data will be extracted: the types of journal, the publication dates, the geographical locations, the sample sizes, the number of items/questions used for the Bayesian factor analysis, the number of factors (domains/constructs), the reported item-domain correlations, regression parameters and/or factor loadings (parameters of structural equation models), the use of prior information and assumed prior distributions, and the primary care settings. A standardized predesigned data collection form will be used for data extraction. We will follow the screening criteria below:
- Did the authors use a Bayesian confirmatory factor analysis, a Bayesian exploratory factor analysis, a Bayesian latent variable model or a Bayesian structural equation model?
- If they used one of those, what was the parameter of interest they were aiming to estimate using a Bayesian approach: item-to-domain correlation, factor loading or latent model regression parameter? In other words, for which parameter did they impose a prior distribution?
- How did they inform their prior distribution of the respective parameter? What was the prevalence of studies that employed non-informative priors?
- If they mention the term “factor loading”, did they explain it and, if so, how did they interpret it, i.e. as an item-to-domain correlation or as a model parameter (latent variable coefficient)?
- Did they report factor loadings (results) that fell outside the [-1, 1] interval?
- Were credible intervals (or confidence intervals) reported for factor loading / item-to-domain correlation / model parameter (regression coefficients)?
- What software / libraries were used, and are the software code or original data available? (reproducibility)
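One way to picture the standardized data collection form is as a record with one field per question in the list above. The sketch below is an assumption made for illustration: the class name, field names and example values are hypothetical and do not reproduce the actual form.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """Hypothetical extraction form; None means 'not reported'."""
    study_id: str
    model_type: Optional[str] = None             # e.g. "Bayesian CFA"
    parameter_of_interest: Optional[str] = None  # e.g. "factor loading"
    prior_type: Optional[str] = None             # e.g. "non-informative"
    explains_factor_loading: Optional[bool] = None
    factor_loadings: List[float] = field(default_factory=list)
    intervals_reported: Optional[bool] = None
    software: Optional[str] = None
    code_or_data_available: Optional[bool] = None

    def loadings_outside_unit_interval(self) -> List[float]:
        """Return the reported loadings that fall outside [-1, 1]."""
        return [x for x in self.factor_loadings if abs(x) > 1.0]


# Made-up values for illustration only.
rec = ExtractionRecord("study-001", model_type="Bayesian CFA",
                       factor_loadings=[0.4, 1.2, -1.05, -1.0])
print(rec.loadings_outside_unit_interval())
```

Leaving unreported items as None keeps "not reported" distinct from "reported as absent", which matters when the presence or absence of each item is later tallied across studies.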
Assessment of quality of implementation and reporting of Bayesian methods
Risk of bias in individual studies is not applicable to our study and will not be assessed, since the goal is to summarize the use of Bayesian methods, i.e. there is no single effect parameter that is of primary interest. The data collected and pooled across studies are information on the presence or absence of specific criteria in the design, conduct and reporting of Bayesian factor analysis. We will assess the quality of implementation and reporting of Bayesian methods for each eligible study, rated on a scale of very low, low, moderate and high, on the following aspects: reporting about methodology, Bayesian model, estimated parameters, factor loading, informed prior, and basic information about the publication. The assessment of quality will be presented in tables in the final publication of the systematic review. Given that no critical appraisal tool yet exists for the use of Bayesian factor analysis, we will develop a recommended procedure / tool for the use and reporting of Bayesian factor analysis.
Strategy for data descriptions / synthesis
We will provide a descriptive-analytical synthesis of the findings from the included studies, with graphs and tables detailing the use of Bayesian factor analysis. The synthesis will follow a common analytical framework covering authors, years of publication, estimates, the number of publications over time, geographical locations, the study populations, the aims of the study, data types, key information about the data (sample sizes, number of questions in a questionnaire, number of domains/factors), the type of Bayesian method used, the estimation procedures and software routines (e.g. analytical solutions vs. sampling-based solutions) and important results. We will compare, summarize and report the results using content analysis based on the research themes in the CIHR guidebook to examine the use of Bayesian factor analysis (16). We will chart, synthesize and interpret the data.
Statistical Synthesis
We anticipate conducting meta-analyses to summarize relative frequencies of Bayesian factor analysis approaches being used as well as general and specific reporting issues.
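As an illustration of how a relative frequency across included studies might be summarized, the sketch below computes a proportion with a Wilson score interval. The protocol does not specify the interval or pooling method, so this choice is an assumption, and the counts in the example are made up rather than results.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96):
    """Proportion k/n with a Wilson score interval (z = 1.96 for ~95%).

    k: number of studies with the characteristic (e.g. non-informative priors)
    n: number of included studies
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clip to [0, 1] to guard against floating-point overshoot at k = 0 or k = n.
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 12 of 40 studies using non-informative priors.
p, lower, upper = wilson_interval(12, 40)
print(f"{p:.2f} ({lower:.2f}, {upper:.2f})")
```

The Wilson interval is used here because it stays within [0, 1] even when a reporting practice is very rare or nearly universal among the included studies.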