This exploratory psychometric study was conducted in 2018 in Beiragh, a suburb of the Tabriz metropolitan area. The Ethics Review Board of Tarbiat Modares University approved the study.
Developing the Initial Questionnaire
The initial items of the BPQ were generated through a thorough literature review and interviews with stakeholders. Databases including MEDLINE, PubMed, EMBASE, ERIC, Cochrane Library and CINAHL were searched for published studies on brucellosis prevention or animal vaccination. A combination of the keywords "preven*", "brucell*" and "vaccinat*" was used to search in English and Persian. Approximately 110 related papers, published between 2008 and 2019, were found. By reviewing those papers and contacting the corresponding authors, nine Persian and three English questionnaires were obtained.
Interviews were conducted to identify the factors influencing animal breeders' prevention behavior. The conceptual framework for the interviews was drawn from the first four phases of the PRECEDE model. Purposive sampling was employed to recruit participants, and directed content analysis was used to analyze the content of the interviews.
Six animal breeders, four health educationists, four veterinarians and three experts from a vaccine-providing institute in the region, all of whom were volunteers, participated in 30- to 45-minute face-to-face interviews. The interviews took place at the participants' preferred time and place. Participants were told that their information would be kept confidential and used anonymously.
The items from the literature review were combined with the findings from the interviews. Identical and duplicate questions were removed, and some questions were edited. Finally, after three sessions of group discussion, the first draft of the research questionnaire was confirmed. The response anchors of the items were also discussed and finalized by the research team.
Assessment of face and content validity of the questionnaire
Face validity was examined both qualitatively and quantitatively. For the qualitative assessment, feedback from animal breeders and health educationists was used to identify and resolve any ambiguity in the meaning, wording and scaling of the items, as well as errors in grammar and in item allocation. For the quantitative assessment, the impact score (IS) of each item was calculated.
Fifteen animal breeders and seven health educationists participated in assessing the face validity of the BPQ. These animal breeders were different from those who later participated in assessing the construct validity of the BPQ and from those in the cross-sectional part of the study. In this phase, the BPQ was emailed to twenty veterinarians and health educationists to evaluate the impact score and content validity of the questionnaire (response rate = 85%). To evaluate the face validity of the items, the appropriateness of each item was rated by the experts on a five-point Likert scale. The impact score of each item was calculated using the formula:
Impact Score = Frequency (%) × Importance [13, 14].
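The impact score formula can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: that "frequency" is the share of raters who chose the two highest points of the five-point scale (4 or 5), expressed here as a proportion rather than a percentage, and that "importance" is the mean rating of the item. The function name and the example ratings are hypothetical.

```python
def impact_score(ratings, important_levels=(4, 5)):
    """Impact score = frequency x importance for one item.

    Assumptions (not stated in the source): frequency is the proportion of
    raters choosing a level in `important_levels`; importance is the mean
    rating on the five-point scale.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    # Frequency: share of raters who chose one of the "important" levels.
    frequency = sum(1 for r in ratings if r in important_levels) / len(ratings)
    # Importance: mean rating across all raters.
    importance = sum(ratings) / len(ratings)
    return frequency * importance


# Hypothetical ratings of a single item by ten raters.
example_ratings = [5, 4, 4, 3, 5, 2, 4, 5, 4, 3]
print(round(impact_score(example_ratings), 2))
```

In this hypothetical example, seven of ten raters chose 4 or 5 (frequency 0.7) and the mean rating is 3.9, so the product is about 2.73.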
Content validity of the questionnaire was also examined both qualitatively and quantitatively. For the quantitative assessment, the Content Validity Ratio (CVR) and the Content Validity Index (CVI) were calculated. To do so, the BPQ was submitted to twenty experts; the response rate was 95 percent. Two questionnaires were set aside because of concerns about the precision of their data. CVR and CVI were calculated based on three- and four-point scales, respectively. CVR was calculated as (Ne - N/2) / (N/2), where N is the total number of panelists and Ne is the number of panelists who rated the item as "essential". Based on the Lawshe table, items with a CVR below 0.46 were removed [15, 16].
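As an illustration of the CVR formula and the 0.46 cut-off, the following sketch computes the ratio for one item. The function names and the panel size of ten are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def content_validity_ratio(n_essential, n_panelists):
    """CVR = (Ne - N/2) / (N/2), following the formula given in the text."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


def retain_by_cvr(cvr, threshold=0.46):
    """Items with CVR below the threshold are removed; others are kept."""
    return cvr >= threshold


# Hypothetical panel of ten experts.
for n_essential in (8, 7):
    cvr = content_validity_ratio(n_essential, 10)
    print(n_essential, round(cvr, 2), retain_by_cvr(cvr))
```

With ten hypothetical panelists, an item rated "essential" by eight of them has a CVR of 0.6 and is kept, whereas one rated "essential" by seven has a CVR of 0.4 and falls below the cut-off.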
The CVI of each item was calculated using the formula:
CVI = (number of experts who rated the item 3 or 4) / N
The relevance of each item was rated on a 4-point Likert scale. Items with a CVI below 0.79 were removed [15, 17].
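The item-level CVI calculation can be sketched as follows. The function name and the example ratings are hypothetical; the rule that ratings of 3 or 4 count as agreement and the 0.79 cut-off are taken from the text.

```python
def item_cvi(relevance_ratings):
    """CVI = (number of experts rating the item 3 or 4) / N."""
    if not relevance_ratings:
        raise ValueError("relevance_ratings must not be empty")
    n_agree = sum(1 for r in relevance_ratings if r >= 3)
    return n_agree / len(relevance_ratings)


def retain_by_cvi(cvi, threshold=0.79):
    """Items with CVI below the threshold are removed; others are kept."""
    return cvi >= threshold


# Hypothetical relevance ratings (1-4) from ten experts for two items.
item_a = [4, 4, 3, 2, 4, 3, 4, 4, 3, 4]
item_b = [4, 3, 2, 2, 4, 3, 1, 4, 3, 4]
for ratings in (item_a, item_b):
    cvi = item_cvi(ratings)
    print(round(cvi, 2), retain_by_cvi(cvi))
```

Here the first hypothetical item reaches a CVI of 0.9 and is retained, while the second reaches 0.7 and would be removed.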
Assessment of construct validity of the questionnaire
Construct validity of the questionnaire was evaluated by both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Four hundred and fifty randomly selected animal breeders participated in this phase.
Exploratory factor analysis (EFA)
EFA was done on 42 binary items and 17 Likert-scale items, which were intended to explain animal breeders' prevention behavior.
Principal component analysis with oblimin rotation was used to determine the optimal number of factors. Loadings with significance lower than 0.05 were excluded from the analysis. If an item loaded on more than one factor, it was assigned to the factor on which it had the largest loading. The extracted factors were named by the team members.
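The rule for cross-loading items can be illustrated with a short sketch. It assumes that "largest" refers to the largest absolute loading, which the text does not state explicitly; the function name and the loading values are hypothetical and do not come from the study.

```python
def assign_items_to_factors(loadings):
    """Assign each item to the factor with its largest absolute loading.

    `loadings` maps an item name to a list of loadings, one per factor.
    Using absolute values is an assumption made for this sketch.
    """
    assignment = {}
    for item, row in loadings.items():
        best_factor = max(range(len(row)), key=lambda j: abs(row[j]))
        assignment[item] = best_factor
    return assignment


# Hypothetical rotated loadings for three items on two factors.
example_loadings = {
    "item_1": [0.62, 0.31],
    "item_2": [0.18, -0.71],
    "item_3": [0.45, 0.40],
}
print(assign_items_to_factors(example_loadings))
```

In this example, item_3 loads on both factors and is assigned to the first one (index 0), where its loading is larger.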
Confirmatory factor analysis (CFA)
For the items with binary response anchors, generalized confirmatory factor analysis with the WLSMV (weighted least squares mean- and variance-adjusted) estimation method was used. Mplus version 7.4 was used to test and evaluate the intended conceptual model.
The fit of the proposed model was assessed using fit indices including the ratio of chi-square to degrees of freedom (χ²/df), the Root Mean Square Error of Approximation (RMSEA), the Comparative Fit Index (CFI), and the Tucker-Lewis Index (TLI). Values of at least 0.90 for CFI and TLI and below 0.08 for RMSEA indicated good fit [21, 22]. After excluding non-significant items, the final conceptual model was introduced.
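The fit criteria can be expressed as a simple check. This is a sketch only: the function name and the input values are hypothetical, and because the text gives no cut-off for the chi-square ratio, the ratio is reported without being judged.

```python
def assess_fit(chi_square, df, cfi, tli, rmsea):
    """Compare fit indices with the cut-offs stated in the text."""
    if df <= 0:
        raise ValueError("df must be positive")
    return {
        "chi2_df": chi_square / df,  # reported only; no cut-off in the text
        "cfi_ok": cfi >= 0.90,
        "tli_ok": tli >= 0.90,
        "rmsea_ok": rmsea < 0.08,
    }


# Hypothetical fit statistics, not results from the study.
print(assess_fit(chi_square=180.0, df=100, cfi=0.95, tli=0.93, rmsea=0.05))
```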
Assessment of reliability of the questionnaire
Cronbach's alpha coefficient and composite reliability (CR) were calculated to test the internal consistency of the questionnaire. Stability of the results was examined by calculating the Intraclass Correlation Coefficient (ICC) [23, 24]. To do so, forty-two volunteer animal breeders were asked to complete the research questionnaire twice, two weeks apart. To analyze the absolute reliability of the results, the standard error of measurement (SEM) was calculated.
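The reliability indices named above can be sketched with their commonly used textbook formulas. The text does not state which formulas were applied, so the expressions below are assumptions: Cronbach's alpha from item and total-score variances, CR from standardized factor loadings, and SEM as the standard deviation multiplied by the square root of (1 - ICC). All function names and data are hypothetical.

```python
from math import sqrt
from statistics import stdev, variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` is a list of respondents' item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    items = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(scores) for scores in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def composite_reliability(std_loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    loading_sum_sq = sum(std_loadings) ** 2
    error_sum = sum(1 - loading ** 2 for loading in std_loadings)
    return loading_sum_sq / (loading_sum_sq + error_sum)


def standard_error_of_measurement(scores, icc):
    """SEM = SD * sqrt(1 - ICC), an assumed common formulation."""
    return stdev(scores) * sqrt(1 - icc)


# Hypothetical data: four respondents answering three items.
responses = [[3, 4, 3], [2, 2, 3], [4, 5, 4], [1, 2, 2]]
print(round(cronbach_alpha(responses), 3))
print(round(composite_reliability([0.7, 0.7, 0.7]), 3))
print(round(standard_error_of_measurement([10, 12, 14, 16, 18], icc=0.84), 3))
```

The SEM is expressed in the same units as the questionnaire score, which is why it is often reported alongside the ICC as a measure of absolute reliability.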
IBM SPSS Statistics 24 was used for data cleaning and for calculating the reliability indices. P values lower than 0.05 were considered significant.