The study used secondary data from the 2015 Zimbabwe Demographic and Health Survey (ZDHS). The ZDHS contains important national health and demographic variables. Administratively, Zimbabwe is divided into 10 provinces. Each province is made up of districts, and each district is made up of wards. The 2012 Zimbabwe population census sub-divided each ward into Enumeration Areas (EAs) for convenience 27. The ZDHS is a nationally representative cross-sectional survey that applied a stratified two-stage cluster sampling design 28, drawing on the 2012 population and housing census 27. In the first stage, EAs (villages/clusters) were randomly selected; in the second stage, households were selected within them. Women aged 15–49 who were either permanent residents or visitors who slept in the selected household the night before the survey were eligible 28. A comprehensive explanation of the sampling approach is published in the ZDHS report 28. We used the 2015 ZDHS individual members’ recode file and included all women (9955) from the 10 provinces who were relevant to our study. The 2015 ZDHS selected the 9955 women from all the provinces: Manicaland (1019 women), Mashonaland Central (993 women), Mashonaland East (910 women), Mashonaland West (1054 women), Matabeleland North (849 women), Matabeleland South (829 women), Midlands (1062 women), Masvingo (1046 women), Harare (1235 women), and Bulawayo (658 women) 28.
Outcome variable
The outcome variable “ever been screened for cervical cancer” was measured using the question: “Have you ever been screened for cervical cancer?” (No/Yes). This question was asked of all women aged 15–49 years. From the responses, we constructed the outcome as a binary variable taking the value 1 if yes and 0 if no.
Explanatory Variables
Individual and community characteristics were examined for possible association with cervical cancer screening in Zimbabwe. The list of variables depended on what was available in the dataset and was guided by existing studies 20,29−32. Individual-level variables considered for the study include: the woman’s age, religion, employment status, health insurance coverage, region, contraceptive use, total children ever born, type of marriage, household size, and place of delivery. Age, religion, working status, health insurance coverage, ever use of modern family planning methods, and total children ever born were dichotomized. Additionally, age at first sex was divided into four categories, while education level, amount of media exposure, type of marriage, household size, and place of delivery were trichotomized. Wealth index and region were categorized into five and six groups, respectively. The categorization of these variables was guided by existing literature on the subject 20,31,32. The 2015 ZDHS dataset captured 10 regions, which we collapsed into six by grouping regions with related names to obtain fewer categories. Wealth index was a pre-computed composite score based on household assets such as televisions, bicycles, materials used for house construction, type of water access, sanitation facilities, and other characteristics related to wealth. Factor scores of household assets were generated through a principal component analysis and were then standardized and categorized into quintiles (poorest, poorer, middle, richer, and richest) 28.
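The final step of the wealth index construction (standardizing the factor scores and cutting them into quintiles) can be sketched as follows. This is a minimal illustration only: the function name, the sample scores, and the unweighted cut points are assumptions made for the example, and the principal component step that produces the scores is taken as given rather than reproduced.

```python
from statistics import mean, pstdev, quantiles

LABELS = ["poorest", "poorer", "middle", "richer", "richest"]

def wealth_quintiles(scores):
    """Standardize asset factor scores and assign each one a quintile label."""
    mu, sd = mean(scores), pstdev(scores)
    z = [(s - mu) / sd for s in scores]
    # Four cut points split the standardized scores into five groups.
    cuts = quantiles(z, n=5, method="inclusive")
    labels = []
    for value in z:
        rank = sum(value > c for c in cuts)  # number of cut points below this score
        labels.append(LABELS[rank])
    return labels

# Hypothetical first-component scores for ten households (not ZDHS data).
scores = [-2.1, -1.4, -0.9, -0.3, 0.0, 0.2, 0.8, 1.1, 1.9, 2.7]
groups = wealth_quintiles(scores)
```

With ten distinct scores, each label is assigned to two households. Survey-weighted cut points, as a DHS-style index would normally use, are deliberately left out of this sketch.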
Community-level Determinants
Community-level factors were conceptualized as a set of variables capturing community disadvantage, i.e. factors that may make it difficult for people living in certain areas to achieve positive life outcomes. The nested nature of the 2015 ZDHS dataset enabled the use of multi-level logistic regression to separate the contribution of individual-level factors from that of community-level factors to cervical cancer screening. The Socio-Economic Indexes for Areas (SEIFA) approach was used to identify community disadvantage or advantage 33. With the exception of type of residence, the 2015 ZDHS did not collect data on community measures. Therefore, the individual responses of women were aggregated within their respective communities to obtain measures of community disadvantage. We defined community variables in relation to the 400 communities considered by the 2015 ZDHS. Specific community measures include: decision-making autonomy, attitudes towards wife-beating, type of residence, perceived distance to a health facility, and community-level socioeconomic status. These variables have been derived and used as measures of community disadvantage by studies elsewhere 34,35.
We measured two aspects of women’s autonomy/empowerment: decision-making power in the household and attitudes towards wife-beating. Decision-making power in the household was measured using the answers to three questions about who decides matters pertaining to (a) the woman’s health (personal decision-making authority), (b) visits to friends or family (mobility decision-making authority), and (c) food to be cooked each day. First, we generated an individual-level indicator that classified women who made all three decisions, either alone or jointly with their spouses, as having high decision-making autonomy, and all other women as having low decision-making autonomy. Second, we aggregated the individual indicators at community level to derive the proportion of women with high decision-making autonomy for every community/cluster.
Attitudes towards wife-beating or domestic violence were assessed by asking women whether they believed that a man had a right to beat his wife in five hypothetical scenarios: (1) she goes out without telling him, (2) she neglects the children, (3) she argues with him, (4) she refuses to have sexual intercourse with him, and (5) she does not cook food properly. As with decision-making autonomy, we first classified individual responses: women who responded positively to at least one of the five scenarios were classified as having a favorable attitude towards domestic violence, and all other women as having an unfavorable attitude. Then, we aggregated values at community level to derive the proportion of women with a favorable attitude towards wife-beating.
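The two-step derivation described above (an individual-level indicator, then a community-level proportion) can be sketched as follows. The field names, response labels, and sample records are hypothetical and chosen only for illustration; they are not the ZDHS variable names or codes.

```python
from collections import defaultdict

HAS_SAY = {"alone", "jointly"}  # hypothetical response labels, not DHS codes
DECISIONS = ("own_health", "visits", "food")  # hypothetical field names

def high_autonomy(woman):
    """True if the woman makes all three decisions alone or jointly."""
    return all(woman[d] in HAS_SAY for d in DECISIONS)

def favors_wife_beating(woman):
    """True if the woman agrees with at least one of the five scenarios."""
    return any(woman["justified"])

def community_proportion(women, indicator):
    """Proportion of women in each cluster for whom the indicator is true."""
    totals, hits = defaultdict(int), defaultdict(int)
    for w in women:
        totals[w["cluster"]] += 1
        hits[w["cluster"]] += int(indicator(w))
    return {c: hits[c] / totals[c] for c in totals}

# Three made-up respondents in two clusters.
women = [
    {"cluster": 1, "own_health": "alone", "visits": "jointly", "food": "alone",
     "justified": [False, False, False, False, False]},
    {"cluster": 1, "own_health": "husband", "visits": "jointly", "food": "alone",
     "justified": [False, True, False, False, False]},
    {"cluster": 2, "own_health": "jointly", "visits": "jointly", "food": "jointly",
     "justified": [False, False, False, False, False]},
]
autonomy = community_proportion(women, high_autonomy)
acceptance = community_proportion(women, favors_wife_beating)
```

Passing the indicator as a function keeps the aggregation step identical for both measures, which mirrors how the text applies the same procedure twice.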
The 2015 ZDHS posed the question “Do you perceive distance to a health facility to be a big challenge?” with no/yes responses. We used this variable as a proxy measure for community disadvantage in terms of access to health facilities. We identified the women who responded that distance was a big challenge in accessing health facilities and counted them per community. Across the 400 communities, the minimum score was 0 and the maximum was 27, with a mean score of 7. Clusters with scores above the mean were categorized as communities with a higher proportion of women who reported distance to the health facility as a big challenge (coded 0), and the remaining clusters as communities with a lower proportion. The wealth indices of women were likewise aggregated within their respective communities to obtain a measure of community socioeconomic disadvantage. This was done by classifying women with middle, richer and richest wealth indices as having a non-poor wealth index and computing the proportion of such women for each community/cluster. The community-level socioeconomic measure was obtained by categorizing clusters into those with high and low proportions of women with a non-poor wealth index. This approach to measuring socioeconomic disadvantage has been used by studies in Kenya 34 and Korea 36. Constructing group-level measures from individual-level survey data is beneficial, especially where multi-level models 37 are deployed to provide evidence regarding the contribution of community-level factors 33. The categories and the hypothesized direction of the association between explanatory variables and the outcome variable are summarized in Table 1.
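The count-and-threshold step for the distance measure can be sketched as follows, assuming one record per woman with a cluster identifier and a yes/no flag. The function and field names and the tiny sample are hypothetical, and the output uses the labels "high" and "low" rather than numeric codes.

```python
from collections import Counter
from statistics import mean

def distance_problem_counts(women, clusters):
    """Number of women per cluster who report distance as a big challenge."""
    counts = Counter(w["cluster"] for w in women if w["distance_big_problem"])
    # Clusters where no woman reports a problem still get a score of 0.
    return {c: counts.get(c, 0) for c in clusters}

def split_at_mean(scores):
    """Label clusters scoring above the mean as 'high', all others as 'low'."""
    cutoff = mean(scores.values())
    return {c: ("high" if s > cutoff else "low") for c, s in scores.items()}

# Made-up respondents in three clusters (cluster 3 has no 'yes' answers).
women = [
    {"cluster": 1, "distance_big_problem": True},
    {"cluster": 1, "distance_big_problem": True},
    {"cluster": 1, "distance_big_problem": True},
    {"cluster": 2, "distance_big_problem": True},
    {"cluster": 2, "distance_big_problem": False},
    {"cluster": 3, "distance_big_problem": False},
]
scores = distance_problem_counts(women, clusters=[1, 2, 3])
levels = split_at_mean(scores)
```

The same `split_at_mean` idea could be applied to the per-cluster proportion of women with a non-poor wealth index, although the text does not state which cutoff was used for that measure.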
Table 1
Categories of individual-level and community-level variables with their hypothesized direction of association
| Variable category | Variable | Categorization | Expected sign of association with ever been screened for CC (Yes) |
|---|---|---|---|
| Outcome | Ever been screened for CC | 0 = No; 1 = Yes | |
| Explanatory variables (individual-level variables) | Highest educational level attained | 0 = Primary or no education; 1 = Secondary; 2 = Post-secondary or higher | + |
| | Age | 0 = ≤30; 1 = 31–49 | + |
| | Region | 0 = Manicaland; 1 = Masvingo; 2 = Mashonaland; 3 = Matabeleland; 4 = Midlands; 5 = Harare; 6 = Bulawayo | + |
| | Total children ever born | 0 = ≤1; 1 = 2; 2 = ≥3 | + |
| | Currently working | 0 = No; 1 = Yes | + |
| | Religion | 0 = Christians; 1 = Non-Christians | - |
| | Age at first sex | 0 = ≤17; 1 = 18–22; 2 = 23–37; 3 = Singles | + |
| | Covered by health insurance | 0 = No; 1 = Yes | + |
| | Currently using any modern family planning method | 0 = No; 1 = Yes | + |
| | Wealth index | 0 = Poorest; 1 = Poor; 2 = Middle; 3 = Rich; 4 = Richest | + |
| | Amount of media exposure | 0 = None; 1 = Multiple | + |
| | Household size | 0 = 1–4; 1 = 5–8; 2 = 9+ | - |
| | Type of marriage | 0 = Monogamy; 1 = Polygamy; 2 = Not in union | - |
| | Place of delivery | 0 = Health facility; 1 = Home; 2 = Never given birth | - |
| Explanatory variables (community-level variables) | Community distribution of women who reported distance to a health facility as a major problem | 0 = Low; 1 = High | - |
| | Community distribution of women with high decision-making autonomy | 0 = Low; 1 = High | - |
| | Community distribution of women with non-poor wealth index | 0 = Low; 1 = High | - |
| | Community distribution of women with positive attitude towards wife-beating | 0 = Low; 1 = High | - |
| | Place of residence | 0 = Rural; 1 = Urban | + |
Statistical Analyses
We used frequency distributions to describe women’s demographic and socioeconomic characteristics. We used cross-tabulation and applied Pearson’s chi-squared (\({\chi }^{2}\)) tests to investigate associations of individual-level and community-level characteristics with uptake of cervical cancer screening. The dichotomous outcome variable and the hierarchical nature of the Zimbabwe DHS dataset enabled the use of two-step multivariable multilevel logistic regression with the log-binomial function of the generalized linear mixed models family 38. The associations of individual-level and community-level determinants with uptake of cervical cancer screening were analyzed in a stepwise manner. The nesting of individual women within the communities in which they lived gave rise to three models for analysis. We started by fitting the variance component model, also called the empty or null model (Eq. 1), which excluded the fixed effects.
\(\log\left(\frac{{\pi }_{ij}}{1-{\pi }_{ij}}\right)={\beta }_{0}+{u}_{0j}\) (Eq. 1)
where \({\pi }_{ij}\) is the probability of woman i in community j having ever been screened, \(1-{\pi }_{ij}\) is the probability of woman i in community j never having been screened, \({\beta }_{0}\) is an intercept shared by all communities, and \({u}_{0j}\) is the random effect specific to community j. The variance component model (a model including random effects only) was constructed to determine whether the variation in uptake of cervical cancer screening could be explained by variation between the communities in which women live. This was assessed with the Intra-Cluster Correlation coefficient (ICC), also called the Variance Partition Coefficient or ρ (the Greek letter rho). Mathematically, ρ or ICC is obtained by dividing the variance at the group level by the total variance at the individual and group levels (Eq. 2) 39.
ICC or ρ = \(\frac{{S}_{b}^{2}}{{S}_{b}^{2}+{S}_{w}^{2}}\) (Eq. 2)
Where \({S}_{b}^{2}\) is the variance between clusters, and \({S}_{w}^{2}\) is the variance within clusters. We fitted model 2 adding all the individual-level factors (Eq. 3).

Finally, model 3 was fitted comprising individual-level and community-level determinants (Eq. 4).
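As a hedged sketch only, one specification for models 2 and 3 that is consistent with Eq. 1 and with the symbol definitions given for these models is the random-intercept form below. It assumes that only the intercept varies across communities (no random slopes) and uses p as a summation index introduced for illustration; in model 2 the covariates would be the individual-level factors only, and in model 3 they would also include the community-level factors.

```latex
\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)
  = \beta_{0} + \sum_{p=1}^{q} \beta_{p} X_{pij} + u_{0j}
```

Setting all \(\beta_{p}\) to zero recovers the empty model of Eq. 1.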

where \({X}_{ij}=({X}_{1ij},{X}_{2ij},\dots ,{X}_{qij})\) represents the first-level (woman-level) and second-level (community-level) covariates, \({\beta }_{0},{\beta }_{1},{\beta }_{2},\dots ,{\beta }_{q}\) are regression coefficients, and \({u}_{0j},{u}_{1j},{u}_{2j},\dots ,{u}_{kj}\) are the random effects of the model parameters at level two. The random effect is assumed to follow a normal distribution with variance \({\sigma }_{u0}^{2}\). To assess the fit of model 3 relative to model 2, we performed a likelihood ratio test and compared the Akaike Information Criterion (AIC) of the two models, with a lower AIC value denoting a better model fit 40. The odds of cervical cancer screening, controlling for individual-level and community-level determinants in model 3, were presented with their accompanying P-values and 95% confidence intervals 41. We computed the Variance Inflation Factor (VIF) and tolerance to check for multicollinearity among the covariates in the models. No multicollinearity problems were observed in the regression models, since all variance inflation factor values were less than 10 and all tolerance values were greater than 0.1. Stata SE 15 software was used for the analyses, and the two-tailed Wald test was used to determine the statistical significance of the covariates at a significance level of alpha equal to 5% 39.
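The summary quantities named in this section (ICC, AIC, the likelihood ratio statistic, and VIF) follow simple formulas, sketched below in Python rather than Stata. Two points are assumptions rather than statements from the text: the individual-level variance in the ICC defaults to \({\pi }^{2}/3\), the variance of the standard logistic distribution, which is a common convention for multilevel logistic models, and all numbers are invented for illustration and are not results of the study.

```python
import math

# Assumed level-1 variance: variance of the standard logistic distribution.
LEVEL1_VARIANCE = math.pi ** 2 / 3

def icc(between_variance, within_variance=LEVEL1_VARIANCE):
    """Share of total variance attributable to the community level (Eq. 2)."""
    return between_variance / (between_variance + within_variance)

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood

def lr_statistic(ll_reduced, ll_full):
    """Likelihood ratio statistic for a reduced model nested in a fuller one."""
    return 2 * (ll_full - ll_reduced)

def vif(r_squared):
    """Variance inflation factor from regressing one covariate on the others."""
    return 1 / (1 - r_squared)

# Hypothetical log-likelihoods and parameter counts (not results from the study).
ll_model2, k_model2 = -100.0, 5
ll_model3, k_model3 = -95.0, 8
prefers_model3 = aic(ll_model3, k_model3) < aic(ll_model2, k_model2)
lr = lr_statistic(ll_model2, ll_model3)
```

Tolerance is the reciprocal of VIF, so the thresholds of VIF below 10 and tolerance above 0.1 express the same criterion. The likelihood ratio statistic would be compared against a chi-squared distribution whose degrees of freedom equal the number of added parameters (three in this made-up example).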
Ethical Considerations
All data used in the study were obtained in fully anonymized format from the 2015 ZDHS; as such, no study-specific ethical approval was required for completion of this study. Data collection was conducted in accordance with the Declaration of Helsinki for research involving humans. During data collection, written informed consent was obtained from each respondent before the interviews 28. Procedures and questionnaires for standard DHS surveys have been reviewed and approved by the ICF International Institutional Review Board (IRB). We obtained approval to use the data from the DHS repository (http://dhsprogram.com/data/available-datasets.cfm).