Data source: 2014 Ghana Demographic and Health Survey (GDHS)
This research used the 2014 Ghana Demographic and Health Survey (GDHS) dataset. The Demographic and Health Survey (DHS) program has a long-standing reputation for collecting, processing, and sharing population and health information in over 90 low- and lower-middle-income countries, including Ghana. The 2014 GDHS was conducted by the Ghana Statistical Service, the Ghana Health Service, and the National Public Health Laboratory, with technical support from Inner City Fund (ICF) International.
According to the 2014 GDHS report, survey participants were selected using a two-stage sampling technique. In the first stage, 427 clusters (hereinafter communities) were randomly selected from the 2010 Ghana Population and Housing Census sampling frame so as to be representative at both the national (10 regions) and residence (urban and rural) levels. In the second stage, about 30 households were chosen from each cluster by systematic sampling, for a total of 12,831 households. Of these, 12,010 were occupied and the rest were vacant. Interviews were completed in 11,835 of the 12,010 occupied households, a response rate of 99%. In the interviewed households, 9,656 women aged 15-49 years were eligible for the survey, and 9,396 of them consented and were interviewed, a response rate of 97% [7]. This nationwide survey captured information on women’s reproductive health, health-seeking behavior, and socioeconomic and demographic background, as well as geo-referenced data. Global Positioning System (GPS) coordinates were collected for all 427 communities. Of the 9,396 women interviewed, a subgroup of 4,292 respondents had given birth in the 5 years preceding the survey and were asked whether they received skilled care within 41 days after a live delivery (Figure 2). All 4,292 eligible participants answered this question and constituted the study sample for this research. More detailed information on the sampling procedure and data is published in the 2014 GDHS report [7].
Study variables
The dependent variable was whether the mother received PNC services from a skilled health care provider immediately or within 41 days after delivery. The outcome was coded ‘yes’ or ‘no’ according to the questionnaire response. To limit recall bias, the survey questions on PNC services were restricted to the most recent delivery in the five years preceding the 2014 GDHS. The coding structure for the outcome variable is detailed in Table 1.
The community-level and individual-level factors employed in this study were chosen using the Andersen health utilization model as displayed in Figure 3. Community-level factors describe the characteristics of the community while the individual-level factors focus on women’s attributes. Andersen’s behavioral model highlights the health system and community characteristics as well as predisposing and enabling factors as facilitating and inhibitory factors for health care utilization [31, 32].
The 2014 GDHS recorded the type of community in which each woman usually lived, rural or urban; this is termed area of residence in this study. The 2014 GDHS also asked women whether distance to a health facility was a problem when seeking medical attention. This question captured women’s self-reported perception of the distance to a health facility in their community and is hereinafter referred to as community-level problem with distance to a health facility. In the original 2014 GDHS data file, the variable “problem with distance to facility” was a dummy variable with two responses: ‘a big problem’ (indicating longer perceived distance) or ‘not a big problem’ (suggesting shorter perceived distance). This variable was used as a proxy to measure the association between community-level problem with distance to a health facility and the dependent variable, as was done in an analogous study in Nigeria [25, 33].
The community poverty level was categorized into two groups, as reported in a similar study [34]. This variable was created from the wealth index in the survey data. The GDHS wealth index was built from information on household assets and dwelling characteristics, including means of transport, refrigerator ownership, and toilet facilities, among others [24]. The survey used principal component analysis to create the wealth index [24], which was categorized into quintiles: poorest, poorer, middle, richer, and richest. In this study, the poorest and poorer groups were merged to form the poor category, and the percentage of women who were poor was estimated for each community. A community’s level of poverty was coded as high (‘1’) when this percentage was above 50% and as low (‘0’) otherwise (Table 1).
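The coding rule for community poverty can be illustrated with a minimal sketch. The records, the function name `community_poverty_level`, and the unweighted percentage are assumptions made for illustration; the survey data may call for sampling weights that are not shown here.

```python
from collections import defaultdict

# Hypothetical records: (community_id, wealth_quintile) for each woman.
records = [
    (1, "poorest"), (1, "poorer"), (1, "middle"),
    (2, "richer"), (2, "poorer"), (2, "richest"), (2, "middle"),
]

POOR_QUINTILES = {"poorest", "poorer"}  # merged into the "poor" category


def community_poverty_level(rows, threshold=50.0):
    """Return {community_id: 1 (high) or 0 (low)} from the share of poor women."""
    totals = defaultdict(int)
    poor = defaultdict(int)
    for community, quintile in rows:
        totals[community] += 1
        if quintile in POOR_QUINTILES:
            poor[community] += 1
    levels = {}
    for community, n in totals.items():
        pct_poor = 100.0 * poor[community] / n
        # High poverty only when the percentage is strictly above the threshold.
        levels[community] = 1 if pct_poor > threshold else 0
    return levels


poverty = community_poverty_level(records)
```

In this toy input, two of three women in community 1 are poor, so it is coded high, while one of four women in community 2 is poor, so it is coded low.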
The community level of education was generated from women’s responses to the survey question on “highest education attended”. The survey data categorized the highest education attended into four groups: no education, primary (1-6 years), secondary (7-12 years), and higher. For this study, secondary and higher education were combined, and the percentage of women with at least secondary education was computed for each community. Similarly, the community unemployment level was generated from women’s responses on whether they were working or not. The survey data coded employment as a dummy variable: working (‘1’) and not working (‘0’). The percentage of women not working was calculated for each community. Community level of education (p-value = 0.67) and community unemployment level (p-value = 0.29) did not fail the linearity test and were therefore examined as continuous variables. In summary, the community-level factors analyzed to explain the discrepancies in the utilization of PNC services were area of residence, community-level problem with distance to a health facility, community poverty level, community education level, and community unemployment level.
The individual-level variables studied in this research were maternal age, marital status, religion, ethnicity, parity, education, wealth status, and employment status. As shown in Table 1, maternal age was examined as a continuous variable. Marital status was grouped as single, cohabiting, widowed/divorced/separated, and married. Religion was classified as traditionalist or other, Muslim, and Christian. Ethnicity was classified as Akan, Northern tribes, Ewe, Ga, and other groupings. Parity was grouped into 1 birth, 2 births, and ≥ 3 births. Women’s highest education level was classified as no education, primary, and secondary or higher. Wealth status was grouped into poor, middle, and rich. Finally, employment status was grouped into not working and working. The categorizations of the study predictors were adopted from the literature [35, 36].
Descriptive statistics
This research employed Chi-square tests to ascertain the differences in the distribution of women across all the categories of the explanatory variables. In this study, proportions and frequencies of postnatal care services use were tabulated according to the hypothesized socio-demographic and economic predictors for women of child-bearing age. Mean, standard deviation, median and interquartile range (IQR) were used for quantitative variables.
Spatial Clustering
This study hypothesized that spatial autocorrelation exists in the use of PNC services across communities. Kulldorff’s spatial scan statistic is a powerful tool for detecting spatial autocorrelation based on geographic positioning [37]. This technique was employed to identify local clusters of PNC service use across the communities. The analyses used the GDHS spatial data, which provide only one set of coordinates per community. A purely spatial analysis was conducted using a discrete binomial model to scan for communities with high rates of non-utilization of PNC services in Ghana. The SaTScan technique tests whether the risk of non-use of PNC services differs between the inside and the outside of a circular window. The circular spatial window scanned the communities to identify clusters, with the maximum spatial cluster size set to 50 percent of the population at risk. The probability model relied on Monte Carlo simulation with 999 replications [38]. The analyses were conducted using SaTScan software, version 9.6.0. The outputs from the SaTScan analyses were displayed on Google Maps to highlight the spatial patterns of non-use of PNC services.
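The test statistic behind a high-rate circular scan can be sketched as follows, assuming the Bernoulli form of the likelihood ratio that is commonly used for binary outcomes. The function names and the example counts are hypothetical, and this is not SaTScan's own code; it only illustrates how one candidate window is scored and how a Monte Carlo p-value is ranked.

```python
import math


def _xlogy(x, y):
    """x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c, n, C, N):
    """Log likelihood ratio for one candidate window (high-rate scan).

    c, n: cases (non-users) and total women inside the window
    C, N: cases and total women in the whole study area
    Returns 0.0 unless the rate inside exceeds the rate outside.
    """
    inside_rate = c / n
    outside_rate = (C - c) / (N - n)
    if inside_rate <= outside_rate:
        return 0.0
    log_l_alt = (
        _xlogy(c, inside_rate)
        + _xlogy(n - c, 1 - inside_rate)
        + _xlogy(C - c, outside_rate)
        + _xlogy((N - n) - (C - c), 1 - outside_rate)
    )
    log_l_null = _xlogy(C, C / N) + _xlogy(N - C, 1 - C / N)
    return log_l_alt - log_l_null


def monte_carlo_p_value(observed_llr, simulated_max_llrs):
    """Rank of the observed statistic among the simulated maxima."""
    rank = 1 + sum(1 for s in simulated_max_llrs if s >= observed_llr)
    return rank / (len(simulated_max_llrs) + 1)


# Hypothetical counts: 30 of 100 women in the window did not use PNC,
# versus 100 of 1,000 women in the whole study area.
llr = bernoulli_llr(c=30, n=100, C=100, N=1000)
```

With 999 replications, the smallest attainable p-value under this ranking is 1/1000, which is why replication counts ending in 999 are conventional.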
Inferential Statistics
Multilevel mixed regression model
Given the sampling technique and the hierarchical nature of the weighted 2014 GDHS data, a two-level mixed logistic regression model was specified for the dichotomous outcome [20, 39] and estimated using mean-variance adaptive Gauss–Hermite quadrature with 12 integration points. The model consisted of level one (individuals) nested within level two (communities). The level-two error was treated as a random effect to check for disparities in the likelihood of PNC service use across the communities. The two-level mixed model used is stated below.
Equation 1: Multilevel mixed logistic regression model
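A standard two-level random-intercept logistic specification that matches the description above can be written as follows. The notation is assumed for illustration and is not taken from the survey report or the model output: $\pi_{ij}$ is the probability that woman $i$ in community $j$ used PNC services, $\mathbf{x}_{ij}$ and $\mathbf{z}_{j}$ are the individual-level and community-level predictors, and $u_j$ is the community random effect.

```latex
\operatorname{logit}(\pi_{ij})
  = \ln\!\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij}
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_{j} + u_j,
\qquad u_j \sim N\!\left(0, \sigma^2_{group}\right)
```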

Three models were estimated in this study. First, a null model was fitted with no covariates. Second, unconditional mixed logistic regression analyses were conducted between the use of PNC services and each individual-level and community-level predictor. Unadjusted odds ratios were generated, and predictors with a liberal p-value of 0.25 or less were selected as candidates for the multivariable two-level mixed model [40]. This liberal cut-off was used to avoid eliminating important predictors that could be masked or suppressed by other control variables [40, 41]. Lastly, as proposed by Hosmer and Lemeshow [41], a manual backward elimination procedure was used to arrive at the final model. This technique sequentially removes the least relevant predictor, beginning with the one with the highest p-value, until only predictors with a p-value less than or equal to 0.05 remain.
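The screening and backward elimination steps can be sketched as a simple loop. This is an illustration under stated assumptions: `fit_p_values` is a hypothetical callable standing in for refitting the mixed model and returning one p-value per predictor, and multi-category predictors, which would need a joint test, are treated as having a single p-value.

```python
def screen_candidates(unadjusted_p, cutoff=0.25):
    """Keep predictors whose unadjusted p-value is at or below the liberal cutoff."""
    return [name for name, p in unadjusted_p.items() if p <= cutoff]


def backward_eliminate(candidates, fit_p_values, alpha=0.05):
    """Drop the predictor with the highest p-value until all are <= alpha.

    fit_p_values(names) -> {name: p_value} is a hypothetical stand-in for
    refitting the multivariable model with the given predictors.
    """
    kept = list(candidates)
    while kept:
        p_values = fit_p_values(kept)
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:
            break  # every remaining predictor is significant
        kept.remove(worst)
    return kept


# Hypothetical p-values, for illustration only.
unadjusted = {"age": 0.01, "religion": 0.30, "parity": 0.20, "wealth": 0.04}
adjusted = {"age": 0.02, "parity": 0.40, "wealth": 0.03}

candidates = screen_candidates(unadjusted)
final = backward_eliminate(candidates, lambda names: {n: adjusted[n] for n in names})
```

In a real analysis the p-values change after each refit, which is why the loop calls the fitting function again on every pass rather than reusing the first set.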
A complete case analysis was used, so subjects with missing values were excluded. The assumption of linearity for age was tested with a polynomial model by introducing a quadratic term. Multicollinearity among the selected individual-level predictors was tested to guard against inflated standard errors caused by several predictors measuring the same characteristic. Thresholds of variance inflation factor (VIF) ≤ 2.5 and tolerance ≥ 0.4 were set, as recommended by Johnston et al. [42] for logistic regression models, to identify potentially redundant variables due to collinearity.
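The two collinearity thresholds are equivalent, because tolerance is the reciprocal of the VIF and 1/2.5 = 0.4. A minimal sketch, assuming the usual definition in which $R^2$ comes from regressing one predictor on all the others (function names are hypothetical):

```python
def vif_and_tolerance(r_squared):
    """VIF and tolerance from the R^2 of one predictor regressed on the others."""
    tolerance = 1.0 - r_squared
    vif = 1.0 / tolerance
    return vif, tolerance


def is_redundant(r_squared, max_vif=2.5, min_tolerance=0.4):
    """Flag a predictor that breaches either threshold."""
    vif, tolerance = vif_and_tolerance(r_squared)
    return vif > max_vif or tolerance < min_tolerance


# Hypothetical R^2 values: 0.50 gives VIF 2.0, 0.75 gives VIF 4.0.
flags = {r2: is_redundant(r2) for r2 in (0.50, 0.75)}
```

Under these thresholds a predictor is flagged once more than 60% of its variance is explained by the other predictors.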
A Type 3 likelihood ratio test was used to examine categorical explanatory variables with more than two categories. Predictors were considered confounders if the regression coefficient differed by more than 20% between the unconditional and conditional models [40]. Interactions among predictors that were significant in the multivariable model were also tested.
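The change-in-estimate rule for confounding can be sketched as below. Taking the unadjusted coefficient as the denominator is an assumption made for illustration, since the text does not state which coefficient the percentage is relative to; the function names and numbers are hypothetical.

```python
def percent_change(beta_unadjusted, beta_adjusted):
    """Absolute percentage change in a coefficient after adjustment."""
    return 100.0 * abs(beta_adjusted - beta_unadjusted) / abs(beta_unadjusted)


def is_confounded(beta_unadjusted, beta_adjusted, threshold=20.0):
    """True when the coefficient shifts by more than the threshold percentage."""
    return percent_change(beta_unadjusted, beta_adjusted) > threshold


# Hypothetical coefficients: 0.50 unadjusted, 0.35 after adjustment.
shift = percent_change(0.50, 0.35)
confounded = is_confounded(0.50, 0.35)
```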
The final model had both fixed and random effects, which were reported as odds ratios and intraclass correlation coefficients (ICC) respectively. To compare the effect on individuals across the communities, this study manually calculated population-averaged odds ratios (ORs) and 95% confidence intervals from the subject-specific coefficients from the final model using the following equation:
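Equation 2: Population-averaged Odds Ratios
$$\beta^{PA} \approx \frac{\beta}{\sqrt{1 + 0.346\,\sigma^2_{group}}}, \qquad OR^{PA} = \exp\left(\beta^{PA}\right)$$
where $\sigma^2_{group}$ is the community-level variance and $\beta$ is the subject-specific regression coefficient.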
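The conversion from subject-specific to population-averaged estimates can be sketched with the common approximation $\beta^{PA} \approx \beta / \sqrt{1 + 0.346\,\sigma^2_{group}}$. Scaling the standard error by the same factor to obtain the confidence interval is an assumption of this sketch (it treats the variance estimate as fixed), and the function name and inputs are hypothetical.

```python
import math


def population_averaged_or(beta, se, sigma2_group, z=1.96):
    """Population-averaged OR and 95% CI from a subject-specific coefficient.

    beta, se: subject-specific log-odds coefficient and its standard error
    sigma2_group: community-level (random intercept) variance
    """
    shrink = math.sqrt(1.0 + 0.346 * sigma2_group)
    beta_pa = beta / shrink
    se_pa = se / shrink  # assumption: same scaling applied to the standard error
    return (
        math.exp(beta_pa),
        math.exp(beta_pa - z * se_pa),
        math.exp(beta_pa + z * se_pa),
    )


# Hypothetical inputs, for illustration only.
or_pa, lower, upper = population_averaged_or(beta=0.69, se=0.20, sigma2_group=0.5)
```

Because the divisor is at least 1, the population-averaged odds ratio is always pulled toward 1 relative to the subject-specific one, and the two coincide when the community-level variance is zero.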
Based on the latent response variable approach [43], the variance partition coefficient (VPC), also referred to as the intraclass correlation coefficient (ICC), was calculated at the community level in both the null and final models. It measures the share of variability in the dependent variable that is attributable to the contextual level [44]. The VPC was computed using the formula below.
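Equation 3: Variance Partition Coefficient (VPC)
$$VPC = \frac{\sigma^2_{group}}{\sigma^2_{group} + \pi^2/3}$$
where $\sigma^2_{group}$ is the community-level variance and $\pi^2/3 \approx 3.29$ is the individual-level variance of the standard logistic distribution.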
This study also computed the design effect (deff), the ratio of the variance in a clustered data structure to that in an independent structure. Because the variation within or between clusters for discrete data is not always constant, deff is an approximation [45].
Equation 4: Design effect
$$deff = 1 + (C - 1) \times ICC$$
where $C$ is the average cluster size and $ICC$ is the intraclass correlation coefficient.
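The VPC and the design effect can be computed together, since the design effect takes the ICC as an input. The sketch below assumes the latent-variable form of the VPC, in which the individual-level variance is fixed at $\pi^2/3$, and the usual approximation deff = 1 + (C - 1) × ICC. The variance and cluster size used are hypothetical values, not results from the study.

```python
import math

# Individual-level variance of the standard logistic distribution (about 3.29).
LOGISTIC_LEVEL1_VARIANCE = math.pi ** 2 / 3


def vpc(sigma2_group):
    """Share of latent-scale variance attributable to the community level."""
    return sigma2_group / (sigma2_group + LOGISTIC_LEVEL1_VARIANCE)


def design_effect(avg_cluster_size, icc):
    """Approximate variance inflation caused by clustering."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Hypothetical inputs: community variance 0.5, about 10 women per community.
icc = vpc(0.5)
deff = design_effect(10, icc)
```

A design effect of 1 corresponds to no clustering; it grows with both the average cluster size and the ICC.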
The final model was compared with the null model, and the model with the smaller Akaike’s Information Criterion (AIC) and Bayesian Information Criterion (BIC) was regarded as the more parsimonious model [46]. Model diagnostics were also performed using the area under the curve (AUC) of the receiver-operating characteristic (ROC). An alpha level of 0.05 was used to determine statistical significance. Stata 14 (StataCorp, College Station, TX, USA) was used for the analyses.
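The model comparison can be illustrated with the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The log-likelihoods and parameter counts below are hypothetical, and using the number of women as n in the BIC is an assumption, since the appropriate n for multilevel models is open to choice.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: a null model with 2 parameters, a final model with 12.
n_obs = 4292
null_fit = {"aic": aic(-2100.0, 2), "bic": bic(-2100.0, 2, n_obs)}
final_fit = {"aic": aic(-1950.0, 12), "bic": bic(-1950.0, 12, n_obs)}

final_preferred = (
    final_fit["aic"] < null_fit["aic"] and final_fit["bic"] < null_fit["bic"]
)
```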