This analysis used data from the 2016 Ethiopia Demographic and Health Survey (EDHS). Authorization to use the data set was obtained from the Measure Demographic and Health Survey (DHS) program, and the data set was downloaded from the Measure DHS website: www.measuredhs.com. The DHS is undertaken about every five years; the 2016 survey is the fourth DHS in Ethiopia, following those undertaken in 2000, 2005, and 2011.
Study design and settings
A population-based cross-sectional study design was used. The survey was conducted from January 18 to June 27, 2016. The study population comprised all children aged 6-59 months and their families who participated in the 2016 EDHS. Full details of the methods used during data collection for the EDHS have been published.
Sample size and sampling technique
A nationally representative sample of 15,683 women aged 15-49 and 12,688 men aged 15-59 in 16,650 selected households was interviewed using a structured questionnaire, corresponding to response rates of 95 % for women and 86 % for men. In this study, a weighted sample of 8,361 children aged 6-59 months was included in the analysis.
A stratified, two-stage cluster sampling technique was used to obtain a representative sample. The sampling frame of the 2016 EDHS consists of a complete list of 84,915 enumeration areas. An enumeration area is a geographic area covering an average of 181 households. In the first stage, 645 enumeration areas (202 in urban areas and 443 in rural areas) were selected with probability proportional to enumeration area size, with independent selection in each sampling stratum. In the second stage, twenty-eight households per cluster were selected using systematic selection. Mothers who were either permanent residents of the selected households or visitors who had stayed in the household the night before the survey were eligible to be interviewed.
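The two selection stages described above can be sketched as follows. This is an illustration only, not the EDHS sampling program: it covers a single stratum, the frame sizes are randomly generated (hypothetical), and the function names `pps_systematic` and `systematic_households` are introduced here for the example.

```python
import random


def pps_systematic(sizes, n_select, rng):
    """Stage 1: pick n_select units with probability proportional to size,
    using systematic selection along the cumulated size scale."""
    total = sum(sizes)
    step = total / n_select
    start = rng.random() * step
    targets = [start + k * step for k in range(n_select)]
    chosen, cumulative, t = [], 0, 0
    for idx, size in enumerate(sizes):
        cumulative += size
        # A unit is selected each time a target falls inside its size interval.
        while t < n_select and targets[t] < cumulative:
            chosen.append(idx)
            t += 1
    return chosen


def systematic_households(n_households, n_select, rng):
    """Stage 2: fixed-interval systematic selection from a household listing."""
    n_select = min(n_select, n_households)
    step = n_households / n_select
    start = rng.random() * step
    return [int(start + k * step) for k in range(n_select)]


rng = random.Random(2016)
# Hypothetical frame for one stratum: household counts per enumeration area.
ea_sizes = [rng.randint(120, 240) for _ in range(2000)]
selected_eas = pps_systematic(ea_sizes, 20, rng)
# Select 28 households in the first sampled enumeration area.
households = systematic_households(ea_sizes[selected_eas[0]], 28, rng)
```

In a stratified design this routine would be run independently within each stratum, as stated in the text.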
The dependent variable is vitamin A supplementation. Outcome data were collected from mothers' direct verbal reports of whether their children had received a vitamin A capsule.
The independent variables comprised individual-level and community-level factors. The individual-level factors were the age of the mother, religion, ethnicity, marital status, the educational status of the mother, the educational status of the husband, the employment status of the mother and of the husband, the working status of the mother, place of delivery, sex of the child, whether the pregnancy was wanted, the mother's health check after delivery, current age of the child, birth order, number of living children, and the number of antenatal care visits. The community-level factors were the region, the place of residence, type of region of residence, community-level poverty, community-level education, and community media exposure. These variables were selected because previous similar studies in developing countries have considered them [1, 2, 31]. Some of the covariates were recoded to suit the analysis.
The aggregate community-level covariates were derived by aggregating individual-level covariates at the cluster level, and each was categorized as high or low based on the distribution of the proportion values. A histogram was used to check the distribution of the proportion values: if the aggregate variable was normally distributed, the mean was used as the cut-off point for categorization; otherwise, the median was used. Accordingly, community poverty was categorized as high if the proportion of mothers from the two lowest wealth quintiles in a given community was 35-100 % and low if the proportion was 0-34 %. Community media exposure was classified as low if the proportion of media exposure in the community was 0-68 % and high if the proportion was 69-100 %. Community education was classified as low if the proportion of the community with primary, secondary, or higher education was 0 % and high if the proportion was 1-100 %.
These aggregations were performed because the community-level variables are not directly available in the EDHS data set. This classification method was adopted because previous studies have analyzed community-level variables in this way [1, 31].
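The aggregation step can be sketched as below. This is a minimal, unweighted illustration: the records, the field names `cluster` and `poor`, and the function name `community_level` are hypothetical, and the 0.35 cut-off is the community poverty threshold reported above. Whether the original aggregation applied sampling weights is not stated, so none are used here.

```python
from collections import defaultdict
from statistics import median


def community_level(records, flag, cutoff=None):
    """Aggregate a 0/1 individual-level flag to a cluster proportion and
    label each cluster 'high' or 'low' relative to the cut-off."""
    by_cluster = defaultdict(list)
    for record in records:
        by_cluster[record["cluster"]].append(record[flag])
    proportions = {c: sum(v) / len(v) for c, v in by_cluster.items()}
    if cutoff is None:
        # Fallback for a non-normal distribution: the median of the proportions.
        cutoff = median(proportions.values())
    labels = {c: "high" if p >= cutoff else "low" for c, p in proportions.items()}
    return proportions, labels


# Hypothetical records: poor = 1 if the mother is in the two lowest wealth quintiles.
records = [
    {"cluster": 1, "poor": 1}, {"cluster": 1, "poor": 1},
    {"cluster": 1, "poor": 0}, {"cluster": 1, "poor": 0},
    {"cluster": 2, "poor": 0}, {"cluster": 2, "poor": 0},
    {"cluster": 2, "poor": 0}, {"cluster": 2, "poor": 1},
    {"cluster": 3, "poor": 1}, {"cluster": 3, "poor": 0}, {"cluster": 3, "poor": 0},
]
proportions, labels = community_level(records, "poor", cutoff=0.35)
```

The resulting cluster label would then be merged back onto every individual record in that cluster before modelling.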
Data collection tools, techniques, and procedures
The data were collected through face-to-face interviews using a structured questionnaire. The questionnaire was first prepared in English and then translated into three local languages. Trained interviewers collected the data under close supervision throughout the data collection process to ensure data quality.
Data processing and analysis
The data were checked for completeness. Covariates that needed recoding were recoded, and missing values were handled before the analysis. The data were analyzed by the investigators using Stata version 14.0. The dependent variable, vitamin A supplementation, was coded as No = 0 and Yes = 1. Univariate analysis was used to describe the frequency and percentage of the dependent and independent variables. In the bivariate analysis, cross-tabulation was carried out to show how the percentage of vitamin A supplementation changed across the categories of the explanatory variables and to describe the relationship between the variables using the crude odds ratio.
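For a single binary covariate, the crude odds ratio from a cross-tabulation can be computed as in the sketch below. The cell counts are hypothetical, the function name `crude_odds_ratio` is introduced for illustration, and the confidence interval uses the simple Wald (Woolf) formula, which ignores clustering and survey weights; the crude odds ratios in the study itself were estimated from multilevel models.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table.

    a: exposed, supplemented      b: exposed, not supplemented
    c: unexposed, supplemented    d: unexposed, not supplemented
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for one covariate.
odds_ratio, lower, upper = crude_odds_ratio(60, 40, 45, 55)
```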
Multilevel logistic regression was applied to account for the hierarchical nature of the data (two-stage cluster sampling) and the binary dependent variable. Binary multilevel logistic regression was first used to calculate crude odds ratios with 95 % confidence intervals, and the covariates that were statistically significant were considered for the individual-level and community-level multivariable analyses. Multivariable multilevel logistic regression was then applied to the individual-level and community-level variables, and the variables that were statistically significant were carried forward to the final multivariable model. This analysis was used to calculate the adjusted odds ratios and to estimate the extent of random variation between communities [31-34]. In the multivariable analysis, the variance inflation factor (VIF) was calculated to assess multicollinearity among the explanatory covariates, and the average VIF was used to help identify suitable covariates. As a rule of thumb, an average VIF below five is considered tolerable [35-38].
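The VIF screening can be illustrated with the sketch below, which uses the identity that the VIF of covariate j equals the j-th diagonal element of the inverse correlation matrix of the covariates. The covariate values and function names are hypothetical, and numerically coded covariates are assumed; this is not the Stata routine used in the study.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length numeric columns."""
    centered = [[v - mean(col) for v in col] for col in columns]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    k = len(columns)
    return [[sum(x * y for x, y in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting.
    Perfectly collinear covariates give a singular matrix (ZeroDivisionError)."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF per covariate: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical numerically coded covariates.
maternal_age = [1, 2, 3, 4, 5]
birth_order = [2, 1, 4, 3, 5]
values = vif([maternal_age, birth_order])
tolerable = mean(values) < 5   # rule of thumb cited in the text
```

With two covariates the result reduces to 1 / (1 - r²), where r is their correlation, which gives a quick check on the calculation.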
Four models were fitted using the xtmelogit command. Model I, the empty model, was fitted without independent covariates to test the random variability of the intercept and to estimate the intra-class correlation coefficient (ICC). Model II was fitted to assess the effects of the individual-level covariates. Model III was fitted to assess the effects of the community-level covariates. Model IV examined the effects of the individual-level and community-level variables simultaneously. Individuals (mother and child) were nested within communities, as described elsewhere [31, 32]. Because the models were nested, the chi-square likelihood-ratio test was used to assess the difference between them. P-values were estimated using Wald statistics, which also inform model adequacy.
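The likelihood-ratio comparison of two nested models can be sketched as follows. The log-likelihood values and the number of added parameters are hypothetical, and the function names are introduced for illustration; the chi-square tail probability is computed from the standard recurrence for the regularized incomplete gamma function so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # Q(1/2, h)
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """LR statistic and p-value for a reduced model nested in a fuller model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical log-likelihoods, e.g. a reduced model against a fuller one
# that adds two parameters.
statistic, p_value = likelihood_ratio_test(-100.0, -95.0, 2)
```

A small p-value indicates that the fuller model fits significantly better than the reduced model.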
Parameter estimation methods
In the multilevel models, the fixed effects (measures of association) estimate the association between the likelihood of receiving a vitamin A capsule and the individual-level and community-level factors. These estimates were expressed as odds ratios with their 95 % confidence intervals. The random effects are measures of variation in vitamin A supplementation across communities and were expressed as the ICC and the proportional change in variance (PCV). The ICC was calculated to evaluate whether the variation in vitamin A supplementation is primarily within or between communities [39, 40]. The ICC ranges from 0 to 1: an ICC of 1 indicates that mothers in the same community have identical use of vitamin A supplementation for their children, whereas an ICC of 0 indicates that mothers in the same community are no more alike in their use than mothers from different communities. A multilevel random-intercept logistic regression model was used in the analysis. In addition, mixed-effects logistic regression was used to determine the extent of the variation in vitamin A supplementation attributable to the individual-level and community-level characteristics. The mixed-effects logistic regression model consists of two parts, namely, the fixed effects and the random effects [37, 38].
The model was specified as:
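\[
\mathrm{logit}(\pi_{ij}) = \log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \dots + \beta_8 x_{8ij} + \beta_9 z_{1j} + \beta_{10} z_{2j} + \dots + \beta_{14} z_{6j} + \mu_{0j}
\]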
Where: πij is the probability of receiving vitamin A supplementation for the child of mother i in cluster j; (1-πij) is the probability of not receiving it, so that πij/(1-πij) is the odds of receiving it; x and z are the explanatory variables for the likelihood of receiving vitamin A; x1 to x8 are the individual-level variables; z1 to z6 are the community-level variables; β0 is the overall intercept; β1-β14 are the regression coefficients for the explanatory variables x1 to x8 and z1 to z6; and μ0j is the community-level random effect (assumed to be normally distributed with mean 0 and variance σ²μ0). A cross-level interaction term, ZjXij, was also added.
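To make the specification concrete, the sketch below converts a linear predictor of this form into a predicted probability. All coefficient and covariate values are hypothetical, and the names `predicted_probability`, `beta_individual`, and `beta_community` are introduced only for this example; they are not estimates from the study.

```python
import math


def predicted_probability(beta0, beta_individual, x, beta_community, z, u0j=0.0):
    """Inverse-logit of the random-intercept linear predictor.

    x: individual-level covariates of one child; z: covariates of the child's
    community; u0j: the community-level random intercept.
    """
    eta = (beta0
           + sum(b * v for b, v in zip(beta_individual, x))
           + sum(b * v for b, v in zip(beta_community, z))
           + u0j)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values: two individual-level and one community-level covariate.
p_average = predicted_probability(-0.2, [0.4, -0.1], [1, 2], [0.3], [1])
p_high = predicted_probability(-0.2, [0.4, -0.1], [1, 2], [0.3], [1], u0j=0.8)

# Exponentiating a coefficient gives the corresponding odds ratio.
odds_ratio_x1 = math.exp(0.4)
```

Two children with identical covariates can therefore have different predicted probabilities if their communities have different random intercepts.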
The ICC was calculated as ICC = τ / (τ + π²/3), where τ is the estimated community-level variance (σ²μ0 in the model above) and π is the mathematical constant. The term π²/3 ≈ 3.29 is the level-one residual variance implied by the standard logistic distribution.
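The ICC formula above, together with the PCV, can be computed as in the sketch below. The variance values are hypothetical. The PCV is taken here to be the relative reduction in community-level variance compared with the empty model, which is the commonly used definition; that this is exactly what was computed in the study is an assumption.

```python
import math

# Level-one residual variance of the standard logistic distribution (about 3.29).
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc(tau):
    """Intra-class correlation for a random-intercept logistic model."""
    return tau / (tau + LEVEL1_VARIANCE)


def pcv(tau_empty, tau_model):
    """Proportional change in community-level variance relative to the empty model."""
    return (tau_empty - tau_model) / tau_empty


# Hypothetical community-level variances from the empty and the full model.
icc_empty = icc(1.0)
pcv_full = pcv(1.0, 0.6)
```

An ICC from the empty model that is clearly above zero supports the use of a multilevel model, and the PCV indicates how much of the between-community variance the added covariates explain.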
The researchers received the survey data and a letter of authorization from the Measure DHS site (Supplementary file 1).