Study area, design, and population
The study was conducted in Ethiopia, located in the Horn of Africa, with a total population of about 105,350,020. The country has nine regional states and two city administrations, with over 80 different ethnic groups. About 65% of rural households in Ethiopia consume the World Health Organization's minimum standard of food per day (2,200 kilocalories), and 42% of children under 5 years old are underweight [21]. This study employed a cross-sectional design to assess the determinants of OBF among a total of 1,087 children aged 0-5 months in the enumeration areas selected for the EDHS 2016 [22]. All women of reproductive age residing in the selected households were eligible to be interviewed. All data on children aged 0-5 months in the EDHS 2016 were included in this study.
Sample size determination and procedure
The sample of 1,087 children was drawn from the 2016 EDHS, which was designed to provide estimates of key indicators for the country. Each region was stratified into urban and rural areas, yielding 21 sampling strata. Samples of enumeration areas were selected independently in each stratum in two stages. In the first stage, a total of 645 enumeration areas (202 in urban areas and 443 in rural areas) were selected. In the second stage, a fixed number of 28 households per cluster was selected by equal-probability systematic selection from the newly created household listing.
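The second-stage selection can be illustrated with a minimal Python sketch of equal-probability systematic sampling. The function name, the fractional-interval method, and the random start are assumptions made for illustration; the exact procedure used in the EDHS fieldwork is not described here.

```python
import random


def systematic_sample(household_ids, n=28, seed=None):
    """Equal-probability systematic selection of n households from a listing.

    Hypothetical helper: uses a fractional sampling interval and a random
    start, which is one common way to implement systematic selection.
    """
    rng = random.Random(seed)
    total = len(household_ids)
    if total <= n:
        # Fewer households than the target: take the whole listing.
        return list(household_ids)
    interval = total / n                # fractional sampling interval
    start = rng.random() * interval     # random start in [0, interval)
    selected = []
    for k in range(n):
        # Position of the k-th selection, clamped to the last valid index.
        index = min(int(start + k * interval), total - 1)
        selected.append(household_ids[index])
    return selected
```

Because the interval is at least 1 whenever the listing has more than 28 households, successive positions fall on distinct households and the selection preserves the listing order.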
Data source, data extraction, and management
The 2016 EDHS was a nationally representative, large-scale survey conducted by the Central Statistical Agency (CSA) in collaboration with the Ministry of Health from January 18, 2016, to June 27, 2016. Access to the dataset was requested by submitting the title and significance of the study. After registration and receipt of permission, the dataset was downloaded from the Measure DHS website (www.measuredhs.com). A wide range of information on potential individual- and community-level factors was then extracted from the EDHS 2016, particularly from the child record dataset, which covers child mortality, nutrition, maternal and child health, family planning, and other reproductive health issues.
Variables and measurements
Optimal breastfeeding: refers to a child less than six months old who was breastfed in a timely manner (within one hour of birth) and had no intake of food or fluids other than breast milk for the first six months (i.e., EBF).
In this study, the dependent variable was optimal breastfeeding (OBF) practice. In the regression analysis, OBF practice was coded ‘1’ and non-OBF practice was coded ‘0’. The independent variables considered were the age, residence, educational status, and marital status of mothers; household income; occupation; family size; sex of the infant; place of delivery; and antenatal and postnatal service utilization. The age of mothers was categorized as 15-19, 20-24, 25-29, 30-34, and 35-39 years. The youngest age group was taken as the reference category in the regression analysis. The religion of mothers was coded as ‘0’ for Christian and ‘1’ for Muslim. Urban and rural residences of mothers were coded as ‘0’ and ‘1’, respectively. Regarding the educational status of parents, those who could not read and write were coded as ‘0’, while the rest were coded as ‘1’. Mothers who were housewives were coded ‘0’, while farmers and employed mothers were coded ‘1’. The lowest household income level was coded ‘0’, while the other two levels were coded ‘1’. Mothers who received infant feeding counseling and delivered in a health institution were coded as ‘0’, while those who did not receive those services were coded as ‘1’. The age of the child was categorized as less than 2 months, 2-3 months, and 4-5 months.
Data analysis
The extracted dataset was cleaned, coded, and analyzed using Stata version 13.1 and Excel. After the new variables EIBF and EBF were generated, the outcome variable was generated from both. The EDHS sample was not self-weighting because of non-proportional allocation. Therefore, sampling weights were used to make the sample representative of the entire population.
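The two steps described here, deriving the outcome from EIBF and EBF and applying sampling weights, can be sketched in Python as follows. This is an illustrative sketch, not the Stata code used in the study. The function names are hypothetical, and the code assumes the usual DHS convention that the raw sample weight is stored multiplied by 1,000,000.

```python
def optimal_breastfeeding(eibf, ebf):
    """Return 1 when both early initiation (EIBF) and exclusive
    breastfeeding (EBF) are met, otherwise 0."""
    return 1 if (eibf == 1 and ebf == 1) else 0


def weighted_prevalence(records):
    """Weighted proportion of children with OBF.

    records: list of (obf, raw_weight) pairs, where raw_weight is assumed
    to be the DHS sample weight scaled by 1,000,000.
    """
    # Rescale the raw weights before using them.
    weights = [raw_weight / 1_000_000 for _, raw_weight in records]
    weighted_cases = sum(obf * w for (obf, _), w in zip(records, weights))
    return weighted_cases / sum(weights)
```

A child counts as optimally breastfed only when both indicators equal 1, which mirrors the definition of OBF given under "Variables and measurements".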
Frequencies and percentages were used to report categorical variables, while the median was used to report non-parametric continuous explanatory variables. Bivariate analysis was performed to examine the association between optimal breastfeeding and each individual-level (level 1) and community-level (level 2) factor at a p-value of less than or equal to 0.25. Finally, multivariable multilevel logistic regression analysis was performed to estimate the adjusted odds ratios at both levels and the extent of random variation between communities, at a p-value of less than 0.05 with a 95% confidence interval [23].
Model specification
The first step in the multilevel analysis was to fit an empty (null) model without covariates to test random variability in the intercept and to estimate the intra-class correlation coefficient (ICC). The empty model enables the researcher to verify whether the variation at the community level is large enough to justify assessing random effects at that level. When the ICC is close to zero, most of the variation is explained by the lower level, meaning there is almost no variation between groups; when the ICC is close to one, most of the variation is explained by the higher level, meaning there is almost no variation within groups [22]. The minimum value (cut-off point) of the ICC is 0.1, or 10% [23].
In the model, β0j is the intercept; β1, β2, ..., βn are the regression coefficients estimated from the data; X1j, X2j, ..., Xnj are covariates at the individual level; and Z1j is a covariate at the community level [21].
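A two-level random-intercept logistic model using these symbols is commonly written as shown below. This is a sketch of the standard form, assumed here rather than taken from the study. The child index i, the probability π_ij that child i in community j is optimally breastfed, and the symbols γ00, γ01, and u0j are introduced for illustration. The individual-level covariates are written X_{1ij} to make the child index explicit.

```latex
\begin{aligned}
\log\!\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)
  &= \beta_{0j} + \beta_1 X_{1ij} + \beta_2 X_{2ij} + \dots + \beta_n X_{nij},\\
\beta_{0j}
  &= \gamma_{00} + \gamma_{01} Z_{1j} + u_{0j},
  \qquad u_{0j} \sim N\!\left(0,\ \sigma^2_{u0}\right)
\end{aligned}
```

Under this form, the empty model is the special case with no X or Z terms, so that only the overall intercept γ00 and the community-level random effect u0j remain.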
Parameter estimation methods
Maximum likelihood estimation (MLE) selects the parameter values that maximize the probability of observing the sample data actually obtained. Maximum likelihood (ML) was also used to assess the goodness of the diagnostic tests. This estimator includes both the regression coefficients and the variance components in the likelihood function [24].
In the multilevel models, the measures of association (fixed effects) estimate the association between the likelihood of children being optimally breastfed and the explanatory variables, expressed as adjusted odds ratios (AOR) with their 95% confidence intervals (CIs). The measures of variation (random effects) were reported as the intra-class correlation coefficient, ICC = σ²u0 / (σ²u0 + σ²e), where σ²u0 is the community-level variance and σ²e = π²/3 is the individual-level variance; the ICC gives the proportion of the variance explained by the higher level. The change in the community-level variance between the empty model (Model 1) and each consecutive model was expressed as the proportional change in community variance, PCV = (Ve - Vmi)/Ve, where Ve is the community-level variance of the empty model and Vmi is that of the consecutive model. The ICC and PCV were calculated for each model with reference to the null model using the above formulas.
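The two formulas above translate directly into code. The following Python sketch is illustrative only; the function and constant names are hypothetical, and the variance values passed in would come from the fitted multilevel models.

```python
import math

# Individual-level variance of the logistic model: pi^2 / 3.
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc(community_variance):
    """Intra-class correlation: share of total variance at community level."""
    return community_variance / (community_variance + LEVEL1_VARIANCE)


def pcv(empty_model_variance, model_variance):
    """Proportional change in community variance relative to the empty model."""
    return (empty_model_variance - model_variance) / empty_model_variance
```

For example, a community-level variance equal to π²/3 gives an ICC of 0.5, and a drop in community variance from 2.0 in the empty model to 1.5 in a later model gives a PCV of 0.25.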
Model diagnostic
The effect of multicollinearity between the predictor variables was checked using the variance inflation factor (VIF) at a cut-off point of 10. A VIF value of less than 10 for a predictor indicates the absence of multicollinearity [25].
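The VIF check can be sketched in plain Python using the fact that the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. This is an illustrative sketch with hypothetical helper names, not the Stata procedure used in the study, and it assumes that no predictor is constant or perfectly collinear with the others.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return one VIF per predictor; columns is a list of predictor columns."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)]
            for i in range(k)]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(k)]


def flag_collinear(columns, cutoff=10):
    """Indices of predictors whose VIF reaches the cut-off point."""
    return [i for i, value in enumerate(vif(columns)) if value >= cutoff]
```

Uncorrelated predictors give VIF values of 1, and values grow without bound as a predictor becomes better explained by the others.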
Model selection
The Akaike information criterion (AIC) was used to compare and check the goodness of fit of consecutive models. The AIC values of the models were compared, and the model with the lowest AIC was considered the better explanatory model, fitting the data best [22].
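The comparison rule can be sketched in Python using the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the maximized log-likelihood. The function names and the input format are hypothetical; in practice the log-likelihoods would come from the fitted models.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Return the name of the model with the lowest AIC.

    fits: dict mapping a model name to (log_likelihood, n_params).
    """
    return min(fits, key=lambda name: aic(*fits[name]))
```

The penalty term 2k means that a model with more covariates is preferred only when the gain in log-likelihood outweighs the added parameters.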
Model accuracy
The receiver operating characteristic (ROC) curve was used to show graphically the trade-off between sensitivity and specificity for every possible cut-off of a test or combination of tests. The area under the curve is often used as a measure of predictive power (the usefulness of a test): the more bowed the curve, the greater the predictive power. A model with no predictive power has an area of 0.5; a perfect model has an area of 1. The Stata lroc command was used to examine the predictive ability of the model.
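The area under the ROC curve can be computed without plotting, because it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The Python sketch below illustrates that equivalence. It is not the Stata lroc implementation, and the function name is hypothetical.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    y_true: list of 0/1 outcomes; scores: predicted probabilities.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))
```

Identical scores for every child give an area of 0.5, matching the no-predictive-power case described above, while complete separation of the two groups gives an area of 1.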
Ethical clearance
Ethical approval was obtained from the Institutional Review Board (IRB) of the College of Health Sciences, Mekelle University. Approval to access the 2016 EDHS dataset was obtained following a request made via the DHS Program website (http://www.measuredhs.com).