Count regression models for under-five deaths in rural Ethiopia

Under-five mortality is defined as the probability that a child born alive dies between birth and the fifth birthday. Mortality under the age of five has been a main target of public health policies and is a common indicator of mortality levels, especially in developing countries. It is also viewed as an indicator of the level of development, health and socioeconomic status of the population. The objective of this study was to identify determinants of under-five mortality in Ethiopia using the 2011 EDHS data. To achieve this objective, descriptive statistics and count regression models were used for data analysis, with socio-economic, demographic and environmental variables as explanatory variables and the number of under-five deaths per mother as the response variable. According to the 2011 Ethiopian Demographic and Health Survey (EDHS) report, the level of under-five mortality in rural parts of Ethiopia is 114 deaths per 1000 live births. Factors influencing the number of under-five deaths have been identified. The study revealed that mother's age at first birth, breastfeeding status, wealth index, mother's current working status, region and mother's level of education had statistically significant effects on the number of under-five deaths in rural parts of Ethiopia.

Sub-Saharan Africa still has the world's highest rate of child mortality, despite reducing the number of under-five deaths by 48% since 1990. It is the region most affected and accounts for more than one-third of deaths of children under the age of five. In 2013, about half of global under-five deaths occurred in sub-Saharan Africa and 32 percent in South Asia. Around half of under-five deaths occur in just five countries: India (21%), Nigeria (13%), Pakistan (6%), Democratic Republic of the Congo (5%) and China (4%) [20]. Logistic regression has been widely used in analyzing child death data. However, because its dependent variable is dichotomized as either "1" (death) or "0" (alive), logistic regression undercounts the total number of under-five deaths: multiple child deaths are collapsed into a single unit to fulfill the requirements of the model. Logistic regression therefore cannot provide sufficient information for studying the pattern of multiple child deaths. For this reason, we develop and compare count regression models for the number of under-five deaths and discuss how they can enhance our understanding of the risk factors of under-five deaths.
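The undercounting argument can be illustrated with a very small sketch. The six counts below are hypothetical values chosen for illustration only; they are not taken from the EDHS data.

```python
# Hypothetical numbers of under-five deaths for six mothers (illustrative only).
deaths_per_mother = [0, 0, 1, 3, 0, 2]

# Logistic-style coding: 1 if the mother lost at least one child, otherwise 0.
any_death = [1 if d > 0 else 0 for d in deaths_per_mother]

total_deaths = sum(deaths_per_mother)   # information available to a count model
mothers_with_death = sum(any_death)     # information available to a binary model

print("deaths seen by a count model:", total_deaths)
print("events seen by a binary model:", mothers_with_death)
```

In this toy example the binary coding keeps only half of the deaths, because the mothers with two and three deaths each contribute a single event.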
The general objective of the study is to identify socioeconomic, demographic and environmental factors of under-five mortality in rural Ethiopia using count regression models. The specific objectives are to examine the determinants of under-five mortality in rural parts of Ethiopia using the 2011 EDHS data and to fit an appropriate count regression model for rural parts of Ethiopia.

Variables in the study
The dependent variable for this study is the number of under-five deaths per mother. Based on the framework of determinants of childhood morbidity and mortality for developing countries [15], experience from similar studies and the available data on the subject, the main predictors explored for under-five mortality have been grouped into demographic, socioeconomic and environmental factors. The demographic factors for this study are mother's age at first birth, mother's current breastfeeding status and mother's marital status. The socioeconomic factors are mother's level of education, mother's current working status, region and household wealth index. The environmental factors are source of drinking water and toilet facility.

Methods of data analysis
In this study, the variable of interest is a count variable. When the response variable is a count, taking only non-negative integer values (0, 1, 2, …), it is appropriate to use non-linear models based on non-normal distributions to describe the relationship between the response variable and a set of predictor variables. The standard framework for count data comprises the Poisson and negative binomial regression models, which are the two most popular models for such data.

Poisson regression model
The Poisson regression model is a popular and simple regression model for count data. It assumes a Poisson distribution, which is positively skewed and has a variance equal to its mean. Poisson regression analysis is a technique for modelling dependent variables that describe count data [3]. According to [19], the apparent simplicity of the Poisson model comes with two restrictive assumptions. First, the variance and mean of the count variable are assumed to be equal. Second, occurrences of the event are assumed to be independent of each other. If the p-value of the likelihood ratio test for the dispersion parameter (LRTα) is smaller than the significance level, over-dispersion is present and the negative binomial model is preferred. The negative binomial regression model is more appropriate for over-dispersed data because it relaxes the constraint of equal mean and variance.
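The equal mean and variance restriction can be checked numerically. The sketch below assumes the common NB2 form of the negative binomial distribution, with variance μ + αμ²; the text does not state which parameterization was used, and the values μ = 0.6 and α = 1.2 are illustrative, not estimates from the study.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of observing the count y when the mean is mu."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def negbin_pmf(y, mu, alpha):
    """NB2 probability of y, with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def moments(pmf, upper=400):
    """Mean and variance from a pmf, summing over the counts 0 .. upper - 1."""
    mean = sum(y * pmf(y) for y in range(upper))
    var = sum((y - mean) ** 2 * pmf(y) for y in range(upper))
    return mean, var

mu, alpha = 0.6, 1.2   # illustrative values, not estimates from the study
print("Poisson mean, variance:", moments(lambda y: poisson_pmf(y, mu)))
print("NB2 mean, variance:    ", moments(lambda y: negbin_pmf(y, mu, alpha)))
```

Both distributions share the same mean, but the NB2 variance exceeds it by αμ², which is the extra flexibility the negative binomial model offers for over-dispersed counts.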
In the general Poisson regression model, $\mu_i$ is the expected number of under-five deaths for the $i$-th mother and $N_i$ is the total number of children ever born to the $i$-th mother. This means the parameter $\mu_i$ depends on the population size, that is, on the total number of children ever born to the individual mother. Thus the distribution of $Y_i$ can be written as:

$$P(Y_i = y_i) = \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots, \qquad \mu_i = N_i\,\theta_i,$$

where $\theta_i$ is the total fertility rate of the $i$-th mother and $\theta_i = \exp(x_i^T \beta)$. The logarithm of the number of children ever born is introduced in the regression model as an offset, so that $\log \mu_i = \log N_i + x_i^T \beta$.

For estimating the regression coefficients $\beta$ and the dispersion parameter $\alpha$ of the negative binomial model, the Newton-Raphson iteration procedure is applied, as in the Poisson model.
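A minimal sketch of Newton-Raphson estimation for a Poisson model with an offset is given here, assuming the log-linear form log μ_i = log N_i + b0 + b1·x_i with a single covariate. The function name `fit_poisson_offset` and the six data rows are hypothetical and serve only to show the iteration; this is not the estimation code used in the study.

```python
import math

def fit_poisson_offset(x, y, exposure, tol=1e-10, max_iter=50):
    """Newton-Raphson for log(mu_i) = log(N_i) + b0 + b1 * x_i."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        mu = [n * math.exp(b0 + b1 * xi) for xi, n in zip(x, exposure)]
        # Score vector: X^T (y - mu)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: X^T diag(mu) X, a symmetric 2 x 2 matrix
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information times score
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (-i01 * u0 + i00 * u1) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical data: x is a binary covariate, N the children ever born,
# y the number of under-five deaths for each mother.
x = [0, 0, 0, 1, 1, 1]
N = [2, 4, 6, 3, 5, 4]
y = [0, 1, 1, 1, 2, 3]

b0, b1 = fit_poisson_offset(x, y, N)
print("intercept:", b0, "slope:", b1)
print("rate ratio for x = 1 versus x = 0:", math.exp(b1))
```

With a single binary covariate the maximum likelihood estimates have a closed form, namely the log of deaths per child in each group, so the iteration can be checked against log(2/12) for the intercept and log(3) for the slope in this toy data set.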

Zero-inflated model
In some cases, count data contain excess zeros, which are themselves a source of overdispersion. In such cases, the NB model cannot adequately handle the overdispersion that is due to the large number of zeros. Zero-inflated (ZI) models can be used as an alternative.

Zero-inflated Poisson regression model
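For the zero-inflated negative binomial (ZINB) model, the probability mass function is

$$P(Y_i = y_i) = \begin{cases} \omega_i + (1-\omega_i)\,(1+\alpha\mu_i)^{-1/\alpha}, & y_i = 0, \\[4pt] (1-\omega_i)\,\dfrac{\Gamma(y_i + 1/\alpha)}{y_i!\,\Gamma(1/\alpha)}\,\dfrac{(\alpha\mu_i)^{y_i}}{(1+\alpha\mu_i)^{y_i + 1/\alpha}}, & y_i = 1, 2, \ldots \end{cases}$$

where $\mu_i$ is the mean of the underlying negative binomial distribution, $\alpha > 0$ is the overdispersion parameter, assumed not to depend on covariates, and $0 \le \omega_i \le 1$. The parameters $\mu_i$ and $\omega_i$ depend on vectors of covariates $x_i$ and $z_i$, respectively.
The log-likelihood function $l = l(\alpha, \mu_i, \omega_i; y)$ for the ZINB model is

$$l = \sum_{i:\,y_i = 0} \ln\!\left[\omega_i + (1-\omega_i)(1+\alpha\mu_i)^{-1/\alpha}\right] + \sum_{i:\,y_i > 0} \left[\ln(1-\omega_i) + \ln\Gamma(y_i + 1/\alpha) - \ln\Gamma(1/\alpha) - \ln y_i! + y_i \ln(\alpha\mu_i) - (y_i + 1/\alpha)\ln(1+\alpha\mu_i)\right].$$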
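A ZINB log-likelihood of this kind can be sketched in a few lines. The code assumes the common NB2 parameterization, in which the count part has mean μ and variance μ + αμ², and a zero-inflation probability ω strictly below 1; the function names and all numeric values are hypothetical and are not taken from the study.

```python
import math

def zinb_logpmf(y, mu, alpha, omega):
    """Log-probability of the count y under a ZINB model (NB2 form assumed).

    mu > 0 is the mean of the negative binomial part, alpha > 0 the
    overdispersion parameter and 0 <= omega < 1 the zero-inflation probability.
    """
    r = 1.0 / alpha
    log_nb = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
              + y * math.log(alpha * mu) - (y + r) * math.log(1.0 + alpha * mu))
    if y == 0:
        # A zero is either a structural zero or a zero from the NB part.
        return math.log(omega + (1.0 - omega) * math.exp(log_nb))
    return math.log(1.0 - omega) + log_nb

def zinb_loglik(ys, mus, alpha, omegas):
    """Sum of the ZINB log-probabilities over all observations."""
    return sum(zinb_logpmf(y, m, alpha, w) for y, m, w in zip(ys, mus, omegas))

# Hypothetical observations with mother-specific means and inflation probabilities.
ys = [0, 0, 1, 3]
mus = [0.4, 0.9, 0.7, 1.5]
omegas = [0.30, 0.20, 0.25, 0.10]
print("log-likelihood:", zinb_loglik(ys, mus, alpha=1.3, omegas=omegas))
```

In a fitted model the means and inflation probabilities would be functions of covariates, typically through a log link and a logit link; the links are an assumption here because the text does not state them.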
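To compare two nested models, with model 1 nested in model 2, the likelihood ratio statistic $LR = -2\,[LL(\text{model 1}) - LL(\text{model 2})]$ is used. This statistic has a chi-squared distribution with 1 degree of freedom, where LL is the log-likelihood. If the statistic is greater than the critical value, model 2 is better than model 1.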
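The likelihood ratio comparison can be sketched as follows, assuming the statistic is minus twice the difference in log-likelihoods between the restricted model 1 and the more general model 2. The function name and the two log-likelihood values are hypothetical. For 1 degree of freedom the chi-squared tail probability equals erfc(sqrt(LR / 2)), so only the standard library is needed.

```python
import math

def likelihood_ratio_test(ll_model1, ll_model2):
    """LR statistic and chi-squared(1) p-value; model 1 is nested in model 2."""
    stat = -2.0 * (ll_model1 - ll_model2)
    # Upper tail of a chi-squared variable with 1 df: P(Z^2 > stat)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods of a Poisson fit (model 1) and an NB fit (model 2).
stat, p_value = likelihood_ratio_test(-1000.0, -990.0)
print("LR statistic:", stat)
print("p-value:", p_value)
print("model 2 preferred at the 5% level:", p_value < 0.05)
```

When the hypothesis places the dispersion parameter on the boundary of its range (α = 0), some texts halve this p-value; the sketch keeps the plain 1 degree of freedom rule described above.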

Vuong Test
The Vuong test is a non-nested test based on a comparison of the predicted probabilities of two models that do not nest [21]. The Vuong statistic is needed to assess the appropriateness of zero-inflated models against the standard count models. For testing the relevance of using zero-inflated models, the statistic is

$$V = \frac{\sqrt{n}\,\bar{m}}{s_m}, \qquad m_i = \ln\!\left[\frac{\hat{P}_1(y_i \mid x_i)}{\hat{P}_2(y_i \mid x_i)}\right],$$

where $\bar{m}$ is the mean of $m_i$, $s_m$ is its standard deviation, $n$ is the sample size, and $\hat{P}_1$ and $\hat{P}_2$ are the predicted probabilities under the first and second model. The null hypothesis of the test is that the two models are equivalent. Vuong showed that, asymptotically, V has a standard normal distribution. As Vuong notes, the test is directional [21].
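A Vuong statistic can be computed from the per-observation log-probabilities of two fitted models, as in the sketch below. It assumes that m_i is the log ratio of the predicted probabilities of the observed counts and that s_m is the sample standard deviation with divisor n - 1; the function names and the eight log-probabilities are hypothetical.

```python
import math

def vuong_statistic(logprob1, logprob2):
    """V = sqrt(n) * mean(m) / sd(m), where m_i = log P1(y_i) - log P2(y_i)."""
    m = [a - b for a, b in zip(logprob1, logprob2)]
    n = len(m)
    m_bar = sum(m) / n
    s_m = math.sqrt(sum((mi - m_bar) ** 2 for mi in m) / (n - 1))
    return math.sqrt(n) * m_bar / s_m

def vuong_decision(v, z_crit=1.96):
    """Directional decision at the two-sided 5% level (z_crit = 1.96)."""
    if v > z_crit:
        return "model 1"
    if v < -z_crit:
        return "model 2"
    return "no preference"

# Hypothetical log-probabilities of four observations under two models.
logprob_zi = [-1.0, -0.5, -2.0, -0.8]    # for example a zero-inflated model
logprob_std = [-1.2, -0.9, -2.1, -1.4]   # for example a standard count model

v = vuong_statistic(logprob_zi, logprob_std)
print("V =", v, "->", vuong_decision(v))
```

Swapping the two models changes only the sign of V, which is what makes the test directional.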
If V > Zα/2, the first model is preferred.
If V < −Zα/2, the second model is preferred.
If |V| < Zα/2, neither model is preferred.

Results
Among the children born to the 8,668 women considered, 5,180 died before the age of five (see Table 1). Table 2 shows that the sample mean of the response variable, the number of under-five deaths, was 0.5976 while the sample variance was 1.052. The fact that the variance exceeds the mean suggests overdispersion. Moreover, the data have excess zeros, so one might expect that the Poisson model would not be appropriate for predicting the number of under-five deaths.
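The overdispersion and excess-zero arguments can be checked with simple arithmetic on the reported summary statistics. The sketch below uses the mean and variance from Table 2 and the 35.5% share of mothers with at least one under-five death reported in the Discussion. It ignores covariates and the offset, so it is only a rough marginal check, and the method-of-moments value of α assumes the NB2 variance form μ + αμ².

```python
import math

mean_deaths = 0.5976       # sample mean reported in Table 2
var_deaths = 1.052         # sample variance reported in Table 2
share_with_death = 0.355   # share of mothers with at least one death (Discussion)

# Equals 1 for a Poisson variable; values above 1 indicate overdispersion.
dispersion_index = var_deaths / mean_deaths

# Zero probability implied by a Poisson distribution with the same mean.
poisson_zero_prob = math.exp(-mean_deaths)
observed_zero_share = 1.0 - share_with_death

# Method-of-moments dispersion under var = mean + alpha * mean**2 (NB2 assumed).
alpha_mom = (var_deaths - mean_deaths) / mean_deaths ** 2

print("variance / mean:", round(dispersion_index, 3))
print("Poisson P(Y = 0):", round(poisson_zero_prob, 3))
print("observed share of zeros:", round(observed_zero_share, 3))
print("moment estimate of alpha:", round(alpha_mom, 3))
```

The observed share of mothers with no under-five death is larger than the zero probability of a Poisson distribution with the same mean, which is in line with the excess zeros described above.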
As shown in Figure 1, the distribution of the number of under-five deaths has a rapidly decreasing tail and is highly skewed to the right, with excess zeros. The results in Table 7 showed that region has a significant effect on the number of under-five deaths. The effect of mother's age at first birth can likewise be interpreted from the estimates in Table 7.

Interpretation of ZINB regression model for covariates of zero counts in rural parts of Ethiopia
As shown in Table 7, mother's marital status has a significant effect on the probability of being an excess zero. The odds of being in the always-zero group are higher by a factor of 2.82 for unmarried mothers than for married mothers, controlling for the other variables in the model.
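The reported factor of 2.82 can be turned into a coefficient and into probabilities with a short calculation. The sketch assumes that the zero-inflation part uses a logit link, which is the usual choice but is not stated in the text. The baseline probability of 0.30 for married mothers is a hypothetical value used only to show the arithmetic.

```python
import math

odds_factor = 2.82              # reported factor for unmarried versus married mothers
coef = math.log(odds_factor)    # implied coefficient on the logit scale

def shifted_probability(p_baseline, factor):
    """Probability obtained after multiplying the baseline odds by factor."""
    odds = p_baseline / (1.0 - p_baseline) * factor
    return odds / (1.0 + odds)

p_married = 0.30                # hypothetical baseline, not taken from Table 7
p_unmarried = shifted_probability(p_married, odds_factor)

print("implied logit coefficient:", round(coef, 4))
print("always-zero probability, married (assumed):", p_married)
print("always-zero probability, unmarried:", round(p_unmarried, 4))
```

A factor on the odds scale does not translate into the same factor on the probability scale; the change in probability depends on the baseline.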

Discussion
This study was carried out to identify the risk factors of under-five mortality in rural areas based on the EDHS 2011 data. The total number of women from rural areas included in the present study was 8,668, of whom 35.5% experienced under-five deaths due to different factors. Four candidate count models were compared, and the ZINB regression model was selected as the most appropriate model for rural parts of Ethiopia.
As the results show, the number of under-five deaths in rural Ethiopia varies widely among regions. This result is consistent with the findings by [7].
In this study, mother's education was found to be an important socio-economic predictor of the number of under-five deaths in rural Ethiopia. Under-five mortality decreased as the mother's level of education increased. This result is consistent with the findings by [24], [5], [1], [13], [12] and [17].
The current study revealed that children whose mother's age at first birth was below 15 years had a higher risk of dying than children whose mother's age at first birth was 15 and above. A similar study in Kenya [7] also found that mother's age at first birth has a significant effect on infant and child mortality, showing that children born to younger mothers experienced the highest risk of dying. These findings are consistent with the findings of [1] and [17].
According to the results, mother's current breastfeeding status was a significant determinant of under-five mortality: children born to non-breastfeeding mothers experience a higher risk of mortality than children born to breastfeeding mothers. This result is consistent with the findings by [16] and [17].
The findings suggested that under-five child mortality risk is higher for children of poor mothers compared to children of medium and rich mothers. This finding is consistent with [16], [7], [22] and [5].
The study indicated that children born to working mothers have a higher risk of mortality than children born to non-working mothers. This finding is consistent with [5].
The results of the ZINB model indicated that marital status has a significant effect on the odds of being in the always-zero group. The odds of being in the always-zero group were higher for unmarried mothers than for married mothers.

Conclusion
The main objective of the study was to identify some of the factors that influence the number of infant and child deaths, not only at the national level but also separately at rural and urban levels. The study was based on secondary data obtained from the Central Statistical Agency of Ethiopia. Among the four models considered for analyzing the data from women in rural areas, the ZINB regression model was found to be the most appropriate. Factors influencing the number of under-five deaths have been identified. The study revealed that mother's age at first birth, current breastfeeding status, wealth index, mother's current working status, region and mother's level of education had statistically significant effects on the number of under-five deaths in rural parts of Ethiopia.
Based on our findings, we make the following recommendations.
There is a need for comprehensive prevention strategies that will help to further reduce child mortality.
The government and the Ministry of Health should give greater attention to improving immunization services and concentrate on health education campaigns for mothers and for the community.
Early marriages should be discouraged and awareness about the danger of giving birth at early ages should be created through education.
Health interventions should particularly be targeted towards women who are suffering from illness and weakness to allow them to continue breastfeeding.
Efforts should be made to provide better access to education and health facilities for mothers so that the gap in under-five mortality is bridged.