Human Capital Impacts of Income Inequality: An Extensive Empirical Analysis From The African Continent


This paper evaluates the impact of income inequality on life expectancy in African countries. The empirical analysis was performed on a panel dataset of 52 African nations covering the period 1995-2018. To estimate the relationship, I employed the Two-Stage Least Squares (2SLS) technique and a Panel Error Correction Model (PECM). The long-run cointegrating relationship was estimated using a Panel Dynamic Ordinary Least Squares (PDOLS) estimator. The outputs of both the static and dynamic estimation models suggest that income inequality has negatively affected life expectancy at birth in the African continent overall. Though a positive short-run causal relationship was established, in the long run income inequality had deleterious effects. A series of checks was carried out to assess the soundness of the main empirical results, and these confirmed that the results are robust.


Introduction
"A nation will not survive morally or economically when so few have so much and so many have so little" - Bernie Sanders (US Senator)

In 2011, the World Economic Forum identified income inequality and corruption as the two most serious challenges facing the world (World Economic Forum, 2011). The World Social Report 2020, published by UNDESA, asserts that inequality is rising for more than 70 percent of the global population, thwarting socio-economic development. It has been well over four decades since Rodgers (1979) reported the association between health outcomes and income inequality in his seminal work. His work, along with Wilkinson's (1992, 1996), gained traction and inspired further research on the subject in the following decades (e.g., Pickett & Wilkinson, 2015; Truesdale & Jencks, 2016).

One of the several channels through which income inequality proves detrimental to the economy is the health facet of human capital. As mentioned by Wilkinson (1996), the distribution of income is "one of the most powerful influences on the health of whole population in the developed world to have come to light". Two primary hypotheses prevail to explain the ecological association between population health and income inequality: the absolute income hypothesis and the contextual inequality hypothesis, also known as the "Wilkinson Hypothesis". The absolute income hypothesis maintains that an uneven income distribution does not directly affect individual health. Instead, the association between income inequality and population health is purely an outcome of a curvilinear relation between income and individual health, i.e., a diminishing return of health to income (Deaton, 2003; Gravelle, 1998). The argument is depicted in Figure 1.
Assume that health is a nonlinear concave function of income, such that an increase in income yields diminishing returns to health, and that a hypothetical economy consists of only two equal-sized cohorts, the rich with income $x_4$ and the poor with income $x_1$. The average population health is $y_1$. If an amount of money ($x_4 - x_1$) is taken away from the rich and given to the poor, the poor's health will improve and the rich's health will deteriorate. Due to the diminishing health return to income, the poor will have gained more health than the rich will have lost. Therefore, population health increases from $y_1$ to $y_2$ as income inequality decreases, though the average income in the society remains unchanged.
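The concavity argument can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the logarithmic health function, the two income levels and the size of the transfer are hypothetical choices, not values taken from Figure 1 or from the data used in this paper.

```python
import math

def health(income: float) -> float:
    """Hypothetical concave health function (log of income); an assumption for illustration."""
    return math.log(income)

def mean_health(incomes):
    """Average health across equal-sized cohorts."""
    return sum(health(x) for x in incomes) / len(incomes)

# Two equal-sized cohorts; the income levels are made-up numbers.
before = [1_000.0, 9_000.0]   # poor, rich
transfer = 2_000.0            # amount moved from the rich to the poor
after = [before[0] + transfer, before[1] - transfer]

mean_income_before = sum(before) / len(before)
mean_income_after = sum(after) / len(after)
health_before = mean_health(before)
health_after = mean_health(after)

print(f"mean income: {mean_income_before:.0f} -> {mean_income_after:.0f}")
print(f"mean health: {health_before:.3f} -> {health_after:.3f}")
```

With any strictly concave health function, a transfer that narrows the income gap leaves mean income unchanged and raises mean health, which is the mechanism the absolute income hypothesis relies on.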
On the other hand, the Wilkinson hypothesis posits a direct, contextual and causal impact of income inequality on the health of individuals, over and above the underlying individual-level socio-economic determinants of health, including the individual's income (Wilkinson, 1992, 1996, 2001). Income inequality hinders nutrition consumption and access to healthcare services because of the disproportionately low income accruing to poor households (Birdsall et al., 1995). Under income inequality, health conditions are aggravated even beyond the effects of absolute poverty (or income) owing to the psychological consequences of social and status comparisons, a rise in violent crimes and

Literature Review
A multitude of empirical studies have been conducted to understand the income inequality-health hypothesis and, consequently, to shed light on how income distributions affect human capital from a health perspective across countries. Just as economists' views on the subject are not unanimous, the empirical findings are not uniform either. A broad spectrum of findings exists: most conform to a negative impact of income inequality on health, some find no relationship, and some reach exactly the opposite conclusion. For convenience in analysing the existing studies, the review is presented in tabular format below:

Model Specification and Variable Descriptions
The impact of income inequality on life expectancy has been studied empirically for the case of African economies.
Both static and dynamic analyses using panel data were carried out to capture the impacts extensively.
For the static analysis, this article resorts to the IV-2SLS estimation technique. The model is specified as follows:

$$gini_{i,t} = \alpha_0 + \alpha_1 ag_{i,t} + \alpha X_{i,t} + m_{i,t} \qquad (1)$$

$$liex_{i,t} = \beta_0 + \beta_1 gini_{i,t} + \beta X_{i,t} + n_{i,t} \qquad (2)$$

For the dynamic counterpart, the long- and short-run relations between income inequality and life expectancy are estimated using a Panel Error Correction Model (PECM) of the form as follows:

where liex is life expectancy at birth and the dependent variable; gini is the income inequality index; health is government health expenditure and the independent variable of interest; ag is agricultural land per capita and the instrument; $\lambda_i$ and $\theta_t$ in the short-run equation (4) are the country fixed effects and time dummies to control for unobserved heterogeneity and cross-sectional dependence (CSD); and the term in the parenthesis in equation (4)

Rasella et al., 2013). They are as follows:

afr is the adolescent fertility rate. The Global Health Observatory, an initiative of the World Health Organisation, reports that the WHO African region has the highest adolescent fertility in the world (118 births per 1000 women). It has been hypothesized that with repeated pregnancies, women become depleted of resources that would otherwise be available for maintenance and repair of the body, which leads to higher mortality. When childbirth occurs at an adolescent age, the chance of health complications increases, which decreases life expectancy even more.

edin represents the education index. Educational attainment is one of the crucial social determinants of increased life expectancy. Educated individuals tend to make better overall lifestyle choices, which positively affects longevity.

heaexpen is the total health expenditure, which is a determinant of the quality and expectancy of human life. Better health facilities lead to a healthier population.

urb is the urban population, used as a measure of urbanisation. Urbanisation leads to multidirectional development, enhancement of lifestyle and an increase in facilities, including in the health sector, which might improve quality of life. However, urbanisation also leads to environmental pollution and an increase in criminal activity, amongst other things, which might prove counterproductive.

regsc is the regime score, with lower scores indicating autocratic and higher scores indicating democratic forms of government. It is a prevailing theory that democracy improves population health. Wigley and Akkoyunlu (2011) shed light on this argument by explaining that democratic regimes distribute health-promoting resources more widely than autocratic regimes.

saf is access to basic drinking water services. This is used as a control for basic sanitation. The World Health Organisation reports that every year more than 3.4 million people die as a result of water-related diseases, making it the leading cause of disease and death around the world. Most of the victims are young children, the vast majority of whom die of illnesses caused by organisms that thrive in water sources contaminated by raw sewage. Hence, access to basic water services is an important determinant of life expectancy.

The final control variable, gdppc, is the real GDP per capita, which is employed to control for economic prosperity. Economic prosperity leads to a higher standard of living and, subsequently, a healthier population.
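The two-stage logic of equations (1) and (2) can be sketched as follows. This is a minimal illustration on simulated data, not the estimation code used in this paper: the coefficient values, the single control variable and the unobserved confounder are all hypothetical, and the variable names merely mirror those in the equations.

```python
import numpy as np

def ols(y, X):
    """Ordinary least squares coefficients via a least-squares solve."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, endog, instrument, controls):
    """Manual 2SLS: returns (first-stage, second-stage) coefficient vectors."""
    const = np.ones((len(y), 1))
    # Stage 1: regress the endogenous regressor on the instrument and controls.
    Z = np.hstack([const, instrument.reshape(-1, 1), controls])
    first = ols(endog, Z)
    endog_hat = Z @ first
    # Stage 2: replace the endogenous regressor by its fitted values.
    X2 = np.hstack([const, endog_hat.reshape(-1, 1), controls])
    second = ols(y, X2)
    return first, second

# Simulated data (all numbers are assumptions for illustration).
rng = np.random.default_rng(0)
n = 5000
ag = rng.normal(size=n)                # instrument
x = rng.normal(size=(n, 1))            # one control variable
u = rng.normal(size=n)                 # unobserved confounder
gini = 0.8 * ag + 0.3 * x[:, 0] + u + rng.normal(size=n)
liex = 1.0 - 0.5 * gini + 0.4 * x[:, 0] + 2.0 * u + rng.normal(size=n)

first, second = two_stage_least_squares(liex, gini, ag, x)
beta_ols = ols(liex, np.hstack([np.ones((n, 1)), gini.reshape(-1, 1), x]))

print("OLS estimate of beta_1: ", round(float(beta_ols[1]), 3))
print("2SLS estimate of beta_1:", round(float(second[1]), 3))
```

In this simulation the confounder pushes the naive OLS coefficient away from the true value of -0.5, while the instrumented estimate stays close to it. Note that standard errors computed from a manually run second stage are not valid; a dedicated IV routine would be needed for inference.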
Estimating the long-run relationship presupposes that a cointegrating relationship exists between the variables and that the series are non-stationary. It must also be taken into account that in longer panels, individual time series are probably affected by the same common factors, resulting in the presence of CSD. Hence, a couple of preliminary testing procedures are carried out. I begin by testing for the presence of CSD following Pesaran (2015) and subsequently test for the presence of unit roots. For series with CSD, the CIPS test developed by Pesaran (2007) is used. The CIPS test belongs to the second-generation panel unit root tests and is robust in the presence of CSD. To test for the existence of cointegration, the Kao cointegration test has been used, with the lags selected by the AIC.
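As an illustration of the first preliminary step, the sketch below computes the CD statistic commonly associated with Pesaran's cross-sectional dependence test for a balanced panel. It is a simplified sketch on simulated data: the panel dimensions are borrowed from this study, but the series are artificial, and the unbalanced-panel adjustments that real data would require are not handled.

```python
import numpy as np

def pesaran_cd(panel: np.ndarray) -> float:
    """CD statistic for a balanced T x N panel (each column is one country)."""
    T, N = panel.shape
    corr = np.corrcoef(panel, rowvar=False)      # N x N pairwise correlations
    upper = corr[np.triu_indices(N, k=1)]        # each pair (i < j) counted once
    return float(np.sqrt(2.0 * T / (N * (N - 1))) * upper.sum())

rng = np.random.default_rng(42)
T, N = 24, 52   # dimensions borrowed from the paper; the data are simulated

# Panel 1: cross-sectionally independent series.
independent = rng.normal(size=(T, N))

# Panel 2: every series loads on one common factor (hypothetical setup).
common_factor = rng.normal(size=(T, 1))
dependent = common_factor + rng.normal(size=(T, N))

cd_independent = pesaran_cd(independent)
cd_dependent = pesaran_cd(dependent)

print("CD, independent panel:", round(cd_independent, 2))
print("CD, common-factor panel:", round(cd_dependent, 2))
```

Under the null of no (or only weak) cross-sectional dependence, the statistic is approximately standard normal in large samples, so a value far from zero, as in the common-factor panel, points to CSD.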
The cointegrating relation of equation (3) is estimated using the Panel Dynamic OLS (PDOLS) estimator, whose lags and leads were selected by the AIC, and the Fully Modified OLS (FMOLS) estimator. By adding lags and leads to the variables in (3), these estimators control for potential simultaneity bias. The parametric DOLS is preferred to the non-parametric FMOLS in that the latter (unlike the former) imposes the additional requirements that all variables be integrated of the same order and that the regressors themselves not be cointegrated. Hence, I mainly use the PDOLS estimator. The FMOLS estimation outputs are provided only to check whether there are any contradictory results. I then estimate the PECM specification of equation (4) using a standard fixed effects estimator.
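The idea behind the dynamic OLS correction can be sketched for a single time series. The example below is a simplified, single-unit illustration on simulated data, not the panel estimator used here: the number of leads and lags is fixed at one rather than chosen by the AIC, and the true slope of 0.7 is a hypothetical value.

```python
import numpy as np

def dols_slope(y, x, p=1):
    """Long-run slope from y_t = a + b*x_t + sum_{j=-p..p} c_j * dx_{t+j} + e_t."""
    T = len(y)
    dx = np.diff(x)   # dx[k] = x[k+1] - x[k], so the change at time t is dx[t-1]
    rows = range(p + 1, T - p)   # periods for which all leads and lags exist
    X = np.array([[1.0, x[t]] + [dx[t + j - 1] for j in range(-p, p + 1)]
                  for t in rows])
    coef = np.linalg.lstsq(X, y[p + 1:T - p], rcond=None)[0]
    return float(coef[1])

rng = np.random.default_rng(1)
T = 500
shocks = rng.normal(size=T)
x = np.cumsum(shocks)                         # I(1) regressor (random walk)
errors = 0.5 * shocks + rng.normal(size=T)    # errors correlated with shocks to x
y = 2.0 + 0.7 * x + errors                    # cointegrated with x, slope 0.7

slope = dols_slope(y, x, p=1)
print("DOLS long-run slope:", round(slope, 3))
```

Including the leads and lags of the differenced regressor absorbs the correlation between the regressor's innovations and the error term, which is the simultaneity problem the estimator is designed to address.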

Instrument Core Identification Strategy
Instrumenting inequality is called for because the outcome variable itself affects inequality. It is an established fact that human capital is a necessary determinant of overall economic and individual well-being.
Narrowing down to the health aspect of human capital, poor health can reduce productivity by limiting one's ability to work and inhibiting educational attainment, and can lead to medical debt and bankruptcy. This, in turn, thwarts income generation and aggravates income inequality. Evidence exists attesting to the direction of causality from health to income (e.g., Smith, 1999). Hence, I propose agricultural land per capita as an instrument for inequality.
Agriculture is by far the single most crucial activity in Africa. The agricultural sector provides employment to about two-thirds of the continent's working population and, for each country, contributes an average of 30 to 60 percent of GDP.
Given that agriculture is the main livelihood of the people, agricultural land is most likely to be the key factor behind income generation and, consequently, income inequality. A correlation coefficient of 0.3959 between the instrument and income inequality confirmed the potential explanatory power of the instrument. However, contrary to the argument that more agricultural land available to people would generate more income and reduce income inequality, the correlation was found to be positive. This is entirely plausible, for the following reasons. Firstly, the poorer population might own too little land to generate sufficient income, so that the entire produce of the land is consumed by the owners.
That brings me to the second point. Even if larger land areas are available, poor land owners might not have sufficient means to produce at a sustainable level. In such cases, large land areas might become a burden. Thirdly, the data on

Data Type and Source
The study involves annual data from a panel of 52 African countries, ranging over the period from 1995 to 2018.

Results and Discussion
The IV-2SLS estimation outputs are reported in Table 2 along with the first-stage regression and other necessary test results.
Note: ***, ** and * represent significance at the 1%, 5% and 10% levels respectively.
Following the CSD test, the unit root tests are carried out. The lag specification ranges from 0 to 1. The null hypothesis of a unit root cannot be rejected for the series with non-significant test statistics. Hence, the variables with significant test statistics (urb and regsc) are dropped from the rest of the dynamic analysis.
The outcomes of the Kao cointegration tests (optimal lags selected by the AIC) are displayed in Table 4. Three out of five test statistics are significant, which confirms that a long-run cointegrating relationship exists in equation (3).

Table 6 shows the estimation results of equation (4), obtained from a fixed effects model. ECT is the error correction term. From the PECM outputs, it is observed that income inequality increased life expectancy in the short run, which is a surprising result. A potential explanation for this outcome could be that in the short run, the effects on the well-off section of the population are stronger. The coefficient of the ECT is negative and significant at the 1% level, indicating that short-run disturbances are persistent and that approximately 11.13% of a deviation from the long-run equilibrium is corrected each year. In other words, there is long-run convergence towards equilibrium.
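To give the adjustment speed a more tangible interpretation, it can be converted into a half-life. The sketch below assumes that a deviation decays geometrically at a constant annual rate, which is a simplification of the PECM dynamics; the coefficient of -0.1113 is inferred from the 11.13% figure quoted above rather than read from Table 6.

```python
import math

def half_life(adjustment_speed: float) -> float:
    """Years needed to close half of a deviation, given the share corrected per year."""
    if not 0.0 < adjustment_speed < 1.0:
        raise ValueError("adjustment speed must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(1.0 - adjustment_speed)

ect_coefficient = -0.1113          # inferred from the 11.13% adjustment in the text
speed = abs(ect_coefficient)

years_to_half = half_life(speed)
remaining_after_5_years = (1.0 - speed) ** 5

print(f"half-life of a deviation: {years_to_half:.2f} years")
print(f"share of deviation left after 5 years: {remaining_after_5_years:.3f}")
```

Under these assumptions, roughly half of a shock to the long-run relationship would still be present after about six years, which is consistent with the reading that short-run disturbances are persistent.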

Robustness Check
To check the robustness of the empirical findings, I re-estimated the model using fixed effects, random effects and pooled OLS estimators. The pooled OLS estimations were carried out with standard errors clustered by country, given that errors are likely to be correlated within countries. The outcomes are mostly similar (Appendix IV).
Additionally, to provide a brief historical context for the life expectancy-inequality relationship, data were collected for the period from 1820 to 1990, inclusive. However, the dataset was thin owing to limited data availability. All the available variables and observations were included. Notwithstanding the inaccuracies arising from the smaller number of control variables and data points, the results still confirm that income inequality has been inimical to life expectancy, as indicated by the negative coefficients on the independent variable of interest (Appendix V).
The multiple-step checks confirmed the robustness of the empirical findings. Hence, the answer to the primary research question is that inequality has a negative impact on human capital or, more precisely, on life expectancy at birth in African countries.

Conclusion
Comprehending the relevance and the impacts of diminishing socioeconomic inequalities, and in particular income inequality, in developing countries is of paramount importance. Inequality is detrimental not only from a socioeconomic perspective, but also for the health of a population. Bearing in mind the severity of income inequality in Africa and the mediocre progress made in ameliorating it, this paper investigates the impact of income inequality on the health aspect of human capital in the African continent.
The empirical analysis was carried out on a panel dataset consisting of data from 52 African nations, covering the period 1995-2018. Life expectancy at birth was used as the measure of population health. The relationship was estimated using IV-2SLS estimation and a Panel Error Correction Model. Agricultural land per capita was used as the instrument for the Gini index. The long-run cointegrating relationship was estimated using the PDOLS estimator. The outputs of both the static and dynamic estimation models suggested that income inequality has negatively affected life expectancy at birth. Though a positive short-run causal relationship was established, in the long run income inequality was prejudicial to population health. The robustness of the finding was confirmed through a series of checks.
However, the study is not free of shortcomings. Owing to limited data availability, the dataset had few observations and multiple gaps in the cross sections. Hence, enhanced panel cointegration tests such as Westerlund's could not be performed. For the same reason, I could not perform dynamic CCE estimation, which is robust in the presence of cross-sectional dependence, even though cross-sectional dependence was confirmed. The data availability problem also led to multiple cross sections being dropped in the PDOLS estimation. Hence, the dynamic analysis potentially contains a certain degree of inaccuracy. Lastly, the dynamic model was not checked for endogeneity or, to be precise, for the direction of causality. Scope for further study in this area includes finding more data, using more inclusive indices of income inequality, using household-level data for better policy formulation, and checking the direction of causality. If reverse causality exists, more advanced estimation methods such as CCE-GMM estimation should be employed, which requires finding suitable instruments.

Declarations
Availability of data and materials
The datasets generated and/or analysed during the current study are available in the World Bank World Development Indicators (2020 Q4 Edition), Polity5, UNFAO, UNDP HDR, SWIID, Penn World Table and Clio Infra repositories.

Competing interests
The author declares that he has no competing interests.

Not Applicable
Authors' contributions
Not applicable (solo author)