A Decomposition of Differences in Concentration Indices: With an Application to Socio-economic Inequality in Over-nutrition in India

Background: This paper proposes a new semi-parametric method to decompose the differences between two concentration indices. The statistical properties of copulas are used to model the dependence between health and socio-economic status. The proposed method is applied, along with existing decomposition methods, to differences in socio-economic inequality in over-nutrition between rural and urban areas in India. Methods: Taking advantage of the statistical properties of copulas, we first decompose the observed differences into the part due to differences in the dependence structures (the dependence effect) and the part due to differences in the marginal distributions of health (the health effect). Next, we decompose both effects further into the part explained by differences in the covariates in the model and the part that cannot be explained by them. Results: The results show that the differences in the proportion of Hindus and in the proportion of households that use safe cooking fuel contribute the most to the observed differences. Conclusions: Comparison among the different approaches suggests that the identifying assumptions play a substantial role in the decomposition analysis.


Introduction
The study of the relationship between health and socio-economic status (SES) has played an important role in health economics for decades. In order to measure socio-economic health inequality, a long list of indices has been proposed in the fields of both epidemiology and health economics (Regidor, 2004; Mackenbach and Kunst, 1997), among which the concentration index (CI) proposed by Wagstaff et al. (1991) and Kakwani et al. (1997) is considered a standard tool (Clarke et al., 2003). The concentration index quantifies the degree of socio-economic-related inequality in a health variable. Due to its analogy to the Gini coefficient and its decomposability into determinants, the concentration index has been widely used in many applications. [1] The factors underlying socio-economic health inequality can be various, and an understanding thereof is of great importance for policymakers seeking to identify effective intervention areas for mitigating health inequality. Wagstaff et al. (2003) proposed the seminal method (hereafter denoted as the "Wagstaff CI decomposition"), which allows one to decompose the concentration index into its contributing factors. Heckley et al. (2016) proposed an alternative method that decomposes the concentration index with the re-centred influence function (RIF) regression approach (Firpo et al., 2009) (hereafter denoted as the "RIF regression-based CI decomposition"). The two decompositions take quite different approaches, and we review their differences closely in Section 2. One of the major differences is that the Wagstaff CI decomposition relies on the additive linearity of health, while the RIF regression-based CI decomposition relies on the linearity of the RIF. This paper proposes a new semi-parametric approach to decomposing the change/difference in two concentration indices without the linearity assumption. Its assumptions and properties are explored in Section 3.
In Section 4, this paper applies the proposed decomposition method to analyse socio-economic inequality in over-nutrition between rural and urban populations in India. In both rural and urban areas, over-nutrition has a pro-rich gradient, and the gradient is significantly steeper in rural areas. Decomposition results show that more than half of the observed differences are explained by the difference in the distribution of covariates. The differences in the proportion of Hindus and in the proportion of households that use safe cooking fuel contribute greatly to these observed differences.
2 Related literature

Notation
This study uses the following notation. First, $N$ is the total sample size. Suppose we have two groups, A and B, which are composed of $N_A$ samples and $N_B = N - N_A$ samples, respectively. $g \in \{A, B\}$ signifies the group to which individuals belong. [2] $H^g$ denotes a continuous, non-negative, one-dimensional health variable for group $g$ with support $\mathcal{H}^g \subset \mathbb{R}$, while $\bar{H}^g$ denotes the mean of $H^g$. $Y^g$ is a continuous SES variable of group $g$ with support $\mathcal{Y}^g \subset \mathbb{R}$, and $X^g = \{X^g_1, ..., X^g_K\}$ is a vector of the exogenous determinants of health and SES of group $g$ with support $\mathcal{X}^g \subset \mathbb{R}^K$. These variables have the distributions $F^g_H$, $F^g_Y$ and $F^g_X$, and $F^g_{HY}$ is the joint distribution of $H^g$ and $Y^g$. Lower-case letters with subscript $i$, such as $h_i$ and $y_i$, denote individual $i$'s realised values.
The fractional rank of the $i$th person in the SES distribution of group $g$ is denoted by $R(y^g_i) = \sum_{m=1}^{i} \frac{1}{N_g}$, where the sum follows the ordering of the values of $Y^g$, sorted such that $y_1 \leq y_2 \leq ... \leq y_{N_g}$. The concentration index for group $g$, which is jointly determined by individual health and SES rank, is given, in its standard covariance form, by
$$CI^g = \frac{2}{\bar{H}^g} \, \mathrm{cov}\left(H^g, R(Y^g)\right).$$
The CI basically measures the correlation between a health variable and the rank of the SES variable, ranging from negative one to positive one. The sign indicates the direction of the relationship between the health variable and the position in the SES distribution, while the magnitude reflects both the strength of the relationship and the degree of variability in the health variable. If the health outcome is equally distributed by SES, the index equals zero.
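As an illustration of the definitions above, the following is a minimal sketch of the fractional rank and the concentration index in the covariance form, 2 cov(h, R) / mean(h). The function names and the use of the population covariance (dividing by N) are assumptions made for this example, and ties in SES are simply broken by input order.

```python
def fractional_rank(ses):
    """Fractional rank R(y_i) = i / N after sorting by SES (ties broken by input order)."""
    n = len(ses)
    order = sorted(range(n), key=lambda i: ses[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = position / n
    return rank


def concentration_index(health, ses):
    """Concentration index as 2 * cov(health, rank) / mean(health)."""
    n = len(health)
    rank = fractional_rank(ses)
    mean_h = sum(health) / n
    mean_r = sum(rank) / n
    # Population covariance between health and the fractional SES rank.
    cov = sum((h - mean_h) * (r - mean_r) for h, r in zip(health, rank)) / n
    return 2.0 * cov / mean_h


# Health rising with SES gives a positive (pro-rich) index.
ci_pro_rich = concentration_index([1, 2, 3, 4], [10, 20, 30, 40])
# Health equal for everyone gives an index of zero.
ci_equal = concentration_index([2, 2, 2, 2], [10, 20, 30, 40])
```

Reversing the SES ordering in the first call flips the sign of the index, which matches the interpretation of the sign given in the text.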

Wagstaff CI decomposition
The idea behind the Wagstaff CI decomposition is based on the conceptual assumption that inequality in health stems from inequalities in the determinants of health. It assumes a linear additive model of the health variable, $H$, as follows:
$$H = \alpha + \sum_{k=1}^{K} \beta_k X_k + \varepsilon,$$
where $\varepsilon$ is an unobservable factor. Under the additive linearity assumption, and with the coefficients estimated by OLS, the concentration index can be expressed as a linear function of the socio-economic inequalities of its determinants, weighted by their respective mean elasticities:
$$CI = \sum_{k=1}^{K} \frac{\beta_k \bar{X}_k}{\bar{H}} CI_k + \frac{GC_{\hat{\varepsilon}}}{\bar{H}}, \tag{3}$$
where $\bar{X}_k$ denotes the mean of $X_k$. Equation (3) is made up of two parts. The first part is the deterministic part, which corresponds to the weighted sum of the concentration indices of each covariate, namely $CI_k$. The weight is the elasticity of the health outcome with respect to $X_k$, namely $\beta_k \bar{X}_k / \bar{H} \equiv \eta_k$, which measures the share of $X_k$ explaining the concentration index. The product of the elasticity and $CI_k$ reflects the contribution made by $X_k$. The second part is based on $GC_{\hat{\varepsilon}}$, the generalised concentration index for the residual $\hat{\varepsilon}$, and is the residual component reflecting the socio-economic inequality in health that is unexplained by the covariates. Wagstaff et al. (2003) introduced an additional approach to explain changes/differences in the concentration index across different groups, by applying an Oaxaca-Blinder type of decomposition (Oaxaca, 1973; Blinder, 1973) and deriving the following equation:
$$\Delta CI = \underbrace{\sum_{k} \eta^B_k \left(CI^B_k - CI^A_k\right)}_{\text{concentration index effect}} + \underbrace{\sum_{k} CI^A_k \left(\eta^B_k - \eta^A_k\right)}_{\text{elasticity effect}} + \underbrace{\Delta\left(\frac{GC_{\hat{\varepsilon}}}{\bar{H}}\right)}_{\text{residual}}, \tag{4}$$
where $\Delta CI = CI^B - CI^A$ denotes the difference in the concentration index between groups B and A. Equation (4) shows that this difference comprises three factors: a difference in the concentration indices of the covariates (concentration index effect), a difference in elasticities (elasticity effect) and a residual.
Equation (4) shows the extent to which differences in health inequalities are attributable to differences in inequality in the determinants of health, and the extent to which they are due to differences in their elasticities (Wagstaff et al., 2003).
[2] In the context of the change in the concentration index over time, one can regard A and B as t − 1 and t.
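The within-group decomposition described above (elasticity times covariate concentration index, plus a residual term) can be sketched as follows. This is an illustrative implementation on simulated data, not the authors' code: the function names, the simulated variables and the use of the population covariance are all assumptions of the example.

```python
import numpy as np


def frac_rank(ses):
    """Fractional rank i / N of each observation in the SES ordering."""
    n = len(ses)
    rank = np.empty(n)
    rank[np.argsort(ses, kind="stable")] = np.arange(1, n + 1) / n
    return rank


def conc_index(v, rank):
    """Concentration index of v with respect to the SES rank (covariance form)."""
    return 2.0 * np.cov(v, rank, bias=True)[0, 1] / v.mean()


def wagstaff_decomposition(h, X, ses):
    """Return (contribution of each covariate, residual term) for one group."""
    rank = frac_rank(ses)
    Z = np.column_stack([np.ones(len(h)), X])          # add an intercept
    beta = np.linalg.lstsq(Z, h, rcond=None)[0]        # OLS coefficients
    resid = h - Z @ beta
    eta = beta[1:] * X.mean(axis=0) / h.mean()         # elasticities at the means
    ci_k = np.array([conc_index(X[:, k], rank) for k in range(X.shape[1])])
    gc_resid = 2.0 * np.cov(resid, rank, bias=True)[0, 1]  # generalised CI of residual
    return eta * ci_k, gc_resid / h.mean()


# Simulated data (hypothetical): two covariates, the second related to SES.
rng = np.random.default_rng(0)
n = 1000
ses = rng.lognormal(size=n)
X = np.column_stack([rng.uniform(1, 10, n), 5 + np.log1p(ses)])
h = 18 + 0.3 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=n)

contrib, resid_term = wagstaff_decomposition(h, X, ses)
total = conc_index(h, frac_rank(ses))
```

Because the covariance is linear in its first argument, the covariate contributions and the residual term add up to the overall concentration index; the between-group version then differences these pieces across groups.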

RIF regression-based CI decomposition
Heckley et al. (2016) argued that contributory factors in the Wagstaff CI decomposition are assumed not to affect the rank of the SES (rank ignorability). For example, Heckley et al. (2016) argued that the Wagstaff CI decomposition assumes that an increase in education years affects only the health outcome, keeping the SES rank unchanged. Heckley et al. (2016) proposed a method for the decomposition of the concentration index with the RIF (re-centred influence function) regression approach.
The RIF regression was first developed by Firpo et al. (2009) as an unconditional quantile regression method. The RIF regression can estimate the unconditional marginal effect of each covariate on the distributional statistic of interest. In the RIF regression method, we first prepare an influence function (IF) vector and then create a re-centred influence function (RIF) vector. Second, we run an OLS regression of the RIF on the covariates and obtain their coefficients, which show the unconditional marginal effects.
The RIF for the concentration index is the sum of its influence function (IF) and the concentration index. Heckley et al. (2016) derived the IF for the concentration index in a form that involves the influence function of the absolute concentration index.
As an important property, the expectation of the IF is zero (Firpo et al., 2009). The RIF of the concentration index is obtained by adding the concentration index to its IF:
$$\mathrm{RIF}(CI) = CI + \mathrm{IF}(CI).$$
Since the expectation of the influence function is zero, the expectation of the RIF is equal to CI. The RIF regression approach can be combined with the Oaxaca-Blinder decomposition (Oaxaca, 1973; Blinder, 1973) in order to analyse the differences in the two concentration indices. [3] Heckley et al. (2016) themselves did not extend their approach to implement the Oaxaca-Blinder decomposition for the difference in the concentration index, although extending the RIF regression-based decomposition to the concentration index is a straightforward undertaking. The goal of the RIF regression-based CI decomposition is to decompose observed differences in the concentration index into two parts: the part attributable to the difference in the association between the concentration index and the covariates (structural effect), and the part reflecting the difference in the distributions of the covariates (covariate effect).
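These two effects can be decomposed further into contributions made by individual covariates. Thanks to the additive linearity assumption, these effects are expressed as sums of the respective contributions made separately by each covariate. The detailed decomposition is implemented as follows:
$$\Delta CI = \sum_{k} \bar{X}_{k,B} \left(\hat{\gamma}_{B,k} - \hat{\gamma}_{A,k}\right) + \sum_{k} \left(\bar{X}_{k,B} - \bar{X}_{k,A}\right) \hat{\gamma}_{A,k},$$
where $\Delta_{S,k} = \bar{X}_{k,B} (\hat{\gamma}_{B,k} - \hat{\gamma}_{A,k})$ and $\Delta_{X,k} = (\bar{X}_{k,B} - \bar{X}_{k,A}) \hat{\gamma}_{A,k}$ are interpreted as the contributions made by $X_k$ to the structural and covariate effects, respectively.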
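The detailed decomposition can be sketched as below. The RIF values are taken as given inputs here, since computing them requires the influence function of the concentration index derived by Heckley et al. (2016); the simulated `rif_a` and `rif_b` arrays are hypothetical placeholders, and the function names are assumptions of this example.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients with an intercept in position 0."""
    Z = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0]


def oaxaca_blinder_detailed(rif_a, X_a, rif_b, X_b):
    """Per-covariate structural and covariate effects (element 0 is the intercept)."""
    gamma_a, gamma_b = ols(rif_a, X_a), ols(rif_b, X_b)
    mean_a = np.concatenate(([1.0], X_a.mean(axis=0)))
    mean_b = np.concatenate(([1.0], X_b.mean(axis=0)))
    structural = mean_b * (gamma_b - gamma_a)   # Delta_{S,k}
    covariate = (mean_b - mean_a) * gamma_a     # Delta_{X,k}
    return structural, covariate


# Placeholder data: in practice rif_a and rif_b hold the RIF of the CI per observation.
rng = np.random.default_rng(1)
X_a = rng.normal(0.0, 1.0, size=(400, 2))
X_b = rng.normal(0.5, 1.0, size=(300, 2))
rif_a = 0.10 + X_a @ np.array([0.05, -0.02]) + rng.normal(0, 0.1, 400)
rif_b = 0.20 + X_b @ np.array([0.08, -0.01]) + rng.normal(0, 0.1, 300)

structural, covariate = oaxaca_blinder_detailed(rif_a, X_a, rif_b, X_b)
```

Since an OLS fit with an intercept reproduces the sample mean of the dependent variable, the two sets of contributions sum to the difference in mean RIF between the groups, which is the difference in the concentration index when the RIF is correctly constructed.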
The RIF regression-based CI decomposition does not require the conditional expectation of health to be linear in X, which is an essential assumption used in the Wagstaff CI decomposition. Instead, the RIF regression-based CI decomposition assumes the linearity of the conditional expectation of the RIF. However, we should not overlook the fact that the additive linearity assumption for the RIF model is not innocuous. First, as the RIF regression-based decomposition is premised on a linear approximation of a nonlinear functional, it is important to keep in mind that the approximation could produce errors. Second, as argued in Rothe (2012), this assumption implies that the distributional statistics depend on the marginal distribution of the covariates only via their means, which implies that distributional information for covariates is not fully exploited. Furthermore, distributional statistics in general are composed of covariate interaction terms as well as individual factors (Rothe, 2015). Certainly, additive linearity makes interpretation far simpler, but this simplicity comes at the cost of prediction accuracy.
The RIF regression itself yields marginal effects for each covariate on the concentration index, implicitly incorporating effects of covariates on the SES as well as health. For example, the RIF regression can measure how an equal marginal increase in education for everyone would influence the concentration index, while taking account of its potential impact on not only the health outcome, but also the SES rank.
However, when the RIF regression is used for the decomposition analysis, due to this implicit effect, it is difficult to infer the underlying mechanism by which individual covariates in a model contribute to explaining the difference in the concentration index. There should exist a number of pathways whereby a difference in the covariate contributes to the difference in the concentration index; for example differences in covariates can contribute to a difference in the concentration index by changing the marginal distributions of health and SES, and/or changing the association between health and SES.
This paper proposes a new semi-parametric decomposition method which helps us to understand the underlying mechanism that generates the health inequality difference by focusing on the joint distribution of health and SES. Unlike the existing methods, the proposed approach separately analyses the marginal distributions of health and SES and the dependence structure between health and SES, which is made possible by taking advantage of the statistical properties of copulas. The proposed method allows us to explore the underlying mechanism that causes the difference/change in the two concentration indices. Furthermore, the proposed approach does not rely on functional form assumptions for health, SES and the concentration index, while the Wagstaff CI decomposition and the RIF regression-based CI decomposition require linearity assumptions.
The approach is composed of the following three steps, an overview of which is shown in Figure 1. The observed difference in the concentration index is first decomposed into the part that can be attributed to the difference in the dependence structure between health and SES distributions (dependence effect). The other part is due to differences in the marginal distribution of health (health effect). Second, these two effects are decomposed into a part that can be explained by the difference in the distribution of covariates X and a part which cannot be explained by it. Finally, the explained parts are decomposed into factors associated with the difference in each individual covariate distribution.

Model setting
Suppose $H^g$ and $Y^g$ are determined by the observed covariates $X^g$ and unobservable error terms, $\varepsilon$ and $\eta$, through unknown functions $m(.)$ and $n(.)$. We do not impose any specific functional form assumption.
We make an assumption on the relationship between X and the unobservable error terms.
Assumption 1 Conditional exogeneity: $\varepsilon$ and $\eta$ are independent of $g$, given $X = x$, $\forall x \in \mathcal{X}$.
Conditional exogeneity states that the distributions of the unobserved explanatory factors in the health and SES functions are the same across groups, once we condition on a vector of observed determinants.
Under this assumption, the effect resulting from differences in the distribution of X is not confounded by differences in the distributions of $\varepsilon$ and $\eta$. The plausibility of conditional exogeneity depends on the amount of information that X contains. Conditional exogeneity is essential if we are interested in exploring causal explanations of the decomposition analysis. In such a case, the Wagstaff CI decomposition and RIF regression-based decomposition also require this assumption. When the conditional exogeneity assumption is not satisfied, the decompositions show descriptive measures of the factors contributing to the difference in CI.
In general, joint distributions can be expressed by their marginal distributions and a copula function.
Copulas allow us to model the dependence structure of variables separately from their marginal distributions, and they have the property that the functional association between the underlying variables is not influenced by their marginal behaviours (Nelsen, 2006; Trivedi and Zimmer, 2007). According to Sklar's theorem, there exists a copula function, $C^g(., .)$, such that
$$F^g_{HY}(h, y) = C^g\left(F^g_H(h), F^g_Y(y)\right).$$
The copula function unites the marginal distributions of health and SES and determines the dependence structure of these two marginal distributions. Hence, the concentration index, which is a function of the joint distribution of H and Y, is re-written as
$$CI^g = CI\left(C^g(F^g_H, F^g_Y)\right).$$
Modelling the joint distribution with a copula allows us to analyse joint distribution behaviour by separately analysing its marginal distributions.
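The property that the dependence structure does not depend on marginal behaviour can be illustrated with rank-based pseudo-observations, which are a common empirical stand-in for the copula arguments. The sketch below uses simulated data and a hypothetical helper `pseudo_obs`; it shows that a strictly increasing transformation of health changes its marginal distribution but leaves the pseudo-observations untouched.

```python
import numpy as np


def pseudo_obs(v):
    """Empirical CDF values i / N of each observation (rank-based pseudo-observations)."""
    n = len(v)
    u = np.empty(n)
    u[np.argsort(v, kind="stable")] = np.arange(1, n + 1) / n
    return u


rng = np.random.default_rng(2)
y = rng.lognormal(size=500)                       # SES (simulated)
h = 20 + 2 * np.log(y) + rng.normal(size=500)     # health, positively related to SES

u_h, u_y = pseudo_obs(h), pseudo_obs(y)

# A strictly increasing transformation alters the marginal distribution of health...
h_transformed = np.exp(h / 10)
# ...but not its ranks, so the empirical dependence structure with SES is the same.
u_h_transformed = pseudo_obs(h_transformed)
same_dependence = np.array_equal(u_h, u_h_transformed)
```

Any dependence measure computed from `(u_h, u_y)`, such as a rank correlation, is therefore identical before and after the transformation.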

Overall decomposition -step 1-
First, we decompose the observed difference in the concentration indices (∆CI = CI B − CI A ) into (1) the dependence effect and (2) the health effect. The dependence effect is the part of the total difference that can be attributed to the difference in the dependence structure of health and SES distributions. The health effect is the other part due to differences in the marginal health distribution.
In order to implement the decomposition, we introduce a counterfactual concentration index, $CI(C^A(F^B_H, F^B_Y))$, which is the hypothetical concentration index estimated from the counterfactual distribution that has the marginal distributions of group B and the copula function of group A. This counterfactual concentration index can be interpreted as the hypothetical socio-economic health inequality that group B would face if they had the dependence structure of group A. Using this counterfactual concentration index, the observed difference can be decomposed as follows:
$$\Delta CI = \underbrace{CI\left(C^B(F^B_H, F^B_Y)\right) - CI\left(C^A(F^B_H, F^B_Y)\right)}_{\Delta_D \text{ (dependence effect)}} + \underbrace{CI\left(C^A(F^B_H, F^B_Y)\right) - CI\left(C^A(F^A_H, F^A_Y)\right)}_{\Delta_H \text{ (health effect)}}. \tag{14}$$
As equation (14) shows, the dependence effect is attributable to the difference in the copula function between groups A and B, while the health effect is attributable to the difference in the marginal distributions.
Equation (14) is formulated from the viewpoint of group A. Alternatively, we may define a counterfactual concentration index, $CI(C^B(F^A_H, F^A_Y))$, and then estimate the difference. In general, the choice of the reference group affects the result, but this is just a matter of choosing a meaningful counterfactual, and hence there is no right answer (Fortin et al., 2011). As such, the choice is contingent largely upon the context of the empirical study and the interests of researchers. This paper discusses the case in which $CI(C^A(F^B_H, F^B_Y))$ is chosen as the counterfactual, but this does not limit the general applicability of the method.
In order to calculate the counterfactual concentration index, the counterfactual outcomes of individual $i$ in group A are obtained by a transformation that maps the observed values onto the marginal distributions of group B, and the index is computed from the predicted values. In addition, the SES rank of individual $i$ in the counterfactual distribution is identical to that in the observed distribution of group A. This means $R(\hat{y}_i) = R(y^A_i)$ for all $i \in N_A$. Since the concentration index is based on a measurement of the correlation between health and SES rank rather than the value of SES itself, $\Delta_H$ in equation (14) reflects the part only due to the difference in the marginal health distribution. [4]
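A minimal sketch of step 1 follows, under the assumption that the counterfactual health values are obtained by a rank-preserving quantile mapping: each group-A observation keeps its rank but takes the value found at that rank in the group-B health distribution. The mapping via `np.quantile`, the function names and the simulated data are assumptions of this example rather than the paper's estimator.

```python
import numpy as np


def frac_rank(v):
    """Fractional rank i / N of each observation."""
    n = len(v)
    rank = np.empty(n)
    rank[np.argsort(v, kind="stable")] = np.arange(1, n + 1) / n
    return rank


def conc_index(h, y):
    """Concentration index of health h with respect to the rank of SES y."""
    return 2.0 * np.cov(h, frac_rank(y), bias=True)[0, 1] / h.mean()


def counterfactual_health(h_a, h_b):
    """Give each group-A observation the group-B health value at the same rank."""
    return np.quantile(h_b, frac_rank(h_a))   # interpolated empirical quantiles of B


def step1_decomposition(h_a, y_a, h_b, y_b):
    """Return (dependence effect, health effect) with group A's copula as reference."""
    ci_a, ci_b = conc_index(h_a, y_a), conc_index(h_b, y_b)
    # SES ranks are unchanged, so the observed group-A SES values can be reused.
    ci_cf = conc_index(counterfactual_health(h_a, h_b), y_a)
    return ci_b - ci_cf, ci_cf - ci_a


rng = np.random.default_rng(3)
y_a = rng.lognormal(size=600)
h_a = 21 + 1.5 * np.log(y_a) + rng.normal(size=600)
y_b = rng.lognormal(size=500)
h_b = 23 + 0.8 * np.log(y_b) + rng.normal(size=500)

dependence_effect, health_effect = step1_decomposition(h_a, y_a, h_b, y_b)
h_cf = counterfactual_health(h_a, h_b)
```

By construction the two effects add up to the observed difference in the concentration indices; which group supplies the copula is the reference-group choice discussed above.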

Overall decomposition -step 2 -
In the spirit of the Oaxaca-Blinder decomposition, we decompose both the dependence and health effects further into two parts to explore how much of each effect is associated with the difference in the covariate distribution between the two groups. The first part is a covariate effect, which is attributable to the difference in the covariate distribution, while the other part is a structural effect, which is due to the difference in the associations between H and X, and between Y and X.
Using the law of total probability, the distributions of H and Y and the joint distribution of H and Y can be expressed as integrals of the conditional distributions over the covariate distribution:
$$F^g_j(\cdot) = \int F^g_{j|X}(\cdot \mid x) \, dF^g_X(x), \quad j \in \{H, Y, HY\}.$$
We introduce counterfactual marginal distributions of health and SES, which are denoted by $F^{A|B}_H$ and $F^{A|B}_Y$ and which combine the conditional distributions of group A with the covariate distribution of group B.
Assumption 2 Invariance of conditional distributions: The conditional distribution structures $F^g_{H|X}$, $F^g_{Y|X}$ and $F^g_{H,Y|X}$ remain stable when the covariate distribution $F^g_X$ is replaced with $F^{g'}_X$, where $g' \neq g$.
This assumption allows the counterfactual health and SES marginal distributions and their counterfactual joint distribution to be constructed by replacing the distribution of X while keeping their respective conditional distribution structures fixed. For example, suppose we wish to estimate the counterfactual distributions of health and SES at period t, which would be observed if education years had the distribution found at period t−1. The counterfactual marginal distributions of health and SES can be obtained simply by replacing the distribution of education years, because the invariance of the conditional distribution assumption ensures that possible returns from education to health and SES for period t are not affected by replacing the distribution of education years at t with the one at period t − 1. This assumption is in line with the invariance of conditional mean function of H and RIF , which is imposed in the Wagstaff CI decomposition and the RIF regression-based decomposition methods respectively.
We also make the following overlap assumption regarding the support of covariate distributions:
Assumption 3 Common support: The support of $X^B$ is a subset of that of $X^A$, namely $\mathcal{X}^B \subset \mathcal{X}^A$. [5]
The common support assumption requires some form of overlap in the distributions of determinants across groups, in the sense that there is no value of X that is observed only in group B. The common support assumption is also imposed in the Wagstaff CI decomposition and the RIF regression-based decomposition methods. Under Assumptions 2-3, the counterfactual distributions are identified as
$$F^{A|B}_j(\cdot) = \int F^A_{j|X}(\cdot \mid x) \, dF^B_X(x), \quad j \in \{H, Y, HY\}. \tag{18}$$
The counterfactual marginal and joint distributions are estimated by the re-weighting method of DiNardo, Fortin, and Lemieux (1996) (hereafter denoted as "DFL re-weighting"). The idea behind DFL re-weighting is that replacing the marginal distribution of X for group A with the marginal distribution of X for group B, by means of an estimated re-weighting factor, will produce the counterfactual distribution in equation (18). Intuitively, DFL re-weighting weights the observations for group A so that the weighted observations have the marginal distribution of X for group B. Hence, for $j \in \{H, Y, HY\}$,
$$F^{A|B}_j(\cdot) = \int F^A_{j|X}(\cdot \mid x) \, \Psi(x) \, dF^A_X(x),$$
where $\Psi(X) = dF^B_X(X) / dF^A_X(X)$ is the re-weighting factor, which can be estimated from the data:
$$\Psi(X) = \frac{\Pr(g = B \mid X)}{1 - \Pr(g = B \mid X)} \cdot \frac{\Pr(g = A)}{\Pr(g = B)}, \tag{20}$$
where $\Pr(g = B \mid X)$ is a propensity score (Rosenbaum and Rubin, 1983), i.e. the probability of belonging to group B, given the covariates X. $\Pr(g = A)$ and $\Pr(g = B)$ are the unconditional probabilities of belonging to groups A and B, respectively. The ratio $\Pr(g = A) / \Pr(g = B)$ is interpreted as the proportion of people in group A relative to B, and the propensity scores can be estimated parametrically or non-parametrically (Hirano et al., 2003).
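To make the re-weighting factor concrete, the sketch below computes it for a single discrete covariate, where the propensity score can be estimated non-parametrically as the share of group B within each covariate cell. The function name and the toy data are hypothetical; with continuous or many covariates one would instead fit a model such as a logit or probit for the propensity score.

```python
import numpy as np


def dfl_weights(x_a, x_b):
    """Re-weighting factor Psi(x) for each group-A observation (one discrete covariate)."""
    n_a, n_b = len(x_a), len(x_b)
    psi = np.empty(n_a)
    for value in np.unique(x_a):
        in_a = np.sum(x_a == value)
        in_b = np.sum(x_b == value)
        p_b = in_b / (in_a + in_b)            # propensity score Pr(g = B | X = value)
        # Odds of belonging to B given X, scaled by Pr(g = A) / Pr(g = B).
        psi[x_a == value] = p_b / (1.0 - p_b) * (n_a / n_b)
    return psi


# Toy data: the covariate equals 1 for 25% of group A and 75% of group B.
x_a = np.array([0, 0, 0, 1])
x_b = np.array([0, 1, 1, 1])
psi = dfl_weights(x_a, x_b)

# After re-weighting, group A has the covariate distribution of group B.
reweighted_mean_a = np.average(x_a, weights=psi)
```

In this toy case the observations with `x = 0` receive a weight of one third and the observation with `x = 1` a weight of three, so the weighted share of `x = 1` in group A becomes 75%, as in group B.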
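The failure of the common support assumption potentially prevents us from estimating the counterfactual concentration index with the re-weighting factor, because in such a situation the propensity score equals zero or one. Intuitively, if the propensity score is close to zero or one, observations receive a weight close to zero or an extremely high weight in the estimate of the counterfactual concentration index under the DFL re-weighting approach, potentially leading to imprecise estimation in terms of its variance. One approach to dealing with a lack of common support is to set upper and lower limits for the propensity score and trim the observations outside the limits (Crump et al., 2009).
The dependence and health effects in equation (14) can be decomposed by adding and subtracting $CI(C^{A|B}(F^B_H, F^B_Y))$ and a corresponding counterfactual index for the health effect. For the dependence effect, this gives
$$\Delta_D = \underbrace{CI\left(C^B(F^B_H, F^B_Y)\right) - CI\left(C^{A|B}(F^B_H, F^B_Y)\right)}_{\Delta_{DS}} + \underbrace{CI\left(C^{A|B}(F^B_H, F^B_Y)\right) - CI\left(C^A(F^B_H, F^B_Y)\right)}_{\Delta_{DX}}, \tag{21}$$
and the health effect is decomposed analogously into $\Delta_{HS}$ and $\Delta_{HX}$. As these equations show, the covariate effects are the parts attributable to the differences in the joint distribution of X. Structural effects are the remaining parts which are not attributable to the differences in X.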
In line with the assumptions outlined above, the following propositions hold.
Proposition 1 Identification of each effect: All $\Delta_D$, $\Delta_H$, $\Delta_{DS}$, $\Delta_{DX}$, $\Delta_{HS}$ and $\Delta_{HX}$ are identified.
Proposition 2 Dependence and health effect properties: The effects have the four desirable properties discussed below. Proof of the above is provided in Appendix A.
The first statement of proposition 2 suggests that the dependence effect becomes null when there is no difference in the dependence structure of health and SES (copula function) between the two groups.
The second statement says that the health effect is null when there is no difference in the marginal distribution of health. The third statement means that the structural effects are zero when the structural functional forms of health and SES are identical across groups. The last statement says that the covariate effects are null when the covariate distributions of the two groups are identical.

Detailed decomposition -step 3-
$\Psi_k(X_k)$ is a weighting factor, which can be expressed in terms of $\Psi(X)$, the DFL weighting factor defined in equation (20), and a DFL weighting factor calculated from the covariates other than $X_k$.

Decomposition of the dependence covariate effect
First, we calculate the marginal contribution made by the kth covariate to $\Delta_{DX}$. We introduce the counterfactual joint distribution, which has the marginal health and SES distributions of group B and the counterfactual copula function that group A would face if the kth covariate of group A were replaced with that of group B. The marginal contribution made by the kth covariate to $\Delta_{DX}$ is calculated from the concentration index based on this counterfactual distribution. Repeating the calculation of the marginal effects for k = 1, ..., K, the dependence covariate effect can then be decomposed into the marginal contributions of the individual covariates. The health covariate effect can be decomposed in the same way by repeating the calculation of the marginal effects for k = 1, ..., K.

Interpretation of the interaction effect
While the decomposition above is path-independent, thanks to the non-sequential weighting procedure, in general the sum of the marginal contributions does not add up to the total covariate health/SES effect (adding-up problem), because in non-linear models the total covariate effect generally cannot be expressed by the additively separable contributions made by the marginal components of variables. The part which cannot be explained by the sum of the marginal contributions is due to the interaction effects among the covariates, which are conceptually hard to attribute to a single covariate (Fortin et al., 2011). The existence of the interaction effects can be recognised as the cost of not imposing the additively-separable linear functional form assumption on the health, SES or the concentration index. This, however, does not mean that the linearity assumption can solve the problem or that the linear assumption is favourable.
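The adding-up problem can be seen in a deliberately simple toy case. The statistic below is a hypothetical non-additive functional (a product of two covariate means) chosen only for illustration; it is not the concentration index, and all names and numbers are assumptions of the example.

```python
def statistic(mean_x1, mean_x2):
    """A deliberately non-additive toy functional of two covariate distributions."""
    return mean_x1 * mean_x2


group_a = {"x1": 2.0, "x2": 3.0}
group_b = {"x1": 4.0, "x2": 5.0}

base = statistic(group_a["x1"], group_a["x2"])

# Total covariate effect: replace both covariate distributions at once.
total = statistic(group_b["x1"], group_b["x2"]) - base

# Marginal contributions: replace one covariate at a time, holding the other fixed.
marginal_x1 = statistic(group_b["x1"], group_a["x2"]) - base
marginal_x2 = statistic(group_a["x1"], group_b["x2"]) - base

# The part not explained by the sum of marginal contributions is the interaction effect.
interaction = total - (marginal_x1 + marginal_x2)
```

Here the total effect is 14, the marginal contributions are 6 and 4, and the remaining 4 is the interaction effect. Because each covariate is swapped against the same baseline, the marginal contributions do not depend on the order in which covariates are considered.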

Comparison with the existing methods
It is worth noting that the performance of the three methods cannot be directly compared from a quantitative point of view, because the meaning of "contribution" differs across the decomposition methods. Specifically, the contribution in the Wagstaff CI decomposition measures how much of the overall difference in inequality is explained by differences in the inequalities and elasticities of health determinants. In the RIF regression-based decomposition, on the other hand, the contribution measures approximately how much of the overall difference is due to the difference in the means of the covariate distributions. In the case of the proposed method, the contribution measures how much the overall difference would be if we replaced one of the covariate marginal distributions while the other covariates' marginal distributions remained unchanged. Therefore, we cannot directly compare the relative performance of each method through a simulation analysis. Keeping this issue in mind, in the next section we apply the three decomposition methods one by one to the differences in socio-economic inequality in over-nutrition between rural and urban regions in India and discuss the results obtained.

Empirical application
This section applies the proposed decomposition method to the differences in socio-economic inequality in over-nutrition between rural and urban regions in India and attempts to qualitatively compare the results with those estimated by the RIF regression-based CI decomposition method and the Wagstaff CI decomposition method.

Prevalence of over-nutrition
Over the last few decades, India has been experiencing strong economic growth, and an increase in income per capita has led to an increase in life expectancy, living standards and literacy in the country. There has also been a notable decline in the poverty rate, from 45.3 per cent in 1993 to 21.9 per cent in 2011 (World Bank, 2018). [9] India is now undergoing a rapid epidemiological or nutritional transition (Aizawa, 2019).
While the country has been the single largest contributor to the global prevalence of under-nutrition for years, recent studies report an emerging nutrition transition and an increasing prevalence in obesity among adults (Agrawal et al., 2012;Griffiths and Bentley, 2001;Zargar et al., 2000;Misra et al., 2001).
The positive relationship between the increase in obesity and economic growth has been observed across the world. As an economy develops, people typically shift from the agricultural sector to manufacturing and eventually the service industries. As a result, work becomes more sedentary and involves less physical activity. Firms and households can also afford to adopt labour-saving technologies (Doak et al., 2000; Lanningham-Foster et al., 2003), and increased leisure time is dedicated most often to sedentary activities such as television-watching and computer games (Shetty, 2002). Hand-in-hand with economic development comes more income, which in turn can be spent on food, thereby making a wide range of produce available. With development and urbanisation, people also tend to shift toward the intake of inexpensive, energy-dense and high-fat foods (including those from street vendors and fast-food restaurants), particularly in rapidly-growing areas (Bell et al., 2002; Popkin and Du, 2003; Popkin, 2001, 2005; Popkin et al., 2006; Du et al., 2002, 2004; Guo et al., 2000; Popkin and Slining, 2013; Drewnowski, 2000).

Over-nutrition and socio-economic status
The growing number of overweight and obese people is an international trend, but the distribution of over-nutrition across socio-economic groups has been reported to differ according to the level of development. Exhaustive literature reviews (e.g. Sobal and Stunkard, 1989; McLaren, 2007; Reynolds et al., 2007; Jones-Smith et al., 2012) describe the relationship between socio-economic status and obesity in both developed and developing countries. In developed countries, they observe a consistent, inverse association, particularly for women; in other words, the poor are more likely to be obese. In developing countries, on the other hand, a strong positive relationship has been revealed among men, women and children; that is, people enjoying a more privileged socio-economic status are more likely to become overweight and obese.
McLaren (2007) argues that as a country moves from a low- and middle-income status to a high-income status, the relationship between socio-economic status and obesity is reversed. That is to say, in more advanced economies, the less wealthy are more likely to be exposed to the risk of obesity than people with a higher socio-economic status. Monteiro et al. (2004) argue that the reversal of the obesity gradient occurs for women when GNP per capita reaches 2,500 USD, while Dinsa et al. (2012), conversely, show that the reversal can be triggered at a considerably lower per capita GNP level (1,000 USD).
[9] The national poverty headcount ratio is the percentage of the population living below the national poverty line.
These transitions can be described through three stages. In the first stage, where the prevalence of obesity is observed mainly among rich people, this cohort moves away from the condition of under-nutrition and traditional staple foods towards Westernised diets, thus contributing to an increase in their body mass. In the next stage, where the influences of economic growth and globalisation are much more marked, a larger proportion of people have access to a variety of animal products and processed foods.
Obesity among the middle classes starts to be observed at this stage, at which point a country typically faces the "double-burden", i.e. the co-existence of under-nutrition and over-nutrition. Most Indians are currently in this stage of nutrition transition (Misra et al., 2011;Chhabra and Chhabra, 2007). In the final stage, where the reverse of the relationship between obesity and SES occurs, some people belonging to the higher socio-economic stratum begin to recognise their adverse eating habits and attempt instead to adopt a healthier lifestyle.
Given that India has huge regional variations with respect to development, differences in socio-economic inequality relating to over-nutrition between rural and urban areas are expected. We analyse the determinants that contribute to the difference in socio-economic inequality, based on the claim emphasised by Popkin (1998) that it is not so much urban locations per se but lifestyles associated with urban living that cause obesity.

National Family Health Survey
This application exploits the latest National Family Health Survey (NFHS-4), conducted in India in 2015/16. The NFHS is a nationwide household survey that provides information on health, health-related behaviours and household socio-economic status; furthermore, it is the Indian version of the Demographic and Health Surveys conducted in more than 85 low- and middle-income countries since 1984 (Corsi et al., 2012).
All women aged 15-49 and men aged 15-54 in the selected sample households were eligible for interviewing.
Key advantages of the NFHS include its national coverage and high participation rates of over 90 per cent (IIPS and ICF, 2017).

Health outcome and SES variable
BMI was calculated from the measured heights and weights of the respondents; it is defined as an individual's weight divided by the square of their height and is expressed in units of kg/m². Information on heights and weights in the NFHS was based on actual measurements,¹⁰ and therefore it was less likely to be subject to measurement errors such as under-reporting of weights and over-reporting of heights (Gorber et al., 2007), which substantially enhances the credibility of the result.
The density distributions of BMI in rural and urban areas are shown in Figure 2.
[ Figure 2 about here.]
The threshold points for being overweight and obese are 25 kg/m² and 30 kg/m², respectively, corresponding to the WHO definitions. A person is classified as overweight if his/her BMI is greater than or equal to 25 and less than 30, and as obese if his/her BMI is greater than or equal to 30. Furthermore, this application introduces a new variable, excess weight, which is defined as the difference between a respondent's weight and the upper limit of his/her optimal weight (Aizawa and Helble, 2017).¹¹ Excess weight is non-negative and set to 0 if a respondent's weight is below the upper limit of his/her optimal weight.¹² The use of excess weight as an outcome variable allows us to look at socio-economic inequality in relation to the degree of being overweight and obese, beyond the headcount of overweight/obese people.
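The outcome variable can be illustrated with a minimal sketch. It assumes weight in kilograms and height in centimetres, and uses the BMI-25 upper limit given in the footnote to this paragraph; the function names are illustrative and not taken from the NFHS or from the paper.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def excess_weight(weight_kg, height_cm, bmi_cutoff=25.0):
    """Kilograms above the weight at which BMI equals `bmi_cutoff`, else 0."""
    upper_limit_kg = bmi_cutoff * (height_cm / 100.0) ** 2
    return max(weight_kg - upper_limit_kg, 0.0)


def weight_class(bmi_value):
    """WHO cut-offs quoted in the text: overweight from 25, obese from 30."""
    if bmi_value >= 30.0:
        return "obese"
    if bmi_value >= 25.0:
        return "overweight"
    return "not overweight"
```

A respondent who is 160 cm tall has an upper limit of 25 × 1.6² = 64 kg, so a weight of 70 kg gives roughly 6 kg of excess weight, while a weight of 50 kg gives 0. Passing `bmi_cutoff=23.0` corresponds to the lower threshold mentioned for the robustness check.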
As a measure of socio-economic status, we use household wealth, which is captured by the NFHS through a composite index of relative standards of living derived from 132 indicators of asset ownership, housing characteristics and water and sanitation facilities.¹³ The wealth index in the NFHS is based on principal component analysis, designed by Filmer and Pritchett (2001) and developed in collaboration with the World Bank, and it has been shown to be a consistent proxy for household income and expenditure (Rutstein and Staveteig, 2014; Montgomery et al., 2000). The advantage of using wealth over income is that the former, as a stock of income, is suitable as an indicator for reflecting the long-term living standards of households. In addition, wealth is less susceptible to temporary economic shocks and seasonal events such as drought, which is important for the analysis of developing countries in which agriculture plays an important role.
¹⁰ In the NFHS, each respondent is weighed in light clothes with shoes off, using a solar-powered digital scale with an accuracy of ±100 g. Heights are measured using an adjustable wooden measuring board, designed specifically to provide accurate measurements (to the nearest 0.1 cm) in a developing country field situation.
¹¹ The upper limit of the optimal weight is the weight at which the BMI is equal to 25, i.e. upper limit = 25 × (height/100)². Given the argument that the BMI threshold for Asian people should be lower, we also define the optimal weight using a BMI of 23.0 as another threshold. The qualitative results of the analysis are very robust in relation to the difference in the threshold. The results for this robustness check are available from the author upon request.
¹² This definition is consistent with the idea of the Foster-Greer-Thorbecke (FGT) index with a weight parameter of one (Foster et al., 1984), which is used in development economics to measure the "degree" of poverty beyond simple headcounts.
¹³ A full list is available at https://www.dhsprogram.com/topics/wealth-index/Wealth-Index-Construction.cfm (Accessed 25/May/2018).
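The idea behind a principal-component wealth index can be sketched as follows. This is only an illustration of scoring households on the first principal component of standardised indicators; it is not the NFHS construction, which uses its own indicator list and additional steps. The function name and the toy asset matrix are hypothetical, and the sketch assumes every indicator varies across households.

```python
import numpy as np


def first_component_index(indicators):
    """Score each household on the first principal component.

    `indicators` is an (n_households, n_indicators) array of asset and
    housing variables. Every column must vary across households.
    """
    x = np.asarray(indicators, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)        # standardise each indicator
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal axis is arbitrary: orient it so that owning
    # more assets raises the score.
    if loadings.sum() < 0:
        loadings = -loadings
    return z @ loadings


# Toy data: four households, three binary asset indicators.
assets = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
])
scores = first_component_index(assets)
wealth_rank = scores.argsort().argsort()            # 0 = poorest household
```

Only the ordering of households by the score matters for the concentration index, because the index uses the wealth rank rather than the wealth level.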

Covariates
The potential determinants were selected by following the empirical literature on the overweight and obese population in India (Doak et al., 2005; Kulkarni et al., 2017, etc.) and the epidemiological and public health literature on nutrition (Popkin et al., 2006; French et al., 2001; Swinburn et al., 2004; Siddiqui and Donato, 2016; Gouda and Prusty, 2014; Prentice, 2006, etc.). As covariates reflecting demographic characteristics, age and family size are used, both treated as continuous variables. Age squared (divided by 100) is also included.
Religion plays an important role in dietary choices and health-related behaviours; for example, the diets of Muslims are generally richer, and social mobility among Muslim women is lower than among Hindu women (Kulkarni et al., 2013). Hindu and Muslim dummies are therefore used in this application to reflect individual socio-cultural backgrounds. Other religions (e.g. Christianity, Sikhism, Buddhism) form the benchmark group. In addition, dummy variables for the scheduled caste/tribe and other backward classes are included. Scheduled castes/tribes are the most socially disadvantaged groups, members of which have suffered the greatest burden of social and economic segregation and deprivation within the traditional Hindu caste hierarchy (Chitnis, 1997). Other backward classes are a diverse collection of intermediate castes that are considered low in the caste system but are clearly above scheduled castes. The remaining classes, including those not identifying themselves as legislatively marginalised, act as the benchmark category herein.
To capture the occupational socio-economic status and occupation-related physical activity level, four occupation types are included: (i) professional/technical/managerial workers, (ii) clerical, sales or service-sector workers, (iii) farmers¹⁴ and (iv) manual workers.¹⁵ Not currently participating in the labour force is the benchmark category.¹⁶ Educational levels are measured by three categories in terms of the highest level of education completed: higher education, secondary education and primary education dummies are used, and an education level less than primary education works as the benchmark category. A literacy dummy is also used.¹⁷
As variables reflecting household living standards, the following dummy variables are used: access to piped drinking water in a house, access to electricity, having a flushing toilet in a house, having a mobile phone, having a car and the availability of safe cooking fuel (electricity, LPG, natural gas).
¹⁴ Either as an employee or as an owner.
¹⁵ Including both skilled and non-skilled manual workers.
¹⁶ This includes those not seeking employment, such as home-makers.
Last, as variables reflecting individual health-related behaviours, the frequency of TV-watching, food consumption, current smoking status and alcohol consumption are also included. Frequency of TV-watching is divided into three types: (i) not at all/less than once a week (benchmark category), (ii) at least once a week and (iii) almost every day. These categories also work as proxies for physical activity in leisure time (Griffiths and Bentley, 2001; Foster et al., 2006). Food consumption is constructed as a composite index derived from the first principal component of the frequency of eating various foods.¹⁸ The alcohol drinking dummy is based on the frequency of drinking and equals unity when the respondent consumes alcohol on a daily basis.
After dropping observations with possible outliers,¹⁹ the sample sizes were 93,440 and 90,043 for women and men, respectively. Descriptive statistics are shown in Appendix B. In this application, the propensity score was estimated by the logit model. Observations within the top and bottom 1 per cent of the propensity score were trimmed from the sample, to enhance the DFL re-weighting performance.
Figure 3 illustrates the concentration curves of both regions, plotting the cumulative percentage of excess weight against the cumulative percentage of the population ranked from poorest to richest (Kakwani et al., 1997). Figure 3 shows that in both areas, excess weight is concentrated more among rich people and that higher concentrations are evident in rural areas than in urban areas for both men and women.
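The propensity-score step described above can be sketched as follows, under stated assumptions. The logit is fitted by Newton-Raphson on the pooled sample, rural observations are re-weighted towards the urban covariate distribution with the standard DFL odds weight, and the 1 per cent trimming is applied to the pooled propensity score. Which group is re-weighted, how trimming is applied, and all names and simulated data here are assumptions of this sketch rather than details confirmed by the paper.

```python
import numpy as np


def fit_logit(covariates, is_urban, n_iter=25):
    """Logit coefficients (intercept first) by Newton-Raphson."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        gradient = design.T @ (is_urban - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def dfl_weights(covariates, is_urban, trim=0.01):
    """Return (weights, keep) for re-weighting the rural sample.

    weights: 1 for urban rows; for rural rows the odds p / (1 - p), rescaled
             by the rural-to-urban sample ratio.
    keep:    False for rows whose propensity score lies in the top or bottom
             `trim` share of the pooled sample.
    """
    covariates = np.asarray(covariates, dtype=float)
    is_urban = np.asarray(is_urban, dtype=float)
    beta = fit_logit(covariates, is_urban)
    design = np.column_stack([np.ones(len(covariates)), covariates])
    score = 1.0 / (1.0 + np.exp(-design @ beta))
    low, high = np.quantile(score, [trim, 1.0 - trim])
    keep = (score >= low) & (score <= high)
    urban_share = is_urban[keep].mean()
    odds = score / (1.0 - score)
    weights = np.where(is_urban == 1.0, 1.0,
                       odds * (1.0 - urban_share) / urban_share)
    return weights, keep


# Simulated data: one covariate that is more common in the urban group.
rng = np.random.default_rng(0)
x = rng.normal(size=(4000, 1))
is_urban = (rng.uniform(size=4000)
            < 1.0 / (1.0 + np.exp(-0.8 * x[:, 0]))).astype(float)
weights, keep = dfl_weights(x, is_urban)

rural = keep & (is_urban == 0.0)
urban = keep & (is_urban == 1.0)
raw_gap = x[urban, 0].mean() - x[rural, 0].mean()
reweighted_gap = x[urban, 0].mean() - np.average(x[rural, 0],
                                                 weights=weights[rural])
```

In the simulated data the covariate gap between the two groups shrinks substantially once the rural observations carry the DFL weights, which is the property the re-weighting relies on.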

Results
Table 1 shows the overall decomposition results of the differences in the concentration indices (steps 1 and 2). The lower part of Table 1 contains the detailed decomposition results of these two covariate effects in terms of the partial contributions made by each variable (step 3). Standard errors are calculated by the bootstrap method with 500 repetitions, and all terms are expressed in units and as percentages of the total observed difference. The coefficients of the logit estimation, which are used to estimate the DFL weighting factor, are available in Table B.2 in Appendix B.
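A bootstrap standard error for a difference in concentration indices can be sketched as below. The sketch resamples individuals independently within each area and uses the covariance form of the concentration index. It ignores the survey's clustering and sampling weights, and the paper does not state its resampling scheme, so this is an assumption-laden illustration; `toy_sample` generates hypothetical data.

```python
import random
from statistics import mean, stdev


def concentration_index(sample):
    """CI = 2 * cov(health, fractional wealth rank) / mean(health).

    `sample` is a list of (health, wealth) pairs. Ties in wealth are ranked
    in sorted order rather than averaged, a simplification of this sketch.
    """
    n = len(sample)
    ordered = sorted(sample, key=lambda pair: pair[1])    # poorest first
    mu = mean(h for h, _ in ordered)
    cov = sum((h - mu) * ((i + 0.5) / n - 0.5)
              for i, (h, _) in enumerate(ordered)) / n
    return 2.0 * cov / mu


def bootstrap_se_difference(urban, rural, reps=500, seed=0):
    """Standard error of CI(urban) - CI(rural) from `reps` resamples."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        urban_star = rng.choices(urban, k=len(urban))     # with replacement
        rural_star = rng.choices(rural, k=len(rural))
        draws.append(concentration_index(urban_star)
                     - concentration_index(rural_star))
    return stdev(draws)


def toy_sample(n, slope, seed):
    """Hypothetical data: health rises with wealth at rate `slope`."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        wealth = rng.random()
        sample.append((1.0 + slope * wealth + rng.random(), wealth))
    return sample
```

Calling `bootstrap_se_difference(toy_sample(200, 0.5, 1), toy_sample(200, 2.0, 2))` returns the standard deviation of 500 resampled differences. A design-based bootstrap that resamples primary sampling units would be the natural refinement for survey data.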
First of all, both urban and rural areas show positive significant concentration index values (p < 0.01), meaning that excess weight has a pro-rich pattern. The concentration index in rural areas is significantly higher than in urban areas, so the difference (ΔCI = CI_Urban − CI_Rural) is negative (p < 0.01). Given the observation that rural areas are less developed than urban areas (Das and Pathak, 2012; Chamarbagwala, 2010; Nayyar, 2008), this contrast is consistent with international findings on the socio-economic trend of over-nutrition, which argue that as an economy develops, the overweight/obese will start to be found among middle- and lower-SES groups.
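For reference, the sign convention used here can be written with the standard covariance form of the concentration index. The notation is assumed for illustration (the paper's own definition appears in an earlier section): h_i is excess weight, μ its mean and R_i the fractional rank of individual i in the wealth distribution.

```latex
\mathrm{CI} = \frac{2}{\mu}\,\operatorname{cov}(h_i, R_i),
\qquad
\Delta \mathrm{CI} = \mathrm{CI}_{\mathrm{Urban}} - \mathrm{CI}_{\mathrm{Rural}}
```

A positive CI means that excess weight is concentrated among the richer, and a negative ΔCI means that this pro-rich concentration is weaker in urban than in rural areas.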
First step: The observed negative difference in the concentration indices is first decomposed into the dependence effect and the health effect. The two effects have opposite signs. The health effect makes a significant positive contribution to the observed difference, which means that the observed difference in socio-economic inequality is explained by the difference in the marginal health distributions between urban and rural areas. The negative contribution of the dependence effect, on the other hand, means that the difference in the dependence structure between excess weight and wealth partially offsets the part attributable to the health effect. It also suggests that in urban areas the distribution of health and SES has stronger positive dependence.²⁰
Second step: Next, we decompose both the dependence and the health effects into their structural and covariate effects. A large proportion of the dependence effect is contributed by its covariate effect (p < 0.01), implying that the difference in the distribution of covariates between urban and rural areas explains a large part of the dependence effect. The covariate effect also makes the major contribution to the health effect (p < 0.01). Overall, these observations show that the difference in the distribution of covariates makes large contributions to both the dependence and health effects.
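The logic of the first step can be illustrated with a rank-based counterfactual. Because the concentration index depends only on the copula of health and SES and on the marginal distribution of health, one can keep the urban pairing of ranks and swap in the rural health distribution by quantile mapping. The sketch below is one possible construction under that idea; the choice and ordering of the counterfactual, the tie handling and all function names are assumptions, not the paper's estimator.

```python
def fractional_ranks(values):
    """Rank (position + 0.5) / n of each value; ties keep input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = (position + 0.5) / n
    return ranks


def concentration_index(health, ses):
    """CI = 2 * cov(health, fractional SES rank) / mean(health)."""
    n = len(health)
    mu = sum(health) / n
    ranks = fractional_ranks(ses)
    cov = sum((h - mu) * (r - 0.5) for h, r in zip(health, ranks)) / n
    return 2.0 * cov / mu


def swap_health_marginal(health_from, health_to):
    """Replace each value by the `health_to` quantile at the same rank."""
    donor = sorted(health_to)
    m = len(donor)
    return [donor[min(int(u * m), m - 1)]
            for u in fractional_ranks(health_from)]


def first_step(urban_health, urban_ses, rural_health, rural_ses):
    """Split CI_Urban - CI_Rural into a health and a dependence effect."""
    ci_urban = concentration_index(urban_health, urban_ses)
    ci_rural = concentration_index(rural_health, rural_ses)
    # Counterfactual: urban dependence structure, rural health marginal.
    counterfactual = swap_health_marginal(urban_health, rural_health)
    ci_counterfactual = concentration_index(counterfactual, urban_ses)
    return {
        "difference": ci_urban - ci_rural,
        "health_effect": ci_urban - ci_counterfactual,
        "dependence_effect": ci_counterfactual - ci_rural,
    }
```

By construction the two effects add up to the observed difference. Excess weight has a mass point at zero, so in real data the ranks among tied observations would need more careful treatment than this sketch gives them.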
Third step: The lower part of Table 1 contains the results of the detailed decomposition of these two covariate effects, reporting the partial contributions made by each variable. Being a member of a scheduled caste/tribe (p < 0.01), being Hindu (p < 0.01) and using safe fuel for cooking (p < 0.01) were found to be the three largest contributors to the observed difference in the concentration index, all of which made contributions primarily through the health effect.

Male samples
Table 2 reports the results for the male samples. As we observed in the female case, both areas show a positive concentration index, with the concentration index in urban areas exhibiting the smaller value. This implies that excess weight is observed more commonly among the lower SES groups in urban areas, compared with the rural male population. The difference in the concentration index (ΔCI = CI_Urban − CI_Rural) is thus negative and significant (p < 0.01). Compared with the female case, the observed difference is smaller, which is consistent with previous studies reporting that the transition of socio-economic inequality in obesity first occurs among women in developing countries (Popkin and Gordon-Larsen, 2004; Sarlio-Lähteenkorva and Lahelma, 1999; Ismail et al., 2002).
First step: Decomposing the observed difference, we find the positive contribution of the health effect (p < 0.01), part of which is offset by the negative contribution of the dependence effect (p < 0.01). This means that the difference in socio-economic inequality is due largely to the difference in the marginal health distributions.
Second step: Next, we measure what proportions of the dependence and health effects are explained by the difference in the distribution of covariates. Most parts of the dependence effect and the health effect are explained by differences in the covariate distributions (both p < 0.01).
Third step: The lower part of Table 2 shows the results of the detailed decomposition of these two covariate effects reporting the partial contributions made by each variable. The availability of safe fuel for cooking (p < 0.01), smoking behaviour (p < 0.01) and being Hindu (p < 0.01) make the greatest contributions to the total difference primarily through the health effect.
Table 3 shows the RIF regression-based decomposition of the difference in the concentration indices into the structural effect and the covariate effect. The observed negative difference in the concentration indices is divided into the covariate effect and the structural effect, both of which are statistically significant at the 5 per cent level. The aggregate decomposition shows that the differences in the distributions of covariates explain less than half of the observed differences for both women and men: the covariate effect accounts for 44.1 per cent and 39.0 per cent of the observed differences for women and men, respectively.
Detailed decomposition: The lower part of Table 3 contains a detailed decomposition of the covariate effect into the contributions of each variable. For women, the differences in the proportions of using safe fuel for cooking (p < 0.01), owning a flushing toilet (p < 0.01) and completing higher education (p < 0.01) make large contributions to the observed difference in the concentration index. For men, the differences in the proportions of being a farmer (p < 0.01), using safe fuel for cooking (p < 0.01) and having piped water for drinking make large contributions to the observed difference.

Wagstaff CI decomposition
Female samples: Table 4 shows the results of the Wagstaff CI decomposition for the female samples.
Columns 2-4 show the concentration index effects, the parts reflecting differences in inequalities in terms of contributors, and columns 5-7 show the elasticity effects, the parts capturing differences in the associations between health and the contributors. Both the overall concentration index effect (p < 0.01) and elasticity effect (p < 0.01) are negative, thereby contributing positively to the observed negative difference.
The concentration index effect exceeds the elasticity effect in absolute value, which suggests that the difference in the degree of socio-economic inequality in excess weight is associated with the differences in the socio-economic inequalities of the health determinants, rather than the differences in the relations between health and its contributors.
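The split into concentration index effects and elasticity effects can be sketched with the Oaxaca-type identity from Wagstaff et al. (2003), in which each covariate contributes η_k × CI_k to the concentration index, with η_k the elasticity of health with respect to covariate k. The weighting used below (urban elasticities on the change in CI_k, rural CI_k on the change in elasticities) is one of two possible conventions and is an assumption of this sketch, as are the numbers in the usage example.

```python
def wagstaff_difference(elasticity_urban, ci_urban, elasticity_rural, ci_rural):
    """Split each covariate's contribution to CI_Urban - CI_Rural in two.

    Each argument is a dict keyed by covariate name. Elasticities are
    beta_k * mean(x_k) / mean(health); ci_* are the covariates' own
    concentration indices.
    """
    rows = {}
    for name in elasticity_urban:
        ci_effect = elasticity_urban[name] * (ci_urban[name] - ci_rural[name])
        elasticity_effect = ci_rural[name] * (
            elasticity_urban[name] - elasticity_rural[name])
        rows[name] = {
            "ci_effect": ci_effect,
            "elasticity_effect": elasticity_effect,
            "total": ci_effect + elasticity_effect,
        }
    return rows


# Hypothetical inputs for a single covariate.
rows = wagstaff_difference(
    elasticity_urban={"safe_fuel": 0.30}, ci_urban={"safe_fuel": 0.10},
    elasticity_rural={"safe_fuel": 0.20}, ci_rural={"safe_fuel": 0.40},
)
```

In this made-up example the covariate is far less unequally distributed in the urban sample, so its concentration index effect (0.30 × (0.10 − 0.40)) is negative and outweighs the positive elasticity effect (0.40 × (0.30 − 0.20)). The total equals the change in η_k × CI_k between the two areas.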
The total contributions made by each covariate are shown in columns 8-10. In total, almost all of the differences in the concentration index can be explained by the contributory factors in the model. Amongst the covariates in the model, differences in the proportions of those viewing television every day (p < 0.01), using safe fuel for cooking (p < 0.01) and having a flushing toilet (p < 0.01) are found to be the three largest contributors, and for these variables the concentration index effects dominate the elasticity effects. In other words, these large contributions were due to differences in socio-economic inequality between urban and rural regions.
[ Table 4 about here.]
Male samples: Table 5 shows the results of the Wagstaff CI decomposition for the male samples. The two overall effects have opposite signs: the overall concentration index effect is negative, whilst the overall elasticity effect is positive. The concentration index effect exceeds the elasticity effect in absolute value, which suggests that the difference in the degree of socio-economic inequality in excess weight among men is associated mainly with the differences in the socio-economic inequalities in covariates, rather than the differences in the relationship between health and covariates.
The total effects of each covariate are shown in columns 8-10. In total, a large proportion of the differences in the concentration index can be explained by the covariates in the model. Differences in the proportions of viewing television almost every day (p < 0.01) and using safe fuel for cooking (p < 0.01), and differences in the mean value of age squared (p < 0.01), make the greatest contributions. Frequent television-watching makes a larger contribution through the elasticity effect than through the concentration index effect, which means that its contribution is due mainly to the difference in the association between excess weight and frequent television-watching. For the use of safe fuel for cooking and age squared, the contributions derive primarily from the concentration index effect, which means that they are due to differences in the socio-economic inequalities between urban and rural regions.

Conclusion
This paper proposed a new semi-parametric method to decompose the differences/changes in two concentration indices. In essence, such a difference can arise from (1) a difference/change in the association between health and SES, or (2) a difference/change in the marginal distribution of health and/or SES. The proposed method helps us explore the underlying mechanisms which cause the differences/changes in two concentration indices. Applying copulas, we first decomposed the observed differences into a part due to a difference in the dependence structure between health and SES (dependence effect), and a part due to a difference in the marginal distribution of health (health effect). Second, by applying the re-weighting method devised by DiNardo et al. (1996), we decomposed both the dependence and the health effects into parts explained by the difference in the joint covariate distributions (dependence/health covariate effect) and those that cannot be explained by it (dependence/health structural effect). Last of all, the marginal contributions made by each covariate to the dependence/health covariate effects were obtained through a non-sequential re-weighting approach.
The proposed method did not impose a functional form assumption for health, SES or the concentration index. As health outcomes often exhibit distinctive features of non-normality, not imposing the linearity assumption in this regard was advantageous in applications. Different from existing methods, in which only the mean differences in covariates are assumed to contribute to differences in the concentration index, in the proposed method, differences throughout the entire covariate distributions were considered.
Requiring fewer distributional assumptions, however, came at a price when we interpreted the results of the detailed decomposition. The contributions made by each covariate, which were estimated by non-sequential weighting, did not in general sum up to the total effect, because the total effect could not be additively separated into the individual covariates in non-linear models.
Finally, this paper applied the proposed decomposition method to study differences in socio-economic inequality in excess weight between rural and urban populations in India, where obesity is increasing gradually among the middle and lower socio-economic groups, especially among urban women. A significantly steeper pro-rich gradient was found in rural areas than in urban areas. The observed difference in the concentration indices was explained by the difference in the marginal distributions, rather than the difference in dependence structure. A large proportion of the difference was explained by the difference in the covariate distributions between rural and urban areas. Scheduled caste/tribe and the availability of safe fuel for cooking made the largest marginal contributions to the observed differences found among the female and male samples, respectively. The proposed copula-based method and the Wagstaff CI decomposition show that the difference in the covariate distribution explains the larger part of the observed difference in the socio-economic inequality, while in the RIF regression-based CI decomposition, a smaller proportion of the observed difference is attributable to the difference in the mean of covariates. This suggests that the identifying assumptions may play substantial roles in the decomposition analysis. However, the three methods consistently identify the use of safe fuel for cooking as a large contributor to the observed difference.

Declarations
• Ethics approval and consent to participate: This study uses only publicly available secondary data with no identifiable information.
• Consent for publication: The author agrees to publish this study.
• Availability of data and material: The data in this study is publicly available.
• Competing interests: There are no known conflicts of interest associated with this publication.
• Authors' contributions: Conceptualisation, statistical analysis and writing up.
• Acknowledgements: None.

Hence,
Part 3: Under the assumptions of invariance of conditional distribution and common support,
Similarly, under the assumptions of invariance of conditional distribution and common support,
Therefore,
Furthermore, under the assumptions of invariance of conditional distribution and common support,
If m_A(·) = m_B(·) and n_A(·) = n_B(·), F^(B−A)_HY(u, v) = 0, ∀u ∈ H_B, v ∈ Y_B holds. By Sklar's theorem, the copula stability assumption, and equations (A.10) and (A.15),
Therefore,
Part 4: Under the assumptions of invariance of conditional distribution and common support,
Similarly, under the assumptions of invariance of conditional distribution and common support,
Hence,
Furthermore, by the assumptions of copula stability, invariance of conditional distribution, common support, and equations (A.32)-(A.38),

Table B.1 shows the descriptive statistics in respective areas. Both women and men in urban areas have a higher BMI (p < 0.01) and excess weight (p < 0.01) on average than those in rural areas. Average wealth among urban residents is higher than that among rural residents (p < 0.01), and the proportions of professional workers, clerical workers, sales and service-sector workers are higher in urban areas (p < 0.01). On the contrary, the proportion of farmers is significantly higher in rural areas (p < 0.01). Both women and men in urban areas are more likely to be literate (p < 0.01) and have higher academic achievements (p < 0.01). In addition, higher proportions of households in urban areas have access to piped drinking water (p < 0.01), electricity (p < 0.01) and a flushing toilet (p < 0.01). They are more likely to own cars (p < 0.01) and mobile phones (p < 0.01), and to use safe fuel for cooking (p < 0.01).
A significantly higher proportion of people watch television every day in urban areas than in rural areas (p < 0.01). Rural samples show higher proportions of smokers (p < 0.01) and frequent alcohol drinkers (p < 0.01).
Contributions (Contr) are expressed as proportions of the observed differences in the concentration index.