A Semi-Parametric Analysis of Childhood Stunting in India: District Level Scenario



Abstract
Background
A growing body of literature has recognized the urgency of addressing childhood stunting (low height for age) in India and has dedicated substantial resources to identifying the factors that are strongly associated with its considerable prevalence in the country. However, most of these studies have focused on parametric models with limited use of geographic information and a pre-fixed assumption of linear association between various factors and stunting outcomes. The present study re-investigates the nonlinear association of certain covariates, along with spatial effects, using a flexible Bayesian semi-parametric regression approach at the district level.

Method
Data are taken from the fourth round of the National Family Health Survey (NFHS-4, 2015-16). The analysis is based on data from 2,24,190 children with complete anthropometric measurements of weight and height.

Results
The paper makes two major contributions. First, it challenges the pigeonholing of covariates into linear associations. It identifies a nonlinear association of child's age, mother's age at birth, and mother's BMI with childhood stunting. Children's height-for-age z-score (HAZ) worsens up to the age of 20 months and then stabilizes at a lower level in the states with a high prevalence of childhood stunting. Almost all high-prevalence states display an inverted-U association for maternal BMI, suggesting that not just underweight mothers but also overweight mothers are likely to have stunted children. Second, the results indicate high spatial clustering of poorly performing districts in states with a high prevalence of childhood stunting. For example, the districts in the west of Bihar show significantly higher levels of childhood stunting than those in the east.

Conclusions
This is the first time that a flexible and realistic model has been applied to district-level data to identify regional variation and to highlight the problem of assuming a linear association for all correlates of childhood stunting. The findings are a novel attempt to rethink restrictive modeling approaches to various public health issues in a regionally diverse country such as India.

Background
Stunting (low height for age) is a chronic form of malnutrition among children under the age of five, usually resulting from chronic or recurrent undernutrition. Stunting is associated with poverty, poor maternal health and nutrition, frequent illness, and inappropriate feeding and care in early life (De Onis et al., 2012; WHO | Stunting in a Nutshell). It also has severe long-term effects in the form of low cognitive development, poor school performance in childhood, reduced income, and increased risk of nutrition-related chronic diseases in adulthood (Stop Stunting | UNICEF India). Globally, 21.3% (144 million) of all under-five children were affected by stunting in 2019 (WHO-Fact Sheets, 2020; UNICEF et al., 2020), and more than half of all stunted children under five years lived in Asia. The prevalence of stunting in India is around 38%, which is greater than the average of 25% among developing countries (Country Overview Malnutrition Burden; IIPS, ICF (2017)). Because the indicator is comprehensive, most international and national nutrition targets are fixed in terms of childhood stunting. For example, the second Sustainable Development Goal aims to achieve the World Health Assembly's target of reducing childhood stunting by 40% by 2025 and to eliminate all forms of malnourishment by 2030. Under the recently launched domestic scheme, POSHAN Abhiyaan, the Government of India aims to reduce childhood stunting to 25% by 2022 and sets a target of a 2 percent reduction every year.
To lower the levels of childhood stunting, a growing body of literature has attempted to find the correlates of stunting. Results from these works suggest that many demographic, socio-economic, and geographical factors, such as child's age and sex, mother's BMI and age at the birth of the child, place of residence, wealth index, birth order, feeding practices, and diseases like diarrhea or fever, affect the stunting outcome of children under five years of age (Chowdhury et al., 2016; Delbiso et al., 2016; Gayawan et al., 2016; Khan & Mohanty, 2018; Raj et al., 2014; Sanghvi et al., 2001; Singh et al., 2015; Takele, 2013; Talukder, 2017; Yadav et al., 2015). However, most of these studies have been limited to examining the level of malnutrition and exploring various associated factors. They have generally failed to incorporate all the socio-economic, demographic, and health-related effects, including spatial effects, in a single framework. Additionally, most previous studies assume a strictly linear effect of these covariates on the stunting outcome, which may not always be appropriate for the continuous covariates in the data.
Many continuous variables, such as child's age, mother's BMI, mother's age at birth (Fahrmeir & Khatab, 2008; Fenske et al., 2011; Ferede Asena, 2015; Getnet Ayele et al., 2016; N. Kandala et al., 2001; N. B. Kandala et al., 2011; Khatab, 2010; Mohammed & Asfaw, 2018; Takele, 2013; Yadav et al., 2015), duration of breastfeeding (Fenske et al., 2011), and mother's educational attainment (Yadav et al., 2015), are assumed to have a non-linear effect on the stunting outcome of children under age five. More often than not, it is difficult to model the nonlinear effect of such covariates through a parametric functional form (Fahrmeir & Lang, 2001; N. B. Kandala et al., 2011); doing so requires more flexible semiparametric forms of predictor modeling (Fahrmeir & Lang, 2001). Few studies on stunting have addressed the spatial effect, which is an essential aspect from a policy standpoint because most health outcomes in India, including childhood stunting, are characterized by vast regional heterogeneity. This is mainly due to wide differences in health care utilization, infrastructural gaps, socio-economic level, availability of food, and dietary diversity (Hagos et al., 2017; Khan & Mohanty, 2018). Spatial effects cannot be incorporated, while simultaneously controlling for all the other covariates, using conventional multilevel modeling with uncorrelated random effects (N. B. Kandala et al., 2011).
To address these gaps in the modeling of childhood stunting, the study uses the Bayesian geo-additive modeling technique, a flexible approach that incorporates fixed effects (categorical variables), nonlinear effects of metrical covariates, spatial effects, and time-varying effects of other variables in a unified framework, while controlling for the hierarchical nature of the data via random effects (Fahrmeir & Lang, 2001; Gayawan et al., 2014; N. B. Kandala & Ghilagaber, 2006; Khatab, 2010; Mohammed & Asfaw, 2018; Takele, 2013). The Bayesian approach updates a prior belief about the probability of an event in light of the observed data. It enables one to make probability statements about the likely values of parameters. Credible intervals have a straightforward probabilistic interpretation, unlike confidence intervals in the frequentist approach (Bacha & Tadesse, 2019; Grzenda, 2015). The present study uses data from the latest round of the Indian Demographic Health Survey (NFHS, 2015-16); based on the prevalence of childhood stunting, a few states are selected for district-level analysis: four states with the highest prevalence (Bihar, Jharkhand, Uttar Pradesh, and Rajasthan) and four with the lowest prevalence (Assam, West Bengal, Punjab, and Kerala).
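The idea of updating a prior belief with observed data and reading off a credible interval can be illustrated with a minimal sketch. It assumes a conjugate normal prior on a mean with known sampling standard deviation, which is far simpler than the geo-additive model used in this study; the function names and all numbers are hypothetical.

```python
from statistics import NormalDist


def normal_posterior(prior_mean: float, prior_sd: float,
                     sample_mean: float, sigma: float, n: int) -> NormalDist:
    """Posterior of a normal mean under a normal prior and known sigma."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted average of the prior mean and the sample mean.
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return NormalDist(post_mean, post_var ** 0.5)


def credible_interval(dist: NormalDist, level: float = 0.95):
    """Equal-tailed credible interval of a normal posterior."""
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Hypothetical example: vague prior centred at 0, 25 children with mean HAZ -1.5.
posterior = normal_posterior(prior_mean=0.0, prior_sd=1.0,
                             sample_mean=-1.5, sigma=1.5, n=25)
lower, upper = credible_interval(posterior)
print(f"posterior mean {posterior.mean:.3f}, 95% CrI ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval can be read directly as "the parameter lies in this range with 95% posterior probability", which is the interpretation contrasted with confidence intervals above.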
The study aims to examine the non-linear impact of selected covariates on the stunting outcome (height-for-age z-score) of children under the age of five. It also investigates the spatial effect of districts in a few selected states of India on the childhood stunting outcome, using a semi-parametric regression model that takes into account the non-linear effects of continuous covariates at the individual level and the spatial variation across districts, along with the effects of categorical variables, within a unified equation. This is the first study in India to account for spatial effects at the district level in studying childhood stunting, an important step given the wide regional differences in childhood stunting in a vast country like India.

Methods
The study used data from the latest round of the National Family Health Survey, conducted in 2015-16, a nationally representative household survey that provides data for monitoring and impact evaluation in the areas of population, health, and nutrition. The survey drew a representative sample of women of reproductive age (15-49 years), administering a questionnaire and making an anthropometric assessment of the women and of their children born in the five years before the survey. The analysis for this study is based on data from 2,24,190 children with complete anthropometric measurements of weight and height.
Outcome variable: In the present study, stunting outcomes of children under the age of five years are measured using height-for-age z-scores. The z-score represents the number of standard deviations by which an individual child's anthropometric index differs from the median of the World Health Organization international growth reference population ("WHO Child Growth Standards", 2009). Children with z-scores more than two standard deviations below the median of the reference population for height-for-age are considered "stunted".
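The definition of the outcome can be sketched in a few lines. This is a simplified illustration, assuming the reference median and standard deviation for the child's age and sex have already been looked up; the function names and the reference values below are hypothetical and are not actual WHO table entries.

```python
def height_for_age_z(height_cm: float, ref_median_cm: float,
                     ref_sd_cm: float) -> float:
    """Number of reference SDs by which a height differs from the reference median."""
    if ref_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (height_cm - ref_median_cm) / ref_sd_cm


def is_stunted(haz: float, cutoff: float = -2.0) -> bool:
    """A child is classified as stunted when HAZ falls below the cutoff."""
    return haz < cutoff


# Hypothetical reference values for one age/sex group (NOT real WHO values).
haz = height_for_age_z(height_cm=77.0, ref_median_cm=84.0, ref_sd_cm=3.0)
print(f"HAZ = {haz:.2f}, stunted = {is_stunted(haz)}")
```

In the regression analysis the continuous z-score itself is the response; the binary classification is only used for describing prevalence.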
Predictor variable: Various factors affect the stunting outcomes of children under five years of age. Table 1 describes (variables, categories, prevalence of stunting, and sample size) all the socio-economic, demographic, and health-related variables used in this study, which are considered important determinants of childhood stunting in the past literature (Biswas, 2014; N. B. Kandala et al., 2011; Khan & Mohanty, 2018; Khatab, 2010; Mohammed & Asfaw, 2018; Yadav et al., 2015).
The histogram (Figure 1) of children's height-for-age z-scores shows clear evidence that a Gaussian regression model is a reasonable choice for the dependent variable. Figure 2 shows scatter plots of height-for-age z-scores against each continuous covariate: child's age (in months), mother's body mass index (kg/m²), and mother's age at birth.
The figure shows no definite pattern in the relationship between the indicators of children's nutritional status and the child's and mother's characteristics (the continuous variables). Hence these variables are kept in continuous form for the analysis.
Statistical Analysis-Bayesian Geo-additive Regression model
The statistical analysis employed in this paper is based on the Bayesian approach. Bayesian geo-additive regression is a flexible approach that permits analysis of the non-linear effects of the continuous covariates and the spatial effect of the districts, along with the usual linear effects of the categorical variables, within a consolidated semi-parametric equation. As the predictor contains usual linear terms, nonlinear effects of metrical covariates, and geographic effects in additive form, such models are called geoadditive regression models.
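The additive structure described above (linear terms plus smooth nonlinear terms plus a district effect) can be sketched as a plain function. This is only an illustration of how the pieces add up, not the estimation procedure: the smooth functions, coefficients, and district effects below are hypothetical toy values rather than estimates from this study.

```python
import math
from typing import Callable, Dict, Sequence


def geoadditive_predictor(w: Sequence[float], gamma: Sequence[float],
                          x: Sequence[float],
                          smooths: Sequence[Callable[[float], float]],
                          district: str, f_spat: Dict[str, float]) -> float:
    """Sum of linear terms, smooth nonlinear terms, and a district effect."""
    if len(w) != len(gamma) or len(x) != len(smooths):
        raise ValueError("covariates and effects must have matching lengths")
    linear = sum(wj * gj for wj, gj in zip(w, gamma))
    nonlinear = sum(f(xj) for f, xj in zip(smooths, x))
    return linear + nonlinear + f_spat[district]


# Hypothetical smooth effects: a drop with age that flattens, an inverted U in BMI.
def f_child_age(months: float) -> float:
    return -1.0 * (1.0 - math.exp(-months / 10.0))


def f_mother_bmi(bmi: float) -> float:
    return -0.01 * (bmi - 24.0) ** 2


eta = geoadditive_predictor(
    w=[1.0, 0.0], gamma=[0.2, 0.1],          # two dummy-coded categorical covariates
    x=[18.0, 21.0], smooths=[f_child_age, f_mother_bmi],
    district="A", f_spat={"A": -0.3, "B": 0.1},
)
print(f"predicted mean HAZ contribution: {eta:.3f}")
```

In the actual model the smooth functions and district effects are unknown and are estimated jointly with the linear coefficients.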
Consider observations $(y_i, x_i, w_i)$, $i = 1, \dots, n$, where $y_i$ is the response variable, $x_i$ is a vector of continuous covariates, and $w_i$ is a vector of categorical covariates. We assume that the $y_i$, given the covariates and unknown parameters, are independent and Gaussian with mean $\eta_i$ and a common variance $\sigma^2$ across subjects, i.e. $y_i \sim N(\eta_i, \sigma^2)$. Here the response variable is the stunting outcome measured as a height-for-age z-score. Traditionally, the effect of the covariates on the response is modeled by a linear predictor, $\eta_i = x_i'\beta + w_i'\gamma$. To explore possible non-linear effects, we replace the strictly linear predictor with a more flexible semiparametric predictor,

$$\eta_i = f_1(x_{i1}) + \dots + f_p(x_{ip}) + f_{spat}(d_i) + w_i'\gamma,$$

where the functions $f_1, \dots, f_p$ are non-linear smooth effects of the continuous covariates, modeled by Bayesian P-splines, and $f_{spat}(d_i)$ is the effect of the district $d_i \in \{1, \dots, d\}$ where mother $i$ lives. All computations for the present study were done using Stata 15 and the BayesX function in R version 3.6.1.

Results
Table 1 presents the level of stunting in India by various explanatory variables. In India, around 38% of children below five years of age are stunted. The prevalence is higher among males, children aged 16-30 months, those who are financially better off, and those living in rural areas. Households with improved sanitation facilities have fewer stunted children under five years of age.

The Fixed Effects of correlates on stunting outcomes in India and selected states
The results of the fixed-effects estimation of parametric coefficients for the childhood stunting outcome are shown in Table 2. The output gives posterior means and posterior medians along with their standard deviations and 95% credible intervals. Household wealth status, mother's education level, child's birth order, and sex of the child were significantly associated with the stunting status of a child (at the 5% significance level). The geo-additive model shows that female children are in an advantageous position, with higher height-for-age z-scores (HAZ) than male children in all the selected states of India. Children belonging to non-poor households, with literate mothers, and living in households with improved sanitation facilities have higher mean HAZ and are less likely to be stunted, although in Punjab improved sanitation is not a significant predictor of stunting outcomes. Place of residence is the only significant factor in Jharkhand, where urban children have a better mean HAZ score and hence are healthier than their rural counterparts.
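The summaries reported for each fixed effect (posterior mean, median, standard deviation, and 95% credible interval) can be computed from posterior draws as in the sketch below. It assumes the draws for one coefficient are available as a list; here they are simulated with a seeded random generator as a hypothetical stand-in for MCMC output, and the helper names are illustrative.

```python
import random
from statistics import mean, median, stdev
from typing import Dict, List, Sequence


def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def summarize_draws(draws: Sequence[float], level: float = 0.95) -> Dict[str, float]:
    """Posterior mean, median, SD, and equal-tailed credible interval."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    return {
        "mean": mean(ordered),
        "median": median(ordered),
        "sd": stdev(ordered),
        "lower": quantile(ordered, tail),
        "upper": quantile(ordered, 1.0 - tail),
    }


# Hypothetical draws for one coefficient (simulated, not from the fitted model).
rng = random.Random(0)
draws: List[float] = [rng.gauss(0.15, 0.05) for _ in range(2000)]
summary = summarize_draws(draws)
print({name: round(value, 3) for name, value in summary.items()})
```

A coefficient whose credible interval lies entirely on one side of zero is the natural Bayesian counterpart of the significance statements made in the text.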
The Non-Linear Effect of the continuous covariates on stunting outcome of children under five years of age
Figures 3-5 show the flexible modeling of the effect of the child's age, mother's age at birth, and mother's BMI (kg/m²) on the mean height-for-age z-score (HAZ) for children under five years of age in a few selected states of India. The posterior means are displayed together with 80% and 95% point-wise credible intervals.
Results show (Figure 3) that the influence of a child's age on stunting is considerably high in the early months. In the high-prevalence states, the mean HAZ score decreases very rapidly until the age of 20 months and shows an 'L'-shaped pattern in Bihar and Jharkhand, whereas in Uttar Pradesh and Rajasthan the mean score fluctuates periodically after 20 months. In comparison, the low-prevalence states show a slower decline in the mean HAZ score. In Assam and Punjab, the decline starts after 5 months and the score fluctuates at a lower level after 20 months. Kerala, unlike any other state, shows a lower rate of decrease of the mean HAZ score: a sudden drop is seen between 10 and 20 months of age, after which the score increases again over the next 10 months and stabilizes at a moderate height-for-age z-score.
The high-prevalence states and Punjab show a positive relationship between the age of the mother at the time of birth and the child's HAZ score (Figure 4). Younger mothers tend to have more stunted children than older mothers: as the age of the mother increases, the child's HAZ score increases. Unlike other states, Assam and West Bengal show an inverted-U pattern: mothers who gave birth late in their reproductive life (after 30 years) also have children with low HAZ scores. Kerala does not show any significant non-linear effect of the mother's characteristics on a child's height-for-age z-score, and the effect could have been modeled linearly. Figure 5 shows the effect of the mother's BMI (kg/m²) on the child's mean HAZ score. In Jharkhand, Assam, and West Bengal, the child's height-for-age z-score increases with the mother's BMI.
In Rajasthan, Uttar Pradesh, Punjab, and Bihar the trend is also increasing (starting from a higher z-score level), but only up to a certain BMI range, after which it decreases for mothers who are overweight or obese.
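The distinction drawn above between an increasing trend and an inverted-U pattern can be made concrete with a crude heuristic: evaluate the estimated smooth effect on a grid and check where it peaks. The sketch below assumes such a grid of posterior mean effects is available; the curves are hypothetical, and the heuristic ignores the credible bands that a careful reading of the figures would take into account.

```python
from typing import Sequence, Tuple


def describe_curve(grid: Sequence[float],
                   effect: Sequence[float]) -> Tuple[str, float]:
    """Label a curve by the location of its maximum on the grid."""
    if len(grid) != len(effect) or len(grid) < 3:
        raise ValueError("need at least three matching grid points")
    peak = max(range(len(effect)), key=lambda i: effect[i])
    if peak == 0:
        shape = "decreasing"
    elif peak == len(grid) - 1:
        shape = "increasing"
    else:
        shape = "inverted U"  # interior maximum
    return shape, grid[peak]


# Hypothetical effect curves over maternal BMI (not estimates from the study).
bmi_grid = [float(b) for b in range(15, 36)]
inverted_u = [-0.004 * (b - 25.0) ** 2 for b in bmi_grid]
rising = [0.02 * b for b in bmi_grid]

print(describe_curve(bmi_grid, inverted_u))
print(describe_curve(bmi_grid, rising))
```

An interior peak marks the BMI beyond which the estimated effect on HAZ starts to fall, which is the pattern described for mothers who are overweight or obese.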
The Spatial Effect of the districts on the stunting outcome of Children in a few selected Indian states
Figure 6 shows the estimated mean effect of the districts on mean HAZ for children under five years of age in a few selected states of India. The posterior means of the estimated residual spatial effect are mapped, controlling for all other continuous and categorical covariates, such as child's age and sex, mother's BMI and age at birth, place of residence, wealth quintile, sanitation, and drinking water facility. The level of stunting among children under five years of age is significantly higher (lower) in districts colored red (blue), while for districts in light colors it is not significant.
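The three-way reading of the maps (significantly higher, significantly lower, not significant) can be sketched as a classification of each district's credible interval. This assumes, as an illustration, that significance is judged by whether the 95% credible interval of the district effect on HAZ excludes zero; the district names and interval values are hypothetical placeholders.

```python
from typing import Dict, Tuple


def classify_district_effect(lower: float, upper: float) -> str:
    """Classify a district's spatial effect on HAZ from its credible interval."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < 0.0:
        return "significantly lower HAZ"   # i.e. a higher level of stunting
    if lower > 0.0:
        return "significantly higher HAZ"  # i.e. a lower level of stunting
    return "not significant"


# Hypothetical 95% credible intervals for three placeholder districts.
intervals: Dict[str, Tuple[float, float]] = {
    "District A": (-0.42, -0.08),
    "District B": (-0.10, 0.15),
    "District C": (0.05, 0.31),
}
labels = {name: classify_district_effect(lo, hi)
          for name, (lo, hi) in intervals.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Because the effect is measured on the HAZ scale, an interval entirely below zero corresponds to a significantly higher level of stunting.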
After controlling for the other covariates, the posterior means of the estimated residual spatial effect in Bihar show an east-west divide in childhood stunting: the districts in the western part, i.e. Pashchim Champaran, Purb Champaran, Siwan, and Saran, show a significantly higher risk of stunting in children under five years of age. This indicates that the covariates considered under the model are important determinants of childhood stunting in these districts. The eastern districts of Uttar Pradesh have a lower level of stunting, suggesting healthier children with higher mean HAZ, whereas the level of stunting is higher in a few districts in the southern and western parts of the state. Shravasti district (Uttar Pradesh) has a significantly higher level of stunting than any other district of the state. Although Rajasthan and Jharkhand have a high prevalence of childhood stunting, they show a somewhat smoothed pattern except in a few districts in the south-western part (with a higher level of stunting), whereas the districts of Saraikela (Jharkhand) and Ganganagar (Rajasthan) have significantly lower levels of stunting.
The mean spatial effect in Assam is significantly high, and the districts in the extreme west (Dhubri, Bongaigaon, Chirang) and extreme east (Tinsukia) of Assam have a significantly higher risk of a child being stunted. In Punjab, the level of stunting is significantly higher in the district of Mohali, followed by Muktsar and Patiala. For other parts of the state, the level of stunting is not significant and shows a smoothed pattern overall. A north-south divide is seen in West Bengal, with the southern part having a higher risk of a child being stunted, Medinipur being an exception. Unlike other states, Kerala shows a smoothed estimated spatial effect (uniform color pattern) throughout the state except for Idukki (with significantly lower levels of stunting), suggesting that the level of stunting is not significant in the state once other socio-economic factors are controlled for.
However, most of these studies have generally failed to include the non-linear association of some covariates and spatial dependence simultaneously in a single framework. Though parametric models are easy to understand and interpret, they fail to give a realistic representation of the relationship for some correlates. This study adopts more flexible and realistic modeling of childhood stunting and adds to the extant literature on the non-linear nature of the association of certain correlates, such as the age of a child, mother's age at birth, and mother's BMI. Our paper further contributes by providing more robust estimates of spatial effects at the district level.
The Bayesian approach makes it possible to combine prior information, expressed through the prior distribution, with the sample data, which further improves the accuracy and credibility of the estimates. Credible intervals have a probabilistic interpretation and are easier to understand. The approach used in the study is flexible and allows the joint modeling of fixed effects, nonlinear effects, and spatial effects of districts in a unified regression model.
The Bayesian semi-parametric model analysis revealed that sex, birth order, wealth quintile, and mother's educational level were significantly associated with childhood stunting. The child's age, mother's BMI, and mother's age at birth were found to be statistically significant non-linear covariates affecting childhood stunting. Results showed that height-for-age continuously worsened up to the age of 20 months in almost all the states, with a steep decline in high-prevalence states but a steady decline in low-prevalence states. These results strongly support the first 1000 days approach adopted by the ongoing POSHAN Abhiyaan (National Nutrition Mission) in India. The first 1000 days after birth is a crucial period to bolster growth in a child's life, as childhood stunting is an irreversible outcome of inadequate nutrition. Past studies have confirmed the association of growth faltering with a child's age but show wide regional variations (15 months in Egypt, 17-30 months in Ethiopia) (Fahrmeir & Khatab, 2008; Ferede Asena, 2015; Mohammed & Asfaw, 2018; Takele, 2013). The mother's age at birth and her BMI also showed a significant non-linear relationship with the childhood stunting outcome, characterized by wide state-wise variations. In states like Assam and West Bengal, an inverted-U pattern is seen, suggesting that children born to older women (more than 30 years of age) are also more likely to be stunted. The average childbearing age in these states is 25 and 23 years, respectively. Around 35% of children in West Bengal and 44% of children in Assam born to mothers below 19 years are stunted. On the other hand, around 39% of children born to mothers older than 35 years are stunted in both states.
Similarly, states like Rajasthan, Uttar Pradesh, Punjab, and Bihar show that a child's HAZ increases only up to a certain BMI range and then decreases for mothers who are overweight or obese. These results suggest that not only underweight mothers but also those outside the normal BMI range (including overweight and obese mothers) and those who gave birth late in their reproductive span are more likely to have stunted children. The environment conducive to overweight and obesity is equally unwelcoming and contributes to stunting. Therefore, it is important to develop culturally sensitive public health strategies that can simultaneously address the needs of both groups. Unlike the other selected states, Kerala did not show any significant effect of the mother's BMI and her age at birth on children's height-for-age scores, as the relationship came out to be a straight line that could also be modeled under linear assumptions. This could be due to differences in the coverage of, and children's participation in, the Integrated Child Development Services between Kerala and low-performing states.
The study shows strong spatial dependence: the district where the child lives has a highly significant effect on the stunting outcome of children under five years, after controlling for all other linear and non-linear covariates within a unified model. There seems to be a clear east-west divide in the estimated mean effect of districts on childhood stunting in Bihar and a north-south divide in West Bengal. The spatial pattern indicates that the factors included in the model affect the height-for-age z-score in the districts and play an important role in further reducing the level of stunting in the low-performing states. Kerala, on the other hand, being a developed state, showed a smooth estimated spatial effect (uniform color pattern) throughout the state, suggesting that the level of stunting is not significant once other covariates are controlled for. Owing to effective policies and programs in better-performing states such as Kerala, the effect of a mother's BMI, education, and age at first birth on stunting may be mitigated.
The present study is limited by data constraints and was not able to control for latent factors like temperature, rainfall, natural disasters, or other environmental and seasonal covariates in the stunting model. However, the study makes a significant contribution to the extensive childhood stunting research in India, as it adopts a novel approach to re-examine the association of covariates with stunting outcomes using a Bayesian semi-parametric approach. The findings strongly advocate considering the non-linear effects of various correlates while controlling for spatial effects on stunting outcomes in a country like India, where widespread regional variation in public health outcomes is one of the most significant challenges. To conclude, the prevalence of childhood stunting in India depends on numerous socio-economic and demographic factors, including some that are non-linearly related to the stunting outcome, such as child's age, mother's BMI, and mother's age at birth. Moreover, not just young and underweight mothers but, in a few states, overweight mothers are also to be considered at risk. After controlling for all the linear and non-linear covariates in a single framework using a semi-parametric approach, there still appeared to be wide regional variation in the factors that affect the stunting outcome of children. Factors like mother's age and BMI, which are not significant in a state like Kerala, are of much importance in the states with high stunting prevalence and could not have been modeled with linear assumptions.
Studies based in African countries argue that, after controlling for spatial dependence, the significance of certain covariates in the fixed part of a semi-parametric model was revealed (Gayawan et al., 2014, 2016). This was lost in a parametric approach. To our knowledge, this is the first study in India that applies flexible semi-parametric modeling at the district level of selected states divided by their prevalence of childhood stunting. The findings from this exercise aim to initiate a rethinking of restricted parametric modeling of certain public health indicators in India that are characterized by wide regional variations.

Fixed Effect
· Place of residence is significant only in Jharkhand, where urban children are less likely to be stunted.
· Wealth quintile (all four states), mother's education (all except Kerala), and birth order (Punjab and Assam) are significant; sanitation and drinking water facility and gender are significant in West Bengal only.

Non-Linear Effect
Child's age
· Steep decline in the mean HAZ until 20 months of age.
· Bihar and Jharkhand show an 'L'-shaped trend.
· Uttar Pradesh and Rajasthan show a periodic fluctuation in the mean score after 20 months.
· Assam and Punjab show an initial increase in the HAZ score (3-5 months) and then a decrease until the 20th month, with fluctuations in later months.
· Kerala shows a lower rate of decrease of the mean HAZ score, which later stabilizes at a moderate height-for-age z-score.
Mother's age at birth
· All the states show a positive relationship: as the age of the mother increases, the child's HAZ score increases, so the child is less likely to be stunted.
· Assam and West Bengal show an inverted-U pattern, unlike other states.
· Kerala does not show any significant non-linear pattern and could have been modeled linearly.
Mother's BMI
· Uttar Pradesh, Bihar, and Rajasthan show an inverted-U pattern, revealing that not only underweight mothers but also those outside the normal BMI range (including obese and overweight mothers) are more likely to have stunted children.
· Assam and West Bengal show an increasing trend.
· Kerala does not show any effect of the mother's BMI on a child's height-for-age z-score.

Spatial Effect
· East-west divide in Bihar, with western districts having a significantly higher level of stunting.
· Jharkhand and Rajasthan show a smoothed pattern except for a few districts in the southwest.
· North-south divide in West Bengal, where southern districts have significantly higher chances of having stunted children.
· Kerala, unlike other states, shows a smoothed pattern, suggesting that stunting is not significant and that the factors taken into consideration may not be the important factors explaining the variation.

Figures
Figure 6
Posterior means of total spatial effect on stunting outcome of children under five years of age for a few selected states of India, 2015-16. Note: Maps are created using R Software version 3.6.1; R Core Team (2019); URL: https://www.R-project.org/ Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.
