Empirical Model
LPG Regressions. —The first set of DiD regressions compares the effect of exposure to the PMUY policy for the treatment group relative to the comparison group, first with regard to obtaining an LPG connection (any kg of LPG, any large cylinders of LPG, and any small cylinders of LPG) and then with regard to overall LPG consumption (kg of LPG, number of large cylinders, and number of small cylinders). The equation describing these regressions is as follows:
(1) 𝑦𝑖 = 𝛽0 + 𝛽1 𝑇𝑅𝑇𝑖 + 𝛽2𝐴𝐹𝑇𝑖 + 𝛽3𝑇𝑅𝑇𝑖 ∗ 𝐴𝐹𝑇𝑖 +𝛽4𝑋 + 𝜀𝑖
where 𝑦𝑖 is, in turn, a binary variable for obtaining any LPG (kg, large cylinders, or small cylinders) or a continuous variable for kg of LPG, number of large cylinders, or number of small cylinders. TRT is a dummy variable indicating whether the household was eligible for the PMUY policy, AFT is a dummy variable indicating whether the observation is from before or after treatment, X is a vector of control variables, and 𝜀 is the error term. These equations are estimated using an intent-to-treat linear regression DiD model with kernel propensity score matching on the common support, using the Stata command DIFF 35. We assume that both the parallel trends assumption and the common shocks assumption of DiD are satisfied. Because we have only one period of data prior to PMUY implementation, we implement the standard approach to correct for any deviations from the parallel trends assumption: propensity score matching 36. With respect to the common shocks assumption, we are unaware of any simultaneously implemented policies that would have affected only the treatment group or only the comparison group.
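To make the role of 𝛽3 concrete, the following minimal sketch computes the two-by-two DiD estimate from cell means. It assumes no control variables X and no matching weights, in which case the OLS coefficient on the TRT ∗ AFT interaction coincides with this difference of differences. The function name and the toy data are hypothetical and are not taken from the ACCESS survey.

```python
from statistics import mean


def did_estimate(rows):
    """Return the 2x2 difference-in-differences estimate.

    rows: iterable of (trt, aft, y) tuples, with trt and aft coded 0/1.
    """
    cells = {(t, a): [] for t in (0, 1) for a in (0, 1)}
    for trt, aft, y in rows:
        cells[(trt, aft)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    # (treated after - treated before) - (comparison after - comparison before)
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Hypothetical toy data: y = 1 if the household has any LPG, else 0.
toy = [
    (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 0, 1),  # comparison, before
    (0, 1, 1), (0, 1, 1), (0, 1, 0), (0, 1, 1),  # comparison, after
    (1, 0, 0), (1, 0, 0), (1, 0, 0), (1, 0, 1),  # treatment, before
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0),  # treatment, after
]

beta3 = did_estimate(toy)
print(beta3)
```

In this toy example the treatment group gains 0.50 and the comparison group gains 0.25, so the estimate attributes the remaining 0.25 to eligibility. Adding controls or matching weights, as in our estimation, would generally change the number.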
The second regression allows us to determine whether, if a household did take advantage of the policy (i.e., obtained a free connection), this led to increased consumption of LPG compared to the non-equivalent comparison group.
(2) 𝑦𝑖 = 𝛽0 + 𝛽1 𝑇𝑅𝑇𝑖 + 𝛽2𝐴𝐹𝑇𝑖 + 𝛽3𝑇𝑅𝑇𝑖 ∗ 𝐴𝐹𝑇𝑖 + 𝛽4𝑋 + 𝜀𝑖 if 𝑦𝑖 > 0
where 𝑦𝑖 is the amount of LPG in kg that the household purchased annually, TRT is a dummy variable indicating whether the household was eligible for the PMUY policy, AFT is a dummy variable indicating whether the observation is from before or after treatment, X is a vector of control variables, and 𝜀 is the error term.
Tier and Home Delivery Regressions. —The final two regressions compare the effect of exposure to the PMUY policy for the treatment group relative to the comparison group with regard to opting for home delivery and moving up a tier group, and are defined as:
(3) 𝑦𝑖 = 𝛽0 + 𝛽1 𝑇𝑅𝑇𝑖 + 𝛽2𝐴𝐹𝑇𝑖 + 𝛽3𝑇𝑅𝑇𝑖 ∗ 𝐴𝐹𝑇𝑖 +𝛽4𝑋 + 𝜀𝑖
where 𝑦𝑖 is a binary dependent variable for home delivery and for moving up a tier group, respectively, TRT is a dummy variable indicating whether the household was eligible for the PMUY policy, AFT is a dummy variable indicating whether the observation is from before or after treatment, X is a vector of control variables, and 𝜀 is the error term. These equations are estimated using DIFF as described above.
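The kernel propensity score matching step can be illustrated generically. The sketch below assumes propensity scores have already been estimated, uses an Epanechnikov kernel with a hypothetical bandwidth, and treats the common support as the range of comparison-group scores. It is a textbook-style illustration, not a description of the internals of the DIFF command, and all names and numbers are hypothetical. In a DiD setting the outcome passed in would be a before-to-after change.

```python
def epanechnikov(u):
    """Epanechnikov kernel: positive only for |u| <= 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_counterfactual(p_i, comparison, bandwidth):
    """Kernel-weighted mean outcome of comparison units near score p_i.

    comparison: list of (propensity_score, outcome) pairs.
    Returns None when no comparison unit falls inside the bandwidth.
    """
    weights = [epanechnikov((p_j - p_i) / bandwidth) for p_j, _ in comparison]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * y_j for w, (_, y_j) in zip(weights, comparison)) / total


def matched_gap(treated, comparison, bandwidth):
    """Mean treated-minus-counterfactual gap on the common support."""
    lo = min(p for p, _ in comparison)
    hi = max(p for p, _ in comparison)
    gaps = []
    for p_i, y_i in treated:
        if not (lo <= p_i <= hi):
            continue  # treated unit lies off the common support
        y0 = kernel_counterfactual(p_i, comparison, bandwidth)
        if y0 is not None:
            gaps.append(y_i - y0)
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical (propensity score, outcome) pairs.
treated = [(0.30, 6.0), (0.55, 9.0), (0.95, 9.0)]
comparison = [(0.25, 4.0), (0.35, 6.0), (0.50, 6.0), (0.60, 8.0)]

att_gap = matched_gap(treated, comparison, bandwidth=0.1)
print(att_gap)
```

The treated unit with score 0.95 is dropped because no comparison household has a score that high, which is the sense in which estimation is restricted to the common support.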
Data Collection and Variables
India’s CEEW conducted the ACCESS survey in conjunction with the National University of Singapore in 2015, prior to the implementation of the PMUY policy, and then again in 2018, after the policy’s implementation 5. Six states were surveyed: Bihar, Jharkhand, Madhya Pradesh, Odisha, Uttar Pradesh, and West Bengal. Enumerators randomly sampled one district within each administrative division (except in West Bengal, where two were sampled because it was the largest administrative division). Within each administrative division, a district was chosen with a probability proportional to population size. Within each district, they selected 7 large villages (from a sample containing 50% of district population) and 7 small villages (from a sample containing 50% of district population). In total, 8,568 responses were collected from 714 villages in 51 districts in 2015. In 2018, they expanded to three additional districts in Odisha to increase the sample size for that state, which added 504 observations 37.
We describe here our dependent, explanatory, and independent variables. The CEEW’s questionnaire is available online 25 (see Data Availability).
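The reported sample sizes are internally consistent, as the short arithmetic check below shows. The figure of 12 households per village is implied by the reported totals rather than stated in the text, and applying the same design to the three added Odisha districts is an assumption that happens to reproduce the reported 504 observations.

```python
# Figures reported for the 2015 round.
districts_2015 = 51
villages_per_district = 7 + 7  # 7 large and 7 small villages
villages_2015 = 714
responses_2015 = 8568

# 51 districts x 14 villages should give the reported village count.
villages_match = districts_2015 * villages_per_district == villages_2015

# Implied, not stated in the text: households interviewed per village.
households_per_village = responses_2015 // villages_2015

# Assumption: the three districts added in 2018 followed the same design.
added_districts_2018 = 3
added_observations = (
    added_districts_2018 * villages_per_district * households_per_village
)

print(villages_match, households_per_village, added_observations)
```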
Dependent Variables. — For the LPG section, the dependent variables of interest are binary variables that indicate the decision to consume any amount of LPG, any amount of large cylinders, and any amount of small cylinders. Then, the continuous dependent variables indicate the amount of LPG (in kgs), the number of large cylinders of LPG, and the number of small cylinders of LPG. Finally, for equation 2, the dependent variable is LPG (in kgs) only if that value is greater than zero.
To address the issue of whether LPG is used in addition to other fuels, we also examine, as dependent variables, the following “tiers” of LPG use: exclusive LPG use, partial LPG use, and no LPG use. The CEEW defined these tiers as a measure of access to clean cooking based on health and safety, availability, quality, affordability, and convenience. The classifications are based on households’ self-reported answers to survey questions. Households are classified as Tier 3 if they use only BLEN (biogas, LPG, ethanol, or natural gas), are satisfied with availability, report adequate quality of cooking, find cooking fuel affordable, and find cooking neither difficult nor time consuming. Households are Tier 2 if they use a mix of traditional fuel and BLEN, are neutral about availability, report adequate quality of cooking, and find cooking either difficult or time consuming. Households are Tier 1 if they use a mix of traditional fuel and BLEN, are unsatisfied with availability, report inadequate quality of cooking, and find cooking both difficult and time consuming. Finally, Tier 0 households use only traditional fuels (firewood, dung cakes, or agricultural residues), cook less because of a lack of availability, report inadequate quality of cooking, find cooking fuel unaffordable, and find cooking both difficult and time consuming 5. Our dependent variable is a binary variable indicating whether the household moved up at least one tier group.
The final dependent variable is home delivery. Home delivery increased from roughly 19% to 39% in 2019. It is unclear how PMUY customers obtain LPG (through home delivery or by procuring it themselves). Kar et al. found that the worst-performing LPG provider (India has three LPG companies) provided home delivery to all villages, while the other two did not. We therefore include home delivery as a dependent variable to investigate whether the policy leads to connection but not to convenient access.
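One plausible coding of the tier outcome is sketched below. It assumes each household has an integer tier (0 to 3) in both survey rounds, and the function and argument names are hypothetical rather than taken from the CEEW data files.

```python
def moved_up_tier(tier_2015, tier_2018):
    """Return 1 if the household moved up at least one tier, else 0."""
    for tier in (tier_2015, tier_2018):
        if tier not in (0, 1, 2, 3):
            raise ValueError(f"tier must be 0, 1, 2, or 3, got {tier!r}")
    return int(tier_2018 > tier_2015)


# Hypothetical households: (tier in 2015, tier in 2018).
examples = [(0, 1), (1, 3), (2, 2), (3, 2)]
print([moved_up_tier(before, after) for before, after in examples])
```

Under this coding, a household that stays in the same tier or moves down is assigned 0, so the indicator captures only upward movement.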
Explanatory Variable. — Our explanatory variable is eligibility for the PMUY policy. Starting in 2016, the PMUY policy offered free connections to all households that had the ration card distinguishing them as BPL. Previous work found that PMUY LPG consumers purchased fewer refills than general customers 22. These households are at a lower income than typical general customers who could afford the high upfront cost of the initial connection.
Independent Variables. — We included the following independent variables in our models based on existing literature on LPG adoption. The descriptive statistic table (Table 4) outlines the summaries of the variables; however, here we discuss the hypothesized direction of effect on the dependent variables. Our model included socio-economic factors that have been studied in relation to adoption of LPG.
State: We included binary indicators for each state surveyed (Bihar, Jharkhand, Madhya Pradesh, Odisha, Uttar Pradesh, and West Bengal, the base case). This controls for differing levels of improvement in clean cooking access 5. Jharkhand and Odisha had the highest percentage increases in the clean cooking access index 5.
Age: We included the age of the respondent, who was either the head of household or the primary cook, as a continuous variable. Studies have found that the age of the head of household is negatively associated with clean fuel adoption 38.
Religion: We included binary indicators for three religion categories (Hindu, Muslim, and Other Religion). Previous studies 38–40 found that adhering to the majority religion in India, Hinduism, was positively correlated with LPG use.
Education: Previous studies have found that level of education has been positively correlated with clean cooking use 12,14,15,41. The survey grouped categories of education into: no education (the base category), up to the 5th Standard, up to the 10th standard, up to the 12th standard, and then graduate education.
Household Daily Expenditure: We used daily expenditure as a proxy for economic status because although BPL status was calculated from a number of socio-economic variables, the PMUY policy was not offered to anyone with an income over 131 Rs per day 42. Household expenditure has been shown to be positively correlated with use of clean cooking fuels 41.
Caste: The survey collected data on each household’s social stratification, or caste, within Indian society. We included binary indicators for four caste categories: Scheduled Caste, Scheduled Tribe, Other Backward Class, and General. Membership in a Scheduled Caste or Scheduled Tribe has been negatively correlated with clean cooking adoption 43,44.
Household Size: There is an unclear association between household size and clean fuel adoption. Previous studies have found both negative 14,15 and positive 16,43 correlations between household size and clean fuel adoption.
Female Decision Maker: Our model controlled for households with a female as the only decision maker or as an equal joint decision maker. In an analysis of the 2015 ACCESS data, female decision-making families had higher odds of owning LPG 17. Female-led households have been shown to be positively correlated with clean cooking fuel adoption 18,45, and male heads of households are less likely to adopt clean cooking fuel. However, other studies found that female-led households were less likely to adopt clean cooking fuel due to the social and economic disadvantages of women 46. Typically, women are the primary cooks and are more likely to allocate household funds for expenditures that ease their work 47.
Descriptive Statistics
Table 4 provides information on household-level demographic characteristics from the baseline year of 2015 for the control, key interest, and outcome variables described above. The largest share of respondents was from Uttar Pradesh, at 35%. In 2018, the share of households from Odisha increased from 6% to 11%, which is expected given the addition of 3 districts and 504 observations from the state in the second round. The average age of the respondent was 42 years in 2015 and 43 years in 2018. Eighty-five percent of respondents were male in 2015, but only 72% in 2018. Education levels were similar across the 2015 and 2018 panels, with roughly a third having no education and a third having up to the 5th standard. The majority of respondents were Hindu, at 87% and 88%, respectively. For caste, 48% of respondents were classified as “Other Backwards Class.” Household size shrank slightly from 6.7 to 6.2 between 2015 and 2018. Average daily expenditure increased from 177 to 208 Rs per day (2.33 to 2.74 USD per day), but the standard deviation in both years implies a large spread among respondents. For context, no household above 131 Rs a day (1.75 USD per day) was considered below the poverty line 42. The share of female decision makers increased from 19% in 2015 to 31% in 2018, but households are still largely led by male decision makers. Average prices for the alternative fuels were relatively stable, except for agricultural residues, which decreased from 4.5 Rs to 1.8 Rs. The average age of an LPG connection decreased from 5 years to 3.5 years, likely due to the addition of the new LPG users from the PMUY policy. Home delivery also doubled in this time period (from 18% to 39%). Additionally, annual refills of large and small LPG cylinders declined from 7.5 and 0.2 in 2015 to 6.2 and 0.1 in 2018, respectively. This is also likely due to the addition of poorer families to the LPG market. Use of LPG increased from 22% to 55% from 2015 to 2018.
We also evaluated the demographic characteristics of the treatment and control groups, which are outlined in Table 5.
TABLE 4—DESCRIPTIVE STATISTICS
Panel A. Independent Variables 2015
| Variable | Mean (SD) | Sample Size | Frequency | Proportion | Minimum | Maximum |
|---|---|---|---|---|---|---|
| State | | | | | | |
| Bihar | | 8563 | 1511 | 0.18 | | |
| Jharkhand | | 8563 | 840 | 0.10 | | |
| Madhya Pradesh | | 8563 | 1680 | 0.20 | | |
| Odisha | | 8563 | 504 | 0.06 | | |
| Uttar Pradesh | | 8563 | 3023 | 0.35 | | |
| West Bengal | | 8563 | 1005 | 0.12 | | |
| Age | 42 (14) | 8563 | | | 18 | 95 |
| Gender (Male) | | 8563 | 7306 | 0.85 | | |
| Education | | | | | | |
| Up to 5th Standard | | 8563 | 2648 | 0.31 | | |
| Up to 10th Standard | | 8563 | 1713 | 0.20 | | |
| Up to 12th Standard | | 8563 | 817 | 0.10 | | |
| Graduate Education | | 8563 | 649 | 0.08 | | |
| Religion | | | | | | |
| Hindu | | 8563 | 7463 | 0.87 | | |
| Muslim | | 8563 | 1051 | 0.12 | | |
| Other | | 8563 | 49 | 0.01 | | |
| Caste | | | | | | |
| Scheduled Caste | | 8563 | 1569 | 0.18 | | |
| Scheduled Tribe | | 8563 | 860 | 0.10 | | |
| Other Backwards Class | | 8563 | 4082 | 0.48 | | |
| General | | 8563 | 2052 | 0.24 | | |
| Household Size | 6.7 (3.5) | 8563 | | | 1 | 46 |
| Daily Expenditure | 177 (130) | 8563 | | | 17 | 2000 |
| Female Decision Maker | | 7558 | 1457 | 0.19 | | |

Panel B: Dependent Variables 2015
| Variable | Mean (SD) | Sample Size | Frequency | Proportion | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Home Delivery LPG | | 1851 | 341 | 0.18 | | |
| # of Large LPG Refills per Year | 1.58 (3.5) | 8563 | | | 0 | 32 |
| Binary Large Cylinder | | 8563 | 1790 | 0.209 | | |
| # of Small LPG Refills per Year | 0.038 (0.623) | 8563 | | | 0 | 25 |
| Binary Small Cylinder | | 8563 | 60 | 0.007 | | |
| Kg of LPG | 22.7 (49.9) | 8563 | | | 0 | 454.4 |
| Binary LPG | | 8563 | 1806 | 0.211 | | |
| Increase in Tier Group | | 8185 | 6978 | 0.853 | | |
TABLE 5—DEMOGRAPHIC STATISTICS OF TREATMENT AND CONTROL
Panel A. Comparison Descriptive Statistics
| Variable | Mean (SD) | Sample Size | Frequency | Proportion | Minimum | Maximum |
|---|---|---|---|---|---|---|
| State | | | | | | |
| Bihar | | 5940 | 895 | 0.151 | | |
| Jharkhand | | 5940 | 615 | 0.104 | | |
| Madhya Pradesh | | 5940 | 1106 | 0.186 | | |
| Odisha | | 5940 | 706 | 0.119 | | |
| Uttar Pradesh | | 5940 | 706 | 0.119 | | |
| West Bengal | | 5940 | 702 | 0.118 | | |
| Age | 43 (14.7) | 5940 | | | 18 | 98 |
| Gender (Male) | | 5940 | 4233 | 0.713 | | |
| Education | | | | | | |
| Up to 5th Standard | | 5940 | 1841 | 0.310 | | |
| Up to 10th Standard | | 5940 | 861 | 0.145 | | |
| Up to 12th Standard | | 5940 | 478 | 0.080 | | |
| Graduate Education | | 5940 | 349 | 0.059 | | |
| Religion | | | | | | |
| Hindu | | 5940 | 5266 | 0.887 | | |
| Muslim | | 5940 | 648 | 0.109 | | |
| Other | | 5940 | 26 | 0.004 | | |
| Caste | | | | | | |
| Scheduled Caste | | 5940 | 1108 | 0.187 | | |
| Scheduled Tribe | | 5940 | 742 | 0.125 | | |
| Other Backwards Class | | 5940 | 4082 | 0.687 | | |
| General | | 5940 | 1334 | 0.225 | | |
| Household Size | 5.7 (2.7) | 5940 | | | 1 | 27 |
| Daily Expenditure | 160 (66.9) | 5940 | | | 0 | 300 |
| Female Decision Maker | | 5940 | 1948 | 0.328 | | |

Panel B: Treatment Descriptive Statistics
| Variable | Mean (SD) | Sample Size | Frequency | Proportion | Minimum | Maximum |
|---|---|---|---|---|---|---|
| State | | | | | | |
| Bihar | | 1559 | 246 | 0.158 | | |
| Jharkhand | | 1559 | 147 | 0.094 | | |
| Madhya Pradesh | | 1559 | 306 | 0.196 | | |
| Odisha | | 1559 | 245 | 0.157 | | |
| Uttar Pradesh | | 1559 | 453 | 0.291 | | |
| West Bengal | | 1559 | 162 | 0.104 | | |
| Age | 42.6 (14.2) | 1559 | | | 18 | 86 |
| Gender (Male) | | 1559 | 1058 | 0.679 | | |
| Education | | | | | | |
| Up to 5th Standard | | 1559 | 523 | 0.335 | | |
| Up to 10th Standard | | 1559 | 185 | 0.119 | | |
| Up to 12th Standard | | 1559 | 97 | 0.062 | | |
| Graduate Education | | 1559 | 58 | 0.037 | | |
| Religion | | | | | | |
| Hindu | | 1559 | 1378 | 0.884 | | |
| Muslim | | 1559 | 173 | 0.111 | | |
| Other | | 1559 | 8 | 0.005 | | |
| Caste | | | | | | |
| Scheduled Caste | | 1559 | 433 | 0.278 | | |
| Scheduled Tribe | | 1559 | 184 | 0.118 | | |
| Other Backwards Class | | 1559 | 670 | 0.430 | | |
| General | | 1559 | 272 | 0.174 | | |
| Household Size | 5.8 (2.5) | 1559 | | | 1 | 27 |
| Daily Expenditure | 160 (62.8) | 1559 | | | 6.67 | 300 |
| Female Decision Maker | | 1559 | 477 | 0.306 | | |
Internal and External Validity
The largest limitation in our study is the lack of an exact control group. Instead, we use a non-equivalent comparison group. Our comparison group is households slightly above the poverty line that are in a similar income bracket to those below the line but were not offered the PMUY policy.
We reasonably assume that the parallel trends assumption holds. This assumption is not empirically testable, however, because earlier relevant data for the treatment and comparison groups are unavailable. Table 5 outlines the demographic statistics of the treatment and comparison groups. The only notable differences are the higher proportion of Other Backward Class households in the comparison group (67% in the comparison group compared to 43% in the treatment group) and the higher proportion of treatment households from Uttar Pradesh (29% in the treatment group compared to 11.9% in the comparison group). Given the overall similarity apart from these two differences, it is not unreasonable to assume that parallel trends hold. Nevertheless, for robustness, we have incorporated kernel propensity score matching.
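One way to quantify the group differences discussed above is the standardized difference in proportions. The sketch below applies it to a few proportions taken from Table 5. The formula is the standard one for binary covariates, and a threshold of about 0.1 in absolute value is a commonly cited rule of thumb for imbalance. This check is an illustration and is not part of the estimation reported in this paper.

```python
from math import sqrt


def std_diff_proportion(p_treat, p_comp):
    """Standardized difference between two proportions."""
    pooled_sd = sqrt((p_treat * (1 - p_treat) + p_comp * (1 - p_comp)) / 2)
    return (p_treat - p_comp) / pooled_sd


# (treatment proportion, comparison proportion) as reported in Table 5.
table5 = {
    "Uttar Pradesh": (0.291, 0.119),
    "Other Backwards Class": (0.430, 0.687),
    "Hindu": (0.884, 0.887),
    "Female Decision Maker": (0.306, 0.328),
}

std_diffs = {
    name: std_diff_proportion(p_t, p_c) for name, (p_t, p_c) in table5.items()
}
for name, d in std_diffs.items():
    print(f"{name}: {d:+.2f}")
```

Covariates with large standardized differences are the ones for which matching on the propensity score matters most.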
In addition, we also reasonably assume that the common shocks assumption holds. We are unaware of any simultaneous policy intervention or other event that may violate the common shocks assumption.
Reverse causation is not likely to be present in the variables of interest. Eligibility for the policy is based on a number of socioeconomic characteristics that establish the household as BPL. Omitted variable bias may be present, which would bias the parameter positively or negatively depending on the omitted variable’s correlation with our dependent variables. However, we are confident in our engagement with the literature on factors affecting clean fuel use. Measurement bias is also likely to be present in the variables. The estimated parameter will therefore be biased towards zero and, to the extent that the mismeasured variable is correlated with other independent variables, the estimated parameters of those correlated variables will be biased in unknown directions.
Despite the complex survey design, we do not use probability weights in the analysis, as they are not available at the household level. Solon, Haider, and Wooldridge suggest that weighting is only an issue when estimating causal effects in three situations: (1) correcting for endogenous sampling, (2) correcting for heteroscedasticity, and (3) identifying average partial effects in situations where effects are likely to be heterogeneous by subgroup 48. None of these issues applies in the current study, since sampling was not endogenous, we are able to correct for heteroscedasticity apart from weighting, and we are interested in the local average treatment effect, not in identifying heterogeneous effects by subgroup. Thus, not being able to weight at the household level is of no practical significance.
The results of this paper are externally generalizable to the Indian population as of 2018. The PMUY policy was further expanded to groups beyond BPL status after the 2018 survey was taken. Our findings could also provide valuable lessons for the evolution of the PMUY policy in India and serve as an example for other countries in the developing world (Ghana, Indonesia, South Africa, etc.) pursuing national LPG policies. However, these results should not be interpreted without the context of India’s socio-economic environment in these particular states, the current LPG infrastructure, and other national policies that benefit BPL households.