Improving rural women’s health in China: cooking with clean energy

It is well known that, owing to gender disparity, women in many developing countries bear the greatest burden in terms of health, work time, and labor supply. In this study, we analyze the health inequality in rural China caused by indoor air pollution from traditional energy use. Specifically, we study the effect of clean energy access on women’s health outcomes by exploiting the nationwide rollout of a clean cooking fuel program in 2014. Based on interviews with rural women in 2014 and 2016, this study analyzes the impact of clean energy use on women’s health by using the propensity score matching method with the difference-in-differences model (PSM-DID). We also analyze the heterogeneous health effects of clean energy uptake on rural women with different characteristics. The results show that clean energy applications can significantly improve the health of rural women. The positive health effects are substantial for middle-aged and older women, illiterate women, and women living in northeastern China. The results highlight the role of clean energy in reducing gender disparities in health.


Introduction
One of the main sources of gender disparity is traditional gender norms in the types of jobs assigned to men and women (Imelda and Verma 2019). In the vast majority of developing countries, as women do more housework (Duflo 2012), they have to bear excessive exposure to indoor air pollution caused by the use of traditional household energy that is polluting and harmful to health. Long-term exposure to indoor air pollution will have a significant negative impact on people's health (Kumar and Viswanathan 2007). Meanwhile, indoor air pollution is the leading environmental health risk factor for women in less developed countries and is a major cause of stroke, chronic obstructive pulmonary disease, lung cancer, heart disease, and other non-communicable diseases (Imelda and Verma 2019;Malla et al. 2011). According to the WHO (2018), women and children accounted for more than 60% of all premature deaths caused by indoor air pollution. This is not just a health issue, but a serious gender disparity that requires global attention.
In most developing countries, cooking is not only typically classified as a female responsibility, but women also have little power to choose the type of cooking fuel (Wang et al. 2019a). With unclean cooking fuel emitting a large number of harmful pollutants, biased gender roles and low bargaining power of women impose a disproportionately higher health cost on women than men (Imelda and Verma 2019). This phenomenon is especially prominent in the traditional rural areas of China (Tian et al. 2020). According to the official report of the National Bureau of Statistics (NBS), respiratory and heart diseases accounted for nearly 40% of deaths among rural women in China in 2019, two percentage points more than among urban women. 1 These diseases are closely related to the widespread use of traditional energy in rural China (Du et al. 2021; Shao et al. 2021). WHO data show that indoor air pollution from the combustion of solid fuel in cooking causes nearly 4 million premature deaths every year, and that women exposed to high levels of indoor smoke are more than twice as likely to suffer from chronic obstructive pulmonary disease (COPD) as women who use cleaner fuels and technologies. 2 Rural women's health can therefore be affected by traditional energy use, and it is important to discuss the health benefits of clean energy use in the rural kitchen, but the specific benefits are unknown. Moreover, the existing literature on health inequality caused by the environment has mainly focused on outdoor air pollution (Morelli et al. 2019; Zhao et al. 2019), such as studies of air pollution and public health (Dedoussi et al. 2020; Lelieveld et al. 2015). This paper therefore examines the potential benefits of household cooking energy substitution.
Responsible Editor: Baojing Gu * Guanglai Zhang zglai24@126.com
Specifically, we address the following: (1) Based on rural women's perception of their own health, we estimate the extent of the health effect of converting kitchen energy to clean energy, using the propensity score matching method with the difference-in-differences model (PSM-DID).
(2) Using PSM-DID, we also examine whether the impact of clean energy is heterogeneous across age, education level, and location, and we try to analyze the potential mechanism behind any such differences.
By completing these tasks, this paper contributes to the existing literature as follows. First, much less is known about the impact of cooking-related air pollution on gender disparity in developing countries, especially in China, which has the largest rural female population in the world. Therefore, this paper chooses Chinese rural women as the research object, which is a new research perspective. Second, this study provides new empirical evidence for the heterogeneous health effects of indoor air pollution on rural women. We test the existence of heterogeneous effects by classifying the interviewees based on their age, education level, and residential location. The results show that rural women in China exhibit varying levels of health improvement in response to clean energy use in cooking. Third, instead of experimental or medical test methods, we conduct our analysis by combining PSM and DID for causal inference. The PSM-DID method can effectively address endogeneity problems in the model and support the reliability of the benchmark regression results. It therefore also helps to extend the research methods used on this topic.
The rest of this paper is organized as follows. The "Literature review" section reviews the literature on the relationship between clean energy use and women's health. The "Methodology" section presents the research design from three perspectives: data source, research strategy, and variable setting. The fourth section analyzes the empirical results, including a "before" and "after" comparison of rural women's health, a balancing test after application of the PSM, the impact of rural clean energy use on women's health, and several robustness tests. The fifth section presents the heterogeneity analysis. The final section presents the study's conclusions and policy implications.

Literature review
From scholars' investigation of how China's energy consumption has developed, it can be observed that the use of clean energy in China is still at a low level (Carter et al. 2020), the main reason being rural households' lack of understanding of the risks posed by the use of traditional energy sources (Elie et al. 2019; Wang et al. 2019b). Current research into the relationship between clean energy and health mainly focuses on analyzing the impact of air pollution on public health (Fann et al. 2013; Nel 2005; Shaddick et al. 2020; Sofiev et al. 2018). Some scholars have used physiological data to investigate the relationship between energy substitution and the health of middle-aged and elderly rural residents (Liu et al. 2018). However, overall, there have been few studies on this topic in China, with most of the research having been conducted in other countries. For example, Sattler et al. (2018) found that replacing coal-fired power plants with clean energy in Illinois could improve residents' health. Baumgartner et al. (2011) found that air pollution resulting from traditional energy use may raise the risk of hypertension and cardiovascular diseases. Haines et al. (2007) pointed out that the current pattern of fossil fuel use causes substantial health problems. Therefore, a comprehensive clean energy application plan should strongly emphasize the health benefits of improved air quality.
More recently, the relationship between clean energy use and health and, in particular, the impact of kitchen fuel on health has begun to attract academic attention (Imelda 2020; Yun et al. 2020). However, studies of this issue have reached different conclusions. On the one hand, some studies find that improved biomass stoves only incrementally improve air quality and provide few health benefits, and that cleaner fuel stoves are ineffective in reducing the risk of contracting pneumonia (Mortimer et al. 2017; Rosenthal et al. 2018). On the other hand, most scholars have found that clean fuel used for cooking has a significant impact on health. For example, Yun et al. (2020) have confirmed that ultra-fine particulate matter produced by the combustion of solid fuel poses potential health risks to people. Alexander et al. (2017), having conducted a randomized trial in Nigeria, concluded that replacing the solid fuel used in household stoves with ethanol may reduce pregnant women's diastolic blood pressure and hypertension: the use of clean cooking fuels may, therefore, reduce the overall harmful health effects caused by household air pollution. Imelda (2020) estimated the carcinogenic risk of heterocyclic amines inhaled by women under different cooking conditions by simulating the process of typical household solid fuel combustion. The results showed that when the smoke could not be effectively discharged, women had higher health risks when bituminous coal, straw, and wood were used.
It is clear, then, that scholars have different views on the health effects of clean energy use in the kitchen. However, when studies on the impacts of the wider environment focus on the health of women, the evidence of adverse health impacts of poor air quality is more conclusive. These impacts include blood pressure problems during pregnancy (Lee et al. 2012); certain physiological diseases, including those of the respiratory tract (Vanker et al. 2017) and eyes (Mo et al. 2019); lung cancer (Liu et al. 2020); and others (Zhang and Zhang 2020; Wu et al. 2021). From these studies, therefore, it can reasonably be inferred that the use of traditional energy sources for cooking has a negative impact on women's health.
To sum up, scholars have confirmed that there is likely to be a positive relationship between the use of clean energy and health, but when the health benefits of clean energy use in the kitchen are specifically examined, the matter is unsettled. Chinese scholars mainly focus on verifying, from the perspective of environmental science, whether the particulate matter and soot produced by fuel combustion have negative effects on health. Scholars in other countries have focused more on the effects of stoves using different fuels on air quality and health. However, owing to different sample populations, the analyses have reached different conclusions, and while most studies in the environmental and medical fields use objective data, there has been little analysis of women's subjective view of their own health. In other words, there is still a lack of quantitative research into the health effects of women's cooking energy choices outside the environmental and medical fields; in China, in particular, such research is lacking. Therefore, this study attempts to bring the female perspective into the debate among scholars regarding energy use in the kitchen and women's health.

Data source
The data used in this study are all from the China Family Panel Study (CFPS), which is funded by Peking University and the National Natural Science Foundation of China and conducted by the Institute of Social Science Survey, Peking University. The data cover a wide range of domains at the individual, family, and community levels from 162 counties in 25 provinces of China, including economic activities, energy use, education outcomes, family dynamics and relationships, and health. The CFPS uses multistage probability-proportional-to-size sampling with implicit stratification to better represent Chinese society. To be specific, the sample from the CFPS2014 and CFPS2016 surveys is drawn through three stages (county, village, and household) from 25 provinces and covers 21,007 interviewees who were interviewed in both survey waves. The randomly chosen 162 counties largely represent Chinese society. Figure 1 shows the sample distribution characteristics of interviewees in this paper; it is clear that the CFPS survey can effectively avoid the limitations caused by geographical constraints. 3
This paper chooses rural adult subjects from CFPS2014 and CFPS2016. The first step is to eliminate all males and non-rural residents from the analysis. Second, people who were already using clean energy to cook in 2014 are removed, according to the DID model requirements. Then, the dummy variable "cooking fuel" is defined based on whether the fuel used in the household is a solid fuel that generates airborne particulates. If the subject uses firewood or coal, and no clean energy, the subject is considered to use traditional energy for cooking, and the dummy variable is assigned the value 0. However, if the subject uses natural gas, liquefied gas, solar energy, biogas, or electricity, the subject is considered to use clean energy for cooking, and the dummy variable is assigned the value 1. Next, this paper constructs the control group using the subjects' IDs.
Then, the outliers and subjects supplying invalid responses (such as "not applicable" and "refuse to answer") are eliminated from the sample. Finally, we arrive at 3008 unique rural female interviewees in the matched sample, leading to 6016 interviewee-year observations from CFPS 2014 and 2016 in our baseline model.
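The sample construction steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the record layout and field names (pid, year, gender, rural, fuel) are hypothetical and are not the CFPS variable names, and the outlier screening mentioned above is not shown because its criteria are not specified here.

```python
# Illustrative sketch only: field names are hypothetical, not CFPS variable names.
CLEAN_FUELS = {"natural gas", "liquefied gas", "solar energy", "biogas", "electricity"}
TRADITIONAL_FUELS = {"firewood", "coal"}
INVALID_RESPONSES = {"not applicable", "refuse to answer"}


def fuel_dummy(fuel):
    """Return 1 for clean energy, 0 for traditional energy, None otherwise."""
    if fuel in CLEAN_FUELS:
        return 1
    if fuel in TRADITIONAL_FUELS:
        return 0
    return None


def build_sample(records):
    """Assign each eligible woman to the treatment or control group.

    records: list of dicts with keys pid, year, gender, rural, fuel.
    Returns a dict mapping pid to "treatment" or "control".
    """
    fuel_by_person = {}
    for rec in records:
        # Keep rural women only.
        if rec["gender"] != "female" or not rec["rural"]:
            continue
        # Drop invalid responses and fuels outside the two categories.
        if rec["fuel"] in INVALID_RESPONSES:
            continue
        dummy = fuel_dummy(rec["fuel"])
        if dummy is None:
            continue
        fuel_by_person.setdefault(rec["pid"], {})[rec["year"]] = dummy

    groups = {}
    for pid, by_year in fuel_by_person.items():
        # Keep only women observed in both waves (balanced panel).
        if 2014 not in by_year or 2016 not in by_year:
            continue
        # Drop women who already cooked with clean energy in 2014.
        if by_year[2014] == 1:
            continue
        # Treatment: switched to clean energy by 2016; control: never switched.
        groups[pid] = "treatment" if by_year[2016] == 1 else "control"
    return groups
```

Matching records across the two waves by a person identifier corresponds to the use of subjects' IDs described in the text.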
The survey did not include questions concerning smoke extraction in the subjects' kitchens or the households' cooking methods, so it is impossible to judge whether the research results will be affected by smoke extraction in the kitchen or by households' different dining habits. However, given that these samples are all from rural China, either a traditional chimney or a new-style range hood will effectively remove smoke from the kitchen (Clark et al. 2010). Moreover, compared with urban kitchens, rural kitchens have more natural ventilation, which leads to better air quality (Sharma and Jain 2019). Meanwhile, unlike certain western dishes that are served cold, Chinese meals are largely "fried, boiled, or stewed", and the use of hot oil cannot be avoided. Therefore, this study assumes that the female subjects in this survey are consistent in their methods of cooking and smoke extraction.

Research strategy
To solve new problems, meet the challenge of Chinese energy development, and build a clean, efficient, safe, and sustainable modern energy system, the State Council issued the Notice of the State Council General Office on Issuance of the Strategic Action Plan for Energy Development (2014-2020) in 2014 (hereinafter referred to as the "Notice"). 4 The Notice clearly proposed that China should implement a "Green and Low-carbon Strategy" to optimize its energy structure, develop mainly clean and low-carbon energy when restructuring energy sources, stress the importance of transforming rural energy supplies, and strengthen energy conservation in rural areas. Based on the Notice, a major task for the government is to promote the transformation of energy use in urban and rural areas. According to the Notice, in line with the overall requirements of integrated urban and rural development and new urbanization, all regions should combine centralized and decentralized energy supply, build urban and rural energy supply facilities in accordance with local conditions, promote the transformation of urban and rural energy use modes, and improve the level and efficiency of urban and rural energy use. Important action steps include the following:
(1) Implementing action plans for new towns, new energy sources, and a new life. Governments should formulate comprehensive energy plans for cities and towns, vigorously develop distributed energy sources, and scientifically develop cogeneration of heat and power. At the same time, governments should encourage regions to develop cogeneration of heat, power, and cooling, and to develop wind, solar, biomass, and geothermal heating.
(2) Accelerating reform of rural energy use. The government should promptly formulate long-term policies and measures to promote the construction of green energy counties, townships, and villages. In addition, the government should vigorously develop small hydropower in rural areas and strengthen the construction of new rural hydropower electrification counties and the ecological protection project of replacing fuel with small hydropower. The efficient utilization of noncommercial energy and the strengthening of rural energy conservation work are also main tasks to focus on.
(3) Launching a nationwide campaign to conserve energy. The government has implemented the national energy conservation action plan, popularizing knowledge of energy conservation among farmers through publicity and education in rural areas, promoting new energy-saving technologies and products, and vigorously advocating green lifestyles to guide rural residents to use clean energy in their daily lives, such as in their kitchens.
This policy therefore provides an excellent quasi-natural experimental background for this study. In order to better analyze the health effects of rural clean energy uptake following the implementation of the policies outlined in the Notice, the DID model is mainly used. However, in practice, there may be significant differences in the sample characteristics of the rural women who are affected by the policies. For example, rural women with higher levels of income and education are more likely to use cleaner energy to cook, and such cross-sectional heterogeneity could bias the DID model results. In order to address this problem, this study uses propensity score matching combined with difference-in-differences (PSM-DID) to analyze the health impact of clean energy on rural women who engage in cooking and to generate a more reliable result (Heckman et al. 1998). Specifically, Rosenbaum and Rubin (1983) suggest matching on the propensity score, that is, the probability of receiving the treatment conditional on covariates. Dehejia and Wahba (2002) show that the PSM approach succeeds in focusing attention on the small subset of comparison units that are comparable to the treatment group and thus succeeds in removing the bias due to systematic differences between the treatment and control groups.
In particular, we can find a sample i of subjects who do not use clean energy for cooking in the control group, and another sample j, who have similar characteristics and use clean energy for cooking in the treatment group, which can effectively account for the differences in observable characteristics between the two groups, satisfying the "conditional independence hypothesis" as far as possible. This effectively avoids both the endogeneity problem in rural women's clean energy use and the non-random grouping bias in the DID model. Based on this analysis, the DID model result shows the net effect of rural women using clean energy for cooking.
Therefore, against the quasi-natural experimental background, the main research design of this study is as follows: based on the 2014 and 2016 CFPS balanced panel data, and investigating the change in cooking fuel of rural women before and after implementation of the Notice, this study identifies whether rural women replace traditional energy with clean energy and then constructs reliable samples for the treatment group and control group. Specifically, we set 2014 as the initial period of the policy, but owing to limited data, we choose 2016 as the tracking period (based on the survey data, 22.97% of the subjects started to use clean energy in 2016, so the choice of analysis and tracking periods is considered reliable). The treatment group includes rural women who did not use clean energy for cooking in 2014 but did in 2016. The control group includes rural women who did not use clean energy for cooking in either 2014 or 2016. The research proceeds in three steps.
The first step is to estimate the probability that subjects use clean energy for cooking in 2016 by examining their initial characteristics. Characteristic variables include individual age, marital status, education level, income and consumption level, family size, housing type, and individual health habits (whether or not they smoke), as well as local medical resources and whether the subjects have health insurance. The logit model is used to clarify the relationship between clean energy use and these initial characteristic variables, and we obtain the propensity score of subjects using clean energy for cooking from the estimation results. The logit model is constructed as follows:
$$P(D_{i,16} = 1 \mid X_{i,14}) = F(\alpha_i X_{i,14}) \qquad (1)$$
where $D_{i,16} = 1$ indicates that sample $i$ cooks with clean energy in 2016; $X_{i,14}$ represents the initial characteristic variables of sample $i$ in 2014, which are also the variables matched by the PSM method in this paper; $F(\cdot)$ stands for the logit distribution function; and $\alpha_i$ represents the coefficients of the characteristic variables. According to the results generated, we can rematch the treatment group and the control group so that the two groups have more similar characteristics, thereby more effectively addressing the non-random grouping problem. Because matching relies on units having similar propensity scores, a balancing test is a necessary check on the matching; after the above calculation, this study therefore applies radius nearest neighbor matching and performs a balancing test (details are given in "Balancing test after PSM matching"), which helps ensure that grouping between the treatment group and the control group is as good as random.
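The first step (a logit propensity score followed by radius nearest neighbor matching) can be illustrated with a small sketch. It is not the estimation routine used in this study: the gradient-ascent fit, the learning rate, the default radius of 0.05, and matching with replacement are all assumptions made for illustration, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function F(z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(X, d, lr=0.1, steps=2000):
    """Fit P(d = 1 | x) = F(w0 + w . x) by gradient ascent on the log-likelihood.

    X: list of covariate lists (initial characteristics), d: list of 0/1 outcomes.
    Returns the weights [intercept, coefficient_1, ...].
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = di - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, x):
    """Estimated probability of using clean energy given covariates x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def radius_nn_match(scores, treated, radius=0.05):
    """Match each treated unit to its nearest control within the given radius.

    Matching is with replacement; treated units without a control inside the
    radius are left unmatched. Returns {treated_index: control_index}.
    """
    controls = [i for i, t in enumerate(treated) if t == 0]
    matches = {}
    for i, t in enumerate(treated):
        if t != 1 or not controls:
            continue
        best = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[best] - scores[i]) <= radius:
            matches[i] = best
    return matches
```

In practice the covariates would usually be rescaled before fitting, and a dedicated statistics package would normally replace the hand-written optimizer.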
In the second step, after grouping of the treatment group and control group by the PSM and eliminating samples that do not meet the "common support assumption", we control for time and individual fixed effects in the DID model to eliminate the impact of unobserved heterogeneity across time and individuals. Such heterogeneity includes attributes that are constant for an individual over time or common to all individuals in a given period, such as resource endowment or the energy-use environment. Doing so allows us to further reduce the possibility of endogeneity problems in the model, so that we can accurately capture the relative health differences between the treatment group and the control group before and after using clean energy for cooking. This better reflects the effect of the policy on rural energy use. The specific model is constructed as follows:
$$Y_{it} = \beta_0 + \beta_1 D_i \times T_i + \beta_2 X_{it} + c_i + \mathrm{year}_t + u_{it} \qquad (2)$$
In formula (2), $Y_{it}$ represents the dependent variables of this paper, namely the health status of sample $i$ (comprising the results of health self-assessment, and the response to "whether physical discomfort was felt during the last two weeks"); $D_i$ represents the dummy variable indicating whether sample $i$ uses clean energy for cooking ($D_i = 1$ refers to use of clean energy; otherwise $D_i = 0$); $T_i$ represents time ($T_i = 1$ refers to the tracking period of 2016, and $T_i = 0$ refers to the initial period of 2014); $\beta_0$ is the constant term; $\beta_1$ represents the estimated treatment effect of rural clean energy use on women's health; and $X_{it}$ represents a series of covariables with coefficients $\beta_2$. It should be pointed out that the control group and the treatment group samples will have similar individual characteristics after PSM, so it is unnecessary to further control for the covariables in the DID model. $c_i$ represents the individual fixed effect; $\mathrm{year}_t$ represents the time-fixed effect; and $u_{it}$ is the random error term. After constructing the model, this study uses the kernel matching method and direct OLS regression to calculate and compare the model's results, so as to check the robustness of the conclusions.
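As a minimal illustration of the second step: in a balanced two-period panel without additional covariables, the DID coefficient from a model with individual and time fixed effects equals the difference between the mean change in the outcome of the treatment group and that of the control group. The sketch below computes this quantity; the function name, the optional matching weights, and the toy numbers are hypothetical and are not taken from the CFPS data.

```python
def did_two_period(y14, y16, treated, weights=None):
    """Difference-in-differences estimate for a balanced two-period panel.

    y14, y16: outcomes in the initial and tracking periods, one entry per person.
    treated:  1 if the person switched to clean energy by 2016, else 0.
    weights:  optional matching weights (for example from kernel matching);
              equal weights are used when omitted.
    """
    if weights is None:
        weights = [1.0] * len(treated)

    def weighted_mean_change(group):
        num = sum(w * (after - before)
                  for before, after, d, w in zip(y14, y16, treated, weights)
                  if d == group)
        den = sum(w for d, w in zip(treated, weights) if d == group)
        return num / den

    return weighted_mean_change(1) - weighted_mean_change(0)


# Toy data: two treated women improve by one point on the 1-5 scale
# (lower is better), two control women do not change.
toy_estimate = did_two_period([3, 4, 3, 4], [2, 3, 3, 4], [1, 1, 0, 0])
```

Because smaller values of the dependent variables indicate better health, a negative estimate corresponds to a health improvement.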
Third, in order to further study the impact of clean energy use for cooking on women's health in rural areas, this study attempts to examine whether there are differences in the health effects between women with different characteristics, such as age, education level, and region through heterogeneity analysis. Therefore, the subjects are grouped again, according to their different characteristics, and the results are analyzed using the PSM-DID method.

Independent variable
The interaction term between cooking fuel used by rural women and time is an important independent variable in this study and is mainly used to reflect the changes in fuel used by sample women between the initial period and the tracking period. According to the above definitions of $D_i$ and $T_i$, if sample $i$ is in the treatment group, the value of this independent variable is equal to 0 in 2014 and 1 in 2016; otherwise, if the subject is in the control group, both values are 0. Of the 6016 samples, 1382 (22.97%) are in the treatment group, and 4634 (77.03%) are in the control group.

Dependent variables
The health self-assessment result of rural women is the key dependent variable in this study. This study chooses two health indicators related to subjects' health in the CFPS questionnaire as the dependent variables. They comprise "health self-assessment" (health self-assessment) and "whether I have felt physical discomfort during the last two weeks" (unhealth, hereinafter "physical discomfort"). Of these indicators, "health self-assessment" has been taken as an indicator of health measurement by a large number of scholars, because it can reflect a person's overall health assessment and incorporates both subjective and objective health information. On the one hand, the subjective evaluation of a person's health can reflect private health information that is known only to that person, and it can include aspects of health that cannot be quantified by objective indicators. On the other hand, there is a significant correlation between this indicator and objective indicators of morbidity, mortality, and so on, and it also contains aspects of non-private information, such as current and past disease and health status. Therefore, this study considers that the "health self-assessment" indicator can effectively reflect the health status of subjects. In the CFPS data, the values of this variable range from 1 to 5, where larger values indicate poorer health.
In addition, "physical discomfort" (unhealth) is also treated as the dependent variable in further testing. Specifically, this indicator is a binary dummy variable in the questionnaire: its value equals 1 when the sample has recently felt physical discomfort, and 0 otherwise.

Matching variables
Following the existing literature (Deshpande and Khanna 2021; Zheng and Lu 2021), the matching variables of this study include individual basic conditions (age, region, and marital status), individual socioeconomic conditions (education level, economic level), individual family conditions (family size, housing type), individual daily habits (smoking or non-smoking), and local medical conditions. The meaning and descriptive statistical characteristics of each matching variable are shown in Table 1. Importantly, there are a large number of outliers in both CFPS2014 and CFPS2016 when measuring economic level by income, 5 so we choose to measure the economic level of subjects using relatively stable indicators such as "household cash, household total deposit and household consumption expenditure". Also, in order to better distinguish whether the subject has medical insurance, we construct a binary dummy variable: regardless of the type of medical insurance (public medical treatment, urban workers' medical treatment, urban residents' medical treatment, supplementary medical treatment, new rural cooperative medical insurance), if a subject has any type, the value of the dummy variable is 1; otherwise it is 0.
Table 2 lists the descriptive statistics and DID model estimation results of the dependent variables for both the treatment group and the control group, in the initial period in 2014 and the tracking period in 2016. The results show that the women in the treatment group rate their health better than the women in the control group in both 2014 and 2016. Moreover, following the uptake of clean energy, the improvement in the treatment group between 2014 and 2016 is greater than that in the control group.
Specifically, the mean value of "health self-assessment" (health self-assessment) in the treatment group is 0.11 lower (indicating better self-assessed health) than that of the control group in 2014, and the difference in mean value widens to 0.15 in 2016. Similarly, in the evaluation of "physical discomfort" (unhealth), the mean value of the treatment group is 0.014 lower than that of the control group in 2014, and the difference in mean value widens to 0.066 in 2016. Thus, the DID estimates for the two dependent variables are −0.04 and −0.052, respectively. These results indicate that the health level of sample women who replace their traditional cooking fuel with clean energy in 2016 improves by 0.04 (health self-assessment) and 0.052 (unhealth) compared to the sample women who use traditional fuel for cooking in both 2014 and 2016. However, the results also show that for "health self-assessment", clean energy use does not result in a statistically significant improvement of health, while for "physical discomfort", the result is significant only at the 10% level. Therefore, this study uses the PSM for further analysis.
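The arithmetic behind these two raw DID values can be reproduced from the group gaps quoted above. The short sketch below uses only the four mean differences reported in the text; the function name is hypothetical.

```python
def did_from_gaps(gap_2014, gap_2016):
    """DID = (treatment minus control gap in 2016) - (same gap in 2014)."""
    return gap_2016 - gap_2014


# Gaps (treatment mean minus control mean) as quoted in the text.
did_health = round(did_from_gaps(-0.11, -0.15), 3)        # health self-assessment
did_discomfort = round(did_from_gaps(-0.014, -0.066), 3)  # physical discomfort
```

The rounding to three decimals only removes floating-point noise; the resulting values are −0.04 and −0.052.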

Balancing test after PSM matching
The regression results of the logit model used to estimate the propensity score show that education, marriage, age, economic level (total cash and deposits, the consumption expenditure of residents), and other factors in 2014 (the initial period) have a significant impact on the probability of using clean energy for cooking in 2016 (the tracking period): (1) if the subject has a higher educational level, she is more likely to use clean energy; (2) as the subject's marital status changes from unmarried to married, cohabitating, or "other", the probability that she will use clean energy rises significantly; (3) the preference for clean energy is more obvious among older women who have been engaged in cooking for a long time; (4) a rise in the individual's economic level may also promote clean energy use. Table 3 shows that there is a significant difference in the initial characteristics of nearly half of the variables between the treatment group and the control group before matching. Therefore, if DID analysis were carried out directly on these samples without matching, the results would be unreliable, owing to the selection bias of the samples. However, after the groupings are adjusted through radius nearest neighbor matching in PSM, based on the characteristics of samples in the initial period, the characteristics of the two groups are no longer significantly different. In other words, the treatment group and the control group have similar initial characteristics after PSM processing, and the kernel matching method yields similar results. Thus, it is considered that the application of PSM-DID can effectively address the non-randomness problem of the sample groupings and enhance the credibility of the analysis results.
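One common way to quantify covariate balance before and after matching is the standardized bias, which scales the difference in group means by the pooled standard deviation. The sketch below is offered as an illustration of that diagnostic under stated assumptions; it is not necessarily the statistic reported in Table 3, and the ages used are made-up numbers rather than CFPS values.

```python
import math
from statistics import mean, variance


def standardized_bias(x_treat, x_ctrl):
    """Standardized bias in percent for one covariate.

    100 * (mean_treat - mean_ctrl) / sqrt((var_treat + var_ctrl) / 2),
    using sample variances.
    """
    pooled_sd = math.sqrt((variance(x_treat) + variance(x_ctrl)) / 2)
    return 100.0 * (mean(x_treat) - mean(x_ctrl)) / pooled_sd


# Made-up ages: the matched control group is much closer to the treatment group.
age_treat = [40, 45, 50]
age_ctrl_before = [30, 35, 40]
age_ctrl_after = [40, 45, 51]

bias_before = standardized_bias(age_treat, age_ctrl_before)
bias_after = standardized_bias(age_treat, age_ctrl_after)
```

A matched sample is usually regarded as better balanced when the absolute standardized bias of each covariate falls markedly after matching.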
Baseline results
Table 4 shows that the PSM-DID (radius nearest neighbor matching, kernel matching) and OLS regression methods yield highly consistent results in terms of the significance and direction of the effects on the two dependent variables, indicating that the estimates in this analysis have a high degree of robustness. Further, given that smaller values of the dependent variables represent better health conditions, a negative coefficient on the independent variable indicates a positive effect on health. The PSM-DID analysis with radius nearest neighbor matching suggests that if the women in the sample replace traditional energy with clean energy for cooking during the tracking period, their self-assessed level of health will improve by 0.1555, which implies that the change of energy source leads to a 4.64% improvement in their self-assessed health (health self-assessment) 6 and a 13.82% reduction in their "physical discomfort" (unhealth). 7 Therefore, it can be seen that the uptake of clean energy in rural kitchens has a significant positive impact on women's health, which is conducive to improving their self-assessed level of health. This could be explained by "the Syndrome of Drunk Oil" (Van wormer 2007). As mentioned in "Dependent variables", above, the health variables adopted in this paper can objectively reflect personal health information and also incorporate related information that is known only to the individuals concerned, so that their objective health level will be closely related to their self-assessed health level. The syndrome of drunk oil indicates that kitchen fumes are the most common cause of poor appetite among women who cook by traditional means, since they contain harmful gases and particulate matter (such as CO, SO 2 , CO 2 , and NO compounds). We believe that these fumes released during cooking negatively impact the subjects' health. Thus, clean energy use can effectively reduce this effect and significantly improve women's health.
Values in the table are means. *, **, and *** represent significant differences between the means of the treatment group and the control group before and after matching, at the 10%, 5%, and 1% levels, respectively.

Placebo test
To ensure the reliability of this study's conclusions, we conduct a placebo test to check whether the estimated impact of rural clean energy use on women's health could have arisen by chance. The procedure is as follows: (1) take the sample of rural women after PSM; (2) randomly choose some women (the same number as in the original PSM-DID treatment group) as the treatment group in a new DID regression, and assume that these women used clean energy for cooking in 2016; and (3) estimate the regression coefficient β_1 of the randomly chosen treatment group according to the original formula, repeating the model for 500 random simulations. We thereby obtain a distribution of the coefficient β_1. If the coefficient of the randomly assigned treatment group were always significant and consistent with the benchmark results, this would indicate that the empirical results presented above are not generated by clean energy use and might merely reflect coincidence or unobserved variables. Conversely, if the randomly assigned treatment shows no significant effect, this would further verify the reliability of the benchmark regression results, namely that clean energy use in rural areas does lead to a significant health improvement for rural women.
We therefore perform the placebo test on the samples already matched by the radius nearest neighbor matching method. Figures 2 and 3 show the placebo test results for "health self-assessment" and "physical discomfort". In Table 4, the benchmark regression coefficients of the two dependent variables are −0.1555 and −0.0596; Figs. 2 and 3 show that, after 500 random simulations, more than 95% of the simulated regression coefficients are distributed away from the benchmark regression results. This suggests that randomly assigned clean energy use does not have a significant impact on rural women's health, supporting the conclusion that the benchmark results of this study are not driven by unobserved factors.
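The placebo procedure can be sketched as follows. This is a simplified illustration under stated assumptions: the data are synthetic, the DID estimate is the plain difference in mean 2014-to-2016 outcome changes without covariates or matching, and all function and variable names are hypothetical. It is meant to show the logic of the random reassignment, not to reproduce the paper's estimates.

```python
import random
from statistics import mean


def did_estimate(delta_y, treated):
    """Two-period DID: mean outcome change of treated minus that of controls."""
    t = [d for d, flag in zip(delta_y, treated) if flag]
    c = [d for d, flag in zip(delta_y, treated) if not flag]
    return mean(t) - mean(c)


def placebo_distribution(delta_y, n_treated, n_sims=500, seed=0):
    """DID estimates under randomly reassigned (fake) treatment labels."""
    rng = random.Random(seed)
    n = len(delta_y)
    estimates = []
    for _ in range(n_sims):
        fake_ids = set(rng.sample(range(n), n_treated))
        fake_treated = [i in fake_ids for i in range(n)]
        estimates.append(did_estimate(delta_y, fake_treated))
    return estimates


# Synthetic data (illustrative only): change in a health score where lower
# values mean better health; the true treatment effect is set to -0.15.
data_rng = random.Random(42)
n_units, n_treated = 1000, 300
treated = [i < n_treated for i in range(n_units)]
delta_y = [data_rng.gauss(0, 0.3) + (-0.15 if flag else 0.0) for flag in treated]

benchmark = did_estimate(delta_y, treated)
placebo = placebo_distribution(delta_y, n_treated)
# Share of placebo coefficients that are smaller in magnitude than the benchmark.
share_weaker = sum(abs(b) < abs(benchmark) for b in placebo) / len(placebo)
```

When the true effect is real, the placebo coefficients cluster around zero and most of them are weaker than the benchmark estimate, which is the pattern the test looks for.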

Using the comprehensive index as the explained variable
For a robustness check, the "health self-assessment" index and the "physical discomfort" index used in this paper can be combined into a comprehensive index, which is then used as the dependent variable. The comprehensive index is calculated as follows. First, we assume that the "health self-assessment" index and the "physical discomfort" index are equally important and therefore give each a weight of 50%. Second, because the "health self-assessment" index takes values from 1 to 5, ranging from very healthy to very unhealthy, we assign these values 0.2, 0.4, 0.6, 0.8, and 1 point, respectively. Third, because the "physical discomfort" index is a binary dummy variable in the questionnaire, its value is 1 point when the respondent has recently felt physical discomfort and 0 points otherwise. Lastly, we multiply each of the respondent's two scores by 50% and add them together to obtain the final comprehensive index score. Table 5 shows the results using the comprehensive index as the dependent variable. The estimated effect remains significantly negative, which suggests that the comprehensive health assessment index is also effective and that our baseline results are robust.
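The calculation steps above can be written as a short function. This is a minimal sketch of the described weighting; the function and argument names are hypothetical, and the input checks are an added assumption rather than part of the paper.

```python
def comprehensive_index(self_assessment, discomfort):
    """Equal-weight composite of the two health indicators (lower = healthier).

    self_assessment: integer 1..5 (1 = very healthy, 5 = very unhealthy)
    discomfort: 1 if the respondent recently felt physical discomfort, else 0
    """
    if self_assessment not in (1, 2, 3, 4, 5):
        raise ValueError("self_assessment must be an integer from 1 to 5")
    if discomfort not in (0, 1):
        raise ValueError("discomfort must be 0 or 1")
    rescaled = self_assessment / 5  # maps 1..5 to 0.2, 0.4, 0.6, 0.8, 1.0
    return 0.5 * rescaled + 0.5 * discomfort


# A respondent who reports level 3 and recent discomfort: 0.5*0.6 + 0.5*1
example_score = comprehensive_index(3, 1)
```

Under this construction the index ranges from 0.1 (very healthy, no discomfort) to 1.0 (very unhealthy, with discomfort), so a negative treatment coefficient still indicates better health.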

Concurrent event
During the implementation of "the Notice of the State Council General Office on Issuance of the Strategic Action", other similar regulations or policies enacted in the same period could affect the treatment setting and our results. Specifically, there is a related regulation in the same period: the "Coal to Gas & Electricity (CGE)" project, which was implemented in Beijing, Tianjin, and other regions from 2014 to 2017. The CGE policy aims to shift energy consumption toward relatively clean natural gas and electricity by reducing coal consumption. This shift directly reduces the amount of coal burned and the pollutant emissions it causes. This time-varying concurrent event may therefore be correlated with both the outcome variable and the regressor, leading to bias in our estimates. In light of this concern, we adopt a difference-in-difference-in-differences (DDD) strategy that combines three types of variation: the time variation (i.e., before and after the start of the year 2014), the energy use variation (i.e., cooking with clean energy versus cooking with non-clean energy), and the regional variation (i.e., CGE target cities versus non-CGE target cities). In the DDD model, we interact the CGE target dummy with D_i × T_i, and Table 6 shows the DDD estimates. We find that the coefficient of D_i × T_i × CGE_dummy is insignificant, while that of D_i × T_i remains statistically significant at the 1% or 5% level. The results show that the impact of clean energy use on rural women's health is not affected by the CGE policy, which confirms that our results are robust.
Table 5 Robustness check by using the comprehensive index as dependent variable
Note: In both "radius nearest neighbor matching" and "kernel matching", the number of samples that met the common area assumption was 5454 (there are differences between the matching samples); neighbor = 1, caliper = 0.05; OLS regression was directly performed on the initial sample of 6016. *** indicates significance at the 1% level.
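The logic of the DDD comparison can be sketched with cell means. The example assumes a fully saturated model without covariates, in which the triple-interaction coefficient equals the DID inside CGE target cities minus the DID outside them, and the D_i × T_i coefficient equals the DID outside CGE cities. The paper's regression includes covariates, so this is only an illustration; the data and names below are hypothetical.

```python
from statistics import mean


def did(delta_y, treated):
    """Mean outcome change of treated units minus that of control units."""
    t = [d for d, flag in zip(delta_y, treated) if flag]
    c = [d for d, flag in zip(delta_y, treated) if not flag]
    return mean(t) - mean(c)


def triple_difference(delta_y, treated, cge):
    """Return (DDD, DID outside CGE cities) from unit-level outcome changes."""
    rows = list(zip(delta_y, treated, cge))
    did_cge = did([d for d, t, g in rows if g], [t for d, t, g in rows if g])
    did_non = did([d for d, t, g in rows if not g],
                  [t for d, t, g in rows if not g])
    return did_cge - did_non, did_non


# Toy outcome changes (lower = healthier), built so that clean energy has the
# same effect in CGE and non-CGE cities while CGE cities share a common shift.
delta_y = [-0.20, -0.10, 0.00, 0.00, -0.25, -0.15, -0.05, -0.05]
treated = [True, True, False, False, True, True, False, False]
cge = [False, False, False, False, True, True, True, True]

ddd, did_non_cge = triple_difference(delta_y, treated, cge)
```

In this toy example the common shift in CGE cities cancels out, so the triple difference is zero while the clean-energy effect remains, which is the pattern described for Table 6.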

Instrumental variable (IV) approach
A threat to identification is that omitted and unobserved variables may be related to both clean energy use and rural women's health, which may bias the results. To address this potential endogeneity problem, we use the instrumental variable approach to estimate the effect of clean energy use. Following the existing literature, we instrument the household's choice of cooking fuel with the distance from the community to the nearest market (Silwal and Mckay 2015). Proximity to the market captures the remoteness of a community and is an important determinant of fuel choice. A household is unlikely to switch to a market-based fuel if the fuel cannot be easily acquired. If access to a market is poor, households may prefer to stick to firewood or coal, which is more readily available locally. We also believe that this instrument satisfies the exclusion restriction; that is, proximity to the nearest market does not directly affect rural women's health after controlling for all the covariates. While individuals may move toward a market for better economic opportunities, there is no reason to believe that they would do so primarily because of their health (Silwal and Mckay 2015). Our instrument therefore satisfies the conditions for a valid instrument. The distance from the community to the nearest market is based on the CFPS question "How far is your location from the nearest market town?" We estimate the effects of clean energy use on rural women's health using a two-stage least squares (2SLS) model, in which Distance_i is the instrumental variable of this paper, namely the distance from the community of sample i to the nearest market, and the remaining variables are the same as in Eq. (2).
Specifically, β_1 is the coefficient of interest, representing the impact of clean energy use on rural women's health, and the coefficient λ_1 gives the first-stage effect of the distance from the community to the nearest market on a household's choice to cook with clean or non-clean energy. The instrumental variable results are shown in Table 7. The first-stage results show that the distance from the community to the nearest market is significant at the 1% level and negatively correlated with a household's choice to cook with clean energy, which is consistent with our expectation. In addition, the F statistic in the first-stage regression is comfortably above the rule-of-thumb value of 10, satisfying the standard criterion for instrument strength. In the second stage of the 2SLS model, with the distance from the community to the nearest market as the instrument, the size and direction of the coefficients are similar to those in Table 2. On the whole, the instrumental variable results lead to a consistent conclusion: after accounting for potential endogeneity bias, clean energy use has a relatively larger effect on rural women's health.
Table note: In both "radius nearest neighbor matching" and "kernel matching", the number of samples that met the common area assumption was 5454 (there are differences between the matching samples); neighbor = 1, caliper = 0.05; OLS regression was directly performed on the initial sample of 6016. *** and ** indicate significance at the 1% and 5% levels, respectively. The standard errors are reported in parentheses.
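The two stages can be illustrated with a minimal sketch that uses one endogenous regressor, one instrument, and no covariates. Everything here is an assumption made for illustration: the data are synthetic, the true effect of −0.15 and the data-generating process are invented, and the helper names are hypothetical. The paper's model additionally includes the control variables of Eq. (2).

```python
import random
from statistics import mean


def ols(x, y):
    """Return (slope, intercept) of a simple least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    lam, intercept = ols(z, x)                   # first stage: x on z
    x_hat = [intercept + lam * zi for zi in z]   # fitted clean energy use
    beta, _ = ols(x_hat, y)                      # second stage: y on fitted x
    return beta, lam


# Synthetic data, for illustration only. The hidden factor u raises both
# clean energy use and the health score (higher = worse health), so a naive
# OLS regression of health on clean energy use is biased.
rng = random.Random(0)
TRUE_EFFECT = -0.15
distance, clean, health = [], [], []
for _ in range(5000):
    z = rng.uniform(0, 10)                       # hypothetical distance to market
    u = rng.gauss(0, 1)                          # unobserved confounder
    x = 1.0 if 2.5 - 0.5 * z + u > 0 else 0.0    # clean energy dummy
    y = TRUE_EFFECT * x + 0.3 * u + rng.gauss(0, 0.3)
    distance.append(z)
    clean.append(x)
    health.append(y)

beta_iv, lam_1 = two_stage_least_squares(health, clean, distance)
beta_ols, _ = ols(clean, health)
```

In this setup the first-stage coefficient on distance is negative, and the 2SLS estimate lies closer to the assumed true effect than the naive OLS estimate, which illustrates why the instrumented effect can be larger in magnitude than the uninstrumented one.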

The impact of clean energy use on men's health burden
This paper analyzes the gender disparity in health burden caused by indoor air pollution in rural China. The underlying assumption is that women do all the cooking in rural areas, so that rural women's health is affected by clean energy use. In this part, we further examine whether the health improvement effect also exists for men and compare the changes in health burden between men and women to confirm the reliability of the results of this study. We therefore take men as our study subjects and again use the PSM-DID method to estimate the impact of clean energy use on men's health burden, referring to Table 2. As shown in Table 8, whether we use OLS or either PSM matching method, all regression results show that the use of clean kitchen fuel in rural households does not have any significant impact on men's health burden. The results further indicate that indoor air pollution caused by kitchen fuel in rural areas largely harms rural women only, which is exactly the health inequality and gender disparity problem to which this paper hopes to draw widespread social attention.

Heterogeneity analysis
Based on the above empirical results, the uptake of clean energy in rural areas is conducive to improving women's health. The question that now arises is whether the uptake of clean energy has different health effects on rural women with different characteristics. To answer it, we divide the sample women into groups according to their age, education level, and regional characteristics, and further investigate the health effects of their clean energy use. Given that the results of radius nearest neighbor matching and kernel matching in the PSM are consistent with each other, we take the treatment group and control group already matched by radius nearest neighbor matching as the basis of the heterogeneity analysis. Following the age categories of the WHO,⁸ the sample women are divided into a young group and a middle-aged/older group. As shown in Table 9, middle-aged/older women who replace traditional energy with clean energy for cooking experience a greater positive effect on their health than young women: their "health self-assessment" and "physical discomfort" scores are reduced by 0.1736 and 0.0878, respectively, which corresponds to health improvements of 5.18% and 20.35%. A possible reason is that the elderly are more sensitive to the environment. Previous studies have confirmed that the health of the elderly differs from that of the young, especially because the elderly often stay indoors, which makes their physical and mental health more likely to be affected by indoor air pollution (Ao et al. 2021). Therefore, middle-aged/older women, who spend more time at home, obtain more significant health improvements if their households switch to cleaner fuels in the kitchen. In addition, in Chinese rural families, most married middle-aged/older women do most of the cooking for the family, especially when their daughters-in-law go out to work with their husbands. When three generations of a family live in the same household, the grandmother generally does all the home cooking. As a result, the health of middle-aged/older women shows the greatest improvement when traditional cooking fuel is replaced by clean energy.
Table note: In the "radius nearest neighbor matching" and "kernel matching", the number of samples that met the common area assumption was 5244 and 5218, respectively; neighbor = 1, caliper = 0.05; OLS regression was directly performed on the initial sample of 5739.
Table 9 Heterogeneous results
Note: The matching method used is the "radius nearest neighbor matching" method. Neighbor = 1 and caliper = 0.05. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
Next, considering that education may affect women's understanding and acceptance of new things, and given that the sample women are generally less educated (about 50% are illiterate), this paper divides the sample into illiterate and non-illiterate groups. The analysis finds that the health of the illiterate group improves significantly when clean energy is used for cooking, which is broadly consistent with the existing literature (Imelda 2020). A possible reason is that highly educated women are more aware of health risks and take appropriate measures to protect themselves from indoor air pollution in their daily lives. On the one hand, highly educated women may pay more attention to the daily cooking environment and may actively purchase a range hood for the kitchen to reduce the harm to their health caused by cooking smoke. On the other hand, highly educated women usually pay more attention to physical health checks and are more willing to accept the promotion of clean energy applications.
Finally, to investigate regional differences, the whole sample is divided into four groups according to the region of China in which the women live: eastern, central, western, or northeastern. Northeastern women experience the most significant health improvement after using clean energy for cooking. This may be closely related to the living habits and household energy consumption of residents in northeast China. Northeast China requires heating for about half the year, and some rural families there burn dirty fuels, such as coal or wood, to heat the Kang bed-stove or to keep warm. Although the Kang is an efficient use of resources, it generates soot as well as heat as it burns fuel. Therefore, northeastern women experience greater positive health effects after replacing traditional fuel with clean energy. In rural areas of the central region, clean energy use for cooking is also found to have a positive impact on women's health.

Conclusions and policy implications
Using panel data on rural women from CFPS2014 and CFPS2016, this study applies propensity score matching with a difference-in-differences model (PSM-DID), combined with heterogeneity analysis, to comprehensively analyze the impact of rural clean energy use on women's health. The conclusions are as follows: (1) clean energy use can significantly improve the health of rural women, and (2) when clean energy replaces traditional energy, the health improvement of rural women differs according to their age, education level, and region. Middle-aged, older, illiterate, and northeastern women experience greater positive health effects. Therefore, clean energy use should be promoted in rural areas as a means of equalizing, as well as improving, women's health outcomes. The analysis is robust across different analytical methods, including radius nearest neighbor matching, kernel matching, and OLS regression, and it also passes the placebo test. We therefore conclude with confidence that clean energy use in rural areas has a positive impact on women's health.
Based on the conclusions of this study, and in order to realize the doubly positive impact of rural clean energy use on women's "health efficiency" and "health equity", the following policy measures are proposed. (1) Improve education about clean energy and raise women's awareness and knowledge of health issues. Specific cases could be used to highlight the potential risks that traditional fuel poses to women's health. Local farmers could be encouraged to move to cleaner energy and to popularize it based on their experience of its benefits. (2) Provide support for clean energy technologies. Our heterogeneity analysis shows that the health of middle-aged, older, and illiterate women benefits most from clean energy use in rural areas. However, these women are also more likely to encounter difficulties when they try to assess the pros and cons of new technologies and when they pay for converting their kitchens to clean energy use; they are therefore more inclined to continue using traditional energy and so bear more health risks. The government should try to build a more complete clean energy technology service system. For example, it could provide more efficient service guarantees for users by setting up technical stations and employing professional technicians to help households convert their kitchens to clean energy, and it could subsidize clean energy use. (3) Tailor the promotion of clean energy to local conditions. Women in different regions of China benefit to different degrees when they use clean energy. Therefore, when promoting clean energy in rural areas, the varying resource endowments and customs of the regions need to be considered. In particular, when promoting energy transformation in northeast China, we should respect the complementarity of local cooking and winter heating in the region and ensure that clean energy performs at least as well as traditional energy.
One other point worth emphasizing is that our paper has a limitation concerning the CFPS dataset. CFPS surveys are conducted every 2 years; so far, the CFPS questionnaire survey has been conducted only up to 2018. Owing to the confidentiality principle of data acquisition, we are temporarily unable to obtain the 2018 CFPS data. Our study therefore uses only the two consecutive surveys CFPS2014 and CFPS2016, which raises the question of how findings based on older data apply to the current stage. Our response is that using the 2014-2016 CFPS data to examine the research topic of this paper still has practical significance, because our paper ultimately finds that promoting clean energy use in rural areas of developing countries could significantly improve women's health and reduce the resulting health inequalities. In particular, a large number of developing countries and underdeveloped regions around the world still use traditional solid fuels for cooking and daily life. We hope that the conclusions of this study will draw their attention to the health issues of kitchen fuels and household energy use, thus prompting governments to increase their concern for women's health and the relevant policy support. We will supplement our research conclusions with the latest data as far as possible in future studies.