Incentive-based electricity demand response reduces peak load during hot spells effectively and in a way that is friendly to vulnerable groups

The world is experiencing climate change characterized by global warming, and energy conservation policies that can reduce greenhouse gas emissions are attracting increasing attention. Hot weather is one of the most important contributors to the annual peak load of power consumption. Incentive-based electricity demand response (EDR) policies can serve as an important regulation tool during energy system operations, especially in countries with a regulated power market like China. However, whether people will sacrifice comfort to respond to such a policy during hot spells, and what the impact will be on vulnerable groups, remain unclear. To answer these questions, large-scale EDR trials involving more than 150,000 households were conducted in southwestern China during continuous extreme high temperatures. Households' 15-min electricity consumption data, hourly meteorological data, and matched survey data were integrated to estimate the regulatory effect of this EDR policy and the discrepancies in response behaviors between urban and rural households, as well as among households with children and households with elderly members. We found that this incentive-based EDR policy performs similarly to price-based policies in effectively reducing the peak load, but with little adverse effect on vulnerable groups. A temperature rise during a hot spell slightly weakens the reduction effect. The energy-saving potential of urban households was higher than that of rural households. Households with children did not respond to the EDR policy, while households with elderly members responded more positively during a hot spell. In addition, repeated and frequent implementation of this policy did not result in attenuation of the regulatory effect on power consumption.
This is one of the few energy conservation options to have undergone multiple trials before large-scale promotion in China, and the results can serve as a reference for countries with similar regulated power markets. Although no direct harm is done to vulnerable groups, they still need to be considered before the policy is deployed nationwide, to avoid exacerbating existing energy injustices or creating new ones through transfer payments.


INTRODUCTION
The world is experiencing climate change characterized by global warming, and effective ways to reduce greenhouse gas emissions through energy conservation are actively being explored. High temperatures in summer are one of the main causes of peak loads on a power grid 1 , and this situation can lead to increases in social energy infrastructure investments and carbon emissions. In China, with the promotion of electrification processes and improvements in people's living standards, the electricity consumption of Chinese households has experienced dramatic growth 2 . In 2018, China's electricity consumption reached 6,844.9 TWh, which accounted for 27.5% of the world's energy consumption. In the same year, the electricity consumption of China's urban and rural households reached 968.5 TWh, which represents an increase of 10.4%. The surge in household demand for electricity is an important reason for the double-digit growth in power peak loads and the increased peak-valley difference (the top 5% of the peak load lasts for only dozens of hours per year). To fill this gap, China's installed power generation capacity exceeded 2 billion kW in 2019. At the same time, the problems associated with insufficient power supplies during peak periods are still prominent, and power rationing occurs from time to time 3 . Some countries, not limited to China, now face dual pressures: underutilization of newly installed capacity and increased emissions on the supply side of the power grid, and rising energy consumption on the demand side. How best to guide households to use energy rationally, so as to reduce the need for energy investments and the emissions from newly installed power plants, has become an important practical problem that urgently needs to be solved.
Price-based policies, such as time-of-use (TOU), and incentive-based electricity demand response (EDR) policies represent important regulatory tools for household demand side management (DSM), and such policies have been widely used all over the world 4,5 . In academic research circles, the regulatory effects of TOU have been the focus of extensive discussions. The TOU approach induces households to transfer electricity consumption behaviors from peak hours to non-peak hours to satisfy demand fluctuations 6,7 . However, for households that already struggle with electricity bills, TOU can be detrimental [8][9][10] . Specifically, households suffering from energy poverty are forced to make trade-offs between paying for electricity bills versus other necessities such as food and medicine 11,12 . In addition to price adjustments, the use of remote controls for smart home appliances and Home Energy Management Systems (HEMSs) during peak periods can help to limit energy consumption 13 -16 , and these topics have been well-explored to provide consumers with optimal solutions such as the best rate plan and best energy efficiency level 17,18 . In practice, the American Pecan Street project has verified "smart home" adjustment technology and capabilities by collecting and processing household electricity data. Additionally, the Low Carbon London project collected electricity data from more than 5,000 households by the use of smart meters in London and analyzed the impacts of low-carbon technology on its distribution network, and the Customer Behavior Trials project, which was conducted in Ireland, determined how social attributes, lifestyles, housing, and other factors affect customer electricity consumption behaviors.
Notably, the above studies were conducted either under a deregulated electricity market in which flexible TOU tariffs were implemented or with the aid of smart equipment that required high installation costs. Such policies would therefore be difficult to implement in countries with a partially monopolized power market like China, or on a large scale in households with poor economic conditions. Furthermore, the influence of those tools on vulnerable groups has not received much attention, except for the discussion by White et al. (2020) 19 , which used static TOU rates as the demand-side response (DSR) measure; the findings suggested that these tools may not only worsen the trade-off pressure on the vulnerable (the "heat or eat" dilemma) but also be inoperable in countries with a regulated power market like China.
The impacts of climate change on energy end users across the globe have been extensively investigated. It has been suggested that diurnal and seasonal changes in electricity demand are largely driven by human responses to meteorological factors 20,21 . Among all meteorological factors (temperature, humidity, precipitation, wind, cloud cover, sunshine, etc.), temperature has the highest correlation with electricity consumption 22,23 . The excessive use of air conditioners in extreme weather can lead to extreme peak demands for electricity 24 , and as households' income level increases, the demand for electricity due to the use of air conditioners will increase dramatically 25 . This comfort seeking behavior appears as a U-shaped function of temperature 26 . Especially when households are repeatedly exposed to elevated indoor temperatures, their temperature tolerance might change 27,28 . Implementing EDR policies in high or low temperature environments can reduce or transfer loads as the grid approaches its capacity. The behaviors of households in terms of demand response activities that involve temporary adjustments in the air-conditioning temperature have been explored 29 -35 . In hot and humid climates, an increase in the cooling setting temperature from 23°C to 26°C has been found to be largely acceptable to households 36 . In addition, the sustainability of DSM policies has also been discussed. These might bring about long-term effects such as habituation and habit formation 37,38 .
Most of the existing studies are based on electricity price signals for the implementation of DSM, and few studies have explored how to carry out large-scale incentive-based EDR policies, which involve a voluntary reduction in power consumption supported by extra government incentives. Incentive-based EDR policies, as an important regulatory tool in energy system operations, will likely play an increasingly important role, especially in countries with a regulated power market like China. However, whether people will sacrifice their comfort to respond to an incentive-based policy during hot spells, and what impact such a policy has on vulnerable groups, remain unanswered questions. Additionally, the sustainability of such policies needs to be tested before extensive promotion. This is the first study to analyze energy conservation options through trials of this scale in China under a non-flexible energy pricing market, and the results can serve as a reference for countries with similar regulated power markets, which have received insufficient attention in previous demand response practices.
During the summer of 2019, China experienced a record-setting hot spell, and some provinces were subjected to temperatures above 35°C for 52 days. In order to identify the regulatory effects of the EDR policy on households' electricity consumption under high temperatures, six EDR trials were conducted on more than 150,000 households in southwestern China during these continuous extremely high-temperature weather events. Before the trials, high-speed power line communication (HPLC) meters were installed in the study area to collect high-frequency power consumption monitoring data before and after the pilot. The households' 15-min electricity consumption data and hourly meteorological data monitored by the stations were collected. First, we explored voluntary peak load reduction behaviors induced by the EDR policy and identified the sensitivity of the regulatory effect to temperature increases during hot spells. Then, we investigated the heterogeneous nature of the regulatory effect on vulnerable groups through matched household surveys and compared the response effects among urban households, rural households, households with elderly members, and households with children. Finally, we analyzed the sustainability of the policy and habit formation.

Contribution of hot spells to the households' electricity consumption.
The rise in temperatures made a significant positive contribution to the peak load in the power network. Figure 1a shows that the daily maximum temperature in regions of southwestern China exceeded 35°C (95°F) for more than 10 days, during which the peak load value of the power network was above 17.0 GWh. The maximum load of the power network increased with the daily maximum temperature. Daily maximum temperatures of 36°C and 39°C corresponded to daily maximum loads of 17.7 GWh and 21.3 GWh for the whole network, respectively.
We analyzed electricity consumption behavior patterns during hot spells for households within cities, counties, county towns, towns, and the countryside. As shown in Figure 1b, residential electricity consumption increased dramatically during extreme heat days and contributed substantially to peak loads during these periods. There were two peak periods in household electricity consumption; one at 13:00 and one around 21:00. The main peak around 21:00 may indicate that electricity consumption changes show distinct hysteresis to temperatures, and this might have been caused by the increased quantity of household appliances in operation due to increases in the number of people at home at night. During hot spells, when the temperature rose by 1°C, household electricity consumption rose by 0.059 kWh (P = 0.000; Table S1). This result is consistent with the results of previous research 39 . In Figure 1c, we describe all peak loads of the entire power network, and as shown, peaks usually occurred between 20:00 and 21:30 during the hot spells. This is also the period of our EDR pilot, which aimed to shift peak loads.

Influence of the incentive-based EDR policy on reductions in household electricity consumption.
In order to explore whether incentive-based EDR policies can effectively induce households to reduce the peak load of the power grid by adjusting their electricity consumption behavior, we conducted six large-scale incentive-based EDR trials in southwestern China with more than 150,000 households. In the experimental design, we selected one day (when the temperature was higher than 35°C) as the response day and the previous day as the benchmark day. On the morning of the pilot day, the system randomly selected users within a specific range of districts. The power grid then sent a message through the platform to inform these users that a power demand response process would be conducted from 20:00:00 to 21:30:00 in the evening. If the power consumption during the EDR period on the response day was 1 kWh lower than that for the same period on the benchmark day, the saved power consumption generated a cash subsidy of $0.143/kWh. Users who wished to participate in the EDR program had to reply "yes" to the platform. After the EDR program was complete, the platform settled users' accounts and issued rewards accordingly. In our analysis, users who replied "yes" to the platform were treated as the EDR group, and those who never received any message were treated as the control group. In the pilot area, the EDR group was smaller than the control group, and the response index (i.e., the electricity saved) of the treatment group was higher than that of the control group (Figure 2). Specifically, based on basic statistical results, the average response index of the treatment group was about 0.15 kWh, and the control group's value was approximately 0.03 kWh. In addition, we estimated the influence of the temperature change on the response behavior during the pilot. We found that the response index was lower where the temperature rise was larger. 
In the treatment group, the average response index values were approximately 0.19 kWh and 0.10 kWh in areas where there was almost no temperature rise and where there was a temperature rise of 3°C, respectively. We used the data set of the sixth EDR pilot, which had the largest participation scale, for further analyses. All analyses were performed by using STATA SE 15.1. The pilot included an EDR treatment group and non-EDR control group. We found that the EDR policy could regulate electricity consumption under continuous high temperatures effectively, and households participating in the EDR policy voluntarily saved 0.104 kWh of electricity consumption on average during the 1.5 h of peak load (Table 1).
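The response index and the settlement step described above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual settlement logic: the function names are hypothetical, and the rule that the subsidy is paid linearly per kWh saved, with no payment when consumption does not fall, is an assumption (the text does not state whether a minimum saving is required).

```python
SUBSIDY_USD_PER_KWH = 0.143  # subsidy rate stated in the text


def response_index(benchmark_kwh, response_kwh):
    """Electricity saved in the EDR window: benchmark-day use minus response-day use."""
    return benchmark_kwh - response_kwh


def reward_usd(benchmark_kwh, response_kwh, rate=SUBSIDY_USD_PER_KWH):
    """Assumed rule: pay `rate` per kWh saved, and nothing if use did not fall."""
    saved = response_index(benchmark_kwh, response_kwh)
    return max(saved, 0.0) * rate


# Hypothetical household: 1.20 kWh on the benchmark day, 1.05 kWh on the response day.
saved = response_index(1.20, 1.05)
payment = reward_usd(1.20, 1.05)
```

With these hypothetical readings the saving is about 0.15 kWh, which is the order of magnitude of the average response index reported for the treatment group.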
For the pilot conducted during hot spells in southwestern China, electricity consumption was expected to be driven by cooling needs. We used a difference-in-difference-in-differences (DDD) approach to examine whether the EDR policy resulted in electricity saving behavior for households during hot spells with high temperatures (see the EXPERIMENTAL PROCEDURES for further details). Each model compared the control group to the EDR group. There were two models for each group. One contained the EDR policy indicator, and the other contained the rising temperature indicator. The dependent variable is the electricity consumption during the 1.5-h pilot period, the independent variables are the full set of interaction terms and main effects required for a triple difference model, and the control variables are the air quality and climate-related variables (see the EXPERIMENTAL PROCEDURES). The double difference term EDR×Post in odd-numbered models reflects the effects of the EDR policy on electricity saving behavior, and the triple difference term EDR×Post×TempDiff in even-numbered models gives the estimated effects of the EDR policy for vulnerable households during hot spells with high temperatures (see the EXPERIMENTAL PROCEDURES). As expected, the incentive-based EDR policy effectively reduced the peak load (Coef. = 0.104; P = 0.000; Table 1, Figure 4, and Table S3), and the temperature increase during hot spells resulted in a slight drop in the reduction effect (Coef. = 0.046; P = 0.018; Table 1; Table S3). When the temperature rose by 1°C during the trial, the effect of the demand response decreased by 0.034 kWh (P = 0.010; Table S4).
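To make the roles of the EDR×Post and EDR×Post×TempDiff terms concrete, the following sketch fits a fully interacted triple-difference regression by ordinary least squares on synthetic data. It is an illustration under stated assumptions, not the authors' STATA specification: all three indicators are treated as binary, the weather controls are omitted, a single saturated model is fitted rather than separate odd- and even-numbered models, and the coefficient values are invented.

```python
import numpy as np

TERMS = ["const", "EDR", "Post", "TempDiff", "EDR_x_Post",
         "EDR_x_TempDiff", "Post_x_TempDiff", "EDR_x_Post_x_TempDiff"]


def fit_triple_difference(y, edr, post, temp_diff):
    """OLS with all main effects and interactions of three indicators."""
    y, edr, post, temp_diff = (np.asarray(v, dtype=float)
                               for v in (y, edr, post, temp_diff))
    X = np.column_stack([
        np.ones_like(y), edr, post, temp_diff,
        edr * post, edr * temp_diff, post * temp_diff,
        edr * post * temp_diff,
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(TERMS, beta))


# Synthetic, noise-free data: one observation per (EDR, Post, TempDiff) cell.
cells = [(e, p, t) for e in (0, 1) for p in (0, 1) for t in (0, 1)]
edr, post, temp = (np.array(c, dtype=float) for c in zip(*cells))
# Invented effects: EDR lowers use by 0.10 kWh after the trial starts,
# and a temperature rise offsets 0.04 kWh of that reduction.
y = (1.0 + 0.05 * post + 0.02 * temp + 0.03 * post * temp
     - 0.10 * edr * post + 0.04 * edr * post * temp)
coefs = fit_triple_difference(y, edr, post, temp)
```

In this sketch the dependent variable is consumption, so a saving appears as a negative EDR_x_Post coefficient and a weakened saving as a positive EDR_x_Post_x_TempDiff coefficient; the sign convention in the reported tables may differ.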
In order to counter the hot feelings caused by increased temperatures, households who did not participate in the EDR policy generally used more electricity (Figure 3a), and they may have reduced the temperature setting on air conditioners to improve comfort or increased the frequency of water use, with equipment that requires electricity for operation. This increased electricity was equivalent to households' reducing the air-conditioning temperature setting by 0.5°C during the demand response period. Households participating in the EDR policy generally reduced their electricity consumption (Figure 3b), and households with a higher daily electricity consumption showed greater electricity saving potential.

Figure 3. Electricity consumption involved in EDR trials during hot spells.
This figure represents the effects of EDR on the electricity consumption pattern of households in each group (see EXPERIMENTAL PROCEDURES for details) when the temperature rises. The horizontal axis is the number of each group (sorted by the difference in electricity consumption before and after EDR), and the value in parentheses is the proportion of households in each group. In subgraphs A and B, the lengths of the red bars and green bars represent the increased and reduced electricity consumption of households during the EDR, respectively. The two endpoints of each line show the maximum and minimum electricity consumption. (a) Subgraph A reflects the change in electricity consumption of the control group. The average increases in electricity consumption for 40%, 50%, and 10% of the households (ordered from the lowest to the highest increase) were 0.01 kWh, 0.04 kWh, and 0.16 kWh, respectively. (b) Subgraph B reflects the changes in electricity consumption of the EDR group. The average electricity savings for 40%, 50%, and 10% of the households (ordered from the lowest to the highest saving) amounted to 0.06 kWh, 0.16 kWh, and 0.46 kWh, respectively. (c) Subgraph C reflects the changes in electricity consumption within each group during the EDR. The blue bars indicate the electricity consumption (predicted value) of the households when the temperature had not changed. The orange bars indicate the increase in electricity consumption caused by the temperature rise. The green bars reflect the decrease in electricity consumption after participation in EDR trials.

Discrepancy among vulnerable groups in sensitivities to EDR
We analyzed the heterogeneity in the regulatory effects of the incentive-based EDR policy on households in different regions during hot spells. Rural households participating in the EDR policy used 0.066 kWh (P = 0.000; Table 1; Table S5) less than the control group. This is equivalent to rural households switching off their electric fans for the 1.5-h EDR period, and the electricity consumption of rural households did not change significantly when the temperature rose. The urban households who participated in the EDR policy saved 0.110 kWh (P = 0.000; Table 1; Table S5) more than the control group, which amounts to approximately 1.67 times the electricity saved by rural households. Additionally, urban residents were more sensitive to the heat, and the effects of the EDR policy on this group decreased in the presence of persistent overheating. In areas experiencing rising temperatures, the demand response of urban households decreased by 0.079 kWh (P = 0.013; Table 1; Table S5) on average. This is equivalent to urban households lowering their bedroom air-conditioner temperature setting by 0.7°C during the 1.5-h EDR period.
The energy saving potential of urban households is higher than that of rural households, which is related to household income levels and quantities of household appliances 40 . Urban households have higher disposable income levels, larger living areas, and more household appliances 41 , which leads to higher electricity consumption and thus more saving potential. In addition, the efficiency of household appliances and electronic devices is also critical to energy conservation, and wealthy households have the ability to purchase smart devices and install home power management systems that will help them to save more when needed. Both urban and rural households can reduce their electricity consumption under the stimulus of monetary incentives; however, a temperature rise does not readily affect the electricity consumption behavior of rural households. Notably, rural households have fewer household appliances, and some economically disadvantaged families can only afford light bulbs. Rural households in China are less well off than urban households: in 2018, the per capita disposable income of urban households was 2.69 times that of rural households 2 , and most of China's rural households are low-income households. Hence, even if rural households were willing to participate in the EDR policy pilot, their energy saving potential was very limited. Urban households, which have higher income levels, increase their use of electrical appliances when the temperature rises in order to improve the comfort of the living environment.
Demand response policies can have heterogeneous effects on households with children present and those with elderly members present during a hot spell 42,43 . Here, the EDR policy had no significant effect on households with children, and the results were not sensitive to temperature either, but a larger effect was detected for households with elderly members. Elderly households participating in the EDR policy saved 0.193 kWh (P = 0.002; Table 1; Table S6) more than the control group on the pilot day, which is equivalent to turning off the air conditioner for 10 min during the EDR pilot or raising the temperature setting by 0.64°C throughout the EDR period (Figure 4). There are two possible reasons why the EDR policy had no effect on households with children. First, when children are young and have weak physical resistance to fluctuations in temperature, parents tend to provide them with a comfortable living environment 44 . Second, Chinese children face abundant homework and study pressure in the summer, and parents are willing to provide a cool learning environment for them during hot summers 45 . The comfort and learning efficiency of children outweigh the limited financial incentives. The significant effect of the EDR policy on elderly households could be due to the thrifty habits of older generations in Asian countries. According to continuity theory, a person lives and develops in specific environmental conditions, and the lifestyle thus formed persists and guides the activities of the elderly. Since the Asian elderly have a frugal lifestyle 46 , households with elderly members are more inclined to save electricity when offered monetary incentives.
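The conversion from saved electricity to appliance running time used in such equivalences is simple arithmetic: minutes = saved energy (kWh) / appliance power (kW) × 60. The sketch below illustrates it. The function name is hypothetical, and the 1.158 kW air-conditioner rating is an assumption back-calculated from the "0.193 kWh is about 10 min" statement above; it is not taken from the conversion standards in Table S11.

```python
def equivalent_minutes(saved_kwh, appliance_kw):
    """Minutes an appliance of the given power could run on the saved energy."""
    if appliance_kw <= 0:
        raise ValueError("appliance power must be positive")
    return saved_kwh / appliance_kw * 60.0


# Assumed air-conditioner power draw (kW), implied by the 10-min equivalence.
ASSUMED_AC_KW = 1.158
minutes_off = equivalent_minutes(0.193, ASSUMED_AC_KW)
```

Any other appliance can be substituted by changing the assumed power rating.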

Figure 4. Power saved by incentive-based EDR and related home appliance working minutes.
The data in the layout of the house above the horizontal line show the time corresponding to the average electricity saved by this 1.5-h EDR trial for each type of household appliance (see the conversion standards in Table S11). Data below the horizontal line indicate the electricity saved by urban, rural, and elderly households participating in this EDR; these data were transformed into the working minutes of the household appliances. Each circle encompasses 60 min, and the value above each circle represents the percentage of the household appliance's working time during the 1.5-h EDR, while the value below each circle represents the working minutes of each appliance that the electricity saved by the EDR could support, that is, the reduction in appliance use that would achieve this energy saving.

Sustainability of the incentive-based EDR effect
Moral persuasion and economic incentives are frequently used by regulators to influence intrinsic and extrinsic motivations for a variety of economic activities 47 . A central question for economists and policymakers designing such policies is whether appealing to intrinsic and extrinsic motivations can generate persistent effects on economic activities 48 . We selected the households (n = 864) that received invitation text messages consecutively in six EDR trials and analyzed these data by using the method of individual fixed effects regression. The dependent variable was the change in electricity consumption during the trial. The key independent variable was the number of times the household chose to participate in the EDR policy. The resulting coefficient was used to estimate the persistent effects of continuous performance in the trials, and this method controlled for temperature changes and other weather-related variables. As expected, the more times the households participated in the EDR policy, the more electricity was saved in general (Coef. = 0.046; P = 0.002; Table S7). Repeated and frequent implementation of this policy did not result in the attenuation of the regulatory effects on power consumption. Repeated economic incentives induced larger treatment effects, and these might have helped to solidify energy saving strategies or promoted habit formation. There are two potential mechanisms. One is related to "learning by doing." At the beginning, residents might not know of many ways to save energy, but as the trials are carried out repeatedly, residents begin to plan ahead and engage in practices such as preheating the water heater and refrigerating the room to a lower temperature in advance. The other possibility might be related to investments in smart energy-efficient appliances, which we do not expect to happen extensively anytime soon, but we can expect this to become more commonplace over the long run. 
If such an effect was systematically large for the households who participated in the EDR pilot, it could explain the persistent usage reductions, including weaker habituation and stronger habit formation.

DISCUSSION
We conducted six EDR trials involving more than 150,000 households from southwestern China during hot spells. High-frequency power consumption data, hourly meteorological data, and matched survey data were integrated for the analyses. Then, the regulatory effects of an incentive-based EDR policy during hot spells were estimated, and differences in voluntary peak load reduction behavior among vulnerable groups were explored. Finally, the long-term effectiveness of the stimulus was examined.
Incentive-based EDR policies were found to be an effective way to encourage residents to adopt voluntary peak load reduction behaviors during hot spells. Unlike TOU rates, a common DSR measure that might worsen the "heat or eat" dilemma of the vulnerable and can be inoperable in countries with a regulated power market like China, incentive-based EDR policies can promote voluntary reductions in power consumption with extra government incentives. Such policies are effective in shifting demand away from peak hours during hot spells and have no side effects related to trade-off pressures within vulnerable groups. Furthermore, such policies can help with energy conserving habit formation, which could lead to additional policy impacts in post-intervention periods. As a supplementary policy to the step tariff approach, this is one of the few treatment options that have gone through multiple trials. It provides a practical and flexible solution for relieving seasonal power supply and demand contradictions and reducing carbon emissions against the background of global warming, especially for countries with a regulated power market.
However, before this policy can be rolled out nationally, which was the original intention of the pilot research, vulnerable groups need to be reconsidered so as to avoid creating new energy injustices. The heterogeneity analysis showed that urban residents have about 2.16 times more energy saving potential than rural residents (who account for 40% of China's population and are often low-income groups) when provided with financial incentives. When publicity and organization budgets are limited, power companies and local governments might be more willing to carry out EDR policy trials in affluent areas. The money used to incentivize EDR policies could then be disproportionately distributed to urban households, who find it easier to reduce electricity demand. These types of transfer payments may be unfavorable for low-income groups.

Experimental design and data collection
We conducted six electricity demand response trials based on monetary rewards in southwestern China from July 18, 2019, to August 21, 2019 (Table 2). Before the pilots, we installed high-speed power line communication (HPLC) smart meters in the relevant households to collect high-frequency electricity consumption data. At the same time, the experimental effect was evaluated by combining the three-year electricity consumption data of each household with the hourly meteorological data monitored by nearby stations. We conducted three pre-trials to ensure that data collection procedures were correct and gradually expanded the scope of the trial. On the morning of the trial, we sent a message to all trial households through the network platform to inform them that we would carry out the electricity demand response (EDR) policy from 20:00:00 to 21:30:00 that evening (response day). If the electricity consumption during the EDR policy implementation on the response day was 1 kWh lower than that during the same period on the benchmark day (the day before the trial), the saved electricity generated a cash subsidy of $0.143/kWh. Households who were willing to participate in the EDR program had to reply "yes" to the platform. After the EDR trial ended, the platform settled each household's account and distributed the rewards accordingly. In these pilot areas, we used HPLC smart meters to collect residential electricity data 49 , and a total of ten million electricity data points were collected. During the trial, there were persistent high temperatures in the area, with daily maximum temperatures above 35°C. Figure 2 illustrates the distribution of household groups in the pilot area. At the same time, we conducted a questionnaire survey on randomly selected households and asked whether there were children or elderly people in the homes. 
A total of 5,170 questionnaires were distributed, and through a rigorous screening process, 4,674 valid questionnaires were finally recovered with an effective recovery rate of 90.41%. In addition, we also collected the historical annual electricity consumption data for the households and the hourly meteorological data monitored by nearby stations. Then, we evaluated the EDR policy effects by combining these multiple sources of data.
We pre-processed the electricity data for the electricity demand response trial from August 18th to August 19th. We deleted the samples with missing or abnormal values caused by the collection and transmission of data by the HPLC meters, and we removed households whose electricity consumption was always 0 (vacant homes). The final sample retained 122,132 households, and the daily maximum temperature in these regions was above 35°C (Table S2). In the areas where 50,785 of these households were located, the temperature continued to rise during the trial, with the rise reaching as much as 3°C.
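The screening rules just described can be sketched as a small filter over per-household meter series. This is an illustrative sketch only: the data layout (a dictionary of 15-min kWh readings per household), the function name, and the treatment of negative readings as "abnormal" are assumptions, since the text does not define the abnormal-value criterion.

```python
import math


def clean_households(readings):
    """Keep households with complete, plausible, not-all-zero meter series.

    `readings` maps a household id to a list of 15-min kWh values,
    where None or NaN marks a missing reading (assumed layout).
    """
    kept = {}
    for household_id, series in readings.items():
        # Drop series with missing values or (assumed) abnormal negative values.
        if any(v is None or math.isnan(v) or v < 0 for v in series):
            continue
        # Drop series that are always zero, treated as vacant homes.
        if all(v == 0 for v in series):
            continue
        kept[household_id] = series
    return kept


# Hypothetical example: only household "a" survives the screening.
sample = {
    "a": [0.10, 0.20],
    "b": [0.0, 0.0],     # vacant home
    "c": [0.10, None],   # missing reading
    "d": [0.30, -1.0],   # abnormal value
}
cleaned = clean_households(sample)
```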

Relationship between temperature and electricity consumption
Based on 11 days of electricity consumption data for 3,992 households in August 2019, combined with the hourly temperature data for the same period, the individual fixed effect model was expressed as follows:

$$\mathit{Electricity}_{it} = \beta_0 + \beta_1 \mathit{Temperature}_{it} + \beta_2 \mathit{Weekend}_{t} + \gamma\, \mathit{Controls}_{it} + \mu_i + \varepsilon_{it} \quad (1)$$

Among these terms, $\mathit{Electricity}_{it}$ refers to the total electricity consumption of household $i$ in period $t$ from 20:00 to 21:30; $\mu_i$ is the individual fixed effect, which refers to those influencing factors that do not change with time, such as age and income. Considering the difference in electricity consumption between working days and non-working days, we added a control variable, $\mathit{Weekend}_{t}$, indicating whether the day was a weekend or a weekday. $\mathit{Controls}_{it}$ represents the remaining control variables, including air quality and climate related variables, namely, PM2.5, wind direction (Wind_direction), wind speed (Wind_speed), wind level (Wind_level), relative humidity (Humidity), atmospheric pressure (Atmos_pressure), and water vapor pressure (Vapor_pressure). By controlling for these factors, this model can measure the relationship between temperature and electricity consumption during the EDR pilot period.
The results showed that, after controlling for other climate-related factors, a rise in temperature increased electricity consumption during the EDR pilot period, and the response was stronger on holidays than on working days (Table S1).
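To make the individual fixed effect idea concrete, the sketch below implements the within (demeaning) estimator for a single regressor in plain Python. It is a simplified illustration under stated assumptions: only temperature is included, the weekend dummy and the other controls are omitted, and the tuple layout and the name `within_slope` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Obs = Tuple[str, float, float]  # (household_id, temperature, electricity_kwh)


def within_slope(panel: List[Obs]) -> float:
    """Temperature coefficient after removing household fixed effects."""
    groups: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for household_id, temp, kwh in panel:
        groups[household_id].append((temp, kwh))
    num = den = 0.0
    for rows in groups.values():
        mean_t = sum(t for t, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for t, y in rows:
            # Demeaning within a household cancels its time-invariant effect.
            num += (t - mean_t) * (y - mean_y)
            den += (t - mean_t) ** 2
    return num / den


if __name__ == "__main__":
    panel = [
        ("a", 34.0, 7.8), ("a", 35.0, 8.0), ("a", 36.0, 8.2),
        ("b", 33.0, 11.6), ("b", 35.0, 12.0), ("b", 37.0, 12.4),
    ]
    print(within_slope(panel))
```

In this toy panel the two households have very different consumption levels but the same sensitivity to temperature; demeaning removes the level difference, so the slope reflects only within-household variation.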

Empirical strategy and baseline model
The difference-in-difference (DID) method was selected as our identification strategy. We compared changes in electricity consumption between households that participated in the EDR trial (treatment group) and households that never received any related messages (control group), as well as electricity consumption before and after trial implementation in the treatment group, to study the treatment effects of this incentive-based EDR. As a classic method of causal inference, the DID method has been widely used in many studies 50-53. It generally assumes that the phenomenon studied is subject to a clear exogenous shock, and it is crucial that the parallel trend hypothesis holds for the treatment and control groups. Demand response trials can be seen as an exogenous shock; however, the selection of the treatment group might not have been completely random, e.g., some low-income households, households with a high saving potential, or residents who are more sensitive to monetary incentives might have been more inclined to participate in the EDR trial. This could generate sample self-selection bias. The usual approach for solving this problem is through matching methods. However, because of the randomness, dynamics, and uncertainty of individual power consumption behaviors during the 1.5-h study period, it would be difficult for traditional matching methods (such as propensity score matching) to capture all of the relevant factors through control variables. We thus adopted a dynamic time warping (DTW) based matching method to ensure the parallel trend between the treatment and control groups 54. This method integrates individual households' 15-min high-frequency and monthly low-frequency power consumption data on a micro-scale. From the perspective of behavioral outcomes, we believe that households with similar electricity consumption patterns in the long term (36 months) and short term (15 min) before the trial should be comparable and meet the parallel trend assumption.
We applied this method to divide all samples into 100 groups with different types of electricity consumption patterns. After DTW matching, which fully captured the similarity between households' short-term electricity behavior fluctuations and long-term electricity consumption behavior patterns, we could ensure that the potential outcomes of households' electricity consumption behaviors with and without EDR participation were randomly distributed within each group. Figure S1 shows the electricity consumption of households in the different groups; the consumption trends of the treatment and control households were parallel within each group.
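For readers unfamiliar with DTW, the following sketch shows the classic dynamic-programming DTW distance between two load profiles and a nearest-profile lookup built on it. It illustrates only the distance measure; it is not the grouping procedure of reference 54, and the helper `nearest_control` and the absolute-difference local cost are assumptions made for this example.

```python
from typing import Dict, List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_control(treated: Sequence[float], controls: Dict[str, List[float]]) -> str:
    """ID of the control household whose profile is closest in DTW distance."""
    return min(controls, key=lambda cid: dtw_distance(treated, controls[cid]))


if __name__ == "__main__":
    treated = [0.2, 0.5, 0.9, 0.4]
    controls = {"c1": [0.2, 0.2, 0.5, 0.9, 0.4], "c2": [1.0, 1.0, 1.0, 1.0]}
    print(nearest_control(treated, controls))
```

Because DTW allows one profile to be stretched or compressed in time, a household whose evening peak arrives 15 minutes later can still be matched to one with the same peak shape, which a point-by-point distance would penalize.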
The samples were divided into a treatment group and a control group according to whether households participated in the EDR policy. The model is shown in Equation 2:

$$\mathit{Electricity}_{it} = \beta_0 + \beta_1 \mathit{EDR}_{i} + \beta_2 \mathit{Trial}_{t} + \beta_3 \left(\mathit{EDR}_{i} \times \mathit{Trial}_{t}\right) + \gamma\, \mathit{Controls}_{it} + \varepsilon_{it} \quad (2)$$

where $\mathit{Electricity}_{it}$ is the electricity consumption in the 1.5 hours of the EDR window, $\mathit{EDR}_{i}$ is a dichotomous variable set to 1 if the household was on EDR and 0 if the household was in the control group, and $\mathit{Trial}_{t}$ is a dichotomous variable set to 1 if the date was the response day and 0 for the benchmark day. These terms are included as controls for each individual indicator. Subscript $i$ refers to a term that differs across subjects but is constant over time for a given subject. Subscript $t$ refers to a term that changes over time but is constant across subjects at any given point in time. Terms with subscript $it$ vary across both subjects and time.
The interaction term $\mathit{EDR}_{i} \times \mathit{Trial}_{t}$ captures the effect on electricity saving due to the EDR during the trial, and takes a value of 1 for households that were on EDR on the response day and 0 for all others. $\mathit{Controls}_{it}$ represents other control variables, including PM2.5, wind direction, wind speed, relative humidity, atmospheric pressure, water vapor pressure, and other variables related to temperature changes, as well as the households' average monthly electricity consumption and the city level related to households' characteristics. The results confirmed that the EDR policy had a positive stimulating effect on households' electricity-saving behavior (Table S3).
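The logic of the DID interaction term can be illustrated without a regression package. With two groups, two days, and no covariates, the interaction coefficient equals the difference between the treated and control changes in mean consumption, which the sketch below computes directly. The row layout and the name `did_estimate` are hypothetical, and the control variables and standard errors of the full model are deliberately omitted.

```python
from statistics import mean
from typing import List, Tuple

Row = Tuple[int, int, float]  # (edr dummy, response-day dummy, electricity_kwh)


def did_estimate(rows: List[Row]) -> float:
    """2x2 difference-in-differences of cell means (no covariates)."""
    def cell(edr: int, trial: int) -> float:
        return mean(y for e, t, y in rows if e == edr and t == trial)

    treated_change = cell(1, 1) - cell(1, 0)
    control_change = cell(0, 1) - cell(0, 0)
    return treated_change - control_change


if __name__ == "__main__":
    rows = [
        (1, 0, 3.0), (1, 0, 3.2), (1, 1, 2.4), (1, 1, 2.6),
        (0, 0, 3.0), (0, 0, 3.4), (0, 1, 3.1), (0, 1, 3.5),
    ]
    print(did_estimate(rows))  # negative value: treated households cut consumption
```

A negative estimate means that consumption in the EDR window fell in the treatment group relative to the change observed in the control group over the same two days.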
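To estimate the impact of the EDR policy on households' electricity-saving behaviors when the temperature rose, we built a difference-in-difference-in-differences (DDD) model, described by Equation 3:

$$\begin{aligned}\mathit{Electricity}_{it} ={}& \beta_0 + \beta_1 \mathit{EDR}_{i} + \beta_2 \mathit{Trial}_{t} + \beta_3 \mathit{Rise}_{i} + \beta_4 \left(\mathit{EDR}_{i} \times \mathit{Trial}_{t}\right) + \beta_5 \left(\mathit{Rise}_{i} \times \mathit{Trial}_{t}\right) \\ &+ \beta_6 \left(\mathit{Rise}_{i} \times \mathit{EDR}_{i}\right) + \beta_7 \left(\mathit{EDR}_{i} \times \mathit{Trial}_{t} \times \mathit{Rise}_{i}\right) + \gamma\, \mathit{Controls}_{it} + \varepsilon_{it} \quad (3)\end{aligned}$$

where $\mathit{Rise}_{i}$ is a dichotomous variable for household $i$ set to 1 if the local temperature rose by more than 1°C during the trial period and 0 if it did not change; $\mathit{Rise}_{i} \times \mathit{Trial}_{t}$ controls for differences experienced during the trial by temperature-rising areas regardless of the EDR; $\mathit{Rise}_{i} \times \mathit{EDR}_{i}$ controls for the differences of temperature-rising areas assigned to the EDR regardless of whether the trial had begun or not. The term of interest is $\mathit{EDR}_{i} \times \mathit{Trial}_{t} \times \mathit{Rise}_{i}$, which gives the effect of the EDR during the trial in temperature-rising areas; households that live in temperature-rising areas and participated in the EDR have a value of 1 for this term during the trial; all other groups, and other time periods, take a value of 0. Errors are clustered at the household level.
$\varepsilon_{it}$ refers to the idiosyncratic error term. The results demonstrated that the EDR policy guides households to adopt adaptive behaviors, but as the temperature increases, the responsiveness of households decreases (Table 1; Table S3).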
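The triple-difference term can be read as the DID estimate in temperature-rising areas minus the DID estimate in the remaining areas; in a model without covariates the two coincide. The sketch below computes this quantity from cell means. The row layout and the name `ddd_estimate` are assumptions for illustration, and the controls and clustered standard errors of the full model are not reproduced.

```python
from statistics import mean
from typing import List, Tuple

Row = Tuple[int, int, int, float]  # (edr, response-day, temperature-rise, kwh)


def ddd_estimate(rows: List[Row]) -> float:
    """Triple difference: DID in rising areas minus DID in other areas."""
    def cell(edr: int, trial: int, rise: int) -> float:
        return mean(y for e, t, r, y in rows if (e, t, r) == (edr, trial, rise))

    def did(rise: int) -> float:
        treated_change = cell(1, 1, rise) - cell(1, 0, rise)
        control_change = cell(0, 1, rise) - cell(0, 0, rise)
        return treated_change - control_change

    return did(1) - did(0)


if __name__ == "__main__":
    rows = [
        # areas without a temperature rise
        (1, 0, 0, 3.0), (1, 1, 0, 2.4), (0, 0, 0, 3.0), (0, 1, 0, 3.1),
        # areas where the temperature rose
        (1, 0, 1, 3.5), (1, 1, 1, 3.2), (0, 0, 1, 3.5), (0, 1, 1, 3.7),
    ]
    print(ddd_estimate(rows))  # positive value: smaller reduction where it got hotter
```

In this toy example the EDR reduces consumption in both kinds of area, but by less where the temperature rose, so the triple difference is positive.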

Heterogeneity analysis of the sensitivity to EDR activity in various populations
Considering that different households have different sensitivities to the EDR policy, we adopted the DID and DDD models in the empirical strategy section and analyzed the heterogeneity through group regression. In this EDR trial, we divided the data into four sub-samples, namely, rural households (n = 31,155), urban households (n = 90,977), households with children (n = 1,923), and households with the elderly (n = 1,729).
We found that urban households have more energy-saving potential than rural households (Table 1). In addition, we designed a questionnaire covering various factors that may influence households' energy consumption behaviors, such as family structure, living habits, and the number of household appliances. We then conducted surveys through online channels (e.g., apps, WeChat, Official Account Platform), and we checked the validity of the questionnaires by follow-up telephone calls. When a returned questionnaire had low validity, we asked the participant to fill it in again or surveyed other households instead, to ensure the quality and quantity of valid questionnaire samples. Among the valid questionnaires, we found that the EDR policy had no significant incentive effect for households with children, and this result was not sensitive to temperature, but the policy did have a significant incentive effect for households with the elderly (Table S6).

Robustness test
An important assumption for using the DID method to evaluate the impact of an EDR policy on households' electricity consumption behavior is that, in the absence of the EDR policy, the treatment and control groups would follow parallel trends with no systematic difference over time. In addition to the DTW-based matching method mentioned in the empirical strategy section, we also adopted a basic parallel trend test. We selected the electricity consumption data for the same time window on six consecutive days around the EDR pilot (20:00-21:30 on August 15, 2019, to August 20, 2019) and generated six interaction terms between the time dummy variables and the treatment group dummy variable. The interaction terms were used as explanatory variables in the regression, and their coefficients reflect the difference between the treatment and control groups on a specific day. The results showed that before the implementation of the EDR pilot, the interaction terms were not significant, which indicates that there was no significant difference between the treatment and the control group before the pilot, that is, the parallel trend hypothesis was satisfied; meanwhile, the coefficient of the interaction term on the pilot implementation day (August 19, 2019) was significantly negative, and on the next day (August 20, 2019) it became insignificant again. These findings indicate that the EDR policy significantly reduced households' electricity consumption only on the trial day and that there were no systematic differences between the treatment and control groups before the trial.
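A simple way to visualize what the day-by-treatment interaction terms measure is to compute, for each day, the gap in mean consumption between the treatment and control groups. The sketch below does this with hypothetical data; it is the raw counterpart of the interaction coefficients and includes neither the control variables nor the significance tests of the actual regression. The name `daily_gaps` and the row layout are assumptions.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Row = Tuple[str, int, float]  # (date, edr dummy, kWh in the 20:00-21:30 window)


def daily_gaps(rows: List[Row]) -> Dict[str, float]:
    """Treatment-minus-control mean consumption for each day."""
    cells: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for date, edr, kwh in rows:
        cells[(date, edr)].append(kwh)
    dates = sorted({date for date, _ in cells})
    return {d: mean(cells[(d, 1)]) - mean(cells[(d, 0)]) for d in dates}


if __name__ == "__main__":
    rows = [
        ("2019-08-18", 1, 3.0), ("2019-08-18", 1, 3.2),
        ("2019-08-18", 0, 3.1), ("2019-08-18", 0, 3.1),
        ("2019-08-19", 1, 2.4), ("2019-08-19", 1, 2.6),
        ("2019-08-19", 0, 3.1), ("2019-08-19", 0, 3.3),
        ("2019-08-20", 1, 3.0), ("2019-08-20", 0, 3.0),
    ]
    for date, gap in daily_gaps(rows).items():
        print(date, round(gap, 3))
```

Under parallel trends, the gaps on non-trial days should stay close to zero, with a clear negative gap only on the response day.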
To further test the robustness of the results, we also performed a placebo test, a counterfactual check in which the implementation time of the policy is changed. Specifically, we set up hypothetical treatment and control groups and a hypothetical EDR pilot implementation time. We selected the electricity consumption data of the same households on non-demand-response days (August 15, 2019, and August 16, 2019, that is, assuming the EDR pilot was implemented some days in advance). We then took the residents who actually participated in the EDR as the hypothetical treatment group, and the remaining residents were used as the hypothetical control group. The regression results (Tables S8-S10) showed that the estimated coefficients of the interaction term between the treatment group dummy and the trial time dummy were not significant in any group, which means that, in the absence of the EDR pilot, there were no systematic differences in the changes in electricity consumption between the treatment and the control group. This indicates that our previous estimation results are robust.
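The placebo logic can be sketched by applying the same difference-in-differences calculation to a pair of dates on which no EDR took place and comparing it with the estimate for the real trial dates. The data below are invented for illustration, the function `did_between` is a hypothetical helper, and no significance test is performed.

```python
from statistics import mean
from typing import List, Tuple

Row = Tuple[str, int, float]  # (date, edr dummy, kWh in the 20:00-21:30 window)


def did_between(rows: List[Row], before: str, after: str) -> float:
    """DID of mean consumption between two dates (no covariates)."""
    def cell(date: str, edr: int) -> float:
        return mean(y for d, e, y in rows if d == date and e == edr)

    treated_change = cell(after, 1) - cell(before, 1)
    control_change = cell(after, 0) - cell(before, 0)
    return treated_change - control_change


if __name__ == "__main__":
    rows = [
        ("2019-08-15", 1, 3.0), ("2019-08-16", 1, 3.1),
        ("2019-08-18", 1, 3.1), ("2019-08-19", 1, 2.5),
        ("2019-08-15", 0, 3.2), ("2019-08-16", 0, 3.3),
        ("2019-08-18", 0, 3.2), ("2019-08-19", 0, 3.3),
    ]
    # Placebo: pretend the response day was August 16.
    print(round(did_between(rows, "2019-08-15", "2019-08-16"), 3))
    # Actual benchmark day and response day.
    print(round(did_between(rows, "2019-08-18", "2019-08-19"), 3))
```

A placebo estimate near zero alongside a clearly negative estimate for the real dates is the pattern that supports attributing the reduction to the EDR itself.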