Causal complexity of environmental pollution in China: a province-level fuzzy-set qualitative comparative analysis

Environmental problems are endowed with the causal complexity of multiple factors. Traditional quantitative research on the influencing mechanism of environmental pollution has tended to focus on the marginal effects of specific influencing factors but generally neglected the multiple interaction effects between factors (especially three or more). Based on the panel data of 30 Chinese provinces between 2011 and 2020, this study employs fuzzy set qualitative comparative analysis (fsQCA) — which can provide a fine-grained insight into the causal complexity of environmental issues — to shed light on the influencing mechanism of environmental pollution. The results show that there are several different configurations of pollution drivers which lead to high pollution or low pollution in provinces, confirming the multiple causality, causal asymmetry, and equifinality of environmental pollution. Furthermore, the combination effect of advanced industrial structure, small population size, and technological advance is significant in achieving a state of green environment compared to environmental regulation factors. In addition, spatiotemporal analysis of the configurations indicates that strong path dependencies and spatial agglomeration exist in current local environmental governance patterns. Finally, according to our findings, targeted policy recommendations are provided.


Introduction
The rapid development of the Chinese economy has brought about serious environmental consequences (Huang et al. 2020). From 2010 to 2015, China contributed nearly 20% of global emissions of nitrogen oxides (NOx) and 30% of sulfur dioxide (SO2) (Zhang et al. 2018a). In 2016, PM2.5 concentrations in three-quarters of monitored cities in China failed to meet the national grade II standard (≤ 35 μg/m³) and the WHO standard (≤ 10 μg/m³) (MEP 2017). As late as 2021, the annual mean concentration of PM2.5 in the Beijing-Tianjin-Hebei region and the Fenwei plains still exceeded 38 μg/m³ (CMA 2021). Environmental harm poses a major threat to sustainable development and currently attracts extensive attention from the Chinese government (Guan et al. 2014). Moreover, the multifactorial complexity of environmental problems poses a great challenge to the traditional instruments of environmental governance (Tan and Fan 2019). From a panoramic point of view, therefore, identifying and measuring the multiple synergy effects induced by environmental complexity is of great significance to environmental governance.
Communicated by Baojing Gu.
A considerable amount of existing research has attempted to analyze the main factors that influence environmental pollution from a socioeconomic perspective. Ehrlich and Holdren (1971) first attributed the anthropogenic impact (I) on the environment to population growth (P), affluence (A), and technological progress (T) in their IPAT model. Specifically, population dynamics have been at the center of arguments pertaining to environmental deterioration (Pham et al. 2020). Based on neoclassical growth theory, a rapid expansion of population, the consumption of limited natural resources by the existing population, or a combination of both could give rise to environmental pollution from the consumption side (Li et al. 2019).
Similarly, the intertwined connections between economic activities and environmental impacts are undeniably significant (Guan et al. 2014). Economic growth undoubtedly stimulates demand for natural resource extraction and consumption and leads to environmental unsustainability (Krueger and Grossman 1995). In addition, technological factors play a crucial part in maintaining or altering the balance between population, economy, and the environment. Unlike population growth and economic development, technological advances through declining natural resource consumption per unit output seem to have positive effects on environmental sustainability (Pham et al. 2020).
A further strand of literature has focused on the environmental impacts of other socioeconomic factors, including industrial structure (Zheng et al. 2020), urbanization (Zhang et al. 2018c), industrial agglomeration (Shen and Peng 2021), foreign direct investment (FDI) (Cheng et al. 2020), foreign trade (Chen et al. 2019), and energy prices. For instance, Zheng et al. (2020) argued that industrial structure determines the allocation of production factors (such as capital, labor, technology, and energy) among different sectors, and that this, consequently, significantly affects resource consumption and pollutant emissions. As an important bond between environment and economy, the industrial structure is an indispensable element of integrative development. Using the spatial Durbin panel model, Zhang et al. (2018c) showed that urbanization had a significantly positive spatial spillover effect on CO2 emissions in China. Shen and Peng (2021) conducted a spatial panel analysis of China's environmental efficiency and found an apparent U-shaped relationship between industrial agglomeration and environmental efficiency. Of further note is the work of Cheng et al. (2020), who used the generalized method of moments (GMM) to explore the influence of FDI on the environment based on panel data of 285 Chinese cities; they asserted that FDI significantly intensified China's urban PM2.5 pollution.
Aside from socioeconomic factors, governmental intervention, which plays a significant role through institutional and regulatory channels, has also attracted considerable attention in the environmental field. On the one hand, the government has attempted to facilitate corporate green innovation through carrot-and-stick policies, including assistance in the form of green R&D subsidies (Bai et al. 2019), tax preferences for low-emission and high-tech enterprises (Zheng and Shi 2017), and punitive taxes imposed upon technologies or actions that are environmentally undesirable (Hunt and Fund 2016). These stimulus policies have pushed enterprises to achieve "innovation compensation" by reducing compliance costs and promoting production efficiency. For example, Bai et al. (2019) argued that government R&D subsidies stimulated green innovations of energy-intensive firms. On the other hand, the government endeavors to control and eliminate environmental pollution by means of environmental regulations, including source-oriented treatments (Laplante and Rilstone 2004) and end-of-pipe-based treatments (Wang and Zhang 2019). For instance, Zhao et al. (2020) employed the GMM estimation method to investigate the effects of environmental regulation on greenhouse gas emissions. They found that, apart from the direct effect on CO2 emission reduction, environmental regulations also indirectly restrain CO2 emissions by adjusting the structure of energy consumption.
However, the existing literature has not reached a consensus about the influence of driving factors on environmental pollution. For instance, some researchers have revealed a positive relationship between population and energy-related pollution (Li et al. 2019; Liddle and Lung 2010), while others have reported that the impact of population on the environment follows an inverted U-shaped curve (Zhang et al. 2018b). Similarly, some studies have argued that economic growth tends to increase pollutant emissions (Dong et al. 2018), whereas substantial evidence also indicates that economic growth can arise from structural transformation and advanced production technology, and that these factors may offset the negative effects of growing economic activities on the environment (Krueger and Grossman 1995). We summarize the divergent views on several representative factors and their impacts on pollutant emissions (see Table 1).
It can be observed that the effects of factors on pollution emission are diverse and may even be mutually contradictory. One cause of these inconsistent findings may be that the symmetrical analytical instruments adopted fail to describe causal relationships that are asymmetric in practice. Because of this mismatch, attributes shown to be causally related in one situation may be unrelated or inversely related in another (Meyer et al. 1993). In detail, the estimation of marginal effects in quantitative methods may neglect some samples with weak significance while emphasizing samples with large variances. For this reason, cases that run opposite to the observed net effects often appear at the individual level; that is, not every case in the sample supports a fixed relationship between the dependent and independent factors (Woodside 2013).
In addition, few studies to date have investigated the interaction effects and combined impacts of multiple factors (especially three or more) on environmental pollution. Far from being mutually exclusive, the pollution drivers not only coexist but also prominently affect one another's operation because of causal complexity (a consequence of the mix of individual marginal effects and the conjunctural impacts induced by multiple causes). Traditional quantitative approaches that aim to examine marginal effects, such as multiple regression analysis, may to some extent interpret the multiple conjunctural causation between several variables. However, capturing interaction effects among more than three variables is arduous (Woodside 2013). Moreover, these methods are weak in their ability both to handle causal complexity at a holistic level and to uncover the individual heterogeneity observed in reality. For these reasons, this paper uses the fsQCA method to overcome these limitations; the method focuses on the relationships between combinations of factors and the outcome, and makes explicit the impact of the context and the interaction effects between factors.
Based on configurational theory, the QCA method is a set-theoretic approach that applies Boolean algebra to explore the combinations of organizational attributes leading to the outcome at issue (Ragin 2000). This method aims to combine quantitative and qualitative techniques, taking the best attributes from both (Pappas and Woodside 2021). Although configurational theory offers other comparative evaluation approaches, such as cluster analysis (Lim et al. 2006), deviation scores (Delery and Doty 1996), and the interaction effects method (Dess et al. 1997), the QCA method is superior in grasping causal complexity at a fine-grained level and enables scholars to identify equifinality, substitution, and complementary effects among variables (Greckhamer et al. 2018). It enables researchers to examine how multiple causal conditions combine to result in an outcome (conjunctural causation), to estimate whether multiple combinations are related to the same outcome (equifinality), and to identify whether both the absence and the presence of conditions may be related to the outcome (asymmetry). In brief, the QCA method enables researchers to understand complex interactions across multiple causal situations (David and Charles 2010). To date, QCA has been used extensively in fields such as business strategy (Douglas et al. 2020), information systems (Park et al. 2020), and social networks (Rutten 2020). However, it is less common in research on environmental pollution. The method is suitable for the study of environmental governance, given that environmental problems are characterized by multi-factor causal complexity.
In conclusion, there are some deficiencies in the existing understanding. First, it is difficult to reach a consensus about the influence of driving factors on environmental pollution from a single-factor perspective. Second, traditional quantitative analysis instruments, such as regression analysis, have advantages in estimating the net effect of a single factor on the outcome ceteris paribus, but struggle to elaborate multiple interaction effects (more than three factors) because of complicated statistical interpretations and multicollinearity problems. Third, the fsQCA method, which is suitable for handling the causal complexity of multiple factors, is seldom used in environmental pollution research. To fill these gaps, this paper adopts fsQCA to investigate the causal complexity of environmental pollution at the provincial level and provides environmental improvement paths for high-pollution provinces. This study contributes to extant work in the following ways: (1) by introducing the fsQCA method, this work assesses the multiple causation and asymmetric causality leading to high pollution and low pollution at the individual level, which fills the gaps of previous studies on the interaction and combination effects of pollution factors; (2) we compare and combine econometric models with the fsQCA method, and thus offer a more fine-grained and comprehensive insight into the interactive mechanism of environmental drivers; (3) we conduct spatiotemporal analyses of configuration characteristics and changes, and thus identify which combinations of causal conditions can lead to a green environment for regions with different development types, and provide the corresponding improvement paths for high-pollution provinces. In the following section, we set out the specification of the fsQCA method. Then, we report the results and discuss the implications of our findings for policymakers. The final part of the paper presents the major conclusions.

Method and data
There are three main variants of the QCA method: crisp-set QCA (csQCA), multi-value QCA (mvQCA), and fuzzy-set QCA (fsQCA). CsQCA is suited to complex sets of binary data (Ragin 1989), while mvQCA, which regards variables as multivalued rather than dichotomous, is an extension of csQCA. Both csQCA and mvQCA require the selected data to be classified according to explicit criteria; as a result, it is hard to capture complexity in cases that naturally vary by level or degree (Rihoux and Ragin 2009). FsQCA integrates fuzzy sets and fuzzy logic to overcome this limitation; it offers a more fine-grained and realistic approach in which variables can take any value from 0 to 1. Therefore, fsQCA was applied in our research because our variables had no clear classification criteria. The basic steps of the fsQCA method are shown in Fig. 1.

Selection of variables
The first step in performing fsQCA analysis is specifying the configurational model, that is, identifying which antecedent conditions should be included to account for the outcome (Douglas et al. 2020). On the basis of the IPAT model (population, economy, and technology), we extend our research framework with two further indispensable dimensions: industry and government. Within these dimensions, technological factors include technological innovation capacity and R&D subsidy; governmental factors comprise the environmental regulations on source treatment and end-of-pipe treatment; and economic, demographic, and industrial factors include economic growth, population scale, and industrial structure, respectively. Within this framework, the configuration of seven environmental driving factors that may lead to high or low environmental pollution is shown in Fig. 2. This paper used panel data from 30 provinces in China (excluding Tibet, Taiwan, Hong Kong, and Macao) between 2011 and 2020. In view of the policy effects of the Five-Year Plans, we divided the time span into two periods (2011-2015 and 2016-2020) to provide a time-variant perspective. The fsQCA method is suited to analyzing cross-sectional data, and 12-50 cases generally correspond to 4-8 causal conditions (Greckhamer et al. 2013). Hence, we adopted the average values for 2011-2015 and 2016-2020, respectively. In addition, prices were adjusted with the deflator, taking 2011 as the base period (year 2011 = 100). The detailed data sources of the variables are presented in Table 2. Specifically, industrial pollution was adopted to represent environmental pollution and includes, given available data, industrial wastewater and industrial waste gas emissions (e.g., sulfur dioxide, smoke, dust, and nitrogen oxides).
To circumvent subjective bias, the entropy method was used to calculate the comprehensive index of environmental pollution, as this method captures the overall pollution situation (Jianqin et al. 2010). The indicators representing technological innovation were R&D expenditure, the number of patents (Linares et al. 2019), the number of researchers (Wen et al. 2020), full-time equivalents, and new product sales (Bruno and Reinhilde 2006). Similarly, we used the entropy method to depict the overall picture of technological innovation ability.
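As an illustration of how an entropy-weighted composite index can be built, the following minimal sketch weights each indicator by how much it differentiates the cases. The min-max normalisation, the function name `entropy_index`, and the toy data are assumptions made for illustration only; the paper does not report these implementation details.

```python
import math


def entropy_index(rows):
    """Entropy-weight sketch.

    rows: list of cases, each a list of indicator values where a
    larger value means "more" of the concept (assumption).
    Returns (weights, index), with one index score per case.
    """
    n = len(rows)
    columns = list(zip(*rows))

    # Min-max normalise every indicator to [0, 1] (assumed convention).
    normalised = []
    for col in columns:
        low, span = min(col), max(col) - min(col)
        normalised.append([(v - low) / span if span else 0.0 for v in col])

    # An indicator whose values are spread unevenly across cases has low
    # entropy and therefore receives a high weight.
    divergence = []
    for col in normalised:
        total = sum(col)
        if total == 0:
            divergence.append(0.0)  # constant indicator carries no information
            continue
        shares = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergence.append(1.0 - entropy)

    total_divergence = sum(divergence)
    if total_divergence == 0:
        raise ValueError("all indicators are constant; weights are undefined")
    weights = [d / total_divergence for d in divergence]

    # Composite index: weighted sum of the normalised indicators per case.
    index = [sum(w * col[i] for w, col in zip(weights, normalised)) for i in range(n)]
    return weights, index


# Hypothetical toy data: three cases, two indicators.
weights, index = entropy_index([[0, 0], [0, 1], [1, 1]])
```

In this toy example the first indicator separates the cases more sharply, so it receives the larger weight. For an indicator where larger values mean less of the concept, the normalisation would need to be reversed.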

Data calibration
Following the stages outlined above, the original data were calibrated to fuzzy-set membership scores (ranging from 0 to 1) that represent the degree of membership of a case in a set (Ragin 2008). For example, 1 means a high level or the presence of a defined set, whereas 0 means a low level or its absence. Based on Fiss (2011), we used the first quartile, the third quartile, and their average as the three qualitative anchors for fully out, fully in, and the crossover point, respectively. We then applied the direct calibration method in fsQCA 3.0 to transform the data into fuzzy-set memberships. Table 3 summarizes the descriptive statistics and the calibration thresholds of the variables.
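The calibration step can be sketched as follows. This is a minimal illustration of the commonly described direct method, in which the three anchors are mapped to log-odds of -3, 0, and +3 and then passed through a logistic function. The function names, the quartile convention of Python's `statistics.quantiles`, and the sample values are assumptions, and the output may differ slightly from that of the fsQCA 3.0 software.

```python
import math
import statistics


def quartile_anchors(values):
    """Anchors as described in the text: Q1 = fully out, Q3 = fully in,
    and their average = crossover point."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q1, (q1 + q3) / 2.0, q3


def calibrate(value, full_out, crossover, full_in):
    """Direct-method sketch: scale the deviation from the crossover so that
    the anchors sit at log-odds -3 and +3, then apply the logistic function."""
    if value >= crossover:
        log_odds = 3.0 * (value - crossover) / (full_in - crossover)
    else:
        log_odds = -3.0 * (value - crossover) / (full_out - crossover)
    log_odds = max(-50.0, min(50.0, log_odds))  # guard against overflow
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical raw scores for one condition across eight cases.
raw = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0, 9.5]
full_out, crossover, full_in = quartile_anchors(raw)
memberships = [calibrate(v, full_out, crossover, full_in) for v in raw]
```

Under this convention a case exactly at the fully-in anchor receives a membership of about 0.95, and a case at the fully-out anchor about 0.05, rather than exactly 1 and 0.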

Necessity and sufficiency analyses
Next, a truth table (a data matrix) is constructed to list all logically possible configurations of variables in 2^k rows (k = number of variables), where each row represents a specific configuration. Based on its memberships in the fuzzy sets, every case in our study is associated with a row of the truth table. To identify whether a variable was necessary or sufficient for an outcome, we then analyzed whether the condition was always present (or absent) in each case when the outcome was present. It follows that if an innovative advantage is necessary for low pollution in all regions, low pollution will not occur in a region that lacks such an advantage. Likewise, if an advantage in technological innovation is a sufficient condition for low pollution, all regions with such an advantage will have low pollution.
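The assignment of cases to truth table rows can be illustrated with a short sketch. It assumes the usual convention that a case belongs to the row in which each of its memberships exceeds 0.5; the case names, condition names, and scores below are hypothetical, and a membership of exactly 0.5 is simply treated as "out" here.

```python
from itertools import product


def truth_table(cases, conditions):
    """Map each of the 2^k configurations to the cases that belong to it.

    cases: dict of case name -> dict of condition name -> membership score.
    """
    rows = {combo: [] for combo in product((0, 1), repeat=len(conditions))}
    for name, scores in cases.items():
        corner = tuple(1 if scores[c] > 0.5 else 0 for c in conditions)
        rows[corner].append(name)
    return rows


# Hypothetical memberships for three cases and three conditions.
conditions = ["technology", "population", "industry"]
cases = {
    "A": {"technology": 0.9, "population": 0.2, "industry": 0.7},
    "B": {"technology": 0.8, "population": 0.1, "industry": 0.6},
    "C": {"technology": 0.1, "population": 0.9, "industry": 0.3},
}
rows = truth_table(cases, conditions)

# Rows without any empirical case are the logical remainders.
remainders = [combo for combo, members in rows.items() if not members]
```

With three conditions there are eight rows; only two are populated in this toy example, so six rows are logical remainders.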
The interpretive tools for both the necessary and sufficient conditions are consistency and coverage. Consistency establishes the extent to which the cases that share a configuration of variables agree in their outcome and is analogous to a correlation, while coverage displays the proportion of cases representing an outcome in a certain configuration and is comparable with the coefficient of determination (e.g., R²).
The consistency and coverage of configuration i are:

$$\mathrm{Consistency}(X_i \le Y_i) = \frac{\sum_r \min(X_{i,r},\, Y_{i,r})}{\sum_r X_{i,r}} \tag{1}$$

$$\mathrm{Coverage}(X_i \le Y_i) = \frac{\sum_r \min(X_{i,r},\, Y_{i,r})}{\sum_r Y_{i,r}} \tag{2}$$

where X_{i,r} and Y_{i,r} are the memberships of region r in the set of solution i and in the outcome, respectively. To identify necessary conditions, the necessity analyses of all conditions and their negations were conducted with a consistency criterion of ≥ 0.9 (Wagemann 2012). Thereafter, sufficiency analyses were performed using the truth table algorithm to identify configurations that were consistently related to an outcome. To avoid "simultaneous subset" relations of configurations in both the outcome and its absence, the raw consistency benchmark of the sufficiency analysis must exceed 0.8, accompanied by a PRI (proportional reduction in inconsistency) score of over 0.65; the higher the value, the more robust the solution (Misangyi and Acharya 2014). In our study, we set the thresholds for both raw consistency and PRI at 0.8. We then set the frequency threshold at one strong case per configuration, which retains 100% of the studied sample in the sufficiency analysis (≥ 80% is the recommended proportion) (Greckhamer and Gur 2021).
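The fit measures used in the sufficiency analysis can be sketched in a few lines. The consistency and coverage functions follow the standard set-theoretic definitions; the PRI function uses the commonly cited formula, which is not stated in the text and is included here as an assumption. The membership scores are hypothetical.

```python
def overlap(x, y):
    """Sum of the case-wise minimum of two membership vectors."""
    return sum(min(a, b) for a, b in zip(x, y))


def consistency(x, y):
    """Degree to which membership in configuration x is a subset of outcome y."""
    return overlap(x, y) / sum(x)


def coverage(x, y):
    """Share of the outcome y that is accounted for by configuration x."""
    return overlap(x, y) / sum(y)


def pri(x, y):
    """Proportional reduction in inconsistency (commonly cited formula):
    discounts cases that are consistent with both y and its negation."""
    both = sum(min(a, b, 1.0 - b) for a, b in zip(x, y))
    return (overlap(x, y) - both) / (sum(x) - both)


# Hypothetical memberships of three regions in a configuration and an outcome.
config = [0.9, 0.8, 0.2]
outcome = [1.0, 0.6, 0.1]

passes = consistency(config, outcome) >= 0.8 and pri(config, outcome) >= 0.8
```

In this toy example the configuration clears the raw consistency threshold of 0.8 but not the PRI threshold of 0.8, so the row would not be treated as sufficient under the thresholds used in this study.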
In the next step, based on the truth table algorithm, the truth table rows are logically simplified to classify causal conditions into core and peripheral conditions. In this process, some configurations may have no empirical cases, a common problem known as "limited diversity." Counterfactual analysis based on theoretical and substantive knowledge addresses this limitation and yields parsimonious, intermediate, and complex solutions (Wagemann 2012). The core conditions appear in parsimonious solutions, while the peripheral conditions appear in the intermediate or complex solutions. In general, core conditions are more convincing than peripheral conditions; the latter are relatively complementary. As a result, we placed interpretative emphasis on parsimonious and complex solutions.

Robustness analyses
Wagemann (2012) proposes two set-theoretic-method-specific dimensions of robustness: the set-relational states of the different principles and the variations in the coefficients of fit. Where different decisions bring about different solution terms that nevertheless retain a subset relation with each other, the results can be interpreted as robust. In like manner, where different decisions result in differences in the coefficients of fit that are too small to warrant a meaningfully different substantive interpretation, the results can also be considered robust. Given this, the calibration threshold of the crossover point in this paper was changed from the average value of the first and third quartiles to the median (Greckhamer and Gur 2021). Thereafter, we recalculated the explained variable (environmental pollution) by adding the production of solid waste. Next, we raised the threshold of raw consistency from 0.8 to 0.9, and that of PRI from 0.8 to 0.9. Then, a predictive validity test was performed to assess the accuracy of the models from the first time window (2011-2015) using the data of the second time window (2016-2020). Finally, to handle the bidirectional causality problem between environmental regulation and environmental pollution, we conducted two robustness tests that considered 1-year and 2-year lagged environmental regulation terms in our fsQCA models. It should be noted that the first robustness check with the alternative calibration may transform an empirically observed row into a logical remainder (or vice versa), or a consistent row into an inconsistent one (or vice versa), leading to different conclusions (Wagemann 2012). However, only minor changes are observed in the results (including the specific number of solutions, solution consistency, solution coverage, and the characteristics of configurations); that is to say, the explanations of the main conclusions remain substantively unchanged (see Supplementary Tables S2-S3). As for the second and third robustness tests, we found that substitution of the explained variable and tighter threshold requirements produced substantially similar solutions with the expected minor changes, including subset relations of configurations, solution consistency, and solution coverage (see Supplementary Tables S4-S7). The predictive validity test shows that the configurations for 2011-2015 and 2016-2020 are highly consistent (see Table 6). Lastly, the results of the robustness tests for the bidirectional causality problem show that the consistency of most of the configurations exceeds 0.9, which indicates that our results were robust (see Supplementary Tables S8-S9). In short, a series of robustness analyses confirmed the credibility of the results presented in this paper.

The regression analysis
To demonstrate the potential contribution of fsQCA to understanding environmental pollution compared with conventional quantitative analysis, we first estimated traditional regression models that included all the same antecedent conditions and two industry-related interaction terms as independent variables. Logarithmic variables, fixed-effect models, and cluster-robust standard errors were used to address inter-group heteroscedasticity, while centering the interaction terms helps to interpret the coefficients. As Table 4 shows, model 1 controls for individual fixed effects, model 2 controls for individual and time fixed effects, and models 3 and 4 introduce different interaction terms based on model 2. Several findings are summarized below.
First, for models 3 and 4, the coefficients of the industry-related interaction terms are significant, which indicates that interaction effects between environmental drivers exist.
Second, the goodness of fit (R²) increases from model 1 to model 4 as interaction terms are added, which implies that the inclusion of interaction terms enhances the explanatory power of the models.
Third, from model 2 to model 3, the coefficients of technology and industry become larger and more significant when the interaction of technology and industry is added; from model 2 to model 4, the coefficient of the economy is still not significant, but the coefficient of technology becomes larger and more significant when the interaction of technology and economy is added, which means that although the economy does not affect the environment alone, it mediates and enhances the effects of industry. These findings suggest that interaction effects can significantly alter the effects of individual variables.
To sum up, the regression analysis confirmed the existence of interaction effects between pollution drivers. However, the construction of regression models faces a dilemma. On the one hand, adding too many interaction terms makes the results complicated and redundant. For example, the economic implications of multiple interaction terms (especially those involving more than three factors) are difficult to interpret. More seriously, even if the variables are centered, excessive interaction terms result in serious multicollinearity problems in the model, which then lead to unreliable estimates. On the other hand, omitting interaction terms is inconsistent with reality and sacrifices part of the explanatory power of the model. Therefore, we then performed fsQCA analysis, which can reveal the multiple interaction effects of environmental drivers while avoiding multicollinearity problems.
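The role of centering mentioned above can be illustrated with a small numerical sketch. It compares the correlation between a main-effect variable and its product term before and after mean-centering. The variable names and values are hypothetical and are not taken from the study's data; the sketch only shows the general mechanism, not the paper's regression models.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def center(values):
    """Subtract the mean so that the variable has mean zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def interaction(a, b):
    """Element-wise product used as the interaction regressor."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical (log) scores of two drivers for five regions.
technology = [10.0, 11.0, 12.0, 13.0, 14.0]
industry = [5.0, 7.0, 6.0, 9.0, 8.0]

# Correlation between the main effect and the raw product term.
raw_corr = pearson(technology, interaction(technology, industry))

# The same correlation after centering both variables first.
tech_c, ind_c = center(technology), center(industry)
centered_corr = pearson(tech_c, interaction(tech_c, ind_c))
```

Centering removes the part of the collinearity that stems only from the variables' non-zero means. It does not help when many product terms share the same underlying variables, which is the situation the dilemma above describes.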
We first identified the necessary conditions for high pollution and low pollution. According to the results (see Supplementary Table S1), no variable strictly meets the criterion for a necessary condition. This finding echoes the theory of complementarities, which holds that no organizational element is a best practice alone; elements have a positive effect only when they occur in conjunction with other elements.
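A necessity check of this kind can be sketched as follows: each condition and its negation is tested against the outcome using the standard necessity consistency measure and the 0.9 criterion stated earlier. The condition names and membership scores are hypothetical, so the flagged condition in the toy data does not reflect the study's finding that no condition is necessary.

```python
def necessity_consistency(x, y):
    """Share of the outcome's membership that lies inside condition x."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)


def necessary_conditions(conditions, outcome, threshold=0.9):
    """Return every condition (or negation, prefixed with '~') whose
    necessity consistency reaches the threshold."""
    flagged = {}
    for name, scores in conditions.items():
        negated = [1.0 - v for v in scores]
        for label, membership in ((name, scores), ("~" + name, negated)):
            score = necessity_consistency(membership, outcome)
            if score >= threshold:
                flagged[label] = score
    return flagged


# Hypothetical memberships of three regions.
outcome = [0.9, 0.8, 0.1]
conditions = {
    "industry": [1.0, 0.9, 0.3],
    "population": [0.2, 0.9, 0.6],
}
flagged = necessary_conditions(conditions, outcome)
```

An empty result from `necessary_conditions` would correspond to the situation reported here, where no single condition or its negation reaches the threshold.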

Sufficiency analyses between 2011 and 2015
During the 12th Five-Year Plan period (2011-2015), there were 10 configurations that led to either high pollution or low pollution in regions, as Table 5 presents. These configurations illustrated that there were varied strategic paths that led to equifinal outcomes, and this, in turn, verifies the existence of multiple causal relationships in environmental issues. Furthermore, these 10 pathways can be grouped into five distinct pairs of neutral permutations (C1-C5). Pathways in each pair represented the same core conditions and only varied in their complementary conditions. The solution coverages of high pollution and low pollution were 0.587 and 0.755, respectively, which exhibits a strong explanatory power, while all the configurations maintained very high consistencies (0.977 in high pollution, and 0.962 in low pollution), suggesting that these configurations are persuasive for the outcomes.

Configurations for high pollution
There were five configurations (C1a-C3) that illustrated the possible causal relationships that led to high pollution between 2011 and 2015. It is worth noting that the first four configurations (C1a-C2) contained the same core condition of possessing a backward industrial structure; this illustrates that structural imbalance was the leading factor causing high pollution in the involved regions. Therefore, we label these four configurations as structural imbalance types.
Specifically, configurations 1a and 1b (C1a and C1b) featured technical lag, small R&D subsidies, large end-of-pipe treatment costs, small populations, and backward industrial structures. These features signified that even though local government spent a large amount of money on end-of-pipe treatment, backward technological development and industrial structure were still harmful to the environment. In addition, the features of large source treatment costs and backward industrial structure in C2 further revealed that the source treatment measure was ineffective for mitigating environmental burden when the industrial structure was backward. In other words, no matter how much the government had spent on environmental regulation, a backward industrial structure hindered environmental improvements to a greater extent.
In C3, the core conditions included both intensive end-of-pipe treatment costs and large populations, with the peripheral conditions including advanced innovation ability, strict environmental regulations, and highly developed economies. The representative regions of C3 are Guangdong and Jiangsu, both of which are well-developed and densely populated. Specifically, Guangdong and Jiangsu have topped China's provinces for the past decades in terms of their recorded levels of GDP. These regions have coupled their possession of massive natural resources with rapid economic and social development. Apart from this, the growth pole effect that has arisen as a consequence of economic agglomeration has attracted the inward migration of people from surrounding regions. Rapid population growth and a large population will, therefore, not only lead to population agglomeration but also accelerate the consumption of limited resources and bring about enormous population pressures. In turn, these effects stimulate further economic and social activities and may give rise to either predatory or disruptive use of resources. Increasing populations also give rise to high levels of consumption-based pollution (Ehrlich and Holdren 1971). This finding echoes Li et al. (2019) and Wang and Zhou (2021). Given these facets, we labeled C3 as the extensive development type.

Configurations for low pollution
According to Table 5, there were five alternative configurations (C4a-C5b) that led to low pollution. C4a-C4c shared the same core conditions: advanced industrial structure and low inputs for source treatment. Specifically, C4a featured smaller populations, advanced innovation abilities, substantial R&D support, and exhibited a comparatively ideal path for pollution governance in which local government paid more attention to structural optimization than the cost of pollution treatments. Consequently, this configuration was labeled as green development type. C4a included two municipalities, Beijing and Shanghai; these two cities have realized win-win situations that have balanced economic development with environmental protection. C4b and C4c were inferior in terms of technological innovations, governmental regulations, and economic development, but were superior with regard to their industrial structures. The positive effects of industrial structure on the reduction of pollution are consistent with Zheng et al. (2020) and Yu et al. (2018). The typical cases in this configuration included Yunnan and Heilongjiang, where tertiary industries make up more than half of the region's GDP. Yunnan, for example, records that its tourist industry accounted for 51.5% of its GDP in 2020. To respond to the national strategy "clear water and green mountains are as valuable as mountains of gold and silver," the local government attempted, in addition to continuing to promote the transformation and upgrading of its tourism and cultural industry, to adopt a series of ecological measures, including developing green finance, implementing coal substitution, and increasing forest carbon sinks; all of which are beneficial to maintaining a low-pollution status.
C5a and C5b both possessed the core conditions of low inputs for end-of-pipe treatment, less developed economies, and small populations, with the peripheral condition of low source treatment costs. These configurations indicated that less developed areas could ensure low pollution while spending comparatively less on environmental governance. One explanation is that a small population size alleviates the tension between humans and nature, i.e., consumption-based pollution is reduced. By and large, the four low-pollution configurations C4b-C5b are inferior to C4a (green development type) with respect to technical innovation, economic development, and industrial structure. Therefore, we labeled these pathways as the green ecology type.

Spatiotemporal variations of configurations
To elaborate on the evolutionary patterns of configurations, we further studied the sufficient conditions between 2016 and 2020. As evidenced in Table 6, we found six configurations that led to high pollution and five configurations that resulted in low pollution. The solution coverage and the solution consistency of these configurations were high, indicating strong explanatory power for the outcomes. The diverse pathways indicated that multiple solutions led to the same outcome (equifinality). In addition, several notable findings were obtained by comparing the configurations in the two time spans from temporal and spatial perspectives.
Path dependencies existed in regional development patterns. In the period between 2016 and 2020, the levels of the conditions (e.g., pollution levels, environmental regulation inputs, and others) in most configurations paralleled those in the former time span (2011-2015), indicating strong path dependencies in most provinces, where few changes had been made to development patterns. For instance, for high-polluted regions, the states of seven conditions in C7b, C7c, and C8 were the same as those in C2, C1c, and C3, respectively. Similarly, for low-polluted regions, C9, C10a, C10c, and C10d were similar to C4a, C4b, C4c, and C5b, respectively. Therefore, we label these paths as being the same as for the previous period (see Table 6). Compared with C1a and C1b, small innovation subsidy changes from a peripheral condition to a core condition in C6a and C6b, and backward industrial structure also becomes a core condition. Combining these core conditions with the condition of laggard technology, we label C6a and C6b as the technical laggard type.
To further explore the extent of path dependency, a predictive validity test (using the second data set from 2016 to 2020 to compute the fuzzy scores for each of the ten configurations in Table 5) was performed, as presented in Table 7. Taking C1a as an example, the second data set is completely consistent (100%) with the argument that C1a is a subset of high pollution, and C1a accounts for 7.1% of the total memberships in high pollution. The consistency of every configuration exceeds 0.9, which indicates that the configurations between 2011-2015 and 2016-2020 are highly consistent. Furthermore, the high raw coverage of each configuration, especially for C3 (40.1%), C4b (37.9%), and C5a (44.9%), confirms the strong path dependencies in regional development patterns.
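The predictive validity check described above rests on three standard fsQCA quantities: a case's membership in a configuration (the fuzzy intersection of its conditions, with absent conditions negated as 1 - x), the consistency of the configuration as a subset of the outcome, and its raw coverage of the outcome. The following is a minimal sketch of those calculations, not the authors' procedure or software. The condition names and all membership scores are hypothetical illustrations and are not taken from Table 5 or Table 7.

```python
from typing import Dict, List

def config_membership(case: Dict[str, float], config: Dict[str, bool]) -> float:
    """Fuzzy AND over the conditions of a configuration.

    `config` maps a condition name to True (present) or False (absent).
    Absent conditions use the fuzzy negation 1 - x.
    """
    return min(case[name] if present else 1.0 - case[name]
               for name, present in config.items())

def consistency(x: List[float], y: List[float]) -> float:
    """Degree to which X is a subset of Y: sum(min(x, y)) / sum(x)."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)

def raw_coverage(x: List[float], y: List[float]) -> float:
    """Share of the outcome Y accounted for by X: sum(min(x, y)) / sum(y)."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)

# Hypothetical calibrated scores for a hold-out period (NOT the study's data).
holdout_cases = [
    {"large_population": 0.9, "end_of_pipe_cost": 0.8, "high_pollution": 0.9},
    {"large_population": 0.7, "end_of_pipe_cost": 0.6, "high_pollution": 0.8},
    {"large_population": 0.2, "end_of_pipe_cost": 0.3, "high_pollution": 0.1},
    {"large_population": 0.4, "end_of_pipe_cost": 0.9, "high_pollution": 0.6},
]

# Hypothetical configuration derived from an earlier period.
config = {"large_population": True, "end_of_pipe_cost": True}

x = [config_membership(c, config) for c in holdout_cases]
y = [c["high_pollution"] for c in holdout_cases]

print(f"consistency  = {consistency(x, y):.3f}")
print(f"raw coverage = {raw_coverage(x, y):.3f}")
```

A configuration found in the first period passes this kind of check when its consistency on the second-period scores stays above the chosen threshold (0.9 in the text above).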
Beyond that, a crowding-out effect was found in Shaanxi province. Throughout the investigated period, Shaanxi spent heavily on R&D subsidies, but its technological innovation capacity changed from high (C5b in 2011-2015) to low (C10d in 2016-2020). In other words, governmental R&D subsidies failed to achieve the desired effect and instead undermined regional technological innovation. This suggests that the crowding-out effect was dominant in local environmental governance. Specifically, corporate R&D strategies stressed short-term benefits, while the subsidies offered by the local governments sought to achieve long-term technical progress. Such a conflict in the direction of R&D initiatives weakened the driving forces of R&D investments. It should also be noted that China currently faces challenges in supervising governmental R&D investments and that this may result in the misuse of R&D subsidies. This regulatory defect can lead to inefficient capital utilization.
Note: Black circles indicate the high level (or presence) of a condition; circles with crosses indicate the low level (or absence) of a condition; large circles indicate core conditions; small ones, peripheral conditions; and blank spaces indicate "don't care".
To intuitively investigate the spatial distribution of configuration types and their transformations over time, this study applied the regional pollution labeling to a map of China (see Fig. 3). During the two periods, most regions of the same type were geographically adjacent. As Fig. 3 illustrates, most of the eastern regions belonged to the high-polluted group (e.g., extensive development type), while western areas mainly belonged to the low-pollution cluster (e.g., green ecology type).
It may also be noted that Shanxi and Inner Mongolia changed from the structural imbalance type to the technical laggard type. This transformation may be a consequence of continuous improvements to the quality and efficiency of supply-side structural reforms, which weakened the constraints induced by the imbalanced structure and made technological factors the main drivers of environmental pollution.
During the periods investigated, large inputs for end-of-pipe treatment and backward industrial structures were common characteristics shared by most high-polluted regions. In contrast, low spending on source treatment and end-of-pipe treatment, smaller populations, and less developed economies were common features of most low-pollution regions. To a great extent, this can be attributed to their unique geographical advantages (e.g., climate, terrain, and vegetation resources) for the ecological environment and their originally low pollution levels (e.g., Yunnan). Furthermore, local governments are confronted with multiple tasks from the central government, including economic growth and environmental protection. Among these tasks, environmental targets are obligatory in China's performance evaluation system, but there are no incentives for local governments to surpass these targets (Zhang 2020). In other words, the local governments in low-pollution areas needed only small environmental treatment inputs to pass the cut-off score of environmental requirements set by the central government.

Implications from fsQCA
An extensive number of studies have explored the net effects of single factors on environmental issues from a symmetric perspective. These works have tended to construct a linear or curvilinear relationship among theoretical elements of interest and have failed to explore practical asymmetric causal relationships. However, the fsQCA method enables heterogeneity to be revealed, something that traditional symmetric analytical methods struggle to manage. The approach also results in a more fine-grained taxonomy of development types. This study explores which conditions of configurations are relevant for pollution and how these conditions work together. The fsQCA method used in this paper provided new insights into these situations by developing a multifaceted comprehension of the conditions' dynamics. The results exposed subtle details of heterogeneity among the regions of China and identified sub-groups for which various pathways lead to the same outcome. In addition, the configurations of high pollution and low pollution disclosed multiple causal relationships among variables, which demonstrates that the factors leading to environmental pollution at the provincial level are complex and asymmetrical.
Fig. 3 The maps of configural types in the two time periods. Note: the abbreviations of regions are detailed in Appendix Table A
Therefore, the causal complexity of environmental pollution in our findings provides direct implications for local governments in addressing environmental issues. For example, while it is widely accepted that technological progress aids environmental protection (Pham et al. 2020), we found that, for some high-polluted regions (e.g., Henan, Hebei, and Anhui) labeled as the structural imbalance type (C1c and C2), backward industrial structures still caused high pollution despite high-level technological innovation capabilities. Hence, local governments should not only enhance technological innovation and R&D subsidy but also optimize their industrial structures simultaneously. Apart from that, comparing C4b and C5b (the green ecology type) indicates that a combination of advanced innovation ability and large R&D subsidy can serve as an alternative to advanced industrial structure in terms of pollution reduction. Accordingly, the low-polluted regions should enhance innovation ability and R&D subsidy or optimize their industrial structures. Comparing C4a with the high-polluted configurations that have a large population (C1c and C2), a developed economy (C1b), or both (C3), we found that technical innovation superiority combined with advanced industrial structure is more efficient in reducing pollution than the combination of advanced innovation capability with large environmental governance input. Therefore, the high-polluted regions should pay more attention to optimizing the industrial structure while enhancing innovation capabilities.
This study also confirmed that the causal asymmetry derived from causal complexity may resolve a long-standing dispute about whether the relationship between environmental pollution and its driving factors is positive or negative. For example, the effect of economic growth on environmental quality is negative in C3 (the extensive development type) but positive in C4a (the green development type). This difference arises from the conjunction of economic growth with other conditions (e.g., environmental regulations, population size, and industrial structure), rather than from economic growth alone (as advanced in previous studies). Given this, our findings can inspire future research to consider the causal complexity of environmental issues.
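The point about conjunctions can be made concrete with a toy calculation. In the sketch below, a developed economy on its own is an equally poor predictor of high and of low pollution, whereas its combinations with other conditions are highly consistent with opposite outcomes. All membership scores and condition names are hypothetical and chosen only to illustrate the logic; they are not the calibrated data of this study.

```python
from typing import List

def consistency(x: List[float], y: List[float]) -> float:
    """Subset consistency of X in Y: sum(min(x, y)) / sum(x)."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)

def fuzzy_and(*sets: List[float]) -> List[float]:
    """Case-wise minimum across fuzzy sets (logical AND)."""
    return [min(values) for values in zip(*sets)]

def fuzzy_not(x: List[float]) -> List[float]:
    """Fuzzy negation, 1 - x."""
    return [1.0 - v for v in x]

# Five hypothetical regions (illustrative scores only).
developed_economy  = [0.9, 0.8, 0.9, 0.8, 0.2]
large_population   = [0.9, 0.8, 0.2, 0.3, 0.2]
advanced_structure = [0.3, 0.4, 0.9, 0.8, 0.3]
high_pollution     = [0.9, 0.8, 0.1, 0.2, 0.2]
low_pollution      = fuzzy_not(high_pollution)

# A developed economy alone does not settle the direction of the effect.
econ_high = consistency(developed_economy, high_pollution)
econ_low = consistency(developed_economy, low_pollution)

# In conjunction with other conditions, the same factor points both ways.
extensive = fuzzy_and(developed_economy, large_population)
green = fuzzy_and(developed_economy, fuzzy_not(large_population), advanced_structure)
extensive_high = consistency(extensive, high_pollution)
green_low = consistency(green, low_pollution)

print(f"economy -> high pollution:            {econ_high:.2f}")
print(f"economy -> low pollution:             {econ_low:.2f}")
print(f"economy*population -> high pollution: {extensive_high:.2f}")
print(f"economy*~population*structure -> low: {green_low:.2f}")
```

In this toy data the single-factor consistencies fall well below common sufficiency thresholds, while both conjunctions exceed 0.9, which mirrors the contrast between C3 and C4a described above.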

Path dependence
According to our results, regional environmental governance presented strong path dependence during the investigated period. As Fiss (2011) argues, configurations and types appear to impact future configural states by influencing the tracks of subsequent development models, thereby making certain tracks more likely while decreasing the probability of others. In the present study, most regions showed strong path dependencies in their development models, especially high-polluted regions. This raises a significant question: why do these regions become locked into development models that lack dynamism, while other (rare) regions evade the lock-in effect and renew themselves via successive new pathways? The lock-in effect works on a self-reinforcing logic that prefers continuity and replication (David 1985). Specifically, the first-mover advantage of an established development pattern decreases operating costs through the scale effect, and the popularity of such a pattern further strengthens the learning effect. The synergy of the two then produces a self-reinforcing cycle that keeps the original pattern in a state of persistence or lock-in unless an exogenous shock (e.g., policy reform) intervenes. It follows that choosing a reasonable initial pattern and correcting errors in time are crucial to environmental governance. The local government needs to consider the long-term impacts of policy implementation instead of just the short-term effects.
In addition, the government should take corrective actions as soon as possible once deviation between the practical effects of reform and its goal is detected.

Crowding-out effect
In our findings, the crowding-out effect that occurred in some regions, such as Shaanxi province, showed that government subsidies for technological innovation reduced or eliminated the improvement of technological performance. This phenomenon may have occurred because government R&D contracts were designed to bring social benefits or long-run efficacy, whereas grantees such as private firms tend to pursue economic interests or improvements to short-run performance. This divergence of intentions between the two sides might crowd out private corporate R&D investment and the originally planned research agenda. Another possible explanation, from a firm's point of view, is that R&D subsidies relaxed possible liquidity constraints: because applying for government subsidies costs less than raising funds in the capital market, enterprises treated the innovation subsidy as an alternative source of financing instead of an actual R&D incentive. It follows that the government should lay greater stress on improving the regulatory regime for firms' capital flows.

Policy implications for high-polluted regions
In our results, C9 (green development type) represented the best practice for achieving the dual targets of environmental protection and economic development. Its core conditions included an advanced industrial structure, high technological innovation capacity, and a relatively small population. High-polluted regions, which were divided into the structural imbalance, extensive development, and technical laggard types, can learn from this practice. Figure 4 shows the advised paths to achieve a green environment for these high-polluted types.
A backward industrial structure was a common and significant condition that led to high pollution in these polluted types. To address this, local governments should first optimize industrial structure. For instance, Beijing and Shanghai (green development type), which were low-pollution regions, began their industrial transitions alongside declining dependence on pollution-extensive sectors. Their low-emission industries (e.g., telecommunication equipment and transport equipment manufacturing) dominate their secondary industry sectors, and their tertiary industries have become pillars of their sustainable economies. In direct contrast, the pillar industries of Shanxi (structural imbalance type in 2011-2015), which is a high-polluted region, are predominantly concentrated in high-emission fields, such as the smelting and pressing of ferrous metals, petroleum processing, and coke refining. Within such provinces, well-designed and well-enforced industrial policies should be established to accelerate the independent elimination of pollution-intensive enterprises and the emergence of eco-friendly industries, a process that will, in turn, minimize the pollutant byproducts of economic development. In addition, inter-provincial cooperation should be encouraged. For high-polluted regions of the technical laggard type (including Shanxi and Inner Mongolia in 2016-2020), poor technological innovation abilities, small R&D subsidies, and high end-of-pipe treatment inputs indicated that their local governments tended to address current pollution rather than improve technological innovation to ensure longer-term benefits.
To address this deficiency and the problems associated with short-term localized vanity projects (and the problems that they may create for successor administrations), the central government should establish a retroactive investigation mechanism for environmental governance decisions. Former officials who leave a legacy of environmental destruction should pay a penalty for their actions even if they have left office or retired. In addition, and especially when making major administrative decisions for environmental governance, local governments should be required to be open and transparent, and to implement sound supervision and assessment mechanisms.
Fig. 4 The advised paths to a green environment for three high-polluted types
Local governments in these regions should also increase R&D subsidies and tax preferences for high-tech and low-emission enterprises. Concurrently, to improve the utilization efficiency of R&D subsidies, the government should strengthen the supervision system to deter adverse selection problems. For example, enterprises, universities, and research institutes should be encouraged to set up cooperative innovation organizations such as national engineering laboratories or industrial R&D centers. The government should also undertake external supervision of innovative organizations by designing a post-evaluation system for R&D achievements.
Regarding the high-polluted regions of the extensive development type, their large populations and economic development were the key factors that led to huge levels of consumption-related pollution, while industrial structure had no significant effect. High-quality development of population and economy is therefore crucial. Considering that over-control of population growth will impede the economic development brought by the demographic dividend, improving population quality and raising people's environmental awareness are practical ways forward. The government needs to guide people to adopt green consumption habits through public education and publicity measures. In addition, local governments should speed up efforts to upgrade their industrial structure, as discussed above.

Conclusion
This study introduced a fine-grained analysis tool, the fsQCA method, to explore the interactive mechanism and causal complexity of environmental drivers by using the panel data of 30 Chinese provinces in the period between 2011 and 2020. The findings provide targeted policy recommendations for local governments at the provincial level. The main conclusions are as follows: 1) The regression analyses and necessity analyses confirm that no single factor is a necessary condition for high or low pollution, and that the interaction effects of pollution drivers are significant; 2) There are several different configurations of environmental drivers which lead to high pollution or low pollution in regions. This confirms the multiple causality, asymmetry, and equifinality of environmental issues; 3) The combination effect of advanced industrial structure, small population size, and technological advance is significant in achieving a state of green environment compared to environmental regulation factors; 4) By testing predictive validity and analyzing the spatiotemporal variations of configurations in the two periods of 2011-2015 and 2016-2020, we found that most regions showed strong path dependencies and spatial agglomeration in their development patterns.
Although this study filled some of the gaps within the existing environmental literature, its scope is limited by a lack of available data for some variables at a more fine-grained scale, such as the city or prefecture level. As the heterogeneity between cities is also prominent, new conclusions may be found in future research at the city level. In addition, although this study applied the fsQCA method to panel data, to the best of our knowledge panel data are rarely used with fsQCA because the methodology for doing so is still immature. Our future work will focus on these aspects.