Evolutionary game analysis of coal enterprise resource integration under government regulation

Resource integration of coal enterprises helps reduce pollution and carbon emissions, thereby alleviating environmental problems such as global warming. Government regulation strongly influences enterprise behavior, so it is necessary to analyze the strategies of the government and coal enterprises in resource integration. From the perspective of government regulation, this paper discusses how to guide and constrain coal enterprises to undertake resource integration, and whether the government supervises this behavior. First, through empirical research, government regulation of coal enterprises is given a practical policy meaning. Second, using evolutionary game theory and simulation, we explore the behavioral interaction mechanism between dominant and inferior coal enterprises and the mechanism between the government and coal enterprises, and analyze the impact of key factors on the dynamic evolution process. Finally, a sensitivity analysis of the selected parameters is discussed in detail, which provides useful decision-making suggestions for the government and enterprises. In addition, this paper analyzes the impact of different government policies on coal enterprises' green innovation strategies. The results demonstrate that (1) when the power gap between enterprises is large, the probability of dominant enterprises choosing resource integration converges to 1, while that of inferior enterprises converges to 0, so government regulation is ineffective for inferior enterprises; (2) combining government regulations can help improve the efficiency of coal enterprises’ strategic choices: as the intensity of government rewards and punishments increases, the probability of enterprise resource integration evolves from 0 to 1; (3) excessive government regulation causes the choices of the government and coal enterprises to oscillate, because both probabilities remain between 0 and 1, so excessive regulation cannot effectively achieve resource integration and government supervision; and (4) the government subsidy strategy is less effective than the government’s pollution penalty strategy in promoting the green innovation of enterprises. Our research shows that the government should choose different policy combinations and intensities to regulate resource integration according to the market power of coal enterprises, which provides theoretical reference and practical guidance for the government to regulate corporate resource integration behavior.


Introduction
In recent years, human beings have been consuming the earth's resources without restraint in order to keep pace with rapid economic development, causing increasingly prominent climate problems that have drawn global attention to environmental issues (Nakajima et al. 2020). In the past few decades, the global climate has warmed rapidly (Zhao et al. 2020), which has further affected ocean circulation (Wu 2020), biological invasion (Kim et al. 2020), crop phenology (Xiao et al. 2019), and the melting of glaciers (Nojarov et al. 2019). Fortunately, at the 75th UN General Assembly, China, a high-carbon emission country, announced that it would achieve the goal of "carbon neutrality" by 2060. At the same time, the US, another high-carbon emission country, plans to rejoin the Paris Agreement, and its stance on reducing carbon emissions will have a profound impact on countries around the world, both numerically and psychologically (Davis 2017). Coal is one of the main sources of global carbon emissions, and coal enterprises have gradually attracted attention for their high pollution and high energy consumption. The world's coal resources are mainly concentrated in the Asia-Pacific region. As of 2019, China's total amount of coal is as high as 3.8 billion tons, ranking first in the world. As shown in Fig. 1, which covers China's energy extraction from 2008 to 2018, China's coal extraction far exceeds that of oil and natural gas, which confirms that China's energy structure is dominated by coal (Gu et al. 2020). However, as a traditional energy resource, coal faces a series of deepening contradictions, such as industry risk, capital intensiveness, and excessive pollution. How to reduce the pollution of coal enterprises has become an urgent problem.
Resource integration has become a key factor in competition between enterprises and countries (Barbasanchez and Atienzasahuquillo 2010). Because coal was the main energy source of the industrial revolution, coal resource integration has gradually become a focus of attention (Zhang et al. 2011). To avoid the negative effects of traditional resource-based energy, many countries have adopted mergers and reorganizations to integrate and optimize resources, which not only enlarges the scale of enterprises but also improves production efficiency (Cannon et al. 2020; Christofi et al. 2019). However, the uneven distribution of coal resources in China (Teng et al. 2016) means that coal enterprises are restricted by regional differences in the process of resource integration. At the same time, there are few modes of resource integration, the main one being large enterprises absorbing small ones. The effect of resource integration is often poor: a case in point is a company with unclear objectives that blindly expands its size and, contrary to its intention, suffers a growing decline in profit. Therefore, relying solely on market mechanisms often cannot achieve the maximum effect of resource integration, and government regulation is needed to exercise macro-control over the merger of coal enterprises.
In terms of government regulation, industrial policy has become an important external driving force for the resource integration of coal enterprises. When sustainable economic development is constrained by the environment, industrial structure adjustment is the critical path out of the dilemma (Zhen and Zhu 2016). Government regulations are often used to support the formal development of an industry (Veiga and Marshall 2018; Park and Kim 2019; Dutu 2016). The development of China's coal industry urgently needs government guidance and restriction (Tao et al. 2019). Government regulation mainly takes two forms: government penalties and tax incentives. Throughout the development history of Chinese coal companies, mergers and acquisitions have been an indispensable part, and most of them have been jointly operated by the government and enterprises. However, existing studies have paid little attention to the interaction methods and influence mechanisms between government and enterprise. Under bounded rationality, both enterprise and government pursue the maximization of their own interests (Sun et al. 2020), and the choice of individual strategies may vary with the environment. In China, the coal industry is controlled by the government. However, as an economic organization, the enterprise aims at maximizing profits, whereas the government pursues the improvement of the environment and the progress of green technology (Cao and Zhang 2017; Zhang et al. 2019). On the one hand, coal enterprises pay a higher cost for resource integration activities, so, based on the principle of profit maximization, they are unwilling to integrate resources. On the other hand, the resource integration of coal enterprises can help improve the environment and promote green innovation, which is in the interest of the government.
Since the interests of enterprises and the government are inconsistent, enterprises may ignore environmental issues in order to obtain economic benefits. Therefore, government policies are needed to regulate corporate behavior. The Chinese government uses a variety of environmental regulations to promote energy conservation and reduce pollution. For example, the government reduces the cost of environmental protection activities and increases the willingness of enterprises to integrate resources through incentive-based environmental regulation policies (Yi et al. 2020a). In addition, since 2015, the Chinese government has implemented a new environmental protection law that punishes corporate pollution in order to reduce it. Evolutionary games not only study the interaction between actors (Collins and Kumral 2020), but also provide a powerful theoretical framework for how to promote and maintain the cooperative relationship between actors. Therefore, it is necessary to use evolutionary game theory to study the strategic choices of government supervision and enterprise resource integration. From the perspective of government regulation, this paper empirically examines the impact of resource integration on coal enterprise innovation and pollution, and gives government regulations practical policy implications for coal enterprises. The evolutionary game is used to construct a game model between enterprises and a game model between government and enterprises, and the results of the models are simulated. This paper answers the following key questions:
(1) From the perspective of government regulation, how do enterprises choose resource integration strategies to divide the interests of the existing market?
(2) From the perspective of government regulation, how do enterprises and governments make choices to achieve each other's optimal combination of strategies?
(3) Does the change of government regulations affect the strategic choices between enterprises, and between enterprises and governments? If so, how?
This paper makes several contributions. First, it expands the research methods used to study the strategic choices of the participants in enterprise resource integration. We not only carry out empirical research but also construct a dynamic evolutionary game model, thus proposing innovative viewpoints based on bounded rationality. Second, this paper contributes to the broader government regulation literature. We give government regulations specific policy meanings and divide them into incentive and punishment policies that are considered jointly. Third, our research explores the boundary of coal enterprise resource integration and government supervision strategy selection, with unique policy implications. On the one hand, it explains the inefficiency of government regulations; on the other hand, it provides feasible governance ideas for realizing coal enterprise resource integration.
The rest of the paper is organized as follows: the "Literature review" section reviews previous literature. The "Research assumptions and parameters" section gives practical meaning to government regulation through empirical research and builds the basic framework of the game models. The "A game model of resource integration behavior between enterprises" and the "A game model of resource integration behavior between government and enterprise" sections use dynamic evolution and simulation to analyze the game between enterprises and the game between government and enterprises from the perspective of government regulation. The "System simulation analysis in different situations" section conducts a sensitivity analysis on the selected key parameters. The "Discussion" section presents the discussion. Conclusions and policy implications are provided in the "Conclusions and policy implications" section.

Literature review
As global warming and environmental problems intensify, energy conservation and emission reduction become more and more urgent. Coal is an important energy source for human development, but its mining and use have a negative impact on the environment. To find a solution, more and more scholars have studied the pollution caused by coal resources from various aspects. With the continuous improvement of mechanization, the efficiency of coal mining has gradually improved, but a more serious problem of coal dust pollution has also arisen in the mining process. By studying the law of dust migration under dynamic coal cutting, methods to reduce coal dust pollution can be found (Jiang et al. 2021). Water pollution is also a serious problem in coal mining. In coal port areas, an ecological intelligent control system integrating environmental protection functions can alleviate water pollution (Zhao et al. 2020). Power generation is one of the uses of coal, and coal-fired power generation produces a large amount of greenhouse gases. The solar-aided coal-fired power generation (SACPG) system can effectively increase the power generation rate and reduce carbon dioxide emissions (Zhao and Yang 2020). In addition to power generation, the burning of coal in rural households also causes pollution. Burning coal generates a large amount of sulfur dioxide, so the quality of coal products needs to be improved (Zhang et al. 2020b). To sum up, most scholars study the pollution caused by coal resources from a micro-level technical perspective. The resource integration of coal enterprises is conducive to technology improvement, energy saving, and pollution reduction (Zhang et al. 2011). However, there are currently few studies on the resource integration of coal enterprises.
Resource integration is the only way for an enterprise to expand, and its timing is crucial. Coase studied the nature of the enterprise and the reasons for its existence and development, and discussed in detail whether the boundaries of the enterprise after integration affect the allocation of resources and the value of the enterprise (Coase 1937). The existing research on resource integration of coal enterprises mainly focuses on the exploration of motivations and integration paths. The research methods are mainly empirical research (Sun et al. 2016), theoretical research (Chen et al. 2020; Teerikangas and Colman 2020), and case research (Bi et al. 2020; Notteboom et al. 2020; Zhuo et al. 2021). However, mathematical models are rarely used for analysis. Therefore, this paper uses a combination of empirical research and mathematical deduction to analyze the integration of resources. In terms of research content, some scholars have also studied the integration of coal enterprise resources across regions, industries, and ownership types (Sun et al. 2020). He et al. (2020) found that mergers and acquisitions of coal enterprises help resolve excess capacity and improve enterprise efficiency, and that mergers and acquisitions between coal enterprises in the expansion stage bring the greatest efficiency gains. However, the property rights structure of China's coal enterprises has certain particularities, and the market mechanism of resource integration cannot play its full role, so it needs to rely on the power of government regulation (Sun and Zhang 2019).
Government regulations can make up for market failures and therefore play a vital role in the resource integration of coal enterprises. Zhang et al. (2020a) contend that government regulation can encourage enterprises to improve technology and energy-saving effects. The existing research on government regulation mainly focuses on the interaction between government and enterprises. Zhang et al. (2017) found that under different market conditions, the supervision of the central government affects the game ability of coal enterprises and local governments. When the government supervises coal enterprises, tax incentives and financial subsidies can be used (Yang et al. 2021). Becker and Fuest (2008) found that differential tax incentives are conducive to enterprise mergers and acquisitions. When enterprises face government regulations, they can adopt three active strategies: expansion of enterprise scale, technological innovation, and environmental protection (Qi et al. 2019; Feng et al. 2019). Ouyang et al. (2020) found that as government regulations deepen, enterprises in the industry gradually choose technological innovation to reduce pollution. In summary, in the interaction between the government and enterprises, both sides have a variety of strategic choices. But where is the boundary between government and enterprise? Over time, will the practices of the government and enterprises change? Previous studies have not given these questions due attention.
Because of asymmetric information, both the government and the enterprise follow the principle of bounded rationality in the decision-making process. The strategy selection of the government and enterprises is a dynamic process of learning and adjustment, which is consistent with the characteristics of evolutionary game theory (Fang et al. 2019; Xu et al. 2019). Therefore, this paper considers the heterogeneity of enterprises and establishes a dynamic evolutionary game model to analyze the decision-making process of enterprise resource integration and government supervision. This paper explores the impact of dynamic interactions between enterprises, and between enterprises and governments, on enterprise resource integration behavior, and studies how to optimize parameter design, how to find the boundaries of government regulation, and how to formulate optimized policy combinations. This provides useful guidance for the decision-making of enterprises and governments.

Research assumptions and parameters
Government regulations are implemented in various forms, of which punishment and incentives are the main ones. Owing to their industry characteristics, coal enterprises often respond differently to policies when selecting resource integration strategies. Choosing a resource integration strategy often leads to innovation and upgrading of the enterprise, while choosing not to implement a resource integration strategy and maintaining the status quo of enterprise operations often aggravates pollution. To verify this conclusion, this paper selects 20 enterprises listed on China's A-share market in coal mining, coal preparation, and coal washing from 2010 to 2018, giving a total of 180 observations. Integration is taken as the independent variable, and Patent and Pollution as the dependent variables.
The specific empirical test steps are as follows. First, regarding the selection and acquisition of variables, this paper draws on the existing literature. The choice of resource integration strategy is measured by whether the enterprise has a vertical integration strategy: an enterprise with a vertical integration strategy is assigned a value of 1, and 0 otherwise. The vertical integration degree index is calculated by matching the input-output table with the main business income of each sub-industry, and the results are ranked. The top 50% are samples with a high degree of vertical integration, which are defined as vertically integrated enterprises, and the bottom 50% are samples with a low degree of vertical integration, which are defined as enterprises that do not perform vertical integration. The innovation ability of an enterprise is measured by the number of patents granted to the enterprise in that year: the larger the value, the stronger the innovation ability. The pollution level of a company is measured by the pollution discharge fee in the company's disclosure statement: the larger the value, the greater the pollution of the company and the more it must spend on governance.
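The top-half/bottom-half coding of the Integration dummy can be sketched as below. This is only an illustration under stated assumptions: the function name `integration_dummy`, the firm labels, and the index values are hypothetical, and the text does not say whether ranking is done per year or over the pooled sample, nor how ties or odd sample sizes are handled (here the cutoff is simply half the sample, rounded down).

```python
def integration_dummy(vi_index):
    """Map each firm's vertical integration degree index to a 0/1 dummy.

    Firms ranked in the top half by index value get 1 (vertically
    integrated); the remaining firms get 0.
    """
    # Rank firm names from the highest index value to the lowest.
    ranked = sorted(vi_index, key=vi_index.get, reverse=True)
    cutoff = len(ranked) // 2  # assumption: top half, rounded down
    top_half = set(ranked[:cutoff])
    return {firm: int(firm in top_half) for firm in vi_index}


# Hypothetical index values, for illustration only (not data from the paper).
example = {"firm_a": 0.62, "firm_b": 0.15, "firm_c": 0.48, "firm_d": 0.30}
dummies = integration_dummy(example)
print(dummies)
```

With these made-up values, `firm_a` and `firm_c` fall in the top half and are coded 1, while the other two are coded 0.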
Second, considering the lagged effect of strategy choice, this paper constructs multiple regression models relating enterprise resource integration to innovation capability and to corporate pollution level. The empirical results (Table 1) show that coal companies that choose resource integration strategies improve corporate innovation and thus obtain government tax preferences, whereas companies that choose a non-resource integration strategy and maintain the status quo increase pollution and face government pollution penalties.
Finally, to check the robustness of the model, on the one hand, this paper directly uses the vertical integration degree index to replace the dummy variable for vertical integration strategy selection. On the other hand, it controls for other variables that may affect corporate innovation and pollution, such as corporate financial leverage, inventory intensity, corporate size, shareholders' funds, total corporate assets, and the corporate management expense ratio. The results show that the aforementioned conclusions remain robust.
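One way to write the two regressions described above is given below. It is a sketch rather than the paper's specification: the one-period lag, the coefficient names, and the error terms are assumptions, since the text states neither the lag length nor the estimator. Controls stands for the control variables listed in the robustness check.

```latex
\begin{aligned}
\mathit{Patent}_{i,t} &= \alpha_0 + \alpha_1\,\mathit{Integration}_{i,t-1}
  + \boldsymbol{\gamma}^{\top}\mathbf{Controls}_{i,t} + \varepsilon_{i,t},\\
\mathit{Pollution}_{i,t} &= \beta_0 + \beta_1\,\mathit{Integration}_{i,t-1}
  + \boldsymbol{\delta}^{\top}\mathbf{Controls}_{i,t} + u_{i,t}.
\end{aligned}
```

Under this notation, the reported conclusions correspond to \(\alpha_1 > 0\) (integration is associated with more granted patents) and \(\beta_1 < 0\) (integration is associated with lower pollution discharge fees).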
Therefore, based on the above research background and empirical research, this paper gives government regulation a practical policy meaning: coal enterprises receive the government's tax incentives for resource integration and pollution penalties for non-resource integration. On this basis, we establish a game model between enterprises from the perspective of government regulation (Model 1) to study how different enterprises play the game to reach each other's optimal decision when selecting resource integration strategies. We also establish a game model between the enterprise and the government from the perspective of government regulation (Model 2) to study how the government and the enterprise play the game to reach each other's optimal decision when the enterprise chooses a resource integration strategy and the government chooses whether to supervise. The model assumptions and parameter settings are as follows:

Model 1
Assumption 1
The players of the game. According to resource-based theory, enterprises with heterogeneous resources can gain a competitive advantage and thus more market share, while enterprises lacking heterogeneous resources have a smaller competitive advantage and gain less market share. Enterprises with unique resources are dominant enterprises, and those lacking unique resources are inferior enterprises (Helfat and Peteraf 2003). For convenience of analysis, we assume that there are only two coal enterprises in the game market, namely Enterprise-A and Enterprise-B. Enterprise-A represents the dominant enterprise and Enterprise-B represents the inferior enterprise. According to resource-based theory, if there are multiple enterprises of different sizes in the market, they can still be divided into dominant and inferior enterprises according to whether they have strong competitive resources. In that case, Enterprise-A represents all dominant enterprises, and Enterprise-B represents all inferior enterprises. Under the assumption of bounded rationality, since the players cannot fully understand each other before starting the game, it takes many repeated games to reach the best game result.

Assumption 2
The strategy combination of the game. Model 1 mainly explores the impact of resource integration between two coal enterprises on each other. Therefore, the combination of strategies is (resource integration, non-resource integration), and both players choose the strategy that maximizes their own interests.

Assumption 3
The selection ratio of the game strategy. The ratio with which Enterprise-A chooses resource integration is x (0 ≤ x ≤ 1), and the ratio of non-resource integration is 1 − x; the ratio with which Enterprise-B chooses resource integration is y (0 ≤ y ≤ 1), and the ratio of non-resource integration is 1 − y.

Assumption 4
The heterogeneity of enterprises. Coal enterprises differ in resource levels such as scale, which is consistent with the theory of resource advantages (Helfat and Peteraf 2003). Therefore, this paper assumes that the two coal enterprises in the game are heterogeneous and uses market power, measured by market share, to capture this difference. The market power coefficient determines the distribution of benefits and conflict costs between the players in the coal market. The market power of the dominant enterprise is m (0.5 ≤ m ≤ 1), and the market power of the inferior enterprise is 1 − m (0 < 1 − m ≤ 0.5).

Assumption 5
Revenues and losses in the coal market. The total revenue of the coal market is R, which is shared by the two coal enterprises in the market. Resource integration increases the future total revenue of the coal market. When both Enterprise-A and Enterprise-B choose non-resource integration, future revenue falls, which in turn affects current market revenue. Therefore, this paper assumes that when both enterprises choose non-resource integration, the original market revenue decreases. In the short term, according to Marshall's theory of economies of scale, when one enterprise chooses resource integration to expand its scale and the other chooses non-resource integration to maintain its current scale, resource integration has not yet formed economies of scale. It therefore does not substantially increase the original total market revenue, but it does squeeze the non-integrating enterprise's share of total market revenue, producing a squeeze benefit (Marshall 1920). Therefore, this paper assumes that the squeeze coefficient of resource integration is v, which is less than the market revenue loss coefficient b that applies when neither enterprise integrates resources.

Assumption 6
The cost of coal enterprises. The cost of resource integration is C1 and the cost of non-resource integration is C2. Because resource integration consumes additional costs beyond production and operation costs in order to connect resources (Kravet et al. 2018), the cost of resource integration should be greater than the cost of non-resource integration.

Assumption 7
Government regulation. According to the empirical research, resource integration promotes the transformation and upgrading of enterprises, reduces pollution emissions, and improves the innovation capabilities of enterprises. To encourage innovation, almost all countries adopt the pretax deduction of R&D expenses, that is, cost is used as the tax base. For example, starting from January 1, 2021 in China, the rate of deduction for manufacturing R&D expenses has increased from 75% to 200%. However, some countries directly implement tax exemption policies to stimulate innovation, that is, income is used as the tax base. For example, in Germany, for energy-intensive companies that meet specific energy-saving targets, a 10-year tax exemption period will be extended from 2013. Therefore, this paper assumes that the government grants coal enterprises that integrate resources an innovation tax incentive at ratio d. In Model 1, because both players are enterprises, benefits and costs are set separately in the parameter setting so that the different effects of government regulation on the benefits and costs of dominant and inferior enterprises can be analyzed. Therefore, in Model 1, this paper chooses cost as the tax base. When coal enterprises choose non-resource integration to maintain the status quo of their business operations, pollution emissions cannot be improved and the enterprises face government pollution penalties. Therefore, this paper assumes that the government imposes a pollution penalty E on non-resource integration coal enterprises (Table 2).
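The parameter restrictions stated in Assumptions 4 to 7 can be collected in one place, which is convenient before running any simulation. The sketch below is illustrative only: the class name `Model1Params`, the method names, and the example values are hypothetical. It takes the stricter reading m < 1, because the text states 0.5 ≤ m ≤ 1 together with 0 < 1 − m. The `baseline_shares` split of R by market power is an assumed reading of "market power determines the distribution of benefits", not a payoff taken from Table 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model1Params:
    m: float   # market power of the dominant enterprise
    R: float   # total revenue of the coal market
    v: float   # squeeze coefficient of resource integration
    b: float   # market revenue loss coefficient when neither integrates
    C1: float  # cost of resource integration
    C2: float  # cost of non-resource integration
    d: float   # innovation tax incentive ratio
    E: float   # pollution penalty for non-resource integration

    def violations(self):
        """Return the list of stated restrictions that the values break."""
        checks = [
            (0.5 <= self.m < 1, "need 0.5 <= m < 1"),
            (self.v < self.b, "need v < b"),
            (self.C1 > self.C2, "need C1 > C2"),
            (self.E > 0, "need E > 0"),
        ]
        return [message for ok, message in checks if not ok]

    def baseline_shares(self):
        """Assumed baseline split of R: (dominant share, inferior share)."""
        return self.m * self.R, (1 - self.m) * self.R


# Hypothetical values, for illustration only.
params = Model1Params(m=0.75, R=100.0, v=0.1, b=0.3,
                      C1=20.0, C2=10.0, d=0.2, E=5.0)
print(params.violations())       # empty list when all restrictions hold
print(params.baseline_shares())
```

Returning a list of violated restrictions, rather than raising on the first one, makes it easier to see every inconsistent setting at once when sweeping parameters for a sensitivity analysis.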

Model 2
Assumption 8
The players of the game. Government-W and Enterprise-Z represent governments at all levels and the many coal enterprises that play the game, respectively. Under the assumption of bounded rationality, since the players cannot fully understand each other at the beginning of the game, the game must be repeated many times to reach the best game result.

Assumption 9
The strategy combination of the game. Model 2 mainly discusses the game between the government, which chooses whether to supervise, and coal enterprises, which choose whether to conduct resource integration. Therefore, the government's strategy combination is (supervision, non-supervision), and the coal enterprise's strategy combination is (resource integration, non-resource integration). Both players choose the strategy that maximizes their own interests.

Assumption 10
The selection ratio of the game strategy. The probability that the government chooses supervision is g (0 ≤ g ≤ 1), and the probability of non-supervision is 1 − g; the probability that Enterprise-Z chooses resource integration is j (0 ≤ j ≤ 1), and the probability of non-resource integration is 1 − j.

Assumption 11
The revenues of coal enterprises. It is assumed that, net of costs, the revenue of an enterprise that chooses resource integration is R1, and the revenue of an enterprise that chooses non-resource integration is R2.

Assumption 12
Costs and revenues of government supervision. Since the government is a non-profit organization, Model 2 does not set a revenue for the government. This paper assumes that the government's regulatory cost is C3. Regardless of whether the government supervises, when the coal enterprises in its jurisdiction realize the innovation effect of resource integration, they bring intangible benefits to the government, however large or small (Guo et al. 2017). This paper assumes that the positive utility of resource integration in driving innovation is I1.

Assumption 13
Government regulation. This is the same as the assumption in Model 1. However, in Model 2, the players are the government and enterprises, and enterprise heterogeneity is not considered. Therefore, in the parameter setting of enterprises, this paper mainly considers the income of enterprises and uses that income as the tax base for tax incentives (Table 3).

Model establishment
According to the assumptions of Model 1, the decision tree is shown in Fig. 2, and the payoff matrix is shown in Table 4.
According to Table 4, given that Enterprise-A chooses the resource integration strategy, its expected payoff over Enterprise-B's two strategies (resource integration and non-resource integration) is as follows:
The average expected payoff of Enterprise-A is as follows:
The replicator dynamic equation for Enterprise-A's choice of the resource integration strategy is as follows:
Table 3 (fragment). Market losses caused by non-resource integration between the two players (b > v); E, pollution penalty for non-resource integration (E > 0).
According to Table 3, given that Enterprise-B chooses the resource integration strategy, its expected payoff over Enterprise-A's two strategies (resource integration and non-resource integration) is as follows:
The average expected payoff of Enterprise-B is as follows:
The replicator dynamic equation for Enterprise-B's choice of the resource integration strategy is as follows:
Thus, a two-dimensional continuous dynamic system is formed as follows:
Letting dx/dt = 0 and dy/dt = 0, the five singularities of the system are A(0, 0), B(0, 1), C(1, 0), D(1, 1), and E(x*, y*). The Jacobian matrix of the system is as follows:
From the Jacobian matrix, the determinant and the trace can be obtained.
However, because the parameter values in singularity E are undetermined, its specific location and its influence on the two-dimensional continuous dynamic system cannot be determined directly. Its parameters therefore need to be discussed case by case.
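The determinant-and-trace test referred to above can be sketched in a few lines. Because the payoff expressions of Table 4 are not reproduced here, the sketch assumes a generic two-population replicator system, dx/dt = x(1-x)(p1 + q1*y) and dy/dt = y(1-y)(p2 + q2*x). The coefficients p1, q1, p2, q2 and their numerical values are hypothetical placeholders for the payoff differences, not the paper's parameters.

```python
def jacobian(x, y, p1, q1, p2, q2):
    """Jacobian of the assumed system
    dx/dt = x(1-x)(p1 + q1*y),  dy/dt = y(1-y)(p2 + q2*x)."""
    return (
        ((1 - 2 * x) * (p1 + q1 * y), x * (1 - x) * q1),
        (y * (1 - y) * q2, (1 - 2 * y) * (p2 + q2 * x)),
    )


def classify(x, y, p1, q1, p2, q2):
    """Classify an equilibrium from the signs of det(J) and tr(J)."""
    (a, b), (c, d) = jacobian(x, y, p1, q1, p2, q2)
    det = a * d - b * c
    tr = a + d
    if det > 0 and tr < 0:
        return "stable (ESS)"
    if det > 0 and tr > 0:
        return "unstable"
    if det < 0:
        return "saddle"
    return "undetermined"  # det == 0, or det > 0 with tr == 0


# Hypothetical coefficients, chosen only for illustration.
P1, Q1, P2, Q2 = 1.0, 0.5, -1.0, 0.5
CORNERS = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}
results = {
    name: classify(x, y, P1, Q1, P2, Q2) for name, (x, y) in CORNERS.items()
}
for name, kind in results.items():
    print(name, CORNERS[name], kind)
```

With these illustrative coefficients only C(1, 0) satisfies det > 0 and tr < 0, which is the pattern in which one enterprise integrates and the other does not.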

Equilibrium analysis
This system is divided into three cases for discussion. The results of local equilibrium and stability are shown in Table 5.
(1) Case 1: the evolutionary game system converges to point A(0, 0). When the cost of resource integration is far greater than the cost of non-resource integration, enterprises, acting under bounded rationality, tend to choose the non-resource integration strategy regardless of whether the government provides tax incentives for resource integration or imposes pollution penalties for non-resource integration.
(2) Case 2: the evolutionary game system converges to point C(1, 0). When the cost of resource integration lies between the non-resource integration costs of the dominant enterprise and the inferior enterprise, the dominant enterprise faces a resource integration cost lower than its non-resource integration cost. Since non-resource integration also incurs pollution penalties and resource integration earns tax incentives, the dominant enterprise tends to choose the resource integration strategy. The inferior Enterprise-B tends to choose the non-resource integration strategy, because its cost of resource integration is greater than its cost of non-resource integration, and government regulation is ineffective for it.
(3) Case 3: the evolutionary game system converges to point D(1, 1). When the cost of resource integration is far less than the cost of non-resource integration, and given the resource integration incentives and non-resource integration penalties in government regulation, both enterprises, acting under bounded rationality, tend to choose the resource integration strategy.
Table 3 (fragment). I_1: positive effect of resource integration in promoting innovation, I_1 > 0.

System simulation analysis
To observe how the strategy choices of the game players evolve, this paper uses MATLAB for simulation. Five initial values of (x, y) are used: (0.1, 0.1), (0.3, 0.3), (0.5, 0.5), (0.7, 0.7), and (0.9, 0.9).
For Case 1, the parameter settings are R = 200, C_1 = 120, C_2 = 30, b = 50, m = 0.7, v = 0.1, d = 0.1, E = 10. The evolution trajectory is shown in Fig. 3(a). The trajectory approaches the point (0, 0), indicating that, whether their market power is dominant or inferior, enterprises acting under bounded rationality tend to choose the non-resource integration strategy, which is consistent with the theoretical results.
For Case 2, the parameter settings are R = 200, C_1 = 80, C_2 = 20, b = 50, m = 0.7, v = 0.1, d = 0.1, E = 10. The evolution trajectory is shown in Fig. 3(b). The trajectory approaches the point (1, 0), indicating that the dominant enterprise tends to choose the resource integration strategy and the inferior enterprise tends to choose the non-resource integration strategy, which is consistent with the theoretical results.
For Case 3, the parameter settings are R = 200, C_1 = 50, C_2 = 30, b = 50, m = 0.7, v = 0.1, d = 0.1, E = 10. The evolution trajectory is shown in Fig. 3(c). The trajectory approaches the point (1, 1), indicating that, whether their market power is dominant or inferior, enterprises acting under bounded rationality tend to choose the resource integration strategy, which is consistent with the theoretical results.
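The simulation procedure described above (five initial points, integrated forward until the trajectory settles) can be sketched in Python as an analogue of the MATLAB runs. This is a minimal sketch under stated assumptions: the right-hand side is a generic two-population replicator system with hypothetical coefficients p1, q1, p2, q2, not the payoff expressions of Table 4, and a fixed-step fourth-order Runge-Kutta scheme stands in for whatever solver the original simulations used.

```python
def rhs(x, y, p1, q1, p2, q2):
    """Assumed replicator dynamics for the two enterprises."""
    return x * (1 - x) * (p1 + q1 * y), y * (1 - y) * (p2 + q2 * x)


def rk4_step(x, y, dt, params):
    """One classical Runge-Kutta step of size dt."""
    k1x, k1y = rhs(x, y, *params)
    k2x, k2y = rhs(x + dt / 2 * k1x, y + dt / 2 * k1y, *params)
    k3x, k3y = rhs(x + dt / 2 * k2x, y + dt / 2 * k2y, *params)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, *params)
    return (
        x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
        y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y),
    )


def simulate(x0, y0, params, t_end=30.0, dt=0.01):
    """Integrate from (x0, y0) and return the state at t_end."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt, params)
    return x, y


# Hypothetical coefficients: integration always pays for the first
# enterprise (p1 + q1*y > 0) and never pays for the second (p2 + q2*x < 0).
PARAMS = (1.0, 0.5, -1.0, 0.5)
STARTS = [(0.1, 0.1), (0.3, 0.3), (0.5, 0.5), (0.7, 0.7), (0.9, 0.9)]
finals = [simulate(x0, y0, PARAMS) for x0, y0 in STARTS]
for start, (x, y) in zip(STARTS, finals):
    print(start, "->", (round(x, 4), round(y, 4)))
```

Under these illustrative coefficients every starting point ends near (1, 0), the same qualitative outcome as Case 2.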
In short, when the costs of the two strategies differ greatly, government regulation does not work for any enterprise. When the market power of the enterprises differs greatly, government regulation fails only for the inferior enterprise.

A game model of resource integration behavior between government and enterprise

Model establishment
According to the assumptions of Model 2, the decision tree is shown in Fig. 4, and the payoff matrix is shown in Table 6.
According to Table 6, given that Enterprise-Z chooses to integrate resources, its expected payoff over Government-W's two strategies (supervision and non-supervision) is as follows:
The average expected payoff of Enterprise-Z is as follows:
The replicator dynamic equation for Enterprise-Z's choice of the resource integration strategy is as follows:
According to Table 3, given that Government-W chooses the supervision strategy, its expected payoff over Enterprise-Z's two strategies (resource integration and non-resource integration) is as follows:
The average expected payoff of Government-W is as follows:
The replicator dynamic equation for Government-W's choice of the supervision strategy is as follows:
Thus, a two-dimensional continuous dynamic system is constructed:
Letting dj/dt = dg/dt = 0, the five singularities of the system are A(0, 0), B(0, 1), C(1, 0), D(1, 1), and E(j*, g*). The Jacobian matrix of the system is as follows:
From the Jacobian matrix, the determinant and the trace can be obtained.
However, because the parameter values in singularity E are undetermined, its specific location and its influence on the two-dimensional continuous dynamic system cannot be determined directly. Its parameters therefore need to be discussed case by case.
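Why the interior singularity E needs separate treatment can be seen from a short worked derivation. It assumes a generic two-population replicator form; the coefficients p_1, q_1, p_2, q_2 are placeholders for the payoff differences derived from Table 6 and are not the paper's notation.

```latex
% Assumed generic form (p_i, q_i are placeholders, not the paper's parameters)
\frac{dj}{dt} = j(1-j)\,(p_1 + q_1 g), \qquad
\frac{dg}{dt} = g(1-g)\,(p_2 + q_2 j).

% Interior singularity (exists only if 0 < j^* < 1 and 0 < g^* < 1)
j^* = -\frac{p_2}{q_2}, \qquad g^* = -\frac{p_1}{q_1}.

% Jacobian at E(j^*, g^*): both diagonal entries vanish
J(E) = \begin{pmatrix}
0 & j^*(1-j^*)\,q_1 \\
g^*(1-g^*)\,q_2 & 0
\end{pmatrix},
\qquad
\operatorname{tr} J(E) = 0, \qquad
\det J(E) = -\,q_1 q_2\, j^*(1-j^*)\, g^*(1-g^*).
```

Under this assumed form the trace at E is always zero, so E can never meet the stability condition det J > 0 and tr J < 0. If q_1 q_2 > 0, E is a saddle; if q_1 q_2 < 0, the eigenvalues are purely imaginary and trajectories circle around E instead of converging to it.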

Equilibrium analysis
This system is divided into four cases for discussion, and the results of local equilibrium and stability are shown in Table 7.
(1) Case 1: the evolutionary game system converges to point B(0, 1), written in the order (j, g). This occurs when the maximum cost of government supervision lies between the minimum and maximum benefits that the enterprise's resource integration brings to the government, and the profit difference between non-resource integration and resource integration is greater than the total of tax incentives and pollution penalties. Under bounded rationality, the government tends to choose the regulatory strategy, but enterprises, maximizing their own benefits, still tend to choose the non-resource integration strategy, so government supervision fails.
(2) Case 2: the evolutionary game system converges to point B(0, 1). This occurs when the maximum cost of government supervision is less than the minimum benefit that the enterprise's resource integration brings to the government, and the profit difference between non-resource integration and resource integration is greater than the total of tax incentives and pollution penalties. Under bounded rationality, both parties choose the strategy that favors them: the government tends to choose the regulatory strategy, and enterprises tend to choose the non-resource integration strategy.
(3) Case 3: the evolutionary game system converges to point D(1, 1). This occurs when the maximum cost of government supervision is less than the minimum benefit that the enterprise's resource integration brings to the government, and the profit difference between non-resource integration and resource integration is less than the total of tax incentives and pollution penalties. The government tends to choose the regulatory strategy, while enterprises tend to choose the resource integration strategy under the constraints and guidance of government regulation.
(4) Case 4: the evolutionary game system has no stable equilibrium point. This occurs when the maximum cost of government supervision is greater than the minimum benefit that the enterprise's resource integration brings to the government, and the profit difference between non-resource integration and resource integration is less than the total of tax incentives and pollution penalties. Neither the government nor enterprises settle on a strategy; each leans toward whichever strategy is beneficial to it at the moment.

System simulation analysis
To observe how the strategy choices of the game players evolve, this paper uses MATLAB for simulation. Five initial values of (j, g) are used: (0.1, 0.1), (0.3, 0.3), (0.5, 0.5), (0.7, 0.7), and (0.9, 0.9).
For Case 1, the parameter settings are R_1 = 50, R_2 = 100, C_3 = 20, d = 0.1, E = 40, I_1 = 20. The evolution trajectory is shown in Fig. 5(a). The trajectory approaches the point (0, 1), indicating that enterprises tend to choose the non-resource integration strategy and the government tends to choose the regulatory strategy, which is consistent with the theoretical results.
For Case 2, the parameter settings are R_1 = 50, R_2 = 100, C_3 = 20, d = 0.1, E = 40, I_1 = 30. The evolution trajectory is shown in Fig. 5(b). The trajectory approaches the point (0, 1), indicating that enterprises tend to choose the non-resource integration strategy and the government tends to choose the regulatory strategy, which is consistent with the theoretical results.
For Case 3, the parameter settings are R_1 = 120, R_2 = 200, C_3 = 10, d = 0.4, E = 40, I_1 = 60. The evolution trajectory is shown in Fig. 5(c). The trajectory approaches the point (1, 1), indicating that enterprises tend to choose the resource integration strategy and the government tends to choose the regulatory strategy, which is consistent with the theoretical results.
For Case 4, the parameter settings are R_1 = 120, R_2 = 200, C_3 = 10, d = 0.4, E = 40, I_1 = 30. The evolution trajectory is shown in Fig. 5(d). The trajectory does not settle at any equilibrium point, indicating that neither enterprises nor the government has a stable strategy, which is consistent with the theoretical results.
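The non-convergent behavior reported for Case 4 can be reproduced qualitatively with a toy system. The sketch below assumes a generic replicator form, dj/dt = j(1-j)(p1 + q1*g) and dg/dt = g(1-g)(p2 + q2*j), with hypothetical coefficients chosen so that q1*q2 < 0; it does not use the paper's payoff parameters. For this form the quantity H(j, g) computed in `first_integral` is constant along trajectories, so the orbit is a closed loop around the interior point (0.5, 0.5) rather than a path to a corner.

```python
import math


def rhs(j, g, p1, q1, p2, q2):
    """Assumed replicator dynamics for enterprise (j) and government (g)."""
    return j * (1 - j) * (p1 + q1 * g), g * (1 - g) * (p2 + q2 * j)


def rk4_step(j, g, dt, params):
    """One classical Runge-Kutta step of size dt."""
    k1j, k1g = rhs(j, g, *params)
    k2j, k2g = rhs(j + dt / 2 * k1j, g + dt / 2 * k1g, *params)
    k3j, k3g = rhs(j + dt / 2 * k2j, g + dt / 2 * k2g, *params)
    k4j, k4g = rhs(j + dt * k3j, g + dt * k3g, *params)
    return (
        j + dt / 6 * (k1j + 2 * k2j + 2 * k3j + k4j),
        g + dt / 6 * (k1g + 2 * k2g + 2 * k3g + k4g),
    )


def first_integral(j, g, p1, q1, p2, q2):
    """Quantity conserved by the assumed dynamics inside the unit square."""
    return (
        p2 * math.log(j) - (p2 + q2) * math.log(1 - j)
        - p1 * math.log(g) + (p1 + q1) * math.log(1 - g)
    )


# Hypothetical coefficients with q1 * q2 < 0 (interior point at 0.5, 0.5).
PARAMS = (-1.0, 2.0, 1.0, -2.0)
j, g = 0.3, 0.3
h_start = first_integral(j, g, *PARAMS)

dt, steps = 0.01, 20000  # integrate up to t = 200
late_j = []
for step in range(steps):
    j, g = rk4_step(j, g, dt, PARAMS)
    if step >= steps // 2:  # keep only the second half of the run
        late_j.append(j)

h_end = first_integral(j, g, *PARAMS)
print("range of j in second half:", min(late_j), max(late_j))
print("drift in conserved quantity:", abs(h_end - h_start))
```

Even late in the run, j keeps moving above and below 0.5, which is the kind of swing the text describes: the strategies cycle instead of settling.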
In short, across the four cases, the government is obliged to supervise as part of its management responsibilities; however, when the cost of supervision is greater than the potential benefits, its choice of whether to supervise fluctuates over time. Enterprises, in gaming with the government, constantly weigh the benefits and losses that government regulation brings them. Only when tax incentives and pollution penalties materially affect enterprise profits does government regulation actively guide and restrict enterprise behavior.

System simulation analysis in different situations
This section verifies the stability of the model and summarizes the impact of different coefficients on the strategy choices of the government and coal enterprises. First, it discusses the influence of enterprise heterogeneity, from the perspective of market power, on the choice of game strategy among enterprises. Then, it analyzes how each single factor in government regulation affects the strategy combination between enterprises and between government and enterprises. Finally, it combines the regulatory factors and analyzes how changes in them affect the strategy combination. For the game among enterprises, since the initial cost of resource integration is higher than that of non-resource integration, the initial probability of selecting resource integration is set to 0.1; in order to analyze the economic meaning of differentiated strategy selection, the strategy combination is set to Case 2 of Model 1. For the game between government and enterprises, considering the different organizational nature of the two and to keep the game fair, the initial probability that the enterprise selects resource integration and the initial probability that the government selects supervision are both set to 0.5, and the strategy combination is set to Case 2 of Model 2. These factors are simulated below.
Regarding the simulation values, this paper bases its simulation analysis on the merger of China Shenhua with China Guodian. By investigating, sorting, and summarizing the data on this merger and reorganization, the parameter values in the evolutionary game payoff matrix are obtained under simplifying assumptions, and MATLAB is used for numerical analysis to present the problem visually.

The influence of market power on the choice of enterprise strategic combination
The parameters are assumed as follows: R = 200, C_1 = 80, C_2 = 20, b = 50, v = 0.1, d = 0.1, E = 10. As the simulated evolution trajectories in Fig. 6 show, the enterprises' strategy combination evolves from (0, 0) to (1, 0) as the value of m increases. The curvature of the trajectories also shows that the larger the value of m, the faster the enterprises respond and the shorter the time needed to converge to (1, 0). When m = 0.5, the system approaches (0, 0) in the early stage of strategy evolution, but still converges to (1, 0) after a long time.
In short, the difference in market power between enterprises has no qualitative impact on their choice of strategy combination, but as the difference widens, the time enterprises need to settle on a strategy becomes shorter. Dominant enterprises tend to choose resource integration and develop increasingly well. Inferior enterprises tend not to integrate resources and to maintain the status quo; their gap with dominant enterprises widens, and they are gradually eliminated by the market.
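The link between a larger advantage and a shorter convergence time can be illustrated with a worked formula. As a simplification that is not taken from the paper, suppose the dominant enterprise's payoff advantage from resource integration is a constant Δ > 0 along the trajectory; the symbol Δ and the target level x_1 = 0.99 are illustrative assumptions, while x_0 = 0.1 is the initial probability used in the simulations.

```latex
\frac{dx}{dt} = x(1-x)\,\Delta
\;\Longrightarrow\;
\ln\frac{x(t)}{1-x(t)} = \ln\frac{x_0}{1-x_0} + \Delta\, t,
\qquad
T(x_0 \to x_1) = \frac{1}{\Delta}\,\ln\frac{x_1(1-x_0)}{x_0(1-x_1)} .

% Example: x_0 = 0.1, x_1 = 0.99
T = \frac{1}{\Delta}\,\ln\frac{0.99 \times 0.9}{0.1 \times 0.01}
  = \frac{\ln 891}{\Delta} \approx \frac{6.79}{\Delta} .
```

Under this assumption the convergence time is inversely proportional to Δ, so doubling the advantage halves the time. If a larger m raises the dominant enterprise's advantage, this matches the faster convergence seen in Fig. 6.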

The influence of government regulation on the choice of enterprise strategic combination
To examine the influence of tax incentives on the choice of enterprise strategy combination, the parameters are assumed as follows: R = 200, C_1 = 80, C_2 = 20, b = 50, v = 0.1, m = 0.7, E = 10. The simulated evolution trajectory is shown in Fig. 7(a).
To examine the influence of the pollution penalty on the choice of enterprise strategy combination, the parameters are assumed as follows: R = 200, C_1 = 80, C_2 = 20, b = 50, v = 0.1, m = 0.7, d = 0.1. The simulated evolution trajectory is shown in Fig. 7(b).
To examine the influence of the combination of government regulations on the choice of enterprise strategy combination, the parameters are assumed as follows: R = 200, C_1 = 80, C_2 = 20, b = 50, v = 0.1, m = 0.7. The simulated evolution trajectory is shown in Fig. 7(c).
Fig. 7(a) and (b) show that as d and E increase, the enterprises' strategy combination evolves from (1, 0) to (1, 1). The curvature of the trajectories indicates that the critical values of d and E lie between 0.1 and 0.3 and between 30 and 40, respectively; denote them α and β. The closer d and E are to α and β, the slower the convergence. When the value is larger than the critical value, the final strategy combination of the enterprises is (1, 1); when it is smaller, the final strategy combination is (1, 0). Fig. 7(c) shows that when d and E are used in combination, the strategy selection results are the same as in Fig. 7(a) and (b), but the critical values are lower.
In short, owing to their scale advantages, dominant enterprises have a higher probability of choosing resource integration for better development, so government regulation has less impact on them. Inferior enterprises, however, are strongly influenced by outside conditions, and the guiding and restricting role of government regulation is particularly prominent for them. At the same time, the combined use of multiple government regulations makes this guidance and restriction more efficient.
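The critical-value behavior described above can be illustrated with a small search routine. The sketch below is a toy under loudly stated assumptions: the function `net_gain`, its linear form, and the constants 100 and 40 are hypothetical and are not derived from the paper's payoff matrix. It assumes the inferior enterprise follows dy/dt = y(1-y)*net_gain(d, E), uses the closed-form solution of that equation, and locates the critical d by bisection on the long-run outcome, starting from y0 = 0.1 as in the simulations.

```python
import math


def net_gain(d, E):
    """Hypothetical payoff advantage of resource integration for the
    inferior enterprise: incentives (d) and penalties (E) offset an
    assumed extra integration cost of 40."""
    return 100.0 * d + E - 40.0


def long_run_share(d, E, y0=0.1, t_end=1000.0):
    """Closed-form solution of dy/dt = y(1-y)*net_gain at time t_end."""
    z = math.log(y0 / (1.0 - y0)) + net_gain(d, E) * t_end
    if z >= 0:  # numerically stable logistic function
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def critical_d(E, lo=0.0, hi=1.0, iters=40):
    """Bisect on d for the smallest value at which y ends above 0.5."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if long_run_share(mid, E) > 0.5:
            hi = mid
        else:
            lo = mid
    return hi


crit_low_penalty = critical_d(E=10.0)
crit_high_penalty = critical_d(E=20.0)
print("critical d with E = 10:", round(crit_low_penalty, 3))
print("critical d with E = 20:", round(crit_high_penalty, 3))
```

In this toy setting a larger penalty E lowers the critical tax incentive d, the same direction of effect as the lower critical values reported for the combined policy in Fig. 7(c). Near the critical value the net gain is close to zero, which is also why convergence slows down there.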

The influence of government regulation on the choice of government-enterprise strategy combination
To examine the influence of tax incentives on the choice of government-enterprise strategy combination, the parameters are assumed as follows: R_1 = 50, R_2 = 100, C_3 = 20, E = 40, I_1 = 30. The simulated evolution trajectory is shown in Fig. 8(a). Writing the strategy combination in the order (j, g), with the enterprise first, the figure shows that as d increases the combination evolves from (0, 1) through (1, 1) to a non-equilibrium state. The evolution falls into three ranges, separated by two critical values: 0.2, and a value between 0.3 and 0.4, denoted λ. When d is less than 0.2, the strategy combination evolves toward (0, 1); when d is greater than 0.2 and less than λ, it evolves toward (1, 1); when d is greater than λ, the strategies of government and enterprise have no equilibrium and remain in a state of swing.
To examine the influence of the pollution penalty on the choice of government-enterprise strategy combination, the parameters are assumed as follows: R_1 = 50, R_2 = 100, C_3 = 20, I_1 = 30, d = 0.1. The simulated evolution trajectory is shown in Fig. 8(b). Writing the strategy combination in the order (j, g), with the enterprise first, the figure shows that as E increases the combination evolves from (0, 1) to (1, 1), with a critical value between 40 and 50, denoted μ. When E is less than μ, the strategy combination evolves to (0, 1); when E is greater than μ, it evolves to (1, 1).
To examine the influence of the combination of government regulations on the choice of government-enterprise strategy combination, the parameters are assumed as follows: R_1 = 50, R_2 = 100, C_3 = 20, I_1 = 30. The simulated evolution trajectory is shown in Fig. 8(c). The figure shows that when d is combined with E, the critical value of d is reduced, and the optimal strategy combination between government and enterprise is reached in a shorter time.
In short, excessive tax incentives and pollution penalties often prevent government regulation from playing its guiding and restricting role. At the same time, the combined use of government regulations can increase the flexibility of the strategic choices of government and enterprises and make those choices more efficient.

Discussion
Strengthening the resource integration of coal enterprises is another important way to achieve energy conservation and emission reduction, and government regulation can effectively guide and encourage it. At the micro level, resource integration improves the utilization efficiency of coal resources; at the macro level, it takes advantage of the integration of coal enterprises' existing resources. However, under government regulation, cooperation between the executing body and the supervising body of resource integration is insufficient (Patterson and Whincup 2017). According to the participants involved in resource integration, this paper divides the main bodies into enterprises and governments, considers the strategic competition between enterprises and between enterprise and government, and constructs two mechanisms, tax incentives and pollution penalties. Under these two mechanisms, evolutionary game analysis and simulation are used to study the effect of government regulation and to analyze its influence on the strategy choices of coal enterprises and the government in different situations.
Compared with previous studies, this paper differs as follows: (1) based on bounded rationality, it constructs a dynamic evolutionary game model to analyze the strategic choices of participants in coal enterprise resource integration under government regulation, combining empirical research with mathematical deduction; (2) in studying government regulation, it considers both incentives and penalties and gives them specific policy implications; (3) through simulation, it looks for the boundaries of strategy selection in coal enterprise resource integration and government supervision. This section discusses how the conclusions of the preceding analysis apply in practice. In addition, we have added research related to the application of evolutionary game theory to mergers and similar business activities, and we have established an evolutionary game model between the government and enterprises to further explore the impact of government regulation on the green innovation of coal enterprises.
(1) Can static government regulations guide enterprises to integrate resources to reduce pollution?
Policy governance is divided into static and dynamic. Static governance studies evaluate the governance effects of policies at a certain point in time, while dynamic governance observes changes in governance effects by adjusting the implementation of policies (Ma et al. 2020). The guiding policies for resource integration are mainly based on incentive tools and punishment tools, guiding enterprises to choose resource integration strategies to reduce pollution.
In the three cases of the game model between enterprises, A(0, 0), C(1, 0), and D(1, 1) are the possible evolutionarily stable strategy (ESS) points, indicating that an enterprise, whether inferior or dominant, will choose resource integration under certain conditions. In the four cases of the game model between government and enterprise, the possible ESS points of the first three cases, written in the order (j, g), are (0, 1), (0, 1), and (1, 1), and the fourth case has no ESS point, indicating that with the participation of the government, enterprises will choose to integrate resources under certain conditions. Therefore, under certain conditions, static government regulation can guide enterprises to integrate resources and thereby reduce the pollution caused by coal enterprises.
For the government, within a certain limit, whatever environment it is in, it will choose to supervise enterprises' resource integration behavior as part of its responsibilities (Vongsathorn 2012). Beyond this limit, however, the maximum cost of government supervision is greater than the minimum benefit that enterprises' resource integration brings to the government. The government then needs to consider more factors in choosing its strategy, which leaves its choice in a state of vacillation. For enterprises, when the difference between the benefits of the two strategies is greater than the maximum loss caused by government supervision, then whatever countermeasures the government adopts, enterprises acting under bounded rationality choose the strategy that maximizes their own interests, and government supervision fails. When instead the benefit of the enterprise's own strategic choice is lower than the maximum loss brought about by government regulation, government supervision takes effect and restricts and guides enterprise behavior.
However, this paper also finds that in some cases government regulation shows signs of failure, mainly because the government and enterprises each start from their own costs and benefits without considering social benefits (Blackwell et al. 2017). Therefore, in practice, beyond relying on government regulation to guide enterprises' resource integration behavior, other governance entities also need to take part in coordinated governance (Knudsen 2018) to prevent regulatory failure and achieve the best governance effect.
(2) Can dynamic government regulations guide enterprises to integrate resources to reduce pollution?
When the coefficients of tax incentives and pollution penalties are adjusted, the enterprises' strategy combination gradually evolves from (1, 0) to (1, 1), indicating that, whether an enterprise is inferior or dominant, stronger incentives and penalties lead it to choose resource integration. For the strategy combination of government and enterprise, however, there is a critical value. Up to this value, stronger regulation does lead the enterprise to choose the resource integration strategy, but once it is exceeded, the enterprise's strategic choice enters a state of swing. Therefore, under certain conditions, dynamic government regulation can guide enterprises to integrate resources and thereby reduce the pollution caused by coal enterprises.
For the strategy combination between enterprises, the tax incentives and pollution penalties in government regulation guide and restrict enterprises' strategic choices to a certain extent, but have little impact on dominant enterprises. For the strategy combination of government and enterprise, blindly encouraging resource integration often backfires: once a certain critical value is reached, the strategy combination of government and enterprise swings, and the purpose of controlling pollution and encouraging innovation cannot truly be achieved. Excessive penalties make enterprises quickly follow government-oriented decisions under deterrence, but such choices are not voluntary and are not conducive to the long-term development of the enterprise. A combination of government regulations can effectively improve the efficiency of enterprises' strategy selection. In addition, reducing tax incentives lightens the government's fiscal burden.
Therefore, in practice, the government needs to consider various factors when formulating policies and to use incentive tools and punishment tools together (Pastore 2018). It should also use field research and policy simulation to explore the best strength of incentives and penalties and formulate flexible rules.
(3) How does the effect of government regulation differ between inferior and dominant enterprises?
Survival of the fittest through natural selection is a universal law in nature, and it also applies to the life cycle of an enterprise. A strong enterprise is highly adaptable and tends to choose change. Weak enterprises, however, tend to stay as they are, and they may decline and be gradually eliminated by the market (Michael 2018).
When the size of the market power is adjusted, the strategy combination of enterprises gradually evolves from (0, 0) to (1, 0), indicating that enterprises with different market shares differ greatly in their choice of resource integration strategy.
When the market power of the enterprises differs greatly, dominant enterprises, compared with inferior ones, can quickly find their own dominant strategy and push the system equilibrium toward the side that benefits them. Government regulation guides the strategy selection of dominant enterprises but has no effect on inferior enterprises. At the same time, however the market power changes, under bounded rationality, when the cost of government regulation is less than the profit brought by non-resource integration, government regulation fails for inferior enterprises, and their optimal strategy is still non-resource integration. When market power is balanced, both enterprises, weighing costs, prefer non-resource integration. However, as the game is repeated many times, one enterprise will take the lead in changing its strategy under the influence of government regulation, breaking the equilibrium and evolving toward the opposite strategy.
This paper assumes that there are only two enterprises of different sizes in the market. In reality, there are many coal enterprises, and their market shares differ considerably. Therefore, when formulating guiding policies, the government can exploit the different responses of enterprises with different market power to formulate targeted policies. At the same time, more attention should be paid to inferior enterprises. Because their ability to bear risk is weak, they are more inclined to maintain their original state when making decisions. Government regulation alone cannot help them transform quickly to achieve energy conservation and emission reduction. Therefore, multi-agent collaborative governance is needed to achieve efficient integration of resources (Arimura and Wakabayashi 2020).
(4) Does government regulation affect the green innovation strategy of coal enterprises?
The empirical research and game model analysis show that resource integration by coal enterprises can promote enterprise innovation, while non-resource integration causes pollution. Therefore, to further explore the economic consequences of government regulation for the resource integration of coal enterprises, we have established a game model between the enterprise and the government and analyze the influence of different government policies on coal enterprises' choice of green innovation strategy. Through the game model and simulation, we explore how the government and the enterprise can each reach their optimal decision when making strategic choices. In this game model, the coal enterprise's strategy set is (green innovation, non-green innovation), and the government's strategy set is (subsidy, pollution penalty). We analyzed the steady state of the evolutionary game model in four situations. Owing to space limitations, the detailed mathematical analysis and simulation analysis are given in Appendix C.
According to the game model and simulation results, the government subsidy strategy is less effective than the government's pollution penalty strategy in promoting the green innovation of enterprises. When the government's pollution penalty is less than the cost of green innovation, the probability that the enterprise chooses non-green innovation converges to 1; that is, enterprises always choose non-green innovation. When the pollution penalty is greater than the cost of green innovation, enterprises may choose green innovation. This shows that government pollution penalties promote corporate green innovation.
A possible reason for this result is that although innovation subsidies increase corporate innovation revenue, they do not increase the costs and losses of corporate non-green innovation. Faced with the choice, enterprises choose non-green innovation activities with lower risks and costs. Government pollution penalties raise the cost of corporate pollution and thereby encourage enterprises to carry out green innovation. Therefore, in governing the green innovation of coal enterprises, the government needs to make full use of different policy combinations. On the one hand, it needs to increase the benefits of corporate green innovation; on the other hand, it needs to increase the costs and risks of corporate non-green innovation activities.

Conclusions and policy implications
To address the problem of how to strengthen the resource integration of coal enterprises, this paper constructs an evolutionary game model to explore the strategic choices of heterogeneous coal enterprises under government regulation. A simulation is performed in MATLAB to analyze the evolution and steady-state changes of the game players' strategies, and the conclusions are as follows:
(1) The market power of coal enterprises affects their strategic choices for resource integration. When the market power gap is large, government regulation is effective only for dominant enterprises.
(2) The government's tax incentives for innovation and its pollution penalties guide and restrict the strategic choices of enterprises. Moreover, the combination of these two regulatory methods promotes the integration of corporate resources more effectively.
(3) Excessive government rewards or punishments are not conducive to the integration of corporate resources. Excessive government rewards increase corporate dependence on policies, while excessive government penalties increase the pressure on enterprises, so that enterprises lack the initiative to integrate resources. Ultimately, these conditions lead to inefficient government regulation.
Therefore, under the restriction and guidance of government regulations, the resource integration of coal enterprises can be realized efficiently through the following measures: (1) Because enterprises differ in market power, they also differ in their responses to government regulations and in their strategic choices. Therefore, when formulating rules and regulations, the government should start from the gap in market power among enterprises, implement layered governance, and break the inertia of institutional balance. More importantly, the government should adjust flexibly according to the actual situation of enterprises and strive to maximize the effectiveness of governance.
(2) Excessive tax incentives can encourage enterprises to abandon non-resource integration strategies, thereby reducing pollution. However, they impose an excessive burden on the government and, given the bounded rationality of enterprises, increase enterprises' dependence on the policy and their expectations of it. Once the government lowers tax incentives, enterprises will choose non-resource integration strategies again. Therefore, in the process of formulating regulations, the government should achieve regulatory portfolio governance, innovate institutional advantages, and formulate flexible regulatory standards.
(3) Because enterprises are heterogeneous, a single policy may not achieve the optimal governance effect. Therefore, an efficient governance mechanism is inseparable from the combined use of government regulations. First, the combination of regulations should be adapted to the characteristics of the enterprise. Second, the combination must consider the vested interests of both the government and the enterprise and avoid sacrificing the interests of one party to reach a balanced state, which would make the policy impossible to implement in the long term. Finally, combining and innovating regulations requires pilot experiments and should not remain at the theoretical level.
Data availability The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Declarations
Ethics approval Not applicable.
Consent to participate Not applicable.
Consent for publication Not applicable.

Conflict of interest
The authors declare no competing interests.

Calculation details of model 1
The payoff matrix of Enterprise-A is defined as A, which is shown as follows:
The payoff matrix of Enterprise-B is defined as B, which is shown as follows:

Calculation details of model 2
The payoff matrix of Enterprise-Z is defined as Z, which is shown as follows:
The payoff matrix of Government-W is defined as W, which is shown as follows:

Additional analysis
(1) Research assumptions and parameters
Model assumptions and parameter settings are as follows:

Assumption 1
The players of the game. We assume that Government-W and Enterprise-Z, the two players of the game, represent governments at all levels and the many coal enterprises, respectively.

Assumption 2
The strategy sets of the game. In China, government subsidies and pollution penalties are two important environmental regulatory policies (Albrizio et al. 2017). Therefore, we assume that the government's strategy set is (subsidy, pollution penalty) and the coal enterprise's strategy set is (green innovation, non-green innovation).

Assumption 3
The selection ratio of the game strategy. The proportion of enterprises choosing green innovation is j (0 ≤ j ≤ 1), and the proportion choosing non-green innovation is 1 − j. The proportion of governments choosing the innovation subsidy policy is g (0 ≤ g ≤ 1), and the proportion choosing the pollution penalty is 1 − g.

Assumption 4
The revenues of coal enterprises. We assume that the enterprise's revenue is $R_3$. If an enterprise chooses green innovation, it needs to pay a cost of $C_4$.

Assumption 5
Government revenue. When coal enterprises carry out green innovation, the social benefits brought by green innovation are $R_4$, and this part of the revenue is regarded as government revenue.

Assumption 6
Government regulation. The government's environmental regulation policies can be divided into innovation subsidies and pollution penalties. Since 2012, China has formulated a green innovation subsidy program to promote green innovation (Lu et al. 2021). At the same time, in the "Made in China (2025) Strategy" plan, China also proposed an innovation subsidy program (Yi et al. 2020a, 2020b). In addition to innovation subsidies, China also imposes related taxes and fees on corporate pollution. For example, starting in 2018, China promulgated a new tax law and began to levy taxes on corporate emissions (Wang and Yu 2020). Therefore, we assume that the government grants an innovation subsidy $S$ ($S < C_4$) to coal enterprises that conduct green innovation. When coal enterprises choose non-green innovation, they may face a pollution penalty $P$ from the government because they may not be able to reduce their pollution emissions.
(2) Model establishment
According to the assumptions of the model, the payoff matrix is shown in Table 10.

Table 9 The determinant and trace of the Jacobian matrix
Columns: equilibrium point (j, g); determinant expression; trace expression.
A(0, 0): determinant $(R_1 - R_2) \times (-C_3 + E)$
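As a reading aid, two-population evolutionary games of this kind are commonly analysed with replicator equations. The block below is a sketch under that assumption, not a reproduction of the paper's own equations. The symbols $U_G$ and $U_N$ (expected payoffs of an enterprise choosing green and non-green innovation) and $V_S$ and $V_P$ (expected payoffs of the government choosing subsidy and pollution penalty) are introduced here for illustration only; each would be computed from the payoff matrix as a function of $g$ or $j$.

```latex
\begin{aligned}
\frac{dj}{dt} &= j\,(1-j)\,\bigl(U_G - U_N\bigr),\\
\frac{dg}{dt} &= g\,(1-g)\,\bigl(V_S - V_P\bigr).
\end{aligned}
```

In this form the four corner points (0, 0), (0, 1), (1, 0) and (1, 1) are always stationary, and a stationary point is locally asymptotically stable when the Jacobian matrix evaluated there has a positive determinant and a negative trace.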

(3) Equilibrium analysis
This system is divided into four cases for discussion, and the results of local equilibrium and stability are shown in Table 12.
In case 1, the evolutionary game system is in an unstable state. When the government's pollution penalty is greater than the enterprise's cost of green innovation ($P > C_4$), and the environmental benefits that green innovation brings to the government are greater than the innovation subsidy ($R_4 > S$), the strategic choices of the government and enterprises do not settle at any stable point.
In case 2, the evolutionary game system converges to C(1, 0). When the pollution penalty is greater than the cost of green innovation ($P > C_4$), and the environmental benefits of green innovation to the government are less than the innovation subsidy ($R_4 < S$), the boundedly rational government tends to choose the pollution penalty strategy, and enterprises tend to choose the green innovation strategy.
In case 3, the evolutionary game system converges to A(0, 0). When the pollution penalty is less than the cost of green innovation ($P < C_4$), and the environmental benefits of green innovation to the government are greater than the innovation subsidy ($R_4 > S$), enterprises tend to choose the non-green innovation strategy, and the government tends to choose the pollution penalty strategy.
In case 4, the evolutionary game system converges to A(0, 0). When the pollution penalty is less than the cost of green innovation ($P < C_4$), and the environmental benefits of green innovation to the government are less than the innovation subsidy ($R_4 < S$), the boundedly rational government tends to choose the pollution penalty strategy, and enterprises tend to choose the non-green innovation strategy.
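The four cases can be summarised as a small decision rule. The Python sketch below only encodes the conditions and outcomes stated in the text; the function name `classify_case` and the handling of boundary values (equalities are rejected because the text does not discuss them) are assumptions made for illustration.

```python
from typing import Optional, Tuple

Point = Tuple[int, int]  # (j, g): enterprise green share, government subsidy share


def classify_case(P: float, C4: float, R4: float, S: float) -> Tuple[int, Optional[Point]]:
    """Return (case number, stable point) following the four cases in the text.

    The stable point is None for case 1, where the text reports an unstable system.
    """
    if P == C4 or R4 == S:
        # Assumption: boundary values are not covered by the text, so reject them.
        raise ValueError("boundary case (P == C4 or R4 == S) is not covered")
    if P > C4:
        if R4 > S:
            return 1, None      # case 1: no stable strategy pair
        return 2, (1, 0)        # case 2: green innovation, pollution penalty
    if R4 > S:
        return 3, (0, 0)        # case 3: non-green innovation, pollution penalty
    return 4, (0, 0)            # case 4: non-green innovation, pollution penalty


if __name__ == "__main__":
    # Parameter values reported for the simulation of the four cases.
    parameter_sets = [
        dict(P=10.0, C4=8.6, R4=20.32, S=1.24),
        dict(P=30.0, C4=28.6, R4=20.32, S=21.24),
        dict(P=0.4, C4=8.6, R4=20.32, S=1.24),
        dict(P=20.4, C4=28.6, R4=20.32, S=21.24),
    ]
    for params in parameter_sets:
        print(params, "->", classify_case(**params))
```

The rule makes explicit that the comparison between $P$ and $C_4$ decides whether enterprises can be moved to green innovation at all, while the comparison between $R_4$ and $S$ only matters once $P > C_4$.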

(4) System simulation analysis
To observe how the strategy choices of the game players evolve, this paper uses MATLAB for simulation. The simulations are based on China Shenhua Energy Company Limited's 2018 Environmental, Social and Governance Report and the 2018 financial data of China Shenhua from the CSMAR database (https://www.gtarsc.com/#/index). In 2018, China Shenhua's operating income was 264.101 billion RMB, R&D investment was 860 million RMB, pollution expenditure was 40 million RMB, government subsidies were 124 million RMB, and the enterprise's environmental protection investment was 2.032 billion RMB. Based on these data, the parameters are set as follows (in units of 100 million RMB). For case 1: $R_3 = 2641.01$, $R_4 = 20.32$, $C_4 = 8.6$, $S = 1.24$, $P = 10$. For case 2: $R_3 = 2641.01$, $R_4 = 20.32$, $C_4 = 28.6$, $S = 21.24$, $P = 30$. For case 3: $R_3 = 2641.01$, $R_4 = 20.32$, $C_4 = 8.6$, $S = 1.24$, $P = 0.4$. For case 4: $R_3 = 2641.01$, $R_4 = 20.32$, $C_4 = 28.6$, $S = 21.24$, $P = 20.4$. The simulation results are shown in Fig. 9. Fig. 9 shows that the evolution result of case 1 is an unstable state, the evolution result of case 2 is (1, 0), and the evolution result of cases 3 and 4 is (0, 0).
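The paper's MATLAB code and payoff matrix (Table 10) are not reproduced in this text, so the following Python sketch is only an illustration of how such a simulation could be set up. It assumes replicator dynamics and a hypothetical payoff structure chosen solely because it is consistent with the four cases described above: the enterprise earns $R_3 - C_4 + S$ (green, subsidy), $R_3 - C_4$ (green, penalty), $R_3$ (non-green, subsidy) and $R_3 - P$ (non-green, penalty); the government earns $R_4 - S$ (subsidy, green), $0$ (subsidy, non-green), $0$ (penalty, green) and $P$ (penalty, non-green). The initial point (0.5, 0.5), the step size and the time horizon are likewise assumptions.

```python
from typing import Dict, Tuple


def replicator_rhs(j: float, g: float, C4: float, S: float, P: float, R4: float) -> Tuple[float, float]:
    """Right-hand side of the replicator equations under the ASSUMED payoffs.

    R3 appears in every enterprise payoff, so it cancels out of the dynamics.
    """
    # Enterprise: expected payoff of green minus non-green innovation.
    enterprise_gap = g * (S - C4) + (1.0 - g) * (P - C4)
    # Government: expected payoff of subsidy minus pollution penalty.
    government_gap = j * (R4 - S) - (1.0 - j) * P
    return j * (1.0 - j) * enterprise_gap, g * (1.0 - g) * government_gap


def simulate(C4: float, S: float, P: float, R4: float,
             j0: float = 0.5, g0: float = 0.5,
             dt: float = 0.01, t_end: float = 50.0) -> Tuple[float, float]:
    """Integrate with classical fourth-order Runge-Kutta and return the final (j, g)."""
    j, g = j0, g0
    for _ in range(int(round(t_end / dt))):
        k1 = replicator_rhs(j, g, C4, S, P, R4)
        k2 = replicator_rhs(j + 0.5 * dt * k1[0], g + 0.5 * dt * k1[1], C4, S, P, R4)
        k3 = replicator_rhs(j + 0.5 * dt * k2[0], g + 0.5 * dt * k2[1], C4, S, P, R4)
        k4 = replicator_rhs(j + dt * k3[0], g + dt * k3[1], C4, S, P, R4)
        j += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        g += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        # Keep both shares inside [0, 1] despite rounding error.
        j = min(1.0, max(0.0, j))
        g = min(1.0, max(0.0, g))
    return j, g


# Parameter values reported in the text (units of 100 million RMB).
CASES: Dict[str, Dict[str, float]] = {
    "case 1": dict(C4=8.6, S=1.24, P=10.0, R4=20.32),
    "case 2": dict(C4=28.6, S=21.24, P=30.0, R4=20.32),
    "case 3": dict(C4=8.6, S=1.24, P=0.4, R4=20.32),
    "case 4": dict(C4=28.6, S=21.24, P=20.4, R4=20.32),
}

if __name__ == "__main__":
    for name, params in CASES.items():
        j_end, g_end = simulate(**params)
        print(f"{name}: j = {j_end:.4f}, g = {g_end:.4f}")
```

Under these assumptions the trajectories for cases 2, 3 and 4 approach the corner points reported for Fig. 9, while the case 1 trajectory keeps cycling in the interior without settling. Different payoff entries would give different trajectories, so the sketch should not be read as a replication of Fig. 9.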