Does manufacturing agglomeration promote or hinder green development efficiency? Evidence from Yangtze River Economic Belt, China

Sustainable development can be achieved mainly by promoting the green transformation of the world economy and by improving the efficiency of regional green development, both of which have received extensive attention from academia. This paper uses a spatial econometric model to estimate the impact of manufacturing agglomeration on green development efficiency based on panel data of China's Yangtze River Economic Belt (YREB). The results show an overall large gap in green development efficiency between regions in the YREB, mostly due to the extremely uneven green development efficiency in the upper reaches. In contrast to the middle and lower reaches, manufacturing agglomeration in the upper reaches of the YREB improves green development efficiency. Manufacturing agglomeration is conducive to the improvement of green development efficiency in adjacent areas. Nonetheless, it may hinder green development efficiency by inhibiting green technological innovation. This paper provides empirical evidence and policy implications for applying manufacturing agglomeration to promote green development efficiency in accordance with local conditions.


Introduction
Promoting sustained, inclusive, and sustainable economic growth and working together to address global climate change are key objectives of the 2030 Agenda for Sustainable Development. To achieve this goal, China has listed green development as a major strategic planning task in the fourteenth five-year plan for national economic and social development in 2021. This will serve as the guidelines for China's economic and social development in the near future. The essence of green development is to change the traditional economic growth mode that sacrifices resources and environment, and to protect the ecological environment while promoting sustainable economic growth (Yuan et al. 2020a). The key to promoting green development lies in improving green development efficiency, that is, the input-output efficiency of the socioeconomic system concerning the undesirable output of energy consumption and pollutant emission (Shuai and Fan 2020;Zhu et al. 2019a). An important way to improve it would be promoting manufacturing agglomeration by following international advanced standards (Aleksandrova et al. 2020;Fang et al. 2020).
Responsible Editor: Eyup Dogan
The manufacturing industry is the pillar of the national economy and the leading sector of economic growth in China. Manufacturing enterprises contribute 32% of China's GDP and 12% of global goods exports (World Bank 2017). The manufacturing industry shows typical spatial agglomeration characteristics (Ellison et al. 2010). By the end of 2019, 61.03% of manufacturing enterprises were concentrated in East China, holding 57.22% of the assets and generating 58.55% of the profits of all manufacturing enterprises. Moreover, East China accounted for 50.53% of the country's total power consumption and 53.22% of its investment in industrial pollution control. Thus, manufacturing agglomeration contributes to economic growth at the expense of huge energy consumption and environmental pollution (Jones 1995; Lan et al. 2021). In other words, manufacturing agglomeration may hinder the improvement of green development efficiency. This raises the following research questions, which have rarely been studied in the existing literature: how does manufacturing agglomeration influence green development efficiency? Are there any patterns in this influence? The answers to these questions would not only enrich the connotation of industrial agglomeration theory in transition countries, but also provide new ideas for green development in China and sustainable development around the world.
Although the existing literature has not directly addressed the impact of manufacturing agglomeration on green development efficiency, related studies can be roughly summarized into three categories. (1) From the perspective of economic effects, starting from the 1860s and 1870s, European and American scholars have studied the impact of manufacturing agglomeration on economic growth based on theoretical analysis frameworks such as economies of scale and economies of scope, and have conducted a large number of empirical verifications (Acs et al. 1994;Porter and van der Linde 1995;Henderson 2003). (2) From the perspective of environmental effects, in the mid-to-late twentieth century, more and more researchers discussed the impact of manufacturing agglomeration on environmental quality based on public goods theory and externality principles (Verhoef and Nijkamp 2002;Ehrenfeld 2003;Zeng and Zhao 2009). (3) From the perspective of sustainable development, since the twenty-first century, some researchers started to investigate the comprehensive impact of manufacturing industry on economic environment (Yuan et al. 2020b). Yang et al. (2022) examined the impact of intelligent transformation of manufacturing industry on green innovation based on China's industry panel data, and the study found that the intelligent development of manufacturing industry is conducive to promoting green technology innovation.
Compared with previous research, this article attempts to integrate manufacturing agglomeration and green development efficiency into a unified analytic framework and to investigate the relationship between the two, both theoretically and empirically. In addition, the slack-based measure (SBM) model is used to measure green development efficiency under energy and environmental constraints from a multi-input and multi-output perspective, which may lead to more practical conclusions. Furthermore, the treaty ports opened in the Qing Dynasty from 1842 to 1909 are adopted as an instrumental variable for manufacturing agglomeration, which might effectively mitigate the estimation bias caused by endogeneity.
In the remainder of the article, we provide a brief literature review in "Literature review," conception framework in "Conception framework," and research design in "Research design." We present the empirical results in "Empirical results" and discuss the research conclusions and policy recommendations in "Discussion."

Measurement of green development efficiency
Green development efficiency is mainly measured by parametric and non-parametric analysis methods. The parametric approach is represented by stochastic frontier analysis (SFA), and the non-parametric approach by data envelopment analysis (DEA). The biggest advantage of SFA is that it separates the inefficiency term from the random error term, thereby ensuring the effectiveness and consistency of the estimation. However, SFA is only applicable to data sets with multiple inputs and a single output (Klein et al. 2020; Liu et al. 2020a). The significant advantage of the DEA method is that there is no need to set a specific production function in advance, which prevents possible deviation caused by misspecification. However, DEA cannot fully describe the production process (Charnes et al. 1978; Huang et al. 2021a). Since green development efficiency is a multi-input and multi-output process, it is unrealistic to forcibly adopt undesirable output indicators such as pollution emission as input variables. Previous research on the measurement of green development efficiency mainly focuses on the following two aspects. The first is the development of measurement methods. The application of measurement models can be roughly divided into four stages.
In the first stage, the undesirable output variable (pollutant emission) is used as an input variable. Mohtadi (1996) first introduced pollutant emission as an input variable into the traditional DEA model. Later, some scholars included both pollution emission and energy consumption as input variables into the DEA model for estimation (Korhonen and Luptacik 2004;Ramanathan 2005).
In the second stage, the directional distance function (DDF) model is applied. Chung et al. (1997) first proposed the DDF model in 1997, and successfully separated desirable output from undesirable output. Managi and Kumar (2009) further adopted the DDF model to set regional GDP as desirable output and sulfur dioxide and carbon dioxide emissions as undesirable output, measuring technological changes in 76 countries from 1963 to 2000. Lin and Benjamin (2017) then used non-radial DDF model to estimate the green development status of Chinese provincial regions.
In the third stage, the SBM model and the DDF model are integrated. Since the traditional DEA model cannot identify the slack variables of inefficient decision-making units (DMUs), the efficiency of the undesirable output cannot be accurately calculated. To solve the slack variable problems of desirable and undesirable outputs, Tone (2001) proposed the SBM model. The SBM model allows inputs and outputs to change by different proportions and does not require choosing an input or output orientation. Furthermore, Chen et al. (2019) and Yuan et al. (2020a) used the SBM-DDF model to measure China's green development efficiency at the provincial and municipal scales.
In the fourth stage, the Super SBM-DDF model is adopted. Although the SBM-DDF model has resolved the issues of directionality and slackness, it cannot distinguish between and rank efficient units. To this end, Tone (2002) proposed the Super SBM-DDF model, which can better rank DMUs and reflect the differences in green development efficiency more realistically. Moreover, Zhu et al. (2019b) used the Super SBM-DDF model to measure the green development efficiency of China's provinces.
Second is expansion of input and output variables.
Most studies seem to have a relatively narrow range of input and output selections when analyzing green development efficiency. They only take capital and labor into account for input variables, and economic output and industrial pollutant emission for output variables (Zhang et al. 2018). However, with the economic and social development, new elements such as energy, resources, and technology have become increasingly prominent, and the proportion of undesirable output such as PM2.5, chemical oxygen demand, and total ammonia nitrogen emission has also increased (Chen et al. 2019;Yuan and Xiang 2017). Wu et al. (2020) set capital, labor, and energy consumption as inputs, the actual GDP of each region as desirable output, and the industrial wastewater discharge, industrial waste gas discharge, and industrial solid waste discharge as undesirable outputs so as to examine the green development efficiency of China's provinces. Based on the same input and output variables, Jin et al. (2019) measured the green development efficiency of Chinese cities.

Manufacturing agglomeration and total factor productivity
Research on the impact of manufacturing agglomeration on total factor productivity is closely related to the research object of this article. Existing research shows that scholars disagree on the relationship between the two, and their positions can be roughly divided into three views (Table 1). First, manufacturing agglomeration can help to promote total factor productivity. Beeson (1987) and Ciccone (2002) used the instrumental variable method to analyze US and European samples and found that manufacturing agglomeration can significantly promote total factor productivity. Based on the dynamic panel regression method, Brülhart and Mathys (2008) and Hu et al. (2015) obtained similar results. Graham (2009) reached consistent conclusions based on panel data from 27 industries in the UK. Moreover, by using a structural model to analyze the panel data of 250,000 micro-enterprises in the Netherlands, Graham (2009) also found that manufacturing agglomeration is conducive to the improvement of total factor productivity.
Second, manufacturing agglomeration may inhibit the increase of total factor productivity. Gopinath et al. (2004) analyzed the panel data of 246 four-digit manufacturing enterprises and found that manufacturing agglomeration may hinder the improvement of total factor productivity. Similar conclusion was also gained from the Dutch city samples (Broersma and Oosterhaven 2009).
Third, there might be a nonlinear relationship between manufacturing agglomeration and total factor productivity. On the basis of the panel data of Chinese textile companies, Lin et al. (2011) found that there is a significant inverted "U" relationship between industrial agglomeration and total factor productivity; that is, when industrial agglomeration is lower than the critical value, industrial agglomeration can help to promote total factor productivity; when industrial agglomeration exceeds the critical value, however, industrial agglomeration would hinder the increase of total factor productivity.
In summary, although extensive research has been done on the relationship between manufacturing agglomeration and total factor productivity, some shortcomings remain. First, most studies only examine the impact of manufacturing agglomeration on economic development, and few have considered its comprehensive impact on both economic development and environmental pollution. Second, existing research measures the inputs and outputs of green development efficiency in a relatively homogeneous way, without considering the diverse features of green development efficiency in the new era. Third, there might be serious endogeneity problems between manufacturing agglomeration and economic development or environmental pollution, yet little of the literature has examined this problem.

Conception framework
Although Marshall (1920) first revealed the mechanism of manufacturing agglomeration from the effects of labor pool, intermediate input sharing, and knowledge and technology spillover, Duranton and Puga (2004) believe that Marshall only explores the micro-mechanism of agglomeration economy from the perspective of matching input factors of the enterprises. They systematically reveal the micro-mechanism of agglomeration economy from three dimensions, including sharing, matching, and learning. Therefore, this article tries to analyze how manufacturing agglomeration affects green development efficiency from these three mechanisms: sharing, matching, and learning.

Sharing
Through sharing mechanism, manufacturing agglomeration would affect green development efficiency by increasing returns to scale, sharing diversification, specialization profit, and joint risk prevention. (1) Increasing returns to scale. Large amounts of fixed construction costs are required to build infrastructure, and the marginal costs of consumers using these public goods are fixed. Only by ensuring maximum improvement of the commuting situation between consumers and quasi-public goods can the optimal allocation of public product resources be achieved. By virtue of the spatial proximity (Raiher 2019), the enterprises in the cluster can share indivisible public goods and facilities, which might reduce their marginal production costs. However, the expansion of production scales in the cluster might also increase the total energy consumption, thus reducing the efficiency level of regional green development.
(2) Sharing diversification and specialization profit. Under the background of complete market competition and constant returns to scale, comparative advantages in production and increasing returns to scale can be gained through diversified agglomeration (Duranton and Puga 2004). The diversified and specialized manufacturing agglomeration can not only meet various consumer needs, but also stimulate the enterprises' innovation through differentiated competition (Zeng et al. 2021). However, agglomeration of many enterprises in the same region might cause a sharp increase in factor needs and factor costs, which will increase the production costs of enterprises and reduce the efficiency level of regional green development (Brakman et al. 1996;Henderson 2003). (3) Joint risk prevention. Through specialized and diversified agglomeration, enterprises can obtain specialized and diversified profits as well. The close connection between enterprises and the nesting of industrial chains are conducive to reducing the operational risk of individual enterprises (Overman and Puga 2010). However, deepening cooperation between enterprises may result in further agglomeration of social resources and an increase in energy consumption, thereby inhibiting regional green development.

Matching
Manufacturing agglomeration can affect green development efficiency by increasing matching opportunities, improving matching quality, and alleviating the "lock-in" dilemma. (1) Increasing matching opportunities. Population gathers with the agglomeration of industries. The manufacturing industry is labor-intensive; thus, a large labor force gathers in the cluster, which increases the opportunities for employers to meet employees and thereby reduces the temporal and spatial costs of recruitment for enterprises (Berliant et al. 2006). Although the increase in matching opportunities guarantees sufficient labor for production, it also imposes more pressure on the environment. (2) Increasing matching quality. The agglomeration of manufacturing enterprises not only gathers a large population, but also attracts a large number of skilled workers. This not only meets the basic recruitment demands of enterprises and reduces the cost of employment, but also meets the recruitment needs for special positions and improves the quality of matching between enterprises and employees. (3) Alleviating the "lock-in" dilemma. The population agglomeration brought about by industrial agglomeration can significantly alleviate the problem of "lock-in" between enterprises and employees. The "lock-in" issue results from incomplete contracts and specific investment. It can be alleviated by agglomeration, which helps companies and employees to change partners at lower or even zero cost and to accumulate funds. It should be noted that once manufacturing agglomeration improves the matching opportunities and matching quality between enterprises and labor and alleviates the "lock-in" dilemma, the production efficiency of enterprises may be greatly improved. However, this improvement may be achieved at the cost of increased resource and energy consumption.

Learning
Manufacturing agglomeration can affect green development efficiency through knowledge generation, knowledge diffusion, and knowledge application. Organizational learning theory holds that an enterprise is a knowledge system composed of different kinds of knowledge. Acquiring and creating knowledge through organizational learning is the source of an enterprise's competitive advantage (Grant 1996). (1) Knowledge generation. Frequent contacts and exchanges between different enterprises and technicians in the cluster can stimulate innovative thinking and improve innovation capabilities, and thus generate new knowledge. This in turn affects the production and environmental protection behavior of the enterprises. (2) Knowledge diffusion. With the advantage of geographical proximity, enterprises in the cluster can quickly disseminate new technologies and new products through technical exchanges and cooperation. The rapid diffusion of knowledge can guide the enterprises in the cluster to learn and re-create. (3) Knowledge application. Innovation sustains the vitality of enterprises. The generation and diffusion of knowledge provide a suitable environment for imitation and learning by enterprises in the cluster. In a market environment of survival of the fittest, enterprises in the cluster tend to learn and innovate quickly. The production and application of a large number of innovative outcomes help to improve the efficiency level of regional green development. However, large-scale agglomeration of enterprises within a limited region can easily lead to vicious competition, which might shrink the margins of individual enterprises. In this case, the funds available for R&D would be reduced, lowering the probability of successful innovation, which is not conducive to regional green development.
To sum up, manufacturing agglomeration may have either an economic or an uneconomic agglomeration effect on green development efficiency. If the economic agglomeration effect is greater than the uneconomic agglomeration effect, manufacturing agglomeration is conducive to improving green development efficiency, and vice versa. Therefore, rigorous empirical methods are needed to investigate the impact of manufacturing agglomeration on green development efficiency.

Econometric models
According to the theoretical analysis, manufacturing agglomeration has a significant impact on green development efficiency. To properly identify the causal relationship between the two, this paper follows the approach adopted by Yuan et al. (2020b) and Feng et al. (2022) and sets a benchmark model between manufacturing agglomeration and green development efficiency:
$$GDE_{it} = \beta_1 MA\_LQ_{it} + \sum_{j} \beta_j X_{j,it} + u_i + v_t + \varepsilon_{it} \tag{1}$$
where $GDE_{it}$ represents the green development efficiency of region $i$ in period $t$, $MA\_LQ_{it}$ represents the degree of manufacturing agglomeration of region $i$ in period $t$, $X_{j,it}$ is a series of control variables, $\beta_1$ and $\beta_j$ are the parameters of the explanatory variable and the control variables, $u_i$ and $v_t$ represent individual fixed effects and temporal fixed effects respectively, and $\varepsilon_{it}$ is a random interference term.
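As an illustration of how a two-way fixed effects specification of this kind can be estimated, the following minimal sketch applies the within transformation to a balanced panel and then runs ordinary least squares. It is a sketch under stated assumptions, not the estimation routine used in this paper: the function names `two_way_within` and `fe_slope` are hypothetical, the panel is assumed to be balanced, and standard errors are not computed.

```python
import numpy as np

def two_way_within(values, unit, period):
    """Remove unit and period means (adding back the grand mean).

    For a balanced panel this wipes out additive individual and
    temporal fixed effects. `values` may be 1-D or 2-D (obs x vars).
    """
    values = np.asarray(values, dtype=float)
    unit = np.asarray(unit)
    period = np.asarray(period)
    unit_means = {u: values[unit == u].mean(axis=0) for u in np.unique(unit)}
    period_means = {t: values[period == t].mean(axis=0) for t in np.unique(period)}
    grand_mean = values.mean(axis=0)
    out = values.copy()
    for k in range(len(values)):
        out[k] = values[k] - unit_means[unit[k]] - period_means[period[k]] + grand_mean
    return out

def fe_slope(y, x, unit, period):
    """OLS slope(s) of y on x after the two-way within transformation."""
    y_t = two_way_within(y, unit, period)
    x_t = two_way_within(x, unit, period).reshape(len(y_t), -1)
    beta, *_ = np.linalg.lstsq(x_t, y_t, rcond=None)
    return beta
```

In this sketch, `x` would hold the agglomeration index together with the control variables, and `unit` and `period` would hold the city and year identifiers.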
This article decomposes green development efficiency into green development technology efficiency (GDTC) and green development technology progress (GDTP). Examining the impact of manufacturing agglomeration on the components of green development efficiency can provide an in-depth analysis of the complex mechanism between the two. The equations are:
$$GDTC_{it} = \beta_1 MA\_LQ_{it} + \sum_{j} \beta_j X_{j,it} + u_i + v_t + \varepsilon_{it} \tag{2}$$
$$GDTP_{it} = \beta_1 MA\_LQ_{it} + \sum_{j} \beta_j X_{j,it} + u_i + v_t + \varepsilon_{it} \tag{3}$$
This paper further extends the benchmark model spatially by including the spatially lagged terms of manufacturing agglomeration and green development efficiency to control for spatial correlation, and investigates the relationship between the two based on the spatial panel Durbin model (SPDM) (Song et al. 2020; Zhang et al. 2020). It is expressed as follows:
$$GDE_{it} = \rho W GDE_{it} + \beta_1 MA\_LQ_{it} + \sum_{j} \beta_j X_{j,it} + \theta_1 W MA\_LQ_{it} + \sum_{j} \theta_j W X_{j,it} + \mu_i + \nu_t + \varepsilon_{it} \tag{4}$$
where $W$ is the spatial weight matrix, $\beta_1$ and $\beta_j$ are the parameters to be estimated for the explanatory variable and each control variable, $\rho$ is the spatial lag coefficient of the dependent variable, $\theta_1$ and $\theta_j$ are the spatial lag coefficients of the explanatory variable and each control variable, $\mu_i$ and $\nu_t$ represent individual fixed effects and temporal fixed effects respectively, and $\varepsilon_{it}$ is the random interference term.
Since traditional point estimation methods cannot truly measure the impact of explanatory variables on the explained variables (Guliyev, 2020), some scholars have proposed partial differential decomposition (He et al. 2020;Lu et al. 2021). Therefore, this paper mainly analyzes the impact of manufacturing agglomeration on green development efficiency based on the results of partial differential decomposition.
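For a spatial Durbin specification, the partial differential decomposition is commonly computed from the matrix of partial derivatives $(I - \rho W)^{-1}(\beta I + \theta W)$: the direct effect is the average of its diagonal elements, the total effect is the average of its row sums, and the indirect (spillover) effect is the difference between the two. The sketch below illustrates this for a single explanatory variable. The function name `sdm_effects` and the scalar arguments are hypothetical, and the coefficient values in the usage note are made up for illustration rather than taken from this paper.

```python
import numpy as np

def sdm_effects(W, rho, beta, theta):
    """Average direct, indirect and total effects of one regressor
    in a spatial Durbin model with weight matrix W."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    identity = np.eye(n)
    # Matrix of partial derivatives dy_i / dx_j.
    S = np.linalg.solve(identity - rho * W, beta * identity + theta * W)
    direct = np.trace(S) / n   # mean own-region effect
    total = S.sum() / n        # mean row sum
    indirect = total - direct  # mean spillover effect
    return direct, indirect, total
```

With a row-standardized `W`, the total effect reduces to `(beta + theta) / (1 - rho)`; for example, `rho = 0.4`, `beta = 0.3`, and `theta = 0.2` give a total effect of about 0.83.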
The spatial weight matrix is the core element of spatial econometric analysis (Kopczewska et al. 2017). With the development of the modern economy and society, the factors that affect the spatial correlation between variables are no longer limited to geographical distance; the level of economic development, information technology, and other factors have become increasingly important. Therefore, this paper constructs an asymmetric spatial weight matrix ($W_1$) that considers both geographic distance and economic development level to quantify the spatial correlation between manufacturing agglomeration and green development efficiency (Yuan et al. 2020a). The equations are as follows:
$$W_1 = W_d \cdot \mathrm{diag}\left(\frac{\bar{Y}_1}{\bar{Y}}, \frac{\bar{Y}_2}{\bar{Y}}, \ldots, \frac{\bar{Y}_n}{\bar{Y}}\right), \quad \bar{Y}_i = \frac{1}{t_1 - t_0 + 1}\sum_{t=t_0}^{t_1} Y_{it}, \quad \bar{Y} = \frac{1}{n\,(t_1 - t_0 + 1)}\sum_{i=1}^{n}\sum_{t=t_0}^{t_1} Y_{it} \tag{5}$$
where $W_d$ is the geographic distance weight matrix, $Y_{it}$ is the per capita GDP of city $i$ in year $t$, $\bar{Y}_i$ is the average per capita GDP of city $i$ in the period between $t_0$ and $t_1$, $\bar{Y}$ is the average per capita GDP of all cities in the research period, and $n$ is the number of cities. Two different economic distance matrices, $W_2$ (Feng et al. 2019) and $W_3$ (Wang and He 2019), are used as substitutes in the robustness test.
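The construction of such an asymmetric weight matrix can be sketched as follows. The sketch assumes that the matrix is formed as the product of a geographic distance matrix and a diagonal matrix of relative per capita GDP, and it further assumes inverse-distance geographic weights, which the text does not specify. The function name `economic_geographic_weights` is hypothetical.

```python
import numpy as np

def economic_geographic_weights(dist, gdp_pc):
    """Asymmetric weight matrix from distances and per capita GDP.

    dist   : (n, n) pairwise distances between cities
    gdp_pc : (n, T) per capita GDP of each city in each year
    """
    dist = np.asarray(dist, dtype=float)
    gdp_pc = np.asarray(gdp_pc, dtype=float)
    n = dist.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    W_d = np.zeros((n, n))
    # Assumption: geographic weights are inverse distances, zero on the diagonal.
    W_d[off_diagonal] = 1.0 / dist[off_diagonal]
    city_mean = gdp_pc.mean(axis=1)  # average per capita GDP of each city
    overall_mean = gdp_pc.mean()     # average over all cities and years
    # Column j is scaled by the relative income of city j, so W is asymmetric.
    return W_d @ np.diag(city_mean / overall_mean)
```

Because each column is scaled by the relative income of the corresponding city, a richer city receives a larger weight in its neighbours' rows than it assigns to them, which is what makes the matrix asymmetric.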
New economic geography holds that manufacturing agglomeration is endogenous to economic growth, so it is subject to significant endogeneity problems (Yuan et al. 2020b). Because regions with high green development efficiency are generally underdeveloped cities, local officials seeking political promotion usually lower the threshold of environmental regulation to attract enterprises, thereby promoting local manufacturing agglomeration (Miao et al. 2019). To this end, this paper selects the treaty ports opened in the Qing Dynasty from 1842 to 1909 as the instrumental variable for manufacturing agglomeration to alleviate the estimation bias caused by endogeneity (Chen et al. 2018). In addition, to ensure that the instrumental variable varies over time in the panel data analysis, this paper multiplies the instrumental variable by the year dummy variables (Nunn and Qian 2014).
The treaty ports opened in the Qing Dynasty from 1842 to 1909 are selected as the instrumental variable for the endogenous variable for the following reasons.
(1) Relevance. The treaty ports forced to open in the Qing Dynasty from 1842 to 1909 were important industrial and commercial cities in modern China. Because of their convenient transportation, these cities became clusters of population and overseas investment. They are also the areas where modern business culture is most developed in China, which has an important influence on the formation of manufacturing agglomeration (Chen et al. 2018).
(2) Exogeneity. The treaty ports opened in the Qing Dynasty from 1842 to 1909 have a history of more than a hundred years, so they will not affect current green development efficiency.
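The instrumental variable strategy described above, a time-invariant treaty port dummy interacted with year dummies, can be sketched with a plain two-stage least squares routine. This is a simplified illustration, not the estimator used in this paper: it ignores the spatial terms and fixed effects, does not compute standard errors, and the function names `interact_with_years` and `two_stage_least_squares` are hypothetical.

```python
import numpy as np

def interact_with_years(z, years):
    """Turn a time-invariant dummy z into one instrument per year
    by multiplying it with each year dummy."""
    z = np.asarray(z, dtype=float)
    years = np.asarray(years)
    levels = np.unique(years)
    year_dummies = (years[:, None] == levels[None, :]).astype(float)
    return year_dummies * z[:, None]

def two_stage_least_squares(y, x_endog, Z, controls):
    """2SLS with one endogenous regressor.

    `controls` should contain a constant column. Returns the
    coefficient on the endogenous regressor first, then the controls.
    """
    y = np.asarray(y, dtype=float)
    x_endog = np.asarray(x_endog, dtype=float)
    first_stage_X = np.column_stack([Z, controls])
    gamma, *_ = np.linalg.lstsq(first_stage_X, x_endog, rcond=None)
    x_hat = first_stage_X @ gamma  # fitted values from the first stage
    second_stage_X = np.column_stack([x_hat, controls])
    coef, *_ = np.linalg.lstsq(second_stage_X, y, rcond=None)
    return coef
```

Here `x_endog` would be the agglomeration index, `z` the treaty port dummy of each observation's city, and `years` the observation year.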

Explained variable
This paper uses the super slack-based measure (super SBM) model to measure green development efficiency in the YREB (Long et al. 2020). The super SBM model aims to maximize desirable output such as GDP, given productive factors including labor and capital, while minimizing undesirable output such as industrial sulfur dioxide emissions (Tone 2002). In the model, $GDE^{*}$ represents the value of green development efficiency; $x$ is the input vector; $n$ is the number of decision-making units; $m$ is the number of input factors; $q_1$ and $q_2$ correspond to the desirable and undesirable outputs, respectively; and $s_i^{-}$, $s_r^{g+}$, and $s_t^{b-}$ represent the slack vectors of inputs, desirable outputs, and undesirable outputs, respectively. $GDE^{*} > 0$, and the larger the value of $GDE^{*}$, the higher the green development efficiency. Given the richness and complexity of the connotation of green development, this article follows Liang et al. (2019) and selects multi-input and multi-output variables to calculate green development efficiency (Table 2).
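For orientation, a commonly used statement of the super SBM program with undesirable outputs, written with the symbols defined above, is given below. It should be read as a standard textbook form under stated assumptions rather than as the exact specification of this paper: the evaluated unit is indexed by $k$, $\lambda_j$ are intensity weights (both notational assumptions), and $q_1$ and $q_2$ are taken to be the numbers of desirable and undesirable outputs.

```latex
GDE^{*} = \min \frac{1 + \dfrac{1}{m}\sum_{i=1}^{m} \dfrac{s_i^{-}}{x_{ik}}}
                    {1 - \dfrac{1}{q_1 + q_2}\left(\sum_{r=1}^{q_1} \dfrac{s_r^{g+}}{y^{g}_{rk}}
                      + \sum_{t=1}^{q_2} \dfrac{s_t^{b-}}{y^{b}_{tk}}\right)}
\quad \text{s.t.} \quad
\begin{cases}
\sum_{j=1,\, j \neq k}^{n} x_{ij}\lambda_j - s_i^{-} \le x_{ik}, & i = 1,\ldots,m \\
\sum_{j=1,\, j \neq k}^{n} y^{g}_{rj}\lambda_j + s_r^{g+} \ge y^{g}_{rk}, & r = 1,\ldots,q_1 \\
\sum_{j=1,\, j \neq k}^{n} y^{b}_{tj}\lambda_j - s_t^{b-} \le y^{b}_{tk}, & t = 1,\ldots,q_2 \\
\lambda_j \ge 0,\; s_i^{-} \ge 0,\; s_r^{g+} \ge 0,\; s_t^{b-} \ge 0
\end{cases}
```

Excluding unit $k$ from the reference set is what allows efficient units to obtain scores above 1 and therefore to be ranked.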

Core explanatory variable
Manufacturing agglomeration is the core explanatory variable of this article. This study mainly uses the location quotient index to describe urban manufacturing agglomeration levels, because the location quotient can better eliminate the endogenous impact of regional scale differences and can more accurately describe the distribution of China's urban manufacturing agglomeration (Qu et al. 2020). The formula is as follows:
$$MA\_LQ_{it} = \frac{cem_{it} / NEM_{it}}{\sum cem_{t} / \sum NEM_{t}}$$
where $MA\_LQ_{it}$ represents the degree of manufacturing agglomeration in city $i$ in year $t$, $cem_{it}$ represents the employment in manufacturing in city $i$ in year $t$, $NEM_{it}$ represents the total employment in city $i$ in year $t$, $\sum cem_{t}$ represents the total employment in manufacturing in all cities in year $t$, and $\sum NEM_{t}$ represents the total employment in all cities in year $t$.
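The index defined above is the ratio of a city's manufacturing employment share to the manufacturing employment share of all cities combined. A minimal sketch of the calculation follows; the function name `location_quotient` and the array layout (cities in rows, years in columns) are assumptions made for illustration.

```python
import numpy as np

def location_quotient(manuf_emp, total_emp):
    """Location quotient of manufacturing employment.

    Both inputs are arrays of shape (cities, years).
    """
    manuf_emp = np.asarray(manuf_emp, dtype=float)
    total_emp = np.asarray(total_emp, dtype=float)
    local_share = manuf_emp / total_emp
    # Share of manufacturing in all cities combined, year by year.
    overall_share = manuf_emp.sum(axis=0) / total_emp.sum(axis=0)
    return local_share / overall_share
```

A value above 1 indicates that manufacturing is more concentrated in the city than in the sample as a whole. For example, with two cities of equal total employment whose manufacturing shares are 30% and 10%, the overall share is 20% and the indices are 1.5 and 0.5.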

Control variables
New economic growth theory (NEG) holds that economic growth is affected not only by physical capital, but also by human capital (HC). HC can often affect economic growth (Balaguer and Cantavella 2018). Therefore, this article uses the average years of education to measure human capital. The industrial sector is the largest source of pollutant emissions (Shao et al. 2011), so rapid industrialization might lead to a sharp increase in energy consumption, which would intensify pollutant emissions (Lv et al. 2020). Therefore, this paper uses the proportion of the added value of the secondary industry in GDP to describe the industrialization level (IND).
Coal consumption accounts for more than half of China's total energy consumption and contributes one-third of CO2 emissions. This consumption structure, which relies heavily on fossil energy, may have a critical impact on China's economic transformation and environmental governance (Wei et al. 2020). Therefore, this paper uses the ratio of industrial electricity consumption to the electricity consumption of the whole society to describe the impact of the energy consumption structure (EC) on green development efficiency.
Environmental regulation (ER) increases the production cost of enterprises and induces fluctuations in the total factor productivity of manufacturing enterprises (Yu and Wang 2021). To reduce economic losses, the government is likely to deliberately lower the threshold of environmental regulation, which can easily lead to a "race to the bottom" between regions (Miao et al. 2019). Therefore, this article uses the environmental regulation index to describe the intensity of environmental regulation (Miao et al. 2019).
Technological innovation (TI) is not only an important driver of economic growth, but also an important means of environmental protection (Miao et al. 2019). Therefore, this paper measures technological innovation by the logarithm of the number of patents granted per 10,000 people and includes it in the model as a control variable.
The increase in population density (PD) would not only cause an increase in resource and energy demand, but also cause further environmental damage due to unreasonable development methods. Therefore, this paper uses the ratio of the total urban population at the end of the year to the area of the administrative region to measure population density (Qiu et al. 2019).

Study area
In September 2014, the State Council clarified the geographic scope of the YREB in the Guiding Opinions on Relying on the Golden Waterway to Promote the Development of the YREB. The YREB covers 11 provinces and cities, namely Shanghai, Jiangsu, Zhejiang, Anhui, Jiangxi, Hubei, Hunan, Chongqing, Sichuan, Yunnan, and Guizhou, with an area of about 2.05 million square kilometers, and accounts for more than 40% of China's population and GDP. The YREB is considered the growth area with the greatest potential in the new era and as important as half of the country (Fig. 1).

Data source and description
The sample in this article is panel data of 110 cities in the YREB from 2003 to 2016, in which the economic variables are collected from the China City Statistical Yearbook and the meteorological factors from the China Meteorological Data Center (http://data.cma.cn/). The data are only updated to 2016 because China's city-level total fixed asset investment and industrial output above designated size have not yet been released after 2018, which makes it impossible to estimate the key variables. Since the samples from 2003 to 2016 already cover the main stages of the development of manufacturing agglomeration in the Yangtze River Economic Belt, and the impact and mechanism of manufacturing agglomeration on green development efficiency is a long-term problem, the lack of three years' data will not affect the core conclusions of this paper. To address statistical errors, this article supplements and adjusts individual missing values and outliers in the data set using interpolation. To eliminate the impact of inflation, all price variables are adjusted with the GDP deflator index method, taking 2003 as the base period. The boxplot of the variables is shown in Fig. 2.
Fig. 1 The geographical scope of the YREB
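The two data preparation steps described in this subsection, filling individual gaps by interpolation and converting price variables to base-period prices with a deflator, can be sketched as follows. The text does not state which interpolation scheme is used, so linear interpolation is an assumption here, and the function names `interpolate_missing` and `to_base_year_prices` are hypothetical.

```python
import numpy as np

def interpolate_missing(series):
    """Fill NaN entries of a yearly series by linear interpolation
    between the nearest observed years (assumed scheme)."""
    series = np.asarray(series, dtype=float)
    index = np.arange(len(series))
    known = ~np.isnan(series)
    return np.interp(index, index[known], series[known])

def to_base_year_prices(nominal, deflator, base_index=0):
    """Convert nominal values to base-period prices.

    `deflator` is a price index for the same years; `base_index`
    is the position of the base period (2003 if the series starts there).
    """
    nominal = np.asarray(nominal, dtype=float)
    deflator = np.asarray(deflator, dtype=float)
    return nominal * deflator[base_index] / deflator
```

For instance, a nominal series that grows exactly in line with the price index is constant once expressed in base-period prices.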
According to Fig. 2, MA_LQ and IND basically conform to the normal distribution, and there are no serious outliers. However, the mean values of GDE, HC, ES, TI, and PD are significantly larger than their medians: the data are mainly concentrated in low-value intervals and show obvious positive skewness. In addition, the mean values of the two variables ES and ER are smaller than their medians, and the data distribution has an obvious heavy tail and negative skewness. Therefore, to reduce the possible effect of heteroscedasticity, this paper takes the logarithm of each variable.
Fig. 2 Boxplot of variables
To avoid biased regression results caused by collinearity between variables, this paper applies a multicollinearity test and a correlation coefficient test to the main variables. According to the multicollinearity test results in Table 3, the minimum VIF is 1.19 and the maximum is 2.68, both of which are lower than the critical value of 10, indicating that there is no serious collinearity among the explanatory variables. In addition, the correlation coefficient test shows that the correlation coefficients between the explanatory variables range from −0.0044 to 0.5972, and most of them pass the significance test at the 10% level. This shows that the explanatory variables are neither excessively correlated nor entirely unrelated. Therefore, multicollinearity is not a concern in the regression analysis that follows.
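As a hedged illustration of the VIF check, the sketch below uses the standard identity that the VIF of regressor j equals the j-th diagonal element of the inverse of the regressors' correlation matrix, which is equivalent to 1 / (1 − R_j²). The helper names and the sample columns are assumptions for illustration; they are not the paper's variables or software.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    norms = [sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Illustrative regressors (made-up values, not the paper's data).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
x3 = [3.0, 1.0, 2.0, 6.0, 4.0, 5.0]
vifs = vif([x1, x2, x3])
# Rule of thumb used in the text: a VIF above 10 signals serious collinearity.
has_serious_collinearity = any(v > 10 for v in vifs)
```

A VIF can never fall below 1, so values such as the reported range of 1.19 to 2.68 sit close to the lower bound of the scale.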

General description of green development efficiency
Based on the Super SBM-DEA model, this paper measures the green development efficiency of the YREB and the upper, middle, and lower reaches of the YREB from 2003 to 2016. It can be seen from Fig. 3 that the green development efficiency presents the following changes. First, during the study period, the green development efficiency of the YREB and the upper, middle, and lower reaches show a fluctuating upward trend. Specifically, the average annual growth rate of green development efficiency in the YREB is 3.81%, and the average

Analysis of the imbalance of green development efficiency
The YREB traverses the West, Middle and East China. Since their internal natural conditions and economic development levels are quite different, the green development efficiency of each region tends to be distinct. According to the standards of the "Guiding Opinions of the State Council on Promoting the Development of the YREB Relying on Golden Waterways," this article defines the upper reaches of the YREB as 33 cities in 4 provinces of Yunnan, Guizhou, Sichuan and Chongqing, the middle reaches as 36 cities in 3 provinces of Hubei, Hunan and Jiangxi, and lower reaches as 41 cities in 4 provinces of Jiangsu, Zhejiang, Shanghai, and Anhui. Besides, the Theil index and its decomposition method are used to measure the overall and internal gaps in green development efficiency of the YREB. The Theil index was first used to measure income inequality between regions (Theil and Uribe 1967). This method has good decomposition qualities, so it is often used to analyze regional differences and the sources of these differences.
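The decomposition property mentioned above can be sketched in a few lines. The example assumes the standard Theil T index with equal weights per city; the paper's exact weighting is not restated in this section, so this is an illustration rather than the authors' specification, and the efficiency scores are made up.

```python
from math import log


def theil(values):
    """Theil T index: mean of (y / mu) * ln(y / mu); values must be positive."""
    mu = sum(values) / len(values)
    return sum((y / mu) * log(y / mu) for y in values) / len(values)


def theil_decomposition(groups):
    """Split the overall Theil index into within-group and between-group parts.

    groups : dict mapping region name -> list of positive efficiency scores
    """
    all_values = [y for vals in groups.values() for y in vals]
    n = len(all_values)
    mu = sum(all_values) / n
    within, between = 0.0, 0.0
    for vals in groups.values():
        mu_g = sum(vals) / len(vals)
        share = (len(vals) / n) * (mu_g / mu)  # region's share of the total
        within += share * theil(vals)          # intra-regional component
        between += share * log(mu_g / mu)      # inter-regional component
    return {"total": theil(all_values), "within": within, "between": between}


# Illustrative efficiency scores (assumed values, not the paper's results).
regions = {
    "upper": [0.2, 0.9, 1.4],
    "middle": [0.5, 0.7, 0.8],
    "lower": [0.8, 0.9, 1.0],
}
result = theil_decomposition(regions)
within_share = result["within"] / result["total"]
```

The within-group and between-group parts add up exactly to the overall index, which is what allows contribution rates of intra-regional and inter-regional differences to be reported.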

The overall gap in green development efficiency
It can be seen from Fig. 4 and Table 4 that the overall gap in green development efficiency across the YREB is large and tends to widen further. This widening is driven mainly by intra-regional differences, with an average annual contribution of 96.80%, while the inter-regional differences are rather small.

Intra-regional differences in green development efficiency
According to Fig. 5 and Table 4, the largest gap in green development efficiency occurs in the upper reaches of the YREB, followed by the middle and lower reaches. The annual average value of the regional gap is 0.646 in the upper reaches, much higher than 0.314 in the middle reaches and 0.203 in the lower reaches. It is roughly twice the value of the middle reaches and three times that of the lower reaches, indicating that green development efficiency within the YREB is extremely uneven and that the gap in the upper reaches, with a contribution rate as high as 52.95%, mainly determines the tendency of green development efficiency across the entire YREB. The three major regions present a "decline-rise-decline" pattern. From 2004 to 2006, the gap in the upper reaches drops from 0.535 to 0.124; from 2006 to 2009, it rapidly increases to 0.954; and since 2009, it gradually declines. The gaps in the middle and lower reaches remain at a low level of about 0.2 before 2011, but after 2011 the gap in the middle reaches expands rapidly and becomes much larger than that of the lower reaches. At the end of 2016, the regional gap in green development efficiency in the middle reaches is as high as 0.544, almost five times that of the lower reaches.

Fig. 4 Overall gap and decomposition of green development efficiency in the YREB

Inter-regional differences in green development efficiency
Figure 6 reports the inter-regional differences in green development efficiency in the upper, middle, and lower reaches of the YREB. Overall, during the study period, the differences among these regions show a trend of "shrink-expand-shrink." Specifically, from 2004 to 2006, the inter-regional differences in the upper-lower reaches and upper-middle reaches show a shrinking trend, while the inter-regional differences in the middle-lower reaches show an expanding trend. From 2006 to 2011, the inter-regional differences show an overall rapid expanding trend. However, after 2012, the expansion rate of inter-regional differences slows down significantly, even showing a downward trend. In terms of maximum and minimum values, middle-lower reaches are 0.472 and 0.216, upper-lower reaches are 0.547 and 0.227, and upper-middle reaches are 0.566 and 0.259, respectively. It can be seen that the inter-regional differences in the upper-middle and upper-lower reaches are more prominent, significantly higher than the inter-regional differences in the middle-lower reaches. In terms of average annual growth rate, the inter-regional difference in the middle-lower reaches has the highest average annual growth rate of 5.59%, significantly higher than the upper-lower reaches (3.99%) and upper-middle reaches (4.19%).
As far as the coefficient of variation is concerned, the inter-regional difference in the middle-lower reaches is more dramatic, and the coefficient of variation is 0.385, which is higher than that of upper-lower reaches (0.235) and upper-middle reaches (0.227).

Impact of manufacturing agglomeration on green development efficiency
Before examining the impact of manufacturing agglomeration on green development efficiency, this paper carries out the relevant tests on the model and the instrumental variables (Table 5). First, following the method proposed by Davidson and MacKinnon (1993), the study tests whether the model has an endogeneity problem. The first-stage estimation results show that, controlling for individual and time fixed effects, the Davidson-MacKinnon test rejects the hypothesis of no endogeneity at the 1% significance level. Second, the Sargan test is not significant, that is, the null hypothesis that the overidentifying restrictions hold cannot be rejected, which supports the validity of the instrumental variables selected in this paper. Finally, there is a significant positive correlation between the instrumental variables and manufacturing agglomeration, which generally passes the significance test at the 1% level, indicating that the instrumental variables satisfy the relevance condition. In summary, the instrumental variables selected in this paper appear to be reasonable.
Table 4 The Theil index of green development efficiency of the YREB and its decomposition contribution rate
To verify the accuracy of the 2SLS method, this article also uses OLS, FE, and FGLS to investigate the relationship between manufacturing agglomeration and green development efficiency. The results in Table 6 show that under the OLS, FE, and FGLS methods, the regression coefficients of manufacturing agglomeration are always significantly negative and pass the significance test at the 5% and 1% levels. This shows that manufacturing agglomeration inhibits the improvement of green development efficiency. Under 2SLS, there is also a significant negative relationship between manufacturing agglomeration and green development efficiency, which passes the significance test at the 5% level. This is consistent with the OLS regression results, indicating that manufacturing agglomeration is not conducive to the improvement of green development efficiency. In addition, the absolute value of the 2SLS coefficient is significantly greater than that of the OLS coefficient. This shows that if the endogeneity of manufacturing agglomeration is not controlled, the regression coefficient will be biased toward zero, which leads to an underestimation of the inhibitory effect of manufacturing agglomeration on green development efficiency.
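The comparison between OLS and 2SLS can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions: one endogenous regressor, one instrument, no fixed effects or controls, and simulated data with a true coefficient of −0.5. It does not reproduce the paper's model, instruments, or estimates; all variable names are hypothetical.

```python
import random


def slope_ols(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_2sls(z, x, y):
    """2SLS slope with one instrument z for one endogenous regressor x."""
    # First stage: project x on z and keep the fitted values.
    mz, mx = sum(z) / len(z), sum(x) / len(x)
    pi = slope_ols(z, x)
    x_hat = [mx + pi * (zi - mz) for zi in z]
    # Second stage: regress y on the first-stage fitted values.
    return slope_ols(x_hat, y)


random.seed(0)
n, true_beta = 5000, -0.5
z = [random.gauss(0, 1) for _ in range(n)]   # instrument, unrelated to the confounder
u = [random.gauss(0, 1) for _ in range(n)]   # unobserved confounder
x = [zi + ui for zi, ui in zip(z, u)]        # endogenous regressor
y = [true_beta * xi + 0.5 * ui + random.gauss(0, 0.5) for xi, ui in zip(x, u)]

beta_ols = slope_ols(x, y)          # pulled toward zero by the confounder
beta_2sls = slope_2sls(z, x, y)     # close to the true coefficient
```

In this simulated setting the confounder raises both the regressor and the outcome, so the OLS slope is less negative than the true value while the 2SLS slope recovers it, which mirrors the direction of bias described in the text.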

Efficiency decomposition analysis
To further clarify why manufacturing agglomeration affects green development efficiency, this paper decomposes green development efficiency into green development technology efficiency (GDTC) and green development technology progress (GDTP) and uses 2SLS to re-estimate the model. According to column (1) of Table 7, manufacturing agglomeration has a positive effect on GDTC, but the effect is not statistically significant. This shows that manufacturing agglomeration cannot promote GDTC effectively. From column (2) of Table 7, we can see that the regression coefficient of manufacturing agglomeration on GDTP is −0.343 and is significant at the 10% level. This shows that manufacturing agglomeration mainly hinders the improvement of green development efficiency by inhibiting GDTP.

Regional heterogeneity analysis
To further analyze the impact of regional heterogeneity on the relationship between manufacturing agglomeration and green development efficiency, this paper splits the YREB sample into 429 observations for 33 cities in the upper reaches and 1001 observations for 77 cities in the middle and lower reaches, and re-estimates model (4) by 2SLS. It can be seen from Table 8 that the regression coefficient of manufacturing agglomeration in the upper reaches of the YREB is positive and passes the significance test at the 10% level, indicating that manufacturing agglomeration in the upper reaches is conducive to improving green development efficiency. In contrast, the regression coefficient of manufacturing agglomeration in the middle and lower reaches is significantly negative, indicating that deepening manufacturing agglomeration in the middle and lower reaches would hinder the improvement of green development efficiency.

Spatial spillover analysis
To analyze the spatial spillover effect of manufacturing agglomeration on green development efficiency, this article uses the SPDM to investigate the relationship between the two. According to the Wald and LR tests (Table 9), the SPDM cannot be reduced to a spatial panel lag model (SPLM) or a spatial panel error model (SPEM), indicating that it is reasonable to choose the SPDM.
The regression results show that the coefficients of the spatial lag term under the three spatial weight matrices are always significantly negative, indicating that the improvement of local green development efficiency would be constrained by green development efficiency of adjacent regions. Under different spatial weight matrices, the estimated coefficients of manufacturing agglomeration are significantly negative, and all have passed the significance test at the level of 1%. This illustrates that deepening manufacturing agglomeration would seriously hinder the improvement of green development efficiency, which is consistent with the estimation results of non-spatial panel model. However, the coefficients of the spatial lag term of the manufacturing agglomeration under the three spatial weight matrices are significantly positive, which indicates that the manufacturing agglomeration can significantly promote the improvement of green development efficiency in adjacent areas.
It can be seen from the spatial effect decomposition results that the regression coefficients of the direct effects of manufacturing agglomeration under the three spatial weight matrices are significantly negative, and all of them have passed the significance test at the level of 1%, indicating that deepening manufacturing agglomeration would reduce green development efficiency. The spatial spillover effect coefficients of manufacturing agglomeration under different spatial weight matrices are significantly positive, and all are significant at the confidence level of 1%, indicating that increasing the level of local manufacturing agglomeration can help promote green development efficiency in adjacent areas.
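For readers unfamiliar with the effect decomposition, the following is a sketch of the standard partial-derivative approach for a spatial Durbin model. The notation is an assumption, since the model equation is not restated in this section: $Y$ is green development efficiency, $X$ is manufacturing agglomeration, $W$ is an $n \times n$ spatial weight matrix, $\rho$ is the coefficient of the spatial lag term, $\beta$ is the local coefficient of agglomeration, $\theta$ is the coefficient of its spatial lag, and control variables and fixed effects are omitted for brevity.

```latex
Y = \rho W Y + \beta X + \theta W X + \varepsilon
\quad\Longrightarrow\quad
\frac{\partial Y}{\partial X^{\top}} = (I_n - \rho W)^{-1}\left(\beta I_n + \theta W\right) \equiv S(W)

\text{Direct effect} = \frac{1}{n}\operatorname{tr} S(W), \qquad
\text{Total effect} = \frac{1}{n}\,\iota_n^{\top} S(W)\,\iota_n, \qquad
\text{Indirect (spillover) effect} = \text{Total effect} - \text{Direct effect}
```

Under this reading, the direct effect averages the own-city responses on the diagonal of $S(W)$, and the indirect effect averages the cross-city responses off the diagonal. A negative direct effect combined with a positive indirect effect is the pattern reported in the decomposition results above.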

Robustness test
Table 8 Estimated results in regional heterogeneity Note: The data in brackets are t-statistics; ***, ** and * indicate significance at the confidence levels of 1%, 5%, and 10%, respectively
Table 9 Estimated results in spatial spillover effect Note: The data in brackets are t-statistics; ***, ** and * indicate significance at the confidence levels of 1%, 5%, and 10%, respectively
(1) Moving average. To overcome the regression bias caused by excessive data fluctuations, this article performs a 3-year moving average on the sample observations, and then re-estimates the model by the 2SLS method. From column (1) of Table 10, we can see that after controlling the data fluctuation, the regression coefficient of manufacturing agglomeration is still significantly negative and has passed the significance test at the level of 1%, indicating that the main conclusions of this paper are relatively robust.
(2) Increasing control variables. Since changes in meteorological factors have an important impact on pollutant emissions in a region, this paper further controls the annual average precipitation (AAP), average wind speed (AWS), average air pressure (APR), sunshine hours (SUH), relative humidity (RHU), and other meteorological factors, and performs logarithmization on each variable. It can be seen from column (2) of Table 10 that after considering the interference of meteorological factors, the regression coefficient of manufacturing agglomeration is still significantly negative and is significant at the confidence level of 10%, indicating that the inhibitory effect of manufacturing agglomeration on green development efficiency would remain the same regardless of the changes in weather conditions.
(3) Changing the regression method. To avoid bias caused by the regression method, this section combines GMM with the instrumental variable method to re-verify the reliability of the main conclusions of this article. It can be seen from column (3) of Table 10 that after changing the regression method, manufacturing agglomeration still hinders the improvement of green development efficiency. This again verifies the reliability of the conclusions of this article.
(4) Replacing instrumental variables. To validate the robustness of the conclusion, this article uses "whether each Chinese city has railways or not in 1933" (dum_rail) as an instrumental variable for manufacturing agglomeration (Lin and Tan 2019). Based on this, this paper once again uses 2SLS to estimate the model. According to column (4) of Table 10, there is still a significant negative relationship between manufacturing agglomeration and green development efficiency. This shows that the main conclusions of this article would not change significantly due to different instrument variables.
(5) Replacing core explanatory variables. Building on the replacement of instrumental variables, this paper further uses HHI to re-measure the manufacturing agglomeration level (Mitchell 2019). It can be seen from column (5) of Table 10 that manufacturing agglomeration significantly hinders green development efficiency. This shows that the main conclusions of this article would not change significantly due to different measurement methods of manufacturing agglomeration.
(6) Replacing explained variables. The selection of input variables and output variables has an important impact on the measurement of green development efficiency. Therefore, this paper refers to the approaches adopted by Cárdenas et al. (2018) and Jin et al. (2019), and reselects input variables and output variables to measure green development efficiency. Specifically, the product of the average years of education and the total urban employed population is selected to describe the labor input. The perpetual inventory method is used to calculate the urban fixed capital stock to measure the capital input. To describe the energy input, urban energy consumption is estimated from the provincial total energy consumption, weighted by the ratio of the city's average annual electricity consumption of the whole society to that of the province to which it belongs. The real GDP of the city is selected as the desirable output, and the urban industrial sulfur dioxide emission, industrial wastewater emission, industrial soot emission, industrial nitrogen oxide emission, and the annual average concentration of inhalable fine particulate matter are selected as undesirable outputs. According to column (6) of Table 10, the inhibitory effect of manufacturing agglomeration on green development efficiency is not changed by the measurement method of the explained variables.
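Two of the data-preparation steps in the robustness tests above, the 3-year moving average in (1) and the electricity-weighted allocation of provincial energy consumption in (6), can be sketched as follows. Both functions are hypothetical illustrations: the text does not say whether the moving average is trailing or centered, so a trailing window is assumed here, and all numbers are made up.

```python
def trailing_moving_average(series, window=3):
    """Trailing moving average; the first window - 1 periods are dropped."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def allocate_provincial_energy(province_energy, province_electricity, city_electricity):
    """Estimate city energy use from the provincial total.

    Each city receives the provincial energy total multiplied by the ratio of
    its electricity consumption to the province's electricity consumption.
    """
    return {city: province_energy * kwh / province_electricity
            for city, kwh in city_electricity.items()}


# Illustrative inputs only (assumed values, not the paper's data).
smoothed = trailing_moving_average([1.0, 2.0, 3.0, 4.0, 5.0])
city_energy = allocate_provincial_energy(
    province_energy=1000.0,
    province_electricity=500.0,
    city_electricity={"city_a": 100.0, "city_b": 50.0},
)
```

With a 3-year window, a five-year series yields three smoothed values, so the estimation sample shrinks by two years per city. In the allocation step, a city that uses one fifth of the province's electricity is assigned one fifth of its energy consumption.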

Research findings
(1) This study finds that the green development efficiency of the YREB is spatially unbalanced, and that the gap in green development efficiency in the upper reaches of the YREB is the main reason for the expansion of the overall gap. This is consistent with existing research findings (Zhu et al. 2019a). Huang et al. (2021a, b) used the cutting-edge DEA model to measure green development efficiency in the provinces of China and found significant regional differences in green development, with green development efficiency ranked in descending order as East > Central > West and South > North. Qiu et al. (2021) investigated the impact of low-carbon city construction on urban green development efficiency based on panel data of 284 cities in China, and found that the green development efficiency in the eastern coastal areas of China is significantly greater than 1, while the green development efficiency in the central and western regions is significantly less than 1. Compared with the existing literature, this paper not only identifies the general spatial distribution characteristics of green development efficiency, but also identifies the source and contribution of the green development efficiency gap with the help of the Theil index decomposition. At the same time, green development efficiency is measured more scientifically from the perspective of multiple inputs and multiple outputs. In terms of input, resource input is considered in addition to labor, capital, and energy. In terms of output, technical output and ecological benefits are added to the model alongside economic output, making the results more in line with regional reality.
(2) Insufficient progress in green technology is an important reason for the hindrance of manufacturing agglomeration on the improvement of green development efficiency in the YREB. The existing literature mainly analyzed green development from the aspects of environmental regulation (Zhuo et al. 2022), technological innovation (Liao and Li 2022), energy endowment (Yu et al. 2022), and official characteristics (Anhua Zhou 2021). The influencing factors of green development efficiency, however, are rarely investigated from the perspective of industrial agglomeration. In recent years, although some papers have begun to study the impact and mechanism of industrial agglomeration on green development (Feng et al. 2022;Yuan et al. 2022), most of the literature focuses on the financial industry, and rarely examines the relationship between manufacturing agglomeration and green development. Compared with the existing literature, this paper investigates the impact of manufacturing agglomeration on green development efficiency from both theoretical and empirical perspectives, providing rich empirical evidence for formulating win-win policies for synergistically promoting manufacturing agglomeration and green development.
(3) There are significant regional differences in the impact of manufacturing agglomeration on green development efficiency. Yuan et al. (2020a) empirically tested the effect of financial agglomeration on green development level based on the panel data of Chinese cities and found that there are significant regional differences in the impact of financial agglomeration on green development efficiency.
However, that study has the shortcoming that it only examines the overall impact of financial agglomeration on the green development level of Chinese cities from a global perspective and fails to reveal how the impact differs in typical regions. Therefore, the policy suggestions it puts forward are not targeted enough. The Chinese government's strategic positioning of the YREB is as an inland river economic belt with global influence, a coordinated development belt for interaction and cooperation among the East, Central, and West, a belt for opening up to the outside world, and a pioneering demonstration belt for ecological civilization construction. Based on a local analysis perspective, this paper takes the YREB as a research sample to analyze the impact of manufacturing agglomeration on green development efficiency. The study results can provide a scientific basis for formulating more targeted policies.
(4) Manufacturing agglomeration in the YREB is conducive to improving the green development efficiency in adjacent areas. The reason is that manufacturing agglomeration in central cities will improve the green development efficiency in adjacent areas through demonstration effects and technology spillovers (Yuan et al. 2020a). Existing literature has not paid enough attention to the spatial spillover effect of green development efficiency. Zhang et al. (2021) investigated the impact of carbon emission trading scheme (ETS) on green development efficiency based on Chinese industrial enterprise data and provincial panel data and found that ETS can significantly improve green development efficiency. Zhuo et al. (2022) also suggested similar conclusions at the city level of China. However, none of the above studies have controlled the spatial spillover effect of green development efficiency in the model, which may lead to biased results.

Limitations and future directions
This research analyzes the impact, heterogeneity, and mechanism of manufacturing agglomeration on green development efficiency in the YREB, but some shortcomings remain.
(1) This study has not calculated the optimal scale of China's manufacturing agglomeration, and has not included samples from other countries for comparative analysis. Future studies can refer to the approach adopted by Feng et al. (2022), which uses spatial econometric model and the panel threshold model to calculate the optimal scale of manufacturing agglomeration. In addition, comparative analysis of the relationship between manufacturing agglomeration and green development efficiency in other major economies in the world can be conducted, thus providing richer empirical evidence for promoting global sustainable development.
(2) Limited by data availability, this paper does not empirically examine the three mechanisms of sharing, matching, and learning. When the data is available, future studies can measure the sharing, matching, and learning effects from the micro-enterprise level, so as to identify the effects of the three mechanisms more scientifically. This will help to improve the understanding of how manufacturing agglomeration affects green development efficiency through the three mechanisms.

Conclusion and policy recommendations
Based on a Chinese case, this article reveals how manufacturing agglomeration affects green development efficiency from both theoretical and empirical dimensions, and measures the quantitative relationship between the two. The results show the following. First, the overall gap of green development efficiency in the YREB is relatively large, and the regional gap of the upper reaches contributes an average annual rate of 52.95% to the entire regional gap in the YREB. Second, manufacturing agglomeration in the YREB inhibits the improvement of green development efficiency, but it has no obvious effect on GDTC. Third, manufacturing agglomeration can promote green development efficiency in the upper reaches of the YREB but hinders the improvement of green development efficiency in the middle and lower reaches. Fourth, manufacturing agglomeration in the YREB also promotes green development efficiency in adjacent regions. The above analysis can provide new perspectives and methods for the green transformation in the YREB and around the world. Based on this analysis, this article proposes the following suggestions regarding the government's policy design for manufacturing agglomeration to promote green development efficiency:
(1) To promote global sustainable development, decision-making departments in transitional countries such as China should focus on improving the efficiency of regional green development and promoting coordinated regional development. Since the regional gap of the upper reaches contributes an average annual rate of more than 50% to the entire regional gap in the YREB, the problem of green development efficiency in the upper reaches must be solved first. As an ecologically vulnerable area, the upper reaches shoulder the important mission of serving as an ecological barrier.
To this end, priority must be given to the development of ecological and environmental protection industries to meet the needs of economic and social development. Second, since the middle and lower reaches and all the other regions of the country enjoy the overflow of ecological welfare, the upper reaches should receive ecological compensation. Also, enterprises should attach great importance to the impact of regional heterogeneity on enterprise location selection and production activities. Enterprises should fully consider the huge differences in economic, social, and natural conditions in the upper, middle, and lower reaches of the YREB when selecting sites and arranging production activities. On this basis, advantageous industries should be properly distributed, regional comparative advantages should be maximized, and the sustainable development of enterprises should be achieved.
(2) Decision-making departments in transitional countries such as China should encourage all market players to actively participate in the R&D and application of green technologies. The results of this study show that manufacturing agglomeration hinders the technological progress of green development, which is an important reason for the reduction of green development efficiency. Therefore, on the one hand, local governments should promote the transformation and upgrading of regional manufacturing by strengthening environmental regulations, and guiding and encouraging increased investment in R&D. On the other hand, decision-making departments should encourage social forces and enterprises to participate in green technological innovation, and guide enterprises to conduct green innovation activities in accordance with market demand. Also, enterprises should focus on promoting high-quality development through green technological innovation. On the one hand, enterprises should actively integrate into the green product market, grasp the potential development of the future market, establish the concept of green production and operation, and strive to improve the competitiveness of green products. On the other hand, enterprises should actively participate in green technological R&D or introduce advanced green technology, thereby improving green technology and maintaining market share of their products.
(3) Decision-making departments in transitional countries such as China should formulate differentiated agglomeration policies. Since manufacturing agglomeration can promote green development efficiency in the upper reaches yet inhibit green development efficiency in the middle and lower reaches of the YREB, efforts should be taken in the upper reaches to increase the quality of manufacturing agglomeration, prevent excessive agglomeration, and maximize the economic effect of agglomeration. For the middle and lower reaches, it is necessary to focus on the upgrading and transformation of local manufacturing agglomeration. On the one hand, we must strictly implement environmental regulations and policies, eliminate several backward industries, and promote the survival of the fittest. On the other hand, we must introduce a group of green manufacturing enterprises to promote the transformation and upgrading of the local manufacturing industry. Also, enterprises should focus on giving full play to the advantages of industrial agglomeration and enhancing their core competitiveness. On the one hand, enterprises should utilize their comparative advantages, actively integrate into industrial agglomeration regions, and extend the enterprise value chain and enhance competitiveness through cooperation with enterprises in the cluster. On the other hand, it is necessary to fully investigate the economic, social, and natural conditions of the region, locate in a region that matches the enterprise well, and avoid the adverse impact of regional disadvantages on the development of the enterprise.
(4) Decision-making departments in transitional countries such as China should pay attention to the spatial spillover effect of manufacturing agglomeration, and promote the overall improvement of regional green development efficiency. The increase in the local manufacturing agglomeration can drive the improvement of green development efficiency in adjacent areas. Therefore, we must abandon the mindset of local protectionism, break down the administrative barriers, and allow the free flow of resources, thereby reducing the loss of spatial spillover due to space division. Besides, regional cooperation institutions and cooperation systems should be established to develop urban agglomeration economies and cross-regional economies. Also, enterprises should pay attention to exerting the spatial linkage effect, building spatial cooperation network, and enhancing enterprise competitiveness. On the one hand, it is necessary to focus on cultivating the enterprise cooperation network, especially the cooperation with enterprises in the peripheral areas of the city. When appropriate, branches can be set up in the peripheral areas of the city, so as to optimize the spatial allocation of enterprise resources. On the other hand, it is necessary to pay attention to dynamic adjustment of enterprise development strategy. When the agglomeration effect of enterprises in the central city is greater than the crowding effect, the focus of enterprise development should be placed in the central city. Once the crowding effect exceeds the agglomeration effect, the focus of enterprise development should be adjusted in time.
Author contribution
This collaborative work was carried out by all the authors. Huaxi Yuan contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Huaxi Yuan. The first draft of the manuscript was written by Huaxi Yuan and Longhui Zou. Yidai Feng supervised and reviewed the manuscript. Lei Huang provided critical review. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding
This study was funded by National Natural Science Foundation of China (72103205) and Ministry of Education in China Project of Humanities and Social Sciences (21YJC790150).

Data availability
The data in this paper comes from the China City Statistical Yearbook and the China Economic Net Statistical Database.

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.