Green complexity, economic fitness, and environmental degradation: evidence from US state-level data

Green production is one of the major debates as environmental degradation poses threats globally. The paper explores the relationship between green production and environmental quality by using the Economic Fitness approach. We develop a Green Complexity Index (GCI) dataset consisting of 290 traded green-labeled products and an Economic Fitness Index (EFI) for the US states between 2002 and 2018. We analyze the environmental performance of green production using the GCI and EFI data at the sub-national level. Findings indicate that exporting more complex green products has insignificant effects on local pollutants (i.e., sulfur dioxide and particulate matter 10) and on global pollutants such as carbon dioxide, even after accounting for per capita income. Yet, economic fitness has a significant negative impact on emission levels, implying that sophisticated production significantly improves environmental quality in the USA. The insignificant impact of GCI on environmental degradation suggests that green product classifications should incorporate the production and end-use stages of goods to limit the adverse environmental effects of green-labeled products.


Introduction
Global ecological degradation has been raising environmental awareness in modern societies. Local governments and international institutions are promoting environmentally friendly products and services to limit the adverse effects of industrialization. Policymakers increasingly adopt environmental strategies to stimulate the production of climate-neutral and sustainable products and to keep environmentally hazardous goods out of the markets. In this context, the Organization for Economic Cooperation and Development (OECD), the World Trade Organization (WTO), and the Asia-Pacific Economic Cooperation (APEC) have defined and classified environmental goods, or green products. According to the definition of the OECD and the European Union Statistics Office (Eurostat), "Environmental goods are products used to measure, prevent, or limit the environmental damage to air, water, or soil." Thus, solar panels, electric cars, and wastewater treatment equipment are considered "green products." Scholarly works accompany these efforts and provide extensive theoretical background for "green production." The relationship between economic activities and environmental degradation has been well studied, and this literature has evolved into a fairly large area known as the environmental Kuznets curve (EKC) hypothesis. The EKC literature mainly focuses on the relationship between income level and environmental degradation (e.g., Meadows et al. 1972; Grossman and Krueger 1991; Shafik and Bandyopadhyay 1992; Panayotou 1993; Selden and Song 1994; Kongkuah et al. 2021b). More recently, the environmental impacts of countries' economic activities and production structures have been examined by using the Economic Complexity Index (ECI) developed by Hidalgo and Hausmann (2009). The ECI, considered a robust indicator of economic growth, is calculated based on the ubiquity and diversity of products and measures the sophisticated manufacturing capabilities of a country's production structure.
Several studies (e.g., Can and Gozgor 2017; Gala et al. 2018; Dogan et al. 2019; Yilanci and Pata 2020; Li et al. 2021; Ikram et al. 2021) examine the relationship between economic complexity and environmental degradation. Can and Gozgor (2017) find a negative relationship between air pollutants (CO2) and economic complexity in developed countries, while Dogan et al. (2019) and Yilanci and Pata (2020) suggest that the relationship between CO2 and economic complexity is positive for developing countries. Among others, Neagu (2019), Chu (2020), and Pata (2021) show that ECI and CO2 have an inverted U-shape relationship.
A number of studies question the ECI because of its linear computation approach (Caldarelli et al. 2012; Cristelli et al. 2013). In this context, Tacchella et al. (2012) develop the Economic Fitness Index (EFI), based on a non-linear fixed-point iteration, to measure countries' production capabilities and economic complexity. They claim to eliminate the conceptual and application-related defects they identified in the Hidalgo and Hausmann (2009) method. Boleti et al. (2021) examine the relationship between economic complexity and environmental performance for 88 countries by using the ECI approach and the ECI+, which is equivalent to the economic fitness algorithm. They conclude that economic complexity improves overall environmental performance but has a negative effect on air quality indicators such as CO2 and PM2.5.
In a prominent study, Mealy and Teytelboym (2020) examine for the first time the relationship between countries' low-carbon and environmentally friendly production capabilities and environmental degradation. They introduce a Green Complexity Index (GCI) using the environmentally friendly product lists reported by the WTO, OECD, and APEC. The GCI is important for determining which countries have the capability to produce green products and for giving countries a measure with which to re-orient their production structures and become more competitive in environmentally friendly products. They develop the GCI based on the economic complexity approach (Hidalgo and Hausmann 2009) and examine the relationship between environmental degradation and green product complexity for 122 countries. Their findings indicate that countries with a higher GCI experience less ecological degradation, i.e., lower CO2 emissions.
The existing literature on environmental research has certain limitations, such as cross-country measurement inconsistencies caused by significant differences between emission measurement methodologies across countries (Stern et al. 1996; List and Gallet 1999; De Groot et al. 2004; Carson 2010; Awaworyi Churchill et al. 2020). Mealy and Teytelboym (2020) is also subject to this measurement inconsistency problem because it relies on cross-country data. Therefore, environmental studies at the regional or sub-national level may provide more consistent results and policy implications.
This study extends the literature by developing GCI and EFI at the sub-national level to explore the link between environmental degradation and green production for the US states. Empirical analysis at the sub-national level allows us to minimize the previously mentioned data inconsistency problem, particularly in environmental research. We develop new datasets for US states by applying the economic fitness approach to 290 products at the HS-6 level listed as green products by the OECD, APEC, and WTO. Then, the relationship between EFI, GCI, and environmental data is estimated by the fractional polynomial regression method, which has several desirable features, such as providing more flexible functional forms and allowing powers to be logarithmic, non-integer, or repeated. Findings reveal that GCI has an insignificant impact on environmental quality in the USA, implying that exporting more complex green products does not affect emission levels. On the other hand, we find that EFI has statistically significant coefficients indicating an inverted U-shape, particularly for SO2 in the USA.
The remainder of the study is organized as follows. The "Theoretical background and methodology" section introduces the theoretical background for our economic fitness and green product complexity index database for the 51 US states. The next section overviews the data used for the analysis and empirical framework. Following the results and discussion, the paper concludes with key remarks.

Computing economic fitness index of states
The economic fitness approach developed by Tacchella et al. (2012) is based on the complex network structure of Hidalgo and Hausmann (2009). This network structure is represented by an adjacency matrix that enables numerical measurement, with countries in the rows and exported products in the columns. An adjacency matrix for c countries exporting p products is therefore a country-product matrix of dimension c × p whose elements, denoted M_cp, are either 1 or 0. Both the economic complexity and economic fitness approaches use the Revealed Comparative Advantage (RCA) index developed by Balassa (1965) to obtain these binary values (conventionally, M_cp = 1 when RCA_cp ≥ 1), which indicate whether or not a country is an important exporter of a product. The RCA index of a product p exported by a country c is defined in Eq. (1).
Tacchella et al. (2012) showed that the products exported by less diversified countries are generally ordinary products, while highly diversified countries export both ordinary and sophisticated products. Thus, the fact that a product is exported by diversified countries gives almost no information about its level of sophistication. It is therefore not meaningful to use the average diversity of the exporting countries to determine the sophistication of products, as Hidalgo and Hausmann (2009) proposed. Instead, Tacchella et al. (2012) propose an iteration process that obtains the fixed points of the system by defining fitness (F_c) and product complexity (Q_p) in a non-linear coupled equation system, as given in Eq. (2).
In Eq. (2), F_c is proportional to the sum of a country's exports weighted by product complexity values, while Q_p is inversely proportional to the number of countries exporting the product.
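For reference, the RCA index and the coupled fitness-complexity system described above are conventionally written as follows. This is a reconstruction from the standard formulations in Balassa (1965) and Tacchella et al. (2012), not a copy of the typeset equations of this paper. The symbol X_{cp} (export value of product p from country c), the threshold RCA ≥ 1, and the tilde notation for the intermediate variables are assumptions made here for illustration.

```latex
% Eq. (1): Balassa RCA index and the binary country-product matrix
\begin{equation}
\mathrm{RCA}_{cp} =
\frac{X_{cp} \big/ \sum_{p'} X_{cp'}}
     {\sum_{c'} X_{c'p} \big/ \sum_{c',p'} X_{c'p'}},
\qquad
M_{cp} =
\begin{cases}
1 & \text{if } \mathrm{RCA}_{cp} \geq 1,\\
0 & \text{otherwise.}
\end{cases}
\tag{1}
\end{equation}

% Eq. (2): intermediate variables, then normalization by the mean at each step
\begin{equation}
\begin{aligned}
\tilde{F}_c^{(n)} &= \sum_p M_{cp}\, Q_p^{(n-1)}, &
\tilde{Q}_p^{(n)} &= \frac{1}{\sum_c M_{cp}\, \dfrac{1}{F_c^{(n-1)}}},\\[4pt]
F_c^{(n)} &= \frac{\tilde{F}_c^{(n)}}{\langle \tilde{F}_c^{(n)} \rangle_c}, &
Q_p^{(n)} &= \frac{\tilde{Q}_p^{(n)}}{\langle \tilde{Q}_p^{(n)} \rangle_p}.
\end{aligned}
\tag{2}
\end{equation}
```

In this form, the denominator of the complexity update sums the inverse fitness of the exporters, so a product exported by even one low-fitness country receives a low complexity value.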
Equation (2) has a two-stage iteration process. First, the intermediate variables for F_c^(n) and Q_p^(n) are calculated using the relevant formulas, and then these intermediate variables are normalized at each iteration. The system in Eq. (2) is solved iteratively from the initial condition F_c^(0) = 1 and Q_p^(0) = 1. Although Tacchella et al. (2012) state that the fixed-point solutions of the coupled equation system in Eq. (2) are stable and independent of the initial condition, Morrison et al. (2017) raise the issue of the instability of the system. The new economic fitness algorithm developed by Servedio et al. (2018) is shown in Eq. (3).
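The two-stage iteration described above can be sketched in a few lines. This is a minimal illustration, not the code used for the paper: the function names, the RCA threshold of 1, and the default of 200 iterations are assumptions, and the sketch assumes that every country exports at least one product and every product has at least one exporter.

```python
import numpy as np

def rca_matrix(exports):
    """Balassa RCA for an (n_countries, n_products) array of export values."""
    exports = np.asarray(exports, dtype=float)
    country_totals = exports.sum(axis=1, keepdims=True)
    product_totals = exports.sum(axis=0, keepdims=True)
    world_total = exports.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        rca = (exports / country_totals) / (product_totals / world_total)
    # Rows or columns with zero totals yield NaN/inf; treat them as no advantage.
    return np.nan_to_num(rca, nan=0.0, posinf=0.0)

def binary_matrix(exports, threshold=1.0):
    """M_cp = 1 where RCA >= threshold (assumed convention), else 0."""
    return (rca_matrix(exports) >= threshold).astype(float)

def fitness_complexity(M, n_iter=200):
    """Iterate the coupled system from F = 1, Q = 1, normalizing at each step."""
    M = np.asarray(M, dtype=float)
    F = np.ones(M.shape[0])
    Q = np.ones(M.shape[1])
    for _ in range(n_iter):
        F_tilde = M @ Q                      # exports weighted by complexity
        Q_tilde = 1.0 / (M.T @ (1.0 / F))    # penalized by low-fitness exporters
        F = F_tilde / F_tilde.mean()
        Q = Q_tilde / Q_tilde.mean()
    return F, Q
```

On a nested toy matrix, the most diversified country obtains the highest fitness and the product exported only by that country obtains the highest complexity, which is the qualitative behavior the text describes.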
In Eq. (3), the product complexity is now given by P_p^(-1). Starting from the same initial condition as in Eq. (2), two strictly positive terms, denoted here φ_c and π_p, are added to the respective equations so that the solution is no longer defined only up to a multiplicative constant. In this way, the system does not need to be normalized at every iteration, as it was earlier (Servedio et al. 2018). In Eq. (3), φ_c represents the self-fitness value of a country: even if a country does not export at all, it will have a fitness value of φ_c. Likewise, π_p is the minimum value that P_p takes when a product is not exported by any country (M_cp = 0, ∀c); it therefore sets the maximum value that the product complexity, Q_p = P_p^(-1), can take for any product. Servedio et al. (2018) state that this situation can only hold for innovative products that have not been produced yet, and they suggest that the complexity of products that have not yet been invented is at the maximum level.
To make Eq. (3) parameter free and to simplify the algorithm, a common value δ is assigned such that φ_c = π_p = δ, ∀c, p, and the rescaled quantities P̃_p = P_p / δ and F̃_c = F_c δ are introduced. Rearranging Eq. (3) in this way yields Eq. (4): F̃_c = δ^2 + Σ_p M_cp / P̃_p and P̃_p = 1 + Σ_c M_cp / F̃_c. As long as δ is much smaller than the typical value of the M_cp matrix elements, i.e., much smaller than unity, the fixed point in terms of F̃_c and P̃_p barely depends on δ (Servedio et al. 2018).
Operti et al. (2018) develop two criteria, exogenous fitness and endogenous fitness, in order to apply the economic fitness approach robustly at the regional level. Endogenous fitness is calculated using a state-product matrix based on the RCA index, which compares the share of product p in a state's total exports with the share of product p in the total exports of the whole country. Apart from the use of the state-product matrix, the standard economic fitness algorithm is applied. Exogenous fitness, in contrast, assumes that product complexity values are constant across the world: the economic fitness values are first calculated for countries with the standard method, and the product complexity values obtained on a global basis are then used to calculate the fitness values of the states. This prevents distorted product complexity values for products that are not produced locally, or are produced by very few states, but are widely produced in the world. Accordingly, for the EFI calculated at the state level, the product complexity vector (Q_p = (P̃_p − 1)^(-1)) obtained from the new EFI algorithm in Eq. (4) and the state-product matrix M are multiplied as in Eq. (5): F_s = Σ_p M_sp Q_p.
Here, M_sp denotes the elements of the binary RCA matrix, M, in which the 51 US states (50 states and 1 federal district) and the exported products classified according to the six-digit Harmonized System (HS6) are arranged in the rows and columns, respectively. Accordingly, if a state has a comparative advantage in the export of a product, the relevant M_sp element takes the value 1; otherwise, it takes the value 0. Q_p is the global product complexity vector calculated on the basis of 206 countries. The M_sp matrix used to calculate the state-level exogenous fitness index and the M_cp country-product matrix used to calculate the global product complexity vector are created separately for each year, and the calculations given in Eq. (4) and Eq. (5) are repeated, respectively, for each year. The analyses were carried out over 500 iterations with δ = 10^(-6).
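The rescaled algorithm and the exogenous state-level fitness described above can be sketched as follows. This is a hedged illustration under stated assumptions: the function names are hypothetical, the small parameter is called `delta` here and defaults to 1e-6 with 500 iterations as reported in the text, the update rule follows the rescaled form attributed to Servedio et al. (2018), and every product is assumed to have at least one exporter so that P_p − 1 is strictly positive.

```python
import numpy as np

def servedio_fitness(M, delta=1e-6, n_iter=500):
    """Rescaled fitness/complexity iteration; no per-step normalization needed."""
    M = np.asarray(M, dtype=float)
    F = np.ones(M.shape[0])   # initial condition F = 1
    P = np.ones(M.shape[1])   # initial condition P = 1 (so Q = 1)
    for _ in range(n_iter):
        # Both updates use the values from the previous iteration.
        F_new = delta ** 2 + M @ (1.0 / P)
        P_new = 1.0 + M.T @ (1.0 / F)
        F, P = F_new, P_new
    return F, P

def product_complexity(P):
    """Q_p = (P_p - 1)^(-1); assumes each product has at least one exporter."""
    return 1.0 / (np.asarray(P, dtype=float) - 1.0)

def exogenous_fitness(M_sp, Q_global):
    """State fitness as the state-product matrix times global complexity."""
    return np.asarray(M_sp, dtype=float) @ np.asarray(Q_global, dtype=float)
```

In practice, `servedio_fitness` would be run once per year on the country-product matrix, and `exogenous_fitness` would then be applied to that year's state-product matrix with the resulting global complexity vector.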

Computing Green Complexity Index of states
The environmentally friendly products are obtained from the WTO Core List (26 products), the OECD customized product list of environmental goods (244 products), the OECD illustrative product list of environmental goods (120 products), and the APEC list (52 products). Hence, the M matrix is expected to be 51 × 295. However, the size of the matrix decreases to 51 × 290 in some years due to zero export values. Using Eq. (5), the M_sp matrix is multiplied by the global product complexity vector to obtain the state-level GCI values. Here, Q_p is the sub-vector of the product complexity vector, calculated on the basis of 206 countries, that contains the environmental products. This process is repeated separately for each year between 2002 and 2018. To develop the GCI and EFI datasets, the product complexity data on a global scale are calculated by using BACI export data reported at the HS6 level. Among the countries included in the BACI dataset, micro-states that have no exports are excluded; as a result, global product complexity calculations are made over 206 countries.
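The restriction to green products amounts to selecting the matching columns of the state-product matrix and the matching entries of the global complexity vector. The sketch below illustrates this under assumptions: the function names and the example codes are hypothetical placeholders, and merging the source lists by keeping each code once is an assumption about how the lists were combined.

```python
import numpy as np

def union_of_lists(*code_lists):
    """Merge several product lists, keeping each code once (order preserved)."""
    seen = {}
    for codes in code_lists:
        for code in codes:
            seen.setdefault(code, None)
    return list(seen)

def green_complexity_index(M_sp, Q_global, product_codes, green_codes):
    """State-level GCI: green columns of M_sp times the green part of Q."""
    M_sp = np.asarray(M_sp, dtype=float)
    Q_global = np.asarray(Q_global, dtype=float)
    green = set(green_codes)
    # Boolean mask over the columns of M_sp, aligned with product_codes.
    mask = np.array([code in green for code in product_codes], dtype=bool)
    return M_sp[:, mask] @ Q_global[mask]
```

A green product with no state exports in a given year contributes a column of zeros, so dropping it changes the matrix dimension but not the resulting index values.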

Data and estimation methodology
We employ global and local air pollutant data and state-level export data to explore the nexus among environmental degradation, green production, and economic fitness for the US states. As recommended by Dinda (2004), we calculate per capita values of CO2, PM10, and SO2 by dividing each pollutant's emissions by the mid-year population data provided by the U.S. Bureau of Economic Analysis (BEA). Accordingly, CO2, SO2, and PM10 data are expressed in tons per capita.
Six-digit export data for US states are acquired from the U.S. Census Bureau's USA Trade Online database. Unadjusted energy-related CO2 emission data are obtained from the U.S. Energy Information Administration (EIA). SO2 and PM10 data are extracted from the Environmental Protection Agency.
We use population density, industrial energy consumption per capita, and real GDP per capita as control variables in our estimations. Population density is thought to affect environmental degradation (Grossman and Krueger 1991; Selden and Song 1994; Regmi and Rehman 2021). We include industrial energy consumption because the existing literature extensively documents the relationship between air pollutants (e.g., SO2, PM10, and CO2) and energy consumption (see Kongkuah et al. 2021a for a literature survey). Similar to the emission data, real GDP per capita values are obtained by dividing state-level GDP by the mid-year population estimate of each state.
Data for real GDP and industrial energy consumption are available until 2018, CO2 data up to 2017, and state-based export data at the HS6 level starting from 2002. In this context, our analysis covers 2002-2018 for PM10 and SO2 and 2002-2017 for CO2.
Panel data analysis is a useful empirical tool in environmental research as it provides more variability and less collinearity. Baltagi (2008) states that fixed effect estimation is more appropriate for a sample of states, companies, or countries with similar conditions. In addition, the availability of state-level export data restricts the time dimension of our study. Therefore, estimators that rely on a large time dimension (large T) may yield biased results in this setting. For these reasons, we use the fixed effect estimation method to examine the relationship between GCI, EFI, and air pollutants for US states, which display a more homogeneous structure in terms of environmental policies and laws.
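The fixed effect (within) estimator mentioned above can be illustrated with a short sketch: each variable is demeaned within its state, which removes time-invariant state effects, and ordinary least squares is then applied to the demeaned data. The function names are hypothetical, and the sketch omits standard errors and any time effects, which a full analysis would need.

```python
import numpy as np

def within_transform(values, groups):
    """Subtract the group (state) mean from each observation."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        idx = groups == g
        out[idx] = values[idx] - values[idx].mean(axis=0)
    return out

def fixed_effects_ols(y, X, groups):
    """Slope estimates from OLS on within-demeaned y and X (no intercept)."""
    y_w = within_transform(y, groups)
    X_w = within_transform(X, groups)
    beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return beta
```

Here `y` would be a per capita pollutant series, `X` a two-dimensional array holding GCI or EFI together with the control variables, and `groups` the state identifier of each observation.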
The baseline models in the environmental Kuznets curve (EKC) literature are polynomials that include quadratic or cubic forms of GDP. These quadratic and cubic forms are added to take the non-linear relationship into account. However, including quadratic or cubic terms without identifying the exact form of the non-linear relationship is quite restrictive (Aslanidis 2009). Royston and Altman (1994) propose the fractional polynomial regression approach as an alternative that provides more flexibility than the regular polynomial models used to test the EKC hypothesis in the literature.
The fractional polynomial regression method has several desirable features, such as allowing logarithmic, non-integer, or repeated powers to examine the possible non-linear relationship and identify the most appropriate functional form. In addition, fractional polynomial regression enables researchers to determine necessary independent variables to be included in the model (Royston 2017;Royston et al. 1999). Thus, we employ the fractional polynomial fixed effect regression approach to eliminate the functional form bias in the present study.
Fractional polynomial regression includes the Function Selection Procedure (FSP), a closed test procedure that is applied to determine the most suitable functional form. FSP starts with the highest-degree fractional polynomial model and statistically tests whether that model can be reduced to a lower-degree or a linear model. The first null hypothesis tested under FSP is that the variable can be omitted from the model. If this test statistic is significant, the testing procedure continues for the linear, second-order, or third-order fractional polynomials. Otherwise, the variable should be omitted from the model and the testing procedure stops (Royston 2017).
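The procedure described above can be sketched for a single positive covariate and a maximum degree of two. This is a simplified illustration, not the estimation code of the paper: it uses pooled least squares rather than the fixed effect version, the power set is the conventional one from Royston and Altman (1994), the function names are hypothetical, and the hard-coded values are the 5% chi-square critical values for 2, 3, and 4 degrees of freedom.

```python
import itertools
import numpy as np

FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)     # power 0 stands for ln(x)
CHI2_CRIT_5PCT = {2: 5.991, 3: 7.815, 4: 9.488}

def fp_design(x, powers):
    """FP columns for sorted powers; a repeated power adds a factor ln(x)."""
    x = np.asarray(x, dtype=float)
    cols, prev, repeat = [], None, 0
    for p in powers:
        repeat = repeat + 1 if p == prev else 0
        base = np.log(x) if p == 0 else x ** p
        cols.append(base * np.log(x) ** repeat)
        prev = p
    return np.column_stack(cols)

def deviance(y, X=None):
    """Gaussian deviance up to a constant: n * ln(RSS / n), with intercept."""
    y = np.asarray(y, dtype=float)
    ones = np.ones((len(y), 1))
    X1 = ones if X is None else np.column_stack([ones, X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = float(np.sum((y - X1 @ beta) ** 2))
    return len(y) * np.log(rss / len(y))

def best_fp(y, x, degree):
    """Lowest-deviance FP of the given degree: returns (deviance, powers)."""
    candidates = itertools.combinations_with_replacement(FP_POWERS, degree)
    return min((deviance(y, fp_design(x, pw)), pw) for pw in candidates)

def function_selection(y, x):
    """Closed test: FP2 vs omitted, then vs linear, then vs FP1."""
    dev_fp2, pw2 = best_fp(y, x, 2)
    dev_fp1, pw1 = best_fp(y, x, 1)
    dev_lin = deviance(y, fp_design(x, (1,)))
    dev_null = deviance(y)
    if dev_null - dev_fp2 < CHI2_CRIT_5PCT[4]:
        return "omit", ()
    if dev_lin - dev_fp2 < CHI2_CRIT_5PCT[3]:
        return "linear", (1,)
    if dev_fp1 - dev_fp2 < CHI2_CRIT_5PCT[2]:
        return "FP1", pw1
    return "FP2", pw2
```

A second-order model with powers (p, q) contributes two covariate terms, which corresponds to the two EFI coefficients reported for the fractional polynomial estimations; extending the sketch to a third-order model would add one more comparison step.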

Results
We develop the GCI and EFI for each US state. Figure 1 illustrates the highest and lowest states in EFI database. The left panel shows the five states with the highest EFI values; the right panel gives the five states with the lowest EFI for this period.
The left panel of Fig. 1 shows that New Jersey stands out with the highest EFI value in the USA. In addition, it is noteworthy that while the EFI values of states such as Florida and California have increased over the years, the state of Pennsylvania has experienced a slight decrease. In Fig. 1, we can also observe that states with higher EFI values show a more stable development path over the years. Alaska stands out as the state that lags in terms of EFI (i.e., productive capabilities) due to its economy mainly based on unsophisticated products such as fishing and forestry.
One interesting point in Fig. 1 is that D.C. is ranked among the lowest EFI states despite having higher per capita income. The main reason for this is that D.C.'s economic structure is business service-oriented rather than production.
In addition to EFI, Fig. 2 demonstrates the states with the highest (left panel) and the lowest (right panel) GCI values for 2002-2018. In Fig. 2, three states with the highest EFI, namely, Illinois, Florida, and Pennsylvania, are also the states with the highest GCI. In addition, Alaska, Hawaii, North Dakota, and D.C., which are among the lowest EFI values, are also among the states with the lowest GCI values. Notably, states with high EFI (i.e., New Jersey and California) are not among states with high GCI.
This shows that while New Jersey and California have a higher accumulation of productive capabilities, this accumulation seems not to be concentrated on green products. Correlations and descriptive statistics of the variables we used for our analysis are given in Tables 1 and 2, respectively.
In Table 1, the high correlation between GCI and EFI is remarkable. For this reason, we preferred not to include the GCI and EFI variables together in an estimated model. Apart from this, the correlations between other variables are considered to be reasonable. Descriptive statistics for logarithmic transformations of the variables are given in Table 2. Alaska is the only state with a negative logarithmic value due to its low population density.
We estimate three models with SO2, PM10, and CO2 as dependent variables. First, we test the significance of the GCI and EFI variables separately. Neither variable is significant for any of these air pollutants. Table 4 presents the fixed effect estimation of our models. In contrast with Mealy and Teytelboym (2020), our findings suggest an insignificant relationship between CO2 and GCI.
The linear model estimations in Tables 3 and 4 indicate no statistical relationship between GCI, EFI, and local air pollutants. Considering that polynomial models are used to avoid functional form bias in the EKC literature, estimation over alternative functional forms will yield more reliable estimation results. In this context, we employ the fractional polynomial selection procedure or function selection procedure proposed by Royston (2017), which allows for the logarithmic, non-integer, or repeated powers to examine the possible non-linear relationship and identify the most appropriate functional form.
The function selection procedure (FSP) tests the null hypothesis of whether the variable in question should be omitted. If the null is rejected, the procedure aims to identify the most appropriate functional form, e.g., linear and nonlinear (Royston 2017).
Tables 5 and 6 report the function selection procedure (FSP) results for GCI and EFI variables, respectively.
According to Table 5, GCI and CO2 have a linear functional form. However, the linear specification in the fixed effect estimation shows that the relationship between GCI and CO2 is not statistically significant.
In Table 6, the tests for omitting EFI and for the linear and first-order fractional polynomial specifications of EFI are all statistically significant; thus, the corresponding null hypotheses are rejected. Table 6 therefore shows that the EFI variable has a non-linear relationship with the pollutants, and a fractional polynomial specification of EFI should be used to test its relationship with the SO2, PM10, and CO2 variables. While the best specification of EFI for the SO2 and CO2 variables is the second-order fractional polynomial form, the third-order fractional polynomial form is the best specification for PM10.
However, as stated in Royston (2017), the probability of a Type II error increases as the degrees of the variables tested in the function selection procedure increase. For this reason, it is appropriate to choose the most parsimonious model. Hence, the second-order fractional polynomial of the EFI variable is preferred and included in the model for the PM10pc variable as well.
According to these test results in Table 6, the fixed effect estimations may yield biased results due to functional form misspecification. Therefore, the fractional polynomial fixed effect estimation results are given in Table 7. EFI coefficients reported in Table 7 differ from the main effect of the EFI covariate obtained from the fixed effect estimations in Tables 3 and 4. A second-order fractional polynomial regression estimates two EFI coefficients, EFI-1 and EFI-2, to fit the data more adequately and to identify non-linear relationships in predicting the emission of air pollutants which cannot be captured by a traditional fixed effect model.
Both the EFI-1 and EFI-2 variables in Table 7 are significant for SO2pc, PM10pc, and CO2pc. This indicates a non-linear relationship between the EFI and all air pollutants. The R-squared values of all estimated models are about 80 percent, except for the PM10 model. Our findings are in line with the findings of Mealy and Teytelboym (2020). Figure 3 plots the predicted values and observations for the estimated fractional polynomial models reported in Table 7. Figure 3 suggests an inverted U-shape relationship between air pollutants and EFI in US states. This is especially evident for the SO2pc variable.
The estimates of the control variables included in the models are consistent across the fixed effect and fractional polynomial analyses reported in Tables 4 and 7, respectively. Population density is positive but statistically insignificant for SO2pc and PM10pc, while it is negative and significant for CO2pc. In addition, while industrial energy consumption per capita is positive and significant for SO2pc and CO2pc, it is insignificant for PM10pc. GDP per capita, on the other hand, is positive for SO2pc and CO2pc and negative for PM10pc, and it is insignificant for all three air pollutants.
Overall, the results indicate no relationship between GCI and air pollution at the regional level. This finding is not compatible with the findings of Mealy and Teytelboym (2020). In addition, we find a non-linear relationship between EFI and air pollution. Moreover, this relationship takes the form of an inverted U shape, particularly for SO2. Following Holtz-Eakin and Selden (1995), Roberts and Grimes (1997), and Dinda (2001), Dinda (2004) suggests an inverted U-shape relationship between GDP and pollutants, especially SO2 and PM10, but not CO2. Our results for EFI, SO2, and PM10 are similar to what Dinda (2004) put forward for GDP. In contrast, our findings for CO2 are in line with Pata (2021), which finds an inverted U shape between ECI and CO2 in the USA.

Concluding remarks
Institutional and public environmental awareness stimulates global demand for environmental products and services. Governments and corporations are becoming more sensitive to ecologically hazardous production. For instance, the European Union recently agreed on the European Green Deal (EGD), which promotes environmentally friendly product markets and sets new product standards to eliminate the adverse effects of environmentally hazardous production.
We analyze the nexus between green production and environmental quality by exploiting sub-national data for the US states. The analysis consists of two stages. First, we develop a green product complexity index dataset for each state, as in Mealy and Teytelboym (2020). Then, the relationship between environmental data and the green and overall product complexity indices is estimated by the fixed effect method and by the fractional polynomial regression method, which allows more flexible functional forms.
We find that higher green complexity index levels have an insignificant effect on emission levels in the US states. Contrary to Mealy and Teytelboym (2020), our findings indicate that exporting more sophisticated green products does not improve air quality. This may be due to the current green product classifications of the OECD, WTO, and APEC, which fail to incorporate the production and end-use stages of green-labeled goods and services.
For this reason, it would be appropriate for OECD, APEC, and WTO to identify their green product lists more clearly with more specific tools such as sustainable certifications. In addition, it is necessary to consider the consumption side of the products declared as green products. As pointed out by Yang et al. (2021), although consumers favor green products, this positive attitude toward green products may not be effectively transformed into purchasing behavior due to higher prices caused by green processing and certification costs. For this reason, governments should undertake a leading role to promote the value of the environment for the public. In addition, as asserted by Morone et al. (2021), sustainability certification may also play a key role in purchasing decisions of green products.
In contrast to the GCI, findings suggest that EFI, which includes all products regardless of green or non-green classification, significantly reduces sulfur dioxide, particulate matter 10, and carbon dioxide levels. In line with the existing literature (e.g., Neagu 2019; Chu 2020; Pata 2021), we find an inverted U-shape relationship between EFI and emission levels, particularly for SO2.
These findings have practical implications for the Paris Climate Accords and countries' zero-carbon targets. Although green products are defined as goods used to prevent, limit, or measure the environmental damage on water, air, and soil, there is no guarantee that the end-uses of these products will be environmentally friendly. In other words, it is debatable how environmentally friendly the production process of the so-called green products is and how environmentally friendly the areas of use are after production. Besides, green products alone will not be sufficient to make the production structure more technology-intensive, as they only make up a small part of what a country can produce. Thus, increasing the green product complexity is insufficient in transforming the production structures of nations into a more technology-intensive form that will allow them to have more sophisticated products.
In addition, an increase in the overall ECI of a country means that its production structure shifts toward technology-intensive production and allows it to produce more sophisticated products. Technology, in turn, increases efficiency, that is, it allows the same amount of output to be obtained with less energy and labor. This means that the country's carbon dioxide emissions will decrease. For this reason, countries will serve their zero-carbon targets better if they take steps to increase their economic complexity and encourage the use of renewable energy in production instead of focusing on their ability to produce green products.
This paper extends the literature in several ways. First, we provide a new dataset (i.e., the green product complexity index) for each US state. The GCI data can be used for future research on green production, carbon emission targets, and the Paris Climate Accords. Furthermore, we outline the link between green production and environmental quality at the sub-national level. Sub-national analysis provides more robust estimation in environmental studies because significant differences between emission measurement methods across countries create cross-country data inconsistency. Although sub-national analysis offers a more homogeneous environment for researchers than cross-country studies, it does not provide a completely homogeneous research sample. In addition, it should be kept in mind that sub-national studies are less generalizable. Hence, more reliable environmental research requires more sub-national studies for different countries.