Using the quadratic model stated in Eq. (1), the study examines the relationship between GHG emissions, FDI, GDP, and energy consumption (EC) in selected SAARC member states.
$$\ln {CO}_{2t}={\phi }_{1}+{\phi }_{2}\ln {GDP}_{t}+{\phi }_{3}{\left(\ln {GDP}_{t}\right)}^{2}+{\phi }_{4}{FDI}_{t}+{\phi }_{5}{TEC}_{t}+{\epsilon }_{t}$$
1
where CO2 is measured in metric tons per capita, GDP in current US dollars, FDI as net inflows as a percentage of GDP, and total energy consumption (TEC) in British thermal units (BTU) per capita. The subscript t denotes the time period, \({\phi }_{1}\) is the intercept, and \({\phi }_{i}\) (\(i=2,\dots ,5\)) are the slope coefficients. The error term is denoted by \({\epsilon }_{t}\). Except for total energy consumption, the time series data were obtained from the World Bank's World Development Indicators database (2021); the energy consumption data come from the World Energy Statistics 2021. All of the variables used in the regression model are converted to logarithms, so the coefficients can be read as elasticities, that is, the percentage change in the dependent variable associated with a one percent change in a regressor. Under the environmental Kuznets curve (EKC) hypothesis, \({\phi }_{2}\) is expected to be positive and \({\phi }_{3}\) negative. A positive \({\phi }_{2}\) implies that lnCO2 rises with lnGDP in the early stages of a country's growth, while a negative \({\phi }_{3}\) on the squared term implies that the relationship eventually turns downward, which gives the inverted U-shaped relationship between CO2 emissions and GDP per capita. The EKC is therefore supported for the panel when \({\phi }_{2}\) is positive, \({\phi }_{3}\) is negative, and both are statistically significant. Furthermore, the sign of \({\phi }_{4}\), the coefficient on FDI, is used to test the pollution halo hypothesis against the pollution haven hypothesis. The former contends that FDI reduces GHG emissions, whereas the latter contends that FDI increases them. Accordingly, a negative \({\phi }_{4}\) supports the pollution halo hypothesis and a positive \({\phi }_{4}\) supports the pollution haven hypothesis.
Higher energy consumption is generally expected to raise CO2 emissions through increased economic activity; consequently, the expected sign of \({\phi }_{5}\) is positive.
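To illustrate how the signs of \({\phi }_{2}\) and \({\phi }_{3}\) translate into an inverted U, the following minimal sketch fits the quadratic in lnGDP by ordinary least squares on synthetic data and computes the implied turning point, \(\ln GDP^{*}=-{\phi }_{2}/(2{\phi }_{3})\). The data, the coefficient values, and the function name `ekc_turning_point` are illustrative assumptions and not the study's estimates; FDI and TEC are left out for brevity.

```python
import numpy as np

def ekc_turning_point(phi2, phi3):
    """Return the GDP level at which lnCO2 peaks, or None if there is no inverted U.

    Setting d lnCO2 / d lnGDP = phi2 + 2 * phi3 * lnGDP to zero gives
    lnGDP* = -phi2 / (2 * phi3); an inverted U requires phi2 > 0 and phi3 < 0.
    """
    if not (phi2 > 0 and phi3 < 0):
        return None
    return float(np.exp(-phi2 / (2.0 * phi3)))

# Synthetic illustration (hypothetical coefficients, not the paper's results).
rng = np.random.default_rng(0)
ln_gdp = rng.uniform(5.0, 10.0, size=200)
true_phi = np.array([-20.0, 5.0, -0.3])   # intercept, lnGDP, (lnGDP)^2
X = np.column_stack([np.ones_like(ln_gdp), ln_gdp, ln_gdp**2])
ln_co2 = X @ true_phi + rng.normal(0.0, 0.05, size=ln_gdp.size)

# OLS fit of the quadratic specification.
phi_hat, *_ = np.linalg.lstsq(X, ln_co2, rcond=None)
turning_point = ekc_turning_point(phi_hat[1], phi_hat[2])
```

In this synthetic setup the true turning point is at lnGDP = 5 / 0.6, so the estimated value should lie close to that. When the sign pattern does not match the EKC, the function returns `None` rather than a turning point.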
Cross-sectional dependence (CSD) is common in panel data and is regularly encountered in practice. It is therefore advisable to test for CSD before examining the stationarity properties of time series or panel data. The choice of unit root and cointegration methods should account for CSD; otherwise, the estimates from the unit root and cointegration analysis could be inconsistent (Silva et al. 2018). We used the Lagrange multiplier (LM) test suggested by Breusch and Pagan (1980) and the cross-sectional dependence (CD) test suggested by Pesaran (2021) to determine whether the panel data are cross-sectionally dependent. Both tests take cross-sectional independence as the null hypothesis and CSD as the alternative hypothesis. To allow for potential cross-sectional dependence within the panel data, this work analyzed stationarity using second-generation unit root tests, namely the cross-sectionally augmented Dickey-Fuller (CADF) and cross-sectionally augmented Im-Pesaran-Shin (CIPS) tests. Pesaran (2007) suggested that the CADF and CIPS tests are superior to the Levin, Lin and Chu (LLC) and Im, Pesaran and Shin (IPS) panel unit root tests. The study used the Pedroni and Kao cointegration tests, as suggested by Pedroni (2004) and Kao (1999), to check whether the investigated variables are cointegrated. The study then used an autoregressive distributed lag (ARDL) model to measure the long-run and short-run effects of the proposed variables on CO2. The study also used the pooled mean group (PMG) estimator for further assessment. The Akaike information criterion (AIC) was applied to select the lag structure, because it favors a parsimonious lag structure and thereby limits the loss of degrees of freedom. The PMG estimator is used because it constrains the long-run coefficients to be equal across all cross-sections while allowing the short-run coefficients to differ across cross-sections.
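The two CSD statistics named above can be sketched as follows. This is a minimal illustration under stated assumptions: a balanced panel of residuals stored as a T-by-N array, pairwise correlations taken with `numpy.corrcoef` (which demeans each column), and a hypothetical function name `csd_statistics`. Under the null of cross-sectional independence, the Breusch-Pagan LM statistic is compared with a chi-squared distribution with N(N-1)/2 degrees of freedom and the Pesaran CD statistic with a standard normal distribution.

```python
import numpy as np

def csd_statistics(residuals):
    """Return (LM, CD) for a balanced panel of residuals with shape (T, N)."""
    resid = np.asarray(residuals, dtype=float)
    T, N = resid.shape
    corr = np.corrcoef(resid, rowvar=False)    # N x N pairwise correlations
    upper = corr[np.triu_indices(N, k=1)]      # rho_ij for i < j
    lm = T * np.sum(upper**2)                  # Breusch-Pagan LM
    cd = np.sqrt(2.0 * T / (N * (N - 1))) * np.sum(upper)   # Pesaran CD
    return lm, cd

# Synthetic illustration: independent series versus series sharing a common shock.
rng = np.random.default_rng(1)
T, N = 40, 5
independent = rng.normal(size=(T, N))
common = rng.normal(size=(T, 1))
dependent = independent + 2.0 * common         # hypothetical common factor

lm_ind, cd_ind = csd_statistics(independent)
lm_dep, cd_dep = csd_statistics(dependent)
```

With the common shock added, both statistics should be far larger than for the independent panel, which is the pattern that leads to rejecting cross-sectional independence.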
The ARDL model has several practical advantages. It yields consistent short- and long-run estimates simultaneously, regardless of whether the series are I(0) or I(1). It can produce reliable findings even with a modest sample size. Because of its rich dynamics, the problem of multicollinearity can be readily addressed (Dougherty 2016). Finally, because all variables are treated as endogenous, the ARDL model avoids the endogeneity issues that affect the Engle-Granger technique (Seker et al. 2015). The ARDL(p, q) model includes p lags of the outcome variable and q lags of the explanatory variables. Following Ssali et al. (2019), it takes the following form:
$${y}_{it}={\mu }_{i}+\sum _{j=1}^{p}{\lambda }_{ij}{y}_{it-j}+\sum _{j=0}^{q}{\delta }_{ij}^{\prime }{x}_{it-j}+{\epsilon }_{it}$$
2
where \({y}_{it}\) denotes the outcome variable. The subscripts \(i=1,2,\dots ,N\) and \(t=1,2,\dots ,T\) index the countries and time periods, respectively, and the subscript j indexes the lags. The numbers of lags of the outcome and explanatory variables are given by p and q. The explanatory variables \({x}_{it-j}\) form an \(m\times 1\) vector, \({\lambda }_{ij}\) are scalar coefficients, \({\delta }_{ij}\) is the \(m\times 1\) coefficient vector, \({\mu }_{i}\) is the country-specific intercept, and \({\epsilon }_{it}\) is the error term.
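As a concrete illustration of Eq. (2), the sketch below builds the lagged regressors for a single cross-section with one explanatory variable and estimates an ARDL(1, 1) by ordinary least squares. The simulated data, the parameter values, and the helper name `build_ardl_design` are assumptions made for illustration; the panel PMG estimation used in the study is not reproduced here.

```python
import numpy as np

def build_ardl_design(y, x, p, q):
    """Stack the regressors of an ARDL(p, q) for one unit and a scalar x.

    Columns: [1, y_{t-1}, ..., y_{t-p}, x_t, x_{t-1}, ..., x_{t-q}].
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    start = max(p, q)                          # first t with all lags available
    cols = [np.ones(len(y) - start)]
    cols += [y[start - j:len(y) - j] for j in range(1, p + 1)]
    cols += [x[start - j:len(x) - j] for j in range(0, q + 1)]
    return y[start:], np.column_stack(cols)

# Simulate an ARDL(1, 1) process with hypothetical parameters.
rng = np.random.default_rng(2)
T = 500
mu, lam, d0, d1 = 0.5, 0.6, 0.3, 0.2
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = mu + lam * y[t - 1] + d0 * x[t] + d1 * x[t - 1] + rng.normal(scale=0.1)

target, design = build_ardl_design(y, x, p=1, q=1)
coef, *_ = np.linalg.lstsq(design, target, rcond=None)   # [mu, lam, d0, d1]
```

The estimated coefficients should be close to the simulated values. In practice the lag orders p and q would be chosen by an information criterion such as the AIC rather than fixed in advance.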
If the variables under investigation are cointegrated, the ARDL model should be reformulated in error-correction form. In that case, the short-run dynamics are influenced by deviations from the long-run equilibrium, so the error correction term (ECT) should be included to obtain consistent estimates. The ECT measures how quickly the dependent variable returns to its long-run equilibrium after a change in the independent variables. The error-correction form can be written as:
$${\Delta y}_{it}={\varphi }_{i}\left({y}_{it-1}-{\tau }_{i}^{\prime }{x}_{it}\right)+\sum _{j=1}^{p-1}{\lambda }_{ij}{\Delta y}_{it-j}+\sum _{j=0}^{q-1}{\delta }_{ij}^{*\prime }{\Delta x}_{it-j}+{\epsilon }_{it}$$
3
where \({\varphi }_{i}=-\left(1-\sum _{j=1}^{p}{\lambda }_{ij}\right)\) and \({\tau }_{i}=-\frac{\sum _{j=0}^{q}{\delta }_{ij}}{{\varphi }_{i}}\). In Eq. (3), \({\tau }_{i}\) captures the long-run association between the outcome and explanatory variables, while \({\delta }_{ij}^{*}\) captures the short-run dynamics. The coefficient \({\varphi }_{i}\) on the ECT is expected to be negative and less than one in absolute value. A negative sign indicates the presence of cointegration, and its magnitude indicates the speed of adjustment toward the long-run equilibrium. If \({\varphi }_{i}=0\), there is no evidence of a long-run relationship (no cointegration). Therefore, by including the ECT, our model follows Eq. (4), where CO2 emissions are the outcome variable and the other variables (GDP, GDP2, FDI, and energy use) are the explanatory variables.
$${\Delta CO}_{2,it}=\alpha +{\varphi }_{i}\left({CO}_{2,it-1}-{\tau }_{1i}^{\prime }{GDP}_{it}-{\tau }_{2i}^{\prime }{GDP}_{it}^{2}-{\tau }_{3i}^{\prime }{FDI}_{it}-{\tau }_{4i}^{\prime }{TEC}_{it}\right)+\sum _{j=1}^{p-1}{\lambda }_{ij}{\Delta CO}_{2,it-j}+\sum _{j=0}^{q-1}{\delta }_{1ij}^{*\prime }{\Delta GDP}_{it-j}+\sum _{j=0}^{q-1}{\delta }_{2ij}^{*\prime }{\Delta GDP}_{it-j}^{2}+\sum _{j=0}^{q-1}{\delta }_{3ij}^{*\prime }{\Delta FDI}_{it-j}+\sum _{j=0}^{q-1}{\delta }_{4ij}^{*\prime }{\Delta TEC}_{it-j}+{\epsilon }_{it}$$
4
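The link between the ARDL coefficients of Eq. (2) and the error-correction parameters can be made concrete with a small helper. The sketch assumes a single explanatory variable and takes the long-run coefficient as the value at which the term in parentheses is zero in the steady state, that is, \(\tau =\sum {\delta }_{j}/(1-\sum {\lambda }_{j})\). The function name `ardl_to_ecm` and the numerical values are hypothetical.

```python
import numpy as np

def ardl_to_ecm(lam, delta):
    """Map ARDL(p, q) coefficients to error-correction parameters.

    lam:   lambda_1 .. lambda_p (coefficients on lags of y)
    delta: delta_0 .. delta_q (coefficients on current and lagged x)
    Returns (phi, tau): speed of adjustment and long-run coefficient.
    """
    lam = np.asarray(lam, dtype=float)
    delta = np.asarray(delta, dtype=float)
    phi = -(1.0 - lam.sum())
    if np.isclose(phi, 0.0):
        raise ValueError("phi = 0: no long-run relationship is defined")
    tau = -delta.sum() / phi       # same as delta.sum() / (1 - lam.sum())
    return float(phi), float(tau)

# Hypothetical ARDL(1, 1): y_t = 0.6 y_{t-1} + 0.3 x_t + 0.2 x_{t-1} + e_t
phi, tau = ardl_to_ecm(lam=[0.6], delta=[0.3, 0.2])
# phi is -0.4 and tau is 1.25, up to floating-point rounding
```

Here an adjustment coefficient of -0.4 means that about 40 percent of the previous period's deviation from the long-run relationship is corrected each period, and a long-run coefficient of 1.25 means that a permanent one-unit increase in x raises y by 1.25 units in equilibrium.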
To estimate Eq. (4), we chose the PMG estimator proposed by Pesaran et al. (1999) because it has several advantages over the alternatives. It allows the short-run coefficients, intercepts, and error variances to differ across countries, while constraining the long-run coefficients to be the same for all countries. According to Pesaran et al. (1999), a further advantage of PMG is that it is robust to outliers and to the choice of lag order. The PMG estimator has become popular among researchers because of its wide applicability (Baek and Choi 2017; Rafindadi et al. 2018b; Ssali et al. 2019; Yusuf et al. 2020). To justify pooling the long-run coefficients in the ARDL setting, we applied the Hausman poolability test to the above model. This diagnostic tests the null hypothesis that the pooled long-run coefficients are identical across all cross-sections (Ssali et al. 2019). Finally, we applied the panel Granger causality test, based on a vector error correction model (VECM), to determine the direction of causality. The Granger causality test was carried out in two steps. In the first step, the long-run relationship is estimated in order to construct the ECT for the second step. The ECT is defined as the one-period-lagged residuals from the long-run relationship. The sign and significance of the ECT coefficient indicate which variables adjust to correct deviations from the long-run relationship. The VECM is estimated using the bias-corrected least squares dummy variable estimator (Bruno 2005). Boutabba (2014) defined the VECM as follows:
$$\left(1-B\right){Y}_{it}={\mu }_{1it}+\sum _{k=1}^{p}\left(1-B\right)\left[{\lambda }_{1ik}\right]{Y}_{it-k}+{\delta }_{1it}\left[{ECT}_{it-1}\right]+{\epsilon }_{it}$$
5
where \(i=1,2,\dots ,N\) indexes the countries, \(t=1,2,\dots ,T\) indexes the time periods, p indicates the lag length, \(\left(1-B\right)\) is the first-difference operator, \({\epsilon }_{it}\) is a serially uncorrelated error term with a finite covariance matrix, and \({ECT}_{it-1}\) is the lagged ECT. The VECM captures both long-run and short-run Granger causality. An F-test on the lagged explanatory variables is used to assess short-run causality, while the t-statistic on the coefficient of the lagged error correction term is used to assess the significance of long-run causality. Our modified VECM is applied to test the direction of causality and can be written as follows:
$$\left[\begin{array}{c}\Delta \ln {CO}_{2,it}\\ \Delta \ln {GDP}_{it}\\ \Delta \ln {GDP}_{it}^{2}\\ \Delta \ln {FDI}_{it}\\ \Delta \ln {TEC}_{it}\end{array}\right]=\left[\begin{array}{c}{\mu }_{1}\\ {\mu }_{2}\\ {\mu }_{3}\\ {\mu }_{4}\\ {\mu }_{5}\end{array}\right]+\sum _{k=1}^{p}\left[\begin{array}{ccccc}{\lambda }_{11k}&{\lambda }_{12k}&{\lambda }_{13k}&{\lambda }_{14k}&{\lambda }_{15k}\\ {\lambda }_{21k}&{\lambda }_{22k}&{\lambda }_{23k}&{\lambda }_{24k}&{\lambda }_{25k}\\ {\lambda }_{31k}&{\lambda }_{32k}&{\lambda }_{33k}&{\lambda }_{34k}&{\lambda }_{35k}\\ {\lambda }_{41k}&{\lambda }_{42k}&{\lambda }_{43k}&{\lambda }_{44k}&{\lambda }_{45k}\\ {\lambda }_{51k}&{\lambda }_{52k}&{\lambda }_{53k}&{\lambda }_{54k}&{\lambda }_{55k}\end{array}\right]\left[\begin{array}{c}\Delta \ln {CO}_{2,it-k}\\ \Delta \ln {GDP}_{it-k}\\ \Delta \ln {GDP}_{it-k}^{2}\\ \Delta \ln {FDI}_{it-k}\\ \Delta \ln {TEC}_{it-k}\end{array}\right]+\left[\begin{array}{c}{\delta }_{1}\\ {\delta }_{2}\\ {\delta }_{3}\\ {\delta }_{4}\\ {\delta }_{5}\end{array}\right]{ECT}_{it-1}+\left[\begin{array}{c}{\epsilon }_{1it}\\ {\epsilon }_{2it}\\ {\epsilon }_{3it}\\ {\epsilon }_{4it}\\ {\epsilon }_{5it}\end{array}\right]$$
6
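The short-run F-test described above can be sketched for one equation of the system as a comparison of restricted and unrestricted least-squares fits. This is a simplified illustration under stated assumptions: two simulated differenced series, one lag, plain OLS instead of the bias-corrected estimator, and no ECT column (in a full VECM, \({ECT}_{it-1}\) would enter as an additional regressor and its t-statistic would be examined for long-run causality). The function name `exclusion_f_stat` and the data are hypothetical.

```python
import numpy as np

def exclusion_f_stat(y, X, excluded_cols):
    """F statistic for H0: the coefficients on `excluded_cols` of X are all zero."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    excluded = set(excluded_cols)
    keep = [c for c in range(X.shape[1]) if c not in excluded]

    def ssr(design):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    n, k = X.shape
    ssr_u, ssr_r = ssr(X), ssr(X[:, keep])
    return ((ssr_r - ssr_u) / len(excluded)) / (ssr_u / (n - k))

# Simulated differenced series in which lagged dx drives dy, but not the reverse.
rng = np.random.default_rng(3)
T = 400
dx = rng.normal(size=T)
dy = np.zeros(T)
for t in range(1, T):
    dy[t] = 0.8 * dx[t - 1] + rng.normal()

X = np.column_stack([np.ones(T - 1), dy[:-1], dx[:-1]])   # [1, dy_{t-1}, dx_{t-1}]
f_x_to_y = exclusion_f_stat(dy[1:], X, excluded_cols=[2])  # does dx Granger-cause dy?
f_y_to_x = exclusion_f_stat(dx[1:], X, excluded_cols=[1])  # does dy Granger-cause dx?
```

A large F statistic in one direction and a small one in the other points to unidirectional short-run causality; in this simulation the statistic for dx to dy should be far larger than the statistic for dy to dx.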