The Interaction Between Area and Production of Food Grain Crops in India: Empirical Evidence from ARDL Bounds Test Cointegration

This paper demonstrates a significant long-run relationship between the area and production of food grain crops grown in India during the period 1950-2018. The stability of the estimated model parameters is assessed using the cumulative sum of recursive residuals (CUSUM) test and the cumulative sum of squares of recursive residuals (CUSUMSQ) test. Additionally, cointegrating regressions, namely Fully Modified Ordinary Least Squares (FMOLS), Dynamic Ordinary Least Squares (DOLS), and Canonical Cointegrating Regression (CCR), are applied to check the long-run elasticities of the relationship.


Background of the study
Over the last few decades, regression modelling has traditionally been employed in the prediction and classification of agricultural production. For agricultural planning purposes, decision makers need simple and reliable estimation techniques for predicting crop production. Multiple regression, discriminant analysis, factor analysis, principal component analysis, cluster analysis and logistic regression analysis are the most widely used statistical techniques for the prediction and classification of agricultural production.
In agricultural production time series data, the problems of multicollinearity, autocorrelation and extreme values are unavoidable. In such complex situations, regression models may not provide accurate predictions. Regression models must satisfy assumptions such as the absence of autocorrelation and of multicollinearity among the independent variables; when these assumptions are violated, the estimated models fit poorly and the resulting parameter estimates are inefficient (Zaefizadah et al. 2011). In most agricultural practice, crop production is influenced by a great variety of interrelated factors and the data are autocorrelated, so it is difficult to describe the relationships by conventional methods. Hence, in the present study, the Autoregressive Distributed Lag (ARDL) model is used to handle the complicated relations and strong autocorrelation present in the crop production data. Granger (1988) demonstrated that causal relations among variables could be examined within the framework of an error correction model (ECM) with cointegrated variables. While the short-run dynamics are captured by the individual coefficients on the lagged terms, the error correction term (ECT) contains information on long-run causality. The significance of the lagged explanatory variable identifies short-run causality, while a negative and statistically significant ECT signifies long-run causality.
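As an illustration of the error correction framework described above, a two-variable ECM can be written in the following generic form. The symbols c, \gamma_i, \delta_j, \lambda, \theta_0, \theta_1 and \mu_t are notation assumed here for exposition; they are not taken from the study's tables.

```latex
\Delta y_t = c + \sum_{i=1}^{p-1} \gamma_i \, \Delta y_{t-i}
           + \sum_{j=0}^{q-1} \delta_j \, \Delta x_{t-j}
           + \lambda \, \mathrm{ECT}_{t-1} + \mu_t,
\qquad
\mathrm{ECT}_{t-1} = y_{t-1} - \theta_0 - \theta_1 x_{t-1}
```

In this form, the \gamma_i and \delta_j carry the short-run dynamics, while \lambda, the coefficient on the lagged error correction term, carries the long-run information.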

ARDL Methodological Literature Review
Alimi (2014) investigated the relationship between expected inflation and nominal interest rates in Nigeria and the extent to which the Fisher effect held for the period 1970-2021. He applied ARDL bounds testing and a vector error correction model (VECM), and the stability of the function was tested with the CUSUM and CUSUMSQ tests. Finally, the CUSUM test confirmed the long-run relationship between the variables and showed the stability of the coefficients.
Nkoro and Uko (2016) provided insight into the issues surrounding the ARDL cointegration technique to enable young practitioners to apply, estimate, and interpret it properly within the ARDL cointegration framework. Their study shows that, unlike other techniques, the ARDL cointegration technique does not require pre-testing for unit roots. Consequently, the ARDL cointegration technique is preferable when dealing with variables that are integrated of different orders, I(0), I(1) or a combination of both, and it is robust when there is a single long-run relationship between the underlying variables in a small sample. The long-run relationship of the underlying variables is detected through the F-statistic (Wald test). In this approach, a long-run relationship is said to be established when the F-statistic exceeds the critical value band. The major advantage of this approach lies in its identification of the cointegrating vectors where there are multiple cointegrating vectors. However, the technique breaks down in the presence of an integrated stochastic trend of order I(2). To avoid wasted effort, it may be advisable to test for unit roots, though not as a necessary condition. For sound forecasting and policy analysis, the conditions under which the ARDL cointegration technique is valid need to be examined in order to avoid its wrongful application, estimation, and interpretation. If these conditions are not met, the result may be model misspecification and inconsistent and unrealistic estimates, with implications for forecasts and policy. Based on the above, the present study aims to investigate the short- and long-term cointegration relationships between the area and production of food grain crops grown in India during the period 1950-2018 using an ARDL model and the bounds cointegration test.

ARDL model
The Autoregressive Distributed Lag (ARDL) models are standard least squares regressions that include lags of both the dependent variable and explanatory variables as regressors (Greene, 2008). Although ARDL models have been used in econometrics for decades, they have gained popularity in recent years as a method of examining cointegrating relationships between variables through the work of Pesaran and Shin (1998) and Pesaran et al. (2001).
In particular, if y_t is the dependent variable and x_1, x_2, ..., x_k are k explanatory variables, a general ARDL(p, q_1, q_2, ..., q_k) model includes p lags of y_t and q_j lags of x_j for j = 1, 2, ..., k, respectively. An ARDL(p, q) model has p lags of the dependent variable and q lags of the independent variable, together with a random "disturbance" term µ_t.
The model is "autoregressive" in the sense that y_t is "explained" (in part) by lagged values of itself. It also has a "distributed lag" component in the form of successive lags of the explanatory variable x. Sometimes, the current value of x itself is excluded from the distributed lag part of the model's structure (Soharwardi et al., 2018).
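In standard notation, the two models described above can be written as follows. The coefficient symbols \beta and \alpha are assumed here for illustration; y_t, x_j, p, q_j and \mu_t follow the definitions in the text.

```latex
% General ARDL(p, q_1, ..., q_k)
y_t = \beta_0 + \sum_{i=1}^{p} \beta_i \, y_{t-i}
    + \sum_{j=1}^{k} \sum_{l=0}^{q_j} \alpha_{j,l} \, x_{j,t-l} + \mu_t

% Two-variable ARDL(p, q)
y_t = \beta_0 + \sum_{i=1}^{p} \beta_i \, y_{t-i}
    + \sum_{l=0}^{q} \alpha_l \, x_{t-l} + \mu_t
```

Dropping the l = 0 term from the distributed lag sum gives the variant in which the current value of x is excluded.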

Materials
The present study investigates the short- and long-term cointegration relationships between the area and production of food grain crops grown in India during the period 1950-2018.

Methods
To apply the ARDL model, the study variables should fulfil certain stationarity conditions. That is, the variables should be purely I(0), purely I(1) or a mixture of I(0) and I(1) (Alimi, 2014). To test this, three different tests, viz., the Dickey and Fuller (1979), Phillips and Perron (1988) and Kwiatkowski et al. (1992) tests, were used. The Akaike information criterion (AIC) was used to select the optimal lag. To test the normality of the residuals, the Jarque-Bera test (Jarque and Bera, 1980) was used. To test for autocorrelation and serial correlation, the Ljung-Box test (Ljung and Box, 1979) was used.
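As a rough illustration of the residual normality check mentioned above, the Jarque-Bera statistic can be computed from the sample skewness and kurtosis of the residuals. The sketch below is a minimal pure-Python version; the function name and the toy residuals are assumptions made for illustration and are not the study's data.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its asymptotic p-value.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and
    K the sample kurtosis. Under normality JB is asymptotically chi-square
    with 2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments with divisor n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Toy residuals (hypothetical, for illustration only)
jb_stat, jb_p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(jb_stat, jb_p)
```

A p-value above the chosen significance level means that normality of the residuals is not rejected.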

RESULTS AND DISCUSSION
In this section, we provide the empirical findings and their interpretations in sequence.

Unit root test
The results of the ADF, PP and KPSS unit root tests in Table 1 reveal that both study variables are stationary at level, that is, both are I(0). As the KPSS statistics are non-significant, the study variables AREA and PRODN are stationary without differencing.

Summary statistics
The summary statistics are presented in Table 2.

Figure. Trends in food grain production in India

Model selection
To choose the optimal lag values p and q, the AIC was calculated for different values of p and q. The lower the AIC, the better the choice of lags. Fig. 3 shows that the AIC is lowest for lags p=4 and q=4. Accordingly, the ARDL(4,4) model is found to be the best among the 20 models investigated with different lag values.
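The lag search described above can be sketched as a grid of OLS fits compared by AIC. The example below is a minimal sketch under stated assumptions: it uses NumPy, synthetic data rather than the study's series, hypothetical function names, and a grid of p = 1..4 and q = 0..4, which is one way of arriving at 20 candidate models. All candidates are fitted on the same sample so that their AIC values are comparable.

```python
import numpy as np


def ardl_aic(y, x, p, q, max_lag):
    """Fit ARDL(p, q) by OLS on observations t >= max_lag and return the AIC."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y) - max_lag
    columns = [np.ones(n)]                                 # intercept
    for i in range(1, p + 1):                              # lags of y
        columns.append(y[max_lag - i: len(y) - i])
    for j in range(0, q + 1):                              # current and lagged x
        columns.append(x[max_lag - j: len(x) - j])
    design = np.column_stack(columns)
    target = y[max_lag:]
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ beta
    rss = float(resid @ resid)
    # Gaussian AIC up to an additive constant
    return n * np.log(rss / n) + 2 * design.shape[1]


def select_lags(y, x, max_p=4, max_q=4):
    """Return the (p, q) pair with the lowest AIC and the full AIC table."""
    max_lag = max(max_p, max_q)
    aic_table = {
        (p, q): ardl_aic(y, x, p, q, max_lag)
        for p in range(1, max_p + 1)
        for q in range(0, max_q + 1)
    }
    best_pq = min(aic_table, key=aic_table.get)
    return best_pq, aic_table


# Synthetic series (not the study's data), generated from an ARDL(1, 1) process
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=200))
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t] - 0.3 * x[t - 1] + rng.normal(scale=0.1)

best, table = select_lags(y, x)
print(best, len(table))
```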

ARDL(4,4) model
The ARDL(p=4, q=4) model was employed to study the short-run relationship between the area under food grain crops and its production. The fitted model is highly significant, and the coefficient of determination is 99%, which is remarkably high. Here, area and production at lag one are highly significant. Additionally, production at lag four and area at lag one are significant at the 5% level. All the model parameters are significant except the second lags of AREA and PROD. The value of the D-W statistic is approximately two, indicating no first-order serial correlation in the residuals and hence no sign of spurious results. To ensure the consistency of the ARDL(4,4) model, the following residual diagnostic tests are carried out.
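For reference, the Durbin-Watson statistic cited above is the ratio of the sum of squared successive differences of the residuals to their sum of squares. The sketch below uses a hypothetical function name and toy residuals, not the study's residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2).

    Values near 2 suggest no first-order autocorrelation, values near 0
    suggest positive and values near 4 negative autocorrelation.
    """
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Toy residuals that alternate in sign (strong negative autocorrelation)
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(dw_alternating)  # 3.0
```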

Ljung-Box test for autocorrelation
The results of the Ljung-Box autocorrelation test (Ljung and Box 1979) presented in Table 4 indicate that the p-values of the Q statistics are all greater than 0.05, which suggests the absence of autocorrelation in the model errors.
When an analysis involves time-series data, the possibility of serial correlation is usually high. Therefore, the residuals are also tested for serial correlation using the Breusch-Godfrey LM test (Breusch, 1978; Godfrey, 1978). The results presented in Table 5 show that the null hypothesis of no serial correlation cannot be rejected, since the p-value of the test is greater than 0.05; hence, there is no evidence of serial correlation.
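The Ljung-Box Q statistic referred to above combines the squared residual autocorrelations up to a chosen lag h. The following is a minimal pure-Python sketch with a hypothetical function name and toy residuals; in practice Q is compared with a chi-square distribution with h degrees of freedom, adjusted for the number of estimated parameters where appropriate.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..h} r_k^2 / (n - k)."""
    n = len(residuals)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and n - 1")
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy residuals with a lag-1 autocorrelation of -0.75
q_stat = ljung_box_q([1.0, -1.0, 1.0, -1.0], max_lag=1)
print(q_stat)  # 4.5
```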

Breusch-Pagan-Godfrey heteroscedasticity test
To ensure consistency, the study further employed the Breusch-Pagan-Godfrey heteroscedasticity test, and the results are presented in Table 6. The results reveal that the null hypothesis of no heteroscedasticity cannot be rejected, as the test is non-significant (the p-value is greater than 5%). Hence, the error variance remains constant throughout the study period. The outcomes of the serial correlation, heteroscedasticity and normality tests indicate that the fitted ARDL(4,4) model is well specified.

Fit of the model
The actual and fitted plot of the ARDL(4,4) model shows that the fit is adequate for explaining the production of food grain crops and for future forecasts (Fig. 5).

Bounds test for cointegration
The bounds test developed by Pesaran et al. (2001) is employed to test for cointegration (a long-run relationship) between the study variables, area and production, and the results are presented in Table 7. The test results reveal that a cointegration relationship exists between area and production, as the bounds test statistic is greater than the I(1) upper bound (F-statistic = 9.098144 > 5.58). Hence, the null hypothesis of "no levels relationship" is rejected, which implies a long-run relationship between the study variables, area under food grain crops (AREA) and production (PRODN). The conditional error correction regression model is presented in Table 8. All the estimated parameters are significant except production at lag one and the first difference of production at lags one and three. The levels equation is presented in Table 9. Table 10 presents the error correction model, which estimates the speed of adjustment to equilibrium in a cointegrating relationship. Here, the error correction term, derived earlier from the levels equation, is included among the regressors and is denoted CointEq. The coefficient on this regressor is the speed of adjustment to equilibrium in each period. Here, the coefficient of CointEq is positive and highly significant. Thus, both variables under study move together in a positive direction.
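The decision rule of the bounds test used above can be sketched as a simple three-way comparison. The F-statistic and the I(1) upper bound below are the values quoted in the text; the I(0) lower bound is not reported in the text, so the value used here is a hypothetical placeholder, and the function name is likewise an assumption.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Classify a bounds-test F-statistic against the I(0) and I(1) bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "reject H0: evidence of a levels (long-run) relationship"
    if f_stat < lower_bound:
        return "fail to reject H0: no evidence of a levels relationship"
    return "inconclusive"


F_STAT = 9.098144        # reported in the text
UPPER_I1 = 5.58          # I(1) bound reported in the text
LOWER_I0_ASSUMED = 4.0   # hypothetical placeholder, not from the text

decision = bounds_test_decision(F_STAT, LOWER_I0_ASSUMED, UPPER_I1)
print(decision)
```

Because the reported F-statistic exceeds the upper bound, the outcome does not depend on the placeholder lower bound.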

Johansen cointegration test
Like the ARDL bounds test and the cointegrating equations, the Johansen cointegration results provide evidence of one cointegrating vector. Table 11 presents the trace and max-eigenvalue statistics. Both statistics are significant, indicating the existence of one cointegrating relationship between the study variables, which confirms the results of the ARDL bounds test.

Long-Run Elasticities
As cointegration exists among the study variables, long-run elasticities are estimated with the FMOLS, DOLS and CCR equations, taking production as the regressand and area as the regressor. The results are reported in Table 12. The findings show that a 1% increase in the area under the crop causes an increase of nearly 6% in production. The cointegration results from the ARDL bounds test and the long-run elasticities are corroborated by the Johansen cointegration method.
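To make the elasticity statement concrete, the short calculation below converts an elasticity into a percentage response. It assumes a log-log (constant elasticity) specification, which the text does not state explicitly, and uses 6 as a rounded stand-in for the "nearly 6" reported above; the function name is hypothetical.

```python
def production_response_pct(elasticity, area_change_pct):
    """Percent change in production implied by a constant-elasticity model.

    Under ln(PROD) = a + b * ln(AREA), scaling AREA by (1 + g) scales
    PROD by (1 + g) ** b.
    """
    growth = 1.0 + area_change_pct / 100.0
    return (growth ** elasticity - 1.0) * 100.0


APPROX_ELASTICITY = 6.0  # rounded from the "nearly 6" stated in the text

response = production_response_pct(APPROX_ELASTICITY, 1.0)
print(round(response, 2))  # 6.15
```

For small changes the exact response is close to the elasticity itself, which is why a 1% change in area is read as a change of roughly 6% in production.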

CONCLUSION
The estimated ARDL(p=4, q=4) model is highly significant, and the coefficient of determination, R² = 99%, implies that almost 99% of the variation in the dependent variable is explained by the model, with the remainder attributed to the error term. The estimated model shows no serial correlation, its errors are homoskedastic, and the errors are normally distributed. The value of the D-W statistic is nearly equal to two, which confirms that there are no spurious results. The bounds test results reveal a long-run relationship between the area under food grain crops and its production. The error correction term is positive and highly significant, which is one of the desirable qualities of the model and reflects that the variables under study move together in a positive direction. The findings show that a 1% increase in the area under the crop causes an increase of nearly 6% in production.