Panel Data Modelling for Indian Food Grain Production


 The present investigation was carried out to study food grain production trends in different states in India for the period 2001-02 to 2020-21 using a panel regression model. The results reveal that state-to-state differences in food grain production are highly significant: the highest food grain production was registered in Uttar Pradesh, followed by Punjab and Madhya Pradesh, and the lowest in Kerala and Himachal Pradesh. The fixed effect model, which is highly significant, was found to be suitable for studying the trend and explains 82% of the variation in food grain production. An overall increase in food grain production is noted.


Introduction
Over the last few decades, regression modelling has been widely employed in agricultural production prediction and classification. For agricultural planning purposes, decision-makers need simple and reliable estimation techniques for crop production prediction. Multiple regression, discriminant analysis, factor analysis, principal component analysis, cluster analysis and logistic regression analysis are the most commonly used statistical techniques for the prediction and classification of agricultural production.
In agricultural production time series data, the problems of multicollinearity, autocorrelation and extreme values are unavoidable. In such complex situations, regression models may not provide accurate predictions. Regression models must satisfy assumptions such as the absence of autocorrelation in the errors and of multicollinearity among the independent variables; when these assumptions are violated, the estimated models fit poorly and the parameter estimates obtained from them are inefficient. In most agricultural practices, crop production is influenced by a great variety of interrelated factors, and it is difficult to describe their relationships using conventional methods (Zaefizadah et al., [1]).
In this study, a panel data regression model is used to handle the complicated relations and strong autocorrelation present in the crop production data.
Panel data combine cross-sectional and time series data. A regression suited to panel data therefore has the advantage of distinguishing between fixed and random effects. Fixed effects are effects that are independent of random disturbances, e.g. unit-specific effects that do not vary over time. Random effects are effects that include random disturbances. Panel data are more informative because they contain more information, but they have to be modelled correctly by taking into account fixed versus random effects.
Panel data help to control for the heterogeneity of cross-section units such as individuals, states, firms and countries over time. Panel data estimation treats all cross-section units as heterogeneous, which helps to obtain unbiased estimates. There are time-invariant and state-invariant variables, which may or may not be observed. Compared with pure cross-section or pure time series data, panel data estimation is better able to identify and measure effects of independent variables on dependent variables that cannot be measured using time series or cross-section data alone. In addition, "Panel data give more informative data, more variability, less collinearity among the variables, more degrees of freedom and more efficiency". It is also a better estimation method for studying the duration of economic states and the "dynamics of change" over time, and a good method to "construct and test complicated behavioral models" (Baltagi, [2]).
Based on the above discussion, the present study aims to examine the trends in food grain production in different states in India over the period 2001-02 to 2020-21 based on a panel regression model. A simple panel data regression model is specified as
$$Y_{it} = \alpha + \beta X_{it} + u_{it} \tag{1}$$
where $u_{it}$ are the estimated residuals from the panel regression analysis. Here, $Y$ is the dependent variable, $X$ is the independent or explanatory variable, $\alpha$ and $\beta$ are the intercept and slope, $i$ stands for the $i$-th cross-sectional unit and $t$ for the $t$-th time period. $X$ is assumed to be non-stochastic, the error term is assumed to follow the classical assumptions (zero mean, constant variance and no serial correlation), and the least-squares dummy variable (LSDV) model is assumed to be valid (Bhaumik, [8]).
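As an illustration of the simple panel specification, the following minimal sketch stacks a small panel into long format and fits Y_it = alpha + beta * X_it + u_it by ordinary least squares, with time as the only regressor. The panel values, the unit labels "A", "B", "C" and the function names `to_long` and `pooled_ols` are hypothetical; they are not the study's data or software.

```python
# Hypothetical toy panel: 3 cross-sectional units observed over 4 periods.
# All numbers are invented for illustration; they are not the study's data.
panel = {
    "A": [10.0, 12.0, 14.0, 16.0],
    "B": [20.0, 22.0, 24.0, 26.0],
    "C": [30.0, 32.0, 34.0, 36.0],
}


def to_long(panel):
    """Stack the panel into (unit, t, y) rows; t = 1..T serves as the regressor X."""
    return [
        (unit, t, y)
        for unit, series in panel.items()
        for t, y in enumerate(series, start=1)
    ]


def pooled_ols(rows):
    """Fit Y_it = alpha + beta * X_it + u_it by OLS, ignoring unit identity."""
    xs = [float(t) for _, t, _ in rows]
    ys = [y for _, _, y in rows]
    n = len(rows)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    residuals = [y - (alpha + beta * x) for x, y in zip(xs, ys)]
    return alpha, beta, residuals


rows = to_long(panel)
alpha, beta, residuals = pooled_ols(rows)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

In this toy panel the single common intercept leaves unit "A" with uniformly negative residuals and unit "C" with uniformly positive ones, which is the kind of unit-specific heterogeneity that fixed and random effects are meant to absorb.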

Random-Effect (RE) Model:
The random effects model is also called the error component model (ECM). In this model the cross-section units have random intercepts instead of fixed intercepts. The rationale behind the random effects model is that, unlike the fixed effect model, the variation across entities is assumed to be random and uncorrelated with the predictor or independent variables included in the model. The crucial distinction between fixed and random effects is whether the unobserved individual effect embodies elements that are correlated with the regressors in the model, not whether these effects are stochastic (Green, [4]). The RE model assumes that individual-specific effects are random and one ...

Results and Discussion
The results obtained by applying different statistical tools related to panel regression models are discussed in the following sections.

Summary Statistics:
The descriptive statistics presented in Table 1 and depicted in Fig. 1 reveal that state-wise food grain production is normally distributed, as indicated by the p-values of the Jarque-Bera statistic, except for Himachal Pradesh and Odisha. The highest food grain production was registered in Uttar Pradesh, followed by Punjab and Madhya Pradesh. The lowest production was registered in Kerala and Himachal Pradesh.
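For reference, the Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = (n/6) * (S^2 + (K - 3)^2 / 4) and is compared with a chi-square distribution with 2 degrees of freedom. The sketch below is a minimal illustration; the function name `jarque_bera` and the series `toy_series` are hypothetical and are not taken from the study.

```python
import math


def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments computed with divisor n, as in the usual JB definition.
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical series, invented for illustration only.
toy_series = [1.0, 2.0, 3.0, 4.0, 5.0]
jb_stat, jb_p = jarque_bera(toy_series)
print(f"JB = {jb_stat:.4f}, p-value = {jb_p:.4f}")
```

A large p-value means normality is not rejected. The p-value is asymptotic, so it should be read with caution for short series such as twenty annual observations per state.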
Fig. 2 shows that the highest food grain production was registered during the year 2018-19 and that year-wise production shows an increasing trend. The results presented in Table 2 reveal that the ANOVA F-test and Welch F-test statistics are significant, indicating that mean food grain production differs significantly across the states. The unit root test result presented in Table 3 reveals that the Levin, Lin & Chu t statistic is significant at the 1% level of significance (p-value 0.0001); hence the study variable, PROD, is stationary at level, i.e. I(0).
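The ANOVA F-test of equal means across states compares between-state and within-state variation. A minimal sketch of the classical one-way F statistic follows; the function name `one_way_anova_f` and the three toy groups are hypothetical, and the Welch variant, which relaxes the equal-variance assumption, is not shown.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic with its two degrees of freedom."""
    k = len(groups)                      # number of groups (e.g. states)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum(
        sum((v - m) ** 2 for v in g) for g, m in zip(groups, group_means)
    )
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical production series for three units, invented for illustration.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df1, df2 = one_way_anova_f(toy_groups)
print(f"F({df1}, {df2}) = {f_stat:.3f}")
```

A large F relative to the F(k-1, n-k) distribution indicates that at least one group mean differs from the others.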

Constant Coefficient Model (Panel OLS):
The CCM method is employed with food grain production (PRODN) as the dependent variable and time, X, as the independent variable; the results are presented in Table 4. The results reveal that the intercept and slope are positive and highly significant at the 1% level of significance. The positive slope indicates that food grain production follows an increasing trend. The model is highly significant at the 1% level of significance, but its R² value of 15% is very low. Additionally, the estimated Durbin-Watson value of 0.043090 is quite low, which suggests the presence of positive autocorrelation in the residuals. The estimated model assumes that the intercept and the slope coefficient of the time variable X are identical across all eighteen states. Therefore, despite its simplicity, the CCM may distort the true relationship between the dependent variable, food grain production (PRODN), and the independent variable, time (X), across the states.
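The Durbin-Watson statistic is the ratio of the sum of squared successive differences of the residuals to the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values near 0 suggest strong positive autocorrelation, and values near 4 suggest negative autocorrelation. The sketch below computes it for a single residual series; the function name `durbin_watson` and both toy series are hypothetical, and how a given package handles the boundaries between states in a stacked panel is not addressed here.

```python
def durbin_watson(residuals):
    """Return sum((e_t - e_{t-1})^2) / sum(e_t^2) for one residual series."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residual series, invented for illustration only.
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # long runs of the same sign
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # sign flips every period

dw_persistent = durbin_watson(persistent)
dw_alternating = durbin_watson(alternating)
print(f"persistent: {dw_persistent:.3f}, alternating: {dw_alternating:.3f}")
```

Residuals that keep the same sign for long stretches, as happens when a common intercept is forced onto states with very different production levels, push the statistic towards 0.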

Fixed-Effect or Least-Squares Dummy Variable Regression Model:
The results are presented in Table 5. The diagrammatic representation of fixed effects in all eighteen states is depicted in Fig. 3.
Based on this result, it is concluded that the fixed effect model is better than the CCM.

Fig.3. Fixed effect in different states
To confirm the presence of fixed effects, the redundant fixed effects test was carried out; the results are presented in Table 7.
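One common form of a redundant fixed effects test is an F-test that compares the residual sum of squares of the pooled model with that of the fixed-effect (LSDV) model. The sketch below estimates both models with time as the single regressor and forms F = [(RSS_pooled - RSS_FE) / (N - 1)] / [RSS_FE / (NT - N - K)]. It is a minimal illustration under stated assumptions: the toy panel and all function names are hypothetical, and whether the software used in the study computes the test in exactly this way is an assumption.

```python
# Hypothetical toy panel (3 units, 4 periods); not the study's data.
panel = {
    "A": [10.0, 12.0, 15.0, 16.0],
    "B": [20.0, 23.0, 24.0, 26.0],
    "C": [30.0, 31.0, 34.0, 36.0],
}


def fit_pooled(panel):
    """Pooled OLS with one common intercept and one common slope on time."""
    xs, ys = [], []
    for series in panel.values():
        for t, y in enumerate(series, start=1):
            xs.append(float(t))
            ys.append(y)
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    beta = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    alpha = y_bar - beta * x_bar
    rss = sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))
    return alpha, beta, rss


def fit_fixed_effects(panel):
    """Within (LSDV-equivalent) estimator: common slope, unit-specific intercepts."""
    sxy = sxx = 0.0
    means = {}
    for unit, series in panel.items():
        periods = len(series)
        x_bar = (periods + 1) / 2.0          # mean of t = 1..T
        y_bar = sum(series) / periods
        means[unit] = (x_bar, y_bar)
        for t, y in enumerate(series, start=1):
            sxy += (t - x_bar) * (y - y_bar)  # demeaned within each unit
            sxx += (t - x_bar) ** 2
    beta = sxy / sxx
    intercepts = {u: yb - beta * xb for u, (xb, yb) in means.items()}
    rss = sum(
        (y - intercepts[u] - beta * t) ** 2
        for u, series in panel.items()
        for t, y in enumerate(series, start=1)
    )
    return intercepts, beta, rss


def redundant_fe_f(rss_pooled, rss_fe, n_units, n_obs, n_slopes=1):
    """F statistic for the null that all unit intercepts are equal."""
    df1 = n_units - 1
    df2 = n_obs - n_units - n_slopes
    return ((rss_pooled - rss_fe) / df1) / (rss_fe / df2), df1, df2


_, beta_pooled, rss_pooled = fit_pooled(panel)
intercepts, beta_fe, rss_fe = fit_fixed_effects(panel)
n_obs = sum(len(s) for s in panel.values())
f_stat, df1, df2 = redundant_fe_f(rss_pooled, rss_fe, len(panel), n_obs)
print(f"beta_FE = {beta_fe:.4f}, F({df1}, {df2}) = {f_stat:.2f}")
```

A large F statistic rejects the null of a common intercept, in which case the unit-specific intercepts are not redundant and the fixed-effect model is preferred to the pooled model.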

Random-Effect Model:
Finally, the random-effect model is estimated, and the results are presented in Table 9. The results reveal that the model is highly significant at the 1% level of significance, with a low R² value of 17%, an S.E. of regression of 2845.515 and an RMSE of 2837.60.
As in the case of the fixed effect model, the random-effect model's intercept and slope are highly significant at the 1% level of significance. The rho value is 0.9385, which indicates that about 93.9% of the total error variance is attributable to the individual (cross-section) effects. The diagrammatic representation of random effects in all eighteen states is depicted in Fig. 4. Based on this result, the presence of random effects in the eighteen states is confirmed.
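The interpretation of rho can be made explicit with a short worked equation. It assumes that the reported rho is the usual share of the cross-section variance component in the composite error variance; the symbols $\sigma_u^2$ (variance of the state-specific random effect) and $\sigma_\varepsilon^2$ (variance of the idiosyncratic error) are notation introduced here, not taken from the study's tables.

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2},
\qquad
\rho = 0.9385
\;\Longrightarrow\;
\frac{\sigma_u^2}{\sigma_\varepsilon^2}
  = \frac{\rho}{1-\rho}
  = \frac{0.9385}{0.0615}
  \approx 15.26 .
```

Under this assumption, the state-specific component of the error variance is roughly fifteen times the idiosyncratic component, i.e. about 93.85% of the composite error variance comes from differences between states.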

Breusch-Pagan Lagrange-Multiplier Test:
The results presented in Table 11 indicate that the Breusch-Pagan LM, Pesaran scaled LM and Pesaran CD test statistics are highly significant at the 1% level of significance, since all three p-values are equal to 0.0000; the null hypothesis of the test, "H0: There is constant variance among residuals", is therefore rejected. Hence, the above random effect model has the problem of heteroscedasticity.

Conclusion
The present investigation was carried out to study food grain production trends in different states in India for the period 2001-02 to 2020-21 using a panel regression model. The results reveal that state-to-state differences in food grain production are highly significant: the highest food grain production was registered in Uttar Pradesh, followed by Punjab and Madhya Pradesh, and the lowest in Kerala and Himachal Pradesh. The fixed effect model was found to be suitable for studying the trend and explains 82% of the variation in food grain production. An overall increase in food grain production has been observed.