Fluctuations in onion prices have had political and economic ramifications in countries such as India. In this study, we estimate and then forecast the volatility of onion sale prices in major Indian wholesale markets. We first take daily price data from major vegetable wholesale markets across India and model them to compute the corresponding daily conditional volatilities using the traditional GARCH method. We then forecast the volatilities for the upcoming 10, 15 and 20 days using the same GARCH method and compare its forecasting accuracy with recent AI-led models. According to our comparisons, the deep learning-based LSTM model, in various configurations, outperforms the traditional models, achieving the highest accuracy in more than 70% of the cases. We expect that this study could help policymakers in managing sufficient buffer stock levels and food supply chain stakeholders in hedging against the overall market risks arising from price fluctuations.
Unusual price fluctuations of vegetables such as onions and potatoes are common in Indian wholesale markets and ultimately lead to social and political consequences. Rising food prices, driven by factors ranging from abnormal monsoon rains to lower imports from foreign markets, have a significant impact on the poor, who depend directly on government-funded food distribution programs. High food prices also force the deprived to spend less on high-quality, nutritious food items, as found by Mottaleb et al. (2018). According to Zaridis, A. et al. (2020) and Lee, H. H., and Park, C. Y. (2013), there are also considerable ramifications for certain stakeholders, especially SMEs in the food processing sector, whose investment decisions are affected and who are ultimately led to manage risk by hedging against price fluctuations in crucial vegetable commodities. Unlike cereals, vegetables such as tomato, onion and potato have a high price elasticity; of the three, onions show the most uneven price variations, as demonstrated by Raka, S., and Ramesh, C. (2017). Moreover, unlike price trends, which benefit producers and wholesalers in the short run, price volatility is detrimental to both consumers and producers, as it adds risk to their decision making and results in uncompetitive market behaviour from stakeholders such as wholesalers and retailers. Price fluctuations also lead to almost 40% of fresh food commodities being wasted or spoiled at the post-harvest stage in many emerging regions, including India (Balaji et al. (2016), Chauhan, Y. (2020)). Active futures markets for commodities in the metals and oil and gas sectors across the major countries have ensured better price discovery and risk mitigation for traders and wholesalers of these commodities.
Also, agri-based commodities such as grains (cereals and pulses), spices and dry fruits have various types of financial contracts that are traded on major exchanges throughout the world. However, vegetable commodities such as onions and potatoes have no such arrangement (Rajib (2015), Jacks (2007)). This makes the estimation of realized volatilities of these commodities an arduous task.
In this paper, we propose models for volatility forecasting based upon the returns of daily maximum prices for onions in prominent wholesale markets across India. Initially, the GARCH model is used to estimate the daily price volatility, also known as the conditional volatility. The price return data is then divided into training and validation sets. The training set returns are used as input to the GARCH(1,1) model, which in turn is used to forecast volatility for the upcoming 10, 15 and 20 days. The forecasted values are then compared with the corresponding volatility values generated earlier from the fitted return data to check the accuracy of the model. This accuracy is then compared with that of the conditional volatility forecasts generated by machine learning and deep learning models such as LSTM and its variations. We believe that, with accurate volatility forecasts, companies and organizations in the food processing sector could enter into forward agreements on these vegetable commodities with farmers through contract farming and hedge the risk arising from unnatural price fluctuations. This would have constructive results for all stakeholders in the large-scale agri-food lifecycle.
The term “volatility” can be defined as dispersion from the mean value and is generally calculated as the standard deviation of a time series over a specific time horizon (Figlewski, S. (1997)). According to Roll (1989), price volatility is the fluctuation of the prices of a particular commodity in time series data. According to Peterson and Tomek (2005), on the other hand, volatility is a measure of upside and downside potential from the mean value, irrespective of direction or trend. This helps in evaluating risk or uncertainty due to variations. Volatility in prices cannot be ascribed to high prices alone; there are various other causes as well, such as market manipulation and supply shocks.
In non-agricultural commodities such as metals, the effects of volatility in price movements have been extensively studied (Bentes, S. R. (2015), Cui et al. (2015), Hu, Y., Ni, J., & Wen, L. (2020)). There are also works that link non-agricultural commodities with those that directly affect agricultural ones. For example, Zied Ftiti et al. (2020) showed that gas prices are linearly related to crude oil prices only during extreme price movements, indirectly affecting cooking gas prices. Similarly, Dimitriadis et al. (2020) showed that increased instability in crude oil prices could have a positive effect on ethanol prices, as is the case with corn prices.
The volatility of agricultural commodities can be divided into two types, namely high-frequency and low-frequency (Von Braun, J., & Tadesse, G. (2012)). High-frequency volatility persists for a single season or less, for example sudden price variations due to shocks such as uneven weather or a pandemic. Low-frequency volatility covers deviations that persist for more than one season. For onion prices in developing-country markets such as India, low-frequency volatility is highly prevalent, as the price variations persist through more than one season (Peterson, H. H., & Tomek, W. G. (2005), Von Braun, J., & Tadesse, G. (2012)).
This phenomenon is evident from the work of Gilbert and Morgan (2010), which states that high food prices in Europe between 2007 and 2009 were followed by unusually high volatility. In fact, according to Gilbert et al. (2010), the price volatility of agricultural commodities in the 1970s and 1980s was considerably higher than in the following two decades until 2007. Prices of edible oil and related products such as groundnut oil, soya beans and soya bean oil have been shown to follow the same behaviour. High price variance can be attributed to causes such as weather shocks, demand variability due to income shocks (Gilbert (2010a)) and policy shocks (Christiaensen (2009)). Gilbert and Morgan (2010) showed that a significant presence of hoarding together with a shortage of stocks could result in high price elasticity due to consumption and production shocks. According to Wright & Williams (1991) and Deaton & Laroque (1992), hoarding reduces volatility when there is an abundant supply available or during positive shocks, but not when there are shortfalls or negative price shocks. There is a consistent belief among major food economists that high price volatility among food items is caused by factors such as supply and demand shocks, high oil prices, the diversion of biofuel feedstock for clean fuels and futures price speculation (Abbot et al. (2008); Baffes (2007); Cooper & Lawrence (1975)).
This makes the case for retailers to have a procurement strategy that adjusts to the demand for those items at the end-consumer level (Nagare et al. (2003)). Hence, practitioners and experts in the field have consistently focused on accurate forecasting of vegetable demand in order to minimize delays and shorten lead times in transportation. Researchers have forecast demand using various methods to maintain a balanced supply chain with the least delay at every stage. Examples include demand forecasting based upon seasonality analysis (Ehrenthal et al. (2014)), time-series sales forecasts based on trends at various retail stages (Zhang and Qi (2005)), a Bayesian combination sales forecasting model for retailers by Wang, W.J. and Xu, Q. (2014), and demand forecasting during promotions (van Donselaar et al. (2016), Ali et al. (2012)). Other studies have focused on time series forecasting techniques for daily, monthly and annual sales data of various vegetable commodities (Gilbert, K.C. and Chatpattananan, V. (2006), Ramos, P. et al. (2015)). Demand forecasts at each stage are therefore necessary for identifying the optimal amount of product to buy, so as to keep a check on stock-outs and excess inventories at the retail level. However, demand forecasting provides only limited information on the origins of the uneven demand and supply shocks affecting the efficiency of the fresh food supply chain (Gilbert, C. L., & Morgan, C. W. (2010)). Various other aspects have a direct or indirect effect on the whole supply chain, and forecasting the demand for the commodity alone may not provide proper solutions to those underlying issues. According to Birthal et al. (2018), excessive fluctuations in food prices could have a wider impact, destabilizing a broad set of stakeholders including farmers, consumers and processors.
In many cases, fluctuations in the prices of foods such as onions can have deep political ramifications, ultimately bringing down the whole government (Gulati and Ganguly (2013)). It is widely noted that highly volatile food prices may mislead production decisions, ultimately leading to a scarcity of resources at the other end of the supply chain (Lee and Park (2013)). Hence, to meet the demands of future food sustainability, it is imperative to assess price volatility beforehand rather than the price itself (Kharas (2011)).
According to Dorward (2012), there remains a significant lack of clarity on the causes that trigger higher price volatility in food supply chains in both international and Indian markets. On the domestic front, high volatility has been attributed to causes including, but not limited to, excess demand and supply, hoarding, infrastructure issues and trade embargoes. Other causes have also been proposed: Chadha et al. (2012) attribute it to extreme weather conditions, Chengappa et al. (2012) point to anti-competitive trade behaviour of wholesalers downstream, and Bathla and Srinivasulu (2011) to asymmetry of information downstream in the groundnut oilseed market. Birthal et al. (2018), on the other hand, suggest that high volatility in vegetable prices is due to anti-competitive practices such as commodity hoarding, reiterating that people tend to hoard materials and create an artificial shortage in the market to protect themselves from scarcity of supplies in future periods (Goncalves (2003)). Saxena et al. (2020) identify how volatility spillover affects end consumers.
Models such as GARCH and its variations are considered apt for estimating volatility. For example, Lama et al. (2015) show, using different variations of GARCH, that onion prices were more volatile than potato prices in some of the selected markets. Similarly, Sharma et al. (2018) observed similar results based upon the coefficient values of GARCH models fitted to the prices of various vegetable commodities in Gujarat. In this study, we focus on forecasting volatility in onion prices, as they are more sensitive than those of other major vegetables such as potatoes and tomatoes (Sharma et al. (2018), Birthal et al. (2018), Saxena et al. (2020)). Many international research efforts have addressed predictions for metallic commodities such as gold and copper (Hu, Y., Ni, J., & Wen, L. (2020)). For agricultural commodities, Giot, P., & Laurent, S. (2003) made one of the early recognized attempts to forecast the price volatility of commodities such as cocoa, coffee and sugar futures based on the GARCH model. More recent attempts involve the HAR (heterogeneous autoregressive) model introduced by Corsi (2009). Tian et al. (2017a) and Yang et al. (2017) use the HAR model to forecast the volatility of agricultural commodities with price data from Chinese futures markets. In the Indian context, Thiyagarajan et al. (2015) forecasted the volatility of Dhaanya futures prices with respect to the NIFTY benchmark index and the exchange rate. However, no studies focus on predicting onion price volatility, and almost none have used deep learning methodologies for the purpose. One major explanation is that most of the commodities forecasted in these studies are traded through futures contracts on commodity exchanges, which makes it easier to evaluate realized volatilities from the daily futures and spot prices of the underlying.
However, in the case of onions, no contracts are traded in either Indian or overseas markets. Hence, in the present study, we evaluate daily conditional variances of the returns of daily onion prices using the GARCH model, which relies on its own past values and random error terms. Using this daily conditional variance data, the values for the next 10, 15 and 20 days are forecasted using the GARCH(1,1) method itself and LSTM-based deep learning methods.
Onions are grown in both the rabi and kharif seasons, which gives rise to both seasonal and annual effects on daily price movements and volatility. Hence, the study uses daily per-quintal onion prices covering roughly 11 years, from January 2010 to March 2021, taken from the AGMARKNET website’s database. Seven major wholesale markets were selected on the basis of their daily arrival volumes relative to other wholesale markets throughout the country, with the purpose of better elucidating the precarious nature of onion prices. The state of Maharashtra is the largest producer of onions in India and, as a result, both Lasalgaon and Pimplegaon act as important producer markets and sources of onion supplies. Azadpur, on the other hand, is one of the largest consumer markets, covering major parts of Northern India. To gain a deeper perspective, other wholesale markets, namely Ahmedabad, Bengaluru, Kolkata and Ludhiana, have also been selected. These markets are located in states with high consumption as well as a high rate of production of vegetables such as onions, potatoes and tomatoes. From the onion prices, daily returns have been calculated, providing normalized values which in turn help in estimating the variation in price changes, also known as volatility. The precision of volatility prediction for prices of various asset classes has been extensively studied, and many studies are under way to further improve accuracy. In capital markets, volatility is usually estimated using GARCH (Generalized Autoregressive Conditional Heteroskedasticity) methods and their variations. The GARCH model is an extended version of the ARCH (autoregressive conditional heteroskedasticity) model. The difference between the two is that the ARCH model uses only previous-period error terms to describe the variance of the current error term.
A GARCH(1,1) model, on the other hand, also takes the previous period’s variance term into consideration. GARCH was initially proposed by Bollerslev, T. (1987), and is considered to be one of the most robust methods for modelling changes in variance in a time series. Many extensions have since been proposed, such as FIAPGARCH, APGARCH, EGARCH and SEARCH (Cao et al. (2009), Ding et al. (2007), Nelson & Cao (1992), Canarella & Pollard (2007)), among others. These methods tend to have similar accuracy levels, as the relationship among variables is intricate and depends upon the problem and the size of the data set (Bildirici, M., & Ersin, Ö. Ö. (2009), Nelson, D. B. (1991), Glosten et al. (1993)).
The latest improvements in deep learning methodology and its application to forecasting are grounded in ANN (Artificial Neural Network) models. As Kamruzzaman & Sarker (2004) and Wang et al. (2006) note in their works on neural networks, ANNs imitate the human brain in acquiring and organizing knowledge. One of the many benefits of ANNs is that there is no need to specify the functional relationship among the variables of the network (Kristjanpoller W. & Minutolo M. C. (2015), Liu W. K. & So M. K. (2020)). This makes them suitable for adding out-of-order input variables to the given model. Although, according to Benidis K. et al. (2020), ANNs have high accuracy levels, their extensions such as the RNN (Recurrent Neural Network) achieve even higher prediction accuracy. RNNs process sequential data by retaining it in a form of memory cell and learn over successive iterations. They have been used in applications such as speech recognition, text-to-speech conversion and time-series prediction. There are many examples where RNNs and GARCH-type models are used for prediction across asset classes in the finance industry. RNNs have been reported to show better predictive accuracy than their ARCH-model counterparts, as shown by Petropoulos et al. (2017), Henriquez and Kristjanpoller (2019) and Gers et al. (2000). Many hybrid models have also been proposed for various commodities (Bildirici and Ersin (2009), Kristjanpoller and Minutolo (2016), Lu et al. (2016)), and these have been shown to have advantages over simple RNNs or time series models. Special forms of RNN such as the LSTM (long short-term memory) and stacked LSTM have lately been shown to achieve enhanced results on time series data. Hochreiter and Schmidhuber (1997) were the first to conceptualize the LSTM network. Since then, it has shown successful outcomes in various time series forecasting tasks.
LSTMs are known to remember patterns in data over a longer duration and hence are well suited to forecasting long sequential data. The stacked LSTM comprises multiple LSTM layers; it was first formulated by Graves (2013), and its first widely known application was to speech recognition problems. In a typical stacked LSTM, the initial LSTM layer produces a sequence of vectors that is fed as input to the succeeding LSTM layer; at the same time, each layer receives feedback from its earlier time step, which helps in capturing deep-rooted past data and thereby in learning the patterns in the specific time series.
These recent developments in AI and machine learning models open up the potential for better predictive accuracy for asset prices and volatility, helping stakeholders to manage the risks of their financial investments. In this paper, we propose a hybrid model that uses the traditional GARCH method to calculate the conditional volatility and stacked LSTMs to forecast its values for the next 10, 15 and 20 days. We compare this model with the traditional GARCH alone, with other prominent models such as ARIMA, and with plain vanilla LSTMs with varied input units. We hope that the model will incorporate the variance fluctuations captured by the GARCH models together with the nonlinear relationships among the variables captured by the RNN-based LSTMs, thereby providing an enhanced forecast of volatility values.
Volatility Calculation: Volatility is an important criterion for understanding the variation in prices of an asset whether a stock or a commodity. One of the suitable terms for defining volatility is the variance of the returns of the underlying asset prices. The general formula for calculating volatility is given by the following equation:
where $V_t$ in equation 1 is the realized volatility at day $t$ over $T$ trading days, $TR$ in equation 2 is the true return of the asset price, $R_i$ is the return of onion prices on day $i$, and $\bar{R}$ is the average return of onion prices over the $T$ trading days. In this study, three scenarios are considered, i.e. $T$ = 10, 15 and 20 days.
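Read against the symbol definitions above, equations 1 and 2 plausibly take the following form. This is a sketch under stated assumptions rather than the authors' exact expressions: it assumes that volatility is measured as the variance of returns over a forward window of $T$ days, that returns are logarithmic, and it introduces $P_t$ (the daily maximum price) as a symbol not used in the original text.

```latex
% Assumed form of equation (1): variance of returns over the next T trading days
V_t = \frac{1}{T} \sum_{i=t+1}^{t+T} \left( R_i - \bar{R} \right)^2 \qquad (1)

% Assumed form of equation (2): log return of the daily maximum price P_t
TR_t = \ln\!\left( \frac{P_t}{P_{t-1}} \right) \qquad (2)
```

If simple rather than logarithmic returns are intended, equation 2 would instead read $(P_t - P_{t-1}) / P_{t-1}$; the two agree closely for small daily changes.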
Measures of Prediction errors:
To compare the performance of the models, MAPE (Mean Absolute Percentage Error) is used. MAPE is extensively applied in the financial engineering literature in studying models for various commodity and foreign exchange markets. Many authors have also used other measures to assess forecasting accuracy. Authors such as Bentes (2015) and Kristjanpoller and Hernández (2017) have comprehensively used four measures of prediction error (i.e., MSE, MAE, RMSE, and MAPE) in volatility forecasting for various commodity markets. These four measures have also been adopted in studying prediction models in the foreign exchange market, for example by Sempinis et al. (2012), Petropoulos et al. (2017) and Henriquez and Kristjanpoller (2019). To be in line with the literature, all four are utilized in this study to compare the different volatility prediction models for the onion price.
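The four error measures named above can be computed as in the following minimal sketch. The function name `forecast_errors` and the sample volatility values are illustrative assumptions and are not taken from the study's data.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, MAE, RMSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual value, so it is undefined when any actual is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "MAE": mae, "RMSE": math.sqrt(mse), "MAPE": mape}


# Hypothetical actual and forecasted volatilities for three days.
metrics = forecast_errors([0.020, 0.040, 0.050], [0.022, 0.036, 0.050])
```

Because MAPE is scale-free, it allows markets with different price levels to be compared, whereas MSE, MAE and RMSE remain in the units of the volatility series.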
GARCH(1,1): Traditional methods based upon OLS (ordinary least squares) are not suitable for financial time series data, since OLS assumes a constant error variance whereas the variance of such series is heteroskedastic. To counter this, Engle (1982) introduced the ARCH (autoregressive conditional heteroskedasticity) model, in which the conditional variance of the current error term depends on the error terms of previous periods. Bollerslev (1986) extended this approach and introduced the GARCH (generalized autoregressive conditional heteroskedasticity) model, which takes into account previous conditional variances as well as previous error terms when computing the current conditional variance. GARCH also captures the volatility clustering present in a typical financial time series. The GARCH model can be represented by the following equation:
where $\alpha_0 > 0$, $\alpha_i \geq 0$ for $i = 1, \ldots, p$, and $\beta_j \geq 0$ for $j = 1, \ldots, q$, which guarantees that the conditional variance of GARCH(p, q) is always positive.
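Given the parameter constraints stated above, the GARCH equation referred to in the text has the standard form below. The symbols $\sigma_t^2$ (conditional variance) and $\varepsilon_t$ (error term) are assumptions introduced here for illustration, and the index convention follows the constraints as written, with $\alpha_i$ attached to past squared errors and $\beta_j$ to past conditional variances.

```latex
% GARCH(p, q) conditional variance, indices as in the stated constraints
\sigma_t^2 = \alpha_0
           + \sum_{i=1}^{p} \alpha_i \, \varepsilon_{t-i}^2
           + \sum_{j=1}^{q} \beta_j \, \sigma_{t-j}^2

% Special case used in this study, GARCH(1,1)
\sigma_t^2 = \alpha_0 + \alpha_1 \, \varepsilon_{t-1}^2 + \beta_1 \, \sigma_{t-1}^2
```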
In this study, the volatility is calculated from parameters estimated by supplying the returns of daily maximum prices to the GARCH(1,1) model as inputs. The order of both the error terms and the conditional variance is taken as 1, based on the descriptive analysis of various parameters when fitting the returns on the daily max price data. From figure 3 it is clearly visible that conditional volatility tends to cluster, showing the heteroskedastic property of the time series for the Lasalgaon, Azadpur and Pimplegaon data. Also, the GARCH method is well suited to high-frequency time series data, and hence it is a suitable approach for the daily returns of onion max prices.
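To make the GARCH(1,1) mechanics concrete, the sketch below filters conditional variances from a return series and produces multi-step variance forecasts for given parameters. It is a minimal illustration, not the estimation procedure used in the study: the parameter values `omega`, `alpha` and `beta` and the toy returns are hypothetical, and in practice the parameters would be estimated by maximum likelihood with a dedicated library.

```python
import math


def garch11_variances(returns, omega, alpha, beta):
    """Filter sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2."""
    mean = sum(returns) / len(returns)
    eps = [r - mean for r in returns]  # demeaned returns serve as error terms
    # Start the recursion at the sample variance (a common but arbitrary choice).
    var = [sum(e * e for e in eps) / len(eps)]
    for e in eps[:-1]:
        var.append(omega + alpha * e * e + beta * var[-1])
    return eps, var


def garch11_forecast(last_eps, last_var, omega, alpha, beta, horizon):
    """Variance forecasts 1..horizon steps ahead of the last observation."""
    forecasts = [omega + alpha * last_eps ** 2 + beta * last_var]
    for _ in range(horizon - 1):
        # Beyond one step, the expected squared error equals the forecast variance.
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


# Hypothetical parameters and toy daily returns, for illustration only.
omega, alpha, beta = 1e-5, 0.10, 0.85
returns = [0.010, -0.020, 0.015, 0.030, -0.025, 0.005]
eps, variances = garch11_variances(returns, omega, alpha, beta)
forecast = garch11_forecast(eps[-1], variances[-1], omega, alpha, beta, horizon=20)
volatility_forecast = [math.sqrt(v) for v in forecast]
long_run_variance = omega / (1.0 - alpha - beta)
```

With `alpha + beta < 1`, the forecasts decay geometrically towards `long_run_variance`, which illustrates why a pure GARCH forecast carries less information as the horizon grows.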
ARIMA: As the name suggests, Auto-Regressive Integrated Moving Average models are a class of models used for time series forecasting that exhibit linear properties and are based upon the Box-Jenkins methodology (Box et al. (2015)). ARIMA can also perform exponential smoothing of a time series, which supports better data representation, helps avoid overfitting and improves accuracy. ARIMA offers a flexible approach to forecasting, as it can handle various types of time series. However, these models have serious drawbacks: they are weak at interpreting non-linear time series, as they assume a linear relationship among variables, and hence are not suitable for complex problems (Zhang, G. P. (2003)). Owing to this property, ARIMA models are usually preferred for short-term forecasting (Shukla and Jharkharia (2013)). When compared with various ANN models, ARIMA has been shown to offer only a slight improvement in performance, depending upon the nature of the forecasting task. According to Darbellay & Slama (2000), ARIMA models have better accuracy than ANNs for small forecasting windows. Zhang, G. P. (2003), on the other hand, proposes that a hybrid of ANNs and ARIMA has better predictive power than either model used standalone. When compared with recent AI methods such as LSTMs, ARIMA tends to show inferior performance (Zhang et al. (2021), Wang et al. (2021)). In this study, we use stacked LSTMs with both multivariate and univariate inputs for prediction and compare them with the predictions of ARIMA models.
LSTM: As mentioned above, LSTM (Long Short-Term Memory) is a standard RNN model designed to deal with the vanishing gradient problem. It comprises gates commonly known as the input, output and forget gates. LSTM can adapt to deep learning tasks that demand long-term memory of events. It can also handle signals that have both low- and high-frequency components.
The compact forms of the equations for the forward pass of an LSTM unit with a forget gate are as follows :
$f_t = \varphi(W_{fx} x_t + W_{fh} h_{t-1} + b_f)$ (7)
$i_t = \varphi(W_{ix} x_t + W_{ih} h_{t-1} + b_i)$ (8)
$o_t = \varphi(W_{ox} x_t + W_{oh} h_{t-1} + b_o)$ (9)
$z_t = f_t \odot z_{t-1} + i_t \odot \tanh(W_{cx} x_t + W_{ch} h_{t-1} + b_c)$ (10)
$h_t = o_t \odot \tanh(z_t)$ (11)
where $z_t$ is the memory cell,
$i_t$ is the input gate,
$f_t$ is the forget gate,
$o_t$ is the output gate,
$\odot$ is an operator denoting the element-wise product,
$x_t$ represents an input vector,
$h_t$ is the hidden state or output vector,
$\varphi(\cdot)$ is the sigmoid function and $\tanh(\cdot)$ is the hyperbolic tangent function.
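Equations (7) to (11) can be traced step by step with the following minimal sketch of a single LSTM forward step in plain Python. The helper names and the dictionary layout of `params` are assumptions made for illustration; the zero weights in the example are chosen only so that the result can be checked by hand and carry no empirical meaning.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b for list-of-lists matrices and list vectors."""
    return [
        sum(wx * xi for wx, xi in zip(row_x, x))
        + sum(wh * hi for wh, hi in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x, h_prev, z_prev, p):
    """One forward step of an LSTM unit with a forget gate, equations (7)-(11)."""
    f = [sigmoid(a) for a in affine(p["Wfx"], x, p["Wfh"], h_prev, p["bf"])]    # (7)
    i = [sigmoid(a) for a in affine(p["Wix"], x, p["Wih"], h_prev, p["bi"])]    # (8)
    o = [sigmoid(a) for a in affine(p["Wox"], x, p["Woh"], h_prev, p["bo"])]    # (9)
    c = [math.tanh(a) for a in affine(p["Wcx"], x, p["Wch"], h_prev, p["bc"])]  # candidate
    z = [ft * zp + it * ct for ft, zp, it, ct in zip(f, z_prev, i, c)]          # (10)
    h = [ot * math.tanh(zt) for ot, zt in zip(o, z)]                            # (11)
    return h, z


# Toy setup: two input features, one hidden unit, all weights and biases zero.
params = {}
for gate in ("f", "i", "o", "c"):
    params["W" + gate + "x"] = [[0.0, 0.0]]
    params["W" + gate + "h"] = [[0.0]]
    params["b" + gate] = [0.0]

h_t, z_t = lstm_step(x=[0.3, 0.7], h_prev=[0.0], z_prev=[1.0], p=params)
```

With zero weights every gate equals 0.5 and the candidate equals 0, so the memory cell simply halves from 1.0 to 0.5, which shows how the forget gate controls how much past information is retained.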
Onion sale price data is taken for the period from 1st January 2010 to 31st March 2021 for various tier 1 (Azadpur, Bengaluru, Kolkata) and tier 2 (Lasalgaon, Pimplegaon, Ahmedabad, Ludhiana) wholesale vegetable markets.
From the daily onion max price data, “true returns” are calculated, followed by the corresponding daily conditional volatilities using the GARCH(1,1) method, as shown in figure 3 for the Indian wholesale markets of Azadpur, Lasalgaon and Pimplegaon. As can be seen in the chart, the conditional volatility reflects the variation in the max price returns. Volatilities for the next 10, 15 and 20 days are then forecasted using Python-based libraries specifically designed for the GARCH model. The GARCH-based conditional volatilities are also forecasted for the next 10, 15 and 20 days using the deep learning-based LSTM (Long Short-Term Memory) in different input configurations: in the first scenario, the daily max prices are taken as input along with the conditional volatilities, and in the second scenario, the min, max and modal prices are taken along with the conditional volatilities. The forecasted values are checked for predictive accuracy and then compared with each other, i.e. first the GARCH(1,1) forecasting accuracy and then that of the LSTM methods with various parameters.
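One common way to turn such a series into LSTM training samples is a sliding window, sketched below. This is an illustration under assumptions: the text does not state the look-back length, so `lookback=5` is hypothetical, as are the function name `make_windows` and the toy price and volatility values.

```python
def make_windows(features, target, lookback, horizon):
    """Pair each look-back window of feature rows with the next `horizon` target values."""
    X, y = [], []
    for start in range(len(target) - lookback - horizon + 1):
        end = start + lookback
        X.append(features[start:end])        # inputs: days start .. end-1
        y.append(target[end:end + horizon])  # outputs: days end .. end+horizon-1
    return X, y


# Toy data for 40 days: each row is [max price, conditional volatility].
features = [[1000.0 + 10 * t, 0.01 + 0.001 * t] for t in range(40)]
target = [row[1] for row in features]  # the conditional volatility is the forecast target

X, y = make_windows(features, target, lookback=5, horizon=10)
```

The second scenario described above would use rows of the form [min price, max price, modal price, conditional volatility], and `horizon` would be set to 10, 15 or 20.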
The data set was divided into training and testing sets. The training set consisted of 90% of the values and the testing set of the remaining 10%.
For LSTM-based models, the values are normalized after separating them into training and testing sets. Normalization simplifies the training process and makes it more robust (Shavit, G. (2000), Yin et al. (2017)). It has also been effective in classification problems, as shown by Jayalakshmi and Santhakumaran (2011), and is widely used in neural network-based deep learning methodologies, as it helps stabilize the overall process and reduces the number of training epochs required. To implement the LSTM models, TensorFlow libraries are used, as they offer a convenient way to apply neural network-based deep learning models, as demonstrated by Heaton et al. (2016) and Chen et al. (2020). TensorFlow enhances productivity and automates much of the optimization.
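The split-then-normalize order described above can be sketched as follows. Two points are assumptions rather than details given in the text: the split is taken to be chronological (the first 90% of observations for training), and min-max scaling is used as the normalization, with its bounds fitted on the training set only.

```python
def chronological_split(series, train_percent=90):
    """Split a time-ordered series into a leading training part and a trailing test part."""
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]


def fit_min_max(train):
    """Learn scaling bounds from the training set only, so no test information leaks in."""
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("training series is constant; min-max scaling is undefined")
    return lo, hi


def scale(values, lo, hi):
    return [(v - lo) / (hi - lo) for v in values]


# Toy series of 100 daily values.
series = [float(v) for v in range(1, 101)]
train, test = chronological_split(series)
lo, hi = fit_min_max(train)
train_scaled = scale(train, lo, hi)
test_scaled = scale(test, lo, hi)  # may fall outside [0, 1] when test values exceed the training range
```

Fitting the bounds on the training data alone keeps the evaluation honest, at the cost of test values occasionally falling outside the unit interval.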
The network is configured with layers of different shapes. The configuration used consisted of one input LSTM layer with 150 units (neurons), followed by another LSTM layer of 200 units and two hidden layers of 150 and 70 units respectively, and finally one output layer producing the forecasted results.
The complexity of the architecture can be increased by increasing the number of layers as well as the number of neurons. To make certain that the model always returns the same results, the random seed for the neural network architectures is set to zero. During training, instances are divided into batches for optimization; here, a batch size of 500 values is used, following the minibatch method described by Goodfellow et al. (2016). To maintain a consistent gradient, a small learning rate is needed. Adam (Kingma et al.), an adaptive learning rate algorithm, is used for training, as it helps in selecting an appropriate rate owing to its bias-correction property.
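The architecture described above could be expressed in TensorFlow's Keras API roughly as follows. This is a sketch under assumptions rather than the authors' code: the layer sizes, batch size and seed come from the text, while the ReLU activation, the learning rate of 0.001, the mean-squared-error loss and the function name `build_stacked_lstm` are illustrative choices that the text does not specify.

```python
CONFIG = {
    "lstm_units": (150, 200),   # two stacked LSTM layers, as described in the text
    "dense_units": (150, 70),   # two hidden layers, as described in the text
    "batch_size": 500,          # from the text
    "seed": 0,                  # from the text
    "activation": "relu",       # assumption: not stated in the text
    "learning_rate": 1e-3,      # assumption: not stated in the text
}


def build_stacked_lstm(lookback, n_features, horizon, config=CONFIG):
    """Build and compile the stacked LSTM; TensorFlow is imported only when called."""
    import tensorflow as tf

    tf.keras.utils.set_random_seed(config["seed"])
    first, second = config["lstm_units"]
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(lookback, n_features)),
        # return_sequences=True passes the full sequence on to the next LSTM layer.
        tf.keras.layers.LSTM(first, return_sequences=True),
        tf.keras.layers.LSTM(second),
        *[tf.keras.layers.Dense(units, activation=config["activation"])
          for units in config["dense_units"]],
        tf.keras.layers.Dense(horizon),  # one output per forecasted day
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=config["learning_rate"]),
        loss="mse",  # assumption: the training loss is not stated in the text
    )
    return model
```

A model built this way would be trained with `model.fit(X_train, y_train, batch_size=CONFIG["batch_size"], epochs=...)`, where the number of epochs is left open because the text does not report it.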
As shown in Table 1, MAPE-based predictive accuracy for the Azadpur, Ahmedabad, Lasalgaon and Ludhiana markets is greater with the deep learning-based stacked LSTM method than with ARIMA and the traditional GARCH method, irrespective of the configuration, the input parameters and the number of days forecasted ahead. Similarly, for the Pimpalgaon wholesale vegetable market, the mean absolute percentage error (MAPE) for the 10-day forecast is greater by almost 50% using the LSTM methods, with either max price or multiple inputs, when compared with the GARCH(1,1) model. The Kolkata market has the lowest predictive accuracy with the LSTM methods for the same parameters, with the traditional GARCH method achieving a marginally lower MAPE value. As the horizon increases to 15 and 20 days, the accuracy of the deep learning methods improves, irrespective of configuration and inputs. There are minor variations across configurations when max price is taken as input along with the conditional volatilities.
There is no strong linear relationship between conditional volatility and the max, min and modal price variables. This holds for almost all the markets; hence it makes sense to use these variables as additional inputs to enhance predictive accuracy. Accordingly, apart from conditional volatility, max price was taken as the input variable in the first scenario, and min and modal prices were added in the second.
ANN-based LSTMs are an effective way to improve the GARCH model’s volatility forecasts through various hybrid approaches. Increasing the number of inputs does not necessarily improve the overall forecast. The results and the various relationships are influenced by the historical data available. While there is no perfect forecasting model, these hybrid approaches combine the heteroskedastic nature of ARCH models with supervised learning to improve the predictions. Another drawback relates to the amount of data necessary to predict the volatility for the short term (10, 15 and 20 d) given that 252 d were used.
The study demonstrates how a hybrid model combining GARCH with ANN (Artificial Neural Network) based LSTMs provides better predictive accuracy than the classical methods for volatility forecasting. The GARCH model helps in identifying the relationship between the variance of the current period and the variances of previous periods. However, the predicted variance need not be accurate or follow a perfectly linear relationship. Hence, in this case, the conditional variances of previous periods are used, along with daily maximum, minimum and modal prices, to train the ANN-based LSTM architecture. The results show that this hybrid model has better predictive accuracy for five out of the seven major wholesale markets in the study, irrespective of the number of days and the input variables taken. For the others, the accuracy is only marginally lower than that of the corresponding GARCH-based forecasts. According to Guidolin et al., during high-volatility phases in agricultural commodities, many macroeconomic variables, even those specific to the commodity, do not play a crucial role in increasing the predictive accuracy for prices.
This paper proposes a hybrid model for forecasting the price volatility of one of the most sensitive vegetables in the emerging market of India. It combines the ability of GARCH models to capture unusual fluctuations of variance in time series data with the memory retention ability of RNN-based LSTMs. LSTMs are also used because the GARCH methodology is not able to forecast over long horizons. Although this issue could be resolved by taking rolling forecasts with small time steps of between 1 and 10 days, the process is time-consuming and could take up more computing power than the other methods.
This could benefit stakeholders by providing early warning through advance signals from the affected wholesale markets. The model could help in understanding price transmission which, according to works such as Birthal et al. (2019) and Saxena et al. (2020), is evident between producer and consumer vegetable wholesale markets across the country. It could also be employed to estimate price volatility in other agricultural commodities that are not traded in futures markets. Volatility forecasting would help in understanding spillover effects among the wholesale markets and thereby in specifying the price dependencies among them. Finally, these results could help large food processing retailers to benefit from the recent legislative changes in the Indian agriculture sector related to contract farming laws, as they would assist them in hedging the risks arising from abnormal fluctuations in the prices of major vegetables in India.
Limitation and Future Scope
The study is still subject to certain limitations. The independent variables are confined to different types of prices (minimum, maximum or modal) and the conditional volatilities of previous periods. Other variables that may influence the predictive power of the models could also be included, such as interest rates, global crude oil prices and dollar exchange rates. The results could be made more robust by including different variations of LSTM and by altering the number of layers, the number of neurons or the structure of the network. It would also be desirable to examine varied time horizons to understand more broadly whether these hybrid models show any fluctuations in the final results. Further research could incorporate other GARCH-based methodologies such as EGARCH, IGARCH, T-GARCH and Markov switching states for comparison of predictive accuracy. To further improve computation and efficiency, a large number of architectures with multiple layers and neurons could be tested and then incorporated to increase predictive accuracy. Lastly, the methodology could be applied to other emerging markets with the same food supply chain dynamics as the Indian food processing sector.
The authors declare that there is no conflict of interest.
Table 1 is available in the Supplementary Files section.