Adaptive trio-ensemble deep neural network for high-frequency stock price prediction

doi:10.21203/rs.3.rs-2107202/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Analyzing and forecasting stock prices has always been a highly complex task. Researchers have proposed hundreds of mathematical and machine-learning models for this high-frequency prediction problem. Effective stock market forecasting is constrained by its dependence on a variety of factors such as news, dividend announcements, company policy, drastic changes at management level, and the launch of new products. Deep learning offers flexibility in the choice of network structure, activation function, and other model parameters, which makes it a strong candidate for prediction. This paper proposes an ensemble prediction model that exploits the three most promising variants of a Deep Neural Network (DNN), namely Gaussian, Poisson, and Gamma, out of six available probability distributions (Quantile, Gaussian, Poisson, Laplace, Huber, and Gamma). The experimental results show that the proposed ensemble deep learning model achieves an R² of 0.92 and a Root Mean Square Error (RMSE) of 0.17, the best accuracy among the literature reviewed in this category.

Ensemble Learning

Deep Learning

Stock Price

High-frequency dataset

Adaptive Trio-ensemble Deep Learning Model

Forecasting stock prices remains a stimulating and captivating task for researchers and investors because of their stochastic nature. Investors need accurate price predictions to decide when to invest or withdraw. Accurate prediction is difficult because stock prices are non-stationary and nonlinear (Bezerra & Albuquerque 2017). Prices are influenced by many factors, such as the nature of the company's business, demand, supply, public confidence, and government policies, and the combined effect of these factors on price fluctuation is hard to predict. Researchers have applied many assumptions, tools, and techniques in their models over different periods (Atsalakis & Valavanis 2009, Henrique et al. 2019). Despite these efforts, the problem of accurate prediction remains unsolved. New stocks are regularly added to the stock exchanges, so the rate at which price data is produced keeps increasing. Given the high volume, variety, and velocity of stock price data, an efficient modeling technique is required to keep pace. Under these circumstances, machine-learning models have become essential tools because of their prediction efficiency, fast processing, and optimized memory utilization. Several computational modeling frameworks have been applied and verified by many researchers, including Artificial Neural Networks (ANN) (Shrivastav & Ravinder 2021, Babu & Reddy 2015, Chong et al. 2017), statistical models such as ARIMA (Guo et al. 2018), tree-based learning models such as Random Forests (RF) (Khaidem et al. 2016), Genetic Algorithms (GA), and Deep Learning (DL) models (Schmidhuber 2015, Xu et al. 2018, Minh et al. 2018).
Tree-based ensemble machine learning techniques (Gradient Boosting Machine and Random Forest) have also been proposed, and tree-based learning was found to perform better than several well-known machine learning models (Shrivastav & Kumar 2022(a), Shrivastav & Kumar 2022(b)).

This paper analyzes and predicts a 1-minute stock price dataset using the adaptive trio-ensemble deep learning model, an ensemble that combines the most promising deep neural network distributions. The Deep Neural Network is a widely applied and recognized computational modeling technique that can be used for both regression and classification problems. The model is based on a feed-forward neural network architecture, which is capable of high forecast accuracy. The proposed supervised deep learning model aims to maximize forecasting performance by implementing a trio-ensemble of deep learning models with the three best-performing distributions.

1.1 Motivation and Contribution

Stock market data is complex, chaotic, dynamic, and volatile, and designing a generalized model has always been challenging for mathematicians and data scientists. Many efforts have been made to maximize the prediction accuracy of the stock price index. The literature review suggests that high-frequency datasets have an edge over low-frequency datasets and that machine learning can play a vital role in handling the volume, variety, and velocity of the data. This motivates us to apply a well-performing machine-learning approach, deep learning, to a 1-minute (high-frequency) stock price dataset for better prediction results.

This paper presents the following major contributions:

An ensemble model for deeper characterization and prediction of the 1-minute dataset of stock price.

Six distributions (Quantile, Gaussian, Poisson, Laplace, Huber, and Gamma) were applied using a Deep Neural Network to select the three best candidates.

The three most promising distributions are combined to develop an Adaptive Trio-ensemble Deep Neural Network (ATDNN) model to maximize prediction performance.

The rest of the paper is organized as follows: Section 2 reviews the relevant literature, and Section 3 describes the process of data collection and compilation. Section 4 presents the proposed ensemble model along with the other Deep Neural Network models. Section 5 presents the experimental setup and results. Section 6 discusses the results, and Section 7 presents the conclusion and future work.

Over the last decades, many machine learning tools and techniques have been applied to improve the predictive capability of stock price models. This section presents recent and significant analyses and results, particularly for the period 2015–2020.

Several machine learning techniques, namely Extreme Learning Machine (ELM), Backpropagation Neural Network (BNN), Radial Basis Neural Network (RBNN), and Deep Learning (DL), were compared to identify the most adequate model for stock price prediction (Chen et al. 2018). Three dataset sizes (small, medium, and large) of the CSI 300 Index Future associated with the Shanghai and Shenzhen Stock Exchanges were used for the analysis. The results indicate that Deep Learning (DL) performs better than the other models. The paper also suggests that the performance metrics of the model improve as the sample size increases, which means a larger dataset may characterize the intrinsic nature of the stock price more deeply.

Another comparative study used a Deep Neural Network (DNN), Long Short-Term Memory (LSTM), Logistic Regression, and Random Forest (Fischer & Krauss 2018). A very large S&P 500 dataset from Thomson Reuters (25 years and 9 months) was gathered for the period December-1989 to September-2015. The study concluded that LSTM achieves better performance metrics than the other predictive models.

A mixture model proposed by Guo et al. (2018) uses three types of dataset for SH600006, SH6000016, SH6000036, and SH6000056, which are listed on the Shanghai Stock Exchange: a daily dataset from the listing date to 31-March-2017, a 30-minute dataset collected between 1-January-2017 and 28-February-2017, and a 5-minute dataset collected between 1-February-2017 and 28-February-2017. Adaptive Support Vector Regression (ASVR), which combines Particle Swarm Optimization (PSO) with traditional Support Vector Regression (SVR), was applied to these datasets. The performance metrics indicate that ASVR has slightly better predictive capability than the Back Propagation Neural Network (BPNN) and Support Vector Regression (SVR) in terms of MAD, MAPE, and RMSE.

An automated model, Xuanwu, was developed by (Zhang et al. 2018) to forecast future trends of the stock index. The dataset was collected between 25-January-2010 and 1-October-2016 (approximately 6 years and 9 months of daily stock prices) for 495 stocks listed in the Shenzhen Growth Enterprise Index. The proposed model implements the Random Forest model using WEKA. The Random Forest model performs slightly better than ANN, SVM, and kNN on the performance metrics Prediction Duration (PD) and Return of the trade (ROT).

Tree-based models were applied for the first time to predict stock price fluctuation using a high-frequency dataset (Basak et al. 2019). The dataset of Facebook and Apple (size 10 kB to 700 kB with 1180 to 10700 samples) covers the period from the listing date to 3-February-2017. Two well-regarded models, XGBoost and Random Forest (RF), were applied to the high-frequency dataset, and XGBoost was found to provide an accuracy of 78%, which is better than the other model. The experimental results also indicate that the performance of tree-based models can be improved by implementing an ensemble model.

A Multi Filter Neural Network (MFNN) was used on the high-frequency dataset of the CSI 300 stock price index. Approximately 3 years of data between 24-December-2013 and 7-December-2016 were collected (Long et al. 2019). The MFNN model was applied to a 30-set dataset of CSI stock prices. The model performed 6.28% better than the comparable CNN and RNN models. The performance metrics of the MFNN were also compared with Linear Regression, Logistic Regression, LSTM, SVM, and Random Forest. The experimental results conclude that the MFNN model provides far better predictive capability than the other models in terms of the Rate of Average Access Return, Total Return, Return Rate, etc.

A model combining a Genetic Algorithm (GA) and a Convolutional Neural Network (CNN) was proposed and applied to seventeen years of daily KOSPI stock data collected between 4-January-2000 and 31-December-2016 (Chung et al. 2020). The experimental results suggest that the CNN achieved 70.16% accuracy without the Genetic Algorithm and 73.74% accuracy in combination with it. The study also suggests that deep learning may improve the result.

A modified Convolutional Neural Network model was applied to the daily dataset of the S&P 500, NASDAQ, NYSE, and RUSSELL 2000 indices collected between Jan-2010 and Jan-2017 at single-day intervals (Hoseinzade & Haratizadeh 2019). The modified CNN model, “CNNpred”, achieved a 3–11% improvement in terms of F-measure. The study suggests that further modification of “CNNpred” could yield more precise results in the future.

A comparative study using machine-learning models to forecast the stock price index (time series) was presented in (Ersan et al. 2019). Three machine-learning frameworks, k-NN, ANN, and SVM, were applied to ten years of daily and hourly data for the DAX 30 and S&P 500. The experimental results were evaluated in terms of SS, DS, and RMSE and found that: (i) hourly data has better prediction capability than daily data up to a certain extent; (ii) k-NN performs better in terms of RMSE (minimum, average, and maximum) on both datasets; (iii) SVM provides stable results, whereas ANN and k-NN outperform SVM in terms of RMSE.

(Shah & Isah 2019) conducted a comprehensive study reviewing the taxonomy of prediction techniques in the domain of stock price prediction. Many machine learning tools and techniques were analyzed, and the study concluded that (i) a longer-term dataset can contain less noise and offer more prediction capability, and (ii) a mixture model (a mixture of machine learning and statistical models) has better capability to predict the stock price.

(Gandhmal & Kumar 2019) presented a systematic analysis and review of stock price index prediction, covering the frameworks of 50 research papers published between 2010 and 2018 and techniques such as fuzzy-based methods, ANN, NN, SVM, SVR, HMM, and K-means. The paper concluded the following: (i) it reported the research gaps and challenges of the different clustering and classification techniques, such as the Bayesian model, Fuzzy classifier, ANN, SVM, Decision support system, machine learning, and CNN, and found that ANN is the most applied method for stock price prediction; (ii) it explored the different datasets applied in the period and analyzed their performance metrics in terms of MAPE, RMSE, Accuracy, Sensitivity, Specificity, MAE, etc.; (iii) it concluded that stock price prediction is a very complicated task, so factors beyond historical dataset analysis may also need to be considered for more precise prediction results.

A mixture model combining an Ensemble Adaptive Neuro-Fuzzy Inference System (ENANFIS) and Support Vector Regression was developed (Zhang et al. 2018) and applied to the datasets of four securities, 002570, 600422, 000049, and 002375, listed on the Shanghai and Shenzhen Stock Exchanges. The dataset was collected between Jan-2012 and Jan-2017. The model analyzed nearly five years of daily historical data. The experimental results conclude that the mixture model performs better than the single-stage ENANFIS as well as the two-stage models (SVR-SVR, SVR-Linear, and SVR-ANN) based on the performance metrics RMSE, MSE, MAE, and MAPE.

(Jiang 2020) surveyed the use of deep learning models for stock price forecasting. Recent studies were reviewed, especially research from the last three years. About 100 research papers, with their datasets, tools, and techniques, were analyzed in depth. The study found that deep learning is a suitable tool with considerable scope for forecasting stock price index trends in the near future.

(Reschenhofer et al. 2020) performed a detailed study reviewing 96 research papers published in the SCI index in 2016. The paper reached some surprising conclusions: (i) the use and availability of high-frequency datasets is disappointingly rare; (ii) the benchmarks and the suitability of the datasets used are open to question; (iii) despite the hype about financial big data and sophisticated machine learning frameworks, relevant and truly experimental studies are rarely found.

Despite more than fifty years of research in this area, no researchers have identified a single well-performing model that provides optimal predictive results. Table 1 summarizes the reviewed models in terms of the dataset used, target output, number of samples, sampling period, method, and performance measure. This paper proposes a trio-ensemble of deep learning models, a widely used machine-learning approach, to optimize the prediction results.

Table 1

Stock Price Literature Review – Summary
Reference	Source of data	Targeted output	Frequency of Samples	Period of sampling	Applied method	Performance metrics
Chen et al. 2018	CSI 300 Index Future associated with Shanghai and Shenzhen Stock Exchange	Opening price Forecast	small, medium, and large size high-frequency dataset	20-February-2017 to 20-April-2017	Neural Network, Deep Learning, Extreme Learning Machine, Backpropagation, and Radial Basis Neural Network	Directional Predictive Accuracy (DA)
Fischer & Krauss 2018	Thomson Reuters listed in S&P 500 index	Stock Price	Daily dataset of 25 years 9 months	December-1989 to September-2015	LSTM, Logistic Regression, Deep Learning, Random Forest	Probability of LSTM, RMSE, MAPE, DM, PT
Guo et al. 2018	Shanghai Stock Exchange benchmark datasets (SH600006, SH6000016, SH6000036, SH6000056)	Stock Price	5-minutes, 30-minutes, and daily dataset (3 type dataset)	5-minutes: 1-February-2017 to 28-February-2017, 30 minute 1-January-2017 to 28-February-2017 Daily: Listing to 31-March-2017,	SVR, BPNN, Adaptive SVR	RMSE, MAPE, MAD
Zhang et al. 2018	Shenzhen Growth Enterprise Index (495 listed stocks)	Close Index Forecast	6 years and 9 months Daily Stock Price	25-January-2010 to 1- October-2016	Xuanwu	Return of the trade, Forecast Duration (PD)
Basak et al. 2018	Apple and Facebook stock Price	Stock Price Return	10kb-700kB in size	Date of listing to 3-February 2017	Random Forest, XGBoost	Specificity, F-Score, Brier, AUC, Accuracy, Recall, Precision,
Long et al. 2019	CSI 300	Stock Return	3 years (approx.) High Frequency (1-minutes)	24-December-2013 to 7-December-2016	MFNN = DNN + (2D²) feature extraction	Rate of Average Access Return, Total Return, Return Rate, etc
Chung et al. 2019	Stocks of KOSPI, Bloomberg	Stock Return	17 years (daily stock price)	04-January-2000 to 31-December-2016	CNN + GA	Comparative Accuracy
Hoseinzade et al. 2019	S&P 500, NASDAQ, NYSE, and RUSSELL 2000 indices	Close Price Forecast	Daily samples	January-2010 to January-2017	CNN	Macro-Averaged-F-Measure
Ersan et al. 2019	Stocks of DAX 30 and S&P 500	Stock Return	10 years of daily and hourly data	02-January-2004 08:00 GMT 06-March-2015 20:00	k-NN, ANN, and SVM	SS, DS, and RMSE
Shah et al. 2019	NYSE, S&P 500, etc.	Price Return and others	Daily, Weekly	Different periods for the different papers	Random Forest, XGBoost, SVM, ANN, etc.	Test Error, Average profit, Precision, Recall, and F-score
Gandhmal, & Kumar 2019	Goldman Sachs Software, Microsoft Corp., S&P 500, BSE, DJIA etc.	Price Return and others	Daily, Weekly and others	Reviewed 50 research papers between the period of 2010–2018	ANN, SVM, Decision support system, CNN, etc	MAPE, RMSE, Accuracy, Sensitivity, and Specificity, MAE, etc.
Zhang et al. 2020	Four securities code (002570, 600422, 000049 and 002375) from Shanghai and Shenzhen Stock Exchanges	Stock return	5 years of the daily dataset	January-2012 to January-2017	SVR-ENANFIS	MSE, RMSE, MAE, MAPE
Jiang et al. 2020	Datasets of 100 research paper	Stock return	All available samples	2017–2019	Deep Learning model	F1 score, precision, recall, MCC. RMSE, MAPE MAE, and MSE.
Reschenhofer et al. 2020	58 datasets of financial time series, Yahoo Finance	Stock price return and others	1-day, 1-week, 1-month, 1-year	96 publications of 2016	Tools analyzed in the reviewed papers, such as SVM, ANN, etc.	Different metrics for the different papers
Proposed Work	SPY Exchange Traded Fund (NYSE)	Close Price Prediction	903097 samples, high frequency (1-minute data)	3-January-2000 to 31-December-2008 (single minute dataset)	Adaptive Trio-ensemble Deep Neural Network (ATDNN)	MSE, RMSE, MAE, RMSLE, MRD, R²

NN: Neural Network, SVR: Support Vector Regression, RF: Random Forest, rRMSE: Relative RMSE, NMSE: Normalized MSE, MI: Mutual Information, LSTM: Long Short Term Memory, DM: Diebold and Mariano Testing, RMSE: Root Mean Square Error, PT: Pesaran Timmermann Testing, MAD: Mean Absolute Deviation, MAPE: Mean Absolute Percentage Error, BPNN: Backpropagation Neural Network, CNN: Convolutional Neural Network, ENANFIS: ensemble adaptive neuro-fuzzy inference system, ATDNN: Adaptive trio-Ensemble Deep Neural Network.

The outcomes of this review section are summarized as follows:

High-frequency dataset (collected at every minute or hour) can provide better prediction results than a low-frequency dataset (collected in a day).

Ensemble model has better prediction accuracy than any individual Machine Learning model.

A high-frequency (1-minute) historical intraday stock price dataset of the SPY Exchange Traded Fund, associated with the New York Stock Exchange (NYSE), was collected for the period from 03-January-2000, 09:31 to 31-December-2008, 15:59. To our knowledge, it is the only high-frequency dataset that was freely available online, and it has been used by many researchers, including (Senapati & Karmeshu 2016). The dataset is quite large: 11826 KB with 903097 samples. The raw dataset contains the attributes “Date”, “Time”, “Open”, “High”, “Low”, “Close”, and “Adj Volume” as headings. This study used only two attributes for the forecast, namely “index” (the sequential minute) and “close”. Basic statistics and the variation in the close price of the SPY ETF over the period 03-Jan-2000, 09:31 to 31-Dec-2008, 15:59 are shown in Table 2 and Fig. 1 respectively.

Table 2

Statistics of stock price dataset
Stock Price parameter	Statistical measures
Stock Price parameter	Minimum	Maximum	Range	First quartile	Third quartile	Median	Mean	Standard deviation	Skewness
Sequential Minute	1	903097	1 to 903097	255636	677219	451337	451461	260701.8	4.39e-16
Date	03-01-2000	31-12-2008	03-01-2000 to 31-12-2008	-	-	-	-	-	-
Minute	09:31	15:59	09:31 to 15:59	-	-	-	-	-	-

For ease of understanding and simplicity, this study considered two attributes, “index” and “close”, out of the seven available attributes of the dataset for analysis and prediction. The dataset is divided into two parts: the first part, with 722479 samples and 2 columns, is used for training, and the second part, with 180618 samples and 2 columns, is used for testing. Both experimental datasets are independent, which means there are no exact correlations between the attributes. The literature also confirms that the attributes of the dataset exist independently, meaning that an increase or decrease in one attribute does not disturb the other.
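
The split described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the split is chronological (the text does not say how rows were assigned), the helper name `chronological_split` is hypothetical, and the toy series is synthetic rather than SPY data. With the reported counts, the training part corresponds to roughly 80% of the 903097 samples.

```python
def chronological_split(rows, n_train):
    """Split time-ordered rows into a leading training part and a trailing test part."""
    if not 0 < n_train < len(rows):
        raise ValueError("n_train must be between 1 and len(rows) - 1")
    return rows[:n_train], rows[n_train:]


# Sample counts reported in the text.
N_TOTAL = 903097
N_TRAIN = 722479
N_TEST = 180618

# Share of the samples that ends up in the training part.
train_fraction = N_TRAIN / N_TOTAL

# Tiny synthetic (index, close) series, used only to show the mechanics.
toy = [(i, 100.0 + 0.01 * i) for i in range(1, 11)]
toy_train, toy_test = chronological_split(toy, n_train=8)
```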

3.1 Data pre-processing

Data pre-processing is a very important step for precise prediction results. For this reason, the scale of the dataset was analyzed and the values were converted into numeric fractional values. The dataset was also checked for missing and “NA” values, and none were found in the high-frequency dataset.

3.2 Feature extraction from the dataset

The raw dataset was collected under the headings “date”, “time”, “open”, “high”, “low”, “close”, and “Adj Volume”. All attributes of the dataset are uncorrelated and independent of each other. This study used 1-minute stock price data for analysis and prediction. The combination of date and time provides a distinctive sequence by which any of the other attributes can be uniquely identified; therefore, this paper added a new feature, “index” or “sequential minute”. This study uses the index (1-minute) and the “close” price of SPY as inputs for the experiments, with the aim of maximizing prediction capability.
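
A minimal sketch of how such a sequential-minute index could be derived from the date and time columns is shown below. It assumes day-month-year dates and 24-hour times, as suggested by Table 2; the function name `add_sequential_index` and the toy rows are hypothetical and are not taken from the original dataset.

```python
from datetime import datetime


def add_sequential_index(rows):
    """Turn (date, time, close) rows into (index, close) pairs ordered by timestamp.

    Assumes dates look like '03-01-2000' (day-month-year) and times like '09:31'.
    """
    stamped = [
        (datetime.strptime(f"{date} {time}", "%d-%m-%Y %H:%M"), close)
        for date, time, close in rows
    ]
    stamped.sort(key=lambda pair: pair[0])
    # The index starts at 1, matching the minimum of "Sequential Minute" in Table 2.
    return [(i, close) for i, (_, close) in enumerate(stamped, start=1)]


# Hypothetical rows, deliberately out of order.
toy_rows = [
    ("03-01-2000", "09:32", 145.2),
    ("03-01-2000", "09:31", 145.0),
    ("04-01-2000", "09:31", 144.1),
]
indexed = add_sequential_index(toy_rows)
```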

4.1 Deep Learning Model

The deep learning model uses multiple hidden processing layers to learn complex data patterns. It is one of the most powerful computational models, with multiple levels of abstraction (Goodfellow et al. 2016, Heaton et al. 2017). Applied to a dataset, it captures the intricate structure of the data and uses the back-propagation algorithm to adjust its internal parameters based on feedback from previous errors until it approaches an optimal result. The proposed model uses the H2O package, which is memory-efficient and fast, for purely supervised learning. Many researchers have already used this model on datasets of different sizes and intervals (low frequency, small sample size, etc.). The algorithm with the hyper-tuned parameters, which are based on the cited papers and the H2O package, is presented in Table 3 and Table 4:

Table 3

Deep Learning overall training algorithm
Algorithm 1: Deep Learning training
Input: Preprocessed and feature-extracted dataset.
Output: Trained model capable of predicting future values of the dataset.
1. The 1-minute stock price dataset is used to train the model. The dataset is divided in a 70:30 ratio for training and testing respectively.
2. The hyper-tuned deep learning model (multilayer feed-forward neural network) is applied to this dataset using a stochastic gradient descent algorithm.
3. The network uses the following tuned parameters: Rectifier activation function, two hidden layers with 200 neurons each, one epoch, learning rate = 0.03, and different distributions. The weights and biases are updated as follows:
${W}_{jk} = {W}_{jk} - \alpha \frac{\partial L(W,B \mid j)}{\partial {W}_{jk}}$ (1)
${b}_{jk} = {b}_{jk} - \alpha \frac{\partial L(W,B \mid j)}{\partial {b}_{jk}}$ (2)
where the gradient $\partial L(W,B \mid j)$ is computed via backpropagation and $\alpha$ is the learning rate.
4. Each compute node trains on its local data using multithreading and contributes to a global model through model averaging across the network. (This model has 40,801 weights/biases in the experiment.)
5. Based on the hyper-tuned parameters, the global model produces the optimized prediction results.
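
The reported figure of 40,801 weights/biases can be reproduced with simple arithmetic. The sketch below assumes a fully connected network with a single input (the sequential-minute index), two hidden layers of 200 neurons, and a single output; the input and output widths are an assumption, since Algorithm 1 only states the hidden layers. It also shows one gradient-descent update in the form of Eqs. (1) and (2), with hypothetical weights and gradients.

```python
def count_parameters(layer_sizes):
    """Count weights plus biases of a fully connected feed-forward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weights between two consecutive layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


def sgd_step(params, grads, alpha):
    """One update of the form p = p - alpha * dL/dp, as in Eqs. (1) and (2)."""
    return [p - alpha * g for p, g in zip(params, grads)]


# Assumed layout: 1 input -> 200 -> 200 -> 1 output.
n_params = count_parameters([1, 200, 200, 1])

# Hypothetical weights and gradients, using the learning rate from Algorithm 1.
updated = sgd_step([1.0, -2.0], [0.5, -1.0], alpha=0.03)
```

Under these assumptions the count is 200 + 200 (input to first hidden layer), 40000 + 200 (first to second hidden layer), and 200 + 1 (second hidden layer to output), which sums to 40,801.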

4.2 An adaptive trio-ensemble deep learning model

The ensemble-learning methodology was originally designed and applied in 1992 and, after some modifications, was named “super learner” in 2007. The proposed model is also a “super learner” that produces optimized prediction results using stacking, bagging, and boosting methods. Stacking combines strong learners and finds their optimal combination using a meta-learner algorithm; it can readily handle the noise present in the dataset. Bagging (bootstrap aggregation) draws N items from the X available items with replacement. Boosting is an iterative method that changes the distribution of the training data and decreases model bias. The proposed hyper-tuned H2O model is a supervised machine learning technique that can be applied to all kinds of regression and classification problems, but here it is specialized for high-frequency stock price prediction. The algorithmic implementation of the adaptive trio-ensemble machine-learning model is given below:

Table 4

Adaptive trio-ensemble machine learning model
Algorithm 2: Adaptive trio-ensemble machine learning model
Input: Set L as the base models. // The proposed stacked ensemble takes deep learning models with different distributions as base models.
Output: Hyper-tuned ensemble machine learning model that forecasts the optimal results for the provided dataset.
1. Set up the hyper-tuned ensemble machine learning model.
a. Specify the L base models. // Base models: DNN-1, DNN-2, and DNN-3
b. Set the meta-learning algorithm. // A Generalized Linear Model (GLM) is taken as the meta-learning algorithm to optimize the prediction result.
2. Provide the training dataset to train the model. // 70% of the high-frequency stock price dataset is used for this purpose.
a. Train each of the L base models on the provided training dataset. // Each model is trained individually to capture the intrinsic behavior of the dataset.
b. Apply k-fold cross-validation to each individual model. // K-fold cross-validation is set for each model and the results are collected.
c. Form the N × L matrix from the N cross-validated predicted values and the L individual algorithms. // This matrix is the level-one data.
d. Train the meta-learning algorithm on the level-one data. // The ensemble can now predict using the L base models and the meta-learning algorithm.
3. Forecast on the new dataset. // The model is now ready to predict the testing dataset.
a. Generate forecasts from the L base models. // Each base model is used for its own forecast.
b. Provide these forecasts to the meta-learner to perform the ensemble prediction. // The final prediction result is obtained.
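
The steps of Algorithm 2 can be sketched in plain Python. This is an illustration of the stacking procedure only, not the H2O implementation used in the study: the three base learners are simple stand-ins for DNN-1, DNN-2, and DNN-3, the meta-learner is ordinary least squares with a tiny ridge term (standing in for the GLM), the folds are interleaved for brevity, and the series is synthetic. All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def make_knn(k):
    """Factory for a k-nearest-neighbour averaging regressor."""
    def fit(xs, ys):
        pairs = list(zip(xs, ys))

        def predict(x):
            nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
            return sum(y for _, y in nearest) / len(nearest)
        return predict
    return fit


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w


def fit_meta(z, ys, ridge=1e-8):
    """Step 2d: linear meta-learner with intercept, fitted on the level-one data."""
    rows = [[1.0] + list(r) for r in z]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    w = solve(ata, aty)
    return lambda preds: w[0] + sum(wi * pi for wi, pi in zip(w[1:], preds))


def stack(xs, ys, base_fits, k=5):
    """Steps 2a-2d and 3: build the N x L level-one matrix, then the ensemble."""
    n = len(xs)
    z = [[0.0] * len(base_fits) for _ in range(n)]
    for fold in range(k):
        held = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        for j, fit in enumerate(base_fits):
            model = fit([xs[i] for i in kept], [ys[i] for i in kept])
            for i in held:
                z[i][j] = model(xs[i])  # cross-validated prediction
    meta = fit_meta(z, ys)
    finals = [fit(xs, ys) for fit in base_fits]  # refit base models on all data
    return (lambda x: meta([m(x) for m in finals])), z


# Synthetic series with a trend, a slow cycle, and small deterministic noise.
xs = [float(i) for i in range(60)]
ys = [100.0 + 0.5 * x + 3.0 * math.sin(x / 5.0) + 0.2 * ((int(x) * 37) % 11 - 5)
      for x in xs]
ensemble, level_one = stack(xs, ys, [fit_line, make_knn(1), make_knn(3)], k=5)
prediction = ensemble(30.5)
```

Here `level_one` is the N × L matrix of step 2c (60 rows, 3 columns), and `ensemble` carries out step 3 by passing the base forecasts to the meta-learner.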

4.3 Meta learner

The Generalized Linear Model (GLM) was used as the meta-learner algorithm in the proposed trio-ensemble model. The GLM has three main components: (i) a random factor for the dependent variable y, (ii) a systematic factor $\eta = X\beta$ for the observation matrix X, and (iii) a link factor $E(y) = g^{-1}(\eta)$. A trio-ensemble deep learning model is implemented because of the following merits:

It uses less computation time.

It provides stable and accurate outcomes.

A linear function can be used to represent the relationship between the variables as $\hat{y} = x^{T}\beta + \beta_{0}$. The coefficients are obtained by maximizing the penalized likelihood of the trio-ensemble deep learning model, which for this linear expression reduces to a penalized least-squares problem:

$\max_{\beta, \beta_{0}} \; -\frac{1}{2N}\sum_{i=1}^{N}\left(x_{i}^{T}\beta + \beta_{0} - y_{i}\right)^{2} - \lambda\left(a\Vert\beta\Vert_{1} + \frac{1}{2}(1-a)\Vert\beta\Vert_{2}^{2}\right)$ and the deviance, the sum of the squared errors, is expressed as $D = \sum_{i=1}^{N}\left(y_{i} - \hat{y}_{i}\right)^{2}$
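
As a hedged illustration, the objective and the deviance above can be evaluated directly for a given coefficient vector. The function names and the toy numbers below are hypothetical; `lam` and `a` stand for $\lambda$ and the mixing parameter $a$ in the penalty term.

```python
def glm_objective(beta, beta0, xs, ys, lam, a):
    """Penalized least-squares objective (to be maximized) for a linear model."""
    n = len(ys)
    squared_error = sum(
        (sum(b * x for b, x in zip(beta, row)) + beta0 - y) ** 2
        for row, y in zip(xs, ys)
    )
    l1 = sum(abs(b) for b in beta)      # ||beta||_1
    l2_sq = sum(b * b for b in beta)    # ||beta||_2^2
    return -squared_error / (2 * n) - lam * (a * l1 + 0.5 * (1 - a) * l2_sq)


def deviance(ys, preds):
    """Sum of squared errors D between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds))


# Toy data on the line y = 2x: the fit is exact, so only the penalty remains.
objective = glm_objective([2.0], 0.0, [[1.0], [2.0]], [2.0, 4.0], lam=0.1, a=0.5)
d = deviance([1.0, 2.0], [1.5, 2.5])
```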

5.1 Experimental Design

ATDNN used a high-frequency (1-minute) dataset of the SPY stock price to validate the forecast of the model. Earlier models had not been able to address this problem at the optimum level. The proposed hyper-tuned model is implemented with the H2O package, which is capable of handling the high-frequency dataset. Next, the three best-performing DNN distributions were selected as DNN-I, DNN-II, and DNN-III, and they are closely compared with the trio-ensemble model. Finally, this study proposed a trio-ensemble of the DNN-I, DNN-II, and DNN-III models to produce a super learner, or ensemble model, for an optimized result. The forecast accuracy of the models is then analyzed using Root Mean Square Error (RMSE), Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Residual Deviance (MRD), Root Mean Square Logarithmic Error (RMSLE), and R² to compare the results and to find the best-suited model, that is, the one with comparatively lower RMSE and higher R². The complete flow graph of the stock price prediction framework is shown in Fig. 2.
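
The error measures named above follow standard definitions, sketched below in plain Python. This is not the H2O implementation; the function name and the toy values are hypothetical. MRD is omitted because it depends on the chosen distribution (Table 5 reports the same value for MSE and MRD under the Gaussian distribution).

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, RMSLE, and R² for two equally long sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)
    mse = sse / n
    mae = sum(abs(e) for e in errors) / n
    # RMSLE compares log(1 + value); it requires values greater than -1.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)) / n
    )
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "RMSLE": rmsle,
        "R2": 1.0 - sse / sst,
    }


# Toy values: three exact predictions and one that is off by 1.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```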

The comparative performance of the ATDNN model is presented in Fig. 3–6, and its performance metrics are reported in Table 5. The model was developed on a Core i7 processor using the H2O package in RStudio with tuned parameters: fold value = 10, learning rate = 0.03, and Rectifier activation function with two hidden layers of 200 neurons each and one epoch. This model has 40,801 weights/biases. The well-performing Generalized Linear Model (GLM) was used as the meta-learner in assembling the model. Performance is judged by R² and RMSE: a higher R² and a lower RMSE indicate a better model.

Tuned deep learning models with six different distributions (Gaussian, Poisson, Laplace, Huber, Gamma, and Quantile) were trained on the 722479 training samples to select the better-performing distributions. The trained models were then evaluated on the 180618 testing samples. The test results show that the performance metrics differ across distributions. Of the six, the three most promising distributions were selected for further processing based on the R² metric. The comparative performance study is shown in tabular form in Table 6.
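
For orientation, each distribution corresponds to a different per-observation loss. The sketch below uses the standard textbook forms; it is not taken from the study or from the H2O source, and the exact expressions used by a given library may differ (for example in link functions or constant factors). The parameters `delta` and `tau` and their defaults are illustrative assumptions.

```python
import math


def gaussian_loss(y, mu):
    """Squared error."""
    return (y - mu) ** 2


def laplace_loss(y, mu):
    """Absolute error."""
    return abs(y - mu)


def huber_loss(y, mu, delta=1.0):
    """Quadratic for small residuals, linear beyond the threshold delta."""
    r = abs(y - mu)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def quantile_loss(y, mu, tau=0.5):
    """Pinball loss for the tau-th quantile."""
    r = y - mu
    return tau * r if r >= 0 else (tau - 1.0) * r


def poisson_deviance(y, mu):
    """Unit Poisson deviance; requires y >= 0 and mu > 0."""
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))


def gamma_deviance(y, mu):
    """Unit Gamma deviance; requires y > 0 and mu > 0."""
    return 2.0 * (-math.log(y / mu) + (y - mu) / mu)
```

With `tau = 0.5` the pinball loss equals half the absolute error, and both deviances are zero when the prediction equals the observation.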

5.2 Experimental Results of DNN-I

Deep learning with the Gaussian distribution is one of the most promising models in both training and testing, so the study used it as DNN-I. Its predictions on the testing dataset deviate only slightly from the actual test values. It gives a Root Mean Square Error (RMSE) of 6.61 and an R² of 0.87. The actual and predicted results are presented in Table 6 and Fig. 3 respectively.

5.3 Experimental Results of DNN-II

Deep learning with the Poisson distribution performs best among the individual models in both training and testing. On the testing dataset it gives Root Mean Square Error (RMSE): 5.58 and R2: 0.91. The prediction result on the testing dataset is presented in Table 6 and Fig. 4. The performance metrics confirm that it is the most appropriate individual model to characterize the intrinsic behavior of the high-frequency stock price.

5.4 Experimental Results of DNN-III

The Gamma distribution is also one of the best-performing distributions on the high-frequency stock price dataset. The performance metrics show that it is better than the Gaussian distribution but weaker than the Poisson distribution model. They also confirm its suitability, with Mean Square Error (MSE): 35.14, Root Mean Square Error (RMSE): 5.92, and R2: 0.89. The testing result vs. the real values is shown in Table 6 and Fig. 5.

Table 5

Analytical statistics of performance measures of the deep learning models with different distributions (testing dataset)
Performance measure	Quantile	Gaussian	Poisson	Laplace	Huber	Gamma
R²	0.44	0.87	0.91	0.51	0.75	0.89
MSE	189.93	43.77	31.19	163.77	85.45	35.14
RMSE	13.78	6.61	5.58	12.79	9.24	5.92
MAE	9.75	5.29	3.89	9.33	7.08	4.06
RMSLE	0.11	0.05	0.05	0.13	0.07	0.05
MRD	4.87	43.77	-924.64	9.33	57.73	11.57
Note: MRD (mean residual deviance), RMSLE (root mean squared log error), MAE (mean absolute error), RMSE (root mean square error), MSE (mean square error), and R² (coefficient of determination).
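For reference, the measures in Table 5 other than MRD can be written out directly. The sketch below uses the common textbook definitions (RMSLE with log(1 + x)), which is an assumption about how the package computes them; MRD is omitted because it depends on the chosen distribution. The four-point series is a toy example, not data from the study.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, RMSLE and R2 for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # RMSLE compares log(1 + value); it needs non-negative inputs.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2
            for a, p in zip(actual, predicted)) / n
    )
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae,
            "RMSLE": rmsle, "R2": r2}


# Toy example (hypothetical values).
print(regression_metrics([100.0, 110.0, 120.0, 130.0],
                         [102.0, 108.0, 123.0, 129.0]))

# Consistency check on Table 5: RMSE should equal sqrt(MSE) up to rounding.
TABLE5_MSE_RMSE = {
    "Quantile": (189.93, 13.78), "Gaussian": (43.77, 6.61),
    "Poisson": (31.19, 5.58), "Laplace": (163.77, 12.79),
    "Huber": (85.45, 9.24), "Gamma": (35.14, 5.92),
}
for name, (mse, rmse) in TABLE5_MSE_RMSE.items():
    print(name, round(math.sqrt(mse), 3), rmse)
```

The second loop shows that every RMSE in Table 5 agrees with the square root of the corresponding MSE to within about 0.01.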

5.5 Experimental Results of the ensemble model

The three promising models DNN-I, DNN-II, and DNN-III are combined into the proposed ensemble, named the adaptive trio-ensemble model, to push the predictive performance toward the optimum. The experimental result provides RMSE: 0.17 and R2: 0.92; for the testing dataset, Table 6 lists an MSE of 30.52 and an RMSE of 5.52 for the ensemble. The performance metrics confirm that the ensemble model outperforms the individual models in terms of R2. The predictive result on the testing dataset vs. the real testing dataset is shown in Table 6 and Fig. 6.
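The combination step can be illustrated with a minimal stacking sketch. It assumes the GLM meta learner uses a Gaussian family with identity link, which reduces to ordinary least squares on the three base predictions; the paper does not state the family or link, so this is an assumption, and all names and numbers below are hypothetical. In a cross-validated setup the meta learner would normally be fitted on held-out fold predictions rather than on in-sample ones.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_meta_learner(base_predictions, targets):
    """Least-squares fit of targets on [1, DNN-I, DNN-II, DNN-III]."""
    rows = [[1.0] + list(p) for p in base_predictions]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_ensemble(coefficients, preds):
    """Combine one row of base predictions with the fitted coefficients."""
    return coefficients[0] + sum(
        c * p for c, p in zip(coefficients[1:], preds))


# Synthetic base predictions (DNN-I, DNN-II, DNN-III) for six samples.
BASE = [(1.0, 2.0, 0.5), (2.0, 1.0, 1.5), (3.0, 4.0, 2.0),
        (4.0, 3.0, 3.5), (5.0, 6.0, 4.0), (6.0, 5.0, 6.5)]
# Synthetic targets built from known weights so the fit can be checked.
TARGETS = [1.0 + 0.2 * a + 0.5 * b + 0.3 * c for a, b, c in BASE]

coef = fit_linear_meta_learner(BASE, TARGETS)
print([round(c, 6) for c in coef])
print(predict_ensemble(coef, (2.0, 2.0, 2.0)))
```

Because the meta learner is free to weight the base models unequally and to add an intercept, the ensemble output need not be the plain average of the three predictions.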

Because stock parameters are volatile, their prediction has always been an interesting and challenging problem for data scientists. Stock price indices depend on many direct, indirect, and hidden factors, and large amounts of money are involved. A correct prediction may therefore bring gains, whereas a wrong one may bring tremendous losses. Many researchers have tried many ways to solve the problem. Among them, the highest prediction accuracy, reported in (Basak et al. 2019), is better than the earlier results. The first reason is quite simple: that study used a large high-frequency historical dataset, which was not available in the earlier experiments. The second is that XGBoost, a tree-based machine-learning tool, was applied.

For this reason, the present study used a truly high-frequency large dataset for both training and testing (722479 minute-wise samples for training and 180618 minute-wise samples for testing). Due to a lack of correlated dependencies among the attributes, the study used a single attribute, “new Minute”, which is a combination of “Date” and “Minute”. Moreover, for the sake of simplicity, the study took the “Close” index for further processing.
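One plausible reading of the “new Minute” attribute is a running index over the chronologically ordered (Date, Minute) pairs. The sketch below illustrates that reading only; the exact construction is not given in the text, and the function name, record layout, dates, and prices are all hypothetical.

```python
from datetime import date


def add_new_minute(records):
    """Sort records by (Date, Minute) and attach a 1-based running index.

    This is an assumed construction of the "new Minute" attribute, not the
    authors' documented procedure.
    """
    ordered = sorted(records, key=lambda rec: (rec["Date"], rec["Minute"]))
    return [{"new Minute": idx, "Close": rec["Close"]}
            for idx, rec in enumerate(ordered, start=1)]


# Hypothetical rows, deliberately out of order.
raw = [
    {"Date": date(2000, 1, 4), "Minute": 1, "Close": 144.25},
    {"Date": date(2000, 1, 3), "Minute": 2, "Close": 147.25},
    {"Date": date(2000, 1, 3), "Minute": 1, "Close": 147.125},
]
for row in add_new_minute(raw):
    print(row)
```

With such an index the model sees a single numeric input that increases by one per observed minute, paired with the “Close” value as the output.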

Deep learning, the best-performing tool, was applied using H2O, an advanced, high-performance package. Deep learning models with different distributions were applied for both training and testing. The experimental results suggest that, of the six distributions used, the models with the Gaussian, Poisson, and Gamma distributions perform far better than the rest. The study took these three as DNN-I, DNN-II, and DNN-III. The individual performance of all three was comparable with that of the reviewed papers (R2 of DNN-I: 0.87, DNN-II: 0.91, and DNN-III: 0.89). The study combined these three models into a new adaptive trio-ensemble model to raise the predictive capability beyond that of the individual models. The trio-ensemble model improves the performance to R2: 0.92.

The input was the sequential “Minute” attribute and the output was the “Close” index, so the model captures how “Close” changes with each new minute. Both attributes are numeric, and the deep model was configured according to the nature of the dataset. The distribution was chosen with the sequential nature of the input attribute in mind. The summary statistics (Min: 74.45, Max: 157.51, Median: 121.45, Mean: 121.46, Standard Deviation: 18.58, and Skewness: -0.22) describe the “Close” values. Out of six distributions (Quantile, Gaussian, Poisson, Laplace, Huber, and Gamma), only three (Gaussian, Poisson, and Gamma) performed very well.

The Gaussian distribution performs better than Quantile, Laplace, and Huber, which can be attributed to the central limit theorem, under which the means of large samples are approximately normally distributed; this suits the large, sequential dependent variable available in the training dataset. It can be mathematically expressed as follows:

$$f\left(x\right)=\frac{1}{\sigma \sqrt{2\pi }}{e}^{-\frac{1}{2}{\left(\frac{x-\mu }{\sigma }\right)}^{2}}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of the dataset.

The Poisson distribution is ideal for uncorrelated and discrete data, which suits this dataset (Lass et al. 2020). It works well when the dependent variable (Odyniec et al. 2014) occurs per unit time and its occurrences are independent of each other, as is the case for “Minute” and “Close” in the dataset. This is why it performs best here, with the highest R2 among the individual distributions. It can be mathematically expressed as follows:

$$f\left(x\right)=\frac{1}{x!}{\left(\lambda t\right)}^{x}{e}^{-\lambda t}$$

where $x=0, 1, 2, 3,\dots ,n$, $\mu = \lambda t$, and ${\sigma }^{2}= \lambda t$.

The Gamma distribution is a modification of the Poisson distribution in which the factorial function ($x!$) is replaced by the $\varGamma \left(r\right)$ function. It is also one of the best-performing distributions, with the second-highest ${R}^{2}$ after Poisson:

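$$f\left(x\right)=\frac{1}{\varGamma \left(r\right)}{\lambda }^{r}{x}^{r-1}{e}^{-\lambda x}$$

where $x\in \left[0, {\infty }\right)$, $\mu = r/\lambda$, and ${\sigma }^{2}= r/{\lambda }^{2}$ (equivalently $\mu = \alpha \beta$ and ${\sigma }^{2}= \alpha {\beta }^{2}$ with shape $\alpha = r$ and scale $\beta = 1/\lambda$).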
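The three densities discussed above can be evaluated numerically with the standard library alone. The sketch below uses the usual textbook forms: the Gaussian with mean μ and standard deviation σ, the Poisson with rate λt, and the Gamma with shape r and rate λ (so that its mean is r/λ). The function names are illustrative; the Gaussian example reuses the reported mean 121.46 and standard deviation 18.58, while the Poisson and Gamma parameters are arbitrary.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def poisson_pmf(x, rate):
    """Poisson probability of count x, where rate stands for lambda * t."""
    # Work in log space so that large counts do not overflow x!.
    return math.exp(x * math.log(rate) - rate - math.lgamma(x + 1))


def gamma_pdf(x, r, lam):
    """Gamma density with shape r and rate lam, defined for x > 0."""
    if x <= 0.0:
        return 0.0
    log_density = (r * math.log(lam) + (r - 1.0) * math.log(x)
                   - lam * x - math.lgamma(r))
    return math.exp(log_density)


# Density of the reported mean under a Gaussian with the reported moments.
print(gaussian_pdf(121.46, mu=121.46, sigma=18.58))

# The Poisson probabilities sum to one and have mean equal to the rate.
probs = [poisson_pmf(k, 4.0) for k in range(100)]
print(sum(probs), sum(k * p for k, p in enumerate(probs)))

# With shape r = 1 the Gamma density reduces to the exponential density.
print(gamma_pdf(2.0, r=1.0, lam=0.5), 0.5 * math.exp(-1.0))
```

Replacing the factorial x! by Γ(r) is what lets the Gamma density take a continuous, non-integer shape, which is the relationship to the Poisson form noted in the text.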

The predictions of DNN-I, DNN-II, DNN-III, and the trio-ensemble model on the testing dataset are compared in Fig. 7 and Table 6. The dark red line represents the real data, the yellow line the forecast of DNN-I, the dark green line DNN-II, the dark blue line DNN-III, and the dark cyan line the trio-ensemble prediction model. A comparison of real and predicted values for twenty sampled minutes is shown in Table 7.

Table 6

Performance of DNN-I, DNN-II, DNN-III, and trio-ensemble models in the testing dataset
S. No.	Model	R²	MSE	RMSE	MAE	MRD
1	DNN-I	0.87	43.77	6.61	5.29	43.77
2	DNN-II	0.91	31.19	5.58	3.89	-924.64
3	DNN-III	0.89	35.14	5.92	4.06	11.57
4	Trio-ensemble	0.92	30.52	5.52	4.22	30.52
Note: MRD (mean residual deviance), MAE (mean absolute error), RMSE (root mean square error), MSE (mean square error), and R² (coefficient of determination).

Table 7

Real vs. Forecasted value captured for 20 samples
Index	Minute	Real Close	DNN-I	DNN-II	DNN-III	Trio-ensemble
1	11	147.125	135.1753	143.0846	141.3083	139.1831
2	15	147.25	135.1762	143.0846	141.3088	139.1838
3	17	147.375	135.1767	143.0846	141.309	139.1841
4	18	147.375	135.1769	143.0846	141.3091	139.1843
5	25	146.938	135.1785	143.0846	141.31	139.1854
6	28	146.75	135.1791	143.0846	141.3103	139.1859
7	36	146.75	135.1809	143.0846	141.3113	139.1872
8	38	147.094	135.1814	143.0846	141.3116	139.1875
9	56	146.688	135.1854	143.0846	141.3137	139.1904
10	67	145.906	135.1879	143.0846	141.3151	139.1922
11	68	145.844	135.1881	143.0846	141.3152	139.1923
12	76	145.375	135.1899	143.0846	141.3161	139.1936
13	78	145.219	135.1904	143.0846	141.3164	139.1939
14	80	144.75	135.1908	143.0846	141.3166	139.1943
15	82	144.625	135.1913	143.0846	141.3169	139.1946
16	85	144.906	135.192	143.0846	141.3172	139.1951
17	87	144.625	135.1924	143.0846	141.3175	139.1954
18	91	144.625	135.1933	143.0846	141.318	139.196
19	99	144.188	135.1951	143.0846	141.3189	139.1973
20	102	144.25	135.1958	143.0846	141.3193	139.1978
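The twenty rows of Table 7 can be summarized as per-model errors with a few lines of code. The sketch below copies the table values and computes the MAE and RMSE of each column against the real "Close" values; the helper name `sample_errors` is illustrative. Twenty consecutive-looking minutes are far too few to stand in for the 180618-sample test set, so these figures describe this sample only.

```python
import math

# (real close, DNN-I, DNN-II, DNN-III, trio-ensemble), copied from Table 7.
ROWS = [
    (147.125, 135.1753, 143.0846, 141.3083, 139.1831),
    (147.25, 135.1762, 143.0846, 141.3088, 139.1838),
    (147.375, 135.1767, 143.0846, 141.309, 139.1841),
    (147.375, 135.1769, 143.0846, 141.3091, 139.1843),
    (146.938, 135.1785, 143.0846, 141.31, 139.1854),
    (146.75, 135.1791, 143.0846, 141.3103, 139.1859),
    (146.75, 135.1809, 143.0846, 141.3113, 139.1872),
    (147.094, 135.1814, 143.0846, 141.3116, 139.1875),
    (146.688, 135.1854, 143.0846, 141.3137, 139.1904),
    (145.906, 135.1879, 143.0846, 141.3151, 139.1922),
    (145.844, 135.1881, 143.0846, 141.3152, 139.1923),
    (145.375, 135.1899, 143.0846, 141.3161, 139.1936),
    (145.219, 135.1904, 143.0846, 141.3164, 139.1939),
    (144.75, 135.1908, 143.0846, 141.3166, 139.1943),
    (144.625, 135.1913, 143.0846, 141.3169, 139.1946),
    (144.906, 135.192, 143.0846, 141.3172, 139.1951),
    (144.625, 135.1924, 143.0846, 141.3175, 139.1954),
    (144.625, 135.1933, 143.0846, 141.318, 139.196),
    (144.188, 135.1951, 143.0846, 141.3189, 139.1973),
    (144.25, 135.1958, 143.0846, 141.3193, 139.1978),
]
MODELS = ["DNN-I", "DNN-II", "DNN-III", "Trio-ensemble"]


def sample_errors(rows):
    """Return {model: (MAE, RMSE)} for the prediction columns of rows."""
    result = {}
    for k, name in enumerate(MODELS, start=1):
        errs = [row[0] - row[k] for row in rows]
        mae = sum(abs(e) for e in errs) / len(errs)
        rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
        result[name] = (mae, rmse)
    return result


for name, (mae, rmse) in sample_errors(ROWS).items():
    print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f}")
```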

Different studies use different performance metrics to report their experimental results, so a direct comparison across all of them would be unfair. Nevertheless, a similar set of performance metrics was used for the mixture model of SVR and ENANFIS, whose accuracy was reported as MSE: 0.33 to 0.69, MAE: 0.41 to 0.52, and RMSE: 0.57 to 1.48. The authors of that work recommend, in their conclusion and future scope, proceeding with deep learning models. The present study exceeds the best result found in the literature review by a wide margin, with R2 = 0.92.

In this paper, an adaptive trio-ensemble deep neural network (ATDNN) model is proposed using deep learning with three different distributions. The experimental results show that ATDNN performs better than the individual distributional deep learning models in terms of lower RMSE and higher R2. They also show that ATDNN provides good predictive accuracy on the 1-minute (high-frequency) stock price dataset. The performance of the proposed ensemble model may be improved further by tuning the parameters of the trio-ensemble ATDNN and by comparing it with tree-based ensemble models for high-frequency data prediction. Finally, this paper reports a major challenge of the area: despite the huge interest in it, true high-frequency stock price datasets are not freely available, although they are what deeper future research needs most.

The author acknowledges that there is no known conflict of interest directly or indirectly related to the work submitted for publication.

ORCID

Ravinder Kumar https://orcid.org/0000-0003-2117-5734

Atsalakis, G. S., & Valavanis, K. P. (2009). Surveying stock market forecasting techniques - Part II: Soft computing methods. Expert Systems with Applications, 36(3 PART 2), 5932–5941. https://doi.org/10.1016/j.eswa.2008.07.006
Babu, C. N., & Reddy, B. E. (2015). Prediction of selected Indian stock using a partitioning–interpolation based ARIMA–GARCH model. Appl Comput Inform, 11(2), 130–143. https://doi.org/10.1016/j.aci.2014.09.002
Basak, S., Kar, S., Saha, S., Khaidem, L., & Dey, S. R. (2019). Predicting the direction of stock market prices using tree-based classifiers. North American Journal of Economics and Finance, 47(December 2017), 552–567. https://doi.org/10.1016/j.najef.2018.06.013
Bezerra, P. C. S., & Albuquerque, P. H. M. (2017). Volatility forecasting via SVR–GARCH with mixture of Gaussian kernels. Computational Management Science, 14(2), 179–196. https://doi.org/10.1007/s10287-016-0267-0
Brown, M. S., Pelosi, M. J., & Dirska, H. (2013). Dynamic-radius species-conserving genetic algorithm for the financial forecasting of Dow Jones Index stocks. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7988 LNAI, 27–41. https://doi.org/10.1007/978-3-642-39712-7_3
Candel, A., Parmar, V., LeDell, E., & Arora, A. (2016). Deep learning with H2O. H2O. ai Inc. Sep:1–21
Chen, L., Qiao, Z., Wang, M., Wang, C., Du, R., & Stanley, H. E. (2018). Which Artificial Intelligence Algorithm Better Predicts the Chinese Stock Market? IEEE Access, 6(8), 48625–48633. https://doi.org/10.1109/ACCESS.2018.2859809
Chong, E., Han, C., & Park, F. C. (2017). Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Systems with Applications, 83, 187–205. https://doi.org/10.1016/j.eswa.2017.04.030
Chung, H., & Shin, K. (2020). Genetic algorithm-optimized multi-channel convolutional neural network for stock market prediction. Neural Computing and Applications, 32(12), 7897–7914. https://doi.org/10.1007/s00521-019-04236-3
Minh, D. L., Sadeghi-Niaraki, A., Huy, H. D., Min, K., & Moon, H. (2018). Deep learning approach for short-term stock trends prediction based on two-stream gated recurrent unit network. IEEE Access, 6, 55392–55404. https://doi.org/10.1109/ACCESS.2018.2868970
Ersan, D., Nishioka, C., & Scherp, A. (2019). Comparison of machine learning methods for financial time series forecasting at the examples of over 10 years of daily and hourly data of DAX 30 and S&P 500. Journal of Computational Social Science. https://doi.org/10.1007/s42001-019-00057-5
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270(2), 654–669. https://doi.org/10.1016/j.ejor.2017.11.054
Gandhmal, D. P., & Kumar, K. (2019). Systematic analysis and review of stock market prediction techniques. Computer Science Review, 34, 100190. https://doi.org/10.1016/j.cosrev.2019.08.001
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge: MIT press
Guo, Y., Han, S., Shen, C., Li, Y., Yin, X., & Bai, Y. (2018). An adaptive SVR for high-frequency stock price forecasting. IEEE Access, 6, 11397–11404. https://doi.org/10.1109/ACCESS.2018.2806180
Ham, F. M., & Kostanic, I. (2002). Principles of Neurocomputing for Science & Engineering. Tata McGraw Hill
Heaton, J. B., Polson, N. G., & Witte, J. H. (2017). Deep learning for finance: deep portfolios. Applied Stochastic Models in Business and Industry, 33(1), 19–21. https://doi.org/10.1002/asmb.2230
Henrique, B. M., Sobreiro, V. A., & Kimura, H. (2019). Literature review: Machine learning techniques applied to financial market prediction. Expert Systems with Applications, 124, 226–251. https://doi.org/10.1016/j.eswa.2019.01.012
Hoseinzade, E., & Haratizadeh, S. (2019). CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Systems with Applications, 129, 273–285. https://doi.org/10.1016/j.eswa.2019.03.029
Jiang, W. (2020). Applications of deep learning in stock market prediction: recent progress. Statistical Finance (q-fin.ST); Machine Learning (cs.LG), ArXiv, 1–97, https://arxiv.org/abs/2003.01859
Khaidem, L., Saha, S., & Dey, S. R. (2016). Predicting the direction of stock market prices using random forest. ArXiv, 1–20. http://arxiv.org/abs/1605.00003
Lass, J., Jacobsen, H., Mazzone, D. G., & Lefmann, K. (2020). MJOLNIR: A software package for multiplexing neutron spectrometers. SoftwareX, 12, 100600
Long, W., Lu, Z., & Cui, L. (2019). Deep learning-based feature engineering for stock price movement prediction. Knowledge-Based Systems, 164, 163–173. https://doi.org/10.1016/j.knosys.2018.10.034
Odyniec, M., Luttman, A. B., Howard, M. M., Bardsly, J., Joyce, K., Hock, M., & Fowler, M. (2014). Maximum Likelihood Estimation and Uncertainty Quantification for Signals with Poisson-Gaussian Mixed Noise, LO-05-14. Nevada Test Site/National Security Technologies, LLC (United States). Sep 18
Reschenhofer, E., Mangat, M. K., Zwatz, C., & Guzmics, S. (2020). Evaluation of current research on stock return predictability. Journal of Forecasting, 39(2), 334–351. https://doi.org/10.1002/for.2629
Rusk, N. (2015). Deep learning. Nature Methods, 13(1), 35. https://doi.org/10.1038/nmeth.3707
Schmidhuber, J. (2015). Deep Learning in neural networks: An overview. Neural Networks, 61, 85–117. doi: 10.1016/j.neunet.2014.09.003
Senapati, D., & Karmeshu (2016). Generation of cubic power-law for high frequency intra-day returns: Maximum Tsallis entropy framework. Digital Signal Processing: A Review Journal, 48, 276–284. https://doi.org/10.1016/j.dsp.2015.09.018
Shah, D., & Isah, H. (2019). Stock Market Analysis: A Review and Taxonomy of Prediction Techniques. International Journal of Financial Studies, MDPI, no. ii
Shrivastav, L. K., & Kumar, R. (2021). High-Frequency Stochastic Data Analysis Using a Machine Learning Framework: A Comparative Study. Cognitive Computing Systems, 3–31. Apple Academic Press, Taylor and Francis Group
Shrivastav, L. K., & Kumar, R. (2022a). Journal of Information Technology Research, 15(1), Article 2. IGI Global. https://doi.org/10.4018/JITR.2022010102
Shrivastav, L. K., & Kumar, R. (2022b). Journal of Information Technology Research, 15(1), Article 1. IGI Global. https://doi.org/10.4018/JITR.2022010101
Xu, W., Chen, Y., Coleman, C., & Coleman, T. F. (2018). Moment matching machine learning methods for risk management of large variable annuity portfolios. Journal of Economic Dynamics and Control, 87(71771175), 1–20. https://doi.org/10.1016/j.jedc.2017.11.002
Zhang, J., Cui, S., Xu, Y., Li, Q., & Li, T. (2018). A novel data-driven stock price trend prediction system. Expert Systems with Applications, 97, 60–69. https://doi.org/10.1016/j.eswa.2017.12.026
