Bursts and Regularization in Financial Markets: Deterministic or Stochastic?

We analyze empirical finance data, such as the Financial Stress Index, together with a number of asset classes (swaps, equity and bonds), markets (emerging vs. developed), issuers (corporate vs. government bonds) and maturities (short vs. long), asking whether the recently observed alternations between calm periods and financial turmoil can be modelled in a low-dimensional deterministic manner, or whether their description demands a stochastic model. We find that a deterministic model performs at least as well as one of the best stochastic models, but may provide additional insight into the essential mechanisms that drive financial markets.


Introduction
The question of whether financial data is deterministic or stochastic has occupied a substantial part of mathematically minded economic research over the last few decades, where opinions often tended towards the view that chaotic aspects are strongly dominated by stochastic-process aspects. For an overview of the methods used and the conclusions drawn, from a perspective with a potentially mild bias towards chaos in economics, see Ref. 1 . To date, the conclusions drawn were almost exclusively based on indirect, mostly averaged, statistical time-series quantifiers, such as Lyapunov exponents, fractal dimensions 2 , recurrence quantification analysis 3 or BDS testing 4 . Here, to the best of our knowledge, we provide for the first time a direct comparison of financial time series with random-based vs. low-dimensional deterministic map-based modeling. We are led towards such an approach by the hypothesis that even seemingly highly complex financial dynamics, embedded into many-faceted contexts, may be driven by human agents that follow simple rules and prescriptions.
Emerging from the impact of COVID-19 as a black swan event, the recent dynamics of financial markets have shown spikes both in volatility and in performance. Hence, we ask in this work to what extent they can be modeled by low-dimensional dynamical systems. To model the data, we use Rulkov's family of two-dimensional maps 5 , originally designed for the simulation of neuronal firing. We find that such simple deterministic modeling yields a quality of results that differs from that of an optimized ARIMA-GARCH model only by effects attributable to white noise, but may additionally offer detailed insight into the structure behind the processes inherent in the financial dynamics.

Trends in recent financial modeling
In the recent past, financial time series have often been simulated as pure jump processes 6 , as jump-diffusion models that combine a jump process (represented by Merton log-normal jumps 7 ) with a diffusion process 8 , as Lévy-type processes 9 that, when suitably truncated, have finite moments and can account for excess kurtosis at short time-scales along with the slow convergence to a Gaussian at longer time-scales 10 , or by wavelets 11,12 that are able to cope with both jumps and noise 13 . More recently, Hawkes point processes 14 have attracted attention, as they easily incorporate contagious jumps into financial models.
The approach that we take here is to compare market dynamics explained through deterministic chaos, see, e.g., Ref. 15 , with the more widely used traditional stochastic approaches of the field. Chaos has been described as "the dynamical mechanism by which nature develops constrained and useful randomness" 16 . Chaos was found in a discretized evolutionary game 17 that represents "a generic situation in which individuals interact and reproduce according to the result of that interaction, which in turn depends on the composition of the population". This indicates that even when a system is deterministic, its appearance may look arbitrary. A similar result, stating that "the microscopic randomness of the mutation process can be amplified to macroscopic unpredictability by evolutionary dynamics", was recently obtained from an infinitely repeated prisoner's dilemma 18 , and a discrete-time replicator map (a prototype of an evolutionary selection game) was shown to yield a fixed-point solution of the dynamics that could be interpreted as the Nash equilibrium of an evolutionarily stable strategy 19 . Besides this relation to game theory, financial markets have an affinity to the behavior of neural networks. As a consequence, a stochastic application of the FitzHugh-Nagumo system 20,21 has been used to model the electricity market 22 . Recent empirical studies reported that firms are significantly impacted by the aftermath of a crisis, resulting in "an explosive oscillating convergence" 23 , which can be seen as a hallmark of neuronal-like response patterns.
In our work, we, therefore, scrutinize whether a stochastic approach is indeed preferable to low-dimensional deterministic approaches. For this, we apply a model of neuronal dynamics that has proven to be much closer to the biological example of neuronal firing, Rulkov's two-dimensional map family 5 , the first vector component of which represents an instantaneous activity, whereas a second component describes the underlying trend. Interaction between coupled Rulkov maps may generate transitions between stationary, periodic, quasiperiodic, and chaotic regimes where "due to both synaptic coupling and common inputs among neurons there are many types of synchronization, which can be generally regarded as the presence of a consistent temporal relationship between their activity patterns" 24 . More specifically, the degree of coupling may vary from complete synchronization (spikes happen at the same time) to bursting synchronization (only the first spike happens at the same time) and more.
We begin with the presentation of our version of the Rulkov map that we propose as a new modeling tool for financial data. Then we will provide an overview of the traditional approaches most widely used in the field, in order to finally be able to compare our new approach with the results provided by the more classical approaches.

Methodological elements
In this section, we start with a review of the methodological elements used in our analysis, i.e. the financial modeling tools, and then recall the most popular statistics for model comparison and fit quality.

Generalized Rulkov map
The use of Rulkov maps in the modeling of financial data is the decisive new element of our approach. During the recent turbulence-prone years, ten Asian emerging stock markets showed return series with non-linear serial dependence, and mean reversion was observed in real exchange rates of developing and emerging market economies 25 , in four USA stock market indices 26 , and in USA short rate indices 27 . By virtue of their recursive nonlinear and mean-reverting properties, Rulkov maps may be expected to be highly suitable for the modeling of financial data, in particular regarding the occurrence of data clusters, heteroscedasticity, mutual synchronization, chaos, the regularization of bursts of activity across markets and asset classes after shocks, and more. Moreover, networks of Rulkov maps can be used to mimic diffusion and increased interrelations across markets.
To provide a first general insight into the nature of Rulkov maps, we start from a two-dimensional non-linear recurrence of the form

x_{t+1} = f_α(x_t) + γ y_t + δ, (1)

with

f_α(x) = α / (1 + x^n) (2)

and α, γ, δ ∈ R; such a setting is widely used for the modelling of short-time effects like pulses and shocks in time series. Vector component y captures a moving-average trend, implemented as

y_{t+1} = β y_t − µ x_t + η, (3)

with β, µ, η ∈ R. Combining Eq. (1) with Eq. (3) leads to the two-dimensional map

x_{t+1} = α / (1 + x_t^n) + γ y_t + δ,
y_{t+1} = β y_t − µ x_t + η. (4)

This version differs from the original Rulkov version and is the map we will use for modeling financial time series, where mostly the parameter n will be set equal to two. Its specific two-dimensional form, with short-time (x) and average-prediction (y) horizon components (the main subjects of interest in financial modeling), renders the map an optimal device. Of importance are the following characteristic properties of the function f_α: i) f_α(x) has a unique extremum, attained at x = 0; ii) f_α(x) tends to 0 as x goes to ±∞; iii) lim_{n→∞} f_α(x)/α is the rectangle function.
Proof. Without loss of generality we set α > 0.
i) One has f′_α(x) = −α n x^{n−1} / (1 + x^n)² for any x ∈ R, which is null if and only if x = 0.
ii) This is trivial.
iii) It is a consequence of the well-known limit of x^n for n → ∞ (0 for |x| < 1, ∞ for |x| > 1), valid for any x ∈ R and any even integer n.
Case n = 2 is special, as it guarantees an increased variability of the output x_t, cf. Figure 1. This setting will be used in our modeling approaches unless specified otherwise.
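For readers who wish to experiment, the iteration can be sketched in a few lines of Python. The functional form (a fast variable driven by f_α(x) = α/(1 + x^n) plus linear trend terms, and an affine slow-trend update) follows our reading of Eqs. (1)-(4); the parameter names and default values below are illustrative assumptions, not the calibrated values of Table 3:

```python
import numpy as np

def rulkov_step(x, y, alpha=4.1, gamma=1.0, delta=0.0,
                beta=1.0, mu=0.001, eta=-0.001, n=2):
    """One iteration of the generalized Rulkov map (assumed form):
    fast:  x_{t+1} = alpha / (1 + x_t**n) + gamma * y_t + delta
    slow:  y_{t+1} = beta * y_t - mu * x_t + eta
    Parameter values are illustrative, chosen near the classic
    Rulkov bursting regime, not the paper's calibrated ones."""
    x_next = alpha / (1.0 + x ** n) + gamma * y + delta
    y_next = beta * y - mu * x + eta
    return x_next, y_next

def simulate(T, x0=-1.0, y0=-2.9, **params):
    """Iterate the map T times and return the fast and slow series."""
    xs, ys = np.empty(T), np.empty(T)
    x, y = x0, y0
    for t in range(T):
        xs[t], ys[t] = x, y
        x, y = rulkov_step(x, y, **params)
    return xs, ys
```

With these illustrative parameters the fast component x exhibits the spiking/bursting behavior that motivates the use of the map for stress-index data, while y drifts on a much slower scale.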

To simulate the simplest interaction between markets, asset classes etc., two Rulkov maps (labelled by indices i and j) are coupled through an additional term in the fast-variable equation that describes time-lagged interactions between the time series i and j (and likewise for interchanged indices). Here, α, β, γ, µ, η, δ, κ, θ are parameters and τ is the embedding delay used for the reconstruction of the phase space from the one-dimensional time series 2, 28 . The idea underlying this specific coupling is that such an interaction will mostly occur on a short time-scale.
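A minimal numerical sketch of two interacting maps is given below. The precise coupling used in the paper is not reproduced here; as a stated assumption we use a simple diffusive, time-lagged coupling of strength kappa acting on the fast variables, with lag tau, while the role of the parameter θ is left out:

```python
import numpy as np

def coupled_rulkov(T, kappa=0.05, tau=3,
                   alpha=4.1, mu=0.001, eta=-0.001, n=2):
    """Two Rulkov maps i, j with a time-lagged interaction in the fast
    variables.  The diffusive form kappa * (x_other[t - tau] - x_self[t])
    is an illustrative assumption, not the paper's exact coupling."""
    x = np.full((T, 2), [-1.0, -0.5])   # distinct initial fast states
    y = np.full((T, 2), -2.9)           # common initial slow states
    for t in range(T - 1):
        for i in range(2):
            j = 1 - i
            lag = x[t - tau, j] if t >= tau else x[0, j]
            x[t + 1, i] = (alpha / (1.0 + x[t, i] ** n) + y[t, i]
                           + kappa * (lag - x[t, i]))
            y[t + 1, i] = y[t, i] - mu * x[t, i] + eta
    return x, y
```

Varying kappa in such a sketch moves the pair from independent bursting towards the (burst-)synchronized regimes discussed above.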

ARIMA-GARCH models
To validate the low-dimensional deterministic Rulkov map approach, we will compare it against an optimized Auto-Regressive Integrated Moving Average model (ARIMA) combined with a Generalized Autoregressive Conditional Heteroskedasticity model (GARCH), where the first model is used to model a "wide-sense stationary time series" once the integrated component has been removed, and the second to model its volatility. This combined approach is considered by a large part of the theoretical and practical specialists of finance modeling to be the best modeling approach for the type of data that we will deal with. The ARIMA parameters are the lag p, the degree of differencing d, and the order of the moving-average model q. The parameters of the GARCH model are the order of the GARCH terms a and the order of the ARCH terms b. The values of the sets {p, d, q} and {a, b} are chosen based on the Bayesian information (BIC) and the Akaike information (AIC) criteria 29 . In the following, this parameter-optimized ARIMA-GARCH model will be denoted by ARIMA-GARCH * .
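The information-criterion-based order selection can be illustrated without specialized libraries. The sketch below (an assumption-laden simplification: pure AR models fitted by least squares on a differenced series, ranked by AIC/BIC under a Gaussian residual assumption) conveys the selection logic; the full ARIMA-GARCH* fit in the paper would instead use dedicated packages:

```python
import numpy as np

def ar_ic(z, p):
    """Least-squares AR(p) fit of series z; returns (AIC, BIC) under a
    Gaussian residual assumption: IC = n*log(RSS/n) + penalty*k, k = p + 1
    (p lag coefficients plus an intercept)."""
    Z = np.column_stack([z[p - i - 1:len(z) - i - 1] for i in range(p)]
                        + [np.ones(len(z) - p)])
    target = z[p:]
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    rss = np.sum((target - Z @ coef) ** 2)
    n, k = len(target), p + 1
    ll_term = n * np.log(rss / n)
    return ll_term + 2 * k, ll_term + k * np.log(n)

def select_order(z, d=1, max_p=5):
    """Difference d times, then pick the AR order with minimal BIC."""
    z = np.asarray(z, dtype=float)
    for _ in range(d):
        z = np.diff(z)
    return min(range(1, max_p + 1), key=lambda p: ar_ic(z, p)[1])
```

The same scan, extended to the moving-average order q and the differencing order d, is what the optimized ARIMA-GARCH* calibration automates.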

Naive model
For time series prediction, naive models 30 are not only cost-effective, they also provide a benchmark against which more sophisticated models can be compared. In our context, the naive forecast equals the last period's observed value, resulting in an "in-sample one-step-ahead" algorithm that, given the time series Z = {z_h : h ∈ T}, with T being the (time) index set, produces errors of size e_h = z_h − z_{h−1}.
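The naive benchmark amounts to one line of code, sketched here for completeness:

```python
import numpy as np

def naive_errors(z):
    """In-sample one-step-ahead naive forecast: zhat_h = z_{h-1},
    so the residuals are e_h = z_h - z_{h-1} (the first differences)."""
    z = np.asarray(z, dtype=float)
    return z[1:] - z[:-1]
```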

Elements of model comparison
To compare modeling results, we use a set of metrics:

Normalized root mean squared error
The root mean squared error (RMSE), defined as

RMSE = √( (1/n) ∑_{h=1}^{n} e_h² ),

measures the accuracy of a model in terms of goodness of fit, where e_h denote the residuals between the observations and the simulations over n time steps; values below unity indicate a good fit. RMSE, however, depends on the scale of the observed data and is outlier-sensitive, since larger errors have a disproportionately large effect. For this reason, we will apply the normalized root mean squared error (NRMSE):

NRMSE = RMSE / (z_max − z_min),

where z_max denotes the maximum value and z_min the minimum value of the observed sample data.
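The definitions above translate directly into code (function and argument names are ours):

```python
import numpy as np

def nrmse(observed, simulated):
    """RMSE of the residuals, normalized by the observed range
    z_max - z_min so that differently scaled series are comparable."""
    observed = np.asarray(observed, dtype=float)
    e = observed - np.asarray(simulated, dtype=float)
    rmse = np.sqrt(np.mean(e ** 2))
    return rmse / (np.max(observed) - np.min(observed))
```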

Relative mean absolute error
The relative mean absolute error is

RMAE = MAE^(1) / MAE^(2),

where MAE^(1) is the mean absolute error of the model of interest and MAE^(2) is the error of some benchmark model 31 . A value greater than unity indicates that the chosen model performs worse than the benchmark.
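As code, with the residual series of the two models as inputs (naming is ours):

```python
import numpy as np

def rmae(residuals_model, residuals_benchmark):
    """MAE of the model of interest divided by the benchmark's MAE;
    values above 1 mean the model underperforms the benchmark."""
    return (np.mean(np.abs(residuals_model))
            / np.mean(np.abs(residuals_benchmark)))
```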

Mean absolute percentage error
To measure the prediction quality of a model, often the mean absolute percentage error (MAPE) is used,

MAPE = (1/n) ∑_{h=1}^{n} |e_h / z_h| × 100%,

where e_h denote the residuals between the observations z_h and their simulations over n time steps 32 . Table 1 exhibits generally accepted MAPE accuracy levels 33 .
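A direct implementation (names are ours; note the metric is undefined for zero observations):

```python
import numpy as np

def mape(observed, simulated):
    """Mean absolute percentage error, in percent.  Observations must be
    nonzero, otherwise the ratio e_h / z_h is undefined."""
    z = np.asarray(observed, dtype=float)
    e = z - np.asarray(simulated, dtype=float)
    return 100.0 * np.mean(np.abs(e / z))
```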

Dynamic time warping distance
To further scrutinize how close the models are, dynamic time warping distance (DTWD) can be used, a technique that finds the optimal alignment between a modeled and a target time series 34 . By stretching or shrinking, one time series is "warped" non-linearly along the time axis, to determine the degree of similarity between two time series 35 . This technique is commonly used in speech recognition, to determine whether two waveforms represent the same spoken phrase. It has, however, also been applied to other fields such as data mining, neural networks and time series classification 36,37 . For instance, Lines et al. 38 , when comparing several distance measures, showed that no single distance measure significantly outperforms DTW. A recent application to financial time series can be found in Ref. 39 . In our case, we need the information on how much the predicted time series is shifted against the modeled data and whether, and if so by how much, a particular modeling suffers from slowed-down or accelerated reactions due to past predictions; in a comparison between different modelings, such effects may play a role. In Fig. 2 we present the DTWD results for an artificial system that we will later use to calibrate the results obtained for our chosen modeling approaches.
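The alignment is computed by a standard dynamic program; a minimal sketch with the absolute difference as local cost is given below (a squared local cost with a final square root yields the Euclidean-flavored variant):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic-programming DTW.  D[i, j] holds
    the cost of the best warping path aligning a[:i] with b[:j]."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessors
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[len(a), len(b)]
```

Identical series give distance zero, and a series warped by repeating a sample still matches perfectly, which is exactly the time-shift tolerance exploited in the comparison of the residual series.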

Data investigated
In Table 2 we provide an overview of the empirical financial time series used for our numerical simulations. Whereas otherwise standard dataset abbreviations will be used, our abbreviation 'BAMLEM' refers to the 'Emerging Markets Corporate Plus Index Total Return Index Value'.

Financial stress index modeling
The Financial Stress Index (STLFSI2) is our main focus in financial data modeling, as it measures the degree of financial stress in the markets 40 . The index is obtained from seven interest rates, six yield spreads, and five other indicators by means of principal component analysis (PCA), as each of these variables captures some aspect of financial stress. Accordingly, whenever the level of financial stress in the economy changes, the data series are expected to change in a coherent manner. Positive stress values indicate higher-than-average stress in financial markets, while negative values indicate lower-than-average levels of stress, cf. Fig. 3. Notice that the index is not a leading but only a coincident index, measuring developments as they occur. Below, we report the accuracy of the data modeling by our Rulkov map approach, then its potential for making forecasts, and finally present evidence of chaotic behavior in the data. As the STLFSI2 index is a combination of other indices, a single Rulkov map (with the calibrated parameters displayed in Table 3) was used for its modeling. For the remaining five indices, there are twenty possible couplings; for space reasons, we display only the most representative ones, as the omitted ones largely follow the same behavior. One example of financial stress was the 2008-2009 financial crisis. A second example is the ongoing turmoil in financial markets stemming from the fear and uncertainty associated with the COVID-19 pandemic. Since mid-February 2020, COVID-19 uncertainty has caused distress in financial markets by triggering "a massive sell-off in stocks and consequent plunge in stock prices, sharp declines in interest rates, and stunning increases in financial market volatility" 40 . The top row of Fig. 4 shows how the bursting dynamics of the Rulkov map replicates the STLFSI2 index.
In the second row, we display the error, and in the third row we compare this to the error obtained from a calibrated ARIMA-GARCH * model (for model details, cf. 3.1.2). The best ARIMA-GARCH models in terms of the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) were, in our analysis, those integrated of order one 30 . For this reason, we calibrated all investigated models (Rulkov map, Naive model and ARIMA-GARCH * , see below) over the first differences of the STLFSI2 only. While at first this attempt looked satisfactory, we identified correlations in the residuals that required further investigation of integration, mean reversion, and ARCH effects. The latter problem might be attributed to the fact that some financial time series seem not to satisfy the finiteness condition imposed on the first four distribution moments, which would invalidate the assumption on the asymptotic behaviour of most tests of nonlinearity 41 .

Chaotic or random data?
To check whether the expected chaotic aspects are indeed present in the data, and if so, how well they are reflected by the model, we evaluated, for the data and their simulations, the maximal Lyapunov exponent (MLE) that serves as a decision guideline 42-45 . The MLE calculated on the first differences of the STLFSI2 index is 0.2214, versus 0.2607 for the Rulkov map; both values are close to those of the chaotic behavior of the logistic map at a nonlinearity parameter around 3.6 and hence indicate the presence of low-dimensional chaos in the data. The approximate entropy (ApEn) is used to quantify the amount of regularity and the unpredictability of fluctuations in time series 46 ; for an early application to financial time series see Ref. 47 . If the value is very small, the sequence is regular. Table 4 lists the maximal Lyapunov exponent, the approximate entropy, the correlation dimension and the Hurst exponent, together with the values obtained from the corresponding Rulkov map approach, evidencing a good agreement between the model and the empirical time-series data properties. In the current framework, we deal with two time series: the fast x and the slow y components. In finance, they correspond to the actual data and their memory, respectively, where the latter is approximated by the exponentially weighted moving average (EWMA). From the application point of view, it seems advisable to characterize the instability of each component individually by means of one-dimensional Lyapunov exponents, although the underlying system is in fact two-dimensional. Such a viewpoint is also supported by the nature of the two components, the dynamics of which follow two well-separated time scales.
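The regularity quantification can be made concrete with a small implementation of ApEn in the style of Pincus (parameters m = 2 and r = 0.2 are common illustrative defaults, not the paper's settings); a perfectly alternating binary signal then scores near zero, while a random binary signal scores substantially higher:

```python
import numpy as np

def approx_entropy(U, m=2, r=0.2):
    """Approximate entropy ApEn(m, r): small values indicate a regular
    sequence, larger values an irregular one."""
    U = np.asarray(U, dtype=float)

    def phi(m):
        n = len(U) - m + 1
        X = np.array([U[i:i + m] for i in range(n)])
        # for each template, the fraction of templates (self included)
        # within Chebyshev distance r
        C = np.array([np.mean(np.max(np.abs(X - X[i]), axis=1) <= r)
                      for i in range(n)])
        return np.mean(np.log(C))

    return phi(m) - phi(m + 1)

regular = [0, 1] * 50                                    # ...010101...
irregular = list(np.random.default_rng(0).integers(0, 2, 100))
```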

Details of model comparison
A problem deserving special attention is the potential occurrence of serial correlations in the data, which could result either from some deterministic trends or from a stochastic trend. Whereas the first possibility is not relevant for our case, the latter can again be eliminated by differencing the original non-stationary time series once, so that they can be seen as wide-sense stationary time series, a situation that can be treated by the Box-Jenkins (ARMA) methodology 30 . While in principle there is no limit to the differencing, the procedure must not be pushed too far, since higher (> 2) orders "negatively impact (distort) the desired frequency components (points or intervals)" 53 . In our tests up to order four, only the first order yielded improvements.
The autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the residuals of the state variables x and y, for the Rulkov map and for the ARIMA-GARCH * model, shown in Fig. 5, evidence that first-order differencing is more effective in removing autocorrelation from the residuals of the Rulkov than of the ARIMA-GARCH * approach. This could be seen as a sign of the greater explanatory power of the Rulkov map, although, while "the inclusion of correlated residuals in latent-variable models is often regarded as a statistical sleight of hand, if not an outright form of cheating" 54 , sometimes correlated residuals are features of the research design or intrinsic to the studied phenomenon. Indeed, the financial stress index (STLFSI2) is a combination of other indices and, according to Refs. 55,56 , there is significant evidence of positive serial correlation for weekly and monthly holding-period index returns. Generally, weekly and monthly returns of individual stocks are weakly negatively correlated, whilst daily, weekly and monthly index returns are positively correlated 57 . This effect has been explained by positive cross-autocorrelations across individual securities across time, and it may happen that it cannot be removed completely 57 . The absence of a unit root, however, has been confirmed using the KPSS test 58 .

Figure 5.
Autocorrelations and partial autocorrelations for state variables x (first two columns) and y (last two columns), comparing the Rulkov map to ARIMA-GARCH * , realized by an ARIMA(2,1,2)-GARCH(1,1) type. While autocorrelation is almost absent in the Rulkov map, the ARIMA-GARCH * displays significant autocorrelations. This could be an indication of the greater explanatory power of the Rulkov map.
To determine how close the Rulkov and the ARIMA-GARCH * approaches are, we used the DTWD measure, which returns the Euclidean distance between two warped time series. We first calibrated the DTWD on the simple example of a sawtooth wave that was replicated upon adding Gaussian white noise of distinct strengths (cf. Fig. 2). Figure 6 shows the warped residuals of the Rulkov and the ARIMA-GARCH * approaches, and the Euclidean distance for the state variable x. The residuals are almost identical and closely follow the original timeline (in the sense that the warping did not stretch the dependent time series). For y, as reported in Table 5, we obtained similar results, but for space reasons we refrain from showing the corresponding plots. With the only exception of the DTWD for x_j of the IBOV, all distances in Table 5 are below the DTWD of a time series perturbed by Gaussian white noise. From this, we conclude that the distinction between the ARIMA-GARCH * and the Rulkov map approaches is of, or below, the order of the influence of noise.
In Table 5, we report the RMAE accuracy of the Rulkov map versus the Naive and the ARIMA-GARCH * models, the NRMSE of the Rulkov map and the ARIMA-GARCH * model, and their DTW distance. As may be expected, in all instances the Rulkov map performs substantially better than the Naive model, whereas the comparison between the Rulkov map and the ARIMA-GARCH * yields mixed results: the Rulkov map's NRMSE is always lower, but its MAE is occasionally higher. The differences, however, are small in all cases, as is corroborated by the DTW distance of the residuals. The results obtained with the Rulkov map are thus similar to, and often even better than, those obtained with the optimal ARIMA-GARCH model; the DTWD confirms that the two models are very close. To achieve stationarity, where required, time series have been differenced/detrended.

Forecasting power
Of particular interest is that the Rulkov map can easily be used for forecasting. Indeed, once the values x_s, y_s at time s are known, the predictions for the next time t > s are computed (deterministically) by Eq. (4). In our set-up, the parameters are calibrated over the preceding window [s − m + 1, s] of suitably chosen size m. Because we deal with a weekly-frequency dataset, we set m = 52. In particular, starting from m, we predict the next values through a rolling window of fixed size m, as explained above. The parameters are calibrated by a nonlinear least squares regression. We run a robust estimation with the iteratively re-weighted least squares algorithm 59 which, at each iteration, recomputes the weights based on the residuals of the previous iteration; this process continues until the weights converge. Using MAPE, the generated forecasts are compared with those of the ARIMA-GARCH * model, see Table 6 and Figure 7. The comparison provides evidence of fits of similar quality, both for the process and its trend, confirming that a deterministic approach (the Rulkov map) may perform as well as a stochastic approach (the ARIMA-GARCH * model).
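A stripped-down sketch of the rolling-window forecast follows. It is a deliberate simplification under stated assumptions: only α is calibrated, by a coarse grid search minimizing squared one-step errors over the window, for the assumed map x_{t+1} = α/(1 + x_t²) + y_t with a fixed slow-variable update; the paper instead calibrates all parameters by iteratively re-weighted least squares:

```python
import numpy as np

def one_step_forecast(x_hist, y_start, m=52,
                      alphas=np.linspace(3.0, 5.0, 41),
                      mu=0.001, eta=-0.001):
    """One-step-ahead forecast from the last m points of x_hist, with
    y_start the slow-variable value at the window's first point.  Only
    alpha is calibrated here (grid search); everything else is assumed."""
    x = np.asarray(x_hist, dtype=float)[-m - 1:]
    best_alpha, best_sse, y_end = None, np.inf, y_start
    for a in alphas:
        y, sse = y_start, 0.0
        for t in range(len(x) - 1):
            pred = a / (1.0 + x[t] ** 2) + y     # one-step prediction
            sse += (x[t + 1] - pred) ** 2
            y = y - mu * x[t] + eta              # slow-variable update
        if sse < best_sse:
            best_alpha, best_sse, y_end = a, sse, y
    return best_alpha / (1.0 + x[-1] ** 2) + y_end
```

On synthetic data generated by the same map, the grid search recovers the generating α and the forecast matches the next observation, which is the deterministic-forecast property exploited above.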

Conclusions
In our work, we found that a deterministic chaotic model performs as well as an ultimately refined stochastic one. Specifically, our low-dimensional deterministic Rulkov map approach suffers less from residual autocorrelation and exhibits an improved NRMSE and a better dynamic time warping distance when compared to the prominent ARIMA-GARCH * model. The situation that we have to deal with is, more generally, similar to the one encountered in neuroscience, where initial conclusions of simple chaotic or power-law behavior were shown to be more complicated and, in particular, scale-dependent 60 . This shifts the question towards determining on exactly what time-scales the two processes are dominant and, respectively, towards identifying the horizon on which a deterministic prediction will yield optimal forecasting results.
Regarding the long-standing debate whether financial time series should be considered stochastic rather than chaotic, our direct modeling approach yields evidence for a stronger than expected chaotic data component. While both elements seem to be present, the chaotic component offers several options for understanding and, finally, monitoring financial processes, if the appropriate time-scale is chosen. In particular, our approach may be expected to offer improved explanatory power for events of interest, such as shocks that emerge to be of endogenous instead of exogenous nature, as understood by stochastic models 61 . In this way, our work may open a general avenue towards a better understanding and monitoring of some essential drivers of financial market dynamics, and of similar processes beyond.