How to Predict the Spread of COVID-19

In analysis tasks, time series are of interest: values recorded at certain, usually equidistant, points in time. The counts can be taken at various intervals (a minute, an hour, a day, a week, a month, or a year), depending on how much detail the process should be analyzed in. In time series analysis problems we deal with discrete time, where each observation of a parameter forms a time frame. The same can be said of the behavior of COVID-19 over time. In this paper, we solve the problem of predicting COVID-19 diseases in the world using neural networks. This approach is useful when it is necessary to overcome difficulties related to non-stationarity, incompleteness, or unknown distribution of the data, or when statistical methods are not completely satisfactory. The forecasting problem is solved with the help of the analytical platform Deductor Studio, developed by specialists of BaseGroup Labs (Russian Federation). In solving this problem, appropriate methods were used to clean the data of noise and anomalies, which ensured the quality of the predictive model and allowed forecast values to be obtained for tens of days ahead. The principle of time series forecasting was also demonstrated: import, detection of seasonality, cleaning, smoothing, building a predictive model, and predicting COVID-19 diseases in the world using neural technologies for 30 days ahead. The achieved accuracy was 83.84%.

China was presented in [14]. An interesting discussion of the principles of using mathematical modeling was presented in [15]. Some methodologies predict the number of new cases and make assumptions about the growth dynamics [16]. There are many sources of information for predicting the situation. As reported in [17], social networks can provide valuable information about confirmed cases of the disease and its further spread. The relationship between new cases and the rate or coverage of growth can be transferred into a prediction elsewhere, as shown in [18]; such a transfer of knowledge to model another region was carried out between Italy and Hunan province in China. The case of the ship "Diamond Princess" was discussed in [19]. Some models assess the situation in larger regions or in more than one country: in [20] and [21], an applied forecasting model was defined for working with data from China, Italy, and France. Some models consider only the total number of cases worldwide as a whole [22]. The model proposed in [23] is a complex solution: its neural network architecture was developed to forecast new cases in various countries and regions, consists of seven layers, and outputs the predicted number of new cases. In [24], a shallow long short-term memory (LSTM) based neural network was used to predict the risk category by country; the results show that the proposed pipeline outperforms state-of-the-art methods on data from 180 countries and can be a useful tool for such risk categorization. In [25], a combination of the LSTM-SAE network model, clustering of the world's regions, and Modified Auto-Encoder networks was used to predict future COVID-19 cases for Brazilian states. A comprehensive review of artificial intelligence and nature-inspired computing models is presented in [26].
To predict time-dependent processes one can, among other approaches, use ANFIS (adaptive neuro-fuzzy inference system), an artificial neural network based on a fuzzy inference system that was developed in the early 1990s [27]. ANFIS integrates the principles of neural networks with those of fuzzy logic. To use ANFIS most efficiently and optimally, some authors recommend using parameters obtained with a genetic algorithm [28].

Methods
The solution of the forecasting problem using a trained neural network presupposes, first of all, the availability of statistical data on the spread of the disease by day, provided by the Federal Service for Surveillance on Consumer Rights Protection and Human Well-being (https://yandex.ru/covid19/stat?utm_source=main_notif&geoId=1) for the world as a whole (Table 1). The statistical data, obtained in the form of a time series, require significant processing to form a training sample for the neural network and to obtain the dataset necessary for its operation. This process usually includes the following steps:
• Time-series adjustment: smoothing and removing anomalies.
• Data processing using the sliding window method.
• Data processing using a multi-layer neural network; neural network training.
• Selecting the appropriate forecasting method.
• Assessment of the accuracy of forecasting and the adequacy of the chosen forecasting method.
The analysis of the above points and numerous experiments allowed us to propose a general scheme for the analytical processing of the statistical source data: obtaining a dataset for a neural network, followed by training the network and forecasting. The block diagram of the algorithm for generating the dataset and predicting COVID-19 coronavirus infection cases is shown in Fig. 1.

ADJUSTMENT OF TIME SERIES
The graph of identified cases of COVID-19 coronavirus infection in the world as of October 23, 2020, is shown in Fig. 1. To obtain a forecast at the required scale, it is necessary to change the time scale of the data series and optimize it for further processing. If data by day are fed to a predictive model (neural network, linear model), the forecast will be by day; if the data have first been converted to weekly intervals, the forecast will be by week.
If necessary, the date can be converted to a number or a string for further processing.
In our case, we needed a forecast by days; therefore, after performing the necessary transformations of the initial data into the "date" scale (Year + Day), we obtain the corresponding graph of the initial data at the indicated scale (Fig. 2).

SMOOTHING AND REMOVAL OF ANOMALIES: SPECTRAL DATA PROCESSING
The purpose of spectral processing is to smooth ordered data sets using a wavelet or Fourier transform. The principle of such processing is to decompose the original time series into basis functions. It is most often used for preliminary data preparation in forecasting tasks.
At the "Spectral processing" step of the processing wizard, the "Wavelet transform" method was selected, and the decomposition depth and the order of the wavelet were set. The decomposition depth determines the "scale" of the parts to be filtered out: the larger this value, the "larger" the parts of the source data that will be discarded. If the parameter value is large enough (about 7-9), the data are not only cleared of noise but also smoothed (sharp outliers are "cut off"). Setting the decomposition depth too high can lead to a loss of useful information through excessive "coarsening" of the data. The order of the wavelet determines the smoothness of the reconstructed data series: the lower the parameter value, the more pronounced the "outliers" remain; conversely, at large parameter values the "outliers" are smoothed out. Figure 3 shows the result of smoothing and anomaly removal by spectral processing using the "Wavelet transform" method with average values of the method's parameters.
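The smoothing step described above can be sketched in code. The following is a minimal illustration of the idea, assuming a simple Haar wavelet rather than the (unspecified) wavelet used in Deductor Studio: the series is decomposed to a given depth, the detail coefficients (treated as noise) are discarded, and the series is reconstructed from the approximation alone.

```python
import numpy as np

def haar_smooth(series, depth=3):
    """Denoise a series with a Haar wavelet: decompose to `depth` levels,
    drop the detail (high-frequency) coefficients, and reconstruct from
    the approximation alone. A larger depth gives stronger smoothing."""
    approx = np.asarray(series, dtype=float)
    lengths = []                      # length to restore at each level
    for _ in range(depth):
        if len(approx) < 2:
            break
        lengths.append(len(approx))
        if len(approx) % 2:           # pad odd lengths with the last value
            approx = np.append(approx, approx[-1])
        approx = (approx[0::2] + approx[1::2]) / 2.0
    for ln in reversed(lengths):      # inverse transform with details = 0
        approx = np.repeat(approx, 2)[:ln]
    return approx
```

Reconstructing with zeroed details is exactly the "cutting off" of sharp outliers mentioned above: only the coarse shape of the series survives.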

AUTOCORRELATION ANALYSIS OF DATA
The purpose of autocorrelation analysis is to determine the degree of statistical dependence between different values (counts) of a random sequence formed by the data sample field. In the process of autocorrelation analysis, correlation coefficients (a measure of mutual dependence) are calculated for pairs of sample values separated by a certain number of samples, called the lag. The set of correlation coefficients over all lags forms the autocorrelation function (ACF) of the series: R(k) = corr(X(t), X(t + k)), where k > 0 is an integer (the lag).
The ACF behavior can be used to judge the nature of the analyzed sequence, i.e. the degree of its smoothness and the presence of periodicity (for example, seasonal) or a trend.
For k = 0, the autocorrelation function is maximal and equal to 1. As the lag increases, i.e., as the distance between the two values for which the correlation coefficient is calculated grows, the ACF value decreases, because the statistical interdependence between the values weakens (the occurrence of one of them affects the probability of occurrence of the other less and less). The faster the ACF decreases, the faster the analyzed sequence changes; conversely, if the ACF falls off slowly, the corresponding process is relatively smooth. If there is a trend in the original sample (a smooth increase or decrease in the series), the ACF also changes smoothly. If there are seasonal fluctuations in the original data set, the ACF will show periodic spikes. Figure 4 shows the graph of the autocorrelation function of detected COVID-19 cases in the world. Using this graph, the trend on the curve can be identified visually at lags up to 290.
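As a small numeric sketch (illustrative code, not part of the Deductor pipeline), the ACF defined above can be estimated directly from a sample:

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation R(k) = corr(X(t), X(t+k)) for k = 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                  # work with deviations from the mean
    var = float(np.dot(x, x))         # normalizer, so that R(0) = 1
    r = [1.0]
    for k in range(1, max_lag + 1):
        r.append(float(np.dot(x[:-k], x[k:])) / var)
    return np.array(r)
```

On a periodic series the estimate shows the spikes described above: for a sine of period 12, R(k) peaks again near k = 12 and is strongly negative near k = 6.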

DATA PROCESSING BY A SLIDING WINDOW
Data processing using the sliding window method is used to preprocess data in forecasting tasks when the values of several neighboring samples of the original dataset must be fed to the input of the neural network. The term "sliding window" reflects the essence of the processing: a specific continuous segment of data, called a window, is selected, and the window then "slides" over the entire set of initial data. This operation produces a selection in which each record contains a field corresponding to the current sample (with the same name as in the original selection), and to its left and right are fields containing samples shifted from the current one into the past and the future, respectively.
Sliding window processing has two parameters: the immersion depth (the number of samples in the "past") and the forecast horizon (the number of samples in the "future"). In this article, the sliding window method was applied to the spectrally smoothed series of detected COVID-19 cases in the world with an immersion depth of 282; the forecast horizon was set to one. The result was a dataset for training a neural network. The Deductor Studio analytical platform (www.basegroup.ru) was chosen for the forecasting under the current conditions. Deductor Studio is the analytical core of the Deductor platform; it contains a complete set of mechanisms for data import, processing, visualization, and export for fast and efficient information analysis, and focuses on state-of-the-art methods for extracting, cleaning, manipulating, and visualizing data.
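The sliding window transformation can be sketched as follows (illustrative code, not the Deductor implementation). With an immersion depth of 282 and a horizon of 1, each training record would hold the previous 282 daily counts as inputs and the next day's count as the target:

```python
import numpy as np

def sliding_window(series, depth, horizon=1):
    """Turn a 1-D series into a supervised dataset: each row contains
    `depth` consecutive past values; the target lies `horizon` steps
    beyond the end of the window."""
    s = np.asarray(series, dtype=float)
    n_rows = len(s) - depth - horizon + 1
    X = np.stack([s[i:i + depth] for i in range(n_rows)])
    y = np.array([s[i + depth + horizon - 1] for i in range(n_rows)])
    return X, y
```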
With it, one can apply modeling, forecasting, clustering, pattern search, and many other Knowledge Discovery in Databases and Data Mining technologies.
Data processing is performed using a multi-layer neural network. In this mode, the "Processing wizard" of the Deductor analytical platform allows constructing a neural network with a given structure, determining its parameters, and training it with one of the training algorithms available in the system. The result is a neural network emulator that can be used to solve problems of forecasting, classification, finding hidden patterns, data compression, and many other applications [29].
Configuring and training a neural network consists of the following steps: setting up field assignments, adjusting the normalization of the fields, setting up the training sample, configuring the structure of the neural network, selecting the training algorithm and configuring its parameters, setting the conditions for stopping training, starting the training process, and selecting the data display method.
When configuring the neural network, in the "Neurons in layers" section, the number of hidden layers must be specified, i.e., the layers of the neural network located between the input and output layers. The number of neurons in the input and output layers is set automatically according to the number of input and output fields of the training sample and cannot be changed here.
The choice of the number of hidden layers and the number of neurons in each hidden layer should be approached carefully. It is believed that a problem of any complexity can be solved using a two-layer neural network [29]. In the "Activation function" section, the type of the neuron activation function and its steepness are determined: in the "Function type" list, select the desired activation function, and in the "Steepness" field, set its steepness.
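For illustration only (Deductor's internal implementation is not public), a two-layer network of the kind described, with a logistic activation whose steepness is adjustable, can be sketched in plain NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z, steepness=1.0):
    # Logistic activation; `steepness` scales the slope, analogous to the
    # "Steepness" field in the configuration wizard.
    return 1.0 / (1.0 + np.exp(-steepness * z))

class TwoLayerNet:
    """One hidden layer of logistic neurons and a linear output neuron,
    trained by plain batch gradient descent on the squared error."""

    def __init__(self, n_in, n_hidden, lr=0.5):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)
        self.lr = lr

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)   # hidden activations
        return self.h @ self.W2 + self.b2          # linear regression output

    def train_step(self, X, y):
        out = self.forward(X)
        err = out - y.reshape(-1, 1)
        # Backpropagate the error through the two layers
        dW2 = self.h.T @ err / len(X)
        db2 = err.mean(axis=0)
        dh = (err @ self.W2.T) * self.h * (1.0 - self.h)
        dW1 = X.T @ dh / len(X)
        db1 = dh.mean(axis=0)
        for p, g in ((self.W1, dW1), (self.b1, db1),
                     (self.W2, dW2), (self.b2, db2)):
            p -= self.lr * g
        return float((err ** 2).mean())
```

The input and output sizes follow the dataset (as the wizard sets them automatically); only the hidden layer size is a free choice.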
Neural networks differ from traditional statistical methods but may share some similarities with them. For example, a traditional linear regression model can acquire knowledge through the least-squares method and express that knowledge in its regression coefficients; in this sense, a regression model can be considered a neural network, and linear regression a special case of neural networks of a certain type. However, linear regression imposes several assumptions before information is extracted from the data: the form of the relationship between the dependent and independent variables is postulated in advance. In neural networks, by contrast, the shape of the relationship is determined during the learning process.
Configuring and training a neural network consists of the following steps:
1. Configure field assignments. Determine how the fields of the source dataset will be used when training the neural network and when applying it in practice.
2. Set up normalization of the fields. The goal of normalizing field values is to transform the data into the form best suited for processing by a neural network.
3. Set up the training sample. The sample used to build the model is divided into two sets: training and test. The training set includes the records that will be used as input data together with the corresponding desired output values. The test set also contains input and desired output values, but these are used to test the results of the model rather than to train it.
4. Adjust the structure of the neural network. At this stage, the parameters determining the structure of the network are set: the number of hidden layers and the number of neurons in them, as well as the activation function of the neurons. In the "Neurons in layers" section, the number of hidden layers, i.e., the layers located between the input and output layers, is specified.
5. Choose the training algorithm and its parameters. At this step, we select the neural network training algorithm and set its parameters.
6. Set the conditions for stopping training: either the discrepancy between the reference and actual network output falling below a specified value, or a set number of epochs (training cycles) after which training stops regardless of the error value.
7. Start the learning process.
8. Choose how the imported data will be presented. In our case, the following specialized visualizers are of interest: the contingency table and scatter diagrams.
The choice of an appropriate forecasting method comes down to determining whether the method produces satisfactory forecast errors. In addition to calculating the errors, they are compared in a dedicated visualizer, the "scatter diagram" (Fig. 5). The scatter plot shows the output values for the training sample (dataset) for the entire world.
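Two of the configuration steps, field normalization and the train/test split, can be illustrated as follows. The min-max mapping to [0, 1] and the 80/20 ordered split are common choices assumed here for the sketch, not values documented for Deductor:

```python
import numpy as np

def minmax_normalize(x, lo=None, hi=None):
    """Map values linearly to [0, 1], a form well suited to neural-network
    inputs; lo/hi default to the sample minimum and maximum."""
    x = np.asarray(x, dtype=float)
    lo = float(x.min()) if lo is None else lo
    hi = float(x.max()) if hi is None else hi
    return (x - lo) / (hi - lo)

def ordered_split(X, y, test_frac=0.2):
    """Hold out the most recent slice as the test set: shuffling a time
    series would leak future information into training."""
    cut = int(round(len(X) * (1.0 - test_frac)))
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])
```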
The X-axis shows the output value of the training sample (the reference), and the Y-axis shows the output value calculated by the trained model for the same example. The straight diagonal line is the reference (the line of ideal values): the closer a point lies to this line, the smaller the model error.
The scatter plot allowed us to compare several models and determine which one provides the best accuracy on the training set. The chart visualizer displays the dependence of the values of one field on another; the most commonly used type is the 2D graph, whose horizontal axis holds the values of the independent column and whose vertical axis the corresponding values of the dependent column. After building the model, to assess the quality of training we present the obtained data as a chart of the current and reference values for the whole-world dataset (Fig. 6).
An analysis of the scatter plot (Fig. 5) and the chart of the trained neural network for the entire world dataset (Fig. 6) suggests that the neural network has been trained successfully.

Results
Forecasting allows you to get a prediction of the values of a time series for the number of samples corresponding to the speci ed forecast horizon.
What is the maximum forecast horizon? The recommended rule is that the amount of statistical data should be 10-15 times the forecast horizon. In our case, this means the maximum forecast horizon is 30 days.
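The mechanics of multi-step forecasting with a one-step model can be sketched as follows (an illustrative recursion, not Deductor's internal code): each prediction is appended to the input window and fed back in to produce the next step.

```python
def recursive_forecast(predict_one, history, depth, horizon):
    """Roll a one-step-ahead model `predict_one(window) -> float`
    forward for `horizon` steps by feeding its predictions back in."""
    window = list(history[-depth:])
    forecast = []
    for _ in range(horizon):
        nxt = predict_one(window)
        forecast.append(nxt)
        window = window[1:] + [nxt]   # slide the window one step forward
    return forecast

def max_horizon(n_observations, factor=10):
    # Rule of thumb from the text: the history should be 10-15 times
    # longer than the horizon, so the horizon is at most ~1/10 of it.
    return n_observations // factor
```

Because each step reuses earlier predictions, errors compound, which is one reason the horizon is capped relative to the amount of history.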
When performing the actual forecast, several fields are pre-configured: the forecast horizon (set to 20 days), the "Forecast step" and "Source data" fields, and the color and scale parameters. Checking the "Forecast step" box adds a "Forecast step" field to the resulting selection, indicating for each record the number of the forecast step that produced it.
Checking the "Source data" box includes in the resulting selection not only the records containing the predicted values but also all records containing the source data; in this case, the records containing the forecast are located at the end of the resulting selection.
The final graph predicting the number of COVID-19 infections by date using neural technologies is shown in Fig. 7 (worldwide). The proposed model, once built, cannot "work" indefinitely: new data on the number of infections in the world keep arriving, so the model should be periodically revised and retrained.

Conclusion
In this paper, we solve the problem of predicting COVID-19 diseases in the world using neural networks. This approach is useful when it is necessary to overcome difficulties related to non-stationarity, incompleteness, or unknown distribution of the data, or when statistical methods are not satisfactory. The forecasting problem is solved using the analytical platform Deductor Studio, developed by BaseGroup Labs (www.basegroup.ru, Russian Federation, Ryazan).
In solving this problem, we used mechanisms for cleaning the data of noise and anomalies, which ensured the quality of the predictive model and allowed forecast values to be obtained for tens of days ahead. The principle of time series forecasting was also demonstrated: import, detection of seasonality, cleaning, smoothing, building a predictive model, and predicting COVID-19 diseases in the world using neural technologies for thirty days ahead.

Declarations
Ethics approval and consent to participation: I comply with all the ethical standards of the journal and consent to participation.

Consent to publication:
I am the sole author and I agree and ask you to publish my work.
Availability of data and materials: All the initial data for training the neural network are presented in Table 1, obtained from the official source https://yandex.ru/covid19/stat?utm_source=main_notif&geoId=1, and in the Methods section.
Competing interests: Not applicable.
Funding: Not applicable.
Author's contribution: The entire article (collection of the initial data, processing, writing of the text, figures, table, and study of the literature) was performed by the sole author, Eduard Dadyan.

Figure captions:
- Graph of detected cases of COVID-19 coronavirus infection in the world, smoothed by spectral processing using the "Wavelet transform" method.
- Scatter diagram of the trained neural network for the whole-world dataset.