3.1 Deep learning algorithm
In building a deep learning network model, the main components are modeled on the structure of the biological nervous system. Suppose a neuron receives inputs from d upstream neurons; the neuron model can then be expressed mathematically as follows:
$$z={\sum }_{i=1}^{d}{w}_{i}{x}_{i}+b={w}^{T}x+b \tag{1}$$
During the operation of the neuron model, care must be taken that the deep neural network does not degenerate into a traditional linear model, which would be unable to process linearly inseparable data. A nonlinear function is therefore applied to the input signal, yielding the neuron's activation value and improving the overall fitting ability of the model. The calculation formula is as follows:
$$a=f\left(z\right)=f\left({w}^{T}x+b\right) \tag{2}$$
where $f(\cdot)$ denotes the nonlinear activation function. The expression of the ReLU function is as follows:
$$ReLU\left(x\right)=\max(0,x) \tag{3}$$
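As a minimal illustrative sketch (variable names and sizes here are assumptions, not taken from the paper), Eqs. (1) and (3) combine into a single artificial neuron:

```python
import numpy as np

def neuron(x, w, b):
    """Single neuron: weighted sum of Eq. (1) followed by the
    ReLU nonlinearity of Eq. (3)."""
    z = np.dot(w, x) + b          # z = w^T x + b
    return np.maximum(0.0, z)     # ReLU(z) = max(0, z)

# Example: a neuron with d = 4 inputs (toy values)
rng = np.random.default_rng(0)
x = rng.normal(size=4)
w = rng.normal(size=4)
print(neuron(x, w, b=0.1))
```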
When the network is applied to time-series prediction, it uses LSTM units whose gate operations are given by the following formulas:
$${i}_{t}=\mathrm{sigmoid}({W}_{xi}{x}_{t}+{W}_{hi}{h}_{t-1}+{b}_{i}) \tag{4}$$
$${f}_{t}=\mathrm{sigmoid}({W}_{xf}{x}_{t}+{W}_{hf}{h}_{t-1}+{b}_{f}) \tag{5}$$
$${o}_{t}=\mathrm{sigmoid}({W}_{xo}{x}_{t}+{W}_{ho}{h}_{t-1}+{b}_{o}) \tag{6}$$
$${g}_{t}=\tanh({W}_{xg}{x}_{t}+{W}_{hg}{h}_{t-1}+{b}_{g}) \tag{7}$$
$${c}_{t}={f}_{t}\odot {c}_{t-1}+{i}_{t}\odot {g}_{t} \tag{8}$$
$${h}_{t}=\tanh\left({c}_{t}\right)\odot {o}_{t} \tag{9}$$
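The following sketch implements one time step of Eqs. (4)-(9) with NumPy; the weight layout and toy dimensions are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step, Eqs. (4)-(9). W maps ('x', gate) and
    ('h', gate) to weight matrices; b maps gate names to biases."""
    gates = {}
    for gate in ('i', 'f', 'o', 'g'):
        pre = W[('x', gate)] @ x_t + W[('h', gate)] @ h_prev + b[gate]
        gates[gate] = np.tanh(pre) if gate == 'g' else sigmoid(pre)
    c_t = gates['f'] * c_prev + gates['i'] * gates['g']  # Eq. (8)
    h_t = np.tanh(c_t) * gates['o']                      # Eq. (9)
    return h_t, c_t

# Toy dimensions: input size 3, hidden size 2 (illustrative only)
rng = np.random.default_rng(1)
n_in, n_h = 3, 2
W = {('x', g): rng.normal(size=(n_h, n_in)) for g in 'ifog'}
W.update({('h', g): rng.normal(size=(n_h, n_h)) for g in 'ifog'})
b = {g: np.zeros(n_h) for g in 'ifog'}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), W, b)
```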
The sigmoid function is the activation function used in most neural network models. It maps input values into the range 0 to 1, which helps the LSTM network model process data. Its expression is as follows:
$$\sigma \left(x\right)=\frac{1}{1+{e}^{-x}} \tag{10}$$
Its derivative can be obtained as follows:
$${\sigma }^{\prime }\left(x\right)=\sigma \left(x\right)\left(1-\sigma \left(x\right)\right) \tag{11}$$
The tanh function processes the cell state and cell output in the LSTM network model. It is expressed by the formula:
$$\tanh\left(x\right)=\frac{{e}^{x}-{e}^{-x}}{{e}^{x}+{e}^{-x}} \tag{12}$$
Its derivative can be obtained as follows:
$${\tanh}^{\prime }\left(x\right)=1-{\tanh}^{2}\left(x\right) \tag{13}$$
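As a quick sanity check of Eqs. (10)-(13), the closed-form derivatives can be compared against central finite differences; this sketch is illustrative only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # Eq. (11)

def d_tanh(x):
    return 1.0 - np.tanh(x) ** 2  # Eq. (13)

xs = np.linspace(-4, 4, 9)
eps = 1e-6
assert np.allclose(d_sigmoid(xs),
                   (sigmoid(xs + eps) - sigmoid(xs - eps)) / (2 * eps),
                   atol=1e-6)
assert np.allclose(d_tanh(xs),
                   (np.tanh(xs + eps) - np.tanh(xs - eps)) / (2 * eps),
                   atol=1e-6)
```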
The graphs of these two functions are shown in Fig. 1.
To support the implementation analysis, this paper compares the polynomial expansion method and the lookup table method for approximating these activation functions on an FPGA; the results are shown in Table 1.
Table 1
FPGA resource utilization and error comparison of lookup table and polynomial expansion approximation

| Approximation method | Function | LUT | FF | Clock cycles | Average error |
|---|---|---|---|---|---|
| Polynomial expansion | Tanh | 4276 | 2012 | 20 | 1.65×10⁻³ |
| Polynomial expansion | Sigmoid | 2987 | 1368 | 17 | 9.33×10⁻⁴ |
| Lookup table | Tanh | 2201 | 1439 | 15 | 2.35×10⁻⁴ |
| Lookup table | Sigmoid | 1999 | 938 | 11 | 4.89×10⁻⁴ |
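A software sketch of the lookup-table idea behind Table 1: the function is precomputed on a grid, and each evaluation is replaced by a table read. The table size, input range, and nearest-entry indexing here are assumptions for illustration, not the FPGA configuration that produced Table 1:

```python
import numpy as np

def make_lut(f, lo=-8.0, hi=8.0, n=1024):
    """Precompute f on a uniform grid; at run time the function call
    is replaced by a nearest-entry table read."""
    grid = np.linspace(lo, hi, n)
    table = f(grid)
    def lookup(x):
        idx = np.clip(np.round((x - lo) / (hi - lo) * (n - 1)), 0, n - 1)
        return table[idx.astype(int)]
    return lookup

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
lut_sigmoid = make_lut(sigmoid)

# Estimate the average absolute error of the table approximation
x = np.random.default_rng(2).uniform(-8, 8, 100_000)
print("average error:", np.mean(np.abs(lut_sigmoid(x) - sigmoid(x))))
```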
The neuron in the LSTM neural network is taken as the most basic programming unit, and the forward and backward hidden states of the model are calculated using the following formulas:
$$\overrightarrow{{h}_{i}}={LSTM}_{forward}(\overrightarrow{{h}_{i-1}},e({x}_{i})) \tag{14}$$
$$\overleftarrow{{h}_{i}}={LSTM}_{backward}(\overleftarrow{{h}_{i-1}},e({x}_{T-i+1})) \tag{15}$$
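A minimal sketch of the bidirectional scan in Eqs. (14)-(15); `fwd_step` and `bwd_step` are assumed to be closures over LSTM weights (for example, wrapping the `lstm_step` sketch above), and all names and shapes are illustrative:

```python
import numpy as np

def bidirectional_encode(embeddings, fwd_step, bwd_step, n_h):
    """Eqs. (14)-(15): scan the embedded sequence in both directions.
    fwd_step/bwd_step are callables (x, h, c) -> (h, c)."""
    T = len(embeddings)
    h_f, h_b = [np.zeros(n_h)], [np.zeros(n_h)]
    c_f, c_b = np.zeros(n_h), np.zeros(n_h)
    for t in range(T):
        hf, c_f = fwd_step(embeddings[t], h_f[-1], c_f)          # Eq. (14)
        hb, c_b = bwd_step(embeddings[T - 1 - t], h_b[-1], c_b)  # Eq. (15)
        h_f.append(hf)
        h_b.append(hb)
    # Pair each position's forward state with its backward state
    return [np.concatenate([h_f[t + 1], h_b[T - t]]) for t in range(T)]

# Usage with the lstm_step sketch above (weights Wf, bf, Wb, bb assumed):
# fwd = lambda x, h, c: lstm_step(x, h, c, Wf, bf)
# bwd = lambda x, h, c: lstm_step(x, h, c, Wb, bb)
# states = bidirectional_encode(embedded_sequence, fwd, bwd, n_h)
```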
The probability of each output character during decoding is calculated as follows:
$$p\left({y}_{i}|{y}_{1},{y}_{2},\dots ,{y}_{i-1},X\right)=g({y}_{i},{s}_{i}) \tag{16}$$
The hidden state of the decoder can be calculated by the following formula:
$${s}_{i}={LSTM}_{decoder}({s}_{i-1},\left[e\left({y}_{i-1}\right);{a}_{i}\right]) \tag{17}$$
The context vector can be calculated using the following formula:
$${a}_{i}=attention({s}_{i-1},{h}_{1:T}) \tag{18}$$
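The decoding loop of Eqs. (16)-(18) can be sketched as follows. Dot-product attention scoring, the output projection `W_out`, and the `step` closure are illustrative assumptions; the paper does not specify these details:

```python
import numpy as np

def attention(s_prev, encoder_states):
    """Context vector a_i of Eq. (18). Dot-product scoring is an
    assumption; it requires s_prev and encoder states to match in size."""
    H = np.stack(encoder_states)        # shape (T, hidden)
    scores = H @ s_prev                 # alignment scores per position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over positions
    return weights @ H                  # weighted sum of encoder states

def decode_step(s_prev, c_prev, y_prev_emb, encoder_states, step, W_out):
    """One decoder step: context (Eq. 18), hidden state (Eq. 17),
    output distribution (Eq. 16). `step` is an LSTM-cell closure."""
    a_i = attention(s_prev, encoder_states)
    x_i = np.concatenate([y_prev_emb, a_i])   # [e(y_{i-1}); a_i]
    s_i, c_i = step(x_i, s_prev, c_prev)      # decoder LSTM cell
    logits = W_out @ s_i
    probs = np.exp(logits - logits.max())
    return s_i, c_i, probs / probs.sum()      # p(y_i | y_<i, X)
```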