Long-term and short-term memory networks based on forgetting memristors

The hardware implementation of neural networks based on forgetting memristors not only offers high computational efficiency and low power consumption, but also allows a single memristor to store both a long-term and a short-term memory weight. A neural network based on forgetting memristors can therefore process two different data sets; however, the number of data sets it can handle is determined by the rate at which the short-term memory network converts to the long-term memory network. In this paper, a forgetting memristor model with a controllable decay rate is proposed, the short-term and long-term memory states of a long short-term memory (LSTM) network based on the forgetting memristor are constructed, and the conversion speed from the short-term memory network to the long-term memory network is made controllable. During the transformation from short-term memory to long-term memory in the LSTM network based on the forgetting memristor, the decay rate of the forgetting memristor can be controlled, so the duration of the short-term memory state of the LSTM network can be set. A reset-signal mechanism is also proposed so that the short-term memory state with a high recognition rate can be controlled. Based on the proposed controllable decay rate and reset signal, the short-term memory state with a high recognition rate can be set, so the two-state LSTM network can recognize different numbers of images from different data sets. Finally, two data sets are tested on the LSTM network based on the forgetting memristor; the recognition rates are good, which demonstrates the effectiveness of the proposed approach.


Introduction
As a hot research topic, artificial neural networks and their derivative fields have attracted more and more researchers, and artificial neural networks have been applied in a wide range of scenarios. Because of their potential in self-learning and high-speed computing, artificial neural networks are widely applied in fields such as pattern recognition, signal processing, knowledge engineering, and robot control. For different research fields, different artificial neural networks have been continuously proposed and improved, such as Convolutional Neural Networks (CNN) (Krizhevsky et al. 2017; Roska and Chua 1993), Recurrent Neural Networks (RNN) (Sherstinsky 2020; Khaki et al. 2020), Spiking Neural Networks (SNN) (Taherkhani et al. 2020; Lee et al. 2020), and Residual Networks (Nguyen et al. 2021; Wu et al. 2019). Researchers are also studying the properties of neural networks themselves and applying them to more complex applications (Dong et al. 2021; Dong and Huang 2019; Dong et al. 2022).
As the complexity of the tasks processed by artificial neural networks increases, the scale of the networks grows and their demand for computing power rises. With the development of integrated circuits, researchers are trying to build neural networks on hardware circuits, which can achieve high efficiency and low power consumption. In a neural network hardware circuit, the large amount of input data, the complex network structure and the large number of parameters mean that the circuit must supply considerable power. Compared with CMOS, the memristor (Strukov et al. 2008; Yao et al. 2020; Wen et al. 2018a), with its low power consumption, offers a better choice for constructing neural network hardware circuits. Since the memristor is also CMOS-compatible (Ali et al. 2017), nanoscale (Wang et al. 2020; Kim et al. 2017; Lin et al. 2020a), fast-switching (Li et al. 2013) and highly integrable (Chua 1971; Wen et al. 2018b), it is well suited to the construction of neural network hardware. As memristor devices and memristor simulation are now convenient to realize, researchers have applied memristors to artificial neural networks (Sun et al. 2021; Lin et al. 2020b; Zeng et al. 2018; Cai et al. 2020). Research on memristor-based neural networks is extensive, covering both the realization of memristor devices and the construction of memristive neural networks (Kumar et al. 2017; Woo et al. 2016; Gao et al. 2016; Burr et al. 2017). For example, Zhang et al. proposed a memristor-based neuromorphic computing system (Zhang et al. 2018), which demonstrated the feasibility and applicability of memristor arrays in artificial neural networks. Yang et al. proposed a memristor-based bidirectional associative memory (BAM) circuit (Yang and Wang 2021), which realized a universal circuit for various tasks. Yakopcic et al. realized the hardware circuit of a deep convolutional neural network based on memristor arrays (Yakopcic et al. 2016). In Li et al. (2022), Shang and Wang (2020), Wang et al. (2021), Wang et al. (2018), Shi and Zeng (2020) and Sun et al. (2022), neural network systems were realized based on memristor arrays, which also shows the broad application prospects of memristor arrays in neuromorphic computing.
LSTM is a kind of temporal recurrent neural network with long-term memory ability: it can learn long-term dependence and performs well on sequence information. Research on LSTM networks includes software applications and hardware-circuit implementations. In software, LSTM is used directly or indirectly in various scenarios, such as time-series prediction, natural language processing and air-combat decision-making systems (Zhou et al. 2020). In hardware, researchers have proposed many memristor-based artificial neural network circuits and applied them to many scenarios. Wen et al. proposed a memristor-based LSTM network (Wen et al. 2019), realized as a hardware circuit for sentiment analysis. Zeng et al. proposed a memristor-based LSTM network with in situ training and its application (Liu et al. 2020), which realized the prediction process and in situ training in hardware. Dou et al. presented a complete design of a memristive LSTM network system (Dou et al. 2023), with good performance and low power consumption on data-set classification tasks. In neural network hardware circuits, memristors are typically arranged in arrays to store weights and other parameters, achieving parallel computation with high integration and very low power consumption (Jo et al. 2009; Xie et al. 2017; Li and Ang 2021; Li et al. 2019).
However, most existing neural network circuits are based on non-volatile memristors; volatile memristors are rarely considered in neural network applications. Chen et al. proposed a memristor model with a forgetting effect in 2013 (Chen et al. 2013), which has both long-term memory and short-term memory, and whose short-term memory spontaneously decays to long-term memory. If forgetting memristors are taken into account in the construction of neural network hardware circuits, one memristor can store both a long-term and a short-term memory weight, and a forgetting memristor array can store both weight matrices, realizing the conversion between a long-term memory network and a short-term memory network. The processing efficiency of the neural network improves, and a single neural network hardware system can deal with two kinds of data sets. Weight storage and conversion based on the forgetting memristor have been implemented (Chen et al. 2021a), including the conversion between a long-term memory network and a short-term memory network. However, the conversion speed has not been studied clearly, and the decay rate of the forgetting memristor could not be controlled.
In this paper, we propose a forgetting memristor model with a controllable decay rate and construct a forgetting memristor bridge based on this model. To enable the memristor arrays to store positive and negative weights, forgetting memristor bridges replace the single memristors in the array, and a model for setting and resetting the weights of a neural network based on the forgetting memristor bridge is proposed. We build an LSTM network based on the forgetting memristor model, store the trained weights in the forgetting memristor bridge array, and realize switching with controllable conversion speed between the short-term memory network and the long-term memory network. We test the LSTM network based on the forgetting memristor on KMNIST and FASHION-MNIST, reaching accuracies of 89% and 87.5%, respectively. In addition, according to the characteristics of the forgetting memristor model, we study the influence of the decay rate of the forgetting memristor and of the reset signal on the convergence of the network to its optimal accuracy. This paper builds an LSTM network with two network states based on the forgetting memristor model and expands the application of volatile memristor models. The main contributions of this paper are as follows: (1) A forgetting memristor model with a controllable decay rate is proposed; based on the two weight-storage methods of the forgetting memristor bridge, three decay modes (random decay, fixed decay and synchronous decay) are proposed.
(2) Based on the three proposed decay modes, an LSTM network based on the forgetting memristor is constructed, and the performance of the short-term memory network on two data sets is investigated. (3) A method for switching between the short-term memory state and the long-term memory state of the LSTM network is proposed: the decay rate set on the forgetting memristor bridge controls the decay speed from the short-term state to the long-term state. (4) A control method for the persistence of the short-term memory state is proposed: a reset signal influences the duration of the short-term memory network. The earlier the reset signal is applied, the longer the high recognition rate of the short-term memory state lasts.

Forgetting memristor model and LSTM network
2.1 The forgetting memristor bridge written independently with controlled decay rate

Chen et al. (2013, 2021b) proposed a memristor model with a forgetting effect, which contains two states: a long-term memory resistance and a short-term memory resistance. Based on their work, we improve the model and propose a forgetting memristor model with a controllable decay rate: to their method of setting the resistance of a forgetting memristor, we add a method of controlling the decay rate. We take the forgetting memristor driven by a current source as an example.
The long-term memory resistance of the memristor is defined in formula (1), where M_l is the long-term memory resistance, I_l is the current through the forgetting memristor, k_l is a constant, sign(I_l) is the sign of I_l, and i_th1 is the current threshold.
The short-term memory resistance of the memristor is defined in formula (2), where M_s is the short-term memory resistance, I_s is the current through the forgetting memristor, T_s is the duration, k_s is a constant, sign(I_s) is the sign of I_s, i_th2 is the current threshold, f is the decay rate, and α is a constant. For a single memristor, Chen et al. proposed four groups of signals (resetting the long-term memory resistance, setting the long-term memory resistance, resetting the short-term memory resistance and setting the short-term memory resistance) to set the long-term and short-term memory resistances (Chen et al. 2021b). Writing (i1, t1) for the amplitude and duration of the long-term reset signal, (i2, t2) for the long-term set signal, (i3, t3) for the short-term reset signal and (i4, t4) for the short-term set signal, the long-term memory resistance is one minus the ratio of the set-signal energy to the reset-signal energy, M_l = 1 − (i2²·t2)/(i1²·t1), and analogously the short-term memory resistance is M_s = 1 − (i4²·t4)/(i3²·t3).
Based on their work, we propose a method for controlling the decay rate from short-term memory to long-term memory by relating it to the short-term set signal: the decay rate is inversely proportional to the product of the amplitude and duration of the short-term set signal (consistent with the numerical examples below, f = c/(|i4|·t4), where c is the decay constant). After the long-term and short-term memory resistances are set, the short-term resistance decays to the long-term resistance at the preset decay rate.
In memristor-based neural networks, memristors store the network weights in the form of a single memristor, a pair of memristors, or a memristor bridge. Because it can store positive, negative and zero weights, the memristor bridge is widely used to store neural network weights. As shown in Fig. 1a, the memristor bridge is composed of four memristors, each controlled by an independent current source. M1 and M2 are connected in series in opposite directions to form one branch; M3 and M4 are connected in opposite directions to form the other branch. The voltage between A and B is

V_AB = (M2/(M1 + M2) − M4/(M3 + M4)) · V_in.

The weight is expressed as the ratio of the voltage between A and B to the input voltage, w = V_AB/V_in. To simplify the expression of the weight, we set M1 = M4 and M2 = M3, so the weight reduces to

w = (M2 − M1)/(M1 + M2). (6)

We use four groups of signals to set the long-term memory resistance, the short-term memory resistance and the decay rate of a single forgetting memristor. As shown in Fig. 1b, i1 and t1 are the amplitude and duration of the signal resetting the long-term memory resistance, i2 and t2 of the signal setting the long-term memory resistance, i3 and t3 of the signal resetting the short-term memory resistance, and i4 and t4 of the signal setting the short-term memory resistance; i4 and t4 also determine the rate at which the short-term memory decays to the long-term memory resistance. T_1 and T_2 are the periods of the long-term and short-term memory signals, respectively.
In the forgetting memristor model, mA and µs are the standard units, and k_l, k_s and the decay constant are set to 1e−2, 1e−5 and 1e3, respectively. The amplitude i1 is 100 mA and t1 is 10 µs; i2 and t2 are set as needed. The amplitude i3 is 10 mA and t3 is 1 µs; i4 and t4 are set as needed. T_1 and T_2 are 10 µs and 1 µs, respectively. If M_l is 0.3 and M_s is 0.6, then according to the relationship between the resistances, the decay rate and the four groups of signals, the same pair of resistances can be set by several different signal groups, each with a different decay rate. When the amplitudes and durations of the four groups of signals are (100 mA, 10 µs), (−100 mA, 7 µs), (10 mA, 1 µs) and (−10 mA, 0.4 µs), respectively, f is 250 /µs. When they are (100 mA, 10 µs), (−100 mA, 7 µs), (10 mA, 1 µs) and (−8 mA, 0.625 µs), respectively, f is 200 /µs.
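As a check on these numbers, the programming scheme can be sketched in a few lines of Python. The closed-form expressions below (resistance as one minus the set/reset energy ratio, and a decay rate inversely proportional to |i4|·t4) are our reading of the numerical examples above, not formulas stated explicitly in the text, and the helper names are ours:

```python
DECAY_CONSTANT = 1e3  # the decay constant of the model, in the text's mA/us units

def programmed_resistance(i_reset, t_reset, i_set, t_set):
    """Memory resistance after a reset pulse followed by a set pulse:
    M = 1 - (i_set^2 * t_set) / (i_reset^2 * t_reset)."""
    return 1 - (i_set**2 * t_set) / (i_reset**2 * t_reset)

def decay_rate(i_set, t_set):
    """Decay rate of the short-term memory, inversely proportional
    to the amplitude-times-duration of the short-term set signal."""
    return DECAY_CONSTANT / (abs(i_set) * t_set)

# Example from the text: (100 mA, 10 us), (-100 mA, 7 us) -> M_l = 0.3
M_l = programmed_resistance(100, 10, -100, 7)
# (10 mA, 1 us), (-10 mA, 0.4 us) -> M_s = 0.6 with f = 250 /us
M_s = programmed_resistance(10, 1, -10, 0.4)
f1 = decay_rate(-10, 0.4)
# Alternative set pulse (-8 mA, 0.625 us): same M_s = 0.6 but f = 200 /us
M_s_alt = programmed_resistance(10, 1, -8, 0.625)
f2 = decay_rate(-8, 0.625)
```

Both signal groups reproduce the resistances quoted in the text while yielding different decay rates, which is exactly the degree of freedom the proposed model exploits.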
When a weight is stored on a forgetting memristor bridge, it has two states, short-term memory and long-term memory, and both states are determined by the four memristors that compose the bridge. Specifically, the weight is determined by the short-term and long-term memory resistances of M1/M4 and M2/M3, and the rate at which the short-term weight decays to the long-term weight depends on the decay rates of the four memristors. For the independently written forgetting memristor bridge, we set the weight and its decay rate by setting the resistances and decay rates of M1/M4 and M2/M3. Suppose we store a pair of weights in an independently written forgetting memristor bridge, with the short-term weight set to −0.2 and the long-term weight set to 0.5. According to formula (6), this fixes the ratios between M1_l and M2_l and between M1_s and M2_s. For example, we can select M1_l, M1_s, M2_l, M2_s as 0.2, 0.6, 0.6, 0.4, set the decay rate of M1 to 1e2 /µs or 2e2 /µs and the decay rate of M2 to 1e2 /µs or 4e2 /µs, so that the weight can decay at several different rates. We can also select M1_l, M1_s, M2_l, M2_s as 0.3, 0.8, 0.9, 0.53 and set whatever other decay rates we need.
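Formula (6) and the two example resistance groups above can be verified numerically (a small sketch; `bridge_weight` is our own helper name):

```python
def bridge_weight(m1, m2):
    """Weight of a memristor bridge with M1 = M4 and M2 = M3:
    w = (M2 - M1) / (M1 + M2), i.e. formula (6)."""
    return (m2 - m1) / (m1 + m2)

# First example: M1_l, M1_s, M2_l, M2_s = 0.2, 0.6, 0.6, 0.4
w_l = bridge_weight(0.2, 0.6)    # long-term weight  ~  0.5
w_s = bridge_weight(0.6, 0.4)    # short-term weight ~ -0.2

# Second example: 0.3, 0.8, 0.9, 0.53 gives approximately the same pair,
# showing that several resistance groups realize one target weight.
w_l2 = bridge_weight(0.3, 0.9)   # ~  0.5
w_s2 = bridge_weight(0.8, 0.53)  # ~ -0.2
```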

2.2 The forgetting memristor bridge written in batches with controlled decay rate

Chen et al. (2013) proposed a method of setting the weight based on a forgetting memristor bridge written in batches. As shown in Fig. 2a, the memristors are not controlled by separate current sources. As before, the weight represented by the memristor bridge is given by formula (6).
In the current-source-driven forgetting memristor bridge, M1/M4 and M2/M3 are connected in series within a branch, so the same current flows through M1/M4 and through M2/M3. To further simplify the setting of the bridge weight, we set M1 + M2 = 1, so the voltage across the series branch of M1 and M2 is numerically equal to the current flowing through M1/M2; the same holds for M3 and M4. Therefore, the resistances of the four memristors can be set synchronously through the input voltage signal (V_in).
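The simplification follows directly from Ohm's law: with normalized units and M1 + M2 = 1, the branch current equals the applied voltage numerically. A trivial sketch (the helper name is ours):

```python
def branch_current(v_in, m1, m2):
    """Current through the series branch of M1 and M2 (Ohm's law)."""
    return v_in / (m1 + m2)

# With M1 + M2 = 1 the current is numerically equal to V_in,
# so a single voltage signal programs both memristors at once.
i_branch = branch_current(0.7, 0.35, 0.65)  # numerically equals 0.7
```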
Since the voltage of the long-term signal and the current flowing through the memristor are numerically equivalent (V_l = I_l), the change of the long-term memory resistance follows the single-memristor model of formula (1) with I_l replaced by V_l. Likewise, since the voltage of the short-term signal and the current are numerically equivalent (V_s = I_s), the change of the short-term memory resistance follows formula (2) with I_s replaced by V_s. Within a pulse of duration t, the decay of the short-term memory resistance is very small and can be ignored, so the change of the short-term memory resistance can be approximated accordingly. When the short-term memory resistance of a single memristor decays to the long-term memory resistance, the resistances of M1 and M2 change by the same amount in opposite directions, so M1 + M2 = 1 still holds.

Fig. 2 Setting the weight of the forgetting memristor bridge written in batches
The short-term memory weight ω_s is determined by the difference between the short-term memory resistances of M2 and M1, ω_s = M2_s − M1_s (10); the long-term memory weight ω_l is determined by the difference between the long-term memory resistances, ω_l = M2_l − M1_l (11). In formulas (10) and (11), the changes of the long-term and short-term memory resistances are ΔM_l and ΔM_s, respectively. Thus, the long-term memory weight ω_l and the short-term memory weight ω_s can be set through ΔM_l and ΔM_s of a single memristor. ΔM_l and ΔM_s are set by the amplitudes and durations of v_l and v_s, so the weight of the forgetting memristor bridge is set directly by V_in.
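Under the constraint M1 + M2 = 1, the target resistances follow directly from the desired weights by inverting w = M2 − M1. A short sketch (the helper name is ours):

```python
def batch_bridge_resistances(w):
    """Invert w = M2 - M1 together with M1 + M2 = 1; returns (M1, M2)."""
    return (1 - w) / 2, (1 + w) / 2

# E.g. a long-term weight of 0.3 and a short-term weight of 0.5:
m1_l, m2_l = batch_bridge_resistances(0.3)   # 0.35, 0.65
m1_s, m2_s = batch_bridge_resistances(0.5)   # 0.25, 0.75

# Recovering the weights confirms the inversion:
w_l_check = m2_l - m1_l   # ~ 0.3
w_s_check = m2_s - m1_s   # ~ 0.5
```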
As shown in Fig. 2b, we use four groups of signals, (v1, t1), (v2, t2), (v3, t3) and (v4, t4), to set the weight of the forgetting memristor bridge in batches. Suppose we set the long-term and short-term memory weights of the forgetting memristor bridge to 0.3 and 0.5. According to formula (6), the required changes of the long-term and short-term memory resistance of each forgetting memristor are calculated; the minus signs in ΔM_l and ΔM_s indicate a drop in resistance. In the forgetting memristor model, mV and µs are the standard units, and k_l, k_s and the decay constant are set to 1e−2, 1e−5 and 1e3, respectively. The amplitude v1 is 100 mV and t1 is 10 µs; v2 and t2 are set as needed. The amplitude v3 is 10 mV and t3 is 1 µs; v4 and t4 are set as needed. T_1 and T_2 are 10 µs and 1 µs, respectively. The amplitudes and duty ratios of the weight-control voltage signals are (v1, 1), (−v2, 0.35), (v3, 1) and (−v4, t4), respectively. On the premise that v4²·t4 = 0.25·v3²·t3, we can adjust v4 and t4 to obtain different decay rates.
The short-term memory weight decays to the long-term memory weight at a rate jointly determined by the decay rates of M1 and M2. In the forgetting memristor bridge written in batches, the resistances and decay rates of M1 and M2 are set through the V_in signal. In the forgetting memristor model with controlled decay rate, ΔM_l and ΔM_s are determined by V_in, with the positive/negative sign of V_in indicating an increase/decrease of the memristor resistance, and the decay rate is determined by the amplitude and duration of the short-term set signal.

Fig. 3 Change of the resistance and the weight of the forgetting memristor bridge written independently
After the weight of the forgetting memristor bridge is set, the short-term memory weight converts to the long-term memory weight at the decay rate we set. As shown in Fig. 2c, if we apply a group of short-term reset and short-term set signals to the forgetting memristor bridge, the weight is restored to its initial state.

The weights based on two kinds of forgetting memristor bridge
In memristor-based artificial neural networks, the memristors usually store the weights in arrays. If a weight is stored on an independently written forgetting memristor bridge, its characteristics leave us several choices: the same group of resistances with different decay rates, different groups of resistances with the same decay rate, or different combinations of resistances and decay rates. All of these choices satisfy the target weight.
As shown in Fig. 3, when we select one group of resistances with different decay rates, the weights stored on the independently written forgetting memristor bridge decay at different rates. As shown in Fig. 4, when we select two groups of resistances with the same decay rate, the weights decay at the same rate.
If the weight is stored on a forgetting memristor bridge written in batches, only one definite group of resistance values can be selected, but many different decay rates are available. As shown in Fig. 5, the weights stored on the batch-written forgetting memristor bridge decay at different rates.
In summary, weight storage based on independently written forgetting memristor bridges offers a variety of options combining different resistances and different decay rates, while weight storage based on batch-written bridges is more deterministic: only one group of resistances can be chosen, but different decay rates remain available. When the weights stored by the two kinds of forgetting memristor bridge are used in neural network computation, the decay of weights stored by independently written bridges is, owing to the huge number of parameters, more random on the whole, whereas the decay of weights stored by batch-written bridges can be either random or deterministic.

While the weights stored on the forgetting memristor bridges are still variable, we call this the short-term memory state of the weights. If the weights are stored on independently written bridges, the decay rate of the short-term weight on any single bridge can take many values; the decay rates of the short-term memory weights are therefore approximately random and uncertain, which we call random decay. In an LSTM network based on batch-written bridges, however, the decay rate of the short-term weight on each bridge is determined, so for the forgetting memristor array as a whole the decay rates are more controllable. We can set the decay rate of every bridge in the array to the same fixed value, which we call fixed decay. We can set the relative decay rates among the weights so that all weights finish decaying at approximately the same time, which we call synchronous decay. We can also ignore the relationship between the decay rates of the individual bridges, so that the decay rates of the short-term memory weights are random, which again is random decay.
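The three decay modes can be sketched as rules for assigning a per-bridge decay rate to a weight matrix. The exponential relaxation model and all parameter values here are illustrative assumptions of ours, not specifications from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def decay_rates(w_s, w_l, mode, f_fixed=200.0, f_lo=100.0, f_hi=400.0,
                eps=1e-3, t_sync=10.0):
    """Per-bridge decay rates under the three modes, assuming each weight
    relaxes as w(t) = w_l + (w_s - w_l) * exp(-f * t) (illustrative model)."""
    shape = np.shape(w_s)
    if mode == "random":        # each bridge decays at an uncontrolled rate
        return rng.uniform(f_lo, f_hi, size=shape)
    if mode == "fixed":         # every bridge shares one preset rate
        return np.full(shape, f_fixed)
    if mode == "synchronous":   # all weights reach eps of their long-term
        # value at ~ t_sync: f = ln(|w_s - w_l| / eps) / t_sync
        delta = np.maximum(np.abs(w_s - w_l), eps)
        return np.log(delta / eps) / t_sync
    raise ValueError(f"unknown mode: {mode}")

w_s = rng.uniform(-1, 1, (4, 4))   # short-term weight matrix (random example)
w_l = rng.uniform(-1, 1, (4, 4))   # long-term weight matrix
f_sync = decay_rates(w_s, w_l, "synchronous")
# At t = t_sync every weight is within eps of its long-term value:
residual = np.abs(w_s - w_l) * np.exp(-f_sync * 10.0)
```

Weights far from their long-term value get a larger rate, so under synchronous decay the whole array converges at roughly the same time, at the cost of computing one rate per bridge.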

LSTM network based on forgetting memristor bridge array
In this paper, the LSTM network based on the forgetting memristor bridge is used for image classification. As shown in Fig. 6, the core of the LSTM network is the LSTM unit, which is composed of three gating mechanisms: the input gate, the forget gate and the output gate.
Input gate: the input gate works together with a tanh function to determine the addition of new information and to update the current cell state. The mathematical expressions are shown in formulas (12) and (13):

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)  (12)
j_t = tanh(W_j · [h_{t−1}, x_t] + b_j)  (13)

Forget gate: the forget gate selectively forgets parts of the LSTM unit state of the last moment and of the input of the current moment, discarding unimportant information and retaining important information. The mathematical expression is shown in formula (14):

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)  (14)

Output gate: the output gate controls the output of the current unit state. The mathematical expressions are shown in formulas (15)-(17):

o_t = σ(W_o · [h_{t−1}, x_t] + b_o)  (15)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ j_t  (16)
h_t = o_t ⊙ tanh(c_t)  (17)

Fig. 6 Long short-term memory network unit

Memristor-based neural network hardware circuits usually store the weights in memristor arrays. Wen et al. proposed an LSTM network based on a memristor array (Wen et al. 2019), and we use their LSTM network hardware circuit. As shown in Figs. 7 and 8, we replace each single memristor with a forgetting memristor bridge to store the weights and realize the LSTM network model. The weights are stored on the forgetting memristor bridge array and participate in the neural network operation, where m is the number of input steps and n is the number of hidden-layer neurons in the LSTM unit. However, the output of the memristor bridge is a voltage signal, which cannot be summed directly to complete the parallel calculation, so a mirror circuit is needed to convert the voltage signal into a current signal. We use the mirror circuit of Shamsi et al. (2017), shown in the upper part of Fig. 7; its output current depends on R, the source resistance of Q1 and Q2, and g_m, the transconductance of Q1 and Q2. For the memristor bridge array in Figs. 7 and 8, the output current of column n is

I_on = Σ_{k=1}^{m} V_ik · w_kn,

where V_ik is the input voltage signal of row k and w_kn is the weight stored by the forgetting memristor bridge in row k and column n.
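Under the standard LSTM formulation that the i/j/f/o notation of formulas (12)-(17) corresponds to, one unit step can be sketched in NumPy. The stacked weight layout and all dimensions here are our own illustrative choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps the concatenation [h_prev, x_t] to the
    four stacked gate pre-activations in the order i, j, f, o."""
    n = h_prev.size
    z = np.concatenate([h_prev, x_t]) @ W + b
    i_t = sigmoid(z[:n])             # input gate,      formula (12)
    j_t = np.tanh(z[n:2 * n])        # new information, formula (13)
    f_t = sigmoid(z[2 * n:3 * n])    # forget gate,     formula (14)
    o_t = sigmoid(z[3 * n:])         # output gate,     formula (15)
    c_t = f_t * c_prev + i_t * j_t   # cell state,      formula (16)
    h_t = o_t * np.tanh(c_t)         # hidden output,   formula (17)
    return h_t, c_t

# One 28-pixel image row as an input step, with 16 hidden neurons:
rng = np.random.default_rng(0)
m, n = 28, 16
W = rng.normal(0.0, 0.1, (n + m, 4 * n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
h, c = lstm_step(rng.random(m), h, c, W, b)
```

In the hardware circuit, the matrix product inside `lstm_step` is exactly what the forgetting memristor bridge array and mirror circuit compute in parallel.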
In the LSTM units based on the forgetting memristor bridge, as shown in Figs. 7 and 8, the trained weight matrices and biases of the LSTM network are stored in the forgetting memristor bridge array. According to the dimension of the input information, the corresponding LSTM units constitute the LSTM network. In the hardware circuit of the LSTM network, the test set is encoded and converted into voltage signals that are input to the LSTM unit.
Taking the 28 × 28 images of KMNIST as an example, after normalization we take each image row as one input step and convert it to a voltage signal. The input signal is multiplied by the weights of the forgetting memristor bridges as a dot product, the bias is added, and the results pass through the activation functions to form the outputs of the input gate, forget gate and output gate. V_i, V_j, V_f and V_o in Figs. 7 and 8 correspond to i_t, j_t, f_t and o_t in formulas (12)-(15). From f_t, i_t, j_t and o_t at the current moment, the circuit computes c_t and h_t of formulas (16) and (17). c_t is passed to the LSTM unit at the next moment as the cell state, and h_t participates in the operation of the LSTM unit at the next moment together with the next row of the input image. After multiple LSTM units, h_t serves as the final output to realize classification.

Fig. 7 The hardware circuit of the LSTM cell based on the forgetting memristor bridge written independently

The long-term memory and short-term memory of the LSTM network based on the forgetting memristor bridge
Based on the independently written and batch-written forgetting memristor bridges, we build the LSTM network in software and conduct experiments on FASHION-MNIST. In the LSTM networks based on the two kinds of forgetting memristor bridge, the long-term memory network states are basically the same, while the short-term memory network states differ. This shows up as a difference in the accuracy of the data set tested by the short-term memory LSTM network.
In the LSTM network based on the independently written forgetting memristor bridge, the short-term memory state of the network decays at a random rate. We repeat the experiment three times, and the results differ each time. As shown in Fig. 9a, the accuracy of the short-term memory state of an LSTM network with a random decay rate is relatively difficult to control.
In the LSTM network based on the batch-written forgetting memristor bridges, the short-term memory state can decay in three modes. As shown in Fig. 9b, the recognition accuracy on FASHION-MNIST of the short-term memory network under the three decay modes shows the same overall trend with different details. However, the workload of setting the weights differs among the three modes: random decay requires the least work, followed by fixed decay, while synchronous decay requires the most work and its decay time is not accurate enough. We can therefore choose a mode after weighing these considerations.
We build the LSTM network in software to train on the two data sets and obtain the weights and other parameters. On the LSTM network based on the forgetting memristor model, we store the weights and other parameters on the batch-written forgetting memristor bridge array to construct a neural network with both long-term and short-term memory, which can handle two kinds of data sets according to our multitasking needs. The long-term memory network handles the long-term task. When we temporarily need to deal with another task, the long-term memory network can switch to the short-term memory network; when the short-term task is finished, the short-term memory network switches back to the long-term memory network. The LSTM network based on the forgetting memristor thus saves hardware resources and realizes the recognition of two data sets.

Fig. 8 The hardware circuit of the LSTM cell based on the forgetting memristor bridge written in batches
The test sets of the experiment are KMNIST and FASHION-MNIST, with KMNIST as the short-term task and FASHION-MNIST as the long-term task. When dealing with the long-term task, the weights and other parameters of the LSTM network are based on the FASHION-MNIST training set. When faced with the short-term task, the long-term memory network is reset to the short-term memory network, whose weights and other parameters are based on the KMNIST training set. In the two image-classification tasks, the accuracy on the FASHION-MNIST test set (the long-term task) rises over time from 10% to 87.5%, while the accuracy on the KMNIST test set (the short-term task) falls over time from 89.5% to 10%, as shown in Fig. 10a. An epoch contains 500 images; the image classification of the FASHION-MNIST data set is shown in Fig. 10b.
As shown in Table 1, the classification accuracy on the KMNIST and FASHION-MNIST data sets is fixed for the LSTM network based on non-volatile memristors, but variable for the LSTM network based on batch-written forgetting memristors. As Table 1 also shows, although the LSTM network based on the forgetting memristor loses some classification accuracy, one LSTM network recognizes two data sets.
As shown in Table 2, compared with the non-volatile memristor, the weight matrix of the LSTM network based on forgetting memristors can store both long-term and short-term memory weights. The network has two states, a short-term memory network and a long-term memory network, and can handle two different data sets. With the same network structure, more data sets can be processed with less memristor hardware.
The LSTM network based on forgetting memristor bridges written in batches can be applied in multi-task scenarios. When it processes image classification tasks of two data sets, the short-term memory network is transformed into the long-term memory network after the short-term task is completed. Since the weights of both the long-term and short-term memory networks are stored on arrays of forgetting memristor bridges, the conversion speed of the short-term memory network to the long-term memory network depends on the decay mode of the weights. Our proposed LSTM network based on forgetting memristor bridges written in batches has three decay modes: random decay, fixed decay, and synchronous decay. Since the state of the short-term memory network under random decay is difficult to control, we consider only the other two decay modes. The two decay modes have different influences on the conversion speed of the short-term memory network to the long-term memory network.
In the process of network conversion based on forgetting memristors written in batches, the convergence rate of the short-term network to the long-term network differs with the decay mode. On the whole, the slower the decay rate, the more short-term tasks are processed and the longer the conversion to the long-term memory network takes; the faster the decay rate, the fewer short-term tasks are processed and the shorter the conversion takes. We can choose a decay mode according to the amount of image data in the short-term task. The image classification accuracy of the short-term and long-term memory networks under fixed and synchronous decay is shown in Fig. 11. As can be seen from Fig. 11, the decay trend and extreme values of the classification accuracy are basically the same for the two decay modes, but the retention time of the high-recognition-rate state of the short-term memory network differs between them. The workload and decay-control precision of setting the weights also differ between the two modes; we can therefore choose a mode by the maintenance time of the high recognition rate of short-term memory required by the short-term task.
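The difference between fixed decay and synchronous decay can be sketched as follows. This is our reading of the two modes, not the paper's circuit-level definition: under fixed decay each weight moves toward its long-term value by a constant step, so weights with a large gap converge later; under synchronous decay every weight shrinks its gap by the same fraction, so the whole matrix converges together.

```python
import numpy as np

rng = np.random.default_rng(0)
w_short = rng.uniform(-1.0, 1.0, size=4)  # short-term memory weights
w_long = rng.uniform(-1.0, 1.0, size=4)   # long-term memory weights

def fixed_decay(w, target, step=0.05):
    """Each weight moves toward its target by at most a fixed
    amount per step; convergence time depends on the gap size."""
    return w + np.clip(target - w, -step, step)

def synchronous_decay(w, target, rate=0.1):
    """All weights reduce their gap by the same fraction per step,
    so the whole matrix reaches the long-term state together."""
    return w + rate * (target - w)

# Run both modes until the weights are close to the long-term values.
wf, ws, steps_f, steps_s = w_short.copy(), w_short.copy(), 0, 0
while np.abs(wf - w_long).max() > 1e-3:
    wf, steps_f = fixed_decay(wf, w_long), steps_f + 1
while np.abs(ws - w_long).max() > 1e-3:
    ws, steps_s = synchronous_decay(ws, w_long), steps_s + 1
```

Both modes reach the same long-term weights; they differ only in how the conversion time is distributed across individual weights, which is why the accuracy curves in the text share the same trend but hold the short-term state for different durations.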

Methods of maintaining the short-term memory network
The number of short-term tasks processed by the short-term memory network of the LSTM under the two decay modes is determined by the decay rate. Taking synchronous decay as an example, the image classification accuracy of the two data sets processed by the LSTM network with synchronous decay is shown in Fig. 12. As can be seen from Fig. 12, the classification accuracy of the LSTM network with synchronous decay on KMNIST changes from high to low and then stays low, while its accuracy on FASHION-MNIST changes from low to high and then stays high. This is because KMNIST is processed as the short-term task: after the short-term task is completed, the short-term memory network decays to the long-term memory network, whose corresponding data set is FASHION-MNIST. We can set this conversion speed to satisfy different short-term tasks.
In the LSTM network based on forgetting memristor bridges written in batches, the long-term memory network and the short-term memory network correspond to the long-term task and the short-term task, respectively. The slower the short-term memory network converts to the long-term memory network, the more images the short-term memory network can classify. To make the decay rate from the short-term memory network to the long-term memory network controllable with respect to the number of images in the short-term task, we have two approaches. The first is to set the decay rate according to the amount of short-term work when the weights and other parameters of the LSTM are stored on the forgetting memristor bridges written in batches. The second is to apply pulse signals to the forgetting memristor bridges written in batches while the short-term memory network is decaying toward the long-term memory network, resetting the short-term memory network; this delays the conversion and increases the number of images the short-term memory network can process.

Fig. 11 The accuracy of FASHION-MNIST and KMNIST with fixed and synchronous decay rates

Fig. 12 The accuracy of FASHION-MNIST and KMNIST with synchronous decay
To handle different numbers of images in short-term tasks, we set the synchronous decay rate of the weights on the forgetting memristor bridges written in batches in advance, so the LSTM network can process different amounts of short-term work. On the LSTM network based on forgetting memristors, we set processing quantities of 1K, 2K, 5K, and 8K images, so the short-term memory network switches completely to the long-term memory network after 1K, 2K, 5K, and 8K images are processed, respectively. From Fig. 13a we can see that the slower the synchronous decay rate set in the LSTM network, the longer the short-term memory network lasts and the more short-term image classification it performs. According to need, we can set different synchronous decay rates to meet different short-term tasks. It is worth noting that the weight decay rate and decay time do not correspond linearly; we can only treat them as approximately linear, which introduces some error, but we can indirectly set the number of processed images through the relative size of the weight decay rate.
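Choosing a decay rate for a target workload can be sketched as below. The image throughput, the accuracy-loss threshold, and the exponential-decay assumption are all hypothetical illustration parameters, consistent with the text's note that the rate-to-time mapping is only approximately linear.

```python
import math

IMG_PER_SECOND = 100.0  # assumed image throughput (hypothetical)
THRESHOLD = 0.5         # fraction of the short-term component below
                        # which we treat the short-term state as lost

def tau_for_images(n_images, rate=IMG_PER_SECOND, thr=THRESHOLD):
    """Pick a synchronous-decay time constant tau so that roughly
    n_images can be classified before the short-term component
    exp(-t / tau) falls below thr (exponential-decay assumption)."""
    t_needed = n_images / rate
    return t_needed / math.log(1.0 / thr)

tau_1k = tau_for_images(1000)  # short workload: fast decay is fine
tau_8k = tau_for_images(8000)  # large workload: slower decay (larger tau)
```

A larger short-term workload maps to a larger `tau`, i.e. a slower synchronous decay, matching the 1K/2K/5K/8K settings described for Fig. 13a.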
By setting different synchronous decay rates, the LSTM network can realize image classification for different numbers of short-term-task images. During the transformation from the short-term memory network to the long-term memory network, we can apply a reset signal to restore the LSTM network to the initial state of the short-term memory network. From Fig. 13b we can see that after the reset signal is applied, the LSTM network is reset to the initial short-term memory state and the corresponding accuracy becomes higher again. If the short-term memory state is maintained longer, the number of images recognized by the short-term memory network increases. Therefore, the reset signal lets us control the speed of the transformation from the short-term memory network to the long-term memory network, and hence the number of images of temporary tasks that the short-term memory network can process. The accuracy of the short-term memory of the LSTM network on the short-term task decreases during the conversion. Therefore, to improve the classification accuracy of the short-term and long-term memory states, we can introduce additional mechanisms into the network model. For example, we introduce an attention mechanism and conduct experiments on KMNIST and FASHION-MNIST with an LSTM model and an LSTM + Attention model. The results in Table 3 show that the accuracy of the LSTM + Attention model is higher than that of the LSTM model. This is because the LSTM network alleviates the vanishing-gradient problem in long-sequence learning, but when the sequence length exceeds a certain number, gradients at a distance still vanish, while the attention mechanism further alleviates this vanishing.

Fig. 13 The accuracy of the short-term memory network with synchronous decay rate
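The effect of the reset signal can be sketched with the same exponential-decay assumption as before; treating a reset pulse as fully rewriting the short-term weights is an idealization of the mechanism, not a circuit-level claim.

```python
import math

def short_term_level(t, tau, reset_times):
    """Short-term component exp(-dt / tau), where dt is the time
    since the most recent reset pulse (idealized: a pulse fully
    restores the short-term weights, an assumption for illustration)."""
    last = max([r for r in reset_times if r <= t], default=0.0)
    return math.exp(-(t - last) / tau)

tau = 10.0
# Without a reset, the short-term state has largely decayed by t = 3*tau;
# with a reset pulse at t = 2*tau, the decay restarts from full strength.
no_reset = short_term_level(30.0, tau, [])
with_reset = short_term_level(30.0, tau, [20.0])
```

Each pulse restarts the decay clock, which is why Fig. 13b shows the short-term accuracy jumping back up after the reset signal and the short-term state lasting longer overall.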

Conclusion
In this paper, a two-state LSTM network is proposed based on a forgetting memristor model with a controllable decay rate.
The long-term memory state and the short-term memory state of the LSTM network can recognize two different data sets, and the recognition accuracy is basically the same as that of a traditional LSTM network with the same network structure. This work can expand the application range of volatile memristors: the forgetting memristor can store more data, and a neural network based on forgetting memristors can process two kinds of data sets, providing a new idea for researchers working on memristors and neural networks. As next steps, we will conduct further simulation experiments on specific circuits and extend our work to more complex real-life data sets, including video recognition and text classification.

Fig. 1 Setting the weight of the forgetting bridge written independently

Fig. 4 Change of the resistance and the weight of the forgetting memristor bridge written independently

Fig. 9 Recognition rate of the short-term memory network based on two kinds of forgetting memristor bridges

Fig. 10 Image classification of the LSTM network based on the forgetting memristor bridge written in batches

Table 2 Comparison of LSTM networks based on forgetting memristors written in batches and the non-volatile memristor bridge model

Table 3 The classification accuracy of different data sets