Hybrid memristor-CMOS neurons for in situ learning in fully hardware memristive spiking neural networks

Spiking neural networks, consisting of spiking neurons and plastic synapses, are a promising but relatively underdeveloped class of neural networks for neuromorphic computing. Inspired by the human brain, they provide a unique solution for highly efficient data processing. Recently, memristor-based neurons and synapses have become intriguing candidates for building spiking neural networks in hardware, owing to the close resemblance between their device dynamics and those of their biological counterparts. However, the functionalities of memristor-based neurons are currently very limited, and a hardware demonstration of fully memristor-based spiking neural networks supporting in situ learning is very challenging. Here, a hybrid spiking neuron that combines a memristor with simple digital circuits is designed and implemented in hardware to enhance the neuron functions. The hybrid neuron with memristive dynamics not only realizes the basic leaky integrate-and-fire neuron function but also enables in situ tuning of the connected synaptic weights. Finally, a fully hardware spiking neural network with the hybrid neurons and memristive synapses is experimentally demonstrated for the first time, with which in situ Hebbian learning is achieved. This work opens up a way towards the implementation of spiking neurons, supporting in situ learning for future neuromorphic


Introduction
Inspired by the human brain, spiking neural networks (SNNs) encode timing signals into the computing process, where spike-based temporal processing allows sparse and efficient information transfer, conversion, and storage 1,2. Building an SNN system in hardware is promising and attractive for performing edge tasks with great efficiency in the big-data era 3. Lately, significant efforts have been made, with impressive progress, towards the implementation of SNN chips based on CMOS (complementary metal-oxide-semiconductor) technology, such as TrueNorth from IBM 4, Loihi from Intel 5, and Tianjic from Tsinghua University 6. Nevertheless, because CMOS devices and biological components share few similarities at the level of physical mechanism, CMOS devices lack intrinsic neuronal dynamics and can only simulate, rather than faithfully emulate, neuron functions. Even the simplest simulation of neuron functions requires such silicon neurons to have a fairly complex circuit [7][8][9], whose complexity grows quickly with more fidelity and functionality 10,11. Therefore, compared to biological neurons, these bulky neuron circuits are less area- and energy-efficient, which limits their edge applications because of energy constraints, as well as their cloud applications because of the limited number of neurons that can be integrated on a chip.
Wang et al. 12 used the dynamic migration of Ag in a host dielectric material to emulate the stochastic leaky integrate-and-fire (LIF) process in neurons and demonstrated a fully memristive SNN system with unsupervised learning. Owing to the intrinsic dynamics of a single memristive device, the neurons in these works are more energy- and area-efficient than CMOS-based ones. However, these memristor-based neurons focus on emulating the functionalities of a single neuron, without considering the actual requirements for system realization in hardware. Thus, the functional diversity of these neurons (such as generating in situ learning and lateral inhibition signals) remains to be demonstrated, their stability (such as firing under continuous stimuli) needs great improvement, and fully hardware system-level demonstrations are still primitive.
Here, we design a hybrid memristor-CMOS leaky integrate-and-fire (LIF) spiking neuron to enhance the fidelity and functionality of memristor-based neurons. In this neuron circuit, a single threshold switching (TS) memristor serves as the dynamic integrator of the post-neuron, collecting input signals from the pre-neurons and determining whether to fire. Simple digital circuits detect the fire event and output reproducible spike signals, and they ensure stable firing of the memristor under pulse train inputs by supplying a refractory period (RP) signal. In the hybrid neuron, the TS memristor provides the dynamics for neuromorphic functions, and the transistors supply the signal amplification needed to enable larger and multilayer networks.

Design principles of the neuron circuits
Figure 1a shows the schematic of a biological neural system, constructed with a variety of neurons and connected plastic synapses. In such a system, a typical neuron mainly includes numerous dendrites, a soma, and an axon 33,34. Together with the soma, the dendrites of post-neurons receive and integrate the excitatory or inhibitory signals from pre-neurons and raise the membrane potential 34,35. Once the membrane potential surpasses a threshold, the axon hillock generates an "all-or-none" action potential (AP) through the opening or closing of voltage-gated ion channels. The "all-or-none" feature of the AP lets the biological neuron perform the signal gain function and ensures AP transmission in a deep network. After firing, the membrane potential recovers to the resting state within a refractory period and prepares for the next spiking event. Thus, the neuron can fire continuously under a string of AP inputs. The axon terminals (of the pre-neuron) and dendrite terminals (of the post-neuron) form the synapses, whose strength (synaptic weight) dictates the intensity of the signal passing from the pre-neurons to the post-neurons. Importantly, the synaptic weight can be modified in situ according to the relative timing of pre- and post-synaptic spikes (the spike-timing-dependent plasticity (STDP) learning rule) [36][37][38], which is believed to be one of the key mechanisms for organisms to learn and dynamically adapt to the external environment. Furthermore, lateral inhibition between post-neurons through inhibitory interneurons is another key feature in biological systems, in which the excited neurons inhibit other nearby or connected neurons 39. Lateral inhibition enables the brain to manage sensory inputs, avoid information overload, and support competitive learning in a network 39,40.
Inspired by the biological system, we design a hybrid spiking neuron to construct a memristive SNN in hardware, as schematically shown in Fig. 1b. In this neuron circuit, the TS memristor serves as a gated membrane that dynamically integrates the input signals through the growth of Ag filaments and induces a fire event determined by the threshold switching nature of the device (abrupt switching from a highly resistive OFF state to a highly conductive ON state). The CMOS units shape a fixed output spike to perform the signal gain function and ensure continuous firing of the TS memristor under pulse train stimuli by supplying a refractory period (RP) feedback signal.
Within the RP, the device spontaneously relaxes back to its initial state without needing any reset operation. This is attributed to the self-rupture of the Ag channel driven by interfacial energy minimization between Ag and the dielectric, or the Thomson-Gibbs effect 12,41,42. With the help of the CMOS units, the weight updating signals are introduced into the neuron circuits to modulate the resistive switching (RS) synaptic weights in situ. Furthermore, to support competitive learning in a network, the CMOS units also supply a lateral inhibition signal to other neurons through RS devices (more circuit details are presented in Fig. 3).
Fig. 2 caption (continued): After the first firing, the device cannot completely return to its initial HRS because the interval time is too short for a complete relaxation, which induces a sub-threshold firing. f, Statistics of the number of pulses required for the first firing under different pulse voltages (with 250 µs width and interval time). Fewer integration pulses are required under higher voltages.

Neural characteristics of the TS memristor
As mentioned before, the TS memristor plays a key role in the hybrid neuron. To obtain stable spiking behavior of the neuron and apply it in networks, a TS memristor array containing 32 discrete devices was fabricated, as shown in Fig. 2a; the TS device, with an Au/Ag/SiO2:Ag/Au structure, is shown in the inset of Fig. 2b. Initially, the device is in a high resistance state (HRS) and is forming-free owing to the doping technology 43-45, which is important for large-scale integration. The fabrication process is described in Methods. Fig. 2b shows typical volatile I-V switching curves of the TS memristor under 100 positive voltage sweeps. During the switching process, once the applied voltage surpasses a threshold, the device switches from an HRS to a low resistance state (LRS) because one or more Ag channels form within the SiO2 dielectric [46][47][48][49]. When the applied voltage falls below a hold value, the device relaxes back to an HRS because of the spontaneous rupture of the Ag channel 42,46,49,50. It should be noted that the growth and rupture of the Ag channel are stochastic physical processes, so the switching voltages vary from cycle to cycle and follow a probability distribution (see statistical data in Supplementary Fig. 1). This gives the memristor-based neuron inherent stochastic neuronal behavior, so it does not need the external random number generators required in CMOS-based neurons 11.
To further study the device characteristics for emulating LIF neurons, we switched the device with pulses, as shown in Fig. 2c. During the measurement, we used a transistor instead of a fixed resistor to limit the current and protect the TS device (inset of Fig. 2c). The transistor also serves as a read-out resistor, which is beneficial for integration in the designed hybrid neuron. The dynamic response of the device under a 1.2 V/1 ms post-synaptic pulse, monitored with a 0.05 V read voltage, was observed. Within a certain delay time, Ag atoms gradually accumulate in the SiO2 dielectric layer under the effect of the electric field and redox reactions 41,46,47,51, corresponding to the integration process. Eventually, an Ag channel forms and switches the device to an LRS, representing the fire behavior of neurons. When the applied trigger voltage is removed, the device spontaneously relaxes back to its HRS, reflecting the "leaky" feature of the biological neuron membrane. Compared with the non-volatile memristor-based neuron 24, the volatile nature of the TS device allows our artificial neuron to recover automatically to its resting state after firing, without extra reset operations, just like biological neurons, which reduces circuit complexity and energy consumption. To study the effect of the pulse amplitude on the integration time and relaxation time, pulses with different amplitudes but a fixed 1 ms width were applied to the device, and the statistical data are shown in Fig. 2d. The results show that as the pulse amplitude increases from 1.0 V to 1.4 V, the integration time decreases while the relaxation time increases. In other words, a higher post-synaptic voltage needs a shorter time to fire the post-neuron and vice versa, which is similar to what is observed in biological neurons 33. Both the integration time and the relaxation time under different amplitudes follow a probability distribution (see Supplementary Fig. 2) because of the stochastic growth and rupture of the Ag channel(s). These features equip the TS memristor with the highly desirable stochastic neuronal dynamics and spontaneous repolarization capabilities found in biological neurons 52. Recently, such stochasticity has been successfully demonstrated in PCM-based neurons and shows potential for population coding 24.
Here, to demonstrate the LIF behavior of neurons under multiple stimuli, pulses with a shorter width (250 µs) and interval time (250 µs) were applied as the input signals, as shown in Fig. 2e. Four pulses are required to trigger the first fire event, indicating a multiple-pulse LIF process. The statistical data of the number of pulses needed for firing under different pulse amplitudes are shown in Fig. 2f. Fewer integration pulses are required under higher amplitudes. Hence, the neuron firing rate can be modulated by the post-synaptic action potential, which depends on the connected synaptic weights. It is worth noting that, to prepare for the next LIF event, the waiting time after firing must be longer than the device's relaxation time. Therefore, after the first firing event (Fig. 2e), the device cannot decay to its initial HRS before the next input pulse arrives because the interval time is insufficient. This phenomenon indicates that the simple TS device cannot fire continuously under pulse train stimuli, which is a general challenge observed in capacitor-less memristor-based neurons 12,15,18,30,32. For practical applications, a refractory period is needed to let the device recover to its HRS and perform continuous LIF behavior under pulse train inputs. Thus, in this work, we introduce an RP feedback signal into the hybrid neuron circuit to solve this problem (more details are presented in Fig. 3).
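The multiple-pulse integration described above can be illustrated with a toy model. The sketch below is not a device model from this work: the normalized filament state `x`, the exponential voltage dependence, and the constants `tau_s`, `rate0`, and `v_scale` are all illustrative assumptions. Only the 250 µs pulse width and interval are taken from the text. It shows qualitatively why fewer pulses are needed at higher amplitudes.

```python
import math
import random


def pulses_to_first_fire(amplitude_v, sigma=0.0, seed=0, max_pulses=1000):
    """Count input pulses until a toy TS-memristor integrator first fires.

    All constants are illustrative assumptions, not fitted device values.
    """
    width_s = 250e-6      # pulse width, as in the text
    interval_s = 250e-6   # interval between pulses, as in the text
    tau_s = 1e-3          # assumed relaxation time constant
    rate0 = 1200.0        # assumed growth rate (1/s) at 1.0 V
    v_scale = 0.2         # assumed voltage sensitivity (V)

    rng = random.Random(seed)
    x = 0.0               # normalized filament state; the device fires at x >= 1
    for n in range(1, max_pulses + 1):
        # Optional multiplicative noise mimics stochastic filament growth.
        noise = rng.lognormvariate(0.0, sigma) if sigma > 0 else 1.0
        growth = rate0 * math.exp((amplitude_v - 1.0) / v_scale) * width_s
        x += growth * noise                   # integrate during the pulse
        if x >= 1.0:
            return n                          # fire event
        x *= math.exp(-interval_s / tau_s)    # leak during the interval
    return None                               # never fired within max_pulses


if __name__ == "__main__":
    for v in (1.0, 1.2, 1.4):
        print(v, pulses_to_first_fire(v))
```

Setting `sigma` above zero makes the pulse count vary from run to run, which loosely mirrors the cycle-to-cycle distribution reported for the device.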

Hybrid neuron circuit and the characteristics
Figure 3a shows the details of the hybrid neuron circuit, whose area is estimated to be about 50× smaller than that of a 1 pF NMOS capacitor on a 14 nm technology node 24. The CMOS units include two D-type latches (L1 and L2), an AND gate (G1), an OR gate (G3), and a depression module. G1 generates the output spike signal, and G3 generates the lateral inhibition signal. The output of L2 serves as both the potentiation and the refractory period signal and controls the transistor T1. The output of L1 triggers the depression feedback circuit, which consists of an AND gate (G2), a buffer, and a switch transistor T3. In the resting state, the transistor T1 is off, and the electrical potential of node 1 is the post-synaptic action potential. When the TS device fires, a feedback signal from L2 turns T1 on, and node 1 becomes a virtual ground. The virtual ground at node 1 provides a refractory period for the TS memristor to relax back to its HRS and potentiates the synaptic weights that receive input pulses. Initially, the depression module can be considered an open circuit because the voltage on the T3 gate is zero. When the TS device fires, the output signal from L1 activates the depression module and lifts the potential of node 1 (see Supplementary Fig. 3), thus depressing the synapses whose inputs are zero. Because T1 and T3 are turned on in two different clock periods, the potentiation and depression operations do not conflict with each other, and the neuron can support an optimized Hebbian learning rule 36,53 (see Supplementary Fig. 4). Fig. 3b shows the measured output sequence diagram of five critical nodes in the neuron circuit within two adjacent firing cycles. Note that, to present the voltage evolution on the critical nodes clearly, the depression module is disabled during the measurement. Here, two fixed resistors serve as RS synapses (S1 = 10 kΩ and S2 = 40 kΩ). VIN1 receives input pulses, and VIN2 is grounded. On the fifth input pulse within the first firing cycle, the TS memristor fires, leading to an abrupt increase of the voltage on node 2. The voltage on node 2 then serves as the input of L1 and induces a high-level output of L1 under the control of the CLK signal. Subsequently, the output of L1 (the input of L2) activates L2 to output a high-level voltage that turns on T1.
When T1 is on, the potential on node 1 is nearly zero, which gives the TS memristor sufficient time (refractory period (500 µs) + interval time (250 µs)) to decay to its initial HRS and prepare for the next firing event. During this period, G1 generates an output spike by carrying out the "AND" logic operation on the L2 output and CLK. The CLK signals are provided by a global (shared) signal generator with a 2 kHz frequency and a 50% duty cycle. All output spikes are identical because each spike results from the "AND" operation of the L2 output and the global CLK signal. Thus, the circuit outputs a fixed spike signal, emulating the "all-or-none" feature of the action potential in biological neurons 33.
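The clocked behavior described above can be sketched at the logic level. The following is a simplified, hypothetical abstraction rather than the measured circuit: it assumes that L1 and L2 each update once per clock cycle, that L2 follows L1 by one cycle, and that the TS device cannot fire while T1 grounds node 1. The function name `simulate_neuron_logic` and the trace format are illustrative. Each clock cycle is split into a high half and a low half to reflect the 50% duty cycle.

```python
def simulate_neuron_logic(ts_fired):
    """Trace L1, L2, T1 and the output spike for a sequence of clock cycles.

    ts_fired[k] is True if the TS device switched on during clock cycle k.
    The timing is an assumed abstraction: one latch update per clock cycle.
    """
    l1 = False
    l2 = False
    trace = []
    for fired in ts_fired:
        l2_next = l1                           # L2 latches the previous L1 output
        l1_next = bool(fired) and not l2_next  # node 1 is grounded while T1 is on
        l1, l2 = l1_next, l2_next
        for clk in (1, 0):                     # 50% duty cycle: high half, then low half
            trace.append({
                "clk": clk,
                "L1": l1,
                "L2": l2,
                "T1_on": l2,                   # L2 output drives T1 (refractory period)
                "spike": int(l2 and clk == 1), # G1: L2 output AND CLK
            })
    return trace


if __name__ == "__main__":
    for row in simulate_neuron_logic([False, True, False, False]):
        print(row)
```

In this abstraction, one fire event yields one spike lasting half a clock period, and T1 stays on for one full clock period, which is 500 µs at 2 kHz and matches the refractory period quoted above.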
The continuous firing behavior of the neuron under pulse trains with different amplitudes (from 1.4 V to 2.0 V, 2 kHz frequency, 250 µs width) is shown in Fig. 3c, which is equivalent to the firing behavior under identical input pre-neuron pulses but different synaptic weights. As expected, the spiking frequency increases with the amplitude, demonstrating that the neuron can distinguish different stimulus intensities by producing different spiking frequencies. All output spikes have an identical form (2.0 V, 250 µs width; the apparent variation results from read fluctuation, see the zoomed-in view in Supplementary Fig. 5). Figure 3d shows the statistical results of the spiking frequency as a function of the input pulse amplitude, further confirming that the spiking frequency increases with the input pulse amplitude. Besides, owing to the active digital components, the hybrid neuron can drive adjacent neurons directly. The spiking behavior of two connected neurons was tested (see Supplementary Fig. 6). The results indicate that the proposed hybrid neuron can propagate spiking signals in multilayer networks through connected synapses, just as observed in biological systems.
To further demonstrate the neuron's feasibility for performing in situ learning, two RS synapses (Ta/HfO2/Pd) are connected to the neuron circuit (see Supplementary Fig. 3a). Initially, the synapses S1 and S2 are programmed into a medium resistance state (~400 µS @ 0.2 V). Then a series of pulses is applied to the input terminal VIN1, while VIN2 is held at zero. Compared to Wang's work 12, both the synaptic potentiation and depression operations are performed within the neuron, without any external depression control circuits, which to some extent decreases the system hardware overhead and more faithfully implements the Hebbian learning process of the biological system 36,53. During learning, the increased potential of node 1 is clearly observed (Supplementary Fig. 3b), which is used to depress the synapse S2. After learning, the synapses S1 and S2 are programmed into an LRS (~980 µS @ 0.2 V) and an HRS (~42 µS @ 0.2 V), respectively, as shown in Supplementary Fig. 3c. Corresponding to the evolution of S1 and S2, the output spiking frequency increases with the input pulse count during the learning process (Supplementary Fig. 3d), demonstrating the in situ learning capability of the hybrid neuron.
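The learning rule used in this experiment can be summarized in a few lines. The sketch below is an illustrative abstraction, not the device physics: on each post-neuron fire event, synapses with an active input are potentiated and synapses with a zero input are depressed. The soft-bound update and the `rate` parameter are assumptions. The bounds reuse the conductances reported above (~980 µS and ~42 µS), and the two operations, which occur in separate clock periods in the circuit, are applied in a single call for brevity.

```python
def hebbian_step(conductances, inputs, g_min=42e-6, g_max=980e-6, rate=0.5):
    """Apply one fire-triggered Hebbian update to a list of synaptic conductances.

    inputs[i] is truthy if synapse i received an input pulse.
    The soft-bound update and `rate` are illustrative assumptions.
    """
    updated = []
    for g, active in zip(conductances, inputs):
        if active:
            g = g + rate * (g_max - g)   # potentiate toward the LRS
        else:
            g = g - rate * (g - g_min)   # depress toward the HRS
        updated.append(g)
    return updated


if __name__ == "__main__":
    g = [400e-6, 400e-6]                 # S1 and S2 start in a medium state
    for _ in range(5):
        g = hebbian_step(g, inputs=[1, 0])
    print([round(x * 1e6, 1) for x in g])  # conductances in µS
```

With input on VIN1 only, S1 moves toward the high-conductance bound and S2 toward the low-conductance bound, which is the qualitative trend reported for the experiment.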

Lateral inhibition circuits for the WTA learning rule
Lateral inhibition is a crucial feature for unsupervised learning 40, and it supports the implementation of the winner-take-all (WTA) learning rule. The WTA rule states that once the winner neuron fires, the other neurons are inhibited. To perform the WTA learning rule using the proposed neuron circuits, we design a lateral inhibition array (LIA) that contains an RS array and comparators, as shown in Fig. 4a. An LIA for ten post-neurons is presented here; the design remains feasible for thousands of neurons by simply increasing the array size (n × (n+1), where n is the number of WTA neurons). To carry out the lateral inhibition operation among post-neurons, the LIA should possess two features. First, when no neuron fires, all lateral inhibition signals from the neurons (VL1−VL10) are "0". In this case, all outputs of the LIA (LG1−LG10) should be "1" to activate all synapses for an inference operation. Second, when the winner neuron (e.g., N1) fires, the lateral inhibition signal of N1 is "1", and all other neurons' lateral inhibition signals are "0". In this case, the LG1 output of the LIA should be "1" and the other outputs (LG2−LG10) should be "0". Correspondingly, only the winner neuron's synapses are active, and the other neurons' synapses are inhibited, followed by the in situ learning operation on the winner neuron's synapses. Here, a bias input is introduced into the LIA to make the lateral inhibition weights valid (see the mathematical analysis in Supplementary Note 1). It is worth noting that the comparators' positive terminals serve as the reference terminals, and the negative terminals receive signals from the RS array. In this way, negative weight values are avoided, which reduces the hardware overhead of using differential resistor pairs 54. Fig. 4b shows the pre-programmed weight conductance of the LIA according to the calculated weight values.
To demonstrate the performance of the LIA, we tested it under two input conditions: all neurons' VLs are "0" (0 V), and only the winner neuron's VL is "1" (1.5 V), as shown in Fig. 4c. In the first condition, no neuron fires, and the lateral inhibition outputs (VL1−VL10) are "0" (0 V). Thus, the inputs of the lateral inhibition array are "0" (0 V), except that the bias input is "1" (1.5 V) (left part of Fig. 4c). In this case, all the LIA outputs (LG1−LG10) are "1" (3 V), which is used to activate all synapses, as shown in the left part of Fig. 4d. The second condition, in which only the winner neuron fires, corresponds to the case where only the winner neuron's lateral inhibition signal is "1" (1.5 V) and the other neurons are silent (right part of Fig. 4c). In this case, only the winner neuron's LG is "1" (3 V), while the other LIA outputs, which correspond to loser neurons, are "0" (0 V), as shown in the right part of Fig. 4d. Thus, only the winner neuron's synapses can be programmed. The lateral inhibition signal of the fired neuron appears when the TS device switches on (Fig. 4e), indicating that the lateral inhibition signal is triggered in time. These experimental results show that the LIA circuits possess the two features mentioned above and are therefore suitable for supporting the proposed hybrid neuron in performing lateral inhibition and unsupervised learning with the WTA rule.
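The two required LIA features can be checked with a small behavioral model. The sketch below is a hypothetical abstraction: it treats the signal at each comparator's negative terminal as a normalized weighted sum of the n lateral inhibition inputs plus the bias input, and sets the output to "1" when that signal is below the reference. The weights `w_inhibit` and `w_bias` are made-up values chosen only to satisfy both features, and are not the calculated weights of Supplementary Note 1. The 1.5 V input and bias levels and the 50 mV reference follow the values given for Fig. 4.

```python
def lia_outputs(vl, v_bias=1.5, v_ref=0.05, w_inhibit=0.1, w_bias=0.01):
    """Return the LIA outputs LG1..LGn for lateral inhibition inputs VL1..VLn.

    Each row of the n x (n+1) array has n lateral weights plus one bias weight.
    The weight values are illustrative assumptions, not the programmed ones.
    """
    n = len(vl)
    inputs = list(vl) + [v_bias]
    outputs = []
    for i in range(n):
        # A neuron does not inhibit itself; every other neuron inhibits it.
        weights = [0.0 if j == i else w_inhibit for j in range(n)] + [w_bias]
        v_neg = sum(w * x for w, x in zip(weights, inputs))
        # Reference on the positive terminal: output is "1" when v_neg < v_ref.
        outputs.append(1 if v_neg < v_ref else 0)
    return outputs


if __name__ == "__main__":
    print(lia_outputs([0.0] * 10))           # no neuron fires
    print(lia_outputs([1.5] + [0.0] * 9))    # neuron 1 is the winner
```

Because the comparator inverts the array signal, all weights in this model stay non-negative, in line with the stated motivation for avoiding differential resistor pairs.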

Fully hardware multilayer SNNs
Based on the proposed hybrid neurons and the lateral inhibition circuits, we further demonstrate a fully hardware multilayer SNN for performing in situ learning. Fig. 5a shows the digital patterns that are used for learning. Each pattern includes 30 pixels (6 × 5). In real operations, the black pixels are recognized as "1" and encoded as positive pulses (1.6 V, 250 µs width, and 2 kHz). The white pixels are recognized as "0" and thus grounded. The network is constructed with a 30 × 10 × 10 structure, as shown in  4). Fig. 5e shows the evolution of the synaptic weights of neuron "1" after 30 firing events when the digit "1" is used as the input. The conductance of the synapses is clearly programmed upon the firing events. Supplementary Fig. 8 shows the synaptic weights of the other neurons under different input digital patterns. The weight map after learning is shown in Fig. 5f, where a clear conductance distribution is observed. Furthermore, the inference firing rates under different input digits with noise pixels are shown in Fig. 5g and Supplementary Fig. 9.
The results show that both neuron "6" and neuron "5" fire when the input digit is "5" (or digit "6"), as do neuron "8" and neuron "9" when the input digit is "8" (or digit "9"). This is because the Hamming distance between these patterns is small 55, which results in nearly the same post-synaptic membrane potential (see Supplementary Fig. 10). It is worth noting that, owing to the stochasticity of the neurons, the "greedy" neuron is successfully suppressed during the learning process, which is critical for performing the WTA rule in an unsupervised way 52,56. For comparison, a simulation is performed on neurons without stochasticity. The results show that all patterns are captured by a greedy neuron and clustering cannot be carried out normally (see Supplementary Fig. 11).
Then, we trained the second layer in a supervised way by applying a constant voltage on the shared gates of the 1T1R synapses of the corresponding post-neurons (see methods in the experimental section). The conductance of the synapses is initialized to ~40 µS (low conductance state), as shown in Fig. 5h. Fig. 5i shows the corresponding synaptic conductance evolution of digit "1" in the first 30 training iterations after applying the output signals from neuron 2 in the hidden layer. We noted that the potentiation of the target synapses is completed within one cycle, indicating a fast learning process, because the input pulse is strong enough to directly set the synaptic devices. This one-cycle learning might limit the number of patterns that can be learned in a large-scale network, but this could be alleviated by adopting optimized methods, such as using faster TS devices 48,49 that support shorter pulses to achieve analog switching of the synaptic devices 57, or introducing a synaptic switching probability into the learning process 56,58, which requires further
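The role of the Hamming distance can be made concrete with a short example. The 6 × 5 patterns below are hypothetical stand-ins drawn for illustration and are not the patterns of Fig. 5a. The point is only that two patterns differing in a few pixels drive nearly the same set of synapses, whereas dissimilar patterns do not.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length binary patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 6 x 5 patterns (row by row), NOT the ones used in the experiment.
PATTERN_FIVE = "".join(["11111", "10000", "11111", "00001", "00001", "11111"])
PATTERN_SIX = "".join(["11111", "10000", "11111", "10001", "10001", "11111"])
PATTERN_ONE = "".join(["00100"] * 6)


if __name__ == "__main__":
    print(hamming_distance(PATTERN_FIVE, PATTERN_SIX))
    print(hamming_distance(PATTERN_FIVE, PATTERN_ONE))
```

In these stand-in patterns, the "5"-like and "6"-like shapes differ in only two pixels, so a neuron trained on one receives almost the same summed input from the other.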

Discussion
In this work, a hybrid memristor-CMOS stochastic LIF neuron is designed to enable fully hardware implementation of SNNs. The hybrid neuron has two key features. First, the TS device brings in the highly desirable diffusion dynamics for efficiently and faithfully performing the leaky integrate-and-fire functions. Second, the simple digital circuits serve as an active pump to output "all-or-none" spikes and introduce potentiation, depression, refractory period, and lateral inhibition signals into the neuron circuit. These features render the compact neurons capable of tuning synaptic weights for in situ Hebbian learning. Moreover, the digital module makes the neuron circuit active, which could be used to enable deep spiking neural networks.
It is worth noting that the pulse parameters used in this study serve only as an example for demonstration. The input pulse width could be scaled to the µs or even ns level according to recent experimental data on TS memristors 48,49, which would enable faster computing. Furthermore, the presented hybrid design concept may extend to other memristor technologies, such as NbO2 and VO2, which have shown promising dynamics for emulating spiking neurons 17,20.
To perform unsupervised learning with the WTA rule, an LIA circuit that consists of an RS array and comparators is devised to work with the hybrid neurons. By combining the LIA circuit and hybrid neurons, we further experimentally demonstrated a fully hardware two-layer (30 × 10 × 10) SNN, on which in situ learning operations have been successfully performed. This work paves the way towards hardware implementation of sophisticated yet highly efficient spiking processors by leveraging the advantages of both emerging and CMOS devices.
To perform in situ learning operations on memristive synapses, potentiation, depression, and lateral inhibition signals are introduced into the neuron, and a lateral inhibition array (LIA) is specifically designed. Using the hybrid neurons and the LIA, we further experimentally demonstrate a 30 × 10 × 10 fully hardware multilayer SNN (MSNN) with RS synapses. In this MSNN, the training processes are in situ: 10 hidden neurons perform feature extraction with the LIA in the first layer, and 10 output neurons serve for further recognition in the second layer. The experimental results show that the hybrid neurons can perform in situ tuning of RS synapses and have the potential to build self-adaptive spiking neuromorphic systems.

Fig. 2. Characteristics of the TS device.a, Scanning electron microscope (SEM) image of the

Fig. 4. Lateral inhibition circuits for the winner-take-all learning rule. a, Schematic of the implemented lateral inhibition circuit. The lateral inhibition signals (VL1−VL10) from the neurons serve as the inputs of the LIA, and the outputs of the comparators are applied to the shared gates of the 1T1R synapses. VBIAS: 1.5 V, VRef: 50 mV. b, The pre-programmed weight conductance of the memristor array used in the LIA circuit for ten neurons. c, Two input conditions of the LIA during the lateral inhibition operation; the input signals are the neurons' lateral inhibition signals. d, The corresponding LIA outputs (LG1−LG10) when the two conditions in c serve as the inputs of the LIA. In c and d, the Y-axis unit is volts (V). e, The moment when the neuron outputs the lateral inhibition signal. Two firing cycles are presented.

Fig. 5. Fully hardware multilayer SNNs.a, The digital patterns used for learning, every pattern

Fig. 5b. The 10 winner-take-all hidden neurons receive input signals from the 30 input
study. The other corresponding synaptic devices, whose input terminals are grounded, remain nearly unchanged because the initial low conductance cannot be further programmed. The synaptic weight evolutions of the other neurons are shown in Supplementary Fig. 12. After training, the weight map of the second layer is shown in Fig. 5j. Among each neuron's synapses, only one related synapse is potentiated. The locations of the potentiated synapses correspond exactly to the neurons with the highest firing rate in the hidden layer. Fig. 5k shows the firing rates of the second-layer neurons under different input digital patterns with noise pixels, where clear recognition results are observed (see Supplementary Fig. 13 for spike outputs). These results demonstrate that the proposed neuron circuit can successfully perform in situ learning on RS synapse networks, suggesting that the hybrid neuron circuit has the potential to build a high-density spiking neuromorphic machine with in situ learning capability.