Neuromorphic systems inspired by the human brain offer a promising route around the limitations of traditional computing architectures [1]. These systems feature highly parallel processing, scalability, and low power consumption, making them a viable path toward artificial general intelligence. In the brain, neurons serve as the fundamental units of information processing, while synapses regulate the flow of information by adjusting their weights [2, 3]. While considerable progress has been made in developing artificial synapses, research on low-cost neurons is still in its early stages [4]. Memristive devices, owing to characteristics such as low power consumption and small footprint, have emerged as potential candidates for building neural components [5-10]. Memristive devices based on the diffusion dynamics of silver (Ag) have shown particular promise: their non-volatile dynamics enable the emulation of synaptic plasticity, while their volatile switching effects support neuron functions [11-13]. However, the conventional sandwiched metal-insulator-metal memristor structure falls short of providing the adjustable properties desired in neurons. A common workaround is to connect memristors in parallel with capacitors to realize neuron functions [14-17], but this approach requires a relatively large amount of hardware. Consequently, finding ways to fabricate single-device tunable neurons is a highly relevant and intriguing challenge.
Regarding the application of neural components, one mainstream direction is the construction of spiking neural networks (SNNs), in which neurons serve as the activation functions of the network. In these implementations, an array of synaptic memristors stores the network weights and performs the weighted summation in place. For the neurons, a common approach is to use CMOS neurons [18-21], or memristor-based neurons that require capacitors [22, 23] or auxiliary CMOS circuits [24, 25] to function. Work on single-device neuron implementations therefore remains scarce. Furthermore, previous SNN studies have emphasized the synapses and their weights, treating the neurons primarily as simple excitatory units [11]. Real biological neurons, however, exploit various forms of adjustability to generate attention mechanisms and thereby process mixed-modal information economically [26]. To date, little research has treated neuron adjustability itself as a lever for optimizing the processing of mixed-modal information.
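As a point of reference, the weighted summation performed by a synaptic memristor array can be captured by an idealized model in which input spike voltages drive the rows, each device's conductance stores one weight, and the column currents accumulate the products via Ohm's and Kirchhoff's laws. The sketch below is a minimal illustration under these idealized assumptions; the array size and values are arbitrary, and wire resistance, noise, and device nonlinearity are ignored.

```python
# Idealized memristor-crossbar weighted sum: input spike voltages drive the
# rows, each device conductance stores one weight, and each column current is
# the dot product of voltages and conductances (Ohm's and Kirchhoff's laws).
# Non-idealities (wire resistance, noise, nonlinearity) are ignored here.
import numpy as np

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))    # conductances (S): 4 rows x 3 columns
v_in = np.array([0.1, 0.0, 0.1, 0.1])       # spike voltages (V) at one time step

i_out = v_in @ G                             # column currents (A): weighted sums
print(i_out)                                 # each current drives one output neuron
```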
In this work, we propose a gate-tunable MoS2 memristive neuron. This memristive neuron replicates essential neuronal functions within a single device, underscoring its rich dynamics: leaky integration, threshold-driven firing, firing-threshold tuning, and self-recovery. Building on these tunable neurons, we further propose an early-fusion multimodal SNN in which memristive neurons are configured with different activation modes for different modalities, achieving excellent recognition performance (95.45%) while significantly reducing hardware cost (by 49%). This study illustrates the gains available from integrating tunable neurons into neural network architectures, paving the way for efficient hardware implementations.
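To make these neuron functions concrete, the following is a minimal behavioral sketch of a leaky integrate-and-fire neuron with a tunable firing threshold and a self-recovery period. The parameter names and the mapping from gate bias to threshold are illustrative assumptions, not the measured device physics.

```python
# Minimal behavioral sketch of a leaky integrate-and-fire (LIF) neuron with a
# tunable firing threshold and a self-recovery (refractory) period. Parameter
# names and the gate-to-threshold mapping are illustrative assumptions.

class TunableLIFNeuron:
    def __init__(self, leak=0.9, v_th=1.0, v_reset=0.0, recovery_steps=2):
        self.leak = leak                    # leaky-integration factor per step
        self.v_th = v_th                    # firing threshold (set via the gate)
        self.v_reset = v_reset              # membrane value after a spike
        self.recovery_steps = recovery_steps
        self.v = v_reset
        self.recovering = 0

    def set_gate(self, v_th):
        """Emulate gate tuning: the back-gate bias shifts the firing threshold."""
        self.v_th = v_th

    def step(self, i_in):
        """Integrate input i_in for one step; return 1 if the neuron fires."""
        if self.recovering > 0:             # self-recovery: input is ignored
            self.recovering -= 1
            return 0
        self.v = self.leak * self.v + i_in  # leaky integration
        if self.v >= self.v_th:             # threshold-driven firing
            self.v = self.v_reset
            self.recovering = self.recovery_steps
            return 1
        return 0

# A lower threshold makes the neuron fire more often for the same input train.
low, high = TunableLIFNeuron(), TunableLIFNeuron()
high.set_gate(3.0)
print(sum(low.step(0.4) for _ in range(20)),   # more spikes at v_th = 1.0
      sum(high.step(0.4) for _ in range(20)))  # fewer spikes at v_th = 3.0
```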
Schematics of the proposed tunable neuron device and the low-cost SNN architecture are shown in Fig. 1. Figure 1a and 1b show a conventional memristor neuron and our proposed gate-tunable memristive neuron, respectively, together with the difference in their signal responses. Compared to a traditional two-terminal device, the proposed neuron adds an extra port, making it a three-terminal device. The back gate provides tunability, enabling more diverse functionality; for this device, tunability means the neuron can exhibit different excitation patterns. We leverage this trait to construct a multimodal recognition neural network.

As Fig. 1c shows, the prevailing architecture for multimodal fusion combines weights from different modalities at a late stage, which incurs significant hardware cost. An early-fusion network reduces the number of weights by 49% (Fig. 1d). However, early fusion with conventional neurons lowers hardware overhead at the expense of accuracy. This issue is well addressed by the multimodal recognition architecture with tunable neurons shown in Fig. 1e: our model shares the synaptic architecture at an early stage and exploits the adjustability of the neurons for multimodal regulation. By fully utilizing the neurons to introduce attention mechanisms, we achieve improved outcomes.

Figure 1f-h illustrates this effect in the hidden layers of a neural network, depicting the mapping from weights to the excitation of neuron signals. Figure 1f shows late fusion, where a larger number of weight matrices elicits a diverse set of signals. Figure 1g shows early fusion, where a limited number of weight matrices triggers a reduced set of signals. Figure 1h shows that introducing tunable neurons enables the activation of diverse signals with a minimal number of weight matrices, combining low hardware cost with high recognition accuracy. Our approach also extends to more than two modalities: for n modalities, we partition the tunable neurons into n groups, each stimulated to represent a distinct modality, which allows recognition tasks across all n modalities with favorable outcomes.

Figure 1 also compares hardware costs against prior neuron designs, highlighting the low-cost advantage of our tunable neuron, and shows that the proposed multimodal network achieves both low hardware cost and high recognition accuracy [27, 29-31]. In fully connected neural networks, the main hardware cost comes from the large number of synapses, each of whose conductances stores one weight. Compared with the studies of Neverova et al. [27] and Vielzeuf et al. [28], our number of weights decreases by 42% and 64%, respectively, while accuracy improves.
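To see where the weight saving comes from, consider a back-of-the-envelope count for fully connected networks. The layer sizes below are hypothetical and chosen only to illustrate the effect, not the exact dimensions of our network; under the assumption that early fusion lets one shared synapse array serve both modalities (as the tunable neurons allow), the shared network needs roughly half the weights of two late-fusion branches, in line with the reported 49% reduction.

```python
# Hypothetical weight counts for late vs. early fusion in fully connected SNNs.
# Assumes both modalities share the same input dimension d and that the
# early-fusion network reuses one synapse array across modalities, with the
# gate-tunable neurons switching activation modes per modality.

def late_fusion_weights(d, h, c):
    # One hidden layer per modality; both branches feed the c class outputs.
    return 2 * (d * h) + 2 * (h * c)

def early_fusion_weights(d, h, c):
    # A single shared hidden layer and classifier serve both modalities.
    return d * h + h * c

d, h, c = 196, 100, 10          # illustrative input, hidden, and class sizes
late = late_fusion_weights(d, h, c)
early = early_fusion_weights(d, h, c)
print(late, early)              # 41200 vs 20600 weights
print(f"{1 - early / late:.0%} fewer weights")  # 50%, close to the reported 49%
```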