Figure 1a shows the lattice structure of VP, which appears to be monoclinic with space group P2/n and lattice constants a = 9.210 Å, b = 9.128 Å, c = 21.893 Å, and β = 97.776°12. VP generally shows a bi-tubular structure, with layers stacked vertically along the z-direction, as shown in the lower panel of Fig. 1a. The Raman spectrum of the VP-MoS2 heterostructure excited by a 532 nm laser is shown in Fig. 1b. The Raman peaks at 183, 211, and 278 cm⁻¹ correspond to the vibration modes of VP atoms, while the peaks at 361, 379 and 476 cm⁻¹ correspond to the stretching of the atom cages as a whole. The complex Raman spectrum indicates a large density of phonon states in VP, which further contributes to the strong light-matter interaction and electron-phonon scattering18. In addition, few-layered VP possesses a direct bandgap of ~2.54 eV12 and usually behaves as an n-type semiconductor, which indicates its potential for optoelectronic devices. On this basis, we built VP phototransistors and measured their characteristics, as shown in Fig. 1c,d. Owing to its unique lattice and bandgap characteristics, the VP-based photodevice shows an extremely low dark current (~fA) and a high light-to-dark ratio (~10⁵). In addition, the VP phototransistors were exposed to ambient air without any encapsulation and still exhibited an on/off ratio of ~100 and a light-to-dark ratio of ~10³ after 14 days, whereas black phosphorus completely loses its on/off characteristics within 2 days19, indicating that VP is far more stable in ambient conditions than black phosphorus, as shown in Fig. 1d (for original data see Supplementary Figure S1). However, the photo-response of VP phototransistors is generally instantaneous (see Figure S2 in Supplementary Information), which is not suitable for synaptic applications, so a heterostructure was introduced. The device schematic and process flow of our heterostructure device are shown in Fig. 1e and Fig. 1f, respectively. 
MoS2 and VP were successively transferred to a 300 nm-thick silicon oxide substrate to form the vertically stacked heterostructure (Device I). Electron beam lithography (EBL) was used to form the electrode pattern, and electron beam evaporation (EBE) was used to deposit the source and drain electrodes. The top gate dielectric (30 nm HfO2) was then deposited by atomic layer deposition (ALD) on a 1 nm SiO2 seed layer deposited by EBE, followed by top gate electrode deposition. Another MoS2 device (Device II) without VP was built from the same materials as the control. The VP stacked on top of MoS2 works as a photogate, and the thicknesses of the dielectric and top gate were carefully chosen to ensure the transmission of light. Figure 1g shows the scanning electron microscope (SEM) image of the heterostructure synaptic device. High-resolution scanning transmission electron microscopy (STEM) was used to characterize the device microstructure. Figure 1h–j shows cross-sectional HAADF STEM images of the VP-MoS2 heterostructure at different magnifications, which reveal a layered van der Waals structure consistent with the lattice in Fig. 1a. The STEM image in Fig. 1j shows a single-layer VP thickness of ~2.2 nm, which is consistent with the theoretical value12. Figure 1k shows the STEM image of the VP-MoS2 heterostructure and the corresponding energy-dispersive spectroscopy (EDS) mapping, which exhibits a clean van der Waals interface between the VP and MoS2 layers. The heterostructure region was further investigated using atomic force microscopy (AFM), as provided in Figure S3 in the Supplementary Information. The thicknesses of VP and MoS2 measured by AFM are ~6.66 nm and ~5.85 nm, corresponding to 3 and 9 layers, respectively.
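The layer counts quoted above follow from dividing the AFM thickness by the single-layer thickness. The short sketch below reproduces that arithmetic. The VP single-layer thickness (~2.2 nm) is taken from the STEM result in the text; the MoS2 monolayer thickness of 0.65 nm is an assumed textbook value that the text does not state, and the function name is illustrative.

```python
# Layer-count estimate from AFM thickness.
VP_MONOLAYER_NM = 2.2     # single-layer VP thickness reported from STEM
MOS2_MONOLAYER_NM = 0.65  # ASSUMPTION: typical monolayer MoS2 thickness, not from the text


def estimate_layers(thickness_nm: float, monolayer_nm: float) -> int:
    """Round the measured flake thickness to the nearest whole number of layers."""
    return round(thickness_nm / monolayer_nm)


vp_layers = estimate_layers(6.66, VP_MONOLAYER_NM)      # AFM thickness of VP
mos2_layers = estimate_layers(5.85, MOS2_MONOLAYER_NM)  # AFM thickness of MoS2
print(vp_layers, mos2_layers)
```

Under these assumptions the estimate agrees with the 3 and 9 layers reported for VP and MoS2.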
The operating principle of our VP-MoS2 synaptic device is described in Fig. 2a. When a voltage is applied to the top gate, the device exhibits n-type on/off switching, as shown by the dashed line in the left panel of Fig. 2a. Subsequently, when a weak light stimulus is applied, a negative threshold shift of several volts can be observed, leading to a “threshold window”, which is shown by the solid line in the left panel of Fig. 2a. Unlike the ordinary photocurrent effect, where the off-state current increases markedly under illumination, here the photostimulation mainly acts as a trigger that switches the on/off state instead of increasing the off-state current. Similar phenomena can be found in other heterostructures as well. By carefully selecting the operation point, the former off-state can be transformed into the on-state by light, resulting in a record-large optical dynamic range (over 10⁶), far better than most analog switching memories (10–100), as shown in the right panel of Fig. 2a. It is worth noting that the operation point is typically negative, which means that our device can maintain a high response at a lower off-state current than most mainstream analog memories, leading to a great improvement in dynamic range. To further investigate the principle of the threshold shift, we constructed an energy band model for the heterostructure device (Device I), as shown in Fig. 2b. The theoretical bandgaps of MoS2 and VP are about 1.9 eV20 and 2.54 eV12, respectively. When stacked together without external voltage (flat band), VP and MoS2 form a type-II heterostructure, with the bottom of the conduction band of MoS2 higher than that of VP and the top of the valence band lower than that of VP. When a negative voltage is applied to the top gate, the device changes from the flat band to the programmed state. Photostimulation is then applied to the device in the programmed state to induce photogenerated electron-hole pairs. 
According to the energy band structure, electrons and holes move in opposite directions. Holes move toward VP and are trapped by the potential well at the interface of VP and HfO2, while electrons move toward MoS2, leading to an increase in transient channel current. (The properties of the VP-MoS2 heterostructure device without the top gate dielectric are shown in Supplementary Figure S4; there the separated photogenerated holes are not efficiently trapped, making the threshold shift rather weak and leading to a limited dynamic range.) Upon withdrawal of the light stimulus, the trapped holes contribute to the conductivity of the channel, thus resulting in a negative threshold shift. Applying a positive voltage to the top gate changes the band structure and releases the trapped holes. As a result, the device is reset to the initial state. In this way, the heterostructure device shows the coexistence of optical potentiation and electrical inhibition, which mimics biological excitatory and inhibitory plasticity and can be mapped to artificial neural networks for neuromorphic computing. As a comparison, we constructed another energy band model without VP (Device II) to further investigate the role played by VP in this process (see Supplementary Figure S5). Without the strong light-matter interaction of VP, MoS2 produces limited electron-hole pairs. More importantly, in this case the separated electrons and holes cannot be trapped efficiently, so only a transient photocurrent exists without a significant threshold change. To confirm the validity of our energy band model, we further used COMSOL Multiphysics to conduct a finite element simulation based on the above theoretical analysis. More details about the simulation can be found in Section 6 in Supplementary Information. Figure 2c depicts the distribution of holes in the heterostructure device under different conditions. 
When a positive voltage is applied at the top gate without light stimulation, the heterostructure device is in the on-state with a low hole density, because both VP and MoS2 are n-type semiconductors. When light stimulation is applied to the device, electron-hole pairs are generated and separated, with holes moving towards VP and becoming trapped, resulting in the distribution shown in the upper panel of Fig. 2c. In this case, although the overall carrier concentration increases, the device current does not change much because the device is still in the on-state. However, when a negative voltage is applied, the heterostructure device is set to the off-state. When light stimulation is then applied, the accumulation of holes in VP leads to a threshold voltage shift, which turns the device from a non-conductive state into a conductive state, leading to a large dynamic range through light stimulation, as shown in the lower panel of Fig. 2c.
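The link between a light-induced threshold shift and a large dynamic range can be illustrated with a toy transfer curve. The sketch below is not the COMSOL model: it assumes an exponential subthreshold rise clipped between a floor and a ceiling. The on-state and off-state levels (~1 µA and ~100 fA) follow the currents quoted in the text, while the subthreshold slope, the operation point and the two threshold voltages are hypothetical illustrative values.

```python
def drain_current(v_tg, v_th, i_on=1e-6, i_off=1e-13, ss=0.25):
    """Toy n-type transfer curve.

    The current rises by one decade per `ss` volts of gate voltage and is
    clipped between i_off (floor) and i_on (ceiling). In this simplified
    model v_th is the gate voltage at which the current reaches i_on.
    All default parameters are illustrative assumptions.
    """
    decades_below_on = (v_th - v_tg) / ss
    current = i_on * 10.0 ** (-decades_below_on)
    return min(i_on, max(i_off, current))


V_OP = -6.0  # hypothetical operation point inside the threshold window

i_dark = drain_current(V_OP, v_th=-4.0)   # before illumination: device is off
i_light = drain_current(V_OP, v_th=-8.0)  # after a negative threshold shift: device is on
ratio = i_light / i_dark
print(f"dark {i_dark:.1e} A, light {i_light:.1e} A, ratio {ratio:.1e}")
```

Because the operation point sits between the two threshold voltages, the light stimulus moves the device from the current floor to the current ceiling, so the ratio is set by the on/off ratio of the transistor rather than by a photocurrent added to the off-state.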
Experiments were carried out to verify these analyses. The upper panel of Fig. 3a shows the transfer curve of the VP-MoS2 heterostructure device under varying illumination, where the source-drain voltage (Vds) is fixed at 1 V and the laser wavelength is 473 nm. A clear threshold shift can be observed for illumination as low as 6 µW, and it grows with increasing intensity. The actual laser intensity reaching the device may be lower, taking into account the influence of the top gate dielectric and electrodes. As the intensity increases to 20 µW, the threshold voltage changes from about −4 V to below −8 V, which is desirable for a large dynamic range. According to the transfer curve, when the top gate voltage (Vtg) is fixed in this range (−4 V to −8 V), a 20 µW optical stimulation can lead to a light on/off ratio of over 10⁶. In addition, we measured the transfer curve of Device II as a control, as shown in the lower panel of Fig. 3a. In this device, almost no change in threshold voltage is observed even when strong light of 30 mW is applied, which is in line with our previous analysis. To visualize the threshold shift of the VP-MoS2 device, we collected the source-drain currents for different top gate voltages and different light intensities, as shown by the color map in Fig. 3b (see original curves in Figure S8 in Supplementary Information). Taking 10 nA as the boundary between on and off, the dashed line reflects the shift in threshold voltage: the threshold changes from −12 V to −16 V, which leaves a “threshold window” of ~4 V, sufficient for synaptic device operation. We chose an operation point of −4 V and measured the output curve under varying light intensities, as shown in Fig. 3c. At this operation point, the output current increases to around 1 µA under 8 µW illumination, which is consistent with the transfer curve. 
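The 10 nA criterion used above to locate the threshold can be applied to any measured transfer curve. The following is a minimal sketch of such an extraction, assuming a sweep stored as two lists and linear interpolation in log10(current) between neighbouring sweep points. The sweep data in the example are synthetic and purely illustrative, not measured values.

```python
import math


def threshold_voltage(v_tg, i_ds, i_crit=10e-9):
    """Gate voltage at which the current first rises through i_crit.

    Interpolates linearly in log10(current) between the two sweep points
    that bracket the crossing. Returns None if the curve never crosses.
    """
    for k in range(1, len(v_tg)):
        lo, hi = i_ds[k - 1], i_ds[k]
        if lo < i_crit <= hi:
            frac = (math.log10(i_crit) - math.log10(lo)) / (
                math.log10(hi) - math.log10(lo)
            )
            return v_tg[k - 1] + frac * (v_tg[k] - v_tg[k - 1])
    return None


# Synthetic sweep (illustrative only): gate voltage in V, current in A.
sweep_v = [-10.0, -8.0, -6.0, -4.0, -2.0]
sweep_i = [1e-13, 1e-13, 1e-11, 1e-7, 1e-6]
v_th = threshold_voltage(sweep_v, sweep_i)
print(v_th)
```

Applying the same function to curves taken at different light intensities and subtracting the results would give the width of the threshold window.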
This phenomenon has been repeatedly verified in different batches of devices with this heterostructure (Figure S7 in Supplementary Information). Although the threshold voltage varies slightly with the thickness of the 2D materials, the phenomenon is reproduced in each device, which means that this is a stable strategy for improving the optical dynamic range. Next, we stimulated the device with a single laser spike and measured its response, as shown in Fig. 3d. In the context of neuromorphic computing, the laser spike simulates the presynaptic input, and the channel current is monitored as the post-synaptic current (PSC). The stimuli are applied at different operation points (different Vtg). Although the base current of the device decreases with increasing Vtg, the PSCs almost all reach the µA level, which is consistent with the on-state current of our heterostructure device. As a result, the excitement ratio (PSC/base current) increases rapidly as Vtg increases, and can exceed 10⁶ thanks to the extremely low dark current of VP. Similar tests were carried out using Device II (see Figure S9 in Supplementary Information). A comparison of the two devices is shown in Fig. 3e, where the squares represent the performance of the VP-MoS2 heterostructure device (Device I) and the circles represent the MoS2 transistor (Device II). To eliminate the effect of different original threshold voltages, the x-axis is expressed as base current rather than top gate voltage. Regarding the PSC amplitude (left axis), the MoS2 transistor shows a high PSC amplitude only when the base current itself is relatively high, whereas VP-MoS2 exhibits a high PSC amplitude irrespective of the base current. This difference leads to an improvement of at least three orders of magnitude in excitement ratio (right axis), representing a stronger synaptic response to the stimulus.
We further explored the potential of the VP-MoS2 device for the simulation of synaptic plasticity and behavior. Dual laser pulses with different intervals were applied to the device to test the paired-pulse facilitation (PPF) characteristics, which are essential for simulating biological short-term plasticity (STP). The inset in Fig. 4a shows an output waveform, where the base current is around 100 fA, and A1 and A2 represent the amplitudes after the first and second laser pulses. The second pulse produces a stronger response than the first, and the current quickly recovers to the base current, showing typical PPF characteristics. Waveforms for different interval times are shown in Figure S10 in Supplementary Information. When the interval is as low as 150 ms, our device shows a fairly high PPF index (defined as A2/A1) of up to 853%, which gradually recovers to 100% as the interval increases above 3000 ms. The dashed line indicates the fit to the experimental points, which obeys an exponential decay, consistent with the theoretical result21. Moreover, the long-term plasticity of the heterostructure synaptic device was explored by increasing the number of laser pulses. As shown in Fig. 4b, 30 laser spikes with different intensities were applied to achieve a progressive excitatory PSC modulation, which simulates long-term potentiation (LTP). The PSC gain (defined as A30/A1) increases significantly as the base current A0 decreases (increase in Vtg) and as the laser power increases. Figure 4c shows the PSC gain (An/A1) as a function of top gate voltage and pulse number. When the pulse number is as low as 5, the synaptic device generally exhibits STP, as shown in the left panel of Fig. 4c. In this case, the PSC increases linearly with the accumulation of laser spikes and falls back to the base current a few seconds after the removal of the laser. 
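The PPF index and its exponential recovery towards 100% can be written down compactly. The sketch below assumes a single-exponential decay; the text states only that the fit is exponential, so the functional form with one time constant and the value of that time constant (500 ms) are assumptions. The amplitude is chosen so that the curve passes through the reported 853% at a 150 ms interval.

```python
import math


def ppf_index(a1, a2):
    """PPF index in percent, defined as A2/A1."""
    return 100.0 * a2 / a1


def ppf_decay(interval_ms, amplitude, tau_ms, baseline=100.0):
    """Single-exponential relaxation of the PPF index towards the baseline."""
    return baseline + amplitude * math.exp(-interval_ms / tau_ms)


TAU_MS = 500.0  # ASSUMPTION: hypothetical time constant, not a fitted value

# Choose the amplitude so that the model reproduces 853% at 150 ms.
amplitude = (853.0 - 100.0) / math.exp(-150.0 / TAU_MS)

index_150 = ppf_decay(150.0, amplitude, TAU_MS)
index_3000 = ppf_decay(3000.0, amplitude, TAU_MS)
print(f"{index_150:.1f}% at 150 ms, {index_3000:.1f}% at 3000 ms")
```

With this assumed time constant the index is within a few percent of 100% at 3000 ms, which matches the qualitative recovery described above; fitting the measured points would replace both the amplitude and the time constant.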
The device shows a clear transition from STP to LTP as the pulse number increases, with an increase in PSC gain and retention time (Figure S11 and Figure S12 in Supplementary Information). In addition, a higher top gate voltage also significantly increases the PSC gain, leading to a large dynamic range of over 10⁶ (Figure S12 in Supplementary Information). However, the extremely high dynamic range is achieved at the expense of linearity and retention time, which are also critical for neuromorphic computing. To balance all these key metrics, the operation point was carefully selected to achieve a large dynamic range (~10⁶) with 30 conductance states, fair linearity (1.31) and retention time (~40 s), as shown by the red curve in Fig. 4d (for more details see Figure S13 in Supplementary Information), which could meet the requirements of high-precision neuromorphic computing4. A large dynamic range means the potential to contain more conductance states. To further investigate the capability of the heterostructure device to provide distinguishable conductance multi-states, we applied more pulses for both potentiation and depression. Figure 4d shows the normalized long-term synaptic plasticity using optical pulses for potentiation and electrical pulses for depression under different conditions. By applying 128 light pulses for stimulation, 128 stable, non-crossing, and distinguishable conductance states are generated. The original waveforms are shown in Figure S14 in Supplementary Information, with a base state of around 100 fA; after 128 pulses of stimulation, the current rises to ~µA, implying a dynamic range of over 10⁶, which indicates a strong capability to map synaptic device conductance to neural network weights. In addition, by applying over 200 pulses, we can obtain up to ~180 distinguishable conductance states (original waveforms shown in Figure S15 in Supplementary Information). Figure 4e depicts the waveforms of each state extracted from Fig. 
4d, which are distinguishable and stable. The device also exhibits an extremely high An/A1 of ~10⁴, showing the potential to obtain more conductance states. However, limited by the instability of the light stimulation and the strong light response of the device, state intervals smaller than ~nA are hardly stable, which limits any further increase in the number of states. The dynamic ranges and numbers of conductance states of synaptic devices based on emerging analog memories are summarized in Fig. 4f. The devices are divided into four categories according to their working mechanisms: charge-trapping electrical devices22–27, ferroelectric devices28–33, electrical memristors34–38, and optoelectronic devices10, 39–43, which are based on a variety of promising materials, including 2D, organic, oxide (e.g. indium gallium zinc oxide) and perovskite semiconductors. Thanks to its extremely low off-state current, our VP-MoS2 heterostructure synapse exhibits a record-high dynamic range of over 10⁶, as well as 128 distinguishable conductance multi-states, far outperforming current mainstream analog memories and providing a new strategy for improving the dynamic range and the number of states in synaptic devices.
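The interplay between dynamic range, number of states and minimum state spacing can be sketched with a saturating potentiation curve. The example below uses an exponential-saturation form that is common in device-level simulators; it is an assumed model, not the measured characteristic of the device. The current levels (100 fA to 1 µA) and the 128 pulses follow the text, whereas the shape parameter `a` is hypothetical and is not derived from the reported linearity value of 1.31.

```python
import math


def ltp_level(pulse, n_pulses, i_min, i_max, a):
    """Read current after `pulse` potentiation pulses (0 <= pulse <= n_pulses).

    Exponential-saturation form: level = i_min + b * (1 - exp(-pulse / a)),
    with b chosen so that the last pulse reaches i_max. A smaller `a`
    gives a more nonlinear curve. ASSUMED model, not fitted to data.
    """
    b = (i_max - i_min) / (1.0 - math.exp(-n_pulses / a))
    return i_min + b * (1.0 - math.exp(-pulse / a))


N_PULSES = 128
I_MIN, I_MAX = 1e-13, 1e-6  # base state ~100 fA, final state ~1 uA (from the text)
A_SHAPE = 60.0              # ASSUMPTION: hypothetical shape parameter

levels = [ltp_level(p, N_PULSES, I_MIN, I_MAX, A_SHAPE) for p in range(N_PULSES + 1)]
steps = [hi - lo for lo, hi in zip(levels, levels[1:])]

dynamic_range = levels[-1] / levels[0]
print(f"dynamic range {dynamic_range:.1e}, smallest step {min(steps):.2e} A")
```

In this sketch the smallest spacing between neighbouring states is a few nA, so adding more pulses or a stronger nonlinearity would push the late steps towards the ~nA stability limit mentioned above.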
Finally, we used the NeuroSim multilayer perceptron (MLP) neural network simulator44 to validate the ability of the VP-MoS2 synaptic device, with its large dynamic range, to perform high-complexity image classification tasks. The neural network used is shown in Fig. 5a and consists of an input layer, a hidden layer and an output layer. Each neuron node in one layer is connected to each node in the following layer, forming a fully connected neural network. Neuron nodes are connected via synaptic devices, and device conductance represents network weights. WIH and WHO represent the weight matrices between the input and hidden layers and between the hidden and output layers, respectively. We made the necessary modifications to the original network model in the NeuroSim simulator to make it suitable for our classification tasks. We performed image classification on two standard datasets: MNIST and Fashion-MNIST (an MNIST-like dataset with higher complexity)45. The input image data are pre-processed into the grayscale value of each pixel as the input layer (20×20 for MNIST and 28×28 for Fashion-MNIST). The network contains 100 hidden neurons and 10 output neurons (corresponding to 10 kinds of labels, i.e. handwritten digits or objects). More details about the simulation can be found in Sections 16–19 in Supplementary Information. Figure 5b shows the distribution of weights before and after MNIST training for both WIH and WHO. The initial weights for both matrices are set randomly from 7 states (0, ±1, ±0.33, ±0.66), and the weights after training are updated to 128 states, corresponding to the 128 conductance states of the VP-MoS2 device, indicating the update of the network weights.
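The network topology described above (400 inputs for 20×20 MNIST, 100 hidden neurons, 10 outputs) can be sketched in a few lines. This is not the NeuroSim code: the sigmoid activation, the evenly spaced weight levels and the random input are assumptions made for illustration, and real device states are generally not evenly spaced. The 7 initial states and the 128 weight levels follow the text.

```python
import math
import random


def quantize(w, n_states=128, w_min=-1.0, w_max=1.0):
    """Snap a weight to the nearest of n_states evenly spaced levels (assumed spacing)."""
    step = (w_max - w_min) / (n_states - 1)
    clipped = min(max(w, w_min), w_max)
    return w_min + round((clipped - w_min) / step) * step


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(pixels, w_ih, w_ho):
    """Fully connected forward pass: input -> hidden -> output."""
    hidden = [sigmoid(sum(p * w for p, w in zip(pixels, row))) for row in w_ih]
    return [sigmoid(sum(h * w for h, w in zip(hidden, row))) for row in w_ho]


rng = random.Random(0)
INIT_STATES = (0.0, 1.0, -1.0, 0.33, -0.33, 0.66, -0.66)  # 7 initial states from the text

w_ih = [[rng.choice(INIT_STATES) for _ in range(400)] for _ in range(100)]  # W_IH
w_ho = [[rng.choice(INIT_STATES) for _ in range(100)] for _ in range(10)]   # W_HO

pixels = [rng.random() for _ in range(400)]  # stand-in for one 20x20 grayscale image
outputs = forward(pixels, w_ih, w_ho)

# After a (hypothetical) training update, a weight is snapped to one of 128 levels.
w_ih[0][0] = quantize(w_ih[0][0] + 0.01)
print(len(outputs), w_ih[0][0])
```

In a device-level simulation each quantized level would correspond to one of the conductance states of the synaptic device rather than to an ideal evenly spaced grid.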
The classification accuracy as a function of training epochs is shown in Fig. 5c. According to the simulation, our device eventually reaches an accuracy of 95.23% for MNIST and 79.65% for Fashion-MNIST, close to those of ideal devices (95.47% and 79.95%). As the original algorithm of the simulator is designed specifically for the 20×20 MNIST dataset, the learning accuracy for the Fashion-MNIST dataset is relatively low, but the comparison with the ideal device still reflects the advantage of our device. The detailed parameters and results of our simulation can be found in Table S2 in Supplementary Information. It is worth mentioning that in the actual classification process, the conductance states of real devices are mapped to the synaptic weights in the algorithm, where the lowest conductance state is mapped to weight 0. However, due to the physical limitations of the device, an absolute weight of 0 is unreachable, which affects the entire conductance-weight mapping and ultimately leads to an accuracy deviation between the ideal condition and the physical device5. Compared with most mainstream analog memories, our heterostructure synaptic device has a lower off-state current and can minimize such deviation, thus leading to negligible error relative to ideal cases. We therefore further investigated the dependence of classification accuracy on dynamic range. Figure 5d shows the final classification accuracy for different dynamic ranges and numbers of conductance states while keeping other parameters constant (consistent with the previous simulation for the VP-MoS2 heterostructure device). In these cases, the dynamic range is changed by decreasing the off-state while keeping the on-state fixed. 
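The effect of a non-zero off-state on the weight mapping can be made concrete with a small example. The sketch below assumes that weights are obtained by normalising the conductance by the on-state conductance, so that the smallest reachable weight equals the inverse of the dynamic range. This normalisation is an illustrative assumption, and the mapping used inside the simulator may differ.

```python
def weight_from_conductance(g, g_max):
    """Normalise a conductance by the on-state conductance (assumed mapping)."""
    return g / g_max


def weight_floor(dynamic_range):
    """Smallest reachable weight when the on-state maps to weight 1."""
    return 1.0 / dynamic_range


for dr in (10.0, 100.0, 1e6):
    print(f"dynamic range {dr:.0e}: weight floor {weight_floor(dr):.0e}")
```

Under this assumption a device with a dynamic range of 10 cannot represent weights below 0.1, whereas a dynamic range of 10⁶ leaves a residual of only 10⁻⁶, which is one way to read the negligible deviation from the ideal case.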
Simulation results show that for simple classification tasks like MNIST, accuracy is strongly suppressed when the dynamic range is below 10, and that for dynamic ranges over 100, a further increase in dynamic range can still improve learning accuracy. In addition, an increase in the number of states also improves learning accuracy significantly. In classification tasks with higher complexity, like Fashion-MNIST, this dependence becomes more pronounced. In Fig. 5e, we further investigated the learning performance of different synaptic devices in image classification tasks of different complexity. The x-axis shows the complexity of the datasets, which is defined as the average information entropy of the images (see Section 19 in Supplementary Information), and the y-axis shows the error between ideal cases (software implementation) and physical devices. In those cases, most synaptic devices show relatively high error (mostly over 2%) due to their relatively low dynamic range and smaller number of states22, 26, 30, 38, 43, 46–50. By combining a large dynamic range with a sufficient number of states, our VP-MoS2 synaptic device shows negligible error relative to ideal cases, even for the high-complexity Fashion-MNIST classification task.
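The complexity metric on the x-axis of Fig. 5e, the average information entropy of the images, can be illustrated as follows. The exact definition is given in the Supplementary Information and is not reproduced here; the sketch assumes the Shannon entropy of the grey-level histogram of each image, averaged over the dataset. The two tiny images in the example are synthetic.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy in bits of the grey-level histogram of one image.

    `pixels` is a flat sequence of grey levels. ASSUMED definition.
    """
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_entropy(images):
    """Mean per-image entropy over a dataset."""
    return sum(image_entropy(img) for img in images) / len(images)


flat_image = [0] * 16          # a single grey level carries no information
two_level_image = [0, 255] * 8  # two equally likely grey levels carry 1 bit
dataset_entropy = average_entropy([flat_image, two_level_image])
print(dataset_entropy)
```

A dataset with richer grey-level statistics, such as Fashion-MNIST compared with MNIST, would be expected to score higher on a metric of this kind.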