Bioinspired and Low-Power 2D Machine Vision with Adaptive Machine Learning and Forgetting

Natural intelligence has many dimensions, with some of its most important manifestations being tied to learning about the environment and making behavioral changes. In primates, vision plays a critical role in learning. The underlying biological neural networks contain specialized neurons and synapses which not only sense and process visual stimuli but also learn and adapt with remarkable energy efficiency. Forgetting also plays an active role in learning. Mimicking the adaptive neurobiological mechanisms for seeing, learning, and forgetting can, therefore, accelerate the development of artificial intelligence (AI) and bridge the massive energy gap that exists between AI and biological intelligence. Here, we demonstrate a bioinspired machine vision system based on a 2D phototransistor array fabricated from large-area monolayer molybdenum disulfide (MoS2) and integrated with an analog, nonvolatile, and programmable memory gate-stack; this architecture not only enables dynamic learning and relearning from visual stimuli but also offers learning adaptability under noisy illumination conditions at minuscule energy expenditure. In short, our "all-in-one" hardware vision platform combines "sensing", "computing", and "storage" to not only overcome the von Neumann bottleneck of conventional complementary metal-oxide-semiconductor (CMOS) technology but also to eliminate the need for peripheral circuits and sensors.

An intelligent system, natural or artificial, is one that monitors its environment, learns or remembers key information, and adapts to changes as necessary. Animals do this seamlessly, often with very limited resources and in challenging ecological conditions. Their success can be attributed to the underlying biological neural networks (BNNs), which not only correlate and collocate the neural primitives for 'sensing', 'computing', and 'storage', drastically reducing the energy expenditure for many difficult tasks, but also learn and adapt, ensuring the survival of the species even in the most resource-constrained environments.
The world we 'know' is a result of the perception enabled by our sensory organs. Information embedded in the outside world takes multiple sensory pathways and associated transformations before it reaches the brain, which then processes it to produce a wide variety of outcomes and sensations and, above all, aids in learning and memory formation. In primates (including humans), vision constitutes a major portion of information input, more than all the other sensory inputs combined. Hence, a major part of the brain is devoted to processing visual stimuli, highlighting the importance of the visual system in learning [1].
Drawing inspiration from the biological intelligence observed in visual animals, machine learning and machine vision are pushing the limits of artificial intelligence (AI) in our everyday life, from defeating professional players in the game of "Go" [2] to driving autonomous vehicles on crowded streets [3]. Recent years have seen significant progress in artificial neural networks (ANNs) [4], which are high-level abstractions of BNNs, i.e. neurons connected to other neurons through synapses. Software ANNs and their different incarnations, such as deep neural networks (DNNs) [5], convolutional neural networks (CNNs) [6], and, more recently, bio-realistic and event-driven spiking neural networks (SNNs) [7], have shown remarkable success in multiple applications including image processing, pattern classification, and solving complex optimization problems. Their hardware implementations have primarily relied on conventional Si-based complementary metal-oxide-semiconductor (CMOS) technology [8-10]. However, unlike BNNs, where the "computing" primitives, i.e. neurons, and "storage" units, i.e. synapses, are collocated, the von Neumann architecture used by Si CMOS separates "compute" from "memory", leading to orders of magnitude higher energy expenditure compared to what the brain requires for similar tasks [11-13]. Non-von Neumann computing architectures based on field-programmable gate arrays (FPGAs) [14] and resistive random-access memory (RRAM) [15-20] bridge the gap between "memory" and "compute" and offer energy-efficient alternatives for hardware implementation of ANNs. However, these in-memory compute architectures mostly rely on CMOS-based peripheral sensors and circuits, adding area and energy overhead [12,13,21-23].
Finally, while learning has been a topic of extensive research, the importance of forgetting in learning has not received adequate attention. Most researchers have considered forgetting to be a passive brain process that allows unused memories to disappear over time. However, this decades-old hypothesis has now been challenged by a radical new idea suggesting that the brain is built not to remember, but to forget [45]. In other words, forgetting is an active brain process that plays an important role in biological learning. As we will elucidate, it is possible to develop new unsupervised learning rules for hardware ANNs based on forgetting.
In light of the above discussion, the next generations of AI will benefit from an integrated hardware platform that combines machine vision with machine learning, mimicking the adaptive neurobiological architectures for seeing, learning, and forgetting. Here we accomplish this by integrating a monolayer MoS2 phototransistor array with an analog, nonvolatile, and programmable memory gate-stack, bridging the gap between "sensing", "compute", and "storage". In short, we combine the analog optical memory observed in 2D phototransistors with the analog electrical memory enabled by the back-gate stack for direct learning and unsupervised relearning from visual stimuli. Our biomimetic hardware vision platform also enables adaptive learning under photopic (bright-light), scotopic (low-light), as well as noisy illumination conditions at minuscule energy expenditure, bridging the energy gap between AI and natural intelligence. Finally, our "all-in-one" vision platform not only overcomes the von Neumann bottleneck of CMOS-based ANNs but also eliminates the need for CMOS-based peripheral sensors and circuit components.
Photoreceptors transduce visual information into electrical impulses and, with the help of other cells in the retina, pass it on to the visual cortex in the brain (Fig. 1c) [46]. The visual cortex contains a vast network of neurons which take part in learning. While the neuroscience of learning is still a topic of active research, it is widely accepted that learning leads to strengthening of the connections between associated neurons through a process known as synaptic plasticity (Fig. 1d) [47]. For example, long-term potentiation, or memory formation, leads to an increase in the number of AMPA receptors in the postsynaptic neuron when the presynaptic neuron uses glutamate as the neurotransmitter [48]. Similarly, forgetting leads to weakening of connection strengths through a reduction in the number of AMPA receptors.
The mathematical construct of synaptic plasticity determines the biological learning rule. In the context of machine learning, most such rules can be categorized as unsupervised, although evidence of reward-based or reinforcement learning can also be found [49].

Direct and analog learning from visual stimuli using 2D phototransistor array
Bioinspired ANNs require photosensitive materials with unique properties to perform machine vision operations such as analog sensing and adaptation. Direct-bandgap monolayer 2D materials, with their superior photosensitivity and electrostatic gate tunability, are therefore natural choices for next generations of bio-inspired machine vision platforms. Additionally, the atomically thin body of 2D monolayers allows aggressive dimension scaling and hence enables high integration density, as reported recently [38,50]. Moreover, some of the early criticisms of 2D materials have been successfully addressed through the realization of low contact resistance [51], high ON current [52], integration of ultra-thin and high-k gate dielectrics [53], and wafer-scale growth [54,55], making them a technologically viable option. Demonstrations of 2D-based microprocessors [56], analog operational amplifiers [57], and RF electronic components [58] support this claim. Finally, unlike silicon CMOS, 2D materials enable flexible [59] and printable [60] electronic circuits, adding value towards 2D-based biomimetic and neuromorphic hardware platforms [61-63].
Illumination of a 2D semiconducting channel in a phototransistor generates photocarriers, which under an electrical bias drift towards the respective electrodes, adding to the dark current already present in the device. The illumination intensity determines the change in channel conductance, a property that can be leveraged for analog machine vision sensors. In most cases, once the stimulus is removed, the conductance returns to its initial state without remembering the change induced by the visual stimulus. This is a limitation for many machine vision demonstrations, necessitating peripheral circuit elements to store the new conductance value induced by the optical stimulus [35]. This challenge is overcome by a property called 'optical memory', or persistent photoconductivity, which allows the device to remain in the new conductance state even after the visual stimulus is removed. In 2D-based vision sensors, this is accomplished mainly through trapping of photocarriers by the trap states at the 2D semiconductor-oxide interface [35, 41-43, 63, 64]. The trapped charges alter the threshold voltage of the device, changing the conductance (G) measured at a given back-gate voltage (V_BG). The retention of this optically induced conductance state depends mainly on the detrapping time and may range from several hours to days. This photo-gating effect is leveraged in our 2D machine vision platform to demonstrate direct learning from visual stimuli. The phototransistor array learns the input image by updating G. As expected, devices exposed to brighter intensities reach higher G compared to devices exposed to lower intensities owing to the difference in the photo-gating effect illustrated in Fig. 2c. As a result, the heatmap of G (Fig. 2e) mimics the contrast present in the input image at the end of the 50 epochs, demonstrating direct learning of the analog visual stimuli by the 9×1 phototransistor array.
The total learning energy expenditure per pixel after 50 epochs was found to be minuscule, ~50 nJ.
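The direct-learning behavior described above can be captured by a minimal toy model. All constants (G_MAX, G_INIT, ETA) and the saturating update rule are illustrative assumptions, not measured device parameters; the sketch only reproduces the qualitative outcome that the conductance map mirrors the image contrast:

```python
import numpy as np

# Hypothetical analog input image with pixel intensities in [0, 1];
# the values below are illustrative, not the image used in the paper.
image = np.array([[1.0, 0.2, 0.8],
                  [0.1, 0.6, 0.3],
                  [0.9, 0.4, 0.7]])

G_MAX = 100e-9   # assumed saturation conductance (~100 nS)
G_INIT = 1e-9    # assumed initial low conductance state (~1 nS)
ETA = 0.08       # assumed per-epoch learning-rate constant

def potentiate(G, intensity, eta=ETA):
    """One optical-write epoch: photo-gating shifts V_TH, raising G.
    Brighter pixels take a larger (saturating) conductance step."""
    return G + eta * intensity * (G_MAX - G)

G = np.full_like(image, G_INIT)
for epoch in range(50):
    G = potentiate(G, image)

# After 50 epochs the conductance ordering matches the intensity ordering,
# i.e. the G heatmap mimics the contrast of the input image.
ranking_matches = np.array_equal(np.argsort(G.ravel()),
                                 np.argsort(image.ravel()))
```

Because each pixel's update depends only on its own intensity, the stored conductance map is a direct analog copy of the stimulus, with no separate memory element required.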

Adaptive Learning
As mentioned earlier, a key feature of BNNs is neuroplasticity, which enables adaptation of learning under changing environmental conditions. Here we demonstrate similar functionality in our phototransistor array, allowing learning under both scotopic and photopic illumination.
While biological vision requires a significant amount of time, from minutes to hours, to adapt to scotopic conditions, our artificial vision can be reconfigured such that similar learning rates are achievable irrespective of the illumination conditions. Fig. 3a shows the time evolution of G measured at V_BG = 0 V, every 500 ms, corresponding to different I_LED and V_write. For any given I_LED, the rate of change in G increases as the magnitude of V_write increases, i.e., as V_write becomes more negative. This is another consequence of the photo-gating effect, as more trap states become available for carrier capture at the 2D/dielectric interface when the phototransistor is biased in the deep off-state. Fig. 3b shows the learning rate, which we define as the rate of increase in G, as a function of I_LED and V_write. Note that the learning rates obtained for photopic illumination (e.g. I_LED = 20 mA) at a lower negative V_write can also be achieved for scotopic illumination (e.g. I_LED = 2 mA) at a higher negative V_write. Fig. 3c shows the letter "T" used as the visual stimulus. The photopic and scotopic "T"s were realized using I_LED = 20 mA and I_LED = 2 mA, respectively. The photopic "T" is learned within only 10 epochs (Fig. 3d), whereas learning of the scotopic "T" is incomplete even after 50 epochs at V_write = -2 V, as shown by the time evolution of the heatmap of G measured at V_BG = 0 V (Fig. 3e). We consider the learning to be complete when G = 100 nS. However, by using a more negative V_write, e.g. V_write = -5 V, the scotopic "T" can be learned within 10 epochs, highlighting the capability of our vision platform to adapt to the learning ambience. The learning energy expenditure was found to be ~3.6 nJ/pixel for the photopic "T" at V_write = -2 V and ~1.1 nJ/pixel for the scotopic "T" at V_write = -5 V.
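This rate compensation can be sketched with a hypothetical model in which the learning rate grows linearly with I_LED and exponentially with the write-voltage magnitude. The constant ALPHA is not a fitted device parameter; it is chosen purely so that the toy model reproduces the -2 V (photopic) / -5 V (scotopic) pairing quoted above:

```python
import math

# Illustrative model constants (assumptions, not extracted from the paper)
ALPHA = math.log(10) / 3.0   # chosen so 20 mA @ -2 V matches 2 mA @ -5 V
C = 1.0                      # arbitrary rate prefactor

def learning_rate(i_led_mA, v_write):
    """Rate of increase in G: grows with illumination and with a deeper
    (more negative) V_write, which exposes more interface trap states."""
    return C * i_led_mA * math.exp(ALPHA * abs(v_write))

def v_write_for_target(rate_target, i_led_mA):
    """Invert the model: write voltage needed to hit a target learning
    rate at a given illumination level."""
    return -math.log(rate_target / (C * i_led_mA)) / ALPHA

r_photopic = learning_rate(20, -2.0)            # bright "T", shallow write
v_scotopic = v_write_for_target(r_photopic, 2)  # dim "T": deeper write
```

Under these assumptions, a 10x drop in illumination is compensated by deepening V_write from -2 V to -5 V, mirroring the adaptive-learning experiment.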

Adaptive forgetting
Forgetting has traditionally been considered a passive brain process that ensures unused information fades over time so that neural resources can be reallocated for storing newer and more important information. When machines learn with unrestricted storage resources (e.g. cloud servers), forgetting is irrelevant. However, when storage capacity is either limited or not accessible, for example in internet-of-things (IoT) edge devices deployed in remote locations, forgetting can play an active role in smart learning.
Forgetting is enabled in our phototransistors by exploiting the nonvolatile and analog programmability of the back-gate dielectric stack. Fig. 4a shows the transfer characteristics of the phototransistor when positive programming voltages of increasing amplitude are applied to the back-gate, each for a total duration of 100 ms. During programming, the source and drain terminals are kept grounded. Also note that before programming, the device is set to a high conductance state (total duration of 2.5 s). Since electrical programming via the back-gate results in a positive shift in the threshold voltage (V_TH), G decreases with increasing programming voltage amplitude and duration, and this can hence be exploited as synaptic depression, or forgetting. However, unlike biological forgetting, over which humans have limited control, the forgetting rate, which we define as the rate of decrease in G, can be precisely controlled in our vision platform through the programming voltage (Fig. 4c). Forgetting is also permanent, as shown in Fig. 4d using the nonvolatile retention of the post-forgetting G measured at V_BG = 0 V. Fig. 4e shows
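The electrically controlled forgetting rate can be sketched with a toy exponential-decay model. BETA and the pulse parameters are illustrative assumptions, not extracted device constants; the point is only that the decay rate is set by the programming-pulse amplitude:

```python
import math

G_START = 100e-9   # conductance after learning is complete (~100 nS)
BETA = 0.05        # hypothetical depression constant per volt per pulse

def forget(G, v_prog, n_pulses):
    """Electrical depression: each positive back-gate pulse shifts V_TH
    positive and lowers G read at V_BG = 0 V. Unlike biological
    forgetting, the rate is set precisely by the pulse amplitude."""
    for _ in range(n_pulses):
        G *= math.exp(-BETA * v_prog)
    return G

g_slow = forget(G_START, v_prog=2.0, n_pulses=10)  # gentle forgetting
g_fast = forget(G_START, v_prog=8.0, n_pulses=10)  # aggressive forgetting
```

Because the programmed state is nonvolatile, the conductance simply stays wherever the last pulse left it, which is what Fig. 4d's retention data reflects.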

Importance of forgetting in unsupervised relearning
Next, we elucidate the role of forgetting in unsupervised relearning. In this instance, each epoch consists of two cycles: optical potentiation for learning followed by electrical depression for forgetting. We have considered 3×3 pixel images of the letters "L" and "T" for learning and relearning, respectively, with each being presented for 25 epochs. As before, all devices were initially programmed to a low conductance state with G ≈ 1 nS. Fig. 4g and
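The learn/relearn protocol can be illustrated with a toy simulation in which each epoch applies optical potentiation to the illuminated pixels followed by a uniform electrical depression to all pixels. The constants (ETA, PHI, G_MAX) are assumptions chosen only to make the qualitative behavior visible, not device values:

```python
import numpy as np

L = np.array([[1, 0, 0], [1, 0, 0], [1, 1, 1]], float)  # first image: "L"
T = np.array([[1, 1, 1], [0, 1, 0], [0, 1, 0]], float)  # second image: "T"

G_MAX, ETA, PHI = 100e-9, 0.15, 0.10  # assumed model constants

def epoch(G, image):
    """One epoch: optical potentiation on illuminated pixels followed by
    a uniform electrical depression applied to every pixel."""
    G = G + ETA * image * (G_MAX - G)  # learning (lit pixels only)
    return G * (1.0 - PHI)            # forgetting (all pixels)

G = np.full((3, 3), 1e-9)             # low initial conductance (~1 nS)
for _ in range(25):
    G = epoch(G, L)                   # learn "L"
for _ in range(25):
    G = epoch(G, T)                   # relearn "T"
```

During relearning, pixels shared by "L" and "T" stay potentiated, "T"-only pixels climb, and stale "L"-only pixels are erased by the per-epoch depression, so the array converges to "T" without any supervision.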

Forgetting for learning under random disturbances
Next, we show that forgetting plays an even more significant role when learning under noisy illumination conditions. The human visual system possesses the remarkable ability to identify important features in an image even in the presence of disturbances. For example, the brain can extract information in poor weather conditions such as mist, storms, blizzards, and other impediments to perfect vision. Suppressing dynamic noise, on the other hand, is a challenging task for machine vision, which must rely on sophisticated computer algorithms to eliminate it. Hardware implementations of such algorithms are naturally energy hungry. However, as we demonstrate below, forgetting can aid learning under such dynamic noise.
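A sketch of this idea under the same kind of toy model: signal pixels are illuminated every epoch, while noise pixels flicker at random, so per-epoch electrical forgetting continuously erases the transient noise while the persistent signal survives. All constants and the noise statistics are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)                   # fixed seed for repeatability
T = np.array([[1, 1, 1], [0, 1, 0], [0, 1, 0]], float)  # target pattern "T"

G_MAX, ETA, PHI = 100e-9, 0.15, 0.20  # assumed constants (stronger forgetting)
P_NOISE = 0.3                          # probability a dark pixel flickers on

G = np.full((3, 3), 1e-9)
for _ in range(50):
    noise = (rng.random((3, 3)) < P_NOISE).astype(float)
    frame = np.clip(T + noise, 0.0, 1.0)  # noisy illumination frame
    G = G + ETA * frame * (G_MAX - G)     # optical potentiation
    G = G * (1.0 - PHI)                   # electrical forgetting
```

Signal pixels are reinforced every epoch and settle at a high conductance, whereas randomly flickering noise pixels are potentiated only intermittently and are pulled back down by forgetting in every epoch, leaving a clear conductance margin between pattern and noise.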

Methods
Film growth: Monolayer MoS2 was obtained from the 2D Crystal Consortium (2DCC) [54]. It was deposited on an epi-ready 2" c-sapphire substrate by metalorganic chemical vapor deposition (MOCVD). An inductively heated graphite susceptor equipped with wafer rotation in a cold-wall horizontal reactor was used to achieve uniform monolayer deposition, as previously described [65].

Author Contributions
A

Competing Interest
The authors declare no competing interests.