According to the neuroscience principle of free energy minimization (FEM), living organisms develop internal models of their environment to guide actions that minimize surprise and reduce uncertainty 1,2. This objective stands in contrast to that of biologically inspired artificial neural networks (ANNs), which typically aim to maximize accuracy 3. Shifting the focus from accuracy to handling uncertainty is pivotal in explaining the efficiency and adaptability of biological neural networks. To date, ANNs have been implemented with great success on deterministic conventional hardware and have led to breakthrough results in areas including weather forecasting 4, medical diagnostics 5, autonomous driving 6 and natural language processing 7–10. However, deterministic models are point estimates based on known data and do not take the complete posterior distribution of the parameters into account 11. Bayesian neural networks (BNNs) replace the deterministic network parameters with probability distributions to capture the probabilistic nature of inferring from incomplete observed data 12,13. In this way, BNNs allow for distinguishing between epistemic uncertainty due to the lack of data and aleatoric uncertainty arising from noise in the data itself 14,15. Consequently, BNNs are also significantly more robust against overfitting to small data sets 16,17. Bayesian inference also lies at the heart of the FEM principle.
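As a toy illustration of the epistemic/aleatoric distinction described above (all numbers and variable names below are illustrative, not from this work), the predictive variance of a BNN can be split by the law of total variance: the mean of the per-sample noise variances is the aleatoric part, and the variance of the per-sample predictive means is the epistemic part.

```python
# Toy decomposition of predictive uncertainty for a BNN whose network
# outputs a Gaussian N(mean, noise_std^2) for each posterior weight sample.
import numpy as np

# Suppose 4 posterior weight samples each yield a predictive mean and a
# data-noise standard deviation for one input (illustrative values):
means = np.array([0.9, 1.1, 1.0, 1.2])       # varies across weight samples
noise_std = np.array([0.3, 0.3, 0.3, 0.3])   # noise in the data itself

aleatoric = np.mean(noise_std**2)   # irreducible: noise in the data
epistemic = np.var(means)           # reducible: shrinks with more data
total = aleatoric + epistemic       # law-of-total-variance split
```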

Processing complex probabilistic models poses major challenges for conventional deterministic hardware. Because the integral formulations describing probabilistic models become intractable even for a small number of parameters, Monte Carlo methods are employed to provide approximate solutions 13,16. This involves sampling from the model’s posterior distribution multiple times and subsequently evaluating the model for each drawn sample. Thus, high-speed (true) random number generators are required in combination with an architecture capable of evaluating the full model for each sample in a reasonable time. In conventional hardware implementations, one major factor contributing to the inefficiency of machine learning systems is the reliance on the von Neumann digital architecture, which, contrary to the physics of computing substrates, enforces determinism and separates memory from computation 18. Brain-inspired computing differs from conventional digital computing by emphasizing in-memory analog processing, fine-grained parallelism, reduced precision, increased randomness, adaptability and, possibly, spike-based communication 19. Co-designing FEM-based learning with brain-inspired computing platforms can enhance energy efficiency and adaptability by shifting the learning objective from noise reduction (accuracy) to harnessing hardware noise as a valuable computational resource 19. In electronic crossbar arrays, memristors serve as the main in-memory computation element due to their tunable conductance. At the same time, programming and reading the conductance of a memristor is a stochastic process owing to the inherent randomness of the switching process in addition to drifts and instabilities 20,21. Since this randomness is programmable by deploying multiple memristors for a single matrix weight, it can be exploited for Bayesian inference.
In this case, sampling from the posterior distribution is implemented by reading, and potentially rewriting, the memristor several times, while the neuromorphic crossbar architecture ensures the efficient evaluation of the model 22. To avoid the need for sequential sampling and the random structural changes within memristive materials, transitioning to the optical domain allows for probabilistic computing in parallel with single-shot readout by deploying chaotic light. Chaotic light is an ideal entropy source for true random number generation 23–26 and can, moreover, easily be generated at standard telecom wavelengths by amplified spontaneous emission in erbium-doped fibers or erbium-doped waveguides 27–31. Furthermore, the incoherent nature and large optical bandwidth of chaotic light allow for high-speed data processing in photonic crossbar arrays by exploiting wavelength division multiplexing.
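The Monte Carlo procedure described above — drawing parameter samples from the posterior and evaluating the model for each one — can be sketched in software as follows. This is a minimal numpy sketch assuming a factorized Gaussian posterior over the weights of a single linear layer; all names and values are illustrative, not from this work.

```python
# Monte Carlo posterior-predictive sampling for a Bayesian linear layer
# with a factorized Gaussian weight posterior (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)

def predictive_samples(x, w_mean, w_std, n_samples=100):
    """Draw weights w ~ N(w_mean, w_std^2) elementwise and evaluate
    y = W x for each drawn sample, returning the stack of outputs."""
    outputs = []
    for _ in range(n_samples):
        w = w_mean + w_std * rng.standard_normal(w_mean.shape)
        outputs.append(w @ x)
    return np.stack(outputs)

# Toy layer: 2 outputs, 3 inputs.
w_mean = np.array([[0.5, -0.2, 0.1],
                   [0.3,  0.8, -0.4]])
w_std = 0.05 * np.ones_like(w_mean)   # posterior spread per weight
x = np.array([1.0, 2.0, 3.0])

ys = predictive_samples(x, w_mean, w_std, n_samples=1000)
mean, var = ys.mean(axis=0), ys.var(axis=0)  # predictive mean and spread
```

Each loop iteration corresponds to one "read" of the stochastic hardware; the photonic scheme in the following section replaces this sequential loop with parallel spectral sampling.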

In the following, we present a photonic neuromorphic architecture capable of performing probabilistic single-shot computations with a photonic crossbar array. We harness chaotic light fields as the entropy source of the system and as the optical carrier for probabilistic information encoding. For photonic in-memory computing, we employ the non-volatile phase-change material germanium-antimony-telluride (GST). Using time-amplitude modulation, we perform probabilistic data encoding and achieve parallel sampling based on spectral demultiplexing. We quantify the precision of the stochastic multiply-accumulate operations performed by the photonic circuit. With an incoherent photonic processor, we calculate high-speed probabilistic convolutions on visual inputs, making use of parallel spectral sampling from the output distributions. We deploy stochastic variational inference in a Bayesian neural network based on the LeNet-5 32 architecture to minimize the divergence between the true posterior of our model parameters and the variational distributions realizable by our encoding scheme. We benchmark the BNN’s accuracy and out-of-domain rejection on an incomplete MNIST 33 data set.
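As a software-level illustration of stochastic variational inference (independent of any photonic hardware), the sketch below fits a variational Gaussian q(w) = N(mu, sigma^2) to the posterior of a conjugate toy model — a standard-normal prior on a scalar w with unit-variance Gaussian observations — using reparameterized stochastic gradients of the ELBO. The toy model, learning rate, and sample counts are assumptions for illustration only; the exact posterior is known here, so convergence can be checked.

```python
# Stochastic variational inference on a conjugate toy model via the
# reparameterization trick (illustrative sketch, not the paper's setup).
import numpy as np

rng = np.random.default_rng(1)

# Toy data: y_i ~ N(w_true, 1), prior w ~ N(0, 1).
w_true = 1.5
y = w_true + rng.standard_normal(20)
n = len(y)

# Exact posterior for reference (conjugate Gaussian case):
post_mean = y.sum() / (n + 1)
post_var = 1.0 / (n + 1)

# Variational parameters of q(w) = N(mu, exp(log_sigma)^2).
mu, log_sigma = 0.0, 0.0
lr = 0.01
for step in range(2000):
    eps = rng.standard_normal(32)            # reparameterization noise
    w = mu + np.exp(log_sigma) * eps         # samples w ~ q(w)
    dlogp = y.sum() - (n + 1) * w            # d/dw log p(y, w)
    grad_mu = dlogp.mean()                   # ELBO gradient w.r.t. mu
    grad_ls = (dlogp * np.exp(log_sigma) * eps).mean() + 1.0  # + entropy term
    mu += lr * grad_mu                       # gradient ascent on the ELBO
    log_sigma += lr * grad_ls
```

Maximizing the ELBO is equivalent to minimizing the KL divergence between q(w) and the true posterior, so mu and exp(log_sigma) converge toward the exact posterior mean and standard deviation.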