Here, we propose [11, 12] a novel scheme for ONNs based on integrated microcombs that achieves simultaneous temporal, spatial, and wavelength multiplexing, which we then use to perform the dot product of vectors. We perform matrix operations at high data rates by first flattening the matrices into vectors. Our system is capable of dynamic training, and its network structure is highly scalable. We demonstrate a single photonic neuron perceptron with 49 synapses, one per microcomb wavelength. This fundamental building block for ONNs performs matrix multiplication at 11.9 billion operations per second (11.9 GOPS), which equates to 95.2 Gigabits/s for 8-bit operations. We do this by weighting the synapses in the wavelength domain while simultaneously scaling the input data in the temporal domain. The device is applied to benchmark tests that include handwritten digit classification, where we obtain an accuracy greater than 93%, and the prediction of cancer classes, where malignant cases are distinguished from benign cases based on a feature set extracted from microscope images of biopsied tissue. We obtain an accuracy greater than 85% for cancer classification.
Figure 1 depicts the mathematical model of the neuron perceptron, and Figure 2 outlines the experimental setup, which uses a Kerr optical microcomb source. The perceptron is based on wavelength multiplexing of 49 microcomb wavelengths, done simultaneously with temporal multiplexing, in order to form the 49 synapses of a single neuron. The main operation consists of matrix multiplication with vectors formed from flattened matrices. The matrix multiplication occurs between the electronic image input data and the synaptic weights, and it is performed photonically in multiple steps. The input data for classification consists of 28 × 28 electronic digital matrices with 8-bit grey-scale intensity resolution. These are first downsampled digitally into 7 × 7 matrices and then reorganized into 1D vectors, X = [x(1), x(2), …, x(49)], which are multiplexed sequentially in the temporal domain by a high-speed electronic digital-to-analog (D/A) converter at 11.9 Gigabaud. Here, each symbol corresponds to one 8-bit pixel of the input image and occupies one time slot of length τ = 84 ps. Hence, the whole duration of the waveform is N × τ = 4.12 ns, with N = 49. In approaches based on digital electronics, the neural network input nodes usually reside in electrical memory and are routed according to memory address. By comparison, the input nodes of the ONN are defined temporally: the symbols are multiplexed in time and then routed according to their location in time.
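The input preparation and the timing figures quoted above can be checked with a short numerical sketch. The text does not specify how the images are downsampled or in what order the 7 × 7 matrix is flattened, so the 4 × 4 block averaging and the row-by-row ordering below are assumptions, and the function names are illustrative only. The image is synthetic, not experimental data.

```python
SYMBOL_RATE = 11.9e9      # symbols per second (11.9 Gigabaud, from the text)
BITS_PER_SYMBOL = 8       # 8-bit grey-scale resolution, from the text


def downsample(image, factor=4):
    """Average non-overlapping factor x factor blocks (assumed method)."""
    size = len(image) // factor
    result = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) // len(block))  # integer mean stays in 0..255
        result.append(row)
    return result


def flatten(matrix):
    """Concatenate the rows of a 2D matrix into a 1D vector (assumed order)."""
    return [value for row in matrix for value in row]


# Synthetic 28 x 28 test image with 8-bit values (not real data).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

x = flatten(downsample(image))                # 49-element input vector X
slot = 1.0 / SYMBOL_RATE                      # one symbol period, about 84 ps
waveform_duration = len(x) * slot             # about 4.12 ns for N = 49
bit_rate = SYMBOL_RATE * BITS_PER_SYMBOL      # 95.2 Gb/s for 8-bit symbols

print(len(x), f"{slot * 1e12:.0f} ps", f"{waveform_duration * 1e9:.2f} ns",
      f"{bit_rate / 1e9:.1f} Gb/s")
```

The symbol period follows directly from the baud rate, so the 84 ps slot, the 4.12 ns waveform, and the 95.2 Gigabits/s figure are all consequences of the single 11.9 Gigabaud setting.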
Following this, the temporally multiplexed electrical input waveform is broadcast via an electro-optic modulator onto all 49 wavelengths generated by the microcomb (equal to the number of elements of the vector X). Each comb line thus carries an identical copy of X, the time-domain multiplexed input data waveform. The power of every comb line is then adjusted by an optical waveshaper according to the synaptic weight vector W = [w(1), w(2), …, w(49)] obtained during training. This effectively multiplexes the synaptic weights in wavelength. If W and X are both 1×49 row vectors, then the weighted replicas of the input vector X are

\[
W^{T} X =
\begin{bmatrix}
w(1)x(1) & w(1)x(2) & \cdots & w(1)x(49) \\
w(2)x(1) & w(2)x(2) & \cdots & w(2)x(49) \\
\vdots & \vdots & \ddots & \vdots \\
w(49)x(1) & w(49)x(2) & \cdots & w(49)x(49)
\end{bmatrix}
\]
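where the nth row (n ∈ {1, …, N}) corresponds to the weighted temporal waveform replica carried by the nth wavelength. The diagonal components therefore correspond to the N weighted input nodes: the nth weighted input node is the 8-bit symbol w(n)·x(n), which occupies the nth time slot of the nth wavelength. The replicas are then transmitted through a medium that provides a dispersive delay equivalent to 2nd-order dispersion, which progressively delays the weighted replicas so that the diagonal components are aligned in the same time window. The delay step between adjacent wavelengths, t = delay(λk) − delay(λk+1), is set equal to the symbol duration τ. The dispersive delay thus acts as an addressable time-of-flight memory that lines up the progressively weighted time-dependent symbols w(1)·x(1), w(2)·x(2), …, w(49)·x(49) over all wavelengths as

\[
\begin{bmatrix}
 & & & w(1)x(1) & w(1)x(2) & \cdots & w(1)x(49) \\
 & & w(2)x(1) & w(2)x(2) & \cdots & w(2)x(49) & \\
 & & & \vdots & & & \\
w(49)x(1) & \cdots & w(49)x(48) & w(49)x(49) & & &
\end{bmatrix}
\]

where each row is one wavelength, each column is one time slot, and the column containing the diagonal symbols w(n)·x(n) is the time slot in which they coincide.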
While this process, as it is implemented here, does not enhance the network speed because it only uses diagonal components, in principle a significant increase in speed can be obtained by scaling the network to deep (multiple level) structures through the use of parallel wavelengths as well as time and spatial multiplexing.
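The weighting, delay, and alignment steps described above can be illustrated with a small numerical model. This is a sketch under stated assumptions rather than a description of the experimental hardware: the delay step is taken to be exactly one symbol period, the first wavelength is taken to be delayed the most, and per-slot accumulation stands in for the summation of all wavelengths at detection. The function names and the three-element example vectors are hypothetical.

```python
def weighted_replicas(w, x):
    """Row n is the input waveform x scaled by w[n] (one row per wavelength)."""
    return [[wn * xm for xm in x] for wn in w]


def delay_and_sum(replicas):
    """Shift row n by (N - 1 - n) slots, then add up all rows slot by slot."""
    n = len(replicas)
    slots = [0.0] * (2 * n - 1)           # delayed waveform spans 2N - 1 slots
    for row_index, row in enumerate(replicas):
        delay = n - 1 - row_index         # in symbol periods (assumed ordering)
        for m, value in enumerate(row):
            slots[m + delay] += value
    return slots


# Small illustrative example with N = 3 instead of 49.
w = [0.5, 1.0, 2.0]
x = [1.0, 2.0, 3.0]

slots = delay_and_sum(weighted_replicas(w, x))
aligned_slot = len(w) - 1                 # slot where the diagonal terms meet
neuron_sum = slots[aligned_slot]

print(neuron_sum, sum(wn * xn for wn, xn in zip(w, x)))
```

Only the slot in which the diagonal terms coincide carries the dot product; the remaining 2N − 2 slots hold other combinations of weights and inputs that this single-neuron scheme discards, which is why the process as implemented here does not by itself increase the network speed.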
Finally, the intensities of all of the optical signals in each time bin are summed via detection and sampling to produce the matrix multiplication result of the neuron (equivalent to a dot product of two 49-element vectors for the case of 7×7 matrices), given by:

\[
X W^{T} = \sum_{n=1}^{49} w(n)\, x(n)
\]
After matrix multiplication, the summed, weighted output is passed through a nonlinear sigmoid function in order to map it into the desired range. In this initial demonstration we perform this last function using digital electronics, which generates the output of the single neuron perceptron. In principle, however, this function could also be implemented all-optically. Finally, the predicted category of the input data is obtained by comparing the neuron output with the decision boundary. The decision boundary is a hyperplane in the 49-dimensional input space, generated during digital learning carried out offline prior to the experiments. Thus, the input data can be separated into two categories.
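The electronic post-processing stage can be sketched as follows. The text specifies a sigmoid nonlinearity and a comparison against a decision boundary, but gives neither a bias term nor a threshold value, so the `bias` parameter and the threshold of 0.5 below are assumptions, and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)                # avoids overflow for large negative z
    return e / (1.0 + e)


def neuron_output(w, x, bias=0.0):
    """Weighted sum of the inputs followed by the sigmoid nonlinearity."""
    z = sum(wn * xn for wn, xn in zip(w, x)) + bias
    return sigmoid(z)


def classify(w, x, bias=0.0, threshold=0.5):
    """Return 1 if the neuron output reaches the threshold, otherwise 0."""
    return 1 if neuron_output(w, x, bias) >= threshold else 0


# Two-element toy example (the experiment uses 49-element vectors).
print(classify([1.0, -1.0], [2.0, 0.5]), classify([1.0, -1.0], [0.5, 2.0]))
```

With a threshold of 0.5, the comparison is equivalent to testing the sign of the weighted sum plus bias, so the set of inputs where that quantity is zero forms the separating hyperplane.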