We experimentally demonstrated the building block of the network: a single-layer, single-neuron photonic perceptron (Fig. 4), which is suitable for binary classification problems. Problems with more classes can be addressed using more than one neuron, even with only a single-layer (non-deep) ONN system. This can easily be achieved by sub-dividing the comb into wavelength groups that each define a neuron (see Sect. 5). We first tested the perceptron on several pairs of handwritten digits (Figs. 5 and 6), using 500 figures for each digit (1000 per pair), from which 920 figures were randomly selected for offline pre-training, leaving the remaining 80 figures for experimental testing. The 2D handwritten digit figures were pre-processed electronically by down-sampling each image from 28×28 to 7×7 pixels and then flattening it into a one-dimensional array of 49 symbols. This array was then time multiplexed with ~84 ps long timeslots for each symbol (Fig. 5b), equating to a modulation speed of 11.9 Giga-baud.
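
The pre-processing chain described above can be sketched in a few lines. This is only an illustration: the text does not specify the down-sampling method, so 4×4 block averaging is assumed here, and the input image is synthetic placeholder data rather than a real handwritten digit. The names `downsample` and `flatten` are hypothetical.

```python
SYMBOL_PERIOD_S = 84e-12  # ~84 ps timeslot per symbol, from the text


def downsample(image, factor=4):
    """Average non-overlapping factor x factor blocks (assumed method)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def flatten(image):
    """Turn a 2D image into a 1D list of symbols, row by row."""
    return [pixel for row in image for pixel in row]


# Synthetic 28x28 grey-scale image (placeholder values in 0..255).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

small = downsample(image)             # 7x7 image
symbols = flatten(small)              # 49 symbols
baud_rate = 1.0 / SYMBOL_PERIOD_S     # symbols per second
frame_duration_s = len(symbols) * SYMBOL_PERIOD_S

print(len(symbols), f"{baud_rate / 1e9:.1f} GBaud",
      f"{frame_duration_s * 1e9:.3f} ns per image")
```

With these numbers, one 49-symbol image occupies 49 timeslots, or roughly 4.1 ns of the input waveform.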

As discussed above, the optical power of the 49 microcomb lines was shaped according to the pre-learned synaptic weights (Fig. 6a) to boost the parallelism and establish the synapses of the neuron. The input data stream was then multicast onto all 49 shaped comb lines and progressively delayed (linearly with wavelength) by ~13 km of standard single-mode fibre (SMF), which served as a time-of-flight optical buffer via its second-order dispersion (~17 ps/nm/km). As a result, the weighted symbols on different wavelength channels were aligned temporally, allowing them to be summed together via photodetection and sampling of the central timeslot, which yields the result of the multiply-and-accumulate (MAC) operation. The output was then compared with the decision boundary obtained from the learning process, which yielded the final ONN prediction (Fig. 6b).
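
A minimal numerical sketch of this broadcast-and-delay scheme may help. It assumes an idealised model in which channel k carries the whole data stream scaled by its weight and arrives exactly k timeslots late; noise, modulator response and the sign convention of the dispersion are ignored. The function name `broadcast_and_delay`, the placeholder data and the placeholder weights are all hypothetical. The first lines also work out the channel spacing that the quoted fibre parameters imply for a one-timeslot delay step.

```python
DISPERSION_PS_PER_NM_KM = 17.0  # approximate value from the text
FIBRE_LENGTH_KM = 13.0          # approximate value from the text
SYMBOL_PERIOD_PS = 84.0         # approximate value from the text

# Delay accumulated per nm of wavelength offset, and the channel
# spacing needed for adjacent channels to slip by one timeslot.
delay_per_nm_ps = DISPERSION_PS_PER_NM_KM * FIBRE_LENGTH_KM
required_spacing_nm = SYMBOL_PERIOD_PS / delay_per_nm_ps


def broadcast_and_delay(data, channel_weights):
    """Summed photodetector output, one value per timeslot.

    Channel k carries the full data stream scaled by channel_weights[k]
    and delayed by k timeslots, so the sum is a discrete convolution.
    """
    n = len(data)
    output = [0.0] * (2 * n - 1)
    for k, w in enumerate(channel_weights):
        for t, x in enumerate(data):
            output[t + k] += w * x
    return output


# Placeholder data and non-negative weights (not experimental values).
data = [float(i % 7) for i in range(49)]
learned = [((i * 3) % 5) / 4.0 for i in range(49)]

# In this model the weights are applied in reverse channel order so
# that weight i meets symbol i in the central timeslot (assumption).
out = broadcast_and_delay(data, learned[::-1])
central = out[len(data) - 1]
expected = sum(w * x for w, x in zip(learned, data))

print(len(out), central, expected)
print(f"{delay_per_nm_ps:.0f} ps/nm, spacing ~{required_spacing_nm:.2f} nm")
```

The output sequence has 97 timeslots, and only the central one (index 48) equals the dot product of the data and weight vectors; the other slots hold partial sums that are discarded.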

We evaluated the performance of the optical perceptron in classifying two standard benchmark cases (Figs. 6, 7): handwritten digits and cancer cells. In the first case, two categories of handwritten digits (0 and 6) were distinguished by the decision boundary. Our device achieved an accuracy (ACC) of 93.75%, compared to 98.75% for the results calculated on a digital computer (see Fig. 6d). Although these are rudimentary benchmark tests, the perceptron nevertheless achieved a very high success rate and, most importantly, did so at unprecedented speeds (see below). This was a result of the large number of synapses (optical wavelengths over the C-band), in turn enabled by the record low FSR soliton crystal micro-comb.

We also determined the classification of cancer cells from tissue biopsy data (Fig. 6e). Individual cell nuclei, from breast mass tissue extracted by fine needle aspirate and imaged under a microscope, have previously been characterized in terms of 30 features such as radius, texture, perimeter, etc. In our analysis, data for 521 cell nuclei were employed for pre-training, with another 75 used for experimental diagnosis, following a similar procedure to the above handwritten digit test. We achieved an accuracy of 86.67% as compared to 98.67% success for the calculated results on a digital computer.

There is currently no commonly accepted standard that establishes benchmark systems for classifying and quantifying the computing speed and processing power of the widely varying types of ONNs that have been reported. Therefore, we explicitly outline the performance definitions that we use for throughput and latency in characterizing our ONN. We follow the approach Intel has used to evaluate digital micro-processors [54]. Considering that in our system the input data and weight vectors for the MAC calculation originate from different paths and are interleaved in different dimensions (time, wavelength), we use the temporal sequence at the electrical output port to clearly define the throughput. According to the broadcast-and-delay protocol, each computing cycle of matrix multiplication between the 49-symbol data and weight vectors generates an output temporal sequence with a length of 48 + 1 + 48 = 97 symbols and thus a total time duration of 84 ps × 97. Only the 49th symbol of this sequence corresponds to the desired matrix multiplication output, which is the result of 49 multiply-and-accumulate operations (each counted as two operations). The throughput of our ONN is thus given as (49×2)/(84 ps × 97) ≈ 11.9 Giga-FLOPS.
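
The throughput definition can be checked with a short calculation. This sketch uses only the numbers quoted in the text; the variable names are hypothetical. Because the timeslot is quoted as approximately 84 ps, the expression evaluates to about 12.0 × 10^9 operations per second with the rounded inputs, within roughly 1% of the stated figure.

```python
VECTOR_LENGTH = 49          # symbols in the data and weight vectors
SYMBOL_PERIOD_S = 84e-12    # ~84 ps timeslot (rounded value from the text)
OPS_PER_MAC = 2             # one multiply plus one accumulate

# Output sequence per computing cycle: 48 + 1 + 48 symbols.
output_symbols = (VECTOR_LENGTH - 1) + 1 + (VECTOR_LENGTH - 1)
cycle_time_s = output_symbols * SYMBOL_PERIOD_S
ops_per_cycle = VECTOR_LENGTH * OPS_PER_MAC

throughput_ops = ops_per_cycle / cycle_time_s

print(f"cycle: {output_symbols} symbols, {cycle_time_s * 1e9:.3f} ns")
print(f"throughput: {throughput_ops / 1e9:.2f} G operations/s")
```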

The input data stream consisted of symbols with 8-bit (256 discrete levels) values, determined by both the original grey scale values of the image pixels and the intensity resolution of our electronic arbitrary waveform generator. The optical spectral shaper (Waveshaper) featured an attenuation control range of 35 dB, which could support up to 11-bit resolution (10·log10(2^11) = 33 dB). As such, each computing cycle also corresponded to an equivalent throughput of (49×2)×8/(84 ps × 97) = 95.2 Gbps in terms of bit rate. For analog systems such as the one used here, the bit rate/intensity resolution is limited by the signal-to-noise ratio of the system. Hence, to achieve 8-bit resolution, the system would have to feature a signal-to-noise ratio of > 20·log10(2^8) = 48 dB in electrical power or 24 dB in optical power. This is well within the capability of analog microwave photonic links including the ONN system reported here (with OSNR > 28 dB).
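
The relation between bit resolution and dynamic range used in this paragraph can be sketched as follows. The helper names are hypothetical; the only inputs are the 35 dB attenuation range and the 8-bit data resolution quoted in the text, and the factor of 10 versus 20 reflects the optical-power versus electrical-power convention stated there.

```python
import math


def optical_range_db(bits):
    """Optical power range needed to resolve 2**bits levels."""
    return 10 * math.log10(2 ** bits)


def electrical_snr_db(bits):
    """Electrical power SNR needed to resolve 2**bits levels."""
    return 20 * math.log10(2 ** bits)


ATTENUATION_RANGE_DB = 35.0  # Waveshaper control range, from the text

# Largest whole number of bits whose optical range fits within 35 dB.
max_weight_bits = math.floor(ATTENUATION_RANGE_DB / optical_range_db(1))

print(f"max weight resolution: {max_weight_bits} bits "
      f"({optical_range_db(max_weight_bits):.1f} dB)")
print(f"8-bit data: {electrical_snr_db(8):.1f} dB electrical, "
      f"{optical_range_db(8):.1f} dB optical")
```

Each additional bit costs about 3 dB of optical range (6 dB electrical), which is why 35 dB accommodates 11 bits but not 12.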

Our results represent the fastest throughput (in bit rate) claimed so far for any ONN, although a direct comparison of the widely varying systems is difficult. For example, while systems that use CW sources to perform single-shot measurements [4, 10, 17] may have a low *latency*, they always suffer from a very low *throughput* since the input data cannot be updated rapidly. While the *latency* of our single perceptron is relatively high (~64 µs) due to the fibre spool, this does not affect the *throughput* of our system. In any event, the latency can be readily reduced to < 200 ps by implementing the delay function with compact devices that have high group velocity dispersion and much lower overall time delay, such as photonic crystal waveguides or sampled Bragg gratings (in fibre or on-chip) [55]. Finally, although we implemented the nonlinear function digitally offline, which did not impact the predictions, this could also be done with electro-optical modulators or electrical amplifiers operating in saturation.
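
The quoted latency is consistent with the time of flight through the fibre spool, as the following estimate suggests. The group index is an assumption (a typical value for standard single-mode fibre near 1550 nm) and is not given in the text; the fibre length is the approximate 13 km stated earlier.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
GROUP_INDEX_SMF = 1.468   # assumed typical value, not from the text
FIBRE_LENGTH_M = 13e3     # ~13 km spool, from the text

# Time of flight through the spool sets the single-perceptron latency.
latency_s = FIBRE_LENGTH_M * GROUP_INDEX_SMF / SPEED_OF_LIGHT_M_PER_S
latency_us = latency_s * 1e6

print(f"fibre latency: {latency_us:.1f} us")
```

Because this delay is common to every symbol, it shifts the output in time without changing the rate at which results are produced, which is why it does not enter the throughput.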