**A DNA switching gate architecture design used for the ConvNet circuit.** A ConvNet basically consists of an input layer, a convolutional layer, a nonlinear layer, and an output layer. In each layer, an intermediate array of pixels, referred to as a feature map, is produced from the previous layer. Fig. 1a illustrates the operating principle of the ConvNet for recognition tasks, where an n × n input symbol is convolved with a k × k kernel function (stride = 1) to compute a feature map of dimensions (n−k+1) × (n−k+1). When operating the ConvNet, the input symbol is grouped into (n−k+1)² receptive regions (area marked by the blue dashed line in Fig. 1a) of dimensions k × k. The elements of these receptive regions share the same weights, which enables a sparse topology that effectively reduces network connections. Mathematically, a convolution operation requires multiple MAC operations (y = ∑wi × xi) with shared weights. To implement the weight-sharing MAC operation using DNA molecules, we proposed a switching gate architecture24. Each switching gate is associated with a gate base strand (for example, domains *T*Ssi*Si*T** in Fig. 1b) that has a recognition domain (*Si** in Fig. 1b) and a weight tuning domain (*Ssi** in Fig. 1b) flanked by two toehold domains (*T** in Fig. 1b); these domains are functionally independent. Varying the sequence of the recognition domain at the 3' end (or 5' end) connects the gate to different downstream gates (or upstream gates), leading to different signal transmission pathways. Varying the sequence of the weight tuning domain determines the weights assigned to the input. The cascaded circuit thus allows independent control of the signal transmission and weight assignment functions during computation. In this way, we can implement weight-sharing MAC operations at the molecular level and construct molecular convolutional neural networks with reactive orthogonal DNA molecules.
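As a point of reference, the weight-sharing MAC underlying this convolution can be sketched in silico; the Python function below is an illustration of the arithmetic only, not of the molecular implementation. It computes the (n−k+1) × (n−k+1) feature map by applying the same k × k kernel to every receptive region:

```python
import numpy as np

def convolve2d_shared(x, w, stride=1):
    """Valid 2-D convolution with weight sharing: every k x k receptive
    region of x is multiplied element-wise by the SAME kernel w and
    summed, i.e. one MAC operation y = sum_i(w_i * x_i) per region."""
    n, k = x.shape[0], w.shape[0]
    out = (n - k) // stride + 1
    y = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            region = x[i * stride:i * stride + k, j * stride:j * stride + k]
            y[i, j] = np.sum(region * w)  # shared weights across all regions
    return y
```

For an n × n input at stride 1, the output has dimensions (n−k+1) × (n−k+1), matching the feature map size stated above.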

**DNA implementation of MAC and convolution operation.** We started the experimental demonstration with the weight multiplication function (y = wi × xi), in which xi is a binary input and wi is an analogue weight. These weights are implemented by designing the switching gate with one weight tuning domain and two recognition domains. Weight multiplication is implemented with cascaded reactions (*Supplementary Fig. 3a*) wherein input species Xi convert an activated weight substrate molecule Ni,i,j* to an intermediate product Pi,j. Ni,i,j* is produced when Ni,i,j undergoes a spontaneous intramolecular conformational switch upon hybridization with the weight tuning molecule Wi. In the absence of Xi, no Pi,j will be generated; in the presence of Xi, the final concentration of Pi,j will be determined by the concentration of Ni,i,j*, thus setting the value of the weighted multiplication y. Different weights can be implemented by varying the concentration of Wi (*Supplementary Fig. 3b and Supplementary Figs. 4-10*). Then, we can compute the sum of weighted inputs within the same neuron (y = ∑wi × xi). This is implemented with reactions wherein all intermediate species Pi,j stoichiometrically convert the summation gate (Sdj,k) to a common weighted-sum species Ssj,k (*Supplementary Fig. 3c,d*). It should be noted that weights with negative values are implemented by different DNA strands. To complete the summation, the positive weighted-sum species Ssj,k and the negative weighted-sum species ‾Ssi,n need to be subtracted from one another (*Supplementary Fig. 3e*). Specifically, all positive weighted-sum species Ssj,k can convert the double-stranded complex Ddk,m to an intermediate species Dsk,m.
All negative weighted-sum species ‾Ssi,n generated from the previous step can bind to the toehold of the inhibitory strand Ini and branch-migrate to form inert waste species, producing the reactive annihilation species Subi,n* through an intramolecular conformational switch. The subtraction is thus realized as Subi,n* and Dsk,m annihilate each other. Only the remaining Dsk,m will interact with the downstream reporting gate (*Supplementary Fig. 3f*) to read out the output signal; otherwise the reaction is terminated (*Supplementary Fig. 3g*).
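The dual-rail subtraction just described can be abstracted as follows; this is a minimal sketch of the behaviour (positive and negative weighted sums annihilate stoichiometrically, and only a positive remainder reaches the reporter), not a model of the actual reactions:

```python
def dual_rail_subtract(pos_sum, neg_sum):
    """Abstract view of the subtraction step: the annihilation species
    (from the negative rail, via Sub*) consumes the intermediate Ds
    (from the positive rail) one-to-one; only leftover Ds can trigger
    the downstream reporting gate. Values stand for concentrations."""
    remaining = pos_sum - min(pos_sum, neg_sum)  # mutual annihilation
    return remaining  # 0 means the reaction is terminated (no output)
```

Note that when the negative rail dominates, the output is zero rather than negative, which is exactly the thresholding behaviour the annihilation provides.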

Subsequently, we demonstrated that simple MAC operations can be implemented by combining the weight multiplication and summation subfunctions (*Supplementary Figs. 11 and 12*). For example, a two-species MAC operation (y = w1 × x1 + w2 × x2) is implemented by adding one summation gate to two parallel weight multiplications (*Supplementary Fig. 11a, b*). We can vary the concentrations of the respective weight tuning molecules (W1 and W2) to obtain different weights for the corresponding multiplication reactions, and then compute the sum of all weighted inputs by using the same recognition domain connected to the summation gate. As expected, the circuit exhibits stoichiometric behaviour: the output level scales with the concentration of Wi (*Supplementary Fig. 11c, d*).
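In silico, the two-species MAC behaves as sketched below; the substrate cap `c_n` and the concentration values are hypothetical placeholders, since the text does not give concrete concentrations:

```python
def mac_two_species(x1, x2, c_w1, c_w2, c_n=100.0):
    """Two-input MAC y = w1*x1 + w2*x2 realized stoichiometrically:
    each weight w_i is set by the concentration of the weight tuning
    molecule W_i (c_wi), which can activate at most c_n of the weight
    substrate. Inputs x1, x2 are binary (1 = input strand present)."""
    w1 = min(c_w1, c_n)  # activated substrate N* limits the weight
    w2 = min(c_w2, c_n)
    return w1 * x1 + w2 * x2  # concentration of the weighted-sum species
```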

Then, we combined two MAC operations to demonstrate convolution of a 2×2 input pattern with a 2×1 kernel. Each receptive region of the input pattern (x1 and x3; x2 and x4) is multiplied by the weights (w1 and w2) to obtain weighted pixels (x1 × w1 and x3 × w2; x2 × w1 and x4 × w2), respectively. Feature maps (y1 and y2) are then exported by summing up the weighted pixels in the same receptive region (Fig. 2a). We built a small-scale DNA regulatory circuit using three subfunctions (multiplication, summation, and reporting) to perform the convolution operation (Fig. 2b). The 2×2 input pattern is encoded with four DNA input strands (X1, X2, X3, and X4). The convolution kernel is encoded in the sequences of the weight tuning domains (see green domains *Ss1* and *Ss2* in Fig. 2b) of the weight tuning molecules (W1 and W2). To complete the multiplication with shared weights, we designed four weight substrate molecules (N1,1,4 and N1,2,6, N2,3,4 and N2,4,6), two of which (for example, N1,1,4 and N1,2,6 in Fig. 2b) have the same weight tuning domain, corresponding to the pixels that interact with the same kernel in different local receptive regions, but different recognition domains at the 3' end to connect to the downstream summation gates (Sd4,5 and Sd6,7). Each input pattern is binary, in which 1 or 0 represents the presence or absence of the input strand. The values of the analogue weights determined from the convolution kernel are implemented with the concentrations of Ni,i,j. To compute the convolution, each DNA sub-circuit runs independently and in parallel to compute the MAC operation in its receptive region. For a specific pattern, the corresponding weight tuning molecules Wi activate the respective weight substrate molecules Ni,i,j. The assignment of shared weights by the convolution kernel is thus implemented through DNA strand displacement reactions between the activated weight substrate molecules Ni,i,j* and Xi, resulting in the release of the intermediate species Pi,j.
Two summation gates Sdj,k convert the Pi,j in the same receptive region to the weighted-sum species Ssj,k, triggering the downstream reporting gates. For experimental demonstration, we chose 6 input patterns, and both outputs achieved their correct 'on' or 'off' states, indicating that the DNA circuit correctly implemented the convolution computation (Fig. 2c). For example, with inputs X1 X2 X3 X4 = 1001, the concentration of output strands and the corresponding fluorescence signal y1 (or y2) is proportional to X1 × W1 (or X4 × W2), as designed.
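The 2×2/2×1 example can be checked numerically; this sketch mirrors the circuit's two receptive regions (the columns of the pattern) sharing the kernel (w1, w2):

```python
def convolve_2x2_with_2x1(x, w):
    """Convolution of a 2x2 pattern [[x1, x2], [x3, x4]] with a 2x1
    kernel (w1, w2): each column is one receptive region sharing the
    same weights, and each output sums its two weighted pixels (the
    role of the two summation gates in the circuit)."""
    (x1, x2), (x3, x4) = x
    w1, w2 = w
    y1 = w1 * x1 + w2 * x3  # first receptive region
    y2 = w1 * x2 + w2 * x4  # second receptive region
    return y1, y2
```

With X1 X2 X3 X4 = 1001, this gives y1 = w1 and y2 = w2, consistent with the observed proportionality to X1 × W1 and X4 × W2.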

**A DNA-based ConvNet for molecular pattern recognition.** Having shown that the DNA circuit is capable of performing convolutions, we next built a DNA-based ConvNet that can 'remember' two handwritten symbols: the Chinese oracles 'fire' and 'earth' (Fig. 3a). The training set consists of 48,000 patterns of handwritten symbols from the Sinica oracle database. In silico, all original symbols were converted to 144-bit binary patterns for network training by rescaling them to 12×12 grids and setting each pixel to 1 when it exceeded the threshold (*Supplementary Fig. 13*). The convolution kernel (a 6×6 matrix) slides along the input patterns with a stride of 6 and subsequently generates a corresponding output feature map (Fig. 3b and *Methods* '*Neural-network training and testing*'). We evaluated the network performance on a reference dataset after training, reaching 97% accuracy (Fig. 3c). We implemented this ConvNet model by encoding the convolution kernel in the sequences of the weight tuning domains and implementing the values of the weights with the concentrations of the weight substrate molecules Ni,i,j. The test input binary patterns were encoded with single strands, wherein each 1 or 0 represents the presence or absence of an input strand (Fig. 3d). The DNA-based ConvNet implements pattern recognition by comparing local features to all memories and identifying the most similar memory (*Supplementary Fig. 14*). For example, each receptive region of a 'fire' can simultaneously interact with the same kernel function to export feature maps through DNA strand displacement cascades. As the network runs, a subset of weight tuning molecules Wi activates the corresponding weight substrate molecules Ni,i,j in four receptive regions at the same time, enabling multiple weight-sharing MAC operations to run in parallel (*Supplementary Fig. 15a*).
This allows the DNA circuit to activate a specific reaction pathway in the convolution layer when exposed to a specific pattern, which enhances network robustness (*Supplementary Fig. 16*). A max-pooling step is then applied to reduce the feature map size by annihilating the smaller of two pixels through cooperative hybridization7 (*Supplementary Fig. 15b*). As shown in the experimental data (Fig. 3e and *Supplementary Fig. 17*), the input patterns of the two handwritten symbols each triggered the desired outputs, indicating that the two handwritten symbols are classifiable. When each oracle pattern was rotated in 30° increments from 0° to 360°, the circuit still yielded the desired output for all 26 test input patterns, indicating that it correctly classified rotated patterns. In total, 177~250 distinct molecules were used for all test patterns (*Supplementary Fig. 17b*). As expected, we showed that the DNA-based ConvNet can also remember eight 144-bit molecular patterns simultaneously (*Supplementary Figs. 18-22*).
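The convolution-plus-pooling pipeline can be sketched in silico as follows; this is an abstraction of the network's arithmetic, not of the reaction kinetics, and the winner-take-all return value stands in for the pairwise annihilation:

```python
import numpy as np

def convnet_forward(pattern, kernel):
    """Forward pass sketch: a 12x12 binary pattern convolved with a
    6x6 kernel at stride 6 yields a 2x2 feature map (four parallel
    weight-sharing MAC operations); max-pooling then keeps the largest
    response, mimicking annihilation of the smaller pixel values."""
    fmap = np.array([[np.sum(pattern[i:i + 6, j:j + 6] * kernel)
                      for j in (0, 6)] for i in (0, 6)])
    return fmap.max()
```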

**A hierarchical neural network for recognition of 32 patterns.** Our ConvNet has the feature that the input of weight tuning molecules can selectively activate a specific set of weights, allowing the same set of DNA molecules to be used for different tasks. The use of weight tuning molecules as outputs of an upstream circuit opens up possibilities for building hierarchical networks for more sophisticated categorization tasks. To validate this approach, we proposed a two-step classification scheme that first uses a logic gate to perform coarse classification and then uses a ConvNet to perform finer classification. To demonstrate this approach experimentally, we chose the task of recognizing 32 handwritten symbols that can be divided into 4 groups: 8 Chinese oracles (Sinica oracle database), 8 Arabic numerals (MNIST database), 8 English letters and 8 Greek letters (Kaggle website). In silico, we converted all original handwritten symbols to binary patterns with two layers (Fig. 4a). Layer 1 is on a 1×4 grid and acts as an input for logic circuits to perform coarse classification by language (for example, oracle is 1000), yielding outputs that selectively activate the downstream ConvNet subnetwork to perform fine classification into specific handwritten symbols using Layer 2, on a 12×12 grid, as inputs. The 4 groups in Layer 2 can be trained separately in silico with the respective datasets to obtain the optimal model (Fig. 4b), yielding the values of four convolution kernels of dimensions 3×6 (stride = 3×6) (*Supplementary Fig. 23, Methods 'Neural-network training and testing'*). This network performed well on the reference dataset, reaching >84.0% accuracy in each group (Fig. 4c). We implemented the two-step classification approach experimentally by designing different switching gates to encode the four convolution kernels.
Both the tags in Layer 1 and the inputs in Layer 2 are binary patterns, in which each 1 or 0 indicates the presence or absence of a tag strand (or an input strand), respectively (*Supplementary Fig. 24a*). The pattern classification is completed with the following steps (Fig. 4d): (i) The tag strand in Layer 1 reacts with the reporter gate to generate a fluorescence signal yi, which can be recognized as the corresponding coarse category (*Supplementary Fig. 24b, c*). Meanwhile, the tag strand reacts with the fan-out gate to release a set of weight tuning molecules Wi, which can then activate the downstream DNA neural networks (*Supplementary Fig. 24b, d*). (ii) The Wi generated from the upstream logic circuits (*Supplementary Fig. 25*) then activate the corresponding neural network to implement the recognition (Fig. 4e), in which each output yi is uniquely correlated to a specific handwritten symbol to enable the fine classification. Two fluorescence signals are collected, from Layer 1 and Layer 2 respectively, to determine the recognition result. For example, a 'fire' is recognized if and only if both outputs equal 1, where the Layer 1 output y1 identifies the coarse category 'oracle' and the Layer 2 output y1 identifies the fine category 'fire'. In total, constructing the DNA-based ConvNet that can remember 32 molecular patterns requires 368~512 distinct molecules for all test patterns. As expected, the circuit yielded the desired pair of outputs for 32 representative example patterns with group identities (Fig. 4f, *Supplementary Figs. 26 and 27*). In general, with this hierarchical approach, constructing a DNA-based ConvNet that can recognize b×m distinct n-bit patterns (b is the number of groups and m is the number of patterns in each group) with an e-bit kernel requires n + 5m + (m+1) × b × e molecules (*Supplementary Fig. 28*).
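The two-step flow and the molecule-count formula can be sketched as follows; `kernel_bank` and `classify_fine` are hypothetical stand-ins for the four kernel groups and the downstream ConvNet, not names from this work:

```python
def two_step_classify(tag, pattern, kernel_bank, classify_fine):
    """Hierarchical classification sketch: the one-hot Layer-1 tag both
    reports the coarse category and (via the fan-out gate releasing
    weight tuning molecules) selects which kernel group the downstream
    ConvNet runs with on the Layer-2 pattern."""
    group = tag.index(1)          # coarse step (e.g. 1000 -> group 0)
    kernels = kernel_bank[group]  # only this subnetwork is activated
    return group, classify_fine(pattern, kernels)

def molecule_count(n, m, b, e):
    """Distinct molecules needed to recognize b*m distinct n-bit patterns
    with an e-bit kernel, per the formula: n + 5m + (m+1)*b*e."""
    return n + 5 * m + (m + 1) * b * e
```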

**A cyclic freeze/thaw approach accelerates DNA circuits.** The speed of execution of DNA computing remains a challenge in large-scale DNA neural network reactions. For example, our computation for 2 categories took longer than 20 h (*Supplementary Fig. 17c*), and the computation time increased to over 36 h for 32 categories (*Supplementary Fig. 27*). To accelerate the DNA circuits, we developed a simple cyclic freeze/thaw approach (Fig. 5a). The cyclic freeze/thaw approach iteratively drives the strand displacement reaction towards thermodynamic equilibrium, accelerating the basic strand displacement reaction by ~75-fold (*Supplementary Figs. 29 and 30*). For a larger-scale circuit, 160 test patterns of 144 bits can be recognized in less than 30 min through 5 freeze/thaw cycles, which would otherwise require hours (Fig. 5b, c).
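For intuition, a ~75-fold increase in the effective rate constant shortens the time to a given conversion by the same factor for a bimolecular strand displacement reaction; the rate constant and concentration below are illustrative placeholders, not measured values from this work:

```python
def time_to_completion(k, c0, frac=0.9):
    """Time for a bimolecular reaction A + B -> products with equal
    initial concentrations c0 (M) and rate constant k (/M/s) to reach
    the given fractional conversion, from the integrated second-order
    rate law: t = x / (k * c0 * (c0 - x)) with x = frac * c0."""
    x = frac * c0
    return x / (k * c0 * (c0 - x))

t_slow = time_to_completion(k=1e4, c0=1e-7)        # baseline rate
t_fast = time_to_completion(k=75 * 1e4, c0=1e-7)   # ~75x accelerated
```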