Relationship between training database size and structural connectivity in artificial neural networks: Do specific local connections improve prediction accuracy?

Training artificial neural networks is expensive because a large training database is necessary; moreover, such large databases are often difficult to obtain. Hence, it is interesting to design artificial neural networks that can be trained with a small database while maintaining a prediction accuracy similar to that of fully connected neural networks. We studied neural networks with partial disconnections, additional bypass connections, and negative activation nodes, which are found in the neuronal systems of the human brain. By combining the fully connected neural network with the above three brain-like elements, we found that the modified neural network showed a prediction accuracy improved by up to 13% compared to the fully connected one despite the small training database. To analyze the improved neural network, the contribution of each node in the hidden layers to the total prediction accuracy was studied. We also found specific local connections that improve the prediction accuracy, and we discuss the design of a neural network that can be trained with a small database without a reduction in prediction accuracy.


Introduction
The realization of intelligent systems mainly depends on the size of the training database: the larger the training database, the higher the prediction accuracy. However, unlike intelligent machines, human beings can learn faster and from fewer examples. This may be related to the specific structure of neural networks in the human brain. The formation of this specific brain structure may have been shaped by evolution, for adaptation to harsh conditions such as fear, starvation, and cold [1], given the need to make decisions quickly. The structure of a neural network can be divided into the physical geometry of neuronal connections (structural connectivity) and the chemical connection strength between the neurons (evoked functional connectivity). Thus, although two brains may have the same physical geometry of neuronal connections, the flow of their neuronal signals may differ depending on the synaptic strengths.
Human brains show distinguishable spatial distributions of functional areas such as the visual, auditory, and prefrontal cortices. It is probable that the connections between neurons are governed by several rules, which may affect the formation of each functional area. However, the connection rules for the functional areas in the human brain have not been revealed yet.
According to Hebb's rule, neurons that fire together, wire together [2]. Hebb's rule explains how associative memory can be formed by changing the strength of the synapses. However, Hebb's rule works within a given neuronal structure; that is, it does not explain how synapses are physically formed between neurons. It is still unclear whether the synapses between neurons are formed randomly or genetically [3][4][5][6][7][8]. Then, how can neurons located far apart make connections to create associative memory? In this case, a direct connection between the two memories is physically impossible. Instead, they are probably connected indirectly through the higher layers of the hierarchical structure.
Inhibitory neurons exist in addition to excitatory neurons and account for 20-30% of all neurons. The role of inhibitory neurons is not yet clear. However, they are expected to suppress surplus neuronal signals and prevent interference between these signals. In the visual cortex, the role of inhibitory neurons is related to orientation selectivity [9][10][11]. However, the principles behind the formation of excitatory-inhibitory networks have not been studied as thoroughly as the functional roles of inhibitory neurons. Thus, it is interesting to probe why a neural network composed only of excitatory neurons cannot be optimized without inhibitory neurons.
A fully connected artificial neural network (FNN), a machine learning method, is considered a candidate for realizing artificial intelligence. Notably, the emergence of deep learning with big data has accelerated developments in artificial intelligence technology such as image classification, voice recognition, and self-driving cars [12][13][14][15]. FNNs use the strengths of the connections between nodes (i.e., weights) for learning, in the same manner as the brain controls the strengths of synapses. However, typically only the numbers of hidden layers and nodes are specified, and the types of connections and nodes have received little attention in studies on FNNs.
Here, we assumed that the functional connectivity, i.e., the trained weight structure, can be affected by the structural connectivity. Therefore, we adopted three properties of neural networks within the human brain, which are not included in FNNs but are well observed in the neuronal networks of human brains [16,17]. The first property is that the nodes in the hidden layers need not be fully connected between adjacent layers. The second property is that additional bypass connections in the hidden layers can exist, reaching the next adjacent layer or beyond. The third property is that nodes in the hidden layers can be inhibitory instead of excitatory (i.e., negative activation nodes; Fig. 1). In Eq. (1), the main variables affecting prediction accuracy in FNNs and in brain-like neural networks (BNNs) including the above three properties are summarized. Even though these three properties do not represent the whole set of neuronal connections in the brain, they can be used as starting points to study how the brain can learn from small training data sets. Therefore, our goal in this work is to reveal the relationship between the size of the training database and the structural connectivity of BNNs (Fig. 1).
Prediction Accuracy ~ f_FNN(N_layers, N_nodes) for FNN,
Prediction Accuracy ~ f_BNN(N_layers, N_nodes, N_discon, N_addition, N_inh) for BNN, (1)

where N_layers, N_nodes, N_discon, N_addition, and N_inh correspond to the number of hidden layers, nodes in the hidden layers, disconnections, additional bypass connections, and inhibitory nodes, respectively.

(Fig. S1(c)). As in the disconnection, a ratio between 0 and 0.5 was assumed for the random selection of hidden nodes for the additional bypass connections. For the inhibitory nodes, the signs of the activation values were modified to become negative, and a ratio between 0 and 0.5 was also applied for the random selection of inhibitory nodes. To find optimized BNNs showing high prediction accuracy, we set up 576 BNNs combining ratios between 0 and 0.5 for the disconnections, additional bypass connections, and inhibitory nodes, and levels between 2 and 4 for the additional bypass connections (Table I). Other learning conditions, such as backpropagation, sigmoid activation, etc., were the same as those for the FNNs.
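As a minimal illustration, the three structural modifications above can be generated as random masks applied to a standard FNN. This is only a sketch under assumed conventions (random selection at the given ratios; level-2 bypass connections only); the function name and interface are hypothetical, not taken from the original implementation.

```python
import numpy as np

def make_bnn_masks(layer_sizes, p_discon=0.2, p_bypass=0.2, p_inh=0.2, seed=0):
    """Sketch of the three BNN factors as masks over an FNN (assumed form).

    layer_sizes includes input and output, e.g. [784, 250, ..., 10].
    """
    rng = np.random.default_rng(seed)
    # 1) Partial disconnection: drop a fraction p_discon of adjacent-layer weights.
    adj_masks = [(rng.random((m, n)) >= p_discon).astype(float)
                 for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
    # 2) Additional bypass connections: extra links that skip one layer (level 2).
    bypass_masks = [(rng.random((m, n)) < p_bypass).astype(float)
                    for m, n in zip(layer_sizes[:-2], layer_sizes[2:])]
    # 3) Inhibitory nodes: a fraction p_inh of hidden nodes get negated activations.
    signs = [np.where(rng.random(n) < p_inh, -1.0, 1.0)
             for n in layer_sizes[1:-1]]
    return adj_masks, bypass_masks, signs
```

During training, the adjacency and bypass masks would multiply the corresponding weight matrices, and the sign vectors would multiply the hidden activations, so that the masked connections stay absent and the inhibitory nodes stay negative.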

Analysis method.
The knockout method has been used for the identification of the role of specific neurons in experimental neuroscience [18,19]. Here, the knockout method was developed to quantitatively analyze the contribution of each node to the prediction accuracy in BNNs. After training the BNNs, all connections from a targeted hidden node were removed (i.e., the node was knocked out), and the test accuracy was recalculated.
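The knockout step described above can be sketched as zeroing the outgoing weights of one hidden node and measuring the resulting drop in test accuracy. The `evaluate` callback and the list-of-weight-matrices layout are assumed interfaces, not the paper's actual code.

```python
import numpy as np

def knockout_contribution(weights, hidden_layer, node_idx, evaluate):
    """Return the drop in test accuracy after knocking out one hidden node.

    weights: list of weight matrices [W_in->h1, W_h1->h2, ..., W_hk->out].
    hidden_layer: 0-based index of the hidden layer containing the node.
    evaluate: assumed callback mapping a weight list to test accuracy.
    """
    base_accuracy = evaluate(weights)
    pruned = [w.copy() for w in weights]
    # Remove all connections leaving the targeted node: zero its outgoing row.
    pruned[hidden_layer + 1][node_idx, :] = 0.0
    return base_accuracy - evaluate(pruned)
```

Repeating this over every hidden node yields a per-node contribution map, which is how the analysis below attributes prediction accuracy to specific local connections.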

MNIST database.
The MNIST database is a collection of handwritten digits from 0 to 9 [20]. It consists of 60,000 training and 10,000 test images. Each grayscale image has a dimension of 28 × 28 pixels; therefore, the input vectors for the FNNs and BNNs had a length of 784. The dimension of the output was 10 (digits 0-9). For training, two types of BNNs were used, with hidden layers of 250-200-150-100-50 and 100-80-60-40-20 nodes for the test and analysis, respectively.
Other conditions were the same as those for the other cases using different databases.
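For concreteness, a forward pass of the 784-250-200-150-100-50-10 network described above can be sketched as follows, with sigmoid activations as stated in the text. The initialization scale and function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_fnn(sizes, rng):
    """Small random weights for each pair of adjacent layers (scale assumed)."""
    return [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

def forward(weights, x):
    """Fully connected forward pass with sigmoid activation at every layer."""
    a = x
    for w in weights:
        a = sigmoid(a @ w)
    return a

# 784 input pixels, five hidden layers, 10 output classes (digits 0-9).
sizes = [784, 250, 200, 150, 100, 50, 10]
```

A BNN variant would additionally apply the connectivity masks and activation signs described in the methods above before each matrix product.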

FEI database.
The FEI database is a collection of face images [21]. For the training, the BNNs consisted of hidden layers 250-200-150-100-50. Other conditions were the same as those for the other cases using different databases.

PubChem database.
The oxidation potentials of the training dataset were obtained from high-throughput quantum chemistry calculations [23].

Results and discussion
To study the optimized BNN structure, we initially set up the FNNs with five hidden layers.
The shape of the hidden layers was optimized by screening 3,480 FNNs using the MNIST database. In the optimized shape, the first hidden layer next to the input layer is the largest among the hidden layers, and the sizes of the subsequent hidden layers decrease gradually.
Three BNN factors, disconnection, additional bypass connection, and inhibitory nodes, were included in the FNNs with the optimized structure of hidden layers. Based on the combinations of the three BNN factors, 576 BNN structures with 250-200-150-100-50 hidden layers were generated and tested for predictions using the MNIST database (Table I). Here, the 20k database was used to test the BNNs for the case of large-size database training.

In Fig. 3, the role of inhibitory nodes is studied in more detail. Here, the activation thresholds of the inhibitory nodes are varied from 0 to 100 (0, 1, 5, 100). An increased activation threshold means that an inhibitory node requires a larger input to activate than the excitatory (positive) nodes do.
In other words, as the threshold increases, the number of inactive inhibitory nodes increases. Fig. 3 shows the decrease in prediction accuracy as the activation threshold increases for a high ratio of inhibitory nodes. Comparing the thresholds of 5 and 100 in Figs. 3(c) and 3(d), the decrease in prediction accuracy saturates beyond a threshold of 5. This may be related to the fact that the sum of the inputs from the preceding layer does not exceed 5; the decrease in prediction accuracy may instead originate from the non-activated inhibitory nodes, which behave like fully disconnected nodes.
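The thresholded inhibitory nodes discussed above can be sketched as a node that outputs a negative sigmoid value once its input exceeds the threshold and stays inactive (output 0) otherwise. The exact functional form used in the study is not specified, so this is an assumed formulation:

```python
import numpy as np

def inhibitory_activation(z, threshold=0.0):
    """Assumed inhibitory node: negated sigmoid, gated by an activation threshold.

    Inputs at or below the threshold leave the node inactive (output 0),
    so a higher threshold produces more inactive inhibitory nodes.
    """
    return np.where(z > threshold, -1.0 / (1.0 + np.exp(-z)), 0.0)
```

Under this form, raising the threshold toward values the layer inputs never reach turns every inhibitory node into a permanently silent node, matching the observation that high thresholds act like full disconnection.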
Therefore, the prediction accuracy decreases further than in the disconnection cases, which consist of only partially disconnected nodes. The prediction accuracy also shows a wider distribution in this case. An improvement of up to 13% occurs in the prediction accuracy for the BNN using the FEI database (Fig. 5(b)), which is a large effect compared to the improvement of under 1% for the BNNs using the MNIST and PubChem databases. Thus, the design of the BNN may be especially important when the training database is small. Furthermore, the type of training database affects how the inhibitory nodes influence the prediction accuracy. Increasing the ratio of inhibitory nodes results in a remarkable decrease in the prediction accuracy of the BNNs trained on the FEI database (Fig. 5(a)), in contrast to the BNNs trained on the MNIST database (size: 1k), which show only a slight decrease in prediction accuracy (Fig. 4(a)).

Conclusions
BNNs including three brain-like factors, disconnection, additional bypass connection, and inhibitory nodes, were studied to understand the relationship between specific local connections and training database size. For a large training database such as the MNIST database (size: 20k), additional bypass connections improve the prediction accuracy. In contrast, the inhibitory nodes worsen the prediction accuracy. For the small training MNIST database (size: 1k), irrespective of the existence of inhibitory nodes, many BNNs showed better prediction accuracy than the FNN. The knockout method was used for the quantitative analysis of the improved BNN, and the findings showed that the contribution of additional bypass connections to the prediction accuracy was high. Finally, we found an improvement of up to 13% in the prediction accuracy of specific BNN structures for small databases. In this study, we showed the existence of optimized BNNs depending on the size and type of the training database; we therefore expect future studies on how to find optimized BNN structures efficiently for a given database.