A new deep-learning structure for image localization, called an ONN, was presented and applied to detect the location of pneumothorax in chest X-rays, achieving an AUC of 0.870, an accuracy of 85.3%, a sensitivity of 75.0%, and a specificity of 86.5%. We also applied ONNs and CNNs to predict the location of the glottis in laryngeal images; with the ONN, the accurate-prediction and adjacent-prediction rates were 70.5% and 20.5%, respectively. The performance of the ONN compared favorably with that of the CNN, a commonly used deep-learning structure for image recognition, and compared decently with that of the selected ANN model [9,12]. For training on images with an input resolution of 256 × 256 pixels, an ONN required only approximately 10% of the computations required by a CNN.
An ONN extracted the spatial location information of the input images well by assigning the same weight factor to the connections from the input nodes in a given column to a vertical layer, and the same weight factor to the connections from the input nodes in a given row to a horizontal layer. This approach is similar to using latitude and longitude values on a map and finding their intersection. Having different biases for the vertical and horizontal nodes can help greatly in extracting the spatial location information. However, significantly increasing the number of vertical and horizontal layers contributes little to improving the extraction of the spatial location information.
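As a rough illustration of the weight sharing described above, the following Python sketch gives each column (and each row) a single shared weight and its own bias, then reads off a location as the intersection of the strongest row and column responses. It is only one possible reading of the description, not the ONN used in this study: the function names, the single vertical and single horizontal layer, the argmax read-out, and the toy image are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def orthogonal_layers(image, col_weights, row_weights, col_bias, row_bias):
    """Hypothetical vertical/horizontal layers with shared weights.

    Every pixel in column j is multiplied by the same weight col_weights[j],
    and every pixel in row i by the same weight row_weights[i]. Each node
    has its own bias (an assumption of this sketch).
    """
    image = np.asarray(image, dtype=float)
    # Vertical layer: one node per column, fed by the pixels of that column.
    vertical = sigmoid(col_weights * image.sum(axis=0) + col_bias)
    # Horizontal layer: one node per row, fed by the pixels of that row.
    horizontal = sigmoid(row_weights * image.sum(axis=1) + row_bias)
    return vertical, horizontal


def locate(image, col_weights, row_weights, col_bias, row_bias):
    """Return (row, column) where the strongest row and column responses
    intersect, like reading latitude and longitude off a map."""
    vertical, horizontal = orthogonal_layers(
        image, col_weights, row_weights, col_bias, row_bias
    )
    return int(np.argmax(horizontal)), int(np.argmax(vertical))


# Toy example: an 8 x 8 image with a single bright pixel at row 2, column 5.
height, width = 8, 8
toy_image = np.zeros((height, width))
toy_image[2, 5] = 1.0

location = locate(
    toy_image,
    col_weights=np.ones(width),
    row_weights=np.ones(height),
    col_bias=np.zeros(width),
    row_bias=np.zeros(height),
)
print(location)
```

Under these assumptions the two layers hold only one weight and one bias per row and per column, so the number of trainable values in them grows with the sum of the image height and width rather than with their product.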
CNNs have shown excellent performance in image classification and object detection by extracting abstractions from the images while generating feature maps with several filters. However, extracting the spatial location information required for localization may differ slightly from this abstraction extraction. Therefore, an approach from a different perspective, such as an ONN, is required.
In addition, the ONN with a sigmoid activation function for all the nodes achieved better diagnostic performance than the ONN with the RELU activation function for all the nodes other than the output nodes, as shown in Table 1. Since the back-propagation method [4–6] uses gradient descent, the RELU function greatly simplifies the calculation of the derivative of the activation function, as seen from Eq. (1). However, an essential characteristic of the activation function within an ANN node should be to generate a signal that is symmetric in both the forward and backward directions, and the function closest to this form is the sigmoid function. Although the RELU function looks similar to a sigmoid function in terms of shape, it might remove some important data flow information from the ANN [12] (see Eq. (1)) and cannot produce a symmetrical signal. Therefore, the performance of the ANN cannot be optimized with the RELU activation function.
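The trade-off discussed above can be made concrete with a short Python sketch. It shows why the RELU derivative is cheap to compute, how RELU discards negative pre-activations, and that the sigmoid satisfies the point symmetry sigmoid(-z) = 1 - sigmoid(z). Reading the "symmetric signal" in the text as this point symmetry is an assumption of the sketch, as are the function names and the sample values; the sketch does not reproduce Eq. (1).

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu(z):
    """RELU: passes positive values and sets the rest to zero."""
    return np.maximum(z, 0.0)


def relu_grad(z):
    """Derivative of RELU: 1 for positive inputs, 0 otherwise
    (the value at exactly 0 is set to 0 by convention here)."""
    return (z > 0).astype(float)


# Illustrative pre-activation values (hypothetical).
z = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])

# RELU maps every negative input to 0, and its gradient there is also 0,
# so neither the forward signal nor the backward signal distinguishes
# -3 from -1.
relu_out = relu(z)
relu_back = relu_grad(z)

# The sigmoid keeps distinct outputs and non-zero gradients for all inputs,
# and is point-symmetric about (0, 0.5).
sig_out = sigmoid(z)
sig_back = sigmoid_grad(z)
is_point_symmetric = bool(np.allclose(sigmoid(-z), 1.0 - sigmoid(z)))

print(relu_out, relu_back)
print(sig_out, sig_back, is_point_symmetric)
```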
Fully-connected small ANNs have achieved excellent results in image localization in previous studies [9,12]. However, to vary the input image resolution and the number of hidden layers, several dozen ANN models must be trained individually, and their test results must be compared to find the best model. This approach requires considerable time, effort, and computing resources, even for small ANN models. This study showed that an ONN can be used as a quick selection criterion when comparing small ANN models for image localization, for two reasons: the performance of the ONN compared decently with that of the selected ANN model, and training an ONN on images with an input resolution of 256 × 256 pixels requires a similar amount of computing resources to training a small ANN on the same number of images with a resolution of 30 × 30 pixels.
In conclusion, an ONN, a new deep-learning structure for image localization that is simple, efficient, and completely different from a CNN, was applied to detect the location of pneumothorax in chest X-rays and to predict the location of the glottis in laryngeal images. Since the ONN extracted the spatial location information of the input images better than the CNN, its localization performance compared favorably with that of the CNN. The ONN with a sigmoid activation function for fully-connected hidden nodes outperformed the ONN with the RELU activation function, which does not produce a symmetrical signal, an essential characteristic of the activation function within an ANN node. An ONN can be used as a quick selection criterion for comparing the test results of small ANN models for image localization when choosing the best model from several dozen ANN models. Finally, our approach can accurately predict locations in medical images, reduce the time delay in diagnosing urgent diseases, and increase the effectiveness of clinical practice and patient care.