Informed consent for inclusion in the study was obtained from all patients. This study was approved by our university’s ethics review board (No. 496) and was performed in accordance with the tenets of the Declaration of Helsinki.
Patients
Panoramic radiographs of 383 patients (169 female and 214 male) with unilateral CA, acquired between August 2004 and July 2020, were retrospectively selected from our hospital image database. All patients were verified as having unilateral CA on the basis of medical records and CT or cone-beam CT examinations. The mean age of both male and female patients was 9.3 years. Of the 383 patients, 174 had CA alone and were assigned to the CA only group, whereas 209 had CA with CP and were assigned to the CA with CP group. The two groups were distinguished by referring to the patients’ medical records and CT images. Cases in which the cleft was limited to the area anterior to the incisive foramen, on the most inferior axial CT slice in which the foramen was visible, were assigned to the CA only group. Cases in which the cleft extended posteriorly beyond the incisive foramen were assigned to the CA with CP group. Patients who had undergone surgical interventions on the bony structures around the cleft before the first panoramic examination were excluded. In most patients, panoramic examinations were performed several times before bone transplant surgery; the panoramic image taken just before the transplant was selected for the present study. As controls, 210 panoramic radiographs matching the mean age and sex distribution of the patients were selected from the same database for the same period. These control patients, who were assigned to the normal group in the present study, had been examined for other purposes, such as the evaluation of unerupted permanent teeth and examination before orthodontic treatment.
The panoramic radiographs were acquired using either an AUTO III NTR unit (Asahi Roentgen Industry, Kyoto, Japan), with a tube voltage of 75 kV, tube current of 12 mA, and exposure time of 12 s, or a Veraview Epocs unit (J. Morita Mfg. Corp., Kyoto, Japan), with a tube voltage of 75 kV, tube current of 8 mA, and exposure time of 16.2 s.
DL architecture
The DL system was built on the Ubuntu Linux operating system, version 16.04.2. The workstation had a GeForce GTX 1080 Ti GPU with 11 GB of memory (NVIDIA, Santa Clara, CA). The deep learning process was performed using a customized DetectNet built in the DIGITS version 5.0 training system (NVIDIA, Santa Clara, CA; https://developer.nvidia.com/digits). The Adam (adaptive moment estimation) solver was used for the training process, with a base learning rate of 0.0001. DetectNet has five main parts: data ingestion and augmentation, a fully convolutional network, loss function measurement, bounding box clustering, and mean average precision calculation [19].
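As an illustration of what the Adam solver does at each iteration, the following is a minimal sketch of a single Adam update, not the DIGITS implementation. Only the base learning rate of 0.0001 comes from the text above; the values of `beta1`, `beta2`, and `eps` are the commonly used defaults and are assumptions here, as is the function name `adam_step`.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return updated (params, m, v) after one Adam step.

    lr matches the base learning rate stated in the text; beta1, beta2
    and eps are assumed defaults, not values reported by the study.
    t is the 1-based iteration counter used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction compensates for the zero initialisation.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Toy example (hypothetical): one step on f(w) = (w - 3)^2 starting at w = 0.
w = [0.0]
grad = [2.0 * (w[0] - 3.0)]
w_next, m_next, v_next = adam_step(w, grad, m=[0.0], v=[0.0], t=1)
```

On the first iteration the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient's magnitude.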
Development of learning models
Two models (models 1 and 2) were created. The panoramic radiographs were downloaded from the database in JPEG format, and all images were cropped to 900×900 pixels. In each group (CA only, CA with CP, and normal), 30 images were randomly assigned to the test dataset. The remaining images were divided into a training dataset (approximately 80% of the remaining data) and a validation dataset, and were used to create the learning models (Table 1).
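The allocation described above can be sketched as follows. This is a minimal illustration, assuming a simple random shuffle per group; the function name `split_group`, the seed, and the image identifiers are hypothetical. Because the text gives the training fraction only as approximately 80%, the counts produced here may differ slightly from those reported in Table 1.

```python
import random


def split_group(image_ids, n_test=30, train_fraction=0.8, seed=0):
    """Randomly split one group's images into test, training and validation sets."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    test = shuffled[:n_test]
    remaining = shuffled[n_test:]
    # About 80% of the non-test images go to training, the rest to validation.
    n_train = round(len(remaining) * train_fraction)
    return {
        "train": remaining[:n_train],
        "validation": remaining[n_train:],
        "test": test,
    }


# Group sizes are taken from the Patients section; the IDs are placeholders.
group_sizes = {"CA only": 174, "CA with CP": 209, "normal": 210}
splits = {
    name: split_group([f"{name}_{i:03d}" for i in range(n)])
    for name, n in group_sizes.items()
}
```

Splitting each group separately keeps exactly 30 test images per group, as stated in the text, regardless of group size.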
In model 1, only two groups (CA only and CA with CP) comprised the training and validation sets; the normal group was not included. Rectangular regions of interest (ROIs) were set on the training and validation images to encompass the area of the CA as follows. The superior margin was set at the level of the inferior line of the piriform aperture on the contralateral healthy side, and the inferior margin was set at the alveolar ridge. The medial end was set at the alveolar ridge between the central incisors, and the distal end was set at the most distal portion of the piriform aperture. The coordinates of the upper left (x1, y1) and lower right (x2, y2) corners of the ROIs were labeled using ImageJ (National Institutes of Health, Bethesda, MD, USA) (Figure 1a) and converted to text form (Figure 1b). The CA only group and the CA with CP group were assigned as class 1 and class 2, respectively. Model 2 included the normal group’s data in addition to those of the patient groups. For the normal group, only class labels (class 0) were created, without ROI coordinates. In both models, 1000 epochs of training were performed. Inference was then applied to the test data, which included all three groups, using the created learning models. When a model detected a CA, the detected area was shown as a bounding box. Red and blue boxes were shown for the CA only group and the CA with CP group, respectively.
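The conversion of ROI corner coordinates to text form could look like the sketch below. The exact layout of the text files is not specified here (it is shown only in Figure 1b), so this example assumes the KITTI-style label format that DIGITS object detection datasets read: one line per object with 15 space-separated fields, of which only the class name and the bounding box are meaningful for 2D detection. The class names `ca_only` and `ca_with_cp`, the function name, and the coordinates are hypothetical.

```python
def roi_to_kitti_line(class_name, x1, y1, x2, y2):
    """Format one ROI as a KITTI-style label line (assumed format).

    (x1, y1) is the upper left corner and (x2, y2) the lower right
    corner of the ROI, in pixels.
    """
    if x2 <= x1 or y2 <= y1:
        raise ValueError("expected x1 < x2 and y1 < y2")
    fields = [
        class_name,                 # object class
        "0.0", "0", "0.0",          # truncation, occlusion, alpha (unused)
        f"{x1:.2f}", f"{y1:.2f}",   # bounding box: left, top
        f"{x2:.2f}", f"{y2:.2f}",   # bounding box: right, bottom
        "0.0", "0.0", "0.0",        # 3D dimensions (unused)
        "0.0", "0.0", "0.0",        # 3D location (unused)
        "0.0",                      # rotation (unused)
    ]
    return " ".join(fields)


# Hypothetical ROI on a 900x900 image for a class 1 (CA only) case.
line = roi_to_kitti_line("ca_only", 410, 300, 520, 420)
```

Each image would then have one label file containing such a line for its ROI.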
Analysis of image appearance features
To determine the characteristic image appearance features that could influence the DL models’ performance, the image datasets used for the training process (i.e., the training and validation datasets) were analyzed with regard to two structures: the inferior line of the piriform aperture and the lateral incisor on the affected side. The former was evaluated based on its visibility and its level relative to the contralateral (unaffected) side. The latter was evaluated according to whether the tooth was present or absent, and for findings of microdontia, non-eruption, and medial inclination. These evaluations were performed by two radiologists (YA and EA), each with more than 20 years of experience in interpreting panoramic radiographs. When the evaluations differed between the radiologists, the final determinations were reached by consensus after discussion.
Statistical analysis
Differences in ratios between two groups were tested using the chi-square test, with p < 0.05 established as the threshold for a significant difference.
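For a comparison of one binary feature between two groups, the test can be sketched as below. This is a minimal illustration of Pearson's chi-square test on a 2×2 table without continuity correction; whether a correction was applied in the study is not stated, so that choice is an assumption, as are the function name and the counts in the example. With one degree of freedom, the p-value can be obtained from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are feature present / absent.
    Returns (chi2, p) with 1 degree of freedom, no continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical counts, not data from the study.
chi2, p = chi_square_2x2(10, 20, 20, 10)
significant = p < 0.05
```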