Deep learning for detection and 3D segmentation of maxillofacial bone lesions in cone beam CT

To develop an automated deep-learning algorithm for detection and 3D segmentation of incidental bone lesions in maxillofacial CBCT scans. The dataset included 82 cone beam CT (CBCT) scans, 41 with histologically confirmed benign bone lesions (BL) and 41 control scans (without lesions), obtained using three CBCT devices with diverse imaging protocols. Lesions were marked in all axial slices by experienced maxillofacial radiologists. All cases were divided into sub-datasets: training (20,214 axial images), validation (4530 axial images), and testing (6795 axial images). A Mask-RCNN algorithm segmented the bone lesions in each axial slice. Analysis of sequential slices was used to improve the Mask-RCNN performance and to classify each CBCT scan as containing bone lesions or not. Finally, the algorithm generated 3D segmentations of the lesions and calculated their volumes.

The algorithm correctly classified all CBCT cases as containing bone lesions or not, with an accuracy of 100%. It detected the bone lesions in axial images with high sensitivity (95.9%) and high precision (98.9%), with an average dice coefficient of 83.5%. The developed algorithm detected and segmented bone lesions in CBCT scans with high accuracy and may serve as a computerized tool for detecting incidental bone lesions in CBCT imaging.

Our novel deep-learning algorithm detects incidental hypodense bone lesions in cone beam CT scans acquired with various imaging devices and protocols. This algorithm may reduce patients' morbidity and mortality, particularly since, currently, cone beam CT interpretation is not always performed.

• A deep learning algorithm was developed for automatic detection and 3D segmentation of various maxillofacial bone lesions in CBCT scans, irrespective of the CBCT device or the scanning protocol.
• The developed algorithm can detect incidental jaw lesions with high accuracy, generates a 3D segmentation of the lesion, and calculates the lesion volume.


Introduction
Oro-maxillofacial radiology currently uses 3D imaging, which overcomes the anatomical overlap and distortions inherent to 2D modalities such as intraoral and panoramic imaging. Cone beam CT (CBCT) uses a cone-shaped x-ray beam rotating around the patient's head and capturing data to reconstruct a 3D image of the oral cavity [1-3], with a relatively lower radiation dose and higher spatial resolution compared to multi-detector CT [4]. CBCT is common in dental practice for various indications such as pre-implant planning, assessing the location of impacted teeth as well as bone deficiency in patients with clefts, diagnosing and managing endodontic pathologies, and assessing and diagnosing jaw lesions [5, 6]. There are currently more than 40 different CBCT devices, which use many protocols with various voxel sizes, fields of view (FOV), and rotation pathways [1, 4]. Maxillofacial bone lesions (BL), found in both jaws in different sizes, shapes, and densities, may result from cysts and benign or malignant tumors, and can be either clinically apparent or detected incidentally on CBCT imaging [6]. Benign tumors, accounting for most BL, are found incidentally in 1.9-4% of cases [7, 8].

The dataset
In order to develop an automated deep-learning algorithm for detection and 3D segmentation of BL in CBCT, we searched our archives for relevant cases. According to Balki et al [28], there is no agreement on the required sample size for deep learning studies. Therefore, we reviewed all CBCT scans performed between 2019 and 2020 at our oral maxillofacial imaging unit, for which we had full access to all technical and medical records. The inclusion criteria for the study cases were CBCT scans with well-defined hypodense benign BL, histologically confirmed, located either in the maxilla or in the mandible, fully demonstrated in the scan FOV, performed prior to surgical intervention, with or without metal artifacts. The exclusion criteria were CBCT scans with movement artifacts. After applying the inclusion and exclusion criteria, our study included 41 CBCT scans with benign BL. Then, we searched for 41 CBCT scans without BL (control cases), confirmed by a maxillofacial imaging expert (C.N.). The inclusion criteria for the control cases were scans of the maxilla, the mandible, or both jaws, performed prior to implant placement, without movement artifacts, with or without metal artifacts. The study was approved by the Institutional Review Board (HMO-0297-21) with an exemption of informed consent.
The CBCT scans were performed using three different CBCT devices, with different protocols. Most scans were obtained with the MORITA™ 3D Accuitomo 170, followed by the I-CAT™ Next Generation and the CRANEX® 3D Dental Imaging System. The tube voltage ranged between 85 and 120 kV, the tube current between 3 and 13 mA, the FOV diameter between 4 and 16 cm, the FOV height between 4 and 14 cm, and the voxel size between 0.08 and 0.25 mm. Most of the scans used a full rotation, though some cases used a half rotation (Table 1).
Due to the limited number of anonymized CBCT scans in our dataset, we applied the deep learning algorithm to each axial slice individually. As a result, our dataset consisted of 4555 axial slices with BL and 31,539 axial slices without BL (Table 1).
All CBCT scans were assigned to training (50 cases, 20,214 slices), validation (10 cases, 4530 slices), and testing (22 cases, 6795 slices) datasets. The scans were divided so that each dataset included various types of BL, different scanning protocols, and different CBCT devices (Table 1). The number of axial slices per case ranged from 161 to 705 (Appendices A and B).
The included BL differed in size, location, and histopathological diagnosis (Table 1). Some were adjacent to a tooth, the cortical bone, the maxillary sinus, or the mandibular canal, and appeared differently in the various slices. The maximal and minimal lesion diameters, defined as the lesion diameter in the slice in which it appeared largest or smallest, were measured (Table 1).
In order to train and test the performance of the developed algorithm, the exact contours of the BL were manually drawn in a middle axial slice and in both the most inferior and the most superior slices by an oral maxillofacial radiologist (R.A.A.) with more than 6 years' experience. Based on these marks, the contours of the BL in all other slices were manually drawn by students (T.B., A.I.P., A.A.) using the VGG Image Annotator (VIA) program [29]. Finally, these manual contours served as ground truth after confirmation, or modification, by an oral maxillofacial radiologist (C.N.) with more than 10 years' experience.

The deep learning Mask-RCNN algorithm
In the first step of the 3D detection process, a Mask-RCNN deep learning algorithm [26] was implemented to detect the BL separately in each CBCT axial slice. This algorithm extends the faster region-based convolutional neural network (Faster RCNN) [30], which detects only a bounding box containing the bone lesion, by additionally segmenting the bone lesion mask within the bounding box.
The Mask-RCNN algorithm uses a feature pyramid network (FPN) backbone for feature extraction from each slice and then applies a region proposal network (RPN) to propose regions of interest (ROIs) which may contain a BL. The proposed ROIs are then resized to a uniform size in order to classify them as a normal region or a region containing a bone lesion. Then, the algorithm generates a pre-segmentation bounding box (yellow rectangle in Fig. 1) for each detected bone lesion and adds an instance segmentation process based on a fully convolutional network [31] for producing a mask (red marking in Fig. 1) of the lesion within each bounding box. All three tasks (classification, bounding box regression, and instance segmentation) utilize features extracted by the backbone network.
During training, the weights of the network are adjusted using five loss values, with the total loss calculated as the sum of the RPN class and RPN bounding box losses and the classification, bounding box, and mask losses of the three final tasks (Eq. 1). The losses for the RPN class and RPN bounding box relate to the output of the region proposal network. The RPN class loss is a binary classification evaluation, marked as positive if the overlap between the proposed region and the ground-truth bounding box is greater than 0.7 using the intersection-over-union (IoU) metric. The RPN bounding box loss measures the discrepancy between the corner points of the proposed region and the ground-truth bounding box. The remaining three losses are calculated for each of the final three tasks in the algorithm. In addition to the training loss, calculated after each step, the loss is also calculated for the validation dataset after each epoch to assess the algorithm's performance on an independent set of data.

In order to train the algorithm, we used transfer learning on the Mask-RCNN architecture, with a ResNet101 backbone and initial weights obtained from a pre-training dataset [32]. For this purpose, all CBCT slices were converted from DICOM to JPEG format with 256 gray-level values and resized to 800 × 800 pixels. The gray-level values were automatically modified according to the window attributes in the DICOM header. The training and validation datasets were augmented by horizontal flipping of the axial slices. The Mask-RCNN algorithm was trained on a single NVIDIA Quadro P2000 GPU, using the stochastic gradient descent (SGD) optimizer to minimize the loss of the model [33], with a momentum of 0.9, starting with a learning rate of 0.001 and ending with a learning rate of 0.0001. The training process ran for 96 h, with 60 epochs and 6000 steps per epoch. We found that fewer epochs led to underfitting and more epochs did not lead to a noticeable improvement. The minimum confidence level for categorizing a region of interest (ROI) as containing a bone lesion in a CBCT axial slice was set to 85%. A higher confidence level resulted in missing many relevant ROIs, while a lower confidence level caused many false marks.
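The confidence cut-off can be expressed as a simple post-filter over the per-slice detections. This is a minimal sketch: the detection records and field names are hypothetical, but the 85% threshold is the value reported above.

```python
def filter_detections(detections, min_confidence=0.85):
    """Keep only ROIs whose classification confidence meets the threshold.

    Each detection is a dict with a 'score' in [0, 1]; the 0.85 cut-off
    mirrors the minimum confidence level described in the text.
    """
    return [d for d in detections if d["score"] >= min_confidence]


# Hypothetical detections for one axial slice
detections = [
    {"score": 0.97, "label": "lesion"},
    {"score": 0.60, "label": "lesion"},  # discarded: below the 85% cut-off
    {"score": 0.86, "label": "lesion"},
]
kept = filter_detections(detections)  # keeps the 0.97 and 0.86 detections
```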

Improving the lesion detection and segmentation
In the final stage of the algorithm, the following steps were implemented.

1. Gray-level filtering

To improve the performance of the Mask-RCNN algorithm, an additional algorithm was used to filter the detected ROIs by the mean gray-level value of the segmented lesion within the ROI. ROIs containing very bright (hyperdense) segmented objects, with a mean gray-level value higher than 155, were removed, assuming they represent teeth or restorations, as shown in Fig. 2a1 and a2. Similarly, ROIs containing a dark (hypodense) segmented object, with a mean gray-level value lower than 50, were also removed, assuming they represent air, as shown in Fig. 2b1 and b2.
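A minimal sketch of this filtering step, assuming 8-bit axial slices and boolean ROI masks (the 50 and 155 cut-offs are the values reported above; the function and variable names are illustrative):

```python
import numpy as np

def gray_level_filter(roi_masks, axial_slice, low=50, high=155):
    """Discard ROIs whose mean gray level suggests air (< low) or
    teeth/restorations (> high); keep plausible hypodense lesions."""
    kept = []
    for mask in roi_masks:
        mean_gray = axial_slice[mask].mean()
        if low <= mean_gray <= high:
            kept.append(mask)
    return kept


# Hypothetical 8-bit slice with three segmented regions
slice_img = np.zeros((10, 10), dtype=np.uint8)
slice_img[0:3, 0:3] = 200   # bright region (tooth/restoration)
slice_img[5:8, 5:8] = 30    # dark region (air)
slice_img[0:3, 6:9] = 100   # intermediate region (candidate lesion)

masks = []
for rows, cols in [((0, 3), (0, 3)), ((5, 8), (5, 8)), ((0, 3), (6, 9))]:
    m = np.zeros((10, 10), dtype=bool)
    m[rows[0]:rows[1], cols[0]:cols[1]] = True
    masks.append(m)

kept = gray_level_filter(masks, slice_img)  # only the 100-gray-level ROI survives
```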

2. Analysis of sequential slices for confirming that the suspected ROIs contain a BL
In order to confirm that the segmented object in each detected ROI in a specific slice truly represented a BL, all sequential slices of each CBCT examination had to be analyzed. For this purpose, an additional algorithm was developed to analyze all slices of the CBCT examination and to evaluate, in sequential slices, the spatial location of the suspected objects (masks). For each slice containing a suspected object, the algorithm calculated its overlap with suspected objects in previous slices as

overlap = (A ∩ B) / A    (2)

where A is the number of pixels in the suspected object and A ∩ B is the number of pixels located identically in the suspected objects in this slice and in a previous slice. Several previous slices, rather than a single slice, were used, since the Mask-RCNN algorithm may have missed the ROI containing the BL in the neighboring slice (Fig. 3). Then, the algorithm defined subgroups of ROIs in which the calculated overlap between the objects was more than 50%. Any subgroup including at least 14 ROIs in any 20 successive slices was considered a subgroup containing a true BL.
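The two rules above, the pixel-overlap ratio and the 14-ROIs-in-20-successive-slices criterion, can be sketched as follows (a simplified illustration; the function and variable names are not from the paper's code):

```python
import numpy as np

def overlap_ratio(curr_mask, prev_mask):
    """Fraction of the current object's pixels (A) that coincide with an
    object's pixels in a previous slice (A ∩ B), i.e. (A ∩ B) / A."""
    a = curr_mask.sum()
    if a == 0:
        return 0.0
    return np.logical_and(curr_mask, prev_mask).sum() / a

def contains_true_lesion(subgroup_slice_indices, min_hits=14, window=20):
    """Accept a subgroup of >50%-overlapping ROIs as a true BL if at
    least `min_hits` of its ROIs fall within any `window` successive
    slices."""
    hits = sorted(set(subgroup_slice_indices))
    for start in hits:
        if sum(1 for i in hits if start <= i < start + window) >= min_hits:
            return True
    return False


curr = np.zeros((5, 5), dtype=bool)
curr[0, 0:4] = True                   # A: 4 pixels
prev = np.zeros((5, 5), dtype=bool)
prev[0, 2:5] = True                   # overlaps A at columns 2 and 3
ratio = overlap_ratio(curr, prev)     # 2 / 4 = 0.5
```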

3. Identifying the initial and final slices containing the BL
When the algorithm classified a CBCT case as containing a true BL, it was important to precisely locate the BL within the scan by identifying the initial and final slices containing the BL. In the extreme slices, the Mask-RCNN may miss more ROIs since, in these slices, the BL are often smaller and their borders are less defined. Therefore, the initial slice was defined as the first of at least 6 slices out of 20 successive slices included in the same subgroup of ROIs. The final slice was defined as the last slice in the subgroup that fulfilled the same condition (6 out of 20 successive slices).
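The initial- and final-slice rule can be sketched as below, applying the "6 of 20 successive slices" condition from both ends of the subgroup (names are illustrative):

```python
def lesion_extent(detected_slice_indices, min_hits=6, window=20):
    """Locate the first and last slices of the lesion.

    The initial slice is the first detected slice followed by at least
    `min_hits` detections (itself included) within `window` successive
    slices; the final slice is found by applying the same rule to the
    reversed slice order. Returns (initial, final), or None if no slice
    qualifies.
    """
    hits = sorted(set(detected_slice_indices))

    def first_qualifying(indices):
        idx_set = set(indices)
        for s in indices:
            if sum(1 for i in range(s, s + window) if i in idx_set) >= min_hits:
                return s
        return None

    initial = first_qualifying(hits)
    # Negate and reverse the indices to scan from the other end.
    final_neg = first_qualifying([-i for i in reversed(hits)])
    if initial is None or final_neg is None:
        return None
    return initial, -final_neg
```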

4. Creating a 3D mask of the BL
Then, the algorithm inserted a BL mask in all the slices between the initial and final slices in which the BL had been missed. This was performed by interpolating the shape of the detected objects in the two nearest neighboring slices, since the mask of the BL should be very similar in subsequent slices, as shown in Fig. 3. Then, the algorithm removed all suspected objects not included in the range between the initial and final slices and created a 3D mask of the segmented BL. Finally, the volume of the segmented BL, important for follow-up, was calculated as the product of the voxel volume and the number of segmented voxels in the 3D mask.
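The gap-filling and volume steps can be sketched as follows. The exact shape-interpolation scheme is not specified in the text, so the sketch simply averages the two neighboring binary masks (with the default 0.5 threshold this reduces to their union); the volume formula is the one stated above.

```python
import numpy as np

def fill_missed_slice(prev_mask, next_mask, threshold=0.5):
    """Estimate the mask of a missed slice from its two nearest detected
    neighbors. Averaging two binary masks and thresholding at 0.5 yields
    their union -- a crude stand-in for the paper's (unspecified)
    interpolation."""
    avg = (prev_mask.astype(float) + next_mask.astype(float)) / 2.0
    return avg >= threshold

def lesion_volume_cm3(mask_3d, voxel_size_mm):
    """Volume = number of segmented voxels x voxel volume, converted
    from mm^3 to cm^3. Assumes isotropic voxels, as is usual in CBCT."""
    return mask_3d.sum() * voxel_size_mm ** 3 / 1000.0


# A 10 x 10 x 10-voxel lesion at 0.25-mm isotropic voxels
mask = np.zeros((32, 32, 32), dtype=bool)
mask[10:20, 10:20, 10:20] = True
volume = lesion_volume_cm3(mask, 0.25)   # 1000 voxels * 0.015625 mm^3
```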

Statistical analysis
The accuracy of the algorithm for classifying the CBCT scans (cases) as containing BL or not containing BL was calculated as the ratio between the number of cases that were correctly classified and the total number of cases.
The sensitivity, precision, and dice coefficient were evaluated individually for each case containing BL. The sensitivity was calculated as the ratio between the number of ROIs, in all axial slices of the case, correctly detected as containing BL and the total number of ROIs containing BL according to the ground truth. Precision was calculated as the ratio between the number of ROIs correctly detected as containing BL and the total number of ROIs detected by the algorithm, including falsely detected ROIs.
The dice coefficient [34] was calculated using the voxels of the 3D segmentation of the lesion. The dice was calculated as twice the number of overlapping voxels between the manual marking and the automatic segmentation, divided by the sum of the total number of voxels in the manual marking and in the automatic segmentation.
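For reference, the three metrics can be written directly from the definitions above (a sketch; variable names are illustrative):

```python
import numpy as np

def dice_coefficient(manual_mask, auto_mask):
    """Dice = 2 * |manual ∩ auto| / (|manual| + |auto|), computed on the
    voxels of the 3-D segmentations."""
    overlap = np.logical_and(manual_mask, auto_mask).sum()
    denom = manual_mask.sum() + auto_mask.sum()
    return 2.0 * overlap / denom if denom else 1.0

def sensitivity(correctly_detected_rois, ground_truth_rois):
    """Correctly detected lesion ROIs / all lesion ROIs in the ground truth."""
    return correctly_detected_rois / ground_truth_rois

def precision(correctly_detected_rois, all_detected_rois):
    """Correctly detected lesion ROIs / all ROIs reported by the
    algorithm, including false detections."""
    return correctly_detected_rois / all_detected_rois
```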

Detecting the bone lesion separately in each CBCT slice
The deep learning algorithm was tested on the 22 cases in the testing dataset, including 11 control cases and 11 study cases with various types of benign BL. The Mask-RCNN algorithm detected the ROIs containing BL in the axial slices with high sensitivity and high precision for all the different types and forms of BL, independent of their size, shape, and location (Fig. 4). However, the algorithm missed some ROIs with BL (Fig. 3) and sometimes falsely detected ROIs without lesions, such as regions of bone marrow defects (Fig. 5).
The performance of the basic Mask-RCNN algorithm before the improvement stages, for each of the study cases, is summarized in the columns marked "Basic" (Table 2). The sensitivity ranged between 52.8 and 100% with a total sensitivity of 83.5%, while the precision ranged between 80.8 and 100% with a total precision of 88.4%. For the control cases, the basic Mask-RCNN algorithm falsely marked 112 ROIs, with an average of 10.2 false marks per case.

Classification of the CBCT cases following analysis of sequential slices
All 11 control cases were classified by the algorithm as cases without BL, while the 11 study cases were classified as containing BL, yielding an accuracy of 100%.

Detecting and segmenting the bone lesions following the algorithm improvement
Following the improvement steps, the final sensitivity ranged between 82.0 and 100% with a total sensitivity of 95.9%, and the final precision ranged between 92.4 and 100% with a total precision of 98.8% (Table 2, columns marked "Final").
Following the 3D segmentation of the BL in each case (Fig. 6 and the video in Supplementary C), the dice coefficient ranged between 66.8 and 93.0% with an average of 83.5% ± 7.8%. The BL volumes ranged between 0.66 and 6.68 cm³, with an average error of 15% between the volume of the manually marked BL and the volume of the automatically segmented BL.

Discussion
In this study, an automated Mask-RCNN-based deep learning algorithm was used for the detection and 3D segmentation of BL in maxillofacial CBCT scans. The algorithm was trained, validated, and tested on 82 CBCT scans acquired by 3 different CBCT devices with various protocols. The Mask-RCNN network was trained on each CBCT axial slice individually, resulting in thousands of CBCT slices for the deep learning training process. Following the deep learning stage, additional stages were implemented to improve the algorithm performance. The accuracy for classifying each CBCT scan as containing BL or not was 100%. The ROIs containing BL were detected in the individual axial slices with high sensitivity (95.9%) and high precision (98.9%). The average dice coefficient of the 3D segmentation was 83.5% ± 7.8%.
The dataset included CBCT scans with or without BL, sorted into three datasets, aiming to include in each dataset cases scanned by different CBCT devices with various imaging protocols and BL of various histological types, locations, and sizes (Table 1). CBCT scans with movement artifacts were excluded since they are usually nondiagnostic. Scans with metal artifacts were not excluded, since they are common in the clinical setting.

Automatic detection of BL in CBCT is critical since the lesions may sometimes be located in a portion of the scan outside the images related to the clinical area of concern and therefore may be missed [7, 17]. In some studies, AI models have been used for segmentation and classification of BL in panoramic imaging [15, 35], while other studies focused on CBCT imaging. Several studies did not focus on BL detection but on classifying the lesion type after manual marking of the lesion contour or of a rectangle containing the lesion [36-38]. Most of the studies applying AI for BL detection and segmentation focused on periapical lesions only. Setzer et al used AI for segmenting periapical lesions in CBCT scans with a sensitivity of 88%, precision of 87%, and dice coefficient of 88% [39]. However, their algorithm cannot be used for detecting incidental BL since it requires manual cropping of a limited CBCT volume including one tooth and the adjacent lesion. Orhan et al developed an algorithm which automatically generated a 3D box around each tooth and segmented periapical lesions in the relevant boxes with a sensitivity of 89% and a precision of 95% [40]. Ezhov et al showed that in clinical settings, with a larger dataset, the same algorithm resulted in a sensitivity of 69.6 to 83.8% and a precision of 98.0 to 99.7%, depending on the lesion's radiographic appearance [41]. Recently, Kirnbauer et al detected radiolucent periapical lesions with a sensitivity of 97.1% and specificity of 88.0%, with a dice coefficient varying between 46.7 and 86.7% [42].
Two studies by Abdolali et al [43, 44] dealt with automatic detection of benign BL in various locations, not only adjacent to the teeth. Their algorithm for segmentation and classification of three bone lesion types achieved a total classification rate of 96.5%. However, unlike the algorithm described in our study, their algorithm requires a medium to large CBCT FOV since it is based on asymmetry between the two sides of the jaw. Moreover, their method might not be suitable for cases with a solitary anterior lesion or with multiple symmetrical lesions.
Our algorithm is unique since it is neither based on the symmetry of the jaw nor limited to periapical lesions, and it was trained and tested on a clinically realistic dataset acquired by various CBCT devices using different protocols. In spite of this diversity, the algorithm distinguished between CBCT cases with and without BL with an accuracy of 100%. The algorithm detected the BL in individual CBCT axial slices with high sensitivity (95.9%) and high precision (98.8%), and its average dice coefficient (83.5%) was similar to the results obtained in other studies. Although the average dice may be improved, the ability of the algorithm to detect incidental BL in CBCT scans is clinically more important than the precise segmentation of the lesion borders.
According to international guidelines, interpretation of CBCT scans is mandatory, but it is still not common practice worldwide [45]. Our novel algorithm could increase the detection of incidental BL at early asymptomatic stages, aiding the clinician's interpretation and thus improving patient management. Furthermore, the 3D segmentation of BL may assist in virtual reality educational simulators, for both diagnosis and treatment planning [46].
The major limitation of our study is its restriction to hypodense benign BL with well-defined borders. Another limitation is that our dataset includes a relatively limited number of cases. To detect all types of lesions, including less frequent BL, and to examine the algorithm's practical role in patient care, future prospective multi-center studies should be performed with larger datasets, including various hypodense, hyperdense, and mixed BL with both well-defined and ill-defined borders.

Conclusion
A unique deep learning algorithm was developed for detecting various benign BL in volumetric maxillofacial CBCT scans. This algorithm may be used for early detection of incidental BL in CBCT scans. This is most important since, currently, the entire scanned volume is not commonly interpreted in clinical practice. The 3D segmentation of BL may be used for treatment follow-up and for educational purposes.
Fig. 6 3D segmentation of a bone lesion. This bone lesion is an asymptomatic mandibular adenomatoid odontogenic tumor detected in a 34-year-old male, causing slight buccal cortical expansion. The volume analysis of the segmented lesion may serve as a follow-up tool. a Buccal view; b lingual view

loss = rpn_class_loss + rpn_bbox_loss + class_loss + bbox_loss + mask_loss    (1)

Table 1 Characteristics of the CBCT scans divided into training, validation, and testing datasets. Demographics, number of axial slices, technical scan information, radiographic features, and histopathological results for the group with a bone lesion (BL) and the control group (without a bone lesion, w/o BL)

Fig. 1 A schematic flow diagram of the Mask-RCNN algorithm used for detecting and segmenting maxillofacial BL in CBCT axial slices. In the final stages, the algorithm classifies the regions of interest (ROIs) as normal or as containing a bone lesion, generates an accu-

Fig. 3 A BL missed by the Mask-RCNN algorithm in some sequential slices. The initial segmentation of two mandibular BL, performed separately on 3 sequential axial slices (a-c and d-f). In these cases, the algorithm missed the ROI with the bone lesion in the middle slices (b and e). Following an analysis of sequential slices, the mask of the missed bone lesion was added by interpolating the shape of the detected objects in the nearest neighboring slices

Fig. 4 Segmenting different types of BL with various imaging devices and protocols. The examples show the axial slice (1) and the same slice with the segmentation by the deep learning algorithm (2). a A mandibular radicular cyst (RC) in close proximity to the inferior alveolar nerve (arrow); Morita, small FOV (6 cm), full turn, voxel size = 0.125 mm, 90 kV, 5.5 mA. b A mandibular keratocystic odontogenic tumor (KOT) in the same axial slice as the maxillary antrum; Morita, medium FOV (8 cm), half turn, voxel size = 0.125 mm, 90 kV, 5.5 mA. c A maxillary dentigerous cyst (DC) in close proximity to

Fig. 5 False marks detected by the Mask-RCNN algorithm. a1, b1 Examples of false-positive marks of a mandibular bone marrow defect detected by the Mask-RCNN algorithm as a hypodense bone lesion (red markings). a2, b2 The false marks were removed by the additional stage of analysis of sequential slices

Table 2
The sensitivity and precision of the Mask-RCNN algorithm. The table includes the number of ROIs with lesions, the number of detected ROIs, the number of missed ROIs, and the number of false marks for each case in the test group with BL. The columns "Basic" and "Final" refer to the performance of the Mask-RCNN algorithm before and after the analysis of sequential slices, respectively. Values in bold indicate emphasis on the "Final" and "Total" results