Human rights:
This study was approved by the ethics committee of Aichi-Gakuin University (No. 496). The committee waived the requirement for informed patient consent for inclusion of data because of the retrospective nature of the data use. All methods were carried out in accordance with relevant guidelines and regulations.
Preparation of datasets
A total of 60 904 original CT images, comprising 30 452 bone window images and their corresponding 30 452 soft-tissue window images, acquired from 200 patients between June 2019 and December 2019, were selected from our hospital image database for the model learning process. All CT examinations were performed to evaluate the status of an impacted third molar. Cases with severe inflammation were excluded. When these images were downloaded, the window level and width were set at 900 and 4500 HU, respectively, for the bone window images, and 60 and 300 HU, respectively, for the soft-tissue window images. The images were downloaded in Joint Photographic Experts Group (jpeg) format with a resolution of 900 × 900 pixels, and the image patches for the learning process were created by compressing them to 256 × 256 pixels with 8-bit gray values. Of the 60 904 patches obtained, 54 740 patches (from 180 patients) were assigned as training data, and 6164 patches (from 20 patients) were assigned as validation data (Fig. 1). The CT images were acquired using an Aquilion PRIME scanner (Canon Medical Systems, Otawara, Japan) with the following parameters: tube voltage, 120 kV; tube current-time product, 100 mAs; slice thickness, 0.5 mm; field of view, 20 cm.
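As an illustration of the windowing step, the mapping from Hounsfield units to 8-bit gray values for a given window level and width can be sketched as follows. This is a minimal sketch and not the procedure performed by the image viewer used in this study: the function name `apply_window`, the simple linear mapping, and the sample HU values are assumptions introduced for illustration only.

```python
def apply_window(hu_values, level, width):
    """Map HU values to 8-bit gray values (0-255) for a window level/width.

    Assumption: a plain linear ramp between the window limits, with values
    outside the window clipped to black (0) or white (255).
    """
    lower = level - width / 2.0
    upper = level + width / 2.0
    gray = []
    for hu in hu_values:
        clipped = min(max(hu, lower), upper)
        gray.append(round((clipped - lower) / (upper - lower) * 255))
    return gray


# Window settings reported in the text (level, width in HU).
BONE_WINDOW = {"level": 900, "width": 4500}
SOFT_TISSUE_WINDOW = {"level": 60, "width": 300}

# Hypothetical sample values, roughly: air, fat, water, muscle, bone.
sample_hu = [-1000, -100, 0, 60, 1500]

bone = apply_window(sample_hu, **BONE_WINDOW)
soft = apply_window(sample_hu, **SOFT_TISSUE_WINDOW)
print("bone window:       ", bone)
print("soft-tissue window:", soft)
```

With the soft-tissue window, everything below -90 HU or above 210 HU saturates, which is why fat and muscle are spread over most of the gray scale while bone is rendered uniformly white.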
CBCT images of 10 patients (4272 images) who were also examined with CT were selected from the same image database and prepared for the testing process (Fig. 1). In all 10 patients, the CBCT examinations were performed to clarify the relationship between the mandibular third molar and the mandibular canal before extraction, whereas the CT examinations were performed to evaluate post-extraction mandibular nerve damage and inflammation. The two examinations were carried out within 2 years of each other. No severe inflammation, which can affect soft-tissue findings, was observed in the CT images. Each downloaded CBCT image (1039 × 1264 pixels with 8-bit gray values in jpeg format) was compressed to 256 × 256 pixels for use as an image patch. To imitate bone window CT images, the brightness and contrast of all the CBCT images were manually adjusted before they were input into the testing process. The CBCT scans were acquired using an Alphard Vega scanner (Asahi Roentgen Ind. Co. Ltd., Kyoto, Japan) with a field of view of 102 × 102 mm and a voxel size of 0.2 mm³.
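The brightness and contrast adjustment was carried out manually, and the settings are not reported here. Purely as an illustration, a linear gray-level transform of the kind that such controls commonly apply might look like the following. The function name, the pivot at mid-gray (128), and the example settings are hypothetical and do not reproduce the adjustment actually used.

```python
def adjust_brightness_contrast(pixels, contrast=1.0, brightness=0):
    """Apply a linear gray-level transform to 8-bit pixel values.

    Assumption: contrast scales values around mid-gray (128) and brightness
    adds a constant offset; results are clipped to the 0-255 range.
    """
    adjusted = []
    for value in pixels:
        new_value = contrast * (value - 128) + 128 + brightness
        adjusted.append(int(min(max(round(new_value), 0), 255)))
    return adjusted


# Hypothetical row of CBCT gray values and hypothetical settings.
cbct_row = [30, 100, 128, 180, 240]
print(adjust_brightness_contrast(cbct_row, contrast=1.5, brightness=-20))
```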
All image processing was performed using IrfanView version 4.44 (http://www.Irfanview.com).
Learning architecture and processes
Training and testing processes were performed using Neural Network Console (Sony Corporation, Tokyo, Japan) with a GeForce 1080 Ti graphics processing unit (Nvidia, Santa Clara, CA). The learning method used a modification of the U-Net CNN reported by Ronneberger et al.29 The network was built from convolutional layers, rectified linear unit (ReLU) activation function layers, and pooling layers (Fig. 2). The training parameters were: learning epochs, 300; initial learning rate, 0.001; solver type, Adam. The trained model was then used to convert the testing CBCT image datasets to soft-tissue quality CBCT images in the Portable Network Graphics (png) format. These images are referred to as estimated CBCT images.
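To make the three building blocks named above concrete, a single convolution, ReLU, and pooling step can be sketched in plain Python. This is not the network used in this study, which was configured in Neural Network Console and also involves the upsampling path and skip connections characteristic of U-Net. The kernel values, the toy 6 × 6 image, and all function names are assumptions chosen only to show the operations.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation without padding (the 'convolution' of CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Rectified linear unit: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2; halves each spatial dimension."""
    h = len(feature_map) // 2 * 2
    w = len(feature_map[0]) // 2 * 2
    return [
        [max(feature_map[i][j], feature_map[i][j + 1],
             feature_map[i + 1][j], feature_map[i + 1][j + 1])
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


# Toy image with a vertical edge and a hypothetical edge-detecting kernel.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
kernel = [[-1, 0, 1]] * 3

pooled = max_pool_2x2(relu(conv2d_valid(image, kernel)))
print(pooled)
```

In a trained network the kernel weights are learned from the training patches and are not fixed by hand as in this toy example.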
Subjective evaluation of the quality of the estimated CBCT images
The image quality of the estimated CBCT images (Figs. 3a and 4a) was subjectively evaluated on a personal computer display by seven radiologists, all of whom had more than 3 years of experience in the interpretation of CT and CBCT images. The radiologists compared the estimated CBCT images with both the test CBCT image displayed with a window level and width appropriate for visualizing soft-tissue structures, which we refer to as the swCBCT image (Figs. 3b and 4b), and the compressed swCT image (Figs. 3c and 4c). The quality of these images was scored using a five-point grading system relative to the original soft-tissue window CT images presented on a DICOM display. In the actual evaluations, the observation windows of the displays were manually adjusted to optimize the visualization of six anatomical structures: the medial pterygoid and digastric muscles, the parapharyngeal and submandibular spaces, the submandibular gland, and the submental or submandibular lymph nodes. For the evaluation of the fascial spaces, the radiologists paid special attention to the visibility of the included fat tissue. For lymph nodes, the node to be evaluated was indicated beforehand on the images. The subjective scoring was performed according to the following criteria:
Score 0: The anatomical structure was difficult to identify.
Score 1: Between scores 0 and 2.
Score 2: The anatomical structure was sufficiently identifiable for use in a clinical setting, but the quality was inferior to the original soft-tissue window CT image displayed on a DICOM viewer.
Score 3: Between scores 2 and 4.
Score 4: The anatomical structure was clearly identified and the quality was equivalent to the original soft-tissue window CT image displayed on a DICOM viewer.
The means and standard deviations of the subjective scores were calculated for the 10 patients, and the differences between image types were assessed using the Steel-Dwass test, with statistical significance set at p < 0.01.
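The descriptive part of this analysis can be sketched with the Python standard library. The scores below are placeholder values invented for illustration and are not study data. The function name and the use of the sample standard deviation (n − 1 denominator) are likewise assumptions. The Steel-Dwass test is not part of the standard library and is not implemented here.

```python
from statistics import mean, stdev


def summarize_scores(scores_by_image_type):
    """Return (mean, sample SD) of the subjective scores per image type."""
    return {
        image_type: (mean(scores), stdev(scores))
        for image_type, scores in scores_by_image_type.items()
    }


# PLACEHOLDER scores on the 0-4 scale; NOT results from this study.
placeholder_scores = {
    "estimated CBCT": [3, 2, 3, 4, 3],
    "swCBCT": [1, 0, 1, 2, 1],
    "compressed swCT": [4, 3, 4, 4, 3],
}

for image_type, (m, sd) in summarize_scores(placeholder_scores).items():
    print(f"{image_type}: mean = {m:.2f}, SD = {sd:.2f}")
```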
Visibility of the digastric muscle relative to the fat tissue in the submandibular space
The visibility of a soft-tissue structure depends partly on the contrast between the target tissue and adjacent tissues. Therefore, to verify the visibility judgments, the voxel values of the anterior belly of the digastric muscle and the adjacent fat tissue in the submandibular space were measured on a slice of each of the three image types that were subjectively evaluated (estimated CBCT, swCBCT, and compressed swCT images) in the 10 patients. The most appropriate slices showing the maximum area of the muscle were selected by a radiologist (MF), and 160-pixel circular regions of interest (ROIs) were set in the bilateral muscles and adjacent fat tissues (Fig. 5). For the estimated CBCT and compressed swCT images, the window level and width were maintained at the same values used when they were created. The windowing of the swCBCT images was determined by a radiologist (MF) so that sufficiently high contrast was shown between the two tissues. The voxel values of the muscle and fat tissues, measured using ImageJ software,30–31 were totaled for all 10 subjects.
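The ROI measurement can be illustrated with a short sketch that averages the gray values inside a circular region of approximately 160 pixels and compares muscle with fat. The measurements in this study were made in ImageJ; the function name, the synthetic image, and the ROI centers below are hypothetical. A disc drawn on a pixel grid only approximates the nominal pixel count.

```python
import math


def circular_roi_mean(image, center_row, center_col, n_pixels=160):
    """Mean gray value inside a circular ROI of roughly n_pixels pixels.

    The radius is derived from the nominal area (pi * r^2 = n_pixels).
    """
    radius_sq = n_pixels / math.pi
    values = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_sq
    ]
    return sum(values) / len(values)


# Synthetic 32 x 32 image: left half "muscle" (100), right half "fat" (40).
image = [[100] * 16 + [40] * 16 for _ in range(32)]

muscle_mean = circular_roi_mean(image, center_row=16, center_col=8)
fat_mean = circular_roi_mean(image, center_row=16, center_col=24)
contrast = muscle_mean - fat_mean
print(f"muscle = {muscle_mean}, fat = {fat_mean}, difference = {contrast}")
```

A larger difference between the two ROI means corresponds to higher muscle-to-fat contrast under the window settings in use.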