A Novel AI Program to Detect COVID-19 Pneumonia and the Distribution of Affected Loci

Manami Sazuka Tokyo Metropolitan Geriatric Hospital: Tokyo-to Kenko Choju Iryo Center Hiroshi Yamamoto (hyamamot-tky@umin.ac.jp) Tokyo Metropolitan Geriatric Hospital https://orcid.org/0000-0001-8907-3166 Yasushi Unno Tokyo Metropolitan Geriatric Hospital: Tokyo-to Kenko Choju Iryo Center Aya M. Tokumaru Tokyo Metropolitan Geriatric Hospital: Tokyo-to Kenko Choju Iryo Center Kenji Toba Tokyo Metropolitan Geriatric Hospital and Institute of Gerontology: Tokyo-to Kenko Choju Iryo Center Hideki Ito Tokyo Metropolitan Geriatric Hospital and Institute of Gerontology: Tokyo-to Kenko Choju Iryo Center

The diagnosis of COVID-19 is made by reverse transcription real-time polymerase chain reaction (RT-PCR) using respiratory specimens, along with the clinical course of the disease, whereas the diagnosis of COVID-19 pneumonia is supported by chest X-ray or computed tomography (CT) scan. Chest CT findings that are typical or atypical for COVID-19 pneumonia have already been accumulated [1]. The recent report by Borakati et al. suggested that CT could be more useful for detecting COVID-19 pneumonia than chest X-ray [2]. However, the number of respiratory physicians and diagnostic radiologists who can evaluate these CT images is limited; therefore, CT is not effectively used in some medical institutions. On the other hand, in areas where PCR testing is difficult, chest CT imaging is considered a useful tool for triage [3,4].
InferRead™ CT Pneumonia (InferVision Medical Technology, Beijing, China) is an image analysis artificial intelligence (AI) program for COVID-19 pneumonia. This AI program was adapted from InferRead™ CT Lung (InferVision Medical Technology, Beijing, China), which was designed to aid the radiologist in detecting pulmonary nodules and COVID-19 pneumonia based on the CT data in China from January to February 2020. In June 2020, it was approved by the Pharmaceuticals and Medical Devices Agency in Japan. The program uses computer-aided detection, which processes X-ray or CT lung images and then presents reference information on the possibility of having COVID-19 pneumonia. InferRead™ CT Pneumonia also provides data concerning the affected volume in each lung lobe, which can be useful in detecting COVID-19. Performance testing for this program has been performed, but the results have not yet been published.
With the global spread of SARS-CoV-2, the number of cases of COVID-19 pneumonia has exploded. Therefore, enhancing the accuracy of detecting COVID-19 pneumonia in countries with limited medical resources is important to equalize crisis management in the context of the pandemic.
Based on the situation above, we investigated whether InferRead™ CT Pneumonia is a useful tool for detecting COVID-19 pneumonia, while also analyzing the distribution of affected lung volume.

Patient recruitment
The data of 1,311 patients from the Tokyo Metropolitan Geriatric Hospital (Tokyo, Japan), comprising 217 SARS-CoV-2-positive and 1,094 SARS-CoV-2-negative patients who underwent PCR between July 18, 2021, and August 17, 2021, were extracted from the medical records. Of these, 100 positive cases and 193 negative cases underwent chest CT scans, most of which were performed on the same day as the SARS-CoV-2 test. Among the positive cases, 21 patients were excluded because they were scanned before implementation of the AI program; the remaining 79 cases were included. Among the negative cases, many CT scans had been performed to screen for possible COVID-19 before surgery; the 43 patients scanned for this purpose were excluded, as were 60 patients scanned before implementation of the AI program. The remaining 90 SARS-CoV-2-negative cases were included (Fig. 1). Age, gender, smoking history, and date of symptom onset for each patient were collected from medical records.
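The inclusion counts in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative and not part of the study.

```python
# Counts stated in the recruitment paragraph (variable names are illustrative).
positive_with_ct = 100          # SARS-CoV-2-positive cases with a chest CT
positive_before_ai = 21         # scanned before the AI program was implemented

negative_with_ct = 193          # SARS-CoV-2-negative cases with a chest CT
negative_preop_screening = 43   # CT performed as preoperative screening
negative_before_ai = 60         # scanned before the AI program was implemented

positive_included = positive_with_ct - positive_before_ai
negative_included = negative_with_ct - negative_preop_screening - negative_before_ai

print(positive_included, negative_included)
```

Both results match the 79 positive and 90 negative cases reported as included.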

Definitive diagnosis of COVID-19 pneumonia
The definitive diagnosis of COVID-19 pneumonia was made by the physicians in charge on the basis of clinical symptoms, laboratory findings (C-reactive protein, leucocyte count, etc.), and findings on chest X-ray or CT images in patients with a positive SARS-CoV-2 PCR test. CT images were analyzed by diagnostic radiologists. PCR tests were performed on nose smear specimens using the Respiratory Panel 2.1 (RP2.1) with the FilmArray® 2.0 and FilmArray® Torch Systems (bioMérieux®, Marcy l'Etoile, France).

CT imaging and application of the AI program
For all enrolled cases, CT images were taken using iQon spectral CT (Philips Japan, Tokyo, Japan) or Aquilion ONE™ ViSION Edition (Canon Medical Systems, Tokyo, Japan). If images were acquired at multiple slice thicknesses, only those taken at a 0.5-mm thickness were transferred to the dedicated image server; if images were taken only at a 5-mm thickness, those images were transferred to the server. The choice of acquisition condition was made at the discretion of the radiology technologist based on the patient's condition. The images were analyzed by InferRead™ CT Pneumonia, and the possibility of COVID-19 pneumonia was presented at four levels: "None," "Low," "Middle," and "High." Of these, cases classified as "Middle" or "High" were regarded as AI-recognized COVID-19 pneumonia. Additionally, the volume of interest (VOI) of "possible viral pneumonia" in each lung lobe was collected for each case.
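The dichotomization rule described above ("Middle" or "High" counts as AI-recognized COVID-19 pneumonia) can be written as a small helper. This is a minimal sketch: the function and constant names are hypothetical and do not correspond to any interface of InferRead™ CT Pneumonia; only the four level names and the cut-off come from the text.

```python
# The four possibility levels reported by the AI program, as listed in the text.
AI_LEVELS = ("None", "Low", "Middle", "High")

# Levels counted as AI-recognized COVID-19 pneumonia in this study.
AI_POSITIVE_LEVELS = frozenset({"Middle", "High"})


def is_ai_recognized_pneumonia(level: str) -> bool:
    """Return True when the reported level counts as AI-recognized pneumonia.

    Hypothetical helper for illustration; raises on unknown level names.
    """
    if level not in AI_LEVELS:
        raise ValueError(f"unknown AI level: {level!r}")
    return level in AI_POSITIVE_LEVELS


calls = [is_ai_recognized_pneumonia(level) for level in AI_LEVELS]
print(calls)
```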

Performance of the AI program
We evaluated the sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio (LR), negative LR, and overall accuracy, with their confidence intervals (CIs), of the diagnosis of COVID-19 pneumonia by InferRead™ CT Pneumonia for identifying SARS-CoV-2 test positivity.
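The measures listed above all derive from a 2 × 2 table of AI result against PCR result. The sketch below shows one way to compute them. The counts are hypothetical placeholders, not study data, and the Wilson score interval is an assumption made here for illustration, since the text does not state which CI method was used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # AI positive, PCR positive
    fp: int  # AI positive, PCR negative
    fn: int  # AI negative, PCR positive
    tn: int  # AI negative, PCR negative


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (assumed CI method, z=1.96 for 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Point estimates; assumes no row or column of the table is empty
    and that specificity is neither 0 nor 1 (needed for the LRs)."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "accuracy": (t.tp + t.tn) / (t.tp + t.fp + t.fn + t.tn),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = TwoByTwo(tp=40, fp=5, fn=10, tn=45)
metrics = diagnostic_metrics(example)
sens_ci = wilson_interval(example.tp, example.tp + example.fn)
print(metrics, sens_ci)
```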
Further, two respiratory specialists each made a diagnosis from the CT images alone, based on their own experience. Two blinding measures were applied: the order of positive and negative cases was shuffled by generating a random number for each case, so that PCR status could not be inferred from the case order, and each specialist read the images without access to the other's results, so that the two diagnoses were not influenced by each other. For each CT slice thickness, the diagnostic performance for COVID-19 pneumonia was compared between respiratory specialists A and B, respiratory specialist A and AI, and respiratory specialist B and AI, and the agreement of their diagnoses with each other was evaluated. Concordance between them was also assessed after stratifying the data by the number of days elapsed since onset (≤ 7 days or > 7 days).
Next, we collected the VOI data that the program calculated as "suspected COVID-19 pneumonia" and compared the percentage of affected volume in each lung lobe. The correlation between the affected volume and the number of days elapsed since symptom onset was also assessed.
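Agreement between two readers in this study is summarized with Cohen's kappa coefficient (named under Statistical analyses), which corrects the observed agreement for the agreement expected by chance. The following is a minimal sketch of that calculation; the rating lists are hypothetical and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical readings (1 = pneumonia, 0 = no pneumonia), not study data.
reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(reader_1, reader_2)
print(round(kappa, 3))
```

In this toy example the readers agree on 8 of 10 cases while chance agreement is 0.5, so kappa is 0.6; a value near 0, as seen for some SARS-CoV-2-negative comparisons in this study, indicates agreement no better than chance.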

Statistical analyses
Statistical analyses were performed using the JMP® version 14.3.0 software package (SAS Institute Inc., Cary, NC, USA) unless otherwise specified. Continuous variables were expressed as the median with interquartile range (IQR). Statistical significance was tested by the Mann-Whitney U test for continuous variables and by the two-tailed Fisher's exact probability test for categorical variables. The latter was performed using EZR version 1.5.3 (Saitama Medical Centre, Jichi Medical University; http://www.jichi.ac.jp/saitamasct/SaitamaHP.files/statmedEN.html) [5]. Interlobar comparisons of affected lung volumes were evaluated by the Kruskal-Wallis test, and multiple comparisons were performed by the Steel-Dwass test. Cohen's kappa coefficient was used to evaluate the degree of agreement. Partial correlation coefficients between the number of days since onset and the affected volume of the whole lung or each lung lobe were estimated using EZR, along with P-values adjusted by Holm's method. To estimate the risk of inconsistency, multiple logistic regression analysis was performed using EZR to calculate odds ratios (ORs). All P-values were two-sided, and values of <0.05 were considered statistically significant.
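Holm's method, used here to adjust P-values across several correlation tests, is a step-down procedure: the smallest P-value is multiplied by the number of tests, the next smallest by one fewer, and so on, with the adjusted values forced to be non-decreasing and capped at 1. The sketch below illustrates the procedure in plain Python; it is not the EZR implementation, and the input P-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P-values in the same order as the input."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw P-values, not taken from the study.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_adjust(raw)
print(adjusted)
```

With four tests, the raw value 0.04 is adjusted to 0.06 rather than 0.04 because of the monotonicity step, so it would no longer fall below the 0.05 threshold.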

Patients' characteristics
Characteristics of the patients are shown in Table 1. Of the 79 positive cases, 50 were males and 29 were females, and the median age of the patients was 44 years (IQR: 35-56). Thirty-three patients (43.4%) had a smoking history. On the other hand, 40 of the negative cases were males and 50 were females, with a median age of 81.5 years (IQR: 69.5-89.5), and 31 (38.3%) had a smoking history. CT images were taken with a slice thickness of 5 mm (53.2% of SARS-CoV-2-positive and 51.1% of SARS-CoV-2-negative cases) or 0.5 mm (46.8% of SARS-CoV-2-positive and 48.9% of SARS-CoV-2-negative cases). The median number of days from the onset of symptoms to the date of scanning in patients scanned with a slice thickness of 0.5 mm and 5 mm was 8 (IQR: 4-9) and 7 (IQR: 4-8), respectively, with no significant difference between them (P = 0.1152).

The agreement between the two respiratory specialists was satisfactorily high (κ = 0.792) in the SARS-CoV-2-positive cases (Table 3). In particular, the 37 cases diagnosed on CT images with a slice thickness of 0.5 mm showed an extremely high concordance of κ = 0.8532. In contrast, the agreement between the two specialists in the SARS-CoV-2-negative cases was low (κ = 0.299), even for CT with a slice thickness of 0.5 mm (κ = −0.024) (Table 3).

Diagnostic accuracy of AI for COVID-19 pneumonia: evaluation in SARS-CoV-2-positive cases
First, the median time from disease onset for positive cases in which specialist A, specialist B, and AI were able to diagnose COVID-19 pneumonia was 7 days (IQR: 6-9, 7-9, and 6-9, respectively). In contrast, the median time from disease onset for positive cases in which specialist A, specialist B, or AI could not diagnose COVID-19 pneumonia was 2 days (IQR: 1-5, 1-3, and 1-3, respectively), and there was a significant difference between them (P < 0.0001). Next, the agreement between the diagnosis of COVID-19 pneumonia by InferRead™ CT Pneumonia and that of the two respiratory specialists was evaluated in SARS-CoV-2-positive cases (Table 4a).
The agreement was satisfactorily high (specialist A: κ = 0.817, specialist B: κ = 0.890), even when divided into the cases where the CT slice thickness was 0.5 mm (specialist A: κ = 0.924, specialist B: κ = 0.924) and where it was 5 mm (specialist A: κ = 0.718, specialist B: κ = 0.859). Similarly high concordance was observed when analyzing the data separately according to the number of days elapsed since onset (Table 4b).

The concordance between the diagnosis of COVID-19 pneumonia by InferRead™ CT Pneumonia and that of the two respiratory specialists was also evaluated in SARS-CoV-2-negative cases (Table 4). These showed low concordance (specialist A: κ = 0.095, specialist B: κ = 0.013); similarly low concordance was shown even with a slice thickness of 0.5 mm (specialist A: κ = 0.086, specialist B: κ = −0.045). Multiple logistic regression analysis was performed to estimate what contributed most to the disagreement between the diagnosis of each respiratory specialist and that of the AI in SARS-CoV-2-negative cases. The ORs of age for inconsistency with specialists A and B were 1.030 (95% CI, 0.997-1.070; P = 0.077) and 1.060 (95% CI, 1.020-1.110; P = 0.007), respectively (Table 5).
Abbreviations for Table 5: SE, standard error; VIF, variance inflation factor; OR, odds ratio; CI, confidence interval; *P < 0.10; **P < 0.05.
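An OR close to 1 for age can look small, but in a logistic regression it compounds multiplicatively with each additional unit of the predictor. The sketch below rescales the reported ORs to a 10-year age difference. It assumes that age was entered as a continuous variable in years, so that each OR applies per one-year increase; the text does not state this explicitly, and the function name is illustrative.

```python
def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a difference of `units` in a continuous predictor.

    In logistic regression the log-odds are linear in the predictor,
    so the OR for k units is the per-unit OR raised to the power k.
    """
    return or_per_unit ** units


# Point estimates reported in the text (Table 5); per-year scaling is an assumption.
or_age_specialist_a = 1.030
or_age_specialist_b = 1.060

or_decade_a = scale_odds_ratio(or_age_specialist_a, 10)
or_decade_b = scale_odds_ratio(or_age_specialist_b, 10)
print(round(or_decade_a, 2), round(or_decade_b, 2))
```

Under that assumption, a 10-year age difference corresponds to roughly 1.34-fold and 1.79-fold higher odds of AI-specialist disagreement for specialists A and B, respectively.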

Distribution of affected lung lobes in COVID-19 pneumonia
We also collected the VOI data that the program calculated as "suspected COVID-19 pneumonia" and compared the percentage of affected volume in each lung lobe (Supplementary Table 1a). The median VOI for both lungs as a whole was 4.15% (IQR: 0.73-9.24), with a particularly high percentage in the lower lobes of both lungs: 11.52% in the right lower lobe (IQR: 4.08-26.99) and 7.16% in the left lower lobe (IQR: 2.27-20.04). In contrast, the percentage of affected volume in the right middle lobe was relatively low at 0.16% (IQR: 0.00-5.04), and a significant difference between lobes was detected (P < 0.001, Kruskal-Wallis test). Multiple comparisons (Supplementary Table 1b) showed that the affected volume fraction of the right lower lobe was significantly higher than that of the ipsilateral upper and middle lobes, and the affected volume fraction of the left lower lobe was significantly higher than that of the ipsilateral upper lobe (P < 0.001, Steel-Dwass test). In addition, the affected volume fraction of the right middle lobe was significantly lower than that of the right upper lobe (P = 0.045, Steel-Dwass test) (Fig. 2). Accordingly, we examined the 24 SARS-CoV-2-positive cases in which the AI judged that lesions were distributed in the bilateral lower lobes with no lesion in the right middle lobe, to see whether the diagnosis of each respiratory specialist was consistent with that of the AI, and found that all of them were consistent with the diagnosis of COVID-19 pneumonia. The partial correlation coefficients between the number of days since onset and the affected volume of the whole lung or each lung lobe, together with the P-values adjusted using Holm's method, are shown in Supplementary Table 2. No significant correlation was found between the number of days since onset and the affected volume.

Discussion
There are challenges in the diagnosis of COVID-19 pneumonia using AI technology, including binary diagnosis (COVID-19 present or absent), segmentation and quantification of the abnormal lung opacities, and discriminating COVID-19 from non-COVID-19 pneumonias [12]. The usefulness of AI for detecting COVID-19 pneumonia based on chest CT data has already been reported in large patient cohorts [13]. However, there are no reports on the use of the commercial AI program InferRead™ CT Pneumonia.
In this study, we demonstrated the efficacy of InferRead™ CT Pneumonia for identifying SARS-CoV-2-positive cases with sufficient sensitivity and specificity regardless of the CT slice thickness. Furthermore, InferRead™ CT Pneumonia showed a high sensitivity comparable with that of respiratory specialists, and when limited to SARS-CoV-2-positive cases, an extremely high agreement was observed in detecting COVID-19 pneumonia between respiratory specialists and AI, even on CT images taken at a slice thickness of 5 mm. The number of CT devices in Japan in 2011 is reported to be 12,945, of which 8,347 are multirow detector CT devices [14]. Therefore, InferRead™ CT Pneumonia is expected to be useful in assisting the diagnosis of COVID-19. However, since there is a lack of respiratory specialists and radiologists in Japan, as well as a disparity in the number of CT scanners installed in different regions [14], it is difficult to say that CT scans are being used efficiently. The image analysis AI program can facilitate the detection of COVID-19 pneumonia in SARS-CoV-2-positive patients in institutions with only single detector-row CT or without respiratory specialists or diagnostic radiologists, and is thought to be useful in eliminating the disparity in crisis management under the spread of coronavirus infections.
The sensitivity of chest CT for COVID-19 pneumonia may not be very high immediately after the onset of the disease, which might have affected the chance of detecting it. In particular, it could be assumed that imaging with a slice thickness of 5 mm would have a lower sensitivity for detecting COVID-19 pneumonia than imaging with a slice thickness of 0.5 mm. However, no significant difference was observed in the number of days from onset to CT imaging between the groups of different slice thicknesses. Therefore, the timing of the CT scan was not considered to be a factor affecting the difference in the results between 0.5 mm and 5 mm. In addition, when the SARS-CoV-2-positive cases that each specialist and the AI diagnosed as COVID-19 pneumonia were compared with those that they did not, significant differences were observed in the number of days since disease onset. Therefore, it could be assumed that the diagnosis of COVID-19 pneumonia was more difficult when the duration of illness was shorter, probably because the findings were milder or atypical.
Nevertheless, the agreement between each specialist and the AI in detecting COVID-19 pneumonia remained high regardless of the length of time since onset.
However, it must be emphasized that the diagnosis of COVID-19 must be based on clinical symptoms and detection of SARS-CoV-2 and not only on imaging, and that this image analysis AI program should only be used as an adjunct to the diagnosis of COVID-19 pneumonia. In the SARS-CoV-2-negative cases, a discrepancy between the diagnosis by AI and that by respiratory specialists was found. The SARS-CoV-2-negative cases used in this study were biased toward the elderly, and the results of multiple logistic regression analysis indicated that this may have contributed to the discrepancy in diagnosis. Older patients often have more complex background lung modifications than younger patients, which may have influenced the diagnostic discrepancy. The ability of InferRead™ CT Pneumonia to classify lung abnormalities that are often seen in the elderly as something other than COVID-19 pneumonia is important, and in this regard, its performance needs improvement.
On the other hand, we investigated the affected area of COVID-19 pneumonia based on the per-lobe VOI presented by InferRead™ CT Pneumonia and showed that the affected volume fraction of each lower lobe was significantly higher than that of the other lobes on the same side and that the affected volume fraction of the right middle lobe was significantly lower than that of the other lobes on the same side. Previous reports have described a lower lobe distribution along with ground-glass opacities, subpleural distribution, multiple lobar distribution, and bilateral distribution [9]. The results of the present study were similar to those reported above, especially in the cases of bilateral lower lobe involvement with the absence of right middle lobe involvement, where the diagnosis of COVID-19 pneumonia was as reliable as that of the respiratory specialists. The distribution of affected lung lobes provided by InferRead™ CT Pneumonia also seemed to be useful in assisting the diagnosis of COVID-19 pneumonia.
There are several limitations in this study. First, this is a single-center, retrospective study. Therefore, there is potential for case selection bias and institutional bias. Second, the number of patients included in the study was limited. Third, the data were not adjusted for the timing of CT imaging or disease severity in the clinical course of the disease. These issues could be resolved by accumulating more cases; however, because new findings on COVID-19 should be shared as soon as possible, this remains a task for future work. Fourth, background changes of the lungs before enrollment in this study were not collected. In the imaging diagnosis of COVID-19 pneumonia, background lung changes must be taken into consideration, especially in older patients.

Conclusions
In conclusion, InferRead™ CT Pneumonia is an image analysis AI program that can be used as an aid in detecting COVID-19 pneumonia and assessing its affected volume distribution, especially in SARS-CoV-2-positive cases.

Ethics approval and consent to participate
The protocol was approved by the ethics committee of the Tokyo Metropolitan Geriatric Hospital (R21-073). Because this was a retrospective observational study, an opt-out procedure was provided on the website: patients could express their refusal to participate at any time by downloading and filling out a form on the web. The institutional review board waived the need for written informed consent.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and analyzed in this study are available from the corresponding author on reasonable request.

Competing interests
HY and AMT have received support from CES Descartes for funded research. The other authors do not have any conflicts of interest to declare.

Funding
No funding was provided for this study.

Authors' contributions
MS and HY made contributions to conception and design, acquisition of data, analysis, and interpretation of data, and drafting the article. YU contributed to acquisition and management of data. AMT, KT, and HI substantially contributed to interpreting the data and revising the manuscript critically for important intellectual content. All authors read and approved the final version of this manuscript.

Figure 1. Patient recruitment. SARS-CoV-2: severe acute respiratory syndrome coronavirus 2, PCR: polymerase chain reaction, AI: artificial intelligence, CT: computed tomography.