Study design
This retrospective study used data collected at Chiba University Hospital to develop and test a deep-learning algorithm that helps physicians detect PAH from the CRs of patients with PAH. This study was performed in accordance with the Declaration of Helsinki and was approved by the Research Ethics Committee of the Graduate School of Medicine, Chiba University (approval number: 4203). All adult participants provided written informed consent to participate in this study.
Datasets
The dataset included 145 patients with PAH and 260 control patients who visited Chiba University Hospital between January 1, 2003, and December 31, 2020. PAH was diagnosed by right heart catheterization (RHC) using a Swan-Ganz catheter. The diagnosis of PAH was based on hemodynamic measurements according to the most recent World Symposium criteria: mean pulmonary arterial pressure (PAP) > 20 mmHg, pulmonary arterial wedge pressure (PAWP) ≤ 15 mmHg, and pulmonary vascular resistance (PVR) > 3 Wood units [2, 15].
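For clarity, the diagnostic rule above reduces to a three-threshold check on the RHC measurements; the following minimal sketch (with illustrative variable names, not code from this study) makes it explicit.

```python
# Minimal sketch of the hemodynamic PAH criteria described above.
# The function and its variable names are illustrative, not study code.
def meets_pah_criteria(mpap_mmhg: float, pawp_mmhg: float, pvr_wood_units: float) -> bool:
    """PAH definition: mPAP > 20 mmHg, PAWP <= 15 mmHg, and PVR > 3 Wood units."""
    return mpap_mmhg > 20 and pawp_mmhg <= 15 and pvr_wood_units > 3
```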
Because PAH is a rare disease, one to three CRs obtained at different times during the clinical course were used for each of the 145 patients with PAH (259 images in total). Only RHC-CR data pairs from patients with PAH that were obtained within 3 days of each other constituted the analysis dataset. The normal control group consisted of 260 patients who visited the Ophthalmology Department of Chiba University Hospital between January 1, 2015, and December 31, 2020, and whose CRs showed no suspicion of PH (260 images in total). Although 263 patients were initially included as normal controls, two respirologists reviewed the CRs and excluded three patients with obvious cardiac enlargement or interstitial pneumonia.
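As a hedged illustration of the pairing rule, the snippet below keeps only CRs acquired within 3 days of an RHC for the same patient; the tabular layout and column names are assumptions made for this example, not the study's actual data schema.

```python
import pandas as pd

# Illustrative sketch of the 3-day RHC-CR pairing rule described above.
# Columns ("patient_id", "rhc_date", "cr_date") are assumed for this example.
def pair_rhc_with_cr(rhc: pd.DataFrame, cr: pd.DataFrame, max_days: int = 3) -> pd.DataFrame:
    merged = rhc.merge(cr, on="patient_id")             # match exams by patient
    gap = (merged["cr_date"] - merged["rhc_date"]).abs()
    return merged[gap <= pd.Timedelta(days=max_days)]   # keep pairs within 3 days
```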
Development of a deep-learning method for PAH detection
The ResNet50 model pre-trained on ImageNet-1k was used for image classification, with a fully connected layer added as the final layer for binary classification [16]. The PyTorch framework was used to implement the ResNet50 model [17]. Training was performed on a workstation with an NVIDIA Tesla T4 graphics processing unit (NVIDIA, Santa Clara, CA, USA). The entire CR dataset was split in a 4:1 ratio into development and testing datasets. The development dataset was further divided into training and validation datasets using K-fold cross-validation (K = 4). CRs were separated by patient identification number to ensure that multiple CRs from the same patient were not distributed across datasets. Before training, image intensities were normalized to a range of 0–255 and images were resized to 320 × 320 pixels. Common data augmentation schemes were adopted, including flipping, rotating, and resizing. Training used the binary cross-entropy loss function with stochastic gradient descent optimization (learning rate α = 0.01).
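The following PyTorch sketch illustrates the described setup: an ImageNet-1k pre-trained ResNet50 with a fully connected output layer for binary classification, patient-grouped four-fold splitting, 320 × 320 resizing with flip/rotation/resize augmentation, and SGD (learning rate 0.01) minimizing a binary cross-entropy loss. Details not given in the text (batch handling, the specific augmentation parameters, and the use of BCEWithLogitsLoss and scikit-learn's GroupKFold) are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from sklearn.model_selection import GroupKFold

# ImageNet-1k pre-trained ResNet50 with a single-logit head for PAH vs. control
# (torchvision >= 0.13 weights API).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 1)

# Resizing to 320 x 320 with flip / rotation / resized-crop augmentation.
# Parameter values are illustrative; intensity normalization to 0-255 is
# assumed to happen upstream when the CRs are exported.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.RandomResizedCrop(320, scale=(0.9, 1.0)),
    transforms.ToTensor(),
])

# Patient-grouped 4-fold cross-validation so that all images from one patient
# stay in the same fold (patient_ids holds one ID per image).
splitter = GroupKFold(n_splits=4)
# for train_idx, val_idx in splitter.split(image_paths, labels, groups=patient_ids): ...

criterion = nn.BCEWithLogitsLoss()                     # binary cross-entropy on the logit
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One SGD update; images: (N, 3, 320, 320), labels: (N,) in {0, 1}."""
    optimizer.zero_grad()
    logits = model(images).squeeze(1)
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```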
Testing the detecting capability of the algorithm
For each image in the testing dataset, the algorithm output a numerical value between zero and one (the PAH probability score). A receiver operating characteristic (ROC) curve, which plots sensitivity against the false positive rate (1 − specificity) for every score cut-off, was plotted, and the AUC was calculated.
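Assuming scikit-learn is used for evaluation (the text does not name the tooling), the ROC curve and AUC over the per-image scores could be computed as in the sketch below; the score values shown are placeholders, not study data.

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Placeholder values; in the study these would be the test-set labels and
# the algorithm's per-image PAH probability scores.
y_true = [1, 0, 1, 1, 0, 0]               # 1 = PAH, 0 = control
y_score = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3]  # PAH probability per image

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # 1 - specificity, sensitivity per cut-off
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.3f}")
```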
Comparing the performance of the algorithm with that of the doctors
The detecting capability of the algorithm was compared with that of doctors (seven respirologists with > 8 years of experience and two radiologists with > 14 years of experience). The distance from the doctors' eyes to the monitor was approximately 60 cm, and the reading time for each image was limited to ≤ 5 s.