Deep Learning Analysis of Polar Maps from SPECT Myocardial Perfusion Imaging for Prediction of Coronary Artery Disease


Purpose: This study aimed to investigate the diagnostic accuracy of deep convolutional neural networks for classifying polar map images in single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI), taking the physician’s diagnosis as the reference. Methods: 3318 images of stress and rest polar maps from patients (67% women and 33% men) who underwent 99mTc-sestamibi MPI were collected. The images were manually labeled as normal or abnormal according to the physician’s diagnostic reports. The proposed deep learning model was trained using stress and rest polar maps and evaluated for prediction of obstructive disease in a stratified 5-fold cross-validation procedure. Results: The mean values of accuracy, sensitivity, precision, specificity, F1 score, and area under the ROC curve (AUC) were 0.7562, 0.7856, 0.5748, 0.7434, 0.6646, and 0.8450, respectively, over 5 folds using both stress and rest scans. The inclusion of rest perfusion maps significantly improved the AUC of the deep learning model (AUC: 0.845; 95% CI: 0.832-0.857) compared with using stress polar maps only (AUC: 0.827; 95% CI: 0.814-0.840); P < 0.05. Conclusion: The results of the present work reveal the possible applications of deep learning for polar map image classification in SPECT MPI.


Introduction
Coronary artery disease (CAD) is among the most important forms of heart disease: it accounts for one in every five deaths and is common throughout the world [1]. As a direct consequence of this mortality, there have been increasing calls for imaging techniques that can predict the presence of CAD.
Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) is widely used throughout the world to assess coronary artery disease [2]. This technique provides information about myocardial perfusion, which is crucial for determining the significance and extent of coronary artery stenosis, as well as information regarding ventricular function [3]. The key idea in SPECT MPI is to image the patient's heart twice, at rest and under stress, to identify areas of reduced radiopharmaceutical uptake [4].
Visual assessment is often chosen as the default approach for perfusion image interpretation; however, it can be a time-consuming task. Moreover, such a method is notably dependent on the experience of the nuclear medicine expert or radiologist and suffers from intra-observer and inter-observer variability. Therefore, to increase the reproducibility of reports, quantitative analysis is typically integrated into visual evaluation [2,4]. The main focus of developing technologies in nuclear cardiology is on automation and the use of artificial intelligence algorithms as a component of the interpretation process [2].
Deep learning (DL) was introduced to boost the performance of conventional artificial neural networks (ANNs) by using deeper architectures with more layers [3,5]. The convolutional neural network (CNN) is a class of deep neural network that is mainly used in image analysis, including medical imaging, because of its ability to preserve local relationships in the image while reducing dimensionality [6,7]. In contrast to traditional machine learning, which normally requires pre-determined image measurements, CNNs connect directly to image pixels and learn image statistics on their own, so that feature extraction is part of the model itself [6]. This is a robust new machine learning tool, with breakthrough applications in medical imaging problems such as disease detection and classification [5,8].
Several studies have explored the use of CNNs for the prediction and classification of obstructive coronary artery disease in the nuclear cardiology field [4,6,9-15]. Spier et al. (2019) suggested a method for the classification of polar maps based on a graph-type CNN and compared its agreement with the human observer [4]. Betancur et al. (2019) used deep learning based on a combination of raw and quantitative supine and upright stress polar maps to predict obstructive coronary artery disease [6]. In a study by Apostolopoulos et al. (2020), a dataset of attenuation-corrected (AC) and non-attenuation-corrected (NAC) stress and rest polar maps and clinical data was analyzed using a multi-input network consisting of InceptionV3 and a random forest [9]. In the present work, a CNN based on a combination of stress and rest polar maps is introduced to classify coronary artery disease.
This study aimed to investigate the diagnostic accuracy of deep convolutional neural networks for classifying the polar map images in myocardial perfusion imaging by considering the physician's diagnosis as a reference.

Study Population
The study population comprised 3,318 patients (67% women and 33% men) referred for SPECT MPI from 2018 to 2020, at the nuclear medicine center. The study was approved by the Ethical Committee of our Institution. Polar maps in stress and rest conditions were extracted from the SPECT scans in TIFF format.
The stress and rest images were resized to 221×217 pixels with 3 channels to remove unnecessary parts and reduce the computational cost. Sample polar map images from the data set are shown in Fig. 1.

Image Acquisition
The stress SPECT images were acquired about 30 min after an injection of 20 mCi technetium-99m sestamibi (Tc-99m MIBI) following an exercise test or pharmacological stress. Tc-99m MIBI was injected after at least 85% of the age-predicted maximum heart rate was achieved. Rest imaging was carried out nearly 40 min after an injection of 15-20 mCi Tc-99m MIBI. Two SPECT gamma cameras (Philips Forte and Genesis cameras) equipped with low-energy high-resolution collimators were used for MPI imaging. The data were collected from 32 projections of 25-30 seconds each, using a 140 keV photopeak, over a 180-degree arc in a 64×64 matrix. The stress and rest SPECT MPI were performed using 180-degree SPECT imaging, beginning from 45 degrees right anterior-oblique and ending at 45 degrees left posterior-oblique. The patients were imaged in the supine position. The short-axis, vertical long-axis, and horizontal long-axis slices were reconstructed from the raw projection data by filtered back projection or iterative algorithms in the rest and stress SPECT MPI. No attenuation correction was performed. Polar maps were extracted from SPECT images.

MPI Interpretation
A nuclear medicine expert interpreted the MPI studies retrospectively. Myocardial perfusion imaging reports were considered positive for CAD if they described a reversible tracer defect of any extent. If a perfusion defect in the SPECT images acquired after exercise was not seen in the rest images, this condition was described as ischemia. For example, if the result is "Stress-induced ischemia of the apical lateral, apical anterior and mid anterolateral wall", the patient is labeled as abnormal. If the result is "The study is negative for appreciable stress-induced ischemia", the patient is labeled as normal. Among the 3318 subjects in this study, 2303 were labeled as normal and 1015 as abnormal.

Deep Learning Model
The overall process is schematically shown in Fig. 3. Our network includes five convolution blocks with an increasing number of filters in succession to extract image features. Each block consists of two convolution layers. Each layer is followed by an exponential linear unit (ELU) activation function and a batch-normalization (BN) layer for normalizing data. After each block, there is a max-pooling layer with stride 2 that keeps only the maximum value from each 2×2 input window, to compensate for small image changes and distortions. These convolution layers are used for obtaining the main features of the input images. The early blocks extract low-level features, whereas the succeeding blocks allow for the detection of higher-level features. The output of the convolution layers is passed through a global average pooling (GAP) layer to generate the feature vector. The feature vector then passes to the second part. This part includes two fully connected layers (arrays of neurons linked with each neuron in the preceding layer), followed by a dropout layer and a dense layer with one node and a sigmoid function to generate the output class. A summary of the model is given in Table 1.
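To make the downsampling concrete, the following Python sketch traces the feature-map size through the five blocks. It is only an illustration under stated assumptions: the convolutions are taken to use "same" padding (so only pooling changes the spatial size), pooling is taken to use no padding (sizes are floored), 221×217 is read as height × width, and the filter counts are hypothetical placeholders rather than the values in Table 1.

```python
def trace_feature_maps(height, width, filters_per_block):
    """Return (height, width, channels) after each conv block and its 2x2/stride-2 max-pool."""
    shapes = []
    for filters in filters_per_block:
        # Assumed 'same'-padded convolutions keep height and width;
        # the unpadded 2x2 max-pool with stride 2 halves them (floor).
        height, width = height // 2, width // 2
        shapes.append((height, width, filters))
    return shapes


# Hypothetical filter counts (an assumption, not taken from Table 1).
ASSUMED_FILTERS = [16, 32, 64, 128, 256]

shapes = trace_feature_maps(221, 217, ASSUMED_FILTERS)
for block_index, shape in enumerate(shapes, start=1):
    print(f"after block {block_index}: {shape}")

# Global average pooling collapses the last map to one value per channel,
# so the feature vector length equals the last block's filter count.
feature_vector_length = shapes[-1][2]
print("feature vector length:", feature_vector_length)
```

Under these assumptions the spatial size shrinks from 221×217 to 6×6 before global average pooling, which is why the feature vector length depends only on the number of filters in the last block.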

Implementation
Our proposed deep learning model was implemented using the Python programming language version 3.7.10 and the Keras Application Programming Interface (API) version 2.4.3 for TensorFlow version 2.4.1. The experiments were run using the graphics processing units on Google Colaboratory. Our model used the focal loss function and the Adam optimizer. The focal loss function addresses the class imbalance problem during model training [16]. Two hyper-parameters, alpha and gamma, are applied to the cross-entropy loss function. Alpha gives more importance to the minority class (the abnormal class) and handles the class imbalance problem, while gamma gives more importance to misclassified examples and makes the model learn more efficiently from hard (misclassified) examples. Suitable values for alpha and gamma were 0.75 and 0.2, respectively. The initial learning rate, batch size, and number of epochs were 0.001, 16, and 50, respectively. The learning rate was reduced by a factor of 0.2 when no improvement was seen in the validation loss for 10 epochs. The lower bound on the learning rate was 0.00001.
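The roles of alpha and gamma can be illustrated with a minimal per-example sketch of the binary focal loss in plain Python. This is not the authors' Keras implementation; it assumes the common convention in which alpha weights the positive (abnormal) class and 1 - alpha weights the negative class, and the function name and the clipping constant `eps` are illustrative choices.

```python
import math


def binary_focal_loss(y_true, p_pred, alpha=0.75, gamma=0.2, eps=1e-7):
    """Focal loss for one example; y_true is 1 (abnormal) or 0 (normal)."""
    # Clip the predicted probability to avoid log(0).
    p = min(max(p_pred, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y_true == 1 else 1.0 - p
    # Assumed convention: alpha for the positive class, 1 - alpha otherwise.
    alpha_t = alpha if y_true == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A poorly classified abnormal case costs more than an equally
# poorly classified normal case because of the alpha weighting.
print(binary_focal_loss(1, 0.3))
print(binary_focal_loss(0, 0.7))
```

With gamma set to 0 and alpha to 0.5, the expression reduces to half of the ordinary binary cross-entropy, which is a convenient sanity check for any implementation.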

Data Augmentation
To reduce overfitting and improve generalization, in each of the k rounds of cross-validation the training parts were augmented before training. We rotated the stress and rest polar maps together by a limited angle (at most 45 degrees). The rotated polar maps were then concatenated along the channel dimension. In this way, the model ignored small spatial differences between the polar maps and focused on the color variations, irrespective of their relative position.
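The paired rotation can be sketched in dependency-free Python as follows. This is a simplified illustration, not the authors' pipeline: it assumes that one random angle is drawn per patient and applied to both maps, that the angle may be positive or negative, and that rotation uses nearest-neighbour sampling with zero fill; images are represented as nested lists of shape height × width × channels, and the function names are hypothetical.

```python
import math
import random


def rotate_nearest(image, angle_deg):
    """Rotate an H x W x C nested-list image about its centre (nearest neighbour, zero fill)."""
    h, w, c = len(image), len(image[0]), len(image[0][0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[[0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for output pixel (y, x).
            dy, dx = y - cy, x - cx
            sy = int(round(cy + cos_a * dy - sin_a * dx))
            sx = int(round(cx + sin_a * dy + cos_a * dx))
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = list(image[sy][sx])
    return out


def augment_pair(stress, rest, max_angle=45.0, rng=random):
    """Rotate both maps by one shared random angle, then stack their channels."""
    angle = rng.uniform(-max_angle, max_angle)  # assumed symmetric range
    stress_rot = rotate_nearest(stress, angle)
    rest_rot = rotate_nearest(rest, angle)
    # Channel-wise concatenation: 3 + 3 channels become 6 per pixel.
    return [[ps + pr for ps, pr in zip(row_s, row_r)]
            for row_s, row_r in zip(stress_rot, rest_rot)]
```

Sharing the angle keeps the stress and rest maps spatially aligned, so the per-pixel difference between the two conditions is preserved after augmentation.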

Cross-validation
The deep learning model was validated using a stratified 5-fold cross-validation procedure to ensure proper training and reliable evaluation of the CNN, maximizing the use of data while avoiding model overfitting and selection bias. The approach randomly splits the data set into 5 non-overlapping groups of patients of roughly equal size, where four groups form the training set while the fifth remains hidden from the model for testing. These groups are stratified to have a similar percentage of obstructive disease as the studied population. Before training, the four training parts were augmented. This procedure was repeated for the remaining parts until every part had been selected as the test set. Finally, the mean of the metrics over the five folds provides an overall performance estimate of the deep learning model.
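A stratified split of this kind can be sketched with the standard library alone. The sketch below is one simple way to build such folds (shuffling each class separately and dealing indices round-robin); it is not necessarily the routine used in the study, and the function name and seed are illustrative. The class counts in the usage lines are the ones reported under MPI Interpretation.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal this class's samples round-robin so each fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# 2303 normal (0) and 1015 abnormal (1) subjects, as reported in the text.
labels = [0] * 2303 + [1] * 1015
folds = stratified_folds(labels, k=5, seed=0)

for test_fold in range(5):
    test_idx = folds[test_fold]
    train_idx = [i for f in range(5) if f != test_fold for i in folds[f]]
    abnormal_share = sum(labels[i] for i in test_idx) / len(test_idx)
    print(test_fold, len(train_idx), len(test_idx), round(abnormal_share, 3))
```

Each fold serves as the test set exactly once, and augmentation would be applied only to `train_idx` in each round so that the test fold contains unmodified images.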

Added Value of Rest Scans
The added value of rest perfusion maps was assessed by comparing the performance of the deep learning model trained with rest perfusion polar maps in addition to stress polar maps against the model trained with stress polar maps alone. The two methods (with and without rest maps) were evaluated within the same cross-validation procedure, using the same folds.

Statistical Analysis
The diagnostic performance of the deep learning model was evaluated using classification metrics including accuracy, precision, recall, F1 score, specificity, and the area under the ROC curve (AUC). The added value of rest polar maps was evaluated using ROC analysis and pairwise comparisons of the area under the ROC curve according to the DeLong test [17]. McNemar's chi-square test was also used to assess the significance of changes between the two deep learning methods (with and without rest polar maps). A 2-tailed p-value <0.05 was considered statistically significant. Statistical tests were performed in MedCalc software version 20.0.15.
Table 2 provides the average accuracy, AUC, recall, precision, F1 score, and specificity over 5 folds, along with the corresponding standard deviations, for the two deep learning methods (with and without rest polar maps). The results obtained when rest polar maps are considered in addition to stress polar maps are also shown graphically in Fig. 4 as box plots.
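For reference, the threshold-dependent metrics listed above follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the counts in the usage line are hypothetical and chosen only to make the arithmetic easy to follow, and the function assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # sensitivity: abnormal cases found
    specificity = tn / (tn + fp)       # normal cases correctly cleared
    precision = tp / (tp + fp)         # predicted-abnormal cases that are abnormal
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from this study).
metrics = classification_metrics(tp=80, fp=40, tn=160, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the abnormal class is the minority, precision can be noticeably lower than recall and specificity even when the latter two are similar, as the example counts show.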

Learning curves
The average accuracy and error learning curves for the training and test data sets over 5 folds are presented in Fig. 5. A learning curve shows the performance of a learning model over experience or time. Reviewing the learning curves of a model during training can help diagnose learning problems, such as overfitting and underfitting, as well as whether the training and validation data sets are appropriate. As seen in Fig. 5, the training and test error curves decrease and converge together. The training and test accuracy curves also increase and converge at the same time, and the iteration stops before overfitting.
Table 3 presents the summarized confusion matrix for the total data, obtained by summing the matrices of all folds. In addition, the normalized confusion matrix in Table 4 is created by dividing each element by the sum of the corresponding row, which represents the number of images in each class. Each value in this matrix belongs to one of the True Negative, True Positive, False Negative, or False Positive categories.
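The summation and row normalization described for Tables 3 and 4 can be sketched as follows. The fold matrices here are hypothetical placeholders (two small folds, rows ordered as true normal then true abnormal, columns as predicted normal then predicted abnormal); they are not the values reported in the tables.

```python
def sum_matrices(matrices):
    """Element-wise sum of equally sized confusion matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) for c in range(cols)]
            for r in range(rows)]


def normalize_rows(matrix):
    """Divide each element by its row total (the number of cases in that true class)."""
    return [[value / sum(row) for value in row] for row in matrix]


# Hypothetical per-fold matrices in the layout [[TN, FP], [FN, TP]].
fold_matrices = [
    [[8, 2], [1, 4]],
    [[7, 3], [2, 3]],
]
total = sum_matrices(fold_matrices)
normalized = normalize_rows(total)
print(total)
print(normalized)
```

In this layout, the diagonal of the normalized matrix holds the specificity (first row) and the sensitivity (second row) of the pooled predictions.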

ROC and Precision-recall curves
The receiver operating characteristic (ROC) curve displays the relationship between the true positive rate and the false positive rate at each threshold value. To summarize the ROC curve numerically, the area under the curve (AUC) can be calculated. The left part of Fig. 6 illustrates the ROC curve for each of the 5 folds, as well as the overall curve obtained from the predictions on all data, along with the AUC values. The right part of Fig. 6 shows the precision-recall curve for each of the 5 folds, as well as the overall curve obtained from the predictions on all data, along with the AUC values.
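How an ROC curve and its AUC arise from predicted probabilities can be shown with a short standard-library sketch. It sweeps the decision threshold over the distinct scores and integrates with the trapezoidal rule; the labels and scores in the usage lines are toy values, not data from this study, and the function names are illustrative.

```python
def roc_curve_points(labels, scores):
    """Return (false positive rate, true positive rate) points, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    pairs = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: 1 = abnormal, 0 = normal.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_points = roc_curve_points(toy_labels, toy_scores)
toy_auc = auc_trapezoid(toy_points)
print(toy_points, toy_auc)
```

The same threshold sweep, recording precision against recall instead of the true positive rate against the false positive rate, yields the precision-recall curve.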

The Result of Adding Rest Scans
According to the DeLong test, the inclusion of rest perfusion maps significantly improved the AUC of the deep learning model (AUC: 0.845; 95% CI: 0.832-0.857) compared with using stress polar maps only (AUC: 0.827; 95% CI: 0.814-0.840); P < 0.05, as shown in Fig. 7. McNemar's test also shows that the difference before and after adding rest maps is -2.68%, with a 95% CI from -4.09% to -1.27%, which is significant (P=0.0002).
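McNemar's test compares two classifiers on the same cases using only the discordant pairs. The sketch below implements the uncorrected chi-square form with one degree of freedom; it is an illustration of the idea and not necessarily the exact variant MedCalc applies (which may use a continuity correction or an exact test). The discordant counts in the usage lines are hypothetical.

```python
import math


def mcnemar_test(b, c):
    """Uncorrected McNemar chi-square test.

    b: cases classified correctly by the first model only.
    c: cases classified correctly by the second model only.
    Returns (chi-square statistic, two-sided p-value).
    """
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical discordant counts (not taken from this study).
statistic, p_value = mcnemar_test(b=30, c=10)
print(statistic, p_value)
```

Cases that both models classify correctly, or both misclassify, do not enter the statistic, which is why the test is suited to paired comparisons on identical folds.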

Discussion
This study aimed to assess the feasibility of SPECT MPI polar map classification using deep learning methods. The deep learning model was evaluated using a stratified 5-fold cross-validation strategy. Taking expert diagnosis as the reference, well-established metrics, including accuracy, AUC, sensitivity, and specificity, were computed to evaluate the performance of the proposed deep learning model.
Visual assessment of SPECT heart perfusion images faces several challenges, including the lack of reproducibility due to intra-observer and inter-observer variability, dependency on the experience of the nuclear medicine expert or radiologist, and the increased time and cost of the interpretation process. Machine learning and deep learning techniques have exhibited promising potential for the detection and classification of coronary artery disease from SPECT MPI images compared with other approaches, including human observer diagnosis and quantitative analysis of perfusion defects [4,6,9-15,18-21]. A comparison with the related studies is given in Table 5. In [4,6,9,10,12,15], the authors used CNNs with polar maps.
A previously proposed DL-based method by Betancur et al., combining raw and quantitative supine and upright stress polar maps and patient gender information to predict obstructive coronary artery disease, led to a greater per-patient and per-vessel AUC than the combined total perfusion deficit (TPD). In a study by Apostolopoulos et al. (2020), a collection of 566 patient samples was analyzed. The dataset includes a combination of attenuation-corrected (AC) and non-attenuation-corrected (NAC) stress and rest polar maps, clinical data, and coronary angiography results, the latter of which is considered the ground truth. The four polar maps corresponding to each patient were concatenated side by side into one image. That research shows that an optimal strategy involves a hybrid multi-input network consisting of InceptionV3 and a random forest, with data augmentation by small shifts in any direction. This method matches the expert's accuracy, which is 79.15% for this specific data set [9].
The current study uses a combination of stress and rest polar maps as the model input for each patient, while most previous studies have used only stress polar maps as input to the convolutional network. We demonstrated that deep learning utilizing a combination of stress and rest perfusion polar maps outperforms deep learning using only stress polar maps for the prediction of obstructive disease. These findings suggest that the rest polar maps provide important supplementary information. This could be because the model uses the difference between the two polar maps to detect obstructive disease or ischemia. Although our dataset differs from those of the corresponding works, the proposed method offers competitive performance compared with the related studies mentioned in terms of accuracy, sensitivity, specificity, and the area under the ROC curve, which correspond to 0.7562, 0.7856, 0.7434, and 0.8450, respectively. The results also reflect those of Apostolopoulos et al. (2020), who also found that data augmentation by rotation is an effective strategy to prevent overfitting in classifying polar maps [10].
This work suffers from some limitations. First, the data augmentation method used to overcome the data shortage is not strong enough to create realistic images. In addition, rotating the polar maps moves the location of an abnormal perfusion to another area of the myocardium. However, in this study, we examined the overall diagnostic effectiveness of this method (normal or abnormal) and not its ability to locate findings in specific coronary artery territories.
Furthermore, this work is also limited by its consideration of only polar map images.
This study could be further refined by considering clinical and functional data. Coronary angiography information could also be used as the ground truth. Moreover, increasing the number of images may yield higher performance and improve model generalization.

Conclusion
Despite the above limitations, the results of the present work reveal the possible applications of deep learning for polar map image classification in SPECT myocardial perfusion imaging. These findings suggest that our approach can be a promising alternative to routine polar map analysis using normal databases to further support the clinical decision-making process by providing a second opinion.

Declarations
Funding:
Figure: Distribution of Classification Metrics in Cross-Validation over 5 Folds