Deep learning-based detection of coronary artery calcification in non-contrast and contrast-enhanced CT scans

doi:10.21203/rs.3.rs-4281908/v1


Article


This work is licensed under a CC BY 4.0 License

Version 1


Coronary artery calcification (CAC) assessed using computed tomography (CT) scans is a clinically-validated biomarker that is highly prognostic for coronary heart disease (CHD) and adverse cardiac events. Clinical assessment of CAC relies on a dedicated coronary electrocardiogram (ECG)-synchronised non-contrast CT scan. However, millions of CT scans are acquired every year for various indications that include the heart in the field-of-view yet visible CAC is often not reported in these scans. This is a significant missed opportunity for incidental detection of a powerful cardiac risk factor. Our study was conducted on a set of 295 unselected, consecutive CT scans from the National Health Service (NHS) Golden Jubilee Hospital. These were annotated for CAC and used for model training and testing. We developed and validated a deep learning model to accurately quantify CAC on any CT scan including the heart, regardless of the presence or phase of contrast agent, reason for the scan, or use of ECG-synchronisation. The model achieved substantial agreement with the manual human assessment (Cohen’s Kappa=0.61, Bland-Altman mean difference=-40.8mm³). Additionally, we found no correlation between arterial brightness (a surrogate metric for the level of contrast agent present) and agreement between manual and automated measurements (Spearman correlation R=-0.005). Early intervention is vital to improve patient outcomes. The automated CAC scoring method demonstrated here could be applied to all chest CT scans that include the heart, greatly expanding the opportunities for early detection of subclinical cardiovascular disease when preventative interventions have more impact. The promising accuracy achieved here by our deep learning model on a set of unselected sequential CT scans shows the potential for large-scale implementation to reduce the burden of coronary heart disease through systematic, opportunistic CAC screening.

Health sciences/Diseases/Cardiovascular diseases/Vascular diseases/Calcification

Health sciences/Health care/Medical imaging/Tomography

Coronary heart disease (CHD) remains the leading cause of death worldwide. According to the World Health Organisation, an estimated 17.9 million deaths in 2019 were due to cardiovascular diseases, accounting for 32% of global deaths ¹. Many effective treatments are available for CHD, but these require timely intervention. Primary prevention strategies targeting CHD are largely based on opportunistic assessment of 10-year cardiovascular risk ^2–4 using risk scoring models such as QRISK ⁵ or ASSIGN ^5,6, which rely on pooled risk factor equations. However, these risk scores suffer from poor risk stratification and poor risk discrimination in diverse groups ^7,8.

Coronary artery calcification (CAC) confirms atherosclerotic disease, and predominantly occurs within the arterial intimal layer as a result of smooth muscle cell apoptosis and macrophage infiltration ⁹. CAC may be the strongest predictor of future cardiovascular events and a better guide to the introduction of primary preventative therapies than traditional risk factor models ^5,6,9–11. The “gold standard” method to assess CAC is a computed tomography (CT) scan specifically calibrated for the detection of calcium: a non-contrast, electrocardiogram (ECG)-synchronised cardiac scan with heart rate control and breath holding during the scan to reduce motion of the heart. These scan acquisition parameters reduce variation and artefacts in images and ensure reliable quantitation of calcium volume. Conventionally, the scan is reconstructed into 3 mm thick slices and calcium is annotated semi-automatically using a threshold of 130 Hounsfield units (HU) to produce a cumulative Agatston Score ¹¹, which is the current standard prognostic measure of risk from CAC burden.

Despite the prognostic value of Agatston scores, image reconstruction parameters and the nominal cut-off of 130 HU were derived in the early 1990s. CT scanner technology and post-processing techniques have markedly changed since this time, including multidetector scanners now capable of imaging the whole heart in a fraction of a heartbeat ^12,13. As a result, CAC can also be detected on non-cardiac CT scans that include the heart in the field of view ^14–16 which were not acquired for coronary indications. Such “incidental findings” are often not reported ^17,18 but they have significant implications for patients. For example, they could be used to prompt clinical review, including screening for risk factors (e.g. hypertension, dyslipidaemia, diabetes mellitus, and obesity), to provide advice regarding lifestyle measures proven to reduce the risk of cardiovascular events (e.g. smoking cessation, weight loss, diet, and exercise), and to prompt consideration of disease-modifying therapies such as statins. Such intervention following detection of incidental coronary calcification on ungated, non-cardiac chest CT scans has been shown to improve patient engagement with secondary prevention therapies ^14,19.

Automated CAC scoring techniques have historically focused on speeding up the calculation of an Agatston score in routine calcium screening workflows ^20–25. More recently, studies have shown excellent agreement for automated detection of CAC on non-ECG-synchronised, non-contrast chest CT scans acquired for non-coronary indications (including positron emission tomography [PET-CT]) ^22,26–33. However, these scans represent a small portion of CT scans. Many CT scans are acquired following administration of exogenous intravenous contrast agent for non-cardiac indications and may also have a wider field of view than just the chest (e.g., chest, neck, abdomen, and pelvis). Contrast agent alters the brightness of vessels and other tissues, rendering the traditional threshold of 130 HU inapplicable. Though several techniques have been demonstrated on contrast-enhanced ECG-gated, dedicated cardiac CT angiography (CCTA) scans ^{25,28,34–37,37–44}, there are currently no automated methods for CAC quantification on contrast-enhanced, non-ECG-synchronised, non-coronary CT scans. Furthermore, while several studies have successfully developed automated CAC quantification techniques for both ungated, non-contrast chest CT and CCTA scans separately, there is currently no reported method for automated CAC quantification on both ungated non-coronary non-contrast chest CT, and ungated non-coronary contrast-enhanced CT using a single deep learning model.

In this paper, we describe the development of a deep learning model to detect and quantify CAC on any ungated CT scan that includes the whole heart in the field of view, irrespective of the presence of contrast. We developed our model using only reference ground truth of CAC, without the use of annotations of coronary arteries or coronary structures and reported accurate risk stratification of patients. We demonstrate that deep learning models can reliably quantify CAC across a spectrum of contrast phases in contrast-enhanced scans without being adversely affected by relative brightness levels in coronary vessels. This study supports opportunistic cardiovascular risk screening by expanding the pool of CT scans that can be automatically screened for evidence of cardiovascular disease. This could enable early, targeted intervention for at risk, asymptomatic populations who otherwise remain undetected.

Cohort Details

In this study, we developed a single deep learning model that automates the quantification of CAC for both non-contrast and contrast-enhanced CT. The model does not require ECG-synchronisation or scans acquired for coronary indications and is agnostic to contrast phase. Our study was conducted on a set of 295 unselected CT scans from the National Health Service (NHS) Golden Jubilee Hospital that included the heart in the field of view. Scans were annotated for calcification in the coronary arteries, aortic root, aortic valve, and mitral annulus by three clinical annotators.

A summary of the CT scan dataset is provided in Table 1. The majority of scans were undertaken for the investigation of cancer (165/295, 56%) and lung disease (107/295, 36%). The field of view varied from chest only (104/295, 35%) to neck, chest, abdomen, and pelvis (10/295, 3%). The dataset comprised both non-contrast scans and contrast scans with varying brightness in the vasculature and tissues. At the time of data export, 221 (75%) were documented on the radiology report as containing contrast.

Table 1

Patient demographics. All continuous values are reported as mean ± standard deviation. Imaging protocol abbreviations are: Chest Abdomen Pelvis (CAP), Chest High Resolution (HR), Chest Abdomen (CA), Neck Chest Abdomen Pelvis (NCAP). **Other imaging protocols included 3 CT Chest Neck, and 1 CT Neck Chest Abdomen.
Parameter	All (N = 295)	Train/validation (N = 214)	Test (N = 81)
Age (years)	64 ± 12	65 ± 12	62 ± 1
Female Sex (%)	54.2%	56%	49.3%
CT Scanner Manufacturer
Canon	124 (42%)	94	30
Siemens	170 (57.6%)	120	50
GE Medical Systems	1 (< 1%)	0	1
Imaging Protocol
CT CAP	128 (43%)	99	29
CT Chest	73 (25%)	56	17
CT Chest HR	32 (11%)	18	14
CT CA	47 (16%)	33	14
CT NCAP	11 (4%)	7	4
Other**	4 (1%)	1	3
Reconstruction Kernel
Body Sharp	106 (35%)	82	24
Lung	18 (5%)	12	6
Br40f	170 (57.6%)	120	50
Standard	1 (< 1%)	0	1
Tube Voltage (kV)	105.2 ± 14.5	105.7 ± 14.6	104 ± 14
80	4 (1%)	3	1
90	123 (42%)	88	35
100	19 (7%)	11	8
110	12 (4%)	8	4
120	137 (46%)	104	33
Current (mA)	290.7 ± 164.8	290.9 ± 163.5	290.2 ± 169
Contrast present (%)	75%	78%	65%
Slice thickness (mm)	0.74 ± 0.04	0.74 ± 0.04	0.73 ± 0.05
Annotator
R.I.S.G.	202 (68.5%)	139	63
C.P.B.	48 (16.3%)	38	10
R.K.H.	45 (15.2%)	37	8

Assessment of automated CAC detection model and annotator agreement

We investigated agreement between and within annotators, and between annotators and our model, which we summarise in Table 2. Inter-observer and intra-observer variability was assessed on 18/295 (6%) and 30/295 (10%) of scans, respectively. The model was evaluated using a held-out test set of 81 patients.

Table 2

Agreement between and within annotators, and between the model and annotators. Values are reported as mean ± standard deviation where relevant.
	Bland-Altman mean offset and LOA	Cohen’s Kappa	ICC	F1 (%)	Sensitivity (%)	PPV (%)
Annotator	Intra-Observer Variability
A	-48.9 [-878.3, 780.48]	0.91	0.92	80.1 ± 12.5	66 ± 31.5	79.3 ± 27.4
B	-92.1 [-342.9, 160.5]	0.10	0.77	66.3 ± 10	43.5 ± 15.1	80.9 ± 35.2
C	-150.9 [-1024.5, 772.5]	0.73	0.80	75 ± 19.2	72.6 ± 24.3	84.4 ± 16.2
Annotators	Inter-Observer Variability
A and B	163.5 [-517.8, 844.8]	0.58	0.83	70.6 ± 20.5	75.7 ± 26	60.1 ± 24.7
B and C	-265.6 [-1136.7, 605.5]	0.65	0.76	60 ± 15.1	60.7 ± 26	72 ± 24.1
A and C	-102.0 [-1063.4, 859.2]	0.72	0.76	67.2 ± 14.7	71.8 ± 24.9	71.3 ± 21.9
Model	-40.8 [-519.4, 437.7]	0.61	0.94	58.6 ± 19.1	70.7 ± 22.9	43.6 ± 30

Detection of CAC was assessed using a combination of Bland-Altman analysis and intraclass correlation coefficients (ICC). Bland-Altman mean and 95% limits of agreement (LOA) are reported. Cohen’s Kappa was used to assess agreement in the classification of patients into five volume score risk categories (0–4, for volume scores of < 1, 1-112, 112–400, 401–999, >= 1000, respectively)⁴⁵. We report the F1-score, sensitivity, and positive predictive value (PPV), which are aggregated over each voxel for a given patient and averaged over all coronary labels (i.e., the intersection between true calcified voxels and false positive voxels). Figure 1 and Fig. 2 show Bland-Altman agreement plots for intra-observer and inter-observer agreement, respectively. Figure 3 shows Bland-Altman agreement plots between our model and annotators.
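The voxel-wise metrics named above can be illustrated with a minimal sketch for a single label in a single patient. It assumes the reference and predicted segmentations are available as sets of voxel coordinates; the function name `voxel_metrics` and the handling of empty segmentations (returning NaN) are illustrative choices, not details taken from the study.

```python
def voxel_metrics(reference, predicted):
    """Return (F1, sensitivity, PPV) for two sets of voxel coordinates."""
    tp = len(reference & predicted)   # voxels labelled by both
    fp = len(predicted - reference)   # predicted only
    fn = len(reference - predicted)   # reference only
    nan = float("nan")                # assumption: undefined ratios become NaN
    sensitivity = tp / (tp + fn) if tp + fn else nan
    ppv = tp / (tp + fp) if tp + fp else nan
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else nan
    return f1, sensitivity, ppv


# Toy example: three of four reference voxels found, one spurious voxel.
reference = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
predicted = {(0, 0, 1), (0, 1, 1), (1, 1, 1), (2, 2, 2)}
f1, sensitivity, ppv = voxel_metrics(reference, predicted)
```

Per-patient values computed this way would then be averaged over labels and patients to obtain summary figures of the kind shown in Table 2.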

To better understand model performance, we assessed agreement between the model and annotators using Bland-Altman analysis of the volume score (in mm³). For all coronary territories, we found a mean offset of -40.8 and 95% LOA of [-519.4, 437.7]. For each coronary territory: left main stem (LMS) -4.96 [-190.1, 180.2], left anterior descending (LAD) -53.1 [-447.16, 340.8], circumflex (Cx) 0.12 [-162.5, 162.7], right coronary artery (RCA) -18.5 [-772, 734.8], and non-coronary regions -277.5 [-2731.4, 2176.3]. We provide per-vessel Bland-Altman plots in Supplementary Fig. 4.
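As a sketch of how a Bland-Altman mean offset and 95% LOA can be derived from paired volume scores, the snippet below uses the common definition of mean difference ± 1.96 standard deviations. The sign convention (model minus annotator) and the use of the sample standard deviation are assumptions, since the text does not state them, and the volumes are made-up values.

```python
from statistics import mean, stdev


def bland_altman(reference, predicted):
    """Return (mean offset, (lower LOA, upper LOA)) for paired measurements."""
    # Assumed sign convention: predicted minus reference.
    diffs = [p - r for r, p in zip(reference, predicted)]
    offset = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD (n - 1 denominator)
    return offset, (offset - half_width, offset + half_width)


# Illustrative volume scores in mm³ (not data from the study).
reference_mm3 = [0.0, 50.0, 200.0, 800.0]
predicted_mm3 = [10.0, 40.0, 180.0, 820.0]
offset, (lower, upper) = bland_altman(reference_mm3, predicted_mm3)
```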

Fourteen (17%) patients, initially reported without any calcification, had calcium predicted by the model. After manual review, two patients were found to have calcification that was missed by the annotators but correctly predicted by the model. The twelve remaining false positive cases included four with metal artefacts in the heart, and two where only non-coronary calcification was reported. The remaining six patients presented with features including noisy scans, swirling contrast, and bright artefacts near the heart. We include qualitative examples of false positive predictions in Fig. 4. Our method reported CAC in all patients where calcium was present.

We additionally report our voxel-wise metrics across volume score risk categories in Table 3. For classification of patients into volume score risk categories, we found substantial agreement (Kappa = 0.61) between predicted CAC and reference CAC using Cohen’s Kappa statistic, with a specificity of 94.8% (± 2.37), sensitivity of 72.3% (± 20.8), PPV of 71.5% (± 21.9), and NPV of 94.6% (± 2.68). We provide qualitative examples of accurate model predictions in Fig. 5. We provide a confusion matrix for agreement between the model and annotators for volume score risk classification in Supplementary Fig. 2.
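The risk-category agreement described above can be sketched as follows. The snippet maps volume scores to the five categories and computes an unweighted Cohen's Kappa. Whether the study used a weighted or unweighted statistic is not stated, and the handling of the shared boundary at 112 and of values between 400 and 401 is an assumption made only for this illustration.

```python
from collections import Counter


def risk_category(volume_mm3):
    """Map a volume score to category 0-4 (boundary handling is assumed)."""
    if volume_mm3 < 1:
        return 0
    if volume_mm3 <= 112:
        return 1
    if volume_mm3 <= 400:
        return 2
    if volume_mm3 < 1000:
        return 3
    return 4


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's Kappa for two equal-length label sequences."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:  # both raters used one identical category throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Illustrative volume scores in mm³ (not data from the study).
reference = [risk_category(v) for v in [0, 50, 300, 700, 1500, 0]]
predicted = [risk_category(v) for v in [0, 60, 90, 650, 1200, 5]]
kappa = cohens_kappa(reference, predicted)
```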

Table 3

Metrics across the volume score risk-categories. Mean values are reported with standard deviations.
	Automated Model Voxel-wise Metrics
Volume Score risk category	Number of Patients	F1 (%)	Sensitivity (%)	PPV (%)
No risk (< 1)	36	68.3 ± 32.2	97.6 ± 1.7	7.3 ± 22.4
Average Risk (1-112)	15	57.7 ± 24	79.1 ± 29.6	51.1 ± 27
Moderate Risk (112–400)	8	42.3 ± 14.7	60.3 ± 24	35.8 ± 15.7
High Risk (401–999)	11	60.7 ± 14.3	67.8 ± 14.2	56.1 ± 16.3
Very High Risk (> 1000)	11	70.1 ± 9.9	69.5 ± 16.4	74.9 ± 12.1

Effect of contrast on automated calcium detection

To explore how our method will perform in the incidental detection of CAC in any CT scan including the heart, we investigated the effect of contrast agent on the ability of the model to accurately detect CAC. We report the difference in reported CAC between annotators and our automatic technique, against brightness in different organs in Fig. 6.

We found that model performance was not affected by contrast agent concentration in the coronary arteries. No significant correlation was found between brightness in organs and the volume score agreement between annotators and our model. Spearman’s rank correlation for the brightness in the aorta and pulmonary artery was -0.005 (p=0.62), and for the spleen and liver, 0.02 (p=0.8). Bland-Altman analysis of volume score agreement between annotators and our model showed a mean offset and LOA in mm³ of -54.8 [-465.1, 355.5] for contrast scans and -14.4 [-597.3, 568.5] for non-contrast scans. A Mann-Whitney U-test did not show a significant difference (p=0.84) in mean difference between contrast (N=53) and non-contrast (N=28) scans.
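For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranks of two variables. The sketch below computes the coefficient only (no p-value) using average ranks for ties; the helper names and the toy brightness and volume-difference values are illustrative and not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: median vessel brightness (HU) vs. model-annotator difference (mm³).
brightness_hu = [40, 120, 180, 250, 300]
volume_difference_mm3 = [5, 30, 2, 12, 8]
rho = spearman(brightness_hu, volume_difference_mm3)
```

A coefficient near zero, as reported in the text, indicates no monotonic relationship between brightness and disagreement.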

There is rising interest in incidental identification of biomarkers to opportunistically screen for risk of future adverse events ^19,47. Significant evidence supports CAC as a predictor of future atherosclerotic events ^14,48–50, particularly in otherwise asymptomatic populations, but remains under-reported ^17,18. This is supported by a SCCT/STR guideline which recommends reporting moderate and severe CAC on all patients undergoing lung cancer screening ⁵¹. Several studies have developed automated techniques for CAC scoring in non-contrast CT scans acquired for non-coronary indications ^{27,29–31,33,52,53}. A recent study from Teng et al. ¹⁴ manually reported the prevalence of CAC in both contrast and non-contrast, non-ECG synchronised, non-coronary chest CT. Whilst multiple studies have developed capabilities for automatically quantifying CAC in CCTA scans ^{25,34,35,37–44,54}, our study is the first automated technique capable of quantifying CAC in both contrast and non-contrast, non-ECG synchronised, non-coronary CT using a single model.

This study shows that a single deep learning model can reliably detect and quantify CAC in any CT scan that includes the heart in the field of view, irrespective of the presence of contrast. We evaluated the model’s performance on a single held-out test set, demonstrating robust detection of CAC and stratification of patients into established risk categories⁴⁶. We found that agreement between the model and annotators did not correlate with brightness in the heart (Spearman coefficient -0.005), and found no significant difference in annotator and model agreement between contrast and non-contrast scans. Therefore, our model is invariant to contrast levels and is capable of quickly detecting CAC on scans which may otherwise require careful thresholding to differentiate calcium from surrounding tissue.

In the absence of the established 130HU threshold for calcium in non-contrast scans, there is currently no standardised procedure for manually annotating CAC in contrast scans. A common method is to estimate a per-scan threshold using HU statistics within a region of interest in the ascending or descending aorta, followed by manual annotation of voxels above this threshold, or semi-automatic labelling of whole calcium lesions identified by thresholding and connected component analysis ^{34,38,41,44,55}. Other studies have used free-form voxel annotation without the use of thresholds using arbitrary manual adjustment of windowing levels ^36,38,43. Our annotation procedure involves first manually identifying a per-scan threshold, and then manually annotating voxels above this threshold. This reduces the time spent on tedious editing and/or rejection of calcium lesions from metal artefacts identified by semi-automatic thresholding. Similarly, using connected-component analysis to identify calcium lesions makes it impossible to differentiate calcium lesions which span multiple anatomical regions (e.g., across the aortic valve, LCA, and RCA). This work benefits from precise reference annotations for calcium in specific coronary regions.

Compared with other techniques which automate CAC quantification on CCTA, our method did not require ground truth for the coronary arteries ^37,38 or coronary regions ²⁵, large datasets for domain adaptation learning ⁴¹, or rely on reference ground truth generated from virtual non-contrast scans ⁴². More generally, our method was developed without the use of cardiac atlases for registration ^22,25, and was thus able to implicitly identify coronary anatomy and discern between calcification in different locations in both non-contrast and contrast scans. To reduce the FOV of scans to just the coronary region, we used the open-source TotalSegmentator ⁵⁶ tool. Additionally, to eliminate false positive calcium detections outside the heart, a heart region mask generated using an in-house multi-atlas heart segmentation tool was used.

There are important limitations to this study. The small size of our dataset, which was acquired primarily using two scanners from a single institution, constrains the scope of our generalizations. The retrospective design of this study also limits the clinical applicability of our findings. However, our cohort consists of unselected sequential scans. Furthermore, our model was trained on a set of homogeneous high-resolution and thin-slice scans, which are typical of acquisitions from modern scanners. In a preliminary investigation on lower resolution, thick-slice data, we found similar results indicating strong generalizability on lower resolution scans which are traditionally used for calcium scoring. The inclusion criteria for this study included scans with the whole heart in the field of view. In practice, any calcium detected in scans where the heart is partially visible presents a valuable opportunistic finding.

The automatic method developed in this study reported CAC in fourteen scans where no calcium was reported. Two of these scans detected only non-coronary calcification, which does not adversely affect CAC risk scores. In the remaining twelve scans, false positive calcium detection was due to artefacts including pacemaker wires, poor quality scans (e.g. high noise or motion), abnormal brightness near the heart (e.g. pericardial calcification), or calcification missed by annotators which was retrospectively found to be correctly identified by the model. Since the model saw few examples of pacemaker wires during training, and they were more prevalent in the test set, the model was not expected to generalize robustly to these cases. Additionally, in our intra-observer study we found two cases (Supplementary Fig. 1) where annotators had reported no calcification in one attempt, and positively identified calcification in another, for the same scan. Robust identification of coronary calcium can be difficult in the presence of noisy scans, significant artefacts, and non-coronary calcification.

In conclusion, we have developed the first automatic technique capable of detecting CAC invariant to contrast in non-ECG synchronised CT scans. The general applicability of our method further reduces barriers for large-scale incidental screening which can utilise the greater prognostic value of CAC. This work can utilise CT scans across a variety of indications and modalities, thus can take advantage of the millions of CT scans acquired annually for any indication. Patients who are asymptomatic, or missed by traditional risk models, would benefit from early detection and preventative therapies, which could help significantly reduce the global health burden of CHD.

Ethics and governance approval

The study was undertaken at the NHS Golden Jubilee Hospital, a tertiary cardiothoracic centre that provides regional CT imaging for a population of over two million patients in the west of Scotland. This study was conducted in accordance with guidance from the UK Policy Framework for Health and Social Care Research. The project was approved by Yorkshire and The Humber – Leeds East Research Ethics Committee (REC Ref 21/YH/0217).

Scan selection

Using the Picture Archiving and Communication System (PACS) in our institution, we identified CT scans between March 2022 and March 2023 that included the heart in the field of view. The clinical indication for each scan was recorded from the electronic care record together with patient demographics (age and sex). The scan field of view and use of contrast were also recorded. All scans were anonymised and the DICOM file transferred to a remote workstation. Details of additional scan parameters, including CT reconstruction kernel, x-ray tube voltage, and reconstructed slice thickness, were noted from the relevant DICOM tags. When multiple scans were available for a single patient, we chose the thinnest-slice scan (for greater ground-truth annotation accuracy) with the softest reconstruction kernel (for minimum noise), and the patient in a supine position.

Ground Truth Annotations

Following curation, scans were uploaded to a dedicated annotation platform with a custom workflow and image annotation tool which enabled automatic and smooth transition of cases between curator, annotator, reviewer, and quality assurer. Three clinicians (R.H., C.B., R.G.), each with more than two years of experience annotating coronary calcification on CT scans, underwent training on both the annotation protocol and annotation platform, on at least 15 datasets each. The remaining scans were then randomly assigned to each of the three. The most senior of the three annotators had more than ten years of Cardiac CT reporting experience (R.G.) and acted as reviewer. Annotators and reviewer were blinded to radiology reports, patient details such as medical history, scan indication, and scan parameters during the annotation process.

Annotators identified calcium in seven defined regions of interest: the left main stem artery (LMS), left anterior descending artery (LAD), left circumflex artery (Cx), right coronary artery (RCA), aortic valve (AV), aortic root (AR, assumed below the sinotubular junction), and the mitral valve annulus (MA). Annotation was performed manually on axial slices using a threshold-assisted brush. No maximum threshold was advised; rather, annotators gradually increased the lower threshold until there were few highlighted pixels within the main chambers (atria and ventricles) to minimise annotation of contrast in the vasculature.

Annotators additionally assessed for the presence of contrast in the scan (yes/no), whether the scan was of sufficient quality for interpretation, and the presence of any artefacts which may affect scan interpretation, e.g., motion, permanent pacemakers (including biventricular pacemaker), valve replacement, central venous lines, or evidence of previous cardiac surgery (including coronary artery bypass grafting). The senior annotator reviewed a random 40% of the image annotations to confirm consistency. Quality assurance was conducted on all cases by S.S.M and S.M. to ensure completeness of the data and compliance with the annotation protocol. Annotations which contained overlapping labels for a given voxel were merged by assigning voxels to labels in the following order of priority: LMS, LAD, RCA, Cx, AR, AV, MA. This ordering was consistent across all patients.
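The priority-based merging of overlapping labels can be sketched as below. The priority order is the one given in the text; the data layout (a dictionary mapping each label to a set of voxel coordinates) and the function name `merge_labels` are assumptions made for illustration.

```python
# Priority order as stated in the text, highest priority first.
PRIORITY = ["LMS", "LAD", "RCA", "Cx", "AR", "AV", "MA"]


def merge_labels(annotations):
    """Resolve overlaps so that each voxel carries exactly one label.

    annotations maps a label name to a set of (z, y, x) voxel coordinates.
    Returns a dict mapping each annotated voxel to its winning label.
    """
    merged = {}
    for label in PRIORITY:
        for voxel in annotations.get(label, ()):
            # setdefault keeps the first label seen, i.e. the highest priority.
            merged.setdefault(voxel, label)
    return merged


# Toy example with two overlapping voxels.
annotations = {
    "LAD": {(1, 1, 1), (1, 1, 2)},
    "LMS": {(1, 1, 1)},
    "AR": {(1, 1, 2), (5, 5, 5)},
}
merged = merge_labels(annotations)
```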

Scan summary data

An unselected series of 320 CT scans, acquired between March 2022 and March 2023, with the heart in the field of view (FOV) were identified, anonymised, and downloaded to the annotation platform. A total of 17 scans were excluded during curation (Fig. 7) due to unsuitable acquisition series, unusable reconstructions, or incomplete coverage of the heart, which made manual annotations of scans difficult. The remaining 303 scans were assigned for annotation, of which annotators excluded a further 4 scans due to severe artefact and/or extensive non-coronary calcification (Fig. 8). The reviewer excluded a further two scans: one scan with situs inversus and one scan with a biventricular pacemaker lead. Two further scans were excluded during the quality assurance (QA) process due to ambiguity around the presence of contrast and severe artefact, leaving a total of 295 scans available for development and testing. All exclusion criteria were applied before model training and testing. Our final cohort represents a wide variety of scan quality, imaging artefacts, and foreign bodies (Supplementary Fig. 3, Supplementary Table 1). Supplementary Table 2 details the various indications for the scans.

Scans were divided into training and validation sets (214 scans) and a testing set (81 scans) (Table 1). Scans were primarily acquired from two scanners, a Canon Medical Systems Aquilion ONE scanner, and a Siemens SOMATOM go Top scanner. A single scan was acquired from a GE Medical Systems Revolution EVO scanner. Scans were reconstructed using soft and sharp kernels, including the soft-tissue Br40f and Standard kernels, as well as sharper Lung and Body Sharp kernels. This indicates the variety in the dataset which may be encountered in incidental findings.

We used stratified random sampling based on the presence of calcium in specific coronary territories to assign patients to the training, validation, and testing splits, thus ensuring territory classes were balanced. We initially split the dataset between training and testing, then repeated the procedure in the training split to create three folds for cross-validation, resulting in three folds of the same dataset with 143 training and 71 validation scans each, and 81 testing scans.
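One possible way to implement a stratified split of this kind is sketched below. It treats each distinct combination of calcified territories as a stratum and samples the same fraction from every stratum. The exact stratification scheme used in the study is not described, so the strata definition, the function name, the seed, and the toy cohort are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(patient_territories, test_fraction, seed=0):
    """Split patient IDs into (train, test), sampling within each stratum.

    patient_territories maps a patient ID to the territories with calcium.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    strata = defaultdict(list)
    for patient_id, territories in sorted(patient_territories.items()):
        strata[frozenset(territories)].append(patient_id)
    train, test = [], []
    for key in sorted(strata, key=sorted):  # deterministic stratum order
        members = strata[key]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Toy cohort: ten patients with LAD calcium and ten without any calcium.
patients = {f"p{i:02d}": (["LAD"] if i % 2 else []) for i in range(20)}
train_ids, test_ids = stratified_split(patients, test_fraction=0.3)
```

Applying the same function again to the training IDs, with a different seed per fold, would be one way to derive cross-validation folds of the kind described above.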

Presence of contrast was indicated on the radiology reports of 221/295 (75%) scans. To understand the effect of contrast agent on CAC detection, we estimated the levels of contrast in cardiac, vascular, and the solid organ components of the scan. We used the open-source TotalSegmentator tool ⁵⁶ to automatically segment the spleen, liver, aorta, and main pulmonary artery. The median HU values in these regions were used as a surrogate for contrast phase as follows. Early-stage contrast scans can be identified by higher HU values in the aorta and pulmonary arteries than in the spleen and liver (early arterial phase). As contrast diffuses through the vascular system, the liver and spleen parenchyma become brighter. Thus, later stage contrast scans can be recognised by higher HU values in these organs as well (portal venous phase).

The variation in brightness of these predefined elements in our scans is shown in Fig. 9. We estimate the majority (199/295, 67%) of our contrast scans were acquired during the portal venous phase, as we observe enhanced attenuation in both the liver and spleen, and the aorta and pulmonary artery, compared to the non-contrast baseline (dashed lines, Fig. 9). A small number of scans (8/295, 2%) were indicated as contrast scans but presented without enhancement in the organs or arteries considered; we identified these as delayed contrast scans. We identified the remaining contrast scans as early arterial phase (14/295, 4.7%), as they showed enhanced attenuation in the aorta and pulmonary artery over non-contrast but did not show significant enhancement in the liver and spleen.
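The phase assignment described above can be expressed as a simple rule on the median HU of each segmented region. In the sketch below the numeric cut-offs are hypothetical placeholders, because the text defines enhancement only relative to a non-contrast baseline and gives no values; requiring both vessels (or both organs) to exceed the cut-off is likewise an assumption.

```python
from statistics import median

# Hypothetical cut-offs in HU; the source gives no numeric values.
VESSEL_ENHANCED_HU = 100.0  # aorta and main pulmonary artery
ORGAN_ENHANCED_HU = 80.0    # liver and spleen


def contrast_phase(documented_contrast, region_hu):
    """Classify a scan from HU samples taken inside each segmentation mask.

    region_hu maps "aorta", "pulmonary_artery", "liver" and "spleen"
    to lists of HU values.
    """
    if not documented_contrast:
        return "non-contrast"
    med = {name: median(values) for name, values in region_hu.items()}
    vessels = min(med["aorta"], med["pulmonary_artery"]) > VESSEL_ENHANCED_HU
    organs = min(med["liver"], med["spleen"]) > ORGAN_ENHANCED_HU
    if vessels and organs:
        return "portal venous"
    if vessels:
        return "early arterial"
    if not organs:
        return "delayed"
    return "indeterminate"  # organs enhanced without vessels: not covered in the text


# Toy HU samples (not data from the study).
example = {
    "aorta": [150, 160, 170],
    "pulmonary_artery": [140, 150, 155],
    "liver": [100, 105, 110],
    "spleen": [110, 115, 120],
}
phase = contrast_phase(True, example)
```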

Importantly, we observe a wide variety of attenuation in the coronary region, which presents challenges for the detection of coronary calcification. Each scan will have varying thresholds at which calcium is significantly differentiated from coronary vessels and surrounding tissue. This makes manual annotation difficult and time-consuming, particularly for early-stage contrast scans where calcium may not be differentiated from contrast-enhanced vessels. This highlights the value of an automated technique for detecting calcium that is robust to contrast levels.

Manual annotation results

Figure 10 illustrates the distribution of calcification across coronary territories in our cohort. No coronary calcification was found in 90/295 (30%) scans. Of these, 80/295 (27%) showed no evidence of any calcification, including non-coronary calcification. The remaining 10/295 (3%) had only non-coronary calcification. Of the 205 scans with coronary artery calcification, the most commonly affected artery was the LAD (191/295, 64%), followed by the RCA (135/295, 45%), the circumflex artery (125/295, 42%), and the LMS (102/295, 34%). Aortic root calcification was the most common non-coronary calcified region (133/295, 45%), followed by the aortic valve (76/295, 25%), and the mitral annulus (58/295, 19%).

To assess annotator agreement, we assigned 18/295 (6%) cases to annotators for inter-observer variability, and 30/295 (10%) for intra-observer variability. Annotators were blinded to previous attempts and followed the same annotation procedure for both attempts. For intra-observer attempts, we ensured at least a three-month gap between an annotator’s first and second attempt at annotating a case.

Artificial Intelligence Algorithm Development

CT scans acquired for non-coronary indications can have wide FOVs, including the neck and pelvis. To reduce the unnecessarily large FOV we crop the images to the heart region in the axial-plane using the TotalSegmentator ⁵⁶ tool. This significantly improves efficiency and reduces the time for training the model.

Our algorithm uses two CNNs in successive stages, processing the CT volume in 3D patches. The first-stage CNN produces a segmentation of calcium, including coronary and non-coronary territories. The second-stage CNN refines the output from the first-stage network and classifies each calcium voxel by territory (LMS, LAD, Cx, RCA, or the non-coronary regions AR, AV, and MA). While there is evidence for the prognostic value of calcification in non-coronary structures ⁵⁷,⁵⁸, our primary reason for detecting calcification in these regions was to reduce the chance of the model confusing non-coronary calcification with CAC. We reduce false-positive predictions by post-processing the model output. Firstly, we use an in-house multi-atlas heart segmentation tool to generate heart masks and exclude any predicted voxels that fall outside the heart mask. Secondly, we apply 3D connected-component labelling and discard lesions smaller than 3 mm³. We tuned this minimum lesion volume on the validation sets. Finally, whole-volume CAC segmentations are obtained by stitching adjacent 3D model output patches.
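The two false-positive filters described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the heart mask is taken as given (the in-house segmentation tool is not reproduced), `scipy.ndimage.label` is used with its default face connectivity because the connectivity used in the study is not stated, and the function name `remove_false_positives` and its parameter names are hypothetical.

```python
import numpy as np
from scipy import ndimage


def remove_false_positives(pred, heart_mask, voxel_volume_mm3,
                           min_volume_mm3=3.0):
    """Filter a binary calcium prediction in two steps.

    1. Drop predicted voxels outside the heart mask.
    2. Label 3D connected components and drop lesions whose volume is
       below `min_volume_mm3` (3 mm^3 in the text).

    Assumption: default face connectivity of `ndimage.label`.
    """
    inside = np.logical_and(pred > 0, heart_mask > 0)
    labels, n_lesions = ndimage.label(inside)
    if n_lesions == 0:
        return inside
    # counts[k] is the number of voxels with label k; label 0 is background.
    counts = np.bincount(labels.ravel())
    keep = counts * voxel_volume_mm3 >= min_volume_mm3
    keep[0] = False
    return keep[labels]
```

The voxel volume would normally be the product of the scan's voxel spacings in millimetres, so the same 3 mm³ threshold corresponds to a different voxel count on each scan.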

Algorithm Architecture and Training

We used an encoder-decoder based U-Net ⁵⁹ architecture for both the first and second CNN stages (Fig. 12). The encoder layers consist of two convolutional blocks which down-sample the input. The decoder layers consist of two convolutional blocks followed by a transposed convolution, and use skip connections from the encoder layers at the corresponding levels. Input CT scans are intensity windowed (window level and width of 200 and 1200 HU, respectively), and then cropped to the coronary region. During training, we randomly sample input patches with and without calcium present, and augment patches using random rotation and random adjustment of voxel intensity values. Both models used early stopping and the Adam optimizer with β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸, and a learning rate of 3 × 10⁻³.
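As a worked example of the intensity windowing, a window level of 200 HU and width of 1200 HU corresponds to clipping intensities to the range [−400, 800] HU. The sketch below applies that clipping; rescaling the result to [0, 1] is an assumption made here for illustration and is not stated in the text, and the function name `window_hu` is hypothetical.

```python
import numpy as np


def window_hu(volume_hu, level=200.0, width=1200.0):
    """Clip HU values to [level - width/2, level + width/2].

    With the defaults this is [-400, 800] HU. The final rescaling to
    [0, 1] is an assumed normalisation, not taken from the paper.
    """
    lo = level - width / 2.0
    hi = level + width / 2.0
    clipped = np.clip(np.asarray(volume_hu, dtype=np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)
```

For instance, air at −1000 HU maps to 0, a voxel at 200 HU maps to 0.5, and dense calcium or metal above 800 HU saturates at 1.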

Coronary calcium is very sparse. To prevent the model from simply predicting no calcium, we use a combination of Dice and focal loss ⁶⁰ to down-weight the contribution of the background voxels. The Dice loss was weighted by 0.5. For the focal loss hyperparameters, we used a \(\gamma\) of 2 for both stages. The first-stage model was trained with \(\alpha\) = 6900, and the second-stage model with \(\alpha\) = [1.5, 1800, 1400, 1600, 1400, 1400] for the background, LMS, LAD, Cx, RCA, and non-coronary classes, respectively. We calculated the per-class inverse voxel frequency as the weighting term for each class. Final model predictions during inference were generated by optimising the models for each cross-validation fold and ensembling the best-performing models by voxel-wise probability averaging.
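To make the combined loss concrete, the following NumPy sketch covers the binary (first-stage) case. Several details are assumptions rather than facts from the text: \(\alpha\) is applied as a foreground weight with a background weight of 1, the focal term has weight 1.0 (only the Dice weight of 0.5 is given), and a soft Dice formulation with a small smoothing constant is used. The function names are hypothetical, and a real training loop would use a differentiable framework rather than NumPy.

```python
import numpy as np


def soft_dice_loss(prob, target, eps=1e-6):
    """1 - soft Dice overlap between probabilities and a binary target."""
    intersection = np.sum(prob * target)
    denominator = np.sum(prob) + np.sum(target)
    return float(1.0 - (2.0 * intersection + eps) / (denominator + eps))


def focal_loss(prob, target, alpha=6900.0, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -w * (1 - p_t)^gamma * log(p_t).

    Assumption: `alpha` weights foreground voxels, background weight is 1.
    """
    prob = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(target == 1, prob, 1.0 - prob)
    weight = np.where(target == 1, alpha, 1.0)
    return float(np.mean(-weight * (1.0 - p_t) ** gamma * np.log(p_t)))


def combined_loss(prob, target, dice_weight=0.5, focal_weight=1.0):
    """Weighted sum of Dice and focal loss (focal weight is assumed)."""
    return (dice_weight * soft_dice_loss(prob, target)
            + focal_weight * focal_loss(prob, target))
```

The factor (1 − p_t)^γ shrinks the loss for voxels the model already classifies confidently, which in practice is the vast majority of background voxels, so the few calcium voxels dominate the gradient.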

Statistics

All analyses were performed in Python (Version 3.9.15) using numpy (Version 1.23.5). Bland-Altman ⁶¹ means and 95% limits of agreement were used to evaluate per-vessel and total coronary agreement between annotators, and between the automated model and annotators. Spearman’s rank correlation or Mann-Whitney U tests were used to compare continuous data using the scipy package (Version 1.9.3). Cohen’s Kappa was used to evaluate agreement in classification of patients in the five volume risk categories, also using the scipy package. Cohen’s Kappa was interpreted as: (< 0.00): poor; (0.00–0.20): slight; (0.21–0.40): fair; (0.41–0.60): moderate; (0.61–0.80): substantial; (0.81–1.00): almost perfect agreement ⁶². Volume score classification confusion matrices were created using scikit-learn (Version 1.1.3). We report the voxel-based F1-score and positive predictive value, averaged over all voxels for a given scan, for all coronary labels. Voxel-based metrics were calculated using the MONAI package (Version 1.1.0). All statistical tests were two-sided. Statistical significance was defined as a p value of < 0.05. Intraclass correlation coefficients were calculated using the Pingouin package (Version 0.5.3).
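The agreement statistics can be written out in a few lines of NumPy, which may help readers reproduce them. This is a sketch under assumptions: the Bland-Altman limits use the conventional mean ± 1.96 standard deviations of the paired differences, the kappa shown is the unweighted form (the text does not say whether a weighted variant was used), and all function names are hypothetical.

```python
import numpy as np


def bland_altman(a, b):
    """Return (mean difference, lower limit, upper limit) for a - b."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    mean_diff = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd


def cohens_kappa(rater1, rater2, n_classes):
    """Unweighted Cohen's kappa for integer category labels."""
    cm = np.zeros((n_classes, n_classes))
    for i, j in zip(rater1, rater2):
        cm[i, j] += 1
    n = cm.sum()
    observed = np.trace(cm) / n
    expected = (cm.sum(axis=1) @ cm.sum(axis=0)) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return float((observed - expected) / (1.0 - expected))


def interpret_kappa(kappa):
    """Map kappa to the agreement bands listed in the text."""
    if kappa < 0.0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"),
                        (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"
```

With these bands, the Kappa of 0.61 reported in the abstract falls at the lower edge of the "substantial" category.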

Author Contributions

This work would not have been possible without dedication and support from all authors. Salman Mohammadi designed, implemented, and executed all experiments. Richard Good was the primary clinical contributor, oversaw annotator training and data collection and contributed to the manuscript. Keith Goatman provided scientific oversight. Richard Good and Keith Goatman conceived and proposed the study. Shadia Mikhael provided clinical oversight over the whole annotation process, including curation, review, quality assurance, administration of scans across the annotators, annotator training, and review of model results. Jeremy Voisey and Salman Mohammadi supported data curation and processing. Rebecca Hughes and Conor Bradley supported annotations of reference ground truth. Sonia Dahdouh provided scientific support and extensive manuscript reviews. Olivier Jaubert contributed to development of the pre-processing methods used and provided manuscript reviews.

Competing Interests

Six of the authors (S.M., S.S.M., K.A.G., S.D., J.P.V., O.J.) are employees of Canon Medical Research Europe. There are no competing interests in this study.

Data Availability

The raw patient data are not publicly available, to protect patient confidentiality. However, reasonable data requests will be considered. After approval from the hospital and the corresponding authors, de-identified CT data may be provided. Requests to access the datasets should be directed to the corresponding author.

Code Availability

The underlying code for this study is not publicly available as it includes proprietary code.

References

World Health Organisation. Cardiovascular diseases (CVDs). https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds) (2021).
Benner, J. S. et al. A novel programme to evaluate and communicate 10-year risk of CHD reduces predicted risk and improves patients’ modifiable risk factor profile. International Journal of Clinical Practice 62, 1484–1498 (2008).
Persell, S. D. et al. Individualized Risk Communication and Outreach for Primary Cardiovascular Disease Prevention in Community Health Centers: Randomized Trial. Circ: Cardiovascular Quality and Outcomes 8, 560–566 (2015).
Persell, S. D., Lloyd-Jones, D. M., Friesema, E. M., Cooper, A. J. & Baker, D. W. Electronic Health Record-Based Patient Identification and Individualized Mailed Outreach for Primary Cardiovascular Disease Prevention: A Cluster Randomized Trial. J Gen Intern Med 28, 554–560 (2013).
Woodward, M., Brindle, P. & Tunstall-Pedoe, H., for the SIGN group on risk estimation. Adding social deprivation and family history to cardiovascular risk assessment: the ASSIGN score from the Scottish Heart Health Extended Cohort (SHHEC). Heart 93, 172–176 (2005).
Home - ASSIGN Score – prioritising prevention of cardiovascular disease. https://www.assign-score.com/.
Cainzos-Achirica, M. et al. Pathways Forward in Cardiovascular Disease Prevention One and a Half Years After Publication of the 2013 ACC/AHA Cardiovascular Disease Prevention Guidelines. Mayo Clinic Proceedings 90, 1262–1271 (2015).
Goff, D. C. et al. 2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk. Journal of the American College of Cardiology 63, 2935–2959 (2014).
Mori, H. et al. Coronary Artery Calcification and its Progression: What Does it Really Mean? JACC Cardiovasc Imaging 11, 127–142 (2018).
Polonsky, T. S. & Greenland, P. Viewing the Value of Coronary Artery Calcium Testing From Different Perspectives. JAMA Cardiol 3, 908 (2018).
Agatston, A. S. et al. Quantification of coronary artery calcium using ultrafast computed tomography. Journal of the American College of Cardiology 15, 827–832 (1990).
Einstein, A. J. et al. Radiation Dose from Single-Heartbeat Coronary CT Angiography Performed with a 320–Detector Row Volume Scanner. Radiology 254, 698–706 (2010).
Fishman, E. K. Multidetector-row computed tomography to detect coronary artery disease: the importance of heart rate. European Heart Journal Supplements 7, G4–G12 (2005).
Teng, L. E., Kennedy, L., Lok, S. C., O’Rourke, E. & Premaratne, M. An Opportunity to Seize From Low Hanging Fruits: Capitalising on Incidentally Reported Coronary Artery Calcification. Heart, Lung and Circulation 32, 1222–1229 (2023).
Jacobs, P. C. et al. Coronary Artery Calcium Can Predict All-Cause Mortality and Cardiovascular Events on Low-Dose CT Screening for Lung Cancer. American Journal of Roentgenology 198, 505–511 (2012).
Matsumura, M. E. et al. Breast artery calcium noted on screening mammography is predictive of high risk coronary calcium in asymptomatic women: a case control study. Vasa 42, 429–433 (2013).
Balakrishnan, R. et al. Coronary artery calcification is common on nongated chest computed tomography imaging. Clinical Cardiology 40, 498–502 (2017).
Secchi, F. et al. Detection of incidental cardiac findings in noncardiac chest computed tomography. Medicine 96, e7531 (2017).
Sandhu, A. T. et al. Incidental Coronary Artery Calcium: Opportunistic Screening of Previous Nongated Chest Computed Tomography Scans to Improve Statin Rates (NOTIFY-1 Project). Circulation 147, 703–714 (2023).
Shahzad, R. et al. Vessel specific coronary artery calcium scoring: an automatic system. Acad Radiol 20, 1–9 (2013).
Isgum, I., Rutten, A., Prokop, M. & van Ginneken, B. Detection of coronary calcifications from computed tomography scans for automated risk assessment of coronary artery disease. Med Phys 34, 1450–1461 (2007).
de Vos, B. D. et al. Direct Automatic Coronary Calcium Scoring in Cardiac and Chest CT. IEEE Trans Med Imaging 38, 2127–2138 (2019).
Kim, S. Y., Suh, Y. J., Lee, H.-J. & Kim, Y. J. Prognostic value of coronary artery calcium scores from 1.5 mm slice reconstructions of electrocardiogram-gated computed tomography scans in asymptomatic individuals. Sci Rep 12, 7198 (2022).
Zeleznik, R. et al. Deep convolutional neural networks to predict cardiovascular risk from computed tomography. Nat Commun 12, 715 (2021).
Lee, J.-G. et al. Fully Automatic Coronary Calcium Score Software Empowered by Artificial Intelligence Technology: Validation Study Using Three CT Cohorts. Korean J Radiol 22, 1764–1776 (2021).
Takx, R. A. P. et al. Automated Coronary Artery Calcification Scoring in Non-Gated Chest CT: Agreement and Reliability. PLOS ONE 9, e91239 (2014).
Eng, D. et al. Automated coronary calcium scoring using deep learning with multicenter external validation. npj Digit. Med. 4, 1–13 (2021).
Yu, J. et al. Automated total and vessel-specific coronary artery calcium (CAC) quantification on chest CT: direct comparison with CAC scoring on non-contrast cardiac CT. BMC Medical Imaging 22, 177 (2022).
van Velzen, S. G. M. et al. Deep Learning for Automatic Calcium Scoring in CT: Validation Using Multiple Cardiac CT and Chest CT Protocols. Radiology 295, 66–79 (2020).
Gernaat, S. A. M. et al. Automatic quantification of calcifications in the coronary arteries and thoracic aorta on radiotherapy planning CT scans of Western and Asian breast cancer patients. Radiother Oncol 127, 487–492 (2018).
Lessmann, N. et al. Automatic Calcium Scoring in Low-Dose Chest CT Using Deep Neural Networks With Dilated Convolutions. IEEE Trans Med Imaging 37, 615–625 (2018).
Choi, J. H. et al. Validation of deep learning-based fully automated coronary artery calcium scoring using non-ECG-gated chest CT in patients with cancer. Front Oncol 12, 989250 (2022).
Isgum, I., Prokop, M., Niemeijer, M., Viergever, M. A. & van Ginneken, B. Automatic coronary calcium scoring in low-dose chest computed tomography. IEEE Trans Med Imaging 31, 2322–2334 (2012).
Wolterink, J. M. et al. Automatic coronary artery calcium scoring in cardiac CT angiography using paired convolutional neural networks. Med Image Anal 34, 123–136 (2016).
Schuhbaeck, A. et al. Coronary calcium scoring from contrast coronary CT angiography using a semiautomated standardized method. J Cardiovasc Comput Tomogr 9, 446–453 (2015).
Fischer, A. M. et al. Accuracy of an Artificial Intelligence Deep Learning Algorithm Implementing a Recurrent Neural Network With Long Short-term Memory for the Automated Detection of Calcified Plaques From Coronary Computed Tomography Angiography. J Thorac Imaging 35 Suppl 1, S49–S57 (2020).
Li, Q. et al. Coronary artery calcium quantification using contrast-enhanced dual-energy computed tomography scans in comparison with unenhanced single-energy scans. Phys Med Biol 63, 175006 (2018).
Lee, J. O., Park, E.-A., Park, D. & Lee, W. Deep Learning-Based Automated Quantification of Coronary Artery Calcification for Contrast-Enhanced Coronary Computed Tomographic Angiography. JCDD 10, 143 (2023).
Yang, G. et al. Automatic coronary calcium scoring using noncontrast and contrast CT images. Med Phys 43, 2174 (2016).
Otton, J. M. et al. A method for coronary artery calcium scoring using contrast-enhanced computed tomography. Journal of Cardiovascular Computed Tomography 6, 37–44 (2012).
Zhai, Z. et al. Learning coronary artery calcium scoring in coronary CTA from non-contrast CT using unsupervised domain adaptation. Frontiers in Cardiovascular Medicine 9, (2022).
Mu, D. et al. Calcium Scoring at Coronary CT Angiography Using Deep Learning. Radiology 302, 309–316 (2022).
Van Herten, R. L. M. et al. Automatic Coronary Artery Plaque Quantification and CAD-RADS Prediction using Mesh Priors. IEEE Transactions on Medical Imaging 1–1 (2023) doi:10.1109/TMI.2023.3326243.
Mylonas, I. et al. Quantifying coronary artery calcification from a contrast-enhanced cardiac computed tomography angiography study. Eur Heart J Cardiovasc Imaging 15, 210–215 (2014).
Callister, T. Q. et al. Coronary artery disease: improved reproducibility of calcium scoring with an electron-beam CT volumetric method. Radiology 208, 807–814 (1998).
Lo‐Kioeng‐Shioe, M. S. et al. Coronary Calcium Characteristics as Predictors of Major Adverse Cardiac Events in Symptomatic Patients: Insights From the CORE320 Multinational Study. JAHA 8, e007201 (2019).
Pickhardt, P. J. et al. Opportunistic Screening at Abdominal CT: Use of Automated Body Composition Biomarkers for Added Cardiometabolic Value. RadioGraphics 41, 524–542 (2021).
Hecht, H. S. Coronary Artery Calcium Scanning. JACC: Cardiovascular Imaging 8, 579–596 (2015).
Peng, A. W. et al. Very High Coronary Artery Calcium (≥1000) and Association With Cardiovascular Disease Events, Non–Cardiovascular Disease Outcomes, and Mortality: Results From MESA. Circulation 143, 1571–1583 (2021).
Williams, M. C. et al. Reporting incidental coronary, aortic valve and cardiac calcification on non-gated thoracic computed tomography, a consensus statement from the BSCI/BSCCT and BSTI. BJR 94, 20200894 (2021).
Hecht, H. S. et al. 2016 SCCT/STR guidelines for coronary artery calcium scoring of noncontrast noncardiac chest CT scans: A report of the Society of Cardiovascular Computed Tomography and Society of Thoracic Radiology. Journal of Thoracic Imaging 32, W54–W66 (2017).
Aslam, A. et al. Assessment of isotropic calcium using 0.5-mm reconstructions from 320-row CT data sets identifies more patients with non-zero Agatston score and more subclinical atherosclerosis than standard 3.0-mm coronary artery calcium scan and CT angiography. J Cardiovasc Comput Tomogr 8, 58–66 (2014).
Mühlenbruch, G. et al. The accuracy of 1- and 3-mm slices in coronary calcium scoring using multi-slice CT in vitro and in vivo. Eur Radiol 17, 321–329 (2007).
Liu, J. et al. A Vessel-Focused 3D Convolutional Network for Automatic Segmentation and Classification of Coronary Artery Plaques in Cardiac CTA. in Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges (eds. Pop, M. et al.) 131–141 (Springer International Publishing, Cham, 2019). doi:10.1007/978-3-030-12029-0_15.
Saur, S. C., Alkadhi, H., Desbiolles, L., Székely, G. & Cattin, P. C. Automatic detection of calcified coronary plaques in computed tomography data sets. Med Image Comput Comput Assist Interv 11, 170–177 (2008).
Wasserthal, J. et al. TotalSegmentator: robust segmentation of 104 anatomical structures in CT images. Preprint at https://doi.org/10.48550/arXiv.2208.05868 (2022).
Christensen, J. L. et al. Aortic valve calcification predicts all-cause mortality independent of coronary calcification and severe stenosis. Atherosclerosis 307, 16–20 (2020).
Koos, R. et al. Aortic Valve Calcification as a Marker for Aortic Stenosis Severity: Assessment on 16-MDCT. American Journal of Roentgenology 183, 1813–1818 (2004).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Preprint at https://doi.org/10.48550/arXiv.1505.04597 (2015).
Lin, T.-Y., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal Loss for Dense Object Detection. Preprint at https://doi.org/10.48550/arXiv.1708.02002 (2018).
Giavarina, D. Understanding Bland Altman analysis. Biochem Med 25, 141–151 (2015).
Landis, J. R. & Koch, G. G. The measurement of observer agreement for categorical data. Biometrics 33, 159–174 (1977).

