Comparison of Bone Scintigraphy and PET/CT for the Evaluation of Disease Activity in Patients with Rheumatoid Arthritis: Application of Bone Scintigraphy

Background: We aimed to compare the reliability of bone scintigraphy (BS) and fluorine-18-fluorodeoxyglucose (18F-FDG) positron emission tomography (PET)-derived parameters in the detection of active arthritis in 28-joint areas and to evaluate the reliability of joint counts between BS and clinical assessment in patients with rheumatoid arthritis (RA). Methods: We enrolled 106 patients (67 in the development group and 39 in the validation group) with active RA who underwent BS, 18F-FDG PET/computed tomography (CT), and clinical evaluation of disease activity. We compared the results of BS-derived joint assessment with those of PET-derived and clinical joint assessments. Subsequently, we developed a disease activity score (DAS) using BS-positive joints and validated it in an independent group. Results: The number of BS-positive joints in 28-joint areas significantly correlated with the swollen/tender joint counts (SJC/TJC) and PET-derived joint counts. A BS uptake score of 2 (strong positive) was significantly more sensitive than a BS uptake score of 1 (weak positive) in detecting a PET-positive joint among the 28 joints. After conducting multivariate analyses including erythrocyte sedimentation rate (ESR) and patient global assessment (PGA) in addition to BS-derived parameters, BS/DAS was obtained as follows: 0.056 × number of BS-positive joints in 28 joints + 0.012 × ESR + 0.030 × PGA. A significant correlation between BS/DAS and DAS28-ESR was confirmed in the validation group. Conclusion: Strong positive uptake of BS is sensitive and reproducible for the detection of active joints, and can complement the clinical assessment of disease activity in RA.


Introduction
Rheumatoid arthritis (RA) is a chronic inflammatory joint disorder characterized by the synovial infiltration of active immune cells, which causes the destruction of cartilage, bone, and joint structures [1]. Joint counts performed by experienced physicians are considered crucial in the quantitative assessment of synovitis and are included in the disease activity score (DAS) 28 for the measurement of RA activity [2]. However, joint counts are limited by an inherent lack of objectivity related to both operator and patient factors [3,4], which increases the need for more sensitive and reproducible tools to detect synovitis.
Although imaging modalities such as ultrasound (US) and magnetic resonance imaging (MRI) are more sensitive than clinical assessment for detecting joint inflammation [5][6][7], it is difficult to assess systemic joint status in patients with RA with these tools [8][9][10][11]. Another limitation is the time-consuming nature of these imaging procedures. Recently, fluorine-18-fluorodeoxyglucose (FDG) positron emission tomography (PET)/computed tomography (CT) imaging has provided important insights into the evaluation of disease activity in patients with RA. The FDG PET/CT-derived joint count assessment is a highly reproducible and sensitive tool that complements clinical evaluation [12]. However, FDG PET/CT examinations have certain limitations, including high levels of radiation exposure, the need for expensive core facilities, and high costs [13][14][15].
Bone scintigraphy (BS), which has long been used in clinical settings for the assessment of inflamed joint distribution, has several advantages for evaluating systemic joints over US, MRI, and FDG PET/CT. BS provides whole-body joint imaging with much less radiation exposure compared with FDG PET/CT [13] and is a potential tool for quantitative assessment of disease activity in a more affordable and safer way.
However, no study has validated the usefulness of BS in the measurement of RA disease activity.
In this study, we aimed to validate the BS-derived quantitative parameters for RA disease activity by comparing BS-derived joint counts with PET-derived joint counts in 28 joints. First, BS-derived joint assessment was compared with PET-derived and clinical joint assessments. Subsequently, a DAS was developed using BS-positive joints and validated in an independent group.

Patients and study design
We enrolled 106 patients who had active joints and underwent BS evaluation at Kyungpook National University Hospital from December 2010 to February 2018. All patients were diagnosed with RA according to the 2010 American College of Rheumatology/European League Against Rheumatism criteria [16]. This study comprised two groups: a development group (n = 67), in which the DAS was derived using both BS and FDG PET/CT, and a validation group (n = 39), in which the DAS was applied. At the time of BS evaluation, we assessed clinical disease activity, including swollen joint count (SJC), tender joint count (TJC), patient global assessment (PGA), erythrocyte sedimentation rate (ESR), and C-reactive protein (CRP). Clinical joint counts were performed in each patient by the rheumatologists (J.S.E., J.W.K., and N.R.K.), and BS image analysis was performed by two nuclear medicine physicians (C.M.H. and I.C.). The nuclear medicine physicians were unaware of the clinical positive joint counts and disease activity of the patients. This study was approved by the Institutional Review Board at Kyungpook National University Hospital.

FDG PET/CT acquisition protocol and image analysis
The FDG PET/CT acquisition protocol has been described previously [12]. All patients fasted for more than 6 hours, and the blood glucose level of each patient before FDG administration was < 150 mg/dL. PET/CT images were obtained from the skull vertex to the feet with the patient in the supine position using a Reveal HiREZ 6-slice CT apparatus (CTI Molecular Imaging, Knoxville, TN, USA) 1 hour after the intravenous injection of FDG (~ 5 MBq/kg body weight). First, a low-dose CT scan without contrast enhancement was obtained for attenuation correction, and all images were reconstructed using a 3.75-mm slice thickness at 2.5-mm increments. Then a three-dimensional-mode PET scan with a maximum spatial resolution of 6.5 mm was performed for 3 minutes per bed position. The PET images were reconstructed with a 128 × 128 matrix. When the FDG uptake in the joint synovium was higher than the normal regional tracer accumulation, the joint was considered positive for active arthritis. The volume of interest (VOI) for a PET-positive joint was placed on the joint synovium in the PET images, and an isocontour VOI including all voxels > 42% of the maximum was created; subsequently, the SUVmax value was automatically calculated. The SUVmax was obtained using the following formula: maximum activity in the region of interest (MBq/ml) divided by [injected dose (MBq)/body weight (g)]. PET28 was defined as the number of PET-positive joints among the 28 joints. Two experienced nuclear medicine physicians interpreted the PET/CT images. To assess reliability, the interpretation was repeated 2 months later by one nuclear medicine physician (intra-observer) or performed independently by the two nuclear medicine physicians (inter-observer).
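The SUVmax formula and the 42% isocontour threshold described above can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the study; in practice these values are produced by the scanner software.

```python
def suv_max(max_activity_mbq_per_ml, injected_dose_mbq, body_weight_g):
    """SUVmax = maximum activity (MBq/ml) / (injected dose (MBq) / body weight (g))."""
    if injected_dose_mbq <= 0 or body_weight_g <= 0:
        raise ValueError("injected dose and body weight must be positive")
    return max_activity_mbq_per_ml / (injected_dose_mbq / body_weight_g)


def isocontour_voxels(voxel_values, fraction=0.42):
    """Keep the voxels whose value exceeds `fraction` of the maximum voxel value."""
    peak = max(voxel_values)
    return [v for v in voxel_values if v > fraction * peak]


# Hypothetical example: a 70-kg patient injected with 350 MBq (about 5 MBq/kg)
# and a maximum activity concentration of 0.015 MBq/ml in the joint VOI.
example_suv = suv_max(0.015, 350.0, 70_000.0)      # approximately 3.0
example_voi = isocontour_voxels([1.0, 2.0, 4.0, 5.0])  # voxels above 42% of 5.0
```

With these illustrative inputs, only the two voxels above 2.1 (42% of the maximum of 5.0) remain in the VOI.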
We previously developed a novel PET/DAS formula using PET/CT after conducting multivariate analyses including ESR and PGA in addition to PET-derived parameters [12].

Statistical analysis
The baseline clinical data were expressed as means ± SD for continuous variables or as numbers and percentages for categorical variables. To compare BS and PET/CT in terms of the detection of active joints, significant differences between variables were assessed using the chi-square test and the Mann-Whitney test. The correlations between the BS-derived parameters and other disease activity measures were calculated using Pearson's correlation test, with Bonferroni's correction. The intra-observer (one nuclear medicine physician, at a 2-month interval) and inter-observer (between the two nuclear medicine physicians or between a nuclear medicine physician and the rheumatologists) reliability of the 28-joint counts was calculated using Cohen's κ test and the intraclass correlation coefficient (ICC). A kappa value of 0-0.20 was considered poor, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 good, and 0.81-1.00 excellent [17,18]. ICCs between the BSS28 and PET28 or TJC28 in the development group were calculated using a two-way mixed-effects model and the Bland-Altman approach [19].
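As an illustration of the joint-level agreement statistic described above, the following Python sketch computes Cohen's kappa for two binary (positive/negative) ratings of the same joints and maps the result to the agreement categories used in this study. The function names and the example ratings are hypothetical; treating negative kappa values as "poor" is an assumption, since the scale above starts at 0.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of binary (0/1) ratings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal positive rate.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is complete
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Agreement category using the cut-offs stated in the text."""
    if kappa <= 0.20:
        return "poor"  # assumption: negative values are also labelled poor
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"


# Hypothetical ratings of ten joints by two methods (1 = positive, 0 = negative).
method_1 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
method_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
kappa = cohen_kappa(method_1, method_2)
category = interpret_kappa(kappa)
```

In this made-up example the two methods agree on 8 of 10 joints, chance agreement is 0.52, and the resulting kappa of about 0.58 falls in the moderate category.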
For the development of the DAS using BS, univariate and multivariate analyses were conducted using a linear regression model to evaluate the association among clinical factors, including BS-derived parameters, and disease activity measures in patients with RA. After the generation of BS/DAS, we calculated it for each patient in the validation group (n = 39). Pearson's correlation test was used to assess the correlation between BS/DAS and DAS28-ESR. P-values < 0.05 were considered significant.
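The validation step described above can be sketched in Python as follows. The BS/DAS coefficients are those reported in the abstract; the function names, the example patient values, and the hand-written Pearson correlation are illustrative assumptions only. The units of ESR and PGA are assumed to be those used for DAS28 in this study, which the text does not state explicitly.

```python
import math


def bs_das(bs_positive_joints_28, esr, pga):
    """BS/DAS = 0.056 x BS-positive joints (of 28) + 0.012 x ESR + 0.030 x PGA."""
    if not 0 <= bs_positive_joints_28 <= 28:
        raise ValueError("BS-positive joint count must be between 0 and 28")
    return 0.056 * bs_positive_joints_28 + 0.012 * esr + 0.030 * pga


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical patient: 10 BS-positive joints, ESR of 50, PGA of 60.
example_score = bs_das(10, 50, 60)  # 0.56 + 0.60 + 1.80

# Hypothetical validation data: (BS-positive joints, ESR, PGA) and DAS28-ESR.
patients = [(4, 20, 30), (10, 50, 60), (15, 70, 80), (8, 35, 50)]
das28_esr = [4.1, 6.0, 7.4, 5.2]
scores = [bs_das(*p) for p in patients]
r = pearson_r(scores, das28_esr)
```

The sketch only reproduces the arithmetic; the significance testing reported in the study was carried out in SPSS.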
All statistical analyses were performed using SPSS version 19 software (IBM, Chicago, IL, USA); GraphPad Prism version 5 (GraphPad, San Diego, CA, USA) was used to generate the graphics.

Baseline characteristics in the development and validation groups
We enrolled 106 patients with active RA in the development group (n = 67) and validation group (n = 39) who underwent BS, disease activity evaluation, and/or FDG PET/CT at the same time. The mean ages of the development and validation groups at the time of disease evaluation were 68 and 67 years, respectively. The proportion of women was similar between the two groups. Additionally, the mean DAS28-ESR values of the development and validation groups were 6.81 and 6.43, respectively, with all patients in both groups showing moderate to high disease activity. In the development and validation groups, 53 patients (79.1%) and 28 patients (71.8%), respectively, were naïve to disease-modifying antirheumatic drugs (DMARDs) (Table 1).

Table 1 footnote: Data are expressed as means ± SD for continuous variables or numbers and percentages for categorical variables. RF, rheumatoid factor; antiCCP, anti-cyclic citrullinated peptide; ESR, erythrocyte sedimentation rate; CRP, C-reactive protein; DAS, disease activity score; PGA, patient global assessment; DMARD, disease-modifying antirheumatic drug.

Correlations between BS-derived parameters and other disease activity measures in the development group
To compare the reliability of BS and PET/CT in the detection of active joints, the frequencies and percentages of PET-positive joints (Fig. 1A, B) and the mean SUVmax on PET/CT (Fig. 1C) were expressed according to the BS uptake score of each individual joint. A BS uptake score of 2 was significantly more sensitive than a BS uptake score of 1 in detecting a PET-positive joint among the 28 joints (Fig. 1A, B). Thus, we used a BS uptake score of 2 as the criterion for a BS-positive joint. At the time of BS and PET/CT evaluation, clinical disease activity was assessed using SJC28, TJC28, and DAS28-ESR. To investigate the correlation between BSS28 and clinical disease activity, the BSS28 was compared with the clinical parameters TJC28, SJC28, and DAS28-ESR. The BSS28 was significantly correlated with TJC28 (r = 0.483, p < 0.001), SJC28 (r = 0.409, p = 0.001), and DAS28-ESR (r = 0.457, p < 0.001) (Fig. 2A, B, and C). The BSS28 was also significantly correlated with PET28 (r = 0.643, p < 0.001) (Fig. 2D).

Reliability of joint counts between BSS28 and other disease activity measures in the development group
Kappa values between BS results and clinical assessments of the individual joints ranged from 0.033 to 0.457. These values indicated consistently fair to moderate agreement, except for the knee and shoulder joints. The reliability values between BSS28 and SJC28/TJC28 assessed by ICCs at the patient level in 28 joints were 0.585 (95% confidence interval: 0.324-0.745) and 0.646 (0.424-0.783), respectively (Table 2). The reliability between BS and PET/CT for joint counts ranged from 0.194 to 0.703 by kappa values at the individual joint level and was 0.782 (0.646-0.866) by ICC at the patient level in 28 joints. These reliability values were higher than those between BS results and clinical assessments (Table 2). The level of reliability of the BSS28 in relation to the PET28 and TJC28 was further illustrated by Bland-Altman plots. The mean differences between the BSS28 and PET28/TJC28 were 0.46 and −0.40, respectively. The majority of points (62 of 67 (92.5%) and 64 of 67 (95.5%), respectively) were within the upper and lower limits of 2 SD (Fig. 3A and B).
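The Bland-Altman summary reported above (mean difference and the share of points within the 2 SD limits) can be sketched in Python as follows. The function name and the example joint counts are hypothetical and do not reproduce the study data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b, n_sd=2.0):
    """Mean difference (bias), limits of agreement, and count of points within them."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    lower = bias - n_sd * sd
    upper = bias + n_sd * sd
    n_within = sum(lower <= d <= upper for d in diffs)
    return {"bias": bias, "lower": lower, "upper": upper,
            "n_within": n_within, "n": len(diffs)}


# Hypothetical 28-joint counts from two methods in five patients.
bss28_example = [5, 7, 9, 4, 10]
pet28_example = [4, 8, 9, 3, 8]
summary = bland_altman(bss28_example, pet28_example)
```

For these illustrative counts the paired differences are 1, −1, 0, 1, and 2, giving a mean difference of 0.6, and all five points lie within the 2 SD limits.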
When the intra-observer reliability of the nuclear medicine physician was evaluated, the kappa values at the individual joint level showed moderate to excellent agreement, and the ICC at the patient level showed excellent reliability (0.938, 0.840-0.976). Furthermore, the ICC of the inter-observer results (between the two nuclear medicine physicians) was good for the 28-joint counts (0.830, 0.560-0.935) (Supplementary Table 1).
Disease activity measures such as DAS28-ESR/CRP in the validation group were not significantly different from those in the development group (Table 1). The BS/DAS in the validation group was significantly correlated with DAS28-ESR (r = 0.806, p < 0.001) (Fig. 4). BS/DAS was also significantly correlated with DAS28-CRP, TJC28, and SJC28 (Supplementary Table 3).

Discussion
This study had two main results. First, the BS-derived joint assessment significantly correlated with clinical and PET/CT-derived joint counts, and its reliability was good for both clinical and PET/CT-derived findings. Second, we developed a disease activity formula, the BS/DAS, which is composed of the BSS28, ESR, and PGA. Additionally, the formula was confirmed in a validation group.
Our previous study showed that FDG PET/CT can serve as a sensitive and reproducible method for assessing disease activity in patients with RA [12]. Although the radiation dose is reduced with more advanced scanners, radiation exposure remains a major safety concern with this procedure [15]. In Korea, the average radiation doses of PET/CT and BS are 12.2 and 4.2 mSv, respectively, as estimated by a national survey [14,15]. Furthermore, the cost of a PET/CT examination is high and the procedure requires accompanying facilities, including tracer production, so PET/CT studies may not be possible in small to moderate-sized facilities. Therefore, the use of FDG PET/CT for evaluating disease activity in routine clinical practice remains challenging. By contrast, BS imaging for active joint counts involves much less radiation exposure than PET/CT imaging, while providing similarly reliable results in patients with RA. The correlation coefficient of the BS/DAS formula with DAS28-ESR in the validation group in this study is comparable to that of the PET/DAS formula in a previous study (r = 0.806, p < 0.001 vs r = 0.843, p < 0.001, respectively) [12].
BS is a highly sensitive nuclear imaging technique that uses a radiotracer to evaluate the distribution of active bone formation [20]. Solid tumors with high affinity for bone, metabolic bone diseases, and joint diseases such as chronic inflammatory arthritis and osteoarthritis (OA) are indications for BS evaluation [20]. In the field of rheumatology, BS has been used for the differential diagnosis of RA, OA, spondyloarthritis, and unclassified arthritis [21][22][23]. Additionally, our results show that the joint count by BS evaluation is a reproducible method for assessing bone changes associated with synovitis, with good reliability between observers; thus, BS can be used for measuring disease activity in patients with RA.
Although previous studies on disease activity assessment using BS in patients with RA are limited, two reports showed a significant correlation between regional uptake in large joints on BS and disease activity [24,25]. These studies did not evaluate 28 joints including small joints and did not compare the BS values with DAS28. According to an analysis of affected joints in a large cohort of patients with RA, tender joints were frequently observed in large joints, while swollen joints were frequently observed in the small joints of the hands [26]. Thus, evaluating large joints alone is not sufficient to represent disease activity accurately. Furthermore, the reliability of BS against clinical assessment for large joints such as the knee and shoulder was relatively lower than that for other joints in our study. Therefore, a joint count based on the BS values of 28-joint areas including both small and large joints should provide a more objective parameter for disease activity assessment. Because it is important to determine the cut-off value of the BS score for assessing synovitis in patients with RA, we compared affected individual joints between BS scores and PET/CT examination. A BS uptake score of 2 was significantly more reliable than a BS uptake score of 1 in detecting PET-positive joints among the 28 joints. Thus, we used a BS uptake score of 2 as the criterion for a BS-positive joint.
Despite the crucial role of detecting synovitis in RA disease activity measurement, clinical joint counts are not routinely performed in clinics because their reliability, in terms of both intra-observer and inter-observer variability, needs to be explored further [27]. The intra-observer reliability by ICC for the clinical assessment of joint counts by healthcare professionals ranged from 0.47 to 0.98 for both TJC and SJC [28], whereas the reliability by kappa value at the joint level varied from fair to good for SJC [29], suggesting inconsistent joint assessment in clinical practice. Furthermore, the range of inter-observer reliability assessed with ICCs and kappa values depended on the variation among study samples in finding a positive joint count (from 0.29 to 0.98 and from poor to excellent, respectively) [30,31]. By contrast, joint counts by BS evaluation are a reproducible method for assessing synovitis, with good inter-observer and excellent intra-observer reliability. Moreover, BS images show the whole-body pattern of joint involvement in synovial inflammation [20].
Surprisingly, the ICC between BSS28 and PET28 for the 28-joint counts was 0.782 (0.646-0.866). Furthermore, the ICC between BSS28 and TJC28 was comparable to that between PET28 and TJC28 (0.646 and 0.728, respectively) [12], implying good reliability between the BSS28 and clinical assessments performed by experienced clinicians. We also developed a novel BS/DAS formula that uses the BS-derived joint count, together with ESR and PGA, without the results of joint assessment performed by experienced clinicians. This formula was confirmed in an independent validation group of RA patients. The BS/DAS, which may overcome the variability of clinical evaluation by joint assessors with diverse backgrounds, can complement the use of the DAS28-ESR and may provide results similar to those of a more advanced modality such as PET/CT for the evaluation of disease activity.
There are two limitations in this study. First, because BS reflects bone remodeling, uptake in knee joints can be observed in patients with knee OA [21], regardless of RA disease activity. Second, patients were enrolled at a single center; thus, multicenter studies of BS validation are warranted to determine whether our findings are generalizable.

Conclusion
In conclusion, BS is a sensitive and reproducible method for the detection of active joints, and can complement the clinical assessment of disease activity in RA. Although more advanced imaging modalities such as PET/CT are available, BS may still be comparable to them for assessing disease activity in patients with RA when cost, radiation exposure, and sensitivity for evaluating active joints are considered. In the future, the incorporation of deep learning from BS images into computer-aided evaluation is promising for the assessment of disease activity in patients with RA.

Fig. 1. Comparison of positron emission tomography (PET) and bone scintigraphy (BS) in the detection of active joints. The frequencies (A), percentages (B), and mean SUVmax (C) of PET-positive joints are expressed according to BS scores in the affected individual joints among the 67 patients who underwent PET and BS. A total of 134 observations were recorded for each joint. In total, 12 knee observations were excluded from the analysis because of total knee replacement arthroplasty.