Difficult-to-treat rheumatoid arthritis with respect to responsiveness to newly used biologic/targeted synthetic disease-modifying anti-rheumatic drug: a retrospective cohort study

Background: Difficult-to-treat rheumatoid arthritis (dt-RA) is an emerging concept, the definition of which has been proposed based on global consensus. This study aimed to establish an evidence-based definition of dt-RA with respect to responsiveness to newly used biologic and targeted synthetic disease-modifying anti-rheumatic drugs (b/tsDMARDs).
Methods: A retrospective cohort study was conducted using the FIRST registry. An inadequate response to current b/tsDMARDs was defined as clinical disease activity index (CDAI) > 10 at week 22 or termination of treatment within 22 weeks due to insufficient efficacy. Cut-off values were defined according to the number of past failures to DMARDs and the dose of glucocorticoid. Responsiveness to newly used b/tsDMARDs was compared between patients above and below each cut-off value. Hazards of treatment cessation within 22 weeks due to adverse events were also compared using the same thresholds.
Results: The cut-off values associated with significant differences in responsiveness to b/tsDMARD treatment were ≥ 2 failures to conventional synthetic DMARD (csDMARD) treatment and ≥ 4 failures to b/tsDMARD treatment. Three or more failures to csDMARDs and concomitant use of glucocorticoid were significantly correlated with an increased hazard ratio of infection. Further analysis using clinical variables revealed that refractoriness to ≥ 2 previous csDMARDs was weakly associated with less improvement in ESR titre, while refractoriness to ≥ 4 previous b/tsDMARDs was associated with less improvement in HAQ. For both cut-offs, a significant but weak association with GH was also observed.
Conclusions: We propose cut-off values of ≥ 2 failures to csDMARDs and/or ≥ 4 failures to b/tsDMARDs to define dt-RA with respect to responsiveness to b/tsDMARDs.


Introduction
The development of biologic and targeted synthetic disease-modifying anti-rheumatic drugs (b/tsDMARDs) has dramatically improved the prognosis of rheumatoid arthritis (RA) patients refractory to conventional synthetic DMARDs (csDMARDs). Global evidence indicates that 30-60% of RA patients refractory to their first DMARD can achieve clinical remission following treatment with additional bDMARDs, and structural remission can be achieved in approximately 60-90% of patients treated with tumour necrosis factor inhibitors (TNFis) and methotrexate (MTX) [1]. Even so, the disease remains refractory to treatment in 20-75% of patients on their first bDMARD [2][3][4] and in 40-55% of patients on their second bDMARD [5][6][7][8]. With increasing treatment options, b/tsDMARD-refractory RA is becoming one of the most challenging areas in rheumatology.
Difficult-to-treat rheumatoid arthritis (dt-RA) is an emerging concept, defined as persistence of signs and/or symptoms suggestive of inflammatory RA disease activity despite prior treatment [9]. To date, the most common characteristics of dt-RA include persistent disease activity (disease activity score assessing 28 joints using erythrocyte sedimentation rate [DAS28-ESR] > 3.2), failure of at least 2 csDMARDs and 2 b/tsDMARDs [9,10], and/or failure to taper the glucocorticoid dose to < 5-10 mg prednisone or equivalent daily for more than 1 year [10]. Thus far, selection of these cut-off values has been based on expert consensus, and clinical evidence for this definition is insufficient. Notably, whether patients refractory to multiple b/tsDMARDs are also refractory to a further treatment has not yet been well studied. For a better understanding of the definition of dt-RA, cohort studies of patients treated with multiple b/tsDMARDs are essential.
The University of Occupational and Environmental Health in Fukuoka, Japan, has established a cohort of RA patients who initiated treatment with b/tsDMARDs. Using this cohort, this study aims to assess clinical outcomes of patients who are refractory to csDMARDs and/or b/tsDMARDs regarding responsiveness to a subsequent b/tsDMARD, and to propose an evidence-based definition of dt-RA.

Study setting
The FIRST registry is a multi-institutional cohort of RA patients treated with b/tsDMARDs established by the University of Occupational and Environmental Health and its affiliated hospitals. The registry has accumulated data from patients who started b/tsDMARDs since the first agent was approved in Japan in 2003.

Patient selection and data collection
Patients whose precise information regarding past use of csDMARDs and b/tsDMARDs was available were included. Collected data included demographics, disease characteristics, measures of disease activity, present and past treatment at the start of treatment, and disease activity data 22 weeks after treatment. If a treatment was discontinued within 22 weeks due to infection, other adverse events, or lack of response to treatment, the date of and reason for discontinuation were also collected.
The inadequate response to current b/tsDMARD (current b/tsDMARD-IR) group consisted of patients with moderate to high disease activity (clinical disease activity index [CDAI] > 10) at week 22 and those who stopped treatment within 22 weeks due to insufficient efficacy.
The control group included patients who achieved remission or low disease activity (CDAI ≤ 10) at week 22 and those who stopped treatment within 22 weeks due to remission. We used CDAI instead of DAS28-ESR for assessment of disease activity because CRP and ESR titres would be more strongly affected by tocilizumab (TCZ) than by other b/tsDMARDs.
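The group definitions above can be written as a small decision rule. The following is a minimal sketch, not the registry's actual code: the function name `classify_response` and the discontinuation labels are hypothetical, and only the CDAI threshold of 10 and the two relevant discontinuation reasons come from the text.

```python
from typing import Optional

CDAI_CUTOFF = 10.0  # the text treats CDAI > 10 as moderate to high disease activity


def classify_response(cdai_week22: Optional[float],
                      stop_reason: Optional[str]) -> Optional[str]:
    """Return 'IR', 'control', or None when the episode fits neither group.

    stop_reason is a hypothetical label for discontinuation within 22 weeks:
    'insufficient_efficacy', 'remission', 'infection', 'other_adverse_event',
    'other', or None if treatment continued to week 22.
    """
    if stop_reason == "insufficient_efficacy":
        return "IR"
    if stop_reason == "remission":
        return "control"
    if stop_reason is not None:
        return None  # stopped for another reason: assigned to neither group
    if cdai_week22 is None:
        return None  # continued treatment, but CDAI at week 22 is missing
    return "IR" if cdai_week22 > CDAI_CUTOFF else "control"
```

Note that a CDAI of exactly 10 falls in the control group, because the inadequate-response criterion is a strict inequality.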

Statistical analysis
For continuous variables, the normality of distribution of the data was assessed using the Shapiro-Wilk test. If the normality of distribution of a variable was rejected, the variable was converted into a categorical variable for further analysis. For categorical variables, differences between groups were assessed using the chi-squared test.
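For the simplest case of a 2×2 table, the chi-squared comparison can be sketched with the standard library alone. This is an illustrative sketch under the assumption of a Pearson test without continuity correction; the source does not state which variant or software was used, and the function name `chi_squared_2x2` is hypothetical.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) with 1 degree of freedom.
    All row and column totals must be non-zero.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with df = 1
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A call such as `chi_squared_2x2(20, 10, 10, 20)` compares a binary characteristic between two groups of 30; tables with more categories need the general form of the test.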

Assessment of cut-offs for the components of the dt-RA definition
Components of the definition of dt-RA were categorised into three groups based on previous reports [9, 10]: past failure to csDMARDs, past failure to b/tsDMARDs, and current use of glucocorticoids. For past failure to DMARDs, four cut-off values were defined according to the number of failures: ≥ 1, ≥ 2, ≥ 3, and ≥ 4 failures. For current use of glucocorticoids, five cut-off values were defined according to prednisolone equivalent dose of glucocorticoid: > 0 mg/day, ≥ 3 mg/day, ≥ 5 mg/day, ≥ 7.5 mg/day, and ≥ 10 mg/day.
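Each cut-off turns a count or a dose into a binary above-versus-below indicator. A minimal sketch of that step follows; the function names and the dictionary keys are hypothetical, while the thresholds are those listed above.

```python
FAILURE_CUTOFFS = (1, 2, 3, 4)        # ">= k failures", for csDMARDs and b/tsDMARDs alike
GC_CUTOFFS = (3.0, 5.0, 7.5, 10.0)    # ">= d mg/day", prednisolone equivalent


def failure_flags(n_failures: int) -> dict:
    """Binary indicators for the four failure-count cut-offs."""
    return {f">={k}": n_failures >= k for k in FAILURE_CUTOFFS}


def glucocorticoid_flags(dose_mg_per_day: float) -> dict:
    """Binary indicators for the five glucocorticoid-dose cut-offs."""
    flags = {">0": dose_mg_per_day > 0}   # any current glucocorticoid use
    for d in GC_CUTOFFS:
        flags[f">={d:g}"] = dose_mg_per_day >= d
    return flags
```

Each indicator would then enter its own regression model as the exposure of interest.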
To determine differences between patients in the b/tsDMARD-IR and control groups, mixed-effect logistic regression models were used to compare patients above versus below each cut-off value. The mixed-effect regression model was fitted with age, gender, body mass index [...] In studies with very few outcomes, multivariate logistic regression analysis tends to overfit the data, resulting in biased estimation. Therefore, an inverse-probability score-based weighted (IPW) method was performed, using a logistic regression model to adjust for potential confounders. Variables that showed significant correlation with numbers of past failures to DMARDs were included as covariates.
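The idea behind inverse-probability weighting can be illustrated with a deliberately simplified sketch. The study estimated the probability of being above a cut-off with a logistic regression model; the sketch below swaps in observed frequencies within a single categorical stratum (for example, a disease-duration band) so that it runs with the standard library only. The function name, the record layout, and the use of a weighted risk difference as the "average effect" are all assumptions made for illustration.

```python
from collections import defaultdict


def ipw_average_effect(records):
    """records: list of (stratum, exposed, outcome), with exposed/outcome in {0, 1}.

    exposed = 1 if the patient is above the cut-off, outcome = 1 for
    inadequate response. Returns the weighted risk difference
    (exposed minus unexposed). Both arms must contain at least one record.
    """
    n = defaultdict(int)
    n_exposed = defaultdict(int)
    for stratum, exposed, _ in records:
        n[stratum] += 1
        n_exposed[stratum] += exposed

    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for stratum, exposed, outcome in records:
        p = n_exposed[stratum] / n[stratum]          # estimated probability of exposure
        prob_of_observed = p if exposed else 1.0 - p
        w = 1.0 / prob_of_observed                   # non-zero: this arm occurs in the stratum
        num[exposed] += w * outcome
        den[exposed] += w
    return num[1] / den[1] - num[0] / den[0]
```

When exposure is more common in a stratum with a worse prognosis, the crude comparison overstates the effect; weighting each patient by the inverse of the probability of the exposure actually received rebalances the strata before the two arms are compared.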

Analyses of clinical indicators associated with refractoriness
Clinical indicators are expected to improve during the 22-week observation period after treatment initiation. Therefore, changes in clinical indicators from week 0 to week 22 (∆values) can be used as surrogates for treatment effectiveness. To compare treatment effectiveness between two groups, the ∆value was calculated by subtracting the value at week 22 from that at week 0, so for these indicators a negative ∆value indicates that the treatment was less effective.
The ∆values were used in mixed-effect regression analysis to assess correlations between cut-off values and these clinical indicator substitutes. The same covariates as those used in the mixed-effect logistic regression analysis described above were used.
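The sign convention of the ∆value is easy to get backwards, so a short sketch may help. The function names are hypothetical; the definition (week 0 minus week 22) is the one given in the text.

```python
from statistics import mean


def delta_value(week0: float, week22: float) -> float:
    """Change as defined in the text: value at week 0 minus value at week 22."""
    return week0 - week22


def mean_delta(pairs) -> float:
    """Mean ∆value over (week0, week22) pairs for one group of patients."""
    return mean(delta_value(w0, w22) for w0, w22 in pairs)
```

For an indicator where lower values are better, such as CDAI, a fall from 30 to 12 gives a ∆value of 18 (improvement), whereas a rise from 10 to 15 gives −5 (worsening).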

Assessment of adverse event incidence
To compare hazards of adverse events with respect to above-versus below-cut-off values, a Cox proportional hazards regression model was used. In addition, Nelson-Aalen cumulative hazard estimation was shown graphically. For all methods, a p-value < 0.05 was considered statistically significant.
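The Nelson-Aalen estimate sums, over the observed event times, the number of events divided by the number of patients still at risk. The following is a minimal standard-library sketch of that estimator; the function name and input layout are assumptions, and it is not the software used in the study.

```python
from collections import Counter


def nelson_aalen(durations, events):
    """Return [(time, cumulative_hazard)] at each distinct event time.

    durations: list of follow-up times per patient (e.g. days up to week 22).
    events:    list of 1 if treatment stopped for the adverse event at that
               time, 0 if the patient was censored.
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)        # events plus censorings at each time
    at_risk = len(durations)
    cum_hazard, curve = 0.0, []
    for t in sorted(leaving):
        if event_counts[t]:
            cum_hazard += event_counts[t] / at_risk
            curve.append((t, cum_hazard))
        at_risk -= leaving[t]           # patients with time t leave the risk set
    return curve
```

Computing the curve separately for patients above and below a cut-off gives the two lines that such a plot would compare; the hazard ratio itself comes from the Cox model, which this sketch does not implement.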

Baseline factors affecting the current b/tsDMARDs-IR group at week 22
Among 3,535 registered patients, 905 had missing data regarding past use of csDMARDs and b/tsDMARDs. Treatment was discontinued in 388 patients, 24 of whom stopped treatment due to infection, 91 due to other adverse events, 3 due to remission, 103 due to lack of response, and 167 due to other reasons such as economic reasons. Among the remaining 2,242 patients, CDAI data at week 22 were missing for 525 patients. Of the remaining 1,717 patients, 1,231 achieved remission or low disease activity (CDAI ≤ 10) and 486 showed moderate to high disease activity (CDAI > 10) at week 22. In total, 589 patients were classified into the current b/tsDMARDs-IR group and 1,234 into the control group (Figure 1).
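The patient flow can be checked with simple arithmetic. The sketch below uses only the counts reported in the paragraph above; the variable names are hypothetical.

```python
registered = 3535
missing_history = 905
discontinued = {
    "infection": 24,
    "other_adverse_events": 91,
    "remission": 3,
    "lack_of_response": 103,
    "other_reasons": 167,
}
missing_cdai_week22 = 525
low_activity, high_activity = 1231, 486   # CDAI <= 10 and CDAI > 10 at week 22

# Patients still on treatment at week 22, then those with a CDAI measurement
continued = registered - missing_history - sum(discontinued.values())
assessed = continued - missing_cdai_week22

# Group sizes combine week-22 status with the relevant discontinuation reason
ir_group = high_activity + discontinued["lack_of_response"]
control_group = low_activity + discontinued["remission"]

print(continued, assessed, ir_group, control_group)
```

The inadequate-response group therefore combines the 486 patients with CDAI > 10 and the 103 who stopped for lack of response, and the control group combines the 1,231 patients with CDAI ≤ 10 and the 3 who stopped due to remission.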
The backgrounds of patients in the current b/tsDMARDs-IR and control groups are shown in Table 1. As Shapiro-Wilk testing rejected a normal distribution for all of the continuous variables listed in Table 1, the two groups were compared using chi-squared testing with categorical variables. Patients in the current b/tsDMARDs-IR group had a longer disease duration, higher disease activity, higher CRP titres, a higher rate of anti-CCP antibody positivity, and lower glucocorticoid and MTX doses at week 0 compared with patients in the control group. Types of newly used b/tsDMARDs also differed between groups. However, no significant difference in previously used b/tsDMARD types was observed.

Cut-offs for the current b/tsDMARDs-IR group at week 22
Next, mixed-effect logistic regression analyses were performed for each cut-off value. The cut-off values associated with a significant difference in responsiveness to b/tsDMARD treatment were ≥ 2 failures to csDMARD treatment and ≥ 4 failures to b/tsDMARD treatment (Table 2).
Another definition of dt-RA that has gained global consensus is the inability to reduce glucocorticoid use. Due to limited available data, we used glucocorticoid dose at week 0 (mg/day, prednisolone equivalent) as a substitute for previous use of glucocorticoids. No significant correlation between refractoriness to a b/tsDMARD and use of glucocorticoids was observed (Table 2).
As the number of patients who experienced ≥ 4 failures to b/tsDMARD treatment was small (N = 48), this difference might have been caused by confounders such as disease duration. Indeed, logistic regression revealed significant differences in gender, age, BMI, disease duration, CDAI and ESR titres at week 0, anti-CCP antibody positivity, glucocorticoid dose, and MTX dose between patients with ≥ 4 failures to b/tsDMARDs and those with ≤ 3 failures. Therefore, adjustment for these confounders was performed using IPW, and the average effect of each cut-off value was calculated. Even after this adjustment, ≥ 2 failures to csDMARDs and ≥ 4 failures to b/tsDMARDs appeared to be significant cut-offs. These data indicate that refractoriness to ≥ 2 previous csDMARDs or ≥ 4 previous b/tsDMARDs could be defined as cut-off values for dt-RA regarding responsiveness to another b/tsDMARD (Table 2).

Clinical and laboratory findings associated with the identified cut-off values for refractoriness to previous treatments
To determine the major factors affected by the identified cut-offs, mixed-effect multiple regression analysis and IPW were conducted using ∆values of the variables associated with disease activity. The factors affected by each cut-off appeared to differ. Refractoriness to ≥ 2 previous csDMARDs was significantly associated with less improvement in ESR titre, while refractoriness to ≥ 4 previous b/tsDMARDs was associated with less improvement in HAQ, although the latter effect was not observed when IPW was applied. For both cut-offs, a significant association with GH was also observed, but again, the effect was not significant when the IPW method was applied (Table 3).

Cut-offs for adverse events
Treatment failure due to adverse events is often included in the definition of dt-RA [10]. To determine whether numbers of treatment failures and glucocorticoid dose affect the probability of adverse events, hazards of adverse events were compared above versus below cut-offs.
In total, 24 cases (343,247 patient-days) of infection and 91 cases (465,938 patient-days) of other adverse events that led to treatment cessation within 22 weeks were identified.
To avoid overfitting, no adjustment was made for Cox proportional hazards analysis. The analyses revealed that ≥ 3 failures to csDMARDs and use of glucocorticoids were significantly correlated with an increased hazard ratio (HR) of infection that led to treatment cessation (Figure 2). No significant association between the HR of adverse events within 22 weeks and past treatment failures was observed (Figure 3).

Discussion
This is the first study to assess the definition of dt-RA with respect to effectiveness and tolerance to the next b/tsDMARD. Based on the results of this study, we propose the definition of dt-RA to be failure to ≥ 2 csDMARDs and/or ≥ 4 b/tsDMARDs, because these cut-offs predict inadequate response to the next b/tsDMARD.

Refractoriness to b/tsDMARDs is caused by a variety of mechanisms. One of these, incorrect targeting, is supported by studies that compared non-TNF-targeted bDMARDs and TNFis as second bDMARDs for patients with insufficient response to a first TNFi [11,12]. For patients who received treatment with incorrect targeting, the probability of effectiveness of the next b/tsDMARD does not differ from that of the first treatment. This may explain the present result that failure of ≥ 1, ≥ 2, or ≥ 3 b/tsDMARDs was not associated with a significant difference in the efficacy of the next b/tsDMARD treatment (Table 2).
In contrast, multi-b/tsDMARD refractoriness may occur via different mechanisms. One possible explanation is induction of anti-drug antibodies (ADAbs) [15], which is more commonly observed among patients treated with TNFis than with other agents [13]. Indeed, patients who previously developed ADAbs against a TNFi are reported to be more likely to develop additional ADAbs with subsequent TNFi treatment [14,15]. The presence of ADAbs is associated with drug safety and tolerability as well as refractoriness [15], but no associations between cut-off values and the frequency of adverse events were observed in the present study (Figure 3). Therefore, further analyses are required to study the characteristics of patients with ≥ 4 failures to b/tsDMARDs.
Another reason for inadequate response to b/tsDMARD could be "false refractoriness," which is characterised by persistent symptoms despite lack of inflammation [16]. One cause of false refractoriness is increased comorbidity burden, which has been reported to lower response rate and retention rate of bDMARDs [17]. The present finding that treatment failure to ≥ 4 b/tsDMARDs was not significantly correlated with markers of inflammation, such as CRP and ESR titres, but was associated with HAQ suggests that resistance to multiple b/tsDMARDs might be caused by increased comorbidity burden. However, GH was also weakly correlated with refractoriness in a mixed logistic model, so the mechanisms underlying refractoriness remain to be elucidated. In contrast, treatment failure to ≥ 2 csDMARDs was significantly correlated with ESR titre, which cannot be explained by the "false refractoriness" concept. Therefore, the mechanism of refractoriness observed in csDMARD treatment failure might be different from that in b/tsDMARD failure.
A previous study showed that two or more failures to csDMARDs were correlated with comorbidity burden in RA patients [18]. As some comorbidities, such as interstitial lung disease, also increase the risk of infection, the correlation between ≥ 3 failures to csDMARDs and a higher hazard of infection (Figure 2) may be due to the comorbidities in patients who have experienced multiple failures to csDMARDs.
In an international survey, many rheumatologists mentioned characteristics other than joint symptoms as factors contributing to dt-RA, including extra-articular manifestations, comorbidities, side effects, and treatment non-adherence. The current study did not show a significant difference in the hazards of such events. However, based on the appearance of the cumulative hazard estimate graphs (Figures 2 and 3), this absence of a significant association might be due to the sample size being too small.
A global consensus about the definition of dt-RA regarding use of glucocorticoids is failure to taper glucocorticoids to < 5-10 mg prednisone or equivalent daily [10]. The present study showed that treatment with glucocorticoids was not associated with responsiveness to b/tsDMARDs but rather with hazard of severe infection that leads to treatment cessation. This result underlines the importance of using minimum doses of glucocorticoids, while considering the benefit of their use [19-21].
Our study is limited primarily by its inherently retrospective nature. In particular, there are several limitations related to data collection. Firstly, this registry included several treatment episodes from the same patients who received different agents, because the number of responders would have been too small if all duplicates had been excluded. Secondly, 905 patients (25%) had missing data about past usage of csDMARDs. This is mainly because many patients with long disease duration did not remember their past use of csDMARDs. Therefore, patients with longer disease duration might have been more likely to be excluded, which may have caused selection bias. Thirdly, data on comorbidities, such as chronic kidney disease and interstitial pneumonia, were not included, which could confound the outcomes.
Another limitation is that the number of patients varied by agent type. Although the mixed-effect model was applied to adjust for this difference, such methodology may not fully adjust for all confounders related to choices of treatments. In addition, the number of patients included in certain categories, such as ≥ 4 failures to b/tsDMARDs, was very small. Therefore, based on the Cox regression model illustrated in Figures 2 and 3, we cannot determine whether the absence of significant statistical difference indicates no difference or whether the sample size was too small to show a difference.
Finally, almost all the patients included in this study are Japanese with Asian ethnicity and a relatively small body size, among whom the risks of adverse events may be different from people of different ethnic or demographic backgrounds. Nevertheless, these limitations may not substantially impact the cut-off values we have proposed.
In conclusion, this study assessed cut-off values to be used to define dt-RA with regard to responsiveness to the next b/tsDMARD. Our results suggest that cut-offs of ≥ 2 failures to csDMARDs and/or ≥ 4 b/tsDMARDs are useful to predict dt-RA status.

Ethical approval and consent to participate
The institutional review board of the University of Occupational and Environmental Health approved the study (approval code: 04-23). Informed consent was obtained from all participants of the FIRST study.

Availability of data and materials
The datasets generated and/or analysed during the current study are available in the FIRST registry of the University of Occupational and Environmental Health. The datasets are not publicly available due to our privacy policy, but are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Table 2. Odds ratios and average effects of inadequate response to current b/tsDMARD treatment (current b/tsDMARD-IR) were calculated using mixed-effect logistic regression and inverse-probability score-based weighted methods (IPW), respectively.
Table 3. The ∆value (value at week 0 − value at week 22) was compared between above- and below-cut-off values for variables associated with disease activity using mixed-effect regression and inverse-probability score-based weighted methods (IPW), respectively.
Figure 3. Cumulative hazard of treatment cessation due to other adverse events. Cumulative hazards of treatment cessation due to adverse events other than infections within 22 weeks (91 cases, 465,938 patient-days) are shown by cut-off values.