Incidences and risk factors of serious infections among different rheumatic patients: a retrospective cohort study

Background To compare the incidence and risk factors of serious infections among patients with seven common rheumatic diseases, including rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), polymyalgia rheumatica (PMR), Sjögren's syndrome (SS), systemic sclerosis (SSc), systemic vasculitis (VA), and other diffuse connective tissue diseases (oCTD). In a retrospective cohort study using large Electronic Health Records (EHR) data, two-year infection rates for the different rheumatic diseases were calculated, and differences in risk factors were analyzed using a multivariable Cox model.


Dataset
We conducted a retrospective study of patients who visited between January 2000 and 2017 using Electronic Health Records (EHRs) from the Cerner Health Facts® database. Cerner Health Facts® (version 2017) is a de-identified EHR database that covers over 600 hospitals and clinics in the United States, represents over 68 million unique patients, and includes longitudinal data from 2000-2017. The database consists of patient-level data including demographics, encounters, diagnoses, procedures, lab results, medication orders, medication administration, vital signs, microbiology, surgical cases, other clinical observations, and health systems attributes. The study was approved by the Institutional Review Board (IRB) of UTHealth. The IRB waived the requirement for informed consent.

Cohort Identification
The study population included individuals who were aged 18 years or older with at least 6 months of continuous enrollment in the database. The rheumatic diseases discussed in this study were all identified by International Classification of Diseases, Clinical Modification, 9th or 10th revision (ICD-9 and ICD-10) codes. Details for all the ICD codes used in this study are listed in the Supplements.
The index date (cohort entry) was defined as the date of the first prescription of a disease-modifying antirheumatic drug (DMARD), including glucocorticoids (GCs), in a patient with a diagnosis of at least one rheumatic disease. Patients who never used any of these drugs during the study period were excluded, since they might represent asymptomatic or mild cases that were not representative of the more severe cases for whom these analyses were intended.
The event assessed was the first serious infection after the index date. A two-year time window after the index date was defined as the follow-up observation period. Serious infections were defined as infections requiring either hospitalization or treatment with intravenous (IV) antibiotics. Patients were excluded if the first infection occurred within 30 days after the index date. To select the population, we identified patients who experienced a hospitalization with an infection discharge diagnosis in any position (primary or non-primary) on the hospital claim using ICD-9 and ICD-10 diagnosis codes. Follow-up extended from the entry date until the earliest of the following dates (the end date): 1) the first serious infection, or 2) the last visit within 24 months. Each patient was then tagged as Infection or Not according to whether a serious infection occurred during the follow-up period. We reviewed records from the 12 months before the index date to gather baseline information on each patient's comorbidities, including ischemic heart disease, cerebrovascular disease (CVD), chronic obstructive pulmonary disease (COPD), diabetes mellitus, and renal dysfunction (Table 1). Patients were excluded if they had a history of malignancy during the baseline period. Figure 1 shows the pipeline of cohort selection.
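The follow-up rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's actual pipeline: the function name `follow_up`, the 730-day approximation of the two-year window, the inclusive treatment of the 30-day exclusion, and the fallback to the index date when no visit is recorded are all assumptions made for the example.

```python
from datetime import date, timedelta

WINDOW_DAYS = 730  # assumption: "two years" approximated as 730 days
EARLY_DAYS = 30    # assumption: "within 30 days" treated as inclusive


def follow_up(index_date, visit_dates, infection_dates):
    """Return (end_date, had_infection), or None if the patient is excluded.

    Hypothetical helper: dates are datetime.date objects and infection_dates
    holds the dates of serious infections only.
    """
    window_end = index_date + timedelta(days=WINDOW_DAYS)
    infections = sorted(d for d in infection_dates if d >= index_date)
    first = infections[0] if infections else None

    # Exclude patients whose first infection falls in the first 30 days.
    if first is not None and (first - index_date).days <= EARLY_DAYS:
        return None

    # Event: first serious infection inside the follow-up window.
    if first is not None and first <= window_end:
        return first, True

    # Otherwise censor at the last visit inside the window.
    visits = [v for v in visit_dates if index_date <= v <= window_end]
    end_date = max(visits) if visits else index_date
    return end_date, False
```

For instance, a patient with an index date of 1 January 2010 and a first serious infection on 1 September 2010 would be tagged as Infection with an end date of 1 September 2010, while a patient infected on 15 January 2010 would be excluded.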

Variables
We defined patients' demographic characteristics, comorbidities, and drug usage (including biologic DMARDs, nonbiologic DMARDs, and GCs) as the basic variables for statistical analysis. We also added whether a patient had an infection before the index date (during a 6-month window before the index date) as an important indicator, since patients with a history of infection are more likely to be infected again [19]. Since only a small proportion of patients had BMI records, we excluded BMI from the primary analysis but examined its inclusion in a sensitivity analysis.
Age was defined as the age in years on the index date and further categorized into three groups according to the World Health Organization (WHO) taxonomy (Adult: 18 ≤ age ≤ 44; Middle-aged: 45 ≤ age ≤ 64; Aged: age ≥ 65). To address inconsistent records of sex and race across a patient's encounters, these variables were defined using the mode (the most frequently recorded value) across all available records. For example, if there are three records for patient A and two of them indicate that A's sex is male but one indicates female, we coded A as male. Race was further categorized into Caucasian, African American, Native American, Hispanic, Asian, and Other according to their proportions in the cohort. A patient's BMI was extracted as the average of all BMI values recorded in the follow-up observation period. Included comorbidities and medication usage were defined in Supplement Table 3 according to ICD codes or generic drug names and represented as binary variables. To facilitate the survival analysis, we defined the disease duration (the "survival" time) as the number of days between the index date and the end date (defined above).
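The age grouping and the mode-based resolution of inconsistent sex and race records can be illustrated with a short Python sketch. The function names `age_group` and `mode_value` are hypothetical, and the handling of ties and missing values is an assumption, since the text does not state how those cases were resolved.

```python
from collections import Counter


def age_group(age):
    """Map age in years on the index date to the three groups used here."""
    if age < 18:
        raise ValueError("the cohort only includes patients aged 18 or older")
    if age <= 44:
        return "Adult"
    if age <= 64:
        return "Middle-aged"
    return "Aged"


def mode_value(records):
    """Return the most frequently recorded value across a patient's encounters.

    Assumptions: missing values (None) are ignored, and ties are broken in
    favor of the value that was recorded first.
    """
    counts = Counter(r for r in records if r is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

With the example from the text, `mode_value(["male", "male", "female"])` resolves patient A's sex to male.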

Statistical Analysis
Fisher's exact test was applied in the univariate analyses to explore the distribution of infections across subtypes of rheumatic diseases, including RA, SLE, PMR, SS, SSc, VA, and oCTD. Statistical significance was defined as a p-value of less than 0.05. The crude incidence rate and the cumulative incidence curve were calculated for the overall cohort as well as each sub-cohort. The crude incidence rate was calculated as the total number of new cases that developed over the follow-up time divided by the total number of patients at risk at baseline. The cumulative incidence curve was computed using the Kaplan-Meier estimator, a non-parametric method that estimates the conditional probability of survival (or risk) at each specific time interval. In the multivariable analyses, Cox proportional hazards models were fitted to each individual cohort to evaluate the risk of serious infection adjusted for age, race, sex, and other predictors.
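As an illustration of the quantities described above, the following Python sketch computes a crude incidence (cases divided by patients at risk at baseline), a rate per 100 patient-years (the unit used later in the Discussion), and a Kaplan-Meier cumulative incidence curve. The study itself used R 3.4.1; this is a plain-Python sketch with hypothetical function names, and the 365.25 days-per-year conversion is an assumption.

```python
def crude_incidence(events):
    """New cases divided by the number of patients at risk at baseline."""
    return sum(events) / len(events)


def rate_per_100_patient_years(durations_days, events):
    """Events per 100 patient-years (assumes 365.25 days per year)."""
    patient_years = sum(durations_days) / 365.25
    return 100.0 * sum(events) / patient_years


def km_cumulative_incidence(durations_days, events):
    """Kaplan-Meier estimate, returned as [(time, 1 - S(time))] at event times.

    events[i] is 1 if patient i had a serious infection at durations_days[i]
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(durations_days, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, 1.0 - survival))
        n_at_risk -= n_leaving
    return curve
```

Censored patients lower the number at risk without changing the survival estimate, which is one reason a Kaplan-Meier cumulative incidence can differ from the crude incidence computed on the same cohort.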
Sensitivity analyses were further conducted using the Cox proportional hazards model on a sub-cohort with recorded BMI values, as well as by separating different types of nonbiologic DMARDs. The analyses were performed using R 3.4.1.
Figure 2 compares the cumulative incidences across the seven rheumatic diseases, and significant differences were observed in their incidence of infection (p < 0.001). We observed distinct distributions of the infection rate across these cohorts. Consistent with Table 2, patients with SSc, VA, and SLE demonstrated the highest overall cumulative incidences compared to other groups. In terms of growth trend, patients with VA, SSc, and SLE seemed to develop infections more rapidly than those with PMR, RA, oCTD, and SS. SS tended to have a lower overall incidence of infection. Table 3 shows the subgroup analyses on sub-cohorts of rheumatic diseases using the Cox proportional hazards model with the same set of covariates. Of note, several variables, such as GCs, COPD, congestive heart failure, chronic anemia, and infection before index, showed a similar level of association with increased risk of infection across all seven types of diseases as well as the overall population. However, we observed variations in the risks of some variables across sub-cohorts. For example, osteoporosis was associated with increased risk (HR > 1) among SLE, SS, and SSc but decreased risk (HR < 1) among the others. In addition, the associations of some predictors fluctuated heavily across diseases; e.g., the use of biological DMARDs had an HR of 1.32 in VA but only 0.21 in PMR. Patients with some specific diseases, such as VA and SS, tended to be more vulnerable to infection, as more factors were associated with increased risk than in other diseases (according to the number of bold values). The risks for most racial categories were not statistically significant.

Results
The results on the overall cohort give a general picture of the covariates in the rheumatic population as a whole, where most of them were associated with increased risk of infection. Specifically, infection before index ranked highest. For gender, females were more likely to be infected. For race, Caucasian patients had a decreased risk, but this was not statistically significant (p = 0.510). Aged patients had a greater likelihood of infection than the adult and middle-aged groups, but the middle-aged group seemed to be less vulnerable than the adult group. While GCs and nonbiologic DMARDs were strongly associated with serious infections, biological DMARDs showed a protective effect against infection. Most of the comorbidities included as covariates showed increased risks, e.g. COPD (HR: 1.24), or near-neutral risks, e.g. hypertension (HR: 0.99).

Sensitivity Analyses
To evaluate the robustness of the findings, we conducted two sensitivity analyses: one on a sub-cohort with recorded BMIs, and one on the overall cohort with the different types of nonbiologic DMARDs separated. Some researchers found that therapy with nonbiologic DMARDs represented an additional factor of increased risk of infections and that the risk varied depending on the nonbiologic DMARD used [4]. For exploratory purposes, the second analysis was conducted on the overall cohort by separating each nonbiologic DMARD to see its individual effect. Cyclosporine A and thalidomide were not associated with enough patient samples (0 and 1, respectively) to generate meaningful statistical results.

Discussion
The primary finding of the study was that different types of rheumatic diseases had different infection rates and distributions of risk factors.
Analyses on this large real-world U.S. cohort offered a comprehensive understanding of infection rates in rheumatic diseases, most of which require glucocorticoid and DMARD treatment. In general, the infection rate per 100 patient-years was relatively high (8.95) during the first two years of treatment. For some particular disease types, such as SLE, VA, and SSc, the incidence rates (IRs) were higher than for others. This is not surprising, as these three diseases tend to affect critical organs, involve more severely impaired immune function, and need more immunosuppressive treatment regimens. For example, impaired immune function in SLE, including an impaired acute inflammatory response, decreased numbers of T lymphocytes and T-helper cell activity, and complement dysfunction, was found to contribute to the increased infection risk [16].
Our study validated and complemented the current literature on the risk of serious infections associated with rheumatic diseases. Our analyses found a 20.4% prevalence of severe infection among SLE patients, which is within the range of 12-40% reported in prior works [20][21][22].
Our results were slightly higher than those of another study based on a cohort of 33,565 SLE patients aged 18-64 years [6]. The discrepancy might be because we also included older patients. In our infection group, 45.92% of patients were > 65 years, and older people have an increased risk of infection. Recently, a large cohort study in England showed that the cumulative incidence of infection over 1 year of follow-up was 18.3% (17.9-18.7) in patients with PMR or giant cell arteritis [17]. Our results seemed much lower in PMR patients according to Table 3 (9.20, 8.51-9.89), but we included two years of data and our target was the serious infection rate rather than the all-cause infection rate. For SS, few studies have concentrated on the infection rate [2], but we found the rate to be slightly lower (8.32, 7.37-9.27) than for RA (8.60, 8.36-8.84).
A study from the European League Against Rheumatism (EULAR) Scleroderma Trials and Research (EUSTAR) database reported that infections accounted for 33% of the non-SSc-related causes of death [23]. Our results also revealed a very high infection rate among SSc patients (10.89 per 100 patient-years), which calls for more attention to this issue. Further studies of the related risk factors and of how to reduce infections are needed.
We also discovered for the first time that certain types of rheumatic diseases, such as VA and SSc, tend to develop infections more rapidly than others. In antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis, severe infections were reported to develop in almost one out of four patients (23%), most of them during the first year [24]. A high dose of glucocorticoids is perhaps a significant factor in infection development, while neutropenia due to cyclophosphamide is also a contributing factor [25,26].
Although it is difficult to interpret all these results given structured EHRs only, they still offer some clues and suggest paying more attention to these particular types of rheumatic diseases in the early stages of treatment in clinical practice.
Apart from common risk factors such as older age, previous serious infection, and some comorbidities, we also found that chronic anemia was significantly associated with an increased risk of infection, i.e. HR = 1.14 (1.06-1.22) in the overall cohort. Chronic anemia is seen in many chronic diseases, including RA and chronic renal disease. In RA, anemia is the most common comorbidity, with an estimated prevalence of 33.3-59.1% [27]. Although we did not observe such a high prevalence in a rough estimate using ICD codes only (8.65% and 13.25% in the whole group and the infection group, respectively, which might underestimate the true numbers), chronic anemia afflicted a substantial proportion of rheumatic patients, and treating anemia is strongly encouraged regardless of the underlying disease process responsible for it.
The drugs used in rheumatic diseases can be divided into glucocorticoids, traditional (non-biological) DMARDs, and biological DMARDs. Numerous previous studies have shown that GC use at certain dosages increases the risk of infections, both in RA [28,29] and in other types of rheumatic diseases such as SLE and lupus nephritis [22,30,31]. In line with these previous studies, our data revealed an increased risk of serious infection associated with systemic GCs (RR:1.64, 1.52-1.76) across the cohorts, although there were subtle differences between disease types. Compared with many previous studies, our study used a relatively larger database and offered a more comprehensive perspective. Withdrawal of GCs should therefore be carried out systematically for all patients receiving long-term GC therapy, and if this is hard to implement, a change in treatment should be considered [32].
Previous findings on the risk of infection with nonbiologic DMARDs have not converged. In RA, the initiation of leflunomide, sulfasalazine, or hydroxychloroquine may not increase serious infections compared with methotrexate [33]. A systematic review and meta-analysis of randomized controlled trials (RCTs) also showed that methotrexate was associated with an increased risk of infection in RA (RR: 1.25; 1.01-1.56), but not in other non-RA inflammatory rheumatic disease populations [34]. However, another population-based RA cohort in British Columbia, Canada indicated that the use of nonbiologic DMARDs, including methotrexate, did not increase the risk of infection in RA [35]. Our results confirmed that the use of most nonbiologic DMARDs, including cyclophosphamide, mycophenolate mofetil, leflunomide, methotrexate, azathioprine, hydroxychloroquine, sulfasalazine, and tacrolimus, was associated with an increased risk of serious infection in the overall cohort. In larger sub-cohorts such as RA and SLE, the associations persisted with only slight differences in the hazard ratios. Compared with results based solely on RCTs, our results are derived from real-world EHR data, which are not limited by the underrepresentation of specified populations, e.g. elderly and high-risk patients, and thus might have higher generalizability.
In registry or observational studies, biologics were associated with a higher risk of serious infections compared both to non-use of biologics and to the use of nonbiologic DMARDs [36]. A meta-analysis reported a 31% increased risk of serious infections in RA patients treated with standard-dose biologics compared to nonbiologic DMARDs (OR:1.31, 1.09-1.58) [1]. In our cohort, the proportion of biological DMARD users was relatively low (3.27%), which might have limited statistical power, but we observed a decreased risk (HR:0.82, 0.71-0.95), suggesting a protective effect.
This study used structured EHRs and was thus subject to some potential biases and limitations. It lacks linkage to other data sources, so care provided by non-participating physicians was missed. With respect to the disease definition and outcome measurement, we cannot exclude some misclassification, since data were based on diagnosis codes and not validated through medical record review. However, the diagnoses were based on hospitalization with infection as the primary diagnosis, thus limiting potential misclassification. In addition, no direct measures of disease activity and disease severity exist within the administrative database; therefore, the impact of disease status on DMARD initiation could hardly be determined from this study. Also, there might be some selection bias for GC and DMARD use, as physicians tend to treat more severe patients with (higher dosages of) GCs and more powerful DMARDs, but we did not include the dosage and duration of drug use. Finally, we cannot rule out the potential for residual confounding, since we selected variables based on experience and reports, and it is possible that the results remained affected by unmeasured confounders. The strengths of our data were that they were real-world data and that the sample size was large enough to allow an adequate number of events. We plan to include the dosage and duration of drugs for in-depth analyses in the future.

Conclusion
These comprehensive analyses, using a large administrative EHR database, presented novel findings while confirming previous data from observational studies. Patients with rheumatic diseases have a significant risk of serious infections and require enhanced vigilance in the management of their pharmacotherapy and comorbidities. Because infection rates and risk factors varied across different rheumatic diseases, therapeutic strategies should be adjusted accordingly.

Ethics approval and consent to participate
The study was approved by the Institutional Review Board (IRB) of UTHealth. The IRB waived the requirement for informed consent.

Consent for publication
Not applicable.

Figure 2
Cumulative incidence of serious infection during the follow-up period. We excluded 4,852 patients with overlapping rheumatic diseases to facilitate the comparison. There might be some discrepancies with the results in Table 3 due to censored data and differences in computation.

Supplementary Files
This is a list of supplementary files associated with this preprint: supplement.docx