Data sources
The cohort included all COVID-19 cases in the study area, covered by the Agency for Health Protection (AHP) of Milan, corresponding to 193 municipalities in the northern Italian region of Lombardy, with a total population of 3,48 million inhabitants. The study area includes the municipality of Codogno that was at the origin of the first Italian epidemic outbreak. From the beginning of the outbreak, all tracing activities were included in a web-based platform, developed by the Epidemiologic Unit of the AHP, called Milano COV, including cases and related contacts (details on the information system are described in the Online-Only Methods). A confirmed-case is defined as a person with a real-time polymerase chain reaction (RT-PCR) positive result of SARS-COV-2 infection, irrespective of clinical signs and symptoms. Contacts are defined as all individuals who are associated with a case’s sphere of activity, thus potentially exposed to the same source of contagion. Cases and close contacts underwent epidemiological investigation to provide description of the clinical presentation of COVID-19 and its clinical course. Furthermore, data were collected to estimate the serial interval of SARS-COV-2 infection, the symptomatic proportion of COVID-19 cases, and to identify possible routes of transmission.
From the beginning of the outbreak, all tracing activities were included in a web based platform, developed by the Epidemiologic Unit of the AHP, called Milano COV, including cases and related contacts. During the outbreak, it has been necessary to communicate the nominative list of identified cases and close contacts of each municipality to the Mayor’s office, to verify social support needs and to assess possible quarantine violations. The prefectures of the province of Milan and Lodi needed the same information to enforce restrictive measures provided for by temporary laws issued during the lockdown phase. Moreover, in order to allow the clinical management of identified cases and close contacts, information related to their registered patients was released to general practitioners (GPs). In order to manage the dynamic lists of cases and contacts, updated with information relating to death, hospitalization or discharge home, an additional web portal called ATS-Milano COR was set up. The portal allowed the selective visualization of subjects by town of residence (for Mayor’s offices and other stakeholders), province of residence (for prefectures) or according to their GP. This dynamic information structure is capable also to track the clinical evolution of each case from symptom onset, or swab date, up to a negative PCR result, and of each close contact to the end of quarantine. For cases and contacts reported by general practitioners, in addition to the active surveillance and contact tracing described above, a massive SMS system was activated both to reinforce the indications for segregation and to provide links to information material on how to maintain the quarantine at home.
In order to expand the outbreak reporting system, general practitioners could add symptomatic cases that did not undergo a nasopharyngeal swab, and their close contacts. In the AHP of Milan 95% of the residents are registered with a GP affiliated with the Lombardy Regional Health System (RHS). From the beginning of the epidemic, the Lombardy Region daily sent the list of hospitalized COVID cases to each AHP, including the date of entry and the name of the hospital where the patient was admitted, and each variation (transfer, home discharge, death) was communicated in the following daily dataflow. Additional information relating to hospitalizations of patients in the ATS Milano COR was derived from the regular administrative data discharge flow, which is consolidated for hospital admissions up to the end of April 2020. We integrated between them, and verified with the demographic information in the Health Service Register of the Lombardy Region (age, gender, place of residence), the described data sources in the Integrated Datawarehouse for COVID Analysis in Milan, anonymized with a random unique patient id. The same id was assigned to those subjects in all other administrative databases of the AHP, anonymized prior to analysis. Individual level comorbidities data were derived using the chronic disease administrative database of the AHP of Milan, according to the algorithms specified in the Regional Act X/6164 [24] and X/7655 [25] of 2017, and summarized in English in E-Table 1. Vital status was derived from the early notification system of the AHP of Milan, set-up from the beginning of the epidemic, in which deaths are communicated from the Civil Registry of each Municipality to the AHP and manually introduced in the Health Service Register, or directly from the GP and Mayor’s offices for the subjects already in the Milano COV database through the web-based information system. We determined vital status at 30-day from diagnosis, which was defined for confirmed-cases as the first date between registered symptom onset and the swab positivity result. The date of symptom onset in the database was derived from the epidemiological interview or from the date of first access to an emergency department or first thorax CT scan, in this order of priority. If none of these dates was available and the patient had been hospitalized, the date of hospital admission was used. For a minority of patients, infected in the early phase of the epidemic and for whom no onset dates were available, we uniformly random imputed the date of symptoms onset between February 10th and 17th. For symptomatic cases, date of diagnosis was the date of symptoms onset reported by the GP or, if missing, the date in which the subject was introduced in the web-system by the GP. For this analysis, we considered as alive patients with a date of death more than 30 days after the date of diagnosis. The vital status was assessed on May 23th 2020.
Study Population
From the COVID-19 database of the Milan AHP we extracted, on April 27, 2020, all subjects with nasopharyngeal positive and negative swabs confirmed SARS-COV-2. Through the database, we collected demographic information on age, gender, municipality of residence, ASST (geographical and administrative partition of the territory of the Milan AHP). We defined as exposed all subjects with the following autoimmune diseases: AR, SLE, systemic sclerosis, Sjogren disease, ankylosing spondylitis, myasthenia gravis, Hashimoto's disease, acquired autoimmune hemolytic anemia, and psoriatic arthritis. Presence of any autoimmune disease was identified in the chronic disease administrative database of the AHP of Milan, where information on the presence of 64 chronic conditions is recorded, for every resident registered with the RHS, using outpatient exams and visits, hospital discharge sheets, pharmaceutical, and exemption from co-payment databases according to the algorithms defined in the Regional Act X/6164 and X/7655 of 2017 [24, 25]. Number of comorbidities was derived from the same database. For test-positive subjects, death and hospitalization status were updated to June 11, 2020.
Study Design
To evaluate the association between autoimmune status and occurrence of COVID-19 we performed a combined analysis of a test-negative design (TND) case-control study, a case-control with test-positive as cases (CC-POS design), and one with test-negative as cases (CC-NEG design). As proposed by Vandenbroucke et al [26], the combination of these studies will serve to evaluate if autoimmune diseases are specific risk factors for COVID-19 or generally for respiratory diseases with similar symptoms. We also performed a conventional matched case-control design for comparison.
TNDs evaluate the association between an exposure and an outcome by comparing test-positive (cases) with test-negative (controls) subjects. Cases were defined as all subjects with a positive swab collected from the ATS-Milano COR system, no exclusions were performed. Controls as all subjects with a negative swab included in the same database. The idea is that test-negative controls underwent testing because they presented symptoms attributable to COVID-19 but resulted negative, thus having a different infection which may lead to similar symptoms, for example another respiratory infection. In fact, being susceptible to the same selection mechanisms as tested-positives will protect from common case-control biases. The TND design is potentially capable of identifying the effect of autoimmunity on COVID-19 if it has a different magnitude, or even direction, compared to the effect that it has in other respiratory infections [26]. To control for different spatial correlation among subjects, and to control for time trends in the administration of swabs, we matched TND’s cases and controls by ASST and date of swab, date of positive swab for cases and date of negative swab for controls within 7 days of the case index date, only matched cases and controls will be considered.
At the same time, comparing test-positive subjects with general population controls will allow to estimate the effect of autoimmunity on COVID-19 infection compared to a control without respiratory symptoms. On the other hand, comparing test-negative subjects to the general population will allow to assess if autoimmunity is a risk factor for respiratory infections in general. Consequently, we designed two additional case-controls: in the first one cases were those subjects who tested positive (CC-POS design), while in the second one cases were those who tested negative (CC-NEG). Test-negative subjects were the same subjects selected as controls in the TND design [26]. In order to compare CC-POS’s (and CC-NEG’s) results with TND, a number of controls equal to 4 times the number of test-positive plus test-negative subjects was randomly sampled from the general population of the Milan AHP, thus resulting in a single control group for both CC-POS and CC-NEG designs.
For comparison, we performed also a classical population case-control design (hereafter name case-control design 2), where cases (test-positive subjects) and controls are matched by age ( 5 years), gender and municipality of residence. Controls were randomly sampled from the general population of the Milan AHP, also with a ratio of 1:4, only cases matched with 4 controls will be considered.
To evaluate the association between autoimmune status and a proxy of disease severity, defined as non-hospitalized and alive, hospitalized and alive, and deceased, we performed a cohort study using all COVID-19 test-positive subjects.
Statistical Analysis
To measure the association between autoimmune diseases and COVID-19 occurrence we used logistic regression models and conditional logistic regression models in matched designs, presenting results as the ORs of having an autoimmune disease in cases compared to controls and their 95% CIs.
Models for the TND design were adjusted for age (categorized as <17, [18-40), [40-70), ≥70 years), gender and number of non-AIDs chronic conditions (categorized as no conditions, 1-3, and ≥4). Models for CC-POS and CC-NEG were adjusted for gender, age (categorized as <17, [18-40), [40-70), ≥70 years), number of non-AIDs chronic conditions (categorized as no conditions, 1-3, and ≥4) and municipality of residence. Models for the classical Population Case-Controls Design were adjusted for number of non-AIDs chronic conditions (categorized as no conditions, 1-3, and ≥4).
To measure the association between autoimmune diseases and severity of COVID-19 disease we used ordinal (cumulative) and multinomial logistic models. Ordinal logistic regression, in its cumulative formulation, requires the assumption of proportional odds [27]. When any covariate did not satisfy this assumption, a partial proportional odds model was fitted allowing non-proportionality for the selected variables [28]. Ordinal and multinomial logistic models were adjusted for gender, age (categorized as <17, [18-40), [40-70), ≥70 years), number of non-AIDs chronic conditions (categorized as no conditions, 1-3, and ≥4) and ASST (n=6). We decided to adjust for ASST, and not for municipality, of residence given that there are 193 municipality in the territory of the AHP, which may have led to very few cases in each stratum. Results were displayed as ORs with 95% CIs. ORs for the ordinal logistic model will be interpreted in their cumulative formulation that is the odds of deceased versus the combined categories hospitalized and alive, and non-hospitalized and alive, and of the combined categories deceased, and hospitalized and alive versus non-hospitalized and alive. ORs for the multinomial logistic regression will be interpreted as usual ORs with non-hospitalized and alive as the reference category. The analyses were performed using SAS Software version 9.4 (SAS Institute Inc., Cary, NC, USA).