From positive to negative: a time-to-event analysis during the first COVID-19 epidemic period

Background. The ability to identify positive subjects is crucial for public health practice, as it reduces transmission and supports contact tracing and isolation. We evaluated the reliability of the test-based criterion as the required condition for readmitting asymptomatic positive patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to the community, by assessing the time span from positive to negative RNA detection by Real Time Polymerase Chain Reaction (RT-PCR). Methods. We used information on negative conversion and the respective times. Cumulative probabilities of negative conversion during follow-up were evaluated by Crude Cumulative Incidences (CCIs). Non-parametric estimates of CCIs and the respective 95% confidence intervals (CIs) were obtained. Results. We report the results for 52,186 individuals. A total of 33,486 subjects tested negative or potentially negative, with a CCI of 75.2% at 70 days from the first swab (95% CI: 74.8% to 75.7%). A total of 11,000 subjects died before 14/05/2020 without a diagnosis of negative status (CCI 21.9%; 95% CI: 21.5% to 22.3%), at 56 days from the first swab (maximum observed time to death). Conclusions. SARS-CoV-2 positivity is a condition that frequently lasts more than 30 days. More solid studies are required to determine the significance of a prolonged state of positivity and its consequences for policies on the discontinuation of quarantine and isolation.


Introduction
In December 2019, in China, a novel strain of coronavirus was recognized as the infective agent causing an abnormal peak of atypical pneumonia [1]. The 31st of January 2020 marks the official date of the arrival of the virus in Italy, as two Chinese tourists were first diagnosed positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [2] and then admitted to Spallanzani hospital in Rome.
On the 20th of February, in the Lombardy region, the first Italian patient positive for SARS-CoV-2 was hospitalized [3,4].
SARS-CoV-2 belongs to the family of coronaviruses; it targets the airways and has an incubation period ranging from 5 to 14 days [5]. Common symptoms are anosmia, dysgeusia, cough, fever, diarrhea, nausea, vomiting and dyspnea [6,7].
The Real Time Polymerase Chain Reaction (RT-PCR) analysis run on a nasopharyngeal swab is the gold standard for the diagnosis of coronavirus infection [8]. It detects fragments of viral RNA in the upper airways regardless of the viral load. To date, a patient is considered positive for SARS-CoV-2 solely on the basis of a positive nasopharyngeal swab result, regardless of symptoms [9]. The criteria for quarantine and isolation have changed over time. According to the current Italian national guidelines, individuals who came in contact with a case of COVID-19 can either be quarantined for 14 days and released at the end of the two weeks without undergoing an RT-PCR test, or be quarantined for 10 days if an RT-PCR test performed on the last day yields a negative result [9]. If the individual becomes symptomatic, they are isolated, a nasopharyngeal swab is performed immediately, and isolation continues at least until the end of symptoms. Isolation is discontinued only after one negative RT-PCR test result [10]. A patient who shows no symptoms but is persistently positive on RT-PCR at least one week after clinical recovery can return to the community 21 days after symptom onset without the need for another nasopharyngeal swab [11].
Both the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO) have discarded the test-based criterion for ending the isolation of positive patients outside healthcare settings, relying only on temporal criteria that change depending on the presence of symptoms [12,13]. The latest indication of the Italian Ministry of Health on the isolation of symptomatic positive patients establishes that a patient is defined as "recovered" and can return to society at least 10 days after the appearance of symptoms and at least 3 days after clinical recovery, if a PCR test performed on the 10th day yields a negative result [11]. For asymptomatic positive cases, quarantine and isolation can be discontinued 10 days after the first positive PCR test result if a negative result is obtained on the 10th day [11].
The aim of this study is to evaluate the time span over which a patient diagnosed positive for COVID-19 becomes negative ("negative conversion time") for viral RNA detection by RT-PCR. To this end, we used a large dataset that includes information from all the swabs performed in the Lombardy region from 20/02/2020 to 14/05/2020. The ability to identify positive subjects is crucial for public health practice, as it reduces transmission and supports contact tracing and isolation.

Materials And Methods
We used a large dataset covering the first outbreak phase, including the earliest period of virus spread and the subsequent phase of Public Health Emergency. In this study we performed a retrospective reconstruction of the negative status, as of 14/05/2020 (end of study), of subjects diagnosed as "COVID-positive" in Lombardy and of the respective time to (positive-to-negative) conversion, using records of delivery dates and test results from 229,565 swabs. During the emergency period Lombardy experienced a drastic rise in mortality, especially in the elderly population. For those inhabitants who died (from any cause, whether COVID-related or not) before the end of the study without a confirmed negative status, it was not possible to know their "negative conversion time". This situation is known as the "presence of competing risks"; a specific method of statistical analysis was therefore adopted to account for it.
The Lombardy region collected the data for analysis by combining information retrieved from two sources: the list of SARS-CoV-2 positive patients derived from contact tracing activity and the list of positive patients derived from the laboratories that performed SARS-CoV-2 diagnosis by RT-PCR on nasopharyngeal and oropharyngeal swabs. The data were then integrated with the information contained in the database dedicated to comorbidities and personal data, so as to build a single database collecting all the SARS-CoV-2 positive patients, the first positive diagnostic swab and, when performed, the negative swabs attesting the resolution of the disease, together with the subject's demographic characteristics (i.e., gender and age) and date of death.
The inclusion criterion was a diagnosis of COVID-19 positivity more than 30 days before the end of follow-up. This criterion was used to guarantee sufficient time for the conversion to occur.
The variable of main interest was the time to conversion to negative status, defined as the time elapsed between the delivery date of the first swab with a positive result and the delivery date of the last negative swab. Consequently, methods for the analysis of survival data were considered. For every subject with at least one negative swab result within the follow-up period, this variable is observable; these subjects were considered negative for COVID-19, regardless of whether an additional negative swab had been delivered no more than 24 hours earlier. Subjects without any negative swab result who were alive at the end of follow-up were considered positive for COVID-19, and the time until the last follow-up was treated as a censored observation. Finally, it is worth noting that for subjects who died before the end of follow-up from any cause (COVID-related or not) without evidence of conversion to negative status, it was not possible to evaluate the conversion time or to determine their status at the end of follow-up. For this reason, death from any cause was considered a "competing risk".
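The outcome definition above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the record layout (`Subject`), the event codes and the use of the end-of-study date as the censoring time are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

# Event codes: hypothetical labels chosen for this sketch.
CENSORED, NEGATIVE, DEATH = 0, 1, 2

END_OF_STUDY = date(2020, 5, 14)  # end of follow-up stated in the text


@dataclass
class Subject:
    first_positive: date                  # delivery date of the first positive swab
    last_negative: Optional[date] = None  # delivery date of the last negative swab, if any
    death: Optional[date] = None          # date of death from any cause, if any


def time_and_event(s: Subject) -> Tuple[int, int]:
    """Return (days since the first positive swab, event code) for one subject."""
    if s.last_negative is not None:
        # At least one negative swab: conversion time is observed.
        return (s.last_negative - s.first_positive).days, NEGATIVE
    if s.death is not None and s.death <= END_OF_STUDY:
        # Died without evidence of conversion: competing event.
        return (s.death - s.first_positive).days, DEATH
    # Alive and never negative: censored (assumed here at the end of study).
    return (END_OF_STUDY - s.first_positive).days, CENSORED
```

For example, a subject first positive on 01/03/2020 with a last negative swab on 05/04/2020 would contribute a conversion time of 35 days, while the same subject with no negative swab and no death would be censored at 74 days.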
To investigate the incidence of conversion to negative status throughout the follow-up period, deceased subjects without negative swab results cannot be excluded from the study, because they contribute to the estimates of the probability of conversion [14]. As a consequence, methods for survival analysis of competing risks data must be used [15]. For competing risks data, a reliable estimator of the Crude Cumulative Incidence (CCI) is provided by the non-parametric method described by Kalbfleisch and Prentice [15]. Although the main interest is the time to negative conversion, we also report estimates of the CCI of death (from any cause), because a complete description of the possible events is necessary for understanding the impact of covariates on the principal endpoint; in our case, the effects of gender and age classes on the time to conversion to negative status.
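As an illustration of the non-parametric approach, the sketch below computes a crude cumulative incidence curve in the presence of competing risks using the standard product-sum form: at each event time, the overall event-free probability just before that time is multiplied by the cause-specific fraction of events among subjects at risk. The function name and the event coding (0 = censored) are assumptions of this sketch, and confidence intervals are omitted.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def crude_cumulative_incidence(
    times: Sequence[float], events: Sequence[int], cause: int
) -> List[Tuple[float, float]]:
    """Non-parametric CCI for `cause` under competing risks.

    events: 0 = censored, any other integer = type of the first event.
    Returns (time, CCI) pairs at each distinct time with at least one event.
    """
    # Count observations of each type (censoring included) at each distinct time.
    by_time: Dict[float, Counter] = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    surv = 1.0  # probability of no event of any type before the current time
    cci = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_all = sum(c for e, c in counts.items() if e != 0)
        if d_all > 0:
            cci += surv * counts[cause] / n_at_risk
            surv *= 1.0 - d_all / n_at_risk
            curve.append((t, cci))
        # Everyone observed at time t (events and censored) leaves the risk set.
        n_at_risk -= sum(counts.values())
    return curve
```

Because death removes a subject from the pool of those who can still convert, the CCIs of all event types plus the event-free probability always sum to one, which is why deceased subjects cannot simply be dropped or treated as censored.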
Results were reported as estimated CCIs at each time with the respective 95% confidence intervals, both for the overall population and for gender and age subgroups (0-19, 20-34, 35-49, 50-64, 65-79, 80-120 years). The estimates were represented graphically (CCI curves); for the sake of simplicity, only the estimates of CCIs at fixed time points (i.e. weekly intervals) were reported in tables. To evaluate possible differences in CCIs among age classes within gender, and between genders within age classes, the Fine-Gray regression model was adopted. This model can be considered an analogue of the Cox model for competing-risks data [16]. Gender and age classes were specified as independent variables in the model using dummy coding. To evaluate the differential effect of age classes within gender, interaction terms were also included in the model. Results were reported as sub-distribution hazard ratios (sd-HR), with the respective 95% confidence intervals. Comparisons between genders and among age classes were performed by Wald tests. The confidence intervals and the p-values were corrected for multiple comparisons using the Bonferroni rule. As sd-HRs are directly linked to CCIs, any significant difference between strata implies a difference between CCIs [17].
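The link between sd-HRs and CCIs can be made explicit with the standard formulation of the Fine-Gray model. The notation below ($F_k$ for the CCI of event type $k$, $\lambda_k$ for the sub-distribution hazard, $x$ for the covariate vector, $\beta$ for the coefficients) is chosen for this sketch and is not taken from the paper.

```latex
\begin{align}
F_k(t \mid x) &= P(T \le t,\ \varepsilon = k \mid x)
  && \text{(CCI of event type } k\text{)} \\
\lambda_k(t \mid x) &= -\frac{d}{dt}\,\log\bigl\{1 - F_k(t \mid x)\bigr\}
  && \text{(sub-distribution hazard)} \\
\lambda_k(t \mid x) &= \lambda_{k,0}(t)\,\exp\bigl(\beta^{\top} x\bigr)
  && \text{(proportional sub-distribution hazards)} \\
1 - F_k(t \mid x) &= \bigl\{1 - F_{k,0}(t)\bigr\}^{\exp(\beta^{\top} x)}
  && \text{(implied relation between CCIs)}
\end{align}
```

Under this model the sd-HR for a dummy-coded covariate is $\exp(\beta_j)$: a value above 1 implies a higher CCI than in the reference class at every time $t$, and a value below 1 a lower one.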

Results
Here we report the results for 52,186 individuals who received the first diagnosis of COVID-19 positivity at least 30 days before the end of follow-up (14/05/2020). Of these 52,186, 27,002 (51.7%) were female.
Estimates of CCI for the overall population are reported in Fig. 1 for two outcomes: time to conversion to negative status and death for any cause. The CCI of the former outcome shows a slow increase in the first two weeks, then a rapid increase and finally a plateau at around 60 days from the date of the first positive swab. A total of 33,486 subjects tested negative for COVID-19 before the end of follow-up. As reported in Table 1, the CCI at 70 days was 75.2% (95% CI: 74.8-75.7%). Among the 33,486 "negative" subjects, 9,570 had a negative diagnosis confirmed by two consecutive negative results (within an interval of at most 24 hours), while 23,916 subjects had only one negative swab. A total of 11,000 subjects died before the end of follow-up from any cause and without a diagnosis of negative status. The corresponding CCI was 21.9% (95% CI: 21.5-22.3%) at 56 days from the first swab (maximum observed time to death).
Furthermore, from Table 1 it can be noted that less than 5% of subjects are estimated to become negative within 2 weeks from the first swab, while the CCI of death in the same period is higher: CCI = 18% (95% CI: 18.5-18.8%). This highlights a strong impact of the competing event in the first two weeks. The impact of death for any cause in the first 14 days is strong only in the elderly; in particular, the CCIs of death were less than 10% for the age groups from 0-19 to 50-64 years, while this incidence was higher than 30% in the older classes (see Fig. 2).
From the Fine-Gray regression model, a significant interaction effect between gender and age classes was found for the time to conversion to negative status (p < 0.0001), meaning that the effect of age on the CCI of conversion differed between males and females. The sd-HRs comparing age classes are reported in Table 3; to facilitate interpretation, comparisons are reported separately for males and females. In both groups, sub-distribution hazards were compared for each age class against the 50-64 years age class (reference class). For both females and males, sd-HRs are, overall, significantly higher than 1 for age classes 0-19 to 35-49 years, thus showing that CCIs for young people and adults are higher than the CCI for "elder adults" (50-64 years). The exception is females aged 0-19 years, with an sd-HR not significantly different from 1 (p > 0.999). The sd-HRs for the classes 65-79 and 80-120 years are significantly lower than 1, both for females and males. Thus, the CCIs of elderly people are lower than the CCI in the reference class. These results extend those previously shown for the estimated CCI curves, where similar differences between age classes were found for the overall population (i.e. without stratifying by gender).
Overall, the effect of age on CCI seems slightly more pronounced in males: the values of sd-HR for males are more distant from 1 than those for females.
By comparing females versus males within age groups (Table 3), it emerges that sd-HRs are overall significantly higher than 1 except for the 0-19 age class. These results extend those previously shown for the estimated CCI curves, where higher incidences were found in females regardless of age.
Thus the difference between females' and males' CCIs is not the same across age classes, with the greatest difference observed within the age class 80-120 years, with an estimated sd-HR of 1.64 (95% C.I. 1.47 to 1.82).
Table 3. Results from multivariable Fine-Gray regression model for sub-distribution hazard ratios (sd-HR) of conversion to negative status and sub-distribution hazard of death among gender and age classes. C.I. = confidence interval. * = p < 0.0001 (p-values corrected for test multiplicity).
The results for death for any cause are also reported in Table 3. In brief, we found, both for males and females, CCIs of death markedly lower in age classes 0-19 to 35-49 years compared to the age class 50-64 years, and CCIs of death higher in the older age classes compared to the age class 50-64 years. In particular, among the elderly the effect of age on death for any cause is higher in males, with estimated sd-HRs equal to 4.63 for the age class 65-79 years and 8.94 for the class 80-120 years, compared to females, with estimated sd-HRs equal to 2.88 for the age class 65-79 years and 4.57 for the class 80-120 years. Moreover, the CCI of death for any cause is significantly higher in males only for age classes from 35-49 to 80-120 years.
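The p-values and confidence intervals above are corrected for test multiplicity. A minimal sketch of a Bonferroni correction applied to Wald tests follows; the function names, the number of comparisons `m` and the standard error in the usage note are hypothetical, since the paper does not report them.

```python
import math
from statistics import NormalDist
from typing import List, Sequence, Tuple


def bonferroni_p(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def bonferroni_wald_ci(
    log_hr: float, se: float, m: int, alpha: float = 0.05
) -> Tuple[float, float]:
    """Wald confidence interval for a hazard ratio at level 1 - alpha/m.

    log_hr: estimated log sub-distribution hazard ratio
    se:     its standard error
    m:      number of comparisons in the family (an assumption of this sketch)
    """
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
    return math.exp(log_hr - z * se), math.exp(log_hr + z * se)
```

The cap at 1 explains why a corrected p-value can be reported as "p > 0.999". Calling `bonferroni_wald_ci(math.log(1.64), se, m)` with a hypothetical `se` returns an interval around 1.64 that widens as `m` grows.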

Discussion
Since the first outbreak of SARS-CoV-2 infection in China, the COVID-19 epidemic has spread throughout the world, involving more than 37 million people [18]. In Italy, the exponential growth of positive cases, especially in the first weeks, brought a rapid succession of Government policies aimed at controlling the spread of the disease [19]. Prevalent cases account for most of the present cases in Italy; therefore, one of the most important questions remains the duration of the disease itself.
This research focused on the evaluation of the interval between the first ascertainment of SARS-CoV-2 infection and the last test result that accounts for the recovery. It is worth recalling that the dataset includes all the tests performed from the very beginning of the outbreak until, approximately, the beginning of phase 2. In the earliest period there was no guideline either for the diagnosis of recovery or for the conversion to negative status. Several communications from international and national sources were published in the following period: for example, the first regulatory instructions in Italy on the assessment of negative status by two negative swabs were issued on February 22nd by the Italian Ministry of Health (circolare N. 5443). As a result, diagnostic tests were administered under heterogeneous rules throughout most of the time period covered by our data. In view of this, we preferred to reconstruct negative status by referring to the latest available swab, which represents the current knowledge (as of the end of the study) about the health status of all positive subjects in the territory. Of course, some limitations are implied, as discussed below. To date, the national guidelines have changed: a symptomatic patient is considered "recovered" and can return to society at least 10 days after the appearance of symptoms and at least 3 days after clinical recovery if an RT-PCR test performed on the 10th day yields a negative result, while for asymptomatic cases quarantine and isolation can be discontinued 10 days after the first positive RT-PCR test result if a negative result is obtained on the 10th day.
Another major difference introduced in the latest national guidelines, which does not apply to our study, is that individuals who came in contact with a case of COVID-19 can either be quarantined for 14 days and released at the end of the two weeks without undergoing an RT-PCR test, or be quarantined for 10 days if an RT-PCR test performed on the last day yields a negative result. If the individual becomes symptomatic, they are isolated, a nasopharyngeal swab is performed immediately, and isolation continues at least until the end of symptoms. Isolation is discontinued only after one negative RT-PCR test result. A patient who shows no symptoms but is persistently positive on RT-PCR at least one week after clinical recovery can return to the community 21 days after symptom onset without the need for another nasopharyngeal swab.
In the total population (n = 52,186) the analysis showed a CCI for negativity (considering both the single last and the double negative sample) of 16.6%, 31.1%, 45.2% and 56.3% at 21, 28, 35 and 42 days from diagnosis, respectively.
When the same population is stratified by sex, the CCI for women showed a more rapid increase, corresponding to a higher probability than men of being negative or potentially negative at any time interval. The analysis stratified by age showed a pattern in which younger patients had a consistently higher probability of being negative or potentially negative than older patients, especially at longer time intervals. The lowest CCI curve was observed for patients older than 80 years. As shown in Fig. 2, patients older than 65 years showed a considerably lower CCI than any younger age group.
These remarkable differences between age groups are partly explained by the consistently higher probability of death in older patients; in fact, the CCI refers to the probability that the event occurs as the first event with respect to the other events considered, in this case death.
Patients older than 65 years showed CCIs for negativity almost half those of younger age groups; conversely, when considering the CCI for death, the older age groups showed a significantly higher probability than the younger age groups.
Our results are in accordance with the work of Mancuso et al., which showed in a sample of 1162 patients that 60.6% of subjects became negative at a median follow-up time of 30 days from diagnosis and 36 days from symptom onset [20]. Moreover, in a recently submitted article, available as a pre-print, Lombardi et al. reported a median time from the first positive test to a negative test of 27 days (95% CI: 24-30) [21]. The results of our study were obtained independently of symptoms; therefore, the RT-PCR positivity of samples was not related to a clinical correlate, and we cannot speculate on the probability that positive patients were contagious.
The major limitations of the study stem from the fact that, in the period under investigation, data were recorded without a planned national strategy, because of the lack of a single testing protocol for SARS-CoV-2 [22]. Despite the loss of accuracy in reconstructing the time of conversion to negative status, the choice of the last swab is useful to avoid possible underestimation of the negative conversion time. A further issue that justifies the use of the last swab to ascertain negative status is the absence of a rationale, confirmed by reliable study results, that explains the possible factors determining the occurrence of a positive swab after a first negative result. In particular, at the time of the study it was not clear whether, and to what extent, subjects who had recovered from coronavirus could be infected again by the virus.
To date, the literature evaluates SARS-CoV-2 contagiousness not only by RT-PCR positivity but also by viral replication. In fact, several studies posit that the likelihood of recovering replication-competent virus declines after the onset of symptoms. In patients with mild to moderate symptoms, no replication-competent virus was found more than 10 days after symptom onset [23,24,25]. In patients with severe symptoms, in some cases complicated by an immunocompromised state, replication-competent virus was isolated between 10 and 20 days after symptom onset; even so, 88% and 95% of their biological fluids tested negative for replication-competent virus after 10 and 15 days, respectively, following symptom onset [26].
At the same time, it is evident that a high fraction of SARS-CoV-2 positive patients remain positive for a long time span; this implies that, if the test-based criterion is used as the necessary condition to end isolation, most patients will be isolated for a long time regardless of symptom resolution.
These considerations are needed especially because of the impact of containment measures on the activities that would suffer the most from this policy: manufacturing and productive activities, schooling and education. A strict policy of long quarantine means loss of work hours, and sometimes entire departments being sent home. The impact of the containment measures will be both short- and long-term: during the 4th quarter of 2020 the Gross Domestic Product (GDP) will contract by about 11%, more than half of which is due to COVID-19-induced uncertainty [27]; also, given that every additional year of schooling translates to 8 percent in future earnings, a study showed that the cost of school closures due to earning losses, as a percentage of GDP, will range from 9% in high-income countries to 61% in low-income countries [28].
The test-based criterion has been discarded by the major scientific organizations (WHO, CDC), but it is still the requirement for readmission to the community in many nations. As a consequence, the absence of a single internationally shared procedure that grants 100% safety causes great uncertainty and confusion, also taking into account that a 60-day isolation is not easily manageable and may not even be necessary.

Conclusions
It appears clear that SARS-CoV-2 positivity is a condition that frequently lasts more than 30 days, as we observed in our cohort of patients. Determining the agreement between test positivity and contagiousness is paramount in order to avoid very long isolation or quarantine, which would be unsustainable. At the same time, shortening the time span to less than 10-15 days would pose a concrete risk of increasing the spread of the virus in the population. More solid studies are therefore required to define a single internationally accepted policy on the discontinuation of quarantine and isolation.
It must be stressed that several testing protocols for the processing of naso-pharyngeal samples have been adopted throughout the outbreak period: thus, it was not possible to have a systematic evaluation of conversion times. Nonetheless, our results provide useful information for aiding decisions about the administration of positive cases.

Declarations
Ethics approval and consent to participate. This research did not need any ethical approval, since all data are administrative data available at a central level.
Consent for Publication. Not applicable.
Funding. This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Figure 1
Crude Cumulative Incidence (CCI) for any negativity status and death.

Figure 2
Crude Cumulative Incidence (CCI) for any negativity status and death stratified by age groups.
