Long COVID incidence in adults and children between 2020 and 2023: a real-world data study from the RECOVER Initiative

Estimates of the incidence of post-acute sequelae of SARS-CoV-2 infection (PASC), also known as Long COVID, have varied across studies and changed over time. We estimated PASC incidence among adult and pediatric populations in three nationwide research networks of electronic health records (EHR) participating in the RECOVER Initiative using different classification algorithms (computable phenotypes). Overall, 7% of children and 8.5%–26.4% of adults developed PASC, depending on the computable phenotype used. Excess incidence among SARS-CoV-2 patients was 4% in children and ranged from 4% to 7% among adults, a lower-bound estimate based on two control groups: contemporary COVID-19-negative patients and historical patients (2019). Temporal patterns were consistent across networks, with peaks associated with the introduction of new viral variants. Our findings indicate that preventing and mitigating Long COVID remains a public health priority. Examining temporal patterns and risk factors of PASC incidence informs our understanding of etiology and can improve prevention and management.


Introduction
Post-acute sequelae of SARS-CoV-2 infection (PASC), commonly referred to as Long COVID, has emerged as a significant clinical and public health concern following the global COVID-19 pandemic. [1][3][4] These prolonged effects can profoundly impact quality of life and lead to functional limitations, psychological distress, and higher numbers of healthcare visits. 5 Characterization of the epidemiology of PASC to guide the larger public health response has remained challenging due to the wide range of symptoms. Various studies have quantified how many patients with SARS-CoV-2 infections may subsequently develop PASC, resulting in a wide range of estimates. A recent systematic review found that 45% of patients with COVID-19 developed at least one unresolved symptom, with higher prevalence among hospitalized patients; however, estimates ranged from 2% to 99.9% across studies. 6 Differences in PASC definitions, data sources, study populations, and methodology have contributed to inconsistent results. 7,8 In the absence of a gold standard, multiple case definitions for PASC have been operationalized by researchers. 3,4 Another potential source of variation is the study period. [10][11][12] Fewer studies have explored fluctuations in PASC frequency over variant waves. Some of these suggest that risk of PASC has declined with recent COVID-19 variants, [13][14][15][16][17][18][19] while others have found similar or higher levels of risk with Omicron. 20,21 Larger studies are needed to systematically elucidate whether variations in PASC incidence are associated with variant differences, and whether incidence has declined after introduction of acute COVID-19 treatments, vaccination efforts, and immunity supplied by prior infection.
As part of the National Institutes of Health (NIH)-funded REsearching COVID to Enhance Recovery (RECOVER) Initiative, we leveraged electronic health record (EHR) data from three large EHR-based research networks to estimate PASC incidence. Our analysis considered demographic factors, including age, sex, race, ethnicity, and residential setting. We also examined how temporal trends varied across viral variant waves.
Given the lack of a unified definition of PASC, our analyses leveraged three definitions, one developed by each network. By exploring patterns of PASC across networks over time using a range of definitions, our study contributes to the growing body of knowledge on sequelae of SARS-CoV-2 infection to inform ongoing healthcare planning, intervention strategies, and patient care.
Inflammatory Syndrome in Children, or MIS-C), U09.9 (PASC), or B94.8 (Sequelae of other specified infectious and parasitic diseases), 23 and pediatric positive nucleocapsid IgG test results. We defined the index event as the first documented evidence, and for patients with post-acute evidence only we imputed index events as 59 days prior to the earliest such evidence, placing the window for PASC onset after the acute infection period while minimizing the possibility of the window being associated with reinfection. 24 Pediatric patients were included if ≤21 years old at index event (PEDSnet), and adult patients were included if ≥22 years old (N3C, PCORnet). Patients with an index event between March 2020 and February 2023 were included, leaving at least six months before the end of the study period (September 2023) to allow time for PASC to be documented and to mitigate lags in data reporting. We also required patients to have at least one visit within the health system prior to the index event, and at least one follow-up visit occurring during the study period and 90 or more days after the index event. See patient attrition details in Supplement eFigure 2.
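The index-event and inclusion rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the networks' code: the function and argument names are hypothetical, the day-level study bounds are assumed (the text gives only months), and we assume that any acute evidence takes precedence over post-acute evidence when both exist.

```python
from datetime import date, timedelta
from typing import Optional, Sequence

# Assumed day-level bounds; the text states only months.
INDEX_START = date(2020, 3, 1)
INDEX_END = date(2023, 2, 28)
STUDY_END = date(2023, 9, 30)


def index_date(acute_evidence: Sequence[date],
               post_acute_evidence: Sequence[date]) -> Optional[date]:
    """First acute evidence, else 59 days before the earliest post-acute evidence."""
    if acute_evidence:
        return min(acute_evidence)
    if post_acute_evidence:
        return min(post_acute_evidence) - timedelta(days=59)
    return None


def is_eligible(index: date, age_years: int,
                visits: Sequence[date], pediatric: bool) -> bool:
    """Apply the age, index-window, prior-visit, and follow-up criteria."""
    in_window = INDEX_START <= index <= INDEX_END
    age_ok = age_years <= 21 if pediatric else age_years >= 22
    prior_visit = any(v < index for v in visits)
    follow_up = any(index + timedelta(days=90) <= v <= STUDY_END for v in visits)
    return in_window and age_ok and prior_visit and follow_up
```

For example, a patient whose only evidence is a post-acute diagnosis on 1 March 2021 would receive an imputed index date of 1 January 2021.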

Outcomes
EHR networks have iteratively developed separate PASC definitions using different populations and technical infrastructures. Given the lack of a gold standard, we saw this as an opportunity to look across definitions.
Each network applied its own computable phenotype for probable PASC (herein 'PASC'). N3C's definition identified a) earliest documentation of a U09.9 / B94.8 diagnosis code or b) patients predicted to have PASC by a machine-learning-based algorithm trained on patients with a U09.9 diagnosis. 25 [27][28] Generally, PCORnet classified patients based on the presence of a U09.9 or B94.8 code, or at least one incident PASC diagnosis. 26,29 PEDSnet classified patients based on diagnoses (one U09.9, B94.8, or M35.81 code, or two or more PASC-associated features at least 28 days apart). See Supplement eMethods for network-specific descriptions.
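To make the rule-based style of phenotype concrete, the sketch below encodes the pediatric rule as summarized above: one qualifying diagnosis code, or two PASC-associated features at least 28 days apart. It is an assumption-laden illustration rather than the PEDSnet implementation. The function name and input shapes are hypothetical, any two feature occurrences are assumed to count (the text does not say whether they must be distinct features), and the post-index observation window is not applied here.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

QUALIFYING_CODES = {"U09.9", "B94.8", "M35.81"}
MIN_GAP_DAYS = 28


def pasc_onset(diagnoses: Iterable[Tuple[str, date]],
               feature_dates: Iterable[date]) -> Optional[date]:
    """Earliest date on which either rule is satisfied, or None if neither is."""
    candidates = []

    # Rule 1: a single qualifying diagnosis code is sufficient.
    code_dates = [d for code, d in diagnoses if code in QUALIFYING_CODES]
    if code_dates:
        candidates.append(min(code_dates))

    # Rule 2: two feature occurrences at least 28 days apart. Measuring from the
    # earliest occurrence finds the first date on which the rule can be met.
    ordered = sorted(feature_dates)
    for d in ordered[1:]:
        if (d - ordered[0]).days >= MIN_GAP_DAYS:
            candidates.append(d)
            break

    return min(candidates) if candidates else None
```

Returning the earliest qualifying date mirrors the way onset is defined in this section, as the first date on which the phenotype identifies PASC.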
All networks examined the proportion of COVID-19 patients developing PASC within 30-180 days after the index event (incidence proportion) and defined PASC onset as the earliest date that the computable phenotype identified PASC. Rates were presented as percentages (per 100 COVID-19 positive patients) over the entire study timeframe and by month or variant wave.
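The incidence proportion defined above reduces to a simple ratio. The following sketch assumes one record per COVID-19-positive patient holding the number of days from index event to PASC onset (or `None` if PASC was never identified), and assumes the 30 and 180 day bounds are inclusive; both the input layout and the function name are hypothetical.

```python
from typing import Optional, Sequence


def incidence_proportion(days_to_onset: Sequence[Optional[int]],
                         window_start: int = 30,
                         window_end: int = 180) -> float:
    """Percentage of COVID-19-positive patients with PASC onset inside the window."""
    if not days_to_onset:
        raise ValueError("need at least one patient")
    cases = sum(
        1 for d in days_to_onset
        if d is not None and window_start <= d <= window_end
    )
    return 100 * cases / len(days_to_onset)
```

Monthly or per-wave rates follow by grouping patients on index month or variant wave and applying the same function to each group.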

Statistical analysis
We summarized categorical data as counts and percentages and described continuous data as mean values with standard deviations (SDs) or median values with interquartile ranges (IQRs). We analyzed 180-day incidence proportions using the beta distribution to calculate 95% confidence intervals (CIs) and represented findings visually as time-series plots and heatmaps, including summing patients with PASC by month of PASC onset. Unadjusted and adjusted hazard ratios with 95% CIs were generated using multivariable Cox Proportional Hazards regression models with PASC onset as an endpoint and presented by age group, sex, race/ethnicity, pre-existing conditions, rurality, COVID severity, vaccination status, and index month. P-values <0.05 were considered statistically significant. All analyses and visualizations were produced in R, using the ggplot2, survival, and survminer packages.
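The text does not say which beta-based interval was used. One common choice is the Clopper-Pearson (exact) interval, whose limits are quantiles of beta distributions. The sketch below assumes that method and, to stay dependency-free, finds the limits by bisection on the binomial tail probability, which is mathematically equivalent to taking the beta quantiles. It is written in Python for illustration only (the study itself used R) and is practical only for modest sample sizes; a library beta quantile function such as R's `qbeta` would be the usual tool for large cohorts.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve_increasing(f: Callable[[float], float], target: float) -> float:
    """Find p in [0, 1] with f(p) == target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided CI for a proportion k/n (assumed method, see lead-in)."""
    # Lower limit solves P(X >= k | p) = alpha / 2, i.e. the Beta(k, n-k+1) quantile.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit solves P(X <= k | p) = alpha / 2, i.e. the Beta(k+1, n-k) quantile.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper
```

Calling `clopper_pearson(cases, patients)` returns the limits as proportions; multiplying by 100 puts them on the percentage scale used for the incidence estimates.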

Control group analysis
Because many PASC symptoms are non-specific, we compared PASC-positivity within a COVID test-negative control group against COVID-19 positive patients within the same period. "Negative" controls were required to have a negative PCR or antigen test within the first five months of 2021 (when testing was available but before widespread home testing), and no evidence of COVID-19 prior to or within 180 days afterwards. We included patients if their first documented negative test fell within the period and considered this the index event. We repeated this approach using a "historical" control group selected from the first five months of 2019 to estimate how many patients would be identified as PASC-positive using our algorithms during that period, selecting the first visit within that timeframe as the index event. To calculate 'excess incidence' for the SARS-CoV-2 study population, we considered the incidence among COVID-19 negative outpatients to approximate a baseline ("COVID-free") burden of PASC-like sequelae and subtracted this from the overall incidence among COVID-19 positive patients.
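The excess-incidence calculation described above is a difference of two incidence proportions. A minimal sketch follows; the function name is hypothetical and the counts in the usage note are placeholders chosen for arithmetic clarity, not study data.

```python
def excess_incidence(pasc_positive: int, n_positive: int,
                     pasc_control: int, n_control: int) -> float:
    """Incidence among COVID-19-positive patients minus baseline incidence
    among controls, both expressed as percentages."""
    if n_positive <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    incidence_positive = 100 * pasc_positive / n_positive
    incidence_control = 100 * pasc_control / n_control
    return incidence_positive - incidence_control
```

With placeholder counts of 120 PASC-positive patients among 1,000 COVID-19-positive patients and 50 among 1,000 controls, the function returns 7.0 percentage points. Because the control incidence is subtracted in full, any overcounting of PASC-like sequelae among controls lowers the result, which is why the estimate is treated as a lower bound.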
Documented vaccinations were consistently low across adult EHR networks, with 24% of N3C patients and 26% of PCORnet patients having at least one COVID-19 vaccination prior to index event (Figure 1F, eTable 1). This number was lower (12%) for PEDSnet patients. In unadjusted analyses, proportions of patients developing PASC were similar by vaccination status (Table 1).

Monthly trends for COVID-19 and PASC cases
Figure 1 shows the occurrence of COVID-19 and PASC from March 2020 to February 2023, together with changes in related covariates. Monthly count patterns were concordant across PCORnet and N3C, including spikes in cases around the beginning of 2022 (Figure 1A), while PASC case counts among children were lower. COVID-19 case counts were also closely aligned across networks (Figure 1B), including surges in cases following the onset of new variant waves. While absolute PASC incidence proportions differed, temporal patterns were strikingly similar (Figure 1C). A decline in incidence towards the end of the study period was noted across networks. However, analysis of source data newer than those used for this manuscript confirms that this apparent decline is unreliable (data not shown), and likely an artifact of incomplete follow-up times for some of our population due to heterogeneous follow-up and reporting schedules across sites. Monthly percentages of severe COVID-19 cases were concordant (Figure 1D).
Across networks, PASC incidence was higher for female patients (Figure 2A) and, among adults, for older age groups and Black patients. Notably, incidence increased across variant waves: it was lowest during the Ancestral and Alpha waves, followed by Delta and early Omicron, and was highest for recent Omicron variants. Incidence was frequently greater with higher Charlson Comorbidity Index (CCI) scores, Pediatric Medical Complexity Algorithm (PMCA) category, and COVID-19 severity (Figure 2B).

Multivariable analysis
We constructed time-to-PASC multivariable Cox regression models (Figure 2C). Unadjusted and adjusted hazard ratios (HRs and aHRs; 95% CI) for PASC incidence within 180 days were summarized over the study period (
control groups (Figure 3), but at lower levels than among COVID-19 positive patients, demonstrating the nonspecific nature of symptoms associated with PASC. Rates among historical controls were higher than contemporary controls, potentially reflecting higher levels of interaction with the healthcare system in 2019 compared to 2021. Heatmaps for each control group showed the increased propensity for PASC with greater burden of pre-existing conditions.

Discussion
Our study explored crude and excess incidence of PASC among a large cohort of COVID-19-infected adults and children using nationwide EHR network data. Crude incidence estimates of PASC ranged from 8.5% to 26.4% in adults and were 7.0% in children. After accounting for potential background levels of PASC-like symptoms, excess incidence was estimated to be 4% in children and between 4% and 7% among adults. Incidence was highest in older adults, women, and patients with pre-existing comorbidities or more severe acute COVID-19 illness.
While the magnitude of PASC incidence varied depending on the definition used, temporal patterns and risk factors were largely consistent across the EHR networks. Temporal peaks in PASC cases aligned with emergence of new viral variants, suggesting a potential association between viral dynamics and the development of PASC. Peaks aside, we did not see sizable secular decreases in PASC risk over time, suggesting that PASC remains a public health priority. However, other factors, such as increasing recognition of PASC clinical features and use of U09.9 codes over time, may be influencing detection of PASC diagnoses in EHRs. Higher prevalence of PASC among older adults and individuals of certain racial or ethnic backgrounds underscores the importance of considering these factors in PASC risk stratification.
Our study relied on three distinct definitions of PASC, including a broader definition from PCORnet and a more restrictive definition from N3C. Our pediatric definition was also more restrictive (requiring two or more diagnoses indicative of PASC). Thus, we present findings as a range of potential PASC incidence estimates.
Our analyses further contextualize these estimates by applying PASC definitions to two "control" groups. As expected, PASC-like sequelae were highest among COVID-19 patients, especially when considering pre-event health of patients and severity of the index encounter. We note that PASC-like symptoms may be overrepresented in both control groups, which could lead us to underestimate excess incidence; contemporary controls may include patients with false-negative tests or undocumented infections, and patients had more frequent interactions with the healthcare system in 2019 compared to 2021. 33 Consistent with other studies, our findings highlight severity of initial COVID-19 infection as a risk factor for PASC, with elevated rates of PASC among patients hospitalized during their initial COVID-19 infection, 6,34,35 though we recognize that patients with inpatient care may receive closer follow-up, leading to more opportunities for PASC diagnosis.
Consistent with a prior N3C publication, 36 our adjusted analyses found that previously vaccinated adults had a lower risk of PASC compared to unvaccinated patients. This finding was not replicated within the pediatric cohort, perhaps because higher-risk pediatric patients are more likely to seek out vaccination. However, the sample of pediatric patients with known vaccination was also small and we were not able to differentiate unvaccinated patients from vaccinated patients lacking documentation. The impact of vaccination on PASC incidence remains an important area for investigation.
In this study, we applied a consistent analytic approach across three research networks using different PASC definitions. Despite concordance of most findings, this study had several limitations. First, diagnoses of COVID-19 and PASC may be missing in EHR data. Indeed, we suspect the high levels of PASC observed at the beginning of the pandemic reflect underdetection of mild COVID-19 while testing was rationed. Similarly, higher PASC incidence during recent variant waves could reflect growing recognition and diagnosis of PASC. 21 Findings may also be influenced by restriction to patients with multiple healthcare encounters, underrepresentation of data from non-academic medical centers, and variation in healthcare access by gender, race/ethnicity, rurality, and other structural or social determinants of health. Our study also lacked complete data on vaccination status.
Nonetheless, our study contributes valuable insights into PASC incidence in a large cohort of COVID-19 patients. Our cross-network approach, which leveraged differing definitions, delivered a plausible range of estimates and demonstrated remarkably similar profiles and temporal patterns of PASC incidence. Future prospective studies incorporating comprehensive assessment protocols, diverse healthcare settings, and detailed variant and vaccination data are warranted to further elucidate the complex nature of PASC and inform strategies for its prevention, management, and long-term care.

Table 2
Risk of PASC varied by race and ethnicity, with Asian patients having a slightly lower PASC risk compared to white patients (Table 2). Results were otherwise inconsistent across networks. After adjustment, adult patients with documented prior vaccination had a significantly lower risk of PASC compared to those with no evidence of vaccination, with a stronger observed effect among PCORnet patients (aHR: 0.84, p<0.001) than N3C patients (aHR: 0.98, p<0.001). In N3C, patients vaccinated within 90 days post-index event also had slightly lower risk of PASC (aHR: 0.97, p<0.001). Prior vaccination was associated with a slightly higher risk of PASC among pediatric patients (aHR: 1.1, p<0.001). Acute COVID-19 illness severity was the strongest predictor of PASC risk (Table 2). Patients hospitalized with ICU-level care had the highest risk, even after adjustment (N3C aHR: 2.53, p<0.001; PCORnet aHR: 2.31,

• University of Florida - UL1TR001427: UF Clinical and Translational Science Institute
• University of New Mexico Health Sciences Center - UL1TR001449: University of New Mexico Clinical and Translational Science Center
• University of Texas Health Science Center at San Antonio - UL1TR002645: Institute for Integration of Medicine and Science
• Yale New Haven Hospital - UL1TR001863: Yale Center for Clinical
Table 1. Descriptive characteristics of COVID-19-positive patients with and without PASC across three
a Hazard ratios generated by a Cox Proportional Hazards regression model, unadjusted.
b Hazard ratios generated by a multivariable Cox Proportional Hazards regression model, adjusted for age group, sex, race/ethnicity, pre-existing conditions, rurality, COVID-19 severity, vaccination status, and month of the index event.

Vertical dotted lines represent overall PASC incidence proportion for each network. B, Two-dimensional heatmap. Heatmaps represent the proportion of COVID-19 positive patients who developed PASC. Percentages are stratified by COVID-19 severity and patient pre-existing conditions. N represents the number of patients within the group. P represents the number of PASC patients within the group. Heatmap scales are based on the percentage of PASC patients from each network, from 0% (blue) through 100% (red). The midpoint (white) of the scale represents the overall PASC rate from each network. Values three or more times greater than the overall PASC rate were colored red. C, Multivariable hazard ratios for incident PASC per month, over time compared with January 2021. Hazard ratios were generated by a multivariable Cox Proportional Hazards regression model, and are presented unadjusted (black) and adjusted (orange) for age group, sex, race/ethnicity, pre-existing conditions, rurality, COVID-19 severity, vaccination status, and month of the index event. 95% confidence intervals are provided.