We studied 719 adults from the PHOSP-COVID study, including the 626 patients previously reported.15 Individuals had been hospitalised for COVID-19, 6 months prior (median 6.0; IQR 4.5–6.7 months; Range 1.4–8.3), confirmed clinically or by PCR (n=621). We included patients reporting symptoms from 4 weeks after acute COVID-19, per the National Institute for Health and Care Excellence (NICE) and Centers for Disease Control and Prevention (CDC) definitions of long COVID (Fig. 1A).23,24 Analysing cross-sectional clinical data, 250/719 (35%) felt fully recovered (“Recovered”) and the remaining 469 (65%) reported symptoms consistent with long COVID (Fig. 1B; Table 1).
Using a multivariate penalised logistic regression (PLR) model to explore the associations between clinical covariates, immune mediators and symptoms, we found women were more likely to experience all symptoms, and this effect was largest for GI (odds ratio, OR=1.13) and cardiorespiratory symptoms (OR=1.17; Fig. 1C–G). Confidence intervals cannot be appropriately derived from PLR analysis and are not reported; instead, repeated cross-validation was used to estimate the uncertainty associated with PLR outputs (Methods and Supplementary). Pre-existing conditions that might predispose to symptom outcomes (e.g., chronic lung disease in the case of cardiorespiratory symptoms; Supplementary Table 1) were associated with all symptoms, except GI. Age and acute disease severity were not associated with any symptom. We did not include ethnicity as a covariate because it is not an independent risk factor in this cohort.16
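For readers less familiar with penalised regression, the following minimal sketch illustrates how an odds ratio is obtained from a penalised logistic model and how the penalty shrinks it towards 1. It is not the analysis pipeline used in this study: the L2 (ridge) penalty, the gradient-descent fitting routine, the function names and the simulated data are all assumptions made for illustration only.

```python
import math
import random


def fit_penalised_logistic(X, y, l2=1.0, lr=0.5, n_iter=400):
    """Fit logistic regression with an L2 (ridge) penalty by gradient descent.

    X is a list of rows of standardised features; y is a list of 0/1 outcomes.
    The intercept is left unpenalised. Returns (intercept, coefficients).
    The choice of an L2 penalty is an assumption for this sketch.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, grad = 0.0, [0.0] * p
        for row, target in zip(X, y):
            z = b0 + sum(b * x for b, x in zip(beta, row))
            z = max(-30.0, min(30.0, z))  # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - target
            g0 += err
            for j in range(p):
                grad[j] += err * row[j]
        b0 -= lr * g0 / n
        for j in range(p):
            beta[j] -= lr * (grad[j] + l2 * beta[j]) / n
    return b0, beta


def odds_ratios(beta):
    """An odds ratio per one-unit increase is exp(coefficient)."""
    return [math.exp(b) for b in beta]


# Simulated (hypothetical) data: one mediator truly linked to the outcome,
# one unrelated covariate. No study data are used here.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    marker = rng.gauss(0, 1)
    unrelated = rng.gauss(0, 1)
    p_true = 1.0 / (1.0 + math.exp(-0.8 * marker))
    y.append(1 if rng.random() < p_true else 0)
    X.append([marker, unrelated])

_, beta_weak = fit_penalised_logistic(X, y, l2=1.0)
_, beta_strong = fit_penalised_logistic(X, y, l2=100.0)
or_weak = odds_ratios(beta_weak)
or_strong = odds_ratios(beta_strong)
```

In this toy setting the odds ratio for the simulated mediator stays above 1 under both penalties but is pulled closer to 1 when the penalty is larger, which is one reason odds ratios from penalised models tend to look modest in magnitude.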
Myeloid inflammation and complement activation are common to all long COVID symptoms
To study the association of peripheral inflammation with symptoms, 368 immune mediators were measured from plasma and included as covariates. Mediators suggestive of myeloid inflammation were associated with all symptoms (Fig. 1C–G). Elevated IL1R2 and/or Matrilin-2 (MATN2) were consistently associated with the highest odds of all symptoms, except cognitive impairment where the effect was smaller (cardiorespiratory IL1R2 OR=1.16; fatigue IL1R2 OR=1.53; anxiety/depression IL1R2 OR=1.13; GI MATN2 OR=1.08; cognitive MATN2 OR=1.03). IL1R2 is expressed by monocytes and macrophages, modulating IL-1 inflammation.25 MATN2 is an extracellular matrix (ECM) protein which promotes inflammation by activating toll-like receptors and enhancing monocyte infiltration into tissues.26,27 CSF3 (G-CSF, which promotes neutrophilic inflammation) was elevated in fatigue (OR=1.12), GI symptoms (OR=1.05) and anxiety/depression (OR=1.05; Fig. 1D–F).28 Increased levels of IL-6 were associated with cardiorespiratory symptoms (OR=1.06) and fatigue (OR=1.09).
Elevated Collectin-12 (COLEC12) was also associated with cardiorespiratory symptoms (OR=1.12), anxiety/depression (OR=1.06) and fatigue (OR=1.21; Fig. 1C–E). COLEC12 can initiate inflammation in tissues by activating the alternative complement pathway.29,30 Whilst COLEC12 was not associated with GI symptoms and only weakly associated with cognitive impairment (OR=1.02), C1QA was associated with these symptoms (Fig. 1F&G). C1QA is a component of the complement system, indicating activation via the classical pathway.31 Notably, C1QA was associated with the second highest odds of cognitive impairment (OR=1.04), and has been implicated in the pathogenesis of chronic neuroinflammation in Alzheimer’s disease.32 Although subtle differences were observed between symptom groups, our findings demonstrate myeloid inflammation and complement activation in all long COVID phenotypes.
We used the CDC and NICE definition for long COVID (>4 weeks after acute COVID-19) in our analyses.23,24 However, the World Health Organisation (WHO) defines long COVID as symptoms occurring 3 months post-infection.33 We therefore repeated our analysis using samples and clinical data collected after 3 months (median 6.1 months; IQR 5.1–6.8; range 3.0–8.3; n=659; recovered=233 [35%]). Inflammatory associations with long COVID symptoms were consistent with our original analysis, indicating that the profiles identified in our cohort were representative of long COVID after hospitalisation using three commonly used definitions (Extended Data Fig. 1A–G).
To further validate the findings from PLR analysis, we examined the distribution of data, prioritising proteins that were associated with the highest odds of each symptom (Fig. 1H–L and Extended Data Fig. 2). Each protein was significantly elevated in the symptom group compared to recovered, confirming the patterns identified by PLR. Unadjusted PLR models and alternative regression approaches (Partial Least Squares; PLS) were also used to confirm the validity of our findings (Supplementary Table 2 and Extended Data Fig. 3,4). Results from these approaches confirmed the associations of female sex and comorbidities with outcome, as well as the association between myeloid inflammation, complement and symptoms. Notably, the standard errors of PLS estimates were wide, consistent with the literature reporting PLR as the optimal method to analyse multiple mediators which may correlate due to their combined effects.34 Since we aimed to understand how inflammatory proteins work together to mediate symptoms, we prioritised PLR results to draw conclusions.
Biomarker discovery was not our goal, and the marked overlap in mediator levels when viewed unidimensionally indicates these markers are not useful on an individual basis for diagnosis (Fig. 1H–M). Importantly, we did not find differences in C-reactive protein (CRP) levels between groups, measured contemporaneously by hospital laboratories (Table 1). Fibrinogen levels during acute COVID-19 have recently been associated with cognitive deficits post-COVID.35,36 We similarly found that elevated fibrinogen was evident in long COVID (p=0.0077), suggesting that elevated fibrinogen in both the acute and post-acute phase associates with long COVID symptoms (Extended Data Fig. 1H). Given the interaction between complement activation and thrombosis, elevated fibrinogen supports our observation of complement pathway activation.37
GI symptoms and cognitive impairment are associated with different patterns of inflammation
Whilst the protein signatures of individuals with cardiorespiratory symptoms, fatigue and anxiety/depression (the most common combination, n=88) were similar, specific proteins were raised in those with GI symptoms and cognitive impairment (Fig. 1F,G). Elevated Dipeptidyl peptidase 10 (DPP10) and Secretogranin 3 (SCG3) were observed in the GI group (DPP10 OR=1.07; SCG3 OR=1.06). DPP10 can modulate tissue inflammation, and increased DPP10 expression is associated with Ulcerative Colitis, suggesting that GI symptoms may result from enteric, as well as systemic, inflammation.38,39 Elevated SCG3 suggests disturbance of the brain-gut axis, as observed in patients with irritable bowel syndrome.40
Cognitive impairment was associated with elevated Neurofascin (NFASC; OR=1.05), Spondin-1 (SPON-1; OR=1.03) and Iduronate sulfatase (IDS; OR=1.04) (Fig. 1G,L). NFASC and SPON-1 regulate neural growth,41,42 whilst IDS is an ECM enzyme supporting tissue turnover and enabling leucocyte infiltration into tissues.43,44 The combination of these proteins with elevated C1QA suggests neuroinflammation and alterations in nerve tissue repair (i.e., neurodegeneration). Taken together, our findings indicate that complement activation and myeloid inflammation are common to all long COVID cases, but subtle differences in those with GI and cognitive symptoms may have mechanistic significance.
Given our observations of elevated C1QA and the recent identification of acute fibrinogen as a biomarker of post-COVID cognitive impairment,36 we analysed fibrinogen specifically in this group. We found that median fibrinogen levels were higher at 6 months in those with cognitive impairment, though this difference was not statistically significant (p=0.07; Extended Data Fig. 1I).
To explore the relationship between inflammatory mediators associated with different long COVID symptoms, we performed a network analysis of those mediators highlighted by PLR within each symptom group. COLEC12 and MATN2 showed high centrality compared to other mediators in the Cardiorespiratory, Fatigue and Anxiety/Depression groups (Fig. 2A–C & Extended Data Fig. 5A–C). Both mediators correlated with pro-inflammatory proteins (e.g., IL1R2, IL-12B [also known as IL-12/23p40], IL-6, CD276, CD4, DPP10) and markers of endothelial and mucosal inflammation (e.g., TGFA, TFF2, ISM1, ANGPTL2), suggesting roles in tissue-specific long COVID inflammation. Similarly, MATN2 and the pro-inflammatory protein TNFRSF11B were central to inflammation in the GI group (Extended Data Fig. 5D). However, SCG3 correlated less closely with mediators in this group, suggesting that alterations in the brain-gut axis may contribute separately to symptoms (Fig. 2D). SPON-1 was the most central mediator in those with cognitive impairment, further highlighting the possibility that neurodegenerative processes may occur in these individuals (Fig. 2E & Extended Data Fig. 5E). Taken together, these findings support the central role of complement and myeloid inflammation in long COVID but suggest additional processes may contribute towards GI symptoms and cognitive impairment.
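The notion of centrality used above can be illustrated with a minimal sketch. The correlation measure, edge threshold and centrality metric used in the study are not specified in this section, so the example below assumes Pearson correlation, an absolute-correlation cut-off of 0.5 and degree centrality; the mediator labels and values are hypothetical and do not correspond to study data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def degree_centrality(levels, threshold=0.5):
    """Connect mediators whose |r| >= threshold; return normalised degree.

    levels maps a mediator label to its measurements across participants.
    The threshold is an assumption chosen for this sketch.
    """
    names = list(levels)
    degree = {name: 0 for name in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if abs(pearson(levels[u], levels[v])) >= threshold:
                degree[u] += 1
                degree[v] += 1
    denom = len(names) - 1
    return {name: d / denom for name, d in degree.items()}


# Hypothetical mediators: "hub" is the sum of three mutually uncorrelated
# signals, so it correlates with each of them while they do not correlate
# with one another.
m1 = [1, -1, 1, -1, 1, -1, 1, -1]
m2 = [1, 1, -1, -1, 1, 1, -1, -1]
m3 = [1, 1, 1, 1, -1, -1, -1, -1]
hub = [a + b + c for a, b, c in zip(m1, m2, m3)]

centrality = degree_centrality({"hub": hub, "m1": m1, "m2": m2, "m3": m3})
most_central = max(centrality, key=centrality.get)
```

Here the hypothetical "hub" mediator is linked to every other node and therefore has the highest centrality, which is the sense in which a mediator that correlates with many others can be described as central to a network.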
Elevated sCD58 is associated with recovery
Elevated sCD58 was associated with lower odds of all long COVID symptoms and this was most pronounced for cardiorespiratory symptoms (OR=0.79; Fig. 1C,M). sCD58 is an immunoregulatory factor, known to suppress IL-1 and IL-6 dependent interactions between CD2+ monocytes and CD58+ (lymphocyte-function antigen 3) T/NK cells.45,46 Since we observed markers of monocytic inflammation in all symptom groups and elevated IL-6 in those with fatigue and cardiorespiratory symptoms, the association of sCD58 and recovery supports the central role of myeloid inflammation in long COVID.
Elevated markers of tissue repair, including Delta/notch-like EGF repeat (DNER; OR=0.82), were also associated with reduced risk of all symptoms (Fig. 1C–G). Notably, elevated IDS was associated with recovery compared to all symptom groups, except cognitive impairment where the inverse was true. IDS maintains tissues by preventing accumulation of ECM proteoglycans and facilitating leucocyte entry.43,44 IDS may have divergent functions in different tissue environments, for example supporting lung tissue repair to prevent respiratory symptoms, whilst promoting neuroinflammation and thus cognitive impairment. Our data suggest immunosuppressive factors and a robust tissue repair response may prevent symptoms after COVID-19, supporting the use of anti-inflammatory agents in therapeutic trials.47
Women who experience long COVID have higher inflammatory markers
We next sought to understand inflammatory responses in women, who were more likely to experience long COVID, in keeping with previous studies (Fig. 1C–G; Table 1).16,18 Since oestrogen can influence immunological responses,48 we compared protein levels between men and women younger and older than 50 years to discriminate between pre- and post-menopausal women (Fig. 3A–E). IL1R2 and MATN2 were significantly higher in women >50 years with cardiorespiratory symptoms (IL1R2 p=0.0002; MATN2 p<0.0001), fatigue (IL1R2 p=0.0003; MATN2 p=0.012) and anxiety/depression (IL1R2 p=0.0003; MATN2 p=0.012). Oestrogen-dependent differences would be expected to be most pronounced in pre-menopausal women,49 but this was not observed. Women have been reported to have stronger innate immune responses to infection48,50 and are at greater risk of autoimmunity,48 possibly explaining our findings.
Examining proteins associated with GI symptoms, there were no significant differences seen between men and women (Extended Data Fig. 6). In the cognitive impairment group, IDS was significantly higher in pre-menopausal women (p=0.02), though this effect was lost in the post-menopausal group. IDS is X-linked, which may partially explain these differences.51 Overall, our analysis suggests non-hormonal differences in immune responses explain why women are more likely to experience long COVID. These findings require confirmation in adequately powered studies but have potential clinical implications, suggesting anti-inflammatory therapies might be most beneficial for women.
Systemic inflammation in long COVID is not related to the upper respiratory tract
We next sought to understand mechanisms driving long COVID inflammation, focussing on the cardiorespiratory group as the most common phenotype. Given the correlations observed between MATN2 and markers of mucosal inflammation in individuals with cardiorespiratory symptoms (Fig. 2A), we considered local inflammation in the respiratory tract as a possible cause. We analysed nasosorption samples from 88 adults within our cohort and 25 healthy controls (Supplementary Table 3). Several inflammatory markers were elevated in the upper respiratory tract post-COVID, including IL-1α (Fig. 4A). However, there was no difference between those recovered (n=31) and those not (n=33) (Fig. 4B). In adults with only cardiorespiratory symptoms (n=29), inflammatory mediators elevated in plasma were not elevated in the upper respiratory tract (Extended Data Fig. 7A–F). Furthermore, there was no correlation between mediator levels at different sites (Extended Data Fig. 7G–L). This exploratory analysis suggests that upper respiratory tract inflammation is not associated with cardiorespiratory symptoms.
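As an illustration of how agreement between mediator levels at two sites can be assessed, the sketch below computes a rank (Spearman) correlation between paired measurements. The correlation method used in the study is not stated in this section, so the use of Spearman's coefficient is an assumption, and the paired values are hypothetical rather than study data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the average ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical paired measurements from five participants.
plasma = [1.0, 2.0, 3.0, 4.0, 5.0]
nasal_tracking = [10.0, 20.0, 40.0, 80.0, 160.0]  # rises with plasma
nasal_unrelated = [2.0, 5.0, 3.0, 1.0, 4.0]       # no monotonic relationship

rho_tracking = spearman(plasma, nasal_tracking)
rho_unrelated = spearman(plasma, nasal_unrelated)
```

A coefficient near 1 would indicate that nasal levels track plasma levels, whereas a coefficient near 0, as in the second hypothetical pairing, corresponds to the absence of a monotonic relationship between sites.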
Long COVID is associated with stronger antibody responses but not persistent sputum antigen
We next considered that SARS-CoV-2 persistence in lung tissue might explain the inflammatory profiles observed in those with cardiorespiratory symptoms. We performed an exploratory analysis of SARS-CoV-2 antigens (S and N) in sputum from a subgroup of 23 adults with cardiorespiratory symptoms at 6 months. Sputum from 17 recovered adults and pre-pandemic bronchoalveolar lavage fluid were analysed as controls (Supplementary Table 3). Although low concentrations of N antigen were detected in 4 samples, there was no difference between those with symptoms and those recovered (Fig. 4C). S antigen was undetectable in all sputum samples.
Our findings do not exclude persistence, which is most likely evident from tissue samples.52,53 We therefore examined SARS-CoV-2-specific antibody levels, which might respond to viral reservoirs, in a subgroup of unvaccinated individuals. Consistent with previous reports, we found stronger SARS-CoV-2-specific IgG responses in individuals with persistent symptoms (Fig. 4D–H).52 Both anti-S and -N IgG responses were higher in the Cardiorespiratory (S p=0.0040, Fig. 4D; N p=0.023, Fig. 4E) and Fatigue groups (S p=0.0030, Fig. 4F; N p=0.010, Fig. 4G), relative to Recovered. Anti-S (p=0.0098, Fig. 4H) but not -N (p=0.054, Fig. 4I) IgG was elevated in the Anxiety/Depression group. We did not have sufficient data to assess responses in Cognitive impairment and GI groups.
Overall, we demonstrate complement and myeloid associated inflammation in long COVID alongside elevated antibody titres, providing insights into disease mechanisms and aetiology.