Impaired immune signaling and changes in the lung microbiome precede secondary bacterial pneumonia in COVID-19

Secondary bacterial infections, including ventilator-associated pneumonia (VAP), lead to worse clinical outcomes and increased mortality following viral respiratory infections including in patients with coronavirus disease 2019 (COVID-19). Using a combination of tracheal aspirate bulk and single-cell RNA sequencing (scRNA-seq) we assessed lower respiratory tract immune responses and microbiome dynamics in 28 COVID-19 patients, 15 of whom developed VAP, and eight critically ill uninfected controls. Two days before VAP onset we observed a transcriptional signature of bacterial infection. Two weeks prior to VAP onset, following intubation, we observed a striking impairment in immune signaling in COVID-19 patients who developed VAP. Longitudinal metatranscriptomic analysis revealed disruption of lung microbiome community composition in patients with VAP, providing a connection between dysregulated immune signaling and outgrowth of opportunistic pathogens. These findings suggest that COVID-19 patients who develop VAP have impaired antibacterial immune defense detectable weeks before secondary infection onset.


Introduction 61
Secondary bacterial pneumonia results in significant morbidity and mortality in patients 62 with viral lower respiratory tract infections (LRTI) 1 . This problem was evident in the 1918 influenza 63 pandemic during which the majority of deaths were ultimately attributed to secondary bacterial 64 pneumonia 2 . SARS-CoV-2 infection, like influenza, confers an increased risk of late onset 65 secondary bacterial infection, often manifesting as ventilator-associated pneumonia (VAP) 3,4 . 66 Marked heterogeneity exists, however, with respect to the risk of VAP in patients with coronavirus 67 disease 2019 (COVID-19), with incidence ranging from 12-87% in published cohort studies 5-8 . 68 The mechanisms underlying VAP susceptibility in COVID-19 remain unknown. Animal 69 models of influenza may provide some insight, suggesting a role for interferon-mediated 70 suppression of cytokines essential for bacterial defense, including those important for neutrophil 71 recruitment, antimicrobial peptide production and the Th17 response 9-11 . The impact of interferon 72 signaling on antibacterial host defense, however, varies based on bacterial pathogen, and is not 73 well studied in humans 12 . 74 Much of our understanding of VAP immunobiology derives from sepsis studies, which 75 describe a state of immune reprogramming in patients who go on to develop nosocomial 76 infections 13-16 . This state is characterized by impaired monocyte, neutrophil and CD4 lymphocyte 77 function, as well as decreased responsiveness of pathogen recognition receptors and 78 downstream signaling pathways 13-16 . Severe COVID-19, like sepsis, is characterized by a 79 dysregulated host response to infection, raising the possibility that similar mechanisms may 80 contribute to the development of VAP in patients infected with SARS-CoV-2. 81 In addition to host immune elements, microbial factors contribute to VAP pathogenesis. 82 Mechanically ventilated patients endure prolonged exposure to the hospital environment, which 83 increases risk of colonization by opportunistic pathogens in the upper airway, oropharynx, and 84 lungs 14,16 . Micro-aspiration or direct invasion may subsequently lead to lower respiratory tract 85 disease and clinical onset of VAP 16,17 . Further, the common practice of early empiric antimicrobial 86 administration in critically ill patients can lead to disruption of the lung microbiome, which has 87 been associated with the development of VAP 18,19 . 88 Despite their interconnected roles, no studies to date have simultaneously profiled host 89 immune responses and lung microbiome dynamics in the context of VAP. Further, the question 90 of whether host responses to SARS-CoV-2 or other viral infections may contribute to this 91 dysbiosis, leading to VAP, remains unanswered. 92 Given the marked heterogeneity in VAP incidence among patients with COVID-19 5-8 , as 93 well as gaps in mechanistic understanding of secondary bacterial pneumonia, we sought to 94 assess the molecular determinants of VAP in patients infected with SARS-CoV-2. We employed 95 a systems biology approach involving immunoprofiling the host transcriptional response and 96 simultaneously assessing lung microbiome dynamics, using a combination of bulk and single cell 97 RNA sequencing and extensive clinical phenotyping. We observed a striking impairment in 98 antibacterial immune signaling at the time of intubation, that correlated with disruption of the lung 99 microbiome, weeks before the onset of VAP. 100 Patients with VAP were adjudicated using the United States Centers for Disease Control 112 (CDC) definition 20 , including a requirement for a positive bacterial TA culture (n=10). We defined 113 onset of VAP as the first day a patient developed any of the criteria used to meet the definition, 114 in accordance with CDC guidance 20 . Patients who did not meet the CDC criteria for VAP, and for 115 whom there was no sustained clinical suspicion for bacterial pneumonia during the admission, 116 were adjudicated as No-VAP (n=13). Patients who met CDC VAP criteria but had negative 117 bacterial TA cultures, patients who were clinically suspected of having VAP with either positive or 118 negative TA cultures, and patients who met clinical pneumonia criteria on intubation were 119 excluded from this study (Figure 1). 120 We compared lower respiratory tract host transcriptional responses between the VAP and 121 No-VAP groups at two timepoints. "Early" timepoint TA samples were collected a median of two 122 days post-intubation and 17 days before VAP onset (range: 15-38 days) for bulk RNA-seq 123 analyses or 10 days before VAP onset (range: 8-16 days) for scRNA-seq analyses. "Late" 124 timepoint samples were collected a median of two days before VAP onset (range: 2-3 days) for 125 bulk RNA-seq analyses and three days before VAP onset (range: 2-4 days) for scRNA-seq 126 analyses, and compared against samples collected from No-VAP patients at similar timepoints 127 post-intubation (Figure 1, Table 1). No samples were analyzed after the onset of VAP. We 128 additionally evaluated eight intubated patients with non-pneumonia illnesses as controls at the 129 "early" timepoint. 130 We considered that sex as a biologic variable is important for acute COVID-19 severity, 131 and thus asked whether differences in sex might explain our observations. However, no 132 differences in sex were present between VAP and No-VAP groups (P=1, early and late timepoints) 133 (Table 1). There were also no significant differences between groups with respect to age, race or 134 ethnicity (Table 1, S1, S2). In addition, there were no differences between groups with respect to 135 in-hospital receipt of any immunosuppressant or antibiotics prior to sample collection (Table S1, 136 S2, S3). 137 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. We began by assessing the lower respiratory host transcriptional response at the "late" 140 timepoint, just before VAP onset. We carried out differential gene expression analysis on TA bulk 141 RNA-seq data from five patients who developed VAP and eight No-VAP patients (Table 1). We 142 identified 436 differentially expressed genes at a False Discovery Rate (FDR) < 0.1 (Figure 2A) 143 and performed gene set enrichment analysis (GSEA) ( Figure 2B). 144 As expected, the VAP group exhibited upregulation of pathways related to antibacterial 145 immune responses in the days preceding secondary infection, such as neutrophil degranulation, 146 toll-like receptor signaling, cytokine signaling and antigen presentation ( Figure 2B). We also 147 noted that interferon alpha/beta signaling was upregulated. Ingenuity pathway analysis (IPA) 148 predicted broad activation of upstream inflammatory cytokines (including IFNa and IFNg) in the 149 VAP group at this "late" timepoint ( Figure 2C). 150 151 COVID-19 patients who develop VAP have attenuated immune signaling two weeks before 152

VAP onset 153
Given our findings of a unique lower respiratory transcriptional signature in the days 154 preceding VAP onset, we next asked whether differences in host immune signaling might exist 155 even earlier, two or more weeks before clinical diagnosis of VAP. We hypothesized that such 156 differences might provide insight regarding the increased susceptibility to secondary bacterial 157 infection in these patients. 158 We identified 154 differentially expressed genes at the "early" timepoint (Table 1). The 159 patients who developed VAP exhibited lower expression of several genes with established roles 160 in innate immunity including IFI30, MMP2, TLR9, and DEFB124 ( Figure 3A). GSEA further 161 revealed that patients who developed VAP had lower expression of pathways related to 162 antibacterial immune responses including neutrophil degranulation, toll-like receptor signaling, IL-163 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021. ; https://doi.org/10.1101/2021.03.23.21253487 doi: medRxiv preprint 17 signaling, antigen presentation and complement pathways, and higher expression of 164 interferon-alpha/beta signaling pathways, more than two weeks before the onset of VAP ( Figure  165 3B). Additionally, pathways related to adaptive immunity such as T and B cell receptor signaling 166 were downregulated in the COVID-19 patients who later developed VAP ( Figure 3B). 167 To gauge the degree of immune pathway suppression compared to controls, we 168 performed a similar analysis on critically ill intubated patients without infection ( Figure 3C). 169 Relative to the control patients, multiple antibacterial immune pathways were downregulated in 170 both COVID-19 groups, with the greatest attenuation in the VAP group ( Figure 3C). IPA upstream 171 cytokine analysis predicted reduced activation of multiple proinflammatory cytokines juxtaposed 172 against increased activation of IFNB1 in the VAP patients ( Figure 3D). Compared to the control 173 group, the predicted activation states of several proinflammatory cytokines were decreased in 174 both COVID-19 groups ( Figure S2). 175 Given prior reports demonstrating a correlation between SARS-CoV-2 viral load and 176 interferon-related gene expression 21 we next asked whether viral load differed between VAP and 177 No-VAP patients. At the "early" timepoint, no differences were observed in either PCR cycle 178 threshold or bulk RNA-seq SARS-CoV-2 reads per million (P=0.53 and P=0.84, respectively) 179 ( Figure S3). We also considered the possibility that differences in the number of days of steroid 180 exposure prior to sample collection might explain our findings but found no differences between 181 groups (P=0.34) (Table S1, S3). 182

and macrophages 185
To further understand the observed downregulation of key antibacterial signaling 186 pathways in COVID-19 patients who developed VAP, we next asked whether this was driven by 187 specific types of immune cells. We performed scRNA-seq on TA specimens and enriched for 188 immune cells using CD45 selection (Methods). We analyzed 12,197 cells from 14 patients after 189 quality control and batch correction. First, we analyzed samples obtained post-intubation at the 190 "early" timepoint. Clustering based upon cellular transcriptional signatures followed by a 191 comparison of cell type proportions did not reveal statistically significant differences between VAP 192 and No-VAP groups with respect to any cell type at the "early" timepoint ( Figure 4A, S4). 193 Monocytes and macrophages were the most abundant cell types, and thus we focused 194 transcriptional assessment on these two populations. 195 COVID-19 patients who developed VAP had distinct post-intubation transcriptional 196 signatures compared to the No-VAP patients at this "early" timepoint, with respect to the overall 197 monocyte/macrophage cluster (Figure 4, S4, S5). We identified 625 differentially expressed 198 genes (FDR < 0.05) in this cluster and noted downregulation of several genes with key roles in 199 innate immunity in the COVID-19 patients who subsequently developed VAP compared to those 200 who did not, including IL1Rn, CXCL2, CCL20, and JunB. (Figure 4C). In addition, we noted 201 upregulation of interferon-induced genes including IFI30 and IFNγR-2. (Figure 4C). Findings were 202 similar when evaluating differential expression within alveolar macrophages, the largest 203 monocyte/macrophage subset ( Figure S5). 204 IPA canonical pathway analysis revealed downregulation of several cytokine and innate 205 immune signaling pathways in the patients who later developed VAP at the "early" post-intubation 206 timepoint, including IL-6 and IL-1 signaling, and the NRF2-mediated oxidative stress response 207 ( Figure 4D). Computational analysis of upstream cytokine activation predicted impaired activation 208 of multiple pro-inflammatory cytokines in patients who developed VAP, including TNF, IFNγ, and 209 IL1B ( Figure 4E). Similar findings were observed in the alveolar macrophage population ( Figure  210 S5). We were unable to reliably assess the T cell population due to low cell numbers. 211 A comparison of monocytes and macrophages between the VAP and No-VAP patients 212 just prior to VAP onset at the "late" timepoint revealed significant upregulation of genes essential 213 for antibacterial defense. These findings reflected bulk RNA-seq observations and were 214 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.  We next investigated temporal dynamics of the lower airway host immune response in 219 COVID-19 patients from the time of intubation to development of VAP, by evaluating bulk RNA-220 seq differential gene expression between COVID-19 VAP patients at the "early" timepoint (median 221 of 17 days before VAP onset, n=4) versus "late" timepoint (median of two days before VAP onset, 222 n=5). We identified 2705 differentially expressed genes, and unsupervised hierarchical clustering 223 of the 50 most significant genes demonstrated clear separation of the two timepoints ( Figure 5A). 224 GSEA revealed that type I interferon signaling was notably downregulated at the "late" timepoint 225 in comparison to the "early" timepoint ( Figure 5B); however, expression of this pathway was still 226 significantly higher in the VAP versus the No-VAP patients ( Figure 2B). Several other immune 227 signaling pathways were more highly expressed at the "late" timepoint, presumably reflecting 228 activation of an antibacterial response in the setting of impending bacterial pneumonia ( Figure  229 5B). Consistent with this, upstream regulator analysis predicted increased activation of several 230 pro-inflammatory cytokines and decreased IFNa and IFNl signaling at the "late" versus "early" 231 timepoints ( Figure 5C). 232 In contrast, comparing No-VAP patients at the "early" (n=8) versus "late" (n=8) timepoints 233 yielded only two genes, both of which were interferon-stimulated genes (RSAD2 and CMPK2) 234 downregulated at the "late" timepoint, suggesting that while the host response was relatively 235 unchanged in these patients, the antiviral response attenuated over time. Indeed, GSEA revealed 236 that type I interferon signaling, and other antiviral immune pathways, were downregulated in the 237 patients who did not develop VAP at the "late" timepoint ( Figure S7). 238 We further compared dynamics of host immune responses between VAP and No-VAP 239 patients by performing longitudinal analyses of key immune signaling pathways, including all 240 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021.
patients with available bulk RNA-seq samples (VAP: n=7, No-VAP: n=10). Onset of VAP ranged 241 from 10-39 days post intubation (median 19 days), and treatment with immunosuppressants did 242 not differ significantly between comparator groups (p=0.30, Fisher's exact test). Early attenuation 243 of immune signaling in the VAP group was conspicuous, and this pattern eventually resolved later 244 in disease course by the time secondary bacterial infection became established (Figures 5D-G). 245 We confirmed that the observed differences between VAP and No-VAP patients were not driven 246 by differences in treatment with immunosuppressants by comparing pathway Z-scores between 247 patients who received immunosuppressants and those who did not, at the "early" timepoint 248 ( Figure S8). 249 We subsequently performed a similar comparison between the "early" and "late"

Lung microbiome disruption precedes VAP in COVID-19 patients 257
We hypothesized that the innate immune suppression observed in patients who developed 258 VAP would correlate with SARS-CoV-2 viral load. Using TA metatranscriptomics to assess the 259 lower respiratory microbiome, we evaluated longitudinal changes in SARS-CoV-2 abundance. 260 Although no difference in viral load was observed at the "early" timepoint ( Figure S3, Figure 6A), 261 the trajectory of SARS-CoV-2 viral load differed significantly in patients who developed VAP. Viral 262 load decreased over time in both groups, however the virus was still detected at later timepoints 263 in patients with VAP, while it was undetected in any of the No-VAP patients at these timepoints 264 ( Figure 6A). This result suggested that COVID-19 patients who develop VAP may exhibit 265 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Diversity Index, P=0.0065; Figure 6B). Beta diversity did not differ significantly, which may be 270 due to the fact that VAP was caused by different pathogens in each patient, resulting in different 271 airway microbiome compositions in each case ( Figure S9). All patients received antibiotics prior 272 to collection of the first sample, suggesting that antibiotic use was not driving these differences 273 (Table S1). 274 We next assessed the most abundant bacterial genera in the lung, as identified by TA 275 metatranscriptomics. At the "early" timepoint, control and No-VAP samples were predominated 276 by common lung commensals, with significant enrichment (FDR < 0.05) of Alloprevotella, 277 Fusobacterium, and Neisseria compared to the VAP group ( Figure 6C). In the VAP group, early 278 significant enrichment of Klebsiella was noted, suggestive of a shift in the microbiome structure 279 that precedes disease onset and culture positive confirmation ( Figure 6C). By the "late" timepoint, 280 VAP-associated pathogens such as Citrobacter, Escherichia, and Klebsiella were highly 281 prevalent. Citrobacter was significantly enriched in VAP versus No-VAP groups at this timepoint, 282 supporting culture results. While highly enriched in VAP, Klebsiella did not reach significance at 283 this timepoint due to high variance between individuals ( Figure 6D). No-VAP patients had a 284 significantly greater abundance of commensal lung microbes, such as Lactobacillus, 285 Bifidobacterium, and Veillonella, at the "late" timepoint ( Figure 6D). 286 287

Discussion 288
Secondary bacterial pneumonia contributes to significant morbidity and mortality in 289 patients with primary viral lower respiratory tract infections 1,3 , but mechanisms governing 290 individual susceptibility to VAP have remained unclear. Few human cohort studies have evaluated 291 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021. ; https://doi.org/10.1101/2021.03.23.21253487 doi: medRxiv preprint the immunologic underpinnings of VAP, and none have been reported in the context of COVID-292 19, which is characterized by a dysregulated host response distinct from other viral pneumonias 21-293 23 . To address this gap and probe mechanisms of VAP susceptibility in patients with COVID-19, 294 we carried out a systems biological assessment of host and microbial dynamics of the lower 295 respiratory tract. 296 Two days before VAP onset, a transcriptional signature consistent with bacterial infection 297 was observed. This finding suggests that host response changes can occur before clinical 298 recognition of pneumonia, highlighting the potential utility of the host transcriptome as a tool for 299 VAP surveillance. While intriguing, this observation did not provide an explanation for differential 300 susceptibility of some COVID-19 patients to post-viral pneumonia. 301 The discovery of an early suppressed antibacterial immune response in patients who later 302 developed VAP did, however, offer a potential explanation. More than two weeks before VAP 303 onset, we observed a striking suppression of pathways related to both innate and adaptive 304 immunity, including neutrophil degranulation, toll-like receptor (TLR) signaling, complement 305 activation, antigen presentation, and T cell receptor and B receptor signaling, as well as signaling 306 by several cytokines (e.g. IL-1, IL-4, IL-12, IL-13 and IL-17). Comparison against uninfected, 307 intubated controls confirmed the previously described paradoxical impairment in immune 308 signaling found in patients with severe COVID-19 22 , and suggested that VAP susceptibility may 309 be the result of disproportionate suppression of innate and adaptive pathways critical for 310 antibacterial defense, resulting in enhanced susceptibility to opportunistic secondary infections. 311 Animal models of influenza have provided insight into potential mechanisms of post-viral 312 pneumonia, although none have provided insight regarding why some individuals are more 313 susceptible than others. In mice inoculated with influenza, for instance, virus-induced type I 314 interferon suppresses neutrophil chemokines and impairs Th17 immunity, compromising effective 315 clearance of bacterial infections 10,11 . Interestingly, we also observed increased type I interferon 316 signaling in COVID-19 patients who weeks later developed VAP, and an impairment in IL-17 317 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021. ; https://doi.org/10.1101/2021.03.23.21253487 doi: medRxiv preprint signaling and other immune pathways. Desensitization to TLR ligands after influenza infection 318 has also been documented 24 , which is congruent with the downregulation of TLR signaling at the 319 time of intubation observed in our bulk RNA-seq analyses. 320 We asked whether certain cell types were responsible for driving the early suppression of 321 immune signaling observed in COVID-19 patients who went on to develop VAP. No significant 322 differences in proportions of any cell types was observed between patients with or without VAP 323 at the time of intubation. This finding suggests that an impairment of immune cell recruitment was 324 not causing these differences, but rather significant gene expression differences within these 325 immune cell populations. 326 In the most predominant cell population, the monocytes and macrophages, we observed 327 we initially hypothesized that differences in viral load between groups might relate to individual 335 VAP susceptibility. However, we found no difference between groups at the "early" timepoint. This observation was corroborated by a prolonged antiviral type I interferon response at the "late" 343 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021. ; https://doi.org/10.1101/2021.03.23.21253487 doi: medRxiv preprint timepoint (median of two days before VAP onset) in patients who developed VAP versus those 344 who did not, pointing to the persistence of suboptimal antiviral immunity in these patients. Early 345 induction of functional SARS-CoV-2 specific T cells is associated with faster viral clearance in 346 COVID-19 patients 25 and likewise, we observed impairments in T cell activation and signaling in 347 the VAP group by bulk RNA-seq at the "early" timepoint, which further suggests a decreased 348 ability to control the virus in these patients. Thus, our results emphasize the need for ongoing vigilance for VAP in patients treated with potent 369 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Sample size is a limitation of this study, which could result in over-fitting to this small set 378 of patients; however, the reproducibility of our observations across both bulk and scRNA-seq 379 analyses and the significant number of differentially expressed genes among the comparator 380 groups support the internal validity of our conclusions. Additional studies in a larger independent 381 cohort will be needed to determine the utility of lower respiratory gene expression biomarkers for 382 predicting future development of VAP. Because this study was limited to critically ill, intubated 383 patients, we were unable to assess early stages of COVID-19, which may provide additional 384 insight regarding determinants of secondary bacterial infection. 385 We were unable to assess whether epithelial cells contributed to VAP risk due to 386 enrichment for immune cells prior to scRNA-seq. However, given that monocytes, macrophages 387 and lymphocytes can both be infected by SARS-CoV-2 and contribute to disease pathophysiology 388 and resolution 30-32 , our findings are apropos. With larger cohorts, the early detection of specific 389 immune pathway suppression and microbiome collapse could be leveraged to develop clinically 390 useful models for identifying COVID-19 patients with increased susceptibility to secondary 391 bacterial pneumonia. 392 393 Materials and Methods 394 395 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Study design, cohorts, and enrollment 396
We conducted a prospective case-control study of adults requiring mechanical ventilation 397 for COVID-19 with or without secondary bacterial pneumonia. We also evaluated control patients 398 requiring mechanical ventilation for other reasons who had no evidence of pulmonary infection 399 (Figure 1) Ventilator-associated pneumonia adjudication 406 A total of 63 adults who required intubation for severe COVID-19 (Cohort 1) and who had 407 available TA samples were considered for inclusion in the study (Figure 1) and were selected because they had previously been adjudicated as having no evidence of lower 420 respiratory tract infection. This group included four patients with acute respiratory distress 421 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. standards. Gene-level counts were generated from the transcript-level abundance estimates 447 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021. ; https://doi.org/10.1101/2021.03.23.21253487 doi: medRxiv preprint using the R package tximport 36 , with the scaledTPM method. Samples retained in the dataset had 448 a total of at least 1,000,000 estimated counts associated with transcripts of protein coding genes. 449 Genes were retained for differential expression analysis if they had counts in at least 30% 450 of samples. Differential expression analysis was performed using the R package DESeq2 37 . We 451 modeled the expression of individual genes using the design formula ~VAPgroup, where VAP 452 groups were "VAP-early", "No-VAP-early", "VAP-late" and "No-VAP-late" and used the results() 453 function to extract a specific contrast. Separate comparisons to the control group were performed 454 using the design formula ~COVID-19-status to compare positive and negative patients. 455 Significant genes were identified using a Benjamini-Hochberg false discovery rate (FDR) 456 < 0.1. We generated heatmaps of the top 50 differentially expressed genes by FDR. For 457 visualization, gene expression was normalized using the regularized log transformation, centered, 458 and scaled prior to clustering. Heatmaps were generated using the pheatmap package. Columns 459 were clustered using Euclidean distance and rows were clustered using Pearson correlation. 460 Differential expression analysis results are provided in (Supplementary Data File 1). 461 462

Pathway analysis 463
Gene set enrichment analyses (GSEA) were performed using the fgseaMultilevel function 464 in the R package fgsea 38 and REACTOME 39 pathways with a minimum size of 10 genes and a 465 maximum size of 1,500 genes. All genes were included in the comparison, pre-ranked by the test 466 statistic. Significant pathways were defined as those with a Benjamini-Hochberg adjusted P-value 467 < 0.05. Ingenuity Pathway Analysis (IPA) Canonical Pathway and Upstream Regulator Analysis 40 468 was employed on genes with P < 0.1 and ranked by the test statistic to identify cytokine regulators. 469 Significant IPA results were defined as those with a Z-score absolute value greater than 2 and an 470 overlap P-value < 0.05. The gene sets in figures were selected to reduce redundancy and 471 highlight diverse biological functions. Full GSEA and IPA results are provided in (Supplementary 472

Data Files 2 and 3). 473
All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.  Raw sequencing reads were aligned to GRCh38 using the STAR aligner . Cell barcodes 498 were then determined based upon UMI count distribution. Read count matrices were generated 499 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
The copyright holder for this this version posted November 30, 2021. ; https://doi.org/10.1101/2021.03.23.21253487 doi: medRxiv preprint through the 10X genomics cellranger pipeline v3.0. Data was processed and analyzed using the 500 Scanpy v1.6 42 . Cells that had <200 genes, <20% mitochondrial reads, and had greater than 2500 501 counts were filtered. Mitochondrial genes were removed and multi-sample integration was 502 performed using single-cell variational inference (scVI) 43 (Supplementary Data Files 5 and 6). 515 516 Lung microbiome analysis 517 RNA from tracheal aspirates was sequenced as described above. Taxonomic alignments 518 were obtained from raw sequencing reads using the IDseq pipeline 45,46 , which performs quality 519 filtration and removal of human reads followed by reference-based taxonomic alignment at both 520 the nucleotide and amino acid level against sequences in the National Center for Biotechnology 521 Information (NCBI) nucleotide (NT) and non-redundant (NR) databases, followed by assembly of 522 reads matching each taxon detected. Taxonomic alignments underwent background correction 523 for environmental contaminants (see below), viruses were excluded, and data was then 524 aggregated to the genus level before calculating diversity metrics. Alpha diversity (Shannon's 525 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Diversity Index) and beta diversity (Bray-Curtis dissimilarity) were calculated and the latter plotted 526 using non-metric multidimensional scaling (NDMS). Comparison of alpha and beta diversity over 527 time between VAP and No-VAP groups was calculated using a two-way analysis of variance 528 (ANOVA) in R. Top ten taxa in each group were identified by ranking the percent of counts for 529 each genus among total bacterial counts in each group. Significantly enriched taxa were 530 evaluated by comparing distributions using three models (Poisson, negative binomial, and zero-531 inflated negative binomial mixed-effects models) and correcting for multiple hypothesis testing 532 using Benjamini-Hochberg as previously described 47 . Significantly enriched taxa were called 533 using a q < 0.05 cutoff. 534 535 Identification and mitigation of environmental contaminants 536 To minimize inaccurate taxonomic assignments due to environmental and reagent derived 537 contaminants, non-templated "water only" and HeLa cell RNA controls were processed with each 538 group of samples that underwent nucleic acid extraction. These were included, as well as positive 539 control clinical samples, with each sequencing run. Negative control samples enabled estimation 540 of the number of background reads expected for each taxon. A previously developed negative 541 binomial model 21 was employed to identify taxa with NT sequencing alignments present at an 542 abundance significantly greater compared to negative water controls. This was done by modeling 543 the number of background reads as a negative binomial distribution, with mean and dispersion 544 fitted on the negative controls. For each batch (sequencing run) and taxon, we estimated the 545 mean parameter of the negative binomial by averaging the read counts across all negative 546 controls, slightly regularizing this estimate by including the global average (across all batches) as 547 an additional sample. We estimated a single dispersion parameter across all taxa and batches, 548 using the functions glm.nb() and theta.md() from the R package MASS 48 . Taxa that achieved a 549 P-value < 0.01 were carried forward. 550 551 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Detailed enrollment and consent protocols for both studies have been previously described 18,33 . 562 563 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

714
Statistical significance was determined using Fisher's exact test (categorical variables) or by Wilcoxon test (continuous variables).

717 718
All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.  patients who developed VAP (yellow) versus those who did not (red) at the "late" timepoint, two 746 days before the onset of VAP, from bulk RNA-seq. B) Gene set enrichment analysis (GSEA) at 747 the "late" timepoint based on differential gene expression analyses. GSEA results were 748 considered significant with an adjusted P-value < 0.05. C) Ingenuity Pathway Analysis (IPA) of 749 upstream cytokines at the "late" timepoint based on differential gene expression analyses. IPA 750 results were considered significant with a Z-score absolute value > 2 and overlap P-value < 0.05. 751 *Denotes cytokines that were included with an overlap P-value < 0.  patients who developed VAP (blue) versus those who did not (green) at the "early" timepoint from 759 bulk RNA-seq. B) Gene set enrichment analysis at the "early" timepoint based on differential gene 760 expression analyses. GSEA results were considered significant with an adjusted P-value < 0.05. 761 C) Expression of GSEA pathways at the "early" timepoint with respect to a baseline of uninfected, 762 intubated controls. Pathways were selected from the GSEA results if they had an adjusted P-763 value < 0.05 in at least one of the comparisons (VAP vs controls or No-VAP vs controls). Pathways 764 with an adjusted P-value < 0.05 when compared to controls are indicated by circles with a black 765 outline. D) Ingenuity Pathway Analysis (IPA) of upstream cytokine activation states at the "early" 766 timepoint based on differential gene expression analyses. IPA results were considered significant 767 with a Z-score absolute value > 2 and overlap P-value < 0.05. *Denotes cytokines that were 768 included with an overlap P-value < 0.  patients with COVID-19. 827 We hypothesize that individual immune responses to SARS-CoV-2 infection may drive a 828 restructuring of the microbial community and increase susceptibility to VAP. Those predisposed 829 to VAP have increased type I interferon responses and dysregulated antibacterial immune 830 signaling characterized by impaired macrophage, neutrophil and T cell activity, decreased toll-like 831 receptor (TLR) signaling and impaired activation of cytokines important for pathogen defense. 832 This state of suppressed immunity may disrupt the lower respiratory tract microbiome, 833 predisposing certain patients to the outgrowth of bacterial pathogens and VAP. Created with 834 BioRender.com. 835