Microbial signatures in the lower airways of mechanically ventilated COVID19 patients associated with poor clinical outcome

Abstract Mortality among patients with COVID-19 and respiratory failure is high and there are no known lower airway biomarkers that predict clinical outcome. We investigated whether bacterial respiratory infections and viral load were associated with poor clinical outcome and host immune tone. We obtained bacterial and fungal culture data from 589 critically ill subjects with COVID-19 requiring mechanical ventilation. On a subset of the subjects that underwent bronchoscopy, we also quantified SARS-CoV-2 viral load, analyzed the microbiome of the lower airways by metagenome and metatranscriptome analyses and profiled the host immune response. We found that isolation of a hospital-acquired respiratory pathogen was not associated with fatal outcome. However, poor clinical outcome was associated with enrichment of the lower airway microbiota with an oral commensal ( Mycoplasma salivarium ), while high SARS-CoV-2 viral burden, poor anti-SARS-CoV-2 antibody response, together with a unique host transcriptome profile of the lower airways were most predictive of mortality. Collectively, these data support the hypothesis that 1) the extent of viral infectivity drives mortality in severe COVID-19, and therefore 2) clinical management strategies targeting viral replication and host responses to SARS-CoV-2 should be prioritized.


Introduction
The earliest known case of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection causing coronavirus virus disease (COVID-19) is thought to have occurred on November 17, 2019 1 . As of February 20, 2021, 110.3 million con rmed cases of COVID-19 and 2.4 million deaths have been reported worldwide 2 . As the global scienti c community rallied in a concerted effort to understand SARS-CoV-2 infections, our background knowledge is rooted in previous experience with the related zoonotic betacoronaviruses Middle East Respiratory Syndrome coronavirus (MERS-CoV) and SARS-CoV-1 that have caused severe pneumonia with 34.4% and 9% case fatality, respectively 3 . As observed for these related coronaviruses, SARS-CoV-2 infection can result in an uncontrolled in ammatory response 4 leading to acute respiratory distress syndrome (ARDS) and multi-organ failure, both associated with increased mortality. While a large proportion of the SARS-CoV-2 infected population is asymptomatic or experiences mild illness, a substantial number of individuals will develop severe disease and require hospitalization, with some progressing to respiratory failure. Mortality among hospitalized  patients is estimated to be approximately 20%, which can go up to 70% among those requiring invasive mechanical ventilation 5-12 . Mortality in other viral pandemics, such as the 1918 H1N1 and 2009 H1N1 in uenza pandemics, has been attributed in part to bacterial co-infection or super-infection 13,14 . To determine if this is also the case for COVID-19, we can use next generation sequencing (NGS) to probe the complexity of the microbial environment (including RNA and DNA viruses, bacteria and fungi) and how the host (human) responds to infection. Recent studies have used this approach to uncover microbial signatures in patients with ARDS. 15,16 Increased bacterial burden and the presence of gut-associated bacteria in the lung were shown to worsen outcomes in these critically ill patients 15,17 , highlighting the potential role of the lung microbiome in predicting outcomes in ARDS. In a recent study using whole genome sequencing to pro le the gut microbiome of 69 patients from Hong Kong, investigators identi ed an increased abundance of opportunistic fungal pathogens among patients with con rmed COVID-19 18 . While there is emerging interest in understanding the microbial environment in patients with SARS-CoV-2 infections, few studies have attempted to characterize this at the primary site of the disease activity: the lower airways 19,20 . Furthermore, no study has yet determined whether microbial differences in the airways of COVID-19 patients could be contributing to the different outcomes in patients receiving mechanical ventilation.
In this investigation, we accessed a large prospective cohort of critically ill patients with SARS-CoV-2 infection who required invasive mechanical ventilation, and from whom bronchoalveolar lavage (BAL) samples were collected. We characterized the lung microbiome of these patients in parallel with analyses of lower airway markers of host immunity. While we did not nd that isolation of a secondary respiratory pathogen was associated with prolonged mechanical ventilation (> 28 days) or fatal outcome, we did identify critical microbial signatures-characterized by enrichment of oral commensals, high SARS-CoV-2 load, and decreased anti-SARS-CoV-2 IgG response-associated with fatal outcome, suggesting a need for more targeted antiviral therapeutic approaches for the care of critically ill COVID19 patients.

Cohort
From March 3rd to June 18th 2020, a total of 589 patients with laboratory-con rmed SARS-CoV-2 infection were admitted to the intensive care units of two academic medical centers of NYU Langone Health in New York (Long Island and Manhattan) and required invasive mechanical ventilation (MV). This included a subset of 142 patients from the Manhattan campus who underwent bronchoscopy for airway clearance and/or tracheostomy from which we collected and processed lower airway samples for this investigation ( Supplementary Fig. 1). Table 1 shows demographics and clinical characteristics of the 142 patients who underwent bronchoscopy divided into three clinical outcomes: survivors with ≤28 Days on MV; survivors with > 28 Days on MV; and deceased. The median post admission follow-up time was 232 days (CI = 226-237 days). Supplementary Tables 1 and 2 compare similar data across all 589 subjects, divided per site and sub-cohorts. Patients at the Manhattan campus who underwent bronchoscopy were younger, had lower body mass index (BMI), and a lower prevalence of chronic obstructive pulmonary disease (COPD; Supplementary Table 1). Among the cohort that provided lower airway samples through bronchoscopy, 37% of the subjects were successfully weaned within 28 days of initiation of MV and survived hospitalization, 39% required prolonged MV but survived hospitalization, and 24% died. Patients within the bronchoscopy cohort had a higher overall survival than the rest of the NYU COVID-19 cohort since most critically ill patients were not eligible for bronchoscopy or tracheostomy. Mortality among those in the no-bronchoscopy cohort was 77%. In the overall NYU cohort, higher age and BMI were associated with increased mortality (Supplementary Table 2). There was a similar, albeit non-signi cant, trend for the bronchoscopy cohort. Among the clinical characteristics of this cohort, patients within the deceased group more commonly had a past medical history of chronic kidney disease and cerebrovascular accident. Study patients were admitted during the rst wave of the pandemic in the US, prior to current standardized management of COVID-19. Within the bronchoscopy cohort, more than 90% of the subjects received hydroxychloroquine and anticoagulation (therapeutic dose), 69% received corticosteroids, 41% received tocilizumab (anti-Interleukin (IL)-6 receptor monoclonal antibody), 21% required dialysis, and 18.9% were started on extracorporeal membrane oxygenation (ECMO) ( Table 1). Antimicrobial therapy included use of antivirals (lopinavir/ritonavir in 16% and remdesivir in 10%), antifungals ( uconazole in 40% and micafungin in 57%), and antibiotics (any, in 90% of the subjects). Among the factors associated with clinical outcome within the bronchoscopy cohort, patients who survived were more commonly placed on ECMO whereas patients who died had frequently required dialysis (Table 1); these trends were also observed across the whole NYU cohort. Neither hydroxychloroquine or azithromycin were signi cantly associated with clinical outcome; however, patients who survived were more frequently treated with the combination antibiotic piperacillin/tazobactam.
Within the rst 48hrs from admission, respiratory bacterial cultures were rarely obtained (n = 70/589, 12%) with very few positive results (n = 12, 17%). Blood cultures were more commonly obtained (n = 353/589, 60%) but the rate of bacterial culture positivity was much lower (n = 5, 1.4%). These data support that community acquired bacterial co-infection was not a common presentation among critically ill COVID-19 patients.
During their hospitalization, most patients had respiratory and/or blood specimens collected for bacterial cultures (Table 1 and Supplementary Table 1). The proportions of positive bacterial respiratory cultures and blood cultures were 73% and 43%, respectively. We evaluated whether respiratory or blood culture results obtained as per clinical standard of care were associated with clinical outcome. Risk analyses for the culture results during hospitalization for the whole cohort (n = 589) demonstrated that bacterial culture positivity was not associated with increased odds of dying but was associated with prolonged mechanical ventilation in the surviving patients (Fig. 1). Since length of stay could potentially affect these results (patients who died could have a shorter hospitalization, and therefore may have had fewer specimens collected for cultures), we repeated the analysis using culture data obtained during the rst two weeks of hospitalization. This analysis showed that bacterial pathogen culture positivity (both respiratory and blood) during the early period of hospitalization was not associated with worse outcome   [8][9][10][11][12][13][14][15][16], and 13 [8][9][10][11][12][13][14][15][16] for the ≤28-days MV, > 28-days MV, and deceased groups, respectively, p = ns). Paired analysis of upper and lower airway samples revealed that, while there was a positive association between SARS-CoV-2 viral load of the paired samples (rho = 0.60, p < 0.0001), there was a subset of subjects (21%) for which the viral load was greater in the BAL than in the supraglottic area, indicating topographical differences in SARS-CoV-2 replication (Fig. 2a). Importantly, while the SARS-CoV-2 viral load in the upper airway samples was not associated with clinical outcome ( Supplementary Fig. 2), patients who died had higher viral load in their lower airways than patients who survived (Fig. 2b). We then evaluated virus replication in BAL samples by measuring levels of subgenomic RNA (sgRNA) targeting the E gene of SARS-CoV-2. This mRNA is only transcribed inside infected mammalian cells and is not packed into virions, thus, its presence is indicative of recent virus replication in a sample 21 − 23 . In BAL, levels of sgRNA correlated with viral load as estimated by rRT-PCR for the SARS-CoV-2 N gene (Fig. 2c) and the highest percentage of measurable sgRNA was in the deceased group followed by the ≤28-days MV group, and the > 28-days MV group (17,7%, 11.5%, and 3.7%, respectively, chi-square p = 0.028 for the comparison deceased vs. >28-days MV group). Thus, while in most cases levels of sgRNA were not measurable in BAL suggesting that no active virus replication was ongoing in the lower airways of COVID-19 patients at the time of bronchoscopy (overall median[IQR] = 12 [7][8][9][10][11][12][13][14][15][16] days from hospitalization), the lower airway viral burden, as estimated by rRT-PCR, is associated with mortality in critically ill COVID-19 patients.
Microbial community structure of the lower airways is distinct from the upper airways in critically ill patients.
Considering the bacterial species and the viral loads identi ed in the lower and upper airways of this cohort and their association with outcomes, we pro led in detail their viral and microbial composition.
Microbial communities were evaluated using parallel datasets of RNA and DNA sequencing from 118 COVID-19 patients with lower airway samples that passed appropriate quality control and a subset of paired 64 upper airway samples, along with background bronchoscope controls.
RNA sequencing (RNAseq) of the metatranscriptome provided insight into the RNA virome as well as the transcriptomes of DNA viruses, bacteria, and fungi. Given the low biomass of lower airway samples we rst identi ed taxa as probable contaminants by comparing the relative abundance between background bronchoscope and BAL samples (Supplementary Fig. 3a and Supplementary Table 4). However, we did not remove any taxa identi ed as probable contaminants from subsequent analyses. A comparison of the microbial community complexity captured in these data, determined using the Shannon diversity Index, showed there was signi cantly lower α diversity in the lower airway samples than in the upper airways and background controls ( Supplementary Fig. 4a). Similarly, β diversity analysis based on the Bray Curtis Dissimilarity index indicated that the microbial composition of the lower airways was distinct from the upper airways and background controls ( Supplementary Fig. 4b, PERMANOVA p < 0.01). Sequence reads indicated a much higher relative abundance of SARS-CoV-2 in the lower than in the upper airways for this cohort ( Supplementary Fig. 4c). Comparisons of the most dominant bacterial and fungal taxa that were functionally active showed that S. epidermidis, Mycoplasma salivarium, S. aureus, Prevotella oris, and Candida albicans, many often-considered oral commensals, were present in both upper and lower airway samples ( Supplementary Fig. 4c). Interestingly, the lytic phage Proteus virus Isfahan, known to be active against bio lms of Proteus mirabilis 24 , was found to be highly transcriptionally active in the BAL.
DNA sequencing data provided insight into the DNA virome, as well as the bacterial and fungal metagenomes. As for the metatranscriptome data, we rst identi ed taxa as probable contaminants but these were not removed for subsequent analyses ( Supplementary Fig. 3b). Both α and β diversity analyses of the metagenome support distinct microbial community features in the lower airways as compared with the upper airways and background controls ( Supplementary Fig. 5a, 5b). Among the top 10 taxa across lower and upper airway samples were S. aureus, Salmonella enterica, Burkholderia dolosa, and Klebsiella variicola. Candida albicans only ranked #77 in the BAL while it was ranked 5th in the metatranscriptome data indicating that while present at low relative abundance, it was highly active (Supplementary Table 4). K. variicola, while prevalent at a high relative abundance (#4 in BAL, and #5 in the upper airways) in patients of this cohort, its ranking in the RNAseq data was not among the top 50, indicating that it was not as active functionally as other bacteria. Conversely, while S. epidermis ranked as the most highly functional taxon in both lower and upper airways, based on RNAseq reads (Supplemental Fig. 3c), it was 33rd in relative abundance in the BAL DNAseq data but was present at very high relative abundance in the upper airways (ranked #3). These data suggest that microbes that colonize the upper airways and the skin were common in the lower airways in this cohort of COVID-19 patients requiring invasive mechanical ventilation.
Distinct microbial signatures are associated with different clinical outcomes.
To determine the potential impact of vertebrate viruses on outcome, we compared virus enrichment  Table 5).
Analysis of differential DNA virus abundance using DEseq did not show statistically signi cant differences. Because the virome includes viruses of bacteria and archaea, we also analyzed the phage data (including viruses of archaea). Phages impact the bacterial population-including bacterial pathogens-and so could be clinically relevant. At a compositional level, the virome of DNA phages did not display statistically signi cant differences or signi cant virus enrichment based on clinical outcome groups (data not shown). However, while the phage metatranscriptome α and β diversity was similar across the clinical outcome groups, there were various taxonomic differences at the RNA level with enrichment of Staphylococcus phages CNPx in the deceased and > 28-day MV groups when compared with the ≤ 28-day MV group (Fig. 2e). Differential expression from two other Staphylococcus phages was also observed in the > 28-days MV group as compared with the ≤ 28-days MV group (Fig. 2e). None of the described taxa were identi ed as possible contaminants (Supplementary Table 4).
Enrichment of the lower airway microbiota with oral commensals is associated with poor outcome We evaluated the overall bacterial load by quantitative PCR, targeting the 16S rRNA gene. As expected, the bacterial load in the lower airways was several folds lower than in the upper airways but clearly higher than the background bronchoscope control ( Supplementary Fig. 7). Patients who died had higher total bacterial load in their lower airways than patients who survived (Fig. 3a).
While no statistically signi cant differences were noted in α or β diversity across clinical outcome groups ( Fig. 3b-c), several differences were noted when differential enrichment was evaluated using DESEq. For the comparisons made across the clinical outcome groups we focused on consistent signatures identi ed in the lower airway metagenome and metatranscriptome. Coherence of differentially enriched taxa was determined by gene set enrichment analysis (GSEA) (Fig. 3d) and directionality of enrichment between the two datasets was evaluated (Fig. 3e). Among the most abundant taxa, the oral commensal M. salivarum was enriched in the deceased and > 28-days MV groups as compared with the ≤ 28-days MV group. In contrast, a different oral commensal, Prevotella oris, was enriched in the ≤ 28-days MV group as compared with the deceased and > 28-days MV groups. These data support that oral commensals are frequently found in the lower airways of critically ill COVID-19 patients and that differences between groups could be due to differential microbial pressures related to host factors. Interestingly, most of the statistically signi cant taxa were identi ed in the metatranscriptome rather than in the metagenome data, with only P. oris identi ed in both datasets. None of the described taxa were identi ed as possible contaminants (Supplementary Table 4). Overall, most of the microbial signatures identi ed as enriched in the deceased or in subjects on prolonged MV are regular colonizers of healthy skin and mucosal surfaces rather than frequent respiratory pathogens.
For the fungal data, there were no statistically signi cant differences in α or β diversity identi ed between clinical outcome groups in the metagenome or the metatranscriptome data (Supplementary Fig. 8a and 8c). However, in the metagenome data, we identi ed Candida glabrata enriched in the deceased group as compared with the ≤ 28-days MV and the > 28-days MV groups but this was not consistent in the metatranscriptome data (Supplementary Fig. 8b and 8d).
Poor clinical outcomes are associated with enrichment of antimicrobial resistance genes and glycosphingolipid biosynthesis We used the gene annotation of the DNAseq and RNAseq data to pro le the microbial functional potential of the lower airway samples. For the comparisons made across the clinical outcome groups, we focused on consistent functional signatures identi ed in the lower airway metagenome and metatranscriptome. Coherence of differentially enriched functions was determined using GSEA ( Fig. 4a) and directionality of enrichment was also evaluated (Fig. 4b).
Overall, there was coherence of directionality between the metranscriptomics and metagenomics datasets for the comparisons between deceased vs ≤ 28-days MV, and > 28-days MV vs ≤ 28-days MV groups. Interestingly, statistically signi cant differences were only noted in the metatranscriptome data and not in the metagenome data. Among the top differentially expressed pathways in the poor outcome groups were glycosylases, oxidoreductase activity, transporters, and two-component system, among other genes. The two-component system is used by bacteria and fungi for signaling. A speci c analysis of antibiotic resistance genes shows that there was signi cant gene enrichment and expression of biocide resistance in the deceased group as compared to the two other MV groups ( Supplementary Fig. 9). There was also signi cant expression of genes resistant to trimethoprim and phenolic compound, as well as multi drug resistance in the deceased group as compared to the ≤ 28-days MV group. Presence of the resistance gene against Trimethropim was not signi cantly associated with prior exposure with Trimethoprim. However, only 7 patients received this drug before sample collection.
Lower airway host immune phenotype shows failure of adaptive and innate immune response to SARS-

CoV-2 among deceased subjects
To evaluate the host immune response to SARS-CoV-2 infection, we rst measured levels of anti-Spike  Table 6). These data suggest that the IgG subfraction is an important marker of the adaptive immune response in the lung of critically ill COVID-19 patients and that both sub-fractions of IgG and IgA anti-SARS-CoV-2 may contribute to the viral replication control in the lower airways.
Host transcriptome analyses of BAL samples showed signi cant differences across clinical outcome groups based on β diversity composition ( Supplementary Fig. 11). We identi ed multiple differentially expressed genes across the clinical outcome groups ( Supplementary Fig. 11b-d). First, we noted that the lower airway transcriptomes showed downregulation of heavy constant of IgG (IGHG3), and heavy constant of IgA (IGHA1) genes in those with worse clinical outcome (Supplementary Table 7). We then used IPA (Ingenuity Pathway Analysis) to summarize differentially expressed genes across the three clinical outcome groups (Fig. 5b). The sirtuin Signaling Pathway (a pathway known to be involved in aging, gluconeogenesis/lipogenesis, and host defense against viruses) 25 Table 8). We also observed decreased activation in the in ammatory response in critically ill COVID-19 subjects with poor outcome (phagocytes, neutrophils, and granulocytes, and leukocytes; Supplementary Table 9). A comparison of clinical outcome between the > 28-days MV vs. ≤28-days MV groups showed upstream predicted inhibition in insulin, estrogen, beta-estradiol, EGF, EGFR, IL-5, and IL-10RA in the > 28-days MV group (Supplementary Table 9).
These differences suggest that, at the stage that we sampled the lower airways of patients with critically COVID-19, an overt in ammatory tone was not predictive of worst outcome.
To determine if the abundance of immune cells varies between different clinical outcome groups, we estimated cell type abundance from the host transcriptome with computational cell type quanti cation methods, including a deconvolution approach implemented in CIBERSORTx 31 and a cell type signature enrichment approach implemented in xCell 32 . As reported recently in other studies 33 , among the cell types detected in the BAL samples we observed a consistent enrichment of mast cells and neutrophils in the > 28-days MV and deceased groups compared with the ≤ 28-days MV group ( Fig. 5c and Supplementary Table 10). We also identi ed signi cantly higher in ammatory macrophages (M1), innate

T-cells and memory T-cells (CCR7 + ) among subjects with worse clinical outcome.
Cross-kingdom network analyses identify bacteria, fungi, and host pathways functionally impacted by SARS-CoV-2 To identify potential microbe-microbe and microbe-host interactions that could have an effect on outcome, we used a multi-scale network analysis approach (Multiscale Embedded Gene co-Expression Network Analysis, MEGENA) 34 . We rst used the relative abundance from the RNAseq data to capture coexpressing taxa in the metatranscriptome network neighborhood of SARS-CoV-2 (SARS2-NWN). We examined ve such network neighborhoods (constructed by including nodes with increasing distance 1 to 5 from SARS-CoV-2, i.e. neighborhood 1 to neighborhood 5) that were signi cantly enriched for taxa functionally active in the deceased group when compared with the ≤ 28-day MV group. Only the largest cluster, with 504 taxa, had signi cantly enriched taxa in both the deceased and in the ≤ 28-day MV outcome groups ( Supplementary Fig. 12a) (FET P-value = 4.6e-45, 4.0 FE). Many of these taxa are among the top 50 most abundant microbes we had previously identi ed in the metatranscriptome dataset. Taxa present that are in uenced by SARS-CoV-2 and signi cantly differentially enriched in the deceased group include bacteria such as M. salivarium, Bi dobacterium breve, and Lactobacillus rhamnosus (a gut commensal), that we had previously identi ed by differential expression analysis (Fig. 3e), but also taxa such as S. epidermis, Mycoplasma hominis (urogenital bacteria), and the phage VB_PmiS-Isfahan (also referred to as Proteus virus Isfahan) that we had previously only picked up as being highly abundant but not necessarily differentially enriched in the deceased group. Most of the fungi, such as C. albicans, C. glabrata and C. orthopsilosis were enriched in the ≤ 28-day MV group. Interestingly, our earlier analysis of the metagenome (Supplementary Fig. 8b) had identi ed C. glabrata as being enriched in the deceased group with no enrichment in the metatranscriptome. This analysis indicates that some of these abundant taxa could be responding to SARS-CoV-2 disruption in a similar manner, or indirectly interacting functionally.
We further investigated the association of the network neighborhood with host network modules using the host transcriptome data to identify groups of host genes that are co-expressed in response to SARS-CoV-2 disruption. The  and R-spondin 1 (RSPO1). Humanin is known to protect against oxidative stress and mitochondrial dysfunction 36 . RSPO1 protects against cell stress by activating the Wnt/β-catenin signaling pathway 37 . Non-coding RNAs, such as MALAT1 and RHOQ-AS1 were found to be up-regulated. MALAT1 is known to suppress IRF3-initiated antiviral innate immunity 38 while the function of RHOQ-AS1 is unknown.
Metatranscriptome and Transcriptome signatures are predictive of mortality We evaluated the strength of the metatranscriptomic, metagenomic and host transcriptomic pro les to predict mortality in this cohort of critically ill COVID-19 patients. To this end, we identi ed features in each of these datasets and constructed risk scores that best predicted mortality. Figure 6a shows that the metatranscriptome data, alone or combined with the other two datasets, was most predictive of mortality. Importantly, the predictive power (as estimated by the area under the curve) of the metatranscriptome data was improved by excluding probable contaminants and worsened when SARS-CoV-2 was removed from the modeling. The selected features we used to construct the metatranscriptome, metagenome and host transcriptome risk scores are reported in Supplementary Table 11). Using the means of the scores, we classi ed all subjects into high risk and low risk groups for mortality. Figure 6b shows Kaplan-Meier survival curve comparisons evaluating the predictive power of risk score strati cation based on metatranscriptome, metagenome and host transcriptome data. Combining risk scores from different datasets showed an optimal identi cation of mortality when metatranscriptome and host transcriptome were considered (Fig. 6c). We then used the gene signature found as being the most predictive of mortality to conduct IPA analyses. Among the upstream regulators, mortality was associated with predicted activation of interferon alpha while chemotaxis and infection by RNA virus were predicted as activated in diseases and functions. These data highlight the importance of SARS-CoV-2 abundance in the lower airways as a predictor for mortality, and the signi cant contribution of the host cell transcriptome, which re ects the lower airway cell response to infection. ventilation recruited during the rst wave of SARS-CoV-2 infections in New York City, we used a metagenomic approach to characterize the microbiome in the lower airways and assessed its impact on clinically meaningful outcomes. In this analysis of 142 critically ill hospitalized patients with con rmed SARS-CoV-2 infection and lower airway biorepository samples available, we determined that higher SARS-CoV-2 viral load, higher relative abundance of Mycoplasma salivarium, and limited anti-SARS-CoV-2 Spike protein IgG response in the lower airways were associated with increased mortality. This signature was supported by the metatranscriptome data of the lower airway samples where SARS-CoV-2 sequence reads were signi cantly enriched in those patients who died compared to those who survived after developing respiratory failure requiring mechanical ventilation. Importantly, although we observed changes in other microbial components of the lower airway microbiome in our analysis of lower airway samples from 118 patients and by clinical laboratory culture results obtained from 589 patients, we did not nd evidence to support the hypothesis that co-infection with common (bacterial, viral, fungi) respiratory pathogens was associated with poor outcome-although most patients received empiric treatment with broad spectrum antibiotics and anti-fungals.

Discussion
Several studies have explored the relationship between SARS-CoV-2 viral load and mortality [45][46][47][48][49][50] . Severe in uenza requiring hospitalization has also been associated with higher viral loads 51,52 . It has been argued that high viral load might merely be a re ection of an individual's immune response 45 . In fact, in SARS-CoV-1, clinical progression was not associated with increased viral load or uncontrolled viral replication in the nasopharynx but rather with an upregulated immune pro le in these patients 53 . In a large cohort of 1145 patients with con rmed SARS-CoV-2, viral load measured in nasopharyngeal swab samples was found to be signi cantly associated with mortality, even after adjusting for age, sex, race and several co-morbidities 50 . Similar results were found in a cohort of patients in New York City with or without cancer, where in-hospital mortality was signi cantly associated with a high SARS-CoV-2 viral load in the upper airways 49  were found later to have a secondary bacterial infection 57 . The most common pathogens identi ed included species in the genera Mycoplasma, Hemophilus, and Pseudomonas. In another study, the most commonly identi ed co-infections were with Streptococcus pneumoniae, Klebsiella pneumoniae, and Haemophilus in uenzae 59 . Using detailed clinical laboratory culture data available for 589 subjects hospitalized with respiratory failure due to COVID-19, we showed that higher rates of respiratory infection with other organisms, especially early in their hospitalization, did not occur among subjects with poor clinical outcome. Further, we did not observe an association between positive cultures for any pathogen tested and increased odds of dying in critically ill COVID-19 patients.
In the subset of COVID-19 patients with BAL samples, we used NGS to identify all potential pathogens and commensals in the lower airways beyond microbial cultures routinely obtained as per clinical care.
The RNA virome data showed that SARS-CoV-2 dominates the lower airways and was signi cantly associated with death. A small number of samples had a few sequences that mapped to in uenza A or B viruses, suggesting that co-infection with in uenza did not occur frequently during this rst wave of SARS-CoV-2 infections. Within the DNA virome, there was no signi cant difference in viruses between the three outcome groups despite the frequent nding of HSV-1. Similarly, when evaluating the metatranscriptome of DNA viruses, there were few differences between the three outcome groups. Although analysis of the phage metagenome data showed no differential enrichment between the three cohorts, we did identify in the metatranscriptome data differentially active phages when comparing the three cohorts, suggesting that changes in the bacterial microbiome may be occurring in critically ill patients with COVID-19. Certain Staphylococcus phages were differentially active in those who were ventilated for more than 28 days and in those who died. Interestingly, the bacterial signatures also identi ed Mycoplasma salivarium, a known oral commensal that has previously been associated with ventilator-acquired pneumonia 63 , as differentially active in those who died and those who were ventilated for more than 28 days when compared to those ventilated less than 28 days. From previous data published by us, enrichment of the lower airway microbiota with oral commensals was seen to be associated with a pro-in ammatory state in several diseases including lung cancer 64,65 and nontuberculosis mycobacterium related bronchiectasis 66 .
With the use of metagenomic and metatranscriptomic analyses it is also possible to examine how functionally active microbes impact the host 67 . In this cohort of patients, we evaluated the functional pro le of the microbiome within the lower airways and its effect on mortality, something that, to our knowledge, had not yet been assessed in COVID-19 patients. The only signi cant gene function enrichment was found with the metatranscriptome data suggesting that functional activation of microbes can provide further insights into the lower airway microbial environment of patients with worst outcome. Among the pathways that were differentially expressed in those patients with poor outcome, we identi ed genes associated with degradation, transport, and antimicrobial resistance genes, as well as with signaling. These differences may indicate important functional differences leading to a different metabolic environment in the lower airways that could impact host immune responses. It could also be representative of differences in microbial pressure in patients with higher viral loads and different in ammatory environments.
In the current investigation, we also characterized the immune response within the lower airways by measuring anti-SARS-CoV-2 Spike antibodies and pro ling the host RNA transcriptome. We observed that low levels of anti-Spike and anti-RBD IgG in the lung were associated with poor outcome. Although we did not nd a statistically signi cant association between SARS-CoV-2 neutralizing capacity and poor outcome, levels of SARS-CoV-2 neutralizing antibodies, anti-Spike and anti-RBD antibodies (both IgG and IgA) were negatively correlated with SARS-CoV-2 viability. Prior investigations have suggested that IgA levels are a key driver of neutralization in the mucosa [68][69][70] . The differences noted in the current investigation in the IgG pools are intriguing and future work investigating the antibodies generated during SARS-CoV-2 infections will be essential.
When examining host transcriptomic differences across the different clinical outcome groups, Sirtuin and Ferroptosis signaling pathways were found to be upregulated in the most critically ill COVID-19 patients.
Upregulation in the Sirtuin pathway demonstrates an increased host in ammatory response to viral infection 25 . In addition, ferroptosis, a recently identi ed form of non-apoptotic regulated cell death through iron-dependent accumulation of lipid peroxides, has been shown to cause direct lung injury 71 or pulmonary ischemia-reperfusion injury 72,73 . Interestingly, there is evidence to support that STAT3 71 and ACSL4 72 alleviated ferroptosis-mediated acute lung injury dysregulation, which are both down-regulated in COVID-19 patients with worse clinical outcome. Further analysis showed that there appeared to be an inactivation of phagocytes, neutrophils, granulocytes, and leukocytes, including downregulation of IgG expression levels, with additional mitochondria dysfunction, and down-regulation of Inositol related pathways and noradrenaline/adrenaline degradation. There is evidence that in the neonatal lung, inositol related components exert an anti-in ammatory effect and can prevent acute lung injury 74,75 .
Collectively, these data suggest that an imbalance rather than an elevated in ammatory state in the lung is an important marker that predicts poor outcomes in critically ill COVID-19 patients. Indeed, the inferred cell composition analysis from the bulk transcriptome data overall points to a tepid immune response.
Memory T cells have been implicated with a robust immune response in SARS-CoV-2. 76 The de ciency of these memory T cells that we found in the lungs of COVID-19 patients with worse outcome further supports the presence of an ineffective immune response or presence of immune exhaustion. IL4I1, found in the network analysis to be up-regulated in the deceased group in association with SARS-CoV-2, is an immunosuppression enzyme that plays a role in infection and the control of immunopathology 77 . Strikingly, interrogation of the host transcriptomic analysis identi ed survival-associated differences in interferon-related responses. Our host transcriptomic risk strati ed model seems to point to a predictive activation of type I interferon as a prediction for mortality. This might be inconsistent with the current suggestion that, based on systemic levels, early interferon responses are associated with poor outcome in COVID19. 80,81 Others have suggested that a robust interferon response may lead to a hyperin ammatory state that could be detrimental in the disease process, justifying the use of Janus kinase inhibitor inhibitors in patients with COVID-19. 82 Studies comparing transcriptomic signatures in BAL of patients with severe COVID-19 and controls have shown activation of type 1 interferons. 83 While further longitudinal data will be needed to clarify the role of interferon signaling on the disease, the data presented here suggest that combining microbial and host signatures could help understand the increase risk for mortality in critically ill COVID-19 patients.
By collecting BAL samples rather than endotracheal aspirate specimens we were able to ensure extensive sampling of the lower respiratory tract in intubated patients. However, we were limited to samples from intubated patients in whom a clinically indicated bronchoscopy was done to place a percutaneous tracheostomy or for airway clearance. Although this included a large number of patients with various clinical outcomes, those sampled may not be representative of the extremes in the spectrum of disease severity who were most likely not eligible for bronchoscopy. For example, patients that presented with very rapid clinical deterioration and died within the rst few days of hospitalization or those who were quickly weaned from mechanical ventilation did not receive bronchoscopy. However, extensive and detailed clinical data were also obtained from intubated COVID-19 patients without bronchoscopy performed within the Manhattan Campus (no bronchoscopy cohort) and from the Long Island cohort for whom bronchoscopies were done without collecting research samples. In both of these cohorts, clinical laboratory culture data did not identify untreated secondary pathogen infections associated with poor outcome.
The samples used in this investigation were obtained during the rst surge of cases of COVID-19 in New York City, and management re ected clinical practices at that time. Among the differences with current therapeutic approaches in COVID-19 patients, corticosteroids and remdesivir, two medications that likely affect the lower airway microbial landscape, were rarely used during the rst surge. Other medications, such as antibiotics and anti-in ammatory drugs could affect our ndings and we therefore considered them as potential confounders. However, the use of these medications was not found to be associated with clinical outcome. The cross-sectional study design precluded evaluation of the temporal dynamics of the microbial community or the host immune response in this cohort, which could provide important insights into the pathogenesis of this disease. Performing repeated bronchoscopies without a clinical indication would be challenging in these patients and other less invasive methods might need to be considered to study the lower airways at earlier timepoints and serially over time in patients with respiratory failure. It is important to note that there were no statistically signi cant differences in the timing of sample collection across the three outcome groups.
In summary, we present here the rst evaluation of the lower airway microbiome using a metagenomic and metatranscriptomic approach, along with host immune pro ling in critically ill patients with COVID-19 requiring invasive mechanical ventilation. The RNA metatranscriptome analysis showed an association between the abundance of SARS-CoV-2 and mortality, consistent with the signal found when viral load was assessed by targeted rRT-PCR. These viral signatures correlated with lower anti-SARS-CoV-2 Spike IgG and host transcriptomic signatures in the lower airways associated with poor outcome.
Importantly, both through culture and NGS data, we did not nd evidence for an association between untreated infections with secondary respiratory pathogens and mortality. Together, these data suggest that active lower airway SARS-CoV-2 replication and poor SARS-CoV-2-speci c antibody responses are the main drivers of increased mortality in COVID-19 patients requiring mechanical ventilation. The potential role of oral commensals such as Mycoplasma salivarium need to be explored further. It is possible that M. salivarium can impact key immune cells and has recently been reported at a high prevalence in patients with ventilator-acquired pneumonia 63 . Critically, our nding that SARS-CoV-2 evades and/or derails effective innate/adaptive immune responses indicates that therapies aiming to control viral replication or induce a targeted antiviral immune response may be the most promising approach for hospitalized patients with SARS-CoV-2 infection requiring invasive mechanical ventilation.

Subjects
Enrolled subjects were 18 years or older, admitted to the intensive care units (ICUs) at NYU Langone Health from March 10 th to May 10 th , 2020 with a nasal swab con rmed diagnosis of SARS-CoV-2 infection by reverse transcriptase polymerase chain reaction (RT-PCR) assay and respiratory failure requiring invasive mechanical ventilation. Samples were obtained during clinically indicated bronchoscopy performed for airway clearance or for percutaneous tracheostomy placement. Surviving subjects signed informed consent to participate in this study. Samples and metadata from subjects who died or were incapacitated were de-identi ed and included in this study. Comprehensive demographic and clinical data were collected. We also collected longitudinal data on clinical laboratory culture results and treatment. Supplementary gure 1 shows the distribution of subjects and sampling strategy used for this study. The study protocol was approved by the Institutional Review Board of New York University.
Lower airway bronchoscopic sampling procedure Both background and supraglottic (buccal) samples were obtained prior to the procedure, as previously as input into the NexteraXT library preparation kit following the manufacturer's protocol. Libraries were puri ed using the Agencourt AMPure XP beads (Beckman Coulter, Inc.) to remove fragments below 200 bp. The puri ed libraries were quanti ed using the Qubit dsDNA High Sensitivity Assay kit (Invitrogen) and the average fragment length for each library was determined using a High Sensitivity D1000 ScreenTape Assay (Agilent). Samples were added in an equimolar manner to form two sequencing pools.
The sequencing pools were quanti ed using the KAPA Library Quanti cation Kit for Illumina platforms.
The pools were then sequenced on the Illumina Novaseq 6000 in one single run. For RNA sequencing, RNA quantity and integrity were tested with a BioAnalyzer 2100 (Agilent Antibiotic resistance genes were quanti ed in all metagenome and metatranscriptome samples using Salmon v1.3.0 93 run with --keepDuplicates for indexing and --libtype A --allowDovetail --meta for quanti cation. Genes were ltered such that only genes that actively conferred antibiotic resistance were kept. To assess differentially expressed classes of antibiotic resistance genes, gene counts for individual antibiotic resistance genes were collapsed by their conferred antibiotic resistance. Supplementary Figure 1   Microbial and Host predictive modeling Cox proportional hazards model was used for investigating the association between the time to death and the relative abundance of each taxon quanti ed using metatranscriptomic and metagenomic data separately. We rst performed the univariate screening test to identify signi cant features associated with the time to death using the Cox proportion hazards regression model for the relative abundance of taxa from the RNA and DNA data, and log-transformed count of host transcriptome data, respectively. Within each type of data, given the p-value cutoff, the features with a p-value less than the cutoff were selected and integrated as a sub-community. For the RNA and DNA data, the alpha diversity (Shannon index) was calculated for each sample on the selected sub-community and the negative of the value was de ned as the microbial risk score, because high alpha diversity indicates low risk of death. For the host transcriptome data, the log-transformed total count of all selected candidate transcriptome for each sample was de ned as the risk score, since most selected candidate transcriptomes increased the risk of death. The leave-one-out cross-validation (LOOCV) was used for the predictions. The p value cutoff was set at the value which produces the largest AUC (area under the receiver operating characteristic curve) in predicting the death/survival status using the risk score we constructed over these features. The additive model was used to integrate when more than one scores are used for the prediction.

Multiscale and co-expression network analyses
Raw counts from the human transcriptome were normalized and converted to log2-counts per million using limma 99  Multiscale Embedded Gene Co-Expression Network Analysis (MEGENA) 34  with the combined enrichment of the differentially expressed gene (DEG) signatures as implemented: , where, is the relevance of a consensus j to a signature i; and is de ned as , where is the ranking order of the signi cance level of the overlap between the module j and the signature.
To functionally annotate gene signatures and gene modules derived from the host transcriptome data, we performed an enrichment analysis of the established pathways and signatures¾including the gene ontology (GO) categories and MSigDB. The hub genes in each subnetwork were identi ed using the adopted Fisher's inverse Chi-square approach in MEGENA; Bonferroni-corrected p-values smaller than 0.05 were set as the threshold to identify signi cant hubs. The correlation between modules, modules and clinical traits as well as modules and individual taxa were performed using Spearman correlation.
Other correlation measures, such as Pearson correlation or the Maximal Information Coe cient (MIC) 101 proved to be inferior for this task. Categorical trait data was converted to numerical values as suitable.  Figure 1 Associations between culture positivity and clinical outcome. Odds ratios and corresponding 95%

Figures
con dence intervals for rates of culture positivity for the whole cohort (n=589) during the length of their hospitalization (left) and during the rst 2 weeks of hospitalization (right).

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.