The hidden microbiome of hospital infection surveillance testing: biomarkers of health outcomes in MRSA and VRE colonization

Abstract
Background: Hospital-acquired infections present a major concern for healthcare systems in the U.S. and worldwide. Drug-resistant infections result in increased costs and prolonged hospital stays. Methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus (VRE) are responsible for many drug-resistant infections in the U.S. We undertook two parallel studies to investigate the differences in the microbial communities of individuals colonized with MRSA (or VRE) as compared to their respective non-colonized counterparts matched for age, sex, race, ethnicity, unit of admission, and diagnostic-related group, when available.
Results: The VRE study showed considerably more Enterococcus genus communities in the VRE colonized samples. Our findings for both MRSA and VRE studies suggest a strong association between 16S rRNA gene alpha diversity, beta diversity, and colonization status. When we assessed the colonized microbial communities in isolation, the differences disappeared, suggesting that the colonized microbial communities drove the change. Isolating Staphylococcus, we saw significant differences expressed across colonization in specific sequence variants.
Conclusions: The differences seen in the microbial communities from MRSA (or VRE) colonized samples as compared to non-colonized match-pairs are driven by the isolated communities of the Staphylococcus (or Enterococcus) genus, the removal of which results in the disappearance of any differences in the diversity observed across the match-pairs.


BACKGROUND AND SIGNIFICANCE
Infection prevention programs for pathogens such as methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus (VRE) are routine in many hospitals in the United States because of the associated risks to patients. MRSA alone accounts for over 94,000 annual infections and almost 19,000 deaths, with 86% of the cases associated with healthcare [1][2][3][4]. Moreover, healthy individuals in non-healthcare settings are susceptible to community-acquired MRSA infections. In both healthcare and community settings, the majority of people colonized with MRSA are asymptomatic, and unless routine infection surveillance testing is performed, transmission and acquisition may go unrecognized. Infections due to these multidrug-resistant bacteria are associated with increased rates of morbidity, mortality, and healthcare costs, making them a major public health concern [5]. It is also known that colonization with drug-resistant bacteria increases the risk for subsequent infection [6][7][8].
Understanding the factors contributing to the colonization and persistence of MRSA and VRE in human populations is crucial for the development of effective prevention and control strategies. Prior to 2020, the Medical University of South Carolina (MUSC) utilized an active surveillance testing program designed to identify patients colonized with MRSA and VRE and to apply contact precautions for those found to be colonized. Under this program, a majority of patients admitted for over 24 hours were tested for MRSA using nasal swabs, producing approximately 15,000 specimen tests for the pathogen annually. High-risk hospitalized patients underwent active surveillance testing for MRSA and VRE weekly until the patient became positive for MRSA or VRE, or was discharged.
Investigating the microbiome's role in MRSA and VRE colonization may shed light on the complex interactions between bacteria and their human hosts. There are important scientific and clinical implications for our research. Firstly, by characterizing the nasal and perianal microbiome in MRSA and VRE colonization, we aim to identify microbial signatures that could serve as potential biomarkers for colonization risk. These biomarkers may help identify those who are more likely to become colonized and may facilitate the implementation of focused preventative therapies to limit the spread of antibiotic-resistant pathogens in healthcare settings. Secondly, understanding the variations in microbial diversity and community structure between colonized and uncolonized individuals may provide insight into the mechanisms underlying antibiotic resistance. Studies such as ours could help elucidate how specific microbial species or groups interact with MRSA/VRE and affect the likelihood of colonization. This may lead to the development of novel interventions aimed at specific microbial communities to reduce colonization and subsequent infection rates.

Study population
We utilized infrastructure from the Living µBiomeBank to enroll, process, sequence, and perform downstream analyses [9]. The MUSC Institutional Review Board protocol (Pro00062584) was approved as a minimal-risk study, using existing surplus nasal and perianal swab specimens processed by the MUSC Diagnostic Microbiology Laboratory. Specimens from pediatric patients, and from patients who had opted out (via the intake questionnaire) of research use of their discarded clinical specimens, were excluded.
Two parallel studies were undertaken to assess the differences in the microbial communities of those colonized with MRSA and VRE as compared to their uncolonized matched counterparts. The methods describe the parallel and similar processes for enrolling specimens from these populations and for downstream handling and processing. Daily reports using the electronic health record platform Epic identified nasal specimens, reported MRSA/VRE status, demographic variables, admission unit, admission diagnosis, and other identifiers. MRSA/VRE positive specimens were captured on an ongoing basis, de-identified, and stored at -80°C for batch processing. Once a MRSA/VRE positive specimen was identified, a one-to-one matched MRSA/VRE negative specimen was sought based on age (± 5 years), sex, race, ethnicity, unit of admission, and diagnostic-related group, when available. To minimize potential biases due to seasonal microbiome variability, matching MRSA/VRE negative samples were sought within at most four weeks of the original MRSA/VRE positive specimen's collection date. If no match was found within the prespecified window, the positive specimen was discarded, and another positive specimen was recruited.
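The matching rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's enrollment code: the `Specimen` fields, the function names, the choice to compare diagnostic-related group only when it is recorded for both specimens, and the tie-break (closest collection date) are all assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Specimen:
    # Hypothetical record layout; field names are illustrative.
    specimen_id: str
    colonized: bool          # True = MRSA/VRE positive
    age: int
    sex: str
    race: str
    ethnicity: str
    unit: str                # unit of admission
    drg: Optional[str]       # diagnostic-related group, None when unavailable
    collected: date


def is_match(case: Specimen, control: Specimen,
             age_window: int = 5, max_days: int = 28) -> bool:
    """Return True if `control` meets the matching criteria for `case`."""
    if control.colonized or not case.colonized:
        return False
    if abs(case.age - control.age) > age_window:
        return False
    if (case.sex, case.race, case.ethnicity, case.unit) != (
            control.sex, control.race, control.ethnicity, control.unit):
        return False
    # Assumption: DRG is compared only when recorded for both specimens.
    if case.drg is not None and control.drg is not None and case.drg != control.drg:
        return False
    # Four-week window around the positive specimen's collection date.
    return abs((case.collected - control.collected).days) <= max_days


def find_match(case, candidates, used_ids):
    """Pick the unused eligible control collected closest in time to the case."""
    eligible = [c for c in candidates
                if c.specimen_id not in used_ids and is_match(case, c)]
    if not eligible:
        return None  # per the protocol, the positive specimen would be discarded
    return min(eligible, key=lambda c: abs((case.collected - c.collected).days))
```

Tracking `used_ids` keeps the matching one-to-one, so a negative specimen is never paired with two positive specimens.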
DNA extraction and 16S rRNA gene sequencing
Samples were processed for bacterial DNA extraction using the QIAamp DNA Mini Kit from Qiagen (Hilden, Germany) according to the manufacturer's instructions. The quality and purity of extracts were determined using the QIAxpert system (Qiagen). Bacterial 16S rRNA variable region V2 was amplified with Illumina (San Diego, CA, USA) Nextera XT Index Kit v2 index adapters (CTGTCTCTTATACACATCT) [10,11]. Barcoded libraries were pooled and sequenced on a MiSeq using paired-end sequencing with 2 x 300 cycles. Sequences were then transferred to the Program for Human Microbiome Research at MUSC for further processing and analysis.

Statistical analysis
All statistical analyses were performed in R (v.4.0.3) [12]. Initial preprocessing of the Illumina paired-end sequences was performed using the statistical denoising algorithm DADA2 (v.1.19.1) [13,14]; chimeric sequences were identified and removed using the UCHIME algorithm within the DADA2 package [15]; taxonomy was assigned against the SILVA reference database (v.138.1) [16] using an implementation of the naïve Bayesian classifier similar to the RDP-II classifier [17]. The resulting amplicon sequence variants (ASVs) with corresponding taxonomic information and sample data were transformed into a phyloseq (v.1.27.6) [18,19] object for downstream statistical analysis.
A flowchart of the downstream analysis is presented as Supplementary Figure S1. Three analysis scenarios were considered: (a) in which all microbial ASVs are present; (b) in which all ASVs identified as genus Staphylococcus or Enterococcus have been removed; and (c) in which all ASVs except those classified as genus Staphylococcus or Enterococcus have been removed. Within each of these three scenarios, figures were generated and analytical tests performed, including univariate and multivariate testing. Non-bacterial ASVs were removed as an initial step. Compositional bar charts, alpha diversity figures, and statistical assessments of alpha diversity (paired t-tests) were generated prior to any filtering. The filtering threshold was set to retain ASVs with at least 1% relative abundance that were observed in more than one sample. Following filtering, univariate and multivariate analyses were performed. Relative abundance data were transformed and normalized using the centered log-ratio (CLR) transformation, with the geometric mean of the filtered sample vector as the reference. Multiple distance measures were used: Jensen-Shannon divergence and Bray-Curtis distance matrices were computed on relative abundance data, whereas the Euclidean distance was used with the CLR-transformed abundances. Match-stratified multivariate data were analyzed using a robust distance-based multivariate analysis of variance test developed in our group to account for multivariate dispersion in the data, which has been shown to be associated with adverse statistical properties in PERMANOVA [20]. ICD9 and ICD10 codes were extracted and encoded into the Elixhauser Comorbidity Index using the Python package pyelixhauser [21]. Gap statistic (K-means) clustering analysis was performed on the Elixhauser comorbidities using the R package cluster (v.2.1.4) [22] to assess the presence of comorbidity clusters in cases and controls for the MRSA and VRE studies. Our study targeted a convenience sample of 50 matched pairs of nasal and 50 matched pairs of perianal swabs.
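The filtering and CLR steps described above can be illustrated with a short, dependency-free Python sketch. It is not the study's R pipeline. Two details are assumptions made here because the text does not specify them: an ASV passes the abundance criterion if it reaches 1% relative abundance in at least one sample, and zeros are handled with a pseudocount before taking logarithms.

```python
import math


def relative_abundance(counts):
    """Convert one sample's ASV counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def filter_asvs(samples, min_rel_abund=0.01, min_samples=2):
    """Return the indices of ASVs to keep.

    `samples` is a list of per-sample count vectors sharing one ASV order.
    An ASV is kept if it reaches `min_rel_abund` in at least one sample
    (assumed reading of the 1% threshold) and is present in at least
    `min_samples` samples, i.e. in more than one sample.
    """
    rel = [relative_abundance(s) for s in samples]
    keep = []
    for j in range(len(samples[0])):
        max_abund = max(r[j] for r in rel)
        prevalence = sum(1 for s in samples if s[j] > 0)
        if max_abund >= min_rel_abund and prevalence >= min_samples:
            keep.append(j)
    return keep


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one filtered sample vector.

    Each output value is log(x_i / g(x)), where g(x) is the geometric mean
    of the vector. The pseudocount for zeros is an assumption of this sketch.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]
```

Because each CLR vector is centered on its own geometric mean, its entries sum to zero, which is why Euclidean distance is the natural companion to CLR-transformed data.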

RESULTS
Samples (N = 108) were successfully sequenced and generated all downstream data. Tables 1a and b present the demographic summary of the patient population for the MRSA and VRE studies, respectively. Controls were recruited by matching on the demographics and comorbidities of cases (sex, age, race, ethnicity, unit of admission, and diagnostic-related group, when available). As indicated in Tables 1a and b, no statistically significant differences (t- and chi-squared tests) were observed in age, race, sex, and Elixhauser Comorbidity Index between the cases and controls for the MRSA and VRE studies. The sequencing reads comprised 962 ASVs, only one (1) of which was eukaryotic in origin and was observed across four samples. This eukaryotic ASV was removed prior to analysis. Sequencing depth between MRSA/VRE colonized and uncolonized samples was assessed to ensure no significant read quality differences were present in any of the covariates. Case-control matched visualization of the top-10 most abundant genus-level communities revealed a trend (Figs. 1a and b): in uncolonized samples (top facets), a relatively larger portion of the abundance falls outside the top-10 genera (combined and colored in black), indicating greater within-sample diversity. This trend was confirmed and visualized in Supplementary Figs. 2a and b. As observed in Fig. 1a, the majority of ASVs in both colonized and uncolonized samples were identified within the Staphylococcus genus. The same trend was not observed in the VRE study; although in the cases the Enterococcus genus holds the majority of the abundance, this is not so in the matched negative controls (Fig. 1b, top facet).
Alpha diversity of samples was assessed (Table 2) prior to any abundance/prevalence filtering under the three scenarios described in the methods. For the MRSA study, a total of 735 ASVs across 55 subjects were assessed using t-tests of observed, Shannon, and Simpson diversity measures, all three of which were found to be significantly associated with subjects' colonization status (p = .026, .003, and .004, respectively). ASVs identified as Staphylococcus were then removed and the same analysis repeated. This time only the observed alpha diversity count was seen to be significant (p = .02). Subsequently, Staphylococcus ASVs were isolated and processed using the same assessment of alpha diversity, and no significance was observed on a 0.05 level. Similar trends were observed in the VRE study, in which observed, Shannon, and Simpson diversity measures were found to be significantly associated with subjects' colonization status (p = .003, .001, and .012, respectively) when all ASVs were included. ASVs identified as Enterococcus were then removed and the same analysis repeated, again yielding significantly different alpha diversity measures (p = .001, .001, and .003 for observed, Shannon, and Simpson, respectively). Subsequently, Enterococcus ASVs were isolated and processed using the same assessment of alpha diversity, and no significance was observed on a 0.05 level. We have included visualized alpha diversity measures as Supplementary Figs. 3a and b.

CLR-transformed data were assessed in an ASV-wise manner across MRSA colonization status (Table 3a). This assessment revealed ASVs 1 and 2 (both identified in the Staphylococcus genus) as statistically significant. These two ASVs are also the most abundant overall. When corrected for multiple comparisons using the FDR approach, q-values were observed to be non-significant (on a .05 level). A table containing all 104 ASVs is included as Supplementary Table 1a. A similar comparison was performed on genus-agglomerated CLR data (Table 4a). No genus showed a significant q-value following correction for multiple testing. A table containing all 30 genera of the MRSA study is included as Supplementary Table 2a.
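For readers less familiar with the three alpha diversity measures assessed in Table 2, the following Python sketch shows one common way to compute them, together with the paired t statistic for matched pairs. It is a hedged illustration, not the study's R code: the Simpson index is assumed to be in the 1 − Σp² form, Shannon uses natural logarithms, and only the t statistic is computed here (a p-value would additionally require the t distribution with n − 1 degrees of freedom).

```python
import math
from statistics import mean, stdev


def observed(counts):
    """Observed richness: number of ASVs with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over ASVs that are present."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson index in the 1 - sum(p_i^2) form (assumed convention)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def paired_t_statistic(cases, controls):
    """Paired t statistic on within-pair differences (case minus control)."""
    diffs = [a - b for a, b in zip(cases, controls)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
```

A sample dominated by a single genus scores low on Shannon and Simpson even when many rare ASVs are present, which is consistent with the lower diversity reported for colonized samples.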
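Table 3 ASV-level univariate analysis of centered log-ratio (CLR) abundances on MRSA (a) and VRE (b) colonized and uncolonized respective pairs using paired t-test. Only the top 10 most significant ASVs are displayed here; larger tables including all 104 ASVs (for a) and 176 ASVs (for b) are included as Supplementary Table 1. P-values were corrected for multiple testing using FDR methods (q-values). In (a) we see ASVs 1 and 2 showing a significant q-value (on a .05 level) even after these corrections. Both of these ASVs have been identified within the Staphylococcus genus and no species-level information is available. In (b) we see no individual ASV as statistically significant (on a .05 level) after multiple testing corrections.

Table 4 Genus-level univariate analysis of centered log-ratio (CLR) abundances on MRSA (a) and VRE (b) colonized and uncolonized respective pairs using paired t-test. Only the top 10 most significant genera are displayed here; larger tables including all 30 and 64 genera, for MRSA and VRE samples respectively, are included as Supplementary Table 2. P-values were corrected for multiple testing using FDR methods (q-values). In (a) we observe genus Cutibacterium on the threshold of significance (on a 0.05 level). However, q-values after multiple testing corrections show no significance. In (b) we observe genera Enterococcus, Corynebacterium, Mobiluncus, and Facklamia as significant (on a 0.05 level). Here too, the q-values after multiple testing corrections show no significance for any genus.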
a. MRSA study: genus-level univariate analysis using paired t-test on relative abundance values between MRSA colonized and non-colonized matches.

We used an identical filtering criterion for the VRE study, removing 670 low-abundance and 659 low-prevalence ASVs (785 in total, since some ASVs met both criteria), resulting in 176 ASVs that were used for our downstream analyses. CLR-transformed data were assessed in an ASV-wise manner across VRE colonization status (Table 3b). Although three ASVs (1, 8, and 20) showed significant p-values, when corrected for multiple comparisons using the FDR approach, q-values were observed to be non-significant (on a .05 level). ASVs 1 and 8 belong to the Enterococcus genus and ASV 20 belongs to the Klebsiella genus. A table containing all 176 ASVs is included as Supplementary Table 1b. A similar comparison was performed on genus-agglomerated CLR data (Table 4b), which showed no significant differences once corrected for multiple comparisons. A table containing all 64 genera of the VRE study is included as Supplementary Table 2b.
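The step from p-values to q-values can be made concrete with a small sketch. The text says only that an "FDR approach" was used; the Benjamini-Hochberg procedure shown below is an assumption (it is the most widely used FDR adjustment), and the function name is illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```

With 176 ASVs tested, a single raw p-value of .03 would need to rank among the very smallest to survive: the smallest p-value is multiplied by 176 before comparison with .05, which illustrates how a nominally significant ASV can end up with a non-significant q-value.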

The composition of the top 30 most abundant ASVs across samples is displayed using a heatmap in Fig. 2a and b. We see specific ASVs that are associated primarily or exclusively with each group (colonized and non-colonized). In Fig. 2a we see Staphylococcus ASVs 3 and 4 to be widely present in both cases and controls, whereas ASVs 1 and 2 (also Staphylococcus) are almost exclusively observed in the cases (MRSA colonized). On the other hand, Staphylococcus ASVs 9 and 11 are observed largely in the controls (non-colonized for MRSA). We have included a larger heatmap including all 104 ASVs within the MRSA study as Supplementary Fig. 4.
Principal coordinates analysis (PCoA) of Bray-Curtis dissimilarities computed on the relative abundance data (β-diversity) revealed a consistent pattern across the two studies (Figs. 3 and 4). Here, the assessments were made using the three scenarios (for MRSA and VRE studies) described within the methods section. Statistical assessment was made using a test developed in our group to account for heteroscedasticity and to control for confounders and covariates [23]. Starting with the MRSA study, when all the ASVs are present (filtering for at least 1% relative abundance and presence in more than one sample) (Scenario a, Fig. 3a), we see that the MRSA colonized cases are differentiated from the non-colonized controls (test statistic = 10.41, p = .001). Removal of the Staphylococcus genus (Scenario b, Fig. 3b) results in an overlap of group centroids (test statistic = .82, p = .581), meaning these differences are primarily driven by the Staphylococcus genus, which accounts for most of the observed microbiome shift. Looking at the Staphylococcus genus in isolation (Scenario c, Fig. 3c), we see the first axis of the PCoA accounting for 68.2% of the variation observed (test statistic = 24.54, p = .001), further evidence that the signal results from within this genus. Performing a similar assessment for the VRE study following the same filtering criteria (at least 1% relative abundance and presence in more than one sample), when all the ASVs are present (Scenario a, Fig. 4a), we see that the VRE colonized cases are relatively differentiated from the non-colonized controls (test statistic = 2.65, p = .009).
Removal of the Enterococcus genus (Scenario b, Fig. 4b) results in an overlap of group centroids (test statistic = .92, p = .461), meaning these differences are primarily driven by the Enterococcus genus, which accounts for most of the observed microbiome shift. Looking at the Enterococcus genus in isolation (Scenario c, Fig. 4c), we see the first axis of the PCoA accounting for 80.1% of the variation observed (test statistic = 3.86, p = .089). We have performed similar analyses using the Jensen-Shannon divergence metric (on relative abundances) and Euclidean distances (on CLR-transformed abundances), along with respective test results for the three aforementioned scenarios. These are included as Supplementary Figs. 6-9 and show similar trends as described above.
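The ordinations above rest on pairwise dissimilarities between samples. As a minimal sketch (not the study's implementation), the three measures named in the text can be written as follows; the use of natural logarithms for the Jensen-Shannon divergence is an assumption, and the ordination step itself (eigendecomposition of the centered distance matrix) is omitted.

```python
import math


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return (sum(abs(a - b) for a, b in zip(p, q))
            / sum(a + b for a, b in zip(p, q)))


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two relative-abundance vectors."""
    def kl(x, m):
        # Kullback-Leibler divergence; terms with x_i = 0 contribute nothing.
        return sum(a * math.log(a / b) for a, b in zip(x, m) if a > 0)

    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def euclidean(x, y):
    """Euclidean distance, intended here for CLR-transformed vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

Two samples that share no taxa reach the maximum Bray-Curtis value of 1, so a single dominant genus present in cases and absent in controls can account for most of the separation seen on the first PCoA axis.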

DISCUSSION
The findings in both the MRSA and VRE studies indicated differences in alpha diversity measures, notably within the ASVs of genus Staphylococcus or Enterococcus, between the MRSA/VRE colonized samples and their matched pairs. We assessed these differences further by considering the primary colonizing genus in each study and either removing it entirely or isolating it for the analysis. This allowed us to assess the degree to which the Staphylococcus or Enterococcus genera influence the remaining microbial communities of their hosts. In both studies we observe that the alpha diversity of neither the Staphylococcus nor the Enterococcus genus is significantly different in isolation, whereas the alpha diversity of the remaining community (without the Staphylococcus or Enterococcus genera) is influenced by these two genera. In other words, the Staphylococcus and Enterococcus genera are important sources of the variation observed in our alpha diversity analysis. Our ASV-level univariate analysis results showed that Staphylococcus genus ASVs 1 and 2 are significantly different in MRSA colonized vs uncolonized match pairs (paired t-test corrected using FDR). A similar trend was not observed in our VRE study once corrected for multiple testing. Genus-level agglomerated univariate testing also showed no significant differences, even for Staphylococcus, for which differences were observed at the ASV level. This may be indicative of differences in the behavior of species within the same genus, such as Staphylococcus.
We performed PCoA of the beta-diversity, revealing distinct clustering profiles of colonized cases and non-colonized matched controls. A similar pattern is observed in the MRSA and VRE studies, in which a significant difference is seen when all ASVs are present; this difference disappears entirely when the colonized genus is removed (all ASVs except Staphylococcus or Enterococcus); and is markedly present in the colonized genus in isolation (only Staphylococcus or Enterococcus). The variance obtained from the PCoA as well as our own robust tests attest to this trend that the differences were primarily driven by the Staphylococcus (or Enterococcus) genus.

While the study provides important insights into the compositional microbiome differences of the groups, there are several limitations, including the relatively small sample size, which could have made it difficult to detect some of the differences between the two groups. Additionally, the study was conducted in a single hospital, which may limit the generalizability of the findings. The study may also suffer from selection bias given that we relied on surplus clinical specimens, and there may be inherent biases associated with subjects who are hospitalized or sampled.
The findings from this study shed important light on the variations in the microbiome composition of subjects colonized with MRSA/VRE compared to uncolonized matches. The Staphylococcus and Enterococcus genera were found to be significant contributors to these differences, which may have implications for understanding the colonization dynamics and developing potential interventions for MRSA and VRE infections.

Declarations
Figure 1 Samples from subjects colonized with MRSA (a, bottom facet), VRE (b, bottom facet), and respective uncolonized match subject pairs (a and b, top facets) are displayed on the x-axes. The relative abundance of the top ten most abundant genus-level taxa is displayed on the y-axis. Less abundant taxa are combined and colored black (corresponding to "other"). Uncolonized subjects are seen to have a higher diversity of bacteria present individually and as a group (number of different taxa not in the top 10 genera), as shown in the proportional analysis in Supplementary Figure S2.

Figure 3

Table 1
Comparison of demographic characteristics of enrolled samples from subjects colonized with (a) MRSA and (b) VRE and their respective uncolonized matched subject-pairs (controls). All subjects identified as non-Hispanic/Latino. No statistically significant differences (at a .05 level) between cases and controls were observed across the demographics of the recruited samples.

Table 2
Alpha diversity, measured by the observed, Shannon, and Simpson indices, assessed using paired t-test (within-match analysis) under the three scenarios described in the methods.
