Colonic Mucosa-associated Mycobiota in Individuals With Normal Colons

To characterize the spatial variation of the mucosa-associated adherent mycobiota along the large intestine in individuals with a normal colon, we sequenced the internal transcribed spacer 2 (ITS2) of the eukaryotic rRNA operon to profile fungal community composition and structure in 70 mucosal biopsies taken from the cecum, ascending colon, transverse colon, descending colon, and rectum of 14 polyp-free individuals. The bacteriome of these samples was previously characterized by sequencing the V4 region of the 16S rRNA gene. We identified 64 amplicon sequence variants (ASVs) with a relative abundance of no less than 0.05% in these colonic mucosa samples. Each individual had a unique gut mycobiome community composition (P = 0.001 for beta diversity). Alpha diversity and beta diversity did not differ significantly across the colon segments. The most common phyla (relative abundance) were Ascomycota (45.4%) and Basidiomycota (45.3%). The most common genera were Malassezia (28.2%) and Candida (13.4%). Malassezia was found in 13 of 14 individuals. Other fungal genera were found only sporadically in the large intestine. The most common species were Malassezia restricta (22.7%), Candida albicans (11.9%), Malasseziales sp. (8.80%), unclassified fungi (7.80%), and Penicillium paneum (5.70%). Malasseziaceae was co-abundant with Enterobacteriaceae and co-exclusive with Barnesiellaceae, Rikenellaceae, and Acidaminococcaceae. In summary, Malassezia colonized the large intestine widely, whereas other fungal genera colonized it sporadically. The physiologic and pathogenic functions of fungi in the human gastrointestinal tract, including Malasseziaceae, which may interact with several bacterial families, remain to be fully elucidated.


Introduction
The colon harbors the most diverse and abundant population of microorganisms in the human body [1]. The community composition and structure of colonic intestinal microorganisms may reflect the differences in nutrient abundance, temperature, gastrointestinal motility, and oxygen content, which are known to change gradually from the cecum to the rectum [2].
The longitudinal distribution of the gut bacteriome along the colon has been investigated. Some studies have reported significant differences in bacterial diversity, density, and depth of invasion along the intestinal tract [3,4]. Other studies have suggested the mucosa-associated bacteriome is homogeneous along the colon axis [5-7].
Fewer studies have investigated the distribution of other normal colonic cohabitants, such as fungi, viruses, and protozoa in the large intestine in humans.
Gut fungi, also known as the "mycobiome", comprise a smaller proportion of the human intestinal microbiome (~10^8 organisms per gram of wet stool) compared to bacteria (~10^11 organisms per gram of wet stool) [8]. Shotgun metagenomic sequencing has shown that fungi account for ~0.1% of the gut microbiome in feces [9]. Emerging evidence has implicated fungal dysbiosis in inflammatory bowel disease (IBD) [10], colorectal cancer (CRC) [11], cirrhosis [12], alcoholic liver disease [13], and pancreatic cancer [14]. Nevertheless, feces-based studies to date have suggested that no fungal species, or only a small set of them, colonize the large intestine [15,16].
A few studies have analyzed the mucosal mycobiota in humans with adenoma or IBD [10,17-20]. The community composition and structure of the fungi along the large intestine in the normal colon have not been purposefully characterized in humans. We addressed this knowledge gap by applying next generation sequencing (NGS) technology to colonic biopsies taken from adults with macroscopically normal colons. We hypothesized that the community composition and structure of the mucosa-associated mycobiota would vary along the colonic longitudinal axis and among participants. In addition, because 16S rRNA gene sequencing data were available for all the samples, we examined interactions between gut bacteria and fungi.

Study participants and data collection
The study design, participant eligibility, and data collection have been reported previously [6]. Briefly, 402 participants were prospectively recruited and underwent routine colonoscopy at the Michael E. DeBakey Veterans Affairs Medical Center (MEDVAMC) in Houston, Texas, between July 2013 and April 2016.
All participants attended a regular educational session one to two weeks before the colonoscopy. During the session, a trained research coordinator obtained informed consent from all participants and administered a questionnaire to collect information on lifestyle, social history, and medical history.

Colonoscopy And Biopsy Acquisition
To ensure adequate endoscopic visualization of the colorectal mucosa, each participant took 3.78 liters of polyethylene glycol (Golytely®) bowel cleanser the night before their colonoscopy [21]. Patients were advised to stop taking aspirin, anti-inflammatory drugs, blood thinners, and iron or vitamins with iron seven days before the procedure, and to stop taking antidiabetic medications two days before the colonoscopy. The procedure was performed under conscious sedation and involved the visualization of the entire colon up to the cecum using standard Olympus® endoscopes.
Among the 402 enrolled participants, a total of 120 were found to be polyp-free and to have no other colonic pathological conditions. We collected mucosal biopsies using cold-biopsy forceps introduced through the operating channel of the scope from multiple colonic segments (cecum, ascending colon, transverse colon, descending colon, and rectum) when feasible. Each biopsy sample was immediately placed in a sterile tube (RNase- and DNase-free) on dry ice and transferred to a -80°C freezer within 15 minutes.
We sent multiple biopsies from the first 27 enrolled participants for gut microbiota profiling using 16S rRNA gene amplicon sequencing. For the mycobiota profiling, we excluded nine participants with relatively low sequencing counts in the 16S rRNA gene survey because of concern over poor DNA quality. We also excluded four participants with a history of diabetes. Thus, our current mycobiota analyses were performed on 70 colonic biopsies taken from 14 participants. The study protocol was approved by the Institutional Review Board of Baylor
The Alkek Center for Metagenomics and Microbiome Research (CMMR) at BCM performed microbial DNA extraction, library construction, and mycobiota sequencing using the methods previously described by Nash et al [22]. In brief, DNA was extracted using the MO BIO PowerLyzer UltraClean Tissue & Cell Isolation Kit (Qiagen, Germantown, MD). The internal transcribed spacer 2 (ITS2) of the eukaryotic rRNA operon was amplified by PCR using modified versions of the ITS3 and ITS4 primers [23] and sequenced on the MiSeq platform (Illumina, San Diego, CA) using the 2x300bp paired-end protocol. The PCR primer modifications included the addition of a PrimerProspector [24] Illumina adapter and linker sequence. The reverse primer (ITS4) also included a unique 12-base-pair Golay barcode for each sample [25]. A rarefaction curve was constructed using the sequence data for each sample to ensure that we sampled the majority of its fungal diversity, with a cutoff threshold of 70 per sample.
The rarefaction resulted in excluding 8 out of the 70 samples from the analysis.
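The exclusion step above can be illustrated with a minimal Python sketch. It assumes that the cutoff of 70 refers to a per-sample read count (the text does not state the unit) and that rarefaction means subsampling without replacement to that depth. The function name `rarefy` and the toy sample names and counts are hypothetical and are not taken from the CMMR pipeline.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: read_count} profile to exactly `depth` reads.

    Returns None when the sample holds fewer than `depth` reads, which
    corresponds to excluding that sample from the analysis.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand to one entry per read, then draw without replacement.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for taxon in picked:
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied

# Toy data (hypothetical): one sample above and one below the cutoff.
samples = {
    "cecum_P01": {"ASV1": 120, "ASV2": 30, "ASV3": 5},
    "rectum_P01": {"ASV1": 40, "ASV2": 10},  # only 50 reads
}
DEPTH = 70  # assumed here to be a read-count threshold

kept = {}
for name, counts in samples.items():
    rarefied = rarefy(counts, DEPTH)
    if rarefied is not None:
        kept[name] = rarefied

print(sorted(kept))  # names of the samples retained after rarefaction
```

In this sketch every retained sample has the same depth, so richness and evenness can be compared across samples without being driven by sequencing effort.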

Bioinformatics And Taxonomic Assignment
We used the CMMR pipeline incorporating phylogenetic and alignment-based approaches to maximize data resolution. This analysis provided summary statistics and quality control measurements for each sequencing run, as well as multi-run reports and data-merging capabilities for validating built-in controls and characterizing microbial communities across large numbers of samples.
The ITS2 read pairs were demultiplexed based on their unique molecular barcodes and merged using USEARCH (v7.0.1001) [26], requiring a minimum overlap of 50 base pairs. Merged reads were trimmed at the first base with a quality score of Q5 and filtered to remove those with an expected error rate > 0.5%. The primers, adaptors, and linkers were removed. We used the DADA2 pipeline (version 1.8) to generate the amplicon sequence variant (ASV) table [27]. DADA2 implemented the naïve Bayesian classifier method for taxonomic assignment using the UNITE and NCBI databases (2018 version). The ASV table records the number of times each exact amplicon sequence variant was observed in each sample. ASVs can be resolved exactly, down to the level of single-nucleotide differences over the sequenced gene region. Hence, the ASV table has a higher resolution than operational taxonomic unit (OTU) tables, which roughly cluster sequences using sequence identity thresholds. A total of 64 ASVs were identified from our samples.
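The trimming and filtering rule can be sketched in Python as follows. This is an illustration, not the USEARCH implementation: it assumes Phred+33 quality encoding, interprets "trimmed at the first base with a quality score of Q5" as truncating at the first base with quality at or below 5, and defines the expected error rate as the mean per-base error probability. All function names are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]

def trim_at_low_quality(seq, quals, min_q=5):
    """Truncate the read at the first base whose quality is <= min_q (assumption)."""
    for i, q in enumerate(quals):
        if q <= min_q:
            return seq[:i], quals[:i]
    return seq, quals

def expected_error_rate(quals):
    """Mean per-base error probability, using P(error) = 10^(-Q/10)."""
    if not quals:
        return 1.0
    return sum(10 ** (-q / 10) for q in quals) / len(quals)

def passes_filter(seq, quality_string, max_rate=0.005):
    """Keep a read only if, after trimming, its expected error rate is <= 0.5%."""
    quals = phred_scores(quality_string)
    seq, quals = trim_at_low_quality(seq, quals)
    return bool(seq) and expected_error_rate(quals) <= max_rate

# 'I' encodes Q40 (error 0.0001 per base); '5' encodes Q20 (error 0.01 per base).
print(passes_filter("ACGT", "IIII"))  # high-quality read
print(passes_filter("ACGT", "5555"))  # read whose expected error rate exceeds 0.5%
```

Working with expected errors rather than mean quality matters because a few very poor bases dominate the error probability even when the average score looks acceptable.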

16S rRNA sequencing for gut microbiota profiling
We amplified and sequenced the 16S rRNA gene V4 region to determine the microbiota profile as described

Statistical analysis
We performed data analysis using the Agile Toolkit for Incisive Microbial Analyses (ATIMA), which is part of the CMMR pipeline. We used the Kruskal-Wallis test to test the differences in the Shannon index for alpha diversity, as well as the differences in the relative abundance of ASVs, across the colonic segments. Differences in the beta diversity of the mycobiota among individuals and among colon segments were visualized using principal coordinate analysis (PCoA) plots based on the weighted Bray-Curtis distance matrix and tested using PERMANOVA. P-values were adjusted for multiple comparisons with the Benjamini-Hochberg false discovery rate (FDR) algorithm [28]. An FDR-adjusted P-value < 0.05 indicated statistical significance.
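The quantities named above can be written out in a short Python sketch. This is not the ATIMA implementation. It assumes the natural logarithm for the Shannon index (the base is not stated in the text) and the abundance-based form of the Bray-Curtis dissimilarity. The function names are hypothetical.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(u, v):
    """Abundance-weighted Bray-Curtis dissimilarity between two profiles."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

print(shannon([50, 50]))              # two equally abundant taxa
print(bray_curtis([10, 0], [0, 10]))  # no shared taxa
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A Bray-Curtis value of 0 means two samples have identical abundance profiles, and a value of 1 means they share no taxa, which is why the measure suits comparisons between individuals as well as between colon segments.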

Fungal-bacterial Interaction
We used the SparCC [29] implementation in the R package SpiecEasi (version 1.1.0) [30] to analyze the interaction between fungi and bacteria. We evaluated the interaction between 46 families of bacteria and eight families of fungi based on the ASV data.

Results
Description of study population
Table 1 presents the basic demographic and selected lifestyle characteristics of the 14 participants. Most of the participants were men and obese.

Fungal Diversity: Inter-individually And Across The Colon Segments
We identified a total of 64 unique ASVs in our samples. There was a significant difference in the beta diversity of the gut mycobiota among the 14 participants (Fig. 1). However, there was no compositional dissimilarity of fungi across the colonic segments (Fig. 2). There was no significant difference in the Shannon index across the colon segments (Fig. 3).

The relative abundance and the prevalence of fungal taxa in the colon
From the 64 fungal ASVs with a relative abundance of no less than 0.05%, we identified ten phyla, 23 classes, 31 orders, 45 families, 53 genera, and 60 species. Table 2 shows the most abundant fungal taxa and their relative abundance in all samples. Ascomycota and Basidiomycota were the most common phyla, with overall mean relative abundances of 45.4% and 45.3%, respectively. These two phyla were found in all the samples (Fig. 4A). Malassezia was detected in 13 of 14 individuals. Candida and Penicillium were detected in fewer than five individuals. The Saccharomyces genus was found only in the cecum of one individual, with a relative abundance of 0.18%. When a fungus was detected in an individual, it was generally present throughout the entire colon. Fig. 4B and  (Table 3). S. cerevisiae was not detected in our samples.

The relative abundance of major fungal phyla and genera across the colon segments
Table 3 shows no statistically significant difference in the relative abundance of fungal species across the five colon segments. However, some observations are worth noting. The relative abundance of Malassezia restricta was much lower in the cecum than in the other segments. The relative abundance of Candida albicans was also lower in the cecum and ascending colon than in the other segments. The relative abundance of Penicillium paneum also varied across the colon segments.
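The two summary measures used in this section, mean relative abundance and prevalence among individuals, can be computed from a count table as in the Python sketch below. The table layout, sample names, and counts are toy values chosen for illustration and do not reproduce the study data. The function names are hypothetical.

```python
def relative_abundance(sample_counts):
    """Convert one sample's {taxon: count} profile into proportions."""
    total = sum(sample_counts.values())
    return {t: n / total for t, n in sample_counts.items()} if total else {}

def mean_relative_abundance(table):
    """Mean proportion of each taxon across samples (absent taxa count as 0)."""
    taxa = {t for counts in table.values() for t in counts}
    profiles = [relative_abundance(counts) for counts in table.values()]
    return {t: sum(p.get(t, 0.0) for p in profiles) / len(profiles) for t in taxa}

def prevalence(table, sample_to_person):
    """Number of individuals with each taxon detected in at least one segment."""
    carriers = {}
    for sample, counts in table.items():
        for taxon, n in counts.items():
            if n > 0:
                carriers.setdefault(taxon, set()).add(sample_to_person[sample])
    return {t: len(people) for t, people in carriers.items()}

# Toy count table (hypothetical values): two individuals, three biopsies.
table = {
    "P1_cecum": {"Malassezia": 60, "Candida": 40},
    "P1_rectum": {"Malassezia": 80, "Candida": 20},
    "P2_cecum": {"Malassezia": 100},
}
sample_to_person = {"P1_cecum": "P1", "P1_rectum": "P1", "P2_cecum": "P2"}

mean_ra = mean_relative_abundance(table)
# Keep taxa at or above the 0.05% relative-abundance threshold.
abundant = {t for t, m in mean_ra.items() if m >= 0.0005}
print(mean_ra, prevalence(table, sample_to_person), sorted(abundant))
```

Prevalence is counted per individual rather than per biopsy so that a fungus present in all five segments of one person is not mistaken for a widespread colonizer.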

Fungal-bacterial interaction
The SparCC network plot of correlations between bacterial families and fungal families is shown in Fig. 5. Only significant correlations (two-sided P value < 0.001 based on bootstrapping of 1000 repetitions) with an absolute correlation coefficient ≥ 0.3 are presented. We found that Malasseziaceae was the only fungal family significantly correlated with bacteria. It was inversely correlated with Barnesiellaceae (correlation coefficient r = -0.48), Rikenellaceae (r = -0.49), and Acidaminococcaceae (r = -0.49), and positively correlated with Enterobacteriaceae (r = 0.42).
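The edge-selection rule for the network can be expressed as a small Python sketch. It does not implement SparCC itself and only applies the two stated thresholds to correlation estimates that are assumed to have been computed already. The family names, correlation values, and P values in the example are placeholders, not results from this study.

```python
def significant_edges(edges, min_abs_r=0.3, max_p=0.001):
    """Keep edges with |r| >= min_abs_r and P < max_p, and label their sign.

    `edges` is an iterable of (fungal_family, bacterial_family, r, p) tuples.
    """
    kept = []
    for fungus, bacterium, r, p in edges:
        if abs(r) >= min_abs_r and p < max_p:
            label = "co-abundant" if r > 0 else "co-exclusive"
            kept.append((fungus, bacterium, r, label))
    return kept

# Placeholder inputs (hypothetical names and values).
candidate_edges = [
    ("FungalFamilyA", "BacterialFamilyX", -0.45, 0.0005),  # strong, significant
    ("FungalFamilyA", "BacterialFamilyY", 0.40, 0.0002),   # strong, significant
    ("FungalFamilyB", "BacterialFamilyX", 0.50, 0.02),     # strong, not significant
    ("FungalFamilyB", "BacterialFamilyZ", 0.10, 0.0001),   # significant, but weak
]

for edge in significant_edges(candidate_edges):
    print(edge)
```

Requiring both conditions means that a weak correlation is not drawn even when its bootstrap P value is small, and a strong but unstable correlation is not drawn either.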

Discussion
We surveyed the mucosa-associated mycobiota across the longitudinal axis of normal-appearing adult human colon using the ITS2 amplicon sequencing. Ascomycota and Basidiomycota were the two major phyla. Malassezia and Candida were the two major genera. M. restricta, C. albicans, P. paneum, and M. spp were the major species.
We found that almost all individuals were colonized with Malassezia, with sporadic colonization by other fungal genera. Notably, although we observed significant inter-individual variability in fungal composition, the distribution of fungi longitudinally along the colon was not statistically heterogeneous within each individual. Correlations between fungi and bacteria were observed only between Malasseziaceae and a few bacterial families, including Barnesiellaceae, Rikenellaceae, Acidaminococcaceae, and Enterobacteriaceae.
Ascomycota, Basidiomycota, and unclassified fungi accounted for 97% of the fungal phyla. A previous study using the adjacent normal colonic mucosa of 27 patients with colorectal adenoma found that Ascomycota was the dominant phylum (80.5%), followed by Glomeromycota (3.1%) and Basidiomycota (2.5%) [18]. Another study using mucosal samples from 14 patients with UC and 14 healthy controls found that Ascomycota and Basidiomycota were the two dominant phyla [19]. In these two deep sequencing studies conducted in China, the relative abundance of Ascomycota was much higher than that of Basidiomycota. In our study, however, these two common phyla had nearly equal abundance in the normal mucosa of polyp-free individuals.
Malassezia, which belongs to Basidiomycota, was the most common genus in our study, with M. restricta being the most common species. Malassezia is abundant in human breast milk, colonizes the gut during the neonatal period [31], and can survive in the gut [32]. Malassezia has previously been recognized as the dominant fungal genus in the oral mycobiome [33]. Malassezia is also a skin commensal fungus frequently associated with mild skin infections such as tinea versicolor, and it occasionally causes fungemia in immunosuppressed individuals [34,35]. In fecal samples, Malassezia was identified as an opportunistic fungus in patients with colorectal tumors [36]. Malassezia restricta was shown to be a key player in the pathogenesis of Crohn's disease in a mouse model because this fungus can elicit innate inflammatory responses through caspase recruitment domain family member 9 (CARD9) [37]. While our study identified Malassezia as the most abundant fungal genus, another study using 54 mucosal samples taken from the cecum or rectum of 28 healthy individuals reported that Candida was the most abundant genus, followed by Cyberlindnera, Fusarium, Galactomyces, and Malassezia. The differences between these two studies included geography (California vs. Texas), race/ethnicity (Asian vs. African-American predominantly), and the taxonomic assignment tool (ASV vs. OTU-based) used [37]. Our finding suggests that the role of Malassezia in colonic physiology and pathology deserves further investigation.
Candida was the second most abundant genus in our study with a mean relative abundance of 11.8%, which is similar to the abundance reported by Nash using 317 fecal samples in the Human Microbiome Project [22].
Fechney et al reported that Candida is the most abundant oral colonizer in 17 children with or without dental caries [33]. C. albicans and C. tropicalis were the major Candida spp. in our study. C. albicans is not only a commensal in healthy individuals but also a pathogen of the gastrointestinal (GI) tract, mucosal surfaces, and the blood under opportunistic conditions, including disease- and drug-related immunosuppression or injury. Candida has recently been implicated in the pathogenesis of IBD [20], Clostridium difficile colitis [38], and graft-versus-host disease [39]. C. tropicalis is a common opportunistic infectious agent, especially in patients hospitalized in intensive care units [40]. C. tropicalis has also been associated with familial Crohn's disease [41]. In our study, Candida was not present in the colon of all participants. However, its occurrence in 43% of our colonically normal participants, coupled with its known potential pathogenic significance, suggests that its presence and role should be further investigated.
Penicillium, which is commonly found in soil and belongs to the phylum Ascomycota and the family Aspergillaceae [42], also colonizes the colon. P. paneum was the only Penicillium species identified in our samples. P. paneum, isolated from baled grass silage, has been shown to produce mycotoxins such as roquefortine C, marcfortine A, and andrastin A, as well as diverse secondary metabolites [43]. However, the implication of P. paneum for human health is essentially unknown.
Yamadazyma mexicana, unclassified fungi, and unclassified members of Cladosporium were also detected in the colonic mucosa. Yamadazyma mexicana is one of six yeast strains that can degrade environmental hydrocarbons [44], and its role in the gut is completely unknown. Cladosporium spp. are ubiquitous fungi that can cause skin infections and have been linked to allergic rhinitis, with the cell wall being classified as a potential allergen [45]. Among the 20 most abundant species, 10 were unclassified using our sequencing and bioinformatics pipeline. Additional research will be necessary to characterize these unclassified fungi in order to understand their potential function and significance in colon health and disease pathogenesis.
The fecal mycobiome has been examined using culture-based and non-culture-based methods. The following core fungal genera were proposed: Candida (especially C. albicans), Saccharomyces, Penicillium, Cryptococcus, Malassezia (especially M. restricta), Cladosporium, Galactomyces, Aspergillus, Debaryomyces, and Trichosporon [46]. However, Nash et al were unable to find the latter four genera using NGS in 317 stool samples [22]. Similarly, we did not identify Cryptococcus or Galactomyces in our mucosal samples. Saccharomyces was detected only in the cecum of one individual, with a relative abundance of 0.18%. Saccharomyces cerevisiae was not detected in our samples. Because Saccharomyces is a yeast that is most abundant in foods, our observation supports the notion that foodborne Saccharomyces cerevisiae is likely not a mucosa-associated fungus and is only transiently present in the human GI tract [47]. A Saccharomyces-free diet has been shown to eliminate its presence in the stool [15]. In our study, participants had a full colonic preparation and fasted for 12 hours before the procedure. The absence of food-related fungi may be an indicator of proper colonic preparation. On the other hand, the colonization of the colonic mucosa by fungi should be further investigated. Overall, the structure of the fungal community in the colonic mucosa differed from that reported for feces in other studies.
Our findings did not concur with previous research using mucosal samples. Li et al used the 18S rRNA region to survey the mycobiome in normal terminal ileum (non-colonic) tissue samples from seven healthy volunteers [20]. They found Saccharomyces cerevisiae, Saccharomyces castellii, Candida albicans, Candida tropicalis, Gibberella moniliformis, and Sclerotinia sclerotiorum to be the most abundant species. The differences between our study and that of Li et al included the anatomic locations surveyed, the target regions sequenced, and the genome assemblies used for fungal identification. ITS2 has improved resolution of mycobiome membership compared to metagenomics and 18S rRNA gene sequencing [48]. More research using a standardized approach is needed to characterize the fungi in humans.
In an earlier study, we found the bacterial distribution was largely homogeneous along the colon axis [6]. In the present study, we found that the distribution of the common fungal genus, Malassezia, was homogeneous along the colon. Most other fungal species were only sporadically detected in the colon. Whether rare or sporadic microorganisms play any role in patchy colonic lesions should be elucidated further.
Using SparCC, we detected bacterial-bacterial interactions in the colonic mucosa of these normal individuals. However, because fungi were sporadic, we only observed interactions of the most common fungal family, Malasseziaceae, with Acidaminococcaceae and Barnesiellaceae, and possibly with Enterobacteriaceae. We also observed co-abundance of Aspergillaceae and an unclassified member of Lactobacillales. Taken together, our data suggest that in the normal colon, there was no extensive interaction between fungi and bacteria because of the rarity of fungi.
Our study had several strengths. It was among the first to characterize fungi in the normal colon in humans.
Although we provided only a snapshot of the fungal profile, the survey of the mucosal mycobiota likely reflects the long-term indigenous colonic mycobiota, which are less likely to be disturbed by bowel preparation. In addition, we used an identical procedure to collect all biopsies in a clinical setting, which minimized contamination. Lastly, we used ASVs for taxonomic assignment, which is thought to be more precise than OTU-based assignment [49].
Our study also had several limitations. First, the ITS2 region used for fungal identification differs from ITS1, and the resulting taxonomic profile may be biased towards Ascomycota and Basidiomycota [50]. Second, the generalizability of this study was limited because it included predominantly overweight and obese middle-aged men using the VA healthcare system. Third, our sample size was too limited to test the distribution of less abundant fungi across the colon segments. Lastly, the research provided a snapshot of the fungal distribution in the gut after bowel preparation. Additional study is needed to describe the trajectory of fungal colonization over time and with aging.
In summary, using ITS2 targeted-amplicon sequencing and the present fungal genome assemblies, we found that fungi commonly present in indoor and outdoor environments also inhabit the lower GI tract in humans. Our study findings argue against the absence of fungi in "healthy" adult colons. Nevertheless, Malassezia was the only prevalent fungus, and others were sporadic. The potential co-occurring and co-exclusive correlations between Malasseziaceae and bacteria, and their physiological implications, deserve further research. The characterization of fungal origin, richness, composition, and structure in the human colon is essential for informing future research examining the role of fungi in both the maintenance of colonic health and the pathogenesis of colonic diseases.
Appropriate use of antifungal therapy will depend on understanding the role of fungi in health and disease.

Figure 1
Mycobiota beta diversity inter-individually. Principal coordinate analysis (PCoA) with weighted Bray-Curtis dissimilarity shows that there was a significant inter-individual difference in fungal beta diversity among the 14 participants (P value = 0.001, R^2 = 0.36, PERMANOVA test). PC1 and PC2 represent the top two principal coordinates that capture most of the diversity. The fraction of diversity captured by each coordinate is shown as a percentage on the corresponding axis. Each centroid represents one individual with five colon segments.

Figure 2
Box plot of alpha diversity of fungi across the colon segments. There was no significant difference in fungal richness and evenness (the Shannon index) across the colon segments (q value = 0.16, Kruskal-Wallis test).

Figure 5
SparCC network analysis of fungal and bacterial families. The node size represents the relative abundance of the families. Node color corresponds to phylum-level taxonomic classification. Edges between the nodes represent correlations between the nodes they connect. Red indicates a positive correlation and blue a negative correlation. Only significant correlations (two-sided P value < 0.001 based on bootstrapping of 1000 repetitions) with an absolute correlation magnitude ≥ 0.3 are presented.