Landscape of human gut mycobiome composition and diversity
To characterize the human gut fungal diversity and composition, we collected internal transcribed spacer (ITS) sequencing data from 15 published projects (Supplementary Table 1)(12, 19–28). In addition, we recruited 572 Chinese participants (Chinese Gut Mycobiome cohort, or CHGM) aged from 17 to 89 years old and profiled their fecal mycobiome with ITS1 sequencing. In total, 3,363 fecal samples with ITS1- (960 samples) and ITS2- (2,403 samples) sequencing data from 16 cohorts covering three continents (Europe, North America, and Asia) were included in our study (Fig. 1a).
The gut mycobiome composition and fungal diversity varied significantly across cohorts, which may be partially attributed to biological and technical factors such as geography and sequencing methods (Fig. 1b-d; p < 0.001, PERMANOVA, see Supplementary Note). Although we obtained a total of 1,120 genus-level taxonomic groups after combining all samples, the observed number of fungal genera was still considerably below the estimated saturation level (Extended Data Fig. 1c), suggesting that a further increase in sample size is required to characterize gut fungal diversity comprehensively. At the genus level, Saccharomyces and Candida were the most abundant genera across all samples, followed by Penicillium and Aspergillus (Fig. 1b). These genera are also the most common commensal fungi at other human body sites, including skin, lung, and oral cavity(29, 30), indicating a possibly well-balanced symbiotic relationship with humans.
The gut mycobiome, compared with the paired bacteriome, showed a significantly lower Shannon diversity yet higher between-individual dissimilarity (Extended Data Fig. 1e). This observation is in line with previous studies showing that, in comparison with the gut bacteriome, the gut mycobiome is less diverse but more individual-specific(21, 31). In addition, we found a positive correlation between the pairwise dissimilarities of fungal and bacterial communities across studies that had matched mycobiome and bacteriome datasets (Extended Data Fig. 1f), as well as a significant positive correlation between the alpha-diversity indices of the two communities (Fig. 1e; Supplementary Table 3), suggesting possible between-kingdom interactions within the gut microbiota.
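As a minimal illustration of the two quantities compared above, the sketch below computes the Shannon index for one sample and a between-sample dissimilarity for a pair of samples. The text does not state which dissimilarity metric was used; Bray-Curtis is assumed here purely for illustration, and the read counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors over the same taxa.

    Assumption: the paper's dissimilarity metric is not named in this section;
    Bray-Curtis is used here only as a common choice.
    """
    if len(a) != len(b):
        raise ValueError("samples must be profiled over the same taxa")
    denom = sum(a) + sum(b)
    if denom == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Hypothetical genus-level read counts for two samples (same genus order).
sample_1 = [50, 30, 20, 0]
sample_2 = [10, 10, 40, 40]

print(round(shannon_index(sample_1), 3))
print(bray_curtis(sample_1, sample_2))
```

A lower Shannon index together with a higher average pairwise dissimilarity is the pattern described for the mycobiome relative to the bacteriome.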
Enterotypes of the human gut mycobiome
To investigate the overall structural and compositional patterns of the human gut mycobiome, we stratified the genus-level fungal compositions of the 3,363 samples into distinct groups, i.e., enterotypes (Methods). The clustering analysis revealed that both the ITS1- and ITS2-combined datasets formed four distinct clusters (Fig. 2a, Extended Data Fig. 2a), and these enterotypes were highly concordant with the clustering results obtained at other taxonomic levels (Extended Data Fig. 2d). This finding remained unchanged even after removing half of the samples (Extended Data Fig. 2b). Three of these fungal enterotypes were found in both the ITS1- and ITS2-sequencing datasets, in which Saccharomyces, Candida, and Aspergillus were the most abundant genera, respectively (Extended Data Fig. 2e). We therefore defined the Saccharomyces-dominated enterotype as fun_S_E, and the Candida- and Aspergillus-dominated enterotypes as fun_C_E and fun_A_E, respectively. In addition to these three enterotypes, we observed a fourth enterotype in both ITS1 and ITS2 (Fig. 2a). However, the fourth enterotype in ITS1 was dominated by an unclassified member of the phylum Ascomycota (Ascomycota.sp, present in 15.1% of ITS1 samples), while in ITS2 it was driven by an unclassified member of the order Saccharomycetales (Saccharomycetales.sp, present in 5.5% of ITS2 samples). This difference in the fourth enterotype can be attributed to the different amplicon regions targeted by ITS1 and ITS2 sequencing. Hierarchical clustering on the combined datasets (ITS1 and ITS2) showed that these two enterotypes grouped together, suggesting that they had similar structures (Extended Data Fig. 2c). We thus refer to the fourth enterotype as fun_AS_E hereinafter.
We further confirmed the robustness of the enterotypes by performing a cross-dataset validation analysis between the ITS1- and ITS2-combined datasets with a LASSO logistic regression model (Methods). First, the model’s high prediction accuracy (Fig. 2b, Extended Data Fig. 3) supported the robustness of the fungal enterotypes. Second, the cross-validation still performed well when the enterotypes’ driver genera were excluded, showing that the enterotypes capture the overall fungal community structure independently of the main driver genera (Fig. 2b, Extended Data Fig. 3). Moreover, the consistent enterotype-specific fungal genera profiles across cohorts provided further evidence for the robustness of the fungal enterotypes (Fig. 2c).
We then examined the geographical and ecological characteristics of the fungal enterotypes. Among the different populations, we found that the fun_C_E enterotype was less common in the European populations (Fisher’s exact test, ITS1: p = 4.67e-14; ITS2: p = 3.92e-09), while the fun_S_E enterotype was relatively rare in the populations from North America (Fisher’s exact test, ITS1: p < 2.2e-16; ITS2: p < 2.2e-16). This difference might be partially attributed to the significantly lower abundance of Candida in European populations and of Saccharomyces in North American populations (Extended Data Fig. 1a). Furthermore, we observed that fun_S_E and fun_C_E had the lowest diversity (Extended Data Fig. 2f), and that the fungal alpha-diversity indices were inversely correlated with the abundances of the respective driver genera (Pearson’s r < -0.3, p < 2.2e-16).
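The enrichment and depletion tests reported here operate on 2×2 contingency tables (enterotype membership versus population). The following is a minimal sketch of a two-sided Fisher’s exact test written with the standard library only; the function name and the counts are hypothetical and are not taken from the study, and the authors’ actual software is not specified in this section.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). The p-value sums the hypergeometric
    probabilities of all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


# Hypothetical counts: rows = European / non-European samples,
# columns = fun_C_E / any other enterotype.
odds, p = fisher_exact_2x2(12, 188, 90, 310)
print(f"odds ratio = {odds:.2f}, p = {p:.2e}")
```

An odds ratio below 1 with a small p-value corresponds to depletion of the enterotype in the first row’s population; an odds ratio above 1 corresponds to enrichment.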
In addition, we explored the relationship between the fungal and bacterial enterotypes in the CHGM cohort, for which both ITS1 data (fungal profiling) and metagenomics data (bacterial profiling) were available (Methods). Four bacterial enterotypes, which were identified from genus-level metagenomics data following the same procedure as for the fungal enterotypes (Extended Data Fig. 4), were respectively dominated by Bacteroides (20.2% and 37.4% abundances in two bacterial enterotypes, annotated as prok_bac_E1 and prok_bac_E2, respectively), Prevotella (42.5% abundance in the prok_bac_E3 enterotype) and Enterobacteriaceae (34.9% abundance in the prok_bac_E4 enterotype). These observations are in line with those previously reported in Asian populations(15, 32). In addition, we observed a significant association between the fungal and bacterial enterotypes (Fig. 2d). For example, the fun_C_E fungal enterotype was enriched in the prok_bac_E1 enterotype (p = 3.6e-03, Fisher’s exact test) and depleted in the prok_bac_E3 enterotype (p = 0.024). We also observed that the fun_A_E enterotype showed a trend toward enrichment in the bacterial enterotype prok_bac_E2, while the fun_AS_E enterotype showed a trend toward enrichment in prok_bac_E4 (both p = 0.05). Together with the consistent results from other studies (Extended Data Fig. 5a), this evidence suggests a significant correlation between fungal and bacterial communities.
Age has a large effect on fungal enterotypes
We then explored the associations between the fungal enterotypes and the hosts’ basic characteristics, including age, gender and BMI. We found that age significantly explained the inter-individual variation of the human gut mycobiome or was strongly associated with the fungal enterotypes in the four cohorts with available age metadata: the CHGM cohort, Gao et al(20), Limon et al(12), and Zuo et al(19) (Fig. 3a, Supplementary Table 4). The non-significant explanatory power of age on the fungal enterotypes in the study by Gao et al(20) was likely attributable to the small sample size (n = 31). As shown in Fig. 3a, fun_C_E (38.8%) and fun_AS_E (34.0%) were significantly enriched in the elderly participants (age > 60 years), while fun_S_E (37.3%) and fun_A_E (44.9%) were significantly enriched in the young participants (age < 30 years, p < 0.05, Fisher’s exact test). In addition, a significant inverse correlation between the fungal Shannon diversity and chronological age was observed (Pearson’s r = -0.19, p = 3.34e-08). Moreover, a multi-variable linear regression analysis on 531 healthy participants from these four cohorts identified 21 fungal genera that significantly correlated with age (Fig. 3b; Methods). Notably, nine age-associated fungal genera differed in abundance distribution among the three fungal enterotypes (Supplementary Table 5). Among these genera, Candida, the driver genus of fun_C_E, correlated positively with age, while two other driver genera, Saccharomyces and Aspergillus, showed an inverse trend. This observation is consistent with the age distribution trends of their respective fungal enterotypes (Fig. 3a). Hence, we suspect that the association of fungal enterotypes with age is at least partially driven by their respective dominant fungal genera. No significant association of fungal enterotypes with BMI or gender was found in any cohort (Supplementary Table 4).
To further quantify the association between the fungal enterotypes and age in other cohorts without available age metadata, we calculated a gut aging index (GAI) for each sample based on the 21 age-associated fungal genera, where a higher GAI score indicates a higher level of intestinal aging (Methods). The GAI showed a strong correlation with the age of participants in each enterotype (Fig. 3c). Of note, participants of the fun_C_E and fun_AS_E enterotypes had consistently higher GAI scores across their lifespan, while those of the fun_S_E and fun_A_E enterotypes had relatively lower GAI scores (Fig. 3c). Similar results in healthy subjects of other cohorts without available age metadata further validated the significant associations of GAI scores with the fungal enterotypes (Fig. 3d). Consequently, participants of the fun_C_E enterotype, which contained more fungal taxa positively associated with age, showed a higher degree of intestinal aging, while participants of the fun_S_E enterotype showed a younger state (Fig. 3c,d). Additionally, the distribution of GAI scores in participants with different bacterial enterotypes provided further evidence for the correlation between fungal and bacterial enterotypes. For example, participants of the prok_bac_E3 enterotype (enriched in fun_S_E) had the lowest GAI scores, similar to those of the fun_S_E enterotype (Extended Data Fig. 6d). Furthermore, higher GAI scores, as observed in patients with intestinal dysbiosis compared to their paired controls, might indicate aging-related pathological changes in the intestine (Extended Data Fig. 6e, Supplementary Note).
Functional variations across fungal enterotypes
To characterize the bioactive potential of the fungal enterotypes, we annotated fungi-contributed pathways based on the paired shotgun metagenomics data in the CHGM cohort (Methods). In total, we identified 388 biological pathways in the cohort, among which 48 were contributed by fungi alone and 104 were contributed by both bacteria and fungi (fungi-contributed pathways hereafter). Functional richness (the observed number of fungi-contributed pathways) did not vary among fungal or bacterial enterotypes (Extended Data Fig. 2g). However, we identified a total of 31 fungi-contributed pathways whose distribution varied across enterotypes (adjusted p < 0.05, Supplementary Table 6). Furthermore, the relative abundances of these pathways were significantly correlated with those of 14 fungal genera (Fig. 4a, adjusted p < 0.05, Supplementary Table 6). Pathways related to carbohydrate degradation were overrepresented in the fun_AS_E enterotype, suggesting a possible increase in saccharolytic and proteolytic potential (Fig. 4a). Notably, most of the fun_S_E-enriched pathways were positively associated with the relative abundance of Saccharomyces, which implies an essential role of the genus Saccharomyces in these biological pathways. Two pathways involved in heme biosynthesis (PWY-5920 and HEME-BIOSYNTHESIS-II) were enriched in the fun_C_E enterotype and associated with its dominant genus, Candida. It has been demonstrated that heme, the key iron source for pathogenic bacteria, can have a negative impact on the intestinal mucosa and result in a higher risk of colorectal cancer (CRC)(33, 34). Thus, participants of the fun_C_E enterotype might have an increased risk of developing CRC.
To further examine the impacts of fungal enterotypes on human health, we explored the correlations of these enterotype-associated pathways with host properties. We observed a significant positive correlation between the relative abundance of the fun_C_E-associated pathway PWY-7279 (aerobic respiration) and subject age (Fig. 4b), consistent with the previous observation that the elderly population has a higher abundance of pathways involved in microbial respiration(35, 36). One possible explanation is that the higher oxygen level caused by aging-related inflammation promotes aerobic respiration in the gut microbiome(37). Additionally, one of the previously detected genera positively associated with age, Paracremonium, was also associated with aerobic respiration pathways (Fig. 3b, Fig. 4a). Moreover, we found a significant positive correlation between host BMI and PWY-2723, a trehalose degradation pathway (Fig. 4c). The fun_AS_E enterotype, in which PWY-2723 was enriched, showed a similar enrichment of biological pathways related to energy metabolism (Fig. 4a). These results are not only consistent with the higher BMI of the participants with the fun_AS_E enterotype (Extended Data Fig. 6f), but also in line with previous findings that the microbiota of obese individuals has an increased capacity for energy harvest(38). Thus, the functional differences observed across fungal enterotypes can partly explain the variation in host phenotypes among fungal enterotypes.
fun_C_E enterotype is prevalent in disease populations
We further examined the clinical relevance of the fungal enterotypes by assessing their associations with human diseases. By comparing the fungal enterotype distribution of healthy participants with that of patients, with adjustment for age, we found that the fun_C_E enterotype was significantly more prevalent in patients with diseases such as type 2 diabetes, Clostridium difficile infection, alcoholic hepatitis, and Alzheimer’s disease (Fig. 5a, p < 0.05, odds ratio > 1, Fisher’s exact test). Although there was no significant correlation between fungal enterotypes and other human diseases, we observed similar trends of a higher prevalence of the fun_C_E enterotype in patients with these diseases (Fig. 5a, odds ratio > 1). In contrast, the other two enterotypes (i.e., fun_S_E and fun_A_E) were mainly enriched in the healthy participants (Fig. 5a; odds ratio < 1), except that fun_S_E was enriched in two viral infectious diseases (H1N1 and COVID-19; Fig. 5a). To further quantify the disease associations across fungal enterotypes, we calculated a Gut Microbiome Health Index (GMHI) as previously described(39), where a higher GMHI value indicates a healthier status. Consistent with our expectation, the participants of the fun_C_E enterotype were more likely to have the lowest GMHI value (Fig. 5b), while those of the fun_A_E and fun_S_E enterotypes were more likely to have higher GMHI values. Thus, in addition to its association with higher intestinal aging, the fun_C_E enterotype might also be related to higher disease risk.
To explore the potential molecular mechanism contributing to the association of the fun_C_E enterotype with disease risk, we examined the intestinal barrier function as indicated by human DNA contents (HDCs) in the CHGM cohort (Methods). The HDC acts as an indicator of a compromised intestinal barrier, and previous studies have shown a significant elevation in HDCs among patients with several intestinal diseases(40). We found that the HDCs were significantly higher in the feces of participants of the fun_C_E and fun_AS_E enterotypes than in those of the fun_S_E and fun_A_E enterotypes (Fig. 5c; p < 0.05, Wilcoxon test). This finding is consistent with the GAI scores of these enterotypes (Fig. 3c). Therefore, a compromised intestinal barrier might help to explain the increased disease risk in participants of the fun_C_E enterotype. In addition, we observed significant correlations between the HDCs and the two fungi-contributed pathways involved in aerobic respiration (Fig. 5d,e; adjusted p < 0.05). These results indicate significant relationships among the compromised intestinal barrier (hence the increased HDC), gut aging, and the distribution and bioactive potential of the fungal enterotypes. Furthermore, through a bidirectional mediation analysis, we found that increased age might contribute to the HDC elevation by affecting the abundance of the aerobic respiration pathway (69%, p_mediation < 1e-04; Fig. 5f), meaning that the increased level of aerobic respiration significantly mediated the relationship between age and a compromised gut barrier.
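To make the mediated proportion concrete, the sketch below estimates it with the classical product-of-coefficients approach on ordinary least-squares fits. This is an assumption chosen for illustration: the mediation procedure actually used is described in Methods, not here, and the synthetic values for age, pathway abundance and HDC are invented so that the expected result can be checked by hand. Significance testing (for example by bootstrap) is not shown.

```python
def _mean(values):
    return sum(values) / len(values)


def _cross(u, v):
    """Sum of cross-products of the centered vectors u and v."""
    mu, mv = _mean(u), _mean(v)
    return sum((x - mu) * (y - mv) for x, y in zip(u, v))


def proportion_mediated(exposure, mediator, outcome):
    """Product-of-coefficients mediation on simple linear models.

    a : slope of mediator ~ exposure
    b : slope of the mediator in outcome ~ exposure + mediator
    c': slope of the exposure in the same model (direct effect)
    Returns (indirect effect a*b, direct effect c', proportion mediated).
    """
    s_xx = _cross(exposure, exposure)
    s_mm = _cross(mediator, mediator)
    s_xm = _cross(exposure, mediator)
    s_xy = _cross(exposure, outcome)
    s_my = _cross(mediator, outcome)

    a = s_xm / s_xx
    det = s_xx * s_mm - s_xm ** 2  # zero if exposure and mediator are collinear
    if det == 0:
        raise ValueError("exposure and mediator are perfectly collinear")
    direct = (s_mm * s_xy - s_xm * s_my) / det
    b = (s_xx * s_my - s_xm * s_xy) / det
    indirect = a * b
    return indirect, direct, indirect / (indirect + direct)


# Synthetic data (not from the study): mediator = 2*age + e, with e chosen to be
# uncorrelated with age; outcome = 0.5*age + 1.5*mediator.
age = [20, 40, 60, 80]
pathway = [41, 79, 119, 161]
hdc = [71.5, 138.5, 208.5, 281.5]

indirect, direct, share = proportion_mediated(age, pathway, hdc)
print(f"indirect = {indirect:.2f}, direct = {direct:.2f}, mediated = {share:.1%}")
```

In this toy example the indirect effect is 3.0 and the direct effect is 0.5, so about 86% of the total effect passes through the mediator; the 69% reported above is the analogous quantity for age, aerobic respiration and HDC.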