Recovery of 1376 species-level metagenome-assembled genomes from anammox microbiota
To provide a comprehensive collection of anammox microbiota, the metagenomic sequencing data from 193 samples were used in this study. The 193 samples were affiliated with individual anammox systems (110), PNA (35), PDA (24), SAD (17), and other anammox systems (7). These systems corresponded to different bioreactors, nitrogen loadings, and organic/inorganic conditions (Additional file 2: Table S1). In addition, we incorporated 43 new public samples after catalog construction to evaluate the reliability of the catalog and conduct downstream analyses.
We clustered a total of 7474 MAGs using thresholds of ≥ 99% and ≥ 95% average nucleotide identity, thereby generating 1768 strain-level and 1376 species-level non-redundant MAGs (Additional file 2: Table S3). Overall, 809 high-quality and 567 medium-quality genomes were obtained at the species level (Fig. 1a). The MAGs at the species level contained 1 to 2296 contigs with a median of 207 (Fig. 1b), with genome sizes ranging from 545.10 kilobases (Kb) to 11.7 megabases (Mb) and N50 values ranging from 2.68 Kb to 4.0 Mb (Fig. 1c, d). The above characteristics were also calculated for the strain-level MAGs (Additional file 1: Fig. S5). Unless otherwise specified, we used the species-level MAGs for downstream analyses.
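The two-threshold dereplication described above can be illustrated with a minimal sketch. It assumes pairwise ANI values are already available and uses single-linkage clustering, which is a simplification: the dereplication tool and linkage rule are not stated in this section. The genome names and ANI values are invented for illustration.

```python
from itertools import combinations

def cluster_by_ani(genomes, ani, threshold):
    """Single-linkage clustering: join two genomes whenever ANI >= threshold."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        # Pairs missing from the table are treated as unrelated (ANI 0).
        value = ani.get((a, b), ani.get((b, a), 0.0))
        if value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(clusters.values())

# Toy data, invented for illustration only.
genomes = ["bin1", "bin2", "bin3", "bin4"]
ani = {("bin1", "bin2"): 99.4, ("bin1", "bin3"): 96.1,
       ("bin2", "bin3"): 96.3, ("bin1", "bin4"): 82.0}

strain_level = cluster_by_ani(genomes, ani, 99.0)   # >= 99% ANI
species_level = cluster_by_ani(genomes, ani, 95.0)  # >= 95% ANI
print(len(strain_level), len(species_level))
```

In this toy case the stricter threshold yields more clusters than the looser one, mirroring the relationship between the 1768 strain-level and 1376 species-level MAGs.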
The MAGs were compared with the prokaryotic genomes in the GTDB for taxonomic classification (Additional file 2: Table S3). Almost all the MAGs could be assigned at the phylum, class, order, and family levels (Fig. 1e). Although 1095 of the 1376 MAGs were identified at the genus level, only 389 (28.27%) could be classified as known species. Even based on the latest GTDB (v214), only 581 (42.22%) MAGs were assigned to specific species, indicating that most of the recovered MAGs represent new candidate species in anammox systems. The highest proportion (25.58%) of MAGs was assigned to the phylum Proteobacteria, followed by the phyla Bacteroidota (14.61%), Chloroflexota (14.03%) and Planctomycetota (11.34%) (Fig. 1f). We also explored the taxonomy of these MAGs at the genus level and found that 9 prevalent genera had ≥ 10 MAGs: OLB14 (16 MAGs), PHOS-HE28 (16), Nitrosomonas (15), Rubrivivax (15), UBA12294 (13), Zeimonas (12), Desulfobacillus (11), JJ008 (11) and VBCG01 (10). These results suggested that these genera had high species diversity in anammox microbiota.
To determine whether the constructed AnMAGC could improve the coverage of microorganisms within anammox microbiota, we mapped all the metagenomic clean reads against both the 1376 MAGs and the MAGs obtained from individual binning (Fig. 1h). The average percentage of unmapped reads decreased significantly from 25.58% to 7.60% (P < 0.01), indicating that most microorganisms could be represented by these species-level MAGs. Nevertheless, we found a high level of unmapped reads in a few samples; therefore, in the future, the catalog must be continuously updated with more samples and with advanced technologies. In summary, we first obtained a high-quality genome catalog with potential novel species, which significantly improved the identification of microorganisms in anammox microbiota.
Distinctive patterns of microbial communities and functions across anammox systems
Coupled systems generally display significantly distinct microbial communities compared to individual anammox systems [4]. Therefore, we focused on the taxonomic and functional differences between individual anammox and PNA systems and between inorganic and organic systems, respectively. Alpha diversity analysis revealed that functional richness and diversity significantly increased with the complexity of the treatment process (Additional file 1: Fig. S6), presumably because these coupled systems promoted the growth of more species and resulted in higher Shannon indices (Fig. 2a). Principal coordinates analysis revealed that the microbial composition was strongly affected by process type (Fig. 2b-c). To gain deeper insights into the structural differentiation determined by process type, we identified the main taxa and their roles in driving differences using ALDEx2 and machine learning methods. Among 664 genera, 334–354 genera were significantly enriched in specific systems, and 709–760 species were identified as differentiated taxa (Additional file 2: Tables S5-S8). Moreover, both the RandomForest and LightGBM algorithms indicated that the different anammox systems could be distinguished based on microbial composition (Additional file 2: Table S10). Several ammonia-oxidizing bacteria (e.g., species MAG.643, MAG.717, MAG.606, and MAG.1143) were identified as bioindicators in the classification of individual anammox and PNA systems (Fig. 2d). Heterotrophic and denitrifying bacteria, such as MAG.4185 (genus Azonexus), MAG.1067 (genus Thauera) and the genera Flavobacterium and Hydrogenophaga, were important microorganisms in organic systems (Fig. 2e). Notably, the relative abundances of anammox bacteria (e.g., Kuenenia and Jettenia) and other key bacteria (e.g., Desulfobacillus, Zeimonas, Ignavibacterium, and UBA7227) were low in both the PNA and organic systems.
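For readers less familiar with the alpha diversity metrics mentioned above, the following minimal sketch computes richness and the Shannon index from a vector of per-taxon counts. It uses the natural logarithm, which is an assumption; the logarithm base and the software used for Fig. 2a are not stated in this section, and the example counts are invented.

```python
import math

def richness(counts):
    """Number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Invented example: an even community versus one dominated by a single taxon.
even = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]
print(richness(even), shannon_index(even))
print(richness(skewed), shannon_index(skewed))
```

Both toy communities have the same richness, but the even one has the higher Shannon index, which is why the two metrics are reported separately.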
A co-occurrence network of dominant species in anammox microbiota was constructed (Additional file 1: Fig. S7). Overall, the species exhibited a high prevalence of positive interactions in anammox microbiota, such as cooperation or mutualism, to maintain the stability of the microbial community. The networks of the individual anammox and inorganic systems exhibited high degrees of simplicity (Fig. 2f) with 919 and 730 edges, respectively. In contrast, the complex co-occurrence patterns observed in the PNA and organic systems (2673 and 3778 edges, respectively) might be attributed to substantial variations in the microbial compositions of these systems. Nevertheless, the low average path length and modularity observed in the PNA and organic systems (Additional file 1: Fig. S8) indicated the high efficiency of information or mass transport within the anammox microbiota, which likely facilitated the integration of diverse metabolic pathways in these systems.
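As a rough illustration of the average path length statistic discussed above, the sketch below computes it by breadth-first search over an undirected edge list. The node names are hypothetical, unreachable pairs are simply ignored, and the network software and correlation cut-offs used for the published networks are not stated in this section.

```python
from collections import deque

def average_path_length(edges):
    """Mean shortest-path length over all reachable ordered node pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    total, pairs = 0, 0
    for source in adjacency:
        # Breadth-first search gives shortest hop counts from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical toy networks with four nodes each.
chain = [("sp1", "sp2"), ("sp2", "sp3"), ("sp3", "sp4")]
star = [("hub", "sp1"), ("hub", "sp2"), ("hub", "sp3")]
print(average_path_length(chain), average_path_length(star))
```

The star network has the shorter average path length even though both toy networks have the same number of edges, showing that the statistic reflects how the edges are arranged rather than how many there are.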
Core and conditionally rare or abundant taxa in anammox microbiota
According to the relative abundance threshold for all genera, we identified a total of 64 core genera (8 strict, 18 general, and 38 loose) and 207 CRAT genera (Additional file 2: Table S12). The core genera included 361 species that were mainly affiliated with the phyla Proteobacteria, Chloroflexota, Bacteroidota, and Planctomycetota (Fig. 3a). The strict core genera involved in nitrogen metabolism included Desulfobacillus, Brocadia, Nitrosomonas, and Zeimonas. The general core genera included not only functional bacteria (e.g., Jettenia, Nitrospira, Rubrivivax, and Aquabacterium) but also 12 candidatus genera. Furthermore, a total of 31 candidatus genera were affiliated with the loose core genera, and these unclassified microorganisms should be the focus of future studies. In contrast to the genus-level results, we identified only 44 core species (one strict, 17 general and 26 loose) and 374 CRAT species based on the same criteria for all samples (Additional file 2: Table S13). Only the species Desulfobacillus sp003105015 (GCA_016860745.1) was identified as a strict core species, whereas three other species (GCA_015075465.1, GCA_015075505.1, and MAG.4175) in the genus Desulfobacillus belonged to the general core species. The general core species contained only two anammox species (MAG.2153 of Jettenia and MAG.5753 of Brocadia). Notably, most of the species in the strict core genera were identified as general/loose core species, such as species within the genera Brocadia, Zeimonas, and Nitrosomonas. These findings indicated that species-level heterogeneity generally occurred within anammox microbiota.
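The tiered core definition used above can be pictured with a small sketch. The actual criteria are not restated in this section, so every cut-off below (the minimum relative abundance and the three occurrence fractions) is a placeholder chosen for illustration, and the CRAT criteria are not modeled at all.

```python
def classify_taxon(abundances, min_abundance=0.001,
                   strict=0.9, general=0.7, loose=0.5):
    """Assign a tier from the fraction of samples in which the taxon reaches
    min_abundance. All cut-offs are placeholders, not the study's values."""
    if not abundances:
        raise ValueError("need at least one sample")
    occurrence = sum(a >= min_abundance for a in abundances) / len(abundances)
    if occurrence >= strict:
        return "strict core"
    if occurrence >= general:
        return "general core"
    if occurrence >= loose:
        return "loose core"
    return "non-core"

# Invented relative abundances for one taxon across ten samples.
print(classify_taxon([0.02] * 10))
print(classify_taxon([0.02] * 8 + [0.0] * 2))
print(classify_taxon([0.02] * 6 + [0.0] * 4))
print(classify_taxon([0.02] * 2 + [0.0] * 8))
```

Applying the same function at genus and species level would show how a genus can qualify as strict core while its individual species fall into lower tiers, which is the pattern reported above.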
We further investigated the relative abundances of strict and general genera across different systems (Fig. 3b and Additional file 1: Fig. S9). Apart from the genus OLB14, the relative abundances of other strict core genera were largely affected by process type. For instance, the genera Rubrivivax and VBCG01 had higher relative abundances in organic systems, while the genus IGN2 showed significant enrichment in PNA systems. Although core taxa, especially core species, accounted for only a small proportion of the total 1376 species, they exhibited high relative abundances (Fig. 3c-d). For example, the cumulative relative abundances in all samples were 24.93–31.78% for the strict core, 13.73–23.04% for the general core, and 11.38–20.76% for the loose core genera, and the CRAT represented 13.11–20.42%. Notably, the loose core and CRAT species accounted for more than 50% (50.62–61.44%) of the anammox microbiota (Fig. 4d), further providing evidence of significant species-level heterogeneity.
Additionally, we identified core and CRAT taxa individually for the four different systems with the same criteria (Additional file 2: Tables S12-S13). Although specific systems had similar numbers of core genera compared to the overall anammox microbiota, more distinctive species were identified as core taxa, potentially suggesting the strong impact of process type on specific species. Moreover, distinct species within a genus were observed among specific anammox microbiota under a given set of environmental conditions. The core taxa and CRAT constituted a smaller proportion of anammox microbiota in the PNA and organic systems, while the remaining fraction comprised numerous taxa with very low abundances and unclassified taxa (Fig. 3e-f).
Functional potentials of the core taxa
The development of anammox systems for the removal of multiple nutrients requires the presence of complex metabolic potentials, and we performed functional annotation on the 1376 MAGs to investigate their global functional potentials (Additional file 2: Tables S14-S21). Based on the global COG categories, ordination analysis strongly differentiated these species (Additional file 1: Fig. S10), especially those in the phyla Proteobacteria, Chloroflexota, and Patescibacteria. Species in these phyla, except for Patescibacteria, possess wide-ranging functional potential and are involved in nitrogen, phosphorus, and sulfur metabolism. For example, the nitrous oxide (N2O) reductase gene (nosZ) is primarily present in the phyla Proteobacteria and Bacteroidota and potentially contributes to N2O reduction in anammox systems. Consistent with the crucial role of quorum sensing in facilitating the formation of granules or biofilms [66], most species exhibit complete pathways (i.e., synthesis, receptor, and quenching) for acyl-homoserine lactones (AHLs) and the secondary messenger c-di-GMP. The ubiquity of receptors for autoinducer-2 (AI-2) and diffusible signal factor (DSF) also reveals their potential roles in anammox microbiota.
To estimate the role of core taxa in anammox microbiota, we further elucidated the global functions of distinct groups of core genera (Fig. 4). Most of the core taxa presented the aforementioned functional potentials, indicating their primary contributions to a wide range of metabolic activities. For example, a substantial fraction of loose core taxa could reduce nitric oxide (NO) and N2O, potentially serving as N2O sources and sinks in anammox microbiota [67]. Both the PNA and organic systems significantly promoted the potential of core taxa for NO and N2O reduction, likely increasing the risk of N2O emission in these coupled systems [68]. Notably, the metabolic potential for complex carbon degradation was predominantly observed in loose core taxa, which were primarily affiliated with the classes Anaerolineae and Phycisphaerae. These taxa can decompose extracellular polysaccharides in anammox microbiota [8].
Furthermore, we calculated the completeness of the biosynthesis pathways in core taxa to reveal their potential cross-feeding interactions (Fig. 4f). Overall, the strict and general core taxa exhibited greater functional potential for nutrient biosynthesis, while loose core taxa exhibited more prominent auxotrophy. The ability to synthesize molybdenum cofactors was observed in almost all core taxa, except for the phylum Zixibacteria and the genus Nitrosomonas from the PNA systems. No complete pathway for the biosynthesis of ectoine was detected in these core taxa, and anammox bacteria generally acquire exogenous ectoine through transporter systems [69]. Multiple amino acid biosynthesis pathways were observed in numerous core taxa, but these species lacked the potential for synthesizing lysine, cysteine, and phenylalanine. Moreover, owing to the high metabolic cost of tryptophan and phenylalanine production, the growth of core taxa inevitably depends on exogenous nutrients from other species, such as anammox bacteria [16]. Vitamin auxotrophy was widespread in core taxa, especially for cobalamin, which is important for numerous metabolic functions in prokaryotes [70]. The potential biosynthesis pathway of cobalamin was identified in the genera Nitrospira and Thauera, family J111 (class Anaerolineae), and anammox bacteria. Auxotrophs for riboflavin and thiamine were also ubiquitous in core taxa, while loose core taxa in organic systems had greater synthetic capabilities. Anammox bacteria generally lack the capabilities for pantothenate, folate, and biotin production, which are widely observed in other core taxa.
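Pathway completeness, as used above, can be thought of as the fraction of pathway steps for which a genome carries at least one suitable gene. The sketch below assumes that simple definition, which may differ from the scoring actually used for Fig. 4f; the step and gene names are hypothetical placeholders, not real annotations.

```python
def pathway_completeness(required_steps, genome_genes):
    """Fraction of steps covered, where each step lists alternative genes
    and counts as present if the genome carries any one of them."""
    if not required_steps:
        raise ValueError("pathway has no steps")
    present = sum(
        1 for alternatives in required_steps
        if any(gene in genome_genes for gene in alternatives)
    )
    return present / len(required_steps)

# Hypothetical four-step pathway; step 2 can be carried out by either of two genes.
pathway = [{"geneA"}, {"geneB1", "geneB2"}, {"geneC"}, {"geneD"}]
genome = {"geneA", "geneB2", "geneD", "geneX"}

score = pathway_completeness(pathway, genome)
print(score)  # three of four steps covered
```

Under this definition, a genome scoring below 1.0 for a given nutrient is a candidate auxotroph, although a missing annotation in an incomplete MAG can produce the same result.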
Diversity of the genera Desulfobacillus and Zeimonas in anammox microbiota
Previous studies have elucidated the crucial functions of the phylum Chloroflexota (e.g., the core genera UBA12294 and OLB14) in anammox microbiota [71–73]. However, previous studies underestimated the significance of several core taxa associated with the phylum Proteobacteria [20, 74]. Here, we conducted a comparative genomic analysis to reveal the global profiles of the strict core genera Desulfobacillus and Zeimonas (Additional file 2: Tables S24-S25). Notably, the genus Desulfobacillus was classified within the family Rhodocyclaceae and showed a close phylogenetic distance to the genus Denitratisoma (Additional file 1: Fig. S12). The genus Zeimonas, which belongs to the family Burkholderiaceae, exhibited a close phylogenetic relationship with the genus Lautropia (Additional file 1: Fig. S13). Based on amplicon sequencing, the two genera Denitratisoma and Lautropia were commonly detected and recognized as key microorganisms in anammox microbiota [67, 75, 76]. We therefore speculated that the understanding of the two genera in anammox systems was poor in previous studies. Based on the ANI and AAI similarity values, as well as the GTDB annotation, we greatly expanded the species-level diversity of the genera Desulfobacillus (five new species) and Zeimonas (seven new species) compared to the reference genomes (Fig. 5a, c and Additional file 1: Fig. S14). By mapping metagenomic reads to these species, we identified six species of the genus Desulfobacillus and three species of the genus Zeimonas as being responsible for the predominance of these genera (Additional file 1: Fig. S15).
The pangenomes, including the core and accessory gene clusters, comprised 15205 (Desulfobacillus) and 26800 (Zeimonas) gene clusters, respectively, while the number of core gene clusters decreased to 44–114 (Additional file 1: Figs. S17-S18). This indicated that species-level functional differentiation occurred (Fig. 5b, d). Next, we functionally annotated the pangenomes of each genus to investigate specific metabolic capabilities, especially those potentially relevant to nutrient removal and microbial interactions (Fig. 5e and Additional file 1: Figs. S18-S20). Importantly, six predominant species within the genus Desulfobacillus carried the hydroxylamine oxidoreductase gene (hao), whereas most low-abundance and reference species had lost this crucial gene responsible for NO and/or hydroxylamine production [74]. The absence of the nitric oxide reductase gene (norB) indicated that the genus Desulfobacillus may serve as a potential N2O sink in anammox microbiota. The genus Zeimonas, in contrast, showed greater potential for achieving complete denitrification, even though some low-abundance species also lacked the norB gene. Moreover, all species of the genus Desulfobacillus and three dominant species of the genus Zeimonas could reduce nitrite to ammonia via a cytochrome c periplasmic nitrite reductase gene. These results indicated that these two genera likely exhibited diverse nitrogen cross-feeding relationships with anammox bacteria. The metabolic pathways for phosphorus and sulfur were more complete in the genus Desulfobacillus than in the genus Zeimonas. Multiple functional genes coding for the SOX system were observed in the two genera, indicating that these species oxidized sulfur compounds to support their energy metabolism [77]. Abundant species of the two genera encoded carbon fixation pathways (i.e., the full Calvin-Benson-Bassham cycle).
Furthermore, these species can utilize various carbon substrates, such as aromatic compounds, acetate, pyruvate, glycerol, and lactate (Fig. 5e). The presence of these differentiated species with functional plasticity likely enhanced the mixotrophic nutrition modes among anammox microbiota [74]. In addition, the signaling molecules AI-2 and AHLs could regulate the metabolic activities of the two genera due to the availability of receptor proteins (Additional file 1: Fig. S19), and several species possessed genes encoding quenching proteins. Although auxotrophy is widely observed in these species, their nearly identical characteristics in terms of the biosynthesis of amino acids and vitamins demonstrated their similar cross-feeding interactions with other species (Additional file 1: Figs. S20-S21). Overall, functional plasticity probably contributed to increased species diversity within the genera Desulfobacillus and Zeimonas, thereby enhancing the coexistence of multiple species in anammox microbiota.
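The core and accessory gene clusters reported for the two pangenomes can be illustrated with a minimal sketch that splits clusters by presence across genomes. It assumes the strictest definition of core (present in every genome); the pangenome software and any relaxed core threshold used in the study are not stated in this section, and the cluster and genome names are hypothetical.

```python
def split_pangenome(presence, genomes):
    """Split gene clusters into core (present in every genome) and accessory.

    presence maps each gene cluster to the set of genomes that carry it.
    """
    all_genomes = set(genomes)
    core = {cluster for cluster, members in presence.items()
            if all_genomes <= set(members)}
    accessory = set(presence) - core
    return core, accessory

# Hypothetical presence/absence data for three genomes.
genomes = ["sp1", "sp2", "sp3"]
presence = {
    "cluster1": {"sp1", "sp2", "sp3"},  # shared by all genomes
    "cluster2": {"sp1", "sp2"},         # shared by some genomes
    "cluster3": {"sp3"},                # unique to one genome
}

core, accessory = split_pangenome(presence, genomes)
print(sorted(core), sorted(accessory))
```

With this definition, adding more divergent genomes can only shrink the core set and enlarge the accessory set, which matches the small core and large total cluster counts described above.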