MAGs recovered from composting
A total of 11 MAGs from the ZC3 dataset and 49 from the ZC4 dataset were recovered (Supplementary Table S1). All of our MAGs meet the medium-quality requirement (> 50% completeness and < 10% contamination) of the MIMAG standard [29], with one exception: ZC4RG07 had 10.87% contamination. Thirty-four MAGs meet the high-quality requirement (> 90% completeness and < 5% contamination). The average number of contigs in these MAGs is 363.75, with a minimum of 15 and a maximum of 1,655. Thirteen MAGs could be assigned to species for which there is at least one genome publicly available; this allowed us to compare our MAGs to publicly available genomes of the same species (Supplementary Table S2a). In all cases the two-way ANI measure was above 98% and the GGDC DDH estimate (formula 2) was at least 87.3%, strongly suggesting that the assignments are correct and that the recovered genomes are of high quality. These 13 MAGs correspond to 11 species that have been found in environments related to biomass degradation, with the possible exception of M. hassiacum, which has been isolated from human patients. All these species have been reported as thermophilic bacteria (Supplementary Table S3).
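The quality tiers quoted above can be expressed as a simple rule. The following is a minimal sketch that uses only the two thresholds stated in the text (completeness and contamination); the full MIMAG standard includes further criteria that are not modeled here. The function name and the example values are hypothetical and are not taken from Supplementary Table S1.

```python
def mimag_tier(completeness: float, contamination: float) -> str:
    """Assign a quality tier from completeness and contamination (both in %).

    Only the two thresholds quoted in the text are applied.
    """
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness > 50 and contamination < 10:
        return "medium"
    return "below-medium"


# Hypothetical values for illustration only.
examples = {
    "MAG_A": (95.2, 1.3),
    "MAG_B": (72.0, 4.1),
    # Contamination set to the value reported for ZC4RG07; completeness is assumed.
    "MAG_C": (80.0, 10.87),
}

tiers = {name: mimag_tier(*values) for name, values in examples.items()}
for name, tier in tiers.items():
    print(name, tier)
```

Under these thresholds, a contamination of 10.87% falls outside the medium-quality tier regardless of completeness, which is why ZC4RG07 is reported as an exception.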
Taxonomic assignments
The 60 recovered MAGs could be assigned to six different phyla: Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, Firmicutes and Proteobacteria (Supplementary Table S1). At the order level there is remarkable diversity: 32 different orders are represented. The most frequent order was Limnochordales (six MAGs). Most of the MAGs seem to be novel: there are seven MAGs for which no order could be assigned, 14 MAGs for which no family could be assigned, 14 MAGs for which no genus could be assigned, and 12 MAGs for which no species could be assigned. This count does not take into account GTDB-tk assignments to taxa that are not currently accepted (such as family WCHB1–69 within the order Bacteroidales).
Presence of MAGs in other datasets
We compared the 60 MAGs with genomes recovered from our previous studies [3, 11] and between both composting cells (Supplementary Table S2b). We observed that a few genomes can be said to be present (using the same methodology as for MAG species assignment) in different samples, although the majority were found in only one sample. The recovery of the “same” genome from different samples lends additional confidence to our genome recovery process. There are probably more occurrences of the same MAGs in these samples than reported here, but our sequencing coverage and stringent genome recovery criteria may have prevented us from recovering additional already-observed MAGs from all samples.
Functional analysis of MAGs
For functional analysis, we have focused on ZC4 samples only, because metatranscriptome data are available only for them. ZC4 is composed of nine time-series samples (days 1, 3, 7, 15, 30, 64, 67, 78 and 99). A comparison of variation in relative abundance of transcripts over time showed that all MAGs analyzed here were transcriptionally active (Supplementary Table S4).
The composting process is clearly very complex, both in terms of its microbiota and in terms of the varied subprocesses that occur over the approximately three months during which composting takes place. For the functional analysis that follows we have focused on functions that we deemed relevant for composting microbial systems. Those that we chose to study in detail are: lignocellulose degradation; denitrification; sulfur metabolism; hydrogen metabolism; and oxygen metabolism. Secondary metabolite production and antibiotic resistance genes were also included given their role in microbial interactions.
Lignocellulose degradation
Biomass-degrading capabilities in MAGs were analyzed based on COG assignment (Supplementary Fig. S1) and CAZy annotation (Fig. 1; and Supplementary Tables S5 and S6). Among the 49 ZC4 MAGs, 12 each contain more than 150 CDSs classified as CAZymes (Fig. 1). Each of these 12 MAGs also has at least 40 CDSs classified as GHs (Fig. 1). Several cellulases (GH5, GH6, GH9 and GH45), endohemicellulases (GH8, GH10, GH11, GH12, GH26, GH28 and GH53), debranching (GH51, GH62, GH67 and GH78) and oligosaccharide-degrading enzymes (GH1, GH2, GH3, GH29, GH35, GH38, GH39, GH42 and GH43) were found in these MAGs (Fig. 1).
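To make the counting above concrete, the following minimal sketch tallies CAZy annotations per MAG and reports the total number of CAZyme CDSs and the number assigned to GH families. It assumes that each CDS carries a single family label such as "GH5" or "AA10"; the input layout, function names, and toy data are hypothetical and do not reproduce the annotation pipeline used in this study.

```python
import re
from collections import Counter


def cazy_class_counts(families):
    """Count CDSs per CAZy class (GH, GT, AA, ...) from family labels such as 'GH5'."""
    counts = Counter()
    for family in families:
        match = re.match(r"[A-Z]+", family)
        if match:
            counts[match.group(0)] += 1
    return counts


def summarize(families):
    """Return (total CAZyme CDSs, CDSs assigned to GH families)."""
    counts = cazy_class_counts(families)
    return sum(counts.values()), counts["GH"]


# Toy annotations for two hypothetical MAGs (one label per CDS).
annotations = {
    "MAG_A": ["GH5"] * 45 + ["GT2"] * 80 + ["CE4"] * 30,
    "MAG_B": ["GH3"] * 10 + ["AA10"] * 2,
}

summary = {mag: summarize(fams) for mag, fams in annotations.items()}
# Apply the "more than 150 CAZyme CDSs" cutoff mentioned in the text.
cazyme_rich = [mag for mag, (total, _) in summary.items() if total > 150]
print(summary, cazyme_rich)
```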
Regarding Auxiliary Activities (AA), ZC4RG20 (c__Gammaproteobacteria), ZC4RG33 (g__Aquamicrobium), ZC4RG43 (s__Mycobacterium hassiacum), and ZC4RG45 (s__Thermocrispum agreste) present at least 15 CDSs classified as AA (Supplementary Table S6). ZC4RG45 contains the highest diversity of AA genes. Members of the AA1 family, which perform lignin degradation efficiently, were only found in ZC4RG08 (s__Pseudomonas thermotolerans). ZC4RG21 (s__Thermobifida fusca), ZC4RG04 (s__Thermobispora bispora), ZC4RG28 (f__Streptosporangiaceae), ZC4RG45, and ZC4RG47 (g__Micromonospora) were the only ones containing CDSs classified in the AA10 family (lytic polysaccharide monooxygenases), members of which are capable of directly targeting cellulose for oxidative cleavage of the glucose chains.
As stated above, we have strong evidence that each MAG here analyzed was transcriptionally active. We checked the expression of CDSs related to lignocellulose degradation, and determined that all CAZymes mentioned here are being expressed in the composting process (Fig. 2).
Secondary metabolites
Several CDSs classified as secondary metabolite genes (siderophores, bacteriocins, sactipeptides, betalactones, lassopeptides, and type I, II and III polyketides) were found in MAGs (Fig. 3). MAGs with at least six secondary metabolite genes were ZC4RG43 (s__M. hassiacum) (16 genes), ZC4RG39 (f__Steroidobacteraceae) (13), ZC4RG21 (s__T. fusca) (10), ZC4RG47 (g__Micromonospora) (7), ZC4RG22 (o__Luteitaleales) (6), ZC4RG04 (s__T. bispora) (6) and ZC4RG46 (o__Polyangiales) (6). Most of these MAGs were classified as Actinobacteria or Proteobacteria.
We observed that secondary metabolite genes were more highly expressed in the initial days (D1, D3, D7) than in the final days (D78, D99) (Supplementary Table S7).
Antibiotic resistance genes
Antibiotic resistance gene (ARG) clusters were observed mainly in MAGs from Actinobacteria and Proteobacteria phyla (Fig. 4). ZC4RG08 (s__P. thermotolerans) has the largest number of ARGs (12 CDSs), followed by ZC4RG43 (s__M. hassiacum) (11 CDSs). Additionally, several multidrug efflux pumps were found in ZC4RG08 (MuxABC-OpmB, MexAB-OprM, MexEF-OprN, MexWV and MexJK) (Supplementary Table S8).
Transcripts coding for resistance genes were more abundant in the early (D01 and D03) and final (D78 and D99) days (Supplementary Table S9).
Aerobic and anaerobic respiration strategies
The analysis of oxygen metabolism indicates that nearly all bacteria from which MAGs were obtained are aerobes (Supplementary Table S10). The oxidases detected were all active, and transcript abundance variation over time shows a slight decrease in D7, with an increase following the turning procedure (Supplementary Fig. S2). Evidence of oxidases and aerobic metabolism was not detected in MAGs ZC4RG12 (g__Caldicoprobacter), ZC4RG32 (s__Caldicoprobacter faecalis), ZC4RG34 (f__Thermovenabulaceae), and ZC4RG49 (s__[Clostridium] cellulosi), indicating a strictly anaerobic metabolism. These MAGs have all been classified within the phylum Firmicutes and demonstrated a similar profile of activity based on the global abundance variation of their transcripts across composting, with a peak in D7 (Supplementary Table S4).
Sulfate reduction via cysteine desulfurase (cysCN) was detected as a widespread and active function among MAGs (Supplementary Table S10). This mechanism of sulfate reduction is part of the assimilatory sulfate reduction pathway, by which sulfate is incorporated into cysteine. Evidence for dissimilatory sulfite reductase function (dsrAB and apsA) was not detected in the genomes, suggesting that respiratory sulfate reduction was not the main strategy for anaerobic respiration employed by the bacteria in this composting process.
Active denitrification genes were detected in several MAGs (Supplementary Table S10). Eighteen of them presented the nitrite respiration gene nirK. The variant nirS was not detected in the MAGs. ZC4RG13 (s__Rhodothermus marinus), ZC4RG22 (o__Luteitaleales), ZC4RG26 (s__Sphaerobacter thermophilus), and ZC4RG29 (f__Cyclobacteriaceae) encode the complete denitrification pathway (i.e., nitrous-oxide reductase pathway, nosZD). ZC4RG29 lacks the nirK gene and has evidence for nitrite reductase using the nirB gene. Nitrous-oxide reductase genes (nosZD) were also active during the composting process. The abundance of transcripts of these denitrification genes increases over time, with a peak starting in D7 (Supplementary Fig. S2).
Chemolithotrophic metabolism
We found evidence for chemolithotrophic metabolism based on MAG genes related to the oxidation of inorganic sulfur compounds (Supplementary Table S10). Nearly all MAGs have genes from the sulfur oxidation pathway via sulfur dioxygenase. Some of the MAGs were found to be more versatile and had genes annotated with other functions associated with the oxidation of sulfur compounds. ZC4RG20 (c__Gammaproteobacteria), for instance, represents a bacterial population that showed transcripts associated with sulfur dioxygenase, sulfide oxidation (sqr), and thiosulfate oxidation (soxC), including transcripts associated with carbon fixation via rubisco activation, supporting chemolithoautotrophic growth. The Sox system, which is able to oxidize sulfite and sulfone group in thiosulfate, was found in MAGs ZC4RG25 (f__Hyphomicrobiaceae), ZC4RG31 (f__Hyphomicrobiaceae), ZC4RG33 (g__Aquamicrobium), and ZC4RG42 (o__Betaproteobacteria), although soxC was apparently lacking in all of them. Sulfide oxidation (sqr) to thiosulfate was also detected in ZC4RG25. These observations highlight that members of the bacterial populations in the composting microbiome were capable of harvesting energy by oxidizing inorganic sulfur compounds. CDSs associated with nitrification genes (amo and hao) were not detected in the MAGs, indicating that this trait was of minor or no relevance for the composting microbiome.
Several hydrogenases were found to be present and expressed (Supplementary Table S10). We were able to identify two types of hydrogenases. MAGs ZC4RG04, 09, 13, 15, 26, 28, 36, 37, 38, 43, and 49, belonging to diverse phyla (Supplementary Table S1), presented CDSs associated with [NiFe] hydrogenases. MAGs ZC4RG11, 12, 23, 32, 34, 38, and 49, from the phylum Firmicutes (Supplementary Table S1), have CDSs annotated as prototypical hydrogen-evolving [FeFe] hydrogenases (group A1).
Co-occurrence of MAGs
Using the 49 ZC4 MAGs, we inferred correlation patterns using their variation in abundance (metagenomic datasets) and in activity (metatranscriptomic datasets) during the composting process. The correlation patterns derived from the abundance profile resulted in a graph composed of 40 nodes and 107 interactions, and four clusters (Fig. 5a). The correlation patterns derived from the activity profile resulted in a graph composed of 43 nodes and 76 interactions, and three clusters (Fig. 5b).
We observed a high concordance between both correlation analyses. Fifty-five interactions observed in the graph based on metagenomic datasets (Fig. 5a) are also present in the graph based on metatranscriptomic datasets (Fig. 5b). The correlations based on the activity of MAGs give us predictions on which MAGs are the key microbial players in the composting process, as well as when and with whom they interact. In what follows we describe the main features of each activity cluster, highlighting the high numbers of transcripts for selected cluster member MAGs on particular days (Supplementary Table S11).
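The concordance between the two graphs can be quantified from the edge counts given above. The sketch below first shows, on toy data, how shared interactions can be counted when edges are treated as undirected pairs, and then computes overlap fractions from the reported counts (55 shared, 107 abundance-based, 76 activity-based). The helper names and toy edges are hypothetical, and the Jaccard index is offered only as one possible summary, not as a measure used in this study.

```python
def edge_set(pairs):
    """Represent undirected edges so that (a, b) and (b, a) compare equal."""
    return {frozenset(pair) for pair in pairs}


# Toy graphs with hypothetical node labels.
abundance_edges = edge_set([("A", "B"), ("B", "C"), ("C", "D")])
activity_edges = edge_set([("B", "A"), ("C", "D"), ("D", "E")])
shared = abundance_edges & activity_edges


def overlap_fractions(n_shared, n_first, n_second):
    """Return the shared fraction of each graph and the Jaccard index."""
    jaccard = n_shared / (n_first + n_second - n_shared)
    return n_shared / n_first, n_shared / n_second, jaccard


# Counts reported in the text.
frac_abundance, frac_activity, jaccard = overlap_fractions(55, 107, 76)
print(f"{frac_abundance:.3f} {frac_activity:.3f} {jaccard:.3f}")
```

With the reported counts, roughly half of the abundance-based interactions and nearly three quarters of the activity-based interactions are shared.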
Cluster 1: Seven MAGs form this cluster. Transcripts from members of this cluster are more abundant on initial days (D01 and D03) with a slight later increase on D64 (immediately after turning), followed by another increase on D99. Cluster members ZC4RG02 (g__Pseudoxanthomonas), ZC4RG04 (T. bispora), and ZC4RG28 (f__Streptosporangiaceae) presented the highest number of transcripts in the initial days.
Cluster 2: This cluster is composed of 10 MAGs, all Firmicutes and mostly abundant and active between D3 and D15, followed by a peak on D64. Many transcripts from ZC4RG12 (g__Caldicoprobacter) and ZC4RG32 (s__Caldicoprobacter faecalis) related to lignocellulose degradation were identified, especially on D3, D7, and D15.
Cluster 3: This 26-MAG cluster is taxonomically diverse (it contains members of Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, and Proteobacteria phyla). The following cluster members are notable for expressing genes related to lignocellulose breakdown: ZC4RG13 (s__R. marinus), ZC4RG14 (f__Steroidobacteraceae), ZC4RG16 (s__T. fusca), ZC4RG26 (s__S. thermophilus), ZC4RG29 (f__Cyclobacteriaceae), ZC4RG36 (c__Anaerolinea), ZC4RG46 (o__Polyangiales), ZC4RG47 (g__Micromonospora), and ZC4RG48 (f__Roseiflexaceae); this activity is especially intense on D30, D78, and D99. ZC4RG20 (c__Gammaproteobacteria) and ZC4RG45 (s__T. agreste) express several genes associated with lignin degradation (i.e., annotated with CAZy AA families), especially on day 99.
Metabolic dependencies based on genome-scale models
Based on the results obtained with the co-occurrence analysis (Fig. 5b) and the activity of relevant functional genes (Supplementary Tables S7, S9, S10, S11, Fig. 2, and Supplementary Fig. S2), we identified MAGs according to their importance in the different stages of composting and the main functions associated with them (Fig. 6). We used this model in turn to assess the metabolic dependencies between these MAGs based on genome-scale models (Supplementary Fig. S3). The results revealed strong dependencies between some of the MAGs, as they received a maximum dependency score (Table 1). According to the models obtained, the most frequent compounds involved in the interactions between MAGs are hypoxanthine, H+, uracil, and phosphate. ZC4RG26 (s__Sphaerobacter thermophilus) had the highest level of dependency on other MAGs, being predicted to be a metabolite receiver of 11 compounds, followed by ZC4RG08 (s__Pseudomonas thermotolerans) and ZC4RG20 (c__Gammaproteobacteria), predicted to receive six and five compounds, respectively. ZC4RG04 (s__Thermobispora bispora) and ZC4RG22 (o__Luteitaleales) were predicted to have the highest number of interactions as metabolite producers. They were both predicted to be donors of seven compounds, followed by ZC4RG28 (f__Streptosporangiaceae), predicted to be a donor of six compounds. The set of keystone MAGs during the final composting stage (D99) presented higher possibilities for strong metabolite dependencies between MAGs compared to the other stages (Table 1).
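The donor and receiver tallies described above can be illustrated with a minimal sketch. It assumes that predicted exchanges are available as (donor, receiver, compound) triples and that the reported numbers refer to distinct compounds per MAG; both points are assumptions, and the function name, MAG labels, and toy exchanges are hypothetical rather than taken from Table 1.

```python
from collections import defaultdict


def tally_exchanges(interactions):
    """Count distinct compounds donated and received by each MAG.

    `interactions` is an iterable of (donor, receiver, compound) triples.
    """
    donated = defaultdict(set)
    received = defaultdict(set)
    for donor, receiver, compound in interactions:
        donated[donor].add(compound)
        received[receiver].add(compound)
    donated_counts = {mag: len(compounds) for mag, compounds in donated.items()}
    received_counts = {mag: len(compounds) for mag, compounds in received.items()}
    return donated_counts, received_counts


# Toy exchanges between hypothetical MAGs.
toy_interactions = [
    ("MAG_X", "MAG_Y", "hypoxanthine"),
    ("MAG_X", "MAG_Y", "uracil"),
    ("MAG_Z", "MAG_Y", "uracil"),
    ("MAG_X", "MAG_Z", "phosphate"),
]

donated_counts, received_counts = tally_exchanges(toy_interactions)
print(donated_counts, received_counts)
```

Counting distinct compounds, rather than individual exchanges, prevents a compound supplied by several donors from being counted more than once for the same receiver.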
To test whether the predicted metabolic interactions are likely to be specific to the bacterial strains that were able to thrive in the composting microbiome, we chose three of the MAGs assigned to known species that were more frequently involved in the metabolic interactions model (ZC4RG26, ZC4RG08, ZC4RG04) and replaced them with the respective reference genomes from GenBank according to the taxonomic assignment based on GTDB: Sphaerobacter thermophilus (RefSeq: GCF_000024985.1), Pseudomonas thermotolerans (RefSeq: GCF_000364625.1), and Thermobispora bispora (RefSeq: GCF_000092645.1). The results show that several metabolic exchange interactions were lost (Fig. 7). For instance, in the models built using the RefSeq genomes, S. thermophilus and T. bispora were not able to share metabolites with each other. On the other hand, the RefSeq P. thermotolerans was able to share palmitate with other nodes, while ZC4RG08 (s__P. thermotolerans) was not (Fig. 7).