The taxonomic composition of the enriched biofilms was not based on the type of carbon source
The outline of study has been shown in Supplementary Fig. 1. The seed biofilm was collected from the surface of a rock immersed in subtidal zone. The selected carbon sources for enrichment culture included 17 alcohols, 23 amino acids, 15 organic acids (exclude amino acids), and 14 saccharides. Details of carbon sources has been provided in Supplementary Table 1. Since the utilization of carbon sources is most likely linked to the availability of oxygen19, the enrichment culture experiments were performed under both aerobic and anaerobic conditions. Thus, total 138 cultures were set up (69 carbon source × 2 oxygen conditions). After obtaining stable cell densities in the culture, these communities were subjected to metagenomic sequencing to generate in total 3.2 Tb Illumina datasets (22.8 ± 0.8 Gb per metagenome; Supplementary Table 2). Samples cultured in presence of same carbon source category were pooled together for Nanopore sequencing, generating 37.4 Gb datasets (12.5 ± 0.6 Gb per metagenomes; Supplementary Table 3). Classification of the miTags (183,028 ± 46,301 per metagenome) extracted
from the Illumina-sequenced metagenomes at 97% similarity revealed 9,497,699 operational taxonomic units (OTUs). These OTUs were further classified into 75 phyla (class level for Proteobacteria), 434 families, and 2,027 genera. Gammaproteobacteria was found to be the most prevalent phylum with a maximum percentage of 99.62%, followed by Alphaproteobacteria (86.31%) and Bacteroidetes (29.05%) (Fig. 1a). At the family level, Oceanospirillaceae (87.61%), Vibrionaceae (82.44%), Kiloniellaceae (67.89%), Rhodobacteraceae (67.42%, members of this family have been recently reclassified into Roseobacteraceae20), Pseudomonadaceae (64.44%), and Aeromonadaceae (62.17%) were the most abundant taxa (Supplementary Fig. 2). At the genus level, Marinomonas, Pseudolateronomas, Marinobacterium, Microbacterium, Roseibium, and Ruegeria were prevalent taxa (Supplementary Fig. 3). Overall, no distinct taxonomic composition boundaries were observed between communities assembled on different categories of carbon source. However, huge variations were observed among the communities assembled on carbon sources of same category. Thus, it was speculated that the taxonomic composition of the enriched microbiota was not determined by the type of carbon source.
To confirm this speculation, similarities among the communities were observed through principal co-ordinates analysis (PCoA), which was conducted based on Bray-Curtis distances of the phylum- (Fig. 1b), family- (Supplementary Fig. 4a), and genus-level compositions (Supplementary Fig. 4b). In order to exclude the inference of oxygen conditions, the communities obtained under aerobic and anaerobic conditions were analyzed separately at each taxonomic level. A majority of the samples were overlapped, and aerobic and anaerobic samples were not well separated in the PCoA plots (Fig. 1b; Supplementary Fig. 4a, b). This suggested that availability of oxygen had no role in determining the community structure. To certify the findings of above analyses based on 16S miTags, single-copy marker gene sequences were identified for taxonomic profiling, followed by PCoA at the genus level. Again, most of the biofilm communities were overlapped (Supplementary Fig. 4c), which confirmed that the overall community structure of the enriched biofilms is not based on carbon source.
Extensive microbial coexistence across the enriched biofilm communities
To get genome-level and functional explanations for the dissociation between carbon source category and composition of microbial communities, the metagenomes were assembled and genome binning was performed. For each carbon source, aerobic and anaerobic communities sequenced by the Illumina platform were cross assembled and the contigs were subjected to differential coverage binning. The key metagenome-assembled genomes (MAGs) were reassembled with the Nanopore datasets for improvement of genome completeness. To this end, 535 non-redundant (average nucleotide identity < 99%) MAGs were obtained with high quality (completeness > 90% and potential contamination < 5%). The information of metagenomic contigs and MAGs are given in Supplementary Table 4 and Supplementary Table 5, respectively. These MAGs were distributed in 13 phyla (class level for Proteobacteria), 68 families, and 156 genera (Supplementary Table 5). Whole genome-based phylogenetic analysis revealed diverse phylogeny, and at the phylum level, these MAGs mainly belonged to Gammaproteobacteria, Alphaproteobacteria, Bacteroidota, and Spirochaetota phyla, amongst others (Supplementary Fig. 5).
Afterwards, the relative abundance of the 535 MAGs was calculated in the 138 metagenomes and the single-genome level distribution patterns were observed. In the resulting heatmap, many of the MAGs showed similar distribution patterns (Supplementary Fig. 6), implying that these MAGs may coexist across the enrichment cultured biofilms. To test this notion, Spearman correlation analysis was applied to construct a correlation matrix between vectors of MAG abundances across all the metagenomes. As a result, 383 MAGs showed coefficient of 0.9 or higher (average 0.95), indicating strong and widely distributed correlations (Fig. 2a). Four major clusters formed by MAGs from different microbial genera were observed (indicated by boxes in Fig. 2a). These MAGs were mostly affiliated to Alphaproteobacteria (e.g., Roseibium), Gammaproteobacteria (e.g., Pseudoalteromonas), Actinobacteriota (e.g., Glutamicibacter), and Bacteroidota (e.g., Polaribacter_A) phyla (Supplementary Table 6). Subsequently, a network was constructed using a threshold of coefficient above 0.9 and P-value lower than 0.05, to show the connections between the 383 MAGs. These MAGs formed a network with 1557 edges. The largest cluster consisted of MAGs belonging to Alphaproteobacteria, Gammaproteobacteria, and Bacteroidia (Fig. 2b). The average degree, network diameter, graph density, modularity, average clustering coefficient, and average path length were 8.13, 7.00, 0.02, 0.79, 0.86, and 2.40 respectively. According to the threshold used in a previous study21, these values here indicated that the MAGs were tightly connected to form a network with high density.
To understand the potential functions that were responsible for the observed microbial correlation, functional genomic comparison of selected MAG groups was performed. Two groups were created. The first group contained MAGs with highest connection numbers (edge number ≥ 14) and no significant abundance variations (P-value > 0.05) between carbon source categories. The second group comprised of MAGs with lowest connection numbers (edge number = 1) and significant (P-value < 0.05) abundance variation between at least one pair of carbon source categories. To this end, 33 MAGs in the first group were compared with 24 MAGs in the second group. Searching against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database revealed total 5630 KEGGs for the 57 MAGs. After statistical analysis using t-test and Hypergeo-test, 329 KEGGs showed significant variation (P-value < 0.05 in both tests) in terms of the average number of genes per MAG between the two groups. KEGGs corresponding to amino and peptide metabolism were highly prevalent in the MAGs with strong connections (Fig. 2c). For example, the peptide/nickel transport system substrate-binding protein (K02035) and the peptide/nickel transport system permease protein (K02033) had more than eight average copies per MAG in the strongly-connected MAGs, whereas only three average copies of these KEGGs were observed in the MAGs with no connection (Fig. 2c).
Positive interactions among biofilm-derived culturable strains
Above analyses revealed extensive microbial coexistence across biofilms on various carbon sources, which implied strong and comprehensive interactions among microorganisms. To confirm this notion using experiments, strains were isolated from the enriched biofilms to conduct pairwise co-culture. In total, 107 non-redundant (16S rRNA gene identity < 99.9%) strains were identified, which were mostly affiliated to Alphaproteobacteria and Gammaproteobacteria. At the genus level, Vibrio was the most prevalent taxa, followed by Microbacterium, Albirhodobacter, Glutamicibacter, Aurantimonas, Tritonibacter, Celeribacter, Roseibium, and Pseudoalteromonas (Supplementary Fig. 7). Six strains including Pseudoalteromonas shioyasakiensis C717, Roseibium aggregatum S1616, Microbacterium maritypicum S1611, Salipiger pacificus T6124, Leisingera aquaemixtae M597 (this strain was also isolated from the seed biofilm), and Sulfitobacter indolifex W002 (also isolated from the seed biofilm) were monocultured and co-cultured pair by pair in presence of 21 single carbon sources, including four alcohols, nine amino acids, four organic acids, and four saccharides. These strains were selected because MAGs belonging to the corresponding genera showed strong correlations in the MAG correlation analysis (Supplementary Table 6). Notably, growth of these bacteria in single carbon sources was largely promoted by co-culturing (Fig. 3a). In many cases, the monocultured bacteria could not grow well (OD600 of cell density < 0.1). On the other hand, co-cultured consortia displayed apparently distinct patterns, with substantial cell densities for a broad range of carbon sources (Fig. 3a). For example, when cultured individually, S1616 or M597 could not utilize the saccharides. However, when cultured together, these two strains showed substantial cell densities in D-maltose, L-arabinose, L-rhamnose, D-arabinitol, and D-mannitol (Fig. 3a). Similarly, co-culturing of these two strains in D-gluconic acid yielded much higher cell densities than monoculture (Fig. 3a). Subsequently, time series of cell density dynamics were observed in the consortium of S1616 and M597 co-cultured in D-gluconic acid, which resulted in a cell density of OD600 0.37 after three days (Fig. 3b). To confirm the stable coexistence of these two strains, 16S rRNA gene sequencing (data information is given in Supplementary Table 7) was conducted with two-day intervals after seeding the strains in the medium containing D-gluconic acid. The relative abundance of S1616 was higher than that of M597 and increased within the first six days. Afterwards, the abundance of S1616 decreased on the 8th day and was stabilized at approximately 50% (Fig. 3c). The consumption of D-gluconic acid by the co-cultured and monocultured consortia was also examined through biochemical assays, which showed an increased carbon consumption rate during co-culturing (Supplementary Fig. 8). These results evidenced positive interactions among the biofilm-derived strains.
Metabolic and molecular bases for cross-feeding between two representative strains
Next, the functional basis for the positive microbial interactions were explored through the co-culture experiments. Given the overexpression of genes for peptide and amino acid metabolism in the strongly-correlated MAGs, we hypothesized that, amino acid utilization might be the limiting factor for individual bacterial growth, whereas cross-feeding based on amino acid metabolism may expand the range of available carbon sources during co-culture. To test this hypothesis, we designed an experiment comprising the following steps: 1) S1616 and M597 were co-cultured in the presence of D-gluconic acid until obtaining the stationary phase and the supernatant was extracted; 2) the supernatant was amended with D-gluconic acid and the mixture was used to culture the two strains individually; 3) the supernatants of the two monocultures the co-culture supernatant obtained in the first step were subjected to quantitative detection of individual amino acids (Fig. 4a). Monoculture supplemented with the co-culture supernatant revealed significantly higher cell density for these two strains, especially for M597, which could not grow in monoculture (Fig. 4b). Quantitative detection of individual amino acids in the supernatants revealed significant changes in concentration of certain amino acids before and after addition in the monoculture (Fig. 4c). For example, the glutamate concentration in the co-culture supernatant increased after monoculture of S1616, whereas it decreased after M597 monoculture (Fig. 4c). Other amino acids, including alanine, citrulline, phenylalanine, threonine, and tyrosine, also displayed changes in concentrations, and in most cases, these amino acids were consumed or excreted by different strains (Fig. 4c).
To explore gene-level mechanisms of the metabolic cross-feeding, genomics and transcriptomics were conducted on S1616 and M597. Complete genome sequencing was conducted, followed by gene prediction and KEGG annotation. Genome information, including the number of ribosomal RNA (rRNA), transfer RNA (tRNA), open reading frames (ORFs), and KEGG-annotated ORFs, are given in Supplementary Table 8. Transcriptomic sequencing was performed after co-culturing these two strains in D-gluconic acid. The related data have been presented in Supplementary Table 9. Subsequently, gene transcription profiles of the two strains were visualized individually based on the Reads Per Kilobase Million (RPKM). Nearly all KEGG-annotated genes (total 1507) were expressed in both strains, while 614 and 330 genes were specifically expressed in S1616 and M597, respectively (Fig. 5a and 5b). Students’ t-test of 1507 KEGG genes revealed 82 significantly overexpressed (P-value < 0.05 and fold change > 2) genes in the S1616 transcriptome, and 297 significantly overexpressed genes in the M597 transcriptome (Fig. 5c). Regarding amino acid metabolism, 10 genes were overexpressed in S1616 (e.g., K01733, thrC, threonine synthase) while 25 were overexpressed in M597 (e.g., K01920, gshB, glutathione synthase) (Fig. 5d). Reconstruction of the amino acid metabolism-related pathways in the genomic and transcriptomic datasets revealed complete pathways for metabolism of a variety of amino acids, including the transformation of threonine, alanine, and phenylalanine to acetyl-CoA, as well the transformation of citrulline, arginine, glutamine, and glutamate to 2-oxoglutarate (Supplementary Fig. 9). These pathways suggested that these amino acids can probably enter the central carbon metabolism. In addition, pathways for the transformation of intermediates to several amino acids during the central carbon metabolism were also annotated (Supplementary Fig. 9). All these metabolomic, genomic, and transcriptomic results suggest that there is an extracellular amino acid pool, which allows the exchange of amino acids between the two bacterial strains, thus promoting their growth (Fig. 5e).
Microbial coexistence in in situ marine biofilms
Finally, the prevalence and coexistence of the two cross-feeding strains, S1616 and M597, in marine biofilms were assessed. This was done by examining the distribution pattern of these strains in biofilm metagenomes collected from the Pacific Ocean, the Indian Ocean, and the Atlantic Ocean to obtain a global ecological profile. Total 131 biofilms were collected, including 18 on metal panels15, 13 on rock surfaces15, 76 on plastic panels15, and 24 on microplastic or glass particles11. The detailed information of these biofilms, as well as the metagenomic datasets, have been presented in Supplementary Table 10. The results revealed substantial proportion of both strains in nearly all biofilm communities, with maximum relative abundance of 1% and 2% for S1616 and M597, respectively (Supplementary Fig. 10). Interestingly, correlation analysis indicated significant coexistence (Spearman Rho > 0.5 and P-value < 0.05) between these two strains detected in biofilms on all the four substrate types (Supplementary Fig. 11).