Taxonomic and phylogenetic diversity of the MARINHET culture collection
A total of 1561 bacterial strains were isolated from 19 marine stations, eight photic-layer, four mesopelagic, and seven bathypelagic samples (Fig. 1a). The partial 16S rRNA sequences of the cultured strains were grouped into operational taxonomic units (isolated OTUs, referred hereafter as iOTUs) using 99% similarity thresholds. Alphaproteobacteria and Gammaproteobacteria iOTUs dominated in all stations (Fig. 1b). Bacteroidetes isolates were present in all photic stations, but were not retrieved at the mesopelagic of ST39 or in the Atlantic bathypelagic samples ST10, ST33 and ST43. Actinobacteria isolates were retrieved only from six stations including photic and mesopelagic but not from bathypelagic samples. Finally, Firmicutes could be only isolated from photic samples of the Arctic and Indian Ocean during the time of sampling (Fig. 1b).
If we group the different stations per depth, Good´s coverage analyses per layer, which is an estimator of the percentage of total species represented in a sample, ranged from 56.1 to 70.5% (Table 1). These results indicated that the isolates dataset, even if not saturated, represents a reasonable inventory of the culturable heterotrophic marine bacteria. The number of iOTUs detected was slightly higher in the photic layer for the non-rarefied iOTU table, but similar in all depths for the rarefied iOTU table, being the mesopelagic the layer with the lowest observed values (Table 1). Rarefaction curves showed also slightly higher richness for the photic samples compared to the mesopelagic and the bathypelagic, but they did not reach an asymptote (Fig. 2a and Supplementary Fig. S1a in Additional file 1). On the other hand, rank abundance plots of the non-rarefied (Fig. 2b) and rarefied iOTU tables (Supplementary Fig. S1b in Additional file 1) presented, for the three depths studied, a steep curve, which is indicative of low evenness. Thus, there were a few abundant iOTUs with a large number of representatives and a large proportion of iOTUs that had few representatives (rare iOTUs). Therefore, we also calculated the richness and diversity metrics of each depth using OTU-based and phylogenetic approaches. All three metrics of OTU-based alpha diversity used (Species observed (S.obs) or nº of iOTUs, Chao1 and Shannon indexes, Fig. 2c) decreased with depth but not significative differences were found between layers (ANOVA test: S.obs: P-value = 0.152; Chao richness estimator: P-value = 0.191; Shannon diversity index: P-value = 0.183). The three measures of phylogenetic diversity, Faith’s phylogenetic diversity (PD) [27], the PD divided by the number of iOTUs (PD/iOTUs), and the mean nearest taxon distance (MNTD) [28], were not significantly different between depths (ANOVA test: PD: P-value = 0.093; PD/iOTUs: P-value = 0.159; MNTD: P- value = 0.107), although a higher mean in phylogenetic diversity was observed in the photic layer than in the mesopelagic and the bathypelagic samples, while the phylogenetic diversity per iOTU and the MNTD was slightly higher in the bathypelagic layer (Fig. 2c).
Shared diversity between photic, mesopelagic, and bathypelagic samples across oceans
We explored the similarity between iOTUs from different layers (photic, mesopelagic and bathypelagic). First, we started with samples from Indian Ocean ST39 because it was the only station with a vertical profile covering samples of the photic (surface and deep chlorophyll maximum, DCM) and the aphotic layer (mesopelagic). A total of 34 iOTUs were obtained from the independent clustering at 99% sequence similarity of all isolates from ST39. This clustering revealed that only 5 iOTUs were shared between photic and mesopelagic, while the rest could only be recovered from one depth, being the photic layer with the highest number of different iOTUs (Fig. 3a), results that could be biased due to the higher presence of both surface and DCM isolates in the ST39 photic layer in comparison with the mesopelagic isolates. However, it was surprising that the shared iOTUs comprised the 63.6% of the total isolates (Fig. 3a). At this point, we also examined the connectivity between different layers and across distant oceans covering large spatial and latitudinal scales. The non-rarefied iOTU table, including all the photic (817), mesopelagic (362), and bathypelagic (382) isolates, as well as the rarefied iOTU table, sampled down to the layer with the lowest number of isolates (mesopelagic), were used for the analyses, and because minor differences were observed among them (Supplementary Tables S1 and S2 in Additional file 2), the results mentioned here refer only to those obtained after rarefaction. Fifteen out of 122 iOTUs (Fig. 3b) included isolates from all layers, accounting for 52.7% of the total isolates sequences (Fig. 3b), with an average number of 37.6 isolates per iOTU. Further, eight iOTUs (12.7% of the isolates) were common to photic and bathypelagic isolates, nine (6.6%) to photic and mesopelagic isolates, and eight (7.4%) to mesopelagic and bathypelagic isolates (Fig. 3b). Nevertheless, as observed in ST39, a substantial proportion of isolates were only retrieved from one of the layers: 29 iOTUs were only found in the photic samples, 25 in the mesopelagic, and 28 in the bathypelagic samples (Supplementary Table S3 in Additional file 2), with an average number of 3.2, 1.4, and 2.9 isolates per iOTU, respectively (Fig. 3b).
The taxonomic classification of all these iOTUs, using the lowest common ancestor (LCA) method, designated a total of 59 different genera and 10 iOTUs that could not be classified at the genus level. From these 59 genera, 13 were widely distributed along the different depths studied representing 75% of the total isolates. On the other hand, the photic ocean was again the layer with the highest number of retrieved genera that were not observed in the other two depths, even though they accounted for only 5.6% of the isolates (Supplementary Table S4 in Additional file 2).
If the comparative analysis is repeated with a more restrictive clustering, (instead of 99% at 100% similarity) we found that 37% of the isolates (578 out of 1561) were 100% identical at their partial 16S rRNA genes regardless the origin or layer. We found Alteromonas, Cobetia, Erythrobacter, Leeuwenhoekiella, Halomonas, Idiomarina,Marinobacter, and Mesonia between the shared genera, indicating taxa widely distributed along different depth layers. This shared percentage was even higher, up to 58.9%, when considering all mesopelagic and bathypelagic samples as aphotic and comparing them to all the photic isolates.
Biogeography of the commonly isolated heterotrophic bacteria
The most abundant and common culturable genera, i.e. those that occurred in all or most (around 80%) of the 19 stations studied, and the ones only retrieved locally (around 25% of the samples) with a restricted distribution were identified. Erythrobacter and Alteromonas were the most abundant and recurrent genera retrieved, representing 41.3% of the isolates (338 and 333 isolates respectively), and appearing in 94% of the samples studied regardless their origin, season and year of sampling (Fig. 4a). Less abundant genera such as Marinobacter (113 isolates), Halomonas (70 isolates), Pseudoalteromonas (51 isolates), Idiomarina (42 isolates), Pseudomonas (29 isolates), Sulfitobacter (51 isolates), or Oceanicaulis (46 isolates) were present in more than 25% of the samples (Fig. 4a) and covered almost all the oceanographic regions (Supplementary Table S5 in Additional file 2). These could be considered, thus, regionally distributed. Some genera such as Psychrobacter, Leeuwenhoekiella or Alcanivorax had lower numbers of isolates, but were recovered in more than 25% of the samples (Fig. 4a). Other genera, in turn, such as Zunongwangia, were retrieved in less than 25% of the samples but presented 54 isolates (3.5% of the strains). All the mentioned genera were found in the photic, mesopelagic, and bathypelagic layers, except Oceanicaulis which could not be isolated in this study from the bathypelagic samples (Supplementary Table S5 in Additional file 2). Finally, the remaining genera represented 20% of the cultures. Then, these results revealed which genera are commonly isolated from distant stations with contrasted environmental conditions, depths and seasons covering 6 years of temporal range.
In a parallel study, we compared the isolates from each station with 16S rRNA sequences obtained through Illumina HTS of environmental DNA (16S iTAGs, hereafter) from two marine circumnavigations (Tara Oceans [29] and Malaspina Expedition [30]), to investigate whether our isolates have identical matches with environmental 16S iTAGs (Sanz-Sáez et al. in preparation). Despite the global comparison is out of the scope of this study, here we show the biogeographic distribution of the abundant top12 iOTUs, those with more than 20 isolates (Supplementary Table S6 in Additional file 2). To do so, we show the relative abundances of the denoised zOTUs (zero-radius OTUs, i.e. OTUs defined at 100% sequence similarity), obtained from the 16S iTAGs that matched at 100% similarity with these top12 iOTUs. We were aware that different iOTUs could match with the same zOTUs, and therefore the biogeography results presented here for the different zOTU represent the sum of the abundances of the top12 abundant iOTUs with other less abundant/rare iOTUs (Supplementary Table S7 in Additional file 2). The top12 iOTUs matching at 100% with zOTUs represented the 48.3% of the total isolates (754 out of 1561), and only one of the zOTUs matched with two iOTUs, a top12 iOTU and a less abundant/rare iOTUs, the latter representing 6 out 754 (0.8 %) of the isolates included in the top12 iOTUs. Besides, both iOTUs matching to the same zOTUs affiliated with the same genus and, thus, the abundance presented would correspond to different species or ecotypes within a genus. Thereby, in the photic layer, the abundant top12 iOTUs, or in this case, their respective zOTUs matches, represented an average abundance of 16S iTAGs (at 100% similarity) lower than 1%, regardless of their geographic region (Fig. 4b). This percentage increased around 1-2% of the reads in the mesopelagic layer in specific regions such as the Indian and South Pacific Ocean. However, our isolates exhibited higher abundances in the bathypelagic layer, in almost all oceanographic regions, especially for iOTU1, iOTU3 and iOTU4 affiliating with Alteromonas, Erythrobacter and Halomonas spp. (Fig. 4b). Overall, these iOTUs accounted a total average proportion of reads of 0.3 and 1.1% in the photic layer in two independent datasets (Tara and Malaspina respectively), 2.7% in the mesopelagic layer, and up to 7.8% in the bathypelagic, indicating that the commonly found isolates are more abundant in the deeper layers of the ocean.
Novelty of the isolates of the MARINHET collection
The percentages of similarity between the strains and their Closest Cultured Match (CCM) and Closest Environmental Match (CEM) were extracted and compared with the 97% and 99% identity thresholds to explore the possible novelty of our culture collection. The results showed that most of the isolates were similar to previously published cultured bacterial species, but also to environmental sequences obtained using molecular techniques (Fig. 5a). Therefore, most of the isolates were previously known microorganisms. However, we detected three 100% identical strains in their partial 16S rRNA gene that had a percentage of identity, both for CCM and CEM, below the threshold, at around 94%. One of the strains was isolated from surface samples of the North Atlantic Ocean (ISS653), whereas the other two were isolated from two mesopelagic samples of the Pacific Ocean (ISS1889, ISS2026). Further analyses with the complete 16S rRNA gene of ISS653 indicated that they could be candidates for a new species or even a new genus according to the thresholds proposed by Yarza et al. [31]. The three databases consulted (National Center for Biotechnology Information (NCBI), Ribosomal Data Base Project (RDP) and SILVA) showed different BLASTn results (Supplementary Table S8 in Additional file 2). Nevertheless, the Living Tree Project (LTP) database, which contains the accepted type species of each genus, displayed a 93.5% similarity with Mesonia mobilis. The phylogenetic tree constructed (Fig. 5b) also supported its novelty as our isolates had less than 93.8% of similarity with the cultivated reference genomes of the Mesonia genus.
Genomes of the strains ISS653 and ISS1889 were fully sequenced and characterized to formally describe a novel species, Mesonia oceanica (Lucena et al., submitted). As a detailed description of the novel species is already given in Lucena et al. (submitted), here we only focused in some interesting phenotypic differences among these two strains and their distribution pattern in marine metagenomes from five stations, including ST102 and ST151, that were the ones in which ISS1889 and ISS653 were respectively isolated. Only a few phenotypic differences could be found among both strains (Table 2). The most important phenotypic trait was the difference in their maximum growth temperatures, being 37ºC for ISS653, isolated from surface waters in ST151, and 30ºC for ISS1889, isolated from the mesopelagic layer in ST102. Genomic comparisons of both strains revealed an average nucleotide identity (ANI) of 99.9%, which indicated that the two strains were almost identical genetic clones. The G+C content and the number of RNAs present were equal in both strains. They slightly differ in the size and the number of protein codifying sequences (Table 2). However, we identified a pool of unique genes for each strain, 33 were only found in ISS653, whereas 6 were unique in ISS1889 (Supplementary Table S9 in Additional file 2). Among them, we found interesting some proteins that may confer specific advantages and/or adaptation (Table 2). For example, ISS653 contains some chaperones, GroEL, GroES and ClpB, that may be related with its wider range of growth temperatures. In addition, we detected some resistant mechanisms to toxic heavy metals. Resistance genes to cadmium-zinc-cobalt were detected in ISS653, while mercury resistance genes were observed in ISS1889. Nevertheless, we found that both isolates presented the same distribution patterns (Fig. 5c and Supplementary Fig. S2 in Additional file 1). Thus, these two new strains displayed higher abundances in the mesopelagic waters regardless of the station analysed, but especially in ST102 where ISS1889 was retrieved (Fig. 5c). Strain ISS653 was isolated from surface waters of the ST151 but it was in deeper layers where its abundance was also higher (Supplementary Fig. S3 in Additional file 1).