Phylogenetic analysis of 16S rRNA gene sequence
The 16S rRNA gene sequence of strain GM2-3-6-6T obtained by Sanger sequencing (1,384 bp; accession number MT829391) was 100% identical to the complete sequence (1,513 bp) extracted from the genome sequence. Two ribosomal RNA (rrn) operons (16S-23S-5S) were found in the complete genome, and they shared 100% sequence similarity. Comparison of the 16S rRNA gene sequence of strain GM2-3-6-6T with nucleotide sequences in GenBank and SSU reference sequences in SILVA (Release 138) returned uncultured bacterium clones as top hits, with sequence similarities of <95.8%. Comparison with type strain sequences in the EzBioCloud server showed that strain GM2-3-6-6T had its highest sequence similarities with C. algicola 0182T, C. catalasitica IFO 15977T, and P. roseus SM1701T, at 93.8%, 93.6%, and 92.5%, respectively. These values are below the recommended sequence identity threshold for genus delineation (94.5%; Yarza et al. 2014), indicating that strain GM2-3-6-6T may represent a novel species of a novel genus.
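The comparison against the genus threshold can be written out as a short check. The following is a minimal sketch, not the authors' workflow: it uses only the similarity values and the Yarza et al. (2014) thresholds quoted in this paper (94.5% for genus, 86.5% for family), and the function name `rank_suggested` is a hypothetical helper introduced for illustration.

```python
# Thresholds quoted in the text (Yarza et al. 2014), in percent identity.
GENUS_THRESHOLD = 94.5
FAMILY_THRESHOLD = 86.5

# Top type-strain hits reported for strain GM2-3-6-6T (EzBioCloud), in percent.
top_hits = {
    "C. algicola 0182T": 93.8,
    "C. catalasitica IFO 15977T": 93.6,
    "P. roseus SM1701T": 92.5,
}


def rank_suggested(similarity):
    """Hypothetical helper: highest rank at which a 16S similarity suggests separation."""
    if similarity < FAMILY_THRESHOLD:
        return "different family"
    if similarity < GENUS_THRESHOLD:
        return "different genus"
    return "same genus possible"


# Map each reference strain to the separation suggested by 16S identity alone.
calls = {name: rank_suggested(value) for name, value in top_hits.items()}
```

All three hits fall between the family and genus thresholds, which matches the interpretation given above. A 16S-based call of this kind is only indicative and is complemented by the genome-based indices reported later.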
Phylogeny of the 16S rRNA gene in the neighbor-joining (NJ) and maximum-likelihood (ML) trees indicated that strain GM2-3-6-6T clustered with the members of the family Crocinitomicaceae and formed an independent branch closely related to Crocinitomix members and P. roseus (Supplementary Figures S1 and S2). The 16S rRNA gene phylogeny thus supported that strain GM2-3-6-6T represents a novel genus-level lineage within the family Crocinitomicaceae. Members of the family Crocinitomicaceae were well clustered and shared 16S rRNA gene sequence similarities of 86.74-98.49%.
Phylogeny of the 16S rRNA gene also showed that a small number of members related to Cryomorpha ignava ACAM 647T formed multi-family lineages with low bootstrap values (Supplementary Figures S1 and S2). Firstly, the type species Schleiferia thermophila formed a strongly supported clade with Thermaurantimonas aggregans (sharing 93.91% sequence similarity), belonging to the family Schleiferiaceae. Secondly, two species within the genus Phaeocystidibacter, sharing 95.42% similarity, formed an independent clade. Here we propose a novel family, Phaeocystidibacteraceae fam. nov., to accommodate the genus. Thirdly, the species Owenweeksia hongkongensis and Croceimicrobium hydrocarbonivorans were tightly clustered (bootstrap value of 100%), and they should be considered a family distinct from the families Salibacteraceae, Schleiferiaceae, and Phaeocystidibacteraceae. Thus, a novel family designated Owenweeksiaceae fam. nov. is proposed, with Owenweeksia as the type genus. Fourthly, Vicingus serpentipes ANORD5T and Acidiluteibacter ferrifornacis S-15T, sharing 89.83% 16S rRNA gene sequence similarity, formed a highly supported clade affiliated with the family Vicingaceae.
Genomic features and genomic relatedness
The draft genome of strain GM2-3-6-6T determined using Illumina sequencing was 4,338,207 bp in 15 contigs (>1 kb), the two largest being 1,394,190 bp and 1,261,120 bp in length. The complete genome of strain GM2-3-6-6T determined using PacBio sequencing was 4,365,762 bp, comprising one circular chromosome (Table 1). The genome size of strain GM2-3-6-6T was similar to those of its close relatives C. catalasitica IFO 15977T, C. algicola 0182T, and P. roseus SM1701T (4,622,888 bp, 3,719,297 bp, and 4,042,952 bp, respectively). The DNA G+C content of strain GM2-3-6-6T was 34.98%, which was also similar to that of the three closely related species (Table 1).
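For readers unfamiliar with how genome size and DNA G+C content are derived from an assembly, the following is a minimal generic sketch. It is not the pipeline used in this study: the function name `genome_stats` is hypothetical, the two short sequences are toy data rather than real contigs, and ambiguous bases (such as N) are assumed to count toward total length but not toward G+C.

```python
def genome_stats(contigs):
    """Return (total length in bp, G+C percentage) for a list of sequences.

    Assumption: every character counts toward length; only G and C
    (case-insensitive) count toward the G+C numerator.
    """
    total = sum(len(seq) for seq in contigs)
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in contigs)
    return total, 100.0 * gc / total


# Toy sequences for illustration only (not taken from the GM2-3-6-6T assembly).
toy_contigs = ["ATGCGC", "ATAT"]
size_bp, gc_percent = genome_stats(toy_contigs)
```

Applied to a real assembly, the same calculation would be run over all contigs or over the single circular chromosome.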
The antiSMASH server revealed a carotenoid biosynthetic gene cluster (BGC, ~20 kb) in the genome sequence of strain GM2-3-6-6T, encoding key enzymes of carotenoid synthesis, including phytoene desaturase, phytoene synthase, and phytoene dehydrogenase (Figure S3). Carotenoid BGCs were also found in the genomes of C. catalasitica IFO 15977T, C. algicola 0182T, and P. roseus SM1701T, but with different gene arrangements, indicating that carotenoid BGCs are diverse among the Crocinitomicaceae members.
The ANI values of strain GM2-3-6-6T compared to C. catalasitica IFO 15977T, C. algicola 0182T, and P. roseus SM1701T were 68.80%, 68.55%, and 68.79%, respectively. These values were lower than the suggested threshold for species delineation (95-96%) (Yoon et al. 2017a), suggesting that strain GM2-3-6-6T represents a novel species. The digital DNA-DNA hybridization (dDDH) values of strain GM2-3-6-6T compared to C. catalasitica IFO 15977T, C. algicola 0182T, and P. roseus SM1701T were 19.00%, 18.50%, and 19.20%, respectively, which were also below the threshold for species delineation (70%) (Meier-Kolthoff et al. 2013), indicating that strain GM2-3-6-6T represents a novel species. The average amino acid identity (AAI) values of strain GM2-3-6-6T compared to C. catalasitica IFO 15977T, C. algicola 0182T, and P. roseus SM1701T were 61.21%, 62.33%, and 59.01%, respectively. These values were below the threshold for a new genus (65%) (Konstantinidis et al. 2017), supporting that strain GM2-3-6-6T represents a novel genus. Percentage of conserved proteins (POCP) values calculated between strain GM2-3-6-6T and C. catalasitica IFO 15977T, C. algicola 0182T, and P. roseus SM1701T were 0.488, 0.542, and 0.465, respectively. These values also supported the classification of strain GM2-3-6-6T in a novel genus (Qin et al. 2014).
Compared to 16S rRNA gene sequences, phylogeny based on genome sequences enables more accurate classification of bacterial and archaeal groups. The GTDB tools (Chaumeil et al. 2019) were used in this study to determine the taxonomic position of strain GM2-3-6-6T and closely related members of the order Flavobacteriales. A total of 126 genomes, including the outgroup Chitinophaga pinensis DSM 2588T, were used in the phylogenomic analysis (Figure 1). The majority of the phylogenetic lineages in the tree were obtained from uncultivated bacteria, indicating that most lineages still await cultivation.
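The threshold comparisons for ANI, dDDH, and AAI can be collected in one place. The sketch below is illustrative only and uses the values and cutoffs quoted in this paragraph; taking 95.0% as the ANI cutoff (the lower bound of the 95-96% range) is an assumption, and POCP is left out because no numeric cutoff is stated in the text. The names `THRESHOLDS` and `below_thresholds` are hypothetical.

```python
# Cutoffs quoted in the text, in percent. ANI uses the lower bound of the
# 95-96% range (an assumption made for this sketch).
THRESHOLDS = {"ANI": 95.0, "dDDH": 70.0, "AAI": 65.0}

# Pairwise values reported for strain GM2-3-6-6T against each close relative.
values = {
    "C. catalasitica IFO 15977T": {"ANI": 68.80, "dDDH": 19.00, "AAI": 61.21},
    "C. algicola 0182T": {"ANI": 68.55, "dDDH": 18.50, "AAI": 62.33},
    "P. roseus SM1701T": {"ANI": 68.79, "dDDH": 19.20, "AAI": 59.01},
}


def below_thresholds(metrics):
    """Hypothetical helper: flag each index that falls below its cutoff."""
    return {name: metrics[name] < cutoff for name, cutoff in THRESHOLDS.items()}


# One row of True/False flags per reference strain.
summary = {strain: below_thresholds(m) for strain, m in values.items()}
```

Every flag in `summary` is true, in line with the statements above that the ANI and dDDH values fall below the species cutoffs and the AAI values fall below the genus cutoff.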
Phylogenomic analysis based on Bac120 sets showed that strain GM2-3-6-6T was affiliated to the family Crocinitomicaceae and formed a clade with an uncultivated bacterium Bin_13 (accession number: JABURP000000000), which was obtained from a marine sediment core in northeastern Brazil. This clade was neighbored by the species C. algicola, C. catalasitica, and P. roseus, a topology congruent with the phylogeny of the 16S rRNA gene (Supplementary Figure S1 and S2).
Phylogenomic analysis based on Bac120 sets strongly supported classification of the family Cryomorphaceae-related members in multiple family-level clades (Figure 1). Firstly, the type species Cryomorpha ignava was placed in an independent clade as a family-level taxon, Cryomorphaceae, which was clearly separated from the genera Luteibaculum, Salibacter, Phaeocystidibacter, Owenweeksia, and Croceimicrobium. This result agreed with the analysis of the family Cryomorphaceae by Bowman (2020). Secondly, the species Luteibaculum oceani and Salibacter halophilus formed two separate monophyletic clades, affiliated with the Luteibaculaceae and Salibacteraceae, respectively. Thirdly, the phylogenomic tree strongly supported that Phaeocystidibacter marisrubri and Phaeocystidibacter luteus belong to a novel family, closely related to two novel clusters containing uncultivated bacteria. Fourthly, the clade including Schleiferia thermophila and Thermaurantimonas aggregans formed a node with the clade containing Owenweeksia hongkongensis and Croceimicrobium hydrocarbonivorans, which is inconsistent with the 16S rRNA gene phylogeny, where the two clades were assigned to different families. The families Schleiferiaceae, including the genera Schleiferia and Thermaurantimonas, and Owenweeksiaceae, including the genera Owenweeksia and Croceimicrobium, were proposed according to the principle of priority. Lastly, A. ferrifornacis and V. serpentipes formed a distinct lineage regarded as a novel family, a topology congruent with the phylogeny of 16S rRNA gene sequences.
Cells of strain GM2-3-6-6T were Gram-stain-negative, ovoid or short rod-shaped, 1-1.5 μm long and 0.7 μm wide, and non-motile (Figure 2). Colonies grown on MB agar plates at 25 °C for 3 days were light yellow, a color distinct from that of its close relatives (Table 1). Catalase and oxidase activities were positive, as in close relatives. Growth was observed at 15-40 °C (optimum, 25 °C), at pH 6-8 (optimum, 7), and in the presence of 0.5-4.0% NaCl (optimum, 2.0%, w/v). Strain GM2-3-6-6T did not degrade starch, cellulose (CMC), skimmed milk, or Tweens 40, 60, and 80, similar to C. catalasitica IFO 15977T and P. roseus SM1701T, but different from C. algicola 0182T, which degraded agar and CMC. Additional physiological and biochemical properties are given in Table 1 and in the species description.
The respiratory quinone detected in strain GM2-3-6-6T was MK-7, consistent with its close relatives, namely Crocinitomix members (Shi et al. 2017), P. roseus (Wang et al. 2020), and Wandonia, whereas MK-6 was the major quinone in other members of the family Crocinitomicaceae (Table 2). The major fatty acids (>5%) of strain GM2-3-6-6T were iso-C15:0 (55.3%), summed feature 3 (C16:1ω7c and C16:1ω6c) (10.1%), iso-C15:1 G (9.1%), and iso-C17:0 3-OH (7.9%) (Table S1). The percentage of iso-C15:1 G in strain GM2-3-6-6T was lower than in C. catalasitica IFO 15977T (28.4%) and P. roseus SM1701T (23.8%), which differentiates it from these close relatives.
The polar lipid profile of strain GM2-3-6-6T included phosphatidylethanolamine (PE), two unidentified phospholipids (PL), one unidentified aminolipid (AL), one unidentified aminoglycolipid (AGL), and four other unidentified lipids (Figure S4). Glycolipid (GL), present in the close relatives (Table 2), was not detected in strain GM2-3-6-6T.
Proposal of Owenweeksiaceae and Phaeocystidibacteraceae
Phylogenomic analysis placed the Vicingaceae, Ichthyobacteriaceae, and Salibacteraceae into separate clusters with genome sizes of 2.9-3.5, 1.2-3.6, and 2.3-4.5 Mbp, respectively (Table S2). Phaeocystidibacteraceae, Owenweeksiaceae, Schleiferiaceae, and two additional clusters containing uncultivated organisms clustered together, but their genomic characteristics, including 16S rRNA gene sequence similarities and ANI and AAI values, showed a certain distinctiveness (Supplementary Figures S1 and S2). The 16S rRNA gene sequence similarities among these three families were lower than 90.35% (Table 3), approaching the threshold identity for a family (86.5%) (Yarza et al. 2014). The ANI and AAI values, 73.7-74.5% and 52.3-57.1%, respectively, were close to the family boundary (Table 4).
The genome size of Owenweeksiaceae calculated from the draft genomes was 3.86-4.46 Mbp (DNA G+C content of 40.2-46.0%), which was much larger than the 2.0-2.7 Mbp of members of the Schleiferiaceae (DNA G+C content of 42.6-45.3%). The Phaeocystidibacteraceae have genome sizes of 3.2-3.4 Mbp, intermediate between the Owenweeksiaceae and the Schleiferiaceae (Table 5 and Table S2). Genome size could therefore serve as a feature to differentiate the three families. In addition, catalase was positive in Owenweeksiaceae and Phaeocystidibacteraceae, but negative in Schleiferiaceae. Furthermore, the polar lipids of Schleiferiaceae did not contain glycolipid (GL), which was present in the Phaeocystidibacteraceae and Owenweeksiaceae (Table 5).
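Because the three genome-size ranges reported above do not overlap, they can be expressed as a simple lookup. The following is a minimal sketch under that assumption, using only the ranges given in this paragraph; the function name `families_matching` is hypothetical, and genome size alone would not be sufficient for a taxonomic assignment.

```python
# Genome-size ranges (Mbp) reported in the text for the three families.
GENOME_SIZE_MBP = {
    "Owenweeksiaceae": (3.86, 4.46),
    "Phaeocystidibacteraceae": (3.2, 3.4),
    "Schleiferiaceae": (2.0, 2.7),
}


def families_matching(size_mbp):
    """Hypothetical helper: families whose reported range contains size_mbp."""
    return [
        family
        for family, (low, high) in GENOME_SIZE_MBP.items()
        if low <= size_mbp <= high
    ]


# Example queries: one size per reported range, plus one that falls in a gap.
examples = {size: families_matching(size) for size in (4.0, 3.3, 2.5, 3.0)}
```

A size that falls between the ranges (for example 3.0 Mbp) matches no family, which illustrates why the other features discussed here, such as catalase activity and the presence of glycolipid, remain necessary.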