Flavobacterium difficile sp. nov., isolated from a freshwater waterfall

A bacterial strain designated KDG-16 T is isolated from a freshwater waterfall in Taiwan and characterized to determine its taxonomic affiliation. Cells of strain KDG-16 T are Gram-stain-negative, strictly aerobic, motile by gliding, rod-shaped and form light yellow colonies. Optimal growth occurs at 20–25 °C, pH 6–7, and with 0% NaCl. Phylogenetic analyses based on 16S rRNA gene sequences and an up-to-date bacterial core gene set reveal that strain KDG-16 T is affiliated with species in the genus Flavobacterium. Analysis of 16S rRNA gene sequences shows that strain KDG-16 T shares the highest similarity with Flavobacterium terrigena DSM 17934 T (97.7%). The average nucleotide identity, average amino acid identity and digital DNA–DNA hybridization values between strain KDG-16 T and the closely related Flavobacterium species are below the cut-off values of 95–96, 90 and 70%, respectively, used for species demarcation. Strain KDG-16 T contains iso-C15:0, iso-C15:1 G and iso-C17:0 3-OH as the predominant fatty acids. The polar lipid profile consists of phosphatidylethanolamine, one uncharacterized aminophospholipid, one uncharacterized phospholipid, two uncharacterized aminolipids and two uncharacterized lipids. The major polyamine is homospermidine. The major isoprenoid quinone is MK-6. Genomic DNA G + C content of strain KDG-16 T is 31.6%. Based on the polyphasic taxonomic data obtained, strain KDG-16 T is considered to represent a novel species in the genus Flavobacterium, for which the name Flavobacterium difficile sp. nov. is proposed. The type strain is KDG-16 T (= BCRC 81194 T = LMG 31332 T).


Introduction
The genus Flavobacterium (type species, Flavobacterium aquatile), a member of the family Flavobacteriaceae in the order Flavobacteriales of the class Flavobacteriia within the phylum Bacteroidetes (Ludwig et al. 2011), was first established by Bergey et al. (1923) and emended by Bernardet et al. (1996), Dong et al. (2013), Kang et al. (2013) and Kuo et al. (2013). The genus Flavobacterium currently comprises 242 species with validly published names according to the List of Prokaryotic Names with Standing in Nomenclature (https://lpsn.dsmz.de/genus/flavobacterium). The present study was carried out to clarify the taxonomic position of a putative novel species belonging to the genus Flavobacterium, designated KDG-16 T, by a polyphasic taxonomic approach.
Communicated by Erko Stackebrandt.

Bacterial isolation and culture conditions
During a survey of cultivable bacterial resources from freshwater environments, a water sample (20 °C, pH 7, 0% NaCl) was collected from the Dajin Waterfall (GPS location: 22° 51′ 40″ N 120° 38′ 43″ E) in the Sandimen Township of Pingdong County, Taiwan on 5 April 2015 (Supplementary Fig. S1). The water sample was plated on R2A agar medium (BD Difco) using the serial dilution technique. After incubation at 25 °C for 3 days, strain KDG-16 T was isolated as a single light yellow colony and subjected to detailed taxonomic analyses. Strain KDG-16 T was sub-cultured on R2A agar and stored at −80 °C in R2A broth with 20% (v/v) glycerol or stored following lyophilization. Strain KDG-16 T has been deposited in the Bioresource Collection and Research Center, Taiwan (BCRC 81194 T) and the BCCM/LMG Bacteria Collection, Belgium (LMG 31332 T). Flavobacterium terrigena DSM 17934 T, Flavobacterium urocaniciphilum JCM 19142 T and Flavobacterium aquatile DSM 1132 T were obtained from the corresponding culture collections. The three type strains were used as reference strains and evaluated together with strain KDG-16 T under identical experimental conditions.

Morphological, physiological, and biochemical characterizations
Cell morphology of strain KDG-16 T was observed by phase-contrast microscopy (DM 2000; Leica) and transmission electron microscopy (Model H-7500; Hitachi) using cells grown on R2A agar at 25 °C for 3 days. Gliding motility was studied using phase-contrast microscopy as described by Bernardet et al. (2002). Gram-staining was performed with the Stain Set S kit (BD Difco). Colony morphology was investigated on R2A agar using a stereoscopic microscope (SMZ 800; Nikon). The presence of flexirubin and carotenoid types of pigments was examined as described by Reichenbach (1992) and Schmidt et al. (1994). The physiological characteristics of strain KDG-16 T and the three reference strains were examined by growing the bacteria at various temperatures, pH values and NaCl concentrations. The temperature range for bacterial growth was determined on R2A agar at 4, 10, 15, 20, 25, 30, 35, 37, 40, 45 and 50 °C. The pH range for bacterial growth was estimated at pH 4-9 (at intervals of 0.5 pH unit) using R2A broth at 25 °C. The pH of the medium was adjusted prior to sterilization to pH 4-9 using biological buffers such as citrate/Na2HPO4, phosphate and Tris (Breznak and Costilow 2007). To measure tolerance to NaCl, R2A broth was prepared with NaCl concentrations of 0, 0.5 and 1-5% (w/v; at intervals of 1%). Growth under anaerobic conditions was determined by streaking strain KDG-16 T on R2A agar and on R2A agar supplemented with nitrate (0.1% KNO3) and incubating the plates in anaerobic jars with AnaeroGen anaerobic system envelopes (Oxoid) at 25 °C for 21 days. Bacterial growth was studied on R2A agar, nutrient agar, Luria-Bertani agar and trypticase soy agar (all from Difco) under aerobic conditions at 25 °C for 21 days.

Chemotaxonomy
The fatty acid profiles of strain KDG-16 T and the three reference strains were analyzed from cells grown on R2A agar at 25 °C for 3 days. Fatty acid methyl esters were prepared and separated using a standard MIDI protocol (Sherlock Microbial Identification System, version 6.0), analyzed by GC (Hewlett-Packard 5890 Series II) and identified using the RTSBA6.00 database (Sasser 1990). Polar lipids were extracted and analyzed by two-dimensional TLC as described by Embley and Wait (1994). Ethanolic molybdophosphoric acid was used for the detection of total polar lipids, ninhydrin for aminolipids, the α-naphthol reagent for glycolipids and the Zinzadze reagent for phospholipids. Polyamines were extracted and identified according to the methods described by Busse and Auling (1988) and Busse et al. (1997). Cells of strain KDG-16 T were cultivated in R2-PYE medium as described by Kämpfer et al. (2007) at 25 °C for 3 days, homogenized in 0.2 M perchloric acid (HClO4) and centrifuged. Polyamines in the resultant supernatant were treated with dansyl chloride solution (7.5 μg ml−1 in acetone) and analyzed by HPLC on a D-7000 high-speed liquid chromatograph with an L-7420 UV-VIS detector (Hitachi). Isoprenoid quinones were extracted and purified according to the method of Collins (1994) and analyzed by HPLC with a Spherisorb ODS column.

Determination of the 16S rRNA gene sequence and phylogenetic analysis
Genomic DNA was extracted using a bacterial genomic DNA purification kit (DP02-150, GeneMark) for 16S rRNA gene analysis. The 16S rRNA gene was amplified using the universal primer set (27F and 1541R) and then sequenced using four primers (27F, 520F, 800R and 1541R) (Weisburg et al. 1991; Anzai et al. 1997). The sequence obtained was compared with those available from EzBioCloud (Yoon et al. 2017). Multiple sequence alignment was carried out using Clustal W (Larkin et al. 2007) and the BioEdit software (Hall 1999). Phylogenetic analyses were performed with the neighbor-joining (NJ) (Saitou and Nei 1987), maximum-likelihood (ML) (Felsenstein 1981) and maximum-parsimony (MP) (Kluge and Farris 1969) methods in the program MEGA 7 (Kumar et al. 2016). Bootstrap values were calculated based on 1000 resamplings.
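As an illustration of the resampling idea behind the bootstrap analysis, the following minimal Python sketch resamples alignment columns with replacement and recomputes a simple uncorrected p-distance. It is not the MEGA 7 procedure used in this study: the toy sequences, the function names (p_distance, bootstrap_alignment) and the "closer to B than to C" statistic are assumptions chosen only to make the principle concrete.

```python
import random

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gapped sites skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Build one pseudo-replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

# Hypothetical toy alignment (not real 16S rRNA gene data).
toy = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ACGAAC-TTT",
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]

# Fraction of replicates in which A is closer to B than to C.
support = sum(
    p_distance(r["strain_A"], r["strain_B"]) < p_distance(r["strain_A"], r["strain_C"])
    for r in replicates
) / len(replicates)

# Pairwise similarity (%) on the original alignment.
similarity_ab = 100 * (1 - p_distance(toy["strain_A"], toy["strain_B"]))
print(f"A-B similarity: {similarity_ab:.1f}%, support for A+B grouping: {support:.2f}")
```

In a real analysis each replicate would be used to rebuild the whole tree, and the bootstrap value of a node would be the percentage of replicate trees that recover it.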

Genome sequencing, analysis and comparison
Genome sequencing was performed by Genomics BioSci & Tech. Co., Ltd. (Taipei, Taiwan, ROC) using the Illumina NextSeq platform, and read quality was evaluated using MultiQC v1.2 (Ewels et al. 2016). The sequence assembly was performed using SPAdes (version 3.10.1) (Bankevich et al. 2012). Gene prediction and annotation were carried out using the Prokka pipeline (Seemann 2014). The protein-encoding genes were classified into functional categories with eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups)-Mapper, using precomputed clusters and phylogenies from the eggNOG database according to Huerta-Cepas et al. (2016). Digital DNA-DNA hybridization (dDDH) analysis was performed on the DSMZ Genome-to-Genome Distance Calculator platform as described by Meier-Kolthoff et al. (2013). Average nucleotide identity (ANI) values were calculated by OrthoANI analysis (Lee et al. 2016). Average amino acid identity (AAI) calculations were performed online (http://enveomics.ce.gatech.edu/). A genome-based phylogenetic tree was reconstructed using an up-to-date bacterial core gene set (UBCG, concatenated alignment of 92 core genes) as described by Na et al. (2018). For genome comparison, the genome sequences of strain KDG-16 T and three genome sequences from the phylogenetically closely related Flavobacterium species were annotated by the NCBI Prokaryotic Genome Annotation Pipeline and the Rapid Annotation of microbial genomes using Subsystem Technology (RAST) server as described by Overbeek et al. (2014). Comparative gene contents were analyzed using the enhanced software platform EDGAR 2.0 according to the method described by Blom et al. (2016).

16S rRNA gene sequencing and phylogenetic analysis
The sequenced length of the 16S rRNA gene for strain KDG-16 T was 1435 bp (GenBank accession number MH544703). The sequence similarity calculations revealed that strain KDG-16 T was most closely related to species of the genus Flavobacterium, and had the highest sequence similarity with F. terrigena DSM 17934 T (97.7%), followed by F. glaciei 0499 T (95.2%) and F. urocaniciphilum JCM 19142 T (95.1%). Sequence similarities < 95.1% were observed with the type strains of all other species listed in Fig. 1. Strain KDG-16 T shared 94.2% similarity with the type strain of the type species of the genus, F. aquatile DSM 1132 T. Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain KDG-16 T formed a distinct phylogenetic cluster with F. terrigena DSM 17934 T and F. urocaniciphilum JCM 19142 T within the genus Flavobacterium in the neighbor-joining tree (Fig. 1). The overall topologies of the maximum-likelihood and maximum-parsimony trees were similar.

Whole genome analysis, average nucleotide identity and average amino acid identity calculations, digital DNA-DNA hybridization scores and UBCG phylogenetic tree construction
The genome of strain KDG-16 T was 2.96 Mb (GenBank accession number NZ_JAAJBT000000000) with a G + C content of 31.6%. It was composed of 33 contigs with an average coverage of 321× and an N50 size of 307,762 bp. A single copy of the 16S rRNA gene was found in the annotated genome, which showed 100% similarity to the amplified 16S rRNA gene sequence. A total of 2586 protein-encoding genes, three rRNA genes and 46 tRNA genes were predicted. According to the eggNOG database, the 2586 protein-encoding genes in the genome of strain KDG-16 T were classified into 20 functional categories (Supplementary Table S1).
The largest share of coding sequences was classified as general function prediction only (R, 8.9%), followed by those identified as having roles in cell wall/membrane/envelope biogenesis (M, 7.7%), function unknown (S, 6.2%), translation, ribosomal structure and biogenesis (J, 5.5%) and amino acid transport and metabolism (E, 5.3%).
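For readers unfamiliar with the assembly statistics quoted above, the following minimal Python sketch shows how an N50 value and a G + C percentage are commonly computed from a set of contigs. It is an illustration only, not the pipeline used in this study; the function names (n50, gc_percent) and the toy inputs are assumptions.

```python
def n50(contig_lengths):
    """Length of the contig at which the running sum of the sorted
    (longest-first) contig lengths reaches half of the assembly size."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

def gc_percent(sequence):
    """G + C content (%) over unambiguous bases (A, C, G, T) only."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100 * (seq.count("G") + seq.count("C")) / counted

# Hypothetical toy inputs (not the KDG-16 T assembly).
toy_lengths = [100, 80, 60, 40, 20]
toy_sequence = "ATGCATAT"

print(n50(toy_lengths))          # contig length at the half-assembly point
print(gc_percent(toy_sequence))  # percentage of G and C bases
```

Applied to the real assembly, the same calculation would take the lengths of the 33 contigs and the concatenated contig sequences as input.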
ANI values were calculated between the genome of strain KDG-16 T and the type strains of other closely related Flavobacterium species with publicly available whole genome sequences. The ANI values were 70.7-84.2% (Supplementary Table S2), which were lower than the prokaryotic species delineation threshold of 95-96% (Richter and Rosselló-Móra 2009). The dDDH values between strain KDG-16 T and the closely related Flavobacterium species were 15.4-23.0% (Supplementary Table S2), which were below the threshold of 70% for species delimitation (Goris et al. 2007; Supplementary Fig. S2). The calculated AAI values were above the threshold of 60% for the genus boundary and below the threshold of 90% for species demarcation (Rodriguez-R and Konstantinidis 2014). These data support the conclusion that strain KDG-16 T represents a novel species in the genus Flavobacterium.
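The demarcation logic described above can be summarized in a short Python sketch. The cut-off constants are the thresholds cited in the text (the lower edge of the 95-96% ANI range is used); the function name supports_novel_species and the counterexample values are illustrative assumptions, and the AAI argument is optional because individual AAI values are not listed in the text.

```python
# Species-demarcation cut-offs cited in the text (all in percent).
ANI_CUTOFF = 95.0    # lower edge of the 95-96% range
DDDH_CUTOFF = 70.0
AAI_CUTOFF = 90.0

def supports_novel_species(max_ani, max_dddh, max_aai=None):
    """Return True when the highest pairwise values against all reference
    genomes fall below every species-level cut-off that was supplied."""
    checks = [max_ani < ANI_CUTOFF, max_dddh < DDDH_CUTOFF]
    if max_aai is not None:
        checks.append(max_aai < AAI_CUTOFF)
    return all(checks)

# Highest values reported for strain KDG-16 T (ANI 84.2%, dDDH 23.0%).
reported = supports_novel_species(max_ani=84.2, max_dddh=23.0)

# Hypothetical counterexample: values typical of two strains of one species.
counterexample = supports_novel_species(max_ani=97.1, max_dddh=75.0)

print(reported, counterexample)
```

Only the maximum value per metric is needed, because a single reference genome above a cut-off is enough to question species-level novelty.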
A genome-based phylogenetic tree was inferred using UBCG. The tree based on the coding sequences of 92 protein clusters showed that strain KDG-16 T formed a distinct phylogenetic cluster with F. terrigena DSM 17934 T and F. urocaniciphilum JCM 19142 T in the genus Flavobacterium (Fig. 2), which supports the assignment of strain KDG-16 T to a novel species of the genus Flavobacterium.

Genome comparative analysis
The genome sequences of strain KDG-16 T and three genome sequences from the genus Flavobacterium were used for comparative genome analysis: the type species F. aquatile DSM 1132 T, isolated from a deep well, and two type strains, F. terrigena DSM 17934 T, isolated from soil, and F. urocaniciphilum JCM 19142 T, isolated from a wastewater treatment plant. The genome characteristics of strain KDG-16 T and these three strains are shown in Supplementary Table S3. The results revealed that some genes were shared by all four strains and others differed among them (Table 1). Strain KDG-16 T had genes putatively encoding proteins associated with gliding motility, namely the gliding motility proteins GldB, GldC, GldD, GldF, GldG, GldH, GldI, GldJ, GldK, GldL, GldM and GldN, consistent with the gliding observed by phase-contrast microscopy (described below). Strain KDG-16 T had genes putatively encoding proteins related to carotenoid biosynthesis, including polyprenyl synthetase family protein, sterol desaturase family protein and lycopene cyclase, consistent with the absorption maximum at 455 nm of carotenoid pigments measured by spectrophotometer (described below). The most obvious difference was that only strain KDG-16 T had genes related to copper tolerance, e.g. apolipoprotein N-acyltransferase and DUF21 domain-containing protein involved in virulence and defense, genes related to NADPH:quinone oxidoreductase 2, e.g. NmrA family NAD(P)-binding protein, and genes related to the cluster containing glutathione synthetase, such as 16S rRNA (uracil1498-N3)-methyltransferase and Holliday junction resolvase RuvX involved in stress response; these genes were missing from the three reference genomes.
Figure caption (fragment): Filled circles indicate nodes that are also found with the maximum-likelihood and maximum-parsimony algorithms. Lutibacter flavus IMCC1507 T was used as an out-group. Bar, 0.01 substitutions per nucleotide position.
Conversely, only the three reference strains possessed genes putatively encoding proteins for tolerance to colicin E2 (involved in virulence and defense), 2-phosphoglycolate salvage (involved in DNA repair), proline synthesis (involved in amino acid metabolism) and glycolate and glyoxylate interconversion (involved in central carbohydrate metabolism); the novel strain KDG-16 T did not possess these genes.
In addition, only strain KDG-16 T and F. aquatile DSM 1132 T possessed genes related to capsular polysaccharide biosynthesis and assembly involved in cell wall and capsule biosynthesis, to the toxin-antitoxin replicon stabilization system involved in regulation and cell signaling, to photosynthesis, to galactosylceramide and sulfatide metabolism, to glycerolipid and glycerophospholipid metabolism, to trehalose biosynthesis and trehalose uptake and utilization, to lactose and galactose uptake and utilization involved in disaccharide and oligosaccharide metabolism, and to the cellulosome involved in polysaccharide metabolism (Supplementary Table S4). F. terrigena DSM 17934 T and F. urocaniciphilum JCM 19142 T did not possess these genes. Meanwhile, all four strains showed highly diverse distribution patterns in iron acquisition and metabolism and in nitrogen, sulfur, aromatic compound, lipid, DNA, amino acid and derivative, and carbohydrate metabolism.
When the percentage of genes that strain KDG-16 T shared with the type species of the genus was estimated, strain KDG-16 T shared 1823 genes (69.4%) with F. aquatile DSM 1132 T (Supplementary Fig. S3A). When strain KDG-16 T, F. terrigena DSM 17934 T, F. urocaniciphilum JCM 19142 T and F. aquatile DSM 1132 T were analyzed together, 1613 genes of strain KDG-16 T (about 61.4% of the total number of genes) were common to all four strains, and 427 genes (about 16.3% of the total number of genes) were specific to strain KDG-16 T (Supplementary Fig. S3B). In summary, the genomic information provides basic knowledge related to the physiological and biochemical characteristics of these strains, such as the ability to metabolize various nutrients, resistance to different toxic compounds and adaptability to environmental changes. These capabilities confer the competitive ecological advantage for

Phenotypic and biochemical characteristics
Cells of strain KDG-16 T were Gram-stain-negative, strictly aerobic, catalase-positive, oxidase-negative, rod-shaped and motile by gliding (Supplementary Fig. S4). Colonies of strain KDG-16 T were light yellow. The optimal growth temperature, pH and NaCl concentration were 20-25 °C, pH 6-7 and 0%, respectively. Strain KDG-16 T was sensitive to chloramphenicol, kanamycin, nalidixic acid, novobiocin, rifampicin, streptomycin, tetracycline, gentamicin, ampicillin and penicillin G, and resistant to sulfamethoxazole plus trimethoprim. Detailed results from the phenotypic and biochemical analyses of strain KDG-16 T are provided in the species description, Table 2 and Supplementary Table S5.

Fatty acids, polar lipids, polyamines and isoprenoid quinones
The major cellular fatty acids (> 10% of the total fatty acids) of strain KDG-16 T were iso-C15:0 (22.9%), iso-C15:1 G (15.3%) and iso-C17:0 3-OH (11.2%). The fatty acid compositions of strain KDG-16 T and the three type strains are shown in Table 3. Strain KDG-16 T and the three reference strains had iso-C15:0, iso-C15:1 G and iso-C17:0 3-OH as the predominant cellular fatty acids. Their fatty acid compositions were similar, with slight differences in the proportions of some fatty acids. Strain KDG-16 T had a complex polar lipid profile consisting of phosphatidylethanolamine (PE), one uncharacterized aminophospholipid (APL), one uncharacterized phospholipid (PL), two uncharacterized aminolipids (AL1 and AL2) and two uncharacterized lipids (L1 and L2) (Supplementary Fig. S5). PE was the major polar lipid of strain KDG-16 T, consistent with previous descriptions of Flavobacterium species (Dong et al. 2013; Kang et al. 2013). Strain KDG-16 T contained homospermidine (HSPD, 88.5%) as the major polyamine, in line with other Flavobacterium species whose polyamine composition has been analyzed (Bernardet et al. 1996; Bernardet and Bowman 2011), and spermidine (SPD, 11.5%) as the minor component (Supplementary Fig. S6). The major respiratory quinone of strain KDG-16 T was menaquinone MK-6 (Supplementary Fig. S7), which is a common feature of members of the family Flavobacteriaceae (Bernardet et al. 2002). Overall, the chemotaxonomic characterization based on fatty acids, polar lipids, polyamines and quinones supported the inclusion of strain KDG-16 T in the genus Flavobacterium.
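The "> 10% of the total fatty acids" criterion for major components can be expressed as a simple filter, sketched below in Python. The three named values are those reported above for strain KDG-16 T; the fourth entry is a hypothetical placeholder added only so that the filter has something to exclude, and the function name major_components is an assumption.

```python
MAJOR_THRESHOLD = 10.0  # percent of total fatty acids

profile = {
    "iso-C15:0": 22.9,
    "iso-C15:1 G": 15.3,
    "iso-C17:0 3-OH": 11.2,
    "hypothetical minor component": 4.0,  # placeholder, not a value from Table 3
}

def major_components(fatty_acids, threshold=MAJOR_THRESHOLD):
    """Names of components above the threshold, most abundant first."""
    majors = [name for name, pct in fatty_acids.items() if pct > threshold]
    return sorted(majors, key=fatty_acids.get, reverse=True)

print(major_components(profile))
```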

Description of Flavobacterium difficile sp. nov.
Flavobacterium difficile (dif.fi.ci'le. L. neut. adj. difficile difficult, referring to difficulties in cultivating the type strain). Cells are Gram-stain-negative, strictly aerobic, rod-shaped and motile by gliding. Cells grow well on R2A agar, but not on nutrient, trypticase soy and Luria-Bertani agars. After 3 days of incubation on R2A agar at 25 °C, the mean cell size is 0.4-0.5 μm in width and 1.2-1.8 μm in length. Colonies on R2A agar are light yellow, convex and circular with regular margins. The colony size is approximately 0.5-1 mm in diameter after 3 days at 25 °C. Growth occurs at 15-30 °C (optimum, 20-25 °C), at pH 5.5-9 (optimum, pH 6-7) and with 0-0.5% NaCl (optimum, 0%). Positive for catalase activity and negative for oxidase activity. Positive for hydrolysis of starch, casein, DNA and Tweens 20 and 80. Negative for hydrolysis of chitin, CM-cellulose, corn oil, lecithin and Tweens 40 and 60. Carotenoid pigments are present with maximum absorption at 455 nm. Flexirubin-type pigments are not produced. Positive for gelatin hydrolysis. Alkaline phosphatase, C4 esterase, C8 esterase lipase, C14 lipase, leucine arylamidase, valine arylamidase, cystine arylamidase, trypsin, α-chymotrypsin, acid phosphatase, naphthol-AS-BI-phosphohydrolase and α-glucosidase activities are present. Acids are produced from esculin, starch, glycogen and 5-ketogluconate. Growth under aerobic conditions is positive on d-glucose, d-mannose, l-arabinose, acetate and Tween 40. The predominant fatty acids are iso-C15:0, iso-C15:1 G and iso-C17:0 3-OH. The polar lipid profile consists of phosphatidylethanolamine, one uncharacterized aminophospholipid, one uncharacterized phospholipid, two uncharacterized aminolipids and two uncharacterized lipids. Homospermidine is the major polyamine and spermidine is the minor polyamine. The major respiratory quinone is MK-6. The DNA G + C content of the type strain is 31.6%.
The type strain is KDG-16 T (= BCRC 81194 T = LMG 31332 T), isolated from the Dajin Waterfall in the Sandimen Township of Pingdong County, Taiwan. The GenBank/EMBL/DDBJ accession numbers for the 16S rRNA gene sequence and the whole genome of Flavobacterium difficile KDG-16 T are MH544703 and NZ_JAAJBT000000000, respectively.

Funding
The authors received no specific grant from any funding agency.

Conflicts of interest
The authors declare that there are no conflicts of interest.

Table 3 footnotes: Data are presented as percentages of the total fatty acids; the major fatty acids (> 10%) are in bold type. Only fatty acids representing more than 1% of the total fatty acids of at least one of the strains are shown. −, not detected. For unsaturated fatty acids, the position of the double bond is located by counting from the methyl (ω) end of the carbon chain. The cis isomer is indicated by the suffix c. *Summed features are groups of two or three fatty acids that are treated together for the purpose of evaluation in the MIDI system and include both peaks with discrete ECLs as well as those where the ECLs are not reported separately. Summed feature 3 was listed as C16:1 ω6c and/or C16:1 ω7c and summed feature 9 as iso-C17:1 ω9c and/or 10-methyl C16:0.