Genome sequence of Corynebacterium amycolatum ICIS 99 isolated from human vagina reveals safety and beneficial properties

Corynebacterium amycolatum ICIS 99 was isolated from vaginal smears of healthy women and showed promising results in antimicrobial screenings. Here, we report the draft genome sequence of this strain and analyze its main features to assess its safety and useful properties. The genome is 2,532,503 bp long and contains 2186 CDSs with an average G + C content of 59.0%. Analyses of the ICIS 99 genome revealed the absence of true virulence factors. The genome contains genes involved in the synthesis of secondary metabolites and bacteriocins of the class sactipeptide. In the genome of ICIS 99, we identified a large number of genes responsible for adaptation and survival in the vaginal environment, including acid and oxidative stress resistance genes. The genomic information of ICIS 99 provides a basis for understanding the safety and useful properties of ICIS 99 and for considering it as a potential probiotic strain. The whole genome shotgun project has been deposited at DDBJ/ENA/GenBank under the accession number JAIUSU000000000.


Introduction
The genus Corynebacterium was described in 1896 by Lehmann & Neumann and includes straight or slightly curved rod-shaped, Gram-positive, catalase-positive, nonspore-forming, non-motile bacteria that belong to the family Corynebacteriaceae of the phylum Actinobacteria (Lehmann 2016). Currently, this genus contains 120 valid species (www.bacterio.net/corynebacterium.html). Corynebacteria are widespread in the environment and can be isolated from soil, water, and plant material. Certain species of corynebacteria are of great importance in industry and agriculture (Ikeda and Takeno 2013). Some Corynebacterium spp. are well-known pathogens of humans or animals; other species are part of the normal microbiota, although they may occasionally cause infections in humans or can be transmitted to humans by zoonotic contact (Tauch and Sandbote 2014).
Corynebacterium amycolatum is a typical bacterium of the microbiota on human skin and mucous membranes. It was first isolated by Collins and Burton from clinical specimens in 1988 (Collins et al. 1988). A distinctive feature of this species compared to other corynebacteria is the absence of typical mycolic acids in the cell membrane. According to the 16S rRNA gene phylogenetic analyses of different Corynebacterium species, C. amycolatum is only distantly related to other corynebacteria (Pascual et al. 1995).
In the literature, studies on C. amycolatum are limited, and this species is described as a causative agent of sepsis, endocarditis, meningitis, septic arthritis, and urinary tract infections (Reddy et al. 2012). However, the similarity of non-diphtheria corynebacteria often calls into question the identification of isolates from clinical specimens (Esteban et al. 1999).
In this study, we report the draft genome sequence of C. amycolatum ICIS 99, which was isolated from vaginal smears of healthy women during our pilot study aimed  ). Therefore, the present study was undertaken to study the genome of this strain to assess its safety and useful properties.

Materials and methods
Bacterial strain and genomic DNA extraction
C. amycolatum ICIS 99 was isolated from the vaginal contents of healthy women and is part of the Collection of Microorganisms of the Institute for Cellular and Intracellular Symbiosis UrB RAS (Orenburg, Russia). The strain was grown in Luria-Bertani (LB) Miller broth (Becton Dickinson, USA) at 37 °C for 24 h and sub-cultured once in the same medium at 37 °C for 24 h before use. The strain morphology was observed by scanning electron microscopy (Supplementary Fig. 1).
Genomic DNA was extracted using a Quick DNA Fungal/Bacterial Kit (Zymo Research, USA). The quality of the extracted DNA was assessed according to the A260/280 ratio using a Nanodrop 8000 (Thermo Fisher Scientific, USA), and electrophoresis was performed in a 1% agarose gel. DNA concentration was quantified using a Qubit 4 Fluorometer and a dsDNA High Sensitivity Assay Kit (Life Technologies, USA).

Genome sequencing, assembly and annotation
The DNA library for whole-genome sequencing was prepared using a NEBNext® Ultra™ II FS DNA Library Prep Kit for Illumina® (New England BioLabs, USA). The library was validated using capillary electrophoresis on a QIAxcel Advanced System (Qiagen, Germany) using the QIAxcel DNA High Resolution Kit and normalized using qPCR on the CFX Connect Real-Time PCR System (Bio-Rad, USA). Paired-end sequencing (2 × 250 bp) was carried out on a MiSeq platform (Illumina, USA) using Reagent Kit v.2 (Illumina, USA) in the Center of Shared Scientific Equipment "Persistence of microorganisms" at the Institute for Cellular and Intracellular Symbiosis UrB RAS.
The paired-end reads were filtered and trimmed with the Trimmomatic program (Bolger et al. 2014). De novo genome assembly was performed using the Unicycler v. 0.4.9b genome assembler (Wick et al. 2017). Genes were predicted on the resulting contigs using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) and Rapid Annotation using Subsystem Technology (RAST) (Aziz et al. 2008). RAST data were visualized using GraphPad Prism version 9.2.0 (GraphPad Software, La Jolla, CA, USA) (www.graphpad.com). The interactive visualization of the genome of Corynebacterium amycolatum ICIS 99 was performed with GView (Petkau et al. 2010).
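The read-trimming step can be illustrated with a minimal sketch of sliding-window quality trimming. This is a simplified version of the general idea, not Trimmomatic's exact implementation, and the window size and quality threshold below are assumptions, since the trimming parameters are not stated in this section. The function names are hypothetical.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string (Phred+33 encoding) into integer scores."""
    return [ord(ch) - 33 for ch in qual_string]


def sliding_window_trim(seq, quals, window=4, min_avg=15):
    """Cut a read at the start of the first window whose mean quality
    falls below min_avg. window and min_avg are illustrative defaults,
    not the values used in the study."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    # Reads shorter than the window are returned unchanged.
    for start in range(0, max(len(seq) - window + 1, 0)):
        win = quals[start:start + window]
        if sum(win) / window < min_avg:
            return seq[:start], quals[:start]
    return seq, quals


# Hypothetical read: high-quality start, low-quality tail.
read = "ACGTACGTAC"
scores = [30, 30, 30, 30, 30, 30, 5, 5, 5, 5]
trimmed_seq, trimmed_quals = sliding_window_trim(read, scores)
```

In this toy read, the first window with a mean quality below 15 starts at position 5, so the read is cut there and the low-quality tail is discarded.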
Tentative taxonomic identification was carried out using the BLASTn 2.12.0+ program against the rRNA/ITS databases of the Targeted Loci Project. Summary results are presented in Supplementary Table 1.
The bioinformatic tools antiSMASH 5.0 (Blin et al. 2021), NaPDoS (Ziemert et al. 2012) and BAGEL4 (Heel et al. 2018) were used to determine potential clusters of secondary metabolites with antimicrobial activity. Antibiotic-resistance genes in the genome were predicted using the RGI (Resistance Gene Identifier) tool (Alcock et al. 2020). The presence of putative virulence genes in the genome was investigated using the Virulence Factor of Bacterial Pathogens Database (VFDB) (Liu et al. 2019). The CRISPR regions were identified with the online detection tool CRISPRCasFinder (https://crisprcas.i2bc.paris-saclay.fr/CrisprCasFinder/Index) (Couvin et al. 2018).

Nucleotide sequence accession number
This whole genome shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JAIUSU000000000. The version described in this paper is version JAIUSU010000000.

General genomic features and genome-based phylogeny of ICIS 99
The genome characteristics of strain ICIS 99 are presented in Table 1 and Fig. 1. The draft genome of ICIS 99 was 2,532,503 bp long with an N50 length of 167,044 bp, an L50 of 6, and a G + C content of 59.0%. The final assembled genome consisted of 41 contigs. Genome annotation was performed using the National Center for Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline (PGAP) (http://www.ncbi.nlm.nih.gov/genome/annotation_prok). It identified 2186 coding sequences (CDSs), comprising 2156 protein-coding sequences and 30 pseudogenes, as well as 3 complete rRNAs (5S, 16S, and 23S) and 54 tRNAs.
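For readers unfamiliar with the assembly metrics reported above, the following minimal sketch shows how total length, N50, L50 and G + C content are conventionally computed from a set of contigs. The function name and the toy contigs are hypothetical; this is not the code used to produce the values in Table 1.

```python
def assembly_stats(contigs):
    """Compute basic assembly metrics from a list of contig sequences.

    N50 is the length of the contig at which the cumulative length of
    contigs, sorted from longest to shortest, first reaches half of the
    total assembly length; L50 is the number of contigs needed to get there.
    """
    lengths = sorted((len(c) for c in contigs), reverse=True)
    total = sum(lengths)
    n50 = l50 = 0
    running = 0
    for rank, length in enumerate(lengths, start=1):
        running += length
        if running * 2 >= total:
            n50, l50 = length, rank
            break
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    return {
        "total_length": total,
        "n_contigs": len(lengths),
        "N50": n50,
        "L50": l50,
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


# Toy assembly of four contigs (40, 30, 20 and 10 bp).
toy = ["GC" * 20, "AT" * 15, "GA" * 10, "TT" * 5]
stats = assembly_stats(toy)
```

For the toy input, the two longest contigs (40 + 30 bp) already cover at least half of the 100 bp total, so N50 is 30 and L50 is 2.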
Average nucleotide identity (ANI), OrthoANI and DNA-DNA hybridization (DDH) analyses confirmed that ICIS 99 is a C. amycolatum strain.
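The logic of a genome-based species assignment can be sketched as a simple threshold check. The cutoffs below (about 95% ANI and 70% DDH) are the commonly used species boundaries and are stated here as assumptions; this section does not report the cutoffs or the actual ANI, OrthoANI and DDH values obtained for ICIS 99, so the numbers in the example call are purely hypothetical.

```python
# Commonly used species-boundary cutoffs (assumed, not taken from this paper).
ANI_CUTOFF = 95.0   # percent average nucleotide identity
DDH_CUTOFF = 70.0   # percent DNA-DNA hybridization


def same_species(ani_percent, ddh_percent,
                 ani_cutoff=ANI_CUTOFF, ddh_cutoff=DDH_CUTOFF):
    """Return True when both ANI and DDH against a reference genome
    meet the species-level cutoffs."""
    return ani_percent >= ani_cutoff and ddh_percent >= ddh_cutoff


# Hypothetical comparison values for illustration only.
is_match = same_species(ani_percent=98.5, ddh_percent=85.0)
is_other = same_species(ani_percent=90.0, ddh_percent=40.0)
```

Requiring both criteria is a conservative design choice; in practice the two measures usually agree, and borderline values call for additional evidence.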

Unique genomic characteristics of ICIS 99
To assess the safety of this strain, we examined the genome for the presence of genes related to antibiotic resistance and virulence. The RGI tool was used to predict antibiotic-resistance genes in ICIS 99. Only one resistance gene was found, coding for resistance to the antibiotic chloramphenicol.
The VFDB was used to predict virulence factors in ICIS 99 and returned 18 hits (Supplementary Table 3). These putative virulence factors can be classified into ten categories: adherence, iron uptake, regulation, amino acid and purine metabolism, anti-phagocytosis, cell surface components, immune evasion, lipid and fatty acid metabolism, protease and secretion system. However, none of the genes identified were true virulence factors. These genes encode only proteins responsible for the structural and functional characteristics of microorganisms of the genus Corynebacterium or 'niche factors' (Swierczynski and Ton-That 2006; Wennerhold and Bott 2006; Tauch and Burkovski 2015; Baumgart et al. 2016; Bush 2018). True virulence factors, such as toxin-related genes (diphtheria toxin and phospholipase D) and hemolysin-related genes characteristic of well-known pathogenic corynebacteria, were not identified.
The previously described antibacterial activity of cell-free supernatants (CFSs) from ICIS 99 suggests that this strain contains genes responsible for the synthesis of secondary metabolites and bacteriocins. The presence of secondary metabolic gene clusters was examined with the antiSMASH 5.0 platform (Blin et al. 2021) and NaPDoS (Ziemert et al. 2012). antiSMASH 5.0 revealed 4 putative biosynthetic gene clusters, which were predicted to be related to the biosynthesis of different types of secondary metabolites: terpene, T3PKS (type III polyketide synthase), NAPAA and NRPS (nonribosomal peptide synthetase) (Supplementary Table 4). Two gene clusters showed low similarity (≤ 10%) with the most similar known clusters: the T3PKS gene cluster was associated with the biosynthesis of merochlorin A-D-like compounds, and the NRPS gene cluster was associated with the biosynthesis of phthoxazolin-like compounds. The other two gene clusters did not resemble any registered gene clusters. NaPDoS revealed one putative gene cluster, T2PKS (type II polyketide synthase), that was associated with the biosynthesis of actinorhodin (similarity 27%). The existence of these secondary metabolic gene clusters suggests that strain ICIS 99 may be a producer of new natural bioactive products.
To search for genes involved in the synthesis of potential bacteriocins, we used BAGEL4. The genome of strain ICIS 99 had one area of interest (AOI) that included the class sactipeptide (Fig. 4 and Supplementary Table 5). Sactipeptides are a new and growing class of ribosomally synthesized and post-translationally modified peptides (RiPPs) that have garnered substantial attention because of their structural diversity and biological activities (Balty et al. 2019). The presence of an AOI encoding a sactipeptide in the genome of strain ICIS 99 indicates that the strain may produce a sactipeptide.
In the genome of ICIS 99, we identified a large number of genes coding for proteins involved in the stress response. These stresses include pH, osmotic pressure, nitrogen stress and oxidative stress. The presence of these genes likely underlies the ability of ICIS 99 to adapt to and colonize the vaginal biotope of healthy women, which is a fairly aggressive environment for most pathogenic and opportunistic microorganisms. The detailed analysis of these genes is presented in Table 6.
Fig. 1 Circular map of the genome of Corynebacterium amycolatum ICIS 99. Each ring represents the loci of genes that are labeled outside the outermost ring: (from outermost to innermost) forward coding sequences (green); reverse coding sequences (green); contigs (dark red); GC skew ± (blue/violet); genome size (black). Triangles within rings: rRNA (dark blue); tRNA (red); tmRNA (yellow) (color figure online)
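The GC skew ring in Fig. 1 is conventionally derived from the quantity (G − C)/(G + C) computed in windows along the sequence. The sketch below shows that calculation under assumed parameters; the window and step sizes used by GView for the published figure are not given here, and the function name is hypothetical.

```python
def gc_skew(seq, window=1000, step=None):
    """Return (window_start, skew) pairs, where skew = (G - C) / (G + C).

    Windows without any G or C are assigned a skew of 0.0.
    window and step are illustrative; step defaults to non-overlapping windows.
    """
    step = step or window
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        result.append((start, skew))
    return result


# Toy sequence with a G-rich, a C-rich and an AT-only window.
toy_skew = gc_skew("GGGC" + "CCCG" + "ATAT", window=4)
```

Positive values (more G than C) and negative values (more C than G) correspond to the two colors of the skew ring in the circular map.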
The genome of ICIS 99 was analyzed for the occurrence and diversity of CRISPR-Cas systems. The identification of CRISPR loci suggests the presence of a prokaryotic immune system, which confers resistance to foreign genetic elements and provides a form of acquired immunity (Amitai and Sorek 2016). The ICIS 99 genome contained three CRISPR loci (CRISPR1-CRISPR3) (Supplementary Table 7). The detected CRISPR1/CRISPR-associated (Cas) system was type I-E (Cas5, Cas7, Cse2, Cas6, Cas3, Cas1, and Cas2).
In conclusion, the genomic information for ICIS 99 provides the basis for understanding the safety and useful properties of ICIS 99 and for considering its use as a potential probiotic strain. However, further research is needed to test its probiotic efficacy in vivo.