Genomic Characterization of Common Pathogenic Nocardia Species.
This study provides a genomic characterization of pathogenic Nocardia species based on 33 complete or chromosome-level genome assemblies. These included common species such as N. farcinica (4 strains), N. cyriacigeorgica (3 strains), and N. brasiliensis (3 strains), among others. We also included 73 complete or chromosome-level genomes of closely related organisms, namely non-tuberculous mycobacteria (NTM), Mycobacterium tuberculosis (MTB), Gordonia, and Tsukamurella (Table S1). As shown in Table 1, Nocardia had a larger genome (6.2–10.01 Mb) and more genes (5268–8560) than NTM (~4.54 Mb, ~3786), MTB (~4.54 Mb, ~3786), Gordonia (~5.96 Mb, ~5191), and Tsukamurella (~4.92 Mb, ~4963). Within Nocardia, the two most common pathogenic species, N. farcinica and N. cyriacigeorgica, had slightly smaller genomes and fewer genes than the other Nocardia species (~6.61 Mb and ~6025 genes; ~6.48 Mb and ~5633 genes, respectively). Additionally, the GC content of Nocardia (66.5–71.5%) was generally higher than that of NTM (~69%), MTB (~65.5%), and Gordonia (~69%), and slightly lower than that of Tsukamurella (68–71%).
Table 1
Genomic features of the common pathogenic Nocardia species and closely related organisms
Organism | No. of strains | Average genome size (Mb) | Average gene number (CDS) | Average GC content (%) | Average core gene number (%) | Average accessory gene number (%) | Average unique gene number (%) | Average exclusively absent gene number (%) |
Nocardia | 33 | 7.8 | 6672 | 68.5 | 1773 (26.57) | 4329 (64.88) | 562 (8.42) | 8 (0.12) |
N. brasiliensis | 3 | 8.77 | 7957 | 68 | 5607 (70.47) | 1098 (13.80) | 703 (8.83) | 549 (6.90) |
N. cyriacigeorgica | 3 | 6.32 | 5603 | 68 | 4369 (77.98) | 494 (8.82) | 493 (8.80) | 247 (4.41) |
N. farcinica | 4 | 6.45 | 5899 | 70.5 | 4865 (82.47) | 651 (11.04) | 266 (4.51) | 117 (1.98) |
Non-tuberculous mycobacteria (NTM) | 31 | 5.86 | 4935 | 66.9 | 1543 (31.27) | 3232 (65.49) | 156 (3.16) | 4 (0.08) |
Mycobacterium tuberculosis (MTB) | 23 | 4.42 | 3646 | 65.4 | 3245 (89.00) | 381 (10.45) | 12 (0.33) | 8 (0.22) |
Gordonia | 12 | 4.99 | 4326 | 67 | 1351 (31.23) | 2148 (49.65) | 808 (18.68) | 19 (0.44) |
Tsukamurella | 7 | 4.71 | 4561 | 70.5 | 2794 (61.26) | 1344 (29.47) | 326 (7.15) | 97 (2.13) |
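The percentages in parentheses in Table 1 appear to be each category's count divided by the average gene number (CDS) of the same row. The following minimal sketch checks this for the Nocardia row; the counts are copied from Table 1, while the function name `category_percentages` is hypothetical and introduced only for illustration.

```python
# Illustrative check of how the percentages in Table 1 relate to the counts.
# Counts are copied from the Nocardia row of Table 1.
# The function name is hypothetical, not part of the study's pipeline.

def category_percentages(total_genes, counts):
    """Return each category count as a percentage of total_genes (2 decimals)."""
    return {name: round(100 * n / total_genes, 2) for name, n in counts.items()}

nocardia_counts = {
    "core": 1773,
    "accessory": 4329,
    "unique": 562,
    "exclusively_absent": 8,
}

# 6672 is the average gene number (CDS) reported for Nocardia in Table 1.
nocardia_pct = category_percentages(6672, nocardia_counts)
print(nocardia_pct)
```

Under this reading, the four categories of a row sum to its average gene number, which holds for the Nocardia row (1773 + 4329 + 562 + 8 = 6672).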
We then conducted a phylogenetic analysis of the strains described above: Nocardia (33), MTB (23), NTM (31), Gordonia (12), and Tsukamurella (7). As illustrated in Fig. 1, the different genera clustered in separate clades, and the different Nocardia species formed separate evolutionary branches. In particular, N. farcinica and N. cyriacigeorgica, the most common pathogenic species, formed a closely related cluster, while N. brasiliensis appeared as a distinct branch. Moreover, the long branch lengths observed for Nocardia species suggest substantial genetic variation among strains, indicating a noteworthy level of genetic diversity within Nocardia spp.
These phylogenetic findings were consistent with the results of our comparative genomic analysis. Strains belonging to the same Nocardia species had higher Average Nucleotide Identity (ANI) values and more homologous genes than strains from different species (Table S2A-C). ANI values ranged from 94–99% among strains of the same Nocardia species, whereas the ANI between different Nocardia species, such as N. farcinica, N. cyriacigeorgica, and N. brasiliensis, was approximately 79% (Table S2A). In the gene homology analysis (Table S2B-C), strains of the same Nocardia species displayed the highest homology, sharing more than 5000 homologous genes (a homologous gene ratio above 92%). About 3500 homologous genes (a homologous gene ratio of 65–85%) were shared among N. farcinica, N. cyriacigeorgica, and N. brasiliensis. Among the related genera, Nocardia showed the highest homology with Gordonia and Tsukamurella, with approximately 2500 homologous genes and a homologous gene ratio of about 55%. Homology with MTB was lower, with around 2200 homologous genes and a homologous gene ratio of approximately 45% (Table S2A-C).
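The within-species versus between-species ANI comparison described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the strain labels and ANI values are invented toy data chosen only to resemble the reported ranges, and `summarize_ani` is a hypothetical helper.

```python
from itertools import combinations

def summarize_ani(ani, species_of):
    """Split pairwise ANI values into within- and between-species groups.

    ani: dict mapping frozenset({strain_a, strain_b}) -> ANI in percent.
    species_of: dict mapping strain -> species label.
    Returns ((min, max) within species, (min, max) between species).
    Assumes at least one pair exists in each group.
    """
    within, between = [], []
    for a, b in combinations(sorted(species_of), 2):
        value = ani[frozenset((a, b))]
        if species_of[a] == species_of[b]:
            within.append(value)
        else:
            between.append(value)
    return (min(within), max(within)), (min(between), max(between))

# Toy data (hypothetical strains and values, NOT taken from Table S2A).
species_of = {"f1": "N. farcinica", "f2": "N. farcinica",
              "c1": "N. cyriacigeorgica", "c2": "N. cyriacigeorgica"}
ani = {
    frozenset(("f1", "f2")): 98.7,
    frozenset(("c1", "c2")): 95.2,
    frozenset(("f1", "c1")): 79.1,
    frozenset(("f1", "c2")): 79.3,
    frozenset(("f2", "c1")): 78.9,
    frozenset(("f2", "c2")): 79.0,
}

within_range, between_range = summarize_ani(ani, species_of)
print("within species:", within_range)
print("between species:", between_range)
```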
Pan-genome and comparative genome analyses reveal specific core genes (SCGs) of the common pathogenic Nocardia species.
To identify core genes specific to common pathogenic Nocardia, we employed a pan-genomic analysis strategy, with closely related strains of NTM, MTB, Gordonia, and Tsukamurella analyzed as controls. The number of core genes in each group was determined through pan-genomic analysis (Fig. 2). The pan-genome accumulation curves showed that all groups, including Nocardia, had an open pan-genome, whose size continued to increase as more genomes were added (Fig. S1). We also conducted a pan-genomic analysis of individual Nocardia species. The 4 N. farcinica strains contained 6038 orthologous genes, including 4865 core genes (82.47%); the 3 N. cyriacigeorgica strains contained 5573 orthologous genes, including 4369 core genes (77.98%); and the 3 N. brasiliensis strains contained 7711 orthologous genes, including 5607 core genes (70.47%) (Table 1). Overall, the percentage of core genes in Nocardia was lower than that observed in NTM, MTB, Gordonia, and Tsukamurella. Conversely, the percentage of strain-specific genes in Nocardia was higher than in NTM, MTB, and Tsukamurella (Table 1, Fig. 2). This difference may reflect the more conserved genome sequences of NTM, MTB, Gordonia, and Tsukamurella strains relative to Nocardia.
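The core, accessory, and unique categories used in the pan-genomic analysis above can be illustrated with a small gene presence/absence example. The sketch below assumes a common convention (core: present in every strain; unique: present in exactly one strain; accessory: everything in between); the tool used in the study may define or count these categories differently, and the exclusively absent category of Table 1 is not modeled. Strain names, orthologous group IDs, and `classify_pan_genome` are hypothetical.

```python
from collections import Counter

def classify_pan_genome(presence):
    """Partition orthologous groups into core, accessory, and unique sets.

    presence: dict mapping strain name -> set of orthologous group IDs.
    """
    n_strains = len(presence)
    # Count in how many strains each orthologous group occurs.
    occurrence = Counter(g for genes in presence.values() for g in genes)
    core = {g for g, k in occurrence.items() if k == n_strains}
    unique = {g for g, k in occurrence.items() if k == 1 and n_strains > 1}
    accessory = set(occurrence) - core - unique
    return core, accessory, unique

# Toy presence/absence data (hypothetical, not from the study).
presence = {
    "strain_A": {"og1", "og2", "og3", "og4"},
    "strain_B": {"og1", "og2", "og3", "og5"},
    "strain_C": {"og1", "og2", "og4", "og6"},
}

core, accessory, unique = classify_pan_genome(presence)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
print("unique:", sorted(unique))
```

In this toy case the pan-genome has six orthologous groups, of which two are core, so the core fraction is 2/6; adding a more divergent strain would typically shrink the core set and enlarge the pan-genome, which is the behavior an open pan-genome curve reflects.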
Subsequently, to search for specific core genes that could distinguish common pathogenic Nocardia isolates, we conducted a comprehensive comparative genome analysis of assembled genome sequences from 297 Nocardia strains (including 40 strains of N. farcinica, 32 strains of N. cyriacigeorgica, and 7 strains of N. brasiliensis, among others), 108 strains of NTM, 39 strains of MTB, 29 strains of Gordonia, and 23 strains of Tsukamurella (Table S3). The number of specific core genes varied among Nocardia species. In total, we identified 18 core genes specific to Nocardia spp., 4 core genes specific to N. farcinica, and 46 core genes specific to N. cyriacigeorgica; none were found to be exclusive to N. brasiliensis. These specific core genes are potential genetic markers for distinguishing these common pathogenic Nocardia species from one another. It should be noted, however, that the number of specific core genes may depend on which strains are included in the analysis.
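The screening logic described above, in which a specific core gene is present in every genome of the target group and absent from every genome outside it, can be sketched with set operations. This is a simplified illustration under that assumed definition; real screening works on homology search results with identity and coverage thresholds, which are not modeled here. The genome names, gene IDs, and `specific_core_genes` are hypothetical.

```python
def specific_core_genes(target_genomes, other_genomes):
    """Return genes found in ALL target genomes and in NO other genome.

    target_genomes, other_genomes: dicts mapping genome name -> set of gene IDs.
    """
    if not target_genomes:
        raise ValueError("at least one target genome is required")
    # Core genes of the target group: intersection over all target genomes.
    core = set.intersection(*target_genomes.values())
    # Any gene seen in at least one non-target genome is disqualified.
    seen_elsewhere = set().union(*other_genomes.values())
    return core - seen_elsewhere

# Toy data (hypothetical, not from Table S3).
target_genomes = {
    "target_1": {"geneA", "geneB", "geneC", "geneD"},
    "target_2": {"geneA", "geneB", "geneC", "geneE"},
}
other_genomes = {
    "other_1": {"geneA", "geneX"},
    "other_2": {"geneB", "geneE", "geneY"},
}

markers = specific_core_genes(target_genomes, other_genomes)
print(sorted(markers))
```

The sketch also shows why the result depends on the strain set: adding a target genome that lacks `geneC`, or a non-target genome that carries it, would remove `geneC` from the candidate list.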
Validation of the feasibility of Nocardia-specific core genes for clinical diagnosis.
To validate the clinical feasibility of the screened Nocardia-specific core genes, we extracted DNA templates from 18 samples of Nocardia species and 6 samples of closely related strains (Table 2). Following PCR amplification and agarose gel electrophoresis, we identified specific gene markers with promising clinical potential for diagnosing nocardiosis. Among the screened genes, 1 Nocardia spp. gene (F6W96_34950) and 5 N. cyriacigeorgica genes (NCTC10797_02287, NCTC10797_01760, NCTC10797_05842, NOCYR_2299, and C5B73_13220) demonstrated robust specificity; the gene and primer sequences are presented in Table S4. These markers were reliably associated with Nocardia infection, as illustrated in Fig. 3. However, subsequent validation did not yield any core genes specific to N. farcinica.
Table 2
Strains used for validation in this study.
Strains | Source of strains (a) | No. of strains |
Nocardia cyriacigeorgica | BCH | 8 |
Nocardia farcinica | BCH-NTCL | 8 |
Nocardia otitidiscaviarum | BCH | 1 |
Nocardia abscessus | BCH | 1 |
Nocardia wallacei | BCH | 1 |
Nocardia beijingensis | BCH | 1 |
Nocardia nova | BCH | 1 |
Nocardia puris | BCH-NTCL | 1 |
Tsukamurella | BCH-NTCL | 1 |
Gordonia | BCH-NTCL | 1 |
Mycobacterium tuberculosis | H37Rv (BCH-NTCL) | 1 |
Mycobacterium chelonae | ATCC35750 (BCH-NTCL) | 1 |
Mycobacterium abscessus | ATCC19977 (BCH-NTCL) | 1 |
Mycobacterium avium | ATCC25291 (BCH-NTCL) | 1 |
(a) BCH: Beijing Chaoyang Hospital; BCH-NTCL: National Tuberculosis Clinical Laboratory, Beijing Chest Hospital |
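One way to summarize the PCR validation described above is to compute, for each candidate marker, the fraction of target samples that yield a band (sensitivity) and the fraction of non-target samples that yield none (specificity). The sketch below is illustrative only: the sample labels and band calls are invented, and `marker_performance` is a hypothetical helper rather than a procedure reported in the study.

```python
def marker_performance(band_calls, is_target):
    """Compute (sensitivity, specificity) of one PCR marker.

    band_calls: dict mapping sample -> True if a band of the expected size was seen.
    is_target: dict mapping sample -> True if the sample belongs to the target group.
    Assumes at least one target and one non-target sample.
    """
    tp = sum(1 for s, band in band_calls.items() if band and is_target[s])
    fn = sum(1 for s, band in band_calls.items() if not band and is_target[s])
    tn = sum(1 for s, band in band_calls.items() if not band and not is_target[s])
    fp = sum(1 for s, band in band_calls.items() if band and not is_target[s])
    return tp / (tp + fn), tn / (tn + fp)

# Toy gel results (hypothetical samples and calls, not the data behind Fig. 3).
is_target = {"noc_1": True, "noc_2": True, "noc_3": True,
             "mtb_1": False, "gordonia_1": False}
band_calls = {"noc_1": True, "noc_2": True, "noc_3": True,
              "mtb_1": False, "gordonia_1": False}

sensitivity, specificity = marker_performance(band_calls, is_target)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```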