The overview of infant cohorts and sequencing data
Stool samples were collected from a total of 30 infants for the study (10 healthy and 20 infected infants). The 30 infants came from northern cities in China and ranged in age from 2 months to 8 months, with a sex ratio of nearly 1:1 (Table 1). To investigate the ongoing effects of C. botulinum on infant gut microbiota, we collected the fecal samples in 12 of 20 infected infants at days 1, 7, 14, 30, 60 and 90. A total of 47 longitudinal cohort samples were collected for 16S rRNA analysis. On average, each 16S sample contained 71,427 ± 11050 reads across all samples. In addition, 20 strains of C. botulinum were isolated from infected infant fecal samples for whole genome sequencing. The total size of whole genomes sequencing for strains was 26.46 G, with an average sequencing coverage of 339 ± 122.36 X per strain. Following pre-processing, all samples were saturated by genome completion assessment and ready for subsequent analysis. The specific analysis flow is shown in Figure S1.
The composition of intestinal microbiota in healthy and C. botulinum -infected infants
Based on the standard 16S data analysis protocol, 16 phyla, 24 orders, 34 orders, 54 families, 138 genera and 195 species were identified in healthy and C. botulinum -infected infants. In terms of taxonomic composition, the top 3 phyla were Actinobacteriota (37.46% mean abundance), Firmicutes (29.65%) and Proteobacteria (26.20%). At the genus level (Fig. 1A), the most abundant genera were Bifidobacterium (31.58%), Escherichia (15.87%) and Enterococcus (10.24%). In terms of diversity, differences in the overall distribution between the healthy and infected groups were found using PCoA analysis based on the Bray-Curtis dissimilarity method (PERMANOVA, P value < 0.05) (Fig. 1B). The Shannon index and the Simpson index of the healthy group were higher than those of the infected group but the difference was not statistically significant (Fig. 1C).
The classification and functional differences in intestinal microorganisms between healthy and C. botulinum infected infant cohorts
To systematically assess the differences in intestinal microbiota between healthy and infected infants, we analyzed the microbial communities with different abundance and the function of differential microorganisms. We detected 15 differential abundant microbiota at the genus level (LDA > 2), with Bifidobacterium showing the greatest degree of difference among those with higher abundance in the healthy group (LDA > 4). Gemella, Lachnoanaerobaculum and Veillonella were also different in the two groups. Besides, differential abundant microbiota with higher abundance in the infected group were Enterococcus (LDA > 4) and Lactobacillus (LDA > 3) (Fig. 2A).
In terms of tertiary metabolic functions, the two groups differed in 21 pathways (p < 0.05) (Fig. 2B). The metabolic pathways with the greatest differences were mainly amino sugar and nucleotide sugar metabolism, fructose and mannose metabolism and pyruvate metabolism, which were significantly enriched in the infected group. Compared with control group, fatty acid metabolism, cysteine and methionine metabolism and nicotinate and nicotinamide metabolism decreased in infection group.
In addition, we constructed microbial co-abundance networks for the healthy group and the infected group of infants respectively (Fig. 2C and 2D). We found that the network in the healthy group was more tightly linked than the network in the infected group, with more connections between nodes. In the healthy group, the degree of all the differential abundant microbiota was less than or equal to 3, except for Veillonella, which had a linkage greater than 5. Flavonifractor (degree = 8) was the most linked taxon in the healthy group network (Fig. 2C). The overall degree of the infected group network was not high. The most linked taxa in the infected group network were Dysgonomonas, Bilophila, and Dialister, all of which had a linkage of 5 (Fig. 2D).
Longitudinal analysis reveals the dynamics of key microbiota from C. botulinum infected infants
In the differential abundant microbiota analysis described above, we found that Bifidobacterium and Enterococcus differed most in the two cohorts. To examine the temporal dynamics of these two groups, we followed up some of the infected infants (12/20) and collected stool samples for longitudinal study. The cross-sectional 16S data analysis also showed significant differences in Enterococcus (P < 0.05) (Fig. 3A and B). The overall abundance of Bifidobacterium increased significantly after 30 days and the dispersion throughout decreased, whereas Enterococcus decreased rapidly after 14 days and was almost undetectable after 30 (Fig. 3C and D). We also compared the samples of patients at the initial stage of infection (Infection group) with the samples of patients at 60 days and 90 days after disease diagnosis (Recovery group). We found that the differential abundant bacteria between the two groups were also different between the infected group and the healthy group, with the exception of Bifidobacterium (Figure S2). Secondary metabolic pathway analysis revealed that the microbial-related metabolic pathways like Glycan biosynthesis and metabolism, amino acid metabolism, metabolism of cofactors and vitamins and biosynthesis of other secondary metabolites in the infected group gradually returned to the levels of the control group over time (Fig. 3E).
Genomic analysis of C. botulinum isolates from infected infants
The 20 C. botulinum isolates from the stool samples of the 20 infected infants with one isolate per case were sequenced using Illumina sequencing. The average genome length ranged from 3.8 to 4.3M, with an average N50 of around 846 kb (Table S1). All isolates carried the botulinum toxin gene with identical nucleotide sequence encoding the type B botulinum toxin (Figure S3). in silico MLST typing found that the 20 C. botulinum isolates from the patients were typed into two sequence types (ST), ST31 and ST32 with 10 isolates each. KEGG enrichment analysis based on the core genome of 20 strains of C. botulinum showed that ABC transporters and two-component system pathways had the highest gene enrichment degree (Figure S4).