Variability of the Bacillus anthracis tryptophan operon

Abstract


Background
Bacillus anthracis is the causal agent of a zoonotic disease relevant to many countries and is an agent of bioterrorism. However, the reasons for the dependence on tryptophan of some strains with altered virulence have not been established, and there is almost no information on the tryptophan operon of this pathogen. In this study, we report the gene variability and structure of the tryptophan operon in B. anthracis strains of the three main lineages.

Results
For in silico analysis we used 112 B. anthracis genomes, including 68 available in the GenBank database and 44 sequenced at our institute. The B. anthracis tryptophan operon has an ancestral structure with a complete set of seven partially overlapping genes. The results show that the variability of all seven tryptophan operon genes is determined by single nucleotide polymorphisms and InDels. The trpA genes of strains of main lineage B and the trpG genes of strains of lineage C are pseudogenes, and the proteomes lack the corresponding enzymes of the biosynthetic pathway, which may explain the dependence of lineage B strains on tryptophan.

Conclusion
In this study, differences in the tryptophan operon genes of B. anthracis strains belonging to different main lineages were demonstrated for the first time. A mutation in the gene for the tryptophan synthase alpha subunit can explain the dysfunction of this enzyme and the dependence on tryptophan in strains of main lineage B. The identified features call for further study of tryptophan dependence in B. anthracis strains of main lineage B and may be of interest from the point of view of the intraspecific evolution of the anthrax pathogen.

Background
The causal agent of anthrax, Bacillus anthracis, causes a particularly dangerous zoonotic infection with a global range and is a group A agent of biological terrorism [1]. The ability of the spore form of this bacterium to persist in soil foci for decades and cause poorly predictable disease outbreaks among livestock, often accompanied by human infections, makes anthrax a problem for public health and veterinary medicine in many countries, including Russia [2]. Anthrax and its causal agent have been studied by researchers all over the world, but despite the long history of research on B. anthracis, some properties of this pathogen remain poorly understood. Among them is the dependence on tryptophan in a number of strains whose virulence is reduced [3].
The tryptophan biosynthesis pathway is one branch of the general branched pathway of aromatic amino acid biosynthesis, which starts with chorismic acid. The tryptophan (trp) operon is responsible for tryptophan biosynthesis. The genes and operons of the tryptophan biosynthetic pathway are organized differently in different types of bacteria. These differences reflect evolutionary divergence, as well as adaptation to unique metabolic capabilities and interactions with the environment [4].
Jacques Monod described the tryptophan operon of Escherichia coli for the first time in 1953. The Bacillus anthracis tryptophan operon contains genes for seven catalytic domains encoding five enzymes, including two α/β subunit complexes, tryptophan synthase and anthranilate synthase. This is the ancestral structure of the operon: a whole-pathway operon containing the full set of pathway-specific genes, which is widespread among prokaryotes. In some organisms the genes of biosynthetic pathways are scattered; in others they are organized in two or more "split-pathway" operons. The question is what kind of evolutionary relationships exist between these three types of pathway gene organization. The trp operon is a perfect model for studying biosynthetic pathways [5].
The mechanism of tryptophan dependence is not clear, and very little has been published on the trp operon of B. anthracis. One possible reason for tryptophan dependence is mutation in the genes that encode the enzymes of the tryptophan synthesis pathway. There is an important link between the organization and genomic context of the trp operon genes and the mechanism that regulates their expression. The regulatory mechanisms that control the transcription of tryptophan biosynthesis genes in B. anthracis are still poorly understood. It is known that, unlike Bacillus subtilis, B. anthracis lacks the trp RNA-binding attenuation protein (TRAP), encoded by the mtrB gene.
The limited knowledge of the trp operon and of the mechanism by which tryptophan auxotrophy develops in B. anthracis determines the relevance of this study. Our aim was to analyze the features of the genes and the structure of the trp operon in different B. anthracis strains.

Results
Comparison of the nucleotide sequences of the trp operon genes showed that the trpA gene is 651 bp long in 33 strains of main lineage B, 650 bp in strain I-373, and 777 bp in strain Tyrol 4675. In addition to the trpG gene in the tryptophan operon, which encodes aminodeoxychorismate/anthranilate synthase component II, the B. anthracis genomes of all three main lineages contain the pabA gene, whose product is annotated as "MULTISPECIES: aminodeoxychorismate/anthranilate synthase component II". This protein is also synthesized by many strains of Bacillus cereus, Bacillus thuringiensis and other bacilli. The trpG and pabA genes differ by multiple substitutions and InDels, as do the proteins they encode.
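As a rough illustration of how the trpA lengths quoted above relate to the reading frame, the following sketch checks whether each length is a whole number of codons. It is not the procedure used in this study, only an arithmetic check on the reported lengths, and the function name frame_summary is hypothetical. A length that is not a multiple of three is compatible with a frameshift, but calling a pseudogene would also require inspecting the sequence itself, for example for premature stop codons.

```python
# trpA gene lengths in base pairs, as reported in the text above.
trpa_lengths = {
    "lineage B (33 strains)": 651,
    "I-373": 650,
    "Tyrol 4675": 777,
}


def frame_summary(length_bp):
    """Return (whole_codons, leftover_bases) for a coding-sequence length.

    Hypothetical helper for illustration only.
    """
    return divmod(length_bp, 3)


for name, length in trpa_lengths.items():
    codons, leftover = frame_summary(length)
    if leftover == 0:
        status = "whole number of codons"
    else:
        status = f"{leftover} leftover base(s), not a multiple of 3"
    print(f"{name}: {length} bp = {codons} codons, {status}")
```

The check says nothing about which length corresponds to the functional gene; it only flags lengths that cannot be read as complete codons.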

Discussion
Reconstruction of the tryptophan operon of different B. anthracis strains showed that its structure differs between strains of the main lineages (Fig 3). In strains of lineage B, a mutation in the trpA gene turns it into a pseudogene, so the last step of the tryptophan biosynthesis pathway should be blocked because the tryptophan synthase alpha subunit is absent. This may explain the dependence on tryptophan in strains of main lineage B. The high-resolution reference phylogeny, based on 11989 SNPs from the genomes of 193 strains of the global collection, shows that after lineage C separated from A/B, lineage C diverged into subclusters, and then lineages A and B separated [12].
Clade A divides into four main monophyletic subclades. The "Ancient A" clade formed earlier than the other subclades and is the base for the other subclades of this lineage. The basal subclade of clade B may be subclade B.Br.003, including subclade B.Br.004 with strains from Europe, formed at about the same time as subclade A.Br.002. Other subclades of lineage B include strain HYU01 from South Korea (subclade B.Br.002), which appeared later, and finally the strains of clade B.Br.008, isolated in South Africa and Sweden [12].
According to our data, subclade B.Br.002 contains, along with the isolate from Korea, strains isolated in Western Siberia (a separate cluster, "Siberia") and Finland [13], although in an earlier work the strain from Finland was described as constituting a separate clade of line B.Br.002, with strains HYU01 from South Korea and BF1 from Germany as its nearest neighbors [14].
Based on these data, it can be assumed that clade C, the oldest and the one with the fewest isolates, has become a dead-end branch of evolution that has not spread outside the United States. Clades A and B, evolving independently, spread to varying degrees in different geographical areas, and there are local regions where strains of clades A and B coexist, for example Kruger Park in South Africa and probably certain regions of the Russian Federation (Republic of Dagestan, Western Siberia). The fact that only 5 isolates of B. anthracis lineage C have been recovered, all in North America, together with the limited number of lineage B strains and the wide distribution of lineage A strains, suggests ecological advantages of the latter, which may also be associated with differences in the functioning of the tryptophan operon and possibly other operons.
In strains of Bacillus cereus, the structure of the tryptophan operon does not differ from that in B. anthracis, but the genes and corresponding proteins are mainly specific to this species, although some proteins are identical in these two species and Bacillus thuringiensis.

Conclusion
The general structure of the B. anthracis trp operon is conserved and is characterized by the presence of 7 partially overlapping genes. We have shown differences in the gene sequences and proteins of the biosynthetic pathway between the main lineages of the anthrax pathogen. According to the nature of the single nucleotide polymorphisms and InDels in the trpA and trpD genes, the studied strains fall into two groups: one includes strains of main lineages A and C, and the other includes strains of lineage B. A mutation in the trpA gene, which encodes the tryptophan synthase alpha subunit, turns it into a pseudogene, so the last step of the tryptophan biosynthesis pathway should be blocked, which may explain the dependence on tryptophan found in several B. anthracis strains of main lineage B. It remains unknown whether tryptophan dependence is inherent in all strains of this lineage. The presence of the trpG pseudogene in strains of main lineage C, and the resulting inability to synthesize anthranilate synthase component II, can probably be compensated for by the glutamine amidotransferase activity expressed from the functional pabA gene outside the tryptophan operon.
The worldwide distribution of lineage A strains suggests ecological advantages, which may be associated in particular with the full functioning of the tryptophan operon.
The revealed features call for further study of tryptophan dependence in B. anthracis strains of main lineage B and may be of interest from the point of view of the intraspecific evolution of the anthrax causal agent.
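The division of strains into groups by their SNP and InDel profiles can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to a reference (with "-" marking gaps) and is not the workflow used in this study. The function names list_variants and group_by_profile, the strain labels, and the toy sequences are all hypothetical and do not represent real trpA or trpD data.

```python
from collections import defaultdict


def list_variants(ref_aln, qry_aln):
    """List differences between two pre-aligned sequences of equal length.

    Returns tuples (ref_position, kind, ref_bases, query_bases), where
    ref_position is 1-based in ungapped reference coordinates. For an
    insertion it is the reference base immediately before the inserted bases.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], qry_aln[i]
        if r == "-" and q == "-":  # column that is a gap in both rows
            i += 1
        elif r == "-":  # bases present only in the query: insertion
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            inserted = qry_aln[i:j].replace("-", "")
            if inserted:
                variants.append((ref_pos, "insertion", "", inserted))
            i = j
        elif q == "-":  # bases present only in the reference: deletion
            j = i
            while j < n and qry_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            deleted = ref_aln[i:j]
            variants.append((ref_pos + 1, "deletion", deleted, ""))
            ref_pos += len(deleted)
            i = j
        else:
            ref_pos += 1
            if r != q:
                variants.append((ref_pos, "SNP", r, q))
            i += 1
    return variants


def group_by_profile(ref_aln, aligned_strains):
    """Group strain names that share an identical variant profile."""
    groups = defaultdict(list)
    for name, seq in aligned_strains.items():
        groups[tuple(list_variants(ref_aln, seq))].append(name)
    return dict(groups)


# Toy, made-up sequences for illustration only.
reference = "ATGGCTAAAGGT"
strains = {
    "strain_1": "ATGGCTAAAGGT",  # identical to the reference
    "strain_2": "ATGACT-AAGGT",  # one SNP and a 1-bp deletion
    "strain_3": "ATGACT-AAGGT",  # same profile as strain_2
}
for profile, names in group_by_profile(reference, strains).items():
    print(names, list(profile))
```

Strains that share the same tuple of variants end up in the same group, which mirrors the idea of separating strains by their shared polymorphisms without making any claim about the actual variants found.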

Bacterial strains
In our study we used 112 genomes of B. anthracis strains. Of these, 44 strains, sequenced in our study, are from the State Collection of Pathogenic Microorganisms of the Stavropol Research Anti-Plague Institute (Table 1), and 68 genomic sequences of B. anthracis strains are from GenBank (Additional file 1: Table S1).

Growth of B. anthracis and extraction of DNA
B. anthracis strains were cultivated on blood agar and then inactivated, and DNA was extracted with the QIAamp DNA Mini Kit (Qiagen, Germany) according to the manufacturer's protocol and the biological safety rules for work with pathogens of the third group of pathogenicity. DNA concentration was quantified using the dsDNA HS Qubit assay kit (Thermo Fisher Scientific, USA) according to the manufacturer's protocol. DNA preparations were stored at −20 °C until further use.

Whole genome sequencing
Genomic libraries with a 400 bp read length were prepared using the Ion Xpress Plus Fragment Library Kit (Life Technologies, USA) in accordance with the manufacturer's protocol. Monoclonal amplification on microspheres was performed using Ion PGM Hi-Q View OT2 Kit reagents (Life Technologies, USA). Genome sequencing was performed on an Ion Torrent PGM sequencer with Ion 316 Chip Kit V2 chips (Life Technologies, USA).

Bioinformatics analysis
We searched the genomes for mutations in silico using CLC Sequence Viewer 6 [15] and MEGA V.10.0.5 [16], with the genome of the Bacillus anthracis Ames Ancestor strain as the reference and GenBank data on the genes and enzymes of the trp operon. Phylogenetic analysis was performed by the Maximum Likelihood method (1000 bootstrap replicates) in MEGA V. 10