Genetic diversity and characteristics of the gyrA gene in Neisseria spp.

Abstract

Background
Neisseria meningitidis strains belonging to clonal complex 4821 (CC4821) showed a high rate of resistance to quinolones. The aim of this study was to assess whether the DNA gyrase A gene (gyrA) from N. meningitidis CC4821 strains collected in China featured any specific characteristics compared to other Neisseria species. Two hundred fifty-two gyrase gene sequences were analyzed, 77 of which were generated in this study from N. meningitidis CC4821 strains collected in China between 1978 and 2016.

Results
The quinolone resistance-related gene, coding for the DNA gyrase subunit A (GyrA) protein, was highly divergent within N. meningitidis strains, whereas its N. gonorrhoeae and N. lactamica counterparts appeared well conserved. Only one position, 91 (83 in the E. coli gyrA gene), was linked to quinolone resistance, all resistant strains featuring the substitution T91I. Position 95, which corresponds to E. coli position 87 (mutated in quinolone-resistant E. coli strains), was also divergent in some resistant Neisseria strains. Moreover, twenty-eight additional putative resistance markers were identified.
Finally, putative recombination events between N. meningitidis and either N. subflava, N. lactamica or N. cinerea, as well as between N. meningitidis strains, were reported.

Conclusions
Analyzing the evolution of the gyrA gene within Neisseria spp. is critical to monitor the quinolone resistance phenotype and the acquisition of new resistance markers. Such studies are necessary for the control of meningococcal disease and the development of new drugs targeting DNA gyrase.

Background
Meningitis is an inflammation of the protective membranes covering the brain and the spinal cord (https://www.cdc.gov/meningococcal). The disease has multiple causes, including bacteria, viruses, fungi and parasites, as well as non-infectious conditions such as lupus. Bacterial meningitis can be caused by several species of bacteria, including Streptococcus pneumoniae and Neisseria meningitidis (N. meningitidis). In addition to N. meningitidis, the genus Neisseria contains at least 30 distinct species infecting humans, other mammals and even insects (1). N. meningitidis strains are classified through several different schemes, based on serological tests (serogroup) or genetic tests (sequence type) (2, 3). Among the 12 described serogroups, which are based on the structure of the capsule polysaccharide (cps), 6 serogroups (A, B, C, X, Y and W) caused the majority of invasive meningococcal disease (IMD) globally (2). Strains that reacted with either more than one or no serogroup-specific antiserum were considered non-groupable (NG) (4). Strains lacking the operons required for the synthesis, the lipid modification and/or the transport of the capsule were identified as capsule null locus (cnl) (5). In addition, strains can be grouped into different sequence types (STs) based on the multilocus sequence typing (MLST) method applied to 7 genes (abcZ, adk, aroE, fumC, gdh, pdhC, pgm) (3). Two STs differ in the nucleotide sequence of at least one of the 7 reference genes, and each ST is identified by a number for convenience. So far, 14556 STs have been described. Furthermore, strains whose STs share 4 or more of the 7 alleles can be classified into the same clonal complex (CC) (3). So far, 9248 CCs have been described (https://pubmlst.org/neisseria/). Strains that could not be classified into an existing CC were called unassigned (UA).
Until 2003, N. meningitidis serogroup A strains of either CC1 or CC5 were responsible for the majority of IMD cases in China (6). In 2003, an outbreak of serogroup C meningococcal disease caused by the new lineage CC4821 was reported in Anhui province of China. This new hypervirulent clonal lineage did not belong to any of the previously reported sequence types and had not been reported in any other country except for one report from Japan (7, 8). Subsequently, CC4821 serogroup C became one of the leading lineages across China (6). Later on, CC4821 became a dominant lineage among serogroup B strains since they were first identified in 2005. In contrast to serogroup C strains, serogroup B strains were usually associated with sporadic infections (6). Our further analyses of historic isolate collections showed that CC4821 strains, which included serogroup B and C strains, were isolated as early as 1978 and were mostly associated with asymptomatic carriage (9).
Two main strategies have been developed to control meningococcal disease. Vaccines specific to multiple serogroups have been generated and are used globally (10). As for other bacterial infections, antibiotics have also been used to control the infection. The most common, quinolone and its derivatives, target DNA gyrase subunit A, which is essential for DNA replication (11, 12). The interaction of quinolones with DNA gyrase has been well studied thanks to the 3D structure of the E. coli protein (13). Mutations at critical sites lead to resistance, and these sites are located in the so-called Quinolone Resistance-Determining Region (QRDR) (14).
The aim of this study was to assess whether the gyrase A gene from CC4821 strains from China featured any specific characteristics. Phylogenetic analysis was performed in the context of other Neisseria species. In addition, genetic markers such as ST, CC or serogroup, resistance markers, as well as potential recombination between strains were studied.

Results
The aim of this study was to identify potential sites in GyrA protein linked to a specific character. In order to perform this analysis, it was necessary first to carefully analyze the dataset in order to identify any potential characteristics either genetic, temporal or geographical that could be linked to any GyrA amino acid changes.

Analysis of the dataset
The 252 gyrA sequences analyzed in this report came from 15 Neisseria species, N. meningitidis, N. gonorrhoeae and N. lactamica being the most represented (Additional Figure 1a). Ten of these species have been reported to infect humans, N. meningitidis strains as well as N. gonorrhoeae strains being pathogenic for humans. This study focused on N. meningitidis and therefore 87% of sequences was represented by 43 strains (Additional Figure 2a). ST7 and ST5, which belong to CC5, were also well represented with 28 and 11 strains respectively (Additional Figure 2b). Eleven serogroups of N. meningitidis were analyzed; B and C were represented by 60 strains, A by 47 and W by 12 (Additional Figure 2d). Based on STs, 19 clonal complexes were represented in the dataset. The largest was CC4821 (95/252), followed by CC5 (41/252) and unassigned (UA) (Additional Table 1).

Evolutionary analysis of 252 gyrA nucleotide sequences
A neighbor-joining phylogenetic tree was constructed with 252 gyrA nucleotide sequences (Figure 1).
The nucleotide sequences were significantly divergent with an overall p-distance of 0.052 (Table 1).
An overview of the tree showed that the sequences from N. meningitidis were found across the tree, demonstrating that the gyrA gene was relatively divergent within that species. Moreover, no trend was found based on collection location, time or even genetic characteristics like CC or serogroup. For example, the sequences of CC4821 strains (in red in Figure 1) were found across the tree. Similarly, sequences of CC5 strains (in blue in Figure 1) were found in two lineages, suggesting that gyrA gene evolution was independent of the CC5 typing. Overall, these observations suggested that the evolution of gyrA sequences was independent of location, time as well as genetic characteristics of the bacterial strains.
Among the 252 analyzed sequences, 48% (121) were identical (Table 1). We chose to keep these sequences despite their identity in order to preserve valuable information in terms of geography, time and genetic characteristics. Nearly 60% of the sequences (151) were found at the top of the tree, with no significant bootstrap value (Figure 1). These sequences were highly homogeneous, with a p-distance of 0.003 (Table 1). The remaining 101 sequences were more divergent, with a p-distance of 0.064 relative to the sequences grouped at the top of the tree. Most of the nodes concerning these 101 sequences featured a bootstrap value > 70%. Moreover, the p-distance within this group of 101 sequences was 0.083, demonstrating that these sequences were highly divergent from each other.
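The within-group and between-group p-distances quoted above are averages of pairwise proportions of differing sites. The following is a minimal sketch of that calculation on toy sequences, not the MEGA implementation; the function names, the toy data and the choice to skip gapped sites pair by pair are all assumptions made for illustration.

```python
from itertools import combinations, product


def p_distance(a, b, gap="-"):
    """Proportion of compared sites at which two aligned sequences differ.

    Sites where either sequence has a gap are skipped (an assumption;
    other gap treatments are possible).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_within(group):
    """Average p-distance over all pairs of sequences inside one group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group_a, group_b):
    """Average p-distance over all pairs taken across two groups."""
    pairs = list(product(group_a, group_b))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical data, not gyrA sequences).
conserved = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
divergent = ["ACGAACGTTC", "TCGAACG-TC"]

print(round(mean_within(conserved), 3))
print(round(mean_between(conserved, divergent), 3))
```

With real data, the two groups would be the aligned sequences at the top of the tree and the remaining, more divergent sequences.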
As major nodes of the tree featured a bootstrap value > 80%, we decided to arbitrarily assign sequences to 9 different genetic groups (Figure 1; Table 1).
Fifteen sequences were considered outliers. Even though they were from 2 genetically distinct groups at the bottom of the nucleotide tree, these sequences shared a common node with a 90% bootstrap value, suggesting that they were divergent compared to the rest of the sequences. The lowest p-distance between the outlier group and the other 8 groups was 0.16. These sequences were from 11 species that were not found elsewhere in the tree. This might be the consequence of a sampling bias. However, as these sequences shared 8 amino acid changes, it strongly suggested that these sequences were related and significantly divergent compared to the rest of the analyzed sequences (Table 1 China-2009-R. However, the N. cinerea sequence featured a long branch, suggesting a significant divergence compared to the N. meningitidis sequence. Twenty-five sequences, mainly from strains of ST7 and CC5, shared a common node with a bootstrap of 99% and a long branch. Finally, based on the available data, the sequences from group 6 appeared to be from strains that were resistant to quinolone. However, no amino acid substitution was shared by these strains, suggesting that there was no common marker for the resistance phenotype (Additional Table 2). Group 7 was particular as it exclusively concerned sequences from N. gonorrhoeae. These sequences were highly homogeneous, with a p-distance of 0.002, and they shared a node with a bootstrap of 99% and a long branch. Moreover, these sequences shared 6 amino acid changes (Table 1, Additional Figure 3).
Interestingly, the 12 analyzed sequences were from strains that were either susceptible or resistant to quinolone, suggesting that this divergence was not related to a resistance phenotype. Group 8 consisted of most of the N. meningitidis sequences analyzed in this report. The strains were collected over the last 90 years in 13 different countries on 4 continents. Despite the significant time span and geographic spread, these sequences were highly conserved, with a p-distance of 0.003. Moreover, these sequences were from strains of 72 STs, 24 CCs and 10 serogroups, including the reference strain 053442. Overall, this showed that the gyrA gene was highly conserved among most N. meningitidis strains despite different genetic characteristics, geographic locations or collection times.

Analysis of the divergence within the GyrA protein
The amino acid divergence within the GyrA protein was analyzed among 131 unique sequences (Additional Table 2). Two hundred fifty-seven divergent positions were identified among the 931 amino acids featured in the alignment (Figure 2). Even though these sites were found across the protein, the distribution of the divergence did not appear to be random. Two regions were highly conserved: one from positions 530 to 620, and a smaller one between positions 300 and 330. According to the protein from E. coli, the first region corresponds to the end of the amino-terminal domain and the beginning of the carboxy-terminal domain. The second region corresponds to the tower domain of the protein based on the 3D model structure. Among the 257 divergent positions, 4 sites were highly divergent, namely 91, 417, 665 and 210. For example, more than 57% of the sequences featured a mutated residue at position 91 (Figure 2). Whereas 42% of the sequences had a T, 38% had an I, 14.7% had an S and the remaining sequences had either an F or a V (Additional Table 2). Overall, the divergence among the 257 variable sites was shared by 2-3% of the sequences (Additional Figure 4a).
However, 52 divergent positions were only found in one sequence. These variable residues might be due to sequencing artifacts. Because these residues were not shared by other sequences, this meant that the variability at these positions had not been selected in the evolutionary process. Seventy-two positions were shared by at least 10% of the sequences (14 out of 131), position 91 being the most variable as 75 sequences (57%) featured an amino acid substitution (Additional Figure 4b). Eighty-nine positions featured multiple amino acid residues (Additional Figure 4c-d). For example, position 441 featured 6 amino acid residues (R, H, T, Q, G and N) in addition to the main residue E (Additional Table 2). This suggested that these positions were not only highly variable but could also accommodate different residues without affecting the protein function. The alignment of 131 sequences featured a few gaps, mainly located between positions 720-760 and at the C terminus of the protein sequence.
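The per-position counts discussed above (how many sequences diverge at a site, and how many distinct residues a site tolerates) can be tabulated directly from an alignment. The sketch below is illustrative only: the function name, the toy alignment and the use of the most frequent residue as the "main" residue are assumptions, and the analysis in this report was carried out in Mega and Excel.

```python
from collections import Counter


def site_variability(alignment, gap="-"):
    """Summarize variable columns of a protein alignment.

    Returns {position (1-based): summary} for every column holding at
    least two different residues. Gaps are ignored when counting.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    report = {}
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment if seq[i] != gap)
        if len(counts) < 2:
            continue  # conserved (or fully gapped) column
        major, major_n = counts.most_common(1)[0]
        total = sum(counts.values())
        report[i + 1] = {
            "major": major,                                  # most frequent residue
            "variant_fraction": (total - major_n) / total,   # share differing from it
            "residues": dict(counts),                        # all residues observed
        }
    return report


# Toy alignment (hypothetical, three columns only).
toy = ["MTA", "MIA", "MTA", "MSG", "M-A"]
for position, summary in site_variability(toy).items():
    print(position, summary)
```

Columns seen in a single sequence only would show up here with a variant fraction of 1/N, which makes it easy to separate them from widely shared changes.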
The 257 divergent sites were analyzed among the complete dataset of 252 sequences in order to identify any potential correlation between divergence and geographic, species or genetic characteristics (Additional Table 2). No correlation was found between divergence and geographic location. Moreover, no temporal pattern was identified. However, a correlation between diversity and species was found for N. gonorrhoeae and N. lactamica (Table 2). Indeed, 6 changes were only found in N. gonorrhoeae: D210K, E456K, E483K, V486I, N836S and D917G. Similarly, D907E was only found in N. lactamica. In contrast, no residue was specific to N. meningitidis.
As the dataset was biased towards N. meningitidis, one explanation could be poor sampling. Only 12 N. gonorrhoeae sequences were analyzed compared to 120 N. meningitidis sequences. However, the fact that the 6 residues were found in N. gonorrhoeae strains collected in different countries and at different times suggested that these residues were indeed species specific. The search for a link between diversity and genetic markers like ST, CC or serogroup was unsuccessful, suggesting that gyrA evolution was independent of the evolution of the rest of the genome. Finally, one position (91) appeared to be linked to quinolone resistance. All the resistant strains had an I or an F, whereas all the sensitive strains had an S or a T. V was found in strains with an undetermined resistance phenotype. Interestingly, N. gonorrhoeae appeared to have different markers, F for resistance and S for sensitivity.

Identification of potential resistance markers to quinolone
Among the 252 analyzed sequences, 61 were from strains tested for resistance to quinolone (Additional Table 1). These strains were further analyzed in order to identify additional potential markers for resistance. A change that would be found in resistant strains but not found in any sensitive strains would qualify. Thirty sites were identified (Additional Table 3). For example, H8N was found in 29 resistant strains but not featured in any sensitive strains. All the strains were analyzed for these 30 positions. Fifty-one different profiles had been identified, meaning that there were 51 combinations of these markers among all the analyzed strains (Table 3).
Among the 30 potential markers, positions 91 and 103 were the most shared in the profiles. Some amino acid substitutions were only found in N. gonorrhoeae strains, like D95G or D95A. Finally, 24 of the 51 profiles concerned strains that were known to be resistant to quinolone. As resistance markers were initially described in E. coli, a comparison between the gyrA sequences of E. coli and the reference strain N. meningitidis 053442 was necessary in order to check the position of these markers in the E. coli sequence (Additional Table 4).
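The screening rule used in this section (a substitution qualifies if it occurs in at least one resistant strain and in no sensitive strain) can be sketched as follows. The function name, the "R"/"S" phenotype codes and the four-residue toy sequences are hypothetical; positions in the toy data do not correspond to GyrA numbering.

```python
from collections import Counter


def candidate_markers(reference, strains, gap="-"):
    """Find substitutions seen in resistant strains but in no sensitive strain.

    reference: aligned reference protein sequence.
    strains:   list of (aligned sequence, phenotype), phenotype "R" or "S";
               any other phenotype value is ignored (untested strains).
    Returns {substitution label: number of resistant strains carrying it}.
    """
    in_resistant = Counter()
    in_sensitive = set()
    for sequence, phenotype in strains:
        for position, (ref, res) in enumerate(zip(reference, sequence), start=1):
            if res == ref or gap in (ref, res):
                continue
            label = f"{ref}{position}{res}"  # e.g. "T3I" in the toy data
            if phenotype == "R":
                in_resistant[label] += 1
            elif phenotype == "S":
                in_sensitive.add(label)
    return {lab: n for lab, n in in_resistant.items() if lab not in in_sensitive}


# Toy data (hypothetical sequences and phenotypes).
reference = "MHTD"
strains = [
    ("MNID", "R"),
    ("MNID", "R"),
    ("MHTD", "S"),
    ("MHTG", "S"),  # D4G occurs in a sensitive strain, so it is excluded
    ("MHIG", "R"),
]
print(candidate_markers(reference, strains))
```

A substitution passing this filter is only a candidate: the rule says nothing about whether the change contributes to the resistance phenotype.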

Recombination within the gyrA gene between Neisseria spp.
Among the 9 groups identified during the phylogenetic analysis, group 1 was of particular interest, Furthermore, five of these changes were seen within 30 amino acids (Table 1). Finally, amino acid changes observed in one strain were not seen in other strains, like positions 740 and 750. All these observations suggested a recombination between these 2 strains, which was confirmed by a bootscan analysis (Figure 3). Other potential recombination events, with either N. cinerea or N. lactamica strains and between N. meningitidis strains, were also identified.

Discussion
The genus Neisseria has been found in more than 40 animal hosts, mostly mammals but also insects (1). Nucleotide sequences for more than 30 different Neisseria species can be found in GenBank. The present study featured sequences from 14 of these species, even though the dataset was highly biased towards N. meningitidis. Eleven species can be found in humans, namely meningitidis, gonorrhoeae, lactamica, sicca, subflava, cinerea, elongata, flavescens, mucosa, pharyngis and polysaccharea (15,16). Nine of these species were featured in the present analysis.
The phylogenetic analysis of the gyrA gene presented in this report showed that N. meningitidis sequences were closely related to N. gonorrhoeae and N. lactamica sequences. To our knowledge, such a phylogenetic analysis has never been performed before. However, several genes like argF, recA and rho, as well as 16S rRNA, have been used for evolutionary analyses (17,18). The genetic relationship between the 3 species meningitidis, gonorrhoeae and lactamica was confirmed for the 3 genes argF, recA and rho. A genetic relationship between the 3 species was also confirmed at the genome level.
However, lactamica appeared more divergent than meningitidis and gonorrhoeae (18). The present study featured 11 other species of Neisseria. However, as they were only represented by 1 or 2 sequences, it is likely that the poor sampling would compromise the reliability of the analysis. It is nevertheless striking that 9 of these species shared 8 amino acid substitutions within the GyrA protein.
The phylogenetic analysis of 218 gyrA sequences from N. meningitidis strains showed that gyrA was highly divergent within the meningitidis strains. To our knowledge, only one other report featured a phylogenetic analysis of the gyrA gene from N. meningitidis, but it did not include other species and therefore a comparison with other species was not possible (11). Whereas the gyrA gene of meningitidis showed high divergence between the analyzed strains, the gyrA gene of gonorrhoeae featured a remarkable conservation, with 6 amino acid substitutions among the 12 analyzed sequences. A few hypotheses could be proposed to explain such a discrepancy between species. First, a sampling artifact is possible. Only 12 sequences were analyzed, whereas around 300 protein sequences are listed in GenBank. To our knowledge, no systematic phylogenetic analysis has been performed on the full-length gyrA gene in N. gonorrhoeae. We briefly attempted to analyze the sequences available in the public domain and did not find any significant divergence among N. gonorrhoeae sequences, thus confirming the present study. Another hypothesis could be that gonorrhoeae gyrase A might feature additional properties that are not featured by meningitidis gyrase A. Gonorrhoeae strains do not have the same tropism as meningitidis strains and therefore additional properties could be possible. Further studies would be necessary to assess the role of the 6 amino acid substitutions identified in the present study, namely D210K, E456K, E483K, V486I, N836S and D917G. These substitutions are shared by resistant (5 strains) as well as sensitive (4 strains) strains and are therefore very unlikely to be linked to a resistance phenotype.
Gene transfer or recombination between bacterial genomes has been well described (19,20). The present study reported potential recombination events between Neisseria species. Recombination would mean that the bacteria have been replicating at the same time and in the same location. Among the reported recombination events, a recombination within the gyrA gene between meningitidis and subflava was identified. Unfortunately, the metadata concerning the subflava strain were very limited; only the collection year was known after contacting the scientists who reported the genome sequence in GenBank. Knowing the country of collection would help to understand a potential recombination with N. meningitidis collected in China. Recombination between Neisseria species has been previously reported. For example, Wu et al. reported a likely recombination between lactamica and meningitidis (19). Recombination between species as well as within species is likely to allow the strains to acquire new resistance phenotypes; however, further studies would be necessary to better assess the acquisition of the resistance phenotype.
The original aim of this study was to identify potential markers in the gyrA gene that could be linked to geographic, temporal or genetic characteristics. Only two criteria were found to be linked to gyrA mutations. One was the species; as discussed above, gonorrhoeae featured 6 unique amino acid substitutions. The other criterion was position 91, which was exclusively an I or an F in quinolone-resistant strains and an S or a T in quinolone-sensitive strains. The mutation at position 91 was already well described (19, 21-23). As quinolone resistance of the gyrase A protein was originally described in E. coli, we felt that all positions related to quinolone resistance should be identified based on the E. coli sequence (Yoshida et al., 1990). Resistant markers were originally described as the and 114. The alignment generated in the present study was in agreement with a previous report (24).
A recent study identified 2 additional amino acid substitutions located outside the QRDR sensu stricto, namely A51V and A119E. These 2 positions were however conserved in Neisseria spp. The discrepancy between E. coli and Neisseria suggests some difference in quinolone resistance.
Unfortunately, to our knowledge, a comparison between E.coli and Neisseria resistance has not been performed and further studies would be necessary to assess whether Neisseria spp. are more or less resistant to quinolone than E.coli. The present study identified 28 positions that might be involved in a resistance phenotype. These positions are characterized by a substitution only found in resistant strains. Further analysis would be necessary to assess whether these substitutions have an effect on the resistance phenotype.

Conclusions
In summary, a phylogenetic analysis of gyrA gene from N.meningitidis in the context of other Neisseria species revealed the following points: gyrA evolution seems species specific for gonorrhoeae and lactamica. In contrast, the evolution of gyrA within meningitidis strains appears more divergent. Several potential recombination events have been identified that could explain why the meningitidis gyrA gene is less conserved. The consequence of recombination between species or within species remains to be studied. Only one position (91) appears to be linked to a quinolone

Bacterial Strains
More than 4,000 N. meningitidis isolates have been collected throughout China from IMD patients as well as close contacts and asymptomatic carriers by our laboratory since the 1960s. The bacterial strains were propagated on a single Petri dish containing Difco™ Columbia Blood Agar Base with 5% sheep blood in a 5% CO2 atmosphere at 37 °C for 18 h. Single colonies were lysed and tested by PCR for the meningococcal-specific contact-regulated gene A (crgA) in order to identify the bacterial species (25). (11,26). Seventy-seven complete gene sequences were generated and submitted to GenBank (MK930374-MK930450) (listed in Additional Table 1).

The gyrA gene from the reference strain N.meningitidis 053442 (CP000381.1) was used to query the GenBank database and 556 additional gyrA gene sequences were selected from the BLAST output.
Identical sequences were deleted unless the strains did not share the same CC, ST or serogroup.
Sequences less than 90% of full length (2905 nt) were discarded. The remaining 175 sequences were combined with the 77 sequences generated in this study.
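The two filtering rules described above can be sketched as below. This is one possible reading, stated as an assumption: a sequence is dropped as a duplicate only when an identical sequence with the same CC, ST and serogroup has already been kept, and 2905 nt is treated as the full length against which the 90% threshold is applied. The record type and field names are hypothetical.

```python
from dataclasses import dataclass

FULL_LENGTH = 2905   # nt, value quoted in the text (assumed to be the full length)
MIN_FRACTION = 0.90  # sequences shorter than this fraction are discarded


@dataclass(frozen=True)
class Record:
    name: str
    sequence: str
    cc: str
    st: str
    serogroup: str


def filter_records(records, full_length=FULL_LENGTH, min_fraction=MIN_FRACTION):
    """Drop short sequences and exact duplicates that share CC, ST and serogroup."""
    kept = []
    seen = set()
    for rec in records:
        if len(rec.sequence) < min_fraction * full_length:
            continue  # too short to be considered near full length
        key = (rec.sequence, rec.cc, rec.st, rec.serogroup)
        if key in seen:
            continue  # identical sequence with identical typing: redundant
        seen.add(key)
        kept.append(rec)
    return kept


# Toy records (hypothetical); a full length of 10 is used for readability.
toy = [
    Record("a", "ACGTACGTAC", "CC4821", "ST4821", "C"),
    Record("b", "ACGTACGTAC", "CC4821", "ST4821", "C"),  # duplicate of a
    Record("c", "ACGTACGTAC", "CC4821", "ST4821", "B"),  # same sequence, other serogroup
    Record("d", "ACGT", "CC5", "ST7", "A"),              # too short
]
print([rec.name for rec in filter_records(toy, full_length=10)])
```

Keeping identical sequences whose typing differs preserves the metadata needed later to compare divergence with CC, ST and serogroup.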

Sequence analysis
Two hundred fifty-two gyrA nucleotide sequences were aligned using MEGA 6, and a neighbor-joining phylogenetic tree was generated using the maximum composite likelihood nucleotide model (27,28).
Phylogenetic inference was tested with 1000 bootstraps (29). Nodes with bootstrap value > 70% were indicated. Average genetic p-distance within genetic groups as well as between genetic groups and pairwise distances were computed in Mega. Amino acid divergence was analyzed in Mega and processed in Excel. Potential recombination events were analyzed with the SimPlot software (30 Table   4).
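To illustrate the idea behind a recombination scan, the sketch below computes sliding-window identity between a query and candidate parental sequences and reports the closest parent in each window. It is a simplified similarity scan, not the bootscan method and not the SimPlot implementation; the function names, the window and step defaults and the toy sequences are assumptions.

```python
def window_identity(query, reference, window=200, step=20, gap="-"):
    """Identity between two aligned sequences in sliding windows.

    Returns a list of (window midpoint, fraction of identical sites);
    sites with a gap in either sequence are skipped.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        compared = same = 0
        for x, y in zip(query[start:start + window], reference[start:start + window]):
            if x == gap or y == gap:
                continue
            compared += 1
            same += x == y
        if compared:
            points.append((start + window // 2, same / compared))
    return points


def closest_parent(query, parents, window=200, step=20):
    """For each window, name the candidate parent most similar to the query."""
    profiles = {
        name: dict(window_identity(query, seq, window, step))
        for name, seq in parents.items()
    }
    midpoints = sorted(set.intersection(*(set(p) for p in profiles.values())))
    return [(mid, max(profiles, key=lambda n: profiles[n][mid])) for mid in midpoints]


# Toy example (hypothetical): the query matches parent "a" in its first
# half and parent "b" in its second half, as a recombinant would.
parents = {"a": "A" * 40, "b": "C" * 40}
query = "A" * 20 + "C" * 20
print(closest_parent(query, parents, window=10, step=10))
```

A switch in the closest parent along the sequence is the kind of signal that a dedicated tool then evaluates with bootstrap support.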