Haemophilus influenzae one day in Denmark: prevalence, circulating clones, and dismal resistance to aminopenicillins

Haemophilus influenzae is a common cause of mucosal infections that warrants accurate surveillance. We aimed to assess the prevalence of the species in clinical specimens, and characterise population structure and resistance to aminopenicillins by whole genome sequencing.We assessed the point prevalence by entering the database records of 1 day in Denmark and examined the genome sequences of nationwide, collected isolates from the same day. The prevalence of H. influenzae in clinical samples on the 10th of January 2018 was 1.78 per 100,000 person-days (all samples), and 2.47 per 1000 hospital bed-days (hospital samples). Of 2009 bacteria deemed clinically relevant and collected in a concerted action by the Danish departments of clinical microbiology, 62 (3.1%) were H. influenzae. All 62 isolates belonged to phylogenetic group I and were unencapsulated. Three strains from separate Danish regions had identical core genome sequences, but a small number of intergenic mutations testified to circulating clones, rather than individual cases of patient-to-patient transmission. The TEM-1 β-lactamase gene was present in 24 strains, while 13 strains were genetically categorised as ampicillin-resistant due to substitutions in penicillin-binding protein 3; shared patterns of amino acid substitutions in unrelated strains indicated putative lateral transfer of chromosomal resistance. Circulating clones of H. influenzae are frequent, and host factors, rather than direct transmission of epidemic strains, may be the primary cause of infection. The bleak presence of ampicillin resistance revealed by sequencing of point prevalence strains underscores the necessity for close examination of testing methods.


Introduction
Haemophilus influenzae is a small Gram-negative bacterium that is part of the normal microbiota of the human upper respiratory tract. H. influenzae can cause pneumonia, otitis media, sinusitis, and conjunctivitis, and, more rarely, invasive disease [1][2][3]. H. influenzae type b (Hib) conjugate vaccine effectively prevents childhood meningitis and other invasive infections caused by serotype b capsulate strains, and replacement with other capsulate types has been limited [3][4][5]. However, unencapsulated H. influenzae is a major cause of mucosal infections [6] and the global burden is estimated to represent 55-95% of cases of otitis media in children, 44-68% of cases of childhood conjunctivitis, and more than 90% of acute exacerbation of chronic obstructive pulmonary disease [1 and references therein].
H. influenzae with resistance to ampicillin and amoxicillin was reported in the 1970s [7,8]. Two distinct β-lactamases have been identified, ROB-1 and TEM-1, which confer highlevel resistance to ampicillin and other aminopenicillins, but are unable to degrade cephalosporins at a significant rate [9]. Extended-spectrum β-lactamases have not been documented for H. influenzae. A major concern relates to amino acid substitutions in the transpeptidase region of penicillin-binding protein (PBP) 3. By targeted gene disruption, PBP3 was found to be the sole indispensable PBP of the species [10]. β-Lactamase-negative ampicillin-resistant (BLNAR) strains with mutational resistance are usually grouped according to decisive substitutions and antimicrobial spectrum, where groups I and II confer resistance to aminopenicillins only, while the target range of group III includes cephalosporins [11,12]. Resistance to third-generation cephalosporins in H. influenzae has only slowly emerged in Europe [12][13][14], while it is prevalent in Southeast Asia [15,16]. Restricted use of oral cephalosporins probably constrains the emergence of resistance in particular global areas, but development of broad antimicrobial resistance in H. influenzae may not be preventable. In 2017, the World Health Organization published a list of antibiotic-resistant pathogens for which new and effective antibiotics are urgently needed [17]. Ampicillin-resistant H. influenzae was included in the same priority tier as penicillin-resistant Streptococcus pneumoniae.
Previously, we described a nationwide point-prevalence survey of human bacterial pathogens in Denmark. On the 10th of January 2018, 2009 bacterial isolates deemed clinically relevant were collected in a joint operation by all departments of clinical microbiology (DCMs) [18]. Sixty-two (3.1%) of the collected bacteria were H. influenzae. Here, we describe the epidemiology, genome characteristics, and β-lactam resistance of H. influenzae identified in clinical specimens on a single day in Denmark.

Specimen reports, collection, and sequencing of strains
The Danish microbiology database, MiBa, receives copies of specimen reports from all Danish DCMs [19]. We extracted reports of H. influenzae released on the 10th of January 2018 with information of sample type, reported antimicrobial resistance, affiliation of requisitioning physicians (hospital department or general practise), and reporting DCM. Interpretative criteria and antimicrobial susceptibility testing (AST) by disk diffusion were performed according to EUCAST (https:// www.eucast.org/), with some exceptions (see below). Original sample number and person identifiable data were blinded to the study.
Collection of clinically relevant bacteria from all Danish DCMs on the 10th of January 2018 (N=2009), genome sequencing using the Illumina NextSeq 500 platform, and distribution of species have been described [18]. The definition of "clinical relevance" was at the discretion of the responsible DCM; however, the general criterion for inclusion was the report of identified bacterial species plus assessment of antimicrobial resistance. Original sample number and person identifiable data were blinded to the study.

Whole genome sequence (WGS) analysis of H. influenzae
Sixty-two bacteria collected on the 10th of January 2018 were H. influenzae, confirmed by K-mer alignment with ribosomal proteins at the pubMLST database [18]. Raw reads from these 62 strains were re-assembled and initially compared using the nullarbor v2 reads-to-report package [20]. Reads were trimmed with trimmomatic v0.39 [21] and assembled with shovill v1.0.9 for skesa v2.3.0 [22]. Annotation with Prokka and identification by Roary of core genes present in all strains were performed as previously described [23]. Concatenated core genes were aligned across all samples and single nucleotide polymorphism (SNP) distances were counted with snpdists v0.7.0. To include intergenic mutations, SNPs were also called by mapping of reads to closed circular DNA of strain Rd KW20 (NC_000907), using snippy v4.4.3; the read mapping-method gave rise to 129,997 SNPs or 7.1% of the Rd KW20 genome. Roary defines the pangenome reference as the union of Prokka-annotated genes clustered by a threshold of 95% identity. As an alternative assessment of gene presence/absence, we mapped the reads from each sample onto this pangenome reference with Snippy v4.6.0. One-tenth of the median read depth of the gene was used as cutoff value for coverage vs non-coverage. Lastly, we counted the proportion of covered positions in said genes, and regarded genes, where more than 85% of positions were identified, as being present. MLST sequence types were identified with mlst v2.18.0 (https://github.com/tseemann/mlst), and the Haemophilus influenzae database (PubMLST.org) updated February 2020.

Accession numbers
Allelic profiles of four strains with new MLST sequence types have been uploaded to PubMLST.org and assigned sequence types 2329-30 and 2332-3; WGSs of the same four strains have been deposited in GenBank under accession numbers JAECZV000000000-JAECZY000000000. Raw sequence data of the entire ODiD 2018 collaboration have been submitted to the European Nucleotide Archive under study accession no. PRJEB37711 [18].

Data reported to the Danish microbiology database
One hundred three specimens from 103 patients with growth of H. influenzae were reported by the Danish DCMs on the 10th of January 2018, corresponding to a detection rate of 1.78 per 100,000 person-days. Denmark is divided into five regions, and regional detection rates varied between 1.37 and 3.57, median rate 1.72. Thirty-four samples were from hospitals, corresponding to a detection rate in clinical specimens of 2.47 per 1000 hospital bed-days (range 1.04-5.37, median 1.95) ( Table 1). Samples originated primarily from suspected cases of conjunctivitis (N=48; 47 from practising physicians) and from the respiratory tract (N=39; 32 from hospital departments including three bronchial samples). Other anatomical locations were ear (10) and nose (six).
Antimicrobial susceptibility testing was not reported for 26 samples (23 conjunctiva, 2 nose, and 1 ear). One DCM reported absence of β-lactamase for eight samples (three ear, three conjunctiva, and two nose) and stated that "penicillin derivatives will normally be active." With respect to conjunctival samples, two DCMs reported AST of chloramphenicol and ciprofloxacin (12 samples), or fusidic acid and tobramycin (10 samples), respectively. AST of β-lactams was reported for all respiratory tract cultures plus six ear and two nasal cultures. Two departments did not record zone diameters, but reported AST from four respiratory samples (including three from cystic fibrosis patients) based on visual estimation of zones. As determined by reported AST of 47 strains from 10 DCMs, 30 strains (64%) were ampicillin-susceptible, eight (17%) were resistant to ampicillin but susceptible to amoxicillin-clavulanate, and 9 (19%) were resistant to amoxicillinclavulanate.

WGS: genome relatedness and circulating clones
Sixty-two of 103 reported strains of H. influenzae were collected for genetic characterisation, as detailed in the "Materials and methods" section. Figure 1 depicts genome relatedness and lists regional origin, specimen type, and selected WGS characteristics. H. influenzae consists of two distinct, phylogenetic groups, where group II comprises serotypes e and f, plus other rarely detected lineages [28]. The 62 strains of the point prevalence study all belonged to phylogenetic group I, and none of them encoded a capsulation operon (often designated " NTHi", non-typeable Haemophilus influenzae). To fasten the population structure with previous markers, five Finnish otitis media strains are included in Fig. 1 in order to represent the five phylogenetic group I clades of De Chiara et al. [29]. Most study strains cluster with the clade V reference sequence and support the separation of this clade into two distinct clusters, as recently suggested [30]. Multilocus sequence typing (MLST) revealed a major cluster of sequence type (ST) 103, plus some smaller clusters (ST11, ST155, ST160, ST367, ST388, and ST652). Strains of the same ST frequently originated from different regions of Denmark. Extending the size of compared DNA, from seven gene fragments (MLST) to 1166 concatenated core genes, disclosed some SNPs differentiating strains of the same ST, but three ST103 strains (0486, 0802, and 0876) from three separate regions contained identical core genomes. If the genomic comparison was limited to the eight closely related ST103 strains, the number of core genes increased from 1166 to 1791 (1.137 to 1.673 Mb), but SNPs differentiating strain 0486, 0802, and 0876 were still absent. Only inclusion of intergenic regions, by mapping of reads to the closed circular reference genome RD KW20, was able to disclose 35-58 SNPs separating the three strains. Figure 2 shows pairwise SNP distances among 8 closely related ST103 strains (28 pairs) as revealed by comparison of core genes and by read mapping. The two methods were not fully coherent, as observed for the three strains with identical core genes, as well as other unexpected differences. In the complete collection of 62 strains, seven pairs were related below a breakpoint of 40 SNPs by read mapping, and two of them differed by only seven (0648 vs 0778, ST155) and 16 SNPs (0803 vs 0846, ST103). The human hosts of the former pair were separated by a wide geographic distance, and although the number of SNPs was low, pairwise Roaryquantification of shared genes in the two genomes sequestered 68 of 1718 annotated genes to presence in only one of the two strains. The latter pair originated from the same region albeit from separate wards, and pairwise comparison revealed a relatively large number of shell genes (59 of 1881 annotated genes were present in only one of two strains). A third pair (0536 and 1351, ST107) was separated by 31 SNPs by read mapping, but encompassed a comprehensive core, with only two of 1729 annotated genes being confined to the shell. To test if Roary analysis of incomplete genomes overestimated gene absence, we also did read mapping against the pangenome reference sequence (4082 genes). The additional calculation (described in the "Materials and methods" section) slightly increased not only the number of shared genes but also the number of genes present in only one of two strains, thus supporting the Roary-quantification.

WGS: acquired and mutational resistance
The TEM-1 β-lactamase gene (bla TEM-1 ) was present in 24 strains (39%), whereas ROB-1 was absent. For 22 strains, a contig size~0.2 Mb indicated chromosomal location of TEM-1, and flanking genes were associated with various integrative and conjugative elements of the ICEHin1056 family; genes for chloramphenicol acetyltransferase and tetracycline resistance were absent in all strains. Two ST57 strains (0475 and 0476) encoded TEM-1 on small plasmids identical to pA1606 previously described from Denmark [27]. The coverage of the pA1606 contigs was 15 and 14 times the average sequence depth of contigs, respectively, demonstrating that pA1606 is a low copy number plasmid. For the other β-lactamase-positive strains, the coverage of TEM-1 did not differ from average genome sequence depth (mean ratio 1.03, range 0.72-1.24); thus, there was no evidence of extensive gene amplification of bla TEM-1 , which has been shown for piperacillin/tazobactamresistant strains of Escherichia coli [31].
PBP3 was downloaded from 62 strains, and substitutions in the second part of the protein (which encompasses the transpeptidase domain) were determined using strain Rd KW20 as reference. A total of 134 amino acid substitutions were disclosed at 18 positions (range 1-23) (Fig. 3). Substitution R517H (defining group I BLNAR) was not detected, but N526K (group II BLNAR) was present in 13 strains (21%), and these strains are therefore genetically categorised as ampicillin-resistant. Nine of the 13 strains were β-lactamase gene-negative, while four had simultaneous presence of bla TEM-1 and may be designated beta-lactamasepositive ampicillin-clavulanic acid resistant (BLPACR); in the present context, the 13 BLNAR and BLPACR strains are merged into PBP3 group II in Fig. 1 (wild-type PBP3 signify absence of substitutions R517H, N526K, S385T, and/or L389F). Seven strains encoded substitution M377I, close to the SSN motif at position 379-81. M377I was included in the original definition of BLNAR group III that confer resistance to cephalosporins (M377I, S385T plus L389F; [11]), but the substitution has been omitted in recent updates Representatives of the five phylogenetic group I clades [27] are included in the dendrogram, but not in the calculations. Letters appended to PBP3 group II refer to ftsI clusters (Fig. 4). The asterisks denote new sequence types described in this study. Bar represents 2000 residue substitutions. Abbreviations: BAL, broncho-alveolar lavage; Comp -, truncated competence genes; H, hospital department; P, general or speciality medical practitioner; wt, wild-type  Fig. 2 Scatter plot of SNP distances among closely related ST103 strains by alignment of 1166 core genes (abscissa) and by mapping of reads to H. influenzae strain Rd KW20 (ordinate). Selected pairs are indicated of the scheme [12]. Two-thirds (82 of 121) of non-N526K substitutions in amino acids 350-603 of PBP3 occurred in the 13 group II BLNAR strains (Fig. 3, downward columns), with position 547 as a noticeable exception (position 603 is outside the transpeptidase domain). Inactivation of indispensable genes for natural competence was detected in nine strains (Fig. 1, column comp -), most often caused by single nucleotide mutations or deletions giving rise to premature stop codons. Truncation of essential competence genes showed a distinct association with wild-type PBP3 (eight of nine strains).
Sequence comparison of the transpeptidase domain of ftsI (the gene encoding PBP3) is shown in Fig. 4. Thirteen strains with substitution N526K cluster in two groups, with 7 strains in cluster A and 3 strains in cluster B, plus three singletons (ftsI cluster designations are appended to PBP3 group II designations in Fig. 1). The B cluster of ftsI was only observed among closely related strains (ST107 and a new ST, Fig. 1, bottom), while ftsI A cluster strains are dispersed in the genome dendrogram.
Sixteen positions in GyrA. GyrB, ParC, and/or ParE are associated with fluoroquinolone resistance in H. influenzae [33]. Amino acid substitutions at these positions were not detected.

Discussion
Two thousand nine bacterial strains, reported on the 10th of January 2018 and deemed clinically relevant by the local DCM in charge, were collected for genetic characterisation. The focus of the collection was prevalence, rather than selection of blood stream and other invasive isolates, but confinement to a single day during winter is a limitation to a pointprevalence study. Sixty-two of 2009 collected isolates (3.1%) were H. influenzae. Selection of strains was at the discretion of the responsible DCM, and criteria varied. Based on electronic reporting, the true prevalence on the 10th of January was 103 isolates of H. influenzae in 103 samples from 103 patients, corresponding to 1.78 per 100,000 person-days (all samples), or 2.47 per 1000 hospital bed-days (hospital samples). One Danish region reported 15 specimens with H. influenzae and submitted 15 strains for characterisation; the nadir was 21 reported vs four collected strains. However, 62 genetically characterised isolates constitute a solid and credible fraction of 103 point prevalence strains.
All 62 sequenced strains belonged to phylogenetic group I and were unencapsulated. Phylogenetic characterisation of the species usually focuses on diversity, as well as strains from invasive infections [28,34,35], and this aim may explain the overrepresentation of phylogenetic group II and the six capsular types, as well as outgroup strains, in published reports. The current depiction of the species underlines the importance of phylogenetic group I strains as commensals and in mucosal infections.
The genomic characterisation disclosed nationwide, circulating clones. H. influenzae has rarely been subject to epidemiologic investigations, and the current study revealed some obstacles when genomic distances are interpreted in order to guide infection control. Due to inclusion of intergenic sequences, read mapping invariably generated larger, pairwise SNP distances than alignment of core genes. However, the "molecular clock" of the two bioinformatics tools was not congruent, as demonstrated by the discordance of the two methods to identify the most closely related genomes in the large cluster of ST103 strains (Fig. 2). Roary-estimation of shared genes of close neighbours did, in some cases, seclude annotated genes to presence in only one of the two strains. Uptake of genes can also reflect a molecular clock, but true assessment of genomic content will require closing of genomes, or use of long-read sequencing technology. The most significant or decisive bioinformatics tool for epidemiologic investigations of H. influenzae has yet to be clarified. Susceptibility to ampicillin was assessed by phenotype (growth in the presence of antimicrobial agent) for 47 strains, and by genotype (WGS) for 62 strains. Categorisation of resistance differed markedly, with ampicillin resistance rates of 36% and 53%, respectively. A similar difference was reported from Japan, where 39% of upper respiratory tract isolates were non-susceptible to ampicillin (MIC ≥ 2 μg/ml), while 65% were genetically β-lactamase-negative ampicillin-resistant (gBLNAR) [15]. Personal identifiers and original specimen numbers were blinded to this study, and the two samplings retrieved from the same prevalence material of 103 strains were not identical. Ampicillin resistance was overrepresented among conjunctival strains in Norway, and strains from this anatomical site were rarely subjected to AST of aminopenicillin in the present study. But it should be emphasised that the two definitions of resistance are not interchangeable. Mutational resistance comes with a wide range of MIC distribution, and the accepted precision of ±1 dilution for reference broth microdilution AST [36] complicates the interpretation when recorded MICs are close to the breakpoint. Partly to address the difficult identification of strains with mutational resistance, EUCAST has recently suggested that H. influenzae can only be categorised as susceptible to oral amoxicillin if an increased dosage of 750-1000 mg × 3 is used (EUCAST Clinical Breakpoint Tables v. 10.0). Finally, the unstable growth of fastidious organisms can affect the categorisation of phenotypic resistance. It is well known that methods and media have important consequences for detection of resistance in H. influenzae [37]. General use of the EUCAST disk-diffusion method for categorisation of susceptibility to aminopenicillins has reduced the inter-laboratory inconsistency, but essential and categorical agreement with minimal inhibitory concentration determined by broth microdilution is still modest [38].
PBP3-substitution N526K was detected in 13 of 62 strains (21%), which is twice the rate reported from a Danish region in 2007 [39]. The current strains showed a conspicuous accumulation of certain additional substitutions in the transpeptidase region of the protein. PBP3 substitutions identical to ftsI clusters A and B were also common in Norway in 2007, being disclosed in 48 (43%) and 17 (15%) of 111 strains with the N526K substitution [32]. Although genetic comparison in that study was restricted to MLST, a similar tendency to congregation of ftsI cluster B, and dispersion of ftsI cluster A, was observed; the prevalent Norwegian cluster B clone (in that study designated PBP3 type D) consisted of ST201, which is different from ftsI cluster B clone STs of the present study (Fig. 1, bottom). The observed "linkage disequilibrium" between ftsI and other core genes in cluster A strains can arise by selection of particular mutations associated with survival in the presence of repeated exposure to antimicrobials, or by lateral transfer of an optimised fragment of the PBP3 gene. Inactivation of essential competence genes was preferentially observed in strains with wild-type PBP3, but the prevalence of truncated competence genes was too low to firmly disclose the origin of the gene fragment(s) encoding N526K. There is extensive variation in natural competence in H. influenzae [40], and other genes may markedly depress transformation without being essential for competence. Biologic assessment of competence in a random sample of clinical strains is required to elucidate the putative importance of lateral transfer for mutational resistance in H. influenzae.
The point prevalence study has emphasised the significance of unencapsulated H. influenzae, which may be an under-recognised pathogen [1]. Circulating clones are frequent, but infections may be related to host factors, rather than direct transmission of epidemic strains. Revelation of aminopenicillin resistance in half of point prevalence strains by WGS must be addressed by meticulous AST, and confirms the concerns that have been raised by WHO.
Author contribution The study was conceived and planned by NNL. JL retrieved data from MiBa, and CMK performed the major part of the bioinformatics analysis. NNL analysed data and wrote the first draft of the manuscript, which was revised critically for clinical content by NP, HLN, and DSH. All authors approved the final version of the manuscript.
Data availability Accession numbers are given in the "Materials and methods" section. The dataset analysed as part of the current study is available from the corresponding author on reasonable request.

Declarations
Ethical approval and informed consent Not applicable to the present study. Reported culture of H. influenzae and collected bacterial strains are linked to specimen type and a regional DCM. Information cannot be linked to patient records or original sample number, and according to Danish law, approval from science ethics committees is not required.