Biofilm formation is a beneficial mechanism for bacteria. Its construction requires complex regulation not only within a single bacterium but also among the bacteria within the biofilm. In H. pylori, in vitro observations of mono-species biofilms showed that high biomass was obtained after 72 h [14, 15]. However, evaluation of several strains under the same conditions showed that biofilm formation was significantly higher in some strains than in others, indicating strain-level variation in biofilm formation. Such variation was also observed in the present study: 19.6% of the strains overproduced biofilm. High-biofilm formers are likely to be more resistant to antibiotic exposure [7, 16]. Therefore, identifying the genetic determinants of biofilm formation and antibiotic resistance is necessary.
Phylogenetic analysis showed no association between biofilm formation and any particular lineage inferred from the SNP-based core genome alignment tree. In studies of other species such as Staphylococcus aureus, biofilm formation has been reported to be associated with specific lineages [17], whereas other studies reported no phylogenetic link [18]. In a previous study, the H. pylori population of Bangladesh was determined using Structure software, which showed that Bangladeshi isolates were separated into two populations, hpEurope and hpAsia2. The whole genome phylogenetic tree in this study also confirmed that Bangladeshi isolates clustered near hpEurope and hpAsia2 strains.
Whole genome sequence data allow researchers to investigate and screen genes and mutations related to biofilm formation. The ability of H. pylori to grow as a biofilm on an abiotic surface enables researchers to identify genes involved in biofilm formation through the generation of knockout mutants [9, 19]. Hence, we examined the presence or absence of specific genes of potential interest that play roles in initial adhesion, cell shape formation, efflux pumps, and dispersion, as listed in Supplementary Table 2. Our results showed that these genes were present in almost all isolates, despite variations in the level of biofilm formation. This outcome diverges from observations made in other bacteria such as Staphylococcus aureus, in which the ica gene was detected in most high-biofilm formers [20]. A gene marker for high-biofilm formers therefore remains unclear in H. pylori. A previous study using comparative genomic data identified several genes, such as cagD, futA, and napA, whose presence was associated with the biofilm level [11]. Nevertheless, in strains that possess most of the targeted genes, polymorphism at another level might exist and affect the phenotype.
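The presence/absence comparison described above can be sketched as a per-group carriage rate. This is a minimal illustration, not the pipeline used in this study: the function name `presence_rate_by_group`, the isolate IDs, and the gene calls are all hypothetical.

```python
from collections import defaultdict

def presence_rate_by_group(presence, groups):
    """Fraction of isolates in each phenotype group that carry each gene.

    presence: dict mapping isolate ID -> set of detected gene names
    groups:   dict mapping isolate ID -> phenotype label (e.g. "high", "low")
    """
    group_sizes = defaultdict(int)
    carriers = defaultdict(lambda: defaultdict(int))
    for isolate, label in groups.items():
        group_sizes[label] += 1
        for gene in presence.get(isolate, set()):
            carriers[gene][label] += 1
    # Divide carrier counts by group size to obtain carriage rates
    return {
        gene: {label: carriers[gene][label] / size
               for label, size in group_sizes.items()}
        for gene in carriers
    }

# Hypothetical isolates and gene calls, for illustration only
presence = {
    "iso1": {"alpA", "alpB", "cagE"},
    "iso2": {"alpA", "alpB", "cagE"},
    "iso3": {"alpA", "alpB"},
    "iso4": {"alpA", "alpB", "cagE"},
}
groups = {"iso1": "high", "iso2": "low", "iso3": "low", "iso4": "low"}
rates = presence_rate_by_group(presence, groups)
```

When a gene is carried by nearly every isolate in both groups, as with `alpA` in this toy input, its carriage rates are identical across groups and presence alone cannot separate high from low formers.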
Additional analyses focused on amino acid variants, including insertions, deletions, missense mutations, and SNPs. To our knowledge, this is the first study to assess SNPs associated with biofilm formation in H. pylori. Several SNPs have been linked to specific phenotypes of H. pylori, such as diseases like gastric cancer [12]. Beyond the presence or absence of specific genes, SNPs or missense mutations are also well known to induce the antibiotic resistance phenotype in H. pylori [21, 22]. Mutation discovery from whole genome sequences has also been used in several previous studies [23]. In this study, the ARIBA pipeline was used to examine the association of SNPs with the antibiotic-resistant phenotype, and the results were concordant with other reports that confirmed the roles of these mutations by natural transformation [21]. First, a reference database of genes encoding the cellular targets of amoxicillin, clarithromycin, tetracycline, and metronidazole was constructed. Mutations that have been reported to cause antibiotic resistance could be detected using this method, such as A2147G of the 23S rRNA gene and D91N of gyrA [10, 21, 24]. Relatively rare mutations, namely V45I of pbp1a; E679D and E733E of gyrA; and A343V of gyrB, were also observed. The analysis was then expanded to evaluate the SNPs associated with biofilm formation.
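To make the mutation notation concrete, the sketch below names amino acid substitutions (in the style of D91N) by comparing a sample protein sequence with a reference. It is an illustration only and does not reproduce how ARIBA calls variants. The function `call_substitutions` and the toy sequences are hypothetical, and the sequences are assumed to be gap-free and of equal length.

```python
def call_substitutions(reference, sample):
    """Name substitutions between two gap-free protein sequences of equal length.

    Each call is written as <reference residue><1-based position><sample residue>.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        f"{ref_aa}{pos}{alt_aa}"
        for pos, (ref_aa, alt_aa) in enumerate(zip(reference, sample), start=1)
        if ref_aa != alt_aa
    ]

# Toy sequences, not real H. pylori proteins
calls = call_substitutions("MKDLV", "MKNLV")
```

With real data, positions would be numbered along the full-length reference protein, so a D-to-N change at residue 91 of GyrA would be reported as D91N.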
A database of genes proposed to be associated with biofilm formation was constructed and applied to the dataset from Bangladesh. As a result, 11 genes that possessed variants significantly associated with biofilm formation were identified. Among these, five genes encode outer membrane proteins (OMPs) that are crucial for cell adhesion [9]. The other genes are involved in cell shape regulation [25]. This suggests that the transition of H. pylori from the spiral shape to unusual forms, such as the coccoid form, deserves attention in studies of the biofilm formation mechanism.
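One way to test whether a variant is associated with the high-biofilm phenotype is a two-sided Fisher's exact test on a 2x2 table of variant status against biofilm group. The sketch below is an assumption for illustration: the test used in this study is not stated here, and the function name and the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: variant present / absent. Columns: high / low biofilm former.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x variant carriers among high formers
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)

# Hypothetical counts: 8 of 10 high formers and 3 of 41 low formers carry the variant
p_value = fisher_exact_two_sided(8, 3, 2, 38)
```

An exact test is a natural choice for small cell counts, which are common when the high-biofilm group contains only a few isolates.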
The genes investigated in this study included those encoding AlpA and AlpB, which are OMPs involved in the adherence process in H. pylori. CagD and CagE are part of the type IV secretion system, which is known to be required for the early establishment of biofilm through a mechanism that requires further clarification [26]. The other genes encode the Csd protein complex, MurF, and AmiA, which are crucial for peptidoglycan regulation to maintain the helical shape of bacterial cells [27, 28]. The change from a helical to a coccoid shape is known to be related to the virulence of H. pylori [29] and constitutes a strategy for increasing survival in difficult environments [30]. Studies have established that coccoid cells are commonly observed in biofilm-forming H. pylori [31]. These findings support further studies assessing the role of morphological changes in biofilm formation, which remains unclear to date.
The structural changes in the protein products caused by the SNPs were examined by protein modeling analysis. Although this prediction was limited to proteins available in the database, structural changes could be observed in the products of the cgt and gluP genes. Cholesterol-α-glucosyltransferase (CGT) acts by importing cholesterol from the host and converting it to cholesteryl glucosides, which are important for colonization, cell wall formation, and evasion of host phagocytosis [32, 33]. GluP is an efflux pump that affects multidrug resistance and biofilm formation [34]. Further experimental protein analysis should be performed to confirm these functional changes.
This study had some limitations. Given the small number of samples in this dataset and the statistical approaches applied, the possibility of false positive discoveries among the detected SNPs cannot be excluded. However, this study could be a stepping stone for further molecular studies elucidating the genetic factors involved in biofilm formation and the related molecular mechanisms.
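A common way to limit false positives when many SNPs are tested at once is to control the false discovery rate. The sketch below implements the Benjamini-Hochberg adjustment as one such option. It was not necessarily applied in this study, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-SNP p-values
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

SNPs whose adjusted value falls below a chosen threshold, such as 0.05, would then be retained as candidates for follow-up.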