Comparative Analysis of Bacterial and Archaeal Population Structure by Illumina Sequencing of 16S rRNA Genes in Three Municipal Anaerobic Sludge Digesters

Understanding the microbial communities in anaerobic digesters is important for better regulation, operation, and sustainable management of the sludge produced at various stages of wastewater treatment. The microbial communities of anaerobic digesters in the Gulf region, where climatic conditions and other factors may affect the incoming feed, have not been documented. The archaeal and bacterial communities of three full-scale anaerobic digesters, designated AD1, AD3, and AD5, were analyzed by Illumina sequencing of 16S rRNA genes. Among bacteria, the most abundant genus was the fermentative Acetobacteroides (Blvii28). Other predominant bacterial genera in the digesters included thermophilic bacteria (Fervidobacterium and Coprothermobacter) and halophilic bacteria like Haloterrigena and Sediminibacter. This may be related to the climatic conditions in Dubai and to the fact that much of the water used in the country is desalinated seawater, so the bacteria in the original feed may be thermophilic or halophilic. Propionic acid-producing bacteria such as Paludibacter and propionate-oxidizing bacteria such as W5 were also dominant groups and were found in all the digesters. The predominant archaea were mainly members of the phyla Euryarchaeota and Crenarchaeota belonging to the genera Methanocorpusculum, Metallosphaera, Methanocella, and Methanococcus. The high abundance of the hydrogenotrophic archaeon Methanocorpusculum (more than 50% of total archaea) is consistent with the high abundance of Acetobacteroides (Blvii28) and Fervidobacterium, which ferment organic substrates to acetate and H2. Coprothermobacter, which is known to improve protein degradation by establishing a syntrophy with hydrogenotrophic archaea, was also one of the dominant genera in the digesters.
This study, for the first time, contributes to an in-depth understanding of the phylogenetic diversity of the microbial community structure of three full-scale anaerobic digesters of a municipal wastewater treatment plant in Dubai, UAE. Bacteroidetes, Firmicutes, Synergistetes, Thermotogae, OP8, and Chloroflexi were the dominant bacterial phyla in all the digesters, and the highest diversity was observed in AD5, followed by AD3 and AD1. The predominant archaeal community included mainly members of the phyla Euryarchaeota and Crenarchaeota belonging to the genera Methanocorpusculum, Metallosphaera, Methanocella, and Methanococcus, with the highest archaeal diversity observed in AD5 as compared to AD3 and AD1. The findings of this study indicate a predominance of thermotolerant and halotolerant bacteria and archaea in the anaerobic digesters, which could be due to the influence of environmental conditions on the incoming feed sludge. The anaerobic digesters were also characterized by a high population of hydrogenotrophic archaea and of bacteria known to ferment organic substrates to acetate and H2. This indicates the possible mechanism of CH4 production by the microbial community present in these digesters. From the taxonomic classification at the genus level, it can be observed that different digesters alter the microbiome profile differently. For intergroup comparisons, no abundant genera (present at a level of at least 1% in any of the samples) showed significant differentiation. In conclusion, a core, stable, and functional microbial community was observed under more or less similar physicochemical conditions.


Introduction
Anaerobic digestion is a multistep process in which a complex microbial community, including archaea and bacteria, breaks down organic matter to produce CO2, CH4, and H2O.
These products can be used as biogas, which can be further processed to generate electricity or fuel for transportation [1,2]. The energy obtained from anaerobic digesters can also be used to operate the wastewater treatment plant itself [3]. Because of these possible applications of the gases and the rising cost of conventional fossil fuels, anaerobic digestion is emerging as one of the most sustainable methods for the management of organic waste [4]. The process is already being used for the generation of renewable energy in many countries worldwide [5]. Anaerobic digestion is also of environmental significance, as it helps in managing wastes and reducing greenhouse gases [6]. However, there are clear challenges in translating the technology into simple commercial applications and general use [7]. One of the reasons is the complexity of the microbial community involved in the process, which makes it difficult to understand the specific roles played by different bacteria and how they are maintained as consortia [8,9]. Therefore, it is essential to understand the roles played by the various microorganisms in anaerobic digestion.
Several microbial processes are involved in the conversion of organic wastes into CH4 gas [10]. The conversion consists of the following steps: the breakdown of complex organic molecules into simple organic compounds, the conversion of these simple organic molecules into organic acids (acidification), and finally the transformation of these organic acids into CH4 gas. Methanogens (methane-producing archaea) are of two major types, namely hydrogenotrophic and aceticlastic, depending on the type of substrate they use for CH4 production.
Each of these processes is carried out by different microorganisms in the presence of various other unrelated microorganisms, and the complex interplay between these microorganisms may influence the processes adversely or favorably. Multiple factors influence the composition of the microbial community present in anaerobic digesters [11]. One crucial factor can be the microbial community in the feed of the anaerobic digester, which can ultimately influence the microbial community in the digester itself [12,13]. The composition of the microbial community in the feed can also change with environmental conditions, and it has been demonstrated earlier that environmental conditions affect the type of methanogens present in anaerobic digesters [14].
Next-generation sequencing of 16S rRNA amplicons is now commonly applied to gain an in-depth understanding of microbial community structure and dynamics in a wide variety of engineered anaerobic digester systems [15-17]. Illumina is a promising high-throughput sequencing platform for the taxonomic and functional analysis of diverse microbial populations in various environmental samples. Several studies have addressed the phylogenetic classification of microbial communities in full-scale anaerobic sludge systems worldwide; however, next-generation sequencing has not been applied to study the microbial community structure in anaerobic sludge digesters of full-scale municipal wastewater treatment plants in the UAE. Although we previously examined the anaerobic digester microbial community using fluorescent in situ hybridization and quantitative PCR, a comprehensive mapping of key microbial operational taxonomic groups was still necessary [18]. The sewage treatment plant located in Jebel Ali is one of the major wastewater treatment plants in Dubai, UAE. Its efficient operation and maintenance are indispensable for the city of Dubai.
Furthermore, to our knowledge, no attempt has been made to gain a deeper understanding of the microbial community structure of full-scale anaerobic sludge digesters of municipal sewage treatment plants in the UAE using high-throughput Illumina sequencing of the 16S rRNA gene. Although the influence of physicochemical operational parameters on microbial composition is well documented, this study does not focus on the effect of operational parameters on the microbial population, in part because all three digesters operated under stable conditions and were fed with the wastewater sludge produced at various stages of wastewater treatment. In this context, this investigation focused on comparing the taxonomic results for the dominant bacterial and archaeal genera involved in the critical steps of anaerobic digestion in three full-scale anaerobic digesters, as determined by high-throughput Illumina sequencing. The present study hypothesized that an anaerobic digester microbial community operating under stable physicochemical conditions and sludge feed should have a core functional taxonomic group with a well-defined role in each phase of the anaerobic digestion cycle.
This study also attempts to determine the relative abundance of specific taxonomic groups across the three anaerobic digesters and within the same digester in samples taken at different times, using Illumina sequencing of 16S rRNA genes.

Sampling
The waste sludge samples were collected from three anaerobic digesters of the Jebel Ali Sewage Treatment Plant (JASTP), UAE. The JASTP is the largest tertiary wastewater treatment infrastructure in Dubai; it serves a population of approximately 3.37 million and processes 375,000 m3 of wastewater per day. All three digesters had a capacity of 7433 m3 and were operating at a mesophilic temperature ranging from 32 to 37 °C [19]. The digesters were fed with 60% raw sludge and 40% activated sludge. Details on the configuration and characteristics of the digesters are listed in Table 1. The samples from the anaerobic digesters were collected in autoclaved plastic bottles, and the bottles were transported to the lab on ice at 4 °C within an hour. The samples collected from the three anaerobic digesters over a period of six months were designated as AD1, AD3, and AD5. The collected samples were immediately mixed with 100% ethanol in a ratio of 1:1 (v/v) and stored at -20 °C until DNA extraction. Temperature, pH, and electrical conductivity (EC) were measured in the sludge samples at the time of collection using a HORIBA U-50 Multi Water Quality Checker (HORIBA Instruments Incorporated, USA). DNA was then extracted from 0.25 g of the pellet obtained, according to the manufacturer's protocol. The extracted DNA was stored at -20 °C until further use. DNA concentration and purity were checked using the Qubit fluorometer (Thermo Fisher Scientific, USA).

Illumina sequencing of samples
The diversity of bacterial and archaeal communities in the samples was determined by amplifying the V3-V4 regions of bacterial and archaeal 16S ribosomal RNA (rRNA) genes. The following is a brief description of the main steps of the protocol. The quality of the extracted genomic DNA samples was checked by quantification using the Qubit DNA BR Assay kit (Thermo Fisher Scientific, USA, Cat# Q32853). To generate the 16S amplicon, the extracted DNA samples were diluted to 10 ng and amplified for 16S (~1500 bp) using 16S forward (5' AGAGTTTGATCCTGGCTCAG 3') and 16S reverse (5' GGTTACCTTGTTACGACTT 3') primers, along with a positive control (internal metagenomic DNA sample) and a no-template control. These amplicons were checked on a 1% agarose gel. To generate the V3-V4 amplicon, the 16S amplicon was used as a template, and all the samples were subjected to V3-V4 amplification (~460 bp) using V3-V4 forward (5' CCTACGGGNGGCWGCAG 3') and V3-V4 reverse (5' GACTACHVGGGTATCTAATCC 3') primers, along with a positive control (internal metagenomic DNA sample) and a no-template control [20]. The amplicons were checked on a 1% agarose gel. The V3-V4 amplicons were then cleaned using AMPure XP beads (Beckman Coulter, CA, USA, Cat# A63882) to remove non-specific fragments. The V3-V4 products were used for DNA library preparation with the NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs, UK, Cat# E7370L). First, the amplicons were end-repaired and mono-adenylated at the 3' end in a single enzymatic reaction. Next, NEB hairpin-loop adapters were ligated to the DNA fragments in a T4 DNA ligase-based reaction. Following ligation, the uracil-containing loop was linearized using USER Enzyme (a combination of UDG and Endo VIII) to make it available as a substrate for PCR-based indexing in the next step. During PCR, barcodes were incorporated using unique primers for each of the samples, thereby enabling multiplexing.
The prepared libraries were checked for fragment distribution using D1000 ScreenTapes (Cat# 5067-5582, Agilent, CA, USA) and reagents (Cat# 5067-5583, Agilent, CA, USA). The libraries were pooled and diluted to the final optimal loading concentration before cluster amplification on the Illumina flow cell. Once cluster generation was completed, the clustered flow cell was loaded on an Illumina HiSeq2500 instrument (Illumina, Inc., San Diego, USA) for amplicon sequencing to generate 0.5 million 250 bp paired-end reads per sample.
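The V3-V4 primers quoted above contain degenerate IUPAC codes (N, W, H, V), so locating them in a read requires matching each code against the set of bases it stands for. The following is a minimal sketch of that idea in Python; the function names and the example read are hypothetical and are not part of the pipeline used in this study.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    parts = []
    for code in primer.upper():
        bases = IUPAC[code]
        # Unambiguous bases stay literal; degenerate codes become a character class.
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def starts_with_primer(read, primer):
    """Return True if the read begins with a sequence matching the primer."""
    return primer_to_regex(primer).match(read.upper()) is not None


V3V4_FORWARD = "CCTACGGGNGGCWGCAG"       # forward primer quoted in the text
V3V4_REVERSE = "GACTACHVGGGTATCTAATCC"   # reverse primer quoted in the text

# Hypothetical read: a primer-matching prefix followed by arbitrary bases.
example_read = "CCTACGGGAGGCAGCAG" + "TGGGGAATATTGCACAATGG"
print(starts_with_primer(example_read, V3V4_FORWARD))
```

Exact matching is used here for simplicity; real trimming tools usually tolerate a small number of mismatches.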

Bioinformatics analysis
The bioinformatics analysis pipeline is shown schematically in supplementary Fig. 1. Briefly, the following steps were used to process the reads and quality-filter the sequencing data. The raw fastq files were checked with FASTQC for base quality, base composition, and GC content. Based on the quality report of the fastq data, the sequence reads were trimmed with fastq-mcf where necessary, so that only high-quality sequences were retained for further analysis. Low-quality sequence reads were excluded from the analysis.
The 16S rRNA gene amplicons were identified in the raw fastq files based on the presence of the forward and reverse conserved regions. The paired-end reads contained a spacer region, part of the conserved region, and the V3-V4 region. As a first step, the spacer and conserved regions were removed from the paired-end reads. After the unwanted sequences had been trimmed from the original paired-end data, a consensus V4 region sequence was constructed using Fast Length Adjustment of SHort reads (FLASH v1.2.7) [21].
The reads were pre-processed by dereplication, that is, the identification of unique sequences so that only one copy of each sequence is reported. USEARCH was used for dereplication with the derep_fulllength command. Chimeric sequences are experimental artifacts formed by the fusion of two or more original sequences. The UCHIME utility from USEARCH was used to remove the chimeras using the de novo approach [22].
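As an illustration of what full-length dereplication does, the sketch below collapses identical sequences and records how often each one occurred. It is a simplified stand-in written for this explanation, not the USEARCH implementation, and the reads are hypothetical toy data.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical full-length sequences, keeping abundance counts.

    Returns a list of (sequence, count) tuples sorted by decreasing count,
    with ties broken alphabetically so the output is deterministic.
    """
    counts = Counter(seq.upper() for seq in sequences)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy reads, not data from the study.
reads = ["ACGT", "acgt", "TTGA", "ACGT", "GGCC", "TTGA"]
unique = dereplicate(reads)
for seq, count in unique:
    print(seq + "\tsize=" + str(count))
```

Keeping the counts matters because downstream steps such as de novo chimera detection and singleton removal rely on sequence abundance.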
Operational taxonomic unit (OTU) identification and singleton removal were performed as follows. Sequences with a similarity of 97% were grouped into a single OTU using a closed-reference method for classification of the whole data set. Any OTU with a count of 1, i.e., a sequence present only once in a single sample (singleton), was filtered out, as such sequences may be experimental artifacts rather than novel organisms. For taxonomic classification, one representative sequence from each OTU was picked and classified using the RDP classifier against the Greengenes database at 97% similarity. Taxonomic classification was carried out at the phylum, order, family, and genus levels. The raw sequences have been deposited in the National Center for Biotechnology Information (NCBI, Bethesda, MD 20894, USA) under BioProject accession number PRJNA602372.
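The singleton filter and the conversion of counts to relative abundance can be sketched as follows. The table layout (OTU name mapped to per-sample counts), the function names, and the counts are assumptions made for illustration only.

```python
def remove_singletons(otu_table):
    """Drop OTUs whose total count across all samples is 1 or less."""
    return {otu: counts for otu, counts in otu_table.items()
            if sum(counts.values()) > 1}


def relative_abundance(otu_table, sample):
    """Return each OTU's share (in percent) of the reads in one sample."""
    total = sum(counts.get(sample, 0) for counts in otu_table.values())
    if total == 0:
        return {}
    return {otu: 100.0 * counts.get(sample, 0) / total
            for otu, counts in otu_table.items()}


# Hypothetical counts, not data from the study.
otu_table = {
    "OTU_1": {"AD1": 60, "AD3": 30},
    "OTU_2": {"AD1": 40, "AD3": 70},
    "OTU_3": {"AD1": 1, "AD3": 0},   # singleton: one read in one sample
}

filtered = remove_singletons(otu_table)
print(sorted(filtered))
print(relative_abundance(filtered, "AD1"))
```

Relative abundance is computed after filtering so that discarded singletons do not contribute to the per-sample totals.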

Statistical analysis
In the present study, two types of comparisons were carried out. First, intergroup comparisons were carried out to find differences in genera between the groups as follows: AD1 vs AD3, AD3 vs AD5, and AD1 vs AD5. The comparisons were performed using a t-test, and any genus with a p-value of less than 0.05 was considered differentially present. Second, intragroup comparisons were carried out for each of the three groups, between samples collected at different times. As these comparisons are between a pair of samples (single-sample comparison) without any replicates, a different approach was taken to identify the differential genera. First, the genera unique to each of the two samples were listed separately. Then, for the genera common to both samples, the ratio of the relative abundances in the two samples was calculated. These ratios were converted to a log scale to obtain an approximately normal distribution, and any genus with a value beyond mean ± 2SD was considered differentially present.
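The intragroup procedure described above can be sketched as follows. The base of the logarithm is not stated in the text, so log2 is an assumption here, as are the function name, the use of the sample standard deviation, and the toy abundances. The intergroup t-test is not reproduced in this sketch.

```python
import math
import statistics


def differential_genera(sample_a, sample_b, n_sd=2.0):
    """Flag shared genera whose log2 abundance ratio lies beyond mean +/- n_sd * SD.

    sample_a and sample_b map genus name -> relative abundance (> 0).
    Returns (unique_to_a, unique_to_b, flagged), where flagged maps
    genus -> log2(a / b) for the outlying genera.
    """
    unique_a = sorted(set(sample_a) - set(sample_b))
    unique_b = sorted(set(sample_b) - set(sample_a))
    common = sorted(set(sample_a) & set(sample_b))

    # Log-transform the abundance ratios of the genera shared by both samples.
    log_ratios = {g: math.log2(sample_a[g] / sample_b[g]) for g in common}
    if len(log_ratios) < 2:
        return unique_a, unique_b, {}

    values = list(log_ratios.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)

    flagged = {g: r for g, r in log_ratios.items() if abs(r - mean) > n_sd * sd}
    return unique_a, unique_b, flagged


# Hypothetical abundances (percent): nine stable genera and one that changes 8-fold.
jan = {"G" + str(i): 2.0 for i in range(1, 10)}
feb = dict(jan)
jan.update({"G10": 8.0, "OnlyA": 0.5})
feb.update({"G10": 1.0, "OnlyB": 0.7})

only_jan, only_feb, flagged = differential_genera(jan, feb)
print(only_jan, only_feb, flagged)
```

With very few shared genera, no value can lie more than two sample standard deviations from the mean, so the criterion is only meaningful when the two samples share a reasonably large number of genera.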

Physicochemical conditions of Anaerobic digesters
The physicochemical conditions of the digester operation are given in [...] value range found in this study was similar to that reported in a recent study conducted in Austria, in which the effects of various co-substrates on the microbial community composition of seven full-scale anaerobic digesters fed with or without co-substrates were evaluated [17]. Overall, the three anaerobic digesters showed acceptable levels of the main operational parameters and performed stably during the sampling period.

Bacterial community diversity at genus level
The rarefaction curves show that, among the three digesters, the highest diversity was present in AD5, while the least diverse microbial community was found in AD1 (Fig. 2A). Fifty-one genera were detected as core genera in all the digesters, including Acetobacteroides (Blvii28), Coprothermobacter, Fervidobacterium, Clostridium, Caldilinea, Allochromatium, Sediminibacter, and T78 (Fig. 2B). In addition to these genera, many of the bacteria were identified as unknown bacteria. Most of these genera are associated with anaerobic digesters, and some have been reported from anaerobic digesters earlier [11]. Notably, Acetobacteroides (Blvii28) was the most abundant genus in all the digesters. The cultured representative of the genus is known to produce acetate, H2, and CO2 as products of fermentation [26]. Intra-group comparison of three AD3 samples collected between [...] 3). In contrast, T78 possibly metabolizes carbohydrates and alcohols via syntrophic interactions [27]. T78 is a member of the phylum Chloroflexi, which is often found to be abundant in full-scale mesophilic anaerobic digesters receiving excess sludge from wastewater treatment plants [28]. Intragroup comparison of the AD5 samples (January and February 2017) found a nine-fold increase in the abundance of T78 in the February 2017 sample compared to January 2017. Many genera found predominantly in the digesters were thermophilic or halophilic. For example, Coprothermobacter is a known proteolytic, anaerobic, thermophilic bacterium found in many thermophilic anaerobic digesters [29]. This genus can also improve protein degradation by establishing a syntrophy with hydrogenotrophic archaea [29]. Other thermophilic bacteria found in the anaerobic digesters include Fervidobacterium and Caldilinea, which ferment carbohydrates to lactate, acetate, hydrogen, and carbon dioxide [30,31].
The genus Sediminibacter, which was initially isolated from marine sediment, was also found as one of the dominant genera [32]. This may be because most of the water used in Dubai is obtained from the sea and used after desalination [33,34], so the incoming feed may contain elevated levels of salts. It may also be because Sediminibacter has a unique light-driven sodium ion pump, which helps it survive in marine habitats [35]. Several purple sulfur bacteria like Allochromatium and Thermotogales AUTHM297 were also part of the microbial community in these digesters. Many sulfate-reducing bacteria (SRB) were also found, with Desulfomicrobium and Desulfobacter as the two most dominant SRB genera. In our previous work using the FISH technique, we found that members of the genera Desulfobacter and Desulfobacterium were consistently present in large numbers in all three digesters [19]. It is interesting to note that Desulfomicrobium is also known to be associated with marine habitats [36].
Anaerobic digestion involves the degradation of complex organic compounds to simple organic compounds, which is most likely carried out by members of the phyla Bacteroidetes, Firmicutes, Proteobacteria, Synergistetes, and Thermotogae. Other dominant phyla in the digesters are shown in Fig. 3. Previous studies have also demonstrated the presence of these phyla in anaerobic digesters as macromolecule-degrading bacteria [11]. Mycobacterium, which is associated with lipase production and lipolytic activity, was also found as one of the predominant genera. Another critical process of anaerobic digestion is acidogenesis, in which bacteria convert organic monomers into acids such as acetic, propionic, and butyric acids. The population of acidogenic bacteria producing acetate, lactate, or propionate, such as Acetobacteroides, Fervidobacterium, Clostridium, and Paludibacter, was high in the digesters, as reported above. The conversion of these organic molecules to CH4 is carried out mainly by archaea. However, some bacteria may influence the production of methane. For example, uncultured acetate-utilizing bacteria of Synergistes group 4 compete with aceticlastic methanogens [37].
The boxplots of intergroup comparisons for differentially present genera among the three anaerobic digester groups (AD1 vs AD3, AD3 vs AD5, and AD1 vs AD5) are shown in Fig. 4 and the supplementary file. Three genera, Rhodobacter, Allochromatium, and Bifidobacterium, were found to be differentially present between AD1 and AD3, but among them, only Bifidobacterium showed differential abundance between AD3 and AD5. For the third comparison (AD1 vs AD5), the genera Propionicimonas, Clostridium, and Paracoccus showed differential abundance. To find out whether the samples showed an overall group-specific trend, a principal component analysis (PCA) based on all the genera detected was performed. The PC1 vs PC2 plot (Fig. 5) showed no group-specific signature. Overall, from the taxonomic classification at the genus level, it can be observed that different digesters alter the microbiome profile differently. For intergroup comparisons, no abundant genera (present at a level of at least 1% in any of the samples) showed significant differentiation.
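Two selection criteria recur in this section: core genera, detected in every digester, and abundant genera, reaching at least 1% in any sample. A minimal sketch of both filters is given below; the data layout, function names, and abundance values are hypothetical and serve only to make the definitions concrete.

```python
def abundant_genera(abundance, threshold=1.0):
    """Genera reaching at least `threshold` percent in any sample."""
    return sorted(genus for genus, per_sample in abundance.items()
                  if max(per_sample.values()) >= threshold)


def core_genera(abundance):
    """Genera detected (abundance > 0) in every sample."""
    return sorted(genus for genus, per_sample in abundance.items()
                  if all(value > 0 for value in per_sample.values()))


# Hypothetical relative abundances in percent, not data from the study.
abundance = {
    "genus_A": {"AD1": 12.5, "AD3": 9.0, "AD5": 10.2},
    "genus_B": {"AD1": 0.4, "AD3": 1.3, "AD5": 0.0},
    "genus_C": {"AD1": 0.2, "AD3": 0.1, "AD5": 0.3},
}

print(abundant_genera(abundance))
print(core_genera(abundance))
```

The two sets need not coincide: a genus can be present everywhere at low levels (core but not abundant) or exceed 1% in one digester while being absent from another (abundant but not core).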

Archaeal community diversity at genus level
When analyzed at the genus level, the highest diversity was observed in digester AD5, followed by AD1 and AD3 (Fig. 6A). This was evident from the highest Shannon index obtained for AD1, followed by AD3 and AD5. The predominant archaea in the three digesters were members of the phylum Euryarchaeota, followed by Crenarchaeota. Hydrogenotrophic archaea predominated in the digesters; Methanocorpusculum alone constituted more than 50% of the total archaeal population. Other predominant archaeal genera included Metallosphaera, Methanocella, Methanococcus, Acidianus, Natronobacterium, and others shown in Fig. 6B. Methanocorpusculum is a hydrogenotrophic methanogen that was first isolated from the biodigester of a wastewater treatment plant [38]. The second most predominant genus, Metallosphaera, is an extreme thermoacidophile with optimal growth at 74 °C and pH 2.0 [39]. Methanocella, the third most dominant genus, is a mesophilic, hydrogenotrophic methanogen [40]. Methanococcus is also a thermophilic hydrogenotrophic methanogen, and at least one species of the genus is known to fix nitrogen [41].
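For reference, the Shannon index mentioned above is computed as H' = -Σ p_i ln p_i, where p_i is the proportion of reads assigned to taxon i. The sketch below uses the natural logarithm, which is a common convention but an assumption here, since the text does not state the base; the counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per genus, not data from the study.
even_community = [25, 25, 25, 25]     # four equally abundant genera
dominated_community = [97, 1, 1, 1]   # one genus dominates

print(shannon_index(even_community))
print(shannon_index(dominated_community))
```

A community dominated by a single genus yields a lower index than an even community with the same number of genera, which is why a digester in which one methanogen makes up more than half of the archaea can score low on this measure even when many genera are present.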
It is clear from the microbial community analysis that a core microbial community exists in the anaerobic digesters studied. The degradation of organic matter is carried out by members of various phyla that have been reported earlier from anaerobic digesters, including Bacteroidetes, Firmicutes, Proteobacteria, Synergistetes, and Thermotogae [13]. Several genera mainly involved in the conversion of these organic substrates into acetate, butyric acid, propionate, and H2 were found to be dominant in the anaerobic digesters. Among archaea, Euryarchaeota, mainly the hydrogenotrophic methanogens, were the predominant members in the digesters (Fig. 6). The predominance of hydrogenotrophic over aceticlastic methanogens is in agreement with another study [25]. The abundant microbial genera observed in this study and their predicted roles in specific processes of anaerobic digestion are shown in Fig. 7. Although the digesters were operated under mesophilic conditions, many thermophilic genera of bacteria and archaea were found to predominate. This may be because of the presence of these organisms in the incoming feed; whether or not they are initially present in the feed is a matter for further investigation. Furthermore, the predominance of bacteria of marine origin like Sediminibacter and Haloterrigena may be because most of the water used in Dubai is desalinated seawater. The high salinity, suggested indirectly by the high electrical conductivity values observed in this study, may be the reason for the occurrence of halophilic bacteria in the anaerobic digester systems.

Conclusions
In this study, using Illumina sequencing of 16S rRNA genes, we compared the bacterial and archaeal community structure of three full-scale anaerobic sludge digesters of a municipal sewage treatment plant in Dubai, UAE. Bacteroidetes, Firmicutes, Synergistetes, Thermotogae, OP8, and Chloroflexi were the dominant bacterial phyla in all the digesters, and the highest diversity was observed in AD5, followed by AD3 and AD1. The predominant archaeal community included mainly members of the phyla Euryarchaeota and Crenarchaeota belonging to the genera Methanocorpusculum, Metallosphaera, Methanocella, and Methanococcus, with the highest archaeal diversity observed in AD5 as compared to AD3 and AD1. The findings of this study indicate a predominance of thermotolerant and halotolerant bacteria and archaea in the anaerobic digesters, which could be due to the influence of environmental conditions on the incoming feed sludge. The anaerobic digesters were also characterized by a high population of hydrogenotrophic archaea and of bacteria known to ferment organic substrates to acetate and H2.
This indicates the possible mechanism of CH4 production by the microbial community present in these digesters. From the taxonomic classification at the genus level, it can be observed that different digesters alter the microbiome profile differently. For intergroup comparisons, no abundant genera (present at a level of at least 1% in any of the samples) showed significant differentiation. In conclusion, a core, stable, and functional microbial community was observed under more or less similar physicochemical conditions.

Declarations
Availability of data and materials
The raw sequences have been deposited in the National Center for Biotechnology Information (NCBI, Bethesda, MD 20894, USA) under BioProject accession number PRJNA602372.

Competing interests
The authors declare they have no competing interests.

Figure captions
The abundant phyla of the bacterial community across all three anaerobic digester samples.
Figure 2. Panel A shows rarefaction curves of bacterial species diversity in the three anaerobic digesters (AD1, AD3, and AD5). Species were defined based on a 3% difference in the sequence. Three samples from each digester were chosen. Panel B shows a heat map of hierarchical clustering of the twenty genera with the highest mean relative abundance across the three anaerobic digesters.
Panel A shows rarefaction curves of archaeal species diversity in the three anaerobic digesters (AD1, AD3, and AD5). A 3% difference in the sequence was used to define a species. Three samples from each digester were chosen. Panel B shows a heat map of hierarchical clustering of the twenty archaeal genera with the highest mean relative abundance across the three anaerobic digesters.