Microbial community structure and metabolic pro�le of anthropized freshwater tributary channels from La Plata River, Argentina to develop sustainable remediation strategies

Microbial communities from freshwater sediments are involved in biogeochemical cycles and they can be modi�ed by physical and chemical changes in the environment. Linking the microbial community structure (MCS) with physicochemistry of freshwater courses allows a better understanding of its ecology and can be useful to assess the ecological impact generated by human activity. The MCS of tributary channels from La Plata River affected by oil re�nery (C, D and E) and one also by urban discharges (C) was studied. For this purpose, 16S rRNA metabarcoding analysis, in silico metagenome functional prediction and the hydrocarbon degradation potential (in silico predictions of hydrocarbon degrading genes and their quanti�cation by qPCR) of the MCS were studied. Principal coordinate analysis revealed that the MCS was different between sites, and it was not structured by the hydrocarbon content. Site C showed physicochemical characteristics, bacterial taxa and an in silico functional prediction related to fermentative/heterotrophic metabolism. Site D, despite having higher concentration of hydrocarbon, presented autotrophic, syntrophic, and methanogenic pathways commonly involved in natural processes in anoxic sediments. Site E showed and intermediate autotrophic/heterotrophic behaviour. The hydrocarbon degradation potential showed no positive correlation between the hydrocarbon degrading genes quanti�ed and predicted. The results suggest that the hydrocarbon concentration in the sites was not enough selection pressure from to structure the bacterial community composition. Understanding which is the variable that structures the bacterial community composition is essential for monitoring and designing of sustainable management strategies for contaminated freshwater ecosystems.


Introduction
Microorganisms that inhabit sediments have an important role in both biogeochemical cycles and biotransformation of pollutants (Su et al. 2018).The microbial communities from freshwater and sediments are usually affected by the availability of nutrients or the presence of contaminants.Therefore, the coupling of biogeochemical processes and its bio lter function in sediments can be altered or even destroyed (Di Cesare et al. 2020; Wang et al. 2015;Xie et al. 2021).The discharge of e uents from urban and industrial wastewater treatment plants, as well as untreated wastewaters in most urban catchments, presents a major challenge for the maintenance of water quality and sediment natural functions (Martínez-Santos et al. 2018).Due to their key role in biogeochemical cycling and their responses to even small chemical and physical changes, the composition of bacterial communities can be used to monitor the ecological impact of human activities on the functional characteristics of the freshwater environment (Zhang et al. 2020).The in-depth knowledge of microbial community diversity of urban rivers or streams sediments is still lacking and is less explored than that of marine or lake ecosystems (Wang et al. 2018), and even more in fresh water courses in Argentina.
La Plata River is considered both the world widest freshwater river and an estuary that drains the second largest basin in South America and the fth largest in the world.This river is a major source of water for domestic use in the most highly populated urban areas of Argentina and Uruguay.Several harbors, resorts, and industrial centers are located within the margins and zone of in uence of the capitals of both countries (Baigún et al. 2016; Moreira and Simionato, 2019).In the late 19th century, the tributary channels of the La Plata River studied in this report, were designed for the operation of agricultural vessels carrying materials and goods from abroad to the port for the export of agricultural products from Buenos Aires province.When petrochemical industries and re neries were built in the early 20th century, the channels received the resulting petrochemical waste.However, no petrochemical waste has been discharged in the last 20 years.One of them (channel C) was found to have two active urban discharges on the left margin.
Within the main pollutants, petroleum hydrocarbons are relevant because of their toxic effects over aquatic and terrestrial organisms.Many of its more persistent compounds can be carcinogenic (Cobbina et al. 2015; Maletić et al. 2019;Perelo, 2010).Once in the aquatic environment, hydrocarbons tend to oat on water surface and, those with higher molecular weight, can accumulate in sediments (Logeshwaran et al. 2018).Nevertheless, hydrocarbons in these ecosystems can be transformed and degraded, sometimes to mineralization, mainly by biological processes (Perelo, 2010).
Exploring a single source of pollution can provide an accurate understanding of the relationship between sediment bacterial communities and that speci c input stream.Though, the contribution of the complex pollution sources to freshwater ecosystems is comprehensive (Zhang et al. 2020).Surface runoff of urban area can bring heavy metals, nutrients, herbicides, pharmaceutical compounds, among others (Simonin et al. 2019), that could alter the composition of the sediment bacterial community and affect its natural ability to degrade hydrocarbons.
The aim of this study was to examine the microbial community structure (MCS) and the metabolic pro le of surface sediments of three channels tributary of La Plata River affected by oil re nery industries, to obtain a better understanding of its ecology and to design future sustainable management strategies.
To achieve this goal, we have characterized the three sites by determining several physicochemical variables, studying their MCS by 16S rRNA gene amplicon sequencing, the hydrocarbon degradation potential in the contributing families by in silico predictions and their quanti cation by qPCR.

Sampling sites
In this work, surface sediments (0-20 cm) and water from three neighbouring channels of the La Plata River named C, D and E, located in an area surrounded by petrochemical industries and oil re neries, has been studied.Figure 1 shows a diagram of the sampled site.
Sampling was carried out by the Laboratory of Continental and Marine Cores (SACMa, Instituto de Estudios Andinos Don Pablo Groeber, UBA CONICET, Argentina) among 2018 and 2019.Depending on site accessibility, samples were collected at different points 100 to 300 m apart.Water samples were collected from the center of each site using a glass bottle and at the same location from each sediment sample.Undisturbed sediment cores were collected from the sampling sites from a platform equipped with a manually operated hammer drill with a transparent PVC liner without a core catcher.The single core tube was up to 2,000 mm in length with an internal diameter of 68 mm.The coring procedure was the same for all sampling sites: the corer was lowered into the sediment where it initially penetrated under its own weight.A driving weight ("hammer") is then lifted from the top of the corer and then released to insert the coring tube into the sediment.Visual inspection after sampling con rmed that the watersediment interface remained intact in the liner, even after the core barrel was raised to the surface.Because the sediment is very uid near the surface, the core is transported from the eld into a core tube and separated by a longitudinal section in the laboratory.The core surfaces were then cleaned, oriented, and accurately measured.Surface sediment subsamples were taken out immediately after core opening and from the middle part of each oor to avoid potential contamination with no-surface sediment due to plastic deformation of the sediment column.Samples were collected in sterile glass bottles or polypropylene tubes.At the end of the sampling, 4 samples were obtained from site D, 4 samples from site E, and 6 samples from site C (due to the presence of municipal waste at site C, 2 additional samples were included in the analysis).Samples were then stored at -20°C for DNA extraction.

Microbial Counts in water and sediment samples
Determination of heterotrophic aerobic bacteria (HAB); aliphatic hydrocarbon degrading bacteria (AlDB), and PAH degrading bacteria (ArDB) were done in sediment and water samples.Microbial counts in sediments were performed suspending 5 g (wet weight) of the sample in 45 ml 0.85% NaCl and homogenized for 30 min on a rotary shaker (150 rpm).Water samples were used directly for microbial counts.To count HAB, 5 decimal serial dilutions were performed, and an aliquot of each dilution (0.1 ml) was spread on plates containing R2A-agar medium (Reasoner and Geldreich, 1985).The agar plates were incubated at 24 ± 2°C and colonies forming units (CFU) were counted after 7 days of incubation.The most probable number (MPN) of aliphatic hydrocarbon-degrading bacteria was done using 5 ml of liquid mineral medium (LMM) supplemented with 1% of hexadecane as carbon and energy source (Vecchioli et al. 1990), in three decimal dilutions and ve replicates per dilution.Tubes were incubated at 24 ± 2°C for 21 days and positive tubes were scored as those that showed turbidity.The MPN of ArDB was determined in 96-well microtiter plates containing LMM and supplemented with a mixture of four PAH (10 g.l -1 phenanthrene, 1 g.l -1 anthracene, 1 g.l -1 uorene and 1 g.l -1 dibenzothiophene) Positive wells were those that after 21 days of incubation at 24 ± 2°C showed yellow, orange, pink or brown colours coming from PAH degradation products (Wrenn and Venosa, 1996).
Sediment DNA extraction, 16S rRNA gene ampli cation and sequencing DNA extraction was carried out from 1 g of sediment samples (stored at -20°C) using E.Z.N.A Soil DNA kit (OMEGA, USA) as indicated in the manufacturer's protocol.With the aim of increasing the DNA extraction e ciencies some modi cations in the lysis step protocol were introduced.Microbial lysis was performed in three successive stages, collecting the eluted lysate after each step.
Step one was performed using the buffers provided by the manufacturer (SLX MLUS and then DS) with the addition of 5.4 µl of lysozyme (300 mg.ml − 1 ) followed by an incubation at room temperature for 10 min to promote microbial lysis.Then, 15 µl of Proteinase K (10 mg/ml) was added and incubated at 37°C for one hour.After this time, samples were treated according to the original protocol.The supernatant obtained after passing through the protocol was separated in a polypropylene tube of 1.5 ml and kept in a refrigerator.The original pellet was subjected to a new lysis step (step 2), in which the buffers provided by the kit (SLX MLUS and DS) were replaced by 1 ml TE buffer (10 mM pH 8) and 100 µl SDS solution at 10% respectively, and the addition of lysozyme, as indicated in step 1, was included.Again, the eluate lysate was separated and kept in refrigerator.The over ow was subjected to a last step of lysis (step 3) in which it proceeded in the same way as in step 2. Once obtained, the lysates of the three steps are mixed (400 µl step 1-200 µl step 2-200 µl step 3) and the extraction continued with the puri cation and elusion steps according to the indications of the original protocol.The resulting DNA extracts were veri ed for quantity and purity with NanoDrop™ 2000/2000c spectrophotometers (Thermo Fisher Scienti c) and stored at -20°C until further analysis.Subsequently, sequences were submitted to the sequencing platform Mr DNA (Molecular Research LP, Shallowater, TX).The 16S rRNA gene V4 variable region PCR primers 515/806 were used in a single-step 30 cycle PCR using the HotStarTaq Plus Master Mix Kit (Qiagen, USA) under the following conditions: 94°C for 3 minutes, followed by 30 cycles (5 cycle used on PCR products) of 94°C for 30 seconds, 53°C for 40 seconds and 72°C for 1 minute, after which a nal elongation step at 72°C for 5 minutes was performed.Sequencing was performed at Molecular Research (www.mrdnalab.com,Shallowater, TX, USA) on an Ion Torrent PGM platform following the manufacturer's guidelines.

Bioinformatic analyses
16S rRNA raw data provided by Molecular Research sequencing service was downloaded from the BaseSpace cloud, barcodes and primers were removed using cutadapt (Martin, 2011) prior to be imported into QIIME2 (v.2021.2) (Bolyen et al. 2019).Data was denoised with DADA2 plugin using denoise-pyro script to obtain Amplicon Sequence Variants (ASVs).Further classi cation of ASVs was performed with Silva 138 trained Naive Bayes classi er 99% OTUs from 515F/806R region of sequences.A ltering approach was carried out to remove low-abundance sequences (< 0.1% of the average depth of the samples), mitochondrial/chloroplast 16S rRNA sequences and reads that were not classi ed at the phylum level.The resultant table with ASV frequencies per sample was rare ed using the lower value of total frequencies in the set of samples.Alpha-(observed features, Shannon, Simpson, Faith) and betadiversity metrics (weighted UniFrac) were calculated.To determine phylogenetic related metrics (Faith, UniFrac), a phylogenetic tree was constructed using fragment insertion strategy using the SILVA reference set (Silva 128 SEPP reference database).To predict the abundance of gene families based on KEGG (Kyoto Encyclopedia of Genes and Genomes) orthologs (KOs) and higher-level pathways (MetaCyc) the software PICRUSt2 (v2.4.1) was used (Douglas et al. 2020).PICRUSt2 was executed following the pipeline established by the software developers.Prediction accuracy was estimated by the Nearest Sequenced Taxon Index (NSTI) scores.Langille et al. (2013), proposed that weighted NSTI values < 0.06 were considered as "good" (well covered), values between 0.06-0.15as intermediate, while values > 0.15 are considered as high (not well covered).However, a weighted NSTI score > 0.15 does not necessarily mean that the predictions are meaningless.
qPCR Gene Quanti cation Quanti cation of 16S rRNA, C230, PAH-RHDα and bamA genes was performed by qPCR on Thermocycler Rotor Gene Q 2X (Qiagen).All qPCR assays were made in a total volume of 10 µl, containing 5 µl of SsoAdvanced Universal SYBR Green Supermix (BIORAD), 5 µM of each primers solutions, 2 µl of the template DNA or sterile, nuclease, RNA and DNA free water for the negative control (NTC).To check the speci city of the amplicon obtained, melting curves were performed after ampli cation, and to check the presence of inhibitors in DNA samples, they were spiked with 10 4 copies of plasmid DNA containing the functional degrading genes and ampli ed together with the same set of non-spiked samples and control DNA (Madueño et al. 2018).In most samples inhibition was negligible and in those samples that showed inhibition, DNA extracts were diluted.Standard curves were performed from serial 10-fold dilutions of plasmid DNA containing the respective target gene.The parameters of the standard curves obtained were in concordance with the recommendation described in MIQE guidelines (Bustin et al. 2009).Limit of quanti cation were calculated (LOQ), showing a reliably detection of among 30, 50 copies, 430 and 2000 copies.µl− 1 of C230, PAH-RDH GN, bamA and 16S rRNA genes respectively (Forootan et al. 2017).All qPCR reactions were carried out in technical triplicates.The software Rotor-Gene Q Series Software version 2.3.1 (Build 49) was used for the analysis of results and gene concentrations were reported as gene copies per ng of DNA in the extract preparation.The qPCR programs are detailed in Table S1.

Statistics
To determine the probability that two sites (C vs D, D vs E and C vs E) can have equal mean in physicochemical variables, microbial counts and in the hydrocarbon genes quanti cation, the Student's Ttest or Kruskal-Wallis tests were performed considering a p-value < 0.05 as signi cant threshold.
Signi cance of alpha diversity differences between sites was tested using the Kruskal-Wallis test with QIIME2 and p-values < 0.05 were considered signi cant.Beta diversity signi cance between sites was tested using a pairwise PERMANOVA done by QIIME2 software and p-values < 0.05 were considered signi cant.Taxonomical differences between sites at class and family level were studied applying pairwise and all groups Mann Whitney/Kruskal Wallis statistical tests.Pearson correlation was used to nd a linear correlation between the predicted and quanti ed numbers of hydrocarbon-degrading genes.

Results
Physicochemical and microbiological characterization in the sites Physicochemical characterization of water samples showed that total hydrocarbons (TH), aliphatic hydrocarbons C6-C35 (AlH), aromatic hydrocarbons C7-C35 (ArH), PAH, and BTEX were undetectable (< 30 mg.l -1 ; < 3 mg.l - ; < 3 mg.l - ; 2 mg.l -1 ; <0.006 mg.l -1 respectively) in the three sites.pH value in water from site C was close to neutrality and in D and E sites the waters were alkaline.Site C had the minor quantity of dissolved oxygen in comparison with water samples from E and D sites (Table 1).Counts of aliphatic degrading bacteria (AlDB) in water from site E where minor than those in site D, while aromatic degrading bacteria (ArDB) counts in water from C where signi cantly lower than those in site D.
The characterization of sediments showedthe lowest concentrations of TH, AlH and ArH in site C compared to sites D and E (Figure 2).PAH, BTEX concentration and cultivable bacterial counts (ArBD, AlBD and HAB) showed no signi cant differences among sites (Figure 2, Table 1).

Microbial diversity in the sediments
16S rRNA gene sequencing generated a total of 648044 read counts, ranging from 32171 to 72309 sequences per sample.A total of 1887 ASVs remained after ltering.Sequencing depth was enough to cover the diversity of the samples demonstrated by the values of Good's coverage index were close to 1 (data not shown) and the observed plateau in the rarefaction curves (Figure S1).The minor number of reads (19098) was used to normalize the ASVs table to the same sample depth to performed diversity analyses.

Alpha diversity
The Shannon and Simpson diversity indexes, and the observed features are shown in Figure 3. Kruskal-Wallis statistical tests were applied for all groups and in pairwise among groups C, D and E. The results obtained (Figure 3), showed that the sites had similar diversity values of Shannon and Simpson diversity indexes and observed features (Kruskal-Wallis all groups p>0.05).By contrast, faith´s phylogenetic diversity index (faith_pd) was signi cantly lower in site C than in D (Figure 3) (Kruskal-Wallis pairwise p>0.05), indicating a closer phylogenetic relationship between the ASVs of site C.

Beta diversity
The differences in the bacterial community structures among sites, and in relation to the TH, were studied by a principal coordinate analysis based on weighted UniFrac distance metric (pCoA) (Figure 4).PCoA showed that samples from each site clustered together, suggesting similarities between from those of the same site.PERMANOVA con rmed that the groups plotted in PCoA were signi cantly different (F = 9.99; p-value = 0,001).However, according to PERMANOVA test (Figure 4), the content of TH would seem not to structure the community of the sites (p-value > 0.05).
As it can be seen in Figure 6, the relative abundance of Rhodobacteraceae and Sphingomonadaceae families, which are part of Alphaproteobacteria class, was signi cantly higher in site C (11 and 4 %) in comparison with D (1.2 and 0.1 %) and E (1 and 0.1%); as for Rhodocyclaceae family, belonging to Betaproteobacteria class, was signi cantly higher (3.8%) in C compared to D (0.5%) and E (0.3%) sites (Figure 6).Also, in C site Pirellulaceae family (members of Planctomycetes) proportion was signi cantly higher than in site D (Figure 6).Although Clostridia was higher in site C, no family belonging to this class was found in a relative abundance greater than 5% and with signi cant differences among the three sites.Anaerovoraceae was found in C site (2.6%) with lower abundances values than in D (0.05%) and E (0.01%) (P<0.05).The families Sutterellaceae and Methylomonadaceae, part of Gammaproteobacteria class,showed signi cantly lower values of relative abundances (0.3 and 0.4 %) in C than D site (1.2 and 3.5%).Methanoregulaceae andthe Marine Benthic Group D DHVEG-1, families that belong to Methanomicrobia and Thermoplasmata, respectively showed higher relative abundances in site D (6 and 7%).One of the families belonging to Syntrophia, were signi cantly higher proportion in site D (5.5 %) in comparison with C (0.4%) and E (1.7%) sites (Figure 6).

In silico metagenome functional prediction
PICRUSt2 for prediction of metagenome functions was applied to compare the energy metabolism and autotrophic processes in the sediments of the three sites.Fermentation, methanogenesis and carbon xation pathways were searched.Initially the average weighted Nearest Sequenced Taxon Index (NSTI) was calculated for C, D and E sites as a measure of the accuracy of PICRUSt2; obtained values were 0.193, 0.296 and 0.304 respectively.Due to the accuracy of PICRUSt2 prediction decreases with increasing NSTI values, it is expected that the samples of these sites contain much unexplored diversity, and therefore PICRUSt2 outcomes should be interpreted with caution.
Table 2 shows that in site C, the pyruvate fermentation to acetate and lactate II (PWY-5100) and to acetone (PWY-6588), the mixed acid fermentation (FERMENTATION-PWY), the acetyl-CoA fermentation to butanoate II (PWY-5676), the hexitol fermentation to lactate, formate, ethanol and acetate (P461-PWY), Llysine fermentation to acetate and butanoate (P163-PWY) and the succinate fermentation to butanoate (PWY-5677) were signi cantly greater than in site D. By contrast, methanogenesis from acetate (METH-ACETATE-PWY) and from H 2 and CO 2 (METHANOGENESIS-PWY) were in minor proportion in site C than in sites D and E. Besides, carbon xation pathways (CODH-PWY) and (CALVIN-PWY) were signi cantly lower in site C than in D (Table 2).E site showed an intermediate behavior, in most cases without signi cant differences in comparison with C or D sites.

qPCR quanti cation of functional genes related to hydrocarbon degradation
The primers used for the quanti cation of functional genes by qPCR were selected by considering the major contributing families of the functional prediction.The primers designed by Cébron et al. (2008) (PAH-RDHa GN R/F), target the ampli cation of 1,2 naphthalene dioxygenase gene of Sphingomonas (sensu latu), among other genera.The primer set C 230F/R was used for the ampli cation of catechol 2,3 dioxygenase gene of Sphingomonas (sensu latu) and other genera (Sei et al. 1999;Sipilä, 2009).Finally, the primers proposed by Staats et al. (2011) for amplifying bamA gene in Syntrophus genus, among other genera.
The quanti cation of C230, PAH-RDH GN, and bamA relative to the copies of 16S rRNA gene (Harms et al. 2003) is shown in Table 3.It could be observed a similar relative quantity of C230 and PAH-RDH GN genes in the three sites.By contrast, bamA gene was found in the following decreasing order: site D > site E > site C.

Relationships between predictions and quanti cation of functional hydrocarbon degradation genes
The accuracy of functional gene predictions was evaluated correlating the functional genes quanti ed by qPCR (copies functional genes/copies 16S rRNA) with the KO abundance (sequences of each KO (K00446, K07539, K14579)/total number of KO predicted by PICRUSt2) through Pearson correlation (r) and coe cient of determination (r 2 ) (Agrawal et al. 2019; Emsens et al. 2020).The r and r 2 were low for the three genes, -0.24 and 0.06 respectively for lineal regression among C230/16S rRNA and K00446/total KO; -0.60 and 0.37 respectively for PAH-RDH GN/16S rRNA and K14579/total KO; and 0.53 and 0.28 respectively for bamA/16S rRNA and K07539/total KO.These results suggest no linear correlation between PICRUSt2 prediction and qPCR quanti cation of these degrading genes.

Discussion
In this study, we have investigated and compared three neighboring, anthropized freshwater courses through physicochemical characterization, diversity of bacterial communities, the prediction of metagenome functions and the prediction and quanti cation of genes involved in hydrocarbon degradation.Physicochemical similarities and differences in waters of the sites were found: hydrocarbons in water samples were below the detection limit, and as a consequence, counts of hydrocarbon degrading populations (AlDB and ArDB) showed lower values in comparison with counts reported by other authors who also studied hydrocarbon contaminated water (Adebusoye et al. 2007).Water from site C showed amounts of dissolved oxygen compatible with hypoxic water courses (Rabalais et al. 1994) and lower pH values than sites D and E. These results suggest that the site C could be considered as a heterotrophic ecosystem.Also, sediments from site C presented the lowest total hydrocarbon concentration.
Taxonomic assignment of the sediments showed that the relative abundance of Rhodobacteraceae, Rhodocyclaceae, Anaerovoracaceae, and Sphingomonadaceae families were higher in site C in comparison with the other sites, and Bacteroidetes vadin HA_17 was also higher in C than in site E. Members of the Rhodobacteraceae family include anoxygenic photoheterotroph and chemoorganoheterotrophs microorganisms, usually found in heterotrophic aquatic environments (Pujalte et al. 2014).Rhodocyclaceae family include anaerobe members that perform propionic acid fermentation and degrade aromatic compounds (Oren, 2014).Anaerovoracaceae belongs to anaerobic and fermentative metabolism of Clostridia class, and is commonly found in fecal microbiome of mammals (Kanazawa et al. 2021;Theelen et al. 2021).In agreement with the bacterial metabolisms mentioned above, the functional prediction showed higher proportions of MetaCyC fermentation pathways in C site.Meanwhile, the families Rhodocyclaceae, Sphingomonadaceae and Bacteroidetes vadin HA_17 could be related with the hydrocarbon contamination in site C. Sphingomonadaceae is widely known for its ability to degrade aromatic compounds (Festa et al. 2017;Madueño et al. 2015, White 1996) and Bacteroidetes vadin HA_17 was assigned in co-occurrence studies as one of the key participants in the anaerobic biodegradation of PAH (Han et al. 2021).By contrast, members of Desulfobacteria and Sva0485 classes, uncultured members of Syntrophales order, the families Methanoregulaceae and Methylomonadaceae were in minor proportions in C in comparison with the sites D and E. Desulfobacteria class includes sulfate reducing bacteria (SRB) (Waite et al. 2020) and Sva0485 class was recently related with benzene biodegradation in the presence of sulfate or under methanogenic conditions (Edwards et al. 2021).Syntrophales order includes anaerobic chemoorganoheterotroph members with a fermentative type of metabolism, and members that can only grow in the presence of H 2 /formate-utilizing partners in syntrophic associations (Galushko andKuever, 2021, 2020;Waite et al. 2020).Methanoregulaceae family includes methanogenic archaea that use H 2 /CO 2 and sometimes formate as substrates for methanogenesis, which require acetate for growth (Oren, 2014).Also, the methanotroph family Methylomonadaceae which utilizes methane as source of carbon and energy (Martinez-Cruz et al. 2017).All this groups of bacteria work in syntrophy within anoxic environments during anaerobic fermentation of organic matter (Morris et al. 2013).The minor relative abundances of Desulfobacteria, Sva0485, Methanoregulaceae, Methylomonadaceae, and the low proportion of methanogenesis and carbon xation pathways predicted in site C, suggest that the methane cycle would be less relevant in the C site than in the other sites.This behavior could be attributed to the urban discharge that dumps organic matter acting as a selection pressure, probably decreasing the autotrophic processes (like e.g., hydrogenotrophic methanogenesis and carbon xation), and selecting members of fermentative and heterotrophic families adapted to degrade easily degradable organic matter and/or hydrocarbons.
Unlike site C, high levels of dissolved oxygen and pH in water from site D, are indicatives with high photosynthetic activity.Oxygen production and pH increase due to the reduction of carbon dioxide (CO 2 ) during photosynthesis could implies that gross primary production exceeds the microbial community respiration (autotrophic ecosystem) (O'Boyle et al. 2013).
Sediments from site D were characterized with higher concentrations of total hydrocarbon than the other sites and higher relative abundance of members of Syntrophia, SvA0845, Methanomicrobia, Thermoplasmata classes and Methylomonadaceae family.The presence of Syntrophia and Methanomicrobia members have been reported previously in surface sediments from site D, and was related with "ecosystem health" (Madueño et al. 2021).The Marine Benthic Group D DHVEG-1 family, belonging to Thermoplasmata class, is frequently found in anoxic sediments (Schneider et al. 2013), and it was associated to protein remineralization (Lloyd et al. 2013).The Rhizobiales Incertae Sedis, Pirellulaceae and Bacillaceae families were in lower relative abundances in site D than in the other sites.Some authors have identi ed that high content of Rhizobiales Incertae Sedis and Pirellulaceae families in bacterioplankton may be taxa indicators of the degradation status of river ecosystems (Yang et al. 2019), therefore the lower proportions in D could indicate that this site presents a healthier ecosystem than the other two sites, despite the hydrocarbon content.Clostridia class and Bacillaceae family were also in minor relative abundances in site D comparison with site C, probably due to the lower availability of easily degradable organic matter than in C. The family Bacteroidetes vadin HA17, classi ed as a keystone in PAH biodegradation under nitrate-reducing conditions (Han et al. 2021; Laczi et al. 2020), was found in higher proportions in D than E site, suggesting that the anaerobic degradation of hydrocarbons processes are also present.Functional prediction in site D revealed that the methanogenesis and carbon xation pathways were in higher abundances than in C.These results suggest that, together with the presence of hydrocarbon degradation processes, autotrophic, syntrophic, and methanogenic mechanisms, involved in natural activities commonly found in anoxic sediments, are also present in site D.
The results from site E suggested that the microbial community exhibits both autotrophic and heterotrophic metabolic behavior, which is intermediate between the characteristics observed in sites C and D. Dissolved oxygen values of their waters seem to be related with hypoxic courses.In comparison to sites C and D, the functional prediction in E revealed no signi cant differences in the abundance of mixed acid, pyruvate, succinate, and hexitol fermentation pathways, as well as intermediate levels in the abundance of methanogenesis-related pathways.Even more, the alkaline pH and the carbon xation pathway found indicated that photosynthetic activity was also present.In concordance, taxonomical assignment of bacteria in the sediment samples showed signi cant intermediate values of Clostridia, SV0485 and Thermoplasmata classes in site E in comparison with C and D, proportions of Alphaproteobacteria, Bacilli, Actinobacteria, Desulfobacteria and Methanomicrobia were close to the ones found in site D, and proportions of Syntrophia and RBG-16-55-12 classes resembled site C.
The principal coordinate analysis (pCoA) revealed that the bacterial community compositions of the sites were not structured by the hydrocarbon content.Similar results were found in sediments from three anthropized mangrove forests from Guanabara Bay, Brasil, where the diversity of bacterial communities in urban mangrove sediments had a speci c bacterial community structure that was not affected by the different levels of hydrocarbon pollution (Marcial Gomes et al. 2008).Similarly in a recent study, Madueño et al. (2021) showed that were the environmental variables and not the hydrocarbon concentration the variables structuring the community composition in surface sediments from a freshwater course.From the pCoA plot, samples from E site clustered in the middle of the plot, between C and D, suggesting an intermediate arrangement of bacterial communities.From the pCoA chart, C and D grouped distant, suggesting less related bacterial community arrangements from site E, were clustered in the middle of the chart, between C and D.
The hydrocarbon degradation potential of the three anthropized freshwater sites was studied by predicting the number of functional genes with PICRUSt2 and evaluating the correlation of the prediction by qPCR quanti cation.The comparison between the prediction and the quanti cation showed different trends: (1) the number of sequences predicted for C230 and PAH-RDH GN genes was higher in D and C respectively in comparison with site E, however, the quanti cation of these genes normalized with total bacteria reveled that their abundances were similar in the three sites; (2) bamA gene prediction showed similar number of sequences in the three sites, nevertheless, qPCR assay quanti ed higher numbers of this gene in site D. In accordance with this qualitative comparison among prediction and quanti cation results, the accuracy of functional gene predictions by Pearson correlation (r) and coe cient of determination (r 2 ) showed no correlations among PICRUSt2 prediction and qPCR quanti cation.The absence of correlation is probably due to the low relative abundance of the target microbial group or function and/or to small number of reference genomes (Agrawal et al. 2019).Thus, it is important to judge the level of uncertainty in PICRUSt2 predictions for single functional genes based on qPCR data.
Other authors have found correlation among PICRUSt analysis and qPCR quanti cation of the functional genes related to nitrogen cycle genes (amoA, nirK, and nosZ) when predicting and quantifying these genes in lab-scale partial nitritation reactors (Agrawal et al. 2019).These correlation ndings with the PICRUSt2 results show that they can be accurate when the functional genes selected for qPCR quanti cation are associated with the selection pressure of the studied system.The authors compared qPCR data with the prediction of selected functional genes, and they estimated the potential uncertainties in these data analyses, describing that the NSTI scores can give some impression about the overall precision of the prediction, but it is not speci c or transferable to any single functional gene (Agrawal et al. 2019).Probably, the absence of precision in our results is related with the fact that hydrocarbons are not structuring the community composition in the sites.
The taxa contributing with the hydrocarbon degrading genes were predicted by PICRUSt2.When focusing on the NSTI values of the families that were major contributors to the analysed KO, value near zero were mostly found demonstrating an accurate prediction and supporting the qPCR primers selection based on reported literature.The well-known hydrocarbon degrading family Sphingomonadaceae (Balkwill et al. 2006;Macchi et al. 2018;Madueño et al. 2011;White et al. 1996) may be the taxa that mostly contributed with the aerobic functional genes involved in hydrocarbon degradation in C site.Comamonadaceae family and uncultured members of Gaiellales order were found to be contributing mostly to the C230 KO in D site.Some authors studied the diversity and distribution of the 2,3-dioxygenase catechol-linked genes of the Comamonadaceae family, suggesting that they likely play an important role in the degradation of BTEX under hypoxic conditions.The order Gaiellales, that belongs to Actinobacteria class, also includes members capable of degrading hydrocarbons (Chen et al. 2020; Kuyukina and Ivshina, 2010).However, the Gaielalles are represented by only one genus and contain only some species (Albuquerque et al. 2011).Probably, due to PICRUSt2 use of an extended ancestral-state reconstruction algorithm to predict which gene families that are present in the sample, this gene may be present in the nearest ancestral to Gaielalles, therefore it was predicted in this order.Although they have been detected frequently by high-throughput sequencing, their functions and characteristics remain unknown due to their di cultly with culture (Chen et al. 2021).With respect to the families found contributing to C230 in site E, we should be cautious with Alyciclobacillaceae as it presented elevated values of NSTI for the prediction of this gene, for this reason PICRUSt2 results should be interpreted with caution.
Syntrophaceae family almost exclusively contribute to bamA gene in the three sites, while qPCR results showed that this gene was in higher copy number in D site, where methanogenic mechanisms were predicted to be present.This result agrees with the observations of Sun et al. ( 2014), who found that their methanogenic microcosms only had sequences that clustered with representative bamA gene sequences from Syntrophus isolates.Syntrophus presence has been associated with natural processes (Staats et al. 2011;Madueño et al. 2021), because aromatic compounds are readily found in the environment and they can be supplied naturally as plant-derived phenolics, degradation products of lignin and humic acids, aromatic amino acids etc.This result further supports the idea that even though site D was the most polluted one, probably higher content of functional anaerobic gene bamA was not exclusively correlated with hydrocarbon content.

Conclusions
Given that all three water courses had some degree of hydrocarbon contamination, the composition of the bacterial communities associated with the analyzed sites revealed that site C was the most affected by anthropic activities, most likely due to the presence of urban discharges rather than hydrocarbon pollution.Despite being D the most hydrocarbon polluted site, it presented autotrophic, syntrophic, and methanogenic mechanisms commonly involved in natural processes in anoxic sediments.Therefore, we suggest that the contamination with hydrocarbon in these three sites was not enough selection pressure for structure the bacterial community composition.Strategies for the sustainable management of anthropized freshwater courses should be focused on collecting evidence on which are the environmental variables, whether natural or induce by pollutants, have a selection effect on microbial diversity and may affect both natural processes such as limiting site recovery.This work further supports that the composition of bacterial communities is a fundamental parameter to assess the ecological impact of human activities on the freshwater environment.

Figure 3 Box
Figure 3

Figure 4 Principal
Figure 4

Figure 5 Relative
Figure 5

Figure 6 Families
Figure 6