Microbial Community Structure and Diversity in Drinking Water Supply, Distribution Systems as well as Household Point of use Site in Addis Ababa City, Ethiopia

Understanding the ecology of microbiomes in drinking water distribution systems is central to delivering safe drinking water. Although cultivation-based methods are routinely employed to monitor drinking water quality, cultivation of specific indicator organisms alone does not always guarantee safe drinking water delivery. Complex microbiomes in drinking water distribution systems reduce treatment effectiveness, leading to poor-quality water that in turn affects human and animal health. Drinking water treatment and distribution systems harbor diverse microbiota despite efforts to improve water infrastructure, and several waterborne diseases have become serious problems for the water industry, especially in developing countries. Intermittent water supply, long water storage times, low water pressure in distribution systems, storage tankers and pipes, and contaminated source water are among the many factors responsible for low drinking water quality, which in turn affects people's health. The aim of this study was to explore microbial diversity and structure in water samples collected from source water, treated water, reservoirs, and several household point-of-use locations (taps). High-throughput Illumina sequencing targeting the V4 region of the 16S rRNA gene, following the Illumina protocol, was employed to analyze bacterial community structure. The core dominant taxa were Proteobacteria, followed by Firmicutes, Bacteroidetes and Actinobacteria. Gammaproteobacteria were dominant among the Proteobacterial classes across all sampling points. Opportunistic bacterial genera such as Pseudomonas, Legionella, Klebsiella, Escherichia and Acinetobacter, as well as eukaryotic microbes such as Cryptosporidium, Hartmannella, Acanthamoeba, Aspergillus and Candida, were also abundant along the distribution systems.
The shift in microbial community structure from source to point-of-use locations was influenced by factors such as residual free chlorine, intermittent water supply and long storage times at the household. The complex microbiota present at the different sample sites receiving treated water from the two treatment plants (Legedadi and Gefersa), from source water to household point of consumption across the distribution systems in Addis Ababa, creates a drinking water quality problem that in turn causes significant health problems for both humans and animals. Treatment ineffectiveness, disinfection inefficiency, poor maintenance, and leakage of sewage and other domestic wastes are a few of the many factors responsible for degraded drinking water quality in this study, putting health at high risk and thus leading to morbidity and mortality. The findings of this research provide important baseline information for understanding the microbial profiles of drinking water along source water and distribution systems.

Background
Centralized drinking water treatment and distribution is the primary attribute of safe water supply; however, it is also among the biggest technological challenges, as these systems must keep pace with rapid population growth and the increased drinking water demand that comes with growing economic prosperity. Despite advances in the development of urban water infrastructure, waterborne disease is still common in the cities of developing countries because of intermittent treated water supply, which leads to excessive storage, low-pressure events, poor integrity of distribution system piping, and contaminated source waters [1][2][3][4]. The Sustainable Development Goals (SDGs) target these challenges and propose to achieve universal and equitable access to safe and affordable drinking water for all by 2030 [5]. Ethiopia is among the countries in Sub-Saharan Africa with the most insufficient access to safe drinking water [6,7]. Addis Ababa, the capital city, suffers from an increasing human population, unregulated urban growth [8], poor waste management practices [9], intermittent water supply services [10,11], and insufficient operation and maintenance of treatment and distribution systems [12], all of which are driving forces for drinking water contamination.
Although more attention is being directed to understanding the microbial challenges experienced by Addis Ababa's drinking water, few studies use culture-independent methods. For example, several studies on the quality of drinking water in various regions across Ethiopia and in Addis Ababa used culture-dependent techniques focusing on specific indicator organisms [13][14][15]. Despite their convenience for routine monitoring, culture-based methods tend to underestimate the diversity and potential presence of both opportunistic and pathogenic microorganisms because of poor cultivability [16] or growth competition by nonpathogenic heterotrophic plate count (HPC) bacteria [17]. Next-generation sequencing platforms are useful for comprehensive assessment of drinking water microbiota and have begun to be employed in both resource-rich and resource-constrained settings [18]. With next-generation sequencing techniques, it is possible to detect and identify the presence and diversity of pathogenic microorganisms in ways that cannot be achieved with classical methods [19].
This study aimed to elucidate the microbial communities present, their diversity and their structure across the water supply systems of the two drinking water treatment plants serving Addis Ababa, from source through treatment and along the distribution systems to storage and neighborhood taps.
The specific objectives of this study were: i) to study the diversity of microorganisms across the water supply systems; and ii) to study the dynamics and community structure of microorganisms across the water supply systems. Illumina sequencing was used with both 16S and 18S targets. The microbial diversity across the different points of the distribution lines and the household point-of-use locations in the city was comprehensively investigated.

Study Site Description and Sample Locations
The study was conducted in Addis Ababa, the capital of Ethiopia, which lies at an elevation of 2,300 m above sea level. The city relies largely on surface water and groundwater as the main sources for drinking, domestic and industrial purposes. Two surface water treatment plants, Legedadi and Gefersa, provide 54% of the city's treated water (Fig. 1). Legedadi, established in 1970 GC (Gregorian calendar), is located 30 km north-east of Addis Ababa. It is the largest drinking water treatment plant, with a treatment capacity of 195,000 m3/day, and delivers about 47% of the daily distributed water supply for Addis Ababa. Gefersa, established in 1940 GC, is located 20 km north-west of the city, has a treatment capacity of 30,000 m3/day and provides only 7% of the city's distributed water. The Legedadi and Gefersa treatment plants receive surface water from the Dire and Gefersa dams, respectively. The treatment plants use conventional treatment comprising pre-chlorination, coagulation, flocculation, sedimentation, sand filtration and post-chlorination, with the goal of maintaining an average chlorine residual of 0.8 mg/L in the distribution system.
For this study, a total of 38 samples were collected from several locations from source to tap along both the Legedadi (n=22) and Gefersa (n=16) systems (Table S1). Samples included source water entering the treatment plant (LS and GS, n=1 each); finished drinking water before entering the distribution lines (LF and GF, n=1 each); reservoirs in the distribution system that stored treated water (LR, n=7 and GR, n=2); household taps inside individual houses (LT, n=8 and GT, n=4); and storage tankers of individual houses (LS, n=5 and GS, n=8). Samples were collected in July 2015 based on established sampling procedures [20]. Glass sample bottles were sterilized by autoclaving (121 °C, 15 min) and then supplemented with sodium thiosulfate to quench residual chlorine. Duplicate 2 L water samples were collected per location. Prior to sample collection from each reservoir (R samples) and household standpipe (T and S samples), the water was allowed to run for 10 minutes to ensure representative sampling and to avoid sampling stagnant water.
Microbial analysis of water samples
DNA extraction from water samples
Two liters of water per location were filtered through polycarbonate membrane filters (0.22 µm pore size, EMD Millipore™ GTTP02500) and frozen immediately at -20 °C. All filtered samples were shipped frozen on ice packs via overnight courier service to the Environmental Biotechnology Laboratory at the University of Michigan, USA, and arrived frozen. Each membrane filter was cut into four equally sized pieces using a sterile knife, and the pieces were placed into a single vial to facilitate DNA extraction. Total genomic DNA was extracted from each vial using a PowerSoil DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA) following the manufacturer's protocol and instructions. The extracted DNA of each sample, with a total volume of 50 µL, was then ready for high-throughput sequencing. The concentration and purity of the extracted DNA were determined using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) (Table S2), and the DNA was stored at -20 °C until sequencing.
16S rRNA and 18S rRNA gene PCR amplification and sequencing
Extracted DNA was submitted to the Core Genome Sequencing unit at the University of Michigan, USA.
For detection of eukaryotic pathogens, 18S rRNA gene fragments were amplified using the universal forward and reverse primers Euk-A7F (5′-AACCTGGTTGATCCTGCCAGT-3′) and Euk-570R (5′-GCTATTGGAGCTGGAATTAC-3′), which target the V1-V3 region of the 18S rRNA gene. PCR amplification was done in a total working volume of 770 µL containing 654.5 µL of sample library (containing AccuPrime Taq, PCR water and 1X buffer II), 15.4 µL of PhiX V3 as a control, and 100.1 µL of sample DNA. Library preparation, cluster generation and sequencing were carried out on the Illumina MiSeq platform.
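As a quick consistency check on the volumes reported above, the following minimal sketch sums the three components and expresses each as a share of the total working volume. It uses exact decimal arithmetic to avoid floating-point rounding; the dictionary keys are illustrative labels only, not names from any protocol.

```python
from decimal import Decimal

# Component volumes in microliters, as reported in the text.
components_ul = {
    "sample library": Decimal("654.5"),
    "PhiX V3 control": Decimal("15.4"),
    "sample DNA": Decimal("100.1"),
}

# Total working volume obtained by summing the components.
total_ul = sum(components_ul.values(), Decimal("0"))

# Share of each component in the total working volume, in percent.
shares_pct = {name: vol / total_ul * 100 for name, vol in components_ul.items()}

for name, pct in shares_pct.items():
    print(f"{name}: {components_ul[name]} uL ({pct:.1f}%)")
print(f"total: {total_ul} uL")
```

The components add up to the stated 770 µL, with the PhiX control making up 2% of the mix.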

Bioinformatic and Statistical Analysis
The 16S rRNA reads generated from the Illumina MiSeq run were filtered, clustered, taxonomically assigned and curated using mothur (version 1.35.1) following the MiSeq SOP (https://www.mothur.org/wiki/MiSeq_SOP) [21]. Sequencing and PCR errors were reduced using the make.contigs command. Furthermore, sequences that did not align to the SILVA database were removed through sequence screening, and rare sequences were merged into larger sequences by preclustering [22]. The chimera.vsearch command was used to detect and remove chimeric sequences in mothur [23,24]. Classification of high-quality sequences into OTUs at 3% distance (97% similarity) was carried out based on the Ribosomal Database Project (RDP) training set (version 9) [25]. Clustering of OTUs at 3% dissimilarity was done using the cluster.split command. Rarefaction analysis and the Chao1, Shannon diversity and evenness indices were computed using mothur version 1.35.1 [21]. Non-metric multidimensional scaling (nMDS) analysis based on the Bray-Curtis dissimilarity index was performed using PAST software (version 3) to cluster sampling sites by similarity of microbial composition. The relative abundance bar graphs and box-and-whisker plots at the phylum, class and genus levels of the 16S rRNA sequences were plotted using IBM SPSS (version 23) and XLSTAT tools. Heatmaps based on OTUs at the genus level were generated using ClustVis (a web tool for visualizing clustering of multivariate data) [26]. The 18S rRNA gene sequences were submitted to the MG-RAST web-based sequence analysis pipeline (http://metagenomics.anl.gov/) for qualitative analysis of the presence of public health-relevant parasitic and free-living eukaryotic organisms [27]. The 18S reads were used only for detection of the organisms.
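To make the diversity estimators named above concrete, the sketch below computes the Shannon index, Shannon (Pielou) evenness and Chao1 richness from a vector of per-OTU read counts. This is an illustration only, not the mothur implementation: the bias-corrected form of Chao1 and the natural logarithm are assumptions, and the counts are hypothetical.

```python
import math

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Shannon evenness J' = H' / ln(S), where S is the observed OTU count."""
    observed = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(observed) if observed > 1 else 0.0

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)  # OTUs seen exactly once
    doubletons = sum(1 for c in counts if c == 2)  # OTUs seen exactly twice
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

# Hypothetical per-OTU read counts for one sample (not data from this study).
otu_counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]

print("Shannon:", round(shannon(otu_counts), 3))
print("Evenness:", round(pielou_evenness(otu_counts), 3))
print("Chao1:", chao1(otu_counts))
```

Richness estimators such as Chao1 depend on the number of rare OTUs (singletons and doubletons), which is why they are sensitive to sequencing depth.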

Results and Discussion
A total of 949,941 quality reads were processed, and 22 phyla were retrieved from 729,385 assembled sequences in the samples taken from the 38 sample locations. Sequence clustering yielded 604 unique bacterial OTUs and 1,230 eukaryotic OTUs. The rarefaction curves showed that most of the sampling points failed to reach a plateau, which suggests that sampling frequency and sequencing depth should be increased to obtain enough reads (Fig. S1). The non-metric MDS ordination plot showed marked differences in the similarity of bacterial composition based on the distribution of OTUs (Fig. 2).
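The ordination rests on pairwise Bray-Curtis dissimilarities between samples. The following minimal sketch shows how such a dissimilarity could be computed from OTU count vectors; the site names and counts are hypothetical, and the use of raw counts (rather than normalized abundances) is an assumption made for illustration.

```python
from itertools import combinations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("count vectors must list the same OTUs in the same order")
    denom = sum(u) + sum(v)
    if denom == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / denom

# Hypothetical OTU count vectors (same OTU order in every sample).
otu_counts = {
    "site_1": [10, 0, 5],
    "site_2": [10, 0, 5],
    "site_3": [0, 8, 2],
}

# Dissimilarity for every unordered pair of sites.
pairwise = {
    (a, b): bray_curtis(otu_counts[a], otu_counts[b])
    for a, b in combinations(otu_counts, 2)
}

for pair, d in pairwise.items():
    print(pair, round(d, 2))
```

A value of 0 means two samples have identical OTU counts and a value of 1 means they share no OTUs; an ordination such as nMDS then places samples with low pairwise dissimilarity close together.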
Based on the analysis, samples with similar bacterial composition clustered together, whereas those with different microbial assemblages were dispersed apart, as shown in Fig. 2a (Table 1).
Table 1 Community diversity and richness estimators of the 16S rRNA amplicon sequences of samples from the two treatment plants along the distribution lines and their reservoirs. Note that for each sample name, the first letter denotes the source of water (G = Gefersa and L = Legedadi), and the second letter denotes whether it is the untreated source water (S), finished treated water (F), water from a reservoir (R), water from the end of pipe/tap (T) or water from a storage tanker (S). The descriptions of sample points of each location and their distance from the respective treatment plants is found at
Water leaving the Legedadi treatment plant (LF) had higher diversity and lower richness than the source water (LS), but water taken from the five reservoirs showed an increasing trend in bacterial richness with distance from the treatment plant. This might be due to loss and/or rapid decay of chlorine residuals along the reservoirs, which are located at varying distances from the treatment plant where chlorination is applied. Notably, LR5, the reservoir located farthest from the plant, had higher diversity and richness. Among the end-of-pipe water samples, an increasing trend in diversity and richness was observed with respect to the reservoirs from which they are sourced. For example, LT1 and LT2, which are sourced from LR1, showed increasing bacterial diversity (1.5 to 1.7) and richness (104 to 376).
Water leaving the Gefersa treatment plant (GF) showed a slight decrease in richness compared with the source water (GS). Water samples from the two reservoirs (GR1 and GR2) showed a slight increase in diversity and richness with distance from the treatment plant. Samples from the four taps (GT1-GT4) did not show a consistent diversity or richness pattern with distance from the treatment plant.
The samples from storage tankers of both treatment plants showed variable degrees of diversity and richness. This could be explained by differences in tanker material, age of the tanker, age of the water held and cleaning frequency of the tanker. The increasing trend in bacterial diversity and richness in the tap water samples of this study may be due to pipe leakage and to the development of biofilms on pipe walls, where bacteria may be protected against residual disinfectants [28].
In contrast to this study, more diverse microbial taxa were detected across metropolitan drinking water distribution systems in different geographic environments, where factors such as source water, physicochemical parameters and treatment process affected microbiota diversity across the distribution systems [29].
Members of phyla Firmicutes and Proteobacteria dominate reservoirs of the two treatment plant systems
A total of 22 phyla were retrieved from the assembled sequences from the water samples. The untreated raw water (LS) from the Legedadi treatment plant was dominated by members of the phylum Bacteroidetes followed by Actinobacteria, accounting for 88% and 87% relative abundance of the total sequences, respectively. Raw water from the Gefersa treatment plant was dominated by members of the phylum Proteobacteria, accounting for 65% of the total sequences, followed by Actinobacteria at 25% of the total sequences (Fig. 3). On the other hand, the LR2 reservoir, which receives water from the Legedadi treatment plant, was found to harbor members of the phyla Proteobacteria and Firmicutes at 50% and 21%, respectively. The remaining reservoirs receiving water from this treatment plant were also dominated by Proteobacteria and Firmicutes, with varying relative abundance, compared with other phyla. Similarly, finished water leaving the Gefersa treatment plant was found to harbor members of the phyla Proteobacteria and Bacteroidetes at 57% and 24%, respectively. At the reservoir sample locations (GR1 and GR2), Proteobacteria were highly dominant, with relative abundances of 83% and 61%, respectively, followed by Firmicutes, which accounted for 13% and 34% of the total sequences, respectively. In line with the present study, Li et al. found that the most dominant bacterial taxa in raw water and subsequent sampling locations, including finished water, through drinking water distribution systems were Proteobacteria and Firmicutes followed by Actinobacteria and Bacteroidetes [30].
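The relative abundances quoted throughout this section are the share of a sample's sequences assigned to each taxon. A minimal sketch of that calculation, using made-up phylum-level read counts (not data from this study), could look like this:

```python
def relative_abundance(counts_by_taxon):
    """Convert per-taxon read counts into percentages of the sample total."""
    total = sum(counts_by_taxon.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: 100.0 * n / total for taxon, n in counts_by_taxon.items()}

# Hypothetical phylum-level read counts for a single sample.
phylum_counts = {
    "Proteobacteria": 650,
    "Actinobacteria": 250,
    "Bacteroidetes": 60,
    "Firmicutes": 40,
}

abundance_pct = relative_abundance(phylum_counts)

# Rank phyla from most to least abundant.
ranked = sorted(abundance_pct.items(), key=lambda item: item[1], reverse=True)
for phylum, pct in ranked:
    print(f"{phylum}: {pct:.1f}%")
```

Because every value is a share of the same total, the percentages for all taxa in one sample add up to 100%.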
Reservoirs of drinking water distribution systems are naturally very complex environments that support the growth of diverse microbial assemblages able to adapt to different hydraulic conditions and nutrient availability [31], which is consistent with the dominance of the phyla Proteobacteria, Firmicutes and Bacteroidetes in the present study. Moreover, a previous finding reported the predominance of Actinobacteria and Bacteroidetes after disinfection of drinking water at various reservoir locations and in household storage tankers [32].
Persistently highest abundance of Proteobacteria was observed irrespective of distance from the source
The bacterial taxonomic composition in this study revealed that, for both treatment plants, Proteobacteria was the predominant bacterial phylum at all sampling sites, clustered according to their sampling locations (Fig. 4). The high dominance of Proteobacteria at locations far from the treatment plants in the present study might be due to low levels of residual disinfectant, which favor regrowth, multiplication and dominance of Proteobacteria and other microbial groups [33]. Moreover, the dominance of Proteobacteria over other phyla in drinking water distribution systems has been attributed to their ability to adapt to low-nutrient conditions and their capacity to form biofilms on the surfaces of different pipe materials [30,34,35].
Phylum Proteobacteria had significantly higher relative abundance (P < 0.01, Kruskal-Wallis test) in both near and far sampling locations of the Legedadi treatment plant, at 86% and 84%, respectively, followed by Bacteroidetes (15% relative abundance) in near sample locations and Actinobacteria (17% relative abundance) in far sample locations. This indicates a shift in bacterial community structure from near to far sample locations. Similarly, for the Gefersa treatment plant, phylum Proteobacteria was the most dominant phylum in both near and far sampling locations, with relative abundances of 81% and 87%, respectively, followed by Firmicutes with relative abundances of 11% (near) and 7% (far).
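For readers unfamiliar with the Kruskal-Wallis test cited above, the sketch below computes the H statistic for the two-group case, with mid-ranks and the usual tie correction, and a p-value from the chi-square approximation with one degree of freedom. The grouping (near versus far) and the input values are hypothetical; the exact grouping used for the reported P values is not specified here, so this is an illustration of the test rather than a reproduction of the analysis.

```python
import math

def kruskal_wallis_two_groups(x, y):
    """Return (H, p) for two groups using mid-ranks and the tie correction."""
    groups = [list(x), list(y)]
    pooled = sorted(groups[0] + groups[1])
    n = len(pooled)

    # Assign each distinct value the average (mid) rank of its tied block.
    rank_of = {}
    tie_sum = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of 1-based ranks i+1..j
        t = j - i
        tie_sum += t ** 3 - t
        i = j

    # H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)

    correction = 1.0 - tie_sum / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h = max(h / correction, 0.0)

    # Chi-square survival function with 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2.0))
    return h, p

# Hypothetical relative abundances (fractions) at near and far locations.
near = [0.81, 0.84, 0.86]
far = [0.87, 0.90, 0.93]
h_stat, p_value = kruskal_wallis_two_groups(near, far)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

Because the test works on ranks rather than raw values, it does not assume normally distributed abundances, which suits skewed relative-abundance data.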
A metagenomic study of bacterial communities in drinking water supply systems showed a similarly high dominance of Proteobacteria, followed by Bacteroidetes and Actinobacteria, at various residential sample locations of the supply systems [1]. Another study reported the dominance of these phyla in samples taken from the raw water sources, reaction tank, settling pond and clear water tank of a treatment plant [36].
High abundance of class Gammaproteobacteria was found across sample locations along distance from sources of treatment plants
The dominant class in both near and far sample locations of the Legedadi treatment plant was γ-proteobacteria, which accounted for relative abundances of 78% and 88%, respectively (Fig. 5). The second most dominant class was β-proteobacteria, which accounted for relative abundances of 32% and 28% among the five classes in near and far sample locations, respectively. Similarly, in water samples sourced from Gefersa, γ-proteobacteria had relative abundances of 74% and 75%, followed by β-proteobacteria with relative abundances of 21% and 66%, in near and far sample locations, respectively. Similar to the findings of the present study, research on microbial community structure in drinking water distribution systems at different locations reported that δ-proteobacteria followed by β-proteobacteria were the most dominant bacterial groups, with relative abundance increasing from the treatment plant down through the distribution systems [16,29,37].
At sampling locations far from the Legedadi treatment plant, the abundance of α-, γ- and β-proteobacteria increased; similarly, at sampling locations far from the Gefersa treatment plant, the relative abundance of γ-proteobacteria and β-proteobacteria increased. This indicates that these bacterial groups may be able to reside and proliferate at different locations of the drinking water distribution systems without being significantly affected by disinfectants [38].
Phylum Proteobacteria were highly abundant in running tap and storage tanker locations
Proteobacteria was the most dominant phylum in samples receiving water from both the Legedadi and Gefersa treatment plants. For Legedadi, the significantly highest relative abundance of phylum Proteobacteria (P < 0.05, Kruskal-Wallis test) was observed in running tap water (95%), followed by storage tankers (87%), with a wider range of relative abundance in running tap water samples than in storage tankers (Fig. S2). Phylum Firmicutes was the second most dominant phylum (44%) at Legedadi household tap water sample locations. Similarly, the relative abundance of Proteobacteria at the tap water and storage tanker sample locations of the Gefersa treatment plant was 98% and 92%, respectively. Firmicutes (50%) followed by Bacteroidetes (42%) were the next most abundant phyla in tap water samples, whereas Firmicutes (12%) followed by Actinobacteria (5%) were the next most dominant phyla across storage tanker samples.
Consistent with this finding, several studies have reported the phyla Proteobacteria, Bacteroidetes, Actinobacteria and Firmicutes as the most dominant in tap water samples from urban drinking water supply systems [39,40]; hence their detection at high levels in the present study was not unexpected.
The wide range of relative abundance of bacterial taxa at each sample location in this study indicates a change in the bacterial community between running tap and storage tanker locations throughout the distribution systems. Moreover, the high dominance of bacterial taxa in taps and storage tankers was probably due to resistance mechanisms against disinfectants and to biofilm formation on pipe surfaces and on the walls of storage tankers, as supported by investigations of microbial community structure in treated drinking water [41,42]. The higher dominance of bacterial taxa in running tap samples than in storage tanker samples in this study might be due to changing water pressure in the piping systems and general hydraulics, which in turn cause detachment of some cells from biofilms and resuspension of sediment-associated bacteria. The lesser presence of bacterial taxa in storage tankers, on the other hand, might be due to low water temperature and water stagnation, which deter microbial metabolism and regrowth [43][44][45].
The low proportion of phylum Actinobacteria in treated tap water may be due to its vulnerability to chlorine disinfectant, as this phylum is naturally dominant in raw and bulk water [46]. The dominance of Proteobacteria over taxa such as Bacteroidetes, Firmicutes and Actinobacteria in treated water samples collected from both running taps and storage tankers in this study may be due to the availability of nutrients under the oxidative disinfection process, which gives a selective advantage to Proteobacteria because of their wide metabolic versatility, fast growth rate and fundamental resilience to disinfection [32,46]. The phyla Proteobacteria, Bacteroidetes, Firmicutes and Actinobacteria are the most common dominant bacterial groups in drinking water treatment systems [3,37,47].
Household tap and storage tanker locations were conducive environments for the dominance of Gammaproteobacteria
In running tap water and storage tanker samples sourced from Legedadi, γ-proteobacteria was the most predominant of the five identified classes, with relative abundances of 92% and 88%, respectively, whereas in storage and tap samples sourced from Gefersa, γ-proteobacteria had relative abundances of 71% and 97%, respectively.
The second most abundant class was β-proteobacteria, which accounted for 28% and 46% at the tap water and storage tanker locations of the Legedadi treatment plant, and similarly for 72% and 53% at the tap and storage sample locations of the Gefersa treatment plant, respectively (Fig. 6).
Moreover, δ-proteobacteria was another class detected at high relative abundance (74%) at the storage sample locations of the Gefersa treatment plant. Across all tap water and storage tanker sample locations, the relative abundance of all classes varied widely. By comparison, δ-proteobacteria followed by β- and γ-proteobacteria have been reported as the most common and predominant bacterial groups in residential water samples and chlorinated drinking water systems [1,48]. There was significant variation in the relative abundance of γ-proteobacteria among the four classes at each clustered sample location (P < 0.05, Kruskal-Wallis test).
The dominance of γ-proteobacteria shifted from the lowest to the highest relative abundance across all clustered tap water and storage tanker sample locations compared with other classes. Although α-proteobacteria have a competitive advantage over other Proteobacterial groups by virtue of their ability to persist under low nutrient availability and to degrade complex organic compounds to meet their nutrient needs in disinfected water [46], the predominance of γ-proteobacteria in this study may be attributed to their survival in suppressed environments as well as their tolerance of different treatment processes and disinfectant agents [49][50][51]. Moreover, another similar study showed that the relative abundance of γ-proteobacteria increased along the flow of the treatment process, whereas α- and β-proteobacteria showed decreasing relative abundance across the different steps of a treatment plant [30].
The relatively high percentage of β-proteobacteria in potable water supplies may be the result of nutrient availability and biofilm formation [52,53], since this bacterial group is very sensitive to low nutrient availability and high residual disinfectant compared with other Proteobacterial classes. Although residual chlorine results in a dynamic shift in bacterial community structure [28][29][30]32], the findings of this study showed that γ-proteobacteria had a comparative selective advantage over other classes in stored and piped water samples after chlorination. From this, it can be inferred that different mechanisms may be involved, such as high tolerance to disinfection stress and recovery during treatment.
Pseudomonas was the most dominant genus across the distribution systems
The relative abundances of all the identified genera from both the Legedadi and Gefersa treatment plants are depicted in Fig. 7 and Fig. S3. The relative abundance of Pseudomonas in the source water (LS) of the Legedadi treatment plant was 71% and showed a slight increase (77%) in finished water (LF). The trend fluctuated across the subsequent reservoirs, from 55% in LR1 to 87% in LR5, with the highest level (95%) in LR4.
Drinking water is commonly inhabited by Pseudomonas species, as this genus is characterized by high prevalence, wide distribution in drinking water and antibiotic resistance capacity [54]. Across tap water samples, Pseudomonas was most dominant in LT5 (98%) followed by LT6 (96%), and the water sample from the storage tanker LS5 was found to contain 95% relative abundance (Fig. 6). Likewise, for the Gefersa treatment plant, Pseudomonas was also the dominant genus, accounting for 98% in a tap water sample (LT5) located far from the treatment plant, compared with a low relative abundance (8%) in source water (GS) and 86% in the finished water sample (GF). After the finished water sample, reservoir GR1 was found to contain 96% relative abundance of Pseudomonas, and the storage samples GS5 and GS7 contained 84% and 77% relative abundance of Pseudomonas, respectively. In drinking water distribution systems and pipes, Pseudomonas and Acinetobacter commonly reside within thin films (biofilms) as long as organic matter is available, and these genera cause illness in young and immunocompromised people [55]. Moreover, distribution systems characterized by intermittent water supply, low water pressure, intrusion of pathogens, metals and other chemicals through breakage and infrastructure inefficiency, the presence of organic matter that reduces residual chlorine, and long storage of water in household storage tankers all provide important mechanisms of water contamination by opportunistic pathogens such as Pseudomonas [4]. In agreement with this finding, high relative abundance of Pseudomonas in chlorinated drinking water distribution systems has also been reported [56].
The genus Pseudomonas has repeatedly been described as the leading biofilm-forming bacterial group in treated drinking water [57], which might be a basic factor in its high relative abundance in the drinking water distribution systems of this study. Moreover, this genus is able to attach to pipe surfaces and gain protection from disinfection by forming extracellular polymeric substances (EPS), which also aid cell-to-cell communication [56,58].
On the other hand, Acinetobacter was the second most abundant genus in Legedadi treatment plant samples, LS, with a relative abundance of 25%, Tap water (LT1), 87% and storage tanker sample (LS2) with a relative abundance of 93% and similarly a relative abundance of 92% was observed in source water (GS) of Gefersa treatment plant. Furthermore, tap water sample (GT4), storage tanker samples (GS2, GS3, GS6 and GS8) had a relative abundance of 93%, 99%, 61%, 63% and 55% of Acinetobacter respectively. Acinetobacter is the most common causative agent of hospital acquired infections particularly respiratory infections in susceptible individuals [59]. Similarly, it was reported that Acinetobacter was the most abundant in centralized drinking water treatment plants after disinfection by chlorine and several of its strains were found to survive without being affected signi cantly [60].
Moreover, Escherichia with 9%, 38%, 6% and 19% was observed in nished water (LF) and three reservoirs (LR1, LR2 and LR3) respectively. In agreement with this study, several opportunistic pathogenic genera were identi ed from chlorinated drinking tap water with Escherichia as predominant genera [61]. Legionella with 3% relative abundance was found in source water (LS), and very small percentage from nished water (LF), few of the reservoir (LR5), tap (LT2) and storage tanker (LS3). Interestingly, Legionella was found in the nished water (GF) only and not detected from the subsequent samples in Gefersa.
Although present at very small percentages, the detection of Legionella in the distribution systems of this study might be associated with the presence of host protozoa, invasion of Legionella from biofilms on the inner pipe surfaces and the low impact of chlorination [62].
The presence of these bacterial genera, which are known to contain pathogenic species, in drinking water treatment systems, storage tanks, public faucets and individual taps has also been reported [57,63-65]. A study of the diversity and dynamics of microbial communities in a centralized water treatment plant showed the dominance of potentially pathogenic bacterial groups in different sampling locations [36]. The presence of these opportunistic pathogens in this study may be due to rapid recovery after chlorination, development of resistance to disinfection, disinfection inefficiency, leakage after treatment and low disinfectant residual across subsequent sample locations of the drinking water distribution system [66]. Moreover, the ability to survive at high temperature (up to 40 °C) and to adapt to the low-nutrient conditions of soil, fresh water and drinking water systems [67] is another factor that may explain the considerable abundance of opportunistic pathogenic groups (Pseudomonas, Acinetobacter and Legionella).
Sporadic distribution of Eukaryote communities was observed across sampling locations
Diverse eukaryotes were detected in all sampling locations, from source water through to running tap and storage tanker samples across the distribution networks. The most important genera found across the different sample locations of the distribution system were Hartmannella, Cryptosporidium, Cryptococcus, Acanthamoeba, Aspergillus and Candida (Fig. 8). Hartmannella was found in 28% of all the samples sourced from the Legedadi treatment plant and 44% of the samples sourced from the Gefersa treatment plant. Cryptosporidium was found in 22% of the samples in Legedadi and 38% of the samples in Gefersa. Moreover, the genus Acanthamoeba was found in 56% of the samples from Legedadi and 57% of the samples sourced from the Gefersa treatment plant. Acanthamoeba, the most prominent parasitic protozoan transmitted through contaminated water, and Cryptosporidium, a causative agent of diarrheal disease in children in developing countries, are the main problems in urban drinking water supply systems [68].
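The percentages reported here for eukaryotes describe prevalence, that is, the fraction of samples in which a genus was detected, rather than relative abundance within a sample. A minimal Python sketch of that calculation is given below. The sample labels and detections are hypothetical and are not data from this study, and the detection threshold (any read at all) is an assumption.

```python
def prevalence(detections, genus):
    """Percentage of samples in which `genus` was detected at least once.

    `detections` maps sample ID -> set of genera detected in that sample.
    """
    if not detections:
        raise ValueError("no samples")
    hits = sum(1 for genera in detections.values() if genus in genera)
    return 100.0 * hits / len(detections)


# Hypothetical presence data for four samples (illustrative only,
# not taken from this study).
example_detections = {
    "S1": {"Acanthamoeba", "Hartmannella"},
    "S2": {"Acanthamoeba"},
    "S3": {"Candida"},
    "S4": set(),
}

for g in ("Acanthamoeba", "Hartmannella", "Cryptosporidium"):
    print(f"{g}: detected in {prevalence(example_detections, g):.0f}% of samples")
```

With a small number of samples per treatment plant, each additional detection shifts the prevalence by several percentage points, so such figures are best read alongside the underlying sample counts.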
The opportunistic fungal pathogen Aspergillus was found in all sampling locations except the source water (LLS) and a reservoir (LR2) of the Legedadi treatment plant, and the finished water (GF) and first reservoir (GR1) receiving water from the Gefersa treatment plant. Candida was found in 67% and 50% of the Legedadi and Gefersa sample locations, respectively, whereas Cryptococcus was observed in 50% and 38% of the sampling locations of Legedadi and Gefersa, respectively. Although many species of fungi in the genera Penicillium and Aspergillus produce mycotoxins in food and beverages, interestingly there are species of Aspergillus which produce aflatoxin in cold drinking water storage tanks [69]. The human pathogen Candida has been reported in drinking water and surface water [70] and, similarly, sequencing of fungi in drinking water and biofilms showed a higher abundance of Cryptococcus [71]. Cryptomonas, a phytoplanktonic microalgal genus, was also detected in 50% and 44% of the sampling locations which received water from the Legedadi and Gefersa treatment plants, respectively.
The unusual growth of Cryptomonas is responsible for a fishy odor in drinking water and hence contributes to low water quality [72].
Hartmannella and Acanthamoeba serve as hosts for many pathogenic waterborne organisms such as Legionella, Acinetobacter, Clostridium and Mycobacterium [36,73], and these host organisms are believed to be found in surface water, waste water and drinking water systems in tropical regions where the water temperature is 30 °C [74]. Although chlorine sensitivity was not tested in this study, the presence of Hartmannella in the samples analyzed suggests that it is resistant to disinfection. A similar assessment of drinking water treatment plants showed that a very high residual chlorine concentration (0.8-3 mg/l) in produced water was insufficient for the removal of cysts of Acanthamoeba and other free-living amoebae [75].
Targeting a specific region (V4) of the 16S rRNA gene with short-read sequencing platforms cannot achieve the species- and strain-level taxonomic resolution that is afforded by sequencing the entire genome.
Likewise, identification of the microbial community at genus level using the 16S rRNA gene may mask important population differences and heterogeneity within genera, which remain inaccessible to short-read 16S rRNA gene-based analysis. Furthermore, as a limitation of this study, under-sampling means that further, more frequent sampling needs to be conducted to obtain sufficient sequence reads and to capture the bacterial populations in the study area.

Conclusions
To our knowledge, this study is the first to investigate the microbial community structure and dynamics of drinking water distribution systems in Addis Ababa City using Illumina MiSeq sequencing. The microbial community structure, diversity and dynamics of water samples collected from source water entering the treatment plants, finished water inside the treatment plants, reservoirs, and various household point-of-use pipes across the distribution system were investigated. Highly diverse microbial communities were detected across all sampling locations. Moreover, the community structure shifted throughout the distribution systems. The phylum Proteobacteria, followed by Firmicutes, Bacteroidetes and Actinobacteria, was the predominant bacterial group found in all the sampling locations. Unlike several studies which reported the predominance of Alphaproteobacteria in treated drinking water systems, Gammaproteobacteria was the predominant class found, followed by Alphaproteobacteria and Betaproteobacteria. Potentially opportunistic pathogens such as Pseudomonas, Legionella, Escherichia, Klebsiella, Proteus and Acinetobacter were the dominant bacterial genera found in almost all the sampling locations.
Free-living protozoa such as Hartmannella and Acanthamoeba, which are hosts for opportunistic pathogens such as Legionella, as well as pathogenic protozoa such as Cryptosporidium, were also found in the samples analyzed.