Unimproved water sources in Ethiopia: Spatial variation and household point-of-use treatment practice based on 2016 Demographic and Health Survey


Background

Improved water sources are not equally available in all geographical regions. Populations dependent on unsafe water sources are recommended to treat their water at point-of-use using adequate methods to reduce associated health problems. In Ethiopia, the spatial distribution of households using unimproved water sources has been incompletely described or ignored in most studies. Moreover, evidence on the point-of-use water treatment practices of households dependent on such water sources is scarce. Therefore, the current study analyses the spatial distribution of unimproved water sources by wealth quintile at country level, and point-of-use treatment practices, using nationally representative data.
Methods

Data from the 2016 Ethiopian Demographic and Health Survey (EDHS), covering 16,650 households in 643 clusters, were used for the analysis. For spatial analysis, the raw and spatially smoothed coverage data were joined to the geographic coordinates based on the DHS cluster identification code. Global spatial autocorrelation was performed to analyse whether the pattern of unimproved water coverage is clustered, dispersed, or random across the study areas. Once a positive global autocorrelation was confirmed, a local spatial autocorrelation analysis was applied to detect local clusters. Point-of-use water treatment was analysed based on reported use of boiling, bleach, filtration or solar disinfection (SODIS).
Results

There were 5005 households using unimproved water sources for drinking. Spatial variation in unimproved water coverage was observed, with high coverage in the Amhara, Afar, Southern Nations Nationalities and People, and Somali regions. Disparity in unimproved water coverage between wealth quintiles was also observed. Reported point-of-use water treatment among these households was only 6.24%. The odds of point-of-use water treatment among households whose head had higher education were 2.5 times higher (95% CI = 1.43–4.36) than among those whose head had not attended education.
Conclusion

An apparent clustering trend with high unimproved water coverage was observed between regions and among wealth quintiles. This indicates priority areas for future resource allocation and the need for regional and national policies to address the issue. Encouraging households to treat water prior to drinking is essential to reduce health problems.


Introduction
Access to improved water is a fundamental human right [1]. However, it is not equally available in all geographical regions (He et al., 2018). A series of studies have shown that location and socio-economic status are the most pronounced factors in unequal access to improved drinking water [2,3]. This is clearly seen, for example, in an analysis of five African and Latin American countries, which showed a clear pattern of inequality across broad regional units, and even more at local scale and between wealth classes [4]. Populations dependent on unimproved water consequently suffer from a range of health problems, of which diarrhoea is one [5]. Populations dependent on unsafe water sources are therefore strongly recommended to treat their water at point-of-use to reduce associated health problems, mainly diarrhoea, even though this is not routinely or widely practised [6][7][8].
In Ethiopia, according to the WHO/UNICEF 2015 report, 43% of the population depends on unimproved water sources [9]. Diarrhoea, its association with water sources, and the effectiveness of water treatment at point-of-use have been described in prior studies [10][11][12][13]. In addition, household water treatment practices at national level have been documented [14].
However, evidence on the spatial distribution of households that depend on unimproved water sources has been incomplete or ignored in most studies [15,16]. Few studies have been conducted in Ethiopia to locate or map areas with the highest unimproved water source coverage among different wealth categories and at local level [17,18]. Moreover, data on the point-of-use water treatment practices of populations dependent on unimproved water sources are scarce. This gap could hamper intervention measures and make it impossible to trace sources of water-borne epidemics [19,20].
Spatial description of geographic data with respect to demographic, environmental, socioeconomic and other risk factors, together with point-of-use treatment practices, will help policymakers, partners and planners in the water and health sectors to develop appropriate strategies for water source development, quality monitoring and measures such as wide-scale use of household water treatment methods [21][22][23]. This study is distinct from prior studies in that it is the first to analyse and map the spatial distribution of unimproved water coverage by wealth quintile in detail, and to demonstrate disparities in unimproved water coverage not only across regions but also within regions between wealth quintiles. In addition, it assesses the point-of-use water treatment practices of households dependent on such water sources.

Study setting
The World Population Review indicates that Ethiopia had an estimated population of 114.96 million in 2020, making it the second most populous country in Africa [24]. The country has an administrative structure of nine regional states (Tigray, Afar, Amhara, Oromiya, Somali, Benishangul-Gumuz, Southern Nations Nationalities and People Region (SNNPR), Gambela, and Harari) and two city administrations (Addis Ababa and Dire Dawa) [25].

Study design and data source
The Ethiopian Demographic and Health Survey (EDHS) data collected in 2016 were used. We obtained the data through online registration with the MEASURE DHS program. The survey is nationally representative and used a two-stage stratified sampling design. Although the EDHS has different data files, the household (HR) data file was used in this study.

Data processing
In total, 16,650 households from 643 clusters were included, and each cluster is represented by a GPS point with latitude and longitude coordinates. After initial data processing, 22 clusters without GPS points were excluded. In the final data analysis, which covered 621 clusters, 5005 households (weighted frequency of 5857) were identified as using water from unimproved sources based on the WHO/UNICEF category [26]. These households were used to analyse point-of-use household water treatment practice. To maintain the confidentiality of surveyed respondents, the actual GPS location of each cluster was displaced. Urban clusters were displaced in a random direction by a maximum of 2 kilometres. Rural clusters were displaced by a maximum of 5 kilometres, with a further 1% displaced by a maximum of 10 kilometres. A subset of clusters with GPS points was created for each wealth quintile.
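The displacement limits quoted above can be illustrated with a minimal sketch. This is not the DHS Program's actual procedure: the function names `displace_cluster` and `offset_km`, the uniform draw of distance and bearing, the flat conversion of about 111 km per degree, and the example coordinates are all assumptions made for illustration.

```python
import math
import random

KM_PER_DEGREE = 111.0  # rough conversion; an assumption for illustration


def displace_cluster(lat, lon, urban, rng):
    """Return a displaced (lat, lon) that respects the limits quoted in the text."""
    if urban:
        max_km = 2.0            # urban clusters: at most 2 km
    elif rng.random() < 0.01:
        max_km = 10.0           # about 1% of rural clusters: at most 10 km
    else:
        max_km = 5.0            # remaining rural clusters: at most 5 km
    distance_km = rng.uniform(0.0, max_km)
    bearing = rng.uniform(0.0, 2.0 * math.pi)  # random direction
    d_lat = distance_km * math.cos(bearing) / KM_PER_DEGREE
    d_lon = distance_km * math.sin(bearing) / (
        KM_PER_DEGREE * math.cos(math.radians(lat))
    )
    return lat + d_lat, lon + d_lon


def offset_km(lat, lon, new_lat, new_lon):
    """Approximate distance in km between the original and displaced point."""
    north = (new_lat - lat) * KM_PER_DEGREE
    east = (new_lon - lon) * KM_PER_DEGREE * math.cos(math.radians(lat))
    return math.hypot(north, east)


rng = random.Random(1)
urban_point = displace_cluster(9.0, 38.7, True, rng)   # illustrative coordinates
rural_point = displace_cluster(9.0, 38.7, False, rng)
```

Because the offsets are bounded, a displaced point can still be compared with its neighbours at regional scale, which is the property the spatial analysis relies on.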

Data Analysis
We constructed a new wealth quintile by excluding drinking water as a household asset. The default asset scores and quintiles of DHS datasets are constructed with drinking water supply as an input. However, drinking water is the dependent variable of this study and was therefore excluded when constructing wealth quintiles. An asset score for each household was constructed using Principal Components Analysis in Stata 14.0 (Stata Corp, College Station, Texas, USA). All surveyed households were ranked and divided into five subsets or wealth quintiles. The first quintile included the poorest 20% of households and the fifth quintile included the wealthiest 20%. Following the approach used in [27], five subsets of clusters with GPS points representing the five quintiles were created. Each subset included all the clusters that contained at least one household in the corresponding quintile. Since a single cluster was represented by a single GPS point and households in a single DHS cluster may fall in different quintiles, the subsets of clusters were not mutually exclusive.
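The ranking step and the construction of overlapping cluster subsets can be sketched as follows. The sketch assumes the asset scores have already been computed (the Principal Components Analysis step is not reproduced), and the function names and the toy scores and cluster labels are hypothetical.

```python
from collections import defaultdict


def assign_quintiles(scores):
    """Rank households by asset score and split them into five equal groups.

    Returns a list of quintile numbers (1 = poorest, 5 = wealthiest),
    in the same order as the input scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    quintiles = [0] * len(scores)
    for rank, i in enumerate(order):
        quintiles[i] = rank * 5 // len(scores) + 1
    return quintiles


def cluster_subsets(cluster_ids, quintiles):
    """For each quintile, collect every cluster holding at least one such household."""
    subsets = defaultdict(set)
    for cluster, quintile in zip(cluster_ids, quintiles):
        subsets[quintile].add(cluster)
    return dict(subsets)


# Toy data: ten households in three clusters (illustrative values only).
scores = [0.1, 0.5, 0.2, 0.9, 0.3, 0.7, 0.4, 0.8, 0.6, 1.0]
clusters = ["A", "A", "B", "B", "A", "B", "C", "C", "C", "A"]
quintiles = assign_quintiles(scores)
subsets = cluster_subsets(clusters, quintiles)
```

In the toy data, cluster "A" appears in the subsets for quintiles 1, 2, 3 and 5, which is the sense in which the subsets are not mutually exclusive.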
The raw coverage rate of unimproved water in each cluster was calculated as the proportion of households using any unimproved water source out of the total households in the cluster, for the overall population and for each quintile, accounting for the survey design and weights. The difference in raw coverage rates among the sampled clusters was tested using one-way ANOVA.
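A minimal sketch of the two calculations described above is given here. It is a simplification under stated assumptions: the coverage function applies household weights only (it does not account for strata or other design features), the ANOVA is the textbook unweighted form, and the function names and toy inputs are hypothetical.

```python
def raw_coverage(households):
    """Weighted proportion of households on unimproved water, per cluster.

    `households` is an iterable of (cluster_id, weight, uses_unimproved) tuples.
    """
    total = {}
    unimproved = {}
    for cluster_id, weight, uses_unimproved in households:
        total[cluster_id] = total.get(cluster_id, 0.0) + weight
        if uses_unimproved:
            unimproved[cluster_id] = unimproved.get(cluster_id, 0.0) + weight
    return {c: unimproved.get(c, 0.0) / total[c] for c in total}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy inputs (illustrative values only).
coverage = raw_coverage([(1, 1.0, True), (1, 1.0, False), (1, 2.0, True), (2, 0.5, False)])
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```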
A spatially smoothed rate was calculated to stabilize the raw rates. To perform the smoothing, Thiessen polygons were first created; these divide an area into sub-areas, each enclosing all locations closer to its central point than to any other point [28]. Spatial smoothing was then used to produce, for each cluster, an estimate corresponding to the raw coverage rate from the collection of neighbouring clusters defined by the Thiessen polygons. For this study, first-order Queen Contiguity was applied as the spatial smoothing rule. Queen Contiguity treats all polygons sharing a common edge or a common vertex with the target Thiessen polygon as neighbours. The difference between spatially smoothed and raw coverage rates for the overall population and for each quintile was also calculated by subtracting the raw coverage rate from the spatially smoothed coverage rate.
For the spatial autocorrelation analysis, the raw and spatially smoothed coverage data were joined to the geographic coordinates based on the DHS cluster identification code. The null hypothesis was complete spatial randomness of unimproved water coverage across the study sites. Global spatial autocorrelation was performed to analyse whether the pattern of unimproved water coverage is clustered, dispersed, or random across the study areas. Global Moran's I measures spatial autocorrelation based on feature locations and attribute values. When the z-score or p-value indicates statistical significance, a positive Moran's I value indicates a tendency towards clustering, while a negative value indicates a tendency towards dispersion. Because global spatial autocorrelation provides a single value for the whole dataset, it cannot identify local clusters of high or low coverage.
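The smoothing and the global statistic can be sketched under explicit assumptions. The neighbour lists are taken as given (the Thiessen polygons and Queen Contiguity step are not reproduced); the smoother shown pools counts over a cluster and its neighbours, which is one common form of spatial rate smoothing and may differ from the estimator used in the study; and Moran's I is computed with binary weights. Function names and toy values are hypothetical.

```python
def spatial_rate_smooth(events, population, neighbours):
    """Pool events and population over each cluster and its neighbours."""
    smoothed = []
    for i in range(len(events)):
        window = [i] + list(neighbours[i])
        pooled_events = sum(events[j] for j in window)
        pooled_population = sum(population[j] for j in window)
        smoothed.append(pooled_events / pooled_population if pooled_population else 0.0)
    return smoothed


def global_morans_i(values, neighbours):
    """Global Moran's I with binary weights (w_ij = 1 when j neighbours i)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(len(nb) for nb in neighbours)  # sum of all weights
    cross = sum(z[i] * z[j] for i in range(n) for j in neighbours[i])
    return (n / s0) * cross / sum(d * d for d in z)


# Four clusters along a line: each is a neighbour of the adjacent ones.
chain = [[1], [0, 2], [1, 3], [2]]
smoothed = spatial_rate_smooth([1, 0, 3, 0], [1, 3, 3, 2], chain)
clustered = global_morans_i([1, 1, 0, 0], chain)    # similar values adjacent
dispersed = global_morans_i([1, 0, 1, 0], chain)    # alternating values
```

In this toy layout the index is positive when similar values sit side by side and negative when they alternate, matching the interpretation given in the text.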
Thus, where global autocorrelation was positive, local spatial autocorrelation analysis was applied to detect local clusters. Local Moran's I was used to calculate a test statistic for each location and to identify clusters of high and low coverage. A Random Permutation Procedure (RPP) was used to replicate the statistic 999 times to generate reference distributions, against which the observed test statistics were evaluated. Local Moran's I was calculated for both raw and spatially smoothed rates. Both the global and local spatial autocorrelation were calculated using GeoDa [29]. For point-of-use water treatment, households that reportedly used an adequate water treatment method (bleach, boiling, filtration or solar disinfection) were coded as yes (1), and as no (0) if the household used none of them.
Descriptive statistics and logistic regression were used to assess factors associated with household point-of-use water treatment.

Results
Raw and spatially smoothed coverage
Five wealth quintiles, or subsets of clusters, were created based on the national asset score, excluding drinking water supply. The first quintile included the poorest 20% of households and the fifth quintile included the richest 20%. Based on the asset score, households were assigned to one of the five wealth quintiles. Clusters categorized by the number of households they contain in each quintile are presented in Fig 1. The number of households per cluster varied across quintiles: 26.9-41.2% of clusters had only 1-3 households, while 11.3-48.2% of clusters had 17-29 households. The spatial distribution of clusters by quintile and for the overall population (Fig 1) was similar in the first three quintiles, while quintiles four and five had a greater number of clusters around the capital, Addis Ababa. The first three quintiles had a higher number of clusters in the north (Tigray and Amhara regions) and south (SNNPR). The trend continued into the fourth quintile, except that the distribution of clusters extended to the centre and around the capital. The fifth quintile had a similar pattern around the capital, with clusters sparsely distributed at the periphery.
The number of clusters with varying percentages of raw coverage by wealth quintile is presented in Fig 2. The percentage of clusters with high unimproved water coverage (60-100%) declined from the poorest quintile (56.3%) to the richest (12.6%). The difference in raw coverage rates among the sampled clusters was tested using one-way ANOVA and was statistically significant between quintiles, F(3, 1697) = 21.1, p < 0.001. The percentage coverage of unimproved water in the first three quintiles differed from that in the fourth and fifth quintiles. Spatially, clusters with a higher percentage of unimproved water coverage were located in the north (Amhara and Afar regions), south (SNNPR) and east (Somali region) (Fig 3). In the capital Addis Ababa, Dire Dawa city administration and Gambela region, the coverage of unimproved water was low.
As 26.9-41.9% of the clusters in different quintiles had three or fewer households, a spatially smoothed coverage rate was calculated for each cluster to overcome the small-number problem. Spatially smoothed rates were calculated by borrowing information from neighbouring clusters to produce a more stable and less noisy estimate of the rate associated with each cluster. Clusters categorized by smoothed coverage in each quintile are presented in Fig 4. The percentages of clusters with 0-20% and >80-100% coverage decreased in all quintiles except quintile 5, which showed a rise in the percentage of clusters with 0-20% coverage. Spatial smoothing adjusted the raw coverage rates of clusters in each quintile towards the average of their surroundings, producing both declines and rises in coverage rates among the clusters included in this study. Clusters at the upper and lower ends were adjusted and converged towards the middle (>20-80%). Fig 5 shows the distribution of spatially smoothed coverage rates for the overall population and by quintile. The pattern of clusters with >60-100% coverage of unimproved water was clearly observed in the north and south and became less frequent from the first quintile to the fifth.
The difference between spatially smoothed and raw coverage rates for the overall population and each quintile was calculated by subtracting the raw coverage rate from the spatially smoothed coverage rate. In terms of direction (rising or declining), an increasing trend in the spatially smoothed coverage rate was observed in the north and south, especially in the first two quintiles. The rates in quintiles 3 and 4 were the most intensively under- and overestimated, respectively. In contrast, the extent of change was low in the fifth quintile. The effect of spatial smoothing differed among quintiles and was related to the number of households each cluster represents: clusters with a small number of households showed considerable change under spatial smoothing.

Hotspots and clustering trends
The global spatial autocorrelation was calculated using Global Moran's I. The analysis, based on feature locations and attribute values, revealed a clustered pattern of unimproved water coverage across all clusters (Global Moran's I = 0.174, p < 0.0001). Following this statistically significant positive result, Local Moran's I was calculated for the overall population to show hot and cold spots. Unlike the global spatial autocorrelation, the local spatial autocorrelation was calculated for both the raw and the spatially smoothed data. A local clustering trend of high and low spots among sampled clusters was identified in both raw and spatially smoothed coverage (Fig 6a and b). However, clustering of spatial outliers was eliminated in the spatially smoothed coverage. The raw coverage revealed 72 high-high and 120 low-low statistically significant clusters (p < 0.05), followed by 8 high-low and 27 low-high clusters. Meanwhile, the spatially smoothed coverage showed 155 high-high and 149 low-low statistically significant clusters (p < 0.05). Clusters in the north (Amhara and Afar regions), in the east (Somali region) and in the south (SNNPR) had statistically significant hot spots of unimproved water coverage (95% confidence). Clusters in the centre, south and west had statistically significant cold spots (Fig 6a and b).
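The classification of clusters into high-high, low-low, high-low and low-high groups, and the permutation-based significance, can be illustrated with a minimal sketch. It is not GeoDa's implementation: it assumes row-standardised weights, a conditional permutation in which each cluster's own value is held fixed, and a one-sided pseudo p-value; the function name, the seed and the toy values are hypothetical.

```python
import random


def local_morans_i(values, neighbours, permutations=999, seed=0):
    """Return (local I, quadrant label, pseudo p-value) for every location."""
    rng = random.Random(seed)
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # variance of the attribute
    results = []
    for i in range(n):
        k = len(neighbours[i])
        lag = sum(z[j] for j in neighbours[i]) / k  # mean deviation of neighbours
        observed = z[i] * lag / m2
        label = ("high" if z[i] > 0 else "low") + "-" + ("high" if lag > 0 else "low")
        others = z[:i] + z[i + 1:]  # values that may be shuffled into the neighbourhood
        as_extreme = 0
        for _ in range(permutations):
            simulated = z[i] * (sum(rng.sample(others, k)) / k) / m2
            if observed >= 0 and simulated >= observed:
                as_extreme += 1
            elif observed < 0 and simulated <= observed:
                as_extreme += 1
        results.append((observed, label, (as_extreme + 1) / (permutations + 1)))
    return results


# Six clusters along a line: high coverage on one side, low on the other.
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
local = local_morans_i([1, 1, 1, 0, 0, 0], chain)
```

A cluster is flagged as a hot spot when it is labelled high-high and its pseudo p-value falls below the chosen threshold (0.05 in the results above); with only six toy clusters, no such significance should be expected.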

Point-of-use treatment practices
Of the total households (5005), 613 (10.46%) treated their water prior to drinking with any method (boiling, filtration, bleach, SODIS, letting it stand and settle, or cloth straining), and 365 (6.24%) treated it using adequate methods (boiling, filtration, bleach or SODIS). The numbers of households by reported treatment method were: boiling 125 (2.14%), bleach 164 (2.80%), cloth straining 204 (3.48%), filtration 105 (1.80%), letting it stand and settle 41 (0.70%), and SODIS 5 (0.09%). Of the adequate methods, SODIS was the least used by households in the country (Table 1).
The logistic regression shows that the odds of treating water among households whose head had the highest education level were 2.50 times (95% CI = 1.43, 4.36) those of households whose head had not attended formal education. Households in the highest wealth quintile had higher odds of treating water than those in the poorest quintile (Table 1).
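How an odds ratio and its confidence interval relate to a logistic regression coefficient can be shown with a short sketch. It assumes a Wald-type interval, which the source does not state; the standard error is back-calculated from the reported interval rather than taken from the study, and the function names are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of the log odds ratio from a Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Reported values: OR = 2.50, 95% CI = 1.43 to 4.36.
se = se_from_ci(1.43, 4.36)                       # back-calculated, not from the source
odds_ratio, lower, upper = odds_ratio_ci(math.log(2.50), se)
```

Under that assumption, the reported interval is roughly symmetric around the odds ratio on the log scale, so the rebuilt bounds agree with the published ones to within rounding.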

Discussion
Narrowing the gap in service inequality, particularly in access to improved water as a human right, is viewed as a significant post-Millennium Development Goals action [4,16,20]. The current study analyses the spatial distribution of unimproved water coverage in Ethiopia by wealth category. Households in the first three quintiles were mostly located in the northern and southern parts of the country. These are also the regions with the highest percentage of unimproved water coverage, with additional clusters in the east (Somali region). Meanwhile, the last two wealth categories had more households around the capital than the first three wealth quintiles. Low unimproved water coverage was associated with the latter two quintiles. The spatial distribution of unimproved water sources observed among regions and wealth quintiles was similar to that in previous studies [17,30,31]. It also corroborates studies that indicated variations in access to improved water attributable to wealth status [15,16]. The high coverage of unimproved water in the north and south was also partly attributable to the large number of clusters sampled there compared with the small number of clusters in the east and the centre [25].
The Local Moran's I clustering trend of unimproved water coverage for each cluster relative to neighbouring clusters demonstrated the presence of spatial variation. Thus, the null hypothesis of complete randomness of unimproved water distribution was rejected, and a clustering trend indicating high- and low-coverage neighbourhoods was confirmed.
A clear clustering trend was observed, with areas of high and low raw coverage surrounded by neighbouring clusters with matching raw coverage rates. In contrast, high-coverage clusters surrounded by low-coverage clusters, and vice versa, were observed sporadically across the country. These clusters were adjusted by spatial smoothing, which reduces outliers by replacing them with a local average. The difference between the raw and spatially smoothed coverage rates showed a noticeable change in raw rates in neighbourhoods or quintile subsets with small numbers of households. Clusters with fewer than three households in all quintiles were adjusted, whereas in most clusters with more than ten households the rates were only slightly adjusted. The smoothing therefore allows rates based on small numbers to be stabilized by combining available data in the neighbourhood, avoiding the need to aggregate clusters to achieve stable rates for mapping [32]. The smoothing also reduces noise in the rates and clustering trend caused by different population sizes, and thus increased our ability to discern systematic patterns in the spatial variation of coverage rates [33].
In regions with a large difference in coverage before and after smoothing, further surveys and analytical methods are needed to confirm the representativeness of the surveyed clusters. This was specifically an issue for the spatial patterns of coverage, where most of the surveyed households in the rural north and south of Ethiopia were categorized in the first three quintiles, while all surveyed households in Addis Ababa and Dire Dawa were in urban areas and assigned to the last two quintiles [25].
Of the total households using unimproved water sources, only 6% reportedly treated their water using adequate methods, despite evidence that populations dependent on such water sources are at risk of different health problems [5]. The association of household point-of-use water treatment with the education status of the household head and with wealth quintile agrees with prior findings among all households included in the survey in the country [14], a study conducted in a specific part of the country [34], and studies in other countries [35,36].
Overall, most households dependent on unimproved water sources are rural dwellers, and only a small number of households treat water prior to drinking. This could be related to low perception of water quality and the associated health risks [35,37,38].

Limitations of the study
The samples taken from some regions, such as the Somali region, were small compared to other regions. This may influence the spatial smoothing process because of the small-number problem. Displacement of locations was performed to ensure confidentiality of respondents; thus, the locations of clusters may not be exact. Nevertheless, the displacement was only within administrative regions and will not influence clustering trends among regions. Household water treatment assessment was based on respondents' self-report, and there was no confirmatory testing of treated water.

Conclusion
The study revealed spatial variation in unimproved water coverage across regions. Statistically significant local clusters with high coverage of unimproved water were detected in the Amhara, Afar and Somali regions. Disparity in unimproved water coverage across space and among wealth quintiles indicates priority areas for future resource allocation. Inequalities between wealth quintiles within regions may indicate the need for regional-level policy and planning, in addition to national-level policies, to combat inequity. The spatial variation and inequality in coverage of unimproved water should be addressed to realise the basic human right to improved water access and to achieve the Sustainable Development Goals. This study also demonstrates the potential of spatial analysis techniques to detect inequalities in access to improved water at regional and national level. Household head education and wealth quintile were also significantly associated with point-of-use water treatment, suggesting the need for appropriate measures to promote wide-scale use of the treatment methods.

Declarations
Ethics approval: We followed the principles and procedures of the data owner (MEASURE DHS Program). Each survey was conducted after ethical clearance was obtained from the appropriate Ethics Review Committee of the country.

Consent for publication: Not applicable.
Availability of data: The datasets used and/or analysed during the current study belong to the DHS Program. The authors can provide them in consultation with the data owner.
Conflict of interests: The authors declare that they have no competing interest.
Funding: Not applicable.
Authors' contribution: YTD and AG conceived and designed the study, analysed the data, interpreted the results and drafted the manuscript. Both authors have read and approved the final manuscript.

Figure 2
Percentages of clusters categorized by the raw coverage rates of unimproved water by quintile. The red and pink colors indicate clusters with the highest percentage of unimproved water (60-100%). The percentage of clusters with high unimproved water coverage declined from the poorest to the richest quintile.

Figure 3
Percentages of clusters categorized by the raw coverage rates of unimproved water. The pattern shows the distribution of unimproved water coverage for the overall population and by quintile. Clusters with a higher percentage of unimproved water coverage were located in the northern, southern and eastern parts of the country.
The red and pink colors indicate clusters with the highest percentage of unimproved water.

Figure 4
Numbers and percentages of clusters categorized by the spatially smoothed coverage rates of unimproved water. The pattern shows the distribution of unimproved water coverage for the overall population and by quintile. The percentage of unimproved water coverage in the first three quintiles showed a considerable change compared to the raw coverage (see Fig 2). The change differed among quintiles, and clusters with a small number of households showed considerable change under spatial smoothing. The rates in quintiles 3 and 4 were the most intensively under- and overestimated, respectively.

Figure 5
Numbers and percentages of clusters categorized by the spatially smoothed coverage rates of unimproved water. An increasing trend in the unimproved water coverage rate was observed in the north, south and east (see Fig 3).

Figure 6
Clustering trends of coverage rates before (a) and after (b) spatial smoothing. Clustering of spatial outliers was eliminated in the spatially smoothed coverage. The red colors indicate clusters with the highest percentage of unimproved water, while the blue colors show clusters with low unimproved water coverage.