5.1 Study location
Malawi is a country in south-eastern Africa (Fig. 4). Its population of over 20 million is mostly (84%) rural14. It is undergoing rapid demographic change, with annual population growth of 2.6%64 and urbanisation expected to result in 60% of the population being classed as urban by 206044. Malawi is one of the poorest countries in the world, with a largely agro-based economy (employing over 80% of the population) that is particularly vulnerable to climatic shocks64. Tropical cyclones and droughts have become more severe and frequent, causing substantial loss of life, economic impact, and environmental damage, including to groundwater supplies9.
Groundwater provides the main source of drinking water for 85% of Malawi’s population4, mainly accessed from boreholes/tube-wells14. Currently only 4.9% meet the requirement of SDG 6.1.1, ‘having improved drinking water source located on premises, free of E. coli and available when needed’ 8,14. Over 90% of the population use pit-latrines as their primary form of sanitation 6,14,50,54.
5.2 Spatially explicit population estimation
Using a similar methodology to that outlined in Boke-Olen et al. (2017)40 and summarised in Fig. 5, we generated a 3 arc-second resolution (approx. 90m in Malawi) spatially explicit gridded population projection from 2000 to 2070. The WorldPop 2000 unconstrained, 100m resolution population count for Malawi provided the initial spatial population distribution for the year 2000 at 3 arc-second resolution41–43. Locations of major roads in Malawi were accessed from the open-source Malawi Spatial Data Platform (MASDAP)65. A raster file of the distance to population centres in Malawi was calculated using the COGravity function in the SDMTools package66 in R67. A unique spatial population grid was generated by combining the spatial population distribution, distance to roads, and distance to population centres raster files, providing a population distribution weighted towards areas surrounding roads and population centres. The modified spatial population distribution was assigned to urban and rural areas based on the fraction of the cell classed as urban in 0.25-degree cells (approximately 39 km in Malawi) from Hurtt et al. (2011)47 (Fig. 5). Hurtt et al. (2011)47 provides urban fractions based on both socioeconomic and emissions scenarios. We assumed all scenarios follow a medium stabilisation emissions scenario, Representative Concentration Pathway (RCP) 6.068,69. In areas with a small proportion of cells classed as urban, there is a potential overconcentration of the population into urban cells. The urban population was therefore distributed over a greater area by dividing the urban fraction outlined in Hurtt et al. (2011)47 by an ‘Urban Fraction Smoothing Factor’ (UFSF), ranging from 0 to 1.
Multiple socioeconomic scenarios of population growth and urbanisation were considered using the five Shared Socioeconomic Pathway (SSP) scenarios, which project population and urbanisation levels under hypothetical socioeconomic conditions38. SSP1 and SSP5 are low population growth scenarios with high urbanisation. SSP3 and SSP4 are high population growth scenarios with low and high urbanisation respectively, and SSP2 represents a ‘middle of the road’ scenario with moderate population growth and urbanisation38. The projected urban/rural population for a given SSP scenario was distributed between the respective urban and rural cells based on the weighted population value of each cell. This was repeated iteratively for subsequent years to produce population projections. The approach is summarised in Fig. 5.
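The allocation step can be illustrated with a minimal Python sketch (the study itself was carried out in R). It assumes that a scenario total is shared among cells in direct proportion to their weights; the function name `allocate_population` and the example numbers are hypothetical and are not taken from the study.

```python
def allocate_population(total, weights):
    """Share `total` people among cells in proportion to their weights."""
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("at least one cell must have a positive weight")
    return [total * w / weight_sum for w in weights]


# Hypothetical example: 1,000 projected urban residents shared over three
# urban cells whose weighted population values are 5, 3 and 2.
urban_cells = allocate_population(1000, [5.0, 3.0, 2.0])
```

Applying the same function to the rural total and the rural cells, and repeating for each projection year, would mirror the iterative procedure described above.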
5.3 Validation of population estimates
To validate our population estimates, the projected population distribution for the year 2020 (20 years of modelled distribution) was compared to the WorldPop 2020 population distribution at 3 arc-second and 30 arc-second resolution42,43. The results for different Urban Fraction Smoothing Factors (UFSF) were compared to WorldPop 2020 spatial population distributions at 100m and 1km resolution; both UN-adjusted and non-adjusted distributions were used as references42,43. Results are summarised in Extended Data Tables 1 and 2. The Root Mean Squared Error (RMSE)70,71 of the difference between the projected population raster and reference population raster was calculated using Eq. (1).
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(P_n - R_n\right)^{2}} \tag{1} \]
Where N is the number of cells within the raster file, n is the given cell investigated, P is the projected raster and R is the reference raster.
As the RMSE value can be strongly influenced by individual outliers71, we also calculated the percentage of cells in which the projected population differed from the reference WorldPop raster42,43 by more than 1, 10 or 100 people at 3 and 30 arc-second resolutions. For comparison, the RMSE of the Boke-Olen et al. (2017)40 projection (year 2020, scenario SSP2 RCP6) at 30 arc-second resolution was calculated against the WorldPop 2020 (UN-adjusted, 1km resolution) population42,43.
To compare available gridded population databases for Malawi, the total population count for 2020 was calculated for the WorldPop datasets (UN-adjusted and non-adjusted, at 100m and 1km resolution)42,43, LandScan49, the Boke-Olen et al. (2017) projected populations (SSP2 RCP6)40, and the model presented here. The percentage error of each relative to the World Bank 2020 population estimate for Malawi was calculated48.
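The two validation measures can be sketched in Python as follows. This is an illustration only: rasters are represented as flattened lists of cell values, "differed by more than" is assumed to mean the absolute difference, and the function names and example values are hypothetical.

```python
import math


def rmse(projected, reference):
    """Root mean squared error between two equally sized lists of cell values."""
    if len(projected) != len(reference) or not projected:
        raise ValueError("rasters must be non-empty and the same size")
    squared = [(p - r) ** 2 for p, r in zip(projected, reference)]
    return math.sqrt(sum(squared) / len(squared))


def percent_cells_exceeding(projected, reference, threshold):
    """Percentage of cells whose absolute difference exceeds `threshold` people."""
    if len(projected) != len(reference) or not projected:
        raise ValueError("rasters must be non-empty and the same size")
    count = sum(1 for p, r in zip(projected, reference) if abs(p - r) > threshold)
    return 100.0 * count / len(projected)


# Hypothetical flattened rasters (people per cell).
projected = [0.0, 12.0, 150.0, 3.0]
reference = [0.0, 10.0, 30.0, 3.5]

error = rmse(projected, reference)
shares = {t: percent_cells_exceeding(projected, reference, t) for t in (1, 10, 100)}
```

In this toy example a single cell with a difference of 120 people dominates the RMSE, while the threshold percentages show that most cells agree closely, which is the motivation for reporting both measures.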
5.4 Sanitation policy scenarios
The rural and urban population distributions were divided into administrative districts, using boundaries available from MASDAP65. Demographic and Health Survey (DHS) 2015–2016 data were used to indicate the level of pit-latrine adoption for rural and urban populations in each district50. The DHS 2015–2016 is the most recent survey providing a breakdown of sanitation facility usage in urban and rural contexts, alongside district-level data on ‘improved’ and ‘unimproved’ sanitation access50. For each district, the ratio of improved/unimproved sanitation use, for both urban and rural contexts, was used to scale the national percentage of the population utilising each type of sanitary facility. The percentage of the population in each district (rural and urban) using pit-latrines was multiplied by the spatial population distribution to estimate the distribution of pit-latrine users (Fig. 6).
Three sanitation policy scenarios were proposed. Scenario A assumed that, from 2020–2070, the percentage of the population using pit-latrines remains the same as in 2015 using the pit-latrine usage data from the DHS 2015-16 survey50.
Scenario B assumed that the percentage of the urban and rural population using pit-latrines follows a linear model from the 2015-16 district pit-latrine usage50 to a 2070 forecast. The 2070 forecast was estimated by modelling the percentage of the population that will be using flush toilets (to septic tanks or sewerage systems) in 2070, applying a simple linear regression model using the lm() function in the stats package in R67, and assuming the remaining population will be using pit-latrines. The model assumed that Malawi would achieve its target of ending open defecation (OD), largely through pit-latrine promotion. Whilst each district has a different pattern of change in the number of pit-latrine users, this scenario produced a national increase in pit-latrine use.
Scenario C assumed an increase in the provision of flush toilets to septic tanks and sewers from 2015 to 2070, modelled on the change in the percentage of the population using flush toilets observed in Botswana from 2001 to 201151,52. From 1981 to 2011, the percentage of the Botswanan population using their own flush toilet increased from 8.6% to 25.2%51,52. The linear trend of flush toilet adoption in Botswana was applied to the Malawi case study by adjusting the intercepts to the percentage of flush toilet usage in Malawi according to the DHS 2015-16 survey50 for rural and urban contexts. The remaining population in 2070 was assumed to use pit-latrines. The model assumes an overall reduction in the percentage of the population using pit-latrines through promotion of flush toilets, and that Malawi ends OD.
For Scenarios B and C, annual estimates of pit-latrine use were made for each district from a linear model (lm() function, R stats package67) running from the 2015/16 district pit-latrine usage levels50 to the 2070 projections. Scenarios B and C are summarised in Extended Data Fig. 2.
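The annual interpolation can be sketched in Python under a simplifying assumption: with only a start value and an end value, a fitted straight line reduces to linear interpolation between the two points. The study used lm() in R; the function name `linear_trajectory` and the district values below are hypothetical.

```python
def linear_trajectory(start_year, start_value, end_year, end_value):
    """Return {year: value} along the straight line joining the two end points."""
    if end_year <= start_year:
        raise ValueError("end_year must be later than start_year")
    slope = (end_value - start_value) / (end_year - start_year)
    return {
        year: start_value + slope * (year - start_year)
        for year in range(start_year, end_year + 1)
    }


# Hypothetical district: 80% pit-latrine use in 2015 rising to 91% by 2070.
usage = linear_trajectory(2015, 80.0, 2070, 91.0)
usage_2020 = usage[2020]
```

Multiplying each year's percentage by that year's gridded population would then give the annual number of pit-latrine users per cell.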
5.5 Cumulative faecal loading
Spatial estimates of pit-latrine users for different years, SSPs and sanitation policy scenarios were calculated as the product of the spatially explicit population and pit-latrine usage estimates. To evaluate spatial differences in latrine user density, the estimated number of latrine users was subdivided into river sub-catchments, referred to as water resource units (WRUs)72.
The quantity of excreta loaded into each WRU was calculated to identify WRUs at risk of groundwater contamination, with a focus on nitrogenous contamination.
To calculate the volume of faecal waste, the number of latrine users was multiplied by the estimated volume of faecal matter per capita per year (270L/year, based on an extensive study of pit-latrine loading in Kampala, Uganda59). Other studies estimated volumes of excreta ranging from 100L to 1000L per capita per year59,73,74. The cumulative loading of faecal waste was calculated by summing the volume of excreta per year produced by users from 2020 to 2070 for each WRU. The number of latrine users was also multiplied by the estimated chemical composition of faecal waste to estimate the total mass of chemicals in the waste (12.5, 1.5, 3.5 and 30 g/ppd for nitrogen, phosphorus, potassium and carbon respectively)58,60. The cumulative faecal load was divided by the area of the WRU to estimate the spatial density of faecal waste loading (Fig. 6).
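The loading arithmetic for a single WRU can be sketched in Python as below. The per-capita volume and nitrogen values are those quoted in the text; reading "g/ppd" as grams per person per day, using a 365-day year, and the user counts and WRU area are assumptions made for illustration.

```python
LITRES_PER_USER_PER_YEAR = 270.0       # per-capita volume quoted in the text
NITROGEN_G_PER_PERSON_PER_DAY = 12.5   # nitrogen value quoted in the text


def cumulative_volume_litres(users_by_year):
    """Sum the annual excreta volume over all years for one WRU."""
    return sum(users * LITRES_PER_USER_PER_YEAR for users in users_by_year.values())


def cumulative_nitrogen_kg(users_by_year, days_per_year=365):
    """Cumulative nitrogen mass in kg (assumes g/ppd means g per person per day)."""
    grams = sum(
        users * NITROGEN_G_PER_PERSON_PER_DAY * days_per_year
        for users in users_by_year.values()
    )
    return grams / 1000.0


def loading_density(cumulative_load, area_km2):
    """Cumulative load per square kilometre of WRU."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return cumulative_load / area_km2


# Hypothetical WRU with two years of latrine-user estimates and an area of 50 km2.
users_by_year = {2020: 1000, 2021: 1100}
volume = cumulative_volume_litres(users_by_year)
nitrogen = cumulative_nitrogen_kg(users_by_year)
density = loading_density(volume, 50.0)
```

Extending `users_by_year` to cover 2020 to 2070 gives the cumulative totals described above; the same pattern applies to phosphorus, potassium and carbon with their respective per-capita values.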
5.6 Latrine density
An extensive survey of pit-latrines, waste sites and water points in Malawi was conducted by the Government of Malawi through the Climate Justice Fund Water Futures Programme (CJFWFP) from 2012 to 2020, using semi-structured interviews of stakeholders at each facility. Trained staff delivered interviews in both Chichewa and English and provided the location of each site with a photograph of the facility. Responses were hosted on the data-platform mWater75. Quality control was provided by the University of Strathclyde and ethical approval through the Government of Malawi. Data cleaning involved the removal of incomplete and duplicate responses.
The most comprehensively mapped district of Malawi was Chikwawa (Fig. 4), with most surveys conducted in 2017. Case studies from Chikwawa were therefore used to approximate the population per pit-latrine. The district was divided into rural and urban areas based on the 2017 population; 3 urban and 3 rural regions were selected and the number of surveyed pit-latrines within each case-study area was summed. The number of pit-latrines was divided by the estimated population for each area, calculated from the WorldPop 100m population estimate for the year 201742,43. The urban and rural case studies were averaged to estimate the number of latrine users per latrine in urban and rural contexts. To estimate the number of pit-latrines, the number of pit-latrine users was divided by the number of users per pit-latrine for urban and rural cases.
To identify water-point contamination risk from pit-latrines, cells were classified according to the number of pit-latrines in each 3 arc-second grid cell. The equivalent distance a pit-latrine would be from a water-point in a 3 arc-second cell for a given latrine density was estimated to indicate the associated risk. The number of latrines likely to be within a given radius of a water-point was estimated from the density of latrines using Eq. (2):
(2)
Where N is the number of pit-latrines within a grid cell of side length l necessary to have a 95% probability that at least one latrine will be within a radius r of a centrally located water-point.
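One way to arrive at a relationship of this kind is sketched below in Python. This is an illustration under stated assumptions, not necessarily the form used in Eq. (2): latrines are assumed to fall independently and uniformly at random within a square cell, the water-point sits at the cell centre, and r is at most l/2 so that the circle lies inside the cell. A single latrine then falls within the radius with probability πr²/l², and the smallest N giving at least the target probability follows from 1 − (1 − πr²/l²)^N ≥ 0.95. The function name and the 30m radius are hypothetical.

```python
import math


def latrines_for_probability(cell_length, radius, probability=0.95):
    """Smallest number of uniformly placed latrines such that at least one lies
    within `radius` of the cell centre with the given probability."""
    if not 0 < radius <= cell_length / 2:
        raise ValueError("radius must be positive and at most half the cell length")
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    p_single = math.pi * radius ** 2 / cell_length ** 2
    return math.ceil(math.log(1 - probability) / math.log(1 - p_single))


# Hypothetical example: a 90m cell (approx. 3 arc-seconds) and a 30m radius.
n_required = latrines_for_probability(90.0, 30.0)
```

Smaller radii require more latrines per cell to reach the same probability, so each latrine-density class can be associated with an equivalent latrine to water-point distance.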
Estimating the radius from a central water-point enabled comparison of latrine density estimates to the wider body of literature relating the water-point contamination risk to the distance to a pit-latrine4,18–28.
The CJFWFP water-point survey geolocated 127,000 improved and unimproved water-points across Malawi, enabling identification of water-points at high risk of contamination45. ‘Vulnerable water-points’ were defined as boreholes, tube-wells or dug wells (both protected and unprotected) that were functional and in use (but not primarily for agricultural or livestock use). Point locations of vulnerable water-points were aggregated into pixels, at 3 arc-second resolution, to generate a binary raster of vulnerable water-point presence/absence. Latrine density was considered in cells containing a ‘vulnerable’ water-point. Cells containing a vulnerable water-point in which the density of latrines exceeded a threshold density were identified (Fig. 6).
To account for spatial variation in population distribution and the locations of sanitation and water facilities, 3 arc-second grids were aggregated. The percentage of 3 arc-second cells within a 30 arc-second grid containing a vulnerable water-point with latrine densities exceeding the latrine thresholds was summarised under multiple scenarios.
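The aggregation from 3 to 30 arc-second resolution can be sketched in Python as a block summary over 10 × 10 fine cells. The sketch assumes that the percentage is taken over all fine cells in each coarse block (rather than only those containing a water-point) and that grid dimensions are exact multiples of the block size; the function name, threshold and toy grids are hypothetical.

```python
def percent_at_risk(has_water_point, latrine_count, threshold, block=10):
    """For each block x block window, return the percentage of fine cells that
    contain a vulnerable water-point and exceed the latrine threshold."""
    rows, cols = len(latrine_count), len(latrine_count[0])
    if rows % block or cols % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    result = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            flagged = sum(
                1
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
                if has_water_point[r][c] and latrine_count[r][c] > threshold
            )
            out_row.append(100.0 * flagged / (block * block))
        result.append(out_row)
    return result


# Toy 2 x 4 grids aggregated in 2 x 2 blocks (real use would pass block=10).
water = [[1, 0, 0, 0],
         [0, 1, 0, 0]]
latrines = [[8, 9, 0, 2],
            [1, 3, 7, 0]]
coarse = percent_at_risk(water, latrines, threshold=5, block=2)
```

Running the same summary for each year, SSP and sanitation scenario would give the scenario comparison described above.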
CJFWFP water-point data45 identified whether water-points were within 100m of a latrine (Government of Malawi recommended spacing23), and was used for model validation. The percentage of vulnerable water-points within 100m of a latrine was calculated. Eq. (3) enabled comparison of the percentage of cases in which a water-point was within 100m of a latrine with the percentage of cases in which a water-point was found within the same 3 arc-second grid cell as a latrine:
(3)
Where Pg is the percentage of water-points with a pit-latrine within the same grid cell of length l, and Pr is the percentage of water-points with a pit-latrine within a radius, r. This assumes an even distribution of latrines within the cell and a centrally located water-point.
Visual inspection was used to compare the locations of pit-latrines from the CJFWFP sanitation survey with the modelled latrine density for 2020; an example is shown in Extended Data Fig. 5.
5.7 Methodological Assumptions
To evaluate microbial contamination risk, the distance of modelled pit-latrines to vulnerable water-points was estimated from pit-latrine density, Eq. (2). This assumes that water-points are centrally located within a cell known to contain a water-point; water-points could actually be located anywhere within the 3 arc-second cell. Only the density of latrines within the cell containing a water-point is considered for determining the risk of contamination, and the dispersion of latrines within the cell is assumed to be random. This may result in underestimation of the contamination risk in cases where the water-point is located at the edge of a grid cell and is at risk of pit-latrine contamination from neighbouring cells. There may be overestimation in cases where the water-point is located far from the pit-latrines within the cell. This was mitigated by aggregating data from 3 arc-second to 30 arc-second resolution, identifying regions with high microbial contamination risk. The model also assumes radial groundwater flow, i.e., preferential flow in the predominantly weathered and fractured rock72 is not accounted for.
Only cells with a functional and in-use borehole, tube-well or dug well (vulnerable water-points) were used to estimate the contamination risk. From 2020 to 2070, these water-points may be abandoned, and new water-points constructed. It is also likely there will be more cells containing water-points in 2070 than assumed in the model, due to increased water-point construction to meet the needs of the growing population44. This study may underestimate the number of vulnerable boreholes if there is significant growth in borehole numbers. Transition from vulnerable water-points to taps and piped water supplies is also not accounted for76, as there is no information currently available on which to model these changes. Finally, water-point presence/absence is a binary measure. If more than one vulnerable water-point is present within a cell, the model may underestimate the contamination risk. These are considered acceptable limitations, as the purpose of the study is the identification and prioritisation of areas for policy and management intervention, which will still be identified in these cases.
Latrines are assumed to be co-localised with the population, an assumption employed in the literature13. It is recommended that latrines should be no more than 50m from houses77, therefore they are assumed to be within the same 90m grid cell as the modelled population. The model accounts for the number of users sharing a latrine by calculating the number of latrine users in the rural and urban areas of the Chikwawa case study from the CJFWFP survey45. In areas with very high population density, there may be more latrine users per latrine and therefore a lower latrine density than modelled. However, as it is recommended that no more than 20 latrine users share a latrine77, these should still be identified as areas for intervention. Equally, the study may underestimate pit-latrines in very sparsely populated areas as fewer users may share a latrine in this context. Given the focus of this paper is on areas of high latrine density, this is not considered a significant limitation.
The cumulative quantity of faecal waste was used to estimate the mass of residual contaminants in the ground after pit-latrine abandonment (nitrogen, phosphorus, potassium and carbon). The model does not estimate the concentrations of contaminants in groundwater. While the model divides the cumulative loading of waste by the area of WRUs to give an indication of faecal waste density, data are not currently available on the pathways for contaminant mobility or the total volume of groundwater in each WRU; it was therefore not possible to estimate the concentration within groundwater. Despite these limitations, the indication of areas with a high risk of chemical contamination should guide further research and monitoring.
Microbial and chemical contamination risks of water-points from pit-latrines assume that there are no barriers to groundwater contamination from faecal waste. Pit-latrines are assumed to be unlined and not undergoing pit-latrine emptying. To validate this, the percentage of pit-latrines that were lined and the percentage subject to pit-latrine emptying were calculated based on results from the national CJFWFP sanitation survey. Results were calculated for both the whole of Malawi and for the district of Chikwawa (Fig. 4).