Guiding placement of health facilities using malaria criteria and interactive tool

Background: Access to healthcare is important in controlling malaria burden and, as a result, distance or travel time to health facilities is often a significant predictor in modeling malaria prevalence. Adding new health facilities may reduce overall travel time to health facilities and may decrease malaria transmission. To help guide local decision makers as they scale up community-based accessibility, we explore how the allocation of new health facilities might influence malaria prevalence in Bunkpurugu-Yunyoo district in northern Ghana. We perform a location-allocation analysis to find optimal locations of new health facilities by minimizing three district-wide objectives separately: malaria prevalence, malaria incidence, and average travel time to health facilities.
Methods: We used a generalized additive model to model malaria prevalence as a function of travel time to the nearest health facility and other geospatial covariates. The model predictions were used to calculate the optimization criteria for the location-allocation analysis. This analysis was performed for two scenarios: adding new health facilities to the existing ones, and a hypothetical scenario in which the community-based healthcare facilities would be allocated anew. We created an interactive web application to facilitate efficient presentation of this analysis and allow users to experiment with their choice of health facility location and optimization criteria.
Results: Using malaria prevalence and travel time as optimization criteria, we found two locations that were not covered by existing community-based health services and that would benefit from new health facilities, regardless of scenario. Due to the non-linear relationship between malaria incidence and prevalence, the optimal locations chosen using the incidence criterion tend to be inequitable and differ from those based on the other optimization criteria.
Conclusions: Our findings underscore the importance of using multiple optimization criteria in the decision-making process. We believe that our analysis and interactive application can be repurposed for other regions and criteria, bridging the gap between science, models and decisions.


Background
Access to quality health care is an important health system goal [1]. In particular, achieving universal health coverage, which includes access to quality health care, medicines, and vaccines for all, is emphasized in the United Nations Sustainable Development Goals [2]. While there are many factors that contribute to healthcare accessibility, such as cost [3,4] and social network systems [5], geographic distance or travel time is often recognized as a significant impediment to effective treatment [6,7,8]. In the case of malaria, accessibility (distance or travel time) to nearby health facilities has long been recognized as a significant factor for controlling malaria burden [9,10] and a significant predictor of malaria prevalence (e.g., [11,12,13]).
In our study area in northern Ghana, distance to health facilities, alongside other geospatial predictors such as distance to urban center, amount of vegetation, and elevation, was found to be significantly associated with malaria infection [14,15].
Consequently, adding new health facilities in the district may reduce overall travel time to health facilities and, in turn, may help decrease malaria transmission.
Ghana has been expanding the coverage of the Community-based Health Planning and Services (CHPS) program, for example, in the Upper East Region through the Ghana Essential Health Intervention Project (GEHIP) [16,17]. CHPS aims to improve geographical access to health care with an initial focus on remote and rural areas. The primary function of the CHPS program is to train and place community health officers in under-served communities, providing vital medical services including rapid diagnostic testing followed by malaria treatment [17,18].
Besides the community health officers, CHPS also relies on community volunteers to provide healthcare services not only at the health post but also through door-to-door services. Although CHPS and GEHIP are not specific to malaria, malaria is strongly associated with many of their evaluation indicators, such as child mortality rates and health of children under five, in highly endemic areas such as northern Ghana [17]. The CHPS program has been shown to achieve a 49% reduction in under-five mortality rates relative to comparison districts [19,20]. Our study is focused on estimating the potential impact of new health facilities on malaria to help guide local decision makers as they scale up community-based accessibility using CHPS compounds. These estimates can then be used to conduct a location-allocation analysis, i.e., to determine locations for new health facilities that optimize some criterion [21]. For example, one could maximize the reduction in area-wide malaria prevalence or incidence. Calculating these metrics requires an understanding of the population distribution and the relationship between malaria prevalence and accessibility to local health facilities.
Interactive visualizations can be particularly useful for location-allocation analyses given that these analyses can result in many charts and maps depending on the number of evaluated scenarios and optimization criteria (e.g., [22,23]). Importantly, previous research has demonstrated that interactive information can have greater impact than passive information (e.g., [24,25]). Furthermore, the application can also serve as an interactive simulator that allows users to create and explore their own scenarios regarding health facility allocation and how this influences accessibility and health outcomes [26].
In this article, we present an interactive decision support application that enables users to place hypothetical new health facilities (or replace existing ones) and test how the new spatial configuration of health facilities may influence malaria burden.
Our decision support tool can also be used to determine the optimal location of health facilities. To create the application, we first model the malaria prevalence using travel time to health facilities among other covariates. We then determine the optimal locations for new health facilities based on one of three criteria: overall malaria prevalence, incidence, and the average travel time to nearest health facilities. Based on these results, we compare our findings regarding the optimal vs. the current location of health facilities, revealing two new areas in the district that could substantially benefit from CHPS compounds. We also discuss why differences arise when using each of these three criteria, highlighting the importance of using a multi-criteria strategy to optimize the location of health facilities.

Data
The Bunkpurugu-Yunyoo district (Fig. 1), a rural district in northeastern Ghana bordering Togo, has historically been hyperendemic for malaria and has experienced high transmission during the rainy season. Between 2010 and 2013, six cross-sectional household surveys were conducted to assess the impact of an indoor residual spraying (IRS) campaign on malaria parasitemia in Bunkpurugu-Yunyoo. Baseline end-of-rainy-season (peak) and end-of-dry-season (trough) surveys were conducted in 2010 to 2011. Annual IRS was introduced at the end of the 2011 dry season. Peak and trough season surveys were repeated in 2011-12 and 2012-13, for a total of six surveys [27,15].
In each survey, households were selected using a multi-stage randomized cluster sampling approach, with clusters sampled with probability proportional to population size and households randomly selected within these clusters. Children under five years of age in the selected households were tested for malaria using both a rapid diagnostic test (RDT) and microscopy. Between 1311 and 2040 children were sampled in each survey.
The survey datasets were enhanced by the incorporation of remotely sensed data on environmental risk factors for malaria. Details of the survey design and epidemiological evaluation can be found in [27,15]. Importantly, spatial heterogeneity of malaria prevalence was high, ranging from 19% to 90% across the district [15] (Fig. 1).
In this study, we used the datasets from the three peak-season surveys in 2010 to 2012, as the overall prevalence, and the spatial variation in prevalence, were similar among the three surveys. In these datasets, each sample was geocoded according to its cluster (i.e., children within the same cluster shared the same geographical coordinates). We used the binary microscopy outcome as the malaria infection status (0 = negative, 1 = positive).

Spatial prediction of malaria prevalence
To optimize locations of new health facilities using malaria criteria, we used a Generalized Additive Model (GAM) to determine the relationship between malaria prevalence and geospatial covariates, including travel time to the nearest health facility.

Geospatial covariates
We chose travel time to the nearest health facility, distance to urban centers, elevation, normalized difference vegetation index (NDVI), and population density as the geospatial covariates (See Supplementary Information for their spatial distribution). These were the most important covariates according to the output of a variable selection procedure adopted in previous studies using the same dataset [14,15].
We calculated travel time to the nearest health facility using the global travel time surface provided by the Malaria Atlas Project [28]. This comprehensive dataset estimates how long it takes humans to travel through a landscape by combining political, infrastructural, and environmental information sources to create a 1 km² resolution "friction surface" for the entire globe. Using this friction surface, we calculated the time (in minutes) it takes to travel to the nearest health facility, using geo-coordinates obtained from Ghana Health Services, based on the least-cost path algorithm [29]. This analysis accumulates the cost of moving through each pixel to estimate a more realistic path (and therefore time/distance) than Euclidean distance. The least-cost path analysis was performed using the gdistance package in R [30].
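The idea of accumulating movement cost over a friction surface can be illustrated with a minimal Python sketch. It is not the gdistance implementation used in the study: it assumes a toy grid whose cells hold the minutes needed to cross them (a hypothetical unit), allows only four-neighbor moves, and charges each step the average friction of the two cells involved. The function name `travel_time_surface` and the toy values are illustrative only.

```python
import heapq

def travel_time_surface(friction, facilities):
    """Multi-source Dijkstra: minutes from every cell to its nearest facility.

    friction   -- 2-D list, minutes needed to cross each cell (assumed unit)
    facilities -- list of (row, col) cells that contain a health facility
    """
    n_rows, n_cols = len(friction), len(friction[0])
    time = [[float("inf")] * n_cols for _ in range(n_rows)]
    queue = []
    for r, c in facilities:
        time[r][c] = 0.0
        heapq.heappush(queue, (0.0, r, c))

    while queue:
        t, r, c = heapq.heappop(queue)
        if t > time[r][c]:
            continue  # stale queue entry, a shorter route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                # Cost of one step: average friction of the two cells involved.
                step = 0.5 * (friction[r][c] + friction[rr][cc])
                if t + step < time[rr][cc]:
                    time[rr][cc] = t + step
                    heapq.heappush(queue, (t + step, rr, cc))
    return time

# Toy 3 x 3 landscape: the upper part of the middle column is slow terrain.
friction = [[1.0, 9.0, 1.0],
            [1.0, 9.0, 1.0],
            [1.0, 1.0, 1.0]]
surface = travel_time_surface(friction, facilities=[(0, 0)])
```

In this toy example the top-right cell is only two cells away from the facility, yet the cheapest route detours around the slow terrain, which is why travel time can differ between places at the same Euclidean distance.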
Regarding the other geospatial variables, distance to the nearest urban center (in kilometers) was based on Euclidean distance to the nearer of the two settlements with populations larger than 5000: Bunkpurugu and Nakpanduri (Fig. 1). Elevation (meters above sea level) was based on the 90 m resolution digital elevation map from the Consortium for Spatial Information [31], whereas vegetation was represented using NDVI from MODIS and was based on the maximum monthly index 30 days prior to the survey. Finally, population density was sourced from the five-year stratified WorldPop 2014 population estimates [32].
All covariates, except for distance to health facilities, were raster-based covariates that were interpolated from the nearest four cells of a given coordinate.
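Interpolating a raster value from the four nearest cells is commonly done by bilinear interpolation. The sketch below assumes that this is the scheme meant here, that cell centers sit at integer (column, row) positions, and that the raster is a plain nested list; the function name `bilinear` and the toy elevation values are hypothetical.

```python
import math

def bilinear(raster, x, y):
    """Bilinear interpolation at fractional position (x = column, y = row).

    Assumes cell centers lie at integer coordinates and that the raster has
    at least two rows and two columns.
    """
    n_rows, n_cols = len(raster), len(raster[0])
    # Clamp the lower-left index so the 2 x 2 window stays inside the raster.
    x0 = min(max(int(math.floor(x)), 0), n_cols - 2)
    y0 = min(max(int(math.floor(y)), 0), n_rows - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * raster[y0][x0] + fx * raster[y0][x0 + 1]
    bottom = (1 - fx) * raster[y0 + 1][x0] + fx * raster[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom

# Toy elevation raster (meters); a survey cluster lies between the four cells.
elevation = [[100.0, 200.0],
             [300.0, 400.0]]
cluster_elevation = bilinear(elevation, 0.5, 0.5)
```

A point exactly on a cell center returns that cell's value, and a point midway between four cells returns their average.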

Model fitting and prediction
We modeled individual malaria infection status using a GAM with a logit link function. We applied a thin-plate cubic spline only to elevation, which was found to exhibit a strong non-linear pattern. All other predictors were fitted as linear terms without splines. This was the best configuration in terms of cross-validation error based on our preliminary analysis. We also added the survey year as a categorical variable to the model, allowing the intercept to vary from year to year. The GAM was fitted using the mgcv package in R [33].
A 1 km² resolution grid was created over the district and prevalence was estimated for each grid cell. Travel time to health facilities was calculated using the centroid of each grid cell, and the other covariates were extracted from the rasters using methods similar to those described in the previous section. The fitted GAM was then used to predict malaria prevalence in each pixel across the district.
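The model structure described above can be written compactly as follows. The notation is ours and is only a sketch: $y_k$ is the microscopy result of child $k$, $q_k$ the corresponding infection probability, $t(k)$ the survey year of child $k$, and TT, DU, NDVI, POP and ELEV stand for travel time, distance to urban center, vegetation index, population density and elevation.

```latex
y_k \sim \mathrm{Bernoulli}(q_k), \qquad
\operatorname{logit}(q_k) = \beta_0 + \gamma_{t(k)}
  + \beta_1\,\mathrm{TT}_k + \beta_2\,\mathrm{DU}_k
  + \beta_3\,\mathrm{NDVI}_k + \beta_4\,\mathrm{POP}_k
  + f(\mathrm{ELEV}_k)
```

Here $\gamma_{t(k)}$ is the year-specific shift of the intercept and $f$ is the smooth spline term for elevation; every other covariate enters linearly on the logit scale.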

District-wide metric as optimization criteria
We optimized the location of new health facilities using one of three district-wide metrics that are likely to be relevant for decision makers when placing new health facilities: (a) the expected malaria prevalence of all children under five in the peak season, (b) the expected incidence of malaria cases for all ages per person-year observed during the high transmission season, or (c) the expected travel time to the nearest health facility.
We weighted the predicted prevalence of each pixel according to its population, yielding the district-wide malaria prevalence:
$$\bar{p} = \frac{\sum_{i=1}^{N} p_i m_i}{\sum_{i=1}^{N} m_i},$$
where $p_i$ and $m_i$ are the predicted prevalence and the population under five for pixel $i$, respectively, and $N$ is the total number of pixels in the district.
To calculate the expected incidence per year, we first converted the predicted prevalence for children under five to the prevalence for children aged two to ten years using the method outlined by [34]. We then used the equations from [35] to convert the predicted prevalence for children aged 2 to 10 years to the expected incidence per person-year for children under five, older children (5 to 15 years old) and adults (> 15 years old). Finally, using the population of each pixel and the age structure of the northern region estimated from the 2014 Demographic Health Survey [36], we estimated the expected incidence per year over all age groups in the district.
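One way to write the final aggregation step is sketched below. The symbols are hypothetical and not taken from the original: $n_i$ is the total population of pixel $i$, $w_a$ the share of age group $a$ in the regional age structure (with $\sum_a w_a = 1$), $p_i^{(2\text{-}10)}$ the converted prevalence in pixel $i$, and $g_a$ the age-specific prevalence-to-incidence relationship from [35], whose functional form is not reproduced here. The sketch also assumes that the same age structure applies to every pixel.

```latex
\bar{I} = \frac{\sum_{i=1}^{N} n_i \sum_{a} w_a \, g_a\!\left(p_i^{(2\text{-}10)}\right)}
               {\sum_{i=1}^{N} n_i}
```

Because $g_a$ is non-linear, $\bar{I}$ is not a simple rescaling of the district-wide prevalence, which is why the incidence criterion can favor different locations.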
We estimated the travel time to the nearest health facility for each pixel using the method outlined in the 'Geospatial covariates' section. Then, we calculated the district-wide expected travel time as the population-weighted average of travel time among all pixels. Note that, unlike malaria prevalence and incidence, the expected travel time to the nearest health facility is not a malaria-specific criterion and, as a result, is not influenced by our GAM results. We chose this optimization criterion because it may maximize overall access to healthcare.
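The population-weighted averages used for the prevalence and travel time criteria can be sketched in a few lines of Python. The pixel values below are invented purely for illustration, and `weighted_mean` is a hypothetical helper rather than part of the published code.

```python
def weighted_mean(values, weights):
    """Weighted average of per-pixel values, e.g. weighted by population."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Three toy pixels (illustrative numbers, not survey data).
prevalence = [0.80, 0.40, 0.20]   # predicted under-five prevalence per pixel
under_five = [100, 300, 600]      # children under five per pixel
travel_min = [90.0, 30.0, 10.0]   # minutes to the nearest facility per pixel
population = [500, 1500, 3000]    # total population per pixel

district_prevalence = weighted_mean(prevalence, under_five)
district_travel_time = weighted_mean(travel_min, population)
```

In this toy district the sparsely populated, high-prevalence pixel contributes little to either metric, which illustrates how population weighting shapes the district-wide criteria.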

Projecting the impact of new health facilities
One of the main goals of this modeling exercise is to predict changes in malaria prevalence and incidence if new health facilities were to be built. The procedure consists of two steps. First, given a set of coordinates for the proposed new health facilities, we recalculate the travel time to the nearest health facility by rerunning the least-cost path analysis (described in the 'Geospatial covariates' section) using the new set of health facilities (i.e., the existing and proposed health facilities). Second, we update the predicted malaria prevalence using the new travel time surface (i.e., the new predicted probabilities represent the projected prevalence if the new health facilities are built).
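The two-step update can be sketched as follows, under two stated assumptions. First, the travel time to the nearest facility in the enlarged set equals the pixel-wise minimum of the existing surface and a surface computed for the proposed facilities alone. Second, because travel time enters the model linearly on the logit scale, the change in the linear predictor is the travel time coefficient times the change in travel time. The coefficient value `beta_tt` below is hypothetical, as are the function name and the toy pixels.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(x):
    return 1 / (1 + math.exp(-x))

def project_prevalence(prev_now, tt_now, tt_proposed, beta_tt):
    """Projected per-pixel prevalence after adding the proposed facilities.

    prev_now    -- current predicted prevalence per pixel
    tt_now      -- current travel time (minutes) to the nearest facility
    tt_proposed -- travel time to the nearest *proposed* facility only
    beta_tt     -- travel time coefficient on the logit scale (hypothetical)
    """
    projected = []
    for p, t_old, t_cand in zip(prev_now, tt_now, tt_proposed):
        t_new = min(t_old, t_cand)                  # step 1: new travel time
        eta = logit(p) + beta_tt * (t_new - t_old)  # step 2: shift the logit
        projected.append(expit(eta))
    return projected

# Two toy pixels: the first gains access, the second is already well served.
projected = project_prevalence(prev_now=[0.6, 0.3],
                               tt_now=[120.0, 20.0],
                               tt_proposed=[30.0, 80.0],
                               beta_tt=0.005)
```

Only pixels whose travel time actually shortens see a change, so the projected benefit of a new facility depends on where the other facilities already are.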
Using this procedure, we determined the optimal locations of one to five new health facilities. We ran our optimization algorithm multiple times, once for each criterion to be minimized and each number of new health facilities. Identifying the optimal location of new health facilities can be a relatively high-dimensional optimization problem (e.g., five proposed new facilities imply optimization of ten values [two coordinates per facility]) and is prone to local minima. To alleviate this problem, we used the genetic algorithm in the GA package in R [37] to perform the optimization.
We investigated the optimal locations of health facilities under two scenarios: adding new facilities to the existing set of health facilities (Scenario 1) and adding new facilities after excluding the existing CHPS compounds (Scenario 2). The latter scenario allows us to examine whether the current locations of CHPS compounds are close to optimal.
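To make the optimization setup concrete, the sketch below runs a very small genetic algorithm in plain Python. It is not the GA package and makes several simplifying assumptions: the objective is a population-weighted Euclidean distance to the nearest facility for a handful of invented villages (standing in for the model-based criteria), chromosomes are flat lists of coordinates (two genes per new facility), and the selection, crossover and mutation settings are arbitrary.

```python
import random

def nearest_distance(point, facilities):
    return min(((point[0] - fx) ** 2 + (point[1] - fy) ** 2) ** 0.5
               for fx, fy in facilities)

def objective(chromosome, villages, existing):
    """Population-weighted mean distance to the nearest facility."""
    new = list(zip(chromosome[0::2], chromosome[1::2]))
    facilities = existing + new
    total_pop = sum(pop for _, _, pop in villages)
    return sum(pop * nearest_distance((x, y), facilities)
               for x, y, pop in villages) / total_pop

def genetic_minimize(fitness, n_genes, bounds, pop_size=40, generations=60,
                     mutation_sd=1.0, seed=1):
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[: pop_size // 4]  # keep the best quarter unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover, then Gaussian mutation clipped to the bounds.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [min(hi, max(lo, g + rng.gauss(0, mutation_sd)))
                     for g in child]
            children.append(child)
        population = elite + children
    return min(population, key=fitness)

# Invented villages (x, y, population) and one existing facility.
villages = [(2, 2, 500), (8, 8, 300), (8, 2, 200), (1, 9, 100)]
existing = [(5, 5)]
baseline = objective([], villages, existing)

# Two new facilities -> four genes (x1, y1, x2, y2).
best = genetic_minimize(lambda ch: objective(ch, villages, existing),
                        n_genes=4, bounds=(0.0, 10.0))
best_value = objective(best, villages, existing)
```

Keeping an elite fraction means the best solution never gets worse between generations, while the random mutations give the search a chance to escape local minima that a purely local optimizer could get stuck in.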

Interactive decision support tool
We designed a web-based application that lets users add hypothetical new health facilities to the district and see how the spatial distribution of malaria risk and the district-wide summary metrics would change in response. We developed the application using the shiny package in R [38].
The user is shown a map of Bunkpurugu-Yunyoo District, overlaid by the pre-

Relationship of malaria prevalence and covariates
Fitting the GAM to the data revealed that all of the selected covariates had statistically significant associations with malaria prevalence (p < 0.001 for all covariates except NDVI, for which p < 0.05). Expected prevalence was positively associated with travel time to health facilities and distance to urban center, and negatively associated with the other covariates (Fig. 2).

Optimal locations for new health facilities
Our results indicate that, for any optimization metric, the optimal locations chosen when optimizing for a smaller number of new facilities were a subset of the optimal locations chosen when optimizing for a larger number of new facilities. For instance, let A and B be the optimal locations chosen by the algorithm when optimizing for two new facilities using the prevalence criterion. Our results indicate that A and B are also optimal locations when optimizing for three to five new facilities using the same criterion. As a result, we can rank the importance of a location in reducing the selected metric by how frequently it appeared in the sets of optimal locations when we optimized for one to five new facilities.
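The nesting property and the frequency-based ranking can be sketched as follows. The sketch assumes that each optimal location has already been given a label (in practice, coordinates from different runs would need to be matched within some tolerance); the labels and the helper names are hypothetical.

```python
from collections import Counter

def is_nested(solutions):
    """True if each smaller solution set is contained in the next larger one."""
    sizes = sorted(solutions)
    return all(set(solutions[a]) <= set(solutions[b])
               for a, b in zip(sizes, sizes[1:]))

def rank_by_frequency(solutions):
    """Order location labels by how often they appear across all runs."""
    counts = Counter(label for labels in solutions.values() for label in labels)
    return [label for label, _ in
            sorted(counts.items(), key=lambda item: (-item[1], item[0]))]

# Hypothetical optimizer output: number of new facilities -> chosen locations.
solutions = {
    1: ["A"],
    2: ["A", "B"],
    3: ["A", "B", "C"],
}
nested = is_nested(solutions)
priority = rank_by_frequency(solutions)
```

When the solutions are nested, the location chosen for a single facility appears in every run and therefore ranks first, the location added for two facilities ranks second, and so on.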

Scenario 1: Adding new facilities to the set of existing health facilities
Optimal locations obtained by minimizing either the prevalence or the travel time criterion were similar, with only one disagreement, and these locations were markedly different from those based on the incidence criterion (Fig. 3). The discrepancy between the optimization results based on malaria prevalence versus malaria incidence can be attributed to the highly uneven spatial distribution of the population and the fact that the relationship between malaria prevalence and incidence is non-linear.
In particular, the addition of a new health facility may decrease prevalence in areas with high malaria prevalence but have little effect on the expected number of cases per year.
When the locations of health facilities were optimized using overall prevalence as the criterion, adding one to five health facilities is predicted to reduce district-wide prevalence by 0.3, 0.5, 0.6, 0.7, and 0.8%, respectively. Similarly, for district-wide incidence rates during high transmission season, the reductions were equal
Despite the relatively small changes in district-wide metrics, these proposed health facilities often have strong local impacts. For instance, Fig. 3a reveals that the top priority health facility (i.e., HF 1) reduces prevalence in the surrounding area by 6 to 10%. Importantly, these proposed health facilities do not improve the prevalence and incidence around them in a spatially uniform way (i.e., forming concentric circles around new health facilities), as illustrated in Fig. 3. This is because travel time can differ even when geographical distance is identical, and the reduction in travel time depends on the locations of the other health facilities.
Regardless of the criterion, at least two of the optimal locations overlap with the existing CHPS compounds (Fig. 4), suggesting that the positions of existing CHPS compounds were generally well chosen to reduce malaria risk. Interestingly, the location around CHPS B (in the map of Fig. 4) ranked as the most important using the prevalence or travel time criterion, while CHPS C was the second most important based on the incidence criterion. This indicates that there might be a tradeoff between the locations around CHPS B and C: the existence of one of these health facilities reduces the impact of the other.

Discussion
We modeled malaria prevalence during the high transmission season using travel time to health facilities, distance to urban centers and other environmental factors as predictors. Based on the model, we determined the optimal locations of up to five new health facilities under two scenarios (adding new health facilities to the ones that already exist vs. adding new health facilities while assuming that no CHPS compounds exist) and three different optimization criteria (maximal reduction in district-level prevalence, incidence, or travel time). We created a web-based interactive visualizer and simulator application that incorporates the various components involved in this analysis and helps stakeholders determine the best location for new health facilities.
Since we used travel time to model prevalence and assumed a linear relationship, we expected that minimizing travel time or malaria prevalence would yield very similar results. Indeed, we found that these results were generally similar, but there were also some important differences (compare panels A and C in Figs. 3 and 4). These differences likely arise from the influence of other spatial covariates on prevalence. For instance, in an area predicted to already have low malaria prevalence (due to the other spatial covariates), adding a new health facility may substantially decrease local travel time without decreasing district-wide prevalence significantly. Our results also indicate that a new health facility may not improve travel time or prevalence evenly. For example, adding a new health facility in the Yunyoo area (Location 1 in Fig. 3c) can reduce travel time much more in the neighboring area to the west than in the area to the east. The reason is that the reduction in travel time depends not only on the friction surface, but also on the locations of other health facilities. Patterns like these are not immediately obvious or intuitive, and interactive tools like ours can help users better understand these relationships.
The optimization results were substantially different when minimizing incidence, as compared with minimizing population-weighted travel time or prevalence, because of the non-linear relationship between incidence and prevalence. While this relationship was mostly linear for infants and young children, incidence plateaus at moderate and low levels of prevalence for older children and adults, respectively [35].
Because of this plateau, reducing prevalence from 80% to 60%, for instance, would not lead to a substantial decrease in overall incidence. On the other hand, in settings with low prevalence, adding new health facilities can decrease both prevalence and incidence [35]. Thus, the spatial optimizer based on the incidence criterion explicitly avoided high transmission areas even if the expected drop in prevalence was large, confining the optimal locations to the northern and eastern regions of the district.
In the first scenario, the optimizer even chose Bunkpurugu town, which already had two health facilities, as an optimal location. These results underscore the importance of using multiple optimization criteria to highlight important tradeoffs and assumptions inherent to these criteria. For example, the use of incidence as the optimization criterion in our case study would be at odds with current healthcare policies that aim to reduce barriers to geographical access to health care.
Despite these differences, there was some agreement in the optimal locations of health facilities when using these objectives. Under Scenario 2, two out of three existing CHPS compounds matched the optimal locations found by our algorithms. These results highlight that the current positions of CHPS compounds were nearly optimal based on our model. Nevertheless, our findings suggest that the close proximity of the two CHPS compounds near the eastern border (CHPS B and C in Fig. 4) might have dampened the effect of each.
We note that our analysis focused purely on accessibility and not on availability, which is another important access dimension of a healthcare system [39]. More specifically, we did not consider the capacity of the health facilities in this exercise. This is important because two CHPS compounds with small capacity that are close to each other may be necessary if their surrounding communities are highly populated.
Finally, our results highlight that areas close to Yunyoo and Najong 1 (see Fig. 1) would strongly benefit from new health facilities (Locations 1 and 2 in Fig. 3a). Both locations experienced moderate-to-high malaria prevalence during the rainy season and had relatively high travel time to the nearest health facilities given their population density. In the context of malaria control, child health and survival, and reducing barriers to healthcare, new CHPS compounds could be prioritized for these locations.
Our analysis represents our best estimates based on the data available to us. There are a few important assumptions and limitations. First, correlation does not imply causation: having a new health facility in a particular location does not necessarily and automatically reduce the malaria prevalence in its vicinity. However, the relationship between distance to health facility and malaria is strongly supported by current literature. For example, early diagnosis and treatment of malaria is well known to contribute to reduced disease transmission and malaria death [40,41], and accessibility to healthcare, which includes proximity to a health facility, is widely acknowledged to decrease malaria prevalence [42,43,44,45]. Secondly, our analysis assumes that the associations between malaria prevalence and the spatial covariates will remain unchanged, which may not be the case. Thirdly, since malaria incidence in the Bunkpurugu-Yunyoo district was not readily available, we had to convert the observed prevalence to incidence based on a model ensemble analysis that relied on data from 30 sites in Sub-Saharan Africa collected from 1981 to 2011, only one of which was from Ghana [35]. These sites may not be representative of the study district, and the formula used to convert prevalence to incidence may not have adequately captured the relationship between these malaria indicators for our study site. Fourthly, an important limitation of our study is that we did not have information regarding the location of health facilities in neighboring districts.
Consequently, we implicitly assume that there are no health facilities close to the district's borders or that people do not go to them, particularly when that entails crossing the eastern international border to Togo. Finally, optimal locations determined using our analysis may seem attractive on paper but may be impractical on the ground (e.g., due to logistical issues). For this reason, it is important for decision makers and other stakeholders to be able to interact with the model themselves (e.g., via our interactive decision support tool) to explore other locations that may be more feasible and still nearly optimal.

Conclusions
Location-allocation analysis can leverage a relatively standard malaria risk mapping model to create actionable decisions by integrating other pieces of information such as population distribution, estimated travel time, and prevalence to incidence conversion. Importantly, this analysis, together with the use of multiple optimization criteria, can uncover patterns that are not immediately obvious when gleaning any single piece of information. For example, prioritizing incidence over prevalence might have overlooked the barrier to healthcare access for communities in the southern, less populated, and higher burden areas of Bunkpurugu-Yunyoo District.
The complexity of such analysis can be efficiently communicated through a web-based interactive decision support tool. Instead of static maps and figures, our interactive tool allows stakeholders to experiment with their own choices of locations with a few mouse clicks. This is particularly important because modeler-initiated analysis like this can rarely account for all the criteria and constraints that a decision maker may consider. As a result, the interactive tool enables decision makers to use inference from our statistical model and optimization results in conjunction with any additional criteria that they might have to make an informed decision, helping to bridge the gap between decision makers and modelers. Adding new optimization criteria (e.g., focusing on other endemic diseases or health conditions) would be relatively straightforward once the statistical framework and interactive tool are created. We provide the code for our application (available at https://github.com/kokbent/byd-hf-update) because we believe the tool can be repurposed for other regions and criteria. Importantly, public health scientists can now, more than ever before, create interactive applications without deep knowledge in computer sciences. Harnessing such technologies will be important in bridging the gap between science, models, and decisions.
Fig. 1 (caption, beginning not recovered): … season. The prevalence map is from [15]. Nakpanduri and Bunkpurugu are the urban areas in this district, while Nasuan, Yunyoo and Najong 1 are highlighted larger villages.
Fig. 4 (caption, beginning not recovered): … incidence of all-age malaria cases per 1000 person years observed, or (c) travel time to health facilities (in minutes), under Scenario 2. In this scenario, we assume the absence of existing CHPS compounds (labeled as A to C here). The color of each pixel shows the changes in the predicted metrics (without new HFs minus with new HFs). The number associated with each proposed HF indicates its priority: location 1 had the highest priority and reduced the metric used for optimization the most, while location 5 had the lowest priority and reduced the metric the least.