This study utilized data from the Kaloleni-Rabai Community Health Demographic Surveillance System (HDSS). This HDSS cohort is centred around Mariakani township and covers 112 villages spanning 10 Community Health Units (CHU) of Kaloleni and Rabai sub-counties, Kilifi County, Kenya (Fig. 1), three of which are peri urban. This cohort has been followed up semi-annually since 2017, and six rounds of data collection had been completed by the end of 2019. New individuals can enter the cohort by either birth or in-migration, while cohort members can exit by either out-migration or death. A detailed profile of this cohort has been described elsewhere . Each member of the cohort is assigned a unique identifier at entry, which is used to longitudinally track the individual.
For each round of data collection, a trained community health volunteer (CHV) visited the longitudinally tracked households and interviewed the mother or caretaker of the child who provided the following data: vaccination data (based on child’s vaccination card or on maternal recall if card is unavailable), demographic information, reproductive, maternal and child-health data, child orphan status, school attendance among children, social determinants of disease, nutritional data, vital events (births, migration, and deaths). DPT3 immunization data were captured for all children 14 weeks – 11 months of age. Global positioning system (GPS) coordinates of the households were also collected. A preconfigured open data kit (ODK) installed in electronic tablets was used for data collection, and upon completion of the interview, data were reviewed for completeness and synced to a central server. Further data screening was performed by a data manager for any errors (omissions and inconsistencies) and the feedback sent to CHV for verification. The whole process of data collection was supervised and coordinated by field officers and the Ministry of Health Public Health Personnel.
In each round of data collection, the data were analysed, and the audit reports per CHU shared with CHVs who in return coordinated the dissemination sessions with the community, where they discussed key areas of active feedback, including vaccination status, among others.
Assembling of geospatial data necessary for the estimation of spatial accessibility
To compute spatial access to health facilities, the following information was assembled: coordinates of health facilities, land cover, digital elevation model, road network, and barriers. While the households of interest were within Kaloleni and Rabai sub-counties of Kilifi County and with the assumption that the nearest health facility might be in the neighbouring counties, especially for households along the borders of the neighbouring sub-counties, we confined the analysis of the spatial accessibility to include immunising health facilities, digital elevation model (DEM), land cover, and road network from the counties neighbouring the study area, as shown in Fig. 2.
Since healthcare facilities are critical in the delivery of vaccines, we obtained a list of all facilities that offer immunization services within the study area and the neighbouring administrative areas from the Kenya master health facility list  and the Kenya health information system . We merged facilities from these two sources, eliminated duplicates and obtained their GPS coordinates, which we validated against the recently geocoded master database of all health facilities in sub-Saharan Africa . Furthermore, we ensured that the resultant heath facilities were within the settlement and not on waterbodies by checking their coordinates using Google Earth.
Data for road networks were assembled from OpenStreetMaps (OSM) and Google Map Maker (GMM). Duplicates and short sections of roads disconnected from the main network were removed. As done elsewhere [27, 28], we classified roads into 4 categories: primary (class A & B) roads that mainly connect international borders, secondary (class C & D) roads that feed into primary roads or connected to major towns, county (class E) roads that feed into secondary roads and connect smaller towns or market centers, and rural (class U) roads that connect rural areas. These roads were assigned different speeds depending on the probable mode of transport as follows: primary and secondary roads whose modes of transport were vehicular were assigned speeds of 80 km/h and 50 km/h, respectively. County roads with bicycling as a mode of transport were assigned 11 km/h, while rural roads were assigned 5 km/h based on similar studies in Kenya [28, 29].
Digital elevation model & land cover
We obtained data for the land cover and digital elevation model (DEM) at a spatial resolution of 30 m from the Regional Centre for Mapping of Resources for Development (RCMRD). This is the centre responsible for disseminating open geospatial datasets for Eastern and Southern Africa. Land cover for the study area consisted of 9 categories, which we assigned walking speed based on previous studies [28, 29, 31]; tree cover (4 km/h), shrub cover (5 km/h), grassland (5 km/h), cropland (2 km/h), aquatic vegetation (0.01 km/h), sparse vegetation (2 km/h), bare areas (5 km/h), built-up areas (5 km/h), and open water (0.01 km/h). Walking and bicycling speeds were further adjusted accordingly based on the topography derived from the DEM. This correction used Tobler's equation  that linked walking and bicycling speeds with the slope of the terrain.
Land covers and the DEM showing different elevations of the study area are provided in supplementary file 1.
Estimation Of Travel Time Using Geographic Accessibility Model
Methods for estimating geographical accessibility have been developed over time, namely, the travel time model , network analysis , and gravity model . In this study, we used the travel time model because it has been recommended by the WHO as a suitable method of modeling healthcare accessibility  and because it takes into consideration other key aspects of accessing care, such as terrain and land cover surfaces .
We used AccessMod (version 5) to model geographical accessibility. The software uses the Manhattan distance method to cumulatively determine the time needed to cross contiguous cells using the least cost path from settlement to immunizing health facilities. Therefore, to estimate travel time, we first generated a travel impedance raster surface by merging land cover, elevation, and road network. To each contiguous cell of the resultant raster layer, we assigned travel speeds accordingly as described earlier. Lastly, we combined the location of the immunizing health facilities to the rasterized layer and estimated the time in minutes needed to travel to the nearest facility at 30 m spatial resolution. For further analyses, we extracted the travel time for each household’s geographical coordinates from the generated raster file. The obtained travel time was then distributed to children within a given household. Maps of travel time to the nearest immunizing health facility and the average time per household were plotted in QGIS (version 3.12).
We used a Bayesian hierarchical logistics regression model to explore the effect of spatial accessibility on DTP3 coverage. Community health units (CHUs) and rounds of data collection were used as random effects. To stabilize computations, we used weakly informative priors that also serve to bind the estimates within the acceptable ranges . We specified four chains each with 5000 iterations, half of which were used to warm the sampler and were discarded before estimations were made. The convergence of the model was determined by examining trace plots of the model. We adjusted for confounding due to sociodemographic and other health system factors available in the dataset (see Fig. 3). In keeping with previous studies investigating the effect of travel time , we grouped travel time into two groups: less than 1-hour and more than 1-hour travel seeking health services. To compare differences between two groups, we used an independent t-test statistical technique, and the results were interpreted using a p-value at the significance level of α = 0.05. The results from the multivariable model were reported as odds ratios (ORs) and 95% credible intervals. Significance of odds ratios was assumed if the 95% credible intervals excluded one. All analyses were performed using R Version 3.4.3.