Shaanxi Province is the most developed province in Western China, with an area of 205,800 square kilometers and a total population of 38.35 million in 2017. Geographically, the central, southern, and northern regions of Shaanxi Province differ significantly. The central part lies on a plain and includes the wealthiest area in the province, while the southern part includes the Qinling Mountains and the northern part covers the Loess Plateau. The southern and northern parts are less economically developed and have a relatively low population density. This study divides Shaanxi Province into three regions based on these geographic and economic conditions.
Data collection scheme
To measure spatial accessibility to health resources, three types of data were needed: the geographical distribution of the population, the geographical location of the hospitals, and the travel time and distance between residents and the hospitals. We therefore collected data in three steps.
First, considering the uneven distribution of the population, we used the geographical location of the villages and neighborhoods to identify the population distribution. Here, two strategies were adopted in our study:
- For villages and neighborhoods with a village clinic, we used the coordinates of the village clinic to represent the population distribution, because a village clinic should be located where the village population is relatively concentrated so that it covers as much of the population as possible.
- For villages and neighborhoods for which we could not obtain a village clinic, or that have multiple village clinics, we used the default coordinates provided by the web mapping navigation service provider. These coordinates usually default to the location of the village or neighborhood office, which is usually in a populated area.
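The two strategies above amount to a simple selection rule. The following is a minimal sketch of that rule, not the code used in the study; the function name and the input layout (a list of clinic coordinates plus one default coordinate per village) are assumptions made for illustration.

```python
def representative_coordinate(clinic_coords, default_coord):
    """Pick the point that stands in for a village's population.

    clinic_coords: list of (lng, lat) tuples for the village clinics that
                   could be located (hypothetical input layout).
    default_coord: (lng, lat) returned by the web map for the village or
                   neighborhood name.
    """
    # Exactly one known clinic: use it as the population proxy.
    if len(clinic_coords) == 1:
        return clinic_coords[0]
    # No clinic found, or several clinics: fall back to the map default.
    return default_coord


# Made-up coordinates, for illustration only.
one_clinic = representative_coordinate([(108.95, 34.27)], (108.90, 34.20))
no_clinic = representative_coordinate([], (108.90, 34.20))
```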
Second, we obtained the names of the county hospitals from the Health Commission of Shaanxi Province, and then we directly used the names of the hospitals to get their geographical locations from the web map.
Third, the travel time and distance from each village and neighborhood to the county hospitals were collected from the navigation results provided by the web mapping navigation service. We chose the fastest route that did not use highways (because highway entrances and exits in China are usually located around the county), and obtained the time and distance from villages and neighborhoods to the local county hospital using the real-time navigation data of the AutoNavi map in driving mode. Only the local county hospitals were selected because the new rural cooperative medical insurance implemented in rural China provides cover at the county level only. In this study, we assumed that, owing to the medical insurance reimbursement strategy, residents were unlikely to visit a county-level hospital outside their own county.
Data collection method
To perform the data collection, we first obtained the names of the village clinics and county hospitals throughout Shaanxi Province from the 2017 Shaanxi Provincial Health Statistics Annual Report, which was provided by the Health Commission of Shaanxi Province. We also obtained the names of the village and neighborhood committees throughout Shaanxi Province from the website of the National Bureau of Statistics.
Second, we used the geocoding interface of the AutoNavi map to collect the coordinates of all villages, neighborhoods, and county hospitals. Requests to the geocoding API of the AutoNavi map were issued by a web crawler (a web data extraction tool) written in Python 3.6. The URL of this geocoding interface can be found in the footnote below. AutoNavi map, known as Gaode in Chinese, was founded in 2011 and is one of the largest providers of web mapping, navigation, and location-based services in China. It offers map services at Amap.com and through a mobile app.
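To make the geocoding step concrete, the sketch below builds a request URL and parses a response. It is an illustration under stated assumptions, not the crawler used in the study: the endpoint, the `key` and `address` parameters, and the response fields (`status`, `geocodes`, `location` as "longitude,latitude") are assumed from the public AutoNavi web service and should be checked against the footnoted URL. The sample payload is made up, and no network request is sent.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; the paper gives the actual URL only in a footnote.
GEOCODE_URL = "https://restapi.amap.com/v3/geocode/geo"


def build_geocode_url(address, api_key, city=None):
    """Build a geocoding request URL for a place name (assumed parameters)."""
    params = {"key": api_key, "address": address}
    if city:
        params["city"] = city
    return GEOCODE_URL + "?" + urlencode(params)


def parse_geocode_response(payload):
    """Return (lng, lat) from a JSON response, or None if nothing matched."""
    data = json.loads(payload)
    if data.get("status") != "1" or not data.get("geocodes"):
        return None
    lng, lat = data["geocodes"][0]["location"].split(",")
    return float(lng), float(lat)


# Hypothetical response body, shaped like the assumed API output.
sample_payload = '{"status": "1", "geocodes": [{"location": "108.940,34.341"}]}'
coords = parse_geocode_response(sample_payload)
```

Separating URL construction from response parsing keeps the parsing logic testable without network access or an API key.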
Third, navigation data, including driving time and distance, were collected through the path planning interface by setting the coordinates of each village or neighborhood as the starting point and the coordinates of a county hospital in the same district as the endpoint. The URL of the path planning interface can be found in the footnote below. To account for traffic conditions at different times, data collection was carried out in four randomly chosen time periods: the morning (10:00 to 11:00) and afternoon (14:00 to 15:00) of November 23, 2018 (Friday) and November 27, 2018 (Tuesday). In each period, crawling requests were sent from Python to the AutoNavi map, and we took the average of the results over the four periods. Finally, data on 10,350 villages and neighborhoods (out of a total of 13,074) from 73 counties of Shaanxi Province were obtained in our study (Figure 1).
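The averaging over the four time periods can be sketched as follows. The function name, the tuple layout, and the sample values are hypothetical; the sketch only shows the arithmetic, assuming one (time, distance) result per period for a given village and hospital pair.

```python
from statistics import mean


def average_trips(samples):
    """Average driving time and distance over the sampled time periods.

    samples: list of (time_minutes, distance_km) tuples, one per period
             (hypothetical layout; the study used four periods).
    """
    times = [t for t, _ in samples]
    distances = [d for _, d in samples]
    return mean(times), mean(distances)


# Made-up results for one village: Friday AM/PM and Tuesday AM/PM.
samples = [(30.0, 20.0), (34.0, 20.0), (32.0, 21.0), (36.0, 21.0)]
avg_time, avg_distance = average_trips(samples)
```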
The travel impedance to the nearest provider (TINP) was used to evaluate spatial accessibility in this study. TINP measures spatial accessibility using indicators such as the distance, time, or cost from the place of residence to the nearest medical institution, and is typically expressed as a straight-line (Euclidean) distance. Distance and time were chosen because they directly reflect spatial accessibility: the shorter the distance and the travel time, the higher the accessibility. Although TINP ignores the supply of health resources, the method is applicable where the choice of health care provider is relatively simple, as in rural areas. In addition, we used the more precise travel distance downloaded from a web map instead of the Euclidean distance in this study.
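As an illustration of the TINP idea, the sketch below picks, for one village, the county hospital with the shortest travel time among the hospitals of its own county. The function name, the dictionary layout, and the hospital names are assumptions for this example, not part of the study's code.

```python
def nearest_provider(travel_to_hospitals):
    """Return (hospital, (time_min, distance_km)) with the shortest travel time.

    travel_to_hospitals: dict mapping each county hospital name to the
    averaged (time_minutes, distance_km) from one village (hypothetical layout).
    """
    name = min(travel_to_hospitals, key=lambda h: travel_to_hospitals[h][0])
    return name, travel_to_hospitals[name]


# Made-up travel results from one village to two hospitals in its county.
village_trips = {
    "County Hospital A": (42.0, 25.3),
    "County Hospital B": (35.5, 27.1),
}
hospital, (tinp_time, tinp_distance) = nearest_provider(village_trips)
```

Ranking by time rather than distance is a choice made here for illustration; with road network data the two can disagree, as in the sample values above.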
We calculated Getis-Ord Gi* statistics for the spatial association of each county to explore whether there were any disparities in spatial accessibility [27, 28]. The Gi* statistic returned for each county was recorded as a z-score. A high positive z-score and a small p-value for a county represent a spatial clustering of high values (hot spot), whereas a low negative z-score and a small p-value represent a spatial clustering of low values (cold spot). The further the z-score is from zero, the more intense the clustering. A z-score close to zero means no significant spatial clustering. Getis-Ord Gi* statistics were calculated using the R package ‘spdep’. The spatial relationships between counties were defined by Queen's case contiguity. The distance and time for each county were recorded as the averages of the distances and times of its villages and neighborhoods.
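For readers who want to see the computation behind the z-score, the following is a minimal sketch of the Gi* statistic with binary Queen's case weights. It is not the ‘spdep’ implementation used in the study: it assumes the commonly cited standardized form of Gi* with each unit included in its own neighborhood, and it uses a regular grid as a stand-in for county polygons. All names and values are illustrative.

```python
import math


def queen_neighbors(n_rows, n_cols):
    """Queen's case contiguity on a regular grid (stand-in for counties)."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
    return nbrs


def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores with binary weights; each unit counts as its own neighbor."""
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    z_scores = []
    for i in range(n):
        members = [i] + list(neighbors[i])
        w_sum = len(members)  # binary weights: sum(w) equals sum(w^2)
        local_sum = sum(values[j] for j in members)
        # Undefined (zero denominator) if values are constant or if a unit
        # neighbors every other unit; neither occurs in this example.
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z_scores.append((local_sum - x_bar * w_sum) / denom)
    return z_scores


# Made-up 4 x 4 grid of mean travel times: long times in the top-left corner.
values = [10, 10, 1, 1,
          10, 10, 1, 1,
          1, 1, 1, 1,
          1, 1, 1, 1]
z_scores = getis_ord_gi_star(values, queen_neighbors(4, 4))
```

In this example the top-left cell receives a positive z-score (a candidate hot spot of long travel times) and the bottom-right cell a negative one.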
We used the concentration curve and the concentration index (CI), a method recommended by the World Bank to measure inequality in health indicators related to living standards, to explore the influence of gross domestic product (GDP) and population on the differences in spatial accessibility to county hospitals across counties.
In the concentration curve plot, the horizontal axis is the cumulative percentage of the observation units (counties in our study) ranked in ascending order by living standards (the rank variable), and the vertical axis is the cumulative percentage of the health indicator. Originally, the living standard is a socioeconomic indicator, but we extended it to GDP and population. The concentration curve can be used to examine inequality in any health sector variable of interest, such as health resources and health services [32, 33].
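The construction of the curve can be sketched as follows. This is an illustrative sketch, assuming equally weighted observation units; the function name and the sample numbers are hypothetical.

```python
def concentration_curve(health, rank_var):
    """Return the (x, y) points of the concentration curve.

    x: cumulative share of units, ranked in ascending order of rank_var.
    y: cumulative share of the health indicator held by those units.
    """
    order = sorted(range(len(health)), key=lambda i: rank_var[i])
    n = len(health)
    total = sum(health)
    points = [(0.0, 0.0)]
    running = 0.0
    for k, i in enumerate(order, start=1):
        running += health[i]
        points.append((k / n, running / total))
    return points


# Made-up example: three counties with travel times and GDP as rank variable.
travel_time = [30.0, 20.0, 50.0]
gdp = [3.0, 1.0, 2.0]
curve = concentration_curve(travel_time, gdp)
```

Plotting these points against the 45-degree line gives the concentration curve described above.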
The value of the CI is twice the area between the concentration curve and the line of equality (the 45-degree line); it is negative (positive) when the concentration curve lies above (below) the line of equality. The CI ranges between −1 and 1, where zero means no inequality related to the rank variable, and a negative (positive) value means a disproportionate concentration of the health indicator among the observation units with lower (higher) values of the rank variable. Generally, the calculation method of the concentration index is as follows: (see Formula 1 in the Supplementary Files)
where h is the health indicator, μ is its mean, and r is the fractional rank of the observation unit. In this study, we selected the average shortest time from the villages and neighborhoods to the county hospitals as the health indicator, GDP and population as the rank variables, and the county as the observation unit.
Data were collected from the Shaanxi Statistical Yearbook in 2019 (http://tjj.shaanxi.gov.cn/upload/2020/pro/3sxtjnj/zk/indexch.htm).
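The concentration index can be sketched in code as follows. Formula 1 is not reproduced in the text, so this sketch assumes the widely used covariance form, CI = (2/μ) cov(h, r), which is consistent with the symbols h, μ, and r defined above; the fractional rank r = (position − 0.5)/n for equally weighted units is a further assumption. The function name and the sample values are hypothetical.

```python
def concentration_index(health, rank_var):
    """Concentration index, assuming CI = (2 / mu) * cov(h, r).

    health:   health indicator h for each unit (e.g. mean travel time).
    rank_var: rank variable (e.g. GDP or population) for each unit.
    """
    n = len(health)
    order = sorted(range(n), key=lambda i: rank_var[i])
    mu = sum(health) / n
    cov = 0.0
    for pos, i in enumerate(order):
        r = (pos + 0.5) / n  # assumed fractional rank; its mean is 0.5
        cov += (health[i] - mu) * (r - 0.5)
    cov /= n
    return 2 * cov / mu


# Made-up example: travel time rises with GDP, so the index is positive.
ci_rising = concentration_index([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
# Same times with the GDP ranking reversed: the index changes sign.
ci_falling = concentration_index([1.0, 2.0, 3.0], [30.0, 20.0, 10.0])
```

Ties in the rank variable are not treated specially in this sketch; they are ordered as they appear in the input.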