3.1 Study area
This study focuses on Beijing, the world's most populous capital city, with an administrative area of 16,410 km². A recent study suggests that Beijing has approximately 1,398 km of sidewalks and 414 km of bike routes (Jiao & Cai 2020). As noted in the most recent general plan (2016–2035), Beijing aims to build a city-scale complete street system and to improve the governance of static transportation, such as roadside parking lots (Beijing Municipal Commission of Planning and Natural Resources, 2016). To implement the goals of the general plan, Beijing published a specific plan to guide urban design for street regeneration and governance. The street regeneration plan addresses street conflicts with five direct implementation measures: 1) improve governance of roadside space parking; 2) renovate street intersections in the central city; 3) optimize parking capacity for bikes on sidewalks; 4) formalize the parking requirement for delivery trucks/vans; and 5) remove unnecessary separation fences in roadside space (Beijing Municipal Commission of Planning and Natural Resources, 2018).
Although the city of Beijing has a strong vision for street regeneration, conflicts between vehicles, cyclists, and pedestrians remain chronic issues that deserve more attention in academic research and planning practice.
3.2 Spatial sampling and image coding procedures
The study overcomes two major methodological challenges in conducting a large-scale analysis with street-view images. The first is identifying appropriate data collection parameters (e.g., sampling distance, field of view, direction). The second is interpreting conflicts from the collected images.
To address these challenges, the study employs a spatial sampling procedure, as shown in Fig. 1. The goal is to cover the study area with enough data collection points to generate meaningful results while keeping the data collection work manageable. Earlier studies have demonstrated promising results using similar spatial sampling methods in large-scale analysis (Kim and Li 2021; Nelson and Hellerstein 1997). In addition, spatial sampling can mitigate spatial autocorrelation in the regression analysis (Brady and Irwin 2011). As illustrated in Fig. 1, the first step is to create a 500 m × 500 m grid over the study area and drop grid cells that contain no street segment. Depending on its location, each cell contains 2–5 blocks. A smaller cell size would create too many zero observations, while a larger cell (e.g., 1 km × 1 km) would decrease the spatial resolution of the analysis. The second step is to generate weighted random samples (WRS) based on the road network density of each cell. In the WRS algorithm, the cells are weighted by road network density, and the probability of each cell being selected is determined by its relative weight. The definition of WRS is the following (Efraimidis and Spirakis 2008):
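S1: For k = 1 to m do
S2: Let \(p_i(k) = w_i / \sum_{v_j \in V-S} w_j\) be the probability of cell \(v_i\) being selected in round k
S3: Randomly select a cell \(v_i \in V-S\) and insert it into S
where V is the set of all cells, S is the set of cells already selected, \(w_i\) is the weight (road network density) of cell \(v_i\), and m is the sample size.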
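Steps S1–S3 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function name, the dictionary input, and the seed argument are hypothetical, and the weights are assumed to be positive road network densities (cells without streets having already been dropped).

```python
import random


def weighted_random_sample(weights, m, seed=None):
    """Draw up to m distinct cells without replacement (steps S1-S3).

    weights: dict mapping a cell id to its road network density
             (hypothetical input format, assumed positive).
    """
    rng = random.Random(seed)
    remaining = dict(weights)   # V - S: cells not yet selected
    selected = []               # S: the sample drawn so far
    for _ in range(min(m, len(remaining))):          # S1
        ids = list(remaining)
        w = [remaining[i] for i in ids]
        # S2 + S3: random.choices normalises the weights, so each remaining
        # cell is drawn with probability w_i / (sum of remaining weights).
        pick = rng.choices(ids, weights=w, k=1)[0]
        selected.append(pick)
        del remaining[pick]     # selected cells leave V - S
    return selected
```

Because a selected cell is removed before the next round, the selection probabilities are renormalised over the remaining cells in every round, which is what distinguishes this scheme from sampling with replacement.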
For each selected cell, the third step of the sampling procedure is to create street-view image collection points at equal intervals (100 m) along the streets. Given that the study focuses on roadside space, the camera is pointed left (heading angle = \(270^{\circ}\)) and right (heading angle = \(90^{\circ}\)) at each point. The field of view (FoV) is set to \(180^{\circ}\) to maximize the street coverage in each image. The pilot test of the sampling procedure indicates that each image captures about 50–100 m of street length with the wide FoV setting, which justifies the 100 m spacing. Admittedly, there is some distortion at the corners of the images, which decreases their interpretability to some extent.
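The point-generation step can be illustrated with a short sketch. It assumes that each street is available as a polyline in a projected coordinate system with metre units, and that the 90° and 270° heading angles are measured relative to the direction of the street; both are assumptions for illustration, and the function names are hypothetical.

```python
import math


def sample_points(polyline, spacing=100.0):
    """Return (x, y, bearing) tuples every `spacing` metres along a polyline.

    polyline: list of (x, y) vertices in metres (assumed projected CRS).
    bearing:  street direction in degrees, clockwise from north.
    """
    points = []
    next_dist = 0.0   # distance along the street of the next point
    walked = 0.0      # total length of the segments already traversed
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue  # skip duplicated vertices
        bearing = math.degrees(math.atan2(x1 - x0, y1 - y0)) % 360
        while next_dist <= walked + seg:
            t = (next_dist - walked) / seg   # fraction along this segment
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0), bearing))
            next_dist += spacing
        walked += seg
    return points


def camera_headings(bearing):
    """Absolute headings of the right (90) and left (270) views."""
    return ((bearing + 90) % 360, (bearing + 270) % 360)
```

For a 250 m street running due north, `sample_points` yields points at 0 m, 100 m, and 200 m, and `camera_headings` gives an east-facing and a west-facing view at each point.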
The next step is to identify roadside space conflicts from the images. As noted above, the study focuses on two types of conflicts: vehicle-bike conflicts and vehicle-pedestrian conflicts. Specifically, the study uses vehicles parked on bike lanes as a proxy for vehicle-bike conflicts and vehicles parked on sidewalks as a proxy for vehicle-pedestrian conflicts. These metrics have been extensively explored in a recent study by Popescu (2022). To reduce the human input in the image coding process, the researcher developed a machine learning tool based on YOLO v5, pre-trained on the COCO dataset, to assist in identifying street conflicts from the images (Jocher 2022; Li 2022; Tsung-Yi et al. 2021). The tool helps the researcher filter out images that contain no vehicles. A test on 200 randomly selected images indicates that the tool identifies 94% of the vehicles in the images. The tool is therefore adequate for filtering out images without a vehicle before the remaining images are manually coded with the rules illustrated in Fig. 2.
Using the sampling procedure, the study creates two samples in the study area, one for the regression analysis and one for the sensitivity test. The regression model, data, and sensitivity test are described in the following section.
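The filtering logic can be sketched independently of the detector. The example below does not call YOLO v5; it assumes the detector output has already been reduced to (label, confidence) pairs per image. The set of labels treated as vehicles and the confidence threshold are illustrative assumptions, not values reported in the study.

```python
# Assumed subset of COCO class labels treated as vehicles (illustrative).
VEHICLE_CLASSES = {"car", "bus", "truck", "motorcycle"}


def has_vehicle(detections, min_conf=0.25):
    """True if any detection is a vehicle class at or above min_conf.

    detections: list of (label, confidence) pairs for one image
                (hypothetical format derived from the detector output).
    """
    return any(label in VEHICLE_CLASSES and conf >= min_conf
               for label, conf in detections)


def split_images(detections_by_image, min_conf=0.25):
    """Split image ids into those kept for manual coding and those dropped."""
    keep, drop = [], []
    for image_id, detections in detections_by_image.items():
        target = keep if has_vehicle(detections, min_conf) else drop
        target.append(image_id)
    return keep, drop
```

Only the images in `keep` would proceed to manual coding; the detector decides whether a vehicle is present, not whether it constitutes a conflict.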
3.3 Model and data
The study generated two samples from the grid cells, each containing 10% of the cell population (N = 704/10%). Based on the data collection points, the study collected 35,428 street-view images via the Baidu API. The machine learning tool filtered out 23,877 images without any vehicles. Among the 11,551 valid images, 9,541 vehicle-bike conflicts and 1,841 vehicle-pedestrian conflicts were manually identified with the procedures above. As mentioned in the preceding section, the study focuses on how place-based factors affect the probability of street conflicts. A variety of place-based factors are included in this study. Most of the factors are point-of-interest (POI) data collected from the Gaode map. In addition, the study pays particular attention to the potential spatial variation of conflicts among communities by employing the classification of communities (urban communities, suburban and rural communities) defined by the municipal government of Beijing (Beijing Municipal Civil Affairs Bureau 2020). The summary statistics of these variables are presented in Table 1.
Prior to estimating the regression parameters, the variance inflation factor (VIF) is used to test for multicollinearity among these variables. The results suggest that some explanatory variables are correlated, such as Nearby_PrimarySchool and Nearby_MiddleSchool. Therefore, the study employs ridge regression to mitigate the impact of collinearity on the model estimation. The expression of the ridge regression model is:
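For reference, the VIF of variable j is \(1/(1-R_j^2)\), where \(R_j^2\) is obtained by regressing variable j on the remaining explanatory variables. A minimal NumPy sketch of this check follows; the function name is hypothetical, and the synthetic data in the usage note are not the study's data.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from an OLS regression of
    column j on all other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r2)
    return out
```

With two nearly identical columns and one independent column, the first two VIF values are large while the third stays close to 1, which is the pattern that motivates a shrinkage estimator.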
\(Y=X\beta \left(K\right)+\epsilon\)
\(\beta \left(K\right)={({X}^{T}X+KI)}^{-1}{X}^{T}Y\)
where \(Y\) is the matrix of street conflicts, \(X\) is the matrix of explanatory variables, \(\beta \left(K\right)\) is the vector of regression coefficients, \(K\) is the ridge parameter, \(I\) is the identity matrix, and \(\epsilon\) is the error term. As the expression shows, the ridge estimates depend directly on the ridge parameter \(K\) (ranging between 0 and 1). A reasonable value of \(K\) is usually determined from the stable point or inflection point of the ridge trace (Marquardt and Snee 1975).
Since a single sample is unlikely to support a robust conclusion, this study creates two samples and estimates the model on each to check the sensitivity of the ridge estimation results. The following section presents the results of the baseline sample; the results of the test sample are presented in Appendix B.
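The closed-form estimator and the ridge trace can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the study's code: the columns of \(X\) are assumed to be standardized and \(Y\) centered, so no intercept is estimated, and the function names are hypothetical.

```python
import numpy as np


def ridge_coefficients(X, y, K):
    """beta(K) = (X^T X + K I)^{-1} X^T Y, solved without forming the inverse."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + K * np.eye(p), X.T @ y)


def ridge_trace(X, y, Ks):
    """Stack beta(K) for each K in Ks; row i holds the coefficients for Ks[i]."""
    return np.array([ridge_coefficients(X, y, K) for K in Ks])
```

Plotting each column of the `ridge_trace` output against the values of K gives the ridge trace; K is then read off where the coefficient paths stabilise. At K = 0 the estimator reduces to ordinary least squares, and the norm of the coefficient vector shrinks as K grows.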
Table 1
Summary statistics of the baseline dataset (N = 743)
Variables | Sources | Mean | Std. Dev. | Min | Max |
Vehicle-bike conflict density/km | Baidu streetview API¹ | 9.108 | 8.209 | 0 | 31.741 |
Vehicle-pedestrian conflict density/km | Baidu streetview API | 1.151 | 1.479 | 0 | 5.443 |
Pop_Dens/km² | Baidu population density map² | 2.408 | 1.994 | 0.000 | 6.952 |
Nearby_ArterialDist/km | OpenStreetMap³ | 0.765 | 1.070 | 0.000 | 7.661 |
Log_TiananmenDist/km | OpenStreetMap | 9.863 | 0.791 | 6.788 | 11.600 |
Log_BuildingArea/km² | OpenStreetMap | 9.186 | 1.674 | 2.790 | 14.039 |
Nearby_ExclusiveParking/count | Gaode map⁴ | 15.553 | 20.953 | 0 | 120 |
Nearby_StreetParking/count | Gaode map | 9.890 | 15.787 | 0 | 99 |
Nearby_PrimarySchool/count | Gaode map | 0.875 | 1.399 | 0 | 11 |
Nearby_MiddleSchool/count | Gaode map | 0.647 | 1.159 | 0 | 7 |
Nearby_Museum/count | Gaode map | 0.211 | 0.651 | 0 | 7 |
Nearby_ShoppingCenter/count | Gaode map | 0.301 | 0.907 | 0 | 9 |
Nearby_SubwayStation/count | Gaode map | 1.049 | 2.284 | 0 | 16 |
Nearby_Clinic/count | Gaode map | 0.842 | 1.373 | 0 | 13 |
Nearby_Hospital/count | Gaode map | 0.803 | 1.481 | 0 | 13 |
Nearby_Park/count | Gaode map | 0.609 | 0.944 | 0 | 5 |
¹ Baidu streetview API: https://lbsyun.baidu.com/; ² Baidu real-time population heat map: http://huiyan.baidu.com/cms/heatmap/beijing.html; ³ OpenStreetMap; ⁴ Gaode POI data: https://doi.org/10.18170/DVN/WSXCNM