Influence of spatial distribution pattern of buildings on the distribution of urban gaseous pollutants

Buildings are the main component of cities, and their three-dimensional spatial patterns affect meteorological conditions and, consequently, the spatial distribution of gaseous pollutants (CO, NO, NO2, and SO2). This study takes the Jinan Central District as the study area and constructs a building spatial distribution index system based on DEM, urban road network, and building big data. ANOVA and spatial regression models were used to study the effects of building spatial distribution indicators on the distribution of gaseous pollutants, along with their spatial heterogeneity. The results showed that (1) most building spatial distribution indicators had significant effects on the concentration distribution of the four gaseous pollutants, with one-way ANOVA outcomes reaching a significance level of 0.01 or stricter. The DEM mean, building altitude, and their interactions with other building spatial distribution indicators are important factors affecting the distribution of gaseous pollutants; the interactions of the other three-factor indicator combinations did not have a significant effect on the distribution of gaseous pollutant concentrations. (2) The spatial distribution of CO and NO2 is mainly influenced by the building spatial distribution indicators of the study unit itself, and the influence between CO and NO2 concentrations in adjacent study units results from stochastic factors. The NO and SO2 concentrations are influenced by the building spatial distribution indicators of the study unit itself, by the same indicators in neighboring study units, and by the NO and SO2 concentrations in neighboring study units. (3) Spatial heterogeneity was observed in the effects of building spatial distribution indicators on the concentrations of different pollutants. The GWR models constructed using CO and NO concentrations and building spatial distribution indicators were well fitted globally and locally. The CO and NO concentrations were negatively correlated with the mean topographic elevation, and NO concentrations were correlated with building density.


Highlights:
• Study units are constructed by using DEM, urban roads and building attributes.
• Analysis of variance and spatial regression models are used in this study.
• Building spatial patterns affect the spatial distribution of gaseous pollutants.
• The effects of building spatial patterns on pollutants have spatial heterogeneity.

Introduction
Since the beginning of the twenty-first century, urbanization in China has accelerated, and rapid urbanization has changed urban spatial structure, exacerbating air pollution and the urban heat island effect and contributing to an increase in cardiovascular and respiratory diseases (Shahriyari et al., 2022). Mitigating air pollution by optimizing the spatial layout of a city has received increasing attention from scholars. CO, NO, NO2, and SO2 are important air pollutants that directly affect urban environmental quality (Zinzi et al., 2020) and human health (Jf et al., 2020). Studying the influence of urban spatial structure on gaseous pollutants is therefore of great practical importance for promoting healthy urbanization and accelerating the construction of ecological civilization. Air pollution in China is seasonal, with NO2, SO2, and PM2.5 pollution dominating in winter and O3 pollution dominating in spring and summer; the monthly change in air quality level shows a typical "inverted U" pattern of rising and then falling (Xiao et al., 2017; Zhao, 2021). In recent years, urban air quality in China has improved, with concentrations of four pollutants (PM2.5, PM10, SO2, and CO) decreasing, and NOx and O3 emerging as major pollutants (Song et al., 2017; Xiao et al., 2021). The sources and the spatial and temporal distribution characteristics of gaseous pollutants differ between the north and south of China. The annual average concentrations of PM2.5, PM10, CO, NO2, and SO2 are higher in the north than in the south, and the rate of decline is also faster in the north (Ma et al., 2018). The potential sources of air pollutants in the Chang-Zhu-Tan area are the air currents from the northeast and southwest, which carry higher concentrations of pollutants (Zhu et al., 2022). Differences in O3, PM2.5, CO, NO2, and SO2 concentrations between suburban and urban environments in Beijing are decreasing. Air pollution in the periphery of Jinan City is more serious, whereas pollution in urban areas is relatively light (Zhang et al., 2020).
Urban air pollutant concentrations are influenced by the spatial structure of the city, pollution sources (Jiang et al., 2020), the regional economy (Bai et al., 2022; Liang et al., 2019), population density (Wang & Huang, 2015), meteorological conditions, and other factors. In Europe, urban air quality monitoring is mainly done by ground stations, and the transport of pollutants in cities depends on the location of the sources and on flow conditions such as wind speed and direction (Lateb et al., 2016). Space heating and domestic hot water are the main sources of air pollutants, and the heating structure can be changed to reduce airborne pollutants (Kaczmarczyk et al., 2020). The fragmentation (density) of urban sites in the USA is negatively correlated with air quality (McCarty & Kaza, 2015). City size and urban form are important factors influencing the urban heat island effect (Gaur et al., 2018; Huang & Wang, 2019). Urban spatial structural characteristics such as building height, density, and volume drastically change the urban underlying surface and its interaction with the atmosphere. This in turn affects climate and environmental elements such as wind speed, wind direction, and temperature, so that urban aerodynamic characteristics vary in complex ways in space and time and directly affect the diffusion and distribution of gaseous pollutants. Zones of accumulation of particulate matter and gaseous pollutants were found in leeward and windward spaces, while PM concentrations were reduced in the open spaces of university buildings (Cichowicz & Dobrzański, 2022). A combination of wind tunnel experiments and computational fluid dynamics simulations revealed that the effect of louvers on the flow field was mainly concentrated on the windward side and the roof (Cui et al., 2014).
A study of the effect of short-time variations in wind speed on the mass transfer rate between street canyons and the atmospheric boundary layer, covering many urban areas in Mediterranean countries, found that increasing the frequency of inflowing winds reduces the amount of CO in the monitored volume (Murena & Mele, 2014). Gases in urban street canyons are expelled mainly when low-velocity regions occur above them, and pollutants in the free stream flow into the canyons with high-velocity fluid (Kikumoto & Ooka, 2012). Wind directions parallel to the street are more conducive to the dispersion of particle pollutants than winds perpendicular to the street, and pollutant concentrations are significantly higher in multi-story building areas than in high-rise building areas (Miao et al., 2020). Reducing the density of buildings and improving the three-dimensional urban structure can improve the urban ventilation potential and avoid the accumulation of pollutants in crowded structures (Badach et al., 2020). The positive correlation between urban form and NO2 concentration was significant, with X- or H-shaped cities having less air pollution than diamond-shaped cities (Lu & Liu, 2016; Wang et al., 2020). All pollutant concentrations were negatively correlated with NDVI (normalized difference vegetation index). Urban spatial structure can influence the vertical and horizontal air flows in cities, which has a significant impact on the spatial distribution and dispersion paths of pollutants (Fang & Qu, 2018). The reported studies on factors influencing urban air pollutants mainly use geographical detectors, multivariate Moran models (Yao et al., 2019), nonparametric panel models (Liu et al., 2021a), and correlation analysis. The intensity and direction of the effect of urban spatial structure on air pollution differ between cities of different size and economic level (Wang et al., 2021).
The spatial structure of boulevards has a more obvious influence on the spatial distribution of the gaseous pollutants SO2 and NOx (Kan, 2020). Urban three-dimensional morphology is the most universally representative expression of urban spatial structure and has a significant impact on the thermal environment; however, insufficient attention has been paid to three-dimensional spatial information in the study of heat island intensity in local areas (Chen, 2019; Qiao et al., 2019; Zhou & Tian, 2020). Building height has a greater effect on surface temperature than vegetation cover or building density. Optimizing the layout of building forms can reduce the air pollutant emissions associated with urban microclimates (Fan et al., 2017).
Thus far, various studies have investigated the effect of urban spatial structure on urban heat islands; however, few studies have examined the effect of urban spatial structure on gaseous pollutant concentrations. Buildings are the main constituent element of the city, and their three-dimensional spatial distribution pattern is the main body of the city's spatial structure. The central city of Jinan was taken as the research area in this study. We first constructed a building spatial distribution index system based on multi-source data, and then used analysis of variance (ANOVA) to investigate the influence of different building spatial distribution indices on the distribution of gaseous pollutants. Finally, spatial regression models (spatial lag, spatial error, and spatial Durbin) were used to analyze the influence of each building spatial distribution index on the distribution of air pollutants, and the spatial heterogeneity of their effects was investigated using geographically weighted regression. This study expands the research ideas and methods on the influence of urban spatial structure on the concentration of gaseous pollutants, and the results can provide theoretical support for rational urban planning and the improvement of urban air quality.

Study area and data sources
Jinan is the capital of Shandong Province in China and the political, cultural, and economic center of the province. It is located in eastern China, in the central-western part of Shandong Province, on the southeast edge of the North China Plain, between 36°01′-37°32′ N and 116°11′-117°44′ E. The terrain is low in the north and high in the south and can be divided into three belts: the northern belt along the Yellow River, the central piedmont plain belt, and the southern hilly mountain belt. By the end of 2020, Jinan City had 12 counties and districts. This study selected five central districts as the study area: the Lixia, Licheng, Tianqiao, Huaiyin, and Shizhong Districts (hereinafter collectively referred to as Jinan Central District), with an area of approximately 2094 km2. Buildings in these five districts are relatively dense and widely distributed, with obvious regional differences in building height and density, which makes the area representative for studying the influence of the spatial distribution pattern of buildings on the distribution of gaseous pollutant concentrations.
The administrative division vector data of Jinan City used in the study were obtained from the Shandong Province Geographic Information Public Service Platform (http://www.sdmap.gov.cn/). Building and city road vector data were obtained from Baidu Big Data using the Shui Jing Jie map downloader. Air quality data were collected from 76 meteorological stations in the central city of Jinan. DEM data were obtained from Google Earth at a spatial resolution of 9.55 m. All data used the 2000 national geodetic coordinate system (CGCS2000) as the geographic coordinate system and CGCS2000_3_Degree_GK_CM_117E as the projected coordinate system.

Research ideas and methods
Urban air pollutant concentration distribution is affected by a variety of factors and by the research method used, and some of these factors are correlated; in particular, building height, density, and volume indicators are closely related to urban population, economy, traffic, etc. Therefore, this study considers only the spatial parameters of buildings in relation to air quality, and other background factors are not considered. The research ideas and methods are as follows. First, the urban road network and DEM data were used to define the study units, so that the spatial distribution characteristics of buildings within one study unit are approximately the same. Then, the monitoring data of gaseous pollutants were interpolated with the Kriging method to obtain the spatial distribution of gaseous pollutant concentrations, building big data were used to construct a building spatial distribution index system, and the gaseous pollutant concentrations and building spatial distribution indices were extracted for the corresponding study units. Finally, ANOVA and spatial regression analysis were used to study the influence of building spatial distribution indicators on the distribution of gaseous pollutant concentrations.

Research unit construction
In this study, two approaches were used to determine the study units: neighborhood delineation based on the urban road network, and triangulated irregular network (TIN) delineation based on DEM. For the neighborhood units based on the urban road network, ArcGIS 10.7 was first used to merge all roads into one layer. Following the dilation principle of mathematical morphology, roads with the double-line problem were then buffered, and the buffers were rasterized and binarized; an erosion operation was subsequently applied to extract the centerlines. Finally, the road network polygon features were formed, and cells that were too large or too small were adjusted. For the DEM-based units, the DEM data of the southern mountain area were first used to generate a TIN; the "Eliminate" tool was then used to remove cells with too small an area, forming the TIN polygon features of the southern mountain area. Combining the layers formed in these two steps constituted the total set of study units. In total, the central city of Jinan was divided into 2708 neighborhoods, of which 997 contained buildings; 992 neighborhoods remained after those without "neighbors" were deleted during the construction of the spatial weight matrix.

Spatial distribution index system of urban buildings
The spatial distribution pattern of buildings directly affects urban temperature and ventilation, which in turn affect air quality. We investigated the influence of the spatial distribution pattern of buildings on the distribution of urban gaseous pollutants by constructing three-dimensional urban building distribution indicators from the perspective of urban planning. The following 10 indicators were selected, each describing the study area as follows. (1) One-dimensional height indicators: DEM standard deviation (Y1), which reflects the sharpness of surface undulations; building height standard deviation (Y2), which reflects how strongly the vertical height of buildings varies; DEM mean (Y3), which reflects the average terrain elevation; and building height mean (Y4), which reflects the average vertical height of buildings. (2) Two-dimensional plane indicators: building density (P1), which reflects the density of buildings and the open space ratio; building footprint standard deviation (P2), which reflects how strongly the horizontal projected area of buildings varies; and building footprint mean (P3), which reflects the average horizontal projected area of buildings. (3) Three-dimensional space indicators: volume ratio (S1), which reflects the efficiency and intensity of building land use; building volume standard deviation (S2), which reflects how strongly building volume varies; and building volume mean (S3), which reflects the average building volume.
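As an illustration of how the ten indicators could be computed for a single study unit, the following is a minimal sketch and not the authors' workflow. The function name, the input layout, the prism approximation of building volume, and the 3 m storey height used to estimate floor area for the volume ratio are all assumptions introduced here.

```python
from statistics import fmean, pstdev

def building_indicators(heights, footprints, dem_cells, unit_area,
                        storey_height=3.0):
    """Return the ten indicators (Y1-Y4, P1-P3, S1-S3) for one study unit.

    heights      building heights in the unit (m)
    footprints   building footprint areas in the unit (m^2)
    dem_cells    DEM elevations sampled inside the unit (m)
    unit_area    area of the study unit (m^2)
    storey_height is an ASSUMED value, used only to estimate floor counts.
    """
    # Assumption: each building is a prism (footprint x height).
    volumes = [h * a for h, a in zip(heights, footprints)]
    # Assumption: floor area = footprint x estimated number of storeys.
    floor_area = sum(a * max(1, round(h / storey_height))
                     for h, a in zip(heights, footprints))
    return {
        "Y1": pstdev(dem_cells),             # DEM standard deviation
        "Y2": pstdev(heights),               # building height std. dev.
        "Y3": fmean(dem_cells),              # DEM mean
        "Y4": fmean(heights),                # building height mean
        "P1": sum(footprints) / unit_area,   # building density
        "P2": pstdev(footprints),            # footprint std. dev.
        "P3": fmean(footprints),             # footprint mean
        "S1": floor_area / unit_area,        # volume ratio (floor area ratio)
        "S2": pstdev(volumes),               # building volume std. dev.
        "S3": fmean(volumes),                # building volume mean
    }
```

Population standard deviations are used here because every building in the unit is included; a sample standard deviation would be an equally defensible choice.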

Ordinary Kriging
The Kriging method is a geostatistical method for generating an estimated surface from a set of scattered points with z-values. Ordinary Kriging is the most commonly used Kriging method, and its interpolation is calculated as shown in Eq. (1).
$$Z(S_0) = \sum_{i=1}^{N} \lambda_i Z(S_i) \qquad (1)$$
where $S_0$ is the predicted position, $Z(S_0)$ is the predicted value at position $S_0$, $\lambda_i$ is the weight of the measured value at position i, $Z(S_i)$ is the measured value at position i, and N is the number of measured values.
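The weighted-sum predictor described above can be sketched in a few lines. This is a hedged illustration rather than the ArcGIS implementation used in the study: the exponential semivariogram, its sill and range, and all function names are assumptions. The weights come from the usual ordinary Kriging system, in which a Lagrange multiplier forces them to sum to one.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def semivariogram(h, sill=1.0, practical_range=10.0):
    """ASSUMED exponential semivariogram without nugget."""
    return sill * (1.0 - math.exp(-3.0 * h / practical_range))

def ordinary_kriging(points, values, target):
    """Return (prediction at target, Kriging weights)."""
    n = len(points)
    # Semivariances between stations, bordered by the unbiasedness constraint.
    a = [[semivariogram(math.dist(p, q)) for q in points] + [1.0]
         for p in points]
    a.append([1.0] * n + [0.0])
    b = [semivariogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    return prediction, weights
```

With no nugget, the predictor reproduces the measured value at a station location, which is a convenient sanity check before interpolating a full surface.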

ANOVA
ANOVA was used to investigate whether different levels of one or more factors had a significant effect on the observed variables. Its significance was tested using the F-value, which was calculated as shown in Eqs. (2)-(4).
where $S_A^2$ and $S_B^2$ are the sums of squares of the effects of factors A and B, respectively; $S_{A \times B}^2$ is the sum of squares of the interaction effect of factors A and B; and $S_E^2$ is the sum of squared errors. If the p-value corresponding to the F-value is less than the significance level of 0.05, the factor has a significant effect on the test results. ANOVA was performed using SPSS 26.0.
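For orientation, a conventional form of the F statistics for a two-factor ANOVA with interaction, written with the sums of squares defined above, is sketched below. The degrees-of-freedom symbols, the level counts a and b, and the total number of observations N are notation assumed here for a design in which every combination of levels is observed; they are not taken from the source.

```latex
\begin{align}
F_A &= \frac{S_A^2 / df_A}{S_E^2 / df_E}, &
F_B &= \frac{S_B^2 / df_B}{S_E^2 / df_E}, &
F_{A \times B} &= \frac{S_{A \times B}^2 / df_{A \times B}}{S_E^2 / df_E}
\end{align}
% Assumed degrees of freedom:
%   df_A = a - 1, df_B = b - 1, df_{A x B} = (a - 1)(b - 1), df_E = N - ab
```

Each statistic compares the mean square of an effect with the error mean square, so a large F indicates that the differences between levels are large relative to the within-group variation.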

Spatial regression model
(1) Geographically weighted regression (GWR): embedding the spatial location of the sample points into the regression parameters quantifies spatial heterogeneity and reflects how the influence of building spatial distribution indicators on gaseous pollutants varies between regions, as shown in Eq. (5).
$$y_i = \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i \qquad (5)$$
where $(u_i, v_i)$ are the coordinates of sample point i, $\beta_k(u_i, v_i)$ is the spatial geographic location function, and $\varepsilon_i$ is the random error term of sample point i. The degree of fit was tested using the Akaike information criterion (AIC).
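The local estimation behind GWR can be sketched as one weighted least-squares fit per sample location, with weights that decay with distance from that location. The Gaussian kernel, the fixed bandwidth, and the function names below are assumptions made for illustration and are not claimed to match the settings used in the study.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def gwr_coefficients(coords, X, y, bandwidth):
    """Return one coefficient vector [intercept, b1, ...] per location."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    betas = []
    for c in coords:
        # ASSUMED Gaussian kernel with a fixed bandwidth.
        w = [math.exp(-0.5 * (math.dist(c, q) / bandwidth) ** 2)
             for q in coords]
        # Weighted normal equations (X'WX) beta = X'Wy for this location.
        xtwx = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows))
                 for k in range(p)] for j in range(p)]
        xtwy = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y))
                for j in range(p)]
        betas.append(solve(xtwx, xtwy))
    return betas
```

If the data follow a single global linear relationship, every location returns the same coefficients; coefficients that change from place to place are what the model reports as spatial heterogeneity.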

Spatial pattern of gaseous pollutant concentrations
Using the observed data from 76 pollutant monitoring stations in the central city of Jinan, interpolation analysis was carried out with the geostatistical analysis toolbox of ArcGIS 10.2. Cross-validation showed that the ordinary Kriging method had smaller errors and higher accuracy, so it was used to analyze the spatial distribution of each gaseous pollutant; the results are shown in Fig. 1.
As shown in Fig. 1, the CO concentration increases from southeast to northwest, with low values mainly distributed in the southeast and central parts of the study area and high values mainly distributed in the northwest. The NO and NO2 concentrations increase from south to north, with low values mainly distributed in the southern part of the study area and high values mainly distributed in the northern part. The low values of SO2 concentration are mainly distributed in the central part of the study area, and the high values are mainly distributed in the northwest and northeast.
The high-value areas of the four air pollutant concentrations are mainly distributed in the northern area, which has lower elevation and flat topography. This pattern is related to the topographic characteristics, the spatial distribution of buildings, the spatial distribution of pollution sources, the climate and meteorological characteristics, and the aerodynamic characteristics of the central city of Jinan. The topography of Jinan City is high in the south and low in the north, the buildings are dense in the north and sparse in the south, and some heating units and other polluting enterprises are located in the northern area. The dominant wind direction in December is northeast, with an average wind speed of 1.16 m/s, which is not conducive to the diffusion of air pollutants and leads to the gathering of pollutants in the northeast area.

Results of variance analysis
In this study, one-way and multi-way ANOVA were used to investigate the effects of single indicators of building spatial distribution, multiple indicators, and their combinations (interaction effects) on the distribution of gaseous pollutant concentrations.

Effect of the single index on concentration distribution of gaseous pollutants
The height and undulation of buildings, the size of their base area, their density, and other factors directly affect the three-dimensional structure of urban space, which affects air circulation and, consequently, the concentration distribution of gaseous pollutants. To test whether the influence of individual building spatial distribution indicators on the distribution of gaseous pollutant concentrations is significant, each of the 10 building spatial distribution indicators was classified into 10 categories using the quantile method, and one-way ANOVA was performed using SPSS 26.0 to calculate the F-value and p-value. A p-value less than 0.05 means that different levels of the indicator have a significant effect on the distribution of gaseous pollutant concentrations. The F-values and significance test results are presented in Table 1. As shown in Table 1, the effects of seven indicators, including DEM standard deviation (Y1), building height standard deviation (Y2), DEM mean (Y3), building height mean (Y4), building density (P1), and building volume standard deviation (S2), on the concentration distribution of the four gaseous pollutants (CO, NO, NO2, and SO2) reached a significance level of 0.001. The building footprint standard deviation (P2), building footprint mean (P3), and building volume mean (S3) reached a significance level of 0.001 for the distribution of CO, NO, and SO2 concentrations. The effect of S3 on the distribution of NO2 concentration reached a significance level of 0.01, whereas the effects of P2 and P3 on NO2 concentration were not significant. It can be seen that each building spatial distribution index is an important factor affecting the distribution of gaseous pollutant concentrations.
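The classification and test described above can be sketched as follows; this is an illustration only, since the study used SPSS 26.0. The rank-based quantile classification (ties are split by sort order) and the function names are assumptions. The p-value would then be read from the F distribution with the corresponding degrees of freedom.

```python
def quantile_classes(x, k=10):
    """Assign each value to one of k roughly equal-count classes by rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    classes = [0] * len(x)
    for rank, i in enumerate(order):
        classes[i] = rank * k // len(x)
    return classes

def one_way_anova_f(values, classes):
    """Return the one-way ANOVA F statistic of values grouped by class."""
    groups = {}
    for v, c in zip(values, classes):
        groups.setdefault(c, []).append(v)
    grand_mean = sum(values) / len(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

A typical use would pass one indicator (for example Y3) through `quantile_classes` and then test a pollutant concentration against the resulting classes with `one_way_anova_f`.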

Effects of multiple indicators and their interactions on concentration distribution of gaseous pollutants
The influence of building spatial distribution indicators on the concentration of gaseous pollutants is often not independent and involves interactions. Therefore, building on the analysis of individual indicators, it is necessary to discuss whether the influence of multiple indicators and their interactions on the concentration of gaseous pollutants is significant. We selected the interacting indicators based on physical significance and the urban planning perspective. The two-factor interactions were: Y1 and Y3 (degree of topographic relief and average terrain level); Y2 and Y4 (degree of undulation and average level of building height); Y3 and Y4 (average terrain level and average building height); Y3 and P1 (average terrain level and building density); Y3 and P3 (average terrain level and average building footprint); Y3 and S1 (average terrain level and building land use efficiency); Y3 and S3 (average terrain level and average building volume); Y4 and S1 (average building height and site efficiency); Y4 and P1 (average building height and density); and Y4 and P3 (average building height and average footprint). The three-factor interactions were: Y3, Y4, and P1 (average terrain level, average building height, and density); Y3, Y4, and S2 (average terrain level, average building height, and variation in building volume); Y3, P1, and P3 (average topography, building density, and average footprint); Y4, P1, and P3 (average building height, density, and footprint); Y4, P1, and S1 (average building height, density, and site efficiency); and Y4, P3, and S1 (average building height, footprint, and site efficiency). For these indicators, multi-factor ANOVA with interaction was performed using SPSS 26.0, and the results are shown in Tables 2 and 3.
As shown in Table 2, the interactions of Y3 and Y4, Y3 and S3, and Y4 and S1 had significant effects on the concentration distribution of all four gaseous pollutants (CO, NO, NO2, and SO2), with the interaction of Y3 and Y4 reaching a significance level of 0.001 for all four. The interactions of Y3 and P1, Y3 and S1, and Y4 and P1 had significant effects on the concentration distribution of three gaseous pollutants (CO, NO, and NO2), with the interactions of Y3 and P1 and of Y3 and S1 reaching a significance level of 0.001 for these three pollutants. The interaction of Y1 and Y3 reached a significance level of 0.001 for the distribution of NO, NO2, and SO2. The interaction of Y2 and Y4 reached significance levels of 0.001 and 0.05 for the distribution of CO and NO2, respectively; the interaction of Y3 and P3 reached a significance level of 0.05 for both CO and NO; and the interaction of Y4 and P3 had no significant effect on the concentration of any of the four pollutants. Overall, the interactions of the DEM mean (Y3) and building height mean (Y4) with other indicators had significant effects on the concentration distribution of the four pollutants. The topographic elevation, building height, and their interactions with other building spatial distribution indicators are important factors affecting the distribution of gaseous pollutant concentrations.
As shown in Table 3, the F-values of the three-factor interaction ANOVA differed from those of the one- and two-factor ANOVA in Tables 1 and 2. The effect of the interaction of the DEM mean (Y3), building height mean (Y4), and building volume standard deviation (S2) on the distribution of NO and NO2 concentrations was significant at the 0.001 level, while the effects of the other three-factor interactions on the distribution of the four pollutants were not significant. The effects of the interactions of Y3 and Y4, Y3 and P1, and Y3 and S2 on the concentration distribution of three gaseous pollutants (CO, NO, and NO2) were significant at the 0.001 level. The interactions of Y4 and P1 and of Y4 and S1 had significant effects on the concentration distribution of NO and NO2, with the interaction of Y4 and P1 on NO and the interaction of Y4 and S1 on NO and NO2 reaching a significance level of 0.01. The interaction of P1 and P3 on SO2 distribution reached a significance level of 0.05. Overall, the three-factor interactions mostly had no significant effect on the concentration distribution of gaseous pollutants, whereas the interactions of Y3, Y4, and P1 with other indicators had significant effects. This indicates that topographic elevation, building density, building height, and their interactions with other building spatial distribution indicators are important factors that influence the concentration of gaseous pollutants.

Results of spatial regression analysis
To quantitatively study the degree of influence of different building spatial distribution indicators on the distribution of gaseous pollutant concentrations, the variance inflation factor (VIF) was first used to test for multicollinearity among the indicators. A VIF > 10 indicates a collinearity problem, which was eliminated by removing variables step by step. SPSS 26.0 was used to perform the multicollinearity test, and eight indicators were finally selected (Y1, Y2, Y3, Y4, P1, P2, P3, and S3); the VIF values of these indicators were all less than 6, and most were less than 5. Using these eight indicators, which do not show significant collinearity, we applied spatial lag, spatial error, and spatial Durbin regression models (Qiu & Huang, 2020; Xi & Li, 2015) to study the combined effects of building spatial distribution indicators on gaseous pollutants. The GWR model was then used to study the spatial heterogeneity of these effects.
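For reference, the VIF and the three spatial regression models named above are commonly written as follows. This is the standard textbook notation and is offered as an assumption about the forms intended here: $R_j^2$ is the coefficient of determination from regressing indicator j on the remaining indicators, W is the spatial weight matrix, $\rho$ and $\lambda$ are spatial autoregressive coefficients, and $\beta_1$, $\beta_2$ are assumed labels for the coefficients on X and WX in the spatial Durbin model.

```latex
\begin{align}
\mathrm{VIF}_j &= \frac{1}{1 - R_j^2} \\
\text{SLM:}\quad y &= \rho W y + X\beta + \varepsilon \\
\text{SEM:}\quad y &= X\beta + u, \qquad u = \lambda W u + \varepsilon \\
\text{SDM:}\quad y &= \rho W y + X\beta_1 + W X \beta_2 + \varepsilon
\end{align}
```

In this notation the SEM places the spatial dependence in the error term only, while the SDM adds both the neighbors' dependent variable (Wy) and the neighbors' indicators (WX) to the regression.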

SLM, SEM, and SDM model results
The SLM, SEM, and SDM were fitted with gaseous pollutant concentration (y) as the dependent variable, the eight building spatial distribution indicators selected above as the independent variables, and a spatial weight matrix (W) based on Queen contiguity. GeoDa 1.20 was used for the calculations, and the optimal model was selected based on the log likelihood, AIC, and R-squared values. The optimal models and test results for the four pollutant concentrations are shown in Table 4. As seen in Table 4, (1) the relationship between CO and NO2 concentrations and building spatial distribution indicators is best described by the SEM, which shows that the spatial distribution of CO and NO2 concentrations is mainly influenced by the building spatial distribution indicators of the study unit itself, and that the influence between CO and NO2 concentrations in neighboring study units results from random factors. (2) The relationship between NO and SO2 concentrations and building spatial distribution indicators is best described by the SDM, indicating that the spatial distribution of NO and SO2 concentrations is influenced not only by the building spatial distribution indicators of the study unit itself but also by the same indicators in neighboring study units and by the NO and SO2 concentrations in neighboring study units. Coefficient β2 of the SDM and its test results are listed in Table 5.
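A Queen contiguity weight matrix treats two study units as neighbors when their boundaries share at least one point. The sketch below approximates this by testing for shared boundary vertices, row-standardizes the weights, and drops units without neighbors. The dictionary-based data layout, the function names, and the assumption that adjoining polygons share identical digitized vertices are all introduced here for illustration; the study itself used GeoDa.

```python
def queen_weights(unit_vertices):
    """Build row-standardized Queen contiguity weights.

    unit_vertices maps a unit id to its boundary vertices [(x, y), ...].
    ASSUMPTION: adjoining units share identical vertex coordinates.
    Units with no neighbors are dropped from the result.
    """
    vertex_sets = {u: set(v) for u, v in unit_vertices.items()}
    weights = {}
    for u, vs in vertex_sets.items():
        neighbors = [v for v, other in vertex_sets.items()
                     if v != u and vs & other]
        if neighbors:
            # Each row sums to one after standardization.
            weights[u] = {v: 1.0 / len(neighbors) for v in neighbors}
    return weights

def spatial_lag(weights, y):
    """Return Wy: the weighted mean of y over each unit's neighbors."""
    return {u: sum(w * y[v] for v, w in row.items())
            for u, row in weights.items()}
```

The spatial lag Wy computed this way is the neighbor-average term that appears in the spatial lag and spatial Durbin models.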

GWR model results
To further analyze the spatial heterogeneity of the influence of building spatial distribution indicators on the distribution of the four gaseous pollutant concentrations, the GWR model was constructed using the geographically weighted regression tool in the ArcGIS 10.7 spatial statistics toolbox. The significance test results of the GWR model for the four gaseous pollutants and the spatial distribution index of the buildings are shown in Table 6.
As seen in Table 6, among the GWR models relating the four pollutants to the building spatial distribution indicators, the models for CO and NO concentrations had good global fits, with R² of 0.90 and 0.87 and adjusted R² of 0.89 and 0.86, respectively. The local fits of these two models were also good, with mean and median local R² all greater than 0.5 (0.63 and 0.53, and 0.65 and 0.52, respectively). The GWR models for NO2 and SO2 concentrations had poorer global and local fits than those for CO and NO; in particular, the mean and median of their local R² were only in the range of 0.30-0.35, indicating a poor local fit. Therefore, based on these test results, only the CO and NO concentrations were selected for GWR modeling with the building spatial distribution indicators.
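The idea behind the local coefficients and local R² reported for the GWR models can be sketched as a weighted least-squares fit repeated at every location. This is a simplified illustration, not the ArcGIS 10.7 implementation: the fixed Gaussian kernel, the `bandwidth` parameter, the weighted definition of local R², and the function name `gwr` are all assumptions.

```python
import numpy as np

def gwr(coords, X, y, bandwidth):
    """Basic GWR with a fixed Gaussian kernel.

    Returns (betas, local_r2): betas[i] holds the intercept and slopes
    estimated at location i; local_r2[i] is a kernel-weighted R^2 there.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    betas = np.empty((n, A.shape[1]))
    local_r2 = np.empty(n)
    for i in range(n):
        # Nearby observations get weights close to 1, distant ones close to 0.
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        Aw = A * w[:, None]
        # Weighted least squares: (A' W A) beta = A' W y
        betas[i] = np.linalg.solve(A.T @ Aw, Aw.T @ y)
        resid = y - A @ betas[i]
        ybar = np.average(y, weights=w)
        local_r2[i] = 1.0 - (w * resid**2).sum() / (w * (y - ybar) ** 2).sum()
    return betas, local_r2

# Synthetic demonstration (hypothetical data): the slope grows from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))
X = rng.normal(size=(200, 1))
y = (1.0 + coords[:, 0]) * X[:, 0]
betas, local_r2 = gwr(coords, X, y, bandwidth=0.3)
print(betas[:, 1].min(), betas[:, 1].max())
```

Mapping one column of `betas` over the study units gives a coefficient surface of the kind used to describe spatial heterogeneity, and summarizing `local_r2` by its mean and median gives a local-fit measure in the spirit of the one reported in Table 6.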
[Figure panel labels: (e) Building density, 10^-2; (f) Building footprint standard deviation, 10^-6; (g) Building footprint mean, 10^-6; (h) Building volume mean, 10^-6]
To analyze the spatial heterogeneity of the influence of the eight building spatial distribution indicators on the distribution of CO and NO concentrations, the coefficients of each indicator were determined from the constructed GWR models, and the results are shown in Figs. 2 and 3. As shown in Fig. 2, the regression coefficients of the DEM standard deviation (Y1), building height mean (Y4), and building footprint mean (P3) were mostly positive, indicating mainly positive correlations, while those of the building height standard deviation (Y2), DEM mean (Y3), and building density (P1) were mostly negative, indicating mainly negative correlations. The Y2 and Y3 regression coefficients showed a stepwise distribution from southwest to northeast, gradually decreasing from southwest to northeast. The regression coefficients of building density (P1) and building footprint standard deviation (P2) showed a stepwise distribution from west to east, gradually decreasing from east to west. The building footprint mean (P3) and building volume mean (S3) form a peak area in the central building-dense area, with the P3 regression coefficient gradually decreasing from the center to the periphery and the S3 regression coefficient gradually increasing from the center to the periphery. Negative values of the Y1 regression coefficient are mainly distributed in the southwest and northeast corners, those of the Y4 regression coefficient in the southwest and east, those of the P2 regression coefficient in the central and southeast areas, and those of the P3 regression coefficient in the periphery of dense building areas. Positive values of the Y2 regression coefficient are mainly distributed in the periphery of dense building areas, those of the P1 regression coefficient in a few areas in the west, and those of the S3 regression coefficient in dense building areas.
As shown in Fig. 3, the regression coefficients of the DEM standard deviation (Y1), building height mean (Y4), building footprint mean (P3), and building volume mean (S3) are mostly positive, indicating mainly positive correlations, whereas those of the building height standard deviation (Y2), DEM mean (Y3), building density (P1), and building footprint standard deviation (P2) are mostly negative, indicating mainly negative correlations. The Y2 and Y3 regression coefficients showed a stepwise distribution from southwest to northeast, gradually decreasing from southwest to northeast. The P2 regression coefficient showed a stepwise distribution from west to east, gradually decreasing from east to west. The S3 regression coefficient forms a peak area in the central building-dense area and gradually rises from the center to the periphery. Negative values of the Y1 regression coefficient are mainly distributed in the southwest and northeast corners, those of the Y4 regression coefficient on the east and west sides, those of the P3 regression coefficient on the south-central and western sides, and those of the S3 regression coefficient in the central building-dense area and northeast corner. Positive values of the Y2 regression coefficient are mainly distributed in the northeast corner, and those of the P2 regression coefficient in the northwest corner.
The distribution trends of the regression coefficients of the urban structure indicators in the CO and NO concentration models are roughly similar, but the values differ, indicating that the spatial distribution indicators of buildings have different effects on the concentrations of different pollutants. The regression coefficients of Y3 in both GWR models were negative, indicating that the DEM mean was negatively correlated with the CO and NO concentrations. The regression coefficients of P1 in the GWR model of NO concentration were all negative, indicating a negative relationship between building density and NO concentration.

Conclusions and discussion
This study took the central city of Jinan as the research area. First, the city road network was used to construct blocks as the research units, and the building spatial distribution index system was constructed using building big data and DEM data. Then, ANOVA was used to investigate the influence of different levels of the building spatial distribution indicators on the distribution of gaseous pollutant concentrations from multiple perspectives. Finally, spatial regression models were used to further investigate the role of the building spatial distribution indicators in the distribution of gaseous pollutant concentrations. The key conclusions are as follows:
(1) Most of the building spatial distribution indicators have a significant impact on the concentration distribution of gaseous pollutants. The effects of seven indicators, including the DEM standard deviation and DEM mean, on the concentration distribution of the four gaseous pollutants were highly significant; the effects of the building footprint standard deviation and building footprint mean on the concentration distribution of CO, NO, and SO2 were highly significant, whereas neither had a significant effect on the NO2 concentration. The interactions of most two-factor combinations of indicators had a significant effect on the pollutant concentration distribution; among them, topographic elevation and building height, and their interactions with other building spatial distribution indicators, were important factors affecting the distribution of gaseous pollutants. For most three-factor interactions, the ANOVA F-values differed from those of the single-factor and two-factor analyses, and the effect on the concentration distribution of the four gaseous pollutants was not significant.
(2) The spatial distribution of CO and NO2 concentrations is mainly influenced by the building spatial distribution indicators of the study unit itself. The mutual influence of CO and NO2 concentrations between adjacent study units results from stochastic factors, whereas NO and SO2 concentrations are influenced by the building spatial distribution indicators of the study unit itself, the same indicators in the neighboring units, and the NO and SO2 concentrations in the neighboring units. An increase in the DEM mean significantly contributes to an increase in NO and SO2 concentrations in the neighboring area, and the spillover effect of this indicator on the neighboring area is greater than its direct effect on the study unit itself. An increase in the building height standard deviation significantly reduces the NO concentration in the neighborhood. The negative spillover effects of the DEM standard deviation, building density, building footprint standard deviation, building footprint mean, and building volume mean on neighboring NO and SO2 concentrations were not significant, nor were the positive spillover effects of the building height mean on neighboring NO and SO2 concentrations.
(3) The GWR models of CO and NO concentrations and the building spatial distribution indicators have good global and local fits, whereas the mean and median local R² of the GWR models constructed for NO2 and SO2 concentrations are only in the range of 0.30-0.35, which indicates a poor local fit. The values of the regression coefficients of each urban structure indicator differ between the GWR models constructed for CO and NO concentrations, although their overall spatial trends are broadly similar, indicating that the building spatial distribution indicators have different effects on the concentrations of different pollutants.
The DEM standard deviation, building height mean, and building footprint mean were mainly positively correlated with CO and NO concentrations; the building height standard deviation, DEM mean, and building density were mainly negatively correlated with CO and NO concentrations; the building volume mean was mainly positively correlated with NO concentrations; and the building footprint standard deviation was mainly negatively correlated with NO concentrations.
A strength of the study is that urban roads and building attributes were used to delineate neighborhoods as the study units. Study units delineated by neighborhood can reflect human aggregation and better reflect human activities (Liu et al., 2021b). This overcomes disadvantages of the commonly used grid-based division of research units, such as buildings being split by grid cells, which makes building properties inaccurate. To reduce the influence of multicollinearity among the indicators, eight building spatial distribution indicators with VIF values less than 6 were selected according to the multicollinearity test in SPSS 26.0 and the exploratory regression results in ArcGIS 10.7, and four spatial regression models (SLM, SEM, SDM, and GWR) were used for the regression analysis of the four gaseous pollutants and the building spatial structure indicators. These models can reduce the effect of spatial non-stationarity (Zhu et al., 2020) and explore the spatial variation of gaseous pollutant concentrations and the related drivers more accurately than commonly used models such as ordinary least squares (OLS). Monitoring station data were used as the data source for gaseous pollutant concentrations. Although the accuracy of these data is high, the study has limitations: the interpolation introduces model error and a "points standing in for the surface" problem, which affects the accuracy of the results to some extent. In subsequent work, the interpolation results of the monitoring data will be compared with pollutant concentrations retrieved from remote sensing images to obtain more accurate data on the spatial distribution of pollutant concentrations and make the research results more objective and accurate.
Yu analyzed the data. Xinwei Yu drafted the manuscript. Baoyan Shan and Yongqiang Lv acquired the funding. Yanqiu Chen, Qiao Zhang, Qixin Ren, and Yongqiang Lv supervised and complemented the writing. All authors have read and agreed to the published version of the manuscript.
Funding This work was financially supported by the Humanities and Social Science Fund of the Ministry of Education of the People's Republic of China (NO.12YJA790019) and the Natural Science Foundation of Shandong Province (ZR2020QD021).
Data availability The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Declarations
Competing interests The authors declare no competing interests.