Exploration of Temporal-Spatially Varying Impacts on COVID-19 Cumulative Cases in Texas Using Geographically Weighted Regression (GWR)

Abstract: Because COVID-19 seriously threatens human health, identifying the impacts of its contributing factors is important for curbing the spread of the virus. To address the complexity of COVID-19 expansion across space and time, this research analyzes spatial-temporal heterogeneity at the county level in Texas. First, factors affecting COVID-19 are drawn from social, economic, and environmental facets, and their communality is extracted through Principal Component Analysis (PCA). Second, COVID-19 cumulative cases (CC) serve as the dependent variable and the common factors as the independent variables. According to the stages of virus prevalence, the spatial-temporal disparity is divided into four quarters in the GWR modeling analysis. The findings show that the GWR models provide a better fit and more geographically oriented information than the OLS models. In the El Paso, Odessa, Midland, Randall, and Potter County areas of Texas, population, hospitalization, and age structure show static, positive influences on COVID-19 cumulative cases, indicating that these areas should adopt stringent strategies to curb COVID-19. Winter is the most sensitive season for virus spread, implying that more attention should be paid to prevention and precautions in the last quarter. This research is expected to provide references for preventing and controlling COVID-19 and related infectious diseases, and evidence for disease surveillance and response systems to facilitate the appropriate uptake and reuse of geographical data.


Introduction
From summer to winter in 2020, the virus, like a perfect storm, spread to virtually every part of the U.S. at a speed unprecedented in American history, according to Johns Hopkins University data. Jamie Ducharme argued that the pandemic had claimed more than three times the American lives lost in the Vietnam War (Ducharme, 2020). The coast-to-coast surge pushed hospitals across the country to the edge of catastrophe. Doctors and [...]

[...] factors such as race/ethnicity and socioeconomic status, which encode vulnerability to adverse health outcomes such as negative effects of [...]

Analysis of the relationship between these possible risk factors (e.g., AQI, race, gender) and COVID-19 in different counties will help develop policies to prevent and control the spread of COVID-19. The spatial-temporal distribution of COVID-19 will also contribute to county-driven, real-time, dynamic COVID-19 monitoring systems. The advantage is that the results can be used directly to draw up community containment strategies, which are fundamental public health measures used to control the spread of communicable diseases, including isolation and quarantine (Mollalo et al., 2020). Therefore, this paper unveils spatial-temporal heterogeneity at the county level within a state, providing real-time scientific evidence for creating an effective [...]

For the temporal study, the time series was divided into four layers corresponding to the four seasons of 2020. Quarterly statistical data are based on environmental and socioeconomic indexes at the end of each season, matched to COVID-19 NC and TC at that time. The temporal-study framework is shown in Fig. 1.

For the spatial study, we explore correlations between variables with SPSS before building the GWR models, regardless of variable type.
Since the dependent variables must meet the assumption of a normal distribution, we describe their statistical characteristics and analyze their spatial autocorrelation. All explanatory variables, after standardization, are then examined by Principal Component Analysis to eliminate multicollinearity. After that, we fit simple Ordinary Least Squares (OLS) and geographically weighted regression models between the variables. Finally, by comparing the two models, we focus on their differences in spatial heterogeneity and analyze how those differences arise, as shown in Fig. 2.

Data standardization is the process of making sure that a dataset can be compared with other datasets. It is a key part of the research, and standardized data are essential for accurate data analysis. It is also easier to draw clear conclusions about current data when there are other data to measure them against. After standardization with the Z-score, the data have a mean of 0 and a standard deviation of 1.

Stepwise Regression (SR) is an automatic variable selection procedure that selects, from a set of candidates, the explanatory variables that are most related to the dependent variable. We used the unidirectional forward method.

Simple OLS is the estimation of a linear relationship between two variables, $Y$ and $X$, of the form:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \qquad (1)$$

where $Y_i$ denotes the $i$th observation on the dependent variable $Y$, which could be CC; $X_i$ denotes the $i$th observation on the independent variable $X$, which could be a synthetic factor; $\beta_0$ and $\beta_1$ are the intercept and slope; and $\varepsilon_i$ is the disturbance. The OLS assumptions are that the disturbances have zero mean and constant variance and are uncorrelated with one another. The explanatory variable $X$ in OLS is non-stochastic.
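As an illustration of the two steps just described, the following is a minimal sketch of Z-score standardization followed by a simple OLS fit. It is not the authors' SPSS workflow; the function names `z_scores` and `simple_ols` and the toy numbers are hypothetical and chosen only for demonstration.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sigma for v in values]


def simple_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data, not taken from the study.
factor = [1.0, 2.0, 3.0, 4.0, 5.0]
log_cc = [2.1, 3.9, 6.2, 7.8, 10.0]

z = z_scores(factor)            # standardized explanatory variable
b0, b1 = simple_ols(z, log_cc)  # intercept and slope of the fitted line
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

Because the explanatory variable is standardized first, the slope reads as the change in the dependent variable per one standard deviation of the factor, which makes coefficients of different factors comparable.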

Geographically Weighted Regression
According to the first law of geography, geographical entities that are closer together are more similar (Tobler, 1970). Meanwhile, owing to the unbalanced distribution of natural resource endowments and socioeconomic factors across regions, interregional spatial correlation and spatial heterogeneity also exist. Because of these, the assumptions of global regression models no longer hold, for instance that data values are independent of geographical location, that there is no spatial correlation, and that the sample data are balanced. Therefore, global parameters cannot properly explain an individual situation, and hence spatial heterogeneity. Building on Foster's spatially varying parameter regression, Fotheringham proposed the Geographically Weighted Regression (GWR) model (Fotheringham et al., 2002), which uses a local smoothing method to address spatial heterogeneity. With spatial heterogeneity taken into consideration, geographic coordinates and kernel functions are used to carry out local regression estimation on the neighbors of each sample. The GWR model is given in Eq. (2) (Nakaya, 2016):

$$y_i = \sum_{k} \beta_k(u_i, v_i)\, x_{k,i} + \varepsilon_i \qquad (2)$$

where $i$ denotes the individual sample; $(u_i, v_i)$ are the coordinates of sample $i$; $\beta_k(u_i, v_i)$ is the $k$th regression parameter of sample $i$; $y_i$ is the dependent variable of sample $i$; $x_{k,i}$ is the $k$th independent variable for sample $i$; and $\varepsilon_i$ is the random error term, which follows a normal distribution with constant variance. The parameter estimate for sample $i$ is given by:

$$\hat{\beta}(u_i, v_i) = \left(X^{T} W(u_i, v_i)\, X\right)^{-1} X^{T} W(u_i, v_i)\, Y \qquad (3)$$

where $X$ is the matrix of independent variables, $Y$ is the vector of the dependent variable, and $W(u_i, v_i)$ is the spatial weight matrix, whose selection and setting are the core issue of GWR. Its calculation consists of two major steps. The first step is the selection of a proper kernel function to express the spatial relationship between the observed units.
Specifically, four major kernel functions are used in existing research: fixed Gaussian, fixed bi-square, adaptive bi-square, and adaptive Gaussian. Since the choice of kernel function plays a direct and decisive role in obtaining the most accurate possible estimates of the spatially varying regression parameters, after careful analysis and comparison the fixed Gaussian was chosen as the kernel function in this paper, which is expressed as

$$w_{ij} = \exp\left(-\frac{d_{ij}^{2}}{\theta^{2}}\right) \qquad (4)$$

where $w_{ij}$ represents the distance weight from sample $i$ to sample $j$; $d_{ij}$ is the Euclidean distance between sample $i$ and sample $j$; and $\theta$ is the bandwidth, which determines the speed at which the spatial weight attenuates with distance. The second step of the spatial weight matrix calculation is the selection of the optimal bandwidth, which contributes to a better fit. According to the GWR4.09 User Manual (Nakaya, 2016), [...]

The precondition of regression analysis is that the dependent variable follows a normal distribution. A normal distribution has two conditions: the variable is symmetric about the mean, and it is more likely to lie near the mean than far from it. Thus, normality is checked for the 5 dependent variables quarterly. After logarithmic transformation, quarterly CC follows a normal distribution in every quarter except the first (Fig. 4). When modeling the GWR regressions, the first-quarter CC is therefore excluded because of its skewed distribution.

[...] which is negatively related to COVID-19 CC. This demonstrates that keeping spatial distance helps reduce COVID-19 CC. Factor 6 refers to medical supply (i.e., BPC), meaning that hospital beds are positively related to COVID-19 CC. The 5 factors in the third quarter are identical to the 5 factors in the first quarter except for factor 6. The 5 factors in the last quarter are similar to the 5 factors in the first quarter except for factor 3.
The distinction in the last quarter is that factor 3 (natural supply) also includes PCN, meaning that precipitation positively influences the increase in COVID-19 cases. The specific relationships are shown in Table 3 and Table 4.
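The kernel weighting and local parameter estimation described in this section can be sketched as follows. This is a minimal illustration, not the GWR4 implementation: it assumes a fixed Gaussian kernel of the form exp(-d^2 / bandwidth^2), a single explanatory variable plus an intercept, and hypothetical toy coordinates and values. The names `gaussian_weight`, `local_fit`, and `gwr` are introduced here for illustration only.

```python
import math


def gaussian_weight(d, bandwidth):
    """Fixed Gaussian kernel (assumed form): weight decays with distance d."""
    return math.exp(-(d ** 2) / (bandwidth ** 2))


def local_fit(coords, x, y, i, bandwidth):
    """Weighted least squares for y = b0 + b1 * x, centred on location i."""
    ui, vi = coords[i]
    w = [gaussian_weight(math.hypot(u - ui, v - vi), bandwidth)
         for u, v in coords]
    sw = sum(w)
    x_bar = sum(wj * xj for wj, xj in zip(w, x)) / sw
    y_bar = sum(wj * yj for wj, yj in zip(w, y)) / sw
    sxx = sum(wj * (xj - x_bar) ** 2 for wj, xj in zip(w, x))
    sxy = sum(wj * (xj - x_bar) * (yj - y_bar)
              for wj, xj, yj in zip(w, x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def gwr(coords, x, y, bandwidth):
    """Return one (intercept, slope) pair per location."""
    return [local_fit(coords, x, y, i, bandwidth) for i in range(len(y))]


# Hypothetical toy data: a western and an eastern cluster whose
# true slopes differ (1 in the west, 3 in the east).
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

local = gwr(coords, x, y, bandwidth=1.0)        # local coefficients differ by cluster
near_global = gwr(coords, x, y, bandwidth=1e6)  # huge bandwidth approaches global OLS
print([round(b1, 3) for _, b1 in local])
print([round(b1, 3) for _, b1 in near_global])
```

With a small bandwidth each location is fitted almost entirely from its own cluster, so the local slopes recover the spatial difference; with a very large bandwidth every weight approaches 1 and the result collapses to the single global slope, which is the situation the global model assumes.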

Comparison of composite OLS & GWR models
OLS modeling examines whether there is a linear relationship between CC and its factors. All factors pass the t-test and the F-test. GWR modeling examines whether there is a spatial-temporal relationship between CC and its factors. Since COVID-19 CC is clustered and varies across the study area, an ADAPTIVE kernel is appropriate in the GWR models. The AICc method was chosen to find the bandwidth which [...] less than that of the OLS maps. Predicted CC in the GWR quarterly maps is more clustered than in the OLS quarterly maps; the cluster area is in eastern and northern Texas. Therefore, the GWR model is superior to the OLS model. In Table 5, [...]

Fig. 4 incorporates Texas spatial-temporal distribution maps based on the 6 factors, covering the 6 aspects in Table 3, in three quarters.

In the second quarter, factor 1 among the 6 factors has the largest effect on CC in northern Texas, with a maximum coefficient of 6.88. Its impact is lowest in eastern Texas, with a coefficient range of 0.61 to 0.88, indicating that total population and hospitalization are the key factors of COVID-19 and that northern Texas is the main precaution and control area for COVID-19. Factor 2 (age structure) positively affects COVID-19 spatial heterogeneity in central Texas (shown in pink). The area with the largest coefficient range, 1.41 to 1.6, is in northern Texas. The smallest impacts, with a coefficient range of 0.43 to 0.83, are in the coastal area at the bottom of the map. Factor 3 is an air quality index with remarkable spatial disparity: its coefficients run from the range -1.78 to -1.09 up to the range 0.15 to 0.66. In central Texas, the improvement of air quality is driven by COVID-19 CC, but it works in reverse in northern Texas. This indicates that AQI is spatially non-stationary and that environmental management is available for reducing CC in northern Texas. Factor 4 is an economic composite index whose coefficients run from the range -0.86 to 0.50 up to the range 0.56 to 0.81.
The spatial heterogeneity lies between northern Texas, the coastal counties, and eastern Texas. Factor 5 is the natural supply index, whose coefficients run from the range 0.02 to 0.28 up to the range 1.24 to 1.90. The spatial heterogeneity is subtle. Factor 6 is the medical supply index, whose coefficients run from the range -0.56 to -0.31 up to the range 0.32 to 0.51. The change in spatial heterogeneity makes it evident that medical conditions in northern Texas are worse than in other Texas counties (https://www.dallasnews.com/news/2021/01/22).

In the third quarter, factor 1 among the 6 factors is the dominant effect on CC due to the maximum range of [...] of western Texas. Factor 3 is a natural supply index with remarkable spatial disparity: its coefficients run from the range -1.10 to 0.26 up to the range 0.83 to 1.36. In central Texas, land area drives COVID-19 CC, but it works in reverse in northern Texas. This indicates that spatial distancing is not available for northern Texas, compared with central Texas. Factor 4 is an economic composite index whose coefficients run from the range -0.49 to -0.28 up to the range 0.54 to 0.82. The spatial heterogeneity lies between central Texas, the coastal counties, and eastern Texas. Factor 5 is the air quality index, whose coefficients run from the range -1.09 to -0.41 up to the range 0.84 to 1.34. The spatial heterogeneity is obvious: positive impacts extend from western Texas to eastern Texas, while negative impacts extend from north Texas to western and southern Texas.

In the fourth quarter, factor 1 among the 6 factors is still the dominant effect on CC, with a maximum coefficient range of 3.99 to 6.6. Spatial heterogeneity is slight, implying that it is a fixed factor. Factor 2 is an economic composite index whose coefficients run from the range -0.58 to -0.23 up to the range 1.12 to 1.74. The spatial heterogeneity is that areas of positive impact decrease while areas of negative impact increase.
Factor 3 is a natural supply index whose coefficients run from the range -0.28 to 0.03 up to the range 1.65 to 2.49. The spatial heterogeneity is that areas of positive impact decrease while areas of negative impact move from north Texas to eastern Texas. Factor 4 is the age structure index, whose coefficients run from the range 0.28 to 0.49 up to the range 1.02 to 1.16. The spatial heterogeneity is that areas of both positive and negative impacts increase. Factor 5 is the air quality index, whose coefficients run from the range -1.09 to -0.41 up to the range 0.84 to 1.34. The spatial heterogeneity is obvious: positive impacts extend from western Texas to eastern Texas, while negative impacts extend from north Texas to western and southern Texas. Factor 6 is the medical supply index, whose coefficients run from the range -0.87 to -0.52 up to the range 0.29 to 0.58. The change in spatial heterogeneity makes it evident that areas of positive impact move from eastern Texas to western and south Texas, while areas of negative effect decrease and move.

Natural supply impacts across the three quarters have fluctuated. First, the coefficient range across the three quarters changes from [0.02, 1.9] to [-1.10, 1.36] and then to [-0.28, 1.49]. This demonstrates that the role of natural supply cannot be controlled. Second, the cluster of positive impacts (red) decreases from 24 counties in north Texas to 9 counties. Simultaneously, the areas of negative impacts (blue) shift from the east to the north and finally settle in the east. This means that natural impacts are weakening compared with the impacts of other factors.

Medical supply impacts across the three quarters have fluctuated as well. First, the coefficient range changes from [-0.5, 0.51] to 0 and then to [-0.87, 0.58]. This demonstrates that the role of medical supply is slight and cannot be controlled.
Second, the cluster of positive impacts (red) expands from the southeast toward the southwest. Simultaneously, the areas of negative impacts (blue) decrease from the center to the north. Interestingly, the third quarter shows no impact, indicating that medical capacity is limited and scarce.
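The AICc-based bandwidth choice mentioned in this section can be sketched as a search over candidate bandwidths. The sketch below is illustrative only: it assumes a fixed Gaussian kernel, one explanatory variable plus an intercept, and one commonly cited form of the GWR AICc, 2n ln(sigma) + n ln(2 pi) + n (n + tr(S)) / (n - 2 - tr(S)), where S is the hat matrix. The function name `gwr_aicc`, the candidate list, and the toy data are hypothetical.

```python
import math


def gwr_aicc(coords, x, y, bandwidth):
    """Return (AICc, trace of hat matrix) for a one-predictor GWR fit."""
    n = len(y)
    rss = 0.0
    trace_s = 0.0
    for i in range(n):
        ui, vi = coords[i]
        w = [math.exp(-((u - ui) ** 2 + (v - vi) ** 2) / bandwidth ** 2)
             for u, v in coords]
        # Entries of the weighted normal equations for [b0, b1].
        s0 = sum(w)
        s1 = sum(wj * xj for wj, xj in zip(w, x))
        s2 = sum(wj * xj * xj for wj, xj in zip(w, x))
        t0 = sum(wj * yj for wj, yj in zip(w, y))
        t1 = sum(wj * xj * yj for wj, xj, yj in zip(w, x, y))
        det = s0 * s2 - s1 * s1
        b0 = (s2 * t0 - s1 * t1) / det
        b1 = (s0 * t1 - s1 * t0) / det
        rss += (y[i] - (b0 + b1 * x[i])) ** 2
        # Diagonal element S_ii of the hat matrix at location i.
        trace_s += w[i] * (s2 - 2.0 * s1 * x[i] + s0 * x[i] ** 2) / det
    dof = n - 2.0 - trace_s
    if dof <= 0.0 or rss <= 0.0:
        return math.inf, trace_s  # AICc undefined; treat as worst case
    sigma = math.sqrt(rss / n)
    aicc = (2.0 * n * math.log(sigma) + n * math.log(2.0 * math.pi)
            + n * (n + trace_s) / dof)
    return aicc, trace_s


# Hypothetical toy data: 20 locations on a line, slope drifting eastward.
coords = [(float(k), 0.0) for k in range(20)]
x = [((7 * k) % 5) - 2.0 for k in range(20)]
y = [(1.0 + 0.1 * k) * xk + 0.05 * math.sin(3 * k)
     for k, xk in enumerate(x)]

candidates = [2.0, 4.0, 8.0, 16.0]
scores = {b: gwr_aicc(coords, x, y, b)[0] for b in candidates}
best = min(scores, key=scores.get)  # bandwidth with the lowest AICc
print(scores, best)
```

The trace of the hat matrix acts as the effective number of parameters: it approaches 2 (the global OLS case) as the bandwidth grows and rises as the bandwidth shrinks, so AICc trades goodness of fit against local model complexity.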

Discussion
In this study, 14 potential risk variables are selected from race, climate, land cover, demographic, hospitalization, gender, age structure, and socioeconomic categories as independent variables to estimate their spatial-temporal impacts on the distribution of COVID-19 cumulative cases at the county level in Texas. Since current research lacks consideration of time series models, spatial-temporal GWR is explored to accurately identify the distribution of the pandemic in Texas.

[...] to spatial-temporal quarterly GWR models, yet there is still some distance to go before real-time dynamic GWR models are reached. GTWR or more effective spatial-temporal models will be considered in future research.

Implications
The COVID-19 pandemic revealed systemic flaws in a food distribution system that fails to protect against hunger and diet-influenced non-communicable diseases. It also exposed the conditions that made people who are living on low incomes, disenfranchised, discriminated against, and chronically ill the most vulnerable [...]

Availability of data and materials: The datasets used during the current study are available from the corresponding author on reasonable request.

Compliance with ethical standards

Competing interests
The authors declare that they have no competing interests.
Ethical approval: Not applicable.
Consent to publish: All the co-authors consent to the publication of this work.
Consent to participate: Not applicable.