How Has the Shared Bike and Subway Ridership Integration in New York City Changed in Response to the Covid-19 Pandemic?

The COVID-19 pandemic has hit the world and has significantly affected all parts of human settlement areas. Passenger journeys on public transportation have dropped significantly. This study looks at the effects of COVID-19 on the change in bike usage-subway ridership integration between 2019 and 2020 in New York City (NYC), USA. To investigate the effect, this study uses various data sources, including bike sharing data from Citi Bike, subway ridership data from the Metropolitan Transportation Authority, Census data from IPUMS, land use data from the Department of City Planning (DCP) and transportation-related data from the U.S. Department of Transportation (DOT). Geographically Weighted Regression was employed to examine the spatiotemporally varying effects of bike-subway integration for casual users and subscribers in the shared bike system. The results show that the pandemic impacted the usage of bike-subway integration spatially and temporally. In 2019, the bike-transit integration impact is largely positive and tends to be stronger when the subway stations are located farther away from CBD areas, while in 2020 the bike-subway integration tends to be insignificant for a large number of stations. The results also confirm that, in 2019, the coefficients for the impact of shared bike usage on subway ridership are larger in magnitude on workdays than on non-workdays. In contrast, the 2020 model shows that the impacts do not differ between workdays and non-workdays. These findings are rarely discussed in earlier studies. This study also used an 800-meter boundary to capture the spatial impact of shared bike usage on subway ridership in NYC. However, which network typologies determine such a spatial boundary of the shared bike impact area has barely been discussed. This will be further discussed in future research.


Introduction
In early 2020, the COVID-19 pandemic spread across the world. In mid-March, New York City declared a state of emergency and introduced measures such as limiting the occupancy of spaces (Mayor 2020). As the number of infections and deaths continued to rise, the State ordered the closure of all non-essential businesses (State 2020). Limiting unnecessary trips inevitably affects the demand for transportation, especially public transportation (Teixeira and Lopes 2020). Since COVID-19 is primarily transmitted through respiratory droplets from coughs and sneezes, physical proximity between uninfected and infected individuals poses a significant risk (health 2020, Teixeira and Lopes 2020). With its crowded and enclosed spaces, public transportation provides an environment for human-to-human transmission of the virus, and the ability of the coronavirus to remain active for hours on surfaces such as plastics and metals can make things worse (van Doremalen, Bushmaker et al. 2020). For these reasons, many governments have implemented social distancing strategies that limit travel. Several reports show that there has been a significant reduction in mobility during the COVID-19 period (De Vos 2020, Borkowski, Jażdżewska-Gutta et al. 2021). Some studies have shown that both rail transportation and intercity buses play an essential role in accelerating the spread of influenza (Cui, Luo et al. 2011, Piso, Albrecht et al. 2011, Zhang 2011). Because asymptomatic passengers cannot be readily detected or prevented from travelling, the public reduces its use of public transportation for fear of illness, a fear that can only disappear after public trust in safety against the virus has improved over a long period (Wang 2014). This increases the usage of other modes of transportation, among which cycling can be a popular choice.
Despite the significant change in human mobility during the pandemic, how bike-transit integration was impacted by COVID-19 remains unclear. It is widely believed that bike-transit integration is one of the important ways to promote a transit city and achieve efficient and sustainable urban transportation systems (Zhao and Li 2017, Ji, Ma et al. 2018, Lin, Zhang et al. 2019, Ma, Zhang et al. 2019, Guo and He 2020, Guo, Yang et al. 2020, Zhao, Lin et al. 2020). Integrated usage aims to encourage transit passengers to use a bike as a transfer mode to and from a transit station (Zhao and Li 2017). Its advantages are mainly reflected in two aspects. On the one hand, cycling is faster than walking. On the other, it is cheaper and more flexible than public transportation (Keijer and Rietveld 2000, Rietveld 2000). For large cities, the "last mile" between the home and the transit station is a major factor affecting residents' usage of a transit system, and bike-transit integration provides a chance to promote both transit and cycling (Martens 2004, Bachand-Marleau, Larsen et al. 2011). This kind of integration has seen substantial growth during the last decade in many countries. Previous studies have analyzed its characteristics in multiple dimensions, such as the effect of shared bike usage on subway ridership, bike service areas, bike parking issues and the accessibility of bike-transit integration (Iacono, Krizek et al. 2010, Bachand-Marleau, Lee et al. 2012, Hochmair 2015, Arbis, Rashidi et al. 2016, Yang and Long 2016, Ji, Fan et al. 2017, Saghapour, Moridpour et al. 2017, Zhao and Li 2017, Ji, Ma et al. 2018). Yet how bike-transit integration changed during the pandemic is less clear.
Several studies have discussed the role of shared bikes during epidemics and how the epidemic impacted bike usage and subway ridership. One study conducted a spatial analysis of the shared bike system by Point-Of-Interest (POI) to examine the effects of COVID-19 on bike usage in Beijing (Chai, Guo et al. 2020). It found that bike usage gradually increased after the pandemic was brought under control. Teixeira and Lopes (Teixeira and Lopes 2020) used Citi Bike data from March 2020 to demonstrate that the use of the shared bike system rebounded more quickly than that of the subway system. The shared bike system was less impacted by the pandemic than the subway in terms of the number of passengers and travel duration (Teixeira and Lopes 2020). Wang and Noland (Wang and Noland 2021) extended the scope of the research by using two-year time-series data, including weather and policy-related variables, to examine how policy mitigates the pandemic's impacts on shared bike usage and subway ridership. However, these studies did not examine how bike-transit integration was altered by COVID-19 across different types of land use variables and different time categories. We build our research questions upon the existing studies.
The aim of this research is to provide empirical evidence on the relationship between land use and bike-transit integration in New York City. To this end, two questions will be addressed: (1) What is the impact of the built environment and land use on bike-transit integration during the pandemic? (2) What is the spatial-temporal variation of those land use effects? To answer the questions in a comprehensive way, multiple quantitative methods are utilized, including a global regression model, a local spatial regression model, and other relevant methods. Based on the literature review, this study contributes to the existing literature in five aspects: (1) investigating the relationship between shared bike usage and subway ridership and what the variation of that relationship implies for land use development and policy; (2) examining how shared bike usage and subway ridership changed between the periods before and after the COVID-19 outbreak; (3) identifying the proper threshold of the catchment area of subway stations that promotes bike usage, particularly in Manhattan; (4) understanding the difference in bike usage between workdays and non-workdays; and (5) looking at station-based shared bikes, as the majority of shared bike programs in the US are of this kind.

Bike-sharing in the aftermath of the pandemic
Mobility in all transportation modes has decreased sharply since the pandemic started (Borkowski, Jażdżewska-Gutta et al. 2021). However, shared bikes have been the most resilient of the transportation modes. In Sweden, while ridership in public transportation decreased by 40%-60% in different regions, bike use did not drop as much as the number of transit travellers did (Jenelius and Cebecauer 2020). Similarly, a study conducted in New York, US demonstrated that the bike sharing programme saw a 71% decrease in ridership, as opposed to a 90% reduction for the public transit system (Teixeira and Lopes 2020). The average travel time of shared bike users increased from 13 minutes to 19 minutes. These two studies also suggest that subway passengers shifted their mode choice from public transit to bike-sharing (Jenelius and Cebecauer 2020, Teixeira and Lopes 2020). In contrast, another study done in Greece suggests that the pandemic did not change the mobility of shared bike users as much as it did in other cities (Nikiforiadis, Ayfantopoulou et al. 2020). A possible explanation is that people regard cycling as a safe way to travel without coming into close contact with other people or crowds. This is supported by research done in Chicago, which reported taxi and ride-hailing, pooled ride-hailing, and public transit as the riskiest transportation modes, and shared bikes, private bikes and shared e-scooters as the least risky (Shamshiripour, Rahimi et al. 2020). This implies that people prefer a transportation mode that restricts social contact, shifting from public transit to the shared bike system. In addition, the pandemic negatively impacted the frequency of bike sharing travel, while trip duration increased in the early stage of the pandemic in three US cities: New York, Boston, and Chicago (Padmanabhan, Penmetsa et al. 2021). This may be because the number of short trips was reduced: people may use another transport mode or simply walk over a short distance. Although people may shift to bike sharing, they may demand more intensive sanitizing measures and better bike infrastructure. In Spain, a study reported that people would be willing to use shared bikes if free masks, gloves, and hydroalcoholic gels were provided and bike lanes were extended (Awad-Núñez, Julio et al. 2021). Interestingly, shared bike usage was only marginal before the pandemic (Awad-Núñez, Julio et al. 2021).
Concerns over the disparity of mobility among low-income groups are also emerging. Several studies demonstrated that the reduction of mobility is associated with income and education level (Brough, Freedman et al. 2021). Low-income groups are generally working-class people who primarily use public transportation. It was also observed that shared bike use increased in the areas served by subway stations during the post-pandemic period, although there was a significant decline in public transit ridership in all boroughs (Pase, Chiariotti et al. 2020). This implies that when travel restrictions are imposed, higher-income groups may reduce travel and do not have to use public transportation, while low-income groups necessarily engage in essential work activities and commute using public transportation. These studies have also emphasized that governments need to create safe biking environments by improving bike infrastructure, because the bike could offer an alternative mode for travellers who are concerned about infection. However, in Chicago, US, the bike sharing programme is more dominant in areas with higher income levels, higher educational attainment, and higher density, particularly around the urban center (Hu, Xiong et al. 2021). In terms of the transit accessibility of bike docks, shared bike usage increased in the vicinity of subway stations before the pandemic and decreased relatively after the pandemic (Hu, Xiong et al. 2021). In particular, stations near the city center have more bike sharing usage and saw relatively larger drops after the pandemic (Hu, Xiong et al. 2021).

Measurement of bike-transit integration
The measurement of bike-transit integration is the foremost issue in exploring the effects of shared bike usage on subway ridership. Prior studies point to two major methodologies for quantifying bike-transit integration: the widely used field questionnaire surveys, and the use of big data. In the studies of Zhao and Li (Zhao and Li 2017), Ji et al. (Ji, Fan et al. 2017) and Yang et al. (Yang, Liu et al. 2016), interviews and questionnaire surveys were used to explore the determinants of bike-sharing as a transfer mode to rail transit. There are several approaches to using data to derive bike-transit integration. Ma et al. (Ma, Ji et al. 2018) used smart card data to mine travel location and time, seeking to accurately identify a transfer trip by setting a proper threshold for transfer time and physical distance. However, this approach only applies to traditional shared bikes that are docked, and it requires travelers to use the same smart card for different modes of transit, so that transfers can be identified by tracing the same card ID using the pre-set threshold. This strict requirement has prevented the method from being widely adopted, since very few cities use a unified smart card system for all their transit modes.
Meanwhile, the sensitivity of individual-level data makes it very difficult for researchers to access smart card data. Other efforts have revealed the feasibility of using location data from dockless/floating shared bikes to infer bike-transit integration (Wu, Lu et al. 2019, Wang, Lu et al. 2020, Chen, Chen et al. 2021). A widespread assumption made in this approach is that shared bike trips ending in close proximity to subway station entrances are integration trips for accessing the subway, while shared bike trips starting in close proximity to subway entrances are made by travelers who have finished a subway journey and are completing their trip to the final destination by shared bike (Wu, Lu et al. 2019, Guo and He 2020, Guo, Yang et al. 2020). This approach suits cities where subway stations are situated at a reasonable distance from each other, but it would not work well for cities with a well-developed transit network like New York (especially in Manhattan, where a shared bike trip could both start and finish near a subway station). It is worth noting that the majority of the studies adopting the close proximity approach have applied it to dockless/floating bike systems, with the exception of the study conducted by Böcker et al. (Böcker, Anderson et al. 2020) on Oslo's docked public bicycle system, where bicycle trips starting or ending within a 200-meter buffer of a transit station are assumed to be bike-transit integration trips. All these empirical studies shed light on our work, which attempts to estimate bike-transit integration in New York using publicly available datasets.
In terms of the catchment area, existing studies have used a quarter-mile as a reasonable walking distance (Ma, Liu et al. 2015, Noland, Smart et al. 2016); meanwhile, the first mile-last mile issue has been well studied in the transportation planning field. This study used a half-mile radius as a reasonable cycling distance for the purpose of analyzing bike-transit integration. It is worth mentioning that since the subway network in Manhattan is of high density, the actual service area of a typical station in Manhattan is smaller than the area that a half-mile radius buffer covers. The total usage of shared bikes within each subway station's service area is regarded as the bike-subway integration ridership. Although in reality the purposes of shared bike trips are diverse, we think the bike-transit integration impact can be reflected by studying the impact of shared bike usage on the ridership of the subway stations they serve.

Determinants of bike-transit integration
Many studies have been dedicated to the factors influencing bike-transit integration. The determinants can be roughly divided into four categories: socioeconomic demographics, travel-related factors, the built environment, and other latent factors. Socioeconomic demographics such as user age, household income, employment density and car ownership have been shown to affect bike-transit integration (Ji, Ma et al. 2018, Zhao, Lin et al. 2020). In the study of Ji et al. (Ji, Fan et al. 2017), female, older, and low-income transit passengers are less likely to use shared bikes to access public transportation stations. Similarly, Zhao and Li (Zhao and Li 2017) found that young people and low-income earners are more likely to use buses than to cycle. According to Martin and Shaheen (Martin and Shaheen 2014), shared bike usage is more closely linked to rail transit in suburbs with lower population density. Ma et al. (Ma, Liu et al. 2015) also found that areas with dense job distribution contribute to a higher rate of bike-transit trips. As to travel-related factors, most studies took travel distance into consideration. It is one of the most important determinants of the usage of shared bikes to access rail transit (Chen, Pel et al. 2012, de Souza, Puello et al. 2017, Zhao and Li 2017). The analyses by Ji et al. (Ji, Ma et al. 2018) and Guo and He (Guo and He 2020) indicated that the demand for subway-bike transfer is negatively correlated with riding distance. Additionally, travel time and travel cost are found to significantly affect bike-subway integration (Wang, Feng et al. 2014).
How built environment factors, such as bike capacity, transportation infrastructure, and land use, influence integrated usage has also attracted extensive attention from researchers. Bike capacity and infrastructure directly affect cycling rates (Wu, Chung et al. 2021, Wu, Kim et al. 2021). The number of bike docks and lanes around transit stations is found to be crucial in increasing demand for integrated usage (El-Assi, Mahmoud et al. 2017, Zhao and Li 2017). Also, the density of bike stations and subway stations has been shown to affect the demand for shared bikes positively (Ji, Ma et al. 2018, Guo, Yang et al. 2020). Moreover, better cycling services, such as exclusive and safe parking sites designed for bicycles, are identified as a stimulus to choose a bike to access transit stations, as suggested by Puello and Geurs (Puello and Geurs 2015). As for land use, its characteristics are usually studied through land use types and mixed land use. Land use types are highly associated with human activities; thus, they not only determine the distribution of origins and destinations, but also affect residents' willingness to cycle (Guo and He 2020, Zhao, Lin et al. 2020). Existing evidence shows that the impact on demand for bike-transit integration differs across land use types. Generally, commercial and residential patterns are associated with more integrated usage. On the one hand, these land use types contribute to the increasing volume of bike-sharing stations, encouraging shared bike flows (Mateo-Babiano, Bean et al. 2016, Zhao, Lin et al. 2020). On the other hand, they are related to commuting trips, making the shared bike an ideal mode choice during peak times (Guo and He 2020). In addition, mixed land use is seen as an important factor that increases the rate of cycling as a feeder mode to rail transit (Zhao and Li 2017, Guo and He 2020, Guo, Yang et al. 2020, Zhao, Lin et al. 2020). These efforts provide a reference for the selection of built environment variables (see Table 1).
By identifying the key factors from existing studies, some practical guidance for planning has been suggested. On the one hand, transit-oriented urban planning could increase the frequency with which commuters use shared bikes, as proposed by Liu et al., who found that long-time commuters prefer to use shared bikes to access subway services (Liu, Ji et al. 2020). On the other hand, the subway-bike share service should be enhanced (Ji, Ma et al. 2018). In the case of Shenzhen, improving bicycle infrastructure in accordance with local conditions is emphasized. For example, there should be more dedicated lanes for cyclists in certain areas, depending on differences in demand (Zhao, Lin et al. 2020). Moreover, even though some areas are saturated with shared bikes, passengers are still unable to use them freely in many cases, so it is necessary to frequently rebalance or adjust bikes according to real-time demand, particularly during peak times (Ji, Ma et al. 2018, Zhao, Yang et al. 2019, Liu, Ji et al. 2020). Guo et al. also suggested not placing too many bikes in the crowded areas of stations, to alleviate the problem of disorderly parking (Guo, Yang et al. 2020).

[Table 1, surviving fragment. Category: Demographic and socioeconomic. Entries: proportion of male; population density.]

Study Area
The study area of this research is New York City (NYC), which has a total area of 1,214 square kilometers and more than 8.5 million residents, making it the largest city in the United States. It consists of five boroughs (Manhattan, Queens, Brooklyn, The Bronx and Staten Island). The rail transit system in NYC is one of the most developed in the world and is composed of the New York City Subway (NYCS), the Port Authority Trans-Hudson (PATH) and the Staten Island Railway (SIR). The system operates 24 hours a day and is the most heavily used form of public transportation in the US. Alongside the developed rail transit system, a bike-sharing system has appeared and rapidly advanced. Citi Bike, NYC's largest bike-sharing system, began operating in 2013 and has been expanding every year. Currently, with over 1000 stations and 14,500 bikes, Citi Bike has more than 150,000 registered members (subscribers) who make a total of more than one million trips a month on average (Bike 2020, Teixeira and Lopes 2020). Figure 1 shows the spatial distribution of Citi Bike stations and subway stations in NYC.

Data source
Multi-source publicly available data are employed to examine how shared bike usage, as well as built environment and socioeconomic factors, affects subway ridership before and after the COVID-19 outbreak. Table 2 below lists all the datasets used in this study along with the data providers and sources, followed by summary explanations of each dataset.

Subway ridership data
Subway ridership data is obtained from a weekly-updated dataset published by the Metropolitan Transportation Authority (MTA) of the New York City Metropolitan Area (Authority 2020).
The MTA publishes one dataset every week containing roughly 200,000 records, each representing a reading on one specific turnstile in a subway station at a specific point in time during the week. Turnstiles are fixed mechanical gates used in subway stations in NYC to control and record station entry and exit. Readings are typically taken every four hours, resulting in six readings per day per turnstile. Total ridership for a particular time period can be calculated as the difference between the two readings taken at the start and end of that period.
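The differencing step described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: it assumes each reading is a cumulative counter value, and the record layout (turnstile ID, ISO-style timestamp string, count) and the function name are hypothetical. Each interval's ridership is attributed to the day on which the interval starts.

```python
from collections import defaultdict


def entries_per_day(readings):
    """Turn cumulative turnstile readings into daily ridership.

    readings: iterable of (turnstile_id, "YYYY-MM-DD HH:MM", cumulative_count).
    Returns {(turnstile_id, "YYYY-MM-DD"): riders}. The tuple layout is an
    assumption made for illustration, not the MTA file format.
    """
    by_turnstile = defaultdict(list)
    for turnstile_id, timestamp, count in readings:
        by_turnstile[turnstile_id].append((timestamp, count))

    daily = defaultdict(int)
    for turnstile_id, series in by_turnstile.items():
        series.sort()  # ISO-style timestamps sort chronologically as strings
        # Ridership in an interval = later reading minus earlier reading.
        for (t0, c0), (_t1, c1) in zip(series, series[1:]):
            daily[(turnstile_id, t0[:10])] += c1 - c0
    return dict(daily)


sample = [
    ("A", "2019-01-01 00:00", 100),
    ("A", "2019-01-01 04:00", 110),
    ("A", "2019-01-01 08:00", 150),
    ("A", "2019-01-02 00:00", 400),
    ("A", "2019-01-02 04:00", 420),
]
daily_totals = entries_per_day(sample)
```

In this toy input, the three intervals starting on 1 January contribute 10, 40 and 250 riders, and the single interval starting on 2 January contributes 20.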
For the purpose of this study, turnstile readings data for the entire Years 2019 and 2020 were collected, with 10,878,394 observations in 2019 and 11,030,671 in 2020.

Shared bike trip data
Shared bike trip data is obtained from the operator of New York City's bike-share program, Citi Bike. Citi Bike publishes bike trip data (CitiBike) on a monthly basis, containing records of every individual bicycle trip. Each record consists of the start and end station information, the start and end time, the trip duration, and the type (subscriber/casual user), gender, and year of birth of the bike user. Shared bike trip data for the whole of 2019 and 2020 was collected from the operator for the study. The average duration of bicycle trips, according to the raw datasets, is approximately 16 minutes in 2019 and 22 minutes in 2020. Close examination of the raw data revealed bicycle trips lasting longer than 24 hours, which were deemed abnormal (due to human error) and subsequently deleted. This data cleaning resulted in 20,545,579 valid trips recorded in 2019 and 19,497,779 in 2020, with an average trip duration of 14 minutes in 2019 and 18 minutes in 2020. Despite the COVID-19 pandemic, 2020 did not witness a significant decrease in Citi Bike usage.
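The cleaning rule above (dropping trips longer than 24 hours, then recomputing the average duration) can be illustrated with a short sketch. The dictionary key `duration_s` and the function names are hypothetical and do not reflect the actual Citi Bike column names.

```python
MAX_TRIP_SECONDS = 24 * 60 * 60  # trips longer than 24 hours are treated as abnormal


def clean_trips(trips):
    """Keep only trips whose duration does not exceed 24 hours."""
    return [t for t in trips if t["duration_s"] <= MAX_TRIP_SECONDS]


def mean_duration_minutes(trips):
    """Average trip duration in minutes for a non-empty list of trips."""
    return sum(t["duration_s"] for t in trips) / len(trips) / 60


raw = [{"duration_s": 600}, {"duration_s": 1200}, {"duration_s": 90_000}]
valid = clean_trips(raw)
average_minutes = mean_duration_minutes(valid)
```

A single very long trip inflates the raw mean considerably, which is consistent with the drop in average duration reported after cleaning.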

Shared bike station infrastructure data
The operator of Citi Bike provides up-to-date data for all of its operational stations via an information feed (CitiBike 2021). As of May 2021, there are a total of 1465 recorded shared bike stations in the entire Citi Bike system. The aggregated shared bike usage data shows 882 and 1199 active stations (bike stations with at least one borrowing or returning activity in the year) in 2019 and 2020, respectively. This increase in active bike stations can be explained by the expansion plan (CitiBike 2019) published by the operator in 2019, with a significant number of bike stations added in 2020, particularly in Upper Manhattan and The Bronx (Figure 2).
Figure 2 Citi Bike Stations
For the purpose of this study, the capacity of each Citi Bike station (number of bicycle docks) is extracted from the Citi Bike station information feed.

Census data
This study views socioeconomic factors as explanatory factors of subway ridership. Data for these factors are taken from the 2010 US Census and the 2010 American Community Survey (Manson, Schroeder et al. 2020). The 2010 data was used instead of the latest because it is more comprehensive for the purpose of this study, and the complete results of the 2020 US Census had not yet been published when this study was finished. Among all the factors the census data covers, this study is particularly interested in the total and employed population, the median age, the household income, and the car ownership of the areas surrounding each subway station.

Multi-source built environment-related data
This study employed extensive land use and geographic data (Primary Land Use Tax Lot Output, known as PLUTO™) published by the Department of City Planning (DCP) of NYC ((NYC) 2020). This dataset identifies 11 land use categories and matches each building to the most appropriate one. This study is particularly interested in the residential, commercial, mixed (residential and commercial), public facilities and open space types of land use.
This study also collected transportation infrastructure data, including NYC's bicycle lanes data from the Department of Transportation (DOT) ((NYC) 2021), main road data from the Department of Information Technology & Telecommunications (DoITT) ((DoITT)) and bus stop shelters data from DOT ((NYC)).

Variables
The main purpose of this study is to analyze the impact of shared bike usage on subway ridership. Our literature review identified some of the key factors affecting bike-transit integration ridership, including demographic, socioeconomic, built environment, and transportation-related factors. Of these potential factors, we are particularly interested in studying how built environment (land use and transportation infrastructure) and socioeconomic factors influence transit ridership before and after the COVID-19 outbreak.
The dependent variable for this study is the ridership of NYC Subway stations. To derive this variable, the raw turnstile data is first aggregated at the turnstile (spatial) and day (temporal) level. Close examination of the resulting data revealed erroneous observations in which the aggregated daily traffic for a specific turnstile is extremely large (in the millions); these were deemed invalid. Such observations were eliminated from the study because the faults in the aggregated data result from unexplainable faults in the raw published data and thus cannot be fixed. The threshold for deletion is set to 20,000 per turnstile per day, as the value falls under this number for most turnstiles (see Figure 3). The cleaned data is then divided into workdays and non-workdays (including weekends and federal holidays), and both subsets are further aggregated at the station complex (spatial) and year (temporal) level. Because each subway station complex has many entrances with multiple turnstiles, eliminating records from one turnstile has a limited effect on the total traffic of an entire station complex over an entire calendar year. Therefore, the second aggregation reduces the effect of the unavoidable deletion of certain erroneous observations. The MTA publishes an official station grouping system (Authority) in which most close-by subway stations are merged into station complexes, resulting in 430 station complexes in total. Aggregating station ridership at the station complex level mitigates the effect of close-by stations sharing the same or similar attributes. The final aggregated data gives the yearly ridership of each NYC Subway station complex on workdays and non-workdays.
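The filtering and two-stage aggregation described above can be sketched as below. This is an illustrative sketch under stated assumptions: the input layout, the `turnstile_to_complex` mapping and the `holidays` argument are hypothetical, an observation is dropped when it exceeds the 20,000 threshold (the text does not say whether the bound is inclusive), and negative daily counts are also dropped, which is an extra assumption not stated in the text.

```python
from collections import defaultdict
from datetime import date

DAILY_THRESHOLD = 20_000  # per turnstile per day, as stated in the text


def yearly_complex_ridership(daily_counts, turnstile_to_complex, holidays=frozenset()):
    """Aggregate cleaned daily turnstile counts to station complex, year and day type.

    daily_counts: {(turnstile_id, datetime.date): riders}
    Returns {(complex_id, year, "workday" | "non-workday"): riders}.
    """
    totals = defaultdict(int)
    for (turnstile_id, day), riders in daily_counts.items():
        # Drop implausible observations (assumption: negatives are also invalid).
        if riders < 0 or riders > DAILY_THRESHOLD:
            continue
        # Workdays are Monday to Friday, excluding the supplied holidays.
        is_workday = day.weekday() < 5 and day not in holidays
        day_type = "workday" if is_workday else "non-workday"
        totals[(turnstile_to_complex[turnstile_id], day.year, day_type)] += riders
    return dict(totals)


counts = {
    ("T1", date(2019, 1, 1)): 200,        # Tuesday, but a federal holiday
    ("T1", date(2019, 1, 2)): 1_000,      # Wednesday
    ("T2", date(2019, 1, 2)): 3_000_000,  # erroneous reading, dropped
    ("T2", date(2019, 1, 5)): 500,        # Saturday
}
complexes = {"T1": "C1", "T2": "C1"}
ridership = yearly_complex_ridership(counts, complexes, holidays={date(2019, 1, 1)})
```

Because the erroneous record is removed before the station-complex sum, the other turnstiles of the same complex still contribute to its yearly total.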
The key independent variable of interest is bike-transit integration ridership. Our literature review summarizes two major approaches to quantifying bike-transit integration. This study only makes use of publicly available datasets, and no publicly available smart card data for the New York City Subway system is suitable for measuring bike-transit integration. Moreover, the New York City Subway is a very dense network, with the closest distance between two subway stations being 92 meters, which makes it very difficult to measure bike-transit integration using the close proximity approach.
This study took a different approach by calculating the usage of shared bikes in a defined catchment area of each subway station. The idea of a catchment area is based on the assumption that shared bikes act as a 'feeder' for the subway and help complete passenger journeys that the subway cannot serve alone. The catchment area for each subway station is calculated based on Thiessen polygons truncated to 800 meters (half a mile). Figure 4 shows the catchment area for each subway station complex. The total usage of shared bikes within each service area of a subway station is regarded as the bike-subway integration ridership. Although in reality the purposes of shared bike trips are diverse, we think the bike-transit integration impact can be reflected by studying the impact of shared bike usage on the ridership of the subway stations they serve.
To calculate the usage of shared bikes in the catchment area of each subway station, the two raw shared bike trip record datasets (2019 and 2020) were aggregated. The two datasets were first divided according to the type of shared bike user (subscriber/casual user). Studies have shown that casual users are more likely to use shared bikes for leisure, while subscribers are more likely to use them for work trips (Demaio 2009, Fishman, Washington et al. 2013, Fishman 2016). To coincide with the subway ridership data, the dataset is further divided according to the type of day (workday/non-workday) on which trips are taken. The four sets of shared bike usage data resulting from the two divisions are then aggregated at the bike station level. With this in hand, the ridership of all shared bike stations within the catchment area of each subway station complex is summed up, which serves as the key independent variable indicating shared bike usage in the catchment areas of subway stations. Table 3 shows the summary statistics of subway ridership and shared bike usage.
Other independent variables have also been taken into account and calculated from various sources. Socioeconomic factors of the census block group where each subway station is located are extracted or calculated from US Census data (Manson, Schroeder et al. 2020); these include the total population, the median age, the median household income, the employment density, and the median number of vehicles owned by each household. Table 5 below shows summary statistics of the socioeconomic variables.
This study measures built environment attributes along two dimensions: land use types and transportation infrastructure. To quantify the built environment factors surrounding each subway station complex, the proportion of each selected type of land use within the catchment areas of subway station complexes is calculated from the PLUTO data. Subsequently, the Entropy Index, which measures land use mix, can be derived using the following equation:
$$\text{Entropy Index} = -\frac{\sum_{j=1}^{k} p_j \ln(p_j)}{\ln(k)}$$
where $p_j$ is the proportion of each land use category $j$ in the census block group, and $k$ denotes the number of land use categories (Zagorskas 2016). The index can take a value from 0 to 1; a higher Entropy Index implies a higher level of mixed land use.
For transportation infrastructure-related factors, this study calculates the length of main vehicle roads and bicycle lanes and the number of bus stops within each subway station catchment area. The proximity between subway stations is also calculated as the distance between one subway station and its nearest neighboring station using Manhattan distance (the sum of vertical and horizontal distances). This study is also interested in how the capacity of shared bikes (the number of docks in a subway station catchment area) impacts subway ridership.
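The catchment aggregation and the Entropy Index described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow used in the study: it assumes station coordinates are already in a projected system measured in meters, it relies on the fact that a point lies in a station's truncated Thiessen polygon exactly when that station is its nearest one and lies within the truncation radius, and the function names `assign_to_catchment`, `catchment_usage` and `entropy_index` are hypothetical.

```python
import math
from collections import defaultdict


def assign_to_catchment(bike_xy, subway_xy, radius_m=800.0):
    """Return the index of the nearest subway station, or None if it is
    farther than radius_m (Thiessen polygon truncated at radius_m)."""
    best_idx, best_dist = None, float("inf")
    for idx, (sx, sy) in enumerate(subway_xy):
        dist = math.hypot(bike_xy[0] - sx, bike_xy[1] - sy)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx if best_dist <= radius_m else None


def catchment_usage(bike_stations, subway_xy, radius_m=800.0):
    """Sum bike trips per subway catchment.

    bike_stations: iterable of ((x, y), trips) pairs (assumed layout).
    """
    totals = defaultdict(int)
    for xy, trips in bike_stations:
        idx = assign_to_catchment(xy, subway_xy, radius_m)
        if idx is not None:
            totals[idx] += trips
    return dict(totals)


def entropy_index(proportions):
    """Normalized land use entropy in [0, 1] for k land use categories."""
    k = len(proportions)
    if k < 2:
        return 0.0
    # Categories with zero share contribute nothing (p * ln p -> 0).
    h = sum(-p * math.log(p) for p in proportions if p > 0)
    return h / math.log(k)
```

Running the aggregation separately for each user type and day type would give the four usage variables per subway station complex that the text describes.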
Below are summary statistics of the rest of the independent variables.

Model
This study seeks to understand how shared bike usage, along with built environment and socioeconomic factors, affects the ridership of stations on the NYC Subway system. A Geographically Weighted Regression (GWR) model is adopted, and a global regression model is applied for comparison. The local regression model (GWR) is employed to tackle the issue of spatial non-stationarity: because the independent variables are heterogeneous across the study space, they are very likely to have distinct and localized effects on subway ridership. A global regression model does not consider the variation of independent variables across space.

Global Regression
Global models assume that the association between the dependent variable and the independent variables is the same for all observations. A global regression model using the Ordinary Least Squares (OLS) method is defined as:
$$y_i = \beta_0 + \sum_{m=1}^{p} \beta_m x_{im} + \varepsilon_i$$
where $y_i$ represents the total ridership of subway station complexes on all workdays or non-workdays in 2019 or 2020, $\beta_0$ is the constant, $\beta_m$ is the coefficient of the $m$-th independent variable, $x_{im}$ is the $m$-th independent variable at sample location $i$, $p$ is the number of independent variables, and $\varepsilon_i$ is the random error term at sample location $i$.

Geographically Weighted Regression (GWR)
A GWR model is implemented to investigate the spatially varying association between the dependent variable (subway ridership) and the independent variables. Each independent variable has a local coefficient at each sample point (subway station complex). A GWR model can be defined as (Fotheringham, Charlton et al. 1998):
$$y_i = \beta_0(u_i, v_i) + \sum_{m=1}^{p} \beta_m(u_i, v_i)\, x_{im} + \varepsilon_i$$
where $y_i$ represents the total ridership of subway station complexes on all workdays or non-workdays in 2019 or 2020, $\beta_0(u_i, v_i)$ is the constant at sample location $i$, $(u_i, v_i)$ represents the coordinates of location $i$, $\beta_m(u_i, v_i)$ is the coefficient of the $m$-th independent variable at location $i$, $x_{im}$ is the $m$-th independent variable at sample location $i$, $p$ is the number of independent variables, and $\varepsilon_i$ is the random error term at sample location $i$.
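When estimating regression coefficients at sample point $i$, a weight matrix $W_i$ needs to be computed:
$$W_i = \mathrm{diag}(w_{i1}, w_{i2}, \ldots, w_{in})$$
where $w_{ij}$ is the weight assigned to sample point $j$ based on its distance to sample point $i$, and $n$ is the number of sample points.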
Tobler's First Law of Geography indicates that data points closer to the studied sample point  could have a stronger influence on its coefficient estimation than sample points farther away.
The weight matrix applied to the parameter estimation controls this distance effect. This study adopted the adaptive bi-square kernel function (Fotheringham, Brunsdon et al. 2003) to calculate the weight matrix for each sample point, which can be defined as:
$$w_{ij} = \begin{cases} \left[1 - \left(d_{ij}/b_i\right)^2\right]^2, & d_{ij} < b_i \\ 0, & \text{otherwise} \end{cases}$$
where $d_{ij}$ is the distance between the studied point $i$ and another point $j$, and $b_i$ is an adaptive bandwidth defined as the distance from $i$ to its $N$-th nearest neighbor. $N$ is a bandwidth that needs to be selected appropriately. In this study, the Akaike Information Criterion (AIC) is selected for bandwidth optimization because it can trade off the local degrees of freedom and the goodness of fit (Fotheringham, Brunsdon et al. 2003).
With the weight matrix in hand, the regression coefficients can be estimated as follows (Fotheringham, Brunsdon et al. 2003):
$$\hat{\beta}_i = \left(X^{T} W_i X\right)^{-1} X^{T} W_i Y$$
where $\hat{\beta}_i$ is the vector of estimated coefficients at sample point $i$, $X$ is the matrix of independent variables, $W_i$ is the weight matrix and $Y$ is the column vector of the dependent variable.
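To make the estimation procedure concrete, the adaptive bi-square weights and the locally weighted least-squares estimator can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions and not the MGWR implementation used in the study: the function names are hypothetical, the bandwidth `n_neighbors` is passed in directly rather than selected by AIC, coordinates are assumed to be projected, and an intercept column is added to the design matrix.

```python
import numpy as np


def adaptive_bisquare_weights(coords, i, n_neighbors):
    """Bi-square weights for point i; the bandwidth b_i is the distance
    from i to its n_neighbors-th nearest neighbor."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    # Index 0 of the sorted distances is the point itself (distance 0).
    b_i = np.sort(d)[n_neighbors]
    return np.where(d < b_i, (1.0 - (d / b_i) ** 2) ** 2, 0.0)


def gwr_fit(coords, X, y, n_neighbors):
    """Return an (n, p + 1) array of local coefficients, intercept first."""
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        w = adaptive_bisquare_weights(coords, i, n_neighbors)
        xtw = design.T * w  # same as X^T W_i with W_i = diag(w)
        # Solve (X^T W_i X) beta_i = X^T W_i y for the local coefficients.
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas
```

Each row of the returned array corresponds to one sample point, which is what allows the coefficients to be mapped station by station. The bandwidth must leave enough positively weighted neighbors for the local system to be solvable.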

Global regression (OLS)
A global linear regression model is built using the OLS function in the Python statsmodels package to explore the association between subway ridership and shared bike usage, built environment, and socioeconomic factors on a global scale. Before the model was constructed, a correlation test using Pearson's correlation coefficient was conducted to check whether the independent variables correlate with each other. The test result (see Figure 6) shows that total population and employed population (0.89), population density and employment density (0.87), and shared bike ridership and shared bike capacity (0.72) are highly correlated, with coefficients higher than the 0.6 threshold that signifies a strong correlation (Dixon and Massey 1957). Therefore, employed population, population density and shared bike capacity were excluded from the regression model. Because shared bike capacity is of particular interest in this study, a new variable named shared bike intensity is calculated using the equation below:
$$\text{Shared bike intensity} = \frac{\text{Shared bike usage}}{\text{Shared bike capacity}}$$
It denotes the mean number of times each bike (dock) is used per year in each catchment area; a higher intensity value could indicate higher popularity of shared bike use or a more significant shortage of shared bikes. The resulting variable is not strongly correlated with any other independent variable in the model. The variance inflation factor (VIF) values shown in Table 6 suggest that no multicollinearity problem exists among the independent variables (Cardozo, García-Palomares et al. 2012).
The result indicates that shared bike use has a positive impact on subway ridership overall. This result is consistent with the findings of existing studies on the impact of shared bikes on subway ridership attraction (Ashraf, Hossen et al., Ma, Liu et al. 2015). In 2019, this positive impact was stronger on workdays ($\beta$ = 0.202 and 0.196) than on non-workdays ($\beta$ = 0.177 and 0.167). This finding provides evidence of the significant role of shared bikes as a feeder mode for subway commutes in the NYC public transportation system. However, in 2020 the difference in this impact between workdays ($\beta$ = 0.144 and 0.156) and non-workdays ($\beta$ = 0.148 and 0.154) is minor. This indicates that the COVID-19 pandemic and the subsequent shift to working from home resulted in a significant change in people's travel behavior.
Of all the socioeconomic factors, total population size always has a negative impact on subway ridership. This non-intuitive finding could arise because, despite being the major economic and employment center of NYC and having the largest number of subway stations and the busiest stations, Manhattan is only the third most populous borough (in terms of the residential population recorded by census data (Manson, Schroeder et al. 2020)) in NYC after Brooklyn and Queens, which jointly account for over half of NYC's population. The median number of vehicles owned per household also has a negative impact on subway ridership in all models, indicating that higher vehicle ownership decreases residents' incentive to use public transportation. Existing studies (Ashraf, Hossen et al.) found that the number of no-vehicle households positively affects subway ridership. The finding of this study provides new evidence for this. Moreover, the negative impact of higher vehicle ownership is also observed to be stronger on non-workdays. One possible reason is that people tend to favor public transportation on workdays due to road congestion and are more likely to use private vehicles on non-workdays for leisure. The global regression models also show that employment density has a statistically significant positive impact on subway ridership in 2019, which is consistent with existing studies conducted in different parts of the world (Gutiérrez, Cardozo et al. 2011, Zhao, Deng et al. 2013, Jun, Choi et al. 2015). However, in 2020, no statistically significant association between employment density and subway ridership is observed, which provides further evidence of the pandemic-induced decrease in commuting trips.
Of all the land use factors, commercial & office space has the strongest impact ($\beta$ > 0.4) on subway ridership, and this positive impact is significant on both types of days and in both years. Moreover, it is also the only land use type whose ability to attract subway users is stronger on workdays than on non-workdays. This indicates that commercial & office space tends to attract commuting subway riders. Mixed land use (commercial & residential) also shows a statistically significant, strong positive impact on subway ridership ($\beta$ > 0.1), second only to commercial land use. These findings are consistent with existing literature studying built environment effects on subway ridership and public transportation use in general (Loo, Chen et al. 2010, Zhao, Deng et al. 2013, Shi, Zhang et al. 2018, An, Tong et al. 2019). Land use mixture level (Entropy Index) has a positive impact on subway ridership, which is consistent with empirical studies (Jun, Choi et al. 2015, Guo and He 2020). Moreover, this impact is statistically more significant and much stronger on non-workdays than on workdays. Residential land use only has a statistically significant impact on subway ridership on non-workdays. No statistically significant effect on subway ridership is found for open space and public facility types of land use. This finding differs from some existing literature in which park/public space is found to positively affect subway ridership (Guo and He 2020). This may suggest that most subway users do not use bikes for leisure or recreational purposes.
Of all the transportation infrastructure-related factors, the length of bicycle lanes, the distance to the nearest subway station and shared bike intensity have no statistically significant impact on subway ridership. These findings differ from empirical studies conducted in other cities (Guo and He 2020). The length of main roads and the number of bus stops both show a positive impact on subway ridership, which is consistent with findings in existing studies (Kuby, Barranda et al. 2004, Sohn and Shim 2010, Guo and He 2020). This finding indicates that subway station complexes often serve as virtual transportation hubs that promote different modes of public transportation.
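The variable screening applied before fitting the global model (pairwise Pearson correlation against a 0.6 threshold, followed by the usage-to-capacity ratio) can be illustrated with a short standard-library sketch. The helper names and the column-dictionary layout are assumptions made for illustration; the study itself reports only the resulting coefficients and does not publish its code.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def highly_correlated_pairs(columns, threshold=0.6):
    """List (name_1, name_2, r) for every pair with |r| above threshold.

    columns: dict mapping variable name -> list of values (assumed layout).
    """
    pairs = []
    for name_1, name_2 in combinations(columns, 2):
        r = pearson(columns[name_1], columns[name_2])
        if abs(r) > threshold:
            pairs.append((name_1, name_2, r))
    return pairs


def shared_bike_intensity(usage, capacity):
    """Mean number of uses per dock in a catchment area."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return usage / capacity
```

One variable from each flagged pair would then be dropped, or replaced by a derived ratio such as the intensity, before the regression is fitted.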

GWR
This section first describes why GWR is a more suitable model for examining the relationship between subway ridership and shared bike usage, socioeconomic and built environment attributes. The results of the GWR model are then presented, followed by the visualization and detailed interpretation of the spatially varying coefficient estimates.
Although the global regression model provided plenty of insight into the effects of shared bike usage, built environment, and socioeconomic factors on subway ridership, evidence suggests that it is not the most suitable method for this purpose. Table 4 shows that the Jarque-Bera (J-B) statistic is statistically significant on workdays in both 2019 and 2020, suggesting that the residuals of the global model do not conform to a normal distribution. Table 3 shows the result of the Global Moran's I residuals test for the global model, indicating that the residuals of the global regression models are spatially autocorrelated.
Spatial regression models like GWR tackle the issues of spatial autocorrelation and non-stationarity arising from the global model (Fotheringham, Brunsdon et al. 2003). Instead of producing a single average global coefficient for each variable in the model, GWR allows the relationship between the dependent variable and the independent variables to vary across space. By applying a weight matrix, it takes into account the spatial dependency effects of nearby subway stations, following Tobler's First Law of Geography, which states that near things are more related than distant things.
To compare the results of the global and local models, the same set of independent variables from the OLS model was also used in the GWR models. This study used the MGWR 2.2.1 software (Oshan, Li et al. 2019) to build the GWR models.
Because GWR results do not vary much between workdays and non-workdays or between casual bikers and subscribers, results for workdays and subscribers are selected for illustration. Summary statistics (Table 7) of the local beta coefficients of the independent variables show estimated coefficients ranging from negative to positive values, implying diverse local effects of shared bike usage, built environment and socioeconomic factors on subway ridership. The GWR results are visualized using choropleth maps showing negative and positive estimated local beta coefficients, using an appropriate color scheme for optimal readability. Model diagnostic information shows that for all eight GWR models, the AIC/AICc is less than that of the OLS model and the adjusted R-squared value of the GWR models is greater than that of the corresponding global models (Table 8). This clearly indicates a better goodness of fit.
Shared bike usage is found to have a positive impact on subway ridership in all the global regression models. However, the GWR models produced spatial variations in the parameter estimates. The spatial distribution of the coefficients of the GWR model is shown in Figure 6. For 2019, the impact that shared bike usage has on subway ridership is uniformly positive and tends to become stronger as stations are located farther away from CBDs towards suburban areas. Moreover, this impact is non-significant in Midtown Manhattan, which is traditionally viewed as the largest CBD of NYC. This might be because both the density of subway stations and the ridership in this area are very high, meaning that the subway stations in this area are less reliant on shared bikes to attract riders; instead, business activities in the area naturally attract people (land use factors), and the high density of subway stations means the first/last mile issue does not exist. On the other hand, subway stations in the less densely populated suburban areas could potentially gain a greater benefit from shared bike use nearby.
The GWR result for the shared bike usage variable in 2020 tells a completely different story from 2019. The impact of shared bike use on subway ridership becomes insignificant for a large number of stations across the study space, and both positive and negative impacts exist. Subway stations in Lower Manhattan show the strongest impact of shared bike usage on their ridership, while a negative impact of shared bike use on subway ridership is observed in the borough of Queens. The retained positive impact of shared bike usage in certain parts of Manhattan could be explained as business as usual during the global pandemic, owing to the city being a global center for finance. The negative impact shared bikes have on subway ridership could be strong evidence of shared bikes 'stealing' users from the subway during the pandemic. It is possible that residents in Queens view shared bikes as a safer way to travel during the global pandemic.
Total population size is the only one that has a significant local effect on subway stations. Total population size is found to have an overall negative effect on subway ridership in the global regression model. The hypothesis is that this is due to the imbalance between the local residential population and subway ridership. The GWR result provides further evidence for this hypothesis. The association between total population size and subway ridership is not statistically significant for subway stations in the core parts of the most populated boroughs of NYC, namely Brooklyn and Queens. Lower and Midtown Manhattan see a small negative effect of total population size on subway ridership. This may be because having a larger share of the working population living locally could in turn reduce the need to travel on the subway, which has been raised in previous studies (Guo and He 2020). The strongest negative impacts of total population size are observed in suburban areas, which are also the areas where subway station ridership is most negatively affected by higher car ownership. The farther away from urban centers people live, the more likely they are to own cars, which has a negative impact on subway use. In 2020, some subway stations whose ridership had a statistically significant association with total population size in 2019 no longer showed this effect, which may be a result of the pandemic reducing the number of trips that need to be taken.
The percentage of commercial & office space has the strongest positive impact on subway ridership ($\beta$ > 0.4) among all factors in the global regression model. The GWR model reveals the spatial variation of this impact across NYC. The positive impact that commercial and office space has on subway ridership is universally significant and is stronger in suburban areas than in CBDs. This could be because subway stations in suburban areas rely more heavily on commercial activities to attract ridership, compared with CBD areas, which are natural ridership magnets. One significant difference in the impact of commercial and office space on subway ridership between the two study years is that the range of beta coefficients in 2020 ($\beta \in [0.45, 0.79]$) differs markedly from that of 2019 ($\beta \in [0.26, 1.57]$). This difference is another reflection of the strong impact that a global pandemic like COVID-19 has on travel behavior. Reduced travel demand resulting from the working-from-home trend means that commercial activities play a more significant role in the trip attraction of the less popular subway stations.
Several factors related to transportation infrastructure have a significant impact on subway ridership in the global regression. The GWR model allows this to be evaluated on the local scale.
The number of bus stops in the subway station catchment area shows the strongest positive impact on the ridership of subway stations located in the CBD areas of Manhattan and the southernmost part of Brooklyn in 2019. This could be due to a high prevalence of bus-subway interchange in these areas. However, in 2020, although the positive impact of bus services on subway ridership in Manhattan is largely retained, a considerable number of subway stations in Downtown Brooklyn (CBD) and Queens are no longer significantly impacted by bus stops. This could be due to the pandemic discouraging intra-city travel, especially multimodal travel (e.g. bike-subway or bus-subway integration), as people sought to minimize the chance of contracting the virus.
The length of main roads in the station catchment area is another factor positively impacting subway ridership in the global regression. The GWR result shows that in both 2019 and 2020, the farther away a subway station is from CBD areas, the stronger the positive impact it receives from the length of main roads in its catchment area. In 2019, subway stations in Lower Manhattan receive a small positive impact from the length of main roads, while in 2020 the effect became largely insignificant in CBD areas in Manhattan and Brooklyn, and only a small positive impact remains in a limited number of stations located on the city's outskirts. The length of main roads could be an indicator of road traffic volume. The weakening of the impact of road traffic on subway ridership could be a result of the pandemic-induced reduction in human mobility.
The length of bicycle lanes and the distance to the nearest station are found to have no significant impact on subway ridership overall. Shared bike intensity is found to have no significant impact on subway station ridership in 2019. However, in 2020, subway stations in the southern part of the Bronx are found to have been positively impacted by the intensity of shared bike use. This area happens to be one of the areas where the Citi Bike system received the most investment and improvement in 2020, with new bike stations and docks introduced in areas that were not covered by the system in 2019 (CitiBike 2019). A higher intensity of shared bike use can stimulate subway ridership in two ways. On the one hand, high intensity may naturally be caused by high usage of shared bikes, which could be a popular feeder mode to the subway. On the other hand, high intensity can also be due to the low capacity of the new shared bike stations in the area, resulting in residents having to use alternative means of transportation, of which the subway is a popular choice.

Table 9 above shows the comparison of the estimated parameters and diagnostic information of the OLS and GWR models. The adjusted R-squared values of the GWR models show 12.5% and 25.5% improvement in 2019 and 2020 respectively. The AICs/AICcs of the GWR models are also noticeably smaller than those of the OLS models. This provides strong evidence that the GWR models perform much better than the global models. The Moran's I test of the residuals of the two models indicates that the residuals of the GWR models do not exhibit spatial autocorrelation (they tend towards randomness), while those of the OLS models are spatially clustered. The ANOVA test (see Table 10) also shows that GWR is an improvement over OLS, with a smaller sum of squares value.
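For readers unfamiliar with the residual diagnostic, the Global Moran's I statistic can be computed as in the sketch below. It is a minimal standard-library illustration, assuming a precomputed spatial weights matrix given as a list of lists; the function name `morans_i` is hypothetical, and the study's own test (including the construction of the weights and the significance testing) is not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I of `values` under spatial weights `weights`.

    values:  list of n residuals (or any attribute).
    weights: n x n list of lists, weights[i][j] >= 0, zero diagonal.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total of all weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    total_sq = sum(zi * zi for zi in z)
    return (n / s0) * (cross / total_sq)
```

Values near zero point to spatially random residuals, while clearly positive values indicate clustering of similar residuals, which is the pattern reported for the OLS models.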

Conclusions & Discussion
This study investigates the impact of COVID-19 on bike-transit integration while controlling for built environment, land use and socioeconomic variables in New York City, USA. This study utilized the most extended time-series data to capture the difference in bike-subway integration between the pre-pandemic period (January 2019-January 2020) and the post-pandemic period (January 2020-January 2021). The empirical findings of this study provide a new understanding of bike-transit integration during the COVID-19 period in the USA. Before this study, it was difficult to predict how bike-transit integration is spatially and temporally determined when controlling for land use and socioeconomic factors. It is confirmed that the bike-transit integration impact is largely positive and tends to be stronger when the subway stations are located farther away from CBD areas in 2019, while the bike-subway integration tends to be insignificant for a large number of stations in 2020. In addition, this study also confirms that the impact of bike usage on the subway during workdays presents a larger magnitude of coefficients than on non-workdays in 2019. In contrast, the 2020 model shows that the impacts do not differ much between workdays and non-workdays. These findings are rarely discussed in earlier studies. This study also uses an 800-meter boundary that captures the spatial impact of shared bike usage on subway ridership in NYC. However, it is barely discussed what network typologies determine such a spatial boundary of the shared bike impact area. This will be further discussed in future research.
The findings have important implications for developing policies concerning the development of the bike-subway integration system, the base infrastructure and the spatial planning framework. It is suggested that new policy guidelines be created for promoting active transportation and its integration with the subway system during the post-pandemic period, tailored to different neighborhoods and communities. The analysis results demonstrate that the areas most impacted by the reduction of bike-transit integration in 2020 are Queens and Brooklyn. In addition, the findings show that the strongest impact of shared bike usage on subway ridership appeared in Lower Manhattan after the pandemic. People who live in the Manhattan area would rather walk or ride a bike only. However, residents who live in other parts of New York and need to travel using multiple modes, such as shared bike and subway, may not be encouraged to travel during the pandemic. Given that the pandemic has been prolonged and its impacts will not fade away readily, it is arduous for policymakers to promote the use of active transportation and the public transit system. Notably, as explained in the results section, the negative impact shared bikes have on subway ridership could be strong evidence of shared bikes absorbing users from the subway during the pandemic. The shared bike and subway systems do not need to be mutually exclusive in promoting sustainable transportation agendas in the city. For the New York City government, it may be worthwhile to consider providing more bike parking facilities and increasing the number of Citi Bikes near the public transportation system to increase connectivity and encourage the use of shared bikes to reach metro stations during the post-pandemic era.
Although this study provides important findings, several research questions need to be explored in future studies. First of all, it is difficult to generalize the findings of this study, as this research focuses on bike-subway integration in New York City alone. Using a similar framework, the study could be repeated in other cities in the USA. Secondly, as Wang and Noland (2021) indicated, it is still unknown how the reduction of subway services affected ridership. It may be because of the shortage of staff members and a decrease in ridership.
Although the ridership gradually rebounded over the last year, it is still hard to capture the effect.

Figure 1. Location of Citi Bike and subway stations in New York City and Hudson County.

Figure 3. Distribution of daily entries of single station turnstiles

Figure 4. Catchment areas of NYC subway stations (based on Thiessen polygons truncated to 800 meters)

Figure 5. Pearson's correlation coefficient matrix of independent variables
The result of the OLS model is shown in Table 5 below.

Figure 6. Spatial distribution of estimated local coefficients of the GWR model in 2019 and 2020

Table 1. Summary of literature review on key factors impacting subway ridership

Table 2. Raw datasets used in the study

Table 3. Summary statistics of subway and shared bike usage

Table 6. Variance Inflation Factor values of independent variables

Table 8. Global Moran's I residuals test and AIC test for OLS and GWR models

The study also seeks to investigate how the diversity of land use types (mixture level) affects subway ridership. The GWR results suggest that in 2019, land use mixture level positively affects subway ridership predominantly in the three largest CBDs of NYC (Midtown and Lower Manhattan, Downtown Brooklyn), whereas in 2020 this effect became insignificant in Midtown and some parts of Lower Manhattan while still holding in Downtown Brooklyn. Mixed Use is one land use type recognized by NYC's Department of City Planning, representing land use consisting of both residential and commercial space. Unlike pure commercial and office space, which positively impacts the ridership of nearly all subway stations, the mixed land use type only contributes to subway trip attraction at stations located in non-CBD areas. Moreover, the farther away from CBDs, the stronger this impact becomes. It is also worth mentioning that in 2020, the overall beta coefficients for mixed land use are larger than those of 2019. The reason for this could be similar to the reason why commercial space attracts subway trips more effectively in suburban areas than in CBDs. Residential Use, Public Facility, and Open Space are found to have a largely insignificant impact on subway ridership.

Table 9. Estimated parameters and diagnostic information of the OLS and GWR models

Table 10
Note: Df, degrees of freedom; Sum Sq, sum of squares; Mean Sq, mean squared error; F value, F statistic values.