Isolation Forest algorithm
Real-world datasets almost inevitably contain a substantial number of outliers [47, 48]. These outliers degrade data quality: they can distort the results of statistical analyses and thereby mislead interpretation of the data. More critically, outliers can directly affect the training of machine learning models, causing them to fit anomalous points rather than the overall trend in the data [49, 50]. Such overfitting reduces a model's generalizability and ultimately its predictive performance on new data [51]. Identifying and properly handling outliers is therefore a crucial step in improving dataset quality and ensuring model accuracy and reliability.
The Isolation Forest algorithm is an anomaly detection method based on a simple principle: anomalies are rare and differ markedly from the rest of the data, so they are easier to "isolate" than normal points. Introduced in 2008 by Liu et al., it is an efficient method that is particularly suitable for high-dimensional datasets [52]. The algorithm repeatedly selects a feature at random and then a random split value for that feature, partitioning the data recursively. This procedure builds an isolation tree (iTree) in which outliers are separated early in the tree-building phase and therefore reach leaf nodes after fewer splits. By aggregating many such iTrees into a forest, the algorithm can evaluate the anomaly level of each data point: points isolated at shallower depths are generally more anomalous. Key strengths of the algorithm are its ability to handle datasets with many dimensions and its computational efficiency compared with distance-based or density-based outlier detection methods. Furthermore, the Isolation Forest does not assume any particular data distribution, which allows it to be applied to diverse datasets.
The core of the Isolation Forest algorithm lies in the construction of isolation trees and the calculation of anomaly scores.
Construction of Isolation Trees
Isolation trees are constructed by recursively selecting features and split values. At each split, a feature \(X\) is chosen at random, and a split value \(s\) is drawn at random between the minimum and maximum values of that feature in the current node. This process is repeated until the tree reaches a predetermined height or the number of data points in a node falls below a certain threshold.
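The recursive splitting described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function names, the height limit of 10, the number of trees, and the toy data are all assumptions, and the sketch follows only the branch that contains the query point instead of storing the full tree.

```python
import random

def path_length(x, data, rng, depth=0, max_depth=10):
    """Number of random splits needed to isolate x within data."""
    # Stop when x is alone or the (assumed) height limit is reached.
    if len(data) <= 1 or depth >= max_depth:
        return depth
    feature = rng.randrange(len(x))          # pick a feature at random
    values = [row[feature] for row in data]
    lo, hi = min(values), max(values)
    if lo == hi:                             # nothing left to split on
        return depth
    split = rng.uniform(lo, hi)              # random split between min and max
    # Keep only the points on the same side of the split as x.
    if x[feature] < split:
        side = [row for row in data if row[feature] < split]
    else:
        side = [row for row in data if row[feature] >= split]
    return path_length(x, side, rng, depth + 1, max_depth)

def mean_path_length(x, data, n_trees=200, seed=0):
    """Average path length of x over n_trees random trees."""
    rng = random.Random(seed)
    return sum(path_length(x, data, rng) for _ in range(n_trees)) / n_trees

# Toy data: a tight cluster plus one distant point (hypothetical values).
cluster_rng = random.Random(1)
points = [(cluster_rng.gauss(0, 1), cluster_rng.gauss(0, 1)) for _ in range(100)]
outlier = (12.0, 12.0)
points.append(outlier)

outlier_depth = mean_path_length(outlier, points)
inlier_depth = mean_path_length(points[0], points)
```

In this toy setting the distant point is separated after far fewer splits than a point inside the cluster, which is the property the anomaly score relies on.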
Calculation of Anomaly Scores
The anomaly score of each data point is based on its average path length across the trees. The path length of a data point \(x\) in a single isolation tree is denoted \(h\left(x\right)\); a point that is isolated at a shallow level of the tree is likely to be an anomaly. For a given data point \(x\), the average path length across all trees is \(E\left(h\left(x\right)\right)\). The anomaly score \(s\left(x,n\right)\) is then calculated using Eq. (1), where \(n\) is the number of samples:
$$s\left(x,n\right)={2}^{-\frac{E\left(h\left(x\right)\right)}{c\left(n\right)}}\tag{1}$$
Here, \(c\left(n\right)\) is a normalization factor that depends on the number of samples \(n\); it rescales the average path length so that scores are comparable across datasets of different sizes. The score thus quantifies how likely a data point is to be an outlier, based on its average path length over multiple trees. The normalization factor is given by Eq. (2):
$$c\left(n\right)=2H\left(n-1\right)-\frac{2\left(n-1\right)}{n}\tag{2}$$
In Eq. (2), \(H\left(i\right)\) is the harmonic number, which can be approximated by \(\text{ln}\left(i\right)+0.5772156649\), where the constant is the Euler-Mascheroni constant.
If \(s\left(x,n\right)\approx 1\), then \(x\) is likely to be an outlier.
If \(s\left(x,n\right)\approx 0\), then \(x\) is likely to be a normal data point.
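As a worked illustration of the score, the following sketch evaluates the normalization factor and the anomaly score for a few average path lengths. It assumes the score is written with a negative exponent, \(s(x,n)=2^{-E(h(x))/c(n)}\), so that short paths give scores near 1 as stated above; the function names and the sample size of 256 are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def harmonic(i):
    """Approximate harmonic number H(i) as ln(i) + Euler-Mascheroni constant."""
    return math.log(i) + EULER_GAMMA

def c(n):
    """Normalization factor c(n) = 2 H(n - 1) - 2 (n - 1) / n, for n > 1."""
    if n <= 1:
        raise ValueError("c(n) is only defined here for n > 1")
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n

def anomaly_score(mean_path_length, n):
    """s(x, n) = 2 ** (-E(h(x)) / c(n))."""
    return 2.0 ** (-mean_path_length / c(n))

n = 256                                    # illustrative sample size
short_path = anomaly_score(2.0, n)         # isolated after about two splits
average_path = anomaly_score(c(n), n)      # path length equal to c(n)
long_path = anomaly_score(n - 1, n)        # longest possible path
```

A point whose average path length equals \(c(n)\) scores exactly 0.5, shorter paths push the score toward 1, and very long paths push it toward 0.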
Gated Graph Convolutional Neural Network
This study employs a gated graph convolutional neural network (GGCN) with multi-feature edge weights to predict the rental and return demand of bike-sharing stations. The network architecture consists mainly of two parts: the gating mechanism and the graph convolutional network.
Gating Mechanism
Gating mechanisms are essential in deep learning architectures, notably in Recurrent Neural Networks (RNNs) and their derivatives such as Long Short-Term Memory networks (LSTMs) and Gated Recurrent Units (GRUs) [53, 54]. By regulating the flow of information, these mechanisms help models capture long-term dependencies.
Suppose there are two vectors, \(A\) and \(B\), where \(A\) is the information to be controlled, and \(B\) is the source controlling the information. A simple gating mechanism can be represented as follows.
$$G=\sigma \left({W}_{g}\cdot B+{b}_{g}\right)\tag{3}$$
$$C=A*G\tag{4}$$
Here:
\(G\) is the gate signal, which determines how much information from \(A\) can pass through.
\(\sigma\) is an activation function, such as Sigmoid, which ensures the values of \(G\) are between 0 and 1. This allows \(G\) to act as a proportionality coefficient to control the flow of information.
\({W}_{g}\) and \({b}_{g}\) are the weight and bias parameters of the gating mechanism, which are learned during training.
\(*\) denotes the Hadamard product (elementwise multiplication), used to control the flow of information by multiplying each element of \(A\) with the corresponding element in \(G\).
\(C\) is the output after gating, representing the information modulated by the gating mechanism.
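The gating operation can be illustrated with a short NumPy sketch. It assumes the gated output is the Hadamard product \(C=A*G\) described in the list above; the vector sizes and the random, untrained weights are hypothetical and serve only to show the shapes involved.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def gate(A, B, W_g, b_g):
    """Return (G, C) with G = sigmoid(W_g @ B + b_g) and C = A * G."""
    G = sigmoid(W_g @ B + b_g)   # gate signal, one value per element of A
    C = A * G                    # Hadamard (element-wise) product
    return G, C

rng = np.random.default_rng(0)
A = np.array([1.0, -2.0, 3.0])       # information to be controlled
B = rng.normal(size=4)               # source that controls the gate
W_g = rng.normal(size=(3, 4))        # illustrative, untrained weights
b_g = np.zeros(3)                    # illustrative bias

G, C = gate(A, B, W_g, b_g)
```

Because every entry of \(G\) lies between 0 and 1, each entry of \(C\) is a scaled-down copy of the corresponding entry of \(A\).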
Graph Convolutional Networks
Graph Convolutional Networks (GCNs) are designed to handle graph-structured data, using graph convolutional layers to derive feature representations of the vertices. By propagating information across the nodes of a graph, GCNs allow each node to aggregate information from its adjacent nodes, effectively capturing the graph’s topological attributes [55].
Consider a graph \(G=(V,E)\), where \(V\) is the set of nodes and \(E\) is the set of edges. Let \(A\) be the adjacency matrix of the graph, and \(X\) the node feature matrix (with each row representing the feature vector of a node). A single layer of a GCN can be expressed with the following mathematical formulation.
$${H}^{(l+1)}=\sigma \left({\widehat{D}}^{-\frac{1}{2}}\widehat{A}{\widehat{D}}^{-\frac{1}{2}}{H}^{\left(l\right)}{W}^{\left(l\right)}\right)\tag{5}$$
In the context of GCN:
\({H}^{\left(l\right)}\) represents the node features at the \(l\)-th layer, with \({H}^{\left(0\right)}=X\) for the input layer.
\({W}^{\left(l\right)}\) is the weight matrix for the \(l\)-th layer.
\(\sigma\) denotes a nonlinear activation function, such as ReLU.
\(\widehat{A}=A+{I}_{N}\) is the sum of the adjacency matrix \(A\) and the identity matrix \({I}_{N}\), where \(N\) is the number of nodes. This addition of self-connections allows each node to retain its own features while aggregating information from its neighbors.
\(\widehat{D}\) is the diagonal degree matrix of \(\widehat{A}\), with each element \({\widehat{D}}_{ii}\) equal to the degree of node \(i\) after adding the self-connection.
\({\widehat{D}}^{-\frac{1}{2}}\widehat{A}{\widehat{D}}^{-\frac{1}{2}}\) is the symmetrically normalized adjacency matrix, used to ensure that features maintain relative proportions during propagation.
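A single propagation step of this kind can be written compactly in NumPy. The sketch below assumes the standard symmetric normalization with inverse square roots of the degree matrix and ReLU as the activation; the three-node chain graph, the feature values, and the identity weight matrix are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-connections
    deg = A_hat.sum(axis=1)                   # node degrees including self-loops
    D_inv_sqrt = np.diag(deg ** -0.5)         # D^-1/2 as a diagonal matrix
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation

# Toy graph: three nodes in a chain 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])     # H^(0) = X, two features per node
W = np.eye(2)                  # identity weights, for readability only

H1 = gcn_layer(A, X, W)
```

With identity weights, each row of `H1` is simply a degree-weighted average of a node's own features and those of its neighbors, which makes the aggregation step visible.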
Construction of the Graph Structure
The Gated Graph Convolutional Neural Network relies on the graph structure, which is represented in the form of an adjacency matrix. In this study, two types of adjacency matrices were constructed, and node features were added to each station in the graph structures.
Dynamic Adjacency Matrix
The demand for borrowing and returning shared bicycles at each station is complex and varies considerably over time [56]. This study uses a one-hour time window, aggregating the data at hourly granularity. Each bike-sharing station is treated as a node whose features include the hourly rental and return demand and the proportions of different user characteristics within each time window.
For edge weights, the study normalizes and combines three quantities: the distance between stations, the travel duration between stations, and the correlation between stations. The correlation is computed from the numbers of rentals and returns between stations, using the Pearson Correlation Coefficient (PCC), and measures the degree of mutual influence between them. A weight coefficient is assigned to each of the three quantities, and the weighted terms are summed to obtain the total edge weight:
$$W=\alpha \cdot {W}_{time}\left(t\right)+\beta \cdot {W}_{distance}+\gamma \cdot {W}_{PCC}\tag{6}$$
Here, \(\alpha\), \(\beta\), and \(\gamma\) are the weight coefficients for travel duration, travel distance, and station correlation, respectively, and \(W\) is the composite edge weight. \({W}_{time}\left(t\right)\) denotes the normalized travel time at time \(t\), \({W}_{distance}\) the normalized distance between stations, and \({W}_{PCC}\) the correlation coefficient between stations.
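The weighted combination can be sketched as follows. Several details are assumptions made for illustration and are not stated in the source: min-max scaling is used as the normalization, the Pearson correlation is computed between hourly demand series of the stations, and the coefficients are fixed to 0.4, 0.3, and 0.3 rather than learned. All numeric values are hypothetical.

```python
import numpy as np

def min_max(M):
    """Scale a matrix to [0, 1]; a constant matrix maps to zeros."""
    span = M.max() - M.min()
    return (M - M.min()) / span if span > 0 else np.zeros_like(M)

def composite_weight(W_time, W_distance, W_pcc, alpha, beta, gamma):
    """W = alpha * W_time(t) + beta * W_distance + gamma * W_PCC."""
    return alpha * W_time + beta * W_distance + gamma * W_pcc

# Hypothetical values for three stations in one time window.
travel_min = np.array([[0.0, 6.0, 12.0],
                       [6.0, 0.0, 9.0],
                       [12.0, 9.0, 0.0]])
distance_km = np.array([[0.0, 1.0, 2.0],
                        [1.0, 0.0, 1.5],
                        [2.0, 1.5, 0.0]])
# Assumed input for the PCC: one hourly demand series per station (rows).
demand = np.array([[5, 9, 4, 8],
                   [6, 10, 5, 9],
                   [9, 3, 8, 2]], dtype=float)
pcc = np.corrcoef(demand)        # pairwise Pearson correlation matrix

W = composite_weight(min_max(travel_min), min_max(distance_km), pcc,
                     alpha=0.4, beta=0.3, gamma=0.3)
```

For stations 0 and 1, the normalized travel time and distance are both 0.5 and the demand series are perfectly correlated, so the composite weight is 0.4 × 0.5 + 0.3 × 0.5 + 0.3 × 1 = 0.65.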
Reachability Matrix
Spatial distance is a crucial factor in determining whether bike-sharing stations are reachable from one another. In this study, the geographic distance between each pair of stations is calculated from their latitude and longitude coordinates using the Haversine formula, which gives the great-circle distance between the two points. The elements of the reachability (accessibility) matrix are defined as follows.
$${A}_{ij}=\begin{cases}1, & \text{if } {d}_{ij}\le {k}_{d}\\ 0, & \text{if } {d}_{ij}>{k}_{d}\end{cases}\tag{7}$$
Here, \({d}_{ij}\) is the geographical distance between stations \(i\) and \(j\), and \({k}_{d}\) is a predefined distance threshold for station proximity.
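The thresholding rule can be sketched with the standard Haversine formula. The Earth radius of 6371 km, the station coordinates, and the 2 km value passed as \(k_d\) in the example call are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def reachability_matrix(coords, k_d):
    """A[i][j] = 1 if the distance d_ij <= k_d, else 0."""
    return [[1 if haversine_km(*p, *q) <= k_d else 0 for q in coords]
            for p in coords]

# Hypothetical station coordinates (latitude, longitude).
stations = [(40.7196, -74.0431), (40.7270, -74.0338), (40.7800, -74.0000)]
A = reachability_matrix(stations, k_d=2.0)
```

Note that, with the rule exactly as written, each station is reachable from itself because its distance to itself is zero.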
Construction of the Research Framework
The Gated Graph Convolutional Neural Network comprises several main components: the input layer, gated graph convolutional units, the readout layer, and fully connected layers. The model takes the node features and the graph topology as input. Each gated graph convolutional unit is a core component consisting of two parts, a gating mechanism and a graph convolution. The structure of the network is depicted in Fig. 1.
Experimental results and analysis
Dataset
The original dataset used in this study was downloaded from https://citibikenyc.com/. These records document shared bicycle rental and return trips in Jersey City, USA, for the entire year of 2020. Each record includes the travel duration, start and end times, geographical coordinates of the start and end points, station IDs at both ends, station locations, user type, gender, and birth year. We performed the following data preprocessing on the raw dataset.
We used the Isolation Forest algorithm to detect and remove outlier trip records with abnormal trip distances, durations, and travel frequencies between stations. Common parameter settings for the Isolation Forest algorithm were used, with the number of estimators set to 500. The sample size for each isolation tree and the contamination ratio (proportion of outliers in the dataset) were set to 'auto', allowing the algorithm to automatically adjust these parameters based on the dataset.
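The reported settings map naturally onto scikit-learn's `IsolationForest`, although the source does not name the library that was used, so this is an assumption. The sketch below applies those settings to hypothetical trip features (distance and duration) with one deliberately extreme record appended.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical trip features: [distance_km, duration_min].
trips = rng.normal(loc=[2.0, 12.0], scale=[0.5, 3.0], size=(300, 2))
trips = np.vstack([trips, [[40.0, 600.0]]])   # one extreme record, last row

forest = IsolationForest(
    n_estimators=500,       # number of isolation trees, as reported
    max_samples="auto",     # scikit-learn uses min(256, n_samples) per tree
    contamination="auto",   # use the library's default score threshold
    random_state=0,         # fixed seed for reproducibility (assumption)
)
labels = forest.fit_predict(trips)   # 1 = inlier, -1 = outlier
cleaned = trips[labels == 1]         # keep only records labeled as inliers
```

With `contamination="auto"`, the library applies a fixed threshold to the anomaly score instead of forcing a given share of records to be flagged, so the number of removed records depends on the data.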
Time Dependency Analysis
In this study, we analyzed the timestamps in the dataset at hourly granularity. We also labeled holidays in the dataset, grouping holidays and weekends as non-working days and treating the remaining dates as working days. Figure 2 compares the hourly demand for bike-sharing on working and non-working days.
The figure shows the following:
On working days, travel frequency peaks in the morning (around 7–9 AM) and evening (around 5–7 PM), reflecting a typical commuting pattern.
On non-working days, travel frequency is more evenly distributed, increasing gradually from noon and peaking in the evening.
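The day labeling and hourly aggregation behind this comparison can be sketched with the Python standard library. The holiday list is an assumption, since the source does not say which holiday calendar was used, and the trip timestamps are hypothetical.

```python
from collections import Counter
from datetime import date, datetime

# Assumed holiday list; the actual calendar used is not given in the source.
HOLIDAYS = {date(2020, 1, 1), date(2020, 7, 4), date(2020, 12, 25)}

def day_type(ts):
    """Label a timestamp as 'non-working' (weekend or holiday) or 'working'."""
    if ts.weekday() >= 5 or ts.date() in HOLIDAYS:
        return "non-working"
    return "working"

def hourly_counts(start_times):
    """Count trips per (day type, hour of day)."""
    return Counter((day_type(ts), ts.hour) for ts in start_times)

trips = [
    datetime(2020, 3, 2, 8, 15),     # Monday morning
    datetime(2020, 3, 2, 8, 40),
    datetime(2020, 3, 2, 17, 5),     # Monday evening
    datetime(2020, 3, 7, 14, 30),    # Saturday afternoon
    datetime(2020, 12, 25, 10, 0),   # listed holiday
]
counts = hourly_counts(trips)
```

Plotting these counts against the hour of day, separately for the two day types, gives the kind of comparison described above.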
Spatial Dependency Analysis
In this study, we calculate the PageRank value of each station to reveal which stations are most pivotal within the study area. We also compute Moran's I and perform the associated significance test, using the global index to assess global spatial autocorrelation and the local indices to assess local spatial autocorrelation. Pearson correlation analysis of the numbers of rental and return trips between station pairs helps identify stations that significantly influence one another. Finally, we use the station trip data to create flow maps and heatmaps that visualize movement patterns between stations and hotspot areas.
The PageRank values of the stations and their relative positions on the geographical latitude and longitude map are shown in Fig. 3.
We use different colors to differentiate the PageRank values of the stations. Among all stations, Grove St PATH has the highest PageRank value, reaching 0.08, followed by Newport Pkwy, Hamilton Park, and Sip Ave, each surpassing a PageRank value of 0.04. Additionally, according to Fig. 4, the heatmap of bicycle borrowing and returning demand, and Fig. 5, the flow map of bicycle borrowing and returning demand, it is evident that the central area of Jersey City is a hotspot for shared bicycle station activity. The Grove Street PATH station, an important public transportation hub and venue for many community events and festivals, is often referred to as the heart of downtown Jersey City. Its core status is also evident from its high PageRank value, central position in the heatmap, and the intense frequency of trips shown in the flow map. Similarly, the stations with high PageRank values like Newport Pkwy, Hamilton Park, and Sip Ave play significant roles in the economic, cultural, and daily life of Jersey City.
From Figs. 3, 4, and 5, we observe that different shared bicycle stations play varied roles within the city and differ in their impact on residents' travel. These variations reflect the combined effects of urban planning, community layout, transportation networks, and residents' travel habits. Understanding spatial dependencies is therefore essential when predicting rental and return demand at shared bicycle stations.
This study also uses Moran's I to analyze the global and local spatial autocorrelation of stations and employs Pearson correlation analysis to study the interrelations between stations. The findings are as follows.
Table 1
| Moran's I Index | P-value |
| --- | --- |
| 0.08172 | 0.001 |

The global Moran's I in Table 1 is small but statistically significant (0.08172, P-value 0.001), indicating only weak positive spatial autocorrelation: overall, the shared bicycle stations in the area show no pronounced clustering or regular distribution pattern. The frequency distribution of the local Moran's I in Fig. 6, however, shows clear spatial clusters among certain stations, and some stations even show significant spatial dispersion. Most stations do not exhibit significant local spatial autocorrelation; where the p-value exceeds 0.05, there is insufficient evidence that the spatial clustering or dispersion at that location differs from a random distribution.
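For reference, the global Moran's I statistic can be computed directly from its standard definition. The sketch below uses a hypothetical binary neighbor matrix for four stations on a line; the spatial weights actually used in the study are not specified in the source, and the significance test (typically a permutation test) is omitted.

```python
import numpy as np

def global_morans_i(x, W):
    """Global Moran's I for values x and spatial weight matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()                          # deviations from the mean
    numerator = (W * np.outer(z, z)).sum()    # sum over i, j of w_ij z_i z_j
    return (len(x) / W.sum()) * numerator / (z ** 2).sum()

# Four stations on a line, each linked to its immediate neighbours.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
alternating = global_morans_i([1, 0, 1, 0], W)   # neighbours always differ
clustered = global_morans_i([1, 1, 0, 0], W)     # similar values sit together
```

The alternating pattern yields a negative value (dispersion) and the clustered pattern a positive value, while values near zero, such as the 0.08172 reported in Table 1, indicate little global structure.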
Spatiotemporal dependency analysis
After the temporal and spatial dependency analyses, we carried out a spatiotemporal dependency analysis, examining how rental and return demand at each station varies over the day. Stations were divided into four groups according to their rental and return demand levels; the resulting hourly rental and return demand curves for bike-sharing stations are shown in Figs. 7 and 8, respectively.
From Fig. 7, in Group 1, the "Grove St PATH" station shows a significant number of rentals during the evening peak hours. Conversely, in Fig. 8, in Group 1, the same station shows a high number of returns during the morning peak hours. This pattern reflects the "tidal" nature of residents' work and life. This phenomenon is common among stations with high travel volumes.
In Group 4 of Figs. 7 and 8, which includes stations with very low or low hourly rental and return demand, the morning and evening peak patterns are not as pronounced due to the location of these stations. The demand changes smoothly between 6 AM and 8 PM.
Some stations, like 'Journal Square' and 'McGinley Square' in Group 3, exhibit unique peaks that differ from typical commuting times. This indicates localized demand likely related to nearby facilities such as schools, offices, or shopping areas.
Traveler Dependency Analysis
In this study, traveler characteristics are incorporated into the model as input features, and a traveler dependency analysis is conducted. The analysis examines how rental and return demand, and travel behavior more generally, differ with travelers' age, gender, and user type. Ages were first grouped into the following segments, chosen empirically rather than from any legal definition: Teenagers (up to 19 years old), Young Adults (20 to 29 years old), Middle-aged (30 to 44 years old), Older Adults (45 to 64 years old), and Seniors (65 years and older). We then created heatmaps of average travel distance, average travel time, and travel frequency by traveler age, gender, and user type, shown in Figs. 9, 10, and 11. Gender is coded as 1 for male and 2 for female, and user type as 0 for casual users and 1 for subscribers. These visualizations reveal differences in travel distance, time, and frequency among combinations of user characteristics.
To determine whether the differences in travel distance and duration among genders, user types, and age groups are statistically significant, the study first applied the Shapiro-Wilk test to the travel distance and duration data. The results indicate that the data are not normally distributed (P-value far below 0.05), so tests that assume normality, such as ANOVA, are not appropriate. The Kruskal-Wallis H-test was therefore used instead. Its results for travel distance and duration by gender, user type, and age group are presented in Table 2.
Table 2
Results of the Kruskal-Wallis H-test Statistical Analysis
| Object | Statistic | P-value |
| --- | --- | --- |
| Distance by Gender | 2.455 | 0.117 |
| Distance by User Type | 2041.979 | 0.000 |
| Distance by Age Group | 2209.023 | 0.000 |
| Duration by Gender | 1618.293 | 0.000 |
| Duration by User Type | 27518.404 | 0.000 |
| Duration by Age Group | 6074.359 | 0.000 |

The visual and statistical analyses show significant differences in travel behavior across genders, user types, and age groups on the key metrics of travel distance and duration. The one exception is travel distance by gender, for which the Kruskal-Wallis H-test does not indicate a significant difference (P-value 0.117).
Because the differences are statistically significant in all other comparisons, incorporating individual traveler characteristics into the predictive model for station rental and return demand is well motivated. Doing so may improve prediction accuracy and can also support better resource allocation, user satisfaction, and overall service efficiency.
Model Validation and Analysis
Graph Generation and Node Feature Calculation
This study employs a one-hour time window for predicting hourly rental and return demand at bike-sharing stations, which captures short-term demand fluctuations. The dataset is aggregated at one-hour intervals, recording the number of rentals and returns at each station per hour. Age groups, gender, and user types are encoded using one-hot encoding, which allows the user type distribution (the proportions of subscribers and customers), the gender distribution (the proportions of male and female users), and the age distribution (the proportions of users in each age group) to be calculated for each station and time window. These proportions, together with the rental and return demand for each station and time window, form the node features.
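The construction of one node feature vector can be sketched as follows. The ordering of the features, the age group labels, and the sample trips are assumptions made for illustration; the gender and user type codes follow those given earlier in the text (1 for male and 2 for female, 0 for casual users and 1 for subscribers).

```python
from collections import Counter

AGE_GROUPS = ["teen", "young", "middle", "older", "senior"]  # labels assumed

def proportions(values, categories):
    """Share of each category among values (zeros if there are no trips)."""
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total if total else 0.0 for c in categories]

def node_features(rentals, returns, trips):
    """Feature vector for one station and one one-hour window.

    trips is a list of (user_type, gender, age_group) tuples.
    """
    user_types = [t[0] for t in trips]
    genders = [t[1] for t in trips]
    age_groups = [t[2] for t in trips]
    return ([rentals, returns]
            + proportions(user_types, [0, 1])       # casual, subscriber
            + proportions(genders, [1, 2])          # male, female
            + proportions(age_groups, AGE_GROUPS))  # five age segments

# Hypothetical trips touching one station within one hour.
trips = [(1, 1, "young"), (1, 2, "middle"), (0, 1, "young"), (1, 1, "older")]
features = node_features(rentals=3, returns=1, trips=trips)
```

Averaging one-hot encoded rows is equivalent to computing these category shares, so the result is an 11-element vector: two demand counts followed by nine proportions.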
To construct the station accessibility matrix, the distances between station pairs were calculated and a distance threshold of 2 km was applied. This matrix determines which station pairs are connected by edges. Previous studies often used a simple edge weight design, taking either travel distance or travel time as the edge weight. This study argues that a single edge weight does not adequately capture the travel dynamics between stations. For instance, some trips have the same start and end stations yet a distinct travel duration. This study therefore incorporates both station distance and travel time in the edge weights. In addition, to account for varying correlations between stations, the station correlation coefficient is included, yielding a composite edge weight. The coefficients \(\alpha\), \(\beta\), and \(\gamma\) are treated as model parameters, so that their values are fitted adaptively to the dataset. The edge weights are calculated using Eq. (6).
Model Validation and Results Analysis
In this study, a Gated Graph Convolutional Neural Network (GGCN, referred to as GateGCN in the results below) is constructed to predict the hourly rental and return demand at bike-sharing stations. The structure of the GGCN is illustrated in Fig. 12.
Because seasonal factors strongly affect bike-sharing time series, this study analyzes a full year of bike-sharing data from Jersey City. To account for seasonal factors, the annual data are divided into seasons based on the temperature and seasonal changes in Jersey City for 2020, as provided by https://weatherspark.com/. Spring is defined as March to May, Summer as June to August, Fall as September to November, and Winter as December to February. Spring and Fall have relatively similar climatic conditions, with moderate temperatures that are conducive to biking, so merging their data increases the data volume and can improve model training. Summer and Winter, by contrast, experience extreme weather such as high temperatures and heavy snow, which significantly affects bike-sharing usage; analyzing them together allows the impact of extreme weather on demand to be captured. This study therefore divides the dataset into Spring-Fall and Summer-Winter datasets for predictive analysis.
For each of the four seasons, the three-month period is split into training and testing sets, with the first two and a half months used for training and the last half month for testing. The training and testing sets for Spring and Fall are then combined, and likewise for Summer and Winter. On the testing set, rental and return demand is predicted only for peak hours at the stations (7–9 AM for the morning peak and 5–7 PM for the evening peak).
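The seasonal grouping and the train/test split can be expressed as a small lookup, sketched below. Two details are assumptions, since the source does not spell them out: the "last half month" is taken to start on day 16 of the final month of each season, and the peak windows are read as the hours 7, 8, 17, and 18.

```python
from datetime import datetime

SEASON_BY_MONTH = {3: "Spring", 4: "Spring", 5: "Spring",
                   6: "Summer", 7: "Summer", 8: "Summer",
                   9: "Fall", 10: "Fall", 11: "Fall",
                   12: "Winter", 1: "Winter", 2: "Winter"}
GROUP_BY_SEASON = {"Spring": "Spring-Fall", "Fall": "Spring-Fall",
                   "Summer": "Summer-Winter", "Winter": "Summer-Winter"}
LAST_MONTH = {"Spring": 5, "Summer": 8, "Fall": 11, "Winter": 2}
PEAK_HOURS = {7, 8, 17, 18}   # assumed reading of 7-9 AM and 5-7 PM

def assign(ts):
    """Return (dataset group, 'train' or 'test') for a timestamp."""
    season = SEASON_BY_MONTH[ts.month]
    # Assumption: the last half month starts on day 16 of the final month.
    is_test = ts.month == LAST_MONTH[season] and ts.day >= 16
    return GROUP_BY_SEASON[season], ("test" if is_test else "train")

def is_evaluated(ts):
    """Test-set records are scored only during peak hours."""
    return assign(ts)[1] == "test" and ts.hour in PEAK_HOURS

late_spring = assign(datetime(2020, 5, 20, 8))
mid_summer = assign(datetime(2020, 7, 4, 12))
late_winter = assign(datetime(2020, 2, 20, 18))
```

Under these assumptions, a record from 20 May falls in the Spring-Fall test set and is scored only if its hour lies within a peak window.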
The model is trained with the Adam optimizer, a learning rate of 0.001, a batch size of 128, and MSE as the loss function; Mean Squared Error (MSE) and Mean Absolute Error (MAE) are used as evaluation metrics. The baseline models selected for comparison are Graph Convolutional Network (GCN), Gated Graph Recurrent Neural Network (GGRNN), Graph Isomorphism Network (GIN), and Graph Attention Network (GAT). The test loss of each model on the Spring-Fall and Summer-Winter testing sets, as a function of the number of iterations, is presented in Figs. 13 and 14. The MAE and MSE values of the models on the Spring-Fall and Summer-Winter datasets are shown in Tables 3 and 4.
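For reference, the two evaluation metrics follow their standard definitions, sketched here in plain Python with hypothetical demand values.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error: average of (y - y_hat) ** 2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical hourly demand at four stations and a model's predictions.
observed = [3, 0, 5, 2]
predicted = [2.5, 0.5, 4.0, 2.0]

mae_value = mae(observed, predicted)   # (0.5 + 0.5 + 1.0 + 0.0) / 4
mse_value = mse(observed, predicted)   # (0.25 + 0.25 + 1.0 + 0.0) / 4
```

Because MSE squares each error, a model with a few large errors is penalized much more heavily under MSE than under MAE.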
Figures 13 and 14 use dual y-axes because the test loss values differ greatly between models. A logarithmic transformation is applied, with the test loss of GAT, GCN, and GateGCN plotted on the left y-axis and that of GIN and GGRNN on the right y-axis. From the visualization results and the test loss values of the models, the following conclusions can be drawn:

The GateGCN model shows noticeable fluctuations in test loss but consistently attains the lowest values, indicating the strongest generalization ability on the test data among the compared models.

The GIN model performs the worst, with the highest test loss values and considerable fluctuations.

The GAT, GCN, and GGRNN models perform moderately, with relatively stable test loss values. However, these values remain high, suggesting potential issues with model capacity or overfitting.
Table 3
Evaluation Metrics Results for Different Models in the Spring-Fall Seasons
| Models | MAE | MSE |
| --- | --- | --- |
| GCN | 0.592 | 1.071 |
| GateGCN | 0.52 | 0.906 |
| GAT | 0.588 | 1.058 |
| GIN | 3.866 | 34.735 |
| GGRNN | 0.566 | 0.979 |

Table 4
Evaluation Metrics Results for Different Models in the Summer-Winter Seasons
| Models | MAE | MSE |
| --- | --- | --- |
| GCN | 0.34 | 0.765 |
| GateGCN | 0.296 | 0.594 |
| GAT | 0.349 | 0.754 |
| GIN | 2.006 | 14.938 |
| GGRNN | 0.334 | 0.715 |

Comparing the evaluation metrics (Mean Absolute Error, MAE, and Mean Squared Error, MSE) of the models in Tables 3 and 4 across the Spring-Fall and Summer-Winter seasons, we can draw the following conclusions:
The GateGCN model performs best on both MAE and MSE, followed by the GGRNN model. The GCN and GAT models perform comparably: GAT has slightly lower errors than GCN in the Spring-Fall seasons, while in the Summer-Winter seasons it has a slightly higher MAE but a slightly lower MSE. Both clearly outperform the GIN model, which performs worst, with error values far higher than those of the other models.
The results in the tables show that the GateGCN model, a graph neural network-based model, achieves the best performance in predicting the hourly rental and return demand at bike-sharing stations. It captures the spatial relationships between stations and the temporal dynamics of demand, with the gating mechanism adjusting the flow of information, so that both spatial and temporal features contribute to more accurate predictions.
Comparing Tables 3 and 4, we observe that the evaluation metrics differ between the Spring-Fall and Summer-Winter seasons. The metrics for all models are smaller in the Summer-Winter seasons than in the Spring-Fall seasons, suggesting that the models predict more accurately in the Summer-Winter seasons. This difference may be explained by seasonal demand variations and the impact of extreme weather conditions.
Extreme weather in the Summer-Winter seasons may lead to more concentrated bike-sharing usage. During these seasons, people are more likely to travel during the short periods when the weather is favorable and less likely to use bike-sharing services at other times. In contrast, the mild climate of the Spring-Fall seasons is more conducive to outdoor activities, leading to more dispersed travel patterns. Under these conditions, bike-sharing usage is influenced more by personal travel preferences and long-term commuting needs than by direct seasonal weather factors. This dispersed travel pattern makes it harder for the prediction models to capture demand trends, resulting in relatively larger error values for the Spring-Fall seasons.
In summary, extreme weather in the Summer-Winter seasons may lead to more concentrated bike-sharing demand, while the mild climate of the Spring-Fall seasons results in more dispersed demand. This seasonal demand pattern directly affects the performance of the prediction models. It is therefore essential to consider seasonal factors when predicting rental and return demand at bike-sharing stations, in order to improve the accuracy and practicality of the models.
Table 5
Comparison Results of Multi-Feature Edge Weights and Single-Feature Edge Weights in the Spring-Fall Seasons
| Edge Weight | MAE | MSE |
| --- | --- | --- |
| trip | 0.554 | 0.962 |
| dist | 0.53 | 0.915 |
| corr | 0.54 | 0.913 |
| trip-dist-corr | 0.52 | 0.906 |

Table 6
Comparison Results of Multi-Feature Edge Weights and Single-Feature Edge Weights in the Summer-Winter Seasons
| Edge Weight | MAE | MSE |
| --- | --- | --- |
| trip | 0.312 | 0.643 |
| dist | 0.299 | 0.601 |
| corr | 0.305 | 0.597 |
| trip-dist-corr | 0.296 | 0.594 |

In Tables 5 and 6, travel time is denoted as (trip), travel distance as (dist), and correlation coefficient as (corr). The comparison shows that the multi-feature edge weights yield lower prediction errors than any single-feature edge weight in the hourly rental and return demand prediction model for bike-sharing stations. The advantage holds in both the Spring-Fall and Summer-Winter seasons, although the margins over the best single-feature weights are small (for example, an MAE of 0.52 versus 0.53 in the Spring-Fall seasons).
Specifically, in the Spring-Fall seasons, the model using multi-feature edge weights outperforms the models using single-feature edge weights on both MAE and MSE, and the same is observed in the Summer-Winter seasons. This indicates that multi-feature edge weights are effective across different seasons. The findings therefore support the use of multi-feature edge weights and provide useful insights for research and practice in bike-sharing station demand prediction.
Conclusions and future directions
Predicting hourly rental and return demand at bike-sharing stations plays a crucial role in improving the utilization of these stations. Many factors influence this demand, and previous researchers have focused on aspects such as time, space, weather, and events. This study additionally considers the impact of traveler characteristics, such as age, gender, and user type, on demand prediction for bike-sharing stations. Using a full-year dataset from Jersey City in 2020 for empirical analysis, this study conducts temporal, spatial, and traveler dependency analyses to demonstrate the necessity of considering these diverse features. Seasonal factors are also taken into account, with the dataset divided into seasons: Spring and Fall are analyzed together because of their similar climates and transitional period effects, and Summer and Winter are analyzed together. The Gated Graph Convolutional Network (GGCN) is employed as the prediction model, with bike-sharing stations treated as nodes and traveler characteristics added as node features. Edges connect stations within a distance threshold, and multi-feature edge weights are constructed from travel distance, travel time, and station correlation coefficients. The approach thus combines composite edge weights, dynamic adjacency matrices, and accessibility matrices. Using MAE and MSE as evaluation metrics, the proposed method is compared with baseline models to demonstrate its feasibility and its advantage over them. The prediction results for Spring-Fall and Summer-Winter further demonstrate the importance of considering seasonal factors to enhance the accuracy and practicality of the prediction model.
Many factors influence bike-sharing station demand, and weather, a key factor, is not included as a model input in this study. Future research should explore the applicability of the proposed method with higher-dimensional input features. In addition, the contribution of the different edge feature weights to the predicted values warrants further investigation.