3.1 Study Area
The city of Kigali has a tropical climate. Its population is projected to reach 3.8 million by 2035, up from roughly 1.2 million today. About 65% of the city's residents are youths, and a large proportion of these are students.
The city has introduced car-free days and car-free zones (Kigali car-free day, 2019). In 2019, the City of Kigali launched a bike-sharing scheme to alleviate traffic congestion and greenhouse gas emissions (Guraride, 2019), but no research has yet evaluated this mode.
Bike-share riders use designated bike lanes that are physically separated from the pedestrian walkway and from motorized traffic, which protects riders. Users guard against head injuries by wearing helmets, and the bikes are built with low gears specifically to limit speed. To prevent accidents at night, the bikes are brightly colored and fitted with lights. A deposit is required to use the system, which encourages responsible behavior. Users of the system are primarily young and well educated, so they tend to be well versed in traffic regulations, and the system administrators occasionally host training sessions on the topic. In addition, studies have shown that private riders are more likely to suffer an injury than bike-sharing users, owing to the design of the shared bikes and the accompanying guidance offered to users. Figures I and II show detailed maps of the stations and the corridors connecting them.
As can be seen from Figure III, the maps depict the locations of the stations and corridors. The city is divided into three districts: Kicukiro, Gasabo, and Nyarugenge; bike-sharing is available only in Gasabo and Nyarugenge. Only 9 of the 18 established stations are operational, including 5 stations in the Remera corridor and 4 stations in the Central Business District. Kigali covers an area of around 730 square kilometers and has a population of over 1.2 million.
3.2 Data
Data were gathered from GuraRide from September 12th, 2021, through May 30th, 2022. From a total of 9 active docking stations, we have data on 10,073 bike-share trips. The remaining docking stations were found to be non-functional, so their data were not included in the tally. Eight factors have been used to predict system usage: (1) date, (2) gender, (3) station, (4) corridor, (5) time, (6) fare, (7) age, and (8) education. The date is crucial because it indicates the year and month, whether or not students are on holiday, and how bike-sharing use evolves on a daily basis. Gender identifies whether a user is a man or a woman. The location of a station is crucial, since users are less likely to use a station that is inconveniently situated. System usage also depends on the corridors that connect various types of land use and infrastructure. The amount of time a user spends in the system varies, but prolonged use leads to a decline in productivity. The fare may entice potential users to join the system. Younger individuals are more inclined to ride bikes than elderly people because they have more energy to devote to the activity. Finally, because the system involves logging in and out through applications, education plays a significant role in increasing system utilization.
3.4 Random Forest model
The classification method known as Random Forest is an ensemble approach. It takes a training dataset and uses bootstrapping to generate several decision trees, each trained on a different random subset of all the features. Random Forest has been demonstrated to avoid overfitting and to reduce variance. During the generation of bootstrapped samples (i.e., tree bagging), given a training feature matrix \(Q = [q_1, q_2, \cdots, q_n]^T\), where n is the number of observations and each \(q_i\) is a p-dimensional feature vector, and a response vector \(F = (f_1, f_2, \cdots, f_n)^T\), bagging repeats B times: each round samples the n observations with replacement, subsamples the p features without replacement, and constructs a decision tree on the bootstrapped training data. Typically, \(\sqrt{p}\) features are subsampled to form each of the B trees. Random Forest thus performs both sample bagging and feature bagging, and its final prediction is the majority vote of the classification results of the B trees.
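A minimal sketch of this procedure with scikit-learn's RandomForestClassifier, using synthetic data in place of the trip records (the parameter values shown are illustrative, not the study's configuration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary-classification data standing in for the trip records.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# B = 100 bootstrapped trees; max_features="sqrt" subsamples sqrt(p)
# features at each split (the feature-bagging step described above).
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            bootstrap=True, random_state=0)
rf.fit(X, y)

# The final prediction is the majority vote across the 100 trees.
pred = rf.predict(X[:5])
```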
Given training data \((q_1, f_1), (q_2, f_2), \ldots, (q_n, f_n)\) where \(f_i \in \{0, 1\}\), the class labels are written as \(m_i = 2f_i - 1 \in \{-1, 1\}\), and the initial weight of each data point \(q_i\) is \(M_1(i) = \frac{1}{n}\) for \(i = 1, \ldots, n\). For \(t = 1, \ldots, T\), where T is the total number of iterations, a weak learner \(h_t(q_i) \in \{-1, 1\}\) is trained using the training data and the weights \(M_t\). The classification error is
\(e_t = \sum_{i=1}^{n} M_t(i)\, I\left(h_t(q_i) \ne m_i\right)\), and we select \({\alpha}_t = \frac{1}{2}\ln\left(\frac{1 - e_t}{e_t}\right)\). If a data point was properly classified, its weight decreases in the subsequent iteration; otherwise, its weight increases in the subsequent cycle:
\({M}_{t+1}(i) = \frac{M_t(i)\,\exp\left(-{\alpha}_t\, m_i\, h_t(q_i)\right)}{Z_t}\), where \(Z_t\) is a normalizing constant ensuring \(\sum_{i=1}^{n} M_{t+1}(i) = 1\).
Finally, the prediction is:
\(H(q) = \text{sign}\left(\sum_{t=1}^{T} {\alpha}_t h_t(q)\right)\); the predicted label is 1 if \(H(q)\) is positive and 0 otherwise.
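The weight-update loop above can be sketched in NumPy, with simple threshold "stumps" standing in for the weak learners \(h_t\) (a simplified one-dimensional illustration, not the study's code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
q = rng.normal(size=n)                                  # 1-D features
m = np.where(q + 0.3 * rng.normal(size=n) > 0, 1, -1)   # noisy labels in {-1, 1}

M = np.full(n, 1.0 / n)               # initial weights M_1(i) = 1/n
alphas, stumps = [], []
for t in range(10):                   # T = 10 iterations
    # Weak learner: the threshold stump with the lowest weighted error.
    thresholds = np.linspace(q.min(), q.max(), 50)
    errs = [np.sum(M * (np.where(q > th, 1, -1) != m)) for th in thresholds]
    th = thresholds[int(np.argmin(errs))]
    h = np.where(q > th, 1, -1)
    e = np.sum(M * (h != m))
    alpha = 0.5 * np.log((1 - e) / max(e, 1e-10))
    # Misclassified points gain weight; correct ones lose weight.
    M = M * np.exp(-alpha * m * h)
    M /= M.sum()                      # Z_t normalization
    alphas.append(alpha)
    stumps.append(th)

# Final prediction: the sign of the weighted vote over all weak learners.
H = np.sign(sum(a * np.where(q > th, 1, -1) for a, th in zip(alphas, stumps)))
accuracy = float(np.mean(H == m))
```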
3.4.1 Train-test split, label encoding and feature selection
The train-test split method is used to evaluate machine learning algorithms. It divides a dataset into two groups: a training set, used to train and fit the model, and a test set, used to test it. The model is first fitted to data whose inputs and outputs are already known; the algorithm then makes predictions for the held-out subset, and these predictions are compared with the known outputs. Scikit-learn is a Python library that provides many methods for clustering, model selection, and classification. Its model-selection module supplies the train-test-split function, which divides the data into the two subsets automatically instead of by hand; by default, it uses random partitioning. This treatment is widely used because it is quick and easy, and it helps obtain accurate results by selecting the right model before applying it to new data.
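A minimal example of this split with scikit-learn, using synthetic data (the 25% test share mirrors the split used later in Section 3.4.6):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the trip dataset.
X, y = make_classification(n_samples=100, random_state=0)

# 75% of rows train the model; 25% are held out for testing.
# Random partitioning is the default behaviour described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)
```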
Machine learning datasets typically have column labels, which consist of words or numbers. Labeling training data with words aids comprehension, but algorithms require numbers, so label encoding translates the labels into a machine-readable form. The label-encoding stage is a component of supervised learning models: the label encoder assigns each class in the dataset a unique number beginning with zero. This creates training-priority concerns, since a model may prefer high-valued labels over low-valued ones even though the numbers carry no inherent order.
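For example, with scikit-learn's LabelEncoder (the corridor names come from the study; the sequence itself is illustrative):

```python
from sklearn.preprocessing import LabelEncoder

# Illustrative categorical column; LabelEncoder assigns integers from 0
# upward in alphabetical order of the class names.
corridors = ["CBD", "REMERA", "CBD", "REMERA", "CBD"]
encoder = LabelEncoder()
encoded = encoder.fit_transform(corridors)
# encoder.classes_ recovers the original class name for each integer.
```

The integer codes carry no inherent order, which is the source of the training-priority concern noted above.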
The technique of feature selection was used to determine the important attributes with the strongest association to the target label. Irrelevant attributes hinder model performance, so features should be selected before the model is built. Feature selection lowers overfitting, increases accuracy, and saves training time. In this research, feature selection uses the chi-square score to identify the pertinent attributes.
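A sketch of chi-square feature selection with scikit-learn's SelectKBest (synthetic non-negative data; k = 4 is an arbitrary illustrative choice):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X = np.abs(X)        # the chi2 score requires non-negative feature values

# Keep the 4 features with the highest chi-square scores against y.
selector = SelectKBest(score_func=chi2, k=4)
X_selected = selector.fit_transform(X, y)
```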
3.4.2 Data preparation and preprocessing
We used data from the GuraRide dataset. Some values in the dataset were missing, so null values were removed; during this stage, the data were also cleaned and wrongly entered values corrected. There were 1,254 non-students, 30 full-time employees, 48 part-time employees, and 68 self-employed individuals, alongside 8,673 students. Education values were recoded to Yes for students and No for all non-students, and the Education column was replaced by the Bike Sharing Student column. After the transformation, the frequencies in Table I changed accordingly: across the 10,073 trips, 8,673 were made by students (Yes cases) and 1,400 by non-students (No cases).
Table I: Python output: data preparation and preprocessing
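The cleaning and recoding steps can be sketched with pandas; the toy frame below mirrors the description above, but the rows and column names are illustrative, not the actual GuraRide export:

```python
import pandas as pd

# Toy frame standing in for the GuraRide data.
df = pd.DataFrame({
    "Education": ["Student", "Full-time", "Student", None, "Self-employed"],
    "Fare": [200, 150, None, 300, 250],
})

df = df.dropna()      # remove rows containing null values

# Recode Education into the binary "Bike Sharing Student" column:
# Yes for students, No for everyone else.
df["Bike Sharing Student"] = df["Education"].map(
    lambda v: "Yes" if v == "Student" else "No")
df = df.drop(columns=["Education"])
```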
Since it was necessary to distinguish 2021 data from 2022 data, the date field was modified to include year and month columns, as shown in Table II.
Table II: Python Output: Data separation (year 2021 and year 2022)
On an annual basis, non-students (No cases) made 136 bike-sharing trips in 2021 while students (Yes cases) made 3,543. In 2022, non-students traveled 1,264 times and students traveled 5,130 times.
3.4.3 Data processing
According to Table III and Figure IV, far fewer non-students than students used bike-sharing. Student behavior was evaluated by station and time for the year 2021.
Table III: Data 2021-2022

| Months | Counts |
|---|---|
| Data of year 2021 | |
| 9 | 555 |
| 10 | 784 |
| 11 | 1286 |
| 12 | 918 |
| Data of year 2022 | |
| 1 | 1491 |
| 2 | 1128 |
| 3 | 678 |
| 4 | 835 |
| 5 | 998 |
Additionally, Table III and Figure IV show how bike-sharing use changed over time. Rwanda has four seasons: a long rainy season, a long dry season, a short rainy season, and a short dry season. The leading month in 2021 was November, the last month of the short rainy season, and the leading month in 2022 was January, the second month of the short dry season. In both months the students were at school, the main reason for the large number of bike-sharing users.
3.4.4 Key performance indicators
Confusion Matrix and Hyperparameter Tuning
For classification problems with multiple output class labels, one common performance indicator is the confusion matrix: a table with one cell for each combination of predicted and actual values. Precision, recall, and f1-score are determined from the confusion matrix. Each technique also has settings that may be tweaked to improve it.
Algorithms are first trained using randomized parameters and their performance is assessed. After the parameters have been fine-tuned to maximize accuracy, they are employed in the final machine learning algorithm to make predictions on the test data.
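This tuning step can be sketched with scikit-learn's GridSearchCV (the candidate parameter values are illustrative, not the grid used in the study):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=0)

# Refit the model for every parameter combination and keep the one
# with the highest cross-validated accuracy.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="accuracy", cv=3)
grid.fit(X, y)
best = grid.best_params_
```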
- Accuracy: the proportion of predictions, across all classes, that match the actual classes. The formula for this is:
$$\frac{TP \left(True Positive\right)+TN \left(True Negative\right)}{TP \left(True Positive\right)+TN \left(True Negative\right)+FN \left(False Negative\right)+FP\left(False Positive\right)}$$
- Precision: the proportion of predicted positive cases that are actually positive. It is determined by: \(\frac{TP \left(True Positive\right)}{TP \left(True Positive\right)+FP\left(False Positive\right)}\)
- Recall: also known as sensitivity, recall is the proportion of actual positive cases that are correctly identified. The formula is: \(\frac{TP\left(TruePositive\right)}{TP\left(TruePositive\right)+FN\left(FalseNegative\right)}\)
- F1-score: the harmonic mean of precision and recall, so it accounts for both false positives and false negatives. The formula for this is: \(2\times\frac{recall \times precision}{recall + precision}\)
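The four metrics above follow directly from the confusion-matrix counts; with illustrative counts (not the study's results):

```python
# Illustrative confusion-matrix counts.
TP, TN, FP, FN = 90, 80, 10, 20

accuracy = (TP + TN) / (TP + TN + FP + FN)   # correct over all predictions
precision = TP / (TP + FP)                   # predicted positives that are right
recall = TP / (TP + FN)                      # actual positives that are found
f1 = 2 * recall * precision / (recall + precision)
```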
3.4.5 Chi-square test
The Chi-square test is used to examine relationships in the data. The test of independence compares the actual pattern of cell counts with the pattern expected if the variables were independent. The Chi-square statistic is compared to a critical value from the Chi-square distribution to determine whether the actual and expected cell counts differ significantly. The table below lists the variables tested at the 95% confidence level.
Table IV: Chi-square test results

| Variables | P-Value | Decision |
|---|---|---|
| 1. Gender versus bike-sharing | 0.00087479 | Reject the null hypothesis: there is evidence of an association between gender and bike-sharing. |
| 2. Station versus bike-sharing | 0.0000000000 | Reject the null hypothesis: there is evidence of an association between station and bike-sharing. |
| 3. Corridor versus bike-sharing | 0.000 | Reject the null hypothesis: there is evidence of an association between corridor and bike-sharing. |
| 4. Time versus bike-sharing | 0.3951566 | Fail to reject the null hypothesis: there is no evidence of an association between time and bike-sharing. |
| 5. Fare versus bike-sharing | -0.08590316 | Reject the null hypothesis: there is evidence of an association between fare and bike-sharing. |
| 6. Age versus bike-sharing | 0.0395898 | Reject the null hypothesis: there is evidence of an association between age and bike-sharing. |
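The independence tests reported in Table IV can be reproduced with SciPy's chi2_contingency; the contingency table below is hypothetical, not the study's data:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: gender (rows) versus student status (columns).
observed = np.array([[520, 130],
                     [310, 140]])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
# Reject the null hypothesis of independence when p_value < 0.05.
associated = p_value < 0.05
```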
3.4.6 Building model
Table V presents a sample of the dataset after applying dummy variables and balancing the classes.
Table V: model construction
This dataset contains 8,400 No cases (non-students) and 8,673 Yes cases (students), for a final dataset of 17,073 observations. 25% of the data were used for model testing and the remaining samples for model training. A Random Forest (RF) model was applied and evaluated, and the results are presented in terms of f1-score, recall, precision, and accuracy.
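The dummy-variable encoding and class balancing can be sketched with pandas and scikit-learn (toy frame with illustrative column names; upsampling with replacement is one common balancing approach, and the study's exact method may differ):

```python
import pandas as pd
from sklearn.utils import resample

# Toy imbalanced frame.
df = pd.DataFrame({
    "Corridor": ["CBD", "REMERA", "CBD", "CBD", "REMERA", "CBD"],
    "Student":  ["Yes", "Yes", "Yes", "Yes", "No", "No"],
})

# One-hot (dummy) encoding of the categorical predictor.
dummies = pd.get_dummies(df, columns=["Corridor"])

# Upsample the minority "No" class to match the majority class size.
majority = dummies[dummies["Student"] == "Yes"]
minority = dummies[dummies["Student"] == "No"]
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=0)
balanced = pd.concat([majority, minority_up])
```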
3.4.7 Discussion of the results
Table VI shows that the random forest model correctly identifies 2,001 of the 2,105 student trips as "Yes", i.e., it classifies 95% of student trips correctly. The same model classifies 1,570 of the 2,164 non-student trips as "No", a classification accuracy of 73% for non-student trips. Overall, the model's accuracy across both categories is 84%.
Table VI: Classification results

| Observations \ Predictions | Student trips (Yes) | Non-student trips (No) | % correct |
|---|---|---|---|
| Student trips (Yes) | 2001 | 104 | 95% |
| Non-student trips (No) | 594 | 1570 | 73% |

Overall accuracy: 84%
For the developed model, Table VII summarizes the performance metrics: the f1-score on student trips is 81.8%, with an overall accuracy across student and non-student trips of 84%.
Table VII: Model performance

| | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Student trips (Yes) | 72.60% | 93.80% | 81.80% | 84% |
| Non-student trips (No) | 95.10% | 60.50% | 73.90% | |
Table VII details the outcomes of the machine learning experiment. We used a random forest (RF) whose parameters were tuned, and the method reports the parameter values that produce the best results. The table shows accuracy, precision, recall, and f1-score for each category of person accessing the bike-sharing service (student or other), along with the overall accuracy. The results show that the model's performance is high.
Table VIII illustrates the relationship between independent variables and a dependent variable.
Table VIII: Independent variable coefficients

| Criteria | Values | Coefficients |
|---|---|---|
| GENDER | F | 0.26047474 |
| | M | 0.35092256 |
| STATIONS | ARENA | 0.84465993 |
| | CHUK | 0.10064695 |
| | ENGEN | 0.63766107 |
| | I&M BANK | 0.40591029 |
| | MARRIOT | -1.42763083 |
| | MTN | 0.11908014 |
| | NORSKEN | -1.03437044 |
| | REB | 0.10644164 |
| | SERENA | 0.85899855 |
| CORRIDOR | CBD | 1.99499702 |
| | REMERA | -1.38359972 |
| FARE | | -0.00234912 |
| AGE | | 0.00750487 |
When it comes to likelihood of using the bike-share program, there is little distinction between male and female students. Users of the Arena and Serena stations are more likely to be students than users of any other station, followed by the ENGEN station. Customers at the MARRIOT and NORSKEN stations are less likely to be students. Central Business District corridor users tend to be students, while Remera corridor users tend to be others.