Machine learning studies of fall detection must first frame the problem according to the desired outcome. In this study, fall detection is treated as a binary classification problem: deciding between daily activities and fall events. Such classification problems can be addressed with supervised machine learning methods, which were preferred here because the data are labeled, and models developed with labeled data tend to yield more successful results.
When training an artificial intelligence model, the following steps are typically followed: problem identification, data acquisition, data analysis, data preprocessing, creation of a training dataset, model training, testing the model with a test dataset, and evaluating it against performance criteria. The training process is completed by selecting the most successful model (Fig. 4).
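As a hedged illustration of this workflow (the dataset, split, and classifier below are placeholders, and the study itself splits by volunteer rather than randomly, as described in Section 3.1.3), the steps map onto a scikit-learn skeleton such as:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Data acquisition and preprocessing (random placeholders here; the real
# study builds 910-column rows from labeled IMU recordings).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 909))
y = rng.integers(0, 2, size=500)  # 1 = fall, 0 = daily activity

# Creation of training and test datasets (the study holds out whole
# volunteers instead of using a random split like this one).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Model training on the training dataset.
model = AdaBoostClassifier().fit(X_train, y_train)

# Testing and evaluation against performance criteria; repeating this for
# several algorithms and keeping the best completes the workflow of Fig. 4.
print(classification_report(y_test, model.predict(X_test)))
```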
3.1 Datasets
The fall detection system used in this study is based on data obtained from two different brands of IMU worn by volunteers. The first dataset was collected with the commercial MTw IMU and is publicly available [21]. The second IMU dataset was collected with an Activity Tracking Device (ATD) designed to detect falls and daily activities. The two IMU datasets were collected from different groups of volunteers with different physical characteristics.
3.1.1 MTw Dataset
The dataset consists of daily life activities (DLAs) and fall events captured by an Xsens sensor placed on the lower back. The experimental protocol proposed in [28] was adopted for the DLAs and fall events. The experiments were conducted by 14 volunteers, 7 female and 7 male. The female volunteers are aged 19–24, with weights of 47–70 kg and heights of 157–182 cm; the male volunteers are aged 21–27, with weights of 54–81 kg and heights of 160–184 cm (Table 5).
Table 5
Anthropometric and physical information of volunteers for MTw.
Gender | Volunteer Code | Height (cm) | Weight (kg) | Age
Male | 101 | 170 | 75 | 21
Male | 102 | 174 | 81 | 21
Male | 103 | 180 | 78 | 23
Male | 104 | 176 | 67 | 27
Male | 106 | 160 | 54 | 22
Male | 107 | 175 | 72 | 21
Male | 108 | 184 | 68 | 21
Female | 203 | 170 | 51 | 21
Female | 204 | 157 | 47 | 21
Female | 205 | 169 | 51 | 20
Female | 206 | 166 | 47 | 19
Female | 207 | 165 | 60 | 20
Female | 208 | 163 | 55 | 24
Female | 209 | 182 | 70 | 22
All volunteers | Average | 170.79 | 62.57 | 21.64
All volunteers | Standard Deviation | 8.17 | 11.82 | 1.98
The MTw dataset consists of 2520 records (14 volunteers x 5 repetitions x (20 falls + 16 DLAs)): each of the 14 volunteers performed the 20 fall activities and 16 DLAs five times.
3.1.2 ATD Dataset
The activities and fall events in the ATD dataset were determined based on the activity types collected with the MTw. The experiments were conducted with 30 volunteers, 15 female and 15 male. The female volunteers were aged 18–41 years, weighed 45–73 kg, and measured 156–175 cm in height; the male volunteers were aged 18–50 years, weighed 60–100 kg, and measured 168–192 cm (Table 6).
Table 6
Anthropometric and physical information of volunteers for ATD.
Gender | Volunteer Code | Height (cm) | Weight (kg) | Age
Male | 101 | 179 | 100 | 39
Male | 102 | 174 | 96 | 37
Male | 103 | 191 | 69 | 18
Male | 104 | 178 | 86 | 20
Male | 105 | 175 | 89 | 18
Male | 106 | 168 | 93 | 50
Male | 107 | 170 | 62 | 21
Male | 108 | 168 | 90 | 37
Male | 109 | 182 | 75 | 20
Male | 110 | 192 | 60 | 18
Male | 111 | 178 | 76 | 34
Male | 112 | 176 | 71 | 21
Male | 113 | 170 | 87 | 41
Male | 114 | 174 | 81 | 37
Male | 115 | 180 | 67 | 22
Female | 201 | 163 | 50 | 34
Female | 202 | 169 | 69 | 19
Female | 203 | 166 | 53 | 21
Female | 204 | 162 | 62 | 19
Female | 205 | 156 | 73 | 41
Female | 206 | 172 | 70 | 18
Female | 207 | 163 | 45 | 19
Female | 208 | 165 | 53 | 19
Female | 209 | 162 | 59 | 23
Female | 210 | 169 | 73 | 21
Female | 211 | 159 | 48 | 21
Female | 212 | 175 | 63 | 19
Female | 213 | 157 | 64 | 20
Female | 214 | 168 | 63 | 22
Female | 215 | 160 | 60 | 18
All volunteers | Average | 170.70 | 70.23 | 25.57
All volunteers | Standard Deviation | 9.01 | 14.73 | 9.32
The data obtained from the volunteers were used for binary classification in fall detection; falls were labeled as 1 and activities as 0. The ATD dataset consists of 1350 records (30 volunteers x 3 repetitions x 15 activity types): each of the 30 volunteers performed the 15 activity types (falls and DLAs) three times.
3.1.3 Hybrid Dataset and Data Formation
Both datasets were collected at 25 Hz, so they have equal resolution. Each activity is represented by a two-dimensional matrix of 101 rows and 9 columns of sensor-axis data: accelerometer (Ax, Ay, Az), magnetometer (Mx, My, Mz), and gravity acceleration (Gx, Gy, Gz). Each activity recording (DLA or fall) lasts 4 seconds, which at 25 Hz yields the 101 samples; each activity is therefore a 101x9 matrix of sensor data plus one class column (Fig. 5).
All activities are combined into a single dataset of 20 falls + 16 DLAs = 36 activity types. With 14 volunteers each performing 5 repetitions of every activity, the data form a matrix of 2520 rows. Each 4-second recording collected at 25 Hz is flattened so that every sample of every axis becomes a feature, giving 910 columns (101 samples x 9 axes = 909 features plus one label). As a result, each activity is represented in one row of a two-dimensional 2520x910 matrix, which allows the time-series data to be modeled easily with machine learning algorithms.
Each activity thus occupies 1 row with 910 columns: 909 sensor-data columns and 1 label column (Ax0, Ay0, Az0, Gx0, Gy0, Gz0, Mx0, My0, Mz0, ..., Ax100, Ay100, Az100, Gx100, Gy100, Gz100, Mx100, My100, Mz100, and the class label). The MTw dataset, obtained from two different IMUs, has dimensions of 2520x910, while the ATD dataset, obtained from a single IMU, has dimensions of 1350x910. Both datasets are organized with a time-series approach.
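As an illustrative sketch of this data formation (the helper name and the synthetic window below are assumptions, not the study's code), a 101x9 sensor window is flattened sample by sample into one 910-column row:

```python
import numpy as np
import pandas as pd

# Axis order matching the Ax0, Ay0, ..., Mz100 column layout described above.
AXES = ["Ax", "Ay", "Az", "Gx", "Gy", "Gz", "Mx", "My", "Mz"]

def flatten_activity(window: np.ndarray, label: int) -> pd.Series:
    """Turn one (101, 9) activity window into a single 910-column row."""
    assert window.shape == (101, 9)
    names = [f"{axis}{t}" for t in range(101) for axis in AXES]
    row = pd.Series(window.reshape(-1), index=names)  # row-major: sample 0 first
    row["Class"] = label  # 1 = fall, 0 = daily activity
    return row

# Example: one synthetic fall recording (random values stand in for real data).
rng = np.random.default_rng(0)
row = flatten_activity(rng.normal(size=(101, 9)), label=1)
print(row.shape)  # (910,) -> 909 features + 1 label
```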
For the machine learning approach, the datasets are divided into train and test sets. The MTw training set consists of data collected from 10 volunteers (1800x910), while the data collected from the remaining 4 volunteers (720x910) form the MTw_test dataset (Fig. 4). The ATD dataset is divided into ATD_train, consisting of data from 22 volunteers (990x910), and ATD_test, consisting of data from 8 volunteers (360x910). It is crucial that no individual's data appears in both the training and testing phases, as this ensures the reliability of the tests.
After this separation, the training portions and the test portions are combined within their respective groups to obtain the hybrid_train dataset (2790x910) and the hybrid_test dataset (1080x910).
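A minimal sketch of this subject-wise split and merge, using synthetic stand-in frames and placeholder volunteer IDs (the actual held-out volunteers are not identified in the text), might look like this:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

def make_dataset(volunteers, rows_per_volunteer, n_features=909):
    """Build a synthetic stand-in frame: one activity row per record."""
    frames = []
    for v in volunteers:
        f = pd.DataFrame(rng.normal(size=(rows_per_volunteer, n_features)))
        f["Class"] = rng.integers(0, 2, size=rows_per_volunteer)
        f["volunteer"] = v
        frames.append(f)
    return pd.concat(frames, ignore_index=True)

# 14 MTw volunteers x 180 rows (36 activities x 5 reps); 30 ATD x 45 (15 x 3).
mtw = make_dataset([f"M{i}" for i in range(14)], rows_per_volunteer=180)
atd = make_dataset([f"A{i}" for i in range(30)], rows_per_volunteer=45)

def split_by_volunteer(df, test_ids):
    mask = df["volunteer"].isin(test_ids)
    return df[~mask], df[mask]

mtw_train, mtw_test = split_by_volunteer(mtw, {"M10", "M11", "M12", "M13"})
atd_train, atd_test = split_by_volunteer(atd, {f"A{i}" for i in range(22, 30)})

# Merging per split keeps each volunteer entirely in train OR test (no leakage).
hybrid_train = pd.concat([mtw_train, atd_train], ignore_index=True)
hybrid_test = pd.concat([mtw_test, atd_test], ignore_index=True)
print(hybrid_train.shape, hybrid_test.shape)  # (2790, 911) (1080, 911) incl. ID column
```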
3.2 Performance Evaluation
Before the datasets were merged, each training set was used to train various machine learning algorithms with the aim of determining the most successful ones, and the resulting models were applied to the respective test sets. After the datasets were merged, the hybrid model was trained and predictions were made with it. The performance metrics were analyzed extensively, including success rates on the cross-brand test sets and on the hybrid test set, and the performance of the hybrid model on each brand's sensor data was evaluated comparatively.
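As a sketch of this comparison loop (the tables in this section resemble the output of an automated model-comparison tool; the plain scikit-learn loop below is an assumed equivalent, run on random stand-in data), several candidate classifiers can be trained and scored with the same five metrics the tables report:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, ExtraTreesClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score, recall_score, precision_score, f1_score

# Random stand-ins for any of the 910-column train/test matrices above.
rng = np.random.default_rng(2)
X_train, y_train = rng.normal(size=(200, 909)), rng.integers(0, 2, 200)
X_test, y_test = rng.normal(size=(80, 909)), rng.integers(0, 2, 80)

models = {
    "Ada Boost Classifier": AdaBoostClassifier(),
    "Extra Trees Classifier": ExtraTreesClassifier(),
    "Gradient Boosting Classifier": GradientBoostingClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # class-1 probability for AUC
    print(name,
          f"Acc={accuracy_score(y_test, pred):.4f}",
          f"AUC={roc_auc_score(y_test, proba):.4f}",
          f"Recall={recall_score(y_test, pred):.4f}",
          f"Precision={precision_score(y_test, pred):.4f}",
          f"F1={f1_score(y_test, pred):.4f}")
```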
3.2.1 Evaluation of MTw Dataset
The dataset obtained from the MTw sensor was transformed into a matrix of size 2520x910 and analyzed using a time-series approach. Data from 10 volunteers were used for training, and data from 4 volunteers were reserved for testing. The training studies identified the Ada Boost algorithm as the most successful model. The confusion matrix used for performance evaluation is shown in Fig. 6; in this matrix, 0.0 represents Activities of Daily Living (ADL) and 1.0 represents fall activities. The Ada Boost classifier correctly classified 397 of the 400 fall activities as falls, incorrectly predicting 3 of them as ADL, and correctly classified 318 of the 320 ADL activities, incorrectly predicting 2 of them as falls (Fig. 6). These results indicate that the Ada Boost algorithm has a high success rate.
These high performance values are an accuracy of 99.31%, an area under the curve (AUC) of 99.94%, a recall of 99.38%, a precision of 99.07%, and an F1-score of 99.22%. The classification performance of the model trained with the MTw dataset on the held-out test dataset is presented in Table 7.
Table 7
Performance metrics of models trained with MTw dataset and tested on MTw test dataset
Model | Accuracy | AUC | Recall | Precision | F1 |
Ada Boost Classifier | 0.9931 | 0.9994 | 0.9938 | 0.9907 | 0.9922 |
Extra Trees Classifier | 0.9917 | 0.9997 | 0.9875 | 0.9937 | 0.9906 |
Extreme Gradient Boosting | 0.9917 | 0.9998 | 0.9938 | 0.9876 | 0.9907 |
Logistic Regression | 0.9903 | 0.9996 | 0.9875 | 0.9906 | 0.9890 |
Light Gradient Boosting Machine | 0.9889 | 0.9998 | 0.9906 | 0.9845 | 0.9875 |
Gradient Boosting Classifier | 0.9847 | 0.9995 | 0.9875 | 0.9783 | 0.9829 |
K Neighbors Classifier | 0.9847 | 0.9963 | 0.9781 | 0.9874 | 0.9827 |
Ridge Classifier | 0.9750 | 0.9728 | 0.9531 | 0.9903 | 0.9713 |
Linear Discriminant Analysis | 0.9736 | 0.9891 | 0.9500 | 0.9902 | 0.9697 |
Decision Tree Classifier | 0.9736 | 0.9734 | 0.9719 | 0.9688 | 0.9704 |
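For reference, the metrics reported in Table 7 and in the tables that follow are derived from the confusion matrix counts, with falls (label 1) taken as the positive class. Writing TP, TN, FP, and FN for true positives, true negatives, false positives, and false negatives:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
F1 = 2 x (Precision x Recall) / (Precision + Recall)

AUC is the area under the ROC curve, computed from the model's predicted class probabilities rather than its hard predictions.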
These 10 algorithms trained on the MTw dataset were then used to classify the ATD test dataset, so that the performance of models trained on one brand's sensor data could be observed on normalized data from the other brand. The results are presented in Table 8.
Table 8
Classification performance of models trained with MTw dataset tested on ATD test dataset
Model | Accuracy | AUC | Recall | Precision | F1 |
Gradient Boosting Classifier | 0.5333 | 0.4495 | 0.0000 | 0.0000 | 0.0000 |
Extreme Gradient Boosting | 0.5333 | 0.3129 | 0.0000 | 0.0000 | 0.0000 |
Ada Boost Classifier | 0.5167 | 0.2809 | 0.0000 | 0.0000 | 0.0000 |
Decision Tree Classifier | 0.4861 | 0.4788 | 0.3690 | 0.4397 | 0.4013 |
SVM - Linear Kernel | 0.4778 | 0.4881 | 0.6429 | 0.4576 | 0.5347 |
Linear Discriminant Analysis | 0.4750 | 0.4760 | 0.5655 | 0.4502 | 0.5013 |
Ridge Classifier | 0.4583 | 0.4647 | 0.5595 | 0.4372 | 0.4909 |
Logistic Regression | 0.4333 | 0.4473 | 0.6607 | 0.4302 | 0.5211 |
Extra Trees Classifier | 0.4167 | 0.4429 | 0.5595 | 0.4087 | 0.4724 |
Light Gradient Boosting Machine | 0.3694 | 0.2222 | 0.5952 | 0.3861 | 0.4684
Although both devices belong to the same type of sensor group, the models trained with the MTw system failed to recognize the data obtained from the ATD system. According to the performance criteria, the most successful model was the Gradient Boosting Classifier, with an accuracy of 53.33%, an AUC of 44.95%, and a recall, precision, and F1-score of 0%, meaning it failed to correctly identify a single instance of the positive class.
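The cross-brand test itself is mechanically simple; the sketch below (with random stand-in matrices, not the study's data) fits a model on one brand's training matrix and scores it on the other brand's test matrix, which is only meaningful because both share the same 909-column feature layout:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, recall_score, precision_score

# Random arrays stand in for the real 1800x909 MTw training matrix and the
# 360x909 ATD test matrix; columns follow the same Ax0..Mz100 layout.
rng = np.random.default_rng(3)
X_mtw_train, y_mtw_train = rng.normal(size=(300, 909)), rng.integers(0, 2, 300)
X_atd_test, y_atd_test = rng.normal(size=(60, 909)), rng.integers(0, 2, 60)

model = GradientBoostingClassifier().fit(X_mtw_train, y_mtw_train)
pred = model.predict(X_atd_test)
print("accuracy:", accuracy_score(y_atd_test, pred))
# Recall and precision of 0, as in Table 8, would mean the model never
# correctly flags a positive-class instance on the foreign sensor's data.
print("recall:", recall_score(y_atd_test, pred, zero_division=0))
print("precision:", precision_score(y_atd_test, pred, zero_division=0))
```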
The classification performance of the models trained on the MTw training dataset, on a hybrid test dataset consisting of both MTw and ATD test data, is shown in Table 9.
Table 9
Classification performance of models trained with MTw dataset tested on hybrid test dataset.
Model | Accuracy | AUC | Recall | Precision | F1 |
Extreme Gradient Boosting | 0.8389 | 0.9093 | 0.6516 | 0.9876 | 0.7852 |
Ada Boost Classifier | 0.8343 | 0.8855 | 0.6516 | 0.9725 | 0.7804 |
Gradient Boosting Classifier | 0.8343 | 0.9291 | 0.6475 | 0.9783 | 0.7793 |
Decision Tree Classifier | 0.8111 | 0.8070 | 0.7643 | 0.8074 | 0.7853 |
Linear Discriminant Analysis | 0.8074 | 0.7686 | 0.8176 | 0.7703 | 0.7932 |
Logistic Regression | 0.8046 | 0.7036 | 0.8750 | 0.7400 | 0.8019 |
SVM - Linear Kernel | 0.8083 | 0.8113 | 0.8422 | 0.7597 | 0.7988 |
Ridge Classifier | 0.8028 | 0.8041 | 0.8176 | 0.7629 | 0.7893 |
Extra Trees Classifier | 0.8000 | 0.9336 | 0.8402 | 0.7482 | 0.7915 |
Light Gradient Boosting Machine | 0.7824 | 0.9078 | 0.8545 | 0.7177 | 0.7802 |
On the hybrid test set, the larger number of samples from the MTw dataset resulted in improved prediction accuracy for the MTw test data, while significant errors remained in recognizing the ATD data. The Ada Boost Classifier, with an accuracy of 83.43%, an AUC of 88.55%, a recall of 65.16%, a precision of 97.25%, and an F1-score of 78.04%, was among the most successful algorithms. A comparison of the performance of the Ada Boost Classifier trained on the MTw dataset across the test datasets is presented in Table 10.
Table 10
Performance of the Ada Boost Classifier model trained with the MTw dataset, tested on the test datasets
Model | Test Set | Accuracy | AUC | Recall | Precision | F1 |
Ada Boost Classifier | MTw | 0.9931 | 0.9994 | 0.9938 | 0.9907 | 0.9922 |
Ada Boost Classifier | ATD | 0.5167 | 0.2809 | 0.0000 | 0.0000 | 0.0000 |
Ada Boost Classifier | Hybrid | 0.8343 | 0.8855 | 0.6516 | 0.9725 | 0.7804
3.2.2 Evaluation of ATD Dataset
The ATD dataset was transformed into a matrix of size 1350x910 to enable the time-series approach. It was obtained from a total of 30 volunteers: data from 22 volunteers were used for training, while data from the remaining 8 volunteers were reserved for testing.
At the conclusion of the training process, the Extra Trees algorithm was determined to be the most successful. The confusion matrix in Fig. 7 illustrates the performance achieved on the test data; in it, a value of 0.0 represents daily life activities (DLA), while a value of 1.0 represents falls. With a 100% success rate, the model correctly predicted all 192 instances of falls and all 168 instances of daily life activities (Fig. 7). These results indicate an exceptionally high level of accuracy; however, they also raise the suspicion that the model may simply have memorized the data, putting the reliability of the system in question. The performances of the other trained algorithms can be examined in Table 11.
Table 11
Performance of algorithms trained with ATD dataset tested on ATD test data.
Model | Accuracy | AUC | Recall | Precision | F1 |
Extra Trees Classifier | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
K Neighbors Classifier | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
Logistic Regression | 0.9972 | 1.0000 | 0.9940 | 1.0000 | 0.9970 |
Light Gradient Boosting Machine | 0.9944 | 0.9999 | 0.9881 | 1.0000 | 0.9940
Ada Boost Classifier | 0.9944 | 0.9997 | 0.9940 | 0.9940 | 0.9940 |
Gradient Boosting Classifier | 0.9944 | 0.9996 | 0.9881 | 1.0000 | 0.9940
Extreme Gradient Boosting | 0.9944 | 0.9997 | 0.9881 | 1.0000 | 0.9940
Naive Bayes | 0.9944 | 0.9998 | 0.9940 | 0.9940 | 0.9940 |
SVM - Linear Kernel | 0.9944 | 0.9940 | 0.9881 | 1.0000 | 0.9940
Decision Tree Classifier | 0.9889 | 0.9885 | 0.9821 | 0.9940 | 0.9880 |
These models, which performed very well when trained and tested within the ATD dataset, gave unsuccessful results in classifying the MTw test data, as shown in Table 12.
Table 12
Performance of algorithms trained with ATD dataset tested on MTw test data.
Model | Accuracy | AUC | Recall | Precision | F1 |
Decision Tree Classifier | 0.6213 | 0.5991 | 0.3689 | 0.6406 | 0.4681 |
Ada Boost Classifier | 0.6194 | 0.5350 | 0.3730 | 0.6341 | 0.4697 |
Extreme Gradient Boosting | 0.6167 | 0.4727 | 0.3709 | 0.6285 | 0.4665 |
Light Gradient Boosting Machine | 0.6139 | 0.5567 | 0.3709 | 0.6220 | 0.4647 |
Gradient Boosting Classifier | 0.6083 | 0.5608 | 0.3709 | 0.6094 | 0.4611 |
Extra Trees Classifier | 0.5769 | 0.5947 | 0.3730 | 0.5465 | 0.4434 |
Linear Discriminant Analysis | 0.4361 | 0.4468 | 0.5020 | 0.4010 | 0.4459 |
Ridge Classifier | 0.4361 | 0.4388 | 0.4672 | 0.3951 | 0.4282 |
Logistic Regression | 0.4139 | 0.1989 | 0.3852 | 0.3608 | 0.3726 |
K Neighbors Classifier | 0.4009 | 0.4012 | 0.3689 | 0.3468 | 0.3575 |
Due to the imbalance in the data distribution, the hybrid dataset contains fewer ATD samples, which decreases the accuracy of predicting the ATD test data within the hybrid dataset. The Decision Tree emerged as the most successful algorithm, with an accuracy of 62.13%, an AUC of 59.91%, a recall of 36.89%, a precision of 64.06%, and an F1-score of 46.81%. The K-Neighbors algorithm's accuracy of 40.09% was the lowest performance, supporting the suspicion of memorization during modeling.
The performance of the most successful model, the Extra Trees Classifier, on the other datasets can be compared further in Table 14. However, there remains a suspicion that the model memorized aspects of the ATD dataset during training; a volunteer-grouped cross-validation, sketched after Table 14, is one way to probe this.
Table 14
Performance of the Extra Trees Classifier model trained with the ATD dataset, tested on the test datasets.
Model | Test Set | Accuracy | AUC | Recall | Precision | F1 |
Extra Trees Classifier | ATD | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
Extra Trees Classifier | MTw | 0.3653 | 0.1159 | 0.0438 | 0.0848 | 0.0577 |
Extra Trees Classifier | Hybrid | 0.5769 | 0.5947 | 0.3730 | 0.5465 | 0.4434
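As a hedged illustration of the check suggested above (the shapes and the 45-recordings-per-volunteer grouping follow the dataset description; the random feature values are placeholders), volunteer-grouped cross-validation with scikit-learn's GroupKFold never evaluates a model on a subject it was trained on:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

# Random arrays stand in for the 1350x909 ATD feature matrix and its labels.
rng = np.random.default_rng(4)
X = rng.normal(size=(1350, 909))
y = rng.integers(0, 2, 1350)
groups = np.repeat(np.arange(30), 45)  # 30 volunteers x 45 recordings each

# Each fold holds out whole volunteers, so perfect scores cannot come from
# memorizing a subject's own recordings.
scores = cross_val_score(ExtraTreesClassifier(), X, y,
                         groups=groups, cv=GroupKFold(n_splits=5),
                         scoring="accuracy")
print("per-fold accuracy:", scores.round(4))
# Consistently near-perfect fold scores would strengthen the 100% result;
# a large train/validation gap would support the memorization concern.
```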
3.2.3 Evaluation of the Hybrid Dataset
Data obtained from the two different brands of inertial sensors were combined, without mixing the train and test partitions, to create the hybrid dataset. The training phase used data from 10 MTw volunteers and 22 ATD volunteers; for testing, data from 4 MTw volunteers and 8 ATD volunteers were used. At the conclusion of the training process, the Extra Trees Classifier again emerged as the most successful model. Its performance on the test data is presented as a confusion matrix in Fig. 8, where 0.0 represents ADL (Activities of Daily Living) and 1.0 represents falls. According to the confusion matrix, 591 of the 592 fall activities were correctly predicted as falls, with 1 incorrectly predicted as ADL, and 484 of the 488 ADL activities were correctly predicted as ADL, with 4 incorrectly predicted as falls (Fig. 8). These results demonstrate that the model operates with a high success rate.
When evaluated according to the performance metrics, the success rates in the confusion matrix correspond to an accuracy of 99.54%, an AUC of 100.00%, a recall of 99.18%, a precision of 99.79%, and an F1-score of 99.49%. These values indicate that the model performs with a high level of success. The hybrid test dataset was then classified with the ten best-performing algorithms trained on the hybrid dataset; a performance comparison is presented in Table 15.
Table 15
Performance metrics of algorithms trained with the hybrid dataset, tested on the hybrid test dataset
Model | Accuracy | AUC | Recall | Precision | F1 |
Extra Trees Classifier | 0.9954 | 1.0000 | 0.9918 | 0.9979 | 0.9949 |
Light Gradient Boosting Machine | 0.9935 | 1.0000 | 0.9877 | 0.9979 | 0.9928 |
Extreme Gradient Boosting | 0.9935 | 0.9998 | 0.9918 | 0.9938 | 0.9928 |
Gradient Boosting Classifier | 0.9880 | 0.9994 | 0.9857 | 0.9877 | 0.9867 |
Ada Boost Classifier | 0.9833 | 0.9986 | 0.9857 | 0.9776 | 0.9816 |
K Neighbors Classifier | 0.9796 | 0.9942 | 0.9672 | 0.9874 | 0.9772 |
Decision Tree Classifier | 0.9685 | 0.9661 | 0.9406 | 0.9892 | 0.9643 |
Logistic Regression | 0.9250 | 0.9671 | 0.9180 | 0.9162 | 0.9171 |
Ridge Classifier | 0.9176 | 0.9162 | 0.9016 | 0.9148 | 0.9082 |
Linear Discriminant Analysis | 0.9148 | 0.9590 | 0.8975 | 0.9125 | 0.9050 |
The performance of these highly successful models on the MTw portion of the hybrid test dataset can be seen in Table 16.
Table 16
Performance metrics of algorithms trained with the hybrid dataset, tested on the MTw test dataset.
Model | Accuracy | AUC | Recall | Precision | F1 |
Extra Trees Classifier | 0.9944 | 0.9999 | 0.9906 | 0.9969 | 0.9937 |
Light Gradient Boosting Machine | 0.9931 | 1.0000 | 0.9875 | 0.9968 | 0.9922
Extreme Gradient Boosting | 0.9931 | 0.9999 | 0.9938 | 0.9907 | 0.9922 |
Ada Boost Classifier | 0.9875 | 0.9992 | 0.9875 | 0.9844 | 0.9860 |
Gradient Boosting Classifier | 0.9847 | 0.9993 | 0.9812 | 0.9843 | 0.9828 |
K Neighbors Classifier | 0.9833 | 0.9934 | 0.9812 | 0.9812 | 0.9812 |
Logistic Regression | 0.9819 | 0.9975 | 0.9719 | 0.9873 | 0.9795 |
Ridge Classifier | 0.9819 | 0.9806 | 0.9688 | 0.9904 | 0.9795 |
Linear Discriminant Analysis | 0.9792 | 0.9920 | 0.9625 | 0.9904 | 0.9762 |
Decision Tree Classifier | 0.9556 | 0.9512 | 0.9125 | 0.9865 | 0.9481 |
The performance of the hybrid-trained models on the ATD test dataset can be seen in Table 17.
Table 17
Performance metrics of algorithms trained with the hybrid dataset, tested on the ATD test data.
Model | Accuracy | AUC | Recall | Precision | F1 |
Extra Trees Classifier | 0.9972 | 1.0000 | 0.9940 | 1.0000 | 0.9970 |
Light Gradient Boosting Machine | 0.9944 | 1.0000 | 0.9881 | 1.0000 | 0.9940 |
Gradient Boosting Classifier | 0.9944 | 0.9996 | 0.9940 | 0.9940 | 0.9940 |
Extreme Gradient Boosting | 0.9944 | 0.9995 | 0.9881 | 1.0000 | 0.9940
Decision Tree Classifier | 0.9944 | 0.9944 | 0.9940 | 0.9940 | 0.9940 |
Ada Boost Classifier | 0.9750 | 0.9972 | 0.9821 | 0.9649 | 0.9735 |
K Neighbors Classifier | 0.9722 | 0.9968 | 0.9405 | 1.0000 | 0.9693
Logistic Regression | 0.8111 | 0.8876 | 0.8155 | 0.7874 | 0.8012 |
SVM - Linear Kernel | 0.8111 | 0.8140 | 0.8571 | 0.7660 | 0.8090 |
Linear Discriminant Analysis | 0.7861 | 0.8599 | 0.7738 | 0.7692 | 0.7715 |
It has been observed that the model trained on the hybrid dataset achieves significantly higher performance than the models trained on the MTw and ATD datasets separately. The performance of the hybrid-trained model on the test datasets is presented in Table 18.
Table 18
Performance metrics of algorithms trained with hybrid dataset tested on test datasets.
Model | Test Set | Accuracy | AUC | Recall | Precision | F1 |
Extra Trees Classifier | Hybrid | 0.9954 | 1.0000 | 0.9918 | 0.9979 | 0.9949
Extra Trees Classifier | MTw | 0.9944 | 0.9999 | 0.9906 | 0.9969 | 0.9937 |
Extra Trees Classifier | ATD | 0.9972 | 1.0000 | 0.9940 | 1.0000 | 0.9970 |
Indeed, the Extra Trees algorithm demonstrates high performance across all test datasets, as seen in Table 18. The Extra Trees classification model trained on the hybrid dataset exhibits a high success rate not only on the MTw test data but also on the ATD test data. This hybrid-trained model thus proves its ability to classify data from either sensor reliably and can be integrated into a system with confidence.
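As a final, hedged sketch of such an integration (the file name, shapes, and persistence approach below are illustrative assumptions, not artifacts of the study), the hybrid-trained classifier could be saved once and reused for incoming 4-second windows from either IMU:

```python
import numpy as np
from joblib import dump, load
from sklearn.ensemble import ExtraTreesClassifier

# Random arrays stand in for the 2790x909 hybrid training matrix and labels.
rng = np.random.default_rng(5)
X_hybrid_train = rng.normal(size=(2790, 909))
y_hybrid_train = rng.integers(0, 2, 2790)

# Train once on the hybrid data and persist the fitted model to disk.
model = ExtraTreesClassifier().fit(X_hybrid_train, y_hybrid_train)
dump(model, "fall_detector_hybrid.joblib")  # hypothetical file name

# At run time, an incoming 4-second window (flattened to 909 features in the
# same Ax0..Mz100 order) is classified the same way regardless of which IMU
# brand produced it.
detector = load("fall_detector_hybrid.joblib")
window = rng.normal(size=(1, 909))
print("fall" if detector.predict(window)[0] == 1 else "daily activity")
```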