Efficient Monitoring of Fall Risk in the Elderly via Functional Data Analysis

Background: A disease screening service is a preventive healthcare service. Accurate and efficient disease examination based on continuous monitoring data obtained from diseased subjects is the basis for developing this service. Traditional screening methods for a specific disease are designed according to the physical meaning of the collected data. Methods: In this paper, a general statistical model for disease detection based on monitoring data is proposed. By analyzing the distribution of data obtained from subjects who may have certain diseases, we used functional data analysis to establish a statistical model and obtain an efficient algorithm for parameter estimation. Results: The proposed model is applied to a real example of an elderly fall risk screening service based on plantar pressure data collected from elderly individuals walking over obstacles. Reasonable intervals of the model parameters used to screen the elderly for fall risk are obtained from the training samples and are then used to estimate the fall risk of the elderly in the test samples. Conclusions: The study shows that the foot plantar pressure measured in screening tests can be characterized by functional data analysis, and a linear mixed effect model can be used when time points are fixed. The restricted maximum likelihood technique is used for parameter estimation, and a nonlinear optimization algorithm is employed to iteratively determine the model parameters. This paper provides a method of detecting falls in the elderly based on statistical data rather than the physical meaning of the collected data.


Background
Disease screening refers to the implementation of quick physical or laboratory examinations to identify patients with potential disease risk. Specifically, the main objective is to identify patients who are in the subclinical stage or are high-risk subjects [1]. Disease screening is a preventive health care service with a broad spectrum of detection possibilities, including falls in the elderly, cancer genes, childhood autism, osteoporosis, prenatal risk factors and other conditions. Traditional disease screening evaluation methods include choosing the "gold standard" (the most reliable and authoritative diagnostic method that can reflect the actual disease situation), selecting the research subjects, determining the sample size, designing a synchronous blind test (using the experiment to evaluate subjects from the case group and noncase group determined by the "gold standard"), analyzing data, performing quality control and other tasks [1]. The traditional disease screening procedure mainly includes collecting data with detection devices, analyzing the physical meaning of the data corresponding to the disease itself, and then performing disease screening. However, this traditional research method lacks generality.
The emergence of preventive medicine and health care services is a considerable benefit to society. The screening reports generated after participating in a screening service are very helpful for disease detection in subjects. Hence, an accurate and appropriate screening method is not only the key to such services but also an important basis for the evaluation of the sustainable operation of such services in the future. A valuable report will enable more people to use disease screening services. Therefore, it is of great social and practical significance to develop accurate and efficient examination methods to correctly identify potential subjects with diseases.
Fall risk screening for the elderly is a type of disease screening. In the United States, approximately 30% of people over 65 years of age fall at least once a year, and 50% of people over 80 are likely to fall [2]. In China, the likelihood of elderly people falling in a year is approximately 34%. Approximately 38% of the elderly population in urban communities is injured after a fall, and some individuals may die [3]. Falls have become the leading cause of medical treatment for individuals aged 65 and older in China in recent years. In addition to damaging the physical health of the elderly, falls also endanger their mental health by reducing their quality of life and increasing the burden on their families and society [4]. In 2014, the China National Health Planning Commission officially included "prevention of falls" in its "core information on health for the elderly" policy. Since then, the risk of falls has gradually attracted increasing social attention [5].
Qualitative assessments of fall risk for the elderly have mainly been based on two methods: the clinical observation method and the scale evaluation method [6]. The clinical observation method is mainly based on clinical observations by an outpatient doctor, nurse or home care worker for an elderly person who has a risk of falling or has a recent history of falls. Although this method is comprehensive [7], it is highly random and lacks standard criteria; additionally, the evaluation results from different doctors can be very different. Hence, this approach is not an objective and quantitative evaluation method [8]. In contrast, the evaluation results of the scale assessment method are more universal, thus supporting the consistency of this approach. Notably, this type of approach has been adopted by most hospitals, nursing homes and health service centers to assess the risk of falls for elderly people. Scale assessments are based on the Morse Falls Assessment Scale (MFS) [9], the Berg Balance Scale (BBS) [10], the Tinetti Scale (POMA) [11], and the Hendrich II Fall Risk Model [12].
Research on quantitative detection methods of fall risk for the elderly is based on different types of fall detection devices, which are mainly divided into three categories: wearable devices, environmental sensing devices and video image devices.
When an elderly person falls from an upright position to a lying position, there is usually a sudden increase in negative acceleration. In this process, waist-mounted accelerometer equipment can detect fall behavior [13]. Additionally, falls are detected by triaxial accelerometers mounted at the waist, wrist and head using acceleration thresholds [14]. Fall detection systems based on ground tremors and ground vibration information are also used to assess the fall risk of elderly individuals [15]. A fall detection system equipped with a camera at a fixed indoor position analyzes the collected image information, assesses an individual's activity status and determines whether an elderly person has fallen [16]. An image detection system identifies a fall by calculating the vertical distance between the head of an individual and the ground [17]. The video images of individuals collected by an image detection device are analyzed, and the skeleton structure is studied to determine the occurrence of falls [18].
Functional data analysis involves data modeling using functions or functional parameters. The complexity of functions is not assumed to be known in advance, so approximation methods that are flexible based on the data requirements are used. This approach can be used in the healthcare field [19], such as in the classification of Alzheimer's patients [20], assessments of the rate of clinical alarms in an intensive care unit [21], the determination of heart failure [22] and the identification of diabetes mellitus [23].
In this paper, a statistical model is proposed to identify potentially diseased patients based on functional data analysis, which is an objective quantitative method. The parameter estimation methods for the model and the corresponding algorithm are given. The proposed model is applied for fall risk screening in an elderly population.
A numerical analysis of the model parameters is performed, and a reasonable range of the model parameters is obtained. On this basis, it is feasible to screen the elderly for fall risk. The main contribution of this paper is to provide a method of detecting falls in the elderly based on statistical data rather than the physical meaning of collected data.

Data Collection and Description
The data were collected with an elderly fall risk detection device developed by the Shenzhen Institute of Advanced Technology of the Chinese Academy of Sciences. We recruited elderly volunteers who were willing and able to perform testing in a health service center in Luohu District, Shenzhen. We assigned IDs to these elderly volunteers and obtained their basic information (age, gender, height, weight, etc.). The length of the test section was 6 meters, and the height of the obstacles was 6 centimeters. The elderly volunteers who were tested proceeded to the designated road test area, and the remaining volunteers remained in the rest area. The rest area was approximately 1 meter away from the starting point of the test road. The elderly volunteers wore special insoles during the test. The insoles had plantar pressure sensors that could sense changes in plantar pressure and feed the data back to the corresponding equipment. The test area included flat road followed by a 6-centimeter-high obstacle and then flat road, forming a 6-meter-long test section.
Each elderly volunteer walked this section twice, and plantar pressure data were collected and stored.
Each elderly volunteer was required to repeat the test $m$ times. The data from the $i$th measurement were recorded as $y_i(t)$, $i = 1, 2, \cdots, m$. In general, the foot plantar pressure was divided into three phases: the increasing pressure phase, the stable pressure phase and the decreasing pressure phase, which correspond to three phases of walking, i.e., starting, crossing obstacles and stopping, respectively.

The Proposed Method
This paper examines the problem from a new perspective. First, we collect detection data for all the screening subjects and investigate the optimal parameter estimation scheme. Second, reasonable ranges of the distribution parameters are identified, a unified screening standard is obtained, and all screened subjects are classified. This approach does not treat each subject separately. In addition, this approach can be generalized and extended to the screening of various diseases.
We assume that the continuous detection data are collected as a time series. Given a sample of subjects, one key objective is building an appropriate model to fit the data. It is also important to estimate reasonable ranges of model parameters based on training with sample data to distinguish subjects with or without disease. Then, the proposed model can be used for detection.
Based on the research questions above and the background of the health screening service, to facilitate the construction of the statistical model and subsequent analysis, we assume that (1) the subjects receiving the health screening services are independent; (2) the subjects are able to complete the entire detection process without human assistance; and (3) the subjects need to complete multiple repeated detection tests from the same disease screening service, with each test being independent of the others.
Here, we define some notations to facilitate model construction. $\varepsilon_i(t)$ is the error term for each test at different time points; therefore, this error is random if $t$ is a constant. We assume that the error term follows a normal distribution as $\varepsilon_i(t) \sim N(0, \sigma_i^2)$ and is independent of , where $i = 1, 2, \cdots, m$. We allow the error term to have different variances for different tests involving the same subject.
$y_i(t)$ represents the response value of the medical index in the $i$th test at time point $t \in T$, where $i = 1, 2, \cdots, m$.
In this subsection, we employ statistical theory and a model to study disease screening. The model is built to fit the data as follows: Then, the model can be written as Based on the assumptions that and are independent and follow the normal distributions, we have where = ( ) × is the covariance matrix for the vector . As time changes, the data can be characterized by a Gaussian process model defined as ⃑⃑⃑ ≜ ⃑⃑⃑ ( ) = ( ( 1 ), ( 2 ), ⋯ , ( )) .
With the probability distribution function for ( ), the joint probability distribution function of ⃑⃑⃑ is , ⋯ , ( )) , where is an by +1 matrix with the th row equal to ( ) and is the identity matrix. As a result, the probability density function of ⃑⃑⃑ is as follows: where | | is the determinant of and −1 represents the inverse matrix of . Now, we have built a statistical model for the data collected from the disease screening tests and derived the probability distribution of the response (6) and the corresponding joint probability density function (9). We note that if the multiple time points $t_1, t_2, \cdots, t_n$ are fixed, the collected data are longitudinal or panel data.
Consequently, the task of Gaussian process model estimation can be converted to longitudinal data modeling using a linear mixed effect model.
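The reduction described above can be illustrated with a minimal Python (NumPy) sketch. It assumes cubic fixed-effect covariates (1, t, t^2, t^3), so that at fixed time points the response vector of one test is multivariate normal with mean given by the design matrix times the fixed effects. The squared-exponential covariance for the random effect, all parameter values, and the function names `design_matrix` and `gaussian_log_density` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def design_matrix(t):
    """Fixed-effect design matrix with rows (1, t, t^2, t^3)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.ones_like(t), t, t**2, t**3])

def gaussian_log_density(y, mean, cov):
    """Log-density of a multivariate normal N(mean, cov) evaluated at y."""
    resid = np.asarray(y, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)           # log|cov|
    quad = resid @ np.linalg.solve(cov, resid)   # resid^T cov^{-1} resid
    return -0.5 * (resid.size * np.log(2.0 * np.pi) + logdet + quad)

# Hypothetical fixed time points and parameter values (not from the paper).
t = np.linspace(0.0, 1.0, 8)
X = design_matrix(t)
beta = np.array([1.0, 2.0, -1.0, 0.5])
sigma2 = 0.05

# Assumed squared-exponential covariance for the random effect over time.
K = 0.3 * np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.2**2)
V = K + sigma2 * np.eye(t.size)                  # covariance of one test's response vector

rng = np.random.default_rng(1)
y = rng.multivariate_normal(X @ beta, V)         # one simulated test
print(gaussian_log_density(y, X @ beta, V))
```

Because the time points are fixed, the process enters only through the finite matrix `V`, which is why longitudinal-data tools for linear mixed effect models become applicable.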

Results
In this section, we derive the likelihood function of the proposed model (6)-(8) based on the normal probability density function (9). Maximum likelihood theory is then used for parameter estimation.
The joint probability density function of the responses from all the detection tests, that is, the likelihood function, is Consequently, maximum likelihood estimation can be used to determine the parameters in the model based on profile estimations of , and . Taking the partial derivative of (12) with respect to , , and yields ( , , ) = 0, ( , , ) =0.
From (13), we can obtain where is an by +1 matrix with the th row equal to ( ). However, the maximum likelihood estimates of the variance and covariance parameters and are biased. Hence, we adopt the restricted maximum likelihood estimation (REML) method to obtain and . As a result, equations (14) and (15) We iteratively solve for three parameters , and until convergence is achieved.
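The estimation scheme described above can be sketched in Python under a deliberately simplified model. The sketch assumes a random intercept per test in place of the paper's random effect, a common set of time points for all tests, and a plain grid search standing in for the nonlinear optimization algorithm. The function names (`gls_beta`, `reml_criterion`, `fit_reml`), the simulated data and all parameter values are hypothetical.

```python
import numpy as np

def design_matrix(t):
    """Fixed-effect design matrix with rows (1, t, t^2, t^3)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.ones_like(t), t, t**2, t**3])

def gls_beta(X, Y, V):
    """Generalized least squares estimate of beta when all tests share X and V."""
    Vinv_X = np.linalg.solve(V, X)               # V^{-1} X
    y_bar = Y.mean(axis=0)                       # average curve over the tests
    return np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y_bar)

def reml_criterion(sigma2, tau2, X, Y):
    """Minus twice the restricted log-likelihood, up to an additive constant."""
    m, n = Y.shape
    V = sigma2 * np.eye(n) + tau2 * np.ones((n, n))   # random intercept + noise
    beta = gls_beta(X, Y, V)
    R = Y - X @ beta                                   # residuals, one row per test
    quad = np.sum(R * np.linalg.solve(V, R.T).T)       # sum_i r_i^T V^{-1} r_i
    _, logdet_V = np.linalg.slogdet(V)
    _, logdet_XVX = np.linalg.slogdet(m * (X.T @ np.linalg.solve(V, X)))
    return m * logdet_V + logdet_XVX + quad

def fit_reml(X, Y, grid=np.logspace(-4, 1, 41)):
    """Grid search over (sigma2, tau2), then GLS for beta at the minimizer."""
    _, sigma2, tau2 = min(
        (reml_criterion(s2, t2, X, Y), s2, t2) for s2 in grid for t2 in grid
    )
    n = Y.shape[1]
    V = sigma2 * np.eye(n) + tau2 * np.ones((n, n))
    return gls_beta(X, Y, V), sigma2, tau2

# Hypothetical simulated data: 30 tests observed at 20 common time points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
X = design_matrix(t)
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
b = rng.normal(0.0, 0.5, size=(30, 1))           # random intercept per test
Y = X @ beta_true + b + rng.normal(0.0, 0.1, size=(30, t.size))

beta_hat, sigma2_hat, tau2_hat = fit_reml(X, Y)
print(beta_hat, sigma2_hat, tau2_hat)
```

Profiling the fixed effects out through generalized least squares leaves only the variance components to search over. This is the feature of restricted maximum likelihood that reduces the bias of the variance estimates relative to ordinary maximum likelihood.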
We built a statistical model for the data collected from the disease screening tests and performed detailed parameter estimation. However, solving equations (16) and (17)

Discussion
In the test, the foot plantar pressure of each elderly individual was recorded as they walked across predesigned obstacles. Each elderly individual was required to repeat the test four times. Therefore, the number of repeated tests in model (1) is 4. The reference intervals and average values for each parameter are shown in Table 1. Table 1 can be used to screen the fall risk of the elderly: the foot plantar pressure data measured from an elderly individual can be input into the model constructed in section 2 to obtain the parameter estimates, which are then compared to the corresponding reference intervals in Table 1. If the estimated values of most parameters are within the corresponding intervals, an individual can be considered to have a low risk of falls. In contrast, if most of the parameter estimates are not within the reference intervals, the risk of falls is considered high. The reference intervals of the model parameters are derived from 50 training samples.
We use these intervals to determine the fall risk of the remaining 12 test samples. The results show that 3 of the 12 test samples are conservatively classified as having a high risk of falls, and the remaining 9 samples are classified as having a relatively low risk of falls. Tables 2 and 3 are obtained by taking 7 representative samples as an example, where Table 2 shows the parameter estimates for elderly individuals with a relatively low risk of falls and Table 3 reports the parameter estimates for those with a high risk of falls. The parameter estimates outside the reference intervals are indicated by different shades. Table 2 shows that more than half of the parameter estimates are within the reference intervals provided in Table 1, so these individuals can be considered to have a relatively low risk of falls. Conversely, Table 3 shows the parameter estimates for three individuals with parameters outside the intervals provided in Table 1. For example, the participants with ID numbers 2 and 57 have 10 parameter estimates outside the reference intervals, and participant number 6 has 9 parameter estimates outside the reference intervals.
Thus, these individuals have a relatively high risk of falls. Considering the estimation results for 7 elderly individuals in Tables 2 and 3, Fig. 1 and Fig. 2 display the corresponding foot plantar pressure curves over time. Figure 1 shows the curves of the participants who were screened as having a relatively low risk of falls, as denoted in Table 2. Figure 2 depicts the curves of the individuals who were screened as having a relatively high risk of falls, as denoted in Table 3.
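The screening rule used in this section can be sketched as a short Python function. The parameter names, reference intervals and estimates below are hypothetical placeholders rather than values from Tables 1 to 3. Treating an exact tie (half of the estimates outside) as low risk is an assumption, since the text only specifies the two majority cases.

```python
def count_outside(estimates, intervals):
    """Number of parameter estimates that fall outside their reference interval."""
    return sum(
        not (intervals[name][0] <= value <= intervals[name][1])
        for name, value in estimates.items()
    )

def screen_fall_risk(estimates, intervals):
    """Return 'high' if most estimates lie outside the intervals, else 'low'."""
    outside = count_outside(estimates, intervals)
    # Assumption: an exact tie is treated as low risk.
    return "high" if outside > len(estimates) / 2 else "low"

# Hypothetical reference intervals (lower, upper) for four model parameters.
reference = {
    "beta0": (0.8, 1.2),
    "beta1": (1.5, 2.5),
    "beta2": (-1.5, -0.5),
    "sigma2": (0.0, 0.1),
}

# Hypothetical parameter estimates for two screened individuals.
subject_a = {"beta0": 1.0, "beta1": 2.0, "beta2": -1.0, "sigma2": 0.2}
subject_b = {"beta0": 2.0, "beta1": 3.0, "beta2": 0.0, "sigma2": 0.05}

print(screen_fall_risk(subject_a, reference))
print(screen_fall_risk(subject_b, reference))
```

Keeping the count of out-of-interval parameters separate from the decision makes it easy to report both, as the discussion does when it lists how many estimates fall outside the intervals for each participant.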

Conclusions
Based on a disease screening service, this paper studies the distribution of data collected from screening tests, and a statistical model for disease detection is built considering both fixed and random effects. The proposed model is applied in the context of a fall risk screening service for elderly individuals. The study shows that the foot plantar pressure measured in screening tests can be characterized by functional data analysis, and a linear mixed effect model can be used when time points are fixed. The restricted maximum likelihood technique is used for parameter estimation, and a nonlinear optimization algorithm is employed to iteratively determine the model parameters. Based on the training samples, the reference intervals for each parameter are constructed, and the elderly can be classified into low or high fall risk classes by determining whether the corresponding parameter estimates are inside or outside the reference intervals.
From this work, we have identified several research directions for the future. (1) In this work, only the foot plantar pressure data from fall detection focused on elderly individuals are used. Notably, we could also consider data from other medical indices and include them in the model. (2) In this work, we use $h_1(t) = t$, $h_2(t) = t^2$ and $h_3(t) = t^3$ as the covariates corresponding to the fixed effects. We could also consider other characteristics of participants, such as height, weight, and age. (3) Due to the limitations of the research conditions, the real data samples are only for elderly individuals who participated in the service within a fixed time and at a certain location. The estimated intervals of parameters obtained through these samples are not representative of the general elderly population. Hence, it may be impossible to accurately identify individuals with a high risk of falls. Thus, larger random samples drawn from multiple cities across China are needed to obtain the parameter reference intervals.

List of abbreviations
Not applicable.

Declarations

Ethics approval and consent to participate
The study was approved by the Research Ethics Committee of the Southeast University. All participants were informed about the study goals and signed informed consent. For participants who were considered illiterate, written informed consent was also obtained from their legal guardians.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.