Classification of Hypertensive and Normotensive Subjects Using Bilateral Differential Biopotential Signals

Hypertension, or high blood pressure, is a severe and increasingly prevalent health issue in the modern world, especially in the current pandemic scenario, and it can cause many heart-related diseases or even death. For this reason, a reliable, automatic and easy-to-use system for detecting hypertensive subjects is an important focus for researchers. Biopotential signals can play a pivotal role in this regard. Although a few strategies based on electrocardiogram (ECG) or electrodermal activity (EDA) signals have been proposed, they require special circuitry as well as trained personnel. In this article, a method is proposed to classify hypertensive and normotensive subjects using differential biopotential signals. Neither special circuitry nor much expertise is required to handle this system. It was assumed that the progression of rest depends on blood pressure. To test this, signals were acquired bilaterally from both hypertensive and normotensive subjects for 10 continuous minutes. The result of the random forest (RF) classification establishes that hypertensive subjects can be distinguished from normotensive subjects by analysing the progression of the bilaterally acquired differential biopotential signals.


Introduction
Hypertension is one of the most common diseases in the modern world. It can cause many severe health complications such as heart problems, stroke or even death, and the number of deaths due to hypertension is increasing day by day [45]. According to the global report, 17.9 million deaths occurred in 2019 due to cardiovascular diseases. Among these, 85% were due to heart attack and stroke [1]. In India, 63% of total deaths are due to noncommunicable diseases. Among these, 27% are due to cardiovascular diseases [44]. The different stages of hypertension are tabulated in Table 1 [31]. This fatality occurs due to lack of awareness, lack of primary care and lack of follow-up. For this reason, a reliable BP monitoring system is essential. Furthermore, automation is important to reduce human error. Biopotential signals can play a pivotal role in this regard. Biopotential signals are generated by the electrochemical activity of excitable cells [8]. These excitable cells are present in the nervous, muscular and glandular systems of the body. The autonomic nervous system (ANS) regulates the activity of internal organs such as the heart, stomach, blood vessels and sweat glands [20]. As a result, this electrochemical activity depends on the activity of the ANS. The ANS affects a number of physiological functions (vasoconstriction and vasodilatation, body temperature, barrier function, secretion, growth, cell nutrition etc.) as well as pathophysiological functions (immune defence, proliferation, wound healing etc.) [34]. Biopotential signals such as electrocardiography (ECG), electroencephalography (EEG), electromyography (EMG), electroneurography (ENG) and electrodermal activity (EDA) have long been used for diagnosis as well as prognosis.
In one research article, the author presented a hypertension assessment method using ECG and photoplethysmograph (PPG) signals [27]. The proposed method can classify hypertensive and normotensive subjects with an F1 score of 94.84%. Assessment of pulmonary hypertension was also reported using ECG and some other simple non-invasive techniques. The result of this study showed a significant positive predictive value (PPV) for high-risk patients [26]. Several indices were also proposed for hypertension detection using ECG signals [32,13]. In a recent study, an automated technique was proposed for detecting the severity of hypertension in human subjects [33].
Studies on hypertension using electrodermal signals were also reported. In one study, the author showed that non-specific skin conductance response (NS.SCR) frequency is negatively correlated with both systolic and diastolic blood pressure [16]. In another study, racial differences in cardiovascular and non-cardiovascular sympathetic nervous system (SNS) activity were studied for hypertensive and normotensive subjects. Skin conductance response was recorded to study the non-cardiovascular SNS activity [15]. Furthermore, the interdependency between job strain and the risk of cardiovascular disease was also established using EDA signals [41].
A number of the studies mentioned above used machine learning classification techniques to predict hypertension [27,32,33,26] from different physiological feature data [28,29] or medical data [11]. Classification is a popular technique in biomedical condition monitoring and detection. In a few recent studies, classification has been used for brain disease [18,24] and tumour detection from medical image data [23,36], analysis of chest disease [19], electromyogram (EMG) signal classification [17], pneumonia classification [14], detection of epileptic seizures from EEG signals [21], arrhythmia detection [25] and even detection of Covid-19 from X-ray images [3], to name a few.
All the above-mentioned studies on hypertension classification need a special acquisition system, as well as expertise to handle it. For this reason, the use of these techniques is very limited. A method is proposed in this article to classify hypertensive and normotensive subjects using simple bilateral differential biopotential signals. These signals are acquired from the fingers of the subjects using simple non-invasive Ag/AgCl electrodes and precise multimeters, which do not require much expertise. Furthermore, the classification is done using random forest (RF), which can also be put into a generalised program so that it is user friendly for everyone. Research has already established the association between blood pressure and rest in human beings [10,2]. It was assumed that the progression of rest, or how a human body goes into rest, depends on blood pressure. For this reason, long-term signals were acquired from a number of hypertensive and normotensive subjects and classification was performed.
Differential biopotential signals were first introduced in 2011. Some parameters derived from this non-organ-specific signal were assumed to depend on human homeostasis [5]. A reliable data acquisition system was also developed for simultaneous data acquisition from multiple locations [38]. Furthermore, bilateral differential biopotential signals were studied for the quantification of bilateral asymmetry [37]. In a very recent study, these differential biopotential signals were used for cognitive load assessment using machine learning [22].

Methodology
The overall methodology for the experiment is given as a flowchart in Figure 1. The experiment consists of 3 main parts: acquisition of experimental data, processing of the collected data and classification of the processed data. To capture their resting biopotentials, subjects were not given any kind of external stimulus during this experiment (neither electrical or mechanical stimuli, nor physical or mental tasks). Therefore, the resting biopotential should carry information about the internal state, not just stimulus responses. For that reason, the resting biopotential is thought to contain more significant information about homeostasis [9,43], and data acquisition was done in a fully restful physical and mental condition.

Instrumentation system
The experimental setup was already described in [37,4]. The acquisition system consists of two 4 3/4 digit multimeters (model: RISH Multi 18S, make: Rishabh Instruments) and two optically isolated adapters (model: RISH Multi SI232, make: Rishabh Instruments). The resolution of these multimeters is 10 µV. Each multimeter-adapter assembly is connected to a PC or laptop via a serial cable. The Rishcom 100 software is used on the PC for interfacing, display and storage. Reusable 8 mm snap-type Ag/AgCl finger electrodes made by Shimmer Sensing, with single-core coaxial cable connectors, were used to acquire potentials from the skin, and aqueous USG gel was used for electrode-skin interfacing. The actual acquisition system, a subject in the experimental posture and the electrode connections are shown in Figure 2. Electrodes are connected to the intermediate phalanges of the index and middle fingers of both hands and differential potentials are acquired simultaneously. The technique is very similar to the recommended endosomatic electrodermal data acquisition [6,40], but it differs in terms of electrode placement as well as the acquisition system.

Subject selection and general conditions
A total of 98 sessions of data were acquired from 16 subjects, both male and female. Each session lasted 10 minutes. Of these, 40 sessions were recorded from 7 hypertensive subjects (BP = 140-165/90-95 mmHg) and the remaining 58 sessions were recorded from 9 normotensive subjects (BP = 110-130/75-85 mmHg). All 7 hypertensive subjects were clinically tested and verified by specialist doctors and were taking prescribed medicines regularly. All the subjects were dextral and their ages were between 24 and 58 years. The whole experiment was performed within a period of 2 months, inside a laboratory with controlled lighting, temperature, humidity and sound. Pulse rate, body temperature and oxygen saturation level were recorded from the subjects to ensure normality.

Consent and questionnaire
The study was designed following the 1964 Helsinki declaration and its later amendments and was approved by the Institutional Ethical Committee (IEC) of Jadavpur University (resolution no. 19 dated 25.05.2018, held on 23.12.2020 at 4pm.). Informed consent was signed by each participant at the beginning. In addition, a questionnaire was filled in by the participants to assess their general health condition. Daily conditions were also assessed by another questionnaire prior to each day's acquisition.

Procedure for data acquisition
Prerequisite: Data were acquired separately from each subject, with no more than one session per day from any individual subject. As mentioned in Section 2.1.3, the daily questionnaire was filled in by the participants prior to each acquisition. It was ensured that no subject had taken any kind of anticholinergic drug within 72 hours prior to the acquisition. First, subjects were allowed to settle down, typically for 10 minutes. After that, they lay down on the experimental bed and BP, body temperature, pulse rate and oxygen saturation level were recorded. Finally, a bias voltage test [40] was performed by shorting the positive and reference terminals of each lead to ensure that the bias voltage never exceeded 0.02 mV.
Potential acquisition: After all these prerequisites, electrodes were connected to the middle phalanges of the index and middle fingers of both hands and acquisition was started. The acquisition continued for 10 minutes. The signal acquired from the left hand was named LH; similarly, the right-hand signal was named RH. After completion, the electrodes were disconnected and subjects were asked to return to their normal work.

Data processing
The raw acquired data need to be processed before the final experimentation. The processing consists of preprocessing, extraction of features from the signals and selection of the best attributes from the signal features.

Preprocessing
First, these signals are checked for abnormalities. Any missing bits, not-a-number (NaN) values and abnormal spikes are corrected. After that, each 10-minute set is divided into 5 subsets of 2 minutes each. Each of these subsets is henceforth referred to as a state of restfulness. Details of these states are tabulated in Table 2. As already stated, hypertension and rest are closely related, and the effect of rest on hypertension is very significant. It was therefore assumed that hypertension can be characterised using the progression of rest: as the state number increases, subjects go into a deeper resting state. The acquired LH and RH signals are then used to derive two more signals, Gap and Pair sum (PS), shown in Equation 1 and proposed by A. Bhattacharya [4]. These derived signals are bilateral in nature, depending on the potentials of both hands.
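The division of a session into states of restfulness can be illustrated with a minimal Python sketch. The sampling rate (one sample per second), the synthetic session and the function name split_into_states are assumptions made purely for illustration; the article does not state them. The sketch covers only the segmentation step and does not reproduce the abnormality correction or the Gap and PS derivation of Equation 1.

```python
from typing import List, Sequence


def split_into_states(signal: Sequence[float], n_states: int = 5) -> List[List[float]]:
    """Split one session into n_states consecutive, equal-length segments.

    Trailing samples that do not fill a whole segment are dropped.
    """
    if n_states <= 0:
        raise ValueError("n_states must be positive")
    seg_len = len(signal) // n_states
    if seg_len == 0:
        raise ValueError("signal is shorter than the number of states")
    return [list(signal[i * seg_len:(i + 1) * seg_len]) for i in range(n_states)]


# Hypothetical 10-minute session at an ASSUMED rate of 60 samples per minute.
SAMPLES_PER_MINUTE = 60
session = [0.01 * i for i in range(10 * SAMPLES_PER_MINUTE)]

states = split_into_states(session)        # 5 states of 2 minutes each
print([len(s) for s in states])
```

Under these assumptions each state holds 120 samples, and state 1 corresponds to the first 2 minutes of the recording.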

Features extraction
A number of time domain features were identified from these 5 states. Based on the characteristics, all the parameters/features were divided into 3 subcategories.
(a) Basic parameters: Mean value, standard deviation, skewness and kurtosis were considered as basic parameters. Furthermore, 3 more parameters, namely the Zero Crossing Instant (ZCI), the slope (m) and the Root Mean Square (RMS) value of the residual of the best first-order fit line passing through the ZCI, were also considered as basic parameters. ZCI, m and the residual were calculated from a linear model (Equation (2)) of these signals [4].
where $y_k$ is the signal value at the $k$-th time instant, with $k \in (0, 2]$ minutes, and $\mu$ is the mean of the signal. ZCI is the zero crossing time instant, that is, the instant closest to the 1-minute mark at which the deviation signal crosses the 0 mV axis. $m$ is the slope of the first-order best fit that passes through the ZCI, shown in Equation (3). $e_k$ is the residual at the $k$-th time instant, and the RMS value of $e_k$ is also calculated. ZCI and $m$ represent the slowly changing, or tonic, component and the residual ($e_k$) represents the fast changing, or phasic, component of the differential biopotential signal.
In Equation 3, $k_1$ is the start time and $k_u$ is the end time, which is 2 minutes for each state in this experiment. Therefore, a total of 7 features were calculated for each of LH and RH. Considering all 5 states of restfulness, a total of 70 (2 signals × 5 states × 7 features) features were calculated from LH and RH. Furthermore, all 7 basic parameters were also calculated for the 2 derived signals, Gap and PS, giving another 70 (2 derived signals × 5 states × 7 features) features. Therefore, in total 140 basic parameters were calculated.
(b) Ratio parameters: Ratios of basic parameters for the LH and RH signals were calculated. These ratio parameters are also significant for characterising the bilateral relationship of differential biopotential signals. 3 types of ratio parameters were used here.
Direct ratio parameter: Direct ratio parameters are the ratios of a left-hand parameter to its corresponding right-hand parameter. For example, Rto µ LH1 is the ratio of the mean of the first state of LH to the mean of the first state of RH (Equation (4)). Similarly, ratios of the remaining 6 parameters (sd, skewness, kurtosis, ZCI, m and RMS of residual) were calculated to derive 6 more direct ratio parameters. Considering all 5 states, a total of 35 (5 states × 7 basic parameters) direct ratio parameters were calculated.
Lateralization coefficient 1: A lateralization coefficient, named here LC1, was proposed in [30] for the subsequent evaluation of electrodermal lateralization and is defined in Equation (5). Here, EDR is the electrodermal response of the right or left hand, whereas $EDR_{max}$ is the maximum of $EDR_{right}$ and $EDR_{left}$. Similarly, LC1 was calculated for all 7 basic parameters of the bilateral differential biopotential signals (mean, sd, skewness, kurtosis, ZCI, m and RMS of residuals) for all 5 states. Therefore, a total of 35 (5 states × 7 basic parameters) LC1 parameters were calculated.
Lateralization coefficient 2: Another lateralization coefficient, named here LC2 (Equation (6)), was proposed by Schulter and Papousek [39], where EDR is the electrodermal response of the right or left hand. As with LC1, LC2 was calculated for all 7 basic parameters for all 5 states, giving a total of 35 (5 states × 7 basic parameters) LC2 parameters.
(c) Correlation parameters: Furthermore, cross-correlation coefficients of the simultaneously acquired LH and RH signals were calculated for the 5 different states. Therefore, another 5 features were extracted for classification.
In total, 250 predictor variables were calculated (140 basic parameters + 105 ratio parameters + 5 cross-correlation parameters) to classify the subjects into 2 response classes: hypertensive and normotensive.
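As an illustration of the basic parameters in (a), the following Python sketch computes the 7 values for one state of one signal. It assumes that the linear model of Equation (2) has the form y_k - µ = m(k - ZCI) + e_k, that m is the least-squares slope of a line forced through the point (ZCI, 0), that zero crossings are located by linear interpolation between samples, and that population (not sample) moments are used. These are assumptions inferred from the description above, not the authors' published implementation; the function name and the synthetic test signal are hypothetical.

```python
import math
from typing import Dict, Sequence


def basic_parameters(t: Sequence[float], y: Sequence[float],
                     target: float = 1.0) -> Dict[str, float]:
    """Seven basic parameters of one 2-minute state (t in minutes, y in mV)."""
    n = len(y)
    if n != len(t) or n < 2:
        raise ValueError("t and y must have the same length (at least 2)")
    mu = sum(y) / n
    d = [v - mu for v in y]                        # deviation signal y_k - mu
    var = sum(v * v for v in d) / n                # population variance (assumed)
    if var == 0:
        raise ValueError("a constant signal has no defined shape statistics")
    sd = math.sqrt(var)
    skew = sum(v ** 3 for v in d) / (n * sd ** 3)
    kurt = sum(v ** 4 for v in d) / (n * sd ** 4)  # non-excess kurtosis (assumed)

    # Zero crossings of the deviation signal: exact zeros plus linearly
    # interpolated crossings between neighbouring samples of opposite sign.
    crossings = [t[i] for i in range(n) if d[i] == 0]
    for i in range(n - 1):
        if d[i] * d[i + 1] < 0:
            crossings.append(t[i] + (t[i + 1] - t[i]) * d[i] / (d[i] - d[i + 1]))
    if not crossings:
        raise ValueError("no zero crossing found")
    zci = min(crossings, key=lambda c: abs(c - target))  # closest to 1 minute

    # Least-squares slope of a line constrained to pass through (ZCI, 0).
    den = sum((tk - zci) ** 2 for tk in t)
    if den == 0:
        raise ValueError("time stamps must not all coincide with the ZCI")
    m = sum((tk - zci) * dk for tk, dk in zip(t, d)) / den
    residual = [dk - m * (tk - zci) for tk, dk in zip(t, d)]
    rms = math.sqrt(sum(e * e for e in residual) / n)

    return {"mean": mu, "sd": sd, "skewness": skew, "kurtosis": kurt,
            "zci": zci, "slope": m, "rms_residual": rms}


# Synthetic, purely linear state: the fit should recover the slope exactly
# (up to rounding) and leave an essentially zero residual.
t = [0.1 * k for k in range(1, 21)]               # 0.1 ... 2.0 minutes
y = [2.0 * tk + 3.0 for tk in t]
params = basic_parameters(t, y)
print(sorted(params))
```

Applying such a function to each of the 5 states of LH, RH, Gap and PS would give the 4 × 5 × 7 = 140 basic parameters counted above.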

Attribute selection
Attribute selection and classification were done in WEKA, version 3.9.4. WEKA, or Waikato Environment for Knowledge Analysis, is a well-known open-source data mining software developed by the University of Waikato, New Zealand. Attribute selection is an important task that helps to reduce dimensionality, training time and over-fitting. There are a number of attribute evaluators available in WEKA. Among these, the "AttributeSelection" filter was used in this experiment. It is a very flexible supervised attribute filter that allows various search and evaluation methods to be combined. After applying the attribute selection filter, only 6 attributes were selected; these are tabulated in Table 3.

Classification
Random Forest (RF), proposed by L. Breiman [7], is one of the most useful and effective classification techniques. RF is an ensemble of different classification trees. Each tree predicts a result and all results are combined to produce the final prediction of the classification. For this reason, RF reduces error. It also works well for imbalanced datasets and is capable of handling missing data. The main drawback of this classification model is its complexity, as it creates a number of decision trees. For this reason, RF appears as a black box and its training time is high compared to other methods.
A number of techniques are available for choosing the training and testing sets for cross-validation in a classification experiment [12]. In this experiment, the k-fold cross-validation technique was applied. In this technique, the total dataset is divided into k subsets. One subset is kept for testing and the remaining (k-1) subsets are used as the training set. In the next iteration, another subset is used for testing and the rest for training, and the whole procedure is repeated for k iterations. Accuracy is calculated by averaging over all k iterations. In this experiment we used 10-fold cross-validation as well as leave-one-out (LOO) cross-validation. Both are special cases of k-fold cross-validation. In 10-fold cross-validation, the value of k is 10, and in LOO cross-validation, the value of k equals the total number of data sets, which is 98 in this experiment.
RF classification was performed on the same WEKA platform. The dataset under classification was not balanced. For this reason, F1-score [42] and the Precision-Recall area under the curve (PRC-AUC) [35] were used as measures of the quality of the classification model. Precision, also known as Positive Predictive Value (PPV), is the ratio of true positives (TP) to the sum of TP and false positives (FP) (Equation (7)). Recall, or true positive rate, is the ratio of TP to the sum of TP and false negatives (FN) (Equation (8)).
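The k-fold partitioning described above can be sketched in a few lines of Python. This is an illustrative index splitter only: the function name k_fold_indices is hypothetical, the folds are taken in order without shuffling or stratification, and it is not a reproduction of WEKA's internal procedure.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k consecutive folds over n items."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)            # the first `extra` folds get one more item
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


N_SESSIONS = 98                            # number of sessions in this experiment

ten_fold = list(k_fold_indices(N_SESSIONS, 10))
leave_one_out = list(k_fold_indices(N_SESSIONS, N_SESSIONS))

print([len(test) for _, test in ten_fold])   # [10, 10, 10, 10, 10, 10, 10, 10, 9, 9]
print(len(leave_one_out))                    # 98
```

With 98 sessions, 10-fold cross-validation gives eight test folds of 10 sessions and two of 9, while LOO gives 98 folds with a single test session each.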

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (7)$$
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (8)$$
A performance comparison of this RF model with Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers was also performed with 10-fold cross-validation, in terms of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score. Sensitivity, also known as recall or true positive rate (Equation (8)), is the fraction of all actually positive instances that are correctly classified. Similarly, specificity is the fraction of all actually negative instances that are correctly classified (Equation (10)). On the other hand, PPV, commonly known as precision, is the fraction of all positive predictions that are correct. Similarly, NPV (Equation (11)) is the fraction of all negative predictions that are correct.
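The five comparison measures can be computed from the four cells of a binary confusion matrix, as in the following Python sketch. The function name binary_metrics and the example counts are hypothetical and are not results from this study; F1 is written in the equivalent form 2TP / (2TP + FP + FN).

```python
from typing import Dict


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV and F1 from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),        # recall, true positive rate
        "specificity": _ratio(tn, tn + fp),        # true negative rate
        "ppv": _ratio(tp, tp + fp),                # precision
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),    # harmonic mean of PPV and sensitivity
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = binary_metrics(tp=8, fp=2, tn=7, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Treating each class in turn as the positive class and weighting the per-class values by class size would give weighted averages of the kind reported for an imbalanced dataset.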

Result
It can be seen from the analysis that out of 98 instances, the classifier model correctly classified 76, which gives an overall accuracy of 77.55%. In addition, the weighted averages (WA) of precision and recall are 0.774 and 0.776 respectively. Furthermore, the weighted averages of F1-score and PRC area under the curve (AUC) were 0.772 and 0.778 respectively. All these measures are given in Table 4. A comparison of RF with SVM and KNN is presented in Table 5. The results in Table 5 show that, among the three techniques tested, RF performed best on all 5 measures presented.
Leave-one-out (LOO) cross-validation was also performed and the result is given in Table 6. In Table 6, the weighted averages of accuracy and F1-score are 0.806 and 0.804 respectively, and the PRC area is 0.819. These results are satisfactory enough to say that hypertensive and normotensive subjects can be classified using this technique.

Discussion
The present experiment was conducted to classify hypertensive and normotensive subjects using differential biopotential signals collected from two identical bilateral locations on both hands. It was assumed that the progression of rest depends on the blood pressure of the human being. For this purpose, signals were acquired continuously for 10 minutes from a number of subjects belonging to both the hypertensive and normotensive classes. To capture the progression of rest, the 10-minute signals were divided into 5 consecutive states, or subsets, of 2 minutes each, with each subset representing a state of restfulness. A total of 250 time-domain features were calculated from the preprocessed signals. These parameters were either individual or bilateral in nature. After using the attribute selection filter in WEKA, only 6 parameters were selected. Of these, 5 are bilateral parameters. In addition, the selected attributes belong to the 2nd, 3rd and 4th states. This suggests that the attributes depend on the different states of restfulness.
RF classification was used for this experiment with 10-fold cross-validation and leave-one-out cross-validation. Accuracy and F1-score were greater than 0.77 in 10-fold cross-validation and greater than 0.80 in leave-one-out cross-validation. Two other classification models, KNN and SVM, were also evaluated and compared with RF. The results showed that RF can predict both classes more accurately than the other 2 techniques.

Conclusion
From this experiment it can be concluded that hypertensive and normotensive subjects can be classified, using random forest classification, from the time-domain features calculated during the progression of bilaterally acquired differential biopotential signals. In future, it may be possible to classify different stages of hypertension from this signal. A real-time device for automatic blood pressure measurement might also be developed, which could be very useful for telemedicine applications.

Acknowledgements
The first author is very thankful to CSIR for funding this research and to Jadavpur University for providing the research facilities.

Conflict of Interest
All three authors declare that they have no conflict of interest.

Ethical approval
All procedures performed in this study involving human participants were in accordance with the ethical standards of the Institutional Research Committee (IRC) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed consent
All the subjects involved in this study were aware of the procedure and purpose of the study. Informed consent was signed by all individual participants included in the study.