Prediction Model for Potential Depression using Sex- and Age-reflected qEEG Biomarkers.

Depression is a mental disorder that is prevalent in modern society, causing many people to suffer or even commit suicide. Psychiatrists and psychologists typically diagnose depression using representative tests such as the Beck's Depression Inventory (BDI) and the Hamilton Depression Rating Scale (HDRS) in conjunction with patient consultations. Traditional tests, however, are time-consuming, can be learned by patients, and involve considerable clinician subjectivity. In the present study, we trained machine learning models using sex- and age-reflected z-score values of qEEG indicators, based on data from the Data Center for Korean EEG with 116 potential depression subjects and 80 healthy controls. The classification model distinguished the potential depression group from the normal group with a test accuracy of up to 92.31% and a 10-fold cross-validation loss of 0.13. This performance suggests that a model using z-score qEEG metrics that account for sex and age is an objective and reliable method for detecting potential depression.


Introduction
Depression is a major cause of the global disease burden and can be a life-threatening mental disorder [1]. The World Health Organization (WHO) reported that more than 300 million people worldwide suffer from depression.
A bigger problem is that the procedure for diagnosing depression is complicated. Diagnosis is usually made through interviews with physicians, accompanied by tests such as the Beck's Depression Inventory (BDI) or the Hamilton Depression Rating Scale (HDRS). However, this process is time-consuming, burdensome for the patient, and reflects much of the physician's personal subjectivity.
Recently, many studies have sought biomarkers for depression based on brain activity in order to diagnose it in a more objective and time-saving way [2,3]. Among the several methods of measuring brain activity, noninvasive EEG is best suited as a quick and simple way to diagnose depression. EEG also has several advantages: it is less time-consuming and more cost-efficient than neuroimaging methods such as functional magnetic resonance imaging (fMRI).
Band power is the most representative indicator in EEG. Prior work showed that band power was the most frequently studied biomarker of depression, at 25% of the total, followed by Alpha asymmetry (20.8%) and evoked potentials (18.8%) [1]. Findings regarding band power have also been actively reported, with the Alpha band accounting for a large portion of the important features among them [4][5][6][7]. Another biomarker for depression is Alpha asymmetry, which is obtained from the difference in alpha band power between the brain hemispheres. Several findings on Alpha asymmetry as a biomarker of depression have been reported recently [8][9][10].
Predicting the result of a disease diagnosis with a classification model is one way to discover influential biomarkers. The more influential the biomarkers are, the better the model performs.
Indeed, prior studies that detected depression using artificial neural networks and achieved accuracies above 90% have already been reported [11,12]. However, most previous studies are limited in that the number of subjects was under 15 per group, and all subjects had already been clinically diagnosed with major depressive disorder (MDD). Furthermore, the models are relatively complex, and overfitting is a concern because non-linear features were fed to an Artificial Neural Network (ANN) trained with numerous parameters.
In the present study, we separated potentially depressed people from healthy people in the database using optimal BDI criteria [13]. We propose a reliable predictive model of potential depression based on a sufficient number of subjects and a very simple z-scored band power. The significance of our study lies in predicting potential depression among those who have not been clinically diagnosed.

Data
All data were obtained from the Data Center for Korean EEG. The data center holds approximately 1,700 EEG recordings (called normative data) and is linked to iSyncBrain. The experimental procedure was approved by the Research Ethics Committee of Seoul National University, and informed consent was signed by each participant prior to recording. All methods were carried out in accordance with relevant guidelines and regulations.

Subjects
Based on the Beck's Depression Inventory (BDI) cut-off criteria [13], a total of 196 subjects were selected from the Data Center for Korean EEG (BDI cut-off: 14.48), including 116 subjects with potential depression (male = 23, female = 95, age = 58.66 ± 15.08 years, BDI = 21.17 ± 6.28) and 80 healthy controls (male = 44, female = 36, age = 48.66 ± 16.71 years, BDI = 0). None of the subjects was taking medication, had ever been diagnosed with a mental illness, or had ever visited a hospital for depression. Hereafter, the group of 116 potential depression subjects is referred to as the potential depression group and the group of 80 healthy controls as the normal group.

Preprocessing
EEG preprocessing was performed using the denoising algorithm in iSyncBrain (iMediSync, Inc., Korea, https://isyncbrain.com/). The raw EEG data were filtered with a notch filter. The low and high cut-off frequencies were 1 Hz and 45 Hz, respectively. Re-referencing was performed using the common average reference (CAR). Artifacts were removed by bad epoch rejection and an Independent Component Analysis (ICA)-based algorithm.
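As an illustration of the re-referencing step only, the following is a minimal sketch of a common average reference, assuming the EEG is held as a list of equally long channels. It is not the iSyncBrain implementation, and the function name and toy values are hypothetical.

```python
def common_average_reference(eeg):
    """Subtract the across-channel mean from every channel at each time sample.

    eeg: list of channels, each a list of samples of equal length.
    """
    if not eeg:
        raise ValueError("at least one channel is required")
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    if any(len(channel) != n_samples for channel in eeg):
        raise ValueError("all channels must have the same number of samples")
    # Mean over channels, computed separately for each time sample.
    means = [sum(channel[t] for channel in eeg) / n_channels
             for t in range(n_samples)]
    return [[channel[t] - means[t] for t in range(n_samples)]
            for channel in eeg]


# Toy data: 3 channels x 4 samples (hypothetical values, not from the study).
raw = [[1.0, 2.0, 3.0, 4.0],
       [2.0, 2.0, 2.0, 2.0],
       [3.0, 5.0, 1.0, 0.0]]
car = common_average_reference(raw)
```

After re-referencing, the channels sum to zero at every time sample, which is the defining property of this reference.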
Absolute band power is the spectral band power based on the fast Fourier transform (FFT), as provided by iSyncBrain. Relative band power is the absolute power in a specific frequency band divided by the total power. We first performed the Shapiro-Wilk test or the Kolmogorov-Smirnov test for normality, and then performed an independent t-test or the Mann-Whitney U test to test for a significant difference in band power between groups for each frequency band.
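The definition of relative band power can be sketched as follows. This is a minimal example that assumes the total power is the sum of the absolute powers over the analysed bands. The text does not spell out how the total is computed, and the band names other than alpha2, beta2, and beta3, as well as all numeric values, are hypothetical.

```python
def relative_band_power(absolute_power):
    """Divide each band's absolute power by the total power over all bands.

    absolute_power: dict mapping band name -> absolute power (e.g. in uV^2).
    Assumption: total power is the sum over the bands given here.
    """
    total = sum(absolute_power.values())
    if total <= 0:
        raise ValueError("total power must be positive")
    return {band: power / total for band, power in absolute_power.items()}


# Hypothetical absolute powers for one channel (values are illustrative only).
abs_power = {"delta": 8.0, "theta": 4.0, "alpha1": 3.0, "alpha2": 2.0,
             "beta1": 1.5, "beta2": 1.0, "beta3": 0.5}
rel_power = relative_band_power(abs_power)
```

Because the relative powers of one channel sum to 1, an increase in one band necessarily lowers the relative power of the others, which is worth keeping in mind when absolute and relative results differ.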

Feature extraction and selection
Four distinct features were obtained from the band powers: Absolute band power, Relative band power, Absolute z-scored band power, and Relative z-scored band power. The gamma band (30-45 Hz) was excluded from the analysis because overall feature importance increased when it was removed. To remove the differences in sex and age between groups, we matched each subject's sex and age to data in the Data Center for Korean EEG and calculated the z-scored band powers, that is, Absolute z-scored band power and Relative z-scored band power.
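A sex- and age-matched z-score of this kind could be computed roughly as below. This is a sketch under stated assumptions: the normative records are simple dictionaries, matching means same sex and age within a fixed window, and the sample standard deviation is used. The record layout, the 5-year window, and all values are hypothetical, and the matching rule actually used by the data center is not described in the text.

```python
from statistics import mean, stdev


def z_score_against_norm(value, sex, age, norm_records, age_window=5):
    """Z-score a band power against normative subjects of the same sex and similar age.

    norm_records: list of dicts with keys "sex", "age", "power" (hypothetical layout).
    age_window: maximum age difference in years for a match (assumed value).
    """
    matched = [r["power"] for r in norm_records
               if r["sex"] == sex and abs(r["age"] - age) <= age_window]
    if len(matched) < 2:
        raise ValueError("not enough matched normative records")
    mu = mean(matched)
    sd = stdev(matched)  # sample standard deviation
    if sd == 0:
        raise ValueError("matched normative values have zero variance")
    return (value - mu) / sd


# Hypothetical normative records (not real data).
norm = [
    {"sex": "F", "age": 58, "power": 10.0},
    {"sex": "F", "age": 60, "power": 12.0},
    {"sex": "F", "age": 62, "power": 14.0},
    {"sex": "M", "age": 59, "power": 30.0},  # excluded: different sex
    {"sex": "F", "age": 30, "power": 50.0},  # excluded: outside age window
]
z = z_score_against_norm(16.0, "F", 60, norm)
```

In this toy case the matched normative values are 10, 12, and 14, so a subject value of 16 lies two standard deviations above the matched mean.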
A total of 532 features (4 kinds x 19 channels x 7 bands) were extracted as candidates for the final features. To select the final features, we computed feature importance by summing the changes in mean squared error due to splits on each feature and dividing the sum by the number of branch nodes in tree-based ensemble models. A total of six tree-based ensemble models were used to compute feature importance: Adaptive logistic regression, Adaptive boosting, Gentle adaptive boosting, Robust boosting, Bootstrap aggregating, and Totally corrective boosting. Once feature importance had been calculated in each model, we adopted the intersection of the highest-scoring features across the models as the final features (Fig. 2).
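The intersection step can be sketched as follows, assuming each model's importance scores are already available as a dictionary and that T is the number of top-ranked features kept per model. The model names, feature names, and scores below are hypothetical, and the sketch does not reproduce how the importances themselves were computed.

```python
def top_features(importance, t):
    """Return the set of the t highest-scoring feature names for one model."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return set(ranked[:t])


def select_final_features(importances_by_model, t):
    """Intersect the top-t feature sets of all models."""
    if not importances_by_model:
        raise ValueError("at least one model is required")
    top_sets = [top_features(imp, t) for imp in importances_by_model.values()]
    return set.intersection(*top_sets)


# Hypothetical importance scores for three models and four features.
scores = {
    "model_a": {"f1": 0.9, "f2": 0.5, "f3": 0.4, "f4": 0.1},
    "model_b": {"f1": 0.7, "f2": 0.2, "f3": 0.6, "f4": 0.3},
    "model_c": {"f1": 0.6, "f2": 0.1, "f3": 0.8, "f4": 0.5},
}
final_t2 = select_final_features(scores, t=2)
final_t3 = select_final_features(scores, t=3)
```

Raising T can only keep or enlarge the intersection, so the threshold directly controls how many final features survive.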

Model Training
In model training, 80% of the total data was used for training and 20% for testing.
Figure 3 shows topomaps of the frequency bands that differ significantly between groups. The spectral power of each group is the average over the subjects in that group. The potential depression group had significantly higher power in beta2 and beta3, in both absolute and relative band power, than the normal group (p < 0.05). However, the potential depression group had significantly lower relative band power in alpha2 (p < 0.05). Beta2 and beta3 showed significant differences in almost all brain areas, while alpha2 showed significant differences mainly in the frontal, temporal, and parietal regions.

Classification model performance
The performance of the binary classification models according to the number of final features is shown in Table 1. The best classification results were obtained with AdaboostM1 using 21, 23, and 28 features, each reaching a test accuracy of 92.31%. The 10-fold cross-validation losses for these results were 0.14, 0.17, and 0.13, respectively. Sensitivity and specificity were 0.88 and 1, respectively. The highest test accuracy together with the lowest cross-validation loss was obtained when using 28 features in the AdaboostM1 model. Table 2 lists the 28 features used in the AdaboostM1 model. Out of the 28 features, 15 were selected from the relative z-scored band power. At the frequency band level, beta and alpha bands were the most common, with 11 and 8 features, respectively, and at the brain level, the frontal and temporal areas were the most common, with 8.

Table 1
Comparison of model performance according to the number of final features. Each model's performance is given as test accuracy. The row with the highest test accuracy and the corresponding number of features are marked in bold.

Table 2
Information on the 28 features used in the AdaboostM1 model. Abs and Rel represent Absolute band power and Relative band power, respectively. Abs_zscore and Rel_zscore represent Absolute z-scored band power and Relative z-scored band power, respectively.
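The relationship between the reported accuracy, sensitivity, and specificity can be checked with a short calculation. The test-set confusion matrix is not reported, so the counts below are purely hypothetical: they assume a test set of 39 subjects (about 20% of 196) and were chosen only because they are consistent with the reported figures. They should not be read as the study's actual counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts (assumption, not reported in the study):
# 25 potential depression and 14 normal subjects in the test set.
metrics = classification_metrics(tp=22, fn=3, tn=14, fp=0)
accuracy_percent = round(metrics["accuracy"] * 100, 2)
```

Under these assumed counts, all three misclassifications are potential depression subjects predicted as normal, which is how a specificity of 1 can coexist with a sensitivity of 0.88.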

Discussion
In the present study, the potential depression group showed higher beta2 and beta3 band power and lower alpha2 power than the normal group, which is consistent with the results of a prior study [16]. Our results support the role of the alpha and beta bands as biomarkers for the diagnosis of depression. Given the proportion of statistical tests that reached significance, beta power is more likely to be an important feature than alpha power. In addition, alpha2 was not significant in absolute power but showed significant differences in relative power. This implies large variability among the subjects in each group, suggesting the need for regularized indicators such as z-scored power.
We constructed a low-complexity and reliable predictive model of potential depression using band power that accounts for sex and age, for a total of 196 subjects. In terms of classification accuracy, Logit boost and tree-based ensemble models showed superiority over other models (see Table 1). This is presumably because the final features were selected using feature importance obtained from tree-based ensemble models. Among the 28 features, beta and alpha bands accounted for the largest proportion, consistent with the results of the group analysis (see Table 2). In addition, the z-scored power, which accounts for sex and age, made up a much larger share than the typical band power. This suggests that sex and age effects are not negligible in EEG data and should be regularized. The z-scored power that compensates for these sex and age effects may act as a biomarker with a large influence on the prediction of potential depression. In a further study, we will independently construct a model that adds other features such as Alpha asymmetry or source-level features [17,18] and extend it to a multi-class classifier, with a focus on performance.

Figure 1
Montage of the international 10-20 system

Figure 2
Procedure for calculating the feature importance of each feature in each ensemble model. T denotes the threshold on the number of highest-scoring features for each model, and ∩ denotes the intersection of the highest-scoring features across the models.

Figure 3
Topomaps of the frequency bands that differ significantly between groups. The unit of spectral power is μV². (A) and (B) represent the Absolute power and Relative power of each group, respectively.