A. EEG dataset
We collected resting-state EEG signals from patients with epilepsy at the General Hospital of Western Theater Command PLA. EEG was recorded with 21 Ag-AgCl electrodes placed according to the international 10-20 system. All signals were sampled at 500 Hz and filtered with a 50-Hz notch filter and a 0.01–100-Hz bandpass filter. The Ethics Committee of the General Hospital of Western Theater Command PLA approved all experimental protocols. All participants provided written informed consent, and the study was performed in compliance with the tenets of the Declaration of Helsinki. Two senior epileptologists with extensive experience in the diagnosis and treatment of epilepsy established the diagnosis.
After careful review, two senior epileptologists with extensive experience in the diagnosis and treatment of epilepsy selected the patients to be included in the analyses: 15 E-AD patients (age, 18–57 years) and 11 E-no-AD patients (age, 19–46 years). The inclusion criteria for the E-AD patients were as follows: 1) the patient underwent an outpatient or inpatient video-EEG recording for ≤24 h, and two senior experts confirmed that the segments were recorded during the resting state; 2) a diagnosis of epilepsy was confirmed by two senior experts; 3) assessment by professional physicians using the Hamilton Mood Scale or the Self-Rating Anxiety/Depression Scale (SAS/SDS), which was applied to both groups [41, 42], identified an anxiety/depression disorder; and 4) the treatment did not involve any mood-altering drugs [43]. The E-no-AD patients were screened using the same criteria, except that no anxiety/depression disorder was detected.
For preprocessing, the data were first transformed to an approximate zero reference using the reference electrode standardization technique to reduce the reference effect [44]. The continuous EEG data of each patient were then randomly divided into five-second segments, and high-amplitude segments (>100 μV) were excluded. Finally, 6346 and 4416 EEG segments were retained for the E-AD and E-no-AD patients, respectively, for the next step.
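The segmentation and amplitude-rejection step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `segment_and_reject` is hypothetical, the windows are consecutive and non-overlapping (the text only says the data were "randomly divided"), and the >100 μV criterion is assumed to apply to the absolute amplitude on any channel.

```python
import numpy as np

def segment_and_reject(eeg_uv, fs=500, seg_seconds=5.0, max_abs_uv=100.0):
    """Split a (channels, samples) recording in microvolts into
    non-overlapping segments and drop high-amplitude segments."""
    eeg_uv = np.asarray(eeg_uv, dtype=float)
    seg_len = int(round(fs * seg_seconds))          # 2500 samples at 500 Hz
    n_segments = eeg_uv.shape[1] // seg_len
    # Discard the incomplete tail, then reshape to (segments, channels, seg_len).
    trimmed = eeg_uv[:, :n_segments * seg_len]
    segments = trimmed.reshape(eeg_uv.shape[0], n_segments, seg_len)
    segments = segments.transpose(1, 0, 2)
    # Assumption: a segment is rejected if any sample on any channel
    # exceeds the threshold in absolute value.
    keep = np.abs(segments).max(axis=(1, 2)) <= max_abs_uv
    return segments[keep]
```

Applied to a 21-channel recording, the function returns an array of shape `(kept_segments, 21, 2500)` that can be passed to the network-estimation step.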
B. Functional networks
Functional networks were estimated from the resting-state EEG signals of the epilepsy patients. We considered six well-known frequency bands: the full frequency band (0.1–100 Hz), the delta band (0.1–4 Hz), the theta band (4–8 Hz), the alpha band (8–13 Hz), the beta band (13–30 Hz), and the gamma band (30–100 Hz) [45].
Coherence (\(Coh\)) was used to measure the connection strength between each pair of electrodes and thereby establish the functional network. \(Coh\) is a commonly used metric for analyzing cooperative, synchrony-defined cortical neuronal assemblies. It quantifies the linear relationship between two signals, \(x\left(t\right)\) and \(y\left(t\right)\), at a specific frequency on the basis of their cross-spectrum. We adopted frequency-specific coherence to indicate the linkage strength between two network nodes. \(Coh\) was expressed using the following formula [46]:
$$Coh\left(f\right)=\frac{{\left|{P}_{xy}\left(f\right)\right|}^{2}}{{P}_{xx}\left(f\right){P}_{yy}\left(f\right)}$$
where \({P}_{xy}\left(f\right)\) is the cross-spectrum between \(x\left(t\right)\) and \(y\left(t\right)\), and \({P}_{xx}\left(f\right)\) and \({P}_{yy}\left(f\right)\) are the respective auto-spectra at frequency \(f\) estimated from the Welch-based spectrum at 0.1-Hz resolution. For each frequency band, the coherence matrices of all frequency points in this band were averaged to compute the coherence matrix.
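A band-averaged coherence matrix of this kind can be sketched with NumPy alone. This is a hedged illustration rather than the authors' code: the function names, the 2-s Hann window with 50% overlap, and the zero-padding to 5000 points (which yields a 0.1-Hz frequency grid at 500 Hz) are all assumptions, since the text does not state the Welch parameters.

```python
import numpy as np

def welch_coherence(x, y, fs=500, nperseg=1000, nfft=5000):
    """Magnitude-squared coherence |Pxy|^2 / (Pxx * Pyy) from
    Welch-averaged spectra (Hann window, 50 % overlap)."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    pxx = pyy = pxy = 0.0
    for start in range(0, len(x) - nperseg + 1, step):
        xs = x[start:start + nperseg]
        ys = y[start:start + nperseg]
        fx = np.fft.rfft(window * (xs - xs.mean()), n=nfft)
        fy = np.fft.rfft(window * (ys - ys.mean()), n=nfft)
        pxx = pxx + np.abs(fx) ** 2      # auto-spectrum of x
        pyy = pyy + np.abs(fy) ** 2      # auto-spectrum of y
        pxy = pxy + np.conj(fx) * fy     # cross-spectrum
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, np.abs(pxy) ** 2 / (pxx * pyy)

def band_coherence_matrix(segment, band, fs=500):
    """Average coherence over all frequency points inside `band`
    for every electrode pair of a (channels, samples) segment."""
    n_ch = segment.shape[0]
    low, high = band
    mat = np.zeros((n_ch, n_ch))         # diagonal is left at zero
    for i in range(n_ch):
        for j in range(i + 1, n_ch):
            freqs, coh = welch_coherence(segment[i], segment[j], fs=fs)
            in_band = (freqs >= low) & (freqs <= high)
            mat[i, j] = mat[j, i] = coh[in_band].mean()
    return mat
```

For example, `band_coherence_matrix(segment, (8.0, 13.0))` would give an alpha-band network for one five-second segment. Averaging over several windows is essential here, because coherence estimated from a single window is identically 1.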
C. Network Properties
To characterize the network topology, we used four network measures: clustering coefficient, characteristic path length, global efficiency, and local efficiency. We calculated the clustering coefficient (\(CC\)) as follows [47]:
$$CC=\frac{1}{N}\sum _{i\in N}\frac{\sum _{j,h\in N}{\left({\omega }_{ij}{\omega }_{ih}{\omega }_{jh}\right)}^{\frac{1}{3}}}{{k}_{i}\left({k}_{i}-1\right)}$$
where \({k}_{i}\) is the degree of node \(i\), and \({\omega }_{ij}\) is the weight between nodes \(i\) and \(j\) in the network. The characteristic path length (\(L\)) of a network, where \({d}_{ij}\) is the shortest path length between nodes \(i\) and \(j\), was calculated as follows:
$$L=\frac{1}{N}\sum _{i\in N}\frac{\sum _{j\in N,i\ne j}{d}_{ij}}{N-1}$$
The global efficiency (\({E}_{g}\)) was computed using the following formula [48]:
$${E}_{g}=\frac{1}{N}\sum _{i\in N}\frac{\sum _{j\in N,i\ne j}{{d}_{ij}}^{-1}}{N-1}$$
The local efficiency (\({E}_{i}\)) of node \(i\) was defined as follows:
$${E}_{i}=\frac{1}{2}\sum _{i\in N}\frac{\sum _{j,h\in N,j\ne i}{\left({\omega }_{ij}{\omega }_{ih}{\left[{d}_{jh}\left({N}_{i}\right)\right]}^{-1}\right)}^{\frac{1}{3}}}{{k}_{i}\left({k}_{i}-1\right)}$$
We used the Brain Connectivity Toolbox (http://www.brain-connectivity-toolbox.net/, Rubinov et al.) to calculate the above network properties. More detailed descriptions of the network topology properties are given in [48].
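To make the definitions of \(CC\), \(L\), and \({E}_{g}\) concrete, the following NumPy sketch evaluates them on a weighted, symmetric adjacency matrix with a zero diagonal. It is an illustration only and not the toolbox's implementation. The function names are hypothetical, and the mapping from connection weight to path length (length = 1/weight) is an assumption.

```python
import numpy as np

def shortest_path_lengths(weights):
    """All-pairs shortest path lengths d_ij (Floyd-Warshall).
    Assumption: the length of an edge is the inverse of its weight."""
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    dist = np.full((n, n), np.inf)
    connected = w > 0
    dist[connected] = 1.0 / w[connected]
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        # Allow paths that pass through node k.
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

def path_length_and_global_efficiency(weights):
    """Characteristic path length L and global efficiency E_g."""
    dist = shortest_path_lengths(weights)
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    char_path = dist[off_diag].sum() / (n * (n - 1))
    global_eff = (1.0 / dist[off_diag]).sum() / (n * (n - 1))
    return char_path, global_eff

def clustering_coefficient(weights):
    """Mean weighted clustering coefficient CC."""
    w = np.asarray(weights, dtype=float)
    degree = (w > 0).sum(axis=1)
    cube_root = np.cbrt(w)
    # (A^3)_ii with A = w^(1/3) equals sum_{j,h} (w_ij w_ih w_jh)^(1/3).
    triangles = np.diagonal(cube_root @ cube_root @ cube_root)
    denom = degree * (degree - 1)
    local = np.zeros(len(w))
    # Nodes with fewer than two neighbours get a coefficient of zero.
    np.divide(triangles, denom, out=local, where=denom > 0)
    return local.mean()
```

On a fully connected three-node network with unit weights, `clustering_coefficient` returns 1, since every pair of neighbours is itself connected with maximal weight.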
D. Principal Component Analysis
Principal component analysis (PCA) aims to transform a number of correlated variables into a significantly smaller number of uncorrelated variables, called principal components. It has a wide range of applications, such as signal de-noising, cluster analysis, feature reduction, and pattern recognition.
Let the centered input vectors be \({x}_{t}\) (\(t=1,\dots ,l\), with \(\sum _{t=1}^{l}{x}_{t}=0\)), each of dimension \(m\), defined by \({x}_{t}={\left[{x}_{t}\left(1\right),{x}_{t}\left(2\right),\dots ,{x}_{t}\left(m\right)\right]}^{T}\) (usually \(m<l\)). PCA linearly transforms each vector \({x}_{t}\) into a new vector \({s}_{t}\) as:
$${s}_{t}={U}^{T}{x}_{t},$$
where \(U\) is the \(m\times m\) orthogonal matrix whose \(i\)-th column \({u}_{i}\) is the \(i\)-th eigenvector of the sample covariance matrix \(C\) of the vectors \({x}_{t}\). The eigenvalues and eigenvectors were obtained by solving the following eigenvalue problem:
$${\lambda }_{i}{u}_{i}=C{u}_{i},$$
where \({\lambda }_{i}\) is an eigenvalue of \(C\) and \({u}_{i}\) is the corresponding eigenvector. Then, the principal components of \({s}_{t}\) could be computed as follows:
$${s}_{t}\left(i\right)={u}_{i}^{T}{x}_{t}.$$
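The PCA steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the function name `pca_components` is hypothetical, the covariance is normalized by \(l\), and how the network matrices are vectorized into the rows of `X` is not specified in the text.

```python
import numpy as np

def pca_components(X, n_components):
    """PCA by eigen-decomposition of the sample covariance matrix.
    X has shape (l, m): l observations of m-dimensional vectors x_t."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)               # enforce sum_t x_t = 0
    C = Xc.T @ Xc / Xc.shape[0]           # m x m covariance (1/l normalization)
    eigvals, eigvecs = np.linalg.eigh(C)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    U = eigvecs[:, order]                 # columns are the leading u_i
    S = Xc @ U                            # row t holds s_t(i) = u_i^T x_t
    return eigvals[order], U, S
```

The columns of `S` are mutually uncorrelated, and the variance of column \(i\) equals \({\lambda }_{i}\), which is why keeping only the leading columns gives a compact feature set.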
E. Spatial Patterns Networks
Common spatial pattern (CSP) analysis was proposed in the early 1990s to distinguish between normal and abnormal EEG or EEG components [33, 34]. SPN primarily aims to identify the common spatial patterns among different weighted brain network topologies. Therefore, unlike canonical CSP, the spatial pattern extracted by SPN lies not in the physical data space but in the network space [49].
Let \({\phi }_{1}\) and \({\phi }_{2}\) be the \(N\times N\) centered matrices for each subject; the spatial filters are the projections that maximize the following function [10, 34]:
$$J\left(\omega \right)=\frac{{\omega }^{T}{\phi }_{1}^{T}{\phi }_{1}\omega }{{\omega }^{T}{\phi }_{2}^{T}{\phi }_{2}\omega }=\frac{{\omega }^{T}{\varPhi }_{1}\omega }{{\omega }^{T}{\varPhi }_{2}\omega }$$
Here \({\varPhi }_{1}\) and \({\varPhi }_{2}\) are the covariance matrices of the adjacency matrix for the two groups. The objective function can be written as follows upon the introduction of the Lagrange multiplier:
$$L\left(\omega , \lambda \right)={\omega }^{T}{\varPhi }_{1}\omega -\lambda ({\omega }^{T}{\varPhi }_{2}\omega -1)$$
Setting \(\frac{\partial L}{\partial \omega }=0\) gives \({\varPhi }_{1}\omega =\lambda {\varPhi }_{2}\omega\), so the objective projection \(\omega\) can be estimated from the generalized eigenvalue equation:
$${\varPhi }_{2}^{-1}{\varPhi }_{1}W=W\Sigma$$
where \(W\) is the matrix whose columns are the eigenvectors of \({\varPhi }_{2}^{-1}{\varPhi }_{1}\), and \(\Sigma =\mathrm{diag}\left({\lambda }_{1},{\lambda }_{2},\dots ,{\lambda }_{m}\right)\) contains the corresponding eigenvalues [49].
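The generalized eigenvalue problem can be solved numerically as sketched below. This is a hedged sketch, not the published SPN implementation: the function names are hypothetical, the group covariance is assumed to be the average of \({\phi }^{T}\phi\) over column-centered segment networks, a small ridge term is added to keep \({\varPhi }_{2}\) invertible, and the problem is solved through a Cholesky whitening step so that the eigenvalues are guaranteed to be real.

```python
import numpy as np

def group_covariance(networks, ridge=1e-6):
    """Average phi^T phi over a (segments, N, N) stack of adjacency matrices."""
    nets = np.asarray(networks, dtype=float)
    centered = nets - nets.mean(axis=1, keepdims=True)   # remove column means
    cov = np.einsum("sij,sik->jk", centered, centered) / nets.shape[0]
    # Assumption: a small ridge keeps the matrix positive definite.
    n = cov.shape[0]
    return cov + ridge * np.trace(cov) / n * np.eye(n)

def spn_filters(phi1, phi2):
    """Solve Phi1 W = Phi2 W Sigma, with eigenvalues sorted in descending order."""
    chol = np.linalg.cholesky(phi2)          # phi2 = chol @ chol.T
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ phi1 @ chol_inv.T  # symmetric, same eigenvalues
    eigvals, vecs = np.linalg.eigh(whitened)
    order = np.argsort(eigvals)[::-1]
    W = chol_inv.T @ vecs[:, order]          # map back to the original space
    return eigvals[order], W
```

Columns of `W` associated with the largest and smallest eigenvalues maximize and minimize \(J\left(\omega \right)\), respectively, so filters from both ends of the spectrum are the natural candidates for discriminative features.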
F. Pattern Recognition
To distinguish E-AD patients from E-no-AD patients, we performed feature extraction and analysis on the collected EEG datasets. As schematically shown in Fig. 1, we constructed the functional networks from the resting-state EEG segments. We then extracted three types of features: four traditional network properties (characteristic path length, clustering coefficient, local efficiency, and global efficiency), the PCA features of the network, and the SPN features of the network. We compared the classification performance of these features to identify those that effectively distinguish between E-no-AD and E-AD patients. To explore the underlying mechanism, these comparisons were repeated in the different frequency bands. For each feature, we used a support vector machine (SVM) classifier to learn the feature distribution [50], and we implemented a grid search to determine the optimal set of parameters. Each processing step was applied to the testing set in the same way as to the training set [51]. To ensure reproducibility of the results, prediction performance was evaluated using 5-fold cross-validation.
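One way to combine an SVM, a grid search, and 5-fold cross-validation while keeping the test folds untouched during fitting is sketched below with scikit-learn. The text does not state the kernel, the parameter grid, any feature scaling, or the software used, so the RBF kernel, the grid values, the `StandardScaler` step, and the function name `cross_validated_predictions` are all assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def cross_validated_predictions(features, labels, seed=0):
    """Return one out-of-fold prediction per segment (5 outer folds)."""
    # Scaling lives inside the pipeline, so it is fitted on training folds
    # only and then applied unchanged to the corresponding test fold.
    pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    # Hypothetical grid; the values are illustrative only.
    param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    # The grid search runs inside each outer training fold.
    search = GridSearchCV(pipeline, param_grid, cv=inner)
    return cross_val_predict(search, features, labels, cv=outer)
```

Nesting the grid search inside the outer folds means the parameters are never tuned on the segments that are later used for testing.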
G. Statistical Testing
We used the two-sample Student’s t-test to compare the connectivity strengths represented by the trial-averaged networks, with p < 0.01 as the threshold for significance. The same test was used to assess the differences in network properties between E-AD patients and E-no-AD patients.
To evaluate the classification performance of the different features, the 5-fold cross-validation strategy was used for the testing process. In each evaluation, four-fifths of the segments in the dataset were used for training and the remaining segments were used for testing. This process was repeated until all segments had served as testing data. We used three evaluation metrics, namely specificity (\(SPE\)), sensitivity (\(SEN\)), and accuracy (\(ACC\)), to assess the classification performance. Sensitivity and specificity are the complements of the missed-diagnosis and misdiagnosis rates of E-AD patients, respectively, so higher values indicate lower error rates and better overall performance. Accuracy represents the proportion of all cases that are correctly identified.
$$ACC=\frac{{n}_{AD}+{n}_{NAD}}{{N}_{AD}+{N}_{NAD}}\times 100\%$$
$$SEN=\frac{{n}_{AD}}{{N}_{AD}}\times 100\%$$
$$SPE=\frac{{n}_{NAD}}{{N}_{NAD}}\times 100\%$$
where \({N}_{AD}\) and \({N}_{NAD}\) represent the total numbers of EEG segments from E-AD and E-no-AD patients, respectively, and \({n}_{AD}\) and \({n}_{NAD}\) represent the numbers of correctly identified EEG segments from these patients, respectively.
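The three metrics follow directly from counts of correct predictions per class. The sketch below is illustrative: the function name and the label coding (1 for E-AD, 0 for E-no-AD) are assumptions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """ACC, SEN, and SPE in percent; `positive` marks the E-AD class."""
    pairs = list(zip(y_true, y_pred))
    n_ad_total = sum(1 for t, _ in pairs if t == positive)
    n_nad_total = len(pairs) - n_ad_total
    # Correctly identified E-AD and E-no-AD items.
    n_ad = sum(1 for t, p in pairs if t == positive and p == positive)
    n_nad = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "ACC": 100.0 * (n_ad + n_nad) / (n_ad_total + n_nad_total),
        "SEN": 100.0 * n_ad / n_ad_total,
        "SPE": 100.0 * n_nad / n_nad_total,
    }
```

As a worked case, with 4 E-AD and 6 E-no-AD items of which 3 and 4 are correctly identified, the formulas give \(SEN=75\%\), \(SPE\approx 66.7\%\), and \(ACC=70\%\).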