Graph theoretical analysis based on EEG effective connectivity in ADHD children

This paper reports a new method to identify children with ADHD using EEG signals and effective connectivity techniques. In this study, the original EEG data are pre-filtered and divided into delta, theta, alpha and beta bands. The effective connectivity graphs are then constructed by applying independent component analysis, a multivariate autoregressive model and the phase slope index. The graph theory measures of clustering coefficient, nodal efficiency and degree centrality are used to extract features from these graphs. Statistical analysis based on the standard error of the mean is employed to evaluate the graph theory measures in each frequency band. The results show a decreased average clustering coefficient in the delta band for ADHD subjects. Also, in the delta band, the ADHD subjects have increased nodal efficiency and degree centrality in the left forehead region and decreased values in the forehead midline.


Introduction
Attention deficit hyperactivity disorder (ADHD) is a heterogeneous disorder with a high prevalence. Its prevalence among children worldwide is estimated to be 8%-12%, and about 60% of the symptoms and their effects continue into adulthood [1,2]. In recent years, researchers have used brain computed tomography (CT), magnetic resonance imaging (MRI), functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), electroencephalography (EEG) and other methods to conduct in-depth research on the microstructure of the brain [3][4][5][6]. The development of these detection methods has encouraged researchers to gain deeper insight into ADHD [7]. The brain structure, cerebral blood flow, brain electrical activity, gene structure, executive ability and cognitive ability of children with ADHD are significantly different from those of normal children, and these differences persist for a long time [8]. EEG is a popular and widely used measurement technique for extracting information from the brain. It is also a noninvasive technology for detecting neural discharge with millisecond-level temporal resolution [9]. Visual inspection of the EEG signal is one method used to detect ADHD [10][11][12].
One of the methods for detecting ADHD is functional connectivity. Functional connectivity represents the statistical correlation over time between the functional activities of different brain regions, and it is calculated from the EEG signal [13][14][15]. Effective connectivity, also called directed functional connectivity, provides causal information between brain regions [16][17][18]. At present, most methods of calculating effective connectivity are based on parametric models, such as the Granger causality model, the dynamic causality model and the multivariate autoregressive (MVAR) model [19][20][21].
Graph theory is the main mathematical tool used in brain connectivity analysis; its measures describe the local and global features of the brain network [22]. Most studies of brain connectivity in ADHD children have focused on degree, clustering coefficient, shortest path length, centrality and efficiency [2,23].
The MVAR model is used to fit the Granger causality model because it can infer causal information between brain regions [24]. However, the Granger causality algorithm is limited in constructing effective connectivity because it is affected by the volume conduction problem [25]. To overcome this limitation, we used the phase slope index (PSI) measure to construct the effective connectivity in our study. The PSI is usually computed from a power spectrum estimate because the algorithm relies on the change of phase difference as a function of frequency [26]. To obtain a higher frequency resolution in the frequency-domain output, we used the MVAR model to estimate the spectrum.
In this study, we used the PSI algorithm based on the MVAR model to construct the effective connectivity.
We treated the connectivity matrix as a graph and extracted features with three graph theory measures: clustering coefficient, nodal efficiency and degree centrality. Finally, we used the standard error of the mean (SEM) to compare the ADHD and healthy control (HC) groups statistically.

Method
The framework of the proposed method is shown in Fig. 1, which indicates how effective connectivity is used to distinguish ADHD from HC subjects. First, 19-channel raw EEG data are collected from ADHD and HC subjects during a visual attention task. Then, pre-processing is conducted to remove artifacts and noise from the raw data. The MVAR model and the PSI algorithm are applied to construct a connectivity matrix (19 × 19 × 4) in the delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz) and beta (13-30 Hz) frequency bands for each subject. Furthermore, graph theoretical analysis is used to extract features from the connectivity matrix (graph), namely the clustering coefficient, nodal efficiency and degree centrality. Finally, we conducted the statistical analysis by applying the SEM algorithm.

Dataset
To validate the performance of the classification between ADHD and HC in this study, EEG data for ADHD and HC subjects from IEEE DataPort (DOI: 10.21227/rzfh-zn36) were used. The participants were 30 children diagnosed with ADHD according to DSM-IV criteria (15 boys and 15 girls, ages 7-12), provided by Roozbeh Hospital in Tehran, Iran, and 30 HC subjects (15 boys and 15 girls, ages 7-12) recruited from a primary school. None of the children in the HC group had a history of psychiatric or brain disorders such as epilepsy, major medical illness, or any report of high-risk behaviors.

EEG pre-processing
EEG signals were band-pass filtered between 0.5 Hz and 50 Hz using a zero-phase finite impulse response (FIR) filtering algorithm. Independent component analysis (ICA), which assumes statistically independent sources, was then used to remove blinks, eye movements and other artifacts. Finally, we re-referenced the data to the average of all scalp channels.

Multivariate autoregressive (MVAR) models
The MVAR model is an extension of the autoregressive (AR) model to multi-dimensional variables, and it can infer the direction and causal relationship of connections between brain regions [21]. The model expresses the current sample vector X(n) as a weighted sum of the p previous sample vectors, with coefficient matrices A(k), plus a noise term W(n), which is a zero-mean Gaussian noise process with covariance matrix Σ. Here we use the multichannel Yule-Walker equations to relate the coefficient matrices A(k) to the covariance matrix Σ because of their simple calculation and good performance [27]. Thus, the fitted model yields a 19 × 19 × p coefficient array for each subject.
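For reference, a p-order MVAR model for M channels is conventionally written in the form below. This is the standard textbook form using the symbols of this section, offered as a sketch of the notation rather than the authors' exact equation; M = 19 in this study.

```latex
\[
X(n) = \sum_{k=1}^{p} A(k)\, X(n-k) + W(n),
\qquad X(n) \in \mathbb{R}^{M},\quad A(k) \in \mathbb{R}^{M \times M},\quad
W(n) \sim \mathcal{N}(0, \Sigma)
\]
```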
Another key parameter of the MVAR model is its order. The choice of order is closely related to the quality of the model fit. An order that is too small cannot make full use of the information in the observed data for accurate fitting. An order that is too large causes overfitting and increases the computational cost.
In this study, the Akaike Information Criterion (AIC) is used to select the order of the MVAR model [28].
In the AIC, Σ(p) represents the covariance matrix of the fitting error of the p-order model, and N represents the total number of data points used for model fitting. According to the AIC, p = 5 was selected as the model order.
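The order selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits the model by ordinary least squares instead of the Yule-Walker equations used in the paper, and it assumes the common multivariate form AIC(p) = ln det Σ(p) + 2pM²/N. The function names `fit_mvar`, `aic` and `select_order`, and the simulated two-channel data, are hypothetical.

```python
import numpy as np

def fit_mvar(x, p):
    """Least-squares fit of a p-order MVAR model (assumed substitute for Yule-Walker).

    x has shape (n_samples, n_channels). Returns (A, sigma): A has shape
    (p, n_channels, n_channels) and sigma is the residual covariance matrix.
    """
    n, m = x.shape
    # Regressors: for each time step t >= p, stack x[t-1], ..., x[t-p] side by side.
    lagged = np.hstack([x[p - k:n - k] for k in range(1, p + 1)])
    target = x[p:]
    coef, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    resid = target - lagged @ coef
    sigma = resid.T @ resid / len(target)
    # coef rows are grouped by lag; transpose each block so A[k-1] maps x[t-k] to x[t].
    A = coef.reshape(p, m, m).transpose(0, 2, 1)
    return A, sigma

def aic(sigma, p, m, n_eff):
    """Assumed AIC form: ln det(sigma) + 2 * p * m^2 / n_eff."""
    _, logdet = np.linalg.slogdet(sigma)
    return logdet + 2.0 * p * m * m / n_eff

def select_order(x, max_order):
    """Return the order in 1..max_order with the lowest AIC, plus all scores."""
    m = x.shape[1]
    scores = {}
    for p in range(1, max_order + 1):
        _, sigma = fit_mvar(x, p)
        scores[p] = aic(sigma, p, m, len(x) - p)
    return min(scores, key=scores.get), scores

# Simulated first-order process with two channels (illustrative data only).
rng = np.random.default_rng(0)
a_true = np.array([[0.5, 0.0], [0.4, 0.5]])
x = np.zeros((2000, 2))
for t in range(1, len(x)):
    x[t] = a_true @ x[t - 1] + rng.standard_normal(2)

a_hat, sigma_hat = fit_mvar(x, 1)
best_p, scores = select_order(x, 6)
```

With real recordings, `x` would hold the pre-processed 19-channel EEG of one subject, and the candidate orders would be scanned in the same way.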
We also need to obtain frequency-domain data through coherent spectrum estimation, for which the MVAR model is converted to frequency-domain form through the Fourier transform. This gives the transfer matrix H(f) and the cross-spectrum matrix S(f) of the MVAR model, where Σ is the noise covariance matrix, A_k is the M × M coefficient matrix at lag k, and p is the model order.
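The conversion to the frequency domain can be sketched as below, assuming the usual relations H(f) = (I − Σ_k A_k e^{−i2πfk/fs})⁻¹ and S(f) = H(f) Σ H(f)ᴴ. The function name `mvar_spectrum`, the sampling rate of 128 Hz, the 128-bin frequency grid and the toy coefficients are assumptions made for illustration only.

```python
import numpy as np

def mvar_spectrum(A, sigma, freqs, fs):
    """Cross-spectrum S(f) = H(f) sigma H(f)^H of an MVAR model.

    A: (p, m, m) coefficient matrices, sigma: (m, m) noise covariance,
    freqs: frequencies in Hz, fs: sampling rate in Hz.
    Returns an array of shape (len(freqs), m, m).
    """
    p, m, _ = A.shape
    lags = np.arange(1, p + 1)
    S = np.empty((len(freqs), m, m), dtype=complex)
    for idx, f in enumerate(freqs):
        # Fourier transform of the coefficient sequence at frequency f.
        phase = np.exp(-2j * np.pi * f * lags / fs)
        a_f = np.eye(m) - np.tensordot(phase, A, axes=1)
        h = np.linalg.inv(a_f)           # transfer matrix H(f)
        S[idx] = h @ sigma @ h.conj().T  # cross-spectrum matrix S(f)
    return S

# Toy example: 3 channels, order 5, one directed coupling from channel 0 to 1.
fs = 128.0  # assumed sampling rate
freqs = np.linspace(0.0, fs / 2.0, 128, endpoint=False)
A = np.zeros((5, 3, 3))
A[0] = 0.3 * np.eye(3)
A[0, 1, 0] = 0.4
sigma = np.eye(3)
S = mvar_spectrum(A, sigma, freqs, fs)
```

For 19 channels and 128 frequency bins, the same call would return the per-frequency 19 × 19 cross-spectra that the text describes as a 19 × 19 × 128 matrix.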
We use the MVAR model to obtain more refined spectral analysis results, which is conducive to more accurate calculation of the effective connectivity coefficients. The spectral power values were calculated over the whole frequency band. After fitting the MVAR model, we obtained a 19 × 19 × 128 matrix for each subject, which is the input of the PSI algorithm in Eq. (6).

Effective connectivity analysis
The phase slope index is used as the effective connectivity measure in our study. The PSI between two given component signals i and j is defined in terms of the following quantities: F is the set of frequencies of interest, corresponding to the half-bandwidth of the integration across frequencies, C is the normalized coherent spectrum, and δf is the incremental step in the frequency domain. The normalized coherent spectrum C is defined from S_ij(f), the cross-spectrum between i and j, which is the output of Eq. (4).
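A common form of the PSI definition in the literature, written with the symbols introduced above, is given below. It is offered as a sketch of the notation under that assumption and may differ in detail from the authors' equations.

```latex
\[
\Psi_{ij} = \Im\!\left( \sum_{f \in F} C_{ij}^{*}(f)\, C_{ij}(f + \delta f) \right),
\qquad
C_{ij}(f) = \frac{S_{ij}(f)}{\sqrt{S_{ii}(f)\, S_{jj}(f)}}
\]
```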
According to the definition in Eq. (6), the imaginary part of the coherent spectrum is used in this algorithm, because the imaginary part of the coherent spectrum is not changed by instantaneous mixing between the signals [29]. In other words, PSI can avoid erroneous estimates of effective connectivity caused by signal mixing.
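The PSI computation can be sketched in a few lines, assuming the cross-spectra are available on a uniform frequency grid and assuming the convention S_ij(f) = ⟨x_i(f) x_j*(f)⟩, under which a positive value at [i, j] means that i leads j. The function name `phase_slope_index` and the synthetic delayed-copy spectrum are hypothetical.

```python
import numpy as np

def phase_slope_index(S, band):
    """PSI for all channel pairs.

    S: (n_freq, m, m) complex cross-spectra on a uniform frequency grid.
    band: slice or boolean mask selecting the frequency bins of interest.
    Returns an (m, m) antisymmetric matrix.
    """
    power = np.real(np.einsum("fii->fi", S))  # auto-spectra S_ii(f)
    C = S / np.sqrt(power[:, :, None] * power[:, None, :])  # normalized coherent spectrum
    Cb = C[band]
    # Sum conj(C(f)) * C(f + df) over neighbouring bins and keep the imaginary part.
    return np.imag(np.sum(np.conj(Cb[:-1]) * Cb[1:], axis=0))

# Synthetic check: channel 1 is a copy of channel 0 delayed by tau seconds,
# so the cross-spectrum phase grows linearly with frequency.
tau = 0.05
freqs = np.arange(0.5, 4.25, 0.25)  # delta band, 0.5 to 4 Hz in 0.25 Hz steps
S = np.ones((len(freqs), 2, 2), dtype=complex)
S[:, 0, 1] = np.exp(2j * np.pi * freqs * tau)
S[:, 1, 0] = np.conj(S[:, 0, 1])
psi = phase_slope_index(S, slice(None))
```

In this synthetic case `psi[0, 1]` is positive and `psi[1, 0]` is its negative, reflecting that channel 0 drives channel 1.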

Graph theoretical analysis
The two key components of a graph are its nodes and edges. To detect ADHD, we use three graph theory measures: clustering coefficient, nodal efficiency and degree centrality.

Threshold method
Because not every connection needs to be included in the graph analysis, a threshold is applied to the weighted graph. The difficulty with this approach is the selection of the threshold. A threshold that is too high may leave too few edges to construct a brain network, and a threshold that is too low may retain connections that carry no meaningful information [30]. Empirically, a threshold of 0.2 produced the best results.
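The thresholding step can be illustrated as follows. The sketch assumes the connection weights have already been normalized to the range 0 to 1; the function name `apply_threshold` and the 3-node example matrix are hypothetical.

```python
def apply_threshold(weights, threshold=0.2):
    """Keep off-diagonal edges whose weight exceeds the threshold; zero the rest."""
    n = len(weights)
    return [
        [weights[i][j] if i != j and weights[i][j] > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]

# Illustrative 3-node weight matrix (made-up values).
w = [[0.0, 0.5, 0.1],
     [0.5, 0.0, 0.3],
     [0.1, 0.3, 0.0]]
w_thr = apply_threshold(w)
```

Here the edge between nodes 0 and 2 (weight 0.1) is removed, while the other two edges keep their original weights.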

Clustering coefficient
The clustering coefficient assesses the ability of segregation in a graph and is one of the most important measures in research on cognitive problems of the brain [31]. It is computed following [22], where t_i^w is the weighted number of triangles around node i (a triangle is a subgraph with three nodes and three edges), k_i is the degree of node i, and N is the number of nodes, here N = 19. The definitions of t_i^w and k_i are given in Eqs. (8) and (9),
where w_ij is the connection weight between node i and node j. When the value of an edge in the graph is greater than the threshold, the connection is considered to exist and w_ij equals the value of the edge; otherwise w_ij = 0.
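One widely used weighted form of the clustering coefficient can be sketched as below. It assumes an undirected, thresholded weight matrix, the triangle weight t_i^w = ½ Σ_{j,h} (w_ij w_ih w_jh)^{1/3}, the degree k_i as the number of non-zero edges at node i, and the nodal value 2 t_i^w / (k_i (k_i − 1)); whether this matches the authors' Eqs. (8) and (9) exactly is an assumption. The function names are hypothetical.

```python
def clustering_coefficient(w):
    """Weighted clustering coefficient of every node (assumed undirected form)."""
    n = len(w)
    coeffs = []
    for i in range(n):
        # Degree: number of neighbours connected to node i after thresholding.
        k_i = sum(1 for j in range(n) if j != i and w[i][j] > 0)
        # Weighted triangle count around node i (geometric mean of edge weights).
        t_i = 0.5 * sum(
            (w[i][j] * w[i][h] * w[j][h]) ** (1.0 / 3.0)
            for j in range(n) for h in range(n)
            if len({i, j, h}) == 3
        )
        coeffs.append(2.0 * t_i / (k_i * (k_i - 1)) if k_i > 1 else 0.0)
    return coeffs

def average_clustering(w):
    """Mean of the nodal clustering coefficients over all N nodes."""
    c = clustering_coefficient(w)
    return sum(c) / len(c)

# A fully connected triangle and a 3-node path (illustrative graphs).
triangle = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
path = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
c_triangle = average_clustering(triangle)
c_path = average_clustering(path)
```

The triangle gives an average coefficient of 1 and the path gives 0, which matches the intuition that the measure counts closed triangles around each node.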

Nodal efficiency
The nodal efficiency measures the ability of each node to exchange information with the other nodes of the graph [22]. In its definition, N is the number of nodes in the graph, and l_ij is the length of the shortest path between nodes i and j, that is, the path with the minimum number of edges.
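A minimal sketch of this measure, assuming the common form E(i) = 1/(N − 1) Σ_{j≠i} 1/l_ij with l_ij counted in edges and unreachable nodes contributing zero, is shown below. The function names and the example graph are hypothetical.

```python
from collections import deque

def shortest_hops(w, source):
    """Breadth-first search: minimum number of edges from source to each node."""
    n = len(w)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if w[u][v] > 0 and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def nodal_efficiency(w):
    """Average inverse shortest path length from each node to all other nodes."""
    n = len(w)
    result = []
    for i in range(n):
        dist = shortest_hops(w, i)
        inv_sum = sum(1.0 / d for j, d in enumerate(dist)
                      if j != i and d is not None)
        result.append(inv_sum / (n - 1))
    return result

# 3-node path graph 0 - 1 - 2 (illustrative).
path = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
eff = nodal_efficiency(path)
```

The middle node reaches both neighbours in one step and so has efficiency 1, while each end node has (1 + 1/2) / 2 = 0.75.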

Degree Centrality
Degree centrality measures the direct impact of a brain region on the adjacent brain regions [22]. In the degree centrality formula, w_ij is the normalized connection weight, with 0 ≤ w_ij ≤ 1.
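As an illustration, the weighted degree centrality of a node is often taken as the sum of its normalized connection weights. The sketch below assumes that form, without any further normalization by the number of nodes; the function name and the example matrix are hypothetical.

```python
def degree_centrality(w):
    """Sum of the connection weights of each node (assumed weighted-degree form)."""
    n = len(w)
    return [sum(w[i][j] for j in range(n) if j != i) for i in range(n)]

# Thresholded 3-node weight matrix with weights in [0, 1] (made-up values).
w = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.3],
     [0.0, 0.3, 0.0]]
dc = degree_centrality(w)
```

Node 1 is connected to both other nodes and therefore has the largest centrality in this example.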
Thus, each subject has four graphs, one for each EEG frequency band, and three graph theory measures are extracted from each graph to detect ADHD.

Statistical analysis
All data are presented as mean ± SEM. The SEM represents the sampling error between the sample mean and the population mean [32]: the smaller the SEM, the smaller the sampling error. The SEM is computed as the sample standard deviation divided by the square root of the sample size.
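The mean ± SEM summary can be computed as in the short sketch below, which uses the sample standard deviation (n − 1 in the denominator). The function name `mean_and_sem` and the input values are made up for illustration and are not data from this study.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, SEM), where SEM = sample standard deviation / sqrt(n)."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)

# Illustrative values only.
group_mean, group_sem = mean_and_sem([1.0, 2.0, 3.0, 4.0, 5.0])
```

In the study, the same summary would be applied per group and per frequency band to each graph theory measure.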

Clustering Coefficient
The SEM-based statistical analysis of the average clustering coefficient is shown in Table 1. The values of the HC group are clearly greater than those of the ADHD group in each frequency band.

Nodal Efficiency
In the delta band, the nodal efficiency provided the most distinctive features and the least error, as indicated by the difference in mean value between the ADHD and HC groups and by the smallest SEM. In this frequency band, the F3 and Fz electrodes, which represent the left forehead and the forehead midline, show significant differences. Compared with the healthy children, the ADHD children have a very high ability to exchange information in the left forehead region but a poor ability in the forehead midline.

Degree Centrality
In Table 3, the delta band also shows the least error. Compared with the HC group, the ADHD group differs most at F3 and Fz in degree centrality. This means that the left forehead region of ADHD children is more strongly correlated with other brain regions, whereas this ability is poor in the forehead midline.

Discussion
In EEG signal analysis, multi-channel analysis can provide the structure-function relationships between different areas of the brain, which single-channel approaches ignore [33]. Here, we construct the brain network using an effective connectivity technique to describe brain activity across the whole brain network. As can be seen from Fig. 3, the PSI algorithm provides causal information in connectivity analysis. In previous studies, functional connectivity revealed that ADHD patients differ from HC in the forehead region. Here, the PSI provides further evidence on ADHD in terms of the causal relationships among the regions of interest in the whole brain.
The EEG signal is dynamic in nature and provides high temporal resolution. The PSI method (Figs. 2 and 3) combines information across different frequencies. The classic PSI measure is based on the power spectrum in the frequency domain; here we used the MVAR model to estimate the spectrum. The MVAR model overcomes a limitation of the classic PSI method, namely its poor frequency resolution for short datasets. The MVAR-PSI method proposed in our study is therefore more helpful for dynamic analysis of brain activity. We used the MVAR-PSI method to construct the effective connectivity in each frequency band.
Graph theoretical analysis, the most popular mathematical tool for brain network analysis, is used in this study. Through the graph theory analysis, the features of ADHD are described by the clustering coefficient, nodal efficiency and degree centrality of specific brain regions in Tables 1, 2 and 3. Because the MVAR-PSI method provides more information in the graphs, such as causal information and dynamic features, we obtained more findings and a better understanding of ADHD in our research.

Conclusions And Future Work
This research proposed a new method to identify children with ADHD using EEG signals and effective connectivity techniques. The original EEG data are pre-filtered and divided into delta, theta, alpha and beta bands. The effective connectivity graphs are then constructed by applying ICA, the MVAR model and PSI. The graph theory measures of clustering coefficient, nodal efficiency and degree centrality are used to extract features from these graphs. Statistical analysis based on the SEM is employed to evaluate the graph theory measures in each frequency band. The results show a decreased average clustering coefficient in all frequency bands for ADHD subjects. Also, in the delta band, ADHD children have a very high ability to exchange information in the left forehead region but a poor ability in the forehead midline. Moreover, the left forehead region of ADHD children is more strongly correlated with other brain regions, whereas this ability is poor in the forehead midline.
In this research, only 19-channel EEG data were used, which cannot capture finer detail across brain regions. In addition, the visual attention task used for EEG collection is simple; a more complex task might reveal more features that differ between ADHD and HC subjects. Furthermore, activity-dependent fluctuations in connectivity are not considered in this study; we will add a dynamic model to the connectivity analysis in the future.

Figure: PSI effective connectivity matrix in the delta band (0.5-4 Hz) for an HC subject.