Extracting the Invisible: Mesial Temporal Source Detection in Simultaneous EEG and SEEG Recordings

Epileptic source detection relies mainly on visual expert analysis of scalp EEG signals, but it is recognised that epileptic discharges can escape this analysis when the brain sources are deep and induce a very low, even negative, signal-to-noise ratio. In this methodological study, we investigated the feasibility of extracting deep mesial temporal sources that were invisible in scalp EEG signals using blind source separation (BSS) methods (infomax ICA, extended infomax ICA, and JADE) combined with a statistical measure (kurtosis). We estimated the effect of different methodological and physiological parameters that could alter or improve the extraction. Using nine well-defined mesial epileptic networks (1949 spikes) obtained from seven patients with simultaneous EEG–SEEG recordings, the first independent component extracted from the scalp EEG signals was validated, on average, in 46 to 80% of cases depending on the parameters. The three BSS methods performed equally (no significant difference), and the number of scalp electrodes used had no influence. In contrast, the number and amplitude of spikes included in the averaging before the extraction modified the performance. Despite the invisibility of these sources in scalp EEG signals, this study demonstrates that deep source extraction is feasible under certain conditions with common signal analysis toolboxes. This finding confirms the need to continue the signal analysis of scalp EEG recordings, which contain subcortical signals that escape expert visual analysis but can be found by signal processing.


Introduction
A few recent studies using simultaneous multi-scale electroencephalography (EEG) recordings demonstrated that deep brain sources contribute to scalp EEG recordings but are not spontaneously visible to expert visual analysis (Koessler et al. 2015; Ramantani et al. 2016; Pyrzowski et al. 2021; Lee et al. 2021). These brain sources, despite their depth and the mixing of their activity with that of superficial sources, can generate electric field potentials that project onto scalp electrodes with a low signal-to-noise ratio (SNR). Extracting these EEG potentials from the deep mesial temporal sources would be crucial because they could be used as biomarkers of pathological (e.g., epilepsy or Alzheimer disease) or cognitive (e.g., memory, language, and sleep in particular) processes. The lack of awareness of these biomarkers can lead to misinterpretation, wrong medical treatment strategies or the use of invasive recordings in clinical contexts such as drug-resistant epilepsy. Using electric or magnetic source imaging, several authors were able to localize deep cortical and subcortical brain sources (Koessler et al. 2010; Krishnaswamy et al. 2017; Seeber et al. 2019; Rikir et al. 2020). These studies demonstrated that deep source extraction is possible from scalp EEG using high spatial sampling and source localization methods. These methods, thanks to regularization (e.g., Tikhonov, L-curve, generalized cross-validation methods), can estimate and extract the deep sources within noisy data (Grech et al. 2008).

Handling Editor: Christoph Michel.
* Laurent Koessler, laurent.koessler@univ-lorraine.fr
Applied to EEG, the blind source separation (BSS) approach is an interesting method to suppress environmental or physiological EEG artefacts but also to identify and separate brain sources from EEG recordings (Jung et al. 2001; Jutten and Karhunen 2004; Congedo et al. 2008). This automatic and unsupervised (i.e., without visual expertise) method could be used to extract deep brain sources from scalp EEG signals. In 2019, Pizzo and colleagues used BSS, and especially an independent component analysis (ICA) method, to disentangle the activity of focal deep and large superficial brain sources in MEG signals, using stereoelectroencephalographic (SEEG) recordings as the reference method. They noted that wide band-pass filtering (2-60 Hz) and a high number of averaged events are required to increase the visibility of independent components (ICs). Hippocampal and amygdala activations could be found in 6 out of 14 patients, and in some patients (4 out of 14) ICA revealed evidence of a thalamic signal. This promising result relies on an advanced technology (i.e., MEG) and on the use of a high number of sensors (248 magnetometers). Because MEG technology is not available in many clinical or research centres, the BSS approach for deep source detection needs to be investigated with scalp EEG, which is the most common, portable and easy-to-use electrophysiological technique. This investigation is particularly important and informative in a scalp EEG context, where low spatial resolution (10/20 system; Seeck et al. 2017) and resistive volume conduction (especially through the skull; Akhtari et al. 2002) reduce the SNR (unlike MEG, where magnetic fields pass through the head tissues without attenuation).
In this paper, the first aim was to evaluate the efficiency of different BSS methods, combined with a statistical measure (kurtosis), in extracting deep brain sources from scalp EEG recordings using a well-defined simultaneous EEG-SEEG dataset of mesial deep temporal sources (Koessler et al. 2015). The second aim was to investigate the influence of methodological and physiological parameters that could alter or improve the extraction. For the methodological parameters, we evaluated the influence of the IC ranks and the number of scalp EEG electrodes. For the physiological parameters, we evaluated the influence of the amplitude of the brain sources and the number of averaged interictal events. Finally, we identified the causes of non-detection and proposed an improvement of the method using an expert control.

Patients
For this study, we used nine simultaneous EEG-SEEG datasets corresponding to mesial networks from a previous investigation (Koessler et al. 2015) involving seven patients (three females, mean age of 38 years) with temporal lobe epilepsy (TLE). These patients had (i) an epileptogenic zone confined to the temporal lobe and (ii) at least one interictal intracerebral spike (IIS) source confined to the mesial temporal lobe (MTL), as defined by SEEG recordings. All included patients agreed to participate in this study, approved by the Ethical Committee of our institution (CHRU, Nancy), and the database has been declared to the CNIL.

Simultaneous EEG-SEEG Recordings
Simultaneous SEEG and EEG recordings were performed using a 128-channel system with a 512 Hz sampling rate and the scalp FPz electrode as reference. For all patients, the following brain structures were sampled with multi-contact SEEG electrodes: amygdala; anterior and posterior hippocampus; entorhinal cortex; collateral fissure; parahippocampal gyrus; internal and external temporal pole; superior, middle, and inferior temporal gyri; temporo-occipital junction; fusiform gyrus; insula. Further SEEG electrodes were occasionally placed according to the spatial distribution of interictal spikes and the respective electroclinical hypothesis. For scalp EEG recordings, two main scalp regions were sampled: the fronto-centro-parietal region and the lateral and basal temporal regions ipsi- and contralateral to the presumed epileptogenic zone. Sterile scalp electrodes (n = 8 to 25; Table 2) were glued on the patient's head according to a specific sterile procedure and considering the position of the multi-contact SEEG electrodes. Two to three hours of EEG-SEEG recordings during calm wakefulness were selected for interictal spike analysis, avoiding ictal events or preictal changes.

Dataset Pre-processing
All data analyses were performed using MATLAB software (MATLAB 7.0, The MathWorks, Inc.) and, for BSS, the EEGLAB toolbox (Delorme and Makeig 2004). Data pre-processing included three consecutive steps: IIS selection, IIS network classification and extraction of the averaged interictal surface spike (ISS). These steps are described below.

IIS Selection
First, SEEG recordings were visually analysed using a bipolar montage. From a previous study (Koessler et al. 2015), the mesial IIS network was selected by the reproducible occurrence of IIS within the MTL comprising the amygdala, hippocampus, entorhinal cortex, parahippocampal gyrus and collateral sulcus. Within this network, the source corresponding to the earliest IIS with the highest amplitude was considered as the principal source. These IIS were manually marked with a trigger (t0) at the peak of the initial component, then segments of 1 s centred on the local extremum t0 were automatically extracted and the corresponding amplitude and latency computed.
Second, condensed cartographies of IIS amplitude and latency were computed to (i) ascertain that the selected IIS was indeed the earliest event with the highest amplitude within the network, and to (ii) verify that all individual spikes presented an identical intracerebral distribution and therefore belonged to a mesial network (i.e., no co-occurrence of IIS in other parts of the temporal lobe).
Lastly, to check that non-mesial SEEG contacts were not activated at t0, a quantitative validation was performed. Each IIS was compared to background activity, in every SEEG contact, by a statistical test of outlier rejection under the null hypothesis that the amplitude of the peak did not significantly differ from the amplitude of the background activity as measured in the intervals [−500, −250] and [250, 500] ms around t0.

Averaged ISS Extraction
Scalp EEG segments were extracted using the same 1 s epoch centred on t0 as the SEEG segments. Segments with scalp EEG amplitude > 150 µV were considered as artefacts and rejected; they represented < 10% of all segments. The remaining segments were band-pass filtered (1.5-30 Hz) and averaged.
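The rejection and averaging steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function name, the nested-list data layout and the toy values are assumptions made for illustration, and the 1.5-30 Hz band-pass filter is deliberately left out to keep the example dependency-free.

```python
from statistics import fmean

def average_clean_segments(segments, max_abs_uv=150.0):
    """Average epochs after dropping those whose peak |amplitude| exceeds the limit.

    segments: list of epochs; each epoch is a list of channels; each channel
    is a list of samples in microvolts (all epochs share the same shape).
    Returns (averaged_epoch, number_of_epochs_kept).
    """
    kept = [
        seg for seg in segments
        if max(abs(v) for ch in seg for v in ch) <= max_abs_uv
    ]
    if not kept:
        raise ValueError("all segments were rejected")
    n_ch, n_samp = len(kept[0]), len(kept[0][0])
    avg = [
        [fmean(seg[c][t] for seg in kept) for t in range(n_samp)]
        for c in range(n_ch)
    ]
    return avg, len(kept)

# Toy data (hypothetical): 3 epochs, 2 channels, 4 samples.
# The third epoch contains a 400 µV artefact and is rejected.
epochs = [
    [[1.0, 2.0, 3.0, 2.0], [0.0, 1.0, 0.0, -1.0]],
    [[3.0, 4.0, 5.0, 4.0], [2.0, 1.0, 2.0, 1.0]],
    [[0.0, 400.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
]
average, n_kept = average_clean_segments(epochs)
```

Averaging time-locked epochs attenuates activity that is not synchronised to t0, which is why the number of averaged segments is studied as a parameter later in the paper.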

Blind Source Separation and Independent Components Validation
From a signal processing point of view, averaged EEG signals can be considered as a set of mixed signals originating from sources (or components) representing brain activities, artefacts, and measurement noise. In this methodological study, we were interested in the automatic extraction of a particular component, assumed to be the contribution of the mesial temporal source. To do so, we had to, first, separate all the components of this set without specific information (i.e., perform BSS) and, second, select and validate the relevant component. This was performed in five consecutive steps: (1) data whitening, (2) BSS, (3) ICs selection, (4) ICs labelling and (5) ICs validation (Fig. 1).

EEG Datasets
As mentioned previously, we evaluated the influence, on the BSS results, of the number of EEG segments used for averaging and of the contribution of the deep source. Thus, for BSS on a specific IIS network, we ran several trials, increasing the number of EEG segments used for the average in steps of 10 segments. Furthermore, we considered the contribution of the deep sources using the amplitudes at t0 in the triggering SEEG signal. Therefore, to see if a few high-amplitude triggering SEEG signals were sufficient to extract the mesial sources despite their depth, we prepared two EEG datasets for each network: one with IIS sorted by decreasing amplitude at t0 in the triggering SEEG signal and the other unsorted.

Fig. 1 Overview of blind source separation and independent components validation. Selected electrodes correspond to the scalp electrodes that were the most impacted by the ICi (i = 1-3). Finally, the time courses of the ICs were compared with the triggering SEEG signals. ISS interictal surface spike, ICs independent components, BSS blind source separation

Data Whitening and Decorrelation
Because of the insufficient spatial diversity of the EEG sensors, the EEG signals were highly correlated. Thus, the first processing step, intended to guarantee the best source separation, was the whitening of the EEG dataset. We chose zero-phase component analysis (ZCA) whitening (Bell and Sejnowski 1997) so that the whitened signals remain as close as possible (in the least squares sense) to the original observations (Kessy et al. 2018).
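For reference, ZCA whitening can be written compactly as follows. The notation is introduced here for illustration and is not taken from the original text: x denotes the zero-mean vector of scalp EEG observations, C its covariance matrix with eigendecomposition E D Eᵀ, W the whitening matrix and z the whitened signals.

```latex
C = \mathbb{E}\!\left[x x^{\top}\right] = E D E^{\top}, \qquad
W_{\mathrm{ZCA}} = C^{-1/2} = E D^{-1/2} E^{\top}, \qquad
z = W_{\mathrm{ZCA}}\, x, \qquad
\mathbb{E}\!\left[z z^{\top}\right] = W_{\mathrm{ZCA}}\, C\, W_{\mathrm{ZCA}}^{\top} = I .
```

Any matrix W satisfying WᵀW = C⁻¹ whitens the data; the symmetric choice above is the one that minimises the expected squared distance between x and z, which is the least-squares property invoked in the text (Kessy et al. 2018).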

BSS Method Choice
We chose to use and compare three well-known and proven BSS methods implemented in many user-friendly software packages such as EEGLAB (Delorme and Makeig 2004): infomax ICA, extended infomax ICA and JADE. With respect to the estimated sources, the first two methods minimise the mutual information of these sources and the third one maximises their non-Gaussianity. Infomax ICA is based on the information-maximization approach proposed by Bell and Sejnowski (1995) with the stochastic gradient learning rule of Amari et al. (1995). This method is effective in separating sources that have super-Gaussian distributions (sharply peaked probability density functions with heavy tails). Extended infomax ICA (Lee et al. 1999) is an extension of the preceding method to the separation of mixtures of super-Gaussian and sub-Gaussian sources. It uses a learning rule with a nonlinearity that accounts for the two kinds of distribution. JADE (joint approximate diagonalization of eigen-matrices; Cardoso and Souloumiac 1993) exploits the fourth-order cumulants of the source estimates, which are a measure of non-Gaussianity. JADE seeks an orthogonal rotation of the observed mixed vectors to estimate source vectors with maximum non-Gaussianity.

Independent Component Selection
A scalp EEG spike is a sharply contoured waveform with a duration of 20-70 ms (Fisch 1999), and its probability density function differs from a Gaussian distribution. Therefore, to select the most relevant ICs representing the brain sources and thus likely to contain spikes, we chose the kurtosis value of each IC as an indicator. Indeed, kurtosis can be used to assess the non-Gaussianity of a random variable; it is thus applied to design contrast functions in BSS (Hyvärinen and Oja 1997) or to detect artefacts in EEG data (Delorme et al. 2007). In our case, we assumed that the averaged EEG signals were, as far as possible, artefact free, and we only wanted to detect averaged ISS. We therefore assumed that an IC containing an averaged ISS was super-Gaussian, with a significantly positive kurtosis value (Fig. 2). Because the effect of brain sources could be spread over several ICs, we determined the relevant number of ICs to analyse by considering the low spatial resolution of scalp electrodes (8 to 25 electrodes for the seven patients) and some preliminary investigations, which led us to select the three ICs with maximum kurtosis value, hereafter called IC1, IC2 and IC3 in decreasing order of kurtosis value.
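The kurtosis-based ranking can be sketched in a few lines of Python. This is an illustrative sketch only: the paper does not state which kurtosis estimator was used, so the moment-based excess kurtosis below, the function names and the toy components are assumptions.

```python
from statistics import fmean

def excess_kurtosis(x):
    """Fourth standardised moment minus 3 (close to 0 for Gaussian data)."""
    m = fmean(x)
    m2 = fmean((v - m) ** 2 for v in x)
    m4 = fmean((v - m) ** 4 for v in x)
    return m4 / (m2 ** 2) - 3.0

def top_kurtosis_components(components, n_keep=3):
    """Return (indices of the n_keep components with largest kurtosis, all scores)."""
    scores = [excess_kurtosis(c) for c in components]
    order = sorted(range(len(components)), key=lambda i: scores[i], reverse=True)
    return order[:n_keep], scores

# Toy components (hypothetical), 100 samples each.
components = [
    [1.0, -1.0] * 50,                                  # square wave: sub-Gaussian
    [0.0] * 50 + [10.0] + [0.0] * 49,                  # one sharp transient
    [0.0] * 25 + [5.0] + [0.0] * 48 + [-5.0] + [0.0] * 25,  # two transients
    [float(i) for i in range(100)],                    # ramp: sub-Gaussian
]
top, scores = top_kurtosis_components(components)
```

In this toy example the two spike-like components receive the largest kurtosis values and would be labelled IC1 and IC2, while the square wave, with its flat two-valued distribution, is ranked last.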

ICs Labelling: Electrode Selection
For each of the selected ICs, the associated 2-D scalp map projection was obtained using the corresponding column of the inverse of the unmixing matrix resulting from the BSS process. For that column, the weight with the maximum absolute value corresponded to the scalp electrode that was the most impacted by the source. This electrode was defined as the selected scalp electrode for the considered IC, and its name was used as a label to compare ICs.
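As a sketch of this labelling rule, the following Python function takes the mixing matrix (that is, the inverse of the unmixing matrix, assumed here to be already computed) and returns, for each selected IC, the channel with the largest absolute weight. The function name, the channel names and the matrix values are hypothetical.

```python
def label_components(mixing, channel_names, ic_indices):
    """Label each IC by the channel with the largest |weight| in its scalp map.

    mixing: channels x components matrix (inverse of the unmixing matrix),
    given as a list of rows. Column j is the scalp map of component j.
    """
    labels = {}
    for j in ic_indices:
        column = [row[j] for row in mixing]
        k = max(range(len(column)), key=lambda i: abs(column[i]))
        labels[j] = channel_names[k]
    return labels

# Toy example (hypothetical values): 3 channels, 2 components.
channel_names = ["FT9", "T7", "Cz"]
mixing = [
    [0.2, -1.5],
    [0.9, 0.3],
    [-0.1, 0.4],
]
labels = label_components(mixing, channel_names, ic_indices=[0, 1])
```

The absolute value matters because the sign of an independent component is arbitrary: a strongly negative weight indicates as large a contribution as a strongly positive one.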

IC Waveform Characterisation
The IC waveform was identified as a transient event, with one or two main peaks, distinguishable from the background activity. The earliest extremum of the IC beyond two standard deviations was taken as the extremum of the initial peak and the position of this extremum was used as a time reference, in particular to calculate the latency with respect to t0 (Fig. 3). If there were two extrema (one maximum and one minimum), beyond two standard deviations, with a difference of more than 80 ms between them, they were considered not to be part of the same event and therefore the one with the lower amplitude was defined as an outlier and the one with the higher amplitude was defined as the correct peak.
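One possible reading of this peak-picking rule is sketched below in Python. Several details are assumptions, because the text does not specify them: the threshold is taken as the mean ± 2 standard deviations computed over the whole epoch, only the global maximum and global minimum are considered as candidate extrema, and the function name is hypothetical.

```python
from statistics import fmean, pstdev

def initial_peak_index(ic, fs=512.0, n_sd=2.0, max_gap_ms=80.0):
    """Return the sample index of the IC's initial peak, or None if no
    extremum exceeds mean +/- n_sd standard deviations."""
    mean, sd = fmean(ic), pstdev(ic)
    i_max = max(range(len(ic)), key=lambda i: ic[i])
    i_min = min(range(len(ic)), key=lambda i: ic[i])
    candidates = [i for i in (i_max, i_min) if abs(ic[i] - mean) > n_sd * sd]
    if not candidates:
        return None
    if len(candidates) == 2:
        gap_ms = abs(i_max - i_min) / fs * 1000.0
        if gap_ms > max_gap_ms:
            # Too far apart to belong to one event: keep the larger deflection.
            return max(candidates, key=lambda i: abs(ic[i] - mean))
    # Otherwise the earliest supra-threshold extremum is the initial peak.
    return min(candidates)

# Toy IC (hypothetical): 1 s at 512 Hz, a negative deflection shortly
# before a larger positive one, both within 80 ms of each other.
ic = [0.0] * 512
ic[250], ic[256] = -4.0, 10.0
peak = initial_peak_index(ic)
```

Here the two extrema are about 12 ms apart, so they are treated as one biphasic event and the earlier one (sample 250) is returned as the initial peak.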

ICs Validation
This last step concerned the validation that the brain source contribution, occurring in the vicinity of the triggering SEEG contacts, was noticeable on an IC. Four criteria were used for this validation. First, as brain propagation is assumed to be instantaneous, if the latency between the peak position and t0 was greater than 19.5 ms, i.e., 10 samples, this latency was not validated, and the IC was discarded. We also verified, in the same way as for IIS selection, that the amplitude of the IC at t0 was significantly different from the amplitude of the background activity using Walsh's test (for details see Koessler et al. 2015). Moreover, the morphologies and
For easier visualisation of the final results, a bar chart representing EEG channels according to the number of selections for ICs was used for each ICi (with i = 1,…,3).
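The latency criterion can be made concrete with a short Python sketch. At the 512 Hz sampling rate reported earlier, 10 samples correspond to 10/512 s, about 19.5 ms. Whether an offset of exactly 10 samples is accepted is not stated in the text; the sketch assumes that it is, and the function name is hypothetical.

```python
def latency_is_valid(peak_index, t0_index, fs=512.0, max_samples=10):
    """Return (is_valid, latency_ms) for an IC peak relative to the trigger t0.

    Assumption: offsets of up to max_samples (inclusive) are accepted.
    """
    offset = peak_index - t0_index
    latency_ms = offset / fs * 1000.0
    return abs(offset) <= max_samples, latency_ms

# t0 at the centre of a 1 s epoch sampled at 512 Hz (index 256).
ok_near, lat_near = latency_is_valid(266, 256)   # 10 samples after t0
ok_far, lat_far = latency_is_valid(300, 256)     # 44 samples after t0
```

Working in samples rather than in milliseconds avoids rounding ambiguities around the 19.5 ms boundary.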

Influence of the BSS Method
First, we investigated if the choice of the BSS method had an influence on the quality of results, expressed as the validation percentage of ICs for all trials of the nine networks. Thus, the mean percentages obtained with the three methods were compared with each other, for each of the three ICs of unsorted datasets (nine tests with a sample size equal to the number of networks). The H0 hypothesis was that the medians of samples were equal and the H1 hypothesis that they were not. So, we used a two-tailed Wilcoxon rank sum test with adequate Bonferroni correction (significance level p = 0.05/9).

Relevance of the ICs Rank
We also investigated the influence of the kurtosis rank (1st, 2nd or 3rd) on the quality of results. So, we compared the mean validation percentages of IC1 with that of IC2 and IC3, for the three BSS methods and for unsorted datasets (two comparisons with three methods i.e., six tests). The H1 hypothesis was that the median of the IC1 was greater than the median of ICi (i = 2, 3) and the H0 hypothesis was that the median of the IC1 was not greater than the median of ICi. A one-tailed Wilcoxon rank sum test with adequate Bonferroni correction (p = 0.05/3) was used.

Influence of the Number of EEG Electrodes
Another point to investigate was the influence of the number of scalp EEG electrodes on the validation percentage of ICs. For each ICi (i = 1…3), we formed the vector of the mean percentages of validated ICi for the three methods together and the corresponding vector of electrode numbers (3 methods with 9 networks, i.e., a vector dimension of 27). To assess the correlation between these two variables, we computed the Pearson correlation coefficient on the ranks of these vectors. For some networks with the same or almost the same number of electrodes, an additional approach was to evaluate the dispersion of the validation percentage of each BSS method, reflected by the corresponding standard deviation.
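A Pearson coefficient computed on ranks is Spearman's rank correlation. The following Python sketch shows one way to compute it with tied values sharing their mean rank, which matters here because several networks have the same number of electrodes. The function names and the toy vectors are hypothetical, and the p-value computation is not included.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rank_correlation(x, y):
    """Pearson coefficient computed on the ranks (Spearman's rho)."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy vectors (hypothetical): electrode counts and validation percentages.
n_electrodes = [8, 12, 12, 14, 25]
validated_pct = [10.0, 40.0, 35.0, 60.0, 90.0]
rho = rank_correlation(n_electrodes, validated_pct)
```

Using ranks makes the coefficient insensitive to the exact spacing of the electrode counts, so a single network with many more electrodes than the others does not dominate the result.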

Influence of the Deep Source Strength
The effect of the amplitude of the deep source was evaluated by comparing, for the three BSS methods and the nine networks together, the total number of validated ICs of sorted and unsorted data for the first five datasets obtained by increasing the number of EEG segments in 10-segment increments. The H1 hypothesis was that the median of the number of validated ICs for sorted data was greater than the median of the number of validated ICs for unsorted data and the H0 hypothesis was that the median for sorted data was not greater than the median for unsorted data. The comparison between the resulting 5-item samples was performed with a one-tailed Wilcoxon rank sum test. As additional information, the mean values of the triggering SEEG signal for the first trial and the fifth trial were also collected, for both sorted and unsorted data.

Evolution of the Validation Percentage of ICs According to the Number of Averaged EEG Segments
Furthermore, we evaluated, for all trials and for the three BSS methods, the minimum number of EEG segments from which the different ICs can be validated. Next, we analysed the graphs of the evolution of the validation percentage of ICs according to the number of segments to find out if there were some specific patterns. Finally, we identified the cases of simultaneous validation of two or three ICs for the same trial.

Causes of Non-validation of ICs and Associated Indicators
We tried to identify and quantify the different causes of non-validation of the first three ICs using relevant indicators. These causes could be: (i) a first set of causes, namely the presence of an artefact on EEG signals that induces an abnormal pattern on ICs that was mistaken for a spike, an inaccurate estimation of t0 for some segments, or a time difference between the deep source and the corresponding SEEG triggering signal, which increases latency with respect to t0; (ii) a noise level too high to detect the correct peak at t0; (iii) the presence of a second deep source or artefact that leads to an incorrect cartography. An excessive latency value may indicate the presence of one of the first set of causes or a high level of noise. If the latency was correct, non-detection using Walsh's test may indicate a slight peak shift due to inaccurate estimation of t0 or a time difference between the deep source and the triggering signal. Finally, if the peak amplitude was validated, an incorrect cartography could suggest the presence of another source or artefact. We then listed, for all the non-validations, the latency, the result of the amplitude Walsh's test and of the cartography validation to build a table showing the percentage of non-detection of ICs according to the value of these three indicators. In this table, the latency evaluation was split up into four cases: latency greater than 250 ms, latency between 250 and 100 ms, latency between 100 and 50 ms and latency between 50 and 20 ms.
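The sequential reading of the three indicators (latency first, then amplitude, then cartography) can be sketched as a small Python classifier. The function name, the label strings, the treatment of bin edges and the toy records are assumptions made for illustration.

```python
from collections import Counter

def non_validation_cause(latency_ms, amplitude_ok, map_ok, max_latency_ms=19.5):
    """Return a label for why an IC failed validation, or None if it passed.

    Indicators are checked in the order described in the text:
    latency, then Walsh's amplitude test, then cartography.
    """
    lat = abs(latency_ms)
    if lat > max_latency_ms:
        if lat > 250:
            return "latency > 250 ms"
        if lat > 100:
            return "latency 100-250 ms"
        if lat > 50:
            return "latency 50-100 ms"
        return "latency 20-50 ms"
    if not amplitude_ok:
        return "amplitude (Walsh's test)"
    if not map_ok:
        return "cartography"
    return None

# Toy records (hypothetical): (latency in ms, amplitude test passed, map valid).
records = [(300.0, True, True), (30.0, True, True), (5.0, False, True),
           (5.0, True, False), (-2.0, True, True), (-120.0, False, False)]
tally = Counter(non_validation_cause(*r) for r in records)
```

Dividing each count in `tally` by the number of non-validated ICs would give percentages of the kind reported per indicator in the results.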

Patient Networks Characterisation
For all seven patients, nine mesial (M) networks were validated (Table 1). These networks included sources localized in the anterior hippocampus, six included sources localized in the middle and/or posterior hippocampus, seven in the amygdala, and five in the para-hippocampal gyrus (Koessler et al. 2015). For these networks, a total of 1949 IIS were selected. The mean IIS amplitude was 729 ± 279 µV.
According to this dataset, the corresponding number of trials was equal to 195 for the nine networks with a mean number of trials per network equal to 21.7 (min: 5, max: 44).
Original data of this study, including anatomical localizations of the SEEG electrodes and the SEEG signals of all IIS networks, are available at Mendeley Data at: https://data.mendeley.com/datasets/8wz3wvm9y5/2.

Overview of the Results for All Trials Together
For all nine networks (195 trials) and the three BSS methods together, that is for a total of 585 results, the percentages of validated IC1, IC2 and IC3 were 53%, 8% and 3% for unsorted data respectively.
Finally, for all nine networks and the three BSS methods (27 results), the percentages of cases without any validated IC for IC1, IC2 and IC3 were 26%, 33% and 63% respectively for unsorted data.
Considering now each network separately, for the 3 ICs with the 3 BSS methods and for unsorted data (9 combinations), there is only 1 network (i.e., #9) without any validated IC. For the other networks, the percentage of results without validated IC varied from 11 to 55% (Table 2).
The analysis of the bar charts representing EEG channels according to the number of selections for ICs showed that, for IC1, the patterns differed between networks with high and low validation percentages (Fig. 5). For bar charts with a high validation percentage, the selected electrode was often the same, producing a large bar for this electrode and a few smaller ones for the other electrodes (Fig. 5a, networks 1 to 5 and 7), whereas, for bar charts with a low validation percentage, the range of bar amplitudes was lower and, often, the number of bars was higher (Fig. 5a, networks 6, 8 and 9). In contrast, for IC2, there was no clear pattern to differentiate between bar charts (Fig. 5b), and the same was true for IC3.

Influence of the BSS Method
The mean p-value for the nine comparison tests between the BSS methods was equal to 0.83 ± 0.20 with a minimum p-value equal to 0.44. Consequently, the H0 hypothesis (equality of medians) could not be rejected for these nine tests, and the BSS method had no significant influence on the extraction of deep mesial sources.

Relevance of the ICs Rank
For infomax ICA, extended infomax ICA and JADE methods, the p-values for the comparison tests between the mean validation percentages of IC1 and IC2 were 0.0067, 0.0986, 0.0126 respectively. In the same way, the p-values for the comparison tests between the mean validation percentages of IC1 and IC3 were 0.003, 0.026, 0.005 respectively. Consequently, the H1 hypothesis (median of IC1 > median of ICi for i = 2, 3) could be accepted for infomax ICA and JADE methods.

Influence of the Number of EEG Electrodes
For IC1, IC2 and IC3, the Pearson correlation coefficients between the vector of mean percentage of validated IC for the three methods and the corresponding vector of electrode numbers were −0.36, 0.05 and −0.04 respectively. For all these coefficients, the corresponding p-value was greater than 0.05, indicating that the H0 hypothesis (correlation is zero) could not be rejected. Networks 4 to 8 had a number of electrodes between 12 and 14 with 9 common electrodes, so we computed the corresponding standard deviation of the validation percentage of each BSS method. For infomax ICA, extended infomax ICA and JADE methods and for IC1, the standard deviations were 29%, 34% and 30% respectively; for IC2, standard deviations were 11%, 3% and 13% respectively and for IC3, 8%, 3% and 6% respectively. Figure 6 shows the validation percentage values obtained for the JADE method alone.

Influence of the Deep Source Strength
For trials one to five, the number of validated ICs was 10, 13, 13, 15 and 15 (total = 66) for sorted data respectively and 1, 6, 7, 10 and 13 (total = 37) for unsorted data respectively. For these first five datasets, the p-value of the comparison test between the number of validated ICs for sorted and unsorted data was 0.03. Consequently, the H1 hypothesis (median of validated ICs for sorted data > median of validated ICs for unsorted data) could be accepted. As an illustration, Fig. 7, networks 2 to 6, showed that, for the first five trials, the validation percentage of ICs was better for sorted data than for unsorted data. Next, for the first trial and for sorted data, five networks had one validated IC for at least one method and, for these networks (2, 3, 4, 5 and 9), the means of the triggering SEEG signal were 698 µV, 2022 µV, 1574 µV, 920 µV, and 757 µV respectively. For the first trial and for unsorted data, network 2 had one validated IC and, for this network, the mean of the triggering SEEG signal was 359 µV. Finally, for the fifth trial, the means of the triggering SEEG signals for all networks were 1085 and 652 µV for sorted and unsorted data respectively.
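The comparison of the two 5-item samples can be reproduced with an exact permutation version of the one-tailed rank sum test, sketched below in Python. This is an assumption about the computation: the paper does not state whether an exact or approximate p-value was used, and the function names are hypothetical. Tied counts share their mean rank.

```python
from itertools import combinations

def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def rank_sum_greater(x, y):
    """One-tailed exact permutation test for H1: x tends to exceed y.

    Returns (rank sum of x, p-value), where the p-value is the fraction of
    all ways of splitting the pooled ranks that give x a rank sum at least
    as large as the observed one.
    """
    ranks = average_ranks(list(x) + list(y))
    observed = sum(ranks[:len(x)])
    sums = [sum(c) for c in combinations(ranks, len(x))]
    p = sum(s >= observed for s in sums) / len(sums)
    return observed, p

sorted_counts = [10, 13, 13, 15, 15]   # validated ICs, sorted data, trials 1-5
unsorted_counts = [1, 6, 7, 10, 13]    # validated ICs, unsorted data, trials 1-5
w, p = rank_sum_greater(sorted_counts, unsorted_counts)
```

With these counts, 7 of the 252 possible splits reach the observed rank sum of 37.5, giving p ≈ 0.028, in line with the reported value of 0.03.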

Evolution of the Validation Percentage of ICs according to the Number of Averaged EEG Segments
The mean values of the minimum number of segments when IC1, IC2 and IC3 were validated for the first time were 70 ± 77, 71 ± 63 and 67 ± 50 for sorted data respectively and 62 ± 41, 80 ± 56 and 129 ± 58 for unsorted data respectively. The evolution curves of the validation percentage of ICs as a function of the number of EEG segments can be classified into four main categories: (1) curves for which, after a minimum number of segments, the validation percentage increases almost all the time, (2) curves with alternating increases and decreases, (3) curves with one-time increase, (4) curves with a validation percentage always equal to zero (Fig. 7). The first category concerned exclusively some IC1s and indicated that, for the corresponding networks, the deep brain source was detectable in almost all cases after a minimum number of segments (Fig. 7, networks 1 to 4 and 7). The second category involved IC1s and IC2s whose evolutions appeared to be coupled (one IC increases and the other decreases) as if the brain source contribution was identified by one IC and then by another ("switching effect") (Fig. 7a, network 5 and Fig. 7b, network 6). Notice that this change can also be observed at the beginning of some curves of the previous category (Fig. 7, networks 1 and 3). The third category concerned IC2s (Fig. 7, network 4, Fig. 7b, networks 1, 2 and 7), mainly IC3s (Fig. 7, networks 3 and 8, Fig. 7a, networks 4 and 6) and rarely IC1 (Fig. 7a, network 8). The last category concerned IC1 (Fig. 7, network 9), IC2 (Fig. 7, networks 2 and 9, Fig. 7b, networks 5 and 8) and mainly IC3 (Fig. 7, networks 2, 5, 7 and 9, Fig. 7a, network 1, Fig. 7b, network 4). For the total number of trials [1170 i.e., 195 trials with 3 BSS methods and 2 datasets (sorted and unsorted)] obtained for the three BSS methods and both sorted and unsorted data, the percentage of trials with two ICs validated at the same time was 1.5%.
These trials concerned networks 1, 2, 4 and 8 with concordant selected electrodes. Note that this result was consistent with the "switching effect" described above.

Causes of Non-validation of ICs and Associated Indicators
For the three ICs with the three BSS methods and unsorted data, the percentages of non-validation associated with Walsh's test and cartography validation were very stable, with means of about 7 ± 2% and 5 ± 2% respectively. By contrast, the percentages of non-validation associated with the four categories of latency defined above were higher for IC2 (77%) and IC3 (87%) than for IC1 (42%) (Table 3). For IC1 specifically, the cases of incorrect latency differed between networks. Thus, for networks 1 to 4, the mean of incorrect latencies was equal to 15%; they were mostly higher than 250 ms and appeared either when the number of averaged EEG segments was low, suggesting that source separation was not yet conclusive, or sporadically at any time, due to EEG artefacts. Next, for networks 5 and 7, the mean of incorrect latencies was equal to 37%; they were mainly associated with a low number of averaged EEG segments, and all categories of latencies were concerned, even if these latencies were often between 20 and 50 ms, as if the IC waveform was trying to synchronize to t0 without success. Lastly, for networks 6, 8 and 9, the mean of incorrect latencies was equal to 82% and all the categories of incorrect latencies were concerned.

The ICi (i = 1-3) correspond to the three ICs with maximum kurtosis value sorted by decreasing order of kurtosis value

Improvement of the Extraction: BSS Analysis with Expert Control
Based on the causes of non-validation, two improvements could be suggested to increase the validation percentage of ICs. First, the use of a minimum number of averaged EEG segments, greater than the minimum observed mean values (e.g., 100 segments), could improve the extraction of the ICs. Second, a peak latency of IC1 that is too long (e.g., greater than 250 ms in absolute value) clearly indicates the presence of artefacts or noise in the scalp EEG signals. Replacing this IC1 by the corresponding IC2, and this IC2 by the corresponding IC3, could therefore also improve the extraction. As a consequence of this substitution, the IC3 data were incomplete and IC3 could no longer be used for the various comparisons. This is not an important limitation because, in this study, the validation percentage of IC3 was very low. For all initial trials with the three methods, when this solution was applied, the substitution involved 79 IC1 (14%) and the proportion of these IC1 that were replaced by an IC2 with a correct latency was 41%.
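The two improvements can be summarised as a simple decision rule, sketched below in Python. The function name, the dictionary layout and the inclusive handling of the 100-segment minimum are assumptions; in the study, this control was applied under visual expertise rather than fully automatically.

```python
def apply_expert_control(trial, min_segments=100, max_abs_latency_ms=250.0):
    """Decide which components serve as IC1 and IC2 for one trial.

    trial: dict with 'n_segments' (number of averaged EEG segments) and
    'latencies_ms' (peak latencies of [IC1, IC2, IC3] relative to t0).
    Returns None if the trial has too few segments, otherwise the original
    rank labels of the components used as the new IC1 and IC2.
    """
    if trial["n_segments"] < min_segments:
        return None
    if abs(trial["latencies_ms"][0]) > max_abs_latency_ms:
        # IC1 is likely driven by artefact or noise: shift the ranks by one.
        return ["IC2", "IC3"]
    return ["IC1", "IC2"]

# Toy trials (hypothetical values).
shifted = apply_expert_control({"n_segments": 120, "latencies_ms": [310.0, 4.0, -60.0]})
kept = apply_expert_control({"n_segments": 120, "latencies_ms": [-8.0, 4.0, 0.0]})
skipped = apply_expert_control({"n_segments": 40, "latencies_ms": [2.0, 90.0, 5.0]})
```

After the shift, no component remains to take the place of IC3, which is why IC3 is dropped from the subsequent comparisons.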
With these two improvements under visual expertise, two networks with insufficient number of segments (networks 5 and 9) were discarded. The total number of trials was then equal to 118. For all seven networks, the mean percentage of validated IC1 and IC2 for the three methods together were 70 ± 35% and 8 ± 6% for unsorted data respectively. More specifically, the overall mean percentage of validated IC1 and IC2 were 69 ± 34% and 14 ± 18% respectively for infomax ICA method, 60 ± 45% and 6 ± 8% respectively for extended infomax ICA and 80 ± 36% and 4 ± 5% respectively for JADE. Then, the sums of overall mean percentage of validated ICs for infomax ICA, extended infomax ICA and JADE methods were 83%, 66% and 84% respectively (Fig. 8).
BSS analysis under visual expertise resulted in validated IC1 associated with a relevant selected electrode for all networks with all methods (7 networks × 3 methods: 21 cases), except for network 6 (validation percentage equal to zero for extended infomax ICA), network 7 (validation percentage equal to 46% for extended infomax ICA) and network 8 (validation percentage equal to zero for all methods) (Figs. 9, 10).
As before, the six comparison tests of BSS methods (three methods compared with each other, for two ICs and for unsorted data) applied to the results of these improvements concluded that the H0 hypothesis (equality of medians) could not be rejected and thus the BSS methods remained equivalent (mean p-value equal to 0.57 ± 0.26 with a minimum value equal to 0.318). Conversely, the three comparison tests between mean validation percentage of IC1 and IC2 concluded that the H1 hypothesis (median of IC1 > median of IC2) could be accepted for the infomax ICA and JADE methods (for infomax ICA, extended infomax ICA and JADE methods, the p-values were 0.006, 0.029 and 0.004 respectively). Next, for the correlation between mean percentage of validated ICs and number of electrodes, the H0 hypothesis (correlation is zero) could not be rejected (for IC1 and IC2, the p-value was greater than 0.05 and the Pearson correlation coefficients were −0.13 and 0.16 respectively). Lastly, for networks 4 and 6 to 8, with a number of electrodes between 12 and 14, the standard deviations of the validation percentage for the infomax ICA, extended infomax ICA and JADE methods were 41%, 42% and 48% respectively for IC1 and 16%, 8% and 4% respectively for IC2.

Fig. 5 Classification, for the JADE method with all trials and the unsorted data, of EEG channels according to the number of selections for ICs. a IC1. b IC2. The black bar and the grey bar correspond to validated and non-validated ICs respectively. P patient, N network

Fig. 6 Validation percentage of ICs, for the nine networks, according to the number of surface electrodes using the JADE method with all unsorted trials

Influence of the BSS Method
The three methods used in this paper, all implemented in EEGLAB, belong to two categories depending on the adopted criterion: infomax ICA and extended infomax ICA use an information maximization approach based on entropy, whereas JADE maximizes the non-Gaussianity of the estimated sources by exploiting fourth-order cumulants. Although these methods differ in their principles, their aim is to estimate components that are independent (as non-Gaussian as possible), i.e., ICs that differ from the Gaussian background noise. In our case, this difference of categories did not change the performance of brain source extraction. Indeed, for the pairwise comparison tests of the BSS methods, the H0 hypothesis (equality of medians) was not rejected. Moreover, the particularly high p-values and the visual analysis of the validation percentage of ICs according to the number of segments did not show a clear advantage of one method over the others. Hence, we concluded that, with our datasets, the BSS methods were equivalent in terms of performance.

ICs Selection
In this study, we chose to select and analyse the three ICs with maximum kurtosis value. The mean validation percentage of ICs, for all networks and all patients, suggested that the level of validation of the ICs was consistent with their rank (Figs. 4, 8). Moreover, this trend was statistically validated for IC1, with the infomax ICA and JADE methods. It appeared that IC1 is a relevant indicator that could be used to point out the presence of a mesial source. Indeed, for BSS analysis with expert control, the mean percentage of validated IC1 was equal to 70% and only one network out of seven was never detected, whatever the number of trials used.
With a mean validation percentage equal to 8%, the role of IC2 appeared more limited. Nevertheless, according to the analysis of the bar charts, IC2 could extract mesial sources that were sometimes missed in IC1. This hypothesis is supported by the low number of situations where IC1 and IC2 were simultaneously validated. Lastly, IC3 was disregarded for analysis and only used to replace some IC2 for BSS analysis with expert control.

Compared to MEG, in scalp EEG recordings and especially in the 10/20 system configuration, the IC selection is performed with a low number of sensors. In the MEG-based study of Pizzo et al. (2019), the authors had to keep only 20 ICs from 248 magnetometers in order to extract deep mesial sources from MEG signals. So, the IC selection of deep mesial sources in MEG is trickier than in scalp EEG. Conversely, MEG presents advantages thanks to very precise scalp topographies (high number of sensors) and no amplitude attenuation of magnetic signals from deep sources to scalp (no attenuation by the skull).
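The selection rule used in this study (keeping the three ICs with the largest kurtosis, sorted by decreasing kurtosis) can be illustrated with a short plain-Python sketch. The function names are hypothetical, and the excess-kurtosis convention (a Gaussian signal scores 0) is an assumption; the ranking is identical with the non-excess convention because the two differ only by a constant.

```python
from typing import List, Sequence


def excess_kurtosis(x: Sequence[float]) -> float:
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a Gaussian)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return m4 / (m2 ** 2) - 3.0


def top_k_by_kurtosis(components: Sequence[Sequence[float]],
                      k: int = 3) -> List[int]:
    """Indices of the k components with the largest kurtosis, in
    decreasing order of kurtosis (index 0 of the result plays IC1)."""
    scores = [excess_kurtosis(c) for c in components]
    order = sorted(range(len(components)),
                   key=lambda i: scores[i], reverse=True)
    return order[:k]
```

A component dominated by a single sharp peak on a flat baseline (a peaky, super-Gaussian waveform) ranks first, ahead of oscillatory or ramp-like components whose kurtosis is negative.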

Influence of the Number of EEG Electrodes
Because BSS methods were used, the relationship between the number of scalp electrodes and the percentage of validation was investigated with the assumption that the more scalp electrodes are used, the higher the performance should be. First, the Pearson correlation coefficients between mean percentages of validated ICs and number of electrodes were low and, for all coefficients, the H0 hypothesis (correlation is zero) was not rejected. Next, for the networks with almost the same number of electrodes, the standard deviation of the validation percentage was high, particularly for IC1. Consequently, the validation percentage did not appear to be correlated with the number of surface electrodes alone. Another important aspect could be related to the spatial sampling of the EEG electrodes and more precisely the sampling of the scalp temporal regions, i.e., the number and the position of scalp electrodes in these regions in our cohort. Our results could then be more related to the specific scalp EEG topography of the mesial temporal source contribution. Indeed, in Koessler et al. (2015), we demonstrated using hierarchical clustering of scalp EEG topographies that mesial temporal sources had a specific scalp EEG topography with an electric field projection onto the temporo-basal scalp EEG electrodes.
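As an illustration of the correlation analysis described above, the following sketch computes a Pearson coefficient and the associated t statistic in plain Python. The function name and the input values are hypothetical placeholders, not data from this study; the p-value for H0 (correlation is zero) would then be read from a Student distribution with n − 2 degrees of freedom.

```python
import math
from typing import Sequence, Tuple


def pearson_r_and_t(x: Sequence[float],
                    y: Sequence[float]) -> Tuple[float, float]:
    """Return Pearson's r and the t statistic for H0: correlation is zero."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # t follows a Student distribution with n - 2 degrees of freedom
    # under H0; it is undefined when |r| == 1.
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, t


# Placeholder values only (NOT the study's data).
x_placeholder = [1.0, 2.0, 3.0, 4.0, 5.0]
y_placeholder = [2.0, 1.0, 4.0, 3.0, 5.0]
r, t = pearson_r_and_t(x_placeholder, y_placeholder)
```

With only a handful of networks, even a moderate coefficient yields a small t statistic, so failing to reject H0 does not by itself establish the absence of a relationship.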
By comparison with other studies that used electromagnetic source imaging with a high number of sensors (n > 64), our study was limited by the low number of electrodes. This low number was due to the surgical constraints (particularly the asepsis rules) of the SEEG investigation. The performance of deep source extraction from scalp EEG with few scalp electrodes needs to be confirmed with more electrodes in a larger cohort. In any case, the demonstration of deep brain source extraction from scalp EEG also helps to understand that source imaging of deep and subcortical source localizations is plausible and cannot always be considered as false localization.

Influence of the Deep Source Strength
For the first five trials of the initial setting, the fact that the medians of validated ICs from sorted data were greater than those from unsorted data suggested that the contribution of the deep source influenced the waveform of the IC and especially its super-Gaussianity. In other words, if the amplitude of the deep source was strong enough, some ICs could be validated with the averaging of only a few EEG segments. It is important to recall that the mean SNR of interictal scalp spikes from the mesial networks used in this study was −2.1 dB (Koessler et al. 2015), i.e., spikes were not visible in the un-averaged scalp EEG signals. In this tricky situation of low SNR, BSS methods are not completely efficient because they are very dependent on the nature of the noise. Using another methodology, Pyrzowski et al. (2021) used zero-crossing patterns for extracting low SNR interictal discharges from simultaneous scalp and intracranial EEG recordings. Their results suggest that scalp zero-crossing patterns extract the spatiotemporal structure of subtle scalp voltage fluctuations correlated with intracranial interictal epileptic discharges. Considering the use of very few depth electrodes (majority of subdural electrodes) in the whole cohort of that study, and the possibility of physiological propagation, the efficiency of the zero-crossing patterns method for mesial temporal sources is not completely defined.

Causes of Non-validation of ICs
In this paragraph, we focus only on IC1 because IC2 and IC3 less frequently corresponded to deep sources.
The first cause of non-validation of IC1 was an insufficient number of averaged EEG segments to disentangle the sources. The analysis of the curves of the IC validation percentage as a function of the number of EEG segments revealed that the minimum number of required segments varied from patient to patient, and no simple and robust method could be found to adapt this number to each patient. Nevertheless, beyond a minimal number of 70 to 100 segments, IC1 was often validated.
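The role of the number of averaged segments can be made concrete with the textbook averaging gain. This is a sketch under idealised assumptions that the present data need not satisfy: the spike waveform is identical and perfectly time-locked in every segment, and the background activity is zero-mean and uncorrelated across segments. Under these assumptions, averaging N segments improves the SNR by 10·log10(N) dB. The function name is hypothetical.

```python
import math


def snr_after_averaging_db(snr_single_db: float, n_segments: int) -> float:
    """Idealised SNR (in dB) of the average of n_segments time-locked segments.

    Assumes an identical signal in every segment and zero-mean noise that
    is uncorrelated across segments, which gives a gain of 10*log10(N) dB.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    return snr_single_db + 10.0 * math.log10(n_segments)


# Starting from the mean single-spike SNR of -2.1 dB reported above:
snr_100 = snr_after_averaging_db(-2.1, 100)  # about 17.9 dB
```

In practice, jitter in the estimate of t0, spike-to-spike amplitude variability and residual artefacts reduce this gain, which is consistent with the observation that the minimum number of segments varied from patient to patient.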
A second cause was the presence of artefacts in scalp EEG signals that induced peaks on the IC waveform. The preprocessing of the data used in this study removed the most significant artefacts, but unfortunately not all. Thus, if the latency of the peak caused by a remaining artefact was sufficiently long (i.e., greater than 250 ms), IC1 could be replaced by IC2. The choice of this high latency threshold allowed us to be sure that the observed peak was clearly due to an artefact. Conversely, this threshold had no effect on artefacts that occurred at short latencies, and the substitution did not yield an improvement in every case (41% with our data).
Finally, the other causes could be related to the manual selection of the EEG segments resulting, for example, in (i) the unwanted presence of other co-activated brain sources, outside the temporal lobe, that induced an incorrect cartography, or (ii) an inaccurate estimation of t0. Regarding this last point, our blind method could operate with a small bias between the estimated value of t0 and the real one if the dispersion of the estimate was small, i.e., corresponding to a peaky IC waveform for the averaged EEG segments.
Despite the use of expert control, network 6 with extended infomax ICA, and network 8 with all methods, had a validation percentage of IC1 equal to zero. These failures can be explained by different factors. For network 6, and for the extended infomax ICA method only, the baseline of the selected IC1 fluctuated strongly, resembling a variable-frequency square wave. Consequently, the detection threshold was higher than that of the other two methods and the amplitude of the peak of the waveform was lower than the threshold, so that the IC was not validated. For network 8, the IC waveform associated with relevant EEG electrodes comprised two peaks, a first small-amplitude peak followed by a higher one. The latency of this first peak was mainly correct but its amplitude was very often lower than the detection threshold, so that the second peak, with a latency outside the thresholds, was selected and, consequently, the corresponding trials were not validated. The remaining trials with validated latency were then rejected by Walsh's test because the IC's amplitude value at t0 was too small compared to the background activity.

Decision-Making and Clinical Perspective
To avoid erroneous decisions, it seemed more convenient to analyse the results of several consecutive trials and not just one. To do that, the proposed bar chart seemed well suited and, in a blind way, the pattern of the IC1 bar chart could be [...]

In a clinical context without simultaneous SEEG recordings, and so without the ability to average epileptic spikes, we could use a quite long duration of raw scalp EEG recordings to maximize the probability of recording deep and very active mesial temporal sources. MTL epilepsies are known to have a very active irritative zone with very frequent epileptic discharges in the hippocampus, amygdala, entorhinal cortex, parahippocampal gyrus, etc. (Bourien et al. 2005; Koessler et al. 2015; Karunakaran et al. 2018). Before the analysis of long-duration raw scalp EEG recordings, spontaneously visible epileptic spikes and artefacts could be removed with automatic open-source detection algorithms. This step should avoid the detection of ICs from artefactual sources or, most importantly, lateral brain sources. In our previous study (Koessler et al. 2015), we showed that spontaneously visible scalp EEG spikes (i.e., with a high SNR) arose from neocortical sources. So, removing these visible scalp EEG spikes should result in scalp EEG signals with both background activity and invisible scalp EEG spikes from deep brain sources (i.e., spikes with a low SNR like in this study). Then, in a first step, the visual analysis of the IC topographies could be used to find the scalp topography that could correspond to a mesial temporal source contribution, i.e., a negative/positive polarity in the basal temporal electrodes like FT10/9 (Koessler et al. 2015). In a second step, the kurtosis analysis of short segments of these long-duration scalp EEG recordings could be used to count the number of segments (and so, give a percentage) with significant super-Gaussian ICs.
Selecting a long-duration scalp EEG recording with many intracerebral interictal discharges will not change the SNR of the corresponding scalp EEG signal, but it will increase the probability of recording very high-amplitude intracerebral epileptic discharges.
Our study has demonstrated that extraction is feasible in this situation (e.g., the first five segments of sorted data). Looking ahead, despite the imperfect detection of mesial temporal sources in this study, our method could be used in some routine clinical situations (that need to be defined using a large cohort) to alert clinicians that there is a probability of mesial temporal source activation in their raw scalp EEG signals.
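The second step proposed above (counting the short segments of a long recording in which an IC is super-Gaussian) could be sketched as follows. This is a hypothetical illustration, not a procedure evaluated in this study: the window length, the fixed kurtosis threshold and the function names are assumptions, and a real criterion for "significant" super-Gaussianity would require a statistical test that is not specified here.

```python
from typing import Sequence


def excess_kurtosis(x: Sequence[float]) -> float:
    """Population excess kurtosis; NaN for a constant window."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    if m2 == 0.0:
        return float("nan")
    return m4 / (m2 ** 2) - 3.0


def fraction_super_gaussian(ic_signal: Sequence[float],
                            window_len: int,
                            threshold: float) -> float:
    """Share of non-overlapping windows whose excess kurtosis exceeds
    the (hypothetical) threshold. Trailing samples that do not fill a
    complete window are ignored."""
    n_windows = len(ic_signal) // window_len
    if n_windows == 0:
        raise ValueError("signal shorter than one window")
    hits = 0
    for w in range(n_windows):
        segment = ic_signal[w * window_len:(w + 1) * window_len]
        # A NaN kurtosis compares as False, so constant windows never count.
        if excess_kurtosis(segment) > threshold:
            hits += 1
    return hits / n_windows
```

The returned fraction corresponds to the percentage of segments mentioned in the text and could be reported alongside the IC topography examined in the first step.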

Conclusion
Having established the contribution of deep mesial temporal sources to scalp EEG (Koessler et al. 2015), we demonstrated in this methodological study that the extraction of these sources, which are invisible on the scalp, is possible under certain conditions. The first IC extracted from the scalp EEG signals was validated on average in 46 to 80% of cases, depending on the parameters. Despite the imperfect detection, this study shows that a relatively simple signal analysis of scalp EEG can extract epileptic discharges of brain sources that are hidden or mixed with others and so can escape expert visual analysis. For the clinical diagnosis of epilepsy, this solution, which relies on non-invasive recordings, would be important because it can change medical care, especially in drug-resistant epilepsy where source detection and localization (e.g., deep and/or lateral) are crucial. Finally, it is important to mention that we deliberately used commonly available toolboxes to test their performance and found promising results. These offer several interesting perspectives for the development of new signal processing tools and methods that could improve the performance of deep source extraction.

Competing interests
The authors declare no competing interests.

Fig. 10 Blind source separation analysis with expert control: classification, for the JADE method with all trials and for unsorted data, of EEG channels according to the number of selections for ICs. a IC1. b IC2. The black bar and the grey bar correspond respectively to validated and non-validated ICs. P patient, N network