Causal Cortical and Thalamic Connections in the Human Brain

The brain’s functional architecture is intricately shaped by causal connections between its cortical and subcortical structures. Here, we studied 27 participants with 4864 electrodes implanted across the anterior, mediodorsal, and pulvinar thalamic regions, and the cortex. Using data from electrical stimulation procedures and a data-driven approach informed by neurophysiological standards, we dissociated three unique spectral patterns generated by the perturbation of a given brain area. Among these, a novel waveform emerged, marked by delayed-onset slow oscillations in both ipsilateral and contralateral cortices following thalamic stimulations, suggesting a mechanism by which a thalamic site can influence bilateral cortical activity. Moreover, cortical stimulations evoked earlier signals in the thalamus than in other connected cortical areas suggesting that the thalamus receives a copy of signals before they are exchanged across the cortex. Our causal connectivity data can be used to inform biologically-inspired computational models of the functional architecture of the brain.


Introduction
The brain's dynamics are shaped by electrophysiological interactions throughout its subregions, encompassing not only cortical but also subcortical regions.Advancements in neuroimaging have provided signi cant insights into the global architecture of the brain's functional connectivity 1 .However, we know considerably less about the dynamic, fast-paced and causal electrophysiological relationships in the human brain.This knowledge gap is even more pronounced when it comes to understanding the role of subcortical areas such as the thalamus, which is known to play a key role in modulating the global dynamics of the brain 2 .
A classic method of studying causal electrophysiological relationships in the brain is by sending repeated single electrical pulses to intracranially implanted electrodes while recording the presence or absence of electrophysiological changes in all other areas of the brain where recording electrodes are present 3 .This method has been traditionally referred to as the study of cortico-cortical evoked potentials 4 (CCEPs) since it has primarily been used to study causal electrophysiological connectivity between pairs of cortical regions.Similar studies of thalamocortical and corticothalamic connections have been rarely conducted since direct recordings from and stimulations in the human thalamus are extremely rare in human neuroscience research [5][6][7] .As a result, it remains unknown if stimulation of the thalamus will have a different, or the same, effect on other regions of the brain.Moreover, with a few exceptions 7,8 , previous studies have been largely reliant on simple univariate measures detecting large peaks or the time-to-peak in the evoked signals, and limit responses to xed windows of interest.These traditional methods are not able to capture the complex dynamics of physiological responses generated by electrical stimulations (Fig. 1); and they can vary signi cantly depending on the chosen cutoff z-scores for determining an effect, current intensity and the distance between stimulating and recording electrode contacts 9 .These problems become more signi cant when we lack prior data to inform our hypotheses about evoked potentials caused by human subcortical structures such as the thalamus.
In the current study, we aim to use a novel approach to provide a map of causal electrophysiological connections in the human brain including the thalamus.

Results
Our results are based on intracranial recordings and stimulations in twenty-seven participants (40.7% female; mean age ± SD: 34.9 ± 10.0 years) diagnosed with focal epilepsy.Each participant provided data from an average of 180 (± 46) electrode sites.The entire dataset encompasses 4864 sites across all subjects, with each adjacent pair stimulated approximately 45 times (with su cient inter-trial intervals to avoid overlapping effects), while recordings were taken from all other available sites.
We leveraged a recently developed clinical method that involves The brain's functional architecture is intricately shaped by causal connections between its cortical and subcortical structures.Here, we studied 27 participants with 4864 electrodes implanted across the anterior, mediodorsal, and pulvinar thalamic regions, and the cortex.Using data from electrical stimulation procedures and a data-driven approach informed by neurophysiological standards, we dissociated three unique spectral patterns generated by the perturbation of a given brain area.Among these, a novel waveform emerged, marked by delayed-onset slow oscillations in both ipsilateral and contralateral cortices following thalamic stimulations, suggesting a mechanism by which a thalamic site can in uence bilateral cortical activity.Moreover, cortical stimulations evoked earlier signals in the thalamus than in other connected cortical areas suggesting that the thalamus receives a copy of signals before they are exchanged across the cortex.Our causal connectivity data can be used to inform biologically-inspired computational models of the functional architecture of the brain.thalamic recordings in neurosurgical patients undergoing invasive stereotactic electroencephalography (sEEG) 5 .In this procedure, intracranial electrodes were implanted in three different sites of the thalamus -corresponding to anterior, mediodorsal, and pulvinar nuclei -by extending electrodes initially placed in the cortical areas.This procedure was designed for extended clinical indication without increasing the number of implanted electrodes.

Identifying Common Stable Patterns in Stimulation Evoked Potentials
To determine true physiological effects upon electrical stimulation, we used a non-linear manifold learning algorithm 10 integrated with human judgement, based on the spectral information of evoked potentials (EPs) across time and frequencies (Fig. 2a-c).In this approach, we rst labeled the data partially and tentatively, i.e., subject to subsequent data-driven corrections, based on two measures of connectivity: (i) trial-averaged power of EPs as a measure of strength, and (ii) inter-trial phase coherence (ITPC) across repeated stimulations as a measure of consistency for the evoked responses 11 .We applied a semisupervised machine learning approach to decipher the inherent data structure from this preliminary labelling, and re-classi ed the data, within each subject, into two separate clusters (Fig. S2): one cluster of data where stimulation of the seed site caused signi cant EPs in the target site, and another cluster of data where stimulation of the seed site did not cause any signi cant EPs in the target site.For simplicity, we called these two clusters "activated" and "non-activated", respectively.
As a result, EPs in the activated cluster were differentiated from the spontaneous activity based on prominent changes in both power and ITPC relative to the baseline (pre-stimulation intrinsic neural activities), and also differentiated from the unnatural signals that we have manually labelled as noise (e.g., stimulation artifact or bad channels) by providing those typical cases.Once all subjects' EPs were successfully classi ed as "activated" vs. "non-activated", we ensured the cluster's meaningful representation of the data by manually inspecting the original and reconstructed data for every subject.With the machine-learning re ned labelling (activated vs. non-activated) as a benchmark, the preliminary activation criteria were re-evaluated for further sanity checks (Table S1).
We next aimed to characterize the prominent electrophysiological properties of the EPs in the "activated" cluster.Dimensionality reduction with uniform manifold approximation and projection (UMAP) revealed that the inherent structure of the data was different for stimulation-recording pairs within the same hemisphere (i.e., ipsilateral) compared to those across the two hemispheres (i.e., contralateral connections).Surprisingly, we also discovered that the inherent structure of the data was different for EPs generated by stimulation of the thalamic sites vs. cortical sites (Fig. 2d; more label mapping can be found in Fig. S3).The thalamic vs. cortical recording sites, however, do not exhibit such obvious differences (Fig. S3c vs. Fig.S3d).We refer to this anatomical features (1.Stimulation from the cortex or thalamus, 2. Ipsilateral or contralateral connections) as UMAP localizers.We then searched for the signi cant features that set the data apart in the embedding space.To dissociate the data that contain the features of interest, we applied supervised learning with the UMAP localizers as labels.As expected, it resulted in four clusters in the embedding space with distinct spectral patterns, i.e., clusters of thalamic, cortical, ipsilateral, and contralateral EPs (Fig. 2e).

Dissociating Distinct Neural Features in Evoked Responses
Once the four clusters were identi ed, we used the power and ITPC spectrograms from each cluster for further statistical testing.Cluster-based permutation testing identi ed the signi cant time/frequency boundaries of three distinct neural features in the data that we labeled as Feature 1 (F1), Feature 2 (F2), and Feature 3 (F3) as tentatively shown in Fig. 2e by visual identi cation.Details of statistical testing procedures are presented in the Methods.We summarize the three features as follows (Fig. 3a, b): F1 is characterized by signi cant increases in power of high frequency activity in the gamma range and the strength of ITPC within the rst 10-60ms.Of note, F1 was clearly dissociable from the artifact which was showed as a high-frequency (> 100 Hz) peak power immediately after the stimulation (< 10ms) and lacking signal continuation in both frequency and time dimensions (Fig. S5).F2 is characterizable by an increased power in the theta to alpha range, without phase locking, happening ~ 120 ms upon stimulation.
Later than F2 (~ 200 ms post-stimulation), F3 is featured by a prolonged power increase in the theta band, with a phase-locking feature which indicates oscillation (Fig. 4a).
Importantly, the EPs generated from thalamic stimulations were distinct from the cortical stimulations by the presence of a strong F3 signal (i.e., higher theta frequency change starting emerging around 165 ms and persisting for about 250 ms (Fig. 3b).By checking the signals in the time domain, combining the evidence of the evoked power and ITPC, we con rmed (i) the presence of oscillatory activity embedded in the F3 signal, and (ii) that the F3 signal is not a delayed F2 or continuation of it.In the time domain, F3 indicates a ~ 5 Hz oscillation, but F2 indicates a wider curve with jittered peaks across sites.In the timefrequency domain, both F2 and F3 share similar power spectrum, but their ITPC pattern differs signi cantly in that F2 does not have a phase-locking feature (Fig. 4a).Moreover, unlike F2, F3 is lacking in cortical EPs, instead, it is speci c to thalamic EPs with both THAL-ipsi and THAL-contr clusters showing this delayed and long-lasting F3 activity.
We then investigated the spectrograms separately for different recording sites.Pro les of THALàCOR, CORàTHAL and CORàCOR in both ipsilateral and contralateral pairs re ected the main results of the UMAP localizers, which showed that the stimulation sites, rather than the recording sites, dominate the spectral features.
Notably, multi-site thalamic coverage enabled us to examine intra-and inter-thalamic causal connectivity at the individual level (Fig. 3c).The results provided unprecedent data from the thalamus where we could compare the connectivity of different thalamic subregions with the available brain recording sites and examine the effect of electrical stimulation of a given thalamus upon the activity of sites in other thalamic subregions within the same or across the other hemisphere (Fig. S7).

Timing and Global Patterns of Connectivity Indicated by Feature Presence
Next, we aimed to understand how each of the three neural features unfolds in time within individual instances of connectivity for the four categories of interest (i.e., COR-ipsi, COR-contr, THAL-ipsi and THALcontr).The group-averaged spectral information within the signi cant time windows from the COR-ipsi, COR-contr and THAL-contr categories was used to create the feature template for F1, F2 and F3, respectively (Fig. 4a).Using a sliding-window correlation approach (see details in Methods and illustration in Fig. 4b), we estimated the strength of feature representations (r*) and their presence in time.
The decoding results also echoed the previous nding that F3 was seen mainly in evoked potentials pertaining to the thalamic stimulations, with F3 peaking higher in ipsilateral than in contralateral EPs (Dr* = 0.11, t = 48.05,p FDR-corr = 0.02, n-conn.= 6497, n-sbj.= 25), but without signi cant difference in the latency between ipsilateral and contralateral EPs (Dr* = 5.03 ms, t = 1.86, p = 0.06, n-conn.= 3106, n-sbj.= 25 for robust F3 with r* > 0.5, latency > 200 ms).Additional post-hoc comparisons among sub-groups are presented in Fig. 5 and SI Table S9.The correspondence of the decoding results with the previous grouplevel spectrogram analysis suggests that our devised feature indices can reliably indicate the presence of the multivariate features with single values.
We further examined the whole-brain spatiotemporal patterns indicated by the three neural features.To do so, we constructed connectivity matrix with each cell lled with the connectivity index of R for indicating the signal strength between two brain regions (R = , averaged across the sites where the feature is present in the given regions).By organizing brain areas in the order of anatomical proximity, interesting patterns emerged: (i) the F1 matrix demonstrated a pro le of modular connectivity, i.e., higher F1 representation was seen for connections between pairs located within the same hemisphere or lobe and within the ipsilateral thalamus (Fig. 6a, b); (ii) the F2 matrix revealed wide-spread representations across regions and hemispheres and demonstrated that the global architecture of connections based on F2 did not differ much from the F1 matrix.This seems to suggest that F2 is a combination of jittered rebounds following the same paths for the F1 signals (Fig. S10); (iii) unlike the F1 and F2 matrices which have a modular architecture, the F3 was sparsely present among the cortical connections, but largely present in the cortical responses (i.e., widespread and bilateral) associated with thalamic stimulation (Fig. 6f, g).
In a closer examination, we found that F1 corticothalamic and F3 thalamocortical connectivity features had the following unique characteristics in common: (i) F1 responses recorded in the thalamus had widespread and bilateral origins across the brain (i.e., stimulation of many cortical regions bilaterally could evoke responses in the thalamus) (Fig. 6a), and vice versa for F3, the stimulation of a thalamic site would cause widespread cortical areas bilaterally to show delayed slow oscillations (Fig. 6b); and (ii) both corticothalamic F1 and thalamocortical F3 responses were stronger and earlier than corticocortical F1 and corticocortical F3 responses, respectively.The statistical details are presented as below.
We compared the cortical versus thalamic targets responding to the same cortical seed of stimulation in the same participant, and con rmed the F1 responses in the thalamus were earlier and stronger (Dlatency − r * = 2.79 ms, t = 12.74, p FDR-corr < < 0.01, n-conn.= 48446, n-sbj.= 26; Dr* = 0.06, t = 18.20, p FDR-corr < < 0.01, nconn.= 30965, n-sbj.= 26, with HLM corrected for the covariance effect of cross-hemispheric connections and the nested random effect of 1378 different stimulation sites of different brain areas in all subjects) (Fig. 6a for R matrix, SI Fig. S10 for latency matrix).In other words, stimulation of a given cortical site changed the activity of the thalamus before it changed the activity of its cortical targets (Fig. 5a and SI Table S9 for post-hoc latency comparisons).
Moreover, the F3 matrix revealed that the stimulation of thalamic sites caused delayed low-frequency oscillatory effects (i.e., F3) in a large mantle of the cortex in not only ipsilateral but also contralateral hemispheres (Fig. 6b).By contrast, the F3 feature was only sparsely present with the stimulation of nonthalamic seeds.Importantly, the latency of F3 responses caused by thalamic stimulations was signi cantly earlier than the non-thalamic seeds (36.27 ms earlier, t = 21.60,p FDR-corr < < 0.01, n-conn.= 20823, corrected for the covariance of cross-hemispheric effect, and the nested random effect of 3041 recording sites of different areas in all subjects with an HLM) (Fig. 5c).

Differential connectivity among thalamic subregions
Up to this point, we have remained agnostic to the speci c anatomical location of the thalamic sites where evoked responses were recorded or stimulations were seeded.However, each subject had more than one electrode in the thalamus targeting the anterior, mediodorsal and pulvinar nuceli 5 .Given that the precise anatomical boundaries of each thalamic nucleus within each subject may be variable and di cult to ascertain with neuroimaging data, and that the eld of evoked responses or electrical stimulations may cross the nuclear boundaries, we refrained from labeling our thalamic sites according to speci c nuclei, and instead, referred to them as anterior (antTH), middle (midTH), and posterior thalamus (pstTH).
Leveraging the anatomical information of recordings and simulation sites -at the individual level -we compared the connectivity pro les of the three different thalamic sites.This analysis revealed unique sitespeci c pro les of connectivity between the thalamus and the cerebral cortex (Fig. 6c, d, e).As describing the details of differential connectivity of thalamic sites are outside the scope of this report, we only highlight some of the most salient ndings.The full results of pair-wise comparisons among the thalamic subregions for their bi-directional connections with each of the brain regions are provided in the supplementary Table S2 Next, we compared the delayed thalamocortical F3 signal representation map with the ones for early thalamocortical F1 and early corticothalamic F1 maps, with pair-wise comparisons within the same pair of thalamic vs. non-thalamic sites.Using the measure of normalized feature representation , as shown in Fig. 7 and detailed in Tables S6 and S7, between a subregion of the thalamus and a given brain region of interest, we found signi cant asymmetry between thalamocortical F3 representations and thalamocortical F1 representations.Many regions had stronger F3 than F1 representations suggesting that the delayed thalamic (F3) out ow is represented across a more widespread cortical areas compared to the fast thalamic (F1) out ow signal (see brain regions with more brown colors in Fig. 7a and positive estimate numbers in Table S6).The same applies to the comparison between thalamocortical F3 and  S7).

Discussion
In our study, we addressed the limitations of past studies by leveraging a novel approach to characterize the complex features of electrophysiological responses that are evoked by not only cortical but also thalamic stimulations in the human brain.This is important because, as we documented in Fig. 1, reliance on the magnitude of recorded neural responses or expecting typical N1 and N2 pro les in the evoked responses may greatly bias our views of causal connections in the brain.The magnitude of evoked responses can be affected by several confounding factors such as distance from the source of stimulation, volume conduction, and electrode impedance 12 and the time-to-peak may be elusive when there is no clear N1 or N2 peaks.While typical N1 and N2 waveforms reported in the CCEP literature are abundant, it is important to note that the past reports were based on group level analysis of data from cortical stimulations 13,14 .As noted in Fig. 1, individual evoked signals may be far more complex than simple N1 and N2 waveforms and the magnitude of evoked responses may not always pass an arbitrary threshold we set for signi cance.This issue is even more pressing when waveforms from subcortical structures such as the thalamus are included.Therefore, in line with recent attempts by other investigators 7,8 , we felt it was imperative to employ a multivariate and unbiased approach for sorting the distinct pro les of connectivity across cortical and thalamic sites.
In our encoding method, we relied on the machine learning algorithms of UMAP to identify the inherent variabilities among the datasets.We emphasize that our data-driven approach was not blind to biologically plausible features.Instead, it was directly guided by the 2/3 of the data that had already been labelled based on established neurophysiological criteria for signi cant evoked responses (i.e., signi cant amplitude above the noise level, signi cant inter-trial phase coherence concerning reliability of responses across trials, and more importantly, labeling what constitutes a stimulation artifact).Our post-hoc analysis also con rmed that the machine-learning re ned labeling largely corresponded to the activation detection results based on the preset biological criteria, among which, the criterion based on signi cant ITPC (i.e., consistency in the evoked responses over trials) had the highest correspondence (0.80).In contrast, the traditional activation detection criterion of large peak detection (> 5 z-score) had a high precision (0.99) but low sensitivity (0.53), which is in line with our observation: many thalamic evoked responses (especially the contralaterally recorded ones) would be missed if based solely on the amplitude criterion as those responses usually have lower amplitudes on the voltage traces, though not necessarily possessing less robust evoked features.
Our new approach yielded important results suggesting that the complex electrophysiological signals evoked by repeated single pulse electrical stimulations have three distinct features (i.e., F1, F2, and F3).In the context of past literature, unitary F1 and F2 features may correspond to clear N1 and N2 peaks, but we emphasize that changes in the neural activity of a given brain site evoked by the stimulation of another site cannot always be depicted as a uniform waveform consisting of N1 and N2, but rather, a complex signal with different components of the evoked waveform which may in turn re ect distinct physiological substrates with distinct sources.While we postulate that F1 and F2 may be related to the well-known N1 and N2 components of the EP signal described in the literature 12,14,15 , our data suggests that F3 is a unique feature of its own and not a prolonged F2 -even though F2 and F3 features have seemingly similar power spectrogram peaking at a low frequency range.Our rationale is as follows: (i) F3 has a peak oscillation at about 5 Hz, but F2 has wider frequency range with jittered peaks across trials.(ii) F3's ITPC spectrogram differs signi cantly from the F2's, as F3 has a phase-locking feature at the theta frequency but F2 does not (Fig. 4a).The oscillatory feature of F3 was also observed at the same time-frequency cluster as long-lasting theta oscillations ( > = 2 cycles).The oscillatory feature of F3, unlike F2, was observed in the time domain: it was strongly time-locked and exhibited at least two cycles of oscillationeven when averaging the EPs recorded across all recording areas and across all thalamic stimulation sites.(iii) The temporal window of signi cant time-frequency clusters in F2 and F3 do not overlap (i.e., F3 always happens after F2, peaking > 200 ms post-stimulation).(iv) Lastly, F3 feature is present speci cally when the thalamus is stimulated (including both THAL-ipsi and THAL-contr clusters), while F2 does not have this anatomical speci city.
We documented that the F3 feature was detected with the stimulation of all three thalamic sites.This suggests that the F3 signal is not produced by the stimulation of only one speci c thalamic nucleus.However, it's important to acknowledge a limitation of our study, which stemmed from the absence of data from a number of other thalamic nuclei.For instance, we have no information about the pro le of electrophysiological responses evoked by the stimulation of thalamic nuclei such as the geniculate bodies.This limitation was primarily due to the clinical constraints and ethical considerations inherent in studies involving human subjects.
We also note that the F3 oscillatory evoked responses were also seen with the stimulation of a few nonthalamic brain regions -especially the hippocampus and amygdala -both of which are non-cortical structures [Figure 6b]).interestingly, the latency of F3 signal evoked by the stimulation of these nonthalamic sites was signi cantly delayed compared to the latency of F3 signal evoked by the stimulation of the thalamic sites, and they were mostly present in ipsilateral recording sites while the thalamic stimulations evoked F3 in far more widespread areas involving bilateral hemispheres.
As the thalamus is considered to play a key role in cortical rhythms 16 , and as the hippocampus is known for its role in generating theta rhythms 17 , we hope our ndings will motivate future systematic studies to determine how theta oscillations observed throughout a broad surface of higher association areas in the cerebral cortex during cognitive activities tasks or rest 18,19 are related to the delayed modulatory effects exerted by the thalamic (or medial temporal lobe) structures.It also remains to be determined if the bilateral presence of the F3 signal evoked by thalamic stimulations is functionally related to, or is crucial for, the integration of information across the two hemispheres.
Our approach with signal decoding of the stimulation and recording data revealed that stimulation of a given cortical region evoked earlier and stronger responses in the thalamus compared to the cortical targets.A hypothetical interpretation of this nding is that the thalamus acts as a universal receiver of information being exchanged across cortical regions (i.e., a "listener of corticocortical dialogue") -which is a necessary for it to function as a key structure for information integration as detailed recently 2 .
Additionally, the multisite thalamic stimulation and recording within the same individual offered us an unprecedent opportunity to examine the intrathalamic and interthalamic causal connections.We demonstrate that the stimulation of a given thalamic site led to fast high frequency modulation of activity in other thalamic sites ipsilaterally as well as delayed low frequency modulation of thalamic sites contralaterally.These ndings provide a plausible electrophysiological mechanism for intra-thalamic and inter-thalamic connectivity, which may have important implications and clinical relevance: Stimulating a speci c thalamic nucleus with DBS (e.g., for treatment of epilepsy 20 ) is likely to impact the activity of other thalamic nuclei (beyond the one directly targeted) and the cortical networks they are associated with.Thus, the therapeutic bene ts of DBS might be related to a broader network of brain regions that are modulated rather than a focal network targeted.However, this does not imply that the effect of stimulation of each targeted thalamic site is diffuse and all-encompassing, as shown in Fig. 5, stimulations in three subregions of the thalamus do not elicit anatomically identical responses, but instead show relative anatomical preference.
We are mindful that our data was acquired from patients with a neurological disease (epilepsy) and one may ask to what extent they are generalizable to normal human brains.While we acknowledge that molecular and cellular changes accompany local circuit rewiring in patients with chronic epilepsy 21 , there is no rm evidence suggesting changes in the fundamental topology of connections at the system level in these patients 22 .In the extant literature, unless studying the circuit level and local changes in a region-ofinterest type of analysis [23][24][25] or graph theoretic measures for deriving high-level estimation of network capacity 26,27 , the global network topology of the brain has been shown not to be signi cantly different between epileptic cohort and the normal control 28 .Several studies focusing on epileptic brain reorganization have shown that the same sets of canonical brain networks can be discovered by the datadriven methods, just as in normal controls [29][30][31] .
We are also mindful that electrode coverage in our subjects was limited and sparse.This is unfortunately a limitation of the intracranial approach in human participants that we, as researchers, have little control over since the implantation of electrodes in all participants was clinically driven.We made the best effort to reduce the confounding effects by maximizing the number of subjects, and diversifying the cohort with different epilepsies so as to avoid a system bias of encountering pathological tissues concentrated in speci c areas.
Despite the limitations of our study, we hope that our ndings can serve as the steppingstone towards a wider understanding of the architectural topology of interactions across the cerebral cortex and between those and subcortical structures such as the thalamus.
In closing, we acknowledge that the causal connectivity maps presented in our current study provide only one aspect of the functional architecture of the brain, and as such, they serve as rough approximations for the dynamic causal interactions occurring in real-life scenarios and during cognitive processing and are complimentary to those connectivity measures generated by structural diffusion MRIs or resting state or task speci c imaging studies 3,4,[32][33][34] .To highlight the functional signi cance of RSEPS-based connectivity maps, we recently con rmed that the strength of RSEPS connectivity between sites correlates with the strength of their co-activations during a cognitive task, i.e., if a speci c pairs of neuronal populations across the cortex and thalamus had high RSEP-based connectivity strength they had higher correlation of high-gamma activations during a memory task 11 .Additionally, it has been shown that seizures propagate from cortical seizure onset zones to thalamic sites 5 or other cortical areas 35 that show faster and stronger RSEPS connectivity measures with the seizure onset zone.
Online Methods

Materials availability
Raw data, including electrophysiology collected in this study will be shared publicly after publication.
Data and code availability: The electrophysiological data have been deposited at Mendeley and are publicly available as of the date of publication.The DOI will be listed in the key resources table.
All original codes have been deposited at Zenodo and is publicly available as of the date of publication.DOIs will be listed in the key resources table.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Participants
In this study, twenty-seven participants (40.7% female; mean age ± SD: 34.9 ± 10.0 years) diagnosed with focal epilepsy were recruited (see Table 1).Each participant had 180 ± 46 (mean ± standard deviation) SEEG contacts implanted.The total number of electrode contacts being studied is 4864.All participants underwent invasive electrophysiological monitoring at our medical center as part of their treatment for refractory epilepsy.The placement of electrodes was determined exclusively based on clinical requirements.Prior to their involvement, all participants provided written informed consent, and the study protocol was approved by the Institutional Review Board for human experimentation.

Patient Safety & Ethics
As noted, we did not implant extra electrodes for thalamic recordings.We only extended the electrodes that were clinically planned for cortical monitoring to reach the thalamus.Our procedure was based on patients' informed consent and all our research related activities were IRB approved.To minimize injury, we used a reduced diameter obturating stylet and reduced diameter electrodes with 0.86mm diameter (AdTech, Inc).With this surgical approach, we have observed no complications, no thalamic hemorrhage or edema, and no neurological symptoms post operatively in any of the patients examined.We note that the thalamic recording in epilepsy patients has been widely practiced in some centers in Europe for two decades yielding clinically important information [36][37][38][39][40] .The method has recently gained ground in the U.S. [41][42][43][44] , and a recent neurosurgery editorial recommended and encouraged more centers in the U.S. to do the same 45 .Additionally, iEEG implantation strategy is decided by a group of clinicians who meet weekly to decide the clinical needs of the patient, and not by this research project.Also, as noted, all patients provide informed consent prior to being enrolled in any research studies.

Electrode Implantation
MRI sequences for imaging the thalamus and its subnuclei included BRAVO, fGATIR and MP2RAGE.We co-registered the post-implant CT with a pre-implant MRI.High resolution T1, fast gray matter acquisition T1 inversion recovery (FGATIR), and T1 post-contrast imaging were used for planning.Trajectories were planned to traverse in an orthogonal plane to capture the cortical frontal or temporal operculum, insula, and to be extended into speci c sites of interest in the thalamus.The priority for all trajectories was safety and avoidance of blood vessels.To achieve implementation of our multisite sampling approach, the cortical electrode trajectories were extended to reach three different thalamic sites: anterior, mid, and posterior thalamus as detailed in our recent publication 5 .Approximate locations and number of electrodes along with their trajectories were all planned in a multidisciplinary surgical epilepsy conference with detailed review of presurgical data leading to a clinical hypothesis of the most likely seizure onset zones.Electrode placements were at the discretion of the clinical care team, who were not speci cally involved with this research project.

Co-localization of electrodes
A thin cut head-CT, obtained after electrode implantation, was co-registered to pre-operative 3Tesla MRI data for veri cation of the trajectory.Precise electrode positioning in the FreeSurfer surface space, voxel space and MNI space were automatically extracted by the iElVis toolbox 46 .The T1-weighted MRI scan was used to generate 3D cortical volume and subcortical segmentation using recon-all command of Freesurfer v7.3.2 47 .The post-implant CT scan were co-registered to the pre-implant MRI using the irt from the Oxford Centre for Functional MRI of the Brain Software Library 48,49 or using bbregister from Freesurfer 50 to get the best results.We manually labelled each electrode on the T1-registered CT image using BioImageSuite 51 .The electrode coordinates in the native anatomical space were inspected and manually labeled by the experienced neurologist J.P., based on the individual brain's morphology and landmarks.

Thalamic parcellation
To check the concordance between manual and automatic labeling of thalamic electrode sites, individual contact centers of electrode masses were de ned in native T1 space for each subject.The center of mass of each contact was then converted into an X,Y,Z coordinate in the MNI space.A recently developed and widely used MNI thalamic atlas 52 was utilized to parcellate the nuclear location of each contact within the thalamus.For this, a 1mm cubic voxel region was created around each contact center of mass (contact neighborhood).For each contact neighborhood, the fraction of voxels that overlapped with each thalamic nucleus in the THOMAS atlas was calculated.Contact neighborhoods that fell solely within a single THOMAS nuclear mask would have a value of 1 for that speci c nucleus, while contact neighborhoods that had no overlapping voxels with a given nuclear mask would have a value assigned to 0 for that speci c nucleus.Thus, for every contact neighborhood anatomically inside the thalamus, the fractional overlap with each THOMAS nuclear mask was calculable.For each site, this within-subject fractional overlap was summed across subjects and across all thalamic contacts to generate overall contact neighborhood fractional overlap values.

Identifying Regions of Interest
All electrodes contact locations were manually labelled with brain regions of interest (Fig. 1a) by the experienced neurologist J.P. on the team.This process involves carefully inspecting the CT-MRI fused images in axial, sagittal, and coronal planes for the implanted electrodes, and their surrounding anatomical landmarks in the brain morphology (e.g., basal ganglia including putamen, and pallidum etc.).Contacts entirely or partially located in the white matter (e.g., internal capsule, or cingulum etc.) were labelled as white matter.We emphasize that these regions of interest were not identi ed in the standard atlas space, but in each participant's own high-resolution T1 scan to ensure anatomical precision of electrode localization.

Repeated single electrical pulse stimulation (RSEPS)
Stimulations were performed in a bi-polar manner stimulating between a pair of electrodes.Each RSEPS trial entailed an instantaneous (pulse duration = 0.2 ms) 6 mA biphasic square-wave pulse.For electrode contacts near the seizure onset zones (identi ed by the clinical team), we stimulated at 4 mA.Repeated pulses (n = 42 ± 2 trials) were delivered in a row with an interval of 2 seconds.We excluded data from (i) bipolar channels with both contacts located in the white matter (see above for details of anatomical localization), (ii) bipolar contacts not within the same anatomical region (e.g., one contact in insula and the other in orbitofrontal cortex), (iii) recording bipolar sites too close to the stimulation site (Euclidean distance between their middle points < = 5 mm), and (iv) channels largely contaminated by stimulation artifact, i.e., recording large signals ( > = 7 SD of the baseline) only in the artifact window (< 10 ms), and (v) bad channels based on more speci c criterion as detailed below.

Intracranial EEG data acquisition and preprocessing
The data was collected using the Nihon Kohden recording system with a sampling rate of 1000 Hz.We used an in-house programmed preprocessing pipeline to denoise the intracranial EEG data before any statistical testing.The pipeline included notch ltering at 60, 120 and 180 Hz, data exclusion, and rereferencing.Excluded data includes pathological, noisy channels, and bad trials, the identi cation criteria of which are described as follows.We used a recently developed algorithm to identify pathological channels by the presence of pathological high-frequency oscillations (HFOs) 53 .Noisy channels were identi ed by the presence of extremely high raw amplitude (> 5 standard deviations [SD] across all channels) and/or the prevalence of spikes (> 3 median of the distribution across all channels), i.e. jumps between consecutive data points larger than 80 µV.Trials with extreme variations in the signal, i.e. having extreme mean (> 4 SD across all trials) of the absolute magnitude of the signal values, and extreme standard deviation (> 3.5 SD) of the variance of the signal values along the sampled timepoints, were excluded for being trial-averaged to create the time-domain evoked potential.Following data screening, we applied bi-polar re-referencing instead of common averaging, given that the stimulations were performed in a bi-polar manner, i.e., injected between a pair of electrodes.Common average reference of all non-noisy channels for the intracranial EEG recording analysis has also been experimented and compared with the bi-polar referencing scheme to make sure that the bi-polar subtraction did not remove signals.After manual inspection, we chose to use the bi-polar referencing for the further analyses.
To extract temporal-frequency information, the continuous bipolar voltage traces were decomposed using Morlet wavelet ltering (at log-spaced frequencies between 1 and 256 Hz [59 total frequencies]; each wavelet having a width of 5 cycles).The output instantaneous power timeseries has a sampling rate of 200 at each frequency.The timeseries was then epoched, baseline corrected and averaged across good trials.Values in the power spectrogram were then log-transformed to approximate normal distribution and z-scored across time and frequency.To measure how EPs are consistent across trials, we calculated the inter-trial phase coherence (ITPC) by taking the complex phase component of the wavelet decomposition, averaging across trials, and then taking the absolute.Values in the ITPC spectrogram were then squareroot transformed to approximate normal distribution and then z-scored across time and frequencies.The remaining text of the paper will refer to the evoked potentials (EP), the evoked power and ITPC spectrograms as their preprocessed version by default.
Activation detection using semi-supervised Uniform Manifold Approximation and Projection (UMAP)

Preliminary activation labelling
The key to determining a casual connectivity lies on correctly detecting a true physiological activation among the volume conductions of an electrical stimulation.We employed a semi-supervised learning approach utilizing the Uniform Manifold Approximation and Projection (UMAP) algorithm with the python toolbox umap-learn (https://umap-learn.readthedocs.io/en/latest/) 54.Unlike traditional dimensionality reduction techniques, semi-supervised UMAP allows for integration of labeled and unlabeled data, using our partial knowledge to guide the manifold structure of a true activation.
Before training, we produced a preliminary activation label based on the "consensus" of the pre-de ned 3 criteria: (1) "Uniform-Cutoff": a hard threshold of EP z-score > 7 at any time point in the [10, 600] ms time window.( 2) "Channel-Adaptive": stimulation-channel speci c thresholds (SD > 2) for peak heights and peak dominances of the EPs generated from the same bipolar stimulation.EP peaks were detected using the MATLAB " ndpeaks" function for all peaks (positive or negative), and the peak and prominence distributions were constructed from all the detected peaks, upon which we applied the thresholds.( 3) "Phase-locking": individual speci c thresholds with a con dence level of 95% in the distribution of connectivity index scores across all stimulating-recording pairs of the same subject.The connectivity index is based on ITPC, recently created to measure causal connectivity as discussed in a recent study 11 .
Apart from these criteria reported from the literature, we also imposed some criteria that should be met to be physiologically alike: the voltage trace of the EP should have smooth and fast rises in the early phase (10 ~ 60 ms), later peaks (after 250 ms) can have lower peak heights (z-score > 5), and the signal should approximate to baseline after 800 ms.When all the criteria agree, we labelled a stimulating-recording instance as either "1" (activated) or "0" (non-activated); otherwise (~ 1/3 of the cases), we labelled it as "-1" (unlabeled), which is left for the algorithm to apply learned rules and decide for us.

Training data preparation
We applied a semi-supervised UMAP with power and ITPC spectrograms as input data, as the spectrograms explicitly quanti ed the rich waveform information (components and consistency) of the EPs.We prepared the training data with the following steps.We down sampled and smoothed the spectrograms by averaging the nearest neighbors to reduce the data redundancy, save working memory for the intensive computation, and to denoise and highlight the features of interest.We used characteristic time-frequency kernels to smooth (averaging the nearest neighbors) and down-sampled the spectrograms, based on Leland McInnes k-means clustering from all stimulation-evoked potential peaks, and our prior knowledge that the rst 10 ms upon a stimulation onset is mainly artifact (Fig. S5).
Therefore, we down sampled the temporal domain into 5, 25, 20, and 10 points to cover the windows of [-30, 10] ms (artifact window), [11, 117] ms, [118, 283] ms, and [284, 800] ms.The artifact window was kept in the training data for the algorithm to learn to recognize.We used more data points for representing the [11, 283] ms because most evoked potentials happen in this time window.On the frequency domain, they were down sampled into 4 datapoints to cover the [0.5, 5] Hz (delta), [5, 8] Hz (theta), [8, 15] Hz (alpha), [15, 30] Hz (beta), [30, 70] Hz (gamma) and > 70 Hz (high).The down sampled power and ITPC spectrograms were then vectorized and concatenated as a single vector of multiple features (number of features = timepoints*frequencies of power + timepoints*frequencies of ITPC) for each stimulatingrecording instance.Each stimulating-recording vector was then concatenated to be the input training data.
The input data was curated using the sklearn toolbox in python (https://scikit-learn.org/stable/),including missing data imputation with mean statistics (negligible cases, about 0.089% of the datapoints), and quantile transformation to distribute the data normally.When both transformed and un-transformed data were experimented, the results did not change but showed a clearer clustering effect using the transformed data; hence, the results using transformed data are reported in this paper.

Subject-level semi-supervised UMAP learning
The number of latent dimensions for the UMAP mapping was set to 2, the size of the local neighborhood was set to 5, the minimum distance of the points in the latent space was set to 0.01.The rest of the parameters were kept at default.As a result, each participant's EP data was classi ed into activation and non-activation clusters that were grouped at the far ends in latent space, while the few oating data points that were not clustered were considered outliers.Each person's clustering was manually checked by (1)  visualizing the spectrograms of the centroid and boundary points of each cluster, and (2) visualizing the inverse projection of the original space from the latent space.26 out of 27 participants' data were grouped into two meaningful clusters, while one participant's data (S21_196) with extensive amount of noisy data failed to cluster and was excluded from the following analyses.
Neural feature extraction for activated signals

Group-level UMAP learning
To gain an overview of the data structure of the activated spectrograms, we applied an unsupervised UMAP with input data of denoised power and ITPC spectrograms of the activated channels (number of latent dimensions = 4).To minimize the in uence of electrical artifacts, we created a non-activated spectral information (power and ITPC) template for each stimulating channel by averaging all the nonactivated pairs stimulating from this channel.This non-activated spectral template contains mostly the artifact of volume conduction (Fig. S5), as the physiological events would be averaged out when not synchronized by an activation.We used the non-activated spectral template as a control and subtracted it from all the activated spectrograms from the same stimulation channel within the same participant.When the non-activated cases for a stimulation channel are too few (< 10) and thus unable to generate a stable mean as a contrast, this happened only once, we excluded the case from the subsequent analyses.The remaining traces of the activated spectral information were considered denoised and fed into the grouplevel analysis.
We applied a supervised UMAP with input data of denoised power and ITPC spectrograms of the activated channels, to map to the anatomical labeling of interest.The provided anatomical labels were "THAL-ipsi", "THAL-contr", "COR-ipsi" and "COR-contr", corresponding to stimulating from thalamus/cortex and recording from the ipsilateral/contralateral hemisphere.
The same training data preparation as mentioned before was applied.The number of latent dimensions for the UMAP mapping was set to 2, the size of the local neighborhood was set to 15, the minimum distance of the points in the latent space was set to 0.1.These parameters varied from the within-subject semi-supervised UMAP learning because the data variability within-subject is assumed to be smaller than the group data, while we aim to nd commonalities that override individual differences.Therefore, the local neighborhood and the latent space are relaxed to bigger numbers (As a general principle, we experimented with different sets of parameters [within a reasonable range] to achieve the best clustering results from visual inspection).The rest of the parameters were kept at default.We further employed a hierarchical-density clustering algorithm to de ne the boundaries of the embedding clusters against the background noise.Finally, we investigated the centroid and averaged the presentation of the data in each cluster to verify that the clustering preserved meaningful and distinct features within the data.
To ensure the data-driven clusters are generalizable and meaningful, we fed the resulting embedding to a series of feature evaluation procedures, including using a support vector classi er to test their generalizability to the ¼ unseen testing data.We showed that the embedding for separating the cortical and thalamus evoked responses has 75.9% predictability to the unseen data, much higher than chance rate (25%).Especially, the clusters of thalamus and cortex stimulated data are mostly well preserved, suggesting these two types of connectivity show most distinctive features as encapsulated by their power and ITPC spectrograms.We further employed a hierarchical-density clustering algorithm 55 to de ne the boundaries of the embedding clusters against the background noise, using the HDBscan toolbox (https://hdbscan.readthedocs.io/en/latest/).We then investigated the averaged spectrograms (i.e. the cluster centroid) for each cluster, for a sanity check of the clustering results.

Cluster-based permutation test
Category-speci c spectral features were determined by cluster-based permutation signi cance testing: ( were fed into a group-level one-sample T-test for nal inference.The testing was performed using the toolbox MNE-python (v1.5.0) (https://mne.tools/stable/index.html).The number of permutations was set to 5000, the initial cluster forming threshold was bigger than 6 t-scores, and the cluster-wise permutation con dence interval was set to 99% (i.e., P cluster < 0.01).

Neural feature decoding
Based on the cluster-based permutation test, we identi ed three distinct neural features (i.e., signi cant clusters) that are speci c to each anatomical category of interest (i.e., UMAP localizers).They are represented by the power and ITPC patterns in the window of ( [10, 50] ms of the COR-ipsi category (averaged across all instances in the corresponding category), in the window of [70, 160] ms of the CORcontr category, and the [165, 280] in the THAL-contr category.The temporal-frequency relationships in these time ranges and anatomical categories were considered as neural features.They were decoded in each instance of stimulated/record pairs.To do this, we utilized the spectral information from the signi cant cluster in the group-level spectrograms as a template.We employed a sliding-window crosscorrelation to match this template with the speci c instances of the spectrograms, point by point in time.
Different similarity (or distance) measures were experimented and turned out to produce similar results; therefore, we used the Pearson correlation r for its easy interpretability.The resulting timeseries of r has a samping rate of 200 per second, which is the same as the spectral dataset.
A sliding-window cross-correlation with the 3 identi ed neural features generated three temporal similarity curves for each stimulating-recoding instance, each corresponding to the neural feature's temporal evolvement in this speci c pair of stimulated/recorded sites.With the temporal similarity curve, we detected the peak similarity and the time to the peak using MATLAB " ndpeak" function.Instances within each anatomical area were averaged for brain regional inferences.

General framework of statistical testing:
The datasets in the present study generally have a nested structure such that each condition consists of multiple electrode contacts, each with a brain area and subject identity; therefore, the mixed linear model was used as a general testing framework for model tting, using the brain area and subject identi ers as nested random factors.Therefore, the comparisons were always made within the same stimulation/recording channels, brain areas and subjects, whenever possible.Practical model setups can vary slightly according to data structures and tested hypotheses.For signi cance inference, the model comparison approach with a hypothesis testing framework was adopted.Namely, a null model is compared with an alternative model which has the hypothesized effect added on top of the null model.
The model with higher evidence is selected, based on a likelihood-ratio test.We additionally provided the Akaike information criterion (AIC) and Bayesian information criterion (BIC) in our supplementary table of post-hoc tests where spurious signi cant results may arise given the large numbers of tests, even after the false-discovery-rate (FDR) correction.The model ttings and selections were conducted using the lme4 toolbox and the anova function in R. The model speci cations were provided along with the signi cance tables in the supplementary materials.

Declarations
Figures   reduction of the evoked spectral information of all pair-wise evoked potentials in the activated cluster (i.e., pairs with underlying connections).The spectral information is the concatenated power and inter-trial phase coherence (ITPC) spectrograms of the evoked potentials, which are down-sampled to balance the number of datapoints for earlier and later neural responses (see Methods).While this gure is not intended to show distinct clusters, we have colored the dots (a posteriori) to illustrate that data from ipsilateral thalamic stimulations (red dots) are already distinguishable upon a visual inspection.Notably, the color patterns follow different stimulation sites but for recording sites (also see SI Fig. S3 a,c,d).They are not biased by speci c subjects (SI Fig. S4).(e)Neural feature encoding involved two main steps: UMAP supervised learning and cluster-based permutation testing.The input data was the evoked power and ITPC spectrograms of the group-level whole-brain evoked potentials, while the dependent variable (i.e., the labels) were the ipsilateral cortical (COR-ipsi), ipsilateral thalamic (THAL-ipsi), contralateral cortical (COR-contr), and contralateral thalamic (THAL-contr) pairs.The algorithm successfully characterized these evoked spectrograms into the four categories.HDBscan was used to formally de ne the clusters in the embedding space, dissociating them from noisy channels.Original spectrograms of the pairs clustered in the four categories were then used to perform cluster-based permutation testing, by which we identi ed three time-frequency clusters within the spectrograms that were speci c to each category (i.e., Neural Feature).Before statistical testing, the evoked power spectrograms of the four stimulation sites already show distinct features that can be distinguishable by visual inspection alone (marked by white circles and numbered).For THAL->THAL (contra) connections, no signi cant cluster was generated here as there were not enough subjects (n<10) to provide su cient statistical power for the mixed-model testing.However, a signi cance test can be done at the level of electrode contacts (i.e., without considering the grouping factor of "subject").This alternative analysis showed a consistent pattern with the subject-averaged spectrogram, and resulted in two signi cant clusters, one in the gamma band before 60ms, another in the alpha/theta band around 100-275 ms (Fig. S8).Acronym example "THAL-THAL (ipsi)" on the subtitle means ipsilateral pairs stimulating from the thalamus and recording from the thalamus.
Electrophysiological neural features indicating different types of connectivity and illustration of the decoding process.(a) Neural features speci ed by the spectra-temporal information.The time windows of the neural features are circumscribed by the signi cant clusters distinct among conditions from the previous cluster-based permutation testing.The time-frequency relationships in the three features are represented by the group-averaged spectrograms of the COR-ipsi, COR-contr and THAL-contr stimulations, as these three categories show clearest and non-overlapped signi cant clusters at the group level.The frequency axis is in log-scale.The time axis is in a natural scale with even samples, different from the time axis in Figure 3.The values of power and ITPC spectrograms are transformed (logarithmized and squarerooted, respectively) to approximate Gaussian distributions, and then zscored (in both time and frequency directions) to be comparable among connections.Line plots to the right of the spectrograms show the (normalized and log-scaled) spectral density of the power and ITPC during the signi cant time window, with colored lines for all the time points, and black for the time average.Since the evoked spectrograms were baseline-corrected, depicted curves can be seen as the "bumps" on top of the 1/f background noise.Group-averaged evoked potentials at the temporal domain corresponding to each category of the spectrograms are shown below.(b) Cross-correlation with a sliding window approach.To reveal the presence of each feature in individual connections, the feature information (i.e. the short-lasting spectral information characterized by the power and ITPC in the time-frequency window) is correlated to the spectrograms with Person-correlation r.To examine this correspondence at every time point (sampling rate is 200 per second), an over-lapping sliding window approach is used, whereby the "transient" timefrequency information at each time point is examined while the examining window is sliding over the spectrograms.This generates the dynamic appearance of the feature presence over time.The line plot in the middle is an example case randomly selected form the THAL-contr instances.Every point on the curve, with a speci c time in x-axis and similarity measure in y-axis, indicates how similar the current spectral information (power and ITPC sustaining in certain amount of time) matches the neural feature in concern.The maximum r (r*) and the time to r* (latency) was taken as indices of feature representation and analyzed in the subsequent analyses.The heatmap to the right shows the correlation between all the THAL-contr instances and the Feature 3 (F3).An overview of feature presence in all categories of all features is presented in Fig. S9.
. In this comparison, the hippocampus (HPC) was an exception among all other examined brain regions: representation of F1 signals evoked by HPC and recorded in anterior and posterior thalamic sites was stronger than the representation of F3 signals recorded in HPC and evoked by the thalamus, with pair-wise comparisons for the same sites in the regions of interest (Dz = 0.30, t = 2.51, p FDR-corr =0.04, n-conn.= 190, n-sbj.= 21 for antTH; Dz = 0.62, t = 4.72, p FDR-corr < < 0.01, n-conn.= 194, n-sbj.= 18 for pstTH) (Table

Figure 5 The
Figure 5