Time-resolved smoothness of distributed brain activity tracks conscious states and unifies emergent neural phenomena


 Much of systems neuroscience posits that emergent neural phenomena underpin important aspects of brain function. Studies in the field variously emphasize the importance of distinct emergent phenomena, including weakly stable dynamics, arrhythmic 1/f activity, long-range temporal correlations, and scale-free avalanche statistics. Few studies, however, have sought to reconcile these often abstract phenomena with interpretable properties of neural activity. Here, we developed a method to efficiently and unbiasedly generate model data constrained by interpretable empirical features in long multiregional neurophysiological recordings. We used this method to ground several major emergent neural phenomena to time-resolved smoothness, the correlation of distributed brain activity between adjacent timepoints. We first found that in electrocorticography recordings, time-resolved smoothness closely tracked transitions between conscious and anesthetized states. We then showed that a minimal model constrained by time-resolved smoothness, variance, and mean, captured dynamical and statistical emergent neural phenomena across modalities and species. Our results thus decouple major emergent neural phenomena from network mechanisms of brain function, and instead couple these phenomena to spatially nonspecific, time-resolved changes of brain activity. These results anchor several theoretical frameworks to a single interpretable property of the neurophysiological signal and, in this way, ultimately help bridge abstract theories of brain function with observed properties of brain activity.


Introduction
Coordinated neuronal interactions give rise to intricate patterns of distributed brain activity.
Much of systems neuroscience seeks to understand the spatiotemporal structure of these patterns. Studies often seek to explain this structure in terms of emergent neural phenomena, spatiotemporal patterns of brain activity that reflect complex interactions of individual neurons or brain regions. Many of these phenomena are situated within longstanding theoretical frameworks (Breakspear, 2017; Buzsáki, 2004; He, 2014; S. Palva & Palva, 2018). One such framework centers on the study of dynamical phenomena (Breakspear, 2017; Cunningham & Byron, 2014; Deco et al., 2011). Dynamical phenomena show up as trajectories in phase-space embeddings of neuronal (Churchland et al., 2012; Kato et al., 2015) or regional (Shine et al., 2019) recordings. These embeddings provide a convenient geometric representation of distributed brain activity (Saxena & Cunningham, 2019). Moreover, the smoothness of dynamical trajectories can endow dynamical activity with robustness to high-dimensional noise (Trautmann et al., 2019).
A third framework, the theory of neural criticality, lies at the intersection of the first two, and seeks to explain both dynamical and statistical emergent phenomena. This framework proposes that neural dynamics critically balance on the border of stable and unstable states (Bialek, 2017; Magnasco et al., 2009). Critical dynamics show highly varying responses to perturbations (Cocchi et al., 2017) and give rise to scale-free power spectra, scale-free avalanche statistics, and long-range temporal correlations (Beggs & Timme, 2012; Chialvo, 2010). Criticality has many theoretically appealing properties, including optimized detection (Kinouchi & Copelli, 2006; Shew et al., 2009) and transformation (Beggs & Plenz, 2003, 2004) of sensory input.

Collectively, these theoretical frameworks underpin a translational literature that seeks to explain the nature of brain function (Breakspear, 2017; He, 2014) and dysfunction (Voytek & Knight, 2015; Zimmern, 2020). Studies within this literature tend to associate healthy brain activity with optimally structured emergent phenomena, including low-dimensional dynamics (Saggar et al., 2018), scale-free statistics, and neural criticality (Ezaki et al., 2020). Similarly, they tend to associate deviations from this optimality with clinically relevant changes in brain activity, including disrupted neuronal excitability (Ponce-Alvarez et al., 2018), loss of consciousness (Solovey et al., 2015), and seizures (Maturana et al., 2020).
However, despite this large and growing body of existing literature, the function and importance of many emergent neural phenomena remain contested or controversial (Beggs & Timme, 2012; Destexhe & Touboul, 2021; Stumpf & Porter, 2012). For example, there is a lack of experimental evidence that reconciles weakly stable dynamics and scale-free statistics with stable and scale-specific circuit mechanisms of cognition or behavior (Jonas & Kording, 2017; Krakauer et al., 2017). In contrast, most support for these and other emergent phenomena comes in three main forms. First, many studies confirm these phenomena against weak null models, typically generated with naive shuffling of empirical data. Such tests do not generally constitute strong evidence of function or importance (Clauset et al., 2009; Stumpf & Porter, 2012). Second, many studies report correlations between emergent phenomena and aspects of brain function (Beggs & Plenz, 2003; He et al., 2010; Solovey et al., 2015). These studies are interesting, but often do not fully account for confounding explanations. Third, many studies explain the origin or nature of emergent phenomena in computational models (Bedard et al., 2006; Chaudhuri et al., 2018; Haimovici et al., 2013; Touboul & Destexhe, 2017). These studies, while also interesting, often require additional generative assumptions that can be difficult to validate. Collectively, the combined presence of strong claims about the nature or importance of emergent neural phenomena, and the absence of similarly strong supporting evidence for these claims, has resulted in an extensive literature of intriguing but inconclusive findings.

Here, we make major steps to resolve this impasse by advancing the approach of surrogate data.
This approach evaluates phenomena of interest in model datasets constrained by predefined empirical features, and is increasingly adopted to evaluate features of brain network or population activity (Elsayed & Cunningham, 2017; Rubinov, 2016). The approach has several important strengths. First, in contrast to tests against weak null models, it admits strong competing explanations. Second, in contrast to correlational studies, it closely controls for confounds by a priori constraining only a predefined set of features. Finally, in contrast to computational modeling studies, it makes no mechanistic assumptions, beyond assuming the importance of constrained features.
Against these strengths, computational cost is a major weakness of this approach. More specifically, the generation of highly constrained model data in large recordings can often be prohibitively slow. Here, we overcame this problem by developing a fast and unbiased method for generating model data with a range of interpretable empirical features. In long neurophysiological recordings of distributed brain activity, our method outperforms standard constrained randomization (shuffling) methods by orders of magnitude. This method allowed us to rigorously evaluate emergent neural phenomena in multiregional recordings with hundreds of thousands, and in some cases millions, of timepoints across species and modalities.
We used this method to ground several major dynamical and statistical neural phenomena to time-resolved smoothness, the correlation between patterns of distributed activity at adjacent timepoints. First, we found that in electrocorticography recordings, time-resolved smoothness reliably tracked brain-state changes across consciousness and anesthesia. Second, we went beyond correlations to show that a minimal model informed by time-resolved smoothness, variance, and mean captured major dynamical and statistical emergent neural phenomena across species and modalities. Finally, we showed that five alternative models of the neurophysiological signal, including models with many more constraints, could not explain emergent phenomena to the same extent. Collectively, our results anchor several disparately studied emergent neural phenomena to a single and interpretable property of the neurophysiological signal. We conclude with a discussion of ways by which these results bridge theoretical frameworks with empirical observations, and thus drive progress in systems neuroscience.

Results

Table 1 summarizes the properties of all our studied datasets. We focused our analysis and modeling on recordings with intracranial EEG modalities, namely stereotactic EEG and electrocorticography. Intracranial EEG recordings are optimally suited for the study of emergent phenomena by their combination of millimeter-scale spatial resolution, millisecond-scale temporal resolution, and distributed cortical coverage (Parvizi & Kastner, 2018). In practice, we analyzed and modeled datasets that comprised hundreds of thousands, and in some cases millions, of timepoints recorded from around 100 depth (stereotactic EEG) or grid (electrocorticography) cortical electrode channels. In addition to these recordings, we also followed the recent literature to evaluate critical emergent phenomena in light-sheet calcium imaging of fictively swimming zebrafish (Ponce-Alvarez et al., 2018).
The spatiotemporal resolution of light-sheet calcium imaging data provides a unique opportunity to model brain-wide activity with cellular resolution (Ahrens et al., 2013). Nonetheless, the large number of neurons, the small number of timepoints, and the low temporal resolution of these data presented unique modeling challenges, as we discuss below.

Time-resolved smoothness tracks conscious states
We defined time-resolved smoothness as corr(v_i, v_{i+lag}), where corr denotes the Pearson correlation coefficient, v_i denotes a vector of neuronal or regional activity at time i, and lag denotes an integer number of timepoints. Figure 1a-d shows the distribution of this quantity for immediately adjacent timepoints (lag = 1), grouped separately for each species and modality.
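This definition can be made concrete with a short sketch (Python/NumPy rather than the paper's MATLAB; the test signals below are illustrative toy data, not the paper's recordings):

```python
import numpy as np

def time_resolved_smoothness(v, lag=1):
    """Correlate distributed activity patterns at timepoints i and i + lag.

    v : array of shape (n_nodes, n_timepoints).
    Returns an array of t - lag Pearson correlations, one per timepoint pair.
    """
    n, t = v.shape
    a = v[:, :t - lag]          # activity vectors v_i
    b = v[:, lag:]              # activity vectors v_{i+lag}
    a = a - a.mean(axis=0)      # demean over nodes at each timepoint
    b = b - b.mean(axis=0)
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return num / den

# Slowly varying signals yield correlations near 1; white noise yields
# correlations near 0.
rng = np.random.default_rng(0)
smooth = np.cumsum(rng.standard_normal((100, 5000)), axis=1)  # random walks
noise = rng.standard_normal((100, 5000))
print(np.median(time_resolved_smoothness(smooth)))  # close to 1
print(np.median(time_resolved_smoothness(noise)))   # close to 0
```

Note that the correlation is computed across nodes at a fixed pair of timepoints, which is what makes the measure agnostic to the identity of individual nodes.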

The figure shows that time-resolved smoothness was unimodally distributed close to its maximal value of 1, reflecting gradual changes in distributed brain activity over time.

The approach of surrogate data can be prohibitively slow for nontrivial constraints in large datasets. Here, we used insights from linear algebra to develop an unbiased and efficient nullspace sampling method for generating model timeseries with a range of interpretable constraints.
The Methods section describes our method in considerable detail. We adopted this method, alongside more standard phase and frame randomization methods (Beggs & Plenz, 2003; Prichard & Theiler, 1994), to define six competing models of the neurophysiological signal.
For convenience, we refer to each model using an italicized shorthand. Below, we summarize the properties of each model. We also list the number of model constraints in terms of the number of nodes n, the number of timepoints t, and the percentage of total data elements.

Model data generated using the nullspace sampling method

Model data generated using phase or frame randomization methods

5. power spectrum: more than (1/2)nt (50%) constraints. Model data constrained by the power spectrum of each node, and by all-pairwise node cross-correlations. This highly constrained model is often used to evaluate the presence of dynamic correlations in the neuroimaging literature (Liegeois et al., 2017). We found that, in spite of its many constraints, this model did not recapitulate all evaluated emergent phenomena across datasets.
6. shuffled frames: more than nt (100%) constraints. Model data constrained by shuffling whole frames (timepoints) of empirical data (Figure 1h-i).

Pairwise node correlations reflect, indirectly, mechanistic network interactions that underpin cognition and behavior (S. Palva & Palva, 2012). Notably, no model data, apart from smoothness model data, captured empirical time-resolved smoothness. These observations imply that time-resolved smoothness reflects spatially nonspecific changes of brain activity, and that these changes are distinct from spatially specific network mechanisms.
To further interpret these changes, we examined the dynamics of time-resolved smoothness in four electrocorticography recordings of macaque monkeys across consciousness and propofol anesthesia (Nagasaka et al., 2011). Previous work has shown that emergent properties of brain activity, including the slope of the 1/f power spectrum (Colombo et al., 2019; Gao et al., 2017; Lendner et al., 2020), as well as dynamical stability (Alonso et al., 2019; Solovey et al., 2015), separated consciousness and anesthesia in these recordings. Figure 2 shows that time-resolved smoothness dropped within minutes after propofol injection (Schnider et al., 1999). Finally, the smaller relative drop of time-resolved smoothness in macaque 2 (Figure 2c-d) matched previous observations that this monkey was not fully anesthetized (Alonso et al., 2019). Collectively, these results show that time-resolved smoothness can robustly track gross changes in brain activity across conscious states.

Time-resolved smoothness is sufficient to explain emergent dynamics
An important advantage of our approach is the ability to go beyond correlations and evaluate the sufficiency of basic constraints to explain emergent phenomena. We first evaluated the sufficiency of model data to explain dynamical stability, an emergent dynamical phenomenon (Deco et al., 2011). We followed previous work to estimate dynamical stability using the eigenvalues of autoregressive models fitted to the data. Theory predicts that interesting, weakly stable dynamics have largest eigenvalues with magnitudes that approach 1. Conversely, less interesting, strongly stable dynamics have largest eigenvalues with smaller magnitudes. Previous stability analyses of monkey electrocorticography recordings found that consciousness associates with weakly stable dynamics, and propofol anesthesia associates with strongly stable dynamics (Alonso et al., 2014; Alonso et al., 2019; Solovey et al., 2015).
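The published analyses follow their own estimation pipelines; a generic sketch of the underlying idea, fitting a first-order autoregressive model by least squares and inspecting eigenvalue magnitudes (toy parameters, not the paper's estimator), looks like this:

```python
import numpy as np

def stability_eigenvalues(v):
    """Least-squares fit of v[:, i+1] = B @ v[:, i]; return |eigenvalues| of B.

    Magnitudes near 1 indicate weakly stable (slowly decaying) dynamics;
    smaller magnitudes indicate strongly stable (quickly decaying) dynamics.
    """
    past, future = v[:, :-1], v[:, 1:]
    B = future @ np.linalg.pinv(past)   # least-squares solution of future = B @ past
    return np.abs(np.linalg.eigvals(B))

rng = np.random.default_rng(1)
n, t = 20, 20000
largest = {}
for rho in (0.99, 0.5):  # weakly vs strongly stable linear dynamics
    x = np.zeros((n, t))
    for i in range(t - 1):
        x[:, i + 1] = rho * x[:, i] + rng.standard_normal(n)
    largest[rho] = np.max(stability_eigenvalues(x))

print(largest[0.99], largest[0.5])  # near 0.99 and near 0.5, respectively
```

The recovered eigenvalue magnitudes track the decay rate of the simulated dynamics, which is the quantity the stability analyses above compare across conscious states.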
Figure 3a-c shows that the smoothness model, and no other evaluated models, reproduced empirical changes in dynamical stability across consciousness and propofol anesthesia. We quantified dynamical stability by the median magnitude of the 5% largest eigenvalues.

Time-resolved smoothness is sufficient to explain scale-free statistics

We next evaluated the extent to which model data accounted for scale-free statistical phenomena. We considered two commonly studied signatures of these phenomena: 1/f power-frequency relationships and long-range temporal correlations. 1/f denotes the decay of spectral power as a function of frequency. This scale-free, or arrhythmic, power-frequency relationship is distinct from neuronal oscillations, which represent narrow peaks in frequency spectra. This relationship is typically represented by the slope of the 1/f power spectrum.

We next considered long-range temporal correlations, emergent phenomena that track the statistical self-similarity of long timeseries. Typically, these correlations are defined by the scaling exponent of mean fluctuations in the amplitude envelope in alpha or beta frequency bands, using detrended fluctuation analysis, a method especially applicable to nonstationary timeseries (Peng et al., 1995). The scaling exponent between window sizes and mean fluctuations indexes the presence of statistical self-similarity. Values of this exponent considerably larger than 0.5 denote interesting, self-similar (fractal-like) structure. In contrast, values close to 0.5 denote less interesting, random (white-noise-like) structure (Linkenkaer-Hansen et al., 2001). We found that the smoothness model captured both 1/f scaling and long-range temporal correlations across datasets.
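Detrended fluctuation analysis can be sketched minimally as follows (Python/NumPy; window sizes and test signals are illustrative, and the paper applies the method to band-limited amplitude envelopes rather than raw signals):

```python
import numpy as np

def dfa_exponent(signal, window_sizes):
    """Detrended fluctuation analysis of a 1-D signal.

    Returns the scaling exponent relating window size to mean fluctuation:
    ~0.5 for white noise, larger values for self-similar signals.
    """
    profile = np.cumsum(signal - signal.mean())   # integrated signal
    fluctuations = []
    for w in window_sizes:
        n_win = len(profile) // w
        segs = profile[:n_win * w].reshape(n_win, w)
        x = np.arange(w)
        rms = []
        for seg in segs:                          # linearly detrend each window
            coef = np.polyfit(x, seg, 1)
            resid = seg - np.polyval(coef, x)
            rms.append(np.sqrt(np.mean(resid ** 2)))
        fluctuations.append(np.mean(rms))
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return slope

rng = np.random.default_rng(2)
windows = [16, 32, 64, 128, 256]
alpha_white = dfa_exponent(rng.standard_normal(2 ** 15), windows)
alpha_walk = dfa_exponent(np.cumsum(rng.standard_normal(2 ** 15)), windows)
print(alpha_white, alpha_walk)  # near 0.5 and near 1.5, respectively
```

The log-log slope between window size and mean residual fluctuation is the scaling exponent discussed in the text.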

Time-resolved smoothness is sufficient to explain avalanche statistics
Neural avalanches are transient periods of coordinated activity between groups of neurons or brain regions. The study of avalanche dynamics plays a prominent role in the theory of neural criticality. One commonly studied phenomenon in this literature is the power-law scaling of avalanche size and duration distributions. Nonetheless, studies have also emphasized that this scaling may not be strongly indicative of critical dynamics (Touboul & Destexhe, 2010), and recent work has focused on more specific signatures of criticality, including power-law relationships between size and duration exponents (Friedman et al., 2012), as well as shape collapse, a universal scaling relationship of avalanche phase with activity (Friedman et al., 2012). Theory predicts that at criticality, collapsed temporal profiles of avalanches will converge to a universal shape independent of avalanche duration (Beggs & Timme, 2012; Friedman et al., 2012; Sethna et al., 2001).

We first evaluated the presence of avalanche phenomena in all intracranial EEG datasets (Figure 5). The smoothness model reproduced universal shape collapses of empirical data. All other models lacked universal shape collapses. These results imply that basic time-resolved constraints are sufficient to generate data that pass stringent tests of criticality.
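Avalanche detection conventions vary across studies (threshold, binning, and activity definitions all differ); one common scheme, sketched here with illustrative toy parameters, thresholds z-scored activity and treats contiguous runs of suprathreshold network activity as avalanches:

```python
import numpy as np

def detect_avalanches(v, threshold_sd=3.0):
    """Detect avalanches as contiguous runs of suprathreshold network activity.

    v : array of shape (n_nodes, n_timepoints).  A node-level event is a
    timepoint where |z-scored activity| exceeds threshold_sd; an avalanche is
    a run of consecutive timepoints containing at least one event.  Size is
    the total number of events in the run; duration is the run length.
    """
    z = (v - v.mean(axis=1, keepdims=True)) / v.std(axis=1, keepdims=True)
    events = np.abs(z) > threshold_sd              # node-level events
    active = events.any(axis=0).astype(int)        # network-level activity
    padded = np.concatenate(([0], active, [0]))    # find run boundaries
    starts = np.flatnonzero(np.diff(padded) == 1)
    ends = np.flatnonzero(np.diff(padded) == -1)
    sizes = np.array([events[:, s:e].sum() for s, e in zip(starts, ends)])
    durations = ends - starts
    return sizes, durations

rng = np.random.default_rng(3)
sizes, durations = detect_avalanches(rng.standard_normal((50, 10000)))
print(len(sizes), sizes.max(), durations.max())
```

The resulting size and duration samples are the raw material for the power-law fits, exponent relationships, and shape-collapse tests described above.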

Time-resolved smoothness of brain-wide calcium imaging data
Finally, we sought to evaluate emergent phenomena in calcium imaging recordings of fictively swimming zebrafish. These data provide an unmatched spatiotemporal resolution of brain-wide cellular activity (Ahrens et al., 2013). At the same time, the distinct nature of these data presents several methodological challenges (Table 1). Specifically, in contrast to intracranial EEG recordings, brain-wide calcium imaging recordings were imaged at a relatively low frequency (about 3 Hz), had many more nodes (about 100 thousand neurons), and many fewer timepoints (about 6 thousand). The low sampling rate limited our evaluation to critical emergent phenomena, while the large number of nodes made nullspace sampling infeasible for highly constrained models such as the smoothness model.
These caveats aside, our modeling results on these datasets were generally in line with results on intracranial EEG data. Avalanche exponents of the smoothness model approximated empirical values, exponents of most other models deviated to a larger extent from those values, and shape collapse results were largely in line with those observed on intracranial EEG datasets (see Figure 6, and Supplementary Figures 31-32 for a summary of all results).

Discussion
A major aim of systems neuroscience is to understand how collective neural interactions give rise to brain function. A burgeoning literature in the field has proposed, over the last two decades, that several emergent neural phenomena enable the emergence of such function. The growth of this literature, however, has not been matched by a corresponding growth in conceptual and methodological advances for reconciling high-level neural phenomena with known properties of the neurophysiological signal. One key hurdle for such advances has been the difficulty in scaling methods of surrogate data to long multiregional recordings.
Here, we developed a new method to generate surrogate data constrained by interpretable empirical properties in multivariate timeseries with hundreds of thousands, and even millions, of timepoints (Table 1). This method allowed us to rigorously ground, for the first time, several prominent emergent neural phenomena to an interpretable property of the neurophysiological signal. Our main results (Figures 3-6) show that empirical constraints of time-resolved smoothness, variance, and mean recapitulated all evaluated emergent phenomena across modalities and species. Importantly, a model informed by these constraints outperformed competing models with many more constraints, including models informed by full power spectra and cross-correlation structure.

The nature of time-resolved smoothness offers strong clues to the nature of our evaluated phenomena. Notably, this metric is agnostic to properties of individual nodes (regions or cells), and therefore cannot, by definition, capture mechanistic network interactions. Figure 1 illustrates this effect as the lack of correlation structure for timeseries constrained by time-resolved smoothness. Put another way, knowledge of time-resolved smoothness tells us little about how specific network elements interact to cause brain function. Importantly, this does not imply that past discoveries of emergent neural phenomena are false. On the contrary, we were able to robustly replicate these discoveries across modalities. Instead, this implies that, by the principle of scientific parsimony (Occam's razor), it may be needlessly complicated to imbue emergent neural phenomena with function that transcends properties of time-resolved smoothness.
What then is the relevance of emergent neural phenomena for brain function? Our results offer one plausible answer by closely linking time-resolved smoothness to global changes of distributed brain activity during propofol anesthesia (Figure 2). We hypothesize that more subtle changes of time-resolved smoothness, and the accompanying emergent phenomena, will likewise track subtle neuromodulatory changes associated with sensory, cognitive, and motor function (Bargmann, 2012; Marder, 2012). Indeed, it may be possible to test this falsifiable hypothesis with future high-speed multicellular imaging data that can capture subtle changes in distributed activity across brain states (Demas et al., 2021; Stringer et al., 2019).
More generally, our results offer to unify and simplify existing theoretical frameworks in systems neuroscience in several ways. First, these results provide a roadmap for future neurobiological explanations of emergent phenomena. Specifically, we propose that the discovery of molecular, cellular or circuit mechanisms of time-resolved smoothness will lead to the corresponding discovery of mechanisms that underpin a range of emergent phenomena.
Prime targets for such mechanisms include the action of widely projecting subcortical systems that modulate cortical excitability and arousal (Avery & Krichmar, 2017; Jones, 2005; Lee & Dan, 2012). Second, our results provide a new perspective on regional variation of scale-free activity (Gao et al., 2017; He et al., 2010; Lendner et al., 2020). This variation is related, although not equivalent, to the notion of a hierarchy of cortical timescales (Chaudhuri et al., 2015; Gao et al., 2020; Honey et al., 2012; Murray et al., 2014; Raut et al., 2021). Our results clarify the meaning of this regional variation by noting that time-resolved smoothness of distributed interactions can automatically give rise to scale-free activity. This finding suggests that scale-free regional variation can be more intuitively interpreted as the extent of regional contribution to changes in distributed activity, without the need to invoke high-level scale-free phenomena. Third, our results have considerable implications for a growing translational literature that centers on the study of emergent phenomena (Zimmern, 2020). Much of this literature seeks to find alterations in distinct emergent neural phenomena across healthy and diseased brain states (Maturana et al., 2020; Ponce-Alvarez et al., 2018; Solovey et al., 2015). Our findings show that a single interpretable property of the neurophysiological signal can replace, in this literature, a battery of high-level clinical biomarkers. In this way, our findings can accelerate translation by motivating future discovery of mechanisms that affect time-resolved smoothness across clinical states.
Finally, we stress that our discovery was ultimately made possible by the development of a new method to generate surrogate data of long multiregional recordings (Elsayed & Cunningham, 2017; Rubinov, 2016). As systems neuroscientists encounter bigger and more highly resolved neurophysiological datasets, the development of scalable modeling methods will become increasingly important for rigorous analysis and modeling of these data. Indeed, we hope that future adoption of our method will allow investigators to comprehensively evaluate other aspects of multiregional recordings across brain states. Ultimately, we consider that widespread adoption of this and similar methods will be necessary to enable unified and cohesive explanations of distributed brain structure and activity.

Methods

Generation of empirically constrained model data

Definition of time-resolved smoothness
We defined time-resolved smoothness as,

c_i = \frac{\sum_j (v_i^j - \bar{v}_i)(v_{i+\mathrm{lag}}^j - \bar{v}_{i+\mathrm{lag}})}{\sqrt{\sum_j (v_i^j - \bar{v}_i)^2 \, \sum_j (v_{i+\mathrm{lag}}^j - \bar{v}_{i+\mathrm{lag}})^2}},

where v_i^j denotes the activity of node j at time i, \bar{v}_i denotes the mean activity of all nodes at time i, and lag is an integer.
It is easy to show that average time-resolved smoothness is related to average node autocorrelation as follows,

\frac{1}{t - \mathrm{lag}} \sum_i \mathrm{corr}(v_i, v_{i+\mathrm{lag}}) \approx \frac{1}{n} \sum_j \mathrm{corr}(v^j, v^j_{+\mathrm{lag}}), \quad (1)

where v^j denotes the timeseries of node j and v^j_{+lag} denotes its copy shifted by lag timepoints. This equivalence holds because both quantities in equation 1 represent averaged dot products, normalized either over the activity of all nodes at a single timepoint (time-resolved smoothness), or over the activity of a single node at all timepoints (node autocorrelation). At the same time, the non-averaged versions of these quantities are incommensurate because time-resolved smoothness is defined for each timepoint, while autocorrelation is defined for each node.
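This equivalence can be checked numerically (a Python/NumPy sketch; the AR(1) toy data and parameter values below are illustrative assumptions, not the paper's recordings):

```python
import numpy as np

rng = np.random.default_rng(4)
n, t, rho = 200, 20000, 0.8

# Independent AR(1) nodes, each with lag-1 autocorrelation close to rho.
v = np.zeros((n, t))
v[:, 0] = rng.standard_normal(n)
for i in range(t - 1):
    v[:, i + 1] = rho * v[:, i] + rng.standard_normal(n)

# Average time-resolved smoothness: correlate over nodes, average over time.
a, b = v[:, :-1], v[:, 1:]
a = a - a.mean(axis=0)
b = b - b.mean(axis=0)
smooth = (a * b).sum(0) / np.sqrt((a ** 2).sum(0) * (b ** 2).sum(0))

# Average node autocorrelation: correlate over time, average over nodes.
autocorr = [np.corrcoef(v[j, :-1], v[j, 1:])[0, 1] for j in range(n)]

print(np.mean(smooth), np.mean(autocorr))  # both close to rho = 0.8
```

The two averages agree because each is the same collection of lagged products, grouped by timepoint in one case and by node in the other.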

General formulation of nullspace sampling
Our sampling method expresses empirical features of interest as sets of linear equations. Formally, let us define v ∈ ℝ^{n×t} to be the empirical timeseries of n nodes and t timepoints. Let us also define vec(·) to be a vectorization or flattening operator, such that vec(v) ∈ ℝ^{nt×1}.

Let us now consider m features of interest, define b ∈ ℝ^{m×1} to be the empirical values of these features, and define A ∈ ℝ^{m×nt} to represent the coefficient matrix for these features. Formally, we can write,

A vec(x) = b. \quad (2)

Without loss of generality, let us assume that the empirical features of interest are linearly independent (and therefore that the matrix A is full rank). Since the number of these features will, in most cases, be much smaller than the number of data elements (m ≪ nt), we end up with an underdetermined set of linear equations. It follows that we can generate model data by unbiasedly sampling from the solution space of this set.

Linear algebra tells us that this solution space is spanned by orthogonal unit vectors that form the nullspace of the coefficient matrix A (Laub, 2005). Let us define Z ∈ ℝ^{nt×(nt−m)} as this nullspace, such that AZ = 0. We can then express the solution space as,

vec(x) = vec(x_0) + kZq, \quad (3)

where x_0 denotes the minimum-norm solution, k is a scaling constant and q ∈ ℝ^{(nt−m)×1} is a uniformly sampled random weighting vector. Geometrically, each row in A represents an unbounded hyperplane in nt dimensions and the solution space is an (nt − m)-dimensional vector space formed by the intersection of these hyperplanes.
We regularized the space of these solutions by constraining the Frobenius norm of all model timeseries to empirical values. We did this by first setting the norm of q to 1. Uniformly sampling q with |q| = 1 is equivalent to sampling from the surface of the (nt − m)-dimensional hypersphere of radius 1 centered at the origin. This sampling can be achieved by generating vectors from the (nt − m)-dimensional standard normal distribution and rescaling the vectors to have unit norm. This procedure is guaranteed to produce random samples uniformly distributed on the surface of the hypersphere (Smith, 1984). We then used the scaling parameter k to rescale the norm to match the empirical timeseries. To determine the value of k, we note that vec(x_0) = A†b, where ·† denotes the Moore-Penrose pseudoinverse. From the fundamental theorem of linear algebra, we know that vec(x_0) is always orthogonal to Z. Because the two additive components in equation 3 are orthogonal, we can use the Pythagorean theorem to set k = √(|v|² − |x_0|²) and thereby ensure that the Frobenius norm of x equals the norm of v.
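The two ingredients of this regularization, uniform sampling on the hypersphere and the Pythagorean scaling, fit in a few lines (Python/NumPy sketch; the norms below are illustrative numbers):

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 1000

# Uniform sampling on the unit hypersphere: draw from a standard normal
# distribution and rescale to unit norm.  The normal distribution is
# rotationally invariant, so the normalized vector is uniform on the sphere.
q = rng.standard_normal(dim)
q /= np.linalg.norm(q)

# Because vec(x0) is orthogonal to the nullspace component kZq, the norms add
# in quadrature: |x|^2 = |x0|^2 + k^2.  Matching |x| to the empirical norm
# |v| therefore gives k = sqrt(|v|^2 - |x0|^2).
norm_v, norm_x0 = 10.0, 6.0          # illustrative values with |x0| <= |v|
k = np.sqrt(norm_v ** 2 - norm_x0 ** 2)
print(np.linalg.norm(q), k)
```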
Once we set k and q, we can then directly sample datasets x. Because Z is an orthonormal matrix, our sampling is guaranteed to be uniform as long as q is sampled uniformly from the unit hypersphere. We used standard MATLAB functions such as qr and svd to build orthonormal nullspaces. Once we compute the nullspace, it becomes trivial to sample q and generate model data. In summary, nullspace sampling comprises the following steps:

1. Express the empirical features of interest as the linear system A vec(x) = b.

2. Compute the orthonormal nullspace matrix Z of the coefficient matrix A.

3. Compute the minimum-norm solution vec(x_0) = A†b.

4. Uniformly sample the weighting vector q and compute the scaling parameter k.

5. Generate model data using vec(x) = vec(x_0) + kZq.
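These steps can be sketched end-to-end (Python/NumPy rather than the MATLAB implementation; the toy constraint set, per-node means, is an illustrative assumption, not one of the paper's models):

```python
import numpy as np

rng = np.random.default_rng(6)
n, t = 5, 200
v = rng.standard_normal((n, t))

# Toy features of interest: the mean of every node (m = n linear constraints).
A = np.kron(np.eye(n), np.ones((1, t)) / t)   # m x (n*t) coefficient matrix
b = A @ v.reshape(-1)                         # empirical feature values

# Steps 2-3: orthonormal nullspace of A (via SVD) and minimum-norm solution.
_, s, Vt = np.linalg.svd(A)
Z = Vt[len(s):].T                             # (n*t) x (n*t - m), A @ Z = 0
x0 = np.linalg.pinv(A) @ b

# Step 4: q uniform on the unit hypersphere; k from the Pythagorean relation.
q = rng.standard_normal(Z.shape[1])
q /= np.linalg.norm(q)
k = np.sqrt(np.linalg.norm(v) ** 2 - np.linalg.norm(x0) ** 2)

# Step 5: model data satisfying the constraints, with empirical Frobenius norm.
x = (x0 + k * Z @ q).reshape(n, t)

print(np.allclose(x.mean(axis=1), v.mean(axis=1)))       # constraints hold
print(np.isclose(np.linalg.norm(x), np.linalg.norm(v)))  # norm matched
```

The sampled x differs from v everywhere except in the constrained features, which is exactly the behavior a surrogate-data test requires.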

Sequential nullspace sampling with nonlinear constraints
The general formulation of nullspace sampling admits only linear constraints, and produces massive nullspace matrices of close to (nt)² elements. This formulation is of limited use for constraining time-resolved smoothness and other nonlinear constraints in large datasets.
Nonetheless, here we exploited the sequential definition of time-resolved smoothness to allow nullspace sampling with nonlinear constraints in long multiregional recordings. We first expressed empirical timeseries v as a sequence of vectors v_1, v_2, …, v_t, where v_i ∈ ℝ^{n×1} denotes the activity of all nodes at time i. Likewise, we expressed model timeseries x ∈ ℝ^{n×t} as a sequence of activity vectors x_1, x_2, …, x_t, with x_i ∈ ℝ^{n×1}. We then defined correlations between adjacent timepoints as c_1, c_2, …, c_{t−1}, where c_i = (v_i · v_{i+1})/n for standardized (zero-mean, unit-standard-deviation) activity vectors, and · denotes the dot product.

This formulation allowed us to generate model timeseries with correlation constraints sequentially, timepoint by timepoint. Specifically, at each timepoint i, we used model data x_i as the coefficient matrix to generate model data at timepoint i + 1. Without loss of generality, we used the following algorithm to generate x_{i+1} with zero mean and unit standard deviation:

1. For i = 0, sample a random normally distributed vector x_1 with zero mean and unit standard deviation.

2. For i = 1, 2, …, t − 1, sample x_{i+1} by solving,

\begin{pmatrix} \mathbf{1}^\top \\ x_i^\top \end{pmatrix} x_{i+1} = \begin{pmatrix} 0 \\ n c_i \end{pmatrix}, \quad (4)

where 1 denotes the all-ones vector, subject to the norm constraint

|x_{i+1}|^2 = n. \quad (5)

The structure of the coefficient matrix in equation 4 guarantees that x_{i+1} will always have zero mean, and together with equation 5 guarantees that the vector will have unit standard deviation. Model vectors can then be simply rescaled to match the mean and standard deviation of empirical data.
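The sequential algorithm can be sketched as follows (Python/NumPy; the target correlation value and dimensions are illustrative assumptions):

```python
import numpy as np

def sequential_sample(c, n, rng):
    """Generate x_1 ... x_t with zero mean, unit standard deviation, and
    target adjacent correlations c_1 ... c_{t-1}, sampling each timepoint
    from the nullspace of constraints imposed by the previous timepoint.
    """
    x1 = rng.standard_normal(n)
    x1 = (x1 - x1.mean()) / x1.std()          # step 1: standardized start
    xs = [x1]
    for ci in c:
        xi = xs[-1]
        A = np.vstack([np.ones(n), xi])       # zero-mean and correlation rows
        b = np.array([0.0, n * ci])           # 1.x = 0 and x_i.x = n*c_i
        x0 = np.linalg.pinv(A) @ b            # minimum-norm solution
        _, s, Vt = np.linalg.svd(A)
        Z = Vt[len(s):].T                     # orthonormal nullspace of A
        q = rng.standard_normal(Z.shape[1])
        q /= np.linalg.norm(q)                # uniform on the unit hypersphere
        k = np.sqrt(n - np.linalg.norm(x0) ** 2)  # enforce |x|^2 = n
        xs.append(x0 + k * Z @ q)
    return np.array(xs).T                     # shape (n, t)

rng = np.random.default_rng(7)
c = np.full(49, 0.9)                          # target smoothness of 0.9
x = sequential_sample(c, n=200, rng=rng)
adj = [np.corrcoef(x[:, i], x[:, i + 1])[0, 1] for i in range(len(c))]
print(np.allclose(adj, 0.9, atol=1e-8))
```

Because each x_i is zero-mean with |x_i|² = n, the correlation constraint reduces to the single linear equation x_i · x_{i+1} = n c_i, which is what makes the otherwise nonlinear smoothness constraint tractable.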

In practice, this sequential generation of model data required nullspace matrices with approximately n² elements, and scaled linearly with the number of timepoints t. This approach was feasible for long intracranial EEG data (n ≈ 100), where it outperformed standard constrained randomization methods (Elsayed & Cunningham, 2017; Rubinov, 2016) by orders of magnitude. On the other hand, the approach was not feasible for brain-wide cellular calcium imaging datasets (n ≈ 10⁵). For these data, we replaced nullspace sampling with memory-efficient but slow constrained randomization methods.

Implementation of mean and variance constraints
In some cases (mean, varmean, and varmean-clusters models), we constrained model data by empirical mean and variance, but not by smoothness. In such cases, we generated x_i by solving,

\mathbf{1}^\top x_i = 0. \quad (6)

To enforce these constraints within spatiotemporal clusters (varmean-clusters model), we used k-means clustering to group n nodes into spatial clusters, and separately t timepoints into temporal clusters. For long multiregional recordings, we set spatial clusters to have on average 5 nodes, and separately temporal clusters to have an average of 50 timepoints. To prevent an excessive number of clusters in brain-wide cellular calcium imaging recordings, we set spatial clusters in these datasets to have an average of 250 nodes, and temporal clusters to have an average of 100 timepoints.

To further optimize the efficiency of nullspace sampling, we derived an analytical expression for the nullspace Z ∈ ℝ^{n×(n−1)} of the coefficient matrix A = 1ᵀ in equation 6, where a, b and c denote solutions to the following set of equations:

b − a(n − 2) − c = 0

c² + a²(n − 2) + b² − 1 = 0

One can directly verify that this solution is an orthonormal nullspace of the coefficient matrix A = 1ᵀ, such that ZᵀZ = I and AZ = 0. In practice, this analytical formulation allowed us to eschew numerical methods for computing the nullspace in the mean, varmean, and varmean-clusters models.
Phase randomization
We generated phase-randomized model data using a well-known algorithm from the physics and neuroscience literature (Liegeois et al., 2017; Prichard & Theiler, 1994). Briefly, we first used the Fourier transform to compute the phase and amplitude of each timeseries. We then rotated the phase of all nodes at each frequency by the same random complex variable e^{iφ}, with φ uniformly distributed in [−π, π]. Finally, we obtained model timeseries by computing the inverse Fourier transform of these phase-randomized data. This method is guaranteed to generate maximally random data that preserve nodal power spectra and the full cross-correlation structure.
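A minimal sketch of this procedure, assuming real-valued input of shape (nodes × timepoints) and applying the same random rotation to every node at each frequency (which is what preserves the cross-correlation structure):

```python
import numpy as np

def phase_randomize(x, seed=None):
    # x: (n_nodes, t) real-valued timeseries
    rng = np.random.default_rng(seed)
    n, t = x.shape
    xf = np.fft.rfft(x, axis=1)
    phi = rng.uniform(-np.pi, np.pi, size=xf.shape[1])
    phi[0] = 0.0                 # keep the DC component real
    if t % 2 == 0:
        phi[-1] = 0.0            # keep the Nyquist component real
    # same rotation e^{i*phi} for all nodes at each frequency
    return np.fft.irfft(xf * np.exp(1j * phi), n=t, axis=1)
```

Because each node's Fourier amplitudes are untouched, nodal power spectra are preserved exactly; because the rotation is shared across nodes, cross-spectra, and hence cross-correlations, are preserved as well.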

Macaque electrocorticography (Neurotycho)
Data were collected at the Laboratory for Adaptive Intelligence, Brain Science Institute, RIKEN.

Electrocorticography recordings were made from two male macaques before, during, and after administration of propofol anesthesia. Grid electrodes were implanted on the frontal, parietal, temporal, and occipital lobes. Additional experimental details for these data are provided in previous studies (Nagasaka et al., 2011;Yanagawa et al., 2013). These data are publicly available from http://www.neurotycho.org.

Human intracranial EEG
Data were acquired at the University of California Irvine Hospital. Intracranial EEG recordings were made from adult epileptic patients performing a visuospatial working memory task.
Grid and/or depth electrodes were implanted in frontal and medial temporal lobes. Additional experimental details for these data are provided in a previous study.

Human stereotactic EEG
Data were acquired at Vanderbilt University Medical Center. Stereotactic EEG recordings were made from adult epilepsy patients, one day after electrode implantation and before medication wean. The patients were instructed to keep their eyes closed and remain awake for twenty minutes. Depth electrodes were implanted in cortical regions, depending on suspected seizure origin. Additional experimental details for these data are provided in a previous study (Goodale et al., 2020).

Zebrafish calcium imaging
Data were acquired at Janelia Research Campus, Howard Hughes Medical Institute. Light-sheet calcium imaging recordings were made from fictively swimming larval zebrafish embedded in agarose. The fish swam against a fixed-velocity one-dimensional moving stripe pattern, which represented virtual water flow. The imaging spanned almost all brain neurons expressing a genetically encoded calcium indicator (GCaMP6f). Additional experimental details for these data are provided in a previous study (Mu et al., 2019).

Intracranial EEG
Dynamical stability analyses were performed on recordings sampled at 1000Hz, in accordance with previous work (Alonso et al., 2019; Solovey et al., 2015). All other analyses were performed on intracranial EEG data downsampled to 250Hz. All our results were robust to analysis and modeling of data at 1000Hz, but this higher sampling rate incurred unnecessary computational cost.
All recordings were highpass filtered with a cutoff of 0.5Hz using the bandpassfilter function from the Fieldtrip toolbox (Oostenveld et al., 2011). In addition, macaque electrocorticography data were notch-filtered at 50Hz and 100Hz to remove line noise. Likewise, human stereotactic EEG and intracranial EEG data were notch-filtered at 60Hz and 120Hz to remove line noise.
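The text describes filtering with Fieldtrip's MATLAB functions; an equivalent highpass-plus-notch design can be sketched in SciPy. The cutoffs come from the text, while the filter order and notch Q are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

fs = 250                                              # sampling rate in Hz
b_hp, a_hp = butter(4, 0.5, btype='highpass', fs=fs)  # 0.5 Hz highpass
b_n60, a_n60 = iirnotch(60.0, Q=30, fs=fs)            # 60 Hz line noise
b_n120, a_n120 = iirnotch(120.0, Q=30, fs=fs)         # first harmonic

def preprocess(x):
    # Zero-phase filtering, as is standard for offline analysis.
    for b, a in [(b_hp, a_hp), (b_n60, a_n60), (b_n120, a_n120)]:
        x = filtfilt(b, a, x)
    return x
```

For the macaque data, the notch frequencies would instead be 50Hz and 100Hz.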

Calcium imaging
All images were motion corrected, and cells were segmented in contiguous and overlapping three-dimensional blocks, using non-negative matrix factorization with sparseness constraints.

The resulting demixed and denoised cell segmentations showed high signal-to-noise ratio relative to raw pixel timeseries. Additional details of segmentation are provided in a previous study (Mu et al., 2019). Our segmentation software is publicly available from https://github.com/mikarubi/voluseg

Dynamical stability analysis
We performed linearized dynamical stability analysis within temporally local time windows.
We first divided the timeseries v ∈ R^{n×t} into non-overlapping windows of length ∆t = 500ms (Solovey et al., 2015). We then fitted a time-invariant vector autoregressive model, representing a linear dynamical system, separately for each window v_k ∈ R^{n×∆t}, where k is a vector of timepoints within a window. The vector autoregressive model is defined as v_{i+1} = A_k v_i + ε_i, where i indexes timepoints within the window and ε_i denotes residual noise. For each window, we first estimated the system matrix A_k ∈ R^{n×n} using least squares (Neumaier & Schneider, 2001), and then computed its eigenvalues z_k ∈ C^{n×1}. These eigenvalues tracked the dynamical stability of the system. Specifically, the presence of all eigenvalues strictly inside the complex unit circle (|z_k^j| < 1 for j = 1, 2, ..., n) implies that the system is asymptotically stable, such that external perturbations will decay exponentially. In contrast, the presence of some eigenvalues outside the complex unit circle (|z_k^j| > 1) implies that the system is unstable and that external perturbations will grow exponentially. Finally, the presence of eigenvalues on the complex unit circle (|z_k^j| ≈ 1) implies that the system is weakly stable and external perturbations will produce diverse responses (Solovey et al., 2015).
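A minimal sketch of the per-window procedure, assuming a first-order model fitted by ordinary least squares (function and variable names are illustrative):

```python
import numpy as np

def window_stability(v, win):
    # v: (n, t) timeseries; win: window length in samples.
    # Returns the largest eigenvalue modulus of the fitted VAR(1)
    # system matrix for each non-overlapping window.
    n, t = v.shape
    out = []
    for s in range(0, t - win + 1, win):
        w = v[:, s:s + win]
        X, Y = w[:, :-1].T, w[:, 1:].T           # predict each step from the last
        A = np.linalg.lstsq(X, Y, rcond=None)[0].T
        out.append(np.abs(np.linalg.eigvals(A)).max())
    return np.array(out)
```

Values near 1 in the returned array indicate weakly stable windows, while values well below 1 indicate asymptotically stable dynamics.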

Power spectral analysis
We used Welch's method to characterize the power spectral density over a relatively large frequency range between 1 and 80Hz. In brief, power spectra were computed for each channel separately, using timeseries segments tapered with Hanning windows of length 30 seconds, and then averaged over all windows.
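With SciPy, this computation can be sketched as follows; the sampling rate matches the downsampled intracranial EEG, and the synthetic test signal is illustrative:

```python
import numpy as np
from scipy.signal import welch

fs = 250                           # sampling rate in Hz
t = np.arange(0, 120, 1 / fs)      # two minutes of data
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.default_rng(0).normal(size=t.size)

# Hanning-tapered segments of 30 s, averaged across segments
freqs, pxx = welch(x, fs=fs, window='hann', nperseg=30 * fs)
peak = freqs[np.argmax(pxx)]       # sits near the 10 Hz oscillation
```

The 30-second segments give a frequency resolution of 1/30 Hz, which is ample for the 1-80Hz range analyzed here.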

There has been significant recent progress in the development of methods for estimating slopes of neural power spectra. Unlike earlier methods, such as coarse-grained spectral analysis (He et al., 2010), which were strictly applicable only to completely aperiodic signals (Wen & Liu, 2016), more recent methods, such as IRASA (Wen & Liu, 2016) and Fitting Oscillations and One Over f (FOOOF) (Donoghue et al., 2020), are able to estimate both periodic and aperiodic signal components.
Here, we adopted FOOOF to estimate the spectral exponent of empirical and model timeseries. Formally, the algorithm models the aperiodic component of the spectrum as P(f) = f_offset · (f_knee + f^e)^{−1}, where P is the power spectrum, f is the frequency of interest, e is the scaling exponent, f_offset is the offset, and f_knee is the knee frequency (Donoghue et al., 2020). More details about the application of this method to the study of neural spectra are provided in a recent study (Waschke et al., 2021). In this study, we set the maximum number of peaks to 6 and the minimum peak height to 0.15.
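For intuition, the exponent e can be sketched as a linear fit in log-log coordinates. This simplified regression assumes a knee-free spectrum with no oscillatory peaks, which the full FOOOF model does not require (it separates periodic components before fitting the aperiodic part):

```python
import numpy as np

def aperiodic_exponent(freqs, power):
    # Simplified estimate of the spectral exponent e, assuming a pure
    # power law P(f) = c * f^(-e) with no knee and no oscillatory peaks.
    slope, _ = np.polyfit(np.log10(freqs), np.log10(power), 1)
    return -slope

freqs = np.linspace(1, 80, 200)
power = 3.0 * freqs ** -2.0          # ideal 1/f^2 spectrum
e = aperiodic_exponent(freqs, power)  # recovers e = 2
```

On real neural spectra with oscillatory peaks or a spectral knee, this naive fit is biased, which is precisely what motivates the full FOOOF decomposition.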

Long range temporal correlation analysis
Following initial preprocessing, we bandpass filtered the data in the alpha (8-12Hz) and beta (20-30Hz) frequency bands, and applied the Hilbert transform to extract time-resolved amplitude at each electrode. We then used detrended fluctuation analysis to estimate amplitude scaling coefficients from these time-resolved amplitude timeseries (Peng et al., 1995). Detrended fluctuation analysis is especially suitable for analysis of neurophysiological timeseries because it is unaffected by nonstationary structure. In practice, given a timeseries y_j ∈ R^{1×t}, we first defined the cumulative sum Y_j(i) = Σ_{τ=1}^{i} (y_j(τ) − ȳ_j), where ȳ_j denotes the mean activity of node j. We then segmented the timeseries into nonoverlapping segments of varying lengths ∆t. We considered a broad range of temporal windows, from 1 second to 3 minutes, in order to robustly capture scaling behavior across multiple timescales.
For each timescale ∆t, we computed F(∆t), the root-mean-square of the detrended cumulative signal Y_j within each segment, averaged over segments. Finally, we fitted a power law of the form F(∆t) = h∆t^α. The value of the exponent α denotes the presence or absence of long-range temporal correlations: a value between 0.5 and 1.0 is indicative of long-range temporal correlations, while values close to 0.5 denote an uncorrelated process.
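The two steps above can be sketched as follows, assuming a one-dimensional amplitude timeseries and linear detrending within segments (names are illustrative):

```python
import numpy as np

def dfa(y, scales):
    # Detrended fluctuation analysis of a 1-D timeseries y.
    # Returns the scaling exponent alpha of F(dt) = h * dt^alpha.
    Y = np.cumsum(y - y.mean())            # cumulative sum of demeaned signal
    F = []
    for s in scales:
        nseg = len(Y) // s
        sq = []
        for i in range(nseg):
            seg = Y[i * s:(i + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # linear detrend
            sq.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(sq)))     # RMS fluctuation at this scale
    return np.polyfit(np.log(scales), np.log(F), 1)[0]
```

For white noise this returns alpha ≈ 0.5, whereas long-range temporally correlated signals yield values between 0.5 and 1.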

Avalanche analysis
Avalanche detection in electrophysiological datasets. In line with previous studies, we binarized electrophysiological datasets by setting to 1 deflections in local field potentials that exceeded three standard deviations below the mean (J. M. Palva et al., 2013; Solovey et al., 2012; Varley et al., 2020). We then detected, at each timepoint, bursts of temporally contiguous coordinated activity in these binarized timeseries. Specifically, a new avalanche is instantiated when at least one of the constituent elements becomes active. The avalanche propagates until none of the constituent elements are active. The size of an avalanche is defined by the number of active constituent elements, and the duration of an avalanche is defined by the period of its activity.
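For binarized timeseries, this burst detection can be sketched as follows; here avalanche size counts active elements summed over the avalanche's timepoints, one common convention:

```python
import numpy as np

def detect_avalanches(b):
    # b: (n_elements, t) binary array; an avalanche is a maximal run of
    # timepoints during which at least one element is active.
    active = b.any(axis=0).astype(int)
    edges = np.diff(np.concatenate(([0], active, [0])))
    starts = np.flatnonzero(edges == 1)    # avalanche onsets
    ends = np.flatnonzero(edges == -1)     # avalanche offsets
    sizes = [int(b[:, s:e].sum()) for s, e in zip(starts, ends)]
    durations = [int(e - s) for s, e in zip(starts, ends)]
    return sizes, durations
```

For example, two elements active as [[0,1,1,0,0,1],[0,0,1,0,0,0]] yield two avalanches, of size 3 and 1 and duration 2 and 1.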
Avalanche detection in calcium imaging datasets. In line with previous studies, we binarized calcium imaging datasets by setting to 1 deflections in calcium concentration that exceeded three standard deviations above the mean (Ponce-Alvarez et al., 2018). The standard method for avalanche detection fails for brain-wide cellular-resolution data because the large number of cells produces a single never-ending avalanche. In line with recent work, we used a method to detect spatially contiguous avalanches from binarized zebrafish calcium imaging timeseries (Ponce-Alvarez et al., 2018). More specifically, we detected clusters of simultaneously active and spatially contiguous neurons. We used the MATLAB function bwconncomp to find these clusters. We then formed neuronal avalanches by detecting clusters that shared cells at adjacent timepoints. In this way, we followed the spatiotemporal structure of each avalanche as it started, propagated, and ended.
Power law fitting to avalanche distributions. We assessed the presence of power-law scaling in avalanche distributions by adopting methods described in (Clauset et al., 2009) and previously implemented in (Rubinov et al., 2011). Specifically, we denoted the probability density function of avalanche size as

p(s) = s^{−τ} / (ζ(τ, s_min) − ζ(τ, s_max + 1)),

with a cumulative distribution function

P(s) = (ζ(τ, s) − ζ(τ, s_max + 1)) / (ζ(τ, s_min) − ζ(τ, s_max + 1)),

where s is avalanche size, τ is the scaling exponent, s_min and s_max are the lower and upper cutoff sizes, and ζ(τ, s) = Σ_{i=0}^{∞} (i + s)^{−τ} is the generalized (Hurwitz) zeta function. The functions incorporate an upper cutoff s_max because all empirical distributions are necessarily bounded by system size (Beggs & Plenz, 2003). We defined the probability distributions for avalanche durations l in exactly the same way. We then estimated the exponents of avalanche sizes and durations using the method of maximum likelihood, as previously described (Clauset et al., 2009; Rubinov et al., 2011).

Exponent scaling and shape collapses. We quantified the power-law relationship between mean avalanche size and avalanche duration. We estimated the slope of this power law using linear regression in log-log coordinates (Friedman et al., 2012). Finally, we quantified avalanche shape collapses by averaging all avalanches of a given duration in a single recording. We rescaled collapsed avalanche shapes to have unit area and avalanche durations to have length 1.
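The maximum-likelihood step can be sketched with SciPy's Hurwitz zeta function (scipy.special.zeta(x, q) = Σ_{k≥0} (k + q)^{−x}). This is a sketch of the estimator only, not the full Clauset et al. pipeline, which additionally selects cutoffs and tests goodness of fit:

```python
import numpy as np
from scipy.special import zeta
from scipy.optimize import minimize_scalar

def fit_powerlaw_exponent(s, s_min, s_max):
    # Maximum-likelihood estimate of tau for the bounded discrete power law
    # p(s) = s^(-tau) / (zeta(tau, s_min) - zeta(tau, s_max + 1)).
    s = np.asarray(s, dtype=float)
    log_s_sum = np.log(s).sum()

    def nll(tau):  # negative log-likelihood, up to a constant
        norm = zeta(tau, s_min) - zeta(tau, s_max + 1)
        return tau * log_s_sum + s.size * np.log(norm)

    return minimize_scalar(nll, bounds=(1.05, 5.0), method='bounded').x
```

Applied to samples drawn from a bounded power law with known exponent, this estimator recovers τ closely; the same function applies unchanged to avalanche durations.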