Data-Driven Control of Neuronal Networks with Population-Level Measurement

Controlling complex networks of nonlinear neurons is an important problem pertinent to various applications in engineering and natural sciences. While in recent years the control of neural populations with comprehensive biophysical models or simplified models, e.g., phase models, has seen notable advances, learning appropriate controls directly from data without any model assumptions remains a challenging and less developed area of research. In this paper, we address this problem by leveraging the network’s local dynamics to iteratively learn an appropriate control without constructing a global model of the system. The proposed technique can effectively regulate synchrony in a neuronal network using only one input and one noisy population-level output measurement. We provide a theoretical analysis of our approach and illustrate its robustness to system variations and its generalizability to accommodate various physical constraints, such as charge-balanced inputs.


Introduction
Large populations of interacting neurons govern the majority of our physiological activities, from circadian and central nervous system rhythms [1,2] to memory formation [3]. These populations often require precise firing patterns among units to operate effectively, for example, synchronization of neural activity in the gamma band for perceptual and cognitive functions [4,5] and desynchronization for transition between perception and the motor response [6]. Disruptions in these patterns lead to pathological conditions such as Parkinson's disease [7], epilepsy [8], and schizophrenia [9], where excessive synchronization of neuronal activities is detected.
Owing to these practical interests, several techniques to regulate the synchronization patterns in neural populations have been proposed. These techniques, although effective, rely on either state-space descriptions [10,11] or reduced-order models such as phase models [12-20], both of which have their own limitations. The state-space control design, for example, suffers from (i) partial observability, since typically only a single measurement such as the local field potential is available, which makes identifying the state-space model difficult [21], and (ii) high dimensionality, which hinders the control design even when the true model can be obtained. Phase models, on the other hand, facilitate tractable control design due to their low dimensionality but are only applicable to small-magnitude control inputs [1], and the precise input bounds are often unknown in practice [22]. Most importantly, inferring the true coupling dynamics is a non-trivial task [23,24], where estimation errors can result in unsatisfactory control performance (Figure 1).

Figure 1. The symbol L_f denotes the desired desynchronization index, evaluated using the order parameter (see Methods), whereas L̂_f is the desynchronization index obtained by the optimal input u*(t). In panel (b), we consider minimum-energy controls designed from the true and the estimated phase models and evaluate the error in the final desynchronization index as the estimation error varies. The estimation error represents the incorrect estimate of the coupling strength, ∥α̂ − α∥, and the minimum-energy control is designed using optimal control theory (see Supplementary Materials for details).
Given these limitations associated with model-based approaches and the recent unprecedented availability of measurement data, it is natural to ask whether the desired control strategies can be learned directly from data. In this paper, we address this problem by designing control inputs that can synchronize or desynchronize a network of neurons within a pre-specified time frame. The advocated framework differs from existing works [25-31] in that it does not require the availability of a rich training data set but instead leverages the system's local dynamics to directly learn an appropriate control within a small number of trials. We learn the control input online by iteratively perturbing the system and observing its response. We present two output measurement scenarios: (i) when voltages of individual neurons can be measured, and (ii) when only a single measurement in the form of the mean firing voltage is available. The effectiveness of the developed control framework is demonstrated on a network of Hodgkin-Huxley neurons, and the control performance is compared with existing control techniques. In addition, we demonstrate the robustness of the framework to system parameter variations as well as its generalizability to accommodate various physical restrictions, such as charge-balanced inputs.

Hodgkin-Huxley neuron model
We use the Hodgkin-Huxley (HH) neuron model, which describes the generation and propagation of action potentials in the squid giant axon based on the dynamic interplay between ionic conductances and electrical activity [32]. The dynamic equations describing the action potentials in a globally coupled heterogeneous neuron population are as follows:

\[
c\dot{V}_i = I_b + I - \bar{g}_{\mathrm{Na},i}\, m_i^3 h_i (V_i - V_{\mathrm{Na},i}) - \bar{g}_{\mathrm{K},i}\, n_i^4 (V_i - V_{\mathrm{K},i}) - \bar{g}_{\mathrm{L},i} (V_i - V_{\mathrm{L},i}) + \frac{\alpha}{N}\sum_{j=1}^{N}(V_j - V_i),
\]
\[
\dot{x}_i = a_x(V_i)(1 - x_i) - b_x(V_i)\,x_i, \qquad x \in \{m, n, h\},
\]

where a_x and b_x denote the standard HH voltage-dependent opening and closing rates. The axon membrane voltage of neuron i, i = 1, . . . , N, is represented by the variable V_i, while the ion gating variables are given by m_i, n_i, and h_i. The baseline current and control input are given by I_b and I, respectively, and α denotes the coupling strength between two neurons. The nominal parameters used in this paper are V_{Na,i} = 50γ_i mV, V_{K,i} = −77γ_i mV, V_{L,i} = −54.4γ_i mV, I_b = 10 µA/cm², ḡ_{Na,i} = 120γ_i mS/cm², ḡ_{K,i} = 36γ_i mS/cm², ḡ_{L,i} = 0.3γ_i mS/cm², and c = 1 µF/cm², where γ_i is a varying parameter used to create heterogeneity in the neuron population.
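To make the simulation setup concrete, the following is a minimal Python sketch of the population dynamics above. It assumes the conventional HH rate functions (with resting potential near −65 mV) and all-to-all diffusive coupling normalized by N; both are illustrative assumptions and may differ in detail from the exact formulation used in our simulations.

```python
import numpy as np

def hh_rhs(V, m, h, n, I, gamma, alpha, c=1.0, I_b=10.0):
    """Time derivatives of the coupled HH population.

    V, m, h, n, gamma : arrays of length N; I : scalar control input.
    """
    g_Na, g_K, g_L = 120.0 * gamma, 36.0 * gamma, 0.3 * gamma
    V_Na, V_K, V_L = 50.0 * gamma, -77.0 * gamma, -54.4 * gamma
    coupling = alpha * (V.mean() - V)  # (alpha/N) * sum_j (V_j - V_i)
    dV = (I_b + I
          - g_Na * m**3 * h * (V - V_Na)
          - g_K * n**4 * (V - V_K)
          - g_L * (V - V_L)
          + coupling) / c
    # Standard HH gating kinetics (assumed parameterization).
    a_m = 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
    b_m = 4.0 * np.exp(-(V + 65.0) / 18.0)
    a_h = 0.07 * np.exp(-(V + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))
    a_n = 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
    b_n = 0.125 * np.exp(-(V + 65.0) / 80.0)
    dm = a_m * (1.0 - m) - b_m * m
    dh = a_h * (1.0 - h) - b_h * h
    dn = a_n * (1.0 - n) - b_n * n
    return dV, dm, dh, dn
```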

Principles of data-driven control design
We consider the problem of designing a periodic input, I(t), that can desynchronize or synchronize an ensemble of coupled neurons in a given time T, where T is set to 10 × T_mean, with T_mean denoting the mean natural period of the neurons. The periodic structure of the input is motivated by the findings that stimulation must be periodic in certain neurological treatments, such as deep brain stimulation [33]. Using the Fourier representation, without loss of generality, we can express the periodic input as

\[
I(t) = c_0 + \sum_{k=1}^{r}\bigl[c_k\cos(k\Omega t) + d_k\sin(k\Omega t)\bigr], \tag{1}
\]

where Ω is the mean natural frequency of the neurons and r is the number of Fourier harmonics in the input. By focusing on periodic inputs, the control design problem is equivalent to learning the input parameters β = [c_0, c_1, d_1, . . . , c_r, d_r]′ that produce the desired synchronization pattern. In the following, we describe our data-driven control scheme, which includes (i) choosing an appropriate objective function L depending upon the control objective and measurement availability, and (ii) learning the input coefficients β that minimize L through systematic perturbations of the neuronal network.

We start by defining a function L on the output measurements, individual or aggregated, so that minimization of L achieves the desired synchronization structure in the neuronal ensemble. Given an input parameter vector β = [c_0, c_1, d_1, . . . , c_r, d_r]′, for small perturbations of the input coefficients, Δβ, the updated objective function due to the perturbations can be expressed as

\[
L(\beta + \Delta\beta) \approx L(\beta) + G\,\Delta\beta,
\]

where G denotes the sensitivity of L with respect to the input coefficients at β. The importance of G is that it encapsulates the local information of the system's current dynamics and characterizes the effects of small input changes on the current trajectory. Our control idea is to leverage this expression in order to appropriately and gradually alter the control parameters, i.e., determine Δβ, to yield an improvement in the objective function. More specifically, to steer the system step-by-step closer to the desired synchronization structure, in each iteration we consider the following optimization problem

\[
\min_{\Delta\beta}\; \|L(\beta) + G\,\Delta\beta\|^2 + \lambda\|\Delta\beta\|^2, \tag{2}
\]

where λ ≥ 0 is a regularization parameter that penalizes the magnitude of Δβ, ensuring that the input perturbations are small. We select λ in an adaptive manner such that V_{Δβ*,λ} ≤ σ∥L∥², where 1 > σ > 0 is a scaling factor and V_{Δβ*,λ} denotes the optimal objective value of equation (2). This condition ensures that the updated control decreases the objective function (see Supplementary Materials for more details). In the absence of a mathematical model of the system, G is estimated by perturbing each input coefficient and applying the associated perturbed input to compute the change in L. After obtaining the closed-form solution to the quadratic program (2), i.e., Δβ* = −(G′G + λI)⁻¹G′L, we update (or improve) the control parameters and repeat the process, which is outlined in the following algorithm.
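As a reference for how the parameterization (1) maps coefficients to a signal, here is a small Python helper; `beta` follows the ordering [c_0, c_1, d_1, . . . , c_r, d_r] defined above.

```python
import numpy as np

def fourier_input(beta, Omega, t):
    """Evaluate I(t) = c_0 + sum_k [c_k cos(k Omega t) + d_k sin(k Omega t)]
    at an array of time points t."""
    beta = np.asarray(beta, dtype=float)
    c0, ck, dk = beta[0], beta[1::2], beta[2::2]   # c_1..c_r, d_1..d_r
    k = np.arange(1, ck.size + 1)[:, None]         # harmonic indices 1..r
    t = np.atleast_1d(t)
    return c0 + ck @ np.cos(k * Omega * t) + dk @ np.sin(k * Omega * t)
```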
Algorithm 1: Learning to control the synchronization structure of a neuron population
Require: Objective function L, initial (arbitrary) input coefficients β, a regularizer λ ≥ 0, and a scaling factor σ ∈ (0, 1).
1: Apply the input corresponding to β, record the output trajectories, and compute the objective function L.
2: Estimate G by perturbing the input coefficients and observing the change in the objective function (estimated using the perturbed trajectories).
3: Solve for Δβ* of the optimization problem (2).
4: Update β ← β + Δβ* and repeat from step 1 until ∥L∥ falls below the desired tolerance.
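A compact Python sketch of Algorithm 1 is given below. The routine `evaluate_L` stands in for one experimental trial (reset the system, apply the input defined by β, and return the objective); it, the perturbation size `delta`, and the fixed regularizer `lam` are illustrative placeholders, and the adaptive selection of λ via the scaling factor σ is omitted for brevity.

```python
import numpy as np

def learn_control(evaluate_L, beta0, tol=1e-2, delta=1e-3, lam=1.0, max_iter=100):
    """Iteratively update the Fourier coefficients beta so that ||L|| -> 0."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        L = np.atleast_1d(evaluate_L(beta))          # step 1: one trial
        if np.linalg.norm(L) < tol:
            break
        # Step 2: estimate G column-by-column with finite differences,
        # one extra trial per input coefficient.
        G = np.empty((L.size, beta.size))
        for j in range(beta.size):
            pert = beta.copy()
            pert[j] += delta
            G[:, j] = (np.atleast_1d(evaluate_L(pert)) - L) / delta
        # Steps 3-4: closed-form solution of (2), then update beta.
        dbeta = -np.linalg.solve(G.T @ G + lam * np.eye(beta.size), G.T @ L)
        beta = beta + dbeta
    return beta
```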
We prove that the proposed algorithm generates a sequence of parameters {β^(k)}, where k is the iteration number, such that {∥L^(k)∥} is a decreasing sequence that converges to 0. This is shown by leveraging the controllability properties of the population (see Supplementary Materials for proof). The subsequent subsections explain how to design an appropriate objective function L for two scenarios: (i) when each neuron in the network can be measured, and (ii) when only an aggregated output measurement is available.

When each neuron voltage can be measured individually
In scenarios where we can measure each individual neuron, the phase associated with the neuronal trajectory offers a practical and efficient way to measure the degree of synchrony in a neuron population. It is worth noting that, different from the phase-model control design, our approach only uses the concept of phase to characterize the level of synchrony of the neuron population and does not construct the phase model (from data) for the control design.
The phase of each neuron can be computed from its output voltage by considering the time gap between two consecutive neuron firings as one cycle; hence, the phase increases by 2π during this time interval [22]. Therefore, by assigning a phase 2πk to time t_k, the time of the k-th spike, the phase at time t (t_k < t < t_{k+1}) can be determined using linear interpolation, i.e.,

\[
\theta(t) = 2\pi k + 2\pi\,\frac{t - t_k}{t_{k+1} - t_k},
\]

where θ(t) is the phase at time t (Figure 2). To evaluate the degree of network synchronization, we leverage the well-known Kuramoto order parameter defined as [34,35]

\[
R = \Bigl|\frac{1}{N}\sum_{i=1}^{N} e^{\mathrm{i}\theta_i}\Bigr|,
\]

where θ_i is the estimated phase of neuron i, i = 1, . . . , N. The order parameter R indicates the amplitude of the mean-field behavior and takes values between R = 0, which corresponds to the incoherent state, and R = 1, which corresponds to the completely synchronized network state. Thus, a neuronal network can be desynchronized by designing an input I(t) of the form (1) that minimizes the order parameter, i.e., minimizes L = R² at time T. Similarly, for the purpose of synchronization, the input of the form (1) is designed to maximize the Kuramoto order parameter, i.e., to minimize L = 1 − R². Both tasks are achieved by applying Algorithm 1 with the corresponding objective function L.
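For reference, the phase estimation and the order parameter can be computed from recorded spike times as follows (a sketch; `spike_times` is assumed to be a sorted array with at least one spike before and after the query time t):

```python
import numpy as np

def interp_phase(spike_times, t):
    """Phase theta(t) by linear interpolation, with theta(t_k) = 2*pi*k."""
    k = np.searchsorted(spike_times, t) - 1          # last spike before t
    t_k, t_k1 = spike_times[k], spike_times[k + 1]
    return 2.0 * np.pi * (k + (t - t_k) / (t_k1 - t_k))

def order_parameter(thetas):
    """Kuramoto order parameter R of the estimated neuron phases."""
    return np.abs(np.mean(np.exp(1j * np.asarray(thetas))))
```

The desynchronization and synchronization objectives are then L = R² and L = 1 − R², respectively.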

When only the aggregated voltage can be measured
The idea of leveraging the phases of individual neurons to evaluate the network synchrony falls short when the average output voltage of all neurons, V̄ = N⁻¹ Σ_{i=1}^{N} V_i, is the only available measurement. In such scenarios, it becomes essential to assess the network synchrony from the observed mean voltage. We do this by leveraging a physical observation: synchronization results in enhanced oscillations of the mean field, while desynchronization manifests itself via reduced mean-field oscillations [11,22,36]. Neurons can therefore be synchronized (desynchronized) by amplifying (inhibiting) the mean-field oscillations, which is achieved by increasing (decreasing) the variance of the mean field. Utilizing this insight, for the desynchronization objective, we inhibit the mean voltage oscillations by minimizing the objective function

\[
L = \operatorname{Var}(\bar{V}) = \frac{1}{M}\sum_{m=1}^{M}\bigl(\bar{V}^{(m)} - \langle\bar{V}\rangle\bigr)^2,
\]

where Var(V̄) and ⟨V̄⟩ are the variance and mean numerically estimated from the M mean-field samples V̄^(1), . . . , V̄^(M) collected during the last measurement cycle. Similarly, to synchronize the population, we amplify the mean voltage variance by minimizing L = K − Var(V̄), where K is a large enough positive number to ensure L ≥ 0.
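In code, these population-level objectives require only the M samples of the mean voltage from the last measurement cycle (the value of K below is an illustrative choice):

```python
import numpy as np

def desync_objective(V_bar):
    """L = Var(V_bar): suppress mean-field oscillations to desynchronize."""
    return np.var(V_bar)

def sync_objective(V_bar, K=1.0e4):
    """L = K - Var(V_bar): amplify mean-field oscillations to synchronize."""
    return K - np.var(V_bar)
```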

Regulating synchrony through individual voltage measurements
We consider the objective of desynchronizing a population of 50 globally coupled heterogeneous HH neurons (γ ∈ [0.95, 1.05]) when each individual neuron can be measured. In the absence of external input, this population remains synchronized due to the internal coupling α = 5 × 10⁻² (Figure 3(f)). To break the simultaneous neuron firing, we apply the proposed data-driven control technique, initializing the input to zero, i.e., β = 0. The input coefficients are then slightly perturbed, G is estimated from the changes in the objective function L due to the small input perturbations, and the estimate is used to update the input (equation (2)). This process is repeated 14 times until ∥L∥ < 0.01, and the resulting neuron voltages are shown in Figure 3(g). Panels (h), (i), and (j) of Figure 3 display the designed control, the variation of ∥L∥ with each iteration, and the estimated final neuron phases. Note that our approach does not require any data-intensive task, such as training an artificial neural network, for control design and only takes 14 iterations to achieve desynchronization, significantly fewer than other online learning approaches such as Reinforcement Learning [29,30].

Now, we consider the synchronization of 50 coupled HH neurons where the population is in a desynchronized state in the absence of external input (Figure 3). Similar to the previous case, we initialize the input parameters to zero and then employ Algorithm 1 until ∥L∥ < 0.01. After 30 iterations, the neuron population is successfully synchronized. The controlled trajectories, designed input, ∥L∥ in each iteration, and the estimated neuron phases in the final cycle are shown in Figures 3(b)-(e). It is worth noting that, to achieve (de)synchronization, some existing data-driven neuronal control methods in fact require a rather complete set of pre-training data with a uniform distribution [25-27]. In practical applications, such complete data sets are often not readily available and must be carefully collected during experiments. Our approach, on the other hand, does not rely on pre-training data and only utilizes the data in the neighborhood of the system's current trajectory to facilitate control design. In addition, we consider (de)synchronizing a population of identical HH neurons; our algorithm takes 31 (22) iterations to synchronize (desynchronize) the population (see Supplementary Materials Figure 2). These results illustrate that the proposed data-driven technique is effective in both enhancing and suppressing the synchronization of neuron populations when individual neuronal measurements are available.

Learning control from a population-level measurement
We consider the (de)synchronization of 50 coupled HH neurons where the noisy mean firing voltage of the population is the only available measurement. This is achieved by designing a control input to suppress (for desynchronization) or enhance (for synchronization) the mean-field oscillations using Algorithm 1. The results are displayed in Figure 4, where desynchronization (similarly, synchronization) is achieved when the mean voltage (red line) oscillations are reduced (similarly, amplified) (Figures 4(b) and (f)). Note in Figure 4(f) that the lower the mean-field oscillations, the more desynchronized the population is. Additionally, similar to the case of individual neuron measurements, our algorithm requires 14 (31) iterations to achieve desynchronization (synchronization) (Figures 4(d) and (i)), even with a single noisy population-level output measurement.

Now, we compare the control performance, i.e., the input power and the error in the control objective, of the individual measurement case to the aggregated measurement case (Table I). The input power is estimated as (Ω/2π) ∫₀^{2π/Ω} I²(t) dt = c_0² + ½ Σ_{k=1}^{r} (c_k² + d_k²) using Parseval's theorem [37], and the error in the control objective is taken as 1 − R² for synchronization and R² for desynchronization. Note that we estimate the order parameter, R², from the individual voltage trajectories in the aggregated case only for comparison purposes. We observe that the control performance of our algorithm with the aggregated measurement is similar to the individual measurement case for the desynchronization objective (Table I); however, we observe a small increase in the input power for the synchronization case.

Table I. A comparison of control performance between the individual and aggregated measurement cases.
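Given the Fourier parameterization (1), the input power reported in Table I follows directly from the learned coefficients; a minimal sketch of the Parseval identity used above:

```python
import numpy as np

def input_power(beta):
    """Average power of I(t) over one period: c_0^2 + 0.5 * sum(c_k^2 + d_k^2)."""
    beta = np.asarray(beta, dtype=float)
    return beta[0]**2 + 0.5 * np.sum(beta[1:]**2)
```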

Online dynamic adaptation to system variation
Unquestionably, system variation is a distinct characteristic of biological and physical rhythmic systems, such as fluctuations of the interspike interval in real neuron activity [38] and frequency variability in electrochemical oscillator networks [39]. Therefore, it is essential to develop control techniques that can accommodate variations in the system parameters (here, the heterogeneity parameters γ_i; see Figure 5). Our iterative control technique quickly adapts to these changes and achieves the desired synchronization structures. This adaptation of the control is evident from Figures 5(b) and (d) (light blue to dark blue). Even with such system variation, the total number of control updates remains low, at 16 (21) updates for synchronization (desynchronization).
It is worth noting that controlling either synchronization or desynchronization of the severely underactuated HH population through a single control and a single aggregated noisy measurement is a highly non-trivial task even when the system model is available [40][41][42][43]. To the best of our knowledge, no other data-driven control work has addressed this issue. In this paper, we achieve this control task purely by means of online data from a small number of trials while being able to handle changes in the system with minimal degradation in control performance.

Robustness to coupling strength variation
In a population of coupled HH neurons, synchronization is observed when the intrinsic coupling strength α is increased [44-46]. This means that as the coupling strength increases, less external forcing is necessary for synchronization, while a stronger input is required for desynchronization to dominate the system dynamics. A similar behavior is observed when we construct control inputs to synchronize or desynchronize a neural population for various coupling strengths (Table II). Specifically, we observe that the control input power required for synchronization decreases as α increases, whereas the opposite trend is found for desynchronization control. These results have two implications: (i) our control method performs consistently over a wide variety of coupling strengths, and (ii) the control technique leverages the system dynamics to find a relatively minimal control rather than merely finding a feasible control. Note that, for the purpose of comparison, the error tolerance ε is kept identical across all coupling strengths.

Comparison with model-based methods
Here, we show that our data-driven control method achieves performance on par with model-based control design techniques, such as [15]. The technique in [15] utilizes phase models to construct periodic, open-loop control inputs that synchronize an ensemble of heterogeneous oscillators. We keep the input frequency the same for both methods and compare the input power and the accuracy of the final synchronization pattern. We consider a network of 50 coupled non-identical HH neurons and design control inputs to desynchronize the network from a synchronized state (and, similarly, to synchronize it from a desynchronized initial state). The comparison results are displayed in Table III, with the corresponding control inputs shown in Figure 6. This comparison illustrates that the proposed model-free technique achieves better control performance, both in terms of the control objective L and the input power, than the model-based technique. For the synchronization objective, our approach delivers comparable synchronization performance with 60% less power, whereas for the desynchronization objective, we obtain an improvement of 52% in the control objective with a three-fold reduction in input power. This improvement in control performance can be explained by the fact that the model-based approach imposes certain assumptions, such as network topology and coupling structure, on the system dynamics for the control design, whereas our model-free strategy learns the control input by directly interacting with the system without such assumptions.

Table III. Control performance with the proposed approach and the reference model-based technique proposed in [15].

Discussion
In this paper, we present a data-driven framework to effectively regulate the synchrony in a network of coupled heterogeneous neurons. We apply the proposed framework to the HH neuron population and show that the desired controls can be determined from noisy aggregated measurements. We provide a theoretical analysis of the convergence of our algorithm. Further, by means of numerical simulations, we show that our data-driven controls provide better performance than the classic phase model-based approaches.
Our control technique differs from the existing data-driven techniques for complex networks [25-27,29-31] in two important aspects. First, we do not extract information from a rich pre-training data set, as in [25-27], but instead directly learn the appropriate control online from scratch. Online interactions with the system enable the control synthesis to effectively adapt to system variation. Second, unlike Reinforcement Learning [29,30], we do not learn from a limited reward signal, which, without a hand-crafted reward function, requires many trials and a large amount of data. Instead, we utilize the rich local dynamics of the system along its current trajectory to regulate the system step-by-step toward the desired synchrony.
The suggested control method is both theoretically sound and practically applicable, as it merely requires the (population-level) voltage measurement to learn the desired control, without relying on any assumptions on the system dynamics. The online learning aspect of our algorithm enables us to effectively regulate systems with time-varying parameters. This problem of controlling time-varying systems, which is rather challenging in a model-based setting, finds a simpler solution in a data-driven framework due to the continual nature of online system interactions. In addition, our algorithm offers the generalizability to incorporate experimental constraints, such as charge-balanced (CB) inputs, which are a necessary requirement for deep-brain stimulation [16,47]. The CB constraint can be expressed as ∫₀^T I(t) dt = 0, where T is the duration of the stimulation [19], and can readily be implemented in our framework by setting c_0 = 0.
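In code, enforcing the CB constraint amounts to pinning the DC coefficient of the parameterization (1) to zero after every update; since T spans whole periods of Ω, every harmonic integrates to zero, so only c_0 carries net charge. A minimal sketch:

```python
import numpy as np

def project_charge_balanced(beta):
    """Enforce int_0^T I(t) dt = 0 by zeroing the DC coefficient c_0."""
    beta = np.asarray(beta, dtype=float).copy()
    beta[0] = 0.0
    return beta
```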
Some limitations of our work should also be acknowledged. First, the evaluation of the gradient from the perturbed trajectory might suffer from noise; this can be mitigated by ensuring that the perturbation is larger than the noise level. Second, our algorithm requires resetting the system to evaluate the effect of each perturbation, which can be challenging in certain applications. However, the number of resets required by our algorithm is low, and they can be performed by applying a resetting pulse in an experimental setup. Nonetheless, it would be ideal to have a data-driven strategy that does not require such resets, since this would enhance the relevance and applicability of our data-driven control framework and lead to practical control methods for complex networks.

Data availability
All data generated or analyzed during the study are included in this article (and its Supplementary Materials). The simulation codes are further available upon request.