Can the brain use waves to solve planning problems?

A variety of behaviors like spatial navigation or bodily motion can be formulated as graph traversal problems through cognitive maps. We present a neural network model which can solve such tasks and is compatible with a broad range of empirical findings about the mammalian neocortex and hippocampus. The neurons and synaptic connections in the model represent structures that can result from self-organization into a cognitive map via Hebbian learning, i.e. into a graph in which each neuron represents a point of some abstract task-relevant manifold and the recurrent connections encode a distance metric on the manifold. Graph traversal problems are solved by wave-like activation patterns which travel through the recurrent network and guide a localized peak of activity onto a path from some starting position to a target state.

In the brain, though, the overwhelming majority of connections between neurons are recurrent, i. e. they connect neurons within the same cortical area or transmit information from higher areas back to lower ones. For example, in the visual cortex, synapses from the lateral geniculate nucleus of the thalamus, i. e. the feed-forward connections, make up only 5 % to 10 % of the excitatory synapses in their target layer 4 of V1 in cats and monkeys [1]. The understanding of neurons as "feature detectors" can therefore only represent a small fragment of the overall picture.
Several possible explanations of the function of these recurrent connections have been proposed. For example, it has been suggested that neural activity follows almost chaotic trajectories in an extremely high-dimensional state space while the dynamics are still sensitive enough to be influenced by the relatively small share of feed-forward connections [2]. In this conceptual framework, attention signals, past memories, and sensory input are merged in order to guide the system towards well-separated, lower-dimensional subspaces which represent certain states of perception. It is also hypothesized that top-down projections from higher cortical areas transmit predictions or expectations to influence how the lower areas interpret the incoming sensory data [3,4]. Such predictions are thought to play a role in noise reduction and signal restoration or to direct attention bottom-up to features which deviate from the prediction and thus require some executive reaction. Nevertheless, the full computational purpose of the recurrent connections is still poorly understood [1].
In the present paper, we propose a new algorithmic role which recurrent neural connections might play, namely as a computational substrate to solve graph traversal problems. We argue that many cognitive tasks like navigation or motion planning can be framed as finding a path from a starting position to some target position in a space of possible states. The possible states may be encoded by neurons via their "feature-detector property". Allowed transitions between nearby states would then be encoded in recurrent connections, which can form naturally via Hebbian learning since the feature detectors' receptive fields overlap. They may eventually form a "map" of some external system. Activation propagating through the network can then be used to find a short path through this map. In effect, the neural dynamics then implement an algorithm similar to Breadth-First Search on a graph.
The remainder of the paper is organized as follows: In Section 2, we give a conceptual overview, describe the technical details of the proposed model and show some simulation results for an exemplary numerical implementation of the model. We then review empirical support for some components of the model in Section 3. Limitations, implications and ideas for further development are discussed in Section 4. The more technical details related to general graph theory and to the numerical implementation can be found in Section 5.

A Network of Neurons that Represents a Manifold of Stimuli
We consider a neural network which is exposed to some external stimuli-generating process under the assumption that the possible stimuli can be organized in some continuous manifold¹ in the sense that similar stimuli are located close to each other on this manifold. For example, in the case of a mouse running through a maze all possible perceptions can be associated with a particular position in a two-dimensional map, and neighboring positions will generate similar perceptions, see Figure 1a.
Proprioception, i. e. the sense of location of body parts, can also be a source of stimuli. For example, for a simplified arm with two degrees of freedom every possible position of the arm corresponds to one specific stimulus, cf. Figure 1b. All possible stimuli combined give rise to a two-dimensional manifold. The example also shows that the manifold will usually be restricted since not every conceivable combination of two joint angles might be a physically viable position for the arm.
The manifold of potential stimuli need not be embedded in a flat Euclidean space as in the case of the maze. For example, if the stimuli are two-dimensional figures which can be shifted horizontally or rotated on a screen, the corresponding manifold is two-dimensional (one translational parameter plus one for the rotation angle) but it is not isomorphic to a flat plane since a change of the rotation angle by 2π maps the figure onto itself again, see Figure 1c.
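The periodicity of the rotation angle can be made concrete with a small worked example. The following is only a sketch under stated assumptions: the helper name `cylinder_displacement` and the representation of a stimulus as a pair (x, α) in arbitrary position units and radians are chosen here for illustration and are not taken from the model.

```python
import math

def cylinder_displacement(start, goal):
    """Shortest displacement (dx, dalpha) between two stimuli on a cylinder.

    start, goal: (x, alpha) pairs; x is a horizontal position, alpha an angle
    in radians. Both names are hypothetical and used for illustration only.
    """
    x0, a0 = start
    x1, a1 = goal
    dx = x1 - x0
    # Wrap the angular difference into [-pi, pi) so the short way round is taken.
    dalpha = (a1 - a0 + math.pi) % (2 * math.pi) - math.pi
    return dx, dalpha

# Tilting from 350 degrees to 10 degrees: the short way is a positive rotation
# of 20 degrees rather than a negative rotation of 340 degrees.
dx, dalpha = cylinder_displacement((0.0, math.radians(350)), (0.0, math.radians(10)))
```

The sign of `dalpha` answers the question whether the figure should be tilted one way or the other; on a flat plane the same subtraction would give the long way round.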
We assume that such manifolds of stimuli are approximated by the connectivity structure of a neural network which forms via a learning process. The result is a neural structure which we call a cognitive map. The defining property of a cognitive map is that it has a neural encoding for every possible stimulus and that two similar stimuli, i. e. stimuli which are close to each other in the manifold of stimuli, are represented by similar encodings, i. e. encodings which are close to each other in the cognitive map². There is considerable evidence, which we review in Section 3.1, that such cognitive maps are implemented by the brain, but the details of the encoding of stimuli remain mostly unclear.
For the model, we make a very simplistic choice and assume a single-neuron encoding, i. e. the manifold of stimuli is covered by the receptive fields of individual neurons. Each such receptive field is a small localized area in the manifold and two neighboring receptive fields may overlap, see Figure 2. Such an encoding is a typical outcome for a single layer of neurons which are trained in a competitive Hebbian learning process [5]. Examples of such competitive learning algorithms are Kohonen Maps [6], (Growing) Neural Gas [7] or variants of sparse coding dictionary learning [8].
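To illustrate how competitive learning can distribute receptive fields over a manifold, here is a minimal sketch of hard winner-take-all learning on the interval [0, 1]. It is a deliberately simplified stand-in and not one of the cited algorithms: there is no neighborhood function as in Kohonen maps and no edge creation as in Neural Gas. The function name and all parameter values are assumptions.

```python
import random

def competitive_learning(n_neurons=10, n_samples=5000, lr=0.05, seed=0):
    """Winner-take-all learning of receptive-field centres on the interval [0, 1]."""
    rng = random.Random(seed)
    centres = [rng.random() for _ in range(n_neurons)]
    for _ in range(n_samples):
        stimulus = rng.random()  # stimulus drawn uniformly from the manifold
        # The neuron whose centre is closest to the stimulus wins the competition.
        winner = min(range(n_neurons), key=lambda i: abs(centres[i] - stimulus))
        # Only the winner moves its receptive-field centre towards the stimulus.
        centres[winner] += lr * (stimulus - centres[winner])
    return sorted(centres)

centres = competitive_learning()
```

Each centre drifts towards the mean of the stimuli it wins, so the centres tend to spread over the interval. With hard competition a badly initialized neuron may never win, which is one reason the cited algorithms use softer update rules.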
The key idea of the model is that a problem which can be formulated as a planning problem in the manifold of stimuli can be solved as a planning problem in the corresponding cognitive map. To this end, it is not enough to consider the cognitive map as a set of individual points, but its topology must be known as well. This topological information will be encoded in the recurrent connections of the neural network. It seems natural that a neural network could learn this topology via Hebbian learning: Two neurons with close-by receptive fields in the manifold will be excited simultaneously relatively often because their receptive fields overlap. Consequently, recurrent connections within the cognitive map will be strengthened between such neurons and the topology of the neural network will approximate the topology of the manifold, see Figure 2. This idea has been explored in more detail by Curto and Itskov in [9].

¹ [...] endowed with a Riemannian metric [...] curved. For example, a saddle-shaped hyperbolic plane, a sphere or a torus are manifolds.
² Of course, we do not imply that two neurons which are close to each other in the connectivity structure are also close to each other with respect to their physical location in the neural tissue.

Figure 1: (b) Approximate positions of the "arm" are encoded in single neurons. Physically impossible positions where the "arm" intersects with the "body" are not encoded at all (because they have never been observed by the neural network), giving rise to the gap in the center of the cognitive map. An example planning problem is to move the "hand" from behind the body to a position in front of the body without collision. (c) The visual stimulus is always the letter "A", but at different x-positions and tilted at different angles α. Due to the periodicity of the stimulus under change of α, the resulting cognitive map has the topology of a cylinder. An example planning problem in this case is the decision whether the "A" has to be moved/tilted to the left or to the right to convert it from some given position to another one.
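The Hebbian argument made in this section can be sketched in a few lines. In this illustration, neurons have interval-shaped receptive fields on a line, a neuron counts as active whenever the stimulus lies inside its field, and the weight between two neurons grows each time both are active together. The function name, the binary activation, and all numbers are assumptions made for this example.

```python
def hebbian_recurrent_weights(centres, radius, stimuli, lr=1.0):
    """Accumulate Hebbian weights between neurons co-activated by a stimulus.

    A neuron is active when the stimulus falls inside its receptive field,
    here an interval of half-width `radius` around its centre.
    """
    n = len(centres)
    w = [[0.0] * n for _ in range(n)]
    for s in stimuli:
        active = [i for i, c in enumerate(centres) if abs(s - c) <= radius]
        for i in active:
            for j in active:
                if i != j:
                    w[i][j] += lr  # strengthen connections between co-active neurons
    return w

centres = [0.0, 1.0, 2.0, 3.0, 4.0]      # receptive-field centres on a line
stimuli = [k * 0.05 for k in range(81)]  # stimuli sweeping the interval [0, 4]
w = hebbian_recurrent_weights(centres, radius=0.8, stimuli=stimuli)
```

Because only neighboring receptive fields overlap, only neighboring neurons end up connected, so the learned weight matrix reproduces the chain topology of the underlying line.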

Dynamics Required for Solving Planning Problems
Having set up a network that represents a manifold of stimuli, we need to endow this network of feed-forward and recurrent connections with dynamics. We do so by imposing two interacting mechanisms.
First, the neurons in the network should exhibit continuous attractor dynamics [10]: If a "clique" of a few tightly connected neurons are activated by a stimulus via the corresponding feed-forward pass, they keep activating each other while inhibiting their wider neighborhood. The result is a self-sustained, localized neural activity surrounded by a "trench of inhibition". In the model, this encodes the as-is situation or the starting position for the planning problem. Such a state is called an "attractor" since it is stable under small perturbations of the dynamics, and it is part of a continuous landscape of attractors with different locations across the network. For a recent review of attractor networks, the reader is referred to [10].

Figure 2: In the model, the recurrent connections within a single layer of neurons approximate the topology of the manifold of stimuli. During the learning process, the strongest recurrent connections are formed between neurons with overlapping receptive fields. The problem of finding a route through the manifold (red line) is thus approximated by the problem of finding a path through the graph of recurrent neural connections (red path). (Labels in the figure: manifold of stimuli; overlap between receptive fields; layer of neurons; receptive fields; recurrent connections; feed-forward connections.)
Second, the neural network should allow for wave-like expansion of activity. If a small number of close-by neurons are activated by some hypothetical executive brain function (i. e. not via the feed-forward pass), they activate their neighbors, which in turn activate theirs, and so on. The result is a wave-like front of activity propagating through the recurrent network. The neurons which have been activated first encode the to-be state or the end position of the planning problem.
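As an illustration of such wave-like expansion, the following sketch uses a three-state excitable medium in the style of a Greenberg-Hastings cellular automaton. This is a swapped-in toy model, not the spiking network used in the numerical experiments; the state names, the square grid with four neighbors, and the update rule are assumptions.

```python
REST, EXCITED, REFRACTORY = 0, 1, 2

def gh_step(grid, source):
    """One synchronous update of a three-state excitable medium on a square grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    new = [[REST] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            state = grid[r][c]
            if state == EXCITED:
                new[r][c] = REFRACTORY  # a cell that fired must recover first
            elif state == REFRACTORY:
                new[r][c] = REST
            else:
                neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(0 <= i < n_rows and 0 <= j < n_cols and grid[i][j] == EXCITED
                       for i, j in neighbours):
                    new[r][c] = EXCITED  # activation spreads to resting neighbours
    if new[source[0]][source[1]] == REST:
        new[source[0]][source[1]] = EXCITED  # sustained stimulation of the source
    return new

def first_arrival_times(size, source, n_steps):
    """Time step at which the first wave front reaches each cell."""
    grid = [[REST] * size for _ in range(size)]
    grid[source[0]][source[1]] = EXCITED
    arrival = {source: 0}
    for t in range(1, n_steps + 1):
        grid = gh_step(grid, source)
        for r in range(size):
            for c in range(size):
                if grid[r][c] == EXCITED and (r, c) not in arrival:
                    arrival[(r, c)] = t
    return arrival

arrival = first_arrival_times(size=7, source=(6, 6), n_steps=12)
```

The refractory state keeps a front from running backwards, so activity spreads outward as a sequence of fronts. In this toy model the first front reaches each cell after a number of steps equal to its grid distance from the source.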
The key to solving a planning problem is in the interaction between the two types of dynamics, namely in what happens when the expanding wave front hits the stationary peak of activity. On the side where the wave is approaching it, the "trench of inhibition" surrounding the peak is in part neutralized by the additional excitatory activation from the wave. Consequently, the containment of the activity peak is somewhat "softer" on the side where the wave hit it and it may move a step towards the direction of the incoming wave. This process repeats, leading to a small change of position with every incoming wave front. The localized peak of excitation will follow the wave fronts back to their source, thus moving along a route through the manifold from start to end position, see Figure 3.
The two types of dynamics described above are seemingly contradictory, since the first one restricts the system to localized activity, while the second one permits a wave-like propagation of activity throughout the system. To resolve the conflict in numerical simulations, we have split the dynamics into a continuous attractor layer and a wave propagation layer, which are responsible for different aspects of the system's dynamical behaviour. We discuss the concepts of a numerical implementation in Section 2.4 and ideas for a biologically more plausible implementation in Section 4.

Figure 3: The as-is state of the system is encoded in a stable, localized, and self-sustained peak of activity surrounded by a "trench" of inhibition (top left corner). A planning process is started by stimulating the neurons which encode the to-be position (bottom right corner). The resulting waves of activity travel through the network and interact with the localized peak. Each incoming wave front shifts the peak slightly towards its direction of origin. Note that, for reasons of simplicity, we did not draw the neural network in this figure but only the manifold which it approximates. (Labels in the figure: trench of inhibition; localized activation (as-is state); activation (to-be state); wave-like propagation of activation; displacement of localized activation.)

Connection to Real-Life Cognitive Processes
To make the proposed concept more tangible we present a rough sketch of how it could be embedded in a real-life cognitive process. As an example, we consider a human grabbing a cup of coffee. According to our hypothesis, the as-is position of the subject's arm is encoded as a localized peak of activity in the cognitive map encoding the complex manifold of arm positions. We assume that this encoding works in a bi-directional way, somewhat like the string of a puppet: When the arm is moved by external forces, the neural representation of its position moves along with it. On the other hand, if the representation is changed slightly by some cognitive process, then some hypothetical muscle control mechanism attempts to make the arm follow its neural representation and bring the limb and its representation back into congruence. If now the human subject decides to grab the cup of coffee, some executive brain function constructs a to-be state of holding the cup: The position of the hand with the fingers around the cup handle is what the person consciously thinks of. This thought activates the encoding of the to-be state in the cognitive map that represents the manifold of possible arm positions. The activation creates waves of activity propagating through the network, reaching the representation of the as-is state and shifting it slightly towards the to-be state. The hypothetical muscle control mechanism reacts to this disturbance and performs a motor action to keep the arm and its representation in line. As long as the person implicitly represents the to-be state, the arm "automatically" performs the complicated sequence of many individual joint movements which is necessary to grab the cup.
This concept can be extended to flexibly consider restrictions that have not been hard-coded in the cognitive map by learning. For example, in order to grab the cup of coffee, the arm may need to avoid obstacles on the way. To this end, the hypothetical executive brain function which defines the target state of the hand could also temporarily "block" certain regions of the cognitive map (e. g. via inhibition) which it associates with the discomfort of a collision. Those parts of the network which are blocked cannot conduct the "planning waves" anymore and thus a path around those regions will be found.

Implementation in a Numerical Proof-of-Concept
To substantiate the presented conceptual ideas, we performed numerical experiments using multiple different setups. In each case, the implementation of the model employs two neural networks that both represent the same manifold of stimuli.
The continuous attractor layer is a sheet of neurons that models the functionality of a network of place cells in the human hippocampus [11,12]. Each neuron is implemented as a rate-coded cell embedded in its neighborhood via short-range excitatory and long-range inhibitory connections as in [13]. This structure allows the formation of a self-sustaining "bump" of activity, which can be shifted through the network by external perturbations. The bump represents the as-is state of the planning problem, which is to be solved by moving the bump to its target state.
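A connectivity profile with short-range excitation and long-range inhibition is often written as a difference of two Gaussians. The sketch below builds such a kernel for neurons arranged on a ring. It only illustrates the general shape: the actual kernel and parameters of [13] are not given in this section, so the function names and all amplitudes and widths are assumptions.

```python
import math

def mexican_hat(distance, a_exc=1.0, sigma_exc=1.0, a_inh=0.5, sigma_inh=3.0):
    """Difference-of-Gaussians weight as a function of inter-neuron distance."""
    excitation = a_exc * math.exp(-distance ** 2 / (2 * sigma_exc ** 2))
    inhibition = a_inh * math.exp(-distance ** 2 / (2 * sigma_inh ** 2))
    return excitation - inhibition

def ring_weights(n_neurons, **kernel_params):
    """Recurrent weight matrix for neurons on a ring (periodic distance)."""
    weights = []
    for i in range(n_neurons):
        row = []
        for j in range(n_neurons):
            d = min(abs(i - j), n_neurons - abs(i - j))  # distance around the ring
            row.append(0.0 if i == j else mexican_hat(d, **kernel_params))
        weights.append(row)
    return weights

w = ring_weights(20)
```

With these illustrative parameters, direct neighbors excite each other while neurons further apart inhibit each other, which is the ingredient that lets a localized bump sustain itself and suppress activity elsewhere.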
The wave propagation layer is constructed with an identical number of excitatory and inhibitory Izhikevich neurons [14,15], properly connected to allow for stable signal propagation across the manifold of stimuli. The target node is permanently stimulated, causing it to emit waves of activation which travel through the network.
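For readers unfamiliar with the Izhikevich neuron, here is a minimal single-neuron sketch using the standard two-variable form of the model with the commonly quoted "regular spiking" parameter set. It is not the network used in the experiments: the forward-Euler integration, the step size, and the input currents are assumptions chosen for illustration.

```python
def simulate_izhikevich(current, t_max_ms=200.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler simulation of one Izhikevich neuron; returns spike times in ms."""
    v = c      # membrane potential (mV)
    u = b * v  # recovery variable
    spike_times = []
    n_steps = int(t_max_ms / dt)
    for k in range(n_steps):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        if v >= 30.0:  # spike detected: record it and reset
            spike_times.append(k * dt)
            v = c
            u += d
    return spike_times

spikes_driven = simulate_izhikevich(current=10.0)  # constant suprathreshold input
spikes_silent = simulate_izhikevich(current=0.0)   # no input
```

A permanently stimulated neuron of this kind fires repeatedly, which matches the role of the target node as a periodic emitter of wave fronts, while an unstimulated neuron settles to rest.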
The interaction between the two layers is modeled in a rather simplistic way. As in [13], a time-dependent direction vector was introduced in the synaptic weight matrix of the continuous attractor layer. It has the effect of shifting the synaptic weights in a particular direction which in turn causes the location of the activation bump in the attractor layer to shift to a neighbouring neuron. The direction vector is updated whenever a wave of activity in the wave propagation layer newly enters the region which corresponds to the bump in the continuous attractor layer. Its direction is set to point from the center of the bump to the center of the overlap area between bump and wave, thus causing a shift of the bump towards the incoming wave fronts.
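The geometric rule for the direction vector can be sketched as follows. The sketch assumes that bump and wave are each given as a set of active (row, column) grid cells and that "center" means the unweighted mean position. Both are simplifications, and the function names are hypothetical.

```python
import math

def centroid(cells):
    """Mean position of a non-empty collection of (row, col) grid cells."""
    cells = list(cells)
    return (sum(r for r, _ in cells) / len(cells),
            sum(c for _, c in cells) / len(cells))

def direction_vector(bump_cells, wave_cells):
    """Unit vector from the bump centre to the centre of the bump/wave overlap.

    Returns None when the wave does not overlap the bump (no update) or when
    the overlap is centred exactly on the bump (direction undefined).
    """
    overlap = set(bump_cells) & set(wave_cells)
    if not overlap:
        return None
    bump_r, bump_c = centroid(bump_cells)
    over_r, over_c = centroid(overlap)
    dr, dc = over_r - bump_r, over_c - bump_c
    norm = math.hypot(dr, dc)
    if norm == 0.0:
        return None
    return (dr / norm, dc / norm)

# A 3x3 bump centred on (5, 5); a wave front arriving from below covers row 6.
bump = {(r, c) for r in range(4, 7) for c in range(4, 7)}
wave = {(6, c) for c in range(0, 10)}
direction = direction_vector(bump, wave)
```

In this example the overlap is the bottom row of the bump, so the vector points straight towards the side from which the wave arrives.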
For more details on the implementation, see Section 5.

Results of the Numerical Experiments
In a very simple initial configuration, the path finding algorithm was tested on a fully populated square grid of neurons as described before. Figure 4 shows snapshots of wave activity and continuous attractor position at some representative time points during the simulation. As expected, stimulation of the wave propagation layer in the lower right of the cognitive map causes the emission of waves, which in turn shift the bump in the continuous attractor layer from its starting position in the upper left towards its target state.

Figure 4: Snapshots at t = 25 ms, t = 33 ms, t = 34 ms, t = 35 ms, t = 130 ms, and t = 275 ms.

As described in Section 2.3, the manifold of stimuli represented by the neuronal network can be curved, branched, or of different topology, either permanently or temporarily. The purpose of the model is to allow for a reliable solution to the underlying graph traversal problems independent of potential obstacles in the networks. For this reason we investigated whether the bump of activation in the continuous attractor layer was able to successfully navigate through the graph from the starting node to the end node in the presence of nodes that could not be traversed. To test this idea we constructed different "mazes", blocking off sections of the graph by zeroing the synaptic connections of the respective neurons in the wave propagation layer and by clamping activation functions of the corresponding neurons in the continuous attractor layer to zero (see Figure 5). We found that in all these setups, the algorithm was able to successfully navigate the bump in the continuous attractor layer through the mazes.
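The maze experiments can be mimicked at the level of the underlying graph problem. The sketch below abstracts away all neural dynamics: the first wave front is replaced by arrival times computed on a grid with blocked cells, and the bump is replaced by a marker that always steps to the neighbor the wave reached earliest. The grid layout and function names are assumptions and do not reproduce the setups of Figure 5.

```python
from collections import deque

def wave_arrival_times(n_rows, n_cols, target, blocked):
    """Arrival time of the first wave front at every cell reachable from the target."""
    arrival = {target: 0}
    queue = deque([target])
    while queue:
        r, c = queue.popleft()
        for cell in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            i, j = cell
            if (0 <= i < n_rows and 0 <= j < n_cols
                    and cell not in blocked and cell not in arrival):
                arrival[cell] = arrival[(r, c)] + 1
                queue.append(cell)
    return arrival

def follow_waves(start, target, arrival):
    """Move the marker step by step towards the side the wave arrived from first."""
    if start not in arrival:
        return None  # no wave ever reaches the start position
    path = [start]
    while path[-1] != target:
        r, c = path[-1]
        neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        candidates = [n for n in neighbours if n in arrival]
        path.append(min(candidates, key=arrival.get))
    return path

# 5x5 grid with a wall in column 2 that leaves a gap only in the bottom row.
blocked = {(r, 2) for r in range(0, 4)}
arrival = wave_arrival_times(5, 5, target=(0, 4), blocked=blocked)
path = follow_waves(start=(0, 0), target=(0, 4), arrival=arrival)
```

Blocked cells never conduct the wave, so they receive no arrival time and the marker cannot step onto them. The resulting path detours through the gap in the wall.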

Relation to Existing Graph Traversal Algorithms
To conclude this section, we highlight a few parallels between the presented approach and the classical Breadth-First Search (BFS) algorithm.
BFS begins at some start node s of the graph and marks this node as "visited". In each step, it then chooses one node which is "visited" but not "finished" (in the standard formulation, the one that was visited earliest) and checks whether there are still unvisited nodes that have an edge to this node. If so, the corresponding nodes are also marked as "visited". In either case, the current node is then marked as "finished" and another iteration of the algorithm is started. For a more formal treatment of BFS, we refer to Section 5.
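The following sketch puts the classical queue-based BFS next to a variant that expands all visited-but-unfinished nodes in one step, which is the kind of parallel expansion a propagating wave front performs. Both functions return the number of steps after which each node is first visited. The example graph and the function names are made up for illustration.

```python
from collections import deque

def bfs_sequential(graph, start):
    """Classical BFS: process one visited-but-unfinished node at a time."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in depth:  # still unvisited
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
        # at this point `node` counts as finished
    return depth

def bfs_wavefront(graph, start):
    """Variant that expands the whole frontier in a single step."""
    depth = {start: 0}
    frontier = {start}
    step = 0
    while frontier:
        step += 1
        # Neighbours of all frontier nodes are examined together.
        next_frontier = {n for node in frontier for n in graph[node] if n not in depth}
        for n in next_frontier:
            depth[n] = step
        frontier = next_frontier
    return depth

graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
    "f": [],  # not reachable from "a"
}
depth_sequential = bfs_sequential(graph, "a")
depth_wavefront = bfs_wavefront(graph, "a")
```

Both variants assign the same depth to every reachable node. They differ only in how many nodes are handled per iteration.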
The approach presented here is a parallelized variant of this algorithm. Assuming that all neurons always obtain sufficient current to become activated, the propagating wave corresponds to the step of the algorithm in which the neighbors of the currently considered node are investigated. In contrast to BFS, the algorithm performs this step for all candidate nodes in a single step. That is, it considers all nodes currently marked as visited, checks the neighbors of all these nodes at once and marks them as visited if necessary. This close connection also allows us to derive theoretical performance properties for the algorithm based on the behavior of BFS. As a more in-depth analysis of this connection is not within the scope of this paper, we refrain from going into detail here and refer again to Section 5.

Having all ingredients of the proposed conceptual framework in place, the following section reviews some experimental evidence indicating that it could in principle be employed by biological brains.

Empirical Evidence
In this section we review empirical findings which are relevant for the model. We dedicate one subsection to each of several key model assumptions on the neural connectivity and dynamics.

Cognitive Maps
The concept of "cognitive maps" was first proposed by Edward Tolman [16,17] who conducted experiments to understand how rats were able to navigate mazes to seek rewards. He noticed that these animals showed remarkably flexible behaviour when confronted with novel versions of their maze environments, such as finding previously unexplored shortcuts or finding new routes when obstructions made old ones untraversable. Tolman theorized that this behaviour was made possible by the rats having an internal model (or map) of the mazes which they used to navigate and which they updated when new information about the maze was presented.
A body of evidence suggests that neural structures in the hippocampus and entorhinal cortex potentially support cognitive maps used for spatial navigation [11,18,19]. Within these networks, specific kinds of neurons are thought to be responsible for the representation of particular aspects of cognitive maps. Some examples are place cells [11,18] which code for the current location of a subject in space, grid cells which contribute to the problem of locating the subject in that space [20] as well as supporting the stabilisation of the attractor dynamics of the place cell network [13], head-direction cells [21] which code for the direction in which the subject's head is currently facing, and reward cells [22] which code for the location of a reward in the same environment.
The brain regions supporting spatially aligned cognitive maps might also be utilized in the representation of cognitive maps in non-spatial domains: In [23], fMRI recordings taken from participants while they performed a navigation task in a non-spatial domain showed that similar regions of the brain were active for this task as for the task outlined in [24] where participants navigated a virtual space using a VR apparatus. Not only were the same spatial-task aligned regions active for this non-spatial-domain navigation task but the firing patterns of the neurons recorded in the former displayed the same hexagonal firing patterns that are characteristic of entorhinal grid cells. Further, according to [25], activation of neurons in the hippocampus (one of the principal sites for place cells) is indicative of how well participants were able to perform in a task related to pairing words. Supporting this observation with respect to the role played by these brain regions in the operation of abstract cognitive maps, [26] found that lesions to the hippocampus significantly impaired performance on a task of associating pairs of odors by how similar they smelled. Finally, complementing these findings, rat studies have shown that hippocampal cells can code for components in navigation tasks in auditory [27,28], olfactory [29], and visual [30] task spaces.
Taken as a whole, the above body of research provides good evidence for the following ideas: Firstly, that cognitive maps exist in humans. Secondly, that these maps can be, and are, used for solving problems in a general class of task spaces. Thirdly, that hippocampal and entorhinal cells likely play a key role in the construction and operation of these maps.

Feed-Forward and Recurrent Connections
As described in Section 2.1, the proposed model is built around a particular theme of connectivity: Each neuron represents a certain pattern in sensory perception mediated via feed-forward connections. Such a pattern could be, for example, all the percepts associated with a particular position in a maze, a certain body posture, or some letter in the field of vision, cf. Figure 1.
In addition, recurrent connections between two neurons strengthen whenever they are activated simultaneously. In the following, we give an overview of some relevant experimental observations which are consistent with this mode of connectivity.
The most prominent example of neurons which are often interpreted as pattern detectors are the cells in primary visual cortex. These neurons fire when a certain pattern (like a small edge of bright/dark contrast) is perceived at a particular position and orientation in the visual field. On the one hand, these neurons receive their feed-forward input from the lateral geniculate nucleus. On the other hand, they are connected to each other through a tight network of recurrent connections. Several studies (see e. g. [31,32,33]) have shown that two such cells are preferentially connected when their receptive fields are co-oriented and co-axially aligned. Due to the statistical properties of natural images, where elongated edges appear frequently, such two cells can also be expected to be positively correlated in their firing due to feed-forward activation.
Similar statements are valid for auditory cortex: Neurons in primary auditory cortex receive feed-forward input from thalamocortical connections as well as intracortical signals via recurrent connections. The feed-forward input is tonotopically organized and A1 neurons typically respond to one or several characteristic frequencies. There is evidence for a cross-frequency integration via intracortical input: For example, neurons in A1 show subthreshold responses to frequency ranges broader than can be accounted for by their thalamic inputs [34] while the latency of their response is shortest at their characteristic frequency [35]. Additional supporting facts are reviewed in the introduction of [36]. By analogy with the visual cortex, one might expect that intracortical connections are strongest between neurons if their characteristic frequencies differ by a harmonic interval, e. g. by a full octave, since such intervals are most highly correlated in the frequency spectra of natural sounds [37,38]. While we are not aware of any study examining this particular conjecture, there is a lot of evidence that harmonics play a major role in the organization of the auditory cortex in general [39].
The somatosensory cortex is another brain region where several empirical findings are in line with the postulated theme of connectivity. Area 3b in the somatosensory cortex contains neurons which respond to tactile stimuli. Their receptive fields are not dissimilar to those of cells in V1. Experiments on non-human primates suggest that "3b neurons act as local spatiotemporal filters that are maximally excited by the presence of particular stimulus features" [40].
Regarding the recurrent connections in somatosensory cortex, some empirical support stems from the well-studied rodent barrel cortex. Here, the animal's facial whiskers are represented somatotopically by the columns of primary somatosensory cortex. Neighboring columns of the barrel cortex are connected via a dense network of recurrent connections. Sensory deprivation studies indicate that the formation of these connections depends on the feed-forward activation of the respective columns: If the whiskers corresponding to one of the columns are trimmed during early post-natal development, the density of recurrent connections with this column is reduced [41,42]. Conversely, synchronous co-activation over the course of a few hours can lead to increased functional connectivity in the primary somatosensory cortex [43].
The primary somatosensory cortex also receives proprioceptive signals from the body which represent individual joint angles. Taken as a whole, these signals characterize the current posture of the animal and there is an obvious analogy to the arm example, cf. Figure 1b.
We are not aware of any experimental results regarding the recurrent connections between proprioception detectors, but it seems reasonable to expect that the results about processing of tactile input in the somatosensory cortex can be extrapolated to the case of proprioception. This would imply that a recurrent network structure roughly similar to Figure 1b should emerge and thus support the model for controlling the arm.
Area 3a of the somatosensory cortex, whose neurons exhibit primarily proprioceptive responses, is also densely connected to the primary motor cortex. It contains many corticomotoneuronal cells which drive motoneurons of the hand in the spinal cord [44]. This tight integration between sensory processing and motor control might be a hint that the hypothetical string-of-a-puppet muscle control mechanism from Section 2.3 is not too far from reality.
In summary, evidence from primary sensory cortical areas seems to suggest a common cortical theme of connectivity in which neurons are tuned to specific patterns in their feed-forward input from other brain regions, while being connected intracortically based on statistical correlations between these patterns.

Wave Phenomena in Neural Tissue
In the model we present, the target state of a cognitive planning task is encoded by localized activation within the cognitive map. Starting from there, neural activation travels through the recurrent network in what resembles expanding wave fronts.
There is a large amount of empirical evidence for different types of wave-like phenomena in neural tissue. We summarize some of the experimental findings, focusing on fast waves (a few tens of cm/s). These waves are suspected to have some unknown computational purpose in the brain [45] and they seem to bear the most resemblance to the waves postulated in the model.
Using multielectrode local field potential recordings, voltage-sensitive dye, and multiunit measurements, traveling cortical waves have been observed in several brain areas, including motor cortex, visual cortex, and non-visual sensory cortices of different species. There is evidence for wave-like propagation of activity both in sub-threshold potentials and in the spatiotemporal firing patterns of spiking neurons [46].
In the motor cortex of awake, behaving monkeys, Rubino et al. [47] observed wave-like propagation of local field potentials. They found correlations between some properties of these wave patterns and the location of the visual target to be reached in the motor task. On the level of individual neurons, Takahashi et al. found a "spatiotemporal spike patterning that closely matches propagating wave activity as measured by LFPs in terms of both its spatial anisotropy and its transmission velocity" [48].
In the visual cortex, a localized visual stimulus elicits traveling waves which traverse the field of vision. For example, Muller et al. have observed such waves rather directly in single-trial voltage-sensitive dye imaging data measured from awake, behaving monkeys [49].
The detailed propagation mechanisms which lead to fast travelling waves in cortical tissue are still under discussion. The prevalent view seems to be that waves are actually propagated through the circuitry of the respective cortical area rather than being the result of spatiotemporally organized activation that stems from some other brain region. Two competing mechanisms for waves [46] are: (1) strictly localized generation of activity followed by monosynaptic propagation through long-range horizontal connections of the superficial cortical layers or (2) a "chain reaction" of firing neurons leading to a self-sustaining spread of activity through the deeper cortical layers. While possibly both mechanisms play a role in the brain, only the second one is incorporated in the model.

Spatial Navigation Using Place Cells
Finding a short path through a maze-like environment, cf. Figure 1a, is one of the planning problems the model is capable of solving. In this case, each neuron of the continuous attractor layer represents a "place cell" which encodes a particular location in the maze.
Place cells were discovered by John O'Keefe and Jonathan Dostrovsky in 1971 in the hippocampus of rats [50]. They are pyramidal cells that are active when an animal is located in a certain area ("place field") of the environment. Place cells are thought to use a mixture of external sensory information and stabilizing internal dynamics to organize their activity: On the one hand, they integrate external environmental cues from different sensory modalities to anchor their activity to the real world. This is evidenced by the fact that their activity is affected by changes in the environment and that it is stable under a removal of a subset of cues [51,52]. On the other hand, firing patterns are then stabilized and maintained by internal network dynamics as cells remain active under conditions of total sensory deprivation [53]. Collectively, the place cells are thought to form a cognitive map of the animal's environment.
In theoretical or computational studies, continuous attractor models are often used to describe place cell dynamics. Just as we do in the present article, it is typically assumed that each place cell responds, on the one hand, to location-specific patterns of sensory cues and, on the other hand, to stimulation via recurrent connections from cells with overlapping place fields.

Targeted Motion Caused by Localized Neuron Stimulation
In our model, the process of motion planning is triggered by stimulating the neurons which represent the body's to-be position, cf. Figure 1b. In the present section, we review some experimental results that support the biological plausibility of this assumption.
In 2002, Graziano et al. reported results from electrical microstimulation experiments in the primary motor and premotor cortex of monkeys [54]. Stimulation of different sites in the cortical tissue for a duration of 500 ms resulted in complex body motions involving many individual muscle commands. The stimulation of one particular site typically led to smooth movements with a certain end state, independent of the initial posture of the monkey, while stimulating a different location in the cortical tissue led to a different end state. In particular, Graziano et al. noted that stimulation at a fixed site can have different effects in terms of low-level muscle commands: For example, a monkey's arm might either stretch or flex to reach a partially flexed position, depending on its initial condition. In terms of the model presented here, this would be explained by two wave fronts propagating in opposite directions away from the to-be location, only one of which hits the localized peak of activity encoding the as-is location and pulls it closer to the to-be state. Graziano et al. also reported that the motions stopped as soon as the electrical stimulus was turned off. This is fully consistent with our model, where stopping the to-be activation means that no more wave fronts are created and thus the as-is peak of activity remains where it is.
After this original discovery by Graziano et al. in 2002, several additional studies have confirmed and extended their results, see [55] for an overview. Similar effects of motor cortex stimulation have been observed in a variety of different primate and rodent species. The results also hold true for different types of neural stimulation: electrical, chemical and optogenetic.
The neural structures which cause the bodily motions towards a specific target state have been named ethological maps or action maps [55].
Furthermore, several studies suggest that such action maps are shaped by experience: Restricting limb movements for thirty days in a rat can cause the action map to deteriorate. A recovery of the map is observed during the weeks after freeing the restrained limb [56]. Conversely, a reversible local deactivation of neural activity in the action map can temporarily disable a grasping action in rats [57]. A permanent lesion in the cortical tissue can disable an action permanently. The animal can re-learn the action, though, and the cortical tissue reorganizes to represent the newly re-learnt action at a different site [58]. These observed plasticity phenomena are fully in line with our model which emphasises a self-organized formation of the cognitive map via Hebbian processes both for the feature learning and for the construction of the recurrent connections.

Participation of the Primary Sensory Cortex in Non-Sensory Tasks
For the first two examples in Figure 1, the association with a planning task is obvious. Our third example, the geometric transformations of the letter "A", may appear a bit more surprising, though: After all, the neural structures in visual sensory cortex would then be involved in "planning tasks". The tissue of at least V1 fits the previously explained theme of connectivity, but it is often thought of as a pure perception mechanism which aggregates optical features in the field of vision and thus performs some kind of preprocessing for the higher cortical areas.
However, there is evidence that the visual sensory cortex plays a much more active role in cognition than pure feature detection on the incoming stream of visual sensory information.
In particular, the visual cortex is active in visual imagery, that is, when a subject with closed eyes mentally imagines a visual stimulus [59]. Experiments suggest that mental imagery leads to activation patterns in the early visual cortex which are composed of the same visual features as during actual sensory perception: Using multi-voxel pattern classification on fMRI measurements of the visual cortex, it is possible to train machine learning models which can accurately decode cortical activation and determine which image in the field of vision has caused the neural response. The same models, trained only on perceptual images, have been used successfully to decode cortical activation caused by purely mental images [60].
Based on such findings, it has been suggested that "the visual cortex is something akin to a 'representational blackboard' that can form representations from either the bottom-up or top-down inputs" [59]. In our model, we take this line of thinking one step further and speculate that the early visual cortex does not only represent visual features, but that it also encodes possible transformations like rotation, scaling or translation via its recurrent connections. In this view, the "blackboard" becomes more of a "magnetic board" on which mental images can be placed and shifted around according to rules which have been learned by experience.
Of course, despite the over-simplifying Figure 1c, we do not intend to imply that there are any neurons in the visual cortex with a complex pattern like the whole letter "A" as a receptive field. In reality, we would expect the letter to be represented in early visual cortex as a spatiotemporal multi-neuron activity pattern. The current version of our model, on the other hand, allows for single-neuron encoding only and thus reserves one neuron for each possible position of the letter. We will discuss this and other limitations of the proposed model in Section 4.

Temporal Dynamics
The concept presented in this article implies predictions about the temporal dynamics of cognitive planning processes which can be compared to experiments: The bump of activity only starts moving when the first wave front arrives. Assuming that every wave front has a similar effect on the bump, its speed of movement should be proportional to the frequency with which waves are emitted. Thus both the time until movement onset and the duration of the whole planning process should be proportional to the length of the traversed path in the cortical map. Increased frequency of wave emission should accelerate the process.
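The proportionality argument above can be written out as a rough worked relation. This is only a sketch under assumptions that are not part of the model description: a constant wave speed $c$ along the cortical map, a wave emission frequency $f$, a fixed displacement $\varepsilon$ of the bump per arriving wave front, and a path of length $L$ between the as-is and to-be positions. All four symbols are introduced here purely for illustration.

```latex
% Assumed symbols: L = path length on the map, c = wave speed,
% f = wave emission frequency, \varepsilon = bump shift per wave front.
\begin{align}
  T_{\text{onset}} &\approx \frac{L}{c}, \\
  v_{\text{bump}}  &\approx \varepsilon f, \\
  T_{\text{total}} &\approx T_{\text{onset}} + \frac{L}{v_{\text{bump}}}
                    = \frac{L}{c} + \frac{L}{\varepsilon f}.
\end{align}
```

Under these assumptions both the onset latency and the total duration scale linearly with $L$, and a larger $f$ shortens only the second term.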
One supporting piece of evidence is provided by mental imagery: Experiments in the 1970s [61,62] have triggered a series of studies on mental rotation tasks, where the time to compare a rotated object with a template has often been found to increase proportionally with the angle of rotation required to align the two objects.
In the case of bodily motions, the total time to complete the cognitive task is not a well suited measure since it strongly depends on mechanical properties of the limbs. Yet for electrical stimulation of the motor cortex (cf. Section 3.5) Graziano et al. report that the speed of evoked arm movements increases with stimulation frequency [63]. Assuming that this frequency determines the rate at which the hypothetical waves of activation are emitted, this is consistent with our model.
In addition, our model makes the specific prediction that the latency between stimulation and the onset of muscle activation should increase with the distance between initial and target posture. We are not aware of any studies having examined this particular relationship yet.

Discussion
The model proposed here is, to the best of our knowledge, the first model that allows for solving graph problems in a biologically plausible way such that the solution (i. e. the specific path) can be calculated directly on the neuronal network as the only computational substrate.
Similar approaches and models have been investigated earlier, especially in the field of neuromorphic computing. For example, in [64][65][66][67][68] graphs are modeled using neurons and synapses, and computations are performed by exciting specific neurons which induces propagation of current in the graph and observing the spiking behavior. Although some models are more general than the one presented here and allow for solving more complex problems like dynamic programs [65], enumeration problems [67] or the longest shortest path problem [68], we are not aware of any model explicitly discussing the biological plausibility. In fact, most of these approaches are far from being biologically plausible as they e. g. require additional artificial memory [65] or a preprocessing step that changes the graph depending on the input data [68]. Also, the model of Muller et al. [64] as well as the very recent model of Aimone et al. [66] which are biologically more plausible do not discuss how a specific path can then be computed in the graph, even if the length of a path can be calculated [66].
Our model has not been created with the intent to explain empirical findings from one particular brain region, mental task or experimental technique in full detail. Rather, we sought to explore ways in which a generic algorithmic framework might solve seemingly very different problems based on more or less the same neural substrate. Working on a relatively high level of abstraction and ignoring most of the domain-specific aspects may not only help our understanding of computational principles employed by the brain but also pave the road to the development of new useful algorithms in artificial intelligence. Nevertheless, it is important to note that many features of our model are in line with experimental results from various areas of brain science and we review those findings in Section 3.
In the following we discuss limitations of the presented model and potential avenues for further research.

Single-Neuron vs. Multi-Neuron Encoding
In our model, each point on a cortical map is represented by a single neuron and a distance on the map is directly encoded in a synaptic strength between two neurons. The graph of synaptic connections can therefore be considered as a coarse-grained version of the underlying manifold of stimuli. Yet such a single-neuron representation is possible only for manifolds of a very low dimension, since the number of points necessary to represent the manifold grows exponentially with each additional dimension. For tasks like bodily movement, where dozens of joints need to be coordinated, the number of neurons required to represent every possible posture in a single-neuron encoding is prohibitive. Therefore, it is desirable to encode manifolds of stimuli in a more economical way, for example by representing each point of the manifold by a certain set of neurons.
It is an open question how distance relationships between such groups of neurons could be encoded and whether the dynamics from our model could be replicated in such a scenario.
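To make the exponential growth of the single-neuron encoding concrete, the following minimal sketch counts the neurons needed when every dimension of the manifold is sampled with the same number of grid points. The function name `neurons_needed` and the example of seven degrees of freedom are illustrative assumptions; the value of 41 points per dimension is borrowed from the 41×41 grid used in the simulations.

```python
def neurons_needed(points_per_dim: int, dims: int) -> int:
    """Number of neurons if each grid point of a dims-dimensional
    manifold is represented by exactly one neuron (assumed uniform grid)."""
    return points_per_dim ** dims


# Illustration only: 41 points per dimension, as in the 41x41 grid.
for dims in (2, 3, 7):  # 7 degrees of freedom is a hypothetical example
    print(dims, neurons_needed(41, dims))
```

With 41 points per dimension, two dimensions require 1681 neurons, whereas seven dimensions already require roughly 1.9 × 10^11.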

Wave Propagation and Continuous Attractor Layers
Certain design choices made in the numerical implementation should be discussed regarding their biological plausibility and possible alternative mechanisms.
If the wave propagation layer and the continuous attractor layer were to form organically as two separate sub-circuits in a real biological system, each of their neurons would need to act as a feature detector, since otherwise it is not clear how the right structure of recurrent connections could develop. Then each feature will be represented by two detectors, one in each layer, and there must be some unknown mechanism which establishes a link between every pair of corresponding feature detectors.
Moreover, the split of the model dynamics into two layers leads to a somewhat artificial implementation of the interaction between them: As described in Section 2.4, we compute the direction into which the activation bump in the continuous attractor layer is shifted whenever a wave front arrives at the corresponding location in the wave propagation layer. The details of this mechanism do not appear to be biologically plausible and we would rather expect that the bump is moved only by the aggregated effects of local interactions between synaptically connected neurons. This interaction could be mediated by the connections between corresponding feature detectors which we postulated above.
Alternatively, an elegant and biologically plausible model could be obtained by merging the wave propagation and continuous attractor dynamics into a single layer of neurons. In such a single-layer model, the whole network can form in a self-organized way: First, the feed-forward connections are generated via a process of competitive Hebbian learning, leading to a network of individual feature detectors. In a second step, these detectors establish recurrent connections among each other, again driven by Hebbian learning, to create the graph structure required in our model.
In the single-layer scenario, the model must allow for continuous attractor dynamics and wave-like expansion of activity simultaneously. We speculate that this is possible in principle, for example by imposing a time delay on the effect of inhibition, which appears biologically plausible considering that it is mediated in an extra step via inhibitory interneurons. The time delay of inhibition has only minimal effect on the quasi-static peak of activity and thus conserves the landscape of continuous attractors from the two-layer scenario. On the other hand, the time delay allows activation patterns with strong temporal fluctuations to emit waves of activity before inhibition has any effect.
The interaction between the waves and the localized peak of activity could potentially shift the peak in the direction of the incoming waves without the need to impose any artificial assumptions on the model: The incoming waves are annihilated by the peak's "trench of inhibition", but they also increase the level of activation of the neurons on the side of the bump which is hit by the wave. Due to the attractor dynamics of the network the bump recovers from this deformation, but in the process it changes its location slightly towards the direction of the incoming wave.
Realizing the effects described above will require a very careful numerical set-up of the model and tuning of its parameters. We consider this an interesting direction for future research since the potential outcome is a rather elegant model with a high degree of biological plausibility.

Embedding into a Bigger Picture
While the model focuses on the solution of graph traversal problems, it appears desirable to embed it into a broader context of sensory perception, decision making, and motion control in the brain. One particular question is how the hypothetical "puppet string mechanism", which we postulated to connect proprioception and motion control, could be implemented in a neural substrate. Similarly, if our model provides an appropriate description of place cells and their role in navigation, the question arises how a shift in place cell activity is translated into appropriate muscle commands to propel the animal in the corresponding direction.
It is intriguing to speculate about a deeper connection between our model and object recognition: On the same neural substrate, our hypothetical waves might travel through a space of possible transformations, starting from a perceived stimulus and "searching" for a previously learned representative of the same class of objects. This could explain why recognition of rotated objects is much faster than the corresponding mental rotation task [69]: The former would require only one wave to travel through the cognitive map, while the latter would require many waves to move the bump of activity.
And finally, an open question is the connection between the model and the hypothetical executive brain functions which are assumed to define the target state for the graph traversal problem and activate the corresponding neurons.

Conclusion
We have shown that a wide range of cognitive tasks, especially those that involve planning, can be represented as graph problems. To this end, we have detailed one possible role for the recurrent connections that exist throughout the brain as computational substrate for solving graph traversal problems. We showed in which way such problems can be modeled as finding a short path from a start node to some target node in a graph that maps to a manifold representing a relevant task space. Our review of empirical evidence indicates that a theme of connectivity can be observed in the neural structure throughout (at least) the neocortex which is well suited to realize the proposed model.
We constructed a two-layer neural network consisting of a layer of neurons that implemented a continuous attractor sheet modelled after neurons found in the human hippocampus and entorhinal cortex. This sheet of neurons enacts a "bump" of activity centered on the neuron representing the start node s in the graph. As a second step we implemented a sheet of spiking neurons that generated a wave of activation across the same sheet starting from an individual neuron that represented the target node t in the graph. Finally, we implemented an interaction algorithm which caused the bump of activation in the continuous attractor layer to move in the direction of the wave front as it reached the region of activation in the continuous attractor layer. We found that this model was successfully able to move the activation bump in the continuous attractor layer through the sheet of neurons to the location that mapped to the target node t in the graph, thus solving the graph traversal problem. We found further that the model was robust to large changes in the graph structure. Specifically, we showed that if large portions of the graph are made inaccessible and the relevant neurons in the model are zeroed out, the model is still able to guide the activation bump from the start node s to the target node t successfully.
Despite its relatively small scale we believe that models such as ours may provide a starting point in understanding how brains are able to exhibit flexible behaviour with respect to different kinds of cognitive tasks. Apparently, the stereotypical theme of connectivity encountered across the neocortex allows the brain to create a model of its environment based on sensory perception. Once established, the neural structure can be used as a "planning board" to support different cognitive tasks.
Next to a deeper understanding of the brain, we believe that models like ours can be an inspiration for new algorithms of artificial intelligence (AI). Artificial neural networks used in technical applications today are typically input-driven, they rely on feed-forward processing of information through several layers of neurons, and they are trained via supervised learning. The brain, however, continuously integrates sensory input into its own dynamics, its connectivity structure is mostly recurrent, and learning happens to a large extent in an unsupervised way. In all three of these fundamental differences, our model is aligned more closely to the properties of the brain than those of most other AI algorithms. At the same time, it shows how relevant computational problems can be solved with a very generic approach that relies heavily on self-organization. Potential applications include motion control for robots, especially in scenarios which require a high degree of flexibility and continuous adaptation to changing circumstances. As an additional interesting feature, since the model is based on artificial spiking neurons, it can potentially be implemented very efficiently in neuromorphic computer hardware.

Methods and Experiments
Connection to Mathematical Graph Traversal Problems
As the model described in Section 2 uses a neural network to solve planning problems in the cognitive map, it is natural to interpret this network as a graph consisting of nodes representing the neurons and edges representing their synaptic connections. Thus, the planning problem in the network translates into a graph traversal problem in the corresponding graph.
In the following, we hence introduce some basic terminology used in the field of graph theory.
We refrain from giving too many details and references, as most of the standard formalism can be found in classical books on mathematical optimization. In particular, we refer to [70,71] for references, details, proofs and further discussions.
A graph G is a pair G = (V, E), where V is a finite set of nodes and E is the set of edges, where each edge is a set of two nodes. For two nodes s, t ∈ V, an s-t-path is a sequence of nodes starting at s and ending at the target node t such that any two consecutive nodes along the path are connected by an edge. A node t ∈ V is reachable from another node s ∈ V if there exists an s-t-path in G. An example of a graph with 15 nodes and different paths is given in Figure 6. In the following, we let G = (V, E) be a fixed graph. For simplicity, we assume that every node is reachable from every other node.
We are interested in finding a path between two given nodes s, t ∈ V in G. The idea is that the node t represents the neuron encoding the to-be state and s represents the neuron encoding the as-is state of the underlying planning problem. To formalize this problem, we denote by Path(s, t) the problem of finding a path from s to t for given nodes s, t.
Even though this problem technically only asks for finding some path from s to t, shorter paths that use as few connections as possible are superior to longer paths using more connections. The reason is that the fewer connections a path has, the fewer intermediate states are traversed in the planning problem. When considering the previous example of grabbing a cup of coffee, a possible solution could be to move the arm around the head before actually reaching towards the coffee cup. This is not the movement that would be performed in actual behavior. However, we are similarly not obliged to find the shortest possible path. Considering the previous example again, a shortest path would reflect a movement with as few intermediate positions as possible. This might correspond to stretching the arm in such a way that the cup can barely be reached and might yield an unrealistic behavior. Thus, in summary, our goal is to find reasonably short paths that do not necessarily need to be shortest paths.
The Path(s, t) problem is a well-investigated problem in computer science and mathematics. With BFS and DFS, two standard path finding algorithms from computer science are described in Section 5. There, we also argue why these approaches cannot directly be applied to our scenario due to the fact that the graphs we consider represent neural networks in which algorithms have to be performed in a biologically plausible way.

Mathematical Background and Solving the Path Problem in Typical Graphs
In the following, we consider how the Path(s, t) problem can be solved in general graphs that do not represent neural networks. We later discuss what problems occur when trying to adapt these algorithms to such graphs when using the neurons as computational substrate. In all of the following, we omit technical details and proofs and instead refer to [70,71] again.
Consider some fixed graph G = (V, E) and two nodes s, t ∈ V. For simplicity, we assume that there is at least one path between any pair of nodes. The most basic class of algorithms that can be used to solve the Path(s, t) problem is the class of graph search algorithms. Two of the most prominent examples of graph search algorithms are Breadth-First-Search (BFS) and Depth-First-Search (DFS).
Both algorithms start at the starting node s and traverse the graph iteratively by following its edges. Intuitively, DFS(s) tries to follow a single path starting in s for as long as possible, only returning to a previously considered node and starting a "new" path if it is strictly necessary. In contrast to this, BFS(s) tries to always visit a node as close as possible to the starting node s next. A visualization of the results of these two algorithms applied to the same graph starting at node a is given in Figure 7. By remembering which nodes are already completely explored, both of these algorithms find all nodes reachable from the starting node s. In particular, the algorithms are typically implemented in a way that the traversed paths can easily be recovered from the data produced by the algorithms. As both algorithms are very similar, they can be implemented as a realization of a general scheme for finding paths in a graph. This scheme is given in Algorithm 1. It uses a generic data structure D that only has to allow for the two basic operations of inserting nodes into it and removing nodes from it. In each step, the algorithm extracts a node u from D and checks for unvisited nodes among all nodes which are connected to u by an edge. For each such node w, the algorithm inserts w into the data structure D and remembers that the node w was reached from u by marking u as the parent of w. To avoid visiting vertices more than once, the node w is then also marked as visited.
After performing this step for each such node, the node u is completely explored and it is not necessary to consider it again.
Depending on the specific data structure that is chosen for D, this then yields either the BFS or the DFS algorithm. More precisely, if D is chosen as a queue that inserts and removes nodes first-in-first-out, then Algorithm 1 yields the BFS algorithm. If D is chosen as a stack that inserts and removes nodes last-in-first-out, then one obtains the DFS algorithm. Both variants of this algorithmic scheme can solve the Path(s, t) problem. However, as mentioned in Section 2, we want to find a short path from s to t. This is guaranteed if we use the BFS(s) algorithm as this algorithm always finds shortest paths with respect to the number of edges. We later argue why this result implies that we are able to find short paths in the neural network representing the manifold of stimuli, even if we cannot guarantee that they are shortest paths.
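The generic scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not a reproduction of Algorithm 1 itself: the function name `graph_search`, the `mode` argument, the adjacency-dictionary representation, and the small example graph are all assumptions made here. The data structure D is a `collections.deque` used either as a queue (BFS) or as a stack (DFS), and the parent dictionary doubles as the "visited" marker.

```python
from collections import deque


def graph_search(adj, s, t, mode="bfs"):
    """Generic search from s to t; D acts as a queue (BFS) or stack (DFS)."""
    D = deque([s])
    parent = {s: None}  # a node counts as visited once it has a parent entry
    while D:
        # FIFO removal gives BFS, LIFO removal gives DFS.
        u = D.popleft() if mode == "bfs" else D.pop()
        if u == t:
            break
        for w in adj[u]:
            if w not in parent:   # unvisited neighbour of u
                parent[w] = u     # remember that w was reached from u
                D.append(w)
    if t not in parent:
        return None               # t is not reachable from s
    # Recover the path by following parent pointers back from t to s.
    path, node = [], t
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Hypothetical example graph (undirected, given as adjacency lists).
adj = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
print(graph_search(adj, "a", "e", mode="bfs"))
print(graph_search(adj, "a", "e", mode="dfs"))
```

Both variants need the container D and the parent dictionary as global bookkeeping, which is exactly the part that has no obvious counterpart in a network of locally communicating neurons.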
We now discuss why it is not biologically plausible that graph traversal problems in the brain are solved by exactly one of these algorithms. The main obstacle is that Algorithm 1 requires the data structure D to organize the nodes that still have to be considered, as well as a mechanism to remember which nodes have already been visited. Especially the data structure D, which might have to store a large number of nodes and is in some sense "global", cannot be implemented in the brain in the way it can be implemented in a computer. The reason is that individual neurons in a neural network can only access local information or information that was just sent to them by a pre-synaptic neuron: they are only able to communicate with their synaptic neighbors via sending and receiving electric current.
As discussed in Section 2.6, our network configuration yields a wave propagation that behaves like a "parallelized" version of BFS where a set of nodes can be visited simultaneously. This also explains how using a wave propagation algorithm can find short paths, but not necessarily shortest paths: A neuron potentially receives current from more than one neuron, hence it is not possible to uniquely retrace the path to the starting node. However, as wave propagation behaves like a parallelized BFS algorithm, the paths that can be obtained via backtracking will never be too long. Although this behavior has some similarities with other well-understood graph problems like virus propagation [72][73][74] or diffusion processes [75] in networks, the respective theories are not directly applicable to our specific scenario.
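The analogy between wave propagation and a parallelized BFS can be illustrated with a strongly idealized sketch. It assumes synchronous discrete time steps, a unit delay per edge, and a 4-neighbour grid with some nodes removed; none of this reflects the spiking dynamics of the actual model, and the function names `grid_graph`, `wave_arrival_times`, and `trace_back` are hypothetical. In this idealized setting the retraced path happens to be a shortest one, which the spiking network does not guarantee.

```python
def grid_graph(width, height, blocked=()):
    """4-neighbour grid graph with the given nodes removed (assumed layout)."""
    blocked = set(blocked)
    nodes = {(x, y) for x in range(width) for y in range(height)} - blocked
    adj = {n: [] for n in nodes}
    for (x, y) in nodes:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            m = (x + dx, y + dy)
            if m in nodes:
                adj[(x, y)].append(m)
    return adj


def wave_arrival_times(adj, source):
    """All nodes on the wave front activate their neighbours simultaneously."""
    arrival = {source: 0}
    frontier = {source}
    step = 0
    while frontier:
        step += 1
        # Every frontier node excites its not-yet-reached neighbours in parallel.
        frontier = {w for u in frontier for w in adj[u] if w not in arrival}
        for w in frontier:
            arrival[w] = step
    return arrival


def trace_back(adj, arrival, start):
    """Move from start towards the wave source along decreasing arrival time."""
    path = [start]
    while arrival[path[-1]] > 0:
        u = path[-1]
        path.append(min(adj[u], key=lambda w: arrival.get(w, float("inf"))))
    return path


# Hypothetical maze: a wall at x = 2 with a single gap at y = 4.
adj = grid_graph(5, 5, blocked=[(2, y) for y in range(4)])
s, t = (0, 0), (4, 0)
arrival = wave_arrival_times(adj, t)   # the wave starts at the target t
print(trace_back(adj, arrival, s))     # path from s back to the wave origin t
```

Note that each node only uses information about its direct neighbours, so no global container such as D is required.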

Neuronal Network Setup -Exemplary Implementation of the Model
Splitting Dynamics to Two Network Layers
As described in Section 2.2, for our numerical implementation of the model, we separated the two different types of dynamics into distinct layers of neurons, the continuous attractor layer and the wave propagation layer. The split into two layers makes the model more transparent and ensures that parameter changes have limited and traceable effects on the over-all dynamics. As an additional simplification, we do not explicitly model the feed-forward connections which drive the wave propagation layer, but we rather directly activate certain neurons in this layer.
Activation in the continuous attractor layer C represents the start node s, which in the course of the simulation will move towards the target node t, which is permanently stimulated in the wave propagation layer P. Waves of activation are travelling from t across P. As soon as the wave front reaches a node in P that is connected to a node in proximity to the current activation in C, the activation in C is moved towards it. Thus, every arriving wave front will pull the activation in C closer to t, forcing the activation to trace back the wave propagation to its origin t.
In detail, these dynamics require a very specific network configuration which is described in the following. Figure 8 contains a general overview of the intra- and inter-layer connectivity used in the model and our simulations.

Spiking Neuron Model in the Wave Propagation Layer
In the performed experiments, the wave propagation layer P is constructed with an identical number of excitatory and inhibitory Izhikevich neurons [14,15] that cover a regular quadratic grid of 41×41 points on the manifold of stimuli. The spiking behavior of each artificial neuron is modeled as a function of its membrane potential dynamics v(t) using the two coupled ordinary differential equations
$$\frac{dv}{dt} = 0.04\,v^2 + 5\,v + 140 - u + I, \qquad (1)$$
$$\frac{du}{dt} = a\,(b\,v - u). \qquad (2)$$
Here, v is the membrane potential in mV, u an internal recovery variable, and I represents synaptic or DC input current. The internal parameters a (scale of u / recovery speed) and b (sensitivity of u to fluctuations in v) are dimensionless. Time t is measured in ms. If the membrane potential reaches the threshold, v ≥ 30 mV, the neuron spikes and the variables are reset as follows:
$$v \leftarrow c, \qquad (3)$$
$$u \leftarrow u + d. \qquad (4)$$
Again, c (after-spike reset value of v) and d (after-spike offset value of u) are dimensionless internal parameters.
Figure 8: [...] representation. In the wave propagation layer, excitatory synapses are drawn as solid arrows, dashed arrows indicate inhibitory synapses. Upon its activation, the central excitatory neuron stimulates a ring of inhibitory neurons that in turn suppress circles of excitatory neurons to prevent an avalanche of activation and support a circular wave-like expansion of the activation across the sheet of excitatory neurons. Furthermore, overlap between the active neurons in C and P is used to compute the direction vector ∆(t) used for biasing synapses in C and thus shifting activity there.
If not stated otherwise in the following, the parameters listed in Table 1a were used for the spiking neuron model in P . They correspond to regular spiking (RS) excitatory and fast spiking (FS) inhibitory neurons. In contrast to [14], neuron properties were not randomized to allow for reproducible analyses. The effect of a more biologically plausible heterogeneous neuron property and synaptic strength distribution is analyzed in Section 5. Compared to [14], the coupling strength in P is large to account for the extremely sparse adjacency matrix as every neuron is only connected to its few proximal neighbours in our configuration. Whenever a neuron in P is to be stimulated externally, a DC current of I = 25 is applied to it, cf. Equation (1). As in [14], the simulation time step was fixed to 1 ms with one sub-step in P for numerical stability.
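The neuron dynamics described above can be sketched in a few lines. This is a minimal illustration, not the published implementation: the RS and FS parameter values are the standard ones from the Izhikevich model literature and are assumed here to match Table 1a, the single forward-Euler step per millisecond is a simplification, and the helper names `izhikevich_step` and `simulate` are hypothetical.

```python
import numpy as np

# Assumed standard parameter sets; the values in Table 1a may differ.
RS = dict(a=0.02, b=0.2, c=-65.0, d=8.0)  # regular spiking (excitatory)
FS = dict(a=0.1, b=0.2, c=-65.0, d=2.0)   # fast spiking (inhibitory)


def izhikevich_step(v, u, I, p, dt=1.0):
    """Reset neurons that crossed 30 mV, then advance v and u by dt ms."""
    fired = v >= 30.0
    v = np.where(fired, p["c"], v)          # v <- c
    u = np.where(fired, u + p["d"], u)      # u <- u + d
    dv = 0.04 * v**2 + 5.0 * v + 140.0 - u + I
    du = p["a"] * (p["b"] * v - u)
    return v + dt * dv, u + dt * du, fired


def simulate(p, I=25.0, steps=200):
    """Count spikes of one neuron driven by a constant DC current I."""
    v = np.array([p["c"]])
    u = p["b"] * v
    spikes = 0
    for _ in range(steps):
        v, u, fired = izhikevich_step(v, u, I, p)
        spikes += int(fired[0])
    return spikes


if __name__ == "__main__":
    print("RS spikes in 200 ms:", simulate(RS))
    print("FS spikes in 200 ms:", simulate(FS))
```

With the DC current of I = 25 used for external stimulation, both parameter sets produce repeated spiking in this sketch, which is the behavior needed to emit a train of wave fronts from the target neuron.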
Synaptic Connections in the Wave Propagation Layer
As described before, neurons in P correspond to reachable locations in the manifold of stimuli. Thus, it is plausible to assume that neurons representing nearby locations in a suitable metric on the respective manifold will also be closely connected. Assuming that neurons do not have a strictly defined region of responsibility but overlap to some extent, this is consistent with a Hebbian learning approach: neurons that are sensitive to nearby regions will often fire at the same instant in time, strengthening their mutual connectivity.
As depicted in Figure 8, the excitatory neurons drive nearby excitatory and inhibitory neurons with a distance-dependent synaptic strength s e→e (d), Equation (5); s e→i (d) is defined analogously. Here, d is the distance between nodes in the manifold of stimuli. For simplicity, we model this manifold as a two-dimensional quadratic mesh with grid spacing δ = 1 where some connections might be missing. The choice s ∝ 1 /d was made to represent the assumption that recurrent coupling will be strongest to nearest neighbours and will decay with distance. Note that (5) in particular implies that we have s e→e (0), s e→i (0) = 0, which prevents self-excitation. To restrict the model to localized interaction, we exclude interaction beyond a predefined excitation range d e and inhibition range d i , respectively. Values of the parameters in the expressions for the synaptic strengths used in the simulations are given in Table 1c.
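The properties stated for the excitatory coupling (proportional to 1/d, zero at d = 0, zero beyond a cut-off range) can be illustrated with a small sketch. The prefactor `s0`, the inclusive cut-off `d <= d_max`, and the function names are assumptions made for illustration; the actual values and exact form are those of Equation (5) and Table 1c.

```python
import numpy as np


def strength(d, s0, d_max):
    """Coupling s0 / d for 0 < d <= d_max, zero at d = 0 and beyond d_max."""
    d = np.asarray(d, dtype=float)
    out = np.zeros_like(d)
    mask = (d > 0) & (d <= d_max)
    out[mask] = s0 / d[mask]
    return out


def coupling_matrix(n, s0, d_max):
    """Dense (n*n) x (n*n) coupling matrix on an n x n grid with spacing 1."""
    ys, xs = np.mgrid[0:n, 0:n]
    pos = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    return strength(dist, s0, d_max)


if __name__ == "__main__":
    # Hypothetical values: prefactor 2.0 and range 1.5 (includes diagonals).
    W = coupling_matrix(5, s0=2.0, d_max=1.5)
    print(W[0, :7])
```

A dense matrix is used only for readability; for the sparse nearest-neighbour connectivity described here, a sparse representation would be the more natural choice at larger grid sizes.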
The inhibitory neurons suppress activation of the excitatory neurons by reducing their input current via the synaptic strength s i→e (d), which acts within the inhibition range d i .

Wave Propagation Dynamics
The described setup allows for wave-like expansion of neuronal activity from an externally driven excitatory neuron as shown in Figure 9.
If the activity of the excitatory neurons grows too much in a region, the respective inhibitory neurons will start spiking to eventually suppress activity locally. This suppression happens with a delay of two time steps due to the causal signal travelling time through s e→e and s e→i , but could also be implemented via different synaptic time constants, i. e. AMPA (excitatory) vs. GABA A (inhibitory). Thus, the inhibitory neurons prevent avalanche-like activity by turning off active excitatory neurons.

Figure 9 (panels: Excitatory Firing Pattern, Inhibitory Firing Pattern): Activity patterns of the excitatory and inhibitory neurons on a 101 × 101 quadratic neuron grid. Spiking neurons are shown as gray areas. One excitatory neuron at the grid center (arrow) is driven by an external DC current to regular spiking activity. Due to the nearest-neighbour connections, this activity is propagating in patterns that resemble a circular wave structure. The inhibitory neurons prevent catastrophic avalanche-like dynamics by suppressing highly active regions. The specific pattern shape is an artifact of the underlying regular grid structure and thus not perfectly circular. This could be alleviated using, e. g. a hexagonal instead of a quadratic mesh of neurons.
As can be seen in Figure 9, this effectively means that propagating signals in the excitatory sub-network are followed by similarly shaped propagating signals in the inhibitory sub-network. In this respect, signal propagation does not behave like physical waves, such as ripples on water: the signals do not interfere constructively and destructively to form interference patterns. Instead, activity stops where two propagating signals touch, as shown in Figure 10. This is an important property in our setup as it ensures that signals do not run through each other in the wave propagation layer but mutually annihilate. Thus, the wave fronts tend to form stable and continuous patterns, and activation of the continuous attractor layer from different directions is vastly reduced.

Figure 10 (panels: t = 10 ms, t = 15 ms, t = 18 ms): Activity patterns of the excitatory neuron grid where two neurons are driven to periodic spiking activity (arrows) at different instants in time. Again, spiking neurons are shown as gray areas and neuronal connections are set up as described in Section 5. As soon as the signal propagation fronts touch, they annihilate each other due to the inhibitory activity that accompanies them. Instead of forming interference patterns or travelling through each other, the remaining wave fronts merge and continue propagating as a well-defined line of activity.
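The annihilation of colliding fronts can be illustrated with a much simpler abstraction than the spiking network. The following sketch uses a three-state excitable cellular automaton (of Greenberg-Hastings type) on a one-dimensional line, which is explicitly not the model used in this work: the refractory state merely stands in for the trailing inhibitory activity. All names are hypothetical.

```python
import numpy as np

REST, EXCITED, REFRACTORY = 0, 1, 2


def step(state):
    """One update of a three-state excitable automaton on a line."""
    excited = state == EXCITED
    neighbour_excited = np.zeros_like(excited)
    neighbour_excited[1:] |= excited[:-1]   # excited left neighbour
    neighbour_excited[:-1] |= excited[1:]   # excited right neighbour
    new = np.full_like(state, REST)         # refractory cells return to rest
    new[excited] = REFRACTORY               # excited cells become refractory
    new[(state == REST) & neighbour_excited] = EXCITED
    return new


def run(n_cells=21, n_steps=15):
    """Start one pulse at each end and record the state after every step."""
    state = np.zeros(n_cells, dtype=int)
    state[0] = state[-1] = EXCITED
    history = [state]
    for _ in range(n_steps):
        state = step(state)
        history.append(state)
    return history


if __name__ == "__main__":
    for t, s in enumerate(run()):
        print(f"t={t:2d} ", "".join(".*-"[x] for x in s))
```

In this toy setting the two fronts meet in the middle of the line and vanish, because each front runs into the refractory tail of the other. Linear waves would instead pass through each other and continue to the opposite ends.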
With the capability of propagating signals as circular waves from the target neuron t across the manifold of stimuli in P , it is now necessary to set up a representation of the start neuron s in C. This will be done in the following subsection before the coupling between P and C will be described.

Neuron Model for Place Cell Dynamics
The continuous attractor layer C implements a sheet of neurons that models the functionality of a network of place cells in the human hippocampus using rate-coding neurons [11,12] and thus represents the manifold of stimuli. As for the wave propagation layer, we use a quadratic 41 × 41 grid of neurons for this layer. Activation in the continuous attractor layer appears as a bump, the center of which represents the most likely current location on the manifold of stimuli.
This bump of activation is used to represent the current position in the graph of synaptic connections representing the cognitive map. Planning in the manifold of stimuli thus amounts to moving the bump through the sheet of neurons, where each neuron can be thought of as one node in this graph. In the robot arm example in Figure 1b, for instance, the place cell bump represents the current state of the system, i. e. the current angles of the arm's two degrees of freedom. As the bump moves through the continuous attractor layer, and thus through the graph, the robot arm alters its configuration, creating a movement trajectory through the 2D space.
Synaptic Connectivity to Realize Continuous Attractor Dynamics
Our methodology for modelling the continuous attractor place cell dynamics adapts the computational approach used in [13]. It includes an explicit treatment of the synaptic connections between continuous attractor neurons and an associated update rule that depends on information from the wave propagation layer P .
Symbol  Description                   Value
σ       Gaussian width                0.03
T       Gaussian shift                0.05
J       Synaptic connection strength  12
τ       Stabilization strength        0.8
Table 2: Parameters for the continuous attractor layer C.
The synaptic weight function connecting each neuron in the continuous attractor sheet to each other neuron is given by a weighted Gaussian. This produces excitation that decays with distance for cells in the immediate neighbourhood of a given neuron and simultaneous inhibition of neurons that are further away, thus giving rise to the bump-shaped activity in the sheet itself. The mathematical form of these synaptic connections also allows the locus of activation in the sheet to be shifted in a given direction, which in turn is how the graph implemented by this neuron sheet is traversed.
The synaptic weights w i, j connecting a neuron at position i = (i x , i y ) to a neuron at position j = (j x , j y ) form a matrix in R (Nx×Ny)×(Nx×Ny) and are given by Equation (7). Here, J determines the strength of the synaptic connections, ‖·‖ is the Euclidean norm, σ modulates the width of the Gaussian, T shifts the Gaussian by a fixed amount, ∆(t) is a direction vector which we discuss in detail later, and N x and N y give the size of the two dimensions of the sheet.
In order to update the activation of the continuous attractor neurons and to subsequently move the bump of activation across the neuron sheet, we compute the activation A j of the continuous attractor neuron j at time t + 1 using an update rule in which B j (t + 1) is a transfer function that accumulates the incoming current from all neurons to neuron j and τ is a fixed parameter that determines stabilization towards a floating average activity.
Simulation parameters for the continuous attractor layer C are given in Table 2. They have been manually tuned to ensure development of stable, Gaussian shaped activity with an effective diameter of approximately twelve neurons in C.
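To make the roles of J, σ, T, τ and ∆(t) concrete, the following sketch implements one plausible form of a shifted Gaussian weight matrix and a normalizing update rule. Several details are assumptions rather than statements about the model: positions are normalized by the grid size, the Gaussian is written as exp(-‖·‖²/σ²), the sign convention for ∆(t) is arbitrary, and the update divides by the mean of B as one way to stabilize towards a floating average. The function names are hypothetical.

```python
import numpy as np


def weights(n, J=12.0, sigma=0.03, T=0.05, delta=(0.0, 0.0)):
    """Shifted Gaussian weights between all pairs of neurons on an n x n sheet.

    Assumed form: w_ij = J * exp(-||x_i - x_j + delta||^2 / sigma^2) - T,
    with positions x normalized to the unit square.
    """
    ys, xs = np.mgrid[0:n, 0:n]
    pos = np.column_stack([xs.ravel(), ys.ravel()]) / n
    diff = pos[:, None, :] - pos[None, :, :] + np.asarray(delta, dtype=float)
    return J * np.exp(-np.sum(diff**2, axis=-1) / sigma**2) - T


def update(A, W, tau=0.8):
    """One activation update with stabilization towards the mean input."""
    B = A @ W                          # B_j = sum_i A_i * w_ij
    A_new = B + tau * (B / B.mean() - B)
    return np.maximum(A_new, 0.0)      # activations are kept non-negative


if __name__ == "__main__":
    n = 8                              # small sheet to keep the example light
    W = weights(n)
    A = update(np.ones(n * n), W)
    print(A.reshape(n, n).round(3))
```

A non-zero `delta` makes the weight profile asymmetric, so that repeated updates displace the bump; whether the displacement follows or opposes `delta` depends on the sign convention chosen above.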
As in [13], a direction vector ∆(t) ∈ R 2 has been introduced in Equation (7). It has the effect of shifting the synaptic weights in a particular direction which in turn causes the location of the activation bump in the attractor layer to shift to a neighbouring neuron. In other words, it is this direction vector that allows the graph to be traversed by informing the place cell sheet from which direction the wave front is coming in P . Thus all that remains for the completion of the necessary computations is to compute ∆(t) as a function of the propagating wave and the continuous attractor position.
Layer Interaction - Direction Vector
The interaction between the wave propagation layer P and the continuous attractor layer C is mediated via the direction vector ∆(t). The direction vector is computed such that it points from the center of the bump of activity towards the center of the overlap between bump and incoming wave as follows. Let C t and P t denote the sets of positions of active neurons at time t in layer C and P , respectively. Note that each possible position corresponds to exactly one neuron in the wave propagation layer and exactly one neuron in the continuous attractor layer as they have the same spatial resolution in the implementation. Now let A t := C t ∩ P t . Then,

$$\mathrm{mean}(A_t) := \frac{1}{|A_t|} \sum_{a \in A_t} a$$

is the average position of overlap. The direction vector ∆(t) is computed from the current position p t of the central neuron in the continuous attractor layer activation bump to mean (A t ).

Layer Interaction - Recovery Period
In order to prevent the wave from interacting with the back side of the bump in C, we introduce a recovery period R of a few time steps after moving the bump. During R, which is selected as the ratio of bump size to wave propagation speed, A t is assumed to be empty, which prevents any further movement. In our experiments, we used R = 12 ms.
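The layer interaction described above can be sketched as follows. The sketch assumes that the direction vector is the plain difference between the mean overlap position and the bump center (any scaling or normalization used in the model is not reproduced here) and that an empty overlap yields a zero vector. `direction_vector` and `RecoveryGate` are hypothetical names.

```python
import numpy as np


def direction_vector(C_t, P_t, p_t):
    """Vector from bump center p_t to the mean of the overlap A_t = C_t & P_t."""
    A_t = set(C_t) & set(P_t)
    if not A_t:
        return np.zeros(2)             # assumed behavior for an empty overlap
    mean_A = np.mean(np.array(sorted(A_t), dtype=float), axis=0)
    return mean_A - np.asarray(p_t, dtype=float)


class RecoveryGate:
    """Treat the overlap as empty for R time steps after each bump movement."""

    def __init__(self, R=12):
        self.R = R
        self.blocked_until = -1

    def delta(self, t, C_t, P_t, p_t):
        if t < self.blocked_until:
            return np.zeros(2)
        d = direction_vector(C_t, P_t, p_t)
        if np.any(d != 0):
            self.blocked_until = t + self.R
        return d


if __name__ == "__main__":
    bump = {(4, 4), (5, 4), (6, 4), (5, 5), (5, 3)}   # active neurons in C
    wave = {(6, 3), (6, 4), (6, 5)}                   # active neurons in P
    gate = RecoveryGate(R=12)
    for t in (0, 5, 12):
        print(t, gate.delta(t, bump, wave, p_t=(5, 4)))
```

In this example the wave touches the bump on one side only, so the first call returns a vector pointing towards that side, the call during the recovery period returns zero, and movement becomes possible again once R time steps have passed.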
It is worth acknowledging at this point that this approach of connecting the two layers, which we have chosen for reasons of simplicity, is somewhat artificial. We discuss this and other limitations of our implementation in Section 4.2.

Numerical Experiments
In order to test the complex neuronal network configuration described in Sections 2 and 5 and to study its properties and dynamics, we performed numerical experiments using multiple different setups. Source code used for our studies is published at [76]. Results of our simulations are presented in Section 2.5. In the following, we will add some more in-depth analyses on specific properties of the model as observed in the simulations.
Transmission Velocity
In our setup, no synaptic transmission delay, as used e. g. in [77], is implemented. Because of the strong nearest-neighbour connectivity, only a few pre-synaptic spiking neurons are sufficient to raise the membrane potential above threshold, so the waves travel across P with a velocity of approximately one neuronal "ring" per time step, cf. Figure 4. In contrast, the continuous attractor can only move a distance of at most half its width per incoming wave. Accordingly, its velocity is tightly coupled to the spike frequency of the stimulated neuron while still being bounded by the recovery period R. In the specific case of the simulation in Figure 4, a total of nine wave fronts had to traverse the Gaussian continuous attractor activity zone to pull it on a straight line to its destination over a distance of d = 45.25.
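As a rough consistency check, the numbers quoted above can be combined with the effective bump diameter of approximately twelve neurons stated for layer C. This is a back-of-the-envelope estimate that assumes equal displacement per wave front, not a result of the simulations:

```latex
\frac{d}{n_{\text{waves}}} = \frac{45.25}{9} \approx 5.03
\quad\le\quad
\frac{\text{bump diameter}}{2} \approx \frac{12}{2} = 6
```

The average displacement of about five neurons per wave front is thus compatible with the bound of at most half the bump width per incoming wave.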

Obstacles and Complex Setups
In the S-shaped maze (Figure 5a), the continuous attractor activity moves towards the target node t on a direct path around the obstacles. Because the optimal path is more than twice as long as in Figure 4, the time to reach the target is correspondingly longer. This is also in line with the required travel times from s to t in Figures 5b and 5c, where, despite its complexity, a path through the maze is found fastest because it is shorter than in the other cases of Figure 5. This observation is consistent with the fact that our model is a parallelized version of BFS, cf. Sections 2.6 and 5, which is guaranteed to find the shortest path in an unweighted and undirected graph.

Symmetric Paths
An additional interesting observation can be made in the central block setup (Figure 5b): the setup is perfectly symmetric with respect to the two possible paths. Thus, in principle, it cannot be solved with our model. However, after interaction with several wave fronts, a minor shift of the continuous attractor position occurs due to numerical instability. This shift is amplified by subsequent incoming waves, which finally pull the continuous attractor onto a path to the target node t. While such numerical instabilities clearly result from the specific implementation of our model on a computer system, organically grown biological networks will never be perfectly symmetric either. There, natural variations in synaptic connectivity and neuron properties will break potential symmetries, favoring one of the possible paths. In the following, we inspect the influence of these variations on the overall performance of the model.
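The correspondence with BFS can be made explicit with a conventional, sequential implementation: a wave front of hop distances expands from the target, and the start position then steps to whichever neighbour is closest to the target. The 4-neighbourhood, the maze layout, and the function names are assumptions chosen for illustration and do not reproduce the setups of Figure 5.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def bfs_distances(grid, target):
    """Hop distance from target to every reachable free cell ('#' = obstacle)."""
    rows, cols = len(grid), len(grid[0])
    dist = {target: 0}
    queue = deque([target])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and grid[nxt[0]][nxt[1]] != "#" and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return dist


def trace_path(grid, start, target):
    """Follow decreasing distances from start back to the wave origin."""
    dist = bfs_distances(grid, target)
    if start not in dist:
        return None                    # target not reachable from start
    path = [start]
    while path[-1] != target:
        r, c = path[-1]
        candidates = [(r + dr, c + dc) for dr, dc in NEIGHBOURS
                      if (r + dr, c + dc) in dist]
        path.append(min(candidates, key=dist.get))
    return path


if __name__ == "__main__":
    maze = [".....",
            "####.",
            ".....",
            ".####",
            "....."]
    path = trace_path(maze, start=(0, 0), target=(4, 4))
    print(len(path) - 1, "steps:", path)
```

In the network model, the expansion of the distance front is carried out in parallel by the travelling wave, and the descent along decreasing distances corresponds to the bump being pulled towards each arriving wave front.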

Heterogeneous Neuron Properties and Synaptic Strengths
In the simulation experiments described up to now, a homogeneous wave propagation layer P is employed. There, all neurons are subject to the same internal parameters, being either regular spiking excitatory neurons or fast spiking inhibitory neurons. Also, synaptic strengths are strictly set as described previously with parameters from Table 1c. This setup is rather artificial. Natural neuronal networks will exhibit a broad variability in neuron properties and in the strength of synaptic connectivity.
To account for this natural variability, we randomized the internal properties of the individual neurons as suggested in [14], see Table 1b. As in [14], heterogeneity is achieved by randomizing neuron model parameters using random variables r e and r i for each excitatory and inhibitory neuron, respectively. These are uniformly distributed in the interval [0; 1] and vary the neuron models between regular spiking (RS, r e = 0) and chattering (CH, r e = 1) for excitatory neurons and between low-threshold spiking (LTS, r i = 0) and fast spiking (FS, r i = 1) for inhibitory neurons. By squaring r e , the excitatory neuron distribution is biased towards RS. In addition, after initializing synaptic strengths in P , we randomly varied them individually by up to ±50 %.
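The randomization can be sketched as follows. The parameter formulas are those commonly used with the Izhikevich model for mixing RS/CH and LTS/FS neurons and are assumed here to correspond to Table 1b; the uniform multiplicative jitter is one straightforward reading of "up to ±50 %". The function names are hypothetical.

```python
import numpy as np


def heterogeneous_parameters(n_exc, n_inh, rng):
    """Draw per-neuron Izhikevich parameters (assumed to match Table 1b)."""
    r_e = rng.random(n_exc)            # uniform in [0, 1)
    r_i = rng.random(n_inh)
    exc = dict(a=np.full(n_exc, 0.02),
               b=np.full(n_exc, 0.2),
               c=-65.0 + 15.0 * r_e**2,   # r_e = 0: RS, r_e = 1: CH
               d=8.0 - 6.0 * r_e**2)      # squaring biases towards RS
    inh = dict(a=0.02 + 0.08 * r_i,       # r_i = 0: LTS, r_i = 1: FS
               b=0.25 - 0.05 * r_i,
               c=np.full(n_inh, -65.0),
               d=np.full(n_inh, 2.0))
    return exc, inh


def jitter_strengths(W, rng, spread=0.5):
    """Scale every synaptic strength by a factor in [1 - spread, 1 + spread]."""
    return W * (1.0 + rng.uniform(-spread, spread, size=W.shape))


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    exc, inh = heterogeneous_parameters(41 * 41, 41 * 41, rng)
    print("mean excitatory c:", exc["c"].mean())
    print("mean inhibitory a:", inh["a"].mean())
```

Because the jitter is multiplicative, missing connections (zero entries) remain missing, so the overall connection scheme in P is preserved while individual strengths vary.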
Despite this strong modification to the original numerically ideal setup, a structured wave propagation is still possible in P as can be seen in Figure 11. While the stereotypical circular form of the wave fronts dissolves in the simulation, they continue to traverse P completely. As before, they reach the continuous attractor bump and are able to guide it to their origin.

Figure 11 (panels: t = 25 ms, t = 175 ms, t = 325 ms, t = 365 ms, t = 375 ms, t = 430 ms): Block setup as in Figure 5 but with a heterogeneous neuron configuration in P .
Apparently, the overall connection scheme in P is more important for stable wave propagation than homogeneity in the individual synaptic strengths and neuron properties.
An interesting aspect of this simulation when compared to Figure 5b is the apparent capability of solving the graph traversal problem more quickly than with the homogeneous neuronal network. As already indicated, this is an artifact of the explicitly broken symmetry in the heterogeneous configuration: the wave fronts from different directions differ in shape when arriving at the initial position of the continuous attractor layer activity. Thus, one of them is immediately preferred and target-oriented movement of the bump starts earlier than before. This capability of breaking symmetries and thus quickly resolving ambiguous situations is an explicit advantage of the more biologically realistic heterogeneous configuration.