Neural waves and short-term memory in a neural net model

We show that recognizable neural waveforms are reproduced in the model described in previous work. In doing so, we obtain, in closed mathematical form and to good approximation, close matches to certain observed (though filtered) EEG-like measurements. Such neural waves represent the responses of individual networks to external and endogenous inputs and are presumably the carriers of the information used to perform computations in actual brains, which are complexes of interconnected networks. We then apply these findings to a question arising in short-term memory processing in humans. Namely, we show how the anomalously small number of reliable retrievals from short-term memory found in certain trials of the Sternberg task is related to the relative frequencies of the neural waves involved. This finding supports the hypothesis of phase-coding, which has been posited as an explanation of this effect.


Introduction
We continue the quest to understand mechanisms underlying brain-like computation in terms of the model described in earlier work [1,2]. The basic object modeled there is a network of "bicameral" nodes called b-neurons which are two-state qubit-like entities obeying the statistics of Fermi-Dirac, though all vector structures are over the real field and finite dimensional. The (real) spaces of states of such b-networks are the associated Fermi-Dirac spaces (i.e., exterior algebras) generated by the (real, finite dimensional) Hilbert space of states of single b-neurons in the usual way. These spaces of states may be combined in various ways that reflect combinations of the underlying networks. Likewise, maps between these spaces may reflect possible connections between the underlying networks. Such connections include possible non-synaptic, substrate-like connectivities, such as those associated with ambient flows of neurotransmitters, hormones, and interneurons, as well as the usual synaptic ones.

The basic states themselves represent possible groups of simultaneously ON or co-firing b-neurons in the b-network to hand, the general state being a superposition of these basic states. In this context, the act of superposition simply signifies a certain kind of parallelism among the subsets of co-firing b-neurons participating in the superposition. Superposition here does not carry the ontological burden it supports in actual quantum theory. Such states were dubbed firing patterns in [1].
When we speak of a dynamics for a network of this type, we refer to temporal changes in its states of occupancy, i.e., temporal changes in its firing patterns. There are at least two levels of these dynamics for our model(s). The first is the innermost dynamics of a single network: that is, the dynamics internal to a network itself. This is given by an infinitesimal generator of temporal development, or (real) operator Hamiltonian, acting upon the network's firing patterns, and was derived in [1]. (See Sect. 3 for a review.) It is important to note that this operator is in general not skew-symmetric: a matrix or operator M is skew-symmetric if Mᵀ = −M, where the superscript denotes transposition. So the principle, hallowed in ordinary quantum theory, that the dynamics (that is to say, the corresponding time evolution operator) should always preserve probabilities, is violated in our systems, as it is generally in biological systems. This phenomenon has been dubbed fragility [3]. Biological causes sometimes do not have their expected effects: a biological neuron sometimes does not deliver its neurotransmitters upon firing. A biological event occurring with probability 1 at one time may occur with probability less than 1 at another time. Our non-orthogonal time evolution operators mimic this phenomenon.
The second level of dynamics takes account of notional non-synaptic influences, which emanate either from outside the network or from inside the network, such as the chemical and electrochemical environment in which the synaptic network is embedded and contributes to via the secretion of neurotransmitters. Thus, it involves influences which may be both completely external or a mixture of internal and external. The simplest form of Hamiltonian for this was introduced in [1] and will be taken further in Sect. 3.2.
Here we will first argue that realistic firing patterns emerge from our model, denatured as it appears to be. This is the business of Sect. 5, in which we will derive recognizable brain wave patterns in some very simple examples. The simplicity of these examples, taken together with the draconian approximations we have imposed, speaks to the resilience of our model and its capacity to generate an enormous variety of waveforms, including those found in actual brains. The apparent emergence of a general scheme to reproduce, in closed mathematical form, EEG-like output generated by a general network from given inputs may be of independent interest.
Having argued in favor of the basic mechanism, we turn in Sect. 6 to an attempt to explain an apparently anomalous finding in the application of a working memory test known as the Sternberg task: to wit, the small size of the working memory space in humans when confronted with certain kinds of such tasks. Our analysis leads to a justification for the adoption of phase-coding, which has been posited as a possible mechanism underlying this phenomenon [4].

Born's law and some remarks on quantum, or quantum-like, computation
An important milestone on the path to quantum mechanics was the formulation of the so-called Born law, which enables an interpretation of superposition. It says that if a quantum system is in a state ψ, an element of a Hilbert space, then the probability that it may be found in, or make a transition to, a state φ, which we write as prob(ψ → φ), is given by prob(ψ → φ) = ⟨φ, ψ⟩² / (‖φ‖² ‖ψ‖²), where ⟨·, ·⟩ denotes the inner product (in the standard complex case the numerator is the squared modulus of the inner product). This "law" is named for Born, since he seems to have intuited it in the 1920s, and it was ensconced as a fundamental axiom of standard quantum theory for many decades. However, more recently it was seen to follow plausibly from other axioms by Finkelstein [5] and independently and rigorously proved to so follow by Hartle [6,7]. The proof entails assumptions about assembling quanta which we shall adopt since they do not fundamentally belong to physics se ipse but can be argued on purely combinatorial grounds. This law enables us to interpret superpositions such as ψ = Σ_i c_i φ_i. Thus, if the φ_i are orthonormal and recalling that our c_i are real numbers, we have prob(ψ → φ_i) = c_i² / Σ_j c_j². That is to say, the probability that a superposition will "collapse" onto one of its component states is given in terms of the coefficients in the superposition by the right-hand side of the above equation. We shall adopt this interpretation mutatis mutandis in our real cases.
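The real-field form of Born's law just stated is easily checked numerically. The following sketch (the function name `born_prob` and the sample coefficients are ours, for illustration only) computes the transition probability for real state vectors:

```python
import numpy as np

# Born's law over a real Hilbert space (all coefficients real, as in the
# model): prob(psi -> phi) is the squared inner product of the states,
# normalized by their squared norms.
def born_prob(psi, phi):
    psi = np.asarray(psi, float)
    phi = np.asarray(phi, float)
    return np.dot(phi, psi) ** 2 / (np.dot(phi, phi) * np.dot(psi, psi))

# A superposition psi = c1*phi1 + c2*phi2 over an orthonormal pair collapses
# onto phi_i with probability c_i^2 / (c1^2 + c2^2).
phi1, phi2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
psi = 3.0 * phi1 + 4.0 * phi2
print(born_prob(psi, phi1))  # 9/25 = 0.36
print(born_prob(psi, phi2))  # 16/25 = 0.64
```

Note that the two collapse probabilities sum to 1 when the φ_i are orthonormal, as the law requires.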
Superposition does not in this context carry the ontological weight that it bears in echt quantum physics. Here it has the mundane interpretation that a certain kind of parallelism is to hand. Consider the state ψ(t) = cos(ωt) φ_1 + sin(ωt) φ_2, where the φ_i are orthonormal. Then, by Born's law, prob(ψ(t) → φ_1) = cos²(ωt) and prob(ψ(t) → φ_2) = sin²(ωt). Thus, ψ(t) oscillates between a transition to the state φ_1 and a transition to the state φ_2. This is a generalized sort of time division multiplexing, a technique used in signal processing since its invention in the 1870s by Émile Baudot to enable a single line to carry many different signals simultaneously. Namely, because of the time dependence of the coefficients, the system may oscillate at any frequency between certainly being in one basic firing pattern or another. This is the phenomenon that the hopeful study of "quantum computing" seeks to exploit. Why is it better than classical computing? The usual reason given is that quantum computing is "parallel" while classical computing is not. This is then usually explained by invoking properties of quantum superposition, which seem strange to classical thinkers. However, there is an easy way to see it that does not involve mystical journeys into alternate universes.
Consider a generic state of a quantum system which we will not specify too precisely at this stage. Say a qubit: namely, just a system with two basic orthonormal states |0⟩ and |1⟩, say. Then, a generic state of the system, at time t say, is a superposition ψ(t) = c_0(t)|0⟩ + c_1(t)|1⟩. By Born's law, the probability that, when measured or observed, the qubit is in the state |0⟩ is prob(ψ̂(t) → |0⟩) = c_0(t)² / (c_0(t)² + c_1(t)²), where the hatted state denotes the normalized version of the state under the hat, while the probability that it is found in the other basic state is prob(ψ̂(t) → |1⟩) = c_1(t)² / (c_0(t)² + c_1(t)²). "Computation," whatever else it means, certainly entails forcing the system into one state or the other. Note that if one could arbitrarily control the coefficients c_i(t) then one could arbitrarily increase the probability of one or the other outcome. This is, among other things, what quantum computation seeks to do.
Faced with the classical version of this system, which is just the bit of classical computation, no such modulation of the probable outcome is possible. A bit must be left in the 0 state or the 1 state. If it is in the undesired state, it takes work and time to move it into the other state. In the quantum case, one may prepare a superposition ahead of time, in which the desired outcome has a high probability and takes no time, or very little time. So that is the difference in a nutshell.
But where is parallelism? It is there, in the same setup. Take a superposition of the canonical form (as above) ψ(t) = cos(nt)|0⟩ + sin(nt)|1⟩. Then, prob(ψ(t) → |0⟩) = cos²(nt) and prob(ψ(t) → |1⟩) = sin²(nt). So at t = 0 the probability that the qubit will be in the OFF state is 1, and as time goes on the other state becomes more probable at the same time, eventually becoming a certainty at time t = π/(2n). This is parallelism.
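The time-multiplexed behavior of the canonical qubit state can be sketched directly; here n is a made-up frequency parameter and, by Born's law, the outcome probabilities are cos²(nt) and sin²(nt):

```python
import numpy as np

# Time-division multiplexing by a qubit: psi(t) = cos(nt)|0> + sin(nt)|1>.
def probs(t, n):
    return np.cos(n * t) ** 2, np.sin(n * t) ** 2

n = 5.0
p0, p1 = probs(0.0, n)               # certainly OFF at t = 0
q0, q1 = probs(np.pi / (2 * n), n)   # certainly ON at t = pi/(2n)
print(p0, p1)  # 1.0 0.0
print(round(float(q0), 12), round(float(q1), 12))  # 0.0 1.0
```

Choosing n large makes the OFF-to-ON transit time π/(2n) arbitrarily short, which is the speed advantage described below.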
Why is a quantum computation quicker? If you have a classical bit which is OFF, i.e., 0, then it takes a certain amount of work and time to turn it ON, i.e., to force the transition 0 → 1 , and there is no way of modulating that for a given instantiation of the bit.
On the other hand, for a qubit, suppose you could prepare a state of it in the form ψ(t) = cos(nt)|0⟩ + sin(nt)|1⟩. Then, it goes from |0⟩ at t = 0 to |1⟩ at t = π/(2n). If you can choose n arbitrarily, then this is done in an arbitrarily short time.

Review of the model
The model has two faces: one is physics/dynamics based, the other logic based. In this paper, we shall be concerned only with the physics-like aspects of the model. (The logic behind the following definitions, and the "quantum-like" nature of the models adopted (which is independent of any actual quantum physics, being an accident of logic), is exhaustively rehearsed in [1].) Our model quantum-like networks have nodes comprising a bipartite system which is supposed to model "standard" biological neurons in general morphology. That is to say, we assume two chamber-like nodes, one corresponding to the input part of the cell's somatic membrane receiving fanned-in dendritic signals, and the other to the output part of the cell's somatic membrane including the hillock/trigger zone/axon initial segment (or possibly the whole axon). A single axon/output chamber fans out to the input chambers of other such bicameral neurons or b-neurons.
The space of states of a single b-neuron is thus of the form ℝe_0 ⊕ ℝe_1, where ℝ denotes the field of real numbers, e_0 denotes the state representing the input node, e_1 the state representing the output node, and we have explicitly written in the generators of the two subspaces. Compared to the qubit of quantum computing fame, e_0 = |0⟩ and e_1 = |1⟩. In addition, we argued that this space is in fact the exterior algebra of the space ℝe_1, the state space of the output node, so that e_1 ∧ e_1 = 0. This latter equation is a reflection of the fact that a biological neuron cannot enter a state of firing while it is firing. It is this exterior product structure that gives b-neurons their fermionic structure, although all vector spaces are over the field of real numbers. Consequently, collections of them obey the statistics of Fermi-Dirac. Networks of such b-neurons are then defined and give rise to the following structures.
• A directed finite network, denoted N_b^A say, whose vertices are assigned b-neurons, denoted by n_i^A, for i = 1, …, N, where N is the number of vertices, with links/edges fanning in and out; • A real finite dimensional Hilbert space, denoted ℌ_A, of dimension N, with an orthonormal basis {e_i^A}, i = 1, …, N, where e_k^A denotes the state corresponding to the output node of the b-neuron n_k^A. Thus, an element of ℌ_A, such as ψ = Σ_i v_i^A e_i^A, represents a (superpositional) state in which the b-network may be found to have the b-neuron n_{i_0}^A firing with probability (v_{i_0}^A)² / Σ_j (v_j^A)².
This follows from Born's law discussed above. Our model b-neurons have no internal structure, or rather such structure has been hidden by the logic of our approach, which is aimed at considering only the relevant structural aspects. Thus, the value v A i in the output node of n A i is not causally connected to whatever firing mechanism is involved until a dynamics is imposed to drive this mechanism. Only then does a recognizable form of action potential emerge from our model b-neuron, as shown in Example 1.
• A real N × N matrix (J_ij) whose entries are the product of a factor representing a synaptic weight or scaling factor associated with a single synaptic connection from n_j^A to n_i^A, and the corresponding entry of the adjacency matrix for the network, namely the number of edges or links from n_j^A to n_i^A.

Dynamics
(Please see [1] or [2] for further details.) As noted, dynamics in this context refers to the ebb and flow of states of occupancy of the b-network involved. The state space of the b-network N_b^A of (fermion-like) b-neurons is then the usual (though in this case real) Fermi-Dirac space generated by the space of states of single occupancy, namely ℌ_A, which we shall denote by E(ℌ_A), also known as the exterior algebra of ℌ_A (which has the additional structure of a graded Hopf algebra). That is, E(ℌ_A) = ℝ ⊕ ℌ_A ⊕ (ℌ_A ∧ ℌ_A) ⊕ …, whose basic elements of the form e_{i_1} ∧ … ∧ e_{i_k} represent the simultaneous firing of the subset of b-neurons {n_{i_1}, …, n_{i_k}}. (The coefficients in a general element represent values attributable to the corresponding co-firing subset, though not available (or hidden) until some dynamics is imposed, and then they contribute probabilistically.) We call such a state a firing pattern of the b-network. It is a superposition of basic firing patterns.
The grade 0 elements are of the form λ1, λ ∈ ℝ, where the "firing pattern" 1 ∈ E(ℌ_A) represents the vacuum, or unoccupied state, in which no b-neuron is firing (i.e., is in the firing state). An alternate notation, which we shall use in the rest of the paper, is as follows: |∅⟩ := 1, |i⟩ := e_i, |i_1, i_2, …, i_k⟩ := e_{i_1} ∧ e_{i_2} ∧ … ∧ e_{i_k}, and so on. We note also that for any firing pattern ψ, say, e_0 ∧ ψ = 1.ψ = ψ. That is, e_0 = |∅⟩ is the unit in the exterior algebra.
The creation operators a_i^† and annihilation operators a_i are defined as usual for i > 0, except that for historical reasons we have retained the notation a_i^† for the ordinary transpose of a_i, which we would, in all other cases of our real operators, denote by a superscripted T. These operators obey the usual anticommutation relations: a_i a_j^† + a_j^† a_i = δ_ij, a_i a_j + a_j a_i = 0, a_i^† a_j^† + a_j^† a_i^† = 0, from which it follows that a_i² = (a_i^†)² = 0. In [1], we derived the infinitesimal generator, H_N, for the internal network dynamics of a b-network N_b^A. (A slightly less rigorous derivation is to be found also in [2].) Namely H_N = Σ_{i,j} J_ij a_i^† a_j, where the exchange terms, or synaptic efficacies, J_ij, are generally time dependent. It is of the form known to physicists as quasi-spin and resembles an operator version of the Hopfield model. The associated time evolution operator is then T_N(t) = e^{−tH_N}, the sign in the exponent being conventional. So if ψ_0 represents a firing pattern at time t = 0 then it evolves into the firing pattern T_N(t)ψ_0 at time t ⩾ 0. As noted, H_N is generally not skew-symmetric so T_N(t) is not generally orthogonal. (The dynamics of the thirteen three-neuron motifs are investigated in [2].)
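The operators just described can be realized concretely as real matrices. The following sketch is our own construction (a real Jordan-Wigner representation, with made-up J values), not taken from [1]; it checks the anticommutation relations and the fact that H_N need not be skew-symmetric:

```python
import numpy as np

# Real Jordan-Wigner matrices for the annihilation operators a_i on the
# 2^N-dimensional exterior algebra E(H_A); transposition gives creation.
def fermion_ops(N):
    s = np.array([[0.0, 1.0], [0.0, 0.0]])   # single-mode annihilator
    z = np.diag([1.0, -1.0])                 # parity string factor
    ops = []
    for i in range(N):
        mats = [z] * i + [s] + [np.eye(2)] * (N - 1 - i)
        M = mats[0]
        for m in mats[1:]:
            M = np.kron(M, m)
        ops.append(M)
    return ops

def anticomm(A, B):
    return A @ B + B @ A

a = fermion_ops(3)
# Canonical anticommutation relations: {a_i, a_j^T} = delta_ij, {a_i, a_j} = 0.
print(np.allclose(anticomm(a[0], a[0].T), np.eye(8)))  # True
print(np.allclose(anticomm(a[0], a[1].T), 0))          # True
print(np.allclose(anticomm(a[0], a[1]), 0))            # True

# H_N = sum_ij J_ij a_i^T a_j for an arbitrary (made-up) weight matrix J.
J = np.array([[0.0, 2.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.0, 0.0]])
H_N = sum(J[i, j] * a[i].T @ a[j] for i in range(3) for j in range(3))
print(np.allclose(H_N, -H_N.T))  # False: H_N is generally not skew-symmetric
```

The last line illustrates the "fragility" remark above: since H_N is not skew-symmetric, e^{−tH_N} is not orthogonal and probabilities are not preserved.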

Interaction dynamics
Our model of the principal neuron, the b-neuron, is extremely reductive, ignoring as it does the details of the geometry and complex electrochemistry of the membrane boundary, and of the interstitial environment, including the complexes of the morphologically widely diverse, but mainly inhibiting, interneurons. The overall dynamical effects associated with these sources (ambient medium, membrane surface gates and channels, peptides, hormones, interneuron inhibiting neurotransmitters, etc.) are supplied by adding a separate or interaction Hamiltonian. This Hamiltonian takes into account (in simplified form) the effects of the agents within the ambient environment, both excitatory and inhibitory, including those that are presumed to emanate from the neurons themselves, such as neurotransmitters. So it takes into account effects that may be external or internal to our b-networks, or both.
The simplest non-synaptic interaction Hamiltonian reflecting such ambient and possibly self-generated influences will be, at least to a linear approximation, of the form H_I = Σ_{i∈S} (l_i a_i^† + h_i a_i), where S denotes some subset of the indexing set of the b-neurons in N_b^A. The coefficients J_ij, l_i, and h_i are generally functions of time. This operator need not be skew-symmetric, and usually will not be, though in our examples to come it will be, for reasons to be given.
The expression H_I has an interesting property which follows immediately from the anticommutation relations of the operators involved. Namely H_I² = (l · h) 1, where l := (l_i)_{i∈S}, h := (h_i)_{i∈S}, and l · h := Σ_{i∈S} l_i h_i. Proof From the anticommutation relations, we have: H_I² = Σ_{i,j∈S} (l_i a_i^† + h_i a_i)(l_j a_j^† + h_j a_j) = Σ_{i,j∈S} l_i h_j (a_i^† a_j + a_j a_i^†) = Σ_{i∈S} l_i h_i = l · h, the terms in l_i l_j and in h_i h_j cancelling by antisymmetry, as required. We note that this immediately specifies the (real) eigenvalues of H_I: they are ±(l · h)^{1/2}. Thus, in the absence of a network structure other than the degenerate one, the system will have no eigenvectors other than the zero vector unless l · h > 0. In other words, it will be unstable if l · h ⩽ 0, but please see below.
The associated time evolution operator T_I(t) := e^{−tH_I(t)}, where t is the time measured from some point chosen as t = 0, then assumes an interesting form as follows, using the last equation to collapse the even and odd powers of H_I in the exponential series. Putting λ := l · h, this is soon seen to give: T_I(t) = C_λ(t) 1 − S_λ(t) H_I, where C_λ(t) := Σ_{k⩾0} λᵏ t²ᵏ/(2k)! and S_λ(t) := Σ_{k⩾0} λᵏ t²ᵏ⁺¹/(2k+1)!. Both of these series clearly converge for all values of t and λ and are uniformly convergent in any closed t-interval.
An even more interesting situation arises when λ < 0. In this case, let us put λ = −ω². Then, it is easy to see that C_λ(t) = cos(ωt) and S_λ(t) = sin(ωt)/ω, so that from equation (3.28): T_I(t) = cos(ωt) 1 − (sin(ωt)/ω) H_I. (Note that T_I(t) remains the same whichever sign is chosen for ω.) We note that equation (3.28) is exact, and not an approximation to an exponential, which it superficially resembles.
A significant feature of the last equation, which is an exact expression for the time evolution operator induced by H_I upon the b-network, is that its periodic nature is manifest even if the interaction term is constant. For instance, let us consider the simplest case of the b-network N_b^A having no synaptic connections at all: that is to say, it is a disconnected cluster of b-neurons. Suppose the interaction influence starts at time t = 0 with no b-neuron firing. That is to say the cluster N_b^A is in the state |∅⟩. Then at time t ⩾ 0, the cluster is in the state ψ(t) = T_I(t)|∅⟩ = cos(ωt)|∅⟩ − (sin(ωt)/ω) Σ_{i∈S} l_i |i⟩. Thus, the probability that, while in this state, the i_0th b-neuron, for some i_0 ∈ S, is ON or firing is (l_{i_0} sin(ωt)/ω)² / ‖ψ(t)‖². So n_{i_0} is certainly firing when cos²(ωt) = 0, or at t = (n + 1/2)π/ω, n = 0, 1, …, and it is certainly not firing when sin²(ωt) = 0, or at t = nπ/ω, n = 0, 1, … This behavior is the same for all n_{i_0} with i_0 ∈ S. Thus, the ensemble of stimulated b-neurons in the cluster turns ON and OFF in synchrony at a possibly time-dependent frequency determined by the input parameters. This is true even if the input parameters are constant.
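The closed form T_I(t) = cos(ωt) 1 − (sin(ωt)/ω) H_I can be verified numerically. The sketch below uses our own Jordan-Wigner matrix realization and made-up l, h values with l · h < 0; the Taylor-series matrix exponential is adequate for these small norms:

```python
import numpy as np

def fermion_ops(N):
    # Real Jordan-Wigner annihilation matrices (our construction).
    s = np.array([[0.0, 1.0], [0.0, 0.0]])
    z = np.diag([1.0, -1.0])
    ops = []
    for i in range(N):
        mats = [z] * i + [s] + [np.eye(2)] * (N - 1 - i)
        M = mats[0]
        for m in mats[1:]:
            M = np.kron(M, m)
        ops.append(M)
    return ops

def expm_taylor(M, terms=60):
    # Simple Taylor-series matrix exponential, fine for small ||M||.
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

a = fermion_ops(2)
l, h = np.array([1.0, 0.5]), np.array([-2.0, -1.0])   # made up; l.h = -2.5 < 0
H_I = sum(l[i] * a[i].T + h[i] * a[i] for i in range(2))
lam = l @ h
print(np.allclose(H_I @ H_I, lam * np.eye(4)))  # True: H_I^2 = (l.h) 1

w = np.sqrt(-lam)
t = 0.8
closed = np.cos(w * t) * np.eye(4) - (np.sin(w * t) / w) * H_I
print(np.allclose(expm_taylor(-t * H_I), closed))  # True: Eq. (3.28) is exact
```
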
In the case when the stimulus is self-generated-that is, by neurotransmissions from the neurons themselves-then such rhythmic oscillations are self-generated by the cluster standing alone. These wave-like oscillations are of course observed in actual brains and such a phenomenon apparently underlies the engine that drives their computational processes.
Thus, our networks differ from most neural network models, such as the Hopfield models, in that inhibition as well as excitation is inherent and leads to rhythmic activity.
In Sect. 5, we will investigate simple examples in which there is a network structure. It is worth noting at this point that the H_I term appearing in the time evolution operator above acts as a kind of switch, since, for example, a_i^† |j_1, j_2, …, j_k⟩ = |i, j_1, j_2, …, j_k⟩ for i ∉ {j_1, …, j_k}, and a_{j_m} |j_1, j_2, …, j_k⟩ = (−1)^{m−1} |j_1, …, ĵ_m, …, j_k⟩, where the hat in the last equation indicates the item under it should be deleted. Thus, if the co-firing cluster whose state is |j_1, j_2, …, j_k⟩ does not contain the stimulated b-neuron, then T_I(t) acts, among other things, to add it to the cluster, while if the co-firing cluster does contain the stimulated b-neuron, then T_I(t) acts to remove it. So stimulated b-neurons are removed from the co-firing clusters they are in and switched to the co-firing clusters they were not in, with probabilities guided by the l and h factors. These switch-like acts are done in the sort of parallel or multiplexed fashion associated with superposition and are moreover additionally subject to the oscillations entailed by the sinusoidal factor. Immense computation-like resources are apparently therefore to hand, since not only are there a multiplicity of such forms available, acting in parallel, but there is also a continuum of different l and h factors.

Other consequences of the model
There are two sources of dynamic instability in the model. One emerges from the "infinitesimal" nature of the proposed b-network Hamiltonian which can only be trusted for small values of t, and may therefore be only an artifact of the model. This is discussed in the first subsection below. The other seems to be more intrinsic, both to the model and in real neural networks. It is named for Hebb, and is discussed in the second subsection to follow.

Intrinsic b-network dynamic instability
Let us consider the simplest non-trivial b-network, namely two b-neurons connected by a single synaptic link as depicted in Fig. 1.

The network Hamiltonian is H_N = J_21 a_2^† a_1.

The time development operator is then T_N(t) = e^{−tH_N} = 1 − t J_21 a_2^† a_1, since H_N² = 0, and this is exact, i.e., not an approximation. Starting with the b-neuron n_1 in the ON state at t = 0, the network state at time t ⩾ 0 is ψ(t) = T_N(t)|1⟩ = |1⟩ − t J_21 |2⟩. Thus, as time goes on, the contribution of the second b-neuron's state to the superposition that is the network state at that time might increase without bound, depending on the nature of the J-factor. This is one, intrinsic, source of instability, and it holds for general b-networks. In order to avoid such catastrophes, some vitiating time-dependence would have to be present in these factors. Since they act as synaptic weights, such dependence may be called synaptic scaling.
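The linear growth of the second neuron's coefficient can be made explicit. This sketch uses our own matrix realization of the operators and a made-up constant weight J_21; since H_N is nilpotent, the evolution 1 − tH_N is exact:

```python
import numpy as np

def fermion_ops(N):
    # Real Jordan-Wigner annihilation matrices (our construction).
    s = np.array([[0.0, 1.0], [0.0, 0.0]])
    z = np.diag([1.0, -1.0])
    ops = []
    for i in range(N):
        mats = [z] * i + [s] + [np.eye(2)] * (N - 1 - i)
        M = mats[0]
        for m in mats[1:]:
            M = np.kron(M, m)
        ops.append(M)
    return ops

a = fermion_ops(2)
J21 = 1.5                          # made-up constant synaptic weight
H_N = J21 * a[1].T @ a[0]          # chain n1 -> n2
print(np.allclose(H_N @ H_N, 0))   # True: H_N is nilpotent

vac = np.zeros(4); vac[0] = 1.0
one = a[0].T @ vac                 # |1>: n1 ON
two = a[1].T @ vac                 # |2>: n2 ON

def coeff2(t):
    psi = one - t * (H_N @ one)    # exact: T_N(t)|1> = |1> - t*J21|2>
    return two @ psi               # coefficient of |2>

print(coeff2(1.0), coeff2(10.0))   # -1.5 -15.0: unbounded linear growth
```

With constant J_21 the |2⟩-coefficient grows without bound, which is the instability that synaptic scaling must quench.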

Hebbian learning
With the same system as above, and allowing for non-synaptic substrate influences, we showed in [1], chapter 4.1, that, under mild restrictions, repeated almost simultaneous firings of the two b-neurons produce in our model a hyperexponential change in the non-synaptic component of the strength of the connection. This essentially simulates "Hebbian learning" and is another inherent source of instability.
This sort of general instability is well known to reflect the behavior of actual synaptic connections. In reality, some form or forms of vitiation must intervene in real brains to stem it. The study of such possible biological mechanisms seems to be of surprisingly recent origin, essentially being pioneered in [8]. For a review, see [9].
Various phenomena have been suggested whereby this may be achieved and some have been observed, for instance in the brains of midlarval zebrafish [10]. In the latter case, the "synaptic scaling" involved is the proliferation of synaptic branching that vitiates the Hebbian explosion, and just such a mechanism is also suggested by the second result concerning Hebbian learning from [1].
Generally, such inhibiting effects are regarded as a form of homeostasis, whereby the system seeks to restore a former state.
By design, our model has very few parameters to vary, and this paucity will be an advantage in exploring certain simple models from this "synaptic scaling" viewpoint, which is the subject of the next section.

Waveforms
The purpose of this section is to demonstrate the ability of our model to simulate realistic neurallike waves. We shall do this by working some simple examples followed by some remarks on the general case.
It may be noted in this connection that the waves, and firing patterns generally (in the jargon of our model), are the outputs of network activity. They are the vehicles or agents by which computations are effected.
We will approximate the following setup: a b-network subject to an external stimulus of a subset of its b-neurons, with generic Hamiltonian of the form described above, denoted H_I, while taking into account the network's intrinsic dynamics, whose generic Hamiltonian will be denoted H_N. We will investigate the state of such a system at later times, assuming the external stimulus starts at time t = 0, when the stimulated b-neuron or b-neurons of the b-network will be assumed to be in either of their basic states, ON or OFF. Thus, we will be faced with time development operators of the form e^{−t(H_I+H_N)}. In order to unravel this operator, we need the Zassenhaus formulation of the Baker-Campbell-Hausdorff formula, namely: e^{t(X+Y)} = e^{tX} e^{tY} e^{−(t²/2)[X,Y]} ⋯, ignoring the higher powers of t, which is one form of approximation. Here, the commutator is [X, Y] := XY − YX. In our cases, [H_N, H_I] is an element in the algebra of creation/annihilation operators, whose scalar coefficients are products of l-factors and h-factors paired with J-factors: that is, those of the forms Jl, Jh. So, in addition to the other simplifying assumptions, we shall also assume that such products are small enough so that this commutator can be ignored to a good approximation. Then, we may take e^{−t(H_I+H_N)} ≈ e^{−tH_I} e^{−tH_N} for small enough t.
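The quality of this splitting is second order in t, which can be seen numerically. The sketch below uses generic made-up matrices A, B in place of H_I, H_N; halving t should cut the splitting error by roughly a factor of four:

```python
import numpy as np

def expm_taylor(M, terms=60):
    # Simple Taylor-series matrix exponential, fine for small ||M||.
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

rng = np.random.default_rng(0)
A = 0.5 * rng.normal(size=(4, 4))   # stand-ins for H_I and H_N
B = 0.5 * rng.normal(size=(4, 4))

errs = []
for t in (0.08, 0.04, 0.02):
    full = expm_taylor(-t * (A + B))
    split = expm_taylor(-t * A) @ expm_taylor(-t * B)
    errs.append(np.linalg.norm(full - split))

# Leading error is (t^2/2)[A,B] plus higher order, so each halving of t
# shrinks the error by about 4.
print(errs[0] / errs[1], errs[1] / errs[2])
```
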
We examine some small examples before discussing more general situations.
Example 1 Even the simplest case illustrates certain phenomena intrinsic to the model. This is the case of a single b-neuron, n_1, being stimulated by an input Hamiltonian of the form H_I = l_1 a_1^† + h_1 a_1, where l_1 and h_1 are of different signs and generally time dependent. In this first example, we will take these parameters to be constant, so that λ = l_1 h_1 < 0 and we may put ω² = −l_1 h_1. For further examples, some time dependence must be adopted, for reasons to be discussed. Starting from n_1 OFF: The b-neuron's state develops into the following state at time t ⩾ 0: ψ(t) = T_I(t)|∅⟩ = cos(ωt)|∅⟩ − (l_1 sin(ωt)/ω)|1⟩. This equation is exact, since there is no network or network Hamiltonian. Thus, ‖ψ(t)‖² = cos²(ωt) + (l_1²/ω²) sin²(ωt). So, the probability that the b-neuron is OFF at time t is cos²(ωt)/‖ψ(t)‖², and that it is ON at time t is (l_1²/ω²) sin²(ωt)/‖ψ(t)‖². Thus, the b-neuron blinks on (when the second probability is 1) and off (when the first probability is 1) at regular intervals even when the input is constant, as in our earlier example of the disconnected cluster. Note also that, since l_1²/ω² = |l_1|/|h_1|, the first probability approaches 1 as |h_1| gets larger relative to |l_1|, while it approaches 0 as |l_1| gets larger relative to |h_1|. This shows that the influence of the h-term is relatively inhibitory while the influence of the l-term is relatively excitatory, as expected.
It is interesting to see what happens if the b-neuron is ON at t = 0. Starting from n_1 ON, we have ψ(t) = T_I(t)|1⟩ = cos(ωt)|1⟩ − (h_1 sin(ωt)/ω)|∅⟩. Thus, ‖ψ(t)‖² = cos²(ωt) + (h_1²/ω²) sin²(ωt). So the probability that n_1 is OFF at time t is (h_1²/ω²) sin²(ωt)/‖ψ(t)‖², while the probability that it is ON is cos²(ωt)/‖ψ(t)‖². Here we note that the difference between the respective probabilities when the stimulated b-neuron is initially ON, vs. its being initially OFF, is merely the transposition of the roles of l_1 and h_1.
We note also that if |l_1| = |h_1| then both ψ(t)'s are of the "canonical" qubit-like type considered earlier.
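The blink times and the inhibitory effect of the h-factor in Example 1 can be checked directly. The following sketch uses made-up constants l_1 > 0 > h_1 and the OFF/ON probabilities derived above:

```python
import numpy as np

# Example 1, started from OFF, with w^2 = -l1*h1:
# prob(OFF) = cos^2(wt)/Z and prob(ON) = (l1^2/w^2) sin^2(wt)/Z,
# Z being the squared norm of the state.
def probs_from_off(t, l1, h1):
    w = np.sqrt(-l1 * h1)
    c2 = np.cos(w * t) ** 2
    s2 = (l1 ** 2 / w ** 2) * np.sin(w * t) ** 2
    Z = c2 + s2
    return c2 / Z, s2 / Z

p_off, p_on = probs_from_off(0.0, 2.0, -0.5)
print(p_off, p_on)  # 1.0 0.0: OFF at t = 0

w = np.sqrt(2.0 * 0.5)
off, on = probs_from_off(np.pi / (2 * w), 2.0, -0.5)
print(round(float(on), 12))  # 1.0: blinks ON at t = pi/(2w)

# Inhibition: at a fixed phase wt = pi/4, growing |h1| pushes prob(OFF) to 1.
offs = []
for h1 in (-0.5, -50.0):
    w = np.sqrt(-2.0 * h1)
    off, _ = probs_from_off((np.pi / 4) / w, 2.0, h1)
    offs.append(round(float(off), 3))
print(offs)  # [0.2, 0.962]
```
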

Expected value of the whole network
In the case of a general b-network, we shall be interested in the actual expected value of the output of the whole network after time t, with the stimulus starting at t = 0. Our standing assumption is that this value reflects the overall expected behavior as for instance detected in EEG-like measurements of electrical activity. Specifically we assume:
• The expected value of the whole network reflects its overall electrical activity;
• This value comprises the sum of each b-neuron's contribution to the total value;
• The coefficients in a firing pattern are proportional to the actual (common) value in the output nodes of the corresponding co-firing ensemble's b-neurons.
This last assumption is based on the fact that the matrix entries of an operator act as scaling factors, or weights, acting on the (hidden) actual value in the nodes whose states they multiply. And here we assume for simplicity that this is the same normalized value for every b-neuron in the relevant b-network at the time in question. This assumption would ultimately determine our choice of units, a choice we postpone.
The expected value ⟨X⟩ of a random variable X is defined in general as ⟨X⟩ = Σ_i x_i prob(X = x_i), the sum being over the possible values x_i of X. In our case, the random variable registers whether there is overall activity in the network or not. That is, whether it is in the rest or vacuum state |∅⟩ or not. When it is in this rest state, the total output value is 0. When it is in the active state the total output value is the sum of the coefficients in that state, all of these changing with time.
We shall denote our generic expected value for a given b-network at time t ⩾ 0 by E(t). Thus if the b-network is in the state (or firing pattern) ψ(t) at time t, E(t) is computed from the coefficients of the basic firing patterns in ψ(t), where we omit the coefficient of |∅⟩, namely the grade 0 one, since there is no output in the state of zero occupancy.
Returning to the single b-neuron example, we found its state at t ⩾ 0, when started from OFF, to be ψ(t) = cos(ωt)|∅⟩ − (l_1 sin(ωt)/ω)|1⟩. So the expected value of its output at t ⩾ 0 is: E(t) = prob(network is not in the OFF state at t) × (value in the output node at time t).
To investigate our simple but basic example, we shall assume that the "ON switch" magnitude, namely the magnitude of the l-factor, is much greater than that of the "OFF switch," or h-factor. Otherwise, the values chosen are arbitrary. We choose:

Thus,
With these values, the expected value (Eq. (5.26)) yields the plot shown in Fig. 2. Time is the abscissa and intensity of response the ordinate. We have left the actual units in abeyance.
It is important to note that the plot in Fig. 2, and the others to follow, could only correspond at best to a narrow bandpass-filtered version of an actual measurement, such as an EEG, since these examples would correspond to very small components of very much larger ensembles, and the complexities of other contributions, not to mention noise, etc., have not been taken into account. Our model's expected output, symmetric about the time axis and severely band limited, cannot be expected to reproduce the fine details of the electrochemistry involved in an actual action potential, such as the boundary details of polarization and depolarization, since these details have deliberately been hidden so that the bigger picture may emerge. Even so, there is a slight change in the graph at the points at which this happens in real neurons. Namely, at the moments when the potential is initiated and when it declines again to the threshold value (of zero, in this scenario): there is a slight flattening of the plot at these points. Thus, the plot above may be compared to the response of a single neuron, which should resemble a series of band-limited versions of generic action potentials, as it does seem to crudely approximate.
However, we note the important fact that for an isolated b-neuron the only variable parameters involved above are the h and l coefficients, and while the ratio of their absolute values remains fixed, only the frequency of the resulting expected output value, not its (maximum) amplitude, may be changed. This is a well-known attribute of actual neurons.
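This frequency-not-amplitude property can be exhibited as an exact identity: with the ratio |l_1|/|h_1| held fixed, scaling both factors by a common k merely rescales the time axis of the ON-probability, leaving its profile (and hence its peaks) unchanged. A sketch with made-up values:

```python
import numpy as np

# ON-probability of the single stimulated b-neuron started from OFF,
# with w^2 = -l1*h1 (Example 1).
def prob_on(t, l1, h1):
    w = np.sqrt(-l1 * h1)
    s2 = (l1 ** 2 / w ** 2) * np.sin(w * t) ** 2
    return s2 / (np.cos(w * t) ** 2 + s2)

l1, h1, k = 2.0, -0.5, 3.0
ts = np.linspace(0.0, 5.0, 101)
# Scaling (l1, h1) -> (k*l1, k*h1) at time t equals the unscaled case at k*t:
print(np.allclose(prob_on(ts, k * l1, k * h1), prob_on(k * ts, l1, h1)))  # True
```

Since l_1²/ω² = |l_1|/|h_1| depends only on the ratio, only ω = √(−l_1 h_1) changes under the common scaling.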
In the applications to follow, we shall make a further simplification, in addition to the Zassenhaus approximation. Namely, we shall seek to mimic what are known as event-related potentials, or ERPs. These are essentially the responses to outside stimuli, such as a sound or flash of light, or other sensory input. Since these are not necessarily of biological origin, we shall assume the H_I are in fact skew-symmetric, so that the corresponding time evolution operator preserves probabilities. In which case, they must assume a form we shall write as H_I = Σ_{i∈S} l_i (a_i^† − a_i), where here h_i = −l_i, so that λ = l · h = −‖l‖² and ω = ‖l‖, the norm being the Euclidean norm.
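That this choice preserves probabilities is quickly verified: with h = −l the operator is skew-symmetric and the closed-form evolution is orthogonal. The sketch below uses our own matrix realization and a made-up stimulus vector l:

```python
import numpy as np

def fermion_ops(N):
    # Real Jordan-Wigner annihilation matrices (our construction).
    s = np.array([[0.0, 1.0], [0.0, 0.0]])
    z = np.diag([1.0, -1.0])
    ops = []
    for i in range(N):
        mats = [z] * i + [s] + [np.eye(2)] * (N - 1 - i)
        M = mats[0]
        for m in mats[1:]:
            M = np.kron(M, m)
        ops.append(M)
    return ops

a = fermion_ops(2)
l = np.array([0.6, 0.8])                    # made-up stimulus; ||l|| = 1
H_I = sum(li * (ai.T - ai) for li, ai in zip(l, a))
print(np.allclose(H_I, -H_I.T))             # True: skew-symmetric

w = np.linalg.norm(l)
t = 1.3
T = np.cos(w * t) * np.eye(4) - (np.sin(w * t) / w) * H_I  # Eq. (3.28)
print(np.allclose(T.T @ T, np.eye(4)))      # True: orthogonal, norm-preserving
```

Algebraically, Tᵀ T = cos²(ωt) − (sin²(ωt)/ω²) H_I² = cos² + sin² = 1, since H_I² = −‖l‖² 1.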
(Note that this assumption is equivalent to an assumption of a balance between inhibitory and excitatory influences upon networks, which would tend to promote homeostasis. Thus, one might conjecture that functioning brains tend towards such an equilibrium regardless.)

Example 2

It is interesting to see how the model treats perhaps the next simplest case, namely the autapse: a neuron synaptically attached to itself, as depicted in Fig. 3.
Consistent with our previous notations, we have (5.31). We shall consider both cases of initialization. First, we start the stimulation with n 1 OFF, and quickly find the system state at t ⩾ 0; consequently, the expected value for the autapse at time t follows. Now that we have a synapse, we must confront the inevitable Hebbian explosion. In order to quench this instability, we shall assume in the applications to follow that the synaptic weights, i.e., the J factors, are influenced by the external h factors, and that these factors, being external stimuli, may slowly decline in time.
Accordingly, we choose, rather arbitrarily though otherwise in keeping with our orthogonality and approximation assumptions, the following values for our parameters, leaving the units in abeyance. Here, we have chosen a ratio of J-factor to h-factor which declines very slowly with time, while both h and J themselves also decline slowly in time. This seems realistic for the case of a stimulus, and in the case of the synaptic weight J 11 it is the only kind of choice the model leaves open to us by which to simulate synaptic scaling and thereby vitiate a Hebbian explosion. The ratio chosen is meant to simulate a possible, though simple, causal link between the two factors: we assume the substrate h-factor inhibition (via neurotransmission, for example) has a part to play in whatever synaptic scaling is involved. We note that the reliability of the Hamiltonian series expansion for longer times is enhanced by the diminution in time of the J-factor(s).
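A hypothetical parameter schedule in this spirit can be written down explicitly; the decay rates and initial values below are invented for illustration and are not the paper's chosen units or magnitudes.

```python
import numpy as np

# Hypothetical schedule in the spirit of the text (rates and initial
# values are invented): both the stimulus h-factor and the synaptic
# weight J11 decay slowly, with their ratio declining more slowly still,
# which also helps the truncated series expansion at longer times.
t = np.linspace(0.0, 50.0, 101)
h11 = 1.0 * np.exp(-0.010 * t)      # slowly declining external stimulus
J11 = 0.5 * np.exp(-0.012 * t)      # synaptic weight, decaying slightly faster
ratio = J11 / h11                   # = 0.5 * exp(-0.002 t): very slow decline
assert np.all(np.diff(ratio) < 0)   # the ratio declines monotonically
```

The point of the construction is only that J declines marginally faster than h, so the J-to-h ratio drifts downward on a much longer timescale than either factor alone.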
With parameters chosen as above, the plot is shown in Fig. 4. It rather closely resembles the excitatory postsynaptic currents that have been found in single-autapse measurements, an example of which is shown in Fig. 5.
The top trace in Fig. 5 is the excitatory postsynaptic current measured on a single autapse in vitro. The lower trace is a similar measurement taken from a "micronet" comprising two connected pyramidal cells (which are the epitomes of our b-neuron model).
The lower trace will be discussed in the next example.
For the case of the autapse being ON when the stimulation commences, we have the following, and consequently the system state at t ⩾ 0 is as shown.

Fig. 4 Expected value plot for the autapse, starting from OFF

The expected value is thus as follows. With the parameter values as before, this quantity very soon blows up to an immense value, presumably far above the capability of any actual biological membrane to contain, so it must be construed as a catastrophic breakdown, leading perhaps to such phenomena as seizures. However, it is not mathematically singular, and eventually the value reduces to zero, thanks to our imposed scaling effects. The outcome would have been even worse had these brakes not been applied. (If J 11 is taken to be negative, the result is a rapid decline to the rest state.) It would appear from this and other analyses using the same model that excitatory autapses may be associated with conditions such as epilepsy, and there does seem to be some empirical evidence for this: cf. [12]. EEGs of the above type are also found under the rubric "(interictal) epileptiform." Although autapses have been implicated in the prequels to seizures, their possible ubiquity and role, when scaled, in other, non-pathological modularity contexts has only recently been broached.
The choice of next example is not the most logically simple one to follow, but it is related to the autapse case, so we shall treat it here.
Example 3

This is the simplest mutually connected pair, called a dyad or resonance pair, with, in our example, the usual stimulus applied to one of the pair, as in Fig. 6.
With H I as before, we now have the following. The dyad is found to occur frequently in the brains of many species, and its symmetry seems to promote synchrony and stability [13]. Therefore, it will be of interest to see what happens to it upon stimulation, in each of the cases when the stimulated neuron is ON and when it is OFF.
We start the stimulus first with n 1 OFF.
We note the following, and so on; thus we neglect powers of the J-factors higher than 2. To a presumably good approximation, then, the state of the system at time t ⩾ 0, starting from OFF, is as shown. A plot using the same parameter values is shown in Fig. 9. This corresponds to the case of the two connected pyramidal cells whose response is depicted in the lower trace of Fig. 5 for the autapse stimulation discussed above. In the work cited, the observed response was found to be similar to that of an autapse. Note that the plot is similar to that in Fig. 4, though having a different start state.

Example 4
Our next example is the three-neuron motif labeled M7 in the Sporns classification [15], with one of the neurons being stimulated, as shown in Fig. 10.
As usual, we have the following. Since the network is itself a (single) cycle, its time evolution operator expansion does not terminate, so we again invoke the approximation of ignoring powers of the J-factors higher than 2. This should further extend the range of reliability of the temporal approximation.
Our approximations and small examples are probably not sufficient to draw any general conclusions about this vis-à-vis hippocampal theta waves (observed while an animal is awake) and the large irregular activity (LIA) waves (observed while the animal is asleep/resting). But it may be suggestive of a hippocampal network structure at some level.

Example 5

Our next example differs from the others in a couple of ways. Firstly, we will stimulate two b-neurons while they are in an even superposition of ON states, meaning that, in this state, each stimulated b-neuron is ON with the same probability. In addition, we will alter the inputs slightly to introduce an asymmetry in the l factors. This will have the effect of engaging co-acting pairs of b-neurons. The network is the simplest example of neural convergence, as depicted in Fig. 13.
Here we have the following, with the usual sign differences on the l i and h i . At t = 0, we take the normalized system state to be as shown; then, for t ⩾ 0, putting all these together, we find the system state to be as follows.

Fig. 13 A convergent b-network
Note that all possible pairs of co-acting b-neurons are engaged if l 1 ≠ l 2 . The expected value at t ⩾ 0 follows. In the plot for this example, we have maintained the entirely guessed-at values used before, but have now added a slightly higher guessed-at value for l 2 = −h 2 than for l 1 = −h 1 . With these values, the plot for the expected value is shown in Fig. 14.
This may be compared to an ERP average measured from the central part of human crania in an experiment on the recognition of brand names [16] shown in Fig. 15.
The peaks and valleys of the two plots in Figs. 14 and 15 follow a perhaps surprisingly similar course, considering the approximations and guesswork involved in producing the upper plot.
Let us now do this example starting the stimulation from the vacuum or OFF state. Here we find (5.85), and the expected value is then given as follows. With values as before, but now taking l i < 0 (and therefore h i > 0), a plot is shown in Fig. 16. This may be compared to a typical late positive potential, or LPP, associated with the perception of visual inputs, either pleasant or unpleasant. A typical such observation, from [17], is shown in Fig. 17. Here again, in Figs. 16 and 17, we find a similar configuration, with a similar slight dip at the beginning of the rising hump-like curve in both the computed plot and the experimental one.
Our standing hypothesis, supported by the discussion above and other evidence [1], is that the states of b-networks we have dubbed firing patterns are the entities corresponding to the relevant states in real brains. It is their collective behavior that gives rise to what is observed in, for instance, EEG-like measurements, as discussed in our examples above. These underlying states are, we contend, the coins of the brain-like computational realm, the physical waves detected in measurements being collective side-effects of these processes. Such computations are effected through the interaction and disposition of these states, which are generated by internal and external influences, and through the structure of the networks involved. Given the validity of this hypothesis, the question arises as to why the patterns reproduced above, for tiny networks comprising only three or fewer b-neurons, should resemble those that have been measured from presumably large networks comprising many thousands or more neurons, as in the last three examples above. The answer may lie in the generality of the probabilities given by the so-called Born law. Probability is the ratio of the number of times something actually occurs to the number of times it can occur. The occurrence in this case includes the presence of that precise configuration of three neurons within a larger ensemble of possible configurations. That is to say, the probability as computed is also the ratio of the number of occurrences of those particular three neurons being not OFF while also being in that particular configuration among many other possible configurations involving other neurons, to the number of its possible occurrences, ON or OFF, among those other configurations. So the expected value also takes account of the actual appearance of that particular configuration in a larger ensemble.
Thus, in the case of Example 5, the topologically convergent configuration must appear frequently in the networks involved, if the calculated plot matches the observed one. This might be amenable to experiment.

Fig. 17 Figure 2 reproduced from [17], licensed under CC-BY 4.0
Moreover, again given the validity of our hypothesis, it might be possible to compute the expected value plots for many different network configurations and, by comparison with measured values, EEGs, etc., build up a dictionary to reverse engineer such measurements. Details of the underlying network structures and neurotransmitter environments may thereby be revealed, in a manner rather similar to the way X-ray crystallography reveals the underlying structure of a crystal.

Autonomous b-networks, working memory, and the Sternberg task
In this model, memoranda are identified with certain firing patterns of certain b-networks, in which is implicit the synaptic context local to the b-neurons involved in the firing pattern: cf. [1], chapter 4. These are subject to change as time progresses. Given the Fermi-Dirac space, or exterior algebra E(ℌ A ), of firing patterns of such a b-network N b A , "retrieval" is implemented in the only way the model allows, namely via a map f t ∶ E(ℌ A ) → E(ℌ D ) at time t, where D labels another b-network, say a "working memory" module, which ultimately involves a projection operator upon a subspace of E(ℌ A ) itself [ibid.]. The activity of such a retrieved firing pattern in real brains presumably constitutes its experience.
As noted earlier, our interaction Hamiltonian for a given b-network may contain contributions from within the interstices of the network itself, namely from its internal electrochemical environment. In reality, it is likely to do so, since synaptic scaling by whatever means must be in action within stable networks, and some of it may emanate from internal sources. With our usual constraint upon the signs of the l and h factors, such Hamiltonians will induce neural wave phenomena which we shall dub autonomous, and such Hamiltonians should then be included in a specification of such autonomous b-networks. It is to be expected that such autonomous interactions would be very much weaker than external ones, leading to a low value for the relevant parameter and therefore producing low-frequency waves, even when constant. (We should consider the single un-connected b-neuron of Example 1, with its specified interaction Hamiltonian, which in this case may be supposed to emanate from the neuron's own internal electrochemical environment, as our first and most basic example of an autonomous b-"network.") Of course, such waves are present in many actual brain networks, and some of them seem to be endogenous, requiring no external inputs, though as a practical matter it might be difficult to untangle the autonomous activity from the other kind. In the case of such b-networks, the autonomous interaction Hamiltonian must be added to any other posited interaction Hamiltonians.
Such waves of neural activity, from whatever source, in conjunction with concomitant synaptic scaling, would tend to promote and modulate Hebbian learning, either increasing synaptic strengths in the relevant regions of the network or weakening them, as the wave waxes or wanes there. Memoranda may then also be at various stages of being formed or deleted, and this will occur depending upon the complex dance of the coefficients in the interaction Hamiltonian. This wave action could promote neural potentiation by repeated, nearly coincident excitation of members of local clumps of neurons, or reverse such potentiation via inhibition, or switch between these two actions. A low-frequency autonomous wave may be conveying the information that this particular network is ready to receive memoranda, which are then adduced and processed by the action of adding new terms to, or augmenting extant terms of, the autonomous "carrier-like wave" interaction Hamiltonian. In this way, short-term memories may be established in a "working" memory module or network. (The engineering analogy is correct in principle, though carrier waves there are usually of high frequency.) Here we will consider one consequence of this idea, leaving a fuller account for a sequel. Let us assume, then, that we have an autonomous b-network, with an autonomous Hamiltonian which, for simplicity, we shall assume is in the skew-symmetric, probability-maintaining form of Eq. (5.31), for some subset S of b-neurons assumed to be receiving the autonomous influences. There is no loss of generality, and a gain in notational convenience, if we assume the set S indexes the entire ensemble of b-neurons, with the appropriate coefficients being zero. This will apply to all the subsequent corresponding indexing sets. The ultimate set of stimulated neurons determines the possible l-coefficients, which generate a subspace we will dub the input space, or space of inputs.
(An argument similar to the one to follow applies also to the case of a general, not necessarily skew-symmetric, form.) The corresponding factor in the associated time evolution operator, Eq. (3.33), is then as follows. We shall suppose, again for simplicity, that the network Hamiltonian may be ignored in the ensuing discussion; the general case will be considered elsewhere. Now, suppose a stimulus impinges on the network with interaction Hamiltonian H (1) I . While it lasts, the total interaction Hamiltonian is H (0) I + H (1) I , with coefficient vector (0) + (1) , so that the factor becomes as shown. We note that, as a result, the frequency of the resulting waveform(s) goes from ‖ (0) ‖ to ‖ (0) + (1) ‖ , while the coefficients of the terms involving the sinusoidal factors are added with a new factor of ‖ (0) + (1) ‖ −1 . (The cosinusoidal terms are not encumbered with this factor.) So there is a change in both frequency and amplitude, though in neither case is it a straightforward addition.
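The frequency and amplitude bookkeeping above can be checked numerically. In the sketch below, the coefficient vectors c0 and c1 are placeholders standing in for the autonomous and stimulus coefficient vectors; their values are invented, not taken from the paper.

```python
import numpy as np

# Numerical check: an added stimulus shifts the waveform frequency from
# ||c0|| to ||c0 + c1||, and the sinusoidal coefficients acquire a factor
# 1/||c0 + c1||. The vectors below are placeholders, not paper values.
c0 = np.array([0.1, -0.1, 0.0])    # weak autonomous coefficients
c1 = np.array([2.0, -1.5, 1.0])    # strong exogenous stimulus

f_before = np.linalg.norm(c0)
f_after = np.linalg.norm(c0 + c1)
sin_scale = 1.0 / f_after          # extra factor on the sinusoidal terms

assert f_after > f_before          # here the stimulus raises the frequency
# The change is not a plain addition of frequencies:
assert not np.isclose(f_after, f_before + np.linalg.norm(c1))
```

The triangle inequality makes ‖c0 + c1‖ at most ‖c0‖ + ‖c1‖, with equality only for parallel vectors, which is the sense in which the frequency shift is not a straightforward addition.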
Let us suppose this external stimulus lasts long enough to establish a short-term memory and is then turned off, while another, different stimulus is applied. Let us label this stimulus with a superscript, as before; the corresponding factor is that for the time evolution operator generated by H (0) I + H (2) I . We continue in this way with further incoming stimuli, similarly indexing the associated time development operators. These operators are all orthogonal, that is to say, the transpose of T (j) I (t) is its inverse, and each is therefore inner-product and norm preserving, since all our Hamiltonians have been chosen skew-symmetric.
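The orthogonality claim is easy to verify numerically for any skew-symmetric generator; in the sketch below, the matrix H is a random placeholder standing in for a combined interaction Hamiltonian, and expm plays the role of the time evolution operator.

```python
import numpy as np
from scipy.linalg import expm

# Sketch of the norm-preservation claim: for skew-symmetric H (H^T = -H),
# the evolution operator T(t) = expm(t * H) is orthogonal, so squared
# norms (probabilities) are conserved. H here is a random placeholder.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = A - A.T                              # skew-symmetric by construction

T = expm(0.7 * H)
assert np.allclose(T.T @ T, np.eye(4))   # transpose is inverse

v = rng.standard_normal(4)
assert np.isclose(np.linalg.norm(T @ v), np.linalg.norm(v))
```

This is just the standard fact that the exponential of a real skew-symmetric matrix lies in the orthogonal group, which is what licenses the probability-maintenance language in the text.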
The intuition here is the test of working memory known as the Sternberg task [4]. In this test, subjects are shown different items in the same category-colored squares at different positions on a grid, different characters from the same alphabet, etc.-and are asked to memorize them, being tested shortly thereafter on recalling them in order.
Returning to our scenario above, we note the following. The first step in memory consolidation seems to involve the Hebbian "learning" process of strengthening synaptic signals by the repeated nearly simultaneous firing of neighboring neurons (in the presence of interstitial substrate effects). This was confirmed in our model via a pair of "LTP" theorems [1], chapter 4.1. It is apparent from these theorems and their sequel ([1], section 4.1.3) that the speed at which this firing is repeated ramps up the synaptic strength in the context of excitation (and down in the context of inhibition). Thus, the presence of a neural wave of the above type would promote such Hebbian learning (and unlearning), and this process would be enhanced according to the rise in frequency of the wave. So high-frequency inputs would be favored for short-term memory formation (and loss). Such high frequencies may be assumed for the incoming stimuli H (j) I . Since it is the frequencies of the incoming stimuli that are the relevant factors in memory formation, what strategy would best ensure that each item is stored in such a way that it will be distinguishable from the others? That is to say, what restrictions should be imposed upon the memory forming stimuli to promote reliable distinctions upon the working memory traces they induce? To answer this question, we shall derive such firing  2. Under the above circumstances, a small value for ‖ (0) ‖ , entailing a low frequency for the autonomous carrier wave, will likewise decrease the magnitude of the fraction and therefore also the probability of confusion. 3. Again under the circumstances of mutual orthogonality of the input vectors, since they are orthogonal, they are linearly independent in the subspace of input vectors. Therefore, they are limited in number by the dimension of this subspace, which might be small, since it depends on the number of stimulated b-neurons. 
That is to say, the number of non-confusable memoranda in working memory might be small.
The upshot is that as the frequencies of the exogenous neural waves increase, and the frequency of the endogenous carrier wave decreases, the probability of confusion among the exogenous signals decreases. Consequently, the probability of confusion among the short-term memories induced by the exogenous waves must also decrease. The price inevitably paid for holding such mutually distinguishable memories simultaneously is their possible paucity.
All of these conditions are seen in laboratory settings, and the last observation may explain the reported results in tests of the Sternberg task; they also seem to justify the hypothesis of phase-coding, at least in the context of memory formation [4].
Our argument was made in the absence of a network or synaptic structure other than the effects we have labeled autonomous. This more general case will be considered elsewhere.
Evolution has presumably selected for neural circuits that implement these strategies, i.e., low-frequency endogenous carrier waves and sensitivity to high-frequency inputs. It is clearly a selective advantage for an organism to be able to make reliable distinctions between sensory inputs and their memory traces, for instance of signs and portents of predators in the neighborhood, or of where food stores have been cached. These strategies taken together must then inevitably run into the issue of the possible paucity of such reliably distinguishable memoranda, since, by the third remark above, they may impose this restriction by mathematical necessity. This may only be a problem for humans, since the need to make distinctions is likely to be of greater selective urgency for other animals.

Discussion
We have sought to validate the neural net model introduced in earlier work by showing that it can generate a panoply of neural waveforms that closely resemble observed neural output. There thus emerges a general method of reproducing in closed mathematical form a good approximation to the output from a network of this form, given the input, either endogenous, exogenous, or both, which may be of independent interest.
We then sought to explain the apparently anomalous finding, from trials of the Sternberg task, that humans have a rather limited amount of short-term memory storage for distinct items or events. Our finding here is consistent with the phase-coding hypothesis, which has been posited as an explanation of this finding.