Objective discovery of dominant dynamical processes with machine learning

verification criterion. This framework is repeated over a user-specified range of algorithm parameters to find the optimal regimes as defined by the highest verification criterion. We show that our framework yields results consistent with domain knowledge and previous stud-

formally define dynamical regime identification as an optimization problem by using a verification criterion, and we show that an unsupervised learning framework can automatically and credibly identify regimes. This eliminates reliance upon conventional analyses, with vast potential to accelerate discovery [2]. Our verification criterion also enables unbiased comparison of regimes identified by different methods. In addition to diagnostic applications, the verification criterion and learning framework are immediately useful for data-driven dynamical process modeling [3, 4, 5, 6, 7] and are relevant to researchers interested in the development of inherently interpretable methods [8] for scientific machine learning. Automation of this kind of approximate mechanistic analysis is necessary for scientists to gain new dynamical insights from increasingly large data streams.

Observations of dynamical systems often exhibit patterns of spatial and/or temporal spar-

terms in e_n that are neglected. We choose a verification criterion V(E, H), such that the optimal fit hypotheses, H_opt, can be obtained by varying the hypotheses H to find

where 1 is an array of ones indicating all equation terms are retained for the entire data array. We use the notation conventions of Bishop [21], where scalars are italicized, lowercase bold represents one-dimensional arrays, and uppercase bold represents two- or higher-dimensional arrays.

We

where Γ_n is the normalized difference between the logarithm of the smallest-magnitude equation term in the set of terms considered dominant and the logarithm of the largest-magnitude equation term in the set of terms considered negligible. We thus refer to Γ_n as the gap in magnitude and M_n as the local magnitude score. Ω_n is a penalty imposed by the difference between the maximum and minimum magnitudes within the set of terms considered dominant.

Definitions of Γ_n and Ω_n are provided in Methods. The score measures the consistency of local truncations of the equation with the observed magnitudes of equation terms. We propose the weighted average of the score M_n(e_n, h_n) over N samples as the verification criterion,

We propose an unsupervised machine learning framework [17, 18] that automatically discovers

The first task shown in Figure 1, row 1, is to partition E into different regimes. For

average equation terms and then selecting the balance associated with the highest score.
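The selection step mentioned above (averaging the equation terms and keeping the balance with the highest score) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `candidate_balances`, `log_gap`, and `best_balance` are hypothetical, the average is assumed to be taken over the samples of a single regime, and `log_gap` is a simplified stand-in for the local magnitude score M_n that uses only the un-normalized log10 gap and omits both the normalization of Γ_n and the penalty Ω_n defined in Methods.

```python
from itertools import combinations
from math import log10


def candidate_balances(d, min_terms=2):
    """Yield every 0/1 indicator tuple of length d with at least min_terms ones."""
    for k in range(min_terms, d + 1):
        for chosen in combinations(range(d), k):
            yield tuple(1 if i in chosen else 0 for i in range(d))


def log_gap(terms, h):
    """Stand-in score (an assumption, not the paper's M_n).

    Returns log10 of the smallest selected magnitude minus log10 of the
    largest remainder magnitude. Assumes every term is non-zero.
    """
    selected = [abs(t) for t, keep in zip(terms, h) if keep]
    remainder = [abs(t) for t, keep in zip(terms, h) if not keep]
    if not remainder:
        return 0.0  # assumption: retaining every term scores zero
    return log10(min(selected)) - log10(max(remainder))


def best_balance(cluster_terms, score=log_gap):
    """Average the terms over the samples, then return the highest-scoring indicator."""
    n, d = len(cluster_terms), len(cluster_terms[0])
    mean_terms = [sum(row[i] for row in cluster_terms) / n for i in range(d)]
    return max(candidate_balances(d), key=lambda h: score(mean_terms, h))


# Toy regime: two samples of a four-term equation (values are illustrative only).
cluster = [[1.0, -1.1, 0.01, 0.002], [0.9, -1.0, 0.02, 0.001]]
best = best_balance(cluster)
print(best)
```

Passing the score as an argument keeps the enumeration independent of the scoring rule, so the full M_n from Methods could be substituted for `log_gap` without changing the search.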

The final task shown in Figure 1 is to measure the fit of hypotheses H to the data E.

This task was conventionally performed indirectly through post hoc validation of models con-

which describes the balance of processes that control the rate of solid-body rotation of a column of seawater (see Methods for more details).
where the velocity and pressure fields (u, v, p) have been decomposed into mean and fluctu-

for observation n such that 1 ≤ n ≤ N.
where h_n is an indicator function [41] that consists entirely of ones and zeros, which represent selected dominant terms and negligible terms, respectively.
and, therefore, the remainder index set and selected index set are non-overlapping, R_n ∩ S_n = ∅.
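As a small illustration of the indicator and the two index sets, the sketch below splits the term indices of one observation according to a 0/1 indicator. The helper name `index_sets` is hypothetical, indices are 0-based here rather than following the paper's numbering, and the check that the two sets together cover every term is an assumption based on the name "remainder".

```python
def index_sets(h):
    """Split term indices into selected and remainder sets from a 0/1 indicator."""
    if any(v not in (0, 1) for v in h):
        raise ValueError("indicator entries must be 0 or 1")
    selected = {i for i, v in enumerate(h) if v == 1}   # dominant terms
    remainder = {i for i, v in enumerate(h) if v == 0}  # negligible terms
    return selected, remainder


# Illustrative indicator for a five-term equation.
h_n = [1, 1, 0, 1, 0]
S_n, R_n = index_sets(h_n)

assert S_n & R_n == set()                 # the two sets do not overlap
assert S_n | R_n == set(range(len(h_n)))  # assumed: together they cover all terms
print(sorted(S_n), sorted(R_n))
```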
Thus the cardinalities, or sizes, of the selected index set and remainder index set are 2 ≤ card(S_n) ≤ D and 0 ≤ card(R_n) ≤ D − 2, respectively. The lower bound of two selected terms is not strictly required; we impose it because a dominant balance of just one term is conceptually ambiguous. Let the arrays of selected and remainder equation terms from e_n be s_n and r_n, respectively. s_n and r_n are normalized by the smallest element of e_n and defined as

respectively. If min_{i∈F} |e_ni| = 0, then the minimum non-zero absolute-valued element of e_n replaces the denominators in Equations 13 and 14. Let the relative magnitude gap between the normalized subsets, Γ, be defined as a scalar for each nth observation:

The magnitude gap Γ is normalized such that Γ ∈

Figure 3b) corresponds to equivalently optimal results. Figure 3c) shows the dominant balances of the optimal regimes and Figure 3d) shows the spatial distribution of the optimal regimes. Identical optimal regimes were identified by using K-means