Temporal super-resolution using smart sensors for turbulent separated flows

Particle image velocimetry (PIV) data of high Reynolds number unsteady turbulent flows are often undersampled in time; this leads to aliasing of important spectral content. The present work proposes a novel data-driven estimation technique that uses oversampled sparsely placed surface-mounted pressure sensors and long short-term memory neural networks to resolve the aliased transient velocity dynamics from undersampled PIV data. The method leverages the time-resolved pressure dynamics to estimate the temporal evolution of a proper orthogonal decomposition-based low-dimensional subspace of the velocity field. The proposed approach is demonstrated on a PIV dataset of a high Reynolds number turbulent separated flow over a Gaussian speed-bump benchmark geometry (ReH=2.26×105\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text{Re}_{H}=2.26\times 10^{5}$$\end{document}, where H is the Bump height). The 15 Hz PIV data is super-resolved to 2 kHz, and spectral analysis of the flowfields is conducted to educe the originally aliased unsteady dynamics of the turbulent separation bubble. The estimator is shown to accurately reconstruct the Reynolds shear stress from unseen sensor data, demonstrating its generalizability to resolve the coherent motions. The estimated velocity spectra show distributions consistent with those of other separated flows.


Introduction
Particle image velocimetry (PIV) has been extensively used to extract velocity flowfields relevant to aeronautical applications. In such flows, integral and turbulent scales of motion typically differ by orders of magnitude, which necessitates a trade-off between the acquisition window and temporal resolution. The PIV imaging/sampling rate is often too low to resolve the time scales that correspond to the energy-producing flow structures. While these undersampled datasets of the flowfields can still be used to provide a statistical description of the turbulence, the frequencies pertaining to unsteady mechanisms that govern the distribution of energy within the flow are aliased. Of particular interest is the unsteadiness in turbulent separated flows, which has important engineering implications such as fluctuating body forces that can lead to structural fatigue, decreased diffuser performance and increased component drag. Therefore, developing techniques to resolve these aliased frequencies is crucial to understand, predict and control these flows (Brunton and Noack 2015).
In response to NASA's computational fluid dynamics (CFD) Vision 2030 for accelerated advancement in the computational aerosciences (Slotnick et al. 2014), a threedimensional Gaussian speed-bump geometry has been developed as part of an experimental/CFD campaign to improve Reynolds-averaged Navier-Stokes (RANS) modeling of turbulent separated flows (Slotnick 2019). Planar PIV experiments conducted in the 3 � × 3 � low-speed wind tunnel at the University of Washington have enabled the characterization of the mean field topology of the turbulent separation bubble (TSB) for a range of freestream speeds from 20 − 60 ms −1 corresponding to Re H = 1.12 × 10 5 − 2.90 × 10 5 based on the Bump height, H (Williams et al. 2021). Surface pressure measurements indicate Reynolds number independence in the mean field for Re H ≥ 2.26 × 10 5 (Williams et al. 2020). Comparison with RANS simulations (Williams et al. 2020), and exploration of additional experiments (Gray et al. 2022), direct numerical simulations (Balin and Jansen 2021; Shur et al. 2021) and wall-modelled large eddy simulations (Iyer and Malik 2021) have shown that the Bump presents a significant challenge for predicting and modelling flow separation due to the presence of alternating pressure gradients and surface curvature effects.
The unsteadiness of large-scale structures in TSBs has been studied: for geometry-induced TSBs such as flows around sharp corners (Cherry et al. 1984;Hudy et al. 2003;Fang and Tachie 2019) as well as pressure-induced separated flows (Mohammed-Taifour and Weiss 2016; Wu et al. 2020). Two dominating broadband spectral features often characterize the TSB dynamics: (i) a low "breathing" or "flapping" frequency referring to the contraction/expansion of the bubble that also results in the "flapping" of the separated shear layer, and (ii) a medium "shedding" frequency that refers to the shedding of spanwise-oriented vortex rollers from the TSB. The PIV sampling rate for the Bump dataset (15 Hz) is too low to resolve these unsteady TSB dynamics for Re H ≥ 2.26 × 10 5 . The large scale-separation in the highly turbulent Bump flow therefore further challenges techniques to educe the coherent motions at the aliased frequencies in turbulent separated flow PIV datasets.
Temporal super-resolution techniques have been proposed for fluid flows using several methods. Schneiders et al. (2014) leveraged high-resolution spatial information to increase the temporal resolution of incompressible 3D flow datasets using the vortex-in-cell method. The more common class of techniques involves the use of local "sensors" within the flowfield to reconstruct the spatiotemporal global flowfield evolution. For instance, Discetti et al. (2018) increased the temporal resolution of a PIV dataset of a turbulent channel flow using hot-wire anemometers and extended proper orthogonal decomposition (EPOD) modes. More recently, machine learning methods using neural networks have shown promising improvements. Using a single cobra probe (measuring velocity) placed in the unsteady turbulent wake behind a circular cylinder, Jin et al. (2020) used recurrent neural networks with gated recurrent units to super-resolve the temporal coefficients of the first three POD modes from 5 Hz to 10 Hz in order to resolve the vortex shedding frequency at 5.4 Hz. Fukami et al. (2021) successfully applied a convolutional neural network to increase both spatial and temporal resolution of a turbulent channel flow using skip-connections. For very high Reynolds number flows, the prediction of large-scale dynamics from a few sensors is challenged by the high-dimensional nature imposed by the large scale-separation. Most super-resolution techniques have not been tested on turbulent flows.
The above methods utilize sensors that directly measure a sparse representation of the global flowfield of interest (such as the use of high-repetition rate in-flow velocity probes to extend the frequency resolution of the global velocity field), which aids the estimation problem. However, these sensing techniques are intrusive and may not be practical for most engineering applications. The use of wall-mounted pressure sensors to estimate flowfields is a non-intrusive alternative that can be implemented in an economically feasible framework for flow monitoring, prediction and control. Unlike velocity, pressure is a non-local quantity (explained through the Poisson equation). Hence, the wall-pressure field encodes the inter-scale interaction between far and near-field structures. For turbulent boundary layer (TBL) flows, Hutchins and Marusic (2007) postulated the possible influence of large scales in the logarithmic region on the near-wall smallscale fluctuations through superposition and amplitude modulation, which was later confirmed by Mathis et al. (2009) and Marusic et al. (2010). The effect was also observed in a zero-pressure-gradient TBL subjected to various pressure gradients (Harun et al. 2013), as well as in the wall-pressure signal (with a time lag between the large-and small-scale pressure fluctuations) by Tsuji et al. (2016). Therefore, in addition to the practical benefits, the use of wall-pressure sensors may be highly appropriate to predict the large-scale dynamics throughout a turbulent flow.
The importance of time history in the sensor signals for flow estimation has been noted (Durgesh and Naughton 2010;Hosseini et al. 2015;Jin et al. 2020;Manohar et al. 2022). For instance, Hosseini et al. (2015) determined an optimal time delay between the wall-pressure signal and velocity flowfield data to estimate the vortex shedding behind a pyramid obstacle from PIV measurements. More recently, long short-term memory (LSTM) neural networks (Hochreiter and Schmidhuber 1997) have been successfully applied to sensor-based flow estimation (Manohar et al. 2022) due to the ability to make future state predictions conditioned on the time history. However, the use of LSTMs for temporal super-resolution of high-dimensional turbulent flows using few wall-pressure sensors remains unexplored.
As demonstrated in earlier studies (Bonnet et al. 1994;Tinney et al. 2008;Tu et al. 2013;Hosseini et al. 2015;Zhang et al. 2020;Ozawa et al. 2022;Manohar et al. 2022), using a low-dimensional proper orthogonal decomposition (POD) subspace can improve the observed correlation between pressure-sensor and velocity data. Moreover, the spatial POD modes are agnostic to the PIV sampling rate, making them appealing to establish a low-dimensional spatial basis from undersampled velocity flowfields. For instance, Zhang et al. (2020) utilized spectral POD (SPOD) and the spectral linear stochastic estimation technique proposed by Tinney et al. (2006) to estimate optimal transfer functions for a (linear) multiple-input/multiple-output (MIMO) model between time-resolved unsteady surfacepressure measurements and the velocity field of a Mach 0.6 cavity flow. Their model, which used only two sensors and was trained on 15 Hz non-time-resolved PIV data, was able to accurately reconstruct the first four Rossiter frequencies (up to 2 kHz) of the first SPOD mode. The high-dimensional/multi-scale nature of the Bump flow introduces new challenges to the estimation problem as several modes are required to reconstruct the dynamically relevant flow features. In particular, while the Rossiter modes are spectrally separated, the energy contained in the Bump modes has strong contributions in similar regions of the spectrum in each of the modes-resulting in modes characterized by broadband frequency-centered phenomena, which challenge traditional estimation architectures. Manohar et al. (2022) recently conducted a comparative performance analysis between several pressure sensor-based flow estimators for a complex two-cylinder laminar wake flow: linear/quadratic stochastic estimators (LSE/QSE), a shallow feedforward neural network, and a single-cell LSTM network. They showed that the LSTM-based estimator is able to significantly outperform the LSE/QSE and feedforward network approaches, achieving remarkable reconstruction performance in both the time and frequency domains, even for noisy sensor signals. For this reason, LSTM networks are chosen for the current study.
The objective of this work is to estimate high-frequency dynamics from undersampled PIV data. A novel pressure sensor-based estimation technique using LSTM neural networks and a sparse array of surface-pressure sensors is introduced. To the authors' knowledge, this work presents (i) the first application of a data-driven temporal super-resolution technique using pressure sensors on a PIV dataset of a high-Reynolds number turbulent separated flow with energy distributed over a very large range of wavelengths/frequencies, and (ii) introduces a strategy for leveraging the TR pressure sensor history to resolve otherwise aliased spectral contributions to the velocity field using an LSTM estimator. The effectiveness of this technique is demonstrated on a PIV dataset of a high Reynolds number turbulent separated flow over the speed-Bump. The 15 Hz PIV flowfields are super-resolved to 2 kHz from unseen sensor inputs to educe unsteady dynamics at originally aliased frequencies. The estimated time-resolved (TR) dynamics are compared with those of other separated flows.

Methodology
The experimental setup and data acquisition methods are detailed in Sect. 2.1. In Sect. 2.2, the flow estimation and reconstruction techniques are outlined.

Experimental setup and data acquisition
Experiments are conducted in the 3' × 3' low-speed wind tunnel at the University of Washington. The test section width is L = 0.914 m [see Fig. 1a]. PIV is used to acquire two-component velocity data in a streamwise-vertical plane, denoted Ω , that captures a 22.25 × 18.77 cm 2 field-of-view, which is chosen to always encompass the motion of the separation point (Annamalai 2022) [see Fig. 1b]. The lightsource is a Quantel Evergreen dual-pulsed 200 mJ pulse −1 532 nm Nd: YAG laser with a maximum repetition rate of f PIV = 15 Hz. The light sheet thickness is Δy ≈ 1 mm. A dataset of N = 23 920 snapshots is acquired, which consists of seven independent runs. The freestream speed is maintained at U ∞ = 45.0 m s −1 with a standard deviation of 0.3 m s −1 across the seven runs. Based on U ∞ , the nominal Reynolds number is Re H = 2.26 × 10 5 where H = 0.0766 m is the Bump height.
A de-ionized water and glycol mixture of smoke particles with nominal particle size of 1 m is used to seed the flow. A 5.5 megapixel Imager sCMOS camera ( 2560 × 2160 pixels) is used to capture the frames via a Nikon Micro 60 mm focal length lens with aperture set to an f-stop of f/8. This yields a pixel resolution of 11.14 pixel mm −1 . Image acquisition, calibration and PIV postprocessing are controlled through the LaVision DaVis plane Ω , with mean streamwise velocity field U(x) ∕ U ∞ and superimposed streamlines. Nine pressure taps along Bump surface, marked as black dots, are labeled Taps 6-14 10 software. The time interval between successive image pairs is set to 25.4 s to limit the maximum particle displacement to 12 pixels.
Images are post-processed using a multi-grid, multipass, cross-correlation method. The final pass uses a 32 × 32 pixel interrogation window with Gaussian windowing and 75% overlap. The spatial resolution is thus estimated to be 26.5 pixels per interrogation window or 2.38 mm ( = 0.031 H ) per window (see Chapter 3.1.3, Williams (2014)). The vectors are validated using a correlation level filter and three passes of a normalized median filter using a 5 × 5 vector window (Westerweel and Scarano 2005). The average percentage of missing vectors per snapshot is 0.26%.
The static wall pressure at nine locations is measured using an Advantech PCIE-1805 data acquisition card via multiplexing and AllSensors 5" H 2 O D1-P4V-miniature amplified differential pressure transducers. Pressure measurements are sampled at f sens = 31.25 kHz and are synchronized to the snapshots to within f −1 sens . All pressure measurements are low-pass filtered using a minimum order digital filter with a cut-off frequency at 1 kHz.

Flow estimation
The snapshot POD is summarized in Sect. 2.2.1. An overview of the estimation procedure is provided in Sect. 2.2.2, and the POD mode selection criterion is detailed in Sect. 2.2.3. For the following sections, the Reynolds decomposition of the velocity field is given by u(x, t) = U(x) + u � (x, t) where U(x) and u � (x, t) are the mean and fluctuating fields with components U(x), W(x) and u � (x, t), w � (x, t) , respectively.

Snapshot proper orthogonal decomposition
The snapshot POD is used to obtain a low-order reconstruction (LOR) of u(x, t) up to the leading r terms/modes: The energy captured by the first r modes The time-varying coefficients a (i) (t) are obtained by orthogonal projection and satisfy orthogonality following Nobach et al. (2007): with ⟨ ⋅ , ⋅ ⟩ the inner product in the time domain, T is the acquisition window, i are the eigenvalues of the two-point correlation function, and ik is the Kronecker delta.

Overview of estimation procedure
An overview of the estimation training procedure is provided in Fig. 2. First, the snapshot POD is computed on the under-sampled 15 Hz dataset of N train training snapshots of where T train is the interval spanning the training data and k ∈ [1, N train ] . The criterion to select the number of modes r to represent the integral flow scales is described in Sect. 2.2.3. This yields a set of spatial bases (ii) The time-varying POD coefficients are used with the synchronized (red dots connected by red lines to assist visibility) and TR wall pressure sensor data to train an estimator; (iii) The map F is approximated, which takes as input a time-delayed sequence of pressure data and estimates the POD coefficients at t k Second, an estimator is trained to predict the state of the velocity POD coefficients using the available TR pressure sensor history. The instantaneous fluctuating pressure at the nine sensors and r POD coefficients at t k are arranged into column vectors p � k ∈ ℝ 9 and a k ∈ ℝ r , respectively. The time- where n is the number of time steps at f sens that is empirically chosen to minimize the estimation error, is the number of time steps at f sens defining the time resolution/interval within the time-series, and S is the resultant sequence length. An estimator is trained to approximate the map F for all k using the training data: i.e., F uses a historical time-series sequence of fluctuating static wall-pressure at times { t k − n ∶ ⋅ f −1 sens ∶ t k } as inputs to predict the velocity POD coefficients at t k . The estimator is trained to minimize the mean-square-error between the predicted output state, ã k , and the true state, a k . A long short-term memory (LSTM) neural network is used to approximate the map F ; see Hochreiter and Schmidhuber (1997) for a detailed description on the architecture. The network is trained using the stochastic gradient descent algorithm with early stopping (Goodfellow et al. 2016).
For super-resolution, the TR pressure data is used with the trained estimator to predict the TR evolution of POD coefficients at any desired sampling rate that is less than f sens = 31.25 kHz. The estimated TR coefficients {ã (1) (t k ), … ,ã (r) (t k ) } are then projected onto the spatial modes { (1) (x), … , (r) (x) } obtained in the initial decomposition step using the training data to yield the TR velocity time-series through (1).

Proper orthogonal decomposition mode selection
A LOR in (1) is typically obtained using an energy-based criterion, where the truncated series of r modes retains a certain amount (usually > 99% ) of the total turbulent kinetic energy (TKE). However, the required number of modes to retain > 99% of the TKE for turbulent flows with large scaleseparation can be prohibitively large for estimation problems. Discetti et al. (2018) proposed a truncation criterion that removes the uncorrelated portion of the sensor signals from the flowfield reconstruction, effectively filtering out the content that does not contribute to estimator performance. Although this method has been shown to improve the reconstruction quality of specific flow cases where the scale-separation is not large, the effectiveness of this criterion may be greatly reduced when applied to high Reynolds number flows where the TKE is distributed over a very large range of scales/frequencies, such as that of the Bump. In particular, the criterion proposed by Discetti et al. (2018) could retain additional low-energy modes that not only increase the computational cost, but also do not significantly improve reconstruction performance. The present work aims to resolve the scales that dominate the turbulence production, P . Thus, the number of POD modes, r, used to construct the LOR is chosen such that the Reynolds stresses −u �2 , −w �2 and −u � w � are resolved. If the Reynolds stress tensor for the original high-dimensional field is defined as R , then R (k) is the corresponding tensor of the LOR obtained using the first k POD modes. The individual relative contribution of mode k to the Reynolds stress reconstruction is calculated using where | ⋅ | denotes the absolute value, M is the total number of points in Ω , and the argument (p, q) identifies the element (using row-column indices) in the matrix representation of R ij . Plotting Δ ij against the number of modes defining the LOR admits the qualitative inspection of the rate at which Δ ij saturates, or the rate at which individual modal contributions to Reynolds stress reconstructions can be deemed negligible-thus defining the cut-off k = r . The estimator is then trained to predict the temporal coefficients of the r modes.

Results and discussion
The flowfield is characterized in Sect. 3.1. The hyperparameter study and flow reconstruction performance are presented in Sect. 3.2, followed by a brief discussion of the unsteady TSB dynamics based on the TR estimation in Sect. 3.3.

Mean field topology and mode selection
The time-averaged streamwise field shows a portion of a large separated zone [see (4) The mean separation bubble length, L sep = 0.226 m, is computed as the Euclidean distance between separation and reattachment.
The snapshot POD is computed on N train = 11 962 snapshots (revisited in Sect. 3.2.1). Based on Sect. 2.2.3, the first r = 80 modes (83.5% of planar TKE) are sufficient to resolve the Reynolds stresses [see Fig. 3a]. Sample 80-mode reconstructions of instantaneous snapshots show the rollup of vortices at x∕L ≈ 0.12 and reveal large-scale structures along the shear layer for x∕L > 0.14 [see Fig. 3b-c]. It is emphasized that although the first three POD modes can retain nearly half the total planar TKE (46.5%), a LOR derived from these modes cannot accurately reconstruct the Reynolds stress distributions [see Fig. 3a].

Unsteady wall-pressure dynamics at separation
Power spectral density (PSD) functions of the pressure fluctuations (denoted by Ψ p ) of the TR wall-pressure sensor measurements are used to identify energetic spectral regions expected to represent signatures of the large-scale dynamics. Taps 7-12 likely contain the spectral signatures of convective and global instabilities emanating from within the flow domain Ω and potentially from upstream disturbances (Annamalai 2022). Thus, the magnitude distribution of the pre-multiplied pressure PSDs pertaining to Taps 7-12 is plotted as a function of the Strouhal number Fig. 4. The spectra are each averaged using a Hamming window size of 2 15 samples with 50% overlap for 4.28 × 10 6 samples acquired at f sens = 31.25 kHz.
The spectra reveal two broadband frequencies. A low-frequency spanning St ∈ [0.02, 0.12] centered at St br = 0.07 (13.5 Hz) likely corresponds to the "breathing" motion of the TSB or the "flapping" of the separated shear layer [see Mohammed-Taifour and Weiss (2016) and Wu et al. (2020)]. The medium-frequency spanning St ∈ [0.42, 0.92] centered at St sh = 0.68 (135 Hz) likely corresponds to the shedding of spanwise-oriented vortex rollers along the separated shear layer. The shedding frequency is approximately 10 times that of the breathing frequency, and the fluctuations are about 2.5 times more energetic. In particular, (i) the region x∕L ∈ [0, 0.12] that includes the range of separation points is dominated by the breathing motion at St br , (ii) a redistribution of energy occurs from St br to higher frequencies around 2 St sh for x∕L ∈ [0.12, 0.13] followed by (iii) a shift in frequency from 2 St sh to St sh roughly until x∕L = 0.22 (downstream of Ω ), and (iv) the shedding at St sh remains the dominant phenomenon at least until x∕L = 0.31 . Similar dynamics and shift in frequency have previously been reported for other separating-reattaching flows such as the ones studied by Cherry et al. (1984) and Hudy et al. (2003). The observed dynamics are above the Shannon-Nyquist limit of 7.5 Hz for the 15 Hz PIV sampling rate ( St > 0.038 ). Thus, a reconstruction resolving the aliased

Training and reconstruction performance
Section 3.2.1 provides a brief overview of the hyperparameter study conducted to train the estimator. In Sect. 3.2.2, the estimation performance is assessed in detail and validated against the ground-truth 15 Hz data.

Hyperparameter study
The following hyperparameters are varied to study their effect on estimation performance: training dataset size ( N train ), sequence duration (n), and input pressure cut-off frequency ( f cut-off ). The dataset of N = 23 920 undersampled snapshots (consisting of seven independent batch-runs) is split into several datasets: five training, one validation, and one testing. For each training set, the validation and testing datasets (each disjoint from the training sets and from each other) consisting of N val = 3 985 and N test = 3 985 snapshots are used, respectively. The remaining batches are used to construct five training datasets: N train ∈ {1 995, 5 981, 7 975, 11 961, 15 948} , where each dataset is a subset of a larger one. The snapshot POD is computed over both the training and validation sets to obtain the dataset of POD coefficients.
Since all pressure measurements are low-pass filtered with a cut-off frequency at 1 kHz, the input time series are resampled at 2 kHz; this gives = 15 . For N train = 5 981 , the sequence length is varied over the following n ∈ {100, 200 ∶ 200 ∶ 1600} . The remaining datasets are trained over the subset n ∈ {400, 1000, 1400} . Both the input and output states are z-score normalized using the respective standard deviations.
A single-cell LSTM network with 400 hidden units is used for training. Deeper networks with additional layers do not improve performance. The adaptive moment estimation (Adam) optimizer is used with an initial learning rate of 10 −3 that is dropped by 1% every 10 epochs. A minibatch size of 64 samples is used, and the network is trained for 5 000 epochs. The loss function is the half mean-square-error with L 2 regularization set to 10 −3 .
The dominant contribution to P is from R 12 = −u � w � . The major error metric considered in this study is: is the LSTM-predicted value and || ⋅ || 2 denotes the L 2 norm computed over the entire flow domain Ω . (Note that the spatial arguments for u � w � (LOR) and u � w � (LSTM) are dropped in (5) for a lighter notation.) The estimation error at each instant in time over the test dataset is also recorded. The metric proposed in Tu et al. (2013) is used: where the denominator is constant and is the sum of the first r = 80 POD eigenvalues, ∑ r k = 1 k . In particular, e(t i ) "quantifies the fraction of mean kinetic energy contained in the estimation error" (Tu et al. 2013) at a given t i .
Finally, the estimator with the ability to best reconstruct the important dynamics/statistics (i.e., lowest (Ω) u � w � value) is chosen to study the effect of pressure cut-off frequencies on estimation performance. The hyperparameter search space as well as the corresponding values for (Ω) u � w � (%) and e(t) are tabulated in Table 1 of Appendix A. Some general observations are summarized below: • Instantaneous errors, e(t) , do not scale similarly with the errors in u ′ w ′ , (Ω) u � w � . A detailed discussion of the relevance of the instantaneous errors to estimation problems in the context of high-dimensional/multi-scale turbulent flows is provided in Sect. 3.2.2. • None of the networks performs well for an input pressure sequence duration of n < 200 (or t U ∞ ∕ L sep < 1.27 convective time units). • Generally, it is found that (Ω) u � w � ⪅ 10% for n ≥ 400 ( t U ∞ ∕ L sep ≥ 2.55 ) and N train ≥ 5 981 . This may suggest that either (i) the LSTM network is unable to extract additional information from sequences longer than ≈ 2.55 convective time units, or (ii) sequences longer than ≈ 2.55 time units do not contain additional information that improves LSTM performance. Note that a single shedding cycle corresponds to 1 ∕ St sh ≈ 1.47 convective time units. Therefore, this result suggests that the minimum for the sequence duration, n, is approximately bounded by 200 ≤ n ≤ 400 or 1.27 ≤ t U ∞ ∕ L sep ≤ 2.55 time units, and thus comparable to one shedding cycle. • No significant improvement in performance is observed for datasets containing N train > 7 975 samples. One possible reason for this result might be because the five training datasets are acquired in separate batches, each with slightly different freestream conditions (recall that the standard deviation of U ∞ is 0.3 m s −1 ). The marginal differences between the batches could likely introduce noise in the combined training datasets, further challenging the learning algorithm. A similar observation was made by Ozawa et al. (2022), who studied the effect of training dataset size on their super-resolution algorithm by (i) constructing subsets from a single large dataset of 15 000 non-TR-PIV snapshots (acquired at 4 kHz) of a Mach 1.35 supersonic jet, and (ii) applying their algorithm on each of these training (sub)sets. Unlike the present study, however, they observed that their reconstruction error starts to monotonically increase linearly for N train ≥ 7 000 (while the error remained approximately constant for 1 000 ≤ N train ≤ 7 000 ). The authors attributed the cause of this increasing error to slight changes in jet conditions over the course of longer periods of data acquisition. • Architectures with longer sequence lengths (S) increase the computing time to estimate each state since the number of model parameters becomes prohibitively large. This would, for instance, render these architectures inefficient in applications demanding low-latency system responses, such as real-time estimation in active flow control of high-speed flows. • No significant effect on performance is observed when the pressure cut-off frequency is varied from 1 kHz to 4 kHz. Setting the cut-off frequency to sub-kHz values would affect reconstruction/super-resolution performance in the inertial subrange, and hence these values are not considered.
The LSTM estimator resulting in the lowest (Ω) u � w � value (3.73% in this case) is used for the analyses in Sects. 3.2.2 and 3.3: N train = 5 981 , pressure cut-off frequency set at 1 kHz (with = 15 ), n = 400 (or t U ∞ ∕L sep = 2.55 time units), and thus an overall input sequence length of S = 27.

Reconstruction performance
The LSTM-predicted −u � w � profiles from unseen pressure inputs spanning the test interval T test ⊄ T train are shown in Fig. 5a. The estimator is able to accurately predict −u � w � compared to the reference LOR throughout Ω with a relative leastsquare error of (Ω) u � w � = 3.73% . The local error for u ′ w ′ as a function of vertical distance from the wall is shown for several x/L in Fig. 5b. Dropping the spatial arguments (x, z) for a lighter notation, this error is defined as: at a given x. The maximum error for the Reynolds shear stress is around 8%. The highest errors are Fig. 5 au ′ w ′ profiles for the original field, LOR and LSTM predictions (from unseen pressure sensor data) at multiple streamwise locations. Profiles are each offset by 1.5 × 10 −5 , and every other point is plotted for clarity. The ordinate is offset by the wall location, z wall . (b) Error plots of -u ′ w ′ shown as u � w � (%) as functions of vertical distance from wall concentrated near the wall around the separation point x∕L ≈ 0.10 and in the shear layer further downstream.
The time-averaged instantaneous error computed using (6), e(t) , is 1.11, which is surprisingly large. The following analysis aims to clarify the interpretation of the instantaneous errors, and to provide insight into the challenges associated with sensor-based estimation when applied to highly turbulent flows. In particular, the use of instantaneous error metrics alone to judge estimation performance may be misleading for flows for which the fluctuation energy is distributed over a large number of modes.
In Tu et al. (2013), the estimation error, e = e (1∶ 7) , is in the range e ∈ [0.1, 0.2] [see Tu et al. (2013),[ Fig. 11]]. The flow case under investigation in Tu et al. (2013) exhibits low-rank single-frequency vortex shedding with nearly pure-tone spectral signatures. The first two vortex shedding modes of their wake flow each contain 40% of the total TKE, and the first seven modes contain 85%. For a better comparison of the mode errors and reconstructions with earlier studies (Tu et al. 2013;Ozawa et al. 2022), (6) is modified to compute the single-mode error: The error for mode 1 is considered first, as it has similar TKE contributions (33.5% of planar TKE) to that found in the first mode of Tu et al. (2013), see Fig. 6. The LSTM is found to accurately predict the general trend of the breathing mode dynamics in a (1) with an average error of e (1) = 0.16 , which is similar to that reported by Tu et al. (2013).
The distribution of the single-mode errors, e (1) , e (10) , and e (80) , is shown for all t ∈ T test through their probability density functions (PDFs) in Fig. 7a. Note that the results shown in Fig. 7a are representative of the individual mode errors across the first 80 modes. The PDFs exhibit the shape of Gamma ( Γ ) functions, which is verified by (i) fitting a Γ function to the original error PDFs of e (i) to yield fit parameters ( i , i ) and, (ii) plotting the PDF of a set of 100 000 random samples generated from the Γ-fitted distribution.
There is a significantly greater spread in the right-tail of the PDF for the low-order mode errors (see e (1) vs. e (80) ). Overall, the results suggest that individual modes are predicted well at any given time instant and across most of the test dataset. The error summed over a monotonic set of modes p ∶ q is defined as the cumulative-mode error: Fig. 6 a Sample time-series of ground truth a (1) sampled at 15 Hz and LSTM prediction, ã (1) ; dots connected by lines to assist visibility only. b Single-mode instantaneous error in a (1) , e (1) , with its time-averaged error, e (1) , shown as horizontal dashed line and in text Fig. 7 Original PDFs and PDFs calculated from 100,000 random samples generated from a fitted Γ function with parameters ( i , i ) for a single-mode errors for modes 1, 10 and 80, and ( 1∶ i , 1∶ i ) for b cumulative-mode errors of over the first 2, 10, and all 80 modes which is used to assess the effect of adding higher-order modes on the overall instantaneous errors. The PDFs of the cumulative-mode errors from (9) are plotted for modes 1 : 2, 1 : 10, and all modes 1 : 80 in Fig. 7b. The addition of higher-order modes generates a family of Γ functions (with fit parameters 1∶ i , 1∶ i ). With the inclusion of additional modes, the cumulative-mode error distributions shift toward larger (cumulative) errors and retain fewer perfectly predicted (high-dimensional) states. For instance, in contrast to a two-mode reconstruction, there are no states with zero error for the full-rank 80-mode reconstruction-compare e (1∶ 2) with e (1∶ 80) .
The Γ-fitted PDFs corresponding to both the individual and cumulative-mode errors provide insight into the underlying nature of the error. In particular, from probability theory, the sum of independent Γ-distributed random variables results in another Γ-distributed random variable (Moschopoulos 1985). This is precisely what is observed in the behavior of the mode error quantities shown in Fig. 7. The significance of this result is that the single-mode errors can be treated as independent random variables, highlighting the unbiased manner in which the LSTM predicts modal dynamics. Overall, the LSTM captures the underlying coherent motions embedded within seemingly highly varied linear combinations of the first 80 most energetic modes. This results in excellent reconstruction of the important statistics/dynamics (demonstrated through Figs. 5 and 6). However, the cumulative-mode error PDFs in Fig. 7b indicate that the instantaneous fields produced through the cumulative sum of these modes must contain significant uncorrelated/random fluctuations; these are not preserved in the LSTM estimate.
Supporting the results shown in Figs. 5, 6 and 7, several quadrant plots of low-order reconstructions at select locations across the shear layer (along x∕L = 0.17 ) are provided in Fig. 8. These plots are an additional method to qualitatively verify the ability of the estimator to reconstruct the important fluctuation dynamics. In particular, the collapse of the quadrant plots pertaining to the reconstructed dynamics onto the ground-truth observations shows that the structure of the main turbulence-producing mechanisms around the shear layer region is preserved.
To highlight the trade-off relationship between the instantaneous errors and Reynolds stresses upon adding higherorder modes to the LOR, the cumulative global error in u ′ w ′ is defined: is the LSTM-predicted Reynolds shear stress field computed using the first i modes and u � w � (LOR) using the first 80 modes. The mean cumulative-mode errors e (1∶ i) (t) are plotted alongside the cumulative global L 2 errors (%) for u ′ w ′ , (Ω) u � w � (i) , for the first 80 modes in Fig. 9. Although the addition of higher-order modes to the LOR increases the cumulative-mode errors defined in (9), these errors are randomly distributed, while the prediction performance of individual modes remains relatively unaffectedsee Fig. 7. The improvement in Reynolds shear stress prediction with the addition of modes further supports the finding that cumulative-mode errors are uncorrelated to the coherent motions in the flow.

Estimated time-resolved dynamics
The energetically dominant large-scale dynamics identified in the pressure spectra in Sect. 3.1.2 can be mostly described by the first three modes (46.5% planar TKE). The spatial modes are presented with the pre-multiplied PSD of the corresponding coefficients at 15 Hz and 2 kHz super-resolved estimates obtained from unseen pressure sensor inputs spanning T test in Fig. 10. Inclusion of higher-order modes refines the discussion presented in this section, but these modes are qualitatively similar and are thus omitted for brevity. The spectra of the super-resolved coefficients indicate that the breathing and shedding frequencies are resolved. A supplementary video of the super-resolved 2 kHz velocity flowfield estimates is provided in "Movie 2.mp4." The shape of Φ (1) u is that of the shear layer, and the super-resolved spectra of mode 1, ã (1) , shows a high-magnitude hump centered at St br and a smaller hump centered at St sh . This suggests that the breathing/flapping phenomenon is the most energetically dominant process at separation. The breathing frequency is dominated by a streamwise-oriented elongated structure in Φ (1) u along the shear layer, which has also been observed by Mohammed-Taifour and Weiss (2016). The smaller hump at the shedding frequency in ã (1) appears to be associated with a streamwise-oriented structure in Φ (1) w at x∕L ≈ 0.15. For ã (2) , a broadband hump centered at St ≈ 1.5 St sh is observed, and a relatively narrow-banded hump at St ≈ 2 St sh is observed for ã (3) . The broadband hump in the mode-2 spectrum includes the mode-3 peak at St ≈ 2 St sh , which suggests mixing between the two modes. In particular, (i) a streamwise-elongated lobe in Φ (3) u extends from the wall very close to the mean separation point, (ii) a similar lobe is observed in Φ (2) u but is shifted downstream with part of the lobe cleaving from this structure. This appears to represent a mode pair in u, describing vortex roll-up and initiation of shedding. The structures downstream of this lobe in Φ (2∶ 3) u , along with the structures in Φ (2∶ 3) w that appear to be phase-shifted, may describe the transient organization of shear layer structures through convection of vortices along the shear layer. Similar dynamics have been reported in the analysis of a TSB by Wu et al. (2020) using dynamic mode decomposition.
The appearance of w-modulations at x∕L ≈ 0.14 in Φ (1∶ 3) w coincides with the onset of the shift in frequency observed in the wall-pressure PSD for x∕L ∈ [0.14, 0.22] (see Fig. 4).  Fig. 9 Mean cumulative-mode errors, e (1∶ i) (t) (%) , defined through (9) and global L 2 errors of u ′ w ′ reconstructions, (Ω) u � w � (i) (%) , defined through (10) for the first 80 modes Fig. 10 Columns 1-3 (from left to right) correspond to the first three spatial POD modes with u-(top) and w-components (bottom) normalized by U ∞ . Column 4 shows pre-multiplied PSDs of corresponding original (15 Hz) and super-resolved (2 kHz) coefficients estimated from unseen pressure inputs spanning T test . The PSDs of each mode are offset by 10 −1 for clarity. Modes 1-3 represent 33.8%, 7.5% and 5.2% of the planar turbulent kinetic energy, respectively decreasing frequency of wall-pressure PSD from 2 St sh to St sh for x∕L ∈ [0.14, 0.22] is interpreted to indicate the formation of large-scale vortices, potentially through vortex merging; this has also been observed in a TSB by Wu et al. (2020).
The use of a reduced set of POD modes limits the effective spatial resolution of the resultant LOR, which affects the maximum resolvable target frequency for the super-resolution technique. The smallest wavelengths of the most energetic modes shown in Φ (1∶ 3) w are roughly 0.02 L = 0.24 H , which is 7.7 times the PIV spatial resolution ( 0.031 H ). The temporal variation of these spatial structures is found to span a wide range of time scales: from approximately 15 Hz to over 600 Hz (see the coefficient spectra in Fig. 10)-consistent with the observations in Sect. 3.1.2. On the other hand, the smallest wavelengths in mode 80 (not shown here) are approximately 0.005 L = 0.061 H , or two times the PIV spatial resolution-defining the resolvable spatial limit of the LOR and hence the maximum target frequency for superresolution. Therefore, it is not a coincidence that the errors in the Reynolds stress reconstructions of the LOR saturate after the addition of 80 modes [see Fig. 3a].
Spectral analysis can also be conducted directly using the super-resolved high-dimensional flowfields. The variation of pre-multiplied PSDs of the streamwise u-velocity component, St ⋅ Ψ u , is analyzed along the shear layer core at z∕L = 0.07 , and is plotted for various streamwise locations from the mean separation location to the downstream edge of Ω -see Fig. 11. The spectra illustrate the spatial dependence of the breathing and shedding phenomena. The breathing frequency peak at St br persists along the entire shear layer core for x∕L ∈ [0.11, 0.19] . This suggests that the breathing/flapping motion of the TSB dominates the region along the entire shear layer core. In contrast, the u-PSD amplitude at the shedding frequency, St sh , starts at roughly one-quarter of the amplitude at St br near the mean separation point at x∕L = 0.11 , and progressively increases with downstream distance along the core until the amplitude is matched with that corresponding to St br at x∕L = 0.19 . This increase in vortex shedding amplitude may correspond to the dynamics of spatially evolving structures from separation to the far-field region downstream along the shear layer core-consistent with the observations made earlier on the structures in (2∶ 3) (see Fig. 10). Moreover, this latter observation corroborates the shift in frequency from 2 St sh toward St sh found in the wall-pressure PSD for x∕L ∈ [0.11, 0.19] in Fig. 4. Larger fields-of-view encompassing the entire TSB could help decompose the breathing and shedding motions into separate POD modes, enabling analysis of the coupled breathing/shedding dynamics and inter-modal energy exchange.

Conclusions
The present work proposes a novel data-driven estimation technique that uses oversampled surface-mounted pressure sensors and long short-term memory (LSTM) neural networks to predict transient dynamics that are inherently aliased from undersampled PIV data. In contrast to existing techniques, the method leverages the time-resolved pressure history between the pressure sensor state and a loworder representation of the velocity field state represented by proper orthogonal decomposition (POD) coefficients to resolve otherwise aliased frequencies.
The proposed approach is demonstrated on an undersampled PIV dataset (acquired at 15 Hz) of high Reynolds number turbulent separated flow over a Gaussian speed-bump benchmark geometry with freestream speed U ∞ = 45 m s −1 or Re H = 2.26 × 10 5 , where H is the Bump height. Nine wall-pressure sensors are used to estimate the first 80 POD modes (83.5% of planar turbulent kinetic energy). Spectral analysis of the flowfields after super-resolving the 15 Hz PIV data to 2 kHz reveals (i) a low-frequency breathing mode at 13.5 Hz or St br = f L sep U −1 ∞ = 0.07 that is associated with the contraction and expansion of the turbulent separation bubble (TSB) where L sep is the mean separation bubble length, and (ii) a medium-frequency mode at 135