A novel principle to localize scattered-wave sensitivity using wave 2 interference and its adjoint 3

11 When using scattered waves for high-resolution imaging of a medium, the sensitivity of these 12 waves to the spatiotemporal distribution of heterogeneities is undoubtedly a key factor. The 13 traditional principle behind using scattered waves to detect small changes suffers from an 14 inherent limitation when other structures, not of interest, are present along the wave 15 propagation path. We propose a novel principle that leads to enhanced localization of wave 16 sensitivity, without having to know the intermediate structures. This new principle emerges 17 from a boundary integral representation which utilizes wave interferences observed at 18 multiple points. When tested on geophysical acoustic wave data, this new principle leads to 19 much better sensitivity localization and detection of small changes in seismic velocities, 20 which were otherwise impossible. Overcoming the insensitivity to a target area, it offers new 21 possibilities for imaging and monitoring small changes in properties, which is critical in a 22 wide range of disciplines and scales.


Introduction
1 Scattered waves and their sensitivity to heterogeneity are fundamentally important to 2 study any kind of material structure. The sensitivity represents how much changes in wave 3 scattering is associated with changes in the heterogeneity. A knowledge of the sensitivity and 4 its distribution enables one to address why, how and where such changes occur 1-4 . 5 When the heterogeneities are known, then one can model the scattered wave data. The 6 sensitivity (Jacobian) derived from the modelled data can be used to predict how those 7 heterogeneities in natural or engineered composite materials interact when subjected to 8 external stimuli 1-3 . When the heterogeneities are not known, the sensitivity is derived from 9 the observed data. In this case, the sensitivity is defined as a change (gradient) in the 10 difference between the observed waveform and the waveform estimated based on a physical 11 principle, as a result of changes in the assumed model 4- 6 . In this article, we address this latter 12 sensitivity issue. This is important in many real-world problems that involve resolving and 13 monitoring unknown heterogeneities. For example, the sensitivity derived from acoustic, 14 electromagnetic, or seismic scattered waves is key to in-vivo medical imaging to identify 15 tumors 7,8 , in-situ seismological monitoring of geological materials (e.g., rock, soil, ice, 16 fluid) 6,9,10 , and health monitoring of civil engineering structures (e.g., metal and concrete) 11 . 17 The conventional principle behind calculating wave sensitivity involves one source 18 point, one observation point, and the Huygens' principle 4-6 : an incident wave at a source 19 point causes a disturbance in the material, a secondary wavefield is generated at a 20 heterogeneity, and the total wavefield (incident and scattered waves) is measured at an 21 observation point. The sensitivity is then calculated using two Helmholtz equations: one in 22 which the incident wave propagates forward in time from the source point, and one in which 23 scattered waves propagate backward in time from the observation point (adjoint). These two 24 wavefields have identical arrival times at the location where scattered waves are generated 25 (i.e., at the location of the heterogeneity). The sensitivity of the scattered waves is calculated 1 by analyzing the amount of correlation between the two wavefields 4-6 . 2 In various disciplines, owing to the reduced footprint of sensors and the increased 3 computational resources, measurements and data processing based on many spatial 4 observation points have become common. For example, scattered waves are measured with 5 spatially dense sampling, e.g., on the surface of the Earth 12,13 or along boreholes 14,15 , at the 6 surface of or inside the human body 16,17 , and in civil engineering structures 16,18 . These 7 developments have led to spatially and temporally high-resolution imaging of materials 8 across scales 12,15,19-21 . 9 Using the conventional principle, the sensitivity from multiple sets of source-10 observation points is obtained by simply summing up the sensitivity from every single set 4-6 , 11 due to the linearity of the problem. Although this approach improves the sensitivity 12 estimation, it does not fully exploit the interrelation among the scattered waves. As the 13 conventional physical principle addresses material heterogeneity present along the wave 14 propagation paths starting from a source and ending at an observation point, small material 15 perturbation between the observation points does not matter, unless the heterogeneities 16 around the source points and those present along the source-observation paths are sufficiently 17 known. This limits resolving the temporal changes in many applications. For example, in 18 geophysical monitoring, wave sources at the Earth's surface and buried receivers are often 19 deployed to monitor stress and/or fluid in the underground [22][23][24] . In addition to time-lapse 20 changes in the target area, such monitoring data, however, also contain the effect of time-21 lapse changes around a source point, e.g., due to environmental effects (like rainfall), which 22 can jeopardize entire time-lapse monitoring efforts 23 . 23 In this article, we present a novel physical principle for calculating and localizing the 24 sensitivity without having to know the heterogeneities around the source points and those 25 contributions from the reference receiver array and adding them result in the scattered waves 1 (red arrows in Fig. 2b) which propagate backward from the hypothetical scatterer to the 2 source point, leaving only the forward propagating wave travelling from Q to P (green arrow 3 in Fig. 2b). In this way, by analyzing the amount of correlation between the forward and the 4 backward propagating waves, a large sensitivity can be achieved at the true scatterer Q in a 5 data-driven manner, without having to know the heterogeneity around the source point. 6 This novel principle calculates the sensitivity which is highly localized at the location 7 of the scatterer Q, without any knowledge of the complete heterogeneity distribution (Fig.  8 2c). In contrast, the sensitivity obtained from the conventional principle, using the same 9 source-observation points, does not allow detecting the local scatterer Q (Fig. 2d). The same 10 conclusion can be drawn even when we use all available data in the calculation (see 11 Supplementary Fig. 1). When we assume that all heterogeneities but Q are perfectly known, 12 the conventional sensitivity approaches the localized sensitivity that we estimate using the 13 new principle (Fig. 2e) We have tested this new sensitivity localization principle on field experimental data. 21 The measurement geometry is same as in Fig. 1a, but here we have multiple observation 22 points P located along another vertical line (Fig. 3a): in total we observe wavefield using two 23 vertical arrays (left array, LA, and right array, RA). In the field test, this corresponds to 24 measurements in two vertical boreholes. 25 For the sake of clarity, here we need to distinguish between extrapolated and 1 calculated waveforms. Extrapolated waveforms are obtained using the boundary integral 2 representation that we have presented above and assuming homogeneity (constant acoustic 3 velocity). The calculated waveform, on the other hand, are the ones obtained through 4 numerical (finite-difference, FD) computation also assuming homogeneity, where the source 5 wavelet is estimated by deconvolution of the recorded waveform 31 with a waveform that is 6 calculated assuming the same homogeneity and an impulsive source at S. Therefore, the 7 difference in waveforms between the observation (black lines in Fig. 3b) and the calculation 8 or extrapolation (green and red lines in Fig. 3b) indicates the deviation from the 9 homogeneity. The waveforms calculated using the conventional approach by FD method 10 (green lines in Fig. 3b) assumes a globally homogeneous material. On the other hand, those 11 using the boundary integral representation (red lines in Fig. 3b) assumes local homogeneity 12 between LA and RA, and heterogeneities around the source point (gray-shaded area in Fig.  13 3a) are accounted for in a data-driven manner. As a result, their waveforms vary over the 14 receiver location (red lines in Fig. 3b), which implies an improved sensitivity to the local 15 heterogeneity. 16 The sensitivity localization is evident in Fig. 4a. The conventional sensitivity shows 17 large values around the source point S. Further, the conventional sensitivity varies smoothly 18 in the subsurface, which implies small correlation between the incident and the scattered 19 waves (averaging out of the contribution of different local scatterers). This is due to the fact 20 that in the conventional approach, the difference waveforms have a complex nature as they 21 include more scatterers (see black and green lines Fig. 3b). In contrast, the localized 22 sensitivity, derived from the new principle found in this research, reveals a very detailed 23 structure between LA and RA (Fig. 4a). The sensitivity indicates the amount of velocity 24 perturbation/changes with respect to homogeneity, assuming Born scattering 32 . A comparison 25 with the heterogeneity directly observed at LA (Fig. 4b) shows that the localized sensitivity 1 detects a much finer variation in heterogeneity than the conventional sensitivity at depths 2 greater than 80 m. The novel principle exploits information in the observed data in a 3 completely different manner than the conventional principle. As a result, the conventional 4 sensitivity cannot achieve comparably good results even using all available data, including 5 data from the reference receiver array ( Supplementary Fig. 2). 6 7 Localized sensitivity: quantitative estimation of heterogeneity 8 The localized sensitivity can be exploited to resolve quantitatively the material 9 heterogeneity. An inversion scheme can be formulated to estimate the acoustic velocity 10 distribution by minimizing the difference between the observed and the calculated waveforms 11 at the observation point. The localized sensitivity can navigate iteratively toward a best-fit 12 model using nonlinear inversion (see "Methods") without a knowledge of the heterogeneity 13 around the source points. 14 Using the same geometry as in Fig. 3a, multiple sources were used sequentially in the 15 field to generate pressure waves at right to RA and left to LA (Fig. 5a) in order to illuminate 16 the medium from various directions. The reference receiver array, the observation point (P), 17 and the zone which does not contribute to calculating the localized sensitivity are 18 appropriately defined depending on the source location (Fig. 5a). In order to verify a resolved 19 heterogeneity, we additionally perform independent waveform measurements (ground-20 truthing) using downhole sources (Fig. 5b) and apply the conventional waveform inversion 21 (Supplementary Note 2). 22 Waveform inversion estimates a velocity model starting from an initial guess 4-6 . We 23 perform standard traveltime tomography to obtain the starting model (Fig. 6a). Waveforms 24 around the first-arriving events and a frequency component similar to that in the independent 25 measurements using downhole sources are analyzed (Supplementary Notes 3 and 4). Figure  1 6b shows the estimated velocity structure using the localized sensitivity derived from the 2 novel principle involving boundary integral representation. Figure 6c shows the result of 3 waveform inversion where additional downhole sources have been placed in RA (ground-4 truthing). Figure 6d shows in details a comparison between the different velocity models. The 5 estimated velocity using the localized sensitivity is strikingly close to the one obtained from 6 independent waveform inversion using downhole sources and also to acoustic well log data at 7 RA, especially at depths greater than 80 m where the raypath coverage is good (Fig. 6d). We delve further into this concept through performing realistic synthetic monitoring 20 tests. Although this novel principle can be useful in high-resolution monitoring in a wide 21 variety of fields e.g., medical sciences, non-destructive material testing, civil engineering, in 22 our synthetic test we consider geoscientific applications where monitoring is necessary while 23 injecting fluid into the subsurface using boreholes. For example, recycled water is injected 24 and stored in the aquifer for water resource management 34,35 , or treated water is injected in 25 order to produce energy in geothermal fields or to store carbon dioxide in the subsurface 36 . In 1 all these applications, detecting subsurface changes due to the replacement of fluid and 2 changes in the pore pressure is crucial 34,37,38 . Monitoring using sensors located in the 3 boreholes is generally performed for this purpose due to the sensitivity of downhole sensors 4 to changes at the target depths [22][23][24] . Source points are located at the surface, as boreholes are 5 generally inaccessible during the operation 35 . Therefore, we also consider the observation 6 points located in boreholes and the source points at the surface (Fig. 7a), similar to field 7 experiments discussed earlier in this article. 8 To generate realistic synthetic data, we assume a random velocity distribution with a 9 mean velocity of 2.0 km/s (Fig. 7a). The data contain source location errors and errors due to 10 temporal changes occurring outside the target area (the dashed rectangle in Fig. 7a). The 11 target area is located at 100 m depth where the velocity decreases by 5 % with respect to the 12 baseline measurement due to an increase in the pore pressure 39 . The topmost 6 meter is 13 modeled as a vadose zone having a random velocity distribution with a mean value of 1.0 14 km/s ( Supplementary Fig. 7). Additionally, the structure of the vadose zone is completely 15 different between the baseline and the monitor surveys ( Supplementary Fig. 7), representing 16 a possible drastic change in seismic velocity in this zone due to seasonal change in water 17 saturation 23,40 . Source-receiver geometry and frequency components are quite similar to those 18 in the field experiments discussed earlier (Supplementary Note 6), except that the source 19 location in the monitor survey contains random error up to 4 m ( Supplementary Fig. 7). 20 We first look at the result of imaging the inter-borehole velocity heterogeneities in the 21 baseline measurement. Here we consider two different scenarios for the prior information of 22 the non-target zone (outside the two boreholes) to build an initial velocity model for 23 waveform inversion. Assuming the same initial velocity model for depths greater than 16 m 24 ( Fig. 7b), we consider a situation where the correct average velocity and thickness of the 25 vadose zone are known (Fig. 7c), and in case we have a poor knowledge of them (Fig. 7d). 1 These two different initial velocity models are used to estimate the heterogeneities using the 2 conventional sensitivity (Figs. 7e,f) and the localized sensitivity (Figs. 7g,h), respectively. As 3 the recorded waveforms contain information of the structure present along the wave 4 propagation path connecting the surface source and the downhole receiver, the waveform 5 inversion using the conventional sensitivity estimates the heterogeneities not only in between 6 the boreholes but also outside, i.e., those structures to the left of LA and to the right of RA 7 (Figs. 7e,f) -which are not of interest. More critically, the estimation of the velocity structure 8 in between the borehole has been influenced by the accuracy in prior information of 9 structures outside the two boreholes, contributing to large uncertainties in the estimated inter-10 borehole heterogeneities (Figs. 7i,j,k). On the contrary, the new principle presented in this 11 research addresses the localized sensitivity and, therefore, provides directly the inter-borehole 12 structure (Figs. 7g,h) which is minimally influenced by the accuracy in the prior information 13 of the non-target zone (Fig. 7k). 14 In order to achieve accurate results using the conventional sensitivity, it is crucial to 15 account for the propagation effects outside the target zone (gray-shaded area in Figs. 7g,h) by 16 obtaining independently good prior information. Alternatively, one can design carefully a 17 multi-scale inversion scheme utilizing sequentially data from lower to higher frequencies in 18 order to avoid gaps in the wavenumber information 6 . However, this is not a trivial task due to 19 the difficulty in acquiring low-frequency data using controlled sources 6 and because each 20 frequency component has generally a different signal-to-noise ratio. The localized sensitivity 21 is free from uncertainties associated with these fundamental limitations, because the 22 propagation effects outside the target zone are accounted for in a data-driven manner. 23 Next we concentrate on the monitoring of time-lapse changes in the target zone which 24 is located around 100 m depth (dashed rectangle in Fig. 7a). The results are shown in Fig. 8.  25 14 The new principle estimates the temporal changes at the target depth much better than the 1 conventional approach (Figs. 8b,c). The conventional approach is sensitive to the source 2 location errors in case an accurate prior information of the vadose zone is available 3 ( Supplementary Fig. 8). Generally, the conventional approach requires an accurate prior 4 knowledge of the vadose zone (Fig. 7). Any inaccuracy in this prior knowledge results in a 5 significant loss of accuracy in the estimated time-lapse changes at the target zone when using 6 the conventional approach (Figs. 8c, 8e). On the contrary, the extremely high sensitivity of 7 the new approach to the inter-well structures allows high-resolution estimation of the velocity 8 changes, which is nearly independent of the presence of any source location error and/or 9 inaccuracy in the prior information of the vadose zone (Figs. 8b, 8d, Supplementary Fig. 8). 10 11

12
In this article, we present a novel principle to localize the sensitivity of scattered 13 waves to medium heterogeneities, which otherwise remain hidden in case of using the 14 conventional principle for sensitivity estimation. Earlier studies on Green's function 15 retrieval 12,41-47 , which is found useful in a variety of disciplines, e.g., medical diagnostics 44 , 16 seismology 12 , exploration geophysics 42 , and material testing 43 , have tackled a similar problem 17 from a different point of view. In those studies, the inter-receiver Green's function (impulse 18 response) is estimated by effectively removing wave propagation paths from a source point 19 using correlation or convolution 45 . Although there is a good similarity between Green's 20 function retrieval and the concept presented here, there are also notable differences. First, in 21 contrast with Green's function retrieval using crosscorrelation 45 , our boundary integral 22 representation can be applied to lossy media. This is a major difference. Furthermore, our 23 primary purpose is to directly obtain the sensitivity using the adjoint of the boundary integral 24 representation and, therefore, the associated Green's functions are by-products. This enables 25 us to tackle the problem from a completely different point of view. We have used the 26 Dirichlet boundary condition in the Green's functions in a rather unconventional manner (see 1 "Methods"). This makes it possible to relax the critical assumption of one-way wavefield 2 propagation in the Green's function retrieval using convolution 46,47 and that of multi-3 component measurements or single-component measurements with approximations 45,48 . The 4 assumption of one-way wavefield and/or the approximations due to the conventional 5 boundary condition are otherwise necessary when the primary purpose is to retrieve Green's 6 functions, which will require further processing. 7 We have formulated the new principle as a 2D problem. We have shown in this article 8 that the assumption of 2D wave propagation is effective for field data. Also, the geometry of 9 the reference receiver array and the observation point can be arbitrary. In this regard, a 10 similar concept, but using a conventional integral representation for Green's function 11 retrieval, was proposed earlier for reflected waves where the reference receiver array and the 12 observation points are co-located in a horizontal borehole 49 . Also, this newly found principle 13 can be applied to seismological monitoring using surface-waves and 2D seismometer arrays 14 because single-mode surface waves in 3D elastic media can be represented by 2D wave 15 propagation at the surface 50 . The independence of the estimated localized sensitivity from 16 source locations and heterogeneities around the source points is attractive for ambient noise 17 tomography, where the limitation due to uneven distribution of noise sources and due to 18 heterogeneities outside the target area is especially detrimental to imaging and monitoring 51 . 19 The novel principle can also be extended to 3D wave propagation. In that case, one needs to 20 measure waves at a reference receiver array located over a 2D surface. 21 The novel principle provides a unique opportunity in case the wave source does not 22 illuminate a medium from a location which is close to the target area, but multiple 23 observation points are used to enhance the localization of the sensitivity without the need to 24 know precisely the structures outside the target area. We have illustrated that this principle is 25 especially useful in monitoring, where the subsurface is illuminated by distant sources and 1 the response is observed by embedded sensors. In other disciplines, this may necessitate a 2 new data-acquisition design. In this regard, the development of fiber optic sensing has lately 3 demonstrated that existing telecommunication networks can turn into spatially dense, 4 subsurface acoustic sensors without a need of additional sensor installation 13,52 . The novel 5 principle can, therefore, be powerful in future seismic monitoring in areas with difficult 6 access, e.g., in urban or underwater environments. We anticipate that the novel principle will 7 open up possibilities for new experiments and measurement techniques where accurate and 8 efficient monitoring is of high importance but the conventional approaches using scattered 9 waves are hindered by the insensitivity to the target area due to limitations in data-acquisition 10 geometry or a poor knowledge about changes occurring outside the target zone. 11 12

Boundary integral representation 14
The following boundary integral representation is used to calculate the waveform at 15 the observation point at P due to the source point at S using interferences of the observed 16 waveforms at the reference receiver array: 17

,
(1) 18 where all properties are in the space-frequency domain,  is the angular frequency, j the 19 imaginary unit,  the density, ds the line element, and ni the outward pointing normal vector 20 on the arbitrary integral path D, which is the location of a reference receiver array. Equation 21 (1) indicates that the multiplication of the observed waveform, p(x, S) where x  D, at the 22 receiver in the reference array and the spatial derivative of the Green's function,iG(x, P), in 23 the i direction due to a point source at P, and collecting its contributions from all receivers 24 calculate the observed waveform at P. We derive equation (1) from the general wavefield 1 representation 53 by defining the Green's function such that the velocity structure is same as 2 that of the observed data, but with the Dirichlet (sound-soft) boundary condition at D. This 3 additional boundary condition correctly handles outward propagating waves at the reference 4 receiver array by canceling non-physical wave arrivals while evaluating the integral. 5 Furthermore, it enables us to require only single component wavefield to measure (e.g., 6 pressure field instead of pressure and particle velocity fields), or an approximation due to the 7 single component measurements is not necessary. This contrasts from other similar 8 techniques of wavefield retrieval 46,47 . Furthermore, the boundary condition enables us to use 9 model information only inside the reference receiver array because waves in impulse 10 responses (Green's function) do not radiate outward from the boundary. 11 The array shape in equation (1) is arbitrary. We consider a special case of a vertical 12 line (Fig. 1a). Suppose that a source S is located to the right of the reference receiver array, 13 and the observation point P is located to the left of the reference receiver array. In this 14 configuration, equation (1) can be written as 15 where pobs is the observed waveform at the reference receiver array, G is the Green's function 17 with the Dirichlet boundary condition at the horizontal location of the reference receiver 18 array (Fig. 1a), and we used the relation (n1, n2) = (1, 0). 19 20 The localized sensitivity using the adjoint of the boundary integral representation 21 A sensitivity of the scattered wave is defined as the change of a selected feature due 22 to model perturbation. In this study, we consider the difference between calculated and 23 observed waveform as, 24 where variables with subscript S indicate that they depend on the source location, those with 11 subscript P indicate that they depend on the receiver location or the observation point P, and 12 the frequency dependence of all variables is omitted for brevity. In equation (4), a column 13 vector gP is a solution to the wave equation 31 : (7) 9 The real part is taken in the right-hand side term of above equation because E(m) is real 30 . In where * denotes the complex conjugation. These equations provide an algorithm to calculate 15 the localized sensitivity. In order to interpret physically the adjoint-state equations, we 16 rearrange them such that the sensitivity is a crosscorrelation of two wavefields b and f, where the term ¶A ¶m compensates for the scattering radiation pattern due to different 1 parameterization, and b is a row vector representing the backward propagating wavefield 2 defined as, 3 The backward propagating wavefield b is a solution to the conjugate (time-reversed) wave 5 equation where the source term represents the scattered waves (i.e., difference between 6 calculated and observed waveforms) crosscorrelated with the observed waves at the reference 7 receiver array. Note that the modeling operator A in equation (12) is the same as in equation 8 (5) where the Dirichlet boundary condition at D is considered. 9 10

Field experiment 11
The test site is made of sedimentary layers. Two instrumented vertical boreholes with 12 50 m horizontal separation are available. Hydrophone strings, installed in 28 m -170 m 13 depth range with 2 m separation between two adjacent hydrophones, are used to measure the 14 pressure wavefield due to a surface source, simultaneously in the two boreholes. We use a 15 small amount (6 g) of explosives for surface sources. The measurement-depth interval is split 16 into four sections. The receiver arrays (hydrophone strings) are installed simultaneously at 17 one of the sections in each borehole; they measure the pressure wavefield due to the surface 18 source. In order to cover the measurement-depth interval, we repeat this procedure four times 19 at the fixed source location changing the depth of the receiver arrays. The total recording 20 length is 0.4 s with a sampling interval of 0.25 ms. 21

Waveform inversion 23
We use the quasi-Newton l-BFGS method 54,55 in estimating the velocity model by 24 waveform inversion. The model parameter is iteratively updated using the following formula: 25 where Qk is the approximate Hessian inverse computed using previous values of the gradient, 2 and k is the step length in the line-search in the descent direction. The datasets generated during and/or analysed during the current study are available from the 7 corresponding author on reasonable request.   representation. Scattered waves are observed at a reference receiver array (pressure sensors). 4 These waves are used in the representation to calculate the response at the observation point