Figure 1a illustrates the schematic concept of the dual-microcomb based coherently parallel optic-acoustic mapper. The probe soliton comb co-drives multiple FOMs via wavelength division multiplexing. Each FOM, an optical Fabry-Perot cavity, is sensitive to sound pressure: acoustic modulation shifts its resonant wavelength, altering the back-reflected intensity of the fixed frequency comb line injected into it. The detailed acoustic sensing mechanism of the FOMs is described in Supplementary Note S1. The system captures each probe comb line's backward reflection, which carries the acoustic information. The reflected comb lines beat one by one with the reference comb lines at a photodetector, creating a radio-frequency comb through heterodyning. Finally, the responses of all FOMs are gathered collectively in the time domain and analyzed independently in the frequency domain after a fast Fourier transform. When the FOMs are positioned stereoscopically, the system achieves in-situ acoustic orientation and localization.
Figure 1b shows the structural design of the hybrid-integrated system. First, two miniature distributed feedback (DFB) lasers generate Comb #1 (probe) and Comb #2 (reference) in two silicon nitride rings, whose Q factors are at the 4×10^6 level. The comb lines of Comb #1 are demultiplexed after on-chip filtering and transferred into fibers, and each demultiplexed line drives one of the F-P cavity-based FOMs. The reflected comb lines from the FOMs, carrying the acoustic information, return via fiber circulators and are collected back on the chip. After multiplexing, they beat with the reference comb on a photodetector. Finally, the data are processed in a field programmable gate array (FPGA) core, which also serves as the controller for stabilizing Comb #1 and Comb #2. Essentially, the system integrates an on-chip optoelectronic hybrid signal generation/processing center with networked peripheral sensors, mimicking a "brain-neural" structure. Figure 1c showcases images of our system and its internal devices. In Fig. 1c(i), we show that the optic-acoustic mapper can be packaged into a portable and reliable 30×20×15 cm^3 plug-and-play device. Fig. 1c(ii-v) provide more detailed pictures of the crucial devices: the microcomb generator (ii); the on-chip wavelength multiplexer, demultiplexer, and photodetector (iii); the chip-fiber coupler (iv); and the FOM (v). Device fabrication and characterization are described in Extended Data Fig. 1 and Supplementary Note S2.
Fig. 1d illustrates how the FOMs' resonances correspond to the comb teeth. The grey curves represent the resonances of 12 FOMs, while the blue curves denote the spectra of 12 comb lines. Since the frequencies of all comb lines are fixed, we must precisely design the FOM microcavities with diverse resonances to match them, which we accomplish by meticulously adjusting the cavity length of each FOM. Typically, the free spectral range of each FOM is approximately 1.457 THz. Each comb tooth sits at the center of the quasi-linear interval of its cavity resonance, ensuring a quasi-linear response across a broad dynamic range. Thus, in acoustic detection, each comb line independently drives a single FOM, and when acoustic pressure alters the resonances, we measure the changes in the reflected comb intensities.
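The quasi-linear readout described above can be sketched with a toy model (all numbers below are illustrative, not the device's measured parameters): a comb line parked at the steepest point of a Lorentzian reflection dip converts a small acoustically induced resonance shift into a near-linear intensity change.

```python
def fom_reflection(f, f0, fwhm):
    """Normalized Lorentzian reflection dip of a Fabry-Perot FOM."""
    return 1.0 - 1.0 / (1.0 + ((f - f0) / (fwhm / 2.0)) ** 2)

fwhm = 1e9            # illustrative dip width (Hz), not a measured value
f_line = 0.5 * fwhm   # comb line parked at the half-depth (steepest) point
shift = 1e6           # illustrative acoustically induced resonance shift

r0 = fom_reflection(f_line, 0.0, fwhm)    # quiescent reflection = 0.5
r1 = fom_reflection(f_line, shift, fwhm)  # resonance moves, line stays fixed
print(r0, r1 - r0)    # the small reflection change encodes the acoustic pressure
```

The fixed comb line thus acts as a static probe of a moving resonance, which is why no laser tuning is needed during detection.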
Prior to acoustic detection, we examine the operation of the dual microcombs and their parameter settings. Fig. 2a illustrates the optical spectra of Comb #1 and Comb #2. Both soliton combs display a sech^2-shaped envelope, corroborating their soliton state. With precise temperature control, the repetition frequency of Comb #1 is set at 106.055 GHz, while that of Comb #2 is at 106.066 GHz. The difference in repetition frequencies (Δfrep) is 11.5 MHz, approximately three orders of magnitude higher than acoustic frequencies in the kilohertz band. To serve the multiple sensor probes, we select 12 comb lines from each comb using the on-chip demultiplexer (DMUX) and equalize them to a near-uniform power level. The figure also presents an enlarged view of the spectra: the frequency offset between the first probe comb line (from Comb #1) and the first reference comb line (from Comb #2) is 98.5 MHz.
We depict the heterodyne beat notes of the two combs (12 line pairs) in Fig. 2b. The beating of the dual combs forms an electrical comb with a frequency interval of 11.5 MHz, with the first beat note at 98.5 MHz. All 12 beat notes reside within the 0 ~ 250 MHz band, easily detected by a slow on-chip photodetector. To achieve high sensing accuracy, we then synchronize Comb #1 and Comb #2 via feedback stabilization loops. We also show a detailed comparison of the first beat note in the radio-frequency spectrum before and after stabilization. In a free-running soliton, inherent pump frequency drift and the noise superposition introduced by the optical frequency division effect contribute to the uncertainty of every comb line [37]; these arise from instabilities of the carrier-envelope offset frequency and the repetition frequency. Consequently, the linewidth of the 1st line-to-line beat signal (free-running) extends to 560 Hz. After full stabilization, the linewidth of the 1st dual-comb beat note narrows to 18 Hz, an improvement of more than 30-fold. Moreover, following stabilization, the SNR of the beat note exceeds 60 dB. This performance improvement boosts the spectral resolution and the detection limit in the sensing applications that follow.
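The layout of the electrical comb follows directly from the two measured numbers quoted above (first beat at 98.5 MHz, repetition-rate difference 11.5 MHz); a minimal sketch:

```python
# Values taken from the text: first beat note and repetition-rate difference.
f1_mhz, dfrep_mhz, n_lines = 98.5, 11.5, 12

# The n-th probe/reference line pair beats at f1 + (n-1) * dfrep.
beats = [f1_mhz + n * dfrep_mhz for n in range(n_lines)]
print(beats[0], beats[-1])  # 98.5 ... 225.0 MHz, all within the 0~250 MHz band
```

This confirms that all 12 acoustic channels fit inside the bandwidth of a slow photodetector, which is what enables single-detector parallel readout.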
The top panel of Fig. 2c depicts the measured single-sideband phase noise (SSB-PN) of the dual-comb beat note at 98.5 MHz. The blue curve represents the SSB-PN of the first beat note in its free-running state, while the red curve shows the SSB-PN of the stabilized beat signal (via optical feedback loops). After stabilization, the first beat note's SSB-PN reaches -72.3 dBc/Hz at a 10 Hz offset, -93.5 dBc/Hz at a 10 kHz offset, and -108.6 dBc/Hz at a 1 MHz offset. For acoustic detection and analysis, suppressing noise within the 10 ~ 100 kHz band is crucial, so such stabilization is essential. The feedback locking also yields an SNR increase greater than 130 dB at a 10 Hz offset. After locking, we estimate the total timing jitter of each comb to be smaller than 5×10^-16. The bottom panel of Fig. 2c displays the measured relative intensity noises (RINs) of the dual-comb beat signals. Before locking, the RIN fluctuates from -80 dBc/Hz at a 10 Hz offset to -119 dBc/Hz at a 1 MHz offset, with a total RIN of -54.7 dB. Once locked, the RIN is reduced to -113.8 dBc/Hz at a 10 Hz offset and -153.8 dBc/Hz at a 1 MHz offset, lowering the total RIN to -86.7 dB. In our experiment, both dual-comb generation and stabilization are regulated automatically by the integrated FPGA module, enabling plug-and-play functionality. Fig. 2d shows the Allan deviation of the first dual-comb beat note (carrier 98.5 MHz). In free-running operation, the beat note shows an uncertainty of about 30 kHz at 0.1 s averaging. After locking, at averaging times of 0.001 s, 0.01 s and 0.1 s, the frequency uncertainty approaches 35.4 μHz, 3.56 μHz, and 0.352 μHz, respectively.
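For readers unfamiliar with the metric, a minimal overlapping Allan deviation estimator (a generic sketch, not the authors' code) shows how such stability numbers are computed from a fractional-frequency record; synthetic white frequency noise exhibits the textbook 1/√τ slope, whereas the locked beat in Fig. 2d falls off roughly as 1/τ, characteristic of white-phase-noise-limited operation.

```python
import numpy as np

def allan_dev(freq_series, m):
    """Overlapping Allan deviation of a fractional-frequency series,
    for an averaging window of m samples."""
    f = np.asarray(freq_series, dtype=float)
    avg = np.convolve(f, np.ones(m) / m, mode="valid")  # m-sample means
    d = avg[m:] - avg[:-m]        # differences of adjacent averaging windows
    return np.sqrt(0.5 * np.mean(d ** 2))

# White frequency noise: sigma(tau) falls as 1/sqrt(tau).
rng = np.random.default_rng(1)
f = rng.standard_normal(200_000)
print(allan_dev(f, 1), allan_dev(f, 100))  # ~1.0 and ~0.1
```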
Accordingly, for a typical sampling rate of 10 Hz, we estimate that the optical linewidth of our probe comb (the first line) is at the 396 kHz level before feedback locking and at the 3.2 Hz level after. We measure the frequency noises of the comb lines in Extended Data Fig. 2; more technical details are given in Supplementary Note S2.
In Fig. 3, we show that the stabilized soliton microcomb improves the detection limit of each FOM. We first clarify the underlying mechanism. As the schematic in Fig. 3a demonstrates, the SNR of a FOM depends on both the frequency and intensity noise of its light source. Acoustic modulation reduces the FOM's reflection, so the change in reflected light power corresponds to the spectral integral of the marked red region. This process involves several noise sources, such as the power and frequency fluctuations (ΔP, Δf) of the laser line and the uncertainty of the FOM response, which collectively shape the detected SNR. A detailed theoretical analysis is given in Supplementary Note S1. Fig. 3b maps the calculated parametric space of the SNR: reducing the RIN and the frequency fluctuation of the light source enhances the SNR. For direct comparison, three distinct light sources are employed to drive the same FOM: (i) a single-frequency laser source (NKT-E15); (ii) a free-running comb line; (iii) a stabilized comb line. The parameters of these three light sources are indicated with grey, blue, and red dots, respectively.
In Fig. 3c, we present the measured acoustic response spectra of the above methods. In the experiment, a sinusoidal acoustic signal with pressure PA = 37 mPa and frequency fA = 5 kHz was applied. The measured SNRs of methods i, ii, and iii in air are 79.2 dB, 64.1 dB and 90.3 dB, respectively, with a resolution bandwidth (BW) of 2 Hz. Subsequently, as shown in Fig. 3d, the MDPs for methods i, ii, and iii are computed as 2.86 μPa/Hz^1/2, 16.33 μPa/Hz^1/2 and 0.79 μPa/Hz^1/2, respectively, based on the transformation MDP = [PA^2/(BW×SNR)]^1/2, with SNR in linear scale [16,38]. The MDP of our comb-driven FOM scheme thus reaches the sub-μPa/Hz^1/2 level, indicating a sensitivity 200 times greater than the scheme employing an incoherent light source. Within this figure, we also compare against a commercial electrical microphone (APT-15A) as a benchmark, whose MDP exceeds 62.1 μPa/Hz^1/2.
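The MDP transformation can be checked numerically; the sketch below reproduces the quoted values from PA = 37 mPa, BW = 2 Hz, and the three measured SNRs (the SNR must be converted from dB to linear units before applying the equation):

```python
import math

def mdp(pa_pascal, snr_db, bw_hz):
    """Minimum detectable pressure (Pa/Hz^0.5) from a tone of pressure
    pa_pascal detected with snr_db in a resolution bandwidth bw_hz,
    following MDP = [PA^2 / (BW * SNR)]^0.5 with SNR in linear units."""
    snr_lin = 10.0 ** (snr_db / 10.0)
    return math.sqrt(pa_pascal ** 2 / (bw_hz * snr_lin))

# Measured values from the text: PA = 37 mPa, BW = 2 Hz.
for snr in (79.2, 64.1, 90.3):
    print(f"{mdp(0.037, snr, 2.0) * 1e6:.2f} uPa/Hz^0.5")
```

Running this yields approximately 2.87, 16.32, and 0.80 μPa/Hz^1/2, matching the reported values to within rounding.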
When utilizing locked comb lines to drive multiple FOMs, we can also confirm that this method consistently enhances the sensitivity of every FOM. Fig. 3e displays the scenario in which 12 FOMs are individually driven by 12 comb lines filtered out from the probe soliton comb (Fig. 2). As previously noted, owing to the dual-comb heterodyne, their responses can be received and demodulated simultaneously in parallel. Fig. 3f presents the detected trace of the dual-comb beating with a 5 kHz acoustic modulation, while Fig. 3g reveals the acoustically modulated spectra of FOM #1 (driven by the 1st probe comb line) and FOM #12 (driven by the 12th probe comb line); both show a high SNR. Finally, we present the sensitivities of the 12 FOMs in Fig. 3h. Experimental results indicate that each FOM, driven by its own soliton comb line, exhibits a similar MDP of around 0.8 μPa/Hz^1/2. More characterizations of the FOMs are shown in Extended Data Fig. 3 and Supplementary Note S2.
We now demonstrate that our on-chip dual-comb based FOM array can achieve high-precision passive acoustic target localization. The concept is illustrated in the top panel of Fig. 4a. Each FOM has an independent coordinate Mi (xi, yi, zi), where i = 1, 2, 3 or 4. When an object at coordinate (x, y, z) emits sound waves, the distinct distances (e.g. D1, D2, D3 and D4) between each FOM and the target produce pairwise arrival-time delays for each FOM pair: vA×tM2,M1 = D2 – D1; vA×tM3,M1 = D3 – D1; vA×tM4,M1 = D4 – D1; vA×tM3,M2 = D3 – D2; vA×tM4,M2 = D4 – D2; vA×tM4,M3 = D4 – D3. Here vA = 340 m/s is the acoustic velocity and tMi,Mj is the arrival-time difference between Mi and Mj (i ≠ j). Meanwhile, D1, D2, D3 and D4 can be expressed in the coordinate system as Di = [(xi-x)^2 + (yi-y)^2 + (zi-z)^2]^1/2, e.g. D1 = [(x1-x)^2 + (y1-y)^2 + (z1-z)^2]^1/2. The spatial position of the target can thus be determined. In principle, a minimum of four acoustic detectors is required to locate a target in three dimensions; employing additional detectors at different locations adds redundancy and enhances the positioning accuracy. In this measurement, we use 8 FOMs (M1 ~ M8), enabled by dual-comb technology, to localize an indoor sound source. Table 1 lists the locations of the FOMs (M1 ~ M8) and the target (T). The bottom panel of Fig. 4a illustrates this design.
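Under these definitions, the forward model from geometry to pairwise delay is straightforward; a sketch using two of the Table 1 coordinates (Python purely for illustration):

```python
import math

V_A = 340.0  # speed of sound (m/s), as quoted in the text

def delay(mi, mj, target):
    """Pairwise arrival-time difference t_{Mi,Mj} = (Di - Dj) / vA."""
    di = math.dist(mi, target)   # Euclidean distance Mi -> target
    dj = math.dist(mj, target)
    return (di - dj) / V_A

# Coordinates from Table 1 (metres).
m1, m2 = (0, 0, 0), (3, 0, 0)
t = (2.2, 1.4, 0.8)
print(delay(m2, m1, t))  # negative: the sound reaches M2 before M1
```

Inverting this map from the measured delays back to (x, y, z) is the localization step described below.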
Table 1. Locations of the FOMs and the acoustic target.
Location | x (m) | y (m) | z (m)
M1 | 0 | 0 | 0
M2 | 3 | 0 | 0
M3 | 3 | 3 | 0
M4 | 0 | 3 | 0
M5 | 0 | 0 | 3
M6 | 3 | 0 | 3
M7 | 3 | 3 | 3
M8 | 0 | 3 | 3
T | 2.2 | 1.4 | 0.8
As a result, the ground-truth distances between the target and the detectors are: D1 = 2.728 m, D2 = 1.8 m, D3 = 1.96 m, D4 = 2.835 m, D5 = 3.412 m, D6 = 2.728 m, D7 = 2.835 m, D8 = 3.499 m. While playing classical piano music from the sound source, Fig. 4b presents the acoustic traces detected by the different FOMs; noticeable temporal misalignments appear among the waveforms. We then determine the delay differences of these detected acoustic traces across all pairwise FOMs via the cross-correlation R(τ) = ∫_0^L Mi(t)Mj(t+τ)dt [39], where L is the temporal length of the sampled trace and i ≠ j. The value of τ that maximizes R(τ) gives the delay difference. Fig. 4c illustrates the cross-correlations between Mi and Mj (i ≠ j), calculated from the measured outcomes of our FOMs. The delay differences for all pairs of Mi & Mj are provided in Supplementary Note S2.
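The correlation-peak delay estimate can be sketched as follows (a generic implementation, not the authors' FPGA code), verified here on a synthetic 50-sample delay:

```python
import numpy as np

def tdoa(si, sj, fs):
    """Delay of trace sj relative to si (seconds), from the peak of the
    cross-correlation R(tau) = sum_t si(t) * sj(t + tau)."""
    r = np.correlate(sj, si, mode="full")
    lag = np.argmax(r) - (len(si) - 1)   # convert array index to lag
    return lag / fs

# Synthetic check: the same decaying tone burst, delayed by 50 samples.
fs = 48_000
t = np.arange(2048) / fs
s = np.sin(2 * np.pi * 440 * t) * np.exp(-t * 200)
si = np.concatenate([s, np.zeros(100)])
sj = np.concatenate([np.zeros(50), s, np.zeros(50)])
print(tdoa(si, sj, fs) * fs)  # → 50.0 samples
```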
As demonstrated in Fig. 4d, we can then retrieve the distances D1 to D8. The measured values are: D1 = 2.723 m, D2 = 1.8 m, D3 = 1.96 m, D4 = 2.831 m, D5 = 3.41 m, D6 = 2.731 m, D7 = 2.829 m, D8 = 3.493 m. By solving the equations Di = [(xi-x)^2 + (yi-y)^2 + (zi-z)^2]^1/2, we determine the spatial location of the target: T (2.19 m, 1.42 m, 0.8 m), in good agreement with the actual position. Fig. 4e compares the measured and actual coordinates, showing that the average error in the x, y, or z direction is less than 1.5%. Fig. 4f shows the outcomes of 100 repeated measurements, where the root-mean-square (RMS) errors for x, y and z reach 0.62 cm, 0.56 cm and 0.6 cm, respectively. All calculations can be performed on the integrated FPGA (see Extended Data Fig. 4).
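One standard way to solve the distance equations (the exact solver used on the FPGA is not specified here) is to linearize them against the first FOM, which turns the problem into ordinary least squares; a sketch using the Table 1 coordinates and the retrieved distances:

```python
import numpy as np

# FOM coordinates from Table 1 and the retrieved distances from Fig. 4d.
M = np.array([[0, 0, 0], [3, 0, 0], [3, 3, 0], [0, 3, 0],
              [0, 0, 3], [3, 0, 3], [3, 3, 3], [0, 3, 3]], float)
D = np.array([2.723, 1.8, 1.96, 2.831, 3.41, 2.731, 2.829, 3.493])

# Linearize D_i^2 = |M_i - x|^2 against the first FOM:
#   2 (M_i - M_1) . x = |M_i|^2 - |M_1|^2 - D_i^2 + D_1^2
A = 2.0 * (M[1:] - M[0])
b = np.sum(M[1:] ** 2, axis=1) - np.sum(M[0] ** 2) - D[1:] ** 2 + D[0] ** 2
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 2))  # close to the reported T ≈ (2.19, 1.42, 0.80)
```

With eight detectors the system is overdetermined, which is how the extra FOMs translate into the sub-centimetre RMS errors quoted above.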
In addition to indoor acoustic localization, the on-chip dual-comb based parallel optic-acoustic mapper can also detect and localize outdoor acoustic targets such as drones and gunshots. These targets are important in both military and civilian applications, yet they typically have strong visual concealment and are hard to measure with traditional active techniques such as radar or optical ranging. For example, Fig. 5a demonstrates the experimental scenario in which we use this tool to localize a small unmanned aerial vehicle (UAV, DJI Air 3), whose typical sound pressure during flight is 8 mPa. We characterize the acoustic features of the UAV in Extended Data Fig. 5. In the experiment, we mount 4 FOMs on a tetrahedral framework with a side length of 40 cm, which enables a maximum framing rate of 850 Hz. Given an environmental noise of ≈320 μPa, the maximum detection radius for this drone can reach 200 m in principle, as sub-kHz acoustic attenuation in open air is < 0.15 dB/m. Fig. 5b shows the detailed FOM distribution on the tetrahedral framework; their 3D coordinates are listed in Table 2.
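The quoted framing rate and detection radius are consistent with two back-of-envelope bounds (our assumptions, not the authors' stated derivation: one position fix per array-crossing time, and attenuation-dominated loss with spreading ignored):

```python
import math

SIDE, V_A = 0.40, 340.0  # tetrahedron side (m), speed of sound (m/s)

# One position fix requires the wavefront to traverse the whole array,
# so the maximum framing rate is bounded by v_A / SIDE.
frame_rate = V_A / SIDE
print(frame_rate)  # → 850.0 Hz

# Range headroom: source pressure (8 mPa) over the ~320 uPa noise floor,
# consumed by the quoted 0.15 dB/m open-air attenuation.
headroom_db = 20.0 * math.log10(8e-3 / 320e-6)
d_max = headroom_db / 0.15
print(round(d_max))  # ≈ 186 m, the same order as the quoted ~200 m limit
```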
Table 2. Coordinates of each FOM on a regular tetrahedron.
FOM location | x (m) | y (m) | z (m)
M1 | -0.2 | -0.1155 | 0
M2 | 0.2 | -0.1155 | 0
M3 | 0 | 0.2301 | 0
M4 | 0 | 0 | 0.3266
Fig. 5c plots the acoustic waveform and spectrum of the UAV measured by our FOMs. The UAV exhibits multiple characteristic frequencies. In practice, to further improve the detection SNR, one can apply an electrical filter or a recognition algorithm during signal processing. Fig. 5d shows the localization of the statically hovering UAV, whose spatial coordinate is (x, y, z) = (22.33 m, 14.71 m, 9.64 m); the sensor-to-target distance is therefore d = (x^2+y^2+z^2)^1/2 = 28.424 m. By repeatedly measuring its location 100 times, we record the measured points in the top-view and side-view maps. Statistically, the standard deviations of the positioning are σx = 7.9 cm, σy = 7.7 cm, and σz = 7.2 cm. When the UAV hovers stably, such localization errors can be reduced via continuous measurement. Fig. 5e shows the Allan deviation of d, with the framing rate set to 500 Hz. For a single-shot measurement (2 ms), σd is 4.89 cm; at an averaging time of 1 s, σd reaches 0.31 cm, already much smaller than the size of the target. In Fig. 5f, we show that our on-chip dual-comb based parallel optic-acoustic mapper can also track the dynamic movement of the UAV. When the UAV flies linearly from point M to point N, we localize it in real time: the initial position is M (-8.7 m, 2.1 m, 13.4 m) and the terminal is N (12.5 m, 10.1 m, 2.1 m). Using our acoustic localizer, we record the 3-dimensional coordinate changes, as the blue, red, and yellow dots show. During this flight, we verify that the sensor-to-target distance d decreases from 16.1 m to 9.8 m and then increases back to 16.2 m. Finally, we verify the maximum measurement range for this UAV: when increasing d from 1.22 m to 265 m, we find that the SNR of our FOMs approaches 0 dB when d > 194 m, in good agreement with the theoretical limit. The on-line field test is shown in the Supplementary Movie.
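The quoted sensor-to-target distance follows directly from the fix coordinates:

```python
import math

# Position fix from Fig. 5d; d is the sensor-to-target distance.
x, y, z = 22.33, 14.71, 9.64
d = math.sqrt(x**2 + y**2 + z**2)
print(round(d, 3))  # → 28.424 m, matching the value quoted in the text
```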