Data collection
Patients' BSs were collected using a self-developed wearable bowel sound device. The device uses a Knowles SiSonic MEMS microphone (SPU1410LR5H-QB), which has an ultra-wide band (UWB) flat frequency response (±2 dB, 10-10000 Hz) and a tightly matched sensitivity of ±3 dB. Since the frequency of BSs is mainly distributed within the 100-1000 Hz band, this microphone is well suited to picking up BSs. The bowel sound and the ambient noise acquired by the microphones are first filtered and amplified by a second-order active low-pass filter with a cut-off frequency of 2000 Hz and a gain of 2. A further amplification stage with a gain of 30 follows the filter. The amplified analog signal is then digitized by the 12-bit analog-to-digital converter of the STM32L151. The BSs were sampled at 8 kHz, and the converted data were stored on a Micro SD card.
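The arithmetic of this acquisition chain can be checked with a few lines. The sketch below only restates the values given above; the reference voltage passed to `lsb_volts` is a hypothetical parameter, since the text does not state the ADC reference of the device.

```python
# Values stated in the text above.
FILTER_GAIN = 2          # gain of the second-order active low-pass filter
AMP_GAIN = 30            # gain of the following amplifier stage
CUTOFF_HZ = 2000         # low-pass cut-off frequency
ADC_BITS = 12            # resolution of the STM32L151 converter
SAMPLE_RATE_HZ = 8000    # sampling rate of the BS recording


def total_gain():
    """Overall analog gain of the two cascaded stages."""
    return FILTER_GAIN * AMP_GAIN


def nyquist_hz():
    """Highest frequency representable at the stated sampling rate."""
    return SAMPLE_RATE_HZ / 2


def lsb_volts(v_ref):
    """Size of one ADC step for a hypothetical reference voltage v_ref."""
    return v_ref / (2 ** ADC_BITS)


print(total_gain())               # 60
print(nyquist_hz())               # 4000.0
print(CUTOFF_HZ < nyquist_hz())   # True
```

The 2000 Hz cut-off lies below the 4000 Hz Nyquist frequency, so the low-pass filter also acts as the anti-aliasing stage for the 8 kHz sampling.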
Signal processing
During BS acquisition, ambient noise is easily introduced and directly degrades the quality of the BS signal. It is therefore necessary to remove the ambient noise in order to better analyze and identify the BSs. We used the noise acquisition channel of the recorder to collect the ambient noise and removed it by adaptive noise cancellation. Specifically, the least mean square (LMS) [15] algorithm was adopted because it is more robust than the recursive least squares (RLS) algorithm [16]. The filter order was set to 32 and the step size factor to 0.000001 to achieve good adaptive cancellation.
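The adaptive cancellation step can be illustrated with a minimal LMS sketch. It assumes the common update rule w <- w + mu * e * x (some texts use 2 * mu), a filter that starts from zero weights, and two equally long sample lists for the BS channel and the noise channel; the function name `lms_cancel` is hypothetical, and only the order (32) and step size (0.000001) come from the text.

```python
def lms_cancel(primary, reference, order=32, mu=1e-6):
    """Adaptive noise cancellation with an LMS filter.

    primary   : samples from the BS channel (bowel sound + noise)
    reference : samples from the ambient-noise channel
    Returns the error signal e[n] = primary[n] - y[n], which is the
    noise-cancelled output.
    """
    weights = [0.0] * order
    taps = [0.0] * order            # recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        taps = [x] + taps[:-1]      # shift the new reference sample in
        y = sum(w * t for w, t in zip(weights, taps))   # noise estimate
        e = d - y                   # cancelled output
        # Assumed update convention: w <- w + mu * e * x
        weights = [w + mu * e * t for w, t in zip(weights, taps)]
        cleaned.append(e)
    return cleaned
```

A suitable step size depends on the amplitude scale of the samples, so a value tuned for one scaling of the ADC data will generally not transfer unchanged to another.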
Adaptive filtering can eliminate the environmental noise, but the high-frequency noise remaining in the signal still affects the identification and analysis of effective bowel sounds (EBSs). Wavelet denoising is an effective and practical method that has achieved good results in signal and image denoising and has been widely used in engineering applications, including the enhancement of bowel sounds [17]. Donoho [18, 19] proposed a wavelet threshold denoising method. After wavelet transformation using the Mallat algorithm, the wavelet coefficients of the signal carry the important information, and the wavelet coefficients of the noise are smaller than those of the signal. Given a suitable threshold, the wavelet coefficients greater than the threshold are considered to be generated by BS signals and are retained, while those less than the threshold are considered to be generated by external noise and are set to zero, which achieves the denoising. Wavelet decomposition requires choosing the wavelet basis, the number of decomposition layers and the threshold. For the wavelet basis, we chose sym8 from the two common wavelet families, db and sym. For the number of decomposition layers, a value that is too large or too small degrades the final denoising effect; in this paper it was set to 5 after comparing the denoising effects of different numbers of layers. For the threshold, the Birge-Massart [20] algorithm was used to obtain the threshold of each layer of the one-dimensional wavelet transform, and the soft threshold function was used for denoising. Because there is no standard reference signal for bowel sounds, wavelet denoising was combined with adaptive filtering to keep the frequency response range below 1 kHz [21], which is the main frequency range of bowel sounds.
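The soft threshold function itself is simple to state in code. The sketch below covers only the thresholding step: it assumes the detail coefficients of each layer and the per-layer thresholds (for example, from the Birge-Massart strategy) have already been computed elsewhere, and the names `soft_threshold` and `threshold_layers` are hypothetical. The sym8 decomposition and the reconstruction are not shown.

```python
def soft_threshold(coeffs, thr):
    """Soft thresholding: zero small coefficients, shrink the rest by thr."""
    out = []
    for c in coeffs:
        shrunk = abs(c) - thr
        if shrunk <= 0:
            out.append(0.0)                 # treated as noise
        else:
            out.append(shrunk if c > 0 else -shrunk)
    return out


def threshold_layers(detail_layers, thresholds):
    """Apply one threshold per decomposition layer (inputs assumed given)."""
    return [soft_threshold(d, t) for d, t in zip(detail_layers, thresholds)]


print(soft_threshold([3.0, -3.0, 0.5, -0.5, 1.0], 1.0))
# [2.0, -2.0, 0.0, 0.0, 0.0]
```

Unlike hard thresholding, the retained coefficients are also reduced in magnitude by the threshold, which avoids a discontinuity at the threshold value.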
Fig. 1 The bowel sound signal after adaptive filtering and wavelet denoising. Notes. (a) The original signal obtained from the microphone after amplification without any cancellation. (b) The ambient noise collected from the other microphone. (c) The signal after the LMS adaptive filter. (d) The signal after wavelet denoising.
After adaptive filtering and wavelet denoising, the waveform (Fig. 1) can be used to identify EBSs. The fractal dimension (FD) quantitatively describes the complexity of a signal, and the FD of EBSs differs from that of background sounds [22]. To calculate the FD of a time series, we can either reconstruct the phase space first and then calculate the correlation dimension of the time series [23-25], or calculate the FD directly in the time domain. The time series in this paper was an audio signal with a high sampling rate and a large data volume, so the FD was calculated directly in the time domain. The Katz method [22, 26, 27] used for the FD calculation can effectively judge the randomness of waveforms. When calculating the FD of the BS signal, we employed a sliding window to realize short-time processing of the audio signal. The length of the sliding window was set to int(0.006*fs), where int indicates the integer part of the argument and fs is the sampling frequency of the BS signal. The constant 0.006 was set empirically and is justified by the efficient performance of the algorithm [22]. The FD of the data in each sliding window was calculated. To ensure that the FD sequence has the same length as the input data, the first and the last FD values were used to pad the sequence at both ends. After the FD sequence was calculated, its peaks were extracted with the FD-peak peeling algorithm (FD-PPA) [22] to ensure effective recognition of the BSs.
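A minimal sketch of the sliding-window Katz FD follows. It uses the standard Katz expression FD = log10(n) / (log10(n) + log10(d/L)), where n is the number of steps in the window, L is the total curve length and d is the largest distance from the first point. Several details are assumptions rather than statements from the text: distances are Euclidean with the sample index as the horizontal coordinate, the window advances one sample at a time, and the padding is split roughly evenly between the two ends. FD-PPA is not shown.

```python
import math


def katz_fd(window):
    """Katz fractal dimension of one window (needs at least 3 samples)."""
    n = len(window) - 1                      # number of steps in the curve
    if n < 2:
        raise ValueError("window must contain at least 3 samples")
    # Total curve length: sum of distances between successive points.
    length = sum(math.hypot(1.0, window[i + 1] - window[i]) for i in range(n))
    # Planar extent: largest distance from the first point.
    extent = max(math.hypot(float(i), window[i] - window[0])
                 for i in range(1, n + 1))
    return math.log10(n) / (math.log10(n) + math.log10(extent / length))


def sliding_fd(signal, fs=8000, frac=0.006):
    """FD sequence with the same length as the input signal."""
    w = int(frac * fs)                       # window length, int(0.006*fs)
    core = [katz_fd(signal[i:i + w]) for i in range(len(signal) - w + 1)]
    pad_left = (w - 1) // 2                  # assumed split of the padding
    pad_right = (w - 1) - pad_left
    return [core[0]] * pad_left + core + [core[-1]] * pad_right
```

With this definition a flat or straight-line window gives an FD of exactly 1, and more irregular windows give larger values, which is why a threshold slightly above 1 can separate active from quiet portions.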
FD-PPA makes the EBSs more obvious in the waveform, but voice activity detection (VAD), i.e. endpoint detection, is still needed to extract them. VAD, an important part of speech processing, aims to accurately identify the starting point and ending point of each EBS in a signal segment containing EBSs, and thereby to distinguish EBSs from non-BS signals. We identified as EBSs the sounds that satisfied certain conditions and treated all others as non-BS signals. In this paper, the time series after FD-PPA was used as the input sequence for judging the starting and ending points of EBSs. Three thresholds, set as rules of thumb, determined the endpoints: the threshold for entering a BS segment, the minimum duration of a BS segment, and the maximum mute length allowed within a BS segment. The threshold for entering a BS segment was set to 1.01; when the input value is greater than 1.01, that point is considered to be the starting point of an EBS. The minimum duration was set to 50 ms; a BS segment shorter than this is considered noise. The maximum mute length allowed within a BS segment was set to 250 ms; if the mute length within a BS segment is less than this value, the BS is considered unfinished, otherwise the BS segment is considered finished.
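The three-threshold rule can be sketched as a small state machine. This is one possible reading of the description, not the authors' implementation: a sample counts as active when its value exceeds the entry threshold, a segment runs from its first to its last active sample, and it is closed once the run of inactive samples reaches the maximum mute length. The function name `detect_segments` is hypothetical.

```python
def detect_segments(fd_seq, fs=8000, enter_thr=1.01,
                    min_dur_s=0.050, max_mute_s=0.250):
    """Return (start, end) sample indices of detected BS segments."""
    min_len = int(round(min_dur_s * fs))     # 50 ms in samples
    max_mute = int(round(max_mute_s * fs))   # 250 ms in samples
    segments = []
    start = None          # first active sample of the open segment
    last_active = None    # most recent active sample
    for i, value in enumerate(fd_seq):
        if value > enter_thr:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active >= max_mute:
            # Mute run reached the limit: the segment is finished.
            if last_active - start + 1 >= min_len:
                segments.append((start, last_active))
            start = None
    # Close a segment that is still open at the end of the sequence.
    if start is not None and last_active - start + 1 >= min_len:
        segments.append((start, last_active))
    return segments
```

Two bursts separated by less than 250 ms are therefore merged into one segment, and an isolated burst shorter than 50 ms is discarded as noise.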
After the VAD, many other kinds of sounds remain mixed in, such as heart sounds, breath sounds and background noises similar to BSs. Because of the limitations of environmental noise collection and the residue left by filtering, we set three empirical thresholds to remove three kinds of residual noise. Specifically, the envelope of each EBS was obtained by complex analytic wavelet transformation [28]. First, we excluded sound segments whose envelope maximum was less than 0.037 V, since a sound segment with too small an amplitude is considered noise. Second, confounding heart sounds were obvious in the measured data. We extracted the envelope of each sound segment and counted its peaks and, based on experience in judging heart sounds, ruled out the sound segments with fewer than 3 peaks. Third, for BS segments with a very small signal-to-noise ratio, residual noise was identified as a gut sound and also needed to be removed. Based on experience, we filtered out the sound segments whose envelope had more than 3 peaks within a length of 1000 sampling points.
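The three rejection rules can be sketched as a single predicate over the envelope of a segment. The sketch rests on several assumptions that the text does not settle: the envelope is already available as a list of values in volts, a peak is a simple local maximum, and the third rule is read as "no span of 1000 samples may contain more than 3 peaks". The names `peak_indices` and `keep_segment` are hypothetical.

```python
def peak_indices(env):
    """Indices of simple local maxima of the envelope (assumed peak rule)."""
    return [i for i in range(1, len(env) - 1)
            if env[i - 1] < env[i] >= env[i + 1]]


def keep_segment(env, min_amp=0.037, min_peaks=3, max_peaks=3, window=1000):
    """Return True when a segment passes all three empirical rules."""
    # Rule 1: envelope maximum below 0.037 V is treated as noise.
    if not env or max(env) < min_amp:
        return False
    peaks = peak_indices(env)
    # Rule 2: segments with fewer than 3 envelope peaks are ruled out.
    if len(peaks) < min_peaks:
        return False
    # Rule 3: more than 3 peaks within 1000 samples is residual noise.
    for j, first in enumerate(peaks):
        in_window = sum(1 for p in peaks[j:] if p - first < window)
        if in_window > max_peaks:
            return False
    return True
```

In practice the envelope would usually be smoothed before counting peaks, since a raw envelope contains many small local maxima that this simple rule would count.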
Table 5 The linear CVs of EBSs
Linear CVs | Calculation methods | Physiological significance
Num_bs | The number of identified effective bowel sounds during the measurement | Frequency of bowel sounds in the five minutes
Sum_bs | The sum of the absolute values of the identified effective bowel sounds | Reflecting the total energy of the bowel sounds
Sum_Duration_bs | The sum of the duration of the identified effective bowel sounds | Reflecting the total duration of bowel sounds
Mean_Duration | The mean of the duration of effective bowel sounds | Mean duration of effective bowel sounds
Std_Duration | The standard deviation of the duration of effective bowel sounds | Standard deviation of duration of effective bowel sounds
Mean_Mag_bs | The mean of the mean absolute value of effective bowel sounds | The average energy of effective bowel sounds
Std_Mag_bs | The standard deviation of the mean absolute value of effective bowel sounds | The standard deviation of the energy of the effective bowel sound
Notes. CVs, characteristic values. EBSs, effective bowel sounds.
Characteristic values extraction
The characteristic values (CVs) can quantitatively reflect the characteristics of BSs, so we extracted linear and nonlinear CVs for quantitative evaluation and statistical analysis. The linear CVs are mainly time-domain parameters, as shown in Table 5.
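The linear CVs of Table 5 can be computed directly from the denoised signal and the detected segments. The sketch below assumes segments are given as inclusive (start, end) sample indices and that the sample standard deviation is intended; neither detail is stated in the text, and the function name `linear_cvs` is hypothetical.

```python
import statistics


def linear_cvs(signal, segments, fs=8000):
    """Compute the seven linear CVs from (start, end) segments."""
    lengths = [end - start + 1 for start, end in segments]
    durations = [n / fs for n in lengths]                 # seconds
    abs_sums = [sum(abs(v) for v in signal[start:end + 1])
                for start, end in segments]
    mean_mags = [s / n for s, n in zip(abs_sums, lengths)]

    def mean(values):
        return statistics.fmean(values) if values else 0.0

    def std(values):
        # Assumed: sample standard deviation, 0.0 when undefined.
        return statistics.stdev(values) if len(values) > 1 else 0.0

    return {
        "Num_bs": len(segments),
        "Sum_bs": sum(abs_sums),
        "Sum_Duration_bs": sum(durations),
        "Mean_Duration": mean(durations),
        "Std_Duration": std(durations),
        "Mean_Mag_bs": mean(mean_mags),
        "Std_Mag_bs": std(mean_mags),
    }
```

The units of Sum_bs and the magnitude features follow the units of the input samples, so they are only comparable between recordings that share the same gain and scaling.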
Physiological signals have been shown to be chaotic [29]. As a basic physiological signal, the gut sound also has nonlinear dynamic characteristics, so nonlinear CVs were also calculated in this paper. Recurrence quantification analysis (RQA) [30] can measure the complexity of short, non-stationary, noisy signals [31] and has been broadly applied in the analysis of physiological data [32-34]. In this paper, phase space reconstruction was carried out for each EBS signal. Based on the recurrence plot, RQA was carried out and quantitative parameters were extracted [35], as shown in Table 6. Because there are multiple EBSs in each period, the mean value (Mean_) and standard deviation (Std_) of each CV in each period were calculated for the subsequent statistical analysis.
Table 6 The nonlinear CVs of EBSs
Nonlinear CVs | Calculation methods | Physiological significance
RR | The percentage of recurrent points falling within the specified radius | Reflect the similarity of signal fluctuation
Lmean | The mean of the diagonal lengths in recurrence plot | Related to the separation velocity of adjacent trajectories
ENTR | The Shannon information entropy of all diagonal line lengths | A measure of signal complexity
TT | The average length of vertical line structures | Degree of system stability
Notes. CVs, characteristic values. EBSs, effective bowel sounds.
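The four nonlinear CVs can be sketched for a short signal as follows. The embedding dimension, delay, radius and minimum line lengths are not given in the text, so the defaults below are placeholders; the sketch also assumes a Euclidean norm, counts diagonal lines above the main diagonal only (the matrix is symmetric), and includes the main diagonal in RR and in the vertical lines. The quadratic cost makes it suitable only for short segments.

```python
import math
from collections import Counter


def embed(x, dim, delay):
    """Time-delay embedding of a 1-D sequence into dim-dimensional points."""
    n = len(x) - (dim - 1) * delay
    return [[x[i + k * delay] for k in range(dim)] for i in range(n)]


def recurrence_matrix(x, dim=3, delay=1, radius=0.1):
    """R[i][j] = 1 when embedded points i and j lie within `radius`."""
    pts = embed(x, dim, delay)
    return [[1 if math.dist(p, q) <= radius else 0 for q in pts] for p in pts]


def _runs(bits):
    """Lengths of consecutive runs of 1s in an iterable of 0/1 values."""
    runs, run = [], 0
    for b in bits:
        if b:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    return runs


def rqa(R, lmin=2, vmin=2):
    """Return RR (percent), Lmean, ENTR and TT from a recurrence matrix."""
    n = len(R)
    rr = 100.0 * sum(map(sum, R)) / (n * n)
    diag = []
    for k in range(1, n):                    # diagonals above the main one
        runs = _runs(R[i][i + k] for i in range(n - k))
        diag += [r for r in runs if r >= lmin]
    vert = []
    for j in range(n):                       # one column at a time
        runs = _runs(R[i][j] for i in range(n))
        vert += [r for r in runs if r >= vmin]
    lmean = sum(diag) / len(diag) if diag else 0.0
    tt = sum(vert) / len(vert) if vert else 0.0
    counts = Counter(diag)                   # histogram of diagonal lengths
    entr = -sum((c / len(diag)) * math.log(c / len(diag))
                for c in counts.values())
    return {"RR": rr, "Lmean": lmean, "ENTR": entr, "TT": tt}
```

A call such as `rqa(recurrence_matrix(segment, dim=3, delay=1, radius=0.1))` would be repeated for each EBS, after which the Mean_ and Std_ of each CV per period can be taken.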
Fig. 2 shows an overview of BS data acquisition, processing, and analysis. Eq. (1) is given in the Supplementary Files.
Fig. 2 Overview of BSs data acquisition, processing, and analysis. BS is short for the bowel sound. Pre-op is short for before operation. Pro-op is short for after operation. 3h-Pro-op is short for three hours after operation. FD-PPA is short for FD-peak peeling algorithm. EBS is short for the effective bowel sound.
Statistical analysis
We sought to analyze the differences in BSs among Pre-op, Pro-op and 3h-Pro-op. Because the CVs quantitatively represent the signals, we conducted statistical analysis on the CVs of the 26 patients between Pre-op and Pro-op, the 18 patients between Pro-op and 3h-Pro-op, and the 18 patients between Pre-op and 3h-Pro-op, and determined whether there were statistically significant differences. The CVs used for statistical analysis included the linear CVs and the means and standard deviations of the nonlinear CVs.
Statistical analyses were performed using IBM SPSS Statistics 25. Each comparison used data taken from the same patients at two time points. Before the statistical analysis, each set of data was tested for normality using the Shapiro-Wilk test, and data with a p value greater than 0.05 were treated as normally distributed. For normally distributed data, the parametric paired t-test was used; otherwise, the rank-sum test was used. A value of p < 0.05 was considered to indicate statistical significance.
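For the parametric branch, the paired t statistic is straightforward to reproduce. The sketch below computes only the statistic and its degrees of freedom from two equally long lists of one CV measured in the same patients; the normality test, the non-parametric branch and the p value are left to a statistics package such as the SPSS software used here. The function name `paired_t` and the example numbers are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]   # per-patient change
    n = len(diffs)
    sd = statistics.stdev(diffs)                     # sample std of changes
    t = statistics.fmean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up values for four patients at two time points.
t_value, dof = paired_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
print(round(t_value, 3), dof)   # 3.576 3
```

Because the test works on per-patient differences, only patients measured at both time points can contribute to a comparison, which is consistent with the different patient counts of the three comparisons.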