Signal processing
Environmental noise is easily introduced during BS acquisition and directly degrades the quality of the BS signal, so it must be removed before the BSs can be analyzed and identified. We collected the environmental noise through the noise acquisition channel of the recorder and removed it by adaptive noise cancellation. Specifically, we adopted the least mean square (LMS) algorithm [19] with a filter order of 32 and a step size factor of 0.000001, which achieved good adaptive cancellation.
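As an illustration of this step, the following minimal sketch shows a textbook LMS noise canceller. It is not the implementation used in this study: the function name `lms_cancel`, the zero initialisation of the weights, and the update convention w ← w + μ·e·x (some texts use 2μ) are assumptions. Only the filter order of 32 and the step size of 0.000001 come from the text above.

```python
def lms_cancel(primary, reference, order=32, mu=1e-6):
    """Adaptive noise cancellation with a plain LMS filter.

    primary   : samples from the BS channel (signal plus noise)
    reference : samples from the noise acquisition channel
    Returns the error sequence e[n], which serves as the cleaned signal.
    """
    weights = [0.0] * order   # adaptive filter coefficients (assumed zero start)
    taps = [0.0] * order      # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        taps = [x] + taps[:-1]
        noise_est = sum(w * t for w, t in zip(weights, taps))
        e = d - noise_est                     # cleaned sample
        # LMS update: move the weights along the instantaneous gradient
        weights = [w + mu * e * t for w, t in zip(weights, taps)]
        cleaned.append(e)
    return cleaned
```

The appropriate step size depends on the amplitude scale of the recordings. With unit-variance toy data, a much larger value than 0.000001 is needed to see convergence within a few thousand samples.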
Adaptive filtering can eliminate the environmental noise, but the high-frequency noise remaining in the signal still affects the identification and analysis of effective bowel sounds (EBSs). Wavelet denoising is an effective and practical method that has achieved good results in signal and image denoising and is widely used in engineering applications. Donoho [20, 21] proposed the wavelet threshold denoising method. After wavelet transform with the Mallat algorithm, the wavelet coefficients of the signal carry the important information, and the wavelet coefficients of the noise are smaller in magnitude than those of the signal. With a suitable threshold, coefficients greater than the threshold are considered to be generated by the signal and are retained, while those below the threshold are considered to be generated by noise and are set to zero, which achieves the denoising. Wavelet decomposition requires choosing the wavelet basis, the number of decomposition layers, and the threshold. For the wavelet basis, we considered the two common families, db and sym, and chose sym8. A number of decomposition layers that is too large or too small degrades the final denoising effect; after comparing the denoising effects of different numbers of layers, we set it to 5. For the threshold, the Birgé-Massart algorithm [22] was used to obtain the threshold of each layer of the one-dimensional wavelet transform, and the soft threshold function was used for denoising.
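The decompose, threshold, and reconstruct cycle can be sketched as follows. To stay dependency-free, this sketch substitutes the Haar wavelet for sym8 and takes the per-layer thresholds as an input rather than computing them with the Birgé-Massart strategy. It therefore illustrates only the soft-threshold mechanism, not the exact configuration used here. The function names are hypothetical.

```python
import math


def soft_threshold(c, t):
    """Shrink coefficient c toward zero by t; |c| <= t becomes zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def haar_denoise(x, levels, thresholds):
    """Haar decomposition, soft-thresholding of each detail layer, reconstruction.

    len(x) must be divisible by 2**levels; thresholds[k] applies to layer k + 1.
    Haar is a stand-in for sym8 (assumption made for brevity).
    """
    s = 1.0 / math.sqrt(2.0)
    approx = list(x)
    details = []
    for k in range(levels):
        half = len(approx) // 2
        a = [(approx[2 * i] + approx[2 * i + 1]) * s for i in range(half)]
        d = [(approx[2 * i] - approx[2 * i + 1]) * s for i in range(half)]
        details.append([soft_threshold(c, thresholds[k]) for c in d])
        approx = a
    # Inverse transform, from the coarsest layer back to the finest
    for d in reversed(details):
        rec = []
        for a_i, d_i in zip(approx, d):
            rec.append((a_i + d_i) * s)
            rec.append((a_i - d_i) * s)
        approx = rec
    return approx
```

With all thresholds set to zero the function returns the input unchanged, which is a convenient check that the decomposition and reconstruction are consistent.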
After the adaptive filtering and wavelet denoising, the waveform (Fig. 1) can be used to identify EBSs. The fractal dimension (FD) quantitatively describes the complexity of a signal, and the FD of EBSs differs from that of background sounds [23]. The FD of a time series can be calculated either by first reconstructing the phase space and then calculating the correlation dimension [24–26], or directly in the time domain. The time series in this paper is an audio signal with a high sampling rate and a large data volume, so the FD is calculated directly in the time domain. We used the Katz method [23, 27, 28], which can effectively judge the randomness of a waveform. When calculating the FD of the BS signal, we employed a sliding window to realize short-time processing of the audio signal. The length of the sliding window is set to int(0.006*fs), where int denotes the integer part of the argument and fs is the sampling frequency of the BS signal. The constant 0.006 is set empirically and is justified by the efficient performance of the algorithm [23]. The FD is calculated for the data in each window position. To keep the length of the data equal before and after the FD calculation, the first and last FD values are used to pad both ends. After the FD sequence is calculated, its peaks are extracted to ensure effective recognition of the BSs. The peaks are extracted with the FD-peak peeling algorithm (FD-PPA) [29].
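The sliding-window FD calculation can be sketched as below, using the usual Katz definition FD = log10(n) / (log10(n) + log10(d/L)), where L is the total curve length, d is the largest distance from the first point, and n is the number of steps. Several details are assumptions, not taken from the text: distances are Euclidean in the (sample index, amplitude) plane with unit sample spacing, the window advances one sample at a time, and the padding is split between the two ends. FD-PPA itself is not reproduced.

```python
import math


def katz_fd(window):
    """Katz fractal dimension of one window (needs at least 3 samples)."""
    n = len(window) - 1                       # number of steps along the curve
    if n < 2:
        raise ValueError("window must contain at least 3 samples")
    # Total curve length: sum of distances between successive points
    L = sum(math.hypot(1.0, window[i + 1] - window[i]) for i in range(n))
    # Planar extent: farthest point from the first sample
    d = max(math.hypot(float(i), window[i] - window[0]) for i in range(1, n + 1))
    return math.log10(n) / (math.log10(n) + math.log10(d / L))


def sliding_fd(signal, fs, factor=0.006):
    """FD sequence with the same length as the input signal."""
    win = int(factor * fs)                    # window length int(0.006*fs)
    core = [katz_fd(signal[i:i + win]) for i in range(len(signal) - win + 1)]
    pad = len(signal) - len(core)
    left = pad // 2                           # split of the padding is an assumption
    right = pad - left
    # Repeat the first and last FD values to restore the original length
    return [core[0]] * left + core + [core[-1]] * right
```

A straight-line segment has d equal to L, so its FD is 1, while an oscillating segment yields a larger value. This is the contrast that the subsequent peak extraction relies on.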
FD-PPA makes the EBSs more obvious in the waveform, but endpoint detection is still needed to extract them. The purpose of voice endpoint detection (VAD) is to accurately identify the starting and ending points of speech in a signal segment containing speech, and thereby to distinguish speech from non-speech. It is an important part of speech processing. For the BS signal, we identify as EBSs the segments that satisfy certain conditions and treat all others as non-bowel sound signals. In this paper, the time series after FD-PPA is used as the input sequence for judging the starting and ending points of EBSs. Three thresholds are set empirically, and the endpoints of the EBSs are determined from them. The first is the threshold for entering a BS segment, which is set to 1.01: when the input value is greater than 1.01, that point is considered the starting point of an EBS. The second is the minimum duration of an EBS, which is set to 50 ms [30]: a BS segment shorter than this threshold is considered noise. The third is the maximum silence length allowed within a BS segment, which is set to 250 ms: if the silence within a BS segment is shorter than this value, the BS is considered unfinished; otherwise, the BS segment is considered finished.
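The three-threshold rule can be sketched as a single pass over the FD-PPA output. This is an illustrative reading of the description above, not the original code. The function name `detect_segments`, the conversion of the durations to sample counts with int, and the choice to end a segment at its last above-threshold sample are assumptions.

```python
def detect_segments(fd, fs, enter=1.01, min_dur=0.050, max_gap=0.250):
    """Return (start, end) sample indices, end exclusive, of candidate EBSs.

    fd      : sequence after FD-PPA
    enter   : threshold for entering a BS segment (1.01)
    min_dur : minimum EBS duration in seconds (50 ms)
    max_gap : maximum silence allowed inside a segment in seconds (250 ms)
    """
    min_len = int(min_dur * fs)
    gap_len = int(max_gap * fs)
    segments = []
    start = None          # start index of the currently open segment
    last_active = None    # most recent index whose value exceeded `enter`
    for i, v in enumerate(fd):
        if v > enter:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active >= gap_len:
            # Silence has reached the allowed maximum: the segment is finished
            if last_active + 1 - start >= min_len:
                segments.append((start, last_active + 1))
            start = None
    # Close a segment that is still open at the end of the recording
    if start is not None and last_active + 1 - start >= min_len:
        segments.append((start, last_active + 1))
    return segments
```

In this reading, two bursts separated by less than 250 ms are merged into one EBS, and an isolated burst shorter than 50 ms is discarded as noise.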
After the VAD, many other kinds of sound remain mixed in, such as heart sounds, breath sounds, and background noises similar to BSs. Because of the limitations of environmental noise collection and the residue left by the filters, we set three empirical thresholds to remove three kinds of residual noise. Specifically, the envelope of each EBS is obtained by complex analytic wavelet transform [31]. First, we exclude any sound segment whose envelope maximum is less than 50, that is, a sound segment whose amplitude is too small is considered noise. Second, the confounding of heart sounds is obvious in the measured data, so we extracted the envelope of each sound segment and counted its peaks; based on experience in judging heart sounds, we rule out any sound segment whose peak number is less than 3. Third, we found that in BS segments with a very small signal-to-noise ratio, residual noise was identified as a bowel sound and also needed to be removed. Based on experience, we filter out such sound segments when the envelope peak number exceeds 3 within a length of 1000 sampling points.
Table 5
Linear CVs | Calculation methods | Physiological significance |
Num_bs | The number of identified effective bowel sounds during the measurement | Frequency of bowel sounds in the five minutes |
Sum_bs | The sum of the absolute values of the identified effective bowel sounds | Reflecting the total energy of the bowel sounds |
Sum_Duration_bs | The sum of the duration of the identified effective bowel sounds | Reflecting the total duration of bowel sounds |
Mean_Duration | The mean of the duration of effective bowel sounds | Mean duration of effective bowel sounds |
Std_Duration | The standard deviation of the duration of effective bowel sounds | Standard deviation of duration of effective bowel sounds |
Mean_Mean_bs | The mean of the mean absolute value of effective bowel sounds | The average energy of effective bowel sounds |
Std_Mean_bs | The standard deviation of the mean absolute value of effective bowel sounds | The standard deviation of the energy of the effective bowel sound |
Notes. CVs, characteristic values. EBSs, effective bowel sounds.
Characteristic values extraction
The characteristic values (CVs) can quantitatively reflect the characteristics of BSs, so we extracted linear and nonlinear CVs for quantitative evaluation and statistical analysis. The linear CVs are mainly time-domain parameters, as shown in Table 5.
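The linear CVs of Table 5 can be computed from the identified EBS segments as in the following sketch. The function name `linear_cvs`, the representation of each EBS as a list of samples, and the use of the sample standard deviation (n − 1 denominator) are assumptions, since the text does not specify them.

```python
import statistics


def linear_cvs(segments, fs):
    """Linear CVs of Table 5 for one measurement period.

    segments : list of EBSs, each a non-empty list of amplitude samples
    fs       : sampling frequency in Hz
    """
    durations = [len(s) / fs for s in segments]            # seconds
    abs_sums = [sum(abs(v) for v in s) for s in segments]
    abs_means = [a / len(s) for a, s in zip(abs_sums, segments)]

    def sd(values):
        # Sample standard deviation; defined as 0.0 for a single EBS (assumption)
        return statistics.stdev(values) if len(values) > 1 else 0.0

    return {
        "Num_bs": len(segments),
        "Sum_bs": sum(abs_sums),
        "Sum_Duration_bs": sum(durations),
        "Mean_Duration": statistics.mean(durations),
        "Std_Duration": sd(durations),
        "Mean_Mean_bs": statistics.mean(abs_means),
        "Std_Mean_bs": sd(abs_means),
    }
```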
Physiological signals have been shown to be chaotic [32]. As a basic physiological signal, the bowel sound also has nonlinear dynamic characteristics, so nonlinear CVs are calculated in this paper. Recurrence quantification analysis (RQA) [33] can measure the complexity of short, non-stationary, noisy signals [34] and has been broadly applied in the analysis of physiological data [35–37]. In this paper, phase space reconstruction is carried out for each EBS signal. Based on the resulting recurrence plot, RQA is carried out and quantitative parameters are extracted [38], as shown in Table 6. Each period contains multiple EBSs, so to enable the subsequent statistical analysis, the mean value (-mean) and the standard deviation (-std) of each CV in each period are calculated.
Table 6
The nonlinear CVs of EBSs
Nonlinear CVs | Calculation methods | Physiological significance |
RR | The percentage of recurrent points falling within the specified radius | Reflects the similarity of signal fluctuation |
Lmean | The mean of the diagonal line lengths in the recurrence plot | Related to the separation velocity of adjacent trajectories |
ENTR | The Shannon information entropy of all diagonal line lengths | A measure of signal complexity |
TT | The average length of vertical line structures | Degree of system stability |
Notes. CVs, characteristic values. EBSs, effective bowel sounds.
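The four quantities in Table 6 can be sketched with a small, dependency-free RQA routine. The embedding dimension, delay, radius, and minimum line lengths below are placeholder values, because the text does not report them. RR is returned as a fraction rather than a percentage and includes the main diagonal, which is also an assumption. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(x, dim, tau):
    """Time-delay embedding (phase space reconstruction)."""
    n = len(x) - (dim - 1) * tau
    return [[x[i + k * tau] for k in range(dim)] for i in range(n)]


def recurrence_matrix(x, dim=3, tau=1, radius=0.5):
    """1 where two embedded points lie within `radius`, else 0."""
    pts = embed(x, dim, tau)
    return [[1 if math.dist(p, q) <= radius else 0 for q in pts] for p in pts]


def run_lengths(seq):
    """Lengths of consecutive runs of 1s in a 0/1 sequence."""
    runs, count = [], 0
    for v in seq:
        if v:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def rqa(R, lmin=2, vmin=2):
    """RR, Lmean, ENTR and TT from a square recurrence matrix."""
    n = len(R)
    rr = sum(map(sum, R)) / (n * n)
    diag = []
    for k in range(1, n):                     # upper off-diagonals only
        diag += run_lengths([R[i][i + k] for i in range(n - k)])
    diag = [length for length in diag if length >= lmin]
    vert = []
    for j in range(n):                        # vertical lines, column by column
        vert += run_lengths([R[i][j] for i in range(n)])
    vert = [length for length in vert if length >= vmin]
    lmean = sum(diag) / len(diag) if diag else 0.0
    tt = sum(vert) / len(vert) if vert else 0.0
    # Shannon entropy of the distribution of diagonal line lengths
    total = len(diag)
    entr = -sum((c / total) * math.log(c / total)
                for c in Counter(diag).values()) if diag else 0.0
    return {"RR": rr, "Lmean": lmean, "ENTR": entr, "TT": tt}
```

Applying `rqa(recurrence_matrix(ebs))` to every EBS in a period and then taking the mean and standard deviation of each quantity gives the -mean and -std values described in the text.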
Fig. 2 shows an overview of BS data acquisition, processing, and analysis, and ‖f(t)‖ is calculated as in Eq. (1).