A Novel Subband Fractional Delay Algorithm Based on the Filterbank of Cochlear Implant

Background: In recent years, microphone array methods have gradually been applied to speech enhancement for cochlear implants, and the delay is the main parameter of microphone array beamforming. Because of the size limitation of the cochlear implant, the microphone spacing is very small, so in implementation the delay usually corresponds to a fractional number of sampling points, and a fractional delay filter is needed to interpolate between integer sampling points. Traditional fractional delay methods interpolate over the whole speech frequency band. However, the speech band is very wide, so the error of existing fractional delay methods in cochlear implants is still large. Methods: We propose a fractional delay algorithm based on the filterbank of the cochlear implant. The algorithm derives the mathematical expression of the fractional delay filter for each subband and combines the subband filters into a full-band fractional delay filter that minimizes the delay error over the whole band. Realization and results: Analysis of the system response curve and calculation of the delay error show that the system response of the fractional delay filter of each subband in the cochlear filterbank deviates only slightly from that of the ideal fractional delay filter. The error of the designed fractional delay filter is therefore very small and meets the precision that microphone array technology in cochlear implants requires of the delay parameter. Discussions: In this paper, the subband fractional delay filter is applied to signal acquisition in cochlear implants. Considering the space constraints and delay parameters of the actual application scenario, the fractional delay can be any real number between 0 and 3, and the error of the algorithm is calculated and analyzed over this range.
If the algorithm is extended to other applications, the numerical range of the fractional delay can be extended accordingly. The statistics of the average error show that the average error of the proposed algorithm over the whole frequency band is extremely small, which meets the accuracy required of the delay parameter in cochlear implant applications. Conclusions: The proposed fractional delay filter, based on minimizing the subband error of the cochlear implant, minimizes not only the local fractional delay error but also the error over the whole frequency band.

guiding, virtual electrode, microphone array speech enhancement and so on. Among them, microphone array speech enhancement relies on the target speech and the interfering noise arriving from different directions in the scene of use [6][7][8]. In practical use of a cochlear implant, users want to enhance the signal from the front during face-to-face communication and pay little attention to signals from other directions. Research shows that, for a 50% sentence recognition rate, normal-hearing listeners need a signal-to-noise ratio (SNR) of about −10 dB, whereas cochlear implant users need an SNR between 5 and 15 dB; however, the SNR of daily living environments is usually 5 to 10 dB [9][10]. Therefore, in the noise of a normal living environment, it is difficult for the speech recognition rate of cochlear implant users to exceed 50%. Because the speech recognition rate of cochlear implants in quiet is already relatively high, improving the SNR of front-end signal acquisition is equivalent to making the cochlear implant work in a "quiet" environment, which improves its performance. In recent years, microphone array technology has been widely used in video conferencing, hands-free car telephones, hearing aids and so on. It has also appeared in research on front-end acquisition for cochlear implants, where microphone array speech enhancement is applied in the front-end device. The microphone array method uses multiple sound sensors to collect sound signals, which helps to separate the target signal from the noise. This can improve the SNR at the input of the cochlear implant and, in turn, the speech recognition rate [11][12][13].
The core of microphone array technology is delay-and-sum beamforming (DSB). In this method, multiple microphones are placed in space to collect multichannel sound signals, and each channel is given its own delay and gain. The output then has directional characteristics, that is, it responds differently to signals from different directions. In essence, beamforming realizes spatial filtering: the system response adjusts the signal amplitude according to direction, enhancing the signal from the target direction and suppressing ambient signals. The recognition rate of cochlear implants is high in quiet but low in noise, so if the cochlear implant can be restored to working in a "quiet" environment, its performance is expected to improve. In a noisy environment, the signals collected by the front-end of the cochlear implant contain a lot of noise. If the SNR of signal acquisition can be improved, the interfering noise is weakened before it reaches the speech processor of the cochlear implant, which helps to improve the speech recognition rate of the wearer. In addition, in daily use of a cochlear implant the target voice and the interfering noise are commonly separated in spatial direction. What users most want is that, during face-to-face communication, the voices of nearby speakers, environmental noise and reverberation are effectively removed. Therefore, microphone array technology for cochlear implants helps to improve the speech recognition rate.
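The spatial filtering performed by delay-and-sum beamforming can be illustrated with a minimal two-microphone sketch. The function name, the 2 cm spacing, the 4 kHz test frequency and the speed of sound of 340 m/s are assumptions made for illustration; they are not parameters taken from a specific device.

```python
import numpy as np

SPEED_OF_SOUND = 340.0  # m/s, assumed value


def dsb_response(freq_hz, angle_deg, spacing_m, steer_delay_s):
    """Magnitude response of a two-microphone delay-and-sum beamformer.

    angle_deg is measured from the array axis (0 = sound arriving from the front).
    """
    omega = 2.0 * np.pi * freq_hz
    # Extra travel time of the wavefront from the front to the rear microphone.
    arrival_delay = spacing_m * np.cos(np.deg2rad(angle_deg)) / SPEED_OF_SOUND
    # The front channel is delayed by steer_delay_s, then both channels are averaged.
    front = np.exp(-1j * omega * steer_delay_s)
    rear = np.exp(-1j * omega * arrival_delay)
    return np.abs(0.5 * (front + rear))


spacing = 0.02                  # 2 cm, upper end of the typical spacing
tau = spacing / SPEED_OF_SOUND  # steering delay that aligns sound from the front
front_gain = dsb_response(4000.0, 0.0, spacing, tau)
back_gain = dsb_response(4000.0, 180.0, spacing, tau)
print(front_gain, back_gain)
```

With the steering delay set to d/c, sound from the front adds coherently (gain 1), while sound from behind is attenuated. The steering delay is far below one sampling period at common sampling rates, which is why a fractional delay is needed.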
In microphone array beamforming, the delay is the main parameter that determines the beam shape. By changing the delay and other parameters, beam patterns with different directivity can be designed [14][15][16]. For example, in reference [14] multiple beam patterns are designed by using different coefficients with four acoustic tubes, and in reference [15] different beam patterns are obtained from two microphones with different delay and gain coefficients. In particular, the delay value has a great impact on the beam pattern of a dual-microphone system [17][18]. Because of the size limitation of the cochlear implant, the distance between microphones is very small, so for a given sampling rate the delay corresponds to a fractional number of sampling points [19][20]. Since the signals collected at the front-end of the cochlear implant are already discrete, the delayed signal has to be obtained by interpolating the original discrete signal [21]. The least mean square fractional delay filter [22][23][24][25][26][27] and the maximally flat fractional delay filter [23][24][25][26][27][28][29][30][31][32] are the most widely used fractional delay interpolation methods. The least mean square fractional delay filter reduces the total mean square error over the whole frequency band, but its local error is not minimal and its global error is still large. The maximally flat fractional delay filter minimizes the error at low frequencies or at a local position, but it has larger errors at other frequencies and a larger global error. The previous methods therefore do not solve the problem of global error minimization, and the global fractional delay error over all frequency bands of the filterbank remains large. Because the filterbank of the cochlear implant already divides the signal into frequency bands, the frequency range of each subband is relatively narrow. A fractional delay algorithm based on subband error minimization is therefore proposed. The algorithm minimizes the error at the center frequency of each subband, and the error within each narrow subband is very small, so that the fractional delay error over the whole speech band is minimized.
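The behaviour of the maximally flat approach can be seen in a short sketch. It uses the standard Lagrange interpolation formula, which is maximally flat at zero frequency; the function names, the third-order filter and the two test frequencies are illustrative assumptions.

```python
import numpy as np


def lagrange_fd(delay, order=3):
    """Lagrange (maximally flat at zero frequency) fractional delay FIR coefficients."""
    n = np.arange(order + 1)
    h = np.ones(order + 1)
    for k in range(order + 1):
        mask = n != k
        # Each tap n collects the factor (D - k) / (n - k) for every k != n.
        h[mask] *= (delay - k) / (n[mask] - k)
    return h


def delay_error(h, delay, omega):
    """Magnitude of the deviation from the ideal delay response at frequency omega."""
    n = np.arange(len(h))
    designed = np.sum(h * np.exp(-1j * omega * n))
    return abs(designed - np.exp(-1j * omega * delay))


h = lagrange_fd(1.5)
low = delay_error(h, 1.5, 0.05 * np.pi)   # near zero frequency
high = delay_error(h, 1.5, 0.8 * np.pi)   # near the top of the band
print(h, low, high)
```

The error is tiny near zero frequency and grows toward high frequencies, which is the limitation that motivates designing a separate filter for each narrow subband.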

Methods
At present, the mainstream speech processing strategies for cochlear implants, such as CIS, ACE and SPEAK, are based on a filterbank. The CIS strategy transfers the parameters of only one channel to the electrodes at a time and stimulates the channels in order, forming alternating electrical stimulation through continuous interleaved sampling. The stimulation rate of the CIS strategy ranges from 740 to 2400 pulses per second. The SPEAK and ACE strategies extract the several frequency bands with the largest energy for electrode stimulation, so the extracted bands are neither equally spaced nor stimulated in continuous interleaved order. Because the energy of speech is mainly concentrated in the low-frequency band, the SPEAK and ACE strategies mainly extract and transmit channel information in the low-frequency band. The two strategies are similar in algorithm architecture but differ in their specific parameters. Their stimulation rates differ, with ACE having the higher rate, and their channel numbers differ as well: the SPEAK strategy sets 20 channels and selects the 5-10 bands with the largest amplitude for electrode stimulation, while the ACE strategy sets 22 channels and selects up to 20 channels for information transmission. Although the three mainstream strategies have different algorithm structures, they are all based on the filterbank. Therefore, the fractional delay filter algorithm studied in this paper is also based on the filterbank architecture.
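The maxima-selection step shared by SPEAK and ACE can be sketched in a simplified form. The function name and the example envelope values are hypothetical, and a real strategy also involves envelope extraction, amplitude mapping and stimulation timing that are not shown here.

```python
import numpy as np


def select_maxima(envelopes, n_select):
    """Return the indices of the n_select channels with the largest envelope."""
    envelopes = np.asarray(envelopes, dtype=float)
    top = np.argsort(envelopes)[-n_select:]
    # Report the selected channels in channel order.
    return np.sort(top)


frame = [0.1, 0.9, 0.3, 0.7, 0.05, 0.6]  # one analysis frame, six channels
picked = select_maxima(frame, 3)
print(picked)  # channels 1, 3 and 5
```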
Assume that the signal collected at the front-end of the cochlear implant is $x(n)$. For an M-channel cochlear implant, a subband signal $x_i(n)$ is separated out for each channel, where i ranges from 0 to M. The original signal $x(n)$ consists of the subband signals, as shown in equation (1).

$$x(n) = \sum_{i=0}^{M} x_i(n) \qquad (1)$$

In the microphone array beamforming algorithm, $\tau$ denotes the delay parameter. For a sampling rate $f_s$, the delay in sampling points is $f_s\tau$. The delayed signal is shown in equation (2).

$$x(n - f_s\tau) = \sum_{i=0}^{M} x_i(n - f_s\tau) \qquad (2)$$

For practical parameters, the delay $f_s\tau$ is not always an integer. It can be divided into an integer delay $D_i$ and a fractional (decimal) delay $D_d$, as shown in equation (3).

$$f_s\tau = D_i + D_d \qquad (3)$$

In equation (3), when $D_d$ is 0, the delay $f_s\tau$ reduces to an integer delay, which is simple to implement because it is only a shift of the digital signal. When $D_d$ is not zero, the delay $f_s\tau$ amounts to interpolating between the sampling points of the known signal $x(n)$. For an ideal fractional delay system, the Z-transform is shown in equation (4).

$$H_{id}(z) = z^{-f_s\tau} = z^{-(D_i + D_d)} \qquad (4)$$
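The split into an integer and a fractional part, and the integer-shift case, can be sketched as follows. The function names and the 44.1 kHz sampling rate are assumptions chosen for illustration.

```python
import math

import numpy as np


def split_delay(tau_s, fs_hz):
    """Split the delay tau * fs into an integer part and a fractional part."""
    total = tau_s * fs_hz
    d_int = math.floor(total)
    d_frac = total - d_int
    return d_int, d_frac


def integer_delay(x, d_int):
    """Delay x by d_int whole samples by shifting and zero-padding at the start."""
    x = np.asarray(x, dtype=float)
    d = min(d_int, len(x))
    return np.concatenate([np.zeros(d), x[:len(x) - d]])


# Assumed example: a 58.82 microsecond delay at a 44.1 kHz sampling rate.
d_int, d_frac = split_delay(58.82e-6, 44100.0)
print(d_int, d_frac)  # 2 whole samples plus about 0.594 of a sample
print(integer_delay([1.0, 2.0, 3.0, 4.0], d_int))
```

Only the fractional part requires an interpolating filter; the integer part is a plain shift.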
Equation (4) shows that the ideal fractional delay system is an all-pass system with linear phase. It is non-causal, so it cannot be implemented in real time and cannot be used directly in the real-time signal processing of a cochlear implant. Considering the characteristics of signal processing and the stability of the algorithm in the cochlear implant, an FIR fractional delay filter is designed to approximate the ideal filter and minimize the delay error. Because the speech processing strategy of the cochlear implant already divides the signal into frequency bands, each subband contains a narrow-band signal, and the coefficients of the fractional delay filter differ from subband to subband. For the subband of channel i, the unit impulse response $h_i(n)$ of the designed K-order fractional delay filter can be expressed as equation (5).

$$h_i(n) = \sum_{k=0}^{K} h_i(k)\,\delta(n-k) \qquad (5)$$

Here $h_i(0), h_i(1), \ldots, h_i(K)$ are the coefficients of the K-order FIR fractional delay filter designed for subband i of the cochlear implant. The length of the filter is K+1, and the corresponding system function in the Z domain is $H_i(z) = \sum_{k=0}^{K} h_i(k) z^{-k}$. The error function between the designed K-order FIR fractional delay filter and the ideal fractional delay filter $H_{id}(z) = z^{-f_s\tau}$ (with $f_s\tau$ the delay in sampling points) is shown in equation (6).

$$E_i(z) = H_i(z) - H_{id}(z) \qquad (6)$$

Substituting equation (4) into equation (6) gives the error function of subband i, as shown in equation (7).

$$E_i(z) = \sum_{k=0}^{K} h_i(k) z^{-k} - z^{-f_s\tau} \qquad (7)$$

It can be seen that the error function … Combining equation (7) with equation (9) and rewriting the result in terms of the center angular frequency $\omega_{i,\mathrm{cen}}$ of subband i gives the linear equations (10). The coefficients of the unit impulse response of the designed filter can then be obtained by applying Cramer's rule [33][34][35] to the linear equations (10). The corresponding coefficients $h_i(0), h_i(1), \ldots, h_i(K)$ are shown in equation (11).
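A design of this kind can be sketched numerically. The constraints used below are an assumption made for illustration and are not the paper's equations (10): the filter response and its low-order derivatives with respect to frequency are forced to equal those of the ideal delay at the subband center frequency. The linear system is solved with `numpy.linalg.solve` rather than Cramer's rule, and the function names, sampling rate, center frequencies and delay value are hypothetical.

```python
import numpy as np


def subband_fd_filter(omega_c, delay, order=3):
    """Real FIR coefficients whose response matches the ideal delay at omega_c.

    Assumed constraints (not the paper's equations (10)): the response and its
    first (order - 1) / 2 derivatives with respect to frequency equal those of
    exp(-1j * omega * delay) at omega = omega_c.
    """
    if order % 2 == 0:
        raise ValueError("this sketch needs an odd order (even number of taps)")
    n = np.arange(order + 1)
    rows, rhs = [], []
    for m in range((order + 1) // 2):
        # Real and imaginary parts of sum_n n^m h(n) e^{-j w n} = D^m e^{-j w D}.
        rows.append(n**m * np.cos(omega_c * n))
        rhs.append(delay**m * np.cos(omega_c * delay))
        rows.append(n**m * np.sin(omega_c * n))
        rhs.append(delay**m * np.sin(omega_c * delay))
    return np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))


def response(h, omega):
    """Frequency response of the FIR filter h at angular frequency omega."""
    n = np.arange(len(h))
    return np.sum(h * np.exp(-1j * omega * n))


fs = 16000.0                          # assumed sampling rate
centers_hz = [500.0, 1500.0, 4000.0]  # assumed subband center frequencies
delay = 0.47                          # assumed delay in sampling points
bank = [subband_fd_filter(2 * np.pi * f / fs, delay) for f in centers_hz]
for f, h in zip(centers_hz, bank):
    w = 2 * np.pi * f / fs
    print(f, abs(response(h, w) - np.exp(-1j * w * delay)))
```

Each subband gets its own coefficient set, and the error is zero at that subband's center frequency by construction. For an integer delay the solution reduces to a pure shift.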

Realization and results
Because of the size limitation of the cochlear implant, the distance between microphones is very small, generally between 1 cm and 2 cm. For the common delay parameter $\tau = d/c$ (d is the inter-microphone distance and c is the speed of sound), the delay therefore ranges from 29.41 μs to 58.82 μs.
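The delay range can be checked with a few lines of arithmetic. The quoted values are reproduced with a speed of sound of 340 m/s, which is an assumption inferred from the numbers; the sampling rates in the loop are likewise illustrative and are not specified in the text.

```python
SPEED_OF_SOUND = 340.0  # m/s, assumed; it reproduces the quoted delay values


def delay_us(spacing_m):
    """Inter-microphone travel time d / c in microseconds."""
    return spacing_m / SPEED_OF_SOUND * 1e6


for spacing in (0.01, 0.02):
    print(f"d = {spacing * 100:.0f} cm -> tau = {delay_us(spacing):.2f} us")

# The same delays expressed in sampling points for a few assumed sampling rates.
for fs in (16000.0, 22050.0, 44100.0):
    samples = [delay_us(d) * 1e-6 * fs for d in (0.01, 0.02)]
    print(fs, samples)
```

At these sampling rates the delay is not a whole number of samples, so it cannot be realized by shifting alone.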
The number of channels commonly used in cochlear implants is 16 to 24. The filterbank of a cochlear implant can divide the frequency band in many ways, for example with an FFT filterbank or a Gammatone filterbank [36][37]. The algorithm proposed in this paper can be applied to different band division modes, because it only requires the cut-off frequencies of each subband. With this algorithm, the coefficients of the fractional delay filters of the different subbands can be calculated and embedded in the conventional continuous interleaved sampling strategy or in a maxima-selection strategy (ACE, SPEAK, etc.). The implementation flow is shown in figure 1, taking a two-microphone system as an example.

Fig. 1 Processing scheme of the subband fractional delay coefficients in the speech processing strategy of the cochlear implant.
In figure 1, two microphones collect the signals of two channels. Both signals are processed according to the conventional cochlear speech processing strategy, including pre-emphasis and frequency band division (FFT filterbank mode or Gammatone filterbank mode). One channel is then given the beamforming delay through the proposed subband fractional delay filter, and, combined with the gain parameters, the two channels are used for speech enhancement by the microphone array algorithm. After this stage is embedded into the cochlear implant, the subsequent signal processing steps, such as envelope extraction, channel selection and dynamic range mapping, are carried out as usual. Finally, the stimulus sequence is formed.
Taking … Figure 3 shows the system response curve of the 24-channel full-band fractional delay filter for the cochlear implant. For an ideal fractional delay filter, the amplitude frequency response of the system is constant at 1 (0 dB). The closer the amplitude frequency response of the designed fractional delay filter is to 1, the higher the accuracy. As can be seen from figure 3, the maximum deviation of the system response of the designed fractional delay filter over the whole frequency band of the 24-channel filterbank is 3×10^-5 dB. A value of 3×10^-5 dB corresponds to a ratio of about 1.000007 (with the decibel value taken as 10 log10 of the ratio), that is, a maximum deviation of 0.0007%. Therefore, the error of the fractional delay filter designed in this paper is extremely small and meets the accuracy that microphone array technology in cochlear implants requires of the delay parameter.
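The conversion from the decibel deviation to a linear ratio can be checked directly. The sketch below evaluates both common decibel conventions; which convention figure 3 uses is not stated in the text, so both are shown, and the function name is illustrative.

```python
def db_to_ratio(db, factor):
    """Convert a decibel value to a linear ratio: factor 10 for power, 20 for amplitude."""
    return 10.0 ** (db / factor)


deviation_db = 3e-5
power_ratio = db_to_ratio(deviation_db, 10.0)      # about 1.0000069
amplitude_ratio = db_to_ratio(deviation_db, 20.0)  # about 1.0000035
print(f"{(power_ratio - 1) * 100:.5f} %", f"{(amplitude_ratio - 1) * 100:.5f} %")
```

Under either convention the deviation from the ideal all-pass response is below 0.001%.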

Discussions
In the application of a front-end microphone array for cochlear implants, the delay ranges from 0 to 3 sampling points. Figure 4 shows the average error of the fractional delay filters for different delays and numbers of channels. The figure shows two cases of average error, positive and negative. In both cases the magnitude of the average error is very small, not more than 0.0004%. The average error is smaller when the number of channels is large and tends to increase when the number of channels is small. In figure 4, the delay values range from 0 to 3, where 0, 1, 2 and 3 are integer delays and all other values are fractional delays.
The error at integer delays is 0, so between two adjacent integer delays the average error first increases and then decreases, and the maximum error occurs between two integer sampling points. The calculation shows that, in the range 0~1, the maximum error occurs at a delay of 0.385, and the result is the same for all filterbanks with 16 to 24 channels. Similarly, the maximum error in the range 1~2 occurs at a delay of 1.5, and the maximum error in the range 2~3 occurs at a delay of 2.615. Filterbanks with different numbers of channels have similar characteristics, and the average error is very small, which helps to realize a precise delay in the front-end microphone array beamforming algorithm.
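The shape of the error as a function of delay can be explored with a numerical sweep. The sketch below is an assumption-laden illustration and does not reproduce figure 4: it uses a hypothetical 4-tap design that matches the value and slope of the ideal delay at the subband center, a single subband centered at 0.1π with half-width 0.02π, and the maximum in-band deviation as the error measure.

```python
import numpy as np


def subband_fd_filter(omega_c, delay):
    """Assumed 4-tap design: match value and slope of the ideal delay at omega_c."""
    n = np.arange(4)
    c, s = np.cos(omega_c * n), np.sin(omega_c * n)
    a = np.array([c, s, n * c, n * s], dtype=float)
    b = np.array([np.cos(omega_c * delay), np.sin(omega_c * delay),
                  delay * np.cos(omega_c * delay), delay * np.sin(omega_c * delay)])
    return np.linalg.solve(a, b)


def band_error(omega_c, half_width, delay, points=21):
    """Largest deviation from the ideal delay response inside one subband."""
    n = np.arange(4)
    h = subband_fd_filter(omega_c, delay)
    omegas = np.linspace(omega_c - half_width, omega_c + half_width, points)
    designed = np.exp(-1j * np.outer(omegas, n)) @ h
    ideal = np.exp(-1j * omegas * delay)
    return np.max(np.abs(designed - ideal))


delays = np.linspace(0.0, 3.0, 601)  # step of 0.005 sampling points
errors = np.array([band_error(0.1 * np.pi, 0.02 * np.pi, d) for d in delays])
worst_in_first_interval = delays[np.argmax(errors[:201])]
print(errors[0], errors[200], worst_in_first_interval)
```

In this sketch the error vanishes at integer delays and peaks between them, which is the qualitative behaviour described above; the exact peak positions depend on the design and on the error measure.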

Conclusion
In microphone array technology, the delay is an important parameter of beam design, and its accuracy determines the stability of the beam pattern. Because of the size limitation of the cochlear implant, the distance between microphones is very small and so is the delay, which leads to the fractional delay problem in cochlear implant applications. In this paper, a subband fractional delay method is proposed based on the frequency band division of the filterbank in the cochlear implant. The fractional delay filter designed by this method minimizes both the local and the global fractional delay error, and it solves the problem that the fractional delay error over the whole frequency band is too large in previous methods. The algorithm can be embedded into the speech processing structure of the cochlear implant, which has important theoretical and engineering value.