Fault detection and diagnosis method of diesel engine by combining rule-based algorithm and Bayesian/neural networks

The stable operation of diesel engines is critical to industrial production, so the prevention, monitoring and identification of faults are of great significance. Existing research on diesel engine faults still has shortcomings: only a few fault types are diagnosed, diagnostic accuracy remains low, and fault identification is performed only at a constant speed. Therefore, a fault diagnostic method that combines a rule-based algorithm with Bayesian networks (BNs) or BP neural networks is proposed to identify seven faults at different speeds. The Changchai EV80 diesel engine is taken as an example, and feature values are extracted from vibration signals measured on the cylinder head. The signals are processed by wavelet threshold de-noising and Ensemble Empirical Mode Decomposition (EEMD). Signal-sensitive feature values extracted from the decomposed Intrinsic Mode Functions (IMFs) are used to distinguish different faults. After the feature values are obtained, a rule-based algorithm using IF...THEN logic statements is applied, and BNs and BP neural networks established by parameter learning are used for fault identification. Furthermore, uncertain factors and interference from the external environment are considered: Gaussian white noise is added to the raw signal, and external excitation interference is applied to the diesel engine while it runs under the normal operating condition. The results show that the proposed method can accurately identify the faults.


Introduction
The fault diagnostic accuracy of diesel engines is still low because the large number of components, harsh operation and strong vibration make faults difficult to identify. In general, fault diagnosis methods for diesel engines can be classified into three types: model-based, signal-based and knowledge-based methods [1].
Among them, the model-based method was developed earliest and is the most mature.
The method has a profound theoretical foundation and can penetrate the essence of the dynamic system to achieve the identification of faults. For example, Salehi et al. [2] established a mathematical model of the engine exhaust pipe and identified accurately the faults of leakage and blockage. Nahim et al. [3] proposed a corresponding model for the defects of the diesel engine and pointed out the impact of the fault on the overall system. However, this method is only useful for the system with known mathematical models. It is difficult to obtain precise mathematical models of the system in actual situations. The uncertainty and time-variation of the system structure and parameters are difficult to predict, so the method has great limitations.
Signal-based methods are also widely used for the fault diagnosis of diesel engines. The real-time operating state of the diesel engine can be obtained through sensors installed on it, so that an online real-time monitoring system can be established without damaging the engine. Jing et al. [4] accurately identified faults under different valve conditions using the vibration acceleration signal. Dykas et al. [5] measured the acoustic emission signal of a single-cylinder diesel engine and accurately identified the injector fault. However, the signal patterns extracted for the same type of fault sometimes differ from one measurement to another, which interferes with fault identification.
The knowledge-based method is also called the data-driven method. It does not require a prior mathematical model of the structure or component, and is well suited to the fault diagnosis of complex systems. Haghani et al. [6] used data-driven methods to detect experiment deviations, optimize engine parameters and improve the operating performance of diesel engines. Li et al. [7] monitored the pressure signals of diesel engines using the data-driven method to identify faults of valve trains and fuel systems. The data-driven method requires a large amount of historical data to train the diagnostic model. In the current work, the faults are set artificially and the fault diagnosis experiment is repeated many times to obtain enough data.
The vibration sensors are used to collect the fault signals in this paper. The motion of the diesel engine is reciprocating mechanical motion with strong periodicity. One motion cycle consists of four strokes, and the running curves of the strokes are completely different. Therefore, when an internal component of the diesel engine fails, the fault is reflected in the corresponding stroke and the running curve changes accordingly, which is convenient for fault diagnosis. Wang et al. [8] successfully identified fuel failure and an increased fuel supply advance angle by decomposing the vibration signal. Wang et al. [9] calculated the cone-shaped kernel distribution of the vibration signal under different faults of the diesel engine and identified faults of the valve mechanism. Yuan et al. [10] effectively monitored and evaluated the health and reliability of the engine based on vibration signals. Janssens et al. [11] proposed a multi-sensor system for fault diagnosis of rotating machinery by combining a vibration sensor with other sensors.
Their results indicate that multiple faults can be diagnosed effectively using multiple sensors.
Fault identification is a very important step in data-driven fault diagnosis. Even when the fault signals have been collected and decomposed, the faults cannot be accurately identified without a proper identification method. Proper identification methods can reduce the influence of interference during signal acquisition and of errors during signal decomposition. Currently popular fault identification methods include mixed causality and fuzzy rule based methods, blind source separation, support vector machines, BNs and neural networks. BNs handle uncertainty well because they can learn and reason under incomplete and uncertain information. BP neural networks can classify arbitrarily complex patterns and have excellent multidimensional function mapping ability. For example, Cai et al. [12] proposed an object-oriented BN modeling method for fault diagnosis of complex systems. Subrahmanya et al. [13] proposed a new algorithm for selecting sensors based on the Bayesian formula, and applied it to the fault diagnosis of diesel engines. Morgan et al. [14] used the feature values obtained from oil analysis to establish a BN to identify diesel engine faults. Zhao et al. [15] defined fault diagnosis disciplines using BNs to identify the faults of air handling devices. Lu et al. [16] input the feature values extracted from the vibration signal into a BP neural network to identify turbine faults.
Ahmed et al. [17] combined BP neural networks with a stacked autoencoder to achieve high-precision diagnosis of faults in rolling bearings. Wang et al. [18] trained BP neural networks on fault data from an electronic equipment system to establish a fault diagnostic system for complex electronic equipment.
So far, researchers have done a great deal of work on the fault diagnosis of diesel engines. However, many problems remain: only a few types of faults are identified, the accuracy of fault diagnosis is still low, and fault identification is performed only at a constant speed. This paper proposes a diesel engine fault diagnostic method that combines a rule-based algorithm with BNs or BP neural networks. The method can accurately identify diesel engine faults at different speeds. The impact of white noise and external excitation on the diagnostic accuracy is also examined. The structure of this paper is as follows: Section 2 describes the construction of the test bench and the simulation of the faults; Section 3 presents the overall process of fault diagnosis, including signal acquisition, signal preprocessing, feature value extraction and fault identification; Section 4 gives the fault diagnostic results; Section 5 summarizes the paper.

System description and fault analysis
The fault diagnosis system of the diesel engine is shown in Figure 1. The vibration sensors, photoelectric switch and photoelectric encoder are installed on the diesel engine, and the measured signals are delivered to the PC through the corresponding data acquisition card. Both vibration sensors are placed on the same cylinder head, above the intake and exhaust valves, respectively. The photoelectric encoder is placed at the output shaft of the diesel engine to monitor the real-time speed. A reflective sticker is attached to [...]
The engine used in the experiment is the EV80 diesel engine produced by Changchai Company. It is a two-cylinder, in-line V-type diesel engine with a speed range from 1800 rpm to 3600 rpm and a power range from 10.8 kW to 14 kW. The basic parameters are shown in [...]
After a long period of operation, a diesel engine develops various failures because of its harsh working environment and large number of components. In the previous literature, the faults studied are mainly confined to valve clearance faults, such as an increase or decrease of the valve clearance. In a real environment, diesel engine failures cover a much wider variety of faults. Therefore, this work extends the range of fault types studied. The faults are divided into the types shown in Table 2.
Table 2
Fault type (abbreviation): Simulation method
[...]: The clearance between the valve and the transmission is adjusted to 0.1 mm
Exhaust valve clearance with 0.8 mm (EL): The clearance between the valve and the transmission is adjusted to 0.8 mm
Bending of the push rod at the exhaust valve (EC): The push rod is manually bent to a certain curvature
Leakage of the high pressure oil pipe (OL): The bolt at the high pressure inlet pipe is loosened to produce slow oil leakage
Wear of the intake valve base (ISE): The side of the intake valve base is scratched so that the seal is not tight
Wear of the cylinder pad (WE): The black sealing pad is scratched
Spring wear at the intake valve (ISP): A length of the spring is cut off by a wire cutting machine
In addition, when the normal operating condition (NOR) is counted, there are 8 conditions in total, and the valve clearance under the normal operating condition is 0.2 mm.

The proposed fault diagnosis methodology
The flow chart of the intelligent fault diagnosis method is shown in Figure 2.

Signal noise reduction and feature value extraction
Decomposing the frequency components of a signal into mutually non-overlapping frequency bands provides an effective way to perform signal filtering, signal-to-noise separation and feature extraction, and the wavelet transform therefore has a unique advantage in signal de-noising. The principle of wavelet threshold de-noising can be expressed as follows:
$$x(n) = s(n) + u(n) \tag{1}$$
where x(n) is the measured signal, s(n) is the useful signal and u(n) is the noise sequence. Assume that u(n) is a zero-mean Gaussian random sequence with distribution N(0, σ_u^2). Applying the wavelet transform to Eq. (1) gives:
$$W_x(j,k) = W_s(j,k) + W_u(j,k) \tag{2}$$
where W_x(j,k), W_s(j,k) and W_u(j,k) are the wavelet coefficients of x(n), s(n) and u(n) at scale j and position k. Since u(n) is a zero-mean, stationary random signal with independent distribution, let u = (u(0), u(1), …, u(N-1))^T. Then:
$$E\{u u^{T}\} = Q = \sigma_u^{2} I \tag{3}$$
Here, E{•} represents the mean operation, Q is the covariance matrix of u, and I is the unit matrix.
Assume that W is the orthogonal wavelet transform matrix, and that x and s are the vectors corresponding to x(n) and s(n). The vectors X, S and U are the wavelet transforms of x(n), s(n) and u(n), respectively, and P is the covariance matrix of U. Then:
$$X = S + U, \qquad P = E\{U U^{T}\} = W Q W^{T} = \sigma_u^{2} I \tag{4}$$
It can be seen from Eq. (4) that, after an orthogonal wavelet transform, the noise remains white: it is distributed over the whole time axis at all scales with small amplitude, whereas the energy of the signal is concentrated in a few sparse, large-amplitude wavelet coefficients. Thresholding the small coefficients therefore suppresses the noise interference.
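As a minimal sketch of the wavelet threshold de-noising principle described above, the following Python code applies a single-level orthonormal Haar transform, soft-thresholds the detail coefficients and reconstructs the signal. The Haar wavelet, the single decomposition level, the universal threshold σ·sqrt(2 ln N), the test signal and all function names are illustrative assumptions; the wavelet basis and threshold rule used in the experiments are not specified in this section.

```python
import math
import random

def haar_forward(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    r = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / r, (a - d) / r])
    return x

def soft_threshold(c, thr):
    """Shrink a coefficient towards zero by thr; small coefficients become 0."""
    return math.copysign(max(abs(c) - thr, 0.0), c)

def denoise(x, sigma):
    """Threshold the detail coefficients with the universal threshold (assumed rule)."""
    approx, detail = haar_forward(x)
    thr = sigma * math.sqrt(2.0 * math.log(len(x)))
    detail = [soft_threshold(d, thr) for d in detail]
    return haar_inverse(approx, detail)

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

# Synthetic example: a slow sine wave (useful signal) plus Gaussian white noise.
random.seed(0)
n = 1024
sigma = 0.3
clean = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
noisy = [s + random.gauss(0.0, sigma) for s in clean]
denoised = denoise(noisy, sigma)
print("MSE before:", mse(noisy, clean), "MSE after:", mse(denoised, clean))
```

Because the transform is orthonormal, the noise keeps the same variance in every coefficient, while the slowly varying signal contributes almost nothing to the detail coefficients, so thresholding them removes noise energy at little cost to the signal. A practical implementation would normally use several decomposition levels.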
The raw signal contains not only the fault information but also interference generated by the interaction of many other components and by the external environment. Therefore, the raw signal is de-noised by the wavelet threshold method to reduce high-frequency interference. The normal operating condition is taken as an example to compare the signals before and after de-noising; Figure 3 shows the comparison.
In the EEMD method, yi(t) is obtained by repeatedly adding normally distributed white noise ni(t) of equal length to the raw signal y(t):
$$y_i(t) = y(t) + n_i(t) \tag{5}$$
where yi(t) is the signal obtained by adding white noise for the ith time.
Each yi(t) is decomposed by EMD to get the components cij(t) and the residual ri(t). Because the statistical mean of uncorrelated random sequences is zero, the components cij(t) are averaged over all trials to cancel the effect of the added white noise on the true IMFs. The final result of the EEMD decomposition is:
$$c_j(t) = \frac{1}{N} \sum_{i=1}^{N} c_{ij}(t) \tag{6}$$
where N is the number of added white noise sequences and cj(t) is the jth averaged IMF component.
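The ensemble averaging step can be illustrated with a short Python sketch. It adds white noise to the signal N times, decomposes each noisy copy and averages the resulting components. Note that `decompose` below is only a hypothetical stand-in (a moving-average split into one fast component and a slow residual), not the EMD sifting algorithm; the noise level, the number of trials and all names are likewise assumptions chosen for illustration. Real EMD may also return a different number of IMFs per trial, which this sketch does not handle.

```python
import math
import random

def moving_average(x, width=9):
    half = width // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def decompose(x):
    """Stand-in for EMD (NOT the real sifting procedure): returns a list
    with one fast component and a slow residual."""
    slow = moving_average(x)
    fast = [a - b for a, b in zip(x, slow)]
    return [fast], slow

def ensemble_decompose(y, n_trials=100, noise_std=0.2, seed=0):
    """Average the components obtained from n_trials noisy copies of y."""
    rng = random.Random(seed)
    n = len(y)
    sum_components = None
    sum_residual = [0.0] * n
    for _ in range(n_trials):
        # y_i(t) = y(t) + n_i(t): add a fresh white noise sequence
        y_i = [v + rng.gauss(0.0, noise_std) for v in y]
        components, residual = decompose(y_i)
        if sum_components is None:
            sum_components = [[0.0] * n for _ in components]
        for j, c in enumerate(components):
            for t in range(n):
                sum_components[j][t] += c[t]
        for t in range(n):
            sum_residual[t] += residual[t]
    # c_j(t) = (1/N) * sum_i c_ij(t)
    mean_components = [[v / n_trials for v in c] for c in sum_components]
    mean_residual = [v / n_trials for v in sum_residual]
    return mean_components, mean_residual

# Synthetic signal: a fast oscillation plus a slow oscillation.
n = 512
y = [math.sin(2 * math.pi * 50 * t / n) + math.sin(2 * math.pi * 3 * t / n)
     for t in range(n)]
mean_components, mean_residual = ensemble_decompose(y)
reference, _ = decompose(y)
max_deviation = max(abs(a - b) for a, b in zip(mean_components[0], reference[0]))
print("largest deviation from the noise-free component:", max_deviation)
```

With 100 trials the averaged component stays close to the component of the noise-free signal, which reflects the zero-mean property of the added noise that the averaging relies on.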
The signal under the normal operating condition is decomposed by the EEMD method, and the result is presented in Figure 4, which shows the first three layers of the decomposed signal. It can be seen that the Intrinsic Mode Function [...]

Rule-based algorithm for model division
The intelligent fault diagnostic system is intended to identify all kinds of faults at different speeds. The measured speed ranges from 1800 rpm to 2600 rpm. It is unrealistic to expect a single BN model to identify seven faults over such a large speed span. Therefore, separate models are established at the central speeds of 1900 rpm, 2200 rpm and 2500 rpm, and each model covers speeds up to 100 rpm away from its central speed.
In this way, all faults can be diagnosed and identified across the measured speed range.
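A minimal Python sketch of choosing the model for a measured speed is given below. The three central speeds, the 100 rpm span and the 1800-2600 rpm range are taken from the text; the nearest-central-speed rule, the returned flag and the function name are assumptions made for illustration.

```python
CENTRAL_SPEEDS_RPM = (1900, 2200, 2500)
MAX_SPAN_RPM = 100
SPEED_RANGE_RPM = (1800, 2600)

def select_model(speed_rpm):
    """Return (central speed of the nearest model, whether the measured
    speed lies within MAX_SPAN_RPM of that central speed)."""
    low, high = SPEED_RANGE_RPM
    if not low <= speed_rpm <= high:
        raise ValueError("speed outside the measured range")
    centre = min(CENTRAL_SPEEDS_RPM, key=lambda c: abs(c - speed_rpm))
    return centre, abs(centre - speed_rpm) <= MAX_SPAN_RPM

for speed in (1800, 2230, 2440):
    print(speed, select_model(speed))
```

The flag makes it explicit when a measured speed is further than 100 rpm from every central speed, in which case the nearest model is used outside the span it was built for.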
In each model, the seven faults are divided into three sub-models by the rule-based algorithm. Some feature values can clearly distinguish one or two types of [...]
All fault signals at the central speed of 2500 rpm are taken as an example. The rule-based algorithm is used to divide the faults into three sub-models. The meaning of each feature value node is shown in Table 3. The feature values that can clearly distinguish the faults are sorted in ascending order. It can be seen from [...] Table 4.
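The IF...THEN logic of the rule-based division might be sketched as follows. The feature names `f1` and `f2`, the interval bounds and the sub-model labels are hypothetical placeholders, since the actual feature values and thresholds are given in tables that are not reproduced here; only the idea of routing a sample to a sub-model when a distinguishing feature value falls in a given interval comes from the text.

```python
# Hypothetical rules: (feature name, lower bound, upper bound, sub-model).
RULES = [
    ("f1", 5.0, float("inf"), "sub-model 1"),
    ("f2", 0.0, 2.5, "sub-model 2"),
]

def assign_sub_model(features, rules=RULES, default="sub-model 3"):
    """Apply the rules in order and return the first matching sub-model."""
    for name, low, high, sub_model in rules:
        # IF low <= feature value < high THEN the sample goes to this sub-model
        if low <= features[name] < high:
            return sub_model
    # Samples that no rule separates are left to the remaining sub-model
    return default

print(assign_sub_model({"f1": 6.2, "f2": 3.0}))
print(assign_sub_model({"f1": 1.0, "f2": 1.0}))
print(assign_sub_model({"f1": 1.0, "f2": 3.0}))
```

Within each sub-model, the remaining faults would then be identified by the BN or BP neural network.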

Fault identification using BP neural networks
Neural networks have the capabilities of self-learning, nonlinear mapping, approximation of arbitrary functions and parallel computing, which provide a strong basis for constructing a new fault diagnostic model.
The learning process of BP neural networks is divided into two stages. The first stage is to input the known learning samples and calculate the output of each neuron through the network structure, weights and thresholds. The second stage is to modify the weight and threshold to minimize the total error E.
Let wij and θj be the weight and threshold between the input layer and the hidden layer, and wjk and θk be the weight and threshold between the hidden layer and the output layer. The output of each layer neuron is then:
$$x_j' = f\Big(\sum_i w_{ij} x_i - \theta_j\Big), \qquad y_k = f\Big(\sum_j w_{jk} x_j' - \theta_k\Big)$$
where f(•) is the activation function, xj' is the output of the hidden layer neuron, yk is the output of the output layer neuron and xi is the input of the hidden layer neuron.
When the sample p1 is input into the network, the output y_k^{p1} (k = 0, 1, …, m-1) is obtained. The error of the sample is the sum of the errors of the output units, where t_k^{p1} is the expected output for the corresponding sample. The total error for p learning samples can be defined as:
$$E_{sum} = \frac{1}{2} \sum_{p_1=1}^{p} \sum_{k=0}^{m-1} \big(t_k^{p_1} - y_k^{p_1}\big)^2$$
Assume that wsq is the connection weight between any two neurons in the network. The correction of each wsq by the gradient method is defined as:
$$\Delta w_{sq} = -\eta \sum_{p_1=1}^{p} \frac{\partial E_{p_1}}{\partial w_{sq}}$$
where η is the step size and E_{p1} is the error of sample p1. The gradient method moves the total error in the decreasing direction, and the iteration stops when ΔEsum = 0.
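The two learning stages can be sketched in plain Python as below: a forward pass through one hidden layer, followed by a batch gradient correction of all weights and thresholds. The sigmoid activation, thresholds subtracted from the weighted sums, the squared-error total with a factor of 1/2, the network size, the step size, the fixed number of epochs (instead of stopping when the error no longer changes) and the toy two-class data are all assumptions made for illustration, not the configuration used in the experiments.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, net):
    """Stage 1: compute hidden and output neuron outputs for one sample."""
    w_ih, th_h, w_ho, th_o = net
    h = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(len(x))) - th_h[j])
         for j in range(len(th_h))]
    y = [sigmoid(sum(w_ho[j][k] * h[j] for j in range(len(h))) - th_o[k])
         for k in range(len(th_o))]
    return h, y

def train(samples, targets, n_hidden=4, eta=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0]), len(targets[0])
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
    th_h = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_ho = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden)]
    th_o = [rng.uniform(-1, 1) for _ in range(n_out)]
    net = (w_ih, th_h, w_ho, th_o)
    history = []  # total error before each update
    for _ in range(epochs):
        g_ih = [[0.0] * n_hidden for _ in range(n_in)]
        g_th_h = [0.0] * n_hidden
        g_ho = [[0.0] * n_out for _ in range(n_hidden)]
        g_th_o = [0.0] * n_out
        total = 0.0
        for x, t in zip(samples, targets):
            h, y = forward(x, net)
            total += 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, y))
            # Error terms with respect to each neuron's weighted input
            d_o = [(y[k] - t[k]) * y[k] * (1 - y[k]) for k in range(n_out)]
            d_h = [h[j] * (1 - h[j]) * sum(w_ho[j][k] * d_o[k] for k in range(n_out))
                   for j in range(n_hidden)]
            for j in range(n_hidden):
                g_th_h[j] -= d_h[j]          # threshold enters with a minus sign
                for k in range(n_out):
                    g_ho[j][k] += d_o[k] * h[j]
                for i in range(n_in):
                    g_ih[i][j] += d_h[j] * x[i]
            for k in range(n_out):
                g_th_o[k] -= d_o[k]
        history.append(total)
        # Stage 2: correct every weight and threshold by -eta * summed gradient
        for j in range(n_hidden):
            th_h[j] -= eta * g_th_h[j]
            for k in range(n_out):
                w_ho[j][k] -= eta * g_ho[j][k]
            for i in range(n_in):
                w_ih[i][j] -= eta * g_ih[i][j]
        for k in range(n_out):
            th_o[k] -= eta * g_th_o[k]
    return net, history

def predict(x, net):
    y = forward(x, net)[1]
    return y.index(max(y))

# Toy data: two feature values per sample, two classes with one-hot targets.
samples = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
targets = [[1, 0], [1, 0], [0, 1], [0, 1]]
net, history = train(samples, targets)
print("total error, first and last epoch:", history[0], history[-1])
print("predicted classes:", [predict(x, net) for x in samples])
```

In a diagnostic setting, the inputs would be the extracted feature values and each output neuron would correspond to one fault type.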
The basic structure of BP neural networks is shown in Figure 7, where xn denotes the input layer, hn the hidden layer and on the output layer. [...] The diagnostic results using BP neural networks and BNs are shown in Figure 9.
It can be seen that the diagnostic accuracy of BP neural networks is similar to that of BNs in the model established at each central speed. For both identification methods, the accuracy is high near the central speed and lower on both sides. Furthermore, the diagnostic accuracy at speeds above the central speed is consistently higher than that at speeds below it. When the speed is above the central speed, the accuracy is generally higher than 80% even at a speed difference of 100 rpm. When the speed is below the central speed, the accuracy is higher than 80% up to a speed difference of 60 rpm and still exceeds 60% at a speed difference of 100 rpm. Only at the central speed of 2500 rpm is the diagnostic accuracy of BP neural networks slightly better than that of BNs; at 2440 rpm, the accuracy of BP neural networks is 8.9% higher than that of BNs. This result shows that BP neural networks cope well with high-speed, high-intensity vibration. The interval division of the BNs is defined manually, and a perfect interval division cannot be achieved with limited training data. When the diesel engine runs at high speed and fluctuates abnormally, the extracted feature values exceed the upper limit of the interval threshold, which decreases the diagnostic accuracy.
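The sensitivity of BNs to the interval division can be illustrated with a small Python sketch. A continuous feature value is mapped to an interval index, and the posterior probability of each condition is computed from a prior and a conditional probability table. The single feature, the interval edges, the probabilities, the two-condition setup and the naive structure (one fault node with independent feature nodes) are hypothetical values chosen for illustration, not the network learned in this work.

```python
import bisect

def discretize(value, edges):
    """Map a feature value to an interval index. Values beyond the outer
    edges fall into the first or last interval."""
    return bisect.bisect_right(edges, value)

def posterior(states, prior, cpt):
    """P(condition | observed interval states) for a naive BN structure."""
    scores = {}
    for condition, p in prior.items():
        for feature, state in states.items():
            p *= cpt[feature][condition][state]
        scores[condition] = p
    total = sum(scores.values())
    return {condition: p / total for condition, p in scores.items()}

# Hypothetical intervals for one feature: (<1.0), [1.0, 2.0), (>=2.0)
EDGES = [1.0, 2.0]
PRIOR = {"NOR": 0.5, "EL": 0.5}
CPT = {"f1": {"NOR": [0.80, 0.15, 0.05],
              "EL": [0.10, 0.30, 0.60]}}

for value in (1.9, 2.1):
    state = discretize(value, EDGES)
    print(value, state, posterior({"f1": state}, PRIOR, CPT))
```

In this example, a small change of the feature value from 1.9 to 2.1 crosses an interval boundary and noticeably shifts the posterior, which is the mechanism by which fluctuations of the feature values reduce the accuracy of BNs with fixed intervals.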

Figure 9
Diagnostic results using BP neural networks and BNs

Impact of different layers of signals on fault diagnosis
In the fault diagnostic process described above, the processed signal is the first layer signal decomposed by the EEMD method. However, the frequency range of each fault is different. For example, the vibration response caused by the combustion gas pressure is a low-frequency response mainly concentrated at 1 kHz, while the valve seat impact is a high-frequency response mainly concentrated at 2-4 kHz. It is therefore uncertain which layer or layers of the decomposed signal are best for fault diagnosis, since all first few layers of the [...] to identify the faults. The subsequent signal decomposition and feature value extraction are the same as those used for the first layer decomposed signal. Taking the fault signals at the central speed of 2200 rpm as an example, the result of fault diagnosis is shown in Figure 10. It can be inferred from Figure 10 that, when the speed is lower than the central speed, the diagnostic accuracy of BP neural networks is slightly better than that of BNs, with a maximum accuracy difference of 7.1% at 2100 rpm. However, when the speed exceeds the central speed, the BNs are clearly superior to the BP neural networks, and the higher the speed, the larger the difference; the maximum difference reaches 12.5% at 2300 rpm. On the whole, when the synthesized signal consisting of the second, third and fourth layers of the decomposed signals is used to identify the faults, BNs are superior to BP neural networks. The synthesized signal is composed of low-frequency signals, which naturally contain little high-frequency information. When the measured speed is lower than the central speed, the vibration intensity weakens and the extracted feature values fall below the lower limit of the interval threshold, which lowers the accuracy of fault diagnosis. Conversely, when the measured speed exceeds the central speed, the vibration intensity increases and the high-frequency information in the measured signals increases, which compensates for the lack of high-frequency information in the synthesized signal.
As a result, the intervals of the feature values are stable and the fault diagnostic accuracy of BNs is high. The accuracy of fault diagnosis with the first layer signal and with the synthesized signal at different speeds using BP neural networks and BNs can be obtained from Figure 9 and Figure 10. The trend lines of accuracy for the different fault diagnosis methods are similar. The maximum differences of accuracy using BP neural networks at the two ends of the trend line are 12.5% and 21.4%, respectively. When BNs are used to identify faults, the maximum differences of accuracy are 10.7% and 9%, respectively. The accuracy using the first layer signal is relatively high at high speeds, and the accuracy using the synthesized signal is high at low speeds. Since the EEMD algorithm decomposes the original signal according to its characteristic time scales, the first layer decomposed signal contains more high-frequency information and the synthesized signal contains more low-frequency information. When the speed of the diesel engine increases, the vibration frequency increases and the high-frequency signal contains more fault information, so the diagnostic accuracy using the first layer signal is high.
When the speed of the diesel engine decreases, the vibration frequency decreases and the low-frequency signal contains more fault information, so fault diagnosis using the synthesized signal performs better.

Influence of Gaussian white noise
In this section, the effect of Gaussian white noise on the accuracy of fault diagnosis is studied. Gaussian white noise is used to simulate the unknown real noise. In a real environment, noise is often not caused by a single source but is a compound of many different sources. The main noise source in most electronic systems, such as radar and communication systems, is thermal noise, which is typical Gaussian white noise. Figure 11 shows the accuracy of fault diagnosis when Gaussian white noise is added to the signal before signal de-noising. Gaussian white noise with a signal-to-noise ratio (SNR) from 0 dB to 10 dB is added to the measured signal. In Figure 11(a), the diagnostic accuracy of the signal with white noise is basically the same as that without white noise; the accuracy at an SNR of 0 dB is reduced by only 3.6% at 2230 rpm. It can be seen from Figure 11(b) that the accuracy at an SNR of 0 dB is reduced by 5.3% at 2230 rpm. It can be concluded from Figure 11(a) and Figure 11(b) that BNs are superior to BP neural networks in this case. Since the signal de-noising method can eliminate the added Gaussian white noise, the noise interference of the environment and the precision interference of the sensor can be eliminated as well.
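Adding Gaussian white noise at a prescribed SNR can be sketched in Python as follows. The noise power is set from the signal power as P_noise = P_signal / 10^(SNR/10), so an SNR of 0 dB means noise and signal have equal power. The function names and the synthetic sine signal are assumptions for illustration; rescaling the noise so that the requested SNR is met exactly for the generated sequence is a design choice of this sketch, not a step described in the text.

```python
import math
import random

def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus zero-mean Gaussian white noise at the given SNR (dB)."""
    rng = random.Random(seed)
    n = len(signal)
    signal_power = sum(v * v for v in signal) / n
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    raw_power = sum(v * v for v in noise) / n
    # Rescale so the generated noise has exactly the target power
    scale = math.sqrt(noise_power / raw_power)
    return [s + scale * v for s, v in zip(signal, noise)]

def measured_snr_db(clean, noisy):
    signal_power = sum(a * a for a in clean)
    noise_power = sum((b - a) ** 2 for a, b in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)

# Synthetic stand-in for a measured vibration signal.
clean = [math.sin(2 * math.pi * 25 * t / 2000) for t in range(2000)]
for snr_db in (0, 5, 10):
    noisy = add_white_noise(clean, snr_db)
    print(snr_db, round(measured_snr_db(clean, noisy), 6))
```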

Figure 11
Fault diagnostic results when white noise is added before signal de-noising: (a) BNs, (b) BP neural networks

Figure 12 shows the fault diagnostic accuracy when Gaussian white noise is added to the signal after signal de-noising. Similarly, Gaussian white noise with an SNR from 0 dB to 10 dB is added to the measured signal. It can be seen from Figure 12(a) that the fault diagnostic accuracy at an SNR of 0 dB is reduced by 5.3% at 2230 rpm and by 7.2% at 2200 rpm. As shown in Figure 12(b), the fault diagnostic accuracy at an SNR of 0 dB is reduced by 7.2% at 2230 rpm. It can be concluded from Figure 12(a) and Figure 12(b) that the method using BP neural networks is superior to the method using BNs. Combining Figure 11 and [...]

Influence of external excitation interference
In this section, external excitation is applied to the diesel engine, including continuous excitation (a motor-driven gear rubs against the lower end of the cylinder head) and intermittent excitation (a screw nut driven by a motor strikes the intake pipe of the diesel engine). The excitation is not applied directly to the cylinder head but is transferred to it through intermediate structures. [...] BP neural networks have stronger tolerance than BNs to external interference. The maximum difference of diagnostic accuracy reaches 17.8% when continuous excitation is applied to the diesel engine and 16% when intermittent excitation is applied. The reason is that, when BNs are used to identify faults, the interval division of each feature value is fixed and cannot be changed arbitrarily, and the differences between some interval boundaries are small. Therefore, when the diesel engine is disturbed by external excitation, the magnitude of the extracted feature value changes and moves it into a different interval, and the posterior probability of the fault changes accordingly.
The same feature value extracted at different speeds, or even at the same speed, has a relatively large variation range when the same excitation is applied to the diesel engine. The reason is as follows. When continuous excitation is applied, the contact tightness between the gear and the diesel engine changes with the vibration intensity of the engine, so the frictional excitation produced also changes. The generated excitation signal is therefore unstable, and the feature value varies over a wide range. Because the intermittent excitation is discontinuous, it may not be captured when the vibration signal is measured within one operating cycle. Therefore, the variation range of the feature value increases, and a feature value may suddenly increase at some speeds, which reduces the accuracy of fault diagnosis.

Conclusion
An intelligent fault diagnosis method for diesel engines that combines a rule-based algorithm with Bayesian/neural networks is proposed. The innovation of the method is that models trained with a small amount of data at a single central speed achieve fault diagnosis over a wide speed range. The method is applied to the Changchai EV80 two-cylinder diesel engine, and the main conclusions are summarized as follows.
(1) The diagnostic accuracy at speeds above the central speed is consistently higher than that at speeds below it. When the speed is above the central speed, the accuracy of fault diagnosis is generally higher than 80% even at a speed difference of 100 rpm. When the speed is below the central speed, the accuracy is higher than 80% up to a speed difference of 60 rpm and still exceeds 60% at a speed difference of 100 rpm. When the synthesized signal is used to identify faults, the method using BNs is superior to the method using BP neural networks, and the maximum difference of accuracy reaches 12.5% at 2300 rpm.
(2) When white noise is added to the signal before signal de-noising, the method using BNs is better than the method using BP neural networks, and its diagnostic accuracy is 1.7% higher at 2230 rpm. When white noise is added to the signal after signal de-noising, the method using BP neural networks is better than the method using BNs, and its diagnostic accuracy is 5.3% higher at 2300 rpm.
(3) The accuracy of fault diagnosis is greatly affected by external excitation interference, and the accuracy using BP neural networks is higher than that using BNs. When BP neural networks are used to identify the faults, the accuracy of fault diagnosis under continuous excitation exceeds 60%, and the highest accuracy exceeds 70%. When intermittent excitation is applied to the diesel engine, the accuracy fluctuates more strongly, ranging from 58.9% to 76.8%.