Rolling Bearing Fault Diagnosis Algorithm Based on Overlapping Group Sparse Model-Deep Complex Convolutional Neural Network

As the key component of a mechanical system, rolling bearings will cause paralysis of the entire mechanical system once they fail. In recent years, considering the high generalization ability and nonlinear modeling ability of deep learning, a rolling bearing fault diagnosis method based on deep learning has been formed, and good results have been achieved. However, because this kind of method is still in the initial development stage, its main problems are as follows. First, it is difficult to extract the composite fault signal feature of rolling bearing. Second, the existing deep learning rolling bearing fault diagnosis methods cannot well consider the problem of multi-scale information of rolling bearing signals. Therefore, this paper first proposes the overlapping group sparse model. It constructs weight coefficients by analyzing the salient features of the signal. It uses convex optimization techniques to solve the sparse optimization model, and applies the method to the feature extraction of rolling bearing composite faults. For the problem of multi-scale feature information extraction of rolling bearing composite fault signals, this paper proposes a new deep complex convolutional neural network model. This model fully considers the multi-scale information of rolling bearing signals. The complex information in this model not only contains rich representation ability, but also can extract more scale information. Finally, the classifier of this model is used to identify rolling bearing faults. Based on this, this paper proposes a new rolling bearing fault diagnosis algorithm based on overlapping group sparse model-deep complex convolutional neural network. The experimental results show that the method proposed in this paper can not only effectively identify rolling bearing faults under constant operating conditions, but also accurately identify rolling bearing fault signals under changing operating conditions. 
Additionally, the classification accuracy of the method proposed in this paper is greatly improved compared with traditional machine learning methods. It also has certain advantages over other deep learning methods.


I. INTRODUCTION
Identifying faults in time during equipment operation is of great significance to the safe operation of mechanical systems, because it reduces or avoids major economic losses and catastrophic accidents [1,2]. Rolling bearings are a key transmission component in the entire mechanical system structure; if they fail, the overall failure rate of the system increases, which can cause significant economic losses and even serious safety accidents. Therefore, research on rolling bearing fault diagnosis methods has always been a key issue in the field of fault diagnosis. These methods are generally divided into methods based on feature analysis and methods based on artificial intelligence [3][4][5].
Representative rolling bearing fault diagnosis methods based on feature analysis are as follows. Yang et al. [6] used a short-time Fourier transform to process the original vibration signal of variable-speed rotating machinery and used it for feature extraction. Bafroui et al. [7] used the discrete wavelet transform to study rotor fault feature extraction. Although this transform has good reconstruction characteristics and other advantages, it has problems such as the loss of the high-frequency part of the fault characteristic information. Georgoulas et al. [8] used the empirical mode decomposition (EMD) method to obtain the original characteristics of the vibration signals of normal and faulty bearings, thereby realizing anomaly recognition for rolling bearings. Yu et al. [9] used ensemble empirical mode decomposition (EEMD) and singular value decomposition to obtain useful fault features of rolling bearings and used these features for identification. Liu et al. [10] used the wavelet transform and singular value decomposition to preprocess the signal, used locally linear embedding (LLE) to reduce the dimensionality of the feature space, and then performed fault diagnosis on the resulting low-dimensional feature set. Zhang et al. [11] used neighborhood preserving embedding (NPE) to reduce the dimensionality of high-dimensional original feature sets and achieved better bearing fault diagnosis results. Although the above methods have been widely used and promoted in the field of rolling bearing fault diagnosis, they have problems such as manually specified feature information, weak adaptive ability, and poor robustness. Therefore, better rolling bearing fault diagnosis methods have been studied in the industry.
Fault diagnosis methods for rolling bearings based on artificial intelligence are divided into methods based on machine learning and methods based on deep learning. Representative methods based on machine learning are as follows. Muruganatham et al. [12] used singular value decomposition (SVD) and feedforward back-propagation neural networks (BPNN) to diagnose different faults of rolling bearings. Ali et al. [13] used a BPNN as a classifier to diagnose the running state of the rolling elements and the inner and outer rings of the bearing. Zhang et al. [14] used the intercluster distance (ICA) in the feature space to optimize the support vector machine (SVM) to achieve bearing fault detection and classification. Li et al. [15] proposed fault diagnosis of rolling bearings based on a binary tree SVM model. Uddin et al. [16] proposed an enhanced k-nearest neighbor (KNN) classification algorithm to realize bearing fault diagnosis. Compared with the methods based on feature analysis, the methods based on machine learning have an improved diagnosis effect and adaptive ability. However, they have problems such as weak self-learning ability and weak robustness in the modeling process.
In this context, deep learning [17,18] offers a high degree of generalization ability and strong feature extraction ability, and it has been widely used in machine vision, image classification and natural language processing. Researchers therefore introduced deep learning to the field of fault diagnosis, which formed the class of fault diagnosis methods based on deep learning. Shao et al. [19] used a deep belief network (DBN) for intelligent state monitoring of induction motors, in which the DBN automatically extracts relevant features of the vibration signals for state recognition. Shao et al. [20] used a DBN optimized by particle swarm optimization for rolling bearing fault diagnosis and introduced stochastic gradient descent to fine-tune the weights in the training process of the restricted Boltzmann machines (RBM). Finally, the optimized DBN was used for fault diagnosis. Jiang et al. [21] proposed a multilayer deep learning convolutional neural network (CNN) for fault diagnosis of rolling bearings. Wang et al. [22] proposed an adaptive convolutional neural network method and applied it to the fault diagnosis of rolling bearings. Islam et al. [23] used wavelet analysis to extract signal characteristics and then used a deep learning model to classify and monitor rolling bearing faults. Zhou et al. [24] used a deep learning model to process vibration signals directly and combined it with a regional adaptive method to diagnose faulty bearings, which improves the diagnostic effect of the model. Cabrera et al. [25] combined a deep convolutional neural network with a long short-term memory (LSTM) network and used the combined model to estimate the bearing state, achieving a better fault diagnosis effect.
In summary, although deep learning models have been applied to the fault diagnosis of rolling bearing equipment, adaptive feature extraction and fault diagnosis based on deep learning still have the following problems. First, it is difficult to extract the composite fault signal features of rolling bearings. Second, the existing methods cannot well consider the multi-scale information of rolling bearing signals. Regarding the first problem, composite fault signals are not only sparse but also contain interrelated structures. Therefore, this paper represents the signal through the sparse characteristics of structural groups based on the overlapping group sparse model and then constructs the weight coefficients by analyzing the salient features of the signal. Existing convex optimization techniques are used to solve the sparse optimization model. This method is applied to the feature extraction of weak composite faults of rolling bearings, where it addresses the difficulty of extracting the features of composite fault signals. Regarding the second problem, this paper proposes a deep complex convolutional neural network model that fully considers the multi-scale information characteristics of rolling bearing signals. The difference in scale characteristic information is used to distinguish fault category information. Complex-valued information not only has a rich representation ability but also promotes the memory and retrieval of fault information. Based on the above, this paper proposes a new rolling bearing fault diagnosis algorithm based on overlapping group sparse model-deep complex convolutional neural network. Finally, experiments verify that the model identifies rolling bearing faults well and is robust. Section II describes the overlapping group sparse model. Section III introduces the deep complex convolutional neural network model. Section IV establishes the rolling bearing fault diagnosis algorithm based on overlapping group sparse model-deep complex convolutional neural network. Section V conducts an experimental analysis of the proposed method and compares it with other mainstream methods. Finally, Section VI summarizes and analyzes the content of the paper.

A. Group Sparse Model
To solve the regularized inverse problem, x is recovered from the observed rolling bearing signal y, where y=x+w. Assuming that x itself is nonsparse, a certain sparseness appears after x is transformed, as expressed in formula (1), in which θ is the sparse representation coefficient. The process of solving for θ is a sparse approximation process. The sparse model minimizes the residual sum of squares together with a term I(θ), the regularization penalty function that induces a sparse solution θ. The choice of I(θ) depends on the knowledge of the sparse structure of the solution θ. If θ is sparse, the regularization function I(θ) can be chosen as the l1 norm:

$$ I(\theta) = \lambda \|\theta\|_1 $$

In the formula, λ>0 is the penalty parameter, which adjusts the degree of compression. The greater λ is, the greater the degree of compression and the more coefficients approach zero. Conversely, the smaller λ is, the smaller the degree of compression and the more coefficients are retained. Adding this l1 norm penalty to the minimization of the residual sum of squares defines the Lasso model. Because it can obtain sparse solutions of high-dimensional data, the Lasso model is widely used for feature selection of high-dimensional data. It applies the same penalty function to each variable. In other words, it compresses the coefficients of all variables to the same degree.
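The effect of the penalty parameter λ can be illustrated with a minimal sketch. It assumes the simplest special case, in which the coefficients are observed directly (no transform), so that the l1-penalized least-squares problem has the closed-form soft-thresholding solution. The function name `soft_threshold` and the sample values are illustrative and are not taken from the paper.

```python
import numpy as np

def soft_threshold(y, lam):
    """Closed-form minimizer of 0.5*||y - theta||_2^2 + lam*||theta||_1."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# Illustrative coefficients: two large entries and three small ones.
y = np.array([3.0, -0.4, 0.2, -2.5, 0.1])

theta_small = soft_threshold(y, lam=0.3)  # mild compression
theta_large = soft_threshold(y, lam=1.0)  # strong compression

# A larger lam leaves fewer nonzero coefficients.
print(np.count_nonzero(theta_small), np.count_nonzero(theta_large))
```

Every coefficient is shrunk by the same amount λ, which is the "same degree of compression for each variable" behaviour described above; with the larger λ only the two large coefficients remain nonzero.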
In some cases, there is a relationship between the coefficients, and the related coefficients can be regarded as a whole. The Lasso model is not suitable for handling this relationship and can be replaced by the group Lasso. Group Lasso is a further extension of Lasso that adds constraints to a set of coefficient vectors, so that coefficient compression is implemented from the perspective of the group. If θ is group sparse, then I(θ) is the group Lasso penalty function, which can be expressed as:

$$ I(\theta) = \lambda \sum_{i=1}^{I} \|\theta_i\|_2 $$

In the formula, i=1,2,…,I, and all coefficients are divided into I groups. As λ changes, the coefficients of each sub-model vector are either all zero or all nonzero. Given that the coefficient vectors are divided into groups, group Lasso treats each group of coefficients as a "single" vector for selection. If the coefficients in a group are not zero, then all coefficients in this group are selected. Conversely, if the coefficients of a group are all zero, then the whole group is discarded. In the above formula, λ>0 controls the amount of contraction. The larger λ is, the more severe the compression and the closer the corresponding θi is to zero, so that complete groups that do not contribute are removed from the model. Conversely, if θi≠0, then all the coefficients in group i are nonzero and are selected into the model, which realizes selection from the perspective of the variable group. The selection effect of group Lasso is shown in Fig. 1. In Fig. 1, u1, u2,…, u12 each represent a component in the sparse group.
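The all-or-nothing selection of group Lasso can be sketched in the same simplified denoising setting (coefficients observed directly, disjoint groups), where the minimizer is a group-wise shrinkage. The function name `group_soft_threshold`, the grouping and the sample values are assumptions made for illustration only.

```python
import numpy as np

def group_soft_threshold(y, groups, lam):
    """Minimize 0.5*||y - theta||_2^2 + lam * sum_i ||theta_i||_2 for disjoint groups."""
    theta = np.zeros_like(y, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(y[idx])
        if norm > lam:
            # The whole group is kept and shrunk by a common factor.
            theta[idx] = (1.0 - lam / norm) * y[idx]
        # Otherwise the whole group stays at zero (it is discarded).
    return theta

y = np.array([0.3, -0.2, 0.1, 2.0, -1.5, 1.0])
groups = [np.arange(0, 3), np.arange(3, 6)]  # two non-overlapping groups
theta = group_soft_threshold(y, groups, lam=1.0)
print(theta)
```

The first group has a small Euclidean norm and is removed as a whole, while every coefficient of the second group is retained, which mirrors the group-level selection described above.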

B. Overlapping Group Sparse Model
The group sparse model does not consider that different sub-model vectors may contain some of the same variables. Combinations of such shared variables are excluded by the group sparse model, which limits it in practical applications. In reality, groups often overlap, that is, some variables belong to more than one group. The aforementioned non-overlapping group sparse model is no longer applicable to this situation. Therefore, this paper proposes an overlapping group sparse model, which allows variables of different groups to overlap and introduces the prior information of the "overlapping" structure into the model.
If θ is group sparse and the groups overlap with one another, the regularization function I(θ) can be chosen as the overlapping group Lasso penalty function, whose expression is given in [26]. The experimental results are optimized by adjusting the values of the parameters K and λ in that expression. Reference [27] noted what happens when the groups have overlapping structures but the group Lasso without overlapping structure in formula (5) is used to select variables. Because group {u1,u2,u3,u4,u5} is not selected, variables u4 and u5 are discarded together with it. Group {u4,u5,u6,u7,u8,u9} is selected, but its variables u4 and u5 have already been discarded, so the final selection does not contain u4 and u5. The specific results are shown in Fig. 2. The reason for this situation is that the variables u4 and u5 shared by group {u1,u2,u3,u4,u5} and group {u4,u5,u6,u7,u8,u9} are not taken into account. The variable selection effect of the overlapping group Lasso is shown in Fig. 3, in which group {u4,u5,u6,u7,u8,u9} is completely selected. Table 1 shows the structural sparsity characteristics and algorithm complexity of each structural sparsity model, where N is the number of samples, P is the sample dimension, max{d1,…,d|G|} is the maximum dimension of the sub-model vectors in the groups, and |G| is the cardinality of the group set G.
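The role of the parameters K and λ can be sketched with an overlapping group shrinkage iteration. The sketch assumes that the penalty of [26] takes the fully overlapping form λ Σ_i (Σ_{k=0}^{K−1} |θ(i+k)|²)^{1/2}, in which every run of K consecutive coefficients forms a group, and that the signal is observed directly in noise; under these assumptions a majorization-minimization update can be used. The function name `ogs_denoise`, the iteration count, the constant `eps` and the test signal are illustrative and are not taken from the paper.

```python
import numpy as np

def ogs_denoise(y, K, lam, n_iter=50, eps=1e-12):
    """Overlapping group shrinkage by majorization-minimization (a sketch)."""
    y = np.asarray(y, dtype=float)
    x = y.copy()
    h = np.ones(K)
    for _ in range(n_iter):
        # Euclidean norm of every length-K group (stride 1, zero-padded at the ends).
        r = np.sqrt(np.convolve(x ** 2, h, mode="full"))
        # For each sample, sum the inverse norms of the K groups that contain it.
        v = np.convolve(1.0 / (r + eps), h, mode="valid")
        # Samples lying only in low-energy groups are shrunk strongly.
        x = y / (1.0 + lam * v)
    return x

rng = np.random.default_rng(0)
clean = np.zeros(100)
clean[40:45] = 5.0  # one burst of five adjacent large values
noisy = clean + 0.3 * rng.normal(size=100)
denoised = ogs_denoise(noisy, K=5, lam=1.0)
```

Because each sample is penalized through all K groups that contain it, a large value surrounded by other large values is preserved, while isolated noise is driven toward zero. Larger K favors longer bursts and larger λ gives stronger shrinkage, so both would be tuned for a given signal.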

A. Basic Principles of Deep Complex Convolutional Neural Network Model
At present, the operations and representations of common deep neural networks are all based on the real number domain, but complex-valued signals appear more and more frequently in practical applications. Theoretical analysis shows that complex numbers not only have a richer representation ability but also help to retrieve the memory of signal features. However, there are relatively few studies on building deep convolutional neural network modules based on complex numbers. The deep complex convolutional neural network of [28] introduced complex batch normalization, a complex weight initialization strategy and an end-to-end training scheme, and it has been applied to music transcription tasks. Therefore, this section proposes a deep complex convolutional neural network for rolling bearing signals, which can fully consider the multi-scale information of rolling bearing signals. Different scale information can distinguish the fault category. The deep complex convolutional neural network includes the complex convolution operation, the complex pooling operation, the complex activation function and complex classifier optimization. Its specific content is as follows:
(1) Complex convolution operation. In the complex convolution layer, the input complex signal is x+iy, the complex weight is w, and the offset is c. Then the convolution operation of the nth channel can be expressed by the following formula:

$$ w_n * (x + iy) + c_n = \left(v * x - u * y + \mathrm{Re}(c_n)\right) + i\left(u * x + v * y + \mathrm{Im}(c_n)\right) $$

In the formula, $i=\sqrt{-1}$ is the imaginary unit. w_n=v+iu represents the complex connection weight of the convolutional neural network, that is, the convolution kernel of the deep complex convolution network. c_n represents the complex offset, whose real and imaginary parts are real numbers. u and v are real number matrices, and x and y are real number vectors. Re(Γ) and Im(Γ) represent the real and imaginary parts of the complex number Γ, respectively. It can be seen in Fig. 4 that a complex convolution with a convolution kernel of v+iu is equivalent to a real network with two convolution kernels [v, u] and [−u, v].
(2) Complex pooling operation. Performing a convolution operation on a neighborhood in the image yields the neighborhood features of the image. Summarizing the features of different locations is called pooling. Its main purpose is to reduce the amount of calculation by reducing the dimensionality of the input data, and it has translation invariance. In the pooling layer, the input is broken down into blocks, and each block is replaced with a single value. The complex pooling process with a kernel size of 2×2 and a stride of 2 is shown in Fig. 5. To fully preserve the integrity of the input data, the complex pooling proposed in this paper keeps the real part and the imaginary part of each complex number in one-to-one correspondence. A complex number contains a real part and an imaginary part, so complex numbers cannot be compared directly. The complex random pooling method follows the same rules as random pooling in the real number domain: it calculates the probability of each feature point in the neighborhood, and the greater this probability, the more likely the feature point is to be selected. Complex maximum pooling is achieved by calculating the moduli of the complex numbers and outputting the complex number with the largest modulus. It can reduce the mean shift caused by the parameter error of the convolutional layer and retains more texture information. For complex numbers z_j=x_j+y_j i in a neighborhood, complex maximum pooling is calculated as follows:

$$ z_{\max} = z_{k}, \quad k = \arg\max_{j} |z_j| = \arg\max_{j} \sqrt{x_j^2 + y_j^2} $$

Complex average pooling averages the real and imaginary parts of the feature points in the neighborhood. It can retain more [...]
(3) Complex activation function. When the activation is written as u(x,y)+iv(x,y), the transfer characteristic of the complex number still satisfies boundedness and differentiability. The complex activation function used here has two parameters, c and r, which are positive real numbers. This nonlinear function is suitable for training a feedforward neural network and can be used for classification problems. The function is bounded, and its derivative is also bounded.
(4) Complex batch normalization. As the number of layers of the deep neural network increases, the output values of the last few layers of the model move closer to 0. However, the output value is a multiplication factor of the gradient, so the backpropagated gradient becomes very small and the parameters are difficult to update. To avoid this, this paper proposes a complex form of the batch normalization strategy, which can reduce the influence of the initialization parameters on model training [29]. The covariance matrix V used in complex batch normalization is:

$$ V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} = \begin{pmatrix} \mathrm{cov}(\mathrm{Re}(h), \mathrm{Re}(h)) & \mathrm{cov}(\mathrm{Re}(h), \mathrm{Im}(h)) \\ \mathrm{cov}(\mathrm{Im}(h), \mathrm{Re}(h)) & \mathrm{cov}(\mathrm{Im}(h), \mathrm{Im}(h)) \end{pmatrix} $$

In the formula, h represents the activation vector. After normalization, h has mean value uB=0, covariance K=1 and pseudocovariance matrix C=0. The displacement parameter and the scaling parameter γ are to be learned. γ is a positive semidefinite matrix, which can be expressed as:

$$ \gamma = \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{12} & \gamma_{22} \end{pmatrix} $$

(5) Fully connected layer. In the entire deep complex convolutional network, the function of the complex fully connected layer is to map the distributed feature representation learned by the complex convolution layer, the complex maximum pooling layer and the complex activation function layer to the sample label space. Its purpose is to reduce the dimension of the original rolling bearing signal input and then to use all the feature information to the maximum extent through the fully connected layer. The fully connected layer can be regarded as a convolutional layer with a 1×1 convolution kernel, that is, it transforms the input data into a one-dimensional vector and then takes the dot product with this vector. In the corresponding formula, U={u1,...,um} is the input and O={o1,...,om} is the output.
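The complex convolution of item (1) and the modulus-based maximum pooling of item (2) can be sketched as follows. For brevity the sketch uses one-dimensional signals, a single channel and no offset, and it relies on NumPy's true convolution rather than the cross-correlation used by most deep learning libraries; the decomposition into real convolutions is the same in either case. The function names `complex_conv1d` and `complex_max_pool1d` are hypothetical.

```python
import numpy as np

def complex_conv1d(x, y, v, u):
    """Convolve the input x + iy with the kernel v + iu using four real convolutions."""
    real = np.convolve(x, v, mode="valid") - np.convolve(y, u, mode="valid")
    imag = np.convolve(x, u, mode="valid") + np.convolve(y, v, mode="valid")
    return real, imag

def complex_max_pool1d(z, size=2):
    """In each non-overlapping block, keep the entry with the largest modulus."""
    n = (len(z) // size) * size
    blocks = z[:n].reshape(-1, size)
    idx = np.argmax(np.abs(blocks), axis=1)
    return blocks[np.arange(len(blocks)), idx]

rng = np.random.default_rng(0)
x, y = rng.normal(size=16), rng.normal(size=16)  # real and imaginary input parts
v, u = rng.normal(size=3), rng.normal(size=3)    # real and imaginary kernel parts

real, imag = complex_conv1d(x, y, v, u)
# Reference result computed directly in complex arithmetic.
reference = np.convolve(x + 1j * y, v + 1j * u, mode="valid")

pooled = complex_max_pool1d(np.array([1 + 1j, 3 - 4j, 0.5j, -2 + 0j]))
```

Pooling on the modulus keeps the real and imaginary parts of the selected value together, so the output of each block is one of the original complex numbers rather than a mixture of parts from different entries.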
(6) Complex output layer. This paper applies the superposition state in quantum mechanics to the deep complex convolutional neural network model. The rolling bearing signal is transmitted to the output layer after several convolution and pooling layers. The signal amplitudes of the rolling bearing are initialized so that their corresponding probabilities are equal and sum to 1. Every time a quantum observation is performed, each signal amplitude is mapped to a position that satisfies the probability distribution. Finally, the signal is classified by probability. The function expressing the quantum state of a particle must satisfy the normalization condition, that is, the total distribution probability of the particle is equal to 1. The output value of each complex node is normalized by the Softmax nonlinear function to obtain the probability of the corresponding fault category, and these probabilities satisfy the constraint of a probability distribution, namely that they sum to 1. The l2 norm of the complex-domain input z=x+yi (x,y ∈ R) can be defined as

$$ \|z\|_2 = \sqrt{x^2 + y^2} $$

For Softmax regression, the probability that the K-class classifier assigns class k to its input feature z is P(y=k|z,θ). In the formula, θ1, … ,θk ∈ Q n are the parameters of the complex convolutional neural network model. In the associated cost function I(θ), when Y(g) is h, the indicator function J{Yg=h}=1; otherwise it is 0.
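A minimal sketch of the output normalization described above, assuming that the Softmax function is applied to the l2 norm (modulus) of each complex output node; the paper's exact parameterization through θ is not reproduced here, and the function name `complex_softmax` and the sample values are illustrative.

```python
import numpy as np

def complex_softmax(z):
    """Map complex outputs to class probabilities through their moduli."""
    m = np.abs(z)             # modulus sqrt(x^2 + y^2) of each output node
    e = np.exp(m - m.max())   # shift by the maximum for numerical stability
    return e / e.sum()

z = np.array([3 + 4j, 1 - 1j, 0.5j])  # three complex output nodes
p = complex_softmax(z)
print(p, p.sum())
```

The resulting vector is nonnegative and sums to 1, so it satisfies the normalization condition required of a probability distribution, and the node with the largest modulus receives the largest probability.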
The parameter θ of I(θ) can be updated by the gradient descent method. In the gradient expression, t=1 means that the gradient is calculated with respect to the real part, and t=2 means that it is calculated with respect to the imaginary part; the θ vector is then updated with these two gradients. From the above analysis, it can be seen that the deep complex convolutional neural network model extends the real number domain model to the complex number domain. Therefore, the overall framework of the model consists of several repeated complex convolution layers, complex pooling layers and complex activation functions.
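The separate update of the real part (t=1) and the imaginary part (t=2) can be sketched on a toy problem. The example below fits a single complex weight by gradient descent on a mean squared error; the loss, the learning rate and all variable names are assumptions chosen for illustration and do not reproduce the paper's cost function.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=64) + 1j * rng.normal(size=64)  # complex inputs
w_true = 0.8 - 0.3j                                 # weight to be recovered
target = w_true * z

w_re, w_im = 0.0, 0.0  # real and imaginary parts are stored separately
lr = 0.1
for _ in range(200):
    err = (w_re + 1j * w_im) * z - target
    # For L = mean(|err|^2): dL/dw_re = Re(g) and dL/dw_im = Im(g),
    # with g = 2 * mean(err * conj(z)).
    g = 2.0 * np.mean(err * np.conj(z))
    w_re -= lr * g.real  # t = 1: update of the real part
    w_im -= lr * g.imag  # t = 2: update of the imaginary part

w_est = w_re + 1j * w_im
print(w_est)
```

Treating the real and imaginary parts as two real parameters lets an ordinary real-valued gradient descent routine train a complex-valued model.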

B. Process Description of Deep Complex Convolutional Neural Network Model
Weight optimization is a key part of the deep complex convolutional neural network model. In the weight update process, the gradient descent algorithm based on deep complex convolutional neural network is introduced in this section. The training process is shown in Algorithm 1 of Table 2.
(1) Batch normalization to initialize complex weights. This paper uses the random initialization method to initialize the weights of the deep complex convolutional neural network model. Its expression is as follows:

$$ W = |W| e^{i\theta} = |W| \cos\theta + i\,|W| \sin\theta $$

In the formula, |W| and θ represent the magnitude and phase of the weight W, respectively.
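A minimal sketch of random complex weight initialization in polar form. It assumes the scheme commonly used for deep complex networks such as [28]: the magnitude is drawn from a Rayleigh distribution and the phase from a uniform distribution on [−π, π]. The Glorot-style choice of the scale parameter and the function name `init_complex_weights` are assumptions and are not taken from the paper.

```python
import numpy as np

def init_complex_weights(fan_in, fan_out, rng):
    """Draw complex weights with Rayleigh magnitude and uniform phase."""
    # Assumed Glorot-style scale: E[|W|^2] = 2 * sigma^2 = 2 / (fan_in + fan_out).
    sigma = 1.0 / np.sqrt(fan_in + fan_out)
    magnitude = rng.rayleigh(scale=sigma, size=(fan_in, fan_out))
    phase = rng.uniform(-np.pi, np.pi, size=(fan_in, fan_out))
    return magnitude * np.exp(1j * phase)

rng = np.random.default_rng(0)
W = init_complex_weights(64, 32, rng)
print(W.shape, np.mean(np.abs(W) ** 2))
```

Sampling the magnitude and phase independently fixes the variance of the weights through the magnitude alone, whatever the phase distribution.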
The complex batch normalization operation decorrelates the real and imaginary parts of a unit, which can reduce the risk of overfitting. The specific normalization process is shown in Algorithm 2 of Table 3. The weights are then adjusted so that the loss function L(W) reaches its minimum value. Because the magnitude of each gradient value is very small, the update relies on gradient accumulation; after multiple accumulations, the gradient noise is absorbed by the accumulated gradient. For large data sets, evaluating the gradient over the entire data set is very costly, and the general way of handling this is the stochastic gradient descent method [30]. Accordingly, for deep complex convolutional neural networks, the real part and the imaginary part of the weights are updated according to formula (7) and formula (8), respectively.
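The decorrelation step of complex batch normalization can be sketched as a whitening of the two-dimensional vector formed by the real and imaginary parts of a batch of activations. The sketch covers only the normalization, not the learned scaling and displacement parameters, and it is not a transcription of Algorithm 2; the function name `complex_whiten` and the constant `eps` are illustrative.

```python
import numpy as np

def complex_whiten(h, eps=1e-5):
    """Center a batch of complex activations and whiten its real/imaginary parts."""
    x = np.stack([h.real - h.real.mean(), h.imag - h.imag.mean()])  # shape (2, N)
    V = x @ x.T / x.shape[1] + eps * np.eye(2)  # 2x2 covariance matrix
    # Inverse matrix square root of the symmetric positive definite V.
    vals, vecs = np.linalg.eigh(V)
    V_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    xw = V_inv_sqrt @ x
    return xw[0] + 1j * xw[1]

rng = np.random.default_rng(0)
real_part = rng.normal(size=4096)
imag_part = 0.7 * real_part + 0.2 * rng.normal(size=4096) + 3.0  # correlated, shifted
h_white = complex_whiten(real_part + 1j * imag_part)
```

After whitening, the real and imaginary parts have zero mean and approximately unit variance and are uncorrelated, whereas normalizing each part separately would leave their correlation in place.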
In order to improve the convergence speed of stochastic gradient descent, this paper adopts the batch norm conversion method. The average value of the weight parameters is calculated first, and the loss function is then calculated based on this average weight W_t. At the same time, standardization is applied to the rolling bearing data. Its purpose is to reduce the difference in the order of magnitude of the weight parameters.

IV. ROLLING BEARING FAULT DIAGNOSIS ALGORITHM BASED ON OVERLAPPING GROUP SPARSE MODEL-DEEP COMPLEX CONVOLUTIONAL NEURAL NETWORK
In this section, on the basis of Section II and Section III, this paper proposes a rolling bearing fault diagnosis algorithm based on overlapping group sparse model-deep complex convolutional neural network. The basic steps of the proposed fault diagnosis algorithm are as follows:
(1) Data collection. Relevant rolling bearing data under different health conditions are obtained from public rolling bearing data sets.
(3) Data feature extraction. The model proposed in Section II of this paper is used to extract features from the rolling bearing data, which yields richer and more complete feature information. It better solves the problem of the difficulty in extracting features of composite fault signals of rolling bearings.
(4) Multi-scale information extraction and fault classification. The model proposed in Section III of this paper is used to extract multi-scale information from the rolling bearing data, which yields the different scale information of rolling bearing fault signals. Finally, the classifier of the model is used to classify and identify rolling bearing faults.
(5) Actual test. The test samples are input into the rolling bearing fault diagnosis model trained in steps (1) to (4), and the model gives the rolling bearing fault category or result.
The basic framework diagram of the rolling bearing fault diagnosis algorithm proposed in this paper is shown in Fig. 6.
The fault data are summarized in Table 4. Vibration signals of four failure levels are collected for the rolling elements and the inner ring, and vibration signals of three failure levels are collected for the outer ring. All vibration signals are collected under 0, 1, 2 and 3 hp motor loads. The sampling frequency is 12 kHz. Norm means no fault. G1, G2, G3, and G4 indicate rolling element failures of 0.007, 0.014, 0.021, and 0.028 inches, respectively. IR1, IR2, IR3, and IR4 indicate inner ring failures of 0.007, 0.014, 0.021, and 0.028 inches, respectively. OR1, OR2, and OR3 indicate outer ring failures of 0.007, 0.014, and 0.021 inches, respectively. The experiment separately studies the recognition performance of the proposed method for fault data under constant working conditions and under changing working conditions.
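For reference, the health states listed above can be collected into a small lookup table of the kind a data-loading script might use. The labels, fault sizes, motor loads and sampling frequency come from the text; the variable names and the order of the class indices are illustrative assumptions.

```python
# Label -> (fault location, fault size in inches), as listed in the text.
HEALTH_STATES = {
    "Norm": ("none", None),
    "G1": ("rolling element", 0.007),
    "G2": ("rolling element", 0.014),
    "G3": ("rolling element", 0.021),
    "G4": ("rolling element", 0.028),
    "IR1": ("inner ring", 0.007),
    "IR2": ("inner ring", 0.014),
    "IR3": ("inner ring", 0.021),
    "IR4": ("inner ring", 0.028),
    "OR1": ("outer ring", 0.007),
    "OR2": ("outer ring", 0.014),
    "OR3": ("outer ring", 0.021),
}

MOTOR_LOADS_HP = (0, 1, 2, 3)  # loads under which signals are collected
SAMPLING_RATE_HZ = 12_000      # 12 kHz

# Hypothetical integer class index for each label (order is an assumption).
CLASS_INDEX = {name: i for i, name in enumerate(HEALTH_STATES)}
print(len(CLASS_INDEX))
```

One normal state, four rolling element levels, four inner ring levels and three outer ring levels give twelve health states in total.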
For constant conditions, this experiment divides the data into two groups to verify the method proposed in this paper. Each group contains normal bearing data, inner ring failure data, outer ring failure data, and rolling element failure data. The specific grouping information is shown in Table 5. The time-domain waveform of a data sample in each of the above four working states is shown in Fig. 7. Fig. 7 shows that the time-domain waveform of the bearing vibration signal has nonlinear and nonstationary characteristics. For this signal, this paper first uses the model proposed in Section II to characterize the fault. Then, the deep model proposed in Section III is used to extract the scale information of the rolling bearing signals. Finally, the classifier of the deep complex convolutional neural network model is used to identify the type and degree of the bearing fault.
To verify the effectiveness and advantages of the method proposed in this paper, the traditional SVM method [31], the CNN method [32] and the deep CNN method [33] are also used to perform fault analysis on the group 1 data. To avoid the contingency of the fault diagnosis results, the above four methods are all run 10 times. The specific results are shown in Table 7 and Fig. 8. Similarly, the method in this paper, the traditional SVM method [31], the CNN method [32] and the deep CNN method [33] are applied to the group 2 data. The experimental results are shown in Table 8 and Fig. 9-Fig. 10. From Table 7 to Table 8 and Fig. 8 to Fig. 9, we can see that for group 1 and group 2, the classification accuracy of the SVM method is approximately 93%, whereas the CNN method achieves 98% classification accuracy. Furthermore, the classification accuracy of deep CNN is 1% higher than that of the CNN method because this method is an optimized deep learning model.
This shows that the optimized deep learning method has a certain effect on improving the accuracy of rolling bearing fault diagnosis. The method in this paper has the highest classification accuracy among all methods, namely 100%. The proposed method therefore not only greatly improves on the classification accuracy of the traditional SVM method but also improves to a certain degree on the classification accuracy of deep learning methods such as CNN and deep CNN. This shows that the proposed method is highly adaptable to rolling bearing signals and can better extract their fault feature information and multiscale information. It also verifies that the proposed rolling bearing fault diagnosis algorithm has good stability and robustness. This is mainly because the proposed method is designed around the characteristics of rolling bearing signals: it not only solves the problem of difficult extraction of rolling bearing signal features but also better extracts the multiscale information of rolling bearing signals.
(3) Diagnosis results and comparative analysis under changing working conditions. To further verify the effectiveness of the fault diagnosis method proposed in this paper, the experimental process is similar to that of experiment (2). This part uses the method of this paper, the traditional SVM method [31], the CNN method [32] and the deep CNN method [33] to conduct experiments on the data samples of group 1 and group 2 under changing conditions. To avoid the contingency of the fault diagnosis results, the above four methods are all run 10 times. The specific results are shown in Table 9 and Fig. 9-Fig. 10. It can be seen in Table 8 and Figs. 10-11 that for group 1 and group 2 data, the classification accuracy of the SVM method is only 87%, approximately 6% lower than its classification accuracy under constant working conditions. This shows that machine learning methods similar to SVM have a poor fault classification effect under changing working conditions. The classification accuracy of the CNN method is reduced by only approximately 0.9% compared with its accuracy under constant working conditions, and that of the deep CNN method is reduced by only approximately 0.8%, a smaller reduction than that of the CNN method. For both CNN and deep CNN, the accuracy of fault classification under changing conditions is still very high and can reach more than 97%. This shows that deep learning methods can adapt to data changes very well, which is inseparable from the high generalization ability of deep learning. The method in this paper still has the highest classification accuracy of all methods, reaching 99.9% for both groups. This shows that the classification accuracy of the proposed method is higher than that of deep learning methods such as CNN and deep CNN under changing conditions.
It objectively verifies that the method proposed in this paper is more adaptable to the signal characteristics of rolling bearings than other deep learning methods. This proves that the proposed method not only has good classification accuracy but also has better adaptability and robustness. The main reasons why the method in this paper has such a good classification effect are as follows. First, the method in this paper is proposed for the signal characteristics of rolling bearings. Second, this paper solves the problem of single fault signal feature extraction and the problem of multiscale information extraction of compound fault signals.

VI. CONCLUSION
Aiming at the difficulty of extracting features and scale information from the composite fault signals of rolling bearings, this paper proposes a new overlapping group sparse model, which effectively solves the feature extraction problem of the composite fault signals of rolling bearings. Additionally, to better extract the scale information of the composite fault signals of rolling bearings, this paper proposes a new deep complex convolutional neural network model. The complex-valued form in the model not only has a richer representation ability but also helps to extract information at different time scales from rolling bearing composite fault signals and to remember the fault information. On this basis, this paper proposes a method for composite fault diagnosis of rolling bearings, that is, a fault diagnosis algorithm for rolling bearings based on an overlapping group sparse model-deep complex convolutional neural network. Experiments on bearing data show that the proposed method can accurately identify all faults under constant working conditions. It not only greatly improves the recognition accuracy compared with the SVM method but also improves to a certain degree on other deep learning methods. This directly verifies the effectiveness of the proposed fault diagnosis method. Under changing conditions, the fault classification accuracy obtained by the proposed method is the highest. Compared with constant working conditions, the classification accuracy of the proposed method, SVM, and other deep learning methods under changing working conditions is lower by 0.8%, 6% and 2%, respectively. It can be seen that the accuracy of the proposed method decreases the least. This further verifies that the proposed method can adapt to the diagnosis and identification of rolling bearing fault signals under different working conditions.