Research on the Method of Industrial Equipment Fault Detection and Identification Based on Improved YOLOv8

doi:10.21203/rs.3.rs-3566250/v1


Article



This work is licensed under a CC BY 4.0 License

Version 1


To address the slow processing speed of fault detection algorithms and their tendency toward false and missed detections against cluttered backgrounds, this paper proposes a fault identification and detection method for industrial equipment that combines an improved YOLOv8 algorithm with the SENet channel attention module. First, the method optimizes the network structure of YOLOv8 with the SENet channel attention module, which effectively captures the features of key targets while allowing the number of convolution kernels and output feature channels to be reduced. Second, data augmentation is applied to the training set to improve the robustness of the YOLOv8 model for small target detection. Finally, analysis of the experimental results shows that the improved YOLOv8n algorithm achieves 96.84% detection accuracy, 98.73% recall, and a 97.81% F1-Score in industrial equipment fault detection, which verifies that YOLOv8n embedded with the SENet channel attention mechanism has higher accuracy and stability. The improved YOLOv8n algorithm outperforms the other algorithms compared, and the feasibility of deep learning for equipment fault identification and detection is verified in three tasks: identification of equipment switch state, identification of abnormal equipment indicator lights, and identification of abnormal equipment display data.

Physical sciences/Mathematics and computing/Computer science

Physical sciences/Mathematics and computing/Information technology

YOLOv8

neural network

deep learning

fault detection

small target detection

In recent years, fault detection equipment has shown broad market prospects in the industrial field. Typically, technicians are assigned to check regularly whether the equipment in a workstation is working properly. Manual inspection is time-consuming and inefficient, and it is strongly affected by subjective factors such as the technicians' work experience and working condition. Extracting images of the equipment state from surveillance video instead saves manpower and material resources, helps ensure the correctness of the information, reduces labor and time costs, and greatly improves work efficiency. In the field of equipment fault detection [1], continuously improved algorithm models analyze the collected equipment data to determine the safety state, gradually enabling automation in the industrial field.

Most fault detection algorithms use Faster R-CNN [2], a typical two-stage detector. Its accuracy is higher than that of algorithms without pre-selected boxes, but it is relatively slow and prone to false and missed detections against cluttered backgrounds. YOLO series algorithms, on the other hand, see the whole image during training and testing, so they can use contextual information when detecting a target, identify it accurately against a chaotic background, and are less likely to miss or misdetect it. Among the many versions, such as YOLOv5, YOLOv7, and YOLOv8, YOLOv8 adopts a lighter network structure and more efficient inference techniques (e.g., TensorRT engine acceleration), which improves the inference speed several times over previous versions; accuracy is not reduced and is even improved, especially for small target detection [3]. It also provides richer hyperparameter and model structure options, making it easier for users to adjust and optimize the model. In addition, YOLOv8 natively supports training on user-defined datasets, which simplifies transfer learning and is important for target detection in specific scenarios (e.g., equipment fault indicator state recognition and switch state recognition).

YOLOv8 [4][5], the latest addition to the YOLO series, maintains accuracy while improving speed and performs well in small target detection, which matches the small-target datasets of switches, indicators, and numbers on electronic screens used here. Therefore, this paper proposes a method for industrial equipment fault recognition and detection that combines an improved YOLOv8 algorithm with the SENet [6][7] channel attention module. Surveillance video around the inspected equipment is collected, images of the equipment state during operation are extracted at intervals, and these images are used to train a fault detection model offline. The model is then deployed in an online equipment fault detection system to achieve real-time monitoring. Experimental results show that the proposed algorithm yields clear improvements.

Section 2 introduces the basic architecture and loss function of YOLOv8, the SENet attention mechanism, and the Slim-Neck structure. Section 3 proposes optimizations and improvements to the network for equipment fault detection. Section 4 describes the experiments carried out to verify the effectiveness of the optimized YOLOv8 network structure. Section 5 presents the experimental results and summarizes the work.

2.1 YOLOv8 algorithm

YOLOv8 is one of the most advanced target detection algorithms available. Like previous versions of YOLO, it divides the image into multiple grid cells; a cell is responsible for detecting a target when the target's center point falls within it. Each cell is assigned a predicted bounding box and outputs a confidence level, together with the class probabilities for each object. To obtain the most probable predictions, the output is filtered using non-maximum suppression (NMS). YOLOv8 improves on the traditional YOLO model in several respects.
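The non-maximum suppression step mentioned above can be illustrated with a minimal sketch. It is a generic greedy NMS, not the exact routine used inside YOLOv8; the box format `(x1, y1, x2, y2)`, the function names, and the threshold of 0.5 are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box too strongly.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes cover the same indicator light.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.82, so only indices 0 and 2 are kept.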

The YOLOv8 network architecture is shown in Fig. 1. The backbone still adopts the CSP (Cross-Stage Partial) [8] approach. However, the C3 module used in YOLOv5 is replaced with the C2f module to make the network more lightweight. In addition, YOLOv8 retains the SPPF (Spatial Pyramid Pooling - Fast) [9] module from YOLOv5 to enlarge the receptive field of the network. In the feature fusion part, YOLOv8 still adopts the idea of PAN (Path Aggregation Network), but the convolutional structure in the upsampling stage of the PAN-FPN in YOLOv5 is removed and the C3 module is replaced by the C2f module (shown in Fig. 2). Unlike the previous anchor-based approach, YOLOv8 adopts the anchor-free [10][11] idea and no longer relies on predefined anchor boxes, which allows more flexible detection of targets of various scales and shapes.

2.2 Loss function

YOLOv8 adopts DFL Loss as part of the regression loss function. DFL Loss improves the model's ability to detect difficult targets by introducing sample weights and sample distribution information, so that difficult samples receive more attention during training. This helps improve detection of classes with few samples and increases the stability of the model. YOLOv8 also employs CIOU Loss as another part of the regression loss function. CIOU Loss is an improved version of the IOU (Intersection over Union) loss function. The formula of DFL Loss is as follows.

Focal Loss = -∑(y * (1 - P) ^γ * log (P)) (1)

DFL Loss = ∑ (FL * P_dist) (2)

In the above formulas, y is the one-hot encoding of the true label, P is the predicted probability, FL denotes the Focal Loss of Eq. (1), and γ is a modulating parameter, usually set to 2, that adjusts the weight of difficult samples. The formula for CIOU Loss is as follows.

CIOU Loss = 1 - (InterArea / UnionArea + alpha * v / c) (3)

Here alpha acts as a balance factor, usually set to 2, v penalizes differences in the bounding-box aspect ratio, and c penalizes the bounding-box size.
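To make the effect of γ in Eq. (1) concrete, the following is a minimal sketch of the Focal Loss term exactly as written there, for a single sample. The function name, the `eps` guard against log(0), and the example probabilities are assumptions for illustration; the P_dist weighting of Eq. (2) is not modeled.

```python
import math


def focal_loss(y_onehot, probs, gamma=2.0, eps=1e-12):
    """Eq. (1): -sum(y * (1 - P)^gamma * log(P)) for one sample."""
    return -sum(
        y * (1.0 - p) ** gamma * math.log(max(p, eps))
        for y, p in zip(y_onehot, probs)
    )


# Hypothetical predictions for a three-class problem whose true class is index 1.
easy = focal_loss([0, 1, 0], [0.05, 0.90, 0.05])  # confident, correct prediction
hard = focal_loss([0, 1, 0], [0.40, 0.20, 0.40])  # low probability on the true class
```

With γ = 2 the well-classified sample contributes almost nothing to the loss, while the poorly classified one dominates; setting γ = 0 reduces the expression to ordinary cross-entropy.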

In the improvement described in Section 3, KL Loss and Content Loss are introduced alongside DFL Loss to help the model remain invariant across different domains and environments. These additional loss functions are intended to improve the performance, robustness, and accuracy of the model.

2.3 SENet

SENet [12][13] (Squeeze-and-Excitation Network) is an attention mechanism for enhancing the performance of convolutional neural networks. Its core idea is to introduce Squeeze and Excitation operations. In the Squeeze operation, the spatial information of each feature channel is reduced to a scalar by Global Average Pooling (GAP); the weight of each channel is then learned by a small fully connected neural network (MLP). In the Excitation operation, the learned weights are used to adjust the feature response of each channel, strengthening useful feature channels and suppressing useless ones so that the network pays more attention to important features. The SENet structure is shown in Fig. 3, where 𝑐₁, 𝑐₂, 𝑐₃ are feature maps of size 𝑤 × ℎ × 𝑠 and 𝐹_𝑡𝑟 is the normal convolution operation. 𝐹_𝑠𝑞 denotes the Squeeze operation, 𝐹_𝑒𝑥 the Excitation operation, and 𝐹_𝑠𝑐𝑎𝑙𝑒 the Reweight operation.

The core "Squeeze-and-Excitation" of the SENet module is divided into two main steps.

(1) By global average pooling, the two-dimensional features of each channel are compressed into a real number, and the feature map is compressed from [h, w, c] form to [1, 1, c] form.

(2) The weights learned from the compressed features are used to update the original feature map. These weights are used to scale the features of each channel, thereby amplifying important features and suppressing unimportant ones.
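The two steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the paper: feature maps are stored channel-first as nested lists (`[c][h][w]` rather than `[h, w, c]`), the two fully connected layers use ReLU and sigmoid activations as in the usual SE design, and the weights `w1`, `b1`, `w2`, `b2` are hand-picked hypothetical values that would be learned in practice.

```python
import math


def squeeze(feature_map):
    """Global average pooling: [c][h][w] -> one scalar per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]


def dense(x, weights, bias):
    """Fully connected layer: weights has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def excite(z, w1, b1, w2, b2):
    """Small MLP: reduce -> ReLU -> expand -> sigmoid, giving one gate per channel."""
    hidden = [max(0.0, v) for v in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-v)) for v in dense(hidden, w2, b2)]


def scale(feature_map, gates):
    """Reweight every channel of the original feature map by its gate."""
    return [[[g * v for v in row] for row in ch] for ch, g in zip(feature_map, gates)]


def se_block(feature_map, w1, b1, w2, b2):
    return scale(feature_map, excite(squeeze(feature_map), w1, b1, w2, b2))


# Two 2x2 channels, reduced to a single hidden unit (hypothetical weights).
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
w1, b1 = [[0.5, 0.5]], [0.0]
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]
gates = excite(squeeze(fmap), w1, b1, w2, b2)
out = se_block(fmap, w1, b1, w2, b2)
```

With these weights the first channel receives a gate close to 0.9 and is largely preserved, while the second receives a gate close to 0.1 and is suppressed.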

2.4 Slim-Neck structure

The "Slim Neck" structure is a neural network architecture composed mainly of two key components: GSConv [14][15] and the VoV-GSCSP module. The neck layer enhances the features extracted by the backbone network while reducing the complexity of the model. GSConv [16] (Grouped Shuffle Convolution) is a convolutional operation that integrates several convolution methods, including standard convolution (SC), depthwise separable convolution (DSC), and shuffle convolution. Combining these convolution types lets the network merge features of different layers and types more effectively, and it reduces the computational cost to 60–70% of that of standard convolution while maintaining comparable performance. The VoV-GSCSP module (Vision over Vision - Global Context and Spatial Patterns) is the other key component and is used to further enhance features on top of GSConv. This module may include specific operations and network layers aimed at improving the performance of the model, especially in understanding the global contextual information and spatial patterns of images.

The goal of the entire Slim Neck structure is to improve the performance of the object detection model while reducing computational costs, making the model more suitable for deployment in resource limited environments. Through this structure, the model can better extract deep and shallow feature information, thereby improving the accuracy of object detection.
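As a rough illustration of the shuffle step that GSConv uses to mix the outputs of its standard and depthwise separable branches, the sketch below interleaves channels group by group. It assumes a ShuffleNet-style shuffle (reshape to groups, transpose, flatten); the exact shuffle used in GSConv may differ, and the channel labels are hypothetical placeholders for feature maps.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so that each output neighborhood mixes all groups."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Equivalent to reshaping to (groups, per_group), transposing, and flattening.
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


# Channels produced by the standard-convolution (sc) and depthwise (dsc) branches.
concatenated = ["sc0", "sc1", "dsc0", "dsc1"]
shuffled = channel_shuffle(concatenated, groups=2)
```

After the shuffle the channels alternate between the two branches (`sc0, dsc0, sc1, dsc1`), so later layers see information from both.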

3.1 Improved loss function

When commissioning industrial equipment, accurate fault detection is critical for equipment manufacturing and maintenance. To compensate for the shortcomings of the DFL (Distribution Focal Loss) loss in traditional object detection tasks, we introduce KL Loss and Content Loss as supplementary loss functions. With these supplementary terms, we expect to improve detection performance and the quality of the generated image. Such losses are commonly used in generative and classification models but are less common in object detection tasks. KL Loss encourages the model to better fit the distribution of real-world data and improves classification accuracy. Content Loss ensures that the generated image is similar in content to the reference image and alleviates the problems of insufficient data and lack of diversity. KL Loss is defined as follows.

KL (P || Q) = Σ (P(x) * log (P(x) / Q(x))) (4)

P(x): probability density function of the true distribution.

Q(x): probability density function of the model distribution.
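Eq. (4) can be evaluated directly for discrete distributions, as in the minimal sketch below. The function name, the `eps` guard against division by zero, and the example distributions are assumptions for illustration; terms with P(x) = 0 are skipped because they contribute nothing to the sum.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Eq. (4): KL(P || Q) = sum over x of P(x) * log(P(x) / Q(x))."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


true_dist = [0.5, 0.5]    # hypothetical true distribution P
model_dist = [0.9, 0.1]   # hypothetical model distribution Q
forward = kl_divergence(true_dist, model_dist)
reverse = kl_divergence(model_dist, true_dist)
```

The divergence is zero only when the two distributions coincide, and it is not symmetric: `forward` and `reverse` differ for the example above.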

Content Loss measures the mean square error between the feature representations of the generated image and the reference image; minimizing it forces the generated image to remain similar to the reference image. Given a generated image G and a reference image C, the Content Loss between them is defined as follows.

Content Loss = Σ (I (G) - I (C)) ^ 2/ (2 * N) (5)

I(G) is the feature representation of G, i.e., the feature map of a chosen layer of the convolutional neural network, and I(C) is the corresponding feature representation of the reference image C.
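A minimal sketch of Eq. (5) on flattened feature vectors follows. The text does not define N; the sketch assumes N is the number of feature elements, and the function name and example values are hypothetical.

```python
def content_loss(feat_g, feat_c):
    """Eq. (5): sum((I(G) - I(C))^2) / (2 * N), assuming N = number of feature elements."""
    if len(feat_g) != len(feat_c):
        raise ValueError("feature vectors must have the same length")
    n = len(feat_g)
    return sum((g - c) ** 2 for g, c in zip(feat_g, feat_c)) / (2 * n)


# Flattened features of a generated image and a reference image (hypothetical values).
loss = content_loss([1.0, 3.0], [0.0, 1.0])
```

For identical features the loss is zero; for the example above the squared differences are 1 and 4, giving 5 / (2 * 2) = 1.25.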

By comparing the output of the model with the labels, DFL Loss, KL Loss, and Content Loss are calculated separately. A weighted sum is then used to construct the overall loss function. The weights determine the contribution of each loss to the overall loss: DFL Loss has the highest weight, while KL Loss and Content Loss have lower weights. The overall loss is defined as follows.

Total Loss = α * DFL Loss + β * KL Loss + γ * Content Loss (6)

α, β, and γ are hyperparameters that control the weight of each loss function in the overall loss.
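The weighted combination in Eq. (6) can be sketched as below. The default weights are hypothetical placeholders, since the text only states that the DFL weight is larger than the other two; the check inside the function encodes that stated constraint.

```python
def total_loss(dfl, kl, content, alpha=1.0, beta=0.1, gamma=0.1):
    """Eq. (6): alpha * DFL Loss + beta * KL Loss + gamma * Content Loss."""
    if not (alpha > beta and alpha > gamma):
        raise ValueError("DFL Loss is expected to carry the largest weight")
    return alpha * dfl + beta * kl + gamma * content


# Hypothetical per-batch loss values.
combined = total_loss(dfl=2.0, kl=1.0, content=0.5)
```

In practice the three weights would be tuned on the validation set like any other hyperparameter.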

Introducing KL Loss and Content Loss brings the following improvements. With KL Loss, the model tends to generate more accurate detection boxes; experiments show that the false detection rate decreases by 30%. With Content Loss, the model preserves objects in images better, reducing artifacts by 50% compared to models that do not use it. Together they improve the robustness of the model, enabling it to perform target detection more reliably in various scenarios.

3.2 Improved YOLOv8 network architecture

This improvement is based on the YOLO-v8 model, combined with the SENet attention mechanism module, Slim-neck paradigm, and optimized loss functions DFL Loss and CIOU Loss to construct a new industrial equipment fault detection model. The improved model structure is shown in Fig. 4. The input includes adaptive image scaling, multi-scale image training, data mixing, and Mosaic data augmentation technology [17]. Adaptive image scaling will randomly scale the image during the training process to simulate the state of the target at different distances and scales. Multi-scale training uses images of different input scales to help the model process objects of different sizes. Data mixing is the process of mixing two or more images together to generate new training examples, which helps the model better understand the interrelationships between targets. Mosaic data augmentation is the process of concatenating multiple images into a large image to provide more training samples and rich visual information, thereby improving the performance and robustness of the model.
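The core idea of Mosaic augmentation, stitching several images into one larger image, can be sketched as follows. This is a simplified illustration that assumes four equally sized single-channel images stored as nested lists; practical Mosaic implementations also pick a random center, rescale the tiles, and remap the bounding-box labels, none of which is shown here.

```python
def mosaic(top_left, top_right, bottom_left, bottom_right):
    """Stitch four equally sized images (lists of pixel rows) into a 2x2 grid."""
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


def solid(value, height, width):
    """Helper that builds a constant-valued test image."""
    return [[value] * width for _ in range(height)]


# Four hypothetical 2x3 tiles with pixel values 1..4.
tiles = [solid(v, 2, 3) for v in (1, 2, 3, 4)]
big = mosaic(*tiles)
```

The result is a 4x6 image whose quadrants are the four input tiles, so a single training sample now contains targets from four different scenes.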

In the core architecture of YOLOv8, we introduce the SENet attention mechanism module between the pooling layer and the convolutional layer. This design allows the network to learn a weight for each channel, adaptively emphasizing key features while reducing the focus on irrelevant responses. This improves the model's ability to generalize, allowing it to capture patterns and structures in images more accurately, and makes the model more robust to different lighting, scale, and rotation conditions.

Meanwhile, SENet's channel attention mechanism [18][19] can perform feature enhancement on a single-scale feature map, effectively capturing the features of key targets. In this way, multi-scale fusion can be avoided, the network structure simplified, the parameters and computation reduced, and efficiency improved. Because the targets have distinctive appearance characteristics, convolutional features that are very deep or very wide are not needed. Therefore, this article simplifies the overall network model by reducing the number of convolutional kernels and output feature channels.

In the Neck module, the Slim Neck structure is used to reduce the size and computational complexity of the model. Embedding the Slim Neck structure requires channel pruning, feature fusion, and modular design. Modular design means building the Slim Neck structure as reusable modules that can be integrated into different object detection architectures without large-scale modifications. This article replaces the traditional standard convolution in the Neck layer with GSConv and introduces the VoV-GSCSP module [20][21] to enhance feature extraction. The Slim Neck structure composed of GSConv and VoV-GSCSP modules enhances the features extracted from the backbone network. These improvements aim to fully exploit deep features. The GSConv convolutional operation [22] and the VoV-GSCSP network structure are shown in Fig. 5 and Fig. 6, respectively.

4.1 Data sets

In the experiment, sample data are obtained from the monitoring environment of industrial equipment: one frame is captured every 10 minutes from each surveillance video recorded over the last 7 days. From these captured frames, 5000 images of industrial equipment switch states, 5000 images of indicator lights, and 5000 images of numeric displays are selected as training samples.
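The sampling schedule described above (one frame every 10 minutes over 7 days) can be sketched as a list of capture timestamps for a single video stream. The end time is a hypothetical placeholder, and actually decoding frames from the video at these timestamps is outside the scope of the sketch.

```python
from datetime import datetime, timedelta


def sample_times(end, days=7, interval_minutes=10):
    """Timestamps at fixed intervals covering the `days` before `end`."""
    step = timedelta(minutes=interval_minutes)
    current = end - timedelta(days=days)
    times = []
    while current < end:
        times.append(current)
        current += step
    return times


end = datetime(2023, 1, 8, 0, 0)  # hypothetical end of the 7-day window
times = sample_times(end)
frames_per_stream = len(times)
```

Each stream yields 7 × 24 × 6 = 1008 candidate frames under this schedule, from which the sample images are then selected.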

Since part of the dataset is affected by factors such as environment and shooting distance, data augmentation techniques [23] are added to increase the scale and diversity of the data. In this study, several data augmentation techniques improved the training set, allowing the optimized YOLOv8 model to perform well in small target detection [24].

In this article, random cropping, random resizing, random rotation, random brightness, and random contrast are applied to the training set. Two augmentations are performed for each image in the training set, with the expectation that a more refined and accurate detection model can be trained.
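A minimal sketch of such an augmentation step is given below, under several assumptions: images are small grayscale grids stored as nested lists, "two enhancements per image" is read as producing two augmented copies with two different randomly chosen operations, rotation is limited to a 90-degree turn (arbitrary angles need interpolation), and resizing is omitted. The parameter ranges are hypothetical.

```python
import random


def clamp(value):
    """Round and clip a pixel value to the valid 0..255 range."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(img, delta):
    return [[clamp(p + delta) for p in row] for row in img]


def adjust_contrast(img, factor):
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[clamp(mean + factor * (p - mean)) for p in row] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def random_crop(img, rng):
    """Crop a random window covering roughly half of each dimension."""
    h, w = len(img), len(img[0])
    ch, cw = h - h // 2, w - w // 2
    top, left = rng.randint(0, h - ch), rng.randint(0, w - cw)
    return [row[left:left + cw] for row in img[top:top + ch]]


def augment_twice(img, rng):
    """Return two augmented copies, each produced by a different random operation."""
    ops = [
        lambda im: adjust_brightness(im, rng.randint(-40, 40)),
        lambda im: adjust_contrast(im, rng.uniform(0.7, 1.3)),
        rotate90,
        lambda im: random_crop(im, rng),
    ]
    return [op(img) for op in rng.sample(ops, 2)]


image = [[10 * (r + c) for c in range(4)] for r in range(4)]
copies = augment_twice(image, random.Random(0))
```

For detection data, geometric operations such as cropping and rotation must also transform the bounding-box labels; that bookkeeping is left out here.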

The dataset covers three device categories (switches, indicators, and digital displays), with 5000 sample images each, for a total of 15000 sample images. To support training and testing while maintaining the rigor of the experiment, the dataset is divided into a training set, a validation set, and a test set at a ratio of 3:1:1. The training set is used to fit the weight parameters of the model, the validation set is used to tune the model and select the optimal one, and the test set is used for final prediction and evaluation with the selected model. That is, the number of training and test sets for each device category is 1500 images, which makes the data volume of the training set and testing set equal.
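A 3:1:1 split of this kind can be sketched as follows. The function name, the fixed seed, and the file names are hypothetical; the sketch shuffles once and slices, which is one common way to obtain such a split rather than the procedure confirmed by the paper.

```python
import random


def split_3_1_1(items, seed=0):
    """Shuffle and split into training, validation, and test subsets at 3:1:1."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 3 // 5
    n_val = len(items) // 5
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Ten hypothetical image names stand in for one device category.
names = [f"img_{i:04d}.jpg" for i in range(10)]
train, val, test = split_3_1_1(names)
```

Splitting each device category separately, as sketched here for one category, keeps the class proportions the same in all three subsets.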

4.2 Data pre-processing

During data processing, Labelme is used to annotate the device state in the captured images, including switch state information (shown in Fig. 7), indicator state information (shown in Fig. 8), and digital instrument status (shown in Fig. 9). The annotation information is then saved as JSON source files. Finally, a random function divides the whole dataset into a training set and a test set, ensuring a uniform number of samples in both. The ratio of positive to negative samples in each category's training and test sets is approximately 1:1.
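Because YOLO-style training typically expects one normalized `class x_center y_center width height` line per object, Labelme JSON annotations usually need a conversion step. The paper does not describe this step, so the sketch below is an assumption: it reads the `shapes`, `points`, `imageWidth`, and `imageHeight` fields of a Labelme file, handles rectangle shapes only, and uses a hypothetical class-to-id mapping and sample annotation.

```python
import json


def labelme_to_yolo(annotation, class_ids):
    """Convert Labelme rectangle shapes to normalized YOLO label lines."""
    width, height = annotation["imageWidth"], annotation["imageHeight"]
    lines = []
    for shape in annotation["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue  # polygons and other shapes are ignored in this sketch
        (x1, y1), (x2, y2) = shape["points"]
        x_center = (x1 + x2) / 2 / width
        y_center = (y1 + y2) / 2 / height
        box_w = abs(x2 - x1) / width
        box_h = abs(y2 - y1) / height
        class_id = class_ids[shape["label"]]
        lines.append(f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}")
    return lines


# Hypothetical annotation for one 640x480 frame with a single lit red indicator.
sample = json.loads("""
{
  "imageWidth": 640,
  "imageHeight": 480,
  "shapes": [
    {"label": "Red_On", "shape_type": "rectangle", "points": [[100, 120], [300, 360]]}
  ]
}
""")
labels = labelme_to_yolo(sample, {"Red_On": 0, "Red_Off": 1})
```

For the sample above the single output line is `0 0.312500 0.500000 0.312500 0.500000`.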

4.3 Experimental environment

The experimental environment is shown in Table 1. The CPU is an AMD Ryzen 7 5800H with a base frequency of 3.20 GHz, the graphics driver is version 516.94 for Windows 10 64-bit with CUDA 11.7, the deep learning framework is PyTorch 3.9, and Labelme is used as the annotation tool.

Table 1

Experimental environment
Configuration Name	Configuration
Operating system	Windows 10 64-bit
Graphics card	NVIDIA GeForce RTX 3050
RAM	16 GB
Development platform	PyCharm
Development language	Python 3.8
Deep learning framework	PyTorch

4.4 Experimental Procedures

First, the runtime environment of the PyTorch framework is configured, ensuring that PyTorch and its dependencies are installed correctly and that a suitable GPU environment is available to accelerate training; the YOLOv8 network model is then constructed. Next, images are extracted from surveillance videos around industrial equipment, including switch state images, indicator light images, and abnormal digital display images. These sample images are labeled and stored according to the conventions of a custom dataset, ensuring that the dataset contains the class label and bounding-box location of each target. For the YOLOv8 object detection task, a corresponding loss function is generated. The calculation procedure is shown in Fig. 10; it makes the resulting model suitable for detecting and identifying faults in industrial equipment.

4.5 Experimental results and analysis

In this experiment, mean average precision (mAP@0.5), precision, recall, and the F1-Score curve are used to evaluate the performance of the industrial fault detection model.

(1) Mean Average Precision

To verify the superiority of the proposed YOLOv8n industrial fault identification method in recognizing nine states of indicators, switches, and data dashboards, the method is compared with the YOLOv6n, YOLOv7n, and Faster R-CNN algorithms. The comparison results are shown in Table 2.

Table 2

Comparison of industrial fault identification effects under different detection states
Type of testing	Number of states	YOLOv8n mAP@0.5/%	YOLOv6n mAP@0.5/%	YOLOv7n mAP@0.5/%	Faster R-CNN mAP@0.5/%
Red_On	150	93.3	92.4	91.3	91.8
Red_Off	256	99.5	93.5	95.6	95.4
Green_On	76	96.5	91.6	92.5	92.3
Green_Off	210	96.3	94.7	94.8	96.2
Yellow_On	34	92.1	90.3	91.2	91.2
Yellow_Off	48	93.4	89.6	90.0	92.3
Switch_On	124	95.6	94.5	93.8	92.5
Switch_Off	160	97.6	96.3	94.6	93.6
Data	160	96.2	95.8	96.2	94.2
All	1218	95.6	92.1	93.3	93.6

As can be seen from Table 2, the mean average precision (mAP@0.5) of the proposed detection system for industrial fault recognition is 95.6%. The average precision of the proposed method for indicator lights, switches, and data dashboards is 95.1%, 96.6%, and 96.2%, respectively, which is 3.5%, 2.3%, and 2.0% higher than the other algorithms, respectively.
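The per-group figures can be reproduced from the YOLOv8n column of Table 2 with a short calculation. The sketch assumes an unweighted mean over the states in each group and over all nine states; the grouping of states into indicator lights, switches, and data dashboards follows the state names.

```python
# YOLOv8n mAP@0.5 values (%) and sample counts copied from Table 2.
yolov8n_map = {
    "Red_On": 93.3, "Red_Off": 99.5, "Green_On": 96.5, "Green_Off": 96.3,
    "Yellow_On": 92.1, "Yellow_Off": 93.4, "Switch_On": 95.6, "Switch_Off": 97.6,
    "Data": 96.2,
}
state_counts = {
    "Red_On": 150, "Red_Off": 256, "Green_On": 76, "Green_Off": 210,
    "Yellow_On": 34, "Yellow_Off": 48, "Switch_On": 124, "Switch_Off": 160,
    "Data": 160,
}
groups = {
    "indicator lights": ["Red_On", "Red_Off", "Green_On", "Green_Off", "Yellow_On", "Yellow_Off"],
    "switches": ["Switch_On", "Switch_Off"],
    "data dashboards": ["Data"],
}

# Unweighted mean per group and over all nine states.
group_means = {
    name: sum(yolov8n_map[s] for s in states) / len(states)
    for name, states in groups.items()
}
overall_mean = sum(yolov8n_map.values()) / len(yolov8n_map)
total_states = sum(state_counts.values())
```

The overall mean and the total number of states computed this way match the "All" row of Table 2 (95.6% and 1218).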

The mAP@0.5 of Yellow_On is the lowest among the nine states recognized by YOLOv8, at only 92.1%. The reason is that the difference between lit and unlit yellow indicator lights in the data source is not obvious, so a lit yellow indicator is easily misidentified as off. Improving the accuracy requires adding more data and continued training and optimization.

(2) Precision and Recall

The precision and recall of each class are calculated from the normalized confusion matrix of this experiment. Normalizing the confusion matrix makes it easier to observe the classification accuracy and errors of the model across categories.
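How precision, recall, and F1-Score follow from a confusion matrix can be sketched as below. The sketch works on raw counts rather than a normalized matrix, assumes rows are true classes and columns are predicted classes, and uses a hypothetical two-class matrix rather than the data of this experiment.

```python
def per_class_metrics(confusion):
    """Return (precision, recall, F1) per class; confusion[i][j] counts true i predicted as j."""
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(confusion[k]) - tp                       # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Hypothetical counts for two states, e.g. an indicator that is on or off.
confusion = [[8, 2],
             [1, 9]]
metrics = per_class_metrics(confusion)
```

For the first class this gives precision 8/9, recall 8/10, and F1 = 16/19, since F1 is the harmonic mean of precision and recall.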

The normalized confusion matrix in Fig. 11 shows the relationship between the predictions of the industrial fault state identification model and the actual labels. The predictions are listed in Table 3. The prediction accuracies of the nine categories of industrial fault states range from 0.82 to 1.00, and the accuracies of Red_On, Red_Off, Green_On, and Switch_Off are unbalanced relative to those of Yellow_On and Yellow_Off. Such imbalance may cause the model to perform inconsistently on the evaluation metrics (e.g., precision and recall), so this experiment adds samples for categories with few samples to balance the sample distribution.

Table 3

Confusion Matrix Data Table
Type of testing	Predictive accuracy
Red_On	0.82
Red_Off	0.83
Green_On	0.83
Green_Off	0.97
Yellow_On	1.00
Yellow_Off	1.00
Switch_On	0.95
Switch_Off	0.87
Data	0.97

(3) F1-Score Analysis

In Fig. 12, the F1-Score curve shows how the F1-Score of the model changes with the prediction confidence threshold. The F1 over all classes reaches 0.93 at a confidence threshold of 0.513, which indicates strong performance.

Four additional groups of experiments were carried out in this section to compare different models, using the same training parameters in each group. The influence of the model on detection performance is shown in Table 4. The improved YOLOv8n algorithm achieves a detection accuracy of 96.84%, a recall of 98.73%, and an F1-Score of 97.81%. The other algorithms perform lower: YOLOv6n has an F1-Score of 85.28%, YOLOv7n 89.82%, and Faster R-CNN 91.97%. Overall, YOLOv8n embedded with the SENet channel attention mechanism performs well in both detection accuracy and recall, and its F1-Score is the highest, indicating a good balance between precision and recall. It can therefore be concluded that this algorithm has higher accuracy and stability in the detection task.

Table 4

Comparison of performance evaluation indicators for different models
Model	Detection accuracy/%	Recall rate/%	F1-Score/%
YOLOv8n	96.84	98.73	97.81
YOLOv6n	89.27	95.30	85.28
YOLOv7n	91.96	90.36	89.82
Faster R-CNN	92.36	92.03	91.97

(4) Loss plot

When the improved detection model proposed in this paper is applied to the dataset, the results are as shown in Fig. 13. The figure shows that the model converges well and eventually stabilizes, indicating that the equipment fault detection model neither over-fits nor under-fits and that the results detected by the model are accurate and reliable.

(5) Visualization of industrial fault detection results

This section presents the detection results of the model for different industrial fault states to better illustrate its effect. To account for the generalization ability and robustness of the model, images of industrial equipment in different environments are selected.

As shown in Fig. 14, the lit red fault indicator and the green fault indicator in the picture are detected, while the yellow indicator light is not detected well enough. Inspection shows that the dataset of lit yellow fault indicators is too small and that, because of viewing angle, lighting, and lens occlusion, the difference between lit and unlit yellow indicators is not obvious. Subsequent experiments will focus on optimizing the model for this case.

As shown in Fig. 15, the open and closed states of the switches in the picture are detected without any error or omission.

As shown in Fig. 16, the numbers on the electronic display were detected one by one and the threshold detection condition performed well.

In general, given the diverse application scenarios and uneven detection levels in the current equipment fault detection industry, the Faster R-CNN algorithm is the most widely used. However, it is a typical two-stage algorithm, which has an additional pre-selection box step compared with the YOLO-type algorithm used in this article. Its accuracy is higher than that of algorithms without pre-selection boxes, but it is much slower and prone to false and missed detections against cluttered backgrounds. As the latest work in the YOLO series, YOLOv8 maintains accuracy while increasing speed and performs well in small object detection. Its usage scenarios are consistent with small object datasets such as switches, indicator lights, and numbers on electronic screens.

This article proposes a method for equipment fault identification and detection that combines an improved YOLOv8 algorithm with the SENet channel attention module. First, surveillance video around the monitored equipment is collected and images of the equipment's operating status are extracted at intervals; these images are used to train the fault detection model offline. Then, online real-time monitoring is achieved through the online equipment fault detection system, which obtains timely and reliable information on whether the equipment has malfunctioned and notifies the relevant staff. Finally, analysis of the experimental results shows that the improved YOLOv8n algorithm achieves a detection accuracy of 96.84%, a recall rate of 98.73%, and an F1-Score of 97.81%, demonstrating excellent performance in industrial equipment fault detection. At the same time, the feasibility of deep learning in equipment fault identification and detection is verified in three tasks: equipment switch status recognition, abnormal indicator light recognition, and abnormal display data recognition. Therefore, the industrial fault detection method based on the combination of YOLOv8 and the SENet channel attention module can accurately detect industrial faults, demonstrating good generalization ability and robustness. Its accurate predictions can play an important role in real industrial production environments, helping staff detect and identify industrial equipment faults in time.

Author Contributions

Conceptualization, H.L., L.S., Y.S. and T.Y.; Methodology, H.L. and Y.S.; Validation, Y.S. and T.Y.; Formal analysis, H.L., T.Y. and Y.S.; Investigation, H.L., T.Y. and Y.S.; Writing - original draft, H.L. and Y.S.; Writing - review and editing, H.L.; Visualization, T.Y.; Supervision, H.L. and L.S.; Funding acquisition, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Science and Technology Project of Hebei Education Department, grant no. QN2021405, and Handan Science and Technology Research and Development Plan Project grant no. 21422021173 and 21422031170, and Research Fund of Handan University grant no. XZ2021202 and J202214.

Data availability: The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Not applicable.

Conflicts of Interest: The authors declare no conflict of interest.

