In this study, four different models were used to examine the effect of illumination on the dataset. Each model is discussed in turn, and the dataset on which each achieves its best results is analyzed. In addition, the best success rate of each model is presented in a table at the end of the section. During the training of each model, the iteration number, learning rate, batch size, and subdivision parameters were set to 12000, 0.001, 64, and 24, respectively. The performance of each model is given in Fig. 11.
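For reference, the shared training configuration above can be written as a simple settings sketch; the dictionary layout and key names are illustrative, not taken from the authors' code:

```python
# Shared training hyperparameters used for all four YOLO models
# (values from the text; the key names are illustrative).
train_config = {
    "max_iterations": 12000,  # total training iterations
    "learning_rate": 0.001,   # initial learning rate
    "batch_size": 64,         # images per batch
    "subdivisions": 24,       # mini-batches each batch is split into
}
```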
As shown in Fig. 11, the brightness levels of the datasets affect the success rates of the models. In the experiments on which brightness level benefits the models, all models gave their best results on the 10% illuminated dataset. The YOLOv4 model takes 416x416 inputs, whereas the dataset consists of 200x200 images, so the input images were resized to 416x416 before training. Among the five datasets with modified brightness, the best success rate on the 20% darkened dataset is 67%, and on the 10% darkened dataset it is 68%. Darkening the dataset thus has almost no effect on the success of YOLOv4, whose success on the original dataset is 67%. In contrast, the success rate improves on the illuminated datasets: the 10% illuminated dataset gives the best result with 71% mAP. Although the 20% illuminated dataset gave better results than the original and darkened datasets, it performed worse than the 10% illuminated dataset, which is the most successful. The YOLOv5 model, released after YOLOv4, is a more advanced model. Using the same parameters, five separate trainings were run on the datasets. In these trainings, the best success rate on the 20% darkened dataset is 75.9%, and on the 10% darkened dataset it is 77.1%. As with YOLOv4, darkening the dataset was observed to have a negative effect on the success of YOLOv5, whose success on the original dataset is 76.4%. As with YOLOv4, the success rate increases on the illuminated datasets.
The 10% illuminated dataset gives the best results with 79.7% mAP. Although the 20% illuminated dataset gives better results than the original and darkened datasets, it performs worse than the 10% illuminated dataset, which is the most successful. The YOLOX model used in the study was run with the default parameters specified in [33]. Although it reached higher success rates in a shorter time than the other models, it was not as successful as YOLOv5. The last model in the study, YOLOR, was run for 500 iterations; since performance did not improve beyond 500 iterations, all algorithms were limited to 500 iterations. Its parameters were set equivalently to those of the other models, but it did not produce results as successful as YOLOv5. In terms of the illumination factor, the 20% darkened dataset achieved 72.7% success, the least successful training result, as with the other models. Figure 12 shows the detection results of the best-performing model, YOLOv5. As shown in Fig. 12, different defects were successfully detected in the same image. The success rates on the original dataset and after 10% illumination are given in Table 1, together with the performance figures of the models.
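The brightness-modified dataset variants discussed above can be produced by simple pixel-intensity scaling. The paper does not specify the exact operation, so the sketch below (pure NumPy, with a hypothetical `adjust_brightness` helper) is only an assumption of how the 10% and 20% illumination/darkening levels might be generated:

```python
import numpy as np

def adjust_brightness(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel intensities by `factor` and clip to the valid 8-bit range.

    factor > 1.0 illuminates the image (e.g. 1.10 for +10%),
    factor < 1.0 darkens it (e.g. 0.80 for -20%).
    """
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

# Example: build the five dataset variants used in the study
# from one 200x200 grayscale image (dummy data here).
img = np.full((200, 200), 128, dtype=np.uint8)
variants = {pct: adjust_brightness(img, 1 + pct / 100)
            for pct in (-20, -10, 0, 10, 20)}
```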
Table 1
Best detection results of each model.
Model Name | Size | Parameters (M) | Layers | GFLOPs | Latency (ms) | FPS | Best Dataset mAP (%) | Improvement Rate (%)
YOLOv4 | 608x608 | 27.6 | 137 | 52 | 44 | 22.7 | 71.00 | 4.0
YOLOX | 640x640 | 8.94 | 286 | 26.65 | 33 | 30.3 | 73.20 | 2.7
YOLOR | 416x416 | 36.8 | 665 | 80.44 | 22 | 46 | 75.90 | 0.6
YOLOv5 | 416x416 | 7.3 | 232 | 16.8 | 17 | 58.8 | 79.70 | 3.3
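The "Improvement Rate" column in Table 1 is the absolute mAP gain of the best-performing (10% illuminated) dataset over the original dataset. A quick cross-check in Python, with the mAP values transcribed from the reported results:

```python
# mAP (%) on the original dataset and on the best (10% illuminated) dataset,
# transcribed from the reported results for each model.
original_map = {"YOLOv4": 67.0, "YOLOX": 70.5, "YOLOR": 75.3, "YOLOv5": 76.4}
best_map     = {"YOLOv4": 71.0, "YOLOX": 73.2, "YOLOR": 75.9, "YOLOv5": 79.7}

# Improvement rate = absolute mAP gain over the original dataset.
improvement = {m: round(best_map[m] - original_map[m], 1) for m in best_map}
```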
As shown in Table 1, the models differ in both performance and success rate. This study was carried out on a server with a Tesla P100 graphics card and 16 GB of RAM. The results show that the YOLOv5 architecture performs better and is more successful than the recent YOLOR and YOLOX architectures, which had not previously been applied to metal surfaces. The experimental results also show that the YOLOX and YOLOR architectures outperform YOLOv4. While the YOLOR model showed the smallest improvement on the illuminated dataset and YOLOv4 the largest, YOLOv5 remains the most suitable of the YOLO models for detecting defects on steel surfaces. A review of recent defect detection studies on the NEU dataset shows that none used the YOLOX, YOLOR, or YOLOv5 models, and none examined the effect of illumination on success. Accordingly, to the best of our knowledge, this is the first metal surface defect detection study to use the YOLOX and YOLOR models. In 2022, Tian et al. [39] proposed the DCC-CenterNet model to obtain fast and accurate results on metal surfaces; ResNet50 was chosen as the backbone and a dilated feature enhancement module (DFEM) head was used. In 2021, Kou et al. [40] developed a YOLOv3-based model for damage detection on the NEU dataset; by applying feature selection to improve the model, they achieved 72.2% mAP. In 2021, Cheng and Yu [41] applied the DEA_RetinaNet model to metal surfaces. This model consists of five parts: a feature extraction network, a DE-block, an FPN, an adaptive spatial feature fusion (ASFF) module, and a prediction network. The authors obtained 78.25% mAP. In 2021, Song et al. [42] used the multiscale adversarial and weighted gradient-domain adaptive network (MWDAN) model to detect defects on metal surfaces.
This model applies HRNet [43] as the backbone of a Faster R-CNN model. The researchers achieved a success rate of 76.2% in their study. The comparison results are given in Table 2.
Table 2
Comparison of different methods on the NEU dataset.

Reference | Method | FPS | Parameters (M) | mAP (%)
[39] | DCC-CenterNet | 71.3 | 32.8 | 79.4
[40] | YOLOv3 | 64.5 | 60.7 | 72.2
[41] | DEA_RetinaNet + ASFF | 12.0 | 28.5 | 78.2
[42] | MWDAN | - | - | 76.4
This study | YOLOv4 Darknet (10% illuminated) | 22.7 | 27.6 | 71.0
This study | YOLOR (10% illuminated) | 46.0 | 36.8 | 75.9
This study | YOLOX (10% illuminated) | 30.3 | 8.9 | 73.2
This study | YOLOv5 (10% illuminated) | 58.8 | 7.3 | 79.7
Table 2 shows that the object detectors in the cited studies achieve lower success on the NEU dataset than the YOLOv5 model while using more parameters. In addition, those studies did not examine the effects of illumination and darkening. The present study clearly shows that image enhancement applied to the training data makes a definite contribution to model success. In the experiments on a second dataset, the steel pipe dataset, the effect of brightness was measured as for the NEU dataset. The experimental results on both datasets confirm that images captured by camera benefit from the illumination adjustment. The proposed approach was applied to the steel pipe defects dataset, where 10% illumination again gave the best result. Table 3 lists the performance, success rate, improvement rate (%), and model details for the steel pipe defects dataset under 10% illumination.
The FLOPs reported in Table 3 denote the number of floating-point operations a model performs; one GFLOP corresponds to one billion floating-point operations. Among the datasets obtained by illumination and darkening, the 10% illuminated dataset achieved the best results. The improvement rate of the best dataset represents the additional success over the original dataset. The performance comparison on the illuminated dataset is shown in Table 4.
Table 4
Performance comparison of steel pipe defect detection algorithms.

Object Detection Model | Metric value (accuracy / precision / mAP@0.5, %)
GAN + CFM [44] | 85.90 (accuracy)
OSTU + MSVM-rbf [45] | 95.23 (accuracy)
Faster R-CNN + ResNet50 [46] | 78.10 (mAP@0.5)
YOLOv5x (with 10x image augmentation) [47] | 97.8 (precision)
YOLOv4 (10% illuminated only) | 84.30 (mAP@0.5)
YOLOX (10% illuminated only) | 86.50 (mAP@0.5)
YOLOR (10% illuminated only) | 89.80 (mAP@0.5)
YOLOv5 (10% illuminated only) | 97.60 (mAP@0.5)
When the results in Table 4 are analyzed, it is seen that the YOLOv5 model reaches 97.60% success with only 10% illumination. Yan et al. [38] achieved 97.8% success by performing data augmentation on the same dataset, increasing its size tenfold; as a consequence, the training time also increases roughly tenfold. In our work, almost the same success rate was achieved with a simple pre-processing step of 10% illumination, without enlarging the dataset, yielding a clear efficiency gain. Figure 13 shows the YOLOv5 results on the steel dataset.
As shown in Fig. 13, YOLOv5 also produces very effective results on the steel pipe defect dataset. The analyses show that the YOLOv5 model can operate in a real-time system and achieves the highest accuracy among the compared models. Moreover, on both datasets the accuracy can be increased considerably with only a simple illumination pre-processing step, at low performance and cost overhead, without the need for data augmentation. Despite the different datasets and different numbers of classes in our study, the YOLOv5 architecture proved the most effective in terms of performance, success, and cost.
Table 5
Comparison of illumination and darkening processing on the steel pipe dataset.

Models | Illuminated 10% | Illuminated 20% | Original (0%) | Darkened 10% | Darkened 20%
YOLOv4 | 82.00% | 81.80% | 80.50% | 80.20% | 79.70%
YOLOX | 83.80% | 83.80% | 83.40% | 82.60% | 81.80%
YOLOR | 88.60% | 86.40% | 88.10% | 86.50% | 85.60%
YOLOv5 | 94.90% | 93.20% | 93.40% | 91.30% | 90.20%
Table 6
Comparison of illumination and darkening processing on the NEU dataset.

Models | Illuminated 10% | Illuminated 20% | Original (0%) | Darkened 10% | Darkened 20%
YOLOv4 | 71.00% | 69.00% | 67.00% | 68.00% | 67.00%
YOLOX | 73.20% | 73.20% | 70.50% | 73.20% | 72.10%
YOLOR | 75.90% | 75.20% | 75.30% | 74.50% | 72.70%
YOLOv5 | 79.70% | 76.60% | 76.40% | 77.10% | 75.90%
In addition, to better highlight and analyze the performance of the YOLO models, the results on the datasets with different illumination and darkening processing are given in Tables 5 and 6, respectively. As the obtained results show, the most satisfactory outcomes are achieved with 10% illumination processing.
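This conclusion can be checked programmatically against the NEU results. The dictionary below transcribes the mAP values of Table 6 (the data layout itself is illustrative), and a per-model argmax confirms that +10% illumination is the best setting for every model:

```python
# mAP (%) per model and brightness setting on the NEU dataset (Table 6).
neu_results = {
    "YOLOv4": {"+10%": 71.0, "+20%": 69.0, "0%": 67.0, "-10%": 68.0, "-20%": 67.0},
    "YOLOX":  {"+10%": 73.2, "+20%": 73.2, "0%": 70.5, "-10%": 73.2, "-20%": 72.1},
    "YOLOR":  {"+10%": 75.9, "+20%": 75.2, "0%": 75.3, "-10%": 74.5, "-20%": 72.7},
    "YOLOv5": {"+10%": 79.7, "+20%": 76.6, "0%": 76.4, "-10%": 77.1, "-20%": 75.9},
}

# Best brightness setting per model (ties resolved by insertion order).
best = {model: max(runs, key=runs.get) for model, runs in neu_results.items()}
```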