Figure 1 illustrates the proposed surface defect detection system. The system first acquires a depth image with the structured light system and reconstructs a 3D point cloud model from it (Fig. 1a). The point cloud data are then filtered (Fig. 1b), the model is segmented with the Euclidean clustering algorithm (Fig. 1c), and each segmented region is fitted by least squares to estimate and quantify the defect information (Fig. 1d). Using the quantified defect information (center coordinates, defect area, defect height and shape size), each point on the 3D surface is classified into one of four defect types: bubble, fold, warping and pit. In this work, we assume that this set of primitives is sufficient to describe any defect typology. Finally, 2D features are extracted from these regions to recognize the defect in a classification stage.

The following diagram shows the flowchart of the proposed approach, which applies the improved KNN filtering algorithm and the improved Euclidean clustering algorithm to defect identification.

## 3.1 Image Acquisition

The depth image is obtained from the 3D structured light system. A CMOS camera (Teledyne Dalsa G3-GM12-M2590, On-Semi Python 5000 P1) captures the laser fringes on the moving platform, which are emitted by a line laser. The height information is stored in the sheet-of-light model on the computer terminal, from which the depth image of the target is obtained. The depth image acquisition and 3D reconstruction system is shown in Fig. 3, and the hardware parameters are listed in Table 1.

Table 1. Main equipment specifications.

| Device | Specifications |
| --- | --- |
| Camera | Teledyne Dalsa G3-GM12-M2590 |
| Sensor size | 4.8 µm × 4.8 µm |
| Image resolution | 2592 × 2048 |
| Lens | *f* = 25 mm |
| Laser | 150 mW |
| Software | C#, HALCON 19 and Visual Studio 2019 |

## 3.2 Improved k-neighbors algorithm

Two problems arise when acquiring the three-dimensional data model. First, the model contains a large amount of point cloud data, which slows down data processing. Second, the data inevitably contain noise, e.g., outliers, which must be removed to ensure the integrity of the captured data model. Using voxels as the computational unit (Fig. 4a) improves both the speed of the k-neighborhood search and the accuracy of outlier handling. An outlier, as a form of noise, can easily be mistaken for an inlier of a structure and thereby generate a very large number of structures.

Given the point cloud of the sample 3D surface, denoted as the set S, we represent S by a voxel grid of cubes with side length \(l_0\), which greatly improves the efficiency of point cloud data processing. Let \(P_i \in S\) be a point on the surface, given by the grid coordinates of the center of gravity of the point cloud data in one voxel. The K nearest neighbors of \(P_i\) are the neighbors \(P_{ij}\) inside the sphere of radius \(d_0\); together they form the set \(Q_k\). The choice of \(d_0\) determines the size of the filtering area and the number of K-adjacent points. Choosing the number *k* and the distance \(d_0\) is the problem of finding the correct scaling factor, which affects the estimation of outliers; a correct outlier estimation removes the model noise effectively. The parameter *k* directly determines whether a point is an outlier in the \(Q_k\) domain. Outliers that lie far away from the model can be filtered effectively by setting \(d_0\) and *k* (Fig. 4b).
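One way to read the \(d_0\) and *k* criterion is as a radius-based outlier test: a point is treated as an outlier when fewer than *k* other points lie within distance \(d_0\) of it. The sketch below illustrates that reading with a brute-force search in plain Python. The function name `radius_outliers` and the exact decision rule are assumptions made for illustration, not the implementation used in this work.

```python
from math import dist


def radius_outliers(points, d0, k):
    """Return indices of points with fewer than k other points within d0.

    points: list of (x, y, z) tuples, e.g. voxel centers of gravity.
    d0:     search radius (assumed to play the role of d_0 in the text).
    k:      minimum number of neighbours required to keep a point.
    """
    outliers = []
    for i, p in enumerate(points):
        # Count neighbours of p inside the sphere of radius d0 (excluding p).
        neighbours = sum(
            1 for j, q in enumerate(points) if j != i and dist(p, q) <= d0
        )
        if neighbours < k:
            outliers.append(i)
    return outliers
```

The brute-force search costs O(n²) distance evaluations; for real point clouds a spatial index such as a k-d tree or the voxel grid itself would normally replace the inner loop.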

However, data acquisition also produces sparse noise caused by reflection. This noise lies close to the main body of the model and cannot be removed effectively in this way. To judge correctly whether the points represented by a voxel form a sparse point group, we introduce the density parameter *ρ* (Fig. 4c), which represents the density of the voxel within the region \(Q_k\). This value reflects the number of points contained in the voxel and thus indirectly identifies and filters noise. Figure 4 shows how *ρ* is estimated for \(P_i\).

As shown above, the structure restricted by the parameter \(l_0\) is defined as a set of voxels, where the voxel \(P_i\) is a collection of solid points \(p_{ij}\), and the voxel grid represents surface variations in a simple manner. The region \(Q_k = \{p_i\}\) (\(i = 1, 2, \ldots, k\)) is represented by a structure of *k* voxels, where the points \(p_{ij}\) and their number *n* describe the structure \(P_i\). We therefore propose to evaluate the voxel structure \(P_i\) in the region \(Q_k\) by adding the concept of voxel density to the measure of \(P_i\) and computing its structure density *ρ*. This is given in Eq. (1), where \(n_i\) is the number of points in the structure \(P_i\) found on the surface and \((n_{ij})_{max}\) is the maximum point count in the \(Q_k\) neighborhood. Figure 4 shows the density estimation method for the voxel point \(P_i\). By setting a density threshold, we estimate the density of the noise near the real model. The density value *ρ* directly reflects the proportion of the voxel in the set \(Q_k\) and indirectly indicates how sparse the points within the voxel are; sparse voxels should be filtered out. The density of \(P_i\) is illustrated in Fig. 4c.

$$\rho =\frac{n_i}{\left(n_{ij}\right)_{max}} \tag{1}$$

In Fig. 4c, if the number of points \(n_i\) in the voxel \(P_i\) is 2 (blue grid) and the maximum \((n_{ij})_{max}\) in the \(Q_k\) neighborhood is 6 (purple grid), then the density \(\rho_i\) of \(P_i\) in the set \(Q_k\) is 2/6 = 1/3. If the threshold \(\rho_0\) is greater than \(\rho_i\), the point \(P_i\) is considered sparse and is eliminated.
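The density test of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the neighborhood \(Q_k\) is approximated by the cube of voxels within `reach` cells of the query voxel (the text uses a sphere of radius \(d_0\)), the query voxel itself is included when taking the maximum, and the names `voxel_counts`, `sparse_voxels` and `reach` are hypothetical.

```python
from collections import defaultdict
from math import floor


def voxel_counts(points, l0):
    """Map each voxel index (ix, iy, iz) to the number of points inside it."""
    counts = defaultdict(int)
    for x, y, z in points:
        counts[(floor(x / l0), floor(y / l0), floor(z / l0))] += 1
    return dict(counts)


def sparse_voxels(counts, rho0, reach=1):
    """Return the voxels whose density rho = n_i / max(n_ij) is below rho0.

    counts: voxel index -> point count, as produced by voxel_counts.
    rho0:   density threshold (rho_0 in the text).
    reach:  half-width, in voxels, of the cubic neighbourhood that stands in
            for Q_k here (an assumption; the text uses a sphere of radius d0).
    """
    offsets = range(-reach, reach + 1)
    sparse = set()
    for (ix, iy, iz), n_i in counts.items():
        # Largest point count among the neighbouring voxels, query voxel included,
        # so n_max >= n_i >= 1 and the division below is always defined.
        n_max = max(
            counts.get((ix + dx, iy + dy, iz + dz), 0)
            for dx in offsets for dy in offsets for dz in offsets
        )
        if n_i / n_max < rho0:
            sparse.add((ix, iy, iz))
    return sparse
```

With the numbers from Fig. 4c (a voxel holding 2 points next to one holding 6), the density is 1/3, so the voxel is removed for any threshold above 1/3 and kept otherwise.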

## 3.3 Clustering segmentation

As shown in Fig. 5, defect segmentation is divided into three parts: data acquisition, data processing and results display. Because the defects exist separately, we use the Euclidean clustering algorithm to segment the model. However, defects located close to each other cannot be distinguished because of under-segmentation. To handle this case, we examine the plane regions that exist within under-segmented clusters. Segmenting these indistinguishable under-segmented regions is the most important processing step and needs further improvement. By comparing the features of flat areas and defect areas, we introduce the degree of point-z separation \(S_z\) to further segment the defects.
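For reference, the standard Euclidean clustering step (not the improved variant described in this section) can be sketched as a region-growing loop: a cluster keeps absorbing every unvisited point that lies within a distance tolerance of one of its members. The function name `euclidean_clusters` and its parameters `tolerance` and `min_size` are assumptions for illustration, and the neighbor search is brute force rather than the spatial index a production system would use.

```python
from collections import deque
from math import dist


def euclidean_clusters(points, tolerance, min_size=1):
    """Group point indices so that each point is within `tolerance`
    of at least one other point in its cluster (brute force, O(n^2))."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            # Collect all unvisited points close enough to the current point.
            near = [j for j in unvisited if dist(points[i], points[j]) <= tolerance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return sorted(clusters)
```

Two defects whose nearest points are closer than `tolerance` end up in one cluster, which is the under-segmentation case discussed above.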

First, a seed point \(p_i\) is selected among the spatial points in region R. In this paper, we take any point \(p_i\) in the under-segmented region and obtain all points \(p_i \in S\), \(S = \{p_i \mid i = 1, 2, \ldots, n\}\), within a certain distance by spatial indexing. For each point \(p_i\) we take its z value, which is its height. In the parameter evaluation, we introduce the degree of point-z separation \(S_z\) as a feature to evaluate the seed point \(p_i\); \(S_z\) is defined in Eq. (2). We set a threshold \(S_0\): if \(S_z\) is less than \(S_0\), the points are considered a plane set, and otherwise a defect set. In this way, all point clusters are divided into a plane set (Q) and defect sets (A, B, ...) according to the height difference feature, which completes the segmentation. Finally, different colors are used to display the defect segmentation results.

$$S_z=\sqrt{\frac{\sum_{i=1}^{n}\left(z_i-z_{ave}\right)^2}{n-1}} \tag{2}$$

where \(z_{ave}\) denotes the average height, \(z_{ave}=\frac{1}{n}\sum_{i=1}^{n}z_i\). \(S_z\) is the standard deviation of the heights and describes the dispersion of the points \(p_i\) in region R. When the query point \(p_i\) lies on a flat plane, the height differences in region R are small, so \(S_z\) is small as well; the smoother region R is, the smaller \(S_z\) becomes. Therefore, by thresholding \(S_z\), the improved Euclidean segmentation separates the point sets with plane features from the defects with complex height variations, which solves the under-segmentation problem.
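Eq. (2) is the sample standard deviation of the heights, so it can be computed directly with Python's standard library. The sketch below shows the plane test as described in the text; the function names `z_separation` and `is_plane_region` and the argument name `s0` are hypothetical, and the threshold value is left to the caller because the text does not state one.

```python
from statistics import stdev


def z_separation(z_values):
    """Degree of point-z separation S_z from Eq. (2).

    statistics.stdev uses the n - 1 denominator, matching the equation.
    It needs at least two height values.
    """
    return stdev(z_values)


def is_plane_region(z_values, s0):
    """A neighbourhood is treated as planar when S_z is below the threshold S_0."""
    return z_separation(z_values) < s0
```

A neighborhood taken from a flat area gives a small `z_separation` and is assigned to the plane set, while a neighborhood that spans a defect gives a large value and stays in a defect set.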

The graph above shows the clustering relationship between defect sets and plane sets. Defect set A may contain plane point cloud data, which need to be split off into the plane set Q by applying the threshold on \(S_z\). Set Q, a separate category that contains the smooth points, is not considered part of defect set A. Finally, set Q is filtered out, while the remaining clusters A and B are retained.

## 3.4 Quantification and classification

After segmentation is completed, the final task is to fit the point cloud data and quantify the defect information. The four kinds of defects, all of which have a height feature, are classified using 2D features such as geometric shape, defect size and defect height. To this end, we choose “plane” primitives to fit the defect. We assess and categorize defects with distinct height features, focusing on bubbles, folds, warps, and pits. In space coordinates, a plane can be expressed as:

$$z=a_0x+a_1y+a_2 \tag{3}$$

Least squares is used to fit a plane to the discrete points of a defect set, that is, to find the plane \(z=a_0x+a_1y+a_2\) that is closest to all points. According to the least squares criterion, the deviation \(Q=\sum_{i=1}^{n}\left(a_0x_i+a_1y_i+a_2-z_i\right)^2\) must be minimal. In other words, we seek the coefficients \(a_0\), \(a_1\), \(a_2\) for which \(Q\) is smallest for the given discrete points. Setting the partial derivatives of \(Q\) with respect to \(a_0\), \(a_1\) and \(a_2\) to zero gives:

$$\left[\begin{array}{ccc}\sum x_i^2 & \sum x_iy_i & \sum x_i\\ \sum x_iy_i & \sum y_i^2 & \sum y_i\\ \sum x_i & \sum y_i & n\end{array}\right]\left[\begin{array}{c}a_0\\ a_1\\ a_2\end{array}\right]=\left[\begin{array}{c}\sum x_iz_i\\ \sum y_iz_i\\ \sum z_i\end{array}\right] \tag{4}$$
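The normal equations can be assembled and solved directly. The following sketch builds the 3 × 3 system from the point sums and solves it with Cramer's rule in plain Python, under the assumption that the points are not degenerate (for example, not all collinear in the x-y plane). The function name `fit_plane` is hypothetical, and a numerical library solver would be an equally valid choice.

```python
def fit_plane(points):
    """Least-squares fit of z = a0*x + a1*y + a2 to (x, y, z) points.

    Returns (a0, a1, a2), obtained from the normal equations of the
    plane-fitting problem.
    """
    n = len(points)
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    sz = sum(z for _, _, z in points)

    matrix = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]

    def det3(a):
        """Determinant of a 3 x 3 matrix given as nested lists."""
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point set: no unique fitting plane")

    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)
```

For points that lie exactly on a plane, the fit recovers that plane's coefficients; for noisy defect points it returns the plane that minimizes the sum of squared height residuals.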

We then project the point cloud onto the fitted plane and use the edge projection technique to obtain the planar projection of the defect, from which we calculate the defect area, the maximum diameter of the defect, the center of gravity of the defect, and so on. Note that the fitted plane cannot accurately represent the actual shape and height of the defect, but it can still be used to obtain other useful information about it. Characteristics such as the area, the maximum diameter and the center of gravity coordinates vary with the fitted plane. For the height information, the maximum difference between the z coordinates of the point cloud is taken as the height of the defect.

In defect classification, the four types of defects are divided into external and internal defects according to their coordinates. The edge of the object's surface serves as the boundary between the external and internal parts. The external part contains fold and warping defects, which are distinguished by their height: the lower one is a fold and the higher one is warping. The internal part contains fold, bubble and pit defects: a pit is identified by the sign of the height value, and a bubble is identified by the size of the defect. Based on an analysis of lithium-ion batteries, we set the classification criteria in Table 2. Whether a defect lies on the edge of the tested object is judged from the coordinates of its center point, using a boundary value of 10 mm. If the defect is external, it is a warping when its height *h* exceeds 2 mm and a fold otherwise. If the defect is internal, it is a pit when *h* is less than 0; when *h* is greater than 0, it is a bubble if its diameter *d* is less than 12 mm and a fold if *d* is greater than 12 mm.

Table 2

| Type | Internal defect | External defect | Height *h* (mm) | Diameter *d* (mm) |
| --- | --- | --- | --- | --- |
| fold | √ | - | h > 0 | d > 12 |
| bubble | √ | - | h > 0 | 0 < d < 12 |
| pit | √ | - | h < 0 | - |
| fold | - | √ | 0 < h < 2 | - |
| warping | - | √ | h > 2 | - |
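The criteria of Table 2 amount to a small decision tree, sketched below in Python. Several details are assumptions made for illustration: the function name `classify_defect`, the input `edge_distance` (distance from the defect center to the object edge, compared against the 10 mm boundary value), and the handling of the boundary cases *h* = 0, *h* = 2 and *d* = 12, which the table leaves undefined.

```python
EDGE_BAND_MM = 10.0        # boundary value used to decide edge (external) defects
WARP_HEIGHT_MM = 2.0       # external defects higher than this are warping
BUBBLE_DIAMETER_MM = 12.0  # internal raised defects smaller than this are bubbles


def classify_defect(edge_distance, h, d):
    """Classify one defect following the Table 2 criteria.

    edge_distance: distance in mm from the defect center to the object edge
                   (hypothetical input; the text only gives the 10 mm value).
    h:             signed defect height in mm (negative for depressions).
    d:             maximum defect diameter in mm.
    Boundary values (h == 0, h == 2, d == 12) fall into the second branch of
    each test, which is an assumption because Table 2 does not define them.
    """
    if edge_distance < EDGE_BAND_MM:
        # External defect: only the height decides.
        return "warping" if h > WARP_HEIGHT_MM else "fold"
    if h < 0:
        return "pit"
    return "bubble" if d < BUBBLE_DIAMETER_MM else "fold"
```

For example, an internal defect with *h* = 0.8 mm and *d* = 6 mm is classified as a bubble, while the same height with *d* = 20 mm is classified as a fold.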