The study of machine learning for wire rupture prediction in WEDM

During wire electrical discharge machining (WEDM), wire rupture can degrade the machined surface of the workpiece and increase the processing time. However, because of its complexity, only a few papers have focused on wire rupture in past decades. In this research, a machine learning (ML) technique was applied to analyze the relationship between manufacturing parameters and the chance of wire rupture. Three parameters, gap voltage (GV), feed rate (FR), and water resistance (WR), were used as training features, and a total of 298 data sets were used to train an artificial neural network (ANN). The results show that the prediction accuracy of wire rupture for 10 s in advance is above 85%. This research developed a new method for real-time prediction of wire rupture that is faster and more accurate than prior research. In addition, the method can be extended to future measured data as more usable sensor data become available.


Introduction
Nowadays, WEDM is one of the most widely used non-conventional machining processes. It can cut electrically conductive materials and produce complicated shapes that traditional machining processes cannot [1][2][3]. In recent years, it has also shown potential for cutting polycrystalline silicon [4,5]. Because of its importance, several optimization studies aimed at better manufacturing performance have been published [6][7][8]. WEDM uses a continuously traveling wire electrode made of copper, brass, or tungsten. The wire is kept in tension by a tensioning device, which reduces machining inaccuracy. During the process, the material ahead of the wire is eroded by heat. There is no direct contact between the workpiece and the wire, which eliminates mechanical stresses during machining [9]. The wire electrode is therefore the key component of the machining process [10]. The risk of wire breakage has undermined the full potential of WEDM, drastically reducing the efficiency and accuracy of the process. Several factors lead to wire breakage, such as high wire tension, electrical discharge impact, thermal load, and high temperature, all of which affect the wire strength and, consequently, wire rupture [11]. Moreover, these phenomena arise from the machining parameters, so the relationship between the machining parameters and the possibility of wire rupture can be considered a complicated nonlinear problem.
ML, a class of computer algorithms that extract useful knowledge from experience, has been shown to be an effective way to improve machining [12]. An ML algorithm builds a mathematical model from sample data, known as "training data," to make predictions or decisions. "Testing data" are then used to verify that the accuracy and robustness of the model are suitable for the application. In recent years, ML has frequently been used to solve problems that could otherwise be solved only with a great deal of experience or data. It has been applied to many complicated nonlinear problems in machining processes with outstanding performance [13]. Goswami and Samuel [8] predicted dimensional features from the input and condition parameters using a support vector machine (SVM). Sharma et al. [14] built an ANN model of surface roughness in terms of speed, feed, depth of cut, and approaching angle. Karayel [15] predicted surface roughness with an ANN and obtained values close to the measured ones.
One of the critical steps in machine learning is data processing. The quality and correctness of the data set have a clear, non-negligible effect on the training result. Therefore, the collected data should be carefully preprocessed using domain knowledge before being fed into the neural network. In this research, the machining status falls into two states. The first is the normal machining process, in which the wire does not break. The second is wire rupture, in which the machining process stops because the wire breaks. Because the original data contain no wire-state label, data processing is needed to create one. The method used to process the data set is described in detail in the next section. With these two wire states, the task is a classification problem, and ML offers many ways to solve classification problems. Therefore, an ANN was applied to analyze the relationship between manufacturing parameters and the chance of wire rupture in WEDM.

Data processing
Data processing is a critical component of applying ML to wire rupture prediction. During the WEDM process, the CHMER machine records almost 200 parameters every second, including the error code of the controller, the machining state, the G-code of the processing path, and the operating parameters. However, many parameters do not change during the process or do not affect the machining, and these are excluded from training.
Filtering the raw data and creating labels are the essential steps in data processing. Table 1 contains part of the processed data and gives the definition of each parameter. The wire state is derived from the machining state in the raw sensor data. For example, the rupture point is at data stamp 23,290, and its wire-state label is 2. The data at data stamps 23,280 to 23,289, immediately before the rupture point, are "rupture data," and their wire-state label is 0. Finally, label 1 denotes normal working data. In addition, the author defined a rupture level based on Fig. 1. The rupture level is intended to relate changes in the manufacturing features to wire rupture; it is prepared for future work and is not used in this research. Table 1 shows no apparent difference in the machining parameters between rupture data (wire state = 0) and working data (wire state = 1). Therefore, ML was used in this research to learn the trend of the parameters.
This study aims to build a model that predicts the wire state so that wire rupture can be prevented. Therefore, 10 s of data were used as one data set to predict the wire state in the next second. Choosing which recorded manufacturing parameters to use as the neural network input is also an important step. The machine tool used in this research already has a monitor that records the manufacturing parameters. Among all recorded data, gap voltage (GV), feed rate (FR), and water resistance (WR) were chosen as input data according to the experimental results of previous research [7]. A set of 30 values, covering these three parameters over 10 s of machining, was taken as one input data set for ML. In total, 298 data sets were used for training, validation, and testing. The prediction of wire rupture can be considered a classification problem with two classes, "wire rupture" and "normal working." The same number of data sets was therefore collected for each of the two classes to avoid an unbalanced-data situation in classification. Three groups of data sets, "training," "validation," and "testing," were formed by random selection from the 298 sets: 214 sets (72%) were used for training the network, 24 sets (8%) for validation, and 60 sets (20%) for testing. The percentages of these groups are shown in Fig. 2.
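The labeling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `label_wire_states`, its arguments, and the use of zero-based data stamps as list indices are hypothetical and are not taken from the authors' processing code.

```python
def label_wire_states(n_samples, rupture_points, lead=10):
    """Return one wire-state label per one-second data stamp.

    2 = rupture point, 0 = "rupture data" (the `lead` stamps before a
    rupture point), 1 = normal working data.
    """
    labels = [1] * n_samples                      # default: normal working data
    for r in rupture_points:                      # mark the stamps before each rupture
        for t in range(max(0, r - lead), r):
            labels[t] = 0
    for r in rupture_points:                      # mark rupture points last so they are never overwritten
        labels[r] = 2
    return labels


# Example from the text: rupture at data stamp 23,290.
labels = label_wire_states(23300, [23290])
```

With this rule, stamps 23,280 to 23,289 receive label 0, stamp 23,290 receives label 2, and every other stamp keeps label 1.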
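A minimal sketch of how one 30-value input and the 214/24/60 split could be formed is given below. The ordering of the values inside an input (all GV values, then FR, then WR), the fixed random seed, and the helper names `make_window` and `split_sets` are assumptions made for illustration; the text does not specify them.

```python
import random


def make_window(gv, fr, wr, end, length=10):
    """Flatten the `length` seconds before data stamp `end` into one input.

    Assumed layout: 10 GV values, then 10 FR values, then 10 WR values.
    """
    start = end - length
    return list(gv[start:end]) + list(fr[start:end]) + list(wr[start:end])


def split_sets(data_sets, n_train=214, n_val=24, seed=0):
    """Randomly split the data sets into training, validation, and testing groups."""
    shuffled = list(data_sets)
    random.Random(seed).shuffle(shuffled)         # seed is only for reproducibility
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Applied to 298 data sets, `split_sets` returns groups of 214, 24, and 60 sets, matching the proportions reported above.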

Architecture of network
In this research, an ANN was used to analyze the relationship between the sensor data and the state of wire rupture. Generally, an ANN is made up of neurons connected by links, as shown in Fig. 3. The feed-forward neural network trained by back-propagation is regarded as the best general-purpose model among the various neural network models. According to [16], the ANN is the most commonly used network for solving classification problems.
An ANN consists of one input layer, one or several hidden layers, and one output layer. The input layer is a 1D array of a specific length. Because the ANN is flexible, various combinations of layer numbers and neuron numbers were tested in this study to find a better architecture. First, one hidden layer with different numbers of neurons was tested. The neuron numbers that gave high performance were then chosen for the next experiment. For the same total neuron number, the number of hidden layers was varied from 1 to 4 to obtain the best network. For example, if N total neurons are used in the network, each layer has N/L_n neurons, where L_n represents the number of layers.
Fig. 1 The curve in Part B shows the relationship between the data stamp and the customized value, rupture level. For example, the data between data stamps 23,268 and 23,279 give level 1 at data stamp 23,280. The data between data stamps 23,269 and 23,280 give level 2 at data stamp 23,281. Finally, data stamps 23,280 to 23,289 lead to the maximum level 10 and the wire rupture at data stamp 23,290. Part A is the whole manufacturing process, and Part B is the portion from data stamp 23,270 to 23,294
Fig. 2 The whole data set is separated into training (72%), validation (8%), and test data (20%)
The ANN has two output values, which the SoftMax function maps to the range 0 to 1. The classification layer uses the values passed by the SoftMax layer to assign each input to one of two classes, "wire rupture" and "wire no rupture." The loss function is the cross-entropy loss function (Eq. 1), which is intended for multi-class classification problems with mutually exclusive classes. The complete model, built in MATLAB, is shown in Fig. 4.
Equation 1. The cross-entropy loss function.
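The equal division of neurons across hidden layers, N/L_n per layer, can be written as a short helper. The function name `layer_sizes` is hypothetical, and rounding down when N is not divisible by L_n is an assumption; the text does not say how such cases were handled.

```python
def layer_sizes(total_neurons, n_layers):
    """Split `total_neurons` equally over `n_layers` hidden layers (N / L_n each).

    Assumption: integer division is used when the split is not exact.
    """
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    return [total_neurons // n_layers] * n_layers


# 300 total neurons in two or three hidden layers.
two_layers = layer_sizes(300, 2)
three_layers = layer_sizes(300, 3)
```

For 300 total neurons this gives 150 neurons per layer with two hidden layers and 100 per layer with three.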
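The output stage can be illustrated with a small, dependency-free Python sketch. The equation itself is not reproduced in this text, so the standard cross-entropy form for mutually exclusive classes, loss = -sum over i = 1..N of y_i ln(ŷ_i), is assumed here. The function names and the small `eps` guard against log(0) are illustrative choices and do not describe the MATLAB implementation.

```python
import math


def softmax(z):
    """Map raw network outputs to values between 0 and 1 that sum to 1."""
    m = max(z)                                    # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Assumed standard form: -sum_i y_i * ln(y_hat_i) over the N classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Two raw outputs, e.g. for "wire rupture" and "wire no rupture".
probs = softmax([2.0, 0.0])
loss = cross_entropy([1.0, 0.0], probs)
```

The loss is small when the class marked 1 in `y_true` receives a probability close to 1 and grows as that probability falls.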

Finding the important parameter
During training, it was found that the trained network may not perform better if too many input parameters are included in a data set. The significant parameters should therefore be chosen as the input features to simplify the model for real-time application.
Three parameters (GV, FR, WR) were considered in the first experiment in Sect. 2.2. This experiment used different combinations of these three parameters, shown in Table 2, to train the network and evaluate the effect of the input parameters on the prediction accuracy. By comparing the accuracies, the valuable parameters can be identified and used as the input features.
In Eq. 1, N is the number of classes, y_i is the real value, and ŷ_i is the predicted value; y_i and ŷ_i are both between 0 and 1.
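The parameter combinations can be enumerated programmatically. The sketch below lists every non-empty subset of the three parameters together with the resulting input size, assuming 10 values per parameter. Table 2 is not reproduced in this text, so the ordering produced here is an assumption and may not match the case numbering used in the experiments.

```python
from itertools import combinations

PARAMS = ("GV", "FR", "WR")


def input_cases(params=PARAMS, window=10):
    """List every non-empty parameter combination with its network input size."""
    cases = []
    for k in range(len(params), 0, -1):           # largest combination first
        for combo in combinations(params, k):
            cases.append((combo, window * len(combo)))
    return cases


for number, (combo, size) in enumerate(input_cases(), start=1):
    print(number, "+".join(combo), size)
```

Three parameters give seven combinations, with input sizes of 30, 20, or 10 depending on how many parameters are included.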

Comparison of ANN architecture in different neuron numbers
In this research, different neuron numbers were used in the hidden layer to predict the state of wire rupture. Because the network weights are initialized randomly in each training run, every trained network is different even with the same data source and architecture. Therefore, each architecture was trained five times to check the model's robustness and average performance. Figure 5 shows the training results (blue line) and testing results (red line) of all architectures. The result for each architecture includes the highest, lowest, and median accuracy over the five runs. The accuracy becomes stable and similar when the neuron number varies from 100 to 300, which means that no more than 300 neurons are needed to model this nonlinear problem. However, the accuracy decreases as more neurons are added beyond 300. The reason is that when low-dimensional data are expanded into too many neurons, useless values and noise are generated. Therefore, using too many neurons in the network is unsuitable in this research. Table 3 lists some results for a learning rate of 0.001. Based on Table 3, this research suggests that a network with a single hidden layer of 130 neurons is suitable for predicting wire rupture. In addition, Fig. 7 shows the confusion matrices of the training and testing results for the single-hidden-layer network with 130 neurons.
The training described above used a learning rate of 0.001. The results for a learning rate of 0.0001 are shown in Fig. 6. Although the accuracy is not as high as in the previous training (learning rate = 0.001), the spread of the results for the same model is smaller than in Fig. 5. This result indicates that the lower learning rate is suitable for higher neuron numbers. However, the lower neuron numbers already give enough accuracy for this research. This research also tried more learning rates to observe their effect on model training, but, as shown in Table 4, this did not yield a notable result. In that experiment, three structures with 80, 130, and 180 neurons were trained with a series of learning rates from 0.002 to 0.0001. The standard deviation for each structure is so small that the effect of changing the learning rate is not apparent in either the training or the testing results.
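The per-architecture summary over repeated runs (highest, lowest, and median accuracy) can be computed as sketched below. The function name `summarize_runs` and the dictionary layout are hypothetical; the accuracies passed in would come from the five training runs of one architecture.

```python
from statistics import median


def summarize_runs(accuracies):
    """Summarize repeated training runs of one architecture.

    `accuracies` holds one accuracy value per run (five runs in this study).
    """
    if not accuracies:
        raise ValueError("at least one run is required")
    return {
        "highest": max(accuracies),
        "lowest": min(accuracies),
        "median": median(accuracies),
    }
```

Reporting the median together with the extremes shows both the typical performance of an architecture and how much it varies with the random weight initialization.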

Comparison of ANN architecture in different hidden layer numbers
To find a better ANN architecture, one to four hidden layers were compared. The total number of neurons was kept the same for all structures. For example, if 300 neurons are used in the structure, each layer contains 150 neurons in the 2-hidden-layer structure, 100 neurons in the 3-hidden-layer structure, and so on. During the experiment, the total number of neurons varied from 100 to 300, and the learning rate was 0.001. As shown in Fig. 8, the accuracy improves as more neurons and more hidden layers are used in the structure. However, when the neuron number increases to 400, Fig. 9 shows no apparent trend of improvement. By applying a basic artificial neural network to predict the wire state, the applicability of ML to the wire rupture problem in WEDM has been demonstrated. Machine learning is developing so quickly that many powerful networks are already available, and the wire rupture problem may also be solvable by other networks.

Comparison of different input parameters
After the ANN architecture study, a suitable network, one hidden layer with 100 to 300 neurons, was defined to predict the state of the wire. In this experiment, different combinations of the three parameters were used for training in order to find the critical parameter. For example, case 2 uses the gap voltage and the feed rate as the input parameters, so the input size of the network is 20. Case 1 was the control group, with one hidden layer and a learning rate of 0.001. All combinations show a range of steady accuracy for neuron numbers from 100 to 300. Each architecture in the cases in Table 3 was also trained five times, as shown in Figs. 10, 11, 12, 13, 14, 15, and 16. The median testing accuracy is shown in Table 5. Table 5 shows that the accuracy is close to 70% when feed rate (FR) is used as input and over 70% when gap voltage (GV) is used. However, water resistance (WR) shows low relevance to wire rupture. Comparing the results of case 1 and case 2, including water resistance even lowers the prediction accuracy.
Based on this result, the author proposes a control strategy. If the model predicts that the current machining parameters will lead to wire rupture, the manufacturing parameters are adjusted to change the gap voltage or the feed rate. A new controller can therefore be added to the original controller of the WEDM machine tool to increase efficiency. When the controller receives the 10-s data from the sensors, it predicts the wire's state in the next second and modifies the machining parameters according to the predicted result.
Fig. 8 The combinations of 1 to 4 hidden layers and 100 to 300 neurons
Fig. 9 The combinations of 1 to 4 hidden layers and 100 to 400 neurons
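The proposed control strategy can be outlined as a small monitoring loop. This is a conceptual sketch only: the class name `RuptureGuard`, the `predict` and `adjust` callables, and the feature ordering (GV values, then FR, then WR) are all hypothetical placeholders for the trained network and the machine controller, neither of which is specified in the text.

```python
from collections import deque


class RuptureGuard:
    """Keep the latest 10 s of sensor data and react to a predicted rupture."""

    def __init__(self, predict, adjust, window=10):
        self.predict = predict                    # hypothetical: features -> True if rupture is predicted
        self.adjust = adjust                      # hypothetical: changes the machining parameters
        self.buffer = deque(maxlen=window)        # oldest sample is dropped automatically

    def step(self, gv, fr, wr):
        """Process one second of data; return None until the window is full."""
        self.buffer.append((gv, fr, wr))
        if len(self.buffer) < self.buffer.maxlen:
            return None
        # Assumed layout: 10 GV values, then 10 FR values, then 10 WR values.
        features = [sample[i] for i in range(3) for sample in self.buffer]
        rupture_predicted = bool(self.predict(features))
        if rupture_predicted:
            self.adjust()                         # e.g. act on gap voltage or feed rate
        return rupture_predicted
```

In use, `step` would be called once per second with the newest sensor readings, so a prediction for the next second becomes available as soon as 10 s of data have been collected.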

Conclusion
This paper shows that an ML algorithm can be used to predict wire rupture in WEDM. Although only a basic artificial neural network was used in this research, and it may not be the best choice, it still performs well enough to show the potential of applying ML. This research uses a single hidden layer with 130 neurons to reach 85% accuracy. The author expects that other networks can be applied in the future to address wire rupture more efficiently. Moreover, the ML tool, together with other reference studies, helps the researcher determine the critical machining parameters for wire rupture. Finding the critical parameters makes the prediction network more efficient and accurate in an actual control application. In the future, the author proposes combining the model with other algorithms that can find suitable combinations of machining parameters to avoid wire breakage. This would be an automatic strategy that does not require operator experience: when the prediction model indicates wire rupture, the other algorithms automatically adjust the machining parameters to avoid it.