2.1 Data Processing
Data processing is a critical component for applying ML to wire rupture prediction. During the WEDM process, CHMER machines record nearly 200 parameters per second, including controller error codes, the machining state, the G-code of the processing path, and the operation parameters. However, many parameters do not change during the process or do not affect the manufacturing; these are therefore excluded from training. Filtering the raw data and creating labels are the essential steps in data processing. In Fig. 1, the rupture point is at 23290, where the "Wirestate" code is 2. The 10 seconds of data before the rupture point (23280–23289) are "rupture data", labeled 0 in the "Wirestate" column; a 1 in the "Wirestate" column marks working data. As Fig. 1 shows, there is no apparent difference in the machining parameters between rupture data and working data. Therefore, ML is used to learn the trend of the parameters. There is another label, "Rupture Chance", which is 200 at the rupture point. The probability of rupture is 100% one second before the rupture point and decreases by 10% for each additional second earlier.
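The labeling rule above can be sketched as follows. This is a minimal illustration, not the authors' actual preprocessing code; the column names and the `label_rupture_window` helper are assumptions made for the example.

```python
import pandas as pd

# Labeling sketch (illustrative schema, not the machine's real log format).
# Wirestate: 2 = rupture point, 0 = the 10 s of rupture data before it,
#            1 = normal working data.
def label_rupture_window(df, rupture_idx, window=10):
    df = df.copy()
    df["Wirestate"] = 1                              # default: working data
    df.loc[rupture_idx, "Wirestate"] = 2             # the rupture point itself
    df.loc[rupture_idx - window:rupture_idx - 1, "Wirestate"] = 0

    # Rupture Chance: 100% one second before rupture, minus 10% per
    # additional second earlier; 200 at the rupture point itself.
    df["RuptureChance"] = 0
    for k in range(1, window + 1):
        df.loc[rupture_idx - k, "RuptureChance"] = 100 - (k - 1) * 10
    df.loc[rupture_idx, "RuptureChance"] = 200
    return df
```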
This study aims to build a model that predicts the wire state and prevents wire rupture. Therefore, 10 seconds of data were used as one data set to predict the wire state in the next second. Furthermore, choosing which recorded manufacturing parameters serve as the neural network input is a critical step. The machine tool used in this research already has a monitor that records the manufacturing parameters. Among all recorded data, gap voltage (GT), feed rate (FR), and water resistance (WR) were chosen as the input data according to the experimental results in previous research [6]. A set containing 30 values (3 parameters over 10 seconds of the machining process) was considered one input data set for ML. In total, 298 data sets were used for training, validation, and testing. Wire rupture prediction can be treated as a classification problem with two states, i.e., the 'wire rupture' and 'normal working' classes. Therefore, equal numbers of data sets were collected for these two converse states so that class imbalance in the classification could be avoided.
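Assembling one input set from the three chosen parameters can be sketched as below. The function name and the flattening order (GT first, then FR, then WR) are assumptions for illustration; the paper only specifies that 30 values form one set.

```python
import numpy as np

# Build one ML input set: 3 parameters x 10 s = 30 values, flattened
# into a single 1-D array (parameter order is an assumption here).
def make_input_set(gt, fr, wr):
    window = np.stack([gt, fr, wr])   # shape (3, 10): one row per parameter
    assert window.shape == (3, 10), "each parameter needs 10 s of samples"
    return window.flatten()           # 1-D input vector of length 30
```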
Three groups of data sets, i.e., 'Training', 'Validation', and 'Testing', were formed by random selection from the 298 sets: 214 sets (72%) were used for training the network, 24 sets (8%) for validation, and 60 sets (20%) for testing. The proportions of these groups are shown in Fig. 2.
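The random 214/24/60 split can be reproduced with a few lines of standard-library code. The fixed seed is an illustrative choice for repeatability, not a detail from the paper.

```python
import random

# Randomly partition the 298 data sets into the paper's
# 214 (72%) / 24 (8%) / 60 (20%) training/validation/testing groups.
def split_datasets(datasets, seed=0):
    rng = random.Random(seed)         # fixed seed: illustrative assumption
    shuffled = datasets[:]
    rng.shuffle(shuffled)
    train = shuffled[:214]
    val = shuffled[214:238]
    test = shuffled[238:]
    return train, val, test
```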
2.2 Architecture of Network
In this research, an ANN was used to analyze the relationship between the sensor data and the state of wire rupture. Generally, an ANN is made up of neurons connected via links, as shown in Fig. 3. The feed-forward neural network trained with back-propagation is the best general-purpose model among the various neural network models. As mentioned in [14], the ANN is the network most commonly used to solve classification problems.
An ANN consists of one input layer, one or several hidden layers, and one output layer. The input layer is a 1-D array of a specific length. Owing to the flexibility of the ANN, various combinations of layer numbers and neuron numbers were tested in this study to find a better architecture. First, one hidden layer with different numbers of neurons was tested; the neuron counts with high performance were expected to be chosen for the next experiment. Then, for the same total number of neurons, the number of hidden layers was varied from 1 to 4 to obtain the best network. For example, if N total neurons are used in the network, each layer contains N/Ln neurons, where Ln is the number of hidden layers.
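The architecture search described above can be sketched as a small grid: for a fixed neuron budget N, the neurons are split evenly across 1 to 4 hidden layers. The function names are assumptions; integer division is used here for the N/Ln split when N is not divisible by Ln.

```python
# Grid of candidate architectures: total neuron count N split evenly
# across Ln = 1..4 hidden layers (N // Ln neurons per layer).
def hidden_layer_sizes(total_neurons, num_layers):
    per_layer = total_neurons // num_layers
    return [per_layer] * num_layers

def architecture_grid(total_neurons):
    return {ln: hidden_layer_sizes(total_neurons, ln) for ln in range(1, 5)}
```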
In this research, the ANN has two output values, which are mapped into the range 0 to 1 by the softmax function. The classification layer uses the values passed from the softmax layer to classify the input into two classes, 'wire rupture' and 'no wire rupture'. The loss function is the cross-entropy loss, given in Eq. 1, which is suited to multi-class classification problems with mutually exclusive classes. The complete model, built in MATLAB, is shown in Fig. 4.
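The softmax mapping and the cross-entropy loss of Eq. 1 can be sketched in a few lines. This is a NumPy illustration of the standard formulas, not the paper's MATLAB implementation.

```python
import numpy as np

# Numerically stable softmax: maps raw outputs to probabilities in (0, 1).
def softmax(z):
    z = z - np.max(z)            # subtract max to avoid overflow in exp
    e = np.exp(z)
    return e / e.sum()

# Cross-entropy loss of Eq. 1: L = -sum_{i=1}^{N} y_i * log(yhat_i),
# where y is the one-hot true label and yhat the softmax output.
def cross_entropy(y_true, y_pred, eps=1e-12):
    return -np.sum(y_true * np.log(y_pred + eps))
```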
L = -\sum_{i=1}^{N} y_i \log(\hat{y}_i)

Eq 1. The cross-entropy loss function, where N is the number of classes, y_i is the real value, and \hat{y}_i is the predicted value; both y_i and \hat{y}_i lie between 0 and 1.
2.3 Finding the Important Parameters
During the training process, it was found that the trained network may not perform better if too many input parameters are included in a data set. By observing the accuracy, the valuable parameters may be identified.
Three parameters (GT, FR, WR) were considered in the first experiment. In this part, networks were trained with the different parameter combinations shown in Table 1, and the effect of each parameter on prediction accuracy was evaluated.
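Enumerating the candidate parameter subsets (as tabulated in Table 1) can be sketched with the standard library. The helper name is an assumption; each returned subset would be used to train a separate network whose accuracy is then compared.

```python
from itertools import combinations

# Enumerate every non-empty subset of the three input parameters,
# giving the 7 combinations to train and compare for accuracy.
def parameter_combinations(params=("GT", "FR", "WR")):
    combos = []
    for r in range(1, len(params) + 1):
        combos.extend(combinations(params, r))
    return combos
```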