Surface roughness prediction and optimization in the REMF process using an integrated DBN-GA approach

Surface roughness is a crucial factor affecting the surface quality of workpieces in manufacturing industries. Thus, it is important to predict surface roughness accurately and to identify optimal process parameters in order to reduce processing time and cost. In this study, two predictive models, namely multiple linear regression and deep belief network (DBN) models, were developed to accurately predict the change in surface roughness in the rotational electromagnetic finishing (REMF) process. Compared to the statistical model, the data-driven model based on the DBN architecture predicted surface roughness in the REMF process considerably better. Among the considered DBN models, the DBN5 architecture [7, 14, 14, 1] captured the nonlinear relationship between process parameters and response most effectively, with the highest coefficient of determination (R²) of 0.9340 and the lowest mean squared error (MSE) of 1.3037 × 10⁻³ on the testing datasets. In addition, a genetic algorithm (GA), a heuristic optimization technique, was adopted to optimize the input parameters of the best derived DBN model. The maximum change in surface roughness was 0.530 at a particle length of 3 mm, particle diameter of 0.7 mm, particle weight of 1.3 kg, liquid water quantity of 1.0 l, rotational speed of 1323 rpm, working time of 35 min, and initial surface roughness of 2.5478 μm.


Introduction
Recently, with fast-growing requirements for improved surface integrity and functional performance in ultraprecision engineering, superior surface quality of manufactured components is in great demand. Surface roughness is one aspect of surface quality assessment that significantly affects the reliability, durability, wettability, reflection, and friction of products. Hence, a number of researchers are making considerable efforts to develop surface finishing technologies to improve surface roughness [1][2][3]. Traditional finishing processes are applied in a wide range of industries. However, they still face inherent limits in terms of surface defects and uneven surface finish, especially on complicated geometries. In the current manufacturing field, the latest research shows a trend away from traditional finishing methods towards advanced techniques that achieve high surface finishing performance for micro-sized components with complex shapes. Among the advanced finishing technologies, magnetic energy-assisted finishing processes have proven effective for ultra-smooth surfaces over the last few decades, since the multiple cutting edges of tools under a controllable magnetic force can achieve a high degree of micro/nanosurface finishing performance [4,5].
Numerous studies of magnetic surface finishing have been conducted from diverse points of view to improve process efficiency and establish predictive models representing the relationship between process parameters and surface roughness improvement. Singh and Singh simulated the magnetic flux density in the magnetorheological finishing (MRF) operation to determine a theoretically optimized experimental setup that made the magnetic intensity on the external surface of a conical-shaped material uniform. A theoretical surface roughness model and an optimal condition for fine surface finish were established based on the simulation. In addition, experiments at the optimal condition were carried out to validate the predictive model. It was concluded that the suggested MRF method was adequate for improving the micrometric surface characteristics of the tested materials [6]. Sirwal et al. developed magnetic-assisted tools that perform reciprocating and rotational motion to achieve high surface quality with tight tolerance for cylindrical blind surfaces in the MRF process. Based on response surface methodology (RSM), the researchers analyzed the influence of process parameters on surface characteristics and derived a mathematical model of the finished surface variation. In the experimental and statistical analysis, finishing efficiency was predominantly determined by rotational speed, and the surface roughness of the finished areas improved by 85% at the optimal condition. Furthermore, the error of the predictive model was less than 10% compared to the experimental work. Thus, the derived model for surface roughness improvement was quite reliable [7]. Nagdeve et al. obtained a precise surface on an intricate implant by adopting rotational magnetorheological abrasive flow finishing (MAFF) with the help of a special fixture that maintained constant abrasive velocity and magnetic field intensity in the finishing zone.
To provide a predictive model of surface improvement and to optimize the input variables, RSM was employed as the statistical analysis. The results showed that surface quality improved by 73% in less time compared to the previous MAFF process [8]. Misra et al. established theoretical models by applying a genetic algorithm to perform multi-objective optimization in terms of maximum surface roughness improvement and minimum material loss in the ultrasonic-assisted magnetic abrasive finishing (MAF) process. In order to validate the optimized models, linear regression models were derived from experiments based on a Taguchi orthogonal array. The results obtained from confirmatory experiments were in good agreement with the theoretical model [9]. Li et al. presented the deformation characteristics and plastic damage mechanism of single crystal components by using an ultra-precision abrasive machining process combined with SEM and TEM. In addition, a theoretical model for predicting surface morphologies and roughness was developed as an effective and economical analysis. The theoretical results and experimental observations were found to be in good agreement [10,11]. Ahmad et al. applied the MAF process with sintered magnetic abrasives of various diameters to improve the capability for micrometric surface finish. Statistical models for the ratio of change in surface roughness were developed based on experimental observations, and the predicted and experimental results were quantitatively similar [12]. Although mathematical models for surface roughness prediction by regression analysis are widely available, they cannot accurately describe the complex nonlinearity of these processes because uncontrollable factors influence the relative motion between the workpiece surface and the flexible tools [13][14][15].
With the development of computer technologies, artificial neural networks (ANNs), unlike statistical models, have become popular as a robust predictive strategy for accurately establishing the nonlinear relationship between input variables and response. A deep neural network (DNN), one of the supervised learning models among ANNs, is commonly used and has demonstrated outstanding predictive ability. Thus, a number of researchers have attempted to build predictive models of surface roughness that capture complex nonlinear relationships. Peng et al. presented a cutting force model based on both linear regression and DNN methodologies. The DNN-based model outperformed the existing model [16]. Ahmad et al. developed a DNN-based model for three objectives, namely change in surface roughness, microhardness, and modulus of elastic indentation, for Ti-6Al-4V material in the MAF process. Moreover, a genetic algorithm (GA) was employed to optimize the system. As a result, the combined DNN-GA-based model was recommended for obtaining suitable output in comparison with the experimental trials [17]. Singh et al. introduced an ANN-MFO learning algorithm in the MAF process to improve final surface integrity and optimize process parameters. The effectiveness of this hybrid methodology was verified with a minimum error rate [18]. Kooialippor et al. built a predictive model for the performance assessment of the penetration rate of a tunnel boring machine. In order to evaluate the prediction performance of ANN and deep belief network (DBN) models, the root mean square error (RMSE) and coefficient of determination (R²) were compared. Based on the results, the study showed that DNN was a promising tool for prediction with a large amount of data [19]. Stojanović et al. used the DNN to predict the friction coefficient and wear rate of hybrid aluminum matrix composites.
In that study, the predictions of the DNN-driven model agreed with the experimental verification at an accuracy of 99% [20]. Despite this success in prediction, the DNN still experiences difficulties in terms of local minima, vanishing gradients, and slow convergence. In order to address these limitations, the DBN with its hierarchical structure has attracted attention as a promising tool in current studies and practice.
This study aimed to establish an accurate prediction model for the surface roughness of stainless steel (SS) 316 in the rotational electromagnetic finishing (REMF) process to improve surface quality. In order to develop a predictive model that explains the nonlinearity of the process well, a data-driven model based on a hierarchical DBN architecture was adopted. The DBN model was compared to a statistical model obtained from multiple linear regression using three statistical indicators, namely R², mean squared error (MSE) as the cost function, and the F-test, to determine whether the data-driven model suggested in this study was accurate and reliable in the REMF process. In addition, a genetic algorithm was employed to find the optimal input parameters of the best predictive model.

Theoretical background of deep belief network
The DBN is a hierarchical neural network built from numerous stacked restricted Boltzmann machines (RBMs). Therefore, this section introduces the theoretical background of the RBM as the basis of the DBN in detail.

Restricted Boltzmann machine
The RBM is an energy-based stochastic graphical model consisting of a visible layer v at the bottom with m neurons and a hidden layer h at the top with n neurons, as shown in Fig. 1. In the RBM structure, neurons in adjacent layers are fully connected by symmetric weights w, without intralayer connections. Because the neurons within a layer are independent of each other given the other layer, the training process is simplified and training efficiency improves. There are two types of RBM, called the Bernoulli-Bernoulli RBM and the Gaussian-Bernoulli RBM, depending on the data distribution of the visible neurons.

Bernoulli-Bernoulli RBM
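The Bernoulli-Bernoulli RBM consists of binary states v ∈ {0, 1}^m in the visible layer and stochastic binary features h ∈ {0, 1}^n in the hidden layer extracted from the visible units. The energy function for a joint configuration (v, h) in the Bernoulli-Bernoulli RBM can be expressed as follows:
$$E(v, h; \theta) = -\sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m} \sum_{j=1}^{n} v_i w_{ij} h_j \tag{1}$$
where v_i is the ith visible variable, h_j is the jth hidden variable, w denotes the weight matrix that connects the visible and hidden variables, w_ij is the weight of the connection between v_i and h_j, and a and b are the biases of the visible and hidden nodes, respectively. θ = (a, b, w) denotes the model parameters. Given the energy function, the joint probability distribution over all neurons in each layer is defined as follows:
$$P(v, h; \theta) = \frac{1}{Z(\theta)} \exp\left(-E(v, h; \theta)\right), \qquad Z(\theta) = \sum_{v} \sum_{h} \exp\left(-E(v, h; \theta)\right) \tag{2}$$
where Z is the partition function or normalization constant, which sums over all configurations of the visible and hidden variables.
In the training procedure of the RBM, the model parameters θ are optimized by stochastic gradient descent to maximize the log-likelihood of the training data (input variables), which corresponds to minimizing the energy function of the system. The log-likelihood function for a set of N training data v^(1), ..., v^(N) is
$$\mathcal{L}(\theta) = \sum_{k=1}^{N} \log P(v^{(k)}; \theta), \qquad P(v; \theta) = \sum_{h} P(v, h; \theta) \tag{3}$$
In order to update the main parameters, the derivative of the log-likelihood with respect to the weight, visible bias, and hidden bias must be calculated. Each derivative and the corresponding parameter update are expressed in Eq. (4), Eq. (5), and Eq. (6):
$$\frac{\partial \log P(v; \theta)}{\partial w_{ij}} = \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model}, \qquad w_{ij} \leftarrow w_{ij} + \varepsilon \left( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} \right) \tag{4}$$
$$\frac{\partial \log P(v; \theta)}{\partial a_i} = \langle v_i \rangle_{data} - \langle v_i \rangle_{model}, \qquad a_i \leftarrow a_i + \varepsilon \left( \langle v_i \rangle_{data} - \langle v_i \rangle_{model} \right) \tag{5}$$
$$\frac{\partial \log P(v; \theta)}{\partial b_j} = \langle h_j \rangle_{data} - \langle h_j \rangle_{model}, \qquad b_j \leftarrow b_j + \varepsilon \left( \langle h_j \rangle_{data} - \langle h_j \rangle_{model} \right) \tag{6}$$
where ⟨·⟩_data is the expectation computed over the given training data, ⟨·⟩_model is the expectation over the distribution defined by the model, and ε is the learning rate. However, the latter expectation is computationally intractable. Thus, the contrastive divergence (CD) algorithm is used as an approximation method with t iterations of Gibbs sampling, where ⟨·⟩_model is replaced by ⟨·⟩_t so that the model distribution can be estimated easily. Figure 2 shows the stochastic procedure of t steps of Gibbs sampling. In the general RBM, one step of Gibbs sampling is sufficient to obtain adequate values.
Because the neurons within a layer are independent of each other, the stochastic binary features of the hidden neurons h^(0) are determined by the given visible variables v^(0). The relevant conditional probability is given in Eq. (7):
$$P\left(h_j^{(0)} = 1 \mid v^{(0)}\right) = \sigma\left(b_j + \sum_{i=1}^{m} v_i^{(0)} w_{ij}\right) \tag{7}$$
where σ(·) is the sigmoid activation function.
In the next step, the visible units v^(1) are reconstructed based on the computed h^(0) as follows:
$$P\left(v_i^{(1)} = 1 \mid h^{(0)}\right) = \sigma\left(a_i + \sum_{j=1}^{n} w_{ij} h_j^{(0)}\right) \tag{8}$$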
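The one-step contrastive divergence update described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `cd1_update`, the synthetic binary data, the learning rate of 0.1, and the use of reconstruction probabilities rather than sampled visible states in the negative phase are all assumptions made here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, w, a, b, lr, rng):
    """One CD-1 parameter update for a Bernoulli-Bernoulli RBM (hypothetical helper).

    v0 has shape (batch, m); w has shape (m, n); a and b are the visible
    and hidden biases.
    """
    # Positive phase: P(h = 1 | v0), then sample binary hidden states h0.
    ph0 = sigmoid(v0 @ w + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible layer from h0, then recompute
    # the hidden probabilities (probabilities are used instead of samples,
    # which is an assumption of this sketch).
    pv1 = sigmoid(h0 @ w.T + a)
    ph1 = sigmoid(pv1 @ w + b)
    # <.>_data - <.>_1 stands in for the intractable model expectation.
    batch = v0.shape[0]
    w = w + lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    a = a + lr * (v0 - pv1).mean(axis=0)
    b = b + lr * (ph0 - ph1).mean(axis=0)
    return w, a, b

rng = np.random.default_rng(0)
m, n_hidden = 7, 14
w = rng.normal(0.0, 0.01, size=(m, n_hidden))   # small Gaussian initial weights
a = np.zeros(m)
b = np.zeros(n_hidden)
v = (rng.random((60, m)) < 0.5).astype(float)   # synthetic binary data, not the paper's
for _ in range(100):
    w, a, b = cd1_update(v, w, a, b, lr=0.1, rng=rng)
```

Using a single Gibbs step keeps each update cheap, which is the reason the text gives for preferring CD over the exact gradient.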

Gaussian-Bernoulli RBM
In numerous practical applications, the input variables of the visible neurons are real-valued rather than binary. In that case, the Gaussian-Bernoulli RBM is adopted instead of the Bernoulli-Bernoulli RBM; it consists of normalized real-valued data with a Gaussian distribution in the visible layer and binary variables in the hidden layer. The energy function for a joint configuration (v, h) in the Gaussian-Bernoulli RBM can be expressed by Eq. (9):
$$E(v, h; \theta) = \sum_{i=1}^{m} \frac{(v_i - a_i)^2}{2 \sigma_i^2} - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{v_i}{\sigma_i} w_{ij} h_j \tag{9}$$
where σ_i denotes the standard deviation of the visible neuron v_i.
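As a small numerical illustration, the sketch below evaluates a Gaussian-Bernoulli energy for one configuration. It assumes the commonly used form of this energy (quadratic visible term, linear hidden term, and an interaction term scaled by the standard deviation); the function name `gb_rbm_energy` and the numbers are hypothetical.

```python
import numpy as np

def gb_rbm_energy(v, h, a, b, w, sigma):
    """Energy of a joint configuration (v, h) in a Gaussian-Bernoulli RBM.

    Assumed form: sum_i (v_i - a_i)^2 / (2 sigma_i^2)
                  - sum_j b_j h_j
                  - sum_ij (v_i / sigma_i) w_ij h_j
    """
    quadratic = np.sum((v - a) ** 2 / (2.0 * sigma ** 2))
    hidden_bias = np.dot(b, h)
    interaction = (v / sigma) @ w @ h
    return float(quadratic - hidden_bias - interaction)

# Hypothetical toy configuration: two visible units, one hidden unit.
v = np.array([1.0, 0.0])
h = np.array([1.0])
a = np.zeros(2)
b = np.array([0.5])
w = np.array([[2.0], [3.0]])
sigma = np.ones(2)
energy = gb_rbm_energy(v, h, a, b, w, sigma)   # 0.5 - 0.5 - 2.0
```

With inputs normalized to unit variance, sigma is often fixed at 1, in which case the interaction term reduces to the Bernoulli-Bernoulli one.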

Deep belief network
The basic architecture of the DBN consists of multiple stacked RBMs, as shown in Fig. 3. A DBN model is trained in two phases: pre-training and fine-tuning.
In the pre-training step, the stacked RBMs are trained one at a time from the bottom to the top to initialize the weights and biases layer by layer. Each trained hidden layer in the lower RBM serves as the visible layer of the next upper RBM. This unsupervised learning process is repeated until training of the last hidden layer is finished. After the upward training, the fine-tuning phase is performed, in which the model parameters are adjusted from top to bottom by supervised learning with the back-propagation algorithm to minimize the error between the output and the actual measurement.
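The bottom-up stacking in the pre-training phase can be sketched as follows. This is a simplified illustration under stated assumptions: every layer is treated as a Bernoulli RBM trained with one-step contrastive divergence on inputs scaled to [0, 1], the hidden activation probabilities are passed upward as the next layer's input, and the names `train_rbm` and `pretrain_dbn`, the epoch count, and the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs, lr, rng):
    """Train a single RBM with CD-1 and return (weights, hidden biases)."""
    n_visible = data.shape[1]
    w = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
    a = np.zeros(n_visible)
    b = np.zeros(n_hidden)
    for _ in range(epochs):
        ph0 = sigmoid(data @ w + b)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ w.T + a)
        ph1 = sigmoid(pv1 @ w + b)
        w += lr * (data.T @ ph0 - pv1.T @ ph1) / len(data)
        a += lr * (data - pv1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return w, b

def pretrain_dbn(x, hidden_sizes, epochs=50, lr=0.001, seed=0):
    """Greedy layer-wise pre-training: each trained hidden layer feeds the next RBM."""
    rng = np.random.default_rng(seed)
    layers, layer_input = [], x
    for n_hidden in hidden_sizes:
        w, b = train_rbm(layer_input, n_hidden, epochs, lr, rng)
        layers.append((w, b))
        # The hidden activations of this RBM become the visible data of the next.
        layer_input = sigmoid(layer_input @ w + b)
    return layers

x = np.random.default_rng(1).random((60, 7))     # synthetic inputs in [0, 1]
layers = pretrain_dbn(x, hidden_sizes=[14, 14])
```

The returned weights and hidden biases would then initialize the hidden layers of the network that is fine-tuned with back-propagation.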
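The DBN model obtained through these processes is defined by the following joint probability distribution:
$$P(v, h^{1}, \ldots, h^{l}) = P(v \mid h^{1}) \left( \prod_{i=1}^{l-2} P(h^{i} \mid h^{i+1}) \right) P(h^{l-1}, h^{l}) \tag{10}$$
where l is the number of hidden layers in the DBN architecture.
The conditional probability distribution P(h^i | h^(i+1)) can be obtained as Eq. (11):
$$P(h^{i} \mid h^{i+1}) = \prod_{j} P(h_j^{i} \mid h^{i+1}), \qquad P(h_j^{i} = 1 \mid h^{i+1}) = \sigma\left(b_j^{i} + \sum_{k} w_{jk}^{i+1} h_k^{i+1}\right) \tag{11}$$

Experimental setup and DBN architecture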

Principle of REMF operation
The REMF process is able to finish complex geometries down to a micro/nano-level surface finish. Figure 4 shows a schematic drawing of the REMF process. It was separated into three parts according to the operating steps: a finishing area, a driving disc, and a controller. The container serving as the finishing area held the workpieces and a number of cylindrical abrasive particles in diluted water. When the magnetic field induced by the driving disc with embedded permanent magnets was applied, the abrasive particles not only aligned in the direction of the magnetic field but also experienced an attractive force. The permanent magnets on the driving disc were arranged alternately. When the driving disc was rotated by an AC motor, which was part of the controller, an alternating magnetic field was induced. As a result, the abrasives shown in Fig. 5 exhibited dynamic behavior that included radial and rotating motions along the direction of the magnetic field. The magnetic force and torque acting on the abrasives were represented by Eq. (12) and Eq. (13), respectively, where F_p and T_p were the magnetic force and torque acting on the abrasive particle, m was the magnetic pole, χ and V denoted the susceptibility and volume of the abrasive particle, respectively, and H was the magnetic field. The magnetic energy induced kinetic energy through this dynamic motion, and the particles improved surface roughness by colliding with the workpiece. The total kinetic energy acting on the abrasive particle was given as follows:
$$E_k = \frac{1}{2} M v^{2} + \frac{1}{2} I \omega^{2} \tag{14}$$
where M and v were the mass and velocity of the abrasive particle, respectively, I was the moment of inertia, and ω was the angular velocity.
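For a quick feel of how the translational and rotational terms combine, the total kinetic energy of a particle can be evaluated as below. The sketch assumes the usual rigid-body form (one half M v squared plus one half I omega squared); the function name and the numeric inputs are hypothetical and are not measurements from the REMF experiments.

```python
def total_kinetic_energy(mass, velocity, inertia, omega):
    """Translational plus rotational kinetic energy of one particle (SI units)."""
    translational = 0.5 * mass * velocity ** 2
    rotational = 0.5 * inertia * omega ** 2
    return translational + rotational

# Hypothetical round numbers chosen only to make the arithmetic easy to follow:
# 0.5 * 2 * 3**2 = 9 and 0.5 * 4 * 5**2 = 50.
energy = total_kinetic_energy(mass=2.0, velocity=3.0, inertia=4.0, omega=5.0)
```

Because both terms grow with the square of the speed, raising the rotational speed of the driving disc increases the energy delivered per collision much faster than linearly.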

Experimental setup and measurement
The surface finishing performance in the REMF process was affected by magnetic and kinetic energy. As mentioned in Sect. 3.1, the energy intensity was determined by the physical properties of the abrasive particles and the rotational speed. Thus, six controllable factors, namely particle length, particle diameter, particle weight, diluted water quantity, rotational speed, and finishing time, were selected as listed in Table 1.
Other fixed conditions were as follows. Figure 6 shows external and internal views of the experimental device (SS370, Bhl) for the REMF process. The diameter of the finishing region was 370 mm. In this study, the abrasive particles were cylindrical SS304, and the workpiece was an SS316 plate with dimensions of 50 × 35 × 5 mm. To achieve a high-quality SS316 surface, the workpiece was placed at a radial distance of 65 mm from the center, where the maximum magnetic flux density of 110 mT was measured. The diluted liquid was a mixture of compound and water at a volume ratio of 1:100, which helped disperse the abrasive particles uniformly over the finishing region. On the basis of the determined parameters with their corresponding levels, a total of 72 experiments were carried out using the mixed-level orthogonal array L18 (2^1 × 3^7) with 4 iterations. In order to investigate the effect of the REMF process on the surface finish of the SS316 workpiece, the surface roughness Ra was measured with a stylus profilometer (SJ-301, Mitutoyo). A diamond stylus tip diameter of 2 mm, a cut-off value of 0.8 mm, and a total measuring length of 4 mm were used for the measurement. Figure 7 illustrates the measurement points selected in this study. Surface roughness at each point was measured ten times, and the three values closest to the mean were chosen to improve the accuracy of the measurement. Before the REMF process, the SS316 workpieces were mechanically polished, and the average initial surface roughness was about 1.35 μm in Ra. Since the initial surface condition was not equal over all the workpieces, the ratio of change in surface roughness, a dimensionless coefficient, was adopted for the assessment as given in Eq. (15):
$$\Delta SR = \frac{R_{a,initial} - R_{a,final}}{R_{a,initial}} \tag{15}$$
where R_a,initial and R_a,final were the initial and final surface roughness, respectively.
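The measurement reduction and the roughness ratio can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the three readings closest to the mean are averaged into one representative value, and the ratio is taken as (initial minus final) divided by initial. The function names and the sample readings are hypothetical.

```python
import numpy as np

def representative_ra(readings, keep=3):
    """Average the `keep` readings closest to the mean of all readings."""
    readings = np.asarray(readings, dtype=float)
    distance = np.abs(readings - readings.mean())
    closest = np.argsort(distance, kind="stable")[:keep]
    return float(readings[closest].mean())

def delta_sr(ra_initial, ra_final):
    """Dimensionless ratio of change in surface roughness."""
    return (ra_initial - ra_final) / ra_initial

# Hypothetical example: ten readings 1..10 have mean 5.5, so the three
# closest readings are 5, 6, and 4.
ra = representative_ra(range(1, 11))
ratio = delta_sr(ra_initial=2.0, ra_final=1.0)
```

Normalizing by the initial roughness makes workpieces with different starting surfaces comparable, which is the reason the text gives for using a ratio.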

DBN architecture
The predictive performance of the data-driven model depended on the network architecture. In this study, the DBN architecture consisted of two RBMs, that is, one visible layer and two hidden layers, with an additional regression layer with a ReLU activation function as the output. The visible layer had seven neurons, corresponding to the six process parameters designated in Sect. 3.2 and R_a,initial. The output layer for regression had one neuron, which represented the predicted ΔSR E,DBN. The numbers of visible and output neurons were kept the same for all architectures, while the number of neurons in each hidden layer was varied over {7, 14, 21}. A total of 72 datasets obtained from the experiments were used to develop the predictive model. In order to improve the accuracy and reliability of the model, the 72 datasets were randomly divided in a ratio of 5:1. Thus, 83% of the total (60 training datasets) was used for network training, and 17% of the total (12 testing datasets) was used for model evaluation. In the pre-training stage of ΔSR E,DBN prediction, the input variables were preprocessed. Since the input variables were continuous data with different ranges and units, they were normalized to the range from 0 to 1. These numerical values were used as the visible units of the first RBM, so a Gaussian-Bernoulli RBM was adopted as the first RBM to convert the real values into binary stochastic variables for the next RBM. The remaining RBM of the DBN was a Bernoulli-Bernoulli RBM. The initial weights of each RBM were sampled from a Gaussian distribution with a mean of 0 and a standard deviation of 0.01. The biases were initialized to 0. The training ran for 1000 epochs with a learning rate of 0.001. On this basis, greedy layer-wise unsupervised training was carried out to obtain the network parameters, that is, the weights and biases of each RBM.
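The preprocessing and initialization steps described above can be sketched as follows. The data here are synthetic placeholders, not the experimental results, and the helper names are hypothetical. Whether the scaling statistics were computed on all 72 samples or only on the training split is not stated in the text, so scaling the full matrix is an assumption of this sketch.

```python
import numpy as np

def min_max_scale(x):
    """Rescale each column of x to [0, 1]; assumes no column is constant."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)

def split_5_to_1(n_samples, rng):
    """Random 5:1 split of sample indices into training and testing sets."""
    idx = rng.permutation(n_samples)
    n_test = n_samples // 6
    return idx[n_test:], idx[:n_test]

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=(72, 7))     # placeholder for 72 samples x 7 inputs
x_scaled = min_max_scale(x)
train_idx, test_idx = split_5_to_1(len(x), rng)

# RBM initialization as described: Gaussian weights (mean 0, std 0.01), zero biases.
w0 = rng.normal(0.0, 0.01, size=(7, 14))
b0 = np.zeros(14)
```

With 72 samples the split yields 60 training and 12 testing rows, matching the 83% and 17% shares quoted above.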
When the pre-training was finished, the initial parameters of the output layer on top of the architecture and the model parameters derived from the unsupervised training were adjusted in a supervised manner with the back-propagation algorithm to minimize the prediction error. The activation functions of the two hidden layers and the output layer were logistic sigmoid and ReLU, respectively. In order to accelerate convergence, the Adam optimizer was used with a learning rate of 0.001. The maximum number of epochs was set to 30,000, and early stopping with a patience of 100 was used to prevent underfitting and overfitting.
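The early stopping rule can be sketched independently of any deep learning library. In the sketch below, `step` is a hypothetical callback that runs one epoch and returns the monitored loss; which loss was monitored in the study is not stated, so that choice is left open, and `toy_loss` is an artificial curve used only to exercise the loop.

```python
def train_with_early_stopping(step, max_epochs=30000, patience=100):
    """Run `step(epoch)` until the loss has not improved for `patience` epochs.

    Returns (best_epoch, best_loss, last_epoch_run).
    """
    best_loss, best_epoch = float("inf"), -1
    epoch = -1
    for epoch in range(max_epochs):
        loss = step(epoch)
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break   # no improvement for `patience` consecutive epochs
    return best_epoch, best_loss, epoch

def toy_loss(epoch):
    # Artificial loss curve with its minimum at epoch 500 (hypothetical).
    return (epoch - 500) ** 2 / 1e6 + 0.001

best_epoch, best_loss, stopped_at = train_with_early_stopping(toy_loss)
```

In practice the parameters from the best epoch would be restored after stopping, so the 30,000-epoch limit acts only as an upper bound.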
The performance of the predictive surface roughness model was evaluated with three statistical indicators, namely R², MSE as the cost function, and the F-test. Table 2 shows the calculated ΔSR Measure for all experimental combinations in this study. Based on the experimental results, this section presents predictive models developed with statistical and deep learning approaches.
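Two of the three indicators can be computed directly from predicted and observed values, as in the minimal sketch below. The arrays are made-up numbers for illustration and are not values from Table 2.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up example: only the last prediction is off, by 0.1.
y_true = np.array([0.1, 0.2, 0.3, 0.4])
y_pred = np.array([0.1, 0.2, 0.3, 0.5])
example_mse = mse(y_true, y_pred)   # 0.01 / 4 = 0.0025
example_r2 = r2(y_true, y_pred)     # 1 - 0.01 / 0.05 = 0.8
```

An R² close to 1 and an MSE close to 0 indicate good agreement, which is how both models are judged in the sections that follow.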

Multiple linear regression model
A multiple regression model is an effective statistical approach for determining relationships between independent and dependent variables. In order to predict ΔSR Pred,regression in this study, second-order polynomial regression was employed based on the experimental results. Comparing the absolute values of the coefficients, the most dominant factor was particle diameter, followed by rotational speed, working time, particle weight, particle length, and diluted liquid quantity. In order to evaluate the performance of the developed regression model, the same three statistical indicators used as evaluation criteria for the DBN model, namely R², MSE, and the F-test, were applied. R² had a high value of 0.9199, close to 1, which meant that there was consistent agreement between the predicted results and the experimental data. MSE quantified the difference between predicted and observed values. This model had a small value of 0.4432 × 10⁻³, which was close to the ideal value of 0. Figure 8 shows a residual plot with a histogram for the multiple linear regression-based model. As can be seen, the residuals had approximately constant variance and followed a normal distribution. Table 3 lists the results of the analysis of variance (ANOVA). According to the F-ratio, the regression model had reasonable accuracy at a confidence level of 95%. Therefore, it provided reliable surface roughness prediction in the REMF process.
Table 4 shows the average R² and MSE of the nine DBN architectures considered in this study after 3 iterations. The correlation between predicted and observed data for the testing datasets in DBN5 is shown in Fig. 9. From Table 5, the DBN5 models for the training and testing datasets were statistically significant at confidence levels of 99% and 95%, respectively. Therefore, the model derived by means of the DBN was sufficiently reliable for surface roughness prediction in the REMF process.

Comparison
In this study, two different prediction approaches, the traditional statistical model and the deep neural network model, were used to estimate the ratio of change in surface roughness on SS316 in the REMF process. Table 6 presents the calculated R² and MSE for each model. The values obtained from the training and testing phases of the DBN5 model yielded 7% and 1.5% higher prediction performance, respectively, compared to the multiple linear regression model. Figure 10 shows the predicted values of both models against the actual experimental results. As can be seen, the predictions of DBN5 deviated less from the measurements than those of the multiple linear regression model. This showed that the well-trained DBN model achieved good agreement between observed and predicted data. In terms of MSE, the values of both approaches were close to 0 and therefore satisfactory for predicting surface roughness with high accuracy and reliability.
Although the multiple linear regression model provided a low MSE of 0.443 × 10⁻³, it was derived from mean values, which made it an insufficient tool for predicting nonlinear characteristics.

Genetic algorithm optimization
Based on the trained DBN models, the DBN5 architecture, consisting of 7 input neurons, 14 neurons in each hidden layer, and 1 output neuron, showed excellent prediction performance. In order to improve the practical applicability of the REMF process, it was important to optimize the process parameters for the best ΔSR performance of the DBN5 model. In this study, the GA, a heuristic optimization technique, was applied to obtain the optimal solution. The GA involved five main stages: population initialization, selection, crossover, mutation, and evaluation. In the first stage, a population of 10 chromosomes was randomly created, each with 27 binary genes encoding the input parameters. Afterward, the best two chromosomes were selected by tournament selection, with ΔSR as the fitness function. From these superior chromosomes, offspring were produced as new solutions by exchanging a group of genes between the selected parents in the crossover process. After crossover, some of the genes in the offspring were randomly modified with a mutation rate of 0.01 to avoid local optima. The algorithm iterated for 50 generations to find the best ΔSR. Figure 11 illustrates the convergence curve of the objective function, the maximization of ΔSR in the REMF process, over 50 generations. As can be seen, ΔSR increased sharply during the first 7 generations. After that, the convergence curve stabilized and reached the maximum ΔSR of 0.530 at 25 generations. Table 7 lists the best combination of process parameters at the optimal condition. The converged ΔSR of the GA was close to the value of 0.515 obtained with the Taguchi method. Thus, the GA integrated with the DBN model could be adopted for accurate prediction of ΔSR and process optimization.
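The GA loop described above can be sketched with the stated settings (10 chromosomes, 27 binary genes, mutation rate 0.01, 50 generations). Several parts are assumptions of this sketch and not details from the study: the fitness function is a hypothetical stand-in (fraction of 1-bits) for the DBN5-predicted ΔSR, since the gene-to-parameter encoding is not given; parents are drawn by tournaments of size 3; crossover is single-point; and the best chromosome found so far is tracked separately.

```python
import numpy as np

N_POP, N_GENES, N_GEN, P_MUT = 10, 27, 50, 0.01

def fitness(chrom):
    # Hypothetical stand-in for the DBN5-predicted dSR: fraction of 1-bits.
    return float(chrom.mean())

def tournament(pop, scores, rng, k=3):
    """Pick the fittest of k randomly drawn chromosomes (k is an assumption)."""
    idx = rng.choice(len(pop), size=k, replace=False)
    return pop[idx[np.argmax(scores[idx])]]

def run_ga(seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(N_POP, N_GENES))
    best, best_score, history = None, -np.inf, []
    for _ in range(N_GEN):
        # Evaluation: score every chromosome and remember the best so far.
        scores = np.array([fitness(c) for c in pop])
        if scores.max() > best_score:
            best_score = float(scores.max())
            best = pop[np.argmax(scores)].copy()
        history.append(best_score)
        children = []
        while len(children) < N_POP:
            # Selection, single-point crossover, then bit-flip mutation.
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            cut = rng.integers(1, N_GENES)
            child = np.concatenate([p1[:cut], p2[cut:]])
            flip = rng.random(N_GENES) < P_MUT
            children.append(np.where(flip, 1 - child, child))
        pop = np.array(children)
    return best, best_score, history

best, best_score, history = run_ga()
```

Replacing `fitness` with a function that decodes the 27 bits into the seven inputs and calls the trained network would turn this into the kind of model-in-the-loop optimization described in the text; `history` then plays the role of the convergence curve.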

Conclusions
This study aimed to provide an accurate prediction model for the surface roughness of the SS316 material in the REMF process to improve surface quality. In order to establish the best predictive model, the multiple linear regression model and the DBN architecture model were compared using three criteria: R², MSE, and the F-test. Based on the best model resulting from the comparison, the GA was adopted to optimize the process parameters. The main observations of this study are summarized as follows:
• R² and MSE of the mathematical regression model were 0.9199 and 0.4432 × 10⁻³, respectively. From the ANOVA, this model had a confidence level of 95%.
• For the nine developed DBN architectures, R² on the training datasets ranged from 0.9787 to 0.9910, which was about 7% higher prediction performance than the multiple linear regression model. The average MSE on the training datasets was 0.3126 × 10⁻³, which was about 30% lower than that of the statistical model. Therefore, the DBN model was practical for accurately predicting the complex nonlinear relationship between input and output variables in the process.
• Among the considered DBN structures, DBN5 achieved excellent prediction performance with R² of 0.9900 and MSE of 0.1518 × 10⁻³ on the training datasets, and R² of 0.9340 and MSE of 1.3037 × 10⁻³ on the testing datasets. In addition, the models for the training and testing datasets were statistically significant at confidence levels of 99% and 95%, respectively.
• In the comparison between DBN5 and the multiple linear regression model, DBN5 was closer to the actual experimental results. This meant that the well-trained model suggested in this study improved prediction accuracy and reliability for surface roughness in the REMF process.
• At the optimal input parameters of DBN5 obtained from the GA, which were a particle length of 3 mm, particle diameter of 0.7 mm, particle weight of 1.3 kg, liquid water quantity of 1.0 l, rotational speed of 1323 rpm, working time of 35 min, and initial surface roughness of 2.478 μm, the maximum ΔSR was about 0.530.
Author contribution LJH and SYS performed experiments and data analysis. In addition, they contributed to writing the paper. KJS supervised the project and reviewed the paper.
Availability of data and material All the data supporting the results of this study are available within the article.

Declarations
Ethics approval Not applicable.