Optimization of lapping process parameters of CP-Ti based on PSO with mutation and BPNN

This work aims to improve the surface quality of commercially pure titanium (CP-Ti) lapped with a free-alumina lapping fluid and to establish the relationship between the main lapping process parameters and surface roughness. On this basis, the optimal process parameters were searched for with a particle swarm optimization (PSO) algorithm with mutation. First, an L9(3³) orthogonal experiment was performed on CP-Ti with the free-alumina lapping fluid to acquire data samples for training the neural network. At the same time, a BP neural network was created to fit the nonlinear functional relation among the lapping pressure P, spindle speed n, slurry flow Q and roughness Ra. Then, the range of the number of neurons in the hidden layer of the neural network was determined by an empirical formula and the Kolmogorov theorem. On this basis, PSO with mutation was used to search for the optimal process parameter configuration for lapping CP-Ti, and the trained neural network was used to calculate the predicted roughness for this configuration. Finally, the accuracy of the prediction was verified experimentally. The optimal process parameter configuration found by PSO was as follows: a lapping pressure of 5 kPa, a spindle speed of 60 r·min−1 and a slurry flow of 50 ml·min−1. For this configuration, the neural network predicted a roughness of 0.1127 μm, while the roughness obtained by experiment was 0.1134 μm. The error was 0.62%, which indicates that a well-trained neural network can achieve a good prediction when experimental data are missing. Applying the PSO algorithm with mutation to the neural network yields the optimal process parameter configuration, which can effectively improve the surface quality of CP-Ti lapped with free abrasive.


Introduction
Due to its excellent mechanical properties, such as high wear resistance, high specific strength [1], high fracture toughness [2] and corrosion resistance [3], CP-Ti is widely used in the chemical [4], aerospace, national defence and marine fields [5]. However, as a typical difficult-to-cut material, commercially pure titanium cannot be machined well using traditional milling and turning techniques. Therefore, coupled mechanical-chemical lapping has become an important means of achieving the high-efficiency, high-precision machining of CP-Ti and has been studied in depth in recent years. Nevertheless, obtaining a better surface quality remains a research hot-spot, and how to optimize the lapping parameters to improve the surface quality of CP-Ti is thus an important problem.
At present, the methods used to optimize lapping parameters mainly include the quantum-based optimization method [6], the empirical model [7] and optimization algorithms combined with a neural network. The quantum-based optimization method uses the wheel speed, work-piece speed, depth of dressing and lead of dressing as control parameters, with minimal production cost, maximal production rate and minimal surface roughness as the optimization criteria [6]. However, the surface roughness is influenced by many factors in the lapping process; the quantum-based optimization method therefore does not solve the problem effectively, and it is difficult to forecast the roughness quickly and accurately with it. The empirical model uses the wheel speed, abrasive concentration, current and pulse-on time as control parameters. However, this method analyses only two influencing factors of the surface roughness at a time, i.e. it can only analyse the influence of each pair of parameters on the surface roughness. Hence, a method that can consider all the parameters simultaneously is needed.
Furthermore, the neural network method can predict the surface roughness from the limited values measured in lapping experiments. For this reason, it has integrated advantages over the quantum-based optimization method and the empirical model in terms of the number of experiments required and the reliability of the results [8]. The highly parallel structure of a neural network enables parallel implementation, so it has better fault tolerance and a faster overall processing speed than the quantum-based optimization method and the empirical model. Neural networks have the theoretical ability to approximate arbitrary nonlinear maps, which offers a new approach to nonlinear control problems. A well-trained neural network can generalize beyond its training data. Therefore, neural networks can solve control process problems that are difficult to address using mathematical models or descriptive rules [9], and the neural network method has been broadly utilized to predict the pulling force [10], surface roughness [11] and lapping capability of power plant ball mills [12].
As one of the simplest neural networks, the BP neural network is widely used in many fields. Huan et al. [13] proposed a GA-BP model to predict the lapping forces produced during the creep-feed deep lapping of titanium matrix composites. Comparative results show that the GA-BP model has good prediction accuracy. Ma et al. [14] used the BP neural network to predict the effect of changing certain parameters, such as heat flux, mass flux, pipe diameter and pressure, on the heat transfer coefficient of supercritical water. The results show that the trained BP neural network prediction model can be applied to achieve a better prediction and understanding of the heat transfer coefficient of supercritical water. Liu et al. [15] proposed the continuous and dynamic prediction of the time series of NPP operating parameters. Validation results indicated that the proposed model could achieve a stable prediction effect with high prediction accuracy for fluctuating data.
However, the BP neural network can only solve prediction and classification problems and cannot handle parameter optimization problems; it therefore needs to be combined with an optimization algorithm. Wang et al. [16] used an improved particle swarm optimization (PSO) algorithm to optimize their earlier model. The simulation results of a practical example show that the proposed wind power range prediction model can effectively forecast the output power interval and provide power grid dispatchers with decision support. Qi et al. [17] optimized the processing and mixing performance of a normal milk device by response surface analysis and the BP-GA neural network algorithm. The results revealed that the BP-GA neural network algorithm has better fitting performance than response surface analysis and confirmed the optimal working parameter combination, which can serve as a reference for improving the design of double-blade normal milk processing and mixing devices and the milk processing quality. Wang et al. [18] developed a method to predict the thermal performance of PTC systems based on a GA-BP neural network model. The results revealed that the GA-BP neural network model can successfully predict the complex nonlinear relationship between the input variables and the thermal performance of PTC systems. Zhang et al. [19] designed a PSO-GA-BP model to assess personal credit risk, which makes up for the shortcomings of traditional BP network parameter training and improves the efficiency and accuracy of prediction. Gao et al. [20] proposed a novel hybrid algorithm that employs BPNN and PSO for the kinematic parameter identification of industrial robots with an enhanced convergence response. The results show that the proposed parameter identification method based on the BPNN and PSO requires fewer iterations and converges faster than the standard PSO algorithm.
However, to the best of the authors' knowledge, no work has yet been reported on optimizing the lapping process parameters of CP-Ti using a BP neural network and PSO. Thus, in the present work, a BP neural network combined with the PSO algorithm is proposed to optimize the lapping process parameters of CP-Ti. Furthermore, the prediction results are verified by experiments, and the optimal lapping parameters are given accordingly.

Test details
In this experiment, square commercially pure titanium samples with side lengths of 25 mm were selected as the processed parts. To ensure that the other parameters were the same before lapping, all samples were pretreated, i.e. #240 sandpaper was used to grind the samples for 3 min. After pretreatment, the average surface roughness (Ra) of the samples was 0.644 μm.
To reduce the number of tests and the cost as much as possible, an L9(3³) orthogonal test was used to explore the influence of the main lapping process parameters on the surface roughness of the lapped samples. The process parameters used in the test were the lapping pressure, spindle speed and lapping fluid flow.
The mass of each weight is constant, and the weights are available in whole and half sizes; each whole weight provides a lapping pressure of 1.6 kPa. The three lapping pressure levels correspond to 1.5, 2.5 and 3.5 weights. For the spindle speed, a small value reduces the lapping efficiency, while a large value produces a large centrifugal force that flings the slurry off the lapping plate. For the slurry flow, a small value reduces the lapping efficiency, while a large value wastes slurry. Hence, the corresponding L9(3³) orthogonal test was developed, and the factor levels, based on the work of others [21,22], are shown in Table 1.
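The nine runs of the design can be sketched as follows. The column assignment is the standard L9 array; the concrete factor levels used below are assumptions inferred from the weight sizes and the parameter ranges stated later, since Table 1 is not reproduced here.

```python
# Standard L9(3^3) orthogonal array: each (level, level, level) triple is one run,
# and every pair of columns contains each of the nine level combinations once.
L9 = [
    (0, 0, 0), (0, 1, 1), (0, 2, 2),
    (1, 0, 1), (1, 1, 2), (1, 2, 0),
    (2, 0, 2), (2, 1, 0), (2, 2, 1),
]

# Assumed factor levels (not the paper's Table 1 verbatim):
pressure = [2.4, 4.0, 5.6]   # kPa, from 1.5 / 2.5 / 3.5 weights x 1.6 kPa
speed    = [40, 60, 80]      # r/min, assumed from the stated constraints
flow     = [40, 60, 80]      # ml/min, assumed from the stated constraints

runs = [(pressure[i], speed[j], flow[k]) for i, j, k in L9]
for run in runs:
    print(run)
```

The orthogonality of the array is what lets nine runs, rather than all twenty-seven combinations, cover every pairwise level combination exactly once.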
As shown in Fig. 1, the lapping pad is fixed on the lapping plate and rotates together with the lapping plate. The CP-Ti sample is sealed on the pressing plate with wax and fixed on the carrier together with the pressing plate. In the lapping process, the lapping liquid is added at a certain flow rate and reacts with the sample surface. Then, the reaction layer on the work-piece surface is removed by mechanical action so that the work-piece surface material can be quickly removed. Under the joint action and promotion of mechanical lapping and chemical lapping, the surface of the work-piece is machined flat.

Data preprocessing
After the samples were lapped, they were cleaned in an ultrasonic cleaning machine for 15 min and blow-dried with an electric hair dryer. Then, a 3D surface topography analyser (PS50) was used to measure five randomly chosen points on each sample surface, and the average of the detected surface roughness values was taken as the measured surface roughness. The test results are shown in Table 2.
To remove the effects of dimensional differences and to avoid small data being overwhelmed by large data, the input and output data need to be normalized. Formula (1) is used to normalize the input data:

x_nor = 2(x − x_min)/(x_max − x_min) − 1   (1)

where x_nor ∈ [−1,1] is the normalized input data; x is the input data to be normalized; x_min is the minimum value of this type of data; and x_max is the maximum value of this type of data.
To extend the coverage of the output data, formula (1) is not used to normalize the output data directly. Instead, a broader value range is used, as shown in formula (2):

y_nor = 2(y − y_0min)/(y_0max − y_0min) − 1   (2)

where y_nor ∈ [−1,1] is the normalized output data; y is the output data to be normalized; y_0min is 0.08; and y_0max is 0.6.
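A minimal sketch of the two normalization steps; the [−1, 1] min-max form is assumed from the stated variable definitions and ranges.

```python
def normalize(x, x_min, x_max):
    """Min-max scaling of an input value to [-1, 1], as in formula (1)."""
    return 2 * (x - x_min) / (x_max - x_min) - 1

def normalize_output(y, y0_min=0.08, y0_max=0.6):
    """Output scaling with the broadened [0.08, 0.6] range of formula (2)."""
    return 2 * (y - y0_min) / (y0_max - y0_min) - 1

print(normalize(60, 40, 80))    # mid-range spindle speed maps to 0.0
print(normalize_output(0.34))   # midpoint of the broadened Ra range maps to 0.0
```

The broadened output range means that measured roughness values near the edges of the training data do not saturate at ±1, leaving headroom for the optimizer to predict values slightly outside the observed span.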

Neural network architecture
Because the data obtained from the orthogonal experiment are limited, it is difficult to train a complex network model. Therefore, the most compact three-layer neural network was selected. Since the dimensions of the input and output data are 3 and 1, respectively, the numbers of neurons in the input and output layers were set to 3 and 1, respectively. The range of the number of neurons in the hidden layer can be derived from empirical formula (3) and the Kolmogorov theorem [23]:

m_1 = √(n + l) + a   (3)

where m_1 is the number of hidden layer neurons given by empirical formula (3), n = 3 is the number of input neurons, l = 1 is the number of output neurons, and decimals are rounded up. a usually takes an integer value from 1 to 10, so m_1 ranges from 3 to 12. According to the Kolmogorov theorem, any n-variable continuous function can be represented by the sum of a family of continuous functions, the number of which is no more than 2n+1. Therefore, the number of neurons in the hidden layer is also bounded by empirical formula (4):

m_2 ≤ 2n + 1   (4)

where m_2 is the number of hidden layer neurons allowed by the Kolmogorov theorem; its value thus ranges from 1 to 7. Combining the ranges of m_1 and m_2, the number of neurons in the hidden layer ranges from 3 to 7. The optimal number of hidden layer neurons is obtained by comparing the prediction accuracies of the trained networks.
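The two bounds can be checked with a short script; the empirical form m_1 = sqrt(n + l) + a is an assumption reconstructed from the stated ranges (3 to 12 for a = 1 to 10 with 3 inputs and 1 output).

```python
import math

n_in, n_out = 3, 1  # input and output layer sizes

# Empirical formula (3): m1 = sqrt(n_in + n_out) + a, a = 1..10, rounded up.
m1_range = sorted({math.ceil(math.sqrt(n_in + n_out) + a) for a in range(1, 11)})

# Kolmogorov bound (4): at most 2 * n_in + 1 hidden neurons.
m2_max = 2 * n_in + 1

# Intersection of the two ranges gives the candidate hidden layer sizes.
candidates = [m for m in m1_range if m <= m2_max]
print(candidates)  # [3, 4, 5, 6, 7]
```

These five candidate sizes are exactly the networks compared later in Table 4.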
In addition, the final network architecture was completed after the activation functions, loss function and optimization algorithm were selected. The hidden layer activation function is the S-type tangent function tansig, which processes the data nonlinearly and enables the model to fit nonlinear functions; its expression is:

f(x) = 2/(1 + e^(−2x)) − 1   (5)

The output layer activation function is the pure linear function:

f(x) = x   (6)

The loss function is the MSE function:

loss = (1/n) Σ_{j=1}^{n} (y_j^pred − y_j^real)²   (7)

where loss is the loss value calculated by MSE, y_j^pred is the output predicted by the neural network for the jth sample, y_j^real is the true value of the jth sample, and n is the total number of samples. The larger the loss is, the more the model's prediction deviates from the actual situation; the smaller the loss is, the closer the prediction is to the actual situation.
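The three functions can be sketched directly, assuming the standard MATLAB-style definitions of tansig and purelin (tansig is mathematically equivalent to tanh):

```python
import math

def tansig(x):
    # Hyperbolic tangent sigmoid: 2 / (1 + exp(-2x)) - 1, equal to tanh(x)
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def purelin(x):
    # Pure linear output activation: identity
    return x

def mse(pred, real):
    # Mean squared error over n samples
    return sum((p - r) ** 2 for p, r in zip(pred, real)) / len(real)

print(tansig(0.0))                  # 0.0 at the origin
print(purelin(0.25))
print(mse([0.1, 0.2], [0.1, 0.3]))
```

Using a bounded nonlinearity in the hidden layer with a linear output layer is the usual choice for regression, since the output is then not confined to (−1, 1) by the final activation.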
The Levenberg-Marquardt algorithm, which is suitable for small- and medium-sized networks and has a fast convergence speed, was selected as the optimization algorithm [24].
In addition, k-fold cross-validation was introduced into the training process to fully exploit each set of data given the scarcity of data samples. K-fold cross-validation randomly divides the data into k groups and then takes each subset in turn as the validation set, with the remaining k−1 groups as the training set, so that every subset participates in both training and validating the neural network.
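A minimal sketch of the splitting scheme described above; the value of k used in the paper is not stated, so k = 3 here is an assumption chosen to divide the nine samples evenly.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Randomly partition sample indices into k folds and yield
    (train_indices, val_indices) pairs so every sample validates once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, val

# With the nine orthogonal-test samples and an assumed k = 3:
for train, val in k_fold_splits(9, 3):
    print(len(train), len(val))  # 6 training samples, 3 validation samples
```

With only nine samples, this rotation is what prevents any single arbitrary train/validation split from dominating the reported accuracy.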
In summary, the neural network architecture, with dimensions of 3×5×1, is shown in Figure 2; the hidden layer uses the tansig activation function, and the output layer uses the pure linear function.

Variant particle swarm optimization algorithm
PSO was first proposed by Kennedy and Eberhart, who were inspired by the predation behaviour of bird flocks: the easiest and most efficient way for birds to find food is to search the area near where food has already been found. Here, the optimization aims to minimize the surface roughness of the work-piece. The lapping pressure, spindle speed and lapping fluid flow were taken as the design variables, the predicted roughness was calculated through the BPNN, and the PSO algorithm with mutation was used to search for the optimal process parameter combination. The final model is as follows: the design variable X = [p, n, Q]; the target value Y = Ra_min; and the constraints are 2.4 ≤ p ≤ 5.6, 40 ≤ n ≤ 80 and 40 ≤ Q ≤ 80, where p, n and Q represent the lapping pressure, spindle speed and lapping fluid flow, respectively.
Setting the number of individuals in the initial population too high increases the computational burden and lengthens the optimization, while setting it too low may prevent the algorithm from escaping local optima. The value ranges of the lapping process parameters are shown in Table 3. The model generates 20 individuals and assigns them random values within the constraints as the initial state of the population. The flow chart of extremum optimization based on the PSO algorithm is shown in Figure 3. At the start of the optimization process, the position information of every particle is normalized and substituted into the neural network, which yields the fitness value of each particle. The model then saves the position and velocity information of the particles with the best individual and group fitness (i.e. the lowest surface roughness), which are used to update the positions and velocities of the particles. In this way, the model continuously searches for new individual and population extrema and updates the position and velocity information of the remaining particles until the set number of iterations is reached. After the particle positions are updated, the positions of a small number of particles are re-randomized within the given value range to prevent the model from falling into a local optimum.
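The search loop described above can be sketched as follows. Since the trained BPNN is not reproducible here, a smooth surrogate fitness function stands in for it; the inertia weight, acceleration coefficients and the mutation probability are illustrative assumptions, not the paper's settings.

```python
import random

random.seed(1)
BOUNDS = [(2.4, 5.6), (40, 80), (40, 80)]  # p (kPa), n (r/min), Q (ml/min)

def fitness(x):
    # Stand-in for the trained BPNN; any smooth surrogate serves the sketch.
    p, n, q = x
    return (p - 5.1) ** 2 + ((n - 59.3) / 20) ** 2 + ((q - 48.8) / 20) ** 2

def clamp(x):
    # Keep a particle inside the process-parameter constraints.
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, BOUNDS)]

N, ITERS, W, C1, C2, P_MUT = 20, 50, 0.7, 1.5, 1.5, 0.05  # assumed settings
pos = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(N)]
vel = [[0.0] * 3 for _ in range(N)]
pbest = [p[:] for p in pos]
pbest_f = [fitness(p) for p in pos]
gbest = pbest[pbest_f.index(min(pbest_f))][:]

for _ in range(ITERS):
    for i in range(N):
        for d in range(3):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (W * vel[i][d]
                         + C1 * r1 * (pbest[i][d] - pos[i][d])
                         + C2 * r2 * (gbest[d] - pos[i][d]))
        pos[i] = clamp([x + v for x, v in zip(pos[i], vel[i])])
        # Mutation step: re-randomize a few particles to escape local optima.
        if random.random() < P_MUT:
            pos[i] = [random.uniform(lo, hi) for lo, hi in BOUNDS]
        f = fitness(pos[i])
        if f < pbest_f[i]:
            pbest[i], pbest_f[i] = pos[i][:], f
            if f < fitness(gbest):
                gbest = pos[i][:]

print([round(v, 2) for v in gbest])  # best parameter triple found
```

The mutation step only perturbs current positions, never the stored personal or global bests, so exploration is widened without losing the best solution found so far.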

Training and testing the neural network
To compare the effects of the neural networks more clearly, an indirect prediction accuracy parameter is adopted:

accuracy = 1/(1 + loss)

where loss is the loss value calculated by MSE, loss ∈ [0, +∞) and accuracy ∈ (0, 1). The larger the accuracy is, the closer the predicted result is to the true value.
After setting the number of hidden layer neurons from 3 to 7, the nine sets of data obtained through the orthogonal experiment were preprocessed and substituted into the network models with different numbers of hidden layer neurons. The training accuracies are shown in Table 4, from which it can be concluded that the prediction accuracy of the model is best when the number of hidden layer neurons is 5. On the one hand, too many neurons lead to overfitting of the model; on the other hand, too few neurons leave the model with insufficient generalization ability. Therefore, the neural network with five hidden neurons was finally selected as the prediction model, and its predictions are compared with the real values in Figure 4.

Optimization process
The convergence of the optimization algorithm is shown in Fig. 5. According to previous research [25], PSO with mutation converges faster than standard PSO; in this study, the algorithm converged after only 18 iterations, which supports this conclusion. After 50 iterations (Fig. 5), the model obtained the optimal lapping process parameters. The minimum surface roughness value is Y = 0.1127 μm, and the corresponding optimal lapping process parameter combination is X = [5.1333, 59.3234, 48.7533]. After rounding to the nearest achievable settings, the optimal process parameters were a lapping pressure of 4.8 kPa, a spindle speed of 60 r·min−1 and a lapping fluid flow rate of 50 ml·min−1. There are several reasons for these results.
As the lapping pressure increases, the cutting depth of the abrasive particles in the slurry increases, and the number of effective abrasive particles actually participating in lapping increases. A higher lapping pressure also increases the friction force between the work-piece and the lapping pad, improving the mechanical removal efficiency. A larger lapping pressure was therefore adopted.
An increased spindle speed accelerates the flow of slurry across the lapping pad, raising the mechanical removal efficiency and improving the slurry transport. The chips are also carried away faster, avoiding the damage they would otherwise cause to the work-piece surface and thus improving the surface quality. However, an increased spindle speed also causes temperature changes that affect the surface quality. Hence, the intermediate spindle speed was adopted.
At a high slurry flow rate, some of the abrasive particles may agglomerate, causing more scratches on the surface and a higher surface roughness. Thus, a smaller slurry flow rate was adopted.

Verification
With all other conditions equal, the rounded optimal lapping process parameters were used to lap CP-Ti samples. After lapping, the samples were measured with the 3D surface topography analyser (PS50): five points were randomly selected on the sample surface, and the average value was taken as the measured surface roughness, as shown in Table 5.
The average surface roughness Ra is:

Ra = (1/5) Σ_{i=1}^{5} Ra_i

where Ra_i is the ith measured value. The verification result is compared with the predicted value in Table 6.
The surface roughness error, i.e. the relative difference between the theoretical value and the measured value, is:

error = |Ra_m − Ra_t| / Ra_m × 100%

where Ra_m is the measured surface roughness value and Ra_t is the theoretical surface roughness value. The verification result is close to the predicted value, which indicates that the combination of PSO with mutation and the BP neural network has good prediction accuracy and can obtain the optimal processing parameters.
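The error arithmetic can be checked directly with the reported values (the five individual readings from Table 5 are not reproduced here; 0.1134 μm is the reported mean):

```python
ra_measured = 0.1134  # um, experimental mean from Table 5
ra_theory = 0.1127    # um, BPNN prediction

# Relative error between measured and predicted roughness, in percent.
error_pct = abs(ra_measured - ra_theory) / ra_measured * 100
print(round(error_pct, 2))  # 0.62
```

This reproduces the 0.62% error reported in the paper, with the measured value as the denominator.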

Conclusion
(1) A BPNN model relating the process parameters of the free abrasive lapping of CP-Ti to the surface roughness of the lapped sample was constructed and trained on the data obtained from the orthogonal experiment. The predicted results of the model were then evaluated with the accuracy index. By comparison, the prediction accuracy of the model is highest, at 83.72%, when the number of hidden layer neurons is 5.
(2) After optimization by the PSO algorithm with mutation, the rounded process parameters were determined: a lapping pressure of 5 kPa, a spindle speed of 60 r·min−1 and a lapping fluid flow of 50 ml·min−1, for which the predicted surface roughness reached its minimum of 0.1127 μm.
(3) The surface roughness measured after lapping was 0.1134 μm, within 0.62% of the value predicted by the neural network; the optimized parameters effectively reduced the surface roughness of CP-Ti and improved its surface quality.
(4) The work can provide theoretical guidance for the formulation and optimization of lapping process parameters and reduce the processing cost.
Competing interests The authors declare no competing interests.