Prediction for surface roughness of the large-pitch internal thread based on homologous isomerism data

It is difficult to measure the surface roughness of large-pitch internal threads, so predictions are used instead of measurements; however, common prediction methods are not highly accurate and are narrow in their scope of use. The homologous isomerism data of the vibration signal were utilized to establish a predictive model for the surface roughness of large-pitch internal threads. The corresponding homologous isomerism data were acquired by turning the large-pitch internal thread, and the data were processed using the Relief-F algorithm to obtain the weights, which quantify the effects of different features on surface roughness. Additionally, given the structural characteristics of the workpiece, with its large pitch and small number of teeth, support vector machine (SVM) and radial basis function neural networks (RBF-NNs) were used to establish predictive models with the homologous isomerism data of the vibration signal as the input parameters. Through comparison and analysis, the SVM model, with higher prediction accuracy and better generalization ability, was found to be more appropriate for this research. It was verified that the absolute error of the SVM model was less than 0.05 μm, and the relative error was less than 4% for turning both left and right threaded surfaces, demonstrating that the predictive model could take the place of measuring the surface roughness in the mass production of large-pitch internal threads. The method proposed in this paper can also be extended to other parts for the prediction of surface roughness, especially those whose surface roughness is difficult to measure due to structural characteristics.


Introduction
Large-pitch threaded parts (pitch > 4 mm) serve as transmissions for large vertical CNC gantry lathes and large multi-station presses, and a high-quality threaded surface is critical for the screwing of internal and external threads. Because surface roughness is one of the most important criteria for testing the surface quality of threaded parts, measuring the surface roughness of threads is extremely important. However, owing to the special structure of the internal thread, the measurement is laborious and time-consuming, which affects the overall production and processing efficiency. With the development of information technology, machine learning is widely used in predictive modeling [1]. Predictive modeling of surface roughness with the help of machine learning technology can ensure consistent machining quality.
In the field of predicting surface roughness, domestic and overseas scholars have done a great deal of relevant research. Chan et al. [2] planned the machining experiments; the measurement records were used as training data for the AIM polynomial neural network to build a surface roughness prediction model. The prediction model took processing parameters as input to predict the surface roughness. Yang et al. [3] developed a predictive model using response surface methodology with cutting parameters as input parameters and surface roughness as the output parameter. The contour lines of the surface roughness under different combinations of parameters were obtained and used to predict the optimum surface roughness. Kong et al. [4] analyzed four types of Bayesian linear regression models, used machine learning algorithms to predict surface roughness, and verified the superiority of the regression models. Vahabli and Rahmati [5] applied an RBF-NN optimized using the imperialist competitive algorithm to predict the surface roughness of parts fabricated using fused deposition modeling, and obtained better predictive performance than with the traditional RBF neural network. Majumder et al. [6] investigated the effect of various parameters on surface roughness during the Inconel 800 EDM process and established the associated equations between surface roughness and cutting time using a hybrid PCA gray analysis. Scholars have studied surface roughness prediction mainly through empirical formulas, regression analysis, neural networks, and other methods, using machining parameters as inputs; these approaches require a large amount of data. In contrast, the large-pitch internal thread workpiece is characterized by a large pitch and a small number of teeth. As a result, the research in this paper deals with the case of fewer training samples.
Scholars have also made many achievements in predicting surface roughness from signals measured during machining. Frigieri et al. [7] established an associated model between the acoustic signal during turning and the surface roughness of the workpiece. The accuracy of the correlation was experimentally demonstrated, and the monitoring of the surface quality during machining was achieved, replacing the previous method of measuring at the end of the machining, with real-time, low-cost, and high-precision characteristics. Plaza and López [8] used the wavelet packet method to process the cutting force signal for surface roughness monitoring and established the correlation model between cutting force and surface roughness to complete the online monitoring of workpiece surface roughness during machining, which reduced the scrap rate and improved productivity. Arulraj et al. [9] developed an ANN model to fuse cutting temperature along with cutting parameters to predict surface roughness during turning. Models that included cutting temperature outperformed models that did not. Salgado et al. [10] developed a highly robust and accurate online surface roughness prediction system for turning processes based on a novel signal processing technique known as singular spectrum analysis. In turning large-pitch threads, the surface roughness has strong time-varying characteristics. Even if the cutting process parameters are not changed, the surface roughness will change with the real-time tool wear status and the real-time stability of the process system, and these changes are reflected in the characteristic parameters of the vibration signal in real time. Therefore, we chose the characteristic parameters of the vibration signal as the input to predict the thread surface roughness. This choice also improves the accuracy of prediction, decreases the time consumed by measurement, and enhances productivity.
This paper focuses on the prediction problems of the high-feed turning process and carries out a comprehensive study on predicting the surface roughness of the large-pitch internal thread. First, homologous isomerism data of the vibration signal were acquired by turning the large-pitch internal thread. Next, the weights of the homologous isomerism data were solved based on the Relief-F algorithm. Furthermore, prediction models for the surface roughness under the influence of vibration were established based on two types of machine learning algorithms. Finally, the two types of models were compared, and the more suitable model was selected for validation. These predictive models were established to provide a theoretical basis and technical assistance for predicting the surface roughness of large-pitch thread parts during high-feed turning, replacing the laborious measurement and enhancing productivity.

Processing of homologous isomerism data
Homologous isomerism is a biological term. Multiple mRNAs are produced from mRNA precursors from one gene due to selective splicing, and multiple polypeptide chains are translated, which are known as homologous isomerism [11]. In this paper, homologous isomerism data means that signals from the same vibration sensors are extracted as characteristic parameters of different structures, such as root mean square and kurtosis. The homologous isomerism data were used as input parameters for the predictive modeling of surface roughness subsequently.

Experiment of turning large-pitch internal thread
As shown in Fig. 1, turning a large-pitch internal thread involves a high axial feed, large pitch, and large cutting depth, resulting in severe vibration. To maintain the stability of the entire turning process, axial layered turning was used. The technology of multiple layered feeds along the axial direction makes the turning stable and reduces the impact on the surface quality of the workpiece. The hardness of the internal thread should be lower than that of the external thread with which it is matched; considering this together with the turning performance of the material, H59 copper was finally selected as the workpiece material for the internal threads. The parameters of the large-pitch internal thread workpiece are shown in Table 1.
According to the experimental requirements, a turning tool with changeable heads for turning trapezoidal internal thread was designed and manufactured. The tool is made of high-speed steel and has two symmetrical edges on the left and right sides. The parameters of the tool used during the experiment are shown in Table 2.
As shown in Fig. 2, the machine used for turning the large-pitch internal thread is a CAX6140 lathe with spindle speed n = 20 r/min. The vibration signals were collected through the Donghua signal acquisition system, which was accompanied by a PCB three-way acceleration sensor and a charge amplifier 5070A. The signals of tool and workpiece vibration were measured with axial removal z = 0.05 mm and feed rate f = 16 mm/r.

Measurement of surface roughness
As shown in Fig. 3, the machined internal threaded workpiece was removed axially from the cut-in point using the WEDM for subsequent measurement of surface roughness.
As shown in Figs. 4 and 5, the roughness measurement areas were the left and right sides of the threaded blocks. Roughness values were measured using a Taylor Hobson CCI white light interferometer. As shown in Fig. 6, multiple groups were sampled for the workpiece, starting with cut-in point 1 and taking the next group of samples at 90° intervals in the circumferential direction. We sampled 24 threaded blocks. Each threaded block had two threaded surfaces on the left and right, and 24 threaded blocks had 48 threaded surfaces. Each thread surface had 5 measuring points. The maximum and minimum points in the 5 measuring points were deleted. The roughness of the threaded surface was the average roughness of the remaining three measuring points.
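The averaging rule described above (five measuring points, discard the largest and smallest, average the remaining three) can be sketched as follows. This is a minimal illustration rather than the authors' code; the function name `surface_roughness` and the five readings are hypothetical.

```python
def surface_roughness(readings):
    """Average of five readings after discarding the largest and the smallest."""
    if len(readings) != 5:
        raise ValueError("expected exactly 5 measuring points")
    ordered = sorted(readings)
    kept = ordered[1:-1]  # drop the minimum and the maximum
    return sum(kept) / len(kept)


# Hypothetical Ra readings (micrometres) for one threaded surface
example_readings = [1.32, 1.25, 1.41, 1.28, 1.30]
print(round(surface_roughness(example_readings), 3))
```

Discarding the two extreme points makes the per-surface value less sensitive to a single outlying measurement, such as one taken on a local hard point.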
As shown in Fig. 7, the measured roughness values of the left and right threaded surfaces contained in each group were recorded. There were some differences between threaded surfaces within the same sample group and across different sample groups, but the overall values were relatively close. In the same block group, no. 1 is the cut-in section, no. 6 is the cut-out section, and nos. 2-5 are the samples in the middle cutting section.
The roughness of the threaded surfaces corresponding to the cut-in and cut-out sections was slightly worse than that of the middle, smooth turning stage due to the influence of vibration during turning [12]. For the different nos. 1-6 within the same group, the surface roughness of the outer threaded blocks is higher than that of the inner ones. The large-pitch internal thread workpiece can be regarded as a cantilever beam with one end fixed and one end free. No. 1, at the free end, is the most easily disturbed by cutting vibration, so its surface roughness is the largest. Eccentricity of the machine tool spindle causes centrifugal force. Although no. 6 is located at the fixed end of the cantilever beam, it is close to the machine tool spindle and is the most affected by centrifugal force. The surface roughness of no. 6 is therefore also large, owing to the combined influence of centrifugal force and cutting force. For the same block number within a group, the roughness of the left threaded surface is higher than that of the right threaded surface. The reason is that the cutting forces on the left and right thread surfaces differ because of the helix angle. For the same block number in different groups, the surface roughness values did not differ sharply, and all values varied within a reasonable range. Hard points in the workpiece material increase the cutting force and vibration, resulting in a large surface roughness at those points.

Processing of homologous isomerism data
The vibration signals were obtained, and the eigenvalues contained in them were extracted; this is the process of obtaining the homologous isomerism data. The Relief-F algorithm was used to select, from the homologous isomerism data, the data most influential on surface roughness, laying the foundation for the predictive model. The x-direction corresponds to the axial direction during turning, the y-direction corresponds to the depth-of-cut direction, and the z-direction corresponds to the radial direction during turning. A diagram of the threaded surface roughness obtained by observing the machined surface is shown in Fig. 8; there were obvious ripples on the surface in the z-direction, and the roughness peaks and troughs were more pronounced in the z-direction. At the same time, the tool itself had excellent rigidity in the axial and tangential directions. Because the large overhang of the tool in the radial direction resulted in poor rigidity, the tool vibrated significantly when excited, which eventually had a significant impact on the surface roughness of the workpiece. In summary, among the vibration signals in the different directions, the z-direction (radial) vibration had the dominant influence on surface roughness, and this direction was chosen for the subsequent research.
The 48 threaded surfaces of the 24 sampled threaded blocks gave 48 roughness values, which corresponded to 48 groups of vibration signals. For each group of vibration signals, we extracted 12 characteristic parameters. With such a large amount of data, we could not subjectively select which characteristic parameters to use as the input of the prediction model; an objective and scientific method was needed to assist in the selection. Therefore, we chose the Relief-F algorithm to analyze the weights of the characteristic parameters through sample learning and to screen out the characteristic parameters that had the greatest impact on surface roughness [13]. The roughness values are shown in Fig. 7. The algorithm was programmed on the Python platform. The detailed principle of the algorithm is described as follows.
Consider a multi-class data set D = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}.
Input: training sample D = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, x_i ∈ R^I, y_i ∈ {1, 2, …, C}; sampling times m; number of adjacent samples k.
Output: feature weight vector w.
(1) Initialize the feature weights w_j = 0, j ∈ [1, I].
(2) Repeat m times:
A. Randomly select a sample x_i.
B. According to x_i, find the k nearest samples of the same class and the k nearest samples of each different class.
C. Update the weight of each feature.
In the Relief-F algorithm, the number of nearest-neighbor samples k is user-defined, and different values of k directly affect the final weights of the eigenvalues. To ensure the accuracy of the selection of eigenvalues, the weights were calculated several times with different values of k and then averaged to obtain the weight data in Table 3. As shown in Table 3, the weights corresponding to the homologous isomerism data of the vibration signals were obtained.
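The procedure above, including the averaging over several values of k, can be sketched in plain Python. This follows the commonly published Relief-F update (near hits lower a weight, near misses raise it, scaled by class priors) and is not the authors' program. Several points are assumptions: the labels are discrete classes (how the continuous roughness values were turned into classes is not stated in the text), distances are Manhattan distances on range-normalized features, and the names `relief_f` and `averaged_weights` and the toy data are hypothetical.

```python
import random


def relief_f(X, y, m, k, seed=0):
    """Return one Relief-F weight per feature for samples X with class labels y."""
    rng = random.Random(seed)
    n, n_feat = len(X), len(X[0])
    # Feature ranges scale every feature difference into [0, 1]
    lo = [min(row[j] for row in X) for j in range(n_feat)]
    hi = [max(row[j] for row in X) for j in range(n_feat)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(n_feat)]
    prior = {c: y.count(c) / n for c in set(y)}

    def diff(j, a, b):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(n_feat))

    w = [0.0] * n_feat
    for _ in range(m):
        i = rng.randrange(n)
        xi, ci = X[i], y[i]
        for c in prior:
            # k nearest neighbours of x_i inside class c, excluding x_i itself
            pool = [X[t] for t in range(n) if y[t] == c and t != i]
            near = sorted(pool, key=lambda r: dist(xi, r))[:k]
            if not near:
                continue
            # Near hits lower the weight; near misses raise it, scaled by class prior
            scale = -1.0 if c == ci else prior[c] / (1.0 - prior[ci])
            for j in range(n_feat):
                w[j] += scale * sum(diff(j, xi, r) for r in near) / (m * len(near))
    return w


def averaged_weights(X, y, m, ks):
    """Average the Relief-F weights obtained with several neighbour counts k."""
    runs = [relief_f(X, y, m, k) for k in ks]
    return [sum(col) / len(runs) for col in zip(*runs)]


# Toy data: feature 0 separates the two classes, feature 1 does not
X = [[0.10, 0.50], [0.20, 0.10], [0.15, 0.90],
     [0.90, 0.40], [0.80, 0.95], [0.85, 0.05]]
y = [0, 0, 0, 1, 1, 1]
print(averaged_weights(X, y, m=20, ks=[1, 2]))
```

In the toy data the discriminating feature receives the larger weight, which is the behavior relied on when the highest-weighted characteristic parameters are kept as model inputs.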
As a result, root mean square, variance, and crest factor had the most significant effect on surface roughness, where root mean square can represent the energy of vibration signal, variance is the dynamic component of vibration signal, and crest factor is used to detect the vibration signal with or without shock [14]. The root mean square, crest factor, and variance of vibration signals were finally selected as input parameters for predictive modeling of surface roughness as shown in Table 4.
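The three selected characteristic parameters can be computed from a sampled vibration signal as sketched below. The exact definitions used in the paper are not given in this text, so the usual ones are assumed: population variance, and crest factor as peak absolute value divided by root mean square. The function name `vibration_features` and the synthetic sine signal are hypothetical.

```python
import math


def vibration_features(signal):
    """Root mean square, variance, and crest factor of a sampled signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    variance = sum((v - mean) ** 2 for v in signal) / n  # population variance (assumed)
    crest_factor = max(abs(v) for v in signal) / rms      # peak over RMS (assumed)
    return {"rms": rms, "variance": variance, "crest_factor": crest_factor}


# Synthetic stand-in for a radial vibration record: ten periods of a unit sine
signal = [math.sin(2 * math.pi * t / 100) for t in range(1000)]
print(vibration_features(signal))
```

For a pure sine the crest factor is close to the square root of 2; an impact during cutting would raise the peak far more than the RMS, which is why the crest factor is useful for detecting shock.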

Establishment of the predictive model
Machine learning algorithms have a strong self-learning capability [15]. The process of turning large-pitch internal threads was complex and had certain fluctuations. Given the structural characteristics of the workpiece, with its large pitch and small number of teeth, two machine learning algorithms, SVM and RBF neural networks, were chosen as being more suitable for the case of fewer training samples. Replacing the workpiece within the same experiment changes the original conditions of the process system. Therefore, the data collected after the workpiece is replaced cannot be used to train the prediction model together with the previous data. However, once the training of the prediction model is completed, it is no longer affected by the replacement of workpieces, and the prediction model can be used for any workpiece.

Establishment of the predictive model based on SVM
The SVM algorithm has advantages for small training sets and nonlinear problems [16]. The homologous isomerism data were used as the input parameters x_i ∈ R^n and the surface roughness as the output parameters y_i = R_a ∈ R, which constituted the training sample D = {(x_i, y_i)}, i = 1, 2, …, n. The output y_i ∈ R could be estimated using the regression function y(x), and the expression of the regression function is shown in Eq. 1.
where φ(x) is the vector in the higher dimensional space obtained by mapping the vector x, y is a hyperplane, and b is the deviation of the actual value from the regressive value, which can be obtained as shown in Eq. 2.
where ‖w‖^2 is the confidence interval, L(y_i, y_Ra) is the loss function, and μ is the penalty factor; the smaller the value, the smaller the error of the model obtained after the training. To ensure that the predictive model had high accuracy, a smaller number was introduced.
Since errors could not be avoided in regression analysis, the introduction of the slack variables ξ_i and ξ_i^* allowed the formula for calculating the deviation to be rewritten as shown in Eq. 4.
According to the Karush-Kuhn-Tucker conditions, the equation could be transformed into a duality optimization, and the new equation was obtained as shown in Eq. 5.
where α_i and α_i^* represent the Lagrange multipliers. According to Mercer's theorem, the introduction of kernel functions could solve the problem of vector dimensional differences in the regression, and among the nonlinear kernel functions the Gaussian kernel function has better predictive accuracy. The expression is shown in Eq. 6.
where σ represents the width of the kernel function.
In summary, the predictive model for surface roughness could be derived as shown in Eq. 7.
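For orientation, the textbook ε-insensitive support vector regression formulation is written out below in the roles described above: regression function, regularized risk, slack-variable form, dual problem, Gaussian kernel, and final predictor. It is assumed, not confirmed, that the numbered equations of the paper coincide with this standard form. Here w is the weight vector, ε is the insensitivity width, ξ_i and ξ_i^* are the slack variables, α_i and α_i^* are the Lagrange multipliers, and σ is the kernel width.

```latex
% Regression function
y(x) = w^{T}\varphi(x) + b

% Regularized risk with loss L and penalty factor \mu
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^{2} + \mu\sum_{i=1}^{n} L\bigl(y_i,\, y(x_i)\bigr)

% Slack-variable form
\min_{w,\,b,\,\xi,\,\xi^{*}}\ \frac{1}{2}\lVert w\rVert^{2} + \mu\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^{*}\bigr)
\quad \text{s.t.} \quad
\begin{cases}
y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i \\
w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*} \\
\xi_i,\ \xi_i^{*} \ge 0
\end{cases}

% Dual problem
\max_{\alpha,\,\alpha^{*}}\
-\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i-\alpha_i^{*})(\alpha_j-\alpha_j^{*})K(x_i,x_j)
-\varepsilon\sum_{i=1}^{n}(\alpha_i+\alpha_i^{*})
+\sum_{i=1}^{n}y_i(\alpha_i-\alpha_i^{*})
\quad \text{s.t.} \quad
\sum_{i=1}^{n}(\alpha_i-\alpha_i^{*}) = 0,\qquad 0 \le \alpha_i,\ \alpha_i^{*} \le \mu

% Gaussian kernel
K(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right)

% Final predictor
y(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^{*})\,K(x_i, x) + b
```

Only the training samples with α_i − α_i^* ≠ 0 (the support vectors) contribute to the final sum, which is one reason the method remains usable with few training samples.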
The 48 sets of data consisted of the roughness values of the 48 thread surfaces and the corresponding homologous isomerism data of the vibration signal. The roughness values of the 48 thread surfaces are shown in Fig. 7, and the corresponding homologous isomerism data of the vibration signal are shown in Table 4. The 48 sets of homologous isomerism data were used as the input parameters, and the 48 surface roughness values were used as the output parameters. The Gaussian kernel function with a scale of 2.4 was selected, and the predictive modeling of surface roughness was completed using ten-fold cross-validation. The R^2 value of the obtained predictive model was 0.90, indicating high reliability. As shown in Fig. 9, prediction and measurement are plotted.
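The bookkeeping behind ten-fold cross-validation and the R^2 score can be sketched as follows. This shows only how 48 samples are split into folds and how R^2 is computed from measured and predicted values; it does not train an SVM. The contiguous, unshuffled split and the names `k_fold_indices` and `r_squared` are assumptions made for illustration, since the text does not say how the folds were formed.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# With 48 samples, ten folds give eight test sets of 5 and two of 4
for train_idx, test_idx in k_fold_indices(48, 10):
    print(len(train_idx), len(test_idx))
```

Every sample is used for testing exactly once, so the reported score reflects predictions on data that the corresponding model did not see during training.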

Establishment of the predictive model based on RBF neural network
The RBF neural network, which can be used not only for function fitting but also for prediction, is a feedforward network with a single hidden layer [17]. Theoretically, it can fit any nonlinear function. It consists of an input layer, a hidden radial basis function layer, and a linear output layer. The radial basis function is a radially symmetrical Gaussian function, as shown in Eq. 8.
where c_i is the i-th center of the basis function, σ_i is the variance of the i-th basis function, and p is the number of perceptual units. The input layer of the RBF neural network implements a nonlinear mapping from x to R_i(x), and the output layer implements a linear mapping from R_i(x) to y_R.
where q is the number of output nodes, and w_ki is the modulation weight between the k-th output-layer node and the i-th hidden-layer node.
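In the usual textbook form, the Gaussian basis function and the linear output mapping described above are written as follows. It is assumed, not confirmed, that this matches the paper's Eqs. 8 and 9; in particular, σ_i is treated here as the width parameter of the i-th basis function, and y_k denotes the k-th network output.

```latex
% Gaussian radial basis function of the i-th hidden unit
R_i(x) = \exp\!\left(-\frac{\lVert x - c_i\rVert^{2}}{2\sigma_i^{2}}\right),
\qquad i = 1, 2, \ldots, p

% Linear output layer
y_k(x) = \sum_{i=1}^{p} w_{ki}\, R_i(x),
\qquad k = 1, 2, \ldots, q
```

With three input features and one output, as used in this paper, x has three components and q = 1.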
Because the parameters differ in value range and scale, the influence of each parameter on surface roughness cannot be judged under the same standard. To make convergence faster and generalization better when training the neural network, the sample data should be normalized. The normalization adopted in this paper is max-min normalization; the function is shown in Eq. 10.
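Max-min normalization in its common form maps each feature to the interval [0, 1] through x' = (x − x_min) / (x_max − x_min). The sketch below assumes this target range, which is not stated in the text (some toolboxes scale to [−1, 1] instead); the function name `min_max_normalize` and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly so that min -> 0 and max -> 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values of one feature across several samples
print(min_max_normalize([2.0, 4.0, 6.0]))
```

Each of the three input features would be normalized separately, so that root mean square, variance, and crest factor contribute on a comparable scale.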
The same 48 sets of data were used: the roughness values of the 48 thread surfaces (Fig. 7) and the corresponding homologous isomerism data of the vibration signal (Table 4). The first 40 sets of data were used as training samples, and the remaining 8 sets of data were used as test data. In the RBF neural network, the number of nodes in the input layer was 3, the number of nodes in the output layer was 1, and the number of nodes in the hidden layer was automatically set using the newrbe function. The spread is the spreading coefficient of the radial basis function; a value that is too large or too small hinders the prediction of the neural network. If it is too small, the input vector may not cover the region and more neurons will be needed; if it is too large, it will lead to overlapping regions between neurons, which will cause overfitting and increase the difficulty of numerical computation [18]. Predictions were made with spread values of 15, 19, 20, 21, and 25, and the value of 20 was selected after comparative analysis. As shown in Fig. 10, prediction and measurement are plotted. The predicted values of the surface roughness of large-pitch internal threads are in good agreement with the measured values. To demonstrate the accuracy and validity of the two models, a comparison of the two models is presented in Table 5.
The root mean square error between the predicted and measured values of the SVM model was 0.042 μm, with a maximum error of 0.124 μm and a relatively average error variation. The root mean square error between the predicted and measured values of the RBF neural network model was 0.063 μm, with a maximum error of 0.184 μm and a large error variation. The error was caused by the underfitting and overfitting of individual data in the training process of the prediction model. It was unavoidable, but the error of the prediction model was small, within a reasonable range.
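The two comparison metrics used above, root mean square error and maximum error, can be computed as in the following sketch. The function names and the sample values are hypothetical and are not the paper's data.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def max_abs_error(measured, predicted):
    """Largest absolute deviation between measurement and prediction."""
    return max(abs(m - p) for m, p in zip(measured, predicted))


# Hypothetical roughness values in micrometres
measured = [1.20, 1.35, 1.28, 1.41]
predicted = [1.23, 1.31, 1.30, 1.38]
print(rmse(measured, predicted), max_abs_error(measured, predicted))
```

The RMSE summarizes the typical error over all samples, while the maximum error exposes the worst single prediction, so a model with a similar RMSE but a much larger maximum error has a less even error distribution.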
The results showed that the SVM model has higher prediction accuracy and better generalization ability than the RBF neural network in predicting the surface roughness of large-pitch internal threads, and that it is more capable of predicting the surface roughness from homologous isomerism data. The reason is that the magnitude and fluctuation of errors are closely related to the number of sample sets: the learning and training process of the RBF neural network prediction model was not sufficient because of the limited number of samples, whereas the SVM model is more advantageous in problems with fewer training samples.

Validation of the predictive model
It was concluded that the SVM model, which was more suitable for the research of this paper, had higher accuracy of prediction and better ability of generalization after the comparative analysis. To further verify the performance of the SVM model, the verification experiments were conducted for left edge and right edge turning respectively. The experimental parameters are shown in Table 6.
The tool, which was utilized according to the corresponding spindle speed and axial removal, was recreated in accordance with the clearance angle and edge radius, using the instrumentation shown in Fig. 2. The results of the verification experiments are shown in Table 7.
As shown in Table 7, it was verified that the absolute error of the SVM model was less than 0.05 μm, and the relative error was less than 4% for turning both left and right threaded surface, which demonstrated that the model could be utilized for the mass production of large-pitch internal threads instead of measurement.
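The acceptance check implied by these thresholds can be sketched as follows. The limits of 0.05 μm and 4% come from the text; the function names and the sample values are hypothetical, and the relative error is assumed to be taken with respect to the measured value.

```python
def validation_errors(measured, predicted):
    """Return (absolute error, relative error in percent) for one surface."""
    abs_err = abs(predicted - measured)
    rel_err = abs_err / measured * 100.0  # relative to the measured value (assumed)
    return abs_err, rel_err


def within_tolerance(measured, predicted, abs_limit=0.05, rel_limit=4.0):
    """True if both error limits reported in the text are satisfied."""
    abs_err, rel_err = validation_errors(measured, predicted)
    return abs_err < abs_limit and rel_err < rel_limit


# Hypothetical measured and predicted Ra values in micrometres
print(validation_errors(1.50, 1.46))
print(within_tolerance(1.50, 1.46))
```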
The method proposed in this paper for predicting surface roughness from homologous isomerism data of vibration signals can also be extended to other parts for the prediction of surface roughness, especially those whose surface roughness is difficult to measure due to structural characteristics.

Conclusions
1. The Relief-F algorithm acquires weights for each feature in the homologous isomerism data. Ultimately, the three features of root mean square, variance, and crest factor in the homologous isomerism data of vibration signals were selected as input parameters to establish the predictive model for the surface roughness of large-pitch internal threads.
2. The magnitude and fluctuation of errors are closely related to the number of sample sets. The root mean square error between the predicted and measured values of the SVM model was 0.042 μm, with a maximum error of 0.124 μm and a relatively average error variation. The root mean square error between the predicted and measured values of the RBF neural network model was 0.063 μm, with a maximum error of 0.184 μm and a large error variation. Consequently, we chose the SVM algorithm to establish the predictive model for the surface roughness of large-pitch internal threads.
3. It was verified that the absolute error of the SVM model was less than 0.05 μm, and the relative error was less than 4% for turning both left and right threaded surfaces. This illustrates that the model could be utilized for the mass production of large-pitch internal threads, replacing the laborious measurement, obtaining excellent predictive performance, reducing the time consumed by measurement, and enhancing productivity.