Contour error modeling and compensation of CNC machining based on deep learning and reinforcement learning

Contour error compensation of the computer numerical control (CNC) machine tool is a vital technology that can improve machining accuracy and quality. To achieve this goal, the tracking error of a feeding axis, which is a dominant source of the contour error, should first be modeled, and then a proper compensation strategy should be determined. However, building a precise tracking error prediction model is challenging because of the nonlinear issues such as backlash and friction involved in the feeding axis; besides, the optimal compensation parameters are also difficult to determine because they are sensitive to the machining tool path. In this paper, a set of novel approaches for contour error prediction and compensation is presented based on deep learning and reinforcement learning. By utilizing the internal data of the CNC system, the tracking error of the feeding axis is modeled with a Nonlinear Auto-Regressive Long-Short-Term Memory (NAR-LSTM) network, considering all the nonlinear issues of the feeding axis. Given the contour error calculated from the predicted tracking error of each feeding axis, a compensation strategy is presented whose parameters are identified efficiently by a Time-Series Deep Q-Network (TS-DQN) designed in our work. To validate the feasibility and advantage of the proposed approaches, extensive experiments are conducted, showing that our approaches can predict the tracking error and contour error with very good precision (better than about 99% and 90%, respectively), and that the contour error compensated based on the predicted results and our compensation strategy is significantly reduced (about 60~85% reduction), with the machining quality improved drastically (machining error reduced by about 50%).


Introduction
In computer numerical control (CNC) machining, the contour error of the feeding system is one of the most critical factors affecting the machining precision; the lower the contour error, the better the machining precision. The contour error results from both the tracking error of the machine's feeding axes and the mismatch of multi-axis dynamic characteristics [1], especially at sharp corners [2][3][4] of the tool path. To improve the machining accuracy, the contour error should be effectively predicted and compensated [1,5].
There have been extensive works on contour error prediction and compensation. In general, they are classified into two types: online and offline approaches [6,7].
As an online method for the contour error estimation and compensation, the concept of cross-coupling control (CCC) was introduced by Koren in [8], which directly reduces the contour error of the entire feeding system of machine tool simultaneously, rather than modeling and treating the tracking error of each feeding axis individually. Based on the work of [8], an improved concept of CCC is proposed, for which a more effective control law is utilized to build the model of contour error [9]. In the following work of [8], Koren and Lo [10] proposed a variable gain CCC method to estimate the contour error more effectively. For the contour of arbitrary shape, Yeh and Hsu [11] proposed a contouring error vector estimation-based approach to efficiently determine the variable gains for the CCC. Later, the approaches based on the CCC were developed further into the two-layered cross-coupling control [12], cross-coupled fuzzy logic sliding mode control [13], cross-coupled dynamic friction control [14], and the neural-network CCC [15].
Different from CCC algorithms, Huo et al. [16] proposed a generalized Taylor series expansion error compensation method, and this method was proved to enable the elimination of the axial following error via simulation, thereby achieving good contour following results. Zhu et al. [17] presented a real-time contour error estimation method based on the second-order Taylor approximation of the point-to-curve distance function. Erkorkmaz and Altintas [18] presented a contour error estimation method for arbitrary free-form contouring curves by approximating the original continuous trajectory with some discretized positions. By projecting the actual position onto the osculating plane at the nearest reference point, Li et al. [19] proposed a method in which the axial component of contour errors are estimated and compensated into each control loop of the servo motor simultaneously. Hu et al. [20] proposed a numerical computation-based accurate contour error estimation method and the corresponding contour error compensation strategy.
In addition to the online contour error estimation and compensation method, some other works have been done on the offline method [21]. Huo and Poo [22] proposed a nonlinear auto-regressive (NAR) neural network modeling method to build the model of the feeding system by taking the reference position as the input, and the actual position at the next instant can be predicted from the proposed model. Similarly, by designing a NAR neural network with exogenous inputs, Erwinski et al. [23] presented a contour error prediction model, which can directly predict the contour error by using a reference position on the tool path. Zhang et al. [24] proposed an iterative compensation method for contour error of CNC machining, in which the tracking error model is utilized for calculating the optimal compensation value.
For the real-time compensation methods based on the contour error modeling, the accuracy of compensation is not high enough because, on the one hand, the prediction models have limited complexity and thus fair prediction precision; and on the other hand, due to the essential inertia of the electrical and mechanical system, the feeding system of CNC machine tool has the lag characteristic, i.e., the compensated result cannot work immediately, which could undermine the effect of contour error compensation. For the offline model-based iterative compensation methods, the computational efficiency for determining the compensation value is low because they normally require some iterative processes to calculate the compensation value of each control cycle [24]. Also, the predicted value of the model cannot truly reflect the actual contour error, especially when the error generated by the mechanical transmission elements [25,26] is difficult to be considered and covered by the prediction model.
To improve the contouring accuracy of the CNC machine tool, this paper presents a set of contour error compensation methods by first modeling the contour error based on a Nonlinear Auto-Regressive Long-Short-Term Memory (NAR-LSTM) network and then determining the optimal compensation parameters via another machine learning method called Time-Series Deep Q-Network (TS-DQN). The proposed approach is mainly designed for finish cutting in CNC machining, where the material removal volume is small and physical issues such as the cutting force and tool deflection need not be considered. There are two major contributions in this paper:
(1) Based on the internal data of the machine tool, a deep learning network called NAR-LSTM is designed that can precisely predict the tracking error, based on which the contour error can be calculated. For each feeding axis, the nonlinear part of the tracking error is estimated from the NAR-LSTM network; combined with the linear part as calculated from the steady state of the feeding axis, the entire tracking error of the feeding axis can be precisely predicted. After that, the contour error is calculated based on the predicted tracking error of two feeding axes.
(2) To address the lag characteristic as well as find the best compensation strategy, a reinforcement learning method called TS-DQN is proposed that determines the optimal parameters of the compensation strategy, with which the contour error can be largely reduced and the surface machining quality can be improved significantly.

Given the proposed approaches, the contour error for machining a part, regardless of whether it is a new part or an already machined one, can be precisely predicted and compensated without any time-consuming iterative calculation of the compensation value and parameters. This paper is organized as follows. Section 2 introduces a NAR-LSTM network-based contour error prediction method. Based on the predicted result, Section 3 presents a TS-DQN-based approach to determine the optimal compensation parameters for a given compensation strategy. In Section 4, some simulation and physical cutting experiments are conducted to verify the performance of the proposed contour error prediction and compensation methods. This paper concludes in Section 5.

Fig. 1 The general structure of the semi-closed loop control of a feeding axis
Fig. 2 The position control loop of the feeding system

The NAR-LSTM network-based contour error calculation
In this section, a NAR-LSTM network is firstly designed to model the nonlinear part of the tracking error of a machine's feeding axis, based on which the tracking error (considering both the linear and nonlinear parts) can be calculated; and then a contour error (due to the tracking error of multiple axes) calculation approach is presented.

Tracking error of a feeding axis
For a given machining path, a set of positions will be generated for each feeding axis after it is processed in the CNC system, which is called the reference position and denoted by the symbol u. The reference position is then sent to the servo system to drive the feeding axis of the machine tool. As a core component of a machine tool, a feeding axis is composed of mechanical transmission elements, a servo motor, and its controller. For the semi-closed loop control of a feeding axis as illustrated in Fig. 1, the position feedback w′ comes from the rotary encoder of the motor. Compared with w′, the actual position w as measured by the linear encoder installed on the worktable can reflect the tracking error more precisely because the errors caused by the mechanical transmission elements [1] of the feeding axis are included. In our work, the actual position w, instead of w′, is utilized as the position feedback in the following tracking error modeling approach.
For a feeding axis of the machine tool, the position control loop is typically achieved via the classical PID (proportional-integral-derivative) control algorithm [27]; i.e., by modulating the proportional, integral, and derivative parameters, the PID controller minimizes the tracking error between the desired position u (i.e., the reference position) and the feedback position w (i.e., the actual position),
$$E = u - w \tag{1}$$
where E is defined as the tracking error of a feeding axis as shown in Fig. 2.
To avoid the complexity of determining all three PID parameters, the P-control mode, which only adjusts the proportional parameter, is normally utilized in industrial practice, for which the following relationship holds [28],
$$v = K_p E \tag{2}$$
where v is the velocity of the velocity loop, and $K_p$ is the position proportional gain, which is a parameter set in the CNC system. Based on Eq. (2), the tracking error E can be calculated as
$$E = \frac{v}{K_p} \tag{3}$$
The E calculated from Eq. (1) to Eq. (3) is the tracking error of the steady state, which is defined as the linear tracking error (LTE) because it is a linear function of v.
In addition to the LTE, there are many nonlinear factors in the control and transmission process from which additional tracking error can be produced, especially for the mechanical transmission elements, which inherently suffer from nonlinear issues such as the backlash and friction force of the feeding axis. The tracking error generated by these nonlinear issues is defined as the nonlinear tracking error (NLTE). The NLTE can significantly affect the contour error of the machine tool's feeding system and the surface machining quality, and it should be precisely estimated before the contour error can be compensated. For the X-axis of the machine tool (Fig. 11), an example of the LTE and NLTE is shown in Fig. 3.
Considering both the LTE and the NLTE, the overall tracking error (OTE) of a feeding system is the combination of them,
$$E_a = E + E_{nl} \tag{4}$$
where $E_a$ is the OTE and $E_{nl}$ is the NLTE. Based on Eq. (4), building the prediction model for the OTE resorts to the modeling of the LTE and NLTE, as shown in Fig. 4. Since the LTE can be calculated from Eq. (3), the major challenge for modeling the OTE lies in the accurate modeling of the NLTE. In this paper, a big data-based model H(U) is utilized to predict the NLTE, in which the input U is the historical data of the CNC machining process. Details regarding U and H(U) will be explained in Section 2.2.
In the following part of this paper, to distinguish from the real value retrieved from the physical experiment, we put a hat on the symbol of the estimated data; e.g., $E$ is the real LTE while $\hat{E}$ is the estimated LTE.
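The decomposition of the tracking error described above can be illustrated with a minimal Python sketch. The function names and the numerical values are hypothetical and chosen only for illustration; the position gain is assumed to be expressed in 1/s and the axis velocity in mm/s, so that the LTE is obtained in mm.

```python
def linear_tracking_error(v, kp):
    """Steady-state (linear) tracking error under P-control: E = v / K_p."""
    if kp <= 0:
        raise ValueError("position gain K_p must be positive")
    return v / kp


def nonlinear_tracking_error(e_actual, v, kp):
    """NLTE as the residual of the overall tracking error: E_nl = E_a - E."""
    return e_actual - linear_tracking_error(v, kp)


# Illustrative values only (not measured data): an axis velocity of
# 40 mm/s (2400 mm/min) with K_p = 160 1/s, and an assumed overall
# tracking error of 0.262 mm.
lte = linear_tracking_error(40.0, 160.0)             # 0.25 mm
nlte = nonlinear_tracking_error(0.262, 40.0, 160.0)  # about 0.012 mm
```

In this sketch the NLTE is simply what remains after the steady-state part is removed, which is the quantity the NAR-LSTM network is trained to predict.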

NAR-LSTM network-based modeling of the NLTE
In this section, the model for predicting the NLTE of the feeding axis is constructed based on the NAR-LSTM network. Due to the essential nature of time series for the control signal of the feeding system, the LSTM has direct advantages for modeling of time-series response of the feeding axis. In our early work [29], a network called the NAR-LSTM network is constructed based on the traditional LSTM and the idea of the feedback control of the feeding system, for which the actual position of the feeding axis for the current time instant is predicted by taking the previous instant as the direct feedback. In this work, the idea of NAR-LSTM is utilized for constructing the model for predicting NLTE of the feeding axis.
There are two architectures of the NAR-LSTM network, i.e., the open-loop architecture and the closed-loop architecture. In this work, the mathematical expressions of these two architectures are given by Eq. (5) and Eq. (6), respectively,
$$\hat{E}_{nl}^{(t)} = H\left(U^{(t-g)}, \ldots, U^{(t)}, E_{nl}^{(t-g-1)}, \ldots, E_{nl}^{(t-1)}\right) \tag{5}$$
$$\hat{E}_{nl}^{(t)} = H\left(U^{(t-g)}, \ldots, U^{(t)}, \hat{E}_{nl}^{(t-g-1)}, \ldots, \hat{E}_{nl}^{(t-1)}\right) \tag{6}$$
where H is the nonlinear function for modeling the NLTE; $U^{(t)}$ is the input vector of the tth instant, and $v^{(t)}$, $a^{(t)}$, and $J^{(t)}$ are respectively the reference velocity, acceleration, and jerk as retrieved from the CNC system; g is a hyperparameter of the network, meaning that $\hat{E}_{nl}^{(t)}$ is calculated based on g periods of the historical data, and g is set by trial and error according to experience.
The actual NLTE $E_{nl}^{(t)}$ of Eq. (5) can be calculated from
$$E_{nl}^{(t)} = E_a^{(t)} - E^{(t)} \tag{7}$$
where the actual tracking error $E_a^{(t)}$ can be obtained from the CNC system, and $E^{(t)}$ can be calculated from Eq. (3).
To predict the nonlinear tracking error $\hat{E}_{nl}^{(t)}$ of the tth instant, the data $U^{(t)}$ from the (t-g)th instant to the current instant is utilized in the NAR-LSTM structure of Fig. 5. Besides, the historical NLTE data from the (t-g-1)th instant to the (t-1)th instant is also utilized for predicting the current $\hat{E}_{nl}^{(t)}$, but the data source is different for the training stage and the predicting stage per Eq. (5) and Eq. (6): during the training phase, the open-loop architecture is used because the actual NLTE data $E_{nl}$ is available; during the predicting phase, there is no existing NLTE data, so the previously predicted NLTE from the (t-g-1)th instant to the (t-1)th instant is used instead.
Based on the proposed NAR-LSTM model, the NLTE of the feeding axis can be predicted from Eq. (6) and the OTE can be calculated from Eq. (4). Details regarding the prediction results are shown in Section 4.2. In the next subsection, a contour error prediction model will be presented based on the calculated OTE of each feeding axis.

Fig. 11 The BM8-H three-axis CNC milling center
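The difference between the two data sources can be made concrete with a small sketch of the closed-loop (predicting-stage) rollout. This is not the authors' network: `model` is a hypothetical stand-in for the trained function H, and the window layout (g + 1 input vectors and g + 1 past NLTE values) is an assumption read off the description above.

```python
from collections import deque


def closed_loop_rollout(model, inputs, g, initial_nlte):
    """Predict the NLTE step by step, feeding predictions back as history.

    model:        callable(input_window, nlte_window) -> float, stand-in for H
    inputs:       list of per-instant input vectors U
    g:            history length (hyperparameter)
    initial_nlte: g + 1 seed values for the NLTE history (assumption)
    """
    if len(initial_nlte) != g + 1:
        raise ValueError("need g + 1 seed NLTE values")
    history = deque(initial_nlte, maxlen=g + 1)
    predictions = []
    for t in range(g, len(inputs)):
        input_window = inputs[t - g:t + 1]       # U from instant t-g to t
        e_hat = model(input_window, list(history))
        predictions.append(e_hat)
        history.append(e_hat)                    # closed loop: reuse prediction
    return predictions


def toy_model(input_window, nlte_window):
    """Toy linear stand-in for H, used only to exercise the rollout."""
    return 0.5 * nlte_window[-1] + 0.1 * input_window[-1][0]
```

In the open-loop (training) case the only change would be that `history` is filled with measured NLTE values rather than with the model's own outputs.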

Contour error calculation
Given the OTE as predicted from the approach presented in Section 2.2, the predicted position of the feeding axis is
$$\hat{w}^{(t)} = u^{(t)} - \hat{E}_a^{(t)} \tag{8}$$
where $\hat{E}_a^{(t)}$ is the predicted OTE at the tth instant; $u^{(t)}$ is the reference position of the feeding axis; $\hat{w}^{(t)}$ is the predicted actual position corresponding to $u^{(t)}$.
With the actual position of the two axes (i.e., the X- and Y-axes in 2D contouring machining), the actual path of the CNC machining can be defined, and the contour error, which is the deviation between the reference path and the actual path, can be calculated.
The general idea of calculating the contour error is shown in Fig. 6. Before the procedure can be explained, the following symbols defined in Fig. 6 are explained at the tth instant:
$\hat{w}_x^{(t)}$, $\hat{w}_y^{(t)}$ - the predicted positions of the feeding axes
$\hat{\varepsilon}_c$ - the predicted contour errors of the feeding system considering both X- and Y-axes
With the above definitions, the proposed procedure for building the prediction model of contour error (PMCE) is explained as follows.
Step-1: Estimate the position of the X- and Y-axes at the tth instant. Taking the X-axis as the example: make t=g (g is the specific period of the historical data for modeling the NAR-LSTM); given the data of the X-axis, obtain the predicted OTE $\hat{E}_{a,x}^{(t)}$ and the predicted position $\hat{w}_x^{(t)}$ per Eq. (4) and Eq. (8), respectively. Likewise, $\hat{w}_y^{(t)}$ of the Y-axis can be predicted.
Step-2: Get the predicted point $P^{(t)}$ from the predicted positions $\hat{w}_x^{(t)}$ and $\hat{w}_y^{(t)}$.
Step-3: Iterate Step-1 and Step-2 by increasing the index t until it reaches the last point. The predicted points form a curve, denoted as {…, $P^{(t)}$, $P^{(t+1)}$, …}.
Step-4: Calculate the contour error for each predicted point.
Taking the point $P^{(t)}$ shown in Fig. 7 as an example, calculate the distances from $P^{(t)}$ to all the line segments of the reference path, where $p^{(i)}$ is the point on the reference path corresponding to the ith instant; the contour error is then selected as the minimal value among these distances, e.g., the distance between $p^{(t)*}$ and $P^{(t)}$ as shown in Fig. 7. The contour errors of all predicted points together form the predicted contour error $\hat{\varepsilon}_c$ of the reference path, as shown in Fig. 6. The detailed process for calculating the contour error can be found in [16].
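Step-4 amounts to a minimum point-to-segment distance over the reference polyline. The following is a minimal sketch of that computation, assuming the reference path is given as a list of 2D points; the function names are hypothetical, and every segment of the path is searched, whereas a practical implementation would likely restrict the search to a window around the current instant.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b (2D tuples)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                      # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1]
    s = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    s = max(0.0, min(1.0, s))
    return math.hypot(px - (ax + s * dx), py - (ay + s * dy))


def contour_error(p, reference_path):
    """Unsigned contour error: minimal distance from p to the reference path."""
    if len(reference_path) < 2:
        raise ValueError("reference path needs at least two points")
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(reference_path, reference_path[1:])
    )
```

For a reference path `[(0, 0), (10, 0), (10, 10)]`, a predicted point `(5, 0.02)` lies 0.02 from the first segment, which is the value returned.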
Given the PMCE presented in this section, a reinforcement learning-based approach for determining the optimal compensation strategy of the contour error is presented in the next section.

Reinforcement learning-based contour error compensation
In this section, a contour error compensation strategy is first presented to address the lag characteristic of the feeding system and ensure good compensation performance, and then a reinforcement learning-based approach is designed to determine the essential parameters of the proposed strategy.

Contour error compensation strategy
The general idea of the proposed contour error compensation method is shown in Fig. 8. At the tth sampling instant, the compensation values for the X- and Y-axes are denoted by $c_x^{(t)}$ and $c_y^{(t)}$, respectively. Once they are determined, the compensated position is sent to the CNC controller to reduce the contour error and improve the machining accuracy.
To achieve good compensation performance, we designed two parameters in the proposed compensation model, i.e., the forward compensation cycle T and the compensation rate k, which are respectively defined to counter the lag characteristic of the control system and mechanical system of the feeding axis and to ensure good compensation performance. Given a reference position $u^{(t)}$ at the tth instant, the corresponding compensated position $u'^{(t)}$ as calculated from the compensation policy can be regarded as a function of the parameters T and k, as given in Eq. (9). In Eq. (9), $u'^{(t)}$ at the tth instant is compensated based on the future contour error $\varepsilon_c^{(t+T)}$ at the (t+T)th instant; in this way, the lag characteristic is counterbalanced by the parameter T at the tth instant, as shown in Fig. 9.
In the proposed compensation strategy, the compensation values for the X- and Y-axes, $c_x^{(t)}$ and $c_y^{(t)}$, are given by Eq. (10), where α is the angle between the normal direction at $u^{(t)}$ and the Y-axis. Given the compensation values for the X- and Y-axes per Eq. (10), the compensated positions $u_x'^{(t)}$ and $u_y'^{(t)}$ of the corresponding reference path are given by Eq. (11). The compensated position as calculated from Eq. (11) is then sent to the CNC system to generate the modified contour, which is formed by the compensated positions and is close to the original reference contour.
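A minimal sketch of how such a look-ahead compensation could be computed is given below. It is only an illustration under stated assumptions, not the authors' Eq. (9) to Eq. (11): the contour error is assumed to be signed along the path normal, the compensation magnitude is taken as k times the contour error at instant t+T, it is split into X and Y components by sin α and cos α, and it is subtracted from the reference position. The function name and the clamping of the look-ahead index at the end of the path are likewise hypothetical.

```python
import math


def compensate_path(ref_x, ref_y, contour_err, alpha, T, k):
    """Look-ahead compensation sketch (sign convention is an assumption).

    ref_x, ref_y: reference positions per instant
    contour_err:  signed predicted contour error per instant (along the normal)
    alpha:        angle between the path normal and the Y-axis per instant (rad)
    T:            forward compensation cycle (number of instants to look ahead)
    k:            compensation rate
    """
    n = len(ref_x)
    comp_x, comp_y = [], []
    for t in range(n):
        future = min(t + T, n - 1)        # clamp at the path end (assumption)
        c = k * contour_err[future]       # compensation magnitude
        comp_x.append(ref_x[t] - c * math.sin(alpha[t]))
        comp_y.append(ref_y[t] - c * math.cos(alpha[t]))
    return comp_x, comp_y
```

Because the value at instant t+T is applied at instant t, the correction is issued T control cycles early, which is the role the parameter T plays against the lag of the feeding system.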
For the proposed compensation strategy, one remaining issue is how the two compensation parameters (i.e., the T and k of Eq. (10)) can be determined so that a good compensation result can be achieved. In Section 3.2, a reinforcement learning-based approach is applied to determine the two parameters.

TS-DQN-based identification of compensation parameters
As a popular and dominant reinforcement learning method, the Deep Q-network (DQN) learns by interacting with the environment to obtain the optimal policy under a set reward function, and it has been widely used in the domain of optimal control [30,31]. Considering the time-series feature of the feeding system, a modified DQN, named Time-Series Deep Q-network (TS-DQN), is presented for identifying the compensation parameters. The general framework of the TS-DQN is shown in Fig. 10, in which a time-series replay memory is used to replace the traditional replay memory of the classical DQN [30]. For the replay memory in the proposed TS-DQN, the rewards in historical steps can be re-updated according to the time-series state of the feeding system, so as to achieve higher parameter identification efficiency.
The TS-DQN-based compensation parameter identification scheme is shown in Fig. 10. The PMCE of Section 2.3 is utilized as the environment for the proposed TS-DQN. Two Q-networks with the same structure but different parameters are used; and between them, one is the main Q-network for calculating the Q-value of the agent's action from Eq. (12) and selecting the action under the maximum Q-value, and the other is the target Q-network for assisting training the network by calculating the target Q-value.
The three core components of the TS-DQN algorithm are the prediction performance state $s^t$ (the superscript "t" represents the data at the tth instant), the decision action on the compensation parameters $(T^t, k^t)$, and the reward $r^t$ for the corresponding decision action in a specific state. In addition, the proposed TS-DQN involves an agent, a set of states S, and a set of decision actions for (T, k). In this paper, there are 8 potential decision actions, as listed in Table 1.
By selecting and executing a decision action according to the ε-greedy policy, the agent changes from one state to another. In the meantime, the agent gets the reward r when executing a decision action in a specific state. The Q-value is updated according to Eq. (12), where α is the learning rate and γ is the discount factor. The data set $(s^t, T^t, k^t, r^t, s^{t+1})$ obtained through the interaction of the agent with the environment at the tth instant is stored in the time-series replay memory. According to the time-series feature of the feeding system of the CNC machining, if the same decision action is selected in adjacent steps and a better compensation performance is obtained, the reward $r^{t-1}$ of the last step is re-updated in the time-series replay memory by Eq. (13),
$$r^{t-1} = r^{t-1} + p \cdot r^{t} \tag{13}$$
Then, the rewards in the m previous steps can be updated in the time-series replay memory by
$$r^{t-n} = r^{t-n} + p \cdot \left(r^{t-n+1} + \cdots + r^{t-1} + r^{t}\right) \tag{14}$$
The Q-network of the TS-DQN is trained by using a batch of data from the time-series replay memory, while the parameters θ of the main Q-network are updated by minimizing the loss function of Eq. (15) via the gradient descent algorithm.
In Eq. (15), $\theta$ and $\theta^{-}$ are the parameters of the main Q-network and the target Q-network, respectively. The reward value function corresponding to the specific state and action is given in Eq. (16). By estimating the compensation performance through the prediction model of contour error, the reward $r^t$ at the tth instant is given by Eq. (16). In Eq. (16), $E^t_{MSE}$ and $\mathrm{abs}(E^t)_{max}$ are the mean square error (MSE) and the maximal contour error for the tth instant, respectively; $\varepsilon^t_E$ is the weighted sum of $E^t_{MSE}$ and $\mathrm{abs}(E^t)_{max}$, where $E^t_{MSE}$ accounts for a small weight of 0.3 and $\mathrm{abs}(E^t)_{max}$ has a dominant weight of 0.7 because $\mathrm{abs}(E^t)_{max}$ normally has more effect on the final machining result; $\tau_E$ = 2 μm is the termination criterion of the current iteration because the best predicted value is 2 μm for our contour error prediction model on our machine tool; c is a constant of 1000.
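The weighted error measure used inside the reward can be sketched as follows. Only the weighted sum with the stated weights 0.3 and 0.7 is taken from the text; the function name is hypothetical, the errors are assumed to be given in μm, and the mapping from this measure to the reward itself (Eq. (16)) is not reproduced here.

```python
def weighted_contour_metric(errors_um, w_mse=0.3, w_max=0.7):
    """Weighted sum of the MSE and the maximal absolute contour error."""
    if not errors_um:
        raise ValueError("at least one contour error sample is required")
    mse = sum(e * e for e in errors_um) / len(errors_um)
    max_abs = max(abs(e) for e in errors_um)
    return w_mse * mse + w_max * max_abs


# Contour errors of 1, -2, and 1 um give MSE = 2.0 and max = 2.0,
# so the weighted measure is 0.3 * 2.0 + 0.7 * 2.0 = 2.0.
metric = weighted_contour_metric([1.0, -2.0, 1.0])
```

The larger weight on the maximal error reflects the remark above that the worst-case deviation tends to dominate the final machining result.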
For the proposed TS-DQN, the learning rate α and the discount factor γ of Eq. (12), the time-series correlation factor p of Eq. (13), and the constant c of Eq. (16) are all hyperparameters set by trial and error according to experience. For machine learning-based methods, the manual setting of hyperparameters is essentially a drawback, and the criteria for selecting these parameters can be found in [30,32]. The compensation set (T, k), in contrast, consists of parameters of the proposed TS-DQN that are obtained through the training of the TS-DQN.
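The roles of the ε-greedy policy, the discount factor γ, and the two Q-networks can be illustrated with a short sketch. It follows the standard DQN formulation of [30] and is an assumption about the general form, not a reproduction of Eq. (12) or Eq. (15); the Q-values are passed in as plain lists, and all function names are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward, next_q_target, gamma, terminal=False):
    """Target Q-value computed from the target network's outputs."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def squared_td_loss(q_main_sa, target):
    """Squared error between the main network's Q(s, a) and the target."""
    return (target - q_main_sa) ** 2
```

With 8 candidate actions for (T, k), `q_values` would hold 8 entries, one per action of Table 1.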
For the proposed TS-DQN as shown in Fig. 10, an algorithm called TS-DQN with Experience Replay is given for a clear explanation. As already mentioned, a time-series replay memory is used in the proposed TS-DQN to replace the traditional replay memory of DQN, and the reward re-update is added in the time-series replay memory by setting $r^{t-1} = r^{t-1} + p \cdot r^{t}$ when $a^{t-1} = a^{t}$ and $s^{t-1} > s^{t\prime}$. With the TS-DQN with Experience Replay algorithm, the two compensation parameters (i.e., the T and k of Eq. (10)) at each instant of the machining process are identified so as to minimize the contour error. Results on the identification of the two parameters and the follow-up compensation will be presented in Section 4.

Fig. 14 Five test contours. a Circular contour; b heart contour; c goggles contour; d rhombus contour; e star contour
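The single-step reward re-update can be sketched as a small replay memory class. This is a simplified illustration: the state is assumed to be a scalar error measure (smaller is better), the improvement test compares the previous state with the new next state, and only the one-step rule is implemented, not the m-step variant. The class and field names are hypothetical.

```python
class TimeSeriesReplayMemory:
    """Replay memory whose previous reward is re-updated on repeated success."""

    def __init__(self, p):
        self.p = p                 # time-series correlation factor
        self.transitions = []      # dicts: state, action, reward, next_state

    def push(self, state, action, reward, next_state):
        if self.transitions:
            prev = self.transitions[-1]
            same_action = prev["action"] == action
            improved = prev["state"] > next_state   # assumed scalar states
            if same_action and improved:
                # Re-update of the previous reward: r_prev += p * r_now
                prev["reward"] += self.p * reward
        self.transitions.append({
            "state": state,
            "action": action,
            "reward": reward,
            "next_state": next_state,
        })
```

Repeating a successful action therefore raises the stored reward of the step before it, which is how the memory carries time-series information back into training batches.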

Experiments
In this section, some experiments are conducted to verify the proposed NAR-LSTM network-based contour error modeling approach and the TS-DQN-based contour error compensation strategy.

Experimental setup
The experimental verifications of the proposed contour error prediction and compensation approaches were conducted on the BM8-H three-axis CNC milling center equipped with a Huazhong HNC-818B controller, as shown in Fig. 11. In addition, the structure of the three-axis CNC milling center is given in Fig. 12. For the machine tool, the 2D contour error resulting from the tracking error (both linear and nonlinear) of the X- and Y-axes is calculated and compensated, where the reference positions $u_x$ and $u_y$ are generated and retrieved from the CNC system, and the actual positions of the feeding axes $w_x$ and $w_y$ are detected by the linear encoders mounted on the working table (shown in Fig. 13); both data are collected via the data acquisition function of the CNC system with a sampling frequency of 1 kHz. Furthermore, the HSV-180UD-075 servo controller is utilized and set to the position control mode. To show the ability of our model to predict the results for mismatched axes, the proportional gains for the two axes were set to be different, with $K_{px}$ = 160 Hz and $K_{py}$ = 120 Hz.

Fig. 15 Actual (red) and predicted (blue) tracking error, prediction error (green) of the X-axis
Fig. 16 Actual (red) and predicted (blue) tracking error, prediction error (green) of the Y-axis

To verify the proposed approaches in terms of contour error prediction and compensation, two elementary contours (i.e., the circle shown in Fig. 14a and the rhombus shown in Fig. 14d) and three more freeform contours (i.e., the heart contour, goggles contour, and star contour shown in Fig. 14b, c, and e, respectively) are used.
For the proposed NAR-LSTM network in Section 2, its structure is presented in Section 2.2. In each time-series block of the NAR-LSTM network, two basic LSTM cells and one fully connected layer were used. The number of nodes in each cell and the parameter g are hyperparameters that were found by trial and error [35,36], and they are 60 and 14 in this paper, respectively. The dimensions of the input and output are 4 and 1, respectively. Moreover, as in a typical recurrent neural network, the structure and parameters of the NAR-LSTM are reused in each control cycle. For the proposed TS-DQN in Section 3, its Q-network is a three-layer fully connected neural network with 10 nodes in the hidden layer.

Fig. 19 Contour error compensation for heart contour. a Reference, uncompensated, and compensated heart contour; b contour error before and after compensation

Experimental results
In this section, the experimental results on the NAR-LSTM network-based contour error prediction are first reported, and then the feasibility and advantage of the proposed TS-DQN-based contour error compensation approach are verified by simulation and physical cutting experiments.

Contour error prediction
The detailed results on the contour error prediction are presented for machining goggles contour with the feedrate of 2400 mm/min. Results on the other contours and the other feedrates are similar, with their details presented in Section 4.2.2.
The prediction results for the tracking error of the X- and Y-axes from the approach of Section 2.1 and Section 2.2 are shown in Fig. 15 and Fig. 16, respectively. The predicted results match the actual tracking error, which is retrieved from the controller of the CNC machine tool, very well. For the X-axis with the maximum actual tracking error of 0.5659 mm (as shown in Fig. 15), the prediction error, which is the deviation between the actual position and the predicted position, has a maximum value of 0.0019 mm, which is only 0.34% of the corresponding actual tracking error. For the Y-axis, which has the largest actual tracking error of 0.5661 mm, the maximum prediction error is 0.0021 mm, which is only 0.37% of the actual tracking error at this location. The results shown in Fig. 15 and Fig. 16 validate that the proposed NAR-LSTM network-based tracking error modeling approach has very good prediction accuracy.
Based on the tracking error as estimated from the NAR-LSTM network, the contour error can be calculated with the approach presented in Section 2.3. Figure 17 shows the predicted contour error, the actual contour error, and the deviation (prediction error) between them when machining the goggles contour. For the location with the maximal actual contour error of 0.0219 mm, the prediction error is 0.0020 mm, which is 9.1% of the actual contour error. The results in Fig. 17 validate that, based on the tracking error as modeled via the NAR-LSTM network, the contour error can be predicted with good accuracy.

Contour error compensation
Based on the approach presented in Section 3, the experiments for contour error compensation are first conducted for the circular contour, heart contour, and goggles contour, with the results shown in Fig. 18, Fig. 19, and Fig. 20, respectively. The tracking error for machining the contours of Fig. 18, Fig. 19, and Fig. 20 is first predicted with the proposed NAR-LSTM of Section 2.2, and the contour error is then calculated with the model presented in Section 2.3. After that, the compensation parameters (T, k) are identified by the proposed TS-DQN. Given the predicted contour error and the optimal compensation parameters, the compensated positions of the X-axis and Y-axis are calculated from Eq. (11) and sent to the CNC system to execute the compensation. In this way, the three contours can be machined with the compensated results.
For the circular contour, the contour error has some significant reduction after compensation, with the maximum contour error reduced from 0.0204 to 0.0039 mm, an 80.88% reduction; the compensated contour is very close to the original reference contour, as shown in Fig. 18. For the heart contour, the contour error is also largely reduced, with the maximum contour error reduced from 0.0183 to 0.0054 mm, as shown in Fig. 19, which has a 70.49% reduction. For the goggles contour shown in Fig. 20, the maximum contour error is reduced from 0.0262 to 0.0048 mm, an 81.68% improvement.
Note that, to make the results visible, the contour error shown in Fig. 18 a, Fig. 19 a, and Fig. 20 a is magnified 200 times. Figures 18-20 are experimental results for a feedrate of 2400 mm/min; to comprehensively verify the feasibility and advantage of the proposed method, the contour error compensation is also conducted at three other feedrates, i.e., 1200 mm/min, 3600 mm/min, and 4800 mm/min. To evaluate the compensation results, the Maximal Absolute Error (MAE) of the contour error, which is the critical quantity affecting the machining quality, is listed in Table 2 for the four feedrates. The MAE is defined as MAE = max_{1 ≤ k ≤ N} |ε(k)|, where ε(k) is the contour error at the kth instant and N is the total number of instants. The experimental results in Table 2 show that, for various contours machined at different feedrates, our approach significantly reduces the contour error in all cases, with the MAE of the contour error reduced by 70.49~84.53% after the compensation.
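As a small worked check of the figures quoted above, the following Python sketch computes the MAE of a contour error sequence (taken here as the largest |ε(k)| over the N instants) and the percentage reduction between the maximum errors before and after compensation. The function names `max_abs_error` and `reduction_percent` are illustrative and do not come from the paper; the numeric inputs are the 2400 mm/min values reported in the text.

```python
import numpy as np


def max_abs_error(eps):
    """Maximal absolute error of a contour error sequence eps(1..N)."""
    return float(np.max(np.abs(np.asarray(eps, dtype=float))))


def reduction_percent(before, after):
    """Relative reduction of the maximum contour error, in percent."""
    return 100.0 * (before - after) / before


# Maximum contour errors (mm) before and after compensation at 2400 mm/min,
# as reported in the text for the three contours.
reported = {
    "circular": (0.0204, 0.0039),
    "heart": (0.0183, 0.0054),
    "goggles": (0.0262, 0.0048),
}

for name, (before, after) in reported.items():
    print(f"{name}: {reduction_percent(before, after):.2f}% reduction")
```

Run on the reported values, this reproduces the 80.88%, 70.49%, and 81.68% reductions stated for the circular, heart, and goggles contours.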
To further verify the effectiveness of the proposed contour error compensation approach in improving the machining quality, physical cutting experiments are conducted on the milling center BM8-H, machining a circular contour with a radius of 35 mm at a feedrate of 3000 mm/min. The machined parts before and after compensation are shown in Fig. 21.
Fig. 22 a Inspection scenario of the machined parts; b machining error and roundness before and after compensation
Figure 21 shows that, after contour error compensation, the surface roughness is improved drastically, especially in the machining regions where the X- or Y-axis of the machine tool reverses its moving direction and surface defects such as bumps or grooves can easily occur. In addition to the surface roughness, the machining precision of the two circles is inspected with the WENZEL LH108 coordinate measuring machine (Fig. 22 a). The inspection results in Fig. 22 b confirm that, with the proposed compensation technology, the machining error is reduced significantly: the maximum machining error drops from 15.1 to 7.6 μm, and the roundness of the circular part improves from 12.7 to 5.3 μm.
In our work, the compensation parameters of Section 3.1 are identified with the proposed TS-DQN. Compared with the traditional DQN [30], the proposed TS-DQN has a time-series replay memory, which allows the rewards of historical steps to be re-updated based on the time-series state of the feeding system; designed in this way, the TS-DQN improves the parameter identification efficiency. The computational time for identifying the parameters with the proposed TS-DQN and the traditional DQN is listed in Table 3. Both algorithms are run on the same laptop with an Intel Core i7 2.60 GHz CPU and 16 GB RAM. The DQN takes 25.51 s to identify the compensation parameters, while the proposed TS-DQN takes only 14.83 s; i.e., the proposed TS-DQN has a 41.81% reduction of computational time.
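To make the idea of re-updating historical rewards concrete, the following Python sketch shows one possible replay memory with a retroactive reward update. It is purely illustrative: this section does not state the TS-DQN update rule, so the rule used here (adding p^j times the newest reward to the j-th most recent stored transition) is a hypothetical assumption, as are the class name `TimeSeriesReplayMemory` and the parameters `window` and `p`.

```python
import random
from collections import deque


class TimeSeriesReplayMemory:
    """Replay memory whose recent rewards are revised when a new step arrives.

    Hypothetical illustration only: the re-update rule below is an assumption,
    not the rule used by the TS-DQN of the paper.
    """

    def __init__(self, capacity=1000, window=3, p=0.5):
        self.buffer = deque(maxlen=capacity)
        self.window = window  # how many past transitions are revised
        self.p = p            # assumed time-series correlation weight

    def push(self, state, action, reward, next_state):
        # Revise the rewards of the most recent transitions: the j-th most
        # recent one receives an extra p**j share of the new reward.
        for j in range(1, min(self.window, len(self.buffer)) + 1):
            s, a, r, s2 = self.buffer[-j]
            self.buffer[-j] = (s, a, r + (self.p ** j) * reward, s2)
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling, as in a standard DQN replay memory.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


memory = TimeSeriesReplayMemory(window=2, p=0.5)
memory.push("s0", 0, 1.0, "s1")
memory.push("s1", 1, 2.0, "s2")
memory.push("s2", 0, 4.0, "s3")
print([transition[2] for transition in memory.buffer])
```

In a standard replay memory the stored rewards would stay at 1.0, 2.0, and 4.0; here the earlier entries are revised each time a later step is pushed, which is the general mechanism the paragraph above describes.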
To further verify the contour error compensation performance at locations of high curvature, more experiments are conducted on more complex trajectories, i.e., the rhombus and star curves shown in Fig. 14 d and e, respectively. For the rhombus contour, as shown in Fig. 25 a, the contour error is significantly reduced after compensation, with the maximum contour error reduced from 0.0138 to 0.0054 mm, a 60.87% reduction; the compensated contour is very close to the original reference contour, as shown in Fig. 23. For the star contour, the maximum contour error is reduced from 0.0247 to 0.0061 mm, a 75.30% reduction, as shown in Fig. 25 b; the compensated contour is also close to the original reference contour, as shown in Fig. 24. The experimental results for the rhombus and star contours validate the robustness of the proposed contour error compensation method: even when the contour has drastic changes of curvature (i.e., sharp corners), the proposed approach still achieves good compensation results (Fig. 25).
In addition to the contour error compensation approach proposed in Section 3, two more strategies (benchmarks) are implemented to verify the advantage of the proposed approach in reducing the contour error. The first strategy, called the tracking error compensation method, directly compensates the tracking error of each feeding axis according to the prediction results of the NAR-LSTM network. The other strategy, called the direct contour error compensation approach, compensates the contour error without considering the two parameters of the forward compensation cycle and the compensation rate (i.e., by setting T=0 and k=1 in Eq. (10)). Experimental results of our approach and the two benchmarks on the goggles contour are presented in Fig. 26.
The results show that our approach, based on the optimal selection of the compensation parameters, achieves much better compensation than the two benchmarks and the original uncompensated case: the maximal contour error of our approach is 4.82 μm, which is significantly smaller than those of the two benchmarks (i.e., 13.77 μm and 17.85 μm) and of the uncompensated case (i.e., 26.24 μm).
Theoretically, the two benchmarks are supposed to perform well in improving the contouring accuracy. However, in real cutting practice, direct tracking error compensation and direct contour error compensation work poorly because the actual responses lag behind the reference inputs, owing to the lag characteristics of the control system and the mechanical system of the feeding system. Therefore, the compensation values do not take effect immediately under direct compensation. To achieve good compensation performance, we design two compensation parameters (i.e., the forward compensation cycle T and the compensation rate k) and use the deep reinforcement learning approach (TS-DQN) to determine their optimal values. In this way, the lag of the feeding system can be counterbalanced, which is why the proposed method achieves better contour error compensation than the two benchmarks.
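The lag argument above can be illustrated with a toy simulation. The Python sketch below is not the paper's model: it assumes a single axis behaving as a first-order lag, assumes a compensated command of the form u[i] = r[i] + k * e_pred[i + T] (the paper's Eq. (10) and Eq. (11) are not reproduced here), uses the uncompensated tracking error as a stand-in for the NAR-LSTM prediction, and replaces the TS-DQN with a brute-force grid search over (T, k). All names and numeric values are hypothetical.

```python
import numpy as np


def simulate_axis(u, a=0.2):
    """Toy first-order lag axis: y[i+1] = y[i] + a * (u[i] - y[i]), y[0] = 0."""
    y = np.zeros(len(u))
    for i in range(len(u) - 1):
        y[i + 1] = y[i] + a * (u[i] - y[i])
    return y


def max_error_after_compensation(r, e_pred, T, k, n):
    """Apply u[i] = r[i] + k * e_pred[i + T] (assumed form) and return
    max |r - y| over the first n samples."""
    u = r[:n] + k * e_pred[T:T + n]
    return float(np.max(np.abs(r[:n] - simulate_axis(u))))


T_MAX, N = 10, 400
r = np.sin(0.05 * np.arange(N + T_MAX))   # reference position of one axis
e_pred = r - simulate_axis(r)             # stand-in for the predicted error

uncompensated = float(np.max(np.abs(e_pred[:N])))
direct = max_error_after_compensation(r, e_pred, T=0, k=1.0, n=N)

# Brute-force search over (T, k); the paper identifies them with TS-DQN instead.
candidates = [(T, k) for T in range(T_MAX + 1) for k in (0.6, 0.8, 1.0, 1.2)]
best_T, best_k = min(
    candidates,
    key=lambda c: max_error_after_compensation(r, e_pred, c[0], c[1], N),
)
best = max_error_after_compensation(r, e_pred, best_T, best_k, N)

print(f"uncompensated={uncompensated:.4f} direct={direct:.4f} "
      f"best={best:.4f} at T={best_T}, k={best_k}")
```

In this toy setting, direct compensation (T=0, k=1) already lowers the maximum error, but applying the correction a few cycles ahead lowers it further, because the advance offsets the phase lag of the axis. This mirrors, in a much simplified form, the motivation for tuning T and k.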
In this section, comprehensive experiments are designed and conducted to verify the proposed NAR-LSTM network-based contour error prediction and the TS-DQN-based contour error compensation approaches. The experimental results show that the proposed approach predicts the contour error with good accuracy (prediction error less than 9.1%), and that the compensation strategy presented in this paper significantly counterbalances the contour error of the machining process (the contour error is reduced by 60.87~84.53%), with the machining quality improved significantly after the compensation (maximal machining error reduced by 49.67% for machining the circular contour). Besides, the proposed TS-DQN-based contour error compensation approach performs much better both in computational efficiency compared with the traditional DQN (a 41.81% reduction of computational time) and in compensated results compared with the benchmarks.

Conclusion
In this paper, a set of approaches is presented for the contour error compensation of CNC machining. To achieve this goal, a deep learning network, NAR-LSTM, is designed that can precisely predict the tracking error of the machine's feeding axis from the internal data of the CNC system, based on which the contour error can be calculated. Then, a contour error compensation strategy is proposed, for which the compensation parameters are identified by the TS-DQN-based reinforcement learning approach. The validity of the contour error prediction and compensation methods was confirmed by extensive experiments on a three-axis CNC machine tool. The experimental results show that the proposed method predicts the contour error accurately, both for elementary shapes like a circle and for freeform curves like the heart and goggles contours. The experimental results also validate the effectiveness and the advantage of the proposed contour error compensation approach in terms of improving the machining quality and enhancing the computational performance.
The method proposed in this paper has two major limitations. One is that it can currently only be used in a thermal steady state of the machine tool, because effective training samples can only be acquired in that state. The other is that the proposed networks have many hyperparameters (e.g., learning rate α, discount factor γ, time-series correlation factor p) that must be determined by trial and error, which is complicated and time-consuming. As for future work, in addition to 2D contours, the proposed deep learning-based tracking error modeling and reinforcement learning-based contour error compensation approaches could be extended to improve the machining quality of 3D surface machining. Besides, we will focus on how to reuse the generated compensation values for different feedrates via transfer learning. Finally, based on the control error prediction result, feedforward control methods will be developed to further increase the contouring accuracy of CNC machining.