Application of support vector machine for the prediction of strip crown in hot strip rolling

Abstract: To enhance the prediction accuracy of the strip crown and improve the quality of the final product in hot strip rolling, an optimized model based on the support vector machine (SVM) is proposed. To enrich the data and ensure its quality, actual data from a hot rolling plant were collected to establish the prediction models, and their prediction performance was evaluated using multiple indicators. In addition to the traditional SVM model, combined prediction models using particle swarm optimization (PSO) and the cuckoo search (CS) optimization algorithm are proposed, and the prediction performance of the three methods is compared and validated. The results show that CS-SVM has the highest prediction accuracy of the three methods: its root mean squared error (RMSE) is 2.05 µm, and 98.11% of the predicted data have an absolute error below 4.5 µm. The results also demonstrate that CS-SVM not only offers faster convergence and higher prediction accuracy but can also be applied well to actual hot strip rolling production.


Introduction
The production of hot-rolled strip occupies an important position in the industrial system and, owing to its excellent performance, hot-rolled strip is widely used in industry, agriculture, defense and daily life. However, quality problems with the strip crown may adversely affect users, and poor quality of the final product may lead to serious consequences. For example, deviations of the strip crown may not only cause process disruptions but also lead to shape defects and product failure, so ensuring product quality and improving shape accuracy remains a challenging task [1]. The hot strip rolling process is therefore of great significance in manufacturing and processing, and the quality of the hot strip crown is particularly critical.
In addition, the hot strip rolling process is a large system with a complicated mechanism and many parameters, and the parameters are strongly coupled. Consequently, it is extremely difficult to set up an accurate mathematical model. Current strip crown control methods usually adopt a simplified mathematical model, which is corrected with production data over long-term production. For example, Sun et al. [2] presented a profile distribution model based on dynamic programming, in which the major parameters are computed by the finite element method and a regression strategy keeps the bending force and shift position within an appropriate range. He et al. [3] put forward a skew rolling method and used the finite element method to simulate the deformation of strips. Such methods are effective in production, but the individual process control parameters are clearly not optimal, which limits further improvement of strip crown prediction and control accuracy.
*Correspondence: mlf_zgtyust@163.com. 1 School of Mechanical Engineering, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, P.R. China. Full list of author information is available at the end of the article.
The past several decades have witnessed rapid development of big data and intelligent control techniques. A growing body of research applies artificial intelligence methods to strip crown control algorithms, and a significant amount of research has addressed forecasting. For example, Ding et al. [4] proposed a load distribution optimization based on the max-min ant colony algorithm for the rolling process. Nandan et al. [5] presented an evolutionary multi-objective method based on the genetic algorithm (GA) for strip control. Hu et al. [6] proposed surface defect classification in large-scale strip steel image collections via a hybrid chromosome genetic algorithm. Wang et al. [7] used a hybrid AMPSO-SVR-based approach to forecast the strip crown in hot rolling. Beyond rolling, Abedinia et al. [8] used a combinatorial neural network method to forecast future prices, Qiu et al. [9] proposed an empirical mode decomposition based ensemble deep learning method for load demand time series forecasting, and Qureshi et al. [10] proposed a wind power prediction system using a DNN-based ensemble strategy. Alaei et al. [11] presented online prediction of work roll thermal expansion in hot rolling based on a neural network.
The strip crown in hot rolling is a non-linear problem, and its mechanism is extraordinarily complicated because the rolling force, roll speed and other major parameters change during rolling. The artificial neural network, a typical machine learning method based on the principle of empirical risk minimization, is currently one of the most popular machine learning algorithms owing to its strong non-linear modelling capability and adaptive information processing. However, the performance of neural networks is highly dependent on the data samples, and the approximation and generalization capabilities of network models are closely related to how representative the learning samples are.
The SVM, another prevalent machine learning algorithm, is based on statistical learning theory and structural risk minimization and can achieve stronger generalization capability with fewer support vectors [12]. Therefore, in this paper, a strip crown prediction model based on SVM is established, and the parameters of the prediction model are optimized by the CS algorithm to improve its prediction accuracy.
The remainder of this paper is arranged as follows. Section 2 describes the hot-rolled strip process, the definition of strip crown and its main influencing factors. Section 3 presents the SVM-based methods and the experimental analysis of the SVM, PSO-SVM, and CS-SVM prediction models, followed by the conclusions in Section 4.

Hot rolled strip
Many studies have sought to better understand the strip crown control capacity of hot rolling mills during rolling [13]. Zamanian et al. [14] designed a new type of mill for producing strip with zero crown. Zhao et al. [15] proposed a high precision shape model for hot strip rolling. Schausberger et al. [16] presented a feedback control of the contour shape in hot rolling. Producing strip with uniform thickness is of great significance to the rolling industry, but it remains difficult to achieve even with existing technologies such as the pair cross mill and the high crown control mill.
With the rapid development of computer technology, automation control is increasingly applied to the hot rolling process, so a large quantity of rolling data can easily be collected. Data-driven methods based on big data can analyze these data, establish a non-linear model of the crown, and use it to achieve accurate prediction and control of the crown. This approach has broad application prospects and room for improvement in crown control. The hot rolling production line that generated the experimental dataset is illustrated in Fig.1; its finishing train consists of seven 4-high stands. The parameter configuration of the CVC mill is listed in Table 1 and its structure is shown in Fig.2 (a and b): each finishing stand consists of a pair of work rolls and a pair of backup rolls, and the roll gap is adjusted through the upper backup roll, which is driven by two hydraulic cylinders.

The definition of strip crown and main influencing factors
Strip shape includes crown and flatness: the cross-sectional profile is described by the crown, and flatness is described by differential elongation [17]. It is generally considered that the thickness variation between the left and right marking points of the hot-rolled strip is parabolic, so the profile is used to represent the shape characteristics of the cross-section. As shown in Fig.3, the absolute strip crown is defined as the difference between the thickness at the midpoint of the strip in the width direction and the average of the thicknesses at the marked points 40 mm from the strip edge on both sides:
$$CR = h_c - \frac{h_{ds} + h_{os}}{2} \qquad (1)$$
where $CR$ is the strip crown, and $h_{ds}$, $h_{os}$ and $h_c$ are the drive side, operator side and center thickness of the hot-rolled strip, respectively.
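As a small illustration of this definition, the sketch below computes the absolute crown from three thickness readings. The function name `strip_crown_um`, the choice of millimetres for the inputs and the sample values are assumptions made for illustration only; they are not taken from the plant data.

```python
def strip_crown_um(h_center_mm, h_drive_mm, h_operator_mm):
    """Absolute strip crown in micrometres.

    h_center_mm   : thickness at the strip midpoint (mm)
    h_drive_mm    : thickness at the drive-side marking point, 40 mm from the edge (mm)
    h_operator_mm : thickness at the operator-side marking point, 40 mm from the edge (mm)
    """
    # Crown = centre thickness minus the mean of the two edge-point thicknesses.
    crown_mm = h_center_mm - (h_drive_mm + h_operator_mm) / 2.0
    return crown_mm * 1000.0  # mm -> micrometres


# Hypothetical readings, for illustration only.
crown = strip_crown_um(3.050, 3.000, 3.010)
```

With these made-up readings the centre is 0.045 mm thicker than the mean edge thickness, i.e. a crown of about 45 µm.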
As the two main indicators of shape control, strip crown and flatness are not independent but mutually constrained and inseparable. According to the flatness condition of the strip, good flatness first requires that the strip has qualified crown accuracy, and the key to crown control is control of the roll gap shape. Therefore, the first task in solving the shape control problem is to ensure that the target strip crown is hit. In engineering practice, different specifications and grades correspond to different models, and it is difficult to build a reasonable strip crown model that is suitable for all circumstances. It is therefore general practice to decompose the non-linear problem into linear ones to simplify model building.
Many factors affect the strip crown during rolling; they fall mainly into two groups, the rolled product and the rolling conditions. However, current precision requirements for strip crown control can no longer be met by relying on traditional analytical methods and control theories, for three main reasons. First, the simplifying assumptions made in the modelling process greatly restrict further improvement of the accuracy of the strip crown model. Second, the complexity of site conditions and the difficulty of measuring some special variables directly and in real time make it harder to raise the precision of the strip crown model. Third, the difficulty of feedback control is also a key factor hindering the improvement of strip crown accuracy.
Therefore, it has become increasingly necessary to explore new control methods to improve the crown precision of hot-rolled strip. Artificial intelligence control has emerged as an advanced stage in the development of control theory and is mainly used to handle complex system problems that are difficult to solve with traditional control strategies. Strip crown control in the rolling process is a sophisticated dynamic process that is multivariable, non-linear and strongly coupled, which is exactly the kind of problem that artificial intelligence methods are good at solving. Research on high-precision strip crown prediction models based on industrial big data and artificial intelligence therefore has broad theoretical significance and practical application value.

The SVM-based methods
Most practical classification or regression problems are non-linear and complex, so the ideal classification surface should also be non-linear. In classification, the SVM searches for a hyperplane that separates two different types of data in the feature space [18]. For regression problems, as shown in Fig.4, the SVM deals with non-linearity by mapping the training set from the original space to a high-dimensional feature space through a non-linear transformation defined by a specific function. Given a set of non-linear sample data $[x_i, y_i]$, $i = 1, \dots, n$, where $x_i$ is the input value and $y_i$ is the corresponding target value, the SVM regression function is defined as
$$f(x) = \omega^{T} x + d$$
The objective function is
$$\min_{\omega, d}\ \frac{1}{2}\lVert \omega \rVert^{2} + C \sum_{i=1}^{n} L_{\varepsilon}\big(f(x_i) - y_i\big)$$
where $\omega$ and $d$ are the regression factors, $C$ is the regularization parameter, and $L_{\varepsilon}(\cdot)$ is the loss function.
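To make the role of the regularization parameter concrete, the following minimal sketch evaluates a regularized SVM regression objective for a linear model. It assumes the ε-insensitive loss that is commonly used in SVM regression; the source does not state the loss explicitly, so this choice, the function names and the sample numbers are illustrative assumptions rather than the authors' formulation.

```python
def eps_insensitive_loss(residual, eps):
    """Assumed loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(residual) - eps)


def svr_primal_objective(w, d, xs, ys, C, eps):
    """0.5 * ||w||^2 + C * sum of losses, for f(x) = w . x + d."""
    regularizer = 0.5 * sum(wk * wk for wk in w)
    total_loss = 0.0
    for x, y in zip(xs, ys):
        prediction = sum(wk * xk for wk, xk in zip(w, x)) + d
        total_loss += eps_insensitive_loss(prediction - y, eps)
    return regularizer + C * total_loss


# Hypothetical one-dimensional data, for illustration only.
value = svr_primal_objective(w=[1.0], d=0.0,
                             xs=[[1.0], [2.0]], ys=[1.05, 2.5],
                             C=10.0, eps=0.1)
```

In this toy case the first sample lies inside the tube and contributes nothing, while the second lies 0.4 outside it, so a larger C penalizes that deviation more heavily relative to the flatness term.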
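Assume that the non-linear mapping is $\varphi: x \mapsto \varphi(x)$, so that $f(x) = \omega^{T}\varphi(x) + d$. The objective function then becomes the dual problem
$$\max_{\alpha, \alpha^{*}}\ -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})K(x_i, x_j) - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^{*})$$
subject to $\sum_{i=1}^{n}(\alpha_i - \alpha_i^{*}) = 0$ and $0 \le \alpha_i, \alpha_i^{*} \le C$, where $\alpha_i$, $\alpha_i^{*}$ (and likewise $\alpha_j$, $\alpha_j^{*}$) are the Lagrange multipliers and $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ is the kernel function. The radial basis function is expressed as
$$K(x_i, x_j) = \exp\left(-g\lVert x_i - x_j\rVert^{2}\right)$$
where $g$ is the kernel parameter. Finally, the decision function of the SVM regression model is
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^{*})K(x_i, x) + d$$
PSO is one of the representative swarm intelligence optimization algorithms and an evolutionary computation technique, and it can also deal with multi-objective optimization problems (MOPs). It originated from research on the predation behavior of birds and was proposed by Kennedy et al. in 1995. The fundamental principle of PSO is to search for an optimal solution through cooperation and information sharing among the individuals in the group. Because it is simple, easy to operate and has few parameters to adjust, the PSO algorithm has been widely applied to many practical optimization problems.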
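The PSO principle described above can be sketched as follows. This is a minimal, generic PSO with assumed settings (inertia weight 0.7, acceleration coefficients 1.5, 20 particles, 100 iterations); it is not the authors' implementation, and the name `pso_minimize` and the toy objective are hypothetical. In the setting of this paper, the objective would be the cross-validated prediction error of an SVM trained with a candidate pair (c, g).

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Minimise objective(position) inside box bounds [(lo, hi), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best shared by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][k] = (inertia * vel[i][k]
                             + c_personal * r1 * (best_pos[i][k] - pos[i][k])
                             + c_social * r2 * (g_pos[k] - pos[i][k]))
                lo, hi = bounds[k]
                pos[i][k] = min(max(pos[i][k] + vel[i][k], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Toy stand-in for a cross-validation error surface over (c, g).
def toy_error(p):
    return (p[0] - 3.0) ** 2 + (p[1] - 0.5) ** 2


best_params, best_error = pso_minimize(toy_error, [(0.0, 10.0), (0.0, 2.0)])
```

The information sharing mentioned in the text appears here as the single global best position that every particle is attracted to.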
The CS optimization algorithm is a heuristic algorithm first put forward by Xin-She Yang and S. Deb in 2009. CS, regarded as more effective than other swarm optimization algorithms, was developed by simulating the breeding behavior of cuckoos. It is a population-based search procedure used as an optimization tool for solving complicated, non-linear optimization problems. The CS algorithm also adopts the Lévy flight search mechanism, which can solve optimization problems effectively. Compared with other heuristic searches such as the genetic algorithm (GA) [19], PSO [20], and artificial bee colony (ABC) [21], the main advantages of the CS algorithm are fewer parameters, simple operation, easy implementation, excellent random search paths, strong generalization ability, and the ability to converge to the global optimum. The main idea of CS is based on two strategies: the nest parasitism of cuckoos and the Lévy flight mechanism. Cuckoos search by random walk for an optimal host nest in which to hatch their eggs, which yields an efficient optimization model expressed as
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus L(\lambda)$$
where $t$ is the current generation number, $\alpha$ is generated from -1 to 1, $L(\lambda)$ is the Lévy random search path, and $L \sim u = t^{-\lambda}$.
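A compact sketch of the cuckoo search idea is given below. Several details are assumptions that the text does not specify: Lévy steps are drawn with Mantegna's algorithm (exponent 1.5), the step is scaled by a fixed fraction of each parameter range, and abandoned nests are simply re-drawn at random. The names `levy_step` and `cuckoo_search` and the toy objective are hypothetical, and the sketch is not the authors' implementation.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One heavy-tailed step via Mantegna's algorithm (assumed here)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0) or 1e-12   # guard against an exact zero
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=15, n_iter=200,
                  alpha=0.01, abandon_prob=0.25, seed=0):
    """Minimise objective(position) inside box bounds [(lo, hi), ...]."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]

    for _ in range(n_iter):
        # Levy-flight move: x(t+1) = x(t) + scaled Levy step, kept if better.
        for i in range(n_nests):
            cand = [min(max(x + alpha * (hi - lo) * levy_step(rng), lo), hi)
                    for x, (lo, hi) in zip(nests[i], bounds)]
            cand_fit = objective(cand)
            if cand_fit < fitness[i]:
                nests[i], fitness[i] = cand, cand_fit
        # Nest parasitism: some nests are discovered and rebuilt; the best is kept.
        best = min(range(n_nests), key=lambda i: fitness[i])
        for i in range(n_nests):
            if i != best and rng.random() < abandon_prob:
                nests[i] = random_nest()
                fitness[i] = objective(nests[i])

    best = min(range(n_nests), key=lambda i: fitness[i])
    return nests[best], fitness[best]


# Toy stand-in for a cross-validation error surface over (c, g).
def toy_error(p):
    return (p[0] - 3.0) ** 2 + (p[1] - 0.5) ** 2


best_params, best_error = cuckoo_search(toy_error, [(0.0, 10.0), (0.0, 2.0)])
```

Because the best nest is never abandoned and moves are accepted only when they improve the objective, the best value found cannot get worse from one generation to the next.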

Data pre-processing
A large number of data samples were collected from a hot rolling plant to build the SVM-based prediction model of strip crown. The framework of strip crown prediction is shown in Fig. 5.

Fig.5. Framework of strip crown prediction
To preserve the fidelity of the sample information and obtain reliable experimental results, the sample data must be processed rather than used directly. The dataset is described in Table 2; 40 major variables were selected to test the capability of the prediction models and to obtain a final model that is complete and robust. All the experimental data come from a hot strip rolling plant, and part of the data is listed in Table 3. In practice, the original data always contain considerable noise and outliers, which may lead to misleading predictions. In addition, the predicted data based on the prediction model correspond strongly to the measured data [22]. Fig. 6 (a and b) gives three-dimensional visualizations of the data, showing the distributions of the key variables and parameters that most strongly affect the strip crown in hot strip rolling. The strip crown ranges from 10 to 70 µm and is scattered irregularly over the horizontal and vertical coordinates. The large number of sample points and their considerable disorder partly indicate that strip crown prediction models established using this dataset will be robust in practical hot strip rolling production.
Fig.6. 3D visualization diagrams of data: (a) bending force and rolling force; (b) roll shifting and finishing temperature
The large amount of raw data collected from the hot rolling plant certainly contains noise and abnormal points. Therefore, eliminating the outliers and noise in the sample data on the basis of the Pauta criterion, and thereby recovering the original information, is a key step in ensuring the reliability of the experimental results.
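The criterion is defined as follows:
$$\lvert x_i - \bar{x} \rvert > 3 S_x$$
where $\bar{x}$ is the average of $x_i$ and $S_x$ is the standard deviation of the samples. A sample $x_i$ that satisfies this inequality is regarded as an outlier and eliminated.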
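The Pauta (3σ) rule can be sketched for a single variable as follows. The sketch assumes the sample standard deviation (n − 1 in the denominator) and one pass over the data; the function name `pauta_filter` and the sample values are hypothetical. In practice the rule would be applied to each measured variable, with the whole sample row discarded when any variable is flagged.

```python
import statistics


def pauta_filter(values):
    """Keep only values within three sample standard deviations of the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1), an assumption
    return [v for v in values if abs(v - mean) <= 3.0 * std]


# Hypothetical crown readings in micrometres with one gross outlier.
readings = [39.0, 41.0] * 10 + [120.0]
cleaned = pauta_filter(readings)
```

Note that the rule needs a reasonably large sample: with very few points a single outlier inflates the standard deviation so much that it can never exceed the 3σ limit.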
Moreover, to further remove noise and abnormal points and improve the quality of the experimental data, the data are smoothed using a five-point cubic smoothing algorithm, whose coefficients are obtained by the principle of least squares. For an interior point, the smoothing formula is
$$\bar{y}_n = \frac{1}{35}\left(-3y_{n-2} + 12y_{n-1} + 17y_n + 12y_{n+1} - 3y_{n+2}\right)$$
where $\bar{y}_n$ is the smoothed value of $y_n$. As a result, a total of 106 sample points were detected and eliminated, and 2701 samples were selected for the subsequent experiments. Part of the smoothed sample data is shown in Fig.7.
Fig.7. Five-point cubic smoothing
To avoid the influence of different dimensions, the sample points are normalized by min-max scaling:
$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (12)$$
where $y$, $x$, $x_{\min}$, and $x_{\max}$ are the normalized data, original data, minimal and maximal data, respectively.
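The two preprocessing steps can be sketched as follows. The smoothing routine uses the standard least-squares five-point cubic coefficients, including the usual special formulas for the two points at each end of the series; the source text does not show its end-point handling, so that part is an assumption, as are the function names.

```python
def five_point_cubic_smooth(y):
    """Five-point cubic (least-squares) smoothing of a sequence."""
    n = len(y)
    if n < 5:
        return [float(v) for v in y]  # too short to smooth
    s = [0.0] * n
    # End points use one-sided least-squares formulas.
    s[0] = (69 * y[0] + 4 * y[1] - 6 * y[2] + 4 * y[3] - y[4]) / 70
    s[1] = (2 * y[0] + 27 * y[1] + 12 * y[2] - 8 * y[3] + 2 * y[4]) / 35
    for i in range(2, n - 2):
        s[i] = (-3 * y[i - 2] + 12 * y[i - 1] + 17 * y[i]
                + 12 * y[i + 1] - 3 * y[i + 2]) / 35
    s[n - 2] = (2 * y[n - 5] - 8 * y[n - 4] + 12 * y[n - 3]
                + 27 * y[n - 2] + 2 * y[n - 1]) / 35
    s[n - 1] = (-y[n - 5] + 4 * y[n - 4] - 6 * y[n - 3]
                + 4 * y[n - 2] + 69 * y[n - 1]) / 70
    return s


def min_max_normalize(x):
    """Scale values to [0, 1] with y = (x - min) / (max - min)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0 for _ in x]  # constant column: no spread to scale
    return [(v - lo) / (hi - lo) for v in x]


# Hypothetical noisy crown readings in micrometres.
raw = [40.0, 43.0, 39.0, 45.0, 41.0, 44.0, 40.0, 46.0]
scaled = min_max_normalize(five_point_cubic_smooth(raw))
```

A useful sanity check on the coefficients is that data lying exactly on a cubic curve pass through the smoother unchanged, since the method fits a cubic to each window of five points.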

Establishment of predictive models

(1) The SVM model
When establishing the SVM model, two crucial and sensitive parameters, the penalty factor c and the kernel function parameter g, must be searched and tuned to achieve good accuracy. Determining these parameters accurately is tedious and complicated, and values that are too large or too small degrade the predictive ability of the established SVM model. However, there is no universally accepted method for selecting SVM parameters. Cross validation (CV) is a statistical analysis technique used to verify the performance of classification and regression models, and it is used here to select c and g.
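In addition, to appraise the capability of the prediction model comprehensively, two criteria, the MSE and $R^2$, are used, which are obtained by the following formulas:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(x_{ti} - x_i\right)^{2} \qquad (13)$$
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(x_i - x_{ti}\right)^{2}}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}} \qquad (14)$$
where $x_i$, $x_{ti}$, and $\bar{x}$ are the measured values, the predicted values, and the average of the measured values, respectively.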
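A minimal sketch of these two criteria is shown below. It assumes that R² is computed in the coefficient-of-determination form (one minus the ratio of residual to total sum of squares); some SVM toolboxes report a squared Pearson correlation instead, which can give a different value. The function names and sample numbers are hypothetical.

```python
def mean_squared_error(measured, predicted):
    """Average of squared differences between predictions and measurements."""
    n = len(measured)
    return sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n


def r_squared(measured, predicted):
    """Assumed form: 1 - SS_res / SS_tot, using the mean of the measured values."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical crown values in micrometres.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
mse_value = mean_squared_error(measured, predicted)
r2_value = r_squared(measured, predicted)
```

For these made-up numbers the squared errors are 4, 4, 9 and 1, giving an MSE of 4.5 µm² and an R² of 0.964.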
Fig.8. Selection result of parameters (c and g): (a) contour map; (b) 3D view
In Fig.8 (a and b), the contour map and the 3D view show the accuracy and mean squared error obtained by the CV algorithm.

(2) The PSO-SVM model
Compared with the CV method, PSO does not need the range of the optimization parameters to be selected repeatedly, and it has faster convergence and better optimization capacity. In most cases, PSO converges to an optimal solution faster than the CV method. The steps for optimizing the SVM parameters with the PSO algorithm are as follows:
Step 1 Load the whole processed experimental data and divide it into two parts: a training set (70%) and a testing set (30%). The training samples (1801) are used to train the SVM model and the remaining data (900) are used to test the predictive capability of the established model.
Step 2 Search for the best (c and g) combination with the PSO algorithm, repeating the process until the stopping condition is met.
Step 3 Output the best combination found by the PSO algorithm.
Step 4 Establish the SVM regression model using the optimal (c and g) parameter combination.
Step 5 Calculate the criteria of prediction model and evaluate its performance.
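Step 1 of the procedure above can be sketched as a reproducible index split. The sketch uses the sample counts stated in the text (1801 training and 900 testing samples out of 2701, which is close to a 2:1 split) rather than the nominal percentages. Shuffling before the split and the fixed seed are assumptions, since the text does not say how samples were assigned, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n_samples, n_train, seed=0):
    """Return (train_idx, test_idx) after a seeded shuffle of 0..n_samples-1."""
    if not 0 < n_train < n_samples:
        raise ValueError("n_train must be between 1 and n_samples - 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment
    return indices[:n_train], indices[n_train:]


# Counts taken from the text: 2701 cleaned samples, 1801 used for training.
train_idx, test_idx = split_indices(2701, 1801)
```

The training indices would then feed the parameter search and model fitting of Steps 2 to 4, while the held-out indices are reserved for the evaluation in Step 5.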

(3) The CS-SVM model
The PSO algorithm is based on a population rather than on gradient information, so it is able to search for the optimal solution in the global solution space. However, it also suffers from premature convergence and from low search precision and efficiency in later iterations [23]. CS, regarded as a more effective algorithm than other swarm optimization algorithms, is an optimization technique for handling complicated, non-linear optimization problems. It also adopts the Lévy flight search mechanism, which deals with optimization problems effectively. The steps for optimizing the SVM parameters using CS are the same as for the PSO-SVM model. The workflow of optimizing the SVM parameters with the CS algorithm is illustrated in Fig.9.

Experimental results and discussions
To assess the experimental results as thoroughly as possible, the mean absolute error (MAE), mean absolute percentage error (MAPE) and RMSE are employed to measure the performance of the established strip crown prediction models. In addition, the mean squared error (MSE) and squared correlation coefficient ($R^2$) are adopted to evaluate the prediction accuracy comprehensively.
The mathematical formulas are as follows:
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left\lvert x_i - x_{ti} \right\rvert \qquad (15)$$
$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left\lvert \frac{x_i - x_{ti}}{x_i} \right\rvert \times 100\% \qquad (16)$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - x_{ti}\right)^{2}} \qquad (17)$$
where $x_i$ are the measured values and $x_{ti}$ are the predicted values.
Fig.11 shows the prediction error histograms of the SVM, PSO-SVM, and CS-SVM models. For the CS-SVM model, 98.11% of the predicted data have an absolute error below 4.5 µm. The evaluation criteria of the SVM, PSO-SVM, and CS-SVM prediction models are shown in Fig. 10, which indicates that CS-SVM has the highest strip crown predictive capacity: from SVM to CS-SVM, the values of MAE, MAPE, and RMSE decline distinctly, and CS-SVM performs best.
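The three error criteria and the share of predictions within a given absolute error can be computed as in the sketch below. The function names and the sample values are hypothetical and unrelated to the plant data; the 4.5 µm threshold is the one quoted in the text.

```python
import math


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean absolute percentage error in percent (measured values must be non-zero)."""
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))


def share_within(measured, predicted, tolerance):
    """Fraction of predictions whose absolute error is below the tolerance."""
    hits = sum(1 for m, p in zip(measured, predicted) if abs(m - p) < tolerance)
    return hits / len(measured)


# Hypothetical crown values in micrometres.
measured = [40.0, 50.0, 60.0, 20.0]
predicted = [42.0, 47.0, 60.0, 25.0]
summary = {
    "MAE": mae(measured, predicted),
    "MAPE": mape(measured, predicted),
    "RMSE": rmse(measured, predicted),
    "within_4.5um": share_within(measured, predicted, 4.5),
}
```

Because MAPE divides by the measured value, it weights errors on thin-crown strips more heavily than MAE or RMSE do, which is one reason to report the three criteria together.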
Besides, the reliability of the input data, which may depend significantly on the experimental conditions or even human factors in an unknown or unrecorded manner, is critical to the credibility of the predicted results. Therefore, to validate the effectiveness and predictive performance of the model fully, we keep all other parameters unchanged, apply a step signal to one parameter, run the prediction in the plant, and verify the prediction results by comparison with the actual strip crown values. The test results of the CS-SVM prediction model are shown in Table 4.
In Table 4, only one of the parameters is changed at a time to form the testing set for prediction, and the predictions are compared with the actual data from the hot rolling plant. The prediction accuracy