Hybrid gray wolf optimization method in support vector regression framework for highly precise prediction of remaining useful life of lithium-ion batteries

Predicting the remaining useful life (RUL) of lithium-ion batteries plays a critical role in the battery management system, and precise RUL prediction helps guarantee the safe and reliable operation of batteries. To address the difficulty of selecting the kernel parameters of an RUL prediction model built on support vector regression, an intelligent gray wolf optimization algorithm is introduced for parameter optimization. Because the gray wolf algorithm is prone to premature stagnation and to becoming trapped in local optima, a differential evolution strategy is introduced, yielding a hybrid gray wolf optimization algorithm based on differential evolution that enhances the original gray wolf optimization. The mutation and selection operators of differential evolution are designed to sustain population diversity, and its crossover and selection operators then carry out a global search, which improves the predictive performance of the model and enables accurate forecasting of the remaining lifetime. Experiments on the NASA lithium-ion battery dataset demonstrate the effectiveness of the proposed RUL prediction method. The experimental results show that the maximum mean absolute error of the hybrid algorithm on the battery dataset is within 1%, which reflects its high prediction accuracy and strong robustness.


Introduction
With the continued growth of new energy vehicles, large-scale energy storage, special-purpose robots and aerospace equipment, the demand for battery management systems that support high capacity, long range, long cycle life and strong robustness is rising [1][2][3]. Lithium-ion batteries, with their high energy density, low self-discharge rate and long cycle life, are the most widely used energy storage devices [4,5]. However, with repeated charging and discharging, a battery degrades or even fails through different mechanisms [6,7]. If effective measures are not taken before the battery reaches its failure threshold, the result may be degraded device performance or even catastrophic events [8][9][10]. Therefore, studying battery degradation and developing reliable degradation models to accurately predict the remaining useful life (RUL) are of great importance.
Researchers have made substantial contributions to tracking lithium-ion battery capacity degradation, that is, to battery RUL prediction [11,12]. At present, there are three main approaches: model-based, data-driven and fusion-based [13,14]. Model-based approaches establish a mathematical model to simulate battery aging; such a model applies only to a specific battery type and operating condition, so its range of application is narrow [15,16]. In contrast, data-driven approaches disregard the internal chemical reactions and failure mechanisms of the battery and reveal the connection between battery degradation data and capacity degradation through statistics and analysis [17][18][19]. These approaches mostly use machine learning methods such as artificial neural networks (ANN), support vector machines (SVM), relevance vector machines (RVM), Gaussian process regression (GPR), Bayesian models and autoencoders to achieve RUL prediction [20][21][22][23]. Long et al. [24] used an ANN within a joint prediction model to establish the nonlinear relationship between RUL and model parameters and to obtain the battery capacity degradation trend, from which the RUL is predicted. Park [25] investigated the long-term dependence of the capacity degradation process of lithium-ion batteries using long short-term memory (LSTM) networks, modeled with experimental data from multiple lithium-ion batteries at two different temperatures; the method is capable of predicting the RUL of lithium-ion batteries independently of offline training data. To improve the precision of RUL forecasts, Lai et al. [26] studied the relationship between the charging curve and the remaining effective capacity of the battery and proposed a rapid sorting and regrouping approach at the module level based on machine learning algorithms.
The fusion approach combines several distinct methods for hybrid prediction, drawing on the strengths of each to achieve superior prediction results [27][28][29][30]. It typically includes a blend of empirical modeling and data-driven approaches [31], as well as weighted fusion of several data-driven approaches [32,33]. Ref. [34] presents a data-driven least squares support vector machine algorithm combined with a particle filtering (PF) algorithm: the least squares support vector machine is trained on the historical measurement data, after which the measurements are forecast and fused with the PF algorithm for RUL prediction. He et al. [35] constructed an adaptive hybrid model that combines an empirical model with an LSTM neural network, using historical capacity, online measurement data, the latest offline state and model parameters to characterize the battery capacity degradation trend more effectively and to achieve short-term online RUL prediction by rolling prediction. Li [36] employed an adapted bird swarm algorithm to optimize least squares and to enhance RUL simulation precision. Nevertheless, the problem of how to choose the optimal parameters of the support vector regression (SVR) method while guaranteeing high RUL prediction accuracy remains to be solved [37][38][39]. Currently, fusion-based methods for RUL prediction of lithium-ion batteries are one of the major research hotspots in the field [40,41], and a fusion approach can fully exploit the advantages of various algorithms to obtain superior predictive performance [42,43].
Lithium-ion battery RUL prediction involves analyzing the available data to estimate the current state of the battery and to predict the degradation trend of its performance. The battery RUL is defined as the number of cycles between the starting point of prediction and the cycle at which the capacity of the lithium-ion battery falls to the failure threshold. In this study, a capacity-based RUL prediction method uses SVR to build a capacity degradation model for lithium-ion batteries, and an intelligent gray wolf optimization (GWO) algorithm with good global search capability and computational robustness is introduced to select the optimal SVR parameters and ensure high RUL prediction accuracy. To improve the search performance of the gray wolf algorithm and to address its slow convergence, low optimization accuracy and vulnerability to local extrema, a differential evolution strategy is added to the algorithm, yielding a hybrid gray wolf optimization (HGWO)-SVR prediction method. The applicability of the proposed model is verified on the battery dataset from the NASA Ames Research Center, and the validity of the prediction model is demonstrated by comparing its results with those of historical-data-based prediction methods.
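The RUL definition above can be illustrated with a minimal sketch. The function name, the 1-based cycle numbering and the example capacities and threshold are hypothetical and chosen only for illustration; they are not values from the dataset.

```python
def remaining_useful_life(capacity, start_cycle, threshold):
    """Cycles between the prediction start and the first cycle at or below the threshold.

    capacity:    per-cycle capacities, capacity[0] belonging to cycle 1 (assumed layout)
    start_cycle: 1-based cycle number at which prediction starts
    threshold:   failure threshold in the same unit as the capacities
    Returns None when the threshold is never reached from start_cycle onward.
    """
    for cycle, value in enumerate(capacity, start=1):
        if cycle >= start_cycle and value <= threshold:
            return cycle - start_cycle
    return None


# Hypothetical capacity fade curve (Ah), not taken from the NASA data.
example_capacity = [2.0, 1.9, 1.8, 1.7, 1.6, 1.5, 1.4, 1.3]
example_rul = remaining_useful_life(example_capacity, start_cycle=3, threshold=1.4)
```

In this example the threshold is first reached at cycle 7, so a prediction that starts at cycle 3 has an RUL of 4 cycles.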

Support vector regression model
The support vector regression (SVR) model is a supervised machine learning tool for function regression. It is well suited to small-sample, nonlinear problems and is widely applied in machine learning. In RUL prediction, the SVR model maps the characteristic data of battery life into a high-dimensional space by a nonlinear transformation and performs a linear regression fit in that space, with the battery capacity as the output. The structure of the SVR model is shown in Fig. 1.
In Fig. 1, SVR generates a "spacing band" around the linear function. Samples that fall inside the band are ignored, and the loss function counts only the values outside the band. The model is then optimized by minimizing the width of the spacing band and the overall loss.
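The behavior of the spacing band can be sketched with the standard ε-insensitive loss, under which a residual inside the band costs nothing and a residual outside it is penalized linearly. The function name and the numeric values are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard epsilon-insensitive loss: zero inside the band, linear outside it."""
    residual = abs(y_true - y_pred)
    return max(0.0, residual - epsilon)


# A prediction 0.05 away from the target lies inside a band of half-width 0.1.
inside_band = epsilon_insensitive_loss(1.0, 1.05, epsilon=0.1)
# A prediction 0.5 away lies outside the band and is penalized by 0.5 - 0.1.
outside_band = epsilon_insensitive_loss(1.0, 1.5, epsilon=0.1)
```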
Assume that n training samples $\{(x_i, y_i)\}_{i=1}^{n}$ are given, where $x_i \in \mathbb{R}^n$ is the input feature vector and $y_i \in \mathbb{R}$ is the corresponding output. The nonlinear mapping is defined as Eq. (1).
$$f(x) = w^{T}\varphi(x) + b \tag{1}$$
in which x, w and b are the input, weight and intercept, respectively, and $\varphi(x)$ is the nonlinear mapping function. In accordance with the principle of structural risk minimization, the above equation takes the form of Eq. (2).
where C denotes the penalty coefficient, which indicates the degree of penalty for errors exceeding ε, and $l_\varepsilon$ denotes the ε-insensitive loss function. The following equation is then available.
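For reference, the structural risk objective and the ε-insensitive loss are usually written as follows in the standard ε-SVR formulation. This is a sketch consistent with the symbols C, ε and $l_\varepsilon$ defined above, with f the regression function of Eq. (1); the argument z is notation assumed here.

```latex
\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^{2}
  + C \sum_{i=1}^{n} l_{\varepsilon}\bigl(f(x_i) - y_i\bigr),
\qquad
l_{\varepsilon}(z) =
\begin{cases}
0, & \lvert z \rvert \le \varepsilon, \\
\lvert z \rvert - \varepsilon, & \text{otherwise.}
\end{cases}
```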
Introducing the kernel function by means of the duality principle and the Lagrangian function, Eq. (4) is transformed into the following expression.
where $\alpha_i^{*}$ and $\alpha_i$ are Lagrangian multipliers. From Mercer's theorem, the nonlinear SVR formulation is obtained, in which $K(x_i, x_j)$ is the kernel function. In the training process of the model, the kernel function is a key factor in determining the performance of the support vector machine. The radial basis kernel function selected in the study is the most widely used kernel function; it offers a high recognition rate and superior performance, and its performance does not degrade when the training set is reduced. The formulation of the radial basis kernel function is given below.
where g is a kernel parameter.
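A minimal sketch of the kernel and of the resulting kernel expansion is given below. It assumes the common form $K(x_i, x_j) = \exp(-g\lVert x_i - x_j\rVert^{2})$ for the radial basis kernel and a prediction of the form $f(x) = \sum_i c_i K(x_i, x) + b$, where each $c_i$ is the difference of the two Lagrangian multipliers of sample i. The function names and the sign convention of $c_i$ are assumptions made for illustration.

```python
import math


def rbf_kernel(x, z, g):
    """Assumed RBF form: K(x, z) = exp(-g * ||x - z||^2), with kernel parameter g > 0."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-g * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, b, g):
    """Kernel expansion f(x) = sum_i c_i * K(x_i, x) + b over the support vectors."""
    return b + sum(c * rbf_kernel(sv, x, g) for sv, c in zip(support_vectors, dual_coefs))
```

A larger g makes the kernel decay faster with distance, so each support vector influences a smaller neighborhood; this is why g and the penalty coefficient C have to be tuned together.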

Gray wolf optimization algorithm
To address the parameter optimization problem of the above SVR model, the Gray Wolf Optimizer (GWO) algorithm is chosen. The algorithm is an efficient search method derived from the hunting behavior of gray wolves. It is simple in principle, has an adaptive convergence factor and an information feedback mechanism, can balance local optimization and global search, and performs well in terms of solution accuracy and convergence rate. The GWO algorithm adapts to different types of optimization problems, and its search strategy and collaborative behavior enable effective search in complex optimization spaces. In addition, the algorithm is robust, and its search strategy can escape local optima in the search for the global optimum. Because the individuals in the GWO algorithm are independent of one another, the algorithm is easily parallelized, so the optimization can be accelerated with multi-core or distributed computing resources. In the SVR model, the GWO algorithm can more effectively optimize the parameter settings, feature selection, model accuracy and other aspects, improving the performance and efficiency of the SVR model and thus raising the modeling accuracy and prediction precision. First, the social hierarchy model of the gray wolves is built and the fitness of each individual in the population is computed, as depicted in Fig. 2.
The three gray wolves with the best fitness in the pack are labeled α, β and δ, and the remaining gray wolves are labeled ω. That is, the social ranking in the gray wolf population is α, β, δ and ω in descending order. The optimization process of GWO is chiefly guided by the three best solutions in each generation of the population, namely α, β and δ. In the GWO algorithm, α, β and δ lead the hunt, and the ω wolves follow these three to track and encircle the prey and finally complete the predation task. The detailed steps of the algorithm are as follows.
The first step is to encircle the prey. Wolves use the following position update equation to encircle the prey during the hunting process. The second step is the hunting behavior. Once the gray wolves have located the prey, the alpha wolf α leads β and δ in the hunt. Among the wolves, α, β and δ are the nearest to the prey, and the positions of these three wolves are used to judge the direction of the prey, which is mathematically described as below.
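In the standard GWO algorithm, the two steps are usually written as follows. The coefficient vectors A and C, the random vectors $r_1$, $r_2$ with components in [0, 1], the control parameter a (commonly decreased from 2 to 0 over the iterations) and the prey position $X_{\mathrm{prey}}$ are notation assumed here from the standard algorithm.

```latex
\begin{align}
% Step 1: encircling the prey
D &= \lvert C \cdot X_{\mathrm{prey}}(t) - X(t) \rvert, &
X(t+1) &= X_{\mathrm{prey}}(t) - A \cdot D, \\
A &= 2a\,r_1 - a, &
C &= 2\,r_2, \\
% Step 2: hunting, guided by the three best wolves
D_\alpha &= \lvert C_1 \cdot X_\alpha - X \rvert, &
X_1 &= X_\alpha - A_1 \cdot D_\alpha, \\
D_\beta &= \lvert C_2 \cdot X_\beta - X \rvert, &
X_2 &= X_\beta - A_2 \cdot D_\beta, \\
D_\delta &= \lvert C_3 \cdot X_\delta - X \rvert, &
X_3 &= X_\delta - A_3 \cdot D_\delta, \\
X_P(t+1) &= \frac{X_1 + X_2 + X_3}{3}.
\end{align}
```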
where $X_\alpha$, $X_\beta$ and $X_\delta$ represent the positions of the α wolf (current optimal solution), the β wolf (second optimal solution) and the δ wolf (third optimal solution); $D_\alpha$, $D_\beta$ and $D_\delta$ represent the distances between the remaining individuals ω and the α, β and δ wolves of the population, respectively; and $X_P(t+1)$ is the position of the gray wolf after the update. The gray wolf pack keeps updating its positions through the iterative process and gradually approaches the prey until the iterations end and the predation is complete.
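The position update of a single ω wolf can be sketched as follows, using the standard GWO rule in which each leader proposes a candidate position and the new position is their average. The function name, the scalar control parameter `a` and the per-dimension random draws are assumptions taken from the standard algorithm, not details stated in the text.

```python
import random


def gwo_update(wolf, alpha, beta, delta, a, rng):
    """Return the updated position of one wolf guided by the alpha, beta and delta wolves."""
    new_position = []
    for j in range(len(wolf)):
        candidates = []
        for leader in (alpha, beta, delta):
            coef_a = 2.0 * a * rng.random() - a        # A, drawn from [-a, a]
            coef_c = 2.0 * rng.random()                # C, drawn from [0, 2]
            distance = abs(coef_c * leader[j] - wolf[j])   # D for this leader
            candidates.append(leader[j] - coef_a * distance)
        new_position.append(sum(candidates) / 3.0)     # average of the three proposals
    return new_position


rng = random.Random(0)
moved = gwo_update([5.0, 5.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0], a=2.0, rng=rng)
```

When `a` reaches 0 every proposal collapses onto its leader, so the wolf moves to the centroid of α, β and δ; larger values of `a` allow steps away from the leaders, which is the exploration phase.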

Hybrid gray wolf optimization algorithm
The differential evolution strategy is a population-based evolutionary algorithm. Improving the GWO algorithm with the differential idea prevents it from falling into local extrema and broadens the search range of the population. The differential strategy consists primarily of mutation, crossover and selection operations, and the original GWO is improved with differential evolution to obtain the HGWO algorithm.
To prevent the diversity of the population from decreasing when the population converges on a particular region, the crossover and selection operations of differential evolution are adopted to preserve diversity. The resulting population then serves as the initial population of the GWO algorithm: the objective function values of the individuals are computed, and the three best individuals $X_\alpha$, $X_\beta$ and $X_\delta$ are picked. The positions of the other gray wolves are updated accordingly, followed by the crossover and selection operations of differential evolution, which update the positions again, and the iterations continue until the optimal objective function value is obtained.
Selecting a suitable mutation vector is the basis for ensuring that the GWO algorithm evolves in the direction of the optimal solution. For this purpose, highly competitive gray wolves serve as the parents of the evolved population, and the differential vector of the β and δ wolves is determined through experimental tests. The mutation vector of the GWO algorithm based on differential evolution is constructed by superimposing the differential vector, weighted by the dynamic scaling factor, on the α wolf, and its expression is given below.
where Z is the dynamically varying scaling factor, which takes the value of Eq. (18).
Here $f_{\min}$ and $f_{\max}$ in Eq. (18) are the minimum and maximum values of the scaling factor, and $t_{\max}$ is the maximum number of iterations. The dynamically varying scaling factor Z addresses the tendency of the basic GWO method to fall into local extrema in the early stage of the search. A large scaling factor amplifies the differential vector and greatly enhances the global exploration ability of the GWO algorithm in the early stage of the search. A decreasing scaling factor strengthens the local exploitation ability in the later stage of the search and improves the search accuracy.
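The mutation step can be sketched as follows. Two assumptions are made loudly here: the mutation vector is read from the description as $V = X_\alpha + Z\,(X_\beta - X_\delta)$, and the schedule of Z is taken to be a linear decrease from $f_{\max}$ to $f_{\min}$, which is only one schedule with the stated behavior and is not necessarily the form of Eq. (18). The function names and numeric values are hypothetical.

```python
def scaling_factor(t, t_max, f_min, f_max):
    """Assumed linear schedule: f_max at t = 0, decreasing to f_min at t = t_max."""
    return f_max - (f_max - f_min) * t / t_max


def mutate(alpha, beta, delta, z):
    """Assumed mutation vector V = X_alpha + Z * (X_beta - X_delta), per dimension."""
    return [xa + z * (xb - xd) for xa, xb, xd in zip(alpha, beta, delta)]


# Early iterations use a large factor (exploration), late iterations a small one.
early_z = scaling_factor(0, 100, f_min=0.2, f_max=0.8)
late_z = scaling_factor(100, 100, f_min=0.2, f_max=0.8)
mutant = mutate([1.0, 1.0], [2.0, 0.0], [0.0, 2.0], z=0.5)
```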
The individuals to be mutated in the wolf population are crossed with the mutation vectors to produce intermediate individuals, and the crossover operation is presented in Eq. (19).
where S is the crossover probability constant, and the random dimension index rand(i) ensures that at least one dimension of each wolf individual comes from the mutation vector. The intermediate individuals $U_i^{t+1}$ generated by the mutation and crossover operations compete with $X_i^{t}$, and the better-adapted individuals are selected as the next generation.
where f denotes the constructed fitness function, and the distance between individuals in the wolf pack is shorter than $f_c$, which takes the value of Eq. (21).
In Eq. (21), k is the number of objective functions, and $f_{i,\max}$ and $f_{i,\min}$ are the maximum and minimum values of the i-th objective function. The better-adapted of $X_i^{t}$ and $U_i^{t+1}$ is picked as the individual of generation t + 1.
Such a hybrid approach improves the global search capability while avoiding the pitfalls of premature stagnation and entrapment in local optima.
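The crossover and selection operations described in this section can be sketched with the standard binomial crossover and greedy selection of differential evolution. The sketch assumes a minimization problem and omits the distance criterion $f_c$; the function names and the sphere test function are hypothetical.

```python
import random


def crossover(target, mutant, s, rng):
    """Binomial crossover: each dimension comes from the mutant with probability s,
    and one randomly chosen dimension always comes from the mutant."""
    forced = rng.randrange(len(target))
    return [
        mutant[j] if (rng.random() < s or j == forced) else target[j]
        for j in range(len(target))
    ]


def select(target, trial, fitness):
    """Greedy selection for minimization: keep the trial only if it is not worse."""
    return trial if fitness(trial) <= fitness(target) else target


def sphere(x):
    """Placeholder objective standing in for the prediction error of the model."""
    return sum(v * v for v in x)


rng = random.Random(1)
trial = crossover([0.0] * 5, [1.0] * 5, s=0.0, rng=rng)
survivor = select([1.0, 1.0], [0.5, 0.5], sphere)
```

With s = 0 exactly one dimension of the trial comes from the mutant, which is the guarantee provided by rand(i).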

HGWO-SVR joint algorithm
When the SVR algorithm with the radial basis kernel function is used for prediction, the values of the penalty parameter C and the radial basis kernel function parameter g must first be defined, and the choice of these two parameters directly affects the prediction accuracy. In this article, a joint approach based on the HGWO-SVR algorithm is implemented. The fundamental idea is to apply the hybrid GWO algorithm to the joint parameter search problem of the SVR model and to optimize the SVR parameters so as to improve the predictive performance of the model. The flow diagram of the proposed method is depicted in Fig. 3. As shown in Fig. 3, the input data set is first determined, divided into a training set and a test set, and normalized. The method parameters are then set: population size, maximum number of iterations, crossover probability, search range and scaling factor range. The SVR parameters are initialized, and the ranges of the penalty parameter and the kernel function parameter are set. Next, the populations are initialized, including the parent, mutant and offspring populations; the fitness of each wolf is computed; and the wolves are ranked according to their fitness values. The positions of the best-adapted gray wolves α, β and δ are retained, and the positions of the remaining gray wolves are updated according to Eqs. (10) to (16). Individuals from the parent generation of the population are chosen for the differential evolution operation, and the best gray wolf individuals are passed to the next generation through mutation, crossover and selection. The HGWO algorithm iteratively updates the positions of the gray wolf population until the maximum number of iterations is reached, and the position of the gray wolf with the best fitness is output. The optimal parameters are used to establish the model, SVR prediction is performed on the test set, and the results are output after inverse normalization.
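The data preparation steps at the two ends of this workflow can be sketched as follows. The sketch assumes min-max normalization with the scaling bounds taken from the training set only and a chronological split; the text does not specify the normalization scheme, and all names and numbers here are hypothetical.

```python
def chronological_split(series, n_train):
    """Use the first n_train cycles for training and the remaining cycles for testing."""
    return series[:n_train], series[n_train:]


def min_max_fit(values):
    """Return the bounds used for scaling (assumed to come from the training set)."""
    return min(values), max(values)


def normalize(values, lo, hi):
    """Map values linearly so that lo becomes 0 and hi becomes 1 (requires hi > lo)."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    """Inverse normalization: map scaled values back to the original unit."""
    return [v * (hi - lo) + lo for v in values]


# Hypothetical capacity series (Ah), not taken from the NASA data.
train, test = chronological_split([2.0, 1.8, 1.6, 1.5], n_train=2)
lo, hi = min_max_fit(train)
scaled_train = normalize(train, lo, hi)
restored_test = denormalize(normalize(test, lo, hi), lo, hi)
```

Test values below the training minimum map to negative scaled values, which is expected for a capacity that keeps fading after the training window.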

Battery data set
The lithium-ion battery dataset used in the experiment is taken from the prognostics data repository of the NASA Ames Research Center. The repository contains many different lithium-ion battery datasets, and the battery datasets B0005 (B5), B0006 (B6), B0007 (B7) and B0018 (B18) are selected for the experiments in this paper. The detailed test conditions of the dataset are listed in Table 1. The battery capacity degradation curves of the dataset are displayed in Fig. 4.

Evaluation indicators
Mean Absolute Error (MAE) is a common regression model evaluation metric used to assess the difference between the predicted and true values. It is calculated by taking the absolute value of the difference between the predicted value and the true value of each sample, and then finding the average of all sample differences. The smaller the value of MAE, the better the predictive ability of the model. Root Mean Square Error (RMSE) is a metric that measures the magnitude of the error between the predicted value and the true value. The smaller the RMSE, the smaller the error between the predicted value and the true value, and the better the fit of the model. Absolute error (AE) is the absolute value of the difference between the predicted value and the true value. In statistics and machine learning, AE is frequently applied to measure the accuracy of a model. The smaller the absolute error, the better the predictive ability of the model.
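The three metrics described above can be computed as in the following minimal sketch. The function names and the sample values are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted capacities."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted capacities."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))


def rul_absolute_error(actual_rul, predicted_rul):
    """Absolute error, in cycles, between the predicted and the actual RUL."""
    return abs(predicted_rul - actual_rul)


# Hypothetical values for illustration only.
example_mae = mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Because RMSE squares the residuals before averaging, a single large error raises it more than it raises MAE, so the two metrics together indicate whether the errors are uniform or concentrated in a few cycles.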
In the experiment, the MAE, RMSE and AE are chosen as the criteria for judging the quantitative outcomes. MAE and RMSE are applied to assess the performance of the estimation method, and AE is employed to assess the performance of the battery RUL prediction. The expressions are as follows.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \tag{22}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^{2}} \tag{23}$$
$$\mathrm{AE} = \lvert PR - R \rvert \tag{24}$$
in which n denotes the number of charging and discharging cycles, $y_i$ is the actual capacity, $\hat{y}_i$ is the predicted capacity, R is the actual RUL, and PR is the predicted RUL.

Performance analysis of algorithm
The change in capacity directly indicates the degradation of the battery during charge/discharge cycles. Hence, capacity is a straightforward health factor for assessing battery degradation when predicting the RUL of lithium-ion batteries. To validate the performance of the proposed method in predicting the RUL of lithium-ion batteries, the first 86 samples of the B5, B6 and B7 batteries are chosen as the training set and the remaining samples as the test set. Owing to the small number of data samples for the B18 battery, the first 67 samples are chosen as the training set and the remaining samples as the test set. The prediction performance of HGWO-SVR is compared with that of the SVR and GWO-SVR algorithms, as shown in Fig. 5. Figure 5 shows the prediction results for B5, B6, B7 and B18 using the three methods, with 50% of the total cycle count data as training samples. The maximum errors between the actual capacities and the capacities predicted by the joint HGWO-SVR algorithm in Fig. 5b, d, f and h are 1.20%, 0.82%, 1.27% and 0.75%, respectively, which are lower than the prediction errors of the other two algorithms. Table 2 summarizes the detailed evaluation metrics of the prediction results.
As can be seen from Table 2, the actual RUL of the B5 battery is 24 cycles. The prediction errors of SVR and GWO-SVR are 13 and 5 cycles, respectively, while that of the HGWO-SVR algorithm is 2 cycles, a reduction of 11 and 3 cycles compared with the former two. The MAE of the joint HGWO-SVR algorithm is 0.35%, which is 83.4% and 45.3% lower than the MAE of the SVR and GWO-SVR methods, respectively.
The actual RUL of the B6 battery is 13 cycles, and the HGWO-SVR algorithm predicts exactly this value, giving an AE of 0. Moreover, the RMSE of the HGWO-SVR algorithm is 0.26%, which is 87.2% and 86.9% lower than the RMSE of SVR and GWO-SVR, respectively.
The actual RUL of the B7 battery is 60 cycles. The prediction errors of SVR and GWO-SVR are 11 and 8 cycles, respectively, while that of the HGWO-SVR algorithm is 1 cycle, a reduction of 10 and 7 cycles compared with the former two. The MAE of the joint HGWO-SVR algorithm is 0.32%, which is 80% and 44.8% lower than the MAE of SVR and GWO-SVR, respectively.
The actual RUL of the B18 battery is 15 cycles, and the prediction errors of SVR and GWO-SVR are 11 and 6 cycles, respectively. Compared with these two, the prediction error of HGWO-SVR is reduced by 9 and 4 cycles. The MAE of the joint HGWO-SVR algorithm is 0.33%, which is 62.1% and 17.5% lower than the MAE of SVR and GWO-SVR, respectively. As shown above, the joint HGWO-SVR algorithm outperforms the other two algorithms on all four battery datasets, which effectively improves the estimation accuracy and provides good stability and robustness.
To further validate the performance of the proposed method, Fig. 6 shows the prediction results obtained from different starting points. The prediction starting points are set to 40%, 50%, 60% and 70% of the total number of cycles, respectively. Correspondingly, the training set is resized to 40%, 50%, 60% and 70% of the total number of cycles, respectively.
There is no significant difference in the prediction results of the proposed method under different prediction starting points, and the detailed prediction result indicators are shown in Table 3.
The "-" in the table indicates cases in which the battery life cannot be predicted. According to the results in Table 3, the prediction starting point has little impact on the prediction accuracy of the proposed method. For the B5, B6 and B18 batteries, the prediction error is largest when the starting point is 70% of the total number of cycles, with RMSE values of 0.81%, 0.76% and 0.54%, respectively; for B7, the prediction error is largest at the 60% starting point, with an RMSE of 0.67%. None of these values exceeds 1%. Therefore, it can be concluded that the model established by combining hybrid gray wolf optimization and SVR is stable and has high prediction accuracy.

Conclusion
In this study, a novel SVR model framework for lithium-ion battery RUL prediction based on hybrid gray wolf optimization is implemented, and a capacity degradation model for lithium-ion batteries is built on SVR. To address the difficulty of selecting the SVR parameters, the GWO algorithm, with its strong global search capability, is employed to search for and optimize the parameters of SVR. Applying the idea of differential evolution to the gray wolf optimization algorithm yields a hybrid optimization algorithm that keeps GWO from falling into local extrema and expands its population search range. Experimental results show that the HGWO-SVR method improves the accuracy of remaining life prediction for lithium-ion batteries. The effectiveness of the method is demonstrated by first training the joint model on 50% of the total cycles of the battery dataset and comparing it with the SVR and GWO-SVR algorithms; the proposed method reaches a minimum RMSE of 0.0026, which is better than that of the other two algorithms. For different batteries and prediction starting points, the RMSE of the proposed model stays within 1%. The results show that the HGWO-SVR method effectively improves estimation accuracy and stability and achieves precise and rapid RUL prediction. The method helps fill the research gap in combining intelligent algorithms with data-driven models for high-accuracy RUL prediction and provides an option for high-quality RUL prediction applications.