Software Defect Prediction Using an Ensemble Learning Algorithm Based on an Improved Sparrow Search Algorithm to Optimize the Extreme Learning Machine

In traditional ensemble learning algorithms for software defect prediction, the base predictor has many parameters that are difficult to optimize, so the model cannot reach its optimal performance. This paper proposes an ensemble learning algorithm for software defect prediction that uses an improved sparrow search algorithm to optimize the extreme learning machine; the work is divided into three parts. Firstly, the improved sparrow search algorithm (ISSA) is proposed to improve optimization ability and convergence speed, and its performance is tested on eight benchmark test functions. Secondly, ISSA is used to optimize the extreme learning machine (ISSA-ELM) to improve prediction ability. Finally, an optimized ensemble learning algorithm (ISSA-ELM-Bagging) is built on the Bagging algorithm to improve the prediction performance of ELM on software defect datasets. Experiments are carried out on six groups of software defect datasets. The experimental results show that the ISSA-ELM-Bagging ensemble learning algorithm is significantly better than the four comparison algorithms under the six evaluation indexes of Precision, Recall, F-measure, MCC, Accuracy and G-mean, and has better stability and generalization ability.


Introduction
Various application software provides great convenience in people's lives, and software quality has become a focus of attention. Most researchers believe that software defects are the main problem affecting software quality and carry the risk of leaking user information, resulting in serious information security problems [1,2].
Software defect prediction is an important method to measure software quality. Economic losses and information leakage risks fall when defective software modules are predicted accurately, and predicting the potential defects of software in time improves the work efficiency of testers [3]. Existing software defect prediction can be divided into two categories: one predicts whether a module in the software has defects, and the other predicts the trend and distribution of defects in the software development process [4]. An efficient software defect prediction algorithm can not only guarantee the quality of software, but also reduce the risk of information leakage [6,7]. Therefore, the key issue of this paper is to propose an efficient software defect prediction algorithm to improve software quality and reduce the costs of software development and maintenance.
Aiming at the problems of difficult parameter selection, low prediction accuracy and poor stability of the base predictor in existing software defect prediction ensemble learning algorithms, this paper proposes a software defect prediction algorithm based on ensemble learning that optimizes the extreme learning machine. The main contributions of this paper include the following aspects: (1) An improved sparrow search algorithm is proposed, which enhances the sparrow search algorithm with pinhole imaging reverse learning and the flip bucket foraging strategy.
(2) The improved sparrow search algorithm improves the prediction accuracy and stability of the extreme learning machine by optimizing the parameters of the extreme learning machine.
(3) Within the Bagging ensemble learning framework, an optimized ensemble learning algorithm (ISSA-ELM-Bagging) is proposed to improve the generalization ability of the extreme learning machine on software defect prediction datasets. The rest of this paper is organized as follows. Section 2 introduces the related work of software defect prediction. Section 3 presents the improved sparrow search algorithm, the ISSA-optimized extreme learning machine, and the proposed ISSA-ELM-Bagging ensemble learning algorithm. Section 4 introduces the experimental setup in detail and analyzes the experimental results. Finally, conclusions and future work are given in Section 5.

Related work
With the expansion of software scale and increasing complexity, software quality has become a focus of attention. As an important way to guarantee software quality, software defect prediction has applied many methods from data mining and machine learning [8], including, for example, fuzzy logic [9], artificial neural networks [10], semi-supervised learning [4], multiple kernel learning [11], extreme learning machines [12], naive Bayes [13], support vector machines [14], and so on. In software defect prediction, ensemble learning algorithms such as Bagging [15] and Boosting [16] have been widely used to improve prediction performance. Reference [17] improved the performance of a software defect prediction algorithm using coding-based ensemble learning. Reference [18] verified that the Bagging algorithm has better overall performance in ensemble learning.
With the increase of software scale and testing cost, researchers have proposed many software defect prediction algorithms. As a machine learning model, the extreme learning machine (ELM) has efficient prediction ability, but its performance is easily affected by the input weights and hidden layer neuron thresholds. To solve this problem, researchers have used swarm intelligence optimization algorithms to optimize the ELM parameters. Liu et al. [19] used the cuckoo search algorithm to optimize extreme learning machine parameters (ICS-ELM) to improve the accuracy of junction temperature prediction for insulated-gate bipolar transistors (IGBT). Ding et al. [20] used the PSO algorithm to optimize the extreme learning machine and improve the accuracy of clinical cancer diagnosis. Li et al. [21] used the whale optimization algorithm to optimize the extreme learning machine, obtaining higher accuracy and generalization ability. In recent years, ensemble learning has been widely used in the field of software defect prediction. Ensemble learning improves the generalization ability of an algorithm by combining multiple weak learners into a strong learner. Related ensemble learning algorithms have also been proposed in the fields of industrial chloride prediction [22], high-dimensional data classification [23], and disease prediction [24]. Reference [22] proposed a new ensemble learning model by combining artificial neural networks and stepwise clustering analysis, but did not optimize the model parameters. Reference [23] proved that multi-view ensemble learning (MEL) was superior to classical ensemble methods; when MEL used the particle swarm algorithm for multi-objective optimization it could quickly search for the optimal solution, but the benefit of the particle swarm algorithm was not obvious. Reference [24] used four prediction algorithms to predict gas emissions and established an ensemble deep learning model for new types of exhaust gas emissions. The problem of parameter optimization of the base predictor is not well solved by the above ensemble learning methods, which leads to obvious performance fluctuation and poor stability of the models. We use an improved sparrow search algorithm (ISSA) to optimize the extreme learning machine (ELM) as the base predictor of an ensemble learning algorithm, and build an ensemble learning algorithm with better prediction accuracy and generalization ability. The experimental results show that the optimized base predictor performs better than other prediction algorithms on the software defect prediction datasets, with more stable prediction performance and stronger generalization ability.

Sparrow Search Algorithm
The sparrow search algorithm is a swarm intelligence optimization algorithm proposed by Xue et al. [25] based on a series of behaviors of sparrow groups in the process of foraging. Compared with traditional swarm intelligence algorithms such as particle swarm optimization (PSO) [26], the grey wolf optimizer (GWO) [27] and the whale optimization algorithm (WOA) [28], the sparrow search algorithm has stronger optimization ability and faster convergence speed. It can quickly converge near the optimal value and has good global optimization ability and stability. The sparrow population consists of three parts: the producers, who are responsible for foraging and provide direction for the group; the scroungers, who follow the producers; and the reporters, who warn of predators. The producers occupy the best positions in the sparrow population and are closest to the food; the scroungers update their positions according to the producers to improve their probability of obtaining food; when the reporters detect a predator's attack, they alert the sparrow community, and the sparrows approach each other to reduce the probability of being preyed upon and act against predators [29].

The mathematical model of producers
The location of the producers is updated as follows [29]:

$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{\max}}\right), & R_2 < ST \\[2mm] X_{i,j}^{t} + Q \cdot L, & R_2 \geq ST \end{cases} \quad (1)$$

where $X_{i,j}^{t}$ represents the position of the $i$th sparrow in the $j$th dimension at the $t$th iteration, $iter_{\max}$ represents the maximum number of iterations, and $\alpha$ is a random number in the range (0,1). $R_2$ is the warning value and $ST$ the safety threshold; if $R_2 \geq ST$, the sparrow population is already in danger and needs to take action. $Q$ is a random number that obeys the standard normal distribution. $L$ represents a $1 \times d$ matrix in which each element is 1.
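To make the update concrete, the following is a minimal NumPy sketch of formula (1). It is one possible reading of the published SSA producer update, not the authors' implementation; the population layout (an n-by-d array) and the default safety threshold are assumptions.

```python
import numpy as np

def update_producers(X, producer_idx, iter_max, ST=0.8):
    """Producer update of formula (1); X is the (n, d) population array."""
    d = X.shape[1]
    R2 = np.random.rand()                     # warning value in [0, 1]
    for i in producer_idx:
        if R2 < ST:                           # safe: keep searching widely
            alpha = np.random.rand() + 1e-12  # random in (0, 1), avoids /0
            X[i] = X[i] * np.exp(-(i + 1) / (alpha * iter_max))
        else:                                 # danger: fly to a safe area
            Q = np.random.randn()             # standard normal scalar
            X[i] = X[i] + Q * np.ones(d)      # L is a 1 x d matrix of ones
    return X
```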

The mathematical model of the scroungers
The location of the scroungers is updated as follows [29]:

$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\[2mm] X_{P}^{t+1} + \left|X_{i,j}^{t} - X_{P}^{t+1}\right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases} \quad (2)$$

where $X_{P}^{t+1}$ is the location of the best producer at the $(t+1)$th iteration, $X_{worst}^{t}$ is the current global worst location, and $A$ is a $1 \times d$ matrix whose elements are randomly assigned 1 or $-1$, with $A^{+} = A^{T}(AA^{T})^{-1}$. When $i > n/2$, the $i$th scrounger has a poor fitness value and is most likely starving, so it flies elsewhere to forage.
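A matching sketch of formula (2) follows; passing the best producer location X_p and the global worst location explicitly, and ranking scroungers by fitness, are illustrative assumptions.

```python
import numpy as np

def update_scroungers(X, scrounger_idx, X_p, X_worst):
    """Scrounger update of formula (2); X_p is the best producer location
    and X_worst the current global worst location."""
    d = X.shape[1]
    for rank, i in enumerate(scrounger_idx, start=1):
        if rank > len(scrounger_idx) / 2:     # starving: fly elsewhere
            Q = np.random.randn()
            X[i] = Q * np.exp((X_worst - X[i]) / rank**2)
        else:                                 # forage around the best producer
            A = np.random.choice([-1.0, 1.0], size=d)
            A_plus = A / d                    # A^+ = A^T (A A^T)^{-1} for 1 x d A
            step = np.abs(X[i] - X_p) @ A_plus
            X[i] = X_p + step * np.ones(d)    # scalar step spread by L
    return X
```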

The mathematical model of reporters
The location of the reporters is updated as follows [29]:

$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left|X_{i,j}^{t} - X_{best}^{t}\right|, & f_i > f_g \\[2mm] X_{i,j}^{t} + K \cdot \dfrac{\left|X_{i,j}^{t} - X_{worst}^{t}\right|}{(f_i - f_w) + \varepsilon}, & f_i = f_g \end{cases} \quad (3)$$

where $X_{best}$ is currently the global optimal location, and $\beta$ is a step length control parameter, a random number that obeys the standard normal distribution. $K$ is a random number from $-1$ to $1$ that represents the direction of the sparrow's movement and also controls the step length. $f_i$ is the fitness value of the current individual sparrow, $f_g$ is the current global optimal fitness value, and $f_w$ is the current global worst fitness value. $\varepsilon$ is an infinitesimal constant that avoids a zero denominator.
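For completeness, a sketch of formula (3); how the reporter subset is chosen each iteration is left to the caller, as the paper only states that reporters are selected randomly.

```python
import numpy as np

def update_reporters(X, fitness, reporter_idx, X_best, X_worst, f_g, f_w,
                     eps=1e-50):
    """Reporter update of formula (3), for a minimization problem."""
    for i in reporter_idx:
        if fitness[i] > f_g:                  # worse than the best: move toward it
            beta = np.random.randn()          # standard normal step control
            X[i] = X_best + beta * np.abs(X[i] - X_best)
        else:                                 # already the best: move cautiously
            K = np.random.uniform(-1.0, 1.0)  # direction and step control
            X[i] = X[i] + K * np.abs(X[i] - X_worst) / ((fitness[i] - f_w) + eps)
    return X
```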

Pinhole imaging reverse learning
Aiming at the problem that the sparrow search algorithm easily falls into local extrema in later iterations, Tian et al. [30] applied reverse learning to the sparrow search algorithm, effectively expanding the search range of the algorithm; however, the candidate solution obtained by traditional reverse learning is far from the current optimal solution. To address this, this paper adopts pinhole imaging reverse learning, which combines the principle of pinhole imaging with reverse learning so that the current solution obtains a candidate solution through the pinhole imaging principle. This effectively improves the quality of the candidate solution, further prevents the algorithm from falling into a local optimal solution, and improves its generalization ability. The principle of pinhole imaging reverse learning is as follows: let the projection of the global optimal position on the $j$th coordinate axis be $X_{b,j}$, with search range $[a_j, b_j]$, and let the candidate solution corresponding to $X_{b,j}$ be $X_{b,j}^{*}$. According to the principle of pinhole imaging [30]:

$$\frac{(a_j + b_j)/2 - X_{b,j}}{X_{b,j}^{*} - (a_j + b_j)/2} = \frac{h}{h^{*}} \quad (4)$$

Setting the scaling factor $n = h/h^{*}$ (the ratio of the heights of the object and its image) and substituting it into formula (4) gives

$$\frac{(a_j + b_j)/2 - X_{b,j}}{X_{b,j}^{*} - (a_j + b_j)/2} = n \quad (5)$$

and solving formula (5) for the candidate solution yields

$$X_{b,j}^{*} = \frac{a_j + b_j}{2} + \frac{a_j + b_j}{2n} - \frac{X_{b,j}}{n} \quad (6)$$

When $n = 1$, formula (6) reduces to

$$X_{b,j}^{*} = a_j + b_j - X_{b,j} \quad (7)$$

which is traditional reverse learning. By adjusting $n$, that is, the distance between the pinhole screen and the receiving screen, a better candidate solution can be obtained, which effectively improves the quality of the candidate solution and greatly reduces the chance of falling into a local optimal solution in late iterations. Pinhole imaging reverse learning yields a more dynamic candidate solution, which enhances the ability of the sparrow search algorithm to jump out of local optimal solutions and find the optimal solution more efficiently.
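Formulas (6) and (7) translate directly into code; in the sketch below, the per-dimension bounds a and b and the default scaling factor n = 2 are illustrative assumptions.

```python
import numpy as np

def pinhole_reverse(X_b, a, b, n=2.0):
    """Candidate solution of formula (6) for the global best position X_b;
    a and b are the lower and upper bounds of each dimension."""
    return (a + b) / 2.0 + (a + b) / (2.0 * n) - X_b / n

# With n = 1 this collapses to formula (7), the classical opposite
# point a + b - X_b of traditional reverse learning.
```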

Flip bucket foraging strategy
The scroungers find food by following the producers and update their locations through the producers, so their search has a certain blindness. This paper introduces the flip bucket foraging strategy and applies it to the location update of the scroungers, enabling them to update their locations more effectively.
Following reference [31], the flip bucket (somersault) foraging update is

$$X_{i,j}^{t+1} = X_{i,j}^{t} + S \cdot \left(r_2 \cdot X_{best,j}^{t} - r_3 \cdot X_{i,j}^{t}\right)$$

where $S$ represents the flip factor that determines the position of the sparrow after the flip bucket foraging step, and $r_2$ and $r_3$ are random numbers in $[0,1]$. Introducing the flip bucket foraging strategy into the location update of the scroungers effectively improves their search ability in high-dimensional space. The flip bucket foraging strategy is mainly carried out around the optimal solution of each iteration, which also increases the search efficiency of the scroungers. In the early stage of the iteration, the distance between sparrows is relatively large, and the flip bucket foraging strategy enlarges the search range. In the late stage of the iteration, the distance between each solution and the optimal solution is small; carried out in such a relatively small range, the strategy can not only quickly jump out of a local optimal solution but also reduce the corresponding time cost.
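A sketch of the flip bucket update is given below, assuming the somersault-style form of reference [31] with the commonly used flip factor S = 2; drawing the random coefficients per dimension is also an assumption.

```python
import numpy as np

def flip_foraging(X, X_best, S=2.0):
    """Flip bucket candidates: each sparrow flips around the current best."""
    r2 = np.random.rand(*X.shape)   # random coefficients in [0, 1]
    r3 = np.random.rand(*X.shape)
    return X + S * (r2 * X_best - r3 * X)
```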

ISSA algorithm design
The initial number of sparrows in the traditional sparrow search algorithm is small. After a certain number of iterations, the sparrows in the group gradually approach some optimal solution. If the positions of the previous generation of sparrows are not ideal, continuing to update individual positions will affect the final result, making the algorithm prone to falling into a local optimal solution. To solve these problems, this paper introduces pinhole imaging reverse learning. After each iteration, the candidate solutions of pinhole imaging reverse learning are obtained from the producers. The fitness values of the producers are then compared with those of the candidate solutions, and the sparrows with the better fitness values are taken as the producers for the current iteration, which effectively improves the local search ability, the efficiency of the algorithm, and the diversity of the population. In the traditional sparrow search algorithm, the scroungers update their locations according to the producers, which involves a certain blindness: in early iterations the search range of the scroungers is not wide enough, and high quality solutions are easily lost. Therefore, the flip bucket foraging strategy is introduced into the location update of the scroungers. The flip bucket foraging strategy is a location change around the optimal solution obtained at the current iteration. In early iterations, the sparrows are relatively far from the optimal solution, so the strategy increases the search range of the scroungers and yields more high-quality solutions. In late iterations, the sparrow individuals are close to the optimal solution, and the probability that a candidate solution obtained by the flip bucket foraging strategy replaces the current optimal solution gradually increases; the strategy therefore effectively improves the ability of the algorithm to jump out of local optimal solutions. The ISSA algorithm effectively reduces the chance that the sparrow search algorithm falls into a local optimal solution, and finds the optimal sparrow location and optimal fitness value more quickly and efficiently. Finally, the optimal parameters of the extreme learning machine are obtained from the optimal sparrow location. A compact sketch of one ISSA iteration, combining the updates above, is given below.
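The sketch wires together the illustrative helper functions from the previous subsections (not a published API); the 20 percent producer ratio and greedy acceptance rules are assumptions, and the reporter update is omitted for brevity.

```python
import numpy as np

def issa_step(X, fit, lb, ub, iter_max):
    """One ISSA iteration over population X for a fitness function `fit`
    (minimization); lb and ub are the per-dimension search bounds."""
    f = np.apply_along_axis(fit, 1, X)
    order = np.argsort(f)                       # ascending: best first
    prod = order[: max(1, len(X) // 5)]         # best 20% act as producers
    X = update_producers(X, prod, iter_max)

    cand = pinhole_reverse(X[prod], lb, ub)     # formula (6) candidates
    better = np.apply_along_axis(fit, 1, cand) < np.apply_along_axis(fit, 1, X[prod])
    X[prod[better]] = cand[better]              # keep the fitter producers

    f = np.apply_along_axis(fit, 1, X)
    scr = np.argsort(f)[max(1, len(X) // 5):]   # the rest are scroungers
    X = update_scroungers(X, scr, X[np.argmin(f)], X[np.argmax(f)])

    flip = flip_foraging(X, X[np.argmin(f)])    # flip bucket candidates
    keep = np.apply_along_axis(fit, 1, flip) < np.apply_along_axis(fit, 1, X)
    X[keep] = flip[keep]                        # greedy acceptance
    return np.clip(X, lb, ub)                   # respect the search bounds
```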

ISSA Optimized Extreme Learning Machine
For a dataset $\{(x_i, t_i) \mid x_i \in \mathbb{R}^{d}, t_i \in \mathbb{R}^{m}, i = 1, \ldots, N\}$, the extreme learning machine model with $n$ hidden layer nodes and sigmoid excitation function $g(\cdot)$ can be expressed as [32]:

$$\sum_{i=1}^{n} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \qquad j = 1, \ldots, N \quad (8)$$

where $\beta_i$ is the output weight between the $i$th hidden layer node and the output neurons, $w_i$ is the input weight between the input neurons and the $i$th hidden layer node, and $b_i$ is the offset of the $i$th hidden layer node. The $N$ equations can be written compactly as $H\beta = T$, where $H$ is called the hidden layer output matrix. The output weights can be obtained from the least squares solution of the linear formula (8) [33]:

$$\hat{\beta} = H^{\dagger} T \quad (9)$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$ [34]. In this paper, ISSA is used to optimize the ELM parameters, and the maximum absolute error between the expected outputs and the actual outputs of the samples in the ELM is selected as the fitness function of the ISSA algorithm. The specific steps of the ISSA algorithm for selecting the optimal input weights and hidden layer neuron thresholds are as follows:

Step 1: Initialize the sparrow population and related parameters.
Step 2: Calculate the fitness of each sparrow; select the current optimal location with its corresponding optimal fitness value, and the current worst location with its corresponding worst fitness value.
Step 3: Select the sparrows with better fitness values as producers, and update the locations of the producers according to formula (1).
Step 4: Obtain the candidate solutions of the producers by pinhole imaging reverse learning according to formula (6).
Step 5: Compare the fitness values of the producers and the candidate solutions, and keep the sparrows with the better fitness values as producers.
Step 6: Take the remaining sparrows (other than the producers) as scroungers, and update the locations of the scroungers according to formula (2).
Step 7: Obtain candidate solutions for the scroungers by the flip bucket foraging strategy.
Step 8: Compare the fitness values of the candidate solutions and the scroungers, and keep those with the better fitness values as scroungers.
Step 9: Randomly select sparrows among the scroungers and producers as reporters, and update the reporters' locations according to formula (3).
Step 10: Judge whether the algorithm has reached the maximum number of iterations: if so, end the loop; if not, return to Step 3.
Step 11: Output the global optimal location and optimal fitness value, and obtain the optimal input weights and hidden layer neuron thresholds.

The ISSA-ELM flowchart is as follows:
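Complementing the flowchart, the following sketch trains an ELM by formula (9) and scores one sparrow position by the maximum absolute prediction error described above; encoding the position as a flattened vector of input weights followed by hidden biases is an assumption made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, W, b):
    """W: (n_hidden, n_features) input weights, b: (n_hidden,) thresholds,
    both supplied by ISSA; returns the output weights beta = H^+ T."""
    H = sigmoid(X @ W.T + b)          # hidden layer output matrix H
    return np.linalg.pinv(H) @ T      # formula (9), least squares solution

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W.T + b) @ beta

def issa_fitness(pos, X, T, n_hidden):
    """Fitness of one sparrow: decode (W, b), train, score by max |error|."""
    n_feat = X.shape[1]
    W = pos[: n_hidden * n_feat].reshape(n_hidden, n_feat)
    b = pos[n_hidden * n_feat:]
    beta = elm_train(X, T, W, b)
    return np.max(np.abs(elm_predict(X, W, b, beta) - T))
```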

Ensemble learning algorithm Bagging
According to reference [35], the superiority of the Bagging algorithm in the field of software defect prediction has been verified, so this paper adopts the Bagging ensemble learning algorithm. The Bagging algorithm uses a prediction algorithm with low accuracy as the base predictor, processes K base predictors in parallel, and finally uses voting to obtain the prediction result [36]. In this paper, the model first uses ISSA to optimize the parameters of the ELM as the base predictor of a homogeneous ensemble, and then uses the Bagging ensemble learning algorithm to improve the prediction performance of the ISSA-ELM base predictor. The Bagging ensemble learning algorithm is shown in the figure:
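In code form, the ensemble amounts to bootstrap sampling plus majority voting; `train_issa_elm` below stands for the ISSA-ELM training procedure sketched earlier (a hypothetical wrapper, not a published function), and binary labels in {0, 1} are assumed.

```python
import numpy as np

def bagging_fit(X, y, K, train_issa_elm):
    """Train K ISSA-ELM base predictors on bootstrap samples."""
    N = len(X)
    models = []
    for _ in range(K):
        idx = np.random.randint(0, N, size=N)   # sample with replacement
        models.append(train_issa_elm(X[idx], y[idx]))
    return models

def bagging_predict(models, X, elm_predict):
    """Majority vote over the K base predictors (labels in {0, 1})."""
    votes = np.array([np.round(elm_predict(X, *m)).ravel() for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```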

Experimental results and analysis
The experiments are implemented with MATLAB R2014b running on a PC with an Intel(R) Core(TM) i7-8565U 1.80 GHz CPU and 16 GB RAM.

Experimental introduction
This paper compares the sparrow search algorithm (SSA), whale optimization algorithm (WOA), grey wolf optimizer (GWO) and particle swarm optimization (PSO) with the improved sparrow search algorithm (ISSA). To ensure the fairness of the experiments, the population size of all algorithms is set to 30 and the maximum number of iterations to 500 [37]. Experimental results are reported to four decimal places, and the other parameters are shown in Table 1 [37]. For the software defect datasets, the SSA and ISSA algorithms use 30 sparrows with a maximum of 100 iterations [38]. The number of neurons in the hidden layer is 100, and the activation function is sigmoid [33]. The number of base predictors equals the dataset dimension. In this paper, 25 datasets from the six groups NASA, SOFTLAB, AEEEM, AUDI, MORPH and ReLink were selected for the simulation experiments; 80 percent of each dataset is used for training and 20 percent for testing. Because the software defect datasets have high dimensionality, PCA is used to reduce the dimension. The basic information of the datasets and the number of principal components after dimensionality reduction are shown in Table 2.
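As a sketch of this preprocessing pipeline (PCA reduction followed by an 80/20 split), assuming scikit-learn is available; the fixed random seed is illustrative.

```python
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

def preprocess(X, y, n_components):
    """Reduce dimensionality with PCA, then split 80/20 into train/test."""
    X_red = PCA(n_components=n_components).fit_transform(X)
    return train_test_split(X_red, y, test_size=0.2, random_state=0)
```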

ISSA algorithm performance tests
The experiment selects eight different benchmark functions to compare the five optimization algorithms; the benchmark test function information is shown in Table 3. To highlight the performance and stability of the proposed algorithm, the five optimization algorithms are each run independently 30 times on the eight benchmark test functions. The optimal values for each function are shown in bold, and the specific results are shown in Table 4.
It can be seen from Table 4 that the ISSA algorithm is significantly better than the other four comparison algorithms on the eight test functions. For F1 and F2, both the ISSA and SSA algorithms can find the theoretical optimal value, but the average value of the ISSA algorithm is smaller, which shows that its optimization effect is superior. For F3, F4 and F8, the optimal value found by the ISSA algorithm is better than those of the other four algorithms by several orders of magnitude, or even more than ten orders of magnitude, and its average and standard deviation are smaller than those of the other four algorithms, indicating that the stability of the ISSA algorithm is better. For F5, the optimization results of the five algorithms differ little, but the ISSA algorithm leads the other four and its average and standard deviation are the smallest. For F6, the average value of the ISSA algorithm is the smallest, indicating a faster convergence rate; however, on this function the standard deviation of the GWO algorithm is the smallest, indicating that the stability of the GWO algorithm is better than that of the ISSA algorithm on F6. For F7, the optimal results of the ISSA and SSA algorithms are the same, with identical average values and standard deviations, indicating that the two algorithms have the same effect on this function. On the seven groups of test functions other than F6, the average and standard deviation of the ISSA algorithm are the minimum values, indicating that the ISSA algorithm has better stability, stronger robustness, and stronger global optimization ability than the other four algorithms.
Fig. 6 plots the convergence curves of the eight groups of test functions, with the abscissa and ordinate representing the number of iterations and the fitness function value, respectively. It can be seen from Fig. 6 that among the convergence curves of the eight test functions, the ISSA curve lies at the bottom, indicating that the ISSA algorithm has faster convergence speed and higher solution accuracy than the other four algorithms. In the convergence curve of F7, the ISSA and SSA curves almost coincide after 50 iterations, indicating that the convergence accuracy of the two algorithms is the same on this test function, but the ISSA algorithm converges faster in the early iterations. Taken together, the convergence curves of the ISSA algorithm on the eight test functions all finally stabilize, indicating that the algorithm has the fastest convergence speed and the highest convergence accuracy.
To reflect the performance of the ISSA algorithm more comprehensively, the Wilcoxon rank sum test is used to verify whether the ISSA algorithm differs significantly from the other algorithms at the p = 0.05 level. As shown in Table 5, p < 0.05 indicates that the difference between two algorithms is significant, while p > 0.05 indicates that the difference is small, that is, the algorithms perform similarly. N/A indicates that the two sets of results are identical, so no comparison can be made.
It can be seen from Table 5 that the p values for test functions F3, F4, F5, F6 and F8 are all less than 0.05, indicating that the optimization ability of the ISSA algorithm on these five groups of test functions differs significantly from that of the other algorithms. For the test functions F1, F2 and F7, the optimization effect of the ISSA algorithm is not significantly different from that of the other four algorithms, that is, their optimization effects are similar.
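This significance test can be reproduced with SciPy's rank-sum implementation; the two arrays below are assumed to hold the 30 independent run results of two algorithms on one test function.

```python
from scipy.stats import ranksums

def significant_difference(runs_issa, runs_other, alpha=0.05):
    """Wilcoxon rank sum test; True means the difference is significant."""
    _, p = ranksums(runs_issa, runs_other)
    return p, p < alpha
```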

Software defect prediction contrast experiment
To verify the superiority of the ISSA-ELM-Bagging algorithm, four prediction algorithms, ELM, SSA-ELM, ISSA-ELM and SSA-ELM-Bagging, are used for comparison. To ensure the accuracy of the experiments, the ISSA-ELM-Bagging algorithm uses the averages of the six indicators Precision (P), Recall (R), F-measure, MCC, Accuracy and G-mean as the evaluation indexes to verify the prediction performance of the algorithm; the higher these six evaluation indexes are, the better the software defect prediction effect is. The optimal results for each evaluation index are shown in bold. The evaluation indexes are calculated as follows [39]:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{F\text{-}measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}, \qquad \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}, \qquad \mathrm{G\text{-}mean} = \sqrt{\mathrm{TPR} \times \mathrm{TNR}}$$

where TP is the number of minority class (defective) samples classified correctly; FP is the number of majority class (non-defective) samples misclassified as defective; FN is the number of minority class samples misclassified as non-defective; and TN is the number of majority class samples classified correctly [40].

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{TNR} = \frac{TN}{FP + TN}$$

where TPR is the proportion of correctly predicted samples among the actual minority class samples, and TNR is the proportion of correctly predicted samples among the actual majority class samples [41].
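The six evaluation indexes follow directly from TP, FP, FN and TN; a minimal sketch, assuming a binary confusion matrix with the defective class as the minority (positive) class:

```python
import math

def defect_metrics(TP, FP, FN, TN):
    """Compute the six evaluation indexes from a binary confusion matrix."""
    precision = TP / (TP + FP)
    recall = TP / (TP + FN)                 # equals TPR
    tnr = TN / (FP + TN)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (TP * TN - FP * FN) / math.sqrt(
        (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
    accuracy = (TP + TN) / (TP + FP + FN + TN)
    g_mean = math.sqrt(recall * tnr)        # G-mean = sqrt(TPR * TNR)
    return dict(Precision=precision, Recall=recall, F_measure=f_measure,
                MCC=mcc, Accuracy=accuracy, G_mean=g_mean)
```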
By analyzing Table 6 to Table 11, the ISSA-ELM-Bagging algorithm is compared with ELM, SSA-ELM, ISSA-ELM and SSA-ELM-Bagging on the twenty-five datasets. Among the six evaluation indexes of Precision, Recall, F-measure, MCC, Accuracy and G-mean, the ISSA-ELM-Bagging algorithm obtains the optimal value on most datasets, and its mean value is the maximum for all six evaluation indexes, which fully shows the efficiency of the ISSA-ELM-Bagging algorithm. It can be concluded from Table 6 that for the PDE and poi-1.5 datasets, the Precision value of the SSA-ELM-Bagging prediction algorithm is the best, and on the arc dataset the Precision value of the ISSA-ELM prediction algorithm is the highest; on these three datasets the Precision value of the ISSA-ELM-Bagging algorithm is suboptimal. In Table 7, the Recall values of the ISSA-ELM-Bagging and SSA-ELM-Bagging algorithms reach the optimal value on the MC2, PC2 and ar4 datasets. For the ant-1.3 and camel-1.0 datasets, the three prediction algorithms ISSA-ELM, SSA-ELM-Bagging and ISSA-ELM-Bagging all reach the optimal value. On PC4, ar1, ar6 and velocity-1.4, the data volume is generally small and the number of features is few; the experimental results show that the Recall value of the SSA-ELM-Bagging prediction algorithm is better, with the ISSA-ELM-Bagging algorithm suboptimal. The Recall value of ISSA-ELM is superior to the other four algorithms on the PC1 dataset, but the gap between the ISSA-ELM-Bagging algorithm and the optimal value is only 0.03. In Table 8, on the datasets ant-1.3, ProjectA and ar6, the optimal F-measure values are reached by the ISSA-ELM-Bagging and SSA-ELM-Bagging algorithms, the ISSA-ELM algorithm, and the ISSA-ELM and SSA-ELM-Bagging algorithms, respectively. On the remaining datasets, the F-measure values of the ISSA-ELM-Bagging algorithm are the best; for datasets with low defect rates and large amounts of data, the prediction effect of the ISSA-ELM-Bagging algorithm is good. It can be seen from Table 9 that the ISSA-ELM-Bagging and SSA-ELM-Bagging prediction algorithms are superior to the other three comparison algorithms on ant-1.3, and the ISSA-ELM-Bagging algorithm has high prediction efficiency on the EQ, JDT, ML and PDE datasets. Table 10 shows that the SSA-ELM-Bagging prediction algorithm has the highest Accuracy value on the ar4, ar6 and poi-1.5 datasets, with the ISSA-ELM-Bagging algorithm suboptimal; for the poi-1.5 dataset, the high defect rate of the data means that the ISSA-ELM-Bagging algorithm may not reflect its advantage on low defect rate datasets. It can be seen from Table 11 that the ISSA-ELM-Bagging and SSA-ELM-Bagging prediction algorithms both reach the optimal G-mean value on the ant-1.3 dataset. The prediction effect of the SSA-ELM-Bagging algorithm is better than that of the other four prediction algorithms on the ar1, ar4 and ar6 datasets, with the ISSA-ELM-Bagging algorithm suboptimal on ar1 and ar4; on the ar6 dataset, the gap between the proposed algorithm and the optimal G-mean value is 0.1095, since the numbers of features and samples in this dataset are relatively small, resulting in a poorer effect of the ISSA-ELM-Bagging algorithm.
The ISSA-ELM-Bagging algorithm can achieve the optimal evaluation index values on most datasets. By analyzing Table 6 to Table 11, it can be seen that the ISSA-ELM-Bagging algorithm achieves good prediction results for data with high dimensionality, large volume and low defect rate, which shows that the ISSA-ELM-Bagging algorithm has higher prediction accuracy and stronger generalization ability.
The standard deviation is an important indicator of the effect and stability of an algorithm: the lower the standard deviation, the better the stability, and vice versa. It can be seen from Fig. 8 that the standard deviation of the ISSA-ELM-Bagging algorithm is not the optimal value; since the accuracy of the ELM prediction algorithm on the 25 datasets is generally low, its final standard deviation is lower than that of the ISSA-ELM-Bagging algorithm. It can be seen from Fig. 9 to Fig. 13 that the standard deviation of the ISSA-ELM-Bagging algorithm is the minimum among the five prediction algorithms on the other five evaluation indexes.
Especially under the four evaluation indexes of F-measure, MCC, Accuracy and G-mean, the standard deviation of the ISSA-ELM-Bagging algorithm is significantly better than that of the other four prediction algorithms, which shows that the ISSA-ELM-Bagging algorithm has better overall stability on the software defect datasets than the other four prediction algorithms.
The traditional ELM is a single hidden layer feedforward neural network with poor stability and randomly selected parameters, resulting in low prediction accuracy. The traditional sparrow search algorithm has excellent optimization ability and can obtain the optimal ELM parameters and improve stability, but it easily falls into a local optimal solution in late iterations. The improved sparrow search algorithm achieves a faster convergence rate and the ability to jump out of local optimal solutions by using pinhole imaging reverse learning and the flip bucket foraging strategy. Using the improved sparrow search algorithm to optimize the ELM as the base predictor of the ensemble learning algorithm further improves the prediction accuracy and generalization ability of the ELM.

Conclusions
In this paper, the ISSA-ELM-Bagging algorithm is proposed to improve the stability of the ELM on datasets with low defect rates and high data dimensionality. The optimization ability of the sparrow search algorithm is improved by using pinhole imaging reverse learning and the flip bucket foraging strategy. The ISSA algorithm is then used to replace the random selection of ELM parameters, obtaining the optimal ELM parameters and ensuring the prediction accuracy of ISSA-ELM. Finally, ISSA-ELM is used as the base predictor of the Bagging ensemble learning algorithm to improve the stability of ISSA-ELM. The experimental results show that the optimization performance of the ISSA algorithm is significantly better than that of the comparison algorithms, and on the 25 software defect datasets the six evaluation indexes of the ISSA-ELM-Bagging ensemble prediction algorithm are significantly superior to those of the other prediction algorithms. In future work, a weighted integration strategy can be explored to further improve the performance and stability of the model.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.