Improved Particle Swarm Optimization Based on a Quantum Behaved Framework for Big Data Optimization

In recent times, big data has become an essential concern with the rapid increase of digitalization. The task of extracting and evaluating meaningful features from big data can be formulated as an optimization problem. In this paper, data sets containing EEG signals are studied. The goal is to detect the true EEG signals while eliminating additional brain-activity patterns in the collected data, resulting in more accurate interpretation. To handle big data optimization (BigOpt) problems, a novel swarm intelligence-based technique is developed: an Improved PSO-Q (IPSO-Q) is proposed by updating the random walk phase of the Particle Swarm Optimization combined with the quantum behaved method (PSO-Q), thereby improving the local search capability of PSO-Q. The performance of PSO-Q and IPSO-Q has been thoroughly tested with various numbers of cycles (maximum iterations: 300, 400, 500, and 1000) and population sizes (10, 25, and 50) on six data sets. The outcomes of PSO-Q and IPSO-Q were statistically evaluated with the Wilcoxon signed-rank test. PSO-Q and IPSO-Q were also compared with swarm-based algorithms developed in the literature in recent years (BA, Jaya, AOA, etc.). The evaluation of the obtained results demonstrates the success of IPSO-Q and shows that it can be used as an alternative algorithm for BigOpt problems.


Introduction
The process of making anything better is known as optimization. In other terms, optimization is the act of modifying the inputs or attributes of a device, mathematical process, or experiment in order to obtain the smallest or largest output [1]. Optimization algorithms are classified into two types: deterministic and stochastic techniques. Although deterministic approaches offer a greater rate of convergence, they are susceptible to local optima because they rely on gradient information of the optimization problem.

The best-known form of stochastic algorithms is evolutionary algorithms. In general, evolutionary algorithms are created by imitating various physical, biological, or natural processes. Many evolutionary algorithms have been proposed in the literature, such as the Artificial Bee Colony algorithm (ABC) [2], the Firefly Algorithm (FA) [3], Ant Colony Optimization (ACO) [4], and the Particle Swarm Optimization algorithm (PSO) [5]. PSO is a heuristic algorithm created by imitating the swarm behavior of creatures such as fish and birds that live in flocks in nature; it was introduced by Eberhart and Kennedy in 1995 [5]. PSO is easy to implement and has few parameters. Various PSO applications have been proposed in the literature and applied to a wide range of problems [6][7][8]. The Particle Swarm Optimization combined with the quantum behaved method (PSO-Q) is a population-based algorithm and was first presented by Sun et al. [9]. The quantum behaved particle swarm optimization algorithm brings quantum computing to PSO; thus, the disadvantages of PSO are overcome while its advantages are retained. PSO-Q is based on the representation of the quantum state vector. It employs quantum spin gates to execute the update process and applies the qubit's probabilistic amplitude representation to particle coding, allowing a particle to represent an overlapping of several states. Early convergence is an issue that many heuristic algorithms have to deal with; by combining classical PSO with quantum computing, this disadvantage of PSO is avoided. Many studies on PSO-Q have appeared in the literature. In 2005, Moore and Venayagamoorthy presented a quantum particle swarm technique for combinational logic [10]. In 2006, Mikki and Kishk introduced an electromagnetism-based quantum mechanical particle swarm method and improved the electromagnetic elements [11]. In 2014, Yumin and Li suggested a particle swarm optimization approach based on an artificial fish swarm that was paired with a quantum-behaved algorithm [12]; the proposed algorithm uses swarming and tracking behavior to avoid pulling individuals to local endpoints. Santos Coelho et al. presented a more efficient PSO-Q that incorporates chaotic sequences (CQPSO) [13]. Liu and Zhang presented dynamic clustering combined with quantum-behaved particle swarm optimization [14]. For system identification, Li and Li presented a quantum particle swarm evolutionary method [15].
Stochastic algorithms from the field of evolutionary computation are well suited to big data optimization (BigOpt) challenges: because they work on a population of candidate solutions, they can explore large and extensive search regions effectively and efficiently. The term "big data" refers to the vast amount of data that has been stored as a result of today's digitalization [16]. The flow of information has grown enormously in the last decade due to the ease of access and sharing of information. This growth has allowed data to be produced in many areas (medicine, social media, etc.) [17,18]. Many areas are becoming connected through information streams or data shared on social media. It is not easy to process this growing data and make it meaningful, yet big data offers new opportunities and broadens business horizons [19]. Storing or processing such data with classical databases and data processing methods creates many challenging problems, as they cannot support the capacity of big data. According to the National Institute of Standards and Technology (NIST), big data has four characteristics: volume, variety, velocity, and variability [16]. Even with scalable architectures capable of processing and analyzing enormous volumes of data, big data has introduced many issues. Big data is ubiquitous in industry, with real-world applications such as cybersecurity, logistics, and healthcare. In this study, the process of cleaning electroencephalography (EEG) records is discussed. The brain's electrical activities are captured via EEG signals. The core of the problem is to identify the true EEG signals, thus providing more reliable interpretation accuracy. Another concern is processing time, because this cleanup requires real-time performance. In the Big Data Optimization 2015 Competition, this task was modeled as a mathematical big data optimization problem (BigOpt). Given the number of variables evaluated and the number of channels employed in the capture procedure, the dimensionality of the BigOpt problem can reach multiples of the length of the EEG signals [17,18].
In this study, the BigOpt problem is solved using the PSO-Q algorithm. To improve the results, the Improved Particle Swarm Optimization combined with the quantum behaved approach (IPSO-Q) is developed in this work for big signal optimization. The random walk phase of PSO-Q has been updated with an equation that calculates "mort", and the proposed new algorithm is named IPSO-Q. Thus, escape from the local optimum traps of the PSO-Q algorithm is ensured. To test the performance of the framework, PSO-Q and IPSO-Q are applied to optimize six data sets using the BigOpt problem formulations. A detailed parameter analysis is also presented: the results obtained with three different population sizes and four different maximum iteration counts are compared, showing the effect of the parameters on the results.

The Main Contribution, Motivation, and Organization
Big data is a term used to express data that grows over time. In this study, heuristic algorithms provide a solution to the problem defined as the big data optimization problem. The contributions of the study are as follows:
• This study used big data, which has recently evolved and is difficult to process. The flow of information has grown enormously due to the ease of access and sharing of information, and it is not easy to process this growing data and make it meaningful.
• The problem addressed here was modeled as a mathematical BigOpt problem in the 2015 Big Data Optimization Competition. The dimensionality of the BigOpt problem can be very large, since the number of variables handled can reach multiples of the number of channels used in the capture process and the length of the EEG signals. Therefore, the problem is considered a large-scale optimization problem, with large dimensions arising from the structure of the data sets.
• In this paper, the BigOpt problem was solved with PSO-Q and IPSO-Q for the first time. The Improved PSO-Q (IPSO-Q) is proposed by updating the random walk phase of PSO-Q, thereby improving the local search capability of PSO-Q.
• IPSO-Q and PSO-Q were run with various population sizes (10, 25, and 50) and maximum iteration numbers (cycles) (300, 400, 500, and 1000) to test their success as sensitively as possible.
• IPSO-Q has been compared with recently developed swarm intelligence-based methods from the literature (the Jaya algorithm [20], the Arithmetic Optimization Algorithm (AOA) [21], the Bat Algorithm (BA) [22], etc.), and its success has been tested statistically.
The paper is organized as follows: related works for BigOpt are explained in Sect. 2, the original version of PSO-Q is explained in Sect. 3, and the Improved PSO-Q (IPSO-Q) is detailed in Sect. 4. The BigOpt problem is detailed in Sect. 5. In Sect. 6, IPSO-Q and PSO-Q are tested on six benchmark data sets for various population sizes and maximum iterations, their results are compared with different methods from the literature, and the results are discussed.

Related Works for the BigOpt Problem
Many studies in the literature address the BigOpt problem. Aslan investigated the basic ABC method and its well-known versions on the optimization challenges based on electroencephalographic signal decomposition proposed at the 2015 Big Data Competition [23]. Aslan and Karaboga suggested a genetic Artificial Bee Colony technique for signal reconstruction based on big data optimization [24]. The ABC algorithm was updated to account for the specific aspects of big data optimization problems, and a new variant, known as genetic big data ABC (gdatABC), was created; a variety of experiments were carried out to investigate the solution capabilities of the gdatABC method. Elaziz et al. presented a hybrid salp swarm method with differential evolution for multiobjective big data optimization [25]. The differential evolution algorithm's job is to improve the salp swarm algorithm's exploitation capacity, since the differential evolution operators are employed as local search operators. In general, the proposed technique is divided into three stages: in the first stage, the population is established and the archive is initialized; the second stage updates the solutions using the hybrid salp swarm method and the differential evolution algorithm; and the third stage detects nondominated solutions and updates the archive. A set of tests was carried out to evaluate the performance of the suggested technique on a collection of single-objective and multiobjective optimization tasks from the 2015 Big Data competition [25]. Wang et al. presented a hybrid multi-objective FA (HMOFA) for big data optimization [26]. In comparison to the previously introduced DE-based big data optimization strategies, the changes to the FA were kept to a minimum: they integrated the DE algorithm's typical crossover method into the FA procedure and adaptively adjusted the parameters. Experiments with several cases of the discussed big data optimization problem show that the suggested FA-based method yields good results. For big data optimization problems, Yi et al. suggested an enhanced NSGA-III method with an adaptive mutation operator [27]. They developed an adaptive mutation operator to improve the conventional NSGA-III's performance and evaluated it with three distinct crossover strategies, including simulated binary crossover, uniform crossover, and single-point crossover. Experiment findings reveal that adaptive NSGA-III based on uniform crossover gives superior outcomes to other NSGA-III versions [27]. Zhang et al. suggested a multi-objective memetic approach for large optimization problems based on decomposition [28,29]. Zhang et al. also investigated several methods for merging big data processing methods with network optimization in order to improve user quality of experience [28,29]. They presented a Big Data-Driven (BDD) network optimization system; the suggested architecture was designed to work with forthcoming 5G networks and Big Data technologies. Sabar et al. developed a heterogeneous method based on a cooperative co-evolution (CC) approach with several types of memetic algorithms (MAs). They broke a large-scale problem into manageable subproblems using the proposed CC technique, and demonstrated that their suggested heterogeneous approach can self-adaptively solve the Big Data 2015 challenges [30].
Elsayed and Sarker proposed a differential evolution framework for the 2015 big data optimization competition problems, increasing the exploitation capability of the proposed method [31]. Before tackling EEG signal-based optimization problem instances, Cao et al. presented the Phase-Based Optimization (PBO) method, which examines transitions between the solid, liquid, and gas phases of matter [32]. Zhang et al. introduced a multi-agent genetic algorithm for big optimization problems [33]. Loukdache et al. concentrated on the Clonal Selection method and assessed its performance on several cases of the aforementioned big data optimization challenge [34]. Meselhi et al. attempted to eliminate artifacts from EEG signals by parallelizing the DE algorithm-based memetic approach for graphics processing units [35]. Abdi and Feizi-Derakhshi suggested a hybrid multi-objective evolutionary method for big data optimization problems based on the Search Manager framework [36]. The Search Manager (SM), a recently published framework for hybridizing metaheuristics to improve optimization algorithm performance, is expanded for multi-objective problems (MOSM) to handle the EEG signal analysis problem, which belongs to the class of big data optimization problems. The experimental findings show that the proposed MOSM designs are effective on these types of problems. Peng et al. introduced DE-DDI, a new DE framework for big data that selects the base and difference vectors for the mutation operator based on information collected directly from the population; the data is obtained from a distributed topology that is used to define each individual's neighborhood. The algorithm's performance is evaluated using the CEC2005 benchmark functions, and the results demonstrate its efficacy in dealing with these problems [37]. Within their Fireworks Algorithm-based Framework (FAF), El Majdouli et al. developed a single-objective Fireworks Algorithm (SOFWA) with a modified search mechanism for consistent adaptation to big data challenges [17,18]. One of the most significant changes made by El Majdouli et al. concerned the initialization of the candidate solutions, or fireworks: rather than randomizing the parameters of the feasible solutions within the provided lower and upper boundaries, they used the source EEG signal directly.

The Particle Swarm Optimization Combined with Quantum Behaved Method (PSO-Q)
In the 1980s, Benioff and Feynman first introduced the concept of quantum computing to the literature. Quantum computing uses the superposition, entanglement, and coherence characteristics of quantum states, and it is thought that quantum computing can produce effective solutions for NP-hard problems in classical computation. Since then, quantum computing has attracted the attention of many researchers in the scientific world, especially after Grover's database search algorithm and Shor's quantum prime factorization algorithm were introduced to the literature [38,39]. PSO-Q is a combination of quantum computing and PSO. PSO-Q uses the quantum state vector: many overlapping states are represented for a particle, and particles are encoded using quantum spin gates for the update operation [40]. A particle in quantum space is described using the wave function, which is shown in Eq. 1.
where |ψ|² is the square of the modulus of the wave function and Q is the probability density function. Equation 2 shows the normalization condition.
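In the standard quantum-behaved formulation of Sun et al. [9], these two relations can be restated as follows (a restatement of the usual one-dimensional form; the exact rendering of Eqs. 1 and 2 in the original is assumed):

$$
|\psi(x,t)|^{2} = Q(x,t) \quad \text{(Eq. 1)}, \qquad
\int_{-\infty}^{+\infty} |\psi(x,t)|^{2}\, dx = 1 \quad \text{(Eq. 2)}
$$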
Assume that the quantum space contains a population of n particles of dimension D. $X_i = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{iD})$ is the location of the ith particle, $P_i = (p_{i1}, p_{i2}, p_{i3}, \ldots, p_{iD})$ is the best position of the ith particle through its history, and $P_g = (p_{g1}, p_{g2}, p_{g3}, \ldots, p_{gD})$ is the best historical location among all the particles.
Equations 3-5 provide the future locations of the particles in the random walk phase in quantum space.
where u is a random number in the range [0, 1] and β is the contraction-expansion factor; it is the only parameter of PSO-Q. β is given by Eq. 6. φ is a random number in [0, 1]. Sun et al. proposed mbest to avoid early convergence [9]; mbest is shown by Eq. 7.
where Max_iter is the maximum number of iterations, and w1 and w2 are 0.5 and 1.0, respectively.
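In the form given by Sun et al. [9], the random walk in quantum space and the contraction-expansion factor can be restated as follows (a restatement of the standard equations; the exact notation of Eqs. 3-6 in the original is assumed):

$$
\begin{aligned}
p_{id} &= \phi\, P_{id} + (1 - \phi)\, P_{gd} && \text{(local attractor, Eq. 3)}\\
L_{id} &= 2\beta\, \lvert p_{id} - x_{id}(t) \rvert && \text{(characteristic length, Eq. 4)}\\
x_{id}(t+1) &= p_{id} \pm \tfrac{1}{2} L_{id}\, \ln(1/u) && \text{(position update, Eq. 5)}\\
\beta &= w_1 + (w_2 - w_1)\, \frac{Max_{iter} - t}{Max_{iter}} && \text{(Eq. 6)}
\end{aligned}
$$

With w1 = 0.5 and w2 = 1.0, β decreases linearly from 1.0 to 0.5 over the course of the run, gradually shifting the search from exploration toward exploitation.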
where p_i is the best position of the ith particle and n is the number of particles (population size). "mbest" indicates the mean best position of the n particles. After the introduction of mbest, the random walk phase of PSO-Q is updated again; the new random walk phase of PSO-Q is given by Eqs. 8 and 9. Figure 1 depicts the PSO-Q flowchart, whereas Fig. 2 depicts the PSO-Q work processes.
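With mbest, the update rule becomes (again a restatement of the standard quantum-behaved equations, assumed to match Eqs. 7-9):

$$
\begin{aligned}
mbest &= \frac{1}{n} \sum_{i=1}^{n} P_i && \text{(Eq. 7)}\\
L_{id} &= 2\beta\, \lvert mbest_d - x_{id}(t) \rvert && \text{(Eq. 8)}\\
x_{id}(t+1) &= p_{id} \pm \beta\, \lvert mbest_d - x_{id}(t) \rvert\, \ln(1/u) && \text{(Eq. 9)}
\end{aligned}
$$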

Improved Particle Swarm Optimization Combined with Quantum Behaved Method (IPSO-Q)
PSO-Q is improved in this section. The early convergence problem is a major problem for many heuristic algorithms. While heuristic methods search for optimal points in the search space, they perform both a local and a global search. If a heuristic algorithm fails to achieve an adequate balance between local and global search, either early convergence occurs or the search strays too far from the optimal point; this situation also arises when one of the two searches is insufficient. Therefore, a successful heuristic algorithm must achieve a good balance between exploitation and exploration, and both search capabilities must be sufficiently strong. The premature convergence problem of classical PSO is reduced by adding quantum computation to PSO. This is accomplished by expressing each particle as a quantum system in a quantum state formulated with a wave function. In the structure of the PSO-Q algorithm, the mbest point is used in the random walk phase to prevent early convergence; "mbest" indicates the average best position of the n particles, and the objective points of the best and worst particles are also included in this average.

[Fig. 1: PSO-Q flowchart — set the parameters, create the population, calculate the population's objective function, update the personal best, update the global best, and update the population until the stopping criterion is met.]

The working steps of PSO-Q (Fig. 2) are as follows:
Step 1. Set the algorithm parameters (n = population size, D = dimension, Max_iter = maximum number of iterations, p_i = best position of each particle in its history, and p_g = best point in global history).
Step 2. Calculate and assess the objective points of the individuals in the population.
Step 3. Compare each particle's objective point with the best objective point in its history; replace the previous objective point with the new one if the new one is superior, otherwise do nothing.
Step 4. Update the global objective point, which is the best objective point across the whole population.
Step 5. Using the PSO-Q update rule (Eq. 9), move all of the particles in the space to their new locations.
Step 6. If PSO-Q reaches the maximum number of iterations, return the best result; otherwise, go back to Step 2.

We realized that, since the particles with the best and worst objective points represent the best and worst particles of the system, both points play an active role in the exploitation and exploration capabilities of PSO-Q. Approaching the best point increases the local search capability and causes early convergence, while approaching the worst point drives the search far from the optimum. Therefore, if the objective points of these two particles are excluded from the mbest calculation, the success of PSO-Q will increase. IPSO-Q, developed by updating the random walk phase of PSO-Q, is proposed in this work: instead of the "mbest" used in PSO-Q, the "mort" equation is used in IPSO-Q. Equation 10 shows "mort" in IPSO-Q. Whereas mbest averages the best position points of the whole population, mort leaves the best and worst particles of the population out of the average. Thus, the new algorithm avoids pulling the individuals toward the local endpoints (the best and worst points).
Although the technique has been simplified, the new "mort" mechanism introduced into the random walk stage enhances the algorithm's performance, as seen in the experimental section.
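Based on the definitions given below, Eq. 10 can be restated as (a reconstruction from the surrounding text):

$$
mort = \frac{1}{n-2} \left( \sum_{i=1}^{n} P_i \;-\; P_{best} \;-\; P_{worst} \right) \qquad \text{(Eq. 10)}
$$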
where p_i is the best location of the ith individual and n is the number of individuals (population size). p_best is the best point among the individuals and p_worst is the worst point among the individuals. "mort" indicates the mean best position of the n-2 individuals (excluding the p_best and p_worst points). Figure 1 shows the IPSO-Q flowchart, which is identical to that of the original PSO-Q method, and the working stages are identical to those of the original PSO-Q method in Fig. 2. The main change is that Eq. 10 replaces Eq. 7, which is used to update individual locations in PSO-Q.
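As a minimal sketch of how the IPSO-Q random walk could be implemented (this is not the authors' reference code; the function names, the uniform sampling, the boundary clipping, and the exact update form are our assumptions based on the standard quantum-behaved equations and the mort definition above):

```python
import numpy as np

def mort(P, pbest_fit):
    """Eq. 10: mean personal-best position over the n-2 individuals,
    excluding the best and the worst personal bests."""
    best, worst = np.argmin(pbest_fit), np.argmax(pbest_fit)
    keep = np.ones(len(P), dtype=bool)
    keep[[best, worst]] = False
    return P[keep].mean(axis=0)

def ipsoq(f, n=25, D=1024, max_iter=300, lo=-8.0, hi=8.0, w1=0.5, w2=1.0):
    """Minimize f with the IPSO-Q random walk (Steps 1-6, mort variant)."""
    X = np.random.uniform(lo, hi, (n, D))        # Step 1: initial population
    P = X.copy()                                 # personal bests
    pfit = np.apply_along_axis(f, 1, X)          # Step 2: evaluate objectives
    g = P[np.argmin(pfit)].copy()                # global best
    for t in range(max_iter):
        beta = w1 + (w2 - w1) * (max_iter - t) / max_iter   # Eq. 6
        m = mort(P, pfit)                        # Eq. 10 replaces Eq. 7
        phi = np.random.rand(n, D)
        u = np.random.rand(n, D)
        sign = np.where(np.random.rand(n, D) < 0.5, 1.0, -1.0)
        attractor = phi * P + (1.0 - phi) * g    # local attractor (Eq. 3)
        # quantum random walk around the attractor (Eq. 9, mbest -> mort)
        X = attractor + sign * beta * np.abs(m - X) * np.log(1.0 / u)
        X = np.clip(X, lo, hi)                   # keep inside the search space
        fit = np.apply_along_axis(f, 1, X)
        improved = fit < pfit                    # Steps 3-4: update bests
        P[improved], pfit[improved] = X[improved], fit[improved]
        g = P[np.argmin(pfit)].copy()
    return g, pfit.min()
```

For instance, `ipsoq(lambda x: float(np.sum(x**2)), n=10, D=20, max_iter=100)` would run the sketch on a simple sphere function.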

The Big Data Optimization (BigOpt) Problem
When the amount, speed, diversity, and correctness of big data are considered, it can be distinguished from standard datasets or databases. In the 2015 CEC Big Data Competition, Abbass et al. proposed a novel big data and related optimization challenge based on these characteristics [41][42][43]. Considering the inter-dependent time series involved, the data gathered by EEG measurements are divided into six groups: D4, D12, D19, D4N, D12N, and D19N. Each optimization problem is based on a set of inter-dependent time series forming a dataset [31]. The length of each time series is fixed at 256 [41][42][43]. The numbers of inter-dependent time series for data sets D4, D12, and D19 are 4, 12, and 19, respectively, so the corresponding problem instances have 1024, 3072, and 4864 variables, respectively.
Let us define two matrices, Y and T, of dimension N × M, where N is the number of inter-dependent time series and M is the length of each time series, and let B be an N × N linear transformation matrix [31]. The task is to split the T matrix into two matrices, T1 and T2, of the same dimensionality as T. Equations 11-13 express these matrices.
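A plausible restatement of Eqs. 11-13, consistent with the description above and with the correlation term in Eq. 14, is the following (the exact displayed forms and numbering are assumptions):

$$
T_1, T_2 \in \mathbb{R}^{N \times M}, \qquad T = T_1 + T_2, \qquad \hat{Y} = B \times T_1,
$$

where $\hat{Y}$ (our notation) is the transformed component that is compared against Y in Eq. 14.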
However, there is no simple way to obtain the optimal T1 and T2 matrices directly. As a result, statistical computations are used to guide the split of the T matrix. Let C be the Pearson correlation coefficient; C between B × T1 and Y is calculated by Eq. 14.
where cov(Y, B × T1) is the covariance and σ(Y) and σ(B × T1) are the standard deviations. In addition, the distance between T and T1 should be as small as possible (T1 should be as similar to T as possible). Two different objective functions can be created from these two targets: Eq. 15 shows the first objective function and Eq. 16 shows the second.
For the single-objective BigOpt problem, the objective function is defined as f1 + f2, as shown in Eq. 17.
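A hedged reconstruction of Eqs. 14-17, based on the 2015 competition definition [41][42][43], is given below; the pairwise Pearson correlations are collected in an N × N matrix C, and the exact normalization constants in f1 and f2 should be verified against the original problem statement:

$$
\begin{aligned}
C_{ij} &= \frac{\operatorname{cov}\!\bigl(Y_i,\,(B \times T_1)_j\bigr)}{\sigma(Y_i)\,\sigma\!\bigl((B \times T_1)_j\bigr)} && \text{(Eq. 14)}\\
f_1 &= \frac{1}{N^2 - N}\sum_{i \ne j} C_{ij}^2 \;+\; \frac{1}{N}\sum_{i=1}^{N}\bigl(1 - C_{ii}\bigr)^2 && \text{(Eq. 15)}\\
f_2 &= \frac{1}{N M}\sum_{i=1}^{N}\sum_{j=1}^{M}\bigl(T_{ij} - T_{1,ij}\bigr)^2 && \text{(Eq. 16)}\\
f &= f_1 + f_2 && \text{(Eq. 17)}
\end{aligned}
$$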

Experimental Analysis
The capabilities of the PSO-Q and IPSO-Q methods for big data optimization were evaluated in this study. PSO-Q and IPSO-Q were run with three different population sizes (10, 25, and 50) and four different maximum iteration counts (cycles) (300, 400, 500, and 1000). The search space is set to [-8, 8]. Each application was run 30 times, and the best, the worst, the average of the objective values (mean), and the standard deviation (SD) were determined. Experiments were carried out on a machine with a 1.19 GHz processor and 12 GB of memory. The parameter setup is indicated in Table 2. Table 3 indicates the results of the baseline algorithm (NSGA-II) [44]; these are the first optimum points reported for the big data optimization problem.

The Comparisons of PSO-Q and IPSO-Q for 300, 400, 500, and 1000 Cycles
PSO-Q and IPSO-Q were compared on three criteria (best, mean, and SD) for 300, 400, 500, and 1000 cycles according to the results in Tables 4, 5, 6 and 7. The best results of PSO-Q and IPSO-Q are marked in bold, and the baseline algorithm results are added to the comparison tables. According to the results, IPSO-Q performed much better than PSO-Q on the mean and SD criteria. The comparison of PSO-Q and IPSO-Q for 300 cycles is shown in Table 4. On the best criterion, IPSO-Q gives the better result on 66.67% (4 out of 6), 66.67% (4 out of 6), and 83.33% (5 out of 6) of the data sets for population sizes 10, 25, and 50, respectively. On the mean criterion, IPSO-Q gives the better result on 100% (6 out of 6) of the data sets for all three population sizes. On the SD criterion, IPSO-Q gives the better result on 100% (6 out of 6), 100% (6 out of 6), and 83.33% (5 out of 6) of the data sets for population sizes 10, 25, and 50, respectively.
The comparison of PSO-Q and IPSO-Q for 400 cycles is demonstrated in Table 5. On the best criterion, IPSO-Q gives the better result on 100% (6 out of 6), 100% (6 out of 6), and 83.33% (5 out of 6) of the data sets for population sizes 10, 25, and 50, respectively. On the mean criterion, IPSO-Q gives the better result on 100% (6 out of 6) of the data sets for all three population sizes. On the SD criterion, IPSO-Q gives the better result on 83.33% of the data sets […]. The comparison of PSO-Q and IPSO-Q for 500 cycles is demonstrated in Table 6. On the best criterion, IPSO-Q gives the better result on 66.67% (4 out of 6), 83.33% (5 out of 6), and 33.33% (2 out of 6) of the data sets for population sizes 10, 25, and 50, respectively. On the mean criterion, IPSO-Q gives the better result on 100% (6 out of 6) of the data sets for all three population sizes. On the SD criterion, IPSO-Q gives the better result on 83.33% (5 out of 6), 66.67% (4 out of 6), and 83.33% (5 out of 6) of the data sets for population sizes 10, 25, and 50, respectively. The comparison of PSO-Q and IPSO-Q for 1000 cycles is demonstrated in Table 7. On the best criterion, IPSO-Q gives the better result on 83.33% (5 out of 6), 66.67% (4 out of 6), and 50% (3 out of 6) of the data sets for population sizes 10, 25, and 50, respectively. On the mean criterion, IPSO-Q gives the better result on 83.33% (5 out of 6), 100% (6 out of 6), and 100% (6 out of 6) of the data sets for population sizes 10, 25, and 50, respectively. On the SD criterion, IPSO-Q gives the better result on 83.33% (5 out of 6), 100% (6 out of 6), and 100% (6 out of 6) of the data sets for population sizes 10, 25, and 50, respectively.

The Wilcoxon Signed-Rank Test on the Results of PSO-Q and IPSO-Q
The Wilcoxon signed-rank findings for the PSO-Q and IPSO-Q algorithms for 300, 400, 500, and 1000 cycles are indicated in Tables 8 and 9, which report the results of a signed-rank test at the 0.05 significance level. If the p-value is below 0.05, the sign is shown as positive; if the p-value is above 0.05, it is shown as negative. The results revealed a considerable difference between PSO-Q and IPSO-Q, with the sign being positive in the majority of situations.
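A minimal sketch of how such a paired comparison could be computed with SciPy follows; the sample arrays are placeholders standing in for the 30 run results per data set, not the paper's actual numbers:

```python
from scipy.stats import wilcoxon

# Final objective values of 30 independent runs of each algorithm
# on one data set (placeholder values, not the paper's results).
psoq_runs = [14.2, 13.8, 14.5, 13.9, 14.1, 14.7, 13.6, 14.3, 14.0, 14.4,
             13.7, 14.6, 14.2, 13.9, 14.1, 14.5, 13.8, 14.0, 14.3, 14.2,
             14.4, 13.9, 14.1, 14.6, 13.7, 14.2, 14.0, 14.3, 13.8, 14.5]
ipsoq_runs = [13.1, 12.9, 13.4, 12.8, 13.0, 13.5, 12.7, 13.2, 12.9, 13.3,
              12.8, 13.4, 13.1, 12.9, 13.0, 13.3, 12.8, 13.0, 13.2, 13.1,
              13.3, 12.9, 13.0, 13.4, 12.7, 13.1, 12.9, 13.2, 12.8, 13.3]

stat, p_value = wilcoxon(psoq_runs, ipsoq_runs)  # paired, two-sided test
sign = "+" if p_value < 0.05 else "-"            # sign convention of Tables 8-9
print(f"W={stat:.1f}, p={p_value:.4g}, sign={sign}")
```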

Table 8
Results of the statistical test on the PSO-Q and IPSO-Q findings for 300 and 400 cycles

Table 9
Results of the statistical test on the PSO-Q and IPSO-Q findings for 500 and 1000 cycles

Comparisons of PSO-Q and IPSO-Q with Other Algorithms
Case 1: In this subsection, a comparison is made with some recently developed swarm-based algorithms from the literature to analyze the success of IPSO-Q: the Jaya algorithm [20], the Arithmetic Optimization Algorithm (AOA) [21], and the Bat Algorithm (BA) [22]. The original codes for Jaya, AOA, and BA were obtained from the MATLAB library (https://www.mathworks.com). The population size is set to 25 and the maximum iterations to 300, 400, 500, and 1000. All methods were run 30 times. Table 10 shows the parameter settings of the comparison algorithms. Tables 11, 12, 13 and 14 provide the best, worst, and mean objective function values and the standard deviation (SD) of the comparison methods. The best values are marked; no comparison was made on the worst values.
According to the best, mean, and SD results, IPSO-Q performed better than the Jaya, AOA, BA, and PSO-Q algorithms. On the mean criterion, IPSO-Q gives the better result on 100% (6 out of 6) of the data sets for 300, 400, 500, and 1000 cycles. On the SD criterion, IPSO-Q gives the better result on 100% of the data sets for 300 and 1000 cycles and on 66.67% of the data sets for 400 and 500 cycles. On the best criterion, IPSO-Q gives the better result on 66.67%, 100%, 66.67%, 83.33%, and 66.67% of the data sets for 300, 400, 500, and 1000 cycles, respectively.
The Wilcoxon signed-rank test was also applied to the Jaya, AOA, BA, and IPSO-Q results; the outcomes are indicated in Tables 15, 16, and 17. From these outcomes, the IPSO-Q results are sufficient to produce statistically significant differences in favor of IPSO-Q over the Jaya, AOA, and BA methods. Figures 3, 4 and 5 show the outcomes of the IPSO-Q, PSO-Q, Jaya, AOA, and BA methods with population size 25 for various cycles (300, 500, and 1000). According to the convergence graphs, the IPSO-Q, PSO-Q, and AOA algorithms converge substantially faster than the BA and Jaya methods.
Case 2: The results given in Table 18 can be checked to analyze whether the success of IPSO-Q is adequate compared to other heuristic methods. The outcomes obtained from the IPSO-Q and PSO-Q algorithms were compared with literature algorithms such as COABC [23], ACDE [30,44], SaNSDE-CC [30,46], JADE [30,47], SADE [30,48], NSGA-II [30,43], and MOFA [26] on six instances (D4, D4N, D12, D12N, D19, and D19N). The population size is set to 100 and the maximum iterations to 1000 [23,30,44,46]. The best, standard deviation (SD), and mean values of the fitness function for the comparison methods are indicated in Table 18. The best mean values are marked.
The results of IPSO-Q are much better than those of SaNSDE-CC, JADE, SADE, PSO-Q, and NSGA-II for the six big data instances, and IPSO-Q outperforms MOFA for D19 and D19N. When Table 18 is examined, IPSO-Q provides competitive solutions; however, it could not produce better outcomes than the COABC and ACDE algorithms.
The average fitness values found by COABC, ACDE, SaNSDE-CC, JADE, SADE, MOFA, PSO-Q, and IPSO-Q were obtained for the six big data sets, and the outcomes were subjected to a statistical test; Table 19 indicates the statistical test outcomes. When the results in Table 18 are examined, IPSO-Q's objective function results are better in the majority of cases.

Case 3
The results obtained from the IPSO-Q and PSO-Q algorithms were compared with literature algorithms such as IADEF, ADEF, IADEF-NLS, ADEF-NLS, DECC-DG, SHADE, DE1, DE2, DE3, and DE4 [31]. The best and mean objective function values of the comparison methods are indicated in Tables 20 and 21, with the best mean values marked in bold. The population size is set to 25 for D = {1024, 3072} and 50 for D = {4864}, and the maximum iterations to 300 for D = 1024, 400 for D = 3072, and 500 for D = 4864 [31]. According to Table 20, the most successful algorithm was IADEF; IPSO-Q failed to surpass IADEF, which achieved superior success on 6 of the 6 big data problem instances. According to Table 21, the most successful algorithms were IPSO-Q and DE4, each achieving superior success on 3 of the 6 big data problem instances.

Comparisons of PSO-Q and IPSO-Q on CEC-C06 2019 Benchmark Test Functions
For additional evaluation of the PSO-Q and IPSO-Q algorithms, 10 modern CEC benchmark test functions were selected. These test functions were chosen because they were developed in recent years: Price et al. designed them for single-objective optimization [49], and they are known as "The 100-Digit Challenge", intended for use in the annual optimization competition [50]. The definitions of the CEC-C06 2019 benchmark test functions are shown in Table 22. All test functions are scalable minimization functions. The dimensions of the first three CEC functions are 9, 16, and 18, respectively; the other CEC functions have a dimension of 10. Functions CEC04 to CEC10 are shifted and rotated, while the first three CEC functions are not. PSO-Q and IPSO-Q have been compared with heuristic algorithms whose performance has been demonstrated in recent years on the CEC-C06 2019 benchmark functions: the Fitness Dependent Optimizer (FDO) [50], the Dragonfly Algorithm (DA) [51], the Whale Optimization Algorithm (WOA) [52], and the Salp Swarm Algorithm (SSA) [53]. The parameters of each algorithm were set to the default values presented in their original papers. For each algorithm, the population size was set to 30 and the maximum number of iterations to 500. Comparison results are shown in Table 23, and Table 24 shows the Wilcoxon test results between IPSO-Q and the other heuristic comparison algorithms. According to the results, FDO showed outstanding success on CEC-C06 2019; after FDO, the most successful algorithm was IPSO-Q, which showed superior success on 2 of the 10 CEC functions. According to Table 24, there is a significant difference between the outcomes of IPSO-Q and the other comparison methods.

An Example of a Hadoop Application
Hadoop is an open-source framework that enables the parallel processing of large data files using the MapReduce method. It provides infrastructure support for distributed programs running on clusters of computers. Hadoop moves data to the relevant node without the need for user interaction and automatically performs recovery operations in case of a hardware failure. In the MapReduce technique, the written program is divided into small pieces that can run on any node in the computer cluster. Hadoop also provides a distributed file system (HDFS) that enables high bandwidth across the Hadoop cluster [54,55]. In this study, a sample application was carried out in the Hadoop environment using the earthquake data published by the U.S. Geological Survey (USGS) (Dataset: 7-day Earthquake Report of USGS) [56]. The earthquake data are available as CSV (comma-separated values) files published at periodic intervals. The analysis answers where earthquakes occur based on the number of earthquakes on a given date. Figure 6 shows a schematic representation of processing the data with Hadoop and trend analysis techniques. Figures 7 and 8 show the execution time as functions of the number of nodes and the number of days in the earthquake report. The experiment indicates that the Hadoop cluster scales to serve growing datasets and that job execution time decreases as the number of nodes increases.
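As a minimal sketch of this kind of MapReduce job via Hadoop Streaming (the CSV column layout, file paths, and invocation are our assumptions; the actual schema of the USGS feed should be checked against [56]):

```python
#!/usr/bin/env python3
"""Hadoop Streaming mapper/reducer counting earthquakes per day.
Assumes a USGS-style CSV whose first column is an ISO timestamp,
e.g. "2015-03-01T12:34:56Z,...". Hypothetical invocation:
  hadoop jar hadoop-streaming.jar \
    -mapper "eq_count.py map" -reducer "eq_count.py reduce" \
    -input /earthquakes/7day.csv -output /earthquakes/counts
"""
import sys

def mapper():
    """Emit one (day, 1) pair per earthquake record."""
    for line in sys.stdin:
        fields = line.strip().split(",")
        if not fields or fields[0].startswith("time"):
            continue                      # skip the CSV header line
        day = fields[0][:10]              # "YYYY-MM-DD" prefix of timestamp
        print(f"{day}\t1")

def reducer():
    """Sum the counts per day; streaming input arrives sorted by key."""
    current_day, count = None, 0
    for line in sys.stdin:
        day, value = line.strip().split("\t")
        if day != current_day:
            if current_day is not None:
                print(f"{current_day}\t{count}")
            current_day, count = day, 0
        count += int(value)
    if current_day is not None:
        print(f"{current_day}\t{count}")

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "reduce":
        reducer()
    else:
        mapper()
```

The same script serves as both mapper and reducer, selected by a command-line argument; Hadoop handles the shuffle-and-sort between the two phases, which is what lets the reducer assume key-sorted input.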

Conclusion
Big data refers to a broad variety of data, with many real-world application areas such as cybersecurity, logistics, and healthcare. Inferring meaningful features from this wide range of data is an important problem. In this paper, large datasets containing brain signals were selected. The fundamental issue is identifying the actual EEG brain signals while minimizing extraneous brain activity in the collected data, thus providing more reliable interpretation accuracy. Optimization techniques offer solutions for the processing of big data. In this study, a new method based on swarm intelligence was developed for the big data optimization (BigOpt) problem. The Particle Swarm Optimization based on the quantum behaved method (PSO-Q) is a population-based algorithm that brings quantum computing to the Particle Swarm Optimization algorithm (PSO); thus, the disadvantages of PSO are overcome while its advantages are retained. In this study, the local search capability was improved by updating the random walk phase of PSO-Q, and the Improved Particle Swarm Optimization based on the quantum behaved method (IPSO-Q) was proposed.