A neighborhood comprehensive learning particle swarm optimization for the vehicle routing problem with time windows

The vehicle routing problem with time windows (VRPTW), a typical NP-hard combinatorial optimization problem, plays an important role in modern logistics and transportation systems. Although the particle swarm optimization (PSO) algorithm exhibits very promising performance on continuous problems, adapting PSO to deal efficiently with VRPTW remains challenging. In this paper, we propose a neighborhood comprehensive learning particle swarm optimization (N-CLPSO) to solve VRPTW. To improve the exploitation capability of N-CLPSO, we introduce a new remove-reinsert neighborhood search mechanism. We compute an information matrix (IM) that records the probability of adjacency between two customers, based on information about the customers themselves and local information from elite individuals, to guide the removal operation of the neighborhood search. At the same time, we combine a cost matrix (CM), which records the cost incurred by customer reinsertion, with IM to create a guided reinsertion operator based on local information. Moreover, to enhance the exploration of N-CLPSO, a semi-random disturbance strategy is proposed. To prevent degradation of the population, the longest common sequences of the elites are preserved when performing the disturbance. To illustrate the effectiveness of N-CLPSO, this paper conducts extensive experiments on benchmark VRPTW instances.


Introduction
The vehicle routing problem (VRP), a typical combinatorial optimization problem, was first proposed by Dantzig and Ramser when they studied the optimal routing of gasoline delivery trucks (Dantzig and Ramser, 1959). Since there is a high correlation between VRP and real-world logistics, the VRP attracts much attention from researchers in the logistics field. During the last few years, various variants of VRP, such as the Capacitated VRP (CVRP) (Rabbouch et al, 2020) and the Heterogeneous Fleet VRP (HFVRP) (Lai et al, 2016), have been derived from different real-world problems. One of the most significant variants is the VRP with time windows (VRPTW) (Yu et al, 2011a), in which customers' access times and vehicles' capacity limits are considered in order to make the problem more realistic.
Initially, some deterministic algorithms were adopted to solve VRPTW (Baldacci et al, 2012; Kallehauge, 2008), but due to VRPTW's NP-hardness, solving large-scale VRPTW with deterministic algorithms is exceedingly time-consuming. Thus, in recent years, various heuristic and meta-heuristic algorithms for solving VRPTW problems have been of great interest to researchers. For example, Simulated Annealing (SA) (Zhong and Pan, 2007), Tabu Search (TS) (Ho and Haugland, 2004), Ant Colony Optimization (ACO) (Cruz-Reyes et al, 2014), Particle Swarm Optimization (PSO) (Gong et al, 2012), and the Genetic Algorithm (GA) (Nalepa and Blocho, 2015) have been proven effective in solving VRPTW problems. However, as the complexity and scale of VRPTW increase, the convergence speed and the quality of the results of these algorithms remain unsatisfactory. Hence, how to speed up convergence and improve solution accuracy needs to be studied further.
The PSO algorithm (Poli et al, 2007), an excellent heuristic proposed by Kennedy and Eberhart in 1995, was initially used for continuous function optimization and achieved many promising results in both theoretical studies (Wei et al, 2020; Xia et al, 2020b,a,c) and engineering applications (Veeramachaneni et al, 2005; Lin et al, 2009; Kulkarni and Venayagamoorthy, 2010; Kanakasabapathy and Swarup, 2010; Kulkarni and Venayagamoorthy, 2011). In recent years, some researchers have tried to improve the basic PSO algorithm to adapt it to combinatorial optimization problems, including VRPTW. However, PSO also has some shortcomings, including (1) premature convergence on complicated multimodal problems, and (2) low precision of the final solutions. The main reason for the former shortcoming is that population diversity may disappear rapidly during the optimization process. Hence, many improvements intended to preserve population diversity have been proposed (Ajibade et al, 2022; Cheng et al, 2011). To overcome the latter drawback, various efficient local search strategies have been applied in PSO variants (Chourasia et al, 2019; Liu et al, 2020). Although these strategies significantly enhance the overall performance of PSO on continuous problems, they cannot be directly applied to combinatorial optimization problems such as VRPTW. Thus, how to improve these strategies in PSO and adapt them to VRPTW still needs further study.
Based on the analysis above, this paper proposes a neighborhood comprehensive learning PSO (N-CLPSO) to solve VRPTW. In N-CLPSO, the velocity and position update operators of traditional PSO are improved to suit the characteristics of VRPTW. Based on these update operators, we design a novel neighborhood search operator to enhance the local search capability of N-CLPSO. Moreover, to maintain population diversity, a diversity retention strategy based on elite fragments is introduced. The properties of the introduced strategies are analyzed through systematic experiments, while the overall performance of N-CLPSO is verified by comparison with other state-of-the-art metaheuristics on a large number of benchmark VRPTW instances.
The main contributions of this study are summarized below.
(1) PSO has been widely used in the field of VRPTW, but its lack of local search capability often leads to unsatisfactory solution quality. To remedy this deficiency, we propose a novel reinsert operator and two remove operators based on the remove-reinsert idea of Large Neighborhood Search (Hong, 2012). We combine this novel remove-reinsert-based neighborhood search with CLPSO and propose N-CLPSO.
(2) Most previous reinsert operators consider only the incremental repair cost after node insertion and tend to ignore other useful information. Therefore, we propose an efficient repair operator, which not only considers the local information of customer locations, time windows, and elite segments, but also evaluates the incremental cost caused by inserting a customer point at a given location.
(3) To overcome premature convergence, this study proposes a diversity retention strategy based on semi-random disturbance of elite fragments. To ensure particle diversity, reorder operations of a certain scale should be performed on the original routes. However, large-scale random reorder operations may lead to particle degradation. Therefore, we design a new reorder strategy in which the longest common sequences of elite particles are preserved. With this strategy, the promising gene blocks of the elites, i.e., their longest common sequences, are kept when executing the reorder operator.
The rest of this paper is organized as follows. Section 2 provides a short introduction to the PSO algorithm and the VRPTW model, and Section 3 reviews the current state of VRPTW research. N-CLPSO and its main strategies are detailed in Section 4. Next, the experimental results and analysis are reported in Section 5. Finally, a summary of this study and future work are presented in Section 6.

PSO
In the standard PSO, the state of each particle i in the t-th generation is described by two vectors: a position vector X_i^t = (x_{i,1}^t, x_{i,2}^t, ..., x_{i,D}^t) and a velocity vector V_i^t = (v_{i,1}^t, v_{i,2}^t, ..., v_{i,D}^t), where D denotes the dimension of the problem to be optimized. X_i^t is a candidate solution, and V_i^t represents the search direction and step size of particle i in the t-th generation. During the population search, each particle adjusts its flight path according to its own historical best position PB_i^t = (pb_{i,1}^t, pb_{i,2}^t, ..., pb_{i,D}^t) and the population's historical best position GB_i^t = (gb_{i,1}^t, gb_{i,2}^t, ..., gb_{i,D}^t). The update rules for V_i^t and X_i^t are defined as Eq. (1) and Eq. (2), respectively:

v_{i,d}^{t+1} = w · v_{i,d}^t + c_1 · r_1 · (pb_{i,d}^t − x_{i,d}^t) + c_2 · r_2 · (gb_{i,d}^t − x_{i,d}^t)    (1)

x_{i,d}^{t+1} = x_{i,d}^t + v_{i,d}^{t+1}    (2)

where w denotes an inertia weight that controls the influence of the current velocity on the new velocity; c_1 and c_2 are two constants that determine the learning weights on PB_i^t and GB_i^t, respectively; and r_1 and r_2 are two random numbers uniformly distributed in the interval [0, 1].
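The two update rules above can be sketched in a few lines of Python; the parameter values below (w ≈ 0.729, c1 = c2 ≈ 1.49445) are common defaults from the PSO literature, not values prescribed by this paper.

```python
import random

def pso_update(x, v, pbest, gbest, w=0.729, c1=1.49445, c2=1.49445):
    """One step of the standard PSO update, Eq. (1) and Eq. (2)."""
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = random.random(), random.random()
        # Eq. (1): inertia + cognitive pull toward pbest + social pull toward gbest
        vd = w * v[d] + c1 * r1 * (pbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)  # Eq. (2): move along the new velocity
    return new_x, new_v
```

With both attractors at the same point, every velocity component points toward that point, which is the degenerate case of full-swarm agreement.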
To improve the global search capability of the basic PSO, Liang et al (2006) proposed CLPSO, in which a new velocity update rule, described as Eq. (3), is used to prevent premature convergence:

v_{i,j}^{t+1} = w · v_{i,j}^t + c · r_j · (pb_{f(i),j}^t − x_{i,j}^t)    (3)
where f(i) represents the index of the particle whose personal best guides particle i in the j-th dimension; the exemplar f(i) can be any particle, including particle i itself.
To select f(i) in each dimension, CLPSO generates a random number r, which is compared with Pc_i, the learning probability that controls whether particle i learns from its own history or from other particles, defined as Eq. (4):

Pc_i = 0.05 + 0.45 · (exp(10(i − 1)/(N − 1)) − 1) / (exp(10) − 1)    (4)
where N is the population size. If r is greater than Pc_i, particle i learns from its own personal history. Otherwise, particle i selects the particle f(i) as its learning exemplar through binary tournament selection. With this learning strategy, particles can learn not only from themselves but also from the best features of other particles. These features give the particles more learnable samples and a larger potential flight space. Thus, CLPSO can utilize the helpful information in the population more efficiently and generate higher-quality solutions. Experimental results in (Chen et al, 2010) show that CLPSO performs well on discrete optimization. Hence, we use CLPSO as the basic framework to solve VRPTW problems.
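A minimal sketch of this selection scheme; `learning_prob` follows the standard CLPSO formula for Pc_i given in Eq. (4), and the tournament below assumes a minimization problem with at least three particles.

```python
import math, random

def learning_prob(i, N):
    """CLPSO learning probability Pc_i, Eq. (4); particles indexed 1..N."""
    return 0.05 + 0.45 * (math.exp(10 * (i - 1) / (N - 1)) - 1) / (math.exp(10) - 1)

def choose_exemplar(i, pbests, fitness):
    """Per-dimension exemplar: learn from own pbest when r > Pc_i,
    otherwise from the winner of a binary tournament (minimization)."""
    N, D = len(pbests), len(pbests[i])
    pc = learning_prob(i + 1, N)  # Eq. (4) uses 1-based indices
    exemplar = []
    for d in range(D):
        if random.random() > pc:              # learn from own history
            exemplar.append(pbests[i][d])
        else:                                 # binary tournament over the others
            candidates = [p for p in range(N) if p != i]
            a, b = random.sample(candidates, 2)
            winner = a if fitness[a] < fitness[b] else b
            exemplar.append(pbests[winner][d])
    return exemplar
```

Note how Pc_i grows exponentially with the particle index, so high-index particles learn from others far more often than low-index ones.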

Mathematical definition of VRPTW
VRPTW consists of finding the lowest-cost routes to serve customers in a given geographic area with a fleet of identical vehicles within given time windows. The total demand served by each vehicle must not exceed the vehicle's capacity, and each customer is served exactly once, within a defined time window. A vehicle arriving at a customer before the start of that customer's time window must wait until the window opens. Similarly, a customer cannot be served if a vehicle arrives at their location after the end of their time window.
VRPTW is a problem in which a fleet of K vehicles serves M customers. Each vehicle has a constant capacity Q. The depot v_0 is the start and end point of each route. Vertex v_i, i ∈ {1, 2, ..., M}, is defined as a customer. Customer v_i is located at (x_i, y_i); its demand for goods is q_i, and its delivery time window is [b_i, l_i], where b_i and l_i refer to the earliest and latest times at which service at the customer may start, respectively. If a vehicle arrives at customer v_i earlier than b_i, it must wait until the start of the time window to serve the customer; on the other hand, if the vehicle does not arrive before l_i, it cannot serve customer v_i. The service time of each customer is s_i. The depot is located at (x_0, y_0) with demand q_0 = 0 and time window [0, l_0], where l_0 ≥ max_i(l_i). For simplicity, the travel-time cost of a vehicle traveling from customer i to customer j is represented by the Euclidean distance between the nodes (d_{i,j} = d_{j,i}), where i ≠ j, i, j ∈ {1, 2, ..., M}.
VRPTW has two objectives, defined as Eq. (5) and Eq. (6), respectively. The primary goal is to minimize the number of vehicles (NV), and the secondary goal is to minimize the total distance (TD) for the same number of routes. VRPTW can be mathematically formulated with the following decision variables:

x_{i,j}^k = 1 if vehicle k travels directly from i to j, and 0 otherwise;
y_i^k = 1 if customer i is served by vehicle k, and 0 otherwise.
The goal of the VRPTW is to minimize the number of vehicles K and, secondarily, the total travel distance. Among the constraints, Eq. (13) enforces the time windows:

b_j ≤ t_j ≤ l_j,  ∀ j = 1, ..., M    (13)

Constraints Eq. (7)-Eq. (9) ensure that each customer is served by exactly one vehicle. Constraint Eq. (10) ensures that no vehicle carries more than its capacity Q. Constraint Eq. (11) ensures that all routes start from the depot. Eqs. (12)-(14) define the time-window constraints, where t_i is the vehicle's arrival time at node i; w_i is the waiting time at customer i before service can start; s_i is the service time; and d_{i,j} is the travel-time cost between nodes i and j.
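The capacity and time-window constraints above can be checked for a single route with a short routine; the data layout (demand, time-window, and distance containers) is an assumption for illustration, with travel time taken equal to the distance d_{i,j} as in the model.

```python
def route_feasible(route, demand, tw, service, dist, capacity):
    """Check capacity and time-window feasibility of one route.

    route:   customer indices in visiting order, depot (node 0) excluded
    demand:  demand[i] = q_i                    (demand[0] = 0)
    tw:      tw[i] = (b_i, l_i), earliest/latest service start (tw[0] = depot)
    service: service[i] = s_i
    dist:    dist[i][j] = travel time between nodes i and j
    """
    if sum(demand[c] for c in route) > capacity:  # constraint (10)
        return False
    t, prev = 0.0, 0                       # leave the depot at time 0
    for c in route:
        t += dist[prev][c]                 # travel to the next customer
        if t > tw[c][1]:                   # arrived after the latest start
            return False
        t = max(t, tw[c][0]) + service[c]  # wait if early, then serve
        prev = c
    return t + dist[prev][0] <= tw[0][1]   # return before the depot closes
```

Reversing a feasible route can break feasibility, which is exactly why the time-window constraints make VRPTW much harder than the plain CVRP.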
Literature review

VRP and its variants have been extensively studied with different intelligent algorithms in the past decades. For instance, ACO, inspired by the foraging behavior of real ants, is a popular probabilistic algorithm for solving VRPTW. Considering the customer service time, Wang et al (2019) designed a multi-ant system with local search, which combines the Multi Ant System algorithm and four local search operators to improve solution quality. Gupta and Saini (2017) proposed an improved ACO for the VRPTW, which uses a new pheromone reset-and-update function to enhance the global search capability of the algorithm and improves the optimized path with the 2-opt method. Zhang et al (2019) designed a solution strategy based on ACO and three variation operators to solve a multi-objective VRP with flexible time windows.
GA is a heuristic algorithm that simulates genes, chromosomes, and the genetic evolution of organisms. Due to its strong search performance and good extensibility, GA has been widely used for VRPTW problems. Gocken et al (2017) proposed a hybrid version of the GA that performs a scan calculation on the initial population, allowing the algorithm to start searching directly from high-quality solutions and thus produce better solutions. Moon et al (2012) extended the conventional VRPTW model to a VRPTW with overtime and outsourcing vehicles (VRPTWOV), which allows drivers to work overtime and outsourcing vehicles to be used; an integer programming model, a GA, and a hybrid algorithm based on simulated annealing were proposed to solve the VRPTWOV problem. Zhang et al (2020) took VRPTW as the research object and proposed a hybrid multi-objective evolutionary algorithm with a fast-sampling-strategy-based global search and a route-sequence-difference-based local search (HMOEA-GL).
PSO, as a typical swarm intelligence optimization algorithm, imitates the foraging behaviour of a flock of birds, and it was mainly applied to continuous problems when first proposed. In recent years, some scholars have tried to improve the standard PSO to make it applicable to combinatorial optimization problems such as VRPTW. Marinakis et al (2019) proposed a Multi-Adaptive PSO (MAPSO), which improves the traditional PSO in three aspects: initialization, position update, and hyperparameters. Moreover, two memories, i.e., a global best memory (GBM) and a personal best memory (PBM), are proposed in MAPSO to update the particles' position information. Zhang et al (2018) proposed an evolutionary scattering search PSO (ESS-PSO) to solve VRPTW, which introduces a GA and a new "route +/-" evolutionary operator; the algorithm redefines the velocity and position update rules based on the concept of "destroy and rebuild".
Moreover, some studies on VRPTW have indicated that an efficient local search strategy is a useful means of helping individuals jump out of local optima. Therefore, designing efficient local search strategies has become one of the research hot spots on VRPTW and has attracted increasing attention in recent years. For instance, Liu and Jiang (2019) designed an efficient algorithm combining large neighborhood search and GA. The algorithm uses a constraint relaxation scheme to extend the search space through neighborhood search over existing infeasible solutions, and it initiates a GA to explore undiscovered space when the search falls into a local optimum. Yu et al (2011b) introduced a neighborhood search operator on top of ACO and used tabu search to protect population diversity while exploring new solutions.
Since VRPTW is an NP-hard problem with high computational complexity and many constraints, this study proposes N-CLPSO to solve it. In this section, Section 4.1 presents the framework of N-CLPSO, and Section 4.2 describes its basic operators. Section 4.3 introduces the vehicle insertion strategy, Section 4.4 presents the guided reinsertion operator based on local information, and Section 4.5 details the remove-reinsert-based neighborhood search strategy. The diversity retention strategy based on elite fragments is described in Section 4.6.

Framework of N-CLPSO
N-CLPSO is a combination of CLPSO and a newly proposed local search strategy. Although CLPSO has good global search capability when optimizing continuous problems, its encoding and related operators need to be redefined according to the distinct characteristics of VRPTW.
We use the vector X_i^t to denote the position of particle i in the t-th generation. Each element takes the form x_{i,d}^t = d_{k,l}, which indicates that the left and right neighbors of node d in particle i in the t-th generation are k and l, respectively. To ensure that each position is a valid solution, the element x_{i,d}^t must satisfy three constraints: (1) d ∈ {0, 1, 2, ..., n}, where n is the number of cities; (2) k, l ∈ {0, 1, ..., d − 1, d + 1, ..., n}; and (3) k ≠ l. Constraint (1) ensures that each element is a valid city, constraint (2) ensures that no city is adjacent to itself, and constraint (3) ensures that a city's two neighboring points are not duplicated. Taking a VRPTW problem with four cities as an example, the route sequence 0 → 2 → 4 → 3 → 1 → 0 can be represented by five arcs: (0_{1,2}, 1_{3,0}, 2_{0,4}, 3_{4,1}, 4_{2,3}). For each node d there also exists a probability set over A_d, where A_d denotes the set of all arcs that can be adjacent to node d and p(u, v) ∈ [0, 1] is the probability associated with each arc <u, v>.
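The arc encoding can be reproduced directly; this sketch maps each node of a closed tour to its (left, right) neighbour pair, matching the five-arc example above.

```python
def route_to_arcs(route):
    """Encode a closed tour as node -> (left neighbour, right neighbour).

    route is the visiting order including the depot once, e.g. [0, 2, 4, 3, 1];
    the tour implicitly returns from the last node back to the first.
    """
    n = len(route)
    arcs = {}
    for idx, d in enumerate(route):
        left = route[(idx - 1) % n]    # predecessor on the closed tour
        right = route[(idx + 1) % n]   # successor on the closed tour
        arcs[d] = (left, right)
    return arcs
```

For the tour 0 → 2 → 4 → 3 → 1 → 0 this yields exactly the arcs 0_{1,2}, 1_{3,0}, 2_{0,4}, 3_{4,1}, 4_{2,3} listed in the text.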
The framework of N-CLPSO is illustrated in Fig. 1. N-CLPSO follows the framework of CLPSO with the addition of a vehicle insertion strategy (see Section 4.3), a diversity retention strategy (see Section 4.6), and a neighborhood search strategy (see Section 4.5). After the N-CLPSO velocity and position update (see Section 4.2), we use the vehicle insertion strategy to minimize the number of vehicles in the feasible solution. Then, two trigger conditions are checked, after the vehicle insertion step and after the update of the individual optimal solution Pbest, to determine whether the diversity retention and neighborhood search strategies are started, respectively. Finally, after updating the global optimal solution Gbest, N-CLPSO continues with the next iteration until the stopping condition is reached.
It has been pointed out in Section 2 that VRPTW has two objectives. In this study, we adopt the decision method of (Gong et al, 2012) to handle the number of vehicles NV and the total distance TD of VRPTW. To express the priority of the NV objective over the TD objective, we normalize TD into the range [0, 1] and add it to NV. The objective function of N-CLPSO is defined as Eq. (15), where NV(X_i^t) and TD(X_i^t) denote the number of vehicles and the total distance corresponding to particle X_i^t, respectively.
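Since the exact form of Eq. (15) is not reproduced here, the following is only a plausible sketch of the decision method described above: TD is normalized into [0, 1] by a divisor td_max (an assumed normalizer, e.g. the largest total distance in the current population), so the vehicle count always dominates.

```python
def fitness(nv, td, td_max):
    """Sketch of a fitness combining both objectives: the normalized distance
    term td / td_max lies in [0, 1], so a solution with fewer vehicles (nv)
    always scores strictly better, whatever its total distance."""
    return nv + td / td_max
```

Under this construction a 3-vehicle solution with the worst distance in the population still beats any 4-vehicle solution, which is precisely the lexicographic priority stated in the text.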

Basic operators in N-CLPSO
In N-CLPSO, inspired by the literature (Gong et al, 2012), we define new operators to update the position and velocity of each particle over sets of arcs and their probabilities. The four proposed operators are defined as Eqs. (17)-(20), respectively: c · v_{i,d}^t and v_{i,d}^t + v_{j,d}^t denote variations of the arc probabilities p(u, v); x_{i,d}^t − x_{j,d}^t denotes the set-difference operation; and c · (x_{i,d}^t − x_{j,d}^t) converts a crisp set into a set with probabilities. Taking the velocity update of Eq. (3) as an example, a probability set such as {..., <1, 4>/0.6, <4, 1>/0.24} can be obtained.
In order to speed up the convergence of the algorithm, this study makes certain improvements to the traditional CLPSO, shown as Eq. (21) and Eq. (22).
where N is the size of population, sc i is the ranking of particle i in terms of its fitness value in the population, and n is the number of selected sample particles.
Unlike the traditional CLPSO, in which a particle selects its learning exemplars based on its index, a particle in N-CLPSO adjusts its learning probability and the number of learning samples based on its fitness value. Concretely, the better a particle's fitness is, the smaller its selection probability Pc and its number of learning exemplars n will be, thus preserving its good genes. On the contrary, if a particle's fitness value is worse, both Pc and n become larger, giving the particle a greater probability of learning from other, outstanding particles, which is beneficial for speeding up convergence.
Based on the velocity generated by the above equations, the position of a particle is constructed from three sets: S_V = {m | <k, m> ∈ V_i, and <k, m> satisfies Ω}, S_x = {m | <k, m> ∈ X_i^t, and <k, m> satisfies Ω}, and S_a = {m | <k, m> ∈ A, and <k, m> satisfies Ω}. The arcs <k, m> come from the sets V_i, X_i^t, and A, respectively, while satisfying the constraints Ω. First, we draw a random number r ∈ [0, 1] so that arcs with larger probability are more likely to be selected: only arcs in V_i^t with probability greater than r are added to V_i. Second, X_i^t and A are the position of the previous generation of the particle and the set of all possible arcs in the search space, respectively. As in Algorithm 1, the new position X_i^{t+1} is initialized to the empty set, and, subject to the constraints, each vehicle departs from the depot and iteratively selects the next customer node adjacent to the current one. Suppose k is the customer currently being served by the vehicle and m is the next customer to be visited. If there is an available node in S_V, we select m from S_V; otherwise, we select m from S_x; and when there is no available node in either S_V or S_x, we select m from S_a. After m is selected, the arc <k, m> is added to X_i^{t+1}, and the search continues with m as the current customer until all customers have been visited. When there is no available node in any of S_V, S_x, and S_a, the constraints of VRPTW cannot be satisfied, so a new route must be created. Specifically, a depot node is inserted after the current customer k, and the next customer m is reselected with the depot as the starting point, thus ensuring the feasibility of X_i^{t+1}. Finally, the updated X_i^{t+1} replaces the current position X_i^t. In addition, we also use the heuristic selection method NNH (Gong et al, 2012) to speed up the convergence of the algorithm.
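The S_V → S_x → S_a fallback described above can be sketched as follows; the container shapes and the `feasible` predicate are assumptions for illustration, standing in for the constraints Ω.

```python
import random

def next_customer(current, velocity, prev_position, all_arcs, feasible, r):
    """Pick the next customer after `current` (sketch of the fallback in Algorithm 1).

    velocity:      {(k, m): probability} arcs of V_i^t
    prev_position: set of arcs (k, m) from X_i^t (previous generation)
    all_arcs:      every arc (k, m) in the search space A
    feasible:      predicate for the VRPTW constraints (capacity, time
                   windows, m not yet visited)
    r:             threshold in [0, 1]; only velocity arcs with probability
                   greater than r survive into S_V
    """
    s_v = [m for (k, m), p in velocity.items()
           if k == current and p > r and feasible(k, m)]
    if s_v:
        return random.choice(s_v)
    s_x = [m for (k, m) in prev_position if k == current and feasible(k, m)]
    if s_x:
        return random.choice(s_x)
    s_a = [m for (k, m) in all_arcs if k == current and feasible(k, m)]
    if s_a:
        return random.choice(s_a)
    return None  # no feasible successor: open a new route from the depot
```

Returning `None` corresponds to the case in the text where a depot node must be inserted after the current customer to start a new route.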

Vehicle insertion strategy
It is well known that the number of vehicles is one of the crucial factors determining a VRP's difficulty. In reality, each additional vehicle is likely to increase the cost significantly. Therefore, in this paper, a simple insertion scheme is applied to reduce the number of vehicles after each particle completes its position update. The vehicle insertion strategy is shown in Algorithm 2. Take Fig. 2 as an example, where node 1 is the depot and nodes 2-7 are six customers. The route r = {1, 2, 3, 1, 5, 6, 1, 4, 7, 1} contains three sub-paths, i.e., V_c(1) = {1, 2, 3, 1}, V_c(2) = {1, 5, 6, 1}, and V_c(3) = {1, 4, 7, 1}. First, we remove sub-path V_c(1) from route r to obtain an intermediate route r*. After that, the nodes {2, 3} in V_c(1) are inserted into r* through the guided insertion strategy (see Section 4.4). When all nodes are inserted successfully, as shown in Fig. 2(a), a new route r consisting of two new sub-paths V_c(1) and V_c(2) is obtained, and the strategy continues by removing the new V_c(1). If any node cannot be inserted into r*, as shown in Fig. 2(b), the route keeps the removed sub-path in its original position.
(Algorithm 1: pseudocode for the position update process in N-CLPSO.)

Guided reinsertion operator based on local information
The VRPTW problem has a very complex spatio-temporal distribution: the customer points are not only distributed in different locations in space (spatial distribution characteristics) but also have their own distinct time windows (temporal distribution characteristics). If the spatial locations of two customers are close but their time windows are very different, directly connecting the two customers in a route may result in a long waiting time for a vehicle, which degrades the quality of the solution. On the contrary, if the time windows of two customers are close but the customers are far apart, infeasible solutions may be generated when the two customers are served by the same vehicle. Therefore, it is necessary to consider both time and space factors when solving VRPTW.
In this section, a guided reinsertion operator based on local information is proposed. To create the local information matrices, the space-time distribution characteristics of the customer points, elite segment information, and insertion cost are considered. Then, we use the local information to guide a customer's reinsertion into a path. In this study, the local information is divided into two modules: one is the customer information matrix (IM), and the other is the route cost matrix (CM), which records the route cost increase after a customer insertion. Details of the two modules are introduced as follows.

Information matrix
Inspired by the study in (Jiang et al, 2022), the local information among customers can be utilized to quickly calculate the probability of adjacency between two customers and then to obtain a higher-quality solution set after the crossover operation. Therefore, the IM proposed in this work is constructed from the space-time distribution characteristics of the nodes and the information on elite segments of the particles in N-CLPSO. Concretely, IM_{i,j}^t, defined as Eq. (23), denotes the probability that customers i and j can be served consecutively by the same vehicle.
From Eq. (23) we can see that IM_{i,j}^t contains two components. The first component is DST_max − DST_{i,j}, where DST_{i,j}, detailed in Eq. (24), represents the space-time distance between customers i and j; DT_{i,j} and DIS_{i,j} denote the time distance and the spatial distance between customers i and j, respectively, and the latter can be expressed as the Euclidean distance between the two points. The time windows tend to better reflect the probability that two different customers can be served by the same vehicle. Generally, the probability of two customers being served by the same vehicle decreases when the interval between their time windows is very small: in that case, the vehicle is likely to violate the latest-start constraint of the latter node after finishing the service of the former node. To avoid this problem, intuitively, we want the vehicle to have plenty of time to serve the latter customer. Therefore, we adopt the idea in (Qi et al, 2012) to select the next customer to serve. Concretely, we measure the time distance by the amount of time saved when the vehicle arrives before the end of the time window. Suppose there exist a customer i and a customer j whose time windows are [a, b] and [c, d], respectively, and let k1 and k2 denote the cost coefficients of the vehicle's remaining service time and waiting time, respectively. Then, the saved time can be computed from the time t′ at which the vehicle arrives at j from i, as in Eq. (25).
When the vehicle arrives early at customer j, it must wait until the start of the service window c. In this case, the time saved S_{i,j} equals the length of the customer's time window minus the vehicle's waiting time. If the vehicle arrives within the time window of point j, the saved time S_{i,j} equals the end of the time window d minus the arrival time t′. If the arrival time exceeds the end time d, the customer cannot be served. It is easier for a vehicle to travel from customer i to customer j when S_{i,j} is larger. To be consistent with the spatial distance, we define the time distance between two customers as in Eq. (26).
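A hedged sketch of this saved-time computation: the branch structure follows the description above, but the exact placement of the coefficients k1 and k2 in Eq. (25) is an assumption.

```python
def saved_time(t_arrive, c, d, k1=1.0, k2=1.0):
    """Time saved S_ij when a vehicle reaches customer j at t_arrive, where
    [c, d] is j's time window (one plausible reading of Eq. (25); how the
    coefficients k1 and k2 enter the formula is an assumption).
    """
    if t_arrive < c:
        # early arrival: window length, penalized by the waiting time c - t_arrive
        return k1 * (d - c) - k2 * (c - t_arrive)
    if t_arrive <= d:
        # on-time arrival: remaining slack until the window closes
        return k1 * (d - t_arrive)
    return float("-inf")  # too late: customer j cannot be served
```

Larger values mean the move i → j leaves more slack, matching the statement that a larger S_{i,j} makes it easier to go from i to j.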
The second component on the right side of Eq. (23) is CT, which represents the local information of the elite individuals, defined as Eq. (27):

CT_{i,j} = (ct − ct_min) / (ct_max − ct_min)    (27)
where ct denotes the total number of times that customers i and j have been adjacent to each other from the beginning of the run to the current generation; ct_max and ct_min denote the maximum and minimum adjacency counts over all customer pairs, respectively.
In evolutionary algorithms, individuals with better fitness values are more likely to produce promising offspring. Since such individuals are usually closer to the optimal solution, their common fragments are more likely to be part of it. Thus, if two adjacent customers appear frequently in elite individuals, the probability that they are served by the same vehicle in the optimal solution is also high.
During the search process, the excellent genes of elite individuals gradually spread through the whole population. If the diffusion rate of the favorable genes is fast, the convergence speed of the algorithm improves, but the population may easily fall into a local optimum. Conversely, if the diffusion rate is slow, population diversity is maintained, but convergence slows down. Therefore, a linearly time-varying coefficient a_t, defined as Eq. (28), is introduced in this study to adjust the diffusion rate of the good genes:

a_t = t / t_max    (28)
where t is the current number of generations and t max is the maximum number of generations.
It can be seen from Eq. (28) that at the early stage of the optimization process, the value of a_t is small. In this case, the diffusion rate of the excellent genes of elite individuals is slow, which is conducive to preserving population diversity and enhancing the global search ability of the population. On the contrary, a larger a_t at the later stage of optimization brings a higher diffusion rate of excellent genes, which is beneficial for speeding up convergence.
As mentioned above, a larger IM_{i,j}^t indicates a higher probability that customers i and j are served by the same vehicle in the t-th generation. It is worth noting that the size of IM is determined by the number of customers: when the number of customers is N, the size of IM is N × N. Thus, to reduce the computational cost, we only update IM when the global optimal solution is updated.

Cost matrix
To speed up the convergence, we also propose a cost matrix CM that assists the information matrix in guiding the repair. CM finds the best insertion position in a path with the help of a greedy idea, i.e., the insertion that brings the least increase in total cost. As shown in Fig. 3, suppose a point i is to be inserted into a path r of length L + 1. In the insertion process, we not only need to determine whether node i satisfies the constraints after insertion, but also need to calculate the corresponding cost increment. Note that if a constraint is violated after inserting node i at a position in the path, the corresponding cost increment is ∞. From the values of CM in Fig. 3, node i satisfies the insertion conditions at positions L − 1 and L, so the cost increments after insertion are saved to CM. From the above definitions of IM and CM, a larger IM_{i,j}^t denotes a stronger correlation between node i and node j, while a smaller CM_l means a smaller routing cost increment when the node is inserted at location l. Therefore, both IM and CM are considered when inserting node i into a path. We first rank the candidate positions by IM in descending order and by CM in ascending order, so that rank 1 is the most promising in each matrix. Then, the insertion position of node i is determined by the sum of the ranking values of each candidate position in the two matrices. For example, if a candidate position is ranked 3rd in IM and 8th in CM, its priority value equals 11. Finally, the position with the lowest priority value is selected for inserting node i. It is worth noting that if node i cannot be inserted at any location in the current path, the node starts a new route from the depot. See Algorithm 3 for the guided reinsertion operator based on local information.
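The rank-sum selection can be sketched directly; the 3rd-in-IM plus 8th-in-CM example above corresponds to a priority value of 11 under this scheme.

```python
def best_insertion(im_scores, cost_increments):
    """Pick the insertion position with the lowest summed rank (Section 4.4).

    im_scores[l]:        information-matrix score of position l (higher = better)
    cost_increments[l]:  extra route cost of inserting at l (lower = better;
                         float('inf') if the insertion violates a constraint)
    Returns the chosen position index, or None if every position is infeasible.
    """
    L = len(im_scores)
    # rank 1 = most promising: IM sorted descending, CM ascending
    im_rank = {l: r + 1 for r, l in enumerate(
        sorted(range(L), key=lambda l: -im_scores[l]))}
    cm_rank = {l: r + 1 for r, l in enumerate(
        sorted(range(L), key=lambda l: cost_increments[l]))}
    feasible = [l for l in range(L) if cost_increments[l] != float("inf")]
    if not feasible:
        return None  # no feasible slot: start a new route from the depot
    return min(feasible, key=lambda l: im_rank[l] + cm_rank[l])
```

Returning `None` mirrors the fallback in the text where a node with no feasible insertion point opens a new route from the depot.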

Algorithm 3 Guided reinsertion operator based on local information
Input: the current route r, the node set ins to be inserted, the information matrix IM , and the distance matrix Dis
Output: the new route N r
1: N r = r;
2: for i = 1 to |ins| do // |ins| denotes the number of customers to be inserted
3:     l ← the length of path r;
4:     stay ← the depot positions in path r;
       ...
       CM ← the sorted cost matrix of node P d;
       ...
25:    IX ← the position with the lowest sum of the IM and CM rankings;
       ...
32:    end while
33: end for

Removal-reinsert-based neighborhood search
When using PSO to solve VRPTW, an efficient neighborhood search operator plays a crucial role in helping particles jump out of local optima, improving solution accuracy, and accelerating convergence (Chih, 2018). Furthermore, the study in (Shi et al, 2018) verifies that the neighborhood of elite particles is more likely to contain high-quality solutions or even global optima. Therefore, to enhance the local search efficiency of N-CLPSO, we perform the neighborhood search operation on elite individuals rather than on all individuals in the population. Concretely, the neighborhood search operation is only exerted on the individual optimal solution P best.
Following the remove-reinsert idea of LNS (Hong, 2012), this section introduces a removal-reinsert-based neighborhood search method. First, we randomly choose one of the removal methods to remove D customers from the complete route and insert them into the set N t of customers to be inserted. Then, we repair the route with the guided reinsertion operator based on local information (see Section 4.4), i.e., the customer nodes in N t are reinserted into the route, forming a route that traverses all nodes and thus generating a new solution. Finally, we compare the routes before and after the update and keep the better individual for the next iteration.
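The remove-reinsert step above can be sketched as a single elite-improvement move. This is a schematic outline, not the paper's code: `removal_ops` stands in for Algorithms 4 and 5, `reinsert` for Algorithm 3, and `cost` for the route evaluation function:

```python
import random

def neighborhood_search(route, cost, removal_ops, reinsert, rng=random):
    """Sketch of one remove-reinsert move on an elite particle: pick a
    removal operator at random, move D customers to the pending set Nt,
    reinsert them with the guided operator, and keep the better route."""
    op = rng.choice(removal_ops)
    partial, pending = op(route)          # remove D customers -> set Nt
    candidate = reinsert(partial, pending)
    return candidate if cost(candidate) < cost(route) else route
```

Keeping the better of the two routes makes the move strictly non-worsening, which matches the paper's "keep the better individual" rule.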
We determine the number of customers D to be removed from the original route according to Eq. (29),
where N is the number of customers and I is the number of generations for which the optimal solution of the population has not been updated. It can be seen that when the optimal solution stagnates for more generations, more customers are removed; thus, the search range of the neighborhood is increased. It is worth noting that the number of removed customers D in this paper must not exceed N/10, because an overly large D not only increases the computational effort but also makes it impossible to find the best insertion positions for the nodes. The two removal strategies proposed in this study are described below.
(1) In Section 4.4.1, we define IM to measure the probability that two customers are served successively by the same vehicle. With the help of IM, we design an information matrix-based removal strategy, as in Algorithm 4. First, we randomly select a customer point i and insert it into the customer set N t. After that, we randomly pick a point i′ in N t and add the node j to N t, where j denotes the node that is least likely to be adjacent to node i′, i.e., IM^t_{i′,j} = min(IM^t_{i′}); this is repeated until D nodes are collected. Since we only look up IM for evaluation, the time complexity of the operator is O(D).
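The IM-based removal described above can be sketched as follows. This is a minimal reading of Algorithm 4 under the stated assumptions (depot encoded as node 0; names such as `im_based_removal` are illustrative):

```python
import random

def im_based_removal(route, IM, D, rng=random):
    """Sketch of Algorithm 4: seed Nt with one random customer, then
    repeatedly pick some i' from Nt and remove the customer j least
    likely to be adjacent to i' according to IM (smallest IM[i'][j]).
    Assumes D is at most the number of customers in the route."""
    customers = [c for c in route if c != 0]        # node 0 = depot
    nt = [rng.choice(customers)]
    while len(nt) < D:
        i = rng.choice(nt)
        rest = [c for c in customers if c not in nt]
        j = min(rest, key=lambda c: IM[i][c])
        nt.append(j)
    partial = [c for c in route if c == 0 or c not in nt]
    return partial, nt
```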
(2) We propose a removal operator based on the customer's removal cost, whose pseudo-code is shown in Algorithm 5. The algorithm calculates, for each customer point, the difference in cost incurred by removing it from the original path, i.e., its removal cost. We then select customers to insert into N t by roulette-wheel selection over the removal costs of the customers. The time complexity of removing D customers is O(D).
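The roulette-wheel step of this cost-based operator can be sketched as follows. This is an illustrative fitness-proportionate draw, not the paper's exact code; `removal_costs` is a hypothetical customer-to-cost mapping:

```python
import random

def roulette_pick(removal_costs, rng=random):
    """Sketch of the roulette-wheel selection used by the cost-based
    removal operator: a customer whose removal yields a larger cost
    saving gets a proportionally larger chance of being removed."""
    total = sum(removal_costs.values())
    r = rng.uniform(0, total)
    acc = 0.0
    for customer, cost in removal_costs.items():
        acc += cost
        if r <= acc:
            return customer
    return customer                 # guard against float rounding at the top end
```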
Algorithm 4 Removal strategy based on IM
Input: the information matrix IM and the number of nodes to be deleted D
Output: the deleted node set N t
1: N t = φ;
2: for x = 1 to D do
3:     Randomly select a customer i′ from N t;
4:     Find the node j that is least likely to be adjacent to customer i′ by IM ;
       ...
Algorithm 5 (fragment): Randomly select a customer i in ∆c that does not appear in N t by roulette;

Diversity retention strategy based on elite fragments
In order to maintain the population diversity and improve the quality of the solution, a diversity retention strategy based on elite fragments is designed in this study.
In PSO, the diversity of particles disappears as the number of iterations rises, which directly leads to premature convergence of the algorithm. To prevent the particles from converging prematurely, we need to perturb them. However, although a large random perturbation is beneficial for improving population diversity, it can easily make the particles degenerate and degrade the algorithm's performance. Generally, in evolutionary algorithms, elite individuals contain better genetic fragments. So when perturbing an individual, if we can retain some gene fragments shared with other elite individuals, we can increase diversity while preserving superior genetic information. Therefore, inspired by the longest common subsequence (LCS) (Xia et al, 2022), we propose a diversity retention strategy based on elite fragments.
When using PSO to optimize a VRPTW problem, each particle can be represented by a complete route. Therefore, based on the elite fragment shared between the current particle and an elite particle, the remaining nodes are inserted into the elite fragment one by one through the guided reinsertion strategy to form a new route, and finally the better individual is retained. The retention of elite fragments avoids excessive degradation of particles to some extent, so the strategy perturbs the current particle while safeguarding the performance of the algorithm. Obviously, when two particles are more similar, their extracted fragment is longer and the set of nodes to be inserted is smaller. Conversely, if two particles differ more, their common sequence is shorter and more nodes need to be reinserted, which is more conducive to increasing population diversity and helping the algorithm jump out of local optima. Fig. 4 depicts the process of the proposed diversity retention strategy. For example, the sample particle r1 is {1, 2, 5, 6, 1, 3, 4, 7, 8, 1} and the current particle r2 is {1, 3, 6, 1, 2, 4, 7, 1, 5, 8, 1}. The LCS of r1 and r2 is then {1, 6, 1, 4, 7, 8, 1}, and the set of nodes to be inserted is {2, 3, 5}. After that, we insert the customer nodes in this set into the LCS to obtain the new route r3. Note that finding the LCS depends on the lengths of r1 and r2, so its time complexity is O(n · m).
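The elite-fragment extraction above is the standard O(n · m) LCS dynamic program, which can be sketched and checked against the paper's own example:

```python
def lcs(r1, r2):
    """Standard O(n*m) dynamic program for the longest common
    subsequence of two routes, used here to extract the elite fragment."""
    n, m = len(r1), len(r2)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if r1[i - 1] == r2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # backtrack to recover one LCS (there may be several of equal length)
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if r1[i - 1] == r2[j - 1]:
            out.append(r1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```

On the example routes r1 and r2 from the text, this returns a common subsequence of length 7, matching {1, 6, 1, 4, 7, 8, 1}.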

Experimental evaluation
In this section, Section 5.1 describes the general setup of the experiments, while Section 5.2 introduces the dataset used. In Section 5.3, a sensitivity analysis of each component in N-CLPSO is presented. Section 5.4 gives the experimental results and corresponding discussions.

Setup
In this study, extensive experiments are conducted to investigate the performance of N-CLPSO. In the experiments, the inertia weight w in Eq. (3) is initialized to 0.9 and decreases linearly from 0.9 to 0.4 during the optimization process, and the acceleration coefficient c is set to 2.0. The refresh gap rg of CLPSO is set according to (Liang et al, 2006). The learning probability P c and the sample selection method are given in Eq. (21) and Eq. (22). The parameters of Eq. (24)-Eq. (25) are set to a = 0.5, k 1 = 1 and k 2 = 2, respectively. The population size is set to N = 20. The neighborhood search and diversity retention strategies are activated after 10 and 100 generations of P best and Gbest stagnation, respectively. The optimization process stops when Gbest has stagnated for 10,000 consecutive generations. Each trial is performed independently 5 times.
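The linearly decreasing inertia weight stated above (0.9 down to 0.4) corresponds to the usual schedule, sketched here for concreteness (`t_max` is the assumed length of the optimization horizon):

```python
def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight as used in the experiments:
    w goes from w_start at t = 0 down to w_end at t = t_max."""
    return w_start - (w_start - w_end) * t / t_max
```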

Datasets
N-CLPSO is tested on a classical benchmark of 56 VRPTW instances proposed by Solomon (1987), since the benchmark reflects various real-life scheduling problems. According to their properties, the instances can be classified into three categories: clients distributed in a clustered manner (Class C), clients distributed in a random manner (Class R), and test sets with a mixture of clustering and randomness (Class RC). On this basis, the test set can be further divided into two categories, i.e., problems with smaller vehicle capacities and tighter time windows (C1, R1 and RC1), and problems with larger vehicle capacities and longer dispatch cycles (C2, R2 and RC2).

Sensitivity analysis of components in N-CLPSO
This section aims to clarify the impact of the components proposed in N-CLPSO. We verify the effectiveness of each strategy by adding the components one by one to the original CLPSO. The strategies proposed in this paper focus on speeding up convergence and maintaining population diversity.
In N-CLPSO, we use the new speed selection strategy and the vehicle insertion strategy to speed up convergence. To this end, three algorithms are adopted as competitors to N-CLPSO: "CLPSO", which does not contain the newly proposed strategies; "add-V", in which only the new speed selection strategy is added to CLPSO; and "add-C", in which only the vehicle insertion strategy is added to CLPSO.
The comparison results shown in Fig. 5 demonstrate that N-CLPSO converges faster in these experiments and achieves higher accuracy in the later stages. The convergence speed and solution accuracy of "add-C" are slightly lower than those of N-CLPSO, but higher than those of "add-V" and "CLPSO". Although "add-V" displays solution accuracy similar to "CLPSO", it converges significantly faster. Therefore, both the new speed selection strategy and the vehicle insertion strategy have a positive effect on convergence speed.
In addition, to investigate the advantages of the neighborhood search and diversity retention strategies applied in N-CLPSO, three variants of N-CLPSO are selected as peer algorithms. Specifically, "noLV" and "noZ" denote the algorithms in which the neighborhood search strategy and the diversity retention strategy, respectively, are removed from N-CLPSO, while "noLV+noZ" removes both. The comparison results on most of the datasets, in terms of the percentage error between the best tour length and the best-known tour length for each algorithm, are shown in Fig. 6. It can be observed that adding either the neighborhood search module or the diversity retention strategy brings the obtained solution closer to the global optimum, and the best results are clearly obtained when both strategies are added to N-CLPSO.
In N-CLPSO, we use a combination of the information matrix IM and the cost matrix CM to guide the neighborhood reinsertion strategy. To verify the advantage of this combination, we implemented three variants of N-CLPSO, i.e., N-CLPSO with IM only, N-CLPSO with CM only, and N-CLPSO with both IM and CM. The experimental results in Fig. 7 show that using only CM is better than using only IM on the R101, R201, RC101 and RC201 problems, while using only IM is better than using only CM on the clustered datasets C104 and C204; using both CM and IM clearly gives the best performance.

Comparison with other algorithms
To verify the comprehensive performance of N-CLPSO, we compare it with several state-of-the-art methods for solving VRPTW. Table 1 summarizes the basic information of the selected peer algorithms. In Table 2, we give a comparison of the proposed algorithm with the best-known results, where "NV" and "TD" denote the best-known number of vehicles and the corresponding minimum distance, respectively. "BNV" denotes the best number of vehicles obtained by the proposed algorithm, while "BTD" denotes the minimum distance corresponding to the best number of vehicles. "MNV" and "MTD" denote the average number of vehicles and the average distance obtained by the proposed algorithm, respectively. "Deviation" = (BTD − TD)/BTD, the percentage deviation of the algorithm from the best-known result, is used to measure solution quality. "Std" is the standard deviation over the multiple runs of an algorithm, which is used to measure its stability. It can be seen from Table 2 that N-CLPSO obtains a new, better solution on R101 and reaches 23 optimal solutions on the other tested problems. The average deviations of R1, C1 and RC1 are 0.45%, 0.27% and 0.43%, respectively, all below 0.5%. Meanwhile, the deviations of R2 and RC2 are 2.51% and 1.94%, respectively, which shows that N-CLPSO performs favorably on the class "1" problems. There is no standard deviation for class C2, and in class C1 only C103 and C104 have non-zero values, of 1.08 and 4.82. The average standard deviations of R1, R2, RC1, RC2 and the total data set are 8.85, 12.19, 6.88 and 12.75, respectively, and most of the standard deviations are below 10. The results show that N-CLPSO is stable and has good robustness. Furthermore, N-CLPSO yields more favorable performance on the class "1" problems, which have narrow time windows, while there is a small decrease in its performance on the class "2" problems, which have longer time windows. In Table 3, we selected two recently published state-of-the-art algorithms as competitors for N-CLPSO. Note that the data for each algorithm prioritize the minimum number of vehicles, followed by the shortest path length. The results show that N-CLPSO obtains the best results on 47 of the 56 data sets, while BSO and MOLNS yield the best results on 21 and 17 test problems, respectively. Among the three algorithms, N-CLPSO is able to find the minimum number of vehicles and the shortest path length on most of the data sets; it thus has the strongest overall capability on the vehicle-prioritized problem. In Table 4, the average best results of the given algorithms are presented for each subclass (C1, C2, R1, R2, RC1 and RC2). The results are given in the form NV TD, where NV and TD are the averages of the best minimum number of vehicles and the best minimum total travel distance found in each subclass by the corresponding method, respectively. For instance, the entry "12.25 1218.28" in the second row of the table indicates that the average number of vehicles and the average distance cost obtained by LNS over independent runs are 12.25 and 1218.28, respectively. The best results for each category are highlighted in bold.
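The deviation measure defined for Table 2 is a one-line computation; the sketch below just makes the formula (BTD − TD)/BTD, expressed as a percentage, explicit:

```python
def deviation(btd, td):
    """Percentage deviation of the algorithm's best distance (BTD) from
    the best-known distance (TD), per the definition used in Table 2."""
    return (btd - td) / btd * 100
```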
It can be seen from Table 4 that for the C2 class problems, N-CLPSO obtains the same results as the best-known ones, and that on the R1, RC1 and RC2 problems it obtains smaller travel distances at the expense of a few vehicles. However, on the R2 and C1 class problems, the performance of N-CLPSO is slightly worse than that of CPSO and MOLNS. Similar to the previous comparison, this shows that although the algorithm is better at solving the class "1" problems, it also performs very well on the class "2" problems.

Conclusions and future research
In this paper, we propose a PSO-based algorithm, named N-CLPSO, to solve VRPTW. In the study, two objectives of VRPTW are considered: the primary target is the number of vehicles, while the secondary target is the total distance travelled by all the vehicles.
In N-CLPSO, we use CLPSO as the main framework and redefine the speed and position update operators of CLPSO to make it more suitable for VRPTW. Moreover, we propose an elite-fragment diversity retention strategy and a removal-reinsert-based neighborhood search strategy to address premature convergence and speed up the convergence of PSO. Meanwhile, we apply a novel sample selection strategy and a vehicle insertion strategy to improve the convergence speed of N-CLPSO. The experimental results show that the newly proposed strategies enable N-CLPSO to outperform the other selected state-of-the-art algorithms for VRPTW. Moreover, the properties and characteristics of the newly introduced strategies are also verified by extensive experiments.
Note that, although N-CLPSO yields very promising performance on different VRPTW problems, some issues still need to be investigated further. For example, in addition to customer information and elite fragments, other potentially useful information about VRPTW, such as information on customer clusters, can be extracted and utilized to guide the evolution of the population. Moreover, the location update of particles in N-CLPSO is determined by the three newly introduced sets; thus, how to extract more helpful information from a VRPTW instance and design more efficient evolution operators for PSO is our next work. We also hope to apply the algorithm to other combinatorial optimization problems, especially various variants of the VRP.

Compliance with ethical standards
... V c(i); // add the nodes served by vehicle i to the set ins to be inserted
5: r* = r − V c(i); // remove vehicle i from the path r
6: Insert ins into r* using Algorithm 3; // see Section 4

Fig. 3
Fig. 3 CM building process

Fig. 4
Fig. 4 Illustrative Example of Diversity Strategy

Fig. 5
Fig. 5 Contribution of speed selection strategy and vehicle insertion strategy

Fig. 6
Fig. 6 Contribution of Neighborhood Search and Diversity retention Strategies

Table 1
Information about the comparison algorithm

Table 2 :
Comparison with the best-known results.

Table 2 :
(Continued.)Comparison with the best-known results.

Table 3 :
Comparison with recently published representative algorithms

Table 4
Comparison of average levels.