## 3.2 Optimization objectives

Within a given parameter range there are, in theory, infinitely many feasible well trajectories, so selecting the optimal one is the key step in well trajectory design. Wellbore energy is used as an evaluation indicator of how complex a well is. It is computed from the wellbore curvature and the wellbore deflection rate using mathematical rather than geometric reasoning, and it quantifies the complexity of the wellbore trajectory. In nonlinear curve modeling, the thin elastic line that bends the least when passing through a given set of points is called the minimum-energy curve; it is considered a good criterion because it simply generates a smooth curve that describes the minimum wellbore energy of the trajectory [9, 21]. A lower wellbore energy indicates a smoother well track. Wellbore energy is defined as the arc-length integral of the sum of the squared curvature and the squared deflection rate, with the following mathematical expression:

$${E_W}=\int_{0}^{L} \left[ {k{{\left( x \right)}^2}+\tau {{\left( x \right)}^2}} \right]{\text{d}}x$$

14

where \(k\left( x \right)\) is the wellbore curvature, computed as described above, and \(\tau \left( x \right)\) is the wellbore deflection rate, calculated by the following formula:

$$\tau =\frac{{{K_\alpha }{{\mathop K\limits^{ \bullet } }_\varphi } - {K_\varphi }{{\mathop K\limits^{ \bullet } }_\alpha }}}{{{K^2}}}{\text{sin}}\alpha +{K_\varphi }\left( {1+\frac{{K_{\alpha }^{2}}}{{{K^2}}}} \right){\text{cos}}\alpha$$

15

where \({K_\alpha }\) is the rate of change of well inclination, \({K_\varphi }\) is the rate of change of azimuth, \({\mathop K\limits^{ \bullet } _\varphi }\) and \({\mathop K\limits^{ \bullet } _\alpha }\) are their respective first-order derivatives, and \(K\) is the wellbore curvature. Evaluating Eq. (15) yields the wellbore deflection rate; substituting the wellbore curvature and deflection rate into Eq. (14) then yields the wellbore energy.
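As a numerical sketch of Eq. (14), the energy can be approximated from discretized curvature and deflection-rate samples along measured depth; the function name and sample values below are illustrative, not from the paper.

```python
def wellbore_energy(k, tau, dL):
    """Trapezoidal-rule approximation of Eq. (14):
    E_W = integral of k(x)^2 + tau(x)^2 over measured depth.

    k, tau : curvature and deflection rate sampled at stations (rad/m)
    dL     : uniform measured-depth spacing between stations (m)
    """
    integrand = [ki ** 2 + ti ** 2 for ki, ti in zip(k, tau)]
    # trapezoidal rule with uniform spacing
    return dL * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))

# A smoother profile (smaller k and tau) gives lower wellbore energy.
tau = [0.0005, 0.0006, 0.0005, 0.0004]
e_smooth = wellbore_energy([0.001, 0.0012, 0.0011, 0.001], tau, 30.0)
e_rough = wellbore_energy([0.004, 0.006, 0.005, 0.007], tau, 30.0)
```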

Equation (14) takes \({E_W}\) as an evaluation index for the entire trajectory, accounting for the wellbore energy of the whole wellbore. After normalization it can be transformed into the incremental wellbore energy over a given well section, as follows:

$${E_{{{\left( {abs} \right)}_n}}}=\frac{{\mathop \sum \nolimits_{{i=1}}^{n} ~\left( {K_{i}^{2}+\tau _{i}^{2}} \right)\Delta {L_i}}}{{{L_n}+\Delta {L_n}}}$$

16

Since minimum wellbore energy and minimum track length are often in conflict (reducing one tends to increase the other), the wellbore energy is further normalized to the increment per 100 m well section to facilitate comparison:

$${E_{{{\left( {abs} \right)}_n}}}=\frac{{\mathop \sum \nolimits_{{i=1}}^{n} ~\left( {K_{i}^{2}+\tau _{i}^{2}} \right)\Delta {L_i}}}{{({L_n}+\Delta {L_n})\Delta {L_n}}} \times 100$$

17

The equation above gives the wellbore-energy increment per hundred meters for each control unit, allowing straightforward comparison of the wellbore energy of different candidate wellbores.
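A direct transcription of Eq. (17) might look as follows; the variable names mirror the text, and the inputs are illustrative.

```python
def incremental_energy_per_100m(K, tau, dL, L_n):
    """Eq. (17): wellbore-energy increment normalized per 100 m.

    K, tau : per-section curvature and deflection rate (rad/m)
    dL     : per-section lengths (m); dL[-1] is the current section length
    L_n    : measured depth drilled before the current section (m)
    """
    num = sum((Ki ** 2 + ti ** 2) * dLi
              for Ki, ti, dLi in zip(K, tau, dL))
    return num / ((L_n + dL[-1]) * dL[-1]) * 100.0
```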

The second optimization objective is the track length. In actual drilling, the objective function for trajectory optimization is usually the total trajectory length, in order to reduce drilling costs. In the three-dimensional five-segment trajectory, the total length is the sum of the three hold-angle segments and the two circular-arc segments. The hold-segment lengths are straightforward to obtain, while the arc-segment lengths are solved with the minimum-curvature method. The five segment lengths are then summed to give the track length:

$$L={L_1}+{S_1}+{L_h}+{S_2}+{L_2}$$

18
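As a sketch of Eq. (18): the hold-segment lengths are known directly, while each circular segment's length follows from its dogleg angle and curvature (for a circular arc, S = θ/κ). The numeric inputs below are illustrative.

```python
import math

def arc_length(dogleg_deg, curvature):
    """Length of a circular build/turn segment: S = theta / kappa (theta in rad)."""
    return math.radians(dogleg_deg) / curvature

def total_length(L1, S1, Lh, S2, L2):
    """Eq. (18): sum of three hold segments and two arc segments."""
    return L1 + S1 + Lh + S2 + L2

S1 = arc_length(30.0, 0.003)  # 30 deg dogleg at 0.003 rad/m curvature
L = total_length(500.0, S1, 800.0, arc_length(20.0, 0.003), 300.0)
```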

## 3.4 Optimization of the objective function

Minimizing the trajectory length reduces resource consumption and improves the economic efficiency of drilling, while minimizing the wellbore energy yields a smoother trajectory, which helps ensure safe downhole construction. Multi-objective optimization with both as objectives can therefore obtain a shorter trajectory with smaller wellbore energy.

The objective function for multi-objective intelligent optimization of the well path can be expressed as

$$\hbox{min} Z={(L({L_1},{S_1},{L_h},{S_2},{L_2}),E({L_n},k,\tau ))^T}$$

19

where \(Z\) is the vector of objectives to be minimized, function \(L\) is the track length of the wellbore, and function \(E\) is the wellbore-energy increment.

To confine the solution within the region defined by the constraint functions, we introduce a dynamic penalty function. The dynamic penalty method is an improved penalty-function method for handling constraints in optimization problems. Unlike static penalty functions, it adjusts the penalty coefficient dynamically as the optimization proceeds, handling the constraints better and solving the problem more effectively. The penalty coefficient is no longer a fixed constant: it typically starts small, and as constraint violations are detected during the iterations it is gradually increased, strengthening the punishment for violations and pushing the algorithm more strongly toward feasible solutions. A common implementation is exactly this gradual increase, starting from a small coefficient and growing it until the constraints are satisfied. This approach balances the trade-off between the objective function and the constraints, helps avoid getting trapped in local optima, and usually converges better on complex constrained problems.

Earlier, the constraints were defined as engineering constraints and non-negativity constraints. In practice, they are the build-up-rate constraint of the deflecting tool and the length constraint of the intermediate hold section: the build-up rate is limited to the range \({k_l}\) to \({k_u}\), and the length of the intermediate hold section to the range \({l_l}\) to \({l_u}\). The constraint functions thus satisfy \({k_l} \leqslant g\left( x \right) \leqslant {k_u}\) and \({l_l} \leqslant h\left( x \right) \leqslant {l_u}\). To define the dynamic penalty function, each two-sided inequality is first decomposed into two one-sided inequalities, namely:

$$g\left( x \right) - {k_l} \geqslant 0$$

20

$${k_u} - g\left( x \right) \geqslant 0$$

21

$$h\left( x \right) - {l_l} \geqslant 0$$

22

$${l_u} - h\left( x \right) \geqslant 0$$

23

Then, slack variables \(s1\), \(s2\), \(p1\) and \(p2\) are introduced, one for each of the four inequalities above, so that

$$g\left( x \right) - {k_l}+s1 \geqslant 0$$

24

$${k_u} - g\left( x \right)+s2 \geqslant 0$$

25

$$h\left( x \right) - {l_l}+p1 \geqslant 0$$

26

$${l_u} - h\left( x \right)+p2 \geqslant 0$$

27

The constructed dynamic penalty function is defined as:

$$\begin{gathered} F\left( {x,s1,s2,p1,p2} \right)=f+{\sigma _1} \cdot \hbox{max} \left( {0,{k_l} - g\left( x \right) - s1} \right)+{\sigma _2} \cdot \hbox{max} \left( {0,g\left( x \right) - {k_u} - s2} \right) \hfill \\ +{\sigma _3} \cdot \hbox{max} \left( {0,{l_l} - h\left( x \right) - p1} \right)+{\sigma _4} \cdot \hbox{max} \left( {0,h\left( x \right) - {l_u} - p2} \right) \hfill \\ \end{gathered}$$

28

Here, \(f\left( x \right)\) is the objective function, \(g\left( x \right)\) and \(h\left( x \right)\) are the original constraint functions, and \({\sigma _1}\), \({\sigma _2}\), \({\sigma _3}\) and \({\sigma _4}\) are dynamically adjusted penalty coefficients. The penalized problem is solved iteratively. To start, small penalty coefficients are chosen. In each iteration, satisfaction of the constraints \({k_l} \leqslant g\left( x \right) \leqslant {k_u}\) and \({l_l} \leqslant h\left( x \right) \leqslant {l_u}\) is checked. If the constraints are satisfied, the iteration stops and the final solution is returned; if not, the penalty coefficients \({\sigma _1}\), \({\sigma _2}\), \({\sigma _3}\) and \({\sigma _4}\) are increased and the penalized problem is solved again. This process repeats until an optimal solution satisfying the constraints is found.
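A minimal sketch of the penalized objective, assuming each max(0, ·) term measures violation of the corresponding relaxed one-sided constraint in Eqs. (24)-(27); the functions f, g, h, the bounds, and all numbers are placeholders.

```python
def penalized_objective(x, f, g, h, kl, ku, ll, lu,
                        s1=0.0, s2=0.0, p1=0.0, p2=0.0,
                        sigma=(1.0, 1.0, 1.0, 1.0)):
    """Penalty terms activate only when a relaxed constraint is violated."""
    v1 = max(0.0, kl - g(x) - s1)  # build-up rate below lower bound
    v2 = max(0.0, g(x) - ku - s2)  # build-up rate above upper bound
    v3 = max(0.0, ll - h(x) - p1)  # hold-section length too short
    v4 = max(0.0, h(x) - lu - p2)  # hold-section length too long
    return f(x) + sigma[0] * v1 + sigma[1] * v2 + sigma[2] * v3 + sigma[3] * v4
```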

The penalty coefficients are chosen to grow exponentially over the iterations, gradually approaching the constraint set. The growth rule at each iteration is as follows:

$$\sigma _{1}^{k}={\sigma _1} * {\left( {1+r} \right)^{\left( {k - 1} \right)}}$$

29

$$\sigma _{2}^{k}={\sigma _2} * {\left( {1+r} \right)^{\left( {k - 1} \right)}}$$

30

$$\sigma _{3}^{k}={\sigma _3} * {\left( {1+r} \right)^{\left( {k - 1} \right)}}$$

31

$$\sigma _{4}^{k}={\sigma _4} * {\left( {1+r} \right)^{\left( {k - 1} \right)}}$$

32

where \(\sigma _{1}^{k}\), \(\sigma _{2}^{k}\), \(\sigma _{3}^{k}\) and \(\sigma _{4}^{k}\) are the penalty-coefficient values at iteration \(k\); \({\sigma _1}\), \({\sigma _2}\), \({\sigma _3}\) and \({\sigma _4}\) are the initial penalty coefficients; \(r\) is the growth factor, a positive number that controls the growth rate; and \(k\) is the current iteration number.
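Eqs. (29)-(32) share one form, so a single helper suffices; the values of sigma0 and r below are illustrative.

```python
def penalty_coefficient(sigma0, r, k):
    """Eqs. (29)-(32): sigma^k = sigma0 * (1 + r)^(k - 1).

    sigma0 : initial penalty coefficient
    r      : growth factor, r > 0
    k      : current iteration number, k >= 1
    """
    return sigma0 * (1.0 + r) ** (k - 1)
```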

## 3.5 Optimization Process

Particle Swarm Optimization (PSO), proposed by Eberhart and Kennedy in 1995, is a stochastic global search algorithm that simulates natural group behavior and swarm intelligence. Beyond simulating the group activity of organisms, it incorporates individual cognition and social influence [22].

The multi-objective particle swarm algorithm (MOPSO) differs from single-objective PSO in how the personal best and global best are evaluated [23, 24]. Pareto dominance is used to identify the non-dominated solutions, i.e., solutions for which no other particle is at least as good in every objective and strictly better in at least one. These are stored in an archive, the archived particles are partitioned by a grid method, and the particle density within each grid cell is computed; the solution in the cell with the smallest density is selected as the global optimum.
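The dominance test and the construction of the non-dominated set can be sketched as follows (both objectives minimized; the sample points are illustrative):

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def filter_nondominated(points):
    """Keep only the points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

front = filter_nondominated([(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)])
```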

The multi-objective optimization algorithm for well-track design parameters is adapted from MOPSO; its specific steps are as follows:

First, initialize the parameters and key settings such as the population size and the number of iterations. The initial parameters are distributed within the parameter constraints, and the key parameters of the well track to be drilled are obtained by substituting the design parameters into the track calculation equations. These key parameters are passed to the optimization objective function to compute each particle's fitness, which for the multi-objective function is an array of fitness values. The dominance relations in the population are then initialized and the non-dominated particles are filtered into the non-dominated solution set, the Archive; a particle is non-dominated if no particle in the population is better than it in every objective. The grid is then initialized, as shown in Fig. 3, with boundaries inflated to 1.2 times the maximum and minimum values in each dimension. The grid is divided according to the pre-set number of cells, the cell of each archived particle is located, and the particle density information is computed. The next generation of the population is then produced using an elite strategy.

Let the current evolutionary generation be t. The following iterations are repeated while t is less than the total number of evolutionary iterations.

Each subsequent evolution produces the next generation: for the currently evolving particle j, steps 1) to 3) are completed while j is smaller than the population size.

1) Calculate the density information of the particles in the Archive, using the number of particles in each grid cell as the density: the more particles the cell containing a particle holds, the larger its density value, and vice versa.

2) Select the global optimum based on the density information of the particles in the Archive; when multiple particles share the same density, one is chosen as the global optimum by roulette selection.

3) Update the positions and velocities of the particles in the population. Based on the elite strategy and the NSGA algorithm, particles with a high Pareto rank are selected as the initial particles for the next generation, and the particles search for the optimal solution under the guidance of gBest and pBest. After the update, boundary violations are checked and any out-of-bound values are corrected. The updated parameters are then substituted back into the well-track calculation function to derive the track design parameters, and into the objective function to compute the new fitness values.

4) Insert the newly computed non-dominated solutions into the Archive. When the number of particles in the Archive exceeds its specified size, the excess individuals are deleted to keep the Archive size stable; deletion is random, according to a roulette strategy.

Finally, the particle information in the Archive is output; the Archive is the optimal solution set of trajectory design parameters for the well track to be drilled.
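The grid-density bookkeeping and roulette leader selection of steps 1) and 2) can be sketched as follows; the grid resolution, bounds, and archive contents are illustrative.

```python
import random
from collections import Counter

def grid_cell(point, lo, hi, n_div):
    """Index of the grid cell holding `point` in an n_div-per-axis grid."""
    return tuple(min(int((p - l) / ((h - l) / n_div)), n_div - 1)
                 for p, l, h in zip(point, lo, hi))

def pick_leader(archive, lo, hi, n_div, rng=random):
    """Roulette selection biased toward particles in low-density cells."""
    cells = [grid_cell(p, lo, hi, n_div) for p in archive]
    density = Counter(cells)  # particles per occupied cell
    weights = [1.0 / density[c] for c in cells]
    return rng.choices(archive, weights=weights, k=1)[0]

archive = [(1.0, 5.0), (1.1, 4.9), (4.0, 1.0)]
leader = pick_leader(archive, lo=(0.0, 0.0), hi=(6.0, 6.0), n_div=5)
```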

MOPSO updates velocity and position in the same way as PSO, applying the following equations:

$$v_{i}^{d}=\omega \times v_{i}^{d}+{c_1} \times rand_{1}^{d} \times \left( {pBest_{i}^{d} - x_{i}^{d}} \right)+{c_2} \times rand_{2}^{d} \times \left( {gBes{t^d} - x_{i}^{d}} \right)$$

33


$$x_{i}^{d}=x_{i}^{d}+v_{i}^{d}$$

34


where:

The inertia weight \(\omega\) is a non-negative number that regulates the search range of the solution space, generally initialized to 0.9 and decreased to 0.4 over the iterations. In general, larger values strengthen the global search capability and weaken the local search capability, so \(\omega\) is usually decreased dynamically during the iterations.

\({c_1}\) and \({c_2}\) are the acceleration constants that regulate the maximum step of learning.

\(rand_{1}^{d}\) and \(rand_{2}^{d}\) are two random numbers drawn uniformly from [0, 1] to increase the randomness of the search.
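A per-particle sketch of Eqs. (33)-(34), together with the linearly decreasing inertia weight described above; the parameter values are illustrative.

```python
import random

def inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Linearly decrease omega from w_start to w_end over the iterations."""
    return w_start - (w_start - w_end) * t / t_max

def pso_update(x, v, pbest, gbest, w, c1=2.0, c2=2.0, rng=random):
    """One velocity/position update per Eqs. (33)-(34)."""
    v_new = [w * vi
             + c1 * rng.random() * (pb - xi)
             + c2 * rng.random() * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, pbest, gbest)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new
```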

The iteration rule is shown in Fig. 4: blue dots indicate feasible solutions of the function in the current region, brown dots the individual optimal solutions, and red dots the global optimal solution. Each particle moves toward the global optimum through iterative updates of velocity and position.

As shown in Fig. 5, within a region each particle's update direction is the sum of its inertia vector and the directions toward the respective local optima. The velocity for the next iteration follows Eq. (33); once the velocity is updated, the current position and the updated velocity are substituted into Eq. (34) to obtain the particle's next position. The next generation is produced by these iterative updates in turn. Velocities and positions are then checked against the boundaries, and any value beyond a boundary is set equal to it, completing one iteration.

Some of the steps of MOPSO are shown in Table 4 below:

Table 4

Partial pseudo-code for MOPSO.

**Input**: population size, maximum iterations, learning factors, external archive size N, optimization parameters

**Output**: Pareto front

Begin

**Step 1**: Initialize the population, the algorithm parameters, the external archive, and the dominance relations.

**Step 2**: Calculate the objective-function fitness values; compute particle density information; perform non-dominated sorting; update the individual and global optima; update particle positions, velocities, and the external archive; check for boundary violations.

**Step 3**: Is the convergence condition met? If yes, go to **Step 4**; if no, return to **Step 2**.

**Step 4**: Output the Pareto front.

End

After the Pareto front is obtained, the multi-objective optimization problem is essentially solved. However, further analysis is needed to decide which point on the front to select as the final design track. The usual practice is to take the best point in each dimension plus a utopian solution, but this still leaves more than two candidates, which makes automatic selection difficult. Here the TOPSIS algorithm is used; its full name is Technique for Order Preference by Similarity to Ideal Solution. TOPSIS is a commonly used comprehensive evaluation method that makes full use of the information in the original data, and its results accurately reflect the gaps between evaluation schemes. Its basic idea can be expressed by the following formula:

$$\frac{{f(x) - Worst}}{{Optimal - Worst}}$$

35

In the TOPSIS method, \(Worst\) represents the worst solution and \(Optimal\) the ideal optimal solution. If a solution equals the ideal optimal solution, the expression evaluates to 1; if it equals the worst solution, it evaluates to 0. This expression therefore measures the comprehensive distance of a given solution from the ideal optimal and worst solutions, and the solution with the maximum value is selected as the optimal compromise.

The TOPSIS algorithm mainly provides a way to select from the Pareto front and will not be elaborated further here.
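As a sketch of how Eq. (35) can automate the selection (both objectives minimized, equal weights assumed; this is a simplified reading of TOPSIS, and the sample front is illustrative):

```python
def closeness(values):
    """Eq. (35) per objective (minimization): 1 at the best value, 0 at the worst."""
    worst, best = max(values), min(values)
    return [(worst - v) / (worst - best) for v in values]

def pick_from_front(front):
    """Select the Pareto point with the largest aggregate closeness."""
    score_L = closeness([p[0] for p in front])  # track length
    score_E = closeness([p[1] for p in front])  # wellbore energy
    total = [sl + se for sl, se in zip(score_L, score_E)]
    return front[max(range(len(front)), key=total.__getitem__)]

best = pick_from_front([(1.0, 5.0), (2.0, 2.0), (4.0, 1.0)])
```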

The flow of the multi-objective well-trajectory design parameter optimization method is shown in Fig. 6. The computer monitors the deviation of the actual drilling trajectory from the design trajectory. When the two do not deviate, it checks whether the geological target has been updated; when the actual track deviates from the design trajectory or the target has been updated, the multi-objective optimization process is entered. This process first determines the relevant coordinates and information of the target point, then initializes the iteration parameters, the particle population, and the constraint boundaries. The initialized particle-swarm parameters are the design parameters of the well-track design method. The initialized particles are passed to the well-track design method to obtain the parameters required for track evaluation, which are in turn passed to the track evaluation method (i.e., the objective function) to obtain the fitness values of the swarm for this iteration. The non-dominated solutions are obtained by non-dominated sorting and stored in an external archive. The global and individual optima are then updated, each particle's position and velocity are updated according to the formulas, and the external archive is maintained. The updated positions and velocities are compared against the constraint boundaries, and any values that exceed them are limited according to the corresponding rules. If the iterations are not complete, new fitness values are computed from the newly generated population, and so on. 
When the iterations are complete, the Pareto-optimal solution set is output, the design-trajectory data are updated, and subsequent drilling continues according to the new design trajectory. Finally, the procedure reverts to computerized trajectory monitoring.