Activated Gas Tungsten Arc Welding Process Optimization Using Artificial Neural Network and Heuristic Algorithms


Conventional gas tungsten arc welding (C-GTAW) has many merits, but shallow penetration is widely regarded as its most important drawback. To cope with this low penetration, applying a paste-like coating of activating flux before welding, a process known as activated GTAW (A-GTAW), has recently been proposed. In this paper, the effect of the A-GTAW input parameters (welding speed (S), welding current (C), and percentage of the activating flux combination (TiO2 and SiO2) (F)) on the most important quality characteristics (weld bead width (WBW), depth of penetration (DOP), and consequently aspect ratio (ASR)) of AISI316L parts is considered. A Box-Behnken design (BBD) of experiments has been used to gather the data needed for modeling and optimization, and a back propagation neural network (BPNN), simulated annealing (SA), and particle swarm optimization (PSO) algorithms have been employed for the modeling and optimization themselves. The PSO algorithm has been used both to determine the proper ANN architecture (the number of hidden layers and their corresponding neurons/nodes) and to optimize the resulting ANN model so as to obtain the desired aspect ratio, maximum depth of penetration, and minimum weld bead width. Next, the SA algorithm has been used to check that the PSO result is not trapped in a local minimum. Finally, confirmation experiments have been carried out to evaluate the performance of the proposed method. Based on the results obtained, the suggested method for modeling and optimization of the A-GTAW process is quite efficient (with less than 4% error).


Introduction
The high weld quality and spatter-free surface finish obtained with the gas tungsten arc welding (GTAW) process are the major reasons it is considered for fabricating stainless steel, aluminum, titanium and magnesium alloys. Despite these merits, shallow penetration is regarded as a drawback of GTAW, especially for joining thick parts [1−3]. Several procedures have been introduced to cope with this poor penetration, among which applying a paste-like coating of activating fluxes on the weld surface, known as the activated GTAW (A-GTAW) process, is the most important [4,5]. In this process, a layer of activating flux or fluxes (including oxides, fluorides, and chlorides) is coated on the specimen surface before welding begins. As welding begins, the coated flux layer melts and vaporizes. Owing to two phenomena, reversal of Marangoni convection and arc constriction, the depth of penetration increases and consequently the weld bead width decreases [5,6].
According to the literature, apart from being used for fabricating different materials, namely titanium, aluminum, manganese, and stainless steel (including austenitic and austenitic duplex) alloys, the A-GTAW process can be employed effectively in welding dissimilar metals, because reversal of Marangoni convection stabilizes the DOP [7][8][9][10][11]. In the C-GTAW process, when the thickness of the weldments exceeds 3 mm, a gap is left between the parts, and filling it requires filler metal. In contrast, workpieces of around 8 mm can be joined by the A-GTAW process in a single welding pass, without edge preparation or filler metal [6]. Numerous studies have addressed different aspects of the A-GTAW process.
The C-GTAW process has been optimized using response surface methodology by Pamnani et al. [7] to attain the largest welding penetration. Kumar et al. [12] compared the performance of the A-GTAW and C-GTAW processes; full DOP was achieved only with the A-GTAW process.
Therefore, the A-GTAW process can improve on the performance of the C-GTAW process by increasing DOP and decreasing WBW simultaneously. Venkatesan et al. [13] reported that activating fluxes eliminate edge preparation before welding (for specimens more than 3 mm thick) and reduce the number of passes required. Chern et al. [14] reported distortion reduction and improved mechanical properties as the key advantages of the A-GTAW process. Tathgir et al. [15] employed chloride-, oxide-, and fluoride-based fluxes in dissimilar joining (low alloy and stainless steel). Their results showed that the largest depth of penetration was achieved with oxide-based fluxes, whereas the other fluxes had a negligible effect on DOP.
The literature survey shows that the A-GTAW process has been studied from several angles, but process modeling and optimization are lacking in these papers. No published study has modeled and optimized the A-GTAW output characteristics (especially WBW, DOP, and ASR) simultaneously using a BBD-based design of experiments, ANN-based modeling, and heuristic optimization (SA and PSO algorithms). Because different activating fluxes have different effects on the weld bead geometry and on the mechanical and metallurgical properties of weldments, in this study the combination of the two most crucial activating fluxes (TiO2 and SiO2) has been treated as a process input variable (in addition to welding speed and current) and optimized so that DOP increases, WBW decreases, and a proper value of ASR is achieved simultaneously. Based on preliminary screening experiments designed with a DOE approach and on the literature survey, the process input variables (welding current (I), welding speed (S), and percentage of the activating flux combination (F)) have been selected and their corresponding intervals and levels determined. According to the number of process parameters and their predetermined levels, the most appropriate design matrix (BBD) has been used to carry out the experiments and gather the data required for modeling and optimization. Next, a neural network with a back propagation algorithm (BPNN) has been used to establish the relations between the process inputs (I, S, and F) and outputs (DOP, WBW, and ASR), and the best BPNN architecture has been determined using the PSO algorithm.
Finally, multi-response optimization has been carried out using the PSO algorithm. The SA algorithm has also been used to check the adequacy of the PSO result and to guard against getting trapped in local minima. AISI316L stainless steel parts have been used as the specimens on which the proposed approach has been carried out. Based on the results achieved, an optimized formula for the activating fluxes has been proposed such that a desired ASR is achieved with minimum WBW and maximum DOP.

Determination of process input variables and their intervals
Of the different variables affecting the A-GTAW process, welding speed (S) and welding current (I) are the most influential according to the screening experiments and the literature review [1][2][3]. Furthermore, the percentage of the activating flux combination (F) has been considered as a process input variable in order to obtain the merits of both fluxes. Similarly, the quality characteristics DOP, WBW, and ASR, which are the most important responses of the A-GTAW process, have been optimized simultaneously. Welding references have been studied and preliminary tests conducted to determine the proper working interval of each process parameter [8][9][10][11][12][13][14][15]. The process parameters and their corresponding levels, based on the initial test findings, are listed in Table 1. Other input variables with trivial effects have been held at a fixed optimum level. Argon (99.7% purity) has been used as the inert shielding gas.
AISI316L steel plates (100 mm × 50 mm × 5 mm) have been used as the specimens for the experimental tests. A combination of nano oxide fluxes (TiO2 and SiO2; +99% purity, 20-30 nm, amorphous) has been used. To verify the powder particle size, an FESEM test has been employed; its results are illustrated in Fig. 1. To prepare the paste-like activating flux coating before welding, 20 grams of flux has been mixed with 20 ml of methanol (as carrier solvent) using mechanical and magnetic mixers for approximately 20 minutes each [1,2]. A layer of the paste-like flux has then been applied to the specimen surface. Once the methanol has evaporated and the flux layer remains attached to the surface, welding can begin.

Box-Behnken Design (BBD)
Response surface methodology (RSM) is a powerful tool for determining the effects of process variables and their interactions. There are different RSM designs, including the central composite design (CCD) and its variations (spherical CCD, rotatable CCD, small composite design, etc.), the Box-Behnken design (BBD), and hybrid designs [16]. In this study, a BBD experimental design matrix with 17 runs (L17) has been adopted (Table 2).
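For three factors, a Box-Behnken matrix in coded units can be generated as in the following minimal Python sketch. It assumes five replicated center points, which is one common way to arrive at 17 runs; the function name `box_behnken` and the column order (F, I, S) are illustrative choices and are not taken from Table 2.

```python
from itertools import combinations

def box_behnken(n_factors=3, n_center=5):
    """Return coded Box-Behnken runs (-1, 0, +1) as a list of tuples."""
    runs = []
    # For every pair of factors, run a 2x2 factorial at +/-1 while the
    # remaining factor(s) stay at the mid level (0).
    for i, j in combinations(range(n_factors), 2):
        for a in (-1, 1):
            for b in (-1, 1):
                point = [0] * n_factors
                point[i], point[j] = a, b
                runs.append(tuple(point))
    # Replicated center points (assumed count) allow pure error to be estimated.
    runs.extend([(0,) * n_factors] * n_center)
    return runs

design = box_behnken()
for run_no, (f, i, s) in enumerate(design, start=1):
    print(run_no, f, i, s)
```

The coded levels -1, 0, and +1 would then be mapped to the low, middle, and high settings of each parameter, and the run order shuffled before the experiments are carried out.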

Results of experiments
In this study, the experiments have been conducted in random order to increase the accuracy of the data. Three process output characteristics (WBW, DOP, and ASR) have been measured for each welding sample (Table 2). MIP (microstructural image processing) software has been used to process and determine the DOP and WBW values of each specimen. The results of the measurements are illustrated in Fig. 3.
$$\mathrm{ASR} = e^{-6.60} \times F^{-0.2853} \times S^{2.005} \times C^{-0.392}$$
The effect of two of the main process variables (welding speed and activating flux combination) on WBW and DOP has been studied using 3D response surfaces, with the remaining process variable held at a constant level. The predicted output performance measures as functions of welding speed and activating flux combination are shown in Fig. 4, which demonstrates the interaction effect of activating flux combination and welding speed on the measured characteristics.
Among the different structures proposed for ANNs, the multi-layer perceptron (MLP) has been used extensively because of its merits, including the capability to solve non-linearly separable and continuous problems. The MLP topology comprises an input layer (input variables), one or more hidden layers, and an output layer (output characteristics) (Fig. 5(b)). The weights and biases are adjusted in the training stage in a supervised way: a set of input-output data pairs is given, which allows the MLP to learn the relationships between the input and output parameters. In a back propagation neural network (BPNN), the error of each input-output pair is calculated and then propagated from the output layer (the last one) back to the input layer (the first one), modifying the biases and weights of the MLP in proportion to the error contributed by each neuron [15]. The details are well documented in Refs. [16,19].
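To make the back propagation step concrete, the following NumPy sketch trains a single-hidden-layer MLP with three inputs and three outputs by plain gradient descent. The data are synthetic placeholders rather than the measurements of Table 2, and the hidden layer size, tanh activation, learning rate, and epoch count are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's measurements):
# 17 runs, 3 coded inputs (F, I, S) and 3 outputs.
X = rng.uniform(-1.0, 1.0, size=(17, 3))
Y = np.column_stack([
    0.5 * X[:, 0] - 0.3 * X[:, 2],
    0.4 * X[:, 1] + 0.2 * X[:, 0] * X[:, 2],
    0.1 * X[:, 1] ** 2 - 0.2 * X[:, 2],
])

n_hidden = 6  # assumed hidden layer size
W_in = rng.normal(0.0, 0.5, (3, n_hidden))
b_in = np.zeros(n_hidden)
W_out = rng.normal(0.0, 0.5, (n_hidden, 3))
b_out = np.zeros(3)

def forward(inputs):
    """Return hidden activations and network predictions."""
    hidden = np.tanh(inputs @ W_in + b_in)
    return hidden, hidden @ W_out + b_out

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

lr = 0.1  # assumed learning rate
initial_loss = mse(forward(X)[1], Y)
for epoch in range(3000):
    H, pred = forward(X)
    err = pred - Y                          # error at the output layer
    grad_out = 2.0 * err / err.size         # d(MSE)/d(pred)
    grad_W_out = H.T @ grad_out
    grad_b_out = grad_out.sum(axis=0)
    grad_H = grad_out @ W_out.T             # error propagated back to hidden layer
    grad_pre = grad_H * (1.0 - H ** 2)      # derivative of tanh
    grad_W_in = X.T @ grad_pre
    grad_b_in = grad_pre.sum(axis=0)
    # Gradient-descent update of weights and biases.
    W_out -= lr * grad_W_out
    b_out -= lr * grad_b_out
    W_in -= lr * grad_W_in
    b_in -= lr * grad_b_in

final_loss = mse(forward(X)[1], Y)
print(initial_loss, final_loss)
```

With only 17 experimental runs, a network like this can overfit easily, so in practice some runs would normally be held out for validation.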
Commonly, the architecture of ANN models is determined by trial and error. In this study, by contrast, the PSO algorithm has been used to determine the proper BPNN architecture. The number of hidden layers varies from 1 to 3; therefore a 3-n1-n2-n3-3 structure has been constructed, where n1, n2, and n3 are the numbers of nodes/neurons in the first, second, and third hidden layers, respectively. Training a NN means determining the network weights and architecture that minimize the error between the desired and predicted outputs. The performance of the proposed model is illustrated in Fig. 6.
The key purpose of this study is to obtain the best set of A-GTAW process variables that simultaneously maximizes DOP, minimizes WBW, and attains the desired ASR. Consequently, the process output measures are considered together to build a multiple response in the optimization procedure. The optimal design can thus be defined as a multi-response optimization problem, as given in Equation (4).
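One simple way to let a PSO particle encode the 3-n1-n2-n3-3 structure is to map a four-element position vector to a list of layer sizes, as sketched below. The encoding, the function name `decode_architecture`, and the upper bound of 10 neurons per layer are assumptions for illustration; the source does not state how the particles were encoded.

```python
def decode_architecture(position, max_neurons=10):
    """Map a 4-element particle position in [0, 1] to a list of layer sizes.

    position[0] selects the number of hidden layers (1 to 3);
    position[1:4] select the neurons of each hidden layer (1..max_neurons).
    """
    def unit(v):
        # Clip a coordinate to the unit interval.
        return min(max(v, 0.0), 1.0)

    n_layers = 1 + min(int(unit(position[0]) * 3), 2)
    hidden = [1 + min(int(unit(p) * max_neurons), max_neurons - 1)
              for p in position[1:1 + n_layers]]
    # 3 inputs (F, I, S) and 3 outputs (DOP, WBW, ASR).
    return [3] + hidden + [3]

print(decode_architecture([0.9, 0.35, 0.65, 0.15]))
```

Each decoded architecture would then be trained and scored, for example by its validation error, and that score returned to the PSO as the particle's fitness.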

$$\text{Minimum WBW} = -\mathrm{WBW}(F, I, S)$$
$$\text{Desired ASR} = [1, 1.4] \qquad (4)$$
In this study, multi-criteria optimization is required in order to achieve high DOP, low WBW, and the desired ASR simultaneously. Therefore, the multiple process responses are converted into a single measure using Equation (5).
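As an illustration of how several responses can be folded into one measure, the following Python sketch uses a weighted sum with a penalty for leaving the desired ASR window. This is a hypothetical form and is not the paper's Equation (5): the normalization by reference values, the penalty factor, and the function name `combined_objective` are all assumptions. Only the weights of 0.5 for DOP and WBW and the ASR window of 1 to 1.4 come from the text.

```python
def combined_objective(dop, wbw, asr, dop_ref, wbw_ref,
                       w1=0.5, w2=0.5, asr_range=(1.0, 1.4), penalty=10.0):
    """Single measure to minimize (hypothetical form).

    Deeper penetration lowers the value, a wider bead raises it, and an
    ASR outside the desired window adds a penalty proportional to the
    size of the violation.
    """
    lo, hi = asr_range
    violation = max(lo - asr, 0.0) + max(asr - hi, 0.0)
    return -w1 * dop / dop_ref + w2 * wbw / wbw_ref + penalty * violation

# Made-up numbers, only to show the call; they are not measured values.
inside = combined_objective(dop=5.0, wbw=8.0, asr=1.2, dop_ref=5.0, wbw_ref=8.0)
outside = combined_objective(dop=5.0, wbw=8.0, asr=1.6, dop_ref=5.0, wbw_ref=8.0)
print(inside, outside)
```

Dividing each response by a reference value keeps the two terms on a comparable scale, so that the weights express relative importance rather than differences in units.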

Simulated annealing algorithm
SA is a heuristic algorithm whose methodology resembles the cooling of molten metal in the annealing process [21][22][23]. At high temperatures in the annealing process, the atoms of the molten metal can move freely with respect to each other, but as the temperature is reduced, their movement becomes restricted. The atoms start to arrange themselves and finally, depending on the cooling rate, form a crystal structure with the minimum possible energy. If the temperature is reduced very quickly, the crystalline state may not be achieved at all, and the system ends up in a polycrystalline state with higher energy than the crystalline state. Consequently, the minimum energy state is achieved only when the temperature is reduced slowly, which is what the annealing process does.
By the same token, in the A-GTAW problem an objective function plays the role of the energy function to be minimized, and the optimized values of the A-GTAW process parameters correspond to the lowest value of this energy function. The mechanism of the SA algorithm is as follows [23]. First, an acceptable solution space is defined and an initial random solution is generated in this space. Next, a new solution is generated, and its objective function value (C1) is computed and compared with that of the current solution (C0).
Next, a probability function (Equation (6)) is defined, in which the parameter Tk plays the same role as the temperature in the physical annealing process.
$$P = \exp\left(-\frac{\Delta C}{T_k}\right), \qquad \Delta C = C_1 - C_0 \qquad (6)$$
Equation (7) is used as the temperature reduction rate to cool down the pre-determined temperature at each iteration.
If either the new solution is an improvement or the value of Equation (6) is higher than a number generated randomly between 0 and 1, a move to the new solution is made. Because of this mechanism, at the higher temperatures of the initial iterations most worsening moves may be accepted, which helps avoid getting trapped in local minima. At later iterations, however, only improving moves are likely to be accepted.
The algorithm is terminated after a pre-determined run time, after a specific number of iterations, or after a number of iterations in which no improvement is detected.
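The steps described above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study: the geometric cooling schedule (temperature multiplied by `alpha` each iteration), the Gaussian neighbour move, and all default parameter values are assumptions, and the test function is a simple quadratic rather than the BPNN-based objective.

```python
import math
import random

def simulated_annealing(cost, lower, upper, t0=1.0, alpha=0.95,
                        n_iter=2000, step=0.1, seed=0):
    """Minimize cost(x) inside the box [lower, upper]."""
    rng = random.Random(seed)
    dim = len(lower)
    # Initial random solution inside the admissible space.
    current = [rng.uniform(lower[d], upper[d]) for d in range(dim)]
    c0 = cost(current)
    best, best_cost = current[:], c0
    t = t0
    for _ in range(n_iter):
        # Propose a neighbour by perturbing each coordinate, clipped to bounds.
        cand = [min(max(current[d] + rng.gauss(0.0, step) * (upper[d] - lower[d]),
                        lower[d]), upper[d]) for d in range(dim)]
        c1 = cost(cand)
        delta = c1 - c0
        # Accept improvements always; accept worse moves with
        # probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, c0 = cand, c1
            if c0 < best_cost:
                best, best_cost = current[:], c0
        # Geometric cooling (assumed schedule), with a floor to avoid T = 0.
        t = max(t * alpha, 1e-12)
    return best, best_cost

best, best_cost = simulated_annealing(
    lambda x: (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2,
    lower=[-1.0, -1.0], upper=[1.0, 1.0])
print(best, best_cost)
```

Here the loop stops after a fixed number of iterations, which is one of the termination criteria listed above.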

Particle swarm optimization algorithm
Particle swarm optimization (PSO) is a randomly initialized, population-based evolutionary heuristic algorithm proposed by Kennedy and Eberhart [24]. First, a population of random solutions (called particles) is initialized, and the search for the optimum proceeds by updating this population over successive generations. The particles move through the problem space by following the current best particles. The best solution each particle has achieved so far is called "pBest", and the best location obtained by the whole population is called "gBest". At each step, the PSO procedure changes the velocity of each particle toward its "pBest" and the "gBest", with the acceleration toward each weighted by a separate random number. The particles are updated using Equations (8) and (9) [25][26][27].
Here, for each potential solution (particle), the term Vi+1 is determined from its previous velocity (Vi), its best solution (pBest), and the global best location (gBest). The position of each individual particle (Xi) in the solution space is updated using Equation (9) [27]. The terms "r1" and "r2" are generated randomly in the range [0, 1]. The acceleration constants ("c1" and "c2") pull each particle toward its best solution and the global best location.
The inertia weight "w" plays an important role in the convergence behavior of the algorithm. Large inertia weights are selected to explore the design space globally, whereas small inertia weights concentrate the velocity updates on nearby regions of the design space [28].
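A minimal Python sketch of the procedure is given below. It uses the widely used inertia-weight form of the update, V <- w*V + c1*r1*(pBest - X) + c2*r2*(gBest - X) followed by X <- X + V, which is consistent with the symbols defined above. The parameter values (w = 0.7, c1 = c2 = 1.5, 20 particles, 100 iterations) are assumptions for illustration, not the settings used in the study, and the test function is a simple sphere function.

```python
import random

def pso(cost, lower, upper, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(x) inside the box [lower, upper] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(lower)
    # Random initial positions, zero initial velocities.
    X = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
         for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in X]
    p_cost = [cost(x) for x in X]
    g = min(range(n_particles), key=p_cost.__getitem__)
    g_best, g_cost = p_best[g][:], p_cost[g]
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward pBest + pull toward gBest.
                V[k][d] = (w * V[k][d]
                           + c1 * r1 * (p_best[k][d] - X[k][d])
                           + c2 * r2 * (g_best[d] - X[k][d]))
                # Position update, clipped to the search bounds.
                X[k][d] = min(max(X[k][d] + V[k][d], lower[d]), upper[d])
            c = cost(X[k])
            if c < p_cost[k]:
                p_best[k], p_cost[k] = X[k][:], c
                if c < g_cost:
                    g_best, g_cost = X[k][:], c
    return g_best, g_cost

g_best, g_cost = pso(lambda x: sum(v * v for v in x),
                     lower=[-5.0, -5.0, -5.0], upper=[5.0, 5.0, 5.0])
print(g_best, g_cost)
```

In the setting of this study, `cost` would be the single measure built from the BPNN predictions, and the three coordinates would correspond to F, I, and S.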
Although the architecture of a BPNN is conventionally determined by trial and error, in this study the PSO algorithm has been employed to determine the number of hidden layers and the number of nodes in each hidden layer of the BPNN. Furthermore, the optimization of the proposed BPNN models has been carried out using the PSO algorithm. Moreover, the SA algorithm has been used to evaluate the performance of the PSO algorithm (that is, to check that it has not become trapped in a local optimum).
The performance of each evolutionary algorithm is affected by its own distinctive tuning variables.
The tuning parameters used to control the SA and PSO algorithms are as follows.

Results and discussion
Different weights (W1 and W2) may be assigned to the A-GTAW process responses (depth of penetration and weld bead width) according to their relative importance. In this study, a value of 0.5 has been used for both W1 and W2. Fig. 7 illustrates the cross section of the weldment for the optimized conditions (using the hybrid BPNN-PSO). By its nature, the PSO algorithm converges faster than the SA algorithm. However, because the drawback of PSO is that it can fall into local optima, its performance is better checked with another algorithm. In this paper, the performance of the PSO algorithm has been checked using the SA algorithm. Table 3 presents the results of the PSO and SA optimization. Based on the results, PSO and SA can optimize the process responses accurately (with less than 3% error).

Conclusion
This study has addressed the modeling and subsequent optimization of the A-GTAW process in welding AISI316L stainless steel parts, considering both the process variables and the percentage of the activating flux combination. First, a BBD has been employed to design the experimental test matrix and collect the data needed for modeling and optimization. Next, MIP software has been employed to measure the depth of penetration and weld bead width from the weld cross sections, and the ASR values have been computed from the WBW and DOP results. Then, a BPNN has been employed to establish the relationships between the process parameters (welding speed, welding current, and percentage of activating flux combination) and the responses (DOP, WBW, and ASR). The adequacy of the models has been checked using analysis of variance (Supplementary Tables 4 and 5). Furthermore, the PSO algorithm has been used to determine the proper BPNN architecture (number of hidden layers and of neurons/nodes). The PSO algorithm has then been used again to optimize the resulting BPNN model so that DOP increases, WBW decreases, and the desired ASR is achieved simultaneously. In addition, the SA algorithm has been employed to check that the optimization has not become trapped in a local optimum (Supplementary Fig. 8). The proposed integrated method (BPNN-PSO) is shown in Supplementary Table 9. Using the proposed hybrid BPNN-PSO approach, both the process input variables (133 mm/sec for welding speed and 100 Amp for welding current) and the formula for the activating flux combination (74% SiO2 and 26% TiO2) have been optimized to achieve the desired process output characteristics (minimum WBW, maximum DOP, and desired ASR). Apart from the proposed hybrid BPNN-PSO method, the Design Expert software (version 11) provides an optimization technique that leads to the same optimization results (Supplementary Fig. 10).
The results of the proposed optimization procedure show that the method can simulate and optimize the A-GTAW process precisely (with less than 4% error).

Acknowledgments
We are thankful to the editor and reviewer of this journal for suggesting valuable improvements and allowing appropriate time for updating the manuscript.

Conflicts of interest/Competing interests
The authors whose names are listed immediately below certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Availability of data and material
The data and material can be made available upon request.

Code availability
The code can be made available upon request.

Ethics approval
The manuscript will not be submitted elsewhere until the editorial process is completed.

Consent to participate
Masoud Azadi Moghaddam analyzed the experimental data obtained. The study has been done under the supervision of F Kolahan. All authors read and approved the final manuscript.

Consent for publication
The Authors grant the Publisher the sole and exclusive license of the full copyright in the Contribution, which license the Publisher hereby accepts. Consequently, the Publisher shall have the exclusive right throughout the world to publish and sell the Contribution in all languages, in whole or in part, including, without limitation, any abridgement and substantial part thereof, in book form and in any other form including, without limitation, mechanical, digital, electronic and visual reproduction, electronic storage and retrieval systems, including internet and intranet delivery and all other forms of electronic publication now known or hereinafter invented.

Figure 1 FESEM test equipment used and results of nano activating flux (SiO2) scaling
Evaluation of DOP and WBW for the optimized conditions