Study of micro-porosity in electron beam butt welding

As complete elimination of porosity from a weld is very difficult, the next option is to minimize weld porosity, which is crucial for the safe performance of welded components. However, investigating porosity through experiments alone is tedious and time consuming. Additionally, very few models are available in the literature for accurate prediction of the different porosity attributes. The present study, thus, addresses both the experimental and the modelling aspects of micro-porosity during electron beam welding (EBW) of SS304 plates. Welding parameters are reported to have a significant influence on micro-porosity. Hence, the influences of these parameters on micro-porosity attributes, namely number of pores, average diameter, and sphericity, are extensively studied experimentally employing optical microscopy (OM), scanning electron microscopy (SEM), X-ray computed tomography (XCT), and Raman spectroscopy. This is followed by elaborate modelling using seven popular and well-recognized machine learning algorithms (MLAs), namely multi-layer perceptron (MLP), support vector regression (SVR), M5P model trees regression, reduced error pruning tree (REPTree), random forest (RF), instance-based k-nearest neighbor algorithm (IBk), and locally weighted learning (LWL). Using these different techniques enhances the chance of obtaining better predictions of the said micro-porosity attributes by overcoming the data-dependence and other limitations of individual MLAs. The model-predicted micro-porosity data are also validated against experimental data. Statistical tests and Monte-Carlo reliability analysis are additionally utilized to evaluate the performances of the employed algorithms. IBk and MLP are found to perform well overall.


Introduction
The welding process is widely reported to be a complex nonlinear system because various thermo-mechanical-physical-metallurgical phenomena, such as heat transfer, fluid motion, micro-segregation, porosity formation, stress generation, and solidification cracking, occur simultaneously [1,2]. The welded joint, thus, often acts as a potential site of failure [2]. In particular, the electron beam welding (EBW) process is often subjected to a higher cooling rate and the strong influence of Marangoni, Lorentz, and buoyant forces [3-5]. Under such complex circumstances, weld defect estimation becomes important for quality assurance. Out of the various methods available for the analysis of different welding defects, non-destructive testing (NDT) methods have gained wide popularity. Vilar et al. [6] stated that NDT methods have evolved over time from a mere laboratory curiosity to an unavoidable need for industrial product quality assurance. As a result, different NDT tests have been employed to ensure the quality, longevity, safety, and reliability of products available in the market [7]. Porosity is one of the prominent welding defects, known to strongly influence the mechanical properties of the joint [8]. Its elimination is also reported to be very difficult [9]. Porosity promotes crack initiation, which further reduces the weld fatigue strength [9]. Porosity formation is also reported to be influenced by many factors, namely keyhole dynamics [8,9], fluid flow, solidification rate [9,10] etc., but weld porosity can be minimized by selecting an optimum set of input process parameters [8,9]. EBW takes place in a vacuum environment, and the gases responsible for porosity mostly come from the gases inherently dissolved in the base metal.
Apart from the vacuum environment, all welding processes, including EBW, involve localized heating, melting, cooling and solidification, which significantly affect the porosity distribution in the weld. Specifically, the cooling rate, keyhole geometry profile, dendritic arm spacing (DAS) etc. of EBW and laser beam welding (LBW) are comparable, both being high energy welding techniques. Both EBW and pulsed-LBW (P-LBW) are reported to yield almost equal bead geometries. However, the pulsed nature of the laser beam, along with the forced convection induced by the flowing shielding gas, results in P-LBW exhibiting a higher cooling rate than the EBW process [11]. A similar observation was reported by Hou et al. [12], who compared different weld attributes, namely porosity distribution, microstructure etc., during P-LBW and continuous laser beam welding (C-LBW) of Al-25Si-4Cu-Mg alloy. They reported P-LBW to have a smaller heat input than C-LBW because of its intermittent energy distribution, resulting in a relatively higher cooling rate. Overall, the power density of both EBW and LBW was reported by Short [13] to vary approximately in the range of $10^{11}$ W/m$^2$ to $10^{12}$ W/m$^2$. The nature of weld solidification and the associated changes in microstructure during both EBW and LBW of austenitic stainless steels were also observed by Ragavendran and Vasudevan [14] to be comparable.
Additionally, the literature available on the analysis of weld porosity exclusively in EBW is relatively scarce [15]. However, regardless of the welding technique, some similar trends are expected regarding the influence of welding parameters on the porosity distribution in the weld zone. As a result, many different welding techniques are discussed below to address the primary goal of the present study, namely correlating weld porosity with the welding parameters.
Various experimental methods and welding inspection techniques were used by the following researchers to study weld porosity. Sun et al. [9] investigated the effects of different shielding gases on weld porosity during LBW of thick AISI 304 stainless steel (SS) plates using a high-speed camera. They observed a reduction in the porosity size with an increase in welding speed. LBW was carried out by Elmer et al. [16] on A36 steel, 21-6-9 SS, and pure nickel to evaluate the influences of argon (Ar) and nitrogen (N$_2$) on porosity formation. Both the number of pores and the total pore volume were observed to increase with the increase in melt volume for the AISI 304L SS material-Ar gas combination. Lisiecki [17] investigated the influence of heat input per unit length (E) on weld geometry and porosity during LBW of 5.0 mm thick S700MC steel plates using a disk laser, where weld porosity was observed to increase with an increase in the heat input. Wang et al. [10] conducted LBW of 2 mm thick 5754 Al-alloy sheets to investigate the effects of welding process parameters on the joint quality. They also carried out X-ray inspections to study the weld porosity and found it to decrease with the reduction of heat input. Blecher et al. [18] studied root defects in fully penetrated laser and hybrid laser arc welding using optical microscopy (OM) and X-ray CT scan. They observed root defect formation to strongly depend on a force balance between the surface tension and the melt-pool weight. This force balance was observed to depend strongly on heat input, and root defects could be minimized by preferring a low heat input. EBW of mild steel and Fe-Al alloy was carried out by Dinda et al. [15], where an increase in weld porosity was observed with an increase in welding speed. Moreover, they suggested the use of beam oscillation to reduce porosity formation. XCT and Raman spectroscopy had been performed by Bandi et al.
[19] to investigate micro-porosity during EBW of Zircaloy-4 alloys. They also reported a significant reduction in porosity by using an oscillating beam. X-ray diffraction (XRD), SEM, XCT, and a micro-hardness tester were used by Dinda et al. [20] to study the distribution of porosity and intermetallic compounds (IMCs) in dissimilar EBW of DP600 steel to 5754 Al-alloy. They also observed a significant reduction in the porosity due to beam oscillation, and further employed Raman spectroscopy for gas identification in the welds. The amplitude of oscillation was also observed through XCT to have a strong influence on the pore attributes [21]. Alshaer et al. [22] studied the weld defects and mechanical performance of LB-welded AC-170PX Al-alloy using fillet and flange couch joints. They reported an increase in porosity with an increase in welding speed and heat input. OM, SEM, etc. were utilized by Zhan et al. [23] to study the influence of heat input on porosity during laser cladding of Invar alloy. They reported heat input and cooling rate to have a significant influence on the porosity, where the porosity was observed to increase with heat input up to a threshold value, after which it followed a decreasing trend. Xie et al. [24] carried out XCT and SEM to investigate the influence of heat input on porosity during the LBW of molybdenum alloy and observed an increase in porosity with the increase in heat input. They further reported that porosity occurs at low heat input due to the inability of the gases to escape the weld zone because of the high cooling rate.
Apart from experimental investigation, the capability of different conventional and non-conventional modeling techniques has been widely explored by various researchers to predict the desired weld attributes [3]. Among the many available techniques, Statistical Regression Analysis (SRA) has been reported extensively for modeling various EBW processes [25] using experimental data sets. However, it is known that human error, machine error, process error etc. always contribute some uncertainties to any experimental data set. These uncertainties are rightly expected to be present in the augmented data set obtained through the regression equations. On the other hand, machine learning algorithms (MLAs) generally require a large data set for satisfactory performance and are usually observed to yield unsatisfactory results with a smaller data set [1]. At the same time, MLAs can handle the uncertainties and complexity of the problem in an efficient way, and they are also reported to offer a noticeable improvement in the quality of prediction [1]. Thus, the state-of-the-art performance, accurate predictions and better generalization capability of MLAs trained on an augmented data set make them a popular choice for the modeling of different processes [1]. The influence of input parameters, namely power, welding speed, stand-off distance and clamping pressure, on the quality of LBW was investigated by Petkovi [26]. The researcher employed SVR, ANN etc., where SVR performed better than the others. Liang et al. [27] stated that SVR performs better than MLP for the same small-size dataset during the prediction of weld attributes in gas tungsten arc welding (GTAW) of AISI 304 SS. Boersch et al. [28] used M5P, RF, SVM and a few other MLAs with the help of Weka software to predict the weld diameter during resistance spot welding.
The deviation in the model-predicted weld diameter was found to be within 10% with respect to the experimentally measured one for most of the datasets. Computational intelligence-based tools, such as MLP and radial basis function networks had been used by Ghanty et al. [29] to investigate the influences of different welding parameters on the melt-pool geometry during GTAW of AISI 316LN stainless steel material. These SRA and other MLAs-based modeling techniques had also been utilized to predict the influence of welding process parameters on porosity and other welding defects. A 3D transient model had been proposed by Lu et al. [8] to study the effects of process parameters on keyhole-induced porosity during the LBW of CCS-B steel, where the predictions closely matched the experimental results. They had observed pore number and its size to increase with an increase in laser power and a decrease in welding speed. On the other hand, Munir et al. [30] and Hou et al. [7] had employed various MLAs to study welding defects, namely cracks, porosity, slag inclusion etc. MLP and fuzzy k-nearest neighbor were employed by Wang and Liao [31] to identify different welding defects from the radiographic images. They had found MLP neural networks to perform better than fuzzy-k-nearest neighbor (k-NN). Liao et al. [32] had used MLP and case-based reasoning (CBR) to predict welding defects. Das et al. [33] summarized the merits, issues and guidelines for the selection of the most suited conventional and non-conventional techniques for efficient weld optimization.
As mentioned above, complete elimination of porosity from the weld is very difficult. Hence, the second-best option available is to minimize weld porosity, which is crucial for the safe performance of the welded components. However, the experiment-based detection of porosity in a weld, followed by its proper analysis, is very time consuming and tedious. As a result, generating a large data set on weld porosity through experiments alone is infeasible within a given time. This also makes modelling and optimization of weld porosity, slag inclusion, cracks or any other welding defects extremely difficult [1,7]. These issues have led to limited availability of literature on the experimental, modelling and combined analysis of weld porosity in EBW [15].
The novelty of the present study, thus, lies in the detailed experimental investigation of the effects of welding input process parameters on the above stated gas-induced micro-porosity attributes during the EBW process through OM, SEM, XCT, and Raman spectroscopy. The study also evaluates the role of the melt-pool area in gas escape. In addition, seven popular and well-recognized MLA-based models from distinctly different backgrounds are chosen for the present study. That is, multi-layer perceptron (MLP) and support vector regression (SVR) work on the principle of artificial neural networks (ANN). M5P model trees regression, reduced error pruning tree (REPTree) and random forest (RF) are tree-based regression tools. Instance-based k-nearest neighbor algorithm (IBk) and locally weighted learning (LWL) belong to lazy learning-based algorithms [34,35]. Hence, MLAs from different categories, namely neural networks, model regression trees and lazy learners, are likely to overcome the data-dependence of the MLAs, compensate for the limitations of individual algorithms, and thereby significantly improve the chance of obtaining better results. Moreover, the well-known K-fold cross validation (CV) technique is adopted in the present study with the employed MLAs to make the modelling rigorous, thorough and unbiased [34,36-38]. Additionally, these models are also validated through experiments. Some well-known statistical tests along with Monte-Carlo reliability analysis techniques are also utilized to evaluate, and thereby compare, the capability of the MLAs to predict micro-pore attributes in the weld.
Hence, the present study brings together, under one roof, an extended experimental investigation, a wide variety of popular MLA-based modelling techniques, and their performance evaluation through statistical tests and Monte-Carlo reliability estimation to analyze micro-porosity formation as a function of the input parameters of the high energy EBW process. Such balanced and elaborate studies on similar topics are very limited. Hence, the present study adds value for the research community and thereby assists in the selection of suitable process parameters to reduce weld porosity.
It is necessary to mention that, other than the optimization of welding process parameters, the use of beam oscillation has also been reported to effectively reduce weld porosity. However, beam oscillation also involves different parameters, namely beam oscillation frequency, beam oscillation radius and various modes of beam oscillation, such as circular, triangular, rectangular, vertical and horizontal height etc. [20,39,40]. As stated above, experimental investigation of the influence of these different beam oscillation parameters on weld porosity is time consuming. Moreover, in order to develop input-output models, a large experimental data set following full-factorial design, Taguchi or any other design of experiments (DOE) will be beneficial. The authors also plan to combine the optimum beam oscillation parameters with the optimum welding input process parameters in future, which is expected to further reduce weld porosity and thereby further improve the joint quality.
The rest of the manuscript is structured as follows: Sect. 2 deals with a brief description of the employed materials and methods for the experiment. Section 3 discusses the different MLAs used in this study. Results are stated and discussed in Sect. 4. Some concluding remarks are made in Sect. 5.

Experimental study and data collection
The experimental setup and the methods of data collection and data augmentation used in the present study are briefly discussed in this section.

Experimentation
The welding samples are cleaned with acetone to remove all dirt particles prior to welding. Experiments are conducted on an 80 kV-150 mA EBW set-up, developed by Bhabha Atomic Research Centre, Mumbai, India [38,41], at IIT Kharagpur, India (refer to Fig. 1a). The working ranges of the input process parameters used in the present study are listed in Table 1. These welding parameters are selected by considering the working ranges of the process parameters reported in the literature [38,41] and through some prior trial-and-error welding runs within the working range of the available EBW setup. During welding, the focusing coil current is varied between 9 mA and 15 mA for focusing the electron beam on the job surface. The electron beam diameter is kept approximately equal to 800 µm throughout the experiments. All other parameters, such as the working distances, fixture heights, vacuum level etc., are kept fixed throughout the experiments [42]. It is important to note that 5 mm thick AISI 304 SS plates are butt welded, and their chemical composition is found to be 71.37% Fe, 17.80% Cr and 8.09% Ni through X-ray fluorescence spectroscopy (XRF) [43]. AISI 304 stainless steel has been extensively used in the chemical industries, nuclear reactor coolant piping, fabrication plants, food processing industries, dairy industries, cryogenic vessels, valve systems industries, and in numerous other sectors due to its outstanding mechanical properties at elevated temperatures, strength, corrosion resistance etc. [44,45]. This makes the present investigation more crucial.
The weld samples corresponding to $3^3 = 27$ different welding conditions (according to full-factorial DOE) are polished on a Supertech double disc polishing machine using Jawan brand silicon carbide waterproof paper with grit sizes of 220, 400, 600, 800, 1000 and 1200, followed by diamond polishing at 0.25 µm. Polishing is then followed by etching using a mixture of ferric chloride, hydrochloric acid, and ethyl alcohol in the approximate ratio of 20%, 5%, and 75%, respectively. The obtained weld cross-section is then investigated through OM and SEM. The OM is carried out on a Leica DMLM optical image analyzer at the Optical Microscopy and Material Testing (OMMT) Laboratory, Central Research Facility (CRF), IIT Kharagpur, India. Similarly, SEM images are taken on a JEOL JSM-IT300HR at the Department of Metallurgy and Materials Engineering, IIT Kharagpur, India, for observing better and magnified views of the pores. The successful identification of the pores has led the authors to carry out extensive XCT analysis of all the 81 welded samples (corresponding to all $3^3 = 27$ different welding conditions according to the full-factorial DOE) through the Zeiss Versa 520 system (refer to Fig. 1b). The non-destructive XCT analysis is used for 3D visualization and quantification of micro-porosity in the weld. It is important to note that the samples are not etched before carrying out the XCT analysis, as etching may enlarge the porosity. The sample is mounted on a table lying between the X-ray gun and the detector screen. The table is rotated through 360° and several images are taken throughout the rotation. All these images are merged to develop the 3D volumetric image of the sample, which is composed of 3D pixels, or simply voxels. The machine is further equipped with different optical systems to obtain additional magnification. The XCT data are analyzed through VG-STUDIO-MAX 2.2 software.
The noise present in the 3D images, caused by unexpected tilt or shift of the sample during testing due to fixation issues, is addressed through proper smoothening processes. ROI CT filtering and beam hardening correction (BHC) are performed to eliminate the ring effects and artifacts, respectively, by maintaining the histogram between the matrix and background. A thresholding value of 0.1 is maintained for all the analyses [15,20,46]. The XCT machine parameters, namely voltage, current, exposure time, filter (Cu-mm), number of projections, and voxel size, are kept equal to 150 kV, 150 µA, 500 ms, 0.5 (Cu-mm), 1000, and 23.8 µm, respectively. The micro-porosity in the weld is measured in the present study following the (ISO15901, part 3; 2006 (EN)) standard. Ideally, the entire weld sample should be scanned to study the pore distribution through XCT. However, in practice, this may not always be feasible, because the X-rays need to penetrate the full-size sample completely in order to map the variation in density of the metal and gases. As the X-rays fail to penetrate the full-size sample even at the above stated maximum rated voltage and current settings of the XCT machine with an intermediate number of projections, a reduction of the sample size becomes necessary. To maintain consistency, all the porosity samples used for the said XCT analysis are kept the same size and are cut from approximately the same location of the welded plate. That is, from the welded plate of size 110 mm × 80 mm × 5 mm, a small portion of size 15 mm × 15 mm × 5 mm is cut at the center of the plate, located 40 mm from the weld start point and 40 mm from the end of the weld, as shown in Fig. 1c. As the sample size is reduced, the X-rays can fully penetrate the sample, and the possibility of detecting smaller pores through XCT also increases.
For the said sample size (15 mm × 15 mm × 5 mm), the voxel size is found to be equal to 23.8 µm. However, a minimum resolution of one micron is kept just to indicate that even smaller pores may be present, whose measurement is beyond the scope of the present study. In order to detect even smaller pores, the sample size would need to be reduced further. But that, again, would limit the analysis of pore distribution to a very small weld zone. The authors carried out a few trial-and-error runs to decide the size of the samples. After completing the XCT analysis, the entrapped gases in the EBW samples are determined through Raman spectroscopy [15,47-49] (refer to Fig. 1d). Figure 1 shows all the experimental setups used in the present study. It also shows the complete welded sample, the place from which samples for porosity analysis are taken, and the sample used for porosity analysis of the welding conditions. The Raman spectroscopy employed in the present study for gas detection uses an Argon-Krypton mixed ion gas laser. MODEL 2018 RM (Make Spectra-Physics, USA) and MODEL T64000 (Make Jobin Yvon Horiba, France) are used as the excitation source and spectrometer, respectively. The thermo-electric cooled front-illuminated 1024 × 256 CCD of MODEL Synapse (Make Jobin Yvon Horiba, France) and the optical microscope of MODEL BX41 (Make Olympus, Japan) are used as the detector and collection optics, respectively. The focal length, frequency, step size, and grating during the Raman spectroscopy are kept equal to 640 mm, 100 cm$^{-1}$, 0.00066 nm, and 1800 grooves/mm, respectively [15,46,48,49]. Finally, the melt areas corresponding to the said welding conditions are obtained from the weld cross-section through microscopy study. All the experimental and modelling steps adopted in the present study are graphically presented through a flowchart (refer to Fig. 2).
This representation not only provides a compact summary of the entire work but also guides young researchers on how to proceed stepwise. In this study, V, I, and U are considered as input parameters (refer to Table 1). The number of pores $N_p$, average pore diameter $D_p$ (µm) and average pore sphericity $S_{phry}$ are considered as the outputs. Here, a full-factorial design with the said three input parameters having three levels each is developed. Thus, weld porosity for each of the [...] It is necessary to mention that in the present study, the influences of accelerating voltage (V), beam current (I) and welding speed (U) on the different porosity attributes are analyzed. All these input parameters are known to significantly influence the porosity distribution in the weld. However, while representing their influences through 2D input-output plots (refer to Fig. 5), the only feasible and relatively easy way to depict the influences of all the said input parameters together is through a combined equivalent parameter, namely heat input per unit length or simply, heat input (refer to Eq. (1)) [50].
It is a well-established parameter to represent the combined influence of V, I and U , and has been utilized by numerous researchers to predict a wide range of different weld attributes, namely weld geometries, melt-pool area, grain size, dendritic arm spacing, porosity, hot cracking tendency, cooling rate, element segregation, phase formation, strain distribution, fusion zone hardness and other mechanical properties [12,14,51,52].
In fact, porosity distribution in the weld had been studied as a function of heat input by Hou et al. [12] during pulsed and continuous LBW of high-silicon aluminum alloy, where a proper adjustment of input parameters was observed to be significantly controlling the weld porosity. These studies have motivated the authors to investigate porosity distribution in the weld as a function of E in the present study. Here, heat input is varied from 0.175 kJ∕mm to 0.526 kJ∕mm.
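The full-factorial design and the heat input E = V × I / U described above can be sketched in a few lines. This is a minimal illustration only: the three levels of each parameter below are hypothetical placeholders and are not the values of Table 1, and the welding speed is assumed to be expressed in mm/min.

```python
from itertools import product

def heat_input_kj_per_mm(voltage_kv, current_ma, speed_mm_per_min):
    """Heat input per unit length, E = V * I / U, returned in kJ/mm."""
    power_w = voltage_kv * current_ma            # kV * mA = W
    speed_mm_per_s = speed_mm_per_min / 60.0     # assumed unit: mm/min
    return power_w / speed_mm_per_s / 1000.0     # J/mm -> kJ/mm

# Hypothetical three-level settings; NOT the values of Table 1.
voltages_kv = [60, 65, 70]
currents_ma = [40, 45, 50]
speeds_mm_per_min = [750, 1000, 1250]

# Full-factorial DOE: every combination of the three levels of V, I and U.
design = list(product(voltages_kv, currents_ma, speeds_mm_per_min))
heat_inputs = [heat_input_kj_per_mm(v, i, u) for v, i, u in design]

print(len(design))                        # number of welding conditions
print(min(heat_inputs), max(heat_inputs))  # range of E for these placeholder levels
```

Several different (V, I, U) combinations can give the same or nearly the same E, which is why E serves only as a combined parameter for 2D plotting and not as a replacement for the three inputs in the models.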

$$E = \frac{V \times I}{U} \tag{1}$$

Fig. 2 Flowchart depicting the stepwise detection, modeling and analysis of micro-porosity in the weld

Data augmentation
As stated above, a large data set is usually required for satisfactory performance of the MLAs [1]. However, generating a large data set on weld porosity, slag inclusion, cracks or any other welding defects is time consuming and difficult, thereby making the use of MLA-based techniques for modelling and prediction very challenging. This issue with small data sets has been addressed by some researchers with reasonable accuracy by employing different data augmentation techniques to increase the size of the original data set. In fact, they have recommended this approach for MLA-based modelling and prediction of different metallurgical outputs with a relatively small database [1,7]. Thus, the data augmentation technique adopted in the current investigation is inspired by, and justified through, the published literature [1,7]. Some uncertainties are always associated with the experimental data, and these are likely to be present also in the augmented data developed through regression equations. However, MLAs can inherently deal with these uncertainties and inconsistencies associated with the augmented data [1]. This is further accompanied by the use of the well-known K-fold cross validation (CV) technique, known to make the evaluation of the performance of the MLAs unbiased and thorough [34,36,37]. Hence, the experimental data available in the present study for only $3^3 = 27$ input-output combinations from a full-factorial design are artificially augmented to a set of 1000 input-output data by generating an additional 973 input-output data using regression equations (refer to Eqs. (2)-(4)) developed through Minitab 16 software. The equations used for the data enhancement are given below.
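The augmentation step can be illustrated with a short sketch. It is not a reproduction of Eqs. (2)-(4): it assumes a simple first-order regression in coded levels (-1, 0, +1), and the 27 "measured" responses are placeholder numbers generated from made-up coefficients. The function names `fit_first_order` and `predict` are hypothetical.

```python
import random
from itertools import product

LEVELS = (-1.0, 0.0, 1.0)  # coded low / mid / high levels of V, I and U

def fit_first_order(xs, ys):
    """Least-squares fit of y = b0 + b1*V + b2*I + b3*U.

    The closed form below is valid only because the coded full-factorial
    columns are mutually orthogonal and each sums to zero.
    """
    b0 = sum(ys) / len(ys)
    slopes = []
    for j in range(3):
        num = sum(x[j] * y for x, y in zip(xs, ys))
        den = sum(x[j] ** 2 for x in xs)
        slopes.append(num / den)
    return [b0] + slopes

def predict(coef, x):
    """Evaluate the fitted regression equation at a coded input x."""
    return coef[0] + sum(c * xj for c, xj in zip(coef[1:], x))

# Placeholder responses for the 27 conditions; NOT experimental data.
x_exp = list(product(LEVELS, repeat=3))
y_exp = [50.0 + 10.0 * v + 5.0 * i - 8.0 * u for v, i, u in x_exp]

coef = fit_first_order(x_exp, y_exp)

# Generate 973 additional inputs inside the working range and label them
# with the regression equation, giving 1000 input-output pairs in total.
rng = random.Random(0)
x_new = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(973)]
y_new = [predict(coef, x) for x in x_new]

dataset = list(zip(x_exp, y_exp)) + list(zip(x_new, y_new))
print(len(dataset))  # 27 + 973
```

Because the additional points are labelled by the regression equation, any misfit of that equation is carried into the augmented set, which is the uncertainty referred to in the text.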
The quality of the MLA-based predictions is evaluated through the correlation coefficient, root mean square error, and average absolute percent deviation. Here, $L$, $T_{ok'l}$ and $O_{ok'l}$ denote the total number of welding conditions, and the target and predicted outputs of the $l$-th condition, respectively.
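For reference, the three measures can be computed as in the sketch below. It uses the commonly accepted textbook definitions (Pearson correlation, root mean square error, and mean of the absolute relative errors expressed in percent), which are assumed here to correspond to the measures named in the text; the function names are hypothetical.

```python
import math

def correlation_coefficient(targets, predictions):
    """Pearson correlation between target and predicted outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / math.sqrt(var_t * var_p)

def rmse(targets, predictions):
    """Root mean square error over all welding conditions."""
    n = len(targets)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n)

def aapd(targets, predictions):
    """Average absolute percent deviation; targets must be non-zero."""
    n = len(targets)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(targets, predictions)) / n

# Placeholder target/prediction pairs, for illustration only.
targets = [10.0, 20.0, 30.0, 40.0]
predictions = [11.0, 19.0, 33.0, 38.0]
print(correlation_coefficient(targets, predictions))
print(rmse(targets, predictions))
print(aapd(targets, predictions))
```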

Tools and techniques for modeling through MLAs
WEKA (Waikato Environment for Knowledge Analysis) is a well-accepted open-source Java-based machine-learning framework consisting of diverse categories of MLAs for regression, classification and other data processing [38,53-55]. The present study utilized Weka 3.8.0 to develop MLP, IBk, RF, REPTree, SVR, M5P and LWL. MLP is an artificial neural network (ANN) comprising input, hidden and output layers [34]. It works based on the principle of back-propagation. It has also been employed by many other researchers [31]. IBk is an instance-based lazy learner regression tool, and works based on the k-nearest neighbor (k-NN) algorithm. The Euclidean distances between the instances are utilized to determine the nearest neighbors [34,56]. RF is an efficient tree-based MLA capable of carrying out both classification and regression [57]. RF has also been reported by various researchers for the prediction of different weld attributes [58,59], and found to be very accurate. A node statistics-based multiple decision tree, used for regression and classification, is called Reduced Error Pruning Tree (REPTree) [34,60]. Vapnik [61] developed a statistical approach-based supervised MLA, known as support vector machine (SVM). It has gained popularity and implementation in the welding industry because it is a less time-consuming and less costly technique for modeling of data than ANN models [62]. It employs a kernel function to solve the problem [27]. Model Trees Regression (M5P) uses the M5 algorithm and works based on the principle of divide-and-conquer. It is relatively smaller than other trees and also contains fewer variables [34]. Zhan et al. [63] reported that the M5P algorithm could create multiple linear regression models at its tree leaves. Alam et al. [57] stated that the M5P regression tree is generated through a piecewise function of many linear models.
LWL is a non-parametric statistical tool, which depicts the non-linearity of a problem through piecewise linear simplification [34,64]. As mentioned above, the present study has utilized seven popular and well-recognized MLAs from distinctly different backgrounds, namely neural networks, model regression trees and lazy learners, following the well-known K-fold cross validation (CV) [36]. Limited literature is available on the application of such extended modeling techniques to study different weld attributes in high energy welding processes, such as EBW, LBW, ion beam welding etc.
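The K-fold cross validation procedure can be illustrated with a small stand-alone sketch. It does not use Weka: a plain k-nearest-neighbor regressor stands in for IBk, and the number of folds, the number of neighbors and the data are assumptions chosen only for illustration.

```python
import math
import random

def knn_predict(train_x, train_y, query, k=3):
    """Mean target of the k nearest training instances (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda xy: math.dist(xy[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

def k_fold_rmse(xs, ys, n_folds=10, k=3, seed=0):
    """RMSE pooled over all held-out folds of a K-fold cross validation."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::n_folds] for f in range(n_folds)]  # disjoint folds
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        # Train on every instance that is not in the current fold.
        train_x = [xs[i] for i in indices if i not in held]
        train_y = [ys[i] for i in indices if i not in held]
        for i in held_out:
            pred = knn_predict(train_x, train_y, xs[i], k)
            squared_errors.append((pred - ys[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Placeholder data: 100 random coded inputs with a smooth made-up response.
rng = random.Random(1)
xs = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(100)]
ys = [50.0 + 10.0 * v + 5.0 * i - 8.0 * u for v, i, u in xs]
print(k_fold_rmse(xs, ys))
```

Each instance is predicted exactly once by a model that has not seen it, so the pooled error is not biased by a single favorable train-test split.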

Results and discussion
Various observations, made by the authors are discussed below. It consists of the detection of the micro-pores in weld through SEM and XCT, followed by the study on change in micro-pore attributes with heat input and the identification of gases through Raman Spectroscopy. This is followed by data augmentation using regression equations and MLA-based modeling. Finally, the performances of these modeling techniques are evaluated through statistical tests and Monte Carlo Reliability Estimation.

Micro-pore detection through SEM and XCT
Micro-pores in the weld cross-section are detected through SEM, as shown in Fig. 3a, b for the etched and un-etched samples, respectively. Both the etched and un-etched SEM images of the weld cross-section are used to identify, and thereby confirm, the presence of pores in the weld samples. Pores are relatively easier to identify in the etched samples. However, a spot identified as a pore may actually have been formed due to pitting, which would lead to an error in identification. This possibility is ruled out by the successful identification of pores in the un-etched polished samples, although the pores are less readily visible there. Thus, images of both the etched and un-etched regions are provided in the present study to confirm the presence of distinctly clear micro-pores in the weld. The confirmed pore identification has led the authors to conduct an extensive investigation of the effects of welding process parameters on pore attributes through XCT analysis, which provides a detailed 3D distribution of micro-pores in the weld samples.
XCT relies on the density difference between the pores and the matrix. The volume percentage of porosity is calculated with respect to the total scanned volume [14]. Figure 4 depicts the XCT results for the pore distribution in the weld plate, where micro-pores of different sizes are observed to be distributed throughout the weld fusion zone. The variation of porosity from the weld centerline towards the outer extent of the weld zone on either side is not considered. This is because weld porosity is mainly formed due to melting and solidification in the fusion zone; consequently, porosity is observed only within the fusion zone and not in the heat affected zone (HAZ) or the unaffected base metal.

E vs. different micro-porosity features
The variation of weld porosity with heat input E is shown in Fig. 5. As E is increased from 0.175 kJ/mm to 0.239 kJ/mm, N_p increases from 3 to 522, and D_p increases from 32 to 39.08 µm. The percentage of porosity increases from 7.3 × 10⁻⁶ % to 2.6 × 10⁻⁴ %, whereas the average pore volume increases from 1.08 × 10⁻⁵ to 2.75 × 10⁻⁵ mm³. This observation is in line with the literature [8-10, 16, 17]. An increase in E leads to an increase in the volume of material melted. It also increases the turbulence and fluid flow in the melt-pool [17,65], resulting in an increase in weld porosity. It is to be noted that E of 0.239 kJ/mm corresponds to the maximum N_p. However, as E is further increased from 0.240 to 0.452 kJ/mm, N_p is observed to gradually decrease from 210 to 44, while D_p increases from 39.45 µm to 95.42 µm. The percentage of porosity also increases from 4.05 × 10⁻⁴ to 5.36 × 10⁻³ %, and the average pore volume increases from 3.04 × 10⁻⁵ to 9.65 × 10⁻⁴ mm³. Hence, apart from N_p, all other parameters have followed the expected trends, as reported in the literature [8-10, 16, 17]. N_p is observed to decrease, even though the pore volume is increased. This is because, with the increase in E, the liquid melt-pool gets more time and energy, which will help to create more porosity eventually, but due to the prolong  [23,24,66]. It is interesting to note that as E is increased further from 0.487 kJ/mm to 0.526 kJ/mm, N_p decreases from 19 to 15. The porosity percentage also decreases from 4.65 × 10⁻³ to 2.38 × 10⁻³ %. The average pore volume and D_p are also found to decrease from 7.09 × 10⁻⁴ to 2.75 × 10⁻⁴ mm³ and from 72.09 to 64.69 µm, respectively.
This might be due to a further increase in the solidification time, which allows the bubbles to escape the melt-pool and thus decreases the weld porosity. Similar observations have also been reported in the literature [10]. Moreover, an increase in E increases the overall melt area, so the pores get a larger area through which to escape from the melt-pool. The decrease in porosity may therefore be attributed to the increase in both solidification time and melt area. The change in the average melt area with E is shown in Fig. 6.
Furthermore, S_phry is used to define the 3D shape of the pores. It varies from 0 to 1, corresponding to a completely non-spherical and a perfectly spherical pore, respectively. It is calculated through Eq. (5), where the pore volume and pore surface area are denoted by V_ol and A_ar, respectively. Moreover, S_phry is plotted against D_p in Fig. 7. The obtained trend is in line with the literature [15].
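As an illustration of how a sphericity value can be obtained from a pore's volume and surface area, a minimal sketch is given below. It assumes the commonly used Wadell definition, i.e. the surface area of a sphere of equal volume divided by the actual surface area; this is an assumption and may differ in form from the authors' Eq. (5). The function name and the example pore dimensions are hypothetical.

```python
import math

def sphericity(volume_mm3: float, area_mm2: float) -> float:
    """Wadell sphericity (assumed form): area of the equal-volume sphere / actual area."""
    if volume_mm3 <= 0 or area_mm2 <= 0:
        raise ValueError("volume and surface area must be positive")
    return (math.pi ** (1 / 3)) * (6 * volume_mm3) ** (2 / 3) / area_mm2

# Hypothetical spherical pore of 40 µm diameter (radius 0.020 mm).
r = 0.020
s_sphere = sphericity(4 / 3 * math.pi * r ** 3, 4 * math.pi * r ** 2)

# Hypothetical cubic void of 40 µm edge length, as a non-spherical reference shape.
a = 0.040
s_cube = sphericity(a ** 3, 6 * a ** 2)

print(f"sphere: {s_sphere:.3f}, cube: {s_cube:.3f}")
```

Under this definition a perfect sphere gives a value of 1, and any other shape gives a value below 1, which matches the 0 to 1 range stated above.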
In this study, the minimum value of S_phry is found to be equal to 0.68, corresponding to an input parameter  Fig. 8. It depicts the change in pore shape with the S_phry value. It is to be noted that the smaller the pores, the higher the sphericity. Hydrogen and nitrogen gases are detected in the pores through Raman spectroscopy; the smaller, near-spherical pores are formed through nucleation and growth during the solidification process. On the other hand, the bigger and irregular pores originate from shrinkage phenomena.

Gas detection through Raman spectroscopy
Porosity is mainly induced by the entrapment of different gases. Gases may be entrapped during the manufacturing of the base metal. A few air pockets may also be entrapped within the metal junction during mechanical assembly, even though EBW is carried out in a vacuum environment [20,40,67]. EBW is a very high-speed, high-cooling-rate process, due to which some gases may be entrapped during weld solidification. Moreover, some gases may also be inherently dissolved in the workpiece itself. Huang et al. [67] observed a significant contribution of residual hydrogen gas to the formation of gas-induced porosity in EBW. They further suggested that entrapped gas-induced porosity has an overall spherical morphology. Williams et al. [68] observed, through XCT analysis, a strong influence of input process parameters on the pore morphology during additive manufacturing. The results of Raman spectroscopy conducted in the present study are provided in Fig. 9, which shows the variation of intensity with the Raman shift. The peaks are  Fig. 9 confirms that no other gases contribute to the weld porosity formation. This finding is in line with the observations reported in the literature [15].

MLAs-based modeling of micro-pore attributes
K-fold cross-validation (CV) [36] is a well-known method of determining the efficiency of the employed MLAs through random partitioning of the available data into K subgroups with equal data distribution. The MLAs are trained on the data from (K-1) groups, followed by validation and performance evaluation on the left-out group. This training and testing is repeated K times, where K is 2, 5 or 10 [38]. As a result, the entire data set is utilized to evaluate the performance of the MLA-based modelling. This makes K-fold CV an unbiased, thorough and mostly accurate method [34,36,37]. Results of K-fold CV, namely R², RMSE and AAPD of the employed MLAs, are summarized in Figs. 10, 11 and 12. The results indicate that IBk and LWL perform the best and worst overall, respectively. All the employed models have predicted the micro-porosity attributes with reasonable accuracy. MLP performs the best in predicting the number of pores, whereas IBk performs slightly better in predicting the average diameter and sphericity. The difference in the performances of IBk and MLP is found to be almost negligible. The non-parametric nature of its handling of the data set might have led IBk to perform better. Similar better performance of IBk is also reported in the literature [69,70]. Additionally, MLP has performed well due to its robust neural network architecture. However, the poor performance of LWL may be attributed to its piecewise problem simplification, which may have failed to tackle the non-linearity of the present problem [34,64]. Schaal et al. [64] had stated that LWL is
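The K-fold CV procedure described in this section can be sketched as follows. This is a minimal illustration only and not the Weka workflow used in the study: the data are synthetic (a hypothetical linear relation between heat input and pore diameter), the learner is a simple 1-D k-nearest-neighbour regressor standing in for an instance-based method, and AAPD is assumed to be the mean absolute percentage deviation. All function names are hypothetical.

```python
import math
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train_x, train_y, x, n_neighbors=3):
    """Predict by averaging the targets of the nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)

def cross_validate(xs, ys, k=5):
    """Train on K-1 folds, predict the left-out fold, repeat K times."""
    preds = [None] * len(xs)
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for i in fold:
            preds[i] = knn_predict(tx, ty, xs[i])
    return preds

def metrics(ys, preds):
    """Return R^2, RMSE and AAPD (assumed: mean absolute percentage deviation)."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    aapd = 100.0 / n * sum(abs(y - p) / abs(y) for y, p in zip(ys, preds))
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n), aapd

# Synthetic, illustrative data only (not the experimental measurements).
xs = [0.175 + i * (0.526 - 0.175) / 39 for i in range(40)]
ys = [30.0 + 120.0 * x for x in xs]
preds = cross_validate(xs, ys, k=5)
r2, rmse, aapd = metrics(ys, preds)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  AAPD={aapd:.2f}%")
```

Because every sample is held out exactly once, the three metrics are computed over the entire data set, which is the property of K-fold CV emphasized above.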

Performance evaluation of the MLAs through statistical analysis
The performances of the different employed MLAs are compared through the Friedman, Aligned Friedman and Quade statistical tests [71,72], as provided in Table 2. Table 2 suggests that the IBk and LWL algorithms have performed the best and worst, respectively. Moreover, the differences in performance among most of the employed MLAs are observed to be negligible in these statistical analyses. This shows that the employed MLAs are very competitive with one another in terms of quality of prediction.
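To indicate how a rank-based comparison of this kind works, a minimal sketch of the Friedman average ranks and test statistic is given below. The error values and the algorithm labels A, B and C are hypothetical placeholders, not the values of Table 2, and the sketch does not cover the Aligned Friedman or Quade variants or a tie correction of the statistic.

```python
def average_ranks(scores):
    """scores[i][j] is the error of algorithm j on problem i (lower is better).
    Returns the average rank of each algorithm; ties share the mean rank."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
            for t in range(pos, end + 1):
                rank_sums[order[t]] += shared
            pos = end + 1
    return [s / n for s in rank_sums]

def friedman_statistic(scores):
    """Friedman chi-square statistic computed from the average ranks."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)

# Hypothetical errors of three algorithms (columns A, B, C) on three attributes (rows).
demo_errors = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [0.9, 1.1, 1.0],
]
ranks = average_ranks(demo_errors)
chi2 = friedman_statistic(demo_errors)
print("average ranks:", [round(r, 3) for r in ranks])
print("Friedman statistic:", round(chi2, 3))
```

A lower average rank indicates a better-performing algorithm, and a small statistic indicates that the rank differences among the algorithms are minor.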

Performance evaluation of the MLAs through Monte-Carlo reliability analysis
The Monte Carlo technique utilizes random numbers to determine the reliability of different systems [73]. This is done by calculating the probability of success, p_occurrence, for a pre-defined condition, which is assumed to lie within the range of ±5% to ±30% in the present study. The reliability of the employed MLAs in predicting the different micro-pore attributes is analyzed through a new data set of 10,000 inputs generated using a random number generator. Equations (2)-(4) are then used to obtain the regression-equation-predicted micro-pore attributes. The different MLAs are also utilized to obtain the micro-pore attributes corresponding to the said 10,000 input parameters. The micro-pore attributes predicted using the MLAs are grouped into success and failure conditions. That is, if the predicted value exceeds the domain, it is considered a failure; otherwise, it is considered a success. Following this rule, p_occurrence is determined using Eq. (6), where N_success and N_total denote the number of successful and total scenarios, respectively. p_occurrence is determined over the pre-defined range, as shown in Fig. 13. The performances of MLP, IBk and RF are observed to be very close. On the other hand, LWL has performed the worst for the present problem. This is in line with the previous observations. Thus, the above-stated MLAs, having distinctly different backgrounds, have effectively modeled the micro-porosity attributes in EBW. MLP from the ANN background and IBk from the lazy-learner background are observed to perform well. On the other hand, the performances of SVR and LWL from the same ANN and lazy-learner backgrounds, respectively, are seen to be relatively less impressive.

Fig. 10 Performance comparison of the employed models during the prediction of pore number N_p using a R², b RMSE, c AAPD

Fig. 11 Performance comparison of the employed models during the prediction of average pore diameter D_p using a R², b RMSE, c AAPD

Fig. 12 Performance comparison of the employed models during the prediction of average pore sphericity S_phry using a R², b RMSE, c AAPD
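The success and failure counting used in the Monte-Carlo reliability analysis can be sketched as follows. This is a minimal illustration under stated assumptions: `reference_model` is a hypothetical stand-in for the regression equations, `surrogate_model` is a hypothetical stand-in for a trained MLA (here the reference value perturbed by random noise), and a prediction is taken to be a success when it lies within the chosen tolerance band around the reference value. Only the sample size of 10,000, the heat input range and the ±5% to ±30% bands are taken from the text.

```python
import random

def reference_model(e):
    """Hypothetical stand-in for a regression equation (not Eqs. (2)-(4))."""
    return 30.0 + 120.0 * e

def surrogate_model(e, rng):
    """Hypothetical stand-in for a trained MLA: reference value with 8% noise."""
    return reference_model(e) * (1.0 + rng.gauss(0.0, 0.08))

def success_probability(tolerance, n_total=10_000, seed=1):
    """Fraction of random inputs whose prediction stays within the tolerance band."""
    rng = random.Random(seed)
    n_success = 0
    for _ in range(n_total):
        e = rng.uniform(0.175, 0.526)  # heat input range in kJ/mm
        ref = reference_model(e)
        pred = surrogate_model(e, rng)
        if abs(pred - ref) <= tolerance * abs(ref):
            n_success += 1
    return n_success / n_total  # N_success / N_total

tolerances = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
reliability = {t: success_probability(t) for t in tolerances}
for t in tolerances:
    print(f"+/-{t:.0%}: p = {reliability[t]:.4f}")
```

The same seed is reused for every tolerance so that all bands are evaluated on identical random inputs, which makes the resulting probabilities directly comparable across the bands.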

Conclusion
An elaborate experimental study of micro-porosity during the EBW of AISI 304 stainless steel plates, followed by an extended machine learning-based modeling, is reported in the present study. The performances of these MLAs are also checked through statistical tests and Monte-Carlo reliability analysis. From the above study, the following conclusions are drawn:
1. The number and size of pores in the weld zone are found to increase initially until their threshold values are reached and then decrease with an increase in the heat input, due to the corresponding reduction in cooling rate, which results in the merger of a large number of smaller pores into a smaller number of bigger ones.
2. The average pore sphericity in the weld is experimentally observed to decrease from 0.803 to 0.680 with a corresponding increase in the average pore diameter from 32 to 72 µm; thus, an inverse relationship is established between them, which is in accordance with the existing literature.
3. Nitrogen and hydrogen gases are identified inside the pores through experimentally measured intensity peaks at Raman shift values of 2332 cm⁻¹ and 3020 cm⁻¹, respectively. The smaller and near-spherical pores are formed through nucleation and growth during solidification, whereas the bigger and irregular pores are most likely formed due to shrinkage. This observation is in line with the literature.
4. An increase in the cross-sectional melt-pool area from 12.14 to 20.71 mm², corresponding to an increase in heat input from 0.175 to 0.56 kJ/mm, has resulted in a decrease in cooling rate to such an extent that the gas bubbles have sufficient time to leave the melt-pool, thereby decreasing the number and size of the pores. This observation is also in line with the literature.
5. The weld micro-porosity attributes predicted by the machine learning algorithms have closely matched the experimental ones, and the models could be applied in industrial applications.
6. Reliable and robust modeling of weld micro-porosity is possible due to the use of multiple techniques from distinctly different backgrounds.
7. MLP and IBk are found to perform well, while the performances of SVR and LWL are observed to be inadequate.
8. Results obtained through statistical tests and Monte-Carlo reliability analysis endorse the superiority in performance of IBk and MLP over the rest.
An attempt will be made in future to investigate and optimize the influences of beam oscillation parameters on the porosity distribution of the weld using a rotatable central composite design of experiments. The effects of other parameters, such as vacuum level, working distance, fixture height, and electron beam diameter, on the porosity distribution of the weld will also be studied in future. The principle of deep learning will be used in future for data modeling.

Data availability
The data and materials will be made available to others on request, after the article is accepted for publication.

Declarations
Ethics approval
As this experiment is not conducted on human beings or animals, ethical approval is not required.

Consent to participate
All the authors express their consent to participate.

Consent for publication
All the authors express their consent to publish this work.