Prediction of Glass Forming Ability of Bulk Metallic Glasses-Machine-Learning

Bulk metallic glasses (BMGs) are a fascinating class of metallic systems with remarkable corrosion resistance, elastic modulus and wear resistance, and evaluating their glass forming ability has been an active research topic for decades. In this work, machine-learning techniques, viz. artificial neural networks and random-forest based models, have been developed to predict the glass forming ability from the composition of the bulk metallic glassy alloy. A new criterion for classifying the atoms present in a bulk metallic glassy alloy is proposed. Feature importance analysis confirmed that the accuracy of the prediction depends mainly on the change in enthalpy of mixing and the change in entropy of mixing. Of the artificial neural network and random forest models developed, the former showed the more promising accuracy in predicting the glass forming ability (critical thickness). Validation against experimental critical thickness values demonstrated that the glass forming ability can be predicted using an artificial neural network given the elemental composition alone. A computational algorithm was also developed to classify the atoms in a given alloy as big or small. The outcome of this algorithm was used by the models, which were trained on experimental data.


INTRODUCTION
Bulk metallic glasses (BMGs) are a class of advanced materials known for their superior corrosion resistance, fracture toughness and wear resistance [1][2][3]. The glass forming ability (GFA) is usually understood as the ease of formation of the vitrified structure. It is measured in terms of the critical thickness, i.e., the largest section thickness of the alloy beyond which the ordered structure starts to nucleate. BMGs with high GFA exhibit a large critical thickness (or critical diameter, Dmax) [1][2][3][4][5][6][7][8][9]. Many studies have focused on establishing criteria for the GFA. These criteria were mostly based on critical temperatures, thermodynamic parameters, topological parameters, etc. [5][6][7][8][9]. While several research works continued to focus on the glass forming ability, few performed computational studies on predicting the GFA of BMGs. Sun et al. [10] developed a machine-learning model built on a support vector machine (SVM) classifier, which predicts the GFA of a BMG given its composition. The model was developed using thermodynamic and physical parameters. However, the predictions of the model [10] were confined to binary alloys, and its output was a classification of binary alloys into 'good' or 'bad' glass formers. Their model could not quantitatively predict either the critical thickness or the critical cooling rate of a BMG.
Ward et al. [11] extensively studied the prediction of GFA using the random forest approach of machine learning, with the elemental composition of the BMGs as the input. Three models were developed, viz. a glass forming ability model (which qualitatively classifies good/bad glass formers), a critical casting thickness model (which predicts Dmax quantitatively) and a model for predicting the supercooled liquid range.
Interestingly, configurational entropy, which turned out to be the most important feature in this study, was not considered in any of their models. Another independent investigation by Majid et al. [12] demonstrated a few algorithms, viz. support vector regression (SVR), artificial neural network (ANN) and general regression neural network (GRNN), using the characteristic temperatures of the BMGs as the input parameters. However, the accuracies of the models developed were not satisfactory. Another limitation of these models is that the temperatures used as input parameters can be measured only after a BMG alloy has been synthesized experimentally. Hence the prediction models proposed by Majid et al. [12] cannot be used for designing virtual BMG alloys.
Several research works [1][2][3][4][5][6][7][8][9][10] showed that the GFA depends on thermodynamic and physical parameters such as the enthalpy of mixing (ΔHmix), the difference ratio in electronegativity (∆e), the difference ratio in atomic size (∆d) and the critical characteristic temperatures of the BMGs. Therefore, in the present work, two different prediction approaches have been developed, which take an alloy composition as input and give Dmax as output. According to the approach reported by Cai et al. [9], the constituent atoms must be classified as big/small in order to compute ∆e and ∆d.
On the other hand, in the existing classification by Inoue et al. [2,13], the atoms (of elements, irrespective of their presence in the alloy) are grouped into small, medium and large. Such a classification becomes ambiguous for some BMGs, in which all the constituent atoms of the alloy fall into only one category (only big, medium or small atoms). Therefore, in the present work a new method named 'radius range theory of classification' is proposed and demonstrated with reference to the classification reported by Inoue et al. [2,13]. The proposed classification method also uses a computational approach to classify the atoms into big/small/medium, irrespective of the number of elements. The result of this classification has been further used to compute the ∆e and ∆d values used in the prediction models (to predict Dmax) developed in this work.
Given the alloy composition, an algorithm computes the values of ∆Hmix, ∆Smix, ∆d and ∆e for the alloy. The prediction models (artificial neural network; random forest) take these computed values as inputs and give the Dmax of the alloy as output. The prediction accuracy of both methods has been critically validated against experimental values of Dmax for alloys taken from the literature.

Radius range theory of classification
The classification is based on the empirical criterion proposed by Inoue et al. [2,13] that, in a multicomponent BMG system, the atomic radii difference should be greater than 12%. The classification proposed in this work is explained with an arbitrary quaternary alloy containing atoms A, B, C and D with different radii, as shown in Fig. 1.

Algorithm of the radius range theory
Step-1: Identify the atom to be compared (e.g., "A").
Step-2: Calculate the radius range (0.88RB ~ RB; 0.88RC ~ RC and 0.88RD ~ RD) for the elements other than "A".
Step-3: Compare the radius of A, i.e., RA, with the radius range of each of the other elements:
  If RA lies within the radius range of an atom: assign "0"
  If RA is less than the radius range of an atom: assign "-1"
  If RA is greater than the radius range of an atom: assign "+1"
Step-4: Sum up the values assigned in Step-3.
Step-5: Sum > 0, inferred as big atom (i.e., atom "A" is bigger than all other atoms). Sum ≤ 0, inferred as small atom (i.e., atom "A" is smaller than all other atoms).
In the above example:
  i. RA < radius range of atom "B", therefore assign "-1"
  ii. RA lies in the radius range of atom "C", therefore assign "0"
  iii. RA < radius range of atom "D", therefore assign "-1"
Summing the values assigned in (i), (ii) and (iii) gives (-1) + 0 + (-1) = -2, which is < 0; hence atom "A" is classified as a small atom.
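The five steps above can be sketched in a few lines of Python. This is a minimal illustration and not the program supplied with this work: the function name `classify_atoms` and the `tolerance` parameter are assumptions introduced here, and the function takes only the atomic radii, since the steps themselves do not use the concentrations. The radii in the usage line are those quoted for Cu57.5Zr37.5Ga5 in the demonstration section.

```python
def classify_atoms(radii, tolerance=0.12):
    """Label each element 'big' or 'small' using the radius range rule.

    radii: dict mapping element symbol -> atomic radius (any consistent unit).
    tolerance: fractional width of the range; 0.12 gives 0.88*R .. R (assumed name).
    """
    labels = {}
    for atom, r_atom in radii.items():
        score = 0
        for other, r_other in radii.items():
            if other == atom:
                continue
            lower, upper = (1.0 - tolerance) * r_other, r_other
            if r_atom < lower:
                score -= 1  # below the other atom's radius range
            elif r_atom > upper:
                score += 1  # above the other atom's radius range
            # inside the range: contributes 0
        # Step-5: a positive sum means big, zero or negative means small
        labels[atom] = "big" if score > 0 else "small"
    return labels


# Radii in angstrom, as quoted in the demonstration section of this paper
print(classify_atoms({"Cu": 1.28, "Zr": 1.60, "Ga": 1.35}))
# {'Cu': 'small', 'Zr': 'big', 'Ga': 'small'}
```

For this alloy the sketch returns small, big and small for Cu, Zr and Ga, which matches the classification reported in the demonstration section.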

Dataset
A dataset of 400 BMGs was collected from the work of Long et al. [1]. It contains

Feature Engineering
The characteristic temperatures of a BMG, such as Tx, Tg and Tl, cannot be used as input features because they are known only after the BMG has been synthesized. Input features that can be calculated without producing the BMG are therefore needed. ∆Hmix (change in enthalpy of mixing), ∆Smix (change in entropy of mixing), ∆d (difference ratio in atomic size) and ∆e (difference ratio in electronegativity) are used as input features of the ANN (deep-learning) and random forest (machine-learning) models.
where ΔHi is the enthalpy of the i-th element and xi is its atomic percentage.
where xi is the atomic percent and di is the radius of the i-th bigger element, whereas the index j refers to the smaller elements.
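The expression used for ∆Smix is not reproduced in this section. A common choice for the configurational entropy of mixing is the ideal-solution form, ∆Smix = -R Σ ci ln ci, where ci is the atomic fraction of element i and R is the gas constant. The sketch below assumes that form, which may differ from the one actually used in this work; the function name `ideal_mixing_entropy` is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def ideal_mixing_entropy(composition):
    """Ideal configurational entropy of mixing, -R * sum(c_i * ln c_i).

    composition: dict mapping element symbol -> atomic percent (or any
    proportional amount); values are normalized to fractions internally.
    """
    total = sum(composition.values())
    fractions = [amount / total for amount in composition.values() if amount > 0]
    return -GAS_CONSTANT * sum(c * math.log(c) for c in fractions)


# Composition of one of the alloys discussed in this paper (atomic percent)
print(round(ideal_mixing_entropy({"Cu": 57.5, "Zr": 37.5, "Ga": 5.0}), 3))
```

Under this assumed form the value depends only on the composition, so it can be computed before an alloy is synthesized, as the feature selection above requires.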
The input features are normalized using StandardScaler (sklearn), since having the features on the same scale helps the ANN converge faster towards the minimum during gradient descent.
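StandardScaler subtracts the mean of each feature and divides by its population standard deviation. The dependency-free sketch below performs the same transformation so that the arithmetic is explicit; the function names and the toy feature rows are illustrative assumptions, not values from the dataset. The statistics are fitted on the training rows only and then reused for any other rows.

```python
from statistics import fmean, pstdev


def fit_standardizer(rows):
    """Return per-column means and population standard deviations."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; divide by 1 instead, as StandardScaler does
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply z = (value - mean) / std column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy rows (made-up numbers), one column per input feature
train_rows = [[-10.0, 8.0], [-20.0, 10.0], [-30.0, 12.0]]
means, stds = fit_standardizer(train_rows)
scaled = standardize(train_rows, means, stds)
print(scaled)
```

After the transformation each training column has zero mean and unit standard deviation, so no single feature dominates the gradient updates purely because of its units.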

Demonstration of the radius range theory of classification
Three BMG alloys, Cu57.5Zr37.5Ga5, Zr50Cu43Ag7 and Pd77Cu6Si17, were picked at random to demonstrate the classification algorithm described above. The atomic radii of Cu, Zr, Ga, Ag, Pd and Si are 1.28 Å, 1.6 Å, 1.35 Å, 1.44 Å, 1.37 Å and 1.11 Å, respectively [2]. When the atoms in the Cu57.5Zr37.5Ga5 alloy are classified using the proposed method, Cu, Zr and Ga are classified as small, big and small atoms, respectively, whereas under Inoue's method they all fall under medium-sized atoms. Similar computations were carried out for the other two BMG alloys, and the resulting classification of the elements was compared with that reported by Takeuchi and Inoue [13]. The proposed classification differs from that of Takeuchi and Inoue [13], as the former is based on the relative sizes of the atoms present in a BMG.
Further, a Python program based on the proposed classification was developed in this work (code enclosed as a supplementary document). The program takes the elements and their concentrations as inputs and returns the classification of the elements in any multicomponent BMG alloy, as shown in Fig. 2. The outputs, i.e., the big/small atom classification, can be used to calculate ∆e and ∆d, which in turn enter the assessment of the glass forming ability of the alloy studied.

Multi-Collinearity Check
An important aspect to be considered during modelling is the multicollinearity between the chosen input parameters. High multicollinearity implies interdependency between the input parameters, which violates the assumption that all parameters are independent of each other. The collinearity was assessed using Eq. (5), where 'r' represents the collinearity factor between parameters 'x' and 'y', which can be any two of the input parameters ∆Hmix, ∆Smix, ∆e, ∆d. Fig. 4 shows the collinearity matrix constructed between ∆Hmix, ∆Smix, ∆e and ∆d. It can be seen that the collinearity between any two of the parameters is not high, i.e., r < 0.7.
This means that there is no significant interdependency between any two of the parameters ∆Hmix, ∆Smix, ∆e, ∆d; therefore none of them was excluded in the prediction of Dmax.
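Eq. (5) is not reproduced in this section. The sketch below assumes that the collinearity factor r is the Pearson correlation coefficient, which is the usual choice for such a matrix, and it compares the magnitude |r| with the 0.7 threshold. The function names and the toy feature columns are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(features, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(features)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(features[a], features[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Toy columns (made-up numbers): x2 nearly duplicates x1, x3 is unrelated
toy = {"x1": [1, 2, 3, 4], "x2": [2, 4, 6, 8.5], "x3": [1, -1, -1, 1]}
print(collinear_pairs(toy))
```

An empty list from `collinear_pairs` corresponds to the situation reported here, where no pair of input parameters reaches the threshold and all four are retained.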

Modelling
The next step is to build machine learning models to predict the Dmax of the BMGs. We
The R-squared score was found to be 0.868 on the training set and 0.71 on the test set, and the MSE was found to be 5.12 on the test set, using the optimized Random Forest Regressor model. The Random Forest Regressor gave better results than the classical machine learning models.
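For reference, the two scores quoted above can be computed as follows. The sketch assumes the standard definitions, i.e., the mean of the squared residuals for the MSE and 1 - SS_res/SS_tot for the R-squared score; the example values are made up and are not taken from the dataset.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of squared differences between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measured and predicted values, for illustration only
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(mean_squared_error(measured, predicted))  # 0.125
print(r_squared(measured, predicted))           # 0.9
```

A gap between the training and test R-squared scores, as reported above, indicates that the model fits the training alloys more closely than it generalizes to unseen ones.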

Artificial Neural Network method
An

RESULTS AND DISCUSSION
The independent influence of each parameter on the accuracy of the model was investigated. In this process, one parameter at a time was excluded and the model was rebuilt with the remaining parameters. The drop in performance of each such model was used to calculate the influence of the excluded feature on the model. Fig. 6 shows the % importance of each parameter in decreasing order, obtained by this method. It was observed that ∆Smix is the most vital parameter in determining the accuracy of the model, followed by ∆Hmix and the others. This observation is in good agreement with the thermodynamics of BMGs reported in the literature. Further, earlier studies neither analysed multicollinearity nor conducted feature importance analysis, both of which are crucial for identifying the parameters that most strongly influence the GFA or Dmax of a BMG. The present model was successfully tested on various BMG alloys that were not used for training. The present work has shown promising results irrespective of the alloy's chemistry, i.e., whether it is Pd-based, Fe-based, Mg-based, Cu-based or Zr-based.
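The leave-one-feature-out procedure described above can be sketched as follows. This is an illustration under stated assumptions and not the code used in this work: a simple lookup-table regressor with in-sample error stands in for the ANN and random forest models, the synthetic target is constructed so that x1 matters most and x3 not at all, and all function and feature names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def group_mean_mse(rows, targets, columns):
    """In-sample MSE of a lookup-table model that sees only `columns`."""
    groups = defaultdict(list)
    for row, y in zip(rows, targets):
        groups[tuple(row[c] for c in columns)].append(y)
    means = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    errors = [(y - means[tuple(row[c] for c in columns)]) ** 2
              for row, y in zip(rows, targets)]
    return sum(errors) / len(errors)


def drop_one_importance(rows, targets, names, error_fn):
    """Percent importance from the error increase when each feature is dropped."""
    all_cols = list(range(len(names)))
    baseline = error_fn(rows, targets, all_cols)
    increase = {}
    for i, name in enumerate(names):
        kept = [c for c in all_cols if c != i]  # rebuild without feature i
        increase[name] = max(error_fn(rows, targets, kept) - baseline, 0.0)
    total = sum(increase.values())
    percent = {n: (100.0 * v / total if total else 0.0) for n, v in increase.items()}
    return dict(sorted(percent.items(), key=lambda kv: kv[1], reverse=True))


# Synthetic data: every 0/1 combination of three features, y = 4*x1 + x2
rows = list(product([0, 1], repeat=3))
targets = [4 * x1 + x2 for x1, x2, _ in rows]
importance = drop_one_importance(rows, targets, ["x1", "x2", "x3"], group_mean_mse)
print(importance)
```

In practice `error_fn` would retrain the actual prediction model on the reduced feature set and evaluate it on held-out alloys; the normalization to percentages then gives a ranking of the kind shown in Fig. 6.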

CONCLUSIONS
The radius range classification demonstrated in this work, which is based on the relative sizes of the constituent elements in a BMG alloy, overcomes the ambiguity in the contemporary theories. The algorithm performs the classification independently of the alloy system.
The multicollinearity study revealed that there is no significant interdependency between any two of the parameters ∆Hmix, ∆Smix, ∆e, ∆d; therefore, none of them was excluded in the prediction of Dmax.
Feature importance analysis confirmed that the accuracy of the prediction models depends mainly on ∆Hmix and ∆Smix. Of the ANN and random forest models developed, the ANN (a deep-learning model) shows more promising accuracy in predicting Dmax than the random forest (a machine-learning model), as is evident from the validation against experimental values.
It has been successfully demonstrated that the glass forming ability of a BMG can be predicted using machine learning, given the elemental composition alone. Hence the approach can be used for developing virtual BMGs.

DATA AVAILABILITY
The processed data required to reproduce these findings are available to download from [https://github.com/jaideep99/Atom-Classification].

FUNDING
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.