Neural Network Modeling of NiTiHf Shape Memory Alloy Transformation Temperatures

Data-driven techniques are used to predict the transformation temperatures (TTs) of NiTiHf shape memory alloy. A machine learning (ML) approach is used to overcome the high-dimensional dependency of NiTiHf TTs on numerous factors, as well as the lack of fully known governing physics. The elemental composition, thermal treatments, and post-processing steps that are commonly used to process NiTiHf and that affect the material phase transitions are used as input parameters of the neural network (NN) model to design the TTs. Such a feature selection led to the use of most of the accessible information in the literature on NiTiHf TTs, as all processing features are required to be fed into the NN model. Considering most of the regular NiTiHf processing factors also enables the option of tuning additional characteristics of NiTiHf in addition to the TTs. The work is unique in that all four main TTs and their associated peak transformation temperatures are predicted, giving complete control over the material phase change thresholds. Since 1995, extensive experimental research has been conducted to design NiTiHf TTs over a large temperature range of around 800 °C, providing the data needed to feed the ML algorithms of the current work. A thorough data collection is created using both unpublished data and available literature and then analyzed to select twenty input parameters to feed the NN model. To forecast the NiTiHf TTs, a total of 173 data points were gathered, verified, and selected. The model's overall determination factor (R²) was 0.96, suggesting the viability of the proposed NN model in capturing the link between material composition, processing factors, and the TTs of NiTiHf alloy. The effort additionally validates the generated results against existing data in the literature. The validation confirms the significance of the proposed model.


Machine Learning Approaches for Alloy Design
Developing an inclusive approach to model and formulate material design and behavior remains a challenge. ML approaches have the potential to significantly decrease the need for complex physical and mathematical considerations (Ref 1), as well as the number of physical experiments needed to optimize the process of materials design and discovery (Ref 2, 3). In addition, their capability for efficient, accurate predictions and for complex pattern recognition and regression analyses has made ML approaches good candidates in the area of material design and discovery.
ML is currently implemented to design new single-phase materials in many areas like thermoelectric  processing are an essential part of material development to create a multiphase structure reaching desirable performances. This suggests an enormous open space for material design through ML approaches to search and build an alloy design. Data from the literature (Ref 6, 16-21), theoretical models (Ref 22), computational approaches (Ref 23), data generated by physical experiments, or a combination of these sources are used to feed the ML algorithms. Experimental data are the most reliable to feed ML algorithms as they have been tested and measured. The present work focuses on collecting experimental data from literature and physical experiments. Among ML algorithms, the neural network (NN) is the most widely used model due to its ability to handle large data sets, as well as its high computational capacity and sophisticated algorithm architecture (Ref 8). NN methods have proved their validity in material design and optimization areas (Ref 24-37); therefore, the present work focuses on applying the NN method as well.

SMAs and TTs Design Parameters
Shape memory effect and superelasticity are the two principal features of SMAs. These effects result from a reversible microstructure transformation and rely on the phase transformation between austenite and martensite phases that occurs through applying stress or changing the temperature (Ref 38-40). These effects have attracted many industries to develop a wide range of applications from biomedical to aerospace (Ref 41-45). The phase TTs consist of four critical temperatures: martensite start (Ms), martensite finish (Mf), austenite start (As), and austenite finish (Af) temperatures. Ms is the temperature at which the austenite to martensite transformation begins, whereas Mf denotes the temperature at which the complete martensitic transition occurs. As and Af indicate the start and finish of the phase change from martensite to austenite as a result of heating (Ref 46). Martensitic peak (Mp) and austenitic peak (Ap) represent temperatures with maximum cooling or heating rate during the material thermal cycling, respectively. Transformation hysteresis (TH) is an important design factor that is defined as the difference between Af and Ms or between Mp and Ap, and it represents the thermodynamic efficiency of the SMA material. TH represents the amount of energy that the material dissipates throughout each cycle.
NiTi, as the most used SMA, exhibits shape memory effect and superelasticity that make it a suitable candidate for many applications. The binary NiTi alloys, on the other hand, are not appropriate for applications requiring high working temperatures due to the narrow range of transformation temperatures, which cannot surpass 120°C (Ref 42, 47-51). The addition of Hf as a ternary element to NiTi alloy has proved to be promising for future SMAs, as well as being less problematic or expensive than other elements (Ref 59, 60). NiTiHf shows good mechanical properties at a relatively low cost and offers a very wide range of TTs and the possibility to achieve greater strengths for repeatable functional performances with only heat treatments and without the need for cold work. These alloys have been classified as high-temperature SMAs (HTSMAs) (Ref 58). However, based on their elemental composition, they can offer low-transformation-temperature SMAs (LTSMAs) as well (Ref 62).
The key factor impacting the TTs is the elemental composition of NiTiHf alloys (Ref 63). It has been shown that an alloy with less than 10% Hf can have medium or very low TTs ranging from around −140 to 100°C, depending on the NiTi content (Ref 62). An increase in TTs of up to 644°C can be achieved by increasing the Hf level by more than 10% (Ref 58, 64). Alloys containing more than 50% Ni presented lower TTs, indicating that Ni content also has an impact on TTs (Ref 58, 65).
Apart from the chemical composition, heat treatments such as homogenizing, solutionizing, annealing, and aging, as well as mechanical post-processing such as hot rolling, cold rolling, and precipitation hardening, are the most common methods used to regulate the properties, including the TTs, of SMA alloys. The key characteristics impacting NiTiHf TTs are chemical composition, thermal treatments, and metalworking processes, which are the focus of the present work.
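The two ways of expressing the transformation hysteresis described above can be illustrated with a short sketch. The sign convention (heating-side temperature minus cooling-side temperature), the class and function names, and the sample values are all assumptions made for illustration; they are not taken from the data set.

```python
from dataclasses import dataclass


@dataclass
class TransformationTemps:
    """Transformation temperatures of one sample, all in degrees Celsius."""
    Ms: float  # martensite start
    Mf: float  # martensite finish
    As: float  # austenite start
    Af: float  # austenite finish
    Mp: float  # martensitic peak
    Ap: float  # austenitic peak


def hysteresis_start_finish(tt: TransformationTemps) -> float:
    """TH taken as Af - Ms (assumed sign convention: heating minus cooling)."""
    return tt.Af - tt.Ms


def hysteresis_peak(tt: TransformationTemps) -> float:
    """TH taken as Ap - Mp (same assumed sign convention)."""
    return tt.Ap - tt.Mp


# Made-up values for illustration only, not taken from the data set.
sample = TransformationTemps(Ms=150.0, Mf=130.0, As=165.0, Af=185.0,
                             Mp=140.0, Ap=175.0)
print(hysteresis_start_finish(sample))  # 35.0
print(hysteresis_peak(sample))          # 35.0
```

Because the model outputs all six temperatures, either definition of TH could be evaluated directly from a single prediction.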

Machine Learning to Design the NiTiHf TTs
The majority of the research employed experimental techniques to study the influence of different compositions, heat treatments, mechanical treatments, and other variables on the characteristics of NiTiHf TTs (see Appendix) or their thermomechanical behavior (Ref 77, 78). When it comes to alloy design, experimental approaches are expensive and time-consuming, and they may not completely cover the search space.
Correlating NiTiHf TTs with various post-processing parameters and elemental composition stresses the need for comprehensive modeling techniques. Such techniques should address the dynamic nature and multilevel relations between the material inputs, fabrication methods, heat treatment procedures, and the final alloy material performance.
While ML approaches are adaptable for various materials categories, they represent an excellent tool for modeling active materials such as SMAs. Due to the unique material reaction to heat or external stress, material modeling in active materials includes an extra fold of material behavior. Hence, the design space has an extra dimension compared to other material classes. Previous studies have shed light on the use of various ML systems in designing SMAs, as well as the resultant material and response characteristics (Ref 63, 79-83). A few works focused on using ML approaches to predict the NiTiHf TTs (Ref 63, 80, 84). A NN model was used to predict NiTi material properties such as TTs and strain recovery ratio with limited data points (Ref 84), and that work focused on relating four fabrication process parameters of the NiTi material to the strain recovery ratio and TTs. Other ML-developed models (Ref 63) used a physics-informed feature selection approach with a special focus on the material composition characteristics. This work takes a different approach from previous ones: the data gathered are about an order of magnitude larger than in (Ref 80, 84), and in contrast to (Ref 63), the approach of this work is to consider almost all the experimental steps that are commonly used in the processing or fabrication of NiTiHf, without necessarily reducing the number of features involved even if they have limited effects on TTs. The work aims to develop a tool to predict the TTs based on how NiTiHf has been heat-treated and processed up to now. This also opens the space for future works that will look at additional functional parameters of NiTiHf alongside the TTs. Additionally, a new way of NN feature preparation with respect to similar works (Ref 63, 80, 84) is presented, in which values of 1 or 0 are applied for certain heat treatments or post-processing steps that are performed or not performed, respectively.
In the present work, a NN ML model is developed to predict the TTs of NiTiHf based on given chemical compositions, heat treatments, and mechanical treatments. As described above, the uniqueness of the presented approach lies in the type of data points, the selected features, and the employed ML method. The data points investigated in this work contain a large amount of information to be related to the six TTs, namely Mf, Ms, As, Af, Mp, and Ap. The work targeted only experimental data as the most reliable type of data to feed ML algorithms. The data are gathered from almost all laboratories working on NiTiHf throughout the world (see Appendix). The findings and discussion sections of related studies are carefully evaluated to perform the feature selection and to obtain a total of 173 data points. Finally, the NN model was further validated by comparing the predicted values with the experimental works for several cases.

Methodology
A NN model is developed in the present study to predict NiTiHf TTs. The data set preparation, features selected, ML algorithms, and the error metrics are presented next.

Material Process Parameters Identification and Selection
The experimental data points used in this work are presented in the Appendix. Table 1 presents the features, their related values, and descriptions. The knowledge gained from the acquired experimental data points is utilized to identify the factors that are often employed to process NiTiHf and influence the TTs variation. The preliminary data set contains more than 250 data points, which were reduced to 170 by lowering the number of features following the procedures indicated below. The amount of data originally acquired was not greatly reduced because of this feature reduction. To enable the NN model to represent the features, the data set is validated following a set of rules:
• The mean value of the Ms and Mf or As and Af is considered in the data set if Mp or Ap are not reported.
• If there are more than two missing values in a data collection, it is removed.
• The effects of size, stress level, re-melting, oxygen level, and mechanical or thermal fatigue on the material are less reported in the experimental studies and are thus ignored.
• This study focuses on conventional manufacturing methods such as melting (Ref 85), induction melting (Ref 73), and plasma torch melting. Materials experiencing the same time-temperature profile result in similar TTs. This allowed for the elimination of these conventional approaches from the features. Suspended droplet alloying (SDA), a novel manufacturing method, has also been considered (Ref 86, 87). This method was shown to increase the TTs and was considered as a single input. Melt spinning is another manufacturing method that is assumed to be a combination of conventional melting and extrusion (Ref 88).
• Homogenization is a material preparation process that is not considered a feature as it is an obvious part of the material preparation process. This process is implemented at a temperature and time of around 1050°C and 70 h, respectively, with rapid cooling afterward.
• The re-melting process is commonly used in the fabrication of NiTiHf material. To maintain good homogeneity, some studies repeated the re-melting process several times (Ref 89). Due to a lack of associated data in many of the investigated experimental studies, the model does not contain re-melting as a feature.
• Aging is considered as any heat treatment with a temperature range of 300-900°C and a period of

Input values of 1 or 0 are assigned if a certain heat treatment or metalworking process was performed or not performed during the processing of NiTiHf, respectively. Assigning the value of 0 to an input with no meaningful value is common in NN modeling. The network will learn from exposure to the data that the value 0 means missing data and will begin to ignore it, which corresponds to the concept of a non-performed process on the material (Ref 91). Since the input parameters have different natures and different value ranges, a normalization process was performed to map all the input values to the range between 0 and 1.
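The data preparation rules and the 0/1 input encoding described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the dictionary layout of a data point, the use of `None` for an unreported value, and the choice of min-max scaling as the normalization are assumptions.

```python
from typing import Dict, List, Optional

# One data point: feature or output name -> value (None when not reported).
Row = Dict[str, Optional[float]]


def fill_missing_peaks(row: Row) -> Row:
    """Replace a missing Mp (or Ap) by the mean of Ms and Mf (or As and Af)."""
    out = dict(row)
    if out.get("Mp") is None and None not in (out.get("Ms"), out.get("Mf")):
        out["Mp"] = (out["Ms"] + out["Mf"]) / 2.0
    if out.get("Ap") is None and None not in (out.get("As"), out.get("Af")):
        out["Ap"] = (out["As"] + out["Af"]) / 2.0
    return out


def keep_row(row: Row, max_missing: int = 2) -> bool:
    """Keep a data point only if it has at most `max_missing` missing values."""
    missing = sum(1 for value in row.values() if value is None)
    return missing <= max_missing


def process_flag(performed: bool) -> int:
    """Encode a heat treatment or metalworking step as 1 (done) or 0 (not done)."""
    return 1 if performed else 0


def min_max_normalize(values: List[float]) -> List[float]:
    """Map one feature column to [0, 1] (assumed min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


# Made-up example values, not taken from the data set.
row = {"Ms": 200.0, "Mf": 180.0, "As": 210.0, "Af": 230.0, "Mp": None, "Ap": None}
filled = fill_missing_peaks(row)
print(filled["Mp"], filled["Ap"])                 # 190.0 220.0
print(keep_row(filled))                           # True
print(process_flag(True), process_flag(False))    # 1 0
print(min_max_normalize([300.0, 600.0, 900.0]))   # [0.0, 0.5, 1.0]
```

Whether the missing-value count is taken before or after the peak temperatures are filled in is not stated in the text; the sketch keeps the two steps as independent helpers so either order can be tried.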

Neural Network Model Formation and Evaluation Metrics
In the present work, a multilayer perceptron (MLP) NN was developed using the Levenberg-Marquardt optimization algorithm (Ref 92). The training was performed by updating weights and biases in each iteration to minimize the loss function resulting from discrepancies between the expected and predicted outputs. The set and the sigmoid activation function were implemented for the NN model. Cross-validation (Ref 93, 94) and loss functions (Ref 93, 94) were employed as the main metrics for model accuracy and model error, respectively. The data points were separated randomly into three sets for cross-validation: 70% training, 15% cross-validation, and 15% testing. The early stopping approach (Ref 93, 94) was used to end the model training: once the error for the validation data set started to increase, the weights and biases were updated for 50 more iterations before training was stopped.
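The random 70/15/15 split and the early stopping rule can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the helper names are hypothetical, the rounding of the subset sizes is a choice made here, and the stopping rule is interpreted as a patience of 50 iterations without improvement in the validation error.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_indices(n: int, train_frac: float = 0.70, val_frac: float = 0.15,
                  seed: Optional[int] = None) -> Tuple[List[int], List[int], List[int]]:
    """Randomly split indices 0..n-1 into training, validation, and test sets."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(train_frac * n)
    n_val = round(val_frac * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def early_stopping_epoch(val_errors: Sequence[float], patience: int = 50) -> int:
    """Return the epoch with the lowest validation error seen before stopping.

    Training is considered stopped once the validation error has failed to
    improve for `patience` consecutive epochs.
    """
    best_epoch, best_err, fails = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= patience:
                break
    return best_epoch


train, val, test = split_indices(173, seed=0)
print(len(train), len(val), len(test))  # 121 26 26

# Made-up validation error curve; a small patience keeps the example short.
print(early_stopping_epoch([5.0, 4.0, 3.0, 3.5, 3.6, 3.7], patience=3))  # 2
```

Changing `seed` reproduces the effect discussed later in the paper, where each run draws a different random partition and therefore yields slightly different predictions.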
The training was evaluated based on the overall determination factor, as defined below:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $i$ is the index of the data point, $y_i$ is the predicted value, $\bar{y}$ is the mean of the predicted values, and $\hat{y}_i$ is the target value. The ideal model is chosen by randomly altering the data sets and finding a model with a high and steady R². To avoid overfitting, the R² values for the training, validation, and testing sets are set to have a maximum difference of 0.1. The error metric or loss function considered in this work is the root mean square error (RMSE), as calculated below:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

The total number of data points is defined by $n$. RMSE measures how far a set of values deviates from the target values. To further analyze the model errors, the mean error value, μ, is also presented.
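The evaluation metrics can be computed in a few lines. The sketch below assumes the usual forms: R² as one minus the ratio of the residual sum of squares to the sum of squares about the mean (taken here as the mean of the predicted values, following the symbol definitions in the text), RMSE as the square root of the mean squared error, and the mean error as the average of predicted minus target. The function names are illustrative.

```python
import math
from typing import Sequence


def r_squared(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Determination factor, with the mean taken over the predicted values."""
    mean_pred = sum(predicted) / len(predicted)
    ss_res = sum((y - t) ** 2 for y, t in zip(predicted, target))
    ss_tot = sum((y - mean_pred) ** 2 for y in predicted)
    return 1.0 - ss_res / ss_tot


def rmse(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root mean square error between predicted and target values."""
    n = len(predicted)
    return math.sqrt(sum((y - t) ** 2 for y, t in zip(predicted, target)) / n)


def mean_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean of the signed errors (predicted minus target)."""
    return sum(y - t for y, t in zip(predicted, target)) / len(predicted)


# Made-up numbers for illustration only.
pred = [1.0, 2.0, 3.0]
targ = [1.0, 2.0, 4.0]
print(r_squared(pred, targ))  # 0.5
print(round(rmse(pred, targ), 4))  # 0.5774
print(round(mean_error(pred, targ), 4))  # -0.3333
```

A mean error near zero combined with a much larger RMSE, as reported for this model, indicates errors that are roughly symmetric about zero rather than a systematic offset.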
To determine the NN model architecture, model training was carried out repeatedly for multiple designs, ranging from one hidden layer with 1 neuron to two hidden layers with 40 neurons. The R² values for all six outputs are determined for each distinct model architecture. To measure the maximum, minimum, and average R² values, the MATLAB code loop runs the model 20 times for each distinct model design. Finally, the optimum architecture was chosen.
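The architecture search loop can be outlined as follows. The original loop was written in MATLAB; this Python sketch only mirrors its structure. The `train_and_score` callable is a hypothetical stand-in for one full training run returning an R² value, the dummy version below is a placeholder and not a real network, and ranking by average and then minimum R² is an assumption based on how the optimum is described.

```python
import random
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

Arch = Tuple[int, ...]  # neurons per hidden layer, e.g. (30, 25)


def search_architectures(train_and_score: Callable[[Arch, int], float],
                         candidates: Sequence[Arch],
                         n_runs: int = 20) -> Dict[Arch, Dict[str, float]]:
    """Train every candidate n_runs times and collect min, max, and mean R^2."""
    results = {}
    for arch in candidates:
        scores = [train_and_score(arch, run) for run in range(n_runs)]
        results[arch] = {"min": min(scores), "max": max(scores), "mean": mean(scores)}
    return results


def pick_best(results: Dict[Arch, Dict[str, float]]) -> Arch:
    """Rank by average R^2 first and minimum R^2 second (assumed criterion)."""
    return max(results, key=lambda a: (results[a]["mean"], results[a]["min"]))


def dummy_train_and_score(arch: Arch, run: int) -> float:
    """Placeholder so the sketch runs; NOT a real training routine.

    Returns a made-up score that peaks at 55 total hidden neurons.
    """
    rng = random.Random(sum(arch) * 1000 + run)
    return 0.96 - 0.001 * abs(sum(arch) - 55) + rng.uniform(-0.0002, 0.0002)


candidates = [(30, n2) for n2 in range(23, 28)]
results = search_architectures(dummy_train_and_score, candidates, n_runs=20)
print(pick_best(results))  # (30, 25)
```

In a real study, `dummy_train_and_score` would be replaced by a routine that draws a fresh random data split, trains the network, and returns the R² of interest.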

NN Model Results
The model architecture is finalized after several iterations to find the optimum number of neurons and hidden layers. The result for the optimum architecture of the NN model is presented in Fig. 1, and the NN model schematic is presented in Fig. 2. As shown in Fig. 1, the first layer has 30 neurons, and the number of neurons in the second layer ranges from 23 to 27. The study found that 25 neurons is the optimum number for the NN's second hidden layer: in that case, both the minimum and the average of R² reach their highest values. The model was also evaluated for various other NN settings; however, all of them yielded lower R² values. As a result, the best model structure is a two-layered setup with 30 and 25 neurons in the first and second hidden layers, respectively, as shown in Fig. 2. During training, the model updates 930 weights and 61 biases in each iteration. Figures 3 and 4 present the training results. The regression plot is presented in Fig. 3 for all the output data points as a united category and for each output: As, Mf, Af, Ms, Ap, and Mp. The overall R² value is 0.96, as shown in Fig. 3(a), and every single output presents a similar R² value ranging from 0.95 to 0.96 (Fig. 3b, c, d, e, f, and g). The difference in R² values was kept below 0.1 for the training, validation, and testing data sets to keep overfitting to a minimum. It should be mentioned that the R² values related to the training sets are higher than the overall one (R² = 0.96). The model shows good results based on the R² values and the data point distribution around the regression line describing the predicted temperature. To better investigate the model's accuracy, the study evaluated a histogram fit in Fig. 4 to assess the abundance of the predicted outputs with certain discrepancies from zero error.
The error is defined as the difference between the predicted and target transformation temperatures. The occurrence frequency of the error shows a normal distribution, meaning that most of the outputs are well predicted. The mean errors, μ, range between 0.3 and 2.1°C, and the RMSE ranges from 28.6 to 31.7°C. Considering that the transformation temperatures of NiTiHf range from −140 to 644°C, the mean error and RMSE are relatively low, and the model shows acceptable results. It is worth noting that TT discrepancies for SMA material with similar composition and processing in different laboratories showed values higher than the errors of the presented model. Ni50.3Ti29.7Hf20, and the TTs showed up to 60°C discrepancies. The same group further aged the material at 500 or 600°C and the TTs showed up to 40°C discrepancies. The trained model predicts with maximum errors that are equivalent to or lower than those reported in other laboratories for similar input, indicating that it is an effective TTs prediction tool. In addition, the present work intends to keep the major heat treatments and post-processing steps that are commonly performed in previous experimental works to treat NiTiHf, and therefore, no further feature selection is performed to improve the model accuracy. Such a model is not only a tool to predict the TTs but also opens the possibility of modifying the major processing parameters to tune other material properties.
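To put the reported RMSE in context, it can be expressed as a fraction of the TT span quoted above (−140 to 644°C). The normalization by the full span is a choice made here for illustration and is not a metric reported in the study:

```latex
\Delta T = 644 - (-140) = 784\ ^{\circ}\mathrm{C}, \qquad
\frac{28.6}{784} \approx 3.6\%, \qquad
\frac{31.7}{784} \approx 4.0\%
```

On this basis, the RMSE of each output corresponds to roughly 4% of the temperature range covered by the data.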
The results can be improved by adding more data points to enrich the model. This is possible by a continuous update of the data set by newly released data in the literature.

Model Validation
To verify the performances of the presented NN model, three cases are selected. The data points are selected in a way to have the outputs in at least two references: reference 1 (Ref 95) which was not used to feed the present model and reference 2 (Ref 74) which was involved in the model training. Table 2 presents the selected data points for the three cases, their features, and their corresponding TTs based on the references presented. The reason that the TTs for the same chosen features are different is due to the varying reported values in different studies.
The model ran 10 times for each case, and all 10 predicted outputs are reported along with the data from the references. The main reason for including all 10 prediction results is to highlight the solution area and confirm the robustness of the model against a random selection of the training, validation, and testing data points. Figure 5 presents the validation results. Considering case 1, the predictions for almost all the iterations have resulted in a value close to the references. The maximum discrepancy is less than 5%. In case 2, there is also good agreement between the model and the reference values. In case 2, three references are considered. In (Ref 74), aging treatments at both 500 and 550°C are reported and compared with the model results, which assumed the aging temperature to be 500°C for both cases. Considering case 3, the maximum discrepancies are higher than in the two previous cases but remain below 15%. The main reason for the larger discrepancies in case 3 is that the number of data points with similar input variables in the data set is limited, resulting in a larger prediction error after the model training. Considering the good agreement between the predicted and the target results, the trained model proves to be an effective tool for the prediction of the NiTiHf TTs. The mean of the 10 predicted values can be considered the final model output, as it presents the best match with the target values. The model is designed to select the training, validation, and testing data points randomly, which is the main reason that different predictions are given for each iteration. The model has been optimized to arrange each training, validation, and test set to obtain the optimal results with the minimum discrepancies. This leads to an accurate prediction, which is very similar to the references.
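Averaging the 10 runs and checking the largest deviation from a reference value can be sketched as below. The function names and the numbers are made up for illustration, and expressing the discrepancy as a percentage of the reference temperature in °C is an assumption, since the text does not state how the percentages were computed.

```python
from statistics import mean
from typing import Sequence


def ensemble_prediction(run_predictions: Sequence[float]) -> float:
    """Final model output taken as the mean over the repeated runs."""
    return mean(run_predictions)


def max_percent_discrepancy(run_predictions: Sequence[float], reference: float) -> float:
    """Largest absolute deviation from the reference, as a percentage of it."""
    return max(abs(p - reference) * 100.0 / abs(reference) for p in run_predictions)


# Ten made-up predictions of one TT (in degrees Celsius) and a made-up reference.
runs = [198.0, 202.0, 205.0, 195.0, 200.0, 201.0, 199.0, 203.0, 197.0, 200.0]
reference = 200.0
print(ensemble_prediction(runs))                 # 200.0
print(max_percent_discrepancy(runs, reference))  # 2.5
```

A percentage of a Celsius value depends on the temperature scale and grows without bound near 0°C, so an absolute discrepancy in °C may be the safer figure to report for low-temperature alloys.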
The design and arrangement of the data set for the training, validation, and testing data points are performed to check the possibility of accuracy enhancement. Such predetermined data set arrangements resulted in a possibility of higher accuracies for the present case scenarios. However, NN modeling is dynamic and requires a continuous data set enhancement as new data points are generated, and the final goal is a robust model that is independent of any data arrangement. Therefore, this work does not suggest any predetermined data set arrangement for accuracy enhancement. Alternatively, one of the most effective ways to improve the model's performance is to update the data points in parallel with future experimental research data.
NiTiHf TTs can be designed using this model as a reference. Almost all experimental procedures in the processing and/or fabrication that have been previously undertaken on this material to modify all the four TTs or other functional parameters are considered. The model offers a useful tool for checking multiple paths to certain TTs, which opens the possibility of fine-tuning the input parameters to acquire additional desirable material features. This research also includes the prediction of NiTiHf TH as a significant design factor. Furthermore, using 0 and 1 input values simplifies the model's use and allows easy generation of different pathways to investigate the associated TTs. The existing model serves as a starting point for future works that will include more material properties.

Conclusion
In this study, a NN model has been developed to predict the four main transition temperatures, as well as the two peak temperatures, of NiTiHf alloys with varying chemical compositions and post-processing treatments. The purpose of the feature selection was to introduce the 0 and 1 input values to make the model easier to use, as well as to make the most of the data available since 1995. A data collection of 173 data points has been generated, with previous experimental work information supporting 19 input and 6 output parameters. Two hidden layers with 30 and 25 neurons, respectively, were shown to be the optimum NN model design. For each of the outputs, the developed model demonstrated overall determination factors of around 0.95. The mean errors and the RMSE for each of the outputs ranged from 0.3 to 2.1°C and 28.6 to 31.7°C, respectively. The model was further validated by comparing the predictions to references both outside and inside the data set used for training and validating the model. The errors were equal to or smaller than the TT differences observed in several laboratories with similar chemical and processing parameters. As a result, the model is considered well-validated and reliable for estimating the TTs. One of the model's contributions is presenting a tool that can offer various tracks to achieve specified TTs, allowing the input parameters to be modulated to obtain additional desirable material outputs. This work also enables the prediction of the thermal hysteresis as a significant design component for shape memory alloys. This all highlights the capacity to create key material informatics approaches that make the most of the existing experimental data, which are both expensive and time-consuming to generate.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Conflict of interest
The authors declare that there are no conflicts of interest that could have appeared to influence the work reported in this paper.

Appendix
NiTiHf data set: the description of the most important features, the number of data points, and the related references.

In this study, Ni50TiHf alloys are studied with different amounts of Ti and Hf. The material is electric arc melted and re-melted three times for homogenization, followed by hot upsetting by 5-10% in a press, long-term annealing at 1073 K in argon, and quenching. (Ref 89)

This study has a significant effect on the data set as it presents NiTiHf shape memory alloys with low transformation temperatures. The study discusses that if the Hf amount is lower than 12%, NiTiHf can show very low transformation temperatures, and the related data are added to the article. In this study, only the martensitic finish transformation temperature is reported. The rest of the TTs are assumed based on the average difference of the TTs of the whole data set. (15 data points; Ref 62)

This study reported the TTs for Ni50TiHf with Hf amounts of 6, 8, and 10%. The materials are prepared with arc melting, followed by hot rolling, aging, or annealing, and the related TTs are plotted.

The effect of the oxygen level is discussed for NiTiHf.

The study implemented the melt spinning fabrication approach, followed by rapid quenching. Ni48.9TiHf with 5 different atomic percentages of Hf ranging from 8 to 20% are considered. (Ref 88)

In this study, Ni50.3Ti29.7Hf20 was manufactured with induction melting followed by homogenization at 1050°C for 72 h and then extruded at 900°C. The material was then aged at different temperatures ranging from 300 to 900°C, and the TTs were reported. Results confirmed a continuous TT decrease for aging at 400°C and a continuous TT increase for aging at 500°C with increasing aging time. (8 data points; Ref 74)

In this article, three different NiTiHf atomic percentages are investigated for the TT curves. The effect of thermal cycling is also investigated. The material is heat treated up to 60 times at 800 or 900°C. The study concluded that the TTs become stable after 20 thermal cycles. (3 data points; Ref 64)

In this article, NiTi70Hf material with different atomic percentages of Ni and Hf is fabricated by arc melting, and the related TTs are reported. The study is one of the rare ones that present Hf atomic percentages lower than 10%. (3 data points; Ref 97)

The study measured the TTs for Ni38Ti50Hf12.

This study considers the ternary NiTiZr and the effect of thermal cycling on the TTs. (1 data point; Ref 98)

In this study, Ni50.3Ti34.7Hf15 is fabricated by induction melting, homogenized at 1050°C, hot extruded at 900°C (7:1), heat-treated at 900°C for 1 h, and then aged at various temperatures and times. This study discusses the effect of aging on the TTs. (5 data points; Ref 75)

In this work, Ni50.3Ti34.7Hf15 is fabricated through arc melting, homogenization, solution treatment, hot rolling, and aging at 450 or 550°C for durations from 0.5 to 72 h. The data added to the model are those related to 3 h aging. It is observed that aging times of more than 10 h did not result in any significant changes in TTs. (3 data points; Ref 85)