Response Surface Mesh with the Outer Input Method


Artificial intelligence in general, and optimization tasks applied to the design of aerospace, space, and automotive structures in particular, rely on response surfaces to forecast the output of functions; response surfaces are a vital part of these methodologies. Yet they have important limitations: greater precision requires greater data sets, so training or updating larger response surfaces becomes computationally expensive, sometimes unfeasible. This bottleneck has limited the achievement of more promising results, rendering many AI-related tasks inefficient. To address this challenge, a new methodology created to segment response surfaces is hereby presented. Unlike other similar methodologies, the novel algorithm presented here, named the outer input method, has a very simple and robust operation. With only one operational parameter, the maximum element size, it efficiently generates a near-isopopulated mesh for any data set with any type of distribution, such as random, Cartesian, or clustered, for domains with any number of coordinates. It is thus possible to simplify response surfaces by generating an ensemble of response surfaces, here denominated a response surface mesh. This study demonstrates how a metamodel denominated Kriging, trained with a large data set, can be simplified with a response surface mesh, significantly reducing its often expensive computation costs: experiments presented here achieved a speed increase of up to 180 times while using a dual-core parallel processing computer. This methodology can be applied to any metamodel, and metamodel elements can be easily parallelized and updated individually. Thus, the already faster training operation can be accelerated further.


Introduction
Artificial intelligence (AI) presently plays an important role in society, due to its capacity to automate several complex or laborious tasks which before were only accomplished by humans. Present design practices for aeronautical, space, and automotive structures also rely on metamodels used in AI to achieve approximations of precise and expensive models. Several metamodels are available in the public domain, yet their precision and computation cost are often a subject of research.
The outer input method (OIM) is a novel algorithm, described here, which with a very simple procedure creates a mesh for a data set with any type of distribution or number of coordinates. It can be applied to any metamodel; thus, expensive metamodels can be segmented and have their training time significantly reduced without compromising accuracy.

Related work
In the field of AI, Rumelhart et al. [1] described the backpropagation method, which is applied to train multi-layer artificial neural networks (ANNs). An ANN is a metamodel consisting of a mathematical model that mimics the flow of information observed in the organic neuronal tissue that gives many living species their intelligence. This methodology has been very efficient in text, image, and speech recognition, among other applications. Among their generalized applications, ANNs can be applied to forecast the response of a mathematical function, called regression, and to classify categories.
Deep learning (DL) [2] is a technique described by LeCun, Bengio, and Hinton, in which many layers of ANNs are trained over a data set to recognize several levels of patterns. Unlike conventional ANNs, it can be trained in an unsupervised fashion, and it has many important applications in the automation of complex and challenging tasks, substituting human work in many fields of activity with similar or greater levels of intelligence. DL and other areas of AI have as their basic building blocks response surfaces such as ANNs, Kriging (KR) [3], support vector machines (SVM) [4], polynomial regression [5], and radial basis functions [6], among others, and often require large parallel computing infrastructure and large data sets for their often computationally expensive training tasks.
In optimization, a metamodel, also called a response surface or surrogate model, can be applied to reduce computation costs by dismissing unpromising candidate solutions and suggesting promising alternatives. To evaluate the efficiency of response surfaces, many benchmark functions have been created [6].
Yet precise metamodels often require large data sets and are expensive in computation cost. Several studies aiming to reduce the training time of KR have been published. Fuhg et al. in 2020 [12] published a review of the state of the art in adaptive sampling methods for KR. Bouhlel et al. researched ways to reduce the computation cost of the KR model [13] by substituting the matrix inversion in the KR methodology with a kernel with few parameters defined by least squares, and by using adaptive sampling to avoid overloading the KR model. In another study, van Stein et al. [14] described several methodologies to segment a KR model into clusters. Some of these techniques are based on slicing the domain in a Cartesian segmentation, yet such methodologies have limited application to data sets with clustered distributions. Other methodologies, as described by Wang and Simpson [15], include a fuzzy analysis of the landscape of the function to segment it into locally optimal regions. Yet these proposed methodologies cannot be applied to every type of data set, and they have several operational coefficients which must be adjusted by the user according to each function.
Wang et al. [16] in 2017 explored a similar application of data sampling selection, including domain landscape analysis for KR clustering applied to optimization tasks. Liu et al. [17] described experiments to reduce the computation time of the KR model for large data sets by applying global and local sparse approximations of the overall inputs. Other alternatives presented include selecting subsets of the data set to reduce KR computation costs and nesting KR models, as described by Bachoc et al. [18] in 2021. A popular methodology in the same field of research is adaptive sampling [19], which Chellappa et al. in 2020 applied to the reduced basis method; this adaptive sampling consists of a surrogate error model generated to efficiently create subsamples of the parameter domain. Many other publications on the topic aim to reduce metamodel computation costs, most often based on some type of clustering or selective sampling of the data set. Yet these methods often do not have a very simple implementation, requiring many operational parameters to be adjusted according to the data set, or they cannot be applied to every type of data set distribution.

Methodology
The OIM generates a response surface mesh (RSM) for a normalized data set with a Cartesian, random, or clustered distribution by separating the data set into elements within regions of similitude. This is achieved by consecutively dividing elements of data considered large into two elements, along an approximate middle section of the design space. For this, the data set must be normalized, with each input coordinate scaled between 0 and 1, as sketched below.
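As a minimal illustration of this normalization step (a sketch in Python; the function name and the handling of constant columns are assumptions made here, not part of the original method):

```python
import numpy as np

def normalize(X):
    """Min-max scale each coordinate of X (an n-by-d array) to [0, 1].
    Constant columns are mapped to 0 to avoid division by zero -- an
    implementation choice assumed here, not specified by the method."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)
```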
The advantage of this method is that it is very simple and robust, working equally well with data sets of any number of coordinates or type of distribution, while refining the mesh in regions of the domain with greater data density. By not overpopulating elements, it facilitates the generation of simplified response surface segments, and it allows mesh refinement to be applied in regions of greater interest simply by populating those regions. It also allows parallelization, and individual mesh elements can be updated separately.
The algorithm starts by taking the whole data set as the initial element. If the element size, in terms of the number of inputs, is greater than the maximum element size, the element is sectioned. To section the element, it is first necessary to calculate the average position of the element's inputs, also called the center position (CP). Secondly, the distance of each input to the CP is measured, and the point with the greatest distance to the CP is denominated the outer input (OI). It provides a rough estimation of the direction in which the variance of the input positions is expected to be greatest. The vector from the CP to the OI is denominated the outer input vector (OIV), and it is normal to a plane denominated the approximate middle plane of the data set, which is used to divide the data set into two elements. The third step is to identify which points of the data set belong to each side of the plane, generating the element segmentation.
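A single split step might then look as follows (a sketch, assuming the middle plane passes through the CP, which the text implies but does not state explicitly; all names are illustrative):

```python
import numpy as np

def split_element(X):
    """One OIM split: X is an (n, d) array of normalized inputs.
    Returns two index arrays, one per side of the approximate middle
    plane, which is normal to the outer input vector (OIV) and is
    assumed here to pass through the center position (CP)."""
    cp = X.mean(axis=0)                    # center position (CP)
    dist = np.linalg.norm(X - cp, axis=1)  # distance of each input to CP
    oiv = X[np.argmax(dist)] - cp          # outer input vector (OIV)
    side = (X - cp) @ oiv                  # signed side of the middle plane
    return np.where(side >= 0)[0], np.where(side < 0)[0]
```

The dot product against the OIV gives the signed side of each point directly, so no explicit plane equation is needed.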
This process is repeated for each generated element until every element has a local population lower than the maximum allowed size. As an alternative that can be implemented, it is possible to define a minimum element size to avoid errors in the computation of each response surface element; if any element has a population lower than the minimum allowed size, additional points are taken from the adjacent element.
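The full segmentation can then be written as a simple loop over a stack of elements (again a sketch under the same assumptions; the guard against degenerate splits is an implementation detail added here, not described in the paper):

```python
import numpy as np

def oim_mesh(X, max_size=50):
    """Recursively split any element larger than max_size by the plane
    normal to its outer input vector. Returns a list of index arrays,
    one per mesh element."""
    elements, stack = [], [np.arange(len(X))]
    while stack:
        idx = stack.pop()
        if len(idx) <= max_size:
            elements.append(idx)
            continue
        P = X[idx]
        cp = P.mean(axis=0)
        oiv = P[np.argmax(np.linalg.norm(P - cp, axis=1))] - cp
        side = (P - cp) @ oiv
        left, right = idx[side >= 0], idx[side < 0]
        if len(left) == 0 or len(right) == 0:
            elements.append(idx)   # degenerate split (e.g., duplicate points)
        else:
            stack.extend([left, right])
    return elements
```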
Once all mesh elements are generated, each polarization vertex (PV) is defined as the average position of all inputs of a single element. Each polarized mesh element (PME) is then defined by the points of the data set which are closest to its PV, as in the Voronoi diagram, also called the Dirichlet partition. A given input's forecast is then calculated by the response surface element trained with the PME whose PV is closest to the input. Another possible variation is to substitute the outer input vector with the eigenvector of the covariance matrix of the data set.
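In code, the PV definition and the nearest-PV routing might look as follows (a sketch; the function names are illustrative, not the authors'):

```python
import numpy as np

def polarization_vertices(X, elements):
    """One PV per element: the mean position of the element's inputs."""
    return np.array([X[idx].mean(axis=0) for idx in elements])

def nearest_pv(x, pvs):
    """Index of the PME whose PV is closest to the query point x,
    i.e., the Voronoi/Dirichlet cell that owns x."""
    return int(np.argmin(np.linalg.norm(pvs - x, axis=1)))
```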
The OIM is presented in Algorithm 1. The OIM can also be applied to generate a mesh for a finite element model or similar mathematical models, to discretize the domain of a function and identify its most competitive outputs, and in other applications which require domain partitioning. Figure 1 displays the elements generated by the OIM segmentation, and Figure 2 displays the respective generated PMEs. Figure 3 displays the OIM segmentation of a data set with a cluster distribution.

Numerical experiment
In order to compare the efficiency of the response surface segmented by the OIM against the non-segmented surface, three benchmark functions used for this purpose are selected [6]. The regression metamodels applied in this experiment are the SVM, which has a fast response and reasonable accuracy, and KR, also called the Gaussian process, very popular for its great accuracy yet very expensive for larger data sets. Each function has its 2D domain defined on a Cartesian grid of 71 x 71 points, and the performance parameters, including the overall training time, are presented in Tables 1 to 3. The metrics used to measure performance are the average error of the output and the variance of the error. Further parameters are the R square, RMAE, and RAAE, which are described by Jin, Chen, and Simpson [21]. It is important to note that for the RSM, each PME of the metamodel was trained in parallel using a quad-core computer, and the overall training time includes the mesh generation. The OIM's only operational parameter, the maximum element size, is set to 50 inputs in this experiment. The metamodels selected are the SVM and KR, and their OIM combinations. The performance comparison between the OIM-segmented and non-segmented models is presented in Tables 1 to 3.

Table 3 - Results for the custom probability function.

From Tables 1 to 3 it is possible to see that the segmentation increased the precision of the SVM, yet the mesh generation increased its evaluation time from less than one second to about 7 seconds. In Table 3, the SVM gave a large average error of 34%, yet its OIM combination gave an average error of 0.6%, indicating the OIM can benefit SVM models. All other metrics indicate the same: the combination of the SVM with the OIM benefits precision without significantly increasing computation time.
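As a hedged illustration of how such an RSM could be assembled with off-the-shelf tools, the sketch below uses scikit-learn's GaussianProcessRegressor as a stand-in for the KR model and joblib for the per-element parallel training; neither is confirmed as what the study used, and oim_mesh and polarization_vertices refer to the sketches in the Methodology section:

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.gaussian_process import GaussianProcessRegressor

def fit_rsm(X, y, elements, n_jobs=4):
    """Train one local surrogate per mesh element, in parallel."""
    def fit_one(idx):
        return GaussianProcessRegressor().fit(X[idx], y[idx])
    return Parallel(n_jobs=n_jobs)(delayed(fit_one)(idx) for idx in elements)

def predict_rsm(Xq, models, pvs):
    """Route each query to the element with the closest PV and
    predict with that element's local surrogate."""
    d = np.linalg.norm(Xq[:, None, :] - pvs[None, :, :], axis=2)
    owner = d.argmin(axis=1)
    yq = np.empty(len(Xq))
    for k in np.unique(owner):
        mask = owner == k
        yq[mask] = models[k].predict(Xq[mask])
    return yq

# Usage (given training data X, y and queries Xq):
#   elements = oim_mesh(X, max_size=50)
#   pvs = polarization_vertices(X, elements)
#   models = fit_rsm(X, y, elements)
#   y_hat = predict_rsm(Xq, models, pvs)
```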
The KR model demonstrates in this experiment why it is so popular among metamodels: it is very precise, as seen from the average error metric. Yet it is also very expensive in terms of computation time, ranging from 38 minutes up to 83 minutes for training on the 71 x 71 domain. Its combination with the OIM significantly reduces training times to 13 seconds or less, a speed increase of up to 567 times. Accuracy metrics show that KR combined with the OIM has increased accuracy for the first two functions and a slight reduction for the third. The large gain in training speed, while maintaining very good accuracy and having a very simple and robust implementation, demonstrates the OIM is an efficient tool which can benefit several optimization and AI practices.

Conclusions
As demonstrated in the experiment, with the OIM an SVM model can be segmented, increasing accuracy for the SVM and reducing training time for KR. Perhaps one of the most valuable combinations of the OIM with metamodels, as demonstrated in this study, is to segment a KR model, providing its great accuracy for larger data sets with much faster operation. With the OIM segmentation, KR training computation was reduced from about 80 minutes to 9 seconds, a speed increase of up to 567 times without compromising accuracy. With the OIM, KR models for larger data sets can also be parallelized and scaled up as far as computer memory and parallel processing allow, without excessively increasing training costs. It is also important to note that the OIM mesh generation could perform even faster if coded in a language faster than MATLAB, such as Java or C++.
Given that the OIM can be applied to any data set with any type of distribution, it is demonstrated to be an important alternative for optimization and AI studies, significantly reducing the training costs of metamodels and leading AI tasks and optimization studies to greater levels of performance.