Human face aging based on active appearance model using proper feature set

This paper presents a simulation of human face aging from youth to old age. The goal is to preserve a person's identity while accurately estimating the face at the target age, so that the simulated face is realistic and shows the shape and texture characteristics of that age. To achieve this, the number of feature points and their arrangement on the face are very important. First, we propose a suitable set of feature points. Then, using the active appearance model (AAM) with this point layout, templates are obtained as representatives of the different age groups. To preserve the unique geometric characteristics of the input face, these characteristics are applied to the target age template using the proposed feature point pattern and the moving least squares (MLS) method. The AAM is then used to combine the input face with the target age template that carries the geometric characteristics of the input face, and the steps for changing the input face to the target age are presented. Finally, in an opinion poll covering age recognition and real image recognition (conducted separately for male and female images), participants gave on average 80.77% (male images) and 81.36% (female images) correct answers.


Introduction
The path of aging from birth to old age is very complex. Although the pattern of aging generally differs from person to person, people of the same age also share particular facial similarities. Studying these patterns and changes can improve our understanding of the aging process. Facial features common to individuals help to classify them into distinct age groups and to model the aging process of the face. Hence, a person's facial features can be used to determine his or her age. For older facial images, the amount of change in facial texture is greater than the amount of change in shape. In the facial aging model, the geometric changes are small, and the changes are mainly in the facial texture. Transforming a face from one age to a target age is used in forensic medicine, criminal investigations, the search for missing persons, modeling of suspects' faces, computer graphics, the film industry, cosmetic surgery, and the prevention of high-risk lifestyles. Owing to the variety of potential applications of facial aging and the growing range of machine vision techniques, many methods have been developed in recent decades.

Alireza Ahmadyfard, ahmadyfard@shahroodut.ac.ir
Mahboubeh Khajavi, m.khajavi@shahroodut.ac.ir
1 Electronics - Image Processing, Faculty of Electrical and Robotics Engineering, Shahrood University of Technology, Daneshgah Blvd, Shahrood, Iran
2 Electronic Engineering - Image Processing, Faculty of Electrical and Robotics Engineering, Shahrood University of Technology, Daneshgah Blvd, Shahrood, Iran
Shu et al. [1] represented the facial aging pattern of each age group with an age-group-specific dictionary. In [2], texture changes and geometry changes are applied separately to the input face and then combined. In [3][4][5][6], generative adversarial networks (GANs) have been used for aging. GAN-based methods often suffer from two problems: first, abnormal changes in the face regardless of the input, and second, changes in areas that play no role in the aging process [2]. For example, a careful look at the results presented in [4] reveals a lack of attention to geometry, especially in the aged results for children's faces. It should be noted that the use of the GAN method when the number of age groups is small with a long-time [7]. With this method, the number of age groups has increased. However, poor performance has been reported. In [8], an encoder-decoder architecture has been proposed to simplify the face aging process instead of using a complex GAN. Recurrent face aging (RFA) [9,10] is another approach, based on a recurrent neural network. Given a single face image as input, these methods [9,10] produce a set of faces at different ages. Another method is based on the active appearance model; it provides a dedicated channel for the complete integration of wrinkles, so that the aging space accounts for shape, appearance, and wrinkles. In [11], feature points are extracted from face images using the active appearance model (AAM). Then, for each age group, a template image is built to represent the group. The face aging process aims to achieve the most realistic result while keeping the unique features of the input face, rather than only adding wrinkles to it as in previous work. In this work, we suggest a set of feature points to represent face characteristics at target ages up to 97 years.
In this article, we first use the active appearance model (AAM) to extract appropriate feature points that better represent the face images. Then, using the AAM and the proposed layout of feature points, a suitable template is obtained for each of the ten suggested age groups, separately for men and women. Next, given a face image and the desired age, the template is selected according to the desired age and the person's gender. Then, we apply the unique geometric characteristics of the input face, which represent the identity of the face (such as the face contour, whether elongated, round, or oval, and the shape of the eyes, nose, etc.), to this template using the moving least squares (MLS) technique. Finally, the AAM method combines the input face with the deformed target age template, which carries the unique geometric characteristics of the input face.

Fig. 2 Sample face shape of men in the age group of 65-67 years old from references a [9,10], b [16,17], c [11], d [13,18], e 66 points, f 92 points

Feature points' extraction and template of the face age groups
To better understand the human face, its salient features must be extracted. Most facial features, such as the eyes, nose, and mouth, are characterized by geometry, shape, and distribution. Lanitis et al. [12] developed a statistical model of facial appearance based on the active appearance model as a basis for obtaining a parametric description of facial images. In general, AAM-based approaches can consider both shape and texture rather than facial geometry alone. The active appearance model (AAM) [13] is a common face descriptor that uses principal component analysis (PCA) for dimension reduction while preserving important elements, including the shape and texture of the face image. This model can simultaneously capture changes in the shape and texture of the human face, changes that reflect the aging process. The AAM method provides two models: a gray-level model and a model of the overall shape of the face. When the AAM is used for face description, age characteristics appear in the gray-level model, and race and gender characteristics appear in the shape model [14].
In [15], AAM is used to extract the desired features for training the proposed algorithm. For a face image represented by the AAM method, two models must be computed: a shape model and a texture model. A shape is described by the landmark vector

$$ s = (x_1, y_1, x_2, y_2, \ldots, x_n, y_n)^{T} \qquad (1) $$

where $(x_i, y_i)$ denotes the location of the $i$th reference point and $n$ is the number of reference points. This description does not provide any explicit information about connectivity. By applying PCA, an active appearance model similar to the active shape model (ASM) can be obtained [13]. Therefore, the shape of the face is modeled as follows [13]:

$$ s = \bar{s} + P_s b_s \qquad (2) $$

where $s$ and $\bar{s}$ (both $2n \times 1$) denote the shape of the face and the mean of the face shapes in the same age group, respectively; $P_s$ ($2n \times t$), a matrix whose columns are unit vectors along the principal axes, is a set of orthogonal modes of deformation; and $b_s$ ($t \times 1$, $b_s = (b_1, \ldots, b_t)^{T}$) is the vector of parameters of the face shape model. Similarly, the face texture is represented by a vector of intensity values of the face pixels [13]:

$$ g = (g_1, g_2, \ldots, g_m)^{T} \qquad (3) $$

where $m$ is the number of pixels in the face image. To build a texture model, all training faces must be warped to the mean shape frame to collect texture information from the landmark points. The result is a set of shape-free textures, modeled as [13]:

$$ g = \bar{g} + P_g b_g \qquad (4) $$

Here, $\bar{g}$ ($m \times 1$) is the mean texture, $P_g$ ($m \times M$, the $M$ eigenvectors corresponding to the $M$ largest eigenvalues) contains the orthogonal modes of variation derived from the training set, and $b_g = (b_1, b_2, \ldots, b_M)^{T}$ contains the texture parameters in the texture subspace. Finally, shape and texture are combined:

$$ b = \begin{pmatrix} W_s b_s \\ b_g \end{pmatrix} = \begin{pmatrix} W_s P_s^{T} (s - \bar{s}) \\ P_g^{T} (g - \bar{g}) \end{pmatrix} \qquad (5) $$
$W_s$ is a diagonal matrix that sets the appropriate relative weight between pixel distances and pixel intensities. After applying PCA we have [13]:

$$ b = Q c \qquad (6) $$

where the columns of $Q$ are the eigenvectors (shape and texture model), and $c$ is a vector of appearance parameters that control both the shape and texture of the model [13]. PCA removes the correlation between shape and texture parameters in the model. Moreover, it provides a more compact model in which the shape and texture of the face are expressed as functions of $c$:

$$ s = \bar{s} + P_s W_s^{-1} Q_s c, \qquad g = \bar{g} + P_g Q_g c $$

where

$$ Q = \begin{pmatrix} Q_s \\ Q_g \end{pmatrix} $$

Accordingly, a face template is obtained for each age group using the AAM method. Figure 1 shows the suggested landmark pattern for sample men's and women's faces. For each age group, the face template is a good representation of the group. The number of landmark points and the way that the points are arranged on the face are important. Table 1 lists the number of feature points proposed in the literature and the number proposed here. Each of these references uses a different number and a different arrangement pattern of feature points on the face (Fig. 2, right). Fig. 2 (left) shows the results of applying the layout pattern and number of points of each reference (Table 1); they are obtained from the same number of similar face images selected from a specific age group (65-67 years, male).
Finally, we propose 92 landmark points and their arrangement as the most appropriate way to represent the face. As shown in Fig. 2f, the suggested 92-point layout takes into account the nasolabial folds, the marionette lines, and the tear trough, which appear on the adult face. This is a more appropriate and comprehensive choice than the other landmark layouts in the literature. Figure 3a shows the templates obtained with the proposed layout (92 feature points) for the 10 age groups. Figure 3b shows the templates (79 feature points and a different layout) presented in [11]. As shown, especially in the age groups over 45, the details of the elderly face, including the tear trough, nasolabial folds, and marionette lines, are rendered with lower quality than with the suggested 92-point pattern. Examples of the suggested template, layout, and texture model are presented in Fig. 4.
For each age group, we suggested the proper locations for 92 landmark points to provide a suitable representation for the face template of the group.
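The shape part of the model described in this section can be illustrated with a minimal NumPy sketch. It assumes the training shapes of one age group have already been aligned and stacked as rows (x1, y1, ..., xn, yn). The function names and the `variance_kept` threshold are illustrative choices, not taken from the paper.

```python
import numpy as np

def build_shape_model(shapes, variance_kept=0.95):
    """Fit s = s_mean + P_s @ b_s to the aligned shapes of one age group.

    shapes: (N, 2n) array; each row is (x1, y1, ..., xn, yn).
    Returns the mean shape (2n,) and P_s (2n, t) with orthonormal columns.
    """
    shapes = np.asarray(shapes, dtype=float)
    s_mean = shapes.mean(axis=0)
    centered = shapes - s_mean
    # Rows of vt are the principal axes of the centered training shapes.
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    variance = sing ** 2
    ratio = np.cumsum(variance) / variance.sum()
    # Smallest t whose modes explain at least `variance_kept` of the variance
    # (assumed selection rule; the paper does not state how t is chosen).
    t = min(int(np.searchsorted(ratio, variance_kept)) + 1, len(sing))
    return s_mean, vt[:t].T

def shape_to_params(s, s_mean, P_s):
    """Project a shape onto the model: b_s = P_s^T (s - s_mean)."""
    return P_s.T @ (s - s_mean)

def params_to_shape(b_s, s_mean, P_s):
    """Rebuild a shape from its parameters: s = s_mean + P_s b_s."""
    return s_mean + P_s @ b_s
```

In this sketch the mean shape `s_mean` plays the role of the geometric part of an age-group template; the texture model follows the same pattern with pixel intensity vectors in place of landmark coordinates.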

Warping of images
Table 2 (row recovered from the flattened table) UTKFace: ~20,000, 23,000, 0-116, -

Fig. 10 The result for the proposed method in comparison with Refs. [2,4,7]

Table 3 (entries recovered in their original order from the flattened table) [24] 2022 To 60 years old GAN [25] 2022 To 60 years old The CUstom Structure Preservation module (CUSP) [26] 2022 To 80 years old Landmark-Guided cGAN (LGcGAN) [27] 2022 To 80 years old Multimodal FA framework

Warping is an important step in many image processing applications. In general, digital image warping is a geometric transformation of a digital image. There is a wide range of warping algorithms, whose critical aspects are speed, accuracy, and complexity. Choosing the right algorithm depends on the application. In [11], image deformation is performed using the moving least squares (MLS) method proposed by Schaefer et al. [19]. Levin [20] introduced a method for creating smooth and realistic image deformation based on linear MLS. Using MLS eliminates the need to triangulate the input image and produces a smooth deformation [21]. The MLS deformation finds the best transformation function $f$ that maps a set of control handles $p$ to their deformed positions $q$. To generate the deformed image, the function $f$ is applied to every point $v$ of the undeformed image. For a given point $v$ in the image, the best transformation $l_v(x)$ is the one that minimizes [19,22]

$$ \sum_i w_i \left| l_v(p_i) - q_i \right|^2 \qquad (10) $$
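Here, $p_i$ and $q_i$ are the control points, written as row vectors, and the weights $w_i$ are

$$ w_i = \frac{1}{\left| p_i - v \right|^{2\alpha}} \qquad (11) $$

where $\alpha$ adjusts the effect of the deformation [22]. Because the weights $w_i$ in this least squares problem depend on the evaluation point $v$, the method is called moving least squares. Hence, a different transformation $l_v(x)$ is obtained for each $v$. The deformation function $f$, which must satisfy the three properties required in [19] (interpolation of the handles, smoothness, and identity when $q_i = p_i$), is defined as $f(v) = l_v(v)$. The function $l_v(x)$ consists of a linear transformation matrix $M$ and a translation $T$ [19,22]:

$$ l_v(x) = xM + T \qquad (12) $$

Equation (10) is quadratic in $T$. Therefore, we can solve for $T$ directly in terms of the matrix $M$:

$$ T = q^* - p^* M $$

where $p^*$ and $q^*$ are the weighted centroids:

$$ p^* = \frac{\sum_i w_i p_i}{\sum_i w_i}, \qquad q^* = \frac{\sum_i w_i q_i}{\sum_i w_i} $$

Accordingly, $T$ can be substituted into Eq. (12), and $l_v(x)$ can be rewritten in terms of the linear matrix $M$ as follows:

$$ l_v(x) = (x - p^*) M + q^* $$

Therefore, the least squares problem in Eq. (10) can be rewritten as follows [19,22]:

$$ \sum_i w_i \left| \hat{p}_i M - \hat{q}_i \right|^2 $$

where $\hat{p}_i = p_i - p^*$ and $\hat{q}_i = q_i - q^*$. This framework makes it possible to explore different classes of transformation matrices $M$. Here, a rigid transformation is selected for $M$. To obtain realistic results, the deformation must be as rigid as possible; that is, the deformation space should not include uniform scaling [21]. Because of the nonlinear constraint $M^{T} M = I$, this problem is generally regarded as very difficult. However, closed-form solutions are known from the iterated closest point literature [23]. The closed-form solution for the rigid transformation gives the deformation matrix [19]:

$$ M = \frac{1}{\mu_r} \sum_i w_i \begin{pmatrix} \hat{p}_i \\ -\hat{p}_i^{\perp} \end{pmatrix} \begin{pmatrix} \hat{q}_i^{T} & -\hat{q}_i^{\perp T} \end{pmatrix} $$

where $\perp$ is the operator on 2D vectors such that $(x, y)^{\perp} = (-y, x)$, and $\mu_r$ is the scaling factor:

$$ \mu_r = \sqrt{ \left( \sum_i w_i \hat{q}_i \hat{p}_i^{T} \right)^2 + \left( \sum_i w_i \hat{q}_i \hat{p}_i^{\perp T} \right)^2 } $$

so that $M^{T} M = I$. This deformation is nonlinear but can still be computed easily. The vector $\vec{f}_r(v)$ is a rotated and scaled version of the vector $v - p^*$. To calculate $f_r(v)$, we normalize $\vec{f}_r(v)$, scale it by the length of $v - p^*$, and translate by $q^*$:

$$ f_r(v) = \left| v - p^* \right| \frac{\vec{f}_r(v)}{\left| \vec{f}_r(v) \right|} + q^* $$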
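Here, for this deformation, points are used as control handles, and the rotated vector is [19]:

$$ \vec{f}_r(v) = \sum_i \hat{q}_i A_i, \qquad A_i = w_i \begin{pmatrix} \hat{p}_i \\ -\hat{p}_i^{\perp} \end{pmatrix} \begin{pmatrix} v - p^* \\ -(v - p^*)^{\perp} \end{pmatrix}^{T} $$

where $\hat{p}_i = p_i - p^*$, $\hat{q}_i = q_i - q^*$, and $(x, y)^{\perp} = (-y, x)$. The results of warping face images to change the size of the face and of its various components are shown in Fig. 5.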
In Fig. 5, (1) is the input image. In (2), the control points are defined; these 11 points are the primary input. In step (3), the new locations of the initial points are defined and applied to the face. Finally, applying the new point locations deforms the input face of step (1) into the result face of step (4).
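As an illustration of the rigid MLS deformation described in this section, the following NumPy sketch maps individual points using the closed-form expressions of Schaefer et al. [19]. It is a minimal sketch, not the authors' implementation: the names `perp` and `mls_rigid` are hypothetical, points are treated as row vectors, and a full image warp would additionally need pixel resampling, which is omitted here.

```python
import numpy as np

def perp(a):
    """Apply (x, y) -> (-y, x) to the last axis."""
    a = np.asarray(a, dtype=float)
    return np.stack([-a[..., 1], a[..., 0]], axis=-1)

def mls_rigid(v, p, q, alpha=1.0, eps=1e-12):
    """Map points v (m, 2) so that handles p (k, 2) move to q (k, 2)."""
    v, p, q = (np.asarray(a, dtype=float) for a in (v, p, q))
    out = np.empty_like(v)
    for j, vj in enumerate(v):
        d2 = np.sum((p - vj) ** 2, axis=1)
        if d2.min() < eps:
            # v coincides with a handle: the deformation interpolates it.
            out[j] = q[d2.argmin()]
            continue
        w = 1.0 / d2 ** alpha              # w_i = 1 / |p_i - v|^(2 alpha)
        p_star = w @ p / w.sum()           # weighted centroids
        q_star = w @ q / w.sum()
        p_hat, q_hat = p - p_star, q - q_star
        d = vj - p_star
        # Rotated vector sum_i q_hat_i A_i, expanded component-wise.
        fx = np.sum(w * (q_hat[:, 0] * (p_hat @ d)
                         - q_hat[:, 1] * (perp(p_hat) @ d)))
        fy = np.sum(w * (-q_hat[:, 0] * (p_hat @ perp(d))
                         + q_hat[:, 1] * (perp(p_hat) @ perp(d))))
        f_vec = np.array([fx, fy])
        norm = np.linalg.norm(f_vec)
        if norm < eps:
            # Degenerate case (assumed fallback): pure translation.
            out[j] = d + q_star
        else:
            # Normalize, scale by |v - p*|, translate by q*.
            out[j] = np.linalg.norm(d) * f_vec / norm + q_star
    return out
```

Two properties are easy to check with this sketch: if the handles do not move, every point stays in place, and if the handles undergo a global rotation and translation, every point follows the same rigid motion.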

Proposed method
The block diagram of the proposed method is shown in Fig. 6. Given the face of a person at age Y as the input image, the image is first resized to 200 × 200. Next, following the suggested landmark layout, 92 feature points are placed on the face image. The face template (Fig. 3a) is then chosen based on the target age (age X) and the person's gender, and the landmark points on the target age template are extracted. The MLS technique is then applied to map the feature points of the input face image to the corresponding points on the template of the target age group. This applies geometric changes to the input face to modify face elongation and the particular positions of the eyes, nose, and chin based on the target age template.
As shown in Fig. 7, the unique geometric characteristics of the person in the input image (for example, face elongation and the positions of the eyes, cheeks, and chin), which represent the identity of the face, must be applied to the target age template before the aging process. Following the steps of the algorithm, we take the image of a 25-year-old woman as a sample input and target age templates for ages from 35 to 97 years. First, 92 landmark points are extracted from the input image and from the desired templates according to the pattern presented in the previous section. In the next step, the geometric characteristics are applied to the target age template using the MLS method. Thus, the MLS input is the 25-year-old female image with its 92 landmark points and a template image at the desired age (for example, the 35-37 template) with its 92 landmark points. The goal is to move the 92 landmark points of the target age template to the corresponding points of the input image so as to maintain the unique character of the input face. As a result, we obtain a face image in the target age template that carries the unique characteristics of the input image. Therefore, the output of the proposed system is a face image with the general specifications of the target age and the specific characteristics of the input face (e.g., the 25-year-old woman at the input). As shown in Fig. 7, the input face belongs to a woman with an elongated face, almond-shaped eyes, and elongated cheeks and nose (unique features that express the identity of the person); after applying the MLS technique, these features are preserved in the templates of the different target ages.
Next, 92 landmark points are re-extracted from the obtained image according to the layout pattern (Fig. 6). Finally, given the input image and the age group template deformed by the MLS technique, the AAM method combines the age group template (age X) with the input face (age Y) to obtain the face image at the target age (age X). In the last step of the block diagram, because skin color is also one of the unique characteristics of each person, the color of the input face is applied to the final aged image. The color layer of the input face, which is part of the individual's identity, is transferred to the result using the histograms of the input image and the resulting image, so that the final image is aged while maintaining the identity characteristics of the input face. For a sample face image, Fig. 8 shows the step-by-step construction of the face image at a target age. Figure 9 shows the estimated faces in different age groups for two men and three women. The persons in the input images are 25 years old, and the output images cover 7 age groups in the range of 35-97 years.
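The text states that the color of the input face is applied through the histograms of the input and resulting images but gives no further detail. One plausible reading is per-channel histogram matching, sketched below with NumPy. This is an assumption made for illustration: the function names are hypothetical, and both images are assumed to be `uint8` arrays of shape (H, W, 3).

```python
import numpy as np

def match_channel(source, reference):
    """Remap one uint8 channel so that its histogram follows `reference`."""
    src_vals, src_inverse, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    ref_vals, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    # For each source level, pick the reference level with the same CDF value.
    mapped = np.interp(src_cdf, ref_cdf, ref_vals)
    return np.rint(mapped[src_inverse]).astype(np.uint8).reshape(source.shape)

def transfer_color(aged, input_face):
    """Match each color channel of `aged` (H, W, 3) to `input_face`."""
    channels = [match_channel(aged[..., c], input_face[..., c])
                for c in range(aged.shape[-1])]
    return np.stack(channels, axis=-1)
```

Matching channels independently keeps the sketch short; a color space that separates luminance from chrominance could be used instead if wrinkle contrast should remain untouched.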

Experiments
This section first discusses the assumptions, the preprocessing, the existing databases, and the database used in this article. It then reports the experiments carried out to validate the proposed method.

Suppositions
In this experiment, 10 age groups in the range of 5-97 years were considered, separately for men and women. The age groups are: 5-7, 15-17, 25-27, 35-37, 45-47, 55-57, 65-67, 75-77, 85-87, 95-97. Thus, there are 10 age groups for each of the two sexes, and each group is represented by a face template. The preprocessing performed on the input images (resized to 200 × 200) and the steps of template extraction for each age group are shown in Fig. 6.
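The template selection implied by these age groups can be sketched as a small lookup. The age ranges are those listed above; the function name, the gender labels, and the rule of falling back to the nearest group when a target age lies between two groups are assumptions made for illustration.

```python
AGE_GROUPS = [(5, 7), (15, 17), (25, 27), (35, 37), (45, 47),
              (55, 57), (65, 67), (75, 77), (85, 87), (95, 97)]
GENDERS = ("male", "female")

def template_key(target_age, gender):
    """Return (gender, 'lo-hi') identifying the template for a target age."""
    if gender not in GENDERS:
        raise ValueError(f"unknown gender: {gender!r}")

    def distance(group):
        lo, hi = group
        if lo <= target_age <= hi:
            return 0
        return min(abs(target_age - lo), abs(target_age - hi))

    # Assumed rule: nearest group; ties resolve to the younger group.
    lo, hi = min(AGE_GROUPS, key=distance)
    return gender, f"{lo}-{hi}"
```

For example, `template_key(66, "male")` selects the 65-67 male template, and a target age of 40 falls back to the 35-37 group under the assumed nearest-group rule.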
Numerous datasets of facial images are available, but only a handful of them are specifically designed for examining facial aging. Creating a new dataset in the field of facial aging is a difficult task, since it requires a collection of face images of people of different ages and genders. Table 2 lists several datasets used for studying age estimation and the facial aging process.
There is no predefined protocol for such datasets, and the age distribution is not the same across subjects. However, these datasets have been used in several studies of facial aging modeling.
We used the UTKFace database [2,6,24] to evaluate the proposed method. UTKFace has a large number of face images whose subjects span a wide range of ages (from 0 to 116 years old). The database contains approximately 23,000 face images annotated with age, gender, and ethnicity. The face images have been taken from different views and under different imaging conditions such as brightness, occlusion, and sharpness. We compared the results of our work with the results in Refs. [2,4,7]. In this experiment, we selected some face images from these works as inputs; the results of the aging process obtained with these methods and with the proposed method are reported in Fig. 10 (Table 3).
As shown in Fig. 10, the results of references [2,4,7] cover long periods. For instance, reference [4] provides only 3 final images for target ages from 19 to 60+. In contrast, the proposed method provides 7 images in 10-year steps for the same range. Each output image carries the specific characteristics of the input image and the general characteristics of the target age.

Evaluation using an opinion poll
Another experiment was conducted to evaluate the proposed method using an opinion poll. This experiment was conducted in two parts: recognizing the age and identifying the person from the simulated face image.

Recognizing the age of simulated face
In the first part, images of two women and two men in their 20s were given as input to the proposed algorithm. The images of these people at different ages (7 age groups) in the range of 35-97 years were simulated, so the number of images for this experiment is 28. Then, the participants in the opinion poll (110 men and women, separately) were asked to guess the age of each output face image. With this experiment, we aimed to find out how well the simulated faces represent the target ages. Some of the men's and women's images are shown in Figs. 11a and 12a.
The results of the opinion poll for the men's and women's images are shown in Figs. 11b and 12b. As shown in Fig. 11b, among the 110 participants, the 35-37 age group (series 2) has the most accurate recognition with 98.18%, and the 45-47 age group (series 2) has the least accurate recognition with 72.72%. Likewise, in Fig. 12b, the 35-37 age group (series 1) has the most accurate recognition with 100%, and the 95-97 age group (series 2) has the least accurate recognition with 79.05%.

Identifying who is the simulated face
In the second part of the experiment, in addition to the image of a person at the input age, we have the real image of the person at the target age. We used the FGNet dataset, which contains images of people at different ages. First, we provide the image of a person at a younger age as the input to the proposed method, and the face image is constructed at the target age. The real image of the person at the target age serves as the ground truth. The similarity between the constructed face image and the real face image at the target age is then assessed by comparing the two. In Fig. 13a, three images of a woman and three images of a man (input images and ground truth at the target ages) from the FGNet database are shown. In this figure, the outputs of the proposed method at the target ages are also reported for comparison. Participants in the opinion poll were then asked to choose, from the images presented to them (Fig. 13b), two similar images that could belong to one person in youth and in adulthood. All participants in this stage of the opinion poll gave the correct answer (correct: 110/incorrect: 0), which shows that the results obtained by the proposed method are acceptable, as they are similar to the real face images at the target ages.

Conclusion
In the face aging process, geometric changes are small, and the changes are mainly in the texture of the face. Most recent work has used statistical methods to build a model from a set of facial images used as training data. Then, based on the obtained statistical models, shape and texture changes are applied to the input image. Here, we first examined the feature points, the appropriate number of points, and their locations in order to obtain a suitable template for each age group. Next, by defining ten age groups, separately for men and women, ten templates for each gender were built using the active appearance model (AAM) method as representatives of the defined age groups. Then, using the MLS method with the feature points of the target age group template and the feature points of the input image, the geometric characteristics specific to the input face were applied to the template of the target age group. Finally, using the AAM method, the feature points of the input face and those of the target age template were combined to produce the face image at the target age. By proposing a new arrangement pattern and additional points for more detailed features, we obtained suitable templates for ages from 5 to 97 in 10-year steps. Having this many templates makes it possible to produce results at more target ages. The specific characteristics of individuals are transferred from the input faces to the target age template to preserve each individual's identity. Comparing the results obtained by this method with the samples in other references shows the accuracy of the results and the larger number of target ages covered.
Author contributions
This work is the result of research conducted by MK as her thesis for the PhD degree. AA is her supervisor for this research. All parts of the paper have been prepared by both authors.

Availability of data and materials
The data that support the findings of this study are openly available [28].

Declarations
Authors' information
Mahboubeh Khajavi received her M.Sc. degree in Electrical Engineering from South of Tehran Branch and is now a Ph.D. student at Shahrood University of Technology. Her research interests include image processing. Dr. Alireza Ahmadyfard is an associate professor in Electrical Engineering. He is a member of the academic staff at Shahrood University of Technology.