To better understand the human face, its salient features must be extracted. One of the most common ways to extract facial features is to locate feature points on face images. Most facial features, such as the eyes, nose, and mouth, are characterized by their geometry, shape, and distribution. Lanitis et al. [11] developed a statistical model of facial appearance based on the Active Appearance Model (AAM) to obtain a parametric description of face images. In general, AAM-based approaches can capture shape and texture rather than facial geometry alone. The AAM [12] is a common face descriptor that uses PCA for dimensionality reduction while preserving important elements, including the template and texture of the face image structure. It can simultaneously extract changes in the shape and texture of the human face, changes that reflect the facial aging process. The AAM method provides two models: a gray-level (texture) model and an overall face shape model. When the AAM is used to describe the face, age characteristics appear in the gray-level model, while race and gender characteristics appear in the shape model [13]. In [14], the AAM is used to extract the features needed to train the proposed algorithm. For a face image represented by the AAM, two sets of parameters must be computed: one for the shape model and one for the texture model. The shape model, as in the ASM, is built from a collection of annotated face images. A set of n landmark points is denoted by a 2n×1 vector:
\(s = \left(x_1, \dots, x_i, \dots, x_n, y_1, \dots, y_i, \dots, y_n\right)^{T}\)  (1)
where \(\left(x_i, y_i\right)\) denotes the location of the i-th landmark point and n is the number of landmark points. This description provides no explicit connectivity information. By applying PCA, an appearance model similar to the ASM can be obtained [12]. The shape of a face is therefore modeled as [12]:
\(s = \bar{s} + P_s b_s\)  (2)
Here \(s\) and \(\bar{s}\) (both 2n×1 vectors) denote the face shape and the mean of the face shapes in the same age group, respectively. \(P_s\) (a 2n×t matrix whose columns are unit vectors along the principal axes) is a set of orthogonal modes of variation, and \(b_s\) (a t×1 vector \(\left(b_1, \dots, b_t\right)^T\)) is the set of face shape parameters. Similarly, the face texture is represented by a vector of intensity values of the face pixels [12]:
\(g = \left(g_1, \dots, g_m\right)^{T}\)  (3)
where m is the number of pixels in the face image. To build the texture model, all training faces are first warped into the mean shape frame so that texture information is sampled consistently relative to the landmark points. The result is a set of shape-free textures [12]:
\(g = \bar{g} + P_g b_g\)  (4)
Here \(\bar{g}\) (an m×1 vector) is the mean texture, \(P_g\) (an m×M matrix whose columns are the M eigenvectors corresponding to the M largest eigenvalues) holds the orthogonal modes of variation derived from the training set, and \(b_g\) (an M×1 vector) contains the texture parameters in the texture subspace. Finally, shape and texture are combined by applying a further PCA to the concatenated parameter vectors, creating the appearance subspace [12]:
\(b=\left(\begin{array}{c}W_s b_s\\ b_g\end{array}\right)=\left(\begin{array}{c}W_s P_s^{T}\left(s-\bar{s}\right)\\ P_g^{T}\left(g-\bar{g}\right)\end{array}\right)\)  (5)
\(W_s\) is a diagonal matrix that weights the shape parameters so that the different units of pixel distances and pixel intensities become commensurable. After applying PCA we have [12]:
\(b = Qc\)  (6)
\(Q\) holds the eigenvectors of the combined shape and texture model, and \(c\) is a vector of appearance parameters controlling both the shape and the texture of the model [12]. This PCA removes the correlation between the shape and texture parameters and yields a more compact model in which the shape and texture of the face are expressed as functions of \(c\):
\(s = \bar{s} + P_s W_s^{-1} Q_s c\)  (7)
\(g = \bar{g} + P_g Q_g c\)  (8)
where:

\(Q=\left(\begin{array}{c}Q_s\\ Q_g\end{array}\right)\)  (9)
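The model-building steps of Eqs. (2)–(6) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the input shapes are assumed to be pre-aligned landmark vectors, the textures are assumed to be pre-warped shape-free textures, and the diagonal weight matrix \(W_s\) is approximated by a single scalar weight r (a common simplification).

```python
import numpy as np

def pca_basis(X, k):
    """Mean of the rows of X and the first k principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T                 # basis columns are unit eigenvectors

def build_aam(shapes, textures, t, M, k, r=1.0):
    """shapes: (N, 2n) aligned landmark vectors; textures: (N, m) shape-free
    textures; t, M, k: modes kept for shape, texture, and appearance;
    r: scalar stand-in for the diagonal weight matrix Ws."""
    s_bar, Ps = pca_basis(shapes, t)      # Eq. (2): s = s_bar + Ps bs
    g_bar, Pg = pca_basis(textures, M)    # Eq. (4): g = g_bar + Pg bg
    bs = (shapes - s_bar) @ Ps            # bs = Ps^T (s - s_bar), per face
    bg = (textures - g_bar) @ Pg          # bg = Pg^T (g - g_bar), per face
    b = np.hstack([r * bs, bg])           # Eq. (5), with Ws ~ r*I
    _, Q = pca_basis(b, k)                # Eq. (6): b = Q c (b has zero mean)
    c = b @ Q                             # appearance parameters per face
    return s_bar, Ps, g_bar, Pg, Q, c
```

Given \(c\), shape and texture are recovered via Eqs. (7)–(8) by splitting the rows of \(Q\) into the blocks \(Q_s\) and \(Q_g\) of Eq. (9).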
Accordingly, a face template is obtained for each age group using the AAM method.
For each age group, the face template is a good representation of that group. The number of landmark points and the way the points are arranged are crucial. We tested different numbers of landmarks and different positions on the face suggested in the literature. Finally, we propose using 66 and 92 landmark points with what we found to be the most appropriate arrangement for representing the face. Figure 1 shows the suggested landmarks on sample male and female faces. Table 1 compares the numbers of key points suggested in the literature with our proposed numbers of feature points. Each of these references uses a different number of feature points and a different arrangement pattern on the face.
Table 1
References and number of key points

| References | Number of key points | Year |
|---|---|---|
| [8] | 68 | 2016 |
| [9] | 68 | 2018 |
| [15] | 68 | 2019 |
| [16] | 68 | 2016 |
| [17] | 79 | 2018 |
| [12, 18] | 122 | 2001, 1998 |
| This work | 66 | – |
| This work | 92 | – |
Figure 2 shows, for the references whose number of feature points is known, the landmark points and their arrangement patterns (right). For a specific age group (men, ages 65–67), we selected the same set of similar face images and applied the arrangement pattern and number of points presented in each of these references to those images (left).
As shown in Fig. 2(f), the suggested 92-point landmark set takes into account smile lines, chin lines (on both sides, from the corners of the lips to the chin), and bags under the eyes (lines under the eyes), all of which appear in the adult face. This makes it a more appropriate and comprehensive choice than the other landmark sets in the literature. Figure 3(a) shows the templates obtained with the proposed layout pattern (92 feature points) for 10 age groups.
Figure 3(b) shows the templates (79 feature points with a different layout) presented in [17]. As shown, especially in the age groups over 45, the details of the elderly face, including bags under the eyes, smile lines, and the lines running from both sides of the mouth to the chin, are captured with lower quality than with the suggested 92-point pattern. Examples of the suggested template, layout, and texture model are presented in Fig. 4.
For each age group, we suggested the proper locations for 92 landmark points to provide a suitable representation for the face template of the group.
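As a rough illustration of how a per-group landmark template could be derived from annotated images, the sketch below simply averages the aligned 92-point landmark vectors within each age group. This is a hypothetical helper for intuition only; in this work the templates are obtained through AAM fitting, and the input shapes are assumed to be already aligned (e.g., by Procrustes analysis).

```python
import numpy as np

def age_group_templates(shapes, labels):
    """shapes: (N, 184) array of aligned 92-point landmark vectors
    (x1..x92, y1..y92); labels: length-N list of age-group names.
    Returns the mean landmark configuration of each age group."""
    shapes = np.asarray(shapes, dtype=float)
    return {g: shapes[[lab == g for lab in labels]].mean(axis=0)
            for g in sorted(set(labels))}
```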