4.1 Experimental setup
In this chapter, the experiments were run on a 64-bit Windows 7 operating system with a dual-core Intel Core i5 CPU, 8 GB of memory, and an NVIDIA GTX 1050 Ti GPU; TensorFlow was used as the deep learning framework, and OpenCV and PIL as the image processing toolkits. All image data were sourced from the Internet, and all experimental images were cropped or stretched with the image toolkits to facilitate the experiments and the presentation of the results. The input size of the experimental data is not fixed, but the sizes within the same set of experimental data must be consistent, i.e., the content image and the style image must have the same size and color channels.
In this paper, the convolutional layers of the pre-trained VGG-19 network model are used as the abstract feature extractor. The number and relative positions of these layers determine the local scale of image style matching, which plays a decisive role in the visual quality of the final synthesized image. In the experiments, 'conv4_2' is used as the content representation layer of the content image, with content loss weight α = 100.0; 'conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', and 'conv5_1' are used as the style representation layers of the style image, with style loss weight β = 1000.0; the smoothing weight of the composite image x is γ = 0.001; and the color migration weights are λL = 1.2 and λA = λB = 1.3. During training, the Adam optimizer based on stochastic gradient descent is used, and 1500 optimization iterations are carried out through back-propagation to minimize Eq. (6). The computation time on a single GPU is about 150 seconds.
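The way the weight factors α, β and γ combine the loss terms can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: the actual experiments run on TensorFlow with VGG-19 features, the feature extraction itself is omitted here, and the helper names are illustrative. The feature maps passed in are assumed to come from the representation layers listed above.

```python
import numpy as np

# Weight factors from the experimental setup (Section 4.1)
ALPHA, BETA, GAMMA = 100.0, 1000.0, 0.001

def content_loss(f_x, f_c):
    # Squared error between feature maps of the composite image
    # and the content image at the content layer (conv4_2)
    return 0.5 * np.sum((f_x - f_c) ** 2)

def gram(f):
    # f: (H, W, C) feature map -> (C, C) Gram matrix of channel
    # correlations, the parametric texture statistic
    h, w, c = f.shape
    f2 = f.reshape(h * w, c)
    return f2.T @ f2

def style_loss(f_x, f_s):
    # Normalized Gram-matrix distance at one style layer
    h, w, c = f_x.shape
    n, m = c, h * w
    return np.sum((gram(f_x) - gram(f_s)) ** 2) / (4.0 * n**2 * m**2)

def tv_loss(x):
    # Total-variation smoothing term on the composite image x
    return np.sum(np.abs(np.diff(x, axis=0))) + np.sum(np.abs(np.diff(x, axis=1)))

def total_loss(f_x_content, f_c, f_x_styles, f_s_styles, x):
    # Weighted combination of content, style and smoothing terms,
    # averaged over the five style representation layers
    l_style = sum(style_loss(a, b)
                  for a, b in zip(f_x_styles, f_s_styles)) / len(f_x_styles)
    return ALPHA * content_loss(f_x_content, f_c) + BETA * l_style + GAMMA * tv_loss(x)
```

In the actual experiments this scalar is what the 1500 Adam iterations minimize with respect to the pixels of x.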
4.2 Experimental results
Image texture synthesis is an important step in synthesizing an image in a new style: its goal is to infer the generating process of a texture from an example texture, so that any number of new samples of that texture can be produced. Image textures are pervasive visual features that describe the surface appearance of things. The texture structure reflects the spatial variation of pixel values in an image with a specific distribution pattern.
Compared with traditional image texture synthesis methods, the powerful parametric texture model of the convolutional neural network brings a substantial improvement in image texture synthesis. The quality of image texture synthesis is usually evaluated by the line contours and color distribution of the synthesized texture: the higher the observed similarity between the synthesized texture and the example texture, the more natural the visual experience and the more successful the synthesis. As shown in Figure 5, the method of [24] produces unnatural, scattered artifacts in the synthesized image, and certain image areas show severe color corruption. This is largely due to performing color migration in the RGB color space, where the color channels are strongly correlated, which can lead to color disorder. The method in this paper effectively solves these problems and makes the overall color transitions of the synthesized style image smooth and natural.
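The claim about strong correlation between RGB channels can be checked directly. The sketch below is an illustrative helper, not part of the paper's method: it measures the pairwise correlation of an image's color channels over all pixels; for natural images the off-diagonal entries are typically large, which is why color migration is usually performed in a decorrelated color space rather than in RGB.

```python
import numpy as np

def channel_correlation(img):
    # img: (H, W, 3) array; returns the 3x3 correlation matrix
    # between the R, G and B channels over all pixels.
    flat = img.reshape(-1, 3).astype(np.float64)
    return np.corrcoef(flat, rowvar=False)
```

An off-diagonal value near 1 means a change in one channel is almost perfectly mirrored in another, so shifting one channel independently during migration distorts the color balance.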
In terms of color control, the quality of image color migration largely determines the final effect of color-preserving image style migration. The color information of an image is an important part of how its style is directly perceived, but in image color migration the color distribution often turns out uneven and mismatched, so effective color control is required to ensure the quality of the synthesized image. Therefore, in this paper the color migration method of Reinhard et al. is improved by adding weight coefficients for the relevant color channels, obtaining better color effects through parameter adjustment. As shown in Fig. 6 and Fig. 7, compared with the method of Gatys et al., the color effect of this paper appears richer and more natural.
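The improved color migration can be sketched as per-channel mean/standard-deviation matching in the style of Reinhard et al., with the weight coefficients λL = 1.2 and λA = λB = 1.3 from Section 4.1 controlling the strength of the transfer in each channel. This is a sketch under assumptions: the arrays are taken to be already converted to the Lab color space (the conversion itself, e.g. via OpenCV, is omitted), and the exact way the weights enter the transfer is an illustrative guess, not the paper's formula.

```python
import numpy as np

# Channel weight coefficients from Section 4.1 (order: L, A, B)
LAMBDAS = np.array([1.2, 1.3, 1.3])

def weighted_color_transfer(content_lab, style_lab, lambdas=LAMBDAS):
    # Shift and scale each channel of the content image so that its
    # mean/std statistics move toward those of the style image,
    # with a per-channel weight controlling the transfer strength.
    out = np.empty_like(content_lab, dtype=np.float64)
    for ch in range(3):
        c = content_lab[..., ch].astype(np.float64)
        s = style_lab[..., ch].astype(np.float64)
        c_mean, c_std = c.mean(), c.std() + 1e-8
        s_mean, s_std = s.mean(), s.std() + 1e-8
        transferred = (c - c_mean) * (s_std / c_std) + s_mean
        # lambda = 1 reproduces plain Reinhard matching for this channel
        out[..., ch] = c + lambdas[ch] * (transferred - c)
    return out
```

Working per channel in Lab rather than RGB avoids the cross-channel correlation problem discussed above.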
In addition, color and texture are two key elements of image style, and in image style migration, color preservation is a typical use case with high requirements for color processing. Two methods of image style migration based on color preservation are proposed in [20]: one is linear color migration in the RGB color space, which migrates the color of the content image into the style image, thus maximizing the preservation of the content image's colors during style migration; the other is image style migration in the luminance space of the content image only, which preserves the original color space of the content image unchanged. [21] used a local linear model to enhance the coordination and correlation between local regions and the whole, and allowed color migration to reference multiple images, further enhancing the effect and flexibility of image color migration. The environmental parameters of the experimental setup are shown in Table 1.
Table 1
Environmental parameters of the experimental setup

Parameter                 | Content
------------------------- | -----------------
Data corpus size          | 25000
Word dimension            | 512
Layers of neural network  | 2
Deep learning framework   | TensorFlow
Hardware configuration    | i7-9700K, 128 GB
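The second color-preserving scheme of [20] described above, style migration in the luminance space of the content image only, can be sketched as follows: the content image is split into luminance and chrominance (YIQ is used here as one common choice), stylization is applied to the luminance channel alone, and the original chrominance is pasted back, so the content image's colors survive unchanged. This is an illustrative sketch; the `stylize` callable is a placeholder for any grayscale stylization step.

```python
import numpy as np

# RGB <-> YIQ conversion matrices (NTSC coefficients)
RGB2YIQ = np.array([[0.299,  0.587,  0.114],
                    [0.596, -0.274, -0.322],
                    [0.211, -0.523,  0.312]])
YIQ2RGB = np.linalg.inv(RGB2YIQ)

def luminance_only_stylize(content_rgb, stylize):
    # Apply `stylize` to the luminance (Y) channel only, keeping
    # the content image's chrominance (I, Q) untouched, so the
    # original colors are preserved exactly.
    yiq = content_rgb @ RGB2YIQ.T
    yiq[..., 0] = stylize(yiq[..., 0])
    return yiq @ YIQ2RGB.T
```

With an identity `stylize`, the round trip through YIQ returns the input image, which is exactly the color-invariance property the method relies on.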
With the development of deep learning-based image style migration, its commercial application value has received widespread attention, mainly in the following three aspects.
Image beautification is a popular application technique on social networks, for advertising images, selfie photos, and so on. However, traditional image style migration methods are simple and fixed in their digital image processing techniques, making it difficult to meet more abstract needs. Deep learning brings more room for innovation and imagination to image style design. Among these approaches, the content-aware image style migration method is effective: it fully considers the two questions of "where to do image style migration" and "how to do image style migration", and it has also shown excellent results in related work in the field of image restoration. The test results of the three models are shown in Table 2.
Table 2
Test results of the 3 models

Model    | BLEU value
-------- | ----------
RNN      | 15.88
LSTM     | 21.03
ATT+RNN  | 24.33
In addition, image style migration can also colorize comic sketches: in the related work of [18], it not only accomplished the colorization task brilliantly but also handled the local features of the image very naturally. In terms of applications, Prisma, a mobile app, is one of the most popular free applications providing deep learning-based image style migration, converting user input images into high-quality art-style paintings in just a few seconds. Subsequently, a number of paid mobile apps and web-based systems for image style migration have emerged and generated some commercial value. With the help of these applications, people can easily create art works in their favorite style without special expertise and without a large investment of time and money.
Visual effects-related technologies are found everywhere in the entertainment and film industries, such as film production, television production, and animation. However, visual effects are very expensive to create; if artificial intelligence could perform these tasks, it would greatly reduce the cost, and deep learning-based image style migration is one of the candidate solutions. For example, [22] used optical flow techniques and a collection of deep convolutional neural networks to achieve artistic stylization for film production. The work of [16] fully considers the coherence problem between consecutive frames in video stylization by introducing a temporal consistency loss function to constrain the global variability of images between consecutive frames. [21] constructed a generative model with a temporal correlation constraint, which can not only perform a variety of stylization computations but also stylize online videos in real time. [23] analyzed image style migration at a more abstract level of the deep learning hyperparameter space and found a set of effective parameter module components for impressionistic stylization of movie scenes. Deep learning-based image style migration for video processing still needs deeper study, but judging from current progress, its great potential commercial value will be further explored in the near future. A performance comparison of the translation models is shown in Table 3.
Table 3
Translation model performance comparison

Translation model          | Dev  | Test 1 | Test 2 | Test 3 | Average
-------------------------- | ---- | ------ | ------ | ------ | -------
RNN                        | 16.6 | 15.9   | 17.2   | 16.3   | 16.5
RNN+Attn                   | 19.0 | 19.5   | 20.9   | 19.4   | 19.8
CNN+Attn                   | 28.1 | 26.1   | 29.1   | 25.6   | 27.0
Transformer                | 31.8 | 30.8   | 33.6   | 29.3   | 31.2
Transformer Int Heads (8)  | 32.7 | 30.7   | 34.8   | 30.7   | 32.1
Transformer Int Heads (16) | 33.0 | 30.9   | 35.2   | 31.1   | 32.4
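The temporal consistency constraint described above for [16] can be sketched as a per-pixel penalty between the current stylized frame and the previous stylized frame warped to the current time by optical flow. This is an illustrative sketch of the general idea, not the exact loss of [16]; the warping and the occlusion mask are assumed to be supplied by the optical flow estimator.

```python
import numpy as np

def temporal_consistency_loss(stylized_t, warped_prev, valid_mask):
    # stylized_t:  stylized frame at time t, shape (H, W, C)
    # warped_prev: stylized frame at t-1, warped to t by optical flow
    # valid_mask:  (H, W) mask, 1 where the flow is valid (no occlusion)
    # Penalizes stylization flicker only where the pixels correspond.
    diff = np.sum((stylized_t - warped_prev) ** 2, axis=-1)
    denom = max(valid_mask.sum(), 1)
    return float(np.sum(valid_mask * diff) / denom)
```

Adding this term to the style transfer objective pulls consecutive stylized frames toward each other wherever the flow says they depict the same content, suppressing flicker.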
Aids to style design. Image style migration can serve as an effective design-aid technique, for example in painting, architectural style design, clothing and fashion design, and game special-effects scene design. Although there are not yet many references or notably successful applications, deep learning-based image style migration is likely to become an important research hotspot in the near future, given the significant breakthroughs deep learning has achieved across many fields in recent years.
In academia, the methods fall into two main categories: image-iteration-based and model-iteration-based. Depending on how the image style is acquired, the image-iteration-based methods can be categorized as MMD (Maximum Mean Discrepancy), MRF (Markov Random Field), and DIA (Deep Image Analogy) methods. The model-iteration-based approaches can be categorized as generative-model-based and image-reconstruction-decoder-based, depending on the model iteration scheme. These representative methods achieve excellent results, but some problems still need to be studied in depth.
The balance between content, texture and color in image style migration determines the visual quality of the final generated image, and current failure cases are often caused by an unreasonable adjustment of these three aspects. Therefore, an in-depth study of this balance, together with systematic and repeated experiments on the adjustment of the related parameters and weights, is an important part of further improving the quality of stylized images, as shown in Figure 8.