Double-image visually meaningful encryption algorithm based on compressed sensing and FRFT embedding

Abstract: The transmission of images via the Internet has grown exponentially in the past few decades. However, the Internet is an insecure channel for information transmission, which may cause serious privacy issues. To overcome such potential security issues, a novel double-image visually meaningful encryption (DIVME) algorithm combining quantum cellular neural network (QCNN), compressed sensing (CS) and fractional Fourier transform (FRFT) is proposed in this paper. First, the wavelet coefficients of the two plain images are scrambled by the Fisher-Yates confusion algorithm, and then compressed by the key-controlled partial Hadamard matrix. The final meaningful cipher image is generated by embedding the encrypted images into a host image with the same resolution as the plain image via the FRFT-based embedding method. Besides, the eigenvalues of the plain images are utilized to generate the key stream to improve the ability of the proposed DIVME algorithm to withstand plaintext attacks. Afterwards, the plaintext eigenvalues are embedded into the alpha channel of the meaningful cipher image under the control of the keys to reduce unnecessary storage space and transmission costs. Finally, the simulation results and security analyses indicate that the proposed DIVME algorithm is effective and can withstand multiple attacks.

For example, a parallel image compression-encryption algorithm is presented by Huang [17]. In his scheme, the plain image is first divided into several sub-images and then linearly measured by 1D CS. Afterwards, a series of operations such as permutation, substitution and block-wise XOR are performed on the quantized measurement value matrices to generate the final cipher image. However, the large-scale Gaussian random measurement matrix used as the key in Huang's scheme requires additional storage space and transmission cost. Subsequently, to overcome this issue, the key-controlled partial Hadamard matrix [18][19], partial random block weighing matrix [20], structurally random matrix [21] and chaos-based measurement matrix [22][23][24] were introduced to compress the plain image. Besides, in the encryption phase, the counter mode [25][26], hash function [23,27] and plaintext eigenvalue [28] are applied to withstand plaintext attacks, since different plain images then correspond to different key streams. Nevertheless, although the above-mentioned CS-based image encryption algorithms can prevent image data from leakage, they cannot protect the appearance of the cipher image.
Therefore, Bao et al. [29] proposed a feasible framework for simultaneous encryption and steganography, namely the encryption-embedding framework. In Bao's scheme, the plain image is first encrypted by an existing encryption algorithm to obtain a noise-like or texture-like cipher image. After that, the meaningless cipher image is decomposed and embedded into a host image by the lifting wavelet transform. In the absence of a compression stage, the resolution of the meaningful cipher image is four times that of the plain image, which increases the cost of storage and transmission unnecessarily. Later, many improved visually meaningful image encryption algorithms [30][31][32][33][34] were proposed one after another. For example, in Ref. [19], the plain image is first encrypted and compressed through a coefficient random scrambling strategy and block-wise compressed sensing. Then the robust SVD embedding method is employed to embed the meaningless cipher image into a host image with the same resolution as the plain image. Besides, the counter mode is utilized to update the encryption keys to resist chosen-plaintext attacks.
In this paper, we put forward an efficient double-image visually meaningful encryption algorithm based on compressed sensing and FRFT embedding. It mainly consists of two stages: pre-encryption and embedding process. To prevent image information from leakage, in the first stage, the Fisher-Yates confusion and compressed sensing are utilized to encrypt and compress the two plain images to attain the meaningless cipher images. Afterwards, in the second stage, the meaningless cipher images are embedded into a host image via the FRFT embedding method, so that their appearance is protected.
Besides, the quantum cellular neural network and improved Henon map, whose initial values are generated from the plaintext eigenvalues, are applied to construct the key-controlled measurement matrix and key streams in encryption.
The innovation and contribution of this paper are summarized as follows.
(1) An efficient double-image visually meaningful encryption algorithm based on compressed sensing and FRFT embedding is designed to improve transmission efficiency.
(2) A key-controlled double-embedding method (FRFT embedding) is proposed to improve the security of embedding phase.
(3) A novel "One cipher image corresponds to one key" mechanism is proposed to withstand the plaintext attacks.
(4) Simulation analysis and comparison results indicate that the proposed encryption scheme has high efficiency and can withstand multiple attacks.
The rest of this paper is arranged as follows. The basic knowledge related to the proposed algorithm is described in Section 2. Sections 3 and 4 introduce the specific steps of the proposed DIVME algorithm and the corresponding decryption algorithm in detail, respectively. The simulation results and performance analysis are given in Section 5. After that, our encryption scheme is compared with the existing related algorithms in Section 6. Finally, a brief summary and future work are given in the last section.

Chaotic system
In this subsection, two chaotic systems are introduced which are hyperchaotic quantum cellular neural network and improved 2D Henon map. Among them, the quantum cellular neural network is used to construct the key streams with high unpredictability in encryption. Additionally, in order to save storage space and reduce transmission costs, a key-controlled measurement matrix is generated according to the improved Henon chaotic map.

Hyperchaotic quantum cellular neural network
Quantum cellular neural network (QCNN) is constructed from several quantum cellular automata (QCA) [35], and it has complex dynamic characteristics due to the quantum interaction between the quantum dots. For the two-cell QCNN, its state equation is defined as follows.
Fig. 1 The hyperchaotic trajectories of this quantum cellular neural network.

Improved 2D Henon map
Since the classical 2D Henon map has a small key space and its chaotic trajectories are simple, an improved Henon map (IHM) is proposed in Ref. [36]. Its system equation is as follows.
Where a and b are system control parameters. Furthermore, $x_{n+1}$ and $y_{n+1}$ are the generated pseudo-random numbers, belonging to [-1, 1].
Fig. 2

Compressed sensing
Compressed sensing [3][4] uses a measurement matrix that is incoherent with the transform basis to linearly project a sparse high-dimensional signal into a low-dimensional space, and then reconstructs the original signal with high probability from these few projections. A natural signal $y \in \mathbb{R}^{N}$ can be sparsely represented as
$$y = \Psi S = \sum_{i=1}^{N} \psi_i s_i \qquad (3)$$
In Eq. (3), $\Psi = [\psi_1, \psi_2, \psi_3, ..., \psi_N]$ is the basis matrix, and the column vector $S$ of size $N \times 1$ is the sparse representation coefficient of y in $\Psi$. Besides, if $\|S\|_0 = k$, y is said to be k-sparse on the orthonormal basis $\Psi$. Then the process of linearly measuring the signal y with sparsity k through the measurement matrix $\Phi \in \mathbb{R}^{M \times N}$ can be expressed as
$$z = \Phi y = \Phi \Psi S = \Theta S \qquad (4)$$
where $z = \{z_1, z_2, z_3, ..., z_M\}$ is the observed vector, and $\Theta = \Phi \Psi$ is called the sensing matrix.
Since Eq. (4) is an underdetermined equation system, additional regularization constraints are needed to restore the natural signal y. Related studies [37] indicate that if the signal y is sparse enough on the orthogonal basis $\Psi$, and the matrices $\Phi$ and $\Psi$ are incoherent, the sparse coefficient vector S can be recovered from the vector $z$ with high probability by solving the optimization problem shown in Eq. (5). Finally, the inverse transform of the sparse representation is performed on the vector S to restore the natural signal y.
$$\min \|S\|_0 \quad \text{s.t.} \quad z = \Theta S \qquad (5)$$
In Eq. (5), $\|S\|_0$ refers to the $\ell_0$-norm of the vector $S$, which is equal to the number of non-zero elements in the vector.
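The measure-then-recover idea of this subsection can be illustrated with a short Python sketch. It is not the reconstruction algorithm used in this paper: orthogonal matching pursuit (OMP) is swapped in because it is compact, a Gaussian random matrix stands in for the sensing matrix, and all names (`omp`, `theta`, `z`, `s_true`) and sizes are illustrative assumptions.

```python
import numpy as np

def omp(theta, z, k):
    """Greedy OMP: look for a k-sparse s such that z is close to theta @ s."""
    n = theta.shape[1]
    residual = z.copy()
    support = []
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(theta.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit restricted to the selected columns.
        coef, *_ = np.linalg.lstsq(theta[:, support], z, rcond=None)
        residual = z - theta[:, support] @ coef
    s = np.zeros(n)
    s[support] = coef
    return s

rng = np.random.default_rng(0)
n, m, k = 64, 48, 3                      # signal length, measurements, sparsity (assumed)
s_true = np.zeros(n)
positions = rng.choice(n, k, replace=False)
s_true[positions] = rng.uniform(1.0, 2.0, k) * rng.choice([-1.0, 1.0], k)

theta = rng.standard_normal((m, n)) / np.sqrt(m)   # stand-in sensing matrix
z = theta @ s_true                                 # observed vector
s_hat = omp(theta, z, k)                           # recovered sparse coefficients
```

When the signal is sparse enough relative to the number of measurements, `s_hat` typically coincides with `s_true`; the sketch makes no claim about recovery in harder regimes.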

Fractional Fourier transform
The fractional Fourier transform (FRFT) is a generalized form of the Fourier transform [38][39]. It can be interpreted as rotating the signal counterclockwise by an arbitrary angle in the time-frequency plane. Therefore, the FRFT of a signal contains both its time- and frequency-domain features. The (p1, p2)-order FRFT of a two-dimensional signal is defined as follows.

Fisher-Yates confusion
Fisher-Yates algorithm, also known as Knuth random scrambling algorithm, is utilized to scramble the sparse coefficient matrix of plain image to reduce the strong correlation between adjacent sparse coefficients. Its scrambling process is illustrated in Fig.3. In operation, the chaotic sequence generated by the QCNN is used to replace each randomly generated number, effectively controlling the elements exchanged each time. Meanwhile, the initial value of the QCNN is calculated by extracting partial pixels of the plain image with the secret keys. Therefore, different plain images correspond to different scrambling sequences.
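A minimal Python sketch of a chaos-driven Fisher-Yates scramble and its inverse is given here as an illustration. It rests on several assumptions: a logistic map stands in for the QCNN, each chaotic value is mapped to a swap partner by simple scaling, and the names `logistic_stream`, `fisher_yates_permutation`, `scramble` and `unscramble` are hypothetical. The exact mapping used in this paper is the one illustrated in Fig. 3.

```python
import numpy as np

def logistic_stream(x0, length, mu=3.99, burn_in=200):
    """Stand-in chaotic generator with values in (0, 1); the paper uses a QCNN."""
    x, out = x0, []
    for i in range(length + burn_in):
        x = mu * x * (1.0 - x)
        if i >= burn_in:
            out.append(x)
    return np.array(out)

def fisher_yates_permutation(chaos):
    """Fisher-Yates shuffle in which chaotic values replace the random draws."""
    n = len(chaos)
    perm = np.arange(n)
    for i in range(n - 1, 0, -1):
        j = min(int(chaos[i] * (i + 1)), i)   # swap partner in [0, i]
        perm[i], perm[j] = perm[j], perm[i]
    return perm

def scramble(matrix, perm):
    """Move the element at flat position perm[k] to flat position k."""
    return matrix.ravel()[perm].reshape(matrix.shape)

def unscramble(matrix, perm):
    """Invert scramble() given the same permutation."""
    flat = np.empty(matrix.size, dtype=matrix.dtype)
    flat[perm] = matrix.ravel()
    return flat.reshape(matrix.shape)

coeffs = np.arange(16, dtype=float).reshape(4, 4)   # toy coefficient matrix
perm = fisher_yates_permutation(logistic_stream(0.37, coeffs.size))
scrambled = scramble(coeffs, perm)
restored = unscramble(scrambled, perm)
```

Because the permutation depends only on the chaotic stream, a receiver who regenerates the same stream from the same initial value can undo the scrambling exactly.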
The detailed approach for generating the initial values of the QCNN is displayed as follows.
Step 1. First, the Logistic map is iterated $(mn + T_0)$ times from its initial value, and the first $T_0$ elements are discarded to obtain a chaotic sequence of length $mn$. Then the sequence is sorted in ascending order to generate a new sequence.

Obtaining the initial values for IHM
Storing the entire measurement matrix directly requires a lot of space, and sufficient bandwidth needs to be used to transmit it to the decoder. Thus, in this paper, the measurement matrix is determined by the chaotic sequence generated by the improved Henon map. At the same time, to improve the anti-attack ability of the algorithm, partial content of two plain images will be used to generate the initial values of the IHM.
The process of generating the initial values of the improved Henon map is as follows.
Step 1. First, the average value 2 is determined as follows.
Step 2. Then the parameter is obtained by performing Eq. (13).
Step 3. Finally, the initial values of the IHM are computed according to the following equation, in which the third and fourth external key parameters are used and sign(·) denotes the sign function.

The DIVME algorithm
The flow chart of the proposed DIVME algorithm is displayed in Fig. 4. As Fig.4 shows, it mainly consists of two stages.
In the first stage, the secret information carried by the two plain images is encrypted and compressed by the Fisher-Yates confusion and the key-controlled partial Hadamard matrix, respectively. In the second stage, the encrypted data is randomly embedded into the host image through the fractional Fourier transform embedding, a process controlled by the index sequence generated from the QCNN. In addition, some important parameters are hidden in the alpha channel of the visually meaningful cipher image.

Pre-encryption process
Step 1. First, an orthogonal sparse representation matrix $\Psi$ is constructed using the Daubechies wavelet. Then, the sparse processing is performed on the two plain images P1 and P2 through Eq. (15), where the superscript T denotes the matrix transpose, yielding the coefficient matrices P3 and P4.
Step 2. To further improve the sparsity of coefficient matrices P3 and P4, the elements whose absolute values are less than or equal to the threshold values Ts1 and Ts2 are forced to be set to zero. And the matrices after threshold processing are denoted as P5 and P6 respectively.
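Steps 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions: the orthonormal Haar matrix (the simplest member of the Daubechies family) stands in for the wavelet basis, the two-sided form `psi @ image @ psi.T` is assumed for Eq. (15), and the names `haar_matrix`, `sparsify` and `desparsify` as well as the threshold value are hypothetical.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar matrix of size n x n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        top = np.kron(h, [1.0, 1.0])                       # averaging rows
        bottom = np.kron(np.eye(h.shape[0]), [1.0, -1.0])  # detail rows
        h = np.vstack([top, bottom]) / np.sqrt(2.0)
    return h

def sparsify(image, psi, threshold):
    """Transform the image and zero every coefficient with |c| <= threshold."""
    coeffs = psi @ image @ psi.T
    coeffs[np.abs(coeffs) <= threshold] = 0.0
    return coeffs

def desparsify(coeffs, psi):
    """Inverse of the two-sided orthogonal transform."""
    return psi.T @ coeffs @ psi

psi = haar_matrix(8)
image = np.add.outer(np.arange(8.0), np.arange(8.0))   # smooth toy image
p5 = sparsify(image, psi, threshold=0.5)               # thresholded coefficients
approx = desparsify(p5, psi)                           # approximation of the image
```

A larger threshold yields a sparser coefficient matrix at the price of a larger approximation error, which is the trade-off controlled by Ts1 and Ts2.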
Step 3. After sparse processing, most of the energy of the two plain images P1 and P2 is concentrated in the upper left corner of the matrices P5 and P6, which is not conducive to parallel compression. Thus, the Fisher-Yates confusion is utilized to distribute the energy evenly over the entire matrix. The QCNN is iterated $(mn + T_0)$ times with the initial values generated in Section 2.4. Then four chaotic sequences of size $1 \times mn$ are obtained by discarding the first $T_0$ values, as shown in Eq. (16).
Step 4. Then, the random sequences X and Y are processed according to the following equation.
Step 5. As described in Section 2.4, the Fisher-Yates algorithm is used to scramble the matrices P5 and P6, which is controlled by the sequences Tx and Ty. After confusion, the resulting matrices are named P7 and P8 respectively.
Step 7. Construct a Hadamard matrix H of size $m \times m$. It is calculated by the Kronecker product of two low-order matrices, and its recursion equation is shown in Eq. (18). Additionally, considering the constraints of the Hadamard matrix, we assume that m is divisible by four.
Step 8. Sort the sequence U in ascending order to generate the index sequence. Then, the key-controlled partial Hadamard matrix is obtained by the following equation.
Step 9. The matrices P7 and P8 are compressed in parallel by the key-controlled measurement matrix to generate the encrypted matrices P9 and P10. This process can be described by Eq. (20).
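Steps 7 to 9 can be sketched in Python as follows. The sketch makes several assumptions: the Sylvester recursion (which needs m to be a power of two, a stricter condition than divisibility by four) is used for the Kronecker construction, the partial matrix keeps the rows of H indexed by the first entries of the sorted-index sequence, and a cosine sequence is only a placeholder for the chaotic sequence U. The names `hadamard` and `partial_hadamard` are hypothetical.

```python
import numpy as np

def hadamard(m):
    """Sylvester construction: H_{2k} = kron([[1, 1], [1, -1]], H_k)."""
    if m < 1 or m & (m - 1):
        raise ValueError("this sketch needs m to be a power of two")
    h = np.array([[1]])
    base = np.array([[1, 1], [1, -1]])
    while h.shape[0] < m:
        h = np.kron(base, h)
    return h

def partial_hadamard(m, rows, u):
    """Keep `rows` rows of H, chosen by the ascending-sort index of u."""
    index = np.argsort(u)                 # index sequence derived from U
    return hadamard(m)[index[:rows], :]

m, rows = 8, 4                            # assumed sizes for the toy example
u = np.cos(np.arange(1, m + 1) * 1.7)     # placeholder for the chaotic sequence U
phi = partial_hadamard(m, rows, u)        # key-controlled measurement matrix

p7 = np.arange(m * m, dtype=float).reshape(m, m)   # toy scrambled coefficients
p9 = phi @ p7                                      # compressed measurements
```

Only the initial values that generate U need to be shared, since the receiver can rebuild `phi` from them instead of storing the whole matrix.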

Embedding process
Step 1. Select a host image and perform the 2D discrete cosine transform on it to obtain the matrix H1, as formulated in Eq. (21) together with its transform kernel function.
Step 2. The sub-matrix H2 is determined by Eq. (22). Then the 2D fractional Fourier transform with the key-controlled rotation angles is applied to it to get a complex matrix, where i denotes the imaginary unit.
Step 3. By sorting the chaotic sequences Z and W, the index sequences Tz and Tw are obtained, which are used to scramble the encrypted matrices P9 and P10 into the matrices P11 and P12.
Step 4. The matrices P11 and P12 are embedded into the real and imaginary parts of the complex matrix, respectively, after their amplitudes are adjusted by the gain factor. These operations are formulated by Eq. (24).
Step 5. Then, perform the inverse 2D fractional Fourier transform on the resulting complex matrix and use the result to replace the lower right corner of the matrix H1, obtaining a new complex matrix H5.
Step 6. Next, a complex matrix with real part RP3 and imaginary part IP3 is generated by applying the inverse 2D discrete cosine transform to H5.
Step 7. To facilitate storage, the matrix IP3 is processed by the following equation. Then Alp is used as the alpha channel together with the matrix RP3 to generate the final visually meaningful cipher image.
Alp = 247 + …
Fig. 4 The schematic of the proposed DIVME algorithm.

The image decryption algorithm
The decryption algorithm is the inverse of the DIVME algorithm and also consists of two processes, namely the extraction process and the decryption process. To successfully decrypt the plain images, some external secret keys need to be transmitted to the decoder through a private channel, including the four keys used to generate the initial values of the chaotic systems, the two rotation angles and $h_i$ ($i = 1, 2$). Meanwhile, the internal secret keys fv and gv are extracted from the alpha channel of the visually meaningful cipher image. Besides, the host image is indispensable for extracting the encrypted data from the cipher image. Thus, it is recommended to select the host image randomly from a public database to avoid extra transmission costs. The detailed decryption process is as follows.

Extraction process
Step 2. By performing Eq. (11) and Eq. (14), the initial values of the QCNN and the IHM are obtained. They are then iterated to generate the key streams X, Y, Z, W and U.
Step 3. The modified matrix Alp is processed by the following equation to extract IP3.
Step 4. The complex matrix H5 is determined by carrying out the 2D discrete cosine transform on the complex matrix whose real and imaginary parts are RP3 and IP3.
Step 6. After inverse scrambling of the matrices P11 and P12 with the index sequences Tz and Tw respectively, the encrypted matrices P9 and P10 are generated.

Decryption process
Step 1. The key-controlled measurement matrix is generated as described in Step 8 of Section 3.2.1, and then utilized to recover the matrices P7 and P8 with the $\ell_0$-norm-based reconstruction algorithm, as denoted in Eq. (30).
Step 2. Inverse Fisher-Yates confusion (IFYC) is applied to the matrices P7 and P8 with the Tx and Ty generated by sorting the sequences X and Y, returning the matrices P5 and P6, respectively.
Step 3. By performing inverse sparse transform (see Eq.

Encryption and decryption results
To demonstrate the effectiveness and practicability of the proposed DIVME algorithm, simulation experiments are conducted in this subsection, and the results are shown in Fig. 5 to Fig. 7. As can be seen, the plain images are encrypted into cipher images of the same resolution, which are meaningful and visually similar to the corresponding host images, indicating that the proposed FRFT embedding method is effective. When the visually meaningful cipher image is transmitted or stored together with other natural images, it is less likely to be discovered and attacked, which makes it more secure. Moreover, the decrypted images are visually identical to their respective plain images.
Next, to quantitatively analyze the imperceptibility of the cipher images and the quality of the decrypted images, the peak signal-to-noise ratio (PSNR) [40] and the mean structural similarity (MSSIM) [41] are employed, as defined in Eq. (32) and Eq. (33). The PSNR values of the decrypted images are basically greater than 30 dB, and as the resolution of the plain image increases, the quality of the decrypted image also increases. To a certain extent, the reconstruction quality is satisfactory. In short, the proposed encryption scheme can provide double protection of image data and appearance, and has great potential for application in other fields such as medicine and transportation.
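For reference, the PSNR between two images can be computed as in the following Python sketch. It assumes the standard definition for 8-bit images (peak value 255, mean squared error over all pixels); the function name `psnr` is hypothetical.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; returns inf for identical images."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

host = np.zeros((4, 4), dtype=np.uint8)
cipher = host + 1                # every pixel differs by one gray level
value = psnr(host, cipher)       # MSE is 1, so the PSNR is 10*log10(255^2)
```

The same function serves both measurements in this section: host versus cipher image (imperceptibility) and plain versus decrypted image (reconstruction quality).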

Influence of gain factor on simulation results
Since the amplitude of the encrypted data is regulated by the gain factor in the embedding process, the influence of the gain factor on the simulation results is examined in this section and plotted in Fig. 8. The red curves represent the PSNR between the meaningful cipher images and the host images, and the blue curves represent the PSNR between the plain images and the decrypted images. It can be seen from the figure that as the value of the gain factor increases, the red curve gradually rises, whereas the blue curve first rises and then falls. Additionally, when the quality of the decrypted image is balanced against the visual security of the cipher image, the optimal gain factor is different for different plain images.

Influence of embedding position on decryption quality
In the proposed DIVME algorithm, we introduce a FRFT-based embedding method, which can embed two plain images into the real and imaginary parts of complex matrix, respectively. Thus, this subsection will evaluate the influence of embedding position on decryption. First, two identical plain images are subjected to the proposed algorithm under the condition that 1 = 2 = 35 and other encryption parameters are consistent with those described in Section 5.1. Besides, the image Baboon is selected as the host image. The experimental results are plotted in Fig.9. As observed from Fig.9, the embedding position basically has no effect on the quality of decrypted image. Therefore, the plain images can be encrypted and freely embedded into different locations.

Violent attack
The key space and key sensitivity together determine the ability of an algorithm to resist violent (brute-force) attacks. In this paper, the secret keys mainly consist of the following three parts: (a) the four keys used for generating the initial values of the two chaotic systems; (b) the pair $(h_1, h_2)$ used for calculating the location of the internal parameters in the alpha channel; and (c) the two rotation angles. Suppose that the step lengths of the four keys in (a) and of $h_i$ ($i = 1, 2$) are $10^{-14}$ and $10^{-10}$ respectively, while the step of each rotation angle is $10^{-2}$. Then the total key space can be calculated by Eq. (34).
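The arithmetic behind this estimate can be reproduced with a few lines of Python. The value ranges are assumptions made for this sketch: each key in (a) and (b) is taken to span an interval of unit length, and each rotation angle is taken to span $[0, 2\pi)$. The variable names are hypothetical.

```python
import math

chaos_key_steps = [1e-14] * 4    # four keys for the chaotic initial values
alpha_key_steps = [1e-10] * 2    # h1 and h2
angle_steps = [1e-2] * 2         # two FRFT rotation angles

key_space = 1.0
for step in chaos_key_steps + alpha_key_steps:
    key_space *= 1.0 / step              # assumed unit-length range per key
for step in angle_steps:
    key_space *= 2.0 * math.pi / step    # assumed range [0, 2*pi) per angle

bits = math.log2(key_space)              # equivalent key length in bits
print(f"key space = {key_space:.3e} (about 2^{bits:.0f})")
```

Under these assumptions the product equals $4\pi^2 \times 10^{80}$, far above the $2^{100}$ level commonly regarded as sufficient against exhaustive search.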
Tab. 2 Comparison of key space.

Algorithm    Ours                        Ref. [19]               Ref. [31]    Ref. [33]    Ref. [43]
Key space    $4\pi^2 \times 10^{80}$     $2.56 \times 10^{59}$   $10^{56}$    $10^{56}$    $10^{75}$

To qualitatively analyze the key sensitivity, the images Lena and Woman are subjected to the proposed algorithm. Then modified secret keys, obtained by adding a slight change to one of the correct keys, are used for decryption. The decrypted images are illustrated in Fig. 10. It can be clearly seen that when one of the correct keys is slightly changed, the decrypted image does not visually reveal any useful information about the plain image, indicating that the proposed encryption scheme is sensitive to the keys. To sum up, our scheme is sufficiently resistant to violent attacks.
Fig. 10 Decrypted image "Lena" using incorrect keys. Surviving panel labels: (e) $h_1 + 10^{-10}$; (f) $h_1 + 10^{-10}$; (g) rotation angle $+ 10^{-2}$; (h) rotation angle $+ 10^{-2}$.

Statistical attack
The pixel distribution of an image can be reflected by its histogram. Because of the strong correlation between adjacent pixels in a natural image, its histogram always presents an uneven shape. However, for an effective visual image encryption algorithm, the histogram of the visually secure cipher image should be as consistent as possible with that of the corresponding host image. Next, the distance of histogram intersection [43] is introduced to measure the slight differences between host images and cipher images, which can be calculated by Eq. (35).
Where (J, V) is a pair of histograms and L represents the bit depth of the image. As shown in Eq. (35), when the histograms J and V are equal, the distance of histogram intersection reaches its maximum value of one. In the experiment, plain images with different resolutions are encrypted and embedded into different host images, and the obtained results are listed in Tab. 3. It is observed that the distance of histogram intersection between the host images and the cipher images is close to one, indicating that the proposed DIVME algorithm has good visual security. It can also be seen that the choice of host image has little impact on this distance.
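A minimal Python sketch of this measure follows. It assumes the common normalized form of histogram intersection, namely the sum of bin-wise minima divided by the total count of the second histogram, which equals one for identical histograms as stated above. The function name `histogram_intersection` is hypothetical, and both images are assumed to be unsigned-integer arrays of the same size.

```python
import numpy as np

def histogram_intersection(host, cipher, bit_depth=8):
    """Normalized histogram intersection of two integer-valued images."""
    levels = 2 ** bit_depth
    j = np.bincount(np.asarray(host).ravel(), minlength=levels)
    v = np.bincount(np.asarray(cipher).ravel(), minlength=levels)
    return np.minimum(j, v).sum() / v.sum()

host = np.array([[0, 1], [2, 3]], dtype=np.uint8)
same = histogram_intersection(host, host)          # identical histograms
disjoint = histogram_intersection(host, host + 4)  # no gray level in common
```

Values near one mean that embedding barely changed the gray-level distribution of the host image.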
Tab.3 The difference between the histograms of host images and cipher images.

Noise attack
A meaningful cipher image transmitted over a channel is inevitably affected by various kinds of noise, which destroys part of the ciphertext data and makes the plain image harder to recover. Therefore, several experiments are performed in this subsection to evaluate the anti-noise performance of the proposed DIVME algorithm, under the condition that the parameters fv and gv hidden in the alpha channel are not destroyed. First, the plain images 'Lena' and 'Woman' with a resolution of 512 × 512 are encrypted and embedded into the host image 'Lake' via FRFT embedding.
Then, salt and pepper noise (SPN), speckle noise (SN) and Gaussian noise (GN) with normalized intensities of 0.0001%, 0.0005%, 0.001% and 0.005% are added to the meaningful cipher image respectively. The resultant decrypted images are plotted in Fig. 11 to Fig. 13. Besides, the values of PSNR between the decrypted image and the plain image under different noise attacks are listed in Tab. 4.
As illustrated in Fig. 11 to Fig. 13 and Tab. 4, when the attack intensity varies from 0.0001% to 0.005%, the quality of the decrypted image drops significantly, with a maximum drop of 4.787 dB, but the secret information carried by the decrypted image can still be identified visually. Moreover, the experimental data show that for the same noise intensity, GN has the greatest impact on the proposed scheme, while SPN and SN have the least impact. In general, the proposed DIVME scheme has a good ability to resist noise interference.

Cropping attack
The ability of the proposed DIVME algorithm to resist cropping attacks is also evaluated in this subsection. As before, it is assumed that the parameters fv and gv in the meaningful cipher image are not corrupted. Blocks of size 128 × 128, 180 × 180 and 256 × 256 are cut from different positions of the cipher images, as shown in the first row of Fig. 14, and the corresponding decrypted images are depicted in the second and third rows of Fig. 14.
It can be seen from the figure that the position of the cropped block has basically no impact on the visual quality of the decrypted image in our scheme. Besides, as the size of the cropped block increases, the quality of the decrypted image decreases. However, when a quarter of the cipher image data is lost, the information carried by the plain image can still be recognized visually in the decrypted image, and the PSNR values of the two decrypted images are about 26.7 dB and 26.8 dB respectively, indicating that our DIVME scheme can withstand cropping attacks to a certain extent, provided that the eigenvalues are not lost.

Differential attack
Differential attack is the most common attack method used by attackers. It refers to attempting to explore the relationship between the plain image and the cipher image by analyzing the cipher images produced by two plain images which differ by only one pixel, and then crack the secret information carried in other cipher images without using the key. The NPCR and UACI [44][45] are adopted to quantitatively measure the ability of our scheme against the differential attack in this section. And their calculation equations are displayed in Eq. (36) and Eq.(37).
In Eq. (36), the symbol Sign stands for the sign function. When the two cipher images have equal pixel values at a given position, the value of the sign function is 0; otherwise it is ±1. In our scheme, the resulting cipher image is visually similar to the host image. Thus, the smaller the values of NPCR and UACI are, the more difficult it is to find the relationship between the cipher image and the plain image.
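The two indicators can be computed as in the following Python sketch. It assumes the standard definitions for 8-bit images: NPCR is the percentage of positions where the two cipher images differ, and UACI is the mean absolute difference normalized by 255. The function name `npcr_uaci` is hypothetical.

```python
import numpy as np

def npcr_uaci(c1, c2, peak=255.0):
    """Return (NPCR, UACI) in percent for two equally sized cipher images."""
    c1 = np.asarray(c1, dtype=np.float64)
    c2 = np.asarray(c2, dtype=np.float64)
    diff = c1 - c2
    npcr = 100.0 * np.mean(np.abs(np.sign(diff)))   # share of changed pixels
    uaci = 100.0 * np.mean(np.abs(diff)) / peak     # average change intensity
    return npcr, uaci

c1 = np.zeros((2, 2), dtype=np.uint8)
c2 = c1.copy()
c2[0, 0] = 51                      # a single pixel out of four changes by 51
npcr, uaci = npcr_uaci(c1, c2)     # 25% of pixels changed, 5% mean intensity
```

For a visually meaningful cipher image, both values are expected to stay small, because the cipher image must remain close to the host image regardless of the plain images.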
Tab. 5 gives the NPCR and UACI values for plain images with different resolutions. As illustrated in Tab. 5, the values of NPCR and UACI are very close to 0, indicating that the proposed DIVME algorithm has strong resistance to differential attacks. Up to now, many image encryption algorithms have been broken by chosen-plaintext and known-plaintext attacks, such as those in Ref. [46][47][48]. The main reason is that different plain images correspond to the same key stream in encryption. However, in this paper, the eigenvalues of the two plain images are utilized to control the generated key stream. First, the initial values of the QCNN are determined by one plaintext eigenvalue. Then, the QCNN is iterated and its outputs are sorted to produce two sets of index sequences.
One set of sequences is applied to control the Fisher-Yates confusion performed on the image, and the other set is used to randomly embed the encrypted data into the host image. In addition, the other plaintext eigenvalue is adopted to control the IHM to generate the key-controlled partial Hadamard matrix, which is used to compress the two plain images in parallel. In short, since the proposed DIVME algorithm can realize "One plain image corresponds to one key", it can withstand known-plaintext and chosen-plaintext attacks.
Additionally, in order not to violate the basic principles of symmetric cryptosystems, we propose to embed the eigenvalues into the meaningful cipher image and transmit it to the decoders. Moreover, the scheme of hiding the eigenvalues has two characteristics. First, the location of the eigenvalues is controlled by the key. Secondly, the eigenvalues are embedded in the alpha channel of the visual cipher image to reduce the probability of being attacked.

Running efficiency analysis
Besides security, running efficiency is also an indispensable indicator for evaluating the performance of an algorithm in practical applications. The execution times required for encryption and decryption of images with different resolutions are listed in Tab. 6 and Tab. 7. In these two tables, 'Pretreatment' represents the phase that generates the initial values of the chaotic systems, and 'Reconstruction' stands for the process of recovering the two natural images from the encrypted data through the $\ell_0$-norm-based reconstruction algorithm and the inverse Fisher-Yates scrambling.
As can be seen, embedding and reconstructing the encrypted data account for a large proportion of the total encryption and decryption time. For example, for a plain image of size 512 × 512, the FRFT embedding phase takes up about 89.465% of the total encryption time, and the reconstruction phase occupies around 89.966% of the total decryption time.
In another aspect, when the resolution of the plain image changes from 256 × 256 to 512 × 512, the time spent in the encryption and decryption process increases about 10 times and 6 times, respectively. Thus, it is suggested to divide the large-scale plain image into several small images and then perform encryption and embedding operations in parallel to shorten the execution time.

Comparison with the existing work
To highlight the proposed DIVME algorithm, we first summarize the characteristics of the existing visually meaningful image encryption algorithms and compare them with ours. The comparison results are displayed in Tab. 8.
Besides, some of them are compared with our proposed algorithm in terms of the following three aspects: visual security, anti-noise performance and running efficiency, as shown in Tab. 9 to Tab. 11. Note that for the sake of fairness, the experimental data of Ref. [19,32,43] are obtained from their source articles or the related articles, and N/A means that the value is not provided. It can be seen from the comparison results that our DIVME algorithm has better anti-noise performance and higher running efficiency than those in Ref. [32,43]. As for visual security, the scheme proposed in this paper is comparable to that of Ref. [43], and both are significantly better than that of Ref. [32].
Tab. 8 Comparison of the characteristics for different algorithms.

No. 1
Other algorithms: The plain image with resolution of $m \times n$ is pre-encrypted and embedded into a host image with resolution of $2m \times 2n$ to obtain a meaningful cipher image, thus increasing the cost of storage and transmission, such as Ref. [29,33].
Ours: Before the embedding phase, compressed sensing is adopted to compress the plain image to a quarter of its original resolution. Thus, the visual cipher image has the same resolution as the plain image in our scheme.

No. 2
Other algorithms: Only one plain image can be encrypted at a time, such as Ref. [49][50][51][52]. The encryption efficiency and transmission efficiency are limited.
Ours: Two plain images can be simultaneously encrypted and embedded into a host image with the same resolution as the plain image via the FRFT embedding method in this paper. The transmission efficiency is improved, and no additional transmission cost is required.

No. 3
Other algorithms: There are both quantization error and truncation error in Ref. [50][51][52]. Because of the accumulation of errors, the visual quality of the decrypted image is reduced.
Ours: In the proposed DIVME algorithm, to reduce the influence of errors, the encrypted data is directly embedded into the host image under the control of the gain factor, without a quantization operation.

No. 4
Ours: In the proposed FRFT-based embedding method, the rotation angles of the fractional Fourier transform can be used as keys to improve the security of the hidden ciphertext.

Tab. 9 Comparisons of the PSNR and MSSIM between the cipher and host images in different encryption schemes.

Conclusions
This paper introduces an efficient DIVME algorithm combining QCNN, CS and an FRFT-based embedding method, which can simultaneously realize image data security and appearance security. Besides, to withstand plaintext attacks, the eigenvalues of the plain images are adopted to generate the key streams. These eigenvalues are then embedded into the alpha channel of the meaningful cipher image to reduce the probability of being destroyed. Finally, a series of security analyses indicate that the proposed DIVME algorithm not only has high visual security and decryption quality, but can also resist diverse attacks, such as violent attack, noise attack and chosen-plaintext attack. In future work, we will devote ourselves to further improving the sparsity of plain images to attain a higher image compression rate.

Conflicts of Interest
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of this paper.