Novel medical image cryptogram technology based on segmentation and DNA encoding

This paper proposes a novel medical image cryptogram technology based on a fast and robust fuzzy C-means clustering image segmentation method and deoxyribonucleic acid (DNA) encoding. First, the medical image is divided into background areas and regions of interest using fuzzy C-means clustering image segmentation; discarding the background area increases the encryption efficiency by about 60%. Second, some low-value pixels in the regions of interest are also discarded to further reduce the encryption time. Third, a 4-dimensional hyperchaotic system is improved. The hyperchaotic system and DNA encoding are then used to encrypt the medical image. Finally, lossless encryption and fast encryption are provided for different purposes. The experimental results demonstrate that the proposed algorithm has appealing encryption performance and that the histograms and scatter plots of the encrypted images are approximately uniformly distributed. The NPCR and UACI for plaintext sensitivity and key sensitivity are close to 99.6094% and 33.4635%, respectively, and the scheme is robust against noise and clipping attacks.


Introduction
The remainder of the paper is organized as follows: Section 2 provides a detailed description of the proposed scheme's preliminaries. Section 3 introduces the encryption and decryption schemes. Section 4 presents the experimental results. Section 5 presents the performance and security analysis, while Section 6 compares our method against current techniques. Finally, Section 7 concludes this work.
2 Hyperchaotic system and DNA encoding

Hyperchaotic systems model and performance analysis
This paper improves a 4-D hyperchaotic system [31]. The multiple-wing hyperchaotic system reflects the difference in system states, and the different states of the attractor can generate different keys, which improves security. The model of the system is given by Eq. (1), where a, b, c, and d > 0; in this work, we set their values to 5, 6.5, 7 and 4, respectively. When x_1 = 2, y_1 = 8, z_1 = 4.5, and w_1 = 6, the system has a typical four-winged hyperchaotic attractor, which is illustrated in Fig. 1. The Lyapunov exponent (LE) indicates whether a system is chaotic. When b = 6.5, c = 7, d = 4 and x_1 = 2, y_1 = 8, z_1 = 4.5, w_1 = 6, the LE changes with parameter a, as shown in Fig. 2 (a).
According to Fig. 2 (a), when a ∈ (3, 13), the system is in a hyperchaotic state. Similarly, the changes of the LE with parameters b, c and d are illustrated in Fig. 2 (b), Fig. 2 (c) and Fig. 2 (d). They reveal two positive Lyapunov exponents, so the proposed system is hyperchaotic.
The NIST SP800-22 test suite assesses the randomness of chaotic sequences with 15 test methods. Each test produces one P-value or a set of P-values, and if every P-value is greater than or equal to 0.01, the chaotic sequence is considered random. The system's test results are reported in Table 1, which shows that all test values are greater than 0.01. Hence the chaotic sequence has good randomness and is suitable for encrypting images.
This paper uses DNA encoding to encrypt the medical image. First, medical images are converted into 8-bit grayscale images. Each pixel can then be represented as a DNA sequence, which is built from four nucleic acid bases. The DNA coding rules are listed in Table 2.
For example, if the value of the first pixel of a grayscale image is 173, it is converted to the binary sequence 10101101. Using DNA coding rule 1, we obtain the DNA sequence CCTG. Similarly, using DNA coding rule 1 to decode the same DNA sequence, we recover the binary sequence 10101101. If we use DNA coding rule 2 to decode the same DNA sequence, we get the wrong binary sequence 01011110.
Table 3 shows the encoding rules of the DNA + operation. The base in the first row is added to the base in the first column, and the result is the intersection of their row and column. Table 4 shows the encoding rules of the DNA − operation. The base in the first row is subtracted from the base in the first column, and the result is the intersection of their row and column.
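The pixel-173 example can be checked with a short Python sketch. The two rule mappings below are inferred from the worked example in the text only; the full rule table (Table 2) is not reproduced here, so the dictionary `RULES` and the function names are illustrative assumptions rather than the paper's definitions.

```python
# Bit-pair to base mappings inferred from the worked example in the text.
# Assumption: these stand in for "rule 1" and "rule 2" of Table 2.
RULES = {
    1: {"00": "A", "01": "G", "10": "C", "11": "T"},
    2: {"00": "A", "01": "C", "10": "G", "11": "T"},
}

def dna_encode(pixel, rule):
    """Encode one 8-bit pixel value as a four-base DNA string."""
    bits = format(pixel, "08b")
    table = RULES[rule]
    return "".join(table[bits[i:i + 2]] for i in range(0, 8, 2))

def dna_decode(seq, rule):
    """Decode a four-base DNA string back to an 8-bit pixel value."""
    inverse = {base: pair for pair, base in RULES[rule].items()}
    return int("".join(inverse[base] for base in seq), 2)

print(dna_encode(173, 1))                    # CCTG
print(format(dna_decode("CCTG", 1), "08b"))  # 10101101
print(format(dna_decode("CCTG", 2), "08b"))  # 01011110
```

Decoding with a rule other than the one used for encoding yields a different byte, which is why the rule index can act as part of the key.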
In this work, DNA + and DNA − operations are used to merge the key with the plain image. For example, taking two different DNA sequences CTAG and ACGT into consideration, the DNA + operation result is GCCG.

Initial key generation
The medical image cryptography system is based on fast and robust fuzzy C-means clustering (FRFCM) [14], DNA encoding and a 4-D hyperchaotic system. It consists of four stages: initial value generation, keystream generation, scrambling and diffusion. The initial values of the chaotic system are generated as follows:
Step 1: The parameters of the chaotic system are used as a fixed password, and the plaintext information (PI) of the 64-bit sequence value is composed of the doctor-patient information and the device information in the medical image. The plaintext information is the input of SHA-256, as shown in Eqs. (2) and (3).
Step 2: After K is converted into a decimal sequence, adjacent elements are added from left to right. This yields a 32-element decimal sequence KK (kk_1, kk_2, kk_3, …, kk_32), as shown in Eq. (4).
Step 3: After kk_1 and kk_32 are discarded, the XOR and modulo operations are performed from left to right on every six adjacent elements. This yields a five-element decimal sequence KX (kx_1, kx_2, kx_3, kx_4, kx_5), as shown in Eq. (5).
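The counting in Steps 1 to 3 (64 values, then 32, then 30, then 5) can be illustrated with the following Python sketch. Eqs. (2) to (5) are not reproduced in this text, so several details are assumptions made only for illustration: the hash is read as 64 hexadecimal digits, adjacent elements are paired without overlap, and the final reduction uses a hypothetical `modulus` parameter. The function name `derive_kx` and the sample input string are likewise hypothetical.

```python
import hashlib

def derive_kx(plaintext_info, modulus=16):
    """Sketch of Steps 1-3; pairing and modulus are assumptions, not Eqs. (2)-(5)."""
    # Step 1: SHA-256 of the plaintext information, read as 64 hex digits (0..15).
    digest = hashlib.sha256(plaintext_info.encode("utf-8")).hexdigest()
    k = [int(ch, 16) for ch in digest]
    # Step 2 (assumed non-overlapping pairing): 64 values -> 32 sums.
    kk = [k[i] + k[i + 1] for i in range(0, 64, 2)]
    # Step 3: drop kk_1 and kk_32, XOR every six adjacent values, then reduce.
    inner = kk[1:-1]
    kx = []
    for start in range(0, len(inner), 6):
        acc = 0
        for value in inner[start:start + 6]:
            acc ^= value
        kx.append(acc % modulus)  # hypothetical modulus
    return kk, kx

kk, kx = derive_kx("patient-0001|device-A")  # hypothetical plaintext information
print(len(kk), len(kx))  # 32 5
```

Because the hash depends on the doctor-patient and device information, two images with different metadata start the chaotic system from different values.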

Encryption and decryption schemes
After the initial values of chaotic sequences are generated, the steps of chaotic sequence generation, image segmentation, scrambling and diffusion are carried out. Figure 3 shows the medical image encryption process.
Step 1: The original medical image is segmented by the FRFCM algorithm [28], and different segmentation regions are extracted from the segmented image (model 1, model 2, …, model n). The segmentation regions are multiplied by the original image to obtain the regions of interest (ROI). The ROI can be further divided into ROI 1, ROI 2, ..., ROI n, where n is odd. The ROI 1 with the smallest pixel value is discarded. The non-zero pixels in each remaining ROI are then extracted, which gives the pixel sequences (tq_1, tq_2, …, tq_{n−1}) with lengths l_1, l_2, …, l_{n−1}, respectively.
Step 2: The initial values (kx_1, kx_2, kx_3, kx_4) of the chaotic system generate four chaotic sequences (s_1, s_2, s_3, s_4) of length L (L = l_1 + l_2 + … + l_{n−1}). s_1 and s_2 are used to scramble tq_a and tq_b, respectively. s_3 and s_4 are used during diffusion.
Step 3: We divide s_1 into α_c and s_2 into α_d, and we process them according to the following equation, where i = 1, 2, …, n − 1. β_i is used for confounding tq_i. The confusion operation is swap(tq_i(j), tq_i(α_i(j))), where j = 1, 2, …, l_i. The sequence W is obtained by joining the subsequences in turn.
Step 4: After the confusion operation, W is DNA-encoded according to the following equation, where kx_5 selects the DNA encoding rule. C_1 is a DNA sequence containing the information about W. For the chaotic sequences s_3 and s_4, S_1 and S_2 are obtained with the same encoding method. DNAencode is computed according to Table 2. Diffusion of C_1 is carried out according to the following equation (see Table 3). Diffusion of C_2 is carried out according to the following equation, where i = 4L − 1, …, 2, 1. E_2 is the diffused sequence. The sequence is then decoded according to the following equation, where E_3 is the final decoded sequence. DNAdecode is computed according to Table 2.
Step 5: After E_3 is converted into a decimal sequence, its values are placed in the encrypted image. The remaining pixels are filled with zeros to obtain the encrypted image.
The decryption operation will be carried out in reverse, and the decrypted images will be obtained by putting the decrypted pixels back to the plain index position.
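The swap-based confusion of Step 3 and its reversal during decryption can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a logistic map stands in for the 4-D hyperchaotic system, and the mapping from chaotic values to swap indices in `swap_indices` is hypothetical, since the corresponding equation is not reproduced in this text. Indices are 0-based here, whereas the text counts from 1.

```python
def logistic_sequence(x0, length, r=3.99):
    """Stand-in chaotic generator (logistic map), not the paper's 4-D system."""
    seq, x = [], x0
    for _ in range(length):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq

def swap_indices(chaos, n):
    """Hypothetical mapping from chaotic values in (0, 1) to indices in [0, n)."""
    return [int(v * 10**6) % n for v in chaos]

def scramble(pixels, alpha):
    """Apply swap(tq(j), tq(alpha(j))) for j = 0, 1, ..., len - 1."""
    out = list(pixels)
    for j in range(len(out)):
        out[j], out[alpha[j]] = out[alpha[j]], out[j]
    return out

def unscramble(pixels, alpha):
    """Undo the scrambling by applying the same swaps in reverse order."""
    out = list(pixels)
    for j in reversed(range(len(out))):
        out[j], out[alpha[j]] = out[alpha[j]], out[j]
    return out

tq = [12, 200, 37, 90, 255, 4, 61, 148]
alpha = swap_indices(logistic_sequence(0.3141, len(tq)), len(tq))
scrambled = scramble(tq, alpha)
assert unscramble(scrambled, alpha) == tq
```

Each swap is its own inverse, so replaying the swaps in reverse order with the same key-derived index sequence restores the original pixel order.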

Simulation
All encryption and decryption experiments are performed in MATLAB R2018b on a platform with an Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz and 8 GB of RAM. We chose several medical images from the MedPix dataset to validate the proposed encryption method. The medical images are named sample_1 (s1), sample_2 (s2), sample_3 (s3) and sample_4 (s4), and Fig. 4 shows them. Figure 5 shows the lossless encrypted images (ENI) and the decrypted images (DEI). Lossless encryption encrypts all pixels except zero-value pixels, so its encryption time is long. To overcome this problem, a fast encryption technology is proposed. Figure 6 shows the fast encrypted images (ENI) and the decrypted images (DEI). As shown in Fig. 6, fast encryption has fewer encrypted pixels than lossless encryption, so the black area occupies half the image. The encryption time is shorter, and the decrypted image does not affect the diagnosis.

Key space analysis
The size of the key space represents the algorithm's ability to resist brute force attacks. The larger the key space, the stronger the security of the image encryption. In this paper, the secret keys include the hash values (K1) generated from the doctor-patient information with the SHA-256 hash function and the internal clustering key (K2) set by the user. In addition, the secret keys include the parameters a, b, c, d and the initial conditions x_0, y_0, z_0 and w_0 of the 4-D hyperchaotic system, as well as auxiliary keys such as the DNA coding rules (K3) and the DNA manipulation (K4). The key space can reach 2^256 × 2^8 × 2^5 × 2^4 = 2^273, which is much larger than the required 2^100. The encryption system therefore has sufficient key space, and the proposed encryption algorithm is capable of withstanding brute force attacks.

Histogram analysis
The histogram represents the pixel value distribution, and the histogram variance quantitatively evaluates the image uniformity. The smaller the variance of the encrypted image, the more uniform the distribution. The variance is measured as follows, where v_i and v_j are the pixel numbers, i is the gray value of the plain image and j is the gray value of the encrypted image. The variances of the plain and encrypted images are reported in Table 5, which shows that the variance of the encrypted image is less than 293.24, indicating that the frequency distribution of the encrypted pixels is nearly uniform. The histograms of the plain (HPI) and encrypted (HENI) images are illustrated in Fig. 7. The histograms of the plain images have apparent peaks and valleys, which reveal information about the image. The histograms of the encrypted images are flat, and therefore the scheme can resist statistical analysis attacks.
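As an illustration of this metric, the following Python sketch computes a histogram variance using the definition commonly found in the image encryption literature, var(V) = (1/n^2) Σ_i Σ_j (v_i − v_j)^2 / 2, where V is the vector of n = 256 histogram counts. This form is assumed here because the paper's own equation is not reproduced in this text; the function names are illustrative.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray value."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts

def histogram_variance(pixels, levels=256):
    """Assumed definition: (1 / n^2) * sum_i sum_j (v_i - v_j)^2 / 2."""
    v = histogram(pixels, levels)
    total = sum((vi - vj) ** 2 / 2 for vi in v for vj in v)
    return total / (levels ** 2)

flat = list(range(256)) * 4   # perfectly uniform histogram
peaked = [0] * 1024           # all pixels share one gray value
print(histogram_variance(flat))    # 0.0
print(histogram_variance(peaked))  # 4080.0
```

A perfectly flat histogram gives zero variance, so lower values for the encrypted image indicate a more uniform distribution.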

Correlation analysis
The correlation of adjacent pixels is an important index for evaluating a cryptosystem's quality. The correlation coefficient (Corr) of two adjacent pixels is obtained from Eq. (12), where x_1 and x_2 represent the two neighboring pixel values, n is the number of pixels, and E(x_1) and D(x_1) represent the expectation and variance, respectively. The correlation coefficients of the plain and the encrypted images are reported in Table 6. The table shows that the plain image has correlation coefficients close to one in all directions, indicating a strong correlation. However, the correlation coefficient of the encrypted image is close to zero in all directions, so the correlation is destroyed by encryption. The correlation scatter diagrams of s3 in the horizontal (H), vertical (V), positive diagonal (P), and negative diagonal (N) directions are illustrated in Fig. 8, which shows that the pixels of the plain image are distributed near the diagonal. In contrast, the pixels of the encrypted image (EN) are evenly distributed. Therefore, our method can resist statistical attacks.
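The correlation coefficient can be sketched in Python as below, assuming the standard form Corr = E[(x_1 − E(x_1))(x_2 − E(x_2))] / sqrt(D(x_1) D(x_2)), which matches the symbols defined above; Eq. (12) itself is not reproduced in this text. The helper `horizontal_pairs` and the toy image are illustrative assumptions.

```python
import math

def correlation(x1, x2):
    """Correlation coefficient of two equally long pixel sequences."""
    n = len(x1)
    e1, e2 = sum(x1) / n, sum(x2) / n
    d1 = sum((a - e1) ** 2 for a in x1) / n
    d2 = sum((b - e2) ** 2 for b in x2) / n
    cov = sum((a - e1) * (b - e2) for a, b in zip(x1, x2)) / n
    return cov / math.sqrt(d1 * d2)

def horizontal_pairs(image):
    """Collect every pixel and its right-hand neighbor from a list of rows."""
    left = [row[c] for row in image for c in range(len(row) - 1)]
    right = [row[c + 1] for row in image for c in range(len(row) - 1)]
    return left, right

# Toy gradient image: neighboring pixels differ by exactly one gray level.
gradient = [[r + c for c in range(8)] for r in range(8)]
left, right = horizontal_pairs(gradient)
print(round(correlation(left, right), 6))  # 1.0
```

The vertical and diagonal directions follow the same pattern with a different choice of neighbor.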

Information entropy
Information entropy is an indicator of the randomness and unpredictability of information. It is defined as follows, where p(x_i) is the probability of occurrence of the symbol x_i, and N is the total number of x_i. The entropies of the plain and encrypted images are presented in Table 7, which shows that the plain images have entropies less than 7.6, whereas the entropies of the encrypted images are close to the theoretical value of 8. Therefore, the encrypted image has high randomness, and it is difficult for an attacker to obtain valid information from the encrypted images.
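A minimal Python sketch of this measure, assuming the standard Shannon form H = −Σ p(x_i) log2 p(x_i) over the gray values that occur in the image, is given below. The function name is illustrative.

```python
import math
from collections import Counter

def entropy(pixels):
    """Shannon entropy in bits per pixel of a flat list of gray values."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())

print(entropy(list(range(256))))  # 8.0
```

An 8-bit image in which all 256 gray values are equally likely reaches the maximum of 8 bits, which is the theoretical value referred to above.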

Differential attack
In a differential attack, the attacker makes a slight change to the pixels or the key and observes the effect on the encrypted image. The ability of the encryption scheme to resist differential attacks can be evaluated by the number of pixels change rate (NPCR) and the unified average changing intensity (UACI). These metrics have the following mathematical expressions, where C_1 and C_2 are the first and the second encrypted images and w and h are the width and height of the image, respectively. Here, 2^8 represents the number of gray levels. The NPCR and UACI results are reported in Table 8. From Table 8, the NPCR and UACI results of the encrypted images are close to the theoretical values of 99.6094% (NPCR) and 33.4635% (UACI), respectively. Hence, the suggested method has high plaintext sensitivity and can resist differential attacks.
Key sensitivity is as important as plaintext sensitivity. From Table 9, the NPCR and UACI results of the encrypted images are close to the theoretical values of 99.6094% (NPCR) and 33.4635% (UACI), respectively. Hence, the suggested method has high key sensitivity and can resist differential attacks.
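The two metrics can be sketched in Python as follows, using the definitions that are standard in the literature: NPCR is the percentage of positions at which the two encrypted images differ, and UACI is the mean absolute difference normalized by the maximum gray value. The normalization by 2^8 − 1 = 255 is an assumption, because the paper's expressions are not reproduced in this text. Images are passed as flat lists of gray values.

```python
def npcr(c1, c2):
    """Percentage of pixel positions whose values differ between c1 and c2."""
    changed = sum(1 for a, b in zip(c1, c2) if a != b)
    return 100.0 * changed / len(c1)

def uaci(c1, c2, levels=256):
    """Mean absolute difference as a percentage of the maximum gray value."""
    total = sum(abs(a - b) for a, b in zip(c1, c2))
    return 100.0 * total / ((levels - 1) * len(c1))

print(npcr([0, 0, 0, 0], [0, 1, 2, 0]))  # 50.0
print(uaci([0], [255]))                  # 100.0
```

For two independent, uniformly random 8-bit images these definitions give expected values of 255/256 ≈ 99.6094% and 257/768 ≈ 33.4635%, which are the theoretical values quoted above.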

Noise attack and clipping attack
Encrypted images may be attacked during their transmission over public channels. Such attacks result in changes to and loss of data; commonly used attacks include the salt and pepper noise attack (SPNA), the speckle noise attack (SNA) and the clipping attack (CA). To evaluate the robustness of the encryption method, we applied two different types of attacks to the encrypted image (ENI), and the corresponding decrypted image (DEI) is presented in Fig. 9. Under the different attack types, the decrypted image remains understandable and does not affect the diagnosis. Thus, the developed method is robust against SPNA and CA.
To evaluate the robustness of encryption and watermarking methods, we employ the peak signal-to-noise ratio (PSNR), mean square error (MSE) and structural similarity (SSIM) metrics, defined as follows, where P_{i,j} is the plain image pixel or the original watermark image pixel, and C_{i,j} is the decrypted image pixel or the extracted watermark image pixel. μ_P and μ_C are the average pixel values of images P and C, respectively, σ_P and σ_C denote the variances of P and C, σ_PC is the covariance between P and C, and l_1 and l_2 are constants. Information may be attacked in transit. Figure 10 shows the PSNR and SSIM results of the decrypted images under SNA (2 × 10^−6). In Fig. 10, the PSNR of fast encryption is greater than that of lossless encryption, and the SSIM of fast encryption is similar to that of lossless encryption. This demonstrates that fast encryption can not only withstand SNA but also improve encryption efficiency. Table 10 shows the PSNR and SSIM results of the decrypted images under SPNA.
From Table 10, it can be seen that the PSNR and SSIM decrease when the density of SPNA increases. After decryption, the quality of the image is reduced, but the content of the image can still be recognized. Moreover, the PSNR and SSIM of lossless encryption are better than those of fast encryption, because fast encryption discards some low-value pixels. Furthermore, information may be lost in transmission, and a CA is equivalent to such information loss. Table 11 provides the PSNR and SSIM results for the decrypted images under the CA.
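The MSE and PSNR used in this evaluation can be sketched in Python as follows, assuming the standard definitions for 8-bit images: MSE is the mean of the squared pixel differences and PSNR = 10 log10(255^2 / MSE). The paper's own expressions are not reproduced in this text, and SSIM is omitted from the sketch because it depends on the constants l_1 and l_2. Images are passed as flat lists of gray values.

```python
import math

def mse(p, c):
    """Mean square error between plain image p and decrypted image c."""
    return sum((a - b) ** 2 for a, b in zip(p, c)) / len(p)

def psnr(p, c, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(p, c)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

plain = [10, 20, 30, 40]
noisy = [10, 20, 30, 255]  # one pixel hit by salt noise
print(mse(plain, noisy))
print(psnr(plain, noisy))
```

A lower PSNR after decryption means that more of the injected noise or clipped data has propagated into the recovered image.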
From Table 11, it is obvious that the PSNR and SSIM also decrease when the density of CA increases. Moreover, the PSNR and SSIM of the lossless encryption are also better than the fast encryption. SSIM of the lossless encryption is close to the theoretical value one. The lossless

Encrypted pixel ratio
The algorithm proposed in this paper discards some pixels that do not affect the diagnosis in order to improve the encryption efficiency. Figure 11 shows the encrypted pixel ratio of the sample images for the general encryption method and the proposed encryption method.
In Fig. 11, the ratio of lossless encrypted pixels is less than 90%, so the reduction in encrypted pixels is small. The ratio of fast encrypted pixels is about 60%, so the reduction in encrypted pixels is relatively large. The reduction in the number of encrypted pixels improves encryption efficiency without affecting the doctor's diagnosis.
Table 12 shows the performance comparison of the proposed encryption method with some of the existing encryption methods. It can be seen from Table 12 that the proposed encryption method is superior to current encryption methods, and the indexes of the proposed encryption methods are closer to the theoretical values. This is because some low-value pixels are discarded in fast encryption and some unimportant information is lost. Moreover, fast encryption is faster than lossless encryption, which saves time and computing resources. The appropriate encryption method can be chosen according to the requirements.

Conclusion
In this paper, a medical image encryption method based on a hyperchaotic system and DNA encoding is proposed. It has several advantages. First, the ROI can be extracted so that only the important pixels are encrypted, which reduces the encryption time. Second, a hyperchaotic system is used for medical image encryption. Third, computing resources can be saved through the use of DNA encryption, because the sequences of pixel values and the chaotic sequences can be encoded and decoded at the same time. Finally, the results of the security analysis and experiments show that the proposed encryption method can withstand various attacks, such as noise attacks, clipping attacks and statistical analysis. Compared with traditional encryption methods, the proposed medical image encryption method achieved better results in all the tests.
Based on the above advantages, the method can be applied to a secure medical system. In future work, we intend to implement this method in hardware to improve the execution efficiency of the encryption algorithms. Although the proposed scheme focuses on medical image encryption, it is not limited to this field. Further work may explore related applications in other areas of information security.