Preservation of Data Integrity in Public Cloud Using Enhanced Vigenere Cipher Based Obfuscation

Today's internet users are moving to cloud computing to keep their data private and secure on public infrastructure. In cloud scenarios, many security principles are implemented to maintain the secure transmission of data over the internet. Still, the main concern is maintaining the integrity of one's own data in the public cloud. Most research works concentrate on cryptographic techniques for the secure sharing of data, but comparable works on data integrity are lacking. In this paper, a data masking technique called obfuscation is implemented to protect the data from unwanted modification by data breach attacks. In this work, enhanced Vigenere encryption is used to perform obfuscation that maintains the privacy of the user's data. The enhanced Vigenere encryption algorithm is combined with intelligent rules so that encryption is performed with different sets of rules, which keeps the masked outputs dissimilar. This work mainly concentrates on data privacy with reduced time complexity for encryption and decryption.


Introduction
Cloud computing is no longer an emerging technology as it was in the previous decade; it is now a necessary technology that everyone knows. Among the services offered by the cloud, IaaS, PaaS and SaaS are the well-known ones, and SecaaS (Security as a Service) is an additional service. In every scenario where cloud services are discussed, the one word that comes to mind is security. Cloud services are mostly offered by third parties. Storing information in third-party storage, or sharing it through a third-party medium, raises the question of whether the information is shared securely. To ensure secure sharing, cloud service providers (CSPs) maintain service level agreements to prove their integrity at the present level of security. But again, most CSPs need a third-party auditor to prove the integrity level of their organization's security.
Most available security processes maintain the privacy of data through advanced cryptographic techniques. But what kind of privacy is provided for data stored in the public cloud? Although the data resides in the public cloud, it belongs to some private concern. So, the question is one of data integrity in the public cloud for private data. The data stored in a user's own storage space in the cloud is assumed not to need any cryptographic algorithm. Even when the CSP provides such algorithms, they add unwanted encryption and decryption time whenever users access their own data. Another point of discussion is that an attacker can easily get the data from the cloud if they know the algorithm and the key that the user holds.
Taking all these points together, the cloud environment that users need should have the following characteristics. First, the data must be stored in a secure way so that no one else can access it. Second, if an attacker does access the data, they should not obtain the actual data. Third, the security provided for the data should maintain integrity, so that no one can modify the content even in cipher text form. Fourth, users do not want to spend more time on encryption and decryption to access their own data.
The solution for all these points leads to obfuscation. The main objective of obfuscation is to prevent reverse engineering from identifying the kind of cryptographic algorithm used, and to protect the data by changing its form into different views.
The article is organized as follows. Section 2 describes the related works on obfuscation and data security. Section 3 provides the proposed Enhanced Vigenere Encryption Algorithm (EVEA) with intelligence for data privacy in the public cloud from the user side. Section 4 shows the results and performance analysis of the proposed system. Finally, Sect. 5 concludes the article with the advantages of the proposed system and the enhancements needed in the proposed work.

Related Works
Some research articles that describe encryption algorithms and cloud security are discussed in this section. Subhashini and Kavitha [2] provide a survey on security issues in cloud computing. It clearly shows the need for security in cloud computing, mainly in data storage, because security in the cloud implicitly means privacy preservation of data in cloud storage. When cloud users access public cloud storage, the service provider has to provide privacy for the user's data.
Wilson and Gracia [12] proposed a modified Vigenere cipher algorithm to make information encrypted by the Vigenere cipher harder to crack. By adding a few bits of random padding to each byte, one can diffuse the statistical retentiveness found within most messages. The exact quantity of padding is determined by a one-way function in an effort to eliminate the distinguishability of the message bits from the padded random bits. This methodology moderately increases the size of the cipher text but greatly increases the security of the cipher.
Gurpreet Singh [1] proposed a modified Vigenere cipher algorithm that provides privacy for users' data by integrating Base64 and AES. It takes a simple key, converts the alphabets into ASCII values, and applies them in a Vigenere table with 92 keys including all special characters. Govinda and Sathiyamoorthy [3] describe agent-based security for cloud computing using the obfuscation technique. In this work, various algorithms are proposed for different data types, and better results are achieved from a privacy perspective. Even though privacy is achieved, a single algorithm cannot serve all data types, and the obfuscation requires some additional computational complexity.
Zhang et al. [4] provide a series of methodologies for privacy protection in the cloud using different techniques. In this series, one work describes noise generation in cloud computing for data protection based on an association probability method. Another work by Zhang et al. [11] describes a trust-based noise injection strategy for privacy protection. This work calculates the trust value of the service provider and avoids unwanted service injection into the user's window. It creates obfuscation by injecting noise service requests to confuse malicious service requests.
Another recent work by Zhang et al. [10] proposed time series pattern-based obfuscation with probability fluctuation occurrence in the service request. It presents a cluster-based technique for analysing the privacy risk and investigates the corresponding probability fluctuations. All three works by Zhang et al. explain noise generation to confuse malicious service requests issued by intruders. Forecasting techniques need continuous manipulation of data at frequent intervals. They also need third-party authority support for noise generation.
Arockiam and Monigandan [5] give a system for maintaining confidentiality for data security. Their solution for obfuscation is similar to an encryption algorithm, but obfuscation is applied only to numeric values in the data, and the remaining data is encrypted using the prescribed cryptographic algorithm. Yang et al. [7] proposed an obfuscation method that encrypts the data with different keys for different users. Each user can thus access only the information allocated to them, and another user's information cannot be viewed using the key allotted to them. This again requires additional computational and time complexity for generating different keys for different users. Also, if the attacker understands the algorithm used for key generation, then all users' information is easily decrypted.
Tian et al. [9] provide the base view for personal cloud computing. This paper deals with maintaining a private place in the public cloud for individual users. The work intends to serve as a technical reference for the development of security requirements methodologies aimed at the personal cloud.
SaiRamesh et al. [13] give a method for trusted data sharing among users from multiple owners in the same public cloud. This work provides a framework in which multiple data providers share their data with multiple users in a protected way. Another recent work by Selvakumar et al. [6] proposes a multi-authority access control mechanism for maintaining the privacy of users' data in the public cloud. It also protects the data from unauthorized access in the public cloud environment.
Sharieff et al. [14] suggested an image inpainting technique for hiding information inside an image. This technique differs from general watermarking in that it hides the data inside the image through the inpainting process. Enireddy et al. [15] is another work that discusses data obfuscation for cloud security. They used the MONcrypt technique to secure the data from unwanted access. This technique reduces the size of the plaintext, which hinders the attacker from getting the actual data.
Jaithunbi et al. [16] apply intelligent rules with a genetic algorithm to enhance the trust evaluation of the cloud service provider. This system evaluates the trust value of service providers and helps the data owner find a trusted service provider for sharing their data. Yadav et al. [17] proposed a technique for obfuscating source code against hackers who try to create mirrored views of websites and other applications. This helps to reduce the rise of fake websites and applications posing as genuine ones. It also helps to protect personal and sensitive information from hackers.
Kalidoss et al. [18] apply the data anonymisation concept to cloud data to secure it in a vertically partitioned database. It helps data owners share only the needed information with the requesting user without sharing the whole database or its content. Djeki et al. [19] provided a solution that lets data owners check the integrity of their own data before uploading it to cloud storage. This system helps data owners maintain the integrity of their data without the involvement of any third party.
The above survey of related articles gives an overview of the importance of data privacy in the public cloud. It also describes the need for techniques like obfuscation, which avoid computational complexity by using simple encryption algorithms for data masking. Some limitations have been overcome, while others remain unsolved, such as a strong encryption key with a low-complexity computational model. The proposed system addresses this by providing efficient data protection with less computational complexity.

Proposed System
This proposed work focuses on providing security for users' information in the public cloud at low computational cost. In previous work, Mowbray et al. [8] specified a privacy manager for defining the policy specification. Here, the policy specification is maintained by the clients themselves, who choose their encryption algorithm, where to apply it, and so on.
The obfuscation method discussed here encrypts the data on the user side, so that users protect their data from the service provider. For that, the Extended Vigenere Cipher was used in this work, which provides an efficient encryption mechanism with less computational time. Gurpreet Singh [1] proposed the Modified Vigenere Encryption Algorithm (MVEA), which differs from the standard Vigenere algorithm by including all characters on the computer keyboard. But today's internet users work on mobile devices and want everything to be available within the mobile environment. In this scenario, using all special characters in a keyword again becomes a complex task. To avoid these difficulties, we modified and extended the standard Vigenere encryption algorithm with only alphanumeric characters. This system could also achieve a better avalanche effect than MVEA while using alphanumeric characters (62 in total) instead of the 95 characters discussed in MVEA.
The proposed EVEA doesn't contain any special characters because it is not intended for any transfer of information. The main objective of this work is to enhance the privacy of clients' data stored in public cloud storage. This information need not be shared with any other clients. If the need arises, the third-party auditor can introduce key management policies using any encryption algorithm.
In the proposed EVEA, CT_i is the cipher text obtained from the given plain text PT_i by using the key text KT_i. Some articles discuss modified Vigenere ciphers that use the ASCII value of the given text. The EVEA, however, doesn't consider the ASCII value. It simply uses the same mathematical expression as the standard Vigenere cipher, but with 62 characters. With 26 characters, 26 possible Caesar ciphers are applied to generate the cipher text.
Key selection is the major task in all encryption algorithms. In the Vigenere cipher too, the key should be as long as possible; if it is short, it must be repeated until it equals the length of the given plain text. If a large text is used as the key, it has to be shared with someone or saved somewhere. Maintaining the key in a secure manner is again a tedious task for the user.
In this work, the effort of key maintenance is also reduced by using the plain text itself as a key. Instead of repeating the same key multiple times to equal the length of the plain text, the plain text itself can be used as the key to encrypt the plain text into cipher text. But if the plain text is the key, a copy of the whole plain text must be kept to recall the key when decryption occurs.
Taking account of all the above-mentioned difficulties, the proposed work includes a language processing technique that chunks the whole text. The whole plain text is chunked into separate words, and ten words are considered as a segment. In a segment, the first five words are encrypted using the next five words of the sentence. Most of the time, the character counts will not be equal when going by word lengths. In such a scenario, the repeatable-key technique of the actual Vigenere cipher is followed.
For example, consider the sentence "Cryptography is the important subject in computer science that everyone needs to study".
After chunking, the first five words have a total length of 33 characters and the next five words have a length of 29 characters. In this case, the key starts again from the first character of the key words and consumes as much key length as is needed from the given words. Another drawback arises: the next five words, which serve as the key, would appear in the cipher text unchanged as plain text. To overcome this, a simple keyword is used to encrypt them into cipher text, as in the Vigenere cipher. So, the proposed work follows an obfuscation policy with simple encryption techniques, without the involvement of the cloud service provider or a third-party auditor.
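The character counts in this example can be checked with a short sketch. It assumes that spaces are not counted and that the key is repeated from its first character; the helper name `stretch_key` is illustrative and not taken from the paper.

```python
from itertools import cycle, islice

sentence = ("Cryptography is the important subject in computer "
            "science that everyone needs to study")

words = sentence.split()
text_words, key_words = words[:5], words[5:10]  # one ten-word segment

text_chars = "".join(text_words)  # characters to be encrypted
key_chars = "".join(key_words)    # characters available as key material


def stretch_key(key: str, length: int) -> str:
    """Repeat the key from its first character until it spans `length` characters."""
    return "".join(islice(cycle(key), length))


stretched = stretch_key(key_chars, len(text_chars))
print(len(text_chars), len(key_chars), len(stretched))  # 33 29 33
```

The last four key characters are taken again from the start of the key words, which is the repetition the paragraph above describes.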
The characters of the Extended Vigenere Encryption Algorithm (EVEA) and their values are shown in Table 1.

Process Flow for Obfuscation Technique
The process flow diagram in Fig. 1 explains the flow of obfuscation technique. Here, the input data is given by the user and the data is stored in an array. Then, encryption and data masking are performed on the data. After that, the data is stored in public cloud storage. Whenever the user wants to access the data, the user retrieves the data from the cloud and decrypts it at the user's system.

Data Chunking
In this module, the plain text is given by the user. This plain text is taken as input for this module. Then, chunking process is applied on the plain text. The process results in chunked data. The chunked data is stored in a two dimensional array. During subsequent processes, this two dimensional array is used for encryption and decryption of the data.
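A minimal sketch of this chunking step follows. It assumes that each row of the two dimensional array holds five words, matching the ten-word segments described earlier; the function name `chunk_words` and the `words_per_row` parameter are hypothetical.

```python
def chunk_words(plain_text: str, words_per_row: int = 5) -> list[list[str]]:
    """Split the plain text on whitespace and group the words into rows.

    Rows 0, 2, 4, ... hold the words to be masked, and rows 1, 3, 5, ...
    hold the words that act as their key (assumed layout).
    """
    words = plain_text.split()
    return [words[i:i + words_per_row]
            for i in range(0, len(words), words_per_row)]


rows = chunk_words("THIS IS MY NEW TYPE HELLO WORLD PROGRAM IN THE")
for row in rows:
    print(row)
```

A final row shorter than five words is kept as it is, so that no plain text is dropped.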

Data Encryption
The data stored in the two dimensional array is taken as input for this module. The encryption process comprises two phases. During the first phase, the data in the even numbered rows of the two dimensional array is added to the data in the odd numbered rows. If the number of data items in an odd row is less than the number in the corresponding even row, then the data in the odd row is repeated up to the length of the data in the even row. Otherwise, the data is taken as it is. After that, the data in the even numbered rows of the two dimensional array is encrypted. During the second phase, the data in the odd numbered rows is encrypted using a random key, which is given by the user. The key is repeated for the length of the data in the odd numbered rows and added to the data. After this, the data in the odd rows is also encrypted. Then, the encrypted text (i.e., cipher text) is stored in the public cloud.
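The two phases can be sketched as follows. This is an interpretation, not the authors' code: it assumes rows are numbered from zero, that each word in an even row is paired with the word at the same position in the following odd row, and that unpaired words fall back to the user key. The alphabet ordering `ALNUM_62` is also an assumption, since Table 1 is not reproduced here. With a 26-letter uppercase alphabet, the sketch reproduces the second ciphertext quoted in the Avalanche effect section.

```python
import string
from itertools import cycle

# Assumed ordering of the 62 alphanumeric characters (Table 1 may differ).
ALNUM_62 = string.ascii_uppercase + string.ascii_lowercase + string.digits


def shift_word(word: str, key: str, alphabet: str = ALNUM_62, sign: int = 1) -> str:
    """Vigenere-shift `word` by `key`, repeating the key as often as needed."""
    if not key:
        raise ValueError("key must not be empty")
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    return "".join(alphabet[(index[c] + sign * index[k]) % n]
                   for c, k in zip(word, cycle(key)))


def encrypt_rows(rows, user_key, alphabet=ALNUM_62):
    """Phase 1: even rows keyed by the plain odd rows. Phase 2: odd rows keyed by user_key."""
    encrypted = []
    for r in range(0, len(rows), 2):
        even = rows[r]
        odd = rows[r + 1] if r + 1 < len(rows) else []
        # Phase 1 must read the odd row before it is masked in phase 2.
        enc_even = [shift_word(w, odd[i] if i < len(odd) else user_key, alphabet)
                    for i, w in enumerate(even)]
        enc_odd = [shift_word(w, user_key, alphabet) for w in odd]
        encrypted.append(enc_even)
        if odd:
            encrypted.append(enc_odd)
    return encrypted


rows = [["THIS", "IS", "MY", "NEW", "TYPE"],
        ["HELLO", "WORLD", "PROGRAM", "IN", "THE"]]
cipher = encrypt_rows(rows, "HELLO", alphabet=string.ascii_uppercase)
print(" ".join(" ".join(row) for row in cipher))
# ALTD EG BP VRE MFTX OIWWC DSCWR WVZRFHQ PR ALP
```

Characters outside the chosen alphabet raise a `KeyError` in this sketch; how the original implementation treats them is not stated.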

Authentication
Whenever the user wants to retrieve the data from the cloud storage, the authentication process is performed. Authentication provides security for the data in the cloud storage. It protects data from unwanted use. It also provides integrity for the data stored in public cloud storage. A user can retrieve the data only after passing through this authentication process.

Decryption
The data retrieved from the cloud is used as input in this module. This process is the reverse of the encryption process. This module also consists of two phases. Firstly, the data in the odd rows of the array is decrypted using the random key that was used for encryption. After this process, the data in the odd rows is available as plain text.
Subsequently, during the second phase, the data in the even rows is decrypted using the data in the odd rows of the array. If the cardinality of the data in an odd row is less than the cardinality of the data in the corresponding even row of the two dimensional array, then the data in the odd row is repeated up to the length of the data in the even row. Otherwise, the data in the odd row is taken as it is. Then, the data in the even rows of the two dimensional array is decrypted using the odd row data. The decrypted data is the original data that the user wants.
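The reverse process can be sketched in the same way. The sketch assumes the word-by-word pairing and zero-based row numbering used for illustration in this rewrite, and a configurable alphabet; the names `unshift_word` and `decrypt_rows` are hypothetical.

```python
import string
from itertools import cycle


def unshift_word(word: str, key: str, alphabet: str) -> str:
    """Undo a Vigenere shift of `word` by `key`, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    return "".join(alphabet[(index[c] - index[k]) % n]
                   for c, k in zip(word, cycle(key)))


def decrypt_rows(rows, user_key, alphabet):
    """Phase 1: odd rows with the user key. Phase 2: even rows with the recovered odd rows."""
    decrypted = []
    for r in range(0, len(rows), 2):
        even = rows[r]
        odd = rows[r + 1] if r + 1 < len(rows) else []
        # Phase 1: recover the key words first.
        plain_odd = [unshift_word(w, user_key, alphabet) for w in odd]
        # Phase 2: use the recovered words as keys for the even row.
        plain_even = [unshift_word(w, plain_odd[i] if i < len(plain_odd) else user_key,
                                   alphabet)
                      for i, w in enumerate(even)]
        decrypted.append(plain_even)
        if odd:
            decrypted.append(plain_odd)
    return decrypted


cipher = [["ALTD", "EG", "BP", "VRE", "MFTX"],
          ["OIWWC", "DSCWR", "WVZRFHQ", "PR", "ALP"]]
plain = decrypt_rows(cipher, "HELLO", string.ascii_uppercase)
print(" ".join(" ".join(row) for row in plain))
# THIS IS MY NEW TYPE HELLO WORLD PROGRAM IN THE
```

The order of the phases matters: the even rows cannot be recovered until the odd rows have been decrypted, because the odd rows are their key.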

Extended Vigenere Encryption Algorithm
The extended Vigenere encryption algorithm includes key generation, encryption and decryption with key verification. Intelligent rules are applied to extract the key from the given plain text.
The intelligent rules are written to implement the language processing technique in order to make decisions regarding the character length with respect to the word count. This procedure is explained in algorithmic form as follows.
Key generation:
2. Choose the segment length L.
3. Once the segment length is chosen, divide the segment into two parts, S1(PT) and S2(PT).
4. Count the words in both parts separately as wc1 and wc2.
5. Check whether wc1 = wc2.
5a. If yes, choose wc2 as the key for wc1.
5b. Else, make the length of wc2 equal to wc1 by repeating the characters of wc2 from its initial character; the updated wc2 is chosen as the key for wc1.
Encryption:
6. Initialize cipherText to null.
7. If the length of S1(PT) is less than the length of S2(PT), then choose the subpart of wc2 that equals the length of wc1.
As mentioned in the algorithm for key generation, the key is taken as words 6 to 10 of the plain text, and the total characters are counted for the five words. After the key length is made equal to the length of the plain text to be encrypted, the encryption steps are applied. The key is applied to alternate groups of five words, which also reduces the keyword repetition of the actual Vigenere cipher algorithm. The word length may vary based on the user's perspective. This allows the proposed algorithm to support dynamic key generation, so the key is not the same every time encryption is carried out on a plain text.
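The key generation steps can be sketched as below. The sketch reads wc1 and wc2 as the character counts of the two parts, which is an interpretation based on the explanation that "total characters are counted for the five words". The function names and the handling of a trailing short segment are assumptions.

```python
from itertools import cycle, islice


def generate_key(segment_words: list[str]) -> tuple[str, str]:
    """Split one segment into S1 and S2 and derive a key for S1 from S2.

    Equal lengths: S2 is used as it is. S2 shorter: S2 is repeated from its
    first character. S2 longer: only the leading part of S2 is used.
    """
    half = len(segment_words) // 2
    s1 = "".join(segment_words[:half])
    s2 = "".join(segment_words[half:])
    if not s1 or not s2:
        raise ValueError("a segment needs at least two words")
    key = "".join(islice(cycle(s2), len(s1)))
    return s1, key


def segment_keys(plain_text: str, segment_length: int = 10) -> list[tuple[str, str]]:
    """Apply generate_key to every segment of `segment_length` words."""
    words = plain_text.split()
    pairs = []
    for start in range(0, len(words), segment_length):
        segment = words[start:start + segment_length]
        if len(segment) < 2:
            break  # a single trailing word has no key partner (assumed handling)
        pairs.append(generate_key(segment))
    return pairs


text = ("Cryptography is the important subject in computer "
        "science that everyone needs to study")
for part, key in segment_keys(text):
    print(len(part), len(key), key)
```

Because the key is drawn from the text itself, changing the segment length L or the text changes the key, which is the dynamic behaviour described above.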
The encryption is carried out with the same process as followed in the Vigenere cipher. The mathematical expression for the encryption process is given in Eq. (1). The decryption is carried out using Eq. (2). By using this EVEA, the privacy of the information is preserved with less computational complexity. The storage space for key management is also eliminated. Section 4 provides the experimental results and analyses the performance of EVEA by comparing it with the Standard Vigenere Cipher Algorithm (SVCA) and MVEA [1].
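Eqs. (1) and (2) are not reproduced in this text. Assuming that EVEA keeps the standard Vigenere form with the modulus raised to 62, as stated in the description of the proposed system, and that each character is represented by its value from Table 1 (taken here to run from 0 to 61), the two expressions would read:

```latex
CT_i = (PT_i + KT_i) \bmod 62 \qquad (1)

PT_i = (CT_i - KT_i + 62) \bmod 62 \qquad (2)
```

Here PT_i, KT_i and CT_i are the values of the i-th plain text, key text and cipher text characters. The added 62 in the second expression only keeps the intermediate value non-negative.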

Results and Performance Analysis
The experiments measure the time taken for encryption and decryption as a function of the size of the plain text. Table 2 and Fig. 2 show the time analysis for encryption and decryption for SVCA, MVEA and EVEA with varying plaintext size. In this analysis, EVEA requires less time than the other two algorithms for larger plain text. Table 3 shows the need for an obfuscation method without the intervention of the service provider, in terms of time consumption and data integrity. It shows the time analysis for storing the data in cloud storage after encryption executed by a third-party auditor and by the proposed EVEA. The result shows that the proposed EVEA requires less time for storing the data in cloud storage after encryption. Decryption is also processed in less time than when it is executed by the third-party auditor. The character length is considered for analysing the performance based on time taken with respect to the key length. Figure 3 shows the pictorial representation of this analysis.

Avalanche effect
The avalanche effect refers to a desirable property of cryptographic algorithms where, if an input is changed slightly (for example, by flipping a single bit), the output changes significantly (e.g., more than half of the output bits flip). This is a desired effect in encryption because it ensures that a person cannot easily predict a message from the changes in the output through statistical analysis. The avalanche effect is computed as shown in Eq. (3).
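The ratio described above can be computed with a short sketch. It assumes a character-level comparison that ignores spaces, which matches the counts out of 37 reported in the worked examples of this section; a bit-level variant would compare the binary encodings instead. The function name `avalanche_effect` is illustrative.

```python
def avalanche_effect(cipher_a: str, cipher_b: str) -> float:
    """Percentage of ciphertext positions that differ between two outputs."""
    a = cipher_a.replace(" ", "")
    b = cipher_b.replace(" ", "")
    if not a or len(a) != len(b):
        raise ValueError("ciphertexts must be non-empty and of equal length")
    changed = sum(x != y for x, y in zip(a, b))
    return changed / len(a) * 100


# Ciphertext pairs quoted in this section (WORLD replaced by SUPER in the input).
first = avalanche_effect("ALTD PW XM RPH ACAP OIWWC AZCZK ACCNVLX PR EVL",
                         "ALTD PW XM RPH ACAP OIWWC WFASY ACCNVLX PR EVL")
second = avalanche_effect("ALTD EG BP VRE MFTX OIWWC DSCWR WVZRFHQ PR ALP",
                          "ALTD AM BP VRE MFTX OIWWC ZYAPF WVZRFHQ PR ALP")
print(round(first, 4), round(second, 4))  # 13.5135 18.9189
```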
$$\text{Avalanche Effect} = \frac{\text{Number of flipped bits in ciphered text}}{\text{Number of bits in ciphered text}} \times 100 \qquad (3)$$

In the examples below, the count is taken over the characters of the cipher text, excluding spaces.

In the first case, the input for the encryption is "THIS IS MY NEW TYPE HELLO WORLD PROGRAM IN THE" and the key is "HELLO". This results in the output "ALTD PW XM RPH ACAP OIWWC AZCZK ACCNVLX PR EVL". The input is then changed to "THIS IS MY NEW TYPE HELLO SUPER PROGRAM IN THE" with the same key, which results in the output "ALTD PW XM RPH ACAP OIWWC WFASY ACCNVLX PR EVL". The number of changed characters is 5 and the total number of characters is 37. The avalanche effect is

$$\text{Avalanche Effect} = \frac{5}{37} \times 100 = 13.5135\%$$

In the second case, the input for the encryption is again "THIS IS MY NEW TYPE HELLO WORLD PROGRAM IN THE" and the key is "HELLO". This results in the output "ALTD EG BP VRE MFTX OIWWC DSCWR WVZRFHQ PR ALP". The input is then changed to "THIS IS MY NEW TYPE HELLO SUPER PROGRAM IN THE" with the same key, which results in the output "ALTD AM BP VRE MFTX OIWWC ZYAPF WVZRFHQ PR ALP". The number of changed characters is 7 and the total number of characters is 37. The avalanche effect is

$$\text{Avalanche Effect} = \frac{7}{37} \times 100 = 18.9189\%$$

Based on the avalanche effect, the performance of the proposed system is evaluated and compared with the existing MVEA technique.

Conclusion
This work concentrated on data protection and privacy preservation of users' data in the public cloud using an obfuscation technique. For obfuscation, the Enhanced Vigenere Encryption Algorithm is used with intelligent rules to generate keys of varied length. A key-keying technique is also applied here: the key portion is itself encrypted with a different key supplied by the user. The main objective is to encrypt the information without any third-party intervention. This is achieved in the proposed work by carrying out the cryptographic operations on the user side, which also reduces the time taken for encryption and decryption. In addition, the time taken and the exposure to attacks on the user side are lower in the proposed system than in previous techniques, as evaluated using the avalanche effect. The system may be enhanced in future by using ASCII values for alphabetic and numeric characters instead of the serial number of the character. Future work can also use fuzzy rules for choosing an efficient key by varying the character length.
Funding There is no funding for this study or experiments.
Data Availability Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.
Code Availability Custom code was written for this experiment and is available from the corresponding author on reasonable request.