Empirical and statistical comparison of intermediate steps of AES-128 and RSA in terms of time consumption

Cryptographic algorithms are composed of many complex mathematical functions. When analyzing the complexity of such an algorithm, one usually fixes a priori the overall complexity as that of the most dominant operation in a group of operations; in the case of compounded operations, it is generally the count of this operation that determines the complexity of the algorithm. We have instead used a weight factor to determine the complexity of an algorithm whose constituent operations work simultaneously, taking the time of an operation as the measure of its weight. We statistically analyze the two most used operations in RSA, namely "power" and "mod," through a method of revised difference to test whether they are statistically similar or dissimilar. We have also calculated the empirical computational complexity of these two operations through the fundamental theorem of finite differences to verify whether the operations are statistically dissimilar and, if so, which of the two is dominant. Finally, we have empirically analyzed the complexity of each of the four sub-steps involved in the encryption and decryption of AES-128 to determine which operation dominates and consumes most of the overall run time of AES-128.


Introduction
Algorithm analysis (Aho et al. 1975; Cormen et al. 2009; Greene and Knuth 1991; Knuth 1973) is an integral part of computer science wherein one tries to estimate the overall complexity of an algorithm, in terms of both space and time, before running it in any application. This helps to differentiate two algorithms with similar use when implemented in a particular application. Analysis of running code by programmers has shown that much of the time is often spent in a small portion of the code, and that these inefficient places are difficult to find. The inefficient portion of the code is located by running the program for varying inputs and is then replaced with better, more efficient code.
Cryptographic algorithms (Stallings 2009; Forouzan and Mukhopadhyay 2010; Diffie and Landau 2007; Stallings 2005) are composed of many complex mathematical functions which are generally compounded, i.e., a group of singular operations. Not much research has been done on analyzing a cryptographic protocol from its time complexity point of view. For a better understanding of how closely cryptography and complexity are related, see Talbot and Welsh (2006) and Rothe (2005). Both the encryption and decryption of two well-known asymmetric key algorithms, namely RSA and ElGamal, are composed of the power operation and the modular arithmetic operation working as a single unit. Generally, while analyzing these two cryptographic algorithms, one tends to predict the overall complexity as tending toward the more dominant of the two operations, "power" and "mod." This may give a biased theoretical result, and the algorithm may be unfit for many real-time applications due to its excess execution time. We have evaluated the statistical similarity between the "power" and "mod" operations by analyzing them empirically. If the two operations are statistically similar, then neither of them is dominant when analyzed in a group. But if the two operations are statistically dissimilar, then one of the two may be the dominant operation and will account for most of the execution time of the overall compounded operation. To find which of the operations is more dominant, we have used the fundamental theorem of finite differences. When analyzing a compound operation theoretically, it is the count factor which determines the complexity of the algorithm. We have instead used a weight factor, namely the time of the operation, to measure the actual complexity. On the positive side, this allows mixing of operations which, in implementation, do work collectively, and hence the approach is realistic. On the negative side, the time factor depends heavily on the computer clocks, and the results can be affected if the clocks do not function properly.
To develop any working result, we first have to account for the idiosyncrasies of computer clocks. Often, these clocks are not very accurate, and one cannot rely on the execution time obtained from a single short run of the program. The remedy is to increase the input until the total time required to process any operation is large enough to support a conclusion. It suffices to point out here that computer clocks (also called timers) have two components: a hardware part and a software part. All that the hardware part does is generate rapid pulses at equal intervals (called "clock ticks"); the rest is achieved by programming. A detailed description of the working of computer clocks can be found in Tanenbaum (2001).
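As an illustration of this remedy, one can repeat an operation until the total elapsed time comfortably exceeds the clock resolution and then report the mean time per call. The paper's measurements were done in MATLAB; the following minimal Python sketch (with a hypothetical min_duration threshold of our choosing) is only meant to illustrate the idea:

```python
import time

def measure(op, arg, min_duration=0.2):
    """Time op(arg) by repeating it until the total elapsed time
    exceeds min_duration, so that coarse clock ticks average out."""
    reps = 1
    while True:
        start = time.perf_counter()
        for _ in range(reps):
            op(arg)
        elapsed = time.perf_counter() - start
        if elapsed >= min_duration:
            return elapsed / reps  # mean time per call
        reps *= 2  # too fast to measure reliably; double the repetitions

# Example: time the power operation for growing input sizes.
for bits in (64, 128, 256):
    x = (1 << bits) - 1
    print(bits, measure(lambda v: v ** 3, x))
```

Doubling the repetition count keeps the calibration overhead logarithmic in the final repetition count.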
The theoretical analyses of RSA for the encryption and decryption processes give different complexity measures. The complexity of encrypting with the compound operation C = M^e mod N is O((log N)^2), as it is done with a fixed or bounded number of modular multiplications. In cases where e is of the same order as N, the complexity changes to O((log N)^3). For decrypting with the compound operation M = C^d mod N, the time complexity is O((log N)^3), as the factor d has a size in bits or digits proportional to that of N, which remains the same irrespective of using the CRT technique. Using "squaring and multiplying" algorithms for encryption and decryption, the time complexity can be seen to be O(N^2). Similarly, block cipher algorithms such as AES-128 normally work on a fixed block size. AES-128 encrypts a fixed block of length 128 bits and takes approximately the same time despite varying inputs. Thus, AES-128 can be termed O(1) for both encryption and decryption. If one wants to encrypt messages larger than 128 bits, they need to be put through one of the modes of operation. In such a case, as there are O(n) blocks of data to encrypt/decrypt, the order of complexity becomes O(n), where n is the size of the message. So, the time complexity of AES-128 for both encryption and decryption in any of the standard modes of operation, such as CBC or CTR, is polynomial, or more precisely, linear in the size of the message. The time complexity of AES has been analyzed in Talbot and Welsh (2006) and Bogdanov et al. (2015).
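The "squaring and multiplying" algorithm referred to above performs one modular squaring per exponent bit plus one modular multiplication per set bit, i.e., O(log e) modular multiplications in total, each costing O((log N)^2) with schoolbook arithmetic. A minimal sketch of the standard right-to-left variant (ours, not taken from the paper, which lists no code):

```python
def square_and_multiply(base, exp, mod):
    """Right-to-left binary exponentiation: O(log exp) modular
    multiplications, each O((log mod)^2) with schoolbook arithmetic,
    which is the source of the O((log N)^3) figure quoted above."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                     # multiply step for a set bit
            result = (result * base) % mod
        base = (base * base) % mod      # squaring step
        exp >>= 1
    return result

assert square_and_multiply(7, 560, 561) == pow(7, 560, 561)
```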
Theoretical computational complexity is based on counts of the underlying functions comprising the algorithm. This may lead to erroneous results, as the apparently more dominant operation may sometimes not be that dominant in practice. We have analyzed the compound operations power and mod of RSA for their statistical relationship. A power function is a function of the form y = x^n (n any real constant). The mod function finds the remainder after one number (the dividend) is divided by another (the divisor).
We have also analyzed these two operations empirically, wherein, as opposed to a theoretical complexity measure based on counts, weights are taken into account as the measure for deriving the complexity. The time of the operation, as mentioned earlier, has been used as the weight parameter for the empirical analysis. The four sub-steps of AES-128 are also analyzed empirically to find the most time-consuming of the four.
The authors' published work in the related domain is given in Sect. 2. Section 3 of the paper covers similar research carried out in the field of statistical and empirical analysis, together with an introduction to the fundamental theorem of finite differences and the RSA cryptographic algorithm. The revised difference method is introduced in Sect. 4, with a statistical comparison of the power and mod operations used in the encryption and decryption functions of RSA. Section 5 gives the empirical time complexity of the power and mod operations. AES-128 and its intermediate steps are introduced in Sect. 6, with an empirical analysis of all the intermediate steps to determine which of the sub-steps consumes the most time.

Related work
The statistical similarity among three well-known operations, addition, multiplication and equality, has previously been studied through the method of revised difference (see Chakraborty 2007). The result, generated using the hypothesis testing approach, shows that addition, multiplication and equality are statistically dissimilar when working in a group. The execution time of the bubble sort program has also been analyzed statistically with different parameters. That statistical analysis demonstrates a specific quadratic pattern of the dependent variable y (execution time) on the number of items sorted x, so the order of complexity of bubble sorting is O(x^2). The empirical execution time is predicted using the fundamental theorem of finite differences (see Comment). An empirical O is the empirical estimate of the non-trivial, conceptual weight-based statistical bound. A statistical bound, unlike a mathematical bound, weighs the computing operations instead of counting them, and it takes all the operations collectively, mixing them in assessing the bound (see Chakraborty and Sourabh 2010). The empirical analysis is a practical approach for analyzing an algorithm's complexity: first running a program implementing the algorithm and then fitting a statistical model gives a practical view of the algorithm's complexity in empirical form. Comment: The fundamental theorem of finite differences states that the nth difference of an nth degree polynomial is constant and the higher differences are zero. Mathematically, for a polynomial P_n(x) of degree n with leading coefficient a_n, tabulated at equal spacing h, Δ^n P_n(x) = a_n n! h^n (a constant) and Δ^(n+1) P_n(x) = 0, where Δ is the forward difference operator defined as Δf(x) = f(x + h) − f(x). The converse of the aforesaid theorem is also true: if the nth difference of a tabulated function is constant and the higher differences are zero, the function is an nth degree polynomial. No function other than a polynomial has this unique property.
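The theorem and its converse are easy to check numerically: tabulate a degree-n polynomial at equally spaced points and take successive forward differences. A small illustrative sketch of ours, using numpy:

```python
import numpy as np

def difference_table(y, depth):
    """Successive forward differences of an equally spaced tabulation."""
    table = [np.asarray(y, dtype=float)]
    for _ in range(depth):
        table.append(np.diff(table[-1]))
    return table

# y = 3x^2 + 2x + 1 tabulated at x = 0..6 (a degree-2 polynomial).
x = np.arange(7)
y = 3 * x**2 + 2 * x + 1
d = difference_table(y, 3)
print(d[2])  # second differences: constant (all 6.0, i.e., a_n * n! * h^n)
print(d[3])  # third differences: all zero
```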
When a statistician says that the empirical O of the algorithm in question is O(g(n)), all that it means is the following: "The simplest model one can fit (to the algorithmic time complexity data) that prevents ill-conditioning by sacrificing a minimum amount of predictive power, if need be, must in my opinion have a leading functional term g(n) in the model." (Chakraborty and Sourabh 2010).
A statistical view for investigating the average-case complexity behavior of computer algorithms has also been presented, using a statistical bound and its empirical estimate, the so-called empirical O. The complexity data there depicted a clear linear pattern, recommending an empirical linear model [Y_avg = O_emp(n)] and leading to conjectures in the realm of algorithmic analysis. Based on the statistical analysis, the robustness claim for the average-case behavior of the quicksort algorithm has been rejected (Singh et al. 2013). On the empirical side, this approach, rooted in computer experiments and applied statistics, offers a helping hand by supporting its theoretical equivalents. A related study on matrix multiplication demonstrated that an empirical O(n^2) complexity was persuasively obtainable in n x n matrix multiplication with two dense matrices, i.e., with the pre-factor matrix roughly as dense as triangular and the post-factor matrix fully dense. The algorithm used in that study was Amir Schoor's algorithm (see Schoor 1982), which is fast for both sparse and dense matrices, as it is faster to work with rows than with both rows and columns. Further study can be undertaken to observe whether the same holds if the density of the pre-factor matrix is increased while keeping the post-factor matrix fully dense. In other related work, a new stream cipher technique based on an involution function was proposed, and its complexity was estimated empirically in MATLAB using the fundamental theorem of finite differences. For certain algorithms such as sorting, the parameters of the input distribution must also be taken into account, apart from the input size, for a more precise evaluation of the computational and time complexity (average case only) of the algorithm in question, the so-called gold standard in the context of parameterized complexity. In Singh and Chakraborty (2011), the authors verified the complexity of a Smart Sort algorithm through computer experiments. An improved Pig Latin algorithm was proposed in Dutta and Sanyukta (2020), and its complexity was measured empirically.
We have analyzed the RSA cryptographic algorithm and AES-128. The RSA algorithm is an asymmetric cryptography algorithm. The term asymmetric actually means that it works on two different keys, i.e., public key and private key. As the name indicates, the public key is given to everyone and the private key is kept private.
The encryption and decryption in the algorithm are carried out using the power and mod operations working in a group. The encryption procedure follows the equation C = M^e mod N, where C is the generated ciphertext, M is the plain text, N is the modulus and e is the public exponent. Similarly, the decryption routine follows the equation M = C^d mod N, where d is the private exponent.
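A worked toy instance of these two equations (textbook RSA with the classic 61 x 53 example; illustration only, since real deployments use padding schemes and far larger moduli):

```python
# Toy RSA parameters (illustration only -- insecure key size, no padding).
p, q = 61, 53
N = p * q                # modulus: 3233
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent, coprime to phi
d = pow(e, -1, phi)      # private exponent: 2753 (Python 3.8+ modular inverse)

M = 65                   # plaintext, must be < N
C = pow(M, e, N)         # encryption: C = M^e mod N -> 2790
assert pow(C, d, N) == M # decryption: M = C^d mod N recovers the plaintext
```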

Revised difference method
Let "p" and "m" refer to the power and mod operations, respectively. Since computer clocks are not reliable for small inputs, the input size is successively increased. This helps to draw a relation between the size of the input and the corresponding execution time. We have taken three trials for each input size in order to counter the dominance of cache hits, i.e., the case of finding an input in the cache memory itself instead of the main memory. This can significantly affect the execution time for a specific input, as finding the input in the cache instead of the main memory takes considerably less time. Let t_{p_i} and t_{m_i} be the mean run times of the three trials for the "power" and "mod" operations for i = 1, 2, 3, …, corresponding to r instances of input size. The revised difference d_i of t_{p_i} and t_{m_i} is then defined as in Eq. (1). The revised difference is unit-free and statistically random with a small variance (more precisely, "small variance" means there is no sign of appreciable variance relative to the standard deviation). The magnitude of the difference between two times should be reckoned relative to the magnitude of the terms being differenced. Further, through trial and error, we discovered that the function given in Eq. (1) generally has a small variance when the operations considered are the simplest. However, this may not always be so (e.g., in a program with a partial differential equation), for which the formula will have to be further improved. The unit-freeness follows from the definition itself. For testing statistical randomness, we recommend the well-known run test for randomness. Although the formula for our t (Eq. 2) is the same as that of the standard paired t-test, here it is applied to the revised differences rather than to raw differences. To check whether the mean of the revised differences is zero or not, we have applied the hypothesis testing approach, which is best suited to estimating population behavior from a small sample. We set the null and alternative hypotheses as:

H0: The mean of the population of revised differences is zero.
H1: The mean of the population of revised differences is not zero.
The term "population" in general refers to a set of similar items or events which is of interest for some question or experiment. A common aim of statistical analysis is to produce information about some chosen population. In this context, the term "population" signifies a large set of execution times of both the operations. In general, a small sample is collected from the whole population by some sampling technique to estimate the overall behavior of the population. This is done through the hypothesis testing approach, and we have taken samples of 10 and 30 observations of both the operations to estimate the population behavior. The null hypothesis is the hypothesis under test, and it must be a hypothesis of zero (null) difference, which depicts the unbiased attitude of the statistician. The alternative hypothesis is that against which the null hypothesis is tested. We compute the test statistic t as

t = d_m / (d_s / sqrt(r)),   (2)

where r is the total number of observations for each operation, d_m is the mean of the revised differences (of r observations) and d_s is their standard deviation. This statistic may be assumed to follow a Student's t-distribution with r − 1 degrees of freedom, i.e., the family of continuous probability distributions that arises when estimating the mean of a normally distributed population from a small sample with unknown population standard deviation. The degrees of freedom represent the number of sample values that are free to vary when estimating the population; most of the time, a sample of size N has N − 1 degrees of freedom. We reject the null hypothesis if the magnitude of the calculated test statistic exceeds the table value at the 95% level of confidence (and hence level of significance α = 0.05) with r − 1 degrees of freedom. The level of significance, also denoted alpha or α, is the probability of rejecting the null hypothesis when it is true; for example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference. In case of acceptance of the null hypothesis, we say that the computing operations are statistically similar, and in case of rejection, the computing operations are termed statistically dissimilar.
The Student's t-table is used here for sample sizes below 30. For sample sizes greater than 30, the normal table should be used instead, with the same formula as for the t-statistic. In that case, if the calculated statistic exceeds 1.96 in magnitude, we reject the null hypothesis; otherwise we may accept it at the 95% level of confidence.
Tables 1 and 2 show the execution times for three different trials of both the operations, namely x^y and (x mod y). System specification: • Processor: Intel(R) Core(TM) i3, • RAM: 4.00 GB, • Operating system: 64-bit Windows 7.
All the implementations in this work were done in MATLAB version 2015a.
From Table 3 of the revised difference, we get: d m = 1.00851, d s = 0.310512.
As the number of observations r is 10, the degrees of freedom = 9. The calculated t-statistic at the 95% level of confidence (and hence level of significance α = 0.05) is 10.27074, while the tabulated value t(0.05, 9) = 2.262. Since the calculated t-statistic is greater than the tabulated one, we reject the null hypothesis and conclude at this stage that the power and mod operations involved in the encryption and decryption processes of RSA are statistically dissimilar. Next, we check the empirical run times of these two operations through the fundamental theorem of finite differences.
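The arithmetic of Eq. (2) on these reported values can be reproduced in a few lines (a verification sketch of ours; scipy is assumed to be available only for the tabulated critical value):

```python
import math
from scipy import stats  # assumed available for the tabulated critical value

d_m, d_s, r = 1.00851, 0.310512, 10    # revised-difference summary from Table 3
t_stat = d_m / (d_s / math.sqrt(r))    # Eq. (2); evaluates to ~10.2707
t_crit = stats.t.ppf(0.975, df=r - 1)  # two-sided, alpha = 0.05 -> ~2.262

print(t_stat, t_crit)
print("reject H0" if abs(t_stat) > t_crit else "accept H0")  # reject H0
```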
Remark The term statistic refers to a function of sample observations (e.g., sample mean, sample range), as opposed to the term parameter, which means a function of population observations. The term statistics, when used as the plural of the singular statistic, means more than one such sample function. The term test statistic means a statistic that is used to carry out a test of significance.

Empirical complexity estimate of the "power" and "mod" operations
We applied empirical analysis to measure the complexity of the power and mod operations. We took x to be the size of the input used in each of the two operations and y the execution time in seconds, in order to determine a function f of the form y = f(x). The empirical run time can be estimated using the converse of the fundamental theorem of finite differences discussed in Sect. 2. From the difference table of the power operation (Table 3), we found that the first difference is almost constant and the second difference is almost zero: as the input varies, the second differences give negligible values that can eventually be excluded from the final calculation. Hence, by the converse of the fundamental theorem of finite differences, a first-degree polynomial approximates the execution time for both encryption and decryption. So the empirical complexity of the power operation can be termed empirical O(n), where n is the length of the message. This is supported by fitting a statistical model to the mean execution time of the power operation (Table 4). The goodness of fit of the fitted model can be tested using the residual plot method: if an almost horizontal, structureless pattern is observed, then the predicted model captures the actual empirical complexity of the run time. The fitted line plot and the residual plot for the encryption process are shown in Figs. 1 and 2, respectively. These were plotted using the statistical package MINITAB version 16.
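The same linear fit and residual check can be sketched outside MINITAB; here is a minimal numpy version in which the data vector is a hypothetical stand-in for the mean execution times of Table 4 (the paper's actual values are not reproduced here):

```python
import numpy as np

# Stand-in for the mean execution times of Table 4 (hypothetical values).
n = np.array([1000, 2000, 3000, 4000, 5000, 6000], dtype=float)
y = np.array([0.11, 0.21, 0.32, 0.41, 0.52, 0.62])

b, a = np.polyfit(n, y, 1)         # least-squares linear fit y = a + b*n
residuals = y - (a + b * n)

print(f"slope={b:.3e}, intercept={a:.3e}")
# A structureless, roughly horizontal residual pattern (as in Fig. 2)
# supports the empirical O(n) model for the power operation.
print(residuals)
```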
The fitted line plot in Fig. 1 shows a linear fit for the power operation, so it can be termed O(n). The residual plot in Fig. 2 has four sub-plots. The normal probability plot depicts an almost linear pattern and thus shows that the data are distributed normally. The versus-fit plot shows that no outlier data are present in the execution times. Similarly, the histogram shows whether the data are skewed; the plotted histogram shows neither skew nor outliers. The residual-versus-order plot shows the independence of the residuals from one another: in the normal scenario, if the residuals are distributed randomly around the center line, the residuals are almost independent of each other, which is what we have obtained in our case.
Similarly, from the difference table of the mod operation (Table 5), we see that the first differences tend to zero. Table 5 also shows that the first differences exhibit no reasonable pattern, and some of the values in the table are negative, which cannot be the case for the execution time of any operation; so no polynomial can estimate the empirical run time of the mod operation, and hence we term it empirical O(1). This is supported by the following discussion.

Implications of various plots
Normal probability plot The normal probability plot is a graphical technique for assessing the distribution of data, i.e., whether or not the data are normally distributed. The points should form an approximately straight line for the data to be considered normally distributed. This is the case in our plot, and so the data appear to come from a normal distribution.
Versus fit The residuals-versus-fits plot is a scatter plot with the residuals on the y-axis and the corresponding estimates (fitted values) on the x-axis. Our versus-fit plot suggests that there is no unusual data point in the dataset. It also shows the residuals centered on zero, without structure, throughout the range of the fitted values, which supports the linear model we predicted.
Histogram Histograms of the residuals suggest whether the data are skewed or include outliers. A long tail in one direction means that the residuals are skewed, and a bar far away from the others indicates an outlier. Versus order This plot displays the residuals in the order in which the data were collected. It is used to investigate whether the residuals are independent of each other. In an ideal scenario, the residuals should be distributed randomly around the center line. This is what we observe in our encryption residual plot.
The estimated empirical complexity is evidently O(1), as shown by the scatter plot, which correctly represents how one variable (execution time) is affected by the other (number of inputs) and can be used to understand the relationship between the two. The execution time for the mod operation is the mean of the three trials. If the mean execution time, plotted in a scatter plot, shows no pattern, then we can confirm that no well-behaved function such as a polynomial can approximate the empirical run time of the operation. We therefore plotted the mean execution time, and the result is shown in Fig. 3. The scatter plot of the mod operation shows a random distribution of the execution time, and thus the mod operation can be termed empirical O(1).
We conclude that the power and mod operations are statistically dissimilar and the power operation with an empirical complexity of O (n) is more dominant than the mod operation which has an empirical complexity of O (1).
To further strengthen the results, we tried both methods with inputs greater than 30. The execution times for three different trials with inputs greater than 30 for both operations are shown in Tables 6 and 7, and the revised differences are shown in Table 8.
From the revised differences in Table 8, the calculated t = 24.01709, which is greater than 1.96 at the 95% level of confidence, so we reject the null hypothesis and conclude that the power and mod operations are statistically dissimilar, provided the data are normally distributed. We leave the method of finite differences with the comment that it shows the same result, i.e., the power operation is empirical O(n) and the mod operation is empirical O(1) for inputs greater than 30, so the two remain statistically dissimilar.
From the discussion above, we can say that in the encryption and decryption processes of RSA, the two operations used, viz. power and mod, are not similar when analyzed statistically. Rather, power is the dominant operation, with an empirical complexity of O(n), as compared to the mod operation, which has an empirical complexity of O(1).

Analysis of intermediate steps of AES-128
The Advanced Encryption Standard (AES), also known by its original name Rijndael, is a specification for the encryption of electronic data established by the U.S. National Institute of Standards and Technology (NIST) in 2001. AES is a subset of the Rijndael block cipher developed by two Belgian cryptographers, Vincent Rijmen and Joan Daemen, who submitted a proposal to NIST during the AES selection process. For a proper introduction to AES, see AES (2001), Daemen and Rijmen (2003), The Advanced Encryption Standard (Rijndael) (2010) and Trenholme's "S-box" AES (2010).
AES works by encrypting a block of size 128 bits with a key length of 128, 192 or 256 bits, giving the standards the names AES-128, AES-192 and AES-256, which use 10, 12 and 14 rounds, respectively.
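Because each block encryption processes a fixed 128-bit block through a fixed 10 rounds, the per-block time is essentially constant, which is the O(1) behavior discussed above. A minimal timing sketch of ours (it assumes the third-party PyCryptodome package; the paper's own measurements were MATLAB implementations of the individual sub-steps):

```python
import os, time
from Crypto.Cipher import AES   # assumes PyCryptodome is installed

key = os.urandom(16)            # 128-bit key -> AES-128, 10 rounds
cipher = AES.new(key, AES.MODE_ECB)

block = os.urandom(16)          # one fixed 128-bit block
start = time.perf_counter()
for _ in range(100_000):
    cipher.encrypt(block)
elapsed = time.perf_counter() - start
print(elapsed / 100_000)        # per-block time: roughly constant, i.e., O(1)
```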
We analyzed each of the four sub-stages of AES-128. The values for the run of three trials for the four intermediate steps are shown in Tables 9, 10, 11 and 12.
We applied the fundamental theorem of finite differences discussed in Sect. 2 to evaluate the empirical computational complexity of the four sub-steps of AES-128, as shown in Table 13. We observed that the first difference Δy is itself almost zero for three of the processes, viz. "Substitute Bytes," "Shift Rows" and "Add Round Key." So, no polynomial can approximate the empirical run time of these three steps; rather, their mean execution times are random, as confirmed using the scatter plot approach shown in Figs. 4, 5 and 6, respectively. We can therefore say that these three steps are empirical O(1), as their scatter plots show an almost random pattern. For the "Mix Columns" step, the first differences Δy are almost constant and the second differences Δ²y are almost zero. So, a first-degree polynomial approximates the empirical run time of the Mix Columns step, as predicted through the fundamental theorem of finite differences, and hence its empirical complexity can be termed O(n). Fitting a statistical model to the mean execution time of the Mix Columns step shows an almost linear pattern, as seen in Fig. 7. The residual plot for this step is shown in Fig. 8.
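The difference-table classification behind Table 13 can be mimicked programmatically. In the following sketch the timing vectors are hypothetical stand-ins for the mean run times of Tables 9, 10, 11 and 12, and the tolerance is an arbitrary choice of ours:

```python
import numpy as np

# Hypothetical mean run times standing in for Tables 9-12.
steps = {
    "SubBytes":    [0.012, 0.011, 0.013, 0.012, 0.011],
    "ShiftRows":   [0.008, 0.009, 0.008, 0.008, 0.009],
    "MixColumns":  [0.020, 0.041, 0.059, 0.081, 0.100],
    "AddRoundKey": [0.007, 0.007, 0.008, 0.007, 0.007],
}

for name, times in steps.items():
    d1 = np.diff(times)               # first differences
    d2 = np.diff(d1)                  # second differences
    if np.allclose(d1, 0, atol=5e-3):
        verdict = "empirical O(1)"    # first differences already ~0
    elif np.allclose(d2, 0, atol=5e-3):
        verdict = "empirical O(n)"    # constant first differences
    else:
        verdict = "higher order / no polynomial fit"
    print(f"{name}: {verdict}")
```

On these stand-in values, only MixColumns is classified as empirical O(n), mirroring the paper's finding.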
From the empirical complexity analysis of the four sub-steps involved in AES-128, we conclude that the Mix Columns step has the highest complexity, O(n), while the rest are O(1). So the empirical complexity of AES-128 is accounted for mostly by the empirical complexity of the Mix Columns step. The overall complexity of AES-128 in any of the standard modes of operation is O(n), since the most dominant operation is the Mix Columns process.

Conclusion
In the case of compounded operations, the common practice when predicting the overall computational complexity of an algorithm is to lean toward the most dominant of the operations working in a group. This may give vague results, and what is analyzed theoretically may not be the actual run time of the algorithm, specifically in the case of compounded operations. So, we have used a statistical approach, through the method of revised difference, to analyze the statistical similarity between the compounded operations of the well-known cryptographic algorithm RSA. The two operations used in the encryption and decryption procedures, viz. power and modular arithmetic, have been analyzed, and the analysis shows that both operations are statistically dissimilar. Yet, when predicted theoretically, it is the modular arithmetic operation which is taken to be the more dominant of the two. So, we further used the fundamental theorem of finite differences to empirically analyze the running time of each operation separately. The empirical run time of the power operation is O(n), while that of the mod operation is empirical O(1). So, we conclude that the power and mod operations, when working as a single compound operation in the encryption and decryption procedures of RSA, are statistically dissimilar, and it is the power operation, with an empirical complexity of O(n), which is the more dominant.
Similarly, through the method of the fundamental theorem of finite differences, we analyzed the empirical complexity of all the sub-steps used in AES-128, which shows that the Mix Columns sub-step is the most time-consuming of the four sub-steps used in AES-128, with an empirical complexity of O(n).
The proposed method can be used to estimate the time complexity of the stated cryptographic algorithms before deploying them in any real-life application. The running time of an algorithm greatly affects the overall cost of any system, and knowing it beforehand is always the better approach, while keeping in check that the appropriate security is not loosened. As far as the limitations of the proposed approach are concerned, since algorithms are platform independent (i.e., a given algorithm can be implemented in an arbitrary programming language on an arbitrary computer running an arbitrary operating system), there are additional significant drawbacks to using an empirical approach to gauge the comparative performance of a given set of algorithms. The time factor depends heavily on the computer clocks, and the results can be affected if the clocks do not function properly.