In this section, the proposed ESQPA data server is compared with two similar database servers like the MySQL and the CryptDB server. For the experiments, the ESQPA was implemented by C++ programming language, on an Intel R Core (TM) i7-3770 3.4 GHz processor, and memory size of 32 GB.
Algorithm1. Pseudocode of creation database and tables.
Creation_Database_Tables DB= (T1, T2, …, Tn): database and tables creation in CryptDB server Algorithm. Input: DB name, Table T = (Ri ,Cj). Output: Database and Tables are created.
Client application
step1: Enter database name db_name, then write command "create database db_name if not exist"
step2: Enter table name t_name and table properties columns Cj, then write command "create table t_name (Cj)"
step3: Insert the data for each column Cj as rows Ri format in the CryptDB proxy server.
CryptDB Proxy server
step4: Takes plain text data inserted from the client and changes it into encrypted form.
step5: Forward encrypted data to the DBMS server.
DBMS server
step6: Compress encrypted data by executing the LZ77 algorithm, then storing compressed encrypted data.
Algorithm2. Pseudocode of ESQPA algorithm. Algorithm ESQPA (Q): An Efficient and Secure Query Processing on encrypted CryptDB. Input: query Q as a selection command. Output: response from CryptDB server to client application as a query result.
Client application
step1: Write a selection command query (Q) in plain text format to execute an encrypted database.
CryptDB Proxy server
step2: Takes a plain text data query from the client and changes it into encrypted form.
step3: Forward encrypted query to the DBMS server.
DBMS server
step4: Decompresses encrypted stored data by executing the LZ77 decoding.
step5: Execution of an encrypted query on encrypted data, then return the encrypted result to a proxy server.
CryptDB Proxy server
step6: Similar to encryption in step 2, Decrypts the query result and sends it to the client.
ESQPA is tested on several random large-scale datasets and three real datasets in Table 1 and Table 2.
Random Dataset Description
I have used the [28] tool to generate random data, as illustrated in Fig.5. It does support many field data types like integer, float, double, varchar, date, text and binary long object. In addition to supporting foreign keys constraints.
India News Headlines (INH) Dataset Description
This news dataset is real constant historical file of notable events in the Indian subcontinent from start-2001 to end-2020, recorded progressively by the columnists of India [29]. It contains roughly 3.4 million events distributed by Times of India. A majority of the data is focusing on Indian local news including national, city level and diversion. Dataset contains 3.4 million records and three columns defined as follow:
- Publish Date: Date of the article being published online in yyyy-MM-dd format.
- Headline_Category: Category of the headline, ASCII, dot delimited, lowercase values.
- Headline_Text: Text of the Headline in English, only ASCII characters.
PubMed Abstracts Dataset Description
This dataset is real scraped data from the National Library of Medicine [30]. Dataset contains 13.2 thousand records and 17 columns. The columns correspond to some topics, and the records correspond to the data from the pages with articles. To be specific, abstracts where the features of the topic are indicated and the essence of the articles. There are main topics such as deep learning, covid 19, virtual reality, human connectome, brain machine interfaces, electroactive polymers, PEDOT electrodes, and neuroprosthetics.
Amazon Review Polarity Dataset Description
This dataset is a real Amazon review from 6,643,669 users on 2,441,053 products, from the Stanford Network Analysis Project (SNAP) [31]. Dataset contains about 4 million records (reviews) and three columns defined as follow:
- Polarity: Integer value refers to sentiment opinion such that 1 for negative and 2 for positive.
- Title: Text value refers to review heading.
- Body: Text value refers to the review body.
Table 1: Information about random datasets
Dataset
|
Number of columns
|
Number of records
|
1M
|
8
|
1 million
|
2M
|
8
|
2 million
|
4M
|
8
|
4 million
|
8M
|
8
|
8 million
|
16M
|
8
|
16 million
|
Table 2: Information about real datasets
Dataset
|
Number of columns
|
Number of records
|
PubMed Abstracts
|
17
|
13.2 thousand
|
India News Headlines
|
3
|
3.4 million
|
Amazon Reviews Polarity
|
3
|
4 million
|
Table [3], [4] and Fig. [6], [7] show an illustrative execution for the ESQPA algorithm. The performance of the proposed algorithm is evaluated and compared with those existing in MySQL and CryptDB servers in storage space term measured in a gigabyte.
Table 3. Numerical results for storage space of MySQL, CryptDB, and our algorithm ESQPA on random datasets
Dataset
|
MySQL server
|
CryptDB server
|
ESQPA server
|
Com. Ratio
|
1 M
|
0.27 G
|
0.74 G
|
0.33 G
|
55%
|
2 M
|
0.53 G
|
1.5 G
|
0.63 G
|
58%
|
4 M
|
1.1 G
|
3.0 G
|
1.3 G
|
56%
|
8 M
|
2.1 G
|
6.1 G
|
2.3 G
|
62%
|
16 M
|
4.3 G
|
12.0 G
|
4.8 G
|
60%
|
Table 4. Numerical results for storage space of MySQL, CryptDB, and our algorithm ESQPA on real dataset
Dataset
|
MySQL server
|
CryptDB server
|
ESQPA server
|
Com. Ratio
|
PubMed Abstracts
|
0.21 G
|
0.58 G
|
0.24 G
|
59%
|
India News Headlines
|
0.78 G
|
2.21 G
|
0.92 G
|
58%
|
Amazon Reviews Polarity
|
1.83 G
|
5.86 G
|
2.17 G
|
63%
|
Table [5], [6] and Fig. [8-10] show an illustrative run time measured in seconds for the ESQPA algorithm. The execution time of the proposed algorithm is evaluated and compared with those existing in MySQL and CryptDB servers.
Table 5. Execution time for random datasets in MySQL, CryptDB, and ESQPA server
Dataset
|
MySQL server
|
CryptDB server
|
ESQPA server
|
1 M
|
167 s
|
1653 s
|
1773 s
|
2 M
|
328 s
|
3281 s
|
3525 s
|
4 M
|
666 s
|
6644 s
|
7156 s
|
8 M
|
1352 s
|
12801 s
|
13773 s
|
16 M
|
2716 s
|
24518 s
|
26446 s
|
Table 6. Execution time for real datasets in MySQL, CryptDB, and ESQPA server
Dataset
|
MySQL server
|
CryptDB server
|
ESQPA server
|
PubMed Abstracts
|
148 s
|
1532 s
|
1728 s
|
India News Headlines
|
417 s
|
4225 s
|
4460 s
|
Amazon Reviews Polarity
|
1237 s
|
12682 s
|
13526 s
|
As illustrated in Fig. [6], [7], storage space at ESQPA is much less than CryptDB server storage by more than 55% saving. While the run time at ESQPA is a little higher than CryptDB as illustrated in Fig. [8-10]. This is due to the data compression and decompression operations that take place after encrypting the data. The sequence of operations is the key factor in a very small increase in run time. Depending on [32], [33] the performance analysis of ESQPA is calculated from the product of complexity for order preserving encryption O (n log n) by complexity for LZ77 compression O (m). Thus, time complexity for ESQPA is O (m*n log n). In general, the advantage of saving server storage space allows acceptance of a small amount of time to increase. Finally, the experimental results show that our algorithm is better and optimal for saving space. In addition to having the advantages of being intuitive, extremely less storage space specifically when applied to the encrypted database management system.