FPGA Based Hardware Abstraction of Quantum Computing System

The number of transistors per unit area is increasing every year by virtue of Moore's law. It is estimated that the current rate of evolution in the field of chip design will reduce the size of the transistor to the atomic scale by 2024. At the atomic level, quantum mechanical characteristics dominate, affecting the ability of transistors to store information in the form of bits. Quantum computers have been proposed as one way to effectively deal with this predicament. Quantum computing circuits utilize the spin characteristics of the electron to store information. This paper describes a proposition of a resource-efficient FPGA-based quantum circuit abstraction. A non-programmable embedded system capable of storing, measuring and introducing a phase shift in the qubit is implemented. The main objective of the proposed abstraction is to provide an FPGA-based platform comprising the fundamental sub-blocks for designing quantum circuits. A primary quantum key distribution algorithm, i.e. BB84, is implemented on the proposed platform as a proof of concept. The distinguishing feature of the proposed design is the flexibility to enhance the quantum circuit emulation accuracy at the cost of computational resources. The proposed emulation exhibits two principal properties of quantum computing, i.e. parallelism and probabilistic measurement.


Introduction
In 1965, Gordon Moore observed that the number of transistors per unit area had doubled every two years since 1959. He predicted that, based on the pace of technological innovation, this trend would continue for at least a decade [1]. The growth trend defined by Moore's law persisted 50 years longer than he predicted. Since the 1960s, multiple semiconductor technologies have been implemented to increase the transistor density and the switching speed. Some of the eminent technologies are MOSFETs, Piezo-Electric Transistors (PETs), Tunneling Field Effect Transistors (TFETs) and Near-Threshold Voltage (NTV) [2]. Photolithography has recently been introduced to further reduce the transistor fabrication area. At the current rate of improvement, photolithography systems will be able to use 5 nm technology to create transistor features on the scale of a handful of atoms by 2024 [3]. In atomic-scale transistors, the quantum mechanical characteristics of electrons dominate and cause the tunneling effect, which allows a current to flow in the reverse-biased mode of the transistor. Therefore, the quantum tunneling effect in nano-scale transistors is regarded as the prime cause of the end of Moore's law. Quantum computers have been proposed as one way to effectively deal with this predicament. Quantum processors utilize the spin characteristics of electrons, as demonstrated in the Stern-Gerlach experiment, to store information [4]. A single electron with spin, i.e. a qubit, constitutes the smallest building block of a quantum computer. A brief comparison between conventional and quantum computers is given in table 1. Companies like Google, IBM, Hitachi and D-Wave are working to develop a fully functional and stable quantum computer.
However, due to challenges such as decoherence, imperfect isolation from the environment, the limited capability to measure the output of the qubit and the lack of a robust representation of quantum information, a true quantum computer has not been developed yet [5] [6]. The technologies being used for quantum computer hardware architecture development are summarized in table 2.

Table 2:
1. [...]: the vial of liquid filled with sample molecules [7]
2. Ion-trap-based Computer: single ion controlled by the laser beams [8]
3. Cavity Quantum Electrodynamics (QED): the photon [9]
4. Linear Optics Quantum Computer: optical mode of the photon [10]
5. Quantum-dot-based Computer: spin of the electron [11]

Quantum computers are ideally suited for solving computationally expensive problems. Fundamental examples demonstrating the advantage of quantum computing over conventional computing are Shor's algorithm and Grover's search algorithm. Shor's algorithm can factor a 512-bit number in 3.5 hours at a 1 GHz clock rate [12], whereas 8400 years are required by a conventional computer working at a speed of a million instructions per second [13]. Grover's search algorithm can find an element in an unsorted list of N elements in O(√N) time [14]. Some advanced examples of quantum computing algorithms from a variety of disciplines are as follows:
1. Intelligent Control Systems: Knowledge-based optimizers with quantum computing, incorporated into the structure of an intelligent control system, have been proven effective for solving multicriteria control problems [...] [20] and quantum ghost imaging [21].
5. Artificial Intelligence: Quantum Inspired Neural Networks (QINN) can be classified into two categories [22]. The class of QINN that explicitly uses concepts from quantum computing remains at a theoretical level, as it requires a functional quantum computer for its implementation. The second category includes models of biological neural networks that explain the exceptional performance of biological brains by employing concepts from quantum computing and quantum mechanics.
The idea of using quantum circuits to perform the computational tasks has been prevailing for over 30 years. The non-availability of a fully functional quantum computer is impeding the implementation of the quantum algorithms on a useful scale. This research gap drives the focus on the concept of quantum computer hardware/software abstractions [5].
The concept of hardware/software abstraction of quantum circuits was introduced at the beginning of the 21st century. Quantum computing emulations are designed using Graphics Processing Units (GPUs), multiprocessor systems, supercomputers and FPGAs. By leveraging parallelism, an FPGA-based system provides emulation in a time-efficient manner. In this paper, a hardware abstraction of the quantum computing system is implemented on an FPGA. The proposed quantum gate emulation circuit provides a platform to develop and test algorithms based on quantum computing principles. The organization of the paper is as follows: section 2 presents a literature review on the quantum computing principles. In section 3, the architecture of the proposed Single Input Single Output (SISO) quantum abstraction is presented, followed by the performance analysis in section 4. Section 5 demonstrates the implementation of a quantum cryptographic algorithm, i.e. BB84, based on the cascaded SISO sub-block. The conclusion in section 6 explores the connection between the qubit size and the emulation accuracy.

Literature Review
In conventional computing, a bit is the smallest unit of information describing a classical system, whereas the qubit is the basic unit of information in a quantum system. The Stern-Gerlach experiment defines the concept of the qubit with the help of the electron spin [23]. The experimental setup consists of shooting a beam of electrons through a non-homogeneous magnetic field oriented along the z-axis. The field splits the beam into two streams with opposite spins, i.e. the positive z-axis or up spin ($|\uparrow\rangle$) and the negative z-axis or down spin ($|\downarrow\rangle$). The mathematical model of the experiment states that an electron exists in a superposition (a combination of $|\uparrow\rangle$ and $|\downarrow\rangle$) and that the probabilities of an electron following the up or the down path are $|\alpha_0|^2$ and $|\alpha_1|^2$ respectively. An electron spin is described as follows:

$$|\psi\rangle = \alpha_0 |\uparrow\rangle + \alpha_1 |\downarrow\rangle \qquad (1)$$

In digital computers, a bit represents two states, i.e. 0 and 1. In quantum computing, a 2 × 1 matrix is used to define a qubit. The states analogous to the conventional ones are represented as follows:

$$|0\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad |1\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

A qubit as defined in equation (1) exists as a combination of orthonormal states known as the superposition. The mathematical model for the superposition is given in equation (5).

$$|\psi\rangle = C_0 |0\rangle + C_1 |1\rangle \qquad (5)$$
where $C_0$ and $C_1$ are complex numbers. The graphical representation of the qubit is a unit sphere termed the Bloch sphere. The Bloch sphere maps the complex numbers $C_0$ and $C_1$ onto a spherical coordinate system. The qubit is first translated into the 3-dimensional rectangular space and then onto a unit sphere. The mathematical expressions for the transformations are as follows. Consider a qubit $|\psi\rangle$, defined as the superposition of $|0\rangle$ and $|1\rangle$:

$$|\psi\rangle = C_0 |0\rangle + C_1 |1\rangle$$

Equation (7) represents the complex coefficients of the qubit in polar form.

$$|\psi\rangle = r_{C_0} e^{j\phi_{C_0}} |0\rangle + r_{C_1} e^{j\phi_{C_1}} |1\rangle \qquad (7)$$

Since the qubit does not change if it is multiplied by a unit-magnitude complex number, we take the product of equation (7) with $e^{-j\phi_{C_0}}$. The following equation gives the result of the multiplication, with $\phi = \phi_{C_1} - \phi_{C_0}$.

$$|\psi\rangle = r_{C_0} |0\rangle + r_{C_1} e^{j\phi} |1\rangle$$

Applying Euler's identity to expand $e^{j\phi}$:

$$|\psi\rangle = r_{C_0} |0\rangle + r_{C_1} (\cos\phi + j\sin\phi) |1\rangle$$

Since the qubit is a normalized vector, the unit-magnitude condition is applied to the qubit under observation in order to translate it onto the unit sphere, as depicted in equation (11).

$$r_{C_0}^2 + r_{C_1}^2 = 1 \qquad (11)$$

In equation (12), the qubit is represented in the spherical coordinate system; $r_{C_0} = \cos\theta$ is the projection of the vector $|\psi\rangle$ on the $\theta = 0^\circ$, $\phi = 0^\circ$ line and $r_{C_1} = \sin\theta$ is the shadow of the qubit under discussion on the $\theta = 90^\circ$ plane.

$$|\psi\rangle = \cos\theta\, |0\rangle + e^{j\phi} \sin\theta\, |1\rangle \qquad (12)$$

Equation (13) shows the transformation equations from the qubit representation to the rectangular system.

$$x = \sin\theta\cos\phi; \quad y = \sin\theta\sin\phi; \quad z = \cos\theta \qquad (13)$$

In order to place the $|0\rangle$ and $|1\rangle$ qubits $180^\circ$ apart, the angle θ in equation (12) moves at half speed, which is represented mathematically by multiplying θ by two ($2\theta = \gamma$). Equation (14) represents the qubit on the Bloch sphere.

$$x = \sin\gamma\cos\phi; \quad y = \sin\gamma\sin\phi; \quad z = \cos\gamma \qquad (14)$$

The graphical depiction of the qubit representation on the Bloch sphere and in the rectangular coordinate system is given in figure 1.
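As an illustration, the mapping from a qubit to the Bloch sphere coordinates of equation (14) can be sketched in a few lines of Python. This is a minimal software sketch, not part of the proposed hardware; the function name `bloch_coordinates` and the use of floating-point complex numbers are assumptions made for this example.

```python
import cmath
import math


def bloch_coordinates(c0: complex, c1: complex) -> tuple[float, float, float]:
    """Map the qubit c0|0> + c1|1> to (x, y, z) on the Bloch sphere."""
    # Normalize so that |c0|^2 + |c1|^2 = 1.
    norm = math.sqrt(abs(c0) ** 2 + abs(c1) ** 2)
    if norm == 0:
        raise ValueError("the zero vector is not a valid qubit")
    c0, c1 = c0 / norm, c1 / norm

    # A global phase does not change the qubit, so only the relative
    # phase phi between the two coefficients is kept. At the poles the
    # relative phase is irrelevant and is set to zero.
    if abs(c0) > 0 and abs(c1) > 0:
        phi = cmath.phase(c1) - cmath.phase(c0)
    else:
        phi = 0.0

    # r_C0 = cos(theta) and r_C1 = sin(theta); gamma = 2 * theta places
    # |0> and |1> at opposite poles of the sphere.
    theta = math.atan2(abs(c1), abs(c0))
    gamma = 2 * theta

    x = math.sin(gamma) * math.cos(phi)
    y = math.sin(gamma) * math.sin(phi)
    z = math.cos(gamma)
    return x, y, z


if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    for name, qubit in (("|0>", (1, 0)), ("|1>", (0, 1)), ("(|0>+|1>)/sqrt(2)", (s, s))):
        print(name, bloch_coordinates(*qubit))
```

Under these assumptions, the basis states land on the two poles of the sphere and an equal-weight superposition lands on the equator, up to floating-point rounding.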
In the electron beam splitting experiment, the asymmetric magnetic field is used to define the spin of an electron. Analogous to the Stern-Gerlach experiment, the measurement procedure in quantum computing transforms the qubit into a classical bit. From the Bloch sphere point of view, the projection of the qubit on the positive z-axis depicts its chance of being measured as a zero bit. Similarly, the projection of the qubit vector on the negative z-axis shows its probability of being measured as a one bit. The classical Probabilistic Turing Machine (PTM) is used to express the concept of measurement. In the PTM tree, the vertices show the states and the edges define the probabilities of the transitions. In figure 2, the level 1 vertex is a state of the qubit and the level 2 vertices define the states into which the electron will collapse after the transition, based on the probabilities defined by the edges.
In classical computing, logic gates are the fundamental building blocks of digital integrated circuits. These gates implement boolean functions on classical bits. The fundamental classical logic gates are and, or, xor and not. In the quantum world, reversible matrix functions are analogous to the classical logic gates [24]. A quantum gate is formally defined as a unitary function that acts on qubits. The quantum gates can be classified into three categories, i.e. Single Input (SI) operations, Double Input (DI) operations and Multiple Input (MI) operations. The SI operations add a phase shift to a qubit on the Bloch sphere, thus varying the probability of the qubit assuming $|0\rangle$ or $|1\rangle$ once measured. The remaining two categories are collectively termed controlled-U gates. In controlled-U gates, the first input controls the nature of the operation on the remaining inputs [25]. The quantum gates and their effects on the qubits are given in table 3 [4].
In quantum information theory, a quantum circuit model is used to describe quantum computations and algorithms as a sequence of quantum gates, which are reversible transformations on the n-qubit register. A universal set of quantum gates capable of performing any unitary transformation on n qubits consists of the Controlled-NOT gate and the single-qubit gates [26]. The fundamental features of quantum mechanics demonstrated by the unitary transformations are parallelism, probabilistic measurement and entanglement. The SI quantum gates can demonstrate parallelism and measurement, whereas entanglement is demonstrated by the Controlled-NOT gate. In the absence of fully functional quantum computers, the emulation of the quantum gates and the qubit registers provides a platform for the design and development of quantum circuit models.

Gates: Functionality
Fredkin gate: $|0, y, z\rangle \Rightarrow |0, y, z\rangle$; $|1, y, z\rangle \Rightarrow |1, z, y\rangle$

[...]tion circuits available in the literature. The first two emulation approaches defined in table 4 involve the abstraction of quantum algorithms, whereas the third approach is based on the development of quantum circuit building blocks. This approach facilitates quantum circuit design at the abstraction level by cascading the quantum gates and registers. In this paper, the architecture of the hardware abstraction for the single input quantum system is presented. This design is capable of storing a qubit, introducing a phase shift in the qubit and measuring it. The main objective of the proposed abstraction is to provide an FPGA-based platform comprising the fundamental sub-blocks for designing quantum circuits. The distinguishing feature of our design is the flexibility to enhance the quantum circuit emulation accuracy at the cost of computational resources.

[...] [35]. Figure 3 presents the detailed block diagram of the proposed SISO emulator. The Quantum Finite State Machine (QFSM), along with the counter, controls the sequence of operations for the qubit processing.
The qubit processing involves fetching the quantum information, applying a unitary transformation to the qubit in the ALU, writing the ALU output to the qubit flip flop and converting the ALU result to a classical bit via the measurement block.
The design also provides the flexibility to improve the accuracy of the emulation results by increasing the number of quantized superposition states of the qubit. This is achieved by increasing the number of bits used for the qubit representation. For demonstration purposes, the operation of the abstraction is described with respect to the 8-bit model of the qubit representation. The following subsections elaborate the working principle of each functional block in the SISO architecture.

Qubit Flip Flop
The qubit flip flop allows the abstraction to store or fetch a qubit as per the instructions from the Quantum Finite State Machine (QFSM) in the form of read/write signals. The mathematical representation of the qubit is given in equation (5). In this equation, $C_0$ and $C_1$ are complex numbers subject to the condition $|C_0|^2 + |C_1|^2 = 1$.
Graphically, all the points on the Bloch sphere represent possible values of a qubit. Ideally, a qubit has an infinite number of states, as $C_0$ and $C_1$ have infinitely many possible values between 1 + 0i and −1 + 0i. In the abstraction model, we need to store $C_0$ and $C_1$ in order to create a memory for a qubit. Since the coefficients of the qubit are complex numbers, two eight-bit fixed-point values are used to store each coefficient. Consider the coefficient $C_0$: it is represented as $\alpha_0 + \beta_0 i$, where $\alpha_0$ and $\beta_0$ are Q(2, 6) signed numbers. The reason for dedicating 2 bits to the integer part is to include the north and south poles of the Bloch sphere in the representable range. By dedicating six bits to the fractional part, we limit the superposition states that can be assumed by the qubit. There is a direct relation between the accuracy of the qubit abstraction and the number of fraction bits dedicated to representing the qubit.

Table 4:
Approach 1: [...]
-Franka et al. [29]: Resource efficient design of quantum algorithms
-Goto and Fujishima [30]: Use of unitary macro-operations to facilitate memory efficient simulation of quantum circuits
Approach 2: Behavioral emulation of physical quantum algorithms
-Lee et al. [31]: Quantum algorithm emulation closer to natural quantum systems at the cost of increased hardware complexity
Approach 3: Design of basic quantum gates for developing quantum algorithms on classical architecture
-Aminian et al. [32]: The proposed approach is focused on the development of quantum gates and using them to emulate quantum algorithms
-Negovetic et al. [33]: The paper proposes a software-hardware solution for quantum circuit emulation
-Khalid et al. [34]: Quantum circuit development encompassing the concept of entanglement and parallelism

The accuracy of the abstraction of a qubit is also affected when the qubit is processed by a quantum gate. Each gate involves addition and multiplication of complex numbers. Since the gate output is stored in the Q(2, 6) memory, an error is introduced by truncating the fractional part of the gate's output. Figure 4 shows the architecture of the single-qubit storage.
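The effect of the Q(2, 6) storage format on a qubit coefficient can be illustrated with a short Python sketch. It is only a software model under stated assumptions: the helper names `to_fixed`, `from_fixed` and `store_coefficient` are hypothetical, and truncation is modeled as dropping the low-order bits of a two's complement value (rounding toward negative infinity), which may differ from the rounding behavior of the actual hardware.

```python
import math


def to_fixed(value: float, fraction_bits: int = 6, total_bits: int = 8) -> int:
    """Truncate a real value to a signed fixed-point integer, e.g. Q(2, 6)."""
    # Dropping the extra fraction bits is modeled with floor (assumption).
    raw = math.floor(value * (1 << fraction_bits))
    low = -(1 << (total_bits - 1))
    high = (1 << (total_bits - 1)) - 1
    if not low <= raw <= high:
        raise OverflowError(f"{value} does not fit in {total_bits} bits")
    return raw


def from_fixed(raw: int, fraction_bits: int = 6) -> float:
    """Convert a fixed-point integer back to a real value."""
    return raw / (1 << fraction_bits)


def store_coefficient(c: complex, fraction_bits: int = 6,
                      total_bits: int = 8) -> tuple[int, int]:
    """Store a complex coefficient as two fixed-point words (real, imaginary)."""
    return (to_fixed(c.real, fraction_bits, total_bits),
            to_fixed(c.imag, fraction_bits, total_bits))


if __name__ == "__main__":
    amplitude = 1 / math.sqrt(2)  # coefficient of an equal superposition
    for fraction_bits, total_bits in ((6, 8), (14, 16)):
        raw = to_fixed(amplitude, fraction_bits, total_bits)
        error = amplitude - from_fixed(raw, fraction_bits)
        print(f"{total_bits}-bit word: raw={raw}, truncation error={error:.6f}")
```

In this model the truncation error of a stored value is always smaller than one least significant bit, so each additional fraction bit halves the worst-case error, which is the accuracy versus resource trade-off described above.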

Measurement Block
The concept of electron spin in quantum mechanics allows the qubit to be in a coherent superposition of the absolute states, i.e. $|0\rangle$ and $|1\rangle$. The north and south poles of the Bloch sphere can be considered analogous to the classical 0 and 1 bits respectively. The measurement block transforms the coherent superposition of the qubit to a discrete 0 or 1 bit based on a probabilistic model. According to equation (5), the probability of a qubit landing as $|0\rangle$, i.e. the 0 bit, is $|C_0|^2$ and the probability of assuming the $|1\rangle$ state is $|C_1|^2$. In the proposed design, the measurement is based on the Roulette wheel principle [35]. The following procedure describes the computational steps performed to convert a qubit to a classical bit:
1. The probability of measuring 0 ($|C_0|^2$) is transformed from a fixed-point number in the range 0 to 1 to an integer $ket0_p$ in the range 0 to 255 via linear mapping.
2. An 8-bit unsigned pseudo-random integer $R_p$ is generated.
3. If the pseudo-random number $R_p$ is less than $ket0_p$, the qubit is transformed to the classical bit 0; otherwise the output is the classical bit 1.
[...] six sub-modules that work sequentially for 16 cycles to create c_inter, and the final output c is generated by the 7th sub-module, which gives the xor of c_inter and n. The registers used in the architecture and their purposes are given in table 5.
A step-by-step description of each stage of the PbS function architecture is shown in figure 6 and narrated as follows:
1. As a first step, the inputs m and n are stored in the registers m_reg and n_reg respectively, and the most significant bits of m_reg and n_reg are stored as m_msb and n_msb. For bit-by-bit manipulation, both registers are rotated left so that the msb represents the subsequent bit in the next cycle.
[...]
6. In this step, m_msb is placed at the position defined by pointerc, as determined in step 4. The above six steps are repeated for 16 cycles to completely shuffle the bits of m_reg and form c_inter.
7. As a final step, the output is obtained by taking the xor of c_inter and n_reg. This gives a 16-bit output that is divided into two 8-bit PRNs.
The block diagram of the architecture of the PbS-based pseudo-random number generator is presented in figure 6. The statistical test suite developed by the National Institute of Standards and Technology (NIST) is an internationally accepted benchmark for evaluating various aspects of randomness in a long sequence of bits [37]. In order to verify the quality of the proposed random number generator, the output of the PbS function is subjected to the NIST randomness test suite. Appendix A shows that the proposed PbS-based random number generator creates 8-bit random numbers as per the NIST standards, as the p-value for each test is well within the bound defined by the randomness test suite [38].
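The three measurement steps of this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the linear mapping is taken to be a multiplication by 255 followed by truncation, and Python's `random` module stands in for the PbS-based generator, whose internal steps are not reproduced here.

```python
import random


def ket0_threshold(c0: complex, bits: int = 8) -> int:
    """Linearly map the probability |c0|^2 from [0, 1] to [0, 2^bits - 1]."""
    p0 = min(abs(c0) ** 2, 1.0)  # guard against rounding above 1
    return int(p0 * ((1 << bits) - 1))


def measure(c0: complex, r_p: int, bits: int = 8) -> int:
    """Roulette-wheel measurement: bit 0 if r_p < ket0_p, otherwise bit 1."""
    return 0 if r_p < ket0_threshold(c0, bits) else 1


def measure_random(c0: complex, rng: random.Random, bits: int = 8) -> int:
    """Measure with a software pseudo-random number (stand-in for PbS)."""
    return measure(c0, rng.randrange(1 << bits), bits)


if __name__ == "__main__":
    rng = random.Random(2024)
    c0 = 0.5 ** 0.5  # equal superposition, |c0|^2 is about 0.5
    samples = [measure_random(c0, rng) for _ in range(10_000)]
    print("fraction of 0 outcomes:", samples.count(0) / len(samples))
```

Taking the pseudo-random number as an explicit argument of `measure` keeps the decision logic deterministic and easy to check. Under this reading of the mapping, the largest threshold is 255, so even a qubit at the north pole is measured as 1 when $R_p = 255$; this quantization effect shrinks as more bits are used.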

Arithmetic Logic Unit
The qubit signal processing is defined by applying a set of transformations on the quantum state space. These transformations are termed quantum gates. In classical computing, the basic gates are and, or, not and xor. Any computation on conventional bits can be decomposed into a sequence of these basic operations. Likewise, in quantum computation any algorithm can be broken down into a series of quantum gates. Mathematically, a quantum gate is represented as a unitary matrix; graphically, the transformation corresponds to a phase shift of the qubit on the Bloch sphere. These gates can be classified into two categories, i.e. Single Input (SI) operations and Multiple Input (MI) operations. A brief description of the quantum transformation categories is as follows:
-Single Input (SI) Operations: These gates accept a single qubit as input and alter the spin of the qubit. Common SI gates are the Pauli-X, Pauli-Y, Pauli-Z and Hadamard gates.
In this paper, we present the hardware abstraction of the SI operations. The implemented operations are the Pauli-X, Pauli-Y, Pauli-Z and Hadamard gates. Mathematically, each SI gate is a matrix multiplication. In terms of architecture, however, a behavioral model of the matrix multiplication is implemented to reduce the implementation cost. The architecture of each gate is shown in figure 7 and described as follows:

Pauli-X Gate
The Pauli-X gate is analogous to the not gate. Graphically, it rotates the qubit vector by $180^\circ$ about the x-axis. Equation (15) shows the mathematical description of the gate.

$$X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad (15)$$

Pauli-Y Gate
This transformation rotates the Bloch sphere, and the qubit vector plotted on the sphere, by $180^\circ$ about the y-axis. Equation (16) presents the mathematical model of the gate.

$$Y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \qquad (16)$$

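Pauli-Z Gate
This transformation flips the qubit by $180^\circ$ about the z-axis. It changes the relative phase between the $|0\rangle$ and $|1\rangle$ components; the probabilities of materializing as $|0\rangle$ or $|1\rangle$ in a direct measurement remain $|C_0|^2$ and $|C_1|^2$, but the outcome of subsequent gates followed by a measurement is affected. The mathematical description of the Pauli-Z gate is given in equation (17).

$$Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \qquad (17)$$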

Hadamard Gate
This transformation applies a sequence of two rotations to a qubit on the Bloch sphere. The Hadamard gate can be expressed as a $90^\circ$ rotation about the y-axis followed by a $180^\circ$ rotation about the x-axis. Equation (18) represents its mathematical expression.

$$H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad (18)$$
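The idea of replacing the matrix multiplication by a behavioral model can be illustrated in Python. The sketch below is an assumption about what such a model may look like, not the architecture of figure 7: a qubit is a hypothetical pair `(c0, c1)` of floating-point complex coefficients, and each gate is written as the swap, negation or sum that its standard matrix reduces to.

```python
import math

INV_SQRT2 = 1 / math.sqrt(2)

Qubit = tuple[complex, complex]  # (c0, c1) for the state c0|0> + c1|1>


def pauli_x(q: Qubit) -> Qubit:
    """Pauli-X: swap the two coefficients (no arithmetic needed)."""
    c0, c1 = q
    return (c1, c0)


def pauli_y(q: Qubit) -> Qubit:
    """Pauli-Y: swap the coefficients and multiply by -i and +i."""
    c0, c1 = q
    return (-1j * c1, 1j * c0)


def pauli_z(q: Qubit) -> Qubit:
    """Pauli-Z: negate the coefficient of |1>."""
    c0, c1 = q
    return (c0, -c1)


def hadamard(q: Qubit) -> Qubit:
    """Hadamard: scaled sum and difference of the coefficients."""
    c0, c1 = q
    return ((c0 + c1) * INV_SQRT2, (c0 - c1) * INV_SQRT2)


def probabilities(q: Qubit) -> tuple[float, float]:
    """Probabilities of measuring 0 and 1."""
    c0, c1 = q
    return (abs(c0) ** 2, abs(c1) ** 2)


if __name__ == "__main__":
    zero = (1 + 0j, 0j)
    print("X|0> =", pauli_x(zero))
    print("P(H|0>) =", probabilities(hadamard(zero)))
```

In this form only the Hadamard gate needs an adder and a constant multiplier; multiplying by ±i amounts to exchanging the real and imaginary parts with one sign change, so the Pauli gates reduce to wiring and negation.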

The Counter
The counter is connected to the ALU and the QFSM to synchronize the qubit signal processing operations with the clock cycles. The number of clock cycles required by a quantum operation determines its execution time.

Quantum Finite State Machine (QFSM)
The finite state machine is responsible for the coordination among all the functional blocks of the quantum processor. The functions of the quantum emulator FSM are as follows:
1. Take a qubit as input.
2. Coordinate with the ALU and the counter for the SI transformation.
3. Store the output in the qubit memory.
4. Transfer the SI output to the measurement block so that the qubit can be materialized as a classical 0 or 1 bit.

Performance Analysis
This section defines the accuracy of the emulation model and the computational cost of the abstraction. A trade-off exists between the emulation accuracy and the implementation cost, mainly because of the fixed-point nature of the proposed FPGA-based architecture. Figure [...] The gate-level hardware abstraction is implemented on a Spartan-3 FPGA using Verilog. The resource requirement of each block is given in table 9 in terms of Look-Up Tables (LUTs) and slice count. The direct application of the proposed design is an ultra-lightweight emulation of quantum key distribution algorithms. The basic quantum algorithms used for symmetric key distribution are BB84 and BB92 [4]. The quantum circuits for these algorithms exercise the concepts of parallelism and probabilistic measurement for the encryption of a private key. The micro blocks of the proposed architecture are used to design a non-programmable architecture of the BB84 algorithm in the subsequent section. The total computational cost of a quantum protocol designed with the proposed micro blocks is the cumulative cost of the operations performed on the input.

Emulation of Quantum Algorithms
The emulation of the SISO quantum gates defined in section 3 can be used to develop abstractions of quantum algorithms such as Shor's algorithm, Grover's search algorithm and the quantum key distribution algorithms. Following the NIST post-quantum cryptography call, the probabilistic measurement feature of the qubit is being widely explored for public and private key encryption techniques [39]. Post-quantum cryptography aims to develop cryptographic systems that are secure against both quantum and classical computers. Since large-scale quantum computers are not widely accessible, an emulation of quantum systems can facilitate the design, development and testing of algorithms for the post-quantum era. In this paper, we have implemented the architecture of the quantum key distribution algorithm BB84 [40]. The protocol is deployed on four different systems, i.e. 8-bit, 16-bit, 32-bit and 64-bit. As the number of bits used to define a qubit increases, the number of superposition states increases and hence the accuracy of the emulation output improves. This section explains the working of the BB84 protocol, the abstraction architecture and the performance analysis of the algorithm.

The Bennett & Brassard 84 Algorithm
The Bennett and Brassard (BB84) algorithm is considered the foundation stone of quantum cryptography. The protocol uses the no-cloning and probabilistic measurement features to securely communicate the private key. The BB84 protocol implementation requires two mutually unbiased orthonormal bases, i.e. [+, ×]. The polarization states of each basis are given in table 10. The protocol executes in three steps, defined as follows:
1. Key transmission phase: The transmitter, i.e. Alice, generates an 8-bit binary key through a pseudorandom number generator. Alice then generates a random sequence of bases among [+, ×] and transforms each bit of the key to a qubit in the corresponding basis. Finally, an 8-qubit sequence is transmitted to the receiver. Table 11 elaborates the key transmission process.
[...]
4. The sequential processing of the input sequence is governed by a 3 × 8 mux. The selection line of this mux is a 3-bit counter that increments after every clock cycle. Starting from the lsb side, every positive edge of the clock feeds the subsequent key bit, along with the corresponding random sequence bit, to the selection lines of the 2 × 4 mux.
5. The quantum transformation of the key is saved in the 8-bit qubit flip flops and also transmitted to the receiver's end.
The architecture of the transmitter's side is depicted in figure 9.
At the receiver's side, the qubit key sequence is transformed to a classical 8-bit sequence through probabilistic measurement. Each qubit is randomly measured in one of the bases [+, ×]. The step-by-step description of the receiver's architecture is as follows:
1. An 8 × 1 mux is responsible for the sequential measurement of the qubits, with a 3-bit counter as its selection line. The counter increments after every successful measurement of a qubit.
2. This step transforms the qubit so that a single measurement block can be used for key retrieval in either of the bases [+, ×]. For a measurement in the + basis, no transformation is required, as the measurement block in section 3 is based on the states $|\rightarrow\rangle$ and $|\uparrow\rangle$. For a measurement in the × basis, the measurement block should use $|\nearrow\rangle$ and $|\nwarrow\rangle$ as the absolute states. A Hadamard gate is used to rotate the $|\nearrow\rangle$ and $|\nwarrow\rangle$ states to $|\rightarrow\rangle$ and $|\uparrow\rangle$ respectively. This corresponds to the translation of a × basis qubit to a + basis qubit without altering its probability of materializing as a 0 or a 1.
3. A 2 × 1 mux is used to select the basis for the measurement of a qubit, based on the sequential bits of an 8-bit random number generated by the PbS function, i.e. randomnumber[n] = 0 selects the + basis and randomnumber[n] = 1 selects the × basis.
4. As a final step, the measurement block is used to convert the qubit to a conventional bit.
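The transmitter and receiver data paths described above can be mimicked by a short Python sketch. Several assumptions are made for this example: qubits are ideal floating-point coefficient pairs rather than fixed-point words, Python's `random` module replaces the PbS generator, all function names are hypothetical, and the final sifting step (keeping only the positions where both sides used the same basis) follows the standard BB84 protocol rather than a description in this section.

```python
import math
import random

INV_SQRT2 = 1 / math.sqrt(2)


def pauli_x(q):
    c0, c1 = q
    return (c1, c0)


def hadamard(q):
    c0, c1 = q
    return ((c0 + c1) * INV_SQRT2, (c0 - c1) * INV_SQRT2)


def encode(bit: int, basis: int):
    """Transmitter: basis 0 is the + basis, basis 1 is the x basis."""
    q = (1.0, 0.0)            # start from |0>
    if bit:
        q = pauli_x(q)        # key bit 1 flips the qubit
    if basis:
        q = hadamard(q)       # x basis: rotate into the diagonal states
    return q


def measure(q, basis: int, rng: random.Random) -> int:
    """Receiver: rotate x basis qubits back, then measure probabilistically."""
    if basis:
        q = hadamard(q)
    p0 = abs(q[0]) ** 2
    return 0 if rng.random() < p0 else 1


def bb84_round(key_bits, alice_bases, bob_bases, rng):
    """Return the sifted keys of Alice and Bob for one transmission."""
    qubits = [encode(b, s) for b, s in zip(key_bits, alice_bases)]
    bob_bits = [measure(q, s, rng) for q, s in zip(qubits, bob_bases)]
    keep = [i for i, (a, b) in enumerate(zip(alice_bases, bob_bases)) if a == b]
    return [key_bits[i] for i in keep], [bob_bits[i] for i in keep]


if __name__ == "__main__":
    rng = random.Random(1)
    key = [rng.randrange(2) for _ in range(8)]
    alice_bases = [rng.randrange(2) for _ in range(8)]
    bob_bases = [rng.randrange(2) for _ in range(8)]
    alice_key, bob_key = bb84_round(key, alice_bases, bob_bases, rng)
    print("sifted keys:", alice_key, bob_key)
```

When the bases match, the second Hadamard undoes the first and the receiver recovers the key bit; when they differ, the outcome is 0 or 1 with equal probability, which is why those positions are discarded in the sifting step.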
The architecture of the receiver's side is presented in figure 10. The architecture is designed for four different systems, i.e. 8-bit, 16-bit, 32-bit and 64-bit. The perfor[...] of 0.25. The overall performance in terms of emulation error is obtained by comparing the success rate of the abstraction with that of the ideal system. Table 15 shows a linear decrease in the error with the increase of the mantissa size used for the qubit representation. This shows that increasing the number of bits for the qubit representation reduces the quantization and truncation errors, thus making the emulation more reliable.

Conclusion
In this paper, a low-cost FPGA-based emulation of quantum circuits is proposed. The architectures of the single input gates, the memory and the measurement block have been implemented. The proposed solution is a standalone system capable of exhibiting parallelism and probabilistic measurement. This paper also presents the gate-level abstraction of the BB84 protocol on four different systems, i.e. 8-bit, 16-bit, 32-bit and 64-bit. The performance analysis shows that the accuracy of the model improves linearly as the size of the quantum emulation increases. The proposed system can be implemented as an ASIC solution for quantum key distribution in resource-constrained scenarios.