Implementation and Optimization of CNTFET Based Ultra-Low Energy Delay Flip Flop Designs

Energy conservation and delay minimization are the two major goals when designing ultra-low-power digital integrated circuits at lower technology nodes. Here, the silicon based carbon nanotube field effect transistor (CNTFET) is explored as a novel technology for future electronics design applications (EDA). In this paper, two energy-efficient switching activity minimization techniques are applied to the proposed designs. The first technique, known as transition completion detection (TCD), detects the completion of the sensing stage operation. The TC signal is generated from a NAND operation on the complementary outputs of the sensing stage, which minimizes glitches in the complementary outputs of the latch stage. The second technique is a clock gating mechanism applied at the latch stage to smooth the output waveforms Q and Q̄. The proposed and existing designs are simulated using 32 nm CMOS and 32 nm CNTFET technology, indicating that the CNTFET based designs reduce power by 45% and 36%, respectively, in comparison with conventional CMOS. The proposed Low Power Sense Amplifier Flip Flop with transition control detection (TCD-LPSAFF) and Ultra Low Energy Sense Amplifier Flip Flop (ULESAFF) give the minimum power delay product (PDP), 35.7 × 10⁻¹⁸ J and 29.6 × 10⁻¹⁸ J, respectively. The effect of process variation is also analyzed at the specified corners (FF, TT and SS) over the temperature range of -40 °C to 120 °C. The performance of all designs has been validated by functionality testing with variation in load capacitance, diameter, number of tubes and pitch.


Introduction
The two major goals in modern digital VLSI applications are ultra-low power and high speed operation. In particular, the primary concern has shifted towards power reduction rather than delay minimization in present energy efficient portable device applications [1]. Power optimization with reliable operation of the sequential logic block in clock storage elements (CSEs) is a very critical task while working beyond sub-nanometer nodes [2]. Reliability of digital logic designs is limited by source regions. The only difference is that the transmission medium is replaced with rolled graphene carbon nanotubes. Two types of carbon nanotubes are available: the single wall carbon nanotube (SWCNT) and the multi wall carbon nanotube (MWCNT). SWCNTs are further classified as semiconducting and metallic. SWCNTs are useful in digital design applications, especially for designing switching devices with a high ON/OFF current ratio. The presence of metallic CNTs during fabrication of SWCNTs deteriorates performance: because they have a near zero energy band gap, the gate voltage cannot control them, which directly impacts the drain current and power performance of Si based CNTFET technology [6][7][8][9][10].
It is known that the total average power is the sum of three components, namely leakage, static and dynamic. This paper mainly focuses on reducing dynamic power by minimizing the switching activity of the major capacitive nodes. Various power reduction techniques have been reported in the state of the art for minimizing data activity at major switching nodes [11]. This paper concentrates on two low power techniques: clock gating and transition completion detection. The performance of the SAFF is drastically degraded by glitches at intermediate and output capacitive nodes. Section 2 presents a literature review of existing single edge trigger sense amplifier based delay flip flop designs. No previous literature applies these schemes with CNTFET technology to the proposed low power sense amplifier flip flops. The performance of flip flops is measured in terms of timing metrics such as setup time, hold time, propagation delay and energy-delay product. A comprehensive comparison of previously reported low power sense amplifier based delay flip flops with the proposed methodology, using 32 nm CMOS and 32 nm CNTFET technology, is presented in Section 3. Parametric variation in supply voltage, temperature, diameter, number of tubes and pitch is studied for enhanced power delay product in Section 4. Finally, the findings of this work are summarized in the conclusion.
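The link between switching activity and dynamic power can be illustrated with the standard first-order relation P_dyn = α · C · V_DD² · f. The following is a minimal sketch, not the simulation flow used in this work: the function name and the activity factors are illustrative assumptions, while the 1 fF load, 0.9 V supply and 1 GHz clock are the operating values quoted later in the paper.

```python
def dynamic_power(alpha, c_node, vdd, freq):
    """First-order dynamic power in watts: P = alpha * C * VDD^2 * f.

    alpha  : switching activity factor of the node (0.0 to 1.0), assumed value
    c_node : switched capacitance in farads
    vdd    : supply voltage in volts
    freq   : clock frequency in hertz
    """
    return alpha * c_node * vdd ** 2 * freq


# Illustrative comparison: halving the activity factor halves dynamic power.
p_full = dynamic_power(1.0, 1e-15, 0.9, 1e9)
p_half = dynamic_power(0.5, 1e-15, 0.9, 1e9)
print(p_full, p_half)
```

Because the relation is linear in α, any technique that suppresses redundant transitions at a capacitive node lowers that node's dynamic power in direct proportion.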

Investigation of Sense Amplifier based Single Edge Trigger D Flip Flop Designs
Sequential elements (such as latches and flip flops) store the data input with respect to the rising or falling edge of the clock pulse, as shown in Fig. 1a. Level triggered latches create problems and always need more attention, hence this work focuses on flip flops. Flip flops are edge triggered, meaning that they change state only when the control signal (clock pulse) transitions from high to low (negative edge triggered) or from low to high (positive edge triggered).
Delay flip flops (DFFs) can be divided on the basis of their functionality into dynamic and static types. Their functionality can be improved by analyzing the various switching activity data patterns shown in Fig. 1b and evaluating the best case, worst case and average case. Switching activity has been minimized in state of the art papers using various dynamic power reduction techniques at the circuit level of design abstraction, including clock gating, the sleep transistor method and the forced stack transistor technique [12][13][14][15]. Clock gating is used to reduce dynamic power and illustrates the effect of switching activity on the total power consumption of a digital synchronous design. Adding a high-Vt sleep transistor to cut off a design block with low switching activity is a very common form of power gating in CSEs. The stack transistor method is mainly used to reduce leakage power. In this work, the stacking effect is integrated with the clock gating technique and the TCD method to minimize the total power of the digital design.
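The best, worst and average case data patterns mentioned above can be quantified by counting toggles in a bit stream. The sketch below is an illustration under assumed definitions: the helper name and the example patterns are hypothetical and are not taken from Fig. 1b.

```python
def switching_activity(bits):
    """Fraction of consecutive bit pairs that differ (0.0 to 1.0)."""
    if len(bits) < 2:
        return 0.0
    toggles = sum(a != b for a, b in zip(bits, bits[1:]))
    return toggles / (len(bits) - 1)


# Hypothetical data patterns for illustration only.
best = switching_activity([1, 1, 1, 1, 1])     # data never changes
worst = switching_activity([0, 1, 0, 1, 0])    # data toggles every cycle
typical = switching_activity([0, 0, 1, 1, 0])  # data toggles on half the cycles
print(best, worst, typical)
```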
The effective area of memory shrinks day by day while storage capacity grows with recent advancements in VLSI design. Hence, there is a strong need for fast detection of the input data available at bitlines and data-lines, with reliable read and write operation between the different blocks of the memory subsystem. For this application, various kinds of sense amplifiers have been used to sense the ultra-low voltage present at bit lines and amplify it with a pair of transistors in a positive feedback manner to improve the performance of whole Most of the previous work on SAFF has been reported using CMOS technology [16][17][18][19][20]. There is no previous literature on CNTFET based SAFF designs. In this context, this paper deals with the redesign and analysis of eight previously available SETFF structures, as well as the proposed topologies, at 32 nm CNTFET, which are then compared with 32 nm CMOS for a fair comparative analysis.
Sense amplifier based delay flip flops have the advantage of reliable operation at ultra-low voltage, which is a basic need of energy efficient smart portable handheld devices. The first sense amplifier based D flip flop, introduced in [25], consists of a fast sensing stage and a slow slave latching stage. The sensing stage is a differential ratio-less structure with near zero setup time, low hold time and reduced clock load. The main drawback of these SAFF designs is the asymmetric output delay generated by the slave S-R NAND latch.
A modified sense amplifier based delay flip flop invented by Nikolic, reported in [21], improves the delay metric by adding a symmetric two-input NAND slave latch comprising two NAND based logic gates and two inverters. The transistor count increases by a factor of two compared with the conventional two-input NAND based latch. This design therefore trades area for performance, as shown in Fig. 2a. Another topology, presented by Lin in [22], reduces delay with a lower transistor count in the latch stage, as shown in Fig. 2b. However, a large power consuming glitch is present when both the output and the next data input are high. In [23], Kim presented a new SAFF design that uses a modified latch structure realized from two C2MOS structures with two cross coupled weak inverter pairs, as shown in Fig. 2c. This change makes the latch static in nature. Kim's SAFF has lower delay and requires fewer transistors in the output latch stage. This type of SAFF still has a glitch problem when a small load is applied at the output. In addition, the cross-coupled inverters need appropriate transistor sizing for reliable operation and for minimizing the power consumed by glitches. The next SAFF, introduced by Strollo and shown in Fig. 2d, uses a new slave latch that removes the problems faced by Kim's design while maintaining its benefits. This slave architecture requires twelve transistors and is a mixed solution derived from the NAND-based SR latch [11,21,25] and the C2MOS design [15]. The sense amplifier based D flip flop designs are discussed below in detail and then compared with the proposed designs.

Sense Amplifier Flip Flop (SAFF)
Binary and ternary logic D flip flop designs, shift registers and counters have been discussed and optimized in the published literature [15,18,26,27]. The sense-amplifier based flip-flop (SAFF), shown in Fig. 3a, has a transistor count of 18, reduces clock swing, and consists of a power efficient sensing stage followed by a cross coupled NAND2 based D latch. The clock network is designed using reliable low power techniques such as clock gating, transistor stacking and multi threshold operation. The sense amplifier based master stage is followed by a static slave output latch. The inputs and outputs are interdependent for reliable transmission of information. Initially, a rising clock pulse is applied at the sensing stage. Both R and S are pre-charged to the supply voltage through transistors MP1 and MP4. Internal nodes N1, N2 and N3 are also pre-charged, to the difference between the supply voltage and the transistor threshold voltage. The next case is CLK=1 and D=0, which turns ON MN1 and turns OFF MP1 and MP4. Node N3 discharges faster than node N2 through the MN1 and MN3 path. Finally, R and S are latched, one at the maximum voltage VDD and the other at the minimum voltage (zero). D is then evaluated through the feedback paths MP5-MP6 and MP2-MP3, which in turn sets Q and Q̄ to either 0 or 1 according to the input state. Here, an always ON weak transistor MN4 is included to generate complementary signals at R and S.
Sense amplifier based circuits are very useful at lower technology nodes, where flip flops have to accept very small input signals and amplify them. The main drawback is that the sampling stage introduces more dynamic power dissipation when there is very little data activity at the major input nodes, as shown in Fig. 5a. A further problem occurs in the latch operation due to the interdependency of the complementary output nodes Q and Q̄.
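The precharge and evaluate behaviour described above can be mimicked with a purely logical model. The sketch below is a behavioural illustration under stated assumptions, not a transistor-level description of Fig. 3a: it treats R and S as active-low signals that are both high while CLK=0, assumes that D=1 pulls S low (the opposite polarity would work equally well), and models the slave stage as a cross-coupled NAND latch. All function names are hypothetical.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def sense_stage(clk, sampled_d):
    """Return (s, r): both precharged to 1 when clk=0, one pulled low when clk=1."""
    if clk == 0:
        return 1, 1
    # Assumed polarity: D=1 pulls S low, D=0 pulls R low.
    return (0, 1) if sampled_d else (1, 0)


def nand_latch(s, r, q, q_bar):
    """Cross-coupled NAND latch with active-low inputs, iterated until stable."""
    for _ in range(4):
        q_new = nand(s, q_bar)
        q_bar_new = nand(r, q_new)
        if (q_new, q_bar_new) == (q, q_bar):
            break
        q, q_bar = q_new, q_bar_new
    return q, q_bar


def simulate(clk_seq, d_seq, q=0, q_bar=1):
    """Return the Q value after each time step for the given CLK and D sequences."""
    out, prev_clk, sampled = [], 0, 0
    for clk, d in zip(clk_seq, d_seq):
        if clk == 1 and prev_clk == 0:
            sampled = d  # decision is taken at the rising clock edge
        s, r = sense_stage(clk, sampled)
        q, q_bar = nand_latch(s, r, q, q_bar)
        out.append(q)
        prev_clk = clk
    return out


print(simulate([0, 1, 1, 0, 0, 1, 0], [1, 1, 0, 0, 0, 0, 1]))
```

In this model a change of D while CLK stays high is ignored, and Q holds its value while both R and S are precharged, which is the edge-triggered behaviour expected of the flip flop.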

Low Power Sense Amplifier Flip Flop (LPSAFF)
The conventional SAFF design shows inferior performance due to high switching at the dynamic capacitive nodes N1 and N2. This can be improved by adding two data controlled P type transistors in the pre-charging path, as shown in Fig. 3b. When data is high for n cycles, node N1 is discharged through the MN2-MN1 path in the first cycle, and for the remaining (n-1) consecutive cycles this node is either floating (for a low value of CLK) or discharged to ground (for a high value of CLK). The modified pre-charging paths of R and S increase the transistor count and hence the area, but minimize the switching activity of the internal nodes, as discussed in [11]. Hence, there is a tradeoff between area and power. Still, the LPSAFF faces floating conditions: N1 floats when CLK=0 and D=1, and N2 floats when both CLK and data are zero. This problem is resolved by adding a weak transistor MN4 in the discharging path. Still, the LPSAFF requires considerable improvement in its slow NAND based D latch circuitry.
A variety of slave latch structures for low power SAFF have been explored [15][16][17][18][19][20][21][22][23][24][25]. A modified latch design introduced by Nikolic is explained in [21], in which two inverters are inserted in the pull down paths for Q and Q̄, respectively. Another latch topology, designed by Kim, is badly affected by a glitch produced when the previous state of the complementary outputs and D are '1'. Also, both the clock to output delay and the leakage current increase due to the inflow of contention current across cross
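The saving from the data controlled pre-charging path can be estimated by counting how often an internal node has to discharge. The sketch below is an idealized counting model and an assumption on our part, not a circuit simulation: it supposes that the conventional SAFF discharges one internal node in every clock cycle, while the LPSAFF discharges a node only in the first cycle of each run of identical data values. The function names and the data stream are hypothetical.

```python
def discharges_conventional(data):
    """Idealized conventional SAFF: one internal node discharges every cycle."""
    return len(data)


def discharges_data_gated(data):
    """Idealized LPSAFF: a node discharges only in the first cycle of each run
    of identical data values, because precharge is blocked while data is unchanged."""
    if not data:
        return 0
    return 1 + sum(a != b for a, b in zip(data, data[1:]))


# Hypothetical data stream with three runs: 1111, 00, 11.
stream = [1, 1, 1, 1, 0, 0, 1, 1]
print(discharges_conventional(stream), discharges_data_gated(stream))
```

Under this model the benefit is largest for slowly changing data, which matches the low data activity case where the conventional sampling stage wastes the most power.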

Proposed Transition Control Detection Low Power Sense Amplifier Flip Flop (TCD-LPSAFF)
The transition detection (TC) signal is generated from a NAND operation on R and S. This signal is applied at the gate of MN4 to solve the previously faced problem of speed reduction due to the always-on weak transistor MN4 in the sensing stage. Moreover, MN8 and MN11 are also connected to the TC signal in the pull down stack of the slave latch in Fig. 4a. The number of NCNTFET stacks in this slave latch is reduced by one in comparison with Strollo's latch architecture. Initially the clock is low, so all nodes are pre-charged high and TC remains low (nearly zero). When the clock rises, the sensing stage is in transition and MN4 remains OFF. After the transition ends, either R or S is in the '0' state, which forces TC towards '1'. In addition, nodes N2 and N3 remain in an identical state (either '0' or '1') for clock equal to 0 or 1. Another advantage of the transition control scheme is that it prevents the current contention problem that occurred in previous D flip flop designs. The SAFF designed in [17] with a transition control scheme still faces a new problem in the latch stage, as shown in Fig. 5c. If the clock falls after the start of the evaluation phase, TC is discharged through the pull down path after pre-charging of node S. Both MN7 and MN8 are turned ON for a while, which produces a glitch and increases dynamic power at the sensing stage. This problem is resolved in the next design by applying clock gating in the D latch.
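The logic of the TC signal can be checked with a small truth table. The sketch below is a purely logical illustration of the NAND relation stated above; the function names and the phase labels are hypothetical, and analogue effects such as the glitch in Fig. 5c are not modelled.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def tc_signal(r, s):
    """Transition completion signal: TC = NAND(R, S)."""
    return nand(r, s)


# Phase labels are illustrative; (R, S) values follow the description in the text.
phases = {
    "precharge (CLK=0)": (1, 1),         # both outputs high, TC stays low
    "evaluation done, R low": (0, 1),    # one output has fallen, TC goes high
    "evaluation done, S low": (1, 0),
}
for name, (r, s) in phases.items():
    print(name, "TC =", tc_signal(r, s))
```

Since MN4 is driven by TC in this scheme, the model shows it OFF during precharge and while the outputs are still both high, and ON only once one of R or S has completed its transition.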

Proposed Ultra Low Energy Sense Amplifier Flip Flop (ULESAFF)
The proposed ULESAFF architecture, depicted in Fig. 4b, adds two clock controlled NMOS stacks. A variety of low power techniques have been discussed in research articles exploring power-effective optimization methods for reducing switching activity at highly unstable capacitive nodes. Of these, the clock gating methodology is introduced here to reduce dynamic power without any increase in crowbar currents at the complementary outputs of the static latch. This power saving is achieved at the expense of transistor count (area). Glitches are almost entirely removed in this design, as shown in the output waveform in Fig. 5d.

Simulation Methodology and Optimization
A comprehensive detailed study has been done accounting highly efficient sense amplifier based D flip flop designs   Table 1.
The proposed and reported SAFFs are designed and analyzed for an enhanced figure of merit, which depends on the transistor count, the type of power reduction technique and the frequency of operation. To bring out the pros and cons of the proposed LPSAFFs, a comparative analysis has been performed against eight previously reported SAFF designs in 32 nm CMOS and 32 nm CNTFET technologies. Clock to output and data to output delays have been measured from Fig. 5 at 50% signal transitions with a supply of 0.9 V using HSPICE simulations. The Nikolic, Strollo and conventional SAFFs have a 2-stage delay to change the output signal from 0 to 1, and hence reduced parasitic capacitance at the output node, which minimizes the clock to output delay (tCLK−Q), as shown in Table 2 at a load capacitance of 1 fF. In Strollo's SAFF, when CLK=1 and Q=1, the total capacitance associated with the pull-down transistors MN1 and MN7 contributes to the CLK-Q delay. In the proposed designs, however, the CLK-Q delay with CLK=1 can be reduced because the parasitics are associated with only a single N-type transistor. Nikolic's SAFF provides the minimum fall delay at larger load capacitance due to its fast pull down network.
The proposed SAFFs, with smaller or negative setup time, have the minimum data to output delay. The conventional SAFF design has a high D-Q delay with a large setup time and is hence slower in operation. The novel slave latch structures reduce the overall power dissipation of the LPSAFF compared with the other eight existing designs. The optimum power delay product (PDP) is obtained from the proposed TCD-LPSAFF and ULESAFF with a load capacitance (Cload) of 1 fF. It can be seen from the results that the conventional SAFF has the most glitches due to very high data activity at nodes N1 and N2. Glitches are removed
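The 50% transition measurement described above can be reproduced on any sampled waveform by linear interpolation. The sketch below is a generic post-processing illustration and not the HSPICE measurement setup used in this work: the function names are hypothetical and the waveforms are synthetic ramps, with time in picoseconds and the 0.9 V supply taken from the text.

```python
def crossing_time(t, v, level, rising=True):
    """First time at which v crosses level in the given direction,
    linearly interpolated between samples. Returns None if no crossing."""
    for i in range(1, len(t)):
        v0, v1 = v[i - 1], v[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    return None


def clk_to_q_delay(t, clk, q, vdd=0.9):
    """Delay between the 50% points of a rising CLK edge and a rising Q edge."""
    half = vdd / 2
    return crossing_time(t, q, half) - crossing_time(t, clk, half)


# Synthetic waveforms for illustration only (time in ps, voltage in V).
t = [0, 10, 20, 30, 40]
clk = [0.0, 0.9, 0.9, 0.9, 0.9]
q = [0.0, 0.0, 0.0, 0.9, 0.9]
print(clk_to_q_delay(t, clk, q))
```

The same helper with rising=False gives the fall delay, and replacing the clock waveform with the data waveform gives the data to output delay.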

Results and Discussion
The effect of process corners on the power and delay performance of the different LPSAFFs is analyzed in Fig. 6a. Only a small variation in dynamic power consumption is observed as the temperature varies from -40 °C to 120 °C. Leakage (standby) power changes by approximately 50% when moving from the SS to the FF process corner. In comparison with the Nikolic, Kim, Lin and Strollo SAFFs, the proposed ULESAFF gives the minimum leakage power. HSPICE simulations have been performed to calculate the total propagation delay, which is the sum of the clock to output and data to output delays, as shown in Fig. 6b for both CMOS and CNTFET technologies. Propagation delay depends on the total transistor count or area, the number of clock transistors and the type of low power technique used. Power consumption is calculated for different input data activities, as shown in Table 3. Here, power dissipation is reduced by ∼15% to ∼26% for the TCD-LPSAFF and by ∼18% to ∼33% for the ULESAFF at 50% switching activity. The proposed low power SAFF shows an enhanced energy delay product in comparison with the other two low power D flip flop designs. The minimum operating VDD of 0.4 V for the ultra low energy sense amplifier flip flops is obtained from HSPICE simulations using worst case input data activity.
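The figures of merit used in this section follow from simple arithmetic. The sketch below uses the definition of total propagation delay given above (clock to output plus data to output delay), with PDP taken as power times delay and EDP as PDP times delay. The function names are hypothetical and all numeric inputs are placeholders chosen for illustration; they are not values from Tables 2 or 3.

```python
def total_delay(t_clk_q, t_d_q):
    """Total propagation delay as the sum of CLK-Q and D-Q delays (seconds)."""
    return t_clk_q + t_d_q


def pdp(power_w, delay_s):
    """Power delay product in joules."""
    return power_w * delay_s


def edp(power_w, delay_s):
    """Energy delay product in joule-seconds."""
    return pdp(power_w, delay_s) * delay_s


def percent_reduction(reference, proposed):
    """Relative saving of the proposed value with respect to the reference, in %."""
    return 100.0 * (reference - proposed) / reference


# Placeholder numbers for illustration only.
delay = total_delay(200e-12, 300e-12)
print(pdp(100e-9, delay), edp(100e-9, delay))
print(percent_reduction(100e-9, 74e-9))
```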
Extensive simulations have been carried out with a fan-out-of-4 (FO4) load capacitance to determine the optimum power and delay performance. The two complementary outputs Q and Q̄ of a SAFF are loaded with CL in the range of 1 fF to 27 fF at a 1 GHz input clock frequency. Fig. 6c shows the load capacitance versus total power dissipation plot for the four low power SAFFs considered. It can be observed from the graph that the proposed ULESAFF shows the minimum power at each load capacitance in comparison with the other three SAFF topologies. The proposed ULESAFF consumes minimum power and is able to provide reliable results at low VDD with considerable dynamic power.
Table 3 Power consumption for different input data activities (column headings and first row label not recovered)
(missing)  9.18E-08  6.18E-08  5.78E-08  2.92E-08
50%        1.05E-07  7.53E-08  6.53E-08  3.52E-08
100%       1.13E-07  7.97E-08  6.73E-08  3.61E-08

Conclusions
Two novel low-power delay (D) flip flop designs using a switching activity minimization methodology for CMOS and CNTFET are reported in this paper. The CNTFET based low power sense amplifier flip flops are implemented with the transition completion detection (TCD) technique and the clock gating method at the latch stage and sensing stage. Simulated results of the proposed designs are compared with benchmark SAFF topologies for average power dissipation and energy delay metric. The proposed method leads to a substantial reduction in power and in energy delay performance (EDP) of 43% and 36%, respectively, in comparison with the CMOS counterpart. This paper summarizes the theoretical approach and simulation work on CNTFET to find optimal design constraints for the proposed sequential blocks. Unwanted glitches in R and S are removed using well-established power optimization methods at the cost of very little additional circuitry.